[ 
https://issues.apache.org/jira/browse/PDFBOX-5965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17947312#comment-17947312
 ] 

Tilman Hausherr commented on PDFBOX-5965:
-----------------------------------------

I haven't been able to fix it, but here's a test that fails:
{code:java}
@Test
void testRewindAcrossBuffers() throws IOException
{
    // slightly more than 4096 bytes, so that the rewind crosses a buffer boundary (see test name)
    byte[] ba = new byte[4096 + 5];
    int rewSize = 7;
    byte testVal = 123;
    ba[ba.length - rewSize] = testVal;
    ByteArrayInputStream bais = new ByteArrayInputStream(ba);
    try (RandomAccessRead rar = new NonSeekableRandomAccessReadInputStream(bais))
    {
        int len = rar.read(new byte[ba.length - rewSize]);
        assertEquals(ba.length - rewSize, len);
        len = rar.read(new byte[rewSize]);
        assertEquals(rewSize, len);
        int by = rar.read();
        assertEquals(-1, by);
        assertTrue(rar.isEOF());
        rar.rewind(len);
        by = rar.read(); // throws ArrayIndexOutOfBoundsException here
        assertEquals(testVal, by);
    }
}
{code}

> Rewind in NonSeekableRandomAccessReadInputStream malfunction near end of file
> -----------------------------------------------------------------------------
>
>                 Key: PDFBOX-5965
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-5965
>             Project: PDFBox
>          Issue Type: Bug
>          Components: IO
>    Affects Versions: 3.0.4 PDFBox
>            Reporter: Tilman Hausherr
>            Assignee: Tilman Hausherr
>            Priority: Major
>              Labels: regression
>             Fix For: 3.0.5 PDFBox, 4.0.0
>
>         Attachments: POPPLER-208-p2.pdf
>
>
> A user (memo for me: on 3.3.2025 16:58) has contacted me about a difference 
> in the number of parsed content stream tokens between PDFBox 2.0.33 and 
> 3.0.4. It happens when an inline image is close to the end of the file and 
> is followed by two "Q" operators. The two "Q" operators are ignored despite 
> the rewind() in PDFStreamParser. The problem goes away when 
> {{new NonSeekableRandomAccessReadInputStream(decoderStream)}} is replaced with 
> {{new RandomAccessReadBuffer(decoderStream)}} in {{PDPage}}. It does not 
> happen in PDFBox 3.0.2.
> I have a file that reproduces the problem, but I doubt I can share it. I will 
> try to find or create something.


