[ https://issues.apache.org/jira/browse/PDFBOX-5965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Tilman Hausherr updated PDFBOX-5965:
------------------------------------
    Description:
A user (memo for me: on 3.3.2025 16:58) has contacted me with a problem that involves a token count difference of parsed content stream tokens between PDFBox 2.0.33 and 3.0.4. It happens when an inline image is close to the end of the file, and is followed by two "Q" operators. The two "Q" operators are ignored despite the rewind() in PDFStreamParser. The problem goes away by replacing {{new NonSeekableRandomAccessReadInputStream(decoderStream)}} with {{new RandomAccessReadBuffer(decoderStream)}} in {{PDPage}}. It does not happen in PDFBox 3.0.2.

I have a file that reproduces the problem but I doubt I can share it, I will try to find or create something.

  was:
A user has contacted me with a problem that involves a token count difference of parsed content stream tokens between PDFBox 2.0.33 and 3.0.4. It happens when an inline image is close to the end of the file, and is followed by two "Q" operators. The two "Q" operators are ignored despite the rewind() in PDFStreamParser. The problem goes away by replacing {{new NonSeekableRandomAccessReadInputStream(decoderStream)}} with {{new RandomAccessReadBuffer(decoderStream)}} in {{PDPage}}. It does not happen in PDFBox 3.0.2.

I have a file that reproduces the problem but I doubt I can share it, I will try to find or create something.

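To picture the behaviour the description expects from rewind(), here is a minimal stdlib-only sketch. It is not PDFBox code and does not try to reproduce the actual defect, whose cause is not stated in this report. The names {{RewindSketch}}, {{RewindableReader}} and {{countQAfterInlineImage}} are hypothetical, and the "look ahead after EI, then rewind" step is only an assumed simplification of what a content stream parser might do. The point is the contract: if the look-ahead runs into end of file, rewinding by the number of bytes actually read must still make the trailing "Q" operators readable.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

/** Hypothetical illustration only; none of this is PDFBox code. */
public class RewindSketch {

    /** Forward-only reader that keeps the last few bytes so a caller can step back. */
    static final class RewindableReader {
        private final InputStream in;
        private final byte[] history; // ring buffer of the most recently read bytes
        private long readCount = 0;   // total bytes pulled from the underlying stream
        private long position = 0;    // logical position seen by the caller (<= readCount)

        RewindableReader(InputStream in, int historySize) {
            this.in = in;
            this.history = new byte[historySize];
        }

        int read() throws IOException {
            if (position < readCount) {
                // Replay a byte that was read before a rewind.
                return history[(int) (position++ % history.length)] & 0xFF;
            }
            int b = in.read();
            if (b == -1) {
                return -1; // hitting EOF must not change position or history
            }
            history[(int) (readCount % history.length)] = (byte) b;
            readCount++;
            position++;
            return b;
        }

        void rewind(int n) throws IOException {
            // After the rewind, every byte from the new position up to readCount
            // must still be present in the ring buffer.
            if (n < 0 || n > position || readCount - (position - n) > history.length) {
                throw new IOException("cannot rewind " + n + " bytes");
            }
            position -= n;
        }
    }

    /** Counts 'Q' bytes that follow the first "EI" marker in the content. */
    public static int countQAfterInlineImage(String content, int lookahead) throws IOException {
        RewindableReader r = new RewindableReader(
                new ByteArrayInputStream(content.getBytes(StandardCharsets.ISO_8859_1)), lookahead);

        // 1. Skip forward until "EI" has been consumed.
        int prev = -1;
        int cur;
        while ((cur = r.read()) != -1 && !(prev == 'E' && cur == 'I')) {
            prev = cur;
        }
        if (cur == -1) {
            return 0; // no "EI" marker found
        }

        // 2. Look ahead (assumed parser behaviour); this may run into EOF.
        int peeked = 0;
        while (peeked < lookahead && r.read() != -1) {
            peeked++;
        }

        // 3. Step back by the number of bytes actually read, not the number requested.
        r.rewind(peeked);

        // 4. Everything after "EI" must be readable again.
        int count = 0;
        while ((cur = r.read()) != -1) {
            if (cur == 'Q') {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // Inline image close to the end of the stream, followed by two "Q" operators.
        System.out.println(countQAfterInlineImage("q BI /W 1 /H 1 ID x EI Q Q", 8)); // prints 2
    }
}
```

In this toy model both "Q" operators survive the rewind even though the look-ahead hit end of file after four bytes. The report describes the opposite outcome with {{NonSeekableRandomAccessReadInputStream}} in 3.0.4.
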
> Rewind in NonSeekableRandomAccessReadInputStream malfunction near end of file
> -----------------------------------------------------------------------------
>
>                 Key: PDFBOX-5965
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-5965
>             Project: PDFBox
>          Issue Type: Bug
>          Components: IO
>    Affects Versions: 3.0.4 PDFBox
>            Reporter: Tilman Hausherr
>            Assignee: Tilman Hausherr
>            Priority: Major
>              Labels: regression
>             Fix For: 3.0.5 PDFBox, 4.0.0
>
>         Attachments: POPPLER-208-p2.pdf
>
>
> A user (memo for me: on 3.3.2025 16:58) has contacted me with a problem that
> involves a token count difference of parsed content stream tokens between
> PDFBox 2.0.33 and 3.0.4. It happens when an inline image is close to the end
> of the file, and is followed by two "Q" operators. The two "Q" operators are
> ignored despite the rewind() in PDFStreamParser. The problem goes away by
> replacing {{new NonSeekableRandomAccessReadInputStream(decoderStream)}} with
> {{new RandomAccessReadBuffer(decoderStream)}} in {{PDPage}}. It does not
> happen in PDFBox 3.0.2.
> I have a file that reproduces the problem but I doubt I can share it, I will
> try to find or create something.