[ 
https://issues.apache.org/jira/browse/PDFBOX-5340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17472703#comment-17472703
 ] 

Oliver Schmidtmer commented on PDFBOX-5340:
-------------------------------------------

[~tilman] 

Through a failing PDF I noticed that the detection feature mentioned in the 
ticket description is included in the CCITTFaxFilter, not in the 
CCITTFaxDecoderStream itself.

When the CCITT stream is shorter than 20 bytes, it currently fails. Our current 
version simply aborts the search if no EOL code is found before the end of the 
stream and assumes RLE encoding:

[https://github.com/haraldk/TwelveMonkeys/blob/master/imageio/imageio-tiff/src/main/java/com/twelvemonkeys/imageio/plugins/tiff/CCITTFaxDecoderStream.java#L142]
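
To illustrate the fallback described above, here is a minimal sketch. It is
not the TwelveMonkeys or PDFBox code: the class, enum and method names are
hypothetical, and the only fact it relies on is that the EOL code is the
12-bit pattern 000000000001. It scans at most maxBytes of the data and
simply assumes RLE if the data ends before an EOL code shows up.

```java
// Hypothetical illustration only; names do not come from PDFBox or TwelveMonkeys.
final class EolSearchSketch {
    enum Encoding { T4, RLE }

    // Scans up to maxBytes of data for the 12-bit EOL code 000000000001.
    // Running out of data is not an error here: it just means "no EOL found".
    static Encoding detect(byte[] data, int maxBytes) {
        int limit = Math.min(data.length, maxBytes);
        // Start with all ones so that a match needs 12 real bits from the data.
        int window = 0xFFF;
        for (int i = 0; i < limit * 8; i++) {
            int bit = (data[i / 8] >> (7 - (i % 8))) & 1;
            window = ((window << 1) | bit) & 0xFFF;
            if (window == 1) {
                return Encoding.T4; // EOL found, treat as T.4 (Group 3 1D)
            }
        }
        return Encoding.RLE; // end of data or search limit reached without EOL
    }

    public static void main(String[] args) {
        // Four bytes of pure white-run codes, far fewer than 20 bytes.
        System.out.println(detect(new byte[] {103, 44, 103, 44}, 20));
    }
}
```

The point of the sketch is the loop bound: it is derived from the number of
bytes actually available, not from a fixed 20-byte assumption.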

Unfortunately I can't share the PDF, but the stream is w=683, h=4, k=0 with a 
content of 18 bytes: [103, 44, 103, 44, 103, 44, 103, 44, 0, 16, 1, 0, 16, 1, 0, 
16, 1, 10]

That corresponds to the codes for 4 white lines, followed by 6 EOL codes and 
one more byte in the stream, so RLE reading would work there.
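
The byte breakdown can be checked with a small sketch. The class and method
names are hypothetical; the code values are the ITU-T T.4 white make-up code
for a run of 640 (01100111), the white terminating code for a run of 43
(00101100), which together cover the 683-pixel width, and the 12-bit EOL
code.

```java
// Hypothetical helper for checking the sample bytes; not part of PDFBox.
final class CcittSampleBits {
    static final String WHITE_640 = "01100111";  // T.4 white make-up code, run of 640
    static final String WHITE_43 = "00101100";   // T.4 white terminating code, run of 43
    static final String EOL = "000000000001";    // T.4 end-of-line code

    // Renders each byte as 8 binary digits, most significant bit first.
    static String toBits(int[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (int b : bytes) {
            String s = Integer.toBinaryString(b & 0xFF);
            sb.append(String.format("%8s", s).replace(' ', '0'));
        }
        return sb.toString();
    }

    // Counts how often `code` repeats back to back, starting at bit offset `from`.
    static int countRepeats(String bits, int from, String code) {
        int count = 0;
        while (bits.startsWith(code, from + count * code.length())) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        int[] sample = {103, 44, 103, 44, 103, 44, 103, 44,
                        0, 16, 1, 0, 16, 1, 0, 16, 1, 10};
        String bits = toBits(sample);
        String whiteLine = WHITE_640 + WHITE_43; // 640 + 43 = 683 white pixels
        int lines = countRepeats(bits, 0, whiteLine);
        int eols = countRepeats(bits, lines * whiteLine.length(), EOL);
        int rest = bits.length() - lines * whiteLine.length() - eols * EOL.length();
        // prints "4 white lines, 6 EOL codes, 8 bits left"
        System.out.println(lines + " white lines, " + eols + " EOL codes, "
                + rest + " bits left");
    }
}
```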

Oliver

> Update CCITTFaxDecoderStream.java from twelvemonkeys (3)
> --------------------------------------------------------
>
>                 Key: PDFBOX-5340
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-5340
>             Project: PDFBox
>          Issue Type: Task
>          Components: Rendering
>    Affects Versions: 2.0.24
>            Reporter: Tilman Hausherr
>            Assignee: Tilman Hausherr
>            Priority: Minor
>             Fix For: 2.0.25, 3.0.0 PDFBox
>
>
> From what I see, only the last two commits are relevant. The other changes 
> are about a detection feature that we didn't include.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
