[ 
https://issues.apache.org/jira/browse/PDFBOX-4200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16448279#comment-16448279
 ] 

Xin Lin edited comment on PDFBOX-4200 at 4/23/18 3:09 PM:
----------------------------------------------------------

I dug a little deeper. Instead of the bytes between stream and endstream, 
PDFBox seems to treat what I pasted below as the stream and tries to 
decompress it, which appears to be the cause of the exception:
{code:java}
Σ
ðÔ-rÑqá¸r‘.d.Œ_xp¡ÔUÛ•ãZëúÌM×?çvÄmÄÝØ=Ùý¸û+K‘G‹Ç”§“çÏ^ˆ—¯W‘W¯·’÷bïjï§>:>‰>?>¾v¾«}/øaýývúÝó×ðçú×ûO8¬
 è
¤FV>2 uÃÁÁ»‚/Ò_$\ÔBüCv…< 5
]ús.,4¬&ìy¸Ux~xw-bEDCÄ»H?ÈÒÈG‹?KwFÉGÅEÕGME{E—EK—X,Y³äFŒZŒ 
¦={$vr©÷ÒÝK‡ãìâ
ãî.3\–³ìÚrµå©ËÏ®?_ÁYq*ßÿ‰©åL®ô_¹wåד»‡û’çÆ+ç?ñ]øeü‘—„²„ÑD—Ä]‰cI®IIãOAµàu²_ò?ä©”?”£)3©Ñ©Íi„´ø´ÓB%aа+]3='½/Ã4£0CºÊiÕîU¢@Ñ‘L(sYf»˜ŽþLõHŒ$›%ƒY³j²ÞgGeŸÊQÌæôäšänËÉóÉû~5f5wug¾vþ†üÁ5îk­…Ö®\Û¹Nw]Áºáõ¾ë?m
 mHÙðËFË?eßnŠÞÔQ 
Q°¾`h³ïæÆB¹BQá½-Î[lÅllíÝf³­jÛ—"^ÑõbËâŠâO%Ü’ëßY}WùÝÌö„í½¥ö¥ûwàwÜÝéºóX™bY^ÙЮà]­åÌò¢ò·»Wì¾Va[q`i?d?´2¨²½J¯jGÕ§ê¤ê??šæ½ê{·í?ÚÇÛ׿ßmÓ?Å>¼È÷Pk­AmÅaÜá¬ÃÏë¢êº¿g_DíHñ‘ÏG…G¥ÇÂ?uÕ;Ô×7¨7”6Â?’ƱãqÇoýàõC

{«éP3£¹ø8!9ñâÇøïž<ÙyŠ}

ªé'ýŸö¶ÐZŠZ¡ÖÜÖ‰¶¤6i{L

{ßé€Ó?Î-?›ÿ|ôŒö™š³ÊgKϑΜ›9Ÿw~òBÆ…ñ‹‰‡:Wt>º´äÒ?®°®ÞË?—¯^ñ¹r©Û½ûüU—«g®9];}

?}½í†ý?Ö»ž–_ì~iéµïm½ép³ý–ã­Ž¾}çú]û/Þöº}åŽÿ?‹úî.¾{ÿ^Ü=é}ÞýÑ©^?Ìz8ýhýcìã¢'
O*žª?­ýÕø×f©½ôì 
×`ϳˆg?†¸C/ÿ•ù¯OÃÏ©Ï+F´FêG­GÏŒùŒÝz±ôÅðËŒ—Óã…¿)þ¶÷•Ñ«Ÿ~wû½gbÉÄðkÑë™?JÞ¨¾9úÖömçdèäÓwi獵ŠÞ«¾?ö?ý¡ûcôÇ‘éìOøO•Ÿ??w|
 üòx&mfæß÷„óû
endstream
endobj
18 0 obj
2612
endobj
9 0 obj
[ /ICCBased 17 0 R ]
endobj
19 0 obj
<< /Length 20 0 R /N 3 /Alternate /DeviceRGB /Filter /FlateDecode >>
stream
x…U[ˆUþ“9É
»ÎÓÚÕ-¤C½t)»K¶Ý¥´š[“´k²ÙÕA³““dÌì$ÎLÒ}*‚â‹«¾IA¼½‚ÒzÁÖûR©PVwë"(>´xA(ôE·ñ;“d&Yj›eÏ|óýßùo矢?µB½®û¢%Ã6sɨòÜÑcÊÀ:ùé!¤Q,¨V=’ÍÎ~B+®ý¿[??O0W'îlïWo¹,rK%òݾV´Ô%àD?³jÝ´‰†ÁO·ë‹†Mü¢Àå6†?†Ûø5G“ÏÅ
 9,«•Bxx|±‡/÷àvPÀO’ÜÔTEô"kÖJšÎC

{¹‡¹Gy7¸¤7P³óÛ?uȪÎÆu
µ¿R,Äž^Q‰9àG€¯5µ…Lß®ÛÑðcDþ??ê|x7pªdœ†¿Yi¤ºø?S•ü³à·?ÿÆXÌéì]S­zI;Áß®ð´èoˆHR4;?†é€YË
 =r?JEO ?¿^­9À§ô™Õœ¼ÈgíT%&òüå— óá*ú*å6?!¾'«ÕOšZ¹b+

{Âá'>}

\Iêä¸RÐuÅ1YŠÉ-n6yq’ÄwSì#º™s¾‡¾mW<Î~†hÿ_x÷}ïqÇD+ÑÈã7†wåï?{Bm˜Í¶?òù¾#²J{÷8÷¾¡(Þ_?·Z7ñx‹hóÍVëŸ÷[­Íàƒè‚þ
 Ÿ|U

{code}
If I put a breakpoint in FlateFilter.java at the line ( int read = in.read(buf); 
) inside the private method 'decompress(InputStream in, OutputStream out)' and 
consume the input stream 'in' in the debugger, the image is rendered with no 
error and looks fine (no discernible differences between the image and the 
PDF).
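
For anyone trying to reproduce the symptom outside PDFBox, here is a minimal sketch using only java.util.zip. It illustrates the general point that an Inflater raises a DataFormatException when it is handed bytes that are not a valid zlib stream, for example bytes read from the wrong offset in the file. The class InflateProbe and all of its methods are hypothetical helpers written for this illustration; they are not part of PDFBox, and the exact exception message can differ from the "invalid code lengths set" reported here.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical helper for illustration only; not part of PDFBox.
final class InflateProbe {

    /** Compresses data into a zlib stream, the format /FlateDecode uses. */
    public static byte[] deflate(byte[] data) {
        Deflater deflater = new Deflater();
        deflater.setInput(data);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[1024];
        while (!deflater.finished()) {
            int n = deflater.deflate(buf);
            out.write(buf, 0, n);
        }
        deflater.end();
        return out.toByteArray();
    }

    /** Inflates a zlib stream; throws DataFormatException on malformed input. */
    public static byte[] inflate(byte[] data) throws DataFormatException {
        Inflater inflater = new Inflater();
        inflater.setInput(data);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[1024];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(buf);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    break; // truncated input: stop instead of looping forever
                }
                out.write(buf, 0, n);
            }
        } finally {
            inflater.end();
        }
        return out.toByteArray();
    }

    /** Returns false if the inflater rejects the data as malformed. */
    public static boolean inflatesCleanly(byte[] data) {
        try {
            inflate(data);
            return true;
        } catch (DataFormatException e) {
            return false;
        }
    }

    /** Checks the two-byte zlib header: method 8 and a checksum divisible by 31. */
    public static boolean hasZlibHeader(byte[] data) {
        if (data.length < 2) {
            return false;
        }
        int cmf = data[0] & 0xFF;
        int flg = data[1] & 0xFF;
        return (cmf & 0x0F) == 8 && ((cmf << 8) | flg) % 31 == 0;
    }

    public static void main(String[] args) {
        byte[] good = deflate("sample profile data".getBytes(StandardCharsets.UTF_8));
        // Stand-in for bytes picked up from the wrong place in a PDF file.
        byte[] wrong = "endstream\nendobj\n".getBytes(StandardCharsets.US_ASCII);
        System.out.println("valid stream inflates: " + inflatesCleanly(good));
        System.out.println("wrong bytes inflate:   " + inflatesCleanly(wrong));
    }
}
```

A quick check with hasZlibHeader on the first two bytes that reach the filter may help tell a corrupt stream apart from a stream that was read from the wrong position, though that is only a heuristic.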


> DataFormatException: invalid code lengths set when rendering image
> ------------------------------------------------------------------
>
>                 Key: PDFBOX-4200
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-4200
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Rendering
>    Affects Versions: 2.0.7, 2.0.8, 2.0.9
>            Reporter: Xin Lin
>            Priority: Major
>         Attachments: invalid.pdf
>
>
> When rendering an image from the attached PDF, an exception was thrown. The 
> relevant stack trace is below:
> java.util.zip.DataFormatException: invalid code lengths set
>     at java.util.zip.Inflater.inflateBytes(Native Method) ~[?:1.8.0_121]
>     at java.util.zip.Inflater.inflate(Inflater.java:259) ~[?:1.8.0_121]
>     at java.util.zip.Inflater.inflate(Inflater.java:280) ~[?:1.8.0_121]
>     at org.apache.pdfbox.filter.FlateFilter.decompress(FlateFilter.java:108) ~[pdfbox-2.0.9.jar:2.0.9]
>     at org.apache.pdfbox.filter.FlateFilter.decode(FlateFilter.java:74) ~[pdfbox-2.0.9.jar:2.0.9]
>     at org.apache.pdfbox.filter.Filter.decode(Filter.java:87) ~[pdfbox-2.0.9.jar:2.0.9]
>     at org.apache.pdfbox.cos.COSInputStream.create(COSInputStream.java:77) ~[pdfbox-2.0.9.jar:2.0.9]
>     at org.apache.pdfbox.cos.COSStream.createInputStream(COSStream.java:175) ~[pdfbox-2.0.9.jar:2.0.9]
>     at org.apache.pdfbox.cos.COSStream.createInputStream(COSStream.java:163) ~[pdfbox-2.0.9.jar:2.0.9]
>     at org.apache.pdfbox.pdmodel.common.PDStream.createInputStream(PDStream.java:236) ~[pdfbox-2.0.9.jar:2.0.9]
>     at org.apache.pdfbox.pdmodel.graphics.color.PDICCBased.loadICCProfile(PDICCBased.java:124) ~[pdfbox-2.0.9.jar:2.0.9]
>     at org.apache.pdfbox.pdmodel.graphics.color.PDICCBased.<init>(PDICCBased.java:98) ~[pdfbox-2.0.9.jar:2.0.9]
>     at org.apache.pdfbox.pdmodel.graphics.color.PDColorSpace.create(PDColorSpace.java:192) ~[pdfbox-2.0.9.jar:2.0.9]
>     at org.apache.pdfbox.pdmodel.PDResources.getColorSpace(PDResources.java:199) ~[pdfbox-2.0.9.jar:2.0.9]
>     at org.apache.pdfbox.pdmodel.PDResources.getColorSpace(PDResources.java:169) ~[pdfbox-2.0.9.jar:2.0.9]
>     at org.apache.pdfbox.contentstream.operator.color.SetNonStrokingColorSpace.process(SetNonStrokingColorSpace.java:41) ~[pdfbox-2.0.9.jar:2.0.9]
>     at org.apache.pdfbox.contentstream.PDFStreamEngine.processOperator(PDFStreamEngine.java:848) ~[pdfbox-2.0.9.jar:2.0.9]
>     at org.apache.pdfbox.contentstream.PDFStreamEngine.processStreamOperators(PDFStreamEngine.java:503) ~[pdfbox-2.0.9.jar:2.0.9]
>     at org.apache.pdfbox.contentstream.PDFStreamEngine.processStream(PDFStreamEngine.java:477) ~[pdfbox-2.0.9.jar:2.0.9]
>     at org.apache.pdfbox.contentstream.PDFStreamEngine.processPage(PDFStreamEngine.java:150) ~[pdfbox-2.0.9.jar:2.0.9]
>     at org.apache.pdfbox.rendering.PageDrawer.drawPage(PageDrawer.java:246) ~[pdfbox-2.0.9.jar:2.0.9]
>     at org.apache.pdfbox.rendering.PDFRenderer.renderImage(PDFRenderer.java:225) ~[pdfbox-2.0.9.jar:2.0.9]
>     at org.apache.pdfbox.rendering.PDFRenderer.renderImage(PDFRenderer.java:138) ~[pdfbox-2.0.9.jar:2.0.9]


