[ https://issues.apache.org/jira/browse/PDFBOX-4200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16448279#comment-16448279 ]
Xin Lin edited comment on PDFBOX-4200 at 4/23/18 3:09 PM:
----------------------------------------------------------

I dug a little deeper. Instead of the bytes between stream and endstream, PDFBox seems to treat the data pasted below as the stream and tries to decompress it, which is apparently the cause of the exception:

{code:java}
[binary stream data omitted]
endstream endobj
18 0 obj 2612 endobj
9 0 obj [ /ICCBased 17 0 R ] endobj
19 0 obj << /Length 20 0 R /N 3 /Alternate /DeviceRGB /Filter /FlateDecode >> stream
[binary stream data omitted]
{code}

If I put a breakpoint at a line in FlateFilter.java ( int read = in.read(buf); ) in the private method 'decompress(InputStream in, OutputStream out)' and consume the input stream 'in' in a debugger, the image is rendered with no error and it even looks fine (no discernible differences between the image and the PDF).

was (Author: xin.lin):

I dug a little deeper. Instead of the bytes between stream and endstream, PDFBox seems to treat the data pasted below as the stream and tries to decompress it, which is apparently the cause of the exception:

{code:java}
[binary stream data omitted]
endstream endobj
18 0 obj 2612 endobj
9 0 obj [ /ICCBased 17 0 R ] endobj
19 0 obj << /Length 20 0 R /N 3 /Alternate /DeviceRGB /Filter /FlateDecode >> stream
[binary stream data omitted]
{code}

If I put a breakpoint at line 70 in FlateFilter.java ( int read = in.read(buf); ) and consume the input stream 'in' in a debugger, the image is rendered with no error and it even looks fine (no discernible differences between the image and the PDF).

> DataFormatException: invalid code lengths set when rendering image
> ------------------------------------------------------------------
>
>                 Key: PDFBOX-4200
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-4200
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Rendering
>    Affects Versions: 2.0.7, 2.0.8, 2.0.9
>            Reporter: Xin Lin
>            Priority: Major
>         Attachments: invalid.pdf
>
>
> When rendering an image from the attached PDF, an exception was thrown. Below
> is the relevant stack trace:
> java.util.zip.DataFormatException: invalid code lengths set
>     at java.util.zip.Inflater.inflateBytes(Native Method) ~[?:1.8.0_121]
>     at java.util.zip.Inflater.inflate(Inflater.java:259) ~[?:1.8.0_121]
>     at java.util.zip.Inflater.inflate(Inflater.java:280) ~[?:1.8.0_121]
>     at org.apache.pdfbox.filter.FlateFilter.decompress(FlateFilter.java:108) ~[pdfbox-2.0.9.jar:2.0.9]
>     at org.apache.pdfbox.filter.FlateFilter.decode(FlateFilter.java:74) ~[pdfbox-2.0.9.jar:2.0.9]
>     at org.apache.pdfbox.filter.Filter.decode(Filter.java:87) ~[pdfbox-2.0.9.jar:2.0.9]
>     at org.apache.pdfbox.cos.COSInputStream.create(COSInputStream.java:77) ~[pdfbox-2.0.9.jar:2.0.9]
>     at org.apache.pdfbox.cos.COSStream.createInputStream(COSStream.java:175) ~[pdfbox-2.0.9.jar:2.0.9]
>     at org.apache.pdfbox.cos.COSStream.createInputStream(COSStream.java:163) ~[pdfbox-2.0.9.jar:2.0.9]
>     at org.apache.pdfbox.pdmodel.common.PDStream.createInputStream(PDStream.java:236) ~[pdfbox-2.0.9.jar:2.0.9]
>     at org.apache.pdfbox.pdmodel.graphics.color.PDICCBased.loadICCProfile(PDICCBased.java:124) ~[pdfbox-2.0.9.jar:2.0.9]
>     at org.apache.pdfbox.pdmodel.graphics.color.PDICCBased.<init>(PDICCBased.java:98) ~[pdfbox-2.0.9.jar:2.0.9]
>     at org.apache.pdfbox.pdmodel.graphics.color.PDColorSpace.create(PDColorSpace.java:192) ~[pdfbox-2.0.9.jar:2.0.9]
>     at org.apache.pdfbox.pdmodel.PDResources.getColorSpace(PDResources.java:199) ~[pdfbox-2.0.9.jar:2.0.9]
>     at org.apache.pdfbox.pdmodel.PDResources.getColorSpace(PDResources.java:169) ~[pdfbox-2.0.9.jar:2.0.9]
>     at org.apache.pdfbox.contentstream.operator.color.SetNonStrokingColorSpace.process(SetNonStrokingColorSpace.java:41) ~[pdfbox-2.0.9.jar:2.0.9]
>     at org.apache.pdfbox.contentstream.PDFStreamEngine.processOperator(PDFStreamEngine.java:848) ~[pdfbox-2.0.9.jar:2.0.9]
>     at org.apache.pdfbox.contentstream.PDFStreamEngine.processStreamOperators(PDFStreamEngine.java:503) ~[pdfbox-2.0.9.jar:2.0.9]
>     at org.apache.pdfbox.contentstream.PDFStreamEngine.processStream(PDFStreamEngine.java:477) ~[pdfbox-2.0.9.jar:2.0.9]
>     at org.apache.pdfbox.contentstream.PDFStreamEngine.processPage(PDFStreamEngine.java:150) ~[pdfbox-2.0.9.jar:2.0.9]
>     at org.apache.pdfbox.rendering.PageDrawer.drawPage(PageDrawer.java:246) ~[pdfbox-2.0.9.jar:2.0.9]
>     at org.apache.pdfbox.rendering.PDFRenderer.renderImage(PDFRenderer.java:225) ~[pdfbox-2.0.9.jar:2.0.9]
>     at org.apache.pdfbox.rendering.PDFRenderer.renderImage(PDFRenderer.java:138) ~[pdfbox-2.0.9.jar:2.0.9]
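
The suspected failure mode in the comment (a Flate decoder being handed bytes that do not start at the real stream data) can be illustrated with java.util.zip alone. This is only a sketch under stated assumptions: it does not use or reproduce PDFBox's parsing code, the class name FlateOffsetDemo and its helper methods are hypothetical, and the sample bytes are made up rather than taken from invalid.pdf. It shows that java.util.zip.Inflater rejects input read from the wrong position and accepts the same buffer when read from the correct one.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical demo class; not part of PDFBox.
public class FlateOffsetDemo {

    /** Compresses data into a zlib stream, the format used by /FlateDecode. */
    static byte[] deflate(byte[] data) {
        Deflater deflater = new Deflater();
        deflater.setInput(data);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[1024];
        while (!deflater.finished()) {
            out.write(buf, 0, deflater.deflate(buf));
        }
        deflater.end();
        return out.toByteArray();
    }

    /** Inflates len bytes of data starting at off. */
    static byte[] inflate(byte[] data, int off, int len) throws DataFormatException {
        Inflater inflater = new Inflater();
        inflater.setInput(data, off, len);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[1024];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(buf);
                // No progress and nothing left to read: avoid looping forever.
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new DataFormatException("truncated or unsupported input");
                }
                out.write(buf, 0, n);
            }
        } finally {
            inflater.end();
        }
        return out.toByteArray();
    }

    /** Same as inflate, but returns null instead of throwing. */
    static byte[] inflateOrNull(byte[] data, int off, int len) {
        try {
            return inflate(data, off, len);
        } catch (DataFormatException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        // Made-up payload standing in for a compressed ICC profile.
        byte[] payload = "stand-in profile bytes".getBytes(StandardCharsets.US_ASCII);
        byte[] compressed = deflate(payload);

        // Bytes that precede the real stream data in this made-up buffer.
        byte[] prefix = "endstream endobj 19 0 obj stream\n".getBytes(StandardCharsets.US_ASCII);
        byte[] buffer = new byte[prefix.length + compressed.length];
        System.arraycopy(prefix, 0, buffer, 0, prefix.length);
        System.arraycopy(compressed, 0, buffer, prefix.length, compressed.length);

        // Wrong start position: the decoder sees non-zlib bytes first.
        byte[] wrong = inflateOrNull(buffer, 0, buffer.length);
        System.out.println("wrong offset decodes: " + (wrong != null));

        // Correct start position: decoding succeeds.
        byte[] right = inflateOrNull(buffer, prefix.length, compressed.length);
        System.out.println("right offset decodes: " + (right != null));
    }
}
```

If the real cause is along these lines, the interesting question for invalid.pdf is which offset and length the parser ends up using for the ICC profile stream, since the compressed data itself may be intact.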