[ 
https://issues.apache.org/jira/browse/PDFBOX-5231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17373156#comment-17373156
 ] 

LuoWeiWei commented on PDFBOX-5231:
-----------------------------------

I tried it with version 2.0.24, but the following errors appear:

 

July 02, 2021 2:24:29 AM org.apache.pdfbox.filter.FlateFilter decode
SEVERE: FlateFilter: stop reading corrupt stream due to a DataFormatException

July 02, 2021 2:24:29 AM org.apache.pdfbox.pdmodel.graphics.color.PDICCBased fallbackToAlternateColorSpace
WARNING: Can't read embedded ICC profile (null), using alternate color space: DeviceRGB

July 02, 2021 2:24:29 AM org.apache.pdfbox.filter.FlateFilter decode
SEVERE: FlateFilter: stop reading corrupt stream due to a DataFormatException

July 02, 2021 2:24:29 AM org.apache.pdfbox.pdmodel.graphics.color.PDICCBased fallbackToAlternateColorSpace
WARNING: Can't read embedded ICC profile (Unexpectedly no bytes available for read in buffer.), using alternate color space: DeviceRGB

July 02, 2021 2:24:29 AM org.apache.pdfbox.pdmodel.graphics.color.PDICCBased fallbackToAlternateColorSpace
WARNING: Can't read embedded ICC profile (4096), using alternate color space: DeviceRGB

July 02, 2021 2:24:29 AM org.apache.pdfbox.filter.FlateFilter decode
SEVERE: FlateFilter: stop reading corrupt stream due to a DataFormatException

July 02, 2021 2:24:29 AM org.apache.pdfbox.pdmodel.graphics.color.PDICCBased fallbackToAlternateColorSpace
WARNING: Can't read embedded ICC profile (java.util.zip.DataFormatException: invalid stored block lengths), using alternate color space: DeviceRGB

July 02, 2021 2:24:29 AM org.apache.pdfbox.filter.FlateFilter decode
SEVERE: FlateFilter: stop reading corrupt stream due to a DataFormatException

July 02, 2021 2:24:29 AM org.apache.pdfbox.pdmodel.graphics.color.PDICCBased fallbackToAlternateColorSpace
WARNING: Can't read embedded ICC profile (java.util.zip.DataFormatException: invalid stored block lengths), using alternate color space: DeviceRGB

July 02, 2021 2:24:29 AM org.apache.pdfbox.pdmodel.graphics.color.PDICCBased fallbackToAlternateColorSpace
WARNING: Can't read embedded ICC profile (java.util.zip.DataFormatException: invalid code lengths set), using alternate color space: DeviceRGB

July 02, 2021 2:24:29 AM org.apache.pdfbox.pdmodel.graphics.color.PDICCBased fallbackToAlternateColorSpace
WARNING: Can't read embedded ICC profile (java.util.zip.DataFormatException: invalid stored block lengths), using alternate color space: DeviceRGB

02:24:30.028 [pool-1-thread-19] ERROR c.n.backend.pdf2image.Pdf2Image - [PdfConvert] pdf file : /opt/nts/chongwudian.pdf convertToImage exception :java.lang.ArrayIndexOutOfBoundsException: null
    at java.lang.System.arraycopy(Native Method) ~[na:1.8.0_251]
    at org.apache.pdfbox.io.ScratchFileBuffer.read(ScratchFileBuffer.java:470) ~[pdf2Image1.3.jar:na]
    at org.apache.pdfbox.io.RandomAccessInputStream.read(RandomAccessInputStream.java:98) ~[pdf2Image1.3.jar:na]
    at java.io.InputStream.read(InputStream.java:101) ~[na:1.8.0_251]
    at org.apache.pdfbox.filter.FlateFilter.decompress(FlateFilter.java:112) ~[pdf2Image1.3.jar:na]
    at org.apache.pdfbox.filter.FlateFilter.decode(FlateFilter.java:50) ~[pdf2Image1.3.jar:na]
    at org.apache.pdfbox.filter.Filter.decode(Filter.java:87) ~[pdf2Image1.3.jar:na]
    at org.apache.pdfbox.cos.COSInputStream.create(COSInputStream.java:80) ~[pdf2Image1.3.jar:na]
    at org.apache.pdfbox.cos.COSStream.createInputStream(COSStream.java:175) ~[pdf2Image1.3.jar:na]
    at org.apache.pdfbox.pdmodel.common.PDStream.createInputStream(PDStream.java:243) ~[pdf2Image1.3.jar:na]
    at org.apache.pdfbox.pdmodel.graphics.image.PDImageXObject.createInputStream(PDImageXObject.java:791) ~[pdf2Image1.3.jar:na]
    at org.apache.pdfbox.pdmodel.graphics.image.SampledImageReader.from8bit(SampledImageReader.java:517) ~[pdf2Image1.3.jar:na]
    at org.apache.pdfbox.pdmodel.graphics.image.SampledImageReader.getRGBImage(SampledImageReader.java:226) ~[pdf2Image1.3.jar:na]
    at org.apache.pdfbox.pdmodel.graphics.image.PDImageXObject.getImage(PDImageXObject.java:481) ~[pdf2Image1.3.jar:na]
    at org.apache.pdfbox.pdmodel.graphics.image.PDImageXObject.getImage(PDImageXObject.java:462) ~[pdf2Image1.3.jar:na]
    at org.apache.pdfbox.rendering.PageDrawer.drawImage(PageDrawer.java:1222) ~[pdf2Image1.3.jar:na]
    at org.apache.pdfbox.contentstream.operator.graphics.DrawObject.process(DrawObject.java:67) ~[pdf2Image1.3.jar:na]
    at org.apache.pdfbox.contentstream.PDFStreamEngine.processOperator(PDFStreamEngine.java:933) ~[pdf2Image1.3.jar:na]
    at org.apache.pdfbox.contentstream.PDFStreamEngine.processStreamOperators(PDFStreamEngine.java:514) ~[pdf2Image1.3.jar:na]
    at org.apache.pdfbox.contentstream.PDFStreamEngine.processStream(PDFStreamEngine.java:492) ~[pdf2Image1.3.jar:na]
    at org.apache.pdfbox.contentstream.PDFStreamEngine.processPage(PDFStreamEngine.java:155) ~[pdf2Image1.3.jar:na]
    at org.apache.pdfbox.rendering.PageDrawer.drawPage(PageDrawer.java:277) ~[pdf2Image1.3.jar:na]
    at org.apache.pdfbox.rendering.PDFRenderer.renderImage(PDFRenderer.java:347) ~[pdf2Image1.3.jar:na]
    at org.apache.pdfbox.rendering.PDFRenderer.renderImage(PDFRenderer.java:268) ~[pdf2Image1.3.jar:na]
    at org.apache.pdfbox.rendering.PDFRenderer.renderImageWithDPI(PDFRenderer.java:254) ~[pdf2Image1.3.jar:na]
    at com.netease.backend.pdf2image.Pdf2Image$ConversionThread.call(Pdf2Image.java:197) [pdf2Image1.3.jar:na]
    at com.netease.backend.pdf2image.Pdf2Image$ConversionThread.call(Pdf2Image.java:162) [pdf2Image1.3.jar:na]
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) [na:1.8.0_251]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_251]
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) [na:1.8.0_251]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [na:1.8.0_251]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [na:1.8.0_251]
    at java.lang.Thread.run(Thread.java:748) [na:1.8.0_251]
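
For reference, the DataFormatException messages in the log above are the kind that java.util.zip raises when deflate data is malformed. The following is a minimal sketch using only the JDK; the class name CorruptFlateDemo and the hand-built byte arrays are hypothetical illustrations and are not taken from chongwudian.pdf or from PDFBox itself. It feeds the Inflater a stored block whose length field and its one's complement do not match, which is one way such an error can arise.

```java
import java.util.zip.DataFormatException;
import java.util.zip.Inflater;

// Hypothetical demo class, not part of PDFBox.
final class CorruptFlateDemo {

    // Tries to inflate the given zlib-wrapped bytes.
    // Returns the DataFormatException message, or null if no error was raised.
    static String inflateError(byte[] data) {
        Inflater inflater = new Inflater();
        inflater.setInput(data);
        byte[] out = new byte[64];
        try {
            inflater.inflate(out);
            return null;
        } catch (DataFormatException e) {
            return e.getMessage();
        } finally {
            inflater.end(); // release native zlib resources
        }
    }

    public static void main(String[] args) {
        // 0x78 0x9C: zlib header. 0x01: final block, type "stored".
        // LEN = 0x0005, but NLEN is 0x0000 instead of 0xFFFA, so the
        // stored block header is inconsistent.
        byte[] corrupt = {0x78, (byte) 0x9C, 0x01, 0x05, 0x00, 0x00, 0x00};
        System.out.println("corrupt stream: " + inflateError(corrupt));

        // Well-formed empty stream: stored block with LEN = 0, NLEN = 0xFFFF,
        // followed by the Adler-32 checksum of empty input (0x00000001).
        byte[] valid = {0x78, (byte) 0x9C, 0x01, 0x00, 0x00,
                        (byte) 0xFF, (byte) 0xFF, 0x00, 0x00, 0x00, 0x01};
        System.out.println("valid stream: " + inflateError(valid));
    }
}
```

This only shows where such messages come from at the java.util.zip level; it does not establish whether the PDF itself is damaged or whether the bytes were corrupted while being read.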

> Text part of picture converted from pdf to jpg by pdfBox2.0.23 looks very 
> unclear.
> ----------------------------------------------------------------------------------
>
>                 Key: PDFBOX-5231
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-5231
>             Project: PDFBox
>          Issue Type: Bug
>    Affects Versions: 2.0.23
>         Environment: Linux kp-kylin-04 4.19.90-17.5.ky10.aarch64 #1 SMP Fri 
> Aug 7 13:35:33 CST 2020 aarch64 GNU/Linux
> Java(TM) SE Runtime Environment (build 1.8.0_251-b08)
>            Reporter: LuoWeiWei
>            Priority: Blocker
>         Attachments: private.png
>
>
> *The text in the converted JPG looks particularly unclear, as in the attached 
> picture.*
> *The PDF URL (PS: I can't upload the PDF as an attachment):*
> http://nos.netease.com/nts-bucket-output/chongwudian.pdf
> *When I use the same jar to convert in my local environment and on another 
> server, the output is clear.*
> *So I suspect it may be caused by the system environment. The system info:*
> _Linux kp-kylin-04 4.19.90-17.5.ky10.aarch64 #1 SMP Fri Aug 7 13:35:33 CST 
> 2020 aarch64 GNU/Linux_
> Java(TM) SE Runtime Environment (build 1.8.0_251-b08)
> *And there are some warning messages:*
> _July 01, 2021 6:11:46 AM org.apache.pdfbox.pdmodel.font.PDType0Font 
> toUnicode_
> _Warning: No Unicode mapping for CID+3108 (3108) in font 
> SPIATY+FZLTHJW--GB1-0_
> _July 01, 2021 6:11:46 AM org.apache.pdfbox.pdmodel.font.PDType0Font 
> toUnicode_
> _Warning: No Unicode mapping for CID+3108 (3108) in font 
> SPIATY+FZLTHJW--GB1-0_
> _July 01, 2021 6:11:46 AM org.apache.pdfbox.pdmodel.font.PDType0Font 
> toUnicode_
> _Warning: No Unicode mapping for CID+3108 (3108) in font 
> SPIATY+FZLTHJW--GB1-0_



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
