[
https://issues.apache.org/jira/browse/PDFBOX-5231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17373156#comment-17373156
]
LuoWeiWei commented on PDFBOX-5231:
-----------------------------------
I tried it with version 2.0.24, but the following errors appear:
July 02, 2021 2:24:29 AM org.apache.pdfbox.filter.FlateFilter decode
SEVERE: FlateFilter: stop reading corrupt stream due to a DataFormatException
July 02, 2021 2:24:29 AM org.apache.pdfbox.pdmodel.graphics.color.PDICCBased fallbackToAlternateColorSpace
WARNING: Can't read embedded ICC profile (null), using alternate color space: DeviceRGB
July 02, 2021 2:24:29 AM org.apache.pdfbox.filter.FlateFilter decode
SEVERE: FlateFilter: stop reading corrupt stream due to a DataFormatException
July 02, 2021 2:24:29 AM org.apache.pdfbox.pdmodel.graphics.color.PDICCBased fallbackToAlternateColorSpace
WARNING: Can't read embedded ICC profile (Unexpectedly no bytes available for read in buffer.), using alternate color space: DeviceRGB
July 02, 2021 2:24:29 AM org.apache.pdfbox.pdmodel.graphics.color.PDICCBased fallbackToAlternateColorSpace
WARNING: Can't read embedded ICC profile (4096), using alternate color space: DeviceRGB
July 02, 2021 2:24:29 AM org.apache.pdfbox.filter.FlateFilter decode
SEVERE: FlateFilter: stop reading corrupt stream due to a DataFormatException
July 02, 2021 2:24:29 AM org.apache.pdfbox.pdmodel.graphics.color.PDICCBased fallbackToAlternateColorSpace
WARNING: Can't read embedded ICC profile (java.util.zip.DataFormatException: invalid stored block lengths), using alternate color space: DeviceRGB
July 02, 2021 2:24:29 AM org.apache.pdfbox.filter.FlateFilter decode
SEVERE: FlateFilter: stop reading corrupt stream due to a DataFormatException
July 02, 2021 2:24:29 AM org.apache.pdfbox.pdmodel.graphics.color.PDICCBased fallbackToAlternateColorSpace
WARNING: Can't read embedded ICC profile (java.util.zip.DataFormatException: invalid stored block lengths), using alternate color space: DeviceRGB
July 02, 2021 2:24:29 AM org.apache.pdfbox.pdmodel.graphics.color.PDICCBased fallbackToAlternateColorSpace
WARNING: Can't read embedded ICC profile (java.util.zip.DataFormatException: invalid code lengths set), using alternate color space: DeviceRGB
July 02, 2021 2:24:29 AM org.apache.pdfbox.pdmodel.graphics.color.PDICCBased fallbackToAlternateColorSpace
WARNING: Can't read embedded ICC profile (java.util.zip.DataFormatException: invalid stored block lengths), using alternate color space: DeviceRGB
02:24:30.028 [pool-1-thread-19] ERROR c.n.backend.pdf2image.Pdf2Image - [PdfConvert] pdf file : /opt/nts/chongwudian.pdf convertToImage exception :java.lang.ArrayIndexOutOfBoundsException: null
    at java.lang.System.arraycopy(Native Method) ~[na:1.8.0_251]
    at org.apache.pdfbox.io.ScratchFileBuffer.read(ScratchFileBuffer.java:470) ~[pdf2Image1.3.jar:na]
    at org.apache.pdfbox.io.RandomAccessInputStream.read(RandomAccessInputStream.java:98) ~[pdf2Image1.3.jar:na]
    at java.io.InputStream.read(InputStream.java:101) ~[na:1.8.0_251]
    at org.apache.pdfbox.filter.FlateFilter.decompress(FlateFilter.java:112) ~[pdf2Image1.3.jar:na]
    at org.apache.pdfbox.filter.FlateFilter.decode(FlateFilter.java:50) ~[pdf2Image1.3.jar:na]
    at org.apache.pdfbox.filter.Filter.decode(Filter.java:87) ~[pdf2Image1.3.jar:na]
    at org.apache.pdfbox.cos.COSInputStream.create(COSInputStream.java:80) ~[pdf2Image1.3.jar:na]
    at org.apache.pdfbox.cos.COSStream.createInputStream(COSStream.java:175) ~[pdf2Image1.3.jar:na]
    at org.apache.pdfbox.pdmodel.common.PDStream.createInputStream(PDStream.java:243) ~[pdf2Image1.3.jar:na]
    at org.apache.pdfbox.pdmodel.graphics.image.PDImageXObject.createInputStream(PDImageXObject.java:791) ~[pdf2Image1.3.jar:na]
    at org.apache.pdfbox.pdmodel.graphics.image.SampledImageReader.from8bit(SampledImageReader.java:517) ~[pdf2Image1.3.jar:na]
    at org.apache.pdfbox.pdmodel.graphics.image.SampledImageReader.getRGBImage(SampledImageReader.java:226) ~[pdf2Image1.3.jar:na]
    at org.apache.pdfbox.pdmodel.graphics.image.PDImageXObject.getImage(PDImageXObject.java:481) ~[pdf2Image1.3.jar:na]
    at org.apache.pdfbox.pdmodel.graphics.image.PDImageXObject.getImage(PDImageXObject.java:462) ~[pdf2Image1.3.jar:na]
    at org.apache.pdfbox.rendering.PageDrawer.drawImage(PageDrawer.java:1222) ~[pdf2Image1.3.jar:na]
    at org.apache.pdfbox.contentstream.operator.graphics.DrawObject.process(DrawObject.java:67) ~[pdf2Image1.3.jar:na]
    at org.apache.pdfbox.contentstream.PDFStreamEngine.processOperator(PDFStreamEngine.java:933) ~[pdf2Image1.3.jar:na]
    at org.apache.pdfbox.contentstream.PDFStreamEngine.processStreamOperators(PDFStreamEngine.java:514) ~[pdf2Image1.3.jar:na]
    at org.apache.pdfbox.contentstream.PDFStreamEngine.processStream(PDFStreamEngine.java:492) ~[pdf2Image1.3.jar:na]
    at org.apache.pdfbox.contentstream.PDFStreamEngine.processPage(PDFStreamEngine.java:155) ~[pdf2Image1.3.jar:na]
    at org.apache.pdfbox.rendering.PageDrawer.drawPage(PageDrawer.java:277) ~[pdf2Image1.3.jar:na]
    at org.apache.pdfbox.rendering.PDFRenderer.renderImage(PDFRenderer.java:347) ~[pdf2Image1.3.jar:na]
    at org.apache.pdfbox.rendering.PDFRenderer.renderImage(PDFRenderer.java:268) ~[pdf2Image1.3.jar:na]
    at org.apache.pdfbox.rendering.PDFRenderer.renderImageWithDPI(PDFRenderer.java:254) ~[pdf2Image1.3.jar:na]
    at com.netease.backend.pdf2image.Pdf2Image$ConversionThread.call(Pdf2Image.java:197) [pdf2Image1.3.jar:na]
    at com.netease.backend.pdf2image.Pdf2Image$ConversionThread.call(Pdf2Image.java:162) [pdf2Image1.3.jar:na]
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) [na:1.8.0_251]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_251]
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) [na:1.8.0_251]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [na:1.8.0_251]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [na:1.8.0_251]
    at java.lang.Thread.run(Thread.java:748) [na:1.8.0_251]
> Text part of picture converted from pdf to jpg by pdfBox2.0.23 looks very
> unclear.
> ----------------------------------------------------------------------------------
>
> Key: PDFBOX-5231
> URL: https://issues.apache.org/jira/browse/PDFBOX-5231
> Project: PDFBox
> Issue Type: Bug
> Affects Versions: 2.0.23
> Environment: Linux kp-kylin-04 4.19.90-17.5.ky10.aarch64 #1 SMP Fri
> Aug 7 13:35:33 CST 2020 aarch64 GNU/Linux
> Java(TM) SE Runtime Environment (build 1.8.0_251-b08)
> Reporter: LuoWeiWei
> Priority: Blocker
> Attachments: private.png
>
>
> *The text in the converted JPG looks particularly unclear, as in the attached picture.*
> *PDF URL (I can't upload the PDF as an attachment):*
> http://nos.netease.com/nts-bucket-output/chongwudian.pdf
>
> *When I use the same jar to convert in my local environment and on another server,
> the output is clear.*
> *So I suspect it is caused by the system environment. The system info is:*
> _Linux kp-kylin-04 4.19.90-17.5.ky10.aarch64 #1 SMP Fri Aug 7 13:35:33 CST
> 2020 aarch64 GNU/Linux_
> Java(TM) SE Runtime Environment (build 1.8.0_251-b08)
> *And there are some warning messages:*
> _July 01, 2021 6:11:46 AM org.apache.pdfbox.pdmodel.font.PDType0Font
> toUnicode_
> _Warning: No Unicode mapping for CID+3108 (3108) in font
> SPIATY+FZLTHJW--GB1-0_
> _July 01, 2021 6:11:46 AM org.apache.pdfbox.pdmodel.font.PDType0Font
> toUnicode_
> _Warning: No Unicode mapping for CID+3108 (3108) in font
> SPIATY+FZLTHJW--GB1-0_
> _July 01, 2021 6:11:46 AM org.apache.pdfbox.pdmodel.font.PDType0Font
> toUnicode_
> _Warning: No Unicode mapping for CID+3108 (3108) in font
> SPIATY+FZLTHJW--GB1-0_