[
https://issues.apache.org/jira/browse/PDFBOX-2607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
John Hewson updated PDFBOX-2607:
--------------------------------
Description:
Hi,
I try to extract an image out of the attatched pdf. PDFViewer like "Acrobat
Reader" or the Ubuntu "Document Viewer" are able to display the PDF in a
correct way. pdfbox is throwing exception:
{code}
SCHWERWIEGEND: Can't read the embedded Type1 font GLCNUS+StempelGaramond-Roman
java.io.IOException: Invalid start of ASCII segment
at org.apache.fontbox.type1.Type1Parser.parseASCII(Type1Parser.java:83)
at org.apache.fontbox.type1.Type1Parser.parse(Type1Parser.java:61)
at
org.apache.fontbox.type1.Type1Font.createWithSegments(Type1Font.java:70)
at
org.apache.pdfbox.pdmodel.font.PDType1Font.<init>(PDType1Font.java:174)
at
org.apache.pdfbox.pdmodel.font.PDFontFactory.createFont(PDFontFactory.java:65)
at org.apache.pdfbox.pdmodel.PDResources.getFont(PDResources.java:92)
at
org.apache.pdfbox.contentstream.operator.text.SetFontAndSize.process(SetFontAndSize.java:50)
at
org.apache.pdfbox.contentstream.PDFStreamEngine.processOperator(PDFStreamEngine.java:803)
at
org.apache.pdfbox.contentstream.PDFStreamEngine.processStreamOperators(PDFStreamEngine.java:465)
at
org.apache.pdfbox.contentstream.PDFStreamEngine.processStream(PDFStreamEngine.java:439)
at
org.apache.pdfbox.contentstream.PDFStreamEngine.processPage(PDFStreamEngine.java:149)
at
org.apache.pdfbox.tools.ExtractImages$ImageGraphicsEngine.run(ExtractImages.java:195)
at org.apache.pdfbox.tools.ExtractImages.extract(ExtractImages.java:174)
at org.apache.pdfbox.tools.ExtractImages.run(ExtractImages.java:139)
at org.apache.pdfbox.tools.ExtractImages.main(ExtractImages.java:83)
at org.apache.pdfbox.tools.PDFBox.main(PDFBox.java:59)
{code}
Checked with the latest version from git.
{code}
java -jar pdfbox-app-2.0.0-SNAPSHOT.jar ExtractImages
/home/hf/Downloads/0023-4834_t1_1.pdf
{code}
was:
Hi,
I try to extract an image out of the attatched pdf. PDFViewer like "Acrobat
Reader" or the Ubuntu "Document Viewer" are able to display the PDF in a
correct way. pdfbox is throwing exception:
"""
SCHWERWIEGEND: Can't read the embedded Type1 font GLCNUS+StempelGaramond-Roman
java.io.IOException: Invalid start of ASCII segment
at org.apache.fontbox.type1.Type1Parser.parseASCII(Type1Parser.java:83)
at org.apache.fontbox.type1.Type1Parser.parse(Type1Parser.java:61)
at
org.apache.fontbox.type1.Type1Font.createWithSegments(Type1Font.java:70)
at
org.apache.pdfbox.pdmodel.font.PDType1Font.<init>(PDType1Font.java:174)
at
org.apache.pdfbox.pdmodel.font.PDFontFactory.createFont(PDFontFactory.java:65)
at org.apache.pdfbox.pdmodel.PDResources.getFont(PDResources.java:92)
at
org.apache.pdfbox.contentstream.operator.text.SetFontAndSize.process(SetFontAndSize.java:50)
at
org.apache.pdfbox.contentstream.PDFStreamEngine.processOperator(PDFStreamEngine.java:803)
at
org.apache.pdfbox.contentstream.PDFStreamEngine.processStreamOperators(PDFStreamEngine.java:465)
at
org.apache.pdfbox.contentstream.PDFStreamEngine.processStream(PDFStreamEngine.java:439)
at
org.apache.pdfbox.contentstream.PDFStreamEngine.processPage(PDFStreamEngine.java:149)
at
org.apache.pdfbox.tools.ExtractImages$ImageGraphicsEngine.run(ExtractImages.java:195)
at org.apache.pdfbox.tools.ExtractImages.extract(ExtractImages.java:174)
at org.apache.pdfbox.tools.ExtractImages.run(ExtractImages.java:139)
at org.apache.pdfbox.tools.ExtractImages.main(ExtractImages.java:83)
at org.apache.pdfbox.tools.PDFBox.main(PDFBox.java:59)
"""
Checked with the latest version from git.
"""
java -jar pdfbox-app-2.0.0-SNAPSHOT.jar ExtractImages
/home/hf/Downloads/0023-4834_t1_1.pdf
"""
> Failed reading embedded Font
> ----------------------------
>
> Key: PDFBOX-2607
> URL: https://issues.apache.org/jira/browse/PDFBOX-2607
> Project: PDFBox
> Issue Type: Bug
> Components: FontBox
> Reporter: Holger Floerke
> Attachments: 0023-4834_t1_1.pdf
>
>
> Hi,
> I try to extract an image out of the attatched pdf. PDFViewer like "Acrobat
> Reader" or the Ubuntu "Document Viewer" are able to display the PDF in a
> correct way. pdfbox is throwing exception:
> {code}
> SCHWERWIEGEND: Can't read the embedded Type1 font GLCNUS+StempelGaramond-Roman
> java.io.IOException: Invalid start of ASCII segment
> at org.apache.fontbox.type1.Type1Parser.parseASCII(Type1Parser.java:83)
> at org.apache.fontbox.type1.Type1Parser.parse(Type1Parser.java:61)
> at
> org.apache.fontbox.type1.Type1Font.createWithSegments(Type1Font.java:70)
> at
> org.apache.pdfbox.pdmodel.font.PDType1Font.<init>(PDType1Font.java:174)
> at
> org.apache.pdfbox.pdmodel.font.PDFontFactory.createFont(PDFontFactory.java:65)
> at org.apache.pdfbox.pdmodel.PDResources.getFont(PDResources.java:92)
> at
> org.apache.pdfbox.contentstream.operator.text.SetFontAndSize.process(SetFontAndSize.java:50)
> at
> org.apache.pdfbox.contentstream.PDFStreamEngine.processOperator(PDFStreamEngine.java:803)
> at
> org.apache.pdfbox.contentstream.PDFStreamEngine.processStreamOperators(PDFStreamEngine.java:465)
> at
> org.apache.pdfbox.contentstream.PDFStreamEngine.processStream(PDFStreamEngine.java:439)
> at
> org.apache.pdfbox.contentstream.PDFStreamEngine.processPage(PDFStreamEngine.java:149)
> at
> org.apache.pdfbox.tools.ExtractImages$ImageGraphicsEngine.run(ExtractImages.java:195)
> at org.apache.pdfbox.tools.ExtractImages.extract(ExtractImages.java:174)
> at org.apache.pdfbox.tools.ExtractImages.run(ExtractImages.java:139)
> at org.apache.pdfbox.tools.ExtractImages.main(ExtractImages.java:83)
> at org.apache.pdfbox.tools.PDFBox.main(PDFBox.java:59)
> {code}
> Checked with the latest version from git.
> {code}
> java -jar pdfbox-app-2.0.0-SNAPSHOT.jar ExtractImages
> /home/hf/Downloads/0023-4834_t1_1.pdf
> {code}
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)