[ https://issues.apache.org/jira/browse/PDFBOX-2607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Holger Floerke updated PDFBOX-2607:
-----------------------------------
    Description: 
Hi,

I am trying to extract an image from the attached PDF. PDF viewers such as 
"Acrobat Reader" or the Ubuntu "Document Viewer" display the PDF correctly, 
but PDFBox throws an exception:
"""
SEVERE: Can't read the embedded Type1 font GLCNUS+StempelGaramond-Roman
java.io.IOException: Invalid start of ASCII segment
        at org.apache.fontbox.type1.Type1Parser.parseASCII(Type1Parser.java:83)
        at org.apache.fontbox.type1.Type1Parser.parse(Type1Parser.java:61)
        at org.apache.fontbox.type1.Type1Font.createWithSegments(Type1Font.java:70)
        at org.apache.pdfbox.pdmodel.font.PDType1Font.<init>(PDType1Font.java:174)
        at org.apache.pdfbox.pdmodel.font.PDFontFactory.createFont(PDFontFactory.java:65)
        at org.apache.pdfbox.pdmodel.PDResources.getFont(PDResources.java:92)
        at org.apache.pdfbox.contentstream.operator.text.SetFontAndSize.process(SetFontAndSize.java:50)
        at org.apache.pdfbox.contentstream.PDFStreamEngine.processOperator(PDFStreamEngine.java:803)
        at org.apache.pdfbox.contentstream.PDFStreamEngine.processStreamOperators(PDFStreamEngine.java:465)
        at org.apache.pdfbox.contentstream.PDFStreamEngine.processStream(PDFStreamEngine.java:439)
        at org.apache.pdfbox.contentstream.PDFStreamEngine.processPage(PDFStreamEngine.java:149)
        at org.apache.pdfbox.tools.ExtractImages$ImageGraphicsEngine.run(ExtractImages.java:195)
        at org.apache.pdfbox.tools.ExtractImages.extract(ExtractImages.java:174)
        at org.apache.pdfbox.tools.ExtractImages.run(ExtractImages.java:139)
        at org.apache.pdfbox.tools.ExtractImages.main(ExtractImages.java:83)
        at org.apache.pdfbox.tools.PDFBox.main(PDFBox.java:59)
"""

Checked with the latest version from git.
"""
java -jar pdfbox-app-2.0.0-SNAPSHOT.jar ExtractImages /home/hf/Downloads/0023-4834_t1_1.pdf
"""
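
For context, the message "Invalid start of ASCII segment" suggests that the parser rejects the first bytes of the font's clear-text portion. Type 1 font programs conventionally begin with "%!" (for example "%!PS-AdobeFont-1.0"). The following is only a rough sketch of such a check, not the FontBox code: the class and method names are hypothetical, and the "bad" sample with leading whitespace is an invented illustration, not the content of the attached file.

```java
import java.nio.charset.StandardCharsets;

public class Type1SegmentCheck {

    // Hypothetical helper (not part of PDFBox/FontBox):
    // returns true if the segment begins with the "%!" marker.
    public static boolean startsLikeType1(byte[] segment) {
        return segment.length >= 2 && segment[0] == '%' && segment[1] == '!';
    }

    public static void main(String[] args) {
        // A conventional Type 1 header line.
        byte[] good = "%!PS-AdobeFont-1.0: Example 001.000"
                .getBytes(StandardCharsets.US_ASCII);
        // Invented example: the same header preceded by a line break.
        byte[] bad = "\r\n%!PS-AdobeFont-1.0"
                .getBytes(StandardCharsets.US_ASCII);

        System.out.println(startsLikeType1(good));
        System.out.println(startsLikeType1(bad));
    }
}
```

A strict check of this kind would reject a font whose data has any bytes before the marker, even if other viewers tolerate them. Whether that is what happens with the attached file has not been verified.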


  was:
Hi,

I am trying to extract an image from the attached PDF. PDF viewers such as 
"Acrobat Reader" or the Ubuntu "Document Viewer" display the PDF correctly, 
but PDFBox throws an exception:
"""
SEVERE: Can't read the embedded Type1 font GLCNUS+StempelGaramond-Roman
java.io.IOException: Invalid start of ASCII segment
        at org.apache.fontbox.type1.Type1Parser.parseASCII(Type1Parser.java:83)
        at org.apache.fontbox.type1.Type1Parser.parse(Type1Parser.java:61)
        at org.apache.fontbox.type1.Type1Font.createWithSegments(Type1Font.java:70)
        at org.apache.pdfbox.pdmodel.font.PDType1Font.<init>(PDType1Font.java:174)
        at org.apache.pdfbox.pdmodel.font.PDFontFactory.createFont(PDFontFactory.java:65)
        at org.apache.pdfbox.pdmodel.PDResources.getFont(PDResources.java:92)
        at org.apache.pdfbox.contentstream.operator.text.SetFontAndSize.process(SetFontAndSize.java:50)
        at org.apache.pdfbox.contentstream.PDFStreamEngine.processOperator(PDFStreamEngine.java:803)
        at org.apache.pdfbox.contentstream.PDFStreamEngine.processStreamOperators(PDFStreamEngine.java:465)
        at org.apache.pdfbox.contentstream.PDFStreamEngine.processStream(PDFStreamEngine.java:439)
        at org.apache.pdfbox.contentstream.PDFStreamEngine.processPage(PDFStreamEngine.java:149)
        at org.apache.pdfbox.tools.ExtractImages$ImageGraphicsEngine.run(ExtractImages.java:195)
        at org.apache.pdfbox.tools.ExtractImages.extract(ExtractImages.java:174)
        at org.apache.pdfbox.tools.ExtractImages.run(ExtractImages.java:139)
        at org.apache.pdfbox.tools.ExtractImages.main(ExtractImages.java:83)
        at org.apache.pdfbox.tools.PDFBox.main(PDFBox.java:59)
"""

Checked with the latest version from git.
.



> Failed reading embedded Font
> ----------------------------
>
>                 Key: PDFBOX-2607
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-2607
>             Project: PDFBox
>          Issue Type: Bug
>          Components: FontBox
>            Reporter: Holger Floerke
>         Attachments: 0023-4834_t1_1.pdf



