Hi,
When parsing some PDF files we see different errors.
One is an OutOfMemoryError during PDF parsing, and another is:

WARN      - Could not read embedded TTF for font ABCDEE+Segoe UI,BoldItalic
java.io.IOException: Kerning sub-table too short, got 0 bytes, expect 6 or more.
at org.apache.fontbox.ttf.KerningSubtable.readSubtable0(KerningSubtable.java:191)
at org.apache.fontbox.ttf.KerningSubtable.read(KerningSubtable.java:70)
at org.apache.fontbox.ttf.KerningTable.read(KerningTable.java:80)
at org.apache.fontbox.ttf.TrueTypeFont.readTable(TrueTypeFont.java:353)
at org.apache.fontbox.ttf.TTFParser.parseTables(TTFParser.java:173)
at org.apache.fontbox.ttf.TTFParser.parse(TTFParser.java:150)
at org.apache.fontbox.ttf.TTFParser.parse(TTFParser.java:106)
at org.apache.pdfbox.pdmodel.font.PDTrueTypeFont.<init>(PDTrueTypeFont.java:198)
at org.apache.pdfbox.pdmodel.font.PDFontFactory.createFont(PDFontFactory.java:75)
at org.apache.pdfbox.pdmodel.PDResources.getFont(PDResources.java:146)
at org.apache.pdfbox.contentstream.operator.text.SetFontAndSize.process(SetFontAndSize.java:60)
at org.apache.pdfbox.contentstream.PDFStreamEngine.processOperator(PDFStreamEngine.java:869)
at org.apache.pdfbox.contentstream.PDFStreamEngine.processStreamOperators(PDFStreamEngine.java:505)
at org.apache.pdfbox.contentstream.PDFStreamEngine.processStream(PDFStreamEngine.java:479)
at org.apache.pdfbox.contentstream.PDFStreamEngine.processPage(PDFStreamEngine.java:152)
at org.apache.pdfbox.text.LegacyPDFStreamEngine.processPage(LegacyPDFStreamEngine.java:139)
at org.apache.pdfbox.text.PDFTextStripper.processPage(PDFTextStripper.java:391)
at org.apache.tika.parser.pdf.PDF2XHTML.processPage(PDF2XHTML.java:153)
at org.apache.tika.parser.pdf.AbstractPDF2XHTML.processPages(AbstractPDF2XHTML.java:835)
at org.apache.pdfbox.text.PDFTextStripper.writeText(PDFTextStripper.java:266)
at org.apache.tika.parser.pdf.PDF2XHTML.process(PDF2XHTML.java:124)
at org.apache.tika.parser.pdf.PDFParser.parse(PDFParser.java:172)

How can I skip parsing of embedded TTF fonts inside a PDF?

Thanks
