[ https://issues.apache.org/jira/browse/TIKA-2046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15401089#comment-15401089 ]
Nick Burch commented on TIKA-2046:
----------------------------------

Can you try following the steps in https://wiki.apache.org/tika/Troubleshooting%20Tika#PDF_Text_Problems to see if it's a Tika issue or an Apache PDFBox one?

> Can not read PDF correctly
> --------------------------
>
>                 Key: TIKA-2046
>                 URL: https://issues.apache.org/jira/browse/TIKA-2046
>             Project: Tika
>          Issue Type: Bug
>          Components: core, detector, handler, languageidentifier, metadata, parser
>    Affects Versions: 1.13
>            Reporter: gopalbhalala
>            Priority: Critical
>
> Hi Team,
> I have two PDFs in the Gujarati language that use different fonts: the first uses the Shruti font and the second uses the LMG-RUPE font. The Tika parser reads the Shruti PDF correctly and gives the correct output, but the LMG-RUPE PDF gives the wrong output. The metadata is the same for both PDFs.
> 1) drive.google.com/open?id=0B4Sse_x7pvrqRnRETzNsUk1BY0k (Shruti font)
> 2) drive.google.com/open?id=0B4Sse_x7pvrqVC0zb (LMG-RUPE font)
> Thanks

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)