[ https://issues.apache.org/jira/browse/NUTCH-2742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

M A updated NUTCH-2742:
-----------------------
    Comment: was deleted

(was: Apologies, didn't realise that was a feature.)

> Unable to parse specific pdf file
> ---------------------------------
>
>                 Key: NUTCH-2742
>                 URL: https://issues.apache.org/jira/browse/NUTCH-2742
>             Project: Nutch
>          Issue Type: Bug
>          Components: nutchNewbie, parser
>    Affects Versions: 1.15
>            Reporter: M A
>            Priority: Minor
>
> It appears that the Tika plugin is not parsing some PDF files.
> When I dumped the segment data, there was no content.
> EDIT: See the attachments for the output and the crawl log.
>



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
