[ https://issues.apache.org/jira/browse/TIKA-2403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Boopathi updated TIKA-2403:
---------------------------
Description: We are using Elasticsearch 5.2.2 for full-text search. With the
help of the ingest node, we are able to parse the content of the file types
that Tika supports. We are facing an issue while parsing the content of some
PDF files. The content of the file is parsed successfully, but the output also
contains additional terms that are not part of the document. [sample
screenshot|https://www.screencast.com/t/AQWK9Rzvrdo8]. Kindly let me know the
reason for this and how it can be fixed. (was: We are using Elasticsearch 5.2.2
for full-text search. With the help of the ingest node, we are able to parse
the content of the file types that Tika supports. We are facing an issue while
parsing the content of the PDF file. The content of the file is parsed
successfully, but the output also contains additional terms that are not part
of the document. [sample screenshot|https://www.screencast.com/t/AQWK9Rzvrdo8].
Kindly let me know the reason for this and how it can be fixed.)
> Elasticsearch 5.2.2 - Ingest Node - PDF - Parsing Issue
> -------------------------------------------------------
>
> Key: TIKA-2403
> URL: https://issues.apache.org/jira/browse/TIKA-2403
> Project: Tika
> Issue Type: Bug
> Reporter: Boopathi
>
> We are using Elasticsearch 5.2.2 for full-text search. With the help of the
> ingest node, we are able to parse the content of the file types that Tika
> supports. We are facing an issue while parsing the content of some PDF files.
> The content of the file is parsed successfully, but the output also contains
> additional terms that are not part of the document. [sample
> screenshot|https://www.screencast.com/t/AQWK9Rzvrdo8]. Kindly let me know the
> reason for this and how it can be fixed.