Nicholas DiPiazza created TIKA-3220:
---------------------------------------
Summary: ForkParser displays incorrect message when parse timeout is reached
Key: TIKA-3220
URL: https://issues.apache.org/jira/browse/TIKA-3220
Project: Tika
Issue Type: Bug
Reporter: Nicholas DiPiazza
Build this ForkParser example:
https://github.com/nddipiazza/tika-fork-parser-example
but change the server wait timeout to 10 seconds:
{code}
forkParser.setServerWaitTimeoutMillis(10000);
{code}
Now run it against the following openly licensed XLS file:
https://public.opendatasoft.com/explore/dataset/activite-epidemique-covid-19-departement-france/download/?format=xls&timezone=America/Chicago&lang=en&use_labels_for_header=true
Expected result:
Parsing stops once the maximum time is reached, and the bytes parsed so far are returned.
Actual result:
{code}
/home/ndipiazza/lucidworks/tika-fork-parser-example/tika-fork-main/build/dist
/home/ndipiazza/Downloads/coronavirus-tranche-age-urgences-sosmedecins-dep-france.xls
{code}
Instead, you get the following error message:
{code}
Exception in thread "main" org.apache.tika.exception.TikaException: Could not parse
    at org.apache.tika.client.CollectingParser.parseInternal(CollectingParser.java:104)
    at org.apache.tika.client.CollectingParser.parse(CollectingParser.java:70)
    at org.apache.tika.client.TikaForkExample.main(TikaForkExample.java:49)
Caused by: org.apache.tika.exception.TikaException: Failed to communicate with a forked parser process. The process has most likely crashed due to some error like running out of memory. A new process will be started for the next parsing request.
    at org.apache.tika.fork.ForkParser.parse(ForkParser.java:275)
    at org.apache.tika.client.CollectingParser.parseInternal(CollectingParser.java:101)
    ... 2 more
Caused by: java.io.IOException: Lost connection to a forked server process
    at org.apache.tika.fork.ForkClient.waitForResponse(ForkClient.java:284)
    at org.apache.tika.fork.ForkClient.call(ForkClient.java:209)
    at org.apache.tika.fork.ForkParser.parse(ForkParser.java:267)
    ... 3 more
{code}
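To make the expected behaviour concrete, here is a minimal sketch of a caller that tells a timeout apart from a genuine failure and still returns whatever was collected before the deadline. It is not Tika's ForkParser implementation: it runs the work on an in-process thread rather than a forked JVM, and the names TimeoutAwareParse, ChunkProducer, Result and parseWithTimeout are hypothetical, introduced only for illustration.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Consumer;

// Hypothetical illustration only; none of these names exist in Tika.
public class TimeoutAwareParse {

    /** Outcome of a parse attempt: text collected so far plus a status message. */
    public static final class Result {
        public final String text;
        public final boolean timedOut;
        public final String message;

        Result(String text, boolean timedOut, String message) {
            this.text = text;
            this.timedOut = timedOut;
            this.message = message;
        }
    }

    /** Stand-in for a parser: pushes extracted chunks into the sink. */
    public interface ChunkProducer {
        void produce(Consumer<String> sink) throws Exception;
    }

    public static Result parseWithTimeout(ChunkProducer producer, long timeoutMillis) {
        // StringBuffer is synchronized, so the caller can read it while the worker appends.
        StringBuffer collected = new StringBuffer();
        ExecutorService executor = Executors.newSingleThreadExecutor();
        Future<?> future = executor.submit(() -> {
            producer.produce(chunk -> collected.append(chunk));
            return null;
        });
        try {
            future.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return new Result(collected.toString(), false, "Parse completed");
        } catch (TimeoutException e) {
            // Timeout: report it as a timeout and keep the partial content.
            future.cancel(true);
            return new Result(collected.toString(), true,
                    "Parse timed out after " + timeoutMillis + " ms; returning partial content");
        } catch (ExecutionException e) {
            // Genuine failure inside the worker: a different message from the timeout case.
            return new Result(collected.toString(), false, "Parse failed: " + e.getCause());
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return new Result(collected.toString(), false, "Interrupted while waiting for parse");
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) {
        Result r = parseWithTimeout(sink -> {
            sink.accept("first chunk;");
            Thread.sleep(5_000); // simulates a parse that runs past the deadline
            sink.accept("never reached");
        }, 200);
        System.out.println(r.message);
        System.out.println(r.text);
    }
}
```

The point of the sketch is only the separation of the two catch branches: a timeout produces a timeout-specific message and partial output, while a crash-style failure is reported separately.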