[
https://issues.apache.org/jira/browse/CONNECTORS-1778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18023105#comment-18023105
]
mbiso commented on CONNECTORS-1778:
-----------------------------------
I think that, after the crashes, Tika becomes available again, and ManifoldCF
probably detects the crash and retries processing the files until the time
shown in the attachment ErrorManifoldCF.jpg.
Once the "Retry Limit" time has passed, it reports this error for the job:
Error: Repeated service interruptions - failure processing document: The target
server failed to respond.
So, leaving aside the Tika processing errors for the big files, I would like
to know whether there is a way to make ManifoldCF skip these errors, because
I think that once the job goes into error, it does not process any further
files.
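
For reference, the parse failure in the quoted log can be reproduced in
isolation. This is only a minimal sketch, assuming the index is parsed with
Integer.parseInt (an assumption, not confirmed against the POI source); the
class SstIndexDemo and its methods are hypothetical and not part of Tika, POI,
or ManifoldCF. It shows that the trailing space in "1356 " is enough to trigger
a NumberFormatException, and that trimming the value first avoids it.

```java
// Hypothetical demo class; not part of Tika, POI, or ManifoldCF.
class SstIndexDemo {

    // Strict parse: returns true when Integer.parseInt rejects the input.
    static boolean failsStrict(String raw) {
        try {
            Integer.parseInt(raw);
            return false;
        } catch (NumberFormatException e) {
            return true;
        }
    }

    // Lenient parse: strips surrounding whitespace before parsing.
    static int parseLenient(String raw) {
        return Integer.parseInt(raw.trim());
    }

    public static void main(String[] args) {
        // Value taken from the quoted log, trailing space included.
        String raw = "1356 ";
        System.out.println("strict parse fails: " + failsStrict(raw));
        System.out.println("lenient parse: " + parseLenient(raw));
    }
}
```

This only illustrates the parse error itself. It says nothing about why the
Tika child process restarts or how ManifoldCF handles the interruption.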
> Error: Repeated service interruptions - failure processing document: The
> target server failed to respond
> --------------------------------------------------------------------------------------------------------
>
> Key: CONNECTORS-1778
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1778
> Project: ManifoldCF
> Issue Type: Bug
> Components: Tika extractor
> Affects Versions: ManifoldCF 2.28
> Reporter: mbiso
> Assignee: Piergiorgio Lucidi
> Priority: Major
> Attachments: ErrorManifoldCF.jpg
>
>
> Hi.
> I have a job ingesting a Windows network share.
> It uses a standalone Tika server.
> There are many errors in Tika because some files cause errors like:
>
> {code:java}
> ERROR [qtp131037934-61] 10:44:03,903
> org.apache.poi.xssf.eventusermodel.XSSFSheetXMLHandler Failed to parse SST
> index '1356 '
> java.lang.NumberFormatException: For input string: "1356 " {code}
> These errors cause a child Tika process to restart, and this is reported as
> an interruption in the ManifoldCF job.
> It ends with the message: "Error: Repeated service interruptions - failure
> processing document: The target server failed to respond"
>
> How can I get around this issue?
> I have opened an issue [TIKA-4494] on Tika as well, but this could be the
> intended behaviour in Tika: many errors cause the child process to restart,
> which is a problem for me.
>
> Any suggestion?
> Thanks a lot.
> Mario Bisonti
--
This message was sent by Atlassian Jira
(v8.20.10#820010)