[ 
https://issues.apache.org/jira/browse/CONNECTORS-1778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18023105#comment-18023105
 ] 

mbiso commented on CONNECTORS-1778:
-----------------------------------

I think that, after each crash, Tika becomes available again, and ManifoldCF 
probably detects the crash and retries processing the files until the time 
specified in the attachment ErrorManifoldCF.jpg.

Once the "Retry Limit" time has passed, it reports this error for the job: 
Error: Repeated service interruptions - failure processing document: The target 
server failed to respond.

So, regardless of the Tika errors on the big files, I would like to know 
whether there is a way to bypass these errors in ManifoldCF, because I think 
that once the job goes into an error state it does not process any further 
files.
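
To make clear what I mean by "bypass", here is a minimal sketch of the two 
behaviours. It is not ManifoldCF code: the class, the enum and the method are 
hypothetical names of mine, and I count attempts instead of measuring the 
"Retry Limit" time only to keep the example short. With ABORT_JOB the first 
document that keeps failing stops everything (what I think happens now); with 
SKIP_DOCUMENT it is set aside and the remaining files are still processed 
(what I would like).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical illustration only; none of these names exist in ManifoldCF.
class RetrySkipSketch {
    enum OnLimit { ABORT_JOB, SKIP_DOCUMENT }

    // Tries each document up to maxAttempts times.
    // Returns the documents that were skipped, or throws when the policy
    // is ABORT_JOB and a document never succeeds.
    static List<String> run(List<String> docs, Predicate<String> extract,
                            int maxAttempts, OnLimit policy) {
        List<String> skipped = new ArrayList<>();
        for (String doc : docs) {
            boolean ok = false;
            for (int attempt = 1; attempt <= maxAttempts && !ok; attempt++) {
                ok = extract.test(doc); // false stands for "server failed to respond"
            }
            if (!ok) {
                if (policy == OnLimit.ABORT_JOB) {
                    throw new IllegalStateException(
                        "Repeated service interruptions - failure processing document: " + doc);
                }
                skipped.add(doc); // remember the failure and carry on with the next file
            }
        }
        return skipped;
    }
}
```

For example, with three files where only the second one always fails, 
SKIP_DOCUMENT would return a list containing just that file, while ABORT_JOB 
would throw when it reaches it and the third file would never be tried.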


> Error: Repeated service interruptions - failure processing document: The 
> target server failed to respond
> --------------------------------------------------------------------------------------------------------
>
>                 Key: CONNECTORS-1778
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-1778
>             Project: ManifoldCF
>          Issue Type: Bug
>          Components: Tika extractor
>    Affects Versions: ManifoldCF 2.28
>            Reporter: mbiso
>            Assignee: Piergiorgio Lucidi
>            Priority: Major
>         Attachments: ErrorManifoldCF.jpg
>
>
> Hi.
> I have a job ingesting a Windows network share.
> It uses Tika server (standalone).
> Tika logs many errors because some files cause failures like:
>  
> {code:java}
> ERROR [qtp131037934-61] 10:44:03,903 
> org.apache.poi.xssf.eventusermodel.XSSFSheetXMLHandler Failed to parse SST 
> index '1356 '
> java.lang.NumberFormatException: For input string: "1356 " {code}
> The errors cause a restart of the Tika child process, and this is reported 
> as an interruption in the ManifoldCF job.
> It ends with the message: "Error: Repeated service interruptions - failure 
> processing document: The target server failed to respond"
>  
> How could I get over this issue?
> I have opened an issue [TIKA-4494] on Tika as well, but this could be the 
> correct behaviour for Tika: many errors cause a restart of the child process, 
> so this is a problem for me.
>  
> Any suggestion?
> Thanks a lot.
> Mario Bisonti



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
