[ 
https://issues.apache.org/jira/browse/TIKA-2776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16684907#comment-16684907
 ] 

Mario Bisonti commented on TIKA-2776:
-------------------------------------

Hello Tim.
On the ManifoldCF side, I see the following in the log:
org.apache.manifoldcf.core.interfaces.ManifoldCFException: Repeated service interruptions - failure processing document: The target server failed to respond
        at org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:489) [mcf-pull-agent.jar:?]
Caused by: org.apache.http.NoHttpResponseException: The target server failed to respond
        at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:141) ~[httpclient-4.5.6.jar:4.5.6]
        at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:56) ~[httpclient-4.5.6.jar:4.5.6]
        at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:259) ~[httpcore-4.4.10.jar:4.4.10]
        at org.apache.http.impl.DefaultBHttpClientConnection.receiveResponseHeader(DefaultBHttpClientConnection.java:163) ~[httpcore-4.4.10.jar:4.4.10]
        at org.apache.http.impl.conn.CPoolProxy.receiveResponseHeader(CPoolProxy.java:165) ~[httpclient-4.5.6.jar:4.5.6]
        at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:273) ~[httpcore-4.4.10.jar:4.4.10]
        at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:125) ~[httpcore-4.4.10.jar:4.4.10]
        at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:272) ~[httpclient-4.5.6.jar:4.5.6]
        at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:185) ~[httpclient-4.5.6.jar:4.5.6]
        at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:110) ~[httpclient-4.5.6.jar:4.5.6]
        at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185) ~[httpclient-4.5.6.jar:4.5.6]
        at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:118) ~[httpclient-4.5.6.jar:4.5.6]
        at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:56) ~[httpclient-4.5.6.jar:4.5.6]
        at org.apache.manifoldcf.agents.transformation.tikaservice.TikaExtractor.addOrReplaceDocumentWithException(TikaExtractor.java:608) ~[?:?]
        at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester$PipelineAddEntryPoint.addOrReplaceDocumentWithException(IncrementalIngester.java:3226) ~[mcf-agents.jar:?]
        at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester$PipelineAddFanout.sendDocument(IncrementalIngester.java:3077) ~[mcf-agents.jar:?]
        at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester$MonitoredAddActivityWrapper.sendDocument(IncrementalIngester.java:3471) ~[mcf-agents.jar:?]
        at org.apache.manifoldcf.agents.transformation.documentfilter.DocumentFilter.addOrReplaceDocumentWithException(DocumentFilter.java:208) ~[?:?]
        at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester$PipelineAddEntryPoint.addOrReplaceDocumentWithException(IncrementalIngester.java:3226) ~[mcf-agents.jar:?]
        at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester$PipelineAddFanout.sendDocument(IncrementalIngester.java:3077) ~[mcf-agents.jar:?]
        at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester$PipelineObjectWithVersions.addOrReplaceDocumentWithException(IncrementalIngester.java:2708) ~[mcf-agents.jar:?]
        at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.documentIngest(IncrementalIngester.java:756) ~[mcf-agents.jar:?]
        at org.apache.manifoldcf.crawler.system.WorkerThread$ProcessActivity.ingestDocumentWithException(WorkerThread.java:1583) ~[mcf-pull-agent.jar:?]
        at org.apache.manifoldcf.crawler.system.WorkerThread$ProcessActivity.ingestDocumentWithException(WorkerThread.java:1548) ~[mcf-pull-agent.jar:?]
        at org.apache.manifoldcf.crawler.connectors.sharedrive.SharedDriveConnector.processDocuments(SharedDriveConnector.java:939) ~[?:?]
        at org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:399) ~[mcf-pull-agent.jar:?]
 WARN 2018-11-13T09:50:58,546 (Worker thread '48') - Service interruption reported for job 1533797717712 connection 'WinShare': Tika down, retrying: Connect to localhost:9998 [localhost/127.0.0.1, localhost/0:0:0:0:0:0:0:1] failed: Connection refused (Connection refused)
 WARN 2018-11-13T09:50:58,606 (Worker thread '34') - Service interruption reported for job 1533797717712 connection 'WinShare': Tika down, retrying: Connect to localhost:9998 [localhost/127.0.0.1, localhost/0:0:0:0:0:0:0:1] failed: Connection refused (Connection refused)
 WARN 2018-11-13T09:50:58,947 (Worker thread '0') - Service interruption reported for job 1533797717712 connection 'WinShare': Tika down, retrying: Connect to localhost:9998 [localhost/127.0.0.1, localhost/0:0:0:0:0:0:0:1] failed: Connection refused (Connection refused)
 WARN 2018-11-13T09:50:58,948 (Worker thread '61') - Service interruption reported for job 1533797717712 connection 'WinShare': Tika down, retrying: Connect to localhost:9998 [localhost/127.0.0.1, localhost/0:0:0:0:0:0:0:1] failed: Connection refused (Connection refused)
 WARN 2018-11-13T09:50:58,947 (Worker thread '10') - Service interruption reported for job 1533797717712 connection 'WinShare': Tika down, retrying: Connect to localhost:9998 [localhost/127.0.0.1, localhost/0:0:0:0:0:0:0:1] failed: Connection refused (Connection refused)
 WARN 2018-11-13T09:50:59,344 (Worker thread '5') - Service interruption reported for job 1533797717712 connection 'WinShare': Tika down, retrying: Connect to localhost:9998 [localhost/127.0.0.1, localhost/0:0:0:0:0:0:0:1] failed: Connection refused (Connection refused)

I see that the Tika server restarts the child process correctly, but I have 
no log output from the Tika server.

This makes it very difficult for me to investigate why the Tika server child 
process is restarted or crashes.

Is there any way to enable logging for the Tika server?


Thanks a lot
Mario

> Tika server child restart
> -------------------------
>
>                 Key: TIKA-2776
>                 URL: https://issues.apache.org/jira/browse/TIKA-2776
>             Project: Tika
>          Issue Type: Bug
>            Reporter: Mario Bisonti
>            Priority: Major
>
> Hello.
> I run the standalone Tika server, started with the option:
> java -jar /opt/tika/tika-server-1.19.1.jar -spawnChild
> I use ManifoldCF and Solr to index files using the Tika server.
> Indexing fails repeatedly because I get many errors such as:
> Tika down, retrying: Connection reset
> etc.
> I suspect that, when the process is restarted, the client fails as 
> described here:
> _If the child process is in the process of shutting down, and it gets a new 
> request it will return 503 -- Service Unavailable. If the server times out on 
> a file, the client will receive an IOException from the closed socket. Note 
> that all other files that are being processed will end with an IOException 
> from a closed socket when the child process shuts down; e.g. if you send 
> three files to tika-server concurrently, and one of them causes a 
> catastrophic problem requiring the child to shut down, you won't be able to 
> tell which file caused the problems. In the future, we may implement a 
> gentler shutdown than we currently have._
> as documented at https://wiki.apache.org/tika/TikaJAXRS
> How can I work around this?
> Thanks a lot
> Mario
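
The quoted wiki text says that clients see an IOException from the closed socket while the child process shuts down. One generic way for a client to tolerate such a restart is to retry the request with a growing delay. The following is only a minimal sketch of that idea, not what ManifoldCF or Tika actually do: the class RetrySketch and the method withRetry are hypothetical names, and it assumes the failing call surfaces as a java.io.IOException (which NoHttpResponseException and connection-refused errors are). A 503 response would have to be turned into an IOException by the caller to be retried the same way.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

// Hypothetical helper, for illustration only; not part of Tika or ManifoldCF.
public class RetrySketch {

    /**
     * Runs the given call and retries it when it fails with an IOException,
     * doubling the wait between attempts. Other exceptions are not retried.
     */
    public static <T> T withRetry(Callable<T> call, int maxAttempts, long initialDelayMs)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        long delayMs = initialDelayMs;
        IOException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return call.call();
            } catch (IOException e) {
                // Typical during a child restart: closed socket, no response,
                // or connection refused while the server comes back up.
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(delayMs);
                    delayMs *= 2; // exponential backoff
                }
            }
        }
        // All attempts failed with an IOException; rethrow the last one.
        throw last;
    }
}
```

A caller would wrap the request to the server, for example withRetry(() -> sendToTika(file), 5, 1000L), where sendToTika is again a hypothetical method. Retrying does not identify which file caused the restart, so a file that fails on every attempt is a candidate for being processed on its own.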



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
