[
https://issues.apache.org/jira/browse/TIKA-2776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16684907#comment-16684907
]
Mario Bisonti commented on TIKA-2776:
-------------------------------------
Hello Tim,
On the ManifoldCF side, I see the following in the log:
org.apache.manifoldcf.core.interfaces.ManifoldCFException: Repeated service interruptions - failure processing document: The target server failed to respond
    at org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:489) [mcf-pull-agent.jar:?]
Caused by: org.apache.http.NoHttpResponseException: The target server failed to respond
    at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:141) ~[httpclient-4.5.6.jar:4.5.6]
    at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:56) ~[httpclient-4.5.6.jar:4.5.6]
    at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:259) ~[httpcore-4.4.10.jar:4.4.10]
    at org.apache.http.impl.DefaultBHttpClientConnection.receiveResponseHeader(DefaultBHttpClientConnection.java:163) ~[httpcore-4.4.10.jar:4.4.10]
    at org.apache.http.impl.conn.CPoolProxy.receiveResponseHeader(CPoolProxy.java:165) ~[httpclient-4.5.6.jar:4.5.6]
    at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:273) ~[httpcore-4.4.10.jar:4.4.10]
    at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:125) ~[httpcore-4.4.10.jar:4.4.10]
    at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:272) ~[httpclient-4.5.6.jar:4.5.6]
    at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:185) ~[httpclient-4.5.6.jar:4.5.6]
    at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:110) ~[httpclient-4.5.6.jar:4.5.6]
    at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185) ~[httpclient-4.5.6.jar:4.5.6]
    at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:118) ~[httpclient-4.5.6.jar:4.5.6]
    at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:56) ~[httpclient-4.5.6.jar:4.5.6]
    at org.apache.manifoldcf.agents.transformation.tikaservice.TikaExtractor.addOrReplaceDocumentWithException(TikaExtractor.java:608) ~[?:?]
    at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester$PipelineAddEntryPoint.addOrReplaceDocumentWithException(IncrementalIngester.java:3226) ~[mcf-agents.jar:?]
    at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester$PipelineAddFanout.sendDocument(IncrementalIngester.java:3077) ~[mcf-agents.jar:?]
    at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester$MonitoredAddActivityWrapper.sendDocument(IncrementalIngester.java:3471) ~[mcf-agents.jar:?]
    at org.apache.manifoldcf.agents.transformation.documentfilter.DocumentFilter.addOrReplaceDocumentWithException(DocumentFilter.java:208) ~[?:?]
    at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester$PipelineAddEntryPoint.addOrReplaceDocumentWithException(IncrementalIngester.java:3226) ~[mcf-agents.jar:?]
    at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester$PipelineAddFanout.sendDocument(IncrementalIngester.java:3077) ~[mcf-agents.jar:?]
    at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester$PipelineObjectWithVersions.addOrReplaceDocumentWithException(IncrementalIngester.java:2708) ~[mcf-agents.jar:?]
    at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.documentIngest(IncrementalIngester.java:756) ~[mcf-agents.jar:?]
    at org.apache.manifoldcf.crawler.system.WorkerThread$ProcessActivity.ingestDocumentWithException(WorkerThread.java:1583) ~[mcf-pull-agent.jar:?]
    at org.apache.manifoldcf.crawler.system.WorkerThread$ProcessActivity.ingestDocumentWithException(WorkerThread.java:1548) ~[mcf-pull-agent.jar:?]
    at org.apache.manifoldcf.crawler.connectors.sharedrive.SharedDriveConnector.processDocuments(SharedDriveConnector.java:939) ~[?:?]
    at org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:399) ~[mcf-pull-agent.jar:?]
WARN 2018-11-13T09:50:58,546 (Worker thread '48') - Service interruption reported for job 1533797717712 connection 'WinShare': Tika down, retrying: Connect to localhost:9998 [localhost/127.0.0.1, localhost/0:0:0:0:0:0:0:1] failed: Connection refused (Connection refused)
WARN 2018-11-13T09:50:58,606 (Worker thread '34') - Service interruption reported for job 1533797717712 connection 'WinShare': Tika down, retrying: Connect to localhost:9998 [localhost/127.0.0.1, localhost/0:0:0:0:0:0:0:1] failed: Connection refused (Connection refused)
WARN 2018-11-13T09:50:58,947 (Worker thread '0') - Service interruption reported for job 1533797717712 connection 'WinShare': Tika down, retrying: Connect to localhost:9998 [localhost/127.0.0.1, localhost/0:0:0:0:0:0:0:1] failed: Connection refused (Connection refused)
WARN 2018-11-13T09:50:58,948 (Worker thread '61') - Service interruption reported for job 1533797717712 connection 'WinShare': Tika down, retrying: Connect to localhost:9998 [localhost/127.0.0.1, localhost/0:0:0:0:0:0:0:1] failed: Connection refused (Connection refused)
WARN 2018-11-13T09:50:58,947 (Worker thread '10') - Service interruption reported for job 1533797717712 connection 'WinShare': Tika down, retrying: Connect to localhost:9998 [localhost/127.0.0.1, localhost/0:0:0:0:0:0:0:1] failed: Connection refused (Connection refused)
WARN 2018-11-13T09:50:59,344 (Worker thread '5') - Service interruption reported for job 1533797717712 connection 'WinShare': Tika down, retrying: Connect to localhost:9998 [localhost/127.0.0.1, localhost/0:0:0:0:0:0:0:1] failed: Connection refused (Connection refused)
I can see that Tika server restarts the child process correctly, but I have
no log output from Tika server.
This makes it very difficult for me to investigate why the Tika server child
is crashing or being restarted.
Is there any way to enable logging for Tika server?
Thanks a lot
Mario
> Tika server child restart
> -------------------------
>
> Key: TIKA-2776
> URL: https://issues.apache.org/jira/browse/TIKA-2776
> Project: Tika
> Issue Type: Bug
> Reporter: Mario Bisonti
> Priority: Major
>
> Hello.
> I run Tika server standalone, started with the option:
> java -jar /opt/tika/tika-server-1.19.1.jar -spawnChild
> I use ManifoldCF and Solr to index files through Tika server.
> Indexing keeps failing because I get many errors such as:
> Tika down, retrying: Connection reset
> etc.
> I suspect that, when the child process is restarted, the client fails, as
> mentioned here:
> _If the child process is in the process of shutting down, and it gets a new
> request it will return 503 -- Service Unavailable. If the server times out on
> a file, the client will receive an IOException from the closed socket. Note
> that all other files that are being processed will end with an IOException
> from a closed socket when the child process shuts down; e.g. if you send
> three files to tika-server concurrently, and one of them causes a
> catastrophic problem requiring the child to shut down, you won't be able to
> tell which file caused the problems. In the future, we may implement a
> gentler shutdown than we currently have._
> as reported here https://wiki.apache.org/tika/TikaJAXRS
> How can I work around this?
> Thanks a lot
> Mario
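
For reference, the behaviour quoted above (a 503 while the child is shutting down, an IOException from the closed socket otherwise) means a client has to treat these failures as transient and try again. Below is a minimal sketch of that idea in plain Java, not ManifoldCF's or Tika's actual code. The class name TransientRetry, the method callWithRetry, and the attempt and delay values are all hypothetical. It assumes the request code throws an IOException on failure (NoHttpResponseException and "Connection refused" are both IOException subclasses) and that the caller maps a 503 response to an IOException as well.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

// Hypothetical helper: retries a request when it fails with an IOException,
// doubling the wait between attempts so a restarting child has time to come back.
final class TransientRetry {

    static <T> T callWithRetry(Callable<T> request, int maxAttempts, long initialDelayMs)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        long delayMs = initialDelayMs;
        IOException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return request.call();
            } catch (IOException e) {
                // Closed socket, connection refused, or a 503 mapped to IOException by the caller.
                last = e;
                if (attempt == maxAttempts) {
                    break;
                }
                Thread.sleep(delayMs);
                delayMs *= 2; // exponential backoff
            }
        }
        throw last; // every attempt failed; surface the last I/O error
    }

    public static void main(String[] args) throws Exception {
        // Simulated request: fails twice with an I/O error, then succeeds.
        int[] calls = {0};
        String result = callWithRetry(() -> {
            if (++calls[0] < 3) {
                throw new IOException("Connection reset");
            }
            return "extracted text";
        }, 5, 10L);
        System.out.println(result + " after " + calls[0] + " attempts");
        // prints "extracted text after 3 attempts"
    }
}
```

Exceptions other than IOException are passed straight through, so a genuine parse failure is not retried. This only hides the restarts from the client; it does not explain why the child is restarting.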