[ https://issues.apache.org/jira/browse/TIKA-2776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16684907#comment-16684907 ]
Mario Bisonti commented on TIKA-2776:
-------------------------------------

Hello Tim.

From the ManifoldCF side I read the log:

org.apache.manifoldcf.core.interfaces.ManifoldCFException: Repeated service interruptions - failure processing document: The target server failed to respond
    at org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:489) [mcf-pull-agent.jar:?]
Caused by: org.apache.http.NoHttpResponseException: The target server failed to respond
    at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:141) ~[httpclient-4.5.6.jar:4.5.6]
    at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:56) ~[httpclient-4.5.6.jar:4.5.6]
    at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:259) ~[httpcore-4.4.10.jar:4.4.10]
    at org.apache.http.impl.DefaultBHttpClientConnection.receiveResponseHeader(DefaultBHttpClientConnection.java:163) ~[httpcore-4.4.10.jar:4.4.10]
    at org.apache.http.impl.conn.CPoolProxy.receiveResponseHeader(CPoolProxy.java:165) ~[httpclient-4.5.6.jar:4.5.6]
    at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:273) ~[httpcore-4.4.10.jar:4.4.10]
    at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:125) ~[httpcore-4.4.10.jar:4.4.10]
    at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:272) ~[httpclient-4.5.6.jar:4.5.6]
    at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:185) ~[httpclient-4.5.6.jar:4.5.6]
    at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:110) ~[httpclient-4.5.6.jar:4.5.6]
    at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185) ~[httpclient-4.5.6.jar:4.5.6]
    at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:118) ~[httpclient-4.5.6.jar:4.5.6]
    at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:56) ~[httpclient-4.5.6.jar:4.5.6]
    at org.apache.manifoldcf.agents.transformation.tikaservice.TikaExtractor.addOrReplaceDocumentWithException(TikaExtractor.java:608) ~[?:?]
    at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester$PipelineAddEntryPoint.addOrReplaceDocumentWithException(IncrementalIngester.java:3226) ~[mcf-agents.jar:?]
    at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester$PipelineAddFanout.sendDocument(IncrementalIngester.java:3077) ~[mcf-agents.jar:?]
    at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester$MonitoredAddActivityWrapper.sendDocument(IncrementalIngester.java:3471) ~[mcf-agents.jar:?]
    at org.apache.manifoldcf.agents.transformation.documentfilter.DocumentFilter.addOrReplaceDocumentWithException(DocumentFilter.java:208) ~[?:?]
    at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester$PipelineAddEntryPoint.addOrReplaceDocumentWithException(IncrementalIngester.java:3226) ~[mcf-agents.jar:?]
    at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester$PipelineAddFanout.sendDocument(IncrementalIngester.java:3077) ~[mcf-agents.jar:?]
    at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester$PipelineObjectWithVersions.addOrReplaceDocumentWithException(IncrementalIngester.java:2708) ~[mcf-agents.jar:?]
    at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.documentIngest(IncrementalIngester.java:756) ~[mcf-agents.jar:?]
    at org.apache.manifoldcf.crawler.system.WorkerThread$ProcessActivity.ingestDocumentWithException(WorkerThread.java:1583) ~[mcf-pull-agent.jar:?]
    at org.apache.manifoldcf.crawler.system.WorkerThread$ProcessActivity.ingestDocumentWithException(WorkerThread.java:1548) ~[mcf-pull-agent.jar:?]
    at org.apache.manifoldcf.crawler.connectors.sharedrive.SharedDriveConnector.processDocuments(SharedDriveConnector.java:939) ~[?:?]
    at org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:399) ~[mcf-pull-agent.jar:?]
WARN 2018-11-13T09:50:58,546 (Worker thread '48') - Service interruption reported for job 1533797717712 connection 'WinShare': Tika down, retrying: Connect to localhost:9998 [localhost/127.0.0.1, localhost/0:0:0:0:0:0:0:1] failed: Connection refused (Connection refused)
WARN 2018-11-13T09:50:58,606 (Worker thread '34') - Service interruption reported for job 1533797717712 connection 'WinShare': Tika down, retrying: Connect to localhost:9998 [localhost/127.0.0.1, localhost/0:0:0:0:0:0:0:1] failed: Connection refused (Connection refused)
WARN 2018-11-13T09:50:58,947 (Worker thread '0') - Service interruption reported for job 1533797717712 connection 'WinShare': Tika down, retrying: Connect to localhost:9998 [localhost/127.0.0.1, localhost/0:0:0:0:0:0:0:1] failed: Connection refused (Connection refused)
WARN 2018-11-13T09:50:58,948 (Worker thread '61') - Service interruption reported for job 1533797717712 connection 'WinShare': Tika down, retrying: Connect to localhost:9998 [localhost/127.0.0.1, localhost/0:0:0:0:0:0:0:1] failed: Connection refused (Connection refused)
WARN 2018-11-13T09:50:58,947 (Worker thread '10') - Service interruption reported for job 1533797717712 connection 'WinShare': Tika down, retrying: Connect to localhost:9998 [localhost/127.0.0.1, localhost/0:0:0:0:0:0:0:1] failed: Connection refused (Connection refused)
WARN 2018-11-13T09:50:59,344 (Worker thread '5') - Service interruption reported for job 1533797717712 connection 'WinShare': Tika down, retrying: Connect to localhost:9998 [localhost/127.0.0.1, localhost/0:0:0:0:0:0:0:1] failed: Connection refused (Connection refused)

I can see that Tika server restarts the child correctly, but I have no log from Tika server, so it is very difficult for me to investigate why the child process was restarted or crashed.
Is there any way to enable logging for Tika server?

Thanks a lot
Mario

> Tika server child restart
> -------------------------
>
>                 Key: TIKA-2776
>                 URL: https://issues.apache.org/jira/browse/TIKA-2776
>             Project: Tika
>          Issue Type: Bug
>            Reporter: Mario Bisonti
>            Priority: Major
>
> Hello.
> I use a standalone Tika server started with the option:
> java -jar /opt/tika/tika-server-1.19.1.jar -spawnChild
> I use ManifoldCF and Solr to index files through the Tika server.
> Indexing keeps crashing because I get many errors such as:
> Tika down, retrying: Connection reset
> etc.
> I suspect that, when the child process is restarted, the client crashes, as mentioned
> here:
> _If the child process is in the process of shutting down, and it gets a new
> request it will return 503 -- Service Unavailable. If the server times out on
> a file, the client will receive an IOException from the closed socket. Note
> that all other files that are being processed will end with an IOException
> from a closed socket when the child process shuts down; e.g. if you send
> three files to tika-server concurrently, and one of them causes a
> catastrophic problem requiring the child to shut down, you won't be able to
> tell which file caused the problems. In the future, we may implement a
> gentler shutdown than we currently have._
> as reported here https://wiki.apache.org/tika/TikaJAXRS
> How can I work around it?
> Thanks a lot
> Mario

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
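
The wiki passage quoted above says a client can see either a 503 or an IOException from a closed socket while the child process restarts. One client-side way to ride out such a restart is a bounded retry with a growing delay. The following is only a minimal sketch under stated assumptions: `RetryOnRestart`, `ServiceUnavailableException`, and `callWithRetry` are hypothetical names for illustration and are not part of Tika or ManifoldCF, and the caller-supplied action is assumed to throw an IOException (or the 503 wrapper) when the server is unavailable.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

/** Hypothetical helper for illustration; not part of Tika or ManifoldCF. */
public final class RetryOnRestart {

    private RetryOnRestart() {}

    /** Hypothetical wrapper the caller can throw when the server answers 503. */
    public static final class ServiceUnavailableException extends IOException {
        public ServiceUnavailableException(String message) {
            super(message);
        }
    }

    /**
     * Runs the action and retries it when it throws an IOException.
     * Closed sockets, refused connections, and the 503 wrapper above are all
     * IOExceptions. The delay doubles after every failed attempt.
     */
    public static <T> T callWithRetry(Callable<T> action, int maxAttempts, long initialDelayMillis)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        long delay = initialDelayMillis;
        IOException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.call();
            } catch (IOException e) {
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(delay); // give the child process time to come back
                    delay *= 2;
                }
            }
        }
        throw last; // every attempt failed; surface the last I/O error
    }
}
```

A retry like this cannot tell which document made the child shut down, so a file that triggers the crash will simply fail on every attempt; keeping `maxAttempts` small avoids restarting the child over and over for the same document.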