[ 
https://issues.apache.org/jira/browse/TIKA-2776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16705208#comment-16705208
 ] 

Tim Allison commented on TIKA-2776:
-----------------------------------

This caught me by surprise...I thought that I had left the default as -1 (no 
max), but I clearly set it to 100,000.

I _think_ my reasoning was that this is good JVM hygiene given the craziness 
some of our parsers can inflict on a JVM.  I readily admit that I don't have 
good data to support my decision aside from the reasoning that "we've had 
memory leaks before because of caching; we'll have them again."  I'd be 
willing to bump the default to a higher value, but I wouldn't want to turn it 
off.

You can avoid the restarts caused by HIT_MAX by setting the maximum to -1 on 
the command line.
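
As a minimal sketch of that setting: assuming the option behind this limit is 
`-maxFiles` (the name is not stated in this comment, so check the usage output 
of your tika-server version), the start command from the issue description 
would become something like:

```shell
# Assumption: -maxFiles is the option that controls the per-child file limit.
# -1 is meant to disable the limit, so the child is not restarted with HIT_MAX.
java -jar /opt/tika/tika-server-1.19.1.jar -spawnChild -maxFiles -1
```

A large positive value instead of -1 would keep the periodic restart, just 
less often.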

I'll update the documentation on the wiki and in the code.  Thank you!

  

> Tika server child restart
> -------------------------
>
>                 Key: TIKA-2776
>                 URL: https://issues.apache.org/jira/browse/TIKA-2776
>             Project: Tika
>          Issue Type: Bug
>            Reporter: Mario Bisonti
>            Assignee: Tim Allison
>            Priority: Blocker
>             Fix For: 2.0.0, 1.20
>
>         Attachments: Log.zip, MCF_JOB.png, log4j.xml, log4j_child.xml, 
> log4j_child.xml, man_tika.zip, tikalogchild.log
>
>
> Hello.
> I run Tika server standalone, started with the option:
> java -jar /opt/tika/tika-server-1.19.1.jar -spawnChild
> I use ManifoldCF and Solr to index files using Tika server.
> Indexing keeps crashing because I get many errors like:
> Tika down, retrying: Connection reset
> etc.
> I suspect that, when a process is restarted, the client crashes, as 
> mentioned here:
> _If the child process is in the process of shutting down, and it gets a new 
> request it will return 503 -- Service Unavailable. If the server times out on 
> a file, the client will receive an IOException from the closed socket. Note 
> that all other files that are being processed will end with an IOException 
> from a closed socket when the child process shuts down; e.g. if you send 
> three files to tika-server concurrently, and one of them causes a 
> catastrophic problem requiring the child to shut down, you won't be able to 
> tell which file caused the problems. In the future, we may implement a 
> gentler shutdown than we currently have._
> as reported here https://wiki.apache.org/tika/TikaJAXRS
> How can I work around it?
> Thanks a lot
> Mario


