Dear Wiki user, You have subscribed to a wiki page or wiki category on "Tika Wiki" for change notification.
The "TikaJAXRS" page has been changed by TimothyAllison:
https://wiki.apache.org/tika/TikaJAXRS?action=diff&rev1=48&rev2=49

  Also, please be polite. This feature was added as a convenience. Please consider using a robust crawler (instead of our simple {{{TikaInputStream.get(new URL(fileUrl))}}}) that will allow for better configuration of redirects, timeouts, cookies, etc.; and a robust crawler will respect robots.txt!
+
+ = Making Tika Server Robust to OOMs, Infinite Loops and Memory Leaks =
+ As of Tika 1.19, users can make tika-server more robust by running it with the {{{-spawnChild}}} option. This starts tika-server in a child process. If the child process hits an OOM, a timeout or another catastrophic problem, the parent process will kill and/or restart it.
+
+ The following options are available only with the {{{-spawnChild}}} option.
+
+  * {{{-maxFiles}}}: restart the child process after it has processed {{{maxFiles}}} files. If there is a slow-building memory leak, this restart of the JVM should help.
+  * {{{-taskTimeoutMillis}}} and {{{-taskPulseMillis}}}: {{{taskTimeoutMillis}}} specifies how long a parse/detect task may run before it is treated as timed out. {{{taskPulseMillis}}} specifies how often to check whether a parse/detect task has exceeded that limit.
+  * {{{-pingTimeoutMillis}}} and {{{-pingPulseMillis}}}: {{{pingPulseMillis}}} specifies how often the parent process pings the child process to check its status. {{{pingTimeoutMillis}}} specifies how long the parent process should wait to hear back from the child process before restarting it, and/or how long the child process should wait to receive a ping from the parent process before shutting itself down.
+
+ If the child process is shutting down and it receives a new request, it will return {{{503 -- Service Unavailable}}}. If the server times out on a file, the client will receive an IOException from the closed socket.
+
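The two client-visible failure modes described above (a {{{503}}} while the child process is shutting down, and an IOException when the socket is closed after a timeout) suggest that a client should be prepared to retry. The following is a minimal sketch of one way to do that, not part of Tika itself. The class name {{{TikaRetryClient}}}, the {{{Attempt}}} interface, the methods {{{sendWithRetry}}} and {{{putFile}}}, and the timeout values are all hypothetical names and choices made for illustration.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;

// Hypothetical helper, not part of Tika: retries a request when the server
// answers 503 or the connection fails with an IOException.
class TikaRetryClient {

    /** One attempt at a request; returns the HTTP status code. */
    interface Attempt {
        int send() throws IOException;
    }

    /**
     * Runs the attempt up to maxAttempts times. A 503 or an IOException
     * triggers a retry after a linearly growing pause; any other status is
     * returned immediately. If every attempt fails, the last failure is thrown.
     */
    static int sendWithRetry(Attempt attempt, int maxAttempts, long backoffMillis)
            throws IOException, InterruptedException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        IOException last = null;
        for (int i = 1; i <= maxAttempts; i++) {
            try {
                int status = attempt.send();
                if (status != HttpURLConnection.HTTP_UNAVAILABLE) {
                    return status;
                }
                last = new IOException("503 Service Unavailable");
            } catch (IOException e) {
                // Possibly a socket closed after a server-side timeout.
                last = e;
            }
            if (i < maxAttempts) {
                Thread.sleep(backoffMillis * i); // give the child time to restart
            }
        }
        throw last;
    }

    /** Sends the bytes with HTTP PUT and returns the response status code. */
    static int putFile(String endpoint, byte[] body) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(endpoint).openConnection();
        try {
            conn.setRequestMethod("PUT");
            conn.setDoOutput(true);
            conn.setConnectTimeout(5_000);   // assumed value
            conn.setReadTimeout(120_000);    // assumed value; keep it above taskTimeoutMillis
            try (OutputStream out = conn.getOutputStream()) {
                out.write(body);
            }
            return conn.getResponseCode();
        } finally {
            conn.disconnect();
        }
    }
}
```

A call might look like {{{TikaRetryClient.sendWithRetry(() -> TikaRetryClient.putFile("http://localhost:9998/tika", bytes), 3, 1000L)}}}, where the URL assumes a tika-server running locally on its default port. Keep {{{maxAttempts}}} small: a file that caused one timeout will probably cause another, and each retry may force another restart of the child process.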
