[ 
https://issues.apache.org/jira/browse/TIKA-4422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tino Schöllhorn updated TIKA-4422:
----------------------------------
    Summary: Availability problem with TikaServer 3.1.0  (was: Availability 
problem with TikaServer 3.1.0 List-ID:<user.tika.apache.org>)

> Availability problem with TikaServer 3.1.0
> ------------------------------------------
>
>                 Key: TIKA-4422
>                 URL: https://issues.apache.org/jira/browse/TIKA-4422
>             Project: Tika
>          Issue Type: Bug
>          Components: tika-server
>    Affects Versions: 3.1.0
>         Environment: Java 21, Ubuntu 22
>            Reporter: Tino Schöllhorn
>            Priority: Major
>
> Hi,
> we have an availability problem with TikaServer. We run Tika 3.1.0 on
> Ubuntu with Java 21.
> Previously we used Tika 2.4.x, where we did not observe this problem.
> We run a *lot* of text-extraction requests. After a few hours (8-10 h) Tika
> is no longer able to restart its worker processes.
> Tika runs under systemd, and journalctl shows the following output:
>  
> {noformat}
> May 28 04:39:39 dss-index java[350084]: INFO  [pool-2-thread-1] 04:39:39,752 
> org.apache.tika.server.core.TikaServerWatchDog forked process exited with 
> exit value 3
> May 28 04:39:40 dss-index java[376963]: May 28, 2025 4:39:40 AM 
> org.apache.cxf.endpoint.ServerImpl initDestination
> May 28 04:39:40 dss-index java[376963]: INFO: Setting the server's publish 
> address to be http://localhost:9998/
> May 28 05:35:32 dss-index java[350084]: INFO  [pool-2-thread-1] 05:35:32,896 
> org.apache.tika.server.core.TikaServerWatchDog forked process exited with 
> exit value 2
> May 28 05:35:34 dss-index java[377213]: May 28, 2025 5:35:34 AM 
> org.apache.cxf.endpoint.ServerImpl initDestination
> May 28 05:35:34 dss-index java[377213]: INFO: Setting the server's publish 
> address to be http://localhost:9998/{noformat}
> After these messages TikaServer no longer responds to requests. Restarting
> the Tika parent process is the only thing that helps.
> The messages are emitted at TikaServerWatchDog:161, but I do not understand
> what is going wrong here. The exit values may be error codes from the OS;
> perror gives the following output for them:
> {noformat}
> OS error code   2:  No such file or directory
> OS error code   3:  No such process{noformat}
> It is still unclear to me what is happening. Our tika.config is included
> below. As far as I understand the situation, this seems to be a bug that was
> introduced sometime between versions 2.4.x and 3.1.0.
> I hope someone has an idea what is going on and how it can be remedied.
> Tino
> – tika.config.start
> {code:xml}
> <?xml version="1.0" encoding="UTF-8"?>
> <properties>
>   <parsers>
>     <parser class="org.apache.tika.parser.DefaultParser">
>     </parser>
>   </parsers>
>   <server>
>     <params>
>       <port>9998</port>
>       <host>localhost</host>
>       <digest>sha256</digest>
>       <digestMarkLimit>1000000</digestMarkLimit>
>       <id></id>
>       <cors>NONE</cors>
>       <logLevel>info</logLevel>
>       <returnStackTrace>false</returnStackTrace>
>       <noFork>false</noFork>
>       <taskTimeoutMillis>300000</taskTimeoutMillis>
>       <maxForkedStartupMillis>120000</maxForkedStartupMillis>
>       <maxRestarts>-1</maxRestarts>
>       <maxFiles>25000</maxFiles>
>       <javaPath>java</javaPath>
>       <forkedJvmArgs>
>         <arg>-Xms4g</arg>
>         <arg>-Xmx4g</arg>
>         <arg>-Dlog4j.configurationFile=tika-forked-log4j2.xml</arg>
>       </forkedJvmArgs>
>       <enableUnsecureFeatures>false</enableUnsecureFeatures>
>       <endpoints>
>         <endpoint>status</endpoint>
>         <endpoint>tika</endpoint>
>         <endpoint>rmeta</endpoint>
>         <endpoint>language</endpoint>
>       </endpoints>
>     </params>
>   </server>
> </properties>
> {code}
> – tika.config.stop
>  
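
The journal excerpt in the report can be summarized mechanically, which may
help correlate worker restarts with the moment the server stops responding.
The following is a minimal sketch, not part of Tika: `count_exit_values` and
the regular expression are hypothetical helpers written only against the
watchdog message format quoted in the report.

```python
import re
from collections import Counter

# Assumption: the watchdog message has the exact wording quoted in the report.
# \s+ tolerates entries that were wrapped across lines (as in this email).
EXIT_RE = re.compile(r"forked process exited with\s+exit value\s+(-?\d+)")


def count_exit_values(journal_text: str) -> Counter:
    """Tally how often each forked-process exit value appears in the log text."""
    return Counter(int(m.group(1)) for m in EXIT_RE.finditer(journal_text))


if __name__ == "__main__":
    # Two entries copied from the report, joined back onto single lines.
    sample = (
        "May 28 04:39:39 dss-index java[350084]: INFO  [pool-2-thread-1] "
        "04:39:39,752 org.apache.tika.server.core.TikaServerWatchDog "
        "forked process exited with exit value 3\n"
        "May 28 05:35:32 dss-index java[350084]: INFO  [pool-2-thread-1] "
        "05:35:32,896 org.apache.tika.server.core.TikaServerWatchDog "
        "forked process exited with exit value 2\n"
    )
    for value, count in sorted(count_exit_values(sample).items()):
        print(f"exit value {value}: {count}")
```

In practice the input text would come from the journal of the systemd unit
that runs Tika; the unit name is not given in the report, so it is left out
here.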



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
