[ https://issues.apache.org/jira/browse/TIKA-4186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17913227#comment-17913227 ]
mbiso edited comment on TIKA-4186 at 1/15/25 9:22 AM:
------------------------------------------------------

Hi. I am running Tika Server 3.0 in a Docker container. This is my entrypoint:

"Entrypoint": [
    "/bin/sh",
    "-c",
    "exec java -cp \"/tika-server-standard-${TIKA_VERSION}.jar:/tika-extras/*\" org.apache.tika.server.core.TikaServerCli -h 0.0.0.0 $0 $@"
]

How can I verify whether Tika is using the /pipes and /async endpoints, the "next gen" alternative?

Thanks a lot,
Mario


was (Author: JIRAUSER308329):
Hi. I am running Tika Server 3.0 in a Docker container. This is my entrypoint:

"Entrypoint": [
    "/bin/sh",
    "-c",
    "exec java -cp \"/tika-server-standard-${TIKA_VERSION}.jar:/tika-extras/*\" org.apache.tika.server.core.TikaServerCli -h 0.0.0.0 $0 $@"
]

How can I verify whether Tika is using the /pipes and /async endpoints, the "next gen" alternative?

Thanks a lot,
Mario

> tika server shut down innocent connections
> ------------------------------------------
>
>                 Key: TIKA-4186
>                 URL: https://issues.apache.org/jira/browse/TIKA-4186
>             Project: Tika
>          Issue Type: Improvement
>          Components: tika-server
>    Affects Versions: 2.9.1
>         Environment: macOS running tika-server-standard-2.9.1.jar
>            Reporter: Itai
>            Priority: Major
>
> The Tika server shuts down and restarts in case of an issue (OOM, crash, timeout).
> When the Tika server shuts down, all active connections are closed.
> As a result, a single problematic connection can break other, unrelated connections.
> This makes it hard to make parallel calls to a single server in a production environment.
> How to reproduce?
> - prepare a large sample.pdf file that takes more than 30 seconds to digest.
> run:
> java -jar ~/Downloads/tika-server-standard-2.9.1.jar
> ---
> terminal 2 run:
> curl -v -T sample.pdf http://localhost:9998/tika --header "Accept: text/plain" --header "X-Tika-Timeout-Millis: 30001"
> ---
> wait ~20-25 seconds
> ---
> terminal 3 run:
> curl -v -T sample.pdf http://localhost:9998/tika --header "Accept: text/plain"
> Expected result:
> - terminal 2 connection should time out after 30 seconds
> - terminal 3 connection should not time out and should return successfully.
> Actual result:
> - both curl commands fail after 30 seconds.
> logs:
> ```
> INFO [qtp486662053-44] 11:57:30,251 org.apache.tika.server.core.resource.TikaResource /tika (autodetecting type)
> WARN [qtp486662053-44] 11:57:30,278 org.apache.pdfbox.pdfparser.BaseParser Empty COSName at offset 628452
> ERROR [Thread-21] 11:57:37,566 org.apache.tika.server.core.ServerStatusWatcher Timeout task PARSE, millis elapsed 30014; consider increasing the allowable time with the <taskTimeoutMillis/> parameter or the X-Tika-Timeout-Millis header
> WARN [Thread-21] 11:57:37,573 org.apache.tika.server.core.ServerStatusWatcher forked process observed TIMEOUT and is shutting down.
> INFO [Thread-21] 11:57:37,613 org.apache.tika.server.core.ServerStatusWatcher Shutting down forked process with status: TIMEOUT
> INFO [pool-2-thread-1] 11:57:38,039 org.apache.tika.server.core.TikaServerWatchDog forked process exited with exit value 3
> INFO [main] 11:57:39,340 org.apache.tika.server.core.TikaServerProcess Starting Apache Tika 2.9.1 server
> INFO [main] 11:57:39,564 org.apache.tika.server.core.TikaServerProcess loading resource from SPI: class org.apache.tika.server.standard.resource.XMPMetadataResource
> Jan 29, 2024 11:57:39 AM org.apache.cxf.endpoint.ServerImpl initDestination
> INFO: Setting the server's publish address to be http://localhost:9998/
> INFO [main] 11:57:39,747 org.eclipse.jetty.util.log Logging initialized @1640ms to org.eclipse.jetty.util.log.Slf4jLog
> INFO [main] 11:57:39,790 org.eclipse.jetty.server.Server jetty-9.4.53.v20231009; built: 2023-10-09T12:29:09.265Z; git: 27bde00a0b95a1d5bbee0eae7984f891d2d0f8c9; jvm 21.0.1
> INFO [main] 11:57:39,833 org.eclipse.jetty.server.AbstractConnector Started ServerConnector@48bfb884{HTTP/1.1, (http/1.1)}{localhost:9998}
> INFO [main] 11:57:39,833 org.eclipse.jetty.server.Server Started @1729ms
> ```
> ---
> ```
> * Trying 127.0.0.1:9998...
> * Connected to localhost (127.0.0.1) port 9998 (#0)
> > PUT /tika HTTP/1.1
> > Host: localhost:9998
> > User-Agent: curl/7.85.0
> > Accept: text/plain
> > Content-Length: 636978
> > Expect: 100-continue
> >
> * Mark bundle as not supporting multiuse
> < HTTP/1.1 100 Continue
> * We are completely uploaded and fine
> * Empty reply from server
> * Closing connection 0
> curl: (52) Empty reply from server
> ```
>
>

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
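
The reproduction steps in the quoted issue can be driven from a single shell instead of three terminals. The following is only a minimal sketch and is not part of the original report: the names TIKA_URL, SAMPLE, DELAY_SECS, put_sample and run_repro are illustrative assumptions. It presumes a Tika server is already listening on localhost:9998 and that sample.pdf takes longer than 30 seconds to parse.

```shell
#!/bin/sh
# Sketch of the reproduction from the issue description.
# All variable and function names here are illustrative assumptions.
TIKA_URL="${TIKA_URL:-http://localhost:9998/tika}"
SAMPLE="${SAMPLE:-sample.pdf}"
DELAY_SECS="${DELAY_SECS:-20}"   # the report waits ~20-25 seconds

# Send one PUT request and print only the HTTP status code.
# Extra arguments (for example another --header) are passed to curl.
put_sample() {
    curl -s -o /dev/null -w '%{http_code}' -T "$SAMPLE" "$TIKA_URL" \
        --header "Accept: text/plain" "$@"
}

run_repro() {
    tmp=$(mktemp -d) || return 1

    # "terminal 2": request carrying the 30001 ms timeout header, in background
    put_sample --header "X-Tika-Timeout-Millis: 30001" > "$tmp/slow" &
    slow_pid=$!

    sleep "$DELAY_SECS"

    # "terminal 3": the unrelated request that should not be affected
    put_sample > "$tmp/innocent"
    innocent_rc=$?

    # Collect the exit status of the background request
    wait "$slow_pid"
    slow_rc=$?

    echo "slow request: curl exit $slow_rc, HTTP $(cat "$tmp/slow")"
    echo "innocent request: curl exit $innocent_rc, HTTP $(cat "$tmp/innocent")"
    rm -rf "$tmp"
}

# Only contact the server when explicitly asked to.
if [ "${1:-}" = "run" ]; then
    run_repro
fi
```

Saved under a name such as repro.sh (hypothetical) and started with `sh repro.sh run`, this prints one line per request. If the behaviour described in the issue is present, the second request should report a non-zero curl exit code; the report shows 52, "Empty reply from server".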