[jira] [Commented] (TIKA-3668) High CPU utilization in Tika 2.2.0

Manjunath Dhongadi (Jira) Wed, 02 Mar 2022 08:44:06 -0800


    [ 
https://issues.apache.org/jira/browse/TIKA-3668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17500277#comment-17500277
 ]


Manjunath Dhongadi commented on TIKA-3668:
------------------------------------------

We observed this during performance testing, when scanning around 
100 GB of data.
It is not limited to specific file formats; it occurs across all of them.
We do not use any custom settings for the parsers.
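
To compare the two versions on the same files, per-parse CPU time can be
measured with the JDK's ThreadMXBean. The sketch below is only an
illustration, not part of the reported test setup: the class name
CpuTimeProbe and the placeholder workload are hypothetical, and the actual
Tika parse call would go where the loop is. It counts CPU time of the
calling thread only, so work done in other threads or in an external
Tesseract process is not included.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

// Hypothetical helper: measures CPU time used by the calling thread.
class CpuTimeProbe {

    /** Returns the CPU nanoseconds the calling thread spent running the task. */
    static long measureCpuNanos(Runnable task) {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        if (!threads.isCurrentThreadCpuTimeSupported()) {
            throw new UnsupportedOperationException(
                    "Thread CPU time is not supported on this JVM");
        }
        long before = threads.getCurrentThreadCpuTime();
        task.run();
        long after = threads.getCurrentThreadCpuTime();
        return after - before;
    }

    public static void main(String[] args) {
        // Placeholder workload (assumption): replace with the parse of one file.
        long cpuNanos = measureCpuNanos(() -> {
            long sum = 0;
            for (int i = 0; i < 5_000_000; i++) {
                sum += i % 7;
            }
            // Use the result so the loop is not trivially dead code.
            if (sum < 0) {
                System.out.println(sum);
            }
        });
        System.out.println("CPU time (ms): " + cpuNanos / 1_000_000);
    }
}
```

Running the same probe around the same input under 1.26 and under 2.2.0
would show whether the increase is in the parsing thread itself or
elsewhere in the process.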

> High CPU utilization in Tika 2.2.0
> ----------------------------------
>
>                 Key: TIKA-3668
>                 URL: https://issues.apache.org/jira/browse/TIKA-3668
>             Project: Tika
>          Issue Type: Bug
>            Reporter: Manjunath Dhongadi
>            Priority: Major
>
> We recently upgraded Tika from 1.26 to 2.2.0.
> CPU utilization has increased drastically (6 to 8 times higher), both 
> with Tesseract enabled and with Tesseract disabled.
> We are using tika-parsers-standard-package 2.2.0.
> Is this normal behavior for Tika 2.2.0?
> Are any tuning parameters available for this?



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
