apismensky commented on PR #985:
URL: https://github.com/apache/tika/pull/985#issuecomment-1446743975

   I was going to submit this issue last week.
   My observation was similar: a lot of overhead around BitSet, both in
   memory allocations and CPU.
   We switched from Tika 1.27 to 2.7.0.
   For one of the files we saw this difference:
   Extraction took: 2199 (Tika 1.27) vs
   Extraction took: 27010 (Tika 2.7.0)

   Both are in ms, so 2.7.0 is more than 10 times slower.
   The original file size is 50.5 MB.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.