I'm just getting started with Tika. I tried the basic AutoDetectParser and the
basic ParsingReader (tika-app v1.0) on a batch of a few thousand .docx files.
On my laptop, I was able to extract text at a rate of 200 docs per minute.
When I ran XWPFWordExtractor (POI 3.8) on the same docs, the rate was 1000
docs per minute, a fivefold difference.  Is there a faster way to use Tika to
extract text from a file?  Is this performance difference expected and/or
experienced by others?
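
For anyone who wants to reproduce the comparison, here is a minimal sketch of
how a docs-per-minute figure like the ones above could be measured. It is not
the exact code I ran: the class name ExtractionRate and the methods
docsPerMinute and timeAll are made up for illustration, and the extractor is
passed in as a plain function so that either a Tika-based or a POI-based
extraction call can be plugged in. The stand-in extractor in main does no real
parsing.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

// Hypothetical timing harness; names are illustrative, not from Tika or POI.
class ExtractionRate {

    // Converts a document count and an elapsed time in nanoseconds
    // into a rate in documents per minute.
    static double docsPerMinute(int docs, long elapsedNanos) {
        return docs * 60.0e9 / elapsedNanos;
    }

    // Applies the extractor to every input and returns the elapsed
    // wall-clock time in nanoseconds (at least 1, to avoid dividing by zero).
    static <T> long timeAll(List<T> inputs, Function<T, String> extractor) {
        long totalChars = 0;
        long start = System.nanoTime();
        for (T input : inputs) {
            // Use the result so the extraction work is not skipped.
            totalChars += extractor.apply(input).length();
        }
        long elapsed = System.nanoTime() - start;
        if (totalChars < 0) {
            throw new IllegalStateException("character count overflowed");
        }
        return Math.max(1L, elapsed);
    }

    public static void main(String[] args) {
        // Assumption: in a real run these would be paths to .docx files and
        // the extractor would call into Tika or POI. Here it is a stand-in.
        List<String> inputs = Arrays.asList("a.docx", "b.docx", "c.docx");
        Function<String, String> standIn = name -> "text of " + name;

        long elapsed = timeAll(inputs, standIn);
        System.out.println(docsPerMinute(inputs.size(), elapsed) + " docs/min");
    }
}
```

Running the same harness twice over the same file list, once per extractor,
keeps the measurement identical for both, so the only thing that differs
between the two rates is the extraction call itself.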

     Thank you.
