Hello, Our tests indicate that while Tika can extract text from average-sized files, it fails to extract text from large files of certain types. In our tests Tika extracted 0 characters from a 100 MB PPTX, a 60 MB DOCX, and a 113 MB PDF file. However, it extracted the correct text from a 94 MB TXT file.
Is this a limitation of Tika? How can we troubleshoot this? Thanks, Alec
