Hi,

I have a question about logging in Tika, and in Tika server specifically.
We use Tika server to index millions of files on a Windows file share. To 
be more precise, we use Apache ManifoldCF to crawl the files, and the text 
extraction is done by Tika server 1.19. 
The spawnChild option is active. With very big files we get some OOM errors, 
and the Tika server parent kills and restarts the child process as it should. 
This works great; I just wanted to know whether the Tika server child log 
could include the name of the file that caused the OOM. So far, in the Tika 
log I can find the error and its date, but not the filename. I changed the 
log level to debug, but the filename did not appear either.

To find this information, I first have to look up the date and time of the 
child restart in the Tika server log. Then I open the Apache ManifoldCF log 
and search it around that date and time to finally find the problematic file 
that was sent to Tika.
Did I miss something, and can the filename be found in the Tika log? If Tika 
could add the filename to its own log, it would be very helpful for us. 
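
For reference, the manual lookup described above could be scripted roughly as 
follows. This is only a sketch: it assumes that each ManifoldCF log line starts 
with a timestamp such as "2018-10-01 12:00:05", and the names TS_FORMAT, 
parse_ts and lines_near are hypothetical helpers of mine, not part of Tika or 
ManifoldCF. The timestamp format would have to be adapted to the actual 
logging pattern in use.

```python
from datetime import datetime, timedelta

# Assumed layout of the timestamp at the start of each log line.
# Adjust both values to match the real logging pattern.
TS_FORMAT = "%Y-%m-%d %H:%M:%S"
TS_LEN = 19


def parse_ts(line):
    """Return the leading timestamp of a log line, or None if there is none."""
    try:
        return datetime.strptime(line[:TS_LEN], TS_FORMAT)
    except ValueError:
        return None


def lines_near(log_lines, when, window_seconds=60):
    """Return the log lines stamped within window_seconds before `when`."""
    start = when - timedelta(seconds=window_seconds)
    hits = []
    for line in log_lines:
        ts = parse_ts(line)
        if ts is not None and start <= ts <= when:
            hits.append(line.rstrip("\n"))
    return hits
```

Here `when` would be the child restart time read from the Tika server log and 
`log_lines` the lines of the ManifoldCF log; the returned lines are the 
candidates to inspect for the file that was sent to Tika.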

Thanks,
Best regards,
Olivier 
