Doh. Sorry. I just added that in bf75e39. Please let us know what else you find!
Aside from the unit tests, I haven't had a chance to try to break the -spawnChild option with our regression corpus.

On Thu, Oct 11, 2018 at 9:59 AM Olivier Tavard <[email protected]> wrote:
>
> Hi,
>
> I have a question about logging in Tika, and in Tika server specifically.
> We use Tika server for indexing millions of files in a Windows fileshare.
> To be more precise, we use Apache ManifoldCF to crawl the files, and the text
> extraction is done by Tika server 1.19.
> The spawnChild option is active. In the case of very big files, we have some
> OOMs, and the Tika server parent kills and restarts the child process as it
> should. It works great; I just wanted to know if it would be possible to have
> the Tika server child log the name of the file that caused the OOM. So far, in
> the Tika log I can find the error and the date of the error but not the
> filename. I changed the log mode to debug, but the filename did not appear
> either.
>
> To find this information, I first have to find the date and time of the
> restart of the child in the Tika server log. Then I open the Apache
> ManifoldCF log and search it for the date and time found in the Tika log
> to finally find the problematic file sent to Tika.
> Did I miss something, and can the filename be found in the Tika log? If Tika
> could add the filename to its own log, it would be very helpful for us.
>
> Thanks,
> Best regards,
> Olivier
