Hello,

Thanks for the fix, it works well!

Best regards,
Olivier

> On 12 Oct 2018, at 18:41, Tim Allison <[email protected]> wrote:
>
> Except that it didn't fix anything! I _think_ I got it right this
> time: https://issues.apache.org/jira/browse/TIKA-2754 Let me know
> what you find.
>
> Thank you, again.
>
> Cheers,
>
> Tim
> On Fri, Oct 12, 2018 at 5:44 AM Olivier Tavard
> <[email protected]> wrote:
>>
>> Hi,
>>
>> Thanks for the quick fix!
>> The value of the parameter "path" where you made the commit (parse method in
>> the TikaResource class) is always set to "unpack/all" when I launch indexing
>> on the file share. Normally it should be the file path, right? I do not
>> understand why it has this value.
>>
>> Thanks,
>> Best regards,
>>
>> Olivier
>>
>>
>> On 11 Oct 2018, at 19:46, Tim Allison <[email protected]> wrote:
>>
>> Doh. Sorry. I just added that in bf75e39. Please let us know what
>> else you find!
>>
>> Aside from the unit tests, I haven't had a chance to try to break the
>> -spawnChild option with our regression corpus.
>> On Thu, Oct 11, 2018 at 9:59 AM Olivier Tavard
>> <[email protected]> wrote:
>>
>>
>> Hi,
>>
>> I have a question about logging in Tika, and in Tika server specifically.
>> We use Tika server to index millions of files on a Windows file share.
>> To be more precise, we use Apache ManifoldCF to crawl the files, and the text
>> extraction is done by Tika server 1.19.
>> The spawnChild option is active. With very big files we get some OOM
>> errors, and the Tika server parent kills and restarts the child process as
>> it should. It works great; I just wanted to know whether the Tika server
>> child log could include the name of the file that caused the OOM. So far in
>> the Tika log I can find the error and the date of the error, but not the
>> filename. I changed the log level to debug, but the filename did not appear
>> either.
>>
>> To find this information, I first have to find the date and time of the
>> restart of the child in the Tika server log. Then I open the Apache
>> ManifoldCF log and search it at the date and time found in the Tika log to
>> finally find the problematic file sent to Tika.
>> Did I miss something, and can the filename be found in the Tika log? If Tika
>> could add the filename to its own log, it would be very helpful for us.
>>
>> Thanks,
>> Best regards,
>> Olivier
>>
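
The manual correlation described in the original question (find the child restart time in the Tika server log, then look at the ManifoldCF log around that time) could be scripted. The following is only a rough sketch in Python under stated assumptions: the timestamp layout, the "restarting child" marker text, and the crawler line format are all hypothetical and not taken from real Tika or ManifoldCF output, so they would need to be adapted to the actual logs.

```python
from datetime import datetime, timedelta

# ASSUMPTION: every log line starts with a timestamp in this layout.
TS_FORMAT = "%Y-%m-%d %H:%M:%S"


def parse_ts(line):
    """Return the leading timestamp of a log line, or None if it has none."""
    try:
        return datetime.strptime(line[:19], TS_FORMAT)
    except ValueError:
        return None


def restart_times(tika_lines, marker="restarting child"):
    """Timestamps of Tika log lines containing the (hypothetical) restart marker."""
    times = []
    for line in tika_lines:
        ts = parse_ts(line)
        if ts is not None and marker in line:
            times.append(ts)
    return times


def candidates(crawler_lines, restart, window=timedelta(seconds=30)):
    """Crawler log lines written during the `window` leading up to a restart."""
    hits = []
    for line in crawler_lines:
        ts = parse_ts(line)
        if ts is not None and restart - window <= ts <= restart:
            hits.append(line.rstrip("\n"))
    return hits


if __name__ == "__main__":
    # Made-up sample lines, for illustration only.
    tika_log = [
        "2018-10-11 09:10:00 INFO started",
        "2018-10-11 09:15:02 WARN restarting child process",
    ]
    crawler_log = [
        "2018-10-11 09:00:00 sending small.txt",
        "2018-10-11 09:14:50 sending big.pdf",
    ]
    for restart in restart_times(tika_log):
        for line in candidates(crawler_log, restart):
            print(restart, "->", line)
```

The 30-second window is an arbitrary default; a document that takes a long time to exhaust memory may have been submitted well before the restart, so the window may need widening.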
