Hello,

Thanks for the fix, it works well!
 
Best regards,

Olivier 


> On 12 Oct 2018, at 18:41, Tim Allison <[email protected]> wrote:
> 
> Except that it didn't fix anything!  I _think_ I got it right this
> time: https://issues.apache.org/jira/browse/TIKA-2754  Let me know
> what you find.
> 
> Thank you, again.
> 
> Cheers,
> 
>         Tim
> On Fri, Oct 12, 2018 at 5:44 AM Olivier Tavard
> <[email protected]> wrote:
>> 
>> Hi,
>> 
>> Thanks for the quick fix!
>> The value of the parameter "path" where you made the commit (the parse method in
>> the TikaResource class) was always set to "unpack/all" when I launched the
>> indexing on the file share. Normally it should be the file path, right? I
>> do not understand why it has this value.
>> 
>> Thanks,
>> Best regards,
>> 
>> Olivier
>> 
>> 
>> On 11 Oct 2018, at 19:46, Tim Allison <[email protected]> wrote:
>> 
>> Doh. Sorry.  I just added that in bf75e39.  Please let us know what
>> else you find!
>> 
>> Aside from the unit tests, I haven't had a chance to try to break the
>> -spawnChild option with our regression corpus.
>> On Thu, Oct 11, 2018 at 9:59 AM Olivier Tavard
>> <[email protected]> wrote:
>> 
>> 
>> Hi,
>> 
>> I have a question about logging in Tika, and in Tika server specifically.
>> We use Tika server to index millions of files on a Windows file share.
>> To be more precise, we use Apache ManifoldCF to crawl the files, and the text
>> extraction is done by Tika server 1.19.
>> The spawnChild option is active. With very big files, we get some
>> OOMs, and the Tika server parent kills and restarts the child process as it
>> should. It works great; I just wanted to know whether it would be possible to
>> have the Tika server child log include the name of the file that caused the
>> OOM. So far, in the Tika log I can find the error and the date of the error
>> but not the filename. I changed the log level to debug, but the filename did
>> not appear either.
>> 
>> To find this information, I first have to find the date and time of the
>> restart of the child in the Tika server log. Then I open the Apache
>> ManifoldCF log and search it at the date and time found in the Tika
>> log to finally find the problematic file sent to Tika.
>> Did I miss something, and can the filename be found in the Tika log? If Tika
>> could add the filename to its own log, it would be very helpful for us.
>> 
>> Thanks,
>> Best regards,
>> Olivier
>> 
>> 
