Except that it didn't fix anything!  I _think_ I got it right this
time: https://issues.apache.org/jira/browse/TIKA-2754  Let me know
what you find.

Thank you, again.

Cheers,

         Tim
On Fri, Oct 12, 2018 at 5:44 AM Olivier Tavard
<[email protected]> wrote:
>
> Hi,
>
> Thanks for the quick fix!
> The value of the "path" parameter where you made the commit (the parse method
> in the TikaResource class) is always set to "unpack/all" when I launch the
> indexing on the file share. Shouldn't it normally be the file path? I do not
> understand why it has this value.
>
> Thanks,
> Best regards,
>
> Olivier
>
>
> On Oct 11, 2018, at 19:46, Tim Allison <[email protected]> wrote:
>
> Doh. Sorry.  I just added that in bf75e39.  Please let us know what
> else you find!
>
> Aside from the unit tests, I haven't had a chance to try to break the
> -spawnChild option with our regression corpus.
> On Thu, Oct 11, 2018 at 9:59 AM Olivier Tavard
> <[email protected]> wrote:
>
>
> Hi,
>
> I have a question about logging in Tika, and in Tika server specifically.
> We use Tika server to index millions of files on a Windows file share.
> To be more precise, we use Apache ManifoldCF to crawl the files, and the text
> extraction is done by Tika server 1.19.
> The spawnChild option is active. For very big files, we get some OOMs, and
> the Tika server parent kills and restarts the child process as it should. It
> works great; I just wanted to know whether the Tika server child log could
> include the name of the file that caused the OOM. So far, in the Tika log I
> can find the error and its date but not the filename. I changed the log
> level to debug, but the filename still did not appear.
>
> To find this information, I first have to find the date and time of the
> child restart in the Tika server log. Then I open the Apache ManifoldCF log
> and search it for that date and time to finally find the problematic file
> sent to Tika.
> Did I miss something? Can the filename already be found in the Tika log? If
> Tika could add the filename to its own log, it would be very helpful for us.
>
> Thanks,
> Best regards,
> Olivier
>
>
