All,
  While integrating 2.0.0 trunk into Tika and running against govdocs1, I'm 
finding two issues that are difficult to reproduce.

Background:
Tika-batch has a parent process that kicks off a Tika processor in a child 
process; if the child dies unexpectedly, the parent kicks it off again.  I'm 
running with 10 consumer/parser threads and -Xmx5g on an 8 cpu/8GB vm (RHEL 7, 
Linux cloud-server-02 3.10.0-123.20.1.el7.x86_64 #1 SMP Wed Jan 21 09:45:55 EST 
2015 x86_64 x86_64 x86_64 GNU/Linux).

Two problems:

1)      The child process exits with value 1. I'm catching Throwable around the 
primary execution call in the child process and logging it; nothing shows up in 
the log files from that part of the code. From the parser log files (at trace), 
I can tell which 10 files were being processed at the time, but I'm not seeing 
any other information about what caused the exit.  When I run against just 
those 10 files, all is ok.

2)      The OS is killing the child (exit code 137) far more often than it 
does with 1.8.9.
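
For reference, this is roughly how I'm reading the two exit values. It is only 
a minimal sketch: the class and method names (ExitValues, signalOf, describe) 
are hypothetical and are not the actual tika-batch code. It assumes the usual 
Unix convention that a child killed by signal N is reported as 128 + N, so 
137 = 128 + 9 (SIGKILL), whereas 1 means the JVM exited on its own.

```java
public class ExitValues {

    /**
     * Returns the signal number if the exit value follows the
     * 128 + N convention for signal-terminated processes, else -1.
     */
    public static int signalOf(int exitValue) {
        return (exitValue > 128 && exitValue < 256) ? exitValue - 128 : -1;
    }

    /** Human-readable guess at why the child went away. */
    public static String describe(int exitValue) {
        int signal = signalOf(exitValue);
        if (signal == 9) {
            // SIGKILL cannot be caught by the child; on Linux the
            // OOM killer is one common sender (assumption, not proof).
            return "killed by SIGKILL";
        }
        if (signal > 0) {
            return "terminated by signal " + signal;
        }
        if (exitValue == 0) {
            return "normal exit";
        }
        // Anything else was chosen by the child process itself.
        return "exited on its own with status " + exitValue;
    }

    public static void main(String[] args) {
        for (int value : new int[] {0, 1, 137}) {
            System.out.println(value + ": " + describe(value));
        }
    }
}
```

By this reading, 2) comes from outside the JVM, while 1) originates inside 
the child even though my catch block never sees a Throwable.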

For the second problem, I'll wait until the optimizations to the caching are 
completed before I start worrying about that.  However, do you have any 
recommendations on how to figure out what's going on with 1)?

Thank you!

             Cheers,

                   Tim
