With ~125k files, there were 10 restarts: 7x with exit code=137, 2x with 
exit code=1, and 1x with exit code=253.  The exit code=253 was a timeout 
for: 111126.pdf.

A restart happens roughly every 8-10 minutes.

502907 2015-07-20 17:13:24,420 [main] WARN  
org.apache.tika.batch.BatchProcessDriverCLI  - Must restart process 
(exitValue=137 numRestarts=0 receivedRestartMessage=false)
986787 2015-07-20 17:21:28,300 [main] WARN  
org.apache.tika.batch.BatchProcessDriverCLI  - Must restart process 
(exitValue=253 numRestarts=1 receivedRestartMessage=false)
1574818 2015-07-20 17:31:16,331 [main] WARN  
org.apache.tika.batch.BatchProcessDriverCLI  - Must restart process 
(exitValue=137 numRestarts=2 receivedRestartMessage=false)
2040741 2015-07-20 17:39:02,254 [main] WARN  
org.apache.tika.batch.BatchProcessDriverCLI  - Must restart process 
(exitValue=137 numRestarts=3 receivedRestartMessage=false)
2545702 2015-07-20 17:47:27,215 [main] WARN  
org.apache.tika.batch.BatchProcessDriverCLI  - Must restart process 
(exitValue=137 numRestarts=4 receivedRestartMessage=false)
3084672 2015-07-20 17:56:26,185 [main] WARN  
org.apache.tika.batch.BatchProcessDriverCLI  - Must restart process 
(exitValue=137 numRestarts=5 receivedRestartMessage=false)
3571616 2015-07-20 18:04:33,129 [main] WARN  
org.apache.tika.batch.BatchProcessDriverCLI  - Must restart process 
(exitValue=1 numRestarts=6 receivedRestartMessage=false)
4021342 2015-07-20 18:12:02,855 [main] WARN  
org.apache.tika.batch.BatchProcessDriverCLI  - Must restart process 
(exitValue=1 numRestarts=7 receivedRestartMessage=false)
4503161 2015-07-20 18:20:04,674 [main] WARN  
org.apache.tika.batch.BatchProcessDriverCLI  - Must restart process 
(exitValue=137 numRestarts=8 receivedRestartMessage=false)
4958976 2015-07-20 18:27:40,489 [main] WARN  
org.apache.tika.batch.BatchProcessDriverCLI  - Must restart process 
(exitValue=137 numRestarts=9 receivedRestartMessage=false)
5437962 2015-07-20 18:35:39,475 [main] WARN  
org.apache.tika.batch.BatchProcessDriverCLI  - Hit the maximum number of 
process restarts. Driver is shutting down now.
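
For anyone wanting to repeat the tally on their own driver log, here is a minimal sketch (Java 8+). The class and method names are made up for illustration and are not part of tika-batch. It only assumes the "exitValue=N" field shown in the WARN lines above, plus the common Unix convention that a child killed by signal N is reported with exit value 128+N, so 137 would correspond to signal 9 (SIGKILL). Values outside the 129..192 range, such as the 253 timeout, are treated as ordinary application exit codes.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of tika-batch.
final class RestartTally {
    // Matches the exitValue field as it appears in the driver's WARN lines.
    private static final Pattern EXIT_VALUE = Pattern.compile("exitValue=(\\d+)");

    /** Counts how often each exit value occurs in the given log text. */
    static Map<Integer, Integer> tally(String logText) {
        Map<Integer, Integer> counts = new TreeMap<>();
        Matcher m = EXIT_VALUE.matcher(logText);
        while (m.find()) {
            counts.merge(Integer.parseInt(m.group(1)), 1, Integer::sum);
        }
        return counts;
    }

    /**
     * Assumption: exit values 129..192 mean "killed by signal (value - 128)".
     * Returns the signal number, or -1 if the value is outside that range.
     */
    static int signalFor(int exitValue) {
        return (exitValue >= 129 && exitValue <= 192) ? exitValue - 128 : -1;
    }

    public static void main(String[] args) {
        // A few fields copied from the log excerpt above.
        String sample =
              "(exitValue=137 numRestarts=0 receivedRestartMessage=false)\n"
            + "(exitValue=253 numRestarts=1 receivedRestartMessage=false)\n"
            + "(exitValue=1 numRestarts=6 receivedRestartMessage=false)\n";
        for (Map.Entry<Integer, Integer> e : tally(sample).entrySet()) {
            int sig = signalFor(e.getKey());
            System.out.println("exitValue=" + e.getKey() + " count=" + e.getValue()
                + (sig > 0 ? " (likely signal " + sig + ")" : ""));
        }
    }
}
```

Matching on the field rather than on whole lines keeps the sketch working whether or not the log lines have been wrapped by a mail client.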

-----Original Message-----
From: Allison, Timothy B. [mailto:[email protected]] 
Sent: Monday, July 20, 2015 3:18 PM
To: [email protected]
Subject: RE: help debugging integration of PDFBox 2.0.0 trunk

Y, sorry, Tilman.  I'm not running into problems with 1.8.9 and straight text 
extraction, though.

Following Timo's recommendation...looks like a memory issue.  Let me know if I 
should post the full file or move to a more recent version of Java. :)

# There is insufficient memory for the Java Runtime Environment to continue.
# Native memory allocation (malloc) failed to allocate 403177472 bytes for 
committing reserved memory.
# Possible reasons:
#   The system is out of physical RAM or swap space
...
#  Out of Memory Error (os_linux.cpp:2798), pid=14958, tid=140419564971776
...
vm_info: OpenJDK 64-Bit Server VM (24.75-b04) for linux-amd64 JRE 
(1.7.0_75-b13), built on Jan 16 2015 09:15:47 by "mockbuild" with gcc 4.8.2 
20140120 (Red Hat 4.8.2-16)


-----Original Message-----
From: Tilman Hausherr [mailto:[email protected]] 
Sent: Monday, July 20, 2015 1:28 PM
To: [email protected]
Subject: Re: help debugging integration of PDFBox 2.0.0 trunk

Am 20.07.2015 um 18:12 schrieb Allison, Timothy B.:
> All,
>    While integrating 2.0.0 trunk into Tika and running against govdocs1, I'm 
> finding two issues that are difficult to reproduce.
>
> Background:
> Tika-batch has a parent process that kicks off a Tika processor in a child 
> process; if that dies unexpectedly, the parent kicks it off again.  I'm 
> running with 10 consumer/parser threads and -Xmx5g on an 8 cpu/8GB vm (RHEL 
> 7, Linux cloud-server-02 3.10.0-123.20.1.el7.x86_64 #1 SMP Wed Jan 21 
> 09:45:55 EST 2015 x86_64 x86_64 x86_64 GNU/Linux).
>
> Two problems:
>
> 1)      The child process exits with value 1. I'm catching Throwable around 
> the primary execution call in the child process and logging it; nothing shows 
> up in the log files from that part of the code. From the parser log files (at 
> trace), I can tell which 10 files were being processed at the time, but I'm 
> not seeing any other information about what caused the exit.  When I run 
> against just those 10 files, all is ok.
>
> 2)      The OS is killing the child far more often than it does with 1.8.9 
> (exit code 137).
>
> For the second problem, I'll wait until the optimizations to the caching are 
> completed before I start worrying about that.  However, do you have any 
> recommendations on how to figure out what's going on with 1)?

I'm also having some problems with that system... with my test software, 
I have observed that Java uses more and more memory, despite being 
told not to use more than a certain amount with -Xmx. After some time, 
the "process killer" kills the application.
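
One way to watch this from inside the JVM is to compare the heap ceiling with the resident set size the kernel reports for the process. The sketch below is only an illustration under stated assumptions: the class name is hypothetical, it relies on Linux exposing a "VmRSS:" line in /proc/self/status, and it is not something PDFBox or Tika provides. The general point it shows is that -Xmx bounds the Java heap only, while native allocations, thread stacks and the like also count toward what the OS sees.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;

// Hypothetical diagnostic helper; Linux-specific because it reads /proc.
final class HeapVsRss {
    /** Extracts the VmRSS value in kB from /proc/<pid>/status lines; -1 if absent. */
    static long parseVmRssKb(List<String> statusLines) {
        for (String line : statusLines) {
            if (line.startsWith("VmRSS:")) {
                // Typical form: "VmRSS:     123456 kB"
                String[] parts = line.substring("VmRSS:".length()).trim().split("\\s+");
                return Long.parseLong(parts[0]);
            }
        }
        return -1;
    }

    public static void main(String[] args) throws IOException {
        // Upper bound on the Java heap, i.e. what -Xmx controls.
        long maxHeapMb = Runtime.getRuntime().maxMemory() / (1024 * 1024);
        System.out.println("max heap: " + maxHeapMb + " MB");

        Path status = Paths.get("/proc/self/status");
        if (Files.isReadable(status)) {
            long rssKb = parseVmRssKb(Files.readAllLines(status, StandardCharsets.UTF_8));
            System.out.println("resident set size: " + (rssKb / 1024) + " MB");
        } else {
            System.out.println("/proc/self/status is not available on this platform");
        }
    }
}
```

Logging both numbers periodically from a long-running job would show whether the resident size keeps climbing while the heap stays under its limit.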

It seems something changed in Java memory management:
http://karunsubramanian.com/websphere/one-important-change-in-memory-management-in-java-8/

I did some investigation on this a few months ago, but gave up out of 
frustration.

Tilman

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

