Yes, sorry, Tilman.  I'm not running into problems with 1.8.9 and straight text
extraction, though.

Following Timo's recommendation... it looks like a memory issue.  Let me know if I
should post the full file or move to a more recent version of Java. :)

# There is insufficient memory for the Java Runtime Environment to continue.
# Native memory allocation (malloc) failed to allocate 403177472 bytes for committing reserved memory.
# Possible reasons:
#   The system is out of physical RAM or swap space
...
#  Out of Memory Error (os_linux.cpp:2798), pid=14958, tid=140419564971776
...
vm_info: OpenJDK 64-Bit Server VM (24.75-b04) for linux-amd64 JRE (1.7.0_75-b13), built on Jan 16 2015 09:15:47 by "mockbuild" with gcc 4.8.2 20140120 (Red Hat 4.8.2-16)


-----Original Message-----
From: Tilman Hausherr [mailto:[email protected]] 
Sent: Monday, July 20, 2015 1:28 PM
To: [email protected]
Subject: Re: help debugging integration of PDFBox 2.0.0 trunk

Am 20.07.2015 um 18:12 schrieb Allison, Timothy B.:
> All,
>    While integrating 2.0.0 trunk into Tika and running against govdocs1, I'm 
> finding two issues that are difficult to reproduce.
>
> Background:
> Tika-batch has a parent process that kicks off a Tika processor in a child
> process; if that dies unexpectedly, the parent kicks it off again.  I'm
> running with 10 consumer/parser threads and -Xmx5g on an 8 cpu/8GB vm (RHEL
> 7, Linux cloud-server-02 3.10.0-123.20.1.el7.x86_64 #1 SMP Wed Jan 21
> 09:45:55 EST 2015 x86_64 x86_64 x86_64 GNU/Linux).
>
> Two problems:
>
> 1)      The child process exits with value 1. I'm catching Throwable around 
> the primary execution call in the child process and logging it; nothing shows 
> up in the log files from that part of the code. From the parser log files (at 
> trace), I can tell which 10 files were being processed at the time, but I'm 
> not seeing any other information about what caused the exit.  When I run 
> against just those 10 files, all is ok.
>
> 2)      The OS is killing the child far more often than it does with 1.8.9 
> (exit code 137).
>
> For the second problem, I'll wait until the optimizations to the caching are 
> completed before I start worrying about that.  However, do you have any 
> recommendations on how to figure out what's going on with 1)?
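For reference, the two exit values described above can be told apart mechanically. On Linux, a child killed by signal N is conventionally reported with exit value 128 + N, so 137 corresponds to signal 9 (SIGKILL), which is what the kernel's OOM killer sends. An exit value of 1 falls below that range, meaning the process ended on its own rather than being signalled. The following is only a sketch: the class and method names (ExitCodes, signalOf, describe) are hypothetical and not part of Tika or PDFBox.

```java
public final class ExitCodes {
    private ExitCodes() {}

    /**
     * Returns the signal number if the exit value follows the
     * 128 + N convention for signalled processes, otherwise -1.
     */
    public static int signalOf(int exitValue) {
        return (exitValue > 128 && exitValue < 256) ? exitValue - 128 : -1;
    }

    /** Human-readable guess at why a child process ended (hypothetical helper). */
    public static String describe(int exitValue) {
        if (exitValue == 0) {
            return "normal exit";
        }
        int sig = signalOf(exitValue);
        if (sig == 9) {
            return "killed by SIGKILL (signal 9), possibly the kernel OOM killer";
        }
        if (sig > 0) {
            return "terminated by signal " + sig;
        }
        return "exited on its own with status " + exitValue;
    }

    public static void main(String[] args) {
        // The two values reported in this thread.
        System.out.println("1   -> " + describe(1));
        System.out.println("137 -> " + describe(137));
    }
}
```

A parent process could log describe(process.exitValue()) after each restart to separate OS kills from exits that originate inside the child.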

I'm also having some problems with that system... with my test software,
I have observed that Java uses more and more space, despite being
told not to use more than a certain amount with -Xmx. After some time,
the "process killer" kills the application.
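One way to see where the growth comes from is to log heap and non-heap usage from inside the JVM. -Xmx caps only the Java heap, so the process footprint can keep growing while the heap stays under its limit. The sketch below uses the standard java.lang.management API; the class name MemorySnapshot and its helpers are hypothetical. Note that the "non-heap" figure covers only JVM-managed pools (such as the code cache and PermGen/Metaspace), not every native allocation, so it is at best a partial view.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

public final class MemorySnapshot {
    private MemorySnapshot() {}

    /** Converts bytes to whole mebibytes (rounds down). */
    public static long toMb(long bytes) {
        return bytes / (1024L * 1024L);
    }

    /** One-line summary of heap and JVM-managed non-heap memory. */
    public static String snapshot() {
        MemoryMXBean mem = ManagementFactory.getMemoryMXBean();
        MemoryUsage heap = mem.getHeapMemoryUsage();
        MemoryUsage nonHeap = mem.getNonHeapMemoryUsage();
        // getMax() may be -1 when no limit is defined.
        return String.format(
                "heap used=%dMB committed=%dMB max=%dMB; non-heap used=%dMB committed=%dMB",
                toMb(heap.getUsed()), toMb(heap.getCommitted()), toMb(heap.getMax()),
                toMb(nonHeap.getUsed()), toMb(nonHeap.getCommitted()));
    }

    public static void main(String[] args) {
        // Log this periodically (e.g. from a timer thread) and compare it
        // with the resident set size the OS reports for the process.
        System.out.println(snapshot());
    }
}
```

If the numbers logged here stay flat while the resident size reported by the OS keeps climbing, the growth is probably outside the heap, which would fit the malloc failure in the crash log above.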

It seems something changed in Java memory management:
http://karunsubramanian.com/websphere/one-important-change-in-memory-management-in-java-8/

I did some investigation on this a few months ago, but gave up out of 
frustration.

Tilman

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

