memory leak in PDFBox -- does SolrCell need to call COSName.clearResources()?
We've been struggling with hangs in the Solr process that indexes incoming PDF documents. TL;DR: I'm thinking that PDFBox needs to have COSName.clearResources() called on it if the Solr indexer is expected to keep running indefinitely. Is that likely? Is there anybody on this list who is doing PDF extraction in a long-running process and having it work?

The thread dump of a hung process often shows lots of threads hanging here:

    java.lang.Thread.State: BLOCKED (on object monitor)
        at java.util.Collections$SynchronizedMap.get(Collections.java:1975)
        - waiting to lock 0x00072551f908 (a java.util.Collections$SynchronizedMap)
        at org.apache.pdfbox.util.PDFOperator.getOperator(PDFOperator.java:68)
        at org.apache.pdfbox.pdfparser.PDFStreamParser.parseNextToken(PDFStreamParser.java:441)

And the heap is almost full:

    Heap
     PSYoungGen      total 796416K, used 386330K
      eden space 398208K, 97% used
      from space 398208K,  0% used
      to   space 398208K,  0% used
     PSOldGen
      object space 2389376K, 99% used
     PSPermGen
      object space 53824K, 99% used

Using Eclipse's MAT to look at the heap dump of a hung process shows that one of the chief memory-leak suspects is PDFBox's COSName class:

    The class org.apache.pdfbox.cos.COSName, loaded by
    java.net.FactoryURLClassLoader @ 0x725a1a230, occupies 151,183,360 (16.64%) bytes.
    The memory is accumulated in one instance of
    java.util.concurrent.ConcurrentHashMap$Segment[] loaded by the system class loader.

The "Shortest Paths To the Accumulation Point" graph for that looks like this:

    Class Name                                               Shallow Heap   Retained Heap
    java.util.concurrent.ConcurrentHashMap$Segment[16]                80      151,160,680
    segments java.util.concurrent.ConcurrentHashMap                   48      151,160,728
    nameMap class org.apache.pdfbox.cos.COSName                    1,184      151,183,360
    [123] java.lang.Object[2560]                                  10,256       26,004,368
    elementData java.util.Vector                                      32       26,004,400
    classes java.net.FactoryURLClassLoader                            72       26,228,440
    classloader class org.apache.pdfbox.cos.COSDocument                8                8
    class org.apache.pdfbox.cos.COSDocument                           64        1,703,704
    referent java.lang.ref.Finalizer                                  40        1,703,744

And the Dominator Tree chart looks like this:

    26.69%  org.apache.solr.core.SolrCore
    16.64%  class org.apache.pdfbox.cos.COSName
     2.89%  java.net.FactoryURLClassLoader

Now the implementation of COSName says this:

    /**
     * Not usually needed except if resources need to be reclaimed in a long
     * running process.
     * Patch provided by fles...@gmail.com
     * incorporated 5/23/08, danielwil...@users.sourceforge.net
     */
    public static synchronized void clearResources()
    {
        // Clear them all
        nameMap.clear();
    }

I *don't* see a call to clearResources() anywhere in Solr or Tika, and I think that's the problem. The implementation puts all the COSNames in a class-level static map, which never gets emptied and apparently keeps growing forever. I suspect the fact that the URLClassLoader is involved in the path to the COSName class is what's filling up the PermGen space in the heap. Does that sound likely? Possible? Can anyone speak to that? Does anyone have suggested next steps for us, besides restarting our Solr indexer process every couple of hours?
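If calling clearResources() ourselves turns out to be the workaround, here's a sketch of what I have in mind. It would have to run inside the Solr JVM, since that's where the static nameMap lives; it assumes PDFBox 1.x, where COSName.clearResources() exists (as quoted above); the one-hour interval is arbitrary; and whether it's safe to call while a parse is in flight is a separate question:

    import java.util.concurrent.Executors;
    import java.util.concurrent.ScheduledExecutorService;
    import java.util.concurrent.TimeUnit;

    import org.apache.pdfbox.cos.COSName;

    /**
     * Hypothetical workaround sketch: a small task running inside the Solr
     * JVM that periodically empties COSName's static nameMap so it cannot
     * grow without bound in a long-running indexer.
     */
    public class CosNameCacheReaper
    {
        public static void start()
        {
            ScheduledExecutorService scheduler =
                    Executors.newSingleThreadScheduledExecutor();
            scheduler.scheduleAtFixedRate(new Runnable()
            {
                public void run()
                {
                    // the call PDFBox's own javadoc suggests for long-running
                    // processes; it just does nameMap.clear()
                    COSName.clearResources();
                }
            }, 1, 1, TimeUnit.HOURS);
        }
    }

How start() would get wired in (from some component Solr loads at startup, say) is a separate detail; the point is just that something inside the Solr process has to make the call.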
solr hanging, possible memory issues?
We're running Solr 3.4, a fairly out-of-the-box Solr/Jetty setup, with -Xms1200m -Xmx3500m. When we start pushing more than a couple of documents per second at it (PDFs, which go through SolrCell/Tika/PDFBox), the Java process hangs and becomes completely unresponsive. We first thought PDFBox was the problem, since we saw lots of blocked threads, but now we've also noticed that all the thread dumps show a similar picture of the heap, with three of the areas at 99%, even though we don't get any out-of-memory messages:

    Heap
     PSYoungGen      total 796416K, used 386330K
      eden space 398208K, 97% used
      from space 398208K,  0% used
      to   space 398208K,  0% used
     PSOldGen
      object space 2389376K, 99% used
     PSPermGen
      object space 53824K, 99% used

We've also just noticed that after restarting Solr, the PermGen space grows steadily until it hits 99% and then just stays there.

1) Is that PermGen behavior normal, growing steadily to 99% and then staying there? Apparently we can increase it with -XX:MaxPermSize=128M, but if there's a memory leak that only postpones the problem.

2) If all three of those indicators are pegged at 99%, I would expect the JVM to throw an out-of-memory error rather than just withdrawing into its own navel. Is that expected behavior, or is it indicative of anything else?

Any tips would be greatly appreciated, thanks!

--
Kevin M. Goess
Software Engineer
Berkeley Electronic Press
kgo...@bepress.com
510-665-1200 x179
www.bepress.com
bepress: sustainable scholarly publishing
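For anyone chasing something similar, a startup line along the following lines would give more visibility into GC behavior and PermGen here. This is a sketch only: the MaxPermSize value and log paths are illustrative, and these are standard HotSpot flags rather than anything Solr-specific:

    java -Xms1200m -Xmx3500m -XX:MaxPermSize=128m \
         -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Xloggc:logs/gc.log \
         -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=logs \
         -jar start.jar

If the GC log shows back-to-back full collections that reclaim almost nothing, that would at least explain a process that looks hung without ever throwing an OutOfMemoryError.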
solr hanging regularly, thread dump attached, PDFOperator
If we have more than half a dozen processes posting updates to our Solr instance (each at about 1/sec), we get regular and fairly reproducible Solr hangs, requiring a kill -9 of the Solr process. We're running Solr 3.4, a fairly out-of-the-box Solr/Jetty setup. I've got a thread dump and some GC logging information from the latest one. I haven't programmed in Java for a decade, and any help decoding these dumps would be hugely appreciated.

When I was first alerted that the process was hanging, this was the count of what the threads were doing:

    33  java.lang.Thread.State: BLOCKED (on object monitor)
    16  java.lang.Thread.State: RUNNABLE
     2  java.lang.Thread.State: TIMED_WAITING (on object monitor)
     1  java.lang.Thread.State: WAITING (on object monitor)
    77  java.lang.Thread.State: WAITING (parking)

A couple of minutes later it looked like this:

    29  java.lang.Thread.State: BLOCKED (on object monitor)
     8  java.lang.Thread.State: RUNNABLE
     1  java.lang.Thread.State: TIMED_WAITING (on object monitor)
     1  java.lang.Thread.State: TIMED_WAITING (parking)
     1  java.lang.Thread.State: TIMED_WAITING (sleeping)
     3  java.lang.Thread.State: WAITING (on object monitor)
     1  java.lang.Thread.State: WAITING (parking)

All the blocked threads looked like this:

    java.lang.Thread.State: BLOCKED (on object monitor)
        at java.util.Collections$SynchronizedMap.get(Collections.java:1975)
        - waiting to lock 0x00072551f908 (a java.util.Collections$SynchronizedMap)
        at org.apache.pdfbox.util.PDFOperator.getOperator(PDFOperator.java:68)
        at org.apache.pdfbox.pdfparser.PDFStreamParser.parseNextToken(PDFStreamParser.java:441)
        at org.apache.pdfbox.pdfparser.PDFStreamParser.access$000(PDFStreamParser.java:46)
        at org.apache.pdfbox.pdfparser.PDFStreamParser$1.tryNext(PDFStreamParser.java:175)
        at org.apache.pdfbox.pdfparser.PDFStreamParser$1.hasNext(PDFStreamParser.java:187)
        at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:266)
        at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:251)
        at org.apache.pdfbox.util.PDFStreamEngine.processStream(PDFStreamEngine.java:225)
        at org.apache.pdfbox.util.PDFTextStripper.processPage(PDFTextStripper.java:442)
        at org.apache.pdfbox.util.PDFTextStripper.processPages(PDFTextStripper.java:366)
        at org.apache.pdfbox.util.PDFTextStripper.writeText(PDFTextStripper.java:322)
        at org.apache.pdfbox.util.PDFTextStripper.getText(PDFTextStripper.java:242)
        at org.apache.tika.parser.pdf.PDF2XHTML.process(PDF2XHTML.java:53)
        at org.apache.tika.parser.pdf.PDFParser.parse(PDFParser.java:90)
        at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:197)
        at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:197)
        at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:137)
        at org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:213)
        ...

and yes, that 0x00072551f908 address is the same for each thread that's blocked.
In the first case, where there were 77 threads waiting (parking), they all looked like this:

    java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for 0x000726089700 (a java.util.concurrent.locks.ReentrantReadWriteLock$FairSync)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:811)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireShared(AbstractQueuedSynchronizer.java:941)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireShared(AbstractQueuedSynchronizer.java:1261)
        at java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.lock(ReentrantReadWriteLock.java:594)
        at org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:214)
        at org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:61)
        at org.apache.solr.update.processor.LogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:115)
        at org.apache.solr.handler.XMLLoader.processUpdate(XMLLoader.java:158)
        at org.apache.solr.handler.XMLLoader.load(XMLLoader.java:79)
        at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:67)
        ...

and yes, they're all waiting for 0x000726089700.

A gist of the first thread dump, with most of the repetition removed:
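For what it's worth, the per-state counts above can be pulled out of a thread dump with a one-liner along these lines (a sketch; substitute the real pid, or grep the file the dump was captured to instead of piping from jstack):

    jstack <pid> | grep 'java.lang.Thread.State' | sort | uniq -c | sort -rn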
turning up logging using the web UI, can't get more than INFO
We have a pretty standard out-of-the-box Solr/Jetty setup. Using the web UI at /solr/admin/logging, setting WARNING or SEVERE gives us less logging, but none of CONFIG, FINE, or FINEST results in any *more* logging than plain INFO. Is there another place to look for something that might be controlling that? Maybe a threshold=INFO somewhere? We've been unable to find anything.

We're trying to turn up logging because our Solr indexing server is hanging at least once a day and we're trying to find out why. It becomes unresponsive and we have to kill -9 it.
Re: turning up logging using the web UI, can't get more than INFO
On Fri, Aug 24, 2012 at 10:23 AM, Ahmet Arslan iori...@yahoo.com wrote:

> > We have a pretty standard out-of-the-box solr/jetty setup. Using the
> > web UI at /solr/admin/logging, for WARNING or SEVERE we get less
> > logging, but none of CONFIG, FINE or FINEST result in any *more*
> > logging than just at INFO.
>
> Taken from solr-trunk/solr/example/README.txt:
>
>   By default, Solr will log to the console. This can be convenient when
>   first getting started, but eventually you will want to log to a file.
>   To enable logging, you can just pass a system property to Jetty on
>   startup:
>
>     java -Djava.util.logging.config.file=etc/logging.properties -jar start.jar

Thanks, Ahmet. We're actually fine capturing console logging to a file right now. But are you saying that the FINEST/FINE/CONFIG levels on the web UI at /solr/admin/logging won't work without a logging.config.file set up?

--
Kevin M. Goess
Software Engineer
Berkeley Electronic Press
kgo...@bepress.com
510-665-1200 x179
www.bepress.com
bepress: sustainable scholarly publishing
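One possible explanation, a guess on my part rather than anything from the README: with java.util.logging, the per-logger levels set in the web UI only control what gets offered to the handlers, and the default ConsoleHandler is itself capped at INFO, so anything finer is dropped before it reaches the console. A minimal etc/logging.properties along these lines should lift that cap (file location and levels are illustrative), used with the startup line Ahmet quoted above:

    # send everything to the console and default the root logger to FINE
    handlers = java.util.logging.ConsoleHandler
    .level = FINE

    # the ConsoleHandler itself defaults to INFO; raise it so FINE records get through
    java.util.logging.ConsoleHandler.level = FINE
    java.util.logging.ConsoleHandler.formatter = java.util.logging.SimpleFormatter

    # per-package levels can still be tuned here or from the web UI
    org.apache.solr.level = FINE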