My guess is that you're running out of RAM. Actual Java profiling is beyond me, but I have seen update problems that were solved by adding more RAM.

If you are updating every few minutes, and each new index takes more than a few minutes to warm, you could be running into problems with overlapping warming indexes. This FAQ entry has more on what I mean, although it isn't targeted at exactly this case: http://wiki.apache.org/solr/FAQ#What_does_.22exceeded_limit_of_maxWarmingSearchers.3DX.22_mean.3F

Overlapping warming indexes can result in excessive RAM and/or CPU usage.
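To make the overlap concrete, here is a rough back-of-the-envelope sketch in Python. The function name and the example timings are my own assumptions for illustration, not measurements from your system; the idea is just that if warming takes longer than the gap between commits, several searchers end up warming at once.

```python
import math


def overlapping_warmers(warm_seconds: float, commit_interval_seconds: float) -> int:
    """Estimate the peak number of searchers warming at the same time.

    Assumes (hypothetically) that every commit opens a new searcher, that
    commits arrive at a fixed interval, and that each warm-up takes the
    same amount of time.
    """
    if commit_interval_seconds <= 0:
        raise ValueError("commit interval must be positive")
    if warm_seconds <= 0:
        return 0
    return math.ceil(warm_seconds / commit_interval_seconds)


# Hypothetical numbers: a commit every 2 minutes, warming takes 5 minutes.
print(overlapping_warmers(warm_seconds=300, commit_interval_seconds=120))  # 3
```

If that estimate is higher than whatever maxWarmingSearchers is set to in your solrconfig.xml, you would expect to hit the error described in the FAQ entry above.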

If you haven't passed your JVM any options to tune garbage collection, doing so can also help; look at the options for concurrent garbage collection. But if your fundamental problem is overlapping warming queries, you probably need to make that stop first.
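One more thing you could check from the pread trace quoted below is whether the merge thread is making progress at all. The following is only a sketch: the regular expression and the function name are my own, and the sample is the first three pread calls copied from your message. It pulls out the file offsets and prints how far each call advances.

```python
import re

# Matches strace-style lines such as:
#   pread(172, "..."..., 4096, 98004192) = 4096
# even when the mail client has wrapped them across several lines.
PREAD = re.compile(
    r'pread\((\d+),\s*"[^"]*"\.\.\.,\s*(\d+),\s*(\d+)\)\s*=\s*(\d+)'
)


def pread_strides(trace: str):
    """Return (calls, strides): parsed pread calls and offset deltas."""
    calls = [
        (int(fd), int(size), int(offset), int(ret))
        for fd, size, offset, ret in PREAD.findall(trace)
    ]
    strides = [b[2] - a[2] for a, b in zip(calls, calls[1:])]
    return calls, strides


# First three pread calls from the quoted trace, copied verbatim.
SAMPLE = r'''
23178 pread(172,
"\270\316\276\2\245\371\274\2\271\316\276\2\272\316\276\2\273\316\276\2\274\316\276\2\275\316\276\2\276\316\276\2"...,
4096, 98004192) = 4096<0.000009>
23178 pread(172,
"\245\371\274\2\271\316\276\2\272\316\276\2\273\316\276\2\274\316\276\2\275\316\276\2\276\316\276\2\277\316\276\2"...,
4096, 98004196) = 4096<0.000009>
23178 pread(172,
"\271\316\276\2\272\316\276\2\273\316\276\2\274\316\276\2\275\316\276\2\276\316\276\2\277\316\276\2\300\316\276\2"...,
4096, 98004200) = 4096<0.000008>
'''

calls, strides = pread_strides(SAMPLE)
print("bytes requested per call:", {size for _, size, _, _ in calls})  # {4096}
print("offset advance per call:", strides)  # [4, 4]
```

On that sample each call asks for 4096 bytes but the offset only moves forward by 4, so the thread looks busy rather than blocked, just reading nearly the same block over and over. I can't tell from the trace alone why it does that, so treat this as something to investigate rather than a diagnosis.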

On 3/8/2011 5:17 PM, danomano wrote:
Hi folks, I've been using solr for about 3 months.

Our Solr install is a single node, and we have been injecting logging data
into the Solr server every couple of minutes, with each update taking a few
minutes.

Everything was working fine until this morning, at which point it appeared
that all updates were hung.

Restarting the Solr server did not help, as all updaters immediately 'hung'
again.

Poking around in the threads, and strace, I do in fact see stuff happening.

The index size itself is about 270GB (we are hoping to support up to
500GB-1TB), and we have supplied the system with ~3TB of disk space.

Any Tips on what could be happening?
notes: we have never run an optimize yet.
           we have never deleted from system yet.


The merge thread appears to be the one 'never returning':
"Lucene Merge Thread #0" - Thread t@41
    java.lang.Thread.State: RUNNABLE
        at sun.nio.ch.FileDispatcher.pread0(Native Method)
        at sun.nio.ch.FileDispatcher.pread(FileDispatcher.java:31)
        at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:234)
        at sun.nio.ch.IOUtil.read(IOUtil.java:210)
        at sun.nio.ch.FileChannelImpl.read(FileChannelImpl.java:622)
        at org.apache.lucene.store.NIOFSDirectory$NIOFSIndexInput.readInternal(NIOFSDirectory.java:161)
        at org.apache.lucene.store.BufferedIndexInput.readBytes(BufferedIndexInput.java:139)
        at org.apache.lucene.store.BufferedIndexInput.readBytes(BufferedIndexInput.java:94)
        at org.apache.lucene.store.DataOutput.copyBytes(DataOutput.java:176)
        at org.apache.lucene.index.FieldsWriter.addRawDocuments(FieldsWriter.java:209)
        at org.apache.lucene.index.SegmentMerger.copyFieldsNoDeletions(SegmentMerger.java:424)
        at org.apache.lucene.index.SegmentMerger.mergeFields(SegmentMerger.java:332)
        at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:153)
        at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4053)
        at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3645)
        at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:339)
        at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:407)


Some strace output:
23178 pread(172,
"\270\316\276\2\245\371\274\2\271\316\276\2\272\316\276\2\273\316\276\2\274\316\276\2\275\316\276\2\276\316\276\2"...,
4096, 98004192) = 4096<0.000009>
23178 pread(172,
"\245\371\274\2\271\316\276\2\272\316\276\2\273\316\276\2\274\316\276\2\275\316\276\2\276\316\276\2\277\316\276\2"...,
4096, 98004196) = 4096<0.000009>
23178 pread(172,
"\271\316\276\2\272\316\276\2\273\316\276\2\274\316\276\2\275\316\276\2\276\316\276\2\277\316\276\2\300\316\276\2"...,
4096, 98004200) = 4096<0.000008>
23178 pread(172,
"\272\316\276\2\273\316\276\2\274\316\276\2\275\316\276\2\276\316\276\2\277\316\276\2\300\316\276\2\301\316\276\2"...,
4096, 98004204) = 4096<0.000008>
23178 pread(172,
"\273\316\276\2\274\316\276\2\275\316\276\2\276\316\276\2\277\316\276\2\300\316\276\2\301\316\276\2\302\316\276\2"...,
4096, 98004208) = 4096<0.000008>
23178 pread(172,
"\274\316\276\2\275\316\276\2\276\316\276\2\277\316\276\2\300\316\276\2\301\316\276\2\302\316\276\2\367\343\274\2"...,
4096, 98004212) = 4096<0.000009>
23178 pread(172,
"\275\316\276\2\276\316\276\2\277\316\276\2\300\316\276\2\301\316\276\2\302\316\276\2\367\343\274\2\246\371\274\2"...,
4096, 98004216) = 4096<0.000008>
23178 pread(172,
"\276\316\276\2\277\316\276\2\300\316\276\2\301\316\276\2\302\316\276\2\367\343\274\2\246\371\274\2\303\316\276\2"...,
4096, 98004220) = 4096<0.000009>
23178 pread(172,
"\277\316\276\2\300\316\276\2\301\316\276\2\302\316\276\2\367\343\274\2\246\371\274\2\303\316\276\2\304\316\276\2"...,
4096, 98004224) = 4096<0.000013>
22688<... futex resumed>  )             = -1 ETIMEDOUT (Connection timed
out)<0.051276>
23178 pread(172,
"\300\316\276\2\301\316\276\2\302\316\276\2\367\343\274\2\246\371\274\2\303\316\276\2\304\316\276\2\305\316\276\2"...,
4096, 98004228) = 4096<0.000010>
22688 futex(0x464a9f28, FUTEX_WAKE_PRIVATE, 1
23178 pread(172,
"\301\316\276\2\302\316\276\2\367\343\274\2\246\371\274\2\303\316\276\2\304\316\276\2\305\316\276\2\306\316\276\2"...,
4096, 98004232) = 4096<0.000010>
22688<... futex resumed>  )             = 0<0.000051>
23178 pread(172,
"\302\316\276\2\367\343\274\2\246\371\274\2\303\316\276\2\304\316\276\2\305\316\276\2\306\316\276\2\307\316\276\2"...,
4096, 98004236) = 4096<0.000010>
22688 clock_gettime(CLOCK_MONOTONIC,
23178 pread(172,
"\367\343\274\2\246\371\274\2\303\316\276\2\304\316\276\2\305\316\276\2\306\316\276\2\307\316\276\2\310\316\276\2"...,
4096, 98004240) = 4096<0.000010>
22688<... clock_gettime resumed>  {1900472, 454038316}) = 0<0.000054>
23178 pread(172,
"\246\371\274\2\303\316\276\2\304\316\276\2\305\316\276\2\306\316\276\2\307\316\276\2\310\316\276\2\311\316\276\2"...,
4096, 98004244) = 4096<0.000011>
22688 clock_gettime(CLOCK_MONOTONIC,
23178 pread(172,
"\303\316\276\2\304\316\276\2\305\316\276\2\306\316\276\2\307\316\276\2\310\316\276\2\311\316\276\2\312\316\276\2"...,
4096, 98004248) = 4096<0.000010>
22688<... clock_gettime resumed>  {1900472, 454169316}) = 0<0.000051>
23178 pread(172,
"\304\316\276\2\305\316\276\2\306\316\276\2\307\316\276\2\310\316\276\2\311\316\276\2\312\316\276\2\313\316\276\2"...,
4096, 98004252) = 4096<0.000010>
22688 clock_gettime(CLOCK_MONOTONIC,
23178 pread(172,
"\305\316\276\2\306\316\276\2\307\316\276\2\310\316\276\2\311\316\276\2\312\316\276\2\313\316\276\2\314\316\276\2"...,
4096, 98004256) = 4096<0.000011>
22688<... clock_gettime resumed>  {1900472, 454290316}) = 0<0.000049>
23178 pread(172,
"\306\316\276\2\307\316\276\2\310\316\276\2\311\316\276\2\312\316\276\2\313\316\276\2\314\316\276\2\247\371\274\2"...,
4096, 98004260) = 4096<0.000010>
22688 clock_gettime(CLOCK_REALTIME,
23178 pread(172,
"\307\316\276\2\310\316\276\2\311\316\276\2\312\316\276\2\313\316\276\2\314\316\276\2\247\371\274\2\315\316\276\2"...,
4096, 98004264) = 4096<0.000010>
22688<... clock_gettime resumed>  {1299621913, 884373000}) = 0<0.000050>
23178 pread(172,
"\310\316\276\2\311\316\276\2\312\316\276\2\313\316\276\2\314\316\276\2\247\371\274\2\315\316\276\2\316\316\276\2"...,
4096, 98004268) = 4096<0.000010>
22688 futex(0x2aac7406ae34, FUTEX_WAIT_PRIVATE, 1, {0, 49938000}
23178 pread(172,
"\311\316\276\2\312\316\276\2\313\316\276\2\314\316\276\2\247\371\274\2\315\316\276\2\316\316\276\2\317\316\276\2"...,
4096, 98004272) = 4096<0.000008>
23178 pread(172,
"\312\316\276\2\313\316\276\2\314\316\276\2\247\371\274\2\315\316\276\2\316\316\276\2\317\316\276\2\320\316\276\2"...,
4096, 98004276) = 4096<0.000009>
23178 pread(172,
"\313\316\276\2\314\316\276\2\247\371\274\2\315\316\276\2\316\316\276\2\317\316\276\2\320\316\276\2\321\316\276\2"...,
4096, 98004280) = 4096<0.000008>
23178 pread(172,
"\314\316\276\2\247\371\274\2\315\316\276\2\316\316\276\2\317\316\276\2\320\316\276\2\321\316\276\2\322\316\276\2"...,
4096, 98004284) = 4096<0.000009>
23178 pread(172,
"\247\371\274\2\315\316\276\2\316\316\276\2\317\316\276\2\320\316\276\2\321\316\276\2\322\316\276\2\323\316\276\2"...,
4096, 98004288) = 40



Thanks
Dan





--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-Hanging-all-of-sudden-with-update-csv-tp2652903p2652903.html
Sent from the Solr - User mailing list archive at Nabble.com.