Judging from the stack trace you posted on the Solr list, your problem is not the same one: in your case it is Solr that runs out of memory, not Nutch.
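
If it is Solr itself that is exhausting its heap, the first thing to try is simply giving the Solr JVM more memory. A minimal sketch, assuming you run the stock Jetty start.jar from the Solr example directory (the 2g value is only a placeholder, size it to your machine and index):

  java -Xmx2g -jar start.jar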

On Thursday 27 October 2011 14:20:10 Fred Zimmerman wrote:
> I'm having the exact same problem. I am trying to isolate whether it is a
> Solr problem or a Nutch+Solr problem.
> 
> On Wed, Oct 26, 2011 at 11:54 PM, <[email protected]> wrote:
> > Hi,
> > 
> > I am working with a Nutch 1.4 snapshot and having a very strange problem
> > that makes the system run out of memory when indexing into Solr. It does
> > not look like a trivial lack-of-memory problem that can be solved by
> > giving the JVM more memory: I increased the maximum heap size from
> > 2 GB to 3 GB, then to 6 GB, but it made no difference.
> > 
> > A log extract is included below.
> > 
> > Would anyone have any idea of how to fix this problem?
> > 
> > Thanks,
> > 
> > Arkadi
> > 
> > 
> > 2011-10-27 07:08:22,162 INFO  solr.SolrWriter - Adding 1000 documents
> > 2011-10-27 07:08:42,248 INFO  solr.SolrWriter - Adding 1000 documents
> > 2011-10-27 07:13:54,110 WARN  mapred.LocalJobRunner - job_local_0254
> > java.lang.OutOfMemoryError: Java heap space
> >       at java.util.Arrays.copyOfRange(Arrays.java:3209)
> >       at java.lang.String.<init>(String.java:215)
> >       at java.nio.HeapCharBuffer.toString(HeapCharBuffer.java:542)
> >       at java.nio.CharBuffer.toString(CharBuffer.java:1157)
> >       at org.apache.hadoop.io.Text.decode(Text.java:350)
> >       at org.apache.hadoop.io.Text.decode(Text.java:322)
> >       at org.apache.hadoop.io.Text.readString(Text.java:403)
> >       at org.apache.nutch.parse.ParseText.readFields(ParseText.java:50)
> >       at org.apache.nutch.util.GenericWritableConfigurable.readFields(GenericWritableConfigurable.java:54)
> >       at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:67)
> >       at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:40)
> >       at org.apache.hadoop.mapred.Task$ValuesIterator.readNextValue(Task.java:991)
> >       at org.apache.hadoop.mapred.Task$ValuesIterator.next(Task.java:931)
> >       at org.apache.hadoop.mapred.ReduceTask$ReduceValuesIterator.moveToNext(ReduceTask.java:241)
> >       at org.apache.hadoop.mapred.ReduceTask$ReduceValuesIterator.next(ReduceTask.java:237)
> >       at org.apache.nutch.indexer.IndexerMapReduce.reduce(IndexerMapReduce.java:81)
> >       at org.apache.nutch.indexer.IndexerMapReduce.reduce(IndexerMapReduce.java:50)
> >       at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:463)
> >       at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:411)
> >       at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:216)
> > 2011-10-27 07:13:54,382 ERROR solr.SolrIndexer - java.io.IOException: Job failed!
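
For the Nutch-side trace above: the reducer dies while deserializing a ParseText value in IndexerMapReduce, which usually points to a single oversized parsed document rather than to the overall heap size, so raising -Xmx alone may not help. A sketch of two knobs worth checking in conf/nutch-site.xml, assuming Nutch 1.4 defaults; the values shown are illustrative, not recommendations:

  <property>
    <name>http.content.limit</name>
    <value>65536</value>
    <description>Length limit in bytes for downloaded content; 65536 is
    the default. Make sure it has not been set to -1 (unlimited), or one
    huge page can produce an enormous ParseText. It only applies at fetch
    time, so already-fetched segments keep their large parse texts.
    </description>
  </property>
  <property>
    <name>solr.commit.size</name>
    <value>250</value>
    <description>Number of documents sent to Solr per update batch; the
    "Adding 1000 documents" lines above correspond to the default of
    1000. This relieves the SolrWriter side rather than the
    deserialization step, but it is cheap to try.</description>
  </property>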

-- 
Markus Jelsma - CTO - Openindex
http://www.linkedin.com/in/markus17
050-8536620 / 06-50258350
