Hi,

I am working with a Nutch 1.4 snapshot and hitting a very strange problem: 
the system runs out of memory when indexing into Solr. It does not look like 
a trivial lack-of-memory problem that can be solved by giving the JVM more 
memory. I have increased the max heap size from 2 GB to 3 GB, then to 6 GB, 
but this made no difference.
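For reference, here is how the heap was raised. The exact mechanism is an assumption on my part; with the stock bin/nutch launcher the max heap is typically controlled by the NUTCH_HEAPSIZE environment variable (in MB), which the script passes to the JVM as -Xmx:

```shell
# Assumed setup: bin/nutch reads NUTCH_HEAPSIZE (in MB) and passes it
# to the JVM as -Xmx. Raising it from 2000 through 6000 made no difference.
export NUTCH_HEAPSIZE=6000   # 6 GB max heap for the Nutch JVM
```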

A log extract is included below.

Would anyone have any idea of how to fix this problem?

Thanks,

Arkadi


2011-10-27 07:08:22,162 INFO  solr.SolrWriter - Adding 1000 documents
2011-10-27 07:08:42,248 INFO  solr.SolrWriter - Adding 1000 documents
2011-10-27 07:13:54,110 WARN  mapred.LocalJobRunner - job_local_0254
java.lang.OutOfMemoryError: Java heap space
       at java.util.Arrays.copyOfRange(Arrays.java:3209)
       at java.lang.String.<init>(String.java:215)
       at java.nio.HeapCharBuffer.toString(HeapCharBuffer.java:542)
       at java.nio.CharBuffer.toString(CharBuffer.java:1157)
       at org.apache.hadoop.io.Text.decode(Text.java:350)
       at org.apache.hadoop.io.Text.decode(Text.java:322)
       at org.apache.hadoop.io.Text.readString(Text.java:403)
       at org.apache.nutch.parse.ParseText.readFields(ParseText.java:50)
       at org.apache.nutch.util.GenericWritableConfigurable.readFields(GenericWritableConfigurable.java:54)
       at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:67)
       at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:40)
       at org.apache.hadoop.mapred.Task$ValuesIterator.readNextValue(Task.java:991)
       at org.apache.hadoop.mapred.Task$ValuesIterator.next(Task.java:931)
       at org.apache.hadoop.mapred.ReduceTask$ReduceValuesIterator.moveToNext(ReduceTask.java:241)
       at org.apache.hadoop.mapred.ReduceTask$ReduceValuesIterator.next(ReduceTask.java:237)
       at org.apache.nutch.indexer.IndexerMapReduce.reduce(IndexerMapReduce.java:81)
       at org.apache.nutch.indexer.IndexerMapReduce.reduce(IndexerMapReduce.java:50)
       at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:463)
       at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:411)
       at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:216)
2011-10-27 07:13:54,382 ERROR solr.SolrIndexer - java.io.IOException: Job failed!
