I'm having the exact same problem. I am trying to isolate whether it is a Solr problem or a Nutch+Solr problem.
On Wed, Oct 26, 2011 at 11:54 PM, <[email protected]> wrote:
> Hi,
>
> I am working with a Nutch 1.4 snapshot and having a very strange problem
> that makes the system run out of memory when indexing into Solr. This does
> not look like a trivial lack-of-memory problem that can be solved by giving
> more memory to the JVM. I've increased the max memory size from 2Gb to 3Gb,
> then to 6Gb, but this did not make any difference.
>
> A log extract is included below.
>
> Would anyone have any idea of how to fix this problem?
>
> Thanks,
>
> Arkadi
>
> 2011-10-27 07:08:22,162 INFO solr.SolrWriter - Adding 1000 documents
> 2011-10-27 07:08:42,248 INFO solr.SolrWriter - Adding 1000 documents
> 2011-10-27 07:13:54,110 WARN mapred.LocalJobRunner - job_local_0254
> java.lang.OutOfMemoryError: Java heap space
>     at java.util.Arrays.copyOfRange(Arrays.java:3209)
>     at java.lang.String.<init>(String.java:215)
>     at java.nio.HeapCharBuffer.toString(HeapCharBuffer.java:542)
>     at java.nio.CharBuffer.toString(CharBuffer.java:1157)
>     at org.apache.hadoop.io.Text.decode(Text.java:350)
>     at org.apache.hadoop.io.Text.decode(Text.java:322)
>     at org.apache.hadoop.io.Text.readString(Text.java:403)
>     at org.apache.nutch.parse.ParseText.readFields(ParseText.java:50)
>     at org.apache.nutch.util.GenericWritableConfigurable.readFields(GenericWritableConfigurable.java:54)
>     at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:67)
>     at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:40)
>     at org.apache.hadoop.mapred.Task$ValuesIterator.readNextValue(Task.java:991)
>     at org.apache.hadoop.mapred.Task$ValuesIterator.next(Task.java:931)
>     at org.apache.hadoop.mapred.ReduceTask$ReduceValuesIterator.moveToNext(ReduceTask.java:241)
>     at org.apache.hadoop.mapred.ReduceTask$ReduceValuesIterator.next(ReduceTask.java:237)
>     at org.apache.nutch.indexer.IndexerMapReduce.reduce(IndexerMapReduce.java:81)
>     at org.apache.nutch.indexer.IndexerMapReduce.reduce(IndexerMapReduce.java:50)
>     at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:463)
>     at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:411)
>     at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:216)
> 2011-10-27 07:13:54,382 ERROR solr.SolrIndexer - java.io.IOException: Job failed!
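For what it's worth, the trace bottoms out in Text.readString while deserializing a single ParseText value in the reducer, which would fit one pathologically large record rather than a general heap shortage. A minimal sketch of that read pattern (class and method names are mine; Hadoop's actual Text.readString uses a varint length prefix, but the allocation behaviour is the point): the entire value is materialized in one allocation, so one record bigger than the heap fails whether the JVM gets 2Gb or 6Gb.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

public class LengthPrefixedRead {

    // Mirrors the pattern in the Text.readString frame of the trace:
    // read a length prefix, then allocate the whole value at once.
    // One record with a huge length forces one giant allocation,
    // regardless of how high -Xmx is raised.
    static String readString(DataInput in) throws IOException {
        int length = in.readInt();          // length prefix of the value
        byte[] bytes = new byte[length];    // entire value allocated up front
        in.readFully(bytes);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // Round-trips one small value through the length-prefixed encoding.
    static String demo() throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buf);
        byte[] payload = "hello".getBytes(StandardCharsets.UTF_8);
        out.writeInt(payload.length);
        out.write(payload);
        DataInput in = new DataInputStream(
                new ByteArrayInputStream(buf.toByteArray()));
        return readString(in);
    }

    public static void main(String[] args) throws IOException {
        System.out.println(demo()); // prints "hello"
    }
}
```

So a useful check might be whether one segment contains an unusually large parsed page, rather than tuning heap size further.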

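If it does turn out to be one oversized page, capping how much content Nutch fetches (and therefore parses) per URL might be worth trying before the next crawl, via http.content.limit in conf/nutch-site.xml. This is only a sketch of a workaround, not a confirmed fix, and the value below is illustrative:

```xml
<!-- conf/nutch-site.xml: limit fetched content per page so a single
     huge document cannot produce an oversized ParseText record.
     65536 bytes is the illustrative value here; pages beyond the
     limit are truncated at fetch time. -->
<property>
  <name>http.content.limit</name>
  <value>65536</value>
</property>
```

It would still be worth re-running the solrindex step afterwards to confirm whether the OutOfMemoryError disappears, which would point at Nutch's data rather than Solr itself.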
