[ https://issues.apache.org/jira/browse/NUTCH-492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Doğacan Güney closed NUTCH-492.
-------------------------------

    Resolution: Duplicate
      Assignee: Doğacan Güney

Duplicate of NUTCH-356.

> java.lang.OutOfMemoryError while indexing.
> ------------------------------------------
>
>                 Key: NUTCH-492
>                 URL: https://issues.apache.org/jira/browse/NUTCH-492
>             Project: Nutch
>          Issue Type: Bug
>          Components: indexer
>    Affects Versions: 0.9.0
>            Reporter: Nicolás Lichtmaier
>            Assignee: Doğacan Güney
>
> I'm getting this:
> java.lang.OutOfMemoryError: Java heap space
>         at java.util.Arrays.copyOf(Arrays.java:2786)
>         at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:94)
>         at java.io.DataOutputStream.write(DataOutputStream.java:90)
>         at org.apache.hadoop.io.Text.writeString(Text.java:399)
>         at org.apache.nutch.metadata.Metadata.write(Metadata.java:225)
>         at org.apache.nutch.parse.ParseData.write(ParseData.java:165)
>         at org.apache.hadoop.io.ObjectWritable.writeObject(ObjectWritable.java:154)
>         at org.apache.hadoop.io.ObjectWritable.write(ObjectWritable.java:65)
>         at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:315)
>         at org.apache.nutch.indexer.Indexer.map(Indexer.java:306)
>         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:48)
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:175)
>         at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:126)
> 2007-05-26 11:07:22,517 FATAL indexer.Indexer - Indexer: java.io.IOException: Job failed!
>         at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:604)
>         at org.apache.nutch.indexer.Indexer.index(Indexer.java:273)
>         at org.apache.nutch.indexer.Indexer.run(Indexer.java:295)
>         at org.apache.hadoop.util.ToolBase.doMain(ToolBase.java:189)
>         at org.apache.nutch.indexer.Indexer.main(Indexer.java:278)
> Something weird I'm seeing in hadoop.log is that the plugins are loaded again
> and again. I've created a custom plugin (if that can be causing something).
> According to the code a new plugin repository is created for each
> "configuration object". I'm sure I'm not modifying the configuration object
> in any part of my code (I've checked).
> Why are the plugins loaded again and again and again until the heap is full?

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

_______________________________________________
Nutch-developers mailing list
Nutch-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nutch-developers
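
The reporter's remark that a plugin repository is created for each "configuration object" can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: `RepositoryCacheSketch`, `Config`, and `Repository` are hypothetical stand-ins, not the Nutch or Hadoop classes, and the sketch does not claim to reproduce the actual cause of NUTCH-492. It only shows that a cache keyed by the configuration object builds a new repository whenever a caller passes a freshly created configuration, even if the settings inside it never change.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; none of these types are Nutch or Hadoop classes.
public class RepositoryCacheSketch {

    /** Stand-in for a configuration object. It inherits identity-based
     *  equals/hashCode from Object, so two instances are never equal. */
    static final class Config { }

    /** Stand-in for an expensive-to-build plugin repository. */
    static final class Repository {
        static int loads = 0;          // counts how many times "plugins" were loaded
        Repository() { loads++; }
    }

    // One repository per configuration object, looked up by object identity.
    private static final Map<Config, Repository> CACHE = new HashMap<>();

    static synchronized Repository get(Config conf) {
        return CACHE.computeIfAbsent(conf, c -> new Repository());
    }

    public static void main(String[] args) {
        // Reusing one configuration object hits the cache after the first call.
        Config shared = new Config();
        for (int i = 0; i < 3; i++) {
            get(shared);
        }
        System.out.println("loads with shared config: " + Repository.loads);

        // Passing a new configuration object each time misses the cache every time.
        for (int i = 0; i < 3; i++) {
            get(new Config());
        }
        System.out.println("loads after fresh configs: " + Repository.loads);
    }
}
```

Run as written, the first line reports 1 load and the second reports 4. Under the assumptions of the sketch, code that never modifies its configuration can still trigger repeated loading if something along the call path creates a new configuration object for each use.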