[ 
https://issues.apache.org/jira/browse/NUTCH-492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Doğacan Güney closed NUTCH-492.
-------------------------------

    Resolution: Duplicate
      Assignee: Doğacan Güney

Duplicate of NUTCH-356.

> java.lang.OutOfMemoryError while indexing.
> ------------------------------------------
>
>                 Key: NUTCH-492
>                 URL: https://issues.apache.org/jira/browse/NUTCH-492
>             Project: Nutch
>          Issue Type: Bug
>          Components: indexer
>    Affects Versions: 0.9.0
>            Reporter: Nicolás Lichtmaier
>            Assignee: Doğacan Güney
>
> I'm getting this:
> java.lang.OutOfMemoryError: Java heap space
>         at java.util.Arrays.copyOf(Arrays.java:2786)
>         at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:94)
>         at java.io.DataOutputStream.write(DataOutputStream.java:90)
>         at org.apache.hadoop.io.Text.writeString(Text.java:399)
>         at org.apache.nutch.metadata.Metadata.write(Metadata.java:225)
>         at org.apache.nutch.parse.ParseData.write(ParseData.java:165)
>         at org.apache.hadoop.io.ObjectWritable.writeObject(ObjectWritable.java:154)
>         at org.apache.hadoop.io.ObjectWritable.write(ObjectWritable.java:65)
>         at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:315)
>         at org.apache.nutch.indexer.Indexer.map(Indexer.java:306)
>         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:48)
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:175)
>         at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:126)
> 2007-05-26 11:07:22,517 FATAL indexer.Indexer - Indexer: java.io.IOException: Job failed!
>         at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:604)
>         at org.apache.nutch.indexer.Indexer.index(Indexer.java:273)
>         at org.apache.nutch.indexer.Indexer.run(Indexer.java:295)
>         at org.apache.hadoop.util.ToolBase.doMain(ToolBase.java:189)
>         at org.apache.nutch.indexer.Indexer.main(Indexer.java:278)
> Something odd I'm seeing in hadoop.log is that the plugins are loaded again
> and again. I've created a custom plugin (in case that is relevant).
> According to the code, a new plugin repository is created for each
> "configuration object". I'm sure I'm not modifying the configuration object
> in any part of my code (I've checked).
> Why are the plugins loaded again and again until the heap is full?

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

