Spill failed

2010-02-10 Thread Santiago Pérez
Hi, I am running Nutch in a cluster with 1 master and 6 slaves on Amazon (all of them the same instance type, with 1.7 GB of RAM). My configuration is the following: HADOOP_HEAPSIZE=1300 HADOOP_NAMENODE_OPTS=-Xmx400m HADOOP_SECONDARYNAMENODE_OPTS=-Xmx400m HADOOP_JOBTRACKER_OPTS=-Xmx400m

Re: Spill failed

2010-02-10 Thread Julien Nioche
The explanation can be found in the stack trace you sent: java.io.IOException: error=12, Cannot allocate memory. Small instances on EC2 do not give you enough memory. From the configuration below, the slaves will use up to 1300M for the datanode and tasktracker; if you add to that the memory
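
The arithmetic behind this reply can be sketched roughly as follows. This is a back-of-the-envelope illustration, not something taken from the thread: it assumes HADOOP_HEAPSIZE acts as the maximum heap of each daemon separately (datanode and tasktracker), and the `task_slots` and `child_heap_mb` parameters are hypothetical placeholders for whatever the child tasks add on top.

```python
# Rough worst-case heap budget for one slave, in MB.
# Only the two constants below come from the original post; everything else
# (two daemons each capped at HADOOP_HEAPSIZE, the task parameters) is an
# assumption made for illustration.
INSTANCE_RAM_MB = 1700      # 1.7 GB instance, treated here as 1700 MB
HADOOP_HEAPSIZE_MB = 1300   # HADOOP_HEAPSIZE=1300


def worst_case_heap_mb(daemon_heap_mb, daemons=2, task_slots=0, child_heap_mb=0):
    """Sum the maximum heaps of the daemons plus any child task JVMs."""
    return daemons * daemon_heap_mb + task_slots * child_heap_mb


daemons_only = worst_case_heap_mb(HADOOP_HEAPSIZE_MB)
print("daemons alone may ask for up to", daemons_only, "MB")
print("exceeds instance RAM:", daemons_only > INSTANCE_RAM_MB)
```

Under these assumptions the two daemons alone could request more heap than the instance has, before any child task or forked process is counted, which is consistent with the "Cannot allocate memory" error.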

[jira] Updated: (NUTCH-787) Upgrade Lucene to 3.0.0.

2010-02-10 Thread Julien Nioche (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-787?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Julien Nioche updated NUTCH-787: Fix Version/s: 1.1 Upgrade Lucene to 3.0.0. Key:

Re: Spill failed

2010-02-10 Thread Santiago Pérez
OK, thanks, I will set the value to a lower number. BTW, what is the relationship between HADOOP_HEAPSIZE and NUTCH_HEAPSIZE? Should I set NUTCH_HEAPSIZE so that HADOOP_HEAPSIZE + NUTCH_HEAPSIZE < TOTAL RAM? Or does NUTCH_HEAPSIZE depend on HADOOP_HEAPSIZE? I was looking for this info but I did

[jira] Commented: (NUTCH-766) Tika parser

2010-02-10 Thread Andrzej Bialecki (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12832250#action_12832250 ] Andrzej Bialecki commented on NUTCH-766: +1 to commit this - please remember to

[jira] Commented: (NUTCH-766) Tika parser

2010-02-10 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12832255#action_12832255 ] Chris A. Mattmann commented on NUTCH-766: {quote} +1 to commit this... {quote}

[jira] Created: (NUTCH-788) search.jsp typo causing fail

2010-02-10 Thread Sammy Yu (JIRA)
search.jsp typo causing fail Key: NUTCH-788 URL: https://issues.apache.org/jira/browse/NUTCH-788 Project: Nutch Issue Type: Bug Components: web gui Affects Versions: 1.1 Environment: On

[jira] Updated: (NUTCH-788) search.jsp typo causing searches to fail

2010-02-10 Thread Sammy Yu (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sammy Yu updated NUTCH-788: Fix Version/s: (was: 1.1) search.jsp typo causing searches to fail

[jira] Updated: (NUTCH-788) search.jsp typo causing searches to fail

2010-02-10 Thread Sammy Yu (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sammy Yu updated NUTCH-788: Patch Info: [Patch Available] Summary: search.jsp typo causing searches to fail (was: search.jsp typo

[jira] Commented: (NUTCH-766) Tika parser

2010-02-10 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12832398#action_12832398 ] Chris A. Mattmann commented on NUTCH-766: I'm going to hold off on committing this

[jira] Commented: (NUTCH-766) Tika parser

2010-02-10 Thread Sami Siren (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12832406#action_12832406 ] Sami Siren commented on NUTCH-766: I suggest that we would still drive this a bit further

[jira] Updated: (NUTCH-766) Tika parser

2010-02-10 Thread Sami Siren (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sami Siren updated NUTCH-766: Attachment: NutchTikaConfig.java Extended TikaConfig that is able to load parsers and can be used with

[jira] Updated: (NUTCH-766) Tika parser

2010-02-10 Thread Sami Siren (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sami Siren updated NUTCH-766: Attachment: TikaParser.java Modified parser that can process package formats too. To get rid of the mime