Nutch Hadoop 0.20 - Exception

2009-12-06 Thread Eran Zinman
Hi, Just upgraded to the latest version of Nutch with Hadoop 0.20. I'm getting the following exception in the namenode log and DFS doesn't start: 2009-12-06 15:48:32,523 ERROR namenode.NameNode - java.lang.SecurityException: sealing violation: can't seal package org.mortbay.util: already loaded

Configurable depth for fetcher queue ?

2009-12-06 Thread MilleBii
Currently the queue depth is hardcoded to 50... however one needs to keep #Threads x Depth below a certain number, otherwise the fetcher spends all its time managing the queues and CPU becomes the limiting factor. This is what was creating my L-shaped bandwidth usage. I had to patch it by hand,
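
For illustration only, the idea of a configurable depth could look roughly like the sketch below. It is not the patch from this thread. The property name `fetcher.queue.depth` and the helper names are hypothetical, and plain `java.util.Properties` stands in for the real configuration object. The default of 50 is the hardcoded value mentioned above.

```java
import java.util.Properties;

public class FetcherQueueDepth {
    // Hypothetical property name, chosen for this sketch only.
    static final String DEPTH_KEY = "fetcher.queue.depth";
    // The hardcoded value mentioned in the post, used here as the fallback.
    static final int DEFAULT_DEPTH = 50;

    // Read the depth from configuration; fall back to the default when the
    // property is missing, non-numeric, or not positive.
    static int queueDepth(Properties conf) {
        String raw = conf.getProperty(DEPTH_KEY);
        if (raw == null) {
            return DEFAULT_DEPTH;
        }
        try {
            int value = Integer.parseInt(raw.trim());
            return value > 0 ? value : DEFAULT_DEPTH;
        } catch (NumberFormatException e) {
            return DEFAULT_DEPTH;
        }
    }

    // The quantity the post says must stay bounded: #Threads x Depth.
    static long queuedItemBudget(int threads, int depth) {
        return (long) threads * depth;
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        conf.setProperty(DEPTH_KEY, "10");
        int threads = 100; // example value, not taken from the thread
        int depth = queueDepth(conf);
        System.out.println("threads x depth = " + queuedItemBudget(threads, depth));
    }
}
```

With a setting like this, raising the thread count can be offset by lowering the depth, so the product stays under whatever limit the machine tolerates.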

Re: Fetch failing ?

2009-12-06 Thread MilleBii
Works fine and my memory problem had to do with the fact that I had too many threads... 2009/12/5 MilleBii mille...@gmail.com Thx again Julien, Yes I'm going to buy myself the Hadoop book, because I thought I could do without it, but I realize that I need to make good use of Hadoop. Didn't

Nutch 1.0 ms-powerpoint plugin

2009-12-06 Thread Joe Bell
Hi - this is my first post to the nutch mailing list, please let me know if I commit any list protocol errors. I'm currently using Nutch 1.0 with the Powerpoint plugin enabled and can verify that Nutch is indeed pulling in the entire file for passing off to the parser (i.e., I've set the

Re: Fetch failing ?

2009-12-06 Thread MilleBii
New and longer run... I get plenty of: failed with: java.lang.OutOfMemoryError: Java heap space. Fetching still goes on, not sure if this is the expected behavior. 2009/12/6 MilleBii mille...@gmail.com Works fine and my memory problem had to do with the fact that I had too many threads...