Hi,
Just upgraded to the latest version of Nutch with Hadoop 0.20.
I'm getting the following exception in the namenode log and DFS doesn't
start:
2009-12-06 15:48:32,523 ERROR namenode.NameNode -
java.lang.SecurityException: sealing violation: can't seal package
org.mortbay.util: already loaded
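A sealing violation like this usually means two different jars on the classpath both provide classes in org.mortbay.util and at least one of them is a sealed jar — typically two Jetty versions, e.g. one shipped in Nutch's lib directory and one in Hadoop's. A small scanner like the following can help spot the duplicates; the default "lib" path is an assumption, so point it at your actual Nutch/Hadoop lib directories:

```java
import java.io.File;
import java.util.Collections;
import java.util.jar.JarFile;

public class FindSealedDupes {
    /** Returns true if the jar contains any entry under org/mortbay/. */
    static boolean containsMortbay(File jarFile) throws Exception {
        try (JarFile jar = new JarFile(jarFile)) {
            return Collections.list(jar.entries()).stream()
                    .anyMatch(e -> e.getName().startsWith("org/mortbay/"));
        }
    }

    public static void main(String[] args) throws Exception {
        // Directory to scan is an assumption; pass your Nutch or Hadoop lib dir.
        File dir = new File(args.length > 0 ? args[0] : "lib");
        File[] jars = dir.listFiles((d, name) -> name.endsWith(".jar"));
        if (jars == null) {
            System.err.println("no such directory: " + dir);
            return;
        }
        for (File f : jars) {
            if (containsMortbay(f)) {
                System.out.println("contains org.mortbay classes: " + f.getName());
            }
        }
    }
}
```

If more than one jar shows up, removing or aligning the duplicate Jetty jar is the usual fix.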
Currently the queue depth is hardcoded to 50... however, one needs to keep
#threads x depth below a certain number, otherwise the fetcher spends its
life managing the queues and CPU becomes the limiting factor.
This is what was creating my L-shaped bandwidth usage.
I had to patch it by hand.
It works fine now, and my memory problem had to do with the fact that I had
too many threads...
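The threads-times-depth constraint described above can be sketched as a simple capacity check before launching a fetch. The ceiling value and the method names here are assumptions for illustration, not actual Nutch code:

```java
public class QueueBudget {
    // Rough capacity check from the thread: keep threads * depth below a
    // budget so the fetcher doesn't spend all its CPU managing queues.
    // The ceiling of 2500 is a hypothetical value, not a Nutch constant.
    static final int MAX_OUTSTANDING = 2500;

    static boolean withinBudget(int threads, int depth) {
        return threads * depth <= MAX_OUTSTANDING;
    }

    public static void main(String[] args) {
        System.out.println(withinBudget(50, 50));   // 50 * 50 = 2500 -> true
        System.out.println(withinBudget(100, 50));  // 100 * 50 = 5000 -> false
    }
}
```

With the queue depth hardcoded to 50, the only knob left is the thread count, which is why reducing threads helped here.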
2009/12/5 MilleBii mille...@gmail.com
Thx again Julien,
Yes, I'm going to buy myself the Hadoop book, because I thought I could do
without it, but I realize that I need to make good use of Hadoop.
Didn't
Hi - this is my first post to the nutch mailing list, please let me know
if I commit any list protocol errors.
I'm currently using Nutch 1.0 with the PowerPoint plugin enabled and can
verify that Nutch is indeed pulling in the entire file to pass off
to the parser (i.e., I've set the
New and longer run... I get plenty of: failed with:
java.lang.OutOfMemoryError: Java heap space
Fetching still goes on; not sure if this is the expected behavior.
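Besides reducing the thread count, one common mitigation for heap-space errors in fetch tasks on Hadoop 0.20 (an assumption here, not what this thread settled on) is raising the child JVM heap in hadoop-site.xml / mapred-site.xml; the -Xmx value below is illustrative:

```xml
<!-- Hypothetical example: give map/reduce child JVMs more heap.
     The default in Hadoop 0.20 is -Xmx200m. -->
<property>
  <name>mapred.child.java.opts</name>
  <value>-Xmx512m</value>
</property>
```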