I was using a nightly build that Piotr had given me, the nutch-nightly.jar
(actually it was nutch-dev0.7.jar or something of that nature). I tested it on
the Windows platform. I had 5 machines running it: 2 at 100 Mbit, both quad P3
Xeon, 1 Pentium 4 3 GHz with Hyper-Threading, 1 AMD Athlon XP 2600+ and
Hi Doug,
The slides from my talk yesterday at OSCON give some hints on how
to get started. We need a MapReduce tutorial.
http://wiki.apache.org/nutch/Presentations
Can you explain what this means: Page 20:
- Scheduling is bottleneck, not disk, network or CPU?
Thanks.
Stefan
Stefan Groschupf wrote:
http://wiki.apache.org/nutch/Presentations
Can you explain what this means: Page 20:
- Scheduling is bottleneck, not disk, network or CPU?
I mean that none of the CPUs, disks, or network is at 100% of capacity.
Disks are running around 50% busy, CPUs a bit higher, and
at the Jira to check if some more bugs can be fixed before the deadline
proposed by Andrzej.
Regards
Piotr
Andrzej Bialecki wrote:
Doug Cutting wrote:
Here's a near-term plan for Nutch.
1. Release Nutch 0.7, based on current trunk. We should do this ASAP.
Are there bugs in trunk that we need
replication throughput running level 1)
- Original Message -
From: Doug Cutting [EMAIL PROTECTED]
To: nutch-dev@lucene.apache.org
Sent: Thursday, August 04, 2005 3:54 PM
Subject: Re: near-term plan
Stefan Groschupf wrote:
http://wiki.apache.org/nutch/Presentations
Can you explain
Jay Pound wrote:
Doug, I also ran into this when I was testing NDFS: the system would have to
wait for the namenode to tell the datanodes what data to receive and which
data to replicate
When did you test this? Which version of Nutch? How many nodes? My
benchmark results from just a few days
Doug Cutting wrote:
Andrzej Bialecki wrote:
So, I would propose a deadline of Aug 8 for the last commits, and then
perhaps Aug 15 for the release?
Sounds good to me. Thanks for helping with this!
Unfortunately, the patches related to detecting the unmodified content
will have to wait