On Fri, 2005-11-04 at 19:15 -0800, Doug Cutting wrote:
> Rod Taylor wrote:
> > There is only a single datanode and there are 20 hosts.
> 
> That's a lot of load on one datanode.  I typically run a datanode on 
> every host, accessing the local drives on that host.

I tried running one datanode per machine, all connecting back to the
same SAN, but it seemed pretty clunky.  A crash of any one datanode
would take down the entire system (there is no data replication, since
it's a common data store in the end).  Reducing it to a single datanode
avoided this problem.

The boxes themselves don't have much for local drives aside from a bit
of temp space.

Recently we moved the datanode, namenode and jobtracker to their own
machine, per your earlier suggestion, and upgraded the Nutch sources
from about October 20th to November 1st.  This is when the difficulties
started.

Earlier with the single datanode, namenode and jobtracker on an
overloaded worker machine (load average was around 20 normally) things
worked without errors, but slowly.

-- 
Rod Taylor <[EMAIL PROTECTED]>
