On 9/13/07 6:00 AM, "C G" <[EMAIL PROTECTED]> wrote:

>   I'd like to run nodes with around 2T of local disk set up as JBOD.  So I
> would have 4 separate file systems per machine, for example /hdfs_a, /hdfs_b,
> /hdfs_c, /hdfs_d .  Is it possible to configure things so that HDFS knows
> about all 4 file systems?

Yes.  This is normally done to allow heterogeneity in data/task nodes.  You
list every file system that MIGHT be available, and Hadoop figures out which
are actually present and which have space to use.
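
To make the idea concrete, here is a rough sketch of that selection logic.
It is an illustration only, not Hadoop's actual code: the function name
usable_dirs and the min_free_bytes parameter are made up for this example.
In Hadoop releases of this era the candidate list itself is, as far as I
know, the comma-separated dfs.data.dir property in the site configuration.

```python
import os
import shutil


def usable_dirs(candidates, min_free_bytes=0):
    """Return (path, free_bytes) for each candidate that can hold data.

    Hypothetical helper: mimics the idea of listing every directory that
    MIGHT exist and keeping only those that are present, writable, and
    have more than min_free_bytes of free space.
    """
    usable = []
    for path in candidates:
        if not os.path.isdir(path):
            continue  # missing or unmounted on this node: skip it
        if not os.access(path, os.W_OK):
            continue  # present but not writable: skip it
        free = shutil.disk_usage(path).free
        if free > min_free_bytes:
            usable.append((path, free))
    return usable
```

With the layout from the question you would call something like
usable_dirs(["/hdfs_a", "/hdfs_b", "/hdfs_c", "/hdfs_d"]); a node that only
has two of those mounted simply gets two entries back.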

> Since we're using HDFS replication I see no point in
> using RAID-anything...to me that's the whole point of replication  Comments?

That is the intent!

>   Is it possible to set things up in Hadoop to run multiple masters?

Not yet.  Doug makes very good points on this topic: a single master will be
fairly reliable, and it is the cluster as a whole that will see frequent
failures and thus must be robust to node failure.

There are lots of HA options.  One that looks very nice to me (but that I
haven't tried) is DRBD, which is a block-level disk replication service.  See
http://www.drbd.org/ for more information (and let us know how it looks).

The secondary namenode may be of some help in recovery as well, but it is
unlikely to be as quick as a replicated disk and a CARP-based IP address.

>   If you can't run multiple namenodes, then that sort of implies the machine
> which is hosting *the* namenode needs to do all the traditional things to
> protect against data loss/corruption, including frequent backups, RAID
> mirroring, etc.  

Some of these things are happening already, but the others are not a bad
idea at all.  Consider your hardware carefully.  RAID mirroring can
*decrease* reliability if a failure of either drive takes the whole array
down.  Happened to me on my home machine and I have heard of other cases as
well.  Even in sophisticated implementations such as those done by NetApp,
you can have drive failures that freeze an entire shelf.  My preference these
days is replicated simple machines rather than fancy machines.

