I don't understand this use case.

Suppose that you lose half the nodes in the cluster.  On average,
12.5% of your blocks were exclusively stored on the half of the
cluster that's dead: with the default replication factor of 3, each
block has a (1/2)^3 = 1/8 chance that all of its replicas were on
dead nodes.  For many (most?) applications, a random 87.5% of the
data isn't really useful.  Storing metadata in more places would let
you turn a dead cluster into a corrupt cluster, but not into a working
one.  If you need to survive major disasters, you want a second HDFS
cluster in a different place.
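
If you want to sanity-check that 12.5% figure, here's a
back-of-the-envelope sketch (standalone, not HDFS code; it assumes
replicas are placed uniformly at random, which ignores HDFS's
rack-aware placement):

    // Rough estimate of the fraction of blocks that become
    // unrecoverable when a fraction of the cluster dies.
    public class BlockLossEstimate {
        public static void main(String[] args) {
            double deadFraction = 0.5; // half the cluster is gone
            int replication = 3;       // HDFS default replication factor
            // A block is lost only if every replica was on a dead node:
            // P(lost) ~= deadFraction ^ replication
            double lost = Math.pow(deadFraction, replication);
            System.out.printf("expected blocks lost: %.1f%%%n", lost * 100);
            // prints: expected blocks lost: 12.5%
        }
    }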

The thing that might be useful to you, if you're worried about
simultaneous namenode and secondary NN failure, is to store the edit
log and fsimage on a SAN, and get fault tolerance that way.
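
Concretely, dfs.name.dir takes a comma-separated list of directories,
and the namenode writes a full copy of the fsimage and edit log to
each of them.  Something like this in hadoop-site.xml would do it (the
paths are made up; the second one would be your SAN or NFS mount):

    <property>
      <name>dfs.name.dir</name>
      <!-- One copy on local disk, one on the SAN mount. -->
      <value>/var/hadoop/dfs/name,/mnt/san/dfs/name</value>
    </property>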

--Ari

On Tue, Sep 9, 2008 at 6:38 PM, 叶双明 <[EMAIL PROTECTED]> wrote:
> Thanks for paying attention to my tentative idea!
>
> What I'm thinking about isn't how to store the metadata, but a final (or
> last-resort) way to recover valuable data in the cluster when the worst
> happens and the metadata on all of the multiple NameNodes is destroyed,
> e.g. a terrorist attack or a natural disaster destroys half of the
> cluster's nodes along with all of the NameNodes.  With this mechanism we
> could recover as much data as possible, and we'd have a big chance of
> recovering all of the cluster's data because of the original replication.
>
> Any suggestion is appreciated!
>
> 2008/9/10 Pete Wyckoff <[EMAIL PROTECTED]>
>
>> +1 -
>>
>> From the perspective of the data nodes, DFS is just a block-level store
>> and is thus much more robust and scalable.
>>
>>
>>
>> On 9/9/08 9:14 AM, "Owen O'Malley" <[EMAIL PROTECTED]> wrote:
>>
>> > This isn't a very stable direction. You really don't want multiple
>> > distinct methods for storing the metadata, because discrepancies are
>> > very bad. High Availability (HA) is a very important medium-term goal
>> > for HDFS, but it will likely be done using multiple NameNodes and
>> > ZooKeeper.
>> >
>> > -- Owen
>>

-- 
Ari Rabkin [EMAIL PROTECTED]
UC Berkeley Computer Science Department
