If the client and all the datanode(s) for a block are dead, then the file is
corrupt. It cannot be recovered and the lease cannot be reclaimed. Is that a
problem?
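
To make the scenario concrete, here is a rough sketch of why recovery stalls in that case. It is a toy model, not NameNode code: the class name, the method names and the list-of-booleans stand-in for the targets are all made up for illustration. I am also assuming that recovery needs at least one live datanode among the targets to act as primary once the hard-limit timer mentioned earlier in the thread has expired.

```java
import java.util.Arrays;
import java.util.List;

// Toy model only; none of these names come from the Hadoop source.
class LeaseRecoverySketch {
    // Hard limit after which the NN starts lease recovery (1 hour).
    static final long HARD_LIMIT_MS = 60L * 60L * 1000L;

    // Index of the first live target, or -1 if every target is dead.
    static int pickPrimary(List<Boolean> targetsAlive) {
        return targetsAlive.indexOf(Boolean.TRUE);
    }

    // Recovery can complete only if the lease has expired and some
    // live datanode is available to run the recovery as primary.
    static boolean canRecover(long idleMs, List<Boolean> targetsAlive) {
        return idleMs >= HARD_LIMIT_MS && pickPrimary(targetsAlive) >= 0;
    }

    public static void main(String[] args) {
        long idle = 2 * HARD_LIMIT_MS;
        // One target is dead, but a live one can still act as primary.
        System.out.println(canRecover(idle, Arrays.asList(false, true, true)));
        // Every target is dead, so nobody can run the recovery.
        System.out.println(canRecover(idle, Arrays.asList(false, false, false)));
    }
}
```

With every target dead there is no primary to pick, so in this model the lease just stays where it is.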

>  your replication factor is 3 and the NN's minReplication is also 3.

In the current trunk, the NN guarantees to keep blocks replicated only once
a file is closed. So, if you think that replicas could get lost while the
block is being written (i.e. the file is not yet closed by the writer), then
you should set minReplication accordingly. This behaviour could get better in
the future if the client could replace lost replicas, but the current code
does not do that.
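
As a rough illustration of the trade-off (a toy model, not the actual NameNode logic; the class and method names below are hypothetical), the check that matters to the writer boils down to comparing the replicas reported for the previous block against minReplication:

```java
// Toy model only; the class and method names are hypothetical.
class MinReplicationSketch {
    // The writer may get a new block only once the previous block has
    // at least minReplication reported replicas.
    static boolean canAllocateNextBlock(int reportedReplicas, int minReplication) {
        return reportedReplicas >= minReplication;
    }

    public static void main(String[] args) {
        // Replication factor is 3, but one pipeline datanode failed mid-write.
        int liveReplicas = 2;

        // minReplication equal to the replication factor: the writer is stuck.
        System.out.println(canAllocateNextBlock(liveReplicas, 3));

        // A lower minReplication tolerates the lost replica.
        System.out.println(canAllocateNextBlock(liveReplicas, 1));
    }
}
```

With minReplication equal to the replication factor, a single lost replica during the write is enough to block the next allocation, which is the situation you describe.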

thanks,
dhruba



On Wed, May 20, 2009 at 11:09 PM, Sangmin Lee <sangmin....@gmail.com> wrote:

> On Wed, May 20, 2009 at 5:43 PM, Dhruba Borthakur <dhr...@gmail.com>
> wrote:
>
> > > What if all datanodes in INodeFileUnderConstruction targets are dead ?
> >
> > If all datanodes in a pipeline are dead, then that file cannot be recovered
> > at all. This is expected and most file-systems behave this way when the
> > underlying storage goes bad.
>
>
> Yeah, I understand that. But I don't see how the lease will be removed.
> That is, when the client and all datanodes are dead, I don't see any code
> to handle this.
>
> Apart from this, I have another question regarding append.
> Suppose that you are trying to append to a file.
> And your replication factor is 3 and the NN's minReplication is also 3.
> As a part of appending, client asks datanodes (which store the last block)
> to sync but one of them fails.
> The primary DN will do commitBlockSynchronization with only two DNs.
> (I believe the NN should do something at this point since it will never
> receive enough blockreceived msgs)
> And Client also proceeds with two DNs.
> Then later, when client wants to allocate another block, it will get
> NotReplicatedYetException.
>
> Thanks,
> Sangmin
>
>
>
>
>
> >
> >
> > > I thought generationStamp should be checked when the NN processes
> > > block reports from DN,
> >
> > The generation stamp is used to compute the hashCode for a Block object.
> >
> > thanks,
> > dhruba
> >
> >
> > On Wed, May 20, 2009 at 11:58 AM, Sangmin Lee <sangmin....@gmail.com>
> > wrote:
> >
> > > Dhruba,
> > >
> > > Thanks for the response.
> > > What if all datanodes in INodeFileUnderConstruction.targets are dead ?
> > > I don't see any code to handle this case.
> > >
> > > One other thing I wonder is: when is the generationStamp used by the NN?
> > > I thought generationStamp should be checked when the NN processes block
> > > reports from DN, but I can only see it check the block's length. Am I
> > > missing something here?
> > >
> > > Thanks,
> > > Sangmin
> > >
> > >
> > > On Wed, May 20, 2009 at 12:24 PM, Dhruba Borthakur <dhr...@gmail.com>
> > > wrote:
> > >
> > > > The NN has a timer for dead clients. When the HARD_LIMIT (1 hour) expires,
> > > > the NN extracts the primary datanode from the
> > > > INodeFileUnderConstruction.targets and asks the primary datanode to
> > > > recover the lease. At the end of the lease recovery, the primary datanode
> > > > invokes the NameNode.commitBlockSynchronization method, and the lease
> > > > recovery is complete.
> > > >
> > > > hope this helps,
> > > > thanks,
> > > > dhruba
> > > >
> > > >
> > > >
> > > > On Wed, May 20, 2009 at 9:14 AM, Sangmin Lee <sangmin....@gmail.com>
> > > > wrote:
> > > >
> > > > > I am looking at 0.19.0 (or maybe 0.19.1) and 0.20.0.
> > > > > In fact, I am still curious about the case (maybe too extreme a case)
> > > > > where a client opens a file, requests a block and prematurely dies.
> > > > > Also all datanodes go dead.
> > > > > I don't see how the lease will be recovered or reaped in this case.
> > > > > Don't we need some mechanism that discards the block and removes the
> > > > > lease after several attempts at lease recovery?
> > > > >
> > > > > Thanks,
> > > > > Sangmin
> > > > >
> > > > > On Wed, May 20, 2009 at 10:40 AM, Edward J. Yoon <edwardy...@apache.org>
> > > > > wrote:
> > > > >
> > > > > > Can I ask which version you are reading? You seem to have dug deeply
> > > > > > into the architecture of the system...
> > > > > >
> > > > > > On Thu, May 21, 2009 at 12:28 AM, Sangmin Lee <sangmin....@gmail.com>
> > > > > > wrote:
> > > > > > > Okay.. I was going dumb by misreading some source code.
> > > > > > > Please ignore my question regarding this.
> > > > > > > Sorry about this.
> > > > > > >
> > > > > > > Sangmin
> > > > > > >
> > > > > > > On Tue, May 19, 2009 at 11:59 PM, Sangmin Lee <sangmin....@gmail.com>
> > > > > > > wrote:
> > > > > > >
> > > > > > >> Hi all,
> > > > > > >>
> > > > > > >> I have some questions regarding the HDFS recovery mechanism.
> > > > > > >>
> > > > > > >> I see that INodeFileUnderConstruction has a "targets" field that
> > > > > > >> stores the list of datanodes which store its last block.
> > > > > > >> However, I don't see them being used at all except that the
> > > > > > >> "internalReleaseLease" function uses the length of the datanode list.
> > > > > > >> Is there any other use of the "targets" field besides checking its
> > > > > > >> length?
> > > > > > >>
> > > > > > >> Could anyone shed some light on this?
> > > > > > >>
> > > > > > >> Thanks,
> > > > > > >> Sangmin
> > > > > > >>
> > > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > --
> > > > > > Best Regards, Edward J. Yoon @ NHN, corp.
> > > > > > edwardy...@apache.org
> > > > > > http://blog.udanax.org
> > > > > >
> > > > >
> > > >
> > >
> >
>
