Thanks guys.

On Wed, Feb 11, 2015 at 8:03 AM, dlmarion <[email protected]> wrote:
> https://issues.apache.org/jira/browse/HDFS-7097
>
> -------- Original message --------
> From: Chen Song <[email protected]>
> Date: 02/11/2015 7:48 AM (GMT-05:00)
> To: [email protected]
> Cc:
> Subject: Re: missing data blocks after active name node crashes
>
> Thanks David.
>
> Do you have the relevant Jira ticket number handy?
>
> Chen
>
> On Tue, Feb 10, 2015 at 5:54 PM, david marion <[email protected]> wrote:
>
>> I believe there was an issue fixed in 2.5 or 2.6 where the standby NN
>> would not process block reports from the DNs when it was dealing with the
>> checkpoint process. The missing blocks will get reported eventually.
>>
>> -------- Original message --------
>> From: Chen Song <[email protected]>
>> Date: 02/10/2015 2:44 PM (GMT-05:00)
>> To: [email protected], Ravi Prakash <[email protected]>
>> Cc:
>> Subject: Re: missing data blocks after active name node crashes
>>
>> Thanks for the reply, Ravi.
>>
>> In my case, what I see constantly is that there are always missing blocks
>> every time the active name node crashes. The active name node crashes
>> because of a timeout on the journal nodes.
>>
>> Could this be a specific case which could lead to missing blocks?
>>
>> Chen
>>
>> On Tue, Feb 10, 2015 at 2:20 PM, Ravi Prakash <[email protected]> wrote:
>>
>> Hi Chen!
>>
>> From my understanding, every operation on the Namenode is logged (and
>> flushed) to disk / QJM / shared storage. This includes the addBlock
>> operation. So when a client requests to write a new block, the metadata is
>> logged by the active NN, so even if it crashes later on, the new active NN
>> would still see the creation of the block.
>>
>> HTH
>> Ravi
>>
>> On Tuesday, February 10, 2015 9:38 AM, Chen Song <
>> [email protected]> wrote:
>>
>> When the active name node crashes, it seems there is always a chance
>> that the data blocks in flight will be missing.
>> My understanding is that when the active name node crashes, the metadata
>> of data blocks in transition which exist in active name node memory is not
>> successfully captured by journal nodes and thus not available on standby
>> name node when it is promoted to active by zkfc.
>>
>> Is my understanding correct? Any way to mitigate this problem or race
>> condition?
>>
>> --
>> Chen Song
>>
>> --
>> Chen Song
>
> --
> Chen Song

--
Chen Song
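
For anyone reading this thread later, here is a rough toy model of the two points made above: Ravi's point that addBlock is logged and flushed before the client is acknowledged, and David's point that blocks can show as missing until the new active NN has processed block reports. This is only a sketch, not HDFS code. The class and method names (JournalQuorum, ToyNameNode, process_block_report, and so on) are made up for illustration, and the journal here simply writes to every node rather than implementing the real quorum protocol.

```python
class JournalQuorum:
    """Toy stand-in for the journal nodes (hypothetical, not the real QJM)."""

    def __init__(self, n=3):
        self.logs = [[] for _ in range(n)]

    def append(self, op):
        # Assumption: every journal node acknowledges the write.
        for log in self.logs:
            log.append(op)

    def read_all(self):
        return list(self.logs[0])


class ToyNameNode:
    """Toy name node: namespace metadata comes from the journal,
    block locations come only from data node block reports."""

    def __init__(self, journal):
        self.journal = journal
        self.blocks = {}     # block id -> file path
        self.locations = {}  # block id -> set of data nodes

    def add_block(self, path, block_id):
        # Log (and flush) first, then update memory and acknowledge.
        self.journal.append(("addBlock", path, block_id))
        self.blocks[block_id] = path

    def replay(self):
        # A standby rebuilds the namespace from the shared edit log.
        for op, path, block_id in self.journal.read_all():
            if op == "addBlock":
                self.blocks[block_id] = path

    def process_block_report(self, datanode, block_ids):
        for block_id in block_ids:
            if block_id in self.blocks:
                self.locations.setdefault(block_id, set()).add(datanode)

    def missing_blocks(self):
        # Known to the namespace but with no reported location yet.
        return sorted(b for b in self.blocks if not self.locations.get(b))


journal = JournalQuorum()
active = ToyNameNode(journal)
active.add_block("/data/file1", "blk_1")
active.process_block_report("dn1", ["blk_1"])

# The active NN crashes; the standby replays the journal and takes over.
standby = ToyNameNode(journal)
standby.replay()
missing_before = standby.missing_blocks()  # block known, location not yet reported

standby.process_block_report("dn1", ["blk_1"])
missing_after = standby.missing_blocks()   # clears once the report is processed
```

In this model the block's metadata survives the failover because it was journaled before the acknowledgement, and the block only looks missing for as long as the block report has not been processed.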
