Thanks David. Do you have the relevant Jira ticket number handy?
Chen

On Tue, Feb 10, 2015 at 5:54 PM, david marion <[email protected]> wrote:

> I believe there was an issue fixed in 2.5 or 2.6 where the standby NN
> would not process block reports from the DNs when it was dealing with the
> checkpoint process. The missing blocks will get reported eventually.
>
> -------- Original message --------
> From: Chen Song <[email protected]>
> Date: 02/10/2015 2:44 PM (GMT-05:00)
> To: [email protected], Ravi Prakash <[email protected]>
> Cc:
> Subject: Re: missing data blocks after active name node crashes
>
> Thanks for the reply, Ravi.
>
> In my case, what I see constantly is there are always missing blocks
> every time active name node crashes. The active name node crashes because
> of timeout on journal nodes.
>
> Could this be a specific case which could lead to missing blocks?
>
> Chen
>
> On Tue, Feb 10, 2015 at 2:20 PM, Ravi Prakash <[email protected]> wrote:
>
> Hi Chen!
>
> From my understanding, every operation on the Namenode is logged (and
> flushed) to disk / QJM / shared storage. This includes the addBlock
> operation. So when a client requests to write a new block, the metadata is
> logged by the active NN, so even if it crashes later on, the new active NN
> would still see the creation of the block.
>
> HTH
> Ravi
>
> On Tuesday, February 10, 2015 9:38 AM, Chen Song <[email protected]>
> wrote:
>
> When the active name node crashes, it seems there is always a chance
> that the data blocks in flight will be missing.
> My understanding is that when the active name node crashes, the metadata
> of data blocks in transition which exist in active name node memory is not
> successfully captured by journal nodes and thus not available on standby
> name node when it is promoted to active by zkfc.
> Is my understanding correct? Any way to mitigate this problem or race
> condition?
>
> --
> Chen Song
>
> --
> Chen Song

--
Chen Song
