If I remember correctly:

Setting dfs.safemode.threshold.pct = 1 may keep the NameNode from ever
leaving safemode because of floating-point round-off errors.

Setting dfs.safemode.threshold.pct > 1 means that the NameNode can never exit
safemode, since a ratio above 1 is not achievable.
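
To make the two cases concrete, here is a minimal sketch of a ratio-versus-threshold check. It is not the actual NameNode code: the class name, the method names, and the ">=" comparison are assumptions made for illustration only. It shows that a threshold above 1 can never be met, and that mixing float and double is one way a round-off error can make a comparison fail unexpectedly.

```java
public class SafeModeThresholdSketch {

    // Hypothetical helper: fraction of blocks that have been reported as safe.
    static double safeBlockRatio(long safeBlocks, long totalBlocks) {
        // An empty file system is treated as fully reported (an assumption of this sketch).
        return totalBlocks == 0 ? 1.0 : (double) safeBlocks / totalBlocks;
    }

    // Hypothetical check: safemode may end once the ratio reaches the threshold.
    static boolean canLeaveSafeMode(long safeBlocks, long totalBlocks, double thresholdPct) {
        return safeBlockRatio(safeBlocks, totalBlocks) >= thresholdPct;
    }

    public static void main(String[] args) {
        long total = 1000;

        // 999 of 1000 blocks reported, threshold 0.999: the check passes.
        System.out.println(canLeaveSafeMode(999, total, 0.999));   // true

        // Every block reported, threshold 1.1: the ratio is capped at 1,
        // so the check can never pass.
        System.out.println(canLeaveSafeMode(total, total, 1.1));   // false

        // Round-off illustration: 0.999f widened to double is slightly
        // larger than the double literal 0.999, so the same 999/1000 ratio
        // now falls short of the threshold.
        double thresholdFromFloat = 0.999f;
        System.out.println(canLeaveSafeMode(999, total, thresholdFromFloat)); // false
    }
}
```

Whether the real implementation hits such a round-off at exactly 1 depends on how it computes and compares the ratio, which this sketch does not reproduce.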

Nicholas Sze




----- Original Message ----
> From: Raghu Angadi <[email protected]>
> To: [email protected]
> Sent: Tuesday, October 6, 2009 6:03:52 PM
> Subject: Re: A question on dfs.safemode.threshold.pct
> 
> I am not sure what the real concern is... You can set it to 1.0 (or even 1.1
> :)) if you prefer. Many admins do.
> 
> Raghu.
> 
> On Tue, Oct 6, 2009 at 5:20 PM, Manhee Jo wrote:
> 
> > Thank you, Raghu.
> > Then, when the percentage is below 0.999, how can you tell
> > if some datanodes are just slower than others or some of the data blocks
> > are lost?
> > I think "percentage 1" should have special meaning like
> > it guarantees integrity of data in HDFS.
> > If it's below 1, then the integrity is not said to be guaranteed.
> >
> > Or are there other means by which a NameNode can recover the lost
> > blocks, so that it does not matter even if 0.1% of the data is lost?
> >
> >
> > Thanks,
> > Manhee
> >
> > ----- Original Message ----- From: "Raghu Angadi" 
> > To: 
> > Sent: Wednesday, October 07, 2009 1:26 AM
> > Subject: Re: A question on dfs.safemode.threshold.pct
> >
> >
> >
> >  Yes, it is mostly geared towards replication greater than 1. One of the
> >> reasons for waiting for this threshold is to avoid HDFS starting
> >> unnecessary
> >> replications of blocks at the start up when some of the datanodes are
> >> slower
> >> to start up.
> >>
> >> When the replication is 1, you don't have that issue. A block either
> >> exists
> >> or does not.
> >>
> >> Raghu
> >> 2009/10/5 Manhee Jo 
> >>
> >>  Hi all,
> >>>
> >>> Why isn't the dfs.safemode.threshold.pct 1 by default?
> >>> When dfs.replication.min=1 with dfs.safemode.threshold.pct=0.999,
> >>> there might be chances for a NameNode to check in with incomplete data
> >>> in its file system. Am I right? Is it permissible? Or is it assuming that
> >>> replication would be always more than 1?
> >>>
> >>>
> >>> Thanks,
> >>> Manhee
> >>>
> >>
> >>
> >
> >
