Re: HDFS-758 in Hadoop-21 , Updates to Namenode health page

2009-12-08 Thread Chris Douglas
On Mon, Dec 7, 2009 at 2:27 PM, Allen Wittenauer
awittena...@linkedin.com wrote:
 On 12/7/09 2:00 PM, Sanjay Radia sra...@yahoo-inc.com wrote:
 Allen raises a good point that the rest of the community may  not need
 some of the features that Yahoo finds useful internally.

 FWIW, I have no real issues with the change itself. I'm much more concerned
 that a UI enhancement was deemed so critical. So important was this
 enhancement that it required a vote for back porting after feature freeze.

I see this differently. The UI enhancement isn't critical, so it
requires a vote. If it were critical, it would simply be committed. As
0.21 is tested, if it is discovered that some features are unusable
but not broken, then voting on whether the feature is important enough
to merit a fix is a reasonable, if imperfect, process. I hope that all
those experimenting with 0.21 consider correctness and usability of
new features; if a particular feature is awkward enough to merit a
patch, then we'll do this again, and consider the risk/reward as a
community.

I respectfully but emphatically disagree with the assertion that
forking our UI and administration tools would be positive for the
project, but the point is outside the scope of this thread.

 Meanwhile, other fixes and changes that do not impact Yahoo! for which
 patches have existed for a long time, sit there idle, uncommitted.

I'm not sure which issues you're referring to, but if the fixes and
changes are for 0.22, then holding off on those while 0.21 settles
seems consistent with the priority you identified earlier, i.e.
releasing 0.21. If there are patches you want to see in this release
that remain uncommitted, it would help to know which ones are in that
set.

 It makes me wonder what the priorities truly are.  What does feature freeze
 actually mean?

The priority has always been a stable, usable 0.21. Feature freeze
means that the bar for new features/improvements to core is raised for
all the obvious reasons, and where it's not clear, the community
discusses it. -C


Re: HDFS-758 in Hadoop-21 , Updates to Namenode health page

2009-12-07 Thread Sanjay Radia


On Nov 25, 2009, at 12:46 PM, Allen Wittenauer wrote:



 Then you'll have no issues patching other things in 0.21 that are actual
 bug fixes that also meet this criteria, right?  Or does this only apply to
 things that Yahoo! is hitting/deemed worthy?

Allen raises a good point that the rest of the community may not need
some of the features that Yahoo finds useful internally. It may clutter
Hadoop unnecessarily. Most of these admin GUI improvements are pluggable.
Perhaps this particular plugin did not even need to go into trunk. It
could have been made available as a separate downloadable or
contrib module. This way folks can use it across releases and only if
they need it.


Further, there are a few new GUI improvements in the hadoop community
that are proprietary - we should make it easier to create new admin
plugins. This way users can create new plugins that are useful
to them internally; it also allows companies to create proprietary
plugins for their customers.


sanjay




On 11/25/09 12:03 PM, Tsz Wo (Nicholas), Sze s29752-hadoop...@yahoo.com
wrote:

 +1 on committing it to 0.21

 I also agree that it does not impact the 0.21 release since the patch is
 already done.  The argument of not committing it to 0.21 would be either (1)
 the patch is not safe, or (2) the patch is not that useful.  I don't see that
 either is the case here.

 Nicholas Sze

 - Original Message 
 From: Jakob Homan jho...@yahoo-inc.com
 To: common-dev@hadoop.apache.org
 Sent: Wed, November 25, 2009 11:31:08 AM
 Subject: Re: HDFS-758 in Hadoop-21 , Updates to  Namenode health page

 +1. Backporting this does not in any way impact the release of 21.
 -Jakob
 Hairong Kuang wrote:
 +1. Although this is a new feature, I'd like to have it committed to 0.21
 since we have so many issues with delayed decommissioning recently.

 Hairong

 On 11/24/09 6:06 PM, Suresh Srinivas wrote:

 +1. This will also help debug the issues when decommissioning takes a long
 time to complete.

 On 11/23/09 7:36 PM, Jitendra Nath Pandey wrote:

 Hi,
 We will be committing some changes to the Namenode Health page
 (dfshealth.jsp) as part of the fix in HDFS-758. This will enable us to
 monitor the progress of decommissioning of datanodes more effectively.

 Summary of changes :
 1. A new link on the page for Decommissioning nodes.
 2. This link will point to a new page with details about decommissioning
    status for each node which include
    a) Number of under-replicated blocks in the node.
    b) Number of blocks with no live replica (i.e. all its replicas
       are on decommissioning nodes).
    c) Number of under-replicated blocks in open files.
    d) Time since decommissioning started.
 3. The main page will also contain total number of under-replicated
    blocks in the cluster.

 Thanks
 jitendra








Re: HDFS-758 in Hadoop-21 , Updates to Namenode health page

2009-12-07 Thread Eli Collins
On Mon, Dec 7, 2009 at 2:00 PM, Sanjay Radia sra...@yahoo-inc.com wrote:


 On Nov 25, 2009, at 12:46 PM, Allen Wittenauer wrote:


 Then you'll have no issues patching other things in 0.21 that are actual
 bug fixes that also meet this criteria, right?  Or does this only apply to
 things that Yahoo! is hitting/deemed worthy?



 Allen raises a good point that the rest of the community may  not need some
 of the features that Yahoo
 finds useful internally. It may clutter Hadoop unnecessarily.
 Most of these admin GUI improvements are pluggable.
 Perhaps this particular  plugin did not even need to go into trunk. It
 could have been made available as a separate downloadable or
 contrib module. This way folks can use it across releases and only if they
 need it.

 Further, there are a few new GUI improvements in the hadoop community that
 are proprietary - we should make it easier to create new admin plugins.
 This way users can create new plugins that are useful to them
 internally; it also allows companies to create proprietary plugins for their
 customers.



+1 to administration plugins.

I'm also +1 to HDFS-758 going into 21; it would be genuinely useful to several
customers.

Thanks,
Eli


Re: HDFS-758 in Hadoop-21 , Updates to Namenode health page

2009-11-25 Thread Hairong Kuang
+1. Although this is a new feature, I'd like to have it committed to 0.21
since we have so many issues with delayed decommissioning recently.

Hairong 


On 11/24/09 6:06 PM, Suresh Srinivas sures...@yahoo-inc.com wrote:

 +1. This will also help debug the issues when decommissioning takes a long
 time to complete.
 
 
 On 11/23/09 7:36 PM, Jitendra Nath Pandey jiten...@yahoo-inc.com wrote:
 
 
 
 Hi,
 We will be committing some changes to the Namenode Health page
 (dfshealth.jsp) as part of the fix in HDFS-758. This will enable us to
 monitor the progress of decommissioning of datanodes more effectively.
 Summary of changes :
 1. A new link on the page for Decommissioning nodes.
 2. This link will point to a new page with details about decommissioning
    status for each node which include
    a) Number of under-replicated blocks in the node.
    b) Number of blocks with no live replica (i.e. all its replicas
       are on decommissioning nodes).
    c) Number of under-replicated blocks in open files.
    d) Time since decommissioning started.
 3. The main page will also contain total number of under-replicated
    blocks in the cluster.
 
 Thanks
 jitendra
 
 



Re: HDFS-758 in Hadoop-21 , Updates to Namenode health page

2009-11-25 Thread Jitendra Nath Pandey

We have had several issues with decommissioning in the recent past.
Decommissioning takes a long time, and operations guys have no means to find
out what is taking so long. This is a change in the namenode web UI that
will greatly help hadoop users to monitor the status of decommissioning and
to discover the cause of 'long tails' in decommissioning.
   Brian Bockelman's comment on the jira (HDFS-758) confirms that 'long
tails' in decommissioning have been a problem not only at Yahoo but also for
other hadoop users.
   Therefore, this feature seems to be significant enough to be backported
to 21 so that it is available for our users sooner rather than later.
   We have backported this change to the Yahoo distribution of hadoop-20 as well.

Thanks
jitendra


On 11/25/09 11:14 AM, Allen Wittenauer awittena...@linkedin.com wrote:

 
 -1
 
 We're never going to see 0.21 if features keep getting backported.
 
 
 On 11/24/09 9:44 PM, Owen O'Malley owen.omal...@gmail.com wrote:
 
 +1 This sounds like useful information that will and has aided debugging.
 
 -- Owen
 



Re: HDFS-758 in Hadoop-21 , Updates to Namenode health page

2009-11-25 Thread Tsz Wo (Nicholas), Sze
+1 on committing it to 0.21

I also agree that it does not impact the 0.21 release since the patch is 
already done.  The argument of not committing it to 0.21 would be either (1) 
the patch is not safe, or (2) the patch is not that useful.  I don't see that
either is the case here.

Nicholas Sze




- Original Message 
 From: Jakob Homan jho...@yahoo-inc.com
 To: common-dev@hadoop.apache.org
 Sent: Wed, November 25, 2009 11:31:08 AM
 Subject: Re: HDFS-758 in Hadoop-21 , Updates to  Namenode health page
 
 +1. Backporting this does not in any way impact the release of 21.
 -Jakob
 Hairong Kuang wrote:
  +1. Although this is a new feature, I'd like to have it committed to 0.21
  since we have so many issues with delayed decommissioning recently.
  
  Hairong 
  
  
  On 11/24/09 6:06 PM, Suresh Srinivas wrote:
  
  +1. This will also help debug the issues when decommissioning takes a long
  time to complete.
 
 
  On 11/23/09 7:36 PM, Jitendra Nath Pandey wrote:
 
 
 
  Hi,
  We will be committing some changes to the Namenode Health page
  (dfshealth.jsp) as part of the fix in HDFS-758. This will enable us to
  monitor the progress of decommissioning of datanodes more effectively.
  Summary of changes :
  1. A new link on the page for Decommissioning nodes.
  2. This link will point to a new page with details about decommissioning
     status for each node which include
     a) Number of under-replicated blocks in the node.
     b) Number of blocks with no live replica (i.e. all its replicas
        are on decommissioning nodes).
     c) Number of under-replicated blocks in open files.
     d) Time since decommissioning started.
  3. The main page will also contain total number of under-replicated
     blocks in the cluster.
 
  Thanks
  jitendra
 
  



Re: HDFS-758 in Hadoop-21 , Updates to Namenode health page

2009-11-25 Thread Allen Wittenauer

Then you'll have no issues patching other things in 0.21 that are actual
bug fixes that also meet this criteria, right?  Or does this only apply to
things that Yahoo! is hitting/deemed worthy?


On 11/25/09 12:03 PM, Tsz Wo (Nicholas), Sze s29752-hadoop...@yahoo.com
wrote:

 +1 on committing it to 0.21
 
 I also agree that it does not impact the 0.21 release since the patch is
 already done.  The argument of not committing it to 0.21 would be either (1)
 the patch is not safe, or (2) the patch is not that useful.  I don't see that
 either is the case here.
 
 Nicholas Sze
 
 
 
 
 - Original Message 
 From: Jakob Homan jho...@yahoo-inc.com
 To: common-dev@hadoop.apache.org
 Sent: Wed, November 25, 2009 11:31:08 AM
 Subject: Re: HDFS-758 in Hadoop-21 , Updates to  Namenode health page
 
 +1. Backporting this does not in any way impact the release of 21.
 -Jakob
 Hairong Kuang wrote:
 +1. Although this is a new feature, I'd like to have it committed to 0.21
 since we have so many issues with delayed decommissioning recently.
 
 Hairong 
 
 
 On 11/24/09 6:06 PM, Suresh Srinivas wrote:
 
 +1. This will also help debug the issues when decommissioning takes a long
 time to complete.
 
 
 On 11/23/09 7:36 PM, Jitendra Nath Pandey wrote:
 
 
 
 Hi,
 We will be committing some changes to the Namenode Health page
 (dfshealth.jsp) as part of the fix in HDFS-758. This will enable us to
 monitor the progress of decommissioning of datanodes more effectively.
 Summary of changes :
 1. A new link on the page for Decommissioning nodes.
 2. This link will point to a new page with details about decommissioning
    status for each node which include
    a) Number of under-replicated blocks in the node.
    b) Number of blocks with no live replica (i.e. all its replicas
       are on decommissioning nodes).
    c) Number of under-replicated blocks in open files.
    d) Time since decommissioning started.
 3. The main page will also contain total number of under-replicated
    blocks in the cluster.
 
 Thanks
 jitendra
 
 
 



Re: HDFS-758 in Hadoop-21 , Updates to Namenode health page

2009-11-25 Thread Tsz Wo (Nicholas), Sze
Hi Allen,

I guess the "you" in your questions refers to me.  My answers are yes and
no, respectively.

Actually, we could possibly commit the patch to ydist, as we do for
yahoo-hadoop-0.20.  It is not a big difference.

BTW, as a cluster administrator, do you think that HDFS-758 is very useful?  
Are there other reasons in your mind against committing it to 0.21?

Nicholas





- Original Message 
 From: Allen Wittenauer awittena...@linkedin.com
 To: common-dev@hadoop.apache.org
 Sent: Wed, November 25, 2009 12:46:30 PM
 Subject: Re: HDFS-758 in Hadoop-21 , Updates to  Namenode health page
 
 
 Then you'll have no issues patching other things in 0.21 that are actual
 bug fixes that also meet this criteria, right?  Or does this only apply to
 things that Yahoo! is hitting/deemed worthy?
 
 
 On 11/25/09 12:03 PM, Tsz Wo (Nicholas), Sze 
 wrote:
 
  +1 on committing it to 0.21
  
  I also agree that it does not impact the 0.21 release since the patch is
  already done.  The argument of not committing it to 0.21 would be either (1)
  the patch is not safe, or (2) the patch is not that useful.  I don't see that
  either is the case here.
  
  Nicholas Sze
  
  
  
  
  - Original Message 
  From: Jakob Homan 
  To: common-dev@hadoop.apache.org
  Sent: Wed, November 25, 2009 11:31:08 AM
  Subject: Re: HDFS-758 in Hadoop-21 , Updates to  Namenode health page
  
  +1. Backporting this does not in any way impact the release of 21.
  -Jakob
  Hairong Kuang wrote:
  +1. Although this is a new feature, I'd like to have it committed to 0.21
  since we have so many issues with delayed decommissioning recently.
  
  Hairong 
  
  
  On 11/24/09 6:06 PM, Suresh Srinivas wrote:
  
  +1. This will also help debug the issues when decommissioning takes a long
  time to complete.
  
  
  On 11/23/09 7:36 PM, Jitendra Nath Pandey wrote:
  
  
  
  Hi,
  We will be committing some changes to the Namenode Health page
  (dfshealth.jsp) as part of the fix in HDFS-758. This will enable us to
  monitor the progress of decommissioning of datanodes more effectively.
  Summary of changes :
  1. A new link on the page for Decommissioning nodes.
  2. This link will point to a new page with details about decommissioning
     status for each node which include
     a) Number of under-replicated blocks in the node.
     b) Number of blocks with no live replica (i.e. all its replicas
        are on decommissioning nodes).
     c) Number of under-replicated blocks in open files.
     d) Time since decommissioning started.
  3. The main page will also contain total number of under-replicated
     blocks in the cluster.
  
  Thanks
  jitendra
  
  
  



Re: HDFS-758 in Hadoop-21 , Updates to Namenode health page

2009-11-25 Thread Konstantin Shvachko

+1
I am in favor of committing this to 0.21 because imo it is
not a new HDFS feature but rather an improvement of the web UI.

Allen Wittenauer wrote:
 Then you'll have no issues patching other things in 0.21 that are actual
 bug fixes that also meet this criteria, right?  Or does this only apply to
 things that Yahoo! is hitting/deemed worthy?

I don't see what the problem is here.
Yahoo! developers detected, fixed and tested the problem, then called for a
vote for inclusion.
If others have similar critical problems, they can also fix them and call for a
vote.

I might be missing some context here; could you please clarify?

Thanks,
--Konstantin


On 11/25/09 12:03 PM, Tsz Wo (Nicholas), Sze s29752-hadoop...@yahoo.com
wrote:


+1 on committing it to 0.21

I also agree that it does not impact the 0.21 release since the patch is
already done.  The argument of not committing it to 0.21 would be either (1)
the patch is not safe, or (2) the patch is not that useful.  I don't see that
either is the case here.

Nicholas Sze




- Original Message 

From: Jakob Homan jho...@yahoo-inc.com
To: common-dev@hadoop.apache.org
Sent: Wed, November 25, 2009 11:31:08 AM
Subject: Re: HDFS-758 in Hadoop-21 , Updates to  Namenode health page

+1. Backporting this does not in any way impact the release of 21.
-Jakob
Hairong Kuang wrote:

+1. Although this is a new feature, I'd like to have it committed to 0.21
since we have so many issues with delayed decommissioning recently.

Hairong 



On 11/24/09 6:06 PM, Suresh Srinivas wrote:


+1. This will also help debug the issues when decommissioning takes a long
time to complete.


On 11/23/09 7:36 PM, Jitendra Nath Pandey wrote:




Hi,

We will be committing some changes to the Namenode Health page
(dfshealth.jsp) as part of the fix in HDFS-758. This will enable us to
monitor the progress of decommissioning of datanodes more effectively.
Summary of changes :
1. A new link on the page for Decommissioning nodes.
2. This link will point to a new page with details about decommissioning
   status for each node which include
   a) Number of under-replicated blocks in the node.
   b) Number of blocks with no live replica (i.e. all its replicas
      are on decommissioning nodes).
   c) Number of under-replicated blocks in open files.
   d) Time since decommissioning started.
3. The main page will also contain total number of under-replicated
   blocks in the cluster.

Thanks
jitendra







Re: HDFS-758 in Hadoop-21 , Updates to Namenode health page

2009-11-24 Thread Suresh Srinivas
+1. This will also help debug the issues when decommissioning takes a long time 
to complete.


On 11/23/09 7:36 PM, Jitendra Nath Pandey jiten...@yahoo-inc.com wrote:



 Hi,
 We will be committing some changes to the Namenode Health page
 (dfshealth.jsp) as part of the fix in HDFS-758. This will enable us to
 monitor the progress of decommissioning of datanodes more effectively.
 Summary of changes :
 1. A new link on the page for Decommissioning nodes.
 2. This link will point to a new page with details about decommissioning
    status for each node which include
    a) Number of under-replicated blocks in the node.
    b) Number of blocks with no live replica (i.e. all its replicas
       are on decommissioning nodes).
    c) Number of under-replicated blocks in open files.
    d) Time since decommissioning started.
 3. The main page will also contain total number of under-replicated blocks
    in the cluster.

 Thanks
 jitendra
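
Editor's illustration: the per-node details listed in item 2 of the summary
above could be modelled roughly as below. This is only a sketch in Python and
is not taken from the HDFS-758 patch (which changes dfshealth.jsp); the class
name, field names, and helper functions are all hypothetical and chosen just to
mirror items 2a to 2d.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DecommissionStatus:
    """Hypothetical per-datanode record; names are illustrative only."""
    node: str
    under_replicated_blocks: int          # item 2a
    blocks_with_no_live_replica: int      # item 2b
    under_replicated_in_open_files: int   # item 2c
    seconds_since_start: int              # item 2d


def format_row(s: DecommissionStatus) -> str:
    """Render one tab-separated row, with elapsed time as hours and minutes."""
    hours, rem = divmod(s.seconds_since_start, 3600)
    minutes = rem // 60
    return "\t".join([
        s.node,
        str(s.under_replicated_blocks),
        str(s.blocks_with_no_live_replica),
        str(s.under_replicated_in_open_files),
        f"{hours}h{minutes:02d}m",
    ])


def slowest_first(statuses: List[DecommissionStatus]) -> List[DecommissionStatus]:
    """Order nodes so the longest-running decommissions ('long tails') come first."""
    return sorted(statuses, key=lambda s: s.seconds_since_start, reverse=True)
```

Sorting by elapsed time is one plausible way an operator might look for the
'long tails' mentioned elsewhere in the thread; the actual page layout is not
described in these messages.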