Re: HDFS-758 in Hadoop-21 , Updates to Namenode health page
On Mon, Dec 7, 2009 at 2:27 PM, Allen Wittenauer awittena...@linkedin.com wrote:
> On 12/7/09 2:00 PM, Sanjay Radia sra...@yahoo-inc.com wrote:
>> Allen raises a good point that the rest of the community may not need some of the features that Yahoo finds useful internally.
>
> FWIW, I have no real issues with the change itself. I'm much more concerned that a UI enhancement was deemed so critical. So important was this enhancement that it required a vote for back porting after feature freeze.

I see this differently. The UI enhancement isn't critical, so it requires a vote. If it were critical, it would simply be committed. As 0.21 is tested, if it is discovered that some features are unusable but not broken, then voting on whether the feature is important enough to merit a fix is a reasonable, if imperfect process. I hope that all those experimenting with 0.21 consider correctness and usability of new features; if a particular feature is awkward enough to merit a patch, then we'll do this again, and consider the risk/reward as a community.

I respectfully, but emphatically disagree with the assertion that forking our UI and administration tools would be positive for the project, but the point is outside the scope of this thread.

> Meanwhile, other fixes and changes that do not impact Yahoo! for which patches have existed for a long time, sit there idle, uncommitted.

I'm not sure which issues you're referring to, but if the fixes and changes are for 0.22, then holding off on those while 0.21 settles seems consistent with the priority you identified earlier, i.e. releasing 0.21. If there are patches you want to see in this release that remain uncommitted, it would help to know which ones are in that set.

> It makes me wonder what the priorities truly are. What does feature freeze actually mean?

The priority has always been a stable, usable 0.21. Feature freeze means that the bar for new features/improvements to core is raised for all the obvious reasons, and where it's not clear: the community discusses it.

-C
Re: HDFS-758 in Hadoop-21 , Updates to Namenode health page
On Nov 25, 2009, at 12:46 PM, Allen Wittenauer wrote:
> Then you'll have no issues patching other things in 0.21 that are actual bug fixes that also meet these criteria, right? Or does this only apply to things that Yahoo! is hitting/deemed worthy?

Allen raises a good point that the rest of the community may not need some of the features that Yahoo finds useful internally. It may clutter Hadoop unnecessarily. Most of these admin GUI improvements are pluggable. Perhaps this particular plugin did not even need to go into trunk. It could have been made available as a separate downloadable or contrib module. This way folks can use it across releases and only if they need it.

Further, there are a few new GUI improvements in the Hadoop community that are proprietary; we should make it easier to create new admin plugins. This way users can create new plugins that are useful to them internally; it also allows companies to create proprietary plugins for their customers.

sanjay

> On 11/25/09 12:03 PM, Tsz Wo (Nicholas), Sze s29752-hadoop...@yahoo.com wrote:
> +1 on committing it to 0.21
> I also agree that it does not impact the 0.21 release since the patch is already done. The argument against committing it to 0.21 would be either (1) the patch is not safe, or (2) the patch is not that useful. I don't see that either is the case here.
> Nicholas Sze
>
> ----- Original Message -----
> From: Jakob Homan jho...@yahoo-inc.com
> To: common-dev@hadoop.apache.org
> Sent: Wed, November 25, 2009 11:31:08 AM
> Subject: Re: HDFS-758 in Hadoop-21 , Updates to Namenode health page
> +1. Backporting this does not in any way impact the release of 21.
> -Jakob
>
> Hairong Kuang wrote:
> +1. Although this is a new feature, I'd like to have it committed to 0.21 since we have had so many issues with delayed decommission recently.
> Hairong
>
> On 11/24/09 6:06 PM, Suresh Srinivas wrote:
> +1. This will also help debug the issues when decommissioning takes a long time to complete.
>
> On 11/23/09 7:36 PM, Jitendra Nath Pandey wrote:
> Hi,
> We will be committing some changes to the Namenode Health page (dfshealth.jsp) as part of the fix in HDFS-758. This will enable us to monitor the progress of decommissioning of datanodes more effectively.
> Summary of changes:
> 1. A new link on the page for Decommissioning nodes.
> 2. This link will point to a new page with details about decommissioning status for each node, which include:
>    a) Number of under-replicated blocks in the node.
>    b) Number of blocks with no live replica (i.e. all its replicas are on decommissioning nodes).
>    c) Number of under-replicated blocks in open files.
>    d) Time since decommissioning started.
> 3. The main page will also contain the total number of under-replicated blocks in the cluster.
> Thanks
> jitendra
Re: HDFS-758 in Hadoop-21 , Updates to Namenode health page
On Mon, Dec 7, 2009 at 2:00 PM, Sanjay Radia sra...@yahoo-inc.com wrote:
> On Nov 25, 2009, at 12:46 PM, Allen Wittenauer wrote:
>> Then you'll have no issues patching other things in 0.21 that are actual bug fixes that also meet these criteria, right? Or does this only apply to things that Yahoo! is hitting/deemed worthy?
>
> Allen raises a good point that the rest of the community may not need some of the features that Yahoo finds useful internally. It may clutter Hadoop unnecessarily. Most of these admin GUI improvements are pluggable. Perhaps this particular plugin did not even need to go into trunk. It could have been made available as a separate downloadable or contrib module. This way folks can use it across releases and only if they need it.
>
> Further, there are a few new GUI improvements in the Hadoop community that are proprietary; we should make it easier to create new admin plugins. This way users can create new plugins that are useful to them internally; it also allows companies to create proprietary plugins for their customers.

+1 to administration plugins. I'm also +1 to HDFS-758 going into 21; it would be genuinely useful to several customers.

Thanks,
Eli
Re: HDFS-758 in Hadoop-21 , Updates to Namenode health page
+1. Although this is a new feature, I'd like to have it committed to 0.21 since we have had so many issues with delayed decommission recently.

Hairong

> On 11/24/09 6:06 PM, Suresh Srinivas sures...@yahoo-inc.com wrote:
> +1. This will also help debug the issues when decommissioning takes a long time to complete.
>
> On 11/23/09 7:36 PM, Jitendra Nath Pandey jiten...@yahoo-inc.com wrote:
> Hi,
> We will be committing some changes to the Namenode Health page (dfshealth.jsp) as part of the fix in HDFS-758. This will enable us to monitor the progress of decommissioning of datanodes more effectively.
> Summary of changes:
> 1. A new link on the page for Decommissioning nodes.
> 2. This link will point to a new page with details about decommissioning status for each node, which include:
>    a) Number of under-replicated blocks in the node.
>    b) Number of blocks with no live replica (i.e. all its replicas are on decommissioning nodes).
>    c) Number of under-replicated blocks in open files.
>    d) Time since decommissioning started.
> 3. The main page will also contain the total number of under-replicated blocks in the cluster.
> Thanks
> jitendra
Re: HDFS-758 in Hadoop-21 , Updates to Namenode health page
We have had several issues with decommissioning in the recent past. Decommissioning takes a long time, and operations guys have no means to find out what is taking so long. This is a change in the namenode web UI that will greatly help Hadoop users monitor the status of decommissioning and discover the cause of 'long tails' in decommissioning. Brian Bockelman's comment on the jira (HDFS-758) confirms that 'long tails' in decommissioning have been a problem not only at Yahoo but also for other Hadoop users. Therefore, this feature seems significant enough to be backported to 21 so that it is available to our users sooner rather than later. We have backported this change to the Yahoo distribution of hadoop-20 as well.

Thanks
jitendra

> On 11/25/09 11:14 AM, Allen Wittenauer awittena...@linkedin.com wrote:
> -1
> We're never going to see 0.21 if features keep getting backported.
>
> On 11/24/09 9:44 PM, Owen O'Malley owen.omal...@gmail.com wrote:
> +1
> This sounds like useful information that will and has aided debugging.
> -- Owen
Re: HDFS-758 in Hadoop-21 , Updates to Namenode health page
+1 on committing it to 0.21

I also agree that it does not impact the 0.21 release since the patch is already done. The argument against committing it to 0.21 would be either (1) the patch is not safe, or (2) the patch is not that useful. I don't see that either is the case here.

Nicholas Sze

> ----- Original Message -----
> From: Jakob Homan jho...@yahoo-inc.com
> To: common-dev@hadoop.apache.org
> Sent: Wed, November 25, 2009 11:31:08 AM
> Subject: Re: HDFS-758 in Hadoop-21 , Updates to Namenode health page
> +1. Backporting this does not in any way impact the release of 21.
> -Jakob
>
> Hairong Kuang wrote:
> +1. Although this is a new feature, I'd like to have it committed to 0.21 since we have had so many issues with delayed decommission recently.
> Hairong
>
> On 11/24/09 6:06 PM, Suresh Srinivas wrote:
> +1. This will also help debug the issues when decommissioning takes a long time to complete.
>
> On 11/23/09 7:36 PM, Jitendra Nath Pandey wrote:
> Hi,
> We will be committing some changes to the Namenode Health page (dfshealth.jsp) as part of the fix in HDFS-758. This will enable us to monitor the progress of decommissioning of datanodes more effectively.
> Summary of changes:
> 1. A new link on the page for Decommissioning nodes.
> 2. This link will point to a new page with details about decommissioning status for each node, which include:
>    a) Number of under-replicated blocks in the node.
>    b) Number of blocks with no live replica (i.e. all its replicas are on decommissioning nodes).
>    c) Number of under-replicated blocks in open files.
>    d) Time since decommissioning started.
> 3. The main page will also contain the total number of under-replicated blocks in the cluster.
> Thanks
> jitendra
Re: HDFS-758 in Hadoop-21 , Updates to Namenode health page
Then you'll have no issues patching other things in 0.21 that are actual bug fixes that also meet these criteria, right? Or does this only apply to things that Yahoo! is hitting/deemed worthy?

> On 11/25/09 12:03 PM, Tsz Wo (Nicholas), Sze s29752-hadoop...@yahoo.com wrote:
> +1 on committing it to 0.21
> I also agree that it does not impact the 0.21 release since the patch is already done. The argument against committing it to 0.21 would be either (1) the patch is not safe, or (2) the patch is not that useful. I don't see that either is the case here.
> Nicholas Sze
>
> ----- Original Message -----
> From: Jakob Homan jho...@yahoo-inc.com
> To: common-dev@hadoop.apache.org
> Sent: Wed, November 25, 2009 11:31:08 AM
> Subject: Re: HDFS-758 in Hadoop-21 , Updates to Namenode health page
> +1. Backporting this does not in any way impact the release of 21.
> -Jakob
>
> Hairong Kuang wrote:
> +1. Although this is a new feature, I'd like to have it committed to 0.21 since we have had so many issues with delayed decommission recently.
> Hairong
>
> On 11/24/09 6:06 PM, Suresh Srinivas wrote:
> +1. This will also help debug the issues when decommissioning takes a long time to complete.
>
> On 11/23/09 7:36 PM, Jitendra Nath Pandey wrote:
> Hi,
> We will be committing some changes to the Namenode Health page (dfshealth.jsp) as part of the fix in HDFS-758. This will enable us to monitor the progress of decommissioning of datanodes more effectively.
> Summary of changes:
> 1. A new link on the page for Decommissioning nodes.
> 2. This link will point to a new page with details about decommissioning status for each node, which include:
>    a) Number of under-replicated blocks in the node.
>    b) Number of blocks with no live replica (i.e. all its replicas are on decommissioning nodes).
>    c) Number of under-replicated blocks in open files.
>    d) Time since decommissioning started.
> 3. The main page will also contain the total number of under-replicated blocks in the cluster.
> Thanks
> jitendra
Re: HDFS-758 in Hadoop-21 , Updates to Namenode health page
Hi Allen,

I guess the 'you' in your questions refers to me. My answers are yes and no, respectively. Actually, we could possibly commit the patch to ydist, like what we do for yahoo-hadoop-0.20. It is not a big difference.

BTW, as a cluster administrator, do you think that HDFS-758 is very useful? Are there other reasons in your mind against committing it to 0.21?

Nicholas

> ----- Original Message -----
> From: Allen Wittenauer awittena...@linkedin.com
> To: common-dev@hadoop.apache.org
> Sent: Wed, November 25, 2009 12:46:30 PM
> Subject: Re: HDFS-758 in Hadoop-21 , Updates to Namenode health page
> Then you'll have no issues patching other things in 0.21 that are actual bug fixes that also meet these criteria, right? Or does this only apply to things that Yahoo! is hitting/deemed worthy?
>
> On 11/25/09 12:03 PM, Tsz Wo (Nicholas), Sze wrote:
> +1 on committing it to 0.21
> I also agree that it does not impact the 0.21 release since the patch is already done. The argument against committing it to 0.21 would be either (1) the patch is not safe, or (2) the patch is not that useful. I don't see that either is the case here.
> Nicholas Sze
>
> ----- Original Message -----
> From: Jakob Homan
> To: common-dev@hadoop.apache.org
> Sent: Wed, November 25, 2009 11:31:08 AM
> Subject: Re: HDFS-758 in Hadoop-21 , Updates to Namenode health page
> +1. Backporting this does not in any way impact the release of 21.
> -Jakob
>
> Hairong Kuang wrote:
> +1. Although this is a new feature, I'd like to have it committed to 0.21 since we have had so many issues with delayed decommission recently.
> Hairong
>
> On 11/24/09 6:06 PM, Suresh Srinivas wrote:
> +1. This will also help debug the issues when decommissioning takes a long time to complete.
>
> On 11/23/09 7:36 PM, Jitendra Nath Pandey wrote:
> Hi,
> We will be committing some changes to the Namenode Health page (dfshealth.jsp) as part of the fix in HDFS-758. This will enable us to monitor the progress of decommissioning of datanodes more effectively.
> Summary of changes:
> 1. A new link on the page for Decommissioning nodes.
> 2. This link will point to a new page with details about decommissioning status for each node, which include:
>    a) Number of under-replicated blocks in the node.
>    b) Number of blocks with no live replica (i.e. all its replicas are on decommissioning nodes).
>    c) Number of under-replicated blocks in open files.
>    d) Time since decommissioning started.
> 3. The main page will also contain the total number of under-replicated blocks in the cluster.
> Thanks
> jitendra
Re: HDFS-758 in Hadoop-21 , Updates to Namenode health page
+1

I am in favor of committing this to 0.21 because imo it is not a new HDFS feature but rather an improvement of the web UI.

Allen Wittenauer wrote:
> Then you'll have no issues patching other things in 0.21 that are actual bug fixes that also meet these criteria, right? Or does this only apply to things that Yahoo! is hitting/deemed worthy?

I don't see what the problem is here. Yahoo! developers detected, fixed and tested the problem, then called for a vote for inclusion. If others have similar critical problems, they can also fix them and call for a vote. I might be missing some context here; could you please clarify?

Thanks,
--Konstantin

> On 11/25/09 12:03 PM, Tsz Wo (Nicholas), Sze s29752-hadoop...@yahoo.com wrote:
> +1 on committing it to 0.21
> I also agree that it does not impact the 0.21 release since the patch is already done. The argument against committing it to 0.21 would be either (1) the patch is not safe, or (2) the patch is not that useful. I don't see that either is the case here.
> Nicholas Sze
>
> ----- Original Message -----
> From: Jakob Homan jho...@yahoo-inc.com
> To: common-dev@hadoop.apache.org
> Sent: Wed, November 25, 2009 11:31:08 AM
> Subject: Re: HDFS-758 in Hadoop-21 , Updates to Namenode health page
> +1. Backporting this does not in any way impact the release of 21.
> -Jakob
>
> Hairong Kuang wrote:
> +1. Although this is a new feature, I'd like to have it committed to 0.21 since we have had so many issues with delayed decommission recently.
> Hairong
>
> On 11/24/09 6:06 PM, Suresh Srinivas wrote:
> +1. This will also help debug the issues when decommissioning takes a long time to complete.
>
> On 11/23/09 7:36 PM, Jitendra Nath Pandey wrote:
> Hi,
> We will be committing some changes to the Namenode Health page (dfshealth.jsp) as part of the fix in HDFS-758. This will enable us to monitor the progress of decommissioning of datanodes more effectively.
> Summary of changes:
> 1. A new link on the page for Decommissioning nodes.
> 2. This link will point to a new page with details about decommissioning status for each node, which include:
>    a) Number of under-replicated blocks in the node.
>    b) Number of blocks with no live replica (i.e. all its replicas are on decommissioning nodes).
>    c) Number of under-replicated blocks in open files.
>    d) Time since decommissioning started.
> 3. The main page will also contain the total number of under-replicated blocks in the cluster.
> Thanks
> jitendra
Re: HDFS-758 in Hadoop-21 , Updates to Namenode health page
+1. This will also help debug the issues when decommissioning takes a long time to complete.

> On 11/23/09 7:36 PM, Jitendra Nath Pandey jiten...@yahoo-inc.com wrote:
> Hi,
> We will be committing some changes to the Namenode Health page (dfshealth.jsp) as part of the fix in HDFS-758. This will enable us to monitor the progress of decommissioning of datanodes more effectively.
> Summary of changes:
> 1. A new link on the page for Decommissioning nodes.
> 2. This link will point to a new page with details about decommissioning status for each node, which include:
>    a) Number of under-replicated blocks in the node.
>    b) Number of blocks with no live replica (i.e. all its replicas are on decommissioning nodes).
>    c) Number of under-replicated blocks in open files.
>    d) Time since decommissioning started.
> 3. The main page will also contain the total number of under-replicated blocks in the cluster.
> Thanks
> jitendra
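
For readers who want a concrete picture of the per-node counters listed in the summary of changes, here is a minimal sketch in Python. It is not the HDFS-758 patch and does not use any Hadoop API: the `Block` class, the `decommission_status` and `cluster_under_replicated` helpers, and the node names are all hypothetical. It also assumes that a "live" replica is one held by a node that is not decommissioning, and that a block is under-replicated when its live replica count is below its target replication.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass
class Block:
    # Hypothetical model of a block; not a Hadoop class.
    block_id: str
    replication: int            # target replication factor
    replica_nodes: List[str]    # nodes currently holding a replica
    in_open_file: bool = False  # True if the file is still being written


def live_replicas(block: Block, decommissioning: Set[str]) -> List[str]:
    # Assumption: replicas on decommissioning nodes do not count as live.
    return [n for n in block.replica_nodes if n not in decommissioning]


def decommission_status(node: str, blocks: List[Block],
                        decommissioning: Set[str],
                        started_at: Dict[str, float],
                        now: float) -> Dict[str, float]:
    """Counters (a) through (d) for one decommissioning node."""
    status = {
        "under_replicated": 0,       # (a)
        "no_live_replica": 0,        # (b)
        "under_replicated_open": 0,  # (c)
        "seconds_decommissioning": now - started_at[node],  # (d)
    }
    for block in blocks:
        if node not in block.replica_nodes:
            continue  # only blocks stored on this node are counted
        live = live_replicas(block, decommissioning)
        if len(live) < block.replication:
            status["under_replicated"] += 1
            if block.in_open_file:
                status["under_replicated_open"] += 1
        if not live:
            status["no_live_replica"] += 1
    return status


def cluster_under_replicated(blocks: List[Block],
                             decommissioning: Set[str]) -> int:
    """Item 3: total under-replicated blocks across the cluster."""
    return sum(1 for b in blocks
               if len(live_replicas(b, decommissioning)) < b.replication)


# Made-up example data: dn1 and dn4 are being decommissioned.
blocks = [
    Block("b1", 3, ["dn1", "dn2", "dn3"]),
    Block("b2", 2, ["dn1", "dn4"], in_open_file=True),
    Block("b3", 1, ["dn1", "dn2"]),
    Block("b4", 3, ["dn2", "dn3"]),
]
decommissioning = {"dn1", "dn4"}
started_at = {"dn1": 1000.0, "dn4": 1200.0}

status = decommission_status("dn1", blocks, decommissioning, started_at, now=1600.0)
total = cluster_under_replicated(blocks, decommissioning)
print(status)
print(total)
```

In this made-up data, block b2 is the kind of case the thread calls a 'long tail': every replica sits on a decommissioning node, so the node cannot finish until the block is copied elsewhere.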