Re: [VOTE] - Release 2.0.5-beta
Hi Guys,

+1

We @ ebay would like to see snapshots before we start testing/deploying hadoop 2.0 next month.

Thanks,
Mayank

On Wed, May 15, 2013 at 10:57 AM, Arun C Murthy a...@hortonworks.com wrote:

Folks,

A considerable number of people have expressed confusion regarding the recent vote on 2.0.5, beta status etc. given lack of specifics, the voting itself (validity of the vote itself, whose votes are binding) etc.

IMHO technical arguments (incompatibility b/w 2.0 and 2.1, current stability of 3 features under debate etc.) have been lost in the discussion in favor of non-technical (almost dramatic) nuances such as seizing the moment. There is now dangerous talk of tolerating incompatibility b/w 2.0 and 2.1 - this is a red flag for me; particularly when there are just 3 features being debated and active committers and contributors are confident of and ready to stand by their work. All patches, I believe, are ready to be merged in the next few days per discussions on jira. This will, clearly, not delay the other API work which everyone agrees is crucial.

As a result, I feel no recourse but to restart a new vote - all attempts at calm, reasoned, civil discussion based on technical arguments have come to naught - I apologize for the thrash caused to everyone's attention.

To get past all of this confusion, I'd like to present an alternate, specific proposal for consideration.

I propose we continue the original plan and make a 2.0.5-beta release by May end with the following content:
# HDFS-347
# HDFS Snapshots
# Windows support
# Necessary final API/protocol changes such as:
* Final YARN API changes: YARN-386
* MR Binary Compatibility: MAPREDUCE-5108
* Final RPC cleanup: HADOOP-8990

People working on the above features have all expressed considerable comfort with them and are ready to stand by to help expedite any necessary bug-fixes etc. to get to stabilization quickly. I'm confident we can get this release out by end of May.
This sets the stage for a hadoop-2.x GA release right after with some more testing - this means I think I can quickly turn around and make bug-fix releases as necessary right after 2.0.5-beta.

I request that people consider helping out with this plan and sign up to help push hadoop-2.x to stability as outlined above. I believe this will help achieve our shared goals of quickly stabilizing hadoop-2 and help ensure we can support it for the foreseeable future in a compatible manner for the benefit of our users and downstream projects.

Please vote, the vote will run the normal 7 days. Obviously, I'm +1.

thanks,
Arun

PS: To keep this discussion grounded in technical details I've moved this to dev@ (bcc general@).
Re: [VOTE] - Release 2.0.5-beta
+1 on 2.0.5 defined in this thread with the new features. But I am supportive of an earlier release that has ALL the compatibility changes, without the features.

sanjay

On May 15, 2013, at 10:57 AM, Arun C Murthy wrote:

Folks,
...
I propose we continue the original plan and make a 2.0.5-beta release by May end with the following content:
# HDFS-347
# HDFS Snapshots
# Windows support
# Necessary final API/protocol changes such as:
* Final YARN API changes: YARN-386
* MR Binary Compatibility: MAPREDUCE-5108
* Final RPC cleanup: HADOOP-8990
Re: [VOTE] - Release 2.0.5-beta
-1 for the record.

This is a great plan for 2.1, which I would gladly support, but not for 2.0.5. I do not see how the previous vote could have been confusing, as it contained a direct quotation of the relevant clause of the Bylaws.

Arun, the format of this vote remains confusing. What the action is and what approval method you plan to use are still undefined.

Thanks,
--Konstantin
Re: [VOTE] - Release 2.0.5-beta
Chris,

I find you are contradicting yourself within this message and with some other of yours. But I want to address only one thing here:

"This has exposed a bug in our bylaws, which we can fix."

This could be a bug, and we may need to fix it. But until then it is a bylaw, which is the only rule we have to come to an agreement if we disagree. If we both respect the rules we can come to an agreement. If not, and people start forcing their way by saying "the rule is wrong - let's ignore it today", or by conducting an infinite chain of counter votes - this creates chaos.

Thanks,
--Konstantin

On Sat, May 18, 2013 at 4:22 PM, Chris Douglas cdoug...@apache.org wrote:

The release plan vote is not binding in any way. Nobody lost a vote, or risks having an outcome reversed, because there are no consequences to these exercises.

Konstantin, I've been trying to tell you for more than a week that you can go forward without anyone's blessing or consent. There are no precedents, because the release plan vote has been a formality until now, and I don't know of any other projects that even bother with it. Most of our committers and PMC members didn't even know who was eligible to vote on it, because we usually ignore it.

What *does* matter is the majority vote of the PMC on the release artifact. While we under-defined what the release plan means, we have zero ambiguity on when a release artifact becomes real.

In the discussion, you were offered a minor release series, help selecting patches from branch-2, and every administrative barrier was removed from your path. Instead of taking this and running with it, you continued to press for... I don't know what. Please decide how you're going to move a development branch - any of them - forward and start working on it. There is nothing to win in these threads.

This has exposed a bug in our bylaws, which we can fix. Right now, these votes are confusing everybody and stalling the project. I don't care who comes up with 2.0.5-beta, whether it's part of 2.1, or if we create 3.0. Any committer who wants to offer a candidate needs to demonstrate that they have a non-trivial, non-sectarian proportion of the community behind it by (1) creating the artifact (2) passing a PMC vote to make that artifact a release. It's that simple.

With respect to the board: they are not parents, and we are not children. Neither are they interested or equipped to tell us how to partition releases of Hadoop. This is routine development, we are failing at it, but we will recover by eliminating this pointless ritual and getting back to producing software.

-C

On Fri, May 17, 2013 at 1:10 PM, Konstantin Shvachko shv.had...@gmail.com wrote:

BCC: general@

Since we recognize now that this is a vote to overrule a previous decision, I am referring to Vinod's note on general (http://s.apache.org/h7x): should this be brought to the attention of the Board? I don't remember any precedents of this kind in Hadoop history. But other projects may have had such experience. A clarification on categorizing this action and on voting practices from ASF may help.

Thanks,
--Konstantin

On Wed, May 15, 2013 at 3:36 PM, Konstantin Shvachko shv.had...@gmail.com wrote:

Arun,

I am glad I at least convinced you to finally announce your release plan and put it to a vote. Even though it is to overrule the vote that just completed, which you were against and lost, well - Twice. I am glad you removed the NFS feature from the list proposed earlier.

I think this vote is late. The lazy consensus on that issue has just been reached. I don't see the basis for the new vote, and it is not clear what action you seek to approve.

Thanks,
--Konstantin
Re: [VOTE] - Release 2.0.5-beta
I've now started a separate discussion thread in common-dev@, titled "[PROPOSAL] change in bylaws to remove Release Plan vote". If it achieves consensus, I'll put it to a vote to so change the bylaws.

Best,
--Matt
Re: [VOTE] - Release 2.0.5-beta
The release plan vote is not binding in any way. Nobody lost a vote, or risks having an outcome reversed, because there are no consequences to these exercises.

Konstantin, I've been trying to tell you for more than a week that you can go forward without anyone's blessing or consent. There are no precedents, because the release plan vote has been a formality until now, and I don't know of any other projects that even bother with it. Most of our committers and PMC members didn't even know who was eligible to vote on it, because we usually ignore it.

What *does* matter is the majority vote of the PMC on the release artifact. While we under-defined what the release plan means, we have zero ambiguity on when a release artifact becomes real.

In the discussion, you were offered a minor release series, help selecting patches from branch-2, and every administrative barrier was removed from your path. Instead of taking this and running with it, you continued to press for... I don't know what. Please decide how you're going to move a development branch - any of them - forward and start working on it. There is nothing to win in these threads.

This has exposed a bug in our bylaws, which we can fix. Right now, these votes are confusing everybody and stalling the project. I don't care who comes up with 2.0.5-beta, whether it's part of 2.1, or if we create 3.0. Any committer who wants to offer a candidate needs to demonstrate that they have a non-trivial, non-sectarian proportion of the community behind it by (1) creating the artifact (2) passing a PMC vote to make that artifact a release. It's that simple.

With respect to the board: they are not parents, and we are not children. Neither are they interested or equipped to tell us how to partition releases of Hadoop. This is routine development, we are failing at it, but we will recover by eliminating this pointless ritual and getting back to producing software.
-C
Re: [VOTE] - Release 2.0.5-beta
Thanks a bunch Nathan, for clearly letting us know the Yahoo! team's perspective.

We are getting started on rolling upgrades from YARN side (Sid opened YARN-666) and I hear HDFS side is too.

We definitely need compatibility and testing kits. Have to get started on this.

Work-preserving restart on YARN side - we plan to scope down next.

Thanks,
+Vinod

On May 16, 2013, at 11:28 AM, Nathan Roberts wrote:

(initially responded on general@, sorry about that. copied here)

+1 (non-binding)

From my perspective:
* The key feature that will drive me to adopt 2.x is Rolling Upgrades
* In order to get to rolling upgrades, we need a compatibility story that is significantly better than we have today
** We need a comprehensive definition of what compatibility really means
** We need better testing in place to verify we're not breaking compatibility
** We need better definition and testing of what rolling upgrades really means. Rolling between bug-fix releases: Required. Rolling between minor releases: Required. Rolling between major releases: Desired.
** We need work-preserving restart on the YARN side. Restarting jobs isn't sufficient.
** ...
* Given that Rolling upgrades aren't there yet, and there is still work to be done to solidify the compatibility story, I'm ok with the feature window remaining open until these are in place, especially given the fact that the proposed features are likely to have non-zero impact on compatibility/rolling_upgrades.
* I'd certainly like a release with rolling upgrades as soon as possible, so I feel like the feature window needs to ramp down very quickly. Something like 2.0.5-beta in May with the current list of proposed features, then 2.0.6-beta in late summer with full rolling upgrade support and a solid compatibility story, would seem like a reasonable timeline. Once we have a beta release with rolling upgrades, I can look at pushing 2.x to some of our larger clusters.
Nathan Roberts
nrobe...@yahoo-inc.com

On 5/15/13 1:06 PM, Vinod Kumar Vavilapalli vino...@hortonworks.com wrote:

Seems like you forgot to bcc. Forwarding this to general.

Thanks,
+Vinod
Re: [VOTE] - Release 2.0.5-beta
Apologies for a bunch of delayed responses (and as such adding even more emails to this thread).

On Wed, May 15, 2013 at 4:47 PM, Arun C Murthy a...@hortonworks.com wrote:

My reading of your response is that while you appreciate the feedback Bigtop is providing you're not of the opinion that investigating the level of stability of Hadoop wrt. downstream any further than what is currently happening would be a worthy investment of Hadoop's community (or your personal for that matter) time?

Everyone is welcome to contribute in any and all manner. I can't speak for everyone. It would be useful if Bigtop could run regressions on releases here consistently.

We've also talked in the past about running Bigtop on branch-2, nightly. Is that something you could help with? You'd earn my personal gratitude.

There's a bunch of stuff that Bigtop can offer wrt. infrastructure and existing functionality that helps with integration testing of Hadoop. There's 10x more stuff we can do if folks other than Bigtop members would be willing to pitch in.

I think a good closure to this discussion would be to identify things that we can do in Bigtop to help Hadoop stabilize quicker and also identify anybody who's potentially willing to help implement these ideas.

I'll try to collect all the good points that you and others have made on this thread wrt. this approach and will fork it into a separate discussion shortly.

Thanks,
Roman.
Re: [VOTE] - Release 2.0.5-beta
BCC: general@

Since we recognize now that this is a vote to overrule a previous decision, I am referring to Vinod's note on general (http://s.apache.org/h7x): should this be brought to the attention of the Board? I don't remember any precedents of this kind in Hadoop history. But other projects may have had such experience. A clarification on categorizing this action and on voting practices from ASF may help.

Thanks,
--Konstantin
Re: [VOTE] - Release 2.0.5-beta
On Wed, May 15, 2013 at 10:57 AM, Arun C Murthy a...@hortonworks.com wrote:

"To get past all of this confusion, I'd like to present an alternate, specific proposal for consideration. I propose we continue the original plan and make a 2.0.5-beta release by May end with the following content:"

If you intend to nullify a prior vote then you should be very explicit, e.g., making that a clause in the new proposal.

# nullify the result of the vote on general@ that started with message-id xxx...

Also, why is this on common-dev? Isn't this list just for discussion of things that happen in the hadoop-common tree?

http://hadoop.apache.org/mailing_lists.html#Common

"all attempts at calm, reasoned, civil discussion based on technical arguments have come to naught"

Wow. The folks you disagreed with there had absolutely no impact on your thinking about this release?

Release numbers are cheap. We shouldn't fight over them.

Doug
Re: [VOTE] - Release 2.0.5-beta
Guys,

this is a pretty long email with all the details I can think of on how Bigtop can help stabilization efforts of Hadoop 2.x. A lot of this information is required background. I really, really encourage everyone who's thinking of contributing to this effort to read it through. Once again, I do apologize for its size.

Matt, Andrew, you both brought up very good points, so let me summarize a few things wrt. Bigtop. I'm also CCing Bigtop dev ML so that everybody who's interested in pitching in could discuss the matter further over there.

On Wed, May 15, 2013 at 9:25 PM, Andrew Purtell apurt...@apache.org wrote:

The other comment on this thread that suggests ASF governance structures being inadequate for negotiating changes in a large ecosystem might be on to something, but at the same time Apache BigTop may be an effective ASF-native answer to that.

That is my sincere hope as well. Of course, Apache Bigtop is a project in its own right with its own release schedules, community of users, etc. What we are developing is not really an integration testsuite for Hadoop, it just so happens that without a stable Hadoop base we can't really deliver much. Hence we have a huge vested interest in having a predictable schedule for the stable releases of Hadoop. We also have all the interest in the world to help Hadoop achieve that.

At the same time we're a very small project juggling ~18 different open source components trying to put them into a coherent distribution. I don't think it is realistic to expect us to be able to do all the work that ideally we would need to do in order to provide the most feedback for the Hadoop stabilization exercise.

At the same time it would be really unfortunate if we all just give up on this collective goal. Ideally we can all pitch in to the extent we believe in the need for a stable Hadoop 2.x code line out there.
I'll elaborate on what exactly Bigtop can contribute a bit later, and I would expect all the folks who'd be willing to pitch in in a particular area to reach out to us either here or on the Bigtop ML.

On Wed, May 15, 2013 at 4:54 PM, Matt Foley ma...@apache.org wrote:

Roman, what is your model for how test results from Bigtop should feed back into Hadoop-2 development? With the understanding that (a) software does have bugs, and (b) you're not going to get an SLA on community-sponsored software, what are your ideas for how to close the loop better? Would CI runs of Bigtop against branch-2 be feasible, as Arun suggests? How should we accommodate changes in individual components (Hadoop Core, but others as well) that may require changes in one or more other components? How does Bigtop keep doing a viable nightly build in that chaotic environment? Is this a previously solved problem?

All excellent questions! Here's my laundry list of what Bigtop can offer today:

#0 a publicly available continuous integration Jenkins instance that runs on EC2 (because of Cloudera's gracious support of our project) and ties the rest of the Bigtop infrastructure together: http://bigtop01.cloudera.org:8080/ The benefit of having this infrastructure in the open is pretty clear -- just like with builds.apache.org, if there are failures/etc. anybody who's interested can jump on it and start making progress.
#1 a continuous integration build of all the components comprising the 'current' trunk of Apache Bigtop, all the way up to producing easy to install packages for the following Linux platforms: http://bigtop01.cloudera.org:8080/view/Bigtop-trunk/job/Bigtop-trunk-Repository/ Basically the above link allows one to install nightly builds of the Apache Bigtop Hadoop distribution as easily as typing 'yum install hadoop-conf-pseudo'

#2 a potential for 'tracking' builds all the way to packages of each individual component: http://bigtop01.cloudera.org:8080/view/Upstream-tests/ Basically this allows one to easily install the base, fully tested distribution of Hadoop (let's say Bigtop 0.5.0), upgrade just one component and see how it fares. Currently these builds are ad hoc, but I'm trying to work with the respective upstream communities to figure out what branches of development they would be interested in testing that way. This is one of the things that Arun and I talked about wrt. hooking up Bigtop Jenkins to branch-2 on a continuous basis. I wish I had time to do that; I honestly simply don't. I might in a few weeks, but again, if anybody is willing to pitch in and help -- that'll be greatly appreciated.

#3 a collection of puppet recipes that allow one to deploy a packaged Bigtop distro (either from #1 or #2) on a fully distributed cluster.

#4 an existing collection of integration tests (~200) for all the components we've got in our stack:
Re: [VOTE] - Release 2.0.5-beta
Guys, I guess what you're missing is that Bigtop isn't a testing framework for Hadoop. It is a stack framework that verifies that components are dealing with each other nicely. Every single stack is different: Bigtop 0.5.0 differs from 0.6.0, and so on. Bigtop - as any other ASF project - has its releases that might or might not be aligned with a particular version of Hadoop. Hence, a reference ('etalon') stack needs to be defined first and foremost.

Before we even start talking about running it nightly (another question is on what hardware, let's not get there for now) let's understand who can help with triaging test failures. Downstreams, Hadoop or Bigtop? Judging by a number of other emails, there are a number of people on this list who care plenty about integration issues. Any volunteers to help with integration testing in the open?

Is this a previously solved problem?

Yes. The problem is solved by separating actively developed (aka unstable) releases from more mature and less volatile ones. This is what was concluded two days ago in this voting thread http://s.apache.org/MnU

Cos

On Wed, May 15, 2013 at 04:54PM, Matt Foley wrote:

Roman, what is your model for how test results from Bigtop should feed back into Hadoop-2 development? With the understanding that (a) software does have bugs, and (b) you're not going to get an SLA on community-sponsored software, what are your ideas for how to close the loop better? Would CI runs of Bigtop against branch-2 be feasible, as Arun suggests? How should we accommodate changes in individual components (Hadoop Core, but others as well) that may require changes in one or more other components? How does Bigtop keep doing a viable nightly build in that chaotic environment? Is this a previously solved problem? Thanks, --Matt

On Wed, May 15, 2013 at 4:47 PM, Arun C Murthy a...@hortonworks.com wrote:

On May 15, 2013, at 3:50 PM, Roman Shaposhnik wrote:

Arun, am I reading your answer to my binary question correctly? It is a 'no'.

No.
My reading of your response is that while you appreciate the feedback Bigtop is providing, you're not of the opinion that investigating the level of stability of Hadoop wrt. downstream any further than what is currently happening would be a worthy investment of the Hadoop community's (or your personal, for that matter) time?

Everyone is welcome to contribute in any and all manner. I can't speak for everyone. It would be useful if Bigtop could run regressions on releases here consistently. We've also talked in the past about running Bigtop on branch-2, nightly. Is that something you could help with? You'd earn my personal gratitude.

thanks, Arun

Thanks, Roman.

On Wed, May 15, 2013 at 3:02 PM, Arun C Murthy a...@hortonworks.com wrote:

Roman, Furthermore, before we rush into finding flaws and scaring kids at night it would be useful to remember one thing: Software has *bugs*. We can't block any release till the entire universe validates it; in fact they won't validate it if we don't release, since we are at the bottom of the stack. Any help prior to the release is welcome; I know people who work for the same employer as I do have plans to do further testing after we freeze APIs via the beta release(s). I hope and pray others can join this effort - thanks to everyone who already has.

Again, freezing APIs and protocols is the primary aim of 2.0.5-beta. There are no guarantees it's 100% bug-free; we can never make such guarantees anyway. If, and when, we find bugs with 2.0.5-beta I'm more than happy to quickly turn around and make more releases (2.0.6-beta, 2.0.7-beta). Obviously I'll make a call on which bugs are critical - feedback to help me decide is, as always, welcome. I've been clear, many times, that we might need more than one beta release to iron out bugs etc. None of this should be a surprise - this has happened many, many times in the lifetime of this and other projects. 2.0.3-alpha vis-a-vis 2.0.4-alpha is the most recent example - it won't be the last.
So, I hope, concludes this meme.

thanks, Arun

On May 15, 2013, at 2:20 PM, Arun C Murthy wrote:

Great summary, thanks Vinod.

On May 15, 2013, at 2:14 PM, Vinod Kumar Vavilapalli wrote:

Roman, I keep hearing this same argument again and again. Should've refuted it earlier. Please list down all the issues that BigTop ran into *because of* new features. You continue to argue that new features are destabilizing 2.0.*, which I don't agree with at all. 2.0.3-alpha was the last time major features got merged in, and we found blockers irrespective of those.

MAPREDUCE-5240 specifically isn't due to any feature merge. This was a bug. I'd say this is a long standing bug in 2.0.x. You sure this passed in 2.0.3? Even so, this is mostly broken by another bug-fix and *not* because of
Re: [VOTE] - Release 2.0.5-beta
On May 15, 2013, at 2:54 PM, Roman Shaposhnik wrote:

This is not my argument at all. I apologize if somehow I failed to communicate it, but here's what my argument boils down to: given *my* experience with the Hadoop 2.0.x series and Bigtop releases, every time I try a different release of Hadoop 2.0.x I run into issues that scare me. They scare me because they are so basic yet they make components like Sqoop and Oozie (and I believe Giraph on one occasion) pretty much DOA for the YARN-based mapreduce implementation.

Why should they *scare* *you*? As Stevel and Arun also pointed out in other mails in this thread, we have no way of finding all the bugs. What is basic in your environment isn't basic in mine. Just today I ran into a *basic* issue of not being able to run a secure oozie setup on top of one of the stable 1.x releases in one of the usual environments. Now if I shared your level of concern, I should be *scared* about why none of the testing we did in the past 3 years managed to uncover it. Or maybe why BigTop is not able to help find these issues for us, if at all.

In my mind, what that translates into is the fact that nobody did *any* real testing of a particular downstream component running on a given Hadoop 2.0.x release. Like I said -- the issues so far make the components in question DOA.

Thanks for *finally* adding this bit about a particular downstream component. It is very likely none of us tested Sqoop on top of 2.0.3/4-alpha. But you know what, if BigTop didn't exist, someone from the Sqoop community would have nudged us. And nobody will really need to be *scared* - all of this is expected in the alpha life cycle.

How many more issues like that one (regardless of how they originated) are in branch-2? Wouldn't we want to know before declaring Hadoop 2.0.5 beta?

We want to know, and you know what, we already know some of them. Lots of them. You can find them in the issue tracker if you ever want to.
For example in YARN: https://issues.apache.org/jira/secure/IssueNavigator.jspa?reset=true&jqlQuery=project+%3D+YARN+AND+status+%3D+Open+ORDER+BY+priority+DESC&mode=hide

There are a total of 258 open tickets, 10 critical, 253 major (and if categorized will turn into critical and blocker issues). And particularly 23 against 2.0.4-alpha and 34 against 2.0.3-alpha.

So, please. With due respect, BigTop isn't the only project discovering bugs which make downstream components DOA.

Thanks, +Vinod
Re: [VOTE] - Release 2.0.5-beta
On Wed, May 15, 2013 at 10:52PM, Suresh Srinivas wrote:

Assuming that you are talking about HDFS features when you say features going into a beta on a very short timetable and laundry list etc,

No, that would not be a correct assumption.

So these features are not something that are impulsively developed and irresponsibly pushed to a release. They have gone through considerable testing and have been developed over a long time.

There is no need to reframe my comment in combative terms and read in insults that are not there.

No insult taken. But I want to make a case that features are not proposed lightly and that due diligence is done both during development and testing.

As I read Arun's mail the plan is to integrate several feature branches into branch-2. That would of course result in brand new never before tested code. I do not believe that should have the label alpha. This is just my personal opinion. Shit happens when commits happen. - You know this as well as I. That does not mean I am here to attack or insult you by pointing that out and suggesting more measured alternatives. There is little to gain in engaging in debate club. If you are not interested in hearing these opinions, that is fine, I have received that message already, nothing further need be said.

Andy, I value your feedback. I am only trying to allay the concerns by sharing my perspective.

What I am seeing time and again in these endless discussion threads is this:
a) downstream or bigtop: we are seeing a bunch of integration issues with every new feature introduced, sometimes even with a single commit
b) feature developers: no-no, these features are developed for a long time, tests are run, no need to be concerned

The same pattern is repeated time and again. The only conclusion that I can make out of it is that either the meaning of integration testing is different for a) and b) or that a) and b) are using very different validation mechanisms. Which one is that? I am puzzled.
Bugs are quite expected - Andrew put it very eloquently, actually. But you can only deal with them effectively if the flow of changes is controlled, e.g. via smaller and focused releases. The development process has to be converging, not fanning out.

Case in point? Sure. 2.0.3-alpha had to be followed by the 2.0.4-alpha release (officially called a bugfix release); it - in turn - requires 2.0.4.1-alpha to make it suitable for other downstream components. So, it took 2 releases to simply fix issues caused by a bunch of bugfixes and no major new features being committed into 2.0.3-alpha.

These are just cold facts - not attacking anyone's ego here.

Cos
Re: [VOTE] - Release 2.0.5-beta
-0 (Binding)

I have made my opinion known in the previous thread/vote, but I have spent enough time discussing this and need to get back to my day job. If the community is able to get snapshots and everything else in this list merged and stable without breaking the stack above it in two weeks it will be wonderful, but I have serious doubts that it is going to actually be possible.

--Bobby
Re: [VOTE] - Release 2.0.5-beta
On 15 May 2013 23:19, Konstantin Boudnik c...@apache.org wrote:

Guys, I guess what you're missing is that Bigtop isn't a testing framework for Hadoop. It is a stack framework that verifies that components are dealing with each other nicely.

which to me means "some form of integration test"

Every single stack is different: Bigtop 0.5.0 differs from 0.6.0, and so on. Bigtop - as any other ASF project - has its releases that might or might not be aligned with a particular version of Hadoop. Hence, a reference ('etalon') stack needs to be defined first and foremost. Before we even start talking about running it nightly (another question is on what hardware, let's not get there for now) let's understand who can help with triaging test failures. Downstreams, Hadoop or Bigtop? Judging by a number of other emails, there are a number of people on this list who care plenty about integration issues. Any volunteers to help with integration testing in the open?

As I said at the HUG, I want to get the non-swift-FS specific tests that do things like run Pig jobs against any FS in, though I also need a home for some very swift-specific partitioned file tests.

Is this a previously solved problem? Yes. The problem is solved by separating actively developed (aka unstable) releases from more mature and less volatile ones.

not in filesystems. If you look at how long it took ext4 to be implemented and then adopted, you can see that nobody put data they cared about on it until they were happy that what you put in on write() came back on a read() [and stat() returned the amount of data, [seek(X);read()] returned the byte at offset X and other little details that those of us writing tests for the filesystem APIs care about]
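The filesystem checks listed above (write/read round trip, stat() length, seek-then-read) can be sketched as a tiny contract test. This is only an illustration against a local directory using the Python standard library, not the Hadoop FileSystem API or any existing Hadoop or Bigtop test; the function name check_basic_fs_contract and the file name are made up for the example.

```python
import os
import tempfile


def check_basic_fs_contract(directory, payload=b"0123456789abcdef"):
    """Write a file, then verify read(), stat() and seek()+read() agree with it.

    Hypothetical helper for illustration only; it exercises whatever
    filesystem backs `directory`.
    """
    path = os.path.join(directory, "contract-check.bin")
    with open(path, "wb") as out:
        out.write(payload)
        out.flush()
        os.fsync(out.fileno())

    # stat() must report exactly the number of bytes written.
    assert os.stat(path).st_size == len(payload)

    with open(path, "rb") as src:
        # What went in on write() must come back on read().
        assert src.read() == payload
        # seek(X); read() must return the byte at offset X.
        for offset in range(len(payload)):
            src.seek(offset)
            assert src.read(1) == payload[offset:offset + 1]

    os.remove(path)
    return True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as scratch:
        check_basic_fs_contract(scratch)
```

Pointing the directory argument at a mount backed by a different filesystem would run the same assertions there, which is the sense in which such tests are filesystem-independent.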
Re: [VOTE] - Release 2.0.5-beta
Cos,

On May 15, 2013, at 11:38 PM, Konstantin Boudnik wrote:

What I am seeing time and again in these endless discussion threads is this:
a) downstream or bigtop: we are seeing a bunch of integration issues with every new feature introduced, sometimes even with a single commit
b) feature developers: no-no, these features are developed for a long time, tests are run, no need to be concerned

It's unfortunate you are continuing to take digs at people who actually are moving the project forward. The 'cold facts' you describe do not give any credence to your conclusions.

Let's review the bugs Bigtop has found over the course of this year; Vinod pointed them out:

I quickly checked other bugs you reported in 2.0.x:
- MAPREDUCE-5088 was caused by the fix for HADOOP-9299 which was again a long standing issue in 2.0.x
- MAPREDUCE-3728 is similar
- MAPREDUCE-5117 is similar
- MAPREDUCE-4219 was a security related feature request from you.
- MAPREDUCE-3916 was because of new proxy-server added.

And now, MAPREDUCE-5240 - again, a long standing bug.

Given the above, please help me understand how 'feature developers' are hurting? I've repeatedly asked you or Roman to run CI on branch-2; instead of stepping up to help on a concrete proposal you continue to take digs at a number of contributors here. Hopefully this will stop.

Arun
Re: [VOTE] - Release 2.0.5-beta
(initially responded on general@, sorry about that. copied here)

+1 (non-binding)

From my perspective:
* The key feature that will drive me to adopt 2.x is Rolling Upgrades
* In order to get to rolling upgrades, we need a compatibility story that is significantly better than we have today
** We need a comprehensive definition of what compatibility really means
** We need better testing in place to verify we're not breaking compatibility
** We need better definition and testing of what rolling upgrades really means. Rolling between bug-fix releases Required, Rolling between minor releases Required, Rolling between major releases Desired.
** We need work-preserving restart on the YARN side. Restarting jobs isn't sufficient.
** ...
* Given that Rolling upgrades aren't there yet, and there is still work to be done to solidify the compatibility story, I'm ok with the feature window remaining open until these are in place, especially given the fact that the proposed features are likely to have non-zero impact on compatibility/rolling_upgrades.
* I'd certainly like a release with rolling upgrades as soon as possible, so I feel like the feature window needs to ramp down very quickly. Something like 2.0.5-beta in May with the current list of proposed features, then 2.0.6-beta in late summer with full rolling upgrade support and a solid compatibility story, would seem like a reasonable timeline. Once we have a beta release with rolling upgrades, I can look at pushing 2.x to some of our larger clusters.

Nathan Roberts nrobe...@yahoo-inc.com

On 5/15/13 1:06 PM, Vinod Kumar Vavilapalli vino...@hortonworks.com wrote: Seems like you forgot to bcc. Forwarding this to general. Thanks, +Vinod On May 15, 2013, at 10:57 AM, Arun C Murthy wrote: Folks, A considerable number of people have expressed confusion regarding the recent vote on 2.0.5, beta status etc. given lack of specifics, the voting itself (validity of the vote itself, whose votes are binding) etc.
IMHO technical arguments (incompatibility b/w 2.0 2.1, current stability of 3 features under debate etc.) have been lost in the discussion in favor of non-technical (almost dramatic) nuances such as seizing the moment. There is now dangerous talk of tolerating incompatibility b/w 2.0 and 2.1) - this is a red flag for me; particularly when there are just 3 features being debated and active committers and contributors are confident of and ready to stand by their work. All patches, I believe, are ready to be merged in the the next few days per discussions on jira. This will, clearly, not delay the other API work which everyone agrees is crucial. As a result, I feel no recourse but to restart a new vote - all attempts at calm, reasoned, civil discussion based on technical arguments have come to naught - I apologize for the thrash caused to everyone's attention. To get past all of this confusion, I'd like to present an alternate, specific proposal for consideration. I propose we continue the original plan and make a 2.0.5-beta release by May end with the following content: # HDFS-347 # HDFS Snapshots # Windows support # Necessary final API/protocol changes such as: * Final YARN API changes: YARN-386 * MR Binary Compatibility: MAPREDUCE-5108 * Final RPC cleanup: HADOOP-8990 People working on the above features have all expressed considerable comfort with them and are ready to stand-by to help expedite any necessary bug-fixes etc. to get to stabilization quickly. I'm confident we can get this release out by end of May. This sets stage for a hadoop-2.x GA release right after with some more testing - this means I think I can quickly turn around and make bug-fix releases as necessary right after 2.0.5-beta. I request that people consider helping out with this plan and sign up to help push hadoop-2.x to stability as outlined above. 
I believe this will help achieve our shared goals of quickly stabilizing hadoop-2 and help ensure we can support it for forseeable future in a compatible manner for the benefit of our users and downstream projects. Please vote, the vote will run the normal 7 days. Obviously, I'm +1. thanks, Arun PS: To keep this discussion grounded in technical details I've moved this to dev@ (bcc general@).
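The rolling-upgrade expectations in Nathan's list above (bug-fix and minor boundaries required, major boundaries desired) can be written down as a small policy function. This is a rough sketch, not anything that exists in Hadoop: the names are hypothetical, and it assumes versions of the form major.minor.bugfix with an optional label such as -alpha or -beta, which is ignored.

```python
import re
from enum import Enum


class RollingUpgrade(Enum):
    REQUIRED = "required"
    DESIRED = "desired"
    NOT_APPLICABLE = "not applicable"


def parse_version(version):
    """Split '2.0.5-beta' into (2, 0, 5); the trailing label is ignored (assumption)."""
    match = re.match(r"^(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError("unrecognized version string: %r" % version)
    return tuple(int(part) for part in match.groups())


def rolling_upgrade_expectation(old, new):
    """Classify an upgrade from `old` to `new` per the list in the message above."""
    old_v, new_v = parse_version(old), parse_version(new)
    if new_v <= old_v:
        # Same numeric version or a downgrade: the list says nothing about it.
        return RollingUpgrade.NOT_APPLICABLE
    if new_v[0] != old_v[0]:
        return RollingUpgrade.DESIRED  # crossing a major release
    return RollingUpgrade.REQUIRED     # minor or bug-fix release


if __name__ == "__main__":
    for pair in [("2.0.4-alpha", "2.0.5-beta"), ("2.0.5-beta", "2.1.0"), ("2.1.0", "3.0.0")]:
        print(pair, rolling_upgrade_expectation(*pair).value)
```

A compatibility test matrix could then be driven from such a function, running rolling-upgrade tests for every pair that comes back as required.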
[VOTE] - Release 2.0.5-beta
Folks,

A considerable number of people have expressed confusion regarding the recent vote on 2.0.5, beta status etc. given lack of specifics, the voting itself (validity of the vote itself, whose votes are binding) etc.

IMHO technical arguments (incompatibility b/w 2.0 and 2.1, current stability of 3 features under debate etc.) have been lost in the discussion in favor of non-technical (almost dramatic) nuances such as seizing the moment. There is now dangerous talk of tolerating incompatibility b/w 2.0 and 2.1 - this is a red flag for me; particularly when there are just 3 features being debated and active committers and contributors are confident of and ready to stand by their work. All patches, I believe, are ready to be merged in the next few days per discussions on jira. This will, clearly, not delay the other API work which everyone agrees is crucial.

As a result, I feel no recourse but to restart a new vote - all attempts at calm, reasoned, civil discussion based on technical arguments have come to naught - I apologize for the thrash caused to everyone's attention.

To get past all of this confusion, I'd like to present an alternate, specific proposal for consideration. I propose we continue the original plan and make a 2.0.5-beta release by May end with the following content:

# HDFS-347
# HDFS Snapshots
# Windows support
# Necessary final API/protocol changes such as:
* Final YARN API changes: YARN-386
* MR Binary Compatibility: MAPREDUCE-5108
* Final RPC cleanup: HADOOP-8990

People working on the above features have all expressed considerable comfort with them and are ready to stand by to help expedite any necessary bug-fixes etc. to get to stabilization quickly. I'm confident we can get this release out by end of May. This sets the stage for a hadoop-2.x GA release right after with some more testing - this means I think I can quickly turn around and make bug-fix releases as necessary right after 2.0.5-beta.

I request that people consider helping out with this plan and sign up to help push hadoop-2.x to stability as outlined above. I believe this will help achieve our shared goals of quickly stabilizing hadoop-2 and help ensure we can support it for the foreseeable future in a compatible manner for the benefit of our users and downstream projects.

Please vote, the vote will run the normal 7 days. Obviously, I'm +1.

thanks, Arun

PS: To keep this discussion grounded in technical details I've moved this to dev@ (bcc general@).
Re: [VOTE] - Release 2.0.5-beta
This is the course that we were taking before the unfortunate disruption. We should be able to meet both the stabilization goals and compatibility goals quickly with this proposal. I personally am willing to invest a lot of time in testing, code reviews and work on adding missing functionality to ensure the goal of this proposal is successful.

+1.
Re: [VOTE] - Release 2.0.5-beta
Seems like you forgot to bcc. Forwarding this to general. Thanks, +Vinod On May 15, 2013, at 10:57 AM, Arun C Murthy wrote: Folks, A considerable number of people have expressed confusion regarding the recent vote on 2.0.5, beta status etc. given lack of specifics, the voting itself (validity of the vote itself, whose votes are binding) etc. IMHO technical arguments (incompatibility b/w 2.0 2.1, current stability of 3 features under debate etc.) have been lost in the discussion in favor of non-technical (almost dramatic) nuances such as seizing the moment. There is now dangerous talk of tolerating incompatibility b/w 2.0 and 2.1) - this is a red flag for me; particularly when there are just 3 features being debated and active committers and contributors are confident of and ready to stand by their work. All patches, I believe, are ready to be merged in the the next few days per discussions on jira. This will, clearly, not delay the other API work which everyone agrees is crucial. As a result, I feel no recourse but to restart a new vote - all attempts at calm, reasoned, civil discussion based on technical arguments have come to naught - I apologize for the thrash caused to everyone's attention. To get past all of this confusion, I'd like to present an alternate, specific proposal for consideration. I propose we continue the original plan and make a 2.0.5-beta release by May end with the following content: # HDFS-347 # HDFS Snapshots # Windows support # Necessary final API/protocol changes such as: * Final YARN API changes: YARN-386 * MR Binary Compatibility: MAPREDUCE-5108 * Final RPC cleanup: HADOOP-8990 People working on the above features have all expressed considerable comfort with them and are ready to stand-by to help expedite any necessary bug-fixes etc. to get to stabilization quickly. I'm confident we can get this release out by end of May. 
This sets stage for a hadoop-2.x GA release right after with some more testing - this means I think I can quickly turn around and make bug-fix releases as necessary right after 2.0.5-beta. I request that people consider helping out with this plan and sign up to help push hadoop-2.x to stability as outlined above. I believe this will help achieve our shared goals of quickly stabilizing hadoop-2 and help ensure we can support it for forseeable future in a compatible manner for the benefit of our users and downstream projects. Please vote, the vote will run the normal 7 days. Obviously, I'm +1. thanks, Arun PS: To keep this discussion grounded in technical details I've moved this to dev@ (bcc general@).
Re: [VOTE] - Release 2.0.5-beta
good, glad we are back on track again. BTW, we have already started build (IBM and OpenJDK SDK), unit test, and limited integration testing on x86 and POWER, results are promising.

Best Regards
Amir Sanjar
System Management Architect
PowerLinux Open Source Hadoop development lead
IBM Senior Software Engineer
Re: [VOTE] - Release 2.0.5-beta
Hi Arun, Can we add HADOOP-9517 to the list - having compatibility guidelines should help us support users and downstream projects better? Thanks Karthik
Re: [VOTE] - Release 2.0.5-beta
Do we need to add YARN-397? Thanks. -- Alejandro
Re: [VOTE] - Release 2.0.5-beta
Thanks for laying out a very specific release plan, easy to vote on. I am watching most of YARN and MAPREDUCE changes, glad that those are called out specifically. Apart from that, we have - RM restart which is mostly already committed but needs a couple more in - a couple of scheduling related APIs which fall under the protocol changes you mentioned, that are close to commit - a couple of security issues which aren't exactly features. Just calling them out specifically so that there is no ambiguity. +1 (binding for this) Thanks, +Vinod Kumar Vavilapalli
Re: [VOTE] - Release 2.0.5-beta
- RM restart which is mostly already committed but needs a couple more in - a couple of scheduling related APIs which fall under the protocol changes you mentioned, that are close to commit - a couple of security issues which aren't exactly features. I should have been clearer: - RM restart stuff is tracked at YARN-128 - scheduling APIs tracked at YARN-397 - security stuff tracked at YARN-47 Thanks, +Vinod
Re: [VOTE] - Release 2.0.5-beta
Yes to all. As long as we are making timely and compatible progress, we don't need to debate individual issues here. Let's continue discussion on relevant jiras. thanks, Arun
Re: [VOTE] - Release 2.0.5-beta
I also feel that some of YARN-397 should go in. If you also feel so, please put in a +1 to state your intention. Thanks, +Vinod
Re: [VOTE] - Release 2.0.5-beta
+1
Re: [VOTE] - Release 2.0.5-beta
+1 (binding). I think it's important to maintain the release continuity; otherwise we could end up with the 0.20.2 / 0.20.200 problem all over again (parallel stable dev tracks without a parent-child relationship to each other, i.e. with disjoint subsets of functionality). I consider achieving a stable basis for API backward compat very important. And Arun is committing to hit beta in the very near future. --Matt
On Wed, May 15, 2013 at 1:16 PM, Bikas Saha bi...@hortonworks.com wrote: I am +1 to the proposal because it maintains the original cadence a bunch of us committers/contributors have been working with. Windows related changes have been made in a conservative manner so as not to destabilize the code base. The changes are being extensively tested and validated by community members, especially those from Microsoft. YARN-397 jiras are mainly enhancements that can be added in a backwards compatible manner. Would be great if some of them make it but I would not hold the release for them. Let us all make the effort to get the release out with all the long awaited and useful features as planned. Bikas
Re: [VOTE] - Release 2.0.5-beta
+1 (non-binding) Agreed with Bikas that we should get the scheduler API enhancements (YARN-397) in if we are able, but they don't need to be blockers because they will be backwards compatible. Arun, not sure whether your "Yes to all" already covered this, but I'd like to throw in support for the compatibility guidelines being a blocker.
On Wed, May 15, 2013 at 1:20 PM, eric baldeschwieler eri...@hortonworks.com wrote: +1
Re: [VOTE] - Release 2.0.5-beta
Arun, not sure whether your "Yes to all" already covered this, but I'd like to throw in support for the compatibility guidelines being a blocker. +1 to that. Definitely an overriding concern for me.
On Wed, May 15, 2013 at 1:25 PM, Sandy Ryza sandy.r...@cloudera.com wrote: +1 (non-binding) Agreed with Bikas that we should get the scheduler API enhancements (YARN-397) in if we are able, but they don't need to be blockers because they will be backwards compatible. Arun, not sure whether your "Yes to all" already covered this, but I'd like to throw in support for the compatibility guidelines being a blocker.
Re: [VOTE] - Release 2.0.5-beta
let's fork this thread into the appropriate ML and discuss the practical, achievable steps that can be included into the release criteria of Hadoop 2.0.5-beta Seems to me common-dev is the appropriate ML, and Arun has invited Jiras to include. Open a Jira with your suggested list, and we carry on the discussion from there. Does that work? On Wed, May 15, 2013 at 1:29 PM, Roman Shaposhnik r...@apache.org wrote: On Wed, May 15, 2013 at 10:57 AM, Arun C Murthy a...@hortonworks.com wrote: I propose we continue the original plan and make a 2.0.5-beta release by May end with the following content: I have a very basic question: what are the steps that we, as a community, are willing to undertake to ensure that our aggressive release schedule (end of May is exactly two weeks away) and our intent of actually calling this a beta release would be realistic? Please tell me if my expectations are incorrect, but to me the -beta would signify it being a 'safe' target for the downstream components. We're still finding *very* basic and *very* disruptive issues (MAPREDUCE-5240 is a good example) that essentially mean DOA for downstream that depends on this functionality. Are we comfortable with delivering 2.0.5-beta and later on starting to discover things like MAPREDUCE-5240 more or less accidentally? As I mentioned in a different thread -- there are a few things that Apache Bigtop can help with in that regard -- but they can only happen if we as a community agree that they need to happen before we can call Hadoop 2.x a beta release. If this sounds useful to the Hadoop community at large -- let's fork this thread into the appropriate ML and discuss the practical, achievable steps that can be included into the release criteria of Hadoop 2.0.5-beta as it is being discussed here. Thanks, Roman.
Re: [VOTE] - Release 2.0.5-beta
+1 (binding) on the proposal. 2-3 weeks doesn't sound too long a time, and we have many committers willing to be on-call to fix issues when they are discovered.
Re: [VOTE] - Release 2.0.5-beta
On Wed, May 15, 2013 at 1:29 PM, Matt Foley mfo...@hortonworks.com wrote: Arun, not sure whether your Yes to all already covered this, but I'd like to throw in support for the compatibility guidelines being a blocker. +1 to that. Definitely an overriding concern for me. +1 Likewise. Would be great to get more eyeballs on Karthik's patch on HADOOP-9517 if people haven't reviewed it already.
Re: [VOTE] - Release 2.0.5-beta
+1 (non-binding) on the proposal. On Wed, May 15, 2013 at 1:43 PM, Eli Collins e...@cloudera.com wrote: On Wed, May 15, 2013 at 1:29 PM, Matt Foley mfo...@hortonworks.com wrote: Arun, not sure whether your Yes to all already covered this, but I'd like to throw in support for the compatibility guidelines being a blocker. +1 to that. Definitely an overriding concern for me. +1 Likewise. Would be great to get more eyeballs on Karthik's patch on HADOOP-9517 if people haven't reviewed it already. -- Zhijie Shen Hortonworks Inc. http://hortonworks.com/
Re: [VOTE] - Release 2.0.5-beta
On Wed, May 15, 2013 at 1:36 PM, Matt Foley mfo...@hortonworks.com wrote: let's fork this thread into the appropriate ML and discuss the practical, achievable steps that can be included into the release criteria of Hadoop 2.0.5-beta Seems to me common-dev is the appropriate ML, Thanks. I'll stick to this thread then. and Arun has invited Jiras to include. Open a Jira with your suggested list, and we carry on the discussion from there. Does that work? But this is exactly my concern -- I don't have the list of JIRAs. In fact, there's work that needs to be done to arrive at the list of JIRAs that would be complete enough to give me confidence about something like MAPREDUCE-5240 (I'll stick to this example simply because I remember it by heart now ;-)). What I'm saying is this -- if nobody is willing to do this work outside of the very few folks who are part of Apache Bigtop then I have very little confidence in this proposal actually delivering on its promise of beta quality in Hadoop 2.0.5. The question I'm asking is actually quite simple: does the Hadoop community believe in investing in doing this work to COME UP with such a list? Or to ask it differently -- does the Hadoop community value the feedback that such work would provide to a degree that it would be made part of the release criteria for Hadoop 2.0.5-beta? This is really a binary question as far as I can tell. Thanks, Roman. P.S. There's a second level of discussion which is -- what exactly does that extra work entail -- but let's deal with the basic question first.
Re: [VOTE] - Release 2.0.5-beta
Roman, I keep this same argument again and again. Should've refuted earlier. Please list down all the issues that BigTop ran into *because of* new features. You continue to argue that new features are destabilizing 2.0.*, which I don't agree with at all. 2.0.3-alpha was the last time major features got merged in, and we found blockers irrespective of those. MAPREDUCE-5240 specifically isn't due to any feature merge. This was a bug. I'd say this is a long standing bug in 2.0.x. You sure this passed in 2.0.3? Even so, this is mostly broken by another bug-fix and *not* because of any feature. I quickly checked other bugs you reported in 2.0.x: - MAPREDUCE-5088 was caused by the fix for HADOOP-9299 which was again a long standing issue in 2.0.x - MAPREDUCE-3728 is similar - MAPREDUCE-5117 is similar - MAPREDUCE-4219 was a security related feature request from you. - MAPREDUCE-3916 was because of new proxy-server added. I am not arguing that new features *may* destabilize the branch, but you've repeatedly stated this as if that were a fact. Really appreciate the testing done by BigTop, but please don't distort the facts. Thanks, +Vinod On May 15, 2013, at 1:29 PM, Roman Shaposhnik wrote: Please tell me if my expectations are incorrect, but to me the -beta would signify it being a 'safe' target for the downstream components. We're still finding *very* basic and *very* disruptive issues (MAPREDUCE-5240 is a good example) that essentially mean DOA for downstream that depends on this functionality. Are we comfortable with delivering 2.0.5-beta and later on starting to discover things like MAPREDUCE-5240 more or less accidentally?
Re: [VOTE] - Release 2.0.5-beta
Typo, keep hearing* Thanks, +Vinod On May 15, 2013, at 2:14 PM, Vinod Kumar Vavilapalli wrote: Roman, I keep this same argument again and again. Should've refuted earlier.
Re: [VOTE] - Release 2.0.5-beta
Great summary, thanks Vinod. -- Arun C. Murthy Hortonworks Inc. http://hortonworks.com/
Re: [VOTE] - Release 2.0.5-beta
On 15 May 2013 10:57, Arun C Murthy a...@hortonworks.com wrote: Folks, A considerable number of people have expressed confusion regarding the recent vote on 2.0.5, beta status etc. given lack of specifics, the voting itself (validity of the vote itself, whose votes are binding) etc. IMHO technical arguments (incompatibility b/w 2.0 2.1, current stability of 3 features under debate etc.) have been lost in the discussion in favor of non-technical (almost dramatic) nuances such as seizing the moment. There is now dangerous talk of tolerating incompatibility b/w 2.0 and 2.1) - this is a red flag for me; particularly when there are just 3 features being debated and active committers and contributors are confident of and ready to stand by their work. All patches, I believe, are ready to be merged in the the next few days per discussions on jira. This will, clearly, not delay the other API work which everyone agrees is crucial. As a result, I feel no recourse but to restart a new vote - all attempts at calm, reasoned, civil discussion based on technical arguments have come to naught - I apologize for the thrash caused to everyone's attention. To get past all of this confusion, I'd like to present an alternate, specific proposal for consideration. I propose we continue the original plan and make a 2.0.5-beta release by May end with the following content: # HDFS-347 # HDFS Snapshots # Windows support # Necessary final API/protocol changes such as: * Final YARN API changes: YARN-386 * MR Binary Compatibility: MAPREDUCE-5108 * Final RPC cleanup: HADOOP-8990 People working on the above features have all expressed considerable comfort with them and are ready to stand-by to help expedite any necessary bug-fixes etc. to get to stabilization quickly. I'm confident we can get this release out by end of May. 
This sets stage for a hadoop-2.x GA release right after with some more testing - this means I think I can quickly turn around and make bug-fix releases as necessary right after 2.0.5-beta. I request that people consider helping out with this plan and sign up to help push hadoop-2.x to stability as outlined above. I believe this will help achieve our shared goals of quickly stabilizing hadoop-2 and help ensure we can support it for forseeable future in a compatible manner for the benefit of our users and downstream projects. Please vote, the vote will run the normal 7 days. Obviously, I'm +1. +1 (binding)
Re: [VOTE] - Release 2.0.5-beta
On Wed, May 15, 2013 at 2:14 PM, Vinod Kumar Vavilapalli vino...@hortonworks.com wrote: Please list down all the issues that BigTop ran into *because of* new features. Whether the bug is *because of* a new feature or not is a red herring for my argument. Please let's drop this distinction. I never used it. You continue to argue that new features are destabilizing 2.0.*, which I don't agree with at all. 2.0.3-alpha was the last time major features got merged in, and we found blockers irrespective of those. This is not my argument at all. I apologize if somehow I failed to communicate it, but here's what my argument boils down to: given *my* experience with the Hadoop 2.0.x series and Bigtop releases, every time I try a different release of Hadoop 2.0.x I run into issues that scare me. They scare me because they are so basic yet they make components like Sqoop and Oozie (and I believe Giraph on one occasion) pretty much DOA for the YARN-based MapReduce implementation. In my mind, what that translates into is the fact that nobody did *any* real testing of a particular downstream component running on a given Hadoop 2.0.x release. Like I said -- the issues so far make the components in question DOA. Effectively the onion of issues remains unpeeled, so to speak. What I'm asking on this thread (and somehow nobody is willing to give me a straight answer) is whether the Hadoop community is willing to invest in peeling this onion of issues somewhat more before declaring Hadoop 2.0.5 a beta release. Once again it is a binary question -- please give me an answer of yes or no. I am not arguing that new features *may* destabilize the branch, but you've repeatedly stated this as if that were a fact. Your list of issues is pretty complete (give or take a few that I didn't file but Cos and others did). And I'd be the first one to agree that it is not a large list of issues. What scares me is not its size, but how basic they are and how they block the *rest* of the testing completely. 
To be extra clear -- what scares me about something like MAPREDUCE-5240 is not whether it came as a result of a merge or was sitting there since day one. What scares me is that we identified it last week and yet Sqoop 2 is DOA in its presence. How many more issues like that one (regardless of how they originated) are in branch-2? Wouldn't we want to know before declaring Hadoop 2.0.5 beta? Now, knowing would require work -- that's what my argument is all about. Thanks, Roman.
Re: [VOTE] - Release 2.0.5-beta
Roman, Furthermore, before we rush into finding flaws and scaring kids at night it would be useful to remember one thing: Software has *bugs*. We can't block any release till the entire universe validates it, in fact they won't validate it if we don't release since we are at the bottom of the stack. Any help prior to the release is welcome; I know people who work for the same employer as I do have plans to do further testing after we freeze APIs via the beta release(s). I hope and pray others can join this effort - thanks to everyone who already has. Again, freezing APIs and protocols is the primary aim of 2.0.5-beta. There are no guarantees it's 100% bug-free, we can never make such guarantees anyway. If, and when, we find bugs with 2.0.5-beta I'm more than happy to quickly turn around and make more releases (2.0.6-beta, 2.0.7-beta). Obviously I'll make a call on which bugs are critical - feedback to help me decide is, as always, welcome. I've been clear, many times, that we might need more than one beta release to iron out bugs etc. None of this should be a surprise - this has happened many, many times in the lifetime of this and other projects. 2.0.3-alpha vis-a-vis 2.0.4-alpha is the most recent example - it won't be the last. So, I hope, concludes this meme. thanks, Arun -- Arun C. Murthy Hortonworks Inc. http://hortonworks.com/
Re: [VOTE] - Release 2.0.5-beta
+1 (binding) on the proposal. However, the value we get from these release plan votes is dubious, to put it mildly. The surrounding discussion has cost more than it is worth, and votes on executive summaries of releases discourage the sort of detailed collaboration we're trying to create. It replaces development with zero-sum struggles over abstractions. This is, in effect, another poll about the direction we're taking 2.x. If we can't reach consensus on development directions without voting, that's more evidence that the project should be split, IMO. -C
Re: [VOTE] - Release 2.0.5-beta
On 15 May 2013 15:02, Arun C Murthy a...@hortonworks.com wrote: Roman, Furthermore, before we rush into finding flaws and scaring kids at night it would be useful to remember one thing: Software has *bugs*. We can't block any release till the entire universe validates it, in fact they won't validate it if we don't release since are at the bottom of the stack. More subtly: we aren't going to find all the corner case situations until things ship into the hands of people whose {networks, configs, applications, hardware} are different. Marking something as -beta means more people will use it, and find those problems, at a time when it is still possible for a fast turnaround on fixes. What we are implicitly saying with a -beta tag is "ready for others to use", which in Hadoop's case means "doesn't lose data unless you do something suicidal" and "we're not going to move APIs on you". The gulf from -beta to shipping is usually much less dramatic than -alpha to -beta, as it happens when everyone is happy that the last beta is good enough to push out. -Steve (who will be at the HUG in Sunnyvale this evening)
Re: [VOTE] - Release 2.0.5-beta
Arun, am I reading your answer to my binary question correctly? It is a 'no'. My reading of your response is that while you appreciate the feedback Bigtop is providing you're not of an opinion that investigating the level of stability of Hadoop wrt. downstream any further than what is currently happening would be a worthy investment of Hadoop's community (or your personal, for that matter) time? Thanks, Roman.
Re: [VOTE] - Release 2.0.5-beta
Indeed. I think the root of the issue is deeper. ASF software practices are great for dealing with isolated, relatively contained projects like httpd, libreoffice, trac, etc. However, the Hadoop-based stack - essentially, software aimed at enterprises with larger-scale operations - is a different animal, one that requires balancing a huge number of moving parts and an unbroken flow of feedback upstream. Anyone who has delivered an enterprise-grade software system knows perfectly well how hard that is. However, in an environment where a release is pushed out in a rush (essentially causing DOA issues downstream), these issues get fixed in subsequent releases. Those, ironically, are likely to contain some other DOAs, because the integration testing - and I mean real-world integration system testing - is done by this small project, which is treated like a toy for adolescent kids. And there's no other real integration testing happening OPENLY on the full stack. Despite numerous claims, that is. Software comes with bugs - this is a somewhat expected phenomenon. However, bug fixes shouldn't be mixed with new features, which increases entropy in the system. In other words, the development process should fan in. A process with multiple consecutive stable releases helps to achieve that, and compatibility issues would be addressed by working on the next major release. The model above leaves downstream with a choice of sticking to 3.x or switching to 4.x, and so on. Whereas having a permanent alpha tag is a convenient way to control a software project that has effectively become a vendor-controlled effort. And yes - this leads to fragmentation, make no mistake about it. No one can sit on their hands for a year and wait until a usable release with all the great features comes about: a lot of organizations are just silently forking away to make their own environment suitable for production or sale; some of them might sporadically contribute something back. 
And of course - it is not the aim of an Apache project to produce a commercial-grade platform. Cos On Wed, May 15, 2013 at 02:54PM, Roman Shaposhnik wrote: On Wed, May 15, 2013 at 2:14 PM, Vinod Kumar Vavilapalli vino...@hortonworks.com wrote: Please list down all the issues that BigTop ran into *because of* new features. Whether the bug is *because of* a new feature or not is a red herring for my argument. Please let's drop this distinction. I never used it. You continue to argue that new features are destabilizing 2.0.*, which I don't agree with at all. 2.0.3-alpha was the last time major features got merged in, and we found blockers irrespective of those. This is not my argument at all. I apologize if somehow I failed to communicate it, but here's what my argument boils down to: given *my* experience with the Hadoop 2.0.x series and Bigtop releases, every time I try a different release of Hadoop 2.0.x I run into issues that scare me. They scare me because they are so basic yet they make components like Sqoop and Oozie (and I believe Giraph on one occasion) pretty much DOA for the YARN-based MapReduce implementation. In my mind, what that translates into is the fact that nobody did *any* real testing of a particular downstream component running on a given Hadoop 2.0.x release. Like I said -- the issues so far make the components in question DOA. Effectively the onion of issues remains unpeeled, so to speak. What I'm asking on this thread (and somehow nobody is willing to give me a straight answer) is whether the Hadoop community is willing to invest in peeling this onion of issues somewhat more before declaring Hadoop 2.0.5 a beta release. Once again it is a binary question -- please give me an answer of yes or no. I am not arguing that new features *may* destabilize the branch, but you've repeatedly stated this as if that were a fact. Your list of issues is pretty complete (give or take a few that I didn't file but Cos and others did). 
And I'd be the first one to agree that it is not a large list of issues. What scares me is not its size, but how basic they are and how they block the *rest* of the testing completely. To be extra clear -- what scares me about something like MAPREDUCE-5240 is not whether it came as a result of a merge or was sitting there since day one. What scares me is that we identified it last week and yet Sqoop 2 is DOA in its presence. How many more issues like that one (regardless of how they originated) are in branch-2? Wouldn't we want to know before declaring Hadoop 2.0.5 beta? Now, knowing would require work -- that's what my argument is all about. Thanks, Roman.
Re: [VOTE] - Release 2.0.5-beta
On May 15, 2013, at 3:27 PM, Chris Douglas wrote: +1 (binding) on the proposal. However, the value we get from these release plan votes is dubious, to put it mildly. The surrounding discussion has cost more than it is worth, and votes on executive summaries of releases discourage the sort of detailed collaboration we're trying to create. It replaces development with zero-sum struggles over abstractions. Agree, I propose we edit bylaws to do away with them for the future. This is, in effect, another poll about the direction we're taking 2.x. If we can't reach consensus on development directions without voting, that's more evidence that the project should be split, IMO. -C +1e100 Arun On Wed, May 15, 2013 at 2:21 PM, Steve Loughran ste...@hortonworks.com wrote: On 15 May 2013 10:57, Arun C Murthy a...@hortonworks.com wrote: Folks, A considerable number of people have expressed confusion regarding the recent vote on 2.0.5, beta status etc. given lack of specifics, the voting itself (validity of the vote itself, whose votes are binding) etc. IMHO technical arguments (incompatibility b/w 2.0 2.1, current stability of 3 features under debate etc.) have been lost in the discussion in favor of non-technical (almost dramatic) nuances such as seizing the moment. There is now dangerous talk of tolerating incompatibility b/w 2.0 and 2.1) - this is a red flag for me; particularly when there are just 3 features being debated and active committers and contributors are confident of and ready to stand by their work. All patches, I believe, are ready to be merged in the the next few days per discussions on jira. This will, clearly, not delay the other API work which everyone agrees is crucial. As a result, I feel no recourse but to restart a new vote - all attempts at calm, reasoned, civil discussion based on technical arguments have come to naught - I apologize for the thrash caused to everyone's attention. 
To get past all of this confusion, I'd like to present an alternate, specific proposal for consideration. I propose we continue the original plan and make a 2.0.5-beta release by May end with the following content: # HDFS-347 # HDFS Snapshots # Windows support # Necessary final API/protocol changes such as: * Final YARN API changes: YARN-386 * MR Binary Compatibility: MAPREDUCE-5108 * Final RPC cleanup: HADOOP-8990 People working on the above features have all expressed considerable comfort with them and are ready to stand-by to help expedite any necessary bug-fixes etc. to get to stabilization quickly. I'm confident we can get this release out by end of May. This sets stage for a hadoop-2.x GA release right after with some more testing - this means I think I can quickly turn around and make bug-fix releases as necessary right after 2.0.5-beta. I request that people consider helping out with this plan and sign up to help push hadoop-2.x to stability as outlined above. I believe this will help achieve our shared goals of quickly stabilizing hadoop-2 and help ensure we can support it for forseeable future in a compatible manner for the benefit of our users and downstream projects. Please vote, the vote will run the normal 7 days. Obviously, I'm +1. +1 (binding) -- Arun C. Murthy Hortonworks Inc. http://hortonworks.com/
Re: [VOTE] - Release 2.0.5-beta
On May 15, 2013, at 3:50 PM, Roman Shaposhnik wrote: Arun, am I reading yours answer to my binary question correctly? It is a 'no'. No. My reading of your response is that while you appreciate the feedback Bigtop is providing you're not of an opinion that investigating the level of stability of Hadoop wrt. downstream any further than what is currently happening would be a worthy investment of Hadoop's community (or your personal for that matter) time? Everyone is welcome to contribute in any and all manner. I can't speak for everyone. It would be useful if Bigtop could run regressions on releases here consistently. We've also talked in the past about running Bigtop on branch-2, nightly. Is that something you could help with? You'd earn my personal gratitude. thanks, Arun Thanks, Roman. On Wed, May 15, 2013 at 3:02 PM, Arun C Murthy a...@hortonworks.com wrote: Roman, Furthermore, before we rush into finding flaws and scaring kids at night it would be useful to remember one thing: Software has *bugs*. We can't block any release till the entire universe validates it, in fact they won't validate it if we don't release since are at the bottom of the stack. Any help prior to the release is welcome; I know people who work for the same employer as I do have plans to do further testing after we freeze apis via the beta release(s). I hope and pray others can join this effort - thanks to everyone who already has. Again, freezing APIs and protocols is the primary aim of 2.0.5-beta. There are no guarantees it's 100% bug-free, we can never make such guarantees anyway. If, and when, we find bugs with 2.0.5-beta I'm more than happy to quickly turn around and make more releases (2.0.6-beta, 2.0.7-beta). Obviously I'll make a call on which bugs are critical - feedback to help me decide is, as always, welcome. I've been clear, many times, that we might need more than one beta release to iron out bugs etc. 
None of this should be a surprise - this has happened many, many times in the lifetime of this and other projects. 2.0.3-alpha vis-a-vis 2.0.4-alpha is the most recent example - it won't be the last. So, I hope, concludes this meme. thanks, Arun On May 15, 2013, at 2:20 PM, Arun C Murthy wrote: Great summary, thanks Vinod. On May 15, 2013, at 2:14 PM, Vinod Kumar Vavilapalli wrote: Roman, I keep this same argument again and again. Should've refuted earlier. Please list down all the issues that BigTop ran into *because of* new features. You continue to argue that new features are destabilizing 2.0.*, which I don't agree with at all. 2.0.3-alpha was the last time major features got merged in, and we found blockers irrespective of those. MAPREDUCE-5240 specifically isn't due to any feature merge. This was a bug. I'd say this is a long standing bug in 2.0.x. You sure this passed in 2.0.3? Even so, this is mostly broken by another bug-fix and *not* because of any feature. I quickly checked other bugs you reported in 2.0.x: - MAPREDUCE-5088 was caused by the fix for HADOOP-9299 which was again a long standing issue in 2.0.x - MAPREDUCE-3728 is similar - MAPREDUCE-5117 is similar - MAPREDUCE-4219 was a security related feature request from you. - MAPREDUCE-3916 was because of new proxy-server added. I am not arguing that new features *may* destabilize the branch, but you've repeatedly stated this as if that were a fact. Really appreciate the testing done by BigTop, but please don't distort the facts. Thanks, +Vinod On May 15, 2013, at 1:29 PM, Roman Shaposhnik wrote: Please tell me if my expectations are incorrect, but to me the -beta would signify it being a 'safe' target for the downstream components. We're still finding *very* basic and *very* disruptive issues (MAPREDUCE-5240 is a good example) that essentially mean DOA for downstream that depends on this functionality. 
Are we comfortable with delivering 2.0.5-beta and later on starting to discover things like MAPREDUCE-5240 more or less accidentally? -- Arun C. Murthy Hortonworks Inc. http://hortonworks.com/
Re: [VOTE] - Release 2.0.5-beta
Roman, what is your model for how test results from Bigtop should feed back into Hadoop-2 development? With the understanding that (a) software does have bugs, and (b) you're not going to get an SLA on community-sponsored software, what are your ideas for how to close the loop better? Would CI runs of Bigtop against branch-2 be feasible, as Arun suggests? How should we accommodate changes in individual components (Hadoop Core, but others as well) that may require changes in one or more other components? How does Bigtop keep doing a viable nightly build in that chaotic environment? Is this a previously solved problem? Thanks, --Matt On Wed, May 15, 2013 at 4:47 PM, Arun C Murthy a...@hortonworks.com wrote: On May 15, 2013, at 3:50 PM, Roman Shaposhnik wrote: Arun, am I reading your answer to my binary question correctly? It is a 'no'. No. My reading of your response is that while you appreciate the feedback Bigtop is providing you're not of the opinion that investigating the level of stability of Hadoop wrt. downstream any further than what is currently happening would be a worthy investment of Hadoop's community (or your personal for that matter) time? Everyone is welcome to contribute in any and all manner. I can't speak for everyone. It would be useful if Bigtop could run regressions on releases here consistently. We've also talked in the past about running Bigtop on branch-2, nightly. Is that something you could help with? You'd earn my personal gratitude. thanks, Arun Thanks, Roman. On Wed, May 15, 2013 at 3:02 PM, Arun C Murthy a...@hortonworks.com wrote: Roman, Furthermore, before we rush into finding flaws and scaring kids at night it would be useful to remember one thing: Software has *bugs*. We can't block any release till the entire universe validates it, in fact they won't validate it if we don't release since we are at the bottom of the stack.
Any help prior to the release is welcome; I know people who work for the same employer as I do have plans to do further testing after we freeze APIs via the beta release(s). I hope and pray others can join this effort - thanks to everyone who already has. Again, freezing APIs and protocols is the primary aim of 2.0.5-beta. There are no guarantees it's 100% bug-free, we can never make such guarantees anyway. If, and when, we find bugs with 2.0.5-beta I'm more than happy to quickly turn around and make more releases (2.0.6-beta, 2.0.7-beta). Obviously I'll make a call on which bugs are critical - feedback to help me decide is, as always, welcome. I've been clear, many times, that we might need more than one beta release to iron out bugs etc. None of this should be a surprise - this has happened many, many times in the lifetime of this and other projects. 2.0.3-alpha vis-a-vis 2.0.4-alpha is the most recent example - it won't be the last. So, I hope, concludes this meme. thanks, Arun On May 15, 2013, at 2:20 PM, Arun C Murthy wrote: Great summary, thanks Vinod. On May 15, 2013, at 2:14 PM, Vinod Kumar Vavilapalli wrote: Roman, I keep seeing this same argument again and again. Should've refuted it earlier. Please list all the issues that BigTop ran into *because of* new features. You continue to argue that new features are destabilizing 2.0.*, which I don't agree with at all. 2.0.3-alpha was the last time major features got merged in, and we found blockers irrespective of those. MAPREDUCE-5240 specifically isn't due to any feature merge. This was a bug. I'd say this is a long standing bug in 2.0.x. You sure this passed in 2.0.3? Even so, this is mostly broken by another bug-fix and *not* because of any feature.
I quickly checked other bugs you reported in 2.0.x: - MAPREDUCE-5088 was caused by the fix for HADOOP-9299 which was again a long standing issue in 2.0.x - MAPREDUCE-3728 is similar - MAPREDUCE-5117 is similar - MAPREDUCE-4219 was a security related feature request from you. - MAPREDUCE-3916 was because of the newly added proxy-server. I am not disputing that new features *may* destabilize the branch, but you've repeatedly stated this as if that were a fact. Really appreciate the testing done by BigTop, but please don't distort the facts. Thanks, +Vinod On May 15, 2013, at 1:29 PM, Roman Shaposhnik wrote: Please tell me if my expectations are incorrect, but to me the -beta would signify it being a 'safe' target for the downstream components. We're still finding *very* basic and *very* disruptive issues (MAPREDUCE-5240 is a good example) that essentially mean DOA for downstream that depends on this functionality. Are we comfortable with delivering 2.0.5-beta and later on starting to discover things like MAPREDUCE-5240 more or less accidentally? -- Arun C. Murthy Hortonworks Inc. http://hortonworks.com/
Re: [VOTE] - Release 2.0.5-beta
I'm actually drafting such a proposal. Will open the discussion as a [PROPOSAL] in general@ --Matt On Wed, May 15, 2013 at 4:44 PM, Arun C Murthy a...@hortonworks.com wrote: On May 15, 2013, at 3:27 PM, Chris Douglas wrote: +1 (binding) on the proposal. However, the value we get from these release plan votes is dubious, to put it mildly. The surrounding discussion has cost more than it is worth, and votes on executive summaries of releases discourage the sort of detailed collaboration we're trying to create. It replaces development with zero-sum struggles over abstractions. Agree, I propose we edit bylaws to do away with them for the future. This is, in effect, another poll about the direction we're taking 2.x. If we can't reach consensus on development directions without voting, that's more evidence that the project should be split, IMO. -C +1e100 Arun On Wed, May 15, 2013 at 2:21 PM, Steve Loughran ste...@hortonworks.com wrote: On 15 May 2013 10:57, Arun C Murthy a...@hortonworks.com wrote: Folks, A considerable number of people have expressed confusion regarding the recent vote on 2.0.5, beta status etc. given lack of specifics, the voting itself (validity of the vote itself, whose votes are binding) etc. IMHO technical arguments (incompatibility b/w 2.0 and 2.1, current stability of 3 features under debate etc.) have been lost in the discussion in favor of non-technical (almost dramatic) nuances such as seizing the moment. There is now dangerous talk of tolerating incompatibility b/w 2.0 and 2.1 - this is a red flag for me; particularly when there are just 3 features being debated and active committers and contributors are confident of and ready to stand by their work. All patches, I believe, are ready to be merged in the next few days per discussions on jira. This will, clearly, not delay the other API work which everyone agrees is crucial.
As a result, I feel no recourse but to restart a new vote - all attempts at calm, reasoned, civil discussion based on technical arguments have come to naught - I apologize for the thrash caused to everyone's attention. To get past all of this confusion, I'd like to present an alternate, specific proposal for consideration. I propose we continue the original plan and make a 2.0.5-beta release by May end with the following content: # HDFS-347 # HDFS Snapshots # Windows support # Necessary final API/protocol changes such as: * Final YARN API changes: YARN-386 * MR Binary Compatibility: MAPREDUCE-5108 * Final RPC cleanup: HADOOP-8990 People working on the above features have all expressed considerable comfort with them and are ready to stand-by to help expedite any necessary bug-fixes etc. to get to stabilization quickly. I'm confident we can get this release out by end of May. This sets the stage for a hadoop-2.x GA release right after with some more testing - this means I think I can quickly turn around and make bug-fix releases as necessary right after 2.0.5-beta. I request that people consider helping out with this plan and sign up to help push hadoop-2.x to stability as outlined above. I believe this will help achieve our shared goals of quickly stabilizing hadoop-2 and help ensure we can support it for the foreseeable future in a compatible manner for the benefit of our users and downstream projects. Please vote, the vote will run the normal 7 days. Obviously, I'm +1. +1 (binding) -- Arun C. Murthy Hortonworks Inc. http://hortonworks.com/
Re: [VOTE] - Release 2.0.5-beta
On Wed, May 15, 2013 at 9:25 PM, Andrew Purtell apurt...@apache.org wrote: The other thread or vote or whatever at least served the purpose in fresh surfacing of concerns. Talk of new features going in to a beta on a very short timetable is concerning for anyone with experience working on large software projects. It's not a little ironic that this vote thread, done in response to sort out the other one predicated on stability concerns, begins with a laundry list of features and JIRAs to go in. I think it is usually the case that a beta release receives only bugfixes* over the alpha that preceded it. This may just be a lack of consensus on what beta means. Assuming that you are talking about HDFS features when you say features going into a beta on a very short timetable and laundry list etc, I request you to take a cursory look at the development of these features. Snapshots have been in development since November 2012, excluding the early prototype that happened in May 2012. Most of the development was complete by early February except for the support of rename capability, which has been tricky. As regards Windows support, this is work that has been happening for more than a year in many other branches. So these features are not something that are impulsively developed and irresponsibly pushed to a release. They have gone through considerable testing and have been developed over a long time. Please set aside discussion on particular features or Hadoop bylaws or politics or debate club. I can't speak for all of downstream of course, but to the extent that I can, I can say we don't care about that. The core ask, at least mine, is to take a fresh look at reducing per-release disruptions to the rest of the entire ecosystem that has grown up around Hadoop. What is the disruption you anticipate due to the current content of the release? If it is stability, I am confident that very few bugs will come out of these features and stability should not be affected.
This has been the case for the HDFS features for many years. The development is generally done in a feature branch, and the feature is tested and stabilized in that branch before merging to trunk. This is contrary to a few people's incorrect claims that it has taken a long time to stabilize HDFS features in branch-2. Needless to say, stability is not just a concern of downstream projects. As core contributors, we spend long hours, day in and day out, trying to ensure features are stable. Regards, Suresh