Re: [VOTE] - Release 2.0.5-beta

2013-05-22 Thread Mayank Bansal
Hi Guys,

+1

We at eBay would like to see snapshots before we start testing/deploying
Hadoop 2.0 next month.

Thanks,
Mayank


On Wed, May 15, 2013 at 10:57 AM, Arun C Murthy a...@hortonworks.com wrote:

 Folks,

 A considerable number of people have expressed confusion regarding the
 recent vote on 2.0.5, beta status etc. given lack of specifics, the voting
 itself (validity of the vote itself, whose votes are binding) etc.

 IMHO technical arguments (incompatibility b/w 2.0 & 2.1, current stability
 of 3 features under debate etc.) have been lost in the discussion in favor
 of non-technical (almost dramatic) nuances such as seizing the moment.
 There is now dangerous talk of tolerating incompatibility b/w 2.0 and 2.1
 - this is a red flag for me; particularly when there are just 3 features
 being debated and active committers and contributors are confident of and
 ready to stand by their work. All patches, I believe, are ready to be
 merged in the next few days per discussions on jira. This will,
 clearly, not delay the other API work which everyone agrees is crucial. As
 a result, I feel no recourse but to restart a new vote - all attempts at
 calm, reasoned, civil discussion based on technical arguments have come to
 naught - I apologize for the thrash caused to everyone's attention.

 To get past all of this confusion, I'd like to present an alternate,
 specific proposal for consideration.

 I propose we continue the original plan and make a 2.0.5-beta release by
 May end with the following content:
 # HDFS-347
 # HDFS Snapshots
 # Windows support
 # Necessary & final API/protocol changes such as:
  * Final YARN API changes: YARN-386
  * MR Binary Compatibility: MAPREDUCE-5108
  * Final RPC cleanup: HADOOP-8990

 People working on the above features have all expressed considerable
 comfort with them and are ready to stand by to help expedite any necessary
 bug-fixes etc. to get to stabilization quickly. I'm confident we can get
 this release out by end of May. This sets the stage for a hadoop-2.x GA release
 right after with some more testing - this means I think I can quickly turn
 around and make bug-fix releases as necessary right after 2.0.5-beta.

 I request that people consider helping out with this plan and sign up to
 help push hadoop-2.x to stability as outlined above. I believe this will
 help achieve our shared goals of quickly stabilizing hadoop-2 and help
 ensure we can support it for the foreseeable future in a compatible manner for
 the benefit of our users and downstream projects.

 Please vote, the vote will run the normal 7 days. Obviously, I'm +1.

 thanks,
 Arun

 PS: To keep this discussion grounded in technical details I've moved this
 to dev@ (bcc general@).




Re: [VOTE] - Release 2.0.5-beta

2013-05-21 Thread sanjay Radia
+1 on 2.0.5 as defined in this thread, with the new features.
But I am supportive of an earlier release that has ALL the compatibility
changes, without the features.


sanjay

On May 15, 2013, at 10:57 AM, Arun C Murthy wrote:

 Folks,
 
 ...
 
 I propose we continue the original plan and make a 2.0.5-beta release by May 
 end with the following content:
 # HDFS-347
 # HDFS Snapshots
 # Windows support
 # Necessary & final API/protocol changes such as:
 * Final YARN API changes: YARN-386
 * MR Binary Compatibility: MAPREDUCE-5108
 * Final RPC cleanup: HADOOP-8990



Re: [VOTE] - Release 2.0.5-beta

2013-05-21 Thread Konstantin Shvachko
-1  for the record.
This is a great plan for 2.1, which I would gladly support, but not for
2.0.5.

I do not see how the previous vote could have been confusing,
as it contained a direct quotation of the relevant clause of the Bylaws.

Arun, the format of this vote remains confusing.
What the action is, and what approval method you plan to use, are still
undefined.

Thanks,
--Konstantin




Re: [VOTE] - Release 2.0.5-beta

2013-05-21 Thread Konstantin Shvachko
Chris,

I find you are contradicting yourself within this message and with some
of your other messages.

But I want to address only one thing here:

 This has exposed a bug in our bylaws, which we can fix.

This could be a bug, and we may need to fix it. But until then it is a
bylaw, which is the only rule we have for coming to an agreement when we
disagree. If we both respect the rules we can come to an agreement. If not,
and people start forcing their way by saying "the rule is wrong, let's ignore
it today", or by conducting an infinite chain of counter-votes, this creates
chaos.

Thanks,
--Konstantin


On Sat, May 18, 2013 at 4:22 PM, Chris Douglas cdoug...@apache.org wrote:

 The release plan vote is not binding in any way. Nobody lost a
 vote, or risks having an outcome reversed, because there are no
 consequences to these exercises.

 Konstantin, I've been trying to tell you for more than a week that you
 can go forward without anyone's blessing or consent. There are no
 precedents, because the release plan vote has been a formality until
 now, and I don't know of any other projects that even bother with it.
 Most of our committers and PMC members didn't even know who was
 eligible to vote on it, because we usually ignore it. What *does*
 matter is the majority vote of the PMC on the release artifact. While
 we under-defined what the release plan means, we have zero ambiguity
 on when a release artifact becomes real.

 In the discussion, you were offered a minor release series, help
 selecting patches from branch-2, and every administrative barrier was
 removed from your path. Instead of taking this and running with it,
 you continued to press for... I don't know what. Please decide how
 you're going to move a development branch- any of them- forward and
 start working on it. There is nothing to win in these threads.

 This has exposed a bug in our bylaws, which we can fix.

 Right now, these votes are confusing everybody and stalling the
 project. I don't care who comes up with 2.0.5-beta, whether it's part
 of 2.1, or if we create 3.0. Any committer who wants to offer a
 candidate needs to demonstrate that they have a non-trivial,
 non-sectarian proportion of the community behind it by (1) creating
 the artifact (2) passing a PMC vote to make that artifact a release.
 It's that simple.

 With respect to the board: they are not parents, and we are not
 children. Neither are they interested or equipped to tell us how to
 partition releases of Hadoop. This is routine development, we are
 failing at it, but we will recover by eliminating this pointless
 ritual and getting back to producing software. -C


Re: [VOTE] - Release 2.0.5-beta

2013-05-21 Thread Matt Foley
I've now started a separate discussion thread in common-dev@, titled
[PROPOSAL] change in bylaws to remove Release Plan vote.  If it achieves
consensus, I'll put it to a vote to so change the bylaws.

Best,
--Matt



Re: [VOTE] - Release 2.0.5-beta

2013-05-18 Thread Chris Douglas
The release plan vote is not binding in any way. Nobody lost a
vote, or risks having an outcome reversed, because there are no
consequences to these exercises.

Konstantin, I've been trying to tell you for more than a week that you
can go forward without anyone's blessing or consent. There are no
precedents, because the release plan vote has been a formality until
now, and I don't know of any other projects that even bother with it.
Most of our committers and PMC members didn't even know who was
eligible to vote on it, because we usually ignore it. What *does*
matter is the majority vote of the PMC on the release artifact. While
we under-defined what the release plan means, we have zero ambiguity
on when a release artifact becomes real.

In the discussion, you were offered a minor release series, help
selecting patches from branch-2, and every administrative barrier was
removed from your path. Instead of taking this and running with it,
you continued to press for... I don't know what. Please decide how
you're going to move a development branch- any of them- forward and
start working on it. There is nothing to win in these threads.

This has exposed a bug in our bylaws, which we can fix.

Right now, these votes are confusing everybody and stalling the
project. I don't care who comes up with 2.0.5-beta, whether it's part
of 2.1, or if we create 3.0. Any committer who wants to offer a
candidate needs to demonstrate that they have a non-trivial,
non-sectarian proportion of the community behind it by (1) creating
the artifact (2) passing a PMC vote to make that artifact a release.
It's that simple.

With respect to the board: they are not parents, and we are not
children. Neither are they interested or equipped to tell us how to
partition releases of Hadoop. This is routine development, we are
failing at it, but we will recover by eliminating this pointless
ritual and getting back to producing software. -C


Re: [VOTE] - Release 2.0.5-beta

2013-05-17 Thread Vinod Kumar Vavilapalli

Thanks a bunch Nathan, for clearly letting us know the Yahoo! team's 
perspective.

We are getting started on rolling upgrades from the YARN side (Sid opened YARN-666),
and I hear the HDFS side is too.

We definitely need compatibility and testing kits. Have to get started on this.

Work-preserving restart on YARN side - we plan to scope down next.

Thanks,
+Vinod

On May 16, 2013, at 11:28 AM, Nathan Roberts wrote:

 (initially responded on general@, sorry about that. copied here)
 
 +1 (non-binding)
 
 From my perspective:
 
 * The key feature that will drive me to adopt 2.x is Rolling Upgrades
 * In order to get to rolling upgrades, we need a compatibility story that
 is significantly better than what we have today
 ** We need a comprehensive definition of what compatibility really means
 ** We need better testing in place to verify we're not breaking
 compatibility
 ** We need better definition and testing of what rolling upgrades really
 means. Rolling between bug-fix releases - Required, Rolling between minor
 releases - Required, Rolling between major releases - Desired.
 ** We need work-preserving restart on the YARN side. Restarting jobs
 isn't sufficient.
 ** ...
 * Given that Rolling upgrades aren't there yet, and there is still work to
 be done to solidify the compatibility story, I'm ok with the feature
 window remaining open until these are in place, especially given the fact
 that the proposed features are likely to have non-zero impact on
 compatibility/rolling_upgrades.
 * I'd certainly like a release with rolling upgrades as soon as possible,
 so I feel like the feature window needs to ramp down very quickly.
 Something like 2.0.5-beta in May with the current list of proposed
 features, then 2.0.6-beta in late summer with full rolling upgrade support
 and a solid compatibility story, would seem like a reasonable timeline.
 Once we have a beta release with rolling upgrades, I can look at pushing
 2.x to some of our larger clusters.
 
 Nathan Roberts
 nrobe...@yahoo-inc.com
 
 
 
 On 5/15/13 1:06 PM, Vinod Kumar Vavilapalli vino...@hortonworks.com
 wrote:
 
 
 Seems like you forgot to bcc. Forwarding this to general.
 
 Thanks,
 +Vinod



Re: [VOTE] - Release 2.0.5-beta

2013-05-17 Thread Roman Shaposhnik
Apologies for a bunch of delayed responses (and as such adding
even more emails to this thread).

On Wed, May 15, 2013 at 4:47 PM, Arun C Murthy a...@hortonworks.com wrote:
 My reading of your response is that while you appreciate the feedback
 Bigtop is providing you're not of an opinion that investigating the level
 of stability of Hadoop wrt. downstream any further than what is currently
 happening would be a worthy investment of Hadoop's community
 (or your personal for that matter) time?

 Everyone is welcome to contribute in any and all manner. I can't speak for 
 everyone.
 It would be useful if Bigtop could run regressions on releases here 
 consistently.
 We've also talked in the past about running Bigtop on branch-2, nightly.
 Is that something you could help with? You'd earn my personal gratitude.

There's a bunch of stuff that Bigtop can offer wrt. infrastructure and existing
functionality that helps with integration testing of Hadoop. There's 10x more
stuff we can do if folks other than Bigtop members would be willing to pitch
in.

I think a good closure to this discussion would be to identify things that
we can do in Bigtop to help Hadoop stabilize quicker and also identify
anybody who's potentially willing to help implement these ideas.

I'll try to collect all the good points that you and others have made on this
thread wrt. this approach and will fork it into a separate discussion shortly.

Thanks,
Roman.


Re: [VOTE] - Release 2.0.5-beta

2013-05-17 Thread Konstantin Shvachko
BCC: general@

Since we recognize now that this is a vote to overrule a previous decision
(I am referring to Vinod's note on general@, http://s.apache.org/h7x),
should this be brought to the attention of the Board?

I don't remember any precedents of this kind in Hadoop history.
But other projects may have had such experience.
A clarification on categorizing this action and on voting practices
from ASF may help.

Thanks,
--Konstantin



On Wed, May 15, 2013 at 3:36 PM, Konstantin Shvachko
shv.had...@gmail.com wrote:

 Arun,

 I am glad I at least convinced you to finally announce your release plan
 and put it to a vote.
 Even though it is to overrule the vote that just completed, which you were
 against and lost, well - Twice.

 I am glad you removed the NFS feature from the list proposed earlier.

 I think this vote is late. The lazy consensus on that issue has just been
 reached.
 I don't see the basis for the new vote,
 and it is not clear what action you seek to approve.

 Thanks,
 --Konstantin





Re: [VOTE] - Release 2.0.5-beta

2013-05-17 Thread Doug Cutting
On Wed, May 15, 2013 at 10:57 AM, Arun C Murthy a...@hortonworks.com wrote:
 To get past all of this confusion, I'd like to present an alternate, specific 
 proposal for consideration.

 I propose we continue the original plan and make a 2.0.5-beta release by May 
 end with the following content:

If you intend to nullify a prior vote then you should be very
explicit, e.g., making that a clause in the new proposal.

  # nullify the result of the vote on general@ that started with
message-id xxx...

Also, why is this on common-dev?  Isn't this list just for discussion
of things that happen in the hadoop-common tree?

  http://hadoop.apache.org/mailing_lists.html#Common

  all attempts at calm, reasoned, civil discussion based on technical 
 arguments have come to naught

Wow.  The folks you disagreed with there had absolutely no impact on
your thinking about this release?

Release numbers are cheap.  We shouldn't fight over them.

Doug


Re: [VOTE] - Release 2.0.5-beta

2013-05-17 Thread Roman Shaposhnik
Guys, this is a pretty long email with all the details
I can think of on how Bigtop can help stabilization efforts of
Hadoop 2.x. A lot of this information is required background.
I really, really encourage everyone who's thinking of
contributing to this effort to read it up. Once again,
I do apologize for its size.

Matt, Andrew,

you both brought up very good points, so let me summarize
a few things wrt. Bigtop. I'm also CCing Bigtop dev ML
so that everybody who's interested in pitching in could
discuss the matter further over there.

On Wed, May 15, 2013 at 9:25 PM, Andrew Purtell apurt...@apache.org wrote:
 The other comment on this thread that suggests ASF governance structures
 being inadequate for negotiating changes in a large ecosystem might be on to
 something, but at the same time Apache BigTop may be an effective ASF-native
 answer to that.

That is my sincere hope as well. Of course, Apache Bigtop is a project in its
own right with its own release schedules, community of users, etc. What we
are developing is not really an integration test suite for Hadoop; it just so
happens that without a stable Hadoop base we can't really deliver much. Hence we
have a huge vested interest in having a predictable schedule for the stable
releases of Hadoop. We also have all the interest in the world to help Hadoop
achieve that.

At the same time we're a very small project juggling ~18 different open source
components trying to put them into a coherent distribution. I don't think it is
realistic to expect us to be able to do all the work that ideally we would need
to do in order to provide the most feedback for the Hadoop stabilization
exercise.

At the same time it would be really unfortunate if we all just give up on this
collective goal. Ideally we can all pitch in to the extent we believe in the
need for a stable Hadoop 2.x code line out there. I'll elaborate on
what exactly Bigtop can contribute a bit later, and I would expect all the
folks who'd be willing to pitch in in a particular area to reach out to us
either here or on the Bigtop ML.

On Wed, May 15, 2013 at 4:54 PM, Matt Foley ma...@apache.org wrote:
 Roman, what is your model for how test results from Bigtop should feed back
 into Hadoop-2 development?
 With the understanding that (a) software does have bugs, and (b) you're not
 going to get an SLA on community-sponsored software,
 what are your ideas for how to close the loop better?

 Would CI runs of Bigtop against branch-2 be feasible, as Arun suggests?
 How should we accommodate changes in individual components (Hadoop Core, but
 others as well) that may require changes in one or more other components?
 How does Bigtop keep doing a viable nightly build in that chaotic
 environment?
 Is this a previously solved problem?

All excellent questions! Here's my laundry list of what Bigtop can offer today:
#0 a publicly available continuous integration Jenkins instance that
 runs on EC2 (because of Cloudera's gracious support of our project)
 and ties the rest of the Bigtop infrastructure together:
 http://bigtop01.cloudera.org:8080/

 The benefit of this infrastructure in the open is pretty clear -- just
 like with builds.apache.org, if there are failures etc., anybody who's
 interested can jump on it and start making progress.

#1 a continuous integration build of all the components comprising the
 'current' trunk of Apache Bigtop all the way up to producing easy to
 install packages for the following Linux platforms:
 http://bigtop01.cloudera.org:8080/view/Bigtop-trunk/job/Bigtop-trunk-Repository/
 Basically the above link allows one to install nightly builds of the
 Apache Bigtop Hadoop distribution as easily as typing
 'yum install hadoop-conf-pseudo'

#2 a potential for 'tracking' builds all the way to packages of each
 individual component: http://bigtop01.cloudera.org:8080/view/Upstream-tests/
 Basically this allows one to easily install the base, fully tested
 distribution of Hadoop (let's say Bigtop 0.5.0), upgrade just one
 component and see how it fares. Currently these builds are ad hoc, but
 I'm trying to work with the respective upstream communities to figure out
 what branches of development they would be interested in testing that way.

 This is one of the things that Arun and I talked about wrt. hooking up
 Bigtop Jenkins to branch-2 on a continuous basis. I wish I had time to
 do that; I honestly simply don't. I might in a few weeks, but again, if
 anybody is willing to pitch in and help -- that'll be greatly appreciated.

#3 a collection of Puppet recipes that allow one to deploy the packaged
 Bigtop distro (either from #1 or #2) on a fully distributed cluster.

#4 an existing collection of integration tests (~200) for all the components
 we've got in our stack: 

Re: [VOTE] - Release 2.0.5-beta

2013-05-16 Thread Konstantin Boudnik
Guys,

I guess what you're missing is that Bigtop isn't a testing framework for
Hadoop. It is a stack framework that verifies that components are dealing with
each other nicely. Every single stack is different: Bigtop 0.5.0 differs from
0.6.0, and so on. Bigtop - as any other ASF project - has its releases that
might or might not be aligned with a particular version of Hadoop. Hence, a
reference (etalon) stack needs to be defined first and foremost.

Before we even start talking about running it nightly (another question is on
what hardware, let's not get there for now) let's understand who can help
with triaging test failures: Downstreams, Hadoop or Bigtop?

Judging by a number of other emails there's a number of people on this list
who care plenty about integration issues. Any volunteers to help with
integration testing in the open?

 Is this a previously solved problem?

Yes. The problem is solved by separating actively developed (aka unstable)
releases from more mature and less volatile ones. This is what was
concluded two days ago in this voting thread: http://s.apache.org/MnU

Cos

On Wed, May 15, 2013 at 04:54PM, Matt Foley wrote:
 Roman, what is your model for how test results from Bigtop should feed back
 into Hadoop-2 development?
 With the understanding that (a) software does have bugs, and (b) you're not
 going to get an SLA on community-sponsored software,
 what are your ideas for how to close the loop better?
 
 Would CI runs of Bigtop against branch-2 be feasible, as Arun suggests?
 How should we accommodate changes in individual components (Hadoop Core, but
 others as well) that may require changes in one or more other components?
 How does Bigtop keep doing a viable nightly build in that chaotic
 environment?
 Is this a previously solved problem?
 
 Thanks,
 --Matt
 
 
 On Wed, May 15, 2013 at 4:47 PM, Arun C Murthy a...@hortonworks.com wrote:
 
 
  On May 15, 2013, at 3:50 PM, Roman Shaposhnik wrote:
 
   Arun,
  
   am I reading your answer to my binary question correctly? It is a 'no'.
 
  No.
 
  
   My reading of your response is that while you appreciate the feedback
   Bigtop is providing you're not of an opinion that investigating the level
   of stability of Hadoop wrt. downstream any further than what is currently
   happening would be a worthy investment of Hadoop's community
   (or your personal for that matter) time?
 
  Everyone is welcome to contribute in any and all manner. I can't speak for
  everyone.
  It would be useful if Bigtop could run regressions on releases here
  consistently. We've also talked in the past about running Bigtop on
  branch-2, nightly. Is that something you could help with? You'd earn my
  personal gratitude.
 
  thanks,
  Arun
 
  
   Thanks,
   Roman.
  
   On Wed, May 15, 2013 at 3:02 PM, Arun C Murthy a...@hortonworks.com
  wrote:
   Roman,
  
   Furthermore, before we rush into finding flaws and scaring kids at
  night it would be useful to remember one thing:
   Software has *bugs*. We can't block any release till the entire
  universe validates it, in fact they won't validate it if we don't release
  since we are at the bottom of the stack.
  
   Any help prior to the release is welcome; I know people who work for
  the same employer as I do have plans to do further testing after we freeze
  apis via the beta release(s). I hope and pray others can join this effort -
  thanks to everyone who already has.
  
   Again, freezing APIs and protocols is the primary aim of 2.0.5-beta.
  There are no guarantees it's 100% bug-free, we can never make such
  guarantees anyway.
  
   If, and when, we find bugs with 2.0.5-beta I'm more than happy to
  quickly turn around and make more releases (2.0.6-beta, 2.0.7-beta).
  Obviously I'll make a call on which bugs are critical - feedback to help me
  decide is, as always, welcome.
   I've been clear, many times, that we might need more than one beta
  release to iron out bugs etc.
  
   None of this should be a surprise - this has happened many, many times
  in the lifetime of this and other projects. 2.0.3-alpha vis-a-vis
  2.0.4-alpha is the most recent example - it won't be the last.
  
   So, I hope, concludes this meme.
  
   thanks,
   Arun
  
   On May 15, 2013, at 2:20 PM, Arun C Murthy wrote:
  
   Great summary, thanks Vinod.
  
   On May 15, 2013, at 2:14 PM, Vinod Kumar Vavilapalli wrote:
  
  
   Roman, I keep hearing this same argument again and again. Should've refuted
  earlier.
  
   Please list down all the issues that BigTop ran into *because of* new
  features. You continue to argue that new features are destabilizing 2.0.*,
  which I don't agree with at all. 2.0.3-alpha was the last time major
  features got merged in, and we found blockers irrespective of those.
  
   MAPREDUCE-5240 specifically isn't due to any feature merge. This was
  a bug. I'd say this is a long standing bug in 2.0.x. You sure this passed
  in 2.0.3? Even so, this is mostly broken by another bug-fix and *not*
  because of 

Re: [VOTE] - Release 2.0.5-beta

2013-05-16 Thread Vinod Kumar Vavilapalli

On May 15, 2013, at 2:54 PM, Roman Shaposhnik wrote:

 This is not my argument at all. I apologize if somehow I failed to
 communicate it, but here's what my argument boils down to:
 given *my* experience with the Hadoop 2.0.x series and Bigtop
 releases, every time I try a different release of Hadoop 2.0.x
 I run into issues that scare me. They scare me because
 they are so basic, yet they make components like Sqoop
 and Oozie (and I believe Giraph on one occasion)
 pretty much DOA for the YARN-based mapreduce implementation.


Why should they *scare* *you*? As Stevel and Arun also pointed out in other
mails in this thread, we have no way of finding all the bugs. What is basic in
your environment isn't basic in mine. Just today I ran into a *basic* issue of
not being able to run a secure Oozie setup on top of one of the stable 1.x
releases in one of the usual environments. Now if I shared your level of
concern, I should be *scared* about why none of the testing we did in the past
3 years managed to uncover it. Or maybe why BigTop is not able to help find
these issues for us, if at all.


 In my mind, what that translates into is the fact that nobody
 did *any* real testing of a particular downstream component
 running on a given Hadoop 2.0.x release. Like I said --
 the issues so far make the components in question DOA.


Thanks for *finally* adding this bit about a particular downstream component.
It is very likely none of us tested Sqoop on top of 2.0.3/4-alpha. But you know
what, if BigTop didn't exist, someone from the Sqoop community would have
nudged us. And nobody will really need to be *scared* - all of this is expected
in the alpha life cycle.


 How many more issues like that one (regardless of how
 they originated) are in branch-2? Wouldn't we want to
 know before declaring Hadoop 2.0.5 beta?


We want to know, and you know what, we already know some of them. Lots of them. 
You can find them in the issue tracker if you ever want to. For example in YARN:
https://issues.apache.org/jira/secure/IssueNavigator.jspa?reset=true&jqlQuery=project+%3D+YARN+AND+status+%3D+Open+ORDER+BY+priority+DESC&mode=hide

There are a total of 258 open tickets, 10 critical, 253 major (and if 
categorized will turn into critical and blocker issues). And particularly 23 
against 2.0.4-alpha and 34 against 2.0.3-alpha.
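
A minimal sketch of how a tracker query link of this kind is assembled, assuming the three query parameters visible in the link (reset, jqlQuery, mode) and a plain JQL string. The helper name build_issue_query_url is hypothetical, and whether the endpoint still answers is not something this sketch checks.

```python
from urllib.parse import urlencode

# Base address taken from the link quoted in the message.
JIRA_NAVIGATOR = "https://issues.apache.org/jira/secure/IssueNavigator.jspa"


def build_issue_query_url(project: str, status: str = "Open") -> str:
    """Build a search link for issues of one project, ordered by priority.

    Hypothetical helper: it only encodes the parameters, it does not
    contact the tracker.
    """
    jql = f"project = {project} AND status = {status} ORDER BY priority DESC"
    # urlencode escapes '=' as %3D and spaces as '+', and joins the
    # parameters with '&'.
    params = {"reset": "true", "jqlQuery": jql, "mode": "hide"}
    return f"{JIRA_NAVIGATOR}?{urlencode(params)}"


url = build_issue_query_url("YARN")
print(url)
```

Swapping "YARN" for another project key (HDFS, MAPREDUCE, HADOOP) gives the equivalent list for that project.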

So, please. With due respect, BigTop isn't the only project discovering bugs 
which make downstream components DOA.

Thanks,
+Vinod

Re: [VOTE] - Release 2.0.5-beta

2013-05-16 Thread Konstantin Boudnik
On Wed, May 15, 2013 at 10:52PM, Suresh Srinivas wrote:
  Assuming that you are talking about HDFS features when you say
 
   features going into a beta on a very short short timetable and
   laundry list etc,
  
 
  No, that would not be a correct assumption.
 
  So these features are not something that are impulsively developed
   and irresponsibly pushed to a release. They have gone through
   considerable testing and have been developed over a long time.
  
 
  There is no need to reframe my comment in combative terms and read in
  insults that are not there.
 
 
 No insult taken. But I want to make a case that features are not proposed
 lightly and due diligence is done both during development and testing.
 
 
  As I read Arun's mail the plan is to integrate several feature branches
  into branch-2. That would of course result in brand new never before tested
  code. I do not believe that should have the label alpha. This is just my
  personal opinion. Shit happens when commits happen. - You know this as
  well as I. That does not mean I am here to attack or insult you by pointing
  that out and suggesting more measured alternatives.
 
  There is little to gain in engaging in debate club. If you are not
  interested in hearing these opinions, that is fine, I have received that
  message already, nothing further need be said.
 
 
 Andy, I value your feedback. I am only trying to allay the concerns by
 sharing my perspective.

What I am seeing time and again in these endless discussion threads is this:

  a) downstream or bigtop: we are seeing a bunch of integration issues with
every new feature introduced/sometimes even a commit made
  b) feature developers: no-no, these features are developed for a long time,
tests are run, no need to be concerned

The same pattern is repeated time and again. The only conclusion that I can
make out of it is that either the meaning of integration testing is
different for a) and b) or that a) and b) are using very different validation
mechanisms.

Which one is it? I am puzzled.

Bugs are quite expected - Andrew put it very eloquently, actually. But you
can only deal with them effectively if the flow of changes is controlled, e.g.
via smaller and focused releases. The development process has to be
converging, not fanning out. Case in point? Sure. 2.0.3-alpha had to be
followed by the 2.0.4-alpha release (officially called a bugfix release); it -
in turn - requires 2.0.4.1-alpha to make it suitable for other downstream
components. So, it took 2 releases to simply fix issues caused by a bunch of
bugfixes and no major new features being committed into 2.0.3-alpha. These are
just cold facts - not attacking anyone's ego here.

  Cos




Re: [VOTE] - Release 2.0.5-beta

2013-05-16 Thread Robert Evans
-0 (Binding)

I have made my opinion known in the previous thread/vote, but I have spent
enough time discussing this and need to get back to my day job. If the
community is able to get snapshots and everything else in this list merged
and stable without breaking the stack above it in two weeks it will be
wonderful, but I have serious doubts that it is going to actually be
possible.

--Bobby

On 5/15/13 12:57 PM, Arun C Murthy a...@hortonworks.com wrote:

Folks,

A considerable number of people have expressed confusion regarding the
recent vote on 2.0.5, beta status etc. given lack of specifics, the
voting itself (validity of the vote itself, whose votes are binding) etc.

IMHO technical arguments (incompatibility b/w 2.0 & 2.1, current
stability of 3 features under debate etc.) have been lost in the
discussion in favor of non-technical (almost dramatic) nuances such as
seizing the moment. There is now dangerous talk of tolerating
incompatibility b/w 2.0 and 2.1) - this is a red flag for me;
particularly when there are just 3 features being debated and active
committers and contributors are confident of and ready to stand by their
work. All patches, I believe, are ready to be merged in the next few
days per discussions on jira. This will, clearly, not delay the other API
work which everyone agrees is crucial. As a result, I feel no recourse
but to restart a new vote - all attempts at calm, reasoned, civil
discussion based on technical arguments have come to naught - I apologize
for the thrash caused to everyone's attention.

To get past all of this confusion, I'd like to present an alternate,
specific proposal for consideration.

I propose we continue the original plan and make a 2.0.5-beta release by
May end with the following content:
# HDFS-347
# HDFS Snapshots
# Windows support
# Necessary & final API/protocol changes such as:
 * Final YARN API changes: YARN-386
 * MR Binary Compatibility: MAPREDUCE-5108
 * Final RPC cleanup: HADOOP-8990

People working on the above features have all expressed considerable
comfort with them and are ready to stand-by to help expedite any
necessary bug-fixes etc. to get to stabilization quickly. I'm confident
we can get this release out by end of May. This sets stage for a
hadoop-2.x GA release right after with some more testing - this means I
think I can quickly turn around and make bug-fix releases as necessary
right after 2.0.5-beta.

I request that people consider helping out with this plan and sign up to
help push hadoop-2.x to stability as outlined above. I believe this will
help achieve our shared goals of quickly stabilizing hadoop-2 and help
ensure we can support it for the foreseeable future in a compatible manner for
the benefit of our users and downstream projects.

Please vote, the vote will run the normal 7 days. Obviously, I'm +1.

thanks,
Arun

PS: To keep this discussion grounded in technical details I've moved this
to dev@ (bcc general@).




Re: [VOTE] - Release 2.0.5-beta

2013-05-16 Thread Steve Loughran
On 15 May 2013 23:19, Konstantin Boudnik c...@apache.org wrote:

 Guys,

 I guess what you're missing is that Bigtop isn't a testing framework for
 Hadoop. It is a stack framework that verifies that components are dealing
 with each other nicely.



which to me means "some form of integration test"


 Every single stack is different: Bigtop 0.5.0 differs from
 0.6.0, and so on. Bigtop - as any other ASF project - has its releases that
 might or might not be aligned with a particular version of Hadoop. Hence, a
 reference (etalon) stack needs to be defined first and foremost.

 Before we even start talking about running it nightly (another question is
 on what hardware, let's not get there for now) let's understand who can
 help with triaging test failures: Downstreams, Hadoop or Bigtop?





 Judging by a number of other emails there's a number of people on this list
 who care plenty about integration issues. Any volunteers to help with
 integration testing in the open?


As I said at the HUG, I want to get the non-swift-FS specific tests that do
things like run Pig jobs against any FS in, though I also need a home for
some very swift-specific partitioned file tests.

 Is this a previously solved problem?

 Yes. The problem is solved by separating actively developed (aka unstable)
 releases from more mature and less volatile ones.


Not in filesystems. If you look at how long it took ext4 to be implemented and
then adopted, you can see that nobody put data they cared about on it until
they were happy that what you put on write() came back on a read() [and
stat() returned the amount of data, [seek(X);read()] returned the byte at
offset X, and other little details that those of us writing tests for the
filesystem APIs care about]
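
A minimal sketch of the kind of checks described above, run against the local filesystem through Python's standard library rather than the Hadoop FileSystem API. The function name check_basic_fs_contract and the file name are hypothetical; the three checks are the ones named in the message (read returns what was written, stat reports the written length, seek-then-read returns the byte at that offset).

```python
import os
import tempfile


def check_basic_fs_contract(directory: str, payload: bytes, offset: int) -> dict:
    """Write a payload, then verify read(), stat() and seek()+read() agree with it."""
    path = os.path.join(directory, "contract-check.bin")

    with open(path, "wb") as f:
        f.write(payload)

    # What you put in with write() must come back on read().
    with open(path, "rb") as f:
        read_back = f.read()

    # stat() must report the amount of data written.
    size = os.stat(path).st_size

    # seek(X) followed by read() must return the byte at offset X.
    with open(path, "rb") as f:
        f.seek(offset)
        byte_at_offset = f.read(1)

    os.remove(path)
    return {
        "read_matches_write": read_back == payload,
        "stat_matches_length": size == len(payload),
        "seek_read_matches": byte_at_offset == payload[offset:offset + 1],
    }


with tempfile.TemporaryDirectory() as tmp:
    results = check_basic_fs_contract(tmp, b"hadoop-2.0.5-beta", 7)
print(results)
```

The same three assertions could in principle be pointed at any filesystem implementation that exposes write, read, stat and seek; only the file-opening calls would change.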


Re: [VOTE] - Release 2.0.5-beta

2013-05-16 Thread Arun C Murthy
Cos,

On May 15, 2013, at 11:38 PM, Konstantin Boudnik wrote:

 What I am seeing time and again in these endless discussion threads is this:
 
  a) downstream or bigtop: we are seeing a bunch of integration issues with
every new feature introduced/sometimes even a commit made
  b) feature developers: no-no, these features are developed for a long time,
tests are run, no need to be concerned

It's unfortunate you are continuing to take digs at people who actually are 
moving the project forward.

The 'cold facts' you describe do not give any credence to your conclusions.

Let's review the bugs Bigtop has found over the course of this year, Vinod 
pointed them out:

 I quickly checked other bugs you reported in 2.0.x:
 - MAPREDUCE-5088 was caused by the fix for HADOOP-9299 which was again a long 
 standing issue in 2.0.x
 - MAPREDUCE-3728 is similar
 - MAPREDUCE-5117 is similar
 - MAPREDUCE-4219 was a security related feature request from you.
 - MAPREDUCE-3916 was because of new proxy-server added.

And now, MAPREDUCE-5240 - again, a long standing bug.

Given the above, please help me understand how 'feature developers' are hurting?

I've repeatedly asked you or Roman to run CI on branch-2. Instead of stepping
up to help on a concrete proposal, you continue to take digs at a number of
contributors here; hopefully this will stop.

Arun



Re: [VOTE] - Release 2.0.5-beta

2013-05-16 Thread Nathan Roberts
(initially responded on general@, sorry about that. copied here)

+1 (non-binding)

From my perspective:

* The key feature that will drive me to adopt 2.x is Rolling Upgrades
* In order to get to rolling upgrades, we need a compatibility story that
is significantly better than we have today
** We need a comprehensive definition of what compatibility really means
** We need better testing in place to verify we're not breaking
compatibility
** We need better definition and testing of what rolling upgrades really
means. Rolling between bug-fix releases - Required, Rolling between minor
releases - Required, Rolling between major releases - Desired.
** We need work-preserving restart on the YARN side. Restarting jobs
isn't sufficient.
** ...
* Given that Rolling upgrades aren't there yet, and there is still work to
be done to solidify the compatibility story, I'm ok with the feature
window remaining open until these are in place, especially given the fact
that the proposed features are likely to have non-zero impact on
compatibility/rolling_upgrades.
* I'd certainly like a release with rolling upgrades as soon as possible,
so I feel like the feature window needs to ramp down very quickly.
Something like 2.0.5-beta in May with the current list of proposed
features, then 2.0.6-beta in late summer with full rolling upgrade support
and a solid compatibility story, would seem like a reasonable timeline.
Once we have a beta release with rolling upgrades, I can look at pushing
2.x to some of our larger clusters.
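
A minimal sketch of the rolling-upgrade expectations listed above (bug-fix: Required, minor: Required, major: Desired), assuming version strings of the form major.minor.bugfix with an optional extra numeric component and an optional -alpha/-beta suffix. The names UpgradeKind, classify_upgrade and rolling_upgrade_support are hypothetical and not part of any Hadoop API.

```python
from enum import Enum


class UpgradeKind(Enum):
    BUG_FIX = "bug-fix"
    MINOR = "minor"
    MAJOR = "major"


# Support levels as stated in the list above.
ROLLING_SUPPORT = {
    UpgradeKind.BUG_FIX: "Required",
    UpgradeKind.MINOR: "Required",
    UpgradeKind.MAJOR: "Desired",
}


def parse_version(version: str) -> tuple:
    """Turn '2.0.5-beta' into (2, 0, 5); the suffix after '-' is ignored."""
    core = version.split("-", 1)[0]
    return tuple(int(part) for part in core.split("."))


def classify_upgrade(old: str, new: str) -> UpgradeKind:
    """Classify an upgrade by the first version component that changes."""
    o, n = parse_version(old), parse_version(new)
    # Pad to a common width so '2.0.4' and '2.0.4.1' compare cleanly.
    width = max(len(o), len(n), 3)
    o += (0,) * (width - len(o))
    n += (0,) * (width - len(n))
    if n <= o:
        raise ValueError(f"{new} is not newer than {old}")
    if n[0] != o[0]:
        return UpgradeKind.MAJOR
    if n[1] != o[1]:
        return UpgradeKind.MINOR
    return UpgradeKind.BUG_FIX


def rolling_upgrade_support(old: str, new: str) -> str:
    return ROLLING_SUPPORT[classify_upgrade(old, new)]


print(rolling_upgrade_support("2.0.5-beta", "2.0.6-beta"))
```

Under these assumptions, a move from 2.0.5-beta to 2.0.6-beta counts as a bug-fix upgrade, so rolling upgrade support between the two would be in the Required group.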

Nathan Roberts
nrobe...@yahoo-inc.com



On 5/15/13 1:06 PM, Vinod Kumar Vavilapalli vino...@hortonworks.com
wrote:


Seems like you forgot to bcc. Forwarding this to general.

Thanks,
+Vinod
On May 15, 2013, at 10:57 AM, Arun C Murthy wrote:

 Folks,
 
 A considerable number of people have expressed confusion regarding the
recent vote on 2.0.5, beta status etc. given lack of specifics, the
voting itself (validity of the vote itself, whose votes are binding) etc.
 
 IMHO technical arguments (incompatibility b/w 2.0 & 2.1, current
stability of 3 features under debate etc.) have been lost in the
discussion in favor of non-technical (almost dramatic) nuances such as
seizing the moment. There is now dangerous talk of tolerating
incompatibility b/w 2.0 and 2.1) - this is a red flag for me;
particularly when there are just 3 features being debated and active
committers and contributors are confident of and ready to stand by their
work. All patches, I believe, are ready to be merged in the next few
days per discussions on jira. This will, clearly, not delay the other
API work which everyone agrees is crucial. As a result, I feel no
recourse but to restart a new vote - all attempts at calm, reasoned,
civil discussion based on technical arguments have come to naught - I
apologize for the thrash caused to everyone's attention.
 
 To get past all of this confusion, I'd like to present an alternate,
specific proposal for consideration.
 
 I propose we continue the original plan and make a 2.0.5-beta release
by May end with the following content:
 # HDFS-347
 # HDFS Snapshots
 # Windows support
 # Necessary & final API/protocol changes such as:
 * Final YARN API changes: YARN-386
 * MR Binary Compatibility: MAPREDUCE-5108
 * Final RPC cleanup: HADOOP-8990
 
 People working on the above features have all expressed considerable
comfort with them and are ready to stand-by to help expedite any
necessary bug-fixes etc. to get to stabilization quickly. I'm confident
we can get this release out by end of May. This sets stage for a
hadoop-2.x GA release right after with some more testing - this means I
think I can quickly turn around and make bug-fix releases as necessary
right after 2.0.5-beta.
 
 I request that people consider helping out with this plan and sign up
to help push hadoop-2.x to stability as outlined above. I believe this
will help achieve our shared goals of quickly stabilizing hadoop-2 and
help ensure we can support it for the foreseeable future in a compatible
manner for the benefit of our users and downstream projects.
 
 Please vote, the vote will run the normal 7 days. Obviously, I'm +1.
 
 thanks,
 Arun
 
 PS: To keep this discussion grounded in technical details I've moved
this to dev@ (bcc general@).
 




[VOTE] - Release 2.0.5-beta

2013-05-15 Thread Arun C Murthy
Folks,

A considerable number of people have expressed confusion regarding the recent 
vote on 2.0.5, beta status etc. given lack of specifics, the voting itself 
(validity of the vote itself, whose votes are binding) etc.

IMHO technical arguments (incompatibility b/w 2.0 & 2.1, current stability of 3 
features under debate etc.) have been lost in the discussion in favor of 
non-technical (almost dramatic) nuances such as seizing the moment. There is 
now dangerous talk of tolerating incompatibility b/w 2.0 and 2.1) - this is a 
red flag for me; particularly when there are just 3 features being debated and 
active committers and contributors are confident of and ready to stand by their 
work. All patches, I believe, are ready to be merged in the next few days 
per discussions on jira. This will, clearly, not delay the other API work which 
everyone agrees is crucial. As a result, I feel no recourse but to restart a 
new vote - all attempts at calm, reasoned, civil discussion based on technical 
arguments have come to naught - I apologize for the thrash caused to everyone's 
attention.

To get past all of this confusion, I'd like to present an alternate, specific 
proposal for consideration.

I propose we continue the original plan and make a 2.0.5-beta release by May 
end with the following content:
# HDFS-347
# HDFS Snapshots
# Windows support
# Necessary & final API/protocol changes such as:
 * Final YARN API changes: YARN-386
 * MR Binary Compatibility: MAPREDUCE-5108
 * Final RPC cleanup: HADOOP-8990

People working on the above features have all expressed considerable comfort 
with them and are ready to stand-by to help expedite any necessary bug-fixes 
etc. to get to stabilization quickly. I'm confident we can get this release out 
by end of May. This sets stage for a hadoop-2.x GA release right after with 
some more testing - this means I think I can quickly turn around and make 
bug-fix releases as necessary right after 2.0.5-beta.

I request that people consider helping out with this plan and sign up to help 
push hadoop-2.x to stability as outlined above. I believe this will help 
achieve our shared goals of quickly stabilizing hadoop-2 and help ensure we can 
support it for the foreseeable future in a compatible manner for the benefit of our 
users and downstream projects.

Please vote, the vote will run the normal 7 days. Obviously, I'm +1.

thanks,
Arun

PS: To keep this discussion grounded in technical details I've moved this to 
dev@ (bcc general@).



Re: [VOTE] - Release 2.0.5-beta

2013-05-15 Thread Suresh Srinivas
This is the course that we were taking before the unfortunate disruption.
We should be able to meet both the stabilization goals and compatibility
goals quickly with this proposal. I personally am willing to invest a lot
of time in testing, code reviews and work on adding missing functionality
to ensure the goal of this proposal is successful.

+1.


On Wed, May 15, 2013 at 10:57 AM, Arun C Murthy a...@hortonworks.com wrote:

 Folks,

 A considerable number of people have expressed confusion regarding the
 recent vote on 2.0.5, beta status etc. given lack of specifics, the voting
 itself (validity of the vote itself, whose votes are binding) etc.

 IMHO technical arguments (incompatibility b/w 2.0 & 2.1, current stability
 of 3 features under debate etc.) have been lost in the discussion in favor
 of non-technical (almost dramatic) nuances such as seizing the moment.
 There is now dangerous talk of tolerating incompatibility b/w 2.0 and 2.1)
 - this is a red flag for me; particularly when there are just 3 features
 being debated and active committers and contributors are confident of and
 ready to stand by their work. All patches, I believe, are ready to be
 merged in the next few days per discussions on jira. This will,
 clearly, not delay the other API work which everyone agrees is crucial. As
 a result, I feel no recourse but to restart a new vote - all attempts at
 calm, reasoned, civil discussion based on technical arguments have come to
 naught - I apologize for the thrash caused to everyone's attention.

 To get past all of this confusion, I'd like to present an alternate,
 specific proposal for consideration.

 I propose we continue the original plan and make a 2.0.5-beta release by
 May end with the following content:
 # HDFS-347
 # HDFS Snapshots
 # Windows support
 # Necessary & final API/protocol changes such as:
  * Final YARN API changes: YARN-386
  * MR Binary Compatibility: MAPREDUCE-5108
  * Final RPC cleanup: HADOOP-8990

 People working on the above features have all expressed considerable
 comfort with them and are ready to stand-by to help expedite any necessary
 bug-fixes etc. to get to stabilization quickly. I'm confident we can get
 this release out by end of May. This sets stage for a hadoop-2.x GA release
 right after with some more testing - this means I think I can quickly turn
 around and make bug-fix releases as necessary right after 2.0.5-beta.

 I request that people consider helping out with this plan and sign up to
 help push hadoop-2.x to stability as outlined above. I believe this will
 help achieve our shared goals of quickly stabilizing hadoop-2 and help
 ensure we can support it for the foreseeable future in a compatible manner for
 the benefit of our users and downstream projects.

 Please vote, the vote will run the normal 7 days. Obviously, I'm +1.

 thanks,
 Arun

 PS: To keep this discussion grounded in technical details I've moved this
 to dev@ (bcc general@).




-- 
http://hortonworks.com/download/


Re: [VOTE] - Release 2.0.5-beta

2013-05-15 Thread Vinod Kumar Vavilapalli

Seems like you forgot to bcc. Forwarding this to general.

Thanks,
+Vinod



Re: [VOTE] - Release 2.0.5-beta

2013-05-15 Thread Amir Sanjar

good, glad we are back on track again.
BTW, we have already started build (IBM and OpenJDK SDK), unit test, and
limited integration testing  on x86 and POWER, results are promising.

Best Regards
Amir Sanjar

System Management Architect
PowerLinux Open Source Hadoop development lead
IBM Senior Software Engineer
Phone# 512-286-8393
Fax#  512-838-8858





From:   Arun C Murthy a...@hortonworks.com
To: common-dev@hadoop.apache.org,
Date:   05/15/2013 12:58 PM
Subject: [VOTE] - Release 2.0.5-beta






Re: [VOTE] - Release 2.0.5-beta

2013-05-15 Thread Karthik Kambatla
Hi Arun,

Can we add HADOOP-9517 to the list - having compatibility guidelines should
help us support users and downstream projects better?

Thanks
Karthik





Re: [VOTE] - Release 2.0.5-beta

2013-05-15 Thread Alejandro Abdelnur
Do we need to add YARN-397?

Thanks.


On Wed, May 15, 2013 at 11:23 AM, Karthik Kambatla ka...@cloudera.com wrote:

 Hi Arun,

 Can we add HADOOP-9517 to the list - having compatibility guidelines should
 help us support users and downstream projects better?

 Thanks
 Karthik






-- 
Alejandro


Re: [VOTE] - Release 2.0.5-beta

2013-05-15 Thread Vinod Kumar Vavilapalli

Thanks for laying out a very specific release plan, easy to vote on.

I am watching most of YARN and MAPREDUCE changes, glad that those are called 
out specifically. Apart from that, we have

 - RM restart which is mostly already committed but needs a couple more in
 - a couple of scheduling related APIs which fall under the protocol changes 
you mentioned, that are close to commit
 - a couple of security issues which aren't exactly features.

Just calling them out specifically so that there is no ambiguity.

+1 (binding for this)

Thanks,
+Vinod Kumar Vavilapalli




Re: [VOTE] - Release 2.0.5-beta

2013-05-15 Thread Vinod Kumar Vavilapalli


  - RM restart which is mostly already committed but needs a couple more in
  - a couple of scheduling related APIs which fall under the protocol changes 
 you mentioned, that are close to commit
  - a couple of security issues which aren't exactly features.

I should have been clearer:
 - RM restart stuff is tracked at YARN-128
 - scheduling APIs tracked at YARN-397
 - security stuff tracked at YARN-47

Thanks,
+Vinod



Re: [VOTE] - Release 2.0.5-beta

2013-05-15 Thread Arun C Murthy
Yes to all. As long as we are making timely and compatible progress, we don't 
need to debate individual issues here. Let's continue discussion on relevant 
jiras.

thanks,
Arun

On May 15, 2013, at 12:11 PM, Vinod Kumar Vavilapalli wrote:

 
 
 - RM restart which is mostly already committed but needs a couple more in
 - a couple of scheduling related APIs which fall under the protocol changes 
 you mentioned, that are close to commit
 - a couple of security issues which aren't exactly features.
 
 I should have been clearer:
 - RM restart stuff is tracked at YARN-128
 - scheduling APIs tracked at YARN-397
 - security stuff tracked at YARN-47
 
 Thanks,
 +Vinod
 




Re: [VOTE] - Release 2.0.5-beta

2013-05-15 Thread Vinod Kumar Vavilapalli

I also feel that some of YARN-397 should go in. If you also feel so, please put 
in a +1 to state your intention.

Thanks,
+Vinod

On May 15, 2013, at 11:32 AM, Alejandro Abdelnur wrote:

 Do we need to add YARN-397?
 
 Thanks.
 
 



Re: [VOTE] - Release 2.0.5-beta

2013-05-15 Thread eric baldeschwieler
+1




Re: [VOTE] - Release 2.0.5-beta

2013-05-15 Thread Matt Foley
+1 (binding).  I think it's important to maintain the release continuity,
otherwise we could end up with the 0.20.2 / 0.20.200 problem all over again
(parallel stable dev tracks without a parent-child relationship to each
other, i.e., with disjoint subsets of functionality).  I consider achieving a
stable basis for API backward compat very important.  And Arun is
committing to hit beta in the very near future.

--Matt


On Wed, May 15, 2013 at 1:16 PM, Bikas Saha bi...@hortonworks.com wrote:

 I am +1 to the proposal because it maintains the original cadence a bunch
 of us committers/contributors have been working with.

 Windows related changes have been made in a conservative manner so as not
 to destabilize the code base. The changes are being extensively tested and
 validated by community members, especially those from Microsoft.

 YARN-397 jiras are mainly enhancements that can be added in a backwards
 compatible manner. Would be great if some of them make it but I would not
 hold the release for them.

 Let us all make the effort to get the release out with all the long
 awaited and useful features as planned.
 Bikas

 -Original Message-
 From: Vinod Kumar Vavilapalli [mailto:vino...@hortonworks.com]
 Sent: Wednesday, May 15, 2013 12:20 PM
 To: common-dev@hadoop.apache.org
 Subject: Re: [VOTE] - Release 2.0.5-beta


 I also feel that some of YARN-397 should go in. If you also feel so,
 please put in a +1 to state your intention.

 Thanks,
 +Vinod




Re: [VOTE] - Release 2.0.5-beta

2013-05-15 Thread Sandy Ryza
+1 (non-binding)

Agreed with Bikas that we should get the scheduler API enhancements
(YARN-397) in if we are able, but they don't need to be blockers because they
will be backwards compatible.

Arun, not sure whether your "Yes to all" already covered this, but I'd like
to throw in support for the compatibility guidelines being a blocker.


On Wed, May 15, 2013 at 1:20 PM, eric baldeschwieler eri...@hortonworks.com
 wrote:

 +1





Re: [VOTE] - Release 2.0.5-beta

2013-05-15 Thread Matt Foley
 Arun, not sure whether your "Yes to all" already covered this, but I'd like
 to throw in support for the compatibility guidelines being a blocker.

+1 to that.  Definitely an overriding concern for me.


On Wed, May 15, 2013 at 1:25 PM, Sandy Ryza sandy.r...@cloudera.com wrote:

 +1 (non-binding)

 Agreed with Bikas that we should get the scheduler API enhancements
 (YARN-397) in if we are able, but they don't need to be blockers because they
 will be backwards compatible.

 Arun, not sure whether your "Yes to all" already covered this, but I'd like
 to throw in support for the compatibility guidelines being a blocker.


 On Wed, May 15, 2013 at 1:20 PM, eric baldeschwieler 
 eri...@hortonworks.com
  wrote:

  +1
 
  On May 15, 2013, at 10:57 AM, Arun C Murthy a...@hortonworks.com wrote:
 
   Folks,
  
   A considerable number of people have expressed confusion regarding the
  recent vote on 2.0.5, beta status etc. given lack of specifics, the
 voting
  itself (validity of the vote itself, whose votes are binding) etc.
  
   IMHO technical arguments (incompatibility b/w 2.0  2.1, current
  stability of 3 features under debate etc.) have been lost in the
 discussion
  in favor of non-technical (almost dramatic) nuances such as seizing the
  moment. There is now dangerous talk of tolerating incompatibility b/w
 2.0
  and 2.1) - this is a red flag for me; particularly when there are just 3
  features being debated and active committers and contributors are
 confident
  of and ready to stand by their work. All patches, I believe, are ready to
  be merged in the the next few days per discussions on jira. This will,
  clearly, not delay the other API work which everyone agrees is crucial.
 As
  a result, I feel no recourse but to restart a new vote - all attempts at
  calm, reasoned, civil discussion based on technical arguments have come
 to
  naught - I apologize for the thrash caused to everyone's attention.
  
   To get past all of this confusion, I'd like to present an alternate,
  specific proposal for consideration.
  
   I propose we continue the original plan and make a 2.0.5-beta release
 by
  May end with the following content:
   # HDFS-347
   # HDFS Snapshots
   # Windows support
   # Necessary  final API/protocol changes such as:
   * Final YARN API changes: YARN-386
   * MR Binary Compatibility: MAPREDUCE-5108
   * Final RPC cleanup: HADOOP-8990
  
   People working on the above features have all expressed considerable
  comfort with them and are ready to stand-by to help expedite any
 necessary
  bug-fixes etc. to get to stabilization quickly. I'm confident we can get
  this release out by end of May. This sets stage for a hadoop-2.x GA
 release
  right after with some more testing - this means I think I can quickly
 turn
  around and make bug-fix releases as necessary right after 2.0.5-beta.
  
   I request that people consider helping out with this plan and sign up
 to
  help push hadoop-2.x to stability as outlined above. I believe this will
  help achieve our shared goals of quickly stabilizing hadoop-2 and help
  ensure we can support it for the foreseeable future in a compatible manner for
  the benefit of our users and downstream projects.
  
   Please vote, the vote will run the normal 7 days. Obviously, I'm +1.
  
   thanks,
   Arun
  
   PS: To keep this discussion grounded in technical details I've moved
  this to dev@ (bcc general@).
  
 
 



Re: [VOTE] - Release 2.0.5-beta

2013-05-15 Thread Matt Foley
 lets fork this thread into the appropriate ML and discuss the practical,
achievable
 steps that can be included into the release criteria of Hadoop 2.0.5-beta

Seems to me common-dev is the appropriate ML, and Arun has invited Jiras to
include.
Open a Jira with your suggested list, and we carry on the discussion from
there.  Does that work?


On Wed, May 15, 2013 at 1:29 PM, Roman Shaposhnik r...@apache.org wrote:

 On Wed, May 15, 2013 at 10:57 AM, Arun C Murthy a...@hortonworks.com
 wrote:
  I propose we continue the original plan and make a 2.0.5-beta release by
 May end with the following content:

 I have a very basic question: what are the steps that we, as a community,
 are willing to undertake to ensure that our aggressive release schedule
 (end of May is exactly two weeks away) and our intent of actually
 calling this a beta release would be realistic?

 Please tell me if my expectations are incorrect, but to me the -beta would
 signify it being a 'safe' target for the downstream components. We're still
 finding *very* basic and *very* disruptive issues (MAPREDUCE-5240 is
 a good example) that essentially mean DOA for downstream that depends
 on this functionality.

 Are we comfortable with delivering 2.0.5-beta and later on starting
 to discover things like MAPREDUCE-5240 more or less accidentally?

 As I mentioned in a different thread -- there are a few things that
 Apache Bigtop can help with in that regard -- but they can only
 happen if we as a community agree that they need to happen
 before we can call Hadoop 2.x a beta release.

 If this sounds useful to the Hadoop community at large -- lets fork
 this thread into the appropriate ML and discuss the practical, achievable
 steps that can be included into the release criteria of Hadoop 2.0.5-beta
 as it is being discussed here.

 Thanks,
 Roman.



Re: [VOTE] - Release 2.0.5-beta

2013-05-15 Thread Devaraj Das
+1 (binding) on the proposal. 2-3 weeks doesn't sound too long a time, and
we have many committers willing to be on-call to fix issues when they are
discovered.


On Wed, May 15, 2013 at 10:57 AM, Arun C Murthy a...@hortonworks.com wrote:

 Folks,

 A considerable number of people have expressed confusion regarding the
 recent vote on 2.0.5, beta status etc. given lack of specifics, the voting
 itself (validity of the vote itself, whose votes are binding) etc.

 IMHO technical arguments (incompatibility b/w 2.0 and 2.1, current stability
 of 3 features under debate etc.) have been lost in the discussion in favor
 of non-technical (almost dramatic) nuances such as seizing the moment.
 There is now dangerous talk of tolerating incompatibility b/w 2.0 and 2.1
 - this is a red flag for me; particularly when there are just 3 features
 being debated and active committers and contributors are confident of and
 ready to stand by their work. All patches, I believe, are ready to be
 merged in the next few days per discussions on jira. This will,
 clearly, not delay the other API work which everyone agrees is crucial. As
 a result, I feel no recourse but to restart a new vote - all attempts at
 calm, reasoned, civil discussion based on technical arguments have come to
 naught - I apologize for the thrash caused to everyone's attention.

 To get past all of this confusion, I'd like to present an alternate,
 specific proposal for consideration.

 I propose we continue the original plan and make a 2.0.5-beta release by
 May end with the following content:
 # HDFS-347
 # HDFS Snapshots
 # Windows support
 # Necessary  final API/protocol changes such as:
  * Final YARN API changes: YARN-386
  * MR Binary Compatibility: MAPREDUCE-5108
  * Final RPC cleanup: HADOOP-8990

 People working on the above features have all expressed considerable
 comfort with them and are ready to stand-by to help expedite any necessary
 bug-fixes etc. to get to stabilization quickly. I'm confident we can get
 this release out by end of May. This sets stage for a hadoop-2.x GA release
 right after with some more testing - this means I think I can quickly turn
 around and make bug-fix releases as necessary right after 2.0.5-beta.

 I request that people consider helping out with this plan and sign up to
 help push hadoop-2.x to stability as outlined above. I believe this will
 help achieve our shared goals of quickly stabilizing hadoop-2 and help
 ensure we can support it for the foreseeable future in a compatible manner for
 the benefit of our users and downstream projects.

 Please vote, the vote will run the normal 7 days. Obviously, I'm +1.

 thanks,
 Arun

 PS: To keep this discussion grounded in technical details I've moved this
 to dev@ (bcc general@).




Re: [VOTE] - Release 2.0.5-beta

2013-05-15 Thread Eli Collins
On Wed, May 15, 2013 at 1:29 PM, Matt Foley mfo...@hortonworks.com wrote:

  Arun, not sure whether your Yes to all already covered this, but I'd
 like
  to throw in support for the compatibility guidelines being a blocker.

 +1 to that.  Definitely an overriding concern for me.


+1  Likewise.   Would be great to get more eyeballs on Karthik's patch
on HADOOP-9517 if people haven't reviewed it already.


Re: [VOTE] - Release 2.0.5-beta

2013-05-15 Thread Zhijie Shen
+1 (non-binding) on the proposal.


On Wed, May 15, 2013 at 1:43 PM, Eli Collins e...@cloudera.com wrote:

 On Wed, May 15, 2013 at 1:29 PM, Matt Foley mfo...@hortonworks.com
 wrote:

   Arun, not sure whether your Yes to all already covered this, but I'd
  like
   to throw in support for the compatibility guidelines being a blocker.
 
  +1 to that.  Definitely an overriding concern for me.
 
 
 +1  Likewise.   Would be great to get more eyeballs on Karthik's patch
 on HADOOP-9517 if people haven't reviewed it already.




-- 
Zhijie Shen
Hortonworks Inc.
http://hortonworks.com/


Re: [VOTE] - Release 2.0.5-beta

2013-05-15 Thread Roman Shaposhnik
On Wed, May 15, 2013 at 1:36 PM, Matt Foley mfo...@hortonworks.com wrote:
 lets fork this thread into the appropriate ML and discuss the practical,
 achievable
 steps that can be included into the release criteria of Hadoop 2.0.5-beta

 Seems to me common-dev is the appropriate ML,

Thanks. I'll stick to this thread then.

 and Arun has invited Jiras to include.
 Open a Jira with your suggested list, and we carry on the discussion from
 there.  Does that work?

But this is exactly my concern -- I don't have the list of JIRAs.
In fact, there's work that needs to be done to arrive at the
list of JIRAs that would be complete enough to give me
confidence that something like MAPREDUCE-5240 (I'll stick
to this example simply because I remember it by heart now ;-))
will not go undiscovered until after the release.
What I'm saying is this -- if nobody is willing to do this work
outside of the very few folks who are part of Apache Bigtop
then I have very little confidence in this proposal actually
delivering on its promise of beta quality in Hadoop 2.0.5.

The question I'm asking is actually quite simple: does the Hadoop
community believe in investing in doing this work to COME UP
with such a list? Or to ask it differently -- does the Hadoop
community value the feedback that such work would provide
to a degree that it would be made part of the release criteria
for Hadoop 2.0.5-beta?

This is really a binary question as far as I can tell.

Thanks,
Roman.

P.S. There's a second level of discussion which is -- what
exactly does that extra work entail -- but let's deal with
the basic question first.


Re: [VOTE] - Release 2.0.5-beta

2013-05-15 Thread Vinod Kumar Vavilapalli

Roman, I keep this same argument again and again. Should've refuted earlier.

Please list down all the issues that BigTop ran into *because of* new features. 
You continue to argue that new features are destabilizing 2.0.*, which I don't 
agree with at all. 2.0.3-alpha was the last time major features got merged in, 
and we found blockers irrespective of those.

MAPREDUCE-5240 specifically isn't due to any feature merge. This was a bug. I'd 
say this is a long standing bug in 2.0.x. You sure this passed in 2.0.3? Even 
so, this is mostly broken by another bug-fix and *not* because of any feature.

I quickly checked other bugs you reported in 2.0.x:
 - MAPREDUCE-5088 was caused by the fix for HADOOP-9299 which was again a long 
standing issue in 2.0.x
 - MAPREDUCE-3728 is similar
 - MAPREDUCE-5117 is similar
 - MAPREDUCE-4219 was a security related feature request from you.
 - MAPREDUCE-3916 was because of new proxy-server added.

I am not arguing that new features *may* destabilize the branch, but you've 
repeatedly stated this as if that were a fact.

Really appreciate the testing done by BigTop, but please don't distort the 
facts.

Thanks,
+Vinod


On May 15, 2013, at 1:29 PM, Roman Shaposhnik wrote:

 Please tell me if my expectations are incorrect, but to me the -beta would
 signify it being a 'safe' target for the downstream components. We're still
 finding *very* basic and *very* disruptive issues (MAPREDUCE-5240 is
 a good example) that essentially mean DOA for downstream that depends
 on this functionality.
 
 Are we comfortable with delivering 2.0.5-beta and later on starting
 to discover things like MAPREDUCE-5240 more or less accidentally?



Re: [VOTE] - Release 2.0.5-beta

2013-05-15 Thread Vinod Kumar Vavilapalli
Typo, keep hearing*

Thanks,
+Vinod

On May 15, 2013, at 2:14 PM, Vinod Kumar Vavilapalli wrote:

 Roman, I keep this same argument again and again. Should've refuted earlier.



Re: [VOTE] - Release 2.0.5-beta

2013-05-15 Thread Arun C Murthy
Great summary, thanks Vinod.

On May 15, 2013, at 2:14 PM, Vinod Kumar Vavilapalli wrote:

 
 Roman, I keep this same argument again and again. Should've refuted earlier.
 
 Please list down all the issues that BigTop ran into *because of* new 
 features. You continue to argue that new features are destabilizing 2.0.*, 
 which I don't agree with at all. 2.0.3-alpha was the last time major features 
 got merged in, and we found blockers irrespective of those.
 
 MAPREDUCE-5240 specifically isn't due to any feature merge. This was a bug. 
 I'd say this is a long standing bug in 2.0.x. You sure this passed in 2.0.3? 
 Even so, this is mostly broken by another bug-fix and *not* because of any 
 feature.
 
 I quickly checked other bugs you reported in 2.0.x:
 - MAPREDUCE-5088 was caused by the fix for HADOOP-9299 which was again a long 
 standing issue in 2.0.x
 - MAPREDUCE-3728 is similar
 - MAPREDUCE-5117 is similar
 - MAPREDUCE-4219 was a security related feature request from you.
 - MAPREDUCE-3916 was because of new proxy-server added.
 
 I am not arguing that new features *may* destabilize the branch, but you've 
 repeatedly stated this as if that were a fact.
 
 Really appreciate the testing done by BigTop, but please don't distort the 
 facts.
 
 Thanks,
 +Vinod
 
 
 On May 15, 2013, at 1:29 PM, Roman Shaposhnik wrote:
 
 Please tell me if my expectations are incorrect, but to me the -beta would
 signify it being a 'safe' target for the downstream components. We're still
 finding *very* basic and *very* disruptive issues (MAPREDUCE-5240 is
 a good example) that essentially mean DOA for downstream that depends
 on this functionality.
 
 Are we comfortable with delivering 2.0.5-beta and later on starting
 to discover things like MAPREDUCE-5240 more or less accidentally?
 

--
Arun C. Murthy
Hortonworks Inc.
http://hortonworks.com/




Re: [VOTE] - Release 2.0.5-beta

2013-05-15 Thread Steve Loughran
On 15 May 2013 10:57, Arun C Murthy a...@hortonworks.com wrote:

 Folks,

 A considerable number of people have expressed confusion regarding the
 recent vote on 2.0.5, beta status etc. given lack of specifics, the voting
 itself (validity of the vote itself, whose votes are binding) etc.

 IMHO technical arguments (incompatibility b/w 2.0 and 2.1, current stability
 of 3 features under debate etc.) have been lost in the discussion in favor
 of non-technical (almost dramatic) nuances such as seizing the moment.
 There is now dangerous talk of tolerating incompatibility b/w 2.0 and 2.1
 - this is a red flag for me; particularly when there are just 3 features
 being debated and active committers and contributors are confident of and
 ready to stand by their work. All patches, I believe, are ready to be
 merged in the next few days per discussions on jira. This will,
 clearly, not delay the other API work which everyone agrees is crucial. As
 a result, I feel no recourse but to restart a new vote - all attempts at
 calm, reasoned, civil discussion based on technical arguments have come to
 naught - I apologize for the thrash caused to everyone's attention.

 To get past all of this confusion, I'd like to present an alternate,
 specific proposal for consideration.

 I propose we continue the original plan and make a 2.0.5-beta release by
 May end with the following content:
 # HDFS-347
 # HDFS Snapshots
 # Windows support
 # Necessary  final API/protocol changes such as:
  * Final YARN API changes: YARN-386
  * MR Binary Compatibility: MAPREDUCE-5108
  * Final RPC cleanup: HADOOP-8990

 People working on the above features have all expressed considerable
 comfort with them and are ready to stand-by to help expedite any necessary
 bug-fixes etc. to get to stabilization quickly. I'm confident we can get
 this release out by end of May. This sets stage for a hadoop-2.x GA release
 right after with some more testing - this means I think I can quickly turn
 around and make bug-fix releases as necessary right after 2.0.5-beta.

 I request that people consider helping out with this plan and sign up to
 help push hadoop-2.x to stability as outlined above. I believe this will
 help achieve our shared goals of quickly stabilizing hadoop-2 and help
 ensure we can support it for the foreseeable future in a compatible manner for
 the benefit of our users and downstream projects.

 Please vote, the vote will run the normal 7 days. Obviously, I'm +1.


+1 (binding)


Re: [VOTE] - Release 2.0.5-beta

2013-05-15 Thread Roman Shaposhnik
On Wed, May 15, 2013 at 2:14 PM, Vinod Kumar Vavilapalli
vino...@hortonworks.com wrote:
 Please list down all the issues that BigTop ran into *because of* new 
 features.

Whether the bug is *because of* a new feature or not is a red herring
for my argument. Please let's drop this distinction. I never used it.

 You continue to argue that new features are destabilizing 2.0.*,
 which I don't agree with at all. 2.0.3-alpha was the last time major
 features got merged in, and we found blockers irrespective of those.

This is not my argument at all. I apologize if somehow I failed to
communicate it, but here's what my argument boils down to:
given *my* experience with the Hadoop 2.0.x series and Bigtop
releases, every time I try a different release of Hadoop 2.0.x
I run into issues that scare me. They scare me because
they are so basic yet they make components like Sqoop
and Oozie (and I believe Giraph on one occasion)
pretty much DOA for the YARN-based mapreduce implementation.

In my mind, what that translates into is the fact that nobody
did *any* real testing of a particular downstream component
running on a given Hadoop 2.0.x release. Like I said --
the issues so far make the components in question DOA.

Effectively the onion of issues remains unpeeled, so to speak.

What I'm asking on this thread (and somehow nobody is willing
to give me a straight answer) is whether the Hadoop community
is willing to invest in peeling this onion of issues somewhat more
before declaring Hadoop 2.0.5 a beta release.

Once again it is a binary question -- please give me an answer
of yes or no.

 I am not arguing that new features *may* destabilize the branch, but you've 
 repeatedly stated this as if that were a fact.

Your list of issues is pretty complete (give or take a few that I didn't file
but Cos and others did). And I'd be the first one to agree that
it is not a large list of issues. What scares me is not its size,
but how basic they are and how they block the *rest*
of the testing completely.

To be extra clear -- what scares me about something like
MAPREDUCE-5240 is not whether it came as a result of
a merge or was sitting there since day one. What scares
me is that we identified it last week and yet Sqoop 2 is
DOA in its presence.

How many more issues like that one (regardless of how
they originated) are in branch-2? Wouldn't we want to
know before declaring Hadoop 2.0.5 beta?

Now, knowing would require work -- that's what my
argument is all about.


Thanks,
Roman.


Re: [VOTE] - Release 2.0.5-beta

2013-05-15 Thread Arun C Murthy
Roman,

Furthermore, before we rush into finding flaws and scaring kids at night it 
would be useful to remember one thing:
Software has *bugs*. We can't block any release till the entire universe 
validates it; in fact, they won't validate it if we don't release, since we are at 
the bottom of the stack.

Any help prior to the release is welcome; I know people who work for the same 
employer as I do have plans to do further testing after we freeze apis via the 
beta release(s). I hope and pray others can join this effort - thanks to 
everyone who already has.

Again, freezing APIs and protocols is the primary aim of 2.0.5-beta. There are 
no guarantees it's 100% bug-free, we can never make such guarantees anyway.

If, and when, we find bugs with 2.0.5-beta I'm more than happy to quickly turn 
around and make more releases (2.0.6-beta, 2.0.7-beta). Obviously I'll make a 
call on which bugs are critical - feedback to help me decide is, as always, 
welcome.
I've been clear, many times, that we might need more than one beta release to 
iron out bugs etc.

None of this should be a surprise - this has happened many, many times in the 
lifetime of this and other projects. 2.0.3-alpha vis-a-vis 2.0.4-alpha is the 
most recent example - it won't be the last.

So, I hope, concludes this meme.

thanks,
Arun

--
Arun C. Murthy
Hortonworks Inc.
http://hortonworks.com/




Re: [VOTE] - Release 2.0.5-beta

2013-05-15 Thread Chris Douglas
+1 (binding) on the proposal.

However, the value we get from these release plan votes is dubious,
to put it mildly. The surrounding discussion has cost more than it is
worth, and votes on executive summaries of releases discourage the
sort of detailed collaboration we're trying to create. It replaces
development with zero-sum struggles over abstractions.

This is, in effect, another poll about the direction we're taking 2.x.
If we can't reach consensus on development directions without voting,
that's more evidence that the project should be split, IMO. -C

On Wed, May 15, 2013 at 2:21 PM, Steve Loughran ste...@hortonworks.com wrote:
 On 15 May 2013 10:57, Arun C Murthy a...@hortonworks.com wrote:

 Folks,

 A considerable number of people have expressed confusion regarding the
 recent vote on 2.0.5, beta status etc. given lack of specifics, the voting
 itself (validity of the vote itself, whose votes are binding) etc.

 IMHO technical arguments (incompatibility b/w 2.0 and 2.1, current stability
 of 3 features under debate etc.) have been lost in the discussion in favor
 of non-technical (almost dramatic) nuances such as seizing the moment.
 There is now dangerous talk of tolerating incompatibility b/w 2.0 and 2.1
 - this is a red flag for me; particularly when there are just 3 features
 being debated and active committers and contributors are confident of and
 ready to stand by their work. All patches, I believe, are ready to be
 merged in the next few days per discussions on jira. This will,
 clearly, not delay the other API work which everyone agrees is crucial. As
 a result, I feel no recourse but to restart a new vote - all attempts at
 calm, reasoned, civil discussion based on technical arguments have come to
 naught - I apologize for the thrash caused to everyone's attention.

 To get past all of this confusion, I'd like to present an alternate,
 specific proposal for consideration.

 I propose we continue the original plan and make a 2.0.5-beta release by
 May end with the following content:
 # HDFS-347
 # HDFS Snapshots
 # Windows support
 # Necessary  final API/protocol changes such as:
  * Final YARN API changes: YARN-386
  * MR Binary Compatibility: MAPREDUCE-5108
  * Final RPC cleanup: HADOOP-8990

 People working on the above features have all expressed considerable
 comfort with them and are ready to stand-by to help expedite any necessary
 bug-fixes etc. to get to stabilization quickly. I'm confident we can get
 this release out by end of May. This sets stage for a hadoop-2.x GA release
 right after with some more testing - this means I think I can quickly turn
 around and make bug-fix releases as necessary right after 2.0.5-beta.

 I request that people consider helping out with this plan and sign up to
 help push hadoop-2.x to stability as outlined above. I believe this will
 help achieve our shared goals of quickly stabilizing hadoop-2 and help
 ensure we can support it for the foreseeable future in a compatible manner for
 the benefit of our users and downstream projects.

 Please vote, the vote will run the normal 7 days. Obviously, I'm +1.


 +1 (binding)


Re: [VOTE] - Release 2.0.5-beta

2013-05-15 Thread Steve Loughran
On 15 May 2013 15:02, Arun C Murthy a...@hortonworks.com wrote:

 Roman,

 Furthermore, before we rush into finding flaws and scaring kids at night
 it would be useful to remember one thing:
 Software has *bugs*. We can't block any release till the entire universe
 validates it; in fact, they won't validate it if we don't release, since we are
 at the bottom of the stack.



More subtly: we aren't going to find all the corner case situations until
things ship into the hands of people whose {networks, configs,
applications, hardware} are different. Marking something as -beta means
more people will use it, and find those problems, at a time when it is
still possible for a fast turnaround on fixes. What we are implicitly
saying with a -beta tag is "ready for others to use", which in Hadoop's
case means "doesn't lose data unless you do something suicidal" and "we're
not going to move APIs on you". The gulf from -beta to shipping is usually
much less dramatic than -alpha to -beta, as it happens when everyone is
happy that the last beta is good enough to push out.

-Steve

(who will be at the HUG in Sunnyvale this evening)


Re: [VOTE] - Release 2.0.5-beta

2013-05-15 Thread Roman Shaposhnik
Arun,

am I reading your answer to my binary question correctly? It is a 'no'.

My reading of your response is that while you appreciate the feedback
Bigtop is providing, you're not of the opinion that investigating the level
of stability of Hadoop wrt. downstream any further than what is currently
happening would be a worthy investment of the Hadoop community's time
(or your personal time, for that matter)?

Thanks,
Roman.


Re: [VOTE] - Release 2.0.5-beta

2013-05-15 Thread Konstantin Boudnik
Indeed. I think the root of the issue is deeper. ASF software practices are
great to deal with isolated, relatively contained projects like httpd,
libreoffice, trac, etc. However, Hadoop based stack - essentially, software
aimed at enterprises with bigger scale operations - is a different animal,
that requires balancing of a huge number of moving parts and an unbroken
flow of feedback up the stream. Anyone who has delivered any enterprise
grade software system knows perfectly well how hard that is.

However, in an environment where a release is pushed out in a rush
(essentially causing DOA issues downstream), these get fixed in subsequent
releases, which ironically are likely to contain some other DOAs, because
integration testing - and I mean real-world integration system testing - is
done by this small project, which is treated like a toy for adolescent kids.
And there's no other real integration testing happening OPENLY on the full
stack. Despite numerous claims, that is.

Software comes with bugs - this is a somewhat expected phenomenon. However, bug
fixes shouldn't be mixed with new features, increasing entropy in the system.
In other words, the development process should fan in. A process with multiple
consecutive stable releases helps to achieve that; and compatibility issues would
be addressed by working on the next major release.

The model above leaves downstream with a choice of sticking to 3.x or
switching to 4.x and so on, whereas having a permanent alpha tag is a
convenient way to control a software project that has effectively become a
vendor-controlled effort.

And yes - this leads to fragmentation, make no mistake about it. Because no
one can sit on their hands for a year and wait until a usable release with all
the great features comes about, a lot of organizations are just silently forking
away to make their own environment suitable for production or sale; some of them
might sporadically contribute something back. And of course - it is not the
aim of an Apache project to produce a commercial-grade platform.

Cos

On Wed, May 15, 2013 at 02:54PM, Roman Shaposhnik wrote:
 On Wed, May 15, 2013 at 2:14 PM, Vinod Kumar Vavilapalli
 vino...@hortonworks.com wrote:
  Please list down all the issues that BigTop ran into *because of* new 
  features.
 
 Whether the bug is *because of* new feature or not is a red herring
 for my argument. Please let's drop this distinction. I never used it.
 
  You continue to argue that new features are destabilizing 2.0.*,
  which I don't agree with at all. 2.0.3-alpha was the last time major
  features got merged in, and we found blockers irrespective of those.
 
 This is not my argument at all. I apologize if somehow I failed to
 communicate it, but here's what my argument boils down to:
 given *my* experience with the Hadoop 2.0.x series and Bigtop
 releases, every time I try a different release of Hadoop 2.0.x
 I run into issues that scare me. They scare me because
 they are so basic, yet they make components like Sqoop
 and Oozie (and I believe Giraph on one occasion)
 pretty much DOA for the YARN-based MapReduce implementation.
 
 In my mind, what that translates into is the fact that nobody
 did *any* real testing of a particular downstream component
 running on a given Hadoop 2.0.x release. Like I said --
 the issues so far make the components in question DOA.

 Effectively the onion of issues remains unpeeled, so to speak.
 
 What I'm asking on this thread (and somehow nobody is willing
 to give me a straight answer) is whether the Hadoop community
 is willing to invest in peeling this onion of issues somewhat more
 before declaring Hadoop 2.0.5 a beta release.
 
 Once again it is a binary question -- please give me an answer
 of yes or no.
 
  I am not arguing that new features *may* destabilize the branch, but you've 
  repeatedly stated this as if that were a fact.
 
 Your list of issues is pretty complete (give or take a few that I didn't file
 but Cos and others did). And I'd be the first one to agree that
 it is not a large list of issues. What scares me is not its size,
 but how basic they are and how they block the *rest*
 of the testing completely.
 
 To be extra clear -- what scares me about something like
 MAPREDUCE-5240 is not whether it came as a result of
 a merge or was sitting there since day one. What scares
 me is that we've identified it last week and yet Sqoop 2 is
 DOA in its presence.
 
 How many more issues like that one (regardless of how
 they originated) are in branch-2? Wouldn't we want to
 know before declaring Hadoop 2.0.5 beta?
 
 Now, knowing would require work -- that's what my
 argument is all about.
 
 
 Thanks,
 Roman.


Re: [VOTE] - Release 2.0.5-beta

2013-05-15 Thread Arun C Murthy

On May 15, 2013, at 3:27 PM, Chris Douglas wrote:

 +1 (binding) on the proposal.
 
 However, the value we get from these release plan votes is dubious,
 to put it mildly. The surrounding discussion has cost more than it is
 worth, and votes on executive summaries of releases discourage the
 sort of detailed collaboration we're trying to create. It replaces
 development with zero-sum struggles over abstractions.

Agree, I propose we edit bylaws to do away with them for the future.
 
 This is, in effect, another poll about the direction we're taking 2.x.
 If we can't reach consensus on development directions without voting,
 that's more evidence that the project should be split, IMO. -C

+1e100

Arun

 
 On Wed, May 15, 2013 at 2:21 PM, Steve Loughran ste...@hortonworks.com 
 wrote:
 On 15 May 2013 10:57, Arun C Murthy a...@hortonworks.com wrote:
 
 Folks,
 
 A considerable number of people have expressed confusion regarding the
 recent vote on 2.0.5, beta status etc. given lack of specifics, the voting
 itself (validity of the vote itself, whose votes are binding) etc.
 
 IMHO technical arguments (incompatibility b/w 2.0 & 2.1, current stability
 of 3 features under debate etc.) have been lost in the discussion in favor
 of non-technical (almost dramatic) nuances such as seizing the moment.
 There is now dangerous talk of tolerating incompatibility b/w 2.0 and 2.1
 - this is a red flag for me; particularly when there are just 3 features
 being debated and active committers and contributors are confident of and
 ready to stand by their work. All patches, I believe, are ready to be
 merged in the next few days per discussions on jira. This will,
 clearly, not delay the other API work which everyone agrees is crucial. As
 a result, I feel no recourse but to restart a new vote - all attempts at
 calm, reasoned, civil discussion based on technical arguments have come to
 naught - I apologize for the thrash caused to everyone's attention.
 
 To get past all of this confusion, I'd like to present an alternate,
 specific proposal for consideration.
 
 I propose we continue the original plan and make a 2.0.5-beta release by
 May end with the following content:
 # HDFS-347
 # HDFS Snapshots
 # Windows support
 # Necessary & final API/protocol changes such as:
 * Final YARN API changes: YARN-386
 * MR Binary Compatibility: MAPREDUCE-5108
 * Final RPC cleanup: HADOOP-8990
 
 People working on the above features have all expressed considerable
 comfort with them and are ready to stand-by to help expedite any necessary
 bug-fixes etc. to get to stabilization quickly. I'm confident we can get
 this release out by end of May. This sets the stage for a hadoop-2.x GA release
 right after with some more testing - this means I think I can quickly turn
 around and make bug-fix releases as necessary right after 2.0.5-beta.
 
 I request that people consider helping out with this plan and sign up to
 help push hadoop-2.x to stability as outlined above. I believe this will
 help achieve our shared goals of quickly stabilizing hadoop-2 and help
 ensure we can support it for the foreseeable future in a compatible manner for
 the benefit of our users and downstream projects.
 
 Please vote, the vote will run the normal 7 days. Obviously, I'm +1.
 
 
 +1 (binding)

--
Arun C. Murthy
Hortonworks Inc.
http://hortonworks.com/




Re: [VOTE] - Release 2.0.5-beta

2013-05-15 Thread Arun C Murthy

On May 15, 2013, at 3:50 PM, Roman Shaposhnik wrote:

 Arun,
 
 am I reading your answer to my binary question correctly? It is a 'no'.

No.

 
 My reading of your response is that while you appreciate the feedback
 Bigtop is providing, you're not of the opinion that investigating the level
 of stability of Hadoop wrt. downstream any further than what is currently
 happening would be a worthy investment of Hadoop's community
 (or your personal for that matter) time?

Everyone is welcome to contribute in any and all manner. I can't speak for
everyone.
It would be useful if Bigtop could run regressions on releases here
consistently. We've also talked in the past about running Bigtop on branch-2,
nightly. Is that something you could help with? You'd earn my personal
gratitude.

thanks,
Arun

 
 Thanks,
 Roman.
 
 On Wed, May 15, 2013 at 3:02 PM, Arun C Murthy a...@hortonworks.com wrote:
 Roman,
 
 Furthermore, before we rush into finding flaws and scaring kids at night it 
 would be useful to remember one thing:
 Software has *bugs*. We can't block any release till the entire universe
 validates it; in fact they won't validate it if we don't release, since we
 are at the bottom of the stack.
 
 Any help prior to the release is welcome; I know people who work for the 
 same employer as I do have plans to do further testing after we freeze apis 
 via the beta release(s). I hope and pray others can join this effort - 
 thanks to everyone who already has.
 
 Again, freezing APIs and protocols is the primary aim of 2.0.5-beta. There 
 are no guarantees it's 100% bug-free, we can never make such guarantees 
 anyway.
 
 If, and when, we find bugs with 2.0.5-beta I'm more than happy to quickly 
 turn around and make more releases (2.0.6-beta, 2.0.7-beta). Obviously I'll 
 make a call on which bugs are critical - feedback to help me decide is, as 
 always, welcome.
 I've been clear, many times, that we might need more than one beta release 
 to iron out bugs etc.
 
 None of this should be a surprise - this has happened many, many times in 
 the lifetime of this and other projects. 2.0.3-alpha vis-a-vis 2.0.4-alpha 
 is the most recent example - it won't be the last.
 
 So, I hope, concludes this meme.
 
 thanks,
 Arun
 
 On May 15, 2013, at 2:20 PM, Arun C Murthy wrote:
 
 Great summary, thanks Vinod.
 
 On May 15, 2013, at 2:14 PM, Vinod Kumar Vavilapalli wrote:
 
 
 Roman, I keep this same argument again and again. Should've refuted 
 earlier.
 
 Please list down all the issues that BigTop ran into *because of* new 
 features. You continue to argue that new features are destabilizing 2.0.*, 
 which I don't agree with at all. 2.0.3-alpha was the last time major 
 features got merged in, and we found blockers irrespective of those.
 
 MAPREDUCE-5240 specifically isn't due to any feature merge. This was a 
 bug. I'd say this is a long standing bug in 2.0.x. You sure this passed in 
 2.0.3? Even so, this is mostly broken by another bug-fix and *not* because 
 of any feature.
 
 I quickly checked other bugs you reported in 2.0.x:
 - MAPREDUCE-5088 was caused by the fix for HADOOP-9299 which was again a 
 long standing issue in 2.0.x
 - MAPREDUCE-3728 is similar
 - MAPREDUCE-5117 is similar
 - MAPREDUCE-4219 was a security related feature request from you.
 - MAPREDUCE-3916 was because of new proxy-server added.
 
 I am not arguing that new features *may* destabilize the branch, but 
 you've repeatedly stated this as if that were a fact.
 
 Really appreciate the testing done by BigTop, but please don't distort the 
 facts.
 
 Thanks,
 +Vinod
 
 
 On May 15, 2013, at 1:29 PM, Roman Shaposhnik wrote:
 
 Please tell me if my expectations are incorrect, but to me the -beta would
 signify it being a 'safe' target for the downstream components. We're 
 still
 finding *very* basic and *very* disruptive issues (MAPREDUCE-5240 is
 a good example) that essentially mean DOA for downstream that depends
 on this functionality.
 
 Are we comfortable with delivering 2.0.5-beta and later on starting
 to discover things like MAPREDUCE-5240 more or less accidentally?
 
 
 

--
Arun C. Murthy
Hortonworks Inc.
http://hortonworks.com/




Re: [VOTE] - Release 2.0.5-beta

2013-05-15 Thread Matt Foley
Roman, what is your model for how test results from Bigtop should feed back
into Hadoop-2 development?
With the understanding that (a) software does have bugs, and (b) you're not
going to get an SLA on community-sponsored software,
what are your ideas for how to close the loop better?

Would CI runs of Bigtop against branch-2 be feasible, as Arun suggests?
How should we accommodate changes in individual components (Hadoop Core, but
others as well) that may require changes in one or more other components?
How does Bigtop keep doing a viable nightly build in that chaotic
environment?
Is this a previously solved problem?

Thanks,
--Matt



Re: [VOTE] - Release 2.0.5-beta

2013-05-15 Thread Matt Foley
I'm actually drafting such a proposal.  Will open the discussion as a
[PROPOSAL] in general@
--Matt



Re: [VOTE] - Release 2.0.5-beta

2013-05-15 Thread Suresh Srinivas
On Wed, May 15, 2013 at 9:25 PM, Andrew Purtell apurt...@apache.org wrote:

 The other thread or vote or whatever at least served the purpose in fresh
 surfacing of concerns. Talk of new features going in to a beta on a very
 short timetable is concerning for anyone with experience working on
 large software projects. It's not a little ironic that this vote thread,
 done in response to sort out the other one predicated on stability
 concerns, begins with a laundry list of features and JIRAs to go in. I
 think it is usually the case that a beta release receives only bugfixes*
 over the alpha that preceded it. This may just be a lack of consensus on
 what beta means.


Assuming that you are talking about HDFS features when you say
"features going in to a beta on a very short timetable",
"laundry list" etc., I request you to take a cursory look at the development
of these features.

Snapshots have been in development since November 2012, excluding the early
prototype that happened in May 2012. Most of the development
was complete by early February, except for the support of the rename
capability, which has been tricky. As regards Windows support, this is
work that has been happening for more than a year in many other branches.

So these features are not something that are impulsively developed
and irresponsibly pushed to a release. They have gone through
considerable testing and have been developed over a long time.


 Please set aside discussion on particular features or Hadoop bylaws or
 politics or debate club. I can't speak for all of downstream of course, but
 to the extent that I can I can say we don't care about that. The core ask,
 at least mine, is take a fresh look at reducing per-release disruptions to
 the rest of the entire ecosystem that has grown up around Hadoop.


What is the disruption you anticipate due to the current content of
the release?

If it is stability, I am confident that very few bugs will come out
of these features and stability should not be affected. This has been
the case for the HDFS features for many years. The development
is generally done in a feature branch, the feature is tested and stabilized
in that branch before merging to trunk. This is contrary to a few people's
incorrect claims that it has taken a long time to stabilize HDFS
features in branch-2.

Needless to say stability is not just a concern of downstream projects. We
spend long hours, day in day out, trying to ensure features are stable as
core contributors.

Regards,
Suresh