Re: [DISCUSS] A final minor release off branch-2?
Hi Andrew,

bq. Source and binary compatibility are not required for 3.0.0. It's a new major release, and there are known, documented incompatibilities in this regard.

Technically, that is true. In practice, however, we should retain compatibility as much as we can. Otherwise, we could unintentionally break downstream projects, third-party libraries, and existing user applications. A quick example is the blocker issue I just reported in HADOOP-15059, which breaks old (2.x) MR applications on a 3.0 deployment due to a token format incompatibility.

bq. To follow up on my earlier email, I don't think there's need for a bridge release given that we've successfully tested rolling upgrade from 2.x to 3.0.0.

Did we find the same issue as HADOOP-15059? If so, I'm just curious what rolling upgrade means here - IMO, an upgrade that breaks running applications shouldn't be called "rolling". Am I missing anything?

Thanks,

Junping

From: Andrew Wang <andrew.w...@cloudera.com>
Sent: Wednesday, November 15, 2017 10:34 AM
To: Junping Du
Cc: Wangda Tan; Steve Loughran; Vinod Kumar Vavilapalli; Kai Zheng; Arun Suresh; common-dev@hadoop.apache.org; yarn-...@hadoop.apache.org; Hdfs-dev; mapreduce-...@hadoop.apache.org
Subject: Re: [DISCUSS] A final minor release off branch-2?

Hi Junping,

On Wed, Nov 15, 2017 at 1:37 AM, Junping Du <j...@hortonworks.com> wrote:

> Thanks Vinod for bringing up this discussion, which is just in time.
>
> I agree with most responses that option C is not a good choice, as our community bandwidth is precious and we should focus on a very limited set of mainstream branches to develop, test, and deploy. Of course, we should still follow the Apache way and allow any interested committer to roll his/her own release given a specific requirement beyond the mainstream releases.
>
> I am not biased towards option A or B (I will discuss this later), but I think a bridge release for upgrading to, and rolling back from, 3.x is very necessary. The reasons are as follows:
>
> 1. Given the lessons learned from the migration from 1.x to 2.x, no matter how careful we try to be, there is still a chance that some level of compatibility (source, binary, configuration, etc.) gets broken in the migration to a new major release. Some of these incompatibilities can only be identified at runtime, after the GA release is widely deployed in production clusters - we have tons of downstream projects and numerous configurations, and we cannot cover them all with in-house deployment and testing.

Source and binary compatibility are not required for 3.0.0. It's a new major release, and there are known, documented incompatibilities in this regard.

That said, we've done far, far more in this regard compared to previous major or minor releases. We've compiled all of CDH against Hadoop 3 and run our suite of system tests for the platform. We've been testing in this way since 3.0.0-alpha1 and found and fixed plenty of source and binary compatibility issues during the alpha and beta process. Many of these fixes trickled down into 2.8 and 2.9.

> 2. From the recent classpath isolation work, I was surprised to find that many of our downstream projects (HBase, Tez, etc.) are still consuming non-public, server-side APIs of Hadoop, to say nothing of projects/products outside the Hadoop ecosystem. Our API compatibility tests do not (and should not) cover these cases. We can claim that a new major release shouldn't be responsible for these private API changes. But given the possibility of breaking existing applications in some way, users could be very hesitant to migrate to a 3.x release if there is no safe way to roll back.

This is true for 2.x releases as well. Similar to the previous answer, we've compiled all of CDH against Hadoop 3, providing a much higher level of assurance even compared to 2.x releases.

> 3. Besides incompatibilities, it is also possible for new Hadoop releases to have performance regressions (lower throughput, higher latency, slower job runs, a bigger memory footprint, or even memory leaks, etc.). While the performance impact of migration (if any) could be negligible to some users, other users could be very sensitive and wish to roll back if a regression hits their production cluster.

Yes, bugs exist. I won't claim that 3.0.0 is bug-free. All new releases can potentially introduce new bugs. However, I don't think rollback is the solution. In my experience, users rarely roll back since it's so disruptive and causes data loss. It's much more common that they patch and upgrade. With that in mind, I'd rather we spend our effort on making 3.0.x high-quality vs. making it easier to roll back.

The root of my concern in announcing a "bridge release" is that it discourages users from upgrading to 3.0.0 until a bridge release is out. I strongly believe the level of quality provided by 3.0.0 is at least equal to new 2.x minor releases, given our extended testing and integration process, and we don't have bridge releases for 2.x.
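The HADOOP-15059 class of breakage Junping describes - a serialized format gaining data that old clients cannot parse - can be illustrated with a self-contained sketch. This is plain java.io, not Hadoop's actual Token code; all class and field names here are hypothetical:

```java
import java.io.*;

// Hypothetical sketch (not Hadoop's real Token serialization): shows how a
// format change in a new writer breaks an old, strict reader - the class of
// problem reported in HADOOP-15059 for 2.x MR apps on a 3.0 deployment.
public class TokenFormatDemo {
    // "New" (3.x-style) writer: appends an extra field to the record.
    static byte[] writeNewFormat(String identifier) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bos);
        out.writeUTF(identifier);
        out.writeLong(42L);  // new field the 2.x-era reader knows nothing about
        out.close();
        return bos.toByteArray();
    }

    // "Old" (2.x-style) reader: expects exactly one field, then end-of-stream.
    static String readOldFormat(byte[] data) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
        String id = in.readUTF();
        if (in.available() > 0) {
            throw new IOException("trailing bytes: format not understood by old client");
        }
        return id;
    }

    public static void main(String[] args) throws IOException {
        try {
            readOldFormat(writeNewFormat("mr-app-token"));
        } catch (IOException e) {
            System.out.println("old client failed: " + e.getMessage());
        }
    }
}
```

An old reader that insists on consuming the stream exactly fails the moment a new field appears, which is why wire and token formats need an explicit versioning or unknown-field-tolerance strategy before a major upgrade can be rolling.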
Re: [DISCUSS] A final minor release off branch-2?
> From the recent classpath isolation work, I was surprised to find that many of our downstream projects (HBase, Tez, etc.) are still consuming non-public, server-side APIs of Hadoop, to say nothing of projects/products outside the Hadoop ecosystem. Our API compatibility tests do not (and should not) cover these cases. We can claim that a new major release shouldn't be responsible for these private API changes.

Would you consider filing HBase JIRAs for what are, in your opinion, the worst offenses? We can at least take a look.

On Wed, Nov 15, 2017 at 1:37 AM, Junping Du <j...@hortonworks.com> wrote:

> Thanks Vinod for bringing up this discussion, which is just in time.
>
> I agree with most responses that option C is not a good choice, as our community bandwidth is precious and we should focus on a very limited set of mainstream branches to develop, test, and deploy. Of course, we should still follow the Apache way and allow any interested committer to roll his/her own release given a specific requirement beyond the mainstream releases.
>
> I am not biased towards option A or B (I will discuss this later), but I think a bridge release for upgrading to, and rolling back from, 3.x is very necessary. The reasons are as follows:
>
> 1. Given the lessons learned from the migration from 1.x to 2.x, no matter how careful we try to be, there is still a chance that some level of compatibility (source, binary, configuration, etc.) gets broken in the migration to a new major release. Some of these incompatibilities can only be identified at runtime, after the GA release is widely deployed in production clusters - we have tons of downstream projects and numerous configurations, and we cannot cover them all with in-house deployment and testing.
>
> 2. From the recent classpath isolation work, I was surprised to find that many of our downstream projects (HBase, Tez, etc.) are still consuming non-public, server-side APIs of Hadoop, to say nothing of projects/products outside the Hadoop ecosystem. Our API compatibility tests do not (and should not) cover these cases. We can claim that a new major release shouldn't be responsible for these private API changes. But given the possibility of breaking existing applications in some way, users could be very hesitant to migrate to a 3.x release if there is no safe way to roll back.
>
> 3. Besides incompatibilities, it is also possible for new Hadoop releases to have performance regressions (lower throughput, higher latency, slower job runs, a bigger memory footprint, or even memory leaks, etc.). While the performance impact of migration (if any) could be negligible to some users, other users could be very sensitive and wish to roll back if a regression hits their production cluster.
>
> As Andrew mentioned in earlier email threads, some work has been done to verify rolling upgrade from 2.x to 3.0 (just curious: which 2.x release was tested as the upgrade source? 2.8.2, or 2.9.0 which is still being released?). But I am not aware of any work being done now to test downgrade from 3.0 to 2.x (correct me if I've missed any). If users hit any of the three situations mentioned above, we should give them the chance to roll back if they are wary of these unexpected side effects of upgrading. Given this, we should have a bridge release to cover safe rollback from 3.0 (rolling or not). I am not sure whether it should be 2.9.x or 2.10.x for now (we can just call it the 2.BR release), because we are not yet sure exactly which changes we should include to support rollback from 3.0. We can defer this decision until we have better ideas.
>
> Summary, for my two cents:
>
> - No more feature releases should happen on branch-2. 2.9 or 2.10 should be the last minor release (mainstream of the community) on branch-2.
>
> - A bridge release is necessary for safe upgrade/downgrade to 3.x.
>
> - We can decide later whether 2.10 is necessary, once the scope of the bridge release is clearer.
>
> Thanks,
>
> Junping
>
> From: Andrew Wang <andrew.w...@cloudera.com>
> Sent: Tuesday, November 14, 2017 2:25 PM
> To: Wangda Tan
> Cc: Steve Loughran; Vinod Kumar Vavilapalli; Kai Zheng; Arun Suresh; common-dev@hadoop.apache.org; yarn-...@hadoop.apache.org; Hdfs-dev; mapreduce-...@hadoop.apache.org
> Subject: Re: [DISCUSS] A final minor release off branch-2?
>
> To follow up on my earlier email, I don't think there's a need for a bridge release given that we've successfully tested rolling upgrade from 2.x to 3.0.0. I expect we'll keep making improvements to smooth over any additional incompatibilities found, but there isn't a requirement that a user upgrade to a bridge release before upgrading to 3.0.
Re: [DISCUSS] A final minor release off branch-2?
On 11/15/17 10:34 AM, Andrew Wang wrote:

> Hi Junping,
>
> On Wed, Nov 15, 2017 at 1:37 AM, Junping Du wrote:
>
>> 3. Besides incompatibilities, it is also possible for new Hadoop releases to have performance regressions (lower throughput, higher latency, slower job runs, a bigger memory footprint, or even memory leaks, etc.). While the performance impact of migration (if any) could be negligible to some users, other users could be very sensitive and wish to roll back if a regression hits their production cluster.
>
> Yes, bugs exist. I won't claim that 3.0.0 is bug-free. All new releases can potentially introduce new bugs. However, I don't think rollback is the solution. In my experience, users rarely roll back since it's so disruptive and causes data loss. It's much more common that they patch and upgrade. With that in mind, I'd rather we spend our effort on making 3.0.x high-quality vs. making it easier to roll back.
>
> The root of my concern in announcing a "bridge release" is that it discourages users from upgrading to 3.0.0 until a bridge release is out. I strongly believe the level of quality provided by 3.0.0 is at least equal to new 2.x minor releases, given our extended testing and integration process, and we don't have bridge releases for 2.x. This is why I asked for a list of known issues with 2.x -> 3.0 upgrades, that would necessitate a bridge release.
>
> Arun raised a concern about NM rollback. Are there any other *known* issues?

While going over the JACC report as part of YARN-6142, I filed HADOOP-14534, MAPREDUCE-6902, and YARN-6717 to document the major issues that I ran across. I think we found one or two other JIRAs which we marked as incompatible as part of this investigation.

The protobuf changes should be forward compatible going from 2.8.0 to 3.0.0. YARN-6798 should fix the NM state store versioning when upgrading from 2.9.0 to 3.0.0. 2.8.0 to 3.0.0 could have an issue if the relevant features are enabled (queued containers, work-preserving NM restart w/AMRMProxy).

-Ray

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org
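Ray's point that the protobuf changes are forward compatible rests on a general property of protobuf-style encodings: readers skip fields with tag numbers they don't recognize rather than failing. A minimal tag/value sketch of that property in plain Java - illustrative only, not the real protobuf wire format:

```java
import java.io.*;

// Sketch of forward compatibility via unknown-field skipping, the property
// that lets a 2.8.0-era reader handle records written by 3.0.0. The framing
// here (int tag + UTF value) is invented for illustration; real protobuf
// uses varint tags and per-type wire formats.
public class ForwardCompatDemo {
    // "New" writer adds a field (tag 2) that older readers do not know about.
    static byte[] write() throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bos);
        out.writeInt(1); out.writeUTF("known-field");
        out.writeInt(2); out.writeUTF("new-in-3.0");
        out.close();
        return bos.toByteArray();
    }

    // "Old" reader: silently skips unrecognized tags instead of erroring out.
    static String readOld(byte[] data) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
        String known = null;
        while (in.available() > 0) {
            int tag = in.readInt();
            String value = in.readUTF();
            if (tag == 1) known = value;  // unknown tags fall through harmlessly
        }
        return known;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(readOld(write()));  // prints "known-field"
    }
}
```

The contrast with a strict reader is the whole story: skip-unknown formats survive new fields, strict formats do not, which is why protobuf-carried RPCs upgrade cleanly while hand-rolled serialization is where the incompatibilities tend to hide.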
Re: [DISCUSS] A final minor release off branch-2?
On 15 Nov 2017, at 09:37, Junping Du wrote:

> 2. From the recent classpath isolation work, I was surprised to find that many of our downstream projects (HBase, Tez, etc.) are still consuming non-public, server-side APIs of Hadoop, to say nothing of projects/products outside the Hadoop ecosystem. Our API compatibility tests do not (and should not) cover these cases. We can claim that a new major release shouldn't be responsible for these private API changes. But given the possibility of breaking existing applications in some way, users could be very hesitant to migrate to a 3.x release if there is no safe way to roll back.

The problem here is that it has historically been impossible to write a YARN app without using APIs marked as private. I'll cite UGI as an example, but if you look at the distributed shell example, it's happily calling things like NMClientAsyncImpl. We lack the moral high ground to tell other projects off.
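Steve's point is that audience annotations like Hadoop's @InterfaceAudience.Private are advisory: nothing at compile time or runtime stops a downstream app from linking against a server-side class. A minimal self-contained sketch of the convention - the annotation and the class below are stand-ins invented for illustration, not Hadoop's real ones:

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;

// Stand-in for org.apache.hadoop.classification.InterfaceAudience.Private.
@Retention(RetentionPolicy.RUNTIME)
@interface Private {}

// Stand-in for a server-side class such as NMClientAsyncImpl.
@Private
class NmInternalClient {
    String startContainer() { return "started"; }
}

public class AudienceDemo {
    public static void main(String[] args) {
        // A downstream "app" calling the @Private class compiles and runs
        // fine - the annotation carries no enforcement, so API compatibility
        // tooling that honors it never sees these usages.
        NmInternalClient client = new NmInternalClient();
        System.out.println(client.startContainer()
            + " (annotated @Private: "
            + NmInternalClient.class.isAnnotationPresent(Private.class) + ")");
    }
}
```

Enforcement would require build-time tooling (or shading/classpath isolation) on the downstream side, which is exactly why the classpath isolation work surfaced so many of these usages.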
Re: [DISCUSS] A final minor release off branch-2?
Hi Junping,

On Wed, Nov 15, 2017 at 1:37 AM, Junping Du wrote:

> Thanks Vinod for bringing up this discussion, which is just in time.
>
> I agree with most responses that option C is not a good choice, as our community bandwidth is precious and we should focus on a very limited set of mainstream branches to develop, test, and deploy. Of course, we should still follow the Apache way and allow any interested committer to roll his/her own release given a specific requirement beyond the mainstream releases.
>
> I am not biased towards option A or B (I will discuss this later), but I think a bridge release for upgrading to, and rolling back from, 3.x is very necessary. The reasons are as follows:
>
> 1. Given the lessons learned from the migration from 1.x to 2.x, no matter how careful we try to be, there is still a chance that some level of compatibility (source, binary, configuration, etc.) gets broken in the migration to a new major release. Some of these incompatibilities can only be identified at runtime, after the GA release is widely deployed in production clusters - we have tons of downstream projects and numerous configurations, and we cannot cover them all with in-house deployment and testing.

Source and binary compatibility are not required for 3.0.0. It's a new major release, and there are known, documented incompatibilities in this regard.

That said, we've done far, far more in this regard compared to previous major or minor releases. We've compiled all of CDH against Hadoop 3 and run our suite of system tests for the platform. We've been testing in this way since 3.0.0-alpha1 and found and fixed plenty of source and binary compatibility issues during the alpha and beta process. Many of these fixes trickled down into 2.8 and 2.9.

> 2. From the recent classpath isolation work, I was surprised to find that many of our downstream projects (HBase, Tez, etc.) are still consuming non-public, server-side APIs of Hadoop, to say nothing of projects/products outside the Hadoop ecosystem. Our API compatibility tests do not (and should not) cover these cases. We can claim that a new major release shouldn't be responsible for these private API changes. But given the possibility of breaking existing applications in some way, users could be very hesitant to migrate to a 3.x release if there is no safe way to roll back.

This is true for 2.x releases as well. Similar to the previous answer, we've compiled all of CDH against Hadoop 3, providing a much higher level of assurance even compared to 2.x releases.

> 3. Besides incompatibilities, it is also possible for new Hadoop releases to have performance regressions (lower throughput, higher latency, slower job runs, a bigger memory footprint, or even memory leaks, etc.). While the performance impact of migration (if any) could be negligible to some users, other users could be very sensitive and wish to roll back if a regression hits their production cluster.

Yes, bugs exist. I won't claim that 3.0.0 is bug-free. All new releases can potentially introduce new bugs. However, I don't think rollback is the solution. In my experience, users rarely roll back since it's so disruptive and causes data loss. It's much more common that they patch and upgrade. With that in mind, I'd rather we spend our effort on making 3.0.x high-quality vs. making it easier to roll back.

The root of my concern in announcing a "bridge release" is that it discourages users from upgrading to 3.0.0 until a bridge release is out. I strongly believe the level of quality provided by 3.0.0 is at least equal to new 2.x minor releases, given our extended testing and integration process, and we don't have bridge releases for 2.x. This is why I asked for a list of known issues with 2.x -> 3.0 upgrades, that would necessitate a bridge release.

Arun raised a concern about NM rollback. Are there any other *known* issues?

Best,
Andrew
Re: [DISCUSS] A final minor release off branch-2?
Thanks Vinod for bringing up this discussion, which is just in time.

I agree with most responses that option C is not a good choice, as our community bandwidth is precious and we should focus on a very limited set of mainstream branches to develop, test, and deploy. Of course, we should still follow the Apache way and allow any interested committer to roll his/her own release given a specific requirement beyond the mainstream releases.

I am not biased towards option A or B (I will discuss this later), but I think a bridge release for upgrading to, and rolling back from, 3.x is very necessary. The reasons are as follows:

1. Given the lessons learned from the migration from 1.x to 2.x, no matter how careful we try to be, there is still a chance that some level of compatibility (source, binary, configuration, etc.) gets broken in the migration to a new major release. Some of these incompatibilities can only be identified at runtime, after the GA release is widely deployed in production clusters - we have tons of downstream projects and numerous configurations, and we cannot cover them all with in-house deployment and testing.

2. From the recent classpath isolation work, I was surprised to find that many of our downstream projects (HBase, Tez, etc.) are still consuming non-public, server-side APIs of Hadoop, to say nothing of projects/products outside the Hadoop ecosystem. Our API compatibility tests do not (and should not) cover these cases. We can claim that a new major release shouldn't be responsible for these private API changes. But given the possibility of breaking existing applications in some way, users could be very hesitant to migrate to a 3.x release if there is no safe way to roll back.

3. Besides incompatibilities, it is also possible for new Hadoop releases to have performance regressions (lower throughput, higher latency, slower job runs, a bigger memory footprint, or even memory leaks, etc.). While the performance impact of migration (if any) could be negligible to some users, other users could be very sensitive and wish to roll back if a regression hits their production cluster.

As Andrew mentioned in earlier email threads, some work has been done to verify rolling upgrade from 2.x to 3.0 (just curious: which 2.x release was tested as the upgrade source? 2.8.2, or 2.9.0 which is still being released?). But I am not aware of any work being done now to test downgrade from 3.0 to 2.x (correct me if I've missed any). If users hit any of the three situations mentioned above, we should give them the chance to roll back if they are wary of these unexpected side effects of upgrading. Given this, we should have a bridge release to cover safe rollback from 3.0 (rolling or not). I am not sure whether it should be 2.9.x or 2.10.x for now (we can just call it the 2.BR release), because we are not yet sure exactly which changes we should include to support rollback from 3.0. We can defer this decision until we have better ideas.

Summary, for my two cents:

- No more feature releases should happen on branch-2. 2.9 or 2.10 should be the last minor release (mainstream of the community) on branch-2.

- A bridge release is necessary for safe upgrade/downgrade to 3.x.

- We can decide later whether 2.10 is necessary, once the scope of the bridge release is clearer.

Thanks,

Junping

From: Andrew Wang <andrew.w...@cloudera.com>
Sent: Tuesday, November 14, 2017 2:25 PM
To: Wangda Tan
Cc: Steve Loughran; Vinod Kumar Vavilapalli; Kai Zheng; Arun Suresh; common-dev@hadoop.apache.org; yarn-...@hadoop.apache.org; Hdfs-dev; mapreduce-...@hadoop.apache.org
Subject: Re: [DISCUSS] A final minor release off branch-2?

To follow up on my earlier email, I don't think there's a need for a bridge release given that we've successfully tested rolling upgrade from 2.x to 3.0.0. I expect we'll keep making improvements to smooth over any additional incompatibilities found, but there isn't a requirement that a user upgrade to a bridge release before upgrading to 3.0.

Otherwise, I don't have a strong opinion about when to discontinue branch-2 releases. Historically, a release line is maintained until interest in it wanes. If the maintainers are taking care of the backports, it's not much work for the rest of us to vote on the RCs.

Best,
Andrew

On Mon, Nov 13, 2017 at 4:19 PM, Wangda Tan <wheele...@gmail.com> wrote:

> Thanks Vinod for starting this,
>
> I'm also leaning towards plan (A):
>
> (A)
> -- Make 2.9.x the last minor release off branch-2
> -- Have a maintenance release that bridges 2.9 to 3.x
> -- Continue to make more maintenance releases on 2.8 and 2.9 as necessary
>
> The only part I'm not sure about is having a separate bridge release other than 3.x.
>
> For the bridge release, Steve's suggestion sounds more doable:
>
> * 3.1+ for new features
> * fixes to 3.0.x &, where appropriate, 2.9, esp feature stabilisation
> * whoever puts their hand up to do 2.x releases deserves support in testing
> * If someone makes a really strong case to backport a feature from 3.x to branch-2 and it's backwards compatible, I'm not going to stop them. It's just once 3.0 is out and a 3.1 on the way, it's less compelling
Re: [DISCUSS] A final minor release off branch-2?
To follow up on my earlier email, I don't think there's a need for a bridge release given that we've successfully tested rolling upgrade from 2.x to 3.0.0. I expect we'll keep making improvements to smooth over any additional incompatibilities found, but there isn't a requirement that a user upgrade to a bridge release before upgrading to 3.0.

Otherwise, I don't have a strong opinion about when to discontinue branch-2 releases. Historically, a release line is maintained until interest in it wanes. If the maintainers are taking care of the backports, it's not much work for the rest of us to vote on the RCs.

Best,
Andrew

On Mon, Nov 13, 2017 at 4:19 PM, Wangda Tan wrote:

> Thanks Vinod for starting this,
>
> I'm also leaning towards plan (A):
>
> (A)
> -- Make 2.9.x the last minor release off branch-2
> -- Have a maintenance release that bridges 2.9 to 3.x
> -- Continue to make more maintenance releases on 2.8 and 2.9 as necessary
>
> The only part I'm not sure about is having a separate bridge release other than 3.x.
>
> For the bridge release, Steve's suggestion sounds more doable:
>
> * 3.1+ for new features
> * fixes to 3.0.x &, where appropriate, 2.9, esp feature stabilisation
> * whoever puts their hand up to do 2.x releases deserves support in testing
> * If someone makes a really strong case to backport a feature from 3.x to branch-2 and it's backwards compatible, I'm not going to stop them. It's just once 3.0 is out and a 3.1 on the way, it's less compelling
>
> This lets the community focus on 3.x releases and fill whatever gaps remain in migrating from 2.x to 3.x.
>
> Best,
> Wangda
>
> On Wed, Nov 8, 2017 at 3:57 AM, Steve Loughran wrote:
>
>> On 7 Nov 2017, at 19:08, Vinod Kumar Vavilapalli wrote:
>>
>>> Frankly speaking, working on some bridging release not targeting any feature isn't so attractive to me as a contributor. Overall, the final minor release off branch-2 is good, we should also give 3.x more time to evolve and mature, therefore it looks to me we would have to work on two release lines meanwhile for some time. I'd like option C), and suggest we focus on the recent releases.
>>
>> Answering this question is also one of the goals of my starting this thread. Collectively we need to conclude if we are okay or not okay with no longer putting any new feature work in general on the 2.x line after the 2.9.0 release and moving our focus over to 3.0.
>>
>> Thanks
>> +Vinod
>
> As a developer of new features (e.g. the Hadoop S3A committers), I'm mostly already committed to targeting 3.1; the code in there to deal with failures and retries has unashamedly embraced Java 8 lambda expressions in production code: backporting that is going to be traumatic in terms of IDE-assisted code changes and the resultant diff in source between branch-2 & trunk. What's worse, it's going to be traumatic to test, as all my JVMs start with an 8 at the moment, and I'm starting to worry about whether I should bump a Windows VM up to Java 9 to keep an eye on Akira's work there. Currently the only testing I'm really doing on Java 7 is Yetus branch-2 & internal test runs.
>
> 3.0 will be out the door, and we can assume that CDH will ship with it soon (*), which will allow for a rapid round-trip time on inevitable bugs: 3.1 can be the release with compatibility tuned and those reported issues addressed. It's certainly where I'd like to focus.
>
> At the same time: 2.7.2-2.8.x are the broadly used versions; we can't just say "move to 3.0" & expect everyone to do it, not given we have explicitly got backwards-incompatible changes in. I don't see people rushing to do it until the layers above are all qualified (HBase, Hive, Spark, ...). Which means big users of 2.7/2.8 won't be in a rush to move, and we are going to have to maintain 2.x for a while, including security patches for old versions. One issue there: what if a patch (such as bumping up a JAR version) is incompatible?
>
> For me then
>
> * 3.1+ for new features
> * fixes to 3.0.x &, where appropriate, 2.9, esp feature stabilisation
> * whoever puts their hand up to do 2.x releases deserves support in testing
> * If someone makes a really strong case to backport a feature from 3.x to branch-2 and it's backwards compatible, I'm not going to stop them. It's just once 3.0 is out and a 3.1 on the way, it's less compelling
>
> -Steve
>
> Note: I'm implicitly assuming a timely 3.1 out the door with my work included, and all issues arriving from 3.0 fixed. We can worry when 3.1 ships whether there's any benefit in maintaining a 3.0.x, or whether it's best to say "move to 3.1".
>
> (*) just a guess based on the effort & test reports of Andrew & others
Re: [DISCUSS] A final minor release off branch-2?
Thanks Vinod for starting this,

I'm also leaning towards plan (A):

(A)
-- Make 2.9.x the last minor release off branch-2
-- Have a maintenance release that bridges 2.9 to 3.x
-- Continue to make more maintenance releases on 2.8 and 2.9 as necessary

The only part I'm not sure about is having a separate bridge release other than 3.x.

For the bridge release, Steve's suggestion sounds more doable:

* 3.1+ for new features
* fixes to 3.0.x &, where appropriate, 2.9, esp feature stabilisation
* whoever puts their hand up to do 2.x releases deserves support in testing
* If someone makes a really strong case to backport a feature from 3.x to branch-2 and it's backwards compatible, I'm not going to stop them. It's just once 3.0 is out and a 3.1 on the way, it's less compelling

This lets the community focus on 3.x releases and fill whatever gaps remain in migrating from 2.x to 3.x.

Best,
Wangda

On Wed, Nov 8, 2017 at 3:57 AM, Steve Loughran wrote:

> On 7 Nov 2017, at 19:08, Vinod Kumar Vavilapalli wrote:
>
>> Frankly speaking, working on some bridging release not targeting any feature isn't so attractive to me as a contributor. Overall, the final minor release off branch-2 is good, we should also give 3.x more time to evolve and mature, therefore it looks to me we would have to work on two release lines meanwhile for some time. I'd like option C), and suggest we focus on the recent releases.
>
> Answering this question is also one of the goals of my starting this thread. Collectively we need to conclude if we are okay or not okay with no longer putting any new feature work in general on the 2.x line after the 2.9.0 release and moving our focus over to 3.0.
>
> Thanks
> +Vinod

As a developer of new features (e.g. the Hadoop S3A committers), I'm mostly already committed to targeting 3.1; the code in there to deal with failures and retries has unashamedly embraced Java 8 lambda expressions in production code: backporting that is going to be traumatic in terms of IDE-assisted code changes and the resultant diff in source between branch-2 & trunk. What's worse, it's going to be traumatic to test, as all my JVMs start with an 8 at the moment, and I'm starting to worry about whether I should bump a Windows VM up to Java 9 to keep an eye on Akira's work there. Currently the only testing I'm really doing on Java 7 is Yetus branch-2 & internal test runs.

3.0 will be out the door, and we can assume that CDH will ship with it soon (*), which will allow for a rapid round-trip time on inevitable bugs: 3.1 can be the release with compatibility tuned and those reported issues addressed. It's certainly where I'd like to focus.

At the same time: 2.7.2-2.8.x are the broadly used versions; we can't just say "move to 3.0" & expect everyone to do it, not given we have explicitly got backwards-incompatible changes in. I don't see people rushing to do it until the layers above are all qualified (HBase, Hive, Spark, ...). Which means big users of 2.7/2.8 won't be in a rush to move, and we are going to have to maintain 2.x for a while, including security patches for old versions. One issue there: what if a patch (such as bumping up a JAR version) is incompatible?

For me then

* 3.1+ for new features
* fixes to 3.0.x &, where appropriate, 2.9, esp feature stabilisation
* whoever puts their hand up to do 2.x releases deserves support in testing
* If someone makes a really strong case to backport a feature from 3.x to branch-2 and it's backwards compatible, I'm not going to stop them. It's just once 3.0 is out and a 3.1 on the way, it's less compelling

-Steve

Note: I'm implicitly assuming a timely 3.1 out the door with my work included, and all issues arriving from 3.0 fixed. We can worry when 3.1 ships whether there's any benefit in maintaining a 3.0.x, or whether it's best to say "move to 3.1".

(*) just a guess based on the effort & test reports of Andrew & others
Re: [DISCUSS] A final minor release off branch-2?
> On 7 Nov 2017, at 19:08, Vinod Kumar Vavilapalliwrote: > > > > >> Frankly speaking, working on some bridging release not targeting any feature >> isn't so attractive to me as a contributor. Overall, the final minor release >> off branch-2 is good, we should also give 3.x more time to evolve and >> mature, therefore it looks to me we would have to work on two release lines >> meanwhile for some time. I'd like option C), and suggest we focus on the >> recent releases. > > > > Answering this question is also one of the goals of my starting this thread. > Collectively we need to conclude if we are okay or not okay with no longer > putting any new feature work in general on the 2.x line after 2.9.0 release > and move over our focus into 3.0. > > > Thanks > +Vinod > As a developer of new features (e.g the Hadoop S3A committers), I'm mostly already committed to targeting 3.1; the code in there to deal with failures and retries has unashamedly embraced java 8 lambda-expressions in production code: backporting that is going to be traumatic in terms of IDE-assisted code changes and the resultant diff in source between branch-2 & trunk. What's worse, its going to be traumatic to test as all my JVMs start with an 8 at the moment, and I'm starting to worry about whether I should bump a windows VM up to Java 9 to keep an eye on Akira's work there. Currently the only testing I'm really doing on java 7 is yetus branch-2 & internal test runs. 3.0 will be out the door, and we can assume that CDH will ship with it soon (*) which will allow for a rapid round trip time on inevitable bugs: 3.1 can be the release with compatibility tuned, those reported issues addressed. It's certainly where I'd like to focus. At the same time: 2.7.2-2.8.x are the broadly used versions, we can't just say "move to 3.0" & expect everyone to do it, not given we have explicitly got backwards-incompatible changes in. 
I don't see people rushing to do it until the layers above are all qualified (HBase, Hive, Spark, ...), which means big users of 2.7/2.8 won't be in a rush to move, and we are going to have to maintain 2.x for a while, including security patches for old versions. One issue there: what if a patch (such as bumping up a JAR version) is incompatible? For me then:
* 3.1+ for new features
* fixes to 3.0.x and, where appropriate, 2.9, esp. feature stabilisation
* whoever puts their hand up to do 2.x releases deserves support in testing
* if someone makes a really strong case to backport a feature from 3.x to branch-2 and it's backwards compatible, I'm not going to stop them. It's just that once 3.0 is out and a 3.1 is on the way, it's less compelling.

-Steve

Note: I'm implicitly assuming a timely 3.1 out the door with my work included, and all issues arising from 3.0 fixed. We can worry when 3.1 ships whether there's any benefit in maintaining a 3.0.x, or whether it's best to say "move to 3.1".

(*) just a guess based on the effort & test reports of Andrew & others
Re: [DISCUSS] A final minor release off branch-2?
>> You mentioned rolling-upgrades: It will be good to exactly outline the type of testing. For e.g., the rolling-upgrades orchestration order has direct implication on the testing done. Complete details are available in HDFS-11096, where I'm trying to get scripts to automate these tests committed so we can run them on Jenkins. For HDFS, I follow the same order as the documentation. I did not see any documentation indicating when to upgrade the ZKFC daemons, so it is done at the end. I also did not see any documentation about a rolling upgrade for YARN, so I'm doing ResourceManagers first, then NodeManagers, basically following the pattern used in HDFS. I can't speak much about app compatibility in YARN, etc., but the rolling upgrade runs Terasuite from Hadoop 2 continually while doing the upgrade and for some time afterward. One incompatibility was found and fixed in trunk quite a while ago; that part of the test has been working well for quite a while now. >> Copying data between 2.x clusters and 3.x clusters: Does this work already? Is it broken anywhere that we cannot fix? Do we need bridging features for this work? HDFS-11096 also includes tests that data can be distcp'd over webhdfs:// to and from old and new clusters regardless of where the distcp job is launched from. I'll try a test run that uses hdfs:// this week, too. As part of that JIRA I also looked through all the protobufs for any discrepancies / incompatibilities. One was found and fixed, but the rest looked good to me. On Mon, Nov 6, 2017 at 6:42 PM, Vinod Kumar Vavilapalli wrote: > The main goal of the bridging release is to ease transition on stuff that > is guaranteed to be broken. > > Of the top of my head, one of the biggest areas is application > compatibility. When folks move from 2.x to 3.x, are their apps binary > compatible? Source compatible? Or need changes? > > In 1.x -> 2.x upgrade, we did a bunch of work to atleast make old apps be > source compatible. 
This means relooking at the API compatibility in 3.x and > their impact of migrating applications. We will have to revist and > un-deprecate old APIs, un-delete old APIs and write documentation on how > apps can be migrated. > > Most of this work will be in 3.x line. The bridging release on the other > hand will have deprecation for APIs that cannot be undeleted. This may be > already have been done in many places. But we need to make sure and fill > gaps if any. > > Other areas that I can recall from the old days > - Config migration: Many configs are deprecated or deleted. We need > documentation to help folks to move. We also need deprecations in the > bridging release for configs that cannot be undeleted. > - You mentioned rolling-upgrades: It will be good to exactly outline the > type of testing. For e.g., the rolling-upgrades orchestration order has > direct implication on the testing done. > - Story for downgrades? > - Copying data between 2.x clusters and 3.x clusters: Does this work > already? Is it broken anywhere that we cannot fix? Do we need bridging > features for this work? > > +Vinod > > > On Nov 6, 2017, at 12:49 PM, Andrew Wang > wrote: > > > > What are the known gaps that need bridging between 2.x and 3.x? > > > > From an HDFS perspective, we've tested wire compat, rolling upgrade, and > > rollback. > > > > From a YARN perspective, we've tested wire compat and rolling upgrade. > Arun > > just mentioned an NM rollback issue that I'm not familiar with. > > > > Anything else? External to this discussion, these should be documented as > > known issues for 3.0. > > > > Best. > > Andrew > > > > On Sun, Nov 5, 2017 at 1:46 PM, Arun Suresh wrote: > > > >> Thanks for starting this discussion VInod. > >> > >> I agree (C) is a bad idea. > >> I would prefer (A) given that ATM, branch-2 is still very close to > >> branch-2.9 - and it is a good time to make a collective decision to lock > >> down commits to branch-2. 
> >> > >> I think we should also clearly define what the 'bridging' release should > >> be. > >> I assume it means the following: > >> * Any 2.x user wanting to move to 3.x must first upgrade to the bridging > >> release first and then upgrade to the 3.x release. > >> * With regard to state store upgrades (at least NM state stores) the > >> bridging state stores should be aware of all new 3.x keys so the > implicit > >> assumption would be that a user can only rollback from the 3.x release > to > >> the bridging release and not to the old 2.x release. > >> * Use the opportunity to clean up deprecated API ? > >> * Do we even want to consider a separate bridging release for 2.7, 2.8 > an > >> 2.9 lines ? > >> > >> Cheers > >> -Arun > >> > >> On Fri, Nov 3, 2017 at 5:07 PM, Vinod Kumar Vavilapalli < > >> vino...@apache.org> > >> wrote: > >> > >>> Hi all, > >>> > >>> With 3.0.0 GA around the corner (tx for the push, Andrew!), 2.9.0 RC > out > >>> (tx Arun / Subru!) and 2.8.2 (tx
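The HDFS rolling-upgrade orchestration order described in this message can be sketched as a sequence of admin commands. This is a hedged outline following the Apache rolling-upgrade documentation, not a runnable script: it requires a live HA HDFS cluster, and the restart mechanics between the commands vary by deployment.

```shell
# Hedged sketch of the rolling-upgrade order described above.
hdfs dfsadmin -rollingUpgrade prepare    # create the rollback fsimage
hdfs dfsadmin -rollingUpgrade query      # repeat until it reports "Proceed with rolling upgrade"
# ...upgrade and restart the standby NameNode, fail over, upgrade the other
#    NameNode, then DataNodes in small batches, and (as described above,
#    absent documented guidance) the ZKFC daemons last...
hdfs dfsadmin -rollingUpgrade finalize   # only once the upgraded cluster is validated
```

Finalizing discards the rollback fsimage, so it is the point of no return for a downgrade.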
Re: [DISCUSS] A final minor release off branch-2?
Thanks for your comments, Zheng. Replies inline. > On the other hand, I've discussed with quite a few 3.0 potential users, it > looks like most of them are interested in the erasure coding feature and a > major scenario for that is to back up their large volume of data to save > storage cost. They might run analytics workload using Hive, Spark, Impala and > Kylin on the new cluster based on the version, but it's not a must at the > first time. They understand there might be some gaps so they'd migrate their > workloads incrementally. For the major analytics workload, we've performed > lots of benchmark and integration tests as well as other sides I believe, we > did find some issues but they should be fixed in downstream projects. I > thought the release of GA will accelerate the progress and expose the issues > if any. We couldn't wait for it being matured. There isn't perfectness. 3.0 is a GA release from the Apache Hadoop community. So, we cannot assume that all usages in the short term are *only* going to be for storage optimization features and only on dedicated clusters. We have to make sure that the workloads can be migrated right now and/or that existing clusters can be upgraded in-place. If not, we shouldn't be calling it GA. > This sounds a good consideration. I'm thinking if I'm a Hadoop user, for > example, I'm using 2.7.4 or 2.8.2 or whatever 2.x version, would I first > upgrade to this bridging release then use the bridge support to upgrade to > 3.x version? I'm not sure. On the other hand, I might tend to look for some > guides or supports in 3.x docs about how to upgrade from 2.7 to 3.x. Arun Suresh also asked this same question earlier. I think this will really depend on what we discover as part of the migration and user-acceptance testing. If we don't find major issues, you are right, folks can jump directly from one of 2.7, 2.8 or 2.9 to 3.0. 
> Frankly speaking, working on some bridging release not targeting any feature > isn't so attractive to me as a contributor. Overall, the final minor release > off branch-2 is good, we should also give 3.x more time to evolve and mature, > therefore it looks to me we would have to work on two release lines meanwhile > for some time. I'd like option C), and suggest we focus on the recent > releases. Answering this question is also one of the goals of my starting this thread. Collectively we need to conclude if we are okay or not okay with no longer putting any new feature work in general on the 2.x line after the 2.9.0 release and move our focus to 3.0. Thanks +Vinod > -Original Message- > From: Vinod Kumar Vavilapalli [mailto:vino...@apache.org] > Sent: Tuesday, November 07, 2017 9:43 AM > To: Andrew Wang <andrew.w...@cloudera.com> > Cc: Arun Suresh <asur...@apache.org>; common-dev@hadoop.apache.org; > yarn-...@hadoop.apache.org; Hdfs-dev <hdfs-...@hadoop.apache.org>; > mapreduce-...@hadoop.apache.org > Subject: Re: [DISCUSS] A final minor release off branch-2? > > The main goal of the bridging release is to ease transition on stuff that is > guaranteed to be broken. > > Of the top of my head, one of the biggest areas is application compatibility. > When folks move from 2.x to 3.x, are their apps binary compatible? Source > compatible? Or need changes? > > In 1.x -> 2.x upgrade, we did a bunch of work to atleast make old apps be > source compatible. This means relooking at the API compatibility in 3.x and > their impact of migrating applications. We will have to revist and > un-deprecate old APIs, un-delete old APIs and write documentation on how apps > can be migrated. > > Most of this work will be in 3.x line. The bridging release on the other hand > will have deprecation for APIs that cannot be undeleted. This may be already > have been done in many places. But we need to make sure and fill gaps if any. 
> > Other areas that I can recall from the old days > - Config migration: Many configs are deprecated or deleted. We need > documentation to help folks to move. We also need deprecations in the > bridging release for configs that cannot be undeleted. > - You mentioned rolling-upgrades: It will be good to exactly outline the type > of testing. For e.g., the rolling-upgrades orchestration order has direct > implication on the testing done. > - Story for downgrades? > - Copying data between 2.x clusters and 3.x clusters: Does this work already? > Is it broken anywhere that we cannot fix? Do we need bridging features for > this work? > > +Vinod > >> On Nov 6, 2017, at 12:49 PM, Andrew Wang <andrew.w...@cloudera.com> wrote: >> >> What are the known gaps that need bridging between 2.x and 3.x? >> >> From an HDFS perspective,
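The cross-cluster copy question quoted above ("Copying data between 2.x clusters and 3.x clusters: Does this work already?") was answered earlier in this thread by the HDFS-11096 distcp-over-webhdfs testing. A hedged sketch of that copy, with hypothetical hostnames and paths:

```shell
# Hedged sketch, not a tested recipe: copying a directory from a 2.x cluster
# to a 3.x cluster over webhdfs://, which is wire-compatible across major
# versions. Hostnames and paths are hypothetical; ports are the defaults
# (50070 for the 2.x NameNode web UI, 9870 after the 3.x port renumbering).
# Requires live clusters on both sides.
SRC="webhdfs://nn-2x.example.com:50070/data/backup"
DST="webhdfs://nn-3x.example.com:9870/data/backup"
# -update makes the copy resumable: files already present at the destination
# with matching size/checksum are skipped on a re-run.
hadoop distcp -update "$SRC" "$DST"
```

As noted above, the job can be launched from either cluster, since webhdfs:// does not require the client and the remote cluster to share an RPC version.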
RE: [DISCUSS] A final minor release off branch-2?
Thanks Vinod. >> Of the top of my head, one of the biggest areas is application >> compatibility. When folks move from 2.x to 3.x, are their apps binary >> compatible? Source compatible? Or need changes? I think these are good concerns from an overall perspective. On the other hand, I've discussed with quite a few potential 3.0 users, and it looks like most of them are interested in the erasure coding feature; a major scenario for it is backing up their large volume of data to save storage cost. They might run analytics workloads using Hive, Spark, Impala and Kylin on the new cluster based on the new version, but it's not a must at first. They understand there might be some gaps, so they'd migrate their workloads incrementally. For the major analytics workloads, we've performed lots of benchmark and integration tests, as have other parties I believe; we did find some issues, but they should be fixed in the downstream projects. I think the GA release will accelerate the progress and expose the issues, if any. We can't wait for it to fully mature; there is no perfect release. >> The main goal of the bridging release is to ease transition on stuff that is >> guaranteed to be broken. This sounds like a good consideration. I'm thinking: if I'm a Hadoop user - say I'm on 2.7.4 or 2.8.2 or whatever 2.x version - would I first upgrade to this bridging release and then use the bridge support to upgrade to a 3.x version? I'm not sure. On the other hand, I might tend to look for guides or support in the 3.x docs on how to upgrade from 2.7 to 3.x. Frankly speaking, working on some bridging release not targeting any feature isn't so attractive to me as a contributor. Overall, the final minor release off branch-2 is good, but we should also give 3.x more time to evolve and mature; therefore it looks to me we would have to work on two release lines meanwhile for some time. I'd like option C), and suggest we focus on the recent releases. Just some thoughts. 
Regards, Kai -Original Message- From: Vinod Kumar Vavilapalli [mailto:vino...@apache.org] Sent: Tuesday, November 07, 2017 9:43 AM To: Andrew Wang <andrew.w...@cloudera.com> Cc: Arun Suresh <asur...@apache.org>; common-dev@hadoop.apache.org; yarn-...@hadoop.apache.org; Hdfs-dev <hdfs-...@hadoop.apache.org>; mapreduce-...@hadoop.apache.org Subject: Re: [DISCUSS] A final minor release off branch-2? The main goal of the bridging release is to ease transition on stuff that is guaranteed to be broken. Of the top of my head, one of the biggest areas is application compatibility. When folks move from 2.x to 3.x, are their apps binary compatible? Source compatible? Or need changes? In 1.x -> 2.x upgrade, we did a bunch of work to atleast make old apps be source compatible. This means relooking at the API compatibility in 3.x and their impact of migrating applications. We will have to revist and un-deprecate old APIs, un-delete old APIs and write documentation on how apps can be migrated. Most of this work will be in 3.x line. The bridging release on the other hand will have deprecation for APIs that cannot be undeleted. This may be already have been done in many places. But we need to make sure and fill gaps if any. Other areas that I can recall from the old days - Config migration: Many configs are deprecated or deleted. We need documentation to help folks to move. We also need deprecations in the bridging release for configs that cannot be undeleted. - You mentioned rolling-upgrades: It will be good to exactly outline the type of testing. For e.g., the rolling-upgrades orchestration order has direct implication on the testing done. - Story for downgrades? - Copying data between 2.x clusters and 3.x clusters: Does this work already? Is it broken anywhere that we cannot fix? Do we need bridging features for this work? +Vinod > On Nov 6, 2017, at 12:49 PM, Andrew Wang <andrew.w...@cloudera.com> wrote: > > What are the known gaps that need bridging between 2.x and 3.x? 
> > From an HDFS perspective, we've tested wire compat, rolling upgrade, > and rollback. > > From a YARN perspective, we've tested wire compat and rolling upgrade. > Arun just mentioned an NM rollback issue that I'm not familiar with. > > Anything else? External to this discussion, these should be documented > as known issues for 3.0. > > Best. > Andrew > > On Sun, Nov 5, 2017 at 1:46 PM, Arun Suresh <asur...@apache.org> wrote: > >> Thanks for starting this discussion VInod. >> >> I agree (C) is a bad idea. >> I would prefer (A) given that ATM, branch-2 is still very close to >> branch-2.9 - and it is a good time to make a collective decision to >> lock down commits to branch-2. >> >> I think we should also clearly define what the 'bridging' release >> shoul
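The erasure-coding backup scenario Kai describes above can be sketched with the 3.x `hdfs ec` admin commands. This is a hedged illustration: the directory and policy choice are examples, and a live 3.x cluster is required.

```shell
# Hedged sketch of the 3.x erasure-coding backup scenario described above.
# RS-6-3-1024k (Reed-Solomon, 6 data + 3 parity cells) stores data at roughly
# 1.5x raw size, versus the 3x cost of default triple replication -- the
# storage saving that motivates the backup use case.
hdfs ec -enablePolicy -policy RS-6-3-1024k
hdfs ec -setPolicy -path /backup -policy RS-6-3-1024k
hdfs ec -getPolicy -path /backup   # verify the policy took effect
```

Files written under `/backup` after the policy is set are erasure-coded; pre-existing files keep their replicated layout until rewritten (e.g. by the distcp backup job).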
Re: [DISCUSS] A final minor release off branch-2?
The main goal of the bridging release is to ease transition on stuff that is guaranteed to be broken. Off the top of my head, one of the biggest areas is application compatibility. When folks move from 2.x to 3.x, are their apps binary compatible? Source compatible? Or do they need changes? In the 1.x -> 2.x upgrade, we did a bunch of work to at least make old apps source compatible. This means re-examining API compatibility in 3.x and its impact on migrating applications. We will have to revisit and un-deprecate old APIs, un-delete old APIs, and write documentation on how apps can be migrated. Most of this work will be in the 3.x line. The bridging release, on the other hand, will have deprecations for APIs that cannot be undeleted. This may already have been done in many places, but we need to make sure and fill gaps if any. Other areas that I can recall from the old days:
- Config migration: Many configs are deprecated or deleted. We need documentation to help folks move. We also need deprecations in the bridging release for configs that cannot be undeleted.
- You mentioned rolling-upgrades: It will be good to exactly outline the type of testing. For example, the rolling-upgrades orchestration order has direct implication on the testing done.
- Story for downgrades?
- Copying data between 2.x clusters and 3.x clusters: Does this work already? Is it broken anywhere that we cannot fix? Do we need bridging features for this work?
+Vinod > On Nov 6, 2017, at 12:49 PM, Andrew Wang wrote: > > What are the known gaps that need bridging between 2.x and 3.x? > > From an HDFS perspective, we've tested wire compat, rolling upgrade, and > rollback. > > From a YARN perspective, we've tested wire compat and rolling upgrade. Arun > just mentioned an NM rollback issue that I'm not familiar with. > > Anything else? External to this discussion, these should be documented as > known issues for 3.0. > > Best. 
> Andrew > > On Sun, Nov 5, 2017 at 1:46 PM, Arun Suresh wrote: > >> Thanks for starting this discussion VInod. >> >> I agree (C) is a bad idea. >> I would prefer (A) given that ATM, branch-2 is still very close to >> branch-2.9 - and it is a good time to make a collective decision to lock >> down commits to branch-2. >> >> I think we should also clearly define what the 'bridging' release should >> be. >> I assume it means the following: >> * Any 2.x user wanting to move to 3.x must first upgrade to the bridging >> release first and then upgrade to the 3.x release. >> * With regard to state store upgrades (at least NM state stores) the >> bridging state stores should be aware of all new 3.x keys so the implicit >> assumption would be that a user can only rollback from the 3.x release to >> the bridging release and not to the old 2.x release. >> * Use the opportunity to clean up deprecated API ? >> * Do we even want to consider a separate bridging release for 2.7, 2.8 an >> 2.9 lines ? >> >> Cheers >> -Arun >> >> On Fri, Nov 3, 2017 at 5:07 PM, Vinod Kumar Vavilapalli < >> vino...@apache.org> >> wrote: >> >>> Hi all, >>> >>> With 3.0.0 GA around the corner (tx for the push, Andrew!), 2.9.0 RC out >>> (tx Arun / Subru!) and 2.8.2 (tx Junping!), I think it's high time we >> have >>> a discussion on how we manage our developmental bandwidth between 2.x >> line >>> and 3.x lines. >>> >>> Once 3.0 GA goes out, we will have two parallel and major release lines. >>> The last time we were in this situation was back when we did 1.x -> 2.x >>> jump. >>> >>> The parallel releases implies overhead of decisions, branch-merges and >>> back-ports. Right now we already do backports for 2.7.5, 2.8.2, 2.9.1, >>> 3.0.1 and potentially a 3.1.0 in a few months after 3.0.0 GA. And many of >>> these lines - for e.g 2.8, 2.9 - are going to be used for a while at a >>> bunch of large sites! 
At the same time, our users won't migrate to 3.0 GA >>> overnight - so we do have to support two parallel lines. >>> >>> I propose we start thinking of the fate of branch-2. The idea is to have >>> one final release that helps our users migrate from 2.x to 3.x. This >>> includes any changes on the older line to bridge compatibility issues, >>> upgrade issues, layout changes, tooling etc. >>> >>> We have a few options I think >>> (A) >>>-- Make 2.9.x the last minor release off branch-2 >>>-- Have a maintenance release that bridges 2.9 to 3.x >>>-- Continue to make more maintenance releases on 2.8 and 2.9 as >>> necessary >>>-- All new features obviously only go into the 3.x line as no >> features >>> can go into the maint line. >>> >>> (B) >>>-- Create a new 2.10 release which doesn't have any new features, but >>> as a bridging release >>>-- Continue to make more maintenance releases on 2.8, 2.9 and 2.10 as >>> necessary >>>-- All new features, other than the bridging changes, go into the 3.x >>> line >>> >>> (C) >>>-- Continue making branch-2
Re: [DISCUSS] A final minor release off branch-2?
What are the known gaps that need bridging between 2.x and 3.x? From an HDFS perspective, we've tested wire compat, rolling upgrade, and rollback. From a YARN perspective, we've tested wire compat and rolling upgrade. Arun just mentioned an NM rollback issue that I'm not familiar with. Anything else? External to this discussion, these should be documented as known issues for 3.0. Best. Andrew On Sun, Nov 5, 2017 at 1:46 PM, Arun Suresh wrote: > Thanks for starting this discussion VInod. > > I agree (C) is a bad idea. > I would prefer (A) given that ATM, branch-2 is still very close to > branch-2.9 - and it is a good time to make a collective decision to lock > down commits to branch-2. > > I think we should also clearly define what the 'bridging' release should > be. > I assume it means the following: > * Any 2.x user wanting to move to 3.x must first upgrade to the bridging > release first and then upgrade to the 3.x release. > * With regard to state store upgrades (at least NM state stores) the > bridging state stores should be aware of all new 3.x keys so the implicit > assumption would be that a user can only rollback from the 3.x release to > the bridging release and not to the old 2.x release. > * Use the opportunity to clean up deprecated API ? > * Do we even want to consider a separate bridging release for 2.7, 2.8 an > 2.9 lines ? > > Cheers > -Arun > > On Fri, Nov 3, 2017 at 5:07 PM, Vinod Kumar Vavilapalli < > vino...@apache.org> > wrote: > > > Hi all, > > > > With 3.0.0 GA around the corner (tx for the push, Andrew!), 2.9.0 RC out > > (tx Arun / Subru!) and 2.8.2 (tx Junping!), I think it's high time we > have > > a discussion on how we manage our developmental bandwidth between 2.x > line > > and 3.x lines. > > > > Once 3.0 GA goes out, we will have two parallel and major release lines. > > The last time we were in this situation was back when we did 1.x -> 2.x > > jump. 
> > > > The parallel releases implies overhead of decisions, branch-merges and > > back-ports. Right now we already do backports for 2.7.5, 2.8.2, 2.9.1, > > 3.0.1 and potentially a 3.1.0 in a few months after 3.0.0 GA. And many of > > these lines - for e.g 2.8, 2.9 - are going to be used for a while at a > > bunch of large sites! At the same time, our users won't migrate to 3.0 GA > > overnight - so we do have to support two parallel lines. > > > > I propose we start thinking of the fate of branch-2. The idea is to have > > one final release that helps our users migrate from 2.x to 3.x. This > > includes any changes on the older line to bridge compatibility issues, > > upgrade issues, layout changes, tooling etc. > > > > We have a few options I think > > (A) > > -- Make 2.9.x the last minor release off branch-2 > > -- Have a maintenance release that bridges 2.9 to 3.x > > -- Continue to make more maintenance releases on 2.8 and 2.9 as > > necessary > > -- All new features obviously only go into the 3.x line as no > features > > can go into the maint line. > > > > (B) > > -- Create a new 2.10 release which doesn't have any new features, but > > as a bridging release > > -- Continue to make more maintenance releases on 2.8, 2.9 and 2.10 as > > necessary > > -- All new features, other than the bridging changes, go into the 3.x > > line > > > > (C) > > -- Continue making branch-2 releases and postpone this discussion for > > later > > > > I'm leaning towards (A) or to a lesser extent (B). Willing to hear > > otherwise. > > > > Now, this obviously doesn't mean blocking of any more minor releases on > > branch-2. Obviously, any interested committer / PMC can roll up his/her > > sleeves, create a release plan and release, but we all need to > acknowledge > > that versions are not cheap and figure out how the community bandwidth is > > split overall. 
> > > > Thanks > > +Vinod > > PS: The proposal is obviously not to force everyone to go in one > direction > > but more of a nudging the community to figure out if we can focus a major > > part of of our bandwidth on one line. I had a similar concern when we > were > > doing 2.8 and 3.0 in parallel, but the impending possibility of spreading > > too thin is much worse IMO. > > PPS: (C) is a bad choice. With 2.8 and 2.9 we are already seeing user > > adoption splintering between two lines. With 2.10, 2.11 etc coexisting > with > > 3.0, 3.1 etc, we will revisit the mad phase years ago when we had 0.20.x, > > 0.20-security coexisting with 0.21, 0.22 etc. >
Re: [DISCUSS] A final minor release off branch-2?
Thanks for starting this discussion Vinod. I agree (C) is a bad idea. I would prefer (A) given that, ATM, branch-2 is still very close to branch-2.9 - and it is a good time to make a collective decision to lock down commits to branch-2. I think we should also clearly define what the 'bridging' release should be. I assume it means the following:
* Any 2.x user wanting to move to 3.x must first upgrade to the bridging release and then upgrade to the 3.x release.
* With regard to state store upgrades (at least NM state stores), the bridging state stores should be aware of all new 3.x keys, so the implicit assumption would be that a user can only roll back from the 3.x release to the bridging release and not to the old 2.x release.
* Use the opportunity to clean up deprecated APIs?
* Do we even want to consider a separate bridging release for the 2.7, 2.8 and 2.9 lines?
Cheers -Arun On Fri, Nov 3, 2017 at 5:07 PM, Vinod Kumar Vavilapalli wrote: > Hi all, > > With 3.0.0 GA around the corner (tx for the push, Andrew!), 2.9.0 RC out > (tx Arun / Subru!) and 2.8.2 (tx Junping!), I think it's high time we have > a discussion on how we manage our developmental bandwidth between 2.x line > and 3.x lines. > > Once 3.0 GA goes out, we will have two parallel and major release lines. > The last time we were in this situation was back when we did 1.x -> 2.x > jump. > > The parallel releases implies overhead of decisions, branch-merges and > back-ports. Right now we already do backports for 2.7.5, 2.8.2, 2.9.1, > 3.0.1 and potentially a 3.1.0 in a few months after 3.0.0 GA. And many of > these lines - for e.g 2.8, 2.9 - are going to be used for a while at a > bunch of large sites! At the same time, our users won't migrate to 3.0 GA > overnight - so we do have to support two parallel lines. > > I propose we start thinking of the fate of branch-2. The idea is to have > one final release that helps our users migrate from 2.x to 3.x. 
This > includes any changes on the older line to bridge compatibility issues, > upgrade issues, layout changes, tooling etc. > > We have a few options I think > (A) > -- Make 2.9.x the last minor release off branch-2 > -- Have a maintenance release that bridges 2.9 to 3.x > -- Continue to make more maintenance releases on 2.8 and 2.9 as > necessary > -- All new features obviously only go into the 3.x line as no features > can go into the maint line. > > (B) > -- Create a new 2.10 release which doesn't have any new features, but > as a bridging release > -- Continue to make more maintenance releases on 2.8, 2.9 and 2.10 as > necessary > -- All new features, other than the bridging changes, go into the 3.x > line > > (C) > -- Continue making branch-2 releases and postpone this discussion for > later > > I'm leaning towards (A) or to a lesser extent (B). Willing to hear > otherwise. > > Now, this obviously doesn't mean blocking of any more minor releases on > branch-2. Obviously, any interested committer / PMC can roll up his/her > sleeves, create a release plan and release, but we all need to acknowledge > that versions are not cheap and figure out how the community bandwidth is > split overall. > > Thanks > +Vinod > PS: The proposal is obviously not to force everyone to go in one direction > but more of a nudging the community to figure out if we can focus a major > part of of our bandwidth on one line. I had a similar concern when we were > doing 2.8 and 3.0 in parallel, but the impending possibility of spreading > too thin is much worse IMO. > PPS: (C) is a bad choice. With 2.8 and 2.9 we are already seeing user > adoption splintering between two lines. With 2.10, 2.11 etc coexisting with > 3.0, 3.1 etc, we will revisit the mad phase years ago when we had 0.20.x, > 0.20-security coexisting with 0.21, 0.22 etc.