Re: IMPORTANT: automatic changelog creation
On Jul 8, 2015 2:13 AM, Tsuyoshi Ozawa oz...@apache.org wrote:

> +1, thanks Allen and Andrew for putting in lots of effort! Is there any possibility that we can restrict someone from editing the issue in jira once it's marked as closed after release? Vinay's comment looks worth considering to me. What do you think?

Mistakes happen, even during the release process. Presuming the set of folks who can edit closed tickets is already restricted to contributors, why not assume any edits are the community making things more accurate?

-- Sean
Re: IMPORTANT: automatic changelog creation
I think it defeats the purpose of

On Jul 8, 2015, at 12:13 AM, Tsuyoshi Ozawa oz...@apache.org wrote:

> +1, thanks Allen and Andrew for putting in lots of effort! Is there any possibility that we can restrict someone from editing the issue in jira once it's marked as closed after release? Vinay's comment looks worth considering to me. What do you think?

If you lock closed JIRAs, then how does re-open work when something gets reverted?

I’m also not sure what purpose it ultimately serves. Is it the fear that the release data that gets shipped with the src tar ball will be wrong? Is it that the data might change? That happens now and is pretty much unfixable no matter what one does. [1]

Besides, it’s going to be *extremely* valuable for RMs to be able to edit the JIRA data when cutting a release, especially given how many people are refusing to write release notes for incompatible changes. Being easily audit-able _and easily fixable!_ is a huge win.

It really is a feature that this data can be changed. Let’s say there is a change. Next release, we can re-roll the old change and release notes data for all releases and it will be correct on the website, etc, from that point on.

FWIW, this is one of the reasons why we really should be publishing trunk’s doc-set to the website. It’s generally more accurate than the last release. Never mind everyone seeming to think that “current” = trunk and getting confused when they write a doc patch that doesn’t apply to trunk.

[1] As part of the JIRA cleanup last year, I added hundreds of JIRAs to 2.0.0-alpha and 0.23.x that were incorrectly marked for 0.24 and 3.0.0. I doubt anyone but me (and the future 3.0.0 RM?) really cares.
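The "re-roll" point above can be illustrated with a small sketch. It assumes the issue data has already been fetched into plain dictionaries; the field names (`key`, `summary`, `fix_versions`), the `EXAMPLE-n` keys, and the `build_changelogs` helper are all hypothetical and are not taken from the HADOOP-11731 script.

```python
from collections import defaultdict


def build_changelogs(issues):
    """Group issue summaries by fix version, rebuilding every release at once."""
    by_version = defaultdict(list)
    for issue in issues:
        for version in issue["fix_versions"]:
            by_version[version].append(f"{issue['key']}. {issue['summary']}")
    # Sort entries so regenerated output is stable between runs.
    return {version: sorted(entries) for version, entries in by_version.items()}


# Made-up records standing in for data pulled from the issue tracker.
issues = [
    {"key": "EXAMPLE-1", "summary": "First example change", "fix_versions": ["3.0.0"]},
    {"key": "EXAMPLE-2", "summary": "Second example change", "fix_versions": ["2.0.0-alpha"]},
]

before = build_changelogs(issues)

# Someone corrects a wrongly assigned fix version after the release.
issues[0]["fix_versions"] = ["2.0.0-alpha"]

# Regenerating picks up the correction for every release, old ones included.
after = build_changelogs(issues)
print(before)
print(after)
```

Because the output is derived entirely from the current issue data, a correction made once shows up the next time any release's changelog is regenerated, which is the behavior described as a feature above.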
Re: IMPORTANT: automatic changelog creation
+1, thanks Allen and Andrew. Regards, Akira
Re: IMPORTANT: automatic changelog creation
+1, thanks Allen and Andrew for putting in lots of effort! Is there any possibility that we can restrict someone from editing the issue in jira once it's marked as closed after release? Vinay's comment looks worth considering to me. What do you think? - Tsuyoshi
Re: IMPORTANT: automatic changelog creation
+1 Thanks Allen and Andrew for your efforts on this. Thanks Devaraj
Re: IMPORTANT: automatic changelog creation
+1 Many thanks to Allen and Andrew for driving this. -Varun
Re: IMPORTANT: automatic changelog creation
+1 for the auto generation. bq. Besides, after a release R1 is out, someone may (accidentally or intentionally) modify the JIRA summary. Is there any possibility that we can restrict someone from editing the issue in jira once it's marked as closed after release? Regards, Vinay On Fri, Jul 3, 2015 at 8:32 AM, Karthik Kambatla ka...@cloudera.com wrote: Huge +1
Re: IMPORTANT: automatic changelog creation
+1 Thank you to Allen for the script, and thank you to Andrew for volunteering to drive the conversion. --Chris Nauroth
Re: IMPORTANT: automatic changelog creation
Hi all, I want to revive the discussion on this thread, since the overhead of CHANGES.txt came up again in the context of backporting fixes for maintenance releases. Allen's automatic generation script (HADOOP-11731) went into trunk but not branch-2, so we're still maintaining CHANGES.txt everywhere. What do people think about backporting this to branch-2 and then removing CHANGES.txt from trunk/branch-2 (HADOOP-11792)? Based on discussion on this thread and in HADOOP-11731, we seem to agree that CHANGES.txt is an unreliable source of information, and JIRA is at least as reliable and probably much more so. Thus I don't see any downsides to backporting it. Would like to hear everyone's thoughts on this, I'm willing to drive the effort. Thanks, Andrew
Re: IMPORTANT: automatic changelog creation
On Apr 2, 2015, at 12:40 PM, Vinod Kumar Vavilapalli vino...@hortonworks.com wrote:

> We'd then be doing two commits for every patch. Let's simply not remove CHANGES.txt from trunk, keep the existing dev workflow, but document the release process to remove CHANGES.txt in trunk at the time of a release going out of trunk.

Might as well copy branch-2’s changes.txt into trunk then. (or 2.7’s. Last I looked, people updated branch-2 and not 2.7’s or vice versa for some patches that went into both branches.) So that folks who are committing to both branches and want to cherry pick all changes can.

I mean, trunk’s is very very very wrong. Right now. Today. Borderline useless. See HADOOP-11718 (which I will now close out as won’t fix)… and that jira is only what is miscategorized, not what is missing.
Re: IMPORTANT: automatic changelog creation
We'd then be doing two commits for every patch. Let's simply not remove CHANGES.txt from trunk, keep the existing dev workflow, but document the release process to remove CHANGES.txt in trunk at the time of a release going out of trunk.

+Vinod

On Apr 2, 2015, at 10:12 AM, Allen Wittenauer a...@altiscale.com wrote:

> But in reality, I suspect the opposite: removing changes.txt just from trunk will make cherry picks easier. If you don’t have to update trunk’s changes.txt, you can cherry-pick with no worries about conflict merges on changes.txt in other branches. Then just update changes.txt in branch-2 manually as you would have done pre- this change anyway.
Re: IMPORTANT: automatic changelog creation
Generating the change log from JIRA is a good idea. It is based on the assumption that each JIRA has an accurate summary (a.k.a. JIRA title) that reflects the committed change. Unfortunately, the assumption is invalid in many cases since we never enforce that the JIRA summary must be the same as the change log entry. We could compare the current CHANGES.txt with the generated change log. I bet the diff is long.

Besides, after a release R1 is out, someone may (accidentally or intentionally) modify the JIRA summary. Then, the entry for the same item in a later release R2 could be different from the one in R1.

I agree that manually editing CHANGES.txt is not a perfect solution. However, it has worked well for many past releases. I suggest we keep the current dev workflow and try using the new script provided by HADOOP-11731 to generate the next release. If everything works well, we shall remove CHANGES.txt and revise the dev workflow. What do you think?

Regards,
Tsz-Wo

On Thursday, April 2, 2015 12:57 PM, Allen Wittenauer a...@altiscale.com wrote:

> On Apr 2, 2015, at 12:40 PM, Vinod Kumar Vavilapalli vino...@hortonworks.com wrote:
>
>> We'd then be doing two commits for every patch. Let's simply not remove CHANGES.txt from trunk, keep the existing dev workflow, but document the release process to remove CHANGES.txt in trunk at the time of a release going out of trunk.
>
> Might as well copy branch-2’s changes.txt into trunk then. (or 2.7’s. Last I looked, people updated branch-2 and not 2.7’s or vice versa for some patches that went into both branches.) So that folks who are committing to both branches and want to cherry pick all changes can.
>
> I mean, trunk’s is very very very wrong. Right now. Today. Borderline useless. See HADOOP-11718 (which I will now close out as won’t fix)… and that jira is only what is miscategorized, not what is missing.
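The comparison suggested in this message (current CHANGES.txt versus the generated change log) could be sketched roughly as follows. This is only an illustration under stated assumptions: both change logs are assumed to have been reduced already to dictionaries mapping an issue ID to its one-line summary, and the `EXAMPLE-n` IDs, the `normalize` rule, and the `compare_changelogs` helper are hypothetical.

```python
def normalize(summary):
    """Ignore whitespace runs, letter case, and a trailing period when comparing."""
    return " ".join(summary.split()).rstrip(".").lower()


def compare_changelogs(manual, generated):
    """Report IDs missing on either side and IDs whose summaries disagree."""
    manual_ids, generated_ids = set(manual), set(generated)
    return {
        "only_in_manual": sorted(manual_ids - generated_ids),
        "only_in_generated": sorted(generated_ids - manual_ids),
        "summary_differs": sorted(
            key
            for key in manual_ids & generated_ids
            if normalize(manual[key]) != normalize(generated[key])
        ),
    }


# Made-up data: `manual` stands for hand-edited entries, `generated` for
# entries built from issue summaries.
manual = {
    "EXAMPLE-1": "Fix the widget.",
    "EXAMPLE-2": "Add a gadget",
    "EXAMPLE-3": "Remove old flag",
}
generated = {
    "EXAMPLE-1": "fix  the widget",
    "EXAMPLE-2": "Gadget support",
    "EXAMPLE-4": "Tidy docs",
}

report = compare_changelogs(manual, generated)
print(report)
```

Separating "missing" from "summary differs" matters for the concern raised here: a missing ID points at bookkeeping errors, while a differing summary is the case where the issue title and the hand-written entry have drifted apart.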
Re: IMPORTANT: automatic changelog creation
On Apr 2, 2015, at 11:36 AM, Mai Haohui ricet...@gmail.com wrote:

> Hi Allen,
>
> Thanks for driving this. Just some quick questions:
>
>> Removing changes.txt, relnotes.py, etc from branch-2 would be an incompatible change. Pushing aside the questions of that document’s quality (hint: lots of outright lying and missing several hundred jiras), it's effectively an interface used by quite a few folks.
>
> Why is removing CHANGES.txt an incompatible change? Why is CHANGES.txt an interface? Can you give some examples?

With my end user ops hat on, for years I'd often run scripts over CHANGES.TXT to pull key things in releases, including to get extra metadata that wasn’t in that file and reformat for my users to digest. (especially since the release notes weren’t published with the release tar and--let’s be honest--were mostly indecipherable heaps of crap to the point that even the RM’s never bothered to really look at them...) CHANGES.txt was useful to get the base dataset, esp in the days before JIRA’s REST interface. It is/was, in essence, an interface.

> It looks like the meaning of incompatibility is overloaded -- at the very least, in http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/Compatibility.html, compatibility means source and binary compatibility.

FWIW, removing relnotes.py is definitely covered by that document.

> At least to me, CHANGES.txt is not part of the contract of compatibility. It would be great to see this patch land in branch-2.

But yes, I mean beyond that. It’s a ‘de facto’ standard given how many people use it for critical information about what we’ve released. This is about managing user expectations and not just what’s convenient for us. You know, that whole community that we always mention but seem to stomp all over. Just because we CAN do something doesn’t mean we SHOULD.

An excellent example of this is the HADOOP_OPTS variable. I’d LOVE LOVE LOVE to kick it to the curb. It’s the source of a LOT of end user bugs and problematic areas in the shell code. During the rewrite + the above rules, I had the opportunity and bylaws standing to do so. But I didn’t because it’d just flat out break too much stuff, known and unknown. It’s ok to be conservative when it comes to change.
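As a rough idea of what "running scripts over CHANGES.TXT" might look like, here is a minimal sketch of such a consumer. The file layout it assumes (a `Trunk` or `Release <version>` heading, then indented `ISSUE-123. summary` entries) is an assumption for illustration and is not spelled out in this thread; the sample text, the `EXAMPLE-n` IDs, and the `parse_changes` helper are made up.

```python
import re
from collections import defaultdict

# Assumed layout: "Release <version> ..." headings and indented "KEY-123. text" entries.
RELEASE_RE = re.compile(r"^Release\s+(\S+)")
ENTRY_RE = re.compile(r"^\s+([A-Z]+-\d+)\.\s+(.*\S)\s*$")


def parse_changes(text):
    """Return {release: [(issue_id, first_line_of_summary), ...]}."""
    releases = defaultdict(list)
    current = None
    for line in text.splitlines():
        release = RELEASE_RE.match(line)
        if release:
            current = release.group(1)
        elif line.startswith("Trunk"):
            current = "trunk"
        else:
            entry = ENTRY_RE.match(line)
            # Section headings and wrapped continuation lines are skipped.
            if entry and current is not None:
                releases[current].append((entry.group(1), entry.group(2)))
    return dict(releases)


SAMPLE = """\
Trunk (Unreleased)

  INCOMPATIBLE CHANGES

    EXAMPLE-1. First example change (alice)

Release 0.0.1 - UNRELEASED

  BUG FIXES

    EXAMPLE-2. Second example change (bob)
    EXAMPLE-3. Third example change that wraps
    onto a continuation line (carol)
"""

parsed = parse_changes(SAMPLE)
print(parsed)
```

Any script of this kind depends on the headings and entry shape staying put, which is the sense in which the file acts as an interface for downstream users.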
Re: IMPORTANT: automatic changelog creation
On Apr 2, 2015, at 9:51 AM, Karthik Kambatla ka...@cloudera.com wrote:

> > a) remove CHANGES.TXT from trunk
>
> Removing this from trunk makes it particularly hard to cherry-pick changes from trunk to branch-2. I would gate this on the removal of CHANGES.txt on branch-2 as well, at least until we have some non-future releases off branch-2.

Removing changes.txt, relnotes.py, etc from branch-2 would be an incompatible change. Pushing aside the questions of that document’s quality (hint: lots of outright lying and missing several hundred jiras), it's effectively an interface in use by quite a few folks.

But in reality, I suspect the opposite: removing changes.txt just from trunk will make cherry picks easier. If you don’t have to update trunk’s changes.txt, you can cherry-pick with no worries about merge conflicts on changes.txt in other branches. Then just update changes.txt in branch-2 manually, as you would have done before this change anyway.

> > b) pre-populate x amount of Hadoop 2.x release data into trunk so that the auto-indexer can pick it up
> > c) update the HowToRelease information with, well, how to do releases based upon these new capabilities
>
> There is a create-release script that likely needs updating.

Yup. I knew about it (I happen to have it running in another window as I type this), but from what I can see, it’s completely undocumented. :( As I update HowToRelease, I’m going to put some notes in it about this script.
Re: IMPORTANT: automatic changelog creation
Hi Allen,

Thanks for driving this. Just some quick questions:

> Removing changes.txt, relnotes.py, etc from branch-2 would be an incompatible change. Pushing aside the questions of that document’s quality (hint: lots of outright lying and missing several hundred jiras), it's effectively an interface in use by quite a few folks.

Why is removing CHANGES.txt an incompatible change? Why is CHANGES.txt an interface? Can you give some examples?

It looks like the meaning of incompatibility is overloaded -- at the very least, in http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/Compatibility.html, compatibility means source and binary compatibility. At least to me, CHANGES.txt is not part of the contract of compatibility. It would be great to see this patch in branch-2.

~Haohui

On Thu, Apr 2, 2015 at 10:12 AM, Allen Wittenauer a...@altiscale.com wrote:

> On Apr 2, 2015, at 9:51 AM, Karthik Kambatla ka...@cloudera.com wrote:
>
> > > a) remove CHANGES.TXT from trunk
> >
> > Removing this from trunk makes it particularly hard to cherry-pick changes from trunk to branch-2. I would gate this on the removal of CHANGES.txt on branch-2 as well, at least until we have some non-future releases off branch-2.
>
> Removing changes.txt, relnotes.py, etc from branch-2 would be an incompatible change. Pushing aside the questions of that document’s quality (hint: lots of outright lying and missing several hundred jiras), it's effectively an interface in use by quite a few folks.
>
> But in reality, I suspect the opposite: removing changes.txt just from trunk will make cherry picks easier. If you don’t have to update trunk’s changes.txt, you can cherry-pick with no worries about merge conflicts on changes.txt in other branches. Then just update changes.txt in branch-2 manually, as you would have done before this change anyway.
>
> > > b) pre-populate x amount of Hadoop 2.x release data into trunk so that the auto-indexer can pick it up
> > > c) update the HowToRelease information with, well, how to do releases based upon these new capabilities
> >
> > There is a create-release script that likely needs updating.
>
> Yup. I knew about it (I happen to have it running in another window as I type this), but from what I can see, it’s completely undocumented. :( As I update HowToRelease, I’m going to put some notes in it about this script.
IMPORTANT: automatic changelog creation
Hello everyone!

(to: and reply-to: set to common-dev, cc: the rest of ‘em, to concentrate the discussion)

HADOOP-11731 has just been committed to *trunk*. This change does two things:

a) Removes dev-support/relnotes.py
b) Adds dev-support/releasedocmaker.py

releasedocmaker.py works as a replacement for both the release notes generation process and the CHANGES.TXT file. As documented in BUILDING.TXT, running ‘mvn site -Preleasedocs’ will generate both the release notes and a change log for that release in markdown format, based upon the FixVersion field in JIRA. During the creation of the website, these files are then converted to HTML for use on the apache.org website. The release notes file only contains incompatible changes and JIRAs that specifically have release notes. The changes file only has the data for that release.

This is obviously an incompatible change. There is a good chance this code will not appear in branch-2. There might be some additional fallout (esp since some folks write code against CHANGES.TXT), so I wanted to give everyone a heads up.

Also, I’ll be filing some additional JIRAs/doing some additional work to:

a) remove CHANGES.TXT from trunk
b) pre-populate x amount of Hadoop 2.x release data into trunk so that the auto-indexer can pick it up
c) update the HowToRelease information with, well, how to do releases based upon these new capabilities

Thanks!
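
The selection rules described in this announcement (a change log limited to one release's FixVersion, and release notes limited to incompatible changes plus issues that carry a release note) can be illustrated with a small sketch. This is not the code of releasedocmaker.py: the ISSUES records, the EXAMPLE-n keys, the field names, and both helper functions are hypothetical stand-ins, and no JIRA access is attempted.

```python
from collections import defaultdict

# Hypothetical, hand-written records standing in for JIRA issue data.
ISSUES = [
    {"key": "EXAMPLE-1", "type": "Bug", "summary": "Fix shell quoting",
     "fix_versions": ["3.0.0"], "incompatible": False, "release_note": ""},
    {"key": "EXAMPLE-2", "type": "Improvement", "summary": "Drop old script",
     "fix_versions": ["3.0.0"], "incompatible": True,
     "release_note": "The old script is gone; use the new one."},
    {"key": "EXAMPLE-3", "type": "Bug", "summary": "Handle empty input",
     "fix_versions": ["2.8.0"], "incompatible": False,
     "release_note": "Empty input no longer raises."},
]


def issues_for(version, issues):
    """Keep only issues whose fix versions include the given release."""
    return [i for i in issues if version in i["fix_versions"]]


def changelog_markdown(version, issues):
    """Markdown change log for one release, grouped by issue type."""
    by_type = defaultdict(list)
    for issue in issues_for(version, issues):
        by_type[issue["type"]].append(issue)
    lines = [f"# Changes for {version}", ""]
    for issue_type in sorted(by_type):
        lines += [f"## {issue_type}", ""]
        for issue in sorted(by_type[issue_type], key=lambda i: i["key"]):
            lines.append(f"* {issue['key']}: {issue['summary']}")
        lines.append("")
    return "\n".join(lines)


def release_notes_markdown(version, issues):
    """Markdown release notes: incompatible changes and issues with notes."""
    lines = [f"# Release notes for {version}", ""]
    for issue in issues_for(version, issues):
        if issue["incompatible"] or issue["release_note"]:
            flag = " (incompatible)" if issue["incompatible"] else ""
            note = issue["release_note"] or "No release note provided."
            lines.append(f"* {issue['key']}{flag}: {note}")
    return "\n".join(lines)


changes = changelog_markdown("3.0.0", ISSUES)
notes = release_notes_markdown("3.0.0", ISSUES)
```

Because both documents are derived from the issue records each time, correcting a record and regenerating is enough to correct the published output, with no hand-edited file to keep in sync.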