Re: IMPORTANT: automatic changelog creation

2015-07-08 Thread Sean Busbey
On Jul 8, 2015 2:13 AM, Tsuyoshi Ozawa oz...@apache.org wrote:

 +1, thanks Allen and Andrew for putting in so much effort!

  Is there any possibility that we can restrict someone from editing the
  issue in JIRA once it's marked as closed after release?

 Vinay's comment seems worth considering to me. What do you think?


Mistakes happen, even during the release process.

Presuming the set of folks who can edit closed tickets is already
restricted to contributors, why not assume any edits are the community
making things more accurate?

-- 
Sean


Re: IMPORTANT: automatic changelog creation

2015-07-08 Thread Allen Wittenauer

I think it defeats the purpose ofu

On Jul 8, 2015, at 12:13 AM, Tsuyoshi Ozawa oz...@apache.org wrote:

 +1, thanks Allen and Andrew for putting in so much effort!
 
 Is there any possibility that we can restrict someone from editing the
 issue in JIRA once it's marked as closed after release?
 
 Vinay's comment seems worth considering to me. What do you think?


If you lock closed JIRAs, then how does re-open work when something 
gets reverted?

I’m also not sure what purpose it ultimately serves.  Is it the fear 
that the release data that gets shipped with the src tar ball will be wrong? Is 
it that the data might change?  That happens now and is pretty much unfixable 
no matter what one does. [1]  Besides, it’s going to be *extremely* valuable 
for RMs to be able to edit the JIRA data when cutting a release, especially 
given how many people are refusing to write release notes for incompatible 
changes. Being easily audit-able _and easily fixable!_ is a huge win.

It really is a feature that this data can be changed.

Let’s say there is a change. Next release, we can re-roll the old 
change and release notes data for all releases and it will be correct on the 
website, etc, from that point on.
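
As a rough illustration of the "easily audit-able" point, the sketch below compares the summaries that shipped with one release against what JIRA reports later and lists the ones that changed. It is not part of any Hadoop tooling; the function name, the issue keys, and the sample data are all hypothetical.

```python
def changed_summaries(old, new):
    """Return {key: (old_summary, new_summary)} for issues whose summary differs.

    Both arguments are plain dicts mapping an issue key to its summary.
    Only keys present in both snapshots are compared.
    """
    return {
        key: (old[key], new[key])
        for key in sorted(old.keys() & new.keys())
        if old[key] != new[key]
    }


# Hypothetical data: summaries as shipped with release R1,
# and the same issues as JIRA reports them some time later.
shipped_r1 = {
    "EXAMPLE-1": "Fix typo in scrpit",
    "EXAMPLE-2": "Add retry to client",
}
current_jira = {
    "EXAMPLE-1": "Fix typo in script",
    "EXAMPLE-2": "Add retry to client",
}

for key, (before, after) in changed_summaries(shipped_r1, current_jira).items():
    print(f"{key}: {before!r} -> {after!r}")
```

A report like this makes edits to closed issues visible without having to lock them.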

FWIW, this is one of the reasons why we really should be publishing 
trunk’s doc-set to the website.  It’s generally more accurate than the last 
release. Never mind everyone seeming to think that “current” = trunk and 
getting confused when they write a doc patch that doesn’t apply to trunk.

[1] As part of the JIRA cleanup last year, I added hundreds of JIRAs to 
2.0.0-alpha and 0.23.x that were incorrectly marked for 0.24 and 
3.0.0.  I doubt anyone but me (and the future 3.0.0 RM?) really cares.  



Re: IMPORTANT: automatic changelog creation

2015-07-08 Thread Akira AJISAKA

+1, thanks Allen and Andrew.

Regards,
Akira

On 7/3/15 22:31, Devaraj K wrote:

+1

Thanks Allen and Andrew for your efforts on this.

Thanks
Devaraj


Re: IMPORTANT: automatic changelog creation

2015-07-08 Thread Tsuyoshi Ozawa
+1, thanks Allen and Andrew for putting in so much effort!

 Is there any possibility that we can restrict someone from editing the
 issue in JIRA once it's marked as closed after release?

Vinay's comment seems worth considering to me. What do you think?

- Tsuyoshi

On Wed, Jul 8, 2015 at 3:57 PM, Akira AJISAKA
ajisa...@oss.nttdata.co.jp wrote:
 +1, thanks Allen and Andrew.

 Regards,
 Akira



Re: IMPORTANT: automatic changelog creation

2015-07-03 Thread Devaraj K
+1

Thanks Allen and Andrew for your efforts on this.

Thanks
Devaraj

On Fri, Jul 3, 2015 at 11:29 AM, Varun Vasudev vvasu...@apache.org wrote:

 +1

 Many thanks to Allen and Andrew for driving this.

 -Varun



-- 


Thanks
Devaraj K


Re: IMPORTANT: automatic changelog creation

2015-07-03 Thread Varun Vasudev
+1

Many thanks to Allen and Andrew for driving this.

-Varun



On 7/3/15, 10:25 AM, Vinayakumar B vinayakum...@apache.org wrote:

+1 for the auto generation.

bq. Besides, after a release R1 is out, someone may (accidentally or
intentionally) modify the JIRA summary.
Is there any possibility that we can restrict someone from editing the
issue in JIRA once it's marked as closed after release?

Regards,
Vinay


Re: IMPORTANT: automatic changelog creation

2015-07-02 Thread Vinayakumar B
+1 for the auto generation.

bq. Besides, after a release R1 is out, someone may (accidentally or
intentionally) modify the JIRA summary.
Is there any possibility that we can restrict someone from editing the
issue in JIRA once it's marked as closed after release?

Regards,
Vinay

On Fri, Jul 3, 2015 at 8:32 AM, Karthik Kambatla ka...@cloudera.com wrote:

 Huge +1

 --
 Mobile



Re: IMPORTANT: automatic changelog creation

2015-07-02 Thread Chris Nauroth
+1

Thank you to Allen for the script, and thank you to Andrew for
volunteering to drive the conversion.

--Chris Nauroth




On 7/2/15, 2:01 PM, Andrew Wang andrew.w...@cloudera.com wrote:

Hi all,

I want to revive the discussion on this thread, since the overhead of
CHANGES.txt came up again in the context of backporting fixes for
maintenance releases.

Allen's automatic generation script (HADOOP-11731) went into trunk but not
branch-2, so we're still maintaining CHANGES.txt everywhere. What do people
think about backporting this to branch-2 and then removing CHANGES.txt from
trunk/branch-2 (HADOOP-11792)? Based on discussion on this thread and in
HADOOP-11731, we seem to agree that CHANGES.txt is an unreliable source of
information, and JIRA is at least as reliable and probably much more so.
Thus I don't see any downsides to backporting it.

Would like to hear everyone's thoughts on this, I'm willing to drive the
effort.

Thanks,
Andrew


Re: IMPORTANT: automatic changelog creation

2015-07-02 Thread Andrew Wang
Hi all,

I want to revive the discussion on this thread, since the overhead of
CHANGES.txt came up again in the context of backporting fixes for
maintenance releases.

Allen's automatic generation script (HADOOP-11731) went into trunk but not
branch-2, so we're still maintaining CHANGES.txt everywhere. What do people
think about backporting this to branch-2 and then removing CHANGES.txt from
trunk/branch-2 (HADOOP-11792)? Based on discussion on this thread and in
HADOOP-11731, we seem to agree that CHANGES.txt is an unreliable source of
information, and JIRA is at least as reliable and probably much more so.
Thus I don't see any downsides to backporting it.
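
For readers unfamiliar with the idea, here is a minimal sketch of generating change log lines from JIRA data. It is not the HADOOP-11731 script. It assumes JIRA's REST API v2 search endpoint and the usual shape of its JSON response, and the function names, the JQL string, and the version "2.8.0" are illustrative assumptions. The code only builds the query URL and formats an already-parsed response, so it does not touch the network.

```python
from urllib.parse import urlencode

# Assumed base URL of the Apache JIRA instance.
JIRA_BASE = "https://issues.apache.org/jira"


def search_url(project, fix_version, max_results=100):
    """Build a search URL for fixed issues in one release (JQL is an assumption)."""
    jql = f'project = {project} AND fixVersion = "{fix_version}" AND resolution = Fixed'
    query = urlencode({"jql": jql, "fields": "summary", "maxResults": max_results})
    return f"{JIRA_BASE}/rest/api/2/search?{query}"


def key_order(key):
    """Sort EXAMPLE-2 before EXAMPLE-10 by comparing the numeric part as an int."""
    project, number = key.rsplit("-", 1)
    return project, int(number)


def changelog_lines(search_response):
    """Turn a parsed search response into 'KEY. Summary' lines, ordered by key."""
    issues = sorted(search_response["issues"], key=lambda i: key_order(i["key"]))
    return [f'{issue["key"]}. {issue["fields"]["summary"]}' for issue in issues]


# Hypothetical response body, already decoded from JSON.
sample = {
    "issues": [
        {"key": "EXAMPLE-10", "fields": {"summary": "Second change"}},
        {"key": "EXAMPLE-2", "fields": {"summary": "First change"}},
    ]
}

print(search_url("HADOOP", "2.8.0"))
for line in changelog_lines(sample):
    print(line)
```

Because the lines are derived from JIRA on every run, a corrected summary shows up the next time the change log is regenerated.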

I'd like to hear everyone's thoughts on this; I'm willing to drive the
effort.

Thanks,
Andrew

On Thu, Apr 2, 2015 at 2:00 PM, Tsz Wo Sze szets...@yahoo.com.invalid
wrote:

 Generating change log from JIRA is a good idea.  It is based on an assumption
 that each JIRA has an accurate summary (a.k.a. JIRA title) to reflect the
 committed change. Unfortunately, the assumption is invalid in many cases
 since we never enforce that the JIRA summary must be the same as the change
 log.  We may compare the current CHANGES.txt with the generated change
 log.  I bet the diff is long.

 Besides, after a release R1 is out, someone may (accidentally or
 intentionally) modify the JIRA summary.  Then, the entry for the same item
 in a later release R2 could be different from the one in R1.

 I agree that manually editing CHANGES.txt is not a perfect solution.
 However, it has worked well in the past for many releases.  I suggest we keep
 the current dev workflow.  Try using the new script provided by
 HADOOP-11731 to generate the next release.  If everything works well, we
 shall remove CHANGES.txt and revise the dev workflow.  What do you think?

 Regards,
 Tsz-Wo



Re: IMPORTANT: automatic changelog creation

2015-04-02 Thread Allen Wittenauer


On Apr 2, 2015, at 12:40 PM, Vinod Kumar Vavilapalli vino...@hortonworks.com 
wrote:

 
 We'd then be doing two commits for every patch. Let's simply not remove 
 CHANGES.txt from trunk, keep the existing dev workflow, but doc the release 
 process to remove CHANGES.txt in trunk at the time of a release going out of 
 trunk.



Might as well copy branch-2’s changes.txt into trunk then. (or 2.7’s.  Last I 
looked, people updated branch-2 and not 2.7’s or vice versa for some patches 
that went into both branches.)  So that folks who are committing to both 
branches and want to cherry pick all changes can.  

I mean, trunk’s is very very very wrong. Right now. Today. Borderline useless. 
See HADOOP-11718 (which I will now close out as won’t fix)… and that jira is 
only what is miscategorized, not what is missing.

Re: IMPORTANT: automatic changelog creation

2015-04-02 Thread Vinod Kumar Vavilapalli

We'd then be doing two commits for every patch. Let's simply not remove 
CHANGES.txt from trunk, keep the existing dev workflow, but doc the release 
process to remove CHANGES.txt in trunk at the time of a release going out of 
trunk.

+Vinod

On Apr 2, 2015, at 10:12 AM, Allen Wittenauer a...@altiscale.com wrote:

But in reality, I suspect the opposite: removing changes.txt just from trunk 
will make cherry picks easier.  If you don’t have to update trunk’s 
changes.txt, you can cherry-pick with no worries about conflict merges on 
changes.txt in other branches. Then just update changes.txt in branch-2 
manually as you would have done pre- this change anyway.



Re: IMPORTANT: automatic changelog creation

2015-04-02 Thread Tsz Wo Sze
Generating change log from JIRA is a good idea.  It is based on an assumption 
that each JIRA has an accurate summary (a.k.a. JIRA title) to reflect the 
committed change. Unfortunately, the assumption is invalid in many cases since 
we never enforce that the JIRA summary must be the same as the change log.  We 
may compare the current CHANGES.txt with the generated change log.  I bet the 
diff is long.

Besides, after a release R1 is out, someone may (accidentally or intentionally) 
modify the JIRA summary.  Then, the entry for the same item in a later release 
R2 could be different from the one in R1.

I agree that manually editing CHANGES.txt is not a perfect solution.  However, 
it has worked well in the past for many releases.  I suggest we keep the 
current dev workflow.  Try using the new script provided by HADOOP-11731 to 
generate the next release.  If everything works well, we shall remove 
CHANGES.txt and revise the dev workflow.  What do you think?

Regards,
Tsz-Wo
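
The comparison suggested above (the current CHANGES.txt against a generated change log) could be approximated by comparing the sets of issue keys each text mentions. This is only a sketch: it ignores summaries entirely, the function names are made up for illustration, and the issue numbers in the sample text are hypothetical.

```python
import re

# Issue keys for the Hadoop JIRA projects (HADOOP, HDFS, YARN, MAPREDUCE).
ISSUE_KEY = re.compile(r"\b(?:HADOOP|HDFS|YARN|MAPREDUCE)-\d+\b")


def issue_keys(text):
    """Collect every issue key mentioned anywhere in the text."""
    return set(ISSUE_KEY.findall(text))


def compare(manual_text, generated_text):
    """Return (keys only in the manual log, keys only in the generated log)."""
    manual = issue_keys(manual_text)
    generated = issue_keys(generated_text)
    return sorted(manual - generated), sorted(generated - manual)


# Hypothetical excerpts of a hand-maintained and a generated change log.
manual_log = """\
    HADOOP-100. Fix the widget. (alice)
    HDFS-200. Handle empty input. (bob)
"""
generated_log = """\
* HADOOP-100 Fix the widget
* YARN-300 Add a new scheduler option
"""

only_manual, only_generated = compare(manual_log, generated_log)
print("only in CHANGES.txt:", only_manual)
print("only in generated log:", only_generated)
```

Keys that appear on only one side are the entries worth checking by hand before trusting either source.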


 On Thursday, April 2, 2015 12:57 PM, Allen Wittenauer a...@altiscale.com 
wrote:
   
 

 

On Apr 2, 2015, at 12:40 PM, Vinod Kumar Vavilapalli vino...@hortonworks.com 
wrote:

 
 We'd then be doing two commits for every patch. Let's simply not remove 
 CHANGES.txt from trunk, keep the existing dev workflow, but doc the release 
 process to remove CHANGES.txt in trunk at the time of a release going out of 
 trunk.



Might as well copy branch-2’s changes.txt into trunk then. (or 2.7’s.  Last I 
looked, people updated branch-2 and not 2.7’s or vice versa for some patches 
that went into both branches.)  So that folks who are committing to both 
branches and want to cherry pick all changes can.  

I mean, trunk’s is very very very wrong. Right now. Today. Borderline useless. 
See HADOOP-11718 (which I will now close out as won’t fix)… and that jira is 
only what is miscategorized, not what is missing.

 
  

Re: IMPORTANT: automatic changelog creation

2015-04-02 Thread Allen Wittenauer

On Apr 2, 2015, at 11:36 AM, Mai Haohui ricet...@gmail.com wrote:

 Hi Allen,
 
 Thanks for driving this. Just some quick questions:
 
   Removing changes.txt, relnotes.py, etc from branch-2 would be an 
 incompatible change.  Pushing aside the questions of that document’s 
 quality (hint: lots of outright lying and missing several hundred jiras), 
 it's effectively an interface in use by quite a few folks.
 
 Why is removing CHANGES.txt an incompatible change? Why is CHANGES.txt
 an interface? Can you give some examples?

With my end user ops hat on, for years I'd often run scripts over 
CHANGES.TXT to pull key things out of releases, including to look up extra metadata 
that wasn’t in that file, and reformat the result for my users to digest.  (Especially 
since the release notes weren’t published with the release tar and, let’s be honest, 
were mostly indecipherable heaps of crap to the point that even the RMs never 
bothered to really look at them...) CHANGES.txt was useful to get the base 
dataset, esp in the days before JIRA’s REST interface.

It is/was, in essence, an interface.
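
To make the "interface" point concrete, here is a rough sketch of the kind of script described above, in Python.  It assumes the usual CHANGES.txt layout: a "Release X.Y.Z - date" line, indented upper-case section headings such as INCOMPATIBLE CHANGES, and indented entries that start with a JIRA key and a period.  The function name and regexes are illustrative guesses, not anyone's actual tooling.

```python
import re
from collections import defaultdict

# Assumed layout of CHANGES.txt (not verified against every branch):
#   Release 2.7.0 - 2015-04-20
#     INCOMPATIBLE CHANGES
#       HADOOP-12345. Summary text. (contributor)
RELEASE_RE = re.compile(r"^Release\s+(\S+)")
SECTION_RE = re.compile(r"^\s+([A-Z][A-Z ]+)\s*$")
ISSUE_RE = re.compile(r"^\s+([A-Z]+-\d+)\.")


def index_changes(text):
    """Return {release: {section: [jira keys]}} for a CHANGES.txt-style file."""
    index = defaultdict(lambda: defaultdict(list))
    release = section = None
    for line in text.splitlines():
        m = RELEASE_RE.match(line)
        if m:
            release, section = m.group(1), None
            continue
        m = SECTION_RE.match(line)
        if m:
            section = m.group(1).strip()
            continue
        m = ISSUE_RE.match(line)
        if m and release and section:
            index[release][section].append(m.group(1))
        # Anything else (blank lines, wrapped summaries) is ignored.
    return index
```

The resulting keys can then be fed to whatever supplies the extra metadata.  Any script of this shape breaks the moment the file disappears or changes format, which is the sense in which the file acts as an interface.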

 It looks like the meaning of incompatibility is overloaded -- at
 the very least, in
 http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/Compatibility.html,
 compatibility means source and binary compatibility.

FWIW, removing relnotes.py is definitely covered by that document.

 At least to me, CHANGES.txt is not part of the compatibility
 contract. It would be great to see this patch land in
 branch-2.

But yes, I mean beyond that.  It’s a ‘de facto’ standard given how many 
people use it for critical information about what we’ve released.  This is 
about managing user expectations and not just what’s convenient for us.  You 
know, that whole community that we always mention but seem to stomp all over.  
Just because we CAN do something doesn’t mean we SHOULD.

An excellent example of this is the HADOOP_OPTS variable.  I’d LOVE 
LOVE LOVE to kick it to the curb.  It’s the source of a LOT of end user bugs 
and problematic areas in the shell code.  During the rewrite +  the above 
rules, I had the opportunity and bylaws standing to do so.  But I didn’t 
because it’d just flat out break too much stuff, known and unknown.

It’s ok to be conservative when it comes to change.



Re: IMPORTANT: automatic changelog creation

2015-04-02 Thread Allen Wittenauer

On Apr 2, 2015, at 9:51 AM, Karthik Kambatla ka...@cloudera.com wrote:
 
 a) remove CHANGES.TXT from trunk
 
 
 Removing this from trunk makes it particularly hard to cherry-pick changes
 from trunk to branch-2. I would gate this on the removal of CHANGES.txt on
 branch-2 as well, at least until we have some non-future releases off
 branch-2.

Removing changes.txt, relnotes.py, etc from branch-2 would be an 
incompatible change.  Pushing aside the questions of that document’s quality 
(hint: lots of outright lying and missing several hundred jiras), it's 
effectively an interface in use by quite a few folks. 

But in reality, I suspect the opposite: removing changes.txt just from 
trunk will make cherry picks easier.  If you don’t have to update trunk’s 
changes.txt, you can cherry-pick with no worries about conflict merges on 
changes.txt in other branches.  Then just update changes.txt in branch-2 
manually as you would have done pre- this change anyway.


 b) pre-populate x amount of Hadoop 2.x release data into trunk so that the
 auto-indexer can pick it up
 c) update the HowToRelease information with, well, how to do releases
 based upon these new capabilities
 
 
 There is a create-release script that likely needs updating.
 


Yup.  I knew about it (I happen to have it running in another window as 
I type this), but from what I can see, it’s completely undocumented. :(  As I 
update HowToRelease, I’m going to put some notes in it about this script. 

Re: IMPORTANT: automatic changelog creation

2015-04-02 Thread Mai Haohui
Hi Allen,

Thanks for driving this. Just some quick questions:

Removing changes.txt, relnotes.py, etc from branch-2 would be an 
 incompatible change.  Pushing aside the questions of that document’s quality 
 (hint: lots of outright lying and missing several hundred jiras), it's 
 effectively an interface in use by quite a few folks.

Why is removing CHANGES.txt an incompatible change? Why is CHANGES.txt
an interface? Can you give some examples?

It looks like the meaning of incompatibility is overloaded -- at
the very least, in
http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/Compatibility.html,
compatibility means source and binary compatibility.

At least to me, CHANGES.txt is not part of the compatibility
contract. It would be great to see this patch land in
branch-2.

~Haohui

On Thu, Apr 2, 2015 at 10:12 AM, Allen Wittenauer a...@altiscale.com wrote:

 On Apr 2, 2015, at 9:51 AM, Karthik Kambatla ka...@cloudera.com wrote:

 a) remove CHANGES.TXT from trunk


 Removing this from trunk makes it particularly hard to cherry-pick changes
 from trunk to branch-2. I would gate this on the removal of CHANGES.txt on
 branch-2 as well, at least until we have some non-future releases off
 branch-2.

 Removing changes.txt, relnotes.py, etc from branch-2 would be an 
 incompatible change.  Pushing aside the questions of that document’s quality 
 (hint: lots of outright lying and missing several hundred jiras), it's 
 effectively an interface in use by quite a few folks.

 But in reality, I suspect the opposite: removing changes.txt just 
 from trunk will make cherry picks easier.  If you don’t have to update 
 trunk’s changes.txt, you can cherry-pick with no worries about conflict 
 merges on changes.txt in other branches.  Then just update changes.txt in 
 branch-2 manually as you would have done pre- this change anyway.


 b) pre-populate x amount of Hadoop 2.x release data into trunk so that the
 auto-indexer can pick it up
 c) update the HowToRelease information with, well, how to do releases
 based upon these new capabilities


 There is a create-release script that likely needs updating.



 Yup.  I knew about it (I happen to have it running in another window 
 as I type this), but from what I can see, it’s completely undocumented. :(  
 As I update HowToRelease, I’m going to put some notes in it about this 
 script.


IMPORTANT: automatic changelog creation

2015-04-01 Thread Allen Wittenauer
Hello everyone!

(to: and reply-to: set to common-dev, cc: the rest of ‘em, to 
concentrate the discussion)

HADOOP-11731 has just been committed to *trunk*.  This change does two 
things:

a) Removes dev-support/relnotes.py
b) Adds dev-support/releasedocmaker.py

releasedocmaker.py works as a replacement for both the release notes 
generation process as well as the CHANGES.TXT file.  As documented in 
BUILDING.TXT, running ‘mvn site -Preleasedocs’ will generate both the release 
notes and a change log for that release in markdown format based upon the 
FixVersion field in JIRA.  During the creation of the website, these files are 
then converted to HTML for use on the apache.org website.   The release notes 
file only contains incompatible changes and JIRAs that specifically have release 
notes.  The changes file only has the data for that release.
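
The selection rule described above can be sketched in a few lines of Python.  This is not the code in releasedocmaker.py; the record fields (key, summary, fix_versions, incompatible, release_note), the function name, the markdown layout and the placeholder text are all assumptions made for illustration.

```python
def build_docs(issues, version):
    """Return (release_notes_md, changes_md) for one release.

    `issues` is a list of dicts with hypothetical fields:
    key, summary, fix_versions, incompatible, release_note.
    """
    # Only issues whose FixVersion includes this release are considered.
    picked = sorted(
        (i for i in issues if version in i["fix_versions"]),
        key=lambda i: i["key"],
    )
    notes = [f"# Release Notes {version}", ""]
    changes = [f"# Changes {version}", ""]
    for issue in picked:
        # The change log lists everything fixed in this release.
        changes.append(f"* {issue['key']}: {issue['summary']}")
        # The release notes list only incompatible changes and issues
        # that carry an explicit release note.
        if issue.get("incompatible") or issue.get("release_note"):
            text = issue.get("release_note") or "(no release note provided)"
            notes.append(f"* {issue['key']}: {issue['summary']}")
            notes.append(f"  {text}")
    return "\n".join(notes), "\n".join(changes)
```

Because both files are derived from the JIRA fields at generation time, correcting a field in JIRA and regenerating is enough to correct the documents.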

This is obviously an incompatible change.  There is a good chance this 
code will not appear in branch-2.  There might be some additional fallout (esp 
since some folks write code against CHANGES.TXT) so I wanted to give everyone a 
heads up.

Also, I’ll be filing some additional JIRAs/doing some additional work 
to:

a) remove CHANGES.TXT from trunk
b) pre-populate x amount of Hadoop 2.x release data into trunk so that the 
auto-indexer can pick it up
c) update the HowToRelease information with, well, how to do releases based 
upon these new capabilities


Thanks!