From the perspective of an end user who is reading multiple versions'
listings at once, seeing the same JIRA listed as fixed in multiple releases is
totally confusing, especially now that release notes are actually readable.
"So which version was it ACTUALLY fixed in?" is going to be the question. It'd
be worthwhile for folks to actually build, say, trunk and look at the release
notes section of the site build to see how these things are presented in
aggregate before coming to any conclusions. Just viewing a single version's
output will likely give a skewed perspective. (Or, I suppose you can read
https://gitlab.com/_a__w_/eco-release-metadata/tree/master/HADOOP too, but the
sort order is "wrong" for web viewing.)

My read of the HowToCommit fix rules is that they were written from the
perspective of how we typically use branches to cut releases. In other words,
the changes and release notes for 2.6.x (x>0) and 2.7.y (y>0) will likely not
be fully present in 2.8.0, so 2.8.0's notes wouldn't reflect the entirety of,
say, the 2.7.4 release if 2.7.4 and 2.8.0 are being worked on in parallel. This
in turn means the changes and release notes become orthogonal once the minor
release branch is cut. This is also important because there is no guarantee
that a change made in, say, 2.7.4 is actually in 2.8.0: the code may have
changed to the point that the fix isn't needed or wanted.

From an automation perspective, I took this to mean that the a.b.0 release
notes are expected to be committed to all non-released major branches. So trunk
will have release notes for 2.7.0, 2.8.0, 2.9.0, etc., but not for 2.7.1,
2.8.1, or 2.9.1. This makes the fix rules actually pretty easy: list the lowest
a.b.0 release and all non-.0 releases. trunk, as always, is only listed if that
is the only place where it was committed (i.e., the lowest a.b.0 release
happens to be the highest one available).
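
To make that rule concrete, here is a rough sketch in Python. It is only an
illustration of the rule as I've described it, not how any actual tool
implements it: the function names are made up, versions are assumed to be plain
"a.b.c" strings, and trunk is assumed to be represented by its upcoming a.b.0
version (3.0.0 here).

```python
def parse(version):
    """Turn 'a.b.c' into a tuple of ints so versions sort numerically."""
    parts = tuple(int(p) for p in version.split("."))
    if len(parts) != 3:
        raise ValueError(f"expected a.b.c, got {version!r}")
    return parts


def fix_versions(committed):
    """Hypothetical helper: given every release line a change was committed
    to, return the versions to list as fix versions under the rule
    'lowest a.b.0 release plus every non-.0 release'."""
    parsed = sorted({parse(v) for v in committed})
    dot_zero = [v for v in parsed if v[2] == 0]      # a.b.0 releases
    maintenance = [v for v in parsed if v[2] != 0]   # a.b.c with c > 0
    # Keep all maintenance releases, but only the lowest a.b.0 release.
    keep = sorted(maintenance + dot_zero[:1])
    return [".".join(str(n) for n in v) for v in keep]


# Committed to branch-2.6, branch-2.7, branch-2.8, branch-2 and trunk:
print(fix_versions(["2.6.5", "2.7.4", "2.8.0", "2.9.0", "3.0.0"]))
# Committed only to trunk:
print(fix_versions(["3.0.0"]))
```

The first call yields 2.6.5, 2.7.4 and 2.8.0; the second yields only 3.0.0,
which is the "trunk is listed only if that's the only place it was committed"
case falling out of the same rule.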

I suspect people are feeling confused or think the rules need to be
changed mainly because a) we have a lot more branches getting RE work than ever
before in Hadoop's history and b) 2.8.0 has been hanging out in an unreleased
branch for ~7 months. [The PMC should probably vote to kill that branch and
just cut a new 2.8.0 based off of the current top of branch-2. I think that'd
go a long way to clearing the confusion as well as actually making 2.8.0
relevant again for those that still want to work on branch-2.]

Also:

> Assuming the semantic versioning (http://semver.org) as
> our baseline thinking,

We don't use semantic versioning and you'll find zero references to it
in any Apache Hadoop documentation. If we were following semver, even in the
loosest sense, 2.7.0 should have been 3.0.0 with the JRE upgrade requirement
(which, ironically, is still causing issues with folks moving things between
2.6 and 2.7+; see the other thread about the Dockerfile). In a stricter sense,
we should be on v11 or something, given the number of incompatible changes
throughout branch-2's history.

> On Jul 22, 2016, at 11:44 AM, Andrew Wang wrote:
>
>>
>>
>> I am also not quite sure I understand the rationale of what's in the
>> HowToCommit wiki. Assuming the semantic versioning (http://semver.org) as
>> our baseline thinking, having concurrent release streams alone breaks the
>> principle. And that is *regardless of* how we line up individual releases
>> in time (2.6.4 v. 2.7.3). Semantic versioning means 2.6.z < 2.7.* where *
>> is any number. Therefore, the moment we have any new 2.6.z release after
>> 2.7.0, the rule is broken and remains that way. Timing of subsequent
>> releases is somewhat irrelevant.
>>
>> From a practical standpoint, I would love to know whether a certain patch
>> has been backported to a specific version. Thus, I would love to see fix
>> version enumerating all the releases that the JIRA went into. Basically the
>> more disclosure, the better. That would also make it easier for us
>> committers to see the state of the porting and identify issues like being
>> ported to 2.6.x but not to 2.7.x. What do you think? Should we revise our
>> policy?
>>
>>
> I also err towards more fix versions. Based on our branching strategy of
> branch-x -> branch-x.y -> branch-x.y.z, I think this means that the
> changelog will identify everything since the previous
> last-version-component of the branch name. So 2.6.5 diffs against 2.6.4,
> 2.8.0 diffs against 2.7.0, 3.0.0 against 2.0.0. This makes it more
> straightforward for users to determine what changelogs are important, based
> purely on the version number.
>
> I agree with Sangjin that the #1 question that the changelogs should
> address is whether a certain patch is present in a version. For this
> use case, it's better to have duplicate info than to omit something.
>
> To answer "what's new", I think that's answered by the manually curated
> release notes, like the ones we put together at HADOOP-13383.