Thanks for bringing this to a conclusion.

On Mon, Oct 15, 2018 at 6:18 PM Thomas Weise <t...@apache.org> wrote:
>
> Here is my attempt to summarize the discussion, please see the TBDs.
>
> I would work on a PR with respective contributor and committer guideline 
> updates next.
>
> Thanks,
> Thomas
>
>
> Goals:
>
> - Clear history with purpose and origin of changes
> - Ability to perform a granular rollback, if necessary
> - Efficiency and quality of review (avoid tiny or out-of-context changes or 
> huge mega-changes)
> - Efficiency of authoring (don't want to wait on a review for a tiny bit 
> because GitHub makes it very hard to stack up reviews in sequence / don't 
> want to have major changes blocked because of difficulty of review)
> - Ease of new contribution ("OK for committers to do more, while new/one-time 
> contributors shouldn't need to know or obey any policy"). TBD: I think that 
> quote needs clarification. IMO contributors are expected to read and adhere 
> to contribution guidelines.
>
> Clean history:
>
> - Commit messages should tag JIRAs and be otherwise descriptive. It should 
> not be necessary to find a merge or first PR commit to find out what caused a 
> change
> - We want to squash "Fixup!", "Address comments" type of commits to achieve a 
> clean history
> - We prefer that PR authors squash such commits after review is complete. 
> This expectation should be described clearly in the Contributor's Guide.
> - The process should not burden committer due to back and forth with author 
> and deal with cleaning up PR history and other cosmetics. We want to reduce 
> committer overhead. But note that we don't want to shift the burden to first 
> time contributors either.
> - Committer can use the "squash and merge" option (or modify the PR commits 
> in other ways). This should address the overhead concern. TBD: I would 
> suggest that the author, when this is explicitly not desired, needs to 
> indicate it upfront.

I wouldn't require that the author note this on every PR. I would say
if there are (obvious) fixup commits, the committer is free to squash
unless asked not to. If the history is clean, favor trusting the
author. Subject to the bullet point below.

> - Committer is ultimately responsible (committer guidelines) and we "trust 
> the committer's judgment"
>
> Granularity of changes:
>
> - We prefer small independent, incremental PRs with descriptive, isolated 
> commits. Each commit is a single clear change.
> - It is OK to keep separate commits for different logical pieces of the code, 
> which make reviewing and revisiting code easier
> - Making commits isolated is a good practice, PR author should be able to 
> relatively easily split the PR upon reviewer's request
> - For multiple commits in a PR, every commit that gets merged should compile 
> and pass tests

Generally +1, though there is an important exception. E,g. for large
refactoring, often it's clearer (for both history and reviewing) to
have a single commit that's "ran this tool/command line/script/..." on
the codebase and a follow-up that's manual, human edits (e.g. to get
it to compile and pass tests).

> Git playbook
>
> - How to rollback multiple commits from single PR
> - Simple step-by-step instructions for authors to cleanup history

Regarding "rebase and merge," it's often the worst of both, as it
gives multiple commits, but none of them matching what was in the
author's branch. Generally those who care about history enough to
create multiple distinct commits also prefer that the commits remain
intact.

> On Mon, Oct 1, 2018 at 11:50 AM Rui Wang <ruw...@google.com> wrote:
>>
>> +1 to add JIRA issue as the part of commit message. it requires less effort 
>> but will help a lot on our commit history. I used to not do that but I will 
>> start to add JIRA info to my future commits.
>>
>> -Rui
>>
>> On Mon, Oct 1, 2018 at 2:59 AM Alexey Romanenko <aromanenko....@gmail.com> 
>> wrote:
>>>
>>> +1 to what Anton said, I’m fully agree with this.
>>>
>>> Looking on the current commit history we can notice that there are many 
>>> commits like “fix”, “add/edit/remove”, “clean up”, etc, that, seems, were 
>>> introduced after the review process iterations. I think this makes commit 
>>> history too much verbose and more difficult to follow.
>>>
>>> Also, most of such commits don’t have Jira issue name as part of commit 
>>> message. So it requires to find a merge or fist PR commit in case we want 
>>> to find out what caused this change.
>>>
>>> I believe we can improve our commit guidelines in this way and it should 
>>> help to have commit history more clean and easy to read.
>>>
>>> On 1 Oct 2018, at 06:34, Kenneth Knowles <k...@apache.org> wrote:
>>>
>>> SGTM. I generally favor "trust the committer's judgment" aka (2)(d) when it 
>>> is obvious. We had some small communication problem about this before so I 
>>> just wanted to be careful to ask the author when it is not obvious.
>>>
>>> Kenn
>>>
>>> On Sat, Sep 29, 2018 at 9:06 AM Robert Bradshaw <rober...@google.com> wrote:
>>>>
>>>> There's two separate topics of discussion going on here.
>>>>
>>>> (1) Is it acceptable for PRs to have more than one commit. We've discussed 
>>>> this before, and the consensus seems to be that this is both allowed and 
>>>> intentional.
>>>>
>>>> (2) What to do about PRs that are clearly "[BEAM-NNNN] Implement feature" 
>>>> + (fixup/address comments/lint/...)*. I think this is easier (or at least 
>>>> quicker) to make a call on. Our options are
>>>>
>>>> (a) Merge as is.
>>>> (b) Ask the author to do a final squash. (Doing a squash before an 
>>>> approval, however, makes reviewing more difficult, so this enforces 
>>>> another round.)
>>>> (c) Having the committer manually pull, squash, (force push to PR?), and 
>>>> merge.
>>>> (d) Use the "squash and merge" button in this case.
>>>>
>>>> Clearly, we shouldn't be doing any squashing unless the intent is obvious, 
>>>> but when it is, I think (d) is the sanest and safest route.
>>>>
>>>> - Robert
>>>>
>>>>
>>>>
>>>> On Fri, Sep 28, 2018 at 9:53 PM Kenneth Knowles <k...@apache.org> wrote:
>>>>>
>>>>> On Fri, Sep 28, 2018 at 10:29 AM Thomas Weise <t...@apache.org> wrote:
>>>>>>
>>>>>> +1 for stating the goal of clear provenance and granular rollback.
>>>>>
>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>> Also of course efficiency and quality of review (we don't to review tiny 
>>>>> or out-of-context changes or huge mega-changes), efficiency of authoring 
>>>>> (don't want to wait on a review for a tiny bit because GitHub makes it 
>>>>> very hard to stack up reviews in sequence / don't want to have major 
>>>>> changes blocked because of difficulty of review), ease of new 
>>>>> contribution (OK for committers to do more IMO, while new/one-time 
>>>>> contributors shouldn't need to know or obey any policy).
>>>>>
>>>>>> I think this discussion helps to remind/identify best practices how to 
>>>>>> get there. Where appropriate we should augment guidelines for both, 
>>>>>> contributor and committer.
>>>>>
>>>>>
>>>>>> Kenn: would you elaborate on the "1 commit = 1 review (and sometimes 
>>>>>> even = 1 ticket)" a bit more. Is that a problem of insufficient task / 
>>>>>> ticket granularity or something else?
>>>>>
>>>>>
>>>>> The problem isn't a failure to define tasks at the right granularity, but 
>>>>> that they naturally and fundamentally exist with a different granularity 
>>>>> (unit of change -> unit of review -> unit of tracking/delivery). I'd 
>>>>> guess there's often a 5x jump in size from commit -> review -> ticket. 
>>>>> There are many many easy-to-isolate changes that are wasteful to 
>>>>> independent review or track.
>>>>>
>>>>> To get some concrete facts, I bet the one things we could probably find 
>>>>> research on is the ideal review size. And we could also scrape logs for 
>>>>> messages with bullet points (often each bullet is basically what would 
>>>>> have been a commit). Generally getting a sense of what these ratios are 
>>>>> in practice and idealized would be kind of interesting but maybe overkill.
>>>>>
>>>>> Kenn
>>>>>
>>>>>> Charles: my plan is to translate the outcome of this discussion to 
>>>>>> guideline updates and your log filter trick will be part of it. Though 
>>>>>> my hope is that it won't be needed if we get closer to the goals above.
>>>>>>
>>>>>> Thanks,
>>>>>> Thomas
>>>>>>
>>>>>>
>>>>>> On Fri, Sep 28, 2018 at 9:44 AM Kenneth Knowles <k...@apache.org> wrote:
>>>>>>>
>>>>>>> Anton makes a good point. We have been talking about policy for what we 
>>>>>>> do, but the real issue is what we want to come out of it: a clear 
>>>>>>> history for seeing where code came from and granular rollback. I think 
>>>>>>> in both cases the key thing is that each commit is a single clear 
>>>>>>> change. How they get there is not the point.
>>>>>>>
>>>>>>> I have worked on multiple projects with a 1 commit = 1 review (and 
>>>>>>> sometimes even = 1 ticket). These pretty much never have a good 
>>>>>>> history. The best case is that each commit has a message that is a 
>>>>>>> bullet point of many separate changes, because it is simply too 
>>>>>>> inefficient to review each logical change separately. But since the 
>>>>>>> messages become less useful, it encourages a culture of not even 
>>>>>>> bothering to write meaningful messages at all.
>>>>>>>
>>>>>>> Note that is trivial for a committer-reviewer to edit the history any 
>>>>>>> way they like without the button. "Allow edits by maintainers" is on by 
>>>>>>> default. The "Squash and merge" button just adds a button for something 
>>>>>>> we can already do.
>>>>>>>
>>>>>>> Charles: super useful! Worth noting that for a PR with a good history 
>>>>>>> it will skip meaningful commits (but still give the summary line, which 
>>>>>>> is nice).
>>>>>>>
>>>>>>> Kenn
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Fri, Sep 28, 2018 at 8:54 AM Anton Kedin <ke...@google.com> wrote:
>>>>>>>>
>>>>>>>> Is there an actual problem caused by squashing or not squashing the 
>>>>>>>> commits that we face in the project? I personally have never needed to 
>>>>>>>> revert something complicated that would be problematic either way (and 
>>>>>>>> don't have a strong opinion about which way we should do it). From 
>>>>>>>> what I see so far in the thread it doesn't look like reverting is a 
>>>>>>>> frequent major pain for anyone. Maybe it is exactly because we're 
>>>>>>>> mostly following some best practice and it makes it easy. If someone 
>>>>>>>> has concrete examples from their experience in the project, please 
>>>>>>>> share them, this way it would be easier to justify the choice.
>>>>>>>>
>>>>>>>> The PR and commit cleanliness, size and isolation are probably more 
>>>>>>>> important thing to have guidance and maybe enforcement for. There are 
>>>>>>>> well known practices and guidelines that I think we should follow, and 
>>>>>>>> I think they will make squashing or not squashing mostly irrelevant. 
>>>>>>>> For example, if we accept that commits should have description that 
>>>>>>>> actually describes what commit does, then "!fixup", "address comments" 
>>>>>>>> and similar should not be part of the history and should be squashed 
>>>>>>>> before submitting the PR no matter which way we decide to go in 
>>>>>>>> general. Also, I think that making commits isolated is also a good 
>>>>>>>> practice, and PR author should be able to relatively easily split the 
>>>>>>>> PR upon reviewer's request. And if we choose to keep whole PRs small 
>>>>>>>> and incremental with descriptive isolated commits, then there won't be 
>>>>>>>> too much difference how many commits there are.
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Anton
>>>>>>>>
>>>>>>>> On Fri, Sep 28, 2018 at 8:21 AM Andrew Pilloud <apill...@google.com> 
>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>> I brought up this discussion a few months ago from the other side: I 
>>>>>>>>> don't like my commits being squashed. I try to create logical commits 
>>>>>>>>> that each passes tests and could be broken up into multiple PRs. 
>>>>>>>>> Keeping those changes intact is useful from a history perspective and 
>>>>>>>>> squashing may break other PRs I have in flight. If the intent is 
>>>>>>>>> clear (one commit with a descriptive message and a bunch of 
>>>>>>>>> "fixups"), then feel free to squash, otherwise ask first. When you do 
>>>>>>>>> squash, it would be good to leave a comment as to how the author can 
>>>>>>>>> avoid having their commits squashed in the future.
>>>>>>>>>
>>>>>>>>> https://lists.apache.org/thread.html/8d29e474e681ab9123280164d95075bb8b0b91486b66d3fa25ed20c2@%3Cdev.beam.apache.org%3E
>>>>>>>>>
>>>>>>>>> Andrew
>>>>>>>>>
>>>>>>>>> On Fri, Sep 28, 2018 at 7:29 AM Chamikara Jayalath 
>>>>>>>>> <chamik...@google.com> wrote:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Thu, Sep 27, 2018 at 9:51 AM Robert Bradshaw 
>>>>>>>>>> <rober...@google.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>> I agree that we should create a good pointer for cleaning up PRs, 
>>>>>>>>>>> and request (though not require) that authors do it. It's 
>>>>>>>>>>> unfortunate though that squashing during a review makes things 
>>>>>>>>>>> difficult to follow, so adds one more round trip.
>>>>>>>>>>>
>>>>>>>>>>> We could consider for those PRs that make sense as a single logical 
>>>>>>>>>>> commit (most, but not all, of them) simply using the "squash and 
>>>>>>>>>>> merge" button even though it technically doesn't create a merge 
>>>>>>>>>>> commit.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> +1 for allowing "squash and merge" as an option. Most of the reviews 
>>>>>>>>>> (at least for me) consist of a single valid commit and several 
>>>>>>>>>> additional commits that get piled up during the review process which 
>>>>>>>>>> obviously should not be included in the commit history. Going 
>>>>>>>>>> through another round here just to ask the author to fixup 
>>>>>>>>>> everything is unnecessarily time consuming.
>>>>>>>>>>
>>>>>>>>>> - Cham
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Fri, Sep 21, 2018 at 9:24 PM Daniel Oliveira 
>>>>>>>>>>> <danolive...@google.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> As a non-committer I think some automated squashing of commits 
>>>>>>>>>>>> sounds best since it lightens the load of regular contributors, by 
>>>>>>>>>>>> not having to always remember to squash, and lightens the load of 
>>>>>>>>>>>> committers so it doesn't take as long to have your PR approved by 
>>>>>>>>>>>> one.
>>>>>>>>>>>>
>>>>>>>>>>>> But for now I think the second best route would be making it PR 
>>>>>>>>>>>> author's responsibility to squash fixup commits. Having that 
>>>>>>>>>>>> expectation described clearly in the Contributor's Guide, along 
>>>>>>>>>>>> with some simple step-by-step instructions for how to do so should 
>>>>>>>>>>>> be enough. I mainly support this because I've been doing the 
>>>>>>>>>>>> squashing myself since I saw a thread about it here a few months 
>>>>>>>>>>>> ago. It's not nearly as huge a burden on me as it probably is for 
>>>>>>>>>>>> committers who have to merge in many more PRs, it's very easy to 
>>>>>>>>>>>> learn how to do, and it's one less barrier to having my code 
>>>>>>>>>>>> merged in.
>>>>>>>>>>>>
>>>>>>>>>>>> Of course I wouldn't expect that committers wait for PR authors to 
>>>>>>>>>>>> squash their fixup commits, but I think leaving a message like 
>>>>>>>>>>>> "For future pull requests you should squash any small fixup 
>>>>>>>>>>>> commits, as described here: <link>" should be fine.
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> I was also thinking about the possibility of wanting to revert
>>>>>>>>>>>>> individual commits from a merge commit. The solution you propose 
>>>>>>>>>>>>> works,
>>>>>>>>>>>>> but only if you want to revert everything.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Does this happen often? I might not have enough context since I'm 
>>>>>>>>>>>> not a committer, but it seems to me that often the person 
>>>>>>>>>>>> performing a revert is not the original author of a change and 
>>>>>>>>>>>> doesn't have the context or time to pick out an individual commit 
>>>>>>>>>>>> to revert.
>>>>>>>>>>>>
>>>>>>>>>>>> On Wed, Sep 19, 2018 at 1:32 PM Maximilian Michels 
>>>>>>>>>>>> <m...@apache.org> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> I tend to agree with you Lukasz. Of course we should try to 
>>>>>>>>>>>>> follow the
>>>>>>>>>>>>> guide lines as much as possible but if it requires an extra back 
>>>>>>>>>>>>> and
>>>>>>>>>>>>> forth with the PR author for a cosmetic change, it may not be 
>>>>>>>>>>>>> worth the
>>>>>>>>>>>>> time.
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 19.09.18 22:17, Lukasz Cwik wrote:
>>>>>>>>>>>>> > I have to say I'm guilty of not following the merge guidelines,
>>>>>>>>>>>>> > sometimes doing merges without rebasing/flatten commits.
>>>>>>>>>>>>> >
>>>>>>>>>>>>> > I find that it is a few extra mins of my time to fix someones 
>>>>>>>>>>>>> > PR history
>>>>>>>>>>>>> > if they have more then one logical commit they want to be 
>>>>>>>>>>>>> > separate and
>>>>>>>>>>>>> > it usually takes days for the PR author to do merging  with the 
>>>>>>>>>>>>> > extra
>>>>>>>>>>>>> > burden as a committer to keep track of another PR and its state 
>>>>>>>>>>>>> > (waiting
>>>>>>>>>>>>> > for clean-up) is taxing. I really liked the idea of the 
>>>>>>>>>>>>> > mergebot (even
>>>>>>>>>>>>> > though it didn't work out in practice) because it could do all 
>>>>>>>>>>>>> > the
>>>>>>>>>>>>> > policy work on my behalf.
>>>>>>>>>>>>> >
>>>>>>>>>>>>> > Anything that reduces my overhead as a committer is useful as 
>>>>>>>>>>>>> > for the
>>>>>>>>>>>>> > 100s of PRs that I have merged, I've only had to rollback a 
>>>>>>>>>>>>> > couple so
>>>>>>>>>>>>> > I'm for Charle's suggestion which makes the rollback flow 
>>>>>>>>>>>>> > slightly more
>>>>>>>>>>>>> > complicated for a significantly easier PR merge workflow.
>>>>>>>>>>>>> >
>>>>>>>>>>>>> > On Wed, Sep 19, 2018 at 1:13 PM Charles Chen <c...@google.com
>>>>>>>>>>>>> > <mailto:c...@google.com>> wrote:
>>>>>>>>>>>>> >
>>>>>>>>>>>>> >     What I mean is that if you get the first-parent commit 
>>>>>>>>>>>>> > using "git
>>>>>>>>>>>>> >     log --first-parent", it will incorporate any and all fix up 
>>>>>>>>>>>>> > commits
>>>>>>>>>>>>> >     so we don't need to worry about missing any.
>>>>>>>>>>>>> >
>>>>>>>>>>>>> >     On Wed, Sep 19, 2018, 1:07 PM Maximilian Michels 
>>>>>>>>>>>>> > <m...@apache.org
>>>>>>>>>>>>> >     <mailto:m...@apache.org>> wrote:
>>>>>>>>>>>>> >
>>>>>>>>>>>>> >         Generally, +1 for isolated commits which are easy to 
>>>>>>>>>>>>> > revert.
>>>>>>>>>>>>> >
>>>>>>>>>>>>> >          > I don't think it's actually harder to roll back a 
>>>>>>>>>>>>> > set of
>>>>>>>>>>>>> >         commits that are merged together.
>>>>>>>>>>>>> >         I think Thomas was mainly concerned about "fixup" 
>>>>>>>>>>>>> > commits to
>>>>>>>>>>>>> >         land in
>>>>>>>>>>>>> >         master (as part of a merge). These indeed make 
>>>>>>>>>>>>> > reverting commits
>>>>>>>>>>>>> >         more
>>>>>>>>>>>>> >         difficult because you have to check whether you missed 
>>>>>>>>>>>>> > a "fixup".
>>>>>>>>>>>>> >
>>>>>>>>>>>>> >          > Ideally every commit should compile and pass tests 
>>>>>>>>>>>>> > though, right?
>>>>>>>>>>>>> >
>>>>>>>>>>>>> >         That is definitely what we should strive for when doing 
>>>>>>>>>>>>> > a merge
>>>>>>>>>>>>> >         against
>>>>>>>>>>>>> >         master.
>>>>>>>>>>>>> >
>>>>>>>>>>>>> >          > Perhaps the bigger issue is that we need better 
>>>>>>>>>>>>> > documentation
>>>>>>>>>>>>> >         and a playbook on how to do this these common tasks in 
>>>>>>>>>>>>> > git.
>>>>>>>>>>>>> >
>>>>>>>>>>>>> >         We do actually have basic documentation about this but 
>>>>>>>>>>>>> > most
>>>>>>>>>>>>> >         people don't
>>>>>>>>>>>>> >         read it. For example, the commit message of a Merge 
>>>>>>>>>>>>> > commit
>>>>>>>>>>>>> >         should be:
>>>>>>>>>>>>> >
>>>>>>>>>>>>> >         Merge pull request #xxxx: [BEAM-yyyy] Issue title
>>>>>>>>>>>>> >
>>>>>>>>>>>>> >         But most merge commits don't comply with this rule :) 
>>>>>>>>>>>>> > See
>>>>>>>>>>>>> >         
>>>>>>>>>>>>> > https://beam.apache.org/contribute/committer-guide/#merging-it
>>>>>>>>>>>>> >
>>>>>>>>>>>>> >         On 19.09.18 21:34, Reuven Lax wrote:
>>>>>>>>>>>>> >          > Ideally every commit should compile and pass tests 
>>>>>>>>>>>>> > though, right?
>>>>>>>>>>>>> >          >
>>>>>>>>>>>>> >          > On Wed, Sep 19, 2018 at 12:15 PM Ankur Goenka
>>>>>>>>>>>>> >         <goe...@google.com <mailto:goe...@google.com>
>>>>>>>>>>>>> >          > <mailto:goe...@google.com 
>>>>>>>>>>>>> > <mailto:goe...@google.com>>> wrote:
>>>>>>>>>>>>> >          >
>>>>>>>>>>>>> >          >     I agree with the cleanliness of the Commit 
>>>>>>>>>>>>> > history.
>>>>>>>>>>>>> >          >     "Fixup!", "Address comments", "Address even more
>>>>>>>>>>>>> >         comments" type of
>>>>>>>>>>>>> >          >     comments does not convey meaningful information 
>>>>>>>>>>>>> > and are
>>>>>>>>>>>>> >         not very
>>>>>>>>>>>>> >          >     useful. Its a good idea to squash them.
>>>>>>>>>>>>> >          >     However, I think its ok to keep separate commits 
>>>>>>>>>>>>> > for
>>>>>>>>>>>>> >         different
>>>>>>>>>>>>> >          >     logical pieces of the code which make reviewing 
>>>>>>>>>>>>> > and
>>>>>>>>>>>>> >         revisiting code
>>>>>>>>>>>>> >          >     easier.
>>>>>>>>>>>>> >          >     Example PR: Support X in the pipeline
>>>>>>>>>>>>> >          >     Commit 1: Restructuring a bunch of code without 
>>>>>>>>>>>>> > any
>>>>>>>>>>>>> >         logical change.
>>>>>>>>>>>>> >          >     Commit 2: Changing validation logic for pipeline.
>>>>>>>>>>>>> >          >     Commit 3: Supporting new field "X" for pipeline.
>>>>>>>>>>>>> >          >
>>>>>>>>>>>>> >          >     On Wed, Sep 19, 2018 at 11:27 AM Charles Chen
>>>>>>>>>>>>> >         <c...@google.com <mailto:c...@google.com>
>>>>>>>>>>>>> >          >     <mailto:c...@google.com 
>>>>>>>>>>>>> > <mailto:c...@google.com>>> wrote:
>>>>>>>>>>>>> >          >
>>>>>>>>>>>>> >          >         To be concrete, it is very easy to revert a 
>>>>>>>>>>>>> > commit in
>>>>>>>>>>>>> >         any case:
>>>>>>>>>>>>> >          >
>>>>>>>>>>>>> >          >          1. First, use "git log --first-parent" to 
>>>>>>>>>>>>> > find the
>>>>>>>>>>>>> >         first-parent
>>>>>>>>>>>>> >          >             commit corresponding to a PR merge (this 
>>>>>>>>>>>>> > is a
>>>>>>>>>>>>> >         one-to-one
>>>>>>>>>>>>> >          >             correspondence).
>>>>>>>>>>>>> >          >          2. Use "git revert -m 1 <commitid>" to 
>>>>>>>>>>>>> > revert the
>>>>>>>>>>>>> >         commit; this
>>>>>>>>>>>>> >          >             selects the first parent as the base for 
>>>>>>>>>>>>> > a merge
>>>>>>>>>>>>> >         commit (in
>>>>>>>>>>>>> >          >             the case where a single commit needs to 
>>>>>>>>>>>>> > be
>>>>>>>>>>>>> >         reverted, just
>>>>>>>>>>>>> >          >             use "git revert <commitid>" without the 
>>>>>>>>>>>>> > "-m 1" flag).
>>>>>>>>>>>>> >          >
>>>>>>>>>>>>> >          >         In any case, as a general good engineering 
>>>>>>>>>>>>> > practice,
>>>>>>>>>>>>> >         I do agree
>>>>>>>>>>>>> >          >         that it is highly desirable to have small 
>>>>>>>>>>>>> > independent PRs
>>>>>>>>>>>>> >          >         instead of large jumbo PRs whenever possible.
>>>>>>>>>>>>> >          >
>>>>>>>>>>>>> >          >         On Wed, Sep 19, 2018 at 11:20 AM Charles Chen
>>>>>>>>>>>>> >         <c...@google.com <mailto:c...@google.com>
>>>>>>>>>>>>> >          >         <mailto:c...@google.com 
>>>>>>>>>>>>> > <mailto:c...@google.com>>> wrote:
>>>>>>>>>>>>> >          >
>>>>>>>>>>>>> >          >             I don't think it's actually harder to 
>>>>>>>>>>>>> > roll back a
>>>>>>>>>>>>> >         set of
>>>>>>>>>>>>> >          >             commits that are merged together.  Git 
>>>>>>>>>>>>> > has the
>>>>>>>>>>>>> >         notion of
>>>>>>>>>>>>> >          >             first-parent commits (you can see, for 
>>>>>>>>>>>>> > example,
>>>>>>>>>>>>> >         "git log
>>>>>>>>>>>>> >          >             --first-parent", which filters out the 
>>>>>>>>>>>>> > intermediate
>>>>>>>>>>>>> >          >             commits).  In this sense, PRs still get 
>>>>>>>>>>>>> > merged as
>>>>>>>>>>>>> >         one unit
>>>>>>>>>>>>> >          >             and this is preserved even if 
>>>>>>>>>>>>> > intermediate
>>>>>>>>>>>>> >         commits are
>>>>>>>>>>>>> >          >             kept.  Perhaps the bigger issue is that 
>>>>>>>>>>>>> > we need
>>>>>>>>>>>>> >         better
>>>>>>>>>>>>> >          >             documentation and a playbook on how to 
>>>>>>>>>>>>> > do this
>>>>>>>>>>>>> >         these common
>>>>>>>>>>>>> >          >             tasks in git.
>>>>>>>>>>>>> >          >
>>>>>>>>>>>>> >          >             On Wed, Sep 19, 2018 at 9:27 AM Thomas 
>>>>>>>>>>>>> > Weise
>>>>>>>>>>>>> >         <t...@apache.org <mailto:t...@apache.org>
>>>>>>>>>>>>> >          >             <mailto:t...@apache.org 
>>>>>>>>>>>>> > <mailto:t...@apache.org>>>
>>>>>>>>>>>>> >         wrote:
>>>>>>>>>>>>> >          >
>>>>>>>>>>>>> >          >                 Wanted to bring this up as reminder 
>>>>>>>>>>>>> > as well as
>>>>>>>>>>>>> >          >                 opportunity to discuss potential 
>>>>>>>>>>>>> > changes to our
>>>>>>>>>>>>> >          >                 committer guide. It has been a while 
>>>>>>>>>>>>> > since
>>>>>>>>>>>>> >         last related
>>>>>>>>>>>>> >          >                 discussion and we welcomed several 
>>>>>>>>>>>>> > new
>>>>>>>>>>>>> >         committers since
>>>>>>>>>>>>> >          >                 then.
>>>>>>>>>>>>> >          >
>>>>>>>>>>>>> >          >                 Finishing up pull requests pre-merge:
>>>>>>>>>>>>> >          >
>>>>>>>>>>>>> >          >
>>>>>>>>>>>>> >         
>>>>>>>>>>>>> > https://beam.apache.org/contribute/committer-guide/#finishing-touches
>>>>>>>>>>>>> >          >
>>>>>>>>>>>>> >          >                 PRs are worked on over time and may
>>>>>>>>>>>>> >         accumulate many
>>>>>>>>>>>>> >          >                 commits. Sometimes because scope 
>>>>>>>>>>>>> > expands,
>>>>>>>>>>>>> >         sometimes just
>>>>>>>>>>>>> >          >                 to separate independent changes but 
>>>>>>>>>>>>> > most of
>>>>>>>>>>>>> >         the time the
>>>>>>>>>>>>> >          >                 commits are just fixups that are 
>>>>>>>>>>>>> > added as
>>>>>>>>>>>>> >         review progresses.
>>>>>>>>>>>>> >          >                 It is important that the latter get 
>>>>>>>>>>>>> > squashed
>>>>>>>>>>>>> >         prior to PR
>>>>>>>>>>>>> >          >                 merge, as otherwise we lost the 
>>>>>>>>>>>>> > ability to
>>>>>>>>>>>>> >         roll back
>>>>>>>>>>>>> >          >                 changes by reverting a single commit 
>>>>>>>>>>>>> > and also
>>>>>>>>>>>>> >         generally
>>>>>>>>>>>>> >          >                 cause a lot of noise in the commit 
>>>>>>>>>>>>> > history
>>>>>>>>>>>>> >         that does not
>>>>>>>>>>>>> >          >                 help other contributors. To be 
>>>>>>>>>>>>> > clear, I refer
>>>>>>>>>>>>> >         to the
>>>>>>>>>>>>> >          >                 "Fixup!", "Address comments", 
>>>>>>>>>>>>> > "Address even more
>>>>>>>>>>>>> >          >                 comments" type of entries :)
>>>>>>>>>>>>> >          >
>>>>>>>>>>>>> >          >                 I would also propose that every 
>>>>>>>>>>>>> > commit gets
>>>>>>>>>>>>> >         tagged with
>>>>>>>>>>>>> >          >                 a JIRA (except those fixups that 
>>>>>>>>>>>>> > will be
>>>>>>>>>>>>> >         squashed).
>>>>>>>>>>>>> >          >                 Having the JIRA and possibly other 
>>>>>>>>>>>>> > tags makes
>>>>>>>>>>>>> >         it easier
>>>>>>>>>>>>> >          >                 for others not involved in the PR to 
>>>>>>>>>>>>> > identify
>>>>>>>>>>>>> >         changes
>>>>>>>>>>>>> >          >                 after they were merged, for example 
>>>>>>>>>>>>> > when
>>>>>>>>>>>>> >         looking at the
>>>>>>>>>>>>> >          >                 revision history or annotated source.
>>>>>>>>>>>>> >          >
>>>>>>>>>>>>> >          >                 As for other scenarios of jumbo PRs 
>>>>>>>>>>>>> > with many
>>>>>>>>>>>>> >         commits,
>>>>>>>>>>>>> >          >                 there are probably situations where 
>>>>>>>>>>>>> > work
>>>>>>>>>>>>> >         needs to be
>>>>>>>>>>>>> >          >                 broken down into smaller units, 
>>>>>>>>>>>>> > making life
>>>>>>>>>>>>> >         better for
>>>>>>>>>>>>> >          >                 both, contributor and reviewer(s). 
>>>>>>>>>>>>> > Ideally,
>>>>>>>>>>>>> >         every PR
>>>>>>>>>>>>> >          >                 would have only one commit, but that 
>>>>>>>>>>>>> > may be a
>>>>>>>>>>>>> >         bit much
>>>>>>>>>>>>> >          >                 to mandate? Is the general 
>>>>>>>>>>>>> > expectation
>>>>>>>>>>>>> >         something we need
>>>>>>>>>>>>> >          >                 to document more clearly?
>>>>>>>>>>>>> >          >
>>>>>>>>>>>>> >          >                 Thanks,
>>>>>>>>>>>>> >          >                 Thomas
>>>>>>>>>>>>> >          >
>>>>>>>>>>>>> >
>>>
>>>

Reply via email to