Re: Issues and PR names and descriptions (or should we change the contribution guide)

2021-04-23 Thread Alexey Romanenko


> On 22 Apr 2021, at 20:18, Robert Bradshaw  wrote:
> 
> On Thu, Apr 22, 2021 at 9:33 AM Kenneth Knowles  > wrote:
> Heuristic CI that says "this commit history looks OK" might solve a lot of 
> the problem (I see that Robert already started on this).
> 
> And finally I was to repeat my agreement with Ismaël and Alexey that the root 
> problem is this: we need to actually care about the commit history and 
> communication of PR/commit titles and descriptions. We use tools to help us 
> to implement our intentions and to communicate them to newcomers, but I don't 
> think this will replace taking care of the repo.
> 
> Committers should care about taking care of the repo more than the average 
> contributor, but even there there is high variance. I think the issue is "oh, 
> I didn't think to squash vs. merge" rather than "who cares, I always press 
> merge anyway" in which case a timely reminder will go a long way. 

+100 (sorry) and if we can additionally have something like a warning before 
merge it would be helpful too. 
I don’t think we really want to add more complexity to development process and 
“light” version with just a warning perhaps will be enough.

> 
> Kenn
> 
> [1] 
> https://lists.apache.org/thread.html/4a65fb0b66935c9dc61568a3067538775edc3e685c6ac03dd3fa4725%40%3Cdev.beam.apache.org%3E
>  
> 
>  
> 
> As for me, I’d prefer that every committer paid more attention (if not yet) 
> on these “non code” things before reviewing/merging a PR.
> 
> [1] https://github.com/apache/spark/blob/master/dev/merge_spark_pr.py 
> 
> 
> 
>> On 22 Apr 2021, at 01:28, Robert Bradshaw > > wrote:
>> 
>> I am also in the camp that it often makes sense to have more than 1 commit 
>> per PR, but rather than enforce a 1 commit per PR policy, I would say that 
>> it's too much bother to look at the commit history whether it should be 
>> squashed or merged (though I think it is almost always very obvious which is 
>> preferable for a given PR), go ahead and squash rather than merge by 
>> default. 
>> 
>> 
>> On Wed, Apr 21, 2021 at 2:23 PM Kenneth Knowles > > wrote:
>> This seems to come up a lot. Maybe we should change something?
>> 
>> Having worked on a number of projects and at companies with this policy, 
>> companies using non-distributed source control, and companies that just "use 
>> git like git", I know all these ways of life pretty well.
>> 
>> TL;DR my experience is:
>>  - when people care about the commit history and take care of it, then just 
>> "use git like git" results in faster development and clearer history, 
>> despite intermediate commits not being tested by Jenkins/Travis/GHA
>>  - when people see git as an inconvenience, view the history as an 
>> implementation detail, or think in linear history of PR merges and view the 
>> commits as an implementation detail, it becomes a mess
>> 
>> Empirically, this is what I expect from a 1 commit = 1 PR policy (and how I 
>> feel about each point):
>>  - fewer commits with bad messages (yay!)
>>  - simpler git graph if we squash + rebase (meh)
>>  - larger commits of related-but-independent changes (could be OK)
>>  - commits with bullet points in their description that bundle unrelated 
>> changes (sad face)
>>  - slowdown of development (neutral - slow can be good)
>>  - fewer "quality of life" improvements, since those would add lines of diff 
>> to a PR and are off topic; when they have to be done in a separate PR they 
>> don't get done and they don't get reviewed with the same priority (extra sad 
>> face)
>> 
>> I know I am in the minority. I tend to have a lot of PRs where there 
>> are 2-5 fairly independent commits. It is "to aid code review" but not in 
>> the way you might think: The best size for code review is pretty big, 
>> compared to the best size for commit. A commit is the unit of roll-forward, 
>> roll-back, cherry-pick, etc. Brian's point about commits not being 
>> independently tested is important: this is a tooling issue, but not that 
>> easy to change. Here is why I am not that worried about it: I believe 
>> strongly in a "rollback first" policy to restore greenness, but also that 
>> the rollback change itself must be verified to restore greenness. When a 
>> multi-commit PR fails, you can easily open a revert of the whole PR as well 
>> as reverts of individual suspect commits. The CI for these will finish 
>> around the same time, and if you manage a smaller revert, great! Imagine if 
>> to revert a PR you had to revert _every_ change between HEAD and that PR. It 
>> would restore to a known green state. Yet we don't do this, because we have 
>> technology that makes it unnecessary. Ultimately, single large commits with 
>> bullet points are just an 

Re: Issues and PR names and descriptions (or should we change the contribution guide)

2021-04-22 Thread Kenneth Knowles
Case study: https://github.com/apache/beam/pull/14618/commits

To get back to the question of Issue and PR titles:

 - When doing Jira triage, fix issue titles and issue type to be
meaningful. These autogenerate release notes, and also we use these to find
duplicates, etc.
 - Any committer can usually edit PR titles and descriptions. I do this
frequently. I am unhappy with how often the template "PLEASE add a
meaningful description" is left exactly as-is. The template is literally
begging you to actually describe the change. It is fine to duplicate the
body of the commit message. Good commit messages have the format
\n\n.

Kenn

On Thu, Apr 22, 2021 at 12:49 PM Kenneth Knowles  wrote:

> I think I am wrong about this. It seems like for squashed/rebased commits
> it is still GitHub that is committer? But it does seem to have the metadata
> about who did the squash & merge. This pattern of storing important
> metadata outside of git is not a good direction.
>
> Kenn
>
> On Thu, Apr 22, 2021 at 12:45 PM Kenneth Knowles  wrote:
>
>> That is unfortunate that GitHub is the committer of merge commits :-/
>> though I suppose you have the author field you can use. It is unfortunate
>> the this is a different field based on method.
>>
>> Kenn
>>
>> On Thu, Apr 22, 2021 at 12:39 PM Ismaël Mejía  wrote:
>>
>>> I was not referring to author identity but to committer identity that
>>> matters to know who accepted to merge something but it seems we are
>>> not really using this much because github is the 'committer' of merge
>>> commits too :S maybe something we can improve as part of this
>>> discussion.
>>>
>>> git show --pretty=full COMMITID
>>>
>>> On Thu, Apr 22, 2021 at 9:10 PM Valentyn Tymofieiev 
>>> wrote:
>>> >
>>> > Author identity is preserved. Here's an output of 'git log'
>>> >
>>> > commit 93ecc1d3a4b997b2490c4439972ffaf09125299f
>>> > Merge: 2e9ee8c005 4e3decbb4e
>>> <-- a merge commit that merges 2 commit,
>>> 4e3decbb4e and it's parent. Author history is preserved on 4e3decbb4e
>>> > Author: Ismaël Mejía 
>>><--  this is the author of merge commit
>>> > Date:   Thu Apr 22 12:46:38 2021 +0200
>>> >
>>> > Merge pull request #14616: [BEAM-12207] Remove log messages about
>>> files to stage.<-- Note that message was edited, and does not
>>> include a branch, which is nice!
>>> > commit 2e9ee8c0052d96045588e617c9e5de017f30454a
>>> >
>>> >
>>> > commit 28020effca12a18a65799ac7d2d3d520d73072d7
>>> > Author: yoshiki.obata <1285728+lazyl...@users.noreply.github.com>
>>> > Date:   Thu Apr 22 11:57:45 2021 +0900
>>> >
>>> > [BEAM-7372] cleanup codes for py2 from apache_beam/transforms
>>> (#14544) <--- 1-commit PR  was squashed-and-merged by me. Author's
>>> identity is preserved
>>> >
>>> > On Thu, Apr 22, 2021 at 11:47 AM Ismaël Mejía 
>>> wrote:
>>> >>
>>> >> In the past github squash and merge did not preserve the committer
>>> >> identity correctly, is it still the case? If  so we should not be
>>> >> using it.
>>> >> https://github.com/isaacs/github/issues/1368
>>> >>
>>> >> On Thu, Apr 22, 2021 at 8:41 PM Robert Bradshaw 
>>> wrote:
>>> >> >
>>> >> > On Thu, Apr 22, 2021 at 11:29 AM Valentyn Tymofieiev <
>>> valen...@google.com> wrote:
>>> >> >>
>>> >> >> I always squash-and-merge even when there is only 1 commit. This
>>> avoids the necessity to edit the commit message to remove not so helpful
>>> "Merge pull request xxx" message. Is there any harm to recommend squash by
>>> default in the upcoming squash bot even for single commit PRs?
>>> >> >
>>> >> >
>>> >> > Does squash-and-merge in that case preserve the commit as-is if
>>> there's only one? In that case, there'd be no issues of history. (I opted
>>> to not comment on 1-commit PRs to be less chatty.)
>>> >> >
>>> >> >>
>>> >> >>
>>> >> >> On Thu, Apr 22, 2021 at 11:19 AM Robert Bradshaw <
>>> rober...@google.com> wrote:
>>> >> >>>
>>> >> >>> On Thu, Apr 22, 2021 at 9:33 AM Kenneth Knowles 
>>> wrote:
>>> >> 
>>> >> 
>>> >>  On Thu, Apr 22, 2021 at 7:04 AM Alexey Romanenko <
>>> aromanenko@gmail.com> wrote:
>>> >> >
>>> >> > Thanks Ismael for bringing this on the table again. Kind of my
>>> “favourite” topic, unfortunately, that I raised a couple of times… Let me
>>> share some of my thoughts on this.
>>> >> >
>>> >> > First of all, as Beam developers, honestly we have to agree if
>>> we care about our commits history or not. If not (or not so much) then
>>> probably there is no more things to discuss and we use Git as just Git…
>>> It’s not a bad thing, it’s just different but for large projects, like
>>> Beam, clear commits history is ultra important, imho.
>>> >> >
>>> >> > Well, for now we do care and we clearly mention this in our
>>> Contribution Guide. Probably, it sounds only as a recommendation there or
>>> not all contributors (especially first-time ones) read this or take this
>>> into account or pay attention on this. It’s fine 

Re: Issues and PR names and descriptions (or should we change the contribution guide)

2021-04-22 Thread Kenneth Knowles
I think I am wrong about this. It seems like for squashed/rebased commits
it is still GitHub that is committer? But it does seem to have the metadata
about who did the squash & merge. This pattern of storing important
metadata outside of git is not a good direction.

Kenn

On Thu, Apr 22, 2021 at 12:45 PM Kenneth Knowles  wrote:

> That is unfortunate that GitHub is the committer of merge commits :-/
> though I suppose you have the author field you can use. It is unfortunate
> the this is a different field based on method.
>
> Kenn
>
> On Thu, Apr 22, 2021 at 12:39 PM Ismaël Mejía  wrote:
>
>> I was not referring to author identity but to committer identity that
>> matters to know who accepted to merge something but it seems we are
>> not really using this much because github is the 'committer' of merge
>> commits too :S maybe something we can improve as part of this
>> discussion.
>>
>> git show --pretty=full COMMITID
>>
>> On Thu, Apr 22, 2021 at 9:10 PM Valentyn Tymofieiev 
>> wrote:
>> >
>> > Author identity is preserved. Here's an output of 'git log'
>> >
>> > commit 93ecc1d3a4b997b2490c4439972ffaf09125299f
>> > Merge: 2e9ee8c005 4e3decbb4e
>>   <-- a merge commit that merges 2 commit,
>> 4e3decbb4e and it's parent. Author history is preserved on 4e3decbb4e
>> > Author: Ismaël Mejía 
>><--  this is the author of merge commit
>> > Date:   Thu Apr 22 12:46:38 2021 +0200
>> >
>> > Merge pull request #14616: [BEAM-12207] Remove log messages about
>> files to stage.<-- Note that message was edited, and does not
>> include a branch, which is nice!
>> > commit 2e9ee8c0052d96045588e617c9e5de017f30454a
>> >
>> >
>> > commit 28020effca12a18a65799ac7d2d3d520d73072d7
>> > Author: yoshiki.obata <1285728+lazyl...@users.noreply.github.com>
>> > Date:   Thu Apr 22 11:57:45 2021 +0900
>> >
>> > [BEAM-7372] cleanup codes for py2 from apache_beam/transforms
>> (#14544) <--- 1-commit PR  was squashed-and-merged by me. Author's
>> identity is preserved
>> >
>> > On Thu, Apr 22, 2021 at 11:47 AM Ismaël Mejía 
>> wrote:
>> >>
>> >> In the past github squash and merge did not preserve the committer
>> >> identity correctly, is it still the case? If  so we should not be
>> >> using it.
>> >> https://github.com/isaacs/github/issues/1368
>> >>
>> >> On Thu, Apr 22, 2021 at 8:41 PM Robert Bradshaw 
>> wrote:
>> >> >
>> >> > On Thu, Apr 22, 2021 at 11:29 AM Valentyn Tymofieiev <
>> valen...@google.com> wrote:
>> >> >>
>> >> >> I always squash-and-merge even when there is only 1 commit. This
>> avoids the necessity to edit the commit message to remove not so helpful
>> "Merge pull request xxx" message. Is there any harm to recommend squash by
>> default in the upcoming squash bot even for single commit PRs?
>> >> >
>> >> >
>> >> > Does squash-and-merge in that case preserve the commit as-is if
>> there's only one? In that case, there'd be no issues of history. (I opted
>> to not comment on 1-commit PRs to be less chatty.)
>> >> >
>> >> >>
>> >> >>
>> >> >> On Thu, Apr 22, 2021 at 11:19 AM Robert Bradshaw <
>> rober...@google.com> wrote:
>> >> >>>
>> >> >>> On Thu, Apr 22, 2021 at 9:33 AM Kenneth Knowles 
>> wrote:
>> >> 
>> >> 
>> >>  On Thu, Apr 22, 2021 at 7:04 AM Alexey Romanenko <
>> aromanenko@gmail.com> wrote:
>> >> >
>> >> > Thanks Ismael for bringing this on the table again. Kind of my
>> “favourite” topic, unfortunately, that I raised a couple of times… Let me
>> share some of my thoughts on this.
>> >> >
>> >> > First of all, as Beam developers, honestly we have to agree if
>> we care about our commits history or not. If not (or not so much) then
>> probably there is no more things to discuss and we use Git as just Git…
>> It’s not a bad thing, it’s just different but for large projects, like
>> Beam, clear commits history is ultra important, imho.
>> >> >
>> >> > Well, for now we do care and we clearly mention this in our
>> Contribution Guide. Probably, it sounds only as a recommendation there or
>> not all contributors (especially first-time ones) read this or take this
>> into account or pay attention on this. It’s fine and we always can expect
>> not following our guide because of many different reasons. And this is
>> exactly where Committers have to play their role! I mean that our clear Git
>> history mostly relies on committer's shoulders and, before clicking on
>> Merge button, every committer have (even “must" I’d say) make sure that PR
>> respects all our rules (we have them because of some reasons, right?) and
>> ready to be merged. Nice and correct titles/messages is one this thing.
>> Personally, the first thing that I do once I start to do a review and
>> before merge, is checking the PR’s title, branches (if it’s from a feature
>> branch and against main Beam branch), number of commits and their messages.
>> Then I take a look on related Jira issue which ID should be prefixed to
>> PR's title and 

Re: Issues and PR names and descriptions (or should we change the contribution guide)

2021-04-22 Thread Kenneth Knowles
That is unfortunate that GitHub is the committer of merge commits :-/
though I suppose you have the author field you can use. It is unfortunate
the this is a different field based on method.

Kenn

On Thu, Apr 22, 2021 at 12:39 PM Ismaël Mejía  wrote:

> I was not referring to author identity but to committer identity that
> matters to know who accepted to merge something but it seems we are
> not really using this much because github is the 'committer' of merge
> commits too :S maybe something we can improve as part of this
> discussion.
>
> git show --pretty=full COMMITID
>
> On Thu, Apr 22, 2021 at 9:10 PM Valentyn Tymofieiev 
> wrote:
> >
> > Author identity is preserved. Here's an output of 'git log'
> >
> > commit 93ecc1d3a4b997b2490c4439972ffaf09125299f
> > Merge: 2e9ee8c005 4e3decbb4e
>   <-- a merge commit that merges 2 commit,
> 4e3decbb4e and it's parent. Author history is preserved on 4e3decbb4e
> > Author: Ismaël Mejía 
>  <--  this is the author of merge commit
> > Date:   Thu Apr 22 12:46:38 2021 +0200
> >
> > Merge pull request #14616: [BEAM-12207] Remove log messages about
> files to stage.<-- Note that message was edited, and does not
> include a branch, which is nice!
> > commit 2e9ee8c0052d96045588e617c9e5de017f30454a
> >
> >
> > commit 28020effca12a18a65799ac7d2d3d520d73072d7
> > Author: yoshiki.obata <1285728+lazyl...@users.noreply.github.com>
> > Date:   Thu Apr 22 11:57:45 2021 +0900
> >
> > [BEAM-7372] cleanup codes for py2 from apache_beam/transforms
> (#14544) <--- 1-commit PR  was squashed-and-merged by me. Author's
> identity is preserved
> >
> > On Thu, Apr 22, 2021 at 11:47 AM Ismaël Mejía  wrote:
> >>
> >> In the past github squash and merge did not preserve the committer
> >> identity correctly, is it still the case? If  so we should not be
> >> using it.
> >> https://github.com/isaacs/github/issues/1368
> >>
> >> On Thu, Apr 22, 2021 at 8:41 PM Robert Bradshaw 
> wrote:
> >> >
> >> > On Thu, Apr 22, 2021 at 11:29 AM Valentyn Tymofieiev <
> valen...@google.com> wrote:
> >> >>
> >> >> I always squash-and-merge even when there is only 1 commit. This
> avoids the necessity to edit the commit message to remove not so helpful
> "Merge pull request xxx" message. Is there any harm to recommend squash by
> default in the upcoming squash bot even for single commit PRs?
> >> >
> >> >
> >> > Does squash-and-merge in that case preserve the commit as-is if
> there's only one? In that case, there'd be no issues of history. (I opted
> to not comment on 1-commit PRs to be less chatty.)
> >> >
> >> >>
> >> >>
> >> >> On Thu, Apr 22, 2021 at 11:19 AM Robert Bradshaw <
> rober...@google.com> wrote:
> >> >>>
> >> >>> On Thu, Apr 22, 2021 at 9:33 AM Kenneth Knowles 
> wrote:
> >> 
> >> 
> >>  On Thu, Apr 22, 2021 at 7:04 AM Alexey Romanenko <
> aromanenko@gmail.com> wrote:
> >> >
> >> > Thanks Ismael for bringing this on the table again. Kind of my
> “favourite” topic, unfortunately, that I raised a couple of times… Let me
> share some of my thoughts on this.
> >> >
> >> > First of all, as Beam developers, honestly we have to agree if we
> care about our commits history or not. If not (or not so much) then
> probably there is no more things to discuss and we use Git as just Git…
> It’s not a bad thing, it’s just different but for large projects, like
> Beam, clear commits history is ultra important, imho.
> >> >
> >> > Well, for now we do care and we clearly mention this in our
> Contribution Guide. Probably, it sounds only as a recommendation there or
> not all contributors (especially first-time ones) read this or take this
> into account or pay attention on this. It’s fine and we always can expect
> not following our guide because of many different reasons. And this is
> exactly where Committers have to play their role! I mean that our clear Git
> history mostly relies on committer's shoulders and, before clicking on
> Merge button, every committer have (even “must" I’d say) make sure that PR
> respects all our rules (we have them because of some reasons, right?) and
> ready to be merged. Nice and correct titles/messages is one this thing.
> Personally, the first thing that I do once I start to do a review and
> before merge, is checking the PR’s title, branches (if it’s from a feature
> branch and against main Beam branch), number of commits and their messages.
> Then I take a look on related Jira issue which ID should be prefixed to
> PR's title and commit’s message(s).
> >> >
> >> > For sure, there are always exceptions. In case of emergency, for
> example, if the build is broken because of tiny thing then it makes sense
> to fix this as fast as possible and then, probably, to neglect some rules.
> But if exceptions become the common practice and happen quite often, then
> it’s a signal that either we have to change the rules or change our
> attitude to this.
> >> >
> >> 

Re: Issues and PR names and descriptions (or should we change the contribution guide)

2021-04-22 Thread Ismaël Mejía
I was not referring to author identity but to committer identity that
matters to know who accepted to merge something but it seems we are
not really using this much because github is the 'committer' of merge
commits too :S maybe something we can improve as part of this
discussion.

git show --pretty=full COMMITID

On Thu, Apr 22, 2021 at 9:10 PM Valentyn Tymofieiev  wrote:
>
> Author identity is preserved. Here's an output of 'git log'
>
> commit 93ecc1d3a4b997b2490c4439972ffaf09125299f
> Merge: 2e9ee8c005 4e3decbb4e  
> <-- a merge commit that merges 2 commit, 4e3decbb4e and 
> it's parent. Author history is preserved on 4e3decbb4e
> Author: Ismaël Mejía   
><--  this is the author of merge commit
> Date:   Thu Apr 22 12:46:38 2021 +0200
>
> Merge pull request #14616: [BEAM-12207] Remove log messages about files 
> to stage.<-- Note that message was edited, and does not include a 
> branch, which is nice!
> commit 2e9ee8c0052d96045588e617c9e5de017f30454a
>
>
> commit 28020effca12a18a65799ac7d2d3d520d73072d7
> Author: yoshiki.obata <1285728+lazyl...@users.noreply.github.com>
> Date:   Thu Apr 22 11:57:45 2021 +0900
>
> [BEAM-7372] cleanup codes for py2 from apache_beam/transforms (#14544)
>  <--- 1-commit PR  was squashed-and-merged by me. Author's identity is 
> preserved
>
> On Thu, Apr 22, 2021 at 11:47 AM Ismaël Mejía  wrote:
>>
>> In the past github squash and merge did not preserve the committer
>> identity correctly, is it still the case? If  so we should not be
>> using it.
>> https://github.com/isaacs/github/issues/1368
>>
>> On Thu, Apr 22, 2021 at 8:41 PM Robert Bradshaw  wrote:
>> >
>> > On Thu, Apr 22, 2021 at 11:29 AM Valentyn Tymofieiev  
>> > wrote:
>> >>
>> >> I always squash-and-merge even when there is only 1 commit. This avoids 
>> >> the necessity to edit the commit message to remove not so helpful "Merge 
>> >> pull request xxx" message. Is there any harm to recommend squash by 
>> >> default in the upcoming squash bot even for single commit PRs?
>> >
>> >
>> > Does squash-and-merge in that case preserve the commit as-is if there's 
>> > only one? In that case, there'd be no issues of history. (I opted to not 
>> > comment on 1-commit PRs to be less chatty.)
>> >
>> >>
>> >>
>> >> On Thu, Apr 22, 2021 at 11:19 AM Robert Bradshaw  
>> >> wrote:
>> >>>
>> >>> On Thu, Apr 22, 2021 at 9:33 AM Kenneth Knowles  wrote:
>> 
>> 
>>  On Thu, Apr 22, 2021 at 7:04 AM Alexey Romanenko 
>>   wrote:
>> >
>> > Thanks Ismael for bringing this on the table again. Kind of my 
>> > “favourite” topic, unfortunately, that I raised a couple of times… Let 
>> > me share some of my thoughts on this.
>> >
>> > First of all, as Beam developers, honestly we have to agree if we care 
>> > about our commits history or not. If not (or not so much) then 
>> > probably there is no more things to discuss and we use Git as just 
>> > Git… It’s not a bad thing, it’s just different but for large projects, 
>> > like Beam, clear commits history is ultra important, imho.
>> >
>> > Well, for now we do care and we clearly mention this in our 
>> > Contribution Guide. Probably, it sounds only as a recommendation there 
>> > or not all contributors (especially first-time ones) read this or take 
>> > this into account or pay attention on this. It’s fine and we always 
>> > can expect not following our guide because of many different reasons. 
>> > And this is exactly where Committers have to play their role! I mean 
>> > that our clear Git history mostly relies on committer's shoulders and, 
>> > before clicking on Merge button, every committer have (even “must" I’d 
>> > say) make sure that PR respects all our rules (we have them because of 
>> > some reasons, right?) and ready to be merged. Nice and correct 
>> > titles/messages is one this thing. Personally, the first thing that I 
>> > do once I start to do a review and before merge, is checking the PR’s 
>> > title, branches (if it’s from a feature branch and against main Beam 
>> > branch), number of commits and their messages. Then I take a look on 
>> > related Jira issue which ID should be prefixed to PR's title and 
>> > commit’s message(s).
>> >
>> > For sure, there are always exceptions. In case of emergency, for 
>> > example, if the build is broken because of tiny thing then it makes 
>> > sense to fix this as fast as possible and then, probably, to neglect 
>> > some rules. But if exceptions become the common practice and happen 
>> > quite often, then it’s a signal that either we have to change the 
>> > rules or change our attitude to this.
>> >
>> > As I see, the initial Ismael’s message of his email was more about 
>> > titles and multiple commits per PR is 

Re: Issues and PR names and descriptions (or should we change the contribution guide)

2021-04-22 Thread Valentyn Tymofieiev
Author identity is preserved. Here's an output of 'git log'

commit 93ecc1d3a4b997b2490c4439972ffaf09125299f
Merge: 2e9ee8c005 4e3decbb4e
  <-- a merge commit that merges 2 commit, 4e3decbb4e
and it's parent. Author history is preserved on 4e3decbb4e
Author: Ismaël Mejía 
 <--  this is the author of merge commit
Date:   Thu Apr 22 12:46:38 2021 +0200

Merge pull request #14616: [BEAM-12207] Remove log messages about files
to stage.<-- Note that message was edited, and does not include a
branch, which is nice!
commit 2e9ee8c0052d96045588e617c9e5de017f30454a


commit 28020effca12a18a65799ac7d2d3d520d73072d7
Author: yoshiki.obata <1285728+lazyl...@users.noreply.github.com>
Date:   Thu Apr 22 11:57:45 2021 +0900

[BEAM-7372] cleanup codes for py2 from apache_beam/transforms (#14544)
   <--- 1-commit PR  was squashed-and-merged by me. Author's identity is
preserved

On Thu, Apr 22, 2021 at 11:47 AM Ismaël Mejía  wrote:

> In the past github squash and merge did not preserve the committer
> identity correctly, is it still the case? If  so we should not be
> using it.
> https://github.com/isaacs/github/issues/1368
>
> On Thu, Apr 22, 2021 at 8:41 PM Robert Bradshaw 
> wrote:
> >
> > On Thu, Apr 22, 2021 at 11:29 AM Valentyn Tymofieiev <
> valen...@google.com> wrote:
> >>
> >> I always squash-and-merge even when there is only 1 commit. This avoids
> the necessity to edit the commit message to remove not so helpful "Merge
> pull request xxx" message. Is there any harm to recommend squash by default
> in the upcoming squash bot even for single commit PRs?
> >
> >
> > Does squash-and-merge in that case preserve the commit as-is if there's
> only one? In that case, there'd be no issues of history. (I opted to not
> comment on 1-commit PRs to be less chatty.)
> >
> >>
> >>
> >> On Thu, Apr 22, 2021 at 11:19 AM Robert Bradshaw 
> wrote:
> >>>
> >>> On Thu, Apr 22, 2021 at 9:33 AM Kenneth Knowles 
> wrote:
> 
> 
>  On Thu, Apr 22, 2021 at 7:04 AM Alexey Romanenko <
> aromanenko@gmail.com> wrote:
> >
> > Thanks Ismael for bringing this on the table again. Kind of my
> “favourite” topic, unfortunately, that I raised a couple of times… Let me
> share some of my thoughts on this.
> >
> > First of all, as Beam developers, honestly we have to agree if we
> care about our commits history or not. If not (or not so much) then
> probably there is no more things to discuss and we use Git as just Git…
> It’s not a bad thing, it’s just different but for large projects, like
> Beam, clear commits history is ultra important, imho.
> >
> > Well, for now we do care and we clearly mention this in our
> Contribution Guide. Probably, it sounds only as a recommendation there or
> not all contributors (especially first-time ones) read this or take this
> into account or pay attention on this. It’s fine and we always can expect
> not following our guide because of many different reasons. And this is
> exactly where Committers have to play their role! I mean that our clear Git
> history mostly relies on committer's shoulders and, before clicking on
> Merge button, every committer have (even “must" I’d say) make sure that PR
> respects all our rules (we have them because of some reasons, right?) and
> ready to be merged. Nice and correct titles/messages is one this thing.
> Personally, the first thing that I do once I start to do a review and
> before merge, is checking the PR’s title, branches (if it’s from a feature
> branch and against main Beam branch), number of commits and their messages.
> Then I take a look on related Jira issue which ID should be prefixed to
> PR's title and commit’s message(s).
> >
> > For sure, there are always exceptions. In case of emergency, for
> example, if the build is broken because of tiny thing then it makes sense
> to fix this as fast as possible and then, probably, to neglect some rules.
> But if exceptions become the common practice and happen quite often, then
> it’s a signal that either we have to change the rules or change our
> attitude to this.
> >
> > As I see, the initial Ismael’s message of his email was more about
> titles and multiple commits per PR is another but, of course, related
> topic. For both, I believe we can partly automate it - add checks to
> prevent merging the commits with not correct messages or/and limit number
> of commits per PR, for example. Some other big projects, like Apache Spark,
> have even special tool to merge PR in well-formed way [1]. I’m not sure
> that we need to have something similar because I’m pretty sure it will
> affect the performance of adding new fixes/features (at least in the
> beginning), but since we already started the similar discussions several
> times on regular bases, we might want to think in this way as an option too.
> 
> 
>  Noting that we had one too [1]. The trouble was that the bot had a
> lot of downtime, code was not 

Re: Issues and PR names and descriptions (or should we change the contribution guide)

2021-04-22 Thread Ismaël Mejía
In the past github squash and merge did not preserve the committer
identity correctly, is it still the case? If  so we should not be
using it.
https://github.com/isaacs/github/issues/1368

On Thu, Apr 22, 2021 at 8:41 PM Robert Bradshaw  wrote:
>
> On Thu, Apr 22, 2021 at 11:29 AM Valentyn Tymofieiev  
> wrote:
>>
>> I always squash-and-merge even when there is only 1 commit. This avoids the 
>> necessity to edit the commit message to remove not so helpful "Merge pull 
>> request xxx" message. Is there any harm to recommend squash by default in 
>> the upcoming squash bot even for single commit PRs?
>
>
> Does squash-and-merge in that case preserve the commit as-is if there's only 
> one? In that case, there'd be no issues of history. (I opted to not comment 
> on 1-commit PRs to be less chatty.)
>
>>
>>
>> On Thu, Apr 22, 2021 at 11:19 AM Robert Bradshaw  wrote:
>>>
>>> On Thu, Apr 22, 2021 at 9:33 AM Kenneth Knowles  wrote:


 On Thu, Apr 22, 2021 at 7:04 AM Alexey Romanenko 
  wrote:
>
> Thanks Ismael for bringing this on the table again. Kind of my 
> “favourite” topic, unfortunately, that I raised a couple of times… Let me 
> share some of my thoughts on this.
>
> First of all, as Beam developers, honestly we have to agree if we care 
> about our commits history or not. If not (or not so much) then probably 
> there is no more things to discuss and we use Git as just Git… It’s not a 
> bad thing, it’s just different but for large projects, like Beam, clear 
> commits history is ultra important, imho.
>
> Well, for now we do care and we clearly mention this in our Contribution 
> Guide. Probably, it sounds only as a recommendation there or not all 
> contributors (especially first-time ones) read this or take this into 
> account or pay attention on this. It’s fine and we always can expect not 
> following our guide because of many different reasons. And this is 
> exactly where Committers have to play their role! I mean that our clear 
> Git history mostly relies on committer's shoulders and, before clicking 
> on Merge button, every committer have (even “must" I’d say) make sure 
> that PR respects all our rules (we have them because of some reasons, 
> right?) and ready to be merged. Nice and correct titles/messages is one 
> this thing. Personally, the first thing that I do once I start to do a 
> review and before merge, is checking the PR’s title, branches (if it’s 
> from a feature branch and against main Beam branch), number of commits 
> and their messages. Then I take a look on related Jira issue which ID 
> should be prefixed to PR's title and commit’s message(s).
>
> For sure, there are always exceptions. In case of emergency, for example, 
> if the build is broken because of tiny thing then it makes sense to fix 
> this as fast as possible and then, probably, to neglect some rules. But 
> if exceptions become the common practice and happen quite often, then 
> it’s a signal that either we have to change the rules or change our 
> attitude to this.
>
> As I see, the initial Ismael’s message of his email was more about titles 
> and multiple commits per PR is another but, of course, related topic. For 
> both, I believe we can partly automate it - add checks to prevent merging 
> the commits with not correct messages or/and limit number of commits per 
> PR, for example. Some other big projects, like Apache Spark, have even 
> special tool to merge PR in well-formed way [1]. I’m not sure that we 
> need to have something similar because I’m pretty sure it will affect the 
> performance of adding new fixes/features (at least in the beginning), but 
> since we already started the similar discussions several times on regular 
> bases, we might want to think in this way as an option too.


 Noting that we had one too [1]. The trouble was that the bot had a lot of 
 downtime, code was not part of Beam's repo, and also did not encode best 
 practices (for example it broke the connection between PRs and master 
 branch history because it just cherry-picked and squashed commits and 
 pushed those new unrelated commits straight to master). A script would 
 address much of this.
>>>
>>>
>>> Yeah, the mergebot was much more hassle than it was worth, and lots harder 
>>> to use than pushing a button. I wouldn't be opposed to trying again with a 
>>> better (simpler, under our control) one (and in my investigations of github 
>>> actions, it doesn't look that hard).
>>>
 Heuristic CI that says "this commit history looks OK" might solve a lot of 
 the problem (I see that Robert already started on this).

 And finally I was to repeat my agreement with Ismaël and Alexey that the 
 root problem is this: we need to actually care about the commit history 
 and 

Re: Issues and PR names and descriptions (or should we change the contribution guide)

2021-04-22 Thread Robert Bradshaw
On Thu, Apr 22, 2021 at 11:29 AM Valentyn Tymofieiev 
wrote:

> I always squash-and-merge even when there is only 1 commit. This avoids
> the necessity to edit the commit message to remove not so helpful "Merge
> pull request xxx" message. Is there any harm to recommend squash by default
> in the upcoming squash bot even for single commit PRs?
>

Does squash-and-merge in that case preserve the commit as-is if there's
only one? In that case, there'd be no issues of history. (I opted to not
comment on 1-commit PRs to be less chatty.)


>
> On Thu, Apr 22, 2021 at 11:19 AM Robert Bradshaw 
> wrote:
>
>> On Thu, Apr 22, 2021 at 9:33 AM Kenneth Knowles  wrote:
>>
>>>
>>> On Thu, Apr 22, 2021 at 7:04 AM Alexey Romanenko <
>>> aromanenko@gmail.com> wrote:
>>>
 Thanks Ismael for bringing this on the table again. Kind of my
 “favourite” topic, unfortunately, that I raised a couple of times… Let me
 share some of my thoughts on this.

 First of all, as Beam developers, honestly we have to agree if we care
 about our commits history or not. If not (or not so much) then probably
 there is no more things to discuss and we use Git as just Git… It’s not a
 bad thing, it’s just different but for large projects, like Beam, clear
 commits history is ultra important, imho.

 Well, for now we do care and we clearly mention this in our
 Contribution Guide. Probably, it sounds only as a recommendation there or
 not all contributors (especially first-time ones) read this or take this
 into account or pay attention on this. It’s fine and we always can expect
 not following our guide because of many different reasons. And this is
 exactly where Committers have to play their role! I mean that our clear Git
 history mostly relies on committer's shoulders and, before clicking on
 *Merge* button, every committer have (even “must" I’d say) make sure
 that PR respects all our rules (we have them because of some reasons,
 right?) and ready to be merged. Nice and correct titles/messages is one
 this thing. Personally, the first thing that I do once I start to do a
 review and before merge, is checking the PR’s title, branches (if it’s from
 a feature branch and against main Beam branch), number of commits and their
 messages. Then I take a look on related Jira issue which ID should be
 prefixed to PR's title and commit’s message(s).

 For sure, there are always exceptions. In case of emergency, for
 example, if the build is broken because of tiny thing then it makes sense
 to fix this as fast as possible and then, probably, to neglect some rules.
 But if exceptions become the common practice and happen quite often, then
 it’s a signal that either we have to change the rules or change our
 attitude to this.

 As I see, the initial Ismael’s message of his email was more about
 titles and multiple commits per PR is another but, of course, related
 topic. For both, I believe we can partly automate it - add checks to
 prevent merging the commits with not correct messages or/and limit number
 of commits per PR, for example. Some other big projects, like Apache Spark,
 have even special tool to merge PR in well-formed way [1]. I’m not sure
 that we need to have something similar because I’m pretty sure it will
 affect the performance of adding new fixes/features (at least in the
 beginning), but since we already started the similar discussions several
 times on regular bases, we might want to think in this way as an option 
 too.

>>>
>>> Noting that we had one too [1]. The trouble was that the bot had a lot
>>> of downtime, code was not part of Beam's repo, and also did not encode best
>>> practices (for example it broke the connection between PRs and master
>>> branch history because it just cherry-picked and squashed commits and
>>> pushed those new unrelated commits straight to master). A script would
>>> address much of this.
>>>
>>
>> Yeah, the mergebot was much more hassle than it was worth, and lots
>> harder to use than pushing a button. I wouldn't be opposed to trying again
>> with a better (simpler, under our control) one (and in my investigations of
>> github actions, it doesn't look that hard).
>>
>> Heuristic CI that says "this commit history looks OK" might solve a lot
>>> of the problem (I see that Robert already started on this).
>>>
>>> And finally I was to repeat my agreement with Ismaël and Alexey that the
>>> root problem is this: we need to actually care about the commit history and
>>> communication of PR/commit titles and descriptions. We use tools to help us
>>> to implement our intentions and to communicate them to newcomers, but I
>>> don't think this will replace taking care of the repo.
>>>
>>
>> Committers should care about taking care of the repo more than the
>> average contributor, but even there there is high variance. I think the

Re: Issues and PR names and descriptions (or should we change the contribution guide)

2021-04-22 Thread Valentyn Tymofieiev
I always squash-and-merge even when there is only 1 commit. This avoids the
necessity to edit the commit message to remove not so helpful "Merge pull
request xxx" message. Is there any harm to recommend squash by default in
the upcoming squash bot even for single commit PRs?

On Thu, Apr 22, 2021 at 11:19 AM Robert Bradshaw 
wrote:

> On Thu, Apr 22, 2021 at 9:33 AM Kenneth Knowles  wrote:
>
>>
>> On Thu, Apr 22, 2021 at 7:04 AM Alexey Romanenko <
>> aromanenko@gmail.com> wrote:
>>
>>> Thanks Ismael for bringing this on the table again. Kind of my
>>> “favourite” topic, unfortunately, that I raised a couple of times… Let me
>>> share some of my thoughts on this.
>>>
>>> First of all, as Beam developers, honestly we have to agree if we care
>>> about our commits history or not. If not (or not so much) then probably
>>> there is no more things to discuss and we use Git as just Git… It’s not a
>>> bad thing, it’s just different but for large projects, like Beam, clear
>>> commits history is ultra important, imho.
>>>
>>> Well, for now we do care and we clearly mention this in our Contribution
>>> Guide. Probably, it sounds only as a recommendation there or not all
>>> contributors (especially first-time ones) read this or take this into
>>> account or pay attention on this. It’s fine and we always can expect not
>>> following our guide because of many different reasons. And this is exactly
>>> where Committers have to play their role! I mean that our clear Git history
>>> mostly relies on committer's shoulders and, before clicking on *Merge*
>>> button, every committer have (even “must" I’d say) make sure that PR
>>> respects all our rules (we have them because of some reasons, right?) and
>>> ready to be merged. Nice and correct titles/messages is one this thing.
>>> Personally, the first thing that I do once I start to do a review and
>>> before merge, is checking the PR’s title, branches (if it’s from a feature
>>> branch and against main Beam branch), number of commits and their messages.
>>> Then I take a look on related Jira issue which ID should be prefixed to
>>> PR's title and commit’s message(s).
>>>
>>> For sure, there are always exceptions. In case of emergency, for
>>> example, if the build is broken because of tiny thing then it makes sense
>>> to fix this as fast as possible and then, probably, to neglect some rules.
>>> But if exceptions become the common practice and happen quite often, then
>>> it’s a signal that either we have to change the rules or change our
>>> attitude to this.
>>>
>>> As I see, the initial Ismael’s message of his email was more about
>>> titles and multiple commits per PR is another but, of course, related
>>> topic. For both, I believe we can partly automate it - add checks to
>>> prevent merging the commits with not correct messages or/and limit number
>>> of commits per PR, for example. Some other big projects, like Apache Spark,
>>> have even special tool to merge PR in well-formed way [1]. I’m not sure
>>> that we need to have something similar because I’m pretty sure it will
>>> affect the performance of adding new fixes/features (at least in the
>>> beginning), but since we already started the similar discussions several
>>> times on regular bases, we might want to think in this way as an option too.
>>>
>>
>> Noting that we had one too [1]. The trouble was that the bot had a lot of
>> downtime, code was not part of Beam's repo, and also did not encode best
>> practices (for example it broke the connection between PRs and master
>> branch history because it just cherry-picked and squashed commits and
>> pushed those new unrelated commits straight to master). A script would
>> address much of this.
>>
>
> Yeah, the mergebot was much more hassle than it was worth, and lots harder
> to use than pushing a button. I wouldn't be opposed to trying again with a
> better (simpler, under our control) one (and in my investigations of github
> actions, it doesn't look that hard).
>
> Heuristic CI that says "this commit history looks OK" might solve a lot
>> of the problem (I see that Robert already started on this).
>>
>> And finally I was to repeat my agreement with Ismaël and Alexey that the
>> root problem is this: we need to actually care about the commit history and
>> communication of PR/commit titles and descriptions. We use tools to help us
>> to implement our intentions and to communicate them to newcomers, but I
>> don't think this will replace taking care of the repo.
>>
>
> Committers should care about taking care of the repo more than the average
> contributor, but even there there is high variance. I think the issue is
> "oh, I didn't think to squash vs. merge" rather than "who cares, I always
> press merge anyway" in which case a timely reminder will go a long way.
>
> Kenn
>>
>> [1]
>> https://lists.apache.org/thread.html/4a65fb0b66935c9dc61568a3067538775edc3e685c6ac03dd3fa4725%40%3Cdev.beam.apache.org%3E
>>
>>
>>>
>>> As for me, I’d prefer that every 

Re: Issues and PR names and descriptions (or should we change the contribution guide)

2021-04-22 Thread Robert Bradshaw
On Thu, Apr 22, 2021 at 9:33 AM Kenneth Knowles  wrote:

>
> On Thu, Apr 22, 2021 at 7:04 AM Alexey Romanenko 
> wrote:
>
>> Thanks Ismael for bringing this on the table again. Kind of my
>> “favourite” topic, unfortunately, that I raised a couple of times… Let me
>> share some of my thoughts on this.
>>
>> First of all, as Beam developers, honestly we have to agree if we care
>> about our commits history or not. If not (or not so much) then probably
>> there is no more things to discuss and we use Git as just Git… It’s not a
>> bad thing, it’s just different but for large projects, like Beam, clear
>> commits history is ultra important, imho.
>>
>> Well, for now we do care and we clearly mention this in our Contribution
>> Guide. Probably, it sounds only as a recommendation there or not all
>> contributors (especially first-time ones) read this or take this into
>> account or pay attention on this. It’s fine and we always can expect not
>> following our guide because of many different reasons. And this is exactly
>> where Committers have to play their role! I mean that our clear Git history
>> mostly relies on committer's shoulders and, before clicking on *Merge*
>> button, every committer have (even “must" I’d say) make sure that PR
>> respects all our rules (we have them because of some reasons, right?) and
>> ready to be merged. Nice and correct titles/messages is one this thing.
>> Personally, the first thing that I do once I start to do a review and
>> before merge, is checking the PR’s title, branches (if it’s from a feature
>> branch and against main Beam branch), number of commits and their messages.
>> Then I take a look on related Jira issue which ID should be prefixed to
>> PR's title and commit’s message(s).
>>
>> For sure, there are always exceptions. In case of emergency, for example,
>> if the build is broken because of tiny thing then it makes sense to fix
>> this as fast as possible and then, probably, to neglect some rules. But if
>> exceptions become the common practice and happen quite often, then it’s a
>> signal that either we have to change the rules or change our attitude to
>> this.
>>
>> As I see, the initial Ismael’s message of his email was more about titles
>> and multiple commits per PR is another but, of course, related topic. For
>> both, I believe we can partly automate it - add checks to prevent merging
>> the commits with not correct messages or/and limit number of commits per
>> PR, for example. Some other big projects, like Apache Spark, have even
>> special tool to merge PR in well-formed way [1]. I’m not sure that we need
>> to have something similar because I’m pretty sure it will affect the
>> performance of adding new fixes/features (at least in the beginning), but
>> since we already started the similar discussions several times on regular
>> bases, we might want to think in this way as an option too.
>>
>
> Noting that we had one too [1]. The trouble was that the bot had a lot of
> downtime, code was not part of Beam's repo, and also did not encode best
> practices (for example it broke the connection between PRs and master
> branch history because it just cherry-picked and squashed commits and
> pushed those new unrelated commits straight to master). A script would
> address much of this.
>

Yeah, the mergebot was much more hassle than it was worth, and lots harder
to use than pushing a button. I wouldn't be opposed to trying again with a
better (simpler, under our control) one (and in my investigations of github
actions, it doesn't look that hard).

Heuristic CI that says "this commit history looks OK" might solve a lot
> of the problem (I see that Robert already started on this).
>
> And finally I was to repeat my agreement with Ismaël and Alexey that the
> root problem is this: we need to actually care about the commit history and
> communication of PR/commit titles and descriptions. We use tools to help us
> to implement our intentions and to communicate them to newcomers, but I
> don't think this will replace taking care of the repo.
>

Committers should care about taking care of the repo more than the average
contributor, but even there there is high variance. I think the issue is
"oh, I didn't think to squash vs. merge" rather than "who cares, I always
press merge anyway" in which case a timely reminder will go a long way.

Kenn
>
> [1]
> https://lists.apache.org/thread.html/4a65fb0b66935c9dc61568a3067538775edc3e685c6ac03dd3fa4725%40%3Cdev.beam.apache.org%3E
>
>
>>
>> As for me, I’d prefer that every committer paid more attention (if not
>> yet) on these “non code” things before reviewing/merging a PR.
>>
>> [1] https://github.com/apache/spark/blob/master/dev/merge_spark_pr.py
>>
>>
>> On 22 Apr 2021, at 01:28, Robert Bradshaw  wrote:
>>
>> I am also in the camp that it often makes sense to have more than 1
>> commit per PR, but rather than enforce a 1 commit per PR policy, I would
>> say that it's too much bother to look at the commit history whether 

Re: Issues and PR names and descriptions (or should we change the contribution guide)

2021-04-22 Thread Kenneth Knowles
On Thu, Apr 22, 2021 at 7:04 AM Alexey Romanenko 
wrote:

> Thanks Ismael for bringing this on the table again. Kind of my “favourite”
> topic, unfortunately, that I raised a couple of times… Let me share some of
> my thoughts on this.
>
> First of all, as Beam developers, honestly we have to agree if we care
> about our commits history or not. If not (or not so much) then probably
> there is no more things to discuss and we use Git as just Git… It’s not a
> bad thing, it’s just different but for large projects, like Beam, clear
> commits history is ultra important, imho.
>
> Well, for now we do care and we clearly mention this in our Contribution
> Guide. Probably, it sounds only as a recommendation there or not all
> contributors (especially first-time ones) read this or take this into
> account or pay attention on this. It’s fine and we always can expect not
> following our guide because of many different reasons. And this is exactly
> where Committers have to play their role! I mean that our clear Git history
> mostly relies on committer's shoulders and, before clicking on *Merge*
> button, every committer have (even “must" I’d say) make sure that PR
> respects all our rules (we have them because of some reasons, right?) and
> ready to be merged. Nice and correct titles/messages is one this thing.
> Personally, the first thing that I do once I start to do a review and
> before merge, is checking the PR’s title, branches (if it’s from a feature
> branch and against main Beam branch), number of commits and their messages.
> Then I take a look on related Jira issue which ID should be prefixed to
> PR's title and commit’s message(s).
>
> For sure, there are always exceptions. In case of emergency, for example,
> if the build is broken because of tiny thing then it makes sense to fix
> this as fast as possible and then, probably, to neglect some rules. But if
> exceptions become the common practice and happen quite often, then it’s a
> signal that either we have to change the rules or change our attitude to
> this.
>
> As I see, the initial Ismael’s message of his email was more about titles
> and multiple commits per PR is another but, of course, related topic. For
> both, I believe we can partly automate it - add checks to prevent merging
> the commits with not correct messages or/and limit number of commits per
> PR, for example. Some other big projects, like Apache Spark, have even
> special tool to merge PR in well-formed way [1]. I’m not sure that we need
> to have something similar because I’m pretty sure it will affect the
> performance of adding new fixes/features (at least in the beginning), but
> since we already started the similar discussions several times on regular
> bases, we might want to think in this way as an option too.
>

Noting that we had one too [1]. The trouble was that the bot had a lot of
downtime, code was not part of Beam's repo, and also did not encode best
practices (for example it broke the connection between PRs and master
branch history because it just cherry-picked and squashed commits and
pushed those new unrelated commits straight to master). A script would
address much of this.

Heuristic CI that says "this commit history looks OK" might solve a lot
of the problem (I see that Robert already started on this).

And finally I was to repeat my agreement with Ismaël and Alexey that the
root problem is this: we need to actually care about the commit history and
communication of PR/commit titles and descriptions. We use tools to help us
to implement our intentions and to communicate them to newcomers, but I
don't think this will replace taking care of the repo.

Kenn

[1]
https://lists.apache.org/thread.html/4a65fb0b66935c9dc61568a3067538775edc3e685c6ac03dd3fa4725%40%3Cdev.beam.apache.org%3E


>
> As for me, I’d prefer that every committer paid more attention (if not
> yet) on these “non code” things before reviewing/merging a PR.
>
> [1] https://github.com/apache/spark/blob/master/dev/merge_spark_pr.py
>
>
> On 22 Apr 2021, at 01:28, Robert Bradshaw  wrote:
>
> I am also in the camp that it often makes sense to have more than 1 commit
> per PR, but rather than enforce a 1 commit per PR policy, I would say that
> it's too much bother to look at the commit history whether it should be
> squashed or merged (though I think it is almost always very obvious which
> is preferable for a given PR), go ahead and squash rather than merge by
> default.
>
>
> On Wed, Apr 21, 2021 at 2:23 PM Kenneth Knowles  wrote:
>
>> This seems to come up a lot. Maybe we should change something?
>>
>> Having worked on a number of projects and at companies with this policy,
>> companies using non-distributed source control, and companies that just
>> "use git like git", I know all these ways of life pretty well.
>>
>> TL;DR my experience is:
>>  - when people care about the commit history and take care of it, then
>> just "use git like git" results in faster development and clearer history,
>> despite 

Re: Issues and PR names and descriptions (or should we change the contribution guide)

2021-04-22 Thread Robert Bradshaw
It's hard to change existing behavior without a change in circumstances or
tooling. I created https://github.com/apache/beam/pull/14619 , which will
add its opinion about squash or merge as a comment once a PR has ben
LGTM'd. The most important thing is not the advice it gives, but the fact
that it'll remind people to stop and think about this (which is I think the
main issue, not that people really want the fixup commits to be part of our
history). I'm sure improvements could be made, and we could use this to
validate other properties about commits and their comments, but we can try
it out and see how it goes.

On Thu, Apr 22, 2021 at 7:11 AM Alexey Romanenko 
wrote:

> Thanks Ismael for bringing this on the table again. Kind of my “favourite”
> topic, unfortunately, that I raised a couple of times… Let me share some of
> my thoughts on this.
>
> First of all, as Beam developers, honestly we have to agree if we care
> about our commits history or not. If not (or not so much) then probably
> there is no more things to discuss and we use Git as just Git… It’s not a
> bad thing, it’s just different but for large projects, like Beam, clear
> commits history is ultra important, imho.
>
> Well, for now we do care and we clearly mention this in our Contribution
> Guide. Probably, it sounds only as a recommendation there or not all
> contributors (especially first-time ones) read this or take this into
> account or pay attention on this. It’s fine and we always can expect not
> following our guide because of many different reasons. And this is exactly
> where Committers have to play their role! I mean that our clear Git history
> mostly relies on committer's shoulders and, before clicking on *Merge*
> button, every committer have (even “must" I’d say) make sure that PR
> respects all our rules (we have them because of some reasons, right?) and
> ready to be merged. Nice and correct titles/messages is one this thing.
> Personally, the first thing that I do once I start to do a review and
> before merge, is checking the PR’s title, branches (if it’s from a feature
> branch and against main Beam branch), number of commits and their messages.
> Then I take a look on related Jira issue which ID should be prefixed to
> PR's title and commit’s message(s).
>
> For sure, there are always exceptions. In case of emergency, for example,
> if the build is broken because of tiny thing then it makes sense to fix
> this as fast as possible and then, probably, to neglect some rules. But if
> exceptions become the common practice and happen quite often, then it’s a
> signal that either we have to change the rules or change our attitude to
> this.
>
> As I see, the initial Ismael’s message of his email was more about titles
> and multiple commits per PR is another but, of course, related topic. For
> both, I believe we can partly automate it - add checks to prevent merging
> the commits with not correct messages or/and limit number of commits per
> PR, for example. Some other big projects, like Apache Spark, have even
> special tool to merge PR in well-formed way [1]. I’m not sure that we need
> to have something similar because I’m pretty sure it will affect the
> performance of adding new fixes/features (at least in the beginning), but
> since we already started the similar discussions several times on regular
> bases, we might want to think in this way as an option too.
>
> As for me, I’d prefer that every committer paid more attention (if not
> yet) on these “non code” things before reviewing/merging a PR.
>
> [1] https://github.com/apache/spark/blob/master/dev/merge_spark_pr.py
>
>
> On 22 Apr 2021, at 01:28, Robert Bradshaw  wrote:
>
> I am also in the camp that it often makes sense to have more than 1 commit
> per PR, but rather than enforce a 1 commit per PR policy, I would say that
> it's too much bother to look at the commit history whether it should be
> squashed or merged (though I think it is almost always very obvious which
> is preferable for a given PR), go ahead and squash rather than merge by
> default.
>
>
> On Wed, Apr 21, 2021 at 2:23 PM Kenneth Knowles  wrote:
>
>> This seems to come up a lot. Maybe we should change something?
>>
>> Having worked on a number of projects and at companies with this policy,
>> companies using non-distributed source control, and companies that just
>> "use git like git", I know all these ways of life pretty well.
>>
>> TL;DR my experience is:
>>  - when people care about the commit history and take care of it, then
>> just "use git like git" results in faster development and clearer history,
>> despite intermediate commits not being tested by Jenkins/Travis/GHA
>>  - when people see git as an inconvenience, view the history as an
>> implementation detail, or think in linear history of PR merges and view the
>> commits as an implementation detail, it becomes a mess
>>
>> Empirically, this is what I expect from a 1 commit = 1 PR policy (and how
>> I feel about each point):
>>  - 

Re: Issues and PR names and descriptions (or should we change the contribution guide)

2021-04-22 Thread Alexey Romanenko
Thanks Ismael for bringing this on the table again. Kind of my “favourite” 
topic, unfortunately, that I raised a couple of times… Let me share some of my 
thoughts on this.

First of all, as Beam developers, honestly we have to agree if we care about 
our commits history or not. If not (or not so much) then probably there is no 
more things to discuss and we use Git as just Git… It’s not a bad thing, it’s 
just different but for large projects, like Beam, clear commits history is 
ultra important, imho.

Well, for now we do care and we clearly mention this in our Contribution Guide. 
Probably, it sounds only as a recommendation there or not all contributors 
(especially first-time ones) read this or take this into account or pay 
attention on this. It’s fine and we always can expect not following our guide 
because of many different reasons. And this is exactly where Committers have to 
play their role! I mean that our clear Git history mostly relies on committer's 
shoulders and, before clicking on Merge button, every committer have (even 
“must" I’d say) make sure that PR respects all our rules (we have them because 
of some reasons, right?) and ready to be merged. Nice and correct 
titles/messages is one this thing. Personally, the first thing that I do once I 
start to do a review and before merge, is checking the PR’s title, branches (if 
it’s from a feature branch and against main Beam branch), number of commits and 
their messages. Then I take a look on related Jira issue which ID should be 
prefixed to PR's title and commit’s message(s). 

For sure, there are always exceptions. In case of emergency, for example, if 
the build is broken because of tiny thing then it makes sense to fix this as 
fast as possible and then, probably, to neglect some rules. But if exceptions 
become the common practice and happen quite often, then it’s a signal that 
either we have to change the rules or change our attitude to this.  

As I see, the initial Ismael’s message of his email was more about titles and 
multiple commits per PR is another but, of course, related topic. For both, I 
believe we can partly automate it - add checks to prevent merging the commits 
with not correct messages or/and limit number of commits per PR, for example. 
Some other big projects, like Apache Spark, have even special tool to merge PR 
in well-formed way [1]. I’m not sure that we need to have something similar 
because I’m pretty sure it will affect the performance of adding new 
fixes/features (at least in the beginning), but since we already started the 
similar discussions several times on regular bases, we might want to think in 
this way as an option too.

As for me, I’d prefer that every committer paid more attention (if not yet) on 
these “non code” things before reviewing/merging a PR.

[1] https://github.com/apache/spark/blob/master/dev/merge_spark_pr.py


> On 22 Apr 2021, at 01:28, Robert Bradshaw  wrote:
> 
> I am also in the camp that it often makes sense to have more than 1 commit 
> per PR, but rather than enforce a 1 commit per PR policy, I would say that 
> it's too much bother to look at the commit history whether it should be 
> squashed or merged (though I think it is almost always very obvious which is 
> preferable for a given PR), go ahead and squash rather than merge by default. 
> 
> 
> On Wed, Apr 21, 2021 at 2:23 PM Kenneth Knowles  > wrote:
> This seems to come up a lot. Maybe we should change something?
> 
> Having worked on a number of projects and at companies with this policy, 
> companies using non-distributed source control, and companies that just "use 
> git like git", I know all these ways of life pretty well.
> 
> TL;DR my experience is:
>  - when people care about the commit history and take care of it, then just 
> "use git like git" results in faster development and clearer history, despite 
> intermediate commits not being tested by Jenkins/Travis/GHA
>  - when people see git as an inconvenience, view the history as an 
> implementation detail, or think in linear history of PR merges and view the 
> commits as an implementation detail, it becomes a mess
> 
> Empirically, this is what I expect from a 1 commit = 1 PR policy (and how I 
> feel about each point):
>  - fewer commits with bad messages (yay!)
>  - simpler git graph if we squash + rebase (meh)
>  - larger commits of related-but-independent changes (could be OK)
>  - commits with bullet points in their description that bundle unrelated 
> changes (sad face)
>  - slowdown of development (neutral - slow can be good)
>  - fewer "quality of life" improvements, since those would add lines of diff 
> to a PR and are off topic; when they have to be done in a separate PR they 
> don't get done and they don't get reviewed with the same priority (extra sad 
> face)
> 
> I know I am in the minority. I tend to have a lot of PRs where there 
> are 2-5 fairly independent commits. It is "to aid code review" but not in the 

Re: Issues and PR names and descriptions (or should we change the contribution guide)

2021-04-21 Thread Robert Bradshaw
I am also in the camp that it often makes sense to have more than 1 commit
per PR, but rather than enforce a 1 commit per PR policy, I would say that
it's too much bother to look at the commit history whether it should be
squashed or merged (though I think it is almost always very obvious which
is preferable for a given PR), go ahead and squash rather than merge by
default.


On Wed, Apr 21, 2021 at 2:23 PM Kenneth Knowles  wrote:

> This seems to come up a lot. Maybe we should change something?
>
> Having worked on a number of projects and at companies with this policy,
> companies using non-distributed source control, and companies that just
> "use git like git", I know all these ways of life pretty well.
>
> TL;DR my experience is:
>  - when people care about the commit history and take care of it, then
> just "use git like git" results in faster development and clearer history,
> despite intermediate commits not being tested by Jenkins/Travis/GHA
>  - when people see git as an inconvenience, view the history as an
> implementation detail, or think in linear history of PR merges and view the
> commits as an implementation detail, it becomes a mess
>
> Empirically, this is what I expect from a 1 commit = 1 PR policy (and how
> I feel about each point):
>  - fewer commits with bad messages (yay!)
>  - simpler git graph if we squash + rebase (meh)
>  - larger commits of related-but-independent changes (could be OK)
>  - commits with bullet points in their description that bundle unrelated
> changes (sad face)
>  - slowdown of development (neutral - slow can be good)
>  - fewer "quality of life" improvements, since those would add lines of
> diff to a PR and are off topic; when they have to be done in a separate PR
> they don't get done and they don't get reviewed with the same priority
> (extra sad face)
>
> I know I am in the minority. I tend to have a lot of PRs where there
> are 2-5 fairly independent commits. It is "to aid code review" but not in
> the way you might think: The best size for code review is pretty big,
> compared to the best size for commit. A commit is the unit of roll-forward,
> roll-back, cherry-pick, etc. Brian's point about commits not being
> independently tested is important: this is a tooling issue, but not that
> easy to change. Here is why I am not that worried about it: I believe
> strongly in a "rollback first" policy to restore greenness, but also that
> the rollback change itself must be verified to restore greenness. When a
> multi-commit PR fails, you can easily open a revert of the whole PR as well
> as reverts of individual suspect commits. The CI for these will finish
> around the same time, and if you manage a smaller revert, great! Imagine if
> to revert a PR you had to revert _every_ change between HEAD and that PR.
> It would restore to a known green state. Yet we don't do this, because we
> have technology that makes it unnecessary. Ultimately, single large commits
> with bullet points are just an unstructured version of multi-commit PRs. So
> I favor the structure. But people seem to be more likely to write good
> bullet points than to write independent commits. Perhaps because it is
> easier.
>
> So at this point, I think I am OK with a 1 commit per PR policy. I think
> the net benefits to our commit history would be good. I have grown tired of
> repeating the conversation. Rebase-and-squash edits commit ids in ways that
> confuses tools, so I do not favor this. Tooling that merges one commit at a
> time (without altering commit id) would also be super cool and not that
> hard. It would prevent intermediate results from merging, solving both
> problems.
>
> Kenn
>
>
> On Wed, Apr 21, 2021 at 1:25 PM Brian Hulette  wrote:
>
>> I'd argue that the history is almost always "most useful" when one PR ==
>> one commit on master. Intermediate commits from a PR may be useful to aid
>> code review, but they're not verified by presubmits and thus aren't
>> necessarily independently revertible, so I see little value in keeping them
>> around on master. In fact if you're breaking up a PR into multiple commits
>> to aid code review, it's worth considering if they could/should be
>> separately reviewed and verified PRs.
>> We could solve the unwanted commit issue if we have a policy to always
>> "Squash and Merge" PRs with rare exceptions.
>>
>> I agree jira/PR titles could be better, I'm not sure what we can do about
>> it aside from reminding committers of this responsibility. Perhaps the
>> triage process can help catch poorly titled jiras?
>>
>> On Wed, Apr 21, 2021 at 11:38 AM Robert Bradshaw 
>> wrote:
>>
>>> +1 to better descriptions for JIRA (and PRs). Thanks for bringing this
>>> up.
>>>
>>> For merging unwanted commits, can we automate a simple check (e.g. with
>>> github actions)?
>>>
>>> On Wed, Apr 21, 2021 at 8:00 AM Tomo Suzuki  wrote:
>>>
 BEAM-12173 is on me. I'm sorry about that. Re-reading committer guide
 [1], I see I was not following this


Re: Issues and PR names and descriptions (or should we change the contribution guide)

2021-04-21 Thread Kenneth Knowles
This seems to come up a lot. Maybe we should change something?

Having worked on a number of projects and at companies with this policy,
companies using non-distributed source control, and companies that just
"use git like git", I know all these ways of life pretty well.

TL;DR my experience is:
 - when people care about the commit history and take care of it, then just
"use git like git" results in faster development and clearer history,
despite intermediate commits not being tested by Jenkins/Travis/GHA
 - when people see git as an inconvenience, view the history as an
implementation detail, or think in linear history of PR merges and view the
commits as an implementation detail, it becomes a mess

Empirically, this is what I expect from a 1 commit = 1 PR policy (and how I
feel about each point):
 - fewer commits with bad messages (yay!)
 - simpler git graph if we squash + rebase (meh)
 - larger commits of related-but-independent changes (could be OK)
 - commits with bullet points in their description that bundle unrelated
changes (sad face)
 - slowdown of development (neutral - slow can be good)
 - fewer "quality of life" improvements, since those would add lines of
diff to a PR and are off topic; when they have to be done in a separate PR
they don't get done and they don't get reviewed with the same priority
(extra sad face)

I know I am in the minority. I tend to have a lot of PRs where there
are 2-5 fairly independent commits. It is "to aid code review" but not in
the way you might think: The best size for code review is pretty big,
compared to the best size for commit. A commit is the unit of roll-forward,
roll-back, cherry-pick, etc. Brian's point about commits not being
independently tested is important: this is a tooling issue, but not that
easy to change. Here is why I am not that worried about it: I believe
strongly in a "rollback first" policy to restore greenness, but also that
the rollback change itself must be verified to restore greenness. When a
multi-commit PR fails, you can easily open a revert of the whole PR as well
as reverts of individual suspect commits. The CI for these will finish
around the same time, and if you manage a smaller revert, great! Imagine if
to revert a PR you had to revert _every_ change between HEAD and that PR.
It would restore to a known green state. Yet we don't do this, because we
have technology that makes it unnecessary. Ultimately, single large commits
with bullet points are just an unstructured version of multi-commit PRs. So
I favor the structure. But people seem to be more likely to write good
bullet points than to write independent commits. Perhaps because it is
easier.

So at this point, I think I am OK with a 1 commit per PR policy. I think
the net benefits to our commit history would be good. I have grown tired of
repeating the conversation. Rebase-and-squash edits commit ids in ways that
confuses tools, so I do not favor this. Tooling that merges one commit at a
time (without altering commit id) would also be super cool and not that
hard. It would prevent intermediate results from merging, solving both
problems.

Kenn


On Wed, Apr 21, 2021 at 1:25 PM Brian Hulette  wrote:

> I'd argue that the history is almost always "most useful" when one PR ==
> one commit on master. Intermediate commits from a PR may be useful to aid
> code review, but they're not verified by presubmits and thus aren't
> necessarily independently revertible, so I see little value in keeping them
> around on master. In fact if you're breaking up a PR into multiple commits
> to aid code review, it's worth considering if they could/should be
> separately reviewed and verified PRs.
> We could solve the unwanted commit issue if we have a policy to always
> "Squash and Merge" PRs with rare exceptions.
>
> I agree jira/PR titles could be better, I'm not sure what we can do about
> it aside from reminding committers of this responsibility. Perhaps the
> triage process can help catch poorly titled jiras?
>
> On Wed, Apr 21, 2021 at 11:38 AM Robert Bradshaw 
> wrote:
>
>> +1 to better descriptions for JIRA (and PRs). Thanks for bringing this
>> up.
>>
>> For merging unwanted commits, can we automate a simple check (e.g. with
>> github actions)?
>>
>> On Wed, Apr 21, 2021 at 8:00 AM Tomo Suzuki  wrote:
>>
>>> BEAM-12173 is on me. I'm sorry about that. Re-reading committer guide
>>> [1], I see I was not following this
>>>
>>> > The reviewer should give the LGTM and then request that the author of
>>> the pull request rebase, squash, split, etc, the commits, so that the
>>> history is most useful
>>>
>>>
>>> Thank you for the feedback on this matter! (And I don't think we
>>> should change the contribution guide)
>>>
>>> [1] https://beam.apache.org/contribute/committer-guide/
>>>
>>> On Wed, Apr 21, 2021 at 10:35 AM Ismaël Mejía  wrote:
>>> >
>>> > Hello,
>>> >
>>> > I have noticed an ongoing pattern of carelessness around issues/PR
>>> titles and
>>> > descriptions. It is really painful 

Re: Issues and PR names and descriptions (or should we change the contribution guide)

2021-04-21 Thread Brian Hulette
I'd argue that the history is almost always "most useful" when one PR ==
one commit on master. Intermediate commits from a PR may be useful to aid
code review, but they're not verified by presubmits and thus aren't
necessarily independently revertible, so I see little value in keeping them
around on master. In fact if you're breaking up a PR into multiple commits
to aid code review, it's worth considering if they could/should be
separately reviewed and verified PRs.
We could solve the unwanted commit issue if we have a policy to always
"Squash and Merge" PRs with rare exceptions.

I agree jira/PR titles could be better, I'm not sure what we can do about
it aside from reminding committers of this responsibility. Perhaps the
triage process can help catch poorly titled jiras?

On Wed, Apr 21, 2021 at 11:38 AM Robert Bradshaw 
wrote:

> +1 to better descriptions for JIRA (and PRs). Thanks for bringing this up.
>
> For merging unwanted commits, can we automate a simple check (e.g. with
> github actions)?
>
> On Wed, Apr 21, 2021 at 8:00 AM Tomo Suzuki  wrote:
>
>> BEAM-12173 is on me. I'm sorry about that. Re-reading committer guide
>> [1], I see I was not following this
>>
>> > The reviewer should give the LGTM and then request that the author of
>> the pull request rebase, squash, split, etc, the commits, so that the
>> history is most useful
>>
>>
>> Thank you for the feedback on this matter! (And I don't think we
>> should change the contribution guide)
>>
>> [1] https://beam.apache.org/contribute/committer-guide/
>>
>> On Wed, Apr 21, 2021 at 10:35 AM Ismaël Mejía  wrote:
>> >
>> > Hello,
>> >
>> > I have noticed an ongoing pattern of carelessness around issues/PR
>> titles and
>> > descriptions. It is really painful to see more and more examples like:
>> >
>> > BEAM-12160 Add TODO for fixing the warning
>> > BEAM-12165 Fix ParquetIO
>> > BEAM-12173 avoid intermediate conversion (PR) and BEAM-12173 use
>> > toMinutes (commit)
>> >
>> > In all these cases with just a bit of detail in the title it would be
>> enough to
>> > make other contributors or reviewers life easierm as well as to have a
>> better
>> > project history.  What astonishes me apart of the lack of care is that
>> some of
>> > those are from Beam commmitters.
>> >
>> > We already have discussed about not paying attention during commit
>> merges where
>> > some PRs end up merging tons of 'unwanted' fixup commits, and nothing
>> has
>> > changed so I am wondering if we should maybe just totally remove that
>> rule (for
>> > commits) and also eventually for titles and descriptions.
>> >
>> > Ismaël
>> >
>> > [1] https://beam.apache.org/contribute/
>>
>>
>>
>> --
>> Regards,
>> Tomo
>>
>


Re: Issues and PR names and descriptions (or should we change the contribution guide)

2021-04-21 Thread Robert Bradshaw
+1 to better descriptions for JIRA (and PRs). Thanks for bringing this up.

For merging unwanted commits, can we automate a simple check (e.g. with
github actions)?

On Wed, Apr 21, 2021 at 8:00 AM Tomo Suzuki  wrote:

> BEAM-12173 is on me. I'm sorry about that. Re-reading committer guide
> [1], I see I was not following this
>
> > The reviewer should give the LGTM and then request that the author of
> the pull request rebase, squash, split, etc, the commits, so that the
> history is most useful
>
>
> Thank you for the feedback on this matter! (And I don't think we
> should change the contribution guide)
>
> [1] https://beam.apache.org/contribute/committer-guide/
>
> On Wed, Apr 21, 2021 at 10:35 AM Ismaël Mejía  wrote:
> >
> > Hello,
> >
> > I have noticed an ongoing pattern of carelessness around issues/PR
> titles and
> > descriptions. It is really painful to see more and more examples like:
> >
> > BEAM-12160 Add TODO for fixing the warning
> > BEAM-12165 Fix ParquetIO
> > BEAM-12173 avoid intermediate conversion (PR) and BEAM-12173 use
> > toMinutes (commit)
> >
> > In all these cases with just a bit of detail in the title it would be
> enough to
> > make other contributors or reviewers life easierm as well as to have a
> better
> > project history.  What astonishes me apart of the lack of care is that
> some of
> > those are from Beam commmitters.
> >
> > We already have discussed about not paying attention during commit
> merges where
> > some PRs end up merging tons of 'unwanted' fixup commits, and nothing has
> > changed so I am wondering if we should maybe just totally remove that
> rule (for
> > commits) and also eventually for titles and descriptions.
> >
> > Ismaël
> >
> > [1] https://beam.apache.org/contribute/
>
>
>
> --
> Regards,
> Tomo
>


Re: Issues and PR names and descriptions (or should we change the contribution guide)

2021-04-21 Thread Tomo Suzuki
BEAM-12173 is on me. I'm sorry about that. Re-reading committer guide
[1], I see I was not following this

> The reviewer should give the LGTM and then request that the author of the 
> pull request rebase, squash, split, etc, the commits, so that the history is 
> most useful


Thank you for the feedback on this matter! (And I don't think we
should change the contribution guide)

[1] https://beam.apache.org/contribute/committer-guide/

On Wed, Apr 21, 2021 at 10:35 AM Ismaël Mejía  wrote:
>
> Hello,
>
> I have noticed an ongoing pattern of carelessness around issues/PR titles and
> descriptions. It is really painful to see more and more examples like:
>
> BEAM-12160 Add TODO for fixing the warning
> BEAM-12165 Fix ParquetIO
> BEAM-12173 avoid intermediate conversion (PR) and BEAM-12173 use
> toMinutes (commit)
>
> In all these cases with just a bit of detail in the title it would be enough 
> to
> make other contributors or reviewers life easierm as well as to have a better
> project history.  What astonishes me apart of the lack of care is that some of
> those are from Beam commmitters.
>
> We already have discussed about not paying attention during commit merges 
> where
> some PRs end up merging tons of 'unwanted' fixup commits, and nothing has
> changed so I am wondering if we should maybe just totally remove that rule 
> (for
> commits) and also eventually for titles and descriptions.
>
> Ismaël
>
> [1] https://beam.apache.org/contribute/



-- 
Regards,
Tomo


Issues and PR names and descriptions (or should we change the contribution guide)

2021-04-21 Thread Ismaël Mejía
Hello,

I have noticed an ongoing pattern of carelessness around issues/PR titles and
descriptions. It is really painful to see more and more examples like:

BEAM-12160 Add TODO for fixing the warning
BEAM-12165 Fix ParquetIO
BEAM-12173 avoid intermediate conversion (PR) and BEAM-12173 use
toMinutes (commit)

In all these cases with just a bit of detail in the title it would be enough to
make other contributors or reviewers life easierm as well as to have a better
project history.  What astonishes me apart of the lack of care is that some of
those are from Beam commmitters.

We already have discussed about not paying attention during commit merges where
some PRs end up merging tons of 'unwanted' fixup commits, and nothing has
changed so I am wondering if we should maybe just totally remove that rule (for
commits) and also eventually for titles and descriptions.

Ismaël

[1] https://beam.apache.org/contribute/