Re: Merging CEP-15 to trunk

Josh McKenzie Tue, 24 Jan 2023 14:11:38 -0800

Cordial debate! <3

> - it's nevertheless the case that those contributors who didn't actively work 
> on Accord, have assumed that they will be invited to review now, when the 
> code is about to land in trunk. Not allowing that to happen would make them 
> feel like they weren't given the opportunity and that the process in 
> Cassandra Project Governance was bypassed. We can agree to work differently 
> in the future, but this is the reality now.
If this was a miscommunication on this instance rectifying it will of course 
require compromise from all parties. Good learning for future engagement and 
hopefully the outcome of this discussion is clearer norms as a project so we 
don't end up with this miscommunication in the future.


> the code is of the highest quality and ready to be merged to trunk, I don't 
> think that can be expected of every feature branch all the time
I think this is something we can either choose to make a formal requirement for 
feature branches in ASF git (all code that goes in has 2 committers hands on) 
or not. If folks want to work on other feature branches in other forks w/out 
this bar and then have a "mega review" at the end, I suppose that's their 
prerogative. Many of us that have been on the project for years have 
_significant emotional and psychological scars_ from that approach however, and 
multiple large efforts have failed at the "mega-review and merge" step. So I 
wouldn't advocate for that approach (and it's the only logical alternative I 
can think of to incremental bar of quality reinforcement throughout a work 
cycle on a large feature over time).

> if it had been ready to merge to trunk already a year ago, why wasn't it? 
> It's kind of the point of using a feature branch that the code in it is NOT 
> ready to be merged yet
Right now we culturally tend to avoid merging code that doesn't do anything, 
for better or worse. We don't have a strong culture of either incremental merge 
in during development or of using the experimental flag for new features. Much 
of the tightly coupled nature of our codebase makes this a necessity for 
keeping velocity while working unfortunately. So in this case I would qualify 
that "it's not ready to be merged yet given our assumption that all code in the 
codebase should serve an active immediate purpose, not due to a lack of 
merge-level quality".

The approach of "hold the same bar for merges into a feature branch as trunk" 
seems to be a compromise between Big Bang single commit drops and peppering 
trunk with a lot of "as yet dormant" incremental code as a large feature is 
built out. Not saying it's better or worse, just describing the contour of the 
tradeoffs as I see them.

> - Uncertainty: It's completely ok that some feature branches may be abandoned 
> without ever merging to trunk. Requiring the community (anyone potentially 
> interested, anyways) to review such code would obviously be a waste of 
> precious talent.
This is an excellent point. The only mitigation I'd see for this would be an 
additional review period or burden collectively before merge of a feature 
branch into trunk once something has crossed a threshold of success as to be 
included, or stepping away from a project where you don't have the cycles to 
stay up to date and review and trust that the other committers working on the 
project are making choices that are palatable and acceptable to you.

If all API decisions hit the dev ML and the architecture conforms generally to 
the specification of the CEP, it seems to me that stepping back and trusting 
your fellow committers to Do The Right Thing is the optimal (and scalable) 
approach here?

> Let's say someone in October 2021 was invested in the quality of Cassandra 
> 4.1 release. Should this person now invest in reviewing Accord or not? It's 
> impossible to know. Again, in hindsight we know that the answer is no, but 
> your suggestion again would require the person to review all active feature 
> branches just in case.
I'd argue that there's 3 times to really invest in the quality of any Cassandra 
release:
1. When we set agreed upon bars for quality we'll all hold ourselves 
accountable to (CI, code style, test coverage, etc)
2. When we raise new committers
3. When we write or review code

I don't think it's sustainable to try and build processes that will bottleneck 
our throughput as a community to the finite availability of individuals if 
they're concerned about specific topics and want to be individually involved in 
specific code level changes. If folks feel like our current processes, CI 
infrastructure, and committer pool risks maintaining our bar of quality that's 
definitely something we should talk about in depth, as in my mind that's the 
backbone of us scaling stably as a project community.


On Tue, Jan 24, 2023, at 4:41 PM, Henrik Ingo wrote:
> Thanks Josh
> 
> Since you mentioned the CEP process, I should also mention one sentiment you 
> omitted, but worth stating explicitly:
> 
> 4. The CEP itself should not be renegotiated at this point. However, the 
> reviewers should rather focus on validating that the implementation matches 
> the CEP. (Or if not, that the deviation is of a good reason and the reviewer 
> agrees to approve it.)
> 
> 
> Although I'm not personally full time working on either producing Cassandra 
> code or reviewing it, I'm going to spend one more email defending your point 
> #1, because I think your proposal would lead to a lot of inefficiencies in 
> the project, and that does happen to be my job to care about: 
> 
>  - Even if you could be right, from some point of view, it's nevertheless the 
> case that those contributors who didn't actively work on Accord, have assumed 
> that they will be invited to review now, when the code is about to land in 
> trunk. Not allowing that to happen would make them feel like they weren't 
> given the opportunity and that the process in Cassandra Project Governance 
> was bypassed. We can agree to work differently in the future, but this is the 
> reality now.
> 
>  - Although those who have collaborated on Accord testify that the code is of 
> the highest quality and ready to be merged to trunk, I don't think that can 
> be expected of every feature branch all the time. In fact, I'm pretty sure 
> the opposite must have been the case also for the Accord branch at some 
> point. After all, if it had been ready to merge to trunk already a year ago, 
> why wasn't it? It's kind of the point of using a feature branch that the code 
> in it is NOT ready to be merged yet. (For example, the existing code might be 
> of high quality, but the work is incomplete, so it shouldn't be merged to 
> trunk.)
> 
>  - Uncertainty: It's completely ok that some feature branches may be 
> abandoned without ever merging to trunk. Requiring the community (anyone 
> potentially interested, anyways) to review such code would obviously be a 
> waste of precious talent.
> 
>  - Time uncertainty: Also - and this is also true for Accord - it is unknown 
> when the merge will happen if it does. In the case of Accord it is now over a 
> year since the CEP was adopted. If I remember correctly an initial target 
> date for some kind of milestone may have been Summer of 2022? Let's say 
> someone in October 2021 was invested in the quality of Cassandra 4.1 release. 
> Should this person now invest in reviewing Accord or not? It's impossible to 
> know. Again, in hindsight we know that the answer is no, but your suggestion 
> again would require the person to review all active feature branches just in 
> case.
> 
> 
> As for 2 and 3, I certainly observe an assumption that contributors have 
> expected to review after a rebase. But I don't see this as a significant 
> topic to argue about. If indeed the rebase is as easy as Benedict advertised, 
> then we should just do the rebase because apparently it can be done faster 
> than it took me to write this email :-) (But yes, conversely, it seems then 
> that the rebase is not a big reason to hold off from reviewing either.)
> 
> henrik
> 
> 
> On Tue, Jan 24, 2023 at 9:29 PM Josh McKenzie <jmcken...@apache.org> wrote:
>> __
>> Zooming out a bit, I think Accord is the first large body of work we've done 
>> post introduction of the CEP system with multiple people collaborating on a 
>> feature branch like this. This discussion seems to have surfaced a few 
>> sentiments:
>> 
>> 1. Some contributors seem to feel that work on a feature branch doesn't have 
>> the same inherent visibility as work on trunk
>> 2. There's a lack of clarity on our review process when it comes to 
>> significant (in either time or size) rebases
>> 3. We might be treating Ninja commits a bit differently on a feature branch 
>> than trunk, which intersects with 1 and 2
>> 
>> My personal opinions are:
>> I disagree with 1 - it simply takes the added effort of actively following 
>> that branch and respective JIRAs if you're interested. I think having a 
>> feature branch in the ASF git repo w/commits and JIRAs tracking that work is 
>> perfectly fine, and the existing bar (2 committers +1, green tests before 
>> merge to trunk) when applied to a feature branch is also not just well 
>> within the "letter of the law" on the project but also logically sufficient 
>> to retain our bar of quality and stability.
>> 
>> For 2 (reviews required after rebase) I don't think we should over-prescribe 
>> process on this. If all tests are green pre-rebase, and all tests are green 
>> post-rebase, and a committer is confident they didn't materially modify the 
>> functioning of the logical flow or data structures of their code during a 
>> rebase, I don't see there being any value added by adding another review 
>> based on those grounds.
>> 
>> If the subtext is actually that some folks feel we need a discussion about 
>> whether we should have a different bar for review on CEP feature branches (3 
>> committers? 1+ pmc members? more diversity in reviewers or committers as 
>> measured by some as yet unspoken metric), perhaps we could have that 
>> discussion. FWIW I'm against changes there as well; we all wear our Apache 
>> Hats here, and if the debate is between work like this happening in a 
>> feature branch affording contributors increased efficiency and locality vs. 
>> all that happening on trunk and repeatedly colliding with everyone 
>> everywhere, feature branches are a clear win IMO.
>> 
>> And for 3 - I think we've all broadly agreed we shouldn't ninja commit 
>> unless it's a comment fix, typo, forgotten git add, or something along those 
>> lines. For any commit that doesn't qualify it should go through the review 
>> process.
>> 
>> And a final note - Ekaterina alluded to something valuable in her email 
>> earlier in this thread. I think having a "confirm green on all the test 
>> suites that are green on merge target" bar for large feature branches 
>> (rather than strictly the "pre-commit subset") before merge makes a lot of 
>> sense.
>> 
>> On Tue, Jan 24, 2023, at 1:41 PM, Caleb Rackliffe wrote:
>>> Just FYI, I'm going to be posting a Jira (which will have some dependencies 
>>> as well) to track this merge, hopefully some time today...
>>> 
>>> On Tue, Jan 24, 2023 at 12:26 PM Ekaterina Dimitrova 
>>> <e.dimitr...@gmail.com> wrote:
>>>> I actually see people all the time making a final check before merge as 
>>>> part of the review. And I personally see it only as a benefit when it 
>>>> comes to serious things like Accord, as an example. Even as a help for the 
>>>> author who is overwhelmed with the big amount of work already done - 
>>>> someone to do quick last round of review. Team work after all.
>>>> 
>>>> Easy rebase - those are great news. I guess any merge conflicts that were 
>>>> solved will be documented and confirmed with reviewers before merge on the 
>>>> ticket where the final CI push will be posted. I also assumed that even 
>>>> without direct conflicts a check that there is no contradiction with any 
>>>> post-September commits is done as part of the rebase. (Just adding here 
>>>> for completeness)
>>>> 
>>>> One thing that I wanted to ask for is when you push to CI, you or whoever 
>>>> does it, to approve all jobs. Currently we have pre-approved the minimum 
>>>> required jobs in the pre-commit workflow. I think in this case with a big 
>>>> work approving all jobs in CircleCI will be of benefit. (I also do it for 
>>>> bigger bodies of work to be on the safe side) Just pointing in case you 
>>>> use a script or something to push only the pre-approved ones. Please ping 
>>>> me in Slack if It’s not clear what I mean, happy to help with that
>>>> 
>>>> On Tue, 24 Jan 2023 at 11:52, Benedict <bened...@apache.org> wrote:
>>>>> 
>>>>> Perhaps the disconnect is that folk assume a rebase will be difficult and 
>>>>> have many conflicts? 
>>>>> 
>>>>> We have introduced primarily new code with minimal integration points, so 
>>>>> I decided to test this. I managed to rebase locally in around five 
>>>>> minutes; mostly imports. This is less work than for a rebase of fairly 
>>>>> typical ticket of average complexity.
>>>>> 
>>>>> Green CI is of course a requirement. There is, however, no good 
>>>>> procedural or technical justification for a special review of the rebase.
>>>>> 
>>>>> Mick is encouraged to take a look at the code before and after rebase, 
>>>>> and will be afforded plenty of time to do so. But I will not gate merge 
>>>>> on this adhoc requirement.
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>>> On 24 Jan 2023, at 15:40, Ekaterina Dimitrova <e.dimitr...@gmail.com> 
>>>>>> wrote:
>>>>>> 
>>>>>> Hi everyone,
>>>>>> I am excited to see this work merged. I noticed the branch is 395 
>>>>>> commits behind trunk or not rebased since September last year. I think 
>>>>>> if Mick or anyone else wants to make a final pass after rebase happens 
>>>>>> and CI runs - this work can only benefit of that. Squash, rebase and 
>>>>>> full CI run green is the minimum that, if I read correctly the thread, 
>>>>>> we all agree on that part. 
>>>>>> I would say that CI and final check after a long rebase of a patch is a 
>>>>>> thing we actually do all the time even for small patches when we get 
>>>>>> back to our backlog of old patches. This doesn’t mean that the previous 
>>>>>> reviews are dismissed or people not trusted or anything like that.
>>>>>> But considering the size and the importance of this work, I can really 
>>>>>> see only benefit of a final cross-check.   
>>>>>> As Henrik mentioned me, I am not sure I will have the chance to review 
>>>>>> this work any time soon (just setting the right expectations up front) 
>>>>>> but I see at least Mick already mentioning he would do it if there are 
>>>>>> no other volunteers. Now, whether it will be separate ticket or not, 
>>>>>> that is a different story. Aren’t the Accord tickets in an epic under 
>>>>>> which we can document the final rebase, CI runs, etc?
>>>>>> 
>>>>>> On Tue, 24 Jan 2023 at 9:40, Henrik Ingo <henrik.i...@datastax.com> 
>>>>>> wrote:
>>>>>>> When was the last time the feature branch was rebased? Assuming it's a 
>>>>>>> while back and the delta is significant, surely it's normal process to 
>>>>>>> first rebase, run tests, and then present the branch for review?
>>>>>>> 
>>>>>>> To answer your question: The effect of the rebase is then either baked 
>>>>>>> into the original commits (which I personally dislike), or you can also 
>>>>>>> have the rebase-induced changes as their own commits. (Which can get 
>>>>>>> tedious, but has the benefit of making explicit what was only a change 
>>>>>>> due to rebasing.) Depending on which approach you take when rebasing, a 
>>>>>>> reviewer would then review accordingly.
>>>>>>> 
>>>>>>> henrik
>>>>>>> 
>>>>>>> On Tue, Jan 24, 2023 at 11:14 AM Benedict <bened...@apache.org> wrote:
>>>>>>>> 
>>>>>>>> No, that is not the normal process. What is it you think you would be 
>>>>>>>> reviewing? There are no diffs produced as part of rebasing, and the 
>>>>>>>> purpose of review is to ensure code meets out standards, not that the 
>>>>>>>> committer is competent at rebasing or squashing. Nor are you familiar 
>>>>>>>> with the code as it was originally reviewed, so would have nothing to 
>>>>>>>> compare against. We expect a clean CI run, ordinarily, not an 
>>>>>>>> additional round of review. If we were to expect that, it would be by 
>>>>>>>> the original reviewer, not a third party, as they are the only ones 
>>>>>>>> able to judge the rebase efficiently.
>>>>>>>> 
>>>>>>>> I would support investigating tooling to support reviewing rebases. 
>>>>>>>> I’m sure such tools and processes exist. But, we don’t have them today 
>>>>>>>> and it is not a normal part of the review process. If you want to 
>>>>>>>> modify, clarify or otherwise stipulate new standards or processes, I 
>>>>>>>> suggest a separate thread.
>>>>>>>> 
>>>>>>>> > How will the existing tickets make it clear when and where their 
>>>>>>>> > final merge happened?
>>>>>>>> 
>>>>>>>> By setting the release version and source control fields.
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>>> On 24 Jan 2023, at 08:43, Mick Semb Wever <m...@apache.org> wrote:
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>>> .... But it's not merge-than-review, because they've already been 
>>>>>>>>>> reviewed, before being merged to the feature branch, by committers 
>>>>>>>>>> (actually PMC members)? 
>>>>>>>>>> 
>>>>>>>>>> You want code that's been written by one PMC member and reviewed by 
>>>>>>>>>> 2 other PMC members to be put up for review by some random 4th 
>>>>>>>>>> party? For how long? 
>>>>>>>>> 
>>>>>>>>>  
>>>>>>>>> It is my hope that the work as-is is not being merged. That there is 
>>>>>>>>> a rebase and some trivial squashing to do. That deserves a quick 
>>>>>>>>> check by another. Ideally this would be one of the existing reviewers 
>>>>>>>>> (but like any other review step, no matter how short and trivial it 
>>>>>>>>> is, that's still an open process). I see others already doing this 
>>>>>>>>> when rebasing larger patches before the final merge. 
>>>>>>>>> 
>>>>>>>>> Will the branch be rebased and cleaned up?
>>>>>>>>> How will the existing tickets make it clear when and where their 
>>>>>>>>> final merge happened?
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> --
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> **Henrik Ingo**
>>>>>>> 
>>>>>>> **c**. +358 40 569 7354 
>>>>>>> 
>>>>>>> **w**. _www.datastax.com_
>>>>>>> 
>>>>>>> __ 
>>>>>>> <https://urldefense.com/v3/__https://www.facebook.com/datastax__;!!PbtH5S7Ebw!ep4b-HH4HnBcGPT32sQbstaimEP5eIigJGvIpXgHKHxWq4uyqmNUiaz6DwjozGhRlQX9M2F7yZrdLA1y1UUJDw$>
>>>>>>>   __ <https://twitter.com/datastax>  __ 
>>>>>>> <https://urldefense.com/v3/__https://www.linkedin.com/company/datastax/__;!!PbtH5S7Ebw!ep4b-HH4HnBcGPT32sQbstaimEP5eIigJGvIpXgHKHxWq4uyqmNUiaz6DwjozGhRlQX9M2F7yZrdLA2L93GKGQ$>
>>>>>>>   __ <https://github.com/datastax/>
>>>>>>> 
>> 
> 
> 
> --
> 
> 
> 
> 
> *Henrik Ingo*
> 
> *c*. +358 40 569 7354 
> 
> *w*. _www.datastax.com_
> 
> __ <https://www.facebook.com/datastax>  __ <https://twitter.com/datastax>  __ 
> <https://www.linkedin.com/company/datastax/>  __ 
> <https://github.com/datastax/>
>

Re: Merging CEP-15 to trunk

Reply via email to