Re: Merging CEP-15 to trunk

David Capwell Tue, 31 Jan 2023 12:34:58 -0800

> About this merge - I would  personally love to see everything rebased and 
> things rerun in CI which no one has a doubt it will happen


Yes, I think this is expected behavior that must happen before adding to trunk, 
or even just rebasing the feature branch; CI must be passing and flaky tests 
(as you called out) must be resolved.  Nothing is special from a process point 
of view as this is already the expected process.  I know Caleb is working on it 
and uses my circleci script, which leverages the _repeat job to help flesh this 
out.  With the merge to trunk it “should” detect all new tests and run them, so 
its fair to expect this to be stable.

> I am not sure I understand this - 


Sorry, I was trying to be explicit on the current state of test failures, but 
was not clear.

Accord adds a new system table to make state durable (similar to the paxos 
table).  This added table causes a well known subset of python dtests to fail 
as they are checking for the set of keyspaces/tables and detecting a new one 
added, causing the test to fail.  We have been ignoring these rather than 
changing python-dtest that way we pull in latest changes there rather than 
forking and freezing the branch we use.  We will 100% update python dtest along 
with the merge to trunk similar to all other cross-cutting work done.  Trunk 
must not fail due to this table getting added and must 100% be dealt with at 
the same time we merge.

> Do you plan to merge before fixing the tests? I am confused


Yes, its a blocker for trunk merge.  

> I think Henrik brought an interesting point about releasable trunk. I don’t 
> think that we are yet there with the Jenkins issues, but we are getting close 
> and we strive for that, no?


As Josh calls out, we require releasable trunk as defined by Circle CI (change 
was made in 4.1); my understanding is that people are working to improve 
Jenkins, and once this work is done we can migrate off Circle CI.  Until that 
happens we need to make sure Circle CI is stable for every commit.

> 3) what is the story with the ninja fixes? 


Can’t speak for 100% of them, but from the ones I have seen it boils down to 
“our merge process relies on humans to do the right thing, and people mess up 
some times”, which sadly is true in trunk.  To avoid calling anyone out other 
than myself, lets look at a specific one that I did

$ git show f8243f41c9e96c4a0390558086ece078b0b6b15c
commit f8243f41c9e96c4a0390558086ece078b0b6b15c
Author: David Capwell <[email protected]>
Date:   Mon Jan 9 13:20:58 2023 -0800

    Ninja: Add AccordTestUtils.parse which was missing in the latest commit

diff --git a/test/unit/org/apache/cassandra/service/accord/AccordTestUtils.java 
b/test/unit/org/apache/cassandra/service/accord/AccordTestUtils.java
index 4adad32d8a..20142c439b 100644
--- a/test/unit/org/apache/cassandra/service/accord/AccordTestUtils.java
+++ b/test/unit/org/apache/cassandra/service/accord/AccordTestUtils.java
@@ -174,6 +174,14 @@ public class AccordTestUtils
         return statement.createTxn(ClientState.forInternalCalls(), options);
     }

+    public static TransactionStatement parse(String query)
+    {
+        TransactionStatement.Parsed parsed = (TransactionStatement.Parsed) 
QueryProcessor.parseStatement(query);
+        Assert.assertNotNull(parsed);
+        TransactionStatement statement = (TransactionStatement) 
parsed.prepare(ClientState.forInternalCalls());
+        return statement;
+    }
+
     public static Txn createTxn(int readKey, int... writeKeys)
     {
         StringBuilder sb = new StringBuilder("BEGIN TRANSACTION\n”);

I merged but a specific method was missing, this caused the branch to fail to 
compile so I ninja patched by adding that method back; this was reviewed in 
slack with Benedict (if I remember correctly).

I made a human judgement to merge via GH UI and bypass my merge scripts, as I 
had a Circle CI build showing success… I messed up and had to go back and fix 
it after commit.

> It is frustrating and I can imagine how after that much work around Accord 
> people just want it in


I do not believe anyone is arguing to merge Accord with bugs… bugs we know 
about are blockers for merge.  There are pending improvements and features 
people are working on, but I don’t believe I know of us trying to merge bugs 
into trunk.  If there are bugs found we should address before merge, but 
improvements can happen after trunk merge (depending on timing, before merge is 
fine).

I do hope this comment doesn’t cause a rabbit hole of “what is a bug vs what is 
an improvement”….


> On Jan 31, 2023, at 8:20 AM, Josh McKenzie <[email protected]> wrote:
> 
>> Don't we follow a principle of always shippable trunk? This was actually a 
>> reason why I sidelined the talk about post-merge review, because it implies 
>> that the code wasn't "good enough"/perfect when it was first merged.
> We follow a principle of "always shippable trunk according to circleci" as of 
> cutting 4.1, which has been adhered to in this case. There's every likelihood 
> that ASF CI will detonate when this is merged in, but it's also already 
> currently detonating repeatedly which we're working on so... /shrug
> 
> If we block Accord merging on green ASF CI, we'll be in the same boat we were 
> in with 4.1 and never ship until we tear certain things down to the studs and 
> rebuild them. I don't think it's reasonable to put that burden (ASF CI must 
> be green) on any ticket right now, much less one that has a potentially high 
> integration cost to the entire project if it stagnates over time.
> 
>> plenty of experience with situations where even if an engineer, including 
>> myself sometimes, wanted to work on fixing some technical debt, the 
>> employer's project management processes simply wouldn't prioritize that work 
>> and it was left for years
> Seems like it's a healthy mix of debt and bikeshedding for us historically. 
> The former we don't want to sneak in, the latter, well, that I'm less 
> concerned about personally. :)
> 
> And I think part of the "two committers +1" bar is about trying to keep our 
> debt low.
> 
> On Tue, Jan 31, 2023, at 2:45 AM, Benedict wrote:
>> 
>> Do you mean as part of a blocking review, or just in general?
>> 
>> I don’t honestly understand the focus on ninja fixes or rebasing, in either 
>> context. Why would a project rebase its work until ready to merge? Why would 
>> you worry that a team well resourced with experienced contributors (30+yrs 
>> between them) don’t know how to work correctly with ninja fixes? These are 
>> all minor details, surely?
>> 
>> I understand your concern around flaky tests, particularly since it seems 
>> other work has let some slip through. I don’t believe this is a blocking 
>> review concern, as it represents its own blocking requirement. I believe I 
>> have responded positively to this query already though?
>> 
>> 
>>> On 31 Jan 2023, at 01:46, Ekaterina Dimitrova <[email protected]> wrote:
>>> 
>>> 
>>> Benedict, Is it an issue to ask whether flaky tests will be addressed as 
>>> per our project agreement? Or about ninja fixes and why a branch was not 
>>> rebased during development? What did I miss? 
>>> 
>>> By the way I do not ask these questions to block anyone, even suggested to 
>>> help with CI…it’s a pity this part was dismissed… 
>>> 
>>>  I can see that Caleb is handling the things around the merge ticket with 
>>> high attention to the details as always. I can only thank him! 
>>> 
>>> At this point I see this thread mostly as - this is the first feature 
>>> branch since quite some time, people are unsure about certain things - 
>>> let’s see how we handle this for the next time to be more efficient and 
>>> clear.  I think you already took some actions in your suggestion earlier 
>>> today with the document around comments. 
>>> 
>>> On Mon, 30 Jan 2023 at 20:16, Benedict <[email protected] 
>>> <mailto:[email protected]>> wrote:
>>> 
>>> Review should primarily ask: "is this correct?" and "could this be done 
>>> differently (clearer, faster, more correct, etc)?" Blocking reviews 
>>> especially, because why else would a reasonable contributor want to block 
>>> progress?
>>> 
>>> These questions can of course be asked of implementation details for any 
>>> CEP. 
>>> 
>>> I have said before, a proposal to conduct a blocking review of this kind - 
>>> if very late in my view - would be valid, though timeline would have to be 
>>> debated.
>>> 
>>> Reviewers with weaker aspirations have plenty of time available to them - 
>>> two weeks have already passed, and another couple will likely yet (there 
>>> isn't a rush). But it is novel to propose that such optional reviews be 
>>> treated as blocking.
>>> 
>>> 
>>>> On 30 Jan 2023, at 23:04, Henrik Ingo <[email protected] 
>>>> <mailto:[email protected]>> wrote:
>>>> 
>>> 
>>>> Ooops, I missed copy pasting this reply into my previous email:
>>>> 
>>>> On Fri, Jan 27, 2023 at 11:21 PM Benedict <[email protected] 
>>>> <mailto:[email protected]>> wrote:
>>>> 
>>>>> I'm realizing in retrospect this leaves ambiguity
>>>> 
>>>> Another misreading at least of the intent of these clauses, is that they 
>>>> were to ensure that concerns about a design and approach are listened to, 
>>>> and addressed to the satisfaction of interested parties. It was 
>>>> essentially codifying the project’s long term etiquette around pieces of 
>>>> work with either competing proposals or fundamental concerns. It has 
>>>> historically helped to avoid escalation to vetoes, or reverting code after 
>>>> commit. 
>>>> 
>>>> It wasn’t intended that any reason might be invoked, as seems to have been 
>>>> inferred, and perhaps this should be clarified, though I had hoped it 
>>>> would be captured by the word “reasonable". Minor concerns that haven’t 
>>>> been caught by the initial review process can always be addressed in 
>>>> follow-up work, as is very common.
>>>> 
>>>> 
>>>> Wouldn't you expect such concerns to at least partially now have been 
>>>> covered in  the CEP discussion, up front? I would expect at most at this 
>>>> stage someone could validate that the implementation follows the CEP. But 
>>>> I wouldn't expect a debate on competing approaches at this stage.
>>>> 
>>>> henrik
>>>> --
>>>> 
>>>> 
>>>> Henrik Ingo
>>>> c. +358 40 569 7354 
>>>> w. www.datastax.com <http://www.datastax.com/>

Re: Merging CEP-15 to trunk

Reply via email to