Re: Merging CEP-15 to trunk

Caleb Rackliffe Wed, 01 Feb 2023 12:00:14 -0800

Just an FYI, the Accord feature flag has landed in the cep-15-accord
branch:
https://github.com/apache/cassandra/commit/2e680a33c03ce66d4b1358e1a1cc11cf4ee0189f


(btw, it implicitly fixes some of the dtests around the new Accord system
keyspace, because Accord is now disabled by default.)

On Tue, Jan 31, 2023 at 3:08 PM Ekaterina Dimitrova <e.dimitr...@gmail.com>
wrote:

> Thanks David, I think I am personally aligned with what you are saying.
> Thanks for clarifying your previous email.
>
> “ I do not believe anyone is arguing to merge Accord with bugs… bugs we
> know about are blockers for merge.  There are pending improvements and
> features people are working on, but I don’t believe I know of us trying to
> merge bugs into trunk.  If there are bugs found we should address before
> merge, but improvements can happen after trunk merge (depending on timing,
> before merge is fine).
>
> I do hope this comment doesn’t cause a rabbit hole of “what is a bug vs
> what is an improvement”….”
>
> No doubts on my end personally
>
> Excited to see Accord landing :-)
>
> On Tue, 31 Jan 2023 at 15:35, David Capwell <dcapw...@apple.com> wrote:
>
>> About this merge - I would  personally love to see everything rebased and
>> things rerun in CI which no one has a doubt it will happen
>>
>>
>> Yes, I think this is expected behavior that must happen before adding to
>> trunk, or even just rebasing the feature branch; CI must be passing and
>> flaky tests (as you called out) must be resolved.  Nothing is special from
>> a process point of view as this is already the expected process.  I know
>> Caleb is working on it and uses my circleci script, which leverages the
>> _repeat job to help flesh this out.  With the merge to trunk it “should”
>> detect all new tests and run them, so its fair to expect this to be stable.
>>
>> I am not sure I understand this -
>>
>>
>> Sorry, I was trying to be explicit on the current state of test failures,
>> but was not clear.
>>
>> Accord adds a new system table to make state durable (similar to the
>> paxos table).  This added table causes a well known subset of python dtests
>> to fail as they are checking for the set of keyspaces/tables and detecting
>> a new one added, causing the test to fail.  We have been ignoring these
>> rather than changing python-dtest that way we pull in latest changes there
>> rather than forking and freezing the branch we use.  We will 100% update
>> python dtest along with the merge to trunk similar to all other
>> cross-cutting work done.  Trunk must not fail due to this table getting
>> added and must 100% be dealt with at the same time we merge.
>>
>> Do you plan to merge before fixing the tests? I am confused
>>
>>
>> Yes, its a blocker for trunk merge.
>>
>> I think Henrik brought an interesting point about releasable trunk. I
>> don’t think that we are yet there with the Jenkins issues, but we are
>> getting close and we strive for that, no?
>>
>>
>> As Josh calls out, we require releasable trunk as defined by Circle CI
>> (change was made in 4.1); my understanding is that people are working to
>> improve Jenkins, and once this work is done we can migrate off Circle CI.
>> Until that happens we need to make sure Circle CI is stable for every
>> commit.
>>
>> 3) what is the story with the ninja fixes?
>>
>>
>> Can’t speak for 100% of them, but from the ones I have seen it boils down
>> to “our merge process relies on humans to do the right thing, and people
>> mess up some times”, which sadly is true in trunk.  To avoid calling anyone
>> out other than myself, lets look at a specific one that I did
>>
>> $ git show f8243f41c9e96c4a0390558086ece078b0b6b15c
>> commit f8243f41c9e96c4a0390558086ece078b0b6b15c
>> Author: David Capwell <dcapw...@apache.org>
>> Date:   Mon Jan 9 13:20:58 2023 -0800
>>
>>     Ninja: Add AccordTestUtils.parse which was missing in the latest
>> commit
>>
>> diff --git
>> a/test/unit/org/apache/cassandra/service/accord/AccordTestUtils.java
>> b/test/unit/org/apache/cassandra/service/accord/AccordTestUtils.java
>> index 4adad32d8a..20142c439b 100644
>> --- a/test/unit/org/apache/cassandra/service/accord/AccordTestUtils.java
>> +++ b/test/unit/org/apache/cassandra/service/accord/AccordTestUtils.java
>> @@ -174,6 +174,14 @@ public class AccordTestUtils
>>          return statement.createTxn(ClientState.forInternalCalls(),
>> options);
>>      }
>>
>> +    public static TransactionStatement parse(String query)
>> +    {
>> +        TransactionStatement.Parsed parsed =
>> (TransactionStatement.Parsed) QueryProcessor.parseStatement(query);
>> +        Assert.assertNotNull(parsed);
>> +        TransactionStatement statement = (TransactionStatement)
>> parsed.prepare(ClientState.forInternalCalls());
>> +        return statement;
>> +    }
>> +
>>      public static Txn createTxn(int readKey, int... writeKeys)
>>      {
>>
>>          StringBuilder sb = new StringBuilder("BEGIN TRANSACTION\n”);
>>
>> I merged but a specific method was missing, this caused the branch to
>> fail to compile so I ninja patched by adding that method back; this was
>> reviewed in slack with Benedict (if I remember correctly).
>>
>> I made a human judgement to merge via GH UI and bypass my merge scripts,
>> as I had a Circle CI build showing success… I messed up and had to go back
>> and fix it after commit.
>>
>> It is frustrating and I can imagine how after that much work around
>> Accord people just want it in
>>
>>
>> I do not believe anyone is arguing to merge Accord with bugs… bugs we
>> know about are blockers for merge.  There are pending improvements and
>> features people are working on, but I don’t believe I know of us trying to
>> merge bugs into trunk.  If there are bugs found we should address before
>> merge, but improvements can happen after trunk merge (depending on timing,
>> before merge is fine).
>>
>> I do hope this comment doesn’t cause a rabbit hole of “what is a bug vs
>> what is an improvement”….
>>
>>
>> On Jan 31, 2023, at 8:20 AM, Josh McKenzie <jmcken...@apache.org> wrote:
>>
>> Don't we follow a principle of always shippable trunk? This was actually
>> a reason why I sidelined the talk about post-merge review, because it
>> implies that the code wasn't "good enough"/perfect when it was first merged.
>>
>> We follow a principle of "always shippable trunk according to circleci"
>> as of cutting 4.1, which has been adhered to in this case. There's every
>> likelihood that ASF CI will detonate when this is merged in, but it's also
>> already *currently* detonating repeatedly which we're working on so...
>> /shrug
>>
>> If we block Accord merging on green ASF CI, we'll be in the same boat we
>> were in with 4.1 and never ship until we tear certain things down to the
>> studs and rebuild them. I don't think it's reasonable to put that burden
>> (ASF CI must be green) on *any* ticket right now, much less one that has
>> a potentially high integration cost to the entire project if it stagnates
>> over time.
>>
>> plenty of experience with situations where even if an engineer, including
>> myself sometimes, wanted to work on fixing some technical debt, the
>> employer's project management processes simply wouldn't prioritize that
>> work and it was left for years
>>
>> Seems like it's a healthy mix of debt and bikeshedding for us
>> historically. The former we don't want to sneak in, the latter, well, that
>> I'm less concerned about personally. :)
>>
>> And I think part of the "two committers +1" bar is about trying to keep
>> our debt low.
>>
>> On Tue, Jan 31, 2023, at 2:45 AM, Benedict wrote:
>>
>>
>> Do you mean as part of a blocking review, or just in general?
>>
>> I don’t honestly understand the focus on ninja fixes or rebasing, in
>> either context. Why would a project rebase its work until ready to merge?
>> Why would you worry that a team well resourced with experienced
>> contributors (30+yrs between them) don’t know how to work correctly with
>> ninja fixes? These are all minor details, surely?
>>
>> I understand your concern around flaky tests, particularly since it seems
>> other work has let some slip through. I don’t believe this is a blocking
>> review concern, as it represents its own blocking requirement. I believe I
>> have responded positively to this query already though?
>>
>>
>> On 31 Jan 2023, at 01:46, Ekaterina Dimitrova <e.dimitr...@gmail.com>
>> wrote:
>>
>> 
>>
>> Benedict, Is it an issue to ask whether flaky tests will be addressed as
>> per our project agreement? Or about ninja fixes and why a branch was not
>> rebased during development? What did I miss?
>>
>> By the way I do not ask these questions to block anyone, even suggested
>> to help with CI…it’s a pity this part was dismissed…
>>
>>  I can see that Caleb is handling the things around the merge ticket with
>> high attention to the details as always. I can only thank him!
>>
>> At this point I see this thread mostly as - this is the first feature
>> branch since quite some time, people are unsure about certain things -
>> let’s see how we handle this for the next time to be more efficient and
>> clear.  I think you already took some actions in your suggestion earlier
>> today with the document around comments.
>>
>> On Mon, 30 Jan 2023 at 20:16, Benedict <bened...@apache.org> wrote:
>>
>>
>> Review should primarily ask: "is this correct?" and "could this be done
>> differently (clearer, faster, more correct, etc)?" Blocking reviews
>> especially, because why else would a reasonable contributor want to block
>> progress?
>>
>> These questions can of course be asked of implementation details for any
>> CEP.
>>
>> I have said before, a proposal to conduct a blocking review of this kind
>> - if very late in my view - would be valid, though timeline would have to
>> be debated.
>>
>> Reviewers with weaker aspirations have plenty of time available to them -
>> two weeks have already passed, and another couple will likely yet (there
>> isn't a rush). But it is novel to propose that such optional reviews be
>> treated as blocking.
>>
>>
>> On 30 Jan 2023, at 23:04, Henrik Ingo <henrik.i...@datastax.com> wrote:
>>
>> 
>>
>> Ooops, I missed copy pasting this reply into my previous email:
>>
>> On Fri, Jan 27, 2023 at 11:21 PM Benedict <bened...@apache.org> wrote:
>>
>>
>> I'm realizing in retrospect this leaves ambiguity
>>
>>
>> Another misreading at least of the *intent* of these clauses, is that
>> they were to ensure that concerns about a *design and approach* are
>> listened to, and addressed to the satisfaction of interested parties. It
>> was essentially codifying the project’s long term etiquette around pieces
>> of work with either competing proposals or fundamental concerns. It has
>> historically helped to avoid escalation to vetoes, or reverting code after
>> commit.
>>
>> It wasn’t intended that *any* reason might be invoked, as seems to have
>> been inferred, and perhaps this should be clarified, though I had hoped it
>> would be captured by the word “reasonable". Minor concerns that haven’t
>> been caught by the initial review process can always be addressed in
>> follow-up work, as is very common.
>>
>>
>> Wouldn't you expect such concerns to at least partially now have been
>> covered in  the CEP discussion, up front? I would expect at most at this
>> stage someone could validate that the implementation follows the CEP. But I
>> wouldn't expect a debate on competing approaches at this stage.
>>
>> henrik
>> --
>>
>>
>> *Henrik Ingo*
>>
>> *c*. +358 40 569 7354
>>
>> *w*. *www.datastax.com <http://www.datastax.com/>*
>>
>>
>>

Re: Merging CEP-15 to trunk

Reply via email to