Just an FYI, the Accord feature flag has landed in the cep-15-accord branch: https://github.com/apache/cassandra/commit/2e680a33c03ce66d4b1358e1a1cc11cf4ee0189f
(btw, it implicitly fixes some of the dtests around the new Accord system keyspace, because Accord is now disabled by default.) On Tue, Jan 31, 2023 at 3:08 PM Ekaterina Dimitrova <e.dimitr...@gmail.com> wrote: > Thanks David, I think I am personally aligned with what you are saying. > Thanks for clarifying your previous email. > > “ I do not believe anyone is arguing to merge Accord with bugs… bugs we > know about are blockers for merge. There are pending improvements and > features people are working on, but I don’t believe I know of us trying to > merge bugs into trunk. If there are bugs found we should address before > merge, but improvements can happen after trunk merge (depending on timing, > before merge is fine). > > I do hope this comment doesn’t cause a rabbit hole of “what is a bug vs > what is an improvement”….” > > No doubts on my end personally > > Excited to see Accord landing :-) > > On Tue, 31 Jan 2023 at 15:35, David Capwell <dcapw...@apple.com> wrote: > >> About this merge - I would personally love to see everything rebased and >> things rerun in CI which no one has a doubt it will happen >> >> >> Yes, I think this is expected behavior that must happen before adding to >> trunk, or even just rebasing the feature branch; CI must be passing and >> flaky tests (as you called out) must be resolved. Nothing is special from >> a process point of view as this is already the expected process. I know >> Caleb is working on it and uses my circleci script, which leverages the >> _repeat job to help flesh this out. With the merge to trunk it “should” >> detect all new tests and run them, so its fair to expect this to be stable. >> >> I am not sure I understand this - >> >> >> Sorry, I was trying to be explicit on the current state of test failures, >> but was not clear. >> >> Accord adds a new system table to make state durable (similar to the >> paxos table). This added table causes a well known subset of python dtests >> to fail as they are checking for the set of keyspaces/tables and detecting >> a new one added, causing the test to fail. We have been ignoring these >> rather than changing python-dtest that way we pull in latest changes there >> rather than forking and freezing the branch we use. We will 100% update >> python dtest along with the merge to trunk similar to all other >> cross-cutting work done. Trunk must not fail due to this table getting >> added and must 100% be dealt with at the same time we merge. >> >> Do you plan to merge before fixing the tests? I am confused >> >> >> Yes, its a blocker for trunk merge. >> >> I think Henrik brought an interesting point about releasable trunk. I >> don’t think that we are yet there with the Jenkins issues, but we are >> getting close and we strive for that, no? >> >> >> As Josh calls out, we require releasable trunk as defined by Circle CI >> (change was made in 4.1); my understanding is that people are working to >> improve Jenkins, and once this work is done we can migrate off Circle CI. >> Until that happens we need to make sure Circle CI is stable for every >> commit. >> >> 3) what is the story with the ninja fixes? >> >> >> Can’t speak for 100% of them, but from the ones I have seen it boils down >> to “our merge process relies on humans to do the right thing, and people >> mess up some times”, which sadly is true in trunk. To avoid calling anyone >> out other than myself, lets look at a specific one that I did >> >> $ git show f8243f41c9e96c4a0390558086ece078b0b6b15c >> commit f8243f41c9e96c4a0390558086ece078b0b6b15c >> Author: David Capwell <dcapw...@apache.org> >> Date: Mon Jan 9 13:20:58 2023 -0800 >> >> Ninja: Add AccordTestUtils.parse which was missing in the latest >> commit >> >> diff --git >> a/test/unit/org/apache/cassandra/service/accord/AccordTestUtils.java >> b/test/unit/org/apache/cassandra/service/accord/AccordTestUtils.java >> index 4adad32d8a..20142c439b 100644 >> --- a/test/unit/org/apache/cassandra/service/accord/AccordTestUtils.java >> +++ b/test/unit/org/apache/cassandra/service/accord/AccordTestUtils.java >> @@ -174,6 +174,14 @@ public class AccordTestUtils >> return statement.createTxn(ClientState.forInternalCalls(), >> options); >> } >> >> + public static TransactionStatement parse(String query) >> + { >> + TransactionStatement.Parsed parsed = >> (TransactionStatement.Parsed) QueryProcessor.parseStatement(query); >> + Assert.assertNotNull(parsed); >> + TransactionStatement statement = (TransactionStatement) >> parsed.prepare(ClientState.forInternalCalls()); >> + return statement; >> + } >> + >> public static Txn createTxn(int readKey, int... writeKeys) >> { >> >> StringBuilder sb = new StringBuilder("BEGIN TRANSACTION\n”); >> >> I merged but a specific method was missing, this caused the branch to >> fail to compile so I ninja patched by adding that method back; this was >> reviewed in slack with Benedict (if I remember correctly). >> >> I made a human judgement to merge via GH UI and bypass my merge scripts, >> as I had a Circle CI build showing success… I messed up and had to go back >> and fix it after commit. >> >> It is frustrating and I can imagine how after that much work around >> Accord people just want it in >> >> >> I do not believe anyone is arguing to merge Accord with bugs… bugs we >> know about are blockers for merge. There are pending improvements and >> features people are working on, but I don’t believe I know of us trying to >> merge bugs into trunk. If there are bugs found we should address before >> merge, but improvements can happen after trunk merge (depending on timing, >> before merge is fine). >> >> I do hope this comment doesn’t cause a rabbit hole of “what is a bug vs >> what is an improvement”…. >> >> >> On Jan 31, 2023, at 8:20 AM, Josh McKenzie <jmcken...@apache.org> wrote: >> >> Don't we follow a principle of always shippable trunk? This was actually >> a reason why I sidelined the talk about post-merge review, because it >> implies that the code wasn't "good enough"/perfect when it was first merged. >> >> We follow a principle of "always shippable trunk according to circleci" >> as of cutting 4.1, which has been adhered to in this case. There's every >> likelihood that ASF CI will detonate when this is merged in, but it's also >> already *currently* detonating repeatedly which we're working on so... >> /shrug >> >> If we block Accord merging on green ASF CI, we'll be in the same boat we >> were in with 4.1 and never ship until we tear certain things down to the >> studs and rebuild them. I don't think it's reasonable to put that burden >> (ASF CI must be green) on *any* ticket right now, much less one that has >> a potentially high integration cost to the entire project if it stagnates >> over time. >> >> plenty of experience with situations where even if an engineer, including >> myself sometimes, wanted to work on fixing some technical debt, the >> employer's project management processes simply wouldn't prioritize that >> work and it was left for years >> >> Seems like it's a healthy mix of debt and bikeshedding for us >> historically. The former we don't want to sneak in, the latter, well, that >> I'm less concerned about personally. :) >> >> And I think part of the "two committers +1" bar is about trying to keep >> our debt low. >> >> On Tue, Jan 31, 2023, at 2:45 AM, Benedict wrote: >> >> >> Do you mean as part of a blocking review, or just in general? >> >> I don’t honestly understand the focus on ninja fixes or rebasing, in >> either context. Why would a project rebase its work until ready to merge? >> Why would you worry that a team well resourced with experienced >> contributors (30+yrs between them) don’t know how to work correctly with >> ninja fixes? These are all minor details, surely? >> >> I understand your concern around flaky tests, particularly since it seems >> other work has let some slip through. I don’t believe this is a blocking >> review concern, as it represents its own blocking requirement. I believe I >> have responded positively to this query already though? >> >> >> On 31 Jan 2023, at 01:46, Ekaterina Dimitrova <e.dimitr...@gmail.com> >> wrote: >> >> >> >> Benedict, Is it an issue to ask whether flaky tests will be addressed as >> per our project agreement? Or about ninja fixes and why a branch was not >> rebased during development? What did I miss? >> >> By the way I do not ask these questions to block anyone, even suggested >> to help with CI…it’s a pity this part was dismissed… >> >> I can see that Caleb is handling the things around the merge ticket with >> high attention to the details as always. I can only thank him! >> >> At this point I see this thread mostly as - this is the first feature >> branch since quite some time, people are unsure about certain things - >> let’s see how we handle this for the next time to be more efficient and >> clear. I think you already took some actions in your suggestion earlier >> today with the document around comments. >> >> On Mon, 30 Jan 2023 at 20:16, Benedict <bened...@apache.org> wrote: >> >> >> Review should primarily ask: "is this correct?" and "could this be done >> differently (clearer, faster, more correct, etc)?" Blocking reviews >> especially, because why else would a reasonable contributor want to block >> progress? >> >> These questions can of course be asked of implementation details for any >> CEP. >> >> I have said before, a proposal to conduct a blocking review of this kind >> - if very late in my view - would be valid, though timeline would have to >> be debated. >> >> Reviewers with weaker aspirations have plenty of time available to them - >> two weeks have already passed, and another couple will likely yet (there >> isn't a rush). But it is novel to propose that such optional reviews be >> treated as blocking. >> >> >> On 30 Jan 2023, at 23:04, Henrik Ingo <henrik.i...@datastax.com> wrote: >> >> >> >> Ooops, I missed copy pasting this reply into my previous email: >> >> On Fri, Jan 27, 2023 at 11:21 PM Benedict <bened...@apache.org> wrote: >> >> >> I'm realizing in retrospect this leaves ambiguity >> >> >> Another misreading at least of the *intent* of these clauses, is that >> they were to ensure that concerns about a *design and approach* are >> listened to, and addressed to the satisfaction of interested parties. It >> was essentially codifying the project’s long term etiquette around pieces >> of work with either competing proposals or fundamental concerns. It has >> historically helped to avoid escalation to vetoes, or reverting code after >> commit. >> >> It wasn’t intended that *any* reason might be invoked, as seems to have >> been inferred, and perhaps this should be clarified, though I had hoped it >> would be captured by the word “reasonable". Minor concerns that haven’t >> been caught by the initial review process can always be addressed in >> follow-up work, as is very common. >> >> >> Wouldn't you expect such concerns to at least partially now have been >> covered in the CEP discussion, up front? I would expect at most at this >> stage someone could validate that the implementation follows the CEP. But I >> wouldn't expect a debate on competing approaches at this stage. >> >> henrik >> -- >> >> >> *Henrik Ingo* >> >> *c*. +358 40 569 7354 >> >> *w*. *www.datastax.com <http://www.datastax.com/>* >> >> >>