Re: [DISCUSS] Harry in-tree

2023-11-27 Thread Jacek Lewandowski
+1

- - -- --- -  -
Jacek Lewandowski


On Mon, 27 Nov 2023 at 22:11, Ekaterina Dimitrova wrote:

> +1, also, Alex, just an idea - maybe you'd like to give a virtual talk, as
> part of the contributors meetings?
>
>
> On Monday, 27 November 2023, Yifan Cai wrote:
>
>> +1
>> --
>> *From:* Sam Tunnicliffe
>> *Sent:* Tuesday, November 28, 2023 2:43:51 AM
>> *To:* dev
>> *Subject:* Re: [DISCUSS] Harry in-tree
>>
>> Definite +1 to bringing harry-core in tree.
>>
>> On 24 Nov 2023, at 15:43, Alex Petrov  wrote:
>>
>> Hi everyone,
>>
>> With TCM landed, there will be way more Harry tests in-tree: we are using
>> it for many coordination tests, and there's now a simulator test that uses
>> Harry. During development, Harry has allowed us to uncover and resolve
>> numerous elusive edge cases.
>>
>> I had conversations with several folks, and wanted to propose to move
>> harry-core to Cassandra test tree. This will substantially
>> simplify/streamline co-development of Cassandra and Harry. With a new
>> HistoryBuilder API that has helped to find and trigger [1] [2] and [3], it
>> will also be much more approachable.
>>
>> Besides making it easier for everyone to develop new fuzz tests, it will
>> also substantially lower the barrier to entry. Currently, debugging an
>> issue found by Harry involves a cumbersome process of rebuilding and
>> transferring jars between Cassandra and Harry, depending on which side you
>> modify. This not only hampers efficiency but also deters broader adoption.
>> By merging harry-core into the Cassandra test tree, we eliminate this
>> barrier.
>>
>> Thank you,
>> --Alex
>>
>> [1] https://issues.apache.org/jira/browse/CASSANDRA-19011
>> [2] https://issues.apache.org/jira/browse/CASSANDRA-18993
>> [3] https://issues.apache.org/jira/browse/CASSANDRA-18932
>>
>>
>>


Re: [DISCUSSION] CEP-38: CQL Management API

2023-11-27 Thread Francisco Guerrero
Hi Maxim,

Thanks for working on this CEP! 

The CEP addresses some of the features we have been discussing for Cassandra 
Sidecar. For example, a dedicated admin port, moving towards more CQL-like 
interfacing with Cassandra, among others.

I think virtual tables were intended to narrow the gap between JMX and CQL. 
However, virtual tables cannot trigger node operations, so CEP-38 is finally 
addressing that gap.

I look forward to collaborating on this CEP; I think Cassandra and its 
ecosystem will greatly benefit from this enhancement.

Best,
- Francisco

On 2023/11/13 18:08:54 Maxim Muzafarov wrote:
> Hello everyone,
> 
> While we are still waiting for the review to make the settings virtual
> table updatable (CASSANDRA-15254), which will improve the
> configuration management experience for users, I'd like to take
> another step forward and improve the C* management approach we have as
> a whole. This approach aims to make all Cassandra management commands
> accessible via CQL, but not only that.
> 
> The problem of making commands accessible via CQL presents a complex
> challenge, especially if we aim to minimize code duplication across
> the implementation of management operations for different APIs and
> reduce the overall maintenance burden. The proposal's scope goes
> beyond simply introducing a new CQL syntax. It encompasses several key
> objectives for C* management operations, beyond their availability
> through CQL:
> - Ensure consistency across all public APIs we support, including JMX
> MBeans and the newly introduced CQL. Users should see consistent
> command specifications and arguments, irrespective of whether they're
> using an API or a CLI;
> - Reduce source code maintenance costs. With this new approach, when a
> new command is implemented, it should automatically become available
> across JMX MBeans, nodetool, CQL, and Cassandra Sidecar, eliminating
> the need for additional coding;
> - Maintain backward compatibility, ensuring that existing setups and
> workflows continue to work the same way as they do today;
> 
> I would suggest discussing the overall design concept first, and then
> diving into the CQL command syntax and other details once we've found
> common ground on the community's vision. However, regardless of these
> details, I would appreciate any feedback on the design.
> 
> I look forward to your comments!
> 
> Please, see the design document: CEP-38: CQL Management API
> https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-38%3A+CQL+Management+API
> 
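
The "implement a command once, expose it everywhere" objective quoted above
can be pictured with a small sketch. This is only an illustration in Python,
not the design from the CEP: the Command class, the REGISTRY dictionary, the
set_log_level command and both front-end functions are hypothetical names
invented here, and the "name key=value" statement syntax is made up rather
than proposed CQL.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Command:
    """One management command, defined a single time (hypothetical structure)."""
    name: str
    params: Tuple[str, ...]          # ordered parameter names
    handler: Callable[..., str]      # the only place the logic lives


# Hypothetical registry shared by every front end.
REGISTRY: Dict[str, Command] = {}


def register(cmd: Command) -> None:
    REGISTRY[cmd.name] = cmd


def run_from_cli(argv: List[str]) -> str:
    """CLI-style front end: positional arguments mapped onto parameter names."""
    cmd = REGISTRY[argv[0]]
    return cmd.handler(**dict(zip(cmd.params, argv[1:])))


def run_from_statement(statement: str) -> str:
    """Statement-style front end using an invented 'name key=value ...' syntax."""
    name, *pairs = statement.split()
    cmd = REGISTRY[name]
    kwargs = dict(pair.split("=", 1) for pair in pairs)
    return cmd.handler(**kwargs)


# The command is registered once; both front ends pick it up automatically.
register(Command("set_log_level", ("level",),
                 lambda level: f"log level set to {level}"))

cli_result = run_from_cli(["set_log_level", "DEBUG"])
stmt_result = run_from_statement("set_log_level level=DEBUG")
```

Because both front ends resolve the same registry entry, they cannot drift
apart in argument names or behaviour, which is the consistency property the
proposal asks for.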


Re: Request to create a Jira account

2023-11-27 Thread Rajneesh
Yes, I do have my account now.

Thanks for the warm welcome. I'm really excited.

-Regards


On Mon, Nov 27, 2023 at 2:39 PM Brandon Williams  wrote:

> You should hopefully have your account now, welcome to Apache
> Cassandra, Rajneesh!
>
> Kind Regards,
> Brandon
>
> On Mon, Nov 27, 2023 at 4:30 PM Rajneesh  wrote:
> >
> > Thank you, Brandon!
> >
> > -Regards
> >
> >
> > On Mon, Nov 27, 2023 at 1:34 PM Brandon Williams wrote:
> >>
> >> Please request one here: https://selfserve.apache.org/jira-account.html
> >>
> >> Kind Regards,
> >> Brandon
> >>
> >> On Mon, Nov 27, 2023 at 3:29 PM Rajneesh  wrote:
> >> >
> >> > Hi,
> >> >
> >> > I am looking to contribute to Cassandra.
> >> >
> >> > I am going through the How-To here -
> >> > https://cassandra.apache.org/_/development/index.html
> >> >
> >> > Following what is mentioned in the above guide, I'm planning to start
> >> > from the Low Hanging Fruit.
> >> >
> >> > May I please have a Jira account?
> >> >
> >> > - Regards
>


Re: Request to create a Jira account

2023-11-27 Thread Brandon Williams
You should hopefully have your account now, welcome to Apache
Cassandra, Rajneesh!

Kind Regards,
Brandon

On Mon, Nov 27, 2023 at 4:30 PM Rajneesh  wrote:
>
> Thank you, Brandon!
>
> -Regards
>
>
> On Mon, Nov 27, 2023 at 1:34 PM Brandon Williams  wrote:
>>
>> Please request one here: https://selfserve.apache.org/jira-account.html
>>
>> Kind Regards,
>> Brandon
>>
>> On Mon, Nov 27, 2023 at 3:29 PM Rajneesh  wrote:
>> >
>> > Hi,
>> >
>> > I am looking to contribute to Cassandra.
>> >
>> > I am going through the How-To here - 
>> > https://cassandra.apache.org/_/development/index.html
>> >
>> > Following what is mentioned in the above guide, I'm planning to start from 
>> > the Low Hanging Fruit.
>> >
>> > May I please have a Jira account?
>> >
>> > - Regards


Re: Request to create a Jira account

2023-11-27 Thread Rajneesh
Thank you, Brandon!

-Regards


On Mon, Nov 27, 2023 at 1:34 PM Brandon Williams  wrote:

> Please request one here: https://selfserve.apache.org/jira-account.html
>
> Kind Regards,
> Brandon
>
> On Mon, Nov 27, 2023 at 3:29 PM Rajneesh  wrote:
> >
> > Hi,
> >
> > I am looking to contribute to Cassandra.
> >
> > I am going through the How-To here -
> > https://cassandra.apache.org/_/development/index.html
> >
> > Following what is mentioned in the above guide, I'm planning to start
> > from the Low Hanging Fruit.
> >
> > May I please have a Jira account?
> >
> > - Regards
>


Re: Request to create a Jira account

2023-11-27 Thread Brandon Williams
Please request one here: https://selfserve.apache.org/jira-account.html

Kind Regards,
Brandon

On Mon, Nov 27, 2023 at 3:29 PM Rajneesh  wrote:
>
> Hi,
>
> I am looking to contribute to Cassandra.
>
> I am going through the How-To here - 
> https://cassandra.apache.org/_/development/index.html
>
> Following what is mentioned in the above guide, I'm planning to start from 
> the Low Hanging Fruit.
>
> May I please have a Jira account?
>
> - Regards


Re: CEP-21 - Transactional cluster metadata merged to trunk

2023-11-27 Thread Josh McKenzie
> on our internal CI system
Some more context:

This environment adheres to the requirements we laid out in "pre-commit CI on 
Cassandra", with a couple of required differences. We don't yet include the 
resource restriction detail in the test report; it's on my backlog of things 
to do, but I can confirm that less CPU and no more memory than the ASF CI 
equivalent is being allocated for each test suite. I also had to go the route 
of extracting a blend of what's in circle and what's in ASF CI (in terms of 
test suites, filtering, etc) since neither represented a complete view of our 
CI ecosystem; there are currently things executed in either environment that 
are not executed in the other.

I've been tracking the upstreaming of that declarative combination in 
CASSANDRA-18731 but have had some other priorities take front-seat (i.e. 
getting a new CI system based on that working since neither upstream ASF CI nor 
circle are re-usable in their current form) and will be upstreaming that ASAP. 
https://issues.apache.org/jira/browse/CASSANDRA-18731

I've left a pretty long comment on CASSANDRA-18731 about the structure of 
things and where my opinion falls; *I think we need a separate DISCUSS thread 
on the ML about CI and what we require for pre-commit smoke* suites: 
https://issues.apache.org/jira/browse/CASSANDRA-18731?focusedCommentId=17790270&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17790270

The TL;DR:
> With an *incredibly large* patch in the form of TCM (88k+ LoC, 900+ files 
> touched), we have less than a .002% test failure injection rate using the 
> above restricted smoke heuristic, and many of them look to be circle ci env 
> specific and not asf ci.

From a cursory inspection it looks like most of the breakages being tracked on 
the ticket Sam linked for TCM are likely to be circle env specific (new *nix 
optimized deletion having a race, OOM's, etc). The TCM merge is actually a 
great forcing function for us to surface anything env specific in terms of 
timing and resourcing up-front; I'm glad we have this opportunity but it's 
unfortunate that it's been interpreted as merging w/out passing CI as opposed 
to having some env-difference specific kinks to work out.

*This was an incredibly huge merge.* For comparison, I just did a --stat on the 
merge for CASSANDRA-8099:
> 645 files changed, 49381 insertions(+), 42227 deletions(-)

TCM from the C* repo:
>  934 files changed, 66185 insertions(+), 21669 deletions(-)
My gut tells me it's basically impossible to have a merge of this size that 
doesn't disrupt what it's merging into, or the authors just end up slowly dying 
in rebase hell. Or both. This was a massive undertaking and compared to our 
past on this project, has had an incredibly low impact on the target it was 
merged into and the authors are rapidly burning down failures.

To the authors - great work, and thanks for being so diligent on following up 
on any disruptions this body of work has caused to other contributors' 
environments.

To the folks who were disrupted - I get it. This is deeply frustrating; green 
CI has long been a white whale for many of us, and having something merge over 
a US holiday week on an incredibly active project where we don't all have time 
to keep up with everything can make things like this feel like a huge surprise. 
It's incredibly unfortunate that the timing of our transition to this new CI 
system and working out the kinks is when this behemoth of a merge needed to 
come through, but there is a silver lining.

We're making great strides. Let's not lose sight of our growth because of the 
pain in the moment of it.

~Josh

p.s. - for the record, I don't think we should hold off on merging things just 
because some folks are on holiday. :)

On Mon, Nov 27, 2023, at 3:38 PM, Sam Tunnicliffe wrote:
> I ought to clarify, we did actually have green CI modulo 3 flaky tests on our 
> internal CI system. I've attached the test artefacts to CASSANDRA-18330 
> now[1][2]: 2 of the 3 failures are upgrade dtests, with 1 other python dtest 
> failure noted. None of these were reproducible in a dev setup, so we 
> suspected them to be environmental and intended to merge before returning to 
> confirm that. The "known" failures that we mentioned in the email that 
> started this thread were ones observed by Mick running the cep-21-tcm branch 
> through Circle before merging.  
> 
> As the CEP-21 changeset was approaching 88k LoC touching over 900 files, 
> permanently rebasing as we tried to eradicate every flaky test was simply 
> unrealistic, especially as other significant patches continued to land in 
> trunk. With that in mind, we took the decision to merge so that we could 
> focus on actually removing any remaining instability.
> 
> [1] https://issues.apache.org/jira/secure/attachment/13064727/ci_summary.html
> [2] 
> https://issues.apache.org/jira/secure/attachment/13064728/result_details.tar.gz

Request to create a Jira account

2023-11-27 Thread Rajneesh
Hi,

I am looking to contribute to Cassandra.

I am going through the How-To here -
https://cassandra.apache.org/_/development/index.html

Following what is mentioned in the above guide, I'm planning to start from
the Low Hanging Fruit.

May I please have a Jira account?

- Regards


Re: [DISCUSS] Harry in-tree

2023-11-27 Thread Ekaterina Dimitrova
+1, also, Alex, just an idea - maybe you'd like to give a virtual talk, as
part of the contributors meetings?


On Monday, 27 November 2023, Yifan Cai wrote:

> +1
> --
> *From:* Sam Tunnicliffe
> *Sent:* Tuesday, November 28, 2023 2:43:51 AM
> *To:* dev
> *Subject:* Re: [DISCUSS] Harry in-tree
>
> Definite +1 to bringing harry-core in tree.
>
> On 24 Nov 2023, at 15:43, Alex Petrov  wrote:
>
> Hi everyone,
>
> With TCM landed, there will be way more Harry tests in-tree: we are using
> it for many coordination tests, and there's now a simulator test that uses
> Harry. During development, Harry has allowed us to uncover and resolve
> numerous elusive edge cases.
>
> I had conversations with several folks, and wanted to propose to move
> harry-core to Cassandra test tree. This will substantially
> simplify/streamline co-development of Cassandra and Harry. With a new
> HistoryBuilder API that has helped to find and trigger [1] [2] and [3], it
> will also be much more approachable.
>
> Besides making it easier for everyone to develop new fuzz tests, it will
> also substantially lower the barrier to entry. Currently, debugging an
> issue found by Harry involves a cumbersome process of rebuilding and
> transferring jars between Cassandra and Harry, depending on which side you
> modify. This not only hampers efficiency but also deters broader adoption.
> By merging harry-core into the Cassandra test tree, we eliminate this
> barrier.
>
> Thank you,
> --Alex
>
> [1] https://issues.apache.org/jira/browse/CASSANDRA-19011
> [2] https://issues.apache.org/jira/browse/CASSANDRA-18993
> [3] https://issues.apache.org/jira/browse/CASSANDRA-18932
>
>
>


Re: CEP-21 - Transactional cluster metadata merged to trunk

2023-11-27 Thread Sam Tunnicliffe
I ought to clarify, we did actually have green CI modulo 3 flaky tests on our 
internal CI system. I've attached the test artefacts to CASSANDRA-18330 
now[1][2]: 2 of the 3 failures are upgrade dtests, with 1 other python dtest 
failure noted. None of these were reproducible in a dev setup, so we suspected 
them to be environmental and intended to merge before returning to confirm 
that. The "known" failures that we mentioned in the email that started this 
thread were ones observed by Mick running the cep-21-tcm branch through Circle 
before merging.  

As the CEP-21 changeset was approaching 88k LoC touching over 900 files, 
permanently rebasing as we tried to eradicate every flaky test was simply 
unrealistic, especially as other significant patches continued to land in 
trunk. With that in mind, we took the decision to merge so that we could focus 
on actually removing any remaining instability.

[1] https://issues.apache.org/jira/secure/attachment/13064727/ci_summary.html
[2] 
https://issues.apache.org/jira/secure/attachment/13064728/result_details.tar.gz


> On 27 Nov 2023, at 10:28, Berenguer Blasi  wrote:
> 
> Hi,
> 
> I have written this email like 10 times before sending it and I can't manage 
> to avoid giving it a negative spin. So pardon my English or poor choice of 
> words in advance and try to read it in a positive way.
> 
> It is really demotivating to me to see things getting merged without green 
> CI. I had to go through a herculean effort and pain (at least to me) to keep 
> rebasing the TTL patch continuously (a huge one imo) when it would have been 
> altogether much easier to merge, post-fix and post-add downgradability along 
> the TCM merge lines.
> 
> If this merge-then-fix approach is a thing, I would like it clarified so we 
> can all benefit from it and avoid the big-patch rebase pain.
> 
> Regards
> 
> On 27/11/23 10:38, Jacek Lewandowski wrote:
>> Hi,
>> 
>> I'm happy to hear that the feature got merged. Though, I share Benjamin's 
>> worries about that being a bad precedent.
>> 
>> I don't think it makes sense to do repeated runs in this particular case. 
>> Detecting flaky tests would not prove anything; they can be caused by this 
>> patch, but we would not know that for sure. We would have to have a similar 
>> build with the same tests repeated to compare. It would take time and 
>> resources, and in the end, we will have to fix those flaky tests regardless 
>> of whether they were caused by this change. IMO, it makes sense to do a 
>> repeated run of the new tests, though. Aside from that, we can also consider 
>> making it easier and more automated for the developer to determine whether a 
>> particular flakiness comes from a feature branch one wants to merge.
>> 
>> thanks,
>> Jacek
>> 
>> 
>> On Mon, 27 Nov 2023 at 10:15, Benjamin Lerer wrote:
>>> Hi,
>>> 
>>> I must admit that I have been surprised by this merge and this following 
>>> email. We had lengthy discussions recently and the final agreement was that 
>>> the requirement for a merge was a green CI.
>>> I could understand that for some reasons as a community we could wish to 
>>> make some exceptions. In this present case there was no official discussion 
>>> to ask for an exception.
>>> I believe that this merge creates a bad precedent where anybody can feel 
>>> entitled to merge without a green CI and disregard any previous community 
>>> agreement.
>>> 
>>> On Sat, 25 Nov 2023 at 09:22, Mick Semb Wever wrote:
>>>> 
>>>> Great work Sam, Alex & Marcus !
>>>> 
>>>> 
>>>>> There are about 15-20 flaky or failing tests in total, spread over 
>>>>> several test jobs[2] (i.e. single digit failures in a few of these). We 
>>>>> have filed JIRAs for the failures and are working on getting those fixed 
>>>>> as a top priority. CASSANDRA-19055[3] is the umbrella ticket for this 
>>>>> follow up work.
>>>>> 
>>>>> There are also a number of improvements we will work on in the coming 
>>>>> weeks, we will file JIRAs for those early next week and add them as 
>>>>> subtasks to CASSANDRA-19055.
>>>> 
>>>> 
>>>> Can we get these tests temporarily annotated as skipped while all the 
>>>> subtickets to 19055 are being worked on ? 
>>>> 
>>>> As we have seen from CASSANDRA-18166 and CASSANDRA-19034 there's a lot of 
>>>> overhead now on 5.0 tickets having to navigate around these failures in 
>>>> trunk CI runs.
>>>> 
>>>> Also, we're still trying to figure out how to do repeated runs for a patch 
>>>> so big… (the list of touched tests was too long for circleci, I need to 
>>>> figure out what the limit is and chunk it into separate circleci configs) 
>>>> … and it probably makes sense to wait until most of 19055 is done (or 
>>>> tests are temporarily annotated as skipped).
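
The chunking idea mentioned above (splitting a too-long list of touched tests
across several CI configs) could look roughly like the sketch below. It is an
illustration only: chunk_tests and max_per_config are hypothetical names, the
test names are placeholders, and the real CircleCI limit is unknown here, so
the value 3 is an arbitrary stand-in.

```python
from typing import Iterator, List


def chunk_tests(tests: List[str], max_per_config: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `tests`, each at most `max_per_config` long."""
    if max_per_config < 1:
        raise ValueError("max_per_config must be at least 1")
    for start in range(0, len(tests), max_per_config):
        yield tests[start:start + max_per_config]


# Placeholder test names; a real list would come from the patch's diff.
touched = [f"Test{i}" for i in range(7)]

# One chunk per generated CI config; 3 is an assumed limit, not CircleCI's.
chunks = list(chunk_tests(touched, 3))
```

Each chunk preserves the original order, so concatenating the chunks gives
back the full list and no test is run twice or dropped.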
 
 



Re: [DISCUSS] Harry in-tree

2023-11-27 Thread Yifan Cai
+1

From: Sam Tunnicliffe
Sent: Tuesday, November 28, 2023 2:43:51 AM
To: dev
Subject: Re: [DISCUSS] Harry in-tree

Definite +1 to bringing harry-core in tree.

On 24 Nov 2023, at 15:43, Alex Petrov  wrote:

Hi everyone,

With TCM landed, there will be way more Harry tests in-tree: we are using it 
for many coordination tests, and there's now a simulator test that uses Harry. 
During development, Harry has allowed us to uncover and resolve numerous 
elusive edge cases.

I had conversations with several folks, and wanted to propose to move 
harry-core to Cassandra test tree. This will substantially simplify/streamline 
co-development of Cassandra and Harry. With a new HistoryBuilder API that has 
helped to find and trigger [1] [2] and [3], it will also be much more 
approachable.

Besides making it easier for everyone to develop new fuzz tests, it will also 
substantially lower the barrier to entry. Currently, debugging an issue found 
by Harry involves a cumbersome process of rebuilding and transferring jars 
between Cassandra and Harry, depending on which side you modify. This not only 
hampers efficiency but also deters broader adoption. By merging harry-core into 
the Cassandra test tree, we eliminate this barrier.

Thank you,
--Alex

[1] https://issues.apache.org/jira/browse/CASSANDRA-19011
[2] https://issues.apache.org/jira/browse/CASSANDRA-18993
[3] https://issues.apache.org/jira/browse/CASSANDRA-18932



Re: [DISCUSS] Harry in-tree

2023-11-27 Thread Sam Tunnicliffe
Definite +1 to bringing harry-core in tree.

> On 24 Nov 2023, at 15:43, Alex Petrov  wrote:
> 
> Hi everyone,
> 
> With TCM landed, there will be way more Harry tests in-tree: we are using it 
> for many coordination tests, and there's now a simulator test that uses 
> Harry. During development, Harry has allowed us to uncover and resolve 
> numerous elusive edge cases.
> 
> I had conversations with several folks, and wanted to propose to move 
> harry-core to Cassandra test tree. This will substantially 
> simplify/streamline co-development of Cassandra and Harry. With a new 
> HistoryBuilder API that has helped to find and trigger [1] [2] and [3], it 
> will also be much more approachable.
> 
> Besides making it easier for everyone to develop new fuzz tests, it will also 
> substantially lower the barrier to entry. Currently, debugging an issue found 
> by Harry involves a cumbersome process of rebuilding and transferring jars 
> between Cassandra and Harry, depending on which side you modify. This not 
> only hampers efficiency but also deters broader adoption. By merging 
> harry-core into the Cassandra test tree, we eliminate this barrier.
> 
> Thank you,
> --Alex
> 
> [1] https://issues.apache.org/jira/browse/CASSANDRA-19011
> [2] https://issues.apache.org/jira/browse/CASSANDRA-18993
> [3] https://issues.apache.org/jira/browse/CASSANDRA-18932



Re: [DISCUSS] Harry in-tree

2023-11-27 Thread Francisco Guerrero
+1 (nb)

On 2023/11/24 16:32:33 Alex Petrov wrote:

> > (We can publish jar files at C* release time if there's a call for this, 
> > doesn't really matter if they don't contain changes.). 
> 
> I would say we should publish dtest jar files when releasing, this would be 
> very helpful! 

Yes, projects in the Cassandra ecosystem would benefit a lot if we started 
publishing dtest jars. I am a big +1 on this as well
 
> 
> 
> 
> On Fri, Nov 24, 2023, at 5:23 PM, Mick Semb Wever wrote:
> > 
> >  
> >  
> >> I had conversations with several folks, and wanted to propose to move 
> >> harry-core to Cassandra test tree. This will substantially 
> >> simplify/streamline co-development of Cassandra and Harry. With a new 
> >> HistoryBuilder API that has helped to find and trigger [1] [2] and [3], it 
> >> will also be much more approachable.
> > 
> > 
> > Yes.  Any reason to have releases for users outside of Cassandra ? (We can 
> > publish jar files at C* release time if there's a call for this, doesn't 
> > really matter if they don't contain changes.).  And is there any 
> > cross-branch/version interaction, like there is with the jvm-dtest-api ?
> > 
> > 
> >  
> 


Re: [VOTE] Release Apache Cassandra 5.0-beta1

2023-11-27 Thread Mike Adamson
> Furthermore, we don't even know if it's still an issue after 19034 was
> committed.

It's a difficult one to reproduce because we don't have access to the harry
script that generated the error in the first place. I am investigating it
but without the original reproduction it may take some time.

On Mon, 27 Nov 2023 at 16:19, Mick Semb Wever  wrote:

>
>
> On Mon, 27 Nov 2023 at 16:28, Brandon Williams  wrote:
>
>> On Mon, Nov 27, 2023 at 9:25 AM Mick Semb Wever  wrote:
>> >
>> > It was agreed to move them to 5.0-rc
>>
>> Where?
>>
>
>
> Typo, "it" not "them".
> I'm only talking about 19011.  The others were already 5.0-rc, or in fact
> forward from 5.0.x.
>
> Here:
> https://issues.apache.org/jira/browse/CASSANDRA-19011?focusedCommentId=17789202&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17789202
>
> Furthermore, we don't even know if it's still an issue after 19034 was
> committed.  We want to figure this out before the vote window closes.
>
>
>
>
>


-- 
*Mike Adamson*
Engineering

+1 650 389 6000 | datastax.com


Re: [DISCUSS] Harry in-tree

2023-11-27 Thread David Capwell
+1 to in-tree

> On Nov 27, 2023, at 9:17 AM, Benjamin Lerer  wrote:
> 
> +1
> 
> On Mon, 27 Nov 2023 at 18:01, Brandon Williams wrote:
>> I am +1 on including Harry in-tree.
>> 
>> Kind Regards,
>> Brandon
>> 
>> On Fri, Nov 24, 2023 at 9:44 AM Alex Petrov wrote:
>> >
>> > Hi everyone,
>> >
>> > With TCM landed, there will be way more Harry tests in-tree: we are using 
>> > it for many coordination tests, and there's now a simulator test that uses 
>> > Harry. During development, Harry has allowed us to uncover and resolve 
>> > numerous elusive edge cases.
>> >
>> > I had conversations with several folks, and wanted to propose to move 
>> > harry-core to Cassandra test tree. This will substantially 
>> > simplify/streamline co-development of Cassandra and Harry. With a new 
>> > HistoryBuilder API that has helped to find and trigger [1] [2] and [3], it 
>> > will also be much more approachable.
>> >
>> > Besides making it easier for everyone to develop new fuzz tests, it will 
>> > also substantially lower the barrier to entry. Currently, debugging an 
>> > issue found by Harry involves a cumbersome process of rebuilding and 
>> > transferring jars between Cassandra and Harry, depending on which side you 
>> > modify. This not only hampers efficiency but also deters broader adoption. 
>> > By merging harry-core into the Cassandra test tree, we eliminate this 
>> > barrier.
>> >
>> > Thank you,
>> > --Alex
>> >
>> > [1] https://issues.apache.org/jira/browse/CASSANDRA-19011
>> > [2] https://issues.apache.org/jira/browse/CASSANDRA-18993
>> > [3] https://issues.apache.org/jira/browse/CASSANDRA-18932



Re: [DISCUSS] Harry in-tree

2023-11-27 Thread Benjamin Lerer
+1

On Mon, 27 Nov 2023 at 18:01, Brandon Williams wrote:

> I am +1 on including Harry in-tree.
>
> Kind Regards,
> Brandon
>
> On Fri, Nov 24, 2023 at 9:44 AM Alex Petrov  wrote:
> >
> > Hi everyone,
> >
> > With TCM landed, there will be way more Harry tests in-tree: we are
> > using it for many coordination tests, and there's now a simulator test that
> > uses Harry. During development, Harry has allowed us to uncover and resolve
> > numerous elusive edge cases.
> >
> > I had conversations with several folks, and wanted to propose to move
> > harry-core to Cassandra test tree. This will substantially
> > simplify/streamline co-development of Cassandra and Harry. With a new
> > HistoryBuilder API that has helped to find and trigger [1] [2] and [3], it
> > will also be much more approachable.
> >
> > Besides making it easier for everyone to develop new fuzz tests, it will
> > also substantially lower the barrier to entry. Currently, debugging an
> > issue found by Harry involves a cumbersome process of rebuilding and
> > transferring jars between Cassandra and Harry, depending on which side you
> > modify. This not only hampers efficiency but also deters broader adoption.
> > By merging harry-core into the Cassandra test tree, we eliminate this
> > barrier.
> >
> > Thank you,
> > --Alex
> >
> > [1] https://issues.apache.org/jira/browse/CASSANDRA-19011
> > [2] https://issues.apache.org/jira/browse/CASSANDRA-18993
> > [3] https://issues.apache.org/jira/browse/CASSANDRA-18932
>


Re: [DISCUSS] Harry in-tree

2023-11-27 Thread Brandon Williams
I am +1 on including Harry in-tree.

Kind Regards,
Brandon

On Fri, Nov 24, 2023 at 9:44 AM Alex Petrov  wrote:
>
> Hi everyone,
>
> With TCM landed, there will be way more Harry tests in-tree: we are using it 
> for many coordination tests, and there's now a simulator test that uses 
> Harry. During development, Harry has allowed us to uncover and resolve 
> numerous elusive edge cases.
>
> I had conversations with several folks, and wanted to propose to move 
> harry-core to Cassandra test tree. This will substantially 
> simplify/streamline co-development of Cassandra and Harry. With a new 
> HistoryBuilder API that has helped to find and trigger [1] [2] and [3], it 
> will also be much more approachable.
>
> Besides making it easier for everyone to develop new fuzz tests, it will also 
> substantially lower the barrier to entry. Currently, debugging an issue found 
> by Harry involves a cumbersome process of rebuilding and transferring jars 
> between Cassandra and Harry, depending on which side you modify. This not 
> only hampers efficiency but also deters broader adoption. By merging 
> harry-core into the Cassandra test tree, we eliminate this barrier.
>
> Thank you,
> --Alex
>
> [1] https://issues.apache.org/jira/browse/CASSANDRA-19011
> [2] https://issues.apache.org/jira/browse/CASSANDRA-18993
> [3] https://issues.apache.org/jira/browse/CASSANDRA-18932


Re: [VOTE] Release Apache Cassandra 5.0-beta1

2023-11-27 Thread Mick Semb Wever
On Mon, 27 Nov 2023 at 16:28, Brandon Williams  wrote:

> On Mon, Nov 27, 2023 at 9:25 AM Mick Semb Wever  wrote:
> >
> > It was agreed to move them to 5.0-rc
>
> Where?
>


Typo, "it" not "them".
I'm only talking about 19011.  The others were already 5.0-rc, or in fact
forward from 5.0.x.

Here:
https://issues.apache.org/jira/browse/CASSANDRA-19011?focusedCommentId=17789202&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17789202

Furthermore, we don't even know if it's still an issue after 19034 was
committed.  We want to figure this out before the vote window closes.


Re: [VOTE] Release Apache Cassandra 5.0-beta1

2023-11-27 Thread Brandon Williams
On Mon, Nov 27, 2023 at 9:25 AM Mick Semb Wever  wrote:
>
> It was agreed to move them to 5.0-rc

Where?


Re: [VOTE] Release Apache Cassandra 5.0-beta1

2023-11-27 Thread Mick Semb Wever
It was agreed to move them to 5.0-rc, and to complete any further
triage/investigation of any tickets deemed potentially critical early this
week before the vote window closes.   SAI is a new feature, not enabled by
default, so IMHO this is an acceptable call.  And some of those tickets
look like the fixes are uncertain, might well be quite involved and could
take weeks.  So again, I'd much rather see a beta1 out, if possible given
how much value there is here, and cut a 5.0-beta2 as more fixes
land (instead of going straight to 5.0-rc1).

19039 applies to multiple branches, and hasn't been marked as critical, so
it wouldn't normally be a beta blocker.  But again, we've got until
Wednesday.

I really don't want to veto and recut a release for a new
feature/improvement (18464), I would rather just include it in 5.0-beta2
(same time spent, more value), I mentioned this in the other thread.


On Mon, 27 Nov 2023 at 13:05, Benjamin Lerer  wrote:

> Looking at the board it is unclear to me why CASSANDRA-19011,
> CASSANDRA-19018, CASSANDRA-18796 and CASSANDRA-18940 are not beta tickets.
> SAI being one of the important features of 5.0 it seems to me that those
> tickets should have been handled for the beta release.
> CASSANDRA-19039 could also be a real problem.
>
>
> On Sun, 26 Nov 2023 at 13:35, Mick Semb Wever wrote:
>
>>
>> Proposing the test build of Cassandra 5.0-beta1 for release.
>>
>> sha1: e0c0c31c7f6db1e3ddb80cef842b820fc27fd0eb
>> Git: https://github.com/apache/cassandra/tree/5.0-beta1-tentative
>> Maven Artifacts:
>> https://repository.apache.org/content/repositories/orgapachecassandra-1319/org/apache/cassandra/cassandra-all/5.0-beta1/
>>
>> The Source and Build Artifacts, and the Debian and RPM packages and
>> repositories, are available here:
>> https://dist.apache.org/repos/dist/dev/cassandra/5.0-beta1/
>>
>> The vote will be open for 72 hours (longer if needed). Everyone who has
>> tested the build is invited to vote. Votes by PMC members are considered
>> binding. A vote passes if there are at least three binding +1s and no -1's.
>>
>> Remaining tickets to get us to 5.0-rc1 can be found on this jira board:
>> https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=593&view=detail
>>
>> [1]: CHANGES.txt:
>> https://github.com/apache/cassandra/blob/5.0-beta1-tentative/CHANGES.txt
>> [2]: NEWS.txt:
>> https://github.com/apache/cassandra/blob/5.0-beta1-tentative/NEWS.txt
>>
>


Re: Include CASSANDRA-18464 in 5.0-beta1 (direct I/O support for commitlog write)

2023-11-27 Thread Mick Semb Wever
I don't want to veto the 5.0-beta1 release for this.

I would rather cut this release as is and include it in the next one,
5.0-beta2. Given it's an add-on and does not change default behaviour, I
believe this would be ok, if we agree.

I feel this would be a better use of our time. And it would mean: yes, commit
to cassandra-5.0.




On Mon, 27 Nov 2023 at 14:04, Jacek Lewandowski 
wrote:

> Hey,
>
> I'd like to ask if we can include
> https://issues.apache.org/jira/browse/CASSANDRA-18464 in the 5.0-beta1
> release. This introduces the ability to write to the commitlog using direct
> I/O and brings some noticeable performance improvements when enabled
> (disabled by default).
>
> Since it introduces a change in the yaml config, it probably cannot be
> delivered in RC or 5.0.x - hence my question.
>
> The ticket has been reviewed and tested. It is basically in the
> ready-to-commit state.
>
> thanks,
> Jacek
>
>


Re: Include CASSANDRA-18464 in 5.0-beta1 (direct I/O support for commitlog write)

2023-11-27 Thread guo Maxwell
+1 

On Mon, 27 Nov 2023 at 22:25, Brandon Williams wrote:

> As long as it's disabled by default that's an easy +1 from me.
>
> Kind Regards,
> Brandon
>
> On Mon, Nov 27, 2023 at 7:03 AM Jacek Lewandowski
>  wrote:
> >
> > Hey,
> >
> > I'd like to ask if we can include
> > https://issues.apache.org/jira/browse/CASSANDRA-18464 in the 5.0-beta1
> > release. This introduces the ability to write to the commitlog using
> > direct I/O and brings some noticeable performance improvements when
> > enabled (disabled by default).
> >
> > Since it introduces a change in the yaml config, it probably cannot be
> > delivered in RC or 5.0.x - hence my question.
> >
> > The ticket has been reviewed and tested. It is basically in the
> > ready-to-commit state.
> >
> > thanks,
> > Jacek
> >
>


Re: Include CASSANDRA-18464 in 5.0-beta1 (direct I/O support for commitlog write)

2023-11-27 Thread Brandon Williams
As long as it's disabled by default that's an easy +1 from me.

Kind Regards,
Brandon

On Mon, Nov 27, 2023 at 7:03 AM Jacek Lewandowski
 wrote:
>
> Hey,
>
> I'd like to ask if we can include
> https://issues.apache.org/jira/browse/CASSANDRA-18464 in the 5.0-beta1
> release. This introduces the ability to write to the commitlog using direct
> I/O and brings some noticeable performance improvements when enabled
> (disabled by default).
>
> Since it introduces a change in the yaml config, it probably cannot be
> delivered in RC or 5.0.x - hence my question.
>
> The ticket has been reviewed and tested. It is basically in the
> ready-to-commit state.
>
> thanks,
> Jacek
>


Re: [VOTE] Release Apache Cassandra 5.0-beta1

2023-11-27 Thread Jacek Lewandowski
I propose to consider including
https://issues.apache.org/jira/browse/CASSANDRA-18464 (started a separate
thread)




On Mon, 27 Nov 2023 at 13:06, Benjamin Lerer wrote:

> Looking at the board it is unclear to me why CASSANDRA-19011,
> CASSANDRA-19018, CASSANDRA-18796 and CASSANDRA-18940 are not beta tickets.
> SAI being one of the important features of 5.0 it seems to me that those
> tickets should have been handled for the beta release.
> CASSANDRA-19039 could also be a real problem.
>
>
> On Sun, 26 Nov 2023 at 13:35, Mick Semb Wever wrote:
>
>>
>> Proposing the test build of Cassandra 5.0-beta1 for release.
>>
>> sha1: e0c0c31c7f6db1e3ddb80cef842b820fc27fd0eb
>> Git: https://github.com/apache/cassandra/tree/5.0-beta1-tentative
>> Maven Artifacts:
>> https://repository.apache.org/content/repositories/orgapachecassandra-1319/org/apache/cassandra/cassandra-all/5.0-beta1/
>>
>> The Source and Build Artifacts, and the Debian and RPM packages and
>> repositories, are available here:
>> https://dist.apache.org/repos/dist/dev/cassandra/5.0-beta1/
>>
>> The vote will be open for 72 hours (longer if needed). Everyone who has
>> tested the build is invited to vote. Votes by PMC members are considered
>> binding. A vote passes if there are at least three binding +1s and no -1's.
>>
>> Remaining tickets to get us to 5.0-rc1 can be found on this jira board:
>> https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=593&view=detail
>>
>> [1]: CHANGES.txt:
>> https://github.com/apache/cassandra/blob/5.0-beta1-tentative/CHANGES.txt
>> [2]: NEWS.txt:
>> https://github.com/apache/cassandra/blob/5.0-beta1-tentative/NEWS.txt
>>
>


Include CASSANDRA-18464 in 5.0-beta1 (direct I/O support for commitlog write)

2023-11-27 Thread Jacek Lewandowski
Hey,

I'd like to ask if we can include
https://issues.apache.org/jira/browse/CASSANDRA-18464 in the 5.0-beta1
release. This introduces the ability to write to the commitlog using direct
I/O and brings some noticeable performance improvements when enabled
(disabled by default).

Since it introduces a change in the yaml config, it probably cannot be
delivered in RC or 5.0.x - hence my question.

The ticket has been reviewed and tested. It is basically in the
ready-to-commit state.

thanks,
Jacek


Re: [VOTE] Release Apache Cassandra 5.0-beta1

2023-11-27 Thread Benjamin Lerer
Looking at the board it is unclear to me why CASSANDRA-19011,
CASSANDRA-19018, CASSANDRA-18796 and CASSANDRA-18940 are not beta tickets.
SAI being one of the important features of 5.0 it seems to me that those
tickets should have been handled for the beta release.
CASSANDRA-19039 could also be a real problem.


On Sun, 26 Nov 2023 at 13:35, Mick Semb Wever wrote:

>
> Proposing the test build of Cassandra 5.0-beta1 for release.
>
> sha1: e0c0c31c7f6db1e3ddb80cef842b820fc27fd0eb
> Git: https://github.com/apache/cassandra/tree/5.0-beta1-tentative
> Maven Artifacts:
> https://repository.apache.org/content/repositories/orgapachecassandra-1319/org/apache/cassandra/cassandra-all/5.0-beta1/
>
> The Source and Build Artifacts, and the Debian and RPM packages and
> repositories, are available here:
> https://dist.apache.org/repos/dist/dev/cassandra/5.0-beta1/
>
> The vote will be open for 72 hours (longer if needed). Everyone who has
> tested the build is invited to vote. Votes by PMC members are considered
> binding. A vote passes if there are at least three binding +1s and no -1's.
>
> Remaining tickets to get us to 5.0-rc1 can be found on this jira board:
> https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=593&view=detail
>
> [1]: CHANGES.txt:
> https://github.com/apache/cassandra/blob/5.0-beta1-tentative/CHANGES.txt
> [2]: NEWS.txt:
> https://github.com/apache/cassandra/blob/5.0-beta1-tentative/NEWS.txt
>


Re: CEP-21 - Transactional cluster metadata merged to trunk

2023-11-27 Thread Berenguer Blasi

Hi,

I have rewritten this email about 10 times before sending it, and I can't
manage to keep it from sounding negative. So pardon my English or poor
choice of words in advance, and try to read it in a positive way.


It is really demotivating to me to see things merged without green CI. I
had to go through a herculean effort and pain (at least to me) to keep
rebasing the TTL patch continuously (a huge one imo) when it would have
been altogether much easier to merge, post-fix and post-add
downgradability along the TCM merge lines.


If this merge-then-fix approach is a thing, I would like it clarified so
we can all benefit from it and avoid the big-patch rebase pain.


Regards

On 27/11/23 10:38, Jacek Lewandowski wrote:

> Hi,
>
> I'm happy to hear that the feature got merged. Though, I share
> Benjamin's worries about that being a bad precedent.
>
> I don't think it makes sense to do repeated runs in this particular
> case. Detecting flaky tests would not prove anything; they can be
> caused by this patch, but we would not know that for sure. We would
> have to have a similar build with the same tests repeated to compare.
> It would take time and resources, and in the end, we will have to fix
> those flaky tests regardless of whether they were caused by this
> change. IMO, it makes sense to do a repeated run of the new tests,
> though. Aside from that, we can also consider making it easier and
> more automated for the developer to determine whether a particular
> flakiness comes from a feature branch one wants to merge.
>
> thanks,
> Jacek
>
>
> On Mon, 27 Nov 2023 at 10:15, Benjamin Lerer wrote:
>
>> Hi,
>>
>> I must admit that I have been surprised by this merge and this
>> following email. We had lengthy discussions recently and the final
>> agreement was that the requirement for a merge was a green CI.
>> I could understand that for some reasons as a community we could
>> wish to make some exceptions. In this present case there was no
>> official discussion to ask for an exception.
>> I believe that this merge creates a bad precedent where anybody
>> can feel entitled to merge without a green CI and disregard any
>> previous community agreement.
>>
>> On Sat, 25 Nov 2023 at 09:22, Mick Semb Wever wrote:
>>
>>>
>>> Great work Sam, Alex & Marcus !
>>>
>>>> There are about 15-20 flaky or failing tests in total,
>>>> spread over several test jobs[2] (i.e. single digit
>>>> failures in a few of these). We have filed JIRAs for the
>>>> failures and are working on getting those fixed as a top
>>>> priority. CASSANDRA-19055[3] is the umbrella ticket for
>>>> this follow up work.
>>>>
>>>> There are also a number of improvements we will work on in
>>>> the coming weeks, we will file JIRAs for those early next
>>>> week and add them as subtasks to CASSANDRA-19055.
>>>>
>>>
>>> Can we get these tests temporarily annotated as skipped while
>>> all the subtickets to 19055 are being worked on ?
>>>
>>> As we have seen from CASSANDRA-18166 and CASSANDRA-19034
>>> there's a lot of overhead now on 5.0 tickets having to
>>> navigate around these failures in trunk CI runs.
>>>
>>> Also, we're still trying to figure out how to do repeated runs
>>> for a patch so big… (the list of touched tests was too long
>>> for circleci, i need to figure out what the limit is and chunk
>>> it into separate circleci configs) … and it probably makes
>>> sense to wait until most of 19055 is done (or tests are
>>> temporarily annotated as skipped).
>>>



Re: CEP-21 - Transactional cluster metadata merged to trunk

2023-11-27 Thread Jacek Lewandowski
Hi,

I'm happy to hear that the feature got merged. Though, I share Benjamin's
worries about that being a bad precedent.

I don't think it makes sense to do repeated runs in this particular case.
Detecting flaky tests would not prove anything; they can be caused by this
patch, but we would not know that for sure. We would have to have a similar
build with the same tests repeated to compare. It would take time and
resources, and in the end, we will have to fix those flaky tests regardless
of whether they were caused by this change. IMO, it makes sense to do a
repeated run of the new tests, though. Aside from that, we can also
consider making it easier and more automated for the developer to determine
whether a particular flakiness comes from a feature branch one wants to
merge.

thanks,
Jacek


On Mon, 27 Nov 2023 at 10:15, Benjamin Lerer wrote:

> Hi,
>
> I must admit that I have been surprised by this merge and this following
> email. We had lengthy discussions recently and the final agreement was that
> the requirement for a merge was a green CI.
> I could understand that for some reasons as a community we could wish to
> make some exceptions. In this present case there was no official discussion
> to ask for an exception.
> I believe that this merge creates a bad precedent where anybody can feel
> entitled to merge without a green CI and disregard any previous community
> agreement.
>
> On Sat, 25 Nov 2023 at 09:22, Mick Semb Wever wrote:
>
>>
>> Great work Sam, Alex & Marcus !
>>
>>
>>
>>> There are about 15-20 flaky or failing tests in total, spread over
>>> several test jobs[2] (i.e. single digit failures in a few of these). We
>>> have filed JIRAs for the failures and are working on getting those fixed as
>>> a top priority. CASSANDRA-19055[3] is the umbrella ticket for this follow
>>> up work.
>>>
>>> There are also a number of improvements we will work on in the coming
>>> weeks, we will file JIRAs for those early next week and add them as
>>> subtasks to CASSANDRA-19055.
>>>
>>
>>
>> Can we get these tests temporarily annotated as skipped while all the
>> subtickets to 19055 are being worked on ?
>>
>> As we have seen from CASSANDRA-18166 and CASSANDRA-19034 there's a lot of
>> overhead now on 5.0 tickets having to navigate around these failures in
>> trunk CI runs.
>>
>> Also, we're still trying to figure out how to do repeated runs for a
>> patch so big… (the list of touched tests was too long for circleci, i need
>> to figure out what the limit is and chunk it into separate circleci
>> configs) … and it probably makes sense to wait until most of 19055 is done
>> (or tests are temporarily annotated as skipped).
>>
>>
>>


Re: CEP-21 - Transactional cluster metadata merged to trunk

2023-11-27 Thread Benjamin Lerer
Hi,

I must admit that I have been surprised by this merge and this following
email. We had lengthy discussions recently and the final agreement was that
the requirement for a merge was a green CI.
I could understand that for some reasons as a community we could wish to
make some exceptions. In this present case there was no official discussion
to ask for an exception.
I believe that this merge creates a bad precedent where anybody can feel
entitled to merge without a green CI and disregard any previous community
agreement.

On Sat, 25 Nov 2023 at 09:22, Mick Semb Wever wrote:

>
> Great work Sam, Alex & Marcus !
>
>
>
>> There are about 15-20 flaky or failing tests in total, spread over
>> several test jobs[2] (i.e. single digit failures in a few of these). We
>> have filed JIRAs for the failures and are working on getting those fixed as
>> a top priority. CASSANDRA-19055[3] is the umbrella ticket for this follow
>> up work.
>>
>> There are also a number of improvements we will work on in the coming
>> weeks, we will file JIRAs for those early next week and add them as
>> subtasks to CASSANDRA-19055.
>>
>
>
> Can we get these tests temporarily annotated as skipped while all the
> subtickets to 19055 are being worked on ?
>
> As we have seen from CASSANDRA-18166 and CASSANDRA-19034 there's a lot of
> overhead now on 5.0 tickets having to navigate around these failures in
> trunk CI runs.
>
> Also, we're still trying to figure out how to do repeated runs for a patch
> so big… (the list of touched tests was too long for circleci, i need to
> figure out what the limit is and chunk it into separate circleci configs) …
> and it probably makes sense to wait until most of 19055 is done (or tests
> are temporarily annotated as skipped).
>
>
>