[DISCUSSION] Adding sonar report analysis to the Cassandra project

2023-06-12 Thread Maxim Muzafarov
Hello everyone, I would like to make the source code of the Cassandra project more visible to people outside of the Cassandra Community and highlight the typical known issues in new contributions in the GitHub pull-request interface as well. This makes it easier for those who are unfamiliar with

Re: [DISCUSS] CEP-8 Drivers Donation - take 2

2023-06-12 Thread Jeremy Hanna
I'd like to close out this thread. As Benjamin notes, we'll have a single subproject for all of the drivers and with 3 PMC members overseeing the subproject as outlined in the linked subproject governance procedures. However we'll introduce the drivers to that subproject one by one out of

Re: [DISCUSS] Limiting query results by size (CASSANDRA-11745)

2023-06-12 Thread Josh McKenzie
> As long as it is valid in the paging protocol to return a short page, but still say “there are more pages”, I think that is fine to do that.
Thankfully the v3-v5 specs all make it clear that clients need to respect what the server has to say about there being more pages:

Re: [DISCUSS] Limiting query results by size (CASSANDRA-11745)

2023-06-12 Thread Jeremiah Jordan
As long as it is valid in the paging protocol to return a short page, but still say “there are more pages”, I think that is fine to do that. For an actual LIMIT that is part of the user query, I think the server must always have returned all data that fits into the LIMIT when all pages have been
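The paging behaviour discussed in this thread can be illustrated with a minimal sketch. It is not driver code: `Page`, `make_server`, and `fetch_all` are hypothetical names, and the in-memory "server" only imitates a replica that ends a page early when a byte budget is reached. The point is that the client keeps fetching until the server reports no further pages, never inferring the end from a short page, and that a user-level LIMIT is only satisfied once enough rows have been collected or all pages are exhausted.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Page:
    rows: List[str]
    # None means the server reports that there are no more pages.
    paging_state: Optional[int]


def make_server(data: List[str], page_rows: int, page_bytes: int) -> Callable[[Optional[int]], Page]:
    """Hypothetical in-memory server: a page ends at the row cap or the byte budget."""
    def fetch_page(state: Optional[int]) -> Page:
        i = state or 0
        rows: List[str] = []
        used = 0
        while i < len(data) and len(rows) < page_rows:
            size = len(data[i].encode())
            # Always return at least one row so that paging makes progress.
            if rows and used + size > page_bytes:
                break
            rows.append(data[i])
            used += size
            i += 1
        return Page(rows, i if i < len(data) else None)
    return fetch_page


def fetch_all(fetch_page: Callable[[Optional[int]], Page], limit: Optional[int] = None) -> List[str]:
    """Client loop: trust the server's paging state, not the page length."""
    rows: List[str] = []
    state: Optional[int] = None
    while True:
        page = fetch_page(state)
        rows.extend(page.rows)
        if limit is not None and len(rows) >= limit:
            return rows[:limit]
        if page.paging_state is None:
            return rows
        state = page.paging_state
```

With a byte budget smaller than some rows, the sketched server returns pages of one row even though the row cap is three, and the client still collects the full result.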

[DISCUSS] Remove deprecated keyspace_count_warn_threshold and table_count_warn_threshold

2023-06-12 Thread Dan Jatnieks
Hello everyone, I would like to propose removing the non-guardrail thresholds 'keyspace_count_warn_threshold' and 'table_count_warn_threshold' configuration settings on the trunk branch for the next major release. These thresholds were first added with CASSANDRA-16309 in 4.0-beta4 and have

Re: [DISCUSS] Limiting query results by size (CASSANDRA-11745)

2023-06-12 Thread Jacek Lewandowski
Josh, that answers my question exactly; thank you. I will not implement limiting the result set in CQL (that is, by LIMIT clause) and stay with just paging. Whether the page size is defined in bytes or rows can be determined by a flag - there are many unused bits for that. So my other question -
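As a rough illustration of the flag idea, the sketch below packs a flags word and a page size, and uses one spare bit to tell the receiver whether the page size is counted in rows or bytes. Both constants are illustrative assumptions: `FLAG_PAGE_SIZE_IN_BYTES` in particular is a hypothetical bit chosen for this example and is not taken from any protocol specification, and the two-field layout is a simplification rather than the real message format.

```python
import struct

# Illustrative values only; the "in bytes" bit is hypothetical.
FLAG_PAGE_SIZE = 0x00000004
FLAG_PAGE_SIZE_IN_BYTES = 0x40000000


def encode_paging(page_size: int, in_bytes: bool) -> bytes:
    """Pack a 4-byte flags word followed by a 4-byte signed page size (big-endian)."""
    flags = FLAG_PAGE_SIZE
    if in_bytes:
        flags |= FLAG_PAGE_SIZE_IN_BYTES
    return struct.pack(">Ii", flags, page_size)


def decode_paging(buf: bytes) -> tuple:
    """Return (page_size, unit), where unit is 'bytes' or 'rows'."""
    flags, page_size = struct.unpack(">Ii", buf)
    unit = "bytes" if flags & FLAG_PAGE_SIZE_IN_BYTES else "rows"
    return page_size, unit
```

A receiver that does not know the extra bit would read the same page size as a row count, which is one reason such a flag would need to be tied to a protocol version.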

Re: [DISCUSS] Limiting query results by size (CASSANDRA-11745)

2023-06-12 Thread Josh McKenzie
> I do not have in mind a scenario where it could be useful to specify a LIMIT in bytes. The LIMIT clause is usually used when you know how many rows you wish to display or use. Unless somebody has a useful scenario in mind I do not think that there is a need for that feature.
If you have

Re: [DISCUSS] Limiting query results by size (CASSANDRA-11745)

2023-06-12 Thread Jacek Lewandowski
Limiting the amount of returned data in bytes in addition to the row limit could be helpful when applied transparently by the server as a kind of guardrail. The server could fail the query if it exceeds some administratively imposed limit at the configuration level, WDYT? Mon, 12 Jun 2023 at
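The guardrail idea can be sketched as follows, under stated assumptions: `collect_with_guardrail`, `ResultSizeExceeded`, and `fail_threshold_bytes` are hypothetical names invented for this example and do not correspond to existing Cassandra settings or classes. The server accumulates the size of the rows it is about to return and fails the query once an administrator-configured byte limit is exceeded.

```python
from typing import Iterable, List, Optional


class ResultSizeExceeded(Exception):
    """Raised when a result grows past the configured byte limit."""


def collect_with_guardrail(rows: Iterable[bytes],
                           fail_threshold_bytes: Optional[int]) -> List[bytes]:
    """Collect rows, failing the query if their total size exceeds the limit.

    A threshold of None disables the guardrail.
    """
    total = 0
    result: List[bytes] = []
    for row in rows:
        total += len(row)
        if fail_threshold_bytes is not None and total > fail_threshold_bytes:
            raise ResultSizeExceeded(
                f"result size {total} exceeds limit {fail_threshold_bytes} bytes"
            )
        result.append(row)
    return result
```

Unlike a byte-based page size, which shortens pages, this variant rejects the whole query, so the user never sees a silently truncated result.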

Re: [DISCUSS] Limiting query results by size (CASSANDRA-11745)

2023-06-12 Thread Jacek Lewandowski
Yes, LIMIT BY provided by the user in CQL does not make much sense to me either.
On Mon, 12 Jun 2023 at 11:20, Benedict wrote:
> I agree that this is more suitable as a paging option, and not as a CQL LIMIT option.
>
> If it were to be a CQL LIMIT option though, then it should be accurate

[DISCUSS] Limiting query results by size (CASSANDRA-11745)

2023-06-12 Thread Jacek Lewandowski
Hi, I was working on limiting query results by their size expressed in bytes, and some questions arose that I'd like to bring to the mailing list. The semantics of queries (without aggregation) - data limits are applied on the raw data returned from replicas - while it works fine for the row

Re: [DISCUSS] Limiting query results by size (CASSANDRA-11745)

2023-06-12 Thread Benjamin Lerer
Thanks Jacek for raising that discussion. I do not have in mind a scenario where it could be useful to specify a LIMIT in bytes. The LIMIT clause is usually used when you know how many rows you wish to display or use. Unless somebody has a useful scenario in mind I do not think that there is a

Re: [DISCUSS] Limiting query results by size (CASSANDRA-11745)

2023-06-12 Thread Benedict
I agree that this is more suitable as a paging option, and not as a CQL LIMIT option. If it were to be a CQL LIMIT option though, then it should be accurate regarding result set IMO; there shouldn’t be any further results that could have been returned within the LIMIT.
On 12 Jun 2023, at 10:16,

Re: [DISCUSS] Limiting query results by size (CASSANDRA-11745)

2023-06-12 Thread Benjamin Lerer
> > If you have rows that vary significantly in their size, your latencies could end up being pretty unpredictable using a LIMIT BY . Being able to specify a limit by bytes at the driver / API level would allow app devs to get more deterministic results out of their interaction w/the DB if

Re: [DISCUSSION] Adding sonar report analysis to the Cassandra project

2023-06-12 Thread Mick Semb Wever
On Mon, 12 Jun 2023 at 15:02, Maxim Muzafarov wrote:
> Hello everyone,
>
> I would like to make the source code of the Cassandra project more visible to people outside of the Cassandra Community and highlight the typical known issues in new contributions in the GitHub pull-request

Re: [DISCUSS] Limiting query results by size (CASSANDRA-11745)

2023-06-12 Thread Jeff Jirsa
On Mon, Jun 12, 2023 at 9:50 AM Benjamin Lerer wrote:
>> If you have rows that vary significantly in their size, your latencies could end up being pretty unpredictable using a LIMIT BY . Being able to specify a limit by bytes at the driver / API level would allow app devs to get more

Re: [DISCUSS] Limiting query results by size (CASSANDRA-11745)

2023-06-12 Thread Josh McKenzie
Yeah, my bad. I have paging on the brain. Seriously. I can't think of a use-case in which a LIMIT based on # bytes makes sense from a user perspective.
On Mon, Jun 12, 2023, at 1:35 PM, Jeff Jirsa wrote:
> On Mon, Jun 12, 2023 at 9:50 AM Benjamin Lerer wrote:
>>> If you have rows that

Re: [DISCUSSION] Adding sonar report analysis to the Cassandra project

2023-06-12 Thread Jeff Jirsa
On Mon, Jun 12, 2023 at 10:18 AM Mick Semb Wever wrote:
> On Mon, 12 Jun 2023 at 15:02, Maxim Muzafarov wrote:
>> Hello everyone,
>>
>> I would like to make the source code of the Cassandra project more visible to people outside of the Cassandra Community and highlight the typical