Re: [VOTE] Release Apache Cassandra 5.0-alpha1

2023-08-30 Thread Benedict
ALv2. [11] > > > This comes down to using an int number from Philip Koopman's CRC work. > `private static final int CRC24_POLY = 0x1974F0B;` > > It was questioned whether a number can be copyrighted, in which case > we would not be including third-party wor

Re: [DISCUSS] Addition of smile-nlp test dependency for CEP-30

2023-09-13 Thread Benedict
There’s a distinction for spotbugs and other build related tools where they can be downloaded and used during the build so long as they’re not critical to the build process. They have to be downloaded dynamically in binary form I believe though, they cannot be included in the release. So it’s not

Re: [DISCUSS] CommitLog default disk access mode

2023-10-16 Thread Benedict
I have some plans to (eventually) use the commit log as memtable payload storage (ie memtables would reference the commit log entries directly, storing only indexing info), and to back first level of sstables by reference to commit log entries. This will permit us to deliver not only much

Re: Push TCM (CEP-21) and Accord (CEP-15) to 5.1 (and cut an immediate 5.1-alpha1)

2023-10-25 Thread Benedict
cut “preview” artifacts from trunk? >>> >>> -Jeremiah >>> >>> On Oct 24, 2023 at 11:54:25 AM, Jon Haddad <rustyrazorbl...@apache.org> wrote: >>> >>> I guess at the end of the day, shipping a release with a bunch of awesome features is better t

Re: Push TCM (CEP-21) and Accord (CEP-15) to 5.1 (and cut an immediate 5.1-alpha1)

2023-10-25 Thread Benedict
the landing of CEPs. On 25 Oct 2023, at 21:55, Benedict wrote: I am surprised this needs to be said, but - especially for long-running CEPs - you must involve yourself early, and certainly within some reasonable time of being notified the work is ready for broader input and review. In this case

Re: Push TCM (CEP-21) and Accord (CEP-15) to 5.1 (and cut an immediate 5.1-alpha1)

2023-11-01 Thread Benedict
ld we get some working version of TCM/Accord into people's hands to try out at/by Summit?" That's all. People are eager to see it and try it out. On Oct 31, 2023, at 12:16 PM, Benedict wrote: No, if I understand it correctly we’re in weird hypothetical land where people are inventing new re

Re: Push TCM (CEP-21) and Accord (CEP-15) to 5.1 (and cut an immediate 5.1-alpha1)

2023-11-01 Thread Benedict
> The idea that agreeing things carefully costs us agility is one I cannot endorse not one I can endorse On 1 Nov 2023, at 21:11, Benedict wrote: The project governance document does not list any kind of general purpose technical change vote. There are only three very specific kinds of commun

Re: Push TCM (CEP-21) and Accord (CEP-15) to 5.1 (and cut an immediate 5.1-alpha1)

2023-11-01 Thread Benedict
e simple majority committer bar... how do we navigate that? On Wed, Nov 1, 2023, at 12:33 PM, Benedict wrote: Your conceptualisation implies no weight to the decision, as a norm is not binding? The community voting section mentions only three kinds of decision, and this was deliberate: code contributions

Re: Push TCM (CEP-21) and Accord (CEP-15) to 5.1 (and cut an immediate 5.1-alpha1)

2023-11-01 Thread Benedict
mport ordering, or Config.java structure, or refactoring out singletons, or gatekeeping CI - things we've had come up over the years where we've had a lot of people chime in and we benefit from more than just "2 committers agree on it" but less than "We need a CEP or pmc vote for th

Re: Push TCM (CEP-21) and Accord (CEP-15) to 5.1 (and cut an immediate 5.1-alpha1)

2023-10-31 Thread Benedict
which do not introduce NEW regressions, and we allow releases with known regressions that are deemed acceptable. We can indeed always vote to override it, and if it comes to that we can consider that as an option. -Jeremiah On Oct 31, 2023 at 11:41:29 AM, Benedict <bened...@apache.org> wrote:

Re: Push TCM (CEP-21) and Accord (CEP-15) to 5.1 (and cut an immediate 5.1-alpha1)

2023-11-02 Thread Benedict
t. So - that's my bid. What do others think? On Wed, Nov 1, 2023, at 8:11 PM, Benedict wrote: So my view is that the community is strongly built on consensus, so expressions of sentiment within the community have strong normative weight even without any specific legislative effect. You shouldn’t knowi

Re: Push TCM (CEP-21) and Accord (CEP-15) to 5.1 (and cut an immediate 5.1-alpha1)

2023-11-02 Thread Benedict
or committers actively working on the code who touch it before it's merged to trunk should it? On Thu, Nov 2, 2023, at 10:16 AM, Benedict wrote: My view is that we wait and see what the CI looks like at that time. My reading of ASF policy is that directing users to CEP preview releases that are not formal

Re: Road to 5.0-GA (was: [VOTE] Release Apache Cassandra 5.0-alpha2)

2023-11-04 Thread Benedict
. Looking forward to the reproduction test mentioned on the ticket. Thanks to Alex for his work on harry! On Sat, 4 Nov 2023 at 12:47, Benedict <bened...@apache.org> wrote: Alex can confirm but I think it actually turns out to be a new bug in 5.0, but either way we should not cut a r

Re: Road to 5.0-GA (was: [VOTE] Release Apache Cassandra 5.0-alpha2)

2023-11-04 Thread Benedict
I think before we cut a beta we need to have diagnosed and fixed 18993 (assuming it is a bug). > On 4 Nov 2023, at 16:04, Mick Semb Wever wrote: > >  >> >> With the publication of this release I would like to switch the >> default 'latest' docs on the website from 4.1 to 5.0. Are there any

Re: Road to 5.0-GA (was: [VOTE] Release Apache Cassandra 5.0-alpha2)

2023-11-04 Thread Benedict
l? > So I would say we should fix it with the highest priority and get a new 4.1.x > released. Blocking 5.0 beta voting is a secondary issue to me if we have a > “data not being returned” issue in an existing release? > >> On Nov 4, 2023, at 11:09 AM, Benedict wrote: >

Re: Push TCM (CEP-21) and Accord (CEP-15) to 5.1 (and cut an immediate 5.1-alpha1)

2023-10-31 Thread Benedict
, and only a mitigation plan. On Thu, 26 Oct 2023 at 14:20, Benedict <bened...@apache.org> wrote: The time to stabilise is orthogonal to the time we branch. Once we branch we stop accepting new features for the branch, and work to stabilise. My understanding is we will branch as soon as we hav

Re: Push TCM (CEP-21) and Accord (CEP-15) to 5.1 (and cut an immediate 5.1-alpha1)

2023-10-30 Thread Benedict
gs are going on in parallel. There are also more interdependencies between the different projects. In my opinion what we are lacking is a global overview of the different things going on in the project and some rough ideas of the status of the different significant pieces. It would allow us to bet

Re: Push TCM (CEP-21) and Accord (CEP-15) to 5.1 (and cut an immediate 5.1-alpha1)

2023-10-31 Thread Benedict
AM, Benedict <bened...@apache.org> wrote: There is no requirement for green CI on alpha. We voted last year to require running all tests before commit and to require green CI for beta releases. This vote was invalid because it didn’t reach the vote floor for a procedural

Re: Push TCM (CEP-21) and Accord (CEP-15) to 5.1 (and cut an immediate 5.1-alpha1)

2023-10-26 Thread Benedict
) On Thu, 26 Oct 2023 at 8:20, Benedict <bened...@apache.org> wrote: The time to stabilise is orthogonal to the time we branch. Once we branch we stop accepting new features for the branch, and work to stabilise. My understanding is we will branch as soon as we have a viable alpha containing TCM and

Re: Push TCM (CEP-21) and Accord (CEP-15) to 5.1 (and cut an immediate 5.1-alpha1)

2023-10-26 Thread Benedict
going on in the project and some rough ideas of the status of the different significant pieces. It would allow us to better organize ourselves. On Thu, 26 Oct 2023 at 00:26, Benedict <bened...@apache.org> wrote: I have spoken privately with Ekaterina, and to clear up some possible ambiguit

Re: [EXTERNAL] Re: [DISCUSS] Add JVector as a dependency for CEP-30

2023-10-24 Thread Benedict
, Jeff Jirsa wrote: To do that, the cassandra PMC can open a legal JIRA and ask for a (durable, concrete) opinion. On Fri, Sep 22, 2023 at 5:59 AM Benedict <bened...@apache.org> wrote: my understanding is that with the former the liability rests on the provider of the lib to ensur

Re: [DISCUSS] Vector type and empty value

2023-09-19 Thread Benedict
If I understand this suggestion correctly it is a whole can of worms, as types that can never be null prevent us ever supporting outer joins that return these types. I am strongly in favour of permitting the table definition forbidding nulls - and perhaps even defaulting to this behaviour. But

Re: [DISCUSS] Add JVector as a dependency for CEP-30

2023-09-21 Thread Benedict
At some point we have to discuss this, and here’s as good a place as any. There’s a great news article published talking about how generative AI was used to assist in developing the new vector search feature, which is itself really cool. Unfortunately it *sounds* like it runs afoul of the ASF

Re: [DISCUSS] Add JVector as a dependency for CEP-30

2023-09-22 Thread Benedict
ction of this work. As I said, an annoying topic. On 22 Sep 2023, at 13:06, Mick Semb Wever wrote: On Thu, 21 Sept 2023 at 10:41, Benedict <bened...@apache.org> wrote: At some point we have to discuss this, and here’s as good a place as any. There’s a great news article published talking about how

Re: [DISCUSS] Vector type and empty value

2023-09-20 Thread Benedict
valid CQL, it's a hybrid of CQL + Java >>>> code…) >>>> >>>> CREATE TABLE fluffykittens (pk int primary key, cuteness int); >>>> INSERT INTO fluffykittens (pk, cuteness) VALUES (0, new byte[0]) >>>> >>>> CREATE TABLE ty

Re: [DISCUSS] CEP-36: A Configurable ChannelProxy to alias external storage locations

2023-09-26 Thread Benedict
I agree with Ariel, the more suitable insertion point is probably the JDK level FileSystemProvider and FileSystem abstraction. It might also be that we can reuse existing work here in some cases? > On 26 Sep 2023, at 17:49, Ariel Weisberg wrote: > >  > Hi, > > Support for multiple storage

Re: [DISCUSS] Add JVector as a dependency for CEP-30

2023-09-22 Thread Benedict
is apparently incredibly pervasive right now, "everybody is doing it" is a pretty high risk legal defense. :) On Fri, Sep 22, 2023, at 8:04 AM, Mick Semb Wever wrote: On Thu, 21 Sept 2023 at 10:41, Benedict <bened...@apache.org> wrote: At some point we have to discuss this, and here’s

Re: [VOTE] Accept java-driver

2023-10-05 Thread Benedict
Surely it needs to be shared with the foundation and the PMC so we can verify? Or at least have ASF legal confirm they have received and are satisfied with the tarball? It certainly can’t be kept private to DS, AFAICT. Of course it shouldn’t be shared publicly but not sure how PMC can fulfil its

Re: [VOTE] Accept java-driver

2023-10-07 Thread Benedict
ftware Grant Agreement. Yes, any future work done after donation needs to be covered by ASF CLAs. But happy to see someone ask legal@ to confirm this so we can move forward. On Oct 6, 2023, at 3:33 AM, Benedict <bened...@apache.org> wrote: Are we certain about that? It’s unclear to me from t

Re: Push TCM (CEP-21) and Accord (CEP-15) to 5.1 (and cut an immediate 5.1-alpha1)

2023-10-23 Thread Benedict
I’m cool with this. We may have to think about numbering as I think TCM will break some backwards compatibility and we might technically expect the follow-up release to be 6.0. Maybe it’s not so bad to have such rapid releases either way. > On 23 Oct 2023, at 12:52, Mick Semb Wever wrote: > >

Re: Push TCM (CEP-21) and Accord (CEP-15) to 5.1 (and cut an immediate 5.1-alpha1)

2023-10-23 Thread Benedict
I agree. If we go this route we should essentially announce an immediate 5.1 alpha at the same time as 5.0 GA, and I can’t see almost anybody rolling out 5.0 with 5.1 so close on its heels. On 23 Oct 2023, at 18:11, Aleksey Yeshchenko wrote: I’m not so sure that many folks will choose to go

Re: Push TCM (CEP-21) and Accord (CEP-15) to 5.1 (and cut an immediate 5.1-alpha1)

2023-10-23 Thread Benedict
To be clear, I’m not making an argument either way about the path forwards we should take, just concurring about a likely downside of this proposal. I don’t have a strong opinion about how we should proceed. On 23 Oct 2023, at 18:16, Benedict wrote: I agree. If we go this route we should

Re: [VOTE] Accept java-driver

2023-10-06 Thread Benedict
aware and which makes or would make Licensor's representations in this License Agreement inaccurate in any respect. On Oct 5, 2023 at 4:35:08 AM, Benedict <bened...@apache.org> wrote: Surely it needs to be shared with the foundation and the PMC so we can verify? Or at least ha

Re: Cassandra project biweekly status update 2022-06-14

2022-06-28 Thread Benedict
I don’t think it has to be all that complicated? If it’s a part of our UX it’s probably something we should maintain backwards compatibility for. If it’s part of our internal codebase, probably not. The only two “public” APIs we have inside the codebase that I’m aware of are triggers and

Re: [Marketing] For Review: Pluggable Memtable Implementations blog

2022-07-14 Thread Benedict
I’m intrigued by the blocking skip list throughput numbers. By description (and assumption) I would expect it only to block for writes, but the throughput stays static even as the read/write mix changes. Is this expected? There are other memtable implementations coming too, I think? It looks

Re: [DISCUSS] Improve Commitlog write path

2022-07-22 Thread Benedict
Hi Amit, I am inclined to agree with Bowen Song, in that benchmarks from an initially empty cluster tend to lean more heavily on memtable and commit log bottlenecks than a real-world long-running cluster does, as the algorithmic complexity of LSMTs begins to bite much later while the cost of

Re: Is this an MV bug?

2022-08-19 Thread Benedict
able. I assume they are not bundled together so from separate CQL > statements. > > On Fri, Aug 19, 2022 at 11:11 AM Claude Warren, Jr > wrote: >> If each mutation comes from a separate CQL they would be separate, no? >> >> >> On Fri, Aug 19, 2022 at 10:17 AM Benedict

Re: [DISCUSS] CEP-20: Dynamic Data Masking

2022-08-23 Thread Benedict
Applying this should prevent querying on a field, else you could leak its contents, surely? This pretty much prohibits using it in a clustering key, and a partition key with the ordered partitioner - but probably also a hashed partitioner since we do not use a cryptographic hash and the hash

Re: [Proposal] add pull request template

2022-08-18 Thread Benedict
Was it? I mean, we’ve all (or most) I think worked on projects with those things, so we all know what the benefits are? It’s fair to point out that we don’t have it even running for any branch yet. However there’s perhaps a chicken-and-egg situation, where I’m unsure the investment to develop

Re: [Proposal] add pull request template

2022-08-18 Thread Benedict
Let’s change our merge strategy! I really want us to. Perhaps I should start a formal discussion about it, and afterwards we can see where the votes land. > On 18 Aug 2022, at 15:37, Josh McKenzie wrote: > >  >> >> Until IDEs auto cross-reference JIRA, > I'm going to lightly touch the lid

Re: [DISCUSS] CEP-21: Transactional Cluster Metadata

2022-08-22 Thread Benedict
I just want to say I’m really excited about this work. It’s one of the last remaining major inadequacies of the project that makes it hard for people to deploy, and hard for us to develop. Can’t wait for it to be fixed. > On 22 Aug 2022, at 13:45, Sam Tunnicliffe wrote: > Hi, > > I'd like

Re: [Proposal] add pull request template

2022-08-18 Thread Benedict
en-egg to me. All it takes is ctrl+c & ctrl+v on your merging > commits. How would new merging strategy actually look like? I am all > ears. This seems to be quite nice as is if we stick to be more verbose > what we did. > >> On Thu, 18 Aug 2022 at 20:27, Benedict wrote: >

Re: [DISCUSS] LWT UPDATE semantics with + and - when null

2022-08-30 Thread Benedict
I’m a bit torn here, as consistency with counters is important. But they are a unique eventually consistent data type, and I am inclined to default standard numeric types to behave as SQL does, since they write a new value rather than a “delta”. It is far from optimal to have divergent

Re: [DISSCUSS] Access to JDK internals only after dev mailing list consensus?

2022-09-01 Thread Benedict
I’m not opposed to this, although I think there is less need for it. Do you have an example of where you think this policy could have resulted in a different outcome? > On 1 Sep 2022, at 16:31, Ekaterina Dimitrova wrote: > > Hi everyone, > > Some time ago we added a note to the project

Re: [DISCUSS] CEP-21: Transactional Cluster Metadata

2022-09-02 Thread Benedict
Unmesh, LWTs today repair themselves periodically already, and do not rely on a later proposer. Also, the CMS will naturally use a single partition key for each log it needs to maintain, else they would not be linearised. > On 2 Sep 2022, at 05:01, Unmesh Joshi wrote: > >  >> I think

Re: [DISCUSS] CEP-20: Dynamic Data Masking

2022-09-07 Thread Benedict
> > Where my vote is for A. > > >> On Wed, 7 Sept 2022 at 13:12, Benedict wrote: >> I’m not convinced there’s been adequate resolution over which approach is >> adopted. I know you have expressed a preference for the table schema >> approach, but the weight of oth

Re: [DISCUSS] CEP-20: Dynamic Data Masking

2022-09-07 Thread Benedict
I’m not convinced there’s been adequate resolution over which approach is adopted. I know you have expressed a preference for the table schema approach, but the weight of other opinion so far appears to be against this approach - even if it is broadly adopted by other databases. I will note

Re: [DISCUSS] CEP-23: Enhancement for Sparse Data Serialization

2022-09-06 Thread Benedict
n 6 Sep 2022, at 07:28, Benedict wrote: > > I agree a Jira would suffice, and if visibility there required a DISCUSS > thread or simply a notice sent to the list. > > While we’re here though, while I don’t have a lot of time to engage in > discussion it’s unclear to me what ad

Re: [DISCUSS] CEP-20: Dynamic Data Masking

2022-08-30 Thread Benedict
t, >> >> name text GENERATED ALWAYS AS some_mask_function(text, 'xxx', 7) >> >> ) >> >> >> >> (syntax from postgresql) >> >> >> >> GRANT SELECT ON foo.name TO general_use; >> >> GRANT SELECT ON foo.unmasked_nam

Re: [DISCUSS] CEP-20: Dynamic Data Masking

2022-08-24 Thread Benedict
ryable field. That would also preclude secondary >>> indexing, right? >> >> Yes, that's my thought as well. >> >>> On Tue, Aug 23, 2022 at 12:42 PM Derek Chen-Becker >>> wrote: >>> Agreed on not being a queryable field. That would also preclude seco

Re: [DISCUSS] CEP-20: Dynamic Data Masking

2022-08-24 Thread Benedict
sking a field like people's gender is useless because you will be able >>>> to determine its value in one query. On the other hand masking credit card >>>> numbers makes a lot of sense as it will complicate the life of the person >>>> trying to have acc

Re: [DISCUSS] CEP-20: Dynamic Data Masking

2022-08-24 Thread Benedict
and UPDATE statements. > That would differentiate us from many popular databases out there, where data > masking usually is a simpler thing. > >> On Wed, 24 Aug 2022 at 14:08, Benedict wrote: >> I can’t tell for sure, but the documentation on Postgres’ feature suggests >

Re: [DISCUSS] CEP-20: Dynamic Data Masking

2022-08-24 Thread Benedict
namic data masking > >> On Wed, 24 Aug 2022 at 10:40, Benedict wrote: >> Right, but we get to decide how we offer such features and what we call >> them. I can’t imagine a good reason to call this a masking feature, >> especially one that applies differentially to certa

Re: [DISCUSS] CEP-20: Dynamic Data Masking

2022-08-25 Thread Benedict
ng >>> as the application end user doesn't have access to run arbitrary CQL, then >>> these forms of masking prevent accidental unauthorized use/leaking of >>> personal data. >>> >>> henrik >>> >>> >>> >>>>

Re: Is this an MV bug?

2022-08-19 Thread Benedict
If M1 and M2 both operate over the same partition key they won’t be separate mutations, they should be combined into a single mutation before submission to SP.mutate > On 19 Aug 2022, at 10:05, Claude Warren, Jr via dev > wrote: > >  > > # Table definitions > > Table [ Primary key ] other

Re: CEP-15 multi key transaction syntax

2022-08-21 Thread Benedict
> On 21 Aug 2022, at 14:59, Benedict wrote: > > SELECT INTO in T-SQL creates a new table with the results. Since our > semantics are likely to be different than Postgres and MySQL, I’m not sure > it’s less confusing or otherwise beneficial to mimic an existing syntax. >

Re: [DISCUSS] Adding dependency on agrona

2022-09-29 Thread Benedict
e implementation an interface to allow this to be >>> pluggable? Could we avoid bringing it in as a full dependency for Cassandra >>> if the trie memtable were packaged separately as a plugin instead of being >>> included directly? >>> >>> Ch

Re: Shall 4.2 become 5.0 ?

2022-10-16 Thread Benedict
I’m confused: do people pay attention to version numbers or not? If not, how does it affect upgrades? If they do, surely it matters for communication, including marketing? We aren’t only a technical panel. I also reject the premise that a major bump has to have any effect on our backwards

Re: Shall 4.2 become 5.0 ?

2022-10-17 Thread Benedict
So… what’s the problem with bumping our major version because we want to communicate a release is “major” rather than has a breaking change - ie that we think users should feel incentivised to upgrade to it for whatever reason? Also, who is talking about never making breaking changes? Breaking

Re: Shall 4.2 become 5.0 ?

2022-10-17 Thread Benedict
It’s always arbitrary. We don’t bump major version when we make incompatible behavioural changes in bug fixes, for instance. It’s always a judgement: this release has important changes you should take a close look at. That’s all it means. Stuff that’s literally arbitrated is always somewhat

Re: [DISCUSS] Adding dependency on agrona

2022-09-21 Thread Benedict
In principle no, it’s a high quality library. But it might help to briefly outline what it’s used for. I assume it is instead of ByteBuffer? In which case it could maybe be worthwhile discussing as a project how we foresee interaction with existing buffer machinery, and maybe how we expect our

Re: CEP-15 multi key transaction syntax

2022-09-21 Thread Benedict
>>> Agree it's better to reuse existing syntax than invent new syntax. >>>>> >>>>> On 8/21/22 16:52, Konstantin Osipov wrote: >>>>> > * Avi Kivity via dev [22/08/14 15:59]: >>>>> > >>>>> > MySQL supports SELECT

Re: CEP-15 multi key transaction syntax

2022-09-21 Thread Benedict
2, 2022 at 1:36 AM Avi Kivity via dev >>>> wrote: >>>>> Agree it's better to reuse existing syntax than invent new syntax. >>>>> >>>>> On 8/21/22 16:52, Konstantin Osipov wrote: >>>>> > * Avi Kivity via dev [22/08/14 15:59]: >>>>

Re: [DISCUSS] CEP-23: Enhancement for Sparse Data Serialization

2022-09-08 Thread Benedict
ing > directly to VInt encoding for sizes rather than one of the other encodings? > Using a -2 as the first length to signal that the new encoding is in use so > that existing encodings can be read unchanged? > > >> On 06/09/2022 16:37, Benedict wrote: >> So, looki

Re: [DISCUSS] Adding dependency on agrona

2022-10-01 Thread Benedict
> has better buffers (or locking, timers, etc), should we be talking about > replacing usage of stdlib with Agrona throughout, or making a recommendation > for one over the other for future work? > > Cheers, > > Derek > >> On Thu, Sep 29, 2022 at 12:26 AM Bened

Re: [DISCUSS] Remove Dead Pull Requests

2022-08-11 Thread Benedict
Those all seem like good suggestions to me > On 11 Aug 2022, at 08:44, Claude Warren, Jr via dev > wrote: > >  > My original goal was to reduce the number of pull requests in the backlog as > it appears, from the outside, that the project does not really care for > outside contributions

Re: [Proposal] add pull request template

2022-08-18 Thread Benedict
> By submitting this pull request, I acknowledge that I am making a > contribution to the Apache Software Foundation under the terms and conditions > of the [Contributor's > Agreement](https://www.apache.org/licenses/contributor-agreements.html). Do we expect every contributor who makes any

Re: [Proposal] add pull request template

2022-08-18 Thread Benedict
urrent format (defer all context to > JIRA) or whether there's value in adding a longer form digest of context in a > paragraph below the commit. > > Over time I've become more sympathetic to the approach of informative > long-form bodies (I think you advocated for this in the pa

Re: [DISCUSSION] Cassandra's code style and source code analysis

2022-12-22 Thread Benedict
I like 3 or 4. We need to be sure we have a way of deactivating the check with code comments tho, as Java 8 has some bug with import order that can rarely break compilation, so we need to have some mechanism for permitting a different import order. Did we decide any changes to star imports?

Re: [RESULT][VOTE] Release Apache Cassandra 4.1.0 GA

2022-12-07 Thread Benedict
Can we give Marianne and Matt a chance to confirm their performance numbers? I got an indicative message suggesting it looked good, but nothing firm yet. > On 7 Dec 2022, at 20:37, Mick Semb Wever wrote: > >  > >> The vote will be open for 72 hours (longer if needed). Everyone who has >>

Re: [RESULT][VOTE] Release Apache Cassandra 4.1.0 GA

2022-12-07 Thread Benedict
Sure > On 7 Dec 2022, at 20:47, Mick Semb Wever wrote: > >  > >> Can we give Marianne and Matt a chance to confirm their performance numbers? >> I got an indicative message suggesting it looked good, but nothing firm yet. > > > > I am presuming that will (and must) happen before the new

Re: Aggregate functions on collections, collection functions and MAXWRITETIME

2022-12-08 Thread Benedict
I meant unnest, not unwrap. > On 8 Dec 2022, at 10:34, Benedict wrote: > >  >  >> I do not think we should have functions that aggregate across rows and >> functions that operate within a row use the same name. > > I’m sympathetic to that view for sure.

Re: Aggregate functions on collections, collection functions and MAXWRITETIME

2022-12-08 Thread Benedict
single row of output or always operates within a row, >> so returns the full set of rows matching the query. >> >> So if we want a max that aggregates across rows that works for collections >> we could change it to return the aggregated max across all rows. Or we just >>

Re: Aggregate functions on collections, collection functions and MAXWRITETIME

2022-12-08 Thread Benedict
APPLY(MAX,column)) would get the maximum value from the column across >> all the rows. >> >> Similarly APPLY could be used with other functions MAX(APPLY(MIN,column)) >> the largest minimum value from the column across all rows. >> >> These statements mak

Re: Aggregate functions on collections, collection functions and MAXWRITETIME

2022-12-08 Thread Benedict
ussion thread about > that new function? I think we should figure out our overall strategy - these are all pieces of the puzzle IMO. But I guess the above questions seem to come first and will shape this. I would be in favour of some general approach, however, such as either first casting to a

Re: Aggregate functions on collections, collection functions and MAXWRITETIME

2022-12-08 Thread Benedict
originally designed, and as they currently are on trunk. > The question is what we do with MAXWRITETIME. That function is also only on > trunk, and it might be repetitive given the more generic collection > functions. It's also a bit odd that there isn't, for example, a similar > MINTTL

Re: Aggregate functions on collections, collection functions and MAXWRITETIME

2022-12-08 Thread Benedict
thread otherwise. > On 8 Dec 2022, at 17:37, Benedict wrote: > >  > >>> 1) Do they offer ARRAY_SUM or ARRAY_AVG? >> Yes, a quick search on Google shows some examples: >> https://docs.teradata.com/r/kmuOwjp1zEYg98JsB8fu_A/68fdFR3LWhx7KtHc9Iv5

Re: Aggregate functions on collections, collection functions and MAXWRITETIME

2022-12-09 Thread Benedict
e reason why Java's maps aren't collections. > > > > > >> On Fri, 9 Dec 2022 at 11:26, Benedict wrote: >> Right, this is basically my view - it can be syntactic sugar for UNNEST >> subqueries as and when we offer those (if not now), but I think we should be >

Re: Aggregate functions on collections, collection functions and MAXWRITETIME

2022-12-09 Thread Benedict
se changes were happening none of the > persons involved on them felt the need of a discuss thread, and I opened this > thread as soon as Benedict objected to the changes. I think this is perfectly in > line with ASF's wise policy about lazy consensus: > https://community.apache.org/co

Re: Aggregate functions on collections, collection functions and MAXWRITETIME

2022-12-09 Thread Benedict
that we know that there is dissension >> about these changes. However, as those changes were happening none of the >> persons involved on them felt the need of a discuss thread, and I opened >> this thread as soon as Benedict objected to the changes. I think this is >>

Re: [VOTE] Release Apache Cassandra 4.1.0 (take2)

2022-12-12 Thread Benedict
I’m unsure that without more information it is very helpful to highlight in the release notes. We don’t even have a strong hypothesis tying this issue to 4.1.0 specifically, and don’t have a general policy of highlighting undiagnosed issues in release notes? > On 13 Dec 2022, at 00:48, Jon

Re: Aggregate functions on collections, collection functions and MAXWRITETIME

2022-12-06 Thread Benedict
only element of a singleton > collection. So, for example, COLLECTION_MAX(7) = COLLECTION_MAX([7]) = 7. > That ticket has already been reviewed and it's mostly ready to commit. > > Now we can go straight to the point: > > Recently Benedict brought back the idea of deprecatin

Re: [VOTE] Release Apache Cassandra 4.1.0 GA

2022-12-06 Thread Benedict
_SERIALIZABLE workload that >>> writes/reads to only 100 >>> partitions (v2 performs better for higher partition counts). We're still >>> investigating what's going >>> on. >>> >>> Should that be a -1 vote? I'm not sure :) >>>

Re: [DISCUSS] API modifications and when to raise a thread on the dev ML

2022-12-05 Thread Benedict
yaml, etc) and have jenkins >>>> send an email when changes are detected on them. Overkill? bad idea? >>>> :thinking:... >>>> >>>>> On 4/12/22 1:14, Dinesh Joshi wrote: >>>>> We should also very clearly list out what is considered a publi

Re: [DISCUSS] API modifications and when to raise a thread on the dev ML

2022-12-05 Thread Benedict
Changes” thread might not be a bad approach. > On 5 Dec 2022, at 14:16, Benjamin Lerer wrote: > >  > Benedict, I am confused. If you are so much concerned about virtual tables or > CQL why do you not track those components changes directly? People usually > label them correct

Re: Aggregate functions on collections, collection functions and MAXWRITETIME

2022-12-06 Thread Benedict
than > seeing them as collection variants, we should see them as variants that > operate on the data in a single row, rather than aggregating across multiple > rows. But even with that perspective I don’t know what the best name would > be. > >> On Dec 6, 2022, at 7:30 A

Re: [VOTE] Release Apache Cassandra 4.1.0 GA

2022-12-05 Thread Benedict
-0 CASSANDRA-18086 should probably be fixed and merged first, as Paxos v2 will be unlikely to work well for users without it. Either that or we need to update NEWS.txt to mention it. > On 5 Dec 2022, at 11:01, Aleksey Yeshchenko wrote: > > +1 > >> On 5 Dec 2022, at 10:17, Benjamin Lerer

Re: [DISCUSS] API modifications and when to raise a thread on the dev ML

2022-12-02 Thread Benedict
I think some of that text also got garbled by mixing up how you approach internal APIs and external APIs. We should probably clarify that there are different burdens for each. Which is all my fault as the formulator. I remember it being much clearer in my head. My view is the same as yours

Re: [VOTE] CEP-25: Trie-indexed SSTable format

2022-12-19 Thread Benedict
+1 > On 19 Dec 2022, at 13:00, Branimir Lambov wrote: > >  > Hi everyone, > > I'd like to propose CEP-25 for approval. > > Proposal: > https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-25%3A+Trie-indexed+SSTable+format > Discussion:

Re: [DISCUSS] CEP-26: Unified Compaction Strategy

2022-12-21 Thread Benedict
I’m personally very excited by this work. Compaction could do with a spring clean and this feels to formalise things much more cleanly, but density tiering in particular is something I’ve wanted to incorporate for years now, as it should significantly improve STCS behaviour (most importantly

Re: [DISCUSS] API modifications and when to raise a thread on the dev ML

2022-12-05 Thread Benedict
d then everyone can open it and check the list of tickets and >> comment directly on the tickets or open a thread if they think the issue >> deserves one. >> Same as having link to the tickets that are blockers or tickets that need >> reviewers. Whoever wants will have

Re: [DISCUSSION] New dependencies for SAI CEP-7

2022-12-14 Thread Benedict
I don’t believe we are ready to be prescriptive about how our randomised tests are written. 1) We want as many people to write randomised tests as possible, so do not want to create impediments. 2) We don’t, I expect, all agree on what a good randomised test looks like. I think Mike should include

Re: [DISCUSS] CEP-25: Trie-indexed SSTable format

2022-11-22 Thread Benedict
wandowski wrote: >> +1 for the proposal ! >> >> btw. regarding tests - perhaps we will have to let Python DTests run with >> either new or old format >> >> thanks >> - - -- --- - - >> Jacek Lewandowski >> >>

Re: [DISCUSS] CEP-25: Trie-indexed SSTable format

2022-11-21 Thread Benedict
e performance. In either > case it may be harder to cache. Do you have something different in mind? > > Regards, > Branimir > >> On Mon, Nov 21, 2022 at 3:01 PM Benedict wrote: >> Personally very pleased to see this proposal, and I’m not opposed to easing >> you

Re: [DISCUSS] CEP-25: Trie-indexed SSTable format

2022-11-21 Thread Benedict
Personally very pleased to see this proposal, and I’m not opposed to easing your migration by maintaining some light support for internal file versions - though would prefer the support have some version limit where it can be excised (maybe for one minor version bump?) One implementation

Re: [DISCUSS] CEP-25: Trie-indexed SSTable format

2022-11-21 Thread Benedict
>> On Mon, Nov 21, 2022 at 3:38 PM Benedict wrote: >> Buffering on write up to at most one page seems fine? Once you are past a >> single page it’s fine to write either to the end of the partition or to a >> separate file, there’s nothing much to be gained, but esp. for

Re: [DISCUSSION] Cassandra's code style and source code analysis

2022-11-28 Thread Benedict
en into disuse, most tooling commonly used by security >>> organizations doesn’t support it. SBOMs are a good example, as their >>> introduction postdates ant’s decline. Maven plugins exist to generate them >>> in CycloneDX and SPDX format, but no such plugins exist for an

Re: [DISCUSSION] Cassandra's code style and source code analysis

2022-11-28 Thread Benedict
>> also integrate with security tooling at their respective companies. Because >>> Ant has fallen into disuse, most tooling commonly used by security >>> organizations doesn’t support it. SBOMs are a good example, as their >>> introduction postdates ant’s decline. Ma

Re: [DISCUSSION] Cassandra's code style and source code analysis

2022-11-25 Thread Benedict
suspect most are at best ambivalent to this. So without evidence of strong support for the migration from those folk, I’ll continue to voice my concerns as one of the more vocal ones. > On 25 Nov 2022, at 10:07, Benedict wrote: > > There’s always a handful of people asking for it, bu

Re: [DISCUSSION] Cassandra's code style and source code analysis

2022-11-25 Thread Benedict
nk there are not any significant benefits to > switch even if it "just works" now? > > > From: Benedict > Sent: Friday, November 25, 2022 11:07 > To: dev@cassandra.apache.org > Subject: Re: [DISCUSSION] Cassandra's code style and source code analysis > > NetApp
