Re: [DISCUSS] putting versions into Deprecated annotations

2023-10-13 Thread Josh McKenzie
> If some piece of code is not used anymore then simplifying the code is the > best thing to do In the case of unused / unreferenced, sure. In the case of "other things use this but we shouldn't add any more dependencies on this because we need to remove it", a @Deprecated annotation w/version,

Re: Avoiding pushes to broken branches

2023-10-10 Thread Josh McKenzie
What about having a nag-bot that notifies #cassandra-dev on ASF slack hourly if the builds are broken? On Tue, Oct 10, 2023, at 8:02 AM, Mick Semb Wever wrote: > I'd like to suggest some improvements for identifying and announcing > broken branches and to avoid pushing commits to broken branches.

Re: [DISCUSS] putting versions into Deprecated annotations

2023-10-10 Thread Josh McKenzie
Sounds like we're relitigating the basics of how @Deprecated, forRemoval, since, and javadoc @link all intersect to make deprecation less painful ;) So: 1. Built-in java.lang.Deprecated: required 2. Can use since and forRemoval if you have that info handy and think it'd be useful (would make i

Re: [DISCUSS] putting versions into Deprecated annotations

2023-10-06 Thread Josh McKenzie
Might be nice to support a 3rd param that's a String for the reason it's deprecated. i.e. "Replaced by X", "Unmaintained", "Obsolete", "See CASSANDRA-N", link to a dev ML thread on pony mail, etc. That way if someone comes across it in the codebase they have some context to follow up on if

Re: [VOTE] Accept java-driver

2023-10-03 Thread Josh McKenzie
> I see now this will likely be instead apache/cassandra-java-driver I was wondering about that. apache/java-driver seemed pretty broad. :) From the linked page: Check that all active committers have a signed CLA on record. TODO – attach list I've been part of these discussions and work so am fam

Re: [DISCUSS] CEP-36: A Configurable ChannelProxy to alias external storage locations

2023-09-26 Thread Josh McKenzie
> it may be better to support most cloud storage > It simply only supports S3, which feels a bit customized for a certain user > and is not universal enough. Am I right? I agree w/the eventual goal (and constraint on design now) of supporting most popular cloud storage vendors, but if we have som

Re: [DISCUSS] Add JVector as a dependency for CEP-30

2023-09-22 Thread Josh McKenzie
ailing >>> list DISCUSS thread? It applies to all source code we take in, and accept >>> copyright assignment of, not to jars we depend on and not only to vector >>> related code contributions. >>> >>>> On Sep 22, 2023, at 7:29 AM, Josh Mc

Re: [DISCUSS] Add JVector as a dependency for CEP-30

2023-09-22 Thread Josh McKenzie
So if we're going to chat about GenAI on this thread here, 2 things: 1. A dependency we pull in != a code contribution (I am not a lawyer but my understanding is that with the former the liability rests on the provider of the lib to ensure it's in compliance with their claims to copyright and it

Re: [DISCUSS] Add JVector as a dependency for CEP-30

2023-09-21 Thread Josh McKenzie
Oops; thought I'd already +1'ed earlier in the thread. In case it wasn't clear: +1 on inclusion as-is. On Thu, Sep 21, 2023, at 4:00 PM, Josh McKenzie wrote: > My .02 re: the copyright: the library is licensed ASL v2.0. Who it's > originally copyrighted by / to (Jonath

Re: [DISCUSS] Add JVector as a dependency for CEP-30

2023-09-21 Thread Josh McKenzie
My .02 re: the copyright: the library is licensed ASL v2.0. Who it's originally copyrighted by / to (Jonathan personally, DataStax as a corporate entity, Santa Claus, my dog :)) doesn't really have any impact on the legalities of our ability to make use of it or the durability or safety of the c

Re: [DISCUSS] Backport CASSANDRA-18816 to 5.0? Add support for repair coordinator to retry messages that timeout

2023-09-19 Thread Josh McKenzie
I support including this in 5.0. This looks to me like a significant correctness and stabilization effort, very similar to other large bodies of work we merged in post freeze for testing and stabilizing 4.0. On Tue, Sep 19, 2023, at 5:42 PM, Chris Lohfink wrote: > I absolutely love the idea of

Re: [DISCUSS] Vector type and empty value

2023-09-19 Thread Josh McKenzie
> I am strongly in favour of permitting the table definition forbidding nulls - > and perhaps even defaulting to this behaviour. But I don’t think we should > have types that are inherently incapable of being null. I'm with Benedict. Seems like this could help prevent whatever "nulls in primary

Re: [Discuss] cleaning up build temp files

2023-08-13 Thread Josh McKenzie
> There's also tests that hardcode I started mentally twitching when I hit that point in the sentence. **Kill them with fire.** On Sun, Aug 13, 2023, at 4:51 PM, Mick Semb Wever wrote: >> >> https://github.com/apache/cassandra/blob/trunk/test/unit/org/apache/cassandra/db/DirectoriesTest.java#L71

Re: [Discuss] cleaning up build temp files

2023-08-13 Thread Josh McKenzie
> I think we want/need relative paths, e.g. "build/tmp", and if the path is in > a mounted volume there can be another container still running. Sure. The specifics of *what* path isn't interesting to me. The pattern of: 1. Let env declare where TEMP lives 2. Write things to TEMP 3. Delete things

Re: [Discuss] cleaning up build temp files

2023-08-13 Thread Josh McKenzie
Why not use "/${CASS_BUILD_TMP}/cassandra." on a given run and then on subsequent runs "rm -rf /${CASS_BUILD_TMP}/cassandra.*"? If CASS_BUILD_TMP is not defined, default to /tmp. "ant clean" can also wipe it. If it's a safe assumption that we only ever need 1 instance of data in that space (i
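The pattern proposed in this thread (let the environment say where temp data lives, write under a "cassandra." prefix, wipe leftovers on the next run) can be sketched as follows. This is a minimal illustration, not the project's build logic: the helper names build_tmp_root, clean_previous_runs and new_run_dir are hypothetical, and only the CASS_BUILD_TMP variable and the /tmp default come from the message.

```python
import glob
import os
import shutil
import tempfile


def build_tmp_root(env=os.environ):
    """Return the directory named by CASS_BUILD_TMP, defaulting to /tmp."""
    return env.get("CASS_BUILD_TMP") or "/tmp"


def clean_previous_runs(root):
    """Remove leftovers of earlier runs, like `rm -rf <root>/cassandra.*`."""
    for path in glob.glob(os.path.join(root, "cassandra.*")):
        if os.path.isdir(path) and not os.path.islink(path):
            shutil.rmtree(path, ignore_errors=True)
        else:
            os.remove(path)


def new_run_dir(root):
    """Wipe data from earlier runs, then create a fresh cassandra.* directory."""
    clean_previous_runs(root)
    return tempfile.mkdtemp(prefix="cassandra.", dir=root)
```

Calling new_run_dir(build_tmp_root()) at the start of a run assumes, as the message does, that only one instance of data is ever needed in that space; concurrent runs sharing a root would delete each other's files.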

Re: Tokenization and SAI query syntax

2023-08-07 Thread Josh McKenzie
s we use Caleb's suggestion, >>>>>> > > I'd ask >>>>>> > > that the queries that SASI and SAI both support use the same syntax, >>>>>> > > even >>>>>> > > if it means there's two ways of writing the

Re: August 5.0 Freeze (with waivers…) and a 5.0-alpha1

2023-08-07 Thread Josh McKenzie
Merge path for bugs on 3.0 is pretty brutal at this point. Good thing 2 will drop off when we GA 5.0. Updated wiki w/new branches plus some examples: link On Mon, Aug 7, 2023, at 11:18 AM, Mick Se

Re: [DISCUSSION] Shall we remove ant javadoc task?

2023-08-03 Thread Josh McKenzie
> > >> >> > +1 from me on removing the ant task. If someone feels the task is >> >> > useful they can always implement one that does not crash and add it >> >> > back. >> >> > >> >> > -Jeremiah >> >> >

Re: [DISCUSS] Creating a 5.0 landing page

2023-08-03 Thread Josh McKenzie
We actually already have an events page: https://cassandra.apache.org/_/events.html; not sure if you were saying we should add one Ekaterina or saying we should add this content there. +1 to the content there and having a landing page that points there + integrating meetups, town halls, etc. C

Re: [DISCUSSION] Shall we remove ant javadoc task?

2023-08-02 Thread Josh McKenzie
spect most people are not opening their browsers and > looking at Javadoc..." :) > > Cheers, > > Derek > > > > On Wed, Aug 2, 2023, 1:30 PM Josh McKenzie wrote: > most people are not looking at Javadoc when working on

Re: [DISCUSSION] Shall we remove ant javadoc task?

2023-08-02 Thread Josh McKenzie
> most people are not looking at Javadoc when working on the codebase. I definitely use it extensively **inside the IDE**. But never as a compiled set of external docs. Which is to say, I'm +1 on removing the target and I'd ask everyone to keep javadoccing your classes and methods where things a

Re: August 5.0 Freeze (with waivers…) and a 5.0-alpha1

2023-07-27 Thread Josh McKenzie
+1 to what you've stated here Mick with a question: where did we land on flagging new features as experimental? Seems like it's an "at author's discretion" - search of the list turned up not too much structure there. Had a statement to that effect from Benjamin here

Re: [DISCUSS] Maintain backwards compatibility after dependency upgrade in the 5.0

2023-07-27 Thread Josh McKenzie
+1 to the change pre 5.0. Any committers have bandwidth to review https://issues.apache.org/jira/projects/CASSANDRA/issues/CASSANDRA-14667? PR can be found here: https://github.com/apache/cassandra/pull/2238/files On Thu, Jul 27, 2023, at 7:59 AM, Maxim Muzafarov wrote: > Bump this topic up for

Re: [Discuss] Repair inside C*

2023-07-27 Thread Josh McKenzie
> The idea that your data integrity needs to be opt-in has never made sense to > me from the perspective of either the product or the end user. I could not agree with this more. 100%. > The current (and past) state of things where running the DB correctly > *requires* running a separate proces

Re: [DISCUSS] Using ACCP or tc-native by default

2023-07-26 Thread Josh McKenzie
+1 to the "on by default" camp. > What comes to mind is how we brought down people's clusters and made sstables > unreadable with the introduction of the chunk_length configuration in 1.0 I think a key difference here is that changing chunk length is something that materially changes behavior and

Re: Tokenization and SAI query syntax

2023-07-24 Thread Josh McKenzie
> `column CONTAINS term`. Contains is used by both Java and Python for > substring searches, so at least some users will be surprised by term-based > behavior. I wonder whether users are in their "programming language" headspace or in their "querying a database" headspace when interacting with C

Cassandra project status, 2023-07-19

2023-07-19 Thread Josh McKenzie
In case you were wondering, if you switch your ToDo list structure enough times you're bound to have something slip through the cracks. Like, say, a periodic project status update email. If you're curious where I landed: "All tools are comparably insufficient in different ways". /sigh Don't be

Re: Changing the output of tooling between majors

2023-07-13 Thread Josh McKenzie
> I just find it ridiculous we can not change "someProperty: 10" to "Some > Property: 10" and there is so much red tape about that. Well, we're talking about programmatic parsing here. This feels like complaining about a compiler that won't let you build if you're missing a ; We *can* change it,

Re: Fwd: [DISCUSS] Formalizing requirements for pre-commit patches on new CI

2023-07-12 Thread Josh McKenzie
gly indicative of the future here since >>> we've been allowing circle to validate pre-commit and haven't been >>> multiplexing.” >>> I am interested to compare how many tickets for flaky tests we will have >>> pre-5.0 now compared to pre-4.1.

Re: Fwd: [DISCUSS] Formalizing requirements for pre-commit patches on new CI

2023-07-12 Thread Josh McKenzie
>> we've been allowing circle to validate pre-commit and haven't been >> multiplexing.” >> I am interested to compare how many tickets for flaky tests we will have >> pre-5.0 now compared to pre-4.1. >> >> >> On Wed, 12 Jul 2023 at 8:41, Josh

Re: Fwd: [DISCUSS] Formalizing requirements for pre-commit patches on new CI

2023-07-12 Thread Josh McKenzie
at 07:24 Berenguer Blasi wrote: >> On our 4.0 release I remember a number of such failures but not recently. >> What is more common though is packaging errors, >> cdc/compression/system_ks_directory targeted fixes, CI w/wo upgrade tests, >> being less respons

Re: Fwd: [DISCUSS] Formalizing requirements for pre-commit patches on new CI

2023-07-11 Thread Josh McKenzie
currently used, but right now we get 2 CI > systems for twice the price. +1 on the proposed subsets. > > Derek > > On Mon, Jul 10, 2023 at 9:37 AM Josh McKenzie wrote: >> I'm personally not thinking about CircleCI at all; I'm envisioning a world

Re: Removal of CloudstackSnitch

2023-07-10 Thread Josh McKenzie
> 2) keep it there in 5.0 but mark it @Deprecated I'd say Deprecate, log warnings that it's not supported nor maintained and people to use it at their own risk, and that it's going to be removed. That is, assuming the maintenance burden of it isn't high. I assume not since, as Brandon said, they

Re: [DISCUSS] When to run CheckStyle and other verificiations

2023-07-10 Thread Josh McKenzie
> • Remove the checkstyle dependency from "jar" and "test" > • Create a single "check" target that includes all the checks we expect to > pass in the CI (currently Checkstyle, RAT, and Eclipse-Warnings), making this > task the default. +1 here. (of note: haven't forgotten the request from this

Re: Fwd: [DISCUSS] Formalizing requirements for pre-commit patches on new CI

2023-07-10 Thread Josh McKenzie
gs could be reasonable, since it's >> unlikely to have config-specific flaky tests. As in five configs with 100 >> repetitions each. >> >> On Fri, 7 Jul 2023 at 16:14, Josh McKenzie wrote: >>> Maybe. Kind of depends on how long we write our tests to run d

Re: Changing the output of tooling between majors

2023-07-08 Thread Josh McKenzie
at > the argument "lets not break it for folks in nodetool" is still relevant. CQL > output is there from times of 4.0 at least (at least!) and YAML / JSON is > also not something completely new. It is not like we are suddenly forcing > people to change their habits, there

Re: Changing the output of tooling between majors

2023-07-08 Thread Josh McKenzie
> Once there is, we are free to change the default output however we want. One thing I always try to keep in mind on discussions like this. A thought experiment (with very hand-wavy numbers; try not to get hung up on them): * Let's say there are 5,000 discrete "users" of C* out there (different g

Re: [DISCUSS] Allow UPDATE on settings virtual table to change running configuration

2023-07-07 Thread Josh McKenzie
This really is great work Maxim; definitely appreciate all the hard work that's gone into it and I think the users will too. In terms of where it should land, we discussed this type of question at length on the ML awhile ago and ended up codifying it in the wiki: https://cwiki.apache.org/conflu

Re: Fwd: [DISCUSS] Formalizing requirements for pre-commit patches on new CI

2023-07-07 Thread Josh McKenzie
Maybe. Kind of depends on how long we write our tests to run doesn't it? :) But point taken. Any non-trivial test would start to be something of a beast under this approach. On Fri, Jul 7, 2023, at 11:12 AM, Brandon Williams wrote: > On Fri, Jul 7, 2023 at 10:09 AM Josh McKenzie wrot

Re: Fwd: [DISCUSS] Formalizing requirements for pre-commit patches on new CI

2023-07-07 Thread Josh McKenzie
deterministic, I believe we will learn a thing or two, and those > types of things will happen less in time. > > Best regards, > Ekaterina > > -- Forwarded message - > From: *Josh McKenzie* > Date: Wed, 5 Jul 2023 at 8:25 > Subject: Re: [DISCUSS] For

Re: [DISCUSS] Formalizing requirements for pre-commit patches on new CI

2023-07-05 Thread Josh McKenzie
e truth is that a test that fails >>> is either a bug in the service code or a bug in the test. I've come to >>> realize that the CI and build framework is way too complex for me to be >>> able to help with much, but I would love to start chipping away at failing

Re: [DISCUSS] Formalizing requirements for pre-commit patches on new CI

2023-07-03 Thread Josh McKenzie
es and fit the run into the 1-2 hour build > timeframe); > - nightly builds (scheduled task to build everything we have once a > day and notify the ML if that build fails); > > > My question here is: > Should we mention in this concept how we will build the sub-projects > (e

Re: [DISCUSS] Formalizing requirements for pre-commit patches on new CI

2023-06-30 Thread Josh McKenzie
> Not everyone will have access to such resources, if all you have is 1 such > pod you'll be waiting a long time (in theory one month, and you actually need > a few bigger pods for some of the more extensive tests, e.g. large upgrade > tests)…. One thing worth calling out: I believe we have *

Re: [DISCUSS] Formalizing requirements for pre-commit patches on new CI

2023-06-30 Thread Josh McKenzie
>> Thanks Josh, this looks great! I think the constraints you've outlined are >> reasonable for an initial attempt. We can always evolve if we run into >> issues. >> >> Cheers, >> >> Derek >> >> On Fri, Jun 30, 2023 at 11:19 AM Josh McKenzi

[DISCUSS] Formalizing requirements for pre-commit patches on new CI

2023-06-30 Thread Josh McKenzie
Context: we're looking to get away from having split CircleCI and ASF CI as well as getting ASF CI to a stable state. There's a variety of reasons why it's flaky (orchestration, heterogenous hardware, hardware failures, flaky tests, non-deterministic runs, noisy neighbors, etc), many of which Mick

Re: [DISCUSS] When to run CheckStyle and other verificiations

2023-06-29 Thread Josh McKenzie
the no-checkstyle one? > > Trading one for one with Josh :-) > > Best regards, > Ekaterina > > On Thu, 29 Jun 2023 at 10:52, Josh McKenzie wrote: >>> I really prefer separate tasks than flags. Flags are not listed in the help >>> message like

Re: Improved DeletionTime serialization to reduce disk size

2023-06-29 Thread Josh McKenzie
> I would prefer we not plan on two distinct changes to this I agree with this sentiment, **and** > +1, if you have time for this approach and no other in this window. People are going to use 5.0 for awhile. Better to have an improvement in their hands for that duration than no improvement at all

Re: [DISCUSS] When to run CheckStyle and other verificiations

2023-06-29 Thread Josh McKenzie
> I really prefer separate tasks than flags. Flags are not listed in the help > message like "ant -p" and are not auto-completed in the terminal. That makes > them almost undiscoverable for newcomers. Please, no more flags. We are *more* than flaggy enough right now. Having to dig through build

Re: [DISCUSS] Maintain backwards compatibility after dependency upgrade in the 5.0

2023-06-28 Thread Josh McKenzie
Reasons 1 and 2 (getting into CVE coverage pipeline proactively rather than reactively, JDK consistency) seem compelling enough to justify the upgrade on their own to me. > This is a problem for applications/tools that rely on the cassandra > classpath (lib/jars) as after the upgrade they may be

Re: [VOTE] CEP 33 - CIDR filtering authorizer

2023-06-27 Thread Josh McKenzie
+1 On Tue, Jun 27, 2023, at 1:17 PM, Shailaja Koppu wrote: > Hi Team, > > (Starting a new thread for VOTE instead of reusing the DISCUSS thread, to > follow usual procedure). > > Please vote on CEP 33 - CIDR filtering authorizer > https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-33%3A

Re: Improved DeletionTime serialization to reduce disk size

2023-06-23 Thread Josh McKenzie
> If we’re doing this, why don’t we delta encode a vint from some per-sstable > minimum value? I’d expect that to commonly compress to a single byte or so. +1 to this approach. > Distant future people will not be happy about this, I can already tell you > now. Eh, they'll all be AI's anyway and
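The quoted suggestion (store each value as a variable-length integer delta from a per-sstable minimum) can be sketched as below. This is an illustration only: it uses a generic LEB128-style unsigned varint rather than Cassandra's own vint format, and every function name here is hypothetical.

```python
def encode_uvarint(n):
    """Encode a non-negative int, 7 bits per byte, low-order group first."""
    if n < 0:
        raise ValueError("uvarint cannot encode negative values")
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def decode_uvarint(buf):
    """Decode a value produced by encode_uvarint."""
    result = 0
    shift = 0
    for b in buf:
        result |= (b & 0x7F) << shift
        if not b & 0x80:
            return result
        shift += 7
    raise ValueError("truncated uvarint")


def encode_deltas(values):
    """Return (minimum, [varint-encoded offsets from that minimum])."""
    base = min(values)
    return base, [encode_uvarint(v - base) for v in values]


def decode_deltas(base, encoded):
    """Invert encode_deltas."""
    return [base + decode_uvarint(e) for e in encoded]
```

With second-resolution timestamps that sit close together, each offset fits in one byte, whereas the same encoding applied to the raw values needs five. That is the effect the quoted message expects ("a single byte or so"); values spread further from the minimum cost more bytes.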

Re: [DISCUSS] Using ACCP or tc-native by default

2023-06-23 Thread Josh McKenzie
+1 here on inclusion by default. On Fri, Jun 23, 2023, at 2:01 AM, Dinesh Joshi wrote: > This would be a good addition and would make Cassandra more performant out of > the box. > > Dinesh > >> On Jun 22, 2023, at 9:45 PM, Jordan West wrote: >> >> Glad to see there is support for this! I th

Re: [DISCUSS] Being specific about JDK versions and python lib versions in CI

2023-06-23 Thread Josh McKenzie
PM, Ekaterina Dimitrova >> wrote: >>> Wouldn’t we recommend people to use the test images the project CI use? >>> Thus using in testing the versions we use? I would assume the repeatable CI >>> will still expect test images the way we have now? >>> (I

[DISCUSS] Being specific about JDK versions and python lib versions in CI

2023-06-22 Thread Josh McKenzie
Been working with Mick on CI and it dawned on me that we don't specify a specific required JDK version (major.minor.patch) for CI nor do we actually specify required python lib versions for many of the requirements in our dtests

Re: Adding wiremock to test dependencies

2023-06-20 Thread Josh McKenzie
Speaking only to the "we don't want to add a dependency on something that's unstable or likely to fizzle out", looks good there to me: • Long-term project health / activity looks robust: https://github.com/wiremock/wiremock/graphs/contributors • Pretty diverse set of contributors in the last co

Re: [DISCUSS] Remove deprecated keyspace_count_warn_threshold and table_count_warn_threshold

2023-06-16 Thread Josh McKenzie
already exposes 50… then we should recommend no more than 100…. >>> >>>> I find it's better for usability to not count the system tables and just >>>> say "It's recommended not to have more than 100 tables. This doesn't >>>> include

Re: [DISCUSS] Remove deprecated keyspace_count_warn_threshold and table_count_warn_threshold

2023-06-14 Thread Josh McKenzie
counted, a recommendation for the > threshold would say something like "It's recommended not to have more than > 150 tables. The system already includes 45 tables for internal usage, so you > shouldn't create more than 105 user tables". I find it's bett
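The two ways of wording the recommendation in the quoted text are the same check with the system tables moved to the other side of the inequality. A small sketch, using the figures from the message (150 total, 45 system tables) purely as illustrative values and hypothetical function names:

```python
# Figures quoted in the message; illustrative values, not Cassandra defaults.
SYSTEM_TABLES = 45
TOTAL_THRESHOLD = 150


def warn_counting_all(user_tables, system_tables=SYSTEM_TABLES,
                      threshold=TOTAL_THRESHOLD):
    """Warn when user tables plus system tables exceed the total threshold."""
    return user_tables + system_tables > threshold


def warn_counting_user_only(user_tables,
                            threshold=TOTAL_THRESHOLD - SYSTEM_TABLES):
    """Warn when user tables alone exceed the reduced threshold (105 here)."""
    return user_tables > threshold
```

Both functions warn for exactly the same inputs; the difference is only which number an operator has to remember, and whether that number silently shifts when a release adds system tables.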

Re: [DISCUSS] Remove deprecated keyspace_count_warn_threshold and table_count_warn_threshold

2023-06-13 Thread Josh McKenzie
' and 'table_count_warn_threshold' >>> > configuration settings on the trunk branch for the next major release. >>> >>> Deprecate in 4.1 is way too new for me to accept that, and it's low effort >>> to keep; breaking users is always a bad idea and

Re: [VOTE] CEP-8 Datastax Drivers Donation

2023-06-13 Thread Josh McKenzie
+1 On Tue, Jun 13, 2023, at 10:55 AM, Jeremiah Jordan wrote: > +1 nb > > On Jun 13, 2023 at 9:14:35 AM, Jeremy Hanna > wrote: >> >> Calling for a vote on CEP-8 [1]. >> >> To clarify the intent, as Benjamin said in the discussion thread [2], the >> goal of this vote is simply to ensure that t

Re: [DISCUSS] Remove deprecated keyspace_count_warn_threshold and table_count_warn_threshold

2023-06-13 Thread Josh McKenzie
> have subsequently been deprecated since 4.1-alpha in CASSANDRA-17195 when > they were replaced/migrated to guardrails as part of CEP-3 (Guardrails). Have we been dropping support entirely for old params or using the @Replaces annotation into perpetuity? I dislike the idea of operators having t

Re: [DISCUSS] Limiting query results by size (CASSANDRA-11745)

2023-06-12 Thread Josh McKenzie
part of the user query, I think the server must always > have returned all data that fits into the LIMIT when all pages have been > returned. > > -Jeremiah > > On Jun 12, 2023 at 12:56:14 PM, Josh McKenzie wrote: >> >> Yeah, my bad. I have paging on the brain. S

Re: [DISCUSS] Limiting query results by size (CASSANDRA-11745)

2023-06-12 Thread Josh McKenzie
Yeah, my bad. I have paging on the brain. Seriously. I can't think of a use-case in which a LIMIT based on # bytes makes sense from a user perspective. On Mon, Jun 12, 2023, at 1:35 PM, Jeff Jirsa wrote: > > > On Mon, Jun 12, 2023 at 9:50 AM Benjamin Lerer wrote: >>> If you have rows that var

Re: [DISCUSS] Limiting query results by size (CASSANDRA-11745)

2023-06-12 Thread Josh McKenzie
> I do not have in mind a scenario where it could be useful to specify a LIMIT > in bytes. The LIMIT clause is usually used when you know how many rows you > wish to display or use. Unless somebody has a useful scenario in mind I do > not think that there is a need for that feature. If you have

Re: [DISCUSS] Bring cassandra-harry in tree as a submodule

2023-06-01 Thread Josh McKenzie
till calling >> out as it has been an issue. >> >> Josh, do you see any reports on what isn’t working? I think most people >> don’t touch 1% of what git can do… so it might be that 10% is broken but >> that no one in our domain actually touches that path?

Re: [DISCUSS] Bring cassandra-harry in tree as a submodule

2023-05-31 Thread Josh McKenzie
reason as the JVM dtests. It's nice to write a feature or fix, find a >> similar JVM dtest, copy, paste, and edit, and have something useful. >> >> 3. General subdivision of Cassandra projects >> >> This topic has come up quite a few times recently - aroun

Re: Is simplenative in cassandra-stress still relevant?

2023-05-31 Thread Josh McKenzie
hots here, if a community decides it has to go > so it will but I would be sad to see it. > > Regards > > > > From: Josh McKenzie > Sent: Wednesday, May 31, 2023 15:15 > To: dev > Subject: Re: Is simplenative in cassandra-stress still relevant?

Re: Is simplenative in cassandra-stress still relevant?

2023-05-31 Thread Josh McKenzie
> The main issue I see with maintaining the SimpleClient in cassandra-stress is > the burden it puts on a user to understand the options available when > connecting with *-mode*: How frequently do we expect users or devs to use the built-in cassandra-stress tool? Between tlp-stress and NoSQLBenc

Cassandra project status, 2023-05-30

2023-05-30 Thread Josh McKenzie
Been a bit over a month; let's check in and see how things are looking. We released the following: - 3.11.15 - 3.0.29 - 4.0.10 - 4.1.2 Thanks to all the release managers who worked on getting these out the door. [New Contributors Getting Started] First off, come hang out with us in the #cassand

Re: [DISCUSS] CEP-8 Drivers Donation - take 2

2023-05-30 Thread Josh McKenzie
> Is the vote for the CEP to be for all drivers, but we will introduce each > driver one by one? What determines when we are comfortable with one driver > subproject and can move on to accepting the next ? Curious to hear on this as well. There's 2 implications from the CEP as written: 1. The

Re: [VOTE] CEP-30 ANN Vector Search

2023-05-25 Thread Josh McKenzie
+1 On Thu, May 25, 2023, at 8:33 PM, Jake Luciani wrote: > +1 > > On Thu, May 25, 2023 at 11:45 AM Jonathan Ellis wrote: >> Let's make this official. >> >> CEP: >> https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-30%3A+Approximate+Nearest+Neighbor%28ANN%29+Vector+Search+via+Storage-At

Re: [DISCUSS] Bring cassandra-harry in tree as a submodule

2023-05-25 Thread Josh McKenzie
>>>> >>>>>> > We could go over some interesting examples such as testing 2i (SAI) >>>>>> >>>>>> +100 >>>>>> >>>>>> >>>>>> On Wed, May 24, 2023 at 1:40 PM Alex Petrov wrote:

Re: [DISCUSS] Bring cassandra-harry in tree as a submodule

2023-05-24 Thread Josh McKenzie
>> I would not set this improvement as a prerequisite to pulling Harry into the >> main branch, but rather interpret it as a commitment from myself to take >> community input and make it more approachable by the day. >> >> On Wed, May 24, 2023, at 2:44 PM, Josh McKenzie wro

Re: [DISCUSS] Bring cassandra-harry in tree as a submodule

2023-05-24 Thread Josh McKenzie
r few hours, where we >> could have cut many manual module releases in that time. >> >> David and folks working on accord? >> >> >> >> On Tue, 23 May 2023 at 20:09, Josh McKenzie wrote: >>> I'll hold off on this un

Re: Vector search demo, and query syntax

2023-05-24 Thread Josh McKenzie
+1 to the flow of: 1: ORDER BY? 2: Oh. Yeah. That *does* make sense. ;) (sending from fastmail in the hopes the image doesn't get stripped. Thanks ASF smtp server...) ~Josh On Wed, May 24, 2023, at 1:00 AM, Jeremiah D Jordan wrote: > At first I wasn’t sure about using ORDER BY, but the mor

Re: [DISCUSS] Bring cassandra-harry in tree as a submodule

2023-05-23 Thread Josh McKenzie
a released JAR. We can then reference Harry as >> a library without maintaining public artifacts for it. Is that in line with >> what you're thinking? >> >> > I'd also like to see us get a Harry run integrated as part of our >> > pre-commit CI >>

Re: [DISCUSS] Feature branch version hygiene

2023-05-18 Thread Josh McKenzie
h D Jordan wrote: > So what do we do with feature branch merged tickets in this model? *They > stay on 5.0-target after close and move to 5.0.0 when the epic is merged and > closes*? > >> On May 18, 2023, at 9:33 AM, Josh McKenzie wrote: >> >>> My mental model, thou

Re: [DISCUSS] Feature branch version hygiene

2023-05-18 Thread Josh McKenzie
we just need to get 5.0-alpha1 > labels added when those releases are cut. > > Then I propose we break the confusion in both directions by scrapping 5.0 > entirely and introducing 5.0-target. > > So tickets go to 5.0-target if they target 5.0, and to 5.0.0 once they are

Re: [DISCUSS] Feature branch version hygiene

2023-05-18 Thread Josh McKenzie
lieu of that, every ticket targeting 5.0 could use fixVersion 5.0.x, > since it is pretty clear what this means. Some tickets that don’t hit 5.0.0 > can then be postponed to a later version, but it’s not like this is > burdensome. Anything marked feature/improvement and 5.0.x gets bumped

Re: [DISCUSS] Feature branch version hygiene

2023-05-18 Thread Josh McKenzie
CEP-N seems like a good compromise. NextMajorRelease bumps into our interchangeable use of "Major" and "Minor" from a semver perspective and could get confusing. Suppose we could do NextFeatureRelease, but at that point why not just have it linked to the CEP and have the epic set. On Thu, May 1

[DISCUSS] Bring cassandra-harry in tree as a submodule

2023-05-16 Thread Josh McKenzie
Similar to what we've done with accord in https://issues.apache.org/jira/browse/CASSANDRA-18204, I'd like to discuss bringing cassandra-harry in-tree as a submodule. repo link: https://github.com/apache/cassandra-harry Given the value it's brought to the project's stabilization efforts and the

Re: [VOTE] CEP-29 CQL NOT Operator

2023-05-09 Thread Josh McKenzie
+1 On Tue, May 9, 2023, at 2:42 PM, Patrick McFadin wrote: > +1 > > On Tue, May 9, 2023 at 10:58 AM Caleb Rackliffe > wrote: >> +1 >> >> On Tue, May 9, 2023 at 12:04 PM Piotr Kołaczkowski >> wrote: >>> Let's vote. >>> >>> https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-29%3A+CQL+N

Re: [POLL] Vector type for ML

2023-05-05 Thread Josh McKenzie
Idiomatically, to my mind, there's a question of "what space are we thinking about this datatype in"? - In the context of mathematics, nullability in a vector would be 0 - In the context of Cassandra, nullability tends to mean a tombstone (or nothing) - In the context of programming languages, i

Re: [DISCUSS] New data type for vector search

2023-05-01 Thread Josh McKenzie
> If we want to make an ML-specific data type, it should be in an ML plug-in. How can we encourage a healthier plug-in ecosystem? As far as I know it's been pretty anemic historically: cassandra: https://cassandra.apache.org/doc/latest/cassandra/plugins/index.html postgres: https://www.postgresql

Re: [DISCUSS] New data type for vector search

2023-04-27 Thread Josh McKenzie
From a machine learning perspective, vectors are a well-known concept that are effectively immutable fixed-length n-dimensional values that are then later used either as part of a model or in conjunction with a model after the fact. While we could have this be non-frozen and not call it a vec

Re: Adding vector search to SAI with heirarchical navigable small world graph index

2023-04-25 Thread Josh McKenzie
To be fair Dinesh kind of primed that: > Do you intend to make this part of CEP-7 or as an incremental update to SAI > once it is committed? ;) I think this body of work more than stands on its own. Great work Jonathan, Mike, and Zhao; having native support for more ML-oriented workloads in C*

Cassandra project status, 2023-04-25

2023-04-25 Thread Josh McKenzie
We have a town hall coming up! The URL for the meetup can be found here: https://www.meetup.com/cassandra-global/events/292858262/. This will be held tomorrow at 12pm EST. Jon Haddad (https://www.linkedin.com/in/rustyrazorblade/) will be discussing performance tuning on Apache Cassandra, I'll b

Re: [DISCUSS] Next release date

2023-04-19 Thread Josh McKenzie
Let me try to break this down another way: I see a few competing concerns, each with QA related time requirements (asserting 8 weeks minimum, 16 weeks maximum we should plan for to stabilize a GA): 1. A freeze to a branch to stabilize for release (8-16 weeks of QA required after we branch) 2.

Re: [DISCUSS] [PATCH] Enable Direct I/O For CommitLog Files

2023-04-18 Thread Josh McKenzie
I took the liberty of creating https://issues.apache.org/jira/browse/CASSANDRA-18464 linking to this email thread w/the contents of your email and applying the patch to that ticket. Probably want to have some lower level discussions there when we find you a reviewer. On Tue, Apr 18, 2023, at 2

Re: [DISCUSS] Next release date

2023-04-17 Thread Josh McKenzie
> If this is true, why do we even bother running any CI before the CEP-21 > merge? It will all be invalidated anyway, right? I'm referring to manual validation or soak testing in qa environments rather than automated. Just because a soft-frozen branch without those features works in QA doesn't m

Re: [DISCUSS] Next release date

2023-04-17 Thread Josh McKenzie
means we branch there and anything not already merged has to wait > > > On Mon, Apr 17, 2023 at 3:37 PM Josh McKenzie wrote: >>> it's (b) for me, and everything minus 21 and 15 is defining enough to >>> warrant the branching and a checkpoint where tes

Re: [DISCUSS] Next release date

2023-04-17 Thread Josh McKenzie
> it's (b) for me, and everything minus 21 and 15 is defining enough to warrant > the branching and a checkpoint where testing can start Ok, I don't follow. There's three different ways I can read what you're saying here: 1. "Everything we have targeting 5.x is substantial and we can branch when

Re: [DISCUSS] Next release date

2023-04-17 Thread Josh McKenzie
"freeze" in this regard. On Mon, Apr 17, 2023, at 3:06 PM, Josh McKenzie wrote: > So to bring us back to the goals and alignment here: > >> With the following intentions: >> - moving towards the goal of annual releases, with a cadence 12±3 months >> apart, >

Re: [DISCUSS] Next release date

2023-04-17 Thread Josh McKenzie
So to bring us back to the goals and alignment here: > With the following intentions: > - moving towards the goal of annual releases, with a cadence 12±3 months > apart, > - the branch to GA period being 2-3 months, > - avoiding any type of freeze on trunk, > - getting a release out by December's

Re: [EXTERNAL] Re: Cassandra CI Status 2023-01-07

2023-04-17 Thread Josh McKenzie
wn to ~ 6 failures right now. On Mon, Mar 27, 2023, at 12:27 PM, Josh McKenzie wrote: > I'll take build lead for the next 2 weeks. > > On Sat, Mar 25, 2023, at 4:50 PM, Mick Semb Wever wrote: >>> Here comes Cassandra CI status for 2023-3-13 - 2023-23-179 : >>

Re: [DISCUSS] Next release date

2023-04-16 Thread Josh McKenzie
> 2. When CEP-15 lands we cut alpha1, > 2a. The deadline is first week of October, anything not yet in > cassandra-5.0 is not in 5.0, > 2b. We expect a minimum two months of testing and beta+rc releases > to get to GA. To clarify, is the intent here to say "The deadline for cutoff is 1st we

Re: (CVE only) support for 3,11 beyond published EOL

2023-04-13 Thread Josh McKenzie
> We already have an understanding and precedence in place that CVEs on > the previous unmaintained branch are addressed and released. Correct me if I'm wrong German, but the question I got from your email was effectively "If we consider formalizing our commitment to fixing CVE's on older branch

Re: [VOTE] Release Apache Cassandra 4.0.9 - SECOND ATTEMPT

2023-04-13 Thread Josh McKenzie
+1 On Thu, Apr 13, 2023, at 3:17 AM, Benjamin Lerer wrote: > +1 > > On Thu, Apr 13, 2023 at 08:56, Tommy Stendahl via dev wrote: >> +1 (nb) >> >> -Original Message- >> *From*: Brandon Williams >> *Reply-To*: dev@cassandra.apac

Re: [VOTE] CEP-26: Unified Compaction Strategy

2023-04-06 Thread Josh McKenzie
+1 On Thu, Apr 6, 2023, at 12:18 PM, Joseph Lynch wrote: > +1 > > This proposal looks really exciting! > > -Joey > > On Wed, Apr 5, 2023 at 2:13 AM Aleksey Yeshchenko wrote: > > > > +1 > > > > On 4 Apr 2023, at 16:56, Ekaterina Dimitrova wrote: > > > > +1 > > > > On Tue, 4 Apr 2023 at 11:44,

Re: [DISCUSS] Introduce DATABASE as an alternative to KEYSPACE

2023-04-06 Thread Josh McKenzie
> KEYSPACE is fine. If we want to introduce a standard nomenclature like > DATABASE that’s also fine. Inventing brand new ones is not fine, there’s no > benefit. I'm with Benedict in principle, with Aleksey in practice; I think KEYSPACE and SCHEMA are actually fine enough. If and when we get to

Re: [DISCUSS] Introduce DATABASE as an alternative to KEYSPACE

2023-04-04 Thread Josh McKenzie
I think there's competing dynamics here. 1) KEYSPACE isn't that great of a name; it's not a space in which keys are necessarily unique, and you can't address things just by key w/out their respective tables 2) DATABASE isn't that great of a name either due to the aforementioned ambiguity. Some
