Re: [discuss] What to do with the Cassandra components

2024-05-20 Thread Matt Burgess
Yeah as I was going through it and creating all those delegates I thought about trying to define a higher-level API but that's about where I stopped. I'd definitely be interested to see what you've got. Thanks, Matt On Mon, May 20, 2024 at 12:50 PM Mike Thomsen wrote: > Matt, > > IIRC in your

Re: [discuss] What to do with the Cassandra components

2024-05-20 Thread Mike Thomsen
Matt, IIRC in your PR you had abstractions on the basic APIs. I think the right approach would be to make the driver controller services provide an API for querying so the processors and other services never have to work with the api or an abstraction of it. I have a lot of progress on such an
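The idea in the message above (the controller service exposes a query API so that processors never touch the driver) can be sketched roughly as follows. This is only an illustration under stated assumptions: `CqlQueryService`, `InMemoryCqlQueryService`, and `UserLookup` are hypothetical names that do not come from the thread or from NiFi, and the in-memory class stands in for a real implementation that would wrap a driver session inside the services NAR.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical service-level API: callers depend only on this interface,
// never on a vendor driver class such as a DataStax or Scylla Session.
interface CqlQueryService {
    List<Map<String, Object>> query(String cql, Object... params);
}

// Stand-in implementation for illustration only. It records the statement and
// returns canned rows; it does not parse CQL or apply the bound parameters.
final class InMemoryCqlQueryService implements CqlQueryService {
    private final List<Map<String, Object>> rows;
    final List<String> executed = new ArrayList<>();

    InMemoryCqlQueryService(List<Map<String, Object>> rows) {
        this.rows = new ArrayList<>(rows);
    }

    @Override
    public List<Map<String, Object>> query(String cql, Object... params) {
        executed.add(cql);
        return rows;
    }
}

// Hypothetical processor-side code: it compiles against the interface alone,
// so swapping the driver behind the service requires no change here.
final class UserLookup {
    private final CqlQueryService service;

    UserLookup(CqlQueryService service) {
        this.service = service;
    }

    List<String> namesInCity(String city) {
        List<String> names = new ArrayList<>();
        for (Map<String, Object> row
                : service.query("SELECT name FROM users WHERE city = ?", city)) {
            names.add(String.valueOf(row.get("name")));
        }
        return names;
    }
}

final class CqlServiceSketch {
    public static void main(String[] args) {
        List<Map<String, Object>> rows = new ArrayList<>();
        rows.add(Map.<String, Object>of("name", "ada"));
        rows.add(Map.<String, Object>of("name", "grace"));
        CqlQueryService service = new InMemoryCqlQueryService(rows);
        System.out.println(new UserLookup(service).namesInCity("London"));
    }
}
```

With this shape, only the service implementation would need to change if the driver's Java API changes again; whether rows should be exposed as plain maps or as something richer is left open here.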

Re: [discuss] What to do with the Cassandra components

2024-05-17 Thread Matt Burgess
Now that the Couchbase PR is up I can continue my work on this if everyone's ok with the approach. On Fri, May 17, 2024 at 5:30 AM Pierre Villard wrote: > Hey guys, what's the latest on this? > > On Sat, Mar 23, 2024 at 01:49, Mike Thomsen > wrote: > > > Fair enough, Joe. > > > > Matt, > > >

Re: [discuss] What to do with the Cassandra components

2024-05-17 Thread Pierre Villard
Hey guys, what's the latest on this? On Sat, Mar 23, 2024 at 01:49, Mike Thomsen wrote: > Fair enough, Joe. > > Matt, > > I poked around your branch a little this evening. I agree with you and > David 100% now on the need for some sort of abstraction. I think the Graph > Bundle's model could

Re: [discuss] What to do with the Cassandra components

2024-03-22 Thread Mike Thomsen
Fair enough, Joe. Matt, I poked around your branch a little this evening. I agree with you and David 100% now on the need for some sort of abstraction. I think the Graph Bundle's model could serve as a good starting point for how to approach the problem. The client drivers in that bundle do the

Re: [discuss] What to do with the Cassandra components

2024-03-22 Thread Joe Witt
Mike, The bundles we include cannot have libs with known vulns and that last a very long time. That is a more pressing blocker. As noted top of thread we all recognize the importance of being able to integrate with Cassandra but including that must come with active mx especially as it relates to

Re: [discuss] What to do with the Cassandra components

2024-03-22 Thread Mike Thomsen
The scope tag was probably copy pasta. You raise a valid point about the processor dependencies that completely slipped my mind. That said, I think it's premature to remove the cassandra bundle until we have a working replacement. I would consider that support a blocker for 2.X. On Fri, Mar 22,

Re: [discuss] What to do with the Cassandra components

2024-03-22 Thread Matt Burgess
David beat me to it :) IMO the only NAR that should have any dependencies on Cassandra is the services NAR, not the processors or services API. On Fri, Mar 22, 2024 at 11:10 AM David Handermann < exceptionfact...@apache.org> wrote: > Mike, > > Thanks for sharing the branch, it is helpful to have

Re: [discuss] What to do with the Cassandra components

2024-03-22 Thread David Handermann
Mike, Thanks for sharing the branch, it is helpful to have that as a reference example. Have you been able to exercise any of that approach at runtime? Based on what is there right now, attempting to mark the DataStax java-driver-core as provided does not look like it will work. It may pass unit

Re: [discuss] What to do with the Cassandra components

2024-03-22 Thread Mike Thomsen
Work so far: https://github.com/MikeThomsen/nifi/tree/cql-changes On Thu, Mar 21, 2024 at 9:52 AM Mike Thomsen wrote: > Matt/David, > > By this evening, I should be at a point where I can share my branch. It > should be far enough along that y'all can see what I mean about how most of > the

Re: [discuss] What to do with the Cassandra components

2024-03-21 Thread Mike Thomsen
Matt/David, By this evening, I should be at a point where I can share my branch. It should be far enough along that y'all can see what I mean about how most of the changes really weren't that complicated. My sense is that if we collaborate on it, we can probably get it ready for a PR within a

Re: [discuss] What to do with the Cassandra components

2024-03-20 Thread Mike Thomsen
If it were that simple, they would probably have just gone with that solution. That said, the API is functionally vendor agnostic at this point at the Java API level. So I see no need to add abstraction above that. I've got probably 2/3 of nifi-cassandra-bundle converted. Hitting a few pain points

Re: [discuss] What to do with the Cassandra components

2024-03-20 Thread Matt Burgess
It would be interesting to see if you exclude the Scylla API JAR from the Scylla implementation and instead include DataStax's, if that works. However I'm still leaning towards a vendor-agnostic API. On Wed, Mar 20, 2024 at 11:26 AM Mike Thomsen wrote: > At first glance, the package names look

Re: [discuss] What to do with the Cassandra components

2024-03-20 Thread Mike Thomsen
At first glance, the package names look identical to me: https://java-driver.docs.scylladb.com/scylla-4.15.0.x/api/index.html So I see no reason to not take them at their word that it's drop-in On Wed, Mar 20, 2024 at 11:04 AM David Handermann < exceptionfact...@apache.org> wrote: > Mike, > >

Re: [discuss] What to do with the Cassandra components

2024-03-20 Thread David Handermann
Mike, One important thing to mention about the DataStax vs ScyllaDB driver is that the Maven coordinates are different, and managing the dependencies correctly will make or break the implementation. In other words, if it is possible to use the DataStax 4 core JAR in the Controller Service API,

Re: [discuss] What to do with the Cassandra components

2024-03-20 Thread Mike Thomsen
Matt/David, I got some time this morning to take a crack at directly migrating it over to the DataStax 4.17 driver. Definitely got a lot of work to do, but so far I haven't hit any real snags. This is a branch that reverts the commit to remove the cassandra bundle and reuses the existing features

Re: [discuss] What to do with the Cassandra components

2024-03-19 Thread Mike Thomsen
A cursory look at the Cassandra 5 stuff didn’t indicate any incompatibility. So yeah, I think we are likely pretty safe to use the 4.17 driver. > On Mar 19, 2024, at 3:35 PM, Matt Burgess wrote: > > Is it likely now (due to the refactor) that we will simply be able to >

Re: [discuss] What to do with the Cassandra components

2024-03-19 Thread Matt Burgess
Is it likely now (due to the refactor) that we will simply be able to upgrade the driver when Cassandra 5 is GA? Also does anyone use Netflix's Astyanax [1]? [1] https://cassandra.apache.org/doc/stable/cassandra/getting_started/drivers.html#java On Tue, Mar 19, 2024 at 3:10 PM Mike Thomsen

Re: [discuss] What to do with the Cassandra components

2024-03-19 Thread Mike Thomsen
Realistically, I think we are only likely to see two drivers: * DataStax * ScyllaDB The latter makes a selling point of being a binary compatible, drop-in replacement for the former. That's why I don't see a need to have an abstraction layer per se. I think we only need

Re: [discuss] What to do with the Cassandra components

2024-03-19 Thread David Handermann
Mike, Thanks for the reply and clarification. I agree there is no need to maintain support for the DataStax 3 driver and Java API, any new components should be built on the latest version of the driver. What we do need going forward is to avoid, if at all possible, having a DataStax 4

Re: [discuss] What to do with the Cassandra components

2024-03-19 Thread Mike Thomsen
** we can dump v3 **DRIVER** compatibility, since later 4.X Java drivers are backward compatible with Cassandra 3 On Tue, Mar 19, 2024 at 2:43 PM Mike Thomsen wrote: > David, > > Before we proceed, I think we should make sure we're all understanding the > same problem here. Starting with this:

Re: [discuss] What to do with the Cassandra components

2024-03-19 Thread Mike Thomsen
David, Before we proceed, I think we should make sure we're all understanding the same problem here. Starting with this: > I believe the CQL protocol is backwards compatible but the Java API is not. > For example "com.datastax.driver.core.Session" is now >

Re: [discuss] What to do with the Cassandra components

2024-03-19 Thread David Handermann
All, I support a Controller Service API abstraction around the Cassandra Driver. The changes from DataStax 3 to 4 already highlight the need for that abstraction. The donation of the DataStax Java driver to Apache [1] also shows the value of providing some level of isolation, if at all possible.

Re: [discuss] What to do with the Cassandra components

2024-03-19 Thread Mike Thomsen
Matt, I got that. My point was that the Java changes appear to be a one time thing that DataStax did to make a better driver with a much more future-proof API. Since Scylla tracks them as closely as possible, I suspect that we don't need to plan for a bunch of abstraction to isolate Java changes.

Re: [discuss] What to do with the Cassandra components

2024-03-19 Thread Steven Matison
That was kinda where I got stuck and fell out on my branch/jira. Mike and I wanted to make a new controller service, without backward compatibility; and remove the duplicate driver/connection properties found in some of the processors. I agree taking out all old stuff and making new controller

Re: [discuss] What to do with the Cassandra components

2024-03-19 Thread Matt Burgess
The abstraction is to isolate Java API changes, not protocol compatibility. Changing to the java-driver comes with a number of changes to the code (see Steven's and my branches), if we can abstract that API it should lead to more maintainable code in the future by not having to change any

Re: [discuss] What to do with the Cassandra components

2024-03-19 Thread Mike Thomsen
https://opensource.docs.scylladb.com/stable/using-scylla/drivers/cql-drivers/scylla-java-driver.html Directly quoting Scylla docs here: > The Scylla Java Driver is a drop-in replacement for the DataStax Java Driver. As such, no code changes are needed to use this driver. On Tue, Mar 19, 2024 at

Re: [discuss] What to do with the Cassandra components

2024-03-19 Thread Mike Thomsen
Matt, I don't think we need to really "abstract above" the drivers because the Java DataStax driver appears to support 4.X all the way back to 2.X, as well as the enterprise versions from DataStax https://docs.datastax.com/en/driver-matrix/docs/java-drivers.html Similar situation with Scylla.

Re: [discuss] What to do with the Cassandra components

2024-03-18 Thread Matt Burgess
Sounds like a plan, thanks much! On Mon, Mar 18, 2024 at 4:49 PM David Handermann < exceptionfact...@apache.org> wrote: > Matt, > > Thanks, that makes sense on further review of the branch you mentioned > previously. > > It sounds like we have a general consensus on a way forward. > > I will

Re: [discuss] What to do with the Cassandra components

2024-03-18 Thread David Handermann
Matt, Thanks, that makes sense on further review of the branch you mentioned previously. It sounds like we have a general consensus on a way forward. I will proceed with writing up the Jira issues and putting together pull requests for deprecation and removal of the existing Cassandra 3

Re: [discuss] What to do with the Cassandra components

2024-03-18 Thread Matt Burgess
Totally agree, that's what my branch does (see link in previous email). The more I work with it, the more I think I can abstract it further from their JDBC-like API but I started with a bunch of delegate classes then I figure I'll see where I can consolidate to more abstract concepts. If I don't

Re: [discuss] What to do with the Cassandra components

2024-03-18 Thread David Handermann
Matt et al, It is good to see the background effort on moving Cassandra capabilities in a supportable direction. I think new Cassandra components will require a significant departure from current Controller Service abstractions. Right now, the existing service interface does not provide a clean

Re: [discuss] What to do with the Cassandra components

2024-03-18 Thread Matt Burgess
What do y'all think about removing the individual connection properties from the Cassandra processors for NiFi 2.0 and requiring a CassandraSessionProvider instead? I think we started doing that elsewhere (Elasticsearch maybe?), I noticed duplicate code in the CassandraSessionProvider and

Re: [discuss] What to do with the Cassandra components

2024-03-15 Thread Steven Matison
I got through quite a bit of work to enable 4.x… The 3.x pieces that were not backwards compatible are a very edge use case and could have been done slightly differently with a workaround. https://github.com/steven-matison/nifi/tree/nifi-10120-1 On Fri, Mar 15, 2024 at 2:30 PM Matt Burgess

Re: [discuss] What to do with the Cassandra components

2024-03-15 Thread Matt Burgess
Oops used the wrong email address so if there have been responses to the Cassandra thread since mine I missed them, my bad! On Fri, Mar 15, 2024 at 2:00 PM Matt Burgess wrote: > I believe the CQL protocol is backwards compatible but the Java API is > not. For example

Re: [discuss] What to do with the Cassandra components

2024-03-15 Thread Matt Burgess
I believe the CQL protocol is backwards compatible but the Java API is not. For example "com.datastax.driver.core.Session" is now "com.datastax.oss.driver.api.core.session.Session" and there is no more "Cluster" class. Might be fairly trivial to fix though, if that's the path of least resistance.

Re: [discuss] What to do with the Cassandra components

2024-03-15 Thread Joe Witt
Matt, I don't know a ton about Cassandra but when I looked at client/driver notes for 4+ it said it was compatible all the way back to 3.x. Not sure what that means but it surely seems worth exploring. Also I don't know if the 4.x drivers get rid of the vulnerable bits. Thanks On Fri, Mar 15,

Re: [discuss] What to do with the Cassandra components

2024-03-15 Thread Matt Burgess
At the very least we should upgrade to Cassandra 3.11.16: https://github.com/apache/cassandra/blob/cassandra-3.11.16/CHANGES.txt On Fri, Mar 15, 2024 at 1:31 PM Matt Burgess wrote: > If the community agrees to get rid of Cassandra 3 that'll save me effort > on the refactor after I add Cassandra

Re: [discuss] What to do with the Cassandra components

2024-03-15 Thread Matt Burgess
If the community agrees to get rid of Cassandra 3 that'll save me effort on the refactor after I add Cassandra 4 :) Otherwise those vulnerabilities would only be in a "new" Cassandra 3 services NAR that would not be included in the convenience binary. On Fri, Mar 15, 2024 at 1:28 PM Joe Witt

Re: [discuss] What to do with the Cassandra components

2024-03-15 Thread Joe Witt
Mike, Matt, Happy to hear you both have active efforts or are interested in doing so. Can you help me understand more specifically what that means for the current set of components? The CVE hits are concerning and long standing. Supporting Cassandra 3 implies the current set of dependencies

Re: [discuss] What to do with the Cassandra components

2024-03-15 Thread Matt Burgess
I'm actively working this, I pushed my branch up in case anyone wants to take a look [1]. The idea is to abstract the Cassandra API "up a couple levels" and provide implementations for Cassandra 3, 4, and eventually 5. For JDBC-like interfaces this is a PITA because of the API (Statement,

Re: [discuss] What to do with the Cassandra components

2024-03-15 Thread Mike Thomsen
That’s been on my todo list for a little while but things kept coming up. I think I could get started on that now. Based on my initial research it appears that scylla uses the exact same api as datastax so supporting both in a cql bundle should theoretically be fairly easy.

[discuss] What to do with the Cassandra components

2024-03-14 Thread Joe Witt
Team, Cassandra remains a really important system to be able to send data to. However, it seems like we've not maintained these well. We appear to be at least a full generation behind on client versions (we are on 3.x vs 4.x, which is the latest stable, with 5.x apparently coming shortly).