The abstraction is meant to isolate Java API changes, not protocol compatibility. Moving to the new java-driver requires a number of code changes (see Steven's and my branches). If we can abstract that API, a future driver change should only touch the controller service implementation and not the processors, which should make the code more maintainable.
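To make the idea concrete, here is a minimal sketch of the kind of separation I mean. It is not what is on either branch: every name in it (CqlSessionService, InMemorySessionService, QueryRunner) is hypothetical, and the in-memory class only stands in for a real driver-backed implementation so the example runs without any Cassandra dependency.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class DriverAbstractionSketch {

    /**
     * Hypothetical driver-neutral service API. Nothing here refers to
     * driver classes, so processors coded against it never see them.
     */
    interface CqlSessionService {
        List<Map<String, Object>> execute(String cql, Object... values);
    }

    /**
     * Stand-in implementation (an assumption for this sketch). A real one
     * would wrap a driver session and translate its result rows into maps;
     * only that class would change when the driver API changes.
     */
    static final class InMemorySessionService implements CqlSessionService {
        final List<String> executed = new ArrayList<>();
        private final List<Map<String, Object>> cannedRows;

        InMemorySessionService(List<Map<String, Object>> cannedRows) {
            this.cannedRows = cannedRows;
        }

        @Override
        public List<Map<String, Object>> execute(String cql, Object... values) {
            executed.add(cql);   // record the statement for inspection
            return cannedRows;   // always return the canned rows
        }
    }

    /** Processor-side code: depends only on the interface above. */
    static final class QueryRunner {
        private final CqlSessionService service;

        QueryRunner(CqlSessionService service) {
            this.service = service;
        }

        int countRows(String table) {
            return service.execute("SELECT * FROM " + table).size();
        }
    }

    public static void main(String[] args) {
        InMemorySessionService service = new InMemorySessionService(
                List.of(Map.<String, Object>of("id", 1), Map.<String, Object>of("id", 2)));
        QueryRunner runner = new QueryRunner(service);
        System.out.println(runner.countRows("users"));
    }
}
```

With this shape, swapping driver generations would mean adding another CqlSessionService implementation while QueryRunner stays untouched. Whether the real API can be kept this small is an open question, given the Statement/PreparedStatement/BoundStatement surface.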
On Tue, Mar 19, 2024 at 10:14 AM Mike Thomsen <mikerthom...@gmail.com> wrote:

> https://opensource.docs.scylladb.com/stable/using-scylla/drivers/cql-drivers/scylla-java-driver.html
>
> Directly quoting Scylla docs here:
>
> > The Scylla Java Driver is a drop-in replacement for the DataStax Java
> > Driver. As such, no code changes are needed to use this driver.
>
> On Tue, Mar 19, 2024 at 10:13 AM Mike Thomsen <mikerthom...@gmail.com> wrote:
>
> > Matt,
> >
> > I don't think we need to really "abstract above" the drivers because the
> > Java DataStax driver appears to support 4.X all the way back to 2.X, as
> > well as the enterprise versions from DataStax
> >
> > https://docs.datastax.com/en/driver-matrix/docs/java-drivers.html
> >
> > Similar situation with Scylla. When I looked at the driver, it appeared to
> > copy verbatim the entire public API of that driver. So I think before we
> > dive into abstractions, it's worth doing a bit more validation of these
> > details. IMHO, this might be a much lighter lift than anticipated.
> >
> > On Mon, Mar 18, 2024 at 4:30 PM Matt Burgess <mattyb...@gmail.com> wrote:
> >
> > > Totally agree, that's what my branch does (see link in previous email).
> > > The more I work with it, the more I think I can abstract it further from
> > > their JDBC-like API but I started with a bunch of delegate classes then I
> > > figure I'll see where I can consolidate to more abstract concepts. If I
> > > don't have to support Cassandra 3 with the new API, so much the better.
> > >
> > > Regards,
> > > Matt
> > >
> > > On Mon, Mar 18, 2024 at 4:14 PM David Handermann <exceptionfact...@apache.org> wrote:
> > >
> > > > Matt et al,
> > > >
> > > > It is good to see the background effort on moving Cassandra
> > > > capabilities in a supportable direction.
> > > >
> > > > I think new Cassandra components will require a significant departure
> > > > from current Controller Service abstractions.
> > > > Right now, the existing service interface does not provide a clean
> > > > abstraction from the Cassandra library, which is part of the reason
> > > > for the current coupling to the legacy driver version.
> > > >
> > > > Following up from Joe's comments, it seems like the cleanest way
> > > > forward is to deprecate the current bundle on the 1.x branch, and
> > > > remove the current bundle from the main branch. That will provide a
> > > > clean slate for new Service and Processor implementations, without
> > > > concern for uncertain compatibility questions.
> > > >
> > > > Regards,
> > > > David Handermann
> > > >
> > > > On Mon, Mar 18, 2024 at 2:35 PM Matt Burgess <mattyb...@apache.org> wrote:
> > > >
> > > > > What do y'all think about removing the individual connection properties
> > > > > from the Cassandra processors for NiFi 2.0 and requiring a
> > > > > CassandraSessionProvider instead? I think we started doing that elsewhere
> > > > > (Elasticsearch maybe?), I noticed duplicate code in the
> > > > > CassandraSessionProvider and AbstractCassandraProcessor, if we keep those
> > > > > properties I can refactor them into a utility class.
> > > > >
> > > > > Thanks,
> > > > > Matt
> > > > >
> > > > > On Fri, Mar 15, 2024 at 2:44 PM Steven Matison <steven.mati...@gmail.com> wrote:
> > > > >
> > > > > > I got through quite a bit of work to enable 4.x…
> > > > > >
> > > > > > The 3.x pieces that were not backwards compatible is very edge use case and
> > > > > > could have been done slightly differently but with work around.
> > > > > >
> > > > > > https://github.com/steven-matison/nifi/tree/nifi-10120-1
> > > > > >
> > > > > > On Fri, Mar 15, 2024 at 2:30 PM Matt Burgess <mattyb...@apache.org> wrote:
> > > > > >
> > > > > > > Oops used the wrong email address so if there have been responses to the
> > > > > > > Cassandra thread since mine I missed them, my bad!
> > > > > > >
> > > > > > > On Fri, Mar 15, 2024 at 2:00 PM Matt Burgess <mattyb...@gmail.com> wrote:
> > > > > > >
> > > > > > > > I believe the CQL protocol is backwards compatible but the Java API is
> > > > > > > > not. For example "com.datastax.driver.core.Session" is now
> > > > > > > > "com.datastax.oss.driver.api.core.session.Session" and there is no more
> > > > > > > > "Cluster" class. Might be fairly trivial to fix though, if that's the
> > > > > > > > path of least resistance.
> > > > > > > >
> > > > > > > > On Fri, Mar 15, 2024 at 1:40 PM Joe Witt <joe.w...@gmail.com> wrote:
> > > > > > > >
> > > > > > > > > Matt
> > > > > > > > >
> > > > > > > > > I dont know a ton about Cassandra but when I looked at client/driver notes
> > > > > > > > > for 4+ it said it was compatible all the way back to 3.x. Not sure what
> > > > > > > > > that means but it surely seems worth exploring. Also I dont know if the
> > > > > > > > > 4.x drivers get rid of the vulnerable bits.
> > > > > > > > >
> > > > > > > > > Thanks
> > > > > > > > >
> > > > > > > > > On Fri, Mar 15, 2024 at 10:39 AM Matt Burgess <mattyb...@apache.org> wrote:
> > > > > > > > >
> > > > > > > > > > At the very least we should upgrade to Cassandra 3.11.6:
> > > > > > > > > > https://github.com/apache/cassandra/blob/cassandra-3.11.16/CHANGES.txt
> > > > > > > > > >
> > > > > > > > > > On Fri, Mar 15, 2024 at 1:31 PM Matt Burgess <mattyb...@apache.org> wrote:
> > > > > > > > > >
> > > > > > > > > > > If the community agrees to get rid of Cassandra 3 that'll save me effort
> > > > > > > > > > > on the refactor after I add Cassandra 4 :) Otherwise those
> > > > > > > > > > > vulnerabilities would only be in a "new" Cassandra 3 services NAR that
> > > > > > > > > > > would not be included in the convenience binary.
> > > > > > > > > > >
> > > > > > > > > > > On Fri, Mar 15, 2024 at 1:28 PM Joe Witt <joe.w...@gmail.com> wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Mike, Matt,
> > > > > > > > > > > >
> > > > > > > > > > > > Happy to hear you both have active efforts or are interested in doing so.
> > > > > > > > > > > > Can you help me understand more specifically what that means for the
> > > > > > > > > > > > current set of components?
> > > > > > > > > > > >
> > > > > > > > > > > > The CVE hits are concerning and long standing. Supporting Cassandra 3
> > > > > > > > > > > > implies the current set of dependencies would remain too right?
> > > > > > > > > > > >
> > > > > > > > > > > > Is the current set of components we have ones we want to retain? We
> > > > > > > > > > > > certainly need Cassandra components - but are the ones we have now the
> > > > > > > > > > > > right ones?
> > > > > > > > > > > >
> > > > > > > > > > > > Thanks
> > > > > > > > > > > > Joe
> > > > > > > > > > > >
> > > > > > > > > > > > On Fri, Mar 15, 2024 at 10:25 AM Matt Burgess <mattyb...@apache.org> wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > I'm actively working this, I pushed my branch up in case anyone wants to
> > > > > > > > > > > > > take a look [1]. The idea is to abstract the Cassandra API "up a couple
> > > > > > > > > > > > > levels" and provide implementations for Cassandra 3, 4, and eventually 5.
> > > > > > > > > > > > > For JDBC-like interfaces this is a PITA because of the API (Statement,
> > > > > > > > > > > > > PreparedStatement, BoundStatement, ResultSet, etc.) but I'm hoping we can
> > > > > > > > > > > > > find a common pattern for abstracting the third-party library
> > > > > > > > > > > > > implementation and API from the NiFi component (Processor,
> > > > > > > > > > > > > ControllerService, etc.) API. I think we're doing something similar for
> > > > > > > > > > > > > Kafka?
> > > > > > > > > > > > >
> > > > > > > > > > > > > Regards,
> > > > > > > > > > > > > Matt
> > > > > > > > > > > > >
> > > > > > > > > > > > > [1] https://github.com/mattyb149/nifi/tree/cassy4
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Fri, Mar 15, 2024 at 8:43 AM Mike Thomsen <mikerthom...@gmail.com> wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > > That’s been on my todo list for a little while but things kept coming up.
> > > > > > > > > > > > > > I think I could get started on that now.
> > > > > > > > > > > > > > Based on my initial research it appears that scylla uses the exact same
> > > > > > > > > > > > > > api as datastax so supporting both in a cql bundle should theoretically
> > > > > > > > > > > > > > be fairly easy.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Sent from my iPhone
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > On Mar 14, 2024, at 6:18 PM, Joe Witt <joew...@apache.org> wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Team,
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Cassandra remains a really important system to be able to send data to.
> > > > > > > > > > > > > > > However, it seems like we've not maintained these well. We have what
> > > > > > > > > > > > > > > appears to be at least a full generation behind on client versions (we
> > > > > > > > > > > > > > > are on 3x vs 4x which is the latest stable with 5x apparently coming
> > > > > > > > > > > > > > > shortly).
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > We have components to send data, query data, and use Cassandra as a cache
> > > > > > > > > > > > > > > store. We have older mechanisms for json/avro and publish mechanisms for
> > > > > > > > > > > > > > > records.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > The libraries we do have depend on outdated versions of Guava and result
> > > > > > > > > > > > > > > in many CVE hits.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > I am inclined to think we should deprecate the 1.x components and remove
> > > > > > > > > > > > > > > them as-is from the 2.x line. Then re-introduce them with record only
> > > > > > > > > > > > > > > interfaces and built against the latest stable
> > > > > > > > > > > > > > > Cassandra/Datastax/ScyllaDB interfaces.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > I'd love to hear thoughts from those closer to this space both as a user
> > > > > > > > > > > > > > > and developer so we can make good next steps.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Thanks