Matt,

Thanks, that makes sense on further review of the branch you mentioned
previously.

It sounds like we have a general consensus on a way forward.

I will proceed with writing up the Jira issues and putting together
pull requests for deprecation and removal of the existing Cassandra 3
components. That should put things in a good place to land the new
capabilities when they are ready. It also resolves the current
vulnerability findings with the legacy driver, so this approach is
helpful on several fronts.

Regards,
David Handermann

On Mon, Mar 18, 2024 at 3:30 PM Matt Burgess <mattyb...@gmail.com> wrote:
>
> Totally agree, that's what my branch does (see link in previous email). The
> more I work with it, the more I think I can abstract it further from their
> JDBC-like API, but I started with a bunch of delegate classes and figure
> I'll see where I can consolidate into more abstract concepts. If I don't have
> to support Cassandra 3 with the new API, so much the better.
>
> Regards,
> Matt
>
> On Mon, Mar 18, 2024 at 4:14 PM David Handermann <exceptionfact...@apache.org> wrote:
>
> > Matt et al,
> >
> > It is good to see the background effort on moving Cassandra
> > capabilities in a supportable direction.
> >
> > I think new Cassandra components will require a significant departure
> > from current Controller Service abstractions. Right now, the existing
> > service interface does not provide a clean abstraction from the
> > Cassandra library, which is part of the reason for the current
> > coupling to the legacy driver version.
> >
> > Following up from Joe's comments, it seems like the cleanest way
> > forward is to deprecate the current bundle on the 1.x branch, and
> > remove the current bundle from the main branch. That will provide a
> > clean slate for new Service and Processor implementations, without
> > concern for uncertain compatibility questions.
> >
> > Regards,
> > David Handermann
> >
> > On Mon, Mar 18, 2024 at 2:35 PM Matt Burgess <mattyb...@apache.org> wrote:
> > >
> > > What do y'all think about removing the individual connection properties
> > > from the Cassandra processors for NiFi 2.0 and requiring a
> > > CassandraSessionProvider instead? I think we started doing that elsewhere
> > > (Elasticsearch maybe?). I noticed duplicate code in the
> > > CassandraSessionProvider and AbstractCassandraProcessor; if we keep those
> > > properties I can refactor them into a utility class.
> > >
> > > Thanks,
> > > Matt
> > >
> > >
> > > On Fri, Mar 15, 2024 at 2:44 PM Steven Matison <steven.mati...@gmail.com> wrote:
> > >
> > > > I got through quite a bit of work to enable 4.x…
> > > >
> > > > The 3.x pieces that were not backwards compatible are very much edge use
> > > > cases and could have been done slightly differently, with a workaround.
> > > >
> > > > https://github.com/steven-matison/nifi/tree/nifi-10120-1
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > On Fri, Mar 15, 2024 at 2:30 PM Matt Burgess <mattyb...@apache.org> wrote:
> > > >
> > > > > Oops used the wrong email address so if there have been responses to the
> > > > > Cassandra thread since mine I missed them, my bad!
> > > > >
> > > > > On Fri, Mar 15, 2024 at 2:00 PM Matt Burgess <mattyb...@gmail.com> wrote:
> > > > >
> > > > > > I believe the CQL protocol is backwards compatible but the Java API is
> > > > > > not. For example "com.datastax.driver.core.Session" is now
> > > > > > "com.datastax.oss.driver.api.core.session.Session" and there is no more
> > > > > > "Cluster" class. Might be fairly trivial to fix though, if that's the
> > > > > > path of least resistance.
> > > > > >
> > > > > > On Fri, Mar 15, 2024 at 1:40 PM Joe Witt <joe.w...@gmail.com> wrote:
> > > > > >
> > > > > >> Matt
> > > > > >>
> > > > > >> I don't know a ton about Cassandra but when I looked at client/driver
> > > > > >> notes for 4+ it said it was compatible all the way back to 3.x. Not
> > > > > >> sure what that means but it surely seems worth exploring. Also I don't
> > > > > >> know if the 4.x drivers get rid of the vulnerable bits.
> > > > > >>
> > > > > >> Thanks
> > > > > >>
> > > > > >> On Fri, Mar 15, 2024 at 10:39 AM Matt Burgess <mattyb...@apache.org> wrote:
> > > > > >>
> > > > > >> > At the very least we should upgrade to Cassandra 3.11.16:
> > > > > >> >
> > > > > >> > https://github.com/apache/cassandra/blob/cassandra-3.11.16/CHANGES.txt
> > > > > >> >
> > > > > >> > On Fri, Mar 15, 2024 at 1:31 PM Matt Burgess <mattyb...@apache.org> wrote:
> > > > > >> >
> > > > > >> > > If the community agrees to get rid of Cassandra 3 that'll save me
> > > > > >> > > effort on the refactor after I add Cassandra 4 :) Otherwise those
> > > > > >> > > vulnerabilities would only be in a "new" Cassandra 3 services NAR
> > > > > >> > > that would not be included in the convenience binary.
> > > > > >> > >
> > > > > >> > > On Fri, Mar 15, 2024 at 1:28 PM Joe Witt <joe.w...@gmail.com> wrote:
> > > > > >> > >
> > > > > >> > >> Mike, Matt,
> > > > > >> > >>
> > > > > >> > >> Happy to hear you both have active efforts or are interested in
> > > > > >> > >> doing so. Can you help me understand more specifically what that
> > > > > >> > >> means for the current set of components?
> > > > > >> > >>
> > > > > >> > >> The CVE hits are concerning and long standing. Supporting
> > > > > >> > >> Cassandra 3 implies the current set of dependencies would remain
> > > > > >> > >> too right?
> > > > > >> > >>
> > > > > >> > >> Is the current set of components we have ones we want to retain?
> > > > > >> > >> We certainly need Cassandra components - but are the ones we have
> > > > > >> > >> now the right ones?
> > > > > >> > >>
> > > > > >> > >> Thanks
> > > > > >> > >> Joe
> > > > > >> > >>
> > > > > >> > >> On Fri, Mar 15, 2024 at 10:25 AM Matt Burgess <mattyb...@apache.org> wrote:
> > > > > >> > >>
> > > > > >> > >> > I'm actively working this, I pushed my branch up in case anyone
> > > > > >> > >> > wants to take a look [1]. The idea is to abstract the Cassandra
> > > > > >> > >> > API "up a couple levels" and provide implementations for
> > > > > >> > >> > Cassandra 3, 4, and eventually 5. For JDBC-like interfaces this
> > > > > >> > >> > is a PITA because of the API (Statement, PreparedStatement,
> > > > > >> > >> > BoundStatement, ResultSet, etc.) but I'm hoping we can find a
> > > > > >> > >> > common pattern for abstracting the third-party library
> > > > > >> > >> > implementation and API from the NiFi component (Processor,
> > > > > >> > >> > ControllerService, etc.) API. I think we're doing something
> > > > > >> > >> > similar for Kafka?
> > > > > >> > >> >
> > > > > >> > >> > Regards,
> > > > > >> > >> > Matt
> > > > > >> > >> >
> > > > > >> > >> > [1] https://github.com/mattyb149/nifi/tree/cassy4
> > > > > >> > >> >
> > > > > >> > >> >
> > > > > >> > >> > On Fri, Mar 15, 2024 at 8:43 AM Mike Thomsen <mikerthom...@gmail.com> wrote:
> > > > > >> > >> >
> > > > > >> > >> > > That’s been on my todo list for a little while but things kept
> > > > > >> > >> > > coming up. I think I could get started on that now. Based on
> > > > > >> > >> > > my initial research it appears that scylla uses the exact same
> > > > > >> > >> > > api as datastax so supporting both in a cql bundle should
> > > > > >> > >> > > theoretically be fairly easy.
> > > > > >> > >> > >
> > > > > >> > >> > >
> > > > > >> > >> > > Sent from my iPhone
> > > > > >> > >> > >
> > > > > >> > >> > > > On Mar 14, 2024, at 6:18 PM, Joe Witt <joew...@apache.org> wrote:
> > > > > >> > >> > > >
> > > > > >> > >> > > > Team,
> > > > > >> > >> > > >
> > > > > >> > >> > > > Cassandra remains a really important system to be able to
> > > > > >> > >> > > > send data to. However, it seems like we've not maintained
> > > > > >> > >> > > > these well. We have what appears to be at least a full
> > > > > >> > >> > > > generation behind on client versions (we are on 3x vs 4x
> > > > > >> > >> > > > which is the latest stable with 5x apparently coming
> > > > > >> > >> > > > shortly).
> > > > > >> > >> > > >
> > > > > >> > >> > > > We have components to send data, query data, and use
> > > > > >> > >> > > > Cassandra as a cache store. We have older mechanisms for
> > > > > >> > >> > > > json/avro and publish mechanisms for records.
> > > > > >> > >> > > >
> > > > > >> > >> > > > The libraries we do have depend on outdated versions of
> > > > > >> > >> > > > Guava and result in many CVE hits.
> > > > > >> > >> > > >
> > > > > >> > >> > > > I am inclined to think we should deprecate the 1.x
> > > > > >> > >> > > > components and remove them as-is from the 2.x line. Then
> > > > > >> > >> > > > re-introduce them with record only interfaces and built
> > > > > >> > >> > > > against the latest stable Cassandra/Datastax/ScyllaDB
> > > > > >> > >> > > > interfaces.
> > > > > >> > >> > > >
> > > > > >> > >> > > > I'd love to hear thoughts from those closer to this space
> > > > > >> > >> > > > both as a user and developer so we can make good next steps.
> > > > > >> > >> > > >
> > > > > >> > >> > > > Thanks
> > > > > >> > >> > >
> > > > > >> > >> >
> > > > > >> > >>
> > > > > >> > >
> > > > > >> >
> > > > > >>
> > > > > >
> > > > >
> > > >
> >
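
For readers following the thread, the idea of a service interface that does
not leak driver classes into NiFi components could look roughly like the
sketch below. It is only an illustration and is not taken from either branch
linked above. Every name in it (CqlSessionFacade, InMemoryCqlSession,
RowCounter) is hypothetical, and the in-memory delegate stands in for a real
delegate that would wrap a driver session.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical driver-neutral contract: no Cassandra driver types appear here. */
interface CqlSessionFacade {
    /** Runs a CQL statement and returns rows as plain column-name/value maps. */
    List<Map<String, Object>> query(String cql);

    /** Label for the driver generation behind this delegate, e.g. "4.x". */
    String driverGeneration();
}

/** Stand-in delegate (assumption): a real one would wrap a driver session object. */
final class InMemoryCqlSession implements CqlSessionFacade {
    private final List<Map<String, Object>> rows = new ArrayList<>();
    private final String generation;

    InMemoryCqlSession(String generation) {
        this.generation = generation;
    }

    void addRow(Map<String, Object> row) {
        rows.add(row);
    }

    @Override
    public List<Map<String, Object>> query(String cql) {
        // The statement text is ignored here; this only illustrates the shape.
        return new ArrayList<>(rows);
    }

    @Override
    public String driverGeneration() {
        return generation;
    }
}

/** Component-side code depends on the facade only, never on a driver class. */
final class RowCounter {
    private final CqlSessionFacade session;

    RowCounter(CqlSessionFacade session) {
        this.session = session;
    }

    int count(String cql) {
        return session.query(cql).size();
    }
}
```

With this shape, swapping driver generations would mean supplying a different
CqlSessionFacade implementation, while code such as RowCounter stays
unchanged.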
