Agreed with Dinesh and Josh.  I would *never* put the audit log back in
Cassandra.

This is extendable, Sagar, so you're free to do as you want, but I'm very
opposed to putting a ticking time bomb in Cassandra proper.

Jon


On Thu, Feb 28, 2019 at 8:38 AM Dinesh Joshi <djos...@icloud.com.invalid>
wrote:

> I strongly echo Josh’s sentiment. Imagine losing audit entries because C*
> is overloaded? It’s fine if you don’t care about losing audit entries.
>
> Dinesh
>
> > On Feb 28, 2019, at 6:41 AM, Joshua McKenzie <jmcken...@apache.org>
> wrote:
> >
> > One of the things we've run into historically, on a *lot* of axes, is
> that
> > "just put it in C*" for various functionality looks great from a user and
> > usability perspective, and proves to be something of a nightmare from an
> > admin / cluster behavior perspective.
> >
> > i.e. - cluster suffering so you're writing hints? Write them to C* tables
> > and watch the cluster suffer more! :)
> > Same thing probably holds true for audit logging - at a time frame when
> > things are getting hairy w/a cluster, if you're writing that audit
> logging
> > into C* proper (and dealing with ser/deser, compaction pressure, flushing
> > pressure, etc) from that, there's a compounding effect of pressure and
> pain
> > on the cluster.
> >
> > So the TL;DR we as a project kind of philosophically have been moving
> > towards (I think that's valid to say?) is: use C* for the things it's
> > absolutely great at, and try to side-channel other recovery operations as
> > much as you can (see: file-based hints) to stay out of its way.
> >
> > Same thing held true w/design of CDC - I debated "materialize in memory
> for
> > consumer to take over socket", and "keep the data in another C* table",
> but
> > the ramifications to perf and core I/O operations in C* the moment things
> > start to go badly were significant enough that the route we went was "do
> no
> > harm". For better or for worse, as there's obvious tradeoffs there.
> >
> >> On Thu, Feb 28, 2019 at 7:46 AM Sagar <sagarmeansoc...@gmail.com>
> wrote:
> >>
> >> Thanks all for the pointers.
> >>
> >> @Joseph,
> >>
> >> I have gone through the links shared by you. Also, I have been looking
> at
> >> the code base.
> >>
> >> I understand the fact that pushing the logs to ES or Solr is a lot
> easier
> >> to do. Having said that, the only reason I thought having something like
> >> this might help is, if I don't want to add more pieces and still
> provide a
> >> central piece of audit logging within Cassandra itself and still be
> >> queryable.
> >>
> >> In terms of usages, one of them could definitely be CDC related use
> cases.
> >> With data being stored in tables and being queryable, it can become a
> lot
> >> more easier to expose this data to external systems like Kafka Connect,
> >> Debezium which have the ability to push data to Kafka for example. Note
> >> that pushing data to Kafka is just an example, but what I mean is, if we
> >> can have data in tables, then instead of everyone writing custom custom
> >> loggers, they can hook into this table info and take action.
> >>
> >> Regarding the infinite loop question, I have done some analysis, and in
> my
> >> opinion, instead of tweaking the behaviour of Binlog and the way it
> >> functions currently, we can actually spin up another tailer thread to
> the
> >> same Chronicle Queue which can do the needful. This way the config
> options
> >> etc all remain the same(apart from the logger ofcourse).
> >>
> >> Let me know if any of it makes sense :D
> >>
> >> Thanks!
> >> Sagar.
> >>
> >>
> >> On Thu, Feb 28, 2019 at 1:09 AM Dinesh Joshi <djos...@icloud.com.invalid
> >
> >> wrote:
> >>
> >>>
> >>>
> >>>> On Feb 27, 2019, at 10:41 AM, Joseph Lynch <joe.e.ly...@gmail.com>
> >>> wrote:
> >>>>
> >>>> Vinay can confirm, but as far as I am aware we have no current plans
> to
> >>>> implement audit logging to a table directly, but the implementation is
> >>>> fully pluggable (like compaction, compression, etc ...). Check out the
> >>> blog
> >>>> post [1] and documentation [2] Vinay wrote for more details, but the
> >>> short
> >>>
> >>> +1. I am still curious as to why you'd want to store audit log entries
> >>> back in Cassandra? Depending on the scale it can generate a lot of load
> >> and
> >>> I think you'd end up in an infinite loop because as you're inserting
> the
> >>> audit log entry you'll generate a new one and so on unless you black
> list
> >>> audits to that table / keyspace.
> >>>
> >>> Ideally you'd insert this data into ElasticSearch / Solr or some other
> >>> place that can be then used for analytics or search.
> >>>
> >>> Dinesh
> >>> ---------------------------------------------------------------------
> >>> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> >>> For additional commands, e-mail: dev-h...@cassandra.apache.org
> >>>
> >>>
> >>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: dev-h...@cassandra.apache.org
>
>

-- 
Jon Haddad
http://www.rustyrazorblade.com
twitter: rustyrazorblade

Reply via email to