Agreed with Dinesh and Josh. I would *never* put the audit log back in Cassandra.
This is extendable, Sagar, so you're free to do as you want, but I'm very opposed to putting a ticking time bomb in Cassandra proper.

Jon

On Thu, Feb 28, 2019 at 8:38 AM Dinesh Joshi <djos...@icloud.com.invalid> wrote:

> I strongly echo Josh’s sentiment. Imagine losing audit entries because C*
> is overloaded? It’s fine if you don’t care about losing audit entries.
>
> Dinesh
>
> > On Feb 28, 2019, at 6:41 AM, Joshua McKenzie <jmcken...@apache.org> wrote:
> >
> > One of the things we've run into historically, on a *lot* of axes, is that
> > "just put it in C*" for various functionality looks great from a user and
> > usability perspective, and proves to be something of a nightmare from an
> > admin / cluster behavior perspective.
> >
> > i.e. - cluster suffering so you're writing hints? Write them to C* tables
> > and watch the cluster suffer more! :)
> > Same thing probably holds true for audit logging - at a time frame when
> > things are getting hairy w/a cluster, if you're writing that audit logging
> > into C* proper (and dealing with ser/deser, compaction pressure, flushing
> > pressure, etc) from that, there's a compounding effect of pressure and pain
> > on the cluster.
> >
> > So the TL;DR we as a project kind of philosophically have been moving
> > towards (I think that's valid to say?) is: use C* for the things it's
> > absolutely great at, and try to side-channel other recovery operations as
> > much as you can (see: file-based hints) to stay out of its way.
> >
> > Same thing held true w/design of CDC - I debated "materialize in memory for
> > consumer to take over socket", and "keep the data in another C* table", but
> > the ramifications to perf and core I/O operations in C* the moment things
> > start to go badly were significant enough that the route we went was "do no
> > harm". For better or for worse, as there's obvious tradeoffs there.
> >
> >> On Thu, Feb 28, 2019 at 7:46 AM Sagar <sagarmeansoc...@gmail.com> wrote:
> >>
> >> Thanks all for the pointers.
> >>
> >> @Joseph,
> >>
> >> I have gone through the links shared by you. Also, I have been looking at
> >> the code base.
> >>
> >> I understand the fact that pushing the logs to ES or Solr is a lot easier
> >> to do. Having said that, the only reason I thought having something like
> >> this might help is, if I don't want to add more pieces and still provide a
> >> central piece of audit logging within Cassandra itself and still be
> >> queryable.
> >>
> >> In terms of usages, one of them could definitely be CDC related use cases.
> >> With data being stored in tables and being queryable, it can become a lot
> >> easier to expose this data to external systems like Kafka Connect,
> >> Debezium which have the ability to push data to Kafka for example. Note
> >> that pushing data to Kafka is just an example, but what I mean is, if we
> >> can have data in tables, then instead of everyone writing custom
> >> loggers, they can hook into this table info and take action.
> >>
> >> Regarding the infinite loop question, I have done some analysis, and in my
> >> opinion, instead of tweaking the behaviour of Binlog and the way it
> >> functions currently, we can actually spin up another tailer thread to the
> >> same Chronicle Queue which can do the needful. This way the config options
> >> etc all remain the same (apart from the logger, of course).
> >>
> >> Let me know if any of it makes sense :D
> >>
> >> Thanks!
> >> Sagar.
> >>
> >>
> >> On Thu, Feb 28, 2019 at 1:09 AM Dinesh Joshi <djos...@icloud.com.invalid>
> >> wrote:
> >>
> >>>> On Feb 27, 2019, at 10:41 AM, Joseph Lynch <joe.e.ly...@gmail.com> wrote:
> >>>>
> >>>> Vinay can confirm, but as far as I am aware we have no current plans to
> >>>> implement audit logging to a table directly, but the implementation is
> >>>> fully pluggable (like compaction, compression, etc ...). Check out the blog
> >>>> post [1] and documentation [2] Vinay wrote for more details, but the short
> >>>
> >>> +1. I am still curious as to why you'd want to store audit log entries
> >>> back in Cassandra? Depending on the scale it can generate a lot of load and
> >>> I think you'd end up in an infinite loop because as you're inserting the
> >>> audit log entry you'll generate a new one and so on unless you black list
> >>> audits to that table / keyspace.
> >>>
> >>> Ideally you'd insert this data into ElasticSearch / Solr or some other
> >>> place that can be then used for analytics or search.
> >>>
> >>> Dinesh
> >>> ---------------------------------------------------------------------
> >>> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> >>> For additional commands, e-mail: dev-h...@cassandra.apache.org

-- 
Jon Haddad
http://www.rustyrazorblade.com
twitter: rustyrazorblade