Thanks. At a minimum, I'll probably start writing something soon for trigger-based write mirroring. We will probably support Kafka and another Cassandra cluster as mirror targets, so if those seem to work I will contribute them.
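A minimal sketch of the shape such a mirroring layer might take, assuming the trigger hands each captured write to a set of pluggable targets. Every name in it (WriteMirror, MirrorSink, CapturedWrite, InMemorySink) is hypothetical and none of it is a Cassandra or Kafka API. In a real deployment the capture would presumably happen inside an ITrigger implementation, and each sink would wrap a Kafka producer or a driver session for the second cluster.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: none of these types are Cassandra or Kafka APIs.
final class WriteMirror {

    /** Simplified stand-in for one captured write (not Cassandra's Mutation). */
    static final class CapturedWrite {
        final String keyspace;
        final String table;
        final String partitionKey;
        final Map<String, String> columns;
        final long timestampMicros; // original write timestamp, carried along unchanged

        public CapturedWrite(String keyspace, String table, String partitionKey,
                             Map<String, String> columns, long timestampMicros) {
            this.keyspace = keyspace;
            this.table = table;
            this.partitionKey = partitionKey;
            this.columns = columns;
            this.timestampMicros = timestampMicros;
        }
    }

    /** One mirror target, e.g. a Kafka topic or a second Cassandra cluster. */
    interface MirrorSink {
        void send(CapturedWrite write) throws Exception;
    }

    /** In-memory sink standing in for a real Kafka or Cassandra client. */
    static final class InMemorySink implements MirrorSink {
        private final List<CapturedWrite> received = new ArrayList<>();

        @Override
        public void send(CapturedWrite write) {
            received.add(write);
        }

        public List<CapturedWrite> received() {
            return received;
        }
    }

    private final List<MirrorSink> sinks;
    // One entry per failed delivery, kept so the write can be replayed later.
    private final List<CapturedWrite> undelivered = new ArrayList<>();

    public WriteMirror(List<? extends MirrorSink> sinks) {
        this.sinks = new ArrayList<>(sinks);
    }

    /** Forwards one write to every sink; a failing sink does not block the others. */
    public void mirror(CapturedWrite write) {
        for (MirrorSink sink : sinks) {
            try {
                sink.send(write);
            } catch (Exception e) {
                undelivered.add(write);
            }
        }
    }

    public List<CapturedWrite> undelivered() {
        return undelivered;
    }

    public static void main(String[] args) {
        InMemorySink kafkaLike = new InMemorySink();
        InMemorySink clusterLike = new InMemorySink();
        WriteMirror mirror = new WriteMirror(Arrays.asList(kafkaLike, clusterLike));

        // Keyspace, table, and values are arbitrary example data.
        mirror.mirror(new CapturedWrite("app_ks", "users", "user-1",
                Collections.singletonMap("email", "a@example.com"), 1539878820000000L));

        System.out.println("kafka-like sink received:   " + kafkaLike.received().size());
        System.out.println("cluster-like sink received: " + clusterLike.received().size());
        System.out.println("undelivered:                " + mirror.undelivered().size());
    }
}
```

The undelivered list is the part that would need real design work: deciding where failed writes are stored and when they are replayed is the same kind of problem hinted handoff addresses. Carrying the original write timestamp along is meant to let a replayed write resolve under Cassandra's last-write-wins rule instead of overwriting newer data.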
On Thu, Oct 18, 2018 at 11:27 AM Jeff Jirsa <jji...@gmail.com> wrote:

> The write sampling is adding an extra instance with the same schema to
> test things like yaml params or compaction without impacting reads or
> correctness - it’s different than what you describe
>
> --
> Jeff Jirsa
>
>
> > On Oct 18, 2018, at 5:57 PM, Carl Mueller
> > <carl.muel...@smartthings.com.INVALID> wrote:
> >
> > I guess there is also write-survey-mode from cass 1.1:
> >
> > https://issues.apache.org/jira/browse/CASSANDRA-3452
> >
> > Were triggers intended to supersede this capability? I can't find a
> > lot of "user level" info on it.
> >
> >
> > On Thu, Oct 18, 2018 at 10:53 AM Carl Mueller
> > <carl.muel...@smartthings.com> wrote:
> >
> >> tl;dr: a generic trigger on TABLES that will mirror all writes to
> >> facilitate data migrations between clusters or systems. What is
> >> necessary to ensure full write mirroring/coherency?
> >>
> >> When cassandra clusters have several "apps" aka keyspaces serving
> >> applications colocated on them, but the app/keyspace bandwidth and
> >> size demands begin impacting other keyspaces/apps, then one strategy
> >> is to migrate the keyspace to its own dedicated cluster.
> >>
> >> With backups/sstableloading, this will entail a delay and therefore a
> >> "coherency" shortfall between the clusters. So typically one would
> >> employ a "double write, read once":
> >>
> >> - all updates are mirrored to both clusters
> >> - reads come from the currently most coherent cluster.
> >>
> >> Often two sstable loads are done:
> >>
> >> 1) first load
> >> 2) turn on double writes/write mirroring
> >> 3) a second load is done to finalize coherency
> >> 4) switch the app to point to the new cluster now that it is coherent
> >>
> >> The double writes and read is the sticking point. We could do it at
> >> the app layer, but if the app wasn't written with that, it is a lot
> >> of testing and customization specific to the framework.
> >>
> >> We could theoretically do some sort of proxying of the java-driver
> >> somehow, but all the async structures and complex interfaces/apis
> >> would be difficult to proxy. Maybe there is a lower level in the
> >> java-driver that is possible. This also would only apply to the
> >> java-driver, and not python/go/javascript/other drivers.
> >>
> >> Finally, I suppose we could do a trigger on the tables. It would be
> >> really nice if we could add to the cassandra toolbox the basics of a
> >> write mirroring trigger that could be activated "fairly easily"...
> >> now I know there are the complexities of inter-cluster access, and if
> >> we are even using cassandra as the target mirror system (for example
> >> there is an article on triggers write-mirroring to kafka:
> >> https://dzone.com/articles/cassandra-to-kafka-data-pipeline-part-1).
> >>
> >> And this starts to get into the complexities of hinted handoff as
> >> well. But fundamentally this seems something that would be a very
> >> nice feature (especially when you NEED it) to have in the core of
> >> cassandra.
> >>
> >> Finally, is the mutation hook in triggers sufficient to track all
> >> incoming mutations (outside of "shudder" other triggers generating
> >> data)?