>
> My goal is to reconstruct the CQL operation from the Mutation object.
> So that I can trigger the same action on another NoSQL target like MongoDB.
>

There are different ways of keeping your 2 databases in sync. Unfortunately,
they all have some trade-offs (as always ;-))


   1. If you have control over the client side, you could wrap the driver
   and add some code that converts each query and writes it to the other
   database at the same time. The main problem with that approach is that a
   write can succeed on one database but not on the other, which means that
   you will need a mechanism to resolve those inconsistencies.
   2. On the Cassandra side you could, as Nate suggested, extend the
   QueryProcessor in order to log the mutations to a log file. As the
   QueryProcessor has access to the prepared statement cache and to the bind
   parameters, you should be able to extract the information you need. Some of
   the problems of that approach are:
      1. You cannot reprocess already inserted data
      2. You will probably have to use a replication log to deal with the
      cases where the other database is unreachable
      3. It might slow down your query processing and take some of your
      bandwidth at critical times (heavy writes)
   3. Use a fake index as Jacques-Henri suggested. It will allow you to
   easily reprocess already inserted data, so you will not need a
   replication log (although having to rebuild the index might slow
   down your database). The main issues with that solution are:
      1. All the tables that you want to replicate will have to have that
      index, and you cannot automatically update the schemas on your
      other database
      2. It might slow down your query processing and take some of your
      bandwidth at critical times (heavy writes)
   4. Read the commitlogs to recreate the mutation statements (your initial
   approach). The main problem is that it is simply not easy to do and might
   break with new major releases. You will also have to make sure that the
   files do not disappear before you have processed them.
   5. Try a data warehouse/ETL approach to synchronize your data.
   CASSANDRA-8844 added support for CDC (Change Data Capture), which might help
   you there. Unfortunately, I have not really worked on it so I cannot help
   you much there.
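
To make option 1 a bit more concrete, here is a rough sketch of a client-side
wrapper that mirrors each write and keeps the ones the second database has
not accepted yet. Everything in it is hypothetical: DualWriter, the dict used
to represent a write, and the two callables standing in for the Cassandra and
MongoDB clients are placeholders, not driver APIs. The pending queue is only
held in memory here; a real replication log would have to be durable.

```python
from collections import deque
from typing import Callable, Deque, Dict

# Hypothetical types: a write is a plain dict, and each database is
# represented by a callable that applies a write and raises on failure.
Write = Dict[str, object]
Sink = Callable[[Write], None]


class DualWriter:
    """Apply each write to a primary store, then mirror it to a secondary."""

    def __init__(self, primary: Sink, secondary: Sink) -> None:
        self.primary = primary
        self.secondary = secondary
        # Writes accepted by the primary but not yet by the secondary.
        self.pending: Deque[Write] = deque()

    def write(self, op: Write) -> None:
        # A failure on the primary propagates to the caller and
        # nothing is queued for the secondary.
        self.primary(op)
        self.pending.append(op)
        self.flush()

    def flush(self) -> int:
        """Replay pending writes in order and return how many were applied."""
        applied = 0
        while self.pending:
            try:
                self.secondary(self.pending[0])
            except Exception:
                break  # secondary unreachable: keep the write for later
            self.pending.popleft()
            applied += 1
        return applied
```

Replaying in order and stopping at the first failure keeps the second
database from seeing writes out of sequence, but a write can be applied twice
if the failure happens after the other side has already stored it, so the
mirrored operations should be idempotent.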

There might be some other approaches that are worth considering but they did
not come to my mind.

Hope it helps

Benjamin

PS: MongoDB ... Seriously ??? ;-)
