Hi Guys,

Assuming you have, for example, an "account" table, and an "account_history"
table which simply tracks older versions of what a person's account looked
like each time an administrator edits a customer account.
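
For concreteness, here is roughly the shape I have in mind (DataStax Java
driver; the keyspace, table and column names are just examples):

    import com.datastax.oss.driver.api.core.CqlSession;

    public class CreateSchema {
        public static void main(String[] args) {
            // Assumes a reachable node and an existing "app" keyspace.
            try (CqlSession session = CqlSession.builder().withKeyspace("app").build()) {
                // Current state: one row per account.
                session.execute(
                    "CREATE TABLE IF NOT EXISTS account ("
                  + "  account_id uuid PRIMARY KEY, name text, email text)");
                // One row per edit, newest version first within each account.
                session.execute(
                    "CREATE TABLE IF NOT EXISTS account_history ("
                  + "  account_id uuid, edited_at timeuuid, name text, email text,"
                  + "  PRIMARY KEY (account_id, edited_at)"
                  + ") WITH CLUSTERING ORDER BY (edited_at DESC)");
            }
        }
    }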

Given that we don't have the luxury of a safe transaction to update the account
record, i.e. to do the following (sketched in code below):

 - select account details
 - compare old account details with new account details
 - if there are changes to the account:
    - copy old account details to account_history table
    - update account
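
In client code that flow is something like the sketch below (same made-up
table and column names as above); the problem is that nothing stops a
concurrent edit landing between the read and the two writes:

    import java.util.UUID;

    import com.datastax.oss.driver.api.core.CqlSession;
    import com.datastax.oss.driver.api.core.cql.Row;
    import com.datastax.oss.driver.api.core.uuid.Uuids;

    public class CompareAndSave {
        // NOT safe: a concurrent edit can land between the SELECT and the writes.
        static void save(CqlSession session, UUID id, String name, String email) {
            Row old = session.execute(session.prepare(
                "SELECT name, email FROM account WHERE account_id = ?")
                .bind(id)).one();
            boolean changed = old == null
                || !name.equals(old.getString("name"))
                || !email.equals(old.getString("email"));
            if (!changed) {
                return; // nothing changed, so no history row and no update
            }
            if (old != null) {
                // Copy the old version into account_history before overwriting it.
                session.execute(session.prepare(
                    "INSERT INTO account_history (account_id, edited_at, name, email)"
                  + " VALUES (?, ?, ?, ?)")
                    .bind(id, Uuids.timeBased(),
                          old.getString("name"), old.getString("email")));
            }
            session.execute(session.prepare(
                "UPDATE account SET name = ?, email = ? WHERE account_id = ?")
                .bind(name, email, id));
        }
    }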

How do people deal with this in a multi-data-centre environment? The closest
thing I can think of is something like this on "save" (sketched in code after
the list):

 - insert new record into account_history table
 - update the record in the account table
 - every hour, look for duplicate rows in the account_history table and remove
the duplicates, i.e. the rows where someone did a save that did not change any
fields on the account table.
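
In code, "save" then becomes two unconditional writes, and the hourly job
walks each account's history (which comes back newest first) and deletes any
row that is identical to the next, older one, i.e. a save that changed
nothing. Again just a sketch with my made-up names:

    import java.util.Objects;
    import java.util.UUID;

    import com.datastax.oss.driver.api.core.CqlSession;
    import com.datastax.oss.driver.api.core.cql.Row;
    import com.datastax.oss.driver.api.core.uuid.Uuids;

    public class BlindSave {
        // Save path: always write history and the new state, no read first.
        static void save(CqlSession session, UUID id, String name, String email) {
            session.execute(session.prepare(
                "INSERT INTO account_history (account_id, edited_at, name, email)"
              + " VALUES (?, ?, ?, ?)").bind(id, Uuids.timeBased(), name, email));
            session.execute(session.prepare(
                "UPDATE account SET name = ?, email = ? WHERE account_id = ?")
                .bind(name, email, id));
        }

        // Hourly job, per account: rows come back newest first, so delete the
        // newer of any adjacent pair with identical values (a no-op save).
        static void dedupHistory(CqlSession session, UUID id) {
            Row newer = null;
            for (Row older : session.execute(session.prepare(
                    "SELECT edited_at, name, email FROM account_history"
                  + " WHERE account_id = ?").bind(id))) {
                if (newer != null
                        && Objects.equals(newer.getString("name"), older.getString("name"))
                        && Objects.equals(newer.getString("email"), older.getString("email"))) {
                    session.execute(session.prepare(
                        "DELETE FROM account_history"
                      + " WHERE account_id = ? AND edited_at = ?")
                        .bind(id, newer.getUuid("edited_at")));
                }
                newer = older;
            }
        }
    }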

My biggest problem with the above is: what happens if you want to bulk load a
data file into your account table, and it contains, for example, 1 million
records but only actually changes 100 account entries? For bulk loading you
could probably resort to doing a "select before update" (sketched below) just
to prevent 1 million pointless writes to the account_history table, but that
feels a bit yucky. Some sort of Java stored procedure might help here, but
surely this is a common enough use case that we shouldn't have to write custom
Java code for Cassandra, right?
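
The "select before update" I mean is roughly this: read each account first and
skip both writes when nothing differs, so the 999,900 unchanged records never
touch account_history. AccountRecord is just a stand-in for a parsed line of
the data file:

    import java.util.List;
    import java.util.Objects;
    import java.util.UUID;

    import com.datastax.oss.driver.api.core.CqlSession;
    import com.datastax.oss.driver.api.core.cql.PreparedStatement;
    import com.datastax.oss.driver.api.core.cql.Row;
    import com.datastax.oss.driver.api.core.uuid.Uuids;

    public class BulkLoad {
        // Stand-in for one parsed line of the input file.
        record AccountRecord(UUID id, String name, String email) {}

        static void load(CqlSession session, List<AccountRecord> records) {
            PreparedStatement sel = session.prepare(
                "SELECT name, email FROM account WHERE account_id = ?");
            PreparedStatement hist = session.prepare(
                "INSERT INTO account_history (account_id, edited_at, name, email)"
              + " VALUES (?, ?, ?, ?)");
            PreparedStatement upd = session.prepare(
                "UPDATE account SET name = ?, email = ? WHERE account_id = ?");
            for (AccountRecord rec : records) {
                Row old = session.execute(sel.bind(rec.id())).one();
                boolean unchanged = old != null
                    && Objects.equals(old.getString("name"), rec.name())
                    && Objects.equals(old.getString("email"), rec.email());
                if (unchanged) {
                    continue; // skip the writes entirely for no-op records
                }
                session.execute(hist.bind(rec.id(), Uuids.timeBased(),
                                          rec.name(), rec.email()));
                session.execute(upd.bind(rec.name(), rec.email(), rec.id()));
            }
        }
    }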

Thanks!
Jacob
