Hi Guys,
Assume you have, for example, an “account” table, and an “account_history”
table which simply tracks older versions of what a person’s account looked
like whenever an administrator edits a customer account.
Given that we don’t have the luxury of a safe transaction to update the account
record, i.e. to do the following (there’s a rough code sketch after the list):
- select account details
- compare old account details with new account details
- if there are changes to the account:
- copy old account details to account_history table
- update account
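For illustration, here is roughly what that flow looks like when written out
non-atomically with the DataStax Python driver. The keyspace, table and column
names (crm, account, account_history, id, email, name) are just assumptions for
the sketch, and the comment marks where the race lives:

# A non-atomic sketch of the steps above; all names are hypothetical.
from cassandra.cluster import Cluster

cluster = Cluster(['127.0.0.1'])
session = cluster.connect('crm')

def save_account(account_id, new_email, new_name):
    # 1. select account details
    old = session.execute(
        "SELECT email, name FROM account WHERE id = %s", [account_id]).one()
    # 2. compare old account details with new account details
    if old and (old.email, old.name) == (new_email, new_name):
        return  # nothing changed, no history row needed
    # RACE: another writer (possibly in another data centre) can touch the
    # account between the SELECT above and the two writes below.
    # 3. copy old account details to the account_history table
    if old:
        session.execute(
            "INSERT INTO account_history (id, changed_at, email, name) "
            "VALUES (%s, now(), %s, %s)",
            [account_id, old.email, old.name])
    # 4. update the account
    session.execute(
        "UPDATE account SET email = %s, name = %s WHERE id = %s",
        [new_email, new_name, account_id])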
How do people deal with this in a multi data centre environment? The closest
thing I can think of is something like this on “save” (sketched in code after
the list):
- insert the new record into the account_history table
- update the record in the account table
- every hour, look for duplicate rows in the account_history table and remove
the duplicates left where someone did a save that did not change any fields on
the account table.
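To make that concrete, here is a sketch of the hourly clean-up pass, assuming
an account_history table keyed as PRIMARY KEY (id, changed_at) with changed_at
a timeuuid clustered newest-first; again, all names and the schema itself are
assumptions for illustration:

# Hourly de-dup pass, assuming a schema like:
#   CREATE TABLE account_history (
#     id uuid, changed_at timeuuid, email text, name text,
#     PRIMARY KEY (id, changed_at)
#   ) WITH CLUSTERING ORDER BY (changed_at DESC);
from cassandra.cluster import Cluster

cluster = Cluster(['127.0.0.1'])
session = cluster.connect('crm')

def dedup_history(account_id):
    rows = list(session.execute(
        "SELECT changed_at, email, name FROM account_history WHERE id = %s",
        [account_id]))
    # Rows arrive newest-first under the clustering order above; drop any
    # entry whose fields are identical to the next-older entry, i.e. one
    # produced by a save that changed nothing.
    for newer, older in zip(rows, rows[1:]):
        if (newer.email, newer.name) == (older.email, older.name):
            session.execute(
                "DELETE FROM account_history "
                "WHERE id = %s AND changed_at = %s",
                [account_id, newer.changed_at])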
My biggest problem with the above is: what happens if you want to bulk load a
data file into your account table, and it contains, for example, 1 million
records but only actually changes 100 account entries? For bulk loading you
could probably resort to doing a “select before update” just to prevent 1
million pointless writes into the account_history table, but that feels a bit
yucky; a sketch of that follows below. Some sort of Java stored procedure might
help here, but surely this is a common enough use case that we shouldn’t have
to write custom Java code for Cassandra, right?
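For what it’s worth, the “select before update” version of the bulk load would
look something like this. The CSV input format, the prepared statements, and
the table/column names carry over the same assumptions as the sketches above:

# Sketch of a "select before update" bulk load: read the current row first
# and only write (and record history) when something actually changed.
import csv
import uuid
from cassandra.cluster import Cluster

cluster = Cluster(['127.0.0.1'])
session = cluster.connect('crm')

select_stmt = session.prepare(
    "SELECT email, name FROM account WHERE id = ?")
update_stmt = session.prepare(
    "UPDATE account SET email = ?, name = ? WHERE id = ?")
history_stmt = session.prepare(
    "INSERT INTO account_history (id, changed_at, email, name) "
    "VALUES (?, now(), ?, ?)")

def bulk_load(path):
    changed = 0
    with open(path) as f:
        for rec in csv.DictReader(f):
            account_id = uuid.UUID(rec['id'])
            current = session.execute(select_stmt, [account_id]).one()
            if current and (current.email, current.name) == (rec['email'],
                                                             rec['name']):
                continue  # unchanged: skip the write and the history row
            if current:
                session.execute(history_stmt,
                                [account_id, current.email, current.name])
            session.execute(update_stmt,
                            [rec['email'], rec['name'], account_id])
            changed += 1
    return changed

That avoids the 1 million pointless history rows, at the cost of a read per
record, which is exactly the part that feels yucky.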
Thanks!
Jacob