hi,

i've been looking at the model below from Ed Anuff's presentation at Cassandra CF (http://www.slideshare.net/edanuff/indexing-in-cassandra). Couple of questions:

1) Isn't there still the chance that two concurrent updates may end up with the index containing two entries for the given user, only one of which would match the actual value in the Users cf?

2) What happens if your batch fails partway through the update? If i understand correctly there are no guarantees about ordering when a batch is executed, so isn't it possible that e.g. the previous value entries in Users_Index_Entries have been deleted, and then the batch fails before the corresponding entries in Indexes are deleted, i.e. the mechanism has 'lost' track of those stale index values? I assume this can be addressed by not deleting the old entries until the batch with the new values has succeeded (i.e. put the deletion of the previous entries into a separate, subsequent batch). This at least lets you retry at a later time.
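To make the workaround in 2) concrete, here is a minimal sketch of the ordering i have in mind: write the new entries first, and only then delete the old ones in a second step. It is an in-memory simulation, not Cassandra client code; the dicts and the function names (write_new_value, delete_previous_entries, update_location, fail_before_cleanup) are all made up for illustration.

```python
# Hypothetical in-memory stand-ins for the three column families.
users = {}                            # user_key -> {"location": value}
users_index_entries = {}              # user_key -> {("location", ts): value}
indexes = {"Users_By_Location": {}}   # index row -> {(value, user_key, ts): None}

def read_previous_entries(user_key):
    """The SELECT step: all recorded index entries for the location column."""
    row = users_index_entries.get(user_key, {})
    return {col: val for col, val in row.items() if col[0] == "location"}

def write_new_value(user_key, value, ts):
    """First batch: only adds data, never deletes."""
    users_index_entries.setdefault(user_key, {})[("location", ts)] = value
    indexes["Users_By_Location"][(value, user_key, ts)] = None
    users.setdefault(user_key, {})["location"] = value

def delete_previous_entries(user_key, previous):
    """Second batch: drop the index column before the record that points at it."""
    for (name, ts), value in previous.items():
        indexes["Users_By_Location"].pop((value, user_key, ts), None)
        users_index_entries[user_key].pop((name, ts), None)

def update_location(user_key, value, ts, fail_before_cleanup=False):
    previous = read_previous_entries(user_key)
    write_new_value(user_key, value, ts)
    if fail_before_cleanup:
        return False  # simulated failure between the two batches
    delete_previous_entries(user_key, previous)
    return True

update_location("u1", "SF", 1)
update_location("u1", "NYC", 2, fail_before_cleanup=True)
# The stale SF entry is still in the index, but it is also still recorded
# in users_index_entries, so a later update can find and remove it.
after_failure = sorted(indexes["Users_By_Location"])

update_location("u1", "LA", 3)
after_retry = sorted(indexes["Users_By_Location"])
```

In this simulation the failed update leaves a stale index column behind, but nothing is lost: the next successful update reads both old entries back and removes them. Readers of the index would still need to tolerate stale columns in the meantime.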

perhaps i'm missing something?

SELECT {"location"}..{"location", *}
FROM Users_Index_Entries WHERE KEY = <user_key>;

BEGIN BATCH

DELETE {"location", ts1}, {"location", ts2}, ...
FROM Users_Index_Entries WHERE KEY = <user_key>;

DELETE {<value1>, <user_key>, ts1}, {<value2>, <user_key>, ts2}, ...
FROM Indexes WHERE KEY = "Users_By_Location";

UPDATE Users_Index_Entries SET {"location", ts3} = <value3>
WHERE KEY=<user_key>;

UPDATE Indexes SET {<value3>, <user_key>, ts3} = null
WHERE KEY = "Users_By_Location";

UPDATE Users SET location = <value3>
WHERE KEY = <user_key>;

APPLY BATCH
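
For question 1, this is the interleaving i am worried about, again as an in-memory sketch rather than real Cassandra code. The dict layout, the read_previous and apply_batch helpers, and the explicit timestamp comparison (standing in for last-write-wins on a column) are my own assumptions about how the model above behaves.

```python
# Hypothetical in-memory model of the three column families.
users = {"u1": {"location": ("SF", 1)}}      # column -> (value, write timestamp)
entries = {"u1": {("location", 1): "SF"}}    # Users_Index_Entries
index = {("SF", "u1", 1): None}              # Indexes["Users_By_Location"]

def read_previous(user_key):
    """The SELECT step: a snapshot of the user's recorded index entries."""
    return dict(entries[user_key])

def apply_batch(user_key, previous, value, ts):
    """The BATCH step, applied against a possibly out-of-date snapshot."""
    for (name, old_ts), old_value in previous.items():
        entries[user_key].pop((name, old_ts), None)
        index.pop((old_value, user_key, old_ts), None)
    entries[user_key][("location", ts)] = value
    index[(value, user_key, ts)] = None
    # Assumed last-write-wins on timestamp for the Users column.
    current = users[user_key].get("location")
    if current is None or ts > current[1]:
        users[user_key]["location"] = (value, ts)

# Two writers both run the SELECT before either applies its batch.
seen_by_a = read_previous("u1")
seen_by_b = read_previous("u1")
apply_batch("u1", seen_by_a, "NYC", 2)
apply_batch("u1", seen_by_b, "LA", 3)

# The index now holds two columns for u1; only one matches Users.
current_value = users["u1"]["location"][0]
stale = [col for col in index if col[0] != current_value]

# A later, non-overlapping update sees both entries and removes them.
apply_batch("u1", read_previous("u1"), "Austin", 4)
final_index = sorted(index)
```

If this sketch matches the model, the index does temporarily contain a stale ("NYC", "u1", 2) column next to the current one, but because both are recorded in Users_Index_Entries the next update cleans them up. Until then a reader would have to check index hits against the Users row.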
