Tyler Hobbs created CASSANDRA-10295:
---------------------------------------

             Summary: Support skipping MV read-before-write on a per-operation basis
                 Key: CASSANDRA-10295
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10295
             Project: Cassandra
          Issue Type: New Feature
          Components: API, Core
            Reporter: Tyler Hobbs
             Fix For: 3.x


This is similar in spirit to CASSANDRA-9779, but on a per-operation basis.  
There are many workloads that include a mixture of new insertions and 
overwrites.  In some cases, logic outside of Cassandra guarantees that an 
inserted row does not already exist.  For example, the primary key may include 
a UUID or another form of unique ID (such as one generated by Snowflake).

When denormalizing manually, users can take advantage of this knowledge to 
avoid doing a read-before-write, but with materialized views they don't have 
this option.  When the newly inserted row also happens to be a new partition, 
MVs are still fairly efficient, because the bloom filters allow us to quickly 
short-circuit the read.  However, when new rows are inserted into existing 
partitions, the reads can become costly.
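
To make the trade-off concrete, here is a toy in-memory sketch in Python.  It 
is not Cassandra code: the class, the assert_new flag, and the dict-based 
"tables" are all hypothetical and only model why a view update normally needs 
the old row, and what skipping that read would save.

```python
# Toy model only; every name here is hypothetical, not a Cassandra API.
class BaseTableWithView:
    def __init__(self):
        self.base = {}   # primary key -> current value of the view-key column
        self.view = {}   # (view-key column value, primary key) -> True
        self.reads = 0   # number of read-before-write lookups performed

    def write(self, pk, view_col, assert_new=False):
        if not assert_new:
            # Read-before-write: look up the old row so that its stale
            # view entry can be removed before the new one is added.
            self.reads += 1
            old = self.base.get(pk)
            if old is not None and old != view_col:
                self.view.pop((old, pk), None)
        # With assert_new=True the caller vouches that no old row exists,
        # so there is no stale view entry to look for.
        self.base[pk] = view_col
        self.view[(view_col, pk)] = True


table = BaseTableWithView()
table.write("u1", "a@example.com")                   # normal path, one read
table.write("u1", "b@example.com")                   # overwrite, one read
table.write("u2", "c@example.com", assert_new=True)  # asserted new, no read
```

In this model, a wrong assertion (an asserted insert over an existing row) 
would leave the old view entry behind, which is the hazard any real API for 
this would have to document.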

I'd like to consider exposing a way for the user to indicate that an inserted 
row is new on a per-operation basis.  Internally, this could potentially use 
the mechanism from CASSANDRA-9779, depending on how that's implemented.  As far 
as the API goes, I'm not sure.  Perhaps an "assertion" clause in inserts would 
work well:

{noformat}
INSERT INTO users ... ASSERTING DOES NOT EXIST;
{noformat}

The choice of API should also take into consideration potential future 
enhancements along these lines.  For example, we might want to support 
asserting that a given column has a known current value (as another means of 
avoiding read-before-writes).

If we implement this, we should make sure that hints, logged batches, and 
commitlog replay handle this safely.  If the original timestamp is used for 
replay, I believe it should be idempotent (during the gc_grace window), but I 
could be missing something.
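
As a rough illustration of the replay reasoning, here is a simplified Python 
sketch of timestamp-based reconciliation.  The apply and purge_tombstones 
helpers are hypothetical, timestamp ties are ignored, and gc_grace is modeled 
only as "tombstones have been purged", so this is a sketch under those 
assumptions rather than how the storage engine works.

```python
# Simplified last-write-wins model; helper names are hypothetical.
TOMBSTONE = object()  # marker standing in for a deletion


def apply(store, key, value, timestamp):
    """Keep the write only if it is newer than what is already stored."""
    current = store.get(key)
    if current is None or timestamp > current[1]:
        store[key] = (value, timestamp)


def purge_tombstones(store):
    """Model the end of the gc_grace window: deletion markers are dropped."""
    for key in [k for k, (v, _) in store.items() if v is TOMBSTONE]:
        del store[key]


store = {}
apply(store, "u1", "a@example.com", 10)  # original insert
apply(store, "u1", "b@example.com", 20)  # later overwrite
snapshot = dict(store)
apply(store, "u1", "a@example.com", 10)  # replay with the original timestamp
replay_is_noop = store == snapshot       # replay did not change anything

apply(store, "u1", TOMBSTONE, 30)        # row is deleted
purge_tombstones(store)                  # tombstone purged after gc_grace
apply(store, "u1", "a@example.com", 10)  # a very late replay
resurrected = store.get("u1")            # the old insert is visible again
```

Within the window the replay is a no-op; once the tombstone is gone, the same 
replay brings the old row back, which is why the gc_grace caveat matters.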



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
