[ 
https://issues.apache.org/jira/browse/CASSANDRA-10295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tyler Hobbs updated CASSANDRA-10295:
------------------------------------
    Labels: client-impacting doc-impacting  (was: )

> Support skipping MV read-before-write on a per-operation basis
> --------------------------------------------------------------
>
>                 Key: CASSANDRA-10295
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10295
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: API, Core
>            Reporter: Tyler Hobbs
>              Labels: client-impacting, doc-impacting
>             Fix For: 3.x
>
>
> This is similar in spirit to CASSANDRA-9779, but on a per-operation basis.  
> There are many workloads that include a mixture of new insertions and 
> overwrites.  In some cases, logic outside of Cassandra guarantees that an 
> inserted row does not already exist.  For example, the primary key may 
> include a UUID or another form of unique id (from, say, Snowflake).  
> When denormalizing manually, users can take advantage of this knowledge to 
> avoid doing a read-before-write, but with materialized views they don't have 
> this option.  When the newly inserted row also happens to be a new partition, 
> MVs are still pretty efficient, because the bloom filters allow us to quickly 
> short-circuit the read.  However, when new rows are inserted into existing 
> partitions, the reads can become costly.
>
> I'd like to consider exposing a way for the user to indicate that an inserted 
> row is new on a per-operation basis.  Internally, this could potentially use 
> the mechanism from CASSANDRA-9779, depending on how that's implemented.  As 
> far as the API goes, I'm not sure.  Perhaps an "assertion" clause in inserts 
> would work well:
> {noformat}
> INSERT INTO users ... ASSERTING DOES NOT EXIST;
> {noformat}
> The choice of API should also take into consideration potential future 
> enhancements along these lines.  For example, we might want to support 
> asserting that a given column has a known current value (as another means of 
> avoiding read-before-writes).
>
> If we implement this, we should make sure that hints, logged batches, and 
> commitlog replay handle this safely.  If the original timestamp is used for 
> replay, I believe it should be idempotent (during the gc_grace window), but I 
> could be missing something.
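
The idempotency argument in the description can be sketched with a toy model. This is only an illustration under stated assumptions, not Cassandra code: `ToyTable`, the `assert_new` flag, and the email-keyed view are hypothetical names, and reconciliation is reduced to "highest timestamp wins, a deletion wins a tie". It shows that an asserted-new insert does no base-table read, and that replaying it with the original timestamp leaves both the base table and the view unchanged.

```python
class ToyTable:
    """Toy base table plus one view keyed by email. Not Cassandra code."""

    def __init__(self):
        self.base = {}   # user_id -> (email, ts)
        self.view = {}   # (email, user_id) -> (is_live, ts)
        self.reads = 0   # base-table reads done for view maintenance

    def _write_view(self, key, is_live, ts):
        cur = self.view.get(key)
        # Last write wins; on a timestamp tie a deletion beats a live entry.
        if cur is None or ts > cur[1] or (ts == cur[1] and not is_live):
            self.view[key] = (is_live, ts)

    def insert(self, user_id, email, ts, assert_new=False):
        if not assert_new:
            # Read-before-write: find the view entry this write supersedes.
            self.reads += 1
            old = self.base.get(user_id)
            if old is not None:
                if old[1] >= ts:
                    return  # an equal or newer write already exists
                if old[0] != email:
                    # Delete the stale view entry for the old email.
                    self._write_view((old[0], user_id), False, ts)
        # Hypothetical "asserted new" path: skip the read entirely.
        cur = self.base.get(user_id)
        if cur is None or ts > cur[1]:
            self.base[user_id] = (email, ts)
        self._write_view((email, user_id), True, ts)

    def snapshot(self):
        return dict(self.base), dict(self.view)


table = ToyTable()
table.insert("u1", "a@example.com", ts=10, assert_new=True)
after_first = table.snapshot()

# Simulate a hint or commitlog replay that reuses the original timestamp.
table.insert("u1", "a@example.com", ts=10, assert_new=True)
after_replay = table.snapshot()
```

In this model the replay is a no-op because neither the base write nor the view write carries a newer timestamp than what is already stored. The toy says nothing about the gc_grace caveat or about what happens if the assertion is wrong; those would need to be checked against the real implementation.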



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
