[
https://issues.apache.org/jira/browse/CASSANDRA-9779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14623732#comment-14623732
]
Robert Stupp commented on CASSANDRA-9779:
-----------------------------------------
IMO it would be logical to disallow {{UPDATE}} for {{WITH INSERTS ONLY}} tables
(that is, after all, what {{WITH INSERTS ONLY}} says).
Would {{WITH INSERTS ONLY}} also mean restricting such tables to primary keys
without a clustering key?
Maybe I didn't completely get it. What I'm thinking about is that one partition
can still be split over the memtable + multiple sstables, which would conflict
with the compaction/read-path optimizations. For example, if you have a table
with {{PRIMARY KEY ( (year, month, day), hour, minute, second)}} and several
million INSERTs per day, it's likely that each day's partition will end up in
multiple sstables. I mean, I'm a bit afraid that partitions get too tiny, with
all the consequences (too many queries, not able to insert from different
clients for the same day).
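To make the concern concrete, here is a toy sketch in plain Python (not Cassandra code). It assumes a memtable that is flushed after a fixed number of rows, which is a simplification, and all names in it ({{FLUSH_THRESHOLD}}, {{simulate}}, {{read_partition}}) are made up for illustration. It shows one day's partition ending up in several sstables plus the memtable, and a partition read that stops at the first source holding the partition returning only part of the data.

```python
from collections import defaultdict

FLUSH_THRESHOLD = 1000  # hypothetical: rows held in the memtable before a flush


def simulate(rows, flush_threshold=FLUSH_THRESHOLD):
    """Append (partition_key, clustering_key) rows, flushing when the memtable is full.

    Returns (sstables, memtable); each maps partition_key -> list of clustering keys.
    """
    sstables = []
    memtable = defaultdict(list)
    count = 0
    for pk, ck in rows:
        memtable[pk].append(ck)
        count += 1
        if count == flush_threshold:
            sstables.append(dict(memtable))  # "flush" to a new immutable sstable
            memtable = defaultdict(list)
            count = 0
    return sstables, dict(memtable)


def sources_holding(pk, sstables, memtable):
    """Count how many sources (sstables + memtable) contain part of the partition."""
    return sum(1 for s in sstables if pk in s) + (1 if pk in memtable else 0)


def read_partition(pk, sstables, memtable, stop_at_first_hit):
    """Read a whole partition, newest source first, optionally stopping at the first hit."""
    result = []
    for source in [memtable] + sstables[::-1]:
        if pk in source:
            result.extend(source[pk])
            if stop_at_first_hit:
                break
    return sorted(result)


# One day's partition, one insert per second for 3500 seconds.
day = (2015, 7, 12)
rows = [(day, (s // 3600, (s % 3600) // 60, s % 60)) for s in range(3500)]
sstables, memtable = simulate(rows)

full = read_partition(day, sstables, memtable, stop_at_first_hit=False)
partial = read_partition(day, sstables, memtable, stop_at_first_hit=True)
print(sources_holding(day, sstables, memtable), len(full), len(partial))
```

In this toy model a single-row lookup could still stop at the first match, since an insert-only row lives in exactly one place; it is the partition-level read that has to merge all sources.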
If such a {{WITH INSERTS ONLY}} table has no clustering key, even more
optimizations might be possible (the key-cache entry would not need the sstable
ref in its key, but in its value, so we could do the key-cache lookup first and
skip the bloom-filter lookups on a hit).
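A rough sketch of that key-cache idea, again as a toy model in Python rather than actual Cassandra code. The assumption is that each partition key exists in exactly one sstable (insert-only, no clustering key), so the cache can map the partition key alone to an (sstable, position) pair. The {{SSTable}} class, {{might_contain}} and {{read}} are hypothetical names, and the bloom filter is replaced by an exact membership check that only counts how often it is consulted.

```python
class SSTable:
    """Toy sstable: an ordered list of (partition_key, value) plus an index."""

    def __init__(self, name, rows):
        self.name = name
        self.data = list(rows)
        self.index = {pk: pos for pos, (pk, _) in enumerate(self.data)}
        self.bloom_checks = 0  # how often the "bloom filter" was consulted

    def might_contain(self, pk):
        # Stand-in for a bloom filter (exact here, so no false positives).
        self.bloom_checks += 1
        return pk in self.index


def read(pk, sstables, key_cache):
    """Look up a row; key_cache maps partition_key -> (sstable, position)."""
    hit = key_cache.get(pk)
    if hit is not None:
        sstable, pos = hit
        return sstable.data[pos][1]  # cache hit: no bloom-filter lookup at all
    for sstable in sstables:
        if sstable.might_contain(pk):
            pos = sstable.index[pk]
            key_cache[pk] = (sstable, pos)  # sstable ref lives in the value
            return sstable.data[pos][1]
    return None


s1 = SSTable("s1", [("a", 1)])
s2 = SSTable("s2", [("b", 2)])
cache = {}

first = read("b", [s1, s2], cache)   # miss: consults both bloom filters
checks_after_first = s1.bloom_checks + s2.bloom_checks
second = read("b", [s1, s2], cache)  # hit: served from the cache entry
checks_after_second = s1.bloom_checks + s2.bloom_checks
print(first, second, checks_after_first, checks_after_second)
```

With a clustering key this would not work as is, because the same partition key could legitimately appear in several sstables and a single cache entry could not point at all of them.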
> Append-only optimization
> ------------------------
>
> Key: CASSANDRA-9779
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9779
> Project: Cassandra
> Issue Type: New Feature
> Components: API, Core
> Reporter: Jonathan Ellis
> Fix For: 3.x
>
>
> Many common workloads are append-only: that is, they insert new rows but do
> not update existing ones. However, Cassandra has no way to infer this and so
> it must treat all tables as if they may experience updates in the future.
> If we added syntax to tell Cassandra about this ({{WITH INSERTS ONLY}} for
> instance) then we could do a number of optimizations:
> - Compaction would only need to worry about defragmenting partitions, not
> rows. We could default to DTCS or similar.
> - CollationController could stop scanning sstables as soon as it finds a
> matching row
> - Most importantly, materialized views wouldn't need to worry about deleting
> prior values, which would eliminate the majority of the MV overhead