[ 
https://issues.apache.org/jira/browse/CASSANDRA-5527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13645764#comment-13645764
 ] 

Jonathan Ellis commented on CASSANDRA-5527:
-------------------------------------------

bq. Underneath, the storage engine row would contain additional secondary key 
tombstones

Hmm, I think you might be glossing over something problematic here.

Currently we support three types of tombstones:

- Partition key tombstones, which are just an int and a long (local and 
client-facing deletion times)
- Range tombstones, which are an int/long pair with a start and stop cell name 
(in the conveniently named {{RangeTombstone}} class)
- Single-cell tombstones
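
For readers who want the three shapes side by side, here is a rough sketch in Python. It is only an illustration of the list above, not Cassandra's actual classes: every class and field name other than {{RangeTombstone}} is made up for the example, and cell names are modeled as plain tuples.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeletionTime:
    # Hypothetical field names; the source only says "an int and a long".
    local_deletion_time: int    # the int: server-local deletion time
    marked_for_delete_at: int   # the long: client-facing deletion timestamp


@dataclass(frozen=True)
class PartitionTombstone:
    # Deletes the whole partition; carries nothing but the two times.
    deletion: DeletionTime


@dataclass(frozen=True)
class RangeTombstone:
    # Deletes every cell whose name falls between start and stop.
    start: tuple
    stop: tuple
    deletion: DeletionTime


@dataclass(frozen=True)
class CellTombstone:
    # Deletes exactly one cell, identified by its full name.
    name: tuple
    deletion: DeletionTime
```

Note that in all three cases the thing being deleted is identified by the partition key or by cell names, which is what the rest of this comment depends on.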

Partition key tombstones are just hardcoded to come after the PK itself in the 
row header.  Range tombstones are scattered among the data cells, following the 
same comparator rules.  So if we are looking for cell X, the same scan we'd do 
for X will also run across anything tombstoning it without having to do extra 
seeks.  (We'll replicate a range tombstone multiple times if it covers multiple 
cell-name-index blocks.)
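
To make the "same scan" point concrete, here is a toy model in Python. It is a sketch under simplifying assumptions, not the real storage engine: cell names are plain strings, an index block is just a fixed-size chunk of sorted cells, and the function names and the tie-breaking rule (a tombstone wins on equal timestamps) are assumptions of the example.

```python
from bisect import bisect_right


def build_blocks(cells, range_tombstones, block_size=2):
    """Split sorted cells into index blocks.

    cells: sorted list of (name, value, timestamp)
    range_tombstones: list of (start, stop, marked_for_delete_at)
    A range tombstone is copied into every block whose name range it overlaps.
    """
    blocks = []
    for i in range(0, len(cells), block_size):
        chunk = cells[i:i + block_size]
        first, last = chunk[0][0], chunk[-1][0]
        covering = [rt for rt in range_tombstones
                    if rt[0] <= last and rt[1] >= first]
        blocks.append({"first": first, "cells": chunk, "tombstones": covering})
    return blocks


def read_cell(blocks, name):
    """Return the live value for `name`, or None if absent or deleted.

    Only the single block that could contain `name` is examined.
    """
    firsts = [b["first"] for b in blocks]
    i = bisect_right(firsts, name) - 1
    if i < 0:
        return None
    block = blocks[i]
    # Newest range tombstone in this block that covers the requested name.
    deleted_at = max((ts for start, stop, ts in block["tombstones"]
                      if start <= name <= stop), default=None)
    for cell_name, value, ts in block["cells"]:
        if cell_name == name:
            if deleted_at is not None and deleted_at >= ts:
                return None  # shadowed by a range tombstone
            return value
    return None
```

Because the tombstone is found by the same comparator order as the cell, the lookup touches one block and needs no separate pass over deletion markers.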

The problem is that I don't see a way to efficiently check for tombstones 
against cell names that are not part of the PK (and hence not part of the 
comparator).  If we're talking about "loading a list of key tombstones from 
the row header and checking each one in turn" then I think I'm -1 on the idea.
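
For contrast, here is roughly what I understand the "check each one in turn" approach to look like, again as a hedged Python sketch. Nothing here is proposed code: the function name, the tuple layouts, and the `checks` counter are all invented for the illustration, and `entry_key` stands in for the secondary key column from the description.

```python
def apply_secondary_key_tombstones(cells, header_tombstones):
    """Filter cells against secondary key tombstones held in the row header.

    cells: list of (entry_id, entry_key, write_timestamp)
    header_tombstones: list of (entry_key, marked_for_delete_at)
    Returns (live entry_ids, number of tombstone comparisons performed).
    """
    live = []
    checks = 0
    for entry_id, entry_key, write_ts in cells:
        shadowed = False
        # entry_key is not part of the cell name, so the tombstones cannot be
        # located by comparator order; every one has to be compared in turn.
        for tombstone_key, deleted_at in header_tombstones:
            checks += 1
            if tombstone_key == entry_key and deleted_at >= write_ts:
                shadowed = True
                break
        if not shadowed:
            live.append(entry_id)
    return live, checks
```

In this model the work per read grows with the number of cells scanned times the number of tombstones in the header, which is the cost I'm worried about.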
                
> Deletion by Secondary Key
> -------------------------
>
>                 Key: CASSANDRA-5527
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5527
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Rick Branson
>
> Given Cassandra's popularity as a time-ordered list store, the inability to 
> do deletes by anything other than the primary key without re-implementing 
> tombstones in the application is a bit of an Achilles' heel for many use 
> cases. It's a data modeling problem that seems to come up quite often, and 
> given that we now have the CQL3 abstraction layer sitting on top of the 
> storage engine, I think there's an opportunity to take this burden off of the 
> application layer. I've spent several weeks thinking about this problem 
> within the context of Cassandra, and I think I've come up with a reasonable 
> proposal.
> It would involve addition of a secondary key facility to CQL3 tables:
> CREATE TABLE timeline (
>       timeline_id uuid,
>       entry_id timeuuid,
>       entry_key blob,
>       entry_payload blob,
>       PRIMARY KEY (timeline_id, entry_id),
>       KEY (timeline_id, entry_key)
> );
> Secondary keys would be required to share the same partition key with the 
> primary key. They would be included to support deletion by secondary key 
> operations:
> DELETE FROM timeline WHERE timeline_id = <X> and entry_key = <Y>;
> Underneath, the storage engine row would contain additional secondary key 
> tombstones. Secondary key deletion would be read-free, requiring a single 
> tombstone write. The cost of reads would necessarily go up. Queries would 
> need to be modified to perform an additional step to find any matching 
> secondary key tombstones and perform the regular convergence process. The 
> secondary key tombstones should be cleaned up by the regular tombstone GC 
> process.
> While I didn't want to complicate this idea too much, it might be also worth 
> having a discussion around supporting secondary key queries as well, or at 
> least making the schema compatible with potential future support (maybe 
> rename KEY to DELETABLE KEY or something).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira