Sylvain Lebresne created CASSANDRA-9796:
-------------------------------------------

             Summary: Give 8099's like treatment to partition keys
                 Key: CASSANDRA-9796
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9796
             Project: Cassandra
          Issue Type: Improvement
            Reporter: Sylvain Lebresne


Post-8099, we properly distinguish clustering columns at the engine level, 
which allows us to use a somewhat more efficient encoding: we don't write the 
size of values of fixed-width types, and we can properly store null values 
(which will likely prove useful for CASSANDRA-6477, for instance).

Partition keys, however, have had no such love: the storage engine still 
manipulates them as a single blob, and their encoding is not terribly 
efficient. We always store the size of every value (even fixed-width ones), 
and for compound keys we also store the size of the full partition key, even 
though it's redundant with the individual value sizes. The encoding also 
doesn't allow nulls, which is inconvenient at least for CASSANDRA-6477.
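As a rough illustration of the overhead described above, here is a small byte-counting sketch. The 2-byte width of a size prefix and the {{KeySizeEstimate}} class are assumptions made for the example, not the actual on-disk format, and the "compact" variant only models dropping redundant sizes (it ignores whatever header a real encoding would need for nulls).

```java
// Hypothetical helper: counts bytes only, it does not read or write real sstable data.
final class KeySizeEstimate {
    // Assumed width of one size prefix; the real width may differ.
    static final int PREFIX_BYTES = 2;

    // Current scheme as described in the ticket: every value carries a size,
    // and a compound key additionally carries the size of the whole key.
    static int legacySize(int[] valueLengths) {
        int total = valueLengths.length > 1 ? PREFIX_BYTES : 0;
        for (int len : valueLengths) {
            total += PREFIX_BYTES + len;
        }
        return total;
    }

    // Sketch of the proposed direction: no size for fixed-width values,
    // and no redundant size for the full key.
    static int compactSize(int[] valueLengths, boolean[] fixedWidth) {
        int total = 0;
        for (int i = 0; i < valueLengths.length; i++) {
            total += fixedWidth[i] ? valueLengths[i] : PREFIX_BYTES + valueLengths[i];
        }
        return total;
    }
}
```

Under these assumptions, a key made of a 4-byte int and an 8-byte bigint takes 18 bytes in the current scheme and 12 bytes once the redundant sizes are dropped.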

So I'd like to improve on this by:
# making the {{DecoratedKey}} API (which I'd personally rename to 
{{PartitionKey}}) expose the fact that a key can have more than one value, 
typically by adding {{size()}} and {{get\(i\)}} methods as for 
{{Clustering}}. In particular, this would simplify a couple of places in the 
code where we still decompose such values manually.
# improving their encoding. An easy and consistent solution would be to reuse 
the same encoding as for {{Clustering}} (they are the same kind of beast), 
though I'm open to other options.
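To make the first point concrete, here is a minimal sketch of what such an accessor API could look like. {{MultiValueKey}} and {{SimpleMultiValueKey}} are hypothetical names invented for this example; they are not existing Cassandra classes, and the real {{DecoratedKey}} carries more than this (the token, for a start).

```java
import java.nio.ByteBuffer;

// Hypothetical interface mirroring the size()/get(i) shape mentioned above.
interface MultiValueKey {
    int size();              // number of partition key components
    ByteBuffer get(int i);   // i-th component, or null if that component is null
}

// Hypothetical array-backed implementation, for illustration only.
final class SimpleMultiValueKey implements MultiValueKey {
    private final ByteBuffer[] values;

    SimpleMultiValueKey(ByteBuffer... values) {
        this.values = values.clone();
    }

    @Override
    public int size() {
        return values.length;
    }

    @Override
    public ByteBuffer get(int i) {
        ByteBuffer v = values[i];
        // duplicate() shares the bytes but gives the caller its own position/limit.
        return v == null ? null : v.duplicate();
    }
}
```

With something of this shape, callers would ask for component {{i}} directly instead of splitting a single blob by hand, and a null component would be representable.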

One small subtlety to be aware of is that whatever we do to the internal 
encoding/implementation, we must make sure we still compute the same tokens.  
But that's not particularly hard either.
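One way to keep tokens stable is to rebuild the old blob form of the key only when hashing, whatever the new internal representation is. The sketch below illustrates that idea under loud assumptions: {{LegacyKeyBlob}} is a hypothetical class, the per-component layout (2-byte length, value bytes, one trailing zero byte, with single-value keys left unwrapped) is my assumption about the legacy composite form, and {{Arrays.hashCode}} is only a stand-in for the partitioner's real hash function.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Hypothetical helper: rebuilds the legacy blob so the token input is unchanged.
final class LegacyKeyBlob {
    // Assumed legacy layout: a single value is used as-is; for compound keys,
    // each component is <2-byte length><value bytes><one zero byte>.
    static byte[] serialize(byte[]... components) {
        if (components.length == 1) {
            return components[0].clone();
        }
        int total = 0;
        for (byte[] c : components) {
            total += 2 + c.length + 1;
        }
        ByteBuffer out = ByteBuffer.allocate(total);
        for (byte[] c : components) {
            out.putShort((short) c.length);
            out.put(c);
            out.put((byte) 0);
        }
        return out.array();
    }

    // Stand-in hash, NOT the real partitioner: the point is only that the
    // hash input is the legacy blob, independent of the new encoding.
    static long token(byte[]... components) {
        return Arrays.hashCode(serialize(components));
    }
}
```

As long as the token is always derived from this legacy form, the internal and on-disk encoding can change without moving any data between nodes.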




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
