[ 
https://issues.apache.org/jira/browse/CASSANDRA-1034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13017026#comment-13017026
 ] 

Sylvain Lebresne commented on CASSANDRA-1034:
---------------------------------------------

bq. DK.isEmpty seems like a bad method name for a Key object – intuitively, 
keys are a specific point and should not contain other points except for the 
obvious identity case. Would isMinimum be a better name?

Actually, I don't even like isEmpty for Token either, so I'm in favor of isMinimum 
for both DK and Token.

bq.  don't understand RP.toSplitValue or why DK would throw away information, 
when calling it. More generally, I'm unclear why we would have null keys in DK 
– shouldn't you use a Token, if you don't have key information?

The current patch doesn't allow mixing Token and DK in a range/bounds (because 
that comes with its whole set of complications). However, getRestrictedRange 
must be able to break a range of DKs based on a node's token. So 
RP.toSplitValue() returns, for a given token, the value that splits the range: 
for a token range it's the token itself, but for a DK range, it's the largest 
DK having this token.
The null keys are related: even though we don't mix DKs and tokens in a range, 
we need to be able to have a range of DKs that includes everything from token x 
to token y. Hence, for a given token t, we need two DKs: the smallest DK having 
t and the biggest DK having t. In the patch, slightly but not totally 
arbitrarily, I use DK(t, EMPTY_BB) for the smallest key and DK(t, null) for the 
biggest one, hence the "need" for null keys. 
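To illustrate the ordering this implies, here is a minimal sketch (these are 
not Cassandra's actual classes or method names, just a hypothetical comparator 
under the convention above): for a given token t, an empty key sorts before 
every real key with that token, and a null key sorts after all of them.

```java
import java.nio.ByteBuffer;

public class DecoratedKeySketch {
    static final ByteBuffer EMPTY_BB = ByteBuffer.allocate(0);

    // Compare the key parts of two DKs that share the same token.
    // Convention (hypothetical, mirroring the patch's idea): a null key is
    // "bigger than any key"; EMPTY_BB, being a prefix of everything, is
    // smaller than any non-empty key; otherwise compare byte-wise.
    static int compareKeys(ByteBuffer a, ByteBuffer b) {
        if (a == null && b == null) return 0;
        if (a == null) return 1;   // null key = biggest DK for the token
        if (b == null) return -1;
        return a.compareTo(b);     // lexicographic byte comparison
    }

    public static void main(String[] args) {
        ByteBuffer realKey = ByteBuffer.wrap(new byte[] { 0x01 });
        // DK(t, EMPTY_BB) sorts before any real key with token t...
        System.out.println(compareKeys(EMPTY_BB, realKey) < 0);
        // ...and DK(t, null) sorts after it
        System.out.println(compareKeys(realKey, null) < 0);
        System.out.println(compareKeys(null, EMPTY_BB) > 0);
    }
}
```

With this, a range of DKs covering everything from token x (exclusive) to 
token y (inclusive) can be written (DK(x, null), DK(y, null)] without ever 
naming a real key.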

bq. using MINIMUM_TOKEN for both sort-before-everything and 
sort-after-everything values has always been confusing. Should we introduce a 
MAXIMUM_TOKEN value to clear that up?

I think that would make wrapping stuff more complicated, because then what 
would be the difference between the following ranges: (MIN, MIN], (MAX, MAX], 
(MIN, MAX] and (MAX, MIN]? For DK, the code already enforces that we only have 
one minimum key (namely DK(MIN, EMPTY_BB)) and never ever uses DK(MIN, null), 
because that poses problems. I think a MAX token would make that worse. 

> Remove assumption that Key to Token is one-to-one
> -------------------------------------------------
>
>                 Key: CASSANDRA-1034
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1034
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Stu Hood
>            Assignee: Sylvain Lebresne
>            Priority: Minor
>             Fix For: 0.8
>
>         Attachments: 0001-Generify-AbstractBounds.patch, 
> 0001-Make-range-accept-both-Token-and-DecoratedKey.patch, 
> 0002-LengthPartitioner.patch, 
> 0002-Remove-assumption-that-token-and-keys-are-one-to-one-v2.patch, 
> 0002-Remove-assumption-that-token-and-keys-are-one-to-one.patch, 1034_v1.txt
>
>
> get_range_slices assumes that Tokens do not collide and converts a KeyRange 
> to an AbstractBounds. For RandomPartitioner, this assumption isn't safe, and 
> would lead to a very weird heisenbug.
> Converting AbstractBounds to use a DecoratedKey would solve this, because the 
> byte[] key portion of the DecoratedKey can act as a tiebreaker. 
> Alternatively, we could make DecoratedKey extend Token, and then use 
> DecoratedKeys in places where collisions are unacceptable.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
