[
https://issues.apache.org/jira/browse/CASSANDRA-1034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13125773#comment-13125773
]
Sylvain Lebresne commented on CASSANDRA-1034:
---------------------------------------------
bq. Can you please define what do you mean by "pure token"?
In you patch, it's a Token whose key is EMPTY_BYTE_BUFFER (which is *not* a
valid row key, hence the 'pure' token name).
bq. Aren't we supposed to generate token from key in all situations except
initial token in config and middle point between tokens?
And? Is that not enough? There is tons of place in the code where we manipulate
those tokens 'not created from a key' (all the distribution code basically,
which is a big part of Cassandra). Moreover, there is also range_slice that
accept a range of token.
bq. So if you do a range slice using tokens instead of keys
TokenFactory.fromString will force you to use correctly serialized token data
which will also include key.
To what is this supposed to be an answer ?
bq. If token is generated from key than for me it's natural to have a key as
member. The thing is that you are enable to create a "pure" token, Partitioner
will always give you a Token with valid key except for midpoint method so if
partitioner is used to generate tokens you are guaranteed to have a valid key
in the resulting token instance.
But it's not always generated from a key! There is nothing natural to a key
member in all the Token object manipulated by TokenMetadata and other, since
there is not such key.
bq. The same thing I can say about your set of patches - it's 198 KB. Aren't we
writing tests to catch such bugs?
Well, in my patch, 148K of those are a type generification only (that's why
I've separated it). Because generics are erased at runtime, as long as it
compile, there is *NO* chance this can introduce a bug. As for trusting tests
to catch bugs, I think it's being overconfident in tests. But in the end, I'm
happily taking back that objection as this is by far the less important.
Let me try to put things graphically, everyone loves a graph: if I draw the set
of all keys as this:
{noformat}
[-----------------------------------------------------------------------------[
{noformat}
i.e, the ring but as a line because I'm ignoring wrapping for this.
Now, if I display row keys (decorated or not, that doesn't matter, both are
keys), I would have for instance:
{noformat}
[---------------------------|-------|---------------|---------------|---------[
k1 k2 k3 k4
{noformat}
A key is a point on the ring.
Now if keys and tokens are a 1 to 1 mapping, then it could be ok to say that a
token is a point on the ring, but once it's not the case, then it looks like
that:
{noformat}
t t' t''
[-------------------------[*|*******|*]----------[**|****]-------[**|******]--[
k1 k2 k3 k4
{noformat}
where t is the token for both k1 and k2 (and an infinite number of other keys
(actually finite because we're working on a computer)), t' the token of k3 (and
an 'infinite' number of other keys), etc...
A token is intrinsically a range, a segment on the ring. Shoving DK and Token
into the same class everywhere in the code is saying that we'll use the same
class for a point and an interval. How can that be a good idea? How can that
not backfire on us and be hard to work with, making it easy to introduce errors?
> Remove assumption that Key to Token is one-to-one
> -------------------------------------------------
>
> Key: CASSANDRA-1034
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1034
> Project: Cassandra
> Issue Type: Bug
> Reporter: Stu Hood
> Assignee: Pavel Yaskevich
> Priority: Minor
> Fix For: 1.1
>
> Attachments:
> 0001-Make-range-accept-both-Token-and-DecoratedKey.patch,
> 0002-LengthPartitioner.patch, 1034-1-Generify-AbstractBounds-v3.patch,
> 1034-2-Remove-assumption-that-token-and-keys-are-one-to-one-v3.patch,
> 1034_v1.txt, CASSANDRA-1034.patch
>
>
> get_range_slices assumes that Tokens do not collide and converts a KeyRange
> to an AbstractBounds. For RandomPartitioner, this assumption isn't safe, and
> would lead to a very weird heisenberg.
> Converting AbstractBounds to use a DecoratedKey would solve this, because the
> byte[] key portion of the DecoratedKey can act as a tiebreaker.
> Alternatively, we could make DecoratedKey extend Token, and then use
> DecoratedKeys in places where collisions are unacceptable.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators:
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira