[
https://issues.apache.org/jira/browse/CASSANDRA-12923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15673278#comment-15673278
]
Sylvain Lebresne commented on CASSANDRA-12923:
----------------------------------------------
This is indeed a (somewhat known) problem, and we have a similar problem with
{{varint}} (if a user sends the same value but with more or fewer leading
zeros, both values will compare as equal, but won't yield the same token). This
is really unfortunate, but I'm also not sure we can fix it without risking
breaking backward compatibility: if we start forcing a canonical value, old
non-canonical values would become unreachable. Even if we were to somehow
convert values to their canonical form on upgrade (by forcing the
canonical form on reads, and maybe bumping the sstable format so that we only
bother checking for canonical form on non-upgraded sstables), that would
still be a problem during rolling upgrades (where old nodes wouldn't convert to
canonical values, so we'd have unreachable data at least temporarily).
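To make the {{varint}} case concrete, here is a minimal sketch in plain Java. It assumes the value is encoded the way {{BigInteger}} encodes it (big-endian two's complement); the class and helper names are hypothetical and this is not Cassandra's serializer code. It shows two byte sequences that decode to the same number but differ as bytes, which is why a token computed from the raw bytes would differ.

```java
import java.math.BigInteger;
import java.util.Arrays;

// Illustration only: hypothetical helpers, not part of Cassandra.
final class VarintLeadingZeros {
    // Two encodings are "the same value" if they decode to equal integers.
    static boolean sameValue(byte[] a, byte[] b) {
        return new BigInteger(a).equals(new BigInteger(b));
    }

    // BigInteger.toByteArray() returns the minimal two's-complement encoding,
    // so a decode/encode round trip drops redundant leading bytes.
    static byte[] canonicalBytes(byte[] raw) {
        return new BigInteger(raw).toByteArray();
    }

    public static void main(String[] args) {
        byte[] minimal = {0x03};       // 3
        byte[] padded = {0x00, 0x03};  // also 3, with a redundant leading zero byte

        System.out.println(sameValue(minimal, padded));     // true
        System.out.println(Arrays.equals(minimal, padded)); // false: raw bytes differ
        System.out.println(Arrays.equals(canonicalBytes(padded), minimal)); // true
    }
}
```

Hashing {{canonicalBytes(...)}} instead of the raw bytes would make both inputs agree, but as noted above, data already written under the non-canonical bytes would then no longer be found.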
For what it's worth, it has also been the behavior since forever (including
back in Thrift).
So I agree it's really not great, and if it weren't for the upgrade/backward
compatibility issue, I'd have no hesitation in fixing this. But due to those
concerns, I'm wondering if it wouldn't be better to just call it a known crappy
behavior (we should obviously still document it better at least), and maybe
openly discourage using {{decimal}} and {{varint}} in partition keys.
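For reference, the {{decimal}} mismatch can be reproduced outside Cassandra with plain {{BigDecimal}}. The sketch below is only an illustration: the class and the {{canonical}} helper are hypothetical names, and it uses {{stripTrailingZeros()}} as one possible way to obtain a form whose unscaled value is not divisible by 10. It is not a proposed patch.

```java
import java.math.BigDecimal;

// Illustration only: hypothetical helper, not part of Cassandra.
final class DecimalCanonicalForm {
    // One possible canonical form: strip trailing zeros so that the unscaled
    // value is not divisible by 10 (zero is mapped to BigDecimal.ZERO).
    static BigDecimal canonical(BigDecimal d) {
        return d.signum() == 0 ? BigDecimal.ZERO : d.stripTrailingZeros();
    }

    public static void main(String[] args) {
        BigDecimal three = new BigDecimal("3");            // unscaled 3,  scale 0
        BigDecimal threePointZero = new BigDecimal("3.0"); // unscaled 30, scale 1

        // Ordering treats the two as equal...
        System.out.println(three.compareTo(threePointZero) == 0); // true
        // ...but the (unscaled value, scale) pairs differ, so anything derived
        // from the serialized form differs too.
        System.out.println(three.equals(threePointZero));         // false
        // After canonicalization both have unscaled value 3 and scale 0.
        System.out.println(canonical(three).equals(canonical(threePointZero))); // true
    }
}
```

Note that any such canonicalization changes the bytes for existing non-canonical keys, which is exactly the compatibility concern described above.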
> Decimal type has inconsistent equality semantics when used in partition key
> ---------------------------------------------------------------------------
>
> Key: CASSANDRA-12923
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12923
> Project: Cassandra
> Issue Type: Bug
> Reporter: Branimir Lambov
>
> Unlike {{double}} or {{float}}, a {{decimal}} used as a primary key in
> Cassandra gives {{3 != 3.0}} even though {{3 <= 3.0}} and {{3 >= 3.0}}:
> {code}
> cqlsh:keyspace1> create table testdec (key decimal primary key, value int);
> cqlsh:keyspace1> insert into testdec (key, value) values (3.0, 3);
> cqlsh:keyspace1> select * from testdec;
> key | value
> ------+-------
> 3.0 | 3
> (1 rows)
> cqlsh:keyspace1> select * from testdec where key = 3;
> key | value
> -----+-------
> (0 rows)
> cqlsh:keyspace1> select * from testdec where key = 3.0;
> key | value
> -----+-------
> 3.0 | 3
> (1 rows)
> cqlsh:keyspace1> select * from testdec where key >= 3 and key <= 3 ALLOW FILTERING;
> key | value
> -----+-------
> 3.0 | 3
> (1 rows)
> {code}
> The reason for this is that we use the key's bytes (as produced by
> {{BigDecimal}}) to form the token:
> {code}
> cqlsh:keyspace1> select * from testdec where token(key) = token(3);
> key | value
> -----+-------
> (0 rows)
> cqlsh:keyspace1> select * from testdec where token(key) = token(3.0);
> key | value
> -----+-------
> 3.0 | 3
> (1 rows)
> {code}
> as well as to check key matches in
> [{{BigTableReader.getPosition}}|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/io/sstable/format/big/BigTableReader.java#L217].
> The solution is to always store a canonical form of each key; in this case,
> the representation in which the decimal's _unscaled value_ is not divisible
> by 10.
> This problem may affect other types as well.