[jira] [Comment Edited] (CASSANDRA-10539) Different encodings used between nodes can cause inconsistently generated prepared statement ids

Andy Tolbert (JIRA) Fri, 16 Oct 2015 11:23:26 -0700

    [ https://issues.apache.org/jira/browse/CASSANDRA-10539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14961136#comment-14961136 ]


Andy Tolbert edited comment on CASSANDRA-10539 at 10/16/15 6:23 PM:
--------------------------------------------------------------------

Yep, it appears to be the case (reproduced against 3.0.0-rc1, 2.0.17 and 2.1.9; 
I haven't tried 2.2.2 but assume it's the same).  I don't consider this too big 
of a problem since it requires different nodes to be using different encodings, 
which may cause other problems.


was (Author: andrew.tolbert):
Yep, it appears to be the case (reproduced against 3.0.0-rc1, 2.0.17 and 2.1.9; 
I haven't tried 2.2.2 but assume it's the same).  I don't consider this too big 
of a problem since it requires different instances to be using different 
encodings, which may cause other problems.

> Different encodings used between nodes can cause inconsistently generated 
> prepared statement ids 
> -------------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-10539
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10539
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Andy Tolbert
>            Priority: Minor
>
> [From the java-driver mailing 
> list|https://groups.google.com/a/lists.datastax.com/forum/#!topic/java-driver-user/3Aa7s0u2ZrI]
>  / [JAVA-955|https://datastax-oss.atlassian.net/browse/JAVA-955]
> If nodes in your cluster use different default character sets, it's possible 
> for them to generate different prepared statement ids for the same 
> 'keyspace + query string' combination.  I imagine this is not a very typical 
> or desired configuration (thus the low severity).
> This is because 
> [MD5Digest.compute(String)|https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/utils/MD5Digest.java#L51-L54]
>  uses 
> [String.getBytes()|http://docs.oracle.com/javase/7/docs/api/java/lang/String.html#getBytes()]
>  which relies on the default charset.
> In the general case this is fine, but if you use some characters in your 
> query string such as 
> [Character.MAX_VALUE|http://docs.oracle.com/javase/7/docs/api/java/lang/Character.html#MAX_VALUE]
>  ('\uffff') the byte representation may vary based on the encoding.
> I was able to reproduce this by configuring a 2-node cluster with node1 using 
> file.encoding {{UTF-8}} and node2 using file.encoding {{ISO-8859-1}}.   The 
> java-driver test that demonstrates this can be found 
> [here|https://github.com/datastax/java-driver/blob/java955/driver-core/src/test/java/com/datastax/driver/core/RetryOnUnpreparedTest.java].
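
The effect described above can be sketched outside of Cassandra. The following is a minimal illustration, not the actual MD5Digest code: the class name {{EncodingDigestDemo}}, the helper {{md5Hex}} and the sample query strings are hypothetical, and an explicit charset is passed in to stand in for each node's default charset (file.encoding).

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical demo class; not part of Cassandra or the java-driver.
class EncodingDigestDemo {

    // Hypothetical helper: MD5 over the bytes of s in an explicit charset.
    // Passing the charset simulates what String.getBytes() would do on a
    // node whose default charset is that charset.
    static String md5Hex(String s, Charset charset) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(s.getBytes(charset));
            StringBuilder sb = new StringBuilder();
            for (byte b : digest) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            // MD5 is required on every Java platform, so this is unexpected.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Sample query strings (assumed for illustration only).
        String ascii = "SELECT * FROM ks.t WHERE v = 'a'";
        String nonLatin = "SELECT * FROM ks.t WHERE v = '" + Character.MAX_VALUE + "'";

        // ASCII-only text has the same bytes in both charsets.
        System.out.println(md5Hex(ascii, StandardCharsets.UTF_8)
                .equals(md5Hex(ascii, StandardCharsets.ISO_8859_1)));

        // '\uffff' is three bytes in UTF-8 but is replaced by a single '?'
        // in ISO-8859-1, so the digests differ.
        System.out.println(md5Hex(nonLatin, StandardCharsets.UTF_8)
                .equals(md5Hex(nonLatin, StandardCharsets.ISO_8859_1)));
    }
}
```

In this sketch, passing one fixed charset such as {{StandardCharsets.UTF_8}} to {{getBytes}} makes the digest independent of the JVM default; whether that is the appropriate change for MD5Digest.compute(String) is left to the ticket.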



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
