[ 
https://issues.apache.org/jira/browse/CASSANDRA-5198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13566877#comment-13566877
 ] 

Edward Capriolo commented on CASSANDRA-5198:
--------------------------------------------

I like #1.
I advocate proper type validation. We recently had a MySQL update that was 
wrapping booleans in 'T', 'F', 'true', and depending on your database the 
results are different or non-intuitive. Personally, I do not like "loose 
validation"; it encourages ambiguity. Hive went through something similar: 
http://grokbase.com/t/hive/dev/125sw56a78/non-string-partition-columns. There 
was much ambiguity and many misconceptions around "loose validation", and it 
became tech debt that was hard to dig out of.
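
To make "proper type validation" concrete, here is a minimal sketch of the 
strict behavior. It is not Cassandra code: the names ACCEPTED, InvalidRequest 
and check_token_argument are hypothetical, and Python literals stand in for 
CQL constants. The idea is only that an integer handed to token() for a text 
column gets rejected instead of being silently turned into a string.

```python
class InvalidRequest(Exception):
    """Hypothetical error type for a literal that does not match the column type."""


# Assumed, simplified mapping from a CQL type name to the Python literal
# types that would be accepted for it (illustration only).
ACCEPTED = {
    "text": (str,),
    "int": (int,),
    "boolean": (bool,),
}


def check_token_argument(column_type, literal):
    """Return the literal unchanged if it matches column_type, else raise."""
    accepted = ACCEPTED[column_type]
    # bool is a subclass of int in Python, so rule it out explicitly
    # unless the column really is a boolean.
    if isinstance(literal, bool) and bool not in accepted:
        raise InvalidRequest(
            "boolean literal is not valid for column type %s" % column_type
        )
    if not isinstance(literal, accepted):
        raise InvalidRequest(
            "%r is not a valid literal for column type %s" % (literal, column_type)
        )
    return literal
```

With this check, token('bsmith') against a text column passes, while token(0) 
against the same column fails loudly rather than returning surprising rows.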
                
> token () function automatically coerces types leading to confusing output
> -------------------------------------------------------------------------
>
>                 Key: CASSANDRA-5198
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5198
>             Project: Cassandra
>          Issue Type: Bug
>    Affects Versions: 1.2.1
>            Reporter: Edward Capriolo
>            Priority: Minor
>             Fix For: 1.2.2
>
>         Attachments: 0001-Respect-CQL3-constant-types.txt, 
> 0002-Improve-printing-of-type-in-error-message.txt, 
> 0003-Respect-partitioner-type-for-Token-function.txt
>
>
> This works as it should.
> {noformat}
> cqlsh:movies> select * from users where token (username) > token('') ;
>  username  | created_date | email | firstname | lastname | password
> -----------+--------------+-------+-----------+----------+----------
>     bsmith |         null |  null |       bob |    smith |     null
>  scapriolo |         null |  null |    stacey | capriolo |     null
>  ecapriolo |         null |  null |    edward | capriolo |     null
> cqlsh:movies> select * from users where token (username) > token('bsmith') ;
>  username  | created_date | email | firstname | lastname | password
> -----------+--------------+-------+-----------+----------+----------
>  scapriolo |         null |  null |    stacey | capriolo |     null
>  ecapriolo |         null |  null |    edward | capriolo |     null
> cqlsh:movies> select * from users where token (username) > token('scapriolo') ;
>  username  | created_date | email | firstname | lastname | password
> -----------+--------------+-------+-----------+----------+----------
>  ecapriolo |         null |  null |    edward | capriolo |     null
> {noformat}
> But look what happens when you supply numbers into the token function.
> {noformat}
> cqlsh:movies> select * from users where token (username) > token(0) ;
>  username  | created_date | email | firstname | lastname | password
> -----------+--------------+-------+-----------+----------+----------
>  ecapriolo |         null |  null |    edward | capriolo |     null
> cqlsh:movies> select * from users where token (username) > token(1134314) ;
>  username  | created_date | email | firstname | lastname | password
> -----------+--------------+-------+-----------+----------+----------
>     bsmith |         null |  null |       bob |    smith |     null
>  scapriolo |         null |  null |    stacey | capriolo |     null
>  ecapriolo |         null |  null |    edward | capriolo |     null
> cqlsh:movies> select * from users where token (username) > token(113431431) ;
>  username  | created_date | email | firstname | lastname | password
> -----------+--------------+-------+-----------+----------+----------
>  scapriolo |         null |  null |    stacey | capriolo |     null
>  ecapriolo |         null |  null |    edward | capriolo |     null
> cqlsh:movies> select * from users where token (username) > token(1134) ;
>  username  | created_date | email | firstname | lastname | password
> -----------+--------------+-------+-----------+----------+----------
>  ecapriolo |         null |  null |    edward | capriolo |     null
> cqlsh:movies> select * from users where token (username) > token(1134434) ;
>  username  | created_date | email | firstname | lastname | password
> -----------+--------------+-------+-----------+----------+----------
>  scapriolo |         null |  null |    stacey | capriolo |     null
> {noformat}
> This does not make sense to me. The token function is apparently converting 
> integers to strings, leading to seemingly unpredictable results. 
> I also find this syntax odd. I feel like I should be able to say 
> 'token(username) > 0 and token(username) < 10' because from the thrift side I 
> can page tokens or I can page keys. In this case, I guess, I am only able to 
> page keys because the token is not returned to the user.
> Is token 0 = ''? How do I arrive at the minimal token for an int column? 
> Should the token() function at least be smart enough to reject integers for 
> string columns?
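
The "seemingly unpredictable" results described above can be reproduced in a 
small sketch. It assumes a RandomPartitioner-style token (the absolute value 
of the key's MD5 digest read as a signed big-endian integer) and assumes that 
the coercion amounts to hashing the literal's string form. The cluster in the 
report may use a different partitioner, so the ordering here need not match 
the cqlsh output; md5_token and rows_after are illustrative names only.

```python
import hashlib


def md5_token(key):
    """MD5-based token for a key given as bytes (RandomPartitioner-style sketch)."""
    digest = hashlib.md5(key).digest()
    return abs(int.from_bytes(digest, byteorder="big", signed=True))


usernames = ["bsmith", "scapriolo", "ecapriolo"]

# Rows come back in token order, not in key order.
ring_order = sorted(usernames, key=lambda u: md5_token(u.encode("utf-8")))


def rows_after(literal):
    """Usernames whose token is greater than token(literal), with loose coercion.

    Any literal, including an integer, is converted to its string form
    before hashing, which is the behavior the report questions.
    """
    bound = md5_token(str(literal).encode("utf-8"))
    return [u for u in ring_order if md5_token(u.encode("utf-8")) > bound]


if __name__ == "__main__":
    for literal in ["", "bsmith", 0, 1134, 1134314]:
        print(repr(literal), "->", rows_after(literal))
```

Under these assumptions token(0) is just token('0'), a hash that lands at an 
arbitrary point on the ring, so numerically close arguments such as 1134 and 
1134314 can cut the result set at unrelated places.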

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira