[jira] [Updated] (CASSANDRA-4920) Add Collation to abstract type to provide standard sort order for Strings

Sidharth (JIRA) Tue, 06 Nov 2012 08:54:14 -0800

     [ 
https://issues.apache.org/jira/browse/CASSANDRA-4920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Sidharth updated CASSANDRA-4920:
--------------------------------

    Description: 
Adding a way to sort UTF8 strings according to the collation semantics 
described below would be useful. 

Use case: Say you have wide rows where you cannot use Cassandra's standard 
indexes (secondary/primary index). Let's say each column has a string value 
that is either alphanumeric or purely numeric, and you want an index by value. 
More specifically, you want to slice a range over a bunch of column values and 
say "get me all the IDs associated with values ABC to XYZ". As usual, I would 
index these values in a materialized view.

More specifically, I create an index CF, add these values into a 
CompositeType column, and SliceRange over them for the indexing to work. I 
don't really care whether a value is alphabetic or numeric, as long as it is 
ordered by the following collation semantics:
1) If the string is numeric, it should compare like a number.
2) If it is alphabetic, it should compare like a normal string.
3) If it is alphanumeric, a contiguous sequence of digits in the string 
should be compared as a number, e.g. "c10" > "c2".
4) UTF8 type strings are assumed everywhere.
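
For illustration only, the four rules above can be sketched as a plain Java 
comparator. This is not a patch and not an existing Cassandra type: the class 
name NaturalOrderComparator and its helper methods are hypothetical. It also 
assumes that only ASCII digits 0-9 count as numbers, that numbers are 
non-negative integers, and that non-digit characters compare by UTF-16 code 
unit, as String.compareTo does.

```java
import java.util.Comparator;

/** Hypothetical sketch of the proposed collation; not part of Cassandra. */
final class NaturalOrderComparator implements Comparator<String> {

    @Override
    public int compare(String a, String b) {
        int i = 0, j = 0;
        while (i < a.length() && j < b.length()) {
            char ca = a.charAt(i), cb = b.charAt(j);
            if (isDigit(ca) && isDigit(cb)) {
                // Rules 1 and 3: compare whole digit runs as numbers.
                int endA = runEnd(a, i), endB = runEnd(b, j);
                int cmp = compareDigitRuns(a, i, endA, b, j, endB);
                if (cmp != 0) return cmp;
                i = endA;
                j = endB;
            } else {
                // Rule 2: everything else compares like a normal string.
                if (ca != cb) return Character.compare(ca, cb);
                i++;
                j++;
            }
        }
        // At least one string is exhausted; the one with text left sorts later.
        int rest = Integer.compare(a.length() - i, b.length() - j);
        if (rest != 0) return rest;
        // Equal up to leading zeros ("007" vs "7"): fall back to plain order
        // so that compare() returns 0 only for identical strings.
        return a.compareTo(b);
    }

    // Assumption: only ASCII digits are treated as numeric.
    private static boolean isDigit(char c) {
        return c >= '0' && c <= '9';
    }

    // Index just past the digit run that starts at 'from'.
    private static int runEnd(String s, int from) {
        int end = from;
        while (end < s.length() && isDigit(s.charAt(end))) end++;
        return end;
    }

    // Compares two digit runs numerically without parsing them, so runs of
    // any length work: skip leading zeros, then the longer run is larger,
    // and runs of equal length compare digit by digit.
    private static int compareDigitRuns(String a, int startA, int endA,
                                        String b, int startB, int endB) {
        while (startA < endA && a.charAt(startA) == '0') startA++;
        while (startB < endB && b.charAt(startB) == '0') startB++;
        int lenCmp = Integer.compare(endA - startA, endB - startB);
        if (lenCmp != 0) return lenCmp;
        for (int k = 0; k < endA - startA; k++) {
            int cmp = Character.compare(a.charAt(startA + k), b.charAt(startB + k));
            if (cmp != 0) return cmp;
        }
        return 0;
    }
}
```

With this ordering, new NaturalOrderComparator().compare("c10", "c2") is 
positive and compare("12", "10000") is negative. An in-memory sorted 
collection built with the comparator (for example a java.util.TreeSet) could 
then answer "ABC to XYZ" style range lookups, which is roughly what a 
SliceRange over the index CF would do if the column comparator followed these 
rules.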

How does this help?
1) You don't end up creating multiple CFs for different value types.
2) You don't have to write boilerplate in the application to do complicated 
type detection and handle this manually.

  was:
Adding a way to sort UTF8 based on a standard order (collation) is very useful. 
Say for example you have wide rows where you cannot use Cassandra's standard 
indexes (secondary/primary index). Let's say each column has a string value that 
is either alphanumeric or purely numeric.

Now let's say I want to index these values in a materialized view so I can 
look up things by range of values (a range makes sense given a standard 
ordering over my alphanumeric and numeric strings, i.e. "12" < "10000").

More specifically, I add these values into a CompositeType and SliceRange over 
them for the index to work. I don't really care whether a value is alphabetic 
or numeric; it should be in the order given by the following collation semantics:
1) If the string is numeric, it should compare like a number.
2) If it is alphabetic, it should compare like a normal string.
3) If it is alphanumeric, a contiguous sequence of digits in the string 
should be compared as a number, e.g. "c10" > "c2".
4) UTF8 type strings are assumed everywhere.

    
> Add Collation to abstract type to provide standard sort order for Strings
> -------------------------------------------------------------------------
>
>                 Key: CASSANDRA-4920
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4920
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: API, Core
>    Affects Versions: 1.2.0 beta 1
>            Reporter: Sidharth
>            Priority: Minor
>              Labels: cassandra
>
> Adding a way to sort UTF8 strings according to the collation semantics 
> described below would be useful. 
> Use case: Say you have wide rows where you cannot use Cassandra's standard 
> indexes (secondary/primary index). Let's say each column has a string value 
> that is either alphanumeric or purely numeric, and you want an index by 
> value. More specifically, you want to slice a range over a bunch of column 
> values and say "get me all the IDs associated with values ABC to XYZ". As 
> usual, I would index these values in a materialized view.
> More specifically, I create an index CF, add these values into a 
> CompositeType column, and SliceRange over them for the indexing to work. I 
> don't really care whether a value is alphabetic or numeric, as long as it is 
> ordered by the following collation semantics:
> 1) If the string is numeric, it should compare like a number.
> 2) If it is alphabetic, it should compare like a normal string.
> 3) If it is alphanumeric, a contiguous sequence of digits in the string 
> should be compared as a number, e.g. "c10" > "c2".
> 4) UTF8 type strings are assumed everywhere.
> How does this help?
> 1) You don't end up creating multiple CFs for different value types.
> 2) You don't have to write boilerplate in the application to do complicated 
> type detection and handle this manually.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
