Johnny Miller created CASSANDRA-8790:
----------------------------------------

             Summary: Improve handling of unicode characters in text fields
                 Key: CASSANDRA-8790
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8790
             Project: Cassandra
          Issue Type: Improvement
            Reporter: Johnny Miller
            Priority: Minor


Currently, to store a string/text value that contains a unicode character and 
then subsequently be able to query it in CQLSH, I need to store the field as a 
blob via the blobAsText and textAsBlob functions. 

This is not optimal: it would be better if CQLSH handled such characters 
directly, rather than users having to model their data around this limitation.

For example:

{code:title=Example Code|borderStyle=solid}

String createTableCql = "CREATE TABLE IF NOT EXISTS test_ks.testunicode "
        + "(id blob PRIMARY KEY, inserted_on timestamp, lorem text)";
session.execute(createTableCql);
System.out.println("Table created.");           
                
String dimension1 = "state";
String dimension2 = "card";
String key = dimension1 + '\u001C' + dimension2;
Date now = new Date();
String lorem = "Lorem ipsum dolor sit amet.";
                
String insertcql = "INSERT INTO testunicode (id, inserted_on, lorem) "
        + "VALUES (textAsBlob(?), ?, ?)";
PreparedStatement ps = session.prepare(insertcql);
BoundStatement bs = new BoundStatement(ps);
bs.bind(key, now, lorem);
session.execute(bs);
System.out.println("Row inserted with key "+key);               
                
String selectcql = "SELECT blobAsText(id) AS id, inserted_on, lorem "
        + "FROM testunicode WHERE id = textAsBlob(?)";
PreparedStatement ps2 = session.prepare(selectcql);
BoundStatement bs2 = new BoundStatement(ps2);
bs2.bind(key);
ResultSet results = session.execute(bs2);
                
System.out.println("Got results...");
                
for (Row row : results) {
    System.out.println(String.format("%-30s\t%-20s\t%-20s",
            row.getString("id"), row.getDate("inserted_on"), row.getString("lorem")));
}

{code}

And to query via CQLSH:


select * from testunicode where id = 0x73746174651c63617264 ;

 id                     | inserted_on              | lorem
------------------------+--------------------------+-----------------------------
 0x73746174651c63617264 | 2015-02-11 20:32:20+0000 | Lorem ipsum dolor sit amet.
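The blob literal used in the CQLSH query above is the UTF-8 encoding of the key, written as hex. As a rough sketch of how that literal can be derived by hand, the snippet below uses a hypothetical helper, toBlobLiteral, which is not part of Cassandra or the Java driver and is only assumed here for illustration:

```java
import java.nio.charset.StandardCharsets;

public class BlobLiteral {

    // Hypothetical helper (an assumption for illustration, not a Cassandra or
    // driver API): renders the UTF-8 bytes of a string as a CQL blob literal.
    public static String toBlobLiteral(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        StringBuilder sb = new StringBuilder("0x");
        for (byte b : bytes) {
            // Mask to an unsigned value so bytes >= 0x80 print as two hex digits.
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Same key as in the example code: "state" + U+001C + "card".
        String key = "state" + '\u001C' + "card";
        System.out.println(toBlobLiteral(key)); // prints 0x73746174651c63617264
    }
}
```

The 1c in the middle of the literal is the U+001C separator, which is the character that cannot be entered directly in a CQLSH text literal here.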



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
