Enhance cassandra-cli with more flexible querying and better data type support
------------------------------------------------------------------------------

                 Key: CASSANDRA-1688
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1688
             Project: Cassandra
          Issue Type: Improvement
          Components: Tools
            Reporter: Jim Ancona


While using cassandra-cli, I've wanted better support for non-String data 
types and more flexibility in the kinds of queries I can run. The attached 
patch is an attempt to address some of those issues. 

It enhances the GET command with a more flexible syntax, outlined below. The 
new syntax adds to and partially duplicates the current GET syntax, but is more 
verbose. Functionally it's a superset of the LIST command, but I haven't 
removed any functionality yet. I added support for the Thrift getSlice and 
getRangeSlices calls.
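
As a rough illustration of what those two calls need as input, the Python sketch below models the arguments with plain dataclasses. The classes are stand-ins that loosely mirror the ColumnParent, SlicePredicate, SliceRange and KeyRange structs of the Thrift interface; they are not the generated Thrift classes, and the helper predicate_for and the default count of 100 are assumptions made for this example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ColumnParent:
    column_family: str
    super_column: Optional[bytes] = None  # set only for super column families


@dataclass
class SliceRange:
    start: bytes = b""    # empty bound means "open-ended"
    finish: bytes = b""
    reversed: bool = False
    count: int = 100      # assumed default limit


@dataclass
class SlicePredicate:
    column_names: Optional[List[bytes]] = None
    slice_range: Optional[SliceRange] = None


@dataclass
class KeyRange:
    start_key: bytes = b""
    end_key: bytes = b""
    count: int = 100      # assumed default limit


def predicate_for(names=None, start=b"", finish=b"", limit=100) -> SlicePredicate:
    """Hypothetical helper: explicit names give a name list, otherwise a range."""
    if names:
        return SlicePredicate(column_names=list(names))
    return SlicePredicate(
        slice_range=SliceRange(start=start, finish=finish, count=limit)
    )
```

A single-key query would pair one key with a ColumnParent and a SlicePredicate; a multi-key query would add a KeyRange on top of the same two values.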

Syntax overview:

getSlice examples:
{noformat}
get CF2 key Long(12345) columns from 10000 to 99999999999
get SCF1 supercolumn 'super' key 'hello' columns 'world' as integer, 'moon' as ascii
{noformat}

getRangeSlices examples:
{noformat}
get CF2 keys all columns from 10000 to 99999999999
get SCF1 supercolumn 'super' keys from Integer(1234567876) limit 500 columns 'world' as integer
get CF2 keys from 'A' to 'Z' columns from 10000 to 99999999999 limit 50
{noformat}
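
To make the typed literals in these examples concrete, here is a minimal Python sketch of how a value such as Long(12345) could be turned into the bytes sent over Thrift. It is not taken from the patch: the function encode_value, the ENCODERS table and the chosen encodings are assumptions (Long as an 8-byte big-endian signed value, Integer as a minimal-length big-endian two's-complement value, and type names matched case-insensitively).

```python
import struct


def encode_long(text: str) -> bytes:
    # Assumption: Long is a fixed 8-byte big-endian signed integer.
    return struct.pack(">q", int(text))


def encode_integer(text: str) -> bytes:
    # Assumption: Integer is a variable-length big-endian two's-complement value.
    n = int(text)
    bits = (n if n >= 0 else ~n).bit_length() + 1  # magnitude bits plus sign bit
    return n.to_bytes((bits + 7) // 8, "big", signed=True)


# Hypothetical lookup table from type name to encoder.
ENCODERS = {
    "long": encode_long,
    "integer": encode_integer,
    "ascii": lambda s: s.encode("ascii"),
    "utf8": lambda s: s.encode("utf-8"),
}


def encode_value(type_name: str, text: str) -> bytes:
    """Encode the argument of a functionCall such as Long(12345)."""
    try:
        encoder = ENCODERS[type_name.lower()]
    except KeyError:
        raise ValueError(f"unknown type: {type_name}") from None
    return encoder(text)
```

For example, encode_value("Long", "12345").hex() gives '0000000000003039'.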

Pseudo-ANTLR syntax:
{noformat}
thriftGetSlice
    : K_GET columnParent 'KEY' keyValue columnSlice?

thriftGetRangeSlices
    : K_GET columnParent keyRange? columnSlice?

columnParent
    : columnFamily ('SUPERCOLUMN' superColumnName)?

columnSlice
    : (columnList | columnRange | allColumns)
 
columnList
    : 'COLUMNS' columnSpec (',' columnSpec)*
 
columnRange
    : 'COLUMNS' ('FROM' startColumn)? ('TO' endColumn)? ('AS' typeIdentifier)? ('LIMIT' limit)?
    
allColumns
    : 'COLUMNS' 'ALL' ('AS' typeIdentifier)? ('LIMIT' limit)?

keyRange
    : 'KEYS' ( ('FROM' startKeyValue)? ('TO' endKeyValue)? | 'ALL' ) ('LIMIT' limit=IntegerLiteral)?

columnSpec
    : columnName ('AS' typeIdentifier)?

value: (Identifier | IntegerLiteral | StringLiteral | functionCall );

functionCall 
    : functionName=Identifier '(' functionArgument ')'
{noformat}
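
To show how the rules above fit together, here is a small hand-written recursive-descent sketch in Python. It is not the ANTLR grammar from the patch: the names Query, Parser and parse are hypothetical, values are kept as raw token text (quotes and function calls included), and the choice between columnList and columnRange is made by peeking at the token after COLUMNS, which is an assumption about how the ambiguity would be resolved.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

# Quoted string | functionCall such as Long(12345) | bare word or integer | comma
TOKEN = re.compile(r"\s*('[^']*'|\w+\([^)]*\)|\w+|,)")


@dataclass
class Query:
    call: str  # "get_slice" or "get_range_slices"
    column_family: str
    super_column: Optional[str] = None
    key: Optional[str] = None
    key_start: Optional[str] = None
    key_end: Optional[str] = None
    key_limit: Optional[int] = None
    columns: List[Tuple[str, Optional[str]]] = field(default_factory=list)
    col_start: Optional[str] = None
    col_end: Optional[str] = None
    col_type: Optional[str] = None
    col_limit: Optional[int] = None


class Parser:
    def __init__(self, text: str):
        text, pos = text.strip(), 0
        self.toks: List[str] = []
        self.i = 0
        while pos < len(text):
            m = TOKEN.match(text, pos)
            if not m:
                raise ValueError(f"unexpected input at offset {pos}")
            self.toks.append(m.group(1))
            pos = m.end()

    def peek(self) -> Optional[str]:
        # Upper-cased so keywords match case-insensitively; None at end of input.
        return self.toks[self.i].upper() if self.i < len(self.toks) else None

    def take(self) -> str:
        if self.i >= len(self.toks):
            raise ValueError("unexpected end of input")
        self.i += 1
        return self.toks[self.i - 1]

    def accept(self, keyword: str) -> bool:
        if self.peek() == keyword:
            self.i += 1
            return True
        return False

    def column_spec(self) -> Tuple[str, Optional[str]]:
        name = self.take()
        return (name, self.take() if self.accept("AS") else None)


def parse(text: str) -> Query:
    p = Parser(text)
    if not p.accept("GET"):
        raise ValueError("expected GET")
    q = Query(call="get_range_slices", column_family=p.take())
    if p.accept("SUPERCOLUMN"):  # columnParent
        q.super_column = p.take()
    if p.accept("KEY"):  # thriftGetSlice
        q.call, q.key = "get_slice", p.take()
    elif p.accept("KEYS"):  # keyRange
        if not p.accept("ALL"):
            if p.accept("FROM"):
                q.key_start = p.take()
            if p.accept("TO"):
                q.key_end = p.take()
        if p.accept("LIMIT"):
            q.key_limit = int(p.take())
    if p.accept("COLUMNS"):  # columnSlice
        is_all = p.accept("ALL")
        if is_all or p.peek() in ("FROM", "TO", "AS", "LIMIT", None):
            if not is_all and p.accept("FROM"):
                q.col_start = p.take()
            if not is_all and p.accept("TO"):
                q.col_end = p.take()
            if p.accept("AS"):
                q.col_type = p.take()
            if p.accept("LIMIT"):
                q.col_limit = int(p.take())
        else:  # columnList
            q.columns.append(p.column_spec())
            while p.accept(","):
                q.columns.append(p.column_spec())
    if p.peek() is not None:
        raise ValueError(f"unexpected token: {p.take()}")
    return q
```

With this sketch, the presence of KEY selects the single-key call, while KEYS (or no key clause at all) selects the range call, matching the two top-level rules.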


Questions:

* Should I use a different keyword? Perhaps GET should be reserved for the 
simple bracket-based, single-key case and this functionality should use LIST or 
SELECT as a keyword.
* Should the syntax be more SQL-like? I started out doing that, but it seemed 
to me that the C* model is so different that mapping it to the SQL syntax was 
difficult. I haven't looked at Eric Evans' CQL work in any detail yet, but 
perhaps that is a better model.

Additional work:

* The KEYS and COLUMNS keywords should be added to the GET / WHERE syntax for 
getIndexedSlices.
* The LIST command should be deprecated or removed.
* The SET command should be enhanced to allow for non-string keys and column 
names.
* I've used a different model for processing the syntax tree in the code. If 
other people like it, it would make sense to convert the rest of CliClient to 
the same model.

