Andrés de la Peña created CASSANDRA-7575:
--------------------------------------------

             Summary: Custom 2i validation
                 Key: CASSANDRA-7575
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7575
             Project: Cassandra
          Issue Type: Improvement
          Components: API
            Reporter: Andrés de la Peña
            Priority: Minor
         Attachments: 2i_validation.patch

There are several projects using custom secondary indexes as an extension point 
to integrate C* with other systems such as Solr or Lucene. The usual approach 
is to embed third party indexing queries in CQL clauses. 

For example, [DSE 
Search|http://www.datastax.com/what-we-offer/products-services/datastax-enterprise]
 embeds Solr syntax this way:
{code}
SELECT title FROM solr WHERE solr_query='title:natio*';
{code}

[Stratio platform|https://github.com/Stratio/stratio-cassandra] embeds custom 
JSON syntax for searching in Lucene indexes:
{code}
SELECT * FROM tweets WHERE lucene='{
    filter : {
        type: "range",
        field: "time",
        lower: "2014/04/25",
        upper: "2014/04/1"
    },
    query  : {
        type: "phrase", 
        field: "body", 
        values: ["big", "data"]
    },
    sort  : {fields: [ {field:"time", reverse:true} ] }
}';
{code}

Tuplejump [Stargate|http://tuplejump.github.io/stargate/] also uses the 
Stratio's open source JSON syntax:
{code}
SELECT name,company FROM PERSON WHERE stargate ='{
    filter: {
        type: "range",
        field: "company",
        lower: "a",
        upper: "p"
    },
    sort:{
       fields: [{field:"name",reverse:true}]
    }
}';
{code}

These syntaxes are validated by the corresponding 2i implementation. This 
validation is done behind the StorageProxy command distribution. So, far as I 
know, there is no way to give rich feedback about syntax errors to CQL users.

I'm uploading a patch with some changes trying to improve this. I propose 
adding an empty validation method to SecondaryIndexSearcher that can be 
overridden by custom 2i implementations:
{code}
public void validate(List<IndexExpression> clause) {}
{code}
And call it from SelectStatement#getRangeCommand:
{code}
ColumnFamilyStore cfs = 
Keyspace.open(keyspace()).getColumnFamilyStore(columnFamily());
        for (SecondaryIndexSearcher searcher : 
cfs.indexManager.getIndexSearchersForQuery(expressions))
        {

            try
            {
                searcher.validate(expressions);
            }
            catch (RuntimeException e)
            {
                String exceptionMessage = e.getMessage();
                if (exceptionMessage != null 
                        && !exceptionMessage.trim().isEmpty())
                    throw new InvalidRequestException(
                            "Invalid index expression: " + e.getMessage());
                else
                    throw new InvalidRequestException(
                            "Invalid index expression");
            }
        }
{code}

In this way C* allows custom 2i implementations to give feedback about syntax 
errors.

We are currently using these changes in a fork with no problems.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

Reply via email to