[ https://issues.apache.org/jira/browse/CASSANDRA-7575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14074678#comment-14074678 ]
Robert Stupp commented on CASSANDRA-7575:
-----------------------------------------

The custom column solution is a bit dirty in that you query column 'foo' for something that is in column 'bar'.

I think we can add support for 2i in UDFs (as an extension to UDFs, not with the basics covered by CASSANDRA-7395) - maybe by providing some information like table/row/column metadata to the UDF.

UDFs could also help to cover "complex" queries against lucene/solr/elasticsearch (conditional filter/query against multiple fields, highlighting, etc). For example, something like the query below would get every row with a full-text-search score > 0.75. But that's stuff for a separate ticket - the UDF impl would then return a set of primary keys - so it's a "query rewrite" behind the scenes. But the syntax is much more obvious.

{noformat}
SELECT * FROM my_super_table WHERE elasticsearch('{
    filter: { type: "range", field: "company", lower: "a", upper: "p" },
    sort:{ fields: [{field:"name",reverse:true}] }
}') > 0.75
{noformat}

> Custom 2i validation
> --------------------
>
>                 Key: CASSANDRA-7575
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7575
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: API
>            Reporter: Andrés de la Peña
>            Assignee: Andrés de la Peña
>            Priority: Minor
>              Labels: 2i, cql3, secondaryIndex, secondary_index, select
>         Attachments: 2i_validation.patch
>
>
> There are several projects using custom secondary indexes as an extension
> point to integrate C* with other systems such as Solr or Lucene. The usual
> approach is to embed third party indexing queries in CQL clauses.
> For example, [DSE
> Search|http://www.datastax.com/what-we-offer/products-services/datastax-enterprise]
> embeds Solr syntax this way:
> {code}
> SELECT title FROM solr WHERE solr_query='title:natio*';
> {code}
> [Stratio platform|https://github.com/Stratio/stratio-cassandra] embeds custom
> JSON syntax for searching in Lucene indexes:
> {code}
> SELECT * FROM tweets WHERE lucene='{
>     filter : {
>         type: "range",
>         field: "time",
>         lower: "2014/04/25",
>         upper: "2014/04/1"
>     },
>     query : {
>         type: "phrase",
>         field: "body",
>         values: ["big", "data"]
>     },
>     sort : {fields: [ {field:"time", reverse:true} ] }
> }';
> {code}
> Tuplejump [Stargate|http://tuplejump.github.io/stargate/] also uses
> Stratio's open source JSON syntax:
> {code}
> SELECT name,company FROM PERSON WHERE stargate ='{
>     filter: {
>         type: "range",
>         field: "company",
>         lower: "a",
>         upper: "p"
>     },
>     sort:{
>         fields: [{field:"name",reverse:true}]
>     }
> }';
> {code}
> These syntaxes are validated by the corresponding 2i implementation. This
> validation is done behind the StorageProxy command distribution. So, as far
> as I know, there is no way to give rich feedback about syntax errors to CQL
> users.
> I'm uploading a patch with some changes trying to improve this.
> I propose adding an empty validation method to SecondaryIndexSearcher that
> can be overridden by custom 2i implementations:
> {code}
> public void validate(List<IndexExpression> clause) {}
> {code}
> And call it from SelectStatement#getRangeCommand:
> {code}
> ColumnFamilyStore cfs =
>     Keyspace.open(keyspace()).getColumnFamilyStore(columnFamily());
> for (SecondaryIndexSearcher searcher :
>         cfs.indexManager.getIndexSearchersForQuery(expressions))
> {
>     try
>     {
>         searcher.validate(expressions);
>     }
>     catch (RuntimeException e)
>     {
>         String exceptionMessage = e.getMessage();
>         if (exceptionMessage != null
>                 && !exceptionMessage.trim().isEmpty())
>             throw new InvalidRequestException(
>                     "Invalid index expression: " + e.getMessage());
>         else
>             throw new InvalidRequestException(
>                     "Invalid index expression");
>     }
> }
> {code}
> In this way C* allows custom 2i implementations to give feedback about syntax
> errors.
> We are currently using these changes in a fork with no problems.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
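
For illustration only, here is a minimal, self-contained sketch of how a custom 2i implementation might use the proposed validate hook. None of it is taken from the attached patch. IndexExpression, SecondaryIndexSearcher and InvalidRequestException below are simplified stand-ins for the Cassandra classes of the same name, and BraceCheckingSearcher, ValidationSketch and the brace-balance check are hypothetical.

```java
import java.util.List;

// Simplified stand-in for Cassandra's InvalidRequestException (assumption).
class InvalidRequestException extends Exception {
    InvalidRequestException(String message) { super(message); }
}

// Simplified stand-in for IndexExpression: a column name and a raw string value.
class IndexExpression {
    final String column;
    final String value;
    IndexExpression(String column, String value) {
        this.column = column;
        this.value = value;
    }
}

// Stand-in for SecondaryIndexSearcher carrying the proposed empty hook.
class SecondaryIndexSearcher {
    public void validate(List<IndexExpression> clause) {}
}

// Hypothetical custom searcher: rejects values whose braces do not balance.
class BraceCheckingSearcher extends SecondaryIndexSearcher {
    @Override
    public void validate(List<IndexExpression> clause) {
        for (IndexExpression expr : clause) {
            int depth = 0;
            for (char c : expr.value.toCharArray()) {
                if (c == '{') depth++;
                else if (c == '}') depth--;
                if (depth < 0) break; // closing brace before any opening one
            }
            if (depth != 0)
                throw new IllegalArgumentException(
                        "unbalanced braces in column '" + expr.column + "'");
        }
    }
}

class ValidationSketch {
    // Same translation as in the ticket: RuntimeException -> InvalidRequestException.
    static void validateAll(List<SecondaryIndexSearcher> searchers,
                            List<IndexExpression> expressions)
            throws InvalidRequestException {
        for (SecondaryIndexSearcher searcher : searchers) {
            try {
                searcher.validate(expressions);
            } catch (RuntimeException e) {
                String message = e.getMessage();
                if (message != null && !message.trim().isEmpty())
                    throw new InvalidRequestException(
                            "Invalid index expression: " + message);
                throw new InvalidRequestException("Invalid index expression");
            }
        }
    }
}
```

With this sketch, a value such as '{ filter: { }' would be rejected before any command is distributed, and the searcher's own message would be carried in the InvalidRequestException.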