[
https://issues.apache.org/jira/browse/CASSANDRA-11067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15132517#comment-15132517
]
Jack Krupansky commented on CASSANDRA-11067:
--------------------------------------------
For reference, over in Solr land users constantly struggle with how to combine
exact and partial matching - sometimes they want an absolute literal match for
the full field/column, sometimes a wildcard on that full field, sometimes
keyword tokenization, sometimes wildcard on tokenized terms, sometimes phrases
of tokenized terms, and sometimes phrases from the full literal string.
Unfortunately, Solr doesn't have a direct answer for that, so people are forced
to copy the field (typically with a <copyField> directive) so that one field is
the literal string and the other is the tokenized field. That gives them
complete control at query time, so q=name_literal:Joe would only match when the
full name is Joe while q=name_tokenized:joe would match any name containing
the token joe. Similarly, q=name_literal:Jo* would only match names with Jo as
a prefix, while q=name_tokenized:jo* would match Joe Smith as well as
Bill Johnson.
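To make the difference concrete, here is a rough Python sketch of the two matching modes. It is not Solr code: the tokenizer, the helper names (tokenize, match_literal, match_tokenized) and the sample names are all made up for illustration, and a real Solr analyzer chain may behave differently.

```python
import fnmatch
import re

def tokenize(text):
    """Lowercase and split on non-alphanumerics (a stand-in for a Solr analyzer)."""
    return [tok for tok in re.split(r"[^0-9a-z]+", text.lower()) if tok]

def match_literal(value, pattern):
    """Case-sensitive wildcard match against the whole field value."""
    return fnmatch.fnmatchcase(value, pattern)

def match_tokenized(value, pattern):
    """Wildcard match that succeeds if any single lowercased token matches."""
    return any(fnmatch.fnmatchcase(tok, pattern.lower()) for tok in tokenize(value))

# Hypothetical sample data, not taken from the issue.
names = ["Joe", "Joe Smith", "Bill Johnson", "Mary Jones"]

literal_exact = [n for n in names if match_literal(n, "Joe")]
literal_prefix = [n for n in names if match_literal(n, "Jo*")]
token_exact = [n for n in names if match_tokenized(n, "joe")]
token_prefix = [n for n in names if match_tokenized(n, "jo*")]

print(literal_exact)   # whole value equals "Joe"
print(literal_prefix)  # whole value starts with "Jo"
print(token_exact)     # any token equals "joe"
print(token_prefix)    # any token starts with "jo"
```

The literal prefix query only sees names that start with Jo, while the tokenized prefix query also picks up Bill Johnson and Mary Jones because one of their tokens starts with jo.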
The user might also opt to copy to yet a third field which is tokenized but
with the so-called keyword tokenizer which permits the string to be normalized
but not broken into tokens. The common case is to lower case, but other common
cases would be to eliminate punctuation, replace certain prefixes and suffixes,
or whatever.
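As a rough illustration of that third option, the Python sketch below normalizes the whole value into a single token by lowercasing, dropping ASCII punctuation and collapsing whitespace. The function name normalize_keyword and the particular normalization steps are assumptions chosen for the example, not what Solr's keyword tokenizer and filters do by default.

```python
import string

# Translation table that deletes ASCII punctuation characters.
_PUNCT_TABLE = str.maketrans("", "", string.punctuation)

def normalize_keyword(value):
    """Normalize the full string as one token: lowercase, strip punctuation,
    collapse runs of whitespace. The string is never split into terms."""
    lowered = value.lower()
    no_punct = lowered.translate(_PUNCT_TABLE)
    return " ".join(no_punct.split())

def match_normalized(value, query):
    """'Exact' match after normalization: both sides go through the same steps."""
    return normalize_keyword(value) == normalize_keyword(query)

stored = "Joe  Smith, Jr."
print(normalize_keyword(stored))
print(match_normalized(stored, "JOE SMITH JR"))  # same after normalization
print(match_normalized(stored, "Joe"))           # still a full-value match only
```

The match is still on the full value, so Joe alone does not match, but differences in case, punctuation and spacing no longer matter.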
The real point there is that "exact" match is still a range of possibilities.
One of the issues here for Cassandra is whether you really want to combine
these two separate exactness semantics that Solr keeps separate.
> Improve SASI syntax
> -------------------
>
> Key: CASSANDRA-11067
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11067
> Project: Cassandra
> Issue Type: Task
> Components: CQL
> Reporter: Jonathan Ellis
> Assignee: Pavel Yaskevich
> Fix For: 3.4
>
>
> I think everyone agrees that a LIKE operator would be ideal, but that's
> probably not in scope for an initial 3.4 release.
> Still, I'm uncomfortable with the initial approach of overloading = to mean
> "satisfies index expression." The problem is that it will be very difficult
> to back out of this behavior once people are using it.
> I propose adding a new operator in the interim instead. Call it MATCHES,
> maybe. With the exact same behavior that SASI currently exposes, just with a
> separate operator rather than being rolled into =.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)