[jira] [Comment Edited] (CASSANDRA-15169) SASIIndex does not compare strings correctly

mazhenlin (Jira) Tue, 15 Oct 2019 02:58:23 -0700


    [ 
https://issues.apache.org/jira/browse/CASSANDRA-15169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16951778#comment-16951778
 ]


mazhenlin edited comment on CASSANDRA-15169 at 10/15/19 9:57 AM:
-----------------------------------------------------------------

[~mck], since the code contains many restrictions that prevent applying RANGE 
to literal indexes (e.g. ColumnIndex.supports), I had never considered the 
operation in your new case. If we want to remove these restrictions for 
PREFIX mode, the following needs to be done:

 

1 The expected results of OnDiskIndexTest.testNotEqualsQueryForStrings need to 
be changed, because the common '>' semantics conflict with using NEQ as 
"not-like-prefix".

 

2 Make OnDiskIndex behave correctly in comparisons. This can be done easily.

 

3 Support RANGE for MemIndex. This might be a little complicated. Since 
TrieMemIndex does not support RANGE, we need to either implement a new 
radix-tree class that supports the RANGE operation, or use SkipListMemIndex 
for PREFIX mode and treat "a like b%" as "a > b and a < bytes(b)+1".
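
The second option in item 3 can be sketched roughly as follows. This is only an 
illustration, not SASI code: the class and method names are hypothetical, a 
plain ConcurrentSkipListMap stands in for the in-memory index, and 
Arrays.compareUnsigned requires Java 9 or later. The sketch assumes an 
inclusive lower bound (a >= b) so that the term equal to the prefix itself 
still matches, and it carries past trailing 0xFF bytes when computing 
bytes(b)+1.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.NavigableMap;
import java.util.concurrent.ConcurrentSkipListMap;

// Hypothetical sketch: answer "a LIKE 'b%'" with a range scan over a sorted map.
class PrefixRangeSketch {
    // Unsigned lexicographic byte order (Java 9+).
    static final Comparator<byte[]> UNSIGNED = Arrays::compareUnsigned;

    // Smallest byte string greater than every string that starts with prefix,
    // i.e. "bytes(b)+1". Returns null when no such bound exists
    // (empty prefix, or a prefix made only of 0xFF bytes).
    static byte[] upperBound(byte[] prefix) {
        for (int i = prefix.length - 1; i >= 0; i--) {
            if (prefix[i] != (byte) 0xFF) {
                byte[] bound = Arrays.copyOf(prefix, i + 1); // drop trailing 0xFF bytes
                bound[i]++;
                return bound;
            }
        }
        return null;
    }

    // Builds a toy index that maps each term's UTF-8 bytes to the term itself.
    static NavigableMap<byte[], String> indexOf(String... terms) {
        NavigableMap<byte[], String> index = new ConcurrentSkipListMap<>(UNSIGNED);
        for (String term : terms) {
            index.put(term.getBytes(StandardCharsets.UTF_8), term);
        }
        return index;
    }

    // Returns all indexed terms in [prefix, upperBound(prefix)), in sorted order.
    static List<String> likePrefix(NavigableMap<byte[], String> index, String prefix) {
        byte[] lower = prefix.getBytes(StandardCharsets.UTF_8);
        byte[] upper = upperBound(lower);
        NavigableMap<byte[], String> hits = (upper == null)
                ? index.tailMap(lower, true)
                : index.subMap(lower, true, upper, false);
        return new ArrayList<>(hits.values());
    }

    public static void main(String[] args) {
        NavigableMap<byte[], String> index = indexOf("a", "ab", "abc", "abd", "ac", "b");
        System.out.println(likePrefix(index, "ab")); // terms that start with "ab"
    }
}
```

With a strict lower bound, as literally written in item 3, the term "ab" itself 
would be excluded from a query for 'ab%', which is probably not what a LIKE 
query should do.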

 

Besides, RANGE operations are obviously meaningful only for untokenized 
indexes. We need to think about whether these changes are worthwhile and 
whether they would confuse users.



> SASIIndex does not compare strings correctly
> --------------------------------------------
>
>                 Key: CASSANDRA-15169
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15169
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Feature/SASI
>            Reporter: mazhenlin
>            Assignee: mazhenlin
>            Priority: Normal
>         Attachments: CASSANDRA-15169-v1.patch, CASSANDRA-15169-v2.patch
>
>
> In our scenario, we need to query with '>' conditions on string columns, so I 
> created an index with is_literal = false, as follows:
>  
> {code:java}
> CREATE TABLE test (id int primary key, t text);
> CREATE CUSTOM INDEX ON test (t) USING 
> 'org.apache.cassandra.index.sasi.SASIIndex' WITH OPTIONS = {'is_literal': 
> 'false'};
> {code}
>  I then inserted a record and ran a query:
>  
> {code:java}
> insert into test(id,t) values(1,'abc');
> select * from test where t > 'ab';
> {code}
> At first it worked, but after a flush the query returned no records.
> I read the SASIIndex code and found the cause: in the 
> {code:java}
> Expression.isLowerSatisfiedBy{code}
> function,
> {code:java}
> term.compareTo{code}
> is called with checkFully=false, so the string 'abc' is compared using 
> only its first 2 characters (the length of the expression value).
>  
> I have written a unit test for this case and fixed it.
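
The effect described in the issue can be reproduced in isolation with a small 
sketch. It is not the SASI implementation: the class, the compare method and 
its checkFully flag are hypothetical stand-ins that only mimic the behaviour 
described above, and Arrays.compareUnsigned requires Java 9 or later.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical sketch of a term comparison with a "checkFully" switch.
class PartialCompareSketch {
    // Compares term with bound over their common length only; when checkFully
    // is true and the common part is equal, the longer value sorts higher.
    static int compare(byte[] term, byte[] bound, boolean checkFully) {
        int len = Math.min(term.length, bound.length);
        int common = Arrays.compareUnsigned(term, 0, len, bound, 0, len);
        if (common != 0 || !checkFully) {
            return common;
        }
        return Integer.compare(term.length, bound.length);
    }

    // Evaluates "term > bound" the way an exclusive lower bound would.
    static boolean greaterThan(String term, String bound, boolean checkFully) {
        return compare(term.getBytes(StandardCharsets.UTF_8),
                       bound.getBytes(StandardCharsets.UTF_8),
                       checkFully) > 0;
    }

    public static void main(String[] args) {
        // Only "ab" vs "ab" is compared, so 'abc' looks equal to the bound.
        System.out.println(greaterThan("abc", "ab", false));
        // The full comparison also takes the extra 'c' into account.
        System.out.println(greaterThan("abc", "ab", true));
    }
}
```

With the partial comparison, 'abc' compares as equal to 'ab', so a strict '>' 
check rejects the row; the full comparison accepts it.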



