Re: given partition key and secondary index, still require allow_filtering?

2016-10-31 Thread DuyHai Doan
Cassandra's native secondary index does not perform well with inequalities
(<, >, <=, >=). In your case, even if you provide the partition key (which is
a very good idea), Cassandra still needs to perform a full scan on the local
node to find every score matching the inequality. That is pretty expensive,
hence the ALLOW FILTERING requirement.

The general rule of thumb for production is: ALLOW FILTERING == SURELY TIMEOUT
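
One possible way around it, as a rough sketch only: if range queries on score within a single id are the main access pattern, score could be made a clustering column, so the inequality becomes a contiguous slice inside the partition instead of a filter. The table name user_categories_by_score below is hypothetical and not part of your schema, and I am assuming the table options can be left at their defaults.

```cql
-- Hypothetical query table (name is an assumption, not from the original schema).
-- score is the first clustering column, so rows in a partition are sorted by it.
CREATE TABLE profile_new.user_categories_by_score (
    id bigint,
    score double,
    category int,
    PRIMARY KEY (id, score, category)
) WITH CLUSTERING ORDER BY (score DESC, category ASC);

-- Partition key equality plus a range on the first clustering column:
-- no secondary index and no ALLOW FILTERING required.
SELECT * FROM profile_new.user_categories_by_score
WHERE id = 3674 AND score > 0.5;
```

The trade-off is that score is now part of the primary key, so changing the score of a category means deleting the old row and inserting a new one rather than updating in place.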

On Mon, Oct 31, 2016 at 9:00 AM, Zao Liu wrote:

> Hi,
>
> I created a table with the following schema:
>
> CREATE TABLE profile_new.user_categories_1477899735 (
>     id bigint,
>     category int,
>     score double,
>     PRIMARY KEY (id, category)
> ) WITH CLUSTERING ORDER BY (category ASC)
>     AND bloom_filter_fp_chance = 0.01
>     AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
>     AND comment = ''
>     AND compaction = {'class':
>         'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy',
>         'max_threshold': '32', 'min_threshold': '4'}
>     AND compression = {'chunk_length_in_kb': '64', 'class':
>         'org.apache.cassandra.io.compress.LZ4Compressor'}
>     AND crc_check_chance = 1.0
>     AND dclocal_read_repair_chance = 0.1
>     AND default_time_to_live = 0
>     AND gc_grace_seconds = 864000
>     AND max_index_interval = 2048
>     AND memtable_flush_period_in_ms = 0
>     AND min_index_interval = 128
>     AND read_repair_chance = 0.0
>     AND speculative_retry = '99PERCENTILE';
>
> CREATE INDEX user_categories_1477899735_score_idx ON
>     profile_new.user_categories_1477899735 (score);
>
>
> cqlsh:profile_new> select * from user_categories_1477899735 where id=3674;
>
>
> But somehow when I pass partition key and secondary index key, it still
> complains:
>
> cqlsh:profile_new> select * from user_categories_1477899735 where id=3674
> and score > 0.5;
>
> *InvalidRequest: Error from server: code=2200 [Invalid query]
> message="Cannot execute this query as it might involve data filtering and
> thus may have unpredictable performance. If you want to execute this query
> despite the performance unpredictability, use ALLOW FILTERING"*
>
> cqlsh:profile_new>
>
>
>


given partition key and secondary index, still require allow_filtering?

2016-10-31 Thread Zao Liu
Hi,

I created a table with the following schema:

CREATE TABLE profile_new.user_categories_1477899735 (
    id bigint,
    category int,
    score double,
    PRIMARY KEY (id, category)
) WITH CLUSTERING ORDER BY (category ASC)
    AND bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
    AND comment = ''
    AND compaction = {'class':
        'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy',
        'max_threshold': '32', 'min_threshold': '4'}
    AND compression = {'chunk_length_in_kb': '64', 'class':
        'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND crc_check_chance = 1.0
    AND dclocal_read_repair_chance = 0.1
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair_chance = 0.0
    AND speculative_retry = '99PERCENTILE';

CREATE INDEX user_categories_1477899735_score_idx ON
    profile_new.user_categories_1477899735 (score);


cqlsh:profile_new> select * from user_categories_1477899735 where id=3674;


But somehow when I pass partition key and secondary index key, it still
complains:

cqlsh:profile_new> select * from user_categories_1477899735 where id=3674
and score > 0.5;

*InvalidRequest: Error from server: code=2200 [Invalid query]
message="Cannot execute this query as it might involve data filtering and
thus may have unpredictable performance. If you want to execute this query
despite the performance unpredictability, use ALLOW FILTERING"*

cqlsh:profile_new>