[
https://issues.apache.org/jira/browse/CASSANDRA-10790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Joshua McKenzie updated CASSANDRA-10790:
----------------------------------------
Issue Type: Improvement (was: Bug)
> querying on partitioning key and secondary index slow
> ------------------------------------------------------
>
> Key: CASSANDRA-10790
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10790
> Project: Cassandra
> Issue Type: Improvement
> Components: Local Write-Read Paths
> Reporter: Jan Cetkovsky
> Fix For: 2.2.x, 3.0.x, 3.x
>
>
> If large amount on entries exists in table, querying on partitioning key and
> (low cardinality) secondary index is significantly slower than just on the
> partitioning key, even with limit 1. It seems to do a full scan on both
> indexes and combine and filter this result.
> In the example below, there is roughly 400M records on a given node - as you
> can see the two queries yield the same result, with the refining secondary
> key condition increasing the execution time roughly 35x. It seems to me that
> plain filtering of the pk query result should be more efficient especially
> with low cardinality secondary indexes (but it might be an assumption based
> on my modeling approach).
> [cqlsh 4.1.1 | Cassandra 2.0.14 | CQL spec 3.1.1 | Thrift protocol 19.39.0]
> {code}
> CREATE TABLE fil (
> nm text,
> dir boolean,
> id timeuuid,
> attr map<text, text>,
> cntt text,
> del boolean,
> exp timestamp,
> gid text,
> perm int,
> seg map<timeuuid, bigint>,
> size bigint,
> uid text,
> PRIMARY KEY ((nm), dir, id)
> ) WITH CLUSTERING ORDER BY (dir ASC, id DESC) AND
> bloom_filter_fp_chance=0.010000 AND
> caching='KEYS_ONLY' AND
> comment='table used to store file information' AND
> dclocal_read_repair_chance=0.100000 AND
> gc_grace_seconds=864000 AND
> index_interval=128 AND
> read_repair_chance=0.000000 AND
> replicate_on_write='true' AND
> populate_io_cache_on_flush='false' AND
> default_time_to_live=0 AND
> speculative_retry='99.0PERCENTILE' AND
> memtable_flush_period_in_ms=0 AND
> compaction={'class': 'SizeTieredCompactionStrategy'} AND
> compression={'chunk_length_kb': '512', 'crc_check_chance': '0.5',
> 'sstable_compression': 'SnappyCompressor'};
> CREATE INDEX fil_del_idx ON fil (del);
> cqlsh:pronto> select id, nm, attr, seg, exp, dir, uid, gid, perm, cntt from
> fil where nm='/dir_108/aePzC_62755108' and dir=false and del=false limit 1;
> id | nm | attr | seg
> | exp | dir |
> uid | gid | perm | cntt
> --------------------------------------+-------------------------+------+-----------------------------------------------+--------------------------+-------+-----------+------------+------+--------------------------
> e22d17f3-7f77-11e5-8313-0522d849d63b | /dir_108/aePzC_62755108 | null |
> {e23d6ba0-7f77-11e5-8313-0522d849d63b: 22528} | 2016-10-29 19:34:17-0700 |
> False | test_user | test_group | 1911 | application/octet-stream
> (1 rows)
> Tracing session: e0287850-88cd-11e5-a7d5-2360c9ef4b33
> activity
>
>
> | timestamp | source
> | source_elapsed
> ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--------------+--------------+----------------
>
>
>
> execute_cql3_query | 15:42:31,643 | 10.65.230.23
> | 0
>
> Parsing select id,
> nm, attr, seg, exp, dir, uid, gid, perm, cntt from fil where
> nm='/dir_108/aePzC_62755108' and dir=false and del=false limit 1; |
> 15:42:31,643 | 10.65.230.23 | 105
>
>
>
> Preparing statement | 15:42:31,644 | 10.65.230.23
> | 332
>
>
>
> Determining replicas to query | 15:42:31,644 | 10.65.230.23
> | 607
> Candidate index mean cardinalities are
> CompositesIndexOnRegular{columnDefs=[ColumnDefinition{name=64656c,
> validator=org.apache.cassandra.db.marshal.BooleanType, type=REGULAR,
> componentIndex=2, indexName=fil_del_idx, indexType=COMPOSITES}]}:43707902.
> Scanning with fil.fil_del_idx. | 15:42:31,644 | 10.65.230.23 | 1031
>
>
> Executing indexed scan for
> [/dir_108/aePzC_62755108, /dir_108/aePzC_62755108] | 15:42:31,644 |
> 10.65.230.23 | 1077
> Candidate index mean cardinalities are
> CompositesIndexOnRegular{columnDefs=[ColumnDefinition{name=64656c,
> validator=org.apache.cassandra.db.marshal.BooleanType, type=REGULAR,
> componentIndex=2, indexName=fil_del_idx, indexType=COMPOSITES}]}:43707902.
> Scanning with fil.fil_del_idx. | 15:42:31,644 | 10.65.230.23 | 1157
>
>
>
> Executing single-partition query on fil.fil_del_idx | 15:42:31,644 |
> 10.65.230.23 | 1221
>
>
>
> Acquiring sstable references | 15:42:31,644 | 10.65.230.23
> | 1239
>
>
>
> Merging memtable tombstones | 15:42:31,645 | 10.65.230.23
> | 1323
>
>
>
> Partition index with 1675 entries found for sstable 3675 | 15:42:31,654 |
> 10.65.230.23 | 10309
>
>
>
> Seeking to partition indexed section in data file | 15:42:31,654 |
> 10.65.230.23 | 10325
>
>
>
> Partition index with 2132 entries found for sstable 3582 | 15:42:31,657 |
> 10.65.230.23 | 13605
>
>
>
> Seeking to partition indexed section in data file | 15:42:31,657 |
> 10.65.230.23 | 13619
>
>
>
> Partition index with 1800 entries found for sstable 3565 | 15:42:31,661 |
> 10.65.230.23 | 17657
>
>
>
> Seeking to partition indexed section in data file | 15:42:31,661 |
> 10.65.230.23 | 17698
>
>
> Partition
> index with 28922 entries found for sstable 3553 | 15:42:31,723 | 10.65.230.23
> | 79557
>
>
>
> Seeking to partition indexed section in data file | 15:42:31,723 |
> 10.65.230.23 | 79612
>
>
> Partition
> index with 32092 entries found for sstable 3359 | 15:42:31,779 | 10.65.230.23
> | 135423
>
>
>
> Seeking to partition indexed section in data file | 15:42:31,779 |
> 10.65.230.23 | 135484
>
>
> Partition
> index with 32148 entries found for sstable 3097 | 15:42:31,858 | 10.65.230.23
> | 214790
>
>
>
> Seeking to partition indexed section in data file | 15:42:31,858 |
> 10.65.230.23 | 214856
>
>
> Partition
> index with 127470 entries found for sstable 2811 | 15:42:32,040 |
> 10.65.230.23 | 397123
>
>
>
> Seeking to partition indexed section in data file | 15:42:32,040 |
> 10.65.230.23 | 397196
>
>
> Partition
> index with 136232 entries found for sstable 1769 | 15:42:32,286 |
> 10.65.230.23 | 642508
>
>
>
> Seeking to partition indexed section in data file | 15:42:32,286 |
> 10.65.230.23 | 642577
>
>
>
> Partition index with 15131 entries found for sstable 938 | 15:42:32,320 |
> 10.65.230.23 | 676574
>
>
>
> Seeking to partition indexed section in data file | 15:42:32,320 |
> 10.65.230.23 | 676613
>
>
> Skipped 0/9
> non-slice-intersecting sstables, included 0 due to tombstones | 15:42:32,320
> | 10.65.230.23 | 676646
>
>
>
> Merging data from memtables and 9 sstables | 15:42:32,320 | 10.65.230.23
> | 676660
>
>
>
> Read 1 live and 0 tombstoned cells | 15:42:32,351 | 10.65.230.23
> | 707674
>
>
>
> Executing single-partition query on fil | 15:42:32,351 | 10.65.230.23
> | 707862
>
>
>
> Acquiring sstable references | 15:42:32,351 | 10.65.230.23
> | 707872
>
>
>
> Merging memtable tombstones | 15:42:32,351 | 10.65.230.23
> | 707893
>
>
>
> Bloom filter allows skipping sstable 3391 | 15:42:32,351 | 10.65.230.23
> | 707973
>
>
>
> Bloom filter allows skipping sstable 3387 | 15:42:32,351 | 10.65.230.23
> | 707992
>
>
>
> Partition index with 0 entries found for sstable 3385 | 15:42:32,351 |
> 10.65.230.23 | 708071
>
>
>
> Seeking to partition indexed section in data file | 15:42:32,351 |
> 10.65.230.23 | 708079
>
>
> Skipped 10/13
> non-slice-intersecting sstables, included 0 due to tombstones | 15:42:32,352
> | 10.65.230.23 | 709028
>
>
>
> Merging data from memtables and 1 sstables | 15:42:32,352 | 10.65.230.23
> | 709045
>
>
>
> Read 1 live and 0 tombstoned cells | 15:42:32,352 | 10.65.230.23
> | 709090
>
>
>
> Scanned 1 rows and matched 1 | 15:42:32,352 | 10.65.230.23
> | 709211
>
>
>
> Request complete | 15:42:32,352 | 10.65.230.23
> | 709492
> cqlsh:pronto> select id, nm, attr, seg, exp, dir, uid, gid, perm, cntt from
> fil where nm='/dir_108/aePzC_62755108' and dir=false limit 1;
> id | nm | attr | seg
> | exp | dir |
> uid | gid | perm | cntt
> --------------------------------------+-------------------------+------+-----------------------------------------------+--------------------------+-------+-----------+------------+------+--------------------------
> e22d17f3-7f77-11e5-8313-0522d849d63b | /dir_108/aePzC_62755108 | null |
> {e23d6ba0-7f77-11e5-8313-0522d849d63b: 22528} | 2016-10-29 19:34:17-0700 |
> False | test_user | test_group | 1911 | application/octet-stream
> (1 rows)
> Tracing session: e69202b0-88cd-11e5-a7d5-2360c9ef4b33
> activity
> | timestamp |
> source | source_elapsed
> -------------------------------------------------------------------------------------------------------------------------------------+--------------+--------------+----------------
>
> execute_cql3_query | 15:42:42,382 |
> 10.65.230.23 | 0
> Parsing select id, nm, attr, seg, exp, dir, uid, gid, perm, cntt from fil
> where nm='/dir_108/aePzC_62755108' and dir=false limit 1; | 15:42:42,382 |
> 10.65.230.23 | 112
>
> Preparing statement | 15:42:42,382 |
> 10.65.230.23 | 333
>
> Executing single-partition query on fil | 15:42:42,383 |
> 10.65.230.23 | 940
>
> Acquiring sstable references | 15:42:42,383 |
> 10.65.230.23 | 966
>
> Merging memtable tombstones | 15:42:42,383 |
> 10.65.230.23 | 1000
>
> Bloom filter allows skipping sstable 3471 | 15:42:42,383 |
> 10.65.230.23 | 1121
>
> Bloom filter allows skipping sstable 3432 | 15:42:42,383 |
> 10.65.230.23 | 1170
>
> Bloom filter allows skipping sstable 3391 | 15:42:42,383 |
> 10.65.230.23 | 1199
>
> Bloom filter allows skipping sstable 3387 | 15:42:42,383 |
> 10.65.230.23 | 1227
>
> Bloom filter allows skipping sstable 3369 | 15:42:42,383 |
> 10.65.230.23 | 1317
>
> Bloom filter allows skipping sstable 3364 | 15:42:42,383 |
> 10.65.230.23 | 1353
>
> Bloom filter allows skipping sstable 3358 | 15:42:42,383 |
> 10.65.230.23 | 1387
>
> Key cache hit for sstable 3385 | 15:42:42,383 |
> 10.65.230.23 | 1432
>
> Seeking to partition indexed section in data file | 15:42:42,383 |
> 10.65.230.23 | 1443
>
> Message received from /10.65.230.23 | 15:42:42,386 |
> 10.65.230.20 | 37
>
> Sending message to /10.65.230.20 | 15:42:42,388 |
> 10.65.230.23 | 5886
>
> Bloom filter allows skipping sstable 2984 | 15:42:42,390 |
> 10.65.230.23 | 7679
>
> Bloom filter allows skipping sstable 2535 | 15:42:42,390 |
> 10.65.230.23 | 7737
>
> Bloom filter allows skipping sstable 2452 | 15:42:42,390 |
> 10.65.230.23 | 7780
>
> Bloom filter allows skipping sstable 2058 | 15:42:42,390 |
> 10.65.230.23 | 7824
>
> Executing single-partition query on fil | 15:42:42,392 |
> 10.65.230.20 | 6571
>
> Acquiring sstable references | 15:42:42,392 |
> 10.65.230.20 | 6884
>
> Merging memtable tombstones | 15:42:42,392 |
> 10.65.230.20 | 6931
>
> Bloom filter allows skipping sstable 1695 | 15:42:42,398 |
> 10.65.230.23 | 16437
> Skipped 0/13
> non-slice-intersecting sstables, included 0 due to tombstones | 15:42:42,398
> | 10.65.230.23 | 16460
>
> Merging data from memtables and 1 sstables | 15:42:42,398 |
> 10.65.230.23 | 16472
>
> Read 1 live and 0 tombstoned cells | 15:42:42,398 |
> 10.65.230.23 | 16530
>
> Bloom filter allows skipping sstable 3415 | 15:42:42,404 |
> 10.65.230.20 | 18138
>
> Bloom filter allows skipping sstable 3393 | 15:42:42,404 |
> 10.65.230.20 | 18186
>
> Partition index with 0 entries found for sstable 3359 | 15:42:42,420 |
> 10.65.230.20 | 34457
>
> Seeking to partition indexed section in data file | 15:42:42,420 |
> 10.65.230.20 | 34544
> Skipped 0/3
> non-slice-intersecting sstables, included 0 due to tombstones | 15:42:42,444
> | 10.65.230.20 | 58228
>
> Merging data from memtables and 1 sstables | 15:42:42,444 |
> 10.65.230.20 | 58374
>
> Read 1 live and 0 tombstoned cells | 15:42:42,444 |
> 10.65.230.20 | 58570
>
> Enqueuing response to /10.65.230.23 | 15:42:42,468 |
> 10.65.230.20 | 82602
>
> Sending message to /10.65.230.23 | 15:42:42,468 |
> 10.65.230.20 | 82775
>
> Message received from /10.65.230.20 | 15:42:42,472 |
> 10.65.230.23 | null
>
> Processing response from /10.65.230.20 | 15:42:42,473 |
> 10.65.230.23 | null
>
> Request complete | 15:42:42,402 |
> 10.65.230.23 | 20725
> {code}
> also this doesn't manifest much on new entries (close to top of the secondary
> index?) and gets worse with older entries.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
