Reading 4 MB across 70k rows and 13 columns (0.91 million cells) from disk in 120 ms doesn't sound bad — that's a lot of deserialisation to do. If you want it to be faster, you could store the row count elsewhere, if that's the only thing you need.
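One way to do that is a counter table keyed on the same partition key, incremented alongside each insert. This is only a sketch under assumptions — the table and column names are hypothetical, and note that counter updates are not idempotent, so retried writes can over-count:

```cql
-- Hypothetical side table holding just the per-partition row count.
-- Counter tables may contain only counter columns outside the primary key.
CREATE TABLE ks1.item_to_item_counts (
    x1 bigint,
    x2 bigint,
    x3 int,
    x4 int,
    row_count counter,
    PRIMARY KEY ((x1, x2, x3, x4))
);

-- Increment whenever a row is inserted into item_to_item:
UPDATE ks1.item_to_item_counts
   SET row_count = row_count + 1
 WHERE x1 = 4 AND x2 = 70000 AND x3 = 1 AND x4 = 0;

-- A count read then becomes a single-cell fetch instead of a
-- 70000-row partition scan:
SELECT row_count FROM ks1.item_to_item_counts
 WHERE x1 = 4 AND x2 = 70000 AND x3 = 1 AND x4 = 0;
```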

On 11/04/2023 07:13, Gil Ganz wrote:
Hey
I have a 4.0.4 cluster, and reads of partitions that are a bit on the bigger side are taking longer than I would expect. Reading an entire partition that has ~70000 rows, with a total partition size of 4 MB, takes 120 ms; I would expect it to take less.

This is after a major compaction, so there is only one sstable. Consistency level is local_one, there are no tombstones, and the entire partition is read in one fetch. The cluster is not doing much else at the time, and the disk is NVMe. I can see most of the time is spent on getting the data from the sstable.

Is there any specific optimization one can do to speed up cases like this?
I would expect fetching 4 MB to take less time; I assume that if this were a single 4 MB blob, it would.

Table definition :

CREATE TABLE ks1.item_to_item (
    x1 bigint,
    x2 bigint,
    x3 int,
    x4 int,
    y1 bigint,
    y2 bigint,
    y3 bigint,
    y4 bigint,
    y5 bigint,
    metadata text,
    m_metadata_created_at timestamp,
    m_metadata_created_by bigint,
    m_metadata_updated_at timestamp,
    PRIMARY KEY ((x1, x2, x3, x4), y1, y2, y3, y4, y5)
) WITH CLUSTERING ORDER BY (y1 ASC, y2 ASC, y3 ASC, y4 ASC, y5 ASC)



cqlsh> select count(0) from  ks1.item_to_item where x1=4 and x2=70000 and x4=0 and x3=1;

 count
-------
 70000

(1 rows)

Tracing session: 6356d290-d785-11ed-aba5-ab86979f2f58

 activity                                                                               | timestamp                  | source     | source_elapsed | client
----------------------------------------------------------------------------------------+----------------------------+------------+----------------+-----------
 Execute CQL3 query                                                                     | 2023-04-10 09:52:21.561000 | 172.25.0.4 |              0 | 127.0.0.1
 Parsing XXXX [Native-Transport-Requests-1]                                             | 2023-04-10 09:52:21.561000 | 172.25.0.4 |            428 | 127.0.0.1
 Preparing statement [Native-Transport-Requests-1]                                      | 2023-04-10 09:52:21.562000 | 172.25.0.4 |            973 | 127.0.0.1
 Acquiring sstable references [ReadStage-2]                                             | 2023-04-10 09:52:21.563000 | 172.25.0.4 |           2255 | 127.0.0.1
 Skipped 0/1 non-slice-intersecting sstables, included 0 due to tombstones [ReadStage-2] | 2023-04-10 09:52:21.563000 | 172.25.0.4 |           2524 | 127.0.0.1
 Key cache hit for sstable 9 [ReadStage-2]                                              | 2023-04-10 09:52:21.563000 | 172.25.0.4 |           2692 | 127.0.0.1
 Merged data from memtables and 1 sstables [ReadStage-2]                                | 2023-04-10 09:52:21.651000 | 172.25.0.4 |          90405 | 127.0.0.1
 Read 70000 live rows and 0 tombstone cells [ReadStage-2]                               | 2023-04-10 09:52:21.651000 | 172.25.0.4 |          90726 | 127.0.0.1
 Request complete                                                                       | 2023-04-10 09:52:21.682603 | 172.25.0.4 |         121603 | 127.0.0.1


gil
