Is there anything I can do to speed up the deserialisation? In this example I ran a count query, but in reality I need the actual data. The write pattern for this table is such that all the data for a given row is written at the same time, so I know I could use a frozen UDT instead, which should make it faster, but I wonder if there is another way.
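For reference, a minimal sketch of the frozen UDT variant mentioned above. It assumes the four non-key columns are the ones folded into the UDT (9 of the 13 columns are primary key columns and stay as they are). The names item_meta, item_to_item_udt and meta are made up for illustration, and whether this actually shortens the read has not been measured here.

```cql
-- Hypothetical type name; groups the four regular (non-key) columns.
CREATE TYPE ks1.item_meta (
    metadata text,
    m_metadata_created_at timestamp,
    m_metadata_created_by bigint,
    m_metadata_updated_at timestamp
);

-- Hypothetical table name; same primary key as ks1.item_to_item.
CREATE TABLE ks1.item_to_item_udt (
    x1 bigint,
    x2 bigint,
    x3 int,
    x4 int,
    y1 bigint,
    y2 bigint,
    y3 bigint,
    y4 bigint,
    y5 bigint,
    meta frozen<item_meta>,  -- stored and read back as a single cell per row
    PRIMARY KEY ((x1, x2, x3, x4), y1, y2, y3, y4, y5)
) WITH CLUSTERING ORDER BY (y1 ASC, y2 ASC, y3 ASC, y4 ASC, y5 ASC);
```

A frozen value can only be overwritten as a whole, which fits the write pattern described above, where all data for a row is written at the same time.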
On Tue, Apr 11, 2023 at 9:06 PM Bowen Song via user <user@cassandra.apache.org> wrote:

> Reading 4MB from 70k rows and 13 columns (0.91 million cells) from disk in
> 120ms doesn't sound bad. That's a lot of deserialisation to do. If you
> want it to be faster, you can store the number of rows elsewhere if that's
> the only thing you need.
>
> On 11/04/2023 07:13, Gil Ganz wrote:
>
> Hey
> I have a 4.0.4 cluster, with reads of partitions that are a bit on the
> bigger side, taking longer than I would expect.
> Reading entire partition that has ~70000 rows, total partition size of
> 4mb, takes 120ms, I would expect it to take less.
>
> This is after major compaction, so there is only one sstable. local_one
> consistency level, no tombstones, and reading the entire partition in one
> fetch.
> Cluster is not doing much else at the time, nvme disk. I can see most of
> the time is spent on getting the data from the sstable.
>
> Is there any specific optimization one can do to speed up cases like this?
> I would expect fetching 4mb to take less, I assume if this was one blob of
> 4mb that would be the case.
>
> Table definition:
>
> CREATE TABLE ks1.item_to_item (
>     x1 bigint,
>     x2 bigint,
>     x3 int,
>     x4 int,
>     y1 bigint,
>     y2 bigint,
>     y3 bigint,
>     y4 bigint,
>     y5 bigint,
>     metadata text,
>     m_metadata_created_at timestamp,
>     m_metadata_created_by bigint,
>     m_metadata_updated_at timestamp,
>     PRIMARY KEY ((x1, x2, x3, x4), y1, y2, y3, y4, y5)
> ) WITH CLUSTERING ORDER BY (y1 ASC, y2 ASC, y3 ASC, y4 ASC, y5 ASC)
>
> cqlsh> select count(0) from ks1.item_to_item where x1=4 and x2=70000 and x4=0 and x3=1;
>
>  count
> -------
>  70000
>
> (1 rows)
>
> Tracing session: 6356d290-d785-11ed-aba5-ab86979f2f58
>
> activity | timestamp | source | source_elapsed | client
> ---------+-----------+--------+----------------+-------
> Execute CQL3 query | 2023-04-10 09:52:21.561000 | 172.25.0.4 | 0 | 127.0.0.1
> Parsing XXXX [Native-Transport-Requests-1] | 2023-04-10 09:52:21.561000 | 172.25.0.4 | 428 | 127.0.0.1
> Preparing statement [Native-Transport-Requests-1] | 2023-04-10 09:52:21.562000 | 172.25.0.4 | 973 | 127.0.0.1
> Acquiring sstable references [ReadStage-2] | 2023-04-10 09:52:21.563000 | 172.25.0.4 | 2255 | 127.0.0.1
> Skipped 0/1 non-slice-intersecting sstables, included 0 due to tombstones [ReadStage-2] | 2023-04-10 09:52:21.563000 | 172.25.0.4 | 2524 | 127.0.0.1
> Key cache hit for sstable 9 [ReadStage-2] | 2023-04-10 09:52:21.563000 | 172.25.0.4 | 2692 | 127.0.0.1
> Merged data from memtables and 1 sstables [ReadStage-2] | 2023-04-10 09:52:21.651000 | 172.25.0.4 | 90405 | 127.0.0.1
> Read 70000 live rows and 0 tombstone cells [ReadStage-2] | 2023-04-10 09:52:21.651000 | 172.25.0.4 | 90726 | 127.0.0.1
> Request complete | 2023-04-10 09:52:21.682603 | 172.25.0.4 | 121603 | 127.0.0.1
>
> gil