According to the trace log, only one SSTable was read. The compaction strategy
is size-tiered.
I attached a more readable version of my trace for details.
On Mon, May 11, 2015 at 11:35 AM, Anishek Agarwal <[email protected]> wrote:
> How many SSTables were there? Which compaction strategy are you using? These
> properties determine how many disk reads Cassandra may have to do to get
> all the data you need, depending on which SSTables hold data for your
> partition key.
>
> On Fri, May 8, 2015 at 6:25 PM, Alprema <[email protected]> wrote:
>
>> I was planning on using a more "server-friendly" strategy anyway (by
>> parallelizing my workload on multiple metrics) but my concern here is more
>> about the raw numbers.
>>
>> According to the trace and my estimate of the data size, the read from
>> disk ran at about 30 MB/s and the transfer between the responsible
>> node and the coordinator ran at about 120 Mbit/s, which doesn't seem right
>> given that the cluster was not busy and the network is gigabit-capable.
>>
>> I know that there is some overhead, but these numbers seem odd to me. Do
>> they seem normal to you?
>>
>> On Fri, May 8, 2015 at 2:34 PM, Bryan Holladay <[email protected]>
>> wrote:
>>
>>> Try breaking it up into smaller chunks using multiple threads and token
>>> ranges. 86400 is pretty large. I found ~1000 results per query is good.
>>> This will spread the burden across all servers a little more evenly.
>>>
>>> On Thu, May 7, 2015 at 4:27 AM, Alprema <[email protected]> wrote:
>>>
>>>> Hi,
>>>>
>>>> I am writing an application that will periodically read large amounts of
>>>> data from Cassandra, and I am seeing odd performance.
>>>>
>>>> My column family is a classic time series one, with series ID and Day
>>>> as partition key and a timestamp as clustering key, the value being a
>>>> double.
>>>>
>>>> The query I run gets all the values for a given time series for a given
>>>> day (so about 86400 points):
>>>>
>>>> SELECT "UtcDate", "Value"
>>>> FROM "Metric_OneSec"
>>>> WHERE "MetricId" = 12215ece-6544-4fcf-a15d-4f9e9ce1567e
>>>> AND "Day" = '2015-05-05 00:00:00+0000'
>>>> LIMIT 86400;
>>>>
>>>>
>>>> This takes about 450ms to run and when I trace the query I see that it
>>>> takes about 110ms to read the data from disk and 224ms to send the data
>>>> from the responsible node to the coordinator (full trace in attachment).
>>>>
>>>> I did a quick estimate of the requested data size (correct me if I'm
>>>> wrong):
>>>> 86400 * (column name + column value + timestamp + ttl)
>>>> = 86400 * (8 + 8 + 8 + 8?) bytes
>>>> = 2,764,800 bytes, i.e. about 2.6 MiB
>>>>
>>>> Let's say about 3 MB with misc. overhead, so these timings seem pretty
>>>> slow to me for a modern SSD and a 1 Gbit/s NIC.
>>>>
>>>> Do those timings seem normal? Am I missing something?
>>>>
>>>> Thank you,
>>>>
>>>> Kévin
>>>>
>>>>
>>>>
>>>
>>
>
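The suggestion earlier in the thread (split the 86400-point read into queries of roughly 1000 results each) can be sketched as below. This is only an illustration, not the poster's code: it assumes that "UtcDate" is the clustering column of "Metric_OneSec" and that one point is stored per second, and the helper names chunk_day, cql_ts and chunk_queries are made up for the example. It only builds the CQL strings; a real client would normally bind the values to a prepared statement and run the chunks from several threads.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical template: assumes "UtcDate" is the clustering column.
TEMPLATE = (
    'SELECT "UtcDate", "Value" FROM "Metric_OneSec" '
    'WHERE "MetricId" = {metric_id} AND "Day" = {day} '
    'AND "UtcDate" >= {start} AND "UtcDate" < {end};'
)


def chunk_day(day_start, chunk_seconds=1000):
    """Yield half-open (start, end) ranges that cover one full day."""
    day_end = day_start + timedelta(days=1)
    start = day_start
    while start < day_end:
        end = min(start + timedelta(seconds=chunk_seconds), day_end)
        yield start, end
        start = end


def cql_ts(moment):
    """Format a UTC datetime as a quoted CQL timestamp literal."""
    return "'" + moment.strftime("%Y-%m-%d %H:%M:%S+0000") + "'"


def chunk_queries(metric_id, day_start, chunk_seconds=1000):
    """Yield one CQL query string per time chunk of the given day."""
    for start, end in chunk_day(day_start, chunk_seconds):
        yield TEMPLATE.format(
            metric_id=metric_id,
            day=cql_ts(day_start),
            start=cql_ts(start),
            end=cql_ts(end),
        )


if __name__ == "__main__":
    day = datetime(2015, 5, 5, tzinfo=timezone.utc)
    for query in chunk_queries("12215ece-6544-4fcf-a15d-4f9e9ce1567e", day):
        print(query)
```

Because every chunk still targets the same partition, this spreads the work over time and over the replicas of that one partition; it does not by itself move the load to other nodes.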
activity                                                                  | timestamp    | source | source_elapsed
--------------------------------------------------------------------------+--------------+--------+---------------
execute_cql3_query                                                        | 09:25:45,027 | node01 |              0
Message received from /node01                                             | 09:25:45,021 | node02 |             10
Executing single-partition query on Metric_OneSec                         | 09:25:45,021 | node02 |            156
Acquiring sstable references                                              | 09:25:45,021 | node02 |            164
Merging memtable tombstones                                               | 09:25:45,021 | node02 |            179
Bloom filter allows skipping sstable 5153                                 | 09:25:45,021 | node02 |            198
Bloom filter allows skipping sstable 5152                                 | 09:25:45,021 | node02 |            205
Bloom filter allows skipping sstable 5151                                 | 09:25:45,021 | node02 |            211
Bloom filter allows skipping sstable 5146                                 | 09:25:45,021 | node02 |            217
Key cache hit for sstable 5125                                            | 09:25:45,021 | node02 |            228
Seeking to partition beginning in data file                               | 09:25:45,021 | node02 |            231
Bloom filter allows skipping sstable 5040                                 | 09:25:45,022 | node02 |            470
Bloom filter allows skipping sstable 4955                                 | 09:25:45,022 | node02 |            479
Bloom filter allows skipping sstable 4614                                 | 09:25:45,022 | node02 |            485
Skipped 0/8 non-slice-intersecting sstables, included 0 due to tombstones | 09:25:45,022 | node02 |            491
Merging data from memtables and 1 sstables                                | 09:25:45,022 | node02 |            495
Parsing
  SELECT "Value" FROM "Metric_OneSec"
  WHERE "MetricId" = 12215ece-6544-4fcf-a15d-4f9e9ce1567e
  AND "Day" = '2015-05-05 00:00:00+0000'
  LIMIT 86400;                                                            | 09:25:45,027 | node01 |             23
Preparing statement                                                       | 09:25:45,027 | node01 |            115
Sending message to /node02                                                | 09:25:45,027 | node01 |            798
Read 86090 live and 0 tombstoned cells                                    | 09:25:45,135 | node02 |         113809
Enqueuing response to /node01                                             | 09:25:45,135 | node02 |         114046
Sending message to /node01                                                | 09:25:45,135 | node02 |         114108
Message received from /node02                                             | 09:25:45,365 | node01 |         338615
Processing response from /node02                                          | 09:25:45,365 | node01 |         338654
Request complete                                                          | 09:25:45,455 | node01 |         428111
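
The read and transfer figures discussed in the thread can be re-derived from the source_elapsed column of this trace. The sketch below is a rough check, not a measurement: it assumes the payload is the 86400 * 32 bytes estimated in the thread (the trace actually reports 86090 cells), it takes source_elapsed to be in microseconds, and it treats node02's clock as starting when node01 sent the request, so the request's own network latency is counted as transfer time. The constant and function names are made up for the example.

```python
# source_elapsed values copied from the trace, in microseconds.
COORD_SENT_US = 798              # node01: Sending message to /node02
REPLICA_MERGE_START_US = 495     # node02: Merging data from memtables and 1 sstables
REPLICA_READ_DONE_US = 113809    # node02: Read 86090 live and 0 tombstoned cells
REPLICA_SENT_US = 114108         # node02: Sending message to /node01
COORD_RECEIVED_US = 338615       # node01: Message received from /node02

# Payload estimate taken from the thread (an assumption, not measured).
ROWS = 86400
BYTES_PER_ROW = 8 + 8 + 8 + 8


def read_duration_us():
    """Time node02 spent between starting the merge and finishing the read."""
    return REPLICA_READ_DONE_US - REPLICA_MERGE_START_US


def transfer_duration_us():
    """Approximate time for the response to travel from node02 to node01."""
    return COORD_RECEIVED_US - COORD_SENT_US - REPLICA_SENT_US


def megabytes_per_second(n_bytes, duration_us):
    """Bytes per microsecond is numerically equal to 10^6 bytes per second."""
    return n_bytes / duration_us


def megabits_per_second(n_bytes, duration_us):
    """Bits per microsecond is numerically equal to 10^6 bits per second."""
    return 8 * n_bytes / duration_us


if __name__ == "__main__":
    payload = ROWS * BYTES_PER_ROW
    print("read: %d us, %.1f MB/s"
          % (read_duration_us(), megabytes_per_second(payload, read_duration_us())))
    print("transfer: %d us, %.1f Mbit/s"
          % (transfer_duration_us(), megabits_per_second(payload, transfer_duration_us())))
```

With the unrounded 2,764,800-byte estimate the results come out somewhat below the 30 MB/s and 120 Mbit/s quoted in the thread, but in the same order of magnitude.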