Re: TWCS on Non TTL Data

2021-09-20 Thread Bowen Song
By switching to this method, you will have two tables - the short-term 
storage table and the permanent storage table. The short-term table will 
use TWCS and have a TTL on it. The permanent table doesn't have a TTL, 
and it will use STCS or LCS instead of TWCS. Whenever the application 
writes (inserts or updates) anything, it should always write to both 
tables. When the application wants to read, it may choose to read from 
either the short-term or the permanent table, depending on the expected 
write time of the data.
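
For illustration, a minimal CQL sketch of this layout. All keyspace, table
and column names, and the 90-day retention of the short-term table, are
made-up placeholders, not taken from your schema:

-- Short-term table: serves recent reads; TWCS plus a table-level TTL, so
-- whole SSTables can be dropped once everything in them has expired.
CREATE TABLE ks.readings_recent (
    source_id uuid,
    metric    text,
    ts        timestamp,
    value     double,
    PRIMARY KEY ((source_id, metric), ts)
) WITH CLUSTERING ORDER BY (ts DESC)
  AND default_time_to_live = 7776000   -- e.g. 90 days
  AND compaction = {'class': 'TimeWindowCompactionStrategy',
                    'compaction_window_unit': 'DAYS',
                    'compaction_window_size': '7'};

-- Permanent table: no TTL, and STCS (the default) or LCS instead of TWCS.
CREATE TABLE ks.readings_archive (
    source_id uuid,
    metric    text,
    ts        timestamp,
    value     double,
    PRIMARY KEY ((source_id, metric), ts)
) WITH CLUSTERING ORDER BY (ts DESC)
  AND compaction = {'class': 'LeveledCompactionStrategy'};

-- On every write the application inserts into both tables, e.g.:
INSERT INTO ks.readings_recent  (source_id, metric, ts, value)
VALUES (123e4567-e89b-12d3-a456-426614174000, 'temperature',
        '2021-09-20 10:00:00+0000', 21.5);
INSERT INTO ks.readings_archive (source_id, metric, ts, value)
VALUES (123e4567-e89b-12d3-a456-426614174000, 'temperature',
        '2021-09-20 10:00:00+0000', 21.5);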



RE: TWCS on Non TTL Data

2021-09-19 Thread Isaeed Mohanna
The point is that I am NOT using TTL and I need to keep the data. So 
when I do the switch to TWCS, will the old files be recompacted, or 
will they remain the same, with only new data coming in using TWCS?


Re: TWCS on Non TTL Data

2021-09-17 Thread Bowen Song
If you use TWCS with TTL, the old SSTables won't be compacted; each 
SSTable file will simply get dropped in its entirety once everything in 
it has expired. I don't think you will need to manage the compaction or 
cleanup at all, as they are automatic. There's no space limit on the 
table holding the near-term data other than the overall free disk space; 
there's only a time limit on that table.


RE: TWCS on Non TTL Data

2021-09-17 Thread Isaeed Mohanna
Thanks for the help.
How does the compaction run? Does it clean up old files while running,
or only at the end? I want to manage the free space so it doesn't run
out while the compaction is running.



Re: TWCS on Non TTL Data

2021-09-15 Thread Jim Shaw
You may try rolling up the data, i.e. keep only one month of data in one
table, and roll older data up into a table that keeps a year of data.

Thanks,
Jim


RE: TWCS on Non TTL Data

2021-09-14 Thread Isaeed Mohanna
My clustering column is the time series timestamp: basically sourceId and
metric type form the partition key and the timestamp is the clustering key;
the rest of the fields are just values outside of the primary key. Our read
requests are simply "give me the values for a time range of a specific
sourceId, metric combination". So I am guessing that during a read the SSTables
that contain the partition key will be found, and out of those the ones that
are outside the range will be excluded, correct?
In practice our queries are up to a month by default; only rarely do we fetch
more, e.g. when someone is exporting the data.

In reality we also get old data, that is, a source will send its information
late: instead of sending it in real time it will send all of the last
month's/week's/day's data at once. In that case I guess the data will end up
in the current bucket; will that affect performance?

Assuming I start with a 1-week bucket, I could later change the time window,
right?

Thanks



Re: TWCS on Non TTL Data

2021-09-14 Thread Jeff Jirsa
Inline

On Tue, Sep 14, 2021 at 11:47 AM Isaeed Mohanna  wrote:

> Hi Jeff
>
> My data is partitioned by a sourceId and metric; a source is usually
> active up to a year, after which there are no additional writes for the
> partition and reads become scarce. So although this is not an explicit
> time component, it is time based. Will that suffice?
>

I guess it means that a single read may touch a year of sstables. Not
great, but perhaps not fatal. Hopefully your reads avoid that in practice.
We'd need the full schema to be very sure (does clustering column include
month/day? if so, there are cases where that can help exclude sstables)
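
For illustration, assuming the table is shaped roughly like this (all names
made up), it is the per-SSTable min/max clustering (timestamp) bounds that
can let out-of-range SSTables be skipped on a slice read like the one below:

CREATE TABLE ks.readings (
    source_id uuid,
    metric    text,
    ts        timestamp,
    value     double,
    PRIMARY KEY ((source_id, metric), ts)
) WITH CLUSTERING ORDER BY (ts DESC);

-- A typical "one sourceId/metric over a time range" read. SSTables whose
-- recorded ts range doesn't overlap the requested slice shouldn't need to
-- be read at all.
SELECT ts, value
FROM ks.readings
WHERE source_id = 123e4567-e89b-12d3-a456-426614174000
  AND metric = 'temperature'
  AND ts >= '2021-08-14' AND ts < '2021-09-14';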


>
>
> If I use a week bucket we will be able to serve the last few days' reads from
> one file and the last month's from ~5, which are the most common queries. Do
> you think a month bucket is a good idea? That would allow reading from one
> file most of the time, but the size of each SSTable will be ~5 times bigger.
>

It'll be 1-4 for most common (up to 4 for same bucket reads because STCS in
the first bucket is triggered at min_threshold=4), and 5 max, seems
reasonable. Way better than the 200 or so you're doing now.


>
>
> When changing the compaction strategy via JMX, do I need to issue the
> alter table command at the end so it will be reflected in the schema, or is
> it taken care of automatically? (I am using cassandra 3.11.11)
>
>
>

At the end, yes.
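
For example, the closing schema statement could look like this (the table
name and the one-week window here are placeholders, not a confirmed choice):

-- Issued once every node has already been switched via JMX, so the schema
-- simply catches up with what the nodes are running:
ALTER TABLE ks.readings
WITH compaction = {'class': 'TimeWindowCompactionStrategy',
                   'compaction_window_unit': 'DAYS',
                   'compaction_window_size': '7'};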


> Thanks a lot for your help.

RE: TWCS on Non TTL Data

2021-09-14 Thread Isaeed Mohanna
Hi Jeff
My data is partitioned by a sourceId and metric; a source is usually active up
to a year, after which there are no additional writes for the partition and
reads become scarce. So although this is not an explicit time component, it is
time based. Will that suffice?

If I use a week bucket we will be able to serve the last few days' reads from
one file and the last month's from ~5, which are the most common queries. Do
you think a month bucket is a good idea? That would allow reading from one file
most of the time, but the size of each SSTable will be ~5 times bigger.

When changing the compaction strategy via JMX, do I need to issue the alter
table command at the end so it will be reflected in the schema, or is it taken
care of automatically? (I am using cassandra 3.11.11)

Thanks a lot for your help.

Re: TWCS on Non TTL Data

2021-09-14 Thread Jeff Jirsa
On Tue, Sep 14, 2021 at 5:42 AM Isaeed Mohanna  wrote:

> As I understand it DTCS is deprecated, so my questions:
>
> 1. Should we switch to TWCS even though our data is not TTLed? Since we
> do not delete at all, can we still use it? Will it improve performance?
>
It will probably be better than DTCS here, but you'll still have
potentially lots of sstables over time.

Lots of sstables in itself isn't a big deal, the problem comes from
scanning more than a handful on each read. Does your table have some form
of date bucketing to avoid touching old data files?



>
> 2. If we should switch, I am thinking of using a time window of a week;
> this way a read will scan tens of SSTables instead of the hundreds it does
> today. Does that sound reasonable?
>
10s is better than hundreds, but it's still a lot.


>
> 3. Is there a recommended size of a window bucket in terms of disk
> space?
>
When I wrote it, I wrote it for a use case that had 30 windows over the
whole set of data. Since then, I've seen it used with anywhere from 5 to 60
buckets.
With no TTL, you're effectively doing infinite buckets. So the only way to
ensure you're not touching too many sstables is to put the date (in some
form) into the partition key and let the database use that (+bloom filters)
to avoid reading too many sstables.
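
For illustration, one possible shape of such a bucketed key, assuming a
month-sized bucket and made-up names (the right bucket size depends on the
data volume per source):

-- The application derives the bucket (e.g. '2021-09') from the timestamp.
CREATE TABLE ks.readings_by_month (
    source_id uuid,
    metric    text,
    month     text,
    ts        timestamp,
    value     double,
    PRIMARY KEY ((source_id, metric, month), ts)
) WITH CLUSTERING ORDER BY (ts DESC)
  AND compaction = {'class': 'TimeWindowCompactionStrategy',
                    'compaction_window_unit': 'DAYS',
                    'compaction_window_size': '7'};

-- A one-month range read then names the buckets it needs, so only those
-- partitions (and the SSTables that contain them) are touched:
SELECT ts, value
FROM ks.readings_by_month
WHERE source_id = 123e4567-e89b-12d3-a456-426614174000
  AND metric = 'temperature'
  AND month IN ('2021-08', '2021-09')
  AND ts >= '2021-08-14' AND ts < '2021-09-14';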

>
> 4. If TWCS is not a good idea, should I switch to STCS instead? Could
> that yield better performance than the current situation?
>
LCS will give you better read performance. STCS will probably be better
than DTCS given the 215 sstable p50 you're seeing (which is crazy btw, I'm
surprised you're not just OOMing)


> 5. What are the risks of changing the compaction strategy on a production
> system? Can it be done on the fly, or is it better to go through a full
> test and backup cycle?
>
>
The risk is you trigger a ton of compactions which drops the performance of
the whole system all at once and your front door queries all time out.
You can approach this a few ways:
- Use the JMX endpoint to change compaction on one instance at a time
(rather than doing it in the schema), which lets you control how many nodes
are re-writing all their data at any given point in time
- You can make an entirely new table, and then populate it by reading from
the old one and writing to the new one (sketched below), and then you don't
have the massive compaction kick off
- You can use user defined compaction to force compact some of those 33k
sstables into fewer sstables in advance, hopefully taking away some of the
pain you're seeing, before you fire off the big compaction

The 3rd hint above - user defined compaction - will make TWCS less
effective, because TWCS uses the max timestamp per sstable for bucketing,
and you'd be merging sstables and losing granularity.
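
As a sketch of the second option above (new table plus repopulation), with
made-up names; note that CQL has no INSERT ... SELECT, so the copy itself has
to be driven from outside (application code, Spark, a bulk-load tool, etc.):

-- New table created with the target compaction strategy from the start:
CREATE TABLE ks.readings_v2 (
    source_id uuid,
    metric    text,
    ts        timestamp,
    value     double,
    PRIMARY KEY ((source_id, metric), ts)
) WITH CLUSTERING ORDER BY (ts DESC)
  AND compaction = {'class': 'TimeWindowCompactionStrategy',
                    'compaction_window_unit': 'DAYS',
                    'compaction_window_size': '7'};

-- New writes go to both the old and the new table until the backfill from
-- the old table has finished, after which reads can be switched over.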

Really though, the main thing you need to do is get a time component in
your partition key so you avoid scanning every sstable looking for data,
either that or bite the bullet and use LCS so the compaction system keeps
it at a manageable level for reads.



>
>
> All input will be appreciated,
>
> Thank you
>


TWCS on Non TTL Data

2021-09-14 Thread Isaeed Mohanna
Hi
I have a table that stores time series data. The data is not TTLed since we
want to retain it for the foreseeable future, and there are no updates or
deletes (deletes could happen rarely in case some scrambled data reached the
table, but that is extremely rare).
Usually we do a constant write of incoming data to the table, ~5 million a day,
mostly newly generated data from the past week, but we also get old data that
got stuck somewhere, though not that often. Usually our reads are for the most
recent data, the last one to three months, but we do fetch older data as well
for a specific time period in the past.
Lately we have been facing performance trouble with this table, see the
histogram below. When compaction is working on the table, read latencies even
reach 10-20 seconds!!
Percentile  SSTables   Write Latency   Read Latency   Partition Size   Cell Count
                       (micros)        (micros)       (bytes)
50%           215.00       17.08         89970.66         1916             149
75%           446.00       24.60        223875.79         2759             215
95%           535.00       35.43        464228.84         8239             642
98%           642.00       51.01        668489.53        24601            1916
99%           642.00       73.46        962624.93        42510            3311
Min             0.00        2.30         10090.81           43               0
Max           770.00     1358.10       2395318.86      5839588          454826

As u can see we are scaning hundreds of sstables, turns out we are using DTCS  
(min:4,max32) , the table folder contains ~33K files  of ~130GB per node 
(cleanup pending after increasing the cluster), And compaction takes a very 
long time to complete.
As I understood DTCS is deprecated so my questions

  1.  should we switch to TWCS even though our data is not TTLed since we do 
not do delete at all can we still use it? Will it improve performance?
  2.  If we should switch I am thinking of using a time window of a week, this 
way the read will scan 10s of sstables instead of hundreds today. Does it sound 
reasonable?
  3.  Is there a recommended size of a window bucket in terms of disk space?
  4.  If TWCS is not a good idea should I switch to STCS instead could that 
yield in better performance than current situation?
  5.  What are the risk of changing compaction strategy on a production system, 
can it be done on the fly? Or its better to go through a full test, backup 
cycle?

All input will be appreciated,
Thank you