Re: Query around Data Modelling -2

MyWorld Fri, 01 Jul 2022 03:29:51 -0700

Michiel, that does not apply to our use case. Our data is not time series,
so we do not use TTLs.


Bowen, I believe it is generally recommended to run a major
compaction once a week for better read performance.

On Fri, Jul 1, 2022, 6:52 AM Michiel Saelen <michiel.sae...@skyline.be>
wrote:

> Hi,
>
> We used to run a compaction job every week to keep disk usage under
> control, as the table mainly held data that needs to expire via TTL, and
> we were also using levelled compaction.
>
> In our case we had different TTLs in the same table, and the partitions
> were spread over multiple SSTables. Because the partitions never closed
> and therefore kept receiving changes, we ended up with repair actions that
> had to cover a lot of SSTables, which is heavy on memory and CPU.
> By changing the compaction strategy to TWCS
> <https://cassandra.apache.org/doc/latest/cassandra/operating/compaction/twcs.html>,
> splitting the table into separate tables, each with its own TTL, and adding
> a component to the partition key (e.g. the day of the year) to close the
> partitions, so they can be “marked” as repaired, we were able to get rid of
> these heavy compaction actions.
>
>
>
> Not sure if you have the same use case, just wanted to share this info.
>
>
>
> Kind regards,
>
> Michiel
>
> *Michiel Saelen *| Principal Solution Architect
>
> Email michiel.sae...@skyline.be
>
>
>
> Skyline Communications
>
> 39 Hong Kong Street #02-01 | Singapore 059678
> www.skyline.be | +65 6920 1145 <+6569201145>
>
> *From:* Bowen Song <bo...@bso.ng>
> *Sent:* Friday, July 1, 2022 08:48
> *To:* user@cassandra.apache.org
> *Subject:* Re: Query around Data Modelling -2
>
>
>
>
> And why do you do that?
>
> On 30/06/2022 16:35, MyWorld wrote:
>
> We run a major compaction once a week.
>
>
>
> On Thu, Jun 30, 2022, 8:14 PM Bowen Song <bo...@bso.ng> wrote:
>
> I have noticed this "running a weekly repair and compaction job".
>
> What do you mean by a weekly compaction job? Have you disabled
> auto-compaction on the table and are relying on weekly scheduled
> compactions? Or are you running weekly major compactions? Neither of these
> sounds right.
>
> On 30/06/2022 15:03, MyWorld wrote:
>
> Hi all,
>
>
>
> Another query about data modelling.
>
>
>
> We have an existing table with the structure below:
>
> Table(PK,CK, col1,col2, col3, col4,col5)
>
>
>
> Each PK here has 1k - 10k clustering keys, and each partition is between
> 10MB and 80MB in size. Overall we have 100+ million partitions. We have
> also set levelled compaction on the table to get better read response times.
>
>
>
> We are currently on Cassandra 3.11.x. When we run our weekly repair and
> compaction job, this model, because of levelled compaction (occupied up to
> level 3), consumes heavy CPU resources and impacts DB performance.
>
>
>
> Now what if we divide this table into 10 tables, each containing 1/10 of
> the partitions? Each table would then be limited to levelled compaction up
> to level 2. I think this would ease reads as well as the compaction task.
>
>
>
> What is your opinion on this?
>
> Even if we upgrade to version 4.0, is the second model OK?
>
>
>
>
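
The level arithmetic behind the "split into 10 tables" idea in the original
question can be sketched roughly as follows. This is only an illustration
under stated assumptions, none of which come from the thread: an SSTable
target size of 160 MB (the Cassandra default for levelled compaction), a
fanout of 10 between levels, and a hypothetical 100 GB of data per node for
the single large table. It also simplifies by assuming lower levels fill
completely before the next level is used, and it ignores level 0.

```python
# Rough levelled-compaction (LCS) capacity arithmetic.
# Assumptions (not from the thread): 160 MB SSTable target size and a
# fanout of 10, so level n holds roughly 160 MB * 10**n of data.

SSTABLE_SIZE_MB = 160
FANOUT = 10


def level_capacity_mb(level: int) -> int:
    """Approximate maximum data size of one level (level >= 1), in MB."""
    return SSTABLE_SIZE_MB * FANOUT ** level


def highest_level_needed(table_size_mb: float) -> int:
    """Lowest level n such that levels 1..n together can hold the data."""
    level = 1
    cumulative = level_capacity_mb(1)
    while cumulative < table_size_mb:
        level += 1
        cumulative += level_capacity_mb(level)
    return level


# Hypothetical per-node size of the single large table: 100 GB.
single_table_mb = 100 * 1024

print(highest_level_needed(single_table_mb))       # one large table
print(highest_level_needed(single_table_mb / 10))  # each of ten smaller tables
```

With these assumed numbers the single table reaches level 3 and each of the
ten smaller tables stays within level 2, which matches the reasoning in the
question. The sketch says nothing about the total compaction I/O across all
ten tables, so it does not by itself show that the split reduces load.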
