Michiel, this does not apply to our use case. Our data is not time series, so we do not use TTLs.
Bowen, I think it is generally recommended to run a major compaction once a week for better read performance.

On Fri, Jul 1, 2022, 6:52 AM Michiel Saelen <michiel.sae...@skyline.be> wrote:

> Hi,
>
> We ran a compaction job every week in the past to keep disk usage
> under control, as the table mainly held data that had to expire via
> TTL, and we were also using levelled compaction.
>
> In our case we had different TTLs in the same table, and the
> partitions were spread over multiple SSTables. Because the partitions
> never closed and therefore kept receiving changes, we ended up with
> repair actions that had to cover a lot of SSTables, which is heavy on
> memory and CPU.
>
> By changing the compaction strategy to TWCS
> <https://cassandra.apache.org/doc/latest/cassandra/operating/compaction/twcs.html>,
> splitting the table into separate tables, each with its own TTL, and
> adding a component to the partition key (e.g. the day of the year) to
> close the partitions so they can be "marked" as repaired, we were able
> to get rid of these heavy compaction actions.
>
> Not sure if you have the same use case, just wanted to share this info.
>
> Kind regards,
> Michiel
>
> Michiel Saelen | Principal Solution Architect
> Skyline Communications
>
> From: Bowen Song <bo...@bso.ng>
> Sent: Friday, July 1, 2022 08:48
> To: user@cassandra.apache.org
> Subject: Re: Query around Data Modelling -2
>
> And why do you do that?
>
> On 30/06/2022 16:35, MyWorld wrote:
>
> We run a major compaction once a week.
>
> On Thu, Jun 30, 2022, 8:14 PM Bowen Song <bo...@bso.ng> wrote:
>
> I have noticed this "running a weekly repair and compaction job".
>
> What do you mean by a weekly compaction job? Have you disabled
> auto-compaction on the table and are relying on weekly scheduled
> compactions? Or are you running weekly major compactions? Neither of
> these sounds right.
>
> On 30/06/2022 15:03, MyWorld wrote:
>
> Hi all,
>
> Another query around data modelling.
>
> We have an existing table with the structure below:
>
> Table(PK, CK, col1, col2, col3, col4, col5)
>
> Each PK here has 1k - 10k clustering keys. Each PK has a size from
> 10MB to 80MB. We have 100+ million partitions overall.
> Also, we have levelled compaction in place to get better read
> response times.
>
> We are currently on Cassandra 3.11.x. When we run the weekly repair
> and compaction job, this model, because of levelled compaction
> (occupied up to level 3), consumes heavy CPU resources and impacts DB
> performance.
>
> Now what if we divide this table into 10 tables, each containing 1/10
> of the partitions? Each table would then be limited to levelled
> compaction up to level 2. I think this would ease reads as well as
> the compaction task.
>
> What is your opinion on this?
>
> Even if we upgrade to version 4.0, is the second model OK?
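
For reference, a minimal Python sketch of the two partitioning ideas raised in this thread: routing each partition key to one of 10 tables, and adding a day-of-year component to the partition key. The helper names, the `my_table` base name, and the use of CRC32 as the routing hash are all assumptions for illustration; the thread does not say how either scheme was or would be implemented.

```python
import datetime
import zlib

NUM_TABLES = 10  # assumption: the ten-way split proposed in the thread


def table_for(pk: str, base: str = "my_table", num_tables: int = NUM_TABLES) -> str:
    """Return the (hypothetical) table name that would hold partition key `pk`."""
    # crc32 is stable across processes, unlike Python's built-in hash() for str,
    # so every client maps the same key to the same table.
    bucket = zlib.crc32(pk.encode("utf-8")) % num_tables
    return f"{base}_{bucket}"


def bucketed_key(pk: str, day: datetime.date) -> tuple[str, int]:
    """Return a composite partition key (pk, day_of_year).

    Writes for a given day go to their own partition, so older partitions
    stop receiving changes once the day has passed.
    """
    return (pk, day.timetuple().tm_yday)
```

With either scheme the application has to compute the table or bucket on every read and write, and a read that spans several days has to query one partition per day.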