RE: Query around Data Modelling -2

2022-06-30 Thread Michiel Saelen
Hi, We used to run a compaction job every week to keep disk usage under control, as the table mainly held data that needs to expire with TTL and we were also using levelled compaction. In our case we had different TTLs in the same table and the partitions were spread over

Re: Query around Data Modelling -2

2022-06-30 Thread Bowen Song
And why do you do that? On 30/06/2022 16:35, MyWorld wrote: We run major compaction once a week On Thu, Jun 30, 2022, 8:14 PM Bowen Song wrote: I have noticed this "running a weekly repair and compaction job". What do you mean by a weekly compaction job? Have you disabled the

Re: Query around Data Modelling -2

2022-06-30 Thread MyWorld
We run major compaction once a week On Thu, Jun 30, 2022, 8:14 PM Bowen Song wrote: > I have noticed this "running a weekly repair and compaction job". > > What do you mean by a weekly compaction job? Have you disabled the > auto-compaction on the table and are relying on weekly scheduled >

The Apache Cassandra(R) Corner Podcast

2022-06-30 Thread Aaron Ploetz
How does open source marketing work and how does it help the Apache Cassandra® project? Listen in as Constantia's Melissa Logan and I talk about open source, Cassandra, and the Data on Kubernetes community. https://anchor.fm/cassandra-corner/episodes/ep5---Melissa-Logan-Constantia-e1kgp40 The

Re: Query around Data Modelling -2

2022-06-30 Thread Bowen Song
I have noticed this "running a weekly repair and compaction job". What do you mean by a weekly compaction job? Have you disabled the auto-compaction on the table and are relying on weekly scheduled compactions? Or running weekly major compactions? Neither of these sounds right. On 30/06/2022

Re: Query around Data Modelling -2

2022-06-30 Thread MyWorld
Hi Jeff, We are running repair with the -pr option. You are right, it would have no or very minimal impact on reads (considering that data now has to be read from 2 levels instead of 3). But my guess is that there is no negative impact from this model2. On Thu, Jun 30, 2022, 7:41 PM Jeff Jirsa wrote: >

Re: Query around Data Modelling -2

2022-06-30 Thread Jeff Jirsa
How are you running repair? -pr? Or -st/-et? 4.0 gives you real incremental repair which helps. Splitting the table won’t make reads faster. It will increase the potential parallelization of compaction. > On Jun 30, 2022, at 7:04 AM, MyWorld wrote: > > Hi all, > > Another query around
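
For readers unfamiliar with the -st/-et form mentioned above, here is a minimal sketch of how subrange repair commands could be generated. It assumes the default Murmur3Partitioner token range; the keyspace and table names (`my_keyspace`, `my_table`), the helper names, and the number of slices are hypothetical, and any further repair flags would depend on the cluster and the Cassandra version.

```python
# Sketch only: builds `nodetool repair -st/-et` command strings.
# Assumes Murmur3Partitioner; keyspace/table names are placeholders.

MIN_TOKEN = -(2**63)      # lowest Murmur3 token
MAX_TOKEN = 2**63 - 1     # highest Murmur3 token


def split_token_range(start, end, parts):
    """Split the range (start, end] into `parts` contiguous subranges."""
    if parts < 1:
        raise ValueError("parts must be at least 1")
    span = end - start
    # Integer arithmetic keeps the boundaries exact for 64-bit tokens.
    bounds = [start + span * i // parts for i in range(parts)] + [end]
    return list(zip(bounds[:-1], bounds[1:]))


def repair_commands(keyspace, table, parts):
    """Return one nodetool command string per token subrange."""
    return [
        f"nodetool repair -st {st} -et {et} {keyspace} {table}"
        for st, et in split_token_range(MIN_TOKEN, MAX_TOKEN, parts)
    ]


if __name__ == "__main__":
    for cmd in repair_commands("my_keyspace", "my_table", 4):
        print(cmd)
```

Each printed command covers one slice of the ring, so the slices can be scheduled one at a time instead of repairing a node's whole primary range in a single run.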

Query around Data Modelling -2

2022-06-30 Thread MyWorld
Hi all, Another query around data modelling. We have an existing table with the structure below: Table(PK, CK, col1, col2, col3, col4, col5) Now each PK here has 1k - 10k clustering keys. Each PK has a size from 10MB to 80MB. We have 100+ million partitions overall. Also we have set levelled