Re: cold vs hot data

2018-09-13 Thread Ben Slater
Not quite a solution, but you will probably be interested in the discussion on this ticket: https://issues.apache.org/jira/browse/CASSANDRA-8460 On Fri, 14 Sep 2018 at 10:46 Alaa Zubaidi (PDF) wrote: > Hi, > > We are using Apache Cassandra 3.11.2 on RedHat 7 > The data can grow to +100TB however

cold vs hot data

2018-09-13 Thread Alaa Zubaidi (PDF)
Hi, We are using Apache Cassandra 3.11.2 on RedHat 7. The data can grow to +100TB; however, the hot data will in most cases be less than 10TB, but we still need to keep the rest of the data accessible. Has anyone else had this problem? What is the best way to make the cluster more efficient? Is there a way to
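One pattern that often comes up for this kind of hot/cold split (separate from the tiered-storage idea discussed in CASSANDRA-8460) is time-bucketed partitioning with TimeWindowCompactionStrategy, so that reads for recent data only touch a handful of recent SSTables while older SSTables sit untouched on disk. A minimal sketch, with a hypothetical keyspace, table, and columns:

    -- Hypothetical time-series table: the day bucket in the partition key
    -- keeps "hot" reads confined to a few recent SSTables, while "cold"
    -- SSTables remain on disk and are only read when explicitly queried.
    CREATE TABLE metrics.readings (
        sensor_id text,
        day       date,
        ts        timestamp,
        value     double,
        PRIMARY KEY ((sensor_id, day), ts)
    ) WITH CLUSTERING ORDER BY (ts DESC)
      AND compaction = {
        'class': 'TimeWindowCompactionStrategy',
        'compaction_window_unit': 'DAYS',
        'compaction_window_size': '1'
      };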

Re: Corrupt insert during ALTER TABLE add

2018-09-13 Thread Max C.
Yep, that’s the problem! Thanks Jeff (and Alex Petrov for fixing it). - Max > On Sep 13, 2018, at 1:24 pm, Jeff Jirsa wrote: > > CASSANDRA-13004 (fixed in recent 3.0 and 3.11 builds)

Re: Corrupt insert during ALTER TABLE add

2018-09-13 Thread Jeff Jirsa
CASSANDRA-13004 (fixed in recent 3.0 and 3.11 builds) On Thu, Sep 13, 2018 at 1:12 PM Max C. wrote: > I ran “alter table” today to add the “task_output_capture_state” column > (see below), and we found that a few rows inserted around the time of the ALTER > TABLE did not contain the same values when

Re: Corrupt insert during ALTER TABLE add

2018-09-13 Thread Max C.
Correction — we’re running C* 3.0.8. DataStax Python driver 3.4.1. > On Sep 13, 2018, at 1:11 pm, Max C. wrote: > > I ran “alter table” today to add the “task_output_capture_state” column (see > below), and we found a few rows inserted around the time of the ALTER TABLE > did not contain the

Corrupt insert during ALTER TABLE add

2018-09-13 Thread Max C.
I ran “alter table” today to add the “task_output_capture_state” column (see below), and we found that a few rows inserted around the time of the ALTER TABLE did not contain the same values when selected as when they were inserted. When the row was selected, what we saw was: - test_id —> OK (same as
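For context, a hedged reconstruction of the kind of schema change described above; only the column name task_output_capture_state is taken from the post, and the keyspace, table name, and column type are assumptions:

    -- Hypothetical reconstruction of the ALTER TABLE described in the post;
    -- the keyspace/table names and the column type are assumptions.
    ALTER TABLE myks.tasks ADD task_output_capture_state text;

    -- Rows written concurrently with the schema change were later read back
    -- with values that did not match what was inserted; this is the symptom
    -- tracked as CASSANDRA-13004, fixed in later 3.0.x and 3.11.x releases.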

Re: Large partitions

2018-09-13 Thread Jonathan Haddad
It depends on a number of factors, such as compaction strategy and read patterns. I recommend sticking to the 100MB per partition limit (and I aim for significantly less than that). If you're doing time series with TWCS & TTL'ed data and small enough windows, and you're only querying for a small
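To make the TWCS-and-TTL point above concrete, a small sketch reusing the hypothetical metrics.readings table from the earlier example (the TTL and query values are illustrative only): expired data ages out with its SSTable, and queries bounded to a narrow, recent time range only read a few recent windows.

    -- Hypothetical: give the table a default TTL so whole SSTables expire
    -- together under TWCS (7 days, expressed in seconds).
    ALTER TABLE metrics.readings
        WITH default_time_to_live = 604800;

    -- Query only a small, recent slice of one partition, so only the most
    -- recent compaction windows are touched.
    SELECT ts, value FROM metrics.readings
        WHERE sensor_id = 'sensor-42'
          AND day = '2018-09-13'
          AND ts >= '2018-09-13 00:00:00+0000';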

Re: Large partitions

2018-09-13 Thread Mun Dega
I disagree. We had several partitions over 150MB in 3.11, and we were able to break the cluster doing reads/writes against these partitions within a short period of time. On Thu, Sep 13, 2018, 12:42 Gedeon Kamga wrote: > Folks, > > Based on the information found here >

Re: Large partitions

2018-09-13 Thread Alexander Dejanovski
Hi Gedeon, you should check out Robert Stupp's 2016 talk about large partitions: https://www.youtube.com/watch?v=N3mGxgnUiRY Cheers, On Thu, Sep 13, 2018 at 6:42 PM Gedeon Kamga wrote: > Folks, > > Based on the information found here >

Large partitions

2018-09-13 Thread Gedeon Kamga
Folks, Based on the information found here https://docs.datastax.com/en/dse-planning/doc/planning/planningPartitionSize.html, the recommended limit for a partition size is 100MB. Even though DataStax clearly states that this is a rule of thumb, some team members are claiming that our Cassandra
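The 100MB figure is easier to reason about with a rough estimate: if each row takes on the order of 200 bytes on disk, a partition stays under ~100MB only while it holds fewer than roughly 500,000 rows. A common way to enforce that is to add a bucket component to the partition key. A minimal sketch, with hypothetical keyspace, table, and column names and an assumed row size:

    -- Hypothetical bucketed table: ~200 bytes/row * 500,000 rows ~= 100MB,
    -- so the bucket (e.g. a month number) is chosen to keep each partition
    -- well below that row count.
    CREATE TABLE app.events_by_user (
        user_id  uuid,
        bucket   int,
        event_ts timestamp,
        payload  text,
        PRIMARY KEY ((user_id, bucket), event_ts)
    ) WITH CLUSTERING ORDER BY (event_ts DESC);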

RE: Cassandra 2.2.7 Compaction after Truncate issue

2018-09-13 Thread David Payne
The truncation was performed via OpsCenter, which I believe uses CL ALL by default. From: Rahul Singh Sent: Thursday, August 23, 2018 6:55 PM To: user@cassandra.apache.org Subject: Re: Cassandra 2.2.7 Compaction after Truncate issue David, What CL do you set when running this command? Rahul Singh
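For reference, when a truncation is issued from cqlsh rather than OpsCenter, the session consistency level the thread is asking about can be set explicitly before the command. A small sketch with placeholder keyspace and table names:

    -- cqlsh session: set the consistency level explicitly, then truncate.
    -- Keyspace/table names are placeholders.
    CONSISTENCY ALL;
    TRUNCATE myks.mytable;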

RE: Cassandra 2.2.7 Compaction after Truncate issue

2018-09-13 Thread David Payne
I was able to resolve the issue with a rolling restart of the cluster. From: James Shaw Sent: Thursday, August 23, 2018 7:52 PM To: user@cassandra.apache.org Subject: Re: Cassandra 2.2.7 Compaction after Truncate issue you may go to the OS level and delete the files. That's what I did before. Truncate

Re: Read timeouts when performing rolling restart

2018-09-13 Thread Riccardo Ferrari
Hi Shalom, It happens at almost every restart, either a single node or a rolling one. I do agree with you that it is good, at least on my setup, to wait a few minutes to let the rebooted node cool down before moving to the next. The more I look at it, the more I think it is something coming from

Re: Read timeouts when performing rolling restart

2018-09-13 Thread shalom sagges
Hi Riccardo, Does this issue occur when performing a single restart, or after several restarts during a rolling restart (as mentioned in your original post)? We have a cluster where, when performing a rolling restart, we prefer to wait ~10-15 minutes between each restart because we see an increase