Re: Stop long running queries in Cassandra 3.11.x or Cassandra 4.x

2021-10-13 Thread S G
I see. Thanks Jeff ! On Wed, Oct 13, 2021 at 2:25 PM Jeff Jirsa wrote: > Convention in the yaml is default being visible commented out. > > > On Wed, Oct 13, 2021 at 2:17 PM S G wrote: > >> ok, the link given has the value commented, so I was a bit confused. >> But then https://github.com/apach

Re: Single node slowing down queries in a large cluster

2021-10-13 Thread Jeff Jirsa
Some random notes, not necessarily going to help you, but: - You probably have vnodes enabled, which means one bad node is PROBABLY a replica of almost every other node, so the fanout here is worse than it should be, and - You probably have speculative retry on the table set to a percentile. As the

Re: Stop long running queries in Cassandra 3.11.x or Cassandra 4.x

2021-10-13 Thread Jeff Jirsa
Convention in the yaml is default being visible commented out. On Wed, Oct 13, 2021 at 2:17 PM S G wrote: > ok, the link given has the value commented, so I was a bit confused. > But then https://github.com/apache/cassandra/search?q=cross_node_timeout > shows that default value is indeed true.

Re: Stop long running queries in Cassandra 3.11.x or Cassandra 4.x

2021-10-13 Thread S G
ok, the link given has the value commented, so I was a bit confused. But then https://github.com/apache/cassandra/search?q=cross_node_timeout shows that default value is indeed true. Thanks for the help, On Wed, Oct 13, 2021 at 11:26 AM Jeff Jirsa wrote: > The default is true: > > https://github

Single node slowing down queries in a large cluster

2021-10-13 Thread S G
Hello, We have frequently seen that a single bad node running slow can affect the latencies of the entire cluster (especially for queries where the slow node was acting as a coordinator). Is there any suggestion to avoid this behavior? Like something on the client side to not query that bad nod

Re: Stop long running queries in Cassandra 3.11.x or Cassandra 4.x

2021-10-13 Thread Jeff Jirsa
The default is true: https://github.com/apache/cassandra/blob/trunk/conf/cassandra.yaml#L1000 There is no equivalent to `alter system kill session`, because it is assumed that any query has a short, finite life in the order of seconds. On Wed, Oct 13, 2021 at 11:10 AM S G wrote: > Hello, > >

Re: Stop long running queries in Cassandra 3.11.x or Cassandra 4.x

2021-10-13 Thread S G
Hello, Does anyone know about the default being turned off for this setting? It seems like a good one to be turned on - why have replicas process something for which coordinator has already sent the timeout to client? Thanks On Tue, Oct 12, 2021 at 11:06 AM S G wrote: > Thanks Bowen. > Any ide

Re: Schema collision results in multiple data directories per table

2021-10-13 Thread Jeff Jirsa
I've described this race a few times on the list. It is very, very dangerous to do concurrent table creation in Cassandra with non-deterministic CFIDs. I'll try to describe it quickly right now: - Imagine you have 3 hosts, A, B and C. You connect to A and issue a "CREATE TABLE ... IF NOT EXISTS". A al
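
The hazard named in the preview above (concurrent table creation with non-deterministic CFIDs) can be illustrated with a minimal sketch. It assumes each coordinator picks a random UUID as the table ID when it handles a CREATE TABLE, and that the on-disk directory name includes that ID. The helper names (`random_table_id`, `deterministic_table_id`, `concurrent_create`) and the keyspace/table names are hypothetical and are not Cassandra APIs; the name-based variant only shows why an ID derived from the names alone would not diverge.

```python
import uuid

def random_table_id() -> uuid.UUID:
    # Hypothetical stand-in for a coordinator choosing a random CFID.
    return uuid.uuid4()

def deterministic_table_id(keyspace: str, table: str) -> uuid.UUID:
    # Hypothetical alternative: derive the ID from the names, so all nodes agree.
    return uuid.uuid5(uuid.NAMESPACE_OID, f"{keyspace}.{table}")

def concurrent_create(make_id, coordinators=("A", "B")):
    # Each coordinator handles the same CREATE TABLE before it has
    # seen the other's schema change, so each assigns its own ID.
    return {node: make_id() for node in coordinators}

random_ids = concurrent_create(random_table_id)
stable_ids = concurrent_create(lambda: deterministic_table_id("ks", "tbl"))

# Number of distinct IDs (and hence data directories) for the one logical table.
print("random IDs:", len(set(random_ids.values())))
print("deterministic IDs:", len(set(stable_ids.values())))
```

With random IDs the two coordinators end up disagreeing about the table's identity, which matches the "multiple data directories per table" symptom in the thread title; the sketch does not model schema propagation or how Cassandra resolves the conflict.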

Re: Schema collision results in multiple data directories per table

2021-10-13 Thread vytenis silgalis
You ran the `alter keyspace` command on the original dc1 nodes or the new dc2 nodes? On Wed, Oct 13, 2021 at 8:15 AM Stefan Miklosovic < stefan.mikloso...@instaclustr.com> wrote: > Hi Tom, > > while I am not completely sure what might cause your issue, I just > want to highlight that schema agree

Re: Schema collision results in multiple data directories per table

2021-10-13 Thread Stefan Miklosovic
Hi Tom, while I am not completely sure what might cause your issue, I just want to highlight that schema agreements were overhauled in 4.0 (1) a lot so that may be somehow related to what that ticket was trying to fix. Regards (1) https://issues.apache.org/jira/browse/CASSANDRA-15158 On Fri, 1

RE: Trouble After Changing Replication Factor

2021-10-13 Thread Isaeed Mohanna
Hi again, I did run repair -full without any parameters, which I understood would run repair for all keyspaces, but I do not recall seeing validation tasks running on one of my two main keyspaces with most data. Maybe it failed or didn’t run. Anyhow I tested with a small app on a small table that I