Re: Documentation about TTL and tombstones

2024-03-17 Thread Gil Ganz
It's actually correct to do it the way it is done today. Insertion date does not matter; what matters is the time after which tombstones are supposed to be deleted. If the delete got to all nodes, sure, no problem, but if any of the nodes didn't get the delete, and you would get rid of the tombstones before

Re: Documentation about TTL and tombstones

2024-03-16 Thread Gil Ganz
That's not how gc_grace_seconds works. gc_grace_seconds controls how long *after* a tombstone is created it can actually be deleted, in order to give you enough time to run repairs. Say you have data that is about to expire on March 16 8am, and gc_grace_seconds is 10 days. After Mar 16
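The arithmetic in the example above can be sketched roughly as follows. This is only an illustration: the year 2024 (taken from the thread date) and the UTC timezone are assumptions, and `earliest_purge_time` is a hypothetical helper name, not a Cassandra API.

```python
from datetime import datetime, timedelta, timezone


def earliest_purge_time(expires_at: datetime, gc_grace_seconds: int) -> datetime:
    """Earliest moment a tombstone created at `expires_at` may be purged."""
    return expires_at + timedelta(seconds=gc_grace_seconds)


# Assumed values: the year and timezone are illustrative, not from the message.
expires_at = datetime(2024, 3, 16, 8, 0, tzinfo=timezone.utc)
gc_grace_seconds = 10 * 24 * 60 * 60  # 10 days expressed in seconds

print(earliest_purge_time(expires_at, gc_grace_seconds).isoformat())
# 2024-03-26T08:00:00+00:00
```

So under these assumptions the expired data stays on disk as a tombstone until at least March 26 8am, which is the window available for repairs to run.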

Re: Optimization for partitions with high number of rows

2023-04-17 Thread Gil Ganz
the old table into a single row in the new table, the > read speed should be much faster. Keep in mind that this may not work if > the partition size of the original table is too large (approximately > >16MB), as the mutation size is limited to up to half of the commitlog >
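A rough sketch of the size limit mentioned in the quoted reply, assuming the default commitlog segment size of 32 MB (which gives a mutation limit of about 16 MB). The function names are hypothetical, and the real serialized size of a mutation will differ from the raw partition size.

```python
MIB = 1024 * 1024


def max_mutation_bytes(commitlog_segment_mib: int = 32) -> int:
    """Mutation limit: half of the commitlog segment size (32 MB assumed here)."""
    return commitlog_segment_mib * MIB // 2


def fits_in_single_mutation(partition_bytes: int, commitlog_segment_mib: int = 32) -> bool:
    """True if a partition of this size could be rewritten as one row in one mutation."""
    return partition_bytes <= max_mutation_bytes(commitlog_segment_mib)


print(max_mutation_bytes())               # 16777216
print(fits_in_single_mutation(4 * MIB))   # True
print(fits_in_single_mutation(20 * MIB))  # False
```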

Re: Optimization for partitions with high number of rows

2023-04-11 Thread Gil Ganz
f you > want it to be faster, you can store the number of rows elsewhere if that's > the only thing you need. > On 11/04/2023 07:13, Gil Ganz wrote: > > Hey > I have a 4.0.4 cluster, with reads of partitions that are a bit on the > bigger side, taking longer than I would expect. > Read

Optimization for partitions with high number of rows

2023-04-11 Thread Gil Ganz
Hey I have a 4.0.4 cluster, with reads of partitions that are a bit on the bigger side, taking longer than I would expect. Reading entire partition that has ~7 rows, total partition size of 4mb, takes 120ms, I would expect it to take less. This is after major compaction, so there is only one

Re: When are sstables that were compacted deleted?

2023-04-04 Thread Gil Ganz
> - Index build/rebuild > - Streaming (repair, bootstrap, move, decom) > > If you have repairs running, you can try pausing/cancelling them and/or > stopping validation/index_build compactions. > > > > On Tue, Apr 4, 2023 at 2:29 PM Gil Ganz wrote: > >> If i

Re: When are sstables that were compacted deleted?

2023-04-04 Thread Gil Ganz
. > > > > On Tue, Apr 4, 2023 at 1:55 PM Gil Ganz wrote: > >> More information - from another node in the cluster >> >> I can see many txn files although I only have two compactions running. >> [user@server808 new_table-44263b406bf111ed8bd9

Re: When are sstables that were compacted deleted?

2023-04-04 Thread Gil Ganz
80ace3677a/nb-10334-big-Data.db (deleted) I will add that this cluster contains a single table with twcs. gil On Tue, Apr 4, 2023 at 10:14 AM Gil Ganz wrote: > Hey > I have a 4.0.7 cluster in which I see weird behavior. > I expect that once compaction finishes, the old ssta

When are sstables that were compacted deleted?

2023-04-04 Thread Gil Ganz
Hey I have a 4.0.7 cluster in which I see weird behavior. I expect that once compaction finishes, the old sstables that were part of the compaction set will be deleted, but it appears they are deleted much later, thus causing space issues. For example this is what I have in the log, only one

Re: cassandra 4.0.6 files removed from archive

2022-10-24 Thread Gil Ganz
Thanks Erick, indeed with curl with the redirect flag I can see the file there. On Mon, Oct 24, 2022 at 8:21 AM Erick Ramirez wrote: > redhat.cassandra.apache.org/40x/ redirects to > apache.jfrog.io/artifactory/cassandra-rpm/40x/. When I curl it on the > command line, I can see that the

cassandra 4.0.6 files removed from archive

2022-10-23 Thread Gil Ganz
Hey Seems like following the release of 4.0.7 a few hours ago, repo settings were changed a bit. Where can one download the 4.0.6 cassandra-tools rpm from? It's not in https://apache.jfrog.io/ui/native/cassandra-rpm/40x/ and https://redhat.cassandra.apache.org/40x/ points to a login page with

Local reads metric

2022-09-17 Thread Gil Ganz
Hey Are reads that come from a read repair somehow counted as part of the local read metric? i.e. org.apache.cassandra.metrics.Table... : ReadLatency.1m_rate Version is 4.0.4 Gil

Re: Bootstrap data streaming order

2022-09-12 Thread Gil Ganz
hey do. > > > > Similarly, when the cluster was created, I added the seeds nodes in > numerically ascending order and then the other nodes in a similar fashion. > So why doesn’t nodetool display the status in that same order? > > > > *From:* Gil Ganz > *Sent:* Monday,

Re: Bootstrap data streaming order

2022-09-12 Thread Gil Ganz
I can understand why the number of nodes sending at once might be interesting somehow, but why would the order of the nodes matter? On Fri, Sep 9, 2022 at 10:27 AM Marc Hoppins wrote: > Curiosity as to which data/node starts first, what determines the delivery > sequence, how many nodes send

Re: Compaction task priority

2022-09-04 Thread Gil Ganz
except stopping >> big/unnecessary compactions manually (using nodetool stop) whenever they >> appear by some shell scripts (using crontab) >> >> Sent using Zoho Mail <https://www.zoho.com/mail/> >> >> >> >> On Fri, 02 Sep 2022 10:59:22 +04

Compaction task priority

2022-09-02 Thread Gil Ganz
Hey When deciding which sstables to compact together, how is the priority determined between tasks, and can I do something about it? In some cases (mostly after removing a node), it takes a while for compactions to keep up with the new data that came from removed nodes, and I see it is busy on

Re: netty connection reset by peer errors in logs

2022-09-01 Thread Gil Ganz
The reason I would like to suppress it is that I think this is due to network disconnects we know are happening, and it looks like that's not going to change. Since it doesn't happen that often and is not causing a real issue, I would like to have cleaner logs if possible. On Thu, Sep 1, 2022 at 9:55 AM Erick

netty connection reset by peer errors in logs

2022-09-01 Thread Gil Ganz
Hey We have an issue in a few of our 4.0.4 clusters; these are on-prem clusters spanning multiple datacenters around the world, and our logs have many errors like this: ERROR [Messaging-EventLoop-3-26] 2022-09-01 05:57:28,142 InboundMessageHandler.java:300 -

Re: Gossip issues after upgrading to 4.0.4

2022-06-07 Thread Gil Ganz
as a regression). > > > > On Tue, Jun 7, 2022 at 7:52 AM Gil Ganz wrote: > >> Yes, I know the issue with the peers table, we had it in different >> clusters, in this case it appears the cause of the problem was indeed a bad >> ip in the seed list. >> After removing it

Re: Gossip issues after upgrading to 4.0.4

2022-06-07 Thread Gil Ganz
ete from > system.peers_v2 where peer=..." to fix it, as on our client side, the > Python cassandra-driver, reads the token ring information from this table > and uses it for routing requests. > On 07/06/2022 05:22, Gil Ganz wrote: > > Only errors I see in the logs prior to goss

Re: Gossip issues after upgrading to 4.0.4

2022-06-06 Thread Gil Ganz
however I'm not aware of any open issues in 4.0.4 > that would result in this. > > Would be eager to investigate immediately if so. > > – Scott > > On Jun 6, 2022, at 11:04 AM, Gil Ganz wrote: > > > Hey > We have a big cluster (>500 nodes, onprem, multi

Gossip issues after upgrading to 4.0.4

2022-06-06 Thread Gil Ganz
Hey We have a big cluster (>500 nodes, onprem, multiple datacenters, most with vnodes=32, but some with 128), that was recently upgraded from 3.11.9 to 4.0.4. Servers are all centos 7. We have been dealing with a few issues related to gossip since : 1 - The moment the last node in the cluster was

Re: Running enablefullquerylog crashes cassandra

2022-02-09 Thread Gil Ganz
Feb 6, 2022, at 12:52 AM, Gil Ganz wrote: > >  > Hey > I'm trying to enable full query log on cassandra 4.01 node and it's > causing cassandra to shutdown > > nodetool enablefullquerylog --path /mnt/fql_data > > Cassandra has shutdown. > error: null

Running enablefullquerylog crashes cassandra

2022-02-06 Thread Gil Ganz
Hey I'm trying to enable full query log on a cassandra 4.01 node and it's causing cassandra to shut down nodetool enablefullquerylog --path /mnt/fql_data Cassandra has shutdown. error: null -- StackTrace -- java.io.EOFException at java.io.DataInputStream.readByte(DataInputStream.java:267)

Re: hints for a node that was removed from cluster

2021-09-12 Thread Gil Ganz
Looks like it's not happening, seen a couple of cases like this, in all cases the files belonged to nodes that died and had to be removed, and I can see this behaviour on all the nodes in the relevant data center. All these nodes had to be removed with force (not by choice..), I wonder if that

hints for a node that was removed from cluster

2021-09-12 Thread Gil Ganz
Hey What should happen to hints that were generated for a node that has been removed from the cluster, before hints were sent? I'm seeing old hint files that are not handled, they are not sent anywhere and not being deleted. Version is 3.11.9 gil

Re: Hints not being created

2021-05-30 Thread Gil Ganz
gc_grace_seconds set is the default 10 days (these tables do not have data with ttl, or anything else that might create tombstones), and max_hint_window_in_ms parameter is set to 18h. I'm checking it by looking at both the hints directory and hints metrics, and as far as check time, I took a node

Hints not being created

2021-05-27 Thread Gil Ganz
Hey I have a cluster, 3.11.9, in which I am enabling hints, but it appears that no hints are created when other nodes are down. I can see that hints are enabled by running "nodetool statushandoff", and gc_grace_seconds is high enough on all tables, so I'm expecting to see hints being created,

Re: counter cache loading very slow

2021-04-28 Thread Gil Ganz
re you determining it's counters that are the > problem? Is it stalling on the Initializing counters log line or something? > > raft.so - Cassandra consulting, support, and managed services > > > On Mon, Apr 26, 2021 at 3:25 AM Gil Ganz wrote: > >> Hey >> I have

counter cache loading very slow

2021-04-25 Thread Gil Ganz
Hey I have a 3.11.6 cluster where startup is very slow: an i3en.xlarge server with about 1tb of data takes 45 minutes to start up, and almost 40 minutes of that is loading the saved counter cache from disk (200mb). I can see that in these 40 minutes the amount of data read from disk is very high, up to

Re: Node removal causes spike in pending native-transport requests and clients suffer

2021-03-12 Thread Gil Ganz
n their case. > > Is that your case too? Bigger RAM, more cores and higher CPU frequency to > help "fix" the performance issue? I really hope not. > > > On 11/03/2021 09:57, Gil Ganz wrote: > > Yes. 192gb. > > On Thu, Mar 11, 2021 at 10:29 AM Kane Wilson >

Re: Node removal causes spike in pending native-transport requests and clients suffer

2021-03-11 Thread Gil Ganz
Yes. 192gb. On Thu, Mar 11, 2021 at 10:29 AM Kane Wilson wrote: > That is a very large heap. I presume you are using G1GC? How much memory > do your servers have? > > raft.so - Cassandra consulting, support, managed services > > On Thu., 11 Mar. 2021, 18:29 Gil Ganz, wr

Re: Node removal causes spike in pending native-transport requests and clients suffer

2021-03-10 Thread Gil Ganz
re seeing is the increased NTR > requests? > > raft.so - Cassandra consulting, support, and managed services > > > On Mon, Mar 8, 2021 at 10:47 PM Gil Ganz wrote: > >> >> Hey, >> We have a 3.11.9 cluster (recently upgraded from 2.1.14), and after the >

Node removal causes spike in pending native-transport requests and clients suffer

2021-03-08 Thread Gil Ganz
Hey, We have a 3.11.9 cluster (recently upgraded from 2.1.14), and after the upgrade we have an issue when we remove a node. The moment I run the removenode command, 3 servers in the same dc start to have a high amount of pending native-transport-requests (getting to around 1M) and clients are

Re: Cassandra on arm aws instances

2021-03-03 Thread Gil Ganz
I think the value of the r6gd (assuming cpu is good compared to intel) is more cpu, not disk. I'm not running spark on the cassandra servers, and having more cpu cores in my cluster will surely help. It all depends on the workloads, some workloads need more io, some cpu. i3 servers are great

Re: Cassandra on arm aws instances

2021-03-01 Thread Gil Ganz
it's not the same, notice I wrote r6gd, these are the ones with nvme, i'm looking just at those. I do not need all the space that i3en gives me (and probably won't be able to use it all due to memory usage, or have other issues just like you mention), so the plan is use the big enough r6gd nodes,

Cassandra on arm aws instances

2021-02-28 Thread Gil Ganz
Hey Does anyone have experience with running cassandra in aws on arm servers? Currently running on i3 servers and thinking about upgrading, looking at i3en vs r6gd. Obviously a lot of considerations here, but trying to understand what's the verdict on these arm processors running cassandra. Gil

Re: Understanding which table had digest mismatch

2021-02-27 Thread Gil Ganz
Eric - I understand, thing is repairs are causing issues and I would like to have the option to only run them on the tables that really need that. Kane - Thanks, good idea, I will check that metric. On Fri, Feb 26, 2021 at 12:07 AM Kane Wilson wrote: > You should be able to use the Table

Understanding which table had digest mismatch

2021-02-25 Thread Gil Ganz
Hey I'm running cassandra 3.11.9 and I have a lot of messages like this: DEBUG [ReadRepairStage:2] 2021-02-25 16:41:11,464 ReadCallback.java:244 - Digest mismatch: org.apache.cassandra.service.DigestMismatchException: Mismatch for key DecoratedKey(4059620144736691554,

Re: Upgrading cassandra cluster from 2.1 to 3.X when using custom TWCS

2020-07-13 Thread Gil Ganz
an opportunity to brush up my java skills ;) On Thu, Jul 9, 2020 at 7:59 PM Jeff Jirsa wrote: > > > On Thu, Jul 9, 2020 at 9:02 AM Oleksandr Shulgin < > oleksandr.shul...@zalando.de> wrote: > >> On Thu, Jul 9, 2020 at 5:54 PM Gil Ganz wrote: >> >> Another questio

Re: Upgrading cassandra cluster from 2.1 to 3.X when using custom TWCS

2020-07-09 Thread Gil Ganz
Great, thank you very much! On Thu, Jul 9, 2020 at 7:02 PM Oleksandr Shulgin < oleksandr.shul...@zalando.de> wrote: > On Thu, Jul 9, 2020 at 5:54 PM Gil Ganz wrote: > >> That sounds very interesting Alex, so just to be sure I understand, it >> was like this >> 1

Re: Upgrading cassandra cluster from 2.1 to 3.X when using custom TWCS

2020-07-09 Thread Gil Ganz
the built in one? Another question, did changing the compaction strategy from one twcs to the other trigger merging of old sstables? gil On Thu, Jul 9, 2020 at 4:51 PM Oleksandr Shulgin < oleksandr.shul...@zalando.de> wrote: > On Thu, Jul 9, 2020 at 3:38 PM Gil Ganz wrote: > >

Upgrading cassandra cluster from 2.1 to 3.X when using custom TWCS

2020-07-09 Thread Gil Ganz
Hey I have a 2.11.4 cluster with tables that are defined with twcs, using jeff's jar https://github.com/jeffjirsa/twcs All working great, but now I want to upgrade to 3.11, and I have a problem, cassandra won't start, it fails on the following error ERROR [main] 2020-07-09 13:30:29,823

Re: Impact of enabling authentication on performance

2020-06-04 Thread Gil Ganz
ing authentication on > performance > > DSR> Set the Auth cache to a long validity > > DSR> Don’t go crazy with RF of system auth > > DSR> Drop bcrypt rounds if you see massive cpu spikes on reconnect storms > > > >> On Jun 1, 2020, at 11:26 PM, G

Impact of enabling authentication on performance

2020-06-02 Thread Gil Ganz
Hi I have a production 3.11.6 cluster in which I might want to enable authentication, and I'm trying to understand what the performance impact will be, if any. I understand each use case might be different; I'm trying to understand if there is a typical percentage performance hit people usually see, or if

Re: disable debug message on read repair

2020-03-10 Thread Gil Ganz
isable the debug.log one, or change > org.apache.cassandra.service to log at INFO instead > > Nobody needs to see every digest mismatch and that someone thought this > was a good idea is amazing to me. Someone should jira that to be a trace. > > > On Mar 8, 2020, at 3:25 AM, Gil

Re: disable debug message on read repair

2020-03-08 Thread Gil Ganz
r on your cluster. But if these messages come back > again, you need to check what's causing these data inconsistencies. > > > On Sun, Mar 8, 2020 at 10:11 AM Gil Ganz wrote: > >> Hey all >> I have a lot of debug message about read repairs in my debug log : >> >&

disable debug message on read repair

2020-03-08 Thread Gil Ganz
Hey all I have a lot of debug messages about read repairs in my debug log: DEBUG [ReadRepairStage:346] 2020-03-08 08:09:12,959 ReadCallback.java:242 - Digest mismatch: org.apache.cassandra.service.DigestMismatchException: Mismatch for key DecoratedKey(-28476014476640,

Re: TWCS sstables gets merged following node removal

2018-12-20 Thread Gil Ganz
d". > > > > On Wed, Dec 19, 2018 at 3:40 AM Gil Ganz wrote: > >> sounds like the foreground read repair can cause issues to twcs (mix old >> and new data in same sstable), is there a way to disable the foreground >> read repair? is that indeed the case that i

Re: TWCS sstables gets merged following node removal

2018-12-19 Thread Gil Ganz
this (again, this is 2.1, with your jar, might be something specifically there). thanks gil On Wed, Dec 19, 2018 at 1:39 PM Gil Ganz wrote: > sounds like the foreground read repair can cause issues to twcs (mix old > and new data in same sstable), is there a way to disable the foreground > re

Re: TWCS sstables gets merged following node removal

2018-12-19 Thread Gil Ganz
sounds like the foreground read repair can cause issues to twcs (mix old and new data in same sstable), is there a way to disable the foreground read repair? is that indeed the case that it's problematic? On Mon, Dec 17, 2018 at 9:21 AM Gil Ganz wrote: > hey jeff, attaching more informat