Re: nodetool load does not match du

2020-02-03 Thread Erick Ramirez
> I thought that the snapshot size was not counted in the load. > That's correct. I suggested looking at what nodetool tablestats reports so you can compare that against du/df outputs for clues as to why there is such a large discrepancy. Cheers!

Re: nodetool load does not match du

2020-02-03 Thread Sergio
- The amount of file system data under the Cassandra data directory after excluding all content in the snapshots subdirectories. Because all SSTable data files are included, any data that is not cleaned up (such as TTL-expired cells or tombstoned data) is counted.
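The definition quoted above can be approximated from the filesystem side. What follows is a minimal sketch, not how nodetool itself computes Load: it assumes snapshot content lives in directories literally named `snapshots`, and the data directory path in the usage line is a hypothetical example. It sums apparent file sizes, so the result can still differ from `du`, which reports allocated blocks.

```python
import os

def data_size_excluding_snapshots(data_dir, skip_name="snapshots"):
    """Sum apparent sizes of regular files under data_dir,
    skipping every directory named skip_name (assumed snapshot dirs)."""
    total = 0
    for root, dirs, files in os.walk(data_dir):
        # Prune in place so os.walk never descends into snapshot directories.
        dirs[:] = [d for d in dirs if d != skip_name]
        for name in files:
            path = os.path.join(root, name)
            if os.path.islink(path):
                continue  # ignore symlinks, count only real files
            try:
                total += os.path.getsize(path)
            except OSError:
                pass  # file vanished (e.g. compaction) between listing and stat
    return total

if __name__ == "__main__":
    # Hypothetical path; substitute the data directory configured on the node.
    size = data_size_excluding_snapshots("/var/lib/cassandra/data")
    print(f"{size / 1024**3:.1f} GiB excluding snapshots")
```

Comparing this figure with the Load value and with `du -sh` on the same directory should show how much of the gap the snapshots account for.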

Re: nodetool load does not match du

2020-02-03 Thread Sergio
Thanks, Erick! I thought that the snapshot size was not counted in the load. On Mon, Feb 3, 2020 at 23:24, Erick Ramirez wrote: > Why the df -h and du -sh shows a big discrepancy? nodetool load is it >> computed with df -h? >> > > In Linux terms, df reports the filesystem disk

Re: [EXTERNAL] How to reduce vnodes without downtime

2020-02-03 Thread Sergio
After reading this *I would only consider moving a cluster to 4 tokens if it is larger than 100 nodes. If you read through the paper that Erick mentioned, written by Joe Lynch & Josh Snyder, they show that the num_tokens impacts the availability of large scale clusters.* and With 16 tokens,

Fwd: Re: [Discuss] num_tokens default in Cassandra 4.0

2020-02-03 Thread onmstester onmstester
Thank you so much. Forwarded message From: Max C. Date: Tue, 04 Feb 2020 08:37:21 +0330 Subject: Re: [Discuss] num_tokens default in Cassandra 4.0 Let’s say you have a 6

Re: nodetool load does not match du

2020-02-03 Thread Erick Ramirez
> > Why the df -h and du -sh shows a big discrepancy? nodetool load is it > computed with df -h? > In Linux terms, df reports the filesystem disk usage while du is an *estimate* of the file space usage. What that means is that the operating system uses different accounting between the two

nodetool load does not match du

2020-02-03 Thread Sergio Bilello
Hello! I was trying to understand the differences below: Cassandra 3.11.4, i3.xlarge AWS nodes $ du -sh /mnt 123G /mnt $ nodetool info ID : 3647fcca-688a-4851-ab15-df36819910f4 Gossip active : true Thrift active : true Native Transport active: true Load

Re: [Discuss] num_tokens default in Cassandra 4.0

2020-02-03 Thread Max C.
Let’s say you have a 6 node cluster, with RF=3, and no vnodes. In that case each piece of data is stored as follows:
N1: N2 N3
N2: N3 N4
N3: N4 N5
N4: N5 N6
N5: N6 N1
N6: N1 N2
With this setup, there are some circumstances where you could lose 2 nodes (ex: N1 & N4) and still be able to
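The placement described in this message can be checked by brute force. The sketch below is illustrative only: it assumes a single-token ring where each range is stored on its owner plus the next RF-1 nodes, and it assumes "still be able to" refers to serving QUORUM reads and writes (the consistency level asked about elsewhere in the thread). The function names are made up for this example.

```python
from itertools import combinations

def replica_sets(nodes, rf):
    """Each node's range is stored on that node and the next rf-1 nodes on the ring."""
    n = len(nodes)
    return [frozenset(nodes[(i + k) % n] for k in range(rf)) for i in range(n)]

def survivable_pairs(nodes, rf):
    """Pairs of nodes that can be down at once while every range keeps a quorum."""
    quorum = rf // 2 + 1
    sets = replica_sets(nodes, rf)
    ok = []
    for a, b in combinations(nodes, 2):
        down = {a, b}
        # Every replica set must still have at least `quorum` live members.
        if all(len(s - down) >= quorum for s in sets):
            ok.append((a, b))
    return ok

nodes = ["N1", "N2", "N3", "N4", "N5", "N6"]
print(survivable_pairs(nodes, rf=3))
# [('N1', 'N4'), ('N2', 'N5'), ('N3', 'N6')]
```

Only 3 of the 15 possible two-node failures leave every range with a quorum: the pairs that sit opposite each other on the ring and therefore never share a replica set.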

Re: Re: [Discuss] num_tokens default in Cassandra 4.0

2020-02-03 Thread Jeff Jirsa
The more vnodes you have on each host, the more likely it becomes that any 2 hosts are adjacent/neighbors/replicas. On Mon, Feb 3, 2020 at 8:39 PM onmstester onmstester wrote: > Sorry if its trivial, but i do not understand how num_tokens affects > availability, with RF=3, CLW,CLR=quorum, the
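The point about vnodes making any two hosts more likely to be replicas of each other can be explored with a small simulation. This is a rough sketch under simplifying assumptions: tokens are uniform random floats rather than Murmur3 tokens, replicas are the next distinct hosts clockwise on the ring (racks and datacenters are ignored), and all function names are hypothetical.

```python
import random
from itertools import combinations

def build_ring(hosts, num_tokens, rng):
    """Assign num_tokens random tokens to each host and return hosts in token order."""
    ring = [(rng.random(), h) for h in hosts for _ in range(num_tokens)]
    ring.sort()
    return [h for _, h in ring]

def replica_sets(ring, rf):
    """For each token, take the next rf distinct hosts walking clockwise."""
    n = len(ring)
    sets = set()
    for i in range(n):
        replicas = []
        j = i
        while len(replicas) < rf and j < i + n:
            host = ring[j % n]
            if host not in replicas:
                replicas.append(host)
            j += 1
        sets.add(frozenset(replicas))
    return sets

def fraction_of_pairs_sharing_data(hosts, num_tokens, rf, seed=0):
    """Fraction of host pairs that appear together in at least one replica set."""
    rng = random.Random(seed)
    sets = replica_sets(build_ring(hosts, num_tokens, rng), rf)
    pairs = list(combinations(hosts, 2))
    sharing = sum(1 for a, b in pairs if any(a in s and b in s for s in sets))
    return sharing / len(pairs)

hosts = [f"N{i}" for i in range(1, 7)]
for tokens in (1, 4, 16, 256):
    print(tokens, fraction_of_pairs_sharing_data(hosts, tokens, rf=3))
```

With one token per host on six hosts, the fraction is 0.8 for any ring order (12 of 15 pairs share a range). As the token count grows, the fraction tends toward 1.0, meaning almost any two simultaneous host failures hit a common replica set.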

Fwd: Re: [Discuss] num_tokens default in Cassandra 4.0

2020-02-03 Thread onmstester onmstester
Sorry if it's trivial, but I do not understand how num_tokens affects availability. With RF=3 and CLW,CLR=quorum, the cluster could tolerate losing at most one node, and all of the tokens assigned to that node would also be assigned to two other nodes no matter what num_tokens is, right? Sent

Re: Apache vs Datastax cassandra

2020-02-03 Thread Erick Ramirez
Adarsh, a very *friendly* note that anyone is more than welcome to ask questions -- in fact as a group it's encouraged -- but a *gentle reminder* that this mailing list is for open-source Apache Cassandra. By all means, feel free to respond and not saying at all that it's not allowed (I'm just

Re: [EXTERNAL] How to reduce vnodes without downtime

2020-02-03 Thread Sergio
Thanks Erick! Best, Sergio On Sun, Feb 2, 2020, 10:07 PM Erick Ramirez wrote: > If you are after more details about the trade-offs between different sized >> token values, please see the discussion on the dev mailing list: "[Discuss] >> num_tokens default in Cassandra 4.0 >>

Re: [EXTERNAL] How to reduce vnodes without downtime

2020-02-03 Thread Maxim Parkachov
Hi guys, thanks a lot for the useful tips. I obviously underestimated the complexity of such a change. Thanks again, Maxim.

Apache vs Datastax cassandra

2020-02-03 Thread Adarsh Kumar
Hello All, We have a product that uses Postgres/Cassandra as its datastore. We use both Apache and DataStax Cassandra depending on the client's requirements, but never got the chance to explore the exact differences between the two. Apart from OpsCenter and DSE Graph, want to know more from Cassandra