Re: Barman equivalent for Cassandra?

2021-03-12 Thread Erick Ramirez
I'm not familiar with Barman, but if you're looking for backup software for Cassandra, have a look at Medusa from The Last Pickle -- https://github.com/thelastpickle/cassandra-medusa/wiki. It's open source and is also used by https://k8ssandra.io/ -- the platform for deploying Cassandra on

Re: Barman equivalent for Cassandra?

2021-03-12 Thread Bowen Song
You can have a separate DC, so the physical destruction of an entire DC (such as a fire) will not result in data loss; you can turn on automatic snapshots on truncate & drop table to help prevent some data loss caused by bugs and human error; you can also have a cron job to take snapshots
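The cron-driven snapshot suggestion above could be sketched as a small script that cron invokes on each node. This is a minimal sketch, assuming `nodetool` is on the PATH of the cron user; the tag format and the keyspace name are hypothetical and not from the thread.

```python
import subprocess
from datetime import datetime, timezone


def snapshot_command(keyspace, now=None):
    """Build a `nodetool snapshot` command with a timestamped tag."""
    now = now or datetime.now(timezone.utc)
    tag = now.strftime("cron-%Y%m%dT%H%M%SZ")  # hypothetical tag format
    return ["nodetool", "snapshot", "-t", tag, "--", keyspace]


def take_snapshot(keyspace):
    """Run the snapshot on the local node; raises if nodetool fails."""
    subprocess.run(snapshot_command(keyspace), check=True)


# Example (hypothetical keyspace name), to be called from a cron-run script:
# take_snapshot("my_keyspace")
```

A snapshot is a set of hard links on the same node, so a script like this would still need a separate step to copy the snapshot files to a remote machine.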

Re: Barman equivalent for Cassandra?

2021-03-12 Thread Jonathan Lacefield
There is a community-delivered tool named Medusa that may have what you're looking for as well - https://cassandra.tools/medusa

Re: No node was available to execute query error

2021-03-12 Thread Bowen Song
The highlight is "millions of rows in a **single** query". Fetching that amount of data in a single query is bad because of the Java heap memory overhead. You can fetch millions of rows from Cassandra; just make sure you do that over thousands or millions of queries, not one single query. On
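One common way to spread a large read over many small queries is to split the token ring into sub-ranges and query each range separately. The following is a minimal sketch, assuming the cluster uses the default Murmur3Partitioner (tokens are signed 64-bit integers); the table name `my_ks.my_table` and partition key column `pk` are hypothetical, and the thread does not say which approach was actually used.

```python
MIN_TOKEN = -(2**63)      # lowest Murmur3 token
MAX_TOKEN = 2**63 - 1     # highest Murmur3 token


def split_token_ranges(n):
    """Split the full token ring into n contiguous (start, end) ranges."""
    if n < 1:
        raise ValueError("n must be at least 1")
    span = MAX_TOKEN - MIN_TOKEN
    bounds = [MIN_TOKEN + (span * i) // n for i in range(n + 1)]
    return list(zip(bounds[:-1], bounds[1:]))


def range_queries(table, pk, n):
    """Build one CQL query string per token range."""
    queries = []
    for i, (start, end) in enumerate(split_token_ranges(n)):
        # The first range includes its lower bound; later ranges exclude it
        # so that no token is read twice.
        lower = ">=" if i == 0 else ">"
        queries.append(
            f"SELECT * FROM {table} "
            f"WHERE token({pk}) {lower} {start} AND token({pk}) <= {end}"
        )
    return queries


# Hypothetical usage: 1000 small queries instead of one huge one.
queries = range_queries("my_ks.my_table", "pk", 1000)
```

Each generated query touches only a slice of the ring, so the per-query result set and the heap pressure on the coordinator stay small.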

Re: No node was available to execute query error

2021-03-12 Thread Paul Chandler
Hi Joe, this could also be caused by the replication factor of the keyspace: if you use NetworkTopologyStrategy and it doesn’t list a replication factor for the datacenter datacenter1, then you will get this error message too. Paul > On 12 Mar 2021, at 13:07, Erick Ramirez wrote: > > Does
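The condition Paul describes can be checked mechanically from the keyspace's replication map. This is a minimal sketch, assuming the map is given as a plain dictionary in the form shown by `DESCRIBE KEYSPACE`; the function name is hypothetical, and it only tests the condition described above rather than reproducing any driver logic.

```python
def dc_has_replicas(replication, local_dc):
    """Return True if the replication map places replicas in local_dc."""
    # Accept both short and fully qualified strategy class names.
    strategy = replication.get("class", "").rsplit(".", 1)[-1]
    if strategy == "SimpleStrategy":
        # SimpleStrategy ignores datacenters entirely.
        return int(replication.get("replication_factor", 0)) > 0
    if strategy == "NetworkTopologyStrategy":
        # Replicas exist only in datacenters that are listed explicitly.
        return int(replication.get(local_dc, 0)) > 0
    raise ValueError(f"unrecognised replication strategy: {strategy!r}")


# A keyspace replicated only to "dc1" has no replicas in "datacenter1".
print(dc_has_replicas({"class": "NetworkTopologyStrategy", "dc1": "3"}, "datacenter1"))
```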

Re: No node was available to execute query error

2021-03-12 Thread Joe Obernberger
Thank you Paul and Erick.  The keyspace is defined like this: CREATE KEYSPACE doc WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '3'}  AND durable_writes = true; Would that cause this? The program that is having the problem selects data, calculates stuff, and

Re: No node was available to execute query error

2021-03-12 Thread Joe Obernberger
The queries that are failing are: select fieldvalue, count from doc.ordered_fieldcounts where source=? and fieldname=? limit 10 Created with: CREATE TABLE doc.ordered_fieldcounts (     source text,     fieldname text,     count bigint,     fieldvalue text,     PRIMARY KEY

Re: No node was available to execute query error

2021-03-12 Thread Joe Obernberger
One question on the 'millions rows in a single query'.  How would you process that many rows?  At some point, I'd like to be able to process 10-100 billion rows.  Isn't that something that can be done with Cassandra?  I'm coming from HBase where we'd run map reduce jobs. Thank you. -Joe

Re: No node was available to execute query error

2021-03-12 Thread Bowen Song
The fact that sleep-then-retry works is just another indicator that this is likely a GC-pause-related issue. I'd recommend checking your Cassandra servers' GC logs first. Do you know the maximum partition size for the doc.fieldcounts table? (Try the "nodetool cfstats doc.fieldcounts" command.) I

Re: No node was available to execute query error

2021-03-12 Thread Bowen Song
Millions of rows in a single query? That sounds like a bad idea to me. Your "NoNodeAvailableException" could be caused by stop-the-world GC pauses, and the GC pauses are likely caused by the query itself. On 12/03/2021 13:39, Joe Obernberger wrote: Thank you Paul and Erick.  The keyspace is

What Happened To Alternate Storage And Rocksandra?

2021-03-12 Thread Gareth Collins
Hi, I remember a couple of years ago there was some noise about Rocksandra (Cassandra using RocksDB for storage) and opening up Cassandra to alternate storage mechanisms. I haven't seen anything about it for a while now though. The last commit to Rocksandra on GitHub was in Nov 2019. The

Re: No node was available to execute query error

2021-03-12 Thread Joe Obernberger
Thank you very much for helping me out on this!  The table fieldcounts is currently pretty small - 6.4 million rows. cfstats are: Total number of tables: 81 Keyspace : doc         Read Count: 3713134         Read Latency: 0.2664131157130338 ms        

Re: What Happened To Alternate Storage And Rocksandra?

2021-03-12 Thread Elliott Sims
I'm not too familiar with the details on what's happened more recently, but I do remember that while Rocksandra was very favorably compared to Cassandra 2.x, the improvements looked fairly similar in nature and magnitude to what Cassandra got from the move to the 3.x sstable format and increased

Re: Node removal causes spike in pending native-transport requests and clients suffer

2021-03-12 Thread Gil Ganz
Hey Bowen, I agree it's better to have smaller servers in general; this is the smaller-servers version :) In this case, I wouldn't say the data model is bad, and we certainly do our best to tune everything so less hardware is needed. It's just that the data and amount of requests/s is very big to

Re: No node was available to execute query error

2021-03-12 Thread Bowen Song
The partition size min/avg/max of 8409008/15096925/25109160 bytes looks fine for the table fieldcounts, but the number of partitions is a bit worrying. Only 3 partitions? Are you expecting the partition size (instead of the number of partitions) to grow in the future? That can lead to a lot of

Re: What Happened To Alternate Storage And Rocksandra?

2021-03-12 Thread onmstester onmstester
Besides the enhancements at the storage layer, I think there are a couple of good ideas in RocksDB that could be used in Cassandra, like disabling sorting at the memtable-insert stage (writing data fast, like the commitlog) and only sorting the data when flushing/creating SST files.

Re: What Happened To Alternate Storage And Rocksandra?

2021-03-12 Thread Jeff Jirsa
As someone who watched some of the work (but wasn’t really involved), I think a bunch of it fizzled for various reasons. The rocks stuff was built (mostly? Completely?) by one company for their use case (the best kind of open source), but wasn’t in a form that was easy to commit upstream - the

No node was available to execute query error

2021-03-12 Thread Joe Obernberger
Hi All - I'm getting this error: Error: com.datastax.oss.driver.api.core.NoNodeAvailableException: No node was available to execute the query com.datastax.oss.driver.api.core.NoNodeAvailableException: No node was available to execute the query         at

Barman equivalent for Cassandra?

2021-03-12 Thread David Tinker
Hi Guys, I need to back up my 3-node Cassandra cluster to a remote machine. Is there a tool like Barman (a really nice streaming backup tool for PostgreSQL) for Cassandra? Or does everyone roll their own scripts using snapshots and so on? The data is on all 3 nodes, using about 900G of space on each.

Re: No node was available to execute query error

2021-03-12 Thread Erick Ramirez
Does it get returned by the driver every single time? The NoNodeAvailableException gets thrown when (1) all nodes are down, or (2) all the contact points are invalid from the driver's perspective. Is it possible there's no route/connectivity from your app server(s) to the 172.16.x.x network? If
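The connectivity question above can be tested from the application server with a plain TCP probe. This is a minimal sketch, assuming the nodes listen on the default native transport port 9042; the function name and the example address are hypothetical, and a successful TCP connection shows only that a route exists, not that the driver will accept the node.

```python
import socket


def can_connect(host, port=9042, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable networks.
        return False


# Hypothetical usage, run from the app server against each contact point:
# print(can_connect("172.16.0.10"))
```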