I'm not familiar with Barman but if you're looking for a backup software
for Cassandra, have a look at Medusa from The Last Pickle --
https://github.com/thelastpickle/cassandra-medusa/wiki.
It's open-source and is also used by https://k8ssandra.io/ -- the platform
for deploying Cassandra on Kubernetes.
You can have a separate DC, so the physical destruction of an entire DC
(such as in a fire) will not result in data loss; you can turn on
automatic snapshots on truncate and drop table to help prevent some data
loss caused by bugs and human error; and you can also have a cron job to
take snapshots.
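For the snapshot-related pieces above, the relevant knobs are `auto_snapshot` in cassandra.yaml and a scheduled `nodetool snapshot`. The schedule, tag, and keyspace name below are only illustrative:

```
# cassandra.yaml -- keep automatic snapshots on TRUNCATE/DROP enabled
auto_snapshot: true

# crontab entry (illustrative schedule): nightly snapshot of the doc
# keyspace, tagged with the date. Note that old snapshots are not
# removed automatically -- clear them with "nodetool clearsnapshot".
0 2 * * * nodetool snapshot -t nightly-$(date +\%Y\%m\%d) doc
```

Snapshots are hard links on the local disk, so you'd still need something like Medusa (or your own scripts) to ship them to a remote machine.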
There is a community delivered tool named Medusa that may have what you're
looking for as well - https://cassandra.tools/medusa
Jonathan Lacefield
The highlight is "millions of rows in a **single** query". Fetching that
amount of data in a single query is bad because of the Java heap memory
overhead. You can fetch millions of rows from Cassandra, just make sure
you do that over thousands or millions of queries, not one single query.
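To make the "many queries, not one" point concrete: the DataStax drivers already do this for you when you set a fetch/page size, pulling one page per network round trip instead of materializing the whole result set. Here is a minimal language-agnostic sketch of that pattern in Python; `fetch_page` is a stand-in for a single driver round trip, not a real driver call:

```python
def fetch_page(all_rows, paging_state, page_size):
    # Stand-in for one driver round trip: returns one page of rows plus
    # the state needed to request the next page (None when exhausted).
    end = paging_state + page_size
    next_state = end if end < len(all_rows) else None
    return all_rows[paging_state:end], next_state

def scan(all_rows, page_size=5000):
    # Iterate over all rows while holding at most one page in memory --
    # the way the drivers behave when a fetch/page size is set, rather
    # than pulling millions of rows onto the heap in one go.
    state = 0
    while state is not None:
        page, state = fetch_page(all_rows, state, page_size)
        for row in page:
            yield row

# Usage: a million "rows" processed 10,000 at a time.
total = sum(1 for _ in scan(range(1_000_000), page_size=10_000))
print(total)  # 1000000
```

The aggregate is computed incrementally, so memory stays bounded by the page size regardless of how many rows the scan covers.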
Hi Joe
This could also be caused by the replication factor of the keyspace: if you
have NetworkTopologyStrategy and it doesn't list a replication factor for the
datacenter datacenter1, then you will get this error message too.
Paul
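For reference, a keyspace definition that does name the datacenter explicitly looks like this. The keyspace name, DC name, and RF here are just examples; the DC name must match what `nodetool status` reports for your cluster:

```
ALTER KEYSPACE doc WITH replication = {
    'class': 'NetworkTopologyStrategy',
    'datacenter1': '3'
};
```

After changing replication settings, run a full repair on the keyspace so the data is redistributed to match the new strategy.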
> On 12 Mar 2021, at 13:07, Erick Ramirez wrote:
Thank you Paul and Erick. The keyspace is defined like this:
CREATE KEYSPACE doc WITH replication = {'class': 'SimpleStrategy',
'replication_factor': '3'} AND durable_writes = true;
Would that cause this?
The program that is having the problem selects data, calculates stuff,
and
The queries that are failing are:
select fieldvalue, count from doc.ordered_fieldcounts where source=? and
fieldname=? limit 10
Created with:
CREATE TABLE doc.ordered_fieldcounts (
   source text,
   fieldname text,
   count bigint,
   fieldvalue text,
   PRIMARY KEY
One question on the 'millions of rows in a single query'. How would you
process that many rows? At some point, I'd like to be able to process
10-100 billion rows. Isn't that something that can be done with
Cassandra? I'm coming from HBase where we'd run map reduce jobs.
Thank you.
-Joe
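On processing billions of rows: the usual Cassandra analog of an HBase map-reduce scan is to split the full table scan into token ranges and run those sub-scans in parallel across workers (this is what tools like the Spark Cassandra connector do under the hood). Assuming source and fieldname form the partition key of this table (the CREATE TABLE above is truncated, so that is an assumption), each worker would issue queries like:

```
SELECT source, fieldname, count, fieldvalue
FROM doc.ordered_fieldcounts
WHERE token(source, fieldname) > ?    -- start of this worker's token range
  AND token(source, fieldname) <= ?;  -- end of this worker's token range
```

Each range query is served by the replicas owning that slice of the ring, so the scan parallelizes across the cluster much like map tasks over HBase regions.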
The fact that sleep-then-retry works is just another indicator that it's
likely a GC-pause-related issue. I'd recommend you check your Cassandra
servers' GC logs first.
Do you know what's the maximum partition size for the doc.fieldcounts
table? (Try the "nodetool cfstats doc.fieldcounts" command) I
Millions of rows in a single query? That sounds like a bad idea to me. Your
"NoNodeAvailableException" could be caused by stop-the-world GC pauses,
and the GC pauses are likely caused by the query itself.
On 12/03/2021 13:39, Joe Obernberger wrote:
Hi,
I remember a couple of years ago there was some noise about Rocksandra
(Cassandra using rocksdb for storage) and opening up Cassandra to alternate
storage mechanisms.
I haven't seen anything about it for a while now though. The last commit to
Rocksandra on GitHub was in Nov 2019. The
Thank you very much for helping me out on this! The table fieldcounts
is currently pretty small - 6.4 million rows.
cfstats are:
Total number of tables: 81
Keyspace : doc
       Read Count: 3713134
       Read Latency: 0.2664131157130338 ms
I'm not too familiar with the details on what's happened more recently, but
I do remember that while Rocksandra was very favorably compared to
Cassandra 2.x, the improvements looked fairly similar in nature and
magnitude to what Cassandra got from the move to the 3.x sstable format and
increased
Hey Bowen
I agree it's better to have smaller servers in general, this is the smaller
servers version :)
In this case, I wouldn't say the data model is bad, and we certainly do our
best to tune everything so less hardware is needed.
It's just that the data and amount of requests/s is very big to
The partition size min/avg/max of 8409008/15096925/25109160 bytes looks
fine for the fieldcounts table, but the number of partitions is a bit
worrying. Only 3 partitions? Are you expecting the partition size
(instead of the number of partitions) to grow in the future? That can lead
to lots of
Besides the enhancements at the storage layer, I think there are a couple of
good ideas in RocksDB that could be used in Cassandra, like disabling
sorting at the memtable-insert stage (writing data fast, like the commitlog)
and only sorting the data when flushing/creating SSTable files.
As someone who watched some of the work (but wasn’t really involved), I think a
bunch of it fizzled for various reasons
The rocks stuff was built (mostly? Completely?) by one company for their use
case (the best kind of open source), but wasn’t in a form that was easy to
commit upstream - the
Hi All - I'm getting this error:
Error: com.datastax.oss.driver.api.core.NoNodeAvailableException: No
node was available to execute the query
com.datastax.oss.driver.api.core.NoNodeAvailableException: No node was
available to execute the query
       at
Hi Guys
I need to back up my 3-node Cassandra cluster to a remote machine. Is there
a tool like Barman (a really nice streaming backup tool for PostgreSQL) for
Cassandra? Or does everyone roll their own scripts using snapshots and so
on?
The data is on all 3 nodes using about 900G of space on each.
Does it get returned by the driver every single time? The
NoNodeAvailableException gets thrown when (1) all nodes are down, or (2)
all the contact points are invalid from the driver's perspective.
Is it possible there's no route/connectivity from your app server(s) to the
172.16.x.x network? If