Re: local read from coordinator

2020-11-10 Thread Alex Ott
The token-aware policy doesn't work for token range queries (at least in the Java driver 3.x). You need to force the driver to do the read using a specific token as a routing key. Here is a Java implementation of the token range scanning algorithm that Spark uses:
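A minimal sketch of that idea with the Java driver 3.x (keyspace, table, and column names are made up; routing each range's query to one of its replicas is the part you have to arrange yourself, since the token-aware policy won't do it for a token() predicate):

  import com.datastax.driver.core.*;
  import java.util.Set;

  public class TokenRangeScan {
      public static void main(String[] args) {
          try (Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
               Session session = cluster.connect()) {

              Metadata metadata = cluster.getMetadata();
              // Hypothetical keyspace/table/columns, just to show the shape of the query.
              PreparedStatement ps = session.prepare(
                  "SELECT pk, ts, value FROM ks.measurements " +
                  "WHERE token(pk) > ? AND token(pk) <= ?");

              for (TokenRange range : metadata.getTokenRanges()) {
                  // A range can wrap around the ring; unwrap() splits it into plain ranges.
                  for (TokenRange sub : range.unwrap()) {
                      // getReplicas() tells you which nodes own this range; to keep the read
                      // local you would direct the query at one of them (the part the
                      // token-aware policy does not do for you here).
                      Set<Host> replicas = metadata.getReplicas("ks", sub);

                      BoundStatement bs = ps.bind(
                          sub.getStart().getValue(),   // a Long with Murmur3Partitioner
                          sub.getEnd().getValue());
                      for (Row row : session.execute(bs)) {
                          // process row
                      }
                  }
              }
          }
      }
  }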

Re: local read from coordinator

2020-11-10 Thread Erick Ramirez
Yes, use a token-aware policy so the driver will pick a coordinator where the token (partition) exists. Cheers!

local read from coordinator

2020-11-10 Thread onmstester onmstester
Hi, I'm going to read all the data in the cluster as fast as possible. I'm aware that Spark could do such things out of the box, but I just wanted to do it at a low level to see how fast it could be. So: 1. retrieved partition keys on each node using nodetool ring token ranges and getting distinct
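For step 1, a per-token-range query for the distinct partition keys would look something like this (table and key names are placeholders; the bounds come from the ring's token ranges):

  SELECT DISTINCT pk
  FROM ks.measurements
  WHERE token(pk) > -9223372036854775808 AND token(pk) <= -3074457345618258603;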

Re: Cassandra in a container - what to do (sequence of events) to snapshot the storage volume?

2020-11-10 Thread Jeff Jirsa
The commitlog defaults to periodic mode, which writes a sync marker to the file and fsync's the data to disk every 10s by default. `nodetool flush` will force a sync marker / fsync. Data written since the last fsync will not be replayed on startup and will be lost. If you drop the periodic time,
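The settings in question live in cassandra.yaml; the defaults being described are roughly:

  commitlog_sync: periodic
  commitlog_sync_period_in_ms: 10000
  # alternative: fsync before acknowledging each write, at a latency cost
  # commitlog_sync: batch
  # commitlog_sync_batch_window_in_ms: 2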

Re: Cassandra in a container - what to do (sequence of events) to snapshot the storage volume?

2020-11-10 Thread Erick Ramirez
> I do "nodetool flush", then snapshot the storage. Meanwhile, the DB is under heavy read/write traffic, with lots of writes per second. What's the worst that could happen, lose a few writes?
Nope, you won't lose anything. Snapshots in C* are the equivalent of a cold backup in relational

Re: Cassandra in a container - what to do (sequence of events) to snapshot the storage volume?

2020-11-10 Thread Florin Andrei
That sounds great! Now here's my question: I do "nodetool flush", then snapshot the storage. Meanwhile, the DB is under heavy read/write traffic, with lots of writes per second. What's the worst that could happen, lose a few writes?
On 2020-11-10 15:59, Jeff Jirsa wrote:
> If you want all of

Re: Cassandra in a container - what to do (sequence of events) to snapshot the storage volume?

2020-11-10 Thread Jeff Jirsa
If you want all of the instances to be consistent with each other, this is much harder, but if you only want a container that can stop and resume, you don't have to do anything more than flush + snapshot the storage. The data files on Cassandra should ALWAYS be in a state where the database will
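In container terms, the stop-and-resume case boils down to something like this (the container name is hypothetical):

  docker exec my-cassandra nodetool flush
  # then take the volume/storage snapshot with your storage tooling;
  # a file-level `nodetool snapshot` inside the container is optional extra safety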

Cassandra in a container - what to do (sequence of events) to snapshot the storage volume?

2020-11-10 Thread Florin Andrei
Running Apache Cassandra 3 in Docker. I need to snapshot the storage volumes. Obviously, I want to be able to re-launch Cassandra from the snapshots later on. So the snapshots need to be in a consistent state. With most DBs, the sequence of events is this:
- flush the DB to disk
- "freeze"

RE: Last stored value metadata table

2020-11-10 Thread Durity, Sean R
Lots of updates to the same rows/columns could theoretically impact read performance. One way to help counter that would be to use the LeveledCompactionStrategy to keep the table optimized for reads. It could keep your nodes busier with compaction – so test it out. Sean Durity From: Gábor
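Switching the table to LCS is a one-liner (the table name here is a placeholder):

  ALTER TABLE ks.last_values
  WITH compaction = {'class': 'LeveledCompactionStrategy'};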

Re: Last stored value metadata table

2020-11-10 Thread Gábor Auth
Hi,
On Tue, Nov 10, 2020 at 6:29 PM Alex Ott wrote:
> What about using "per partition limit 1" on that table?
Oh, it is almost a good solution, but actually the key is ((epoch_day, name), timestamp), to support more distributed partitioning, so... it is not good... :/
-- Bye, Auth Gábor

Re: Last stored value metadata table

2020-11-10 Thread Alex Ott
What about using "per partition limit 1" on that table?
On Tue, Nov 10, 2020 at 8:39 AM Gábor Auth wrote:
> Hi,
> Short story: storing time series of measurements (key(name, timestamp), value).
> The problem: get the list of the last `value` of every `name`.
> Is there a Cassandra
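For a table keyed the way the original mail describes, that would look something like this (column names and types are illustrative):

  CREATE TABLE measurements (
      name text,
      ts timestamp,
      value double,
      PRIMARY KEY ((name), ts)
  ) WITH CLUSTERING ORDER BY (ts DESC);

  -- last stored value of every name; note this still touches every partition
  SELECT name, ts, value FROM measurements PER PARTITION LIMIT 1;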

Re: Last stored value metadata table

2020-11-10 Thread Gábor Auth
Hi,
On Tue, Nov 10, 2020 at 5:29 PM Durity, Sean R wrote:
> Updates do not create tombstones. Deletes create tombstones. The above scenario would not create any tombstones. For a full solution, though, I would probably suggest a TTL on the data so that old/unchanged data eventually gets
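The TTL suggestion in the quoted mail would look like this on the write path (table, columns, and the 30-day TTL are arbitrary illustrations):

  -- every (re)write refreshes the 30-day expiry
  INSERT INTO last_values (name, ts, value)
  VALUES ('sensor-1', toTimestamp(now()), 42.0)
  USING TTL 2592000;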

RE: Last stored value metadata table

2020-11-10 Thread Durity, Sean R
Hi,
On Tue, Nov 10, 2020 at 3:18 PM Durity, Sean R wrote:
> My answer would depend on how many “names” you expect. If it is a relatively small and constrained list (under a few hundred thousand), I would start with something like:
At the moment, the number

Re: Last stored value metadata table

2020-11-10 Thread Gábor Auth
Hi,
On Tue, Nov 10, 2020 at 3:18 PM Durity, Sean R wrote:
> My answer would depend on how many “names” you expect. If it is a relatively small and constrained list (under a few hundred thousand), I would start with something like:
At the moment, the number of names is more than 10,000

RE: Last stored value metadata table

2020-11-10 Thread Durity, Sean R
My answer would depend on how many “names” you expect. If it is a relatively small and constrained list (under a few hundred thousand), I would start with something like:

  Create table last_values (
      arbitrary_partition text, -- use an app name or something static to define the partition name
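The preview cuts off there; a hypothetical completion of that design (column names and types are guesses, not the original schema) and the matching read would be:

  -- hypothetical sketch: one static partition holding the latest row per name
  CREATE TABLE last_values (
      arbitrary_partition text,   -- app name or other static value
      name text,
      last_updated timestamp,
      value text,
      PRIMARY KEY ((arbitrary_partition), name)
  );

  -- "last value of every name" becomes a single-partition query
  SELECT name, last_updated, value
  FROM last_values
  WHERE arbitrary_partition = 'my-app';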