Re: local read from coordinator

2020-11-10 Thread Alex Ott
The token-aware policy doesn't work for token range queries (at least in the
Java driver 3.x).  You need to force the driver to do the reading using a
specific token as a routing key.  Here is a Java implementation of the token
range scanning algorithm that Spark uses:
https://github.com/alexott/cassandra-dse-playground/blob/master/driver-1.x/src/main/java/com/datastax/alexott/demos/TokenRangesScan.java

I'm not sure whether the Python driver is able to set the routing key
explicitly, but the whitelist policy should help
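
In case it helps, a rough, untested sketch of setting the routing key explicitly
with the Python driver (the SimpleStatement routing_key parameter); the host,
keyspace, table and bigint partition key are all made up for illustration. The
driver routes by the token of routing_key, so for a token range scan you would
pick a key whose token falls inside the range being queried:

import struct

from cassandra.cluster import Cluster
from cassandra.query import SimpleStatement

cluster = Cluster(['127.0.0.1'])
session = cluster.connect('my_ks')

# routing_key must be the serialized partition key; for a single bigint
# column that is its 8-byte big-endian representation.
pk_in_range = 12345
stmt = SimpleStatement(
    "SELECT pk, value FROM my_table WHERE token(pk) > %s AND token(pk) <= %s",
    routing_key=struct.pack('>q', pk_in_range),
)
rows = session.execute(stmt, (-9223372036854775808, 0))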



On Wed, Nov 11, 2020 at 7:03 AM Erick Ramirez 
wrote:

> Yes, use a token-aware policy so the driver will pick a coordinator where
> the token (partition) exists. Cheers!
>


-- 
With best wishes,
Alex Ott
http://alexott.net/
Twitter: alexott_en (English), alexott (Russian)


Re: local read from coordinator

2020-11-10 Thread Erick Ramirez
Yes, use a token-aware policy so the driver will pick a coordinator where
the token (partition) exists. Cheers!
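
For reference, a minimal sketch of a token-aware setup with the Python driver
(3.x execution profiles); the contact point, data centre name, keyspace and
table are placeholders. Prepared statements carry routing information, so bound
statements get sent to a replica that owns the partition:

from cassandra.cluster import Cluster, ExecutionProfile, EXEC_PROFILE_DEFAULT
from cassandra.policies import TokenAwarePolicy, DCAwareRoundRobinPolicy

profile = ExecutionProfile(
    load_balancing_policy=TokenAwarePolicy(DCAwareRoundRobinPolicy(local_dc='dc1'))
)
cluster = Cluster(['10.0.0.1'],
                  execution_profiles={EXEC_PROFILE_DEFAULT: profile})
session = cluster.connect()

# The bound statement's partition key determines which replica coordinates.
ps = session.prepare("SELECT value FROM my_ks.my_table WHERE pk = ?")
row = session.execute(ps, [42]).one()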


local read from coordinator

2020-11-10 Thread onmstester onmstester
Hi,

I'm going to read all the data in the cluster as fast as possible. I'm aware 
that Spark can do such things out of the box, but I wanted to do it at a low 
level to see how fast it could be. So:
1. retrieved the partition keys on each node using the token ranges from 
nodetool ring, getting the distinct partitions for each range
2. ran the query for each partition on its main replica node using Python 
(a parallel job on all nodes of the cluster). I used the load-balancing strategy 
with only the local IP as the contact point, but I will try the whitelist policy 
too (with the whitelist load-balancing strategy, queries (reads) are restricted 
to a single/local coordinator, with the Python script on the same host as the 
coordinator); a rough sketch of this per-node setup follows below.
This mechanism turned out to be fast, but not as fast as a sequential read of 
the disk could be (the query could theoretically be 100 times faster!).
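
A rough, untested sketch of that per-node scanner, using the Python driver's
WhiteListRoundRobinPolicy to pin the coordinator and plain token-range queries
(IPs, keyspace/table/column names and the token range are made up; the real
ranges would come from nodetool ring or the driver metadata):

from cassandra import ConsistencyLevel
from cassandra.cluster import Cluster
from cassandra.policies import WhiteListRoundRobinPolicy
from cassandra.query import SimpleStatement

local_ip = '10.0.0.1'   # the node this scanner runs next to
cluster = Cluster(
    contact_points=[local_ip],
    load_balancing_policy=WhiteListRoundRobinPolicy([local_ip]),
)
session = cluster.connect('my_ks')

# Token ranges owned by this node, hard-coded here for illustration.
ranges = [(-9223372036854775808, -3074457345618258603)]

query = SimpleStatement(
    "SELECT pk, value FROM my_table WHERE token(pk) > %s AND token(pk) <= %s",
    consistency_level=ConsistencyLevel.LOCAL_ONE,
    fetch_size=5000,
)
for start, end in ranges:
    for row in session.execute(query, (start, end)):
        pass  # process the row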




I'm using RF=3 in a single-DC cluster with the default consistency level, which 
is LOCAL_ONE. I suspect that the coordinator may also be contacting the other 
replicas, but how can I debug that?

Is there any workaround to force the coordinator to only read data from itself, 
so that:

if there are other replicas (besides the coordinator) for the partition key, 
only the coordinator's data is read and returned, and the other replicas are 
not even checked for the data;

if the coordinator is not a replica for the partition key, it simply throws an 
exception or returns an empty result?


Is there any mechanism to accomplish this kind of local read?
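
To make the desired behaviour concrete, a hedged client-side sketch (assuming
the Python driver's Metadata.get_replicas() and a made-up table with a single
bigint partition key) of "only read from the local replica, otherwise skip":

import struct

from cassandra.cluster import Cluster
from cassandra.policies import WhiteListRoundRobinPolicy

local_ip = '10.0.0.1'
cluster = Cluster([local_ip],
                  load_balancing_policy=WhiteListRoundRobinPolicy([local_ip]))
session = cluster.connect('my_ks')

pk_value = 12345
routing_key = struct.pack('>q', pk_value)   # serialized bigint partition key
replicas = cluster.metadata.get_replicas('my_ks', routing_key)

if any(host.address == local_ip for host in replicas):
    row = session.execute(
        "SELECT value FROM my_table WHERE pk = %s", (pk_value,)).one()
else:
    row = None   # the local node is not a replica; skip instead of querying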

Best Regards
Sent using https://www.zoho.com/mail/

Re: Cassandra in a container - what to do (sequence of events) to snapshot the storage volume?

2020-11-10 Thread Jeff Jirsa
The commitlog defaults to periodic mode, which writes a sync marker to the
file and fsync's the data to disk every 10s by default.

`nodetool flush` will force a sync marker / fsync

Data written since the last fsync will not be replayed on startup and will
be lost.

If you drop the periodic time, the number of writes you lose on restart
decreases.

Alternatively, you can switch to group/batch commitlog, and it goes to
zero, but you'll fsync far more frequently.
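
For reference, the relevant cassandra.yaml settings look roughly like the
excerpt below (option names as in 3.x/4.x; group mode only exists in newer
versions, and the group window value shown is illustrative, not a quoted
default):

# periodic (default): fsync every commitlog_sync_period_in_ms
commitlog_sync: periodic
commitlog_sync_period_in_ms: 10000

# alternatives that fsync before acknowledging writes:
# commitlog_sync: batch
# commitlog_sync: group
# commitlog_sync_group_window_in_ms: 1000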



On Tue, Nov 10, 2020 at 4:19 PM Florin Andrei 
wrote:

> That sounds great! Now here's my question:
>
> I do "nodetool flush", then snapshot the storage. Meanwhile, the DB is
> under heavy read/write traffic, with lots of writes per second. What's
> the worst that could happen, lose a few writes?
>
>
> On 2020-11-10 15:59, Jeff Jirsa wrote:
> > If you want all of the instances to be consistent with each other,
> > this is much harder, but if you only want a container that can stop
> > and resume, you don't have to do anything more than flush + snapshot
> > the storage. The data files on cassandra should ALWAYS be in a state
> > where the database will restart, because they have to be to tolerate
> > power outage.
> >
> > On Tue, Nov 10, 2020 at 3:39 PM Florin Andrei 
> > wrote:
> >
> >> Running Apache Cassandra 3 in Docker. I need to snapshot the storage
> >>
> >> volumes. Obviously, I want to be able to re-launch Cassandra from
> >> the
> >> snapshots later on. So the snapshots need to be in a consistent
> >> state.
> >>
> >> With most DBs, the sequence of events is this:
> >>
> >> - flush the DB to disk
> >> - "freeze" the DB
> >> - snapshot the storage
> >> - "unfreeze" the DB
> >>
> >> What does that sequence translate to, in Cassandra parlance?
> >>
> >> What is the sequence of events that needs to happen when I bring the
> >> DB
> >> up from an old snapshot? Will there be a restore procedure, or can I
> >>
> >> just start it as usual?
> >>
> >> --
> >> Florin Andrei
> >> https://florin.myip.org/
> >>
> >>
>
> --
> Florin Andrei
> https://florin.myip.org/
>
>
>


Re: Cassandra in a container - what to do (sequence of events) to snapshot the storage volume?

2020-11-10 Thread Erick Ramirez
>
> I do "nodetool flush", then snapshot the storage. Meanwhile, the DB is
> under heavy read/write traffic, with lots of writes per second. What's
> the worst that could happen, lose a few writes?
>

Nope, you won't lose anything. Snapshots in C* are the equivalent of a cold
backup in relational DBs with the key difference that it all takes place
while the nodes and cluster remain online and operational. Cheers!


Re: Cassandra in a container - what to do (sequence of events) to snapshot the storage volume?

2020-11-10 Thread Florin Andrei

That sounds great! Now here's my question:

I do "nodetool flush", then snapshot the storage. Meanwhile, the DB is 
under heavy read/write traffic, with lots of writes per second. What's 
the worst that could happen, lose a few writes?



On 2020-11-10 15:59, Jeff Jirsa wrote:

If you want all of the instances to be consistent with each other,
this is much harder, but if you only want a container that can stop
and resume, you don't have to do anything more than flush + snapshot
the storage. The data files on cassandra should ALWAYS be in a state
where the database will restart, because they have to be to tolerate
power outage.

On Tue, Nov 10, 2020 at 3:39 PM Florin Andrei 
wrote:


Running Apache Cassandra 3 in Docker. I need to snapshot the storage

volumes. Obviously, I want to be able to re-launch Cassandra from
the
snapshots later on. So the snapshots need to be in a consistent
state.

With most DBs, the sequence of events is this:

- flush the DB to disk
- "freeze" the DB
- snapshot the storage
- "unfreeze" the DB

What does that sequence translate to, in Cassandra parlance?

What is the sequence of events that needs to happen when I bring the
DB
up from an old snapshot? Will there be a restore procedure, or can I

just start it as usual?

--
Florin Andrei
https://florin.myip.org/





--
Florin Andrei
https://florin.myip.org/




Re: Cassandra in a container - what to do (sequence of events) to snapshot the storage volume?

2020-11-10 Thread Jeff Jirsa
If you want all of the instances to be consistent with each other, this is
much harder, but if you only want a container that can stop and resume, you
don't have to do anything more than flush + snapshot the storage. The data
files on cassandra should ALWAYS be in a state where the database will
restart, because they have to be to tolerate power outage.



On Tue, Nov 10, 2020 at 3:39 PM Florin Andrei 
wrote:

> Running Apache Cassandra 3 in Docker. I need to snapshot the storage
> volumes. Obviously, I want to be able to re-launch Cassandra from the
> snapshots later on. So the snapshots need to be in a consistent state.
>
> With most DBs, the sequence of events is this:
>
> - flush the DB to disk
> - "freeze" the DB
> - snapshot the storage
> - "unfreeze" the DB
>
> What does that sequence translate to, in Cassandra parlance?
>
> What is the sequence of events that needs to happen when I bring the DB
> up from an old snapshot? Will there be a restore procedure, or can I
> just start it as usual?
>
> --
> Florin Andrei
> https://florin.myip.org/
>
>
>


Cassandra in a container - what to do (sequence of events) to snapshot the storage volume?

2020-11-10 Thread Florin Andrei
Running Apache Cassandra 3 in Docker. I need to snapshot the storage 
volumes. Obviously, I want to be able to re-launch Cassandra from the 
snapshots later on. So the snapshots need to be in a consistent state.


With most DBs, the sequence of events is this:

- flush the DB to disk
- "freeze" the DB
- snapshot the storage
- "unfreeze" the DB

What does that sequence translate to, in Cassandra parlance?

What is the sequence of events that needs to happen when I bring the DB 
up from an old snapshot? Will there be a restore procedure, or can I 
just start it as usual?


--
Florin Andrei
https://florin.myip.org/




RE: Last stored value metadata table

2020-11-10 Thread Durity, Sean R
Lots of updates to the same rows/columns could theoretically impact read 
performance. One way to help counter that would be to use the 
LeveledCompactionStrategy to keep the table optimized for reads. It could keep 
your nodes busier with compaction – so test it out.
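
For example, switching the table over is a one-line schema change (a sketch
only; the contact point and keyspace/table names are placeholders):

from cassandra.cluster import Cluster

session = Cluster(['127.0.0.1']).connect()
# Change the table's compaction strategy to LCS to favour read performance.
session.execute("""
    ALTER TABLE my_ks.last_values
    WITH compaction = {'class': 'LeveledCompactionStrategy'}
""")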


Sean Durity

From: Gábor Auth 
Sent: Tuesday, November 10, 2020 11:50 AM
To: user@cassandra.apache.org
Subject: [EXTERNAL] Re: Last stored value metadata table

Hi,

On Tue, Nov 10, 2020 at 5:29 PM Durity, Sean R wrote:
Updates do not create tombstones. Deletes create tombstones. The above scenario 
would not create any tombstones. For a full solution, though, I would probably 
suggest a TTL on the data so that old/unchanged data eventually gets removed 
(if that is desirable). TTLs can create tombstones, but should not be a major 
problem if expired data is relatively infrequent.

Okay, there are no tombstones (I misused the term), but every updated `value` 
is sitting in memory and on disk until the next compaction... Does that degrade 
the read performance?

--
Bye,
Auth Gábor (https://iotguru.cloud)





Re: Last stored value metadata table

2020-11-10 Thread Gábor Auth
Hi,

On Tue, Nov 10, 2020 at 6:29 PM Alex Ott  wrote:

> What about using  "per partition limit 1" on that table?
>

Oh, it is almost a good solution, but actually the key is ((epoch_day,
name), timestamp), to support more distributed partitioning, so... it is
not good... :/

-- 
Bye,
Auth Gábor (https://iotguru.cloud)


Re: Last stored value metadata table

2020-11-10 Thread Alex Ott
What about using  "per partition limit 1" on that table?
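
Roughly, assuming a (hypothetical) table keyed by ((name), timestamp) with
timestamp clustered in descending order, that returns the latest row per name
in one query:

from cassandra.cluster import Cluster

session = Cluster(['127.0.0.1']).connect('my_ks')
rows = session.execute(
    "SELECT name, timestamp, value FROM measurements PER PARTITION LIMIT 1")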

On Tue, Nov 10, 2020 at 8:39 AM Gábor Auth  wrote:

> Hi,
>
> Short story: storing time series of measurements (key(name, timestamp),
> value).
>
> The problem: get the list of the last `value` of every `name`.
>
> Is there a Cassandra friendly solution to store the last value of every
> `name` in a separate metadata table? It will come with a lot of
> tombstones... any other solution? :)
>
> --
> Bye,
> Auth Gábor
>


-- 
With best wishes,
Alex Ott
http://alexott.net/
Twitter: alexott_en (English), alexott (Russian)


Re: Last stored value metadata table

2020-11-10 Thread Gábor Auth
Hi,

On Tue, Nov 10, 2020 at 5:29 PM Durity, Sean R 
wrote:

> Updates do not create tombstones. Deletes create tombstones. The above
> scenario would not create any tombstones. For a full solution, though, I
> would probably suggest a TTL on the data so that old/unchanged data
> eventually gets removed (if that is desirable). TTLs can create tombstones,
> but should not be a major problem if expired data is relatively infrequent.
>

Okay, there are no tombstones (I misused the term), but every updated
`value` is sitting in memory and on disk until the next
compaction... Does that degrade the read performance?

-- 
Bye,
Auth Gábor (https://iotguru.cloud)


RE: Last stored value metadata table

2020-11-10 Thread Durity, Sean R

Hi,

On Tue, Nov 10, 2020 at 3:18 PM Durity, Sean R wrote:
My answer would depend on how many “names” you expect. If it is a relatively 
small and constrained list (under a few hundred thousand), I would start with 
something like:

At the moment, the number of names is more than 10,000 but fewer than 100,000.

Create table last_values (
arbitrary_partition text, -- use an app name or something static to define the 
partition
name text,
value text,
last_upd_ts timestamp,
primary key (arbitrary_partition, name));

What is the purpose of the partition key?

--- This keeps the data in one partition so that you can retrieve all of it in 
one query (as you requested). If the partition key is just “name,” then you 
would need a query for each name:
select value, last_upd_ts from last_values where name = ‘name1’; //10,000+ 
queries and you have to know all the names

Since it is a single partition, you want to keep the partition size under 100 
MB (rule of thumb). That is why knowing the size/bounds of the data is 
important.

(NOTE: every insert would just overwrite the last value. You only keep the last 
one.)

This is the behavior that I want. :)

I’m assuming that your data arrives in time series order, so that it is easy to 
just insert the last value into last_values. If you have to read before write, 
that would be a Cassandra anti-pattern that needs a different solution. (Based 
on how regular the data points are, I would look at something time-series 
related with a short TTL.)

Okay, but as far as I know, this is the scenario where every update of 
`last_values` generates two tombstones because of the update of the `value` and 
`last_upd_ts` fields. Maybe I have it wrong?

--- Updates do not create tombstones. Deletes create tombstones. The above 
scenario would not create any tombstones. For a full solution, though, I would 
probably suggest a TTL on the data so that old/unchanged data eventually gets 
removed (if that is desirable). TTLs can create tombstones, but should not be a 
major problem if expired data is relatively infrequent.


--
Bye,
Auth Gábor (https://iotguru.cloud)





Re: Last stored value metadata table

2020-11-10 Thread Gábor Auth
Hi,

On Tue, Nov 10, 2020 at 3:18 PM Durity, Sean R 
wrote:

> My answer would depend on how many “names” you expect. If it is a
> relatively small and constrained list (under a few hundred thousand), I
> would start with something like:
>

At the moment, the number of names is more than 10,000 but fewer than 100,000.

>
> Create table last_values (
>
> arbitrary_partition text, -- use an app name or something static to define
> the partition
>
> name text,
>
> value text,
>
> last_upd_ts timestamp,
>
> primary key (arbitrary_partition, name));
>

What is the purpose of the partition key?

(NOTE: every insert would just overwrite the last value. You only keep the
> last one.)
>

This is the behavior that I want. :)


> I’m assuming that your data arrives in time series order, so that it is
> easy to just insert the last value into last_values. If you have to read
> before write, that would be a Cassandra anti-pattern that needs a different
> solution. (Based on how regular the data points are, I would look at
> something time-series related with a short TTL.)
>

Okay, but as far as I know, this is the scenario where every update of
`last_values` generates two tombstones because of the update of the `value`
and `last_upd_ts` fields. Maybe I have it wrong?

-- 
Bye,
Auth Gábor (https://iotguru.cloud)


RE: Last stored value metadata table

2020-11-10 Thread Durity, Sean R
My answer would depend on how many “names” you expect. If it is a relatively 
small and constrained list (under a few hundred thousand), I would start with 
something like:

Create table last_values (
arbitrary_partition text, -- use an app name or something static to define the 
partition
name text,
value text,
last_upd_ts timestamp,
primary key (arbitrary_partition, name));

(NOTE: every insert would just overwrite the last value. You only keep the last 
one.)
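
As a quick illustration of that overwrite-on-insert behaviour (a sketch only;
the contact point, keyspace name and values are made up, the table is the one
defined above):

from datetime import datetime, timezone

from cassandra.cluster import Cluster

session = Cluster(['127.0.0.1']).connect('my_ks')
upsert = session.prepare("""
    INSERT INTO last_values (arbitrary_partition, name, value, last_upd_ts)
    VALUES (?, ?, ?, ?)
""")
# Re-running this with a new value simply replaces the previous row.
session.execute(upsert, ('my_app_name', 'sensor-42', '21.5',
                         datetime.now(timezone.utc)))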

Then your query is easy:
Select name, value, last_upd_ts from last_values where arbitrary_partition = 
‘my_app_name’;

If the list of names is unbounded/large, then I would be asking, does the query 
really need every name/value pair? What other way could they grouped together 
in a reasonable partition? I would use that instead of the arbitrary_partition 
above and run multiple queries (one for each partition) if a massive list is 
actually required.

I’m assuming that your data arrives in time series order, so that it is easy to 
just insert the last value into last_values. If you have to read before write, 
that would be a Cassandra anti-pattern that needs a different solution. (Based 
on how regular the data points are, I would look at something time-series 
related with a short TTL.)
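
If a TTL is wanted, it can go directly on the write. A small sketch with a
hypothetical 7-day TTL (same placeholder contact point and keyspace as above):

from cassandra.cluster import Cluster

session = Cluster(['127.0.0.1']).connect('my_ks')
# Rows that stop being refreshed expire roughly 7 days after their last write.
session.execute("""
    INSERT INTO last_values (arbitrary_partition, name, value, last_upd_ts)
    VALUES (%s, %s, %s, toTimestamp(now()))
    USING TTL 604800
""", ('my_app_name', 'sensor-42', '21.5'))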


Sean Durity

From: Gábor Auth 
Sent: Tuesday, November 10, 2020 2:39 AM
To: user@cassandra.apache.org
Subject: [EXTERNAL] Last stored value metadata table

Hi,

Short story: storing time series of measurements (key(name, timestamp), value).

The problem: get the list of the last `value` of every `name`.

Is there a Cassandra friendly solution to store the last value of every `name` 
in a separate metadata table? It will come with a lot of tombstones... any 
other solution? :)

--
Bye,
Auth Gábor


