Re: Error when running nodetool cleanup after adding a new node to a cluster

2017-02-09 Thread Srinath Reddy
Alex, thanks for the reply. I will try the workaround and post an update. Regards, Srinath Reddy > On 09-Feb-2017, at 1:44 PM, Oleksandr Shulgin wrote: > > On Thu, Feb 9, 2017 at 6:13 AM, Srinath Reddy wrote: >

DELETE/SELECT with multi-column PK and IN

2017-02-09 Thread Benjamin Roth
Hi Guys, CQL says this is not allowed: DELETE FROM ks.cf WHERE (pk1, pk2) IN ((1, 2)); 1. Is there a reason for it? There shouldn't be a performance penalty: it is a PK lookup, and the same thing works with a single PK column. 2. Is there a known workaround for it? It would be much of a help to

Re: DELETE/SELECT with multi-column PK and IN

2017-02-09 Thread Ben Slater
Are you looking for this to be equivalent to (PK1=1 AND PK2=2) or are you looking for (PK1 IN (1,2) AND PK2 IN (1,2)) or something else? Cheers Ben On Thu, 9 Feb 2017 at 20:09 Benjamin Roth wrote: > Hi Guys, > > CQL says this is not allowed: > > DELETE FROM ks.cf WHERE

Re: DELETE/SELECT with multi-column PK and IN

2017-02-09 Thread Sylvain Lebresne
On Thu, Feb 9, 2017 at 10:52 AM, Benjamin Roth wrote: > Ok got it. > > But it's interesting that this is supported: > DELETE/SELECT FROM ks.cf WHERE (pk1) IN ((1), (2), (3)); > > This is technically mostly the same (Token awareness, > coordination/routing, read

Re: DELETE/SELECT with multi-column PK and IN

2017-02-09 Thread Benjamin Roth
This doesn't really belong to this topic but I also experienced what Ben says. I was migrating (and still am) tons of data from MySQL to CS. I measured several approaches (async parallel, prepared stmt, sync with unlogged batches) and it turned out that batches were really fast and produced less

Re: DELETE/SELECT with multi-column PK and IN

2017-02-09 Thread Benjamin Roth
Yes, that's the workaround - I'll try that. Would you agree it would be better for internal optimizations to process this within a single statement? 2017-02-09 10:32 GMT+01:00 Ben Slater: > Yep, that makes it clear. I think an unlogged batch of prepared statements >

Re: DELETE/SELECT with multi-column PK and IN

2017-02-09 Thread Ben Slater
That’s a very good point from Sylvain that I forgot/missed. That said, we’ve seen plenty of scenarios where overall system throughput is improved through unlogged batches. One of my colleagues did quite a bit of benchmarking on this topic for his talk at last year’s C* summit:

Re: Error when running nodetool cleanup after adding a new node to a cluster

2017-02-09 Thread Oleksandr Shulgin
On Thu, Feb 9, 2017 at 6:13 AM, Srinath Reddy wrote: > Hi, > > Trying to rebalance a Cassandra cluster after adding a new node and I'm > getting this error when running nodetool cleanup. The Cassandra cluster > is running in a Kubernetes cluster. > > Cassandra version is

Re: DELETE/SELECT with multi-column PK and IN

2017-02-09 Thread Benjamin Roth
Ok now I REALLY got it :) Thanks Sylvain! 2017-02-09 11:42 GMT+01:00 Sylvain Lebresne: > On Thu, Feb 9, 2017 at 10:52 AM, Benjamin Roth wrote: > >> Ok got it. >> >> But it's interesting that this is supported: >> DELETE/SELECT FROM ks.cf WHERE

Re: DELETE/SELECT with multi-column PK and IN

2017-02-09 Thread Benjamin Roth
Maybe that makes it clear: DELETE FROM ks.cf WHERE (partitionkey1, partitionkey2) IN ((1, 2), (1, 3), (2, 3), (3, 4)); I want to delete or select a bunch of records identified by their multi-partitionkey tuples. 2017-02-09 10:18 GMT+01:00 Ben Slater: > Are you

Re: DELETE/SELECT with multi-column PK and IN

2017-02-09 Thread Sylvain Lebresne
This is a statement on multiple partitions and there is really no optimization the code internally does on that. In fact, I strongly advise you not to use a batch but rather simply do a for loop client side and send the statements individually. That way, your driver will be able to use proper
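
The client-side loop suggested in this reply can be sketched roughly as follows. This is only an illustration, not driver code: `execute` is a hypothetical stand-in for whatever call your driver's session offers, and the table and column names are reused from the example earlier in the thread.

```python
from typing import Callable, Iterable, List, Tuple

# Single-partition statement with bind markers; ks.cf and the column names
# come from the thread's example, the rest is illustrative.
DELETE_ONE = "DELETE FROM ks.cf WHERE partitionkey1 = ? AND partitionkey2 = ?"


def delete_each(keys: Iterable[Tuple[int, int]],
                execute: Callable[[str, Tuple[int, int]], None]) -> int:
    """Send one statement per partition key tuple instead of a single batch."""
    sent = 0
    for key in keys:
        # `execute` is a hypothetical stand-in for a real driver call.
        execute(DELETE_ONE, key)
        sent += 1
    return sent


# Usage with a recording stub in place of a real session.
calls: List[Tuple[str, Tuple[int, int]]] = []
count = delete_each([(1, 2), (1, 3), (2, 3), (3, 4)],
                    lambda stmt, params: calls.append((stmt, params)))
```

With a real driver, each call would be routed on its own, which is the point the reply is making about sending statements individually.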

How does cassandra achieve Linearizability?

2017-02-09 Thread Kant Kodali
How does Cassandra achieve Linearizability with “Last write wins” (conflict resolution methods based on time-of-day clocks)? Relying on synchronized clocks is almost certainly non-linearizable, because clock timestamps cannot be guaranteed to be consistent with actual event ordering due to
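
The concern raised in this question can be illustrated with a toy model. The sketch below is not Cassandra code; the `Write` type, the timestamps, and the amount of skew are all made up, and it only shows how picking the highest timestamp can disagree with the real order of events when one writer's clock runs ahead.

```python
from typing import List, NamedTuple


class Write(NamedTuple):
    real_order: int  # true sequence in which the writes happened
    timestamp: int   # timestamp from the writer's (possibly skewed) clock
    value: str


def last_write_wins(writes: List[Write]) -> str:
    """Resolve a conflict by keeping the value with the highest timestamp."""
    return max(writes, key=lambda w: w.timestamp).value


writes = [
    # The first writer's clock runs ahead (hypothetical skew).
    Write(real_order=1, timestamp=1_000_150, value="first"),
    Write(real_order=2, timestamp=1_000_100, value="second"),
]
winner = last_write_wins(writes)  # the earlier write wins despite happening first
```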

Re: How does cassandra achieve Linearizability?

2017-02-09 Thread Jonathan Haddad
It doesn't, nor does it claim to. On Thu, Feb 9, 2017 at 4:09 PM Kant Kodali wrote: > How does Cassandra achieve Linearizability with “Last write wins” > (conflict resolution methods based on time-of-day clocks)? > > Relying on synchronized clocks is almost certainly

Re: How does cassandra achieve Linearizability?

2017-02-09 Thread Michael Shuler
On 02/09/2017 07:21 PM, Kant Kodali wrote: > @Justin I read this article > http://www.datastax.com/dev/blog/lightweight-transactions-in-cassandra-2-0. > And it clearly says Linearizable consistency can be achieved with LWTs. > So should I assume the Linearizability in the context of the above >

If reading from materialized view with a consistency level of quorum am I guaranteed to have the most recent view?

2017-02-09 Thread Kant Kodali
If reading from a materialized view with a consistency level of quorum, am I guaranteed to have the most recent view? In other words, is the w + r > n contract maintained for MVs as well, for both reads and writes? Thanks!
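
For reference, the w + r > n condition mentioned in the question is plain arithmetic on replica counts. The sketch below only illustrates that arithmetic; the helper names are made up, and it says nothing about whether materialized views preserve the guarantee, which is the open question here.

```python
def quorum(n: int) -> int:
    """Number of replicas that form a majority of n."""
    return n // 2 + 1


def overlaps(w: int, r: int, n: int) -> bool:
    """True when every read set must intersect every write set."""
    return w + r > n


n = 3
w = r = quorum(n)                    # 2 of 3 replicas for both reads and writes
quorum_overlaps = overlaps(w, r, n)  # 2 + 2 > 3
one_one_overlaps = overlaps(1, 1, n) # 1 + 1 > 3 does not hold
```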

Re: How does cassandra achieve Linearizability?

2017-02-09 Thread Michael Shuler
If you require the best precision you can get, setting up a pair of stratum 1 ntpd masters in each data center location with GPS modules is not terribly complex. Low latency and jitter on servers you manage. 140ms is a long way away network-wise, and I would suggest that was a poor choice of

Re: How does cassandra achieve Linearizability?

2017-02-09 Thread Justin Cameron
I think the answer to that question will depend on your specific use case and requirements. If you're only doing a small number of updates but need to be sure they are applied in order you may be able to use lightweight transactions (keep in mind there's a performance hit here, so it's not an

Re: How does cassandra achieve Linearizability?

2017-02-09 Thread Kant Kodali
@Justin I read this article http://www.datastax.com/dev/blog/lightweight-transactions-in-cassandra-2-0. And it clearly says Linearizable consistency can be achieved with LWTs. So should I assume the Linearizability in the context of the above article is possible with LWTs and synchronization of

Re: How does cassandra achieve Linearizability?

2017-02-09 Thread Justin Cameron
Hi Kant, Clock synchronization is important - you should ensure that ntpd is properly configured on all nodes. If your particular use case is especially sensitive to out-of-order mutations it is possible to set timestamps on the client side using the drivers.

Re: How does cassandra achieve Linearizability?

2017-02-09 Thread Kant Kodali
Hi Justin, There are a bunch of issues w.r.t. synchronization of clocks when we used ntpd. Also the time it took to sync the clocks was approx 140ms (don't quote me on it though because it is reported by our devops :) We have multiple clients (for example, a bunch of microservices are reading from

Re: How does cassandra achieve Linearizability?

2017-02-09 Thread Jon Haddad
LWT != Last Write Wins. They are totally different. LWTs give you (assuming you also read at SERIAL) “atomic consistency”, meaning you are able to perform operations atomically and in isolation. That’s the safety blanket everyone wants but is extremely expensive, especially in Cassandra.

Re: Composite partition key token

2017-02-09 Thread Edward Capriolo
On Thu, Feb 9, 2017 at 9:26 AM, Michael Burman wrote: > Hi, > > How about taking it from the BoundStatement directly? > > ByteBuffer routingKey = b.getRoutingKey(ProtocolVersion.NEWEST_SUPPORTED, > codecRegistry); > Token token = metadata.newToken(routingKey); > > In this

Re: Composite partition key token

2017-02-09 Thread Michael Burman
Hi, How about taking it from the BoundStatement directly? ByteBuffer routingKey = b.getRoutingKey(ProtocolVersion.NEWEST_SUPPORTED, codecRegistry); Token token = metadata.newToken(routingKey); In this case the b is the "BoundStatement". Replace codecRegistry & ProtocolVersion with what you

Re: Authentication with Java driver

2017-02-09 Thread Ben Bromhead
If the processes are launched separately or you fork before setting up the cluster object it won't share credentials. On Wed, Feb 8, 2017, 02:33 Yuji Ito wrote: > Thanks Ben, > > Do you mean lots of instances of the process or lots of instances of the > cluster/session

Re: Composite partition key token

2017-02-09 Thread Branislav Janosik -T (bjanosik - AAP3 INC at Cisco)
Works great, thank you! On 2/9/17, 6:26 AM, "Michael Burman" wrote: Hi, How about taking it from the BoundStatement directly? ByteBuffer routingKey = b.getRoutingKey(ProtocolVersion.NEWEST_SUPPORTED, codecRegistry); Token token =

Re: DELETE/SELECT with multi-column PK and IN

2017-02-09 Thread Ben Slater
Yep, that makes it clear. I think an unlogged batch of prepared statements with one statement per PK tuple would be roughly equivalent? And probably no more complex to generate in the client? On Thu, 9 Feb 2017 at 20:22 Benjamin Roth wrote: > Maybe that makes it clear:

Re: DELETE/SELECT with multi-column PK and IN

2017-02-09 Thread Benjamin Roth
Ok got it. But it's interesting that this is supported: DELETE/SELECT FROM ks.cf WHERE (pk1) IN ((1), (2), (3)); This is technically mostly the same (Token awareness, coordination/routing, read performance, ...), right? 2017-02-09 10:43 GMT+01:00 Sylvain Lebresne : > This