Repairs on 2.1.12

2017-05-09 Thread Mark Furlong
I have a large cluster running a -dc repair on a ring which has been running for nearly two weeks. When I review the logs I can see where my tables are reporting as ‘fully synced’ multiple times. I’m looking for some information to help me confirm that my repair is not looping and is running

Re: Smart Table creation for 2D range query

2017-05-09 Thread Jon Haddad
Sure, I don't see why not. Ultimately this is more or less the same thing I proposed. You end up with a slightly different way of encoding a point in space into a rough geographical area. Whether you encode them as a tree structure or some prefix of a geohash is a matter of convenience. I'm

Cassandra 2.1.13: Using JOIN_RING=False

2017-05-09 Thread Anubhav Kale
Hello, With some inspiration from the Cassandra Summit talk from last year, we are trying to set up a cluster with coordinator-only nodes. We set join_ring=false in env.sh, disabled auth in YAML and the nodes are able to start just fine. However, we're running into a few problems 1] The

Re: Smart Table creation for 2D range query

2017-05-09 Thread Jim Ancona
Couldn't you use a bucketing strategy for the hash value, much like with time series data? That is, choose a partition key granularity that puts a reasonable number of rows in a partition, with the actual hash being the clustering key. Then ranges that fall within the partition key granularity could be
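A minimal sketch of the bucketing idea described above, assuming the hash is an integer and that dropping its low-order bits gives a usable partition key. The names `bucket_of` and `buckets_for_range` and the 8-bit granularity are hypothetical, chosen only for illustration; nothing here comes from the thread itself.

```python
def bucket_of(code, bucket_bits=8):
    """Partition key: the hash with its low-order bits dropped.

    All hashes sharing the same high-order bits land in one partition;
    the full hash would then serve as the clustering key.
    """
    return code >> bucket_bits


def buckets_for_range(lo, hi, bucket_bits=8):
    """List every partition key that a hash range [lo, hi] touches."""
    first = bucket_of(lo, bucket_bits)
    last = bucket_of(hi, bucket_bits)
    return list(range(first, last + 1))


# A range that stays inside one bucket needs a single partition read.
print(buckets_for_range(0x1200, 0x12FF))
# A range that crosses a bucket boundary needs one read per bucket.
print(buckets_for_range(0x12F0, 0x1310))
```

The granularity is the knob: more dropped bits means fewer, larger partitions and fewer reads per range, at the cost of wider rows.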

Cassandra 3.10 has partial partition key search but does it result in a table scan?

2017-05-09 Thread Kant Kodali
Hi All, It looks like Cassandra 3.10 has partial partition key search, but does it result in a table scan? For example, I can have the following: create table hello( a text, b int, c text, d text, primary key((a,b), c) ); Now I can do select * from hello where a='foo' allow filtering; // This works

Re: Cassandra 3.10 has partial partition key search but does it result in a table scan?

2017-05-09 Thread Jon Haddad
I don’t see any way it wouldn’t. Have you tried tracing it?

Re: Cassandra 3.10 has partial partition key search but does it result in a table scan?

2017-05-09 Thread Daniel Hölbling-Inzko
If you have to allow filtering for the query to work, it almost always results in a table scan. Greetings, Daniel

Re: Cassandra 3.10 has partial partition key search but does it result in a table scan?

2017-05-09 Thread Jon Haddad
Output from both queries, demonstrating full cluster scans: https://gist.github.com/rustyrazorblade/c4947fc37da85bca50e08aa1ef3c7a06 Jon

Re: Smart Table creation for 2D range query

2017-05-09 Thread Jim Ancona
There are clever ways to encode coordinates into a single scalar value where points that are close on a surface are also close in value, making queries efficient. Examples are Geohash and Google's S2
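As a rough illustration of encoding coordinates into a single scalar, here is a bit-interleaving (Z-order) sketch, which is the general idea behind Geohash. It is not the Geohash or S2 algorithm itself: the function names, the 16-bit resolution and the sample coordinates are assumptions made for the example.

```python
def quantize(value, lo, hi, bits=16):
    """Map a value in [lo, hi] onto an integer in [0, 2**bits - 1]."""
    scale = (1 << bits) - 1
    return int((value - lo) / (hi - lo) * scale)


def interleave(x, y, bits=16):
    """Interleave the bits of x and y into one integer.

    Bit i of x goes to position 2*i + 1, bit i of y to position 2*i.
    """
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i + 1)
        code |= ((y >> i) & 1) << (2 * i)
    return code


def encode_point(lat, lon, bits=16):
    """Encode a latitude/longitude pair as a single scalar."""
    x = quantize(lon, -180.0, 180.0, bits)
    y = quantize(lat, -90.0, 90.0, bits)
    return interleave(x, y, bits)


# Two points a few metres apart fall into the same cell at this resolution.
print(encode_point(40.7128, -74.0060) == encode_point(40.7130, -74.0058))
```

Nearby points usually share high-order bits, but not always: points on either side of a cell boundary can differ in a high bit, so a range query generally has to cover several neighbouring cells.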

Re: Cassandra 3.10 has partial partition key search but does it result in a table scan?

2017-05-09 Thread Jon Haddad
Nope, I didn’t comment on that query. I specifically answered your question about "select * from hello where a='foo' allow filtering;". The query you’ve listed here looks like it would also do a full table scan (again, I don’t see how it would be avoided). I recommend firing up a 3 node

Re: Cassandra 3.10 has partial partition key search but does it result in a table scan?

2017-05-09 Thread Alexander Dejanovski
Hi Kant, Unless you provide the full partition key, I see no way for Cassandra to avoid doing a full table scan. In order to know on which specific nodes to search (and in which SSTables, etc.) it needs to have a token. The token is a hash of the whole partition key. For a specific value of
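To make the point about tokens concrete, here is a toy sketch of hashing a whole composite partition key. It is only an illustration: `toy_token` and `owner` are hypothetical helpers, MD5 stands in for Cassandra's actual partitioner hash, the serialization is made up, and the modulo placement ignores how a real ring assigns token ranges.

```python
import hashlib


def toy_token(partition_key):
    """Hash every component of the partition key into one signed 64-bit integer.

    ASSUMPTION: MD5 and this serialization are stand-ins, not what
    Cassandra's partitioner actually does.
    """
    serialized = "\x00".join(repr(part) for part in partition_key).encode("utf-8")
    digest = hashlib.md5(serialized).digest()
    return int.from_bytes(digest[:8], "big", signed=True)


def owner(token, node_count=3):
    """Toy placement: pick a node index from the token (not a real ring)."""
    return token % node_count


# With primary key ((a, b), c), both a and b are needed to compute the token.
for b in (1, 2, 3):
    token = toy_token(("foo", b))
    print(("foo", b), token, "-> node", owner(token))
```

Given only a='foo', there is no single token to compute: each value of b yields an unrelated hash, so the matching partitions may sit anywhere and every token range has to be checked.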

Re: Cassandra 3.10 has partial partition key search but does it result in a table scan?

2017-05-09 Thread Kant Kodali
Thanks a lot guys!

Re: Smart Table creation for 2D range query

2017-05-09 Thread Jon Haddad
The problem with using geohashes is that you can’t efficiently do ranges with random token distribution. So even if your scalar values are close to each other numerically they’ll likely end up on different nodes, and you end up doing a scatter gather. If the goal is to provide a scalable

NoSE: Automated schema design for Cassandra

2017-05-09 Thread Michael Mior
Hi all, I wanted to share a tool I've been working on that tries to help automate the schema design process for Cassandra. The short description is that you provide information on the kind of data you want to store and the queries and updates you want to issue, and NoSE will perform a cost-based