Re: Accessing Cassandra data from Spark Shell

2016-05-18 Thread Ben Slater
It definitely should be possible for 1.5.2 (I have used spark-shell and the cassandra connector with 1.4.x). The main trick is lining up all the versions and building an appropriate connector jar. Cheers, Ben. On Wed, 18 May 2016 at 15:40 Cassa L wrote: > Hi, > I

Setting bloom_filter_fp_chance < 0.01

2016-05-18 Thread Adarsh Kumar
Hi, What is the impact of setting bloom_filter_fp_chance < 0.01? During performance tuning I was trying to tune bloom_filter_fp_chance and have the following questions: 1) Why is bloom_filter_fp_chance = 0 not allowed? (https://issues.apache.org/jira/browse/CASSANDRA-5013) 2) What is the

Re: Extending a partially upgraded cluster - supported

2016-05-18 Thread Erik Forsberg
On 2016-05-18 20:19, Jeff Jirsa wrote: You can’t stream between versions, so in order to grow the cluster, you’ll need to be entirely on 2.0 or entirely on 2.1. OK. I was sure you can't stream between a 2.0 node and a 2.1 node, but if I understand you correctly you can't stream between two

Replication lag between data center

2016-05-18 Thread cass savy
How can we determine/measure the replication lag or latency between on-premises data centers, or across regions/availability zones?

Re: Replication lag between data center

2016-05-18 Thread Jeff Jirsa
Cassandra isn’t a traditional DB – it doesn’t “replicate” in the same way that a relational DB replicates. Cassandra clients send mutations (via native protocol or thrift). Those mutations include a minimum consistency level the server must satisfy before returning a successful write. If a write says
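Jeff's point, that a write is acknowledged as soon as enough replicas respond rather than shipped through a replication log, can be sketched with a toy model. This is illustrative only, not driver code; the function names are invented for the sketch:

```python
# Illustrative model of Cassandra's write path: the client picks a
# consistency level, and the coordinator reports success once enough
# replicas acknowledge the mutation. Remaining replicas converge
# asynchronously (hinted handoff / repair), which is why there is no
# single "replication lag" number to read off a node.

def required_acks(consistency_level: str, replication_factor: int) -> int:
    """How many replica acks the coordinator needs before answering OK."""
    levels = {
        "ONE": 1,
        "TWO": 2,
        "QUORUM": replication_factor // 2 + 1,
        "ALL": replication_factor,
    }
    return levels[consistency_level]

def write_succeeds(consistency_level: str, replication_factor: int,
                   acks_received: int) -> bool:
    """A write 'succeeds' when the minimum consistency level is met."""
    return acks_received >= required_acks(consistency_level, replication_factor)

# With RF=3, a QUORUM write needs only 2 acks; the third replica may
# still be behind at the moment the client sees success.
print(required_acks("QUORUM", 3))      # 2
print(write_succeeds("QUORUM", 3, 2))  # True
print(write_succeeds("ALL", 3, 2))     # False
```

The practical consequence for measuring "lag": any replica not in the ack set is eventually consistent, so lag is bounded by hint delivery and repair schedules rather than a fixed replication pipeline.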

Re: Setting bloom_filter_fp_chance < 0.01

2016-05-18 Thread Adarsh Kumar
Hi Sai, We have a use case where we are designing a table that is going to have around 50 billion rows, and we require very fast reads. Partitions are not that complex/big; they hold some validation data for duplicate checks (consisting of 4-5 int and varchar columns). So we were trying various options to

Re: Accessing Cassandra data from Spark Shell

2016-05-18 Thread Cassa L
I tried all combinations of spark-cassandra connector. Didn't work. Finally, I downgraded spark to 1.5.1 and now it works. LCassa On Wed, May 18, 2016 at 11:11 AM, Mohammed Guller wrote: > As Ben mentioned, Spark 1.5.2 does work with C*. Make sure that you are > using

Intermittent CAS error

2016-05-18 Thread Robert Wille
When executing bulk CAS queries, I intermittently get the following error: SERIAL is not supported as conditional update commit consistency. Use ANY if you mean "make sure it is accepted but I don't care how many replicas commit it for non-SERIAL reads". This doesn’t make any sense. Obviously,
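For context, the CAS queries in question are Cassandra lightweight transactions: conditional writes that apply only if the current value matches an expected one. A toy model of the semantics (illustrative only; the real thing runs a Paxos round across replicas, with a separate serial consistency level for the condition check):

```python
# Toy compare-and-set, mirroring what a Cassandra lightweight
# transaction ("UPDATE ... IF x = ?") promises: the write applies only
# when the condition holds at the moment of the update, and the client
# is told whether it applied.

def cas_update(table: dict, key, expected, new) -> bool:
    """Apply table[key] = new iff table[key] == expected; report outcome."""
    if table.get(key) == expected:
        table[key] = new
        return True
    return False

accounts = {"alice": 100}
print(cas_update(accounts, "alice", 100, 90))  # True  (applied)
print(cas_update(accounts, "alice", 100, 80))  # False (value is now 90)
```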

Re: Low cardinality secondary index behaviour

2016-05-18 Thread DuyHai Doan
Cassandra 3.0.6 does not have SASI. SASI is available only from C* 3.4 but I advise C* 3.5/3.6 because some critical bugs have been fixed in 3.5 On Wed, May 18, 2016 at 1:58 PM, Atul Saroha wrote: > Thanks Tyler, > > SPARSE SASI index solves my use case. Planing to

Re: Low cardinality secondary index behaviour

2016-05-18 Thread Atul Saroha
Thanks Tyler, SPARSE SASI index solves my use case. Planning to upgrade Cassandra to 3.0.6 now. - Atul Saroha *Lead Software Engineer* *M*: +91 8447784271 *T*: +91 124-415-6069

Migrating from Cassandra-Lucene to SASI

2016-05-18 Thread Atul Saroha
From Duy Hai DOAN's blog http://www.doanduyhai.com/blog/?p=2058 : > Please note that SASI does not intercept DELETE for indexing. Indeed the resolution and reconciliation of deleted data is left to Cassandra at read time. SASI only indexes INSERT and UPDATE. With this it feels that Lucene

Re: Cassandra Debian repos (Apache vs DataStax)

2016-05-18 Thread Eric Evans
On Tue, May 17, 2016 at 2:16 PM, Drew Kutcharian wrote: > OK to make things even more confusing, the “Release” files in the Apache Repo > say “Origin: Unofficial Cassandra Packages”!! > > i.e. http://dl.bintray.com/apache/cassandra/dists/35x/:Release Yes, as I remember, someone

Re: Cassandra Debian repos (Apache vs DataStax)

2016-05-18 Thread Eric Evans
On Tue, May 17, 2016 at 2:11 PM, Drew Kutcharian wrote: > BTW, the language on this page should probably change since it currently > sounds like the official repo is the DataStax one and Apache is only an > “alternative” > > http://wiki.apache.org/cassandra/DebianPackaging It

Re: Setting bloom_filter_fp_chance < 0.01

2016-05-18 Thread sai krishnam raju potturi
Hi Adarsh, were there any drawbacks to setting bloom_filter_fp_chance to the default value? Thanks, Sai. On Wed, May 18, 2016 at 2:21 AM, Adarsh Kumar wrote: > Hi, > > What is the impact of setting bloom_filter_fp_chance < 0.01. > > During performance tuning I was

Re: Setting bloom_filter_fp_chance < 0.01

2016-05-18 Thread Jonathan Haddad
The impact is that the Bloom filter will get massively bigger, with very little performance benefit, if any. You can't get 0 because it's a probabilistic data structure. It tells you either: your data is definitely not here, or your data has a pretty decent chance of being here, but never "it's here for sure".
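The size blow-up Jonathan describes falls out of the standard Bloom filter sizing formula. A small sketch of the textbook math (not Cassandra's exact implementation, which keeps one filter per partition key per SSTable):

```python
import math

def bits_per_key(fp_chance: float) -> float:
    """Optimal Bloom filter size, in bits per key, for a target
    false-positive probability p: m/n = -ln(p) / (ln 2)^2."""
    if fp_chance <= 0.0 or fp_chance >= 1.0:
        # A Bloom filter cannot guarantee zero false positives, which is
        # why Cassandra rejects bloom_filter_fp_chance = 0 (CASSANDRA-5013):
        # the formula diverges as p approaches 0.
        raise ValueError("fp_chance must be in (0, 1)")
    return -math.log(fp_chance) / (math.log(2) ** 2)

for p in (0.1, 0.01, 0.001, 0.0001):
    print(f"p={p}: {bits_per_key(p):.1f} bits/key")
# p=0.1:    4.8 bits/key
# p=0.01:   9.6 bits/key
# p=0.001: 14.4 bits/key
```

Each 10x reduction in fp_chance costs roughly 4.8 more bits per key. At the 50 billion rows mentioned earlier in the thread (assuming, for the sake of the estimate, one filter entry per row), going from 0.01 to 0.001 would add on the order of 30 GB of off-heap filter across the cluster.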

RE: Accessing Cassandra data from Spark Shell

2016-05-18 Thread Mohammed Guller
As Ben mentioned, Spark 1.5.2 does work with C*. Make sure that you are using the correct version of the Spark Cassandra Connector. Mohammed Author: Big Data Analytics with Spark From: Ben Slater

Extending a partially upgraded cluster - supported

2016-05-18 Thread Erik Forsberg
Hi! I have a 2.0.13 cluster which I need to do two things with: * Extend it * Upgrade to 2.1.14 I'm pondering in what order to do things. Is it a supported operation to extend a partially upgraded cluster, i.e. a cluster upgraded to 2.0 where not all sstables have been upgraded? If I do

Re: Extending a partially upgraded cluster - supported

2016-05-18 Thread Jeff Jirsa
You can’t stream between versions, so in order to grow the cluster, you’ll need to be entirely on 2.0 or entirely on 2.1. If you go to 2.1 first, be sure you run upgradesstables before you try to extend the cluster. On 5/18/16, 11:17 AM, "Erik Forsberg" wrote: >Hi! >