Re: Less frequent flushing with LCS

2015-03-02 Thread Dan Kinder
Nope, they flush every 5 to 10 minutes. On Mon, Mar 2, 2015 at 1:13 PM, Daniel Chia danc...@coursera.org wrote: Do the tables look like they're being flushed every hour? It seems like the setting memtable_flush_after_mins which I believe defaults to 60 could also affect how often your tables

RDD partitions per executor in Cassandra Spark Connector

2015-03-02 Thread Rumph, Frens Jan
Hi all, I didn't find the *issues* button on https://github.com/datastax/spark-cassandra-connector/ so I'm posting here. Does anyone have an idea why token ranges are grouped into one partition per executor? I expected at least one per core. Any suggestions on how to work around this? Doing a

Re: Should a node that is bootstrapping be receiving writes in addition to the streams it is receiving?

2015-03-02 Thread Robert Coli
On Mon, Mar 2, 2015 at 1:58 PM, Paulo Ricardo Motta Gomes paulo.mo...@chaordicsystems.com wrote: I also checked via JMX and all the write counts are zero. Is the node supposed to receive writes during bootstrap? As I understand it, yes. The other funny thing during bootstrap, is that

Re: Less frequent flushing with LCS

2015-03-02 Thread Daniel Chia
Do the tables look like they're being flushed every hour? It seems like the setting memtable_flush_after_mins which I believe defaults to 60 could also affect how often your tables are flushed. Thanks, Daniel On Mon, Mar 2, 2015 at 11:49 AM, Dan Kinder dkin...@turnitin.com wrote: I see, thanks

Re: Should a node that is bootstrapping be receiving writes in addition to the streams it is receiving?

2015-03-02 Thread Paulo Ricardo Motta Gomes
I'm also facing a similar issue while bootstrapping a replacement node via -Dreplace_address flag. The node is streaming data from neighbors, but cfstats shows 0 counts for all metrics of all CFs in the bootstrapping node: SSTable count: 0 SSTables in each level: [0, 0, 0, 0, 0,

Re: Reboot: Read After Write Inconsistent Even On A One Node Cluster

2015-03-02 Thread Robert Coli
On Mon, Mar 2, 2015 at 11:44 AM, Dan Kinder dkin...@turnitin.com wrote: I had been having the same problem as in those older post: http://mail-archives.apache.org/mod_mbox/cassandra-user/201411.mbox/%3CCAORswtz+W4Eg2CoYdnEcYYxp9dARWsotaCkyvS5M7+Uo6HT1=a...@mail.gmail.com%3E As I said on that

Re: Reboot: Read After Write Inconsistent Even On A One Node Cluster

2015-03-02 Thread Dan Kinder
Yeah I thought that was suspicious too, it's mysterious and fairly consistent. (By the way I had error checking but removed it for email brevity, but thanks for verifying :) ) On Mon, Mar 2, 2015 at 4:13 PM, Peter Sanford psanf...@retailnext.net wrote: Hmm. I was able to reproduce the behavior

Re: Reboot: Read After Write Inconsistent Even On A One Node Cluster

2015-03-02 Thread Peter Sanford
Hmm. I was able to reproduce the behavior with your Go program on my dev machine (C* 2.0.12). I was hoping it was going to just be an unchecked error from the .Exec() or .Scan(), but that is not the case for me. The fact that the issue seems to happen on loop iteration 10, 100 and 1000 is pretty

Re: Reboot: Read After Write Inconsistent Even On A One Node Cluster

2015-03-02 Thread Dan Kinder
Done: https://issues.apache.org/jira/browse/CASSANDRA-8892 On Mon, Mar 2, 2015 at 3:26 PM, Robert Coli rc...@eventbrite.com wrote: On Mon, Mar 2, 2015 at 11:44 AM, Dan Kinder dkin...@turnitin.com wrote: I had been having the same problem as in those older post:

Re: Node stuck in joining the ring

2015-03-02 Thread Phil Yang
I encountered a similar situation where streaming could not finish, not only when joining but also when removing a node. My workaround is: restart every node in the cluster before you start the new node. In my experience, stuck streaming only shows up on nodes that have been running for many days

Re: using or in select query in cassandra

2015-03-02 Thread Jens Rantil
Hi Rahul, No, you can't do this in a single query. You will need to execute two separate queries if the requirements are on different columns. However, if you'd like to select multiple rows with a restriction on the same column, you can do that using the `IN` construct: select * from table where

Re: how to make unique coloumns in cassandra

2015-03-02 Thread Peter Lin
Use an RDBMS. There is a reason constraints were created and why Cassandra doesn't have them. Sent from my iPhone On Mar 2, 2015, at 2:23 AM, Rahul Srivastava srivastava.robi...@gmail.com wrote: but what if i want to fetch the value using on table then this idea might fail On Mon, Mar 2,

Re: How to extract all the user id from a single table in Cassandra?

2015-03-02 Thread Jens Rantil
Hi Check, Please avoid double posting on mailing lists. It leads to double work (respect people's time!) and makes it hard for people in the future having the same issue as you to follow discussions and answers. That said, if you have a lot of primary keys select user_id from

Re: Composite Keys in cassandra 1.2

2015-03-02 Thread Kai Wang
AFAIK it's not possible. The fact that you need to query the data by partial row key indicates your data model isn't right. What are your typical queries on the data? On Sun, Mar 1, 2015 at 7:24 AM, Yulian Oifa oifa.yul...@gmail.com wrote: Hello to all. Let's assume a scenario where key is compound

Re: Optimal Batch size (Unlogged) for Java driver

2015-03-02 Thread Ajay
I have a column family with 15 columns: a timestamp, a timeuuid, a few text fields, and the rest int fields. If I calculate the size of each column name and its value and divide 5 KB (the recommended max size for a batch) by that total, I get 12 as the result. Is that correct? Am I missing
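For reference, the arithmetic described in this message can be sketched as below. This is only a rough illustration: the column names, the text lengths, and the helper names `estimate_row_bytes` and `rows_per_batch` are all hypothetical, the fixed sizes assume 8 bytes for a timestamp, 16 for a timeuuid and 4 for an int, and the estimate ignores any per-cell or protocol overhead.

```python
# Rough estimate of how many rows fit under a batch size limit.
# All column names and text lengths below are made-up examples.

def estimate_row_bytes(columns):
    """Sum the UTF-8 length of each column name plus its value size in bytes."""
    return sum(len(name.encode("utf-8")) + value_bytes
               for name, value_bytes in columns)

def rows_per_batch(row_bytes, limit_bytes=5 * 1024):
    """Integer number of rows that fit in limit_bytes (at least 1)."""
    return max(1, limit_bytes // row_bytes)

# Hypothetical 15-column row: 1 timestamp, 1 timeuuid, 3 text, 10 int columns.
example_columns = (
    [("event_time", 8), ("event_id", 16)]
    + [("text_col_%d" % i, 100) for i in range(3)]   # assumed ~100-byte text values
    + [("int_col_%d" % i, 4) for i in range(10)]
)

row_bytes = estimate_row_bytes(example_columns)
batch_rows = rows_per_batch(row_bytes)
```

With a different mix of text lengths the result changes accordingly, so the figure of 12 from the message is plausible only for a row of roughly 400 bytes.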

Re: Optimal Batch size (Unlogged) for Java driver

2015-03-02 Thread Ajay
Hi Ankush, We are already using Prepared statement and our case is a time series data as well. Thanks Ajay On 02-Mar-2015 10:00 pm, Ankush Goyal ank...@gmail.com wrote: Ajay, First of all, I would recommend using PreparedStatements, so you only would be sending the variable bound arguments

Re: using or in select query in cassandra

2015-03-02 Thread Jonathan Haddad
I'd like to add that in() is usually a bad idea. It is convenient, but not really what you want in production. Go with Jens' original suggestion of multiple queries. I recommend reading Ryan Svihla's post on why in() is generally a bad thing:

Datastax Agent 5.1+ Configuration

2015-03-02 Thread Robert Halstead
I recently attempted to get our cassandra instances talking securely to one another with ssl opscenter communication. We are using DSE 4.6, opscenter 5.1. While a lot of the datastax documentation is fairly good, when it comes to advanced configuration topics or security configuration, I find

Re: using or in select query in cassandra

2015-03-02 Thread Robert Wille
I would also like to add that if you avoid IN and use async queries instead, it is pretty trivial to use a semaphore or some other limiting mechanism to put a ceiling on the amount of concurrent work you are sending to the cluster. If you use a query with an IN clause with a thousand things,
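A minimal sketch of the semaphore idea mentioned in this message, using Python's asyncio rather than any particular Cassandra driver. The `fake_query` coroutine is a hypothetical stand-in for a single-key lookup, and `run_bounded` and `max_in_flight` are illustrative names, not part of any driver API.

```python
import asyncio

async def run_bounded(keys, query, max_in_flight=32):
    """Run query(key) for every key, with at most max_in_flight running at once."""
    semaphore = asyncio.Semaphore(max_in_flight)

    async def one(key):
        # Each lookup waits here until a slot is free.
        async with semaphore:
            return await query(key)

    # gather preserves the order of the input keys in its result list.
    return await asyncio.gather(*(one(key) for key in keys))

async def fake_query(key):
    """Hypothetical stand-in for one single-partition query."""
    await asyncio.sleep(0)
    return {"id": key}

if __name__ == "__main__":
    rows = asyncio.run(run_bounded(range(1000), fake_query, max_in_flight=16))
    print(len(rows))
```

The point of the design is that the ceiling is enforced on the client side, so a thousand keys become a thousand small queries of which only a bounded number are outstanding at any moment.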

Re: Running Cassandra on mixed OS

2015-03-02 Thread Jonathan Haddad
I would really not recommend this. There are enough issues that can come up with a distributed database that can make it hard to pinpoint problems. In an ideal world, every machine would be completely identical. Don't set yourself up for failure. Pin the OS and all packages to specific versions. On

RE: Running Cassandra on mixed OS

2015-03-02 Thread SEAN_R_DURITY
This is not for the long haul, but in order to accomplish an OS upgrade across the cluster, without taking an outage. Sean Durity From: Jonathan Haddad [mailto:j...@jonhaddad.com] Sent: Monday, March 02, 2015 1:15 PM To: user@cassandra.apache.org Subject: Re: Running Cassandra on mixed OS I

Re: Running Cassandra on mixed OS

2015-03-02 Thread Robert Coli
On Mon, Mar 2, 2015 at 6:43 AM, sean_r_dur...@homedepot.com wrote: Have any of you run a single Cassandra cluster on a mix of OS (Red Hat 5 and 6, for example), but with the same JVM? Any issues or concerns? If there are problems, how do you handle OS upgrades? If you are running the same

Re: Node stuck in joining the ring

2015-03-02 Thread Nate McCall
Can you verify that cassandra-rackdc.properties and cassandra-topology.properties are the same on the cluster? On Thu, Feb 26, 2015 at 7:52 AM, Batranut Bogdan batra...@yahoo.com wrote: No errors in the system.log file [root@cassa09 cassandra]# grep ERROR system.log [root@cassa09 cassandra]#

Re: sstables remain after compaction

2015-03-02 Thread Robert Coli
On Sat, Feb 28, 2015 at 5:39 PM, Jason Wee peich...@gmail.com wrote: Hi Rob, sorry for the late response, festive season here. cassandra version is 1.0.8 and thank you, I will read on the READ_STAGE threads. 1.0.8 is pretty seriously old in 2015. I would upgrade to at least 1.2.x (via 1.1.x)

set selinux context for cassandra to talk to website

2015-03-02 Thread Tim Dunphy
Hey all, Ok I have a website being powered by Cassandra 2.1.3. And I notice if selinux is set to off, the site works beautifully! However as soon as I set selinux to on, I am seeing the following error: Warning: require_once(/McFrazier/PhpBinaryCql/CqlClient.php): failed to open stream:

Re: does need to disable 'rpc_keepalive' if 'rpc_max_threads' is get larger?

2015-03-02 Thread Robert Coli
On Sun, Mar 1, 2015 at 6:40 PM, pprun pprun.dra...@gmail.com wrote: rpc_max_threads is set to 2048 and 'rpc_server_type' is 'hsha'. After 2 days of running, high I/O activity was observed and the number of RPC threads grew to 2048; VisualVM shows most of them are

Re: Less frequent flushing with LCS

2015-03-02 Thread Dan Kinder
I see, thanks for the input. Compression is not enabled at the moment, but I may try increasing that number regardless. Also I don't think in-memory tables would work since the dataset is actually quite large. The pattern is more like a given set of rows will receive many overwriting updates and

Reboot: Read After Write Inconsistent Even On A One Node Cluster

2015-03-02 Thread Dan Kinder
Hey all, I had been having the same problem as in those older post: http://mail-archives.apache.org/mod_mbox/cassandra-user/201411.mbox/%3CCAORswtz+W4Eg2CoYdnEcYYxp9dARWsotaCkyvS5M7+Uo6HT1=a...@mail.gmail.com%3E To summarize it, on my local box with just one cassandra node I can update and then

best practices for time-series data with massive amounts of records

2015-03-02 Thread Clint Kelly
Hi all, I am designing an application that will capture time series data where we expect the number of records per user to potentially be extremely high. I am not sure if we will eclipse the max row size of 2B elements, but I assume that we would not want our application to approach that size

RE: sstables remain after compaction

2015-03-02 Thread SEAN_R_DURITY
In my experience, you do not want to stay on 1.1 very long. 1.0.8 was very stable. 1.1 can get bad in a hurry. 1.2 (with many things moved off-heap) is very much better. Sean Durity – Cassandra Admin, Big Data Team From: Robert Coli [mailto:rc...@eventbrite.com] Sent: Monday, March 02, 2015

Re: What are the factors that affect the release time of each minor version?

2015-03-02 Thread Aleksey Yeschenko
Hi Phil, Right now there is no explicit scheme for minor releases scheduling. Eventually we just decide that it’s time for a new release - usually when the CHANGES list feels too long - and start the process. what are the duties to release a version? Need to build and eventually publish all

Re: how to make unique coloumns in cassandra

2015-03-02 Thread Ajaya Agrawal
Please be clear in your questions and spend some time writing them so that other people know what you are trying to ask. I can't read your mind. :) Back to your question: Assuming that you need to search based on the values of the unique column, invert the index in an auxiliary table. So