Re: Slow down of secondary index query with VNODE (C* version 1.2.18, jre6).

2014-09-19 Thread DuyHai Doan
Hello Jay, your query is: select * from keyspaceuser.company_testusers where lastname = 'lau' LIMIT 1 Why do you think that the slowness is due to vnodes and not to your query asking for 10 000 results? On Fri, Sep 19, 2014 at 3:33 AM, Jay Patel pateljay3...@gmail.com wrote: Hi there, We

Re: Slow down of secondary index query with VNODE (C* version 1.2.18, jre6).

2014-09-19 Thread Jonathan Haddad
Keep in mind secondary indexes in cassandra are not there to improve performance, or even really be used in a serious user facing manner. Build and maintain your own view of the data, it'll be much faster. On Thu, Sep 18, 2014 at 6:33 PM, Jay Patel pateljay3...@gmail.com wrote: Hi there, We

RE: Slow down of secondary index query with VNODE (C* version 1.2.18, jre6).

2014-09-19 Thread Parag Patel
Agreed. We only use secondary indexes for column families that are relatively small (~5k rows). For anything larger, we store the data into a wide row (but this depends on your data model) -Original Message- From: jonathan.had...@gmail.com [mailto:jonathan.had...@gmail.com] On Behalf

what's cool about cassandra 2.1.0?

2014-09-19 Thread Tim Dunphy
Hey all, I tried googling around to get an idea about what was new (and potentially cool) in the newest release of cassandra - 2.1.0. But all that I've been able to find so far is this kind of general statement about the new features.

Re: what's cool about cassandra 2.1.0?

2014-09-19 Thread DuyHai Doan
Hello Tim From this blog (http://www.datastax.com/dev/blog/whats-new-in-cassandra-2-1) you should find the pointers to other big topics of 2.1 On Fri, Sep 19, 2014 at 3:33 PM, Tim Dunphy bluethu...@gmail.com wrote: Hey all, I tried googling around to get an idea about what was new (and

Re: what's cool about cassandra 2.1.0?

2014-09-19 Thread Tim Dunphy
Thanks I'll check that out! Really appreciate that! On Fri, Sep 19, 2014 at 10:07 AM, DuyHai Doan doanduy...@gmail.com wrote: Hello Tim From this blog ( http://www.datastax.com/dev/blog/whats-new-in-cassandra-2-1) you should find the pointers to other big topics of 2.1 On Fri, Sep 19,

Wide Rows - Data Model Design

2014-09-19 Thread Check Peck
I am trying to use wide rows concept in my data modelling design for Cassandra. We are using Cassandra 2.0.6. CREATE TABLE test_data ( test_id int, client_name text, record_data text, creation_date timestamp, last_modified_date timestamp, PRIMARY KEY

Re: Wide Rows - Data Model Design

2014-09-19 Thread Jonathan Lacefield
Hello, Yes, this is a wide row table design. The first column is your partition key. The remaining 2 columns are clustering columns. You will receive result sets ordered by client_name, record_data when running that query. Jonathan Jonathan Lacefield Solution
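
To make the ordering claim above concrete, here is a small in-memory sketch in Python. It is not Cassandra code: it only models one partition per test_id whose rows are kept in clustering order. It assumes the primary key is (test_id, client_name, record_data), as the reply describes; the class name `WideRowTable` and its methods are hypothetical.

```python
class WideRowTable:
    """Toy model of a wide-row table (assumed key layout, not Cassandra code).

    Partition key:      test_id
    Clustering columns: client_name, record_data
    """

    def __init__(self):
        # test_id -> {(client_name, record_data): other column values}
        self._partitions = {}

    def insert(self, test_id, client_name, record_data, **columns):
        # Writing the same full primary key again overwrites the row (upsert).
        partition = self._partitions.setdefault(test_id, {})
        partition[(client_name, record_data)] = columns

    def select(self, test_id):
        # All rows of one partition, sorted by the clustering columns.
        partition = self._partitions.get(test_id, {})
        return [(key, partition[key]) for key in sorted(partition)]


table = WideRowTable()
table.insert(1, "bob", "r2", creation_date="2014-09-19")
table.insert(1, "alice", "r9", creation_date="2014-09-18")
table.insert(1, "bob", "r1", creation_date="2014-09-17")
table.insert(2, "zed", "r0", creation_date="2014-09-16")

for clustering_key, columns in table.select(1):
    print(clustering_key, columns)
```

Whether a partition counts as "wide" depends on how many (client_name, record_data) combinations exist per test_id, which is the cardinality question raised in this thread.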

Re: Wide Rows - Data Model Design

2014-09-19 Thread DuyHai Doan
Does my above table fall under the category of wide rows in Cassandra or not? -- It depends on the cardinality. For each distinct test_id, how many combinations of client_name/record_data do you have? By the way, why do you put record_data as part of the primary key? In your table partition

Re: Wide Rows - Data Model Design

2014-09-19 Thread Check Peck
@DuyHai - I have put that because of this condition - In this table, we can have multiple record_data for same client_name. It can be multiple combinations of client_name and record_data for each distinct test_id. On Fri, Sep 19, 2014 at 8:48 AM, DuyHai Doan doanduy...@gmail.com wrote: Does

Re: Wide Rows - Data Model Design

2014-09-19 Thread DuyHai Doan
Ahh yes, sorry, I read too fast, missed it. On Fri, Sep 19, 2014 at 5:54 PM, Check Peck comptechge...@gmail.com wrote: @DuyHai - I have put that because of this condition - In this table, we can have multiple record_data for same client_name. It can be multiple combinations of client_name

Re: Slow down of secondary index query with VNODE (C* version 1.2.18, jre6).

2014-09-19 Thread Tyler Hobbs
Jon's advice is definitely still true, but in 2.1 there is https://issues.apache.org/jira/browse/CASSANDRA-1337, which parallelizes the fetching of ranges. On Fri, Sep 19, 2014 at 6:57 AM, Parag Patel ppa...@clearpoolgroup.com wrote: Agreed. We only use secondary indexes for column families

Re: Slow down of secondary index query with VNODE (C* version 1.2.18, jre6).

2014-09-19 Thread Jay Patel
Thanks folks for all your inputs! Yes, I totally agree that we need to have a custom column family for indexing. However, we're trying to upgrade our existing cluster from non-vnode to vnode, and queries using secondary indexes, which used to perform well with non-vnode, break badly. Btw, there is no

Re: Blocking while a node finishes joining the cluster after restart.

2014-09-19 Thread Kevin Burton
Hi Kevin, if you are using the latest version of opscenter, then even the community (= free) edition can do a rolling restart of your cluster. It's pretty convenient. We’re using ansible so I’d like something that integrates with that… On Tue, Sep 16, 2014 at 11:09 AM, Duncan Sands

Re: Blocking while a node finishes joining the cluster after restart.

2014-09-19 Thread Kevin Burton
This is great feedback… I think it could actually be even easier than this… You could have an ansible (or whatever cluster management system you’re using) role for just seeds. Then you would serially restart all seeds one at a time. You would need to run ‘nodetool status’ and make sure the
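
A minimal Python sketch of the "restart one node, then check `nodetool status`" step described above. It assumes only that each node line in the `nodetool status` output starts with a two-letter status/state code (for example UN for Up/Normal) followed by the node address. The function names, timeout, and polling interval are hypothetical, and an Ansible role would need to wrap this itself.

```python
import re
import subprocess
import time

# A node line starts with Up/Down + Normal/Leaving/Joining/Moving, then the address.
_NODE_LINE = re.compile(r"^([UD][NLJM])\s+(\S+)", re.MULTILINE)


def node_states(status_output):
    """Map node address -> two-letter state code parsed from `nodetool status` text."""
    return {addr: state for state, addr in _NODE_LINE.findall(status_output)}


def all_up_normal(status_output):
    """True only if at least one node is listed and every node is UN."""
    states = node_states(status_output)
    return bool(states) and all(state == "UN" for state in states.values())


def run_nodetool_status():
    """Run `nodetool status` locally and return its stdout (requires nodetool on PATH)."""
    result = subprocess.run(
        ["nodetool", "status"], capture_output=True, text=True, check=True
    )
    return result.stdout


def wait_until_normal(fetch=run_nodetool_status, timeout_s=600, poll_s=5, sleep=time.sleep):
    """Poll until every node reports UN, or give up after timeout_s seconds."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if all_up_normal(fetch()):
            return True
        sleep(poll_s)
    return False
```

A serial restart would call `wait_until_normal()` after restarting each seed and stop the rollout if it returns False.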

Upgrade to DSE 4.5

2014-09-19 Thread cass savy
We run DSE 3.1.3 and only use Cassandra in our prod cluster. Which release do I need to be on right away? If I need to upgrade to DSE 4.5 / C* 2.0.7, I need to take 3 paths to get there. I see a lot of improvements for Solr/Hadoop features in DSE 4.0 and above. Can I upgrade to

Re: Blocking while a node finishes joining the cluster after restart.

2014-09-19 Thread Jonathan Haddad
Depending on how you query (one or quorum) you might be able to do 1 rack at a time (or az or whatever you've got) assuming your snitch is set up right On Sep 19, 2014, at 11:30 AM, Kevin Burton bur...@spinn3r.com wrote: This is great feedback… I think it could actually be even easier

Re: Blocking while a node finishes joining the cluster after restart.

2014-09-19 Thread Tyler Hobbs
On Fri, Sep 19, 2014 at 1:26 PM, Kevin Burton bur...@spinn3r.com wrote: We’re using ansible so I’d like something that integrates with that… I'm not familiar with Ansible, so I don't know if it's useful, but OpsCenter has a REST api you can use to do anything you can do from the UI. For

can't launch cassandra 2.1.0

2014-09-19 Thread Tim Dunphy
Hey all, I'm attempting to upgrade from cassandra 2.0.10 to version 2.1.0. However when launching the new version I'm running into the following: [root@beta-new:/etc/alternatives/cassandrahome] #./bin/cassandra -f SLF4J: Class path contains multiple SLF4J bindings. SLF4J: Found binding in

Re: can't launch cassandra 2.1.0

2014-09-19 Thread DuyHai Doan
java.lang.NoSuchMethodError -- Seems like there is inconsistency with your jar dependencies On Fri, Sep 19, 2014 at 11:05 PM, Tim Dunphy bluethu...@gmail.com wrote: Hey all, I'm attempting to upgrade from cassandra 2.0.10 to version 2.1.0. However when launching the new version I'm

Re: Slow down of secondary index query with VNODE (C* version 1.2.18, jre6).

2014-09-19 Thread Tyler Hobbs
On Fri, Sep 19, 2014 at 12:41 PM, Jay Patel pateljay3...@gmail.com wrote: Btw, there is no data in the table. Table is empty. Query is fired on the empty table. This is actually the worst case for secondary index lookups. From the tracing output, I don't understand why it's doing multiple

Is it wise to increase native_transport_max_threads if we have lots of CQL clients?

2014-09-19 Thread Donald Smith
If we have hundreds of CQL clients (for C* 2.0.9), should we increase native_transport_max_threads in cassandra.yaml from the default (128) to the number of clients? If we don't do that, I presume requests will queue up, resulting in higher latency. What's a reasonable max value for

Re: Slow down of secondary index query with VNODE (C* version 1.2.18, jre6).

2014-09-19 Thread DuyHai Doan
It will merge requests to neighboring ranges when the same node is a replica for both of them. Without vnodes, this usually results in all ranges for a node being merged. With vnodes, merging still happens, but not all ranges can be merged. -- But does it imply that with vnodes, there are
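
The merging behaviour quoted above can be illustrated with a toy model in Python. This is a simplified sketch, not Cassandra's implementation: it assumes a SimpleStrategy-like placement where the replicas of a range are the next `rf` distinct nodes on the ring, and it greedily merges consecutive ranges for as long as they still share at least one replica. The function names and the node labels are hypothetical, and a regular interleaving stands in for the random token assignment that vnodes really use, to keep the result deterministic.

```python
def replicas_per_range(owners, rf):
    """owners[i] is the node owning range i, in ring order.

    Returns, for each range, the set of the next `rf` distinct nodes
    walking clockwise from that range (toy replica placement).
    """
    n = len(owners)
    result = []
    for i in range(n):
        replicas = []
        j = i
        while len(replicas) < rf and j < i + n:
            node = owners[j % n]
            if node not in replicas:
                replicas.append(node)
            j += 1
        result.append(frozenset(replicas))
    return result


def merge_ranges(replica_sets):
    """Greedily merge consecutive ranges that still have a replica in common.

    Returns a list of (range indexes, common replicas); each entry stands
    for one request the coordinator would send in this model.
    """
    groups = []
    for idx, replicas in enumerate(replica_sets):
        if groups and groups[-1][1] & replicas:
            ranges, common = groups[-1]
            groups[-1] = (ranges + [idx], common & replicas)
        else:
            groups.append(([idx], replicas))
    return groups


nodes = ["a", "b", "c", "d", "e", "f"]

# One token per node: 6 ranges on the ring.
single_token = merge_ranges(replicas_per_range(nodes, rf=3))

# Four tokens per node, interleaved: 24 ranges on the same 6 nodes.
many_tokens = merge_ranges(replicas_per_range(nodes * 4, rf=3))

print(len(single_token), "requests with one token per node")
print(len(many_tokens), "requests with four tokens per node")
```

In this toy layout the 6 single-token ranges collapse into 2 requests, while the 24-range layout still needs 8: merging happens in both cases, but more ranges leave more requests behind.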

Re: Slow down of secondary index query with VNODE (C* version 1.2.18, jre6).

2014-09-19 Thread Tyler Hobbs
On Fri, Sep 19, 2014 at 4:19 PM, DuyHai Doan doanduy...@gmail.com wrote: But does it imply that with vnodes, there is actually extra work to do for scanning indices? Yes. If yes, is this extra load rather I/O bound or CPU bound? It doesn't necessarily change what the query is

Help with approach to remove RDBMS schema from code to move to C*?

2014-09-19 Thread Les Hartzman
My company is using an RDBMS for storing time-series data. This application was developed before Cassandra and NoSQL. I'd like to move to C*, but ... The application supports data coming from multiple models of devices. Because there is enough variability in the data, the main table to hold the

Re: Slow down of secondary index query with VNODE (C* version 1.2.18, jre6).

2014-09-19 Thread Robert Coli
On Fri, Sep 19, 2014 at 2:19 PM, DuyHai Doan doanduy...@gmail.com wrote: But does it imply that with vnodes, there is actually extra work to do for scanning indices? Vnodes are just nodes, so they have all the problems-associated-with-many-nodes one would get with 256x as many nodes.

Re: Slow down of secondary index query with VNODE (C* version 1.2.18, jre6).

2014-09-19 Thread Jay Patel
Thanks Tyler for the details. I'm still trying to understand what you described. To simplify my question, here is what I don't understand: when the coordinator fires an indexed scan request to node 192.168.51.22, why doesn't it ask that node to check all of its (at least primary) ranges for the queried

Upgrade steps to address CASSANDRA-4411

2014-09-19 Thread Randy Fradin
I have a question about the steps listed in this article for addressing CASSANDRA-4411 https://issues.apache.org/jira/browse/CASSANDRA-4411 in an upgrade from a version = 1.1.3 or to a version = 1.1.5 when using leveled compaction: http://www.datastax.com/docs/1.1/install/upgrading#upgrade-steps

Re: Slow down of secondary index query with VNODE (C* version 1.2.18, jre6).

2014-09-19 Thread Jay Patel
Thanks Robert for your input, but that sounds a little crazy to me. The physical node is still the same, so why can't it just do one indexed scan for all the contiguous or non-contiguous token ranges (vnodes) held by that physical node? I doubt that it needs to respect token order for some reason hence

Re: Slow down of secondary index query with VNODE (C* version 1.2.18, jre6).

2014-09-19 Thread Tyler Hobbs
On Fri, Sep 19, 2014 at 4:53 PM, Jay Patel pateljay3...@gmail.com wrote: When the coordinator fires an indexed scan request to node 192.168.51.22, why doesn't it ask that node to check all of its (at least primary) ranges for the queried data at once? Also, internally that node should be able to

Re: Slow down of secondary index query with VNODE (C* version 1.2.18, jre6).

2014-09-19 Thread Jay Patel
Thanks Tyler for the clarification. I've opened a ticket, CASSANDRA-7982: https://issues.apache.org/jira/browse/CASSANDRA-7982. For now, I've assigned it to myself and put you as a reviewer. Please change the assignment as you prefer. Assume that we now batch the requests and send only one request to the replica:

Re: Help with approach to remove RDBMS schema from code to move to C*?

2014-09-19 Thread James Briggs
Most of the C* success stories are for greenfield applications. Migrating from one database to another database is a lot of work. C* offers no magical path. If you only have a few tables and minor RDBMS feature dependencies, it can be done. Make sure your users and QA people are cooperative

Re: Blocking while a node finishes joining the cluster after restart.

2014-09-19 Thread James Briggs
Kevin: The serial approach would take a LONG time for large clusters. If you have sixty nodes, it could take an hour to do a rolling restart. 1) In Cassandra land, an hour is nothing. There's people doing repairs that practically never finish - as soon as one finishes after a week, they have

Re: what's cool about cassandra 2.1.0?

2014-09-19 Thread James Briggs
I'll be blunt. The reason to use the latest 2.0 or soon 2.1 is because Apple has committed 20 patches that make Cassandra operationally useful. Apple is the QA lab for Cassandra. Their conference talk was very exciting. I hope a video of that gets posted in October. Thanks, James Briggs. --

Re: Help with approach to remove RDBMS schema from code to move to C*?

2014-09-19 Thread Jack Krupansky
Start by asking how you intend to query the data. That should drive the data model. Is there existing app client code or an app layer that is already using the current schema, or are you intending to rewrite that as well? FWIW, you could place the numeric columns in a numeric map collection,