Duplicate result of get_indexed_slices, depending on indexClause.count

2011-04-15 Thread sam_
Hi All, I have been using Cassandra 0.7.2 and 0.7.4 with Thrift API (using Java). I noticed that if I am querying a Column Family with indexed columns sometimes I get a duplicate result in get_indexed_slices depending on the number of rows in the CF and the count that I set in IndexClause.count.

Re: Indexes on heterogeneous rows

2011-04-15 Thread Wangpei (Peter)
Does get_indexed_slices in version 0.7.4 already work that way? It seems to always take the first indexed column with EQ. Or is it a new feature of the upcoming 0.7.5 or 0.8? -Original Message- From: Jonathan Ellis [mailto:jbel...@gmail.com] Sent: April 15, 2011 0:21 To: user@cassandra.apache.org Cc: David

Re: Duplicate result of get_indexed_slices, depending on indexClause.count

2011-04-15 Thread Jonathan Ellis
https://issues.apache.org/jira/browse/CASSANDRA-2406 On Fri, Apr 15, 2011 at 1:43 AM, sam_ amin_shar...@yahoo.com wrote: Hi All, I have been using Cassandra 0.7.2 and 0.7.4 with Thrift API (using Java). I noticed that if I am querying a Column Family with indexed columns sometimes I get a

CL.ONE gives UnavailableException on ok node

2011-04-15 Thread Mick Semb Wever
Just experienced something i don't understand yet. Running a 3 node cluster successfully for a few days now, then one of the nodes went down (server required reboot). After this the other two nodes kept throwing UnavailableExceptions like UnavailableException() at

How to warm up a cold node

2011-04-15 Thread Héctor Izquierdo Seliva
Hi everyone, is there any recommended procedure to warm up a node before bringing it up? Thanks!

Re: How to warm up a cold node

2011-04-15 Thread Peter Schuller
Hi everyone, is there any recommended procedure to warm up a node before bringing it up? Currently the only out-of-the-box support for warming up caches is that implied by the key cache and row cache, which will pre-heat on start-up. Indexes will be indirectly preheated by index sampling, to

Re: How to warm up a cold node

2011-04-15 Thread Héctor Izquierdo Seliva
How difficult do you think this could be? I would be interested in developing this if it's feasible. On Fri, 15-04-2011 at 16:19 +0200, Peter Schuller wrote: Hi everyone, is there any recommended procedure to warm up a node before bringing it up? Currently the only out-of-the-box

question about performance of Cassandra 0.7.4 under a read-heavy workload.

2011-04-15 Thread 魏金仙
I just deployed Cassandra 0.7.4 as a 6-server cluster and tested its performance via YCSB. The result seems confusing when compared to that of Cassandra 0.6.6. Under a write-heavy workload (i.e., write/read: 50%/50%), Cassandra 0.7.4 obtains a really satisfactory latency. I mean both the read

Consistency model

2011-04-15 Thread James Cipar
I've been experimenting with the consistency model of Cassandra, and I found something that seems a bit unexpected. In my experiment, I have 2 processes, a reader and a writer, each accessing a Cassandra cluster with a replication factor greater than 1. In addition, sometimes I generate

Key cache hit rate

2011-04-15 Thread mcasandra
How to interpret Key cache hit rate? What does this number mean? Keyspace: StressKeyspace Read Count: 87579 Read Latency: 11.792417360326105 ms. Write Count: 179749 Write Latency: 0.009272318622078566 ms. Pending Tasks: 0 Column Family:

Re: What's the best modeling approach for ordering events by date?

2011-04-15 Thread Ethan Rowe
Hi. So, the OPP will direct all activity for a range of keys to a particular node (or set of nodes, in accordance with your replication factor). Depending on the volume of writes, this could be fine. Depending on the distribution of key values you write at any given time, it can also be fine.

Two versions of schema

2011-04-15 Thread mcasandra
Is there a problem? [default@StressKeyspace] update column family StressStandard with keys_cached=100; 854ee0a0-6792-11e0-81f9-93d987913479 Waiting for schema agreement... The schema has not settled in 10 seconds; further migrations are ill-advised until it does. Versions are

Problems with subcolumn retrieval after upgrade from 0.6 to 0.7

2011-04-15 Thread Abraham Sanderson
I'm having some issues with a few of my ColumnFamilies after a Cassandra upgrade/import from 0.6.1 to 0.7.4. I followed the instructions to upgrade and everything seemed to work OK...until I got into the application and noticed some weird behavior. I was getting the following stacktrace in

recurring EOFException exception in 0.7.4

2011-04-15 Thread Jonathan Colby
I've been struggling with these kinds of exceptions for some time now. I thought it might have been a one-time thing, so on the 2 nodes where I saw this problem I pulled in fresh data with a repair on an empty data directory. Unfortunately, this problem is now coming up on a new node that has,

cluster IP question and Jconsole?

2011-04-15 Thread tinhuty he
I have followed the description here http://www.edwardcapriolo.com/roller/edwardcapriolo/entry/lauching_5_node_cassandra_clusters to create 5 instances of Cassandra on one CentOS 5.5 machine. Using nodetool shows the 5 nodes are all running fine. Note the 5 nodes are using IP 127.0.0.1 to

Re: Cassandra 2 DC deployment

2011-04-15 Thread Peter Schuller
You are right about the automatic fallback to ONE. It's quite possible, if 2 nodes die for some reason I will have the same problem. So probably the right thing to do would be to read/write at ONE only when we lose a DC by changing some manual configuration. Since we shouldn't be losing DCs

RE: recurring EOFException exception in 0.7.4

2011-04-15 Thread Dan Hendry
Try running nodetool scrub on the CF: it's pretty good at detecting and fixing most corruption problems. Dan -Original Message- From: Jonathan Colby [mailto:jonathan.co...@gmail.com] Sent: April-15-11 15:41 To: user@cassandra.apache.org Subject: recurring EOFException exception in 0.7.4

RE: Consistency model

2011-04-15 Thread Dan Hendry
So Cassandra does not use an atomic commit protocol at the cluster level. Strong consistency on a quorum read is only guaranteed *after* a successful quorum write. The behaviour you are seeing is possible if you are reading in the middle of a write or the write failed (which should be reported to
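The quorum guarantee mentioned in this reply comes down to simple arithmetic: a read is certain to touch at least one replica that holds the latest successful write only when the read and write replica counts add up to more than the replication factor (R + W > RF). Below is a minimal Python sketch of that check. The helper names `quorum` and `overlaps` are hypothetical and are not part of any Cassandra API.

```python
def quorum(rf: int) -> int:
    """Number of replicas that make up a QUORUM at replication factor rf."""
    return rf // 2 + 1


def overlaps(read_replicas: int, write_replicas: int, rf: int) -> bool:
    """True when any read set must share at least one replica with any write set."""
    return read_replicas + write_replicas > rf


if __name__ == "__main__":
    rf = 3
    # QUORUM read + QUORUM write: 2 + 2 > 3, so the sets always intersect.
    print(overlaps(quorum(rf), quorum(rf), rf))
    # ONE read + ONE write: 1 + 1 <= 3, so a read may miss the latest write.
    print(overlaps(1, 1, rf))
```

The overlap only helps once the write has actually succeeded at its consistency level. A read issued while the write is still in flight, or after the write failed, may or may not see the new value, which matches the behaviour described in the reply.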

Schemas diverging while dynamically creating CF.

2011-04-15 Thread Alejandro Perez
Hello, We're testing cassandra for integration with indextank. In this first try, we're creating one column family for each user. In practice, on the first run and for the first few documents (a few 100s), a new CF is created, and a document is immediately added to it. A few (up to 50) requests

Re: CL.ONE gives UnavailableException on ok node

2011-04-15 Thread Jonathan Ellis
Sure sounds like you have RF=1 to me. On Fri, Apr 15, 2011 at 7:45 AM, Mick Semb Wever m...@apache.org wrote: Just experienced something i don't understand yet. Running a 3 node cluster successfully for a few days now, then one of the nodes went down (server required reboot). After this the

Re: CL.ONE gives UnavailableException on ok node

2011-04-15 Thread Mick Semb Wever
On Fri, 2011-04-15 at 15:43 -0500, Jonathan Ellis wrote: Sure sounds like you have RF=1 to me. Yes that's right. I see... so the answer here is that I should be using CL.ANY? (so the write goes through and hinted handoff can get it to the correct node later on). ~mck -- The fox condemns

RE: Schemas diverging while dynamically creating CF.

2011-04-15 Thread Dan Hendry
Uh... don't create a column family per user. Column families are meant to be fairly static; conceptually equivalent to a table in a relational database. Why do you need (or even want) a CF per user? Reconsider your data model, a single column family with an inverted index for a 'user' column is

Re: Schemas diverging while dynamically creating CF.

2011-04-15 Thread Alejandro Perez
Thanks for the quick response! I will reconsider the schema. However, the problem troubles me somehow. How are schema changes supposed to be done? Should I serialize them, should I halt other cluster operations while I do the schema change? Is this a known problem with Cassandra? The other

Re: CL.ONE gives UnavailableException on ok node

2011-04-15 Thread Jonathan Ellis
Yes, if you want to keep writes available w/ RF=1 then you need to use CL.ANY. On Fri, Apr 15, 2011 at 3:48 PM, Mick Semb Wever m...@apache.org wrote: On Fri, 2011-04-15 at 15:43 -0500, Jonathan Ellis wrote: Sure sounds like you have RF=1 to me. Yes that's right. I see... so the answer here
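The exchange above can be pictured with a small Python sketch. It is a deliberately simplified model and not Cassandra's actual code: the function names are hypothetical, and it assumes at least one node in the cluster is up to accept the write and store a hint.

```python
def required_replicas(cl: str, rf: int) -> int:
    """Live replicas of the key that a write needs at a given consistency level.

    Simplified model: ANY needs no live replica because a hint stored on
    some other live node is enough to acknowledge the write.
    """
    table = {"ANY": 0, "ONE": 1, "QUORUM": rf // 2 + 1, "ALL": rf}
    return table[cl]


def write_is_available(cl: str, rf: int, live_replicas: int) -> bool:
    """False corresponds to the UnavailableException seen in the thread."""
    return live_replicas >= required_replicas(cl, rf)


if __name__ == "__main__":
    # RF=1 and the key's only replica is down (live_replicas = 0).
    print(write_is_available("ONE", rf=1, live_replicas=0))  # rejected
    print(write_is_available("ANY", rf=1, live_replicas=0))  # accepted via a hint
```

With RF=1, every key has exactly one replica, so a write at CL.ONE to a key owned by the downed node has nowhere to go, even though the other two nodes are healthy. A write accepted at CL.ANY is not readable until the hint has been delivered to the real replica.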

Upcoming Bay area Cassandra events

2011-04-15 Thread Jonathan Ellis
FYI, there's a couple Cassandra events coming up in April and May in the Bay area: Wednesday, April 27, 1pm-6pm: Free Cassandra training by DataStax, hosted by Ooyala! *Space is limited*; you can sign up at http://www.datastax.com/freetraining. Wednesday, April 27, 6pm-8pm (yes, the evening of

Re: Cassandra Database Modeling

2011-04-15 Thread Aaron Morton
The rows can have 2 billion columns; max column size is 2 GB. But less than 10 MB sounds like a sane limit for a single column. For the serialisation it depends on what your data looks like, point is that JSON is not space efficient. You may get away with just compressing it (gzip, lzo...),

Re: question about performance of Cassandra 0.7.4 under a read-heavy workload.

2011-04-15 Thread Aaron Morton
Will need to know more about the number of requests, iostats etc. There is no reason for it to run slower. Aaron On 16/04/2011, at 2:35 AM, 魏金仙 sei_...@126.com wrote: I just deployed cassandra 0.7.4 as a 6-server cluster and tested its performance via YCSB. The result seems confusing when

Re: Key cache hit rate

2011-04-15 Thread Aaron Morton
Move the decimal point 4 places to the left. It's the percent of your queries that get a hit from the key cache. Aaron On 16/04/2011, at 6:25 AM, mcasandra mohitanch...@gmail.com wrote: How to interpret Key cache hit rate? What does this number mean? Keyspace: StressKeyspace Read

DatabaseDescriptor.defsVersion

2011-04-15 Thread Jeffrey Wang
Hey all, I've been seeing a very rare issue with schema change conflicts on 0.7.3 (I am serializing all schema changes to a single Cassandra node and waiting for them to finish before continuing). Occasionally a node in the cluster will never report the correct schema, and I think it may have

Re:Re: question about performance of Cassandra 0.7.4 under a read-heavy workload.

2011-04-15 Thread 魏金仙
To make a comparison, 10 threads were run against the two workloads separately. Below is the result of Cassandra 0.7.4. Write-heavy workload (i.e., write/read: 50%/50%) median throughput: 5816 operations/second (i.e., 2908 writes and 2908 reads) update latency: 1.32ms read latency: 1.81ms read

Re: cluster IP question and Jconsole?

2011-04-15 Thread Maki Watanabe
127.0.0.2 to 127.0.0.5 are valid IP addresses. Those are just alias addresses for your loopback interface. Verify: % ifconfig -a 127.0.0.0/8 is for loopback, so you can't connect to this address from remote machines. You may be able to configure SSH port forwarding from your monitoring host to
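The point about 127.0.0.0/8 can be checked with Python's standard `ipaddress` module. This is a small illustrative sketch: the helper name `is_loopback` and the non-loopback sample address are chosen here for illustration and do not come from the thread.

```python
import ipaddress

# The whole 127.0.0.0/8 block is reserved for loopback, not just 127.0.0.1.
LOOPBACK_NET = ipaddress.ip_network("127.0.0.0/8")


def is_loopback(addr: str) -> bool:
    """True if addr falls inside the 127.0.0.0/8 loopback block."""
    return ipaddress.ip_address(addr) in LOOPBACK_NET


if __name__ == "__main__":
    for addr in ("127.0.0.1", "127.0.0.2", "127.0.0.5", "192.168.1.10"):
        print(addr, is_loopback(addr))
```

Any address for which this returns True is only reachable from the machine itself, which is why a remote monitoring host needs something like port forwarding to reach services bound to those aliases.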

Re: cluster IP question and Jconsole?

2011-04-15 Thread tinhuty he
Maki, thanks for your reply. For the second question, I wasn't using the loopback address, I was using the actual IP address for that server. I am able to telnet to that IP on port 8081, but using jconsole failed. -Original Message- From: Maki Watanabe Sent: Friday, April 15, 2011

RE: DatabaseDescriptor.defsVersion

2011-04-15 Thread Jeffrey Wang
Done: https://issues.apache.org/jira/browse/CASSANDRA-2490 -Jeffrey -Original Message- From: Jonathan Ellis [mailto:jbel...@gmail.com] Sent: Friday, April 15, 2011 7:39 PM To: user@cassandra.apache.org Cc: Jeffrey Wang Subject: Re: DatabaseDescriptor.defsVersion I think you found a

What will be the steps for adding new nodes

2011-04-15 Thread Roni
I have a 0.6.4 Cassandra cluster of two nodes in full replica (replication factor 2). I want to add two more nodes and balance the cluster (replication factor 2). I want all of them to be seeds. What should be the simple steps: 1. add the <AutoBootstrap>true</AutoBootstrap> to all the nodes or only
