Re: No Transactions: An Example

2011-07-28 Thread Jeremy Sevellec
Hi all, Making transactions work is my main concern at the moment. My need is: - update data in column family #1 - insert data in column family #2 I need these operations to happen in a single transaction because the data is tightly coupled. I use zookeeper/cage to take a distributed lock to

Re: Changing the CLI, not a great idea!

2011-07-28 Thread Hartog C. de Mik
On Wed, Jul 27, 2011 at 11:01:17PM -0500, Jonathan Ellis wrote: On Wed, Jul 27, 2011 at 10:53 PM, Edward Capriolo edlinuxg...@gmail.com wrote: You can not even put two statements on the same line. So the ';' is semi useless syntax. Nobody ever asked for that, but lots of people asked to

Re: Changing the CLI, not a great idea!

2011-07-28 Thread David Boxenhorn
This is part of a much bigger problem, one which has many parts, among them: 1. Cassandra is complex. Getting a gestalt understanding of it makes me think I understand how Alzheimer's patients must feel. 2. There is no official documentation. Perhaps everything is out there somewhere, who knows?

Re: results of index slice query

2011-07-28 Thread Roland Gude
Created https://issues.apache.org/jira/browse/CASSANDRA-2964 -Original Message- From: Jonathan Ellis [mailto:jbel...@gmail.com] Sent: Wednesday, 27 July 2011 17:35 To: user@cassandra.apache.org Subject: Re: results of index slice query Sounds like a Cassandra bug to me. On

Re: Changing the CLI, not a great idea!

2011-07-28 Thread Sasha Dolgy
Unfortunately, the perception that I have as a business consumer and night-time hack, is that more importance and effort is placed on ensuring information is up to date and correct on the http://www.datastax.com/docs/0.8/index website and less on keeping the wiki up to date or relevant... which

NotFoundException thrown for get(), but not get_slice() with a column_names predicate

2011-07-28 Thread David Allsopp
If I try to retrieve a column that is not present, using get(), then I'll get a NotFoundException. If (for efficiency's sake) I try to retrieve several named columns using get_slice, with a column_names predicate (i.e. a list of columns) then I won't get the exception if one of those columns is

Re: Changing the CLI, not a great idea!

2011-07-28 Thread Edward Capriolo
On Thursday, July 28, 2011, Sasha Dolgy sdo...@gmail.com wrote: Unfortunately, the perception that I have as a business consumer and night-time hack, is that more importance and effort is placed on ensuring information is up to date and correct on the http://www.datastax.com/docs/0.8/index

Re: Expanding 0.6.x cluster to multiple datacenters

2011-07-28 Thread Ashley Martens
Thank you. For 0.7 are the steps similar? On Jul 27, 2011, at 19:56, Jonathan Ellis jbel...@gmail.com wrote: As you know, with 0.6 adding a datacenter is not as easy as 0.7 with NetworkTopologyStrategy. With 0.6 there is a right way that will work with some manual effort, and a wrong way

Re: NotFoundException thrown for get(), but not get_slice() with a column_names predicate

2011-07-28 Thread Jonathan Ellis
No, the slice semantics are "give me whatever happens to exist between start and end." It's valid for the answer to be nothing. On Thu, Jul 28, 2011 at 6:55 AM, David Allsopp dnalls...@gmail.com wrote: If I try to retrieve a column that is not present, using get(), then I'll get a

Re: Changing the CLI, not a great idea!

2011-07-28 Thread Jonathan Ellis
It defaults to hex because that is how bytestype is represented. The default remains bytestype to provide the kind of backwards compatibility you are complaining about. :) On Thu, Jul 28, 2011 at 6:56 AM, Edward Capriolo edlinuxg...@gmail.com wrote: On Thursday, July 28, 2011, Sasha Dolgy

Re: memory_locking_policy parameter in cassandra.yaml for disabling swap - has this variable been renamed?

2011-07-28 Thread Jonathan Ellis
This is not advisable in general, since non-mmap'd I/O is substantially slower. The OP is correct that it is best to disable swap entirely, and second-best to enable JNA for mlockall. On Thu, Jul 28, 2011 at 7:05 AM, Adi adi.pan...@gmail.com wrote: Hi, We’ve started having problems with

Re: memory_locking_policy parameter in cassandra.yaml for disabling swap - has this variable been renamed?

2011-07-28 Thread Jonathan Ellis
I don't think there's ever been a memory_locking_policy variable. Cassandra will call mlockall if JNA is present, no further steps required. On Thu, Jul 28, 2011 at 5:17 AM, Stephen Henderson stephen.hender...@cognitivematch.com wrote: Hi, We’ve started having problems with cassandra and

Re: Questions over the use of CQL

2011-07-28 Thread Jonathan Ellis
You can quote CQL column names to allow any column name that Thrift would allow (suitably encoded for ascii). For instance, CQL knows that UUIDs are represented as strings like 12345678-1234-5678-1234-567812345678 and will parse them correctly. If you mean the official CompositeType, that should

Re: Problems using Thrift API in C

2011-07-28 Thread Eric Tamme
On 07/28/2011 05:29 AM, Aleksandrs Saveljevs wrote: essentially a rewrite of the first part of the C++ example given at http://wiki.apache.org/cassandra/ThriftExamples#C.2B-.2B- . If we run it under strace, we see that it hangs on the call to recv() when setting keyspace: $ strace -s 64

Re: Expanding 0.6.x cluster to multiple datacenters

2011-07-28 Thread Ashley Martens
Okay. So what happens when I try to add a third DC? Since we would be using RAS any new node would get the entire dataset. On Jul 28, 2011, at 5:49, Jonathan Ellis jbel...@gmail.com wrote: The steps are the same for RUS - RAS no matter what version of Cassandra you are on, but 0.7 introduced

Re: Changing the CLI, not a great idea!

2011-07-28 Thread Edward Capriolo
On Thu, Jul 28, 2011 at 8:46 AM, Jonathan Ellis jbel...@gmail.com wrote: It defaults to hex because that is how bytestype is represented. The default remains bytestype to provide the kind of backwards compatibility you are complaining about. :) On Thu, Jul 28, 2011 at 6:56 AM, Edward

Re: Changing the CLI, not a great idea!

2011-07-28 Thread Jonathan Ellis
I'm talking about data compatibility, which is more important than cli statement compatibility. Consider someone with a python program that creates a CF with the default settings and inserts some (say) uuid columns and long data. If we changed CF creation to default to ascii we would break this

Re: Changing the CLI, not a great idea!

2011-07-28 Thread Edward Capriolo
On Thu, Jul 28, 2011 at 9:35 AM, Jonathan Ellis jbel...@gmail.com wrote: I'm talking about data compatibility, which is more important than cli statement compatibility. Consider someone with a python program that creates a CF with the default settings and inserts some (say) uuid columns and

CQL Driver status

2011-07-28 Thread Nicholas Neuberger
I had a few questions that I couldn't easily answer by looking through many JIRA tickets and the wiki. We are currently developing an application based on Cassandra 0.8.x with the CQL driver. Does the CQL driver currently support cursor / resultsets? I'd like to implement a pagination feature.

Re: memory_locking_policy parameter in cassandra.yaml for disabling swap - has this variable been renamed?

2011-07-28 Thread Terje Marthinussen
On Jul 28, 2011, at 9:52 PM, Jonathan Ellis wrote: This is not advisable in general, since non-mmap'd I/O is substantially slower. I see this again and again as a claim here, but it is actually close to 10 years since I saw mmap'd I/O have any substantial performance benefits on any real

Re: Changing the CLI, not a great idea!

2011-07-28 Thread Nicholas Knight
On Jul 28, 2011, at 6:35, Jonathan Ellis jbel...@gmail.com wrote: I'm talking about data compatibility, which is more important than cli statement compatibility. Consider someone with a python program that creates a CF with the default settings and inserts some (say) uuid columns and long

Using Cassandra for transaction logging, good idea?

2011-07-28 Thread Kent Narling
Hi! I am considering using Cassandra for clustered transaction logging in a project. What I need are, in principle, 3 functions: 1 - Log transaction with a unique (but possibly non-sequential) id 2 - Fetch transaction with a specific id 3 - Fetch X new transactions after a specific

Cassandra 0.6.8 snapshot problem?

2011-07-28 Thread Jian Fang
Hi, We have an old production Cassandra 0.6.8 instance without replicas, i.e., the replication factor is 1. Recently, we noticed that the snapshot data we took from this instance are inconsistent with the running instance data. For example, we took a snapshot in early July 2011. From the running

Re: Questions over the use of CQL

2011-07-28 Thread Ikeda Anthony
Thanks Jonathan, I just had one of our devs playing around with it and he said he had problems with some of the column names, which we delimit with a dash (-), when using the JDBC drivers, e.g. SELECT m-UUID-hash(value) FROM column_family….. If this is not a problem then I have my questions

Re: NotFoundException thrown for get(), but not get_slice() with a column_names predicate

2011-07-28 Thread David Allsopp
I understand and agree for the case where the slice predicate is a range, but I'd expect the semantics to be different where the predicate is a list of column names (even if it's implemented using a range operation under the hood?) If I ask for columns foo and bar, then usually I'm not trying to

Re: Changing the CLI, not a great idea!

2011-07-28 Thread Sylvain Lebresne
On Thu, Jul 28, 2011 at 4:00 PM, Edward Capriolo edlinuxg...@gmail.com wrote: On Thu, Jul 28, 2011 at 9:35 AM, Jonathan Ellis jbel...@gmail.com wrote: I'm talking about data compatibility, which is more important than cli statement compatibility. Consider someone with a python program that

Re: NotFoundException thrown for get(), but not get_slice() with a column_names predicate

2011-07-28 Thread Sylvain Lebresne
To be honest, collecting the names that were missing in the first name query and doing a new name query for those (if there are any) is so simple that I think it is a bit dishonest to say that it pushes work to the clients. It seems simple enough at least that it does not sound like a good idea to
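
The client-side step described in this reply can be sketched roughly as follows. This is only an illustration, not code from the thread: `fetch_named` is a hypothetical stand-in for whatever client call performs a get_slice with a column_names predicate, and it is assumed to return a dict holding only the columns that exist.

```python
def missing_columns(requested, returned):
    """Return the requested column names absent from the result, in request order."""
    present = set(returned)
    return [name for name in requested if name not in present]


def fetch_named(row, names):
    # Hypothetical stand-in for a by-name slice query: like the slice
    # semantics discussed in the thread, it silently skips absent columns
    # instead of raising an exception.
    return {name: row[name] for name in names if name in row}


row = {"foo": b"1", "baz": b"3"}   # illustrative row contents
wanted = ["foo", "bar", "baz"]
result = fetch_named(row, wanted)
absent = missing_columns(wanted, result)
print(absent)
```

A caller that wants get()-style behaviour could raise its own error whenever `absent` is non-empty, or issue a second by-name query for just those columns.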

Re: Changing the CLI, not a great idea!

2011-07-28 Thread Edward Capriolo
On Thu, Jul 28, 2011 at 10:59 AM, Sylvain Lebresne sylv...@datastax.com wrote: On Thu, Jul 28, 2011 at 4:00 PM, Edward Capriolo edlinuxg...@gmail.com wrote: On Thu, Jul 28, 2011 at 9:35 AM, Jonathan Ellis jbel...@gmail.com wrote: I'm talking about data compatibility, which is more

Cassandra timeout exception when works with hadoop

2011-07-28 Thread Jian Fang
Hi, I run Cassandra 0.8.2 and hadoop 0.20.2 on three nodes, each node includes a Cassandra instance and a hadoop data node. I created a simple hadoop job to scan a Cassandra column value in a column family and write it to a file system if it meets some conditions. I keep getting the following

Re: Cassandra timeout exception when works with hadoop

2011-07-28 Thread Jeremy Hanna
See http://wiki.apache.org/cassandra/HadoopSupport#Troubleshooting - I would probably start with setting your rpc_timeout_in_ms to something like 3. On Jul 28, 2011, at 11:09 AM, Jian Fang wrote: Hi, I run Cassandra 0.8.2 and hadoop 0.20.2 on three nodes, each node includes a

Re: Cassandra timeout exception when works with hadoop

2011-07-28 Thread Jian Fang
My current setting is 1. I will try 3. Thanks, John On Thu, Jul 28, 2011 at 12:39 PM, Jeremy Hanna jeremy.hanna1...@gmail.com wrote: See http://wiki.apache.org/cassandra/HadoopSupport#Troubleshooting - I would probably start with setting your rpc_timeout_in_ms to something like 3.

how to solve one node is in heavy load in unbalanced cluster

2011-07-28 Thread Yan Chunlu
I have three nodes and RF=3. Here is the current ring:
Address  Status  State   Load      Owns    Token
                                           84944475733633104818662955375549269696
node1    Up      Normal  15.32 GB  81.09%  52773518586096316348543097376923124102
node2    Up      Normal  22.51 GB  10.48%  70597222385644499881390884416714081360
node3    Up      Normal  56.1 GB
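
As a rough illustration of why this ring is unbalanced, the sketch below recomputes each node's share of the token space from the tokens in the message and then lists evenly spaced tokens for the same node count. It assumes RandomPartitioner (token space 0 to 2**127) and assumes that the first token printed, on the wrap-around line, belongs to node3, since the snippet is cut off before node3's row ends. The function names are illustrative only.

```python
RING_SIZE = 2 ** 127  # RandomPartitioner token space (assumed partitioner)


def ownership(tokens):
    """Map each token to the fraction of the ring it owns.

    A node owns the range from the previous token (exclusive) up to its
    own token (inclusive), wrapping around the ring.
    """
    ordered = sorted(tokens)
    shares = {}
    for i, token in enumerate(ordered):
        previous = ordered[i - 1]  # index -1 wraps to the largest token
        # A single-node ring has a span of 0 here, which means "everything".
        span = (token - previous) % RING_SIZE or RING_SIZE
        shares[token] = span / RING_SIZE
    return shares


def balanced_tokens(node_count):
    """Evenly spaced initial tokens for node_count nodes."""
    return [i * RING_SIZE // node_count for i in range(node_count)]


# Tokens as printed in the message; node3's token is an assumption (see above).
node1 = 52773518586096316348543097376923124102
node2 = 70597222385644499881390884416714081360
node3 = 84944475733633104818662955375549269696

shares = ownership([node1, node2, node3])
for name, token in (("node1", node1), ("node2", node2), ("node3", node3)):
    print(name, f"{shares[token]:.2%}")

print(balanced_tokens(3))
```

Under these assumptions the computed shares for node1 and node2 agree with the Owns column quoted in the message (81.09% and 10.48%). With RF=3 on three nodes every node still holds a replica of every row, so the Owns column describes primary ranges rather than data volume.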

Aggregation and Co-Processors

2011-07-28 Thread Stephen Pope
I just finished watching the video by Eric Evans on "CQL - Not just NoSQL. It's MoSQL", and I heard mention of aggregation queries. He said there's been some talk about it, and that you guys were calling it co-processors. Can somebody give me the gist of what that's all about? I couldn't find any

Re: Cassandra timeout exception when works with hadoop

2011-07-28 Thread Jeremy Hanna
Just wondering - what consistency level are you using for hadoop reads? Also, do you have task trackers running on the cassandra nodes so that reads will be local? On Jul 28, 2011, at 2:46 PM, Jian Fang wrote: I changed the rpc_timeout_in_ms to 3 and 4, then changed the

Re: Aggregation and Co-Processors

2011-07-28 Thread Ryan King
On Thu, Jul 28, 2011 at 12:08 PM, Stephen Pope stephen.p...@quest.com wrote: I just finished watching the video by Eric Evans on “CQL – Not just NoSQL. It’s MoSQL”, and I heard mention of aggregation queries. He said there’s been some talk about it, and that you guys were calling it

Re: CQL Driver status

2011-07-28 Thread Jonathan Ellis
There is no cursor support. Moving away from thrift is still in the "hopefully someday" stage. On Jul 28, 2011 9:03 AM, Nicholas Neuberger nneuberg...@gmail.com wrote: I had a few questions that I couldn't easily answer by looking through many JIRA tickets and the wiki. We are currently

Re: how to solve one node is in heavy load in unbalanced cluster

2011-07-28 Thread Frank Duan
Dropped read messages might be an indicator of a capacity issue. We experienced a similar issue with 0.7.6. We ended up adding two extra nodes and physically rebooting the offending node(s). The entire cluster then calmed down. On Thu, Jul 28, 2011 at 2:24 PM, Yan Chunlu springri...@gmail.com

Re: memory_locking_policy parameter in cassandra.yaml for disabling swap - has this variable been renamed?

2011-07-28 Thread Jonathan Ellis
If you're actually hitting disk for most or even many of your reads then mmap doesn't matter since the extra copy to a Java buffer is negligible compared to the i/o itself (even on ssds). On Jul 28, 2011 9:04 AM, Terje Marthinussen tmarthinus...@gmail.com wrote: On Jul 28, 2011, at 9:52 PM,

Re: Cassandra timeout exception when works with hadoop

2011-07-28 Thread Jian Fang
I did not set the consistency level because I didn't find this option in the ConfigHelper class. I guess it should use level one by default. Actually, I only tweaked the word count example a bit. Here is the code snippet: getConf().set(CONF_COLUMN_NAME, columnName); Job job =

The video from the Cassandra SF 2011 Data Modeling workshop is now online

2011-07-28 Thread Lynn Bender
Video: Matt Dennis' Data Modeling Workshop: http://www.datastax.com/2011/07/video-data-modeling-workshop-from-cassandra-sf-2011 A list of videos is now available on the Cassandra SF 2011 presentation page: http://www.datastax.com/events/cassandrasf2011/presentations -- -Lynn Bender

Re: memory_locking_policy parameter in cassandra.yaml for disabling swap - has this variable been renamed?

2011-07-28 Thread Peter Schuller
I would love to understand how people got to this conclusion however and try to find out why we seem to see differences! I won't make any claims with Cassandra because I have never bothered benchmarking the difference in CPU usage since all my use-cases have been more focused on I/O efficiency,

Re: NotFoundException thrown for get(), but not get_slice() with a column_names predicate

2011-07-28 Thread David Allsopp
On 28 July 2011 16:23, Sylvain Lebresne sylv...@datastax.com wrote: To be honest, collecting the names that were missing in the first name query and doing a new name query for those (if there is any) is so simple that I think it is a bit dishonest to say that it pushes work to the clients.

Re: memory_locking_policy parameter in cassandra.yaml for disabling swap - has this variable been renamed?

2011-07-28 Thread Terje Marthinussen
Benchmarks were done with up to 96GB memory, much more caching than most people will ever have. The point anyway is that you are talking about I/O in the tens or, at best, a few hundred MB/sec before cassandra will eat all your CPU (with dual CPU 6 cores in our case). The memcopy involved here deep

Re: Questions over the use of CQL

2011-07-28 Thread Jonathan Ellis
So to summarize, yes, you can do that with CQL, but it's a little more of a pain if your comparator is BytesType since you'll need to convert to hex. On Thu, Jul 28, 2011 at 9:48 AM, Ikeda Anthony anthony.ikeda@gmail.com wrote: Thanks Jonathan, I just had one of our devs playing around
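
The hex conversion mentioned in this reply can be sketched in Python as below. It assumes the column name is ordinary text encoded as UTF-8, and `m-UUID` is just an illustrative name echoing the earlier message. How the resulting hex string is written inside a CQL statement depends on the CQL version, so that part is deliberately left out.

```python
import binascii


def column_name_to_hex(name):
    """Hex-encode a text column name (assumed UTF-8) for a BytesType comparator."""
    return binascii.hexlify(name.encode("utf-8")).decode("ascii")


print(column_name_to_hex("m-UUID"))
```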

Re: Cassandra 0.6.8 snapshot problem?

2011-07-28 Thread Jonathan Ellis
Doesn't ring a bell. But I'd say if you upgrade and it's still a problem, then (a) you're not _worse_ off than you are now, and (b) it's a lot more likely to get fixed in a modern version. On Thu, Jul 28, 2011 at 9:47 AM, Jian Fang jian.fang.subscr...@gmail.com wrote: Hi, We have an old

Re: memory_locking_policy parameter in cassandra.yaml for disabling swap - has this variable been renamed?

2011-07-28 Thread Teijo Holzer
Hi, yes I was looking for this config as well. This is really simple to achieve: Put the following line into /etc/security/limits.conf cassandra- memlock 32 Then, start Cassandra as the user cassandra, not as root (note there is never a need to run Cassandra as root,
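
One way to check whether a memlock setting like the one above has taken effect is to read the limit from inside a process started as that user. The sketch below uses Python's standard `resource` module on a Unix-like system. It is only a diagnostic aid and not part of Cassandra, and the helper names are illustrative. Note that `getrlimit` reports bytes, whereas limits.conf expresses memlock in KB.

```python
import resource


def memlock_limit():
    """Return the (soft, hard) RLIMIT_MEMLOCK values in bytes for this process."""
    return resource.getrlimit(resource.RLIMIT_MEMLOCK)


def describe(limit):
    # RLIM_INFINITY is the value reported when no limit is set.
    return "unlimited" if limit == resource.RLIM_INFINITY else f"{limit} bytes"


soft, hard = memlock_limit()
print("memlock soft limit:", describe(soft))
print("memlock hard limit:", describe(hard))
```

Running this as the same user that starts Cassandra shows the limit that an mlockall call made via JNA would be subject to.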

Ordering by timestamp on optional field

2011-07-28 Thread Mike Gaffney
I have a rating system where users rate items. The item is the row key, the user is the column key. The value is the rating on a 5 point scale. People can only have one rating per item. I'm looking to add the ability to optionally add a text comment to the rating record. If the text comment is

Re: Cassandra 0.6.8 snapshot problem?

2011-07-28 Thread Zhu Han
On Thu, Jul 28, 2011 at 10:47 PM, Jian Fang jian.fang.subscr...@gmail.com wrote: Hi, We have an old production Cassandra 0.6.8 instance without replica, i.e., the replication factor is 1. Recently, we noticed that the snapshot data we took from this instance are inconsistent with the running

Re: how to solve one node is in heavy load in unbalanced cluster

2011-07-28 Thread Yan Chunlu
Wouldn't adding new nodes put more pressure on the cluster? How large is your data? On Fri, Jul 29, 2011 at 4:16 AM, Frank Duan fr...@aimatch.com wrote: Dropped read message might be an indicator of capacity issue. We experienced the similar issue with 0.7.6. We ended up adding two extra

Re: how to solve one node is in heavy load in unbalanced cluster

2011-07-28 Thread Yan Chunlu
And by the way, my RF=3 and the other two nodes have much more capacity, so why are requests always routed to node3? Could I do a rebalance now, before node repair? On Fri, Jul 29, 2011 at 12:01 PM, Yan Chunlu springri...@gmail.com wrote: add new nodes seems added more pressure to the

Replace multiple dead nodes at the same time. Is it possible?

2011-07-28 Thread Haryadi Gunawi
Hello, I want to run the procedure of Handling Failure option #1 in http://wiki.apache.org/cassandra/Operations (i.e. the (Recommended approach) Bring up the replacement node with a new IP address ...) Say, I have 10 dead nodes, and I want to replace them with 10 new nodes. My questions: Is it