Re: problem with running simple example using cassandra-cli with 0.6.0-beta2
Thanks. With 0.6.0-beta2 using Standard2 does show a human-readable column. However, the behavior is definitely different between 0.5.1 and 0.6.0-beta2. I am using the binary distribution of 0.5.1: cassandra show version 0.5.1 cassandra set Keyspace1.Standard1['jsmith']['first'] = 'John' Value inserted. cassandra set Keyspace1.Standard1['jsmith']['last'] = 'Smith' Value inserted. cassandra set Keyspace1.Standard1['jsmith']['age'] = '42' Value inserted. cassandra get Keyspace1.Standard1['jsmith'] = (column=last, value=Smith, timestamp=1268408466548) = (column=first, value=John, timestamp=1268408464036) = (column=age, value=42, timestamp=1268408468895) Returned 3 results. With 0.5.1 using Standard1 does show a human-readable column as documented in the Wiki. Not sure which one is the correct behavior here. Bill On Thu, Mar 11, 2010 at 1:22 PM, Eric Evans eev...@rackspace.com wrote: On Wed, 2010-03-10 at 18:09 -0500, Bill Au wrote: I am checking out 0.6.0-beta2 since I need the batch-mutate function. I am just trying to run the example in the cassandra-cli Wiki: http://wiki.apache.org/cassandra/CassandraCli Here is what I am getting: cassandra set Keyspace1.Standard1['jsmith']['first'] = 'John' Value inserted. cassandra get Keyspace1.Standard1['jsmith'] = (column=6669727374, value=John, timestamp=1268261785077) Returned 1 results. The column name being returned by get (6669727374) does not match what is set (first). This is true for all column names. cassandra set Keyspace1.Standard1['jsmith']['last'] = 'Smith' Value inserted. cassandra set Keyspace1.Standard1['jsmith']['age'] = '42' Value inserted. cassandra get Keyspace1.Standard1['jsmith'] = (column=6c617374, value=Smith, timestamp=1268262480130) = (column=6669727374, value=John, timestamp=1268261785077) = (column=616765, value=42, timestamp=1268262484133) Returned 3 results. Is this a problem in 0.6.0-beta2 or am I doing anything wrong? No, you're not doing anything wrong. What you're seeing is the hex representation of a BytesType, which is the comparator that Standard1 in the example config uses. This is the same for 0.5.1 too. If you haven't made any changes to the default config, try using Standard2 as the column family and you'll see a human-readable column name as expected (Standard2 uses a UTF8Type comparator). The wiki page has sample output that is confusing (it's probably cut-and-paste from a time when Standard1 used an ASCII or UTF8 comparator); we should probably fix that. -- Eric Evans eev...@rackspace.com
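The hex column names in the 0.6.0-beta2 output above can be decoded by hand to confirm they are the names that were set. A minimal Python sketch, assuming the names were stored as UTF-8 text (the helper name decode_column_name is made up for illustration and is not part of cassandra-cli):

```python
def decode_column_name(hex_name: str) -> str:
    """Turn the CLI's hex rendering of a BytesType column name back into text.

    Assumes the original name was UTF-8 text; arbitrary binary names
    would not decode cleanly.
    """
    return bytes.fromhex(hex_name).decode("utf-8")


# The three column names shown in the 0.6.0-beta2 session.
for hex_name in ("6669727374", "6c617374", "616765"):
    print(hex_name, "->", decode_column_name(hex_name))
```

So the data is stored under the expected names; only the CLI's rendering of a BytesType column differs.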
get_range_slice(s) question
I've noticed that both 0.5.1 and 0.6b2 return (ReplicationFactor) identical copies of the data stored in my keyspace whenever I make a call to get_range_slice or get_range_slices using ConsistencyLevel.QUORUM. So with ReplicationFactor set to 2 for my application's KeySpace I get double the number of KeySlices that I expect to get. When using ConsistencyLevel.ONE I get only one KeySlice for each row. The same routine running against the Standard1 keyspace with a ReplicationFactor of 1 returns only a single KeySlice for each row. A ReplicationFactor of three gives me three identical KeySlices when using ConsistencyLevel.QUORUM. Is this the intended behavior of get_range_slices? I remember reading in one of the Dynamo papers that applications (and not Dynamo) are required to sort out any discrepancies in the data, but in this case there aren't any discrepancies. Omer
Re: problem with running simple example using cassandra-cli with 0.6.0-beta2
On Fri, 2010-03-12 at 11:21 -0500, Bill Au wrote: Thanks. With 0.6.0-beta2 using Standard2 does show a human-readable column. However, the behavior is definitely different between 0.5.1 and 0.6.0-beta2. I am using the binary distribution of 0.5.1: cassandra show version 0.5.1 cassandra set Keyspace1.Standard1['jsmith']['first'] = 'John' Value inserted. cassandra set Keyspace1.Standard1['jsmith']['last'] = 'Smith' Value inserted. cassandra set Keyspace1.Standard1['jsmith']['age'] = '42' Value inserted. cassandra get Keyspace1.Standard1['jsmith'] = (column=last, value=Smith, timestamp=1268408466548) = (column=first, value=John, timestamp=1268408464036) = (column=age, value=42, timestamp=1268408468895) Returned 3 results. With 0.5.1 using Standard1 does show a human-readable column as documented in the Wiki. Right you are, my mistake. This changed in https://issues.apache.org/jira/browse/CASSANDRA-661 (which occurred between 0.5 and 0.6). Not sure which one is the correct behavior here. The current behavior is correct. I'll update the examples to avoid future confusion. -- Eric Evans eev...@rackspace.com
Re: problem with running simple example using cassandra-cli with 0.6.0-beta2
Thanks for clearing this up for me. Bill On Fri, Mar 12, 2010 at 11:49 AM, Eric Evans eev...@rackspace.com wrote: On Fri, 2010-03-12 at 11:21 -0500, Bill Au wrote: Thanks. With 0.6.0-beta2 using Standard2 does show a human-readable column. However, the behavior is definitely different between 0.5.1 and 0.6.0-beta2. I am using the binary distribution of 0.5.1: cassandra show version 0.5.1 cassandra set Keyspace1.Standard1['jsmith']['first'] = 'John' Value inserted. cassandra set Keyspace1.Standard1['jsmith']['last'] = 'Smith' Value inserted. cassandra set Keyspace1.Standard1['jsmith']['age'] = '42' Value inserted. cassandra get Keyspace1.Standard1['jsmith'] = (column=last, value=Smith, timestamp=1268408466548) = (column=first, value=John, timestamp=1268408464036) = (column=age, value=42, timestamp=1268408468895) Returned 3 results. With 0.5.1 using Standard1 does show a human-readable column as documented in the Wiki. Right you are, my mistake. This changed in https://issues.apache.org/jira/browse/CASSANDRA-661 (which occurred between 0.5 and 0.6). Not sure which one is the correct behavior here. The current behavior is correct. I'll update the examples to avoid future confusion. -- Eric Evans eev...@rackspace.com
Re: Effective allocation of multiple disks
Ryan- Are you going to use software or hardware based RAID 0? Does anyone on the list have any data to compare the performance of hardware RAID 0 vs. software LVM RAID 0? I would think software RAID 0 would be fine since there is no actual computation being done... Thanks! -Eric On Thu, Mar 11, 2010 at 1:16 PM, Ryan King r...@twitter.com wrote: Even without major compaction, you can get significant imbalances in how much data is on each disk which will bottleneck your IO throughput. We're running JBOD right now, but going to switch to RAID 0 soon. -ryan
How to force GC in Cassandra?
Suppose I insert a lot of new items but also delete a lot of new items daily. It would be ideal if I could force GC to happen at midnight (when traffic is low). Is there any way to manually force GC to be executed? That way I could add a cron job to trigger GC at midnight. I tried nodetool and the JMX interface but they don't seem to have that. -Weijun
Re: Effective allocation of multiple disks
On Thu, 11 Mar 2010 12:01:27 -0600 Eric Evans eev...@rackspace.com wrote: On Wed, 2010-03-10 at 23:20 -0600, Jonathan Ellis wrote: On Wed, Mar 10, 2010 at 9:31 PM, Anthony Molinaro antho...@alumni.caltech.edu wrote: I would almost recommend just keeping things simple and removing multiple data directories from the config altogether and just documenting that you should plan on using OS level mechanisms for growing diskspace and io. I think that is a pretty sane suggestion actually. Or maybe leave the code as is and just document the situation more clearly? If you're adding more disks to increase storage capacity and you don't strictly need the extra IO, then multiple data directories might be preferable to other forms of aggregation (it's certainly simpler than say a volume manager). Could Cassandra use a block device as raw storage? You avoid the filesystem overhead and it lets the sysadmin determine the best kind of device (RAID or not underneath) to allocate. Ted
Cassandra Demo/Tutorial Applications
I was looking at this from CASSANDRA-873 as well as hands-on homework (!) for my OSCON tutorial. I have a couple of questions and would appreciate insights: A) Cassandra-873 suggests Lucandra as one demo application B) Are there other ideas that will bring out the various aspects of Cassandra? C) What would be the goal of demo apps? Tutorial to help folks learn the ins and outs of Cassandra? Showcase capabilities? I think Cassandra-873 belongs to the latter; Twissandra most probably belongs to the former. D) Hadoop on Cassandra might be a good demo/tutorial E) How would one structure the infrastructure for the demo/tutorials? What assumptions can we make in creating them? As AMIs to be run in EC2? Also to be run on 2-3 local machines for folks who can spare some? Or as multiple processes, all in one machine? What is an optimum configuration for learning and demo? We need to make it simple (to reflect the domain) but not simpler. F) I am looking for ideas from developers and users, hence the cross posting. I hope the apache mailer is smart enough to dedup; I will find out soon ... Cheers k/
Re: get_range_slice(s) question
That would be a bug, not intended behavior. Can you open a ticket? On Fri, Mar 12, 2010 at 11:48 AM, Omer van der Horst Jansen ome...@yahoo.com wrote: I've noticed that both 0.5.1 and 0.6b2 return (ReplicationFactor) identical copies of the data stored in my keyspace whenever I make a call to get_range_slice or get_range_slices using ConsistencyLevel.QUORUM. So with ReplicationFactor set to 2 for my application's KeySpace I get double the number of KeySlices that I expect to get. When using ConsistencyLevel.ONE I get only one KeySlice for each row. The same routine running against the Standard1 keyspace with a ReplicationFactor of 1 returns only a single KeySlice for each row. A ReplicationFactor of three gives me three identical KeySlices when using ConsistencyLevel.QUORUM. Is this the intended behavior of get_range_slices? I remember reading in one of the Dynamo papers that applications (and not Dynamo) are required to sort out any discrepancies in the data, but in this case there aren't any discrepancies. Omer
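Until the bug is fixed, a client could drop the repeated KeySlices itself. A minimal Python sketch of that workaround, assuming the repeated copies really are identical as described above; the KeySlice namedtuple is a stand-in for illustration, not the Thrift-generated class:

```python
from collections import namedtuple

# Stand-in for the Thrift KeySlice struct (hypothetical, for illustration only).
KeySlice = namedtuple("KeySlice", ["key", "columns"])


def dedupe_key_slices(slices):
    """Keep the first KeySlice seen for each row key, preserving order."""
    seen = set()
    unique = []
    for key_slice in slices:
        if key_slice.key not in seen:
            seen.add(key_slice.key)
            unique.append(key_slice)
    return unique


# With ReplicationFactor 2 and QUORUM, each row is reported twice.
raw = [
    KeySlice("row1", ("a",)),
    KeySlice("row1", ("a",)),
    KeySlice("row2", ("b",)),
    KeySlice("row2", ("b",)),
]
print([ks.key for ks in dedupe_key_slices(raw)])
```

This only hides the symptom on the client side; the ticket is still the right fix.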
Re: How to force GC in Cassandra?
I think you mean compaction? You can use nodeprobe / nodetool for that. http://wiki.apache.org/cassandra/NodeProbe On Fri, Mar 12, 2010 at 12:40 PM, Weijun Li weiju...@gmail.com wrote: Suppose I insert a lot of new items but also delete a lot of new items daily. It would be ideal if I could force GC to happen at midnight (when traffic is low). Is there any way to manually force GC to be executed? That way I could add a cron job to trigger GC at midnight. I tried nodetool and the JMX interface but they don't seem to have that. -Weijun
Re: Effective allocation of multiple disks
We're going to use software RAID. -ryan On Fri, Mar 12, 2010 at 9:24 AM, Eric Rosenberry epros...@gmail.com wrote: Ryan- Are you going to use software or hardware based RAID 0? Does anyone on the list have any data to compare the performance of hardware RAID 0 vs. software LVM RAID 0? I would think software RAID 0 would be fine since there is no actual computation being done... Thanks! -Eric On Thu, Mar 11, 2010 at 1:16 PM, Ryan King r...@twitter.com wrote: Even without major compaction, you can get significant imbalances in how much data is on each disk which will bottleneck your IO throughput. We're running JBOD right now, but going to switch to RAID 0 soon. -ryan
Grails Cassandra plugin
Folks- I put together a quick n' dirty grails plugin for Cassandra, wrapped with Hector. It's available at http://github.com/wolpert/grails-cassandra in its initial state. I wouldn't call it 'production-ready' yet. :-) We're using Cassandra at work and I wanted an easy way to access Cassandra from a grails application, but couldn't find anything. I have some plans for where I want it to go, but I'm open to suggestions. I'll submit the code to grails plugins once I get a bit further along with it. It's pretty basic at this point. -- Virtually, Ned Wolpert Settle thy studies, Faustus, and begin... --Marlowe
Cassandra 0.5.1 get_key_range problem
Hello, When using the get_key_range method with ConsistencyLevel.ONE an entire block of keys is not returned. I loop over the get_key_range method, advancing the start key after each call (requesting 8K keys per call). When running the program several times, I got the same results with large key blocks not returned. When I changed the program to use ConsistencyLevel.ALL, all the keys were returned as expected. After I changed the program back to ConsistencyLevel.ONE, all the keys were returned as well. Has anyone else seen this issue? I would have expected ConsistencyLevel.ONE to be able to return all the keys. My 6 node cluster uses a replication factor of 3. Thanks for your help, Jon
Re: Grails Cassandra plugin
Great! You should also link it from http://wiki.apache.org/cassandra/ClientExamples (click Login at the top to create an account.) On Fri, Mar 12, 2010 at 3:57 PM, Ned Wolpert ned.wolp...@imemories.com wrote: Folks- I put together a quick n' dirty grails plugin for Cassandra, wrapped with Hector. It's available at http://github.com/wolpert/grails-cassandra in its initial state. I wouldn't call it 'production-ready' yet. :-) We're using Cassandra at work and I wanted an easy way to access Cassandra from a grails application, but couldn't find anything. I have some plans for where I want it to go, but I'm open to suggestions. I'll submit the code to grails plugins once I get a bit further along with it. It's pretty basic at this point. -- Virtually, Ned Wolpert Settle thy studies, Faustus, and begin... --Marlowe
Re: Cassandra 0.5.1 get_key_range problem
get_key_range is deprecated. You should use get_range_slice. On Fri, Mar 12, 2010 at 3:59 PM, Jon Graham sjclou...@gmail.com wrote: Hello, When using the get_key_range method with ConsistencyLevel.ONE an entire block of keys is not returned. I loop over the get_key_range method, advancing the start key after each call (requesting 8K keys per call). When running the program several times, I got the same results with large key blocks not returned. When I changed the program to use ConsistencyLevel.ALL, all the keys were returned as expected. After I changed the program back to ConsistencyLevel.ONE, all the keys were returned as well. Has anyone else seen this issue? I would have expected ConsistencyLevel.ONE to be able to return all the keys. My 6 node cluster uses a replication factor of 3. Thanks for your help, Jon
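The paging loop described in the quoted message (advance the start key after each call) can be sketched independently of the client library. This Python sketch uses a fake in-memory fetch function instead of a real Thrift call, and it assumes the start key is inclusive, so every page after the first repeats the last key of the previous page. The names make_fake_fetch and iterate_all_keys are made up for illustration:

```python
import bisect


def make_fake_fetch(all_keys):
    """Return a fetch(start_key, count) function over an in-memory sorted key list.

    Stand-in for a range query; assumes keys come back in sorted order
    and that start_key is inclusive.
    """
    ordered = sorted(all_keys)

    def fetch(start_key, count):
        first = bisect.bisect_left(ordered, start_key)
        return ordered[first:first + count]

    return fetch


def iterate_all_keys(fetch, page_size):
    """Yield every key once by paging with an inclusive start key."""
    if page_size < 2:
        # With an inclusive start key, a page of 1 would never advance.
        raise ValueError("page_size must be at least 2")
    start_key = ""
    first_page = True
    while True:
        page = fetch(start_key, page_size)
        # After the first page, page[0] is the key we already yielded.
        yield from (page if first_page else page[1:])
        if len(page) < page_size:
            break  # a short page means the range is exhausted
        start_key = page[-1]
        first_page = False


fetch = make_fake_fetch(["e", "b", "a", "d", "c"])
print(list(iterate_all_keys(fetch, page_size=2)))
```

A client that compares the keys it collected this way at ConsistencyLevel.ONE and at ConsistencyLevel.ALL can at least pin down which key blocks go missing.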
Re: Grails Cassandra plugin
Document updated On Fri, Mar 12, 2010 at 2:50 PM, Jonathan Ellis jbel...@gmail.com wrote: Great! You should also link it from http://wiki.apache.org/cassandra/ClientExamples (click Login at the top to create an account.) On Fri, Mar 12, 2010 at 3:57 PM, Ned Wolpert ned.wolp...@imemories.com wrote: Folks- I put together a quick n' dirty grails plugin for Cassandra, wrapped with Hector. It's available at http://github.com/wolpert/grails-cassandra in its initial state. I wouldn't call it 'production-ready' yet. :-) We're using Cassandra at work and I wanted an easy way to access Cassandra from a grails application, but couldn't find anything. I have some plans for where I want it to go, but I'm open to suggestions. I'll submit the code to grails plugins once I get a bit further along with it. It's pretty basic at this point. -- Virtually, Ned Wolpert Settle thy studies, Faustus, and begin... --Marlowe -- Virtually, Ned Wolpert Settle thy studies, Faustus, and begin... --Marlowe
Re: Grails Cassandra plugin
great, I'm happy you found Hector useful :) btw, in hector 0.5.0-8 I added some interesting performance JMX counters, so it may be worth updating yours from 0.5.0-6 to -8 when you have time. On Fri, Mar 12, 2010 at 11:55 PM, Ned Wolpert ned.wolp...@imemories.com wrote: Document updated On Fri, Mar 12, 2010 at 2:50 PM, Jonathan Ellis jbel...@gmail.com wrote: Great! You should also link it from http://wiki.apache.org/cassandra/ClientExamples (click Login at the top to create an account.) On Fri, Mar 12, 2010 at 3:57 PM, Ned Wolpert ned.wolp...@imemories.com wrote: Folks- I put together a quick n' dirty grails plugin for Cassandra, wrapped with Hector. It's available at http://github.com/wolpert/grails-cassandra in its initial state. I wouldn't call it 'production-ready' yet. :-) We're using Cassandra at work and I wanted an easy way to access Cassandra from a grails application, but couldn't find anything. I have some plans for where I want it to go, but I'm open to suggestions. I'll submit the code to grails plugins once I get a bit further along with it. It's pretty basic at this point. -- Virtually, Ned Wolpert Settle thy studies, Faustus, and begin... --Marlowe -- Virtually, Ned Wolpert Settle thy studies, Faustus, and begin... --Marlowe
Re: SuperColumn.getSubColumns() ordering
Thanks. On Thu, Mar 11, 2010 at 6:46 PM, Jonathan Ellis jbel...@gmail.com wrote: it's ordered by the column name as determined by the subcolumn comparator you declared in the definition, yes On Thu, Mar 11, 2010 at 12:24 PM, Matteo Caprari matteo.capr...@gmail.com wrote: Hi. If I iterate over SuperColumn.getSubColumn(), do I get columns sorted by the column name? Thanks. -- :Matteo Caprari matteo.capr...@gmail.com -- :Matteo Caprari matteo.capr...@gmail.com
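What "ordered by the column name as determined by the comparator" means in practice can be surprising with a byte-wise comparator. A small Python illustration, not Cassandra code: Python's sorting of bytes objects stands in for byte-wise comparison, and the numeric key function is only a stand-in for whatever numeric comparator a column family might declare:

```python
# Subcolumn names as raw bytes, the way a byte-wise comparator sees them.
names = [b"9", b"10", b"100", b"Zed", b"adam"]

# Byte-wise ordering: compares byte by byte, so "10" < "100" < "9",
# and uppercase ASCII sorts before lowercase.
print(sorted(names))

# Numeric ordering of the digit-only names, for contrast.
digits = [n for n in names if n.isdigit()]
print(sorted(digits, key=lambda n: int(n.decode("ascii"))))
```

If subcolumn names are meant to sort numerically or by time, the declared comparator (or the way the names are encoded) has to reflect that.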
Re: Grails Cassandra plugin
I added an issue in my github project for the update. Since I have your ear: in hector, if the cassandra server restarts (one server in the pool), hector will not try to reconnect to the cassandra server even if it's listening. Is that a known issue? On Fri, Mar 12, 2010 at 3:35 PM, Ran Tavory ran...@gmail.com wrote: great, I'm happy you found Hector useful :) btw, in hector 0.5.0-8 I added some interesting performance JMX counters, so it may be worth updating yours from 0.5.0-6 to -8 when you have time. On Fri, Mar 12, 2010 at 11:55 PM, Ned Wolpert ned.wolp...@imemories.com wrote: Document updated On Fri, Mar 12, 2010 at 2:50 PM, Jonathan Ellis jbel...@gmail.com wrote: Great! You should also link it from http://wiki.apache.org/cassandra/ClientExamples (click Login at the top to create an account.) On Fri, Mar 12, 2010 at 3:57 PM, Ned Wolpert ned.wolp...@imemories.com wrote: Folks- I put together a quick n' dirty grails plugin for Cassandra, wrapped with Hector. It's available at http://github.com/wolpert/grails-cassandra in its initial state. I wouldn't call it 'production-ready' yet. :-) We're using Cassandra at work and I wanted an easy way to access Cassandra from a grails application, but couldn't find anything. I have some plans for where I want it to go, but I'm open to suggestions. I'll submit the code to grails plugins once I get a bit further along with it. It's pretty basic at this point. -- Virtually, Ned Wolpert Settle thy studies, Faustus, and begin... --Marlowe -- Virtually, Ned Wolpert Settle thy studies, Faustus, and begin... --Marlowe -- Virtually, Ned Wolpert Settle thy studies, Faustus, and begin... --Marlowe
Re: Strategies for storing lexically ordered data in supercolumns
My original post is probably confusing. I was originally talking about columns and I don't see what the solution is. So I was thinking I set the subcolumn compareWith to UTF8Type or BytesType and construct a key [for the subcolumn, not a row key] [user's lastname + user's firstname + user's uuid] This would result in sorted subcolumn and user list. Nevertheless, I still don't see/understand the solution. Let's say the person's name changes. The sort is no longer valid. That column value would need to be changed in order for the sort to be correct. On Fri, Mar 12, 2010 at 5:10 PM, Brandon Williams dri...@gmail.com wrote: On Fri, Mar 12, 2010 at 7:07 PM, Peter Chang pete...@gmail.com wrote: But wouldn't name + UUID be considered volatile? That was the crux of my questions. It would, but the distinction here is that it is now a column, not a row key. -Brandon
Re: Strategies for storing lexically ordered data in supercolumns
On Fri, Mar 12, 2010 at 7:21 PM, Peter Chang pete...@gmail.com wrote: My original post is probably confusing. I was originally talking about columns and I don't see what the solution is. Sorry, I misunderstood. So I was thinking I set the subcolumn compareWith to UTF8Type or BytesType and construct a key [for the subcolumn, not a row key] [user's lastname + user's firstname + user's uuid] This would result in sorted subcolumn and user list. Nevertheless, I still don't see/understand the solution. Let's say the person's name changes. The sort is no longer valid. That column value would need to be changed in order for the sort to be correct. When their name changes, you delete the existing column and insert a new one with the correct name, which will then sort correctly. -Brandon
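The delete-and-reinsert approach can be sketched with an in-memory stand-in for the row. This is a Python sketch under stated assumptions, not client code: the UserIndex class, the column_name helper and the NUL separator between name parts are all made up for illustration, and Python's string sort stands in for a UTF8Type comparator.

```python
import uuid

SEPARATOR = "\x00"  # hypothetical separator so "ab"+"c" and "a"+"bc" stay distinct


def column_name(last, first, user_id):
    """Build the composite name: lastname + firstname + uuid."""
    return SEPARATOR.join((last, first, str(user_id)))


class UserIndex:
    """In-memory stand-in for one row whose column names sort as text."""

    def __init__(self):
        self._columns = {}  # composite column name -> user id

    def add(self, last, first, user_id):
        self._columns[column_name(last, first, user_id)] = user_id

    def rename(self, user_id, old_name, new_name):
        """Delete the column under the old name, insert one under the new name."""
        del self._columns[column_name(old_name[0], old_name[1], user_id)]
        self.add(new_name[0], new_name[1], user_id)

    def sorted_user_ids(self):
        return [self._columns[name] for name in sorted(self._columns)]


john, zoe = uuid.UUID(int=1), uuid.UUID(int=2)
index = UserIndex()
index.add("Smith", "John", john)
index.add("Adams", "Zoe", zoe)
print(index.sorted_user_ids() == [zoe, john])   # Adams sorts before Smith

index.rename(john, ("Smith", "John"), ("Aaron", "John"))
print(index.sorted_user_ids() == [john, zoe])   # Aaron now sorts before Adams
```

The rename needs the old name to find the column to delete, so the application has to keep the current name somewhere it can look up by uuid.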
Re: Cassandra Demo/Tutorial Applications
On Fri, Mar 12, 2010 at 1:55 PM, Krishna Sankar ksanka...@gmail.com wrote: I was looking at this from CASSANDRA-873 as well as hands-on homework (!) for my OSCON tutorial. I have a couple of questions and would appreciate insights: A) Cassandra-873 suggests Lucandra as one demo application B) Are there other ideas that will bring out the various aspects of Cassandra? multi-user blog (single-user is too easy :) - extra credit: with full-text search using lucandra discussion forum - also w/ FTS C) What would be the goal of demo apps? Tutorial to help folks learn the ins and outs of Cassandra? Showcase capabilities? I think Cassandra-873 belongs to the latter; Twissandra most probably belongs to the former. I think you nailed it. D) Hadoop on Cassandra might be a good demo/tutorial Sure, I'll buy that. I can't think of any standalone projects for that, but computing a twissandra tag cloud would be pretty cool. (Might need to write a twissandra bot to load stuff in to make an interesting cloud. :) E) How would one structure the infrastructure for the demo/tutorials? What assumptions can we make in creating them? As AMIs to be run in EC2? I'd probably go with virtualbox images as being simpler for people who don't have an AWS key already. (VB can read vmware player images, I think. But there is no free vmware for OS X, so you'd want to check that before going w/ vmware format.) Or just have people download cassandra and a configuration xml. Probably easier than teaching people to use virtualbox who haven't before. Also to be run on 2-3 local machines for folks who can spare some? Or as multiple processes, all in one machine? You're not going to have time to teach cluster management. Keep it to 1.
About the replication strategy of Cassandra
Hi all. I am interested in the architecture of Cassandra. Cassandra offers replication policies such as Rack Unaware, Rack Aware (within a datacenter), and Datacenter Aware. The application has to select one of these replication policies. The algorithm used when the Rack Aware (within a datacenter) or Datacenter Aware strategy is selected might be a little difficult. In Cassandra, Zookeeper was selected for the election algorithm of the node that the system was using. 1. Please give notes on how the replication strategy of Cassandra is selected. 2. About the Zab protocol adopted by Zookeeper: the weak point of the Paxos protocol in Chubby is delay. Is the Zab protocol better than this Paxos protocol? --- Kazuki Aranami Twitter: http://twitter.com/kimtea http://d.hatena.ne.jp/kazuki-aranami/ ---
Re: Incr/Decr Counters in Cassandra
I badly need this for my work. Let me know if I can do something to speed it up :) Regards, /VJ On Wed, Nov 4, 2009 at 1:32 PM, Chris Goffinet goffi...@digg.com wrote: Hey, At Digg we've been thinking about counters in Cassandra. In a lot of our use cases we need this type of support from a distributed storage system. Anyone else out there who has such needs as well? Zookeeper actually has such support and we might use that if we can't get the support in Cassandra. --- Chris Goffinet goffi...@digg.com