Effective allocation of multiple disks

2010-03-10 Thread Eric Rosenberry
Based on the documentation, it is clear that with Cassandra you want to have one disk for commitlog, and one disk for data. My question is: If you think your workload is going to require more io performance to the data disks than a single disk can handle, how would you recommend effectively

RE: Effective allocation of multiple disks

2010-03-10 Thread Stu Hood
You can list multiple DataFileDirectories, and Cassandra will scatter files across all of them. Use 1 disk for the commitlog, and 3 disks for data directories. See http://wiki.apache.org/cassandra/CassandraHardware#Disk Thanks, Stu -----Original Message----- From: Eric Rosenberry

Re: Effective allocation of multiple disks

2010-03-10 Thread Eric Rosenberry
Ahh, thanks! I had read that, but I had assumed the reference to use one or more devices for DataFileDirectories was referring to somehow making multiple physical devices into one logical device via some underlying RAID system. So then as far as free space on the disks goes, I have seen references

CassandraHardware link on the wiki FrontPage

2010-03-10 Thread Eric Rosenberry
Would it be possible to add a link to the CassandraHardware page from the FrontPage of the wiki? I think other new folks to Cassandra may find it useful. ;-) (I would do it myself, though that page is Immutable) http://wiki.apache.org/cassandra/FrontPage

RE: CassandraHardware link on the wiki FrontPage

2010-03-10 Thread Stu Hood
Anyone can edit any page once they have an account: click the Login link at the top right next to the search box to create an account. Thanks, Stu -----Original Message----- From: Eric Rosenberry e...@rosenberry.org Sent: Wednesday, March 10, 2010 2:52am To: cassandra-user@incubator.apache.org

Re: schema design question

2010-03-10 Thread Matteo Caprari
Well, I don't like clunky and I'm Java-friendly. I'll go for the abstract class. Thanks for the help. On Tue, Mar 9, 2010 at 7:33 PM, Jonathan Ellis jbel...@gmail.com wrote: On Tue, Mar 9, 2010 at 7:30 AM, Matteo Caprari matteo.capr...@gmail.com wrote: On Tue, Mar 9, 2010 at 1:23 PM,

Re: Bad read performances: 'few rows of many columns' vs 'many rows of few columns'

2010-03-10 Thread Sylvain Lebresne
Well, I've found the reason. The default Cassandra configuration uses a 10% row cache, and the row cache reads the whole row each time. So it was indeed reading the full row each time even though the request was asking for only one column. My bad (at least I learned something). -- Sylvain On Tue,
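The effect described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not Cassandra's cache code: `ToyStore`, `get_via_row_cache`, and `demo` are hypothetical names, the cache evicts in FIFO order, and the access pattern (each row requested once) is chosen so that every request misses a cache sized at 10% of the rows.

```python
# Toy model only: every name here is hypothetical, not a Cassandra API.
class ToyStore:
    """Pretend on-disk store that counts how many columns it reads."""

    def __init__(self, rows):
        self.rows = rows
        self.columns_read = 0

    def read_column(self, key, name):
        # Direct read: touches exactly one column.
        self.columns_read += 1
        return self.rows[key][name]

    def read_row(self, key):
        # Whole-row read: touches every column of the row.
        row = self.rows[key]
        self.columns_read += len(row)
        return dict(row)


def get_via_row_cache(store, cache, capacity, key, name):
    """Serve one column through a cache that always loads the full row."""
    if key not in cache:
        if len(cache) >= capacity:
            del cache[next(iter(cache))]  # evict the oldest entry (FIFO)
        cache[key] = store.read_row(key)
    return cache[key][name]


def demo(n_rows=100, n_cols=1000):
    """Request one column from each row, with and without the row cache."""
    rows = {k: {c: k * n_cols + c for c in range(n_cols)} for k in range(n_rows)}
    direct, cached, cache = ToyStore(rows), ToyStore(rows), {}
    for key in range(n_rows):
        direct.read_column(key, 0)
        get_via_row_cache(cached, cache, n_rows // 10, key, 0)
    return direct.columns_read, cached.columns_read
```

With the defaults, `demo()` returns `(100, 100000)`: the direct path reads one column per request, while the whole-row cache reads all 1000 columns of a row on every miss. Wide rows with a low hit rate are the worst case for this kind of cache.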

Login Failure Error

2010-03-10 Thread shirish
Hello, I have just downloaded the source code from the trunk using svn. I have set up the following configuration: created a different user and group named cassandra. When I run *cassandra -f* the following is the output I get: INFO 18:02:16,697 Auto DiskAccessMode determined to be standard INFO

Re: Login Failure Error Attached to storage-conf.xml file

2010-03-10 Thread shirish
Shirish Reddy P (Student) Indian Institute Of Information Technology, Allahabad Mob No. +919651418099 On Wed, Mar 10, 2010 at 6:16 PM, shirish shirishredd...@gmail.com wrote: hello, I have just downloaded the source code from the trunk using svn, I have set up the following configuration

Re: Login Failure Error

2010-03-10 Thread Jonathan Ellis
Please don't use trunk unless you're actively fixing bugs. If you want the latest and greatest, get the 0.6 branch from svn. On Wed, Mar 10, 2010 at 6:46 AM, shirish shirishredd...@gmail.com wrote: hello, I have just downloaded the source code from the trunk using svn, I have set up the following

Re: Login Failure Error

2010-03-10 Thread shirish
Everything ran fine using the stable release. I wanted to start contributing and hence downloaded the source code. What could possibly be giving this error? On Wed, Mar 10, 2010 at 6:49 PM, Jonathan Ellis jbel...@gmail.com wrote: Please don't use trunk unless you're actively fixing bugs. If

exception with python client

2010-03-10 Thread Matteo Caprari
Hi. On Cassandra 0.6 beta-2 I have this schema: <Keyspace Name="KS"> <ColumnFamily Name="Users" CompareWith="BytesType"/> <ColumnFamily Name="Items" CompareWith="BytesType" ColumnType="Super" CompareSubcolumnsWith="BytesType"/> I'm trying the batch_mutate API using Python: socket = TSocket.TSocket("localhost",

Re: exception with python client

2010-03-10 Thread Gary Dusbabek
On Wed, Mar 10, 2010 at 08:33, Matteo Caprari matteo.capr...@gmail.com wrote:
> protocol = TBinaryProtocol.TBinaryProtocolAccelerated(transport)
> client = Cassandra.Client(protocol)
> transport.open()
Before attempting the mutation, try adding:
client.transport = transport
Gary.

RE: Bad read performances: 'few rows of many columns' vs 'many rows of few columns'

2010-03-10 Thread David Dabbs
So did you disable the row cache entirely? From: Sylvain Lebresne Well, I've found the reason. The default Cassandra configuration uses a 10% row cache, and the row cache reads the whole row each time. So it was indeed reading the full row each time even though the request was asking for

Re: Bad read performances: 'few rows of many columns' vs 'many rows of few columns'

2010-03-10 Thread Sylvain Lebresne
"So did you disable the row cache entirely?" Yes (getting back reasonable performance). From: Sylvain Lebresne Well, I've found the reason. The default Cassandra configuration uses a 10% row cache, and the row cache reads the whole row each time. So it was indeed reading the full row each

Re: Bad read performances: 'few rows of many columns' vs 'many rows of few columns'

2010-03-10 Thread Jonathan Ellis
For the record, I note that no row cache is the default on user-defined CFs; we include it in the sample configuration file as an example only. On Wed, Mar 10, 2010 at 9:58 AM, Sylvain Lebresne sylv...@yakaz.com wrote: So did you disable the row cache entirely? Yes (getting back reasonable

Re: cassandra 0.6.0 beta 2 download contains beta 1?

2010-03-10 Thread Eric Evans
On Tue, 2010-03-09 at 12:38 -0800, Omer van der Horst Jansen wrote: The apache-cassandra-0.6.0-beta2-bin.tar.gz download contains both these files in the apache-cassandra-0.6.0-beta2/lib directory: apache-cassandra-0.6.0-beta1.jar apache-cassandra-0.6.0-beta2.jar Ugh, my bad. I must have

Re: Effective allocation of multiple disks

2010-03-10 Thread Jonathan Ellis
Thanks for testing that, added a note to http://wiki.apache.org/cassandra/CassandraHardware on stripe size. On Wed, Mar 10, 2010 at 11:03 AM, B. Todd Burruss bburr...@real.com wrote: with the file sizes we're talking about with cassandra and other database products, the stripe size doesn't seem

Re: schema design question

2010-03-10 Thread Jonathan Ellis
If you want to select stuff out with one query, then a single CF is the only sane choice; if not, then 2 CFs may be more performant. On Wed, Mar 10, 2010 at 4:42 AM, Matteo Caprari matteo.capr...@gmail.com wrote: I can't quite decide if to go with a flat schema, with keys repeated in different CFs

Re: exception with python client

2010-03-10 Thread Matteo Caprari
There was indeed a very clear message in the logs. I was missing the timestamp in the Column declaration. Thanks On Wed, Mar 10, 2010 at 3:42 PM, Eric Evans eev...@rackspace.com wrote: On Wed, 2010-03-10 at 14:33 +, Matteo Caprari wrote: I get an exception, but it's a shy one and can't
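The failure mode mentioned above (a column built without a timestamp) can be sketched without a running cluster. The class below is a hypothetical stand-in, not the Thrift-generated `Column`; it only assumes that name, value, and timestamp are all required, which is what the message implies. The microsecond timestamp helper reflects a common client convention and is likewise an assumption.

```python
import time
from dataclasses import dataclass
from typing import Optional


@dataclass
class ColumnSketch:
    """Hypothetical stand-in for a client-side column struct."""

    name: bytes
    value: bytes
    timestamp: Optional[int] = None  # easy to forget, but assumed required

    def validate(self):
        # Fail early on the client instead of hunting through server logs.
        if self.timestamp is None:
            raise ValueError("required field 'timestamp' is unset")


def now_micros():
    """Current time in microseconds since the epoch (assumed convention)."""
    return int(time.time() * 1_000_000)
```

Calling `ColumnSketch(b"name", b"value").validate()` raises `ValueError`, while `ColumnSketch(b"name", b"value", now_micros()).validate()` passes. Checking required fields before sending a mutation makes this kind of mistake visible on the client side.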

Re: Hackathon?!?

2010-03-10 Thread Peter Chang
Sweet I'm in! Is there going to be a more formal invite? If not, can we get the details on where Digg is and where at Digg? Peter On Tue, Mar 9, 2010 at 9:28 PM, Dan Di Spaltro dan.dispal...@gmail.com wrote: Great, that would probably get us a lot more room. Sweet, so it's settled, we'll do

Re: Hackathon?!?

2010-03-10 Thread Jonathan Ellis
I'm in either way, but if we push it a week later then the twitter guys could (a) make it and (b) pimp it at their own conference. On Wed, Mar 10, 2010 at 12:26 AM, Jeff Hodges jhod...@twitter.com wrote: Ah, hell. Thought this was the first day. Can't make it. -- Jeff On Mar 9, 2010 9:32 PM,

Re: Hackathon?!?

2010-03-10 Thread Chris Goffinet
I'll work on putting together the formal invite. Stay tuned. -Chris On Mar 10, 2010, at 9:54 AM, Peter Chang wrote: Sweet I'm in! Is there going to be a more formal invite? If not, can we get the details on where Digg is and where at Digg? Peter On Tue, Mar 9, 2010 at 9:28 PM, Dan

Re: cassandra 0.6.0 beta 2 download contains beta 1?

2010-03-10 Thread Vick Khera
On Wed, Mar 10, 2010 at 11:30 AM, Eric Evans eev...@rackspace.com wrote: apache-cassandra-0.6.0-beta1.jar apache-cassandra-0.6.0-beta2.jar Ugh, my bad. I must have failed to `clean' in between the aborted beta1 and beta2. The beta2 also does not include the other support jar files like

Re: Hackathon?!?

2010-03-10 Thread Dan Di Spaltro
I would be good with the week after; the date is arbitrary and it would be great to get a critical mass of folks. On Wed, Mar 10, 2010 at 10:09 AM, Chris Goffinet goffi...@digg.com wrote: I'll work on putting together the formal invite. Stay tuned. -Chris On Mar 10, 2010, at 9:54 AM, Peter

NoSQL live tomorrow

2010-03-10 Thread Jonathan Ellis
Ryan King and I will have 20 minutes to talk about Cassandra in the Lab part of the program. 20 minutes isn't enough to present a whole lot in a structured manner so we are planning to just do Q&A the whole time. So if you are going to be there, come with your questions. I will also bring a few

Re: NoSQL live tomorrow

2010-03-10 Thread Tim Haines
Hey Jonathan, What event is this and will it be livecasted/recorded? Cheers, Tim. On Thu, Mar 11, 2010 at 10:21 AM, Jonathan Ellis jbel...@gmail.com wrote: Ryan King and I will have 20 minutes to talk about Cassandra in the Lab part of the program. 20 minutes isn't enough to present a

Re: Effective allocation of multiple disks

2010-03-10 Thread Anthony Molinaro
This is incorrect, as discussed a few weeks ago. I have a setup with multiple disks, and as soon as compaction occurs all the data ends up on one disk. If you need the additional I/O, you will want RAID 0. But simply listing multiple DataFileDirectories will not work. -Anthony On Wed, Mar 10,
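The behaviour reported above can be pictured with a small toy model. It is a sketch under stated assumptions, not Cassandra's placement logic: the round-robin assignment of new SSTables, the function names, and the directory paths are all hypothetical. The only behaviour taken from the thread is that a major compaction leaves a single merged file, which can live in only one directory.

```python
# Toy model only: placement policy and names are assumptions for illustration.
def place_round_robin(sstables, directories):
    """Assign each newly flushed SSTable to a data directory in turn."""
    layout = {d: [] for d in directories}
    for i, name in enumerate(sstables):
        layout[directories[i % len(directories)]].append(name)
    return layout


def major_compact(layout, target):
    """Merge every SSTable into one file written to a single directory."""
    merged = "+".join(name for d in layout for name in layout[d])
    return {d: ([merged] if d == target else []) for d in layout}


def busy_directories(layout):
    """Directories that still hold at least one SSTable."""
    return [d for d, files in layout.items() if files]
```

For example, placing four SSTables `a`, `b`, `c`, `d` across `/data1`, `/data2`, `/data3` uses all three directories, but after `major_compact(layout, "/data1")` only `/data1` holds data. In this model, striping at the OS or RAID level is what keeps all spindles busy after a major compaction.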

Re: Strategy to delete/expire keys in cassandra

2010-03-10 Thread Weijun Li
Hi Sylvain, I applied your patch to 0.5 but it seems that it's not compilable: 1) column.getTtl() is not defined in RowMutation.java: public static RowMutation getRowMutation(String table, String key, Map<String, List<ColumnOrSuperColumn>> cfmap) { RowMutation rm = new RowMutation(table,

Re: NoSQL live tomorrow

2010-03-10 Thread Jonathan Ellis
http://nosqlboston.eventbrite.com/ don't know about recording / casting plans. On Wed, Mar 10, 2010 at 3:25 PM, Tim Haines tmhai...@gmail.com wrote: Hey Jonathan, What event is this and will it be livecasted/recorded? Cheers, Tim. On Thu, Mar 11, 2010 at 10:21 AM, Jonathan Ellis

Re: Effective allocation of multiple disks

2010-03-10 Thread Stu Hood
Yea, I suppose major compactions are the wildcard here. Nonetheless, the situation where you only have 1 SSTable should be very rare. I'll open a ticket though, because we really ought to be able to utilize those disks more thoroughly, and I have some ideas there. -----Original Message-----

Re: Testing row cache feature in trunk: write should put record in cache

2010-03-10 Thread Jonathan Ellis
Thanks for that, Daniel. I'm pretty heads down finishing off the last 0.6 issues right now, but this is on my list to get to. On Mon, Mar 8, 2010 at 1:25 PM, Daniel Kluesing d...@bluekai.com wrote: This is interesting for the use cases I'm looking at Cassandra for, so if that offer still

problem with running simple example using cassandra-cli with 0.6.0-beta2

2010-03-10 Thread Bill Au
I am checking out 0.6.0-beta2 since I need the batch-mutate function. I am just trying to run the example in the cassandra-cli wiki: http://wiki.apache.org/cassandra/CassandraCli Here is what I am getting: cassandra> set Keyspace1.Standard1['jsmith']['first'] = 'John' Value inserted. cassandra>

Re: cassandra 0.6.0 beta 2 download contains beta 1?

2010-03-10 Thread Bill Au
I am building from source and found the same problem. I manually copied all the jar files from build/lib/jars to lib and that seems to do the trick. Bill On Wed, Mar 10, 2010 at 1:39 PM, Vick Khera vi...@khera.org wrote: On Wed, Mar 10, 2010 at 11:30 AM, Eric Evans eev...@rackspace.com wrote:

Re: problem with running simple example using cassandra-cli with 0.6.0-beta2

2010-03-10 Thread Brandon Williams
On Wed, Mar 10, 2010 at 5:09 PM, Bill Au bill.w...@gmail.com wrote: I am checking out 0.6.0-beta2 since I need the batch-mutate function. I am just trying to run the example in the cassandra-cli wiki: http://wiki.apache.org/cassandra/CassandraCli Here is what I am getting: cassandra> set

Re: Strategy to delete/expire keys in cassandra

2010-03-10 Thread Weijun Li
Never mind. Figured out I forgot to compile thrift :) Thanks, -Weijun On Wed, Mar 10, 2010 at 1:43 PM, Weijun Li weiju...@gmail.com wrote: Hi Sylvain, I applied your patch to 0.5 but it seems that it's not compilable: 1) column.getTtl() is not defined in RowMutation.java: public static

Re: NoSQL live tomorrow

2010-03-10 Thread B. Todd Burruss
does anyone know if there is a plan for nosql seattle anytime soon? Jonathan Ellis wrote: http://nosqlboston.eventbrite.com/ don't know about recording / casting plans. On Wed, Mar 10, 2010 at 3:25 PM, Tim Haines tmhai...@gmail.com wrote: Hey Jonathan, What event is this and will it be

Re: Effective allocation of multiple disks

2010-03-10 Thread Anthony Molinaro
Except major compactions are not that rare if you have a cluster which you need to add capacity to. Anytime you add nodes with bootstrap, it is recommended you run cleanup on the nodes which you removed data from (and this is useful to see how much space you are now using). Cleanup does a major

Re: NoSQL live tomorrow

2010-03-10 Thread David Timothy Strauss
I will be at NoSQL Live, but I have a client call for most of the lab part. -----Original Message----- From: Jonathan Ellis To: cassandra-user@incubator.apache.org ReplyTo: cassandra-user@incubator.apache.org Subject: NoSQL live tomorrow Sent: Mar 10, 2010 21:21 Ryan King and I will have 20

Re: Hackathon?!?

2010-03-10 Thread Chris Goffinet
We could do it on April 22 (1 week later), that's my birthday :-) What better way to celebrate haha. -Chris On Mar 10, 2010, at 9:58 AM, Jonathan Ellis wrote: I'm in either way, but if we push it a week later then the twitter guys could (a) make it and (b) pimp it at their own conference.

Re: problem with running simple example using cassandra-cli with 0.6.0-beta2

2010-03-10 Thread Jonathan Ellis
I think he means how the column names are rendered as bytes but the values are strings. On Wed, Mar 10, 2010 at 5:22 PM, Brandon Williams dri...@gmail.com wrote: On Wed, Mar 10, 2010 at 5:09 PM, Bill Au bill.w...@gmail.com wrote: I am checking out 0.6.0-beta2 since I need the batch-mutate

Re: Effective allocation of multiple disks

2010-03-10 Thread Jonathan Ellis
On Wed, Mar 10, 2010 at 9:31 PM, Anthony Molinaro antho...@alumni.caltech.edu wrote: I would almost recommend just keeping things simple and removing multiple data directories from the config altogether and just documenting that you should plan on using OS level mechanisms for growing

Strategies for storing lexically ordered data in supercolumns

2010-03-10 Thread Peter Chang
I'm wondering about good strategies for picking keys that I want to be lexically sorted in a super column family. For example, my data looks like this: [user1_uuid][connections][some_key_for_user2] = [user1_uuid][connections][some_key_for_user3] = I was thinking that I wanted
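One general point about lexically sorted keys can be sketched in isolation. The example assumes a comparator that orders column names by raw bytes (as a BytesType comparator would); the helper names and the key layout are hypothetical and are not taken from the truncated message above.

```python
# Sketch under assumptions: names sort byte-wise; key layout is hypothetical.
def padded(n, width=10):
    """Zero-pad a non-negative integer so byte order matches numeric order."""
    return str(n).zfill(width).encode("ascii")


def name_then_id(display_name, user_id):
    """Build a key that sorts by lower-cased name first, then by a unique id."""
    return f"{display_name.lower()}:{user_id}".encode("utf-8")


def bytewise_sorted(keys):
    """Sort keys the way a raw-bytes comparator would order them."""
    return sorted(keys)
```

Unpadded numeric keys sort as `b"10" < b"2" < b"9"`, which is rarely what is wanted, whereas `padded(2) < padded(9) < padded(10)`. Putting the sortable part first and a unique suffix last keeps the order meaningful while avoiding collisions between entries that share a name.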