Re: Starting Cassandra Fauna

2010-04-12 Thread Nirmala Agadgar
Hi, yes, I used only master. I downloaded the tar file, placed it in the cassandra folder, and ran cassandra_helper cassandra again. Now I am getting Error: Exception thrown by the agent : java.net.MalformedURLException: Local host name. When I set the hostname to localhost or 127.0.0.1, I get Exception in

FW: Is there any way to enable the multiple super column names at a time?

2010-04-12 Thread Dop Sun
I guess I posted this to the wrong mailing list. Sorry about that. My question, in short: can I get a certain range of super columns in a single query? I have checked the following APIs, but it looks like they don't work for this: get: for a single column; get_slice: contains column_parent as

compare cassandra read n write results

2010-04-12 Thread vineet daniel
Hi, a little while ago I tried Cassandra's read and write operations and timed them. I am using Pandra to communicate with Cassandra. The system is CentOS 5 with 2 GB RAM and a dual-core CPU. I inserted 10 rows in around 30 seconds and read them back in 25 seconds. If any of you have run similar tests

Re: FW: Is there any way to enable the multiple super column names at a time?

2010-04-12 Thread Jonathan Ellis
The supercolumn parameter to ColumnParent is optional precisely so you can do this. http://wiki.apache.org/cassandra/API On Mon, Apr 12, 2010 at 6:37 AM, Dop Sun su...@dopsun.com wrote: I guess I'm putting in the wrong email list. Sorry for this. My question in short is: whether I can get a

Re: compare cassandra read n write results

2010-04-12 Thread vineet daniel
I don't think it would be a good idea to avoid Pandra for benchmarks, as we are going to use Pandra for our application. Secondly, it will give the Pandra guys a push to improve the performance of their library. On Mon, Apr 12, 2010 at 6:05 PM, Jordan Pittier jordan.pitt...@gmail.com wrote: Hi,

RE: FW: Is there any way to enable the multiple super column names at a time?

2010-04-12 Thread Dop Sun
I also tried: client.multiget_slice(ksName, mKeyList, parent, predicate, ConsistencyLevel.ONE); It gives the same result: if I set super_column to null or an empty array, it returns nothing, but if I give the correct super_column value, it returns the expected 3 columns. I'm using the Keyspace1.Super1.

Re: Off line client nodes?

2010-04-12 Thread Lucas Di Pentima
Hello Colin, On 12/04/2010, at 07:52, Colin Yates wrote: Hi, In our architecture, our consultants want to perform some analysis on the train, disconnected from the web. How can I achieve this in Cassandra? I realise this isn't quite the use-case that was thought about when the

Re: compare cassandra read n write results

2010-04-12 Thread vineet daniel
Actually, to be honest, I don't know how to insert 100 rows without PHP or Pandra. If you could help me out, I will surely try it and share the results with you guys. On Mon, Apr 12, 2010 at 7:25 PM, Paul Prescod pres...@gmail.com wrote: How will they know whether the performance problem

Re: Off line client nodes?

2010-04-12 Thread Paul Prescod
On Mon, Apr 12, 2010 at 3:52 AM, Colin Yates colin.ya...@gmail.com wrote: Hi, In our architecture, our consultants want to perform some analysis on the train, disconnected from the web. How can I achieve this in Cassandra? I realise this isn't quite the use-case that was thought about when

Re: compare cassandra read n write results

2010-04-12 Thread Jordan Pittier
First, read carefully and understand: http://wiki.apache.org/cassandra/ThriftExamples#PHP But you really shouldn't bother with benchmarks. Ask yourself this question: what if my Cassandra performs at 5k operations/s? And what about 3k ops/s? In other words, why are you benchmarking? You've got

Re: compare cassandra read n write results

2010-04-12 Thread Paul Prescod
contrib/py_stress Although that's still written in a scripting language, it at least uses threading. Anyhow, what's your real goal? Inserting 100K or 1M rows in 30 seconds from a single-threaded environment like PHP is pretty good. Do your business goals require more? Also: Is it 100K or 1M? In

frequent unknown result errors

2010-04-12 Thread Lee Parker
I am a newbie with Cassandra. We are currently migrating a large amount of data out of MySQL into Cassandra. I have two ColumnFamilies. One contains one row per item and each item has roughly 12 columns. These are items from REST APIs like the Twitter API. Then I have a second ColumnFamily

Re: frequent unknown result errors

2010-04-12 Thread Jonathan Ellis
unknown result means thrift is badly confused. You will get this when using the same thrift connection from multiple threads, for instance. On Mon, Apr 12, 2010 at 10:02 AM, Lee Parker l...@socialagency.com wrote: I am a newbie with Cassandra. We are currently migrating a large amount of data
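One common way to avoid sharing a connection between threads is to give each thread its own. The sketch below is illustrative only: `FakeConnection`, `get_connection`, `worker`, and `run_workers` are hypothetical names, and `FakeConnection` is a stand-in that merely checks which thread is calling it; it is not the Thrift or Cassandra client API.

```python
import threading


class FakeConnection:
    """Hypothetical stand-in for a Thrift client connection (not a real API)."""

    def __init__(self):
        # Remember which thread opened this connection.
        self.owner = threading.get_ident()

    def call(self, request):
        # Refuse calls from any thread other than the one that opened us.
        if threading.get_ident() != self.owner:
            raise RuntimeError("connection used from a foreign thread")
        return f"ok:{request}"


# Each thread sees its own independent attributes on this object.
_local = threading.local()


def get_connection():
    """Return the calling thread's own connection, opening it on first use."""
    if not hasattr(_local, "conn"):
        _local.conn = FakeConnection()
    return _local.conn


def worker(name, results):
    conn = get_connection()
    # Keep the connection object in the result so it stays alive for inspection.
    results[name] = (conn, conn.call(name))


def run_workers(count=4):
    """Start `count` threads, each using its own connection, and collect results."""
    results = {}
    threads = [
        threading.Thread(target=worker, args=(f"t{i}", results))
        for i in range(count)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Calling `run_workers(4)` yields four responses, each produced through a different connection object. Separate command-line processes, as described later in this thread, already get separate connections, so this pattern only matters inside a multi-threaded client.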

Re: frequent unknown result errors

2010-04-12 Thread Lee Parker
If the connections are being made by individual PHP processes running from the command line, they shouldn't be using the same connection. Should my code close the connections after each query and open a new one? Here is the flow of what is happening when we get the error: 1. Get a set of items

Re: frequent unknown result errors

2010-04-12 Thread Jonathan Ellis
Then you're probably using a client incompatible with the server version you're using. On Mon, Apr 12, 2010 at 10:24 AM, Lee Parker l...@socialagency.com wrote: If the connections are being made by individual PHP processes running from the command line, they shouldn't be using the same

Re: frequent unknown result errors

2010-04-12 Thread Lee Parker
According to his docs, he says you need Cassandra = 0.5.0. I guess it is possible that the included thrift files are targeted at 0.6, but I don't see the batch_mutate method which is part of 0.6. So I'm assuming that it should work fine with 0.5.0. I have now changed some of those entries in

Re: Starting Cassandra Fauna

2010-04-12 Thread Ryan King
I'm guessing you missed the ant ivy-retrieve step. We're planning on releasing a new gem today that should fix this issue. -ryan On Mon, Apr 12, 2010 at 3:30 AM, Nirmala Agadgar nirmala...@gmail.com wrote: Hi, Yes, used only master. i downloaded the tar file and placed in cassandra folder

Re: Two dimensional matrices

2010-04-12 Thread Eric Evans
On Mon, 2010-04-12 at 01:31 +0200, Philippe wrote: I have data that is two dimensional, time varying (think of a grid). At each cell of this grid, I store a binary array. My data model will be - single keyspace - key = {Y dimension} - super column family = {type of data

Re: frequent unknown result errors

2010-04-12 Thread Keith Thornhill
I also noticed unknown result errors when my PHP thrift code was generated using a different version of thrift than Cassandra uses. After regenerating my PHP code from thrift-r917130 (for cassandra-0.6.0-rc1), the errors stopped. -keith On Mon, Apr 12, 2010 at 9:40 AM, vineet daniel

Re: Two dimensional matrices

2010-04-12 Thread Philippe
Eric, Dop, Thanks for your answers. If I understand what you're asking, a rectangle (identified by X and Y coordinates for a time-frame), will boil down to a single column. There are certainly no problems with retrieving a single sub-column from a super column. I realize I wasn't clear

Re: Worst case #iops to read a row

2010-04-12 Thread Jonathan Ellis
On Mon, Apr 12, 2010 at 3:45 PM, Time Less timelessn...@gmail.com wrote: I'm confused. That's really worst-case? 3 iops? Max 3 per sstable, as RK clarified. What if we have 10B rows in the column family? What sort of index do you use that would only require one iop to find the row index

Re: Off line client nodes?

2010-04-12 Thread David Timothy Strauss
It's not common for me to recommend CouchDB, but this is one instance it's great for: synching complete datasets for disconnected use. Cassandra treats disconnection as a problem, not something that should occur in the normal plan of operations. -Original Message- From: Colin Yates

Re: Off line client nodes?

2010-04-12 Thread Jonathan Ellis
Cassandra treats disconnection as something that *does* occur, like it or not, and deals with it well. But there's no way to easily sync just some subset of data to your laptop. Couch may well be better at doing that sort of thing. On Mon, Apr 12, 2010 at 3:50 PM, David Timothy Strauss

Re: frequent unknown result errors

2010-04-12 Thread Lee Parker
So, it didn't get rid of the problem; I'm still getting the errors. The only thing I can think of now is to upgrade to 0.6, but I would prefer to stay with the current stable release. I have regenerated the thrift code for 0.5.0 and there is no difference between those files and the ones I'm

Re: Two dimensional matrices

2010-04-12 Thread Eric Evans
On Mon, 2010-04-12 at 22:40 +0200, Philippe wrote: If I understand what you're asking, a rectangle (identified by X and Y coordinates for a time-frame), will boil down to a single column. There are certainly no problems with retrieving a single sub-column from a super column. I realize

Re: Two dimensional matrices

2010-04-12 Thread Philippe
Alright, so assuming we're looking for a slice of the grid against a given time-frame, that would look something like: get_range_slice( keyspaceName, ColumnParent(CFname, timeFrame), SlicePredicate( slice_range=SliceRange(xstart, xend, false, colCount) ), ystart,
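The shape of the query discussed here (a range of row keys on the Y axis, one time-frame super column, and a range of X columns) can be sketched against a plain in-memory structure. Everything below is an assumption made for illustration: `range_slice` and the sample `grid` are hypothetical, ranges are treated as inclusive, and keys are assumed to be ordered. It does not call or reproduce the Thrift API.

```python
def range_slice(grid, time_frame, x_start, x_end, y_start, y_end, col_count):
    """Select columns x_start..x_end of one time frame for rows y_start..y_end.

    `grid` maps row key (Y) -> super column (time frame) -> column (X) -> value.
    Both ranges are inclusive; at most `col_count` columns are kept per row.
    """
    result = {}
    for y in sorted(grid):
        if not (y_start <= y <= y_end):
            continue
        # A row without the requested time frame contributes no columns.
        columns = grid[y].get(time_frame, {})
        picked = [(x, columns[x]) for x in sorted(columns) if x_start <= x <= x_end]
        result[y] = picked[:col_count]
    return result


# Hypothetical sample data: three rows (Y = 1..3) and two time frames.
grid = {
    1: {"t0": {1: b"a", 2: b"b", 3: b"c"}},
    2: {"t0": {1: b"d", 2: b"e"}, "t1": {1: b"f"}},
    3: {"t1": {2: b"g"}},
}
```

With this sample data, `range_slice(grid, "t0", 2, 3, 1, 2, 10)` keeps columns 2 and 3 of row 1 and column 2 of row 2. The sketch takes a single `time_frame` per call, which matches the one-super-column-per-call shape of the query in the message.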

Best practices to build app with querying/searching functionality

2010-04-12 Thread Olexiy Prokhorenko
Hello, I asked this question on Stack Overflow (http://stackoverflow.com/questions/2619744/searches-and-general-querying-with-hbase-and-or-cassandra-best-practices) but didn't get many answers. Maybe some Cassandra people can help me and point me in the right direction? So: I have User model

Re: Two dimensional matrices

2010-04-12 Thread Eric Evans
On Tue, 2010-04-13 at 00:23 +0200, Philippe wrote: Alright, so assuming we're looking for a slice of the grid against a given time-frame, that would look something like: get_range_slice( keyspaceName, ColumnParent(CFname, timeFrame), SlicePredicate(

Re: Two dimensional matrices

2010-04-12 Thread Philippe
However, you are also saying there is no way to take the timeFrame supercolumn into account in the same API call? I.e., it is not possible to get back a data structure keyed by 'key,supercolumn,column', hence y, x and timeframe, which I can then process to my heart's delight? If

Re: Worst case #iops to read a row

2010-04-12 Thread Time Less
What if we have 10B rows in the column family? What sort of index do you use that would only require one iop to find the row index block? basically what is described in sections 5.3 and 5.4 here: http://labs.google.com/papers/bigtable.html Incorrect. Section 4 of the paper describes the

Re: Worst case #iops to read a row

2010-04-12 Thread Benjamin Black
On Mon, Apr 12, 2010 at 4:27 PM, Time Less timelessn...@gmail.com wrote: With this formula, we can already begin to formulate more useful answers to the question. If I have 10B rows in my CF, and I can fit 10k rows per SStable, and the SStables are spread across 5 nodes, and I have 1 bloom
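The figures quoted in this message imply some simple counts, worked out below as a rough sketch. It assumes the SSTables are spread evenly across the nodes, which the message does not state, and `sstable_counts` is a hypothetical helper. It deliberately stops at counting SSTables, because the rest of the formula is cut off in the excerpt.

```python
def sstable_counts(total_rows, rows_per_sstable, nodes):
    """Return (total SSTables, SSTables per node), assuming an even spread."""
    total = -(-total_rows // rows_per_sstable)  # ceiling division
    per_node = -(-total // nodes)               # ceiling division
    return total, per_node


# Figures quoted in the message: 10B rows, 10k rows per SSTable, 5 nodes.
total, per_node = sstable_counts(10_000_000_000, 10_000, 5)
```

Under those assumptions, `total` is 1,000,000 SSTables and `per_node` is 200,000.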

Re: Two dimensional matrices

2010-04-12 Thread Eric Evans
On Tue, 2010-04-13 at 00:45 +0200, Philippe wrote: However, you are also saying there is no way to also take into account the timeFrame supercolumn in the same API call ? IE, it is not possible to get back a data structure keyed by