Re: Storing big objects into columns

2011-01-14 Thread Peter Schuller
In a project I would like to store big objects in columns, serialized. For example entire images (several KB to several MB), Flash animations (several MB), etc. Does anyone use Cassandra with such relatively big columns, and if so, does it work well? Are there any drawbacks to using this

Re: Usage Pattern : "unique" value of a key.

2011-01-14 Thread Oleg Anastasyev
You're right when you say it's unlikely that 2 threads have the same timestamp, but it can happen. So it could work for user creation, but maybe not on a more write-intensive problem. Um, sorry, I thought you're solving the exact case of duplicate user creation. If you're trying to solve the concurrent

RE: about the data directory

2011-01-14 Thread raoyixuan (Shandy)
Thanks very much -Original Message- From: sc...@scode.org [mailto:sc...@scode.org] On Behalf Of Peter Schuller Sent: Friday, January 14, 2011 4:40 PM To: user@cassandra.apache.org Subject: Re: about the data directory as an administrator, I want to know why I can read the data from any

Re: Is there any way I could use keys of other rows as column names that could be sorted according to time ?

2011-01-14 Thread Roshan Dawrani
It's possible that I am misunderstanding the question in some way. The row keys can be TimeUUIDs, and with those row keys as column names, you can use the TimeUUIDType comparator to have them sorted by time automatically. On Fri, Jan 14, 2011 at 9:18 AM, Aaron Morton aa...@thelastpickle.com wrote:
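As a rough illustration of the point about time-ordered names, the Python sketch below generates version-1 (time-based) UUIDs with the standard `uuid` module and sorts them by their embedded timestamp. It only mimics the ordering idea on the client side; it is not Cassandra's comparator code, and the variable names are made up for the example.

```python
import random
import uuid

# Generate a handful of version-1 (time-based) UUIDs.
ids = [uuid.uuid1() for _ in range(5)]

# Shuffle to simulate names arriving in arbitrary order.
shuffled = ids[:]
random.shuffle(shuffled)

# Order by the 60-bit timestamp embedded in each UUID.
by_time = sorted(shuffled, key=lambda u: u.time)

for u in by_time:
    print(u.time, u)
```

Note that sorting the raw 16 bytes would not give time order, because the low bits of the timestamp come first in the UUID layout; a time-aware comparison has to look at the timestamp field itself.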

Re: Timeout Errors while running Hadoop over Cassandra

2011-01-14 Thread Jairam Chandar
The cassandra logs strangely show no errors at the time of failure. Changing the RPCTimeoutInMillis seemed to help. Though it slowed down the job considerably, it seems to be finishing by changing the timeout value to 1 min. Unfortunately, I cannot be sure if it will continue to work if the data

Different comparator types for column and supercolumn don't work

2011-01-14 Thread Karin Kirsch
Hello, I'm new to cassandra. I'm using cassandra release 0.7.0 (local, single node). I can't perform write operations in case the column and supercolumn families have different comparator types. For example if I use the code given in Issue: https://issues.apache.org/jira/browse/CASSANDRA-1712

Re: Is there any way I could use keys of other rows as column names that could be sorted according to time ?

2011-01-14 Thread Aklin_81
@Roshan Yes, I thought about that, but then I wouldn't be able to use the Random Partitioner. @Aaron Do you mean like this: 'timeUUID + row_key' as the supercolumn names? Then when retrieving the row_key from this column name, will I be required to parse the name? How do I do that exactly?
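One possible answer to the parsing question, sketched in Python under the assumption that the name is simply the 16 UUID bytes followed by the UTF-8 row key. The helper names `make_name` and `parse_name` are hypothetical, and the sketch says nothing about which comparator would order such names correctly.

```python
import uuid
from typing import Tuple

UUID_LEN = 16  # a UUID is always 16 bytes long

def make_name(ts_uuid: uuid.UUID, row_key: str) -> bytes:
    """Build a composite name: UUID bytes first, then the row key."""
    return ts_uuid.bytes + row_key.encode("utf-8")

def parse_name(name: bytes) -> Tuple[uuid.UUID, str]:
    """Split a composite name back into its UUID and row key parts."""
    if len(name) < UUID_LEN:
        raise ValueError("name too short to contain a UUID")
    return uuid.UUID(bytes=name[:UUID_LEN]), name[UUID_LEN:].decode("utf-8")

stamp = uuid.uuid1()
name = make_name(stamp, "row-42")
parsed_uuid, parsed_key = parse_name(name)
print(parsed_uuid == stamp, parsed_key)
```

Because the UUID part has a fixed length, no separator character is needed and the row key can contain any characters.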

Re: Is there any way I could use keys of other rows as column names that could be sorted according to time ?

2011-01-14 Thread Roshan Dawrani
On Fri, Jan 14, 2011 at 7:15 PM, Aklin_81 asdk...@gmail.com wrote: @Roshan Yes, I thought about that, but then I wouldn't be able to use the Random Partitioner. Can you please expand a bit on this? What is this restriction? Can you point me to some relevant documentation on this? Thanks.

Re: Is there any way I could use keys of other rows as column names that could be sorted according to time ?

2011-01-14 Thread Rajkumar Gupta
I am not sure, but I guess it is because all the rows of a certain time range will go to just one node and will not be evenly distributed, since the timeUUID will not be random but sequential according to time... I am not sure anyways... On Fri, Jan 14, 2011 at 7:18 PM, Roshan Dawrani

Re: Is there any way I could use keys of other rows as column names that could be sorted according to time ?

2011-01-14 Thread Aklin_81
I too believed so, but I'm not totally sure. On 1/14/11, Rajkumar Gupta rajkumar@gmail.com wrote: I am not sure, but I guess it is because all the rows of a certain time range will go to just one node and will not be evenly distributed, since the timeUUID will not be random but sequential according to

Re: Is there any way I could use keys of other rows as column names that could be sorted according to time ?

2011-01-14 Thread Roshan Dawrani
I am not clear what you guys are trying to do and say :-) So, let's take some specifics... Say you want to create rows in some column family (say CF_A), and as you create them, you want to store their row key in column names in some other column family (say CF_B) - possibly for filtering keys

Problem starting Cassandra on Ubuntu

2011-01-14 Thread kh jo
Hi, I just installed Cassandra on Ubuntu using the package manager, but I cannot start it. I get the following error in the logs:  INFO [main] 2011-01-14 15:37:49,758 AbstractCassandraDaemon.java (line 74) Heap size: 1051525120/1051525120  WARN [main] 2011-01-14 15:37:49,826 CLibrary.java (line 73)

Re: Is there any way I could use keys of other rows as column names that could be sorted according to time ?

2011-01-14 Thread Aklin_81
I just read that Cassandra internally creates an MD5 hash that is used for distributing the load by sending it to a node responsible for the range within which that MD5 hash falls, so even when we create sequential keys, their MD5 hashes are not the same, hence they are not sent to the same node. This was
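The effect described here can be checked with a small Python sketch. It is a simplification, not Cassandra's actual token computation: the `md5_token` and `owner` helpers, the four-node ring, and the equal-sized ranges are all assumptions made for illustration.

```python
import hashlib

NODE_COUNT = 4  # hypothetical cluster size

def md5_token(key: bytes) -> int:
    """Interpret the 128-bit MD5 digest of a row key as an integer token."""
    return int.from_bytes(hashlib.md5(key).digest(), "big")

def owner(token: int, node_count: int = NODE_COUNT) -> int:
    """Map a token to a node by splitting the token space into equal ranges."""
    range_size = (1 << 128) // node_count
    return min(token // range_size, node_count - 1)

# Sequential keys, as a time-ordered key scheme might produce.
keys = [f"user{i:04d}".encode() for i in range(1000)]

counts = [0] * NODE_COUNT
for key in keys:
    counts[owner(md5_token(key))] += 1

print(counts)
```

Although the keys are strictly sequential, their hashes land all over the token space, so each of the hypothetical nodes ends up with a share of the rows.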

Re: limiting columns in a row

2011-01-14 Thread Sylvain Lebresne
Hi, does this seem like a generally useful feature? I do think this could be a useful feature. If only because I don't think there is any satisfactory/efficient way to do this client side. if so, would it be hard to implement (maybe it could be done at compaction time like the TTL feature)?

live data migration from mysql to cassandra

2011-01-14 Thread ruslan usifov
Hello, dear community. Please share your experience: how did you make a live (without stopping) migration from MySQL or another RDBMS to Cassandra?

Re: Is there any way I could use keys of other rows as column names that could be sorted according to time ?

2011-01-14 Thread Aklin_81
No, you do not need to shut up, please! :) You may be clearing up my further misconceptions on the topic! Anyway, the link between the 1st and 2nd paragraphs was that since the distribution of rows among nodes is not affected by the key (as you rightly said) but by the MD5 hash of the key, I can use just any key

Re: live data migration from mysql to cassandra

2011-01-14 Thread Edward Capriolo
On Fri, Jan 14, 2011 at 10:40 AM, ruslan usifov ruslan.usi...@gmail.com wrote: Hello, dear community. Please share your experience: how did you make a live (without stopping) migration from MySQL or another RDBMS to Cassandra? There is no built-in way to do this. I remember hearing at Hadoop World this year

Do you have a site in production environment with Cassandra? What client do you use?

2011-01-14 Thread Ertio Lew
Hey, If you have a site in production environment or considering so, what is the client that you use to interact with Cassandra. I know that there are several clients available out there according to the language you use but I would love to know what clients are being used widely in production

Re: cassandra row cache

2011-01-14 Thread Mike Malone
Digest reads could be being dropped..? On Thu, Jan 13, 2011 at 4:11 PM, Jonathan Ellis jbel...@gmail.com wrote: On Thu, Jan 13, 2011 at 2:00 PM, Edward Capriolo edlinuxg...@gmail.com wrote: Is it possible that you are reading at READ.ONE and that READ.ONE only warms cache on 1 of your

Re: cassandra row cache

2011-01-14 Thread Jonathan Ellis
That's possible, yes. He'd want to make sure there aren't any of those WARN messages in the logs. On Fri, Jan 14, 2011 at 11:46 AM, Mike Malone m...@simplegeo.com wrote: Digest reads could be being dropped..? On Thu, Jan 13, 2011 at 4:11 PM, Jonathan Ellis jbel...@gmail.com wrote: On Thu,

Re: Do you have a site in production environment with Cassandra? What client do you use?

2011-01-14 Thread Ran Tavory
I use Hector, if that counts. .. On Jan 14, 2011 7:25 PM, Ertio Lew ertio...@gmail.com wrote: Hey, If you have a site in production environment or considering so, what is the client that you use to interact with Cassandra. I know that there are several clients available out there according

Re: Do you have a site in production environment with Cassandra? What client do you use?

2011-01-14 Thread Ertio Lew
What technology stack do you use? On 1/14/11, Ran Tavory ran...@gmail.com wrote: I use Hector, if that counts. .. On Jan 14, 2011 7:25 PM, Ertio Lew ertio...@gmail.com wrote: Hey, If you have a site in production environment or considering so, what is the client that you use to

Re: Do you have a site in production environment with Cassandra? What client do you use?

2011-01-14 Thread Ran Tavory
Java On Jan 14, 2011 8:25 PM, Ertio Lew ertio...@gmail.com wrote: What technology stack do you use? On 1/14/11, Ran Tavory ran...@gmail.com wrote: I use Hector, if that counts. .. On Jan 14, 2011 7:25 PM, Ertio Lew ertio...@gmail.com wrote: Hey, If you have a site in production

phpcassa never return(infinite loop)?!!!

2011-01-14 Thread kh jo
I am trying to use phpcassa. I use the following example:  CassandraConn::add_node('localhost', 9160); $users = new CassandraCF('rhg', 'Users'); // ColumnFamily $users->insert('1', array('email' => 't...@example.com', 'password' => 'test'));  When I run it, it never returns, and apache

Cassandra in less than 1G of memory?

2011-01-14 Thread Rajat Chopra
Hello. According to the JVM heap size topic at http://wiki.apache.org/cassandra/MemtableThresholds , Cassandra would need at least 1G of memory to run. Is it possible to have a running Cassandra cluster with machines that have less than that memory... say 512M? I can live with slow transactions,

Re: Newbie Replication/Cluster Question

2011-01-14 Thread Mark Moseley
On Thu, Jan 13, 2011 at 2:32 PM, Mark Moseley moseleym...@gmail.com wrote: On Thu, Jan 13, 2011 at 1:08 PM, Gary Dusbabek gdusba...@gmail.com wrote: It is impossible to properly bootstrap a new node into a system where there are not enough nodes to satisfy the replication factor.  The cluster

Re: Do you have a site in production environment with Cassandra? What client do you use?

2011-01-14 Thread Dan Kuebrich
We've done hundreds of gigs in and out of cassandra 0.6.8 with pycassa 0.3. Working on upgrading to 0.7 and pycassa 1.03. I don't know if we're using it wrong, but the constraint that the connection object is tied to a particular keyspace isn't that awesome; we have a number of keyspaces used

Re: Cassandra in less than 1G of memory?

2011-01-14 Thread Victor Kabdebon
Dear rajat, Yes it is possible, I have the same constraints. However I must warn you, from what I see Cassandra memory consumption is not bounded in 0.6.X on Debian 64-bit. Here is an example of an instance launch in a node : root 19093 0.1 28.3 1210696 *570052* ? Sl Jan11 9:08

Re: Newbie Replication/Cluster Question

2011-01-14 Thread Mark Moseley
Perhaps the better question would be, if I have a two node cluster and I want to be able to lose one box completely and replace it (without losing the cluster), what settings would I need? Or is that an impossible scenario? In production, I'd imagine a 3 node cluster being the minimum but

Cassandra-Maven-Plugin

2011-01-14 Thread Stephen Connolly
OK, I nearly have the Cassandra-Maven-Plugin ready. It has the following goals: run: launches Cassandra in the foreground and blocks until you press ^C at which point Maven terminates. Use-case: Running integration tests from your IDE. Live development from your IDE. start: launches

Re: Newbie Replication/Cluster Question

2011-01-14 Thread Mark Moseley
On Fri, Jan 14, 2011 at 4:29 PM, Aaron Morton aa...@thelastpickle.com wrote: Here's some slides I did last year that have a simple explanation of RF http://www.slideshare.net/mobile/aaronmorton/well-railedcassandra24112010-5901169 Short version is, generally no single node contains all the

Re: Cassandra in less than 1G of memory?

2011-01-14 Thread Edward Capriolo
On Fri, Jan 14, 2011 at 2:13 PM, Victor Kabdebon victor.kabde...@gmail.com wrote: Dear rajat, Yes it is possible, I have the same constraints. However I must warn you, from what I see Cassandra memory consumption is not bounded in 0.6.X on Debian 64-bit. Here is an example of an instance

is it possible to map an one from a a file and an one from cassandra?

2011-01-14 Thread 김준영
hi, Cassandra supports Hadoop map reduce over data in Cassandra. Now I am digging to find a way to map from a file and Cassandra together. I mean, if both of them were files on my disk, it would be possible by using splits. But in this kind of situation, which way is possible? For example, in

Re: Cassandra in less than 1G of memory?

2011-01-14 Thread Jonathan Ellis
mmapping only consumes memory that the OS can afford to feed it. On Fri, Jan 14, 2011 at 7:29 PM, Edward Capriolo edlinuxg...@gmail.com wrote: On Fri, Jan 14, 2011 at 2:13 PM, Victor Kabdebon victor.kabde...@gmail.com wrote: Dear rajat, Yes it is possible, I have the same constraints.

Re: Cassandra in less than 1G of memory?

2011-01-14 Thread Victor Kabdebon
Hi Jonathan, hi Edward, Jonathan: but it looks like mmapping wants to consume the entire memory of my server. It goes up to 1.7 GB for a ridiculously small amount of data. Am I doing something wrong or is there something I should change to prevent this never-ending increase of memory consumption