Re: ORM in Cassandra?

2010-04-23 Thread dir dir
So maybe it's weird to combine ORM and Cassandra, right? Is there anything we can take from ORM? Honestly, I do not understand what your question is. It is clear that you cannot combine an ORM such as Hibernate or iBATIS with Cassandra. Cassandra itself is not an RDBMS, so you will not map the table

Re: ORM in Cassandra?

2010-04-23 Thread Benoit Perroud
I understand the question more like: Is there already a lib which helps to get rid of writing hardcoded and hard-to-maintain lines like: MyClass data; String[] myFields = {name, label, ...} ListColumn columns; for (String field : myFields) { if (field == name) { columns.add(new
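To illustrate the kind of helper being asked about, here is a minimal Python sketch that derives column name/value pairs from an object's declared fields instead of hardcoding one branch per field. It is not the API of any existing Cassandra library; `MyClass` with `name` and `label` mirrors the snippet above, while `to_columns` is a hypothetical name introduced only for this example.

```python
from dataclasses import dataclass, fields


@dataclass
class MyClass:
    # Field names taken from the snippet above; the types are assumed.
    name: str
    label: str


def to_columns(obj):
    """Return (column_name, column_value) pairs for every dataclass field.

    Hypothetical helper: a real client would wrap each pair in its own
    column type before sending it to Cassandra.
    """
    return [(f.name, str(getattr(obj, f.name))) for f in fields(obj)]


columns = to_columns(MyClass(name="widget", label="blue"))
```

Adding a field to `MyClass` then adds a column without touching the mapping code, which is the maintenance burden the question is about.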

Re: Row deletion and get_range_slices (cassandra 0.6.1)

2010-04-23 Thread Ryan King
On Thu, Apr 22, 2010 at 8:24 PM, David Harrison dave.l.harri...@gmail.com wrote: Do those tombstone-d keys ever get purged completely ?  I've tried shortening the GCGraceSeconds right down but they still don't get cleaned up. The GCGraceSeconds will only apply when you compact data. -ryan
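As a rough illustration of the point that GCGraceSeconds only matters at compaction time, here is a deliberately simplified Python model. It is an assumption-laden sketch, not Cassandra's code: the function name and parameters are hypothetical, and it ignores any other conditions a real compaction applies before dropping a tombstone.

```python
def tombstone_is_purgeable(deleted_at, gc_grace_seconds, compaction_time):
    """Simplified model: a tombstone can only be dropped when a compaction
    runs, and only if it is older than the grace period at that moment.

    All arguments are in seconds; the names are hypothetical.
    """
    return compaction_time - deleted_at > gc_grace_seconds


# Deleted at t=1000 with a 60 s grace period.
too_early = tombstone_is_purgeable(1000, 60, compaction_time=1030)
old_enough = tombstone_is_purgeable(1000, 60, compaction_time=1100)
```

In this model, shortening the grace period changes nothing until a compaction actually evaluates the check.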

Re: ORM in Cassandra?

2010-04-23 Thread aXqd
On Fri, Apr 23, 2010 at 1:25 PM, Jeremy Dunck jdu...@gmail.com wrote: See what you think of tragedy: http://github.com/enki/tragedy This one is feasible. I love the idea of 'Build your data model from Model and Index'. Even better, I am INDEED working with python and those indexes can be

Re: Row deletion and get_range_slices (cassandra 0.6.1)

2010-04-23 Thread David Harrison
So I'm guessing that means compaction doesn't include purging of tombstone-d keys ? Is there any situation or maintenance process that does ? (or are keys forever?) On 23 April 2010 17:44, Ryan King r...@twitter.com wrote: On Thu, Apr 22, 2010 at 8:24 PM, David Harrison

Re: MapReduce, Timeouts and Range Batch Size

2010-04-23 Thread Johan Oskarsson
I have written some code to avoid thrift reconnection, it just keeps the connection open between get_range_slices calls. I can extract that and put it up but not until early next week. /Johan On 23 apr 2010, at 05.09, Jonathan Ellis wrote: That would be an easy win, sure. On Thu, Apr 22,

Re: 0.6.1 insert 1B rows, crashed when using py_stress

2010-04-23 Thread richard yao
I have the same question, and after that Cassandra can't be started. I want to know how to restart Cassandra after it has crashed. Thanks for any reply.

Re: Cassandra Ruby Library's batch method example?

2010-04-23 Thread Lucas Di Pentima
So basically the idea behind the batch processing is some performance gain via network usage optimization? Thanks Jonathan! On 22/04/2010, at 21:32, Jonathan Ellis wrote: nope, there is no guarantee of that. if the server fails mid-operation you have to retry it. On Thu, Apr 22,

org.apache.cassandra.dht.OrderPreservingPartitioner Initial Token

2010-04-23 Thread Mark Jones
How is this specified? Is it a large hex #? A string of bytes in hex? http://wiki.apache.org/cassandra/StorageConfiguration doesn't say.

Re: How to insert a row with a TimeUUIDType column in C++

2010-04-23 Thread Jonathan Ellis
I would assume that you'd want to look for a C++ library that deals with UUIDs. Cassandra or Thrift aren't in the business of doing that conversion. On Fri, Apr 23, 2010 at 4:59 AM, Olivier Rosello orose...@corp.free.fr wrote: Here is my test code : ColumnPath new_col; new_col.__isset.column

Re: MapReduce, Timeouts and Range Batch Size

2010-04-23 Thread Jonathan Ellis
Great! Created https://issues.apache.org/jira/browse/CASSANDRA-1017 to track this. On Fri, Apr 23, 2010 at 4:12 AM, Johan Oskarsson jo...@oskarsson.nu wrote: I have written some code to avoid thrift reconnection, it just keeps the connection open between get_range_slices calls. I can extract

Question about a potential configuration scenario

2010-04-23 Thread Campbell, Joseph
Question: Is it possible to set up Cassandra such that 2 independent Cassandra rings/clusters replicate to one another, ensuring that each ring/cluster has at least 1 copy of all the data on each ring/cluster? The setup is like this: 2 Data centers, one in Philadelphia and another

RE: lazyboy - batch insert

2010-04-23 Thread Dop Sun
http://code.google.com/p/jassandra/source/browse/trunk/org.softao.jassandra/src/org/softao/jassandra/thrift/ThriftColumnFamily.java The Insert and Delete methods of this class use batch_mutation. Cheers. Dop From: Lubos Pusty [mailto:lubospu...@gmail.com] Sent: Friday, April 23, 2010

Re: MapReduce, Timeouts and Range Batch Size

2010-04-23 Thread Joost Ouwerkerk
Awesome. In the meantime, I hacked something similar myself. The performance difference does not appear to be material. I think the real killer is the get_range_slices call. Relative to that, the cost of getting the connection appears to be more or less trivial. What can I do to alleviate

Re: MapReduce, Timeouts and Range Batch Size

2010-04-23 Thread Jonathan Ellis
You could look into it, but it's not going to be an easy backport since SSTableReader and SSTableScanner got split into two classes in trunk. On Fri, Apr 23, 2010 at 9:39 AM, Joost Ouwerkerk jo...@openplaces.org wrote: Awesome.  In the meantime, I hacked something similar myself.  The

Re: Clarification on Ring operations in Cassandra 0.5.1

2010-04-23 Thread Jonathan Ellis
On Wed, Apr 21, 2010 at 1:48 PM, Anthony Molinaro antho...@alumni.caltech.edu wrote: So why is Token - 1 better?  Doesn't that result in more data movement than PreviousTokenInRing + 1? No, because a node is responsible for (previous token, own token]. So if you introduce token T-1 before
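The ownership rule quoted above, (previous token, own token], can be sketched in a few lines of Python. This is only an illustration under assumptions: tokens are plain integers, and `owner` and `owned_range` are hypothetical helper names, not Cassandra functions.

```python
from bisect import bisect_left


def owner(tokens, key_token):
    """Return the token of the node responsible for key_token.

    A node with token T owns (previous token, T], so the owner is the
    first token >= key_token, wrapping around to the smallest token.
    """
    ring = sorted(tokens)
    i = bisect_left(ring, key_token)
    return ring[i % len(ring)]


def owned_range(tokens, token):
    """Return the (exclusive start, inclusive end) range owned by token."""
    ring = sorted(tokens)
    i = ring.index(token)
    return (ring[i - 1], token)  # index -1 wraps to the last token


ring = [0, 100, 200]
# A new node at T-1 = 199 takes over almost all of the range of node 200.
range_at_t_minus_1 = owned_range(ring + [199], 199)
# A new node at previous+1 = 101 takes over only a sliver of that range.
range_at_prev_plus_1 = owned_range(ring + [101], 101)
```

With these toy numbers the T-1 node owns (100, 199] while the previous+1 node owns only (100, 101], which shows how the choice of token decides how much of the old range moves.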

Re: Will cassandra block client ?

2010-04-23 Thread Todd Burruss
Ran, Under very heavy load using more than 50 threads with 20k payload size, I have seen Hector close connections and then reopen them such that TIME_WAIT builds up and it can no longer connect. -Original Message- From: Ran Tavory [ran...@gmail.com] Received: 4/22/10 1:29 AM To:

Re: How to insert a row with a TimeUUIDType column in C++

2010-04-23 Thread Olivier Rosello
On Friday, 23 April 2010 at 08:30 -0500, Jonathan Ellis wrote: want to look for a C++ library that deals with UUIDs. Cassandra or Thrift aren't Thank you for the response. That's not the problem for me. The problem is that new_col.column type is string. uint8_t uuid[17];

Re: Concurrent SuperColumn update question

2010-04-23 Thread Jonathan Ellis
On Thu, Apr 22, 2010 at 11:34 AM, tsuraan tsur...@gmail.com wrote: Suppose I have a SuperColumn CF where one of the SuperColumns in each row is being treated as a list (e.g. keys only, values are just empty).  In this list, values will only ever be added; deletion never occurs.  If I have two

Re: org.apache.cassandra.dht.OrderPreservingPartitioner Initial Token

2010-04-23 Thread Jonathan Ellis
a normal String from the same universe as your keys. On Fri, Apr 23, 2010 at 7:23 AM, Mark Jones mjo...@imagehawk.com wrote: How is this specified? Is it a large hex #? A string of bytes in hex? http://wiki.apache.org/cassandra/StorageConfiguration doesn’t say.

Re: MapReduce, Timeouts and Range Batch Size

2010-04-23 Thread Joost Ouwerkerk
In that case I should probably wait for 0.7. Is there any fundamental performance difference in get_range_slices between Random and Order-Preserving partitioners? If so, by what factor? joost. On Fri, Apr 23, 2010 at 10:47 AM, Jonathan Ellis jbel...@gmail.com wrote: You could look into it,

RE: org.apache.cassandra.dht.OrderPreservingPartitioner Initial Token

2010-04-23 Thread Mark Jones
So if my keys are binary, is there any way to escape the keysequence in? I have 20 bytes (any value 0x0-0xff is possible) as the key. Are they compared as an array of bytes? So that I can use truncation? 4 nodes, broken up by 0x00, 0x40, 0x80, 0xC0? -Original Message- From: Jonathan

RE: How to insert a row with a TimeUUIDType column in C++

2010-04-23 Thread Mark Jones
Turns out assign can be called with the length as well. So mod your code to be new_col.column.assign((char *)uuid, 16); and you are fixed. -Original Message- From: Mark Jones [mailto:mjo...@imagehawk.com] Sent: Friday, April 23, 2010 10:52 AM To: user@cassandra.apache.org Subject: RE:
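The thread does not show the failing call, but presumably the length matters because raw UUID bytes may contain 0x00, which ends a NUL-terminated C string early. The Python sketch below uses `ctypes` to mimic that difference; the byte values are made up for illustration and this is not the original C++ code.

```python
import ctypes

# 16 raw bytes standing in for a UUID, with a 0x00 at index 1 (made-up values).
raw = bytes([0x12, 0x00, 0x34]) + bytes(13)

# A C char array holding the bytes plus a trailing NUL terminator.
buf = ctypes.create_string_buffer(raw)

as_c_string = buf.value      # read as a NUL-terminated string: stops at the first 0x00
with_length = buf.raw[:16]   # read with an explicit length of 16: keeps every byte
```

Here `as_c_string` holds a single byte while `with_length` holds all 16, which is the behaviour the two-argument `assign` call restores.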

Re: Concurrent SuperColumn update question

2010-04-23 Thread tsuraan
On Thu, Apr 22, 2010 at 11:34 AM, tsuraan tsur...@gmail.com wrote: Suppose I have a SuperColumn CF where one of the SuperColumns in each row is being treated as a list (e.g. keys only, values are just empty).  In this list, values will only ever be added; deletion never occurs.  If I have two

Re: 0.6.1 insert 1B rows, crashed when using py_stress

2010-04-23 Thread Brandon Williams
On Fri, Apr 23, 2010 at 4:59 AM, richard yao richard.yao2...@gmail.com wrote: I got the same question, and after that cassandra can't be started. I want to know how to restart the cassandra after it crashed. Thanks for any reply. Perhaps supply the error when you restart it? -Brandon

Odd ring problems with 0.5.1

2010-04-23 Thread Anthony Molinaro
So I've been trying to migrate off of old ec2 m1.large nodes onto xlarge nodes so I can get enough breathing room to then do an upgrade to 0.6.x (I can't keep the large nodes up long enough, so I spend all my time restarting and trying to move data, so can get all the packages I would need for

Re: ORM in Cassandra?

2010-04-23 Thread Ned Wolpert
There is nothing wrong with what you are asking. Some work has been done to get an ORM layer on top of Cassandra, for example, with a RubyOnRails project. I'm trying to simplify Cassandra integration with Grails with the plugin I'm writing. The problem is ORM solutions to date are wrapping a

Re: getting cassandra setup on windows 7

2010-04-23 Thread S Ahmed
Any insights? Much appreciated! On Thu, Apr 22, 2010 at 11:13 PM, S Ahmed sahmed1...@gmail.com wrote: I was just reading that thanks. What does he mean when he says: This appears to be related to data storage paths I set, because if I switch the paths back to the default UNIX paths.

Re: Odd ring problems with 0.5.1

2010-04-23 Thread Anthony Molinaro
On Fri, Apr 23, 2010 at 12:41:17PM -0500, Jonathan Ellis wrote: On Fri, Apr 23, 2010 at 12:30 PM, Anthony Molinaro antho...@alumni.caltech.edu wrote: Some nodes appear in the ring from some nodes, but not others.  Right now I have 14 nodes, 10 of those nodes have the same output of a

Re: Odd ring problems with 0.5.1

2010-04-23 Thread Jonathan Ellis
On Fri, Apr 23, 2010 at 1:12 PM, Anthony Molinaro antho...@alumni.caltech.edu wrote: I'm not sure how it would get this, maybe I need to restart my seed node? It's worth a try. Sounds like you found an unusual bug in gossip. When I run nodeprobe ring on the seed I don't see any of the hosts I

Re: getting cassandra setup on windows 7

2010-04-23 Thread Mark Greene
Try the cassandra-with-fixes.bat (https://issues.apache.org/jira/secure/attachment/12442349/cassandra-with-fixes.bat) file attached to the issue. I had the same issue and that bat file got cassandra to start. It still throws another error complaining about the log4j.properties. On Fri, Apr 23, 2010

Re: running cassandra as a service on windows

2010-04-23 Thread Jonathan Ellis
you could do it with standard techniques to run java apps as windows services. i understand it's a bit painful. On Fri, Apr 23, 2010 at 2:05 PM, S Ahmed sahmed1...@gmail.com wrote: Is it possible to have Cassandra run in the background on a windows server? i.e. as a service so if the server

Re: running cassandra as a service on windows

2010-04-23 Thread Miguel Verde
https://issues.apache.org/jira/browse/CASSANDRA-292 points to http://commons.apache.org/daemon/procrun.html which is used by other Apache software to implement Windows services in Java. CassandraDaemon conforms to the Commons Daemon spec. On Fri, Apr 23, 2010 at 2:20 PM, Jonathan Ellis

Trove maps

2010-04-23 Thread Carlos Sanchez
Jonathan, Have you thought of using Trove collections instead of regular java collections (HashMap / HashSet) in Cassandra? Trove maps are faster and require less memory. Carlos

Re: Trove maps

2010-04-23 Thread Jonathan Ellis
From what I have seen Trove is only a win when you are doing Maps of primitives, which is mostly not what we use in Cassandra. (The one exception I can think of is a map of int - columnfamilies in CommitLogHeader. You're welcome to experiment and see if using Trove there or elsewhere makes a

Re: Trove maps

2010-04-23 Thread Carlos Sanchez
I will try to modify the code... what I like about Trove is that even for regular maps (non primitive) there are no Entry objects created, so there are far fewer references to be GCed. On Apr 23, 2010, at 2:55 PM, Jonathan Ellis wrote: From what I have seen Trove is only a win when you are

Re: Will cassandra block client ?

2010-04-23 Thread Ran Tavory
This used to be the case but was fixed a couple of weeks ago. Which version are you using? On Apr 23, 2010 5:56 PM, Todd Burruss bburr...@real.com wrote: Ran, Under very heavy load using more than 50 threads with 20k payload size, I have seen Hector close connections then reopen so such that

Super and Regular Columns

2010-04-23 Thread Robert
I am starting out with Cassandra and I had a couple of questions, I read a lot of the documentation including: http://arin.me/blog/wtf-is-a-supercolumn-cassandra-data-model First I wanted to make sure I understand this bug: http://issues.apache.org/jira/browse/CASSANDRA-598 Borrowing from the

Trying To Understand get_range_slices Results When Using RandomPartitioner

2010-04-23 Thread Larry Root
I am trying to better understand how using the RandomPartitioner will affect my ability to select ranges of keys. Consider my simple example where we have many online games across different game genres (GameType). These games need to store data for each one of their users. With that in mind consider

MESSAGE-STREAMING-POOL exception

2010-04-23 Thread B. Todd Burruss
i see these exceptions on 4 out of the 7 nodes in my cluster. in addition those same four nodes all show AE-SERVICE-STAGE with pending work, and have been showing this for several hours now. each node in the cluster has less than 2gb, so it should be finished by now. when i do nodetool streams

Re: MESSAGE-STREAMING-POOL exception

2010-04-23 Thread Jonathan Ellis
java.net.ConnectException: Connection timed out at sun.nio.ch.Net.connect is an os-level connection problem. On Fri, Apr 23, 2010 at 3:34 PM, B. Todd Burruss bburr...@real.com wrote: i see these exceptions on 4 out of the 7 nodes in my cluster.  in addition those same four nodes all show

Re: MESSAGE-STREAMING-POOL exception

2010-04-23 Thread B. Todd Burruss
i agree, but it seems to have implications on the streaming service. Jonathan Ellis wrote: java.net.ConnectException: Connection timed out at sun.nio.ch.Net.connect is an os-level connection problem. On Fri, Apr 23, 2010 at 3:34 PM, B. Todd Burruss bburr...@real.com wrote: i see these

Re: MESSAGE-STREAMING-POOL exception

2010-04-23 Thread Jonathan Ellis
Can you create a ticket? On Fri, Apr 23, 2010 at 3:50 PM, B. Todd Burruss bburr...@real.com wrote: i agree, but it seems to have implications on the streaming service. Jonathan Ellis wrote: java.net.ConnectException: Connection timed out at sun.nio.ch.Net.connect is an os-level connection

RE: Trove maps

2010-04-23 Thread Mark Jones
Eliminating GC hell would probably do a lot to help Cassandra maintain speed vs periods of superfast/superslow performance. I look forward to hearing how this experiment goes. From: Eric Hauser [mailto:ewhau...@gmail.com] Sent: Friday, April 23, 2010 3:37 PM To: user@cassandra.apache.org

Re: YCSB - Yahoo Cloud Serving Benchmark - now available for download

2010-04-23 Thread Jeff Hodges
Hell yeah! -- Jeff On Fri, Apr 23, 2010 at 10:59 AM, Brian Frank Cooper coop...@yahoo-inc.com wrote: Yahoo! Research is pleased to announce the release of the Yahoo! Cloud Serving Benchmark, YCSB v. 0.1.0, as an open source package. YCSB is a common benchmarking framework for cloud database,

Re: MESSAGE-STREAMING-POOL exception

2010-04-23 Thread B. Todd Burruss
https://issues.apache.org/jira/browse/CASSANDRA-1019 Jonathan Ellis wrote: Can you create a ticket? On Fri, Apr 23, 2010 at 3:50 PM, B. Todd Burruss bburr...@real.com wrote: i agree, but it seems to have implications on the streaming service. Jonathan Ellis wrote:

Re: Odd ring problems with 0.5.1

2010-04-23 Thread Anthony Molinaro
On Fri, Apr 23, 2010 at 01:17:21PM -0500, Jonathan Ellis wrote: On Fri, Apr 23, 2010 at 1:12 PM, Anthony Molinaro antho...@alumni.caltech.edu wrote: I'm not sure how it would get this, maybe I need to restart my seed node? It's worth a try. Sounds like you found an unusual bug in gossip.

Internal error processing describe_keyspace

2010-04-23 Thread Amol Deshpande
Hi, I'm new to Cassandra, trying to set up a single node to play with. I set one up in a VM (0.6.1 off the website), running Fedora 12. Things seem peachy in that I can connect to it with a modified hector ExampleClient, and insert data into it. However, when I decided to view this

Re: Internal error processing describe_keyspace

2010-04-23 Thread Jonathan Ellis
can you attach the full stacktrace? On Fri, Apr 23, 2010 at 4:50 PM, Amol Deshpande amol.deshpa...@gazillion.com wrote: Hi, I'm new to Cassandra, trying to set up a single node to play with.  I set one up in a VM (0.6.1 off the website), running fedora 12.  Things seem peachy in that I

RE: Internal error processing describe_keyspace

2010-04-23 Thread Amol Deshpande
Sure, INFO [COMPACTION-POOL:1] 2010-04-23 14:21:48,973 CompactionManager.java (line 326) Compacted to /home/amol/apache-cassandra-0.6.1/var/lib/cassandra/data/system/LocationInfo-9-Data.db. 1776/495 bytes for 2 keys. Time: 970ms. ERROR [pool-1-thread-6] 2010-04-23 14:28:46,917 Cassandra.java

Question about TimeUUIDType

2010-04-23 Thread Lucas Di Pentima
Hello, I'm using Cassandra 0.6.1 with the Ruby library. I want to log events on a CF like this: Events = { // CF CompareWith: TimeUUIDType SomeEventID : { // Row uuid_from_unix_timestamp : event_data, ... } } I receive event data with a UNIX timestamp (nr of seconds passed
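For readers wondering what a column name built from a UNIX timestamp could look like, here is a hedged Python sketch that packs a timestamp in seconds into a version 1 (time-based) UUID following the RFC 4122 field layout. `timeuuid_from_unix` is a hypothetical helper, not part of the Ruby library mentioned above. Note that with a fixed `clock_seq` and `node`, two events in the same second produce the same UUID, so a real logger would need to vary those fields.

```python
import uuid

# Number of 100-ns intervals between 1582-10-15 (UUID epoch) and 1970-01-01.
GREGORIAN_OFFSET = 0x01B21DD213814000


def timeuuid_from_unix(seconds, clock_seq=0, node=0):
    """Build a version 1 UUID whose timestamp encodes `seconds` (an int).

    clock_seq and node default to 0 purely for illustration.
    """
    ts = seconds * 10_000_000 + GREGORIAN_OFFSET
    return uuid.UUID(fields=(
        ts & 0xFFFFFFFF,                    # time_low
        (ts >> 32) & 0xFFFF,                # time_mid
        ((ts >> 48) & 0x0FFF) | 0x1000,     # time_hi with version 1
        ((clock_seq >> 8) & 0x3F) | 0x80,   # clock_seq_hi with RFC 4122 variant
        clock_seq & 0xFF,                   # clock_seq_low
        node,                               # 48-bit node id
    ))


event_id = timeuuid_from_unix(1272067200)   # 2010-04-24 00:00:00 UTC
column_name = event_id.bytes                # 16 raw bytes
```

The 16 raw bytes in `column_name` are the binary form of the UUID, as opposed to its 36-character text form.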

Re: Question about TimeUUIDType

2010-04-23 Thread Jesse McConnell
try LexicalUUIDType, that will distinguish the secs correctly imo based on the existing impl (last I checked at least) TimeUUIDType was equivalent to LongType cheers, jesse -- jesse mcconnell jesse.mcconn...@gmail.com On Fri, Apr 23, 2010 at 17:51, Lucas Di Pentima lu...@di-pentima.com.ar

Re: Question about a potential configuration scenario

2010-04-23 Thread Paul Prescod
http://wiki.apache.org/cassandra/Operations === A Cassandra cluster always divides up the key space into ranges delimited by Tokens as described above, but additional replica placement is customizable via IReplicaPlacementStrategy in the configuration file. The standard strategies are

Best way to store millisecond-accurate data

2010-04-23 Thread Andrew Nguyen
Hello, I am looking to store patient physiologic data in Cassandra - it's being collected at rates of 1 to 125 Hz. I'm thinking of storing the timestamps as the column names and the patient/parameter combo as the row key. For example, Bob is in the ICU and is currently having his blood

Re: Best way to store millisecond-accurate data

2010-04-23 Thread Miguel Verde
... One typical way of dealing with the data explosion of sampled time series data is to bucket/shard rows (e.g. Bob-20100423-bloodpressure) so that you put an upper bound on the row length. On Apr 23, 2010, at 7:01 PM, Andrew Nguyen andrew-lists-cassan...@ucsfcti.org wrote: Hello, I am looking to store patient physiologic data in Cassandra
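The bucketing idea can be sketched as a small Python helper that derives the row key from the patient, the parameter, and the day of the sample, with the millisecond timestamp kept as the column name. The daily bucket follows the Bob-20100423-bloodpressure example; the helper name, the UTC day boundary, and the choice of one bucket per day are assumptions for illustration.

```python
from datetime import datetime, timezone


def bucket_row_key(patient, parameter, ts_millis):
    """Return a row key of the form <patient>-<YYYYMMDD>-<parameter>.

    The day is taken in UTC (an assumption); ts_millis is a UNIX
    timestamp in milliseconds.
    """
    day = datetime.fromtimestamp(ts_millis / 1000, tz=timezone.utc).strftime("%Y%m%d")
    return f"{patient}-{day}-{parameter}"


sample_ts = 1272024000000                      # 2010-04-23 12:00:00 UTC
row_key = bucket_row_key("Bob", "bloodpressure", sample_ts)
column_name = sample_ts                        # millisecond timestamp as the column name
```

At 125 Hz a daily bucket holds at most 125 × 86,400 = 10,800,000 columns, so a finer bucket (for example hourly) may be preferable at the highest sampling rates.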

Re: Best way to store millisecond-accurate data

2010-04-23 Thread Erik Holstad
is easily captured by it. One typical way of dealing with the data explosion of sampled time series data is to bucket/shard rows (i.e. Bob-20100423-bloodpressure) so that you put an upper bound on the row length. On Apr 23, 2010, at 7:01 PM, Andrew Nguyen andrew-lists-cassan...@ucsfcti.org wrote

Re: Odd ring problems with 0.5.1

2010-04-23 Thread Anthony Molinaro
Turns out I needed to shut everything down completely, then start it all up; a rolling restart was still resulting in some nodes being confused about what ring they were in. I think the moral of all this is that any changes to the seed node must result in a full restart of your cluster. Also any use

Re: ORM in Cassandra?

2010-04-23 Thread aXqd
On Sat, Apr 24, 2010 at 1:36 AM, Ned Wolpert ned.wolp...@imemories.com wrote: There is nothing wrong with what you are asking. Some work has been done to get an ORM layer on top of cassandra, for example, with a RubyOnRails project. I'm trying to simplify cassandra integration with grails with

Re: Trove maps

2010-04-23 Thread Tatu Saloranta
On Fri, Apr 23, 2010 at 1:22 PM, Carlos Sanchez carlos.sanc...@riskmetrics.com wrote: I will try to modify the code... what I like about Trove is that even for regular maps (non primitive) there are no Entry objects created so there are much less references to be gced This could help, but

RE: org.apache.cassandra.dht.OrderPreservingPartitioner Initial Token

2010-04-23 Thread Stu Hood
Your keys cannot be encoded as binary for OPP, since Cassandra will attempt to decode them as UTF-8, meaning that they may not come back in the same format. 0.7 supports byte keys using the ByteOrderedPartitioner, and tokens are specified using hex. -Original Message- From: Mark
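To make the "tokens are specified using hex" remark concrete, here is a small Python sketch that splits a fixed-width byte key space evenly and prints each boundary as a hex string, matching the 0x00 / 0x40 / 0x80 / 0xC0 split asked about earlier in the thread. The exact token syntax the configuration expects is not shown in this thread, so treat the output format and the helper name as assumptions.

```python
def evenly_spaced_hex_tokens(node_count, key_bytes=20):
    """Return node_count hex strings that split a key_bytes-wide key space
    into equal ranges (assumed format: lowercase hex, zero-padded).
    """
    space = 1 << (8 * key_bytes)
    width = 2 * key_bytes
    return [format(i * space // node_count, f"0{width}x") for i in range(node_count)]


one_byte_tokens = evenly_spaced_hex_tokens(4, key_bytes=1)
twenty_byte_tokens = evenly_spaced_hex_tokens(4)
```

For one-byte keys this gives 00, 40, 80 and c0; for 20-byte keys the same leading bytes are followed by 19 zero bytes.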