Re: How to generate 'unique' identifiers for use in Cassandra

2010-04-27 Thread Andriy Bohdan
There's no easy and efficient way to implement auto_increment keys in Cassandra. So people usually use UUIDs (http://en.wikipedia.org/wiki/UUID) for this purpose, which are considered globally unique. If you can use one of the fields from your data model as a unique key, better use it instead of
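A minimal sketch of the UUID approach described above, using only Python's standard `uuid` module. The helper names `new_row_key` and `new_time_ordered_key` are hypothetical and chosen for illustration; nothing here talks to Cassandra.

```python
import uuid


def new_row_key() -> str:
    """Return a random (version 4) UUID as a 32-character hex string."""
    return uuid.uuid4().hex


def new_time_ordered_key() -> uuid.UUID:
    """Return a version 1 UUID, which embeds a timestamp and a node identifier."""
    return uuid.uuid1()


key = new_row_key()
time_key = new_time_ordered_key()
```

A version 4 key carries no ordering information, while a version 1 key embeds the time at which it was generated. Which one fits depends on whether the keys ever need to be sorted by time.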

Re: Thrift vs. Hector

2010-04-27 Thread Ran Tavory
Hi David, I have a few questions (and answers), see inline On Tue, Apr 27, 2010 at 12:49 PM, David Boxenhorn da...@lookin2.com wrote: Hi all, I'm trying to install a Cassandra development environment for Java. It is much harder than I thought it would be I got Cassandra up and running

Re: Thrift vs. Hector

2010-04-27 Thread David Boxenhorn
Thanks, Ran. Very nice to meet you! Responses inline. Summary: I downloaded and unzipped Thrift and Hector. I included them in my project. What do I do now? On Tue, Apr 27, 2010 at 1:01 PM, Ran Tavory ran...@gmail.com wrote: Hi David, I have a few questions (and answers), see inline On Tue,

Re: Thrift vs. Hector

2010-04-27 Thread Ran Tavory
That was a lame joke - here it is (in the mail). I can use some help with it, but what we have now for hector is here http://wiki.github.com/rantav/hector/ And for thrift, at least the parts I know of: http://wiki.apache.org/cassandra/ThriftExamples but you

Re: Thrift vs. Hector

2010-04-27 Thread Ran Tavory
good link, thanks, can I get one done for hector as well? :) There are some wiki pages and blog posts in the above mentioned links so they are a good start for hector, then there are some unit tests, but I certainly would appreciate help from the community here. On Tue, Apr 27, 2010 at 1:25 PM,

Call for help - Documentation

2010-04-27 Thread Ran Tavory
Do you use Hector? Do you find it useful? Contribute back by helping others get started and learn. The hector dev team would appreciate help writing documentation for hector. So far we have a few blog posts, a wiki and unit-tests, but proper documentation, tutorials and examples aren't there

Re: Thrift vs. Hector

2010-04-27 Thread David Boxenhorn
Some background: I am part of a team which is working on a rather large internet program. It is currently using an RDBMS, but we want to start using Cassandra. I'm working in Eclipse. I downloaded and unzipped hector-0.6.0-11.zip and got a whole bunch of jars, most of which don't look like

Re: Thrift vs. Hector

2010-04-27 Thread David Boxenhorn
So I should get rid of my Thrift project? On Tue, Apr 27, 2010 at 2:11 PM, Ran Tavory ran...@gmail.com wrote: It includes thrift, yes. You need cassandra as well (the jar includes both client and server code) On Apr 27, 2010 2:03 PM, David Boxenhorn da...@lookin2.com wrote: Some

Re: Thrift vs. Hector

2010-04-27 Thread David Boxenhorn
Thanks for all your help, Ran! (I'll probably be needing more, later...) On Tue, Apr 27, 2010 at 2:28 PM, Ran Tavory ran...@gmail.com wrote: if you're speaking of an eclipse thrift project, then I don't think you need one, no. On Tue, Apr 27, 2010 at 2:14 PM, David Boxenhorn

detecting write retries

2010-04-27 Thread Maxim Grinev
Hi all, if the node that processes my write fails, I should retry my write. If my write increments some counter, the counter will be incremented several times instead of just once. Does Cassandra support any mechanism to identify write retries to avoid multiple execution? Maxim
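One client-side workaround, sketched here as an assumption rather than anything Cassandra provides for this, is to make each increment idempotent. The client picks a unique operation ID once, writes the delta under that ID, and reuses the same ID on every retry, so a repeated write overwrites instead of adding. The `CounterRow` class below is a hypothetical in-memory stand-in for a row whose columns hold individual increments.

```python
import uuid


class CounterRow:
    """In-memory stand-in for one row; each column stores a single increment."""

    def __init__(self):
        self.columns = {}  # operation ID -> delta

    def apply(self, op_id, delta):
        # Writing the same column again overwrites it, so a retry is harmless.
        self.columns[op_id] = delta

    def value(self):
        # The counter value is the sum of all recorded increments.
        return sum(self.columns.values())


row = CounterRow()
op_id = uuid.uuid4().hex        # chosen once by the client, reused on retries
row.apply(op_id, 1)
row.apply(op_id, 1)             # simulated retry after a failed node
row.apply(uuid.uuid4().hex, 1)  # a genuinely new increment
print(row.value())              # prints 2
```

The cost of this design is that reading the counter means summing columns, and the row grows with every increment unless old columns are periodically compacted into a single total.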

Re: Broken pipe

2010-04-27 Thread Jonathan Ellis
get_range_slices works fine in the system tests, so something is wrong on your client side. Some possibilities: - sending to a non-Thrift port - using a set of Thrift bindings incompatible with the one your server supports - mixing a framed client with a non-framed server or vice versa

Multiple keyspaces per application?

2010-04-27 Thread David Boxenhorn
I just saw this note from storage-conf.xml: Except in very unusual circumstances you will have one Keyspace per application. Why is that? I was thinking of putting our normal data and indexes in separate keyspaces so they could be maintained separately. What are the disadvantages of multiple

Re: error during snapshot

2010-04-27 Thread Lee Parker
Can anyone help with this? It is preventing me from getting backups of our cluster. Lee Parker On Mon, Apr 26, 2010 at 10:02 PM, Lee Parker l...@socialagency.com wrote: I was attempting to get a snapshot on our cassandra nodes. I get the following error every time I run nodetool ...

Re: error during snapshot

2010-04-27 Thread Eric Hauser
Have you read this? http://forums.sun.com/thread.jspa?messageID=9734530 I don't think EC2 instances have any swap. On Tue, Apr 27, 2010 at 10:16 AM, Lee Parker l...@socialagency.com wrote: Can anyone help with this? It is preventing me from

Querying by date range when using TimeUUIDType ColumnFamily?

2010-04-27 Thread Ed Anuff
Assuming a ColumnFamily with a CompareWith of TimeUUIDType, is it possible to call get_slice with an arbitrary date range? How would valid values for the start and finish attributes of the slice range be constructed? Thanks Ed

Re: error during snapshot

2010-04-27 Thread Lee Parker
Adding a swapfile fixed the error, but it doesn't look as though the process is even using the swap file at all. Lee Parker On Tue, Apr 27, 2010 at 9:49 AM, Eric Hauser ewhau...@gmail.com wrote: Have you read this? http://forums.sun.com/thread.jspa?messageID=9734530

Re: Multiple keyspaces per application?

2010-04-27 Thread banks
The only advantage is the RF is per keyspace On Tue, Apr 27, 2010 at 6:57 AM, David Boxenhorn da...@lookin2.com wrote: I just saw this note from storage-conf.xml: Except in very unusual circumstances you will have one Keyspace per application. Why is that? I was thinking of putting our

Re: Multiple keyspaces per application?

2010-04-27 Thread David Boxenhorn
Thanks!. er, what is RF? On Tue, Apr 27, 2010 at 6:50 PM, banks bankse...@gmail.com wrote: The only advantage is the RF is per keyspace On Tue, Apr 27, 2010 at 6:57 AM, David Boxenhorn da...@lookin2.com wrote: I just saw this note from storage-conf.xml: Except in very unusual

Re: Multiple keyspaces per application?

2010-04-27 Thread Miguel Verde
Replication Factor, the number of copies (replicas) of the data that Cassandra will store and an important number for quorum consistency calculations. On Tue, Apr 27, 2010 at 11:14 AM, David Boxenhorn da...@lookin2.com wrote: Thanks!. er, what is RF? On Tue, Apr 27, 2010 at 6:50 PM, banks
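To make the quorum calculation mentioned above concrete, here is a small sketch. The function names are hypothetical. The arithmetic is the usual majority rule (more than half of the replicas), plus the common rule of thumb that a read sees the latest write when the read and write replica counts together exceed the replication factor.

```python
def quorum(replication_factor: int) -> int:
    """Smallest number of replicas that forms a majority."""
    return replication_factor // 2 + 1


def reads_see_latest_write(replication_factor: int,
                           write_replicas: int,
                           read_replicas: int) -> bool:
    """True when every read set must overlap every write set."""
    return write_replicas + read_replicas > replication_factor


rf = 3
q = quorum(rf)
print(q)                                  # prints 2
print(reads_see_latest_write(rf, q, q))   # prints True
print(reads_see_latest_write(rf, 1, 1))   # prints False
```

Because the replication factor is set per keyspace, data that needs a different number of replicas would need its own keyspace, which is the advantage mentioned earlier in this thread.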

Re: Cassandra reverting deletes?

2010-04-27 Thread Joost Ouwerkerk
To check that rows are gone, I check that KeySlice.columns is empty. And as I mentioned, immediately after the delete job, this returns the expected number. Unfortunately I reproduced with QUORUM this morning. No node outages. I am going to try ALL to see if that changes anything, but I am

Re: error during snapshot

2010-04-27 Thread Jonathan Shook
The allocation of memory may have failed depending on the available virtual memory, whether or not the memory would have been subsequently accessed by the process. Some systems do the work of allocating physical pages only when they are accessed for the first time. I'm not sure if yours is one of

Re: Querying by date range when using TimeUUIDType ColumnFamily?

2010-04-27 Thread Justin Sanders
You're going to have to build TimeUUIDs based on the date range you are scanning. Problem is most UUID libraries build version 1 UUIDs based on the current time. I was able to get this working in Python by changing the library to allow me to pass in a time. This isn't safe for creating unique

ThriftTransportException using Ruby Gem 0.8.2 against Cassandra 0.6.1

2010-04-27 Thread Lucas Di Pentima
Hello, I'm importing some data on Cassandra, running only on my laptop, with all config values by default. After some time running the import script I've written (which includes some reads besides the import writes), I get the following error message and stack trace:

Re: ThriftTransportException using Ruby Gem 0.8.2 against Cassandra 0.6.1

2010-04-27 Thread Lucas Di Pentima
Thanks Ryan for the fast response! Can you explain to me why binding against 127.0.0.1 causes the problem? Maybe it's useful to point this out in the documentation so that users avoid this kind of setup. Thanks again! El 27/04/2010, a las 17:28, Ryan King escribió: It looks like you need

Re: ThriftTransportException using Ruby Gem 0.8.2 against Cassandra 0.6.1

2010-04-27 Thread Ryan King
On Tue, Apr 27, 2010 at 1:31 PM, Lucas Di Pentima lu...@di-pentima.com.ar wrote: Thanks Ryan for the fast response! Can you explain to me why binding against 127.0.0.1 causes the problem? Maybe it's useful to point this out in the documentation so that users avoid this kind of setup. Are

Re: ThriftTransportException using Ruby Gem 0.8.2 against Cassandra 0.6.1

2010-04-27 Thread Lucas Di Pentima
El 27/04/2010, a las 17:34, Ryan King escribió: On Tue, Apr 27, 2010 at 1:31 PM, Lucas Di Pentima lu...@di-pentima.com.ar wrote: Thanks Ryan for the fast response! Can you explain to me why binding against 127.0.0.1 causes the problem? Maybe it's useful to point this out in the

Re: ThriftTransportException using Ruby Gem 0.8.2 against Cassandra 0.6.1

2010-04-27 Thread Ryan King
On Tue, Apr 27, 2010 at 1:38 PM, Lucas Di Pentima lu...@di-pentima.com.ar wrote: Nope, I'm doing some tests locally on my notebook (Macbook OSX 10.6.3 w/4GB RAM). My script inserts several hundred thousand columns with stable speed, and then it exits throwing that exception. It's possible

Re: Cassandra reverting deletes?

2010-04-27 Thread Joost Ouwerkerk
Hmm... Even after deleting with cl.ALL, I'm getting data back for some rows after having deleted them. Which rows return data is inconsistent from one run of the job to the next. On Tue, Apr 27, 2010 at 1:44 PM, Joost Ouwerkerk jo...@openplaces.org wrote: To check that rows are gone, I check

Re: Querying by date range when using TimeUUIDType ColumnFamily?

2010-04-27 Thread Lee Parker
I have used the solution presented by Justin and it works just fine. When you construct a TimeUUID with a specific timestamp and use that for the start or finish of the range slice, cassandra will use the timestamp embedded in the UUID even if that specific UUID doesn't exist in the index. It is
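The approach described above can be sketched with Python's standard `uuid` module, without patching a library. The function name `time_uuid_for` is hypothetical, and zeroing the clock sequence and node fields is an assumption made for illustration. As noted earlier in this thread, UUIDs built this way are not safe to use as unique identifiers; they are only meant as range bounds.

```python
import uuid
from datetime import datetime, timezone

# 100-nanosecond intervals between the UUID epoch (1582-10-15)
# and the Unix epoch (1970-01-01).
UUID_EPOCH_OFFSET = 0x01B21DD213814000
UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def time_uuid_for(moment: datetime) -> uuid.UUID:
    """Build a version 1 UUID whose timestamp is `moment` (timezone-aware).

    Clock sequence and node are zeroed (an assumption for this sketch),
    so the result is a slice bound, not a unique identifier.
    """
    delta = moment - UNIX_EPOCH
    # Integer arithmetic avoids floating-point rounding of the timestamp.
    ticks = ((delta.days * 86400 + delta.seconds) * 10_000_000
             + delta.microseconds * 10 + UUID_EPOCH_OFFSET)
    time_low = ticks & 0xFFFFFFFF
    time_mid = (ticks >> 32) & 0xFFFF
    time_hi_version = ((ticks >> 48) & 0x0FFF) | 0x1000  # version 1
    clock_seq_hi_variant = 0x80  # RFC 4122 variant, clock sequence 0
    return uuid.UUID(fields=(time_low, time_mid, time_hi_version,
                             clock_seq_hi_variant, 0x00, 0))


start = time_uuid_for(datetime(2010, 4, 27, tzinfo=timezone.utc))
finish = time_uuid_for(datetime(2010, 4, 28, tzinfo=timezone.utc))
```

The 16-byte values `start.bytes` and `finish.bytes` would then serve as the start and finish of the slice range. How columns that share the exact boundary timestamp are ordered depends on the server-side comparator, so treat the boundary instants themselves as approximate.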

Storage Layout Questions

2010-04-27 Thread Jonathan Shook
I'm trying to model a one-to-many set of data in which both sides of the relation may grow arbitrarily large. There are arbitrarily many FOOs. For each FOO, there are arbitrarily many BARs. Both types are modeled as an object, containing multiple fields (columns) in the application. Given a

Re: Cassandra reverting deletes?

2010-04-27 Thread Joost Ouwerkerk
Clocks are in sync: cluster04:~/cassandra$ dsh -g development date Tue Apr 27 17:36:33 EDT 2010 Tue Apr 27 17:36:33 EDT 2010 Tue Apr 27 17:36:33 EDT 2010 Tue Apr 27 17:36:33 EDT 2010 Tue Apr 27 17:36:34 EDT 2010 Tue Apr 27 17:36:34 EDT 2010 Tue Apr 27 17:36:34 EDT 2010 Tue Apr 27 17:36:34 EDT

Is Hector a wrapper around thrift?

2010-04-27 Thread S Ahmed
Just trying to get my head wrapped around everything here, so bear with me :) So Thrift can spit out generated code for any language, be it C#, Java or Python etc. Hector is a higher level wrapper around the Java code generated by Thrift. Do I have this right? And Hector is probably the most

Re: ThriftTransportException using Ruby Gem 0.8.2 against Cassandra 0.6.1

2010-04-27 Thread Ryan King
On Tue, Apr 27, 2010 at 2:29 PM, Lucas Di Pentima lu...@di-pentima.com.ar wrote: El 27/04/2010, a las 18:11, Ryan King escribió: On Tue, Apr 27, 2010 at 1:38 PM, Lucas Di Pentima lu...@di-pentima.com.ar wrote: Nope, I'm doing some tests locally on my notebook (Macbook OSX 10.6.3 w/4GB

Re: batch_mutate - PHP

2010-04-27 Thread Jordan Pittier
Hi, Here is a working example : $mutation_map = array($key => array('Standard1' => array())); for ($column_name = 0; $column_name < $options['numcolumns']; $column_name++) { $column = new cassandra_Column(array('name' => $column_name, 'value' => 'put your data here', 'timestamp' => time()));

Re: ThriftTransportException using Ruby Gem 0.8.2 against Cassandra 0.6.1

2010-04-27 Thread Lucas Di Pentima
El 27/04/2010, a las 19:00, Ryan King escribió: On Tue, Apr 27, 2010 at 2:29 PM, Lucas Di Pentima lu...@di-pentima.com.ar wrote: El 27/04/2010, a las 18:11, Ryan King escribió: On Tue, Apr 27, 2010 at 1:38 PM, Lucas Di Pentima lu...@di-pentima.com.ar wrote: Nope, I'm doing some tests

Re: Multiple keyspaces per application?

2010-04-27 Thread Mark Robson
I can't see any advantage in using multiple keyspaces. It is highly unlikely that several applications would share the same Cassandra cluster in any nontrivial deployment. Things more important than replication-factor, such as partitioner and ring token distribution would be compromised by

Re: How to generate 'unique' identifiers for use in Cassandra

2010-04-27 Thread Mark Robson
2010/4/26 Roland Hänel rol...@haenel.me: Typically, in the SQL world we use things like AUTO_INCREMENT columns that let us create a unique key automatically if a row is inserted into a table. auto_increment is an antipattern; it adds an extra key which you don't need (usually). If your

Re: Querying by date range when using TimeUUIDType ColumnFamily?

2010-04-27 Thread Ed Anuff
Yes, Lucas was correct about the nature of my original question. I'm glad to hear that Justin's solution works, it makes for a much simpler schema. Ed On Tue, Apr 27, 2010 at 3:06 PM, Lucas Di Pentima lu...@di-pentima.com.ar wrote: El 27/04/2010, a las 18:23, Lee Parker escribió: I have

Re: How do you construct an index and use it, especially in Ruby

2010-04-27 Thread Bob Hutchison
embedded response, way down below... On 2010-04-26, at 12:56 PM, Ryan King wrote: On Sun, Apr 25, 2010 at 11:14 AM, Bob Hutchison hutch-li...@recursive.ca wrote: Hi, I'm new to Cassandra and trying to work out how to do something that I've implemented any number of times (e.g.

Re: detecting write retries

2010-04-27 Thread Jonathan Ellis
You'll want to follow https://issues.apache.org/jira/browse/CASSANDRA-580 On Tue, Apr 27, 2010 at 7:06 AM, Maxim Grinev ma...@grinev.net wrote: Hi all, if the node that processes my write fails, I should retry my write. If my write increments some counter, the counter will be incremented

Re: Problem with JVM? concurrent mode failure

2010-04-27 Thread Jonathan Ellis
We're working on this in https://issues.apache.org/jira/browse/CASSANDRA-1014 On Tue, Apr 27, 2010 at 12:28 PM, Daniel Gimenez danie...@gmail.com wrote: Hi everyone, several days ago I was doing some tests in a Cassandra installation and everything was right (few inserts, few deletions, few

Re: how to get apache cassandra version with thrift client ?

2010-04-27 Thread Jonathan Ellis
You'll want to create a ticket at https://issues.apache.org/jira/browse/CASSANDRA to add that, then. On Mon, Apr 26, 2010 at 9:46 PM, Shuge Lee shuge@gmail.com wrote: I know I can get thrift API version. However, I am writing a CLI for Cassandra in Python with readline support, and it will

Re: Problem with JVM? concurrent mode failure

2010-04-27 Thread Brandon Williams
On Tue, Apr 27, 2010 at 7:05 PM, Jonathan Ellis jbel...@gmail.com wrote: We're working on this in https://issues.apache.org/jira/browse/CASSANDRA-1014 There's an easy workaround noted in the ticket if you're willing to sacrifice a bit of performance: use batch mode instead of periodic for

Re: How to generate 'unique' identifiers for use in Cassandra

2010-04-27 Thread Shuge Lee
import uuid; unique_key = uuid.uuid4() if you are using Python. 2010/4/28 Mark Robson mar...@gmail.com 2010/4/26 Roland Hänel rol...@haenel.me: Typically, in the SQL world we use things like AUTO_INCREMENT columns that let us create a unique key automatically if a row is inserted into a

How to permanently delete one key ?

2010-04-27 Thread Jeff Zhang
Hi all, I use the thrift api to remove one key, and then use the get_range_slices to get all the keys and find that the key which I deleted is still there. I refer to the thrift api doc which says get_range_slices will apply to all the keys including the deleted keys. So my question is how can I

Re: Is Hector a wrapper around thrift?

2010-04-27 Thread Jeff Zhang
Yes, Hector is a higher level wrapper around the Java Thrift API, also with other features such as connection pooling. Not sure whether there's something similar in Python. On Wed, Apr 28, 2010 at 5:54 AM, S Ahmed sahmed1...@gmail.com wrote: Just trying to get my head wrapped around everything

Re: error during snapshot

2010-04-27 Thread Lee Parker
So, after reading the thread which Eric posted earlier, I have created a workaround for the issue. In my backup script, I add a swapfile with swapon, tell cassandra to create the snapshots, then remove the swapfile with swapoff. Then I continue with the rest of the work the backup script needs

Re: How to permanently delete one key ?

2010-04-27 Thread Greg Lu
Hey Jeff, I think this article addresses your question: http://spyced.blogspot.com/2010/02/distributed-deletes-in-cassandra.html -Greg On Tue, Apr 27, 2010 at 10:14 PM, Jeff Zhang zjf...@gmail.com wrote: Hi all, I use the thrift api to remove one key, and then use the get_range_slices to