Re: Node Inconsistency

2011-01-11 Thread Peter Schuller
I have a few more questions: 1. If we change the write/delete consistency level to ALL, do we eliminate the data inconsistency among nodes (since the delete operations will apply to ALL replicas)? 2. My understanding is that Read Repair doesn't handle tombstones. How about Node Tool Repair

map-reduce failure

2011-01-11 Thread Or Yanay
Hi all, I am using 0.6.8 across 5 machines with ~30G of data on each machine. I am trying to run a map-reduce query (Both with my own Java code and Pig) and failing after about 30 minutes (see stack trace and details below). I have followed this wiki

Confused about CASSANDRA-1417; saving row cache

2011-01-11 Thread Chris Burroughs
https://issues.apache.org/jira/browse/CASSANDRA-1417 http://www.riptano.com/blog/whats-new-cassandra-066 My naive reading of CASSANDRA-1417 was that it could be used to save the row cache to disk. Empirically it appears to only save the row keys, and then reads each row. In my case I set the

Re: Cassandra 0.7.0 Release in Riptano public repository?

2011-01-11 Thread Eric Evans
On Tue, 2011-01-11 at 09:23 -0500, Michael Fortin wrote: This is my understanding of 0.* releases. - They're not considered production ready by the maintainers - They're subject to changes that break backwards compatibility - Generally poorly documented because the API is so volatile - Previous

[RELEASE] 0.7.0 (and 0.6.9)

2011-01-11 Thread Eric Evans
As some of you may already be aware, 0.7.0 has been officially released. You are free to start your upgrades, though not all at once, you'll spoil your supper! I apologize to anyone that might have noticed artifacts published as early as Sunday and were confused by the lack of announcement, I

Re: Confused about CASSANDRA-1417; saving row cache

2011-01-11 Thread Peter Schuller
https://issues.apache.org/jira/browse/CASSANDRA-1417 [snip, row cache saving only keys] Is this the intentional implementation? Is there any reason not to just save the entire row to disk to allow for faster startup? Intentional (in the sense of not a mistake), but see:

Re: [RELEASE] 0.7.0 (and 0.6.9)

2011-01-11 Thread Joseph Stein
Many thanks to those that put in all the hard work, time, dedication, etc for another awesome release !!! /* Joe Stein http://www.linkedin.com/in/charmalloc Twitter: @allthingshadoop */ On Tue, Jan 11, 2011 at 12:23 PM, Eric Evans eev...@rackspace.com wrote: As some of you may already be

Re: upgrading to 0.7 from 0.6.x

2011-01-11 Thread Peter Schuller
The process to upgrade is: 1) run nodetool drain on _each_ 0.6 node. When drain finishes (log message Node is drained appears), stop the process. 2) Convert your storage-conf.xml to the new cassandra.yaml using bin/config-converter. 3) Rename any of your keyspace

Advice wanted on modeling

2011-01-11 Thread Steven Mac
Hi, I've been experimenting quite a bit with Cassandra and think I'm getting to understand it, but I would like some advice on modeling my data in Cassandra for an application I'm developing. The application will have a large number of records, with the records consisting of a fixed part and

Ruby database migrations for Cassandra - ActiveColumn

2011-01-11 Thread Mike Wynholds
Happy new year all- I just wanted to mention that I have released a new Cassandra data management gem called ActiveColumn. The first major feature is ActiveRecord-like database migrations. The gem is young but it works and is well documented, and I'm very interested in feedback. Blog post:

Re: Ruby database migrations for Cassandra - ActiveColumn

2011-01-11 Thread Jonathan Ellis
Nice work, Mike! I tweeted your project a few days ago. :) On Tue, Jan 11, 2011 at 12:18 PM, Mike Wynholds m...@carbonfive.com wrote: Happy new year all- I just wanted to mention that I have released a new Cassandra data management gem called ActiveColumn.  The first major feature is

Re: Ruby database migrations for Cassandra - ActiveColumn

2011-01-11 Thread Ryan King
Awesome and great to see you're using our fauna cassandra gem. :) -ryan On Tue, Jan 11, 2011 at 10:18 AM, Mike Wynholds m...@carbonfive.com wrote: Happy new year all- I just wanted to mention that I have released a new Cassandra data management gem called ActiveColumn.  The first major

Re: Cassandra 0.7.0 Release in Riptano public repository?

2011-01-11 Thread Michael Fortin
Thanks for your thoughtful and detailed replies, Eric, it's much appreciated. Mike On Jan 11, 2011, at 11:23 AM, Eric Evans wrote: On Tue, 2011-01-11 at 09:23 -0500, Michael Fortin wrote: This is my understanding of 0.* releases. - They're not considered production ready by the maintainers -

Re: Confused about CASSANDRA-1417; saving row cache

2011-01-11 Thread Chris Burroughs
On 01/11/2011 12:23 PM, Peter Schuller wrote: Is this the intentional implementation? Is there any reason not to just save the entire row to disk to allow for faster startup? Intentional (in the sense of not a mistake), but see: https://issues.apache.org/jira/browse/CASSANDRA-1625 The

RE: Need some beginner help with Eclipse+Hector with Cassandra 0.7

2011-01-11 Thread tamara.alexander
What about this logger error? I'm getting it too, and I am also running simple code with Hector and Eclipse: log4j:WARN No appenders could be found for logger (me.prettyprint.cassandra.connection.CassandraHostRetryService). log4j:WARN Please initialize the log4j system properly. log4j:WARN See

Re: Need some beginner help with Eclipse+Hector with Cassandra 0.7

2011-01-11 Thread Nate McCall
Add -verbose to the command-line options for the launch configuration so you can see the classpath. It sounds like log4j.properties is not being found. (Depending on your project setup, you may need to add this file to the classpath explicitly). On Tue, Jan 11, 2011 at 1:36 PM,

Re: Confused about CASSANDRA-1417; saving row cache

2011-01-11 Thread Peter Schuller
But now I need two knobs:  Max size of row cache (best optimal steady state hit rate) and number of row cache items to read in on startup (so that the ROW-READ-STAGE does not need to drop packets and node can be restarted in a reasonable amount of time). Good idea IMO. File a jira ticket? --

Re: Confused about CASSANDRA-1417; saving row cache

2011-01-11 Thread Peter Schuller
This makes total sense and is obvious in hindsight. But wouldn't such a hypothetical stale row cache be corrected by read repair (in other words, useless for write-heavy workloads, not a problem for read-heavy ones)? It's not quite that simple. For example, suppose you write to the cluster at

Re: Confused about CASSANDRA-1417; saving row cache

2011-01-11 Thread Chris Burroughs
On 01/11/2011 02:56 PM, Peter Schuller wrote: But now I need two knobs: Max size of row cache (best optimal steady state hit rate) and number of row cache items to read in on startup (so that the ROW-READ-STAGE does not need to drop packets and node can be restarted in a reasonable amount of

Why my posts are marked as spam?

2011-01-11 Thread Oleg Tsvinev
Whatever I do, it happens :(

Re: Node Inconsistency

2011-01-11 Thread Vram Kouramajian
Thanks, Peter, for the reply. We are currently fixing our inconsistent data (since we have master data saved). We will follow your suggestion and run the Node Repair tool more often in the future. However, what happens to data inserted/deleted after the Node Repair tool runs (i.e., between Node

Re: how to do a get_range_slices where all keys start with same string

2011-01-11 Thread Tyler Hobbs
That type of operation only works (directly) when using an OrderPreservingPartitioner. There are a lot of downsides to OPP: http://ria101.wordpress.com/2010/02/22/cassandra-randompartitioner-vs-orderpreservingpartitioner/ You can instead order your keys alphabetically as column names in a row
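The idea of keeping keys as sorted column names in a single index row can be sketched roughly as below. This is only an illustration in plain Python, not Cassandra client code: `column_slice` is a hypothetical stand-in for a column slice over one row, the half-open range is a simplification, and the upper bound `prefix + "\uffff"` assumes the names contain no characters above U+FFFF.

```python
from bisect import bisect_left
from typing import List


def column_slice(sorted_columns: List[str], start: str, finish: str) -> List[str]:
    """Hypothetical stand-in for a column slice: names in [start, finish)."""
    lo = bisect_left(sorted_columns, start)
    hi = bisect_left(sorted_columns, finish)
    return sorted_columns[lo:hi]


def columns_with_prefix(sorted_columns: List[str], prefix: str) -> List[str]:
    """All column names starting with `prefix`, fetched with one range slice."""
    # Assumption: no column name contains a character above U+FFFF,
    # so prefix + "\uffff" sorts after every name that starts with prefix.
    return column_slice(sorted_columns, prefix, prefix + "\uffff")


# One "index row" whose column names are the original row keys, kept sorted.
index_row = sorted([
    "com.apple", "com.google", "com.google.mail",
    "com.google.maps", "com.yahoo", "org.apache",
])
google_keys = columns_with_prefix(index_row, "com.google")
```

Because columns within a row are stored in comparator order regardless of the partitioner, this kind of prefix lookup does not depend on OPP; the trade-off is that the whole index lives in one row.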

RE: about the insert data

2011-01-11 Thread raoyixuan (Shandy)
Thanks, Tyler. So the node I connect to is the coordinator node, right? But what's the process for the replicas? Firstly, the data will be inserted by the coordinator node. Secondly, it will find the first replica node based on the partitioner, such as RandomPartitioner. Thirdly, it will replicate

Re: how to do a get_range_slices where all keys start with same string

2011-01-11 Thread Jonathan Ellis
http://wiki.apache.org/cassandra/FAQ#range_rp Also, start==end==x means give me back exactly row x, if it exists. If you were using OPP you'd need end=y. On Tue, Jan 11, 2011 at 7:45 PM, Koert Kuipers koert.kuip...@diamondnotch.com wrote: I would like to do a get_range_slices for all keys

RE: how to do a get_range_slices where all keys start with same string

2011-01-11 Thread Koert Kuipers
Ok I see get_range_slice is really only useful for paging with RP... So if I were using OPP (which I am not) and I wanted all keys starting with com.google, what should my start_key and end_key be? -Original Message- From: Jonathan Ellis [mailto:jbel...@gmail.com] Sent: Tuesday,

Re: how to do a get_range_slices where all keys start with same string

2011-01-11 Thread Roshan Dawrani
On Wed, Jan 12, 2011 at 7:41 AM, Koert Kuipers koert.kuip...@diamondnotch.com wrote: Ok I see get_range_slice is really only useful for paging with RP... So if I were using OPP (which I am not) and I wanted all keys starting with com.google, what should my start_key and end_key be? I think

Re: Confused about CASSANDRA-1417; saving row cache

2011-01-11 Thread Chris Burroughs
On 2011-01-11 15:41, Chris Burroughs wrote: On 01/11/2011 02:56 PM, Peter Schuller wrote: But now I need two knobs: Max size of row cache (best optimal steady state hit rate) and number of row cache items to read in on startup (so that the ROW-READ-STAGE does not need to drop packets and node

Re: how to do a get_range_slices where all keys start with same string

2011-01-11 Thread Aaron Morton
If you were using OPP and get_range_slices then set the start_key to be "com.google" and the end_key to be "". Get it in slices of, say, 1,000 (use the last key read as the next start_key), and when you see the first key that does not start with com.google, stop making calls. If you move the data from rows
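A minimal sketch of the paging loop described above, assuming keys come back in sorted order as they would under OPP. The `get_range_slices` function here is a hypothetical in-memory stand-in, not the Thrift call or any client library's signature; it only mimics an inclusive start key, an empty end key meaning "no upper bound", and a row count limit.

```python
from bisect import bisect_left
from typing import Iterator, List


def get_range_slices(sorted_keys: List[str], start_key: str,
                     end_key: str, count: int) -> List[str]:
    """Hypothetical stand-in: up to `count` keys >= start_key, in order.

    An empty end_key means there is no upper bound.
    """
    out: List[str] = []
    for key in sorted_keys[bisect_left(sorted_keys, start_key):]:
        if end_key and key > end_key:
            break
        out.append(key)
        if len(out) == count:
            break
    return out


def keys_with_prefix(sorted_keys: List[str], prefix: str,
                     page_size: int = 1000) -> Iterator[str]:
    """Page through keys from `prefix` onward; stop at the first non-match."""
    if page_size < 2:
        # The start key is inclusive, so each later page repeats one key.
        raise ValueError("page_size must be at least 2")
    start, first_page = prefix, True
    while True:
        page = get_range_slices(sorted_keys, start, "", page_size)
        if not first_page:
            page = page[1:]  # drop the key already seen on the previous page
        if not page:
            return
        for key in page:
            if not key.startswith(prefix):
                return  # first key past the prefix: stop making calls
            yield key
        start, first_page = page[-1], False


all_keys = sorted([
    "com.apple", "com.google", "com.google.mail",
    "com.google.maps", "com.yahoo", "org.apache",
])
found = list(keys_with_prefix(all_keys, "com.google", page_size=2))
```

The small `page_size` in the usage line is only there to exercise the page boundary; with real data a larger slice such as the 1,000 suggested above would mean fewer round trips.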

unsubscribe

2011-01-11 Thread Nichole Kulobone

RE: [RELEASE] 0.7.0 (and 0.6.9)

2011-01-11 Thread Viktor Jevdokimov
Congratulations!!! Best regards/ Pagarbiai Viktor Jevdokimov Senior Developer Email: viktor.jevdoki...@adform.com Phone: +370 5 212 3063 Fax: +370 5 261 0453 Konstitucijos pr. 23, LT-08105 Vilnius, Lithuania Disclaimer: The information contained in this message and attachments is intended

Re: Why my posts are marked as spam?

2011-01-11 Thread Arijit Mukherjee
I think this happens for RTF. Some of the mails in the post are RTF, and the reply button creates an RTF reply - that's when it happens. Wonder how the mail to which I replied was in RTF... Arijit On 12 January 2011 05:28, Oleg Tsvinev oleg.tsvi...@gmail.com wrote: Whatever I do, it happens :(

Re: [RELEASE] 0.7.0 (and 0.6.9)

2011-01-11 Thread Shinpei Ohtani
Congratulations for 0.7 and also 0.6.9!!! On Wed, Jan 12, 2011 at 3:29 PM, Viktor Jevdokimov viktor.jevdoki...@adform.com wrote: Congratulations!!! Best regards/ Pagarbiai Viktor Jevdokimov Senior Developer Email: viktor.jevdoki...@adform.com Phone: +370 5 212 3063 Fax: +370 5 261 0453

Re: how to do a get_range_slices where all keys start with same string

2011-01-11 Thread Arijit Mukherjee
I have a follow-on question on this. I have a super column family like this: <ColumnFamily Name="EventSpace" CompareWith="TimeUUIDType" CompareSubcolumnsWith="BytesType" ColumnType="Super"/> I store some events keyed by a subscriber id, and for each such row, I have a number of super columns which are