read repair keeps occurring on every quorum read

2009-09-25 Thread Edmond Lau
I have a 3 node cluster with a replication factor of 2, running on 0.4 RC1. I've set both my read and write consistency levels to use a quorum. I'm observing that quorum reads keep invoking read repair and log DigestMismatchExceptions from the StorageProxy. Obviously, this significantly reduces
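
A note on the arithmetic behind this setup, as a minimal sketch: it assumes a quorum means a simple majority of the replicas for a key, i.e. floor(RF / 2) + 1. The function name `quorum_size` is hypothetical and is not part of any Cassandra API.

```python
def quorum_size(replication_factor: int) -> int:
    """Number of replicas that must respond for a quorum operation.

    Assumes "quorum" means a simple majority of the replicas.
    """
    return replication_factor // 2 + 1


# With a replication factor of 2, a majority is both replicas,
# so a quorum read or write cannot tolerate either replica being down.
rf2 = quorum_size(2)
rf3 = quorum_size(3)
```

Under this assumption, a replication factor of 2 gives no availability benefit at quorum, while a replication factor of 3 tolerates one unavailable replica.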

Re: read repair keeps occurring on every quorum read

2009-09-30 Thread Edmond Lau
On Mon, Sep 28, 2009 at 4:30 PM, Edmond Lau edm...@ooyala.com wrote: On Fri, Sep 25, 2009 at 8:10 PM, Jonathan Ellis jbel...@gmail.com wrote: No, you're mixing two related concepts. When you do a quorum read it will fetch the actual data from one replica and do digest reads from the others
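
The data-plus-digest pattern described in the reply above can be sketched roughly as follows. This is an illustration only, not the StorageProxy code: the names `digest` and `quorum_read` are hypothetical, replicas are modeled as plain byte strings, and MD5 is used simply as a convenient hash for the example.

```python
import hashlib


def digest(value: bytes) -> bytes:
    # Hash of the replica's value; the hash choice is an assumption for this sketch.
    return hashlib.md5(value).digest()


def quorum_read(replica_values):
    """Simulate a quorum read over the contacted replicas.

    replica_values[0] stands in for the replica that returns the full data;
    the remaining entries stand in for replicas that return only a digest.
    Returns (data, mismatch), where mismatch=True means a repair is needed.
    """
    data = replica_values[0]
    expected = digest(data)
    other_digests = [digest(v) for v in replica_values[1:]]
    mismatch = any(d != expected for d in other_digests)
    return data, mismatch


# Replicas agree: no mismatch is reported.
same_data, same_mismatch = quorum_read([b"v1", b"v1"])
# Replicas disagree: the digest comparison flags a mismatch.
diff_data, diff_mismatch = quorum_read([b"v2", b"v1"])
```

In this model a mismatch is the only thing that would trigger a repair, so replicas that already hold identical data should not produce one on every read.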

backing up data from cassandra

2009-10-04 Thread Edmond Lau
For folks who are using or considering using cassandra in their production systems, what do you use for backups? With HBase, one could potentially write a mapreduce to perform a row scan of the entire table (restricted to some historical timestamp to get a consistent view) and export the data to

Re: backing up data from cassandra

2009-10-06 Thread Edmond Lau
Thanks for the replies guys. It sounds like restoration via snapshots + some application-side logic to sanity check/repair any data around the snapshot time is the way to go. Edmond On Mon, Oct 5, 2009 at 10:15 AM, Jonathan Ellis jbel...@gmail.com wrote: On Mon, Oct 5, 2009 at 11:23 AM,

why doesn't get_slice() support reading a slice of columns from a slice of supercolumns?

2009-10-06 Thread Edmond Lau
The get_slice() API takes a ColumnParent and a predicate, which means that I can either 1) pick one super column and slice its constituent subcolumns or 2) slice the super columns and pick up all constituent subcolumns. What I'd like to do instead is to take a slice of supercolumns and then a
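
The operation asked about in the subject line (a slice of columns from a slice of supercolumns) can be approximated on the client side. The sketch below is not the Thrift API: it models a row as a nested dict of supercolumn name to subcolumns, and the helper names `slice_sorted` and `slice_of_slices` are hypothetical.

```python
def slice_sorted(mapping, start, finish):
    """Return the entries whose keys fall in [start, finish], in key order."""
    return {k: mapping[k] for k in sorted(mapping) if start <= k <= finish}


def slice_of_slices(row, sc_start, sc_finish, col_start, col_finish):
    """Slice the supercolumns of a row, then slice each one's subcolumns."""
    supercolumns = slice_sorted(row, sc_start, sc_finish)
    return {
        name: slice_sorted(columns, col_start, col_finish)
        for name, columns in supercolumns.items()
    }


# Toy row: supercolumn name -> {subcolumn name -> value}
row = {
    "a": {"x": 1, "y": 2, "z": 3},
    "b": {"x": 4, "z": 5},
    "c": {"y": 6},
}
result = slice_of_slices(row, "a", "b", "x", "y")
```

Doing this on the client means fetching every subcolumn of the selected supercolumns first, which is the cost a server-side version of the call would avoid.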

Re: cassandra fatal error - The name should match the name of the current column or super column

2009-10-15 Thread Edmond Lau
out? -Jonathan On Thu, Oct 15, 2009 at 12:51 PM, Edmond Lau edm...@ooyala.com wrote: I'm using the cassandra 0.4 release.  I was loading a bunch of data into cassandra when the thrift api started throwing UnavailableExceptions.  Checking the logs, I found errors that looked like the following

Re: cassandra fatal error - The name should match the name of the current column or super column

2009-10-16 Thread Edmond Lau
Jonathan - I patched in your latest change that dropped the assertions and tried to restart my cluster on my old data. 2 of 5 nodes still failed to start, with different errors. One dies with a generic EOFException during recovery: INFO - Compacting

repeated timeouts on quorum reads

2009-10-19 Thread Edmond Lau
Whenever I try to do a quorum read on a row with a particularly large supercolumn with get_slice under high load, cassandra throws timeouts. The reads for that row repeatedly fail until load decreases, but smaller reads still succeed during that time. bin/nodeprobe info shows that the read

Re: cassandra fatal error - The name should match the name of the current column or super column

2009-10-19 Thread Edmond Lau
On Mon, Oct 19, 2009 at 7:11 PM, Edmond Lau edm...@ooyala.com wrote: I wasn't able to apply the patch and reuse the old tables, but after nuking the data, I'm no longer running into the issue anymore. On Fri, Oct 16, 2009 at 3:16 PM, Jonathan Ellis jbel...@gmail.com wrote: Those are both

Re: repeated timeouts on quorum reads

2009-10-19 Thread Edmond Lau
at 5:36 PM, Jonathan Ellis jbel...@gmail.com wrote: How much of the row that fails are you trying to read at once? On Mon, Oct 19, 2009 at 7:30 PM, Edmond Lau edm...@ooyala.com wrote: Whenever I try to do a quorum read on a row with a particularly large supercolumn with get_slice under high load

Re: repeated timeouts on quorum reads

2009-10-19 Thread Edmond Lau
Comments inline. On Mon, Oct 19, 2009 at 6:33 PM, Jonathan Ellis jbel...@gmail.com wrote: On Mon, Oct 19, 2009 at 8:20 PM, Edmond Lau edm...@ooyala.com wrote: On Mon, Oct 19, 2009 at 6:01 PM, Jonathan Ellis jbel...@gmail.com wrote: are there many rows like this? No - just a handful.  I'm

Re: repeated timeouts on quorum reads

2009-10-21 Thread Edmond Lau
the repair should've been unnecessary. Thanks for your help, Edmond -Jonathan On Mon, Oct 19, 2009 at 11:38 PM, Edmond Lau edm...@ooyala.com wrote: A single local read with debug logging takes 3-4 secs on the node with 3 data files.  It actually takes roughly as long on the other nodes

Re: repeated timeouts on quorum reads

2009-10-22 Thread Edmond Lau
Ellis jbel...@gmail.com wrote: What is your columnfamily definition?  What query should I test with? On Wed, Oct 21, 2009 at 7:33 PM, Edmond Lau edm...@ooyala.com wrote: On Tue, Oct 20, 2009 at 2:03 PM, Jonathan Ellis jbel...@gmail.com wrote: Okay, so we have 2 problems:  - the read is simply

Re: repeated timeouts on quorum reads

2009-10-22 Thread Edmond Lau
CompareWith=BytesType Name=Movie/ On Thu, Oct 22, 2009 at 12:04 PM, Jonathan Ellis jbel...@gmail.com wrote: On Thu, Oct 22, 2009 at 10:59 AM, Edmond Lau edm...@ooyala.com wrote: Try: keyspace: Analytics key: o:movie column family: movie super column: all I was able to get timeouts with a few

Re: repeated timeouts on quorum reads

2009-10-22 Thread Edmond Lau
Thanks for the help Jonathan. Given that the current implementation isn't optimized for large supercolumns and given that the current thrift api doesn't support slicing a set of columns across multiple supercolumns of the same row anyway, I agree that I'd be better off just folding my

on bootstrapping a node

2009-10-27 Thread Edmond Lau
I'd like to improve my mental model of how Cassandra bootstrapping works. My understanding is that bootstrapping is just an extra step during a node's startup where the node copies data from neighboring nodes that, according to its token, it should own; afterwards, the node behaves like any other

Re: on bootstrapping a node

2009-10-28 Thread Edmond Lau
On Tue, Oct 27, 2009 at 10:02 PM, Jonathan Ellis jbel...@gmail.com wrote: On Tue, Oct 27, 2009 at 10:10 PM, Edmond Lau edm...@ooyala.com wrote: I'd like to improve my mental model of how Cassandra bootstrapping works.  My understanding is that bootstrapping is just an extra step during

Re: can't write with consistency level of one after some nodes fail

2009-10-29 Thread Edmond Lau
On Thu, Oct 29, 2009 at 1:20 PM, Jonathan Ellis jbel...@gmail.com wrote: On Thu, Oct 29, 2009 at 1:18 PM, Edmond Lau edm...@ooyala.com wrote: I have a freshly started 3-node cluster with a replication factor of 2. If I take down two nodes, I can no longer do any writes, even

Re: can't write with consistency level of one after some nodes fail

2009-10-29 Thread Edmond Lau
) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:619) Oddly, a write with a consistency level of QUORUM succeeds for certain keys (but fails with others) even though I only have one live node. Edmond On Thu, Oct 29, 2009 at 1:38 PM, Edmond Lau edm

Re: can't write with consistency level of one after some nodes fail

2009-10-29 Thread Edmond Lau
at 3:37 PM, Edmond Lau edm...@ooyala.com wrote: I've updated to trunk, and I'm still hitting the same issue but it's manifesting itself differently.  Again, I'm running with a freshly started 3-node cluster with a replication factor of 2.  I then take down two nodes. If I write

Re: can't write with consistency level of one after some nodes fail

2009-10-29 Thread Edmond Lau
CASSANDRA-524 On Thu, Oct 29, 2009 at 2:56 PM, Edmond Lau edm...@ooyala.com wrote: Will do. On Thu, Oct 29, 2009 at 2:39 PM, Jonathan Ellis jbel...@gmail.com wrote: can you create a ticket for this? On Thu, Oct 29, 2009 at 3:39 PM, Jonathan Ellis jbel...@gmail.com wrote: I've seen that bug

difference between memtable and binary memtable?

2009-11-03 Thread Edmond Lau
What's the difference between the memtable and the binary memtable in cassandra? I've seen binary memtable used in discussions relating to MapReduce. When is each one used? Edmond

Re: newbie questions about Cassandra J. Ellis presentation

2009-11-13 Thread Edmond Lau
On Fri, Nov 13, 2009 at 12:20 AM, TuxRacer69 tuxrace...@gmail.com wrote: Hello Cassandra Users! On the wiki page: http://wiki.apache.org/cassandra/ArticlesAndPresentations there is a link to J. Ellis pdf:

Re: cassandra mangling non-ascii keys

2009-12-07 Thread Edmond Lau
/browse/THRIFT-395, https://issues.apache.org/jira/browse/THRIFT-414) is that this won't be fixed any time soon. -Jonathan On Mon, Dec 7, 2009 at 4:56 PM, Edmond Lau edm...@ooyala.com wrote: This particular client was in Ruby. On Mon, Dec 7, 2009 at 2:49 PM, Jonathan Ellis jbel...@gmail.com wrote

Re: Would deleted columns slow down reads?

2010-02-25 Thread Edmond Lau
-row column indexes.) On Thu, Feb 25, 2010 at 8:56 PM, Edmond Lau edm...@ooyala.com wrote: Given that Cassandra needs to maintain tombstones to handle distributed deletes, does the existence of deleted columns slow down slices? To be more concrete, suppose I used a row as a queue.  I keep