Reached an EOL or something bizzare occured.

2010-04-07 Thread 叶江
Hi: I set up a cluster with 2 nodes, and when I insert the data, something goes wrong. This is my main code: for (int i = 0; i < 500; i++) { String tmp = age + i; client.insert(Keyspace1, key_user_id, new

Re: Net::Cassandra::Easy deletion failed

2010-04-07 Thread Ted Zlatanov
On Tue, 06 Apr 2010 14:14:55 -0700 Mike Gallamore mike.e.gallam...@googlemail.com wrote: Great it works. Or at least the Cassandra/thrift part seems to work. My tests don't pass but I think it is actual logic errors in the test now, the column does appear to be getting cleared okay

Re: Reached an EOL or something bizzare occured.

2010-04-07 Thread Jonathan Ellis
Upgrade to 0.6. On Wed, Apr 7, 2010 at 8:52 AM, 叶江 yejiang...@gmail.com wrote: hi: i setup a cluster with 2 nodes,and when i insert the data ,something wrong happened . This is my major code: for(int i = 0; i < 500; i++) { String tmp = age + i;

Cassandra cluster does not tolerate single node failure

2010-04-07 Thread Oleg Anastasjev
Hello, I am doing some tests of Cassandra cluster behavior in several failure scenarios, and I am stuck on the very first test: what happens if 1 node of the cluster becomes unavailable. I have 4 4gb nodes loaded with a write-mostly test. Normally it works at a rate of about 12000 ops/second.

Re: Cassandra cluster does not tolerate single node failure

2010-04-07 Thread Jonathan Ellis
This is a known problem with 0.5 that was addressed in 0.6. On Wed, Apr 7, 2010 at 9:18 AM, Oleg Anastasjev olega...@gmail.com wrote: Hello, I am doing some tests of Cassandra cluster behavior in several failure scenarios, and I am stuck on the very first test: what happens if 1 node of

Re: Inconsistency when unit testing

2010-04-07 Thread Sylvain Lebresne
Use ConsistencyLevel.QUORUM when you write *and* when you read. On Wed, Apr 7, 2010 at 5:26 PM, Philip Jackson p...@shellarchive.co.uk wrote: Hi, To summarise my app: * try to get item from UserUrl cf * if not found then check in the Url cf to see if we have fetched url before and
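
The advice above rests on simple arithmetic: a quorum is a strict majority of the replicas, so any quorum read must overlap any quorum write in at least one replica. The following is only a sketch of that arithmetic, not Cassandra's own code; the class name, method names, and the replication factor of 3 are hypothetical and chosen for illustration.

```java
public class QuorumSketch {
    /** Replicas that must acknowledge a QUORUM operation: a strict majority. */
    public static int quorum(int replicationFactor) {
        return replicationFactor / 2 + 1;
    }

    /**
     * True when every set of read replicas must share at least one replica
     * with every set of write replicas (R + W > N).
     */
    public static boolean readOverlapsWrite(int readReplicas, int writeReplicas,
                                            int replicationFactor) {
        return readReplicas + writeReplicas > replicationFactor;
    }

    public static void main(String[] args) {
        int rf = 3; // hypothetical replication factor, for illustration only
        int q = quorum(rf);
        System.out.println("quorum for RF=" + rf + ": " + q);
        System.out.println("QUORUM write + QUORUM read overlap: "
                + readOverlapsWrite(q, q, rf));
        System.out.println("ONE write + ONE read overlap: "
                + readOverlapsWrite(1, 1, rf));
    }
}
```

With a single node and a replication factor of 1, the quorum is 1, so every read and write touches the same replica.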

Cassandra cluster does not tolerate single node failure good

2010-04-07 Thread Oleg Anastasjev
Hello, I am doing some tests of Cassandra cluster behavior in several failure scenarios, and I am stuck on the very first test: what happens if 1 node of the cluster becomes unavailable. I have 4 4gb nodes loaded with a write-mostly test. Replication Factor is 2. Normally it works at a rate of about

Re: Inconsistency when unit testing

2010-04-07 Thread Philip Jackson
At Wed, 7 Apr 2010 17:29:49 +0200, Sylvain Lebresne wrote: Use ConsistencyLevel.QUORUM when you write *and* when you read. I already do (plus, I only test with one node). BTW, I'm on 0.5.0, if that makes any difference. Cheers, Phil

Re: Cassandra cluster does not tolerate single node failure good

2010-04-07 Thread Jonathan Ellis
Isn't this the same question I just answered? On Wed, Apr 7, 2010 at 10:35 AM, Oleg Anastasjev olega...@gmail.com wrote: Hello, I am doing some tests of Cassandra cluster behavior in several failure scenarios, and I am stuck on the very first test: what happens if 1 node of the cluster

Re: ConsistencyLevel.ZERO

2010-04-07 Thread Paul Prescod
Is it planned that Cassandra will eventually be able to handle a buffer overflow without crashing? Is this related to CASSANDRA-685 (Add backpressure to StorageProxy)? Now that we have CASSANDRA-401 and CASSANDRA-488 there is one last piece: we need to stop the target node from pulling mutations

Re: Cassandra cluster does not tolerate single node failure good

2010-04-07 Thread Oleg Anastasjev
Jonathan Ellis jbellis at gmail.com writes: Isn't this the same question I just answered? Umm, I am not sure. I looked over the last 3 days of your replies and did not find my case. Could you give me a clue, please?

Re: Bug in Cassandra that occurs when removing a supercolumn.

2010-04-07 Thread Matthew Grogan
I am seeing a similar problem running on 0.6 rc1. The data/logs have existed since 0.5. If I insert a new row then delete and re-insert then it works fine. If I delete a row that was created under 0.5 then delete and re-insert then the insert silently fails. I can delete the data/logs and

Re: OrderPreservingPartitioner limits and workarounds

2010-04-07 Thread Benjamin Black
I'd suggest you use RandomPartitioner, an index, and multiget. You'll be able to do range queries and won't have the load imbalance and performance problems of OPP and native range queries. b On Wed, Apr 7, 2010 at 3:51 AM, Paul Prescod p...@prescod.net wrote: I have one append-oriented

Re: OrderPreservingPartitioner limits and workarounds

2010-04-07 Thread Jonathan Ellis
One thing you can do is manually randomize keys for any CFs that don't need the OP by prepending their md5 to the key you send Cassandra. (This is all RP is doing under the hood anyway.) On Wed, Apr 7, 2010 at 5:51 AM, Paul Prescod p...@prescod.net wrote: I have one append-oriented workload
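
A minimal sketch of the key-randomizing idea described above, using only the Java standard library. The class name, the method names, and the ":" separator between the digest and the original key are assumptions made for illustration; the message itself only says to put the md5 in front of the key.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class Md5KeyPrefix {
    /** Lowercase hex MD5 digest of the key's UTF-8 bytes (always 32 characters). */
    public static String md5Hex(String key) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5")
                    .digest(key.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(32);
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // MD5 is one of the algorithms every Java platform must provide.
            throw new IllegalStateException(e);
        }
    }

    /** Key to send to the cluster: digest first, so sort order follows the hash. */
    public static String randomizedKey(String key) {
        return md5Hex(key) + ":" + key; // ":" separator is a hypothetical choice
    }

    /** Recovers the original key by skipping the 32 hex characters and the separator. */
    public static String originalKey(String randomizedKey) {
        return randomizedKey.substring(33);
    }

    public static void main(String[] args) {
        String stored = randomizedKey("user-42"); // "user-42" is a made-up key
        System.out.println(stored);
        System.out.println(originalKey(stored));
    }
}
```

Because the stored key now sorts by its hash, range scans in the original key order no longer work for that CF, which is why the suggestion applies only to CFs that do not need ordering.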

Re: Bug in Cassandra that occurs when removing a supercolumn.

2010-04-07 Thread Jonathan Ellis
If you can make a reproducible test case using the example CF definitions, that would be great. On Wed, Apr 7, 2010 at 2:48 PM, Matthew Grogan mgro...@system7.co.uk wrote: In both my cases the re-inserts have a higher timestamp. On 7 April 2010 20:13, Jonathan Ellis jbel...@gmail.com wrote:

Re: Handshake failed

2010-04-07 Thread Brandon Williams
On Wed, Apr 7, 2010 at 3:15 PM, Jason Alexander jason.alexan...@match.com wrote: TTransport transport = new TSocket(10.223.131.19, ); This is not the default Thrift port (unless you explicitly set it that way), you probably want port 9160. -Brandon

Cassandra at Twitter's Chirp conference

2010-04-07 Thread Ryan King
I'll be giving a talk at our developer's conference next week about how and why we're using cassandra. If there's anything you'd like to hear about, post your question on http://www.google.com/moderator/#15/e=5c0ft=5c0f.49f=5c0f.23623. thanks, ryan PS - Yes, I think video will be available.

does compaction of Super Column Family have same limit as compaction of Column Family

2010-04-07 Thread Jeremy Davis
Quick question: There is an open issue with ColumnFamilies growing too large to fit in memory when compacting.. Does this same limit also apply to SCF? As long as each sub CF is sufficiently small, etc. -JD

Re: does compaction of Super Column Family have same limit as compaction of Column Family

2010-04-07 Thread Benjamin Black
SCF rows are loaded in their entirety into memory, so the limit applies in the same way. On Wed, Apr 7, 2010 at 5:16 PM, Jeremy Davis jerdavis.cassan...@gmail.com wrote: Quick question: There is an open issue with ColumnFamilies growing too large to fit in memory when compacting.. Does this

Re: Integrity of batch_insert and also what about sharding?

2010-04-07 Thread Paul Prescod
On Wed, Apr 7, 2010 at 6:02 PM, banks bankse...@gmail.com wrote: Then from an IT standpoint, if I'm using a RF of 3, it stands to reason that running on Raid 1 makes sense, since RAID and RF achieve the same ends... it makes sense to stripe for speed and let Cassandra deal with redundancy, eh?

Re: Iterate through entire data set

2010-04-07 Thread Stu Hood
Please read the README in the contrib/word_count directory. -Original Message- From: Sonny Heer sonnyh...@gmail.com Sent: Wednesday, April 7, 2010 6:33pm To: user@cassandra.apache.org Subject: Re: Iterate through entire data set Jon, I've got the word_count.jar and a Hadoop cluster. How

Re: Integrity of batch_insert and also what about sharding?

2010-04-07 Thread Benjamin Black
That depends on your goals for fault tolerance and recovery time. If you use RAID1 (or other redundant configuration) you can tolerate disk failure without Cassandra having to do repair. For large data sets, that can be a significant win. b On Wed, Apr 7, 2010 at 6:02 PM, banks

Re: Integrity of batch_insert and also what about sharding?

2010-04-07 Thread banks
What I'm trying to wrap my head around is what is the break even point... If I'm going to store 30 terabytes in this thing... what's optimum to give me performance and scalability... is it best to be running 3 powerful nodes, 100 smaller nodes, nodes on each web blade with 300g behind each... ya

RE: Integrity of batch_insert and also what about sharding?

2010-04-07 Thread Jason Alexander
FWIW, I'd love to see some guidance here too - From our standpoint, we'll be consolidating the various Match.com sites' (match.com, chemistry.com, etc...) data into a single data warehouse, running Cassandra. We're looking at roughly the same amounts of data (30 TB or more). We were assuming

Re: Integrity of batch_insert and also what about sharding?

2010-04-07 Thread Benjamin Black
Recovery times are shorter the less data per node, so lots of smaller nodes are better on that axis. More nodes also means more frequent node failure, so lots of smaller nodes are worse on that axis. The gossip chatter is minuscule, even with large clusters. Simply not a factor. On Wed, Apr 7,

RE: Integrity of batch_insert and also what about sharding?

2010-04-07 Thread Jason Alexander
Well, IANAITG (I Am Not An IT Guy), but outside of the normal benefits you get from a SAN (that you can, of course, get from other options) is that I believe our IT group likes it for the management aspects - they like to buy a BigAssSAN(tm) and provision storage to different clusters,

Re: Integrity of batch_insert and also what about sharding?

2010-04-07 Thread Cliff Moon
Putting cassandra's data directories on a SAN is like putting a bunch of F1's on one of those big car carrier trucks and entering a race with the truck. You know, since you have so much horsepower. On 4/7/10 7:28 PM, Jason Alexander wrote: Well, IANAITG (I Am Not An IT Guy), but outside of

Basic question

2010-04-07 Thread Palaniappan Thiyagarajan
All, I am investigating how we can use Cassandra in our application. We have tokens and session information stored in a db now and I am thinking of moving to Cassandra. Currently it's write and read intensive and has performance issues. Is it a good idea to move a couple of tables and

Re: Integrity of batch_insert and also what about sharding?

2010-04-07 Thread David Timothy Strauss
Based on empirical usage, Gossip chatter is quite manageable well beyond 100 nodes. One advantage of many small nodes is that the cost of node failure is small on rebuild. If you have 100 nodes with a hundred gigs each, the price you pay for a node's complete failure is pulling a hundred
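
The trade-off discussed in this thread can be made concrete with rough arithmetic. The sketch below takes the 30-terabyte data set and the replication factor of 3 mentioned in earlier messages and assumes that data is spread perfectly evenly and that 1 TB is 1000 GB; both are simplifications, and the class and method names are hypothetical.

```java
public class NodeSizingSketch {
    /**
     * Approximate data held by each node, in gigabytes, assuming replicas are
     * spread evenly across the cluster (a simplifying assumption).
     */
    public static double perNodeGb(double totalGb, int replicationFactor, int nodes) {
        return totalGb * replicationFactor / nodes;
    }

    public static void main(String[] args) {
        double totalGb = 30_000; // 30 TB, taking 1 TB = 1000 GB
        int rf = 3;              // replication factor mentioned earlier in the thread

        // Data a replacement node would have to pull back after a complete failure.
        System.out.println("3 nodes:   " + perNodeGb(totalGb, rf, 3) + " GB per node");
        System.out.println("100 nodes: " + perNodeGb(totalGb, rf, 100) + " GB per node");
    }
}
```

The per-node figure is roughly what a rebuild has to move, which is why more, smaller nodes shorten recovery while increasing how often some node fails.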