RE: Schemas diverging while dynamically creating CF.

2011-04-15 Thread Dan Hendry
Uh... don't create a column family per user. Column families are meant to be
fairly static; they are conceptually equivalent to a table in a relational
database. Why do you need (or even want) a CF per user? Reconsider your data
model: a single column family with an inverted index on a 'user' column is
probably closer to what you are looking for. Operationally, the fewer CFs the
better.
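
To make the suggested layout concrete, here is a minimal sketch that uses plain
Python dicts as stand-ins for column families. It is not Cassandra client code.
The names `documents`, `docs_by_user`, `add_document` and `documents_for_user`
are made up for illustration, and keeping the inverted index in a second static
structure (rather than, say, a built-in secondary index) is an assumption.

```python
from collections import defaultdict

# Stand-ins for static column families, modelled as plain dicts (assumption:
# this only illustrates the layout, it does not talk to Cassandra).
# documents:    row key = document id -> columns, including a 'user' column
# docs_by_user: row key = user id     -> {document id: ""} (the inverted index)
documents = {}
docs_by_user = defaultdict(dict)


def add_document(doc_id, user_id, body):
    """Store the document once and record it under its owner in the index."""
    documents[doc_id] = {"user": user_id, "body": body}
    docs_by_user[user_id][doc_id] = ""


def documents_for_user(user_id):
    """Look up a user's documents through the index, in document-id order."""
    doc_ids = sorted(docs_by_user.get(user_id, {}))
    return [documents[doc_id] for doc_id in doc_ids]


add_document("doc1", "alice", "first")
add_document("doc2", "bob", "second")
add_document("doc3", "alice", "third")
```

Adding a new user here only adds rows; nothing about the schema changes, which
is the point of the suggestion.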

 

Dan

 

From: Alejandro Perez [mailto:sp...@indextank.com] 
Sent: April-15-11 16:39
To: user@cassandra.apache.org
Cc: Support
Subject: Schemas diverging while dynamically creating CF.

 

Hello,

 

We're testing Cassandra for integration with IndexTank. In this first attempt,
we're creating one column family for each user. In practice, on the first
run and for the first few hundred documents, a new CF is created and a
document is immediately added to it. Up to 50 requests of this type
are issued in parallel (for different column families).

 

The end result, which is quite repeatable, is that the cluster splits into
groups of nodes with different schema versions, and they never agree.

 

Any thoughts?

 

 

Thanks,

 

Spike.


-- 

Alejandro Perez
IndexTank

follow us @indextank http://twitter.com/indextank | read our blog
http://blog.indextank.com/ | subscribe to our user mailing list
http://groups.google.com/group/indextank





Re: Schemas diverging while dynamically creating CF.

2011-04-15 Thread Alejandro Perez
Thanks for the quick response! I will reconsider the schema.

However, the problem still troubles me. How are schema changes supposed to
be done? Should I serialize them? Should I halt other cluster operations
while I make the schema change? Is this a known problem with Cassandra?
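
For concreteness, "serializing" schema changes could look roughly like the
sketch below: apply one change at a time, then wait until every node reports
the same schema version before issuing the next. Whether this is enough to
avoid the split is part of the question. Both `change` and
`get_schema_versions` are hypothetical callables supplied by the caller (the
second might be built on something like the Thrift `describe_schema_versions`
call, if the client exposes it), and the lock only serializes changes made from
a single process.

```python
import threading
import time

# Serializes schema changes issued from this process only (assumption: all
# schema changes go through one process; other clients are not covered).
_schema_lock = threading.Lock()


def apply_schema_change(change, get_schema_versions, timeout=30.0, poll=0.5):
    """Apply one schema change, then block until the cluster agrees on it.

    change              -- hypothetical callable that performs the change
    get_schema_versions -- hypothetical callable returning a dict that maps
                           each schema version to the nodes reporting it
    Returns the agreed version, or raises TimeoutError if the nodes are still
    split when the timeout expires.
    """
    with _schema_lock:
        change()
        deadline = time.monotonic() + timeout
        while True:
            versions = get_schema_versions()
            if len(versions) == 1:
                # Every node reports the same version: safe to continue.
                return next(iter(versions))
            if time.monotonic() >= deadline:
                raise TimeoutError(f"schema still split: {sorted(versions)}")
            time.sleep(poll)
```

With this shape, the 50 parallel requests would queue up on the lock for the
schema change itself and only proceed to their writes once agreement is
reached.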

The other question, and I think the more important one for me now: how do I
repair the cluster without losing data once the schemas diverge? Right now
the only way I have is to erase all data and have the cluster start empty.
Should this problem ever happen in production, it's important that there is a
way to recover the data.




-- 
Alejandro Perez
IndexTank
