Schemas diverging while dynamically creating CF.

2011-04-15 Thread Alejandro Perez
Hello,

We're testing Cassandra for integration with IndexTank. In this first try,
we're creating one column family for each user. In practice, on the first
run and for the first few hundred documents, a new CF is created and a
document is immediately added to it. Up to 50 requests of this type
are issued in parallel (for different column families).

The end result, which is quite repeatable, is that the cluster splits into
different schema versions that never agree.

Any thoughts?


Thanks,

Spike.

-- 
Alejandro Perez
IndexTank

follow us @indextank http://twitter.com/indextank | read our
blog http://blog.indextank.com/ | subscribe to
our user mailing list http://groups.google.com/group/indextank


Re: Schemas diverging while dynamically creating CF.

2011-04-15 Thread Alejandro Perez
Thanks for the quick response! I will reconsider the schema.

However, the problem still troubles me. How are schema changes supposed to
be done? Should I serialize them? Should I halt other cluster operations
while I make the schema change? Is this a known problem with Cassandra?
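
As a rough illustration of the "serialize them" option, here is a minimal
Python sketch that applies one schema change at a time and then polls until
the cluster reports a single schema version. It is not a confirmed fix for
the divergence. The helper names (apply_schema_change,
wait_for_schema_agreement) are hypothetical; describe_schema_versions is
passed in as a plain callable that is assumed to return a mapping of schema
version to hosts, in the style of the Thrift call of the same name, and the
"UNREACHABLE" key for down nodes is likewise an assumption.

```python
import threading
import time

# Hypothetical, process-local guard: only one schema change runs at a time.
_schema_lock = threading.Lock()


def wait_for_schema_agreement(describe_schema_versions, timeout=30.0, poll=0.5):
    """Poll until all reachable nodes report a single schema version."""
    deadline = time.monotonic() + timeout
    while True:
        versions = describe_schema_versions()  # assumed shape: {version: [hosts]}
        # Assumption: nodes that are down are grouped under "UNREACHABLE".
        live = [v for v in versions if v != "UNREACHABLE"]
        if len(live) == 1:
            return live[0]
        if time.monotonic() >= deadline:
            raise TimeoutError("schema versions still differ: %r" % (versions,))
        time.sleep(poll)


def apply_schema_change(change, describe_schema_versions, **kwargs):
    """Run one schema change, then wait for agreement before releasing the lock."""
    with _schema_lock:
        change()  # e.g. a callable that creates one column family
        return wait_for_schema_agreement(describe_schema_versions, **kwargs)
```

The lock only serializes changes issued from one process; several client
processes would need some shared coordination instead.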

The other question, and I think the more important one for me now: how do I
repair the cluster without losing data once the schemas diverge? Right now
the only option I have is to erase all data and start the cluster empty.
Should this problem ever happen in production, it's important that there's a way
to recover the data.

On Fri, Apr 15, 2011 at 1:57 PM, Dan Hendry dan.hendry.j...@gmail.com wrote:

 Uh... don’t create a column family per user. Column families are meant to
 be fairly static, conceptually equivalent to a table in a relational
 database. Why do you need (or even want) a CF per user? Reconsider your data
 model; a single column family with an inverted index for a ‘user’ column is
 probably more what you are looking for. Operationally, the fewer CFs the
 better.



 Dan
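
The data model suggested above can be sketched with plain Python
dictionaries standing in for column families. This is only an illustration
of the shape of the data, not Cassandra client code: the names documents,
docs_by_user, add_document and documents_for are hypothetical, and keeping
the index as a separate hand-maintained row per user (rather than a
built-in secondary index) is an assumption.

```python
from collections import defaultdict

# In-memory stand-ins for static column families (hypothetical names).
documents = {}                    # row key: doc_id  -> {column: value}
docs_by_user = defaultdict(dict)  # row key: user_id -> {doc_id: ""}


def add_document(user_id, doc_id, body):
    """Store the document once and record it in the per-user index row."""
    documents[doc_id] = {"user": user_id, "body": body}
    docs_by_user[user_id][doc_id] = ""  # the column name carries the doc id


def documents_for(user_id):
    """Look up a user's documents through the index row, in doc_id order."""
    return [documents[d] for d in sorted(docs_by_user.get(user_id, {}))]
```

Adding a new user then only adds rows; no schema change is needed.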



 *From:* Alejandro Perez [mailto:sp...@indextank.com]
 *Sent:* April-15-11 16:39
 *To:* user@cassandra.apache.org
 *Cc:* Support
 *Subject:* Schemas diverging while dynamically creating CF.








