Good afternoon,

I'm building my data model for Cassandra from scratch, which means I can tune it for performance.
At the moment I'm having trouble choosing between two column families and one super column family. I will illustrate with an example.

Sector: defines a place and has one or two properties.
Entry: an entry that is bound to a sector; it is simply some text and a few properties.

I can model this with a super column family:

    sectors{ //super column family
        sector1{
            uid1{
                text: a text
                user: joop
            }
            uid2{
                text: more text
                user: piet
            }
        }
        sector2{
            uid10{
                text: even more text
                user: marie
            }
        }
    }

But I can also model this with two column families:

    sectors{ // column family
        sector1{
            textid1: null
            textid2: null
        }
        sector2{
            textid4: null
        }
    }

    texts{ //column family
        textid1{
            text: a text
            user: joop
        }
        textid2{
            text: more text
            user: piet
        }
    }

With the super column family I can retrieve the list of texts for a specific sector with only one request to Cassandra. With the two column families I need to send two requests:

1. Give me all text ids from sector x (returns x, y, z).
2. Give me all texts that have id x, y, z.

In my final application there will likely be somewhat more writes than reads.

I was wondering which approach is best for performance. I suspect that a super column family is slower than a standard column family, but is it still slower than two column families that need two requests to Cassandra instead of one?

Kind regards,
T. Akhayo
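
P.S. To make the two read paths concrete, here is a minimal sketch using plain Python dicts as stand-ins for the column families. No Cassandra client is involved. The function names (read_super, read_two_families) and the trip counts they return are made up for illustration, and the textid4 row in texts is an assumption added by analogy with the first layout.

```python
# Plain-dict stand-ins for the two layouts; nothing here talks to Cassandra.

# Layout 1: one super column family (sector -> entry id -> properties).
sectors_super = {
    "sector1": {
        "uid1": {"text": "a text", "user": "joop"},
        "uid2": {"text": "more text", "user": "piet"},
    },
    "sector2": {
        "uid10": {"text": "even more text", "user": "marie"},
    },
}

# Layout 2: two column families (an index of ids, plus the texts themselves).
sectors_index = {
    "sector1": {"textid1": None, "textid2": None},
    "sector2": {"textid4": None},
}
texts = {
    "textid1": {"text": "a text", "user": "joop"},
    "textid2": {"text": "more text", "user": "piet"},
    # Assumption: textid4 is not listed in the post; filled in by analogy.
    "textid4": {"text": "even more text", "user": "marie"},
}


def read_super(sector):
    """Hypothetical helper: fetch a whole sector row in one round trip."""
    return sectors_super[sector], 1


def read_two_families(sector):
    """Hypothetical helper: fetch the ids, then fetch the texts by id."""
    ids = list(sectors_index[sector])        # round trip 1: ids for the sector
    entries = {i: texts[i] for i in ids}     # round trip 2: texts for those ids
    return entries, 2


entries, trips = read_two_families("sector1")
print(trips, sorted(entries))
```

Both helpers return the same entries for a sector; the only difference this sketch shows is the number of round trips, not the actual cost of each request inside Cassandra.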