On 2010-03-05 18:04, Erik Holstad wrote:
> What are the benefits of using multiple ColumnFamilies compared to using
> a composite row name?

Just for terminology's sake, I'll note that rows have keys, not names.
Only columns and supercolumns have names.

I'm not the top expert here by any means, but I think the choice between
{CF-as-direction, key-as-person} and {key-as-person-and-direction} won't
affect performance substantially if the multiple CFs in the first option
are identically configured. Either way, all messages with the same
source or destination still share the same row.
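
To make the comparison concrete, here is a minimal sketch of the two
layouts. It uses plain Python dicts as stand-ins for column families; it
is not a Cassandra client API, and the sample messages, user names, and
the from_/to_ key prefixes are assumptions made for illustration only.

```python
# Plain dicts model column families as {row_key: {column_name: value}}.
# Illustrative only; nothing here talks to a real Cassandra cluster.

# Hypothetical sample data: (message_id, sender, recipient).
messages = [
    ("m1", "alice", "bob"),
    ("m2", "alice", "carol"),
    ("m3", "bob", "alice"),
]

# Option 1: direction is the CF, person is the row key.
cf_from, cf_to = {}, {}
for msg_id, sender, recipient in messages:
    cf_from.setdefault(sender, {})[msg_id] = recipient
    cf_to.setdefault(recipient, {})[msg_id] = sender

# Option 2: a single CF, with person and direction both in the row key.
cf_single = {}
for msg_id, sender, recipient in messages:
    cf_single.setdefault("from_" + sender, {})[msg_id] = recipient
    cf_single.setdefault("to_" + recipient, {})[msg_id] = sender

# In both options, "everything alice sent" is a read of exactly one row.
sent_option_1 = cf_from["alice"]
sent_option_2 = cf_single["from_alice"]
```

In this model both reads return the same columns from a single row,
which is why I wouldn't expect the two options to differ much.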

What *would* make a huge difference is composite row keys like
from_userA_userB and to_userB_userA, where you'd have to pull key ranges
to get all the messages to or from someone. That design would trade
performance for inbox scalability, assuming users spread their
messages across a wide range of other users.
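
As a rough sketch of what "pulling key ranges" means, the model below
keeps one row per (direction, user, other user) and finds everything a
user sent by scanning a contiguous range of sorted keys. It again uses
plain dicts rather than any Cassandra API, it assumes the keys are
available in sorted order, and the helper name sent_by and the sample
data are hypothetical.

```python
from bisect import bisect_left

# Hypothetical sample data: (message_id, sender, recipient).
messages = [
    ("m1", "alice", "bob"),
    ("m2", "alice", "carol"),
    ("m3", "bob", "alice"),
]

# One row per (direction, user, other user), e.g. from_alice_bob.
cf = {}
for msg_id, sender, recipient in messages:
    cf.setdefault(f"from_{sender}_{recipient}", {})[msg_id] = ""
    cf.setdefault(f"to_{recipient}_{sender}", {})[msg_id] = ""

# Assumption: row keys can be walked in sorted order.
sorted_keys = sorted(cf)


def sent_by(user):
    """Return (message ids, rows touched) for everything `user` sent."""
    prefix = f"from_{user}_"
    start = bisect_left(sorted_keys, prefix)
    message_ids, rows_touched = [], 0
    for key in sorted_keys[start:]:
        if not key.startswith(prefix):
            break  # left the contiguous range of matching keys
        message_ids.extend(cf[key])
        rows_touched += 1
    return sorted(message_ids), rows_touched


alice_sent, alice_rows = sent_by("alice")
```

The read now touches one row per correspondent instead of one row per
user, which is the cost side of the trade. The benefit is that no
single row has to hold a user's entire inbox or outbox.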

> Example: You have messages that you want to index on sent and to.
> 
> So you can either have
> ColumnFamilyFrom:userTo:{userFrom->messageid}
> ColumnFamilyTo:userFrom:{userTo->messageid}
> 
> or something like
> ColumnFamily:user_to:{user1_messageId, user2_messageId}
> ColumnFamily:user_from:{user1_messageId, user2_messageId}

You've changed two different things between the examples:

(1) Whether direction is distinguished by the key or by the CF.
(2) The column layout (userFrom->messageid versus user1_messageId-style
names), which isn't explained and isn't necessary to support the change
in CF/key structure.

What is the second change, and why did you make it?

-- 
David Strauss
   | da...@fourkitchens.com
Four Kitchens
   | http://fourkitchens.com
   | +1 512 454 6659 [office]
   | +1 512 870 8453 [direct]
