This article about MongoDB is interesting: http://bit.ly/FJgTE

The authors prioritized a low barrier to entry in their selection process and did not consider performance or scaling at all.
Aside from that, they mention that for row-oriented storage, serializing the same string column names to disk for every row is a big waste of disk and cache space. As far as I know, this affects Cassandra too.

Would it be possible to add symbolized column names in a forward-compatible way? Maybe scoped per sstable, with the registries always kept in memory. Each node could individually make a decision about whether a column name is duplicated enough to be worth symbolizing, and apply the transformation in the compaction phase.

Of course there are pitfalls, but it seems like it could be a big boon to effective cache size in row-oriented applications.

Evan

--
Evan Weaver
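
P.S. To make the idea concrete, here is a rough Python sketch. It is not Cassandra code and does not reflect the real sstable format. The function names (build_registry, encode_row, decode_row) and the min_count threshold are invented for illustration, and rows are modeled as plain dicts of column name to value. An integer key stands for a symbolized name and a string key for a literal one.

```python
from collections import Counter


def build_registry(rows, min_count=2):
    """Choose which column names to symbolize for one (hypothetical) sstable.

    A name gets a symbol only if it appears in at least `min_count` rows,
    which stands in for the "duplicated enough" decision a node would make
    during compaction.
    """
    counts = Counter(name for row in rows for name in row)
    frequent = sorted(name for name, n in counts.items() if n >= min_count)
    return {name: symbol for symbol, name in enumerate(frequent)}


def encode_row(row, registry):
    """Replace symbolized names with their integer symbol; keep others as-is."""
    return [(registry.get(name, name), value) for name, value in row.items()]


def decode_row(encoded, registry):
    """Invert encode_row using the in-memory registry."""
    names = {symbol: name for name, symbol in registry.items()}
    return {
        (names[key] if isinstance(key, int) else key): value
        for key, value in encoded
    }


rows = [
    {"username": "a", "email": "x"},
    {"username": "b"},
    {"username": "c", "nickname": "z"},
]
registry = build_registry(rows)
encoded = [encode_row(row, registry) for row in rows]
decoded = [decode_row(enc, registry) for enc in encoded]
```

Because rare names stay as literal strings, a reader that knows nothing about symbols would only need to handle the integer case, which is one possible route to forward compatibility.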
