Somewhere (I can't find it now) I've read that Riak, like Cassandra, could be classified as a column store.
This is just a name, of course, but as I understand it from Cassandra, this allows for space-efficient encoding of column values. Basically, storage is organized around columns instead of rows, allowing for different persistence strategies on a per-column, or per-column-family, basis. Moreover, it would allow for zero storage overhead for non-existent column values, i.e. efficient storage of sparse data sets.

Does Riak have this property as well?

More specifically, I've got a data structure on paper with the following properties, when mapped to Riak nomenclature:

- ~1,000,000 keys (will not grow)
- ~1,000 columns (may grow)
- Any particular key has a median of ~50 columns. In other words, the entire set is ~95% sparse.
- Wherever a key has a value for a particular column, that value is always a string (base 255) of exactly 4KB length.
- The 4KB values themselves are pretty 'sparse', so they would benefit a lot from run-length encoding. Is this supported out of the box?

Given these properties, how would Riak hold up? Hard to say, of course, but I'm looking for some general advice.

Thanks.
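
To make the numbers above concrete, here is a rough back-of-the-envelope sketch in Python. It assumes "4KB" means 4096 bytes and uses the median of ~50 columns per key as a stand-in for the mean. The run-length encoder is only a naive byte-level illustration of the kind of compression I have in mind, not something I know Riak or Cassandra to provide, and all function names are hypothetical.

```python
# Back-of-the-envelope sizing for the data set described above.
# Assumptions: "4KB" = 4096 bytes; the median (~50 columns per key) is
# used in place of the mean; all names here are hypothetical.

KEYS = 1_000_000
COLUMNS = 1_000
MEDIAN_COLUMNS_PER_KEY = 50
VALUE_BYTES = 4096


def sparsity() -> float:
    """Fraction of (key, column) cells that hold no value."""
    return 1 - MEDIAN_COLUMNS_PER_KEY / COLUMNS


def dense_bytes() -> int:
    """Raw size if every key stored a value for every column."""
    return KEYS * COLUMNS * VALUE_BYTES


def sparse_bytes() -> int:
    """Raw size if only populated cells are stored, ignoring per-key overhead."""
    return KEYS * MEDIAN_COLUMNS_PER_KEY * VALUE_BYTES


def rle_encode(data: bytes) -> bytes:
    """Naive run-length encoding as (run_length, byte) pairs, runs capped at 255."""
    out = bytearray()
    i = 0
    while i < len(data):
        run = 1
        while i + run < len(data) and run < 255 and data[i + run] == data[i]:
            run += 1
        out += bytes((run, data[i]))
        i += run
    return bytes(out)


def rle_decode(encoded: bytes) -> bytes:
    """Inverse of rle_encode: expand each (run_length, byte) pair."""
    out = bytearray()
    for j in range(0, len(encoded), 2):
        out += bytes([encoded[j + 1]]) * encoded[j]
    return bytes(out)


if __name__ == "__main__":
    print(f"sparsity: {sparsity():.0%}")
    print(f"dense:    {dense_bytes() / 1e12:.2f} TB")
    print(f"sparse:   {sparse_bytes() / 1e9:.1f} GB")

    # A made-up, mostly-zero 4096-byte value to illustrate the RLE gain.
    sample = bytes(4000) + b"abc" + bytes(93)
    packed = rle_encode(sample)
    assert rle_decode(packed) == sample
    print(f"RLE:      {len(sample)} -> {len(packed)} bytes")
```

The point of the sketch is only the order of magnitude: storing nothing for absent columns takes the raw payload from terabytes to a couple of hundred gigabytes, and how much run-length encoding helps on top of that depends entirely on how long the runs in the real values are.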
