On Sun, Jul 26, 2009 at 2:28 AM, Evan Weaver <[email protected]> wrote:
> Would it be possible to add symbolized column names in a
> forward-compatible way? Maybe scoped per sstable, with the registries
> always kept in memory.
Maybe. But it's not obvious to me how to do this in general. The problem is the sparse nature of the column set.

We can't encode _all_ the columns this way, or in the degenerate case we OOM just trying to keep the mapping in memory. Similarly, we can't encode just the top N column names, since figuring out the top N requires keeping each name in memory during the counting process. (It would also slow down compaction -- instead of just deserializing columns where there are keys in common in the merged fragments, we would have to deserialize all of them.)

ISTM that all we can do is encode the _first_ N column names we see, which may be useful if the column name set is small for a given CF.

-Jonathan
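
To make the "first N" idea concrete, here is a rough sketch in Python. It is only an illustration, not anything that exists in Cassandra: the class name FirstNSymbolTable, the max_symbols parameter, and the ("sym", id) / ("raw", name) token shapes are all made up for this example. The point is that the registry never holds more than N names, so memory stays bounded no matter how sparse the column set is, and names seen after the table fills up are simply written unencoded.

```python
class FirstNSymbolTable:
    """Hypothetical per-sstable registry: maps the first N distinct
    column names seen to small integer ids. Not a Cassandra API."""

    def __init__(self, max_symbols):
        self.max_symbols = max_symbols
        self._ids = {}    # column name -> symbol id
        self._names = []  # symbol id -> column name

    def encode(self, name):
        """Return ("sym", id) for a registered name, registering it
        if there is still room; otherwise return ("raw", name)."""
        sym = self._ids.get(name)
        if sym is None and len(self._names) < self.max_symbols:
            sym = len(self._names)
            self._ids[name] = sym
            self._names.append(name)
        if sym is None:
            # Table is full: fall back to storing the name itself.
            return ("raw", name)
        return ("sym", sym)

    def decode(self, token):
        """Invert encode(): look up symbol ids, pass raw names through."""
        kind, value = token
        return self._names[value] if kind == "sym" else value


table = FirstNSymbolTable(max_symbols=2)
tokens = [table.encode(n) for n in ["name", "email", "name", "age"]]
decoded = [table.decode(t) for t in tokens]
```

With N = 2, "name" and "email" get ids 0 and 1, the repeated "name" reuses id 0, and "age" arrives too late and is stored raw. If the column name set for a CF is small, almost everything ends up symbolized; if it is huge, only the first N names benefit.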
