Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Cassandra Wiki" for 
change notification.

The "FileFormatDesignDoc" page has been changed by StuHood.
http://wiki.apache.org/cassandra/FileFormatDesignDoc?action=diff&rev1=11&rev2=12

--------------------------------------------------

  
  One weakness of the implementation so far is that it doesn't allow tuples to 
be reordered within a level. This approach performs well for wide rows with 
high field cardinality, since adding compression is unlikely to remove data.
  
- Since we have domain knowledge that a compression algorithm would not, it 
will often be more efficient to perform reordering by ourselves, particularly 
when a chunk has low cardinality: for example at the "name2" level above. By 
assigning the chunk an ''ordered'' type, we can store the fields in sorted 
order (rather than in parent-sorted order) and remove duplicates.
+ Since we have domain knowledge that a compression algorithm would not, it 
will often be more efficient to perform reordering by ourselves, particularly 
when a chunk has low cardinality: for example at the "name2" level above. By 
assigning the chunk an ordering of ''self'' (as opposed to ''parent''), we can 
store the fields in sorted order (rather than in ''parent''-sorted order) and 
remove duplicates.
  
  || ''name2'' ||
  || flavor ||
  || origin  ||
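
As a rough illustration of the ''self'' ordering described above, the following Python sketch sorts and de-duplicates the fields of a single chunk. The function name `self_order` and the sample input are hypothetical and are not part of the design document; the sketch only covers the chunk's own fields and does not model how child chunks are rearranged.

```python
def self_order(fields):
    """Return a chunk's fields in sorted order with duplicates removed.

    Assumption: fields are plain comparable values (strings here).
    """
    return sorted(set(fields))


# Hypothetical "name2" fields as they might appear in parent-sorted order,
# with one run of names per parent.
parent_ordered = ["flavor", "origin", "flavor", "origin", "flavor"]

self_ordered = self_order(parent_ordered)
print(self_ordered)  # ['flavor', 'origin']
```

With low cardinality, as at the "name2" level, the de-duplicated list stays short no matter how many parents repeat the same names.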
  
- More importantly, a chunk of type ''ordered'' should influence the order of 
tuples in child chunks. When we encounter an ''ordered'' chunk at level 
"name2", we should expect its children in level "value" to be arranged as 
follows:
+ More importantly, a ''self''-ordered chunk should influence the order of 
tuples in child chunks. When we encounter a ''self''-ordered chunk at level 
"name2", we should expect its children in level "value" to be arranged as 
follows:
  
  || ''value'' || ''parent_change'' ||
  || 3.4 || 1 ||
