> - writes: a row oriented database writes the whole row regardless
>   of whether or not values are supplied for every field or not.
>   Space is reserved for null fields, so the number of bytes
>   written is the same for every row. In a column oriented
>   database, only the columns for which values are supplied are
>   written. Nulls are free. Also row oriented databases must write
>   a row descriptor so that when the row is read, the column values
>   can be found.

While I believe this is true for the basic N-ary Storage Model (NSM) as published in the literature, I believe most practical products have some mechanism for null compression within a page. Could someone with more experience confirm whether this is the case?

> - reads: Unless every column is being returned on a read, a column
>   oriented database is faster because it only reads the columns
>   requested. The row oriented database must read the entire row,
>   figure out where the requested columns are and only return that
>   portion of the data read.

Partly. This ignores that a column-oriented store has to do tuple reconstruction, which also has overhead. As published in the literature, a hybrid that keeps rows together across pages but organizes attributes as columns within each page is better than a pure column store in almost all workloads (see the PAX storage manager in the literature).

All that said, I found his paper extremely interesting, particularly the willingness to forgo disk altogether.

Jason
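
To make the null-compression point concrete, here is a minimal sketch of the idea. It is only an illustration under stated assumptions, not any product's actual on-disk format: the three-column integer schema, the one-byte null bitmap, and the function names are all hypothetical. It compares a fixed-width row, where space is reserved for NULLs, with a null-suppressed row, where only supplied values are written.

```python
import struct

NUM_COLS = 3  # hypothetical schema: three 32-bit signed integer columns


def null_bitmap(row):
    """Return an int whose bit i is set when column i is NULL (None)."""
    bits = 0
    for i, value in enumerate(row):
        if value is None:
            bits |= 1 << i
    return bits


def encode_fixed(row):
    """Fixed-width layout: 4 bytes are reserved for every column, NULL or not."""
    body = b"".join(struct.pack("<i", 0 if v is None else v) for v in row)
    return bytes([null_bitmap(row)]) + body


def encode_compressed(row):
    """Null-suppressed layout: only non-NULL columns occupy space."""
    body = b"".join(struct.pack("<i", v) for v in row if v is not None)
    return bytes([null_bitmap(row)]) + body


def decode_compressed(data):
    """Rebuild the row by walking the bitmap to locate each stored value."""
    bits = data[0]
    offset = 1
    row = []
    for i in range(NUM_COLS):
        if bits & (1 << i):
            row.append(None)
        else:
            (value,) = struct.unpack_from("<i", data, offset)
            row.append(value)
            offset += 4
    return row


sparse = (7, None, None)
print(len(encode_fixed(sparse)))       # 13 bytes: 1 bitmap byte + 3 * 4
print(len(encode_compressed(sparse)))  # 5 bytes: 1 bitmap byte + 1 * 4
```

In this sketch the bitmap plays the role of the row descriptor mentioned in the quoted text: the null-suppressed row is smaller, but a reader has to walk the bitmap to find where each column value starts.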