On Thu, Aug 18, 2016 at 7:05 AM, fidel zheng <[email protected]> wrote:
> I just read the Kudu paper, and I have a question about delta
> compaction.
>
> Any given live row is in exactly one rowset, so the delete of that row
> is in the delta file of the same rowset. When the maintenance process
> does a delta compaction, it could apply the delete. Why doesn't it?

The issue is that delta compaction typically does not rewrite all of
the columns. For example, consider a schema like:

  CREATE TABLE users (
    user_id PRIMARY KEY,
    address string,
    biography string,
    phone_number string,
    last_login_ts int64
  );

Any of the non-PK fields might be updated, but 'last_login_ts' will be
updated much more frequently than the others. The major delta
compaction process uses per-column update counts to see, in this case,
that 'last_login_ts' is the only column that needs to be compacted
(because the others likely haven't received updates). This saves a lot
of IO, especially in a case like this, because the 'last_login_ts'
column is likely to be much smaller than columns such as 'biography'
or 'address'.

To get back to your question, then: if we are compacting only a subset
of columns, it isn't possible to garbage-collect a deleted row.
Consider the data:

  tlipcon       "433 California St"  12345   [UPDATE: last_login = 22345]
  deleted_user  "210 Portage Ave"    54321   [DELETED]
  other_user    "151 W 26th St"      90000

and suppose we want to major-compact the 'last_login' column. Since we
only read and re-write that column, imagine what would happen if we
processed the delete:

  tlipcon       "433 California St"  22345
  deleted_user  "210 Portage Ave"    90000
  other_user    "151 W 26th St"      ??????

The compacted column would now be "too short": every value after the
deleted row gets shifted upward into a non-corresponding row.
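If it helps to see the misalignment concretely, here is a tiny Python
sketch (purely illustrative; not Kudu's code or on-disk format) of a
rowset whose columns are stored as separate, position-matched arrays:

  # Illustrative only -- not Kudu's code. Columns of one rowset stored
  # as separate arrays, matched to each other purely by position.
  base_names     = ["tlipcon", "deleted_user", "other_user"]
  base_addresses = ["433 California St", "210 Portage Ave", "151 W 26th St"]
  base_logins    = [12345, 54321, 90000]

  login_updates = {0: 22345}   # row index -> new last_login (hypothetical)
  deleted_rows  = {1}          # row indexes that have a DELETE delta

  # Major-compact ONLY the last_login column, naively applying the
  # delete by skipping the deleted row:
  compacted_logins = [login_updates.get(i, v)
                      for i, v in enumerate(base_logins)
                      if i not in deleted_rows]

  # The other columns still have three entries, so positional lookup
  # breaks for every row after the deleted one:
  for i, name in enumerate(base_names):
      login = compacted_logins[i] if i < len(compacted_logins) else "??????"
      print(name, base_addresses[i], login)
  # tlipcon 433 California St 22345
  # deleted_user 210 Portage Ave 90000   <- wrong row's login
  # other_user 151 W 26th St ??????      <- ran off the end

In other words, a delete can only be physically applied by a compaction
that rewrites all of the rowset's columns together; until then, the
DELETE has to stay in the delta file.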
Hope that helps

-Todd

--
Todd Lipcon
Software Engineer, Cloudera