On Jul 07, Andrew Purtell wrote:
>> Since HDFS is mostly write once how are updates/deletes handled?
>
> Not mostly, only write once.
>
> Deletes are just another write, but one that writes tombstones
> "covering" data with older timestamps.
>
> When serving queries, HBase searches store files back in time until it
> finds data at the coordinates requested or a tombstone.
>
> The process of compaction not only merge-sorts a bunch of accumulated
> store files (from flushes) into fewer store files (or one) for read
> efficiency, it also performs housekeeping, dropping data "covered" by
> the delete tombstones. Incidentally, this is also how TTLs are
> supported: expired values are dropped as well.
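The quoted read/compaction behavior can be sketched as a toy model. This is not HBase code, just a minimal illustration under simplifying assumptions: each "store file" is a dict from (row, column) coordinates to (timestamp, value) entries, `TOMBSTONE` is a hypothetical sentinel for a delete marker, and compaction keeps only the newest surviving version (i.e. maxVersions=1) while dropping covered and TTL-expired cells.

```python
import time

TOMBSTONE = object()  # sentinel marking a delete that covers older timestamps


def query(store_files, row, col):
    """Find the newest entry for a coordinate across all store files.

    If the newest entry is a tombstone, the older data is "covered"
    and the coordinate reads as absent.
    """
    best = None
    for sf in store_files:
        for ts, value in sf.get((row, col), []):
            if best is None or ts > best[0]:
                best = (ts, value)
    if best is None or best[1] is TOMBSTONE:
        return None
    return best[1]


def compact(store_files, ttl_seconds=None, now=None):
    """Merge store files into one, dropping covered and expired cells."""
    now = now if now is not None else time.time()
    merged = {}
    for sf in store_files:
        for coord, entries in sf.items():
            merged.setdefault(coord, []).extend(entries)
    out = {}
    for coord, entries in merged.items():
        entries.sort(key=lambda e: -e[0])  # newest first
        newest_ts, newest_val = entries[0]
        if newest_val is TOMBSTONE:
            continue  # tombstone covers all older versions: drop them all
        if ttl_seconds is not None and now - newest_ts > ttl_seconds:
            continue  # expired by TTL: drop as well
        out[coord] = [(newest_ts, newest_val)]
    return [out]  # one merged store file
```

A delete is just a newer entry whose value is the tombstone; queries see it first and stop, and compaction later physically discards everything it covers.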
Just wanted to talk about the WAL. My understanding is that updates are journaled to HDFS by appending them sequentially, as they happen, to a per-region log. This is where the need for HDFS append comes in, something I don't recall seeing in the GFS paper. Even with append support in HDFS, syncing the log on every edit is still expensive, and this is where the WAL flushing policies come in.
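The trade-off between durability and sync cost can be sketched as a batching flush policy. This is a hypothetical model, not HBase's actual WAL implementation: the class names and thresholds (`sync_batch`, `sync_interval`) are invented for illustration, and the "durable" list stands in for an HDFS file that a real WAL would hflush/sync.

```python
import time


class WriteAheadLog:
    """Toy WAL: edits are appended sequentially, but the expensive
    durable sync is deferred and batched by a flush policy."""

    def __init__(self, sync_batch=16, sync_interval=1.0):
        self.buffer = []                    # appended but not yet synced
        self.durable = []                   # stands in for the synced log file
        self.sync_batch = sync_batch        # sync after this many edits...
        self.sync_interval = sync_interval  # ...or after this many seconds
        self.last_sync = time.monotonic()

    def append(self, edit):
        self.buffer.append(edit)
        # Flush policy: sync when the batch is large enough or enough time
        # has passed, instead of paying the sync cost on every single edit.
        if (len(self.buffer) >= self.sync_batch
                or time.monotonic() - self.last_sync >= self.sync_interval):
            self.sync()

    def sync(self):
        # In a real WAL this is the expensive call (an HDFS sync/hflush);
        # here we just promote buffered edits to the durable list.
        self.durable.extend(self.buffer)
        self.buffer.clear()
        self.last_sync = time.monotonic()
```

The obvious cost of deferring the sync is the durability window: edits sitting in the buffer when the process dies are lost, which is exactly the knob these policies expose.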
