How do you choose which files to compact? I ask because of the following scenario. Suppose that over time you dumped files f1, f2, f3 and f4, in that order (f1 before f2, f2 before f3, and so on). Suppose a key was deleted and that information exists in f3. Now suppose a minor compaction merges f1, f2 and f4 into a new file f5. When a read occurs, would you treat f5 as the most recent file? If so, you have lost the fact that the key was deleted, because that information is in f3. I could always delete a key and then add it again, so I guess this situation is possible.
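To make the scenario concrete, here is a minimal sketch of it as a toy model. It is not HBase code: the class and method names are hypothetical, each "file" is just a map, and a read is assumed to scan files from newest to oldest and stop at the first one that mentions the key. Under those assumptions, merging f1, f2 and f4 and treating the result as the newest file brings the deleted key back, while merging only files that are adjacent in flush order does not.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Toy model only: these names are hypothetical and are not HBase classes.
final class CompactionSketch {
    // Files in flush order. Optional.empty() stands for a delete marker.
    static final Map<String, Optional<String>> F1 = Map.of("k", Optional.of("v1"));
    static final Map<String, Optional<String>> F2 = Map.of("a", Optional.of("x"));
    static final Map<String, Optional<String>> F3 = Map.of("k", Optional.<String>empty());
    static final Map<String, Optional<String>> F4 = Map.of("b", Optional.of("y"));

    /** Scans files newest first; the first file that mentions the key decides. */
    static Optional<String> read(List<Map<String, Optional<String>>> newestFirst, String key) {
        for (Map<String, Optional<String>> file : newestFirst) {
            if (file.containsKey(key)) {
                return file.get(key); // an empty Optional means "deleted"
            }
        }
        return Optional.empty(); // key never written
    }

    /** Merges files given oldest first; entries from newer files overwrite older ones. */
    static Map<String, Optional<String>> compact(List<Map<String, Optional<String>>> oldestFirst) {
        Map<String, Optional<String>> merged = new HashMap<>();
        for (Map<String, Optional<String>> file : oldestFirst) {
            merged.putAll(file);
        }
        return merged;
    }

    /** No compaction: the delete marker in f3 hides the value in f1. */
    static Optional<String> readBeforeCompaction() {
        return read(List.of(F4, F3, F2, F1), "k");
    }

    /** f1, f2 and f4 merged into f5, and f5 treated as newer than f3. */
    static Optional<String> readAfterNonAdjacentCompaction() {
        Map<String, Optional<String>> f5 = compact(List.of(F1, F2, F4));
        return read(List.of(f5, F3), "k");
    }

    /** Only the adjacent run f1, f2 is merged; f5 keeps their place in the order. */
    static Optional<String> readAfterAdjacentCompaction() {
        Map<String, Optional<String>> f5 = compact(List.of(F1, F2));
        return read(List.of(F4, F3, f5), "k");
    }

    public static void main(String[] args) {
        System.out.println("before compaction:       " + readBeforeCompaction());
        System.out.println("non-adjacent compaction: " + readAfterNonAdjacentCompaction());
        System.out.println("adjacent compaction:     " + readAfterAdjacentCompaction());
    }
}
```

In this toy model the problem goes away if compaction only ever merges a contiguous run of files and the result takes their position in the ordering. Whether HBase relies on that, or on something else such as per-cell timestamps, is exactly what I am asking.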
I am assuming that the way files are chosen for compaction is important, so how does HBase do this? I hope I am not way off on this one. Please bear with me if I am.

Thanks
JRR

On Mon, Aug 18, 2008 at 6:02 PM, Jim Kellerman <[EMAIL PROTECTED]> wrote:

> Comments inline:
>
> > -----Original Message-----
> > From: John Ryan [mailto:[EMAIL PROTECTED]
> > Sent: Monday, August 18, 2008 4:49 PM
> > To: hbase-user@hadoop.apache.org
> > Subject: Deletes in HBase
> >
> > How do deletes work in HBase? Suppose I have 2 Column Families and a key
> > that has entries for both column families. Now I want to delete this key.
>
> You mean you want to delete all the values for that row key?
> See HTable.deleteAll({byte[]|String)
>
> > Is a major compaction absolutely essential for this key to be deleted?
>
> No. Essentially a record is written that indicates that a cell, row, or
> column family has been deleted.
>
> > Where can I follow the code for this operation?
>
> All the following paths are prefixed with org.apache.hadoop.hbase:
>
> client.HTable - eventually creates a Java Proxy which has the api specified
> by
>
> ipc.HRegionInterface
>
> which figures out which region server to send the message to. This call
> will be answered by regionserver.HRegionServer.deleteAll which calls
> regionserver.HRegion.deleteAll for the appropriate region, calling
> HRegion.deleteMultiple, HRegion.update which first appends the change to the
> HLog by calling regionserver.HLog.append, and then stores the information in
> the HStore(s) for the appropriate families by calling
> regionserver.HStore.add, which in turn stores it in the memcache for the
> HStore by calling regionserver.Memcache.add, which calls
> regionserver.Memcache.add
>
> Now the change has been persisted to the redo log (HLog) and is cached.
> When the cache fills, a cache flush will write the contents of the cache out
> to disk and may result in a minor compaction.
>
> > Now I am assuming that major compaction doesn't take place all the time
> > since it may be an expensive operation. Having said that, how are the
> > reads for this key suppressed? Please explain.
>
> Reads are suppressed at the level of HStore, and Memcache. They come across
> the deleted markers and suppress the results that would otherwise have been
> returned.
> You would have to follow the call tree for get, getRow, and the various
> scanner.next methods to see how this works. It is very complicated.
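
For my own understanding, the write path described in the quoted reply can be sketched roughly as follows. This is a toy model under my assumptions, not the HBase implementation: `ToyStore`, its fields and the size-based flush threshold are all hypothetical, with a list standing in for the HLog and sorted maps standing in for the Memcache and the flushed files. A delete is written as a marker like any other change, and a read returns nothing as soon as it meets that marker.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.Optional;
import java.util.TreeMap;

// Toy model of the write path; hypothetical names, not HBase code.
final class ToyStore {
    private final List<String> log = new ArrayList<>();                     // stands in for the HLog
    private TreeMap<String, Optional<String>> memcache = new TreeMap<>();   // stands in for the Memcache
    private final Deque<TreeMap<String, Optional<String>>> flushedNewestFirst = new ArrayDeque<>();
    private final int flushThreshold;                                       // assumed: flush by entry count

    ToyStore(int flushThreshold) {
        this.flushThreshold = flushThreshold;
    }

    void put(String key, String value) {
        update(key, Optional.of(value));
    }

    /** A delete is just another write: it records a marker instead of removing data. */
    void delete(String key) {
        update(key, Optional.empty());
    }

    private void update(String key, Optional<String> cell) {
        log.add(key + "=" + cell.orElse("<deleted>")); // 1. append to the redo log first
        memcache.put(key, cell);                        // 2. then cache the change
        if (memcache.size() >= flushThreshold) {        // 3. when the cache fills, flush it
            flushedNewestFirst.addFirst(memcache);
            memcache = new TreeMap<>();
        }
    }

    /** Checks the cache, then flushed files newest first; a delete marker suppresses the result. */
    Optional<String> get(String key) {
        if (memcache.containsKey(key)) {
            return memcache.get(key);
        }
        for (TreeMap<String, Optional<String>> file : flushedNewestFirst) {
            if (file.containsKey(key)) {
                return file.get(key);
            }
        }
        return Optional.empty();
    }

    int flushedFileCount() {
        return flushedNewestFirst.size();
    }

    int logSize() {
        return log.size();
    }

    public static void main(String[] args) {
        ToyStore store = new ToyStore(2);
        store.put("row1", "a");
        store.put("row2", "b");   // cache reaches the threshold and is flushed
        store.delete("row1");     // marker sits in the cache, old value sits in a flushed file
        System.out.println("row1 -> " + store.get("row1"));
        System.out.println("row2 -> " + store.get("row2"));
    }
}
```

The old value of row1 is still physically present in the flushed file. It is only hidden by the marker, which is consistent with the answer that a major compaction is not needed for the delete to take effect on reads.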