Re: Deletes in HBase

Billy Pearson Wed, 20 Aug 2008 11:28:01 -0700

When you delete a cell there is a record inserted with the same timestamp, so when the compaction happens it will be deleted.

When you insert a second value to the same row/column the timestamp should be different, so it is not deleted.
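To make the timestamp point concrete, here is a minimal sketch of the idea in plain Java. It is a toy model, not HBase code: the class VersionedCell and its methods are hypothetical names, and it assumes a delete marker hides only the version that carries the same timestamp.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

/** Toy model of one row/column with timestamped versions. Not HBase code. */
final class VersionedCell {
    /** One stored record: a value, or a delete marker when value is null. */
    private static final class Entry {
        final long timestamp;
        final String value;

        Entry(long timestamp, String value) {
            this.timestamp = timestamp;
            this.value = value;
        }

        boolean isDeleteMarker() {
            return value == null;
        }
    }

    private final List<Entry> entries = new ArrayList<>();

    void put(long timestamp, String value) {
        entries.add(new Entry(timestamp, Objects.requireNonNull(value)));
    }

    /** A delete writes a marker carrying the timestamp of the version it hides. */
    void delete(long timestamp) {
        entries.add(new Entry(timestamp, null));
    }

    /** Returns the newest value that no marker hides, or null if there is none. */
    String get() {
        Entry best = null;
        for (Entry e : entries) {
            if (e.isDeleteMarker() || isHidden(e.timestamp)) continue;
            if (best == null || e.timestamp > best.timestamp) best = e;
        }
        return best == null ? null : best.value;
    }

    /** Compaction (assumed to see all entries) drops markers and the values they hide. */
    void compact() {
        List<Entry> kept = new ArrayList<>();
        for (Entry e : entries) {
            if (!e.isDeleteMarker() && !isHidden(e.timestamp)) kept.add(e);
        }
        entries.clear();
        entries.addAll(kept);
    }

    int storedEntries() {
        return entries.size();
    }

    private boolean isHidden(long timestamp) {
        for (Entry e : entries) {
            if (e.isDeleteMarker() && e.timestamp == timestamp) return true;
        }
        return false;
    }
}
```

In this model, a put at timestamp 100 followed by a delete at 100 leaves nothing readable, while a later put at timestamp 200 is readable because no marker carries that timestamp. The old value stays hidden until compact() physically removes it together with its marker.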

From what I understand, we read from the memcache, then from the newest written
HStore files, until we get what we need to answer the query.
But I could be wrong here.

Billy

"John Ryan" <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED]

Ok. Now when I delete a key and then at some later point re-insert the same
key with a different value, would the old values be resuscitated? If not, how
is this enforced? I am looking through the code but like you said it is
complicated :).

JRR
On Mon, Aug 18, 2008 at 6:02 PM, Jim Kellerman <[EMAIL PROTECTED]> wrote:
Comments inline:
> -----Original Message-----
> From: John Ryan [mailto:[EMAIL PROTECTED]]
> Sent: Monday, August 18, 2008 4:49 PM
> To: hbase-user@hadoop.apache.org
> Subject: Deletes in HBase
>
> How do deletes work in HBase? Suppose I have 2 Column Families and a
> key that has entries for both column families. Now I want to delete this
> key.
You mean you want to delete all the values for that row key?
See HTable.deleteAll({byte[]|String})

> Is a major compaction absolutely essential for this key to be deleted?

No. Essentially a record is written that indicates that a cell, row, or
column family has been deleted.

> Where can I follow the code for this operation?

All the following paths are prefixed with org.apache.hadoop.hbase:
client.HTable - eventually creates a Java Proxy which has the API specified
by ipc.HRegionInterface, which figures out which region server to send the
message to. This call will be answered by regionserver.HRegionServer.deleteAll,
which calls regionserver.HRegion.deleteAll for the appropriate region, calling
HRegion.deleteMultiple and HRegion.update, which first appends the change to
the HLog by calling regionserver.HLog.append, and then stores the information
in the HStore(s) for the appropriate families by calling
regionserver.HStore.add, which in turn stores it in the memcache for the
HStore by calling regionserver.Memcache.add, which calls
regionserver.Memcache.add.

Now the change has been persisted to the redo log (HLog) and is cached.
When the cache fills, a cache flush will write the contents of the cache out
to disk and may result in a minor compaction.
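The ordering described above (log first, then cache, then flush when the cache fills) can be sketched in plain Java. This is a toy illustration under stated assumptions, not the HBase implementation: ToyWritePath, the DELETED sentinel, and the entry-count flush threshold are all hypothetical, and the minor compaction that may follow a flush is not modeled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

/** Toy write path: redo log first, then in-memory cache, flush when full. Not HBase code. */
final class ToyWritePath {
    /** Hypothetical sentinel standing in for a delete marker. */
    static final String DELETED = "<deleted>";

    private final int flushThreshold;
    private final List<String> redoLog = new ArrayList<>();                   // stands in for the HLog
    private TreeMap<String, String> cache = new TreeMap<>();                  // stands in for the memcache
    private final List<TreeMap<String, String>> flushed = new ArrayList<>();  // stands in for files on disk

    ToyWritePath(int flushThreshold) {
        this.flushThreshold = flushThreshold;
    }

    /** A delete goes through the same path as any other change. */
    void delete(String key) {
        update(key, DELETED);
    }

    void update(String key, String value) {
        redoLog.add(key + "=" + value);   // 1. append the change to the redo log
        cache.put(key, value);            // 2. then store it in the in-memory cache
        if (cache.size() >= flushThreshold) {
            flush();                      // 3. when the cache fills, write it out
        }
    }

    private void flush() {
        flushed.add(cache);               // the sorted snapshot becomes a new "file"
        cache = new TreeMap<>();          // start again with an empty cache
    }

    int logSize() {
        return redoLog.size();
    }

    int cachedEntries() {
        return cache.size();
    }

    int flushedFileCount() {
        return flushed.size();
    }
}
```

With a threshold of 2, one update followed by one delete on a different key leaves two log entries, an empty cache, and one flushed file. Note that the delete marker is flushed like any other entry; nothing is physically removed at this point.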

> Now I am assuming that major compaction doesn't take place all the time
> since it may be an expensive operation. Having said that how are the reads
> for this key suppressed? Please explain.

Reads are suppressed at the level of HStore and Memcache. They come across
the deleted markers and suppress the results that would otherwise have been
returned.
You would have to follow the call tree for get, getRow, and the various
scanner.next methods to see how this works. It is very complicated.
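As a rough illustration of that suppression, the sketch below reads one row/column from a list of sources ordered newest first (the cache, then files from newest to oldest). It is a toy model, not the actual get/getRow/scanner code: ToyReadPath and Record are hypothetical names, and it assumes a marker hides the version with the same timestamp and never sits in an older source than the value it hides.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Objects;
import java.util.Optional;
import java.util.Set;

/** Toy read path for a single row/column. Not HBase code. */
final class ToyReadPath {
    /** A stored record; a null value means this record is a delete marker. */
    static final class Record {
        final long timestamp;
        final String value;

        private Record(long timestamp, String value) {
            this.timestamp = timestamp;
            this.value = value;
        }

        static Record value(long timestamp, String value) {
            return new Record(timestamp, Objects.requireNonNull(value));
        }

        static Record deleteMarker(long timestamp) {
            return new Record(timestamp, null);
        }
    }

    /**
     * Walks the sources from newest to oldest, remembering every delete marker
     * seen so far, and returns the first value that no remembered marker hides.
     */
    static Optional<String> get(List<List<Record>> sourcesNewestFirst) {
        Set<Long> deletedTimestamps = new HashSet<>();
        for (List<Record> source : sourcesNewestFirst) {
            // Collect this source's markers before looking at its values.
            for (Record r : source) {
                if (r.value == null) deletedTimestamps.add(r.timestamp);
            }
            Record newest = null;
            for (Record r : source) {
                if (r.value == null || deletedTimestamps.contains(r.timestamp)) continue;
                if (newest == null || r.timestamp > newest.timestamp) newest = r;
            }
            if (newest != null) {
                return Optional.of(newest.value); // stop as soon as the query is answered
            }
        }
        return Optional.empty();
    }
}
```

If an old file holds a value at timestamp 100 and a newer file holds a marker at 100, the read returns nothing even though the old value is still on disk. A value written later with timestamp 200 is found in the newest source and returned without consulting the older ones.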
