Hi David and Ryan,

That is very interesting! This makes things much clearer.

Thanks for your help!
Mike

On Jan 31, 2011, at 4:40 PM, Ryan Rawson wrote:

> You are correct, since we do not prune extra version except during
> these major compactions that happen about once a day, if you delete a
> recent version and it exposes an older version, you will see this.
>
> I might consider this a mis-feature. I would encourage you to
> consider using the Delete.deleteColumns() call found here:
> http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Delete.html#deleteColumns(byte[], byte[])
>
> and NOT USE:
>
> http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Delete.html#deleteColumn(byte[], byte[])
>
> Note the only difference between these is the plurality of 'column'.
>
> I hope this helps!
> -ryan
>
> On Mon, Jan 31, 2011 at 4:35 PM, Buttler, David <[email protected]> wrote:
>> The way I understand it is that old versions do not actually disappear until
>> a compaction occurs. A compaction should occur once per day unless you have
>> changed the major compaction settings, or whenever a region splits.
>>
>> Dave
>>
>> -----Original Message-----
>> From: Mike Percy [mailto:[email protected]]
>> Sent: Friday, January 28, 2011 6:10 PM
>> To: [email protected]
>> Subject: Re: Delete reveals older version of a column even when VERSIONS=1
>>
>> Hmm... how does this relate to setting VERSIONS => '1'? By setting # of
>> versions to 1 are we getting some space benefit over say VERSIONS => '10'?
>>
>> Thanks,
>> Mike
>>
>> On Jan 28, 2011, at 5:47 PM, Ryan Rawson wrote:
>>
>>> I would call it 'a surprising, perhaps unexpected consequence of our
>>> storage model'.
>>>
>>> There are 2 types of deletes in hbase, you are doing type (a) "delete
>>> a single version", but you probably want type (b) "delete all versions
>>> in this column"
>>>
>>> On Fri, Jan 28, 2011 at 5:43 PM, Mike Percy <[email protected]> wrote:
>>>> Hi folks,
>>>> I am seeing some unexpected behavior with HBase 0.20.6 when deleting
>>>> columns. Our cluster has been running for some time however we recently
>>>> upgraded from Hbase 0.20.3. The family I am writing to is specified as
>>>> VERSIONS => '1' when doing a describe, yet HBase appears to be maintaining
>>>> several versions of the columns.
>>>>
>>>> Below is a shell session demonstrating the problem. Is this a
>>>> configuration problem, as-designed, or possibly a bug?
>>>>
>>>> Thanks,
>>>> Mike
>>>>
>>>> hbase(main):004:0> put 'table', 'row', 'family:qual', '1'
>>>> 0 row(s) in 0.0110 seconds
>>>> hbase(main):007:0> get 'table', 'row'
>>>> COLUMN                       CELL
>>>>  family:qual                 timestamp=1296264772717, value=1
>>>> 1 row(s) in 0.0080 seconds
>>>> hbase(main):008:0> put 'table', 'row', 'family:qual', '2'
>>>> 0 row(s) in 0.0020 seconds
>>>> hbase(main):009:0> put 'table', 'row', 'family:qual', '3'
>>>> 0 row(s) in 0.0020 seconds
>>>> hbase(main):010:0> get 'table', 'row'
>>>> COLUMN                       CELL
>>>>  family:qual                 timestamp=1296264797169, value=3
>>>> 1 row(s) in 0.0030 seconds
>>>> hbase(main):011:0> delete 'table', 'row', 'family:qual'
>>>> 0 row(s) in 0.0040 seconds
>>>> hbase(main):012:0> get 'table', 'row'
>>>> COLUMN                       CELL
>>>>  family:qual                 timestamp=1296264795365, value=2
>>>> 1 row(s) in 0.0630 seconds
>>>> hbase(main):013:0> delete 'table', 'row', 'family:qual'
>>>> 0 row(s) in 0.0360 seconds
>>>> hbase(main):014:0> get 'table', 'row'
>>>> COLUMN                       CELL
>>>>  family:qual                 timestamp=1296264772717, value=1
>>>> 1 row(s) in 0.0030 seconds
>>>> hbase(main):013:0> delete 'table', 'row', 'family:qual'
>>>> 0 row(s) in 0.0360 seconds
>>>> hbase(main):016:0> get 'table', 'row'
>>>> COLUMN                       CELL
>>>> 0 row(s) in 0.0030 seconds
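
For anyone finding this thread later: the behaviour in the shell session above can be reproduced with a small toy model. This is only a sketch of the storage model as Ryan and Dave describe it (old versions stay on disk, deletes mask them, and pruning to VERSIONS happens only at major compaction). It is not HBase code, and every name in it (ToyColumn, delete_latest_version, delete_all_versions, major_compact, replay) is made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ToyColumn:
    """Toy model of a single column's stored cells. NOT HBase code."""
    max_versions: int = 1
    cells: dict = field(default_factory=dict)            # timestamp -> value
    version_deletes: set = field(default_factory=set)    # type (a) markers
    column_delete_ts: int = -1                           # type (b) marker

    def put(self, ts, value):
        # Every put is kept until a major compaction prunes it.
        self.cells[ts] = value

    def _visible(self):
        # Timestamps not masked by either kind of delete marker, newest first.
        return sorted(
            (ts for ts in self.cells
             if ts not in self.version_deletes and ts > self.column_delete_ts),
            reverse=True,
        )

    def get(self):
        live = self._visible()
        return self.cells[live[0]] if live else None

    def delete_latest_version(self):
        # Type (a): mask only the newest visible version.
        live = self._visible()
        if live:
            self.version_deletes.add(live[0])

    def delete_all_versions(self, now):
        # Type (b): mask every version with a timestamp <= now.
        self.column_delete_ts = now

    def major_compact(self):
        # Assumed behaviour: drop masked cells and markers, keep max_versions.
        keep = self._visible()[: self.max_versions]
        self.cells = {ts: self.cells[ts] for ts in keep}
        self.version_deletes.clear()
        self.column_delete_ts = -1


def replay(delete_all):
    """Three puts, then one delete; returns what get() sees before and after."""
    col = ToyColumn(max_versions=1)
    for ts, value in [(1, "1"), (2, "2"), (3, "3")]:
        col.put(ts, value)
    seen = [col.get()]
    if delete_all:
        col.delete_all_versions(now=4)
    else:
        col.delete_latest_version()
    seen.append(col.get())
    return seen
```

In this model, replay(delete_all=False) sees value 3 and then value 2 after the delete, which matches the session, while replay(delete_all=True) sees nothing after the delete. Running major_compact() between the puts and a single-version delete also leaves nothing to expose, which is why the surprise depends on when the last major compaction ran.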
