Cool! Maybe we can relate that to the client API as well... On the client this is controlled using the Delete object.
o creating a Delete object for a row without specifying anything else will place a family delete marker for each CF.
o all columns of a specific CF are deleted by using deleteFamily(...), which places a family delete marker
o all versions of a column are deleted by using deleteColumns(...), which places a column delete marker
o a specific version of a column is deleted by using deleteColumn(...), which places a delete marker

All of these methods/constructors take a timestamp, which indicates removal of all versions up to and including that version (except for deleteColumn, which is always version specific).

----- Original Message -----
From: Doug Meil <[email protected]>
To: "[email protected]" <[email protected]>; lars hofhansl <[email protected]>
Cc:
Sent: Monday, November 28, 2011 8:08 AM
Subject: Re: How HBase implements delete operations

Thanks Lars, I'll update the docs with this.

On 11/27/11 6:31 PM, "lars hofhansl" <[email protected]> wrote:

>That is correct.
>
>________________________________
> From: yonghu <[email protected]>
>To: [email protected]; lars hofhansl <[email protected]>
>Sent: Sunday, November 27, 2011 12:34 PM
>Subject: Re: How HBase implements delete operations
>
>So, it means that if a row contains 3 column-families. To delete this row,
>the HBase will create three tombstones. Is that right?
>
>Yong
>
>On Sun, Nov 27, 2011 at 8:32 PM, lars hofhansl <[email protected]>
>wrote:
>
>> There are exactly three different types of delete marker:
>>
>> 1. delete
>> 2. delete column
>> 3. delete family
>>
>>
>> #1 is for a specific version of a column
>> #2 is for all versions of a column
>> #3 is for all columns of a particular column family
>>
>> In order to delete an entire row HBase internally places a delete family
>> marker for each column family.
>>
>> -- Lars
>>
>>
>> ----- Original Message -----
>> From: Jahangir Mohammed <[email protected]>
>> To: [email protected]
>> Cc:
>> Sent: Saturday, November 26, 2011 9:02 AM
>> Subject: Re: How HBase implements delete operations
>>
>> Every version is a record for a rowkey. When you say, a row has to be
>> deleted, all the versions of the row have to be deleted and all versions go
>> as a record in file and they should be marked so that when compaction runs,
>> the merged file doesn't contain the deleted records. I am ready to be
>> wronged, but let any committer comment on this. I am too new to HBase.
>>
>> Thanks,
>> Jahangir Mohammed.
>>
>> private void prepareDelete(Delete delete) throws IOException {
>>   // Check to see if this is a deleteRow insert
>>   if(delete.getFamilyMap().isEmpty()){
>>     for(byte [] family : this.htableDescriptor.getFamiliesKeys()){
>>       // Don't eat the timestamp
>>       delete.deleteFamily(family, delete.getTimeStamp());
>>     }
>>   } else {
>>     for(byte [] family : delete.getFamilyMap().keySet()) {
>>       if(family == null) {
>>         throw new NoSuchColumnFamilyException("Empty family is invalid");
>>       }
>>       checkFamily(family);
>>     }
>>   }
>> }
>>
>>
>>
>> On Sat, Nov 26, 2011 at 2:47 AM, yonghu <[email protected]> wrote:
>>
>> > But I just considered about the efficiency. Why HBase does not directly
>> > write a tombstone to row key instead of for each cell?
>> >
>> > regards
>> >
>> > Yong
>> >
>> > On Sat, Nov 26, 2011 at 8:11 AM, Jahangir Mohammed
>> > <[email protected]>wrote:
>> >
>> > > Tombstone. Same as cell.
>> > >
>> > > Thanks,
>> > > Jahangir Mohammed.
>> > >
>> > > On Sat, Nov 26, 2011 at 1:14 AM, yonghu <[email protected]> wrote:
>> > >
>> > > > hello,
>> > > >
>> > > > I read http://hbase.apache.org/book/versions.html and have a question
>> > > > about delete operation. As it mentions, the user can delete a whole row
>> > > > or delete a data version of cell. The delete operation of data version
>> > > > of cell is just to write a tombstone marker for that version. I want to
>> > > > know how about delete a row? Does HBase deletes the row immediately? or
>> > > > use the same strategy as deleting a data version which create a
>> > > > tombstone for that row key? Or create a tombstone for every data
>> > > > version belongs to that row?
>> > > >
>> > > > regards
>> > > >
>> > > > Yong
>> > > >
>> > >
>> >
>>
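
To make the marker semantics discussed in this thread concrete, here is a small self-contained Java model. It is only a sketch and not HBase code: the class DeleteMarkerSketch, its Marker type and the masks() helper are hypothetical names made up for illustration, and column families and qualifiers are plain strings rather than byte arrays. It assumes the rule stated at the top of the thread: family and column markers cover every version up to and including their timestamp, while a plain delete marker covers exactly one version.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified model of the three delete marker types; not HBase code.
final class DeleteMarkerSketch {

    enum Type { DELETE, DELETE_COLUMN, DELETE_FAMILY }

    // One tombstone. The qualifier is null for DELETE_FAMILY markers.
    static final class Marker {
        final Type type;
        final String family;
        final String qualifier;
        final long timestamp;

        Marker(Type type, String family, String qualifier, long timestamp) {
            this.type = type;
            this.family = family;
            this.qualifier = qualifier;
            this.timestamp = timestamp;
        }
    }

    // Models deleteFamily(...): all columns of one column family.
    static Marker deleteFamily(String family, long ts) {
        return new Marker(Type.DELETE_FAMILY, family, null, ts);
    }

    // Models deleteColumns(...): all versions of one column.
    static Marker deleteColumns(String family, String qualifier, long ts) {
        return new Marker(Type.DELETE_COLUMN, family, qualifier, ts);
    }

    // Models deleteColumn(...): one specific version of one column.
    static Marker deleteColumn(String family, String qualifier, long ts) {
        return new Marker(Type.DELETE, family, qualifier, ts);
    }

    // Row delete with no families given: one family marker per column family,
    // in the spirit of the prepareDelete() loop quoted in the thread.
    static List<Marker> deleteRow(List<String> families, long ts) {
        List<Marker> markers = new ArrayList<>();
        for (String family : families) {
            markers.add(deleteFamily(family, ts));
        }
        return markers;
    }

    // Returns true if the marker hides the cell (family, qualifier, cellTs).
    static boolean masks(Marker m, String family, String qualifier, long cellTs) {
        if (!m.family.equals(family)) {
            return false;
        }
        switch (m.type) {
            case DELETE_FAMILY:
                return cellTs <= m.timestamp;
            case DELETE_COLUMN:
                return m.qualifier.equals(qualifier) && cellTs <= m.timestamp;
            default: // DELETE: exactly one version
                return m.qualifier.equals(qualifier) && cellTs == m.timestamp;
        }
    }

    public static void main(String[] args) {
        List<Marker> row = deleteRow(Arrays.asList("cf1", "cf2", "cf3"), 100L);
        System.out.println("markers written for the row delete: " + row.size());
        Marker single = deleteColumn("cf1", "q", 50L);
        System.out.println("version 50 masked: " + masks(single, "cf1", "q", 50L));
        System.out.println("version 49 masked: " + masks(single, "cf1", "q", 49L));
    }
}
```

In this model a row with three column families produces three family markers, which matches the answer given to Yong above. Real HBase stores these markers as KeyValues alongside the data and drops them at major compaction, which the sketch does not attempt to show.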
