[ https://issues.apache.org/jira/browse/HBASE-8753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ted Yu updated HBASE-8753:
--------------------------
Summary: Provide new delete flag which can delete all cells under a
column-family which have designated timestamp (was: Provide new delete flag
which can delete all cells under a column-family which have a same designated
timestamp)
> Provide new delete flag which can delete all cells under a column-family
> which have designated timestamp
> --------------------------------------------------------------------------------------------------------
>
> Key: HBASE-8753
> URL: https://issues.apache.org/jira/browse/HBASE-8753
> Project: HBase
> Issue Type: New Feature
> Components: Deletes, Scanners
> Affects Versions: 0.95.1
> Reporter: Feng Honghua
> Assignee: Feng Honghua
> Attachments: 8753-trunk-V2.patch, 8753-trunk-v4.txt,
> HBASE-8753-0.94-V0.patch, HBASE-8753-0.94-V1.patch,
> HBASE-8753-trunk-V0.patch, HBASE-8753-trunk-V1.patch,
> HBASE-8753-trunk-V3.patch
>
>
> In one of our production scenarios (Xiaomi message search), multiple cells
> are put in a batch under a specific column-family, using the same timestamp
> but different column names.
> After some time these cells also need to be deleted in a batch, given only
> that timestamp. But the column names are parsed tokens, which can be
> arbitrary words, so such a batch delete is impossible without first
> retrieving all KVs from that CF, building the list of columns that have a KV
> with the given timestamp, and then issuing an individual deleteColumn for
> each column in that list.
> Though such a batch delete is possible, its performance is poor, and
> customers also find their code clumsy: it must first retrieve and populate
> the column list and then issue a deleteColumn for each column in it.
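
To make the cost of that workaround concrete, here is a minimal in-memory sketch of the read-then-delete pattern. It does not use the HBase client API: the class ReadThenDeleteSketch, its Cell type and both helper methods are hypothetical stand-ins for one row of one column-family, and the per-column delete is assumed to remove only the version at the exact timestamp.

```java
import java.util.ArrayList;
import java.util.List;

public class ReadThenDeleteSketch {
    /** Hypothetical stand-in for one KV in a single row and column-family. */
    public static final class Cell {
        public final String qualifier;
        public final long timestamp;

        public Cell(String qualifier, long timestamp) {
            this.qualifier = qualifier;
            this.timestamp = timestamp;
        }
    }

    /** Step 1: read every cell in the family and collect the columns written at ts. */
    public static List<String> qualifiersAt(List<Cell> family, long ts) {
        List<String> qualifiers = new ArrayList<>();
        for (Cell c : family) {
            if (c.timestamp == ts) {
                qualifiers.add(c.qualifier);
            }
        }
        return qualifiers;
    }

    /** Step 2: issue one delete per collected column; returns how many were issued. */
    public static int deleteEach(List<Cell> family, List<String> qualifiers, long ts) {
        int issued = 0;
        for (String q : qualifiers) {
            // Assumption: each delete removes only the version at exactly ts.
            family.removeIf(c -> c.qualifier.equals(q) && c.timestamp == ts);
            issued++;
        }
        return issued;
    }

    public static void main(String[] args) {
        List<Cell> family = new ArrayList<>();
        family.add(new Cell("hello", 100L));
        family.add(new Cell("world", 100L));
        family.add(new Cell("hello", 200L));

        List<String> columns = qualifiersAt(family, 100L);   // full read of the family
        int deletes = deleteEach(family, columns, 100L);     // one delete per column
        System.out.println(columns + " -> " + deletes + " deletes, "
                + family.size() + " cell left");
    }
}
```

In this model the number of deletes grows with the number of distinct columns written at the timestamp, and the full read in step 1 cannot be skipped because the column names are not known in advance.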
> This feature resolves the problem by introducing a new delete flag:
> DeleteFamilyVersion.
> 1). When you need to delete all KVs under a column-family with a given
> timestamp, just call Delete.deleteFamilyVersion(cfName, timestamp); only a
> DeleteFamilyVersion type KV is put to HBase (like DeleteFamily / DeleteColumn
> / Delete), without any read operation;
> 2). Like other delete types, DeleteFamilyVersion takes effect in
> get/scan/flush/compact operations: the ScanDeleteTracker now parses out
> DeleteFamilyVersion and uses it to prevent any KV under the specific CF that
> has the same timestamp as the DeleteFamilyVersion KV from appearing in a
> get/scan result (and likewise in flush/compact).
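
The two points above can be illustrated with a small in-memory model. This is only a sketch of the described semantics, not the code in the attached patches: FamilyVersionSketch and all of its members are hypothetical names, and the marker set stands in very loosely for what ScanDeleteTracker tracks for one column-family.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class FamilyVersionSketch {
    /** Hypothetical stand-in for one KV in a single row and column-family. */
    public static final class Cell {
        public final String qualifier;
        public final long timestamp;

        public Cell(String qualifier, long timestamp) {
            this.qualifier = qualifier;
            this.timestamp = timestamp;
        }
    }

    private final List<Cell> cells = new ArrayList<>();
    private final Set<Long> familyVersionMarkers = new HashSet<>();

    public void put(String qualifier, long timestamp) {
        cells.add(new Cell(qualifier, timestamp));
    }

    /** Write path: record a single marker; no stored cell is read or modified. */
    public void deleteFamilyVersion(long timestamp) {
        familyVersionMarkers.add(timestamp);
    }

    /** Read path: hide every cell whose timestamp equals a marker's timestamp. */
    public List<String> scan() {
        List<String> visible = new ArrayList<>();
        for (Cell c : cells) {
            if (!familyVersionMarkers.contains(c.timestamp)) {
                visible.add(c.qualifier + "@" + c.timestamp);
            }
        }
        return visible;
    }

    /** Cells still physically stored (the marker alone does not remove them). */
    public int storedCellCount() {
        return cells.size();
    }

    public static void main(String[] args) {
        FamilyVersionSketch cf = new FamilyVersionSketch();
        cf.put("hello", 100L);
        cf.put("world", 100L);
        cf.put("hello", 200L);

        cf.deleteFamilyVersion(100L);              // one marker, column names never listed
        System.out.println(cf.scan());             // prints [hello@200]
        System.out.println(cf.storedCellCount());  // prints 3
    }
}
```

Only cells at exactly the marker's timestamp are masked, so the later write of "hello" at 200 stays visible. This differs from DeleteFamily, which masks every version at or below its timestamp.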
> Our customers find this feature efficient, clean and easy to use, since it
> does its work without knowing the exact list of column names that need to be
> deleted.
> This feature has been running smoothly for a couple of months in our
> production clusters.