Lars George created HBASE-8784:
----------------------------------
Summary: Wildcard/Range/Partition Delete Support
Key: HBASE-8784
URL: https://issues.apache.org/jira/browse/HBASE-8784
Project: HBase
Issue Type: New Feature
Components: Client, Deletes, regionserver
Reporter: Lars George
We often see use cases where users, for example those with time-series data,
want to delete large ranges of data in one operation, much like dropping a
partition in an RDBMS. We should support regular expressions or range
expressions for matching, and both must work with binary keys.
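To make the matching requirement concrete, here is a minimal sketch of range and regex matching over binary row keys. The class and method names are hypothetical and not part of HBase. It assumes keys are ordered by unsigned lexicographic byte comparison, and that an empty stop key means "no upper bound", as with Scan.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.regex.Pattern;

// Hypothetical helper, not part of HBase: matches binary row keys
// against a key range or a regular expression.
final class KeyMatchers {
    private KeyMatchers() {}

    // True if startInclusive <= key < stopExclusive under unsigned
    // lexicographic byte order (Arrays.compareUnsigned needs Java 9+).
    // An empty stop key is treated as "no upper bound" (assumption).
    static boolean inRange(byte[] key, byte[] startInclusive, byte[] stopExclusive) {
        boolean afterStart = Arrays.compareUnsigned(key, startInclusive) >= 0;
        boolean beforeStop = stopExclusive.length == 0
                || Arrays.compareUnsigned(key, stopExclusive) < 0;
        return afterStart && beforeStop;
    }

    // Regex match over raw bytes: ISO-8859-1 maps every byte to exactly
    // one char, so arbitrary binary keys survive the decode unchanged.
    static boolean matchesRegex(byte[] key, Pattern pattern) {
        String decoded = new String(key, StandardCharsets.ISO_8859_1);
        return pattern.matcher(decoded).matches();
    }
}
```

The unsigned comparison matters for binary keys: with Java's signed bytes, a key starting with 0x80 would otherwise sort before one starting with 0x7f.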
The idea is to store these deletes not inline with the data, but in the
metadata. When we read files, we apply the larger deletes first and then the
inline ones. Of course, this should be reserved for a few, very data-intensive
deletes. It reduces the number of delete markers to write to one instead of
many (often thousands, if not millions). This is different from the
BulkDeleteEndpoint introduced in HBASE-6942, but it should support similar
Scan-based selectivity.
The new range deletes mask all matching data and are otherwise handled like
other deletes: for example, they are dropped during major compactions once all
the data they mask has been dropped too.
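The compaction behaviour could be sketched roughly as follows. This is a simplified model rather than HBase code: `Cell`, `RangeDelete`, `Output`, and `compact` are hypothetical names, and a marker is assumed to mask every cell in its row range whose timestamp is at or below the marker's timestamp. The sketch drops masked cells it sees in either kind of compaction; whether a real minor compaction would do so is left open.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical model of compaction with range delete markers; not HBase code.
final class RangeDeleteCompaction {
    static final class Cell {
        final byte[] row;
        final long timestamp;
        Cell(byte[] row, long timestamp) { this.row = row; this.timestamp = timestamp; }
    }

    // Masks rows in [startRow, stopRow) with cell timestamp <= timestamp (assumption).
    static final class RangeDelete {
        final byte[] startRow;
        final byte[] stopRow;
        final long timestamp;
        RangeDelete(byte[] startRow, byte[] stopRow, long timestamp) {
            this.startRow = startRow; this.stopRow = stopRow; this.timestamp = timestamp;
        }
        boolean masks(Cell c) {
            return c.timestamp <= timestamp
                    && Arrays.compareUnsigned(c.row, startRow) >= 0
                    && Arrays.compareUnsigned(c.row, stopRow) < 0;
        }
    }

    static final class Output {
        final List<Cell> cells;
        final List<RangeDelete> markers;
        Output(List<Cell> cells, List<RangeDelete> markers) { this.cells = cells; this.markers = markers; }
    }

    static Output compact(List<Cell> input, List<RangeDelete> markers, boolean major) {
        List<Cell> kept = new ArrayList<>();
        for (Cell c : input) {
            boolean masked = false;
            for (RangeDelete d : markers) {
                if (d.masks(c)) { masked = true; break; }
            }
            if (!masked) kept.add(c);
        }
        // A major compaction rewrites every file, so no masked cell can
        // survive elsewhere and the markers can be dropped. Otherwise
        // they are retained, since other files may still hold masked data.
        List<RangeDelete> keptMarkers = major
                ? Collections.<RangeDelete>emptyList()
                : new ArrayList<>(markers);
        return new Output(kept, keptMarkers);
    }
}
```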
Still to be discussed is how and where we store the delete entry in practice,
since storing it in the metadata might not be wanted, although it seems like a
reasonable choice. The DeleteTracker can handle these deletes the same way as
the others, with additional checks for wildcards and ranges. If range deletes
are not used, no critical path is affected, so there are no additional
latencies or other regressions.
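As a rough illustration of the read path described above, the sketch below checks the coarse range deletes first and the inline deletes second, and skips the range check entirely when no range deletes exist. `RangeAwareDeleteTracker` is a hypothetical class and does not mirror the actual DeleteTracker interface. Inline deletes are reduced to exact (row, timestamp) markers to keep the example short.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical tracker illustrating the two-level check; not HBase code.
final class RangeAwareDeleteTracker {
    // Masks rows in [startRow, stopRow) with cell timestamp <= timestamp (assumption).
    private static final class RangeDelete {
        final byte[] startRow;
        final byte[] stopRow;
        final long timestamp;
        RangeDelete(byte[] startRow, byte[] stopRow, long timestamp) {
            this.startRow = startRow; this.stopRow = stopRow; this.timestamp = timestamp;
        }
        boolean masks(byte[] row, long cellTimestamp) {
            return cellTimestamp <= timestamp
                    && Arrays.compareUnsigned(row, startRow) >= 0
                    && Arrays.compareUnsigned(row, stopRow) < 0;
        }
    }

    // Coarse deletes, as they might be loaded from file metadata.
    private final List<RangeDelete> rangeDeletes = new ArrayList<>();
    // Simplified stand-in for inline delete markers: exact (row, timestamp) pairs.
    private final Set<String> inlineDeletes = new HashSet<>();

    void addRangeDelete(byte[] startRow, byte[] stopRow, long timestamp) {
        rangeDeletes.add(new RangeDelete(startRow, stopRow, timestamp));
    }

    void addInlineDelete(byte[] row, long timestamp) {
        inlineDeletes.add(key(row, timestamp));
    }

    boolean isDeleted(byte[] row, long timestamp) {
        // Fast path: with no range deletes, only the existing inline check runs.
        if (!rangeDeletes.isEmpty()) {
            for (RangeDelete d : rangeDeletes) {
                if (d.masks(row, timestamp)) return true;
            }
        }
        return inlineDeletes.contains(key(row, timestamp));
    }

    private static String key(byte[] row, long timestamp) {
        return Arrays.toString(row) + "@" + timestamp;
    }
}
```

A linear scan over the range deletes is acceptable here only because the feature is meant for a few very large deletes; a larger number of markers would call for an interval structure instead.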