[
https://issues.apache.org/jira/browse/HBASE-18165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16037663#comment-16037663
]
Andrew Purtell edited comment on HBASE-18165 at 6/5/17 9:49 PM:
----------------------------------------------------------------
[~davelatham]
https://accumulo.apache.org/1.7/accumulo_user_manual.html#_iterator_design
See sections 7.5.1 (Filter) and 7.8 (Compaction-time Iterators). You'd be able
to filter out cells by key predicate. Some of the limitations enumerated in the
doc would hold for an implementation in HBase too, e.g.
{quote}
Iterators will not necessarily see all of the Key-Value pairs in every
invocation. Because compactions often do not rewrite all files (only a subset
of them), it is important that the logic take this into consideration.
[...]
a Combiner that runs over data during compactions might not see all of the
values for a given Key. The Combiner must recognize this and not perform any
function that would be incorrect due to the missing values.
{quote}
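To make the quoted limitation concrete, here is a minimal sketch in plain Java. The class and method names are hypothetical and are not Accumulo or HBase API. It assumes one key whose values 2, 4 and 9 live in three files, with a first compaction that rewrites only two of them. A sum survives that partial view, but an average does not:

```java
import java.util.List;

// Hypothetical illustration only; not Accumulo or HBase API.
class PartialCompactionCombiner {

    // Sum: combining partial sums gives the same result as summing everything.
    static long sum(List<Long> values) {
        long total = 0;
        for (long v : values) {
            total += v;
        }
        return total;
    }

    // Average: averaging partial averages loses the value counts.
    static double average(List<Double> values) {
        double total = 0;
        for (double v : values) {
            total += v;
        }
        return total / values.size();
    }

    public static void main(String[] args) {
        // One key with values 2, 4 and 9 spread over three files.
        // A minor compaction rewrites only the files holding 2 and 4.
        long partialSum = sum(List.of(2L, 4L));
        double partialAvg = average(List.of(2.0, 4.0));

        // A later compaction sees the rewritten value plus the untouched 9.
        System.out.println("sum: " + sum(List.of(partialSum, 9L))
                + " (full: " + sum(List.of(2L, 4L, 9L)) + ")");
        System.out.println("avg: " + average(List.of(partialAvg, 9.0))
                + " (full: " + average(List.of(2.0, 4.0, 9.0)) + ")");
    }
}
```

The sum comes out as 15 either way, while the average of averages is 6.0 against a true average of 5.0. A key-predicate filter does not have this problem, since dropping a cell depends only on that cell.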
> Predicate based deletion during major compactions
> -------------------------------------------------
>
> Key: HBASE-18165
> URL: https://issues.apache.org/jira/browse/HBASE-18165
> Project: HBase
> Issue Type: Brainstorming
> Reporter: Lars Hofhansl
>
> In many cases it is expensive to place a delete per version, column, or
> family.
> HBase should have a way to specify a predicate and remove all Cells matching
> the predicate during the next compactions (major and minor).
> Nothing more concrete. The tricky part would be to know when it is safe to
> remove the predicate, i.e. when we can be sure that all Cells matching the
> predicate actually have been removed.
> Could potentially use HBASE-12859 for that.
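
One way to picture the "when is it safe to remove the predicate" question is the toy model below, in plain Java. Everything in it is hypothetical (`PredicateDeleteSketch`, `StoreFile`, `registerPredicate`, `compact` are not HBase classes or methods), and it does not model HBASE-12859. It assumes the predicate only covers files that existed when it was registered, and it ignores the memstore. The predicate is retired once every such file has been rewritten by some compaction:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;

// Hypothetical model; none of these types are HBase classes.
class PredicateDeleteSketch {

    // A store file reduced to an id and its row keys.
    static final class StoreFile {
        final int id;
        final List<String> rows;
        StoreFile(int id, List<String> rows) { this.id = id; this.rows = rows; }
    }

    private final List<StoreFile> files = new ArrayList<>();
    private final Set<Integer> pending = new HashSet<>(); // files not yet rewritten
    private Predicate<String> deletePredicate;            // null when none is active
    private int nextId = 0;

    void flush(List<String> rows) {
        files.add(new StoreFile(nextId++, new ArrayList<>(rows)));
    }

    // Assumption: the predicate covers only files present at registration time.
    void registerPredicate(Predicate<String> p) {
        pending.clear();
        for (StoreFile f : files) {
            pending.add(f.id);
        }
        deletePredicate = pending.isEmpty() ? null : p;
    }

    // Rewrites the selected files into one new file, dropping matching rows
    // from files that are still covered by the predicate.
    void compact(Set<Integer> ids) {
        List<String> merged = new ArrayList<>();
        List<StoreFile> remaining = new ArrayList<>();
        for (StoreFile f : files) {
            if (!ids.contains(f.id)) {
                remaining.add(f);
                continue;
            }
            boolean covered = pending.remove(f.id);
            for (String row : f.rows) {
                if (covered && deletePredicate.test(row)) {
                    continue; // cell removed by the predicate
                }
                merged.add(row);
            }
        }
        Collections.sort(merged);
        remaining.add(new StoreFile(nextId++, merged));
        files.clear();
        files.addAll(remaining);
        // Safe to retire once every covered file has been rewritten.
        if (pending.isEmpty()) {
            deletePredicate = null;
        }
    }

    boolean isPredicateActive() { return deletePredicate != null; }

    List<String> allRows() {
        List<String> out = new ArrayList<>();
        for (StoreFile f : files) {
            out.addAll(f.rows);
        }
        return out;
    }

    public static void main(String[] args) {
        PredicateDeleteSketch store = new PredicateDeleteSketch();
        store.flush(List.of("a1", "tmp1"));  // file 0
        store.flush(List.of("b1", "tmp2"));  // file 1
        store.flush(List.of("c1", "tmp3"));  // file 2
        store.registerPredicate(row -> row.startsWith("tmp"));

        store.compact(Set.of(0, 1));         // minor: file 2 untouched
        System.out.println(store.isPredicateActive() + " " + store.allRows());

        store.compact(Set.of(2, 3));         // rewrites the last covered file
        System.out.println(store.isPredicateActive() + " " + store.allRows());
    }
}
```

After the first (minor) compaction the predicate has to stay, because file 2 still holds a matching row. Only after the second compaction has every covered file been rewritten, so the predicate can be dropped. A real implementation would also need to decide how writes that arrive after registration are treated, which this sketch sidesteps.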