Bryan Beaudreault created HBASE-27056:
-----------------------------------------
Summary: Add DeleteSortReducer to HFileOutputFormat2
Key: HBASE-27056
URL: https://issues.apache.org/jira/browse/HBASE-27056
Project: HBase
Issue Type: Improvement
Reporter: Bryan Beaudreault
Currently, if you want to bulk load Deletes, you need to find some way to
create KeyValues instead, because HFileOutputFormat2 doesn't support Delete as
a map output value class. A savvy user will realize that you can first create
your Delete, then use {{delete.cellScanner()}} to get at its KeyValues and
write those out.
This workaround feels buried compared with the ability to bulk load Puts
directly.
Additionally, KeyValue is an IA.Private class, and to make this work you need
to at the very least call {{job.setMapOutputValueClass(KeyValue.class)}}. It
seems a bit wrong to make the user import an IA.Private class for this.
We can make HFileOutputFormat2 accept Deletes directly. A DeleteSortReducer can
do the work of breaking the Deletes up into KeyValues, as PutSortReducer does
for Puts.
We could also use this opportunity to add support for full row deletes. If
DeleteSortReducer sees a Delete whose {{getFamilyMap()}} is empty, it can look
up the current families for the table.
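
As a rough illustration of the reducer idea, here is a plain-Java sketch with
no HBase dependency. Marker and RowDelete are hypothetical stand-ins for
KeyValue and Delete, String comparison stands in for HBase's byte-wise cell
ordering, and the family list is passed in rather than looked up from the
table. It is only meant to show the shape of the logic, not the proposed
implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

final class DeleteSortSketch {

    /** Hypothetical stand-in for one delete marker cell (not HBase's KeyValue). */
    static final class Marker {
        final String row;
        final String family;
        final String qualifier; // "" models a family-level marker
        final long timestamp;

        Marker(String row, String family, String qualifier, long timestamp) {
            this.row = row;
            this.family = family;
            this.qualifier = qualifier;
            this.timestamp = timestamp;
        }

        @Override
        public String toString() {
            return row + "/" + family + ":" + qualifier + "@" + timestamp;
        }
    }

    /** Hypothetical stand-in for a Delete: a row plus a family -> markers map. */
    static final class RowDelete {
        final String row;
        final long timestamp;
        final Map<String, List<Marker>> familyMap = new TreeMap<>();

        RowDelete(String row, long timestamp) {
            this.row = row;
            this.timestamp = timestamp;
        }

        RowDelete addColumn(String family, String qualifier) {
            familyMap.computeIfAbsent(family, f -> new ArrayList<>())
                     .add(new Marker(row, family, qualifier, timestamp));
            return this;
        }
    }

    // Simplified cell order: row, family, qualifier ascending, newest timestamp first.
    static final Comparator<Marker> ORDER = (a, b) -> {
        int c = a.row.compareTo(b.row);
        if (c == 0) c = a.family.compareTo(b.family);
        if (c == 0) c = a.qualifier.compareTo(b.qualifier);
        if (c == 0) c = Long.compare(b.timestamp, a.timestamp);
        return c;
    };

    /** Breaks deletes up into sorted markers, expanding full row deletes per family. */
    static List<Marker> toSortedMarkers(List<RowDelete> deletes, List<String> tableFamilies) {
        TreeSet<Marker> sorted = new TreeSet<>(ORDER); // sorted set also collapses duplicates
        for (RowDelete d : deletes) {
            if (d.familyMap.isEmpty()) {
                // Full row delete: emit one family-level marker per known family.
                for (String family : tableFamilies) {
                    sorted.add(new Marker(d.row, family, "", d.timestamp));
                }
            } else {
                for (List<Marker> markers : d.familyMap.values()) {
                    sorted.addAll(markers);
                }
            }
        }
        return new ArrayList<>(sorted);
    }

    public static void main(String[] args) {
        List<RowDelete> deletes = new ArrayList<>();
        deletes.add(new RowDelete("row2", 100L)); // no families: full row delete
        deletes.add(new RowDelete("row1", 100L).addColumn("cf", "q1"));
        // prints [row1/cf:q1@100, row2/a:@100, row2/cf:@100]
        System.out.println(toSortedMarkers(deletes, Arrays.asList("a", "cf")));
    }
}
```

In a real reducer the family list would presumably come from the table
descriptor, and the output would be real delete cells in the order the HFile
writer expects.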