Bryan Beaudreault created HBASE-27056:
-----------------------------------------

             Summary: Add DeleteSortReducer to HFileOutputFormat2
                 Key: HBASE-27056
                 URL: https://issues.apache.org/jira/browse/HBASE-27056
             Project: HBase
          Issue Type: Improvement
            Reporter: Bryan Beaudreault


Currently, if you want to bulk load Deletes, you need to find some way to 
create KeyValues instead, because HFileOutputFormat2 doesn't support Delete as 
a map output value class. A savvy user will realize that you can first create 
your Delete, then use {{delete.cellScanner()}} to access its KeyValues and 
write those out.

This feels a little buried next to the ability to bulk load Puts directly. 
Additionally, KeyValue is an IA.Private class, and to make this work you need 
to at the very least call 
{{job.setMapOutputValueClass(KeyValue.class)}}. It seems a bit wrong to 
make the user import an IA.Private class for this.

We can make HFileOutputFormat2 accept Deletes directly. A DeleteSortReducer 
can do the work of breaking the Deletes up into KeyValues, as PutSortReducer 
does for Puts.
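
As a rough illustration of the shape of that work, here is a stand-alone 
sketch that does not use HBase types. SketchCell and SketchDelete are 
hypothetical stand-ins for KeyValue and Delete, and the ordering is a 
simplification (strings instead of bytes; newest timestamp first within a 
column). It only shows the idea: gather every cell from every Delete for a 
row into a sorted set, then emit the cells in order.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;

// Hypothetical stand-in for a KeyValue; not an HBase class.
final class SketchCell {
    final String row;
    final String family;
    final String qualifier;
    final long timestamp;

    SketchCell(String row, String family, String qualifier, long timestamp) {
        this.row = row;
        this.family = family;
        this.qualifier = qualifier;
        this.timestamp = timestamp;
    }

    @Override
    public String toString() {
        return row + "/" + family + ":" + qualifier + "@" + timestamp;
    }
}

// Hypothetical stand-in for a Delete; not an HBase class.
final class SketchDelete {
    final String row;
    final List<SketchCell> cells = new ArrayList<>();

    SketchDelete(String row) {
        this.row = row;
    }

    SketchDelete addColumn(String family, String qualifier, long timestamp) {
        cells.add(new SketchCell(row, family, qualifier, timestamp));
        return this;
    }
}

final class DeleteSortSketch {
    // Simplified ordering: row, family, qualifier, then newest timestamp first.
    static final Comparator<SketchCell> ORDER =
        Comparator.comparing((SketchCell c) -> c.row)
            .thenComparing(c -> c.family)
            .thenComparing(c -> c.qualifier)
            .thenComparing(
                Comparator.comparingLong((SketchCell c) -> c.timestamp).reversed());

    // Break every Delete into its cells and return them in sorted order.
    // Cells that compare equal under ORDER collapse to one entry.
    static List<SketchCell> reduce(List<SketchDelete> deletesForRow) {
        TreeSet<SketchCell> sorted = new TreeSet<>(ORDER);
        for (SketchDelete delete : deletesForRow) {
            sorted.addAll(delete.cells);
        }
        return new ArrayList<>(sorted);
    }
}
```

A real reducer would work on HBase's own cell types and comparator and write 
to the job context instead of returning a list; those details are left out 
here.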

We could also use this moment to add support for full row deletes. If 
DeleteSortReducer sees a Delete without any getFamilyMap() values, it can look 
up the current families for the table.
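
A minimal sketch of that expansion, again with plain Java stand-ins and not 
HBase types. The class name, the marker strings, and the tableFamilies 
parameter are all hypothetical; in a real reducer the family list would have 
to come from the table's descriptor, and how it gets there is not decided 
here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class RowDeleteSketch {
    // familyMap: family -> qualifiers named by the Delete.
    // An empty map is treated as "delete the whole row".
    static List<String> expand(String row,
                               Map<String, List<String>> familyMap,
                               List<String> tableFamilies) {
        List<String> markers = new ArrayList<>();
        if (familyMap.isEmpty()) {
            // Full row delete: one family-level marker per family in the table.
            for (String family : tableFamilies) {
                markers.add(row + "/" + family + " DeleteFamily");
            }
            return markers;
        }
        // Otherwise emit only what the Delete named, in family order.
        for (Map.Entry<String, List<String>> e : new TreeMap<>(familyMap).entrySet()) {
            for (String qualifier : e.getValue()) {
                markers.add(row + "/" + e.getKey() + ":" + qualifier + " Delete");
            }
        }
        return markers;
    }
}
```

For example, expand("r1", Map.of(), List.of("cf1", "cf2")) yields one 
family-level marker for cf1 and one for cf2, while a Delete that names 
columns is passed through unchanged.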



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
