[ https://issues.apache.org/jira/browse/HBASE-3967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13049511#comment-13049511 ]

Bogdan-Alexandru Matican commented on HBASE-3967:
-------------------------------------------------

Ok, so I think I've managed to make this work. However, I couldn't simply 
abstract upward and use Row directly as the mapper output type, because of the 
following lines (around line 844) in "org.apache.hadoop.mapred.MapTask":

      if (key.getClass() != keyClass) {
        throw new IOException("Type mismatch in key from map: expected "
                              + keyClass.getName() + ", recieved "
                              + key.getClass().getName());
      }

and the corresponding check for the value.

This meant that even if I tried to pass a Put or a Delete as a Row when writing 
to the map context, it would fail this check. So I created an abstraction that 
acts as a union of _either_ a Put or a Delete and can be constructed from 
either one.
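
Roughly, the wrapper I have in mind looks something like the sketch below. This 
is just a sketch against the current Writable-based Put/Delete APIs; the class 
name PutOrDelete and its accessors are placeholders, not the actual patch.

import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;

import org.apache.hadoop.hbase.client.Delete;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.io.Writable;

/**
 * Union wrapper: holds either a Put or a Delete, so the job can declare one
 * concrete class as the map output value type and still emit both kinds of
 * mutations.
 */
public class PutOrDelete implements Writable {
  private boolean isPut;
  private Put put;
  private Delete delete;

  public PutOrDelete() {
    // no-arg constructor needed for Writable deserialization
  }

  public PutOrDelete(Put put) {
    this.isPut = true;
    this.put = put;
  }

  public PutOrDelete(Delete delete) {
    this.isPut = false;
    this.delete = delete;
  }

  public boolean isPut() {
    return isPut;
  }

  public Put getPut() {
    return put;
  }

  public Delete getDelete() {
    return delete;
  }

  @Override
  public void write(DataOutput out) throws IOException {
    // a one-byte tag tells the reader which branch of the union follows
    out.writeBoolean(isPut);
    if (isPut) {
      put.write(out);
    } else {
      delete.write(out);
    }
  }

  @Override
  public void readFields(DataInput in) throws IOException {
    isPut = in.readBoolean();
    if (isPut) {
      put = new Put();
      put.readFields(in);
    } else {
      delete = new Delete();
      delete.readFields(in);
    }
  }
}

The job then declares this wrapper as the map output value class, so the type 
check above passes, and the reduce side calls isPut() to unwrap whichever 
mutation was emitted.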

> Support deletes in HFileOutputFormat based bulk import mechanism
> ----------------------------------------------------------------
>
>                 Key: HBASE-3967
>                 URL: https://issues.apache.org/jira/browse/HBASE-3967
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Kannan Muthukkaruppan
>
> During bulk imports, it'll be useful to be able to apply delete mutations 
> (either to delete data that already exists in HBase or data that was inserted 
> earlier during this run of the import). 
> For example, we have a use case where we are processing a log of data that 
> may have both inserts and deletes in the mix, and we want to upload it into 
> HBase using the bulk import mechanism.
