[
https://issues.apache.org/jira/browse/HBASE-3967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13049511#comment-13049511
]
Bogdan-Alexandru Matican commented on HBASE-3967:
-------------------------------------------------
Ok, so I think I've managed to make this work. However, I couldn't simply
abstract up and use Row directly as the mapper output type, because of the
following lines in org.apache.hadoop.mapred.MapTask (around line 844):
if (key.getClass() != keyClass) {
  throw new IOException("Type mismatch in key from map: expected "
      + keyClass.getName() + ", recieved "
      + key.getClass().getName());
}
and the corresponding check for the value class.
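To make that concrete, here is a minimal sketch (not from my patch; the mapper
name, input types and column names are made up) of the kind of job this check
rules out, assuming the driver also declares Row via
job.setMapOutputValueClass(Row.class):

import java.io.IOException;

import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Row;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// A mapper declared with Row as its output value type. MapTask compares the
// runtime class of each emitted value against the declared map output value
// class with !=, not isAssignableFrom, so the first Put (or Delete) written
// below trips the IOException from the check quoted above.
public class RowEmittingMapper
    extends Mapper<LongWritable, Text, ImmutableBytesWritable, Row> {

  @Override
  protected void map(LongWritable offset, Text line, Context context)
      throws IOException, InterruptedException {
    byte[] rowKey = Bytes.toBytes(line.toString());
    Put put = new Put(rowKey);
    put.add(Bytes.toBytes("f"), Bytes.toBytes("q"), Bytes.toBytes("v"));
    // Runtime class is Put, declared map output value class is Row.
    context.write(new ImmutableBytesWritable(rowKey), put);
  }
}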
This meant that even if I tried to pass a Put or a Delete as a Row when writing
to the map context, it would fail at this check. As such, I just created an
abstraction that acts as a union of _either_ a Put or a Delete and can be
built from either one, roughly sketched below.
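Something along these lines (a simplified sketch, not the exact class in my
patch; the name PutOrDelete and the Writable delegation are illustrative,
relying on the fact that Put and Delete are themselves Writable in 0.90):

import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;

import org.apache.hadoop.hbase.client.Delete;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Row;
import org.apache.hadoop.io.Writable;

// Tagged union over the two mutation types: the mapper declares PutOrDelete
// as its output value class, so MapTask's exact-class check passes no matter
// which kind of mutation a given input record turns into.
public class PutOrDelete implements Writable {
  private Put put;        // non-null when this wraps a Put
  private Delete delete;  // non-null when this wraps a Delete

  public PutOrDelete() {}  // no-arg constructor required for Writable deserialization

  public PutOrDelete(Put put)       { this.put = put; }
  public PutOrDelete(Delete delete) { this.delete = delete; }

  public boolean isPut()            { return put != null; }

  // Expose the wrapped mutation through the common Row interface.
  public Row get()                  { return isPut() ? put : delete; }

  @Override
  public void write(DataOutput out) throws IOException {
    out.writeBoolean(isPut());   // tag: which arm of the union follows
    if (isPut()) {
      put.write(out);            // delegate to the wrapped mutation's Writable
    } else {
      delete.write(out);
    }
  }

  @Override
  public void readFields(DataInput in) throws IOException {
    if (in.readBoolean()) {
      put = new Put();
      put.readFields(in);
    } else {
      delete = new Delete();
      delete.readFields(in);
    }
  }
}

The mapper then wraps whichever mutation a record produces and emits the
wrapper, and whatever consumes the map output unwraps it via get() and handles
the Put and Delete cases separately.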
> Support deletes in HFileOutputFormat based bulk import mechanism
> ----------------------------------------------------------------
>
> Key: HBASE-3967
> URL: https://issues.apache.org/jira/browse/HBASE-3967
> Project: HBase
> Issue Type: Improvement
> Reporter: Kannan Muthukkaruppan
>
> During bulk imports, it'll be useful to be able to do delete mutations
> (either to delete data that already exists in HBase or that was inserted
> earlier during this run of the import).
> For example, we have a use case where we are processing a log of data that
> may have both inserts and deletes in the mix, and we want to upload it into
> HBase using the bulk import mechanism.