[ 
https://issues.apache.org/jira/browse/HBASE-5604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13234485#comment-13234485
 ] 

Lars Hofhansl commented on HBASE-5604:
--------------------------------------

This is a part of HBase I am not that familiar with.
Why is this in principle different from ImportTsv?
I guess it is because each mapper can encounter WALEdits for many tables in the 
HLog file(s) that it works on...?
In the end, though, it would be a reducer writing the HFiles, so the 
distribution of HLogs to mappers should not matter. I think.

Hmm... Maybe this is only useful when we have a *lot* of logs to replay such as 
in a point in time recovery scenario using HLogs.
Or maybe there would be no advantage in turning this into an M/R job, and 
it should just be a standalone client...?
                
> HLog replay tool that generates HFiles for use by LoadIncrementalHFiles.
> ------------------------------------------------------------------------
>
>                 Key: HBASE-5604
>                 URL: https://issues.apache.org/jira/browse/HBASE-5604
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: Lars Hofhansl
>
> Just an idea I had. Might be useful for restore of a backup using the HLogs.
> This could be an M/R job (with a mapper per HLog file).
> The tool would get a timerange and a (set of) table(s). We'd pick the right 
> HLogs based on time before the M/R job is started and then have a mapper per 
> HLog file.
> The mapper would then go through the HLog, filter out all WALEdits that 
> don't fall into the time range or don't belong to any of the tables, and 
> then use HFileOutputFormat to generate HFiles.
> Would need to indicate the splits we want, probably from a live table.
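
The filtering step the description gives to each mapper could look roughly like 
the sketch below. It is only an illustration and does not use the real HBase 
classes: `Entry` is a hypothetical stand-in for a log entry (table name plus 
write timestamp), `accept` and `filter` are made-up helper names, and treating 
the time range as half-open (start inclusive, end exclusive) is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class HLogFilterSketch {

    /** Hypothetical stand-in for one log entry; not an HBase class. */
    public static final class Entry {
        public final String table;
        public final long timestamp;

        public Entry(String table, long timestamp) {
            this.table = table;
            this.timestamp = timestamp;
        }
    }

    /**
     * Keep an entry only if it belongs to one of the requested tables and
     * its timestamp lies in [minTs, maxTs). The half-open range is an assumption.
     */
    public static boolean accept(Entry e, Set<String> tables, long minTs, long maxTs) {
        return tables.contains(e.table) && e.timestamp >= minTs && e.timestamp < maxTs;
    }

    /** Apply the predicate to every entry of one log, preserving log order. */
    public static List<Entry> filter(List<Entry> log, Set<String> tables,
                                     long minTs, long maxTs) {
        List<Entry> kept = new ArrayList<>();
        for (Entry e : log) {
            if (accept(e, tables, minTs, maxTs)) {
                kept.add(e);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        Set<String> tables = new HashSet<>(Arrays.asList("t1"));
        List<Entry> log = Arrays.asList(
                new Entry("t1", 50L),    // too early
                new Entry("t1", 150L),   // kept
                new Entry("t2", 150L),   // wrong table
                new Entry("t1", 200L));  // at the exclusive upper bound
        List<Entry> kept = filter(log, tables, 100L, 200L);
        System.out.println(kept.size() + " of " + log.size() + " entries kept");
    }
}
```

In an actual job the kept edits would be handed on towards HFileOutputFormat 
rather than collected in a list; the sketch only covers the table and time 
range check.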

