[
https://issues.apache.org/jira/browse/HBASE-14417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15732531#comment-15732531
]
Ted Yu commented on HBASE-14417:
--------------------------------
More response to Vlad's review comment w.r.t. fault tolerance in bulk load.
When a bulk load fails midway, the user should provide the complete set of
hfiles again because the staging directory is not exposed to end users.
With this in mind, the benefit of using another hook (prior to
postBulkLoadHFile()) to persist the location of bulk loaded hfiles is minimal,
since in subsequent bulk load attempt(s) the same set of (source) hfiles would
be loaded again.
Another factor is that the more writes there are to the hbase:backup table, the
higher the chance of a write failure.
One optimization we can do in the future is to combine the writes (performed in
postBulkLoadHFile()) from several regions on the same region server, provided
that these writes are sufficiently close together (e.g. within 300 ms of each
other). The completion of a bulk load on a single region server is determined
by the slowest participating region, so this optimization would keep the
response time on par with the current implementation (where the hbase:backup
table is not involved).
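To make the proposed batching concrete, here is a minimal sketch of how
per-region records could be coalesced into one write. It is not taken from any
patch on this issue: the class BulkLoadWriteCoalescer, its methods, the
"region:path" record format and the example paths are all hypothetical, the sink
stands in for a single write to hbase:backup, and the clock is injected only so
the 300 ms window can be exercised without sleeping.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.LongSupplier;

/** Hypothetical helper: groups per-region bulk load records into one write. */
class BulkLoadWriteCoalescer {
    private final long windowMs;                 // e.g. 300 ms, as suggested above
    private final LongSupplier clock;            // injected so the demo needs no sleeping
    private final Consumer<List<String>> sink;   // stands in for one hbase:backup write
    private final List<String> pending = new ArrayList<>();
    private long firstPendingAt;

    BulkLoadWriteCoalescer(long windowMs, LongSupplier clock,
                           Consumer<List<String>> sink) {
        this.windowMs = windowMs;
        this.clock = clock;
        this.sink = sink;
    }

    /** Called once per region when that region finishes its part of the load. */
    synchronized void record(String region, String hfilePath) {
        long now = clock.getAsLong();
        // A record arriving after the window has closed starts a new batch.
        if (!pending.isEmpty() && now - firstPendingAt > windowMs) {
            flush();
        }
        if (pending.isEmpty()) {
            firstPendingAt = now;
        }
        pending.add(region + ":" + hfilePath);
    }

    /** Emits everything pending as a single batch; does nothing if empty. */
    synchronized void flush() {
        if (pending.isEmpty()) {
            return;
        }
        sink.accept(new ArrayList<>(pending));
        pending.clear();
    }

    public static void main(String[] args) {
        long[] now = {0};
        List<List<String>> writes = new ArrayList<>();
        BulkLoadWriteCoalescer c =
            new BulkLoadWriteCoalescer(300, () -> now[0], writes::add);

        c.record("region-a", "/staging/f1");   // opens a window at t=0
        now[0] = 120;
        c.record("region-b", "/staging/f2");   // within 300 ms: same batch
        now[0] = 900;
        c.record("region-c", "/staging/f3");   // window expired: new batch
        c.flush();

        System.out.println(writes.size() + " writes for 3 regions");
    }
}
```

In this sketch a batch is only emitted when a later record arrives or flush()
is called explicitly. A real implementation would also need a timer to flush
when the window expires, and would have to hold the bulk load response until
the batch containing its record has been written.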
> Incremental backup and bulk loading
> -----------------------------------
>
> Key: HBASE-14417
> URL: https://issues.apache.org/jira/browse/HBASE-14417
> Project: HBase
> Issue Type: New Feature
> Affects Versions: 2.0.0
> Reporter: Vladimir Rodionov
> Assignee: Ted Yu
> Priority: Critical
> Labels: backup
> Fix For: 2.0.0
>
> Attachments: 14417-tbl-ext.v10.txt, 14417-tbl-ext.v9.txt,
> 14417.v1.txt, 14417.v11.txt, 14417.v13.txt, 14417.v2.txt, 14417.v21.txt,
> 14417.v23.txt, 14417.v24.txt, 14417.v25.txt, 14417.v6.txt
>
>
> Currently, incremental backup is based on WAL files. Bulk data loading
> bypasses WALs for obvious reasons, breaking incremental backups. The only way
> to continue backups after bulk loading is to create a new full backup of the
> table. This may not be feasible for customers who do bulk loading regularly
> (say, every day).
> Google doc for design:
> https://docs.google.com/document/d/1ACCLsecHDvzVSasORgqqRNrloGx4mNYIbvAU7lq5lJE
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)