[
https://issues.apache.org/jira/browse/HBASE-7360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13590715#comment-13590715
]
Jonathan Hsieh commented on HBASE-7360:
---------------------------------------
From a testing point of view, the backport Matteo has provided should be the
version that CDH has been system testing against a real cluster since
December. Matteo can describe his rig in more detail (he focused on chains of
snapshots from tables that were themselves restored or cloned from snapshots).
My rig tested the online snapshot in the presence of "interference" and load.
It has a PE job running that writes a bunch of data. While the PE is running,
we'd take 30-50 snapshots, and then clone them all. We usually get the large
majority of the snapshots taken and cloned (28 of 30). We also inject RS
kills, Master kills, and forced compactions as snapshots are being taken and as
they are being cloned. Master kills affect the success rate the most --
because the master takes time to failover and we go through a few aborted
master timeout cycles before being able to succeed again. In all of these
cases, the RS/HM never permanently hung during snapshotting or cloning. We did
encounter some master hangs during disable (which I believe is a separate
issue). These tests were run for weeks over a 5 node physical cluster.
Larger versions of online snapshot testing (excluding the fault injection) were
also run against a 20 node cluster and a 100 node cluster.
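For readers who want a picture of the snapshot-then-clone loop described above, here is a minimal sketch. It is not the actual rig: SnapshotAdmin is a hypothetical stand-in interface (its two methods only mirror the snapshot and clone operations by name), FlakyAdmin is a fake that fails at an arbitrary interval to imitate injected faults, and the table and snapshot names are made up. The load job and the RS/Master kills are left out.

```java
import java.util.ArrayList;
import java.util.List;

public class SnapshotRigSketch {

    /** Hypothetical stand-in for an admin client; not the real HBase API. */
    public interface SnapshotAdmin {
        void snapshot(String snapshotName, String tableName) throws Exception;
        void cloneSnapshot(String snapshotName, String newTableName) throws Exception;
    }

    /** Counters for one run of the loop. */
    public static final class Result {
        public final int taken;
        public final int cloned;

        Result(int taken, int cloned) {
            this.taken = taken;
            this.cloned = cloned;
        }
    }

    /** Fake admin that fails every Nth snapshot call to imitate injected faults. */
    public static final class FlakyAdmin implements SnapshotAdmin {
        private final int failEvery;
        private int snapshotCalls = 0;

        public FlakyAdmin(int failEvery) {
            this.failEvery = failEvery;
        }

        @Override
        public void snapshot(String snapshotName, String tableName) throws Exception {
            snapshotCalls++;
            if (snapshotCalls % failEvery == 0) {
                throw new Exception("simulated snapshot failure: " + snapshotName);
            }
        }

        @Override
        public void cloneSnapshot(String snapshotName, String newTableName) {
            // Clones always succeed in this fake.
        }
    }

    /** Try to take `attempts` snapshots, then clone every snapshot that succeeded. */
    public static Result run(SnapshotAdmin admin, String table, int attempts) {
        List<String> taken = new ArrayList<>();
        for (int i = 0; i < attempts; i++) {
            String name = table + "-snap-" + i;
            try {
                admin.snapshot(name, table);
                taken.add(name);
            } catch (Exception e) {
                // A failed snapshot counts as a miss; the run continues.
            }
        }
        int cloned = 0;
        for (String name : taken) {
            try {
                admin.cloneSnapshot(name, name + "-clone");
                cloned++;
            } catch (Exception e) {
                // A failed clone also counts as a miss.
            }
        }
        return new Result(taken.size(), cloned);
    }

    public static void main(String[] args) {
        // The failure interval of 15 is arbitrary and only exercises the miss path.
        Result r = run(new FlakyAdmin(15), "testtable", 30);
        System.out.println("taken=" + r.taken + " cloned=" + r.cloned + " attempts=30");
    }
}
```

The point of the structure is that a failed snapshot or clone is recorded and skipped rather than aborting the run, which is what lets a success rate be reported while faults are being injected.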
> Snapshot 0.94 Backport
> -----------------------
>
> Key: HBASE-7360
> URL: https://issues.apache.org/jira/browse/HBASE-7360
> Project: HBase
> Issue Type: New Feature
> Components: snapshots
> Affects Versions: 0.94.3
> Reporter: Matteo Bertozzi
> Assignee: Matteo Bertozzi
> Fix For: 0.94.7
>
> Attachments: 7360-v1.patch, HBASE-7360-v0.patch
>
>
> Backport snapshot code to 0.94
> The main changes needed to backport the snapshot are related to the protobuf
> vs writable rpc.
> Offline Snapshot
> * HBASE-6610 - HFileLink: Hardlink alternative for snapshot restore
> * HBASE-6765 - Take a Snapshot interface
> * HBASE-6571 - Generic multi-thread/cross-process error handling framework
> * HBASE-6353 - Snapshots shell
> * HBASE-7107 - Snapshot References Utils (FileSystem Visitor)
> * HBASE-6863 - Offline snapshots
> * HBASE-6865 - Snapshot File Cleaners
> * HBASE-6777 - Snapshot Restore Interface
> * HBASE-6230 - Clone/Restore Snapshots
> * HBASE-6802 - Export Snapshot
> * HBASE-7864 - Rename HMaster#listSnapshots as getCompletedSnapshots()
> * HBASE-7858 - cleanup before merging snapshots branch to trunk
> * HBASE-7969 - Rename HBaseAdmin#getCompletedSnapshots as
> HBaseAdmin#listSnapshots