[
https://issues.apache.org/jira/browse/HBASE-15321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15172375#comment-15172375
]
churro morales commented on HBASE-15321:
----------------------------------------
Use case:
Jobs made regionservers slow, and slow regionservers made jobs slow.
Jobs took up quite a bit of regionserver resources, e.g. RS heap, handlers,
etc. We had jobs that did full table scans over a really large table with
lots of regions and store files. HBase snapshots were quite slow on our large
cluster: even with skip flush and manifests, they took around 20 minutes to
snapshot this table. This cluster was also taking quite a bit of writes and
serving random reads, so the main goal was to reduce the influence these jobs
had on cluster resources.
HDFS snapshots are O(1) operations. Thus for our jobs, we took a snapshot in
setup, ran the job over the HDFS snapshot, and then deleted the snapshot after
the job completed.
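
A minimal sketch of that snapshot-per-job lifecycle, for illustration only (it
is not the code from the attached patch). SnapshotOps and Job are hypothetical
interfaces introduced here; on a real cluster they would presumably wrap
Hadoop's FileSystem.createSnapshot(Path, String) and
FileSystem.deleteSnapshot(Path, String). The sketch also assumes the snapshot
should be deleted even when the job fails, which the comment above does not
state.

```java
public class SnapshotScopedJob {

    /** Hypothetical abstraction over the file system's snapshot calls. */
    public interface SnapshotOps {
        /** Creates a snapshot of dir and returns the path it can be read from. */
        String createSnapshot(String dir, String name) throws Exception;

        /** Deletes the named snapshot of dir. */
        void deleteSnapshot(String dir, String name) throws Exception;
    }

    /** Hypothetical job body; it receives the read-only snapshot path. */
    public interface Job<T> {
        T run(String snapshotPath) throws Exception;
    }

    /**
     * Takes a snapshot in setup, runs the job against it, and deletes the
     * snapshot afterwards. The finally block is an assumption of this sketch:
     * it removes the snapshot whether the job succeeds or throws.
     */
    public static <T> T runOverSnapshot(SnapshotOps fs, String dir,
                                        String name, Job<T> job) throws Exception {
        String snapshotPath = fs.createSnapshot(dir, name);
        try {
            return job.run(snapshotPath);
        } finally {
            fs.deleteSnapshot(dir, name);
        }
    }
}
```

Keeping the create/delete pair in one helper means a job cannot leave a stale
snapshot behind, which matters because a snapshot retains the blocks of files
that have since been deleted from the live directory.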
If your job can tolerate data that is stale by up to
hbase.regionserver.optionalcacheflushinterval (edits not yet flushed to store
files are not visible to the job), M/R over HDFS snapshots is a good option.
This made the jobs complete faster and reduced the HBase resources they
consumed on our cluster.
> Ability to open a HRegion from hdfs snapshot.
> ---------------------------------------------
>
> Key: HBASE-15321
> URL: https://issues.apache.org/jira/browse/HBASE-15321
> Project: HBase
> Issue Type: New Feature
> Affects Versions: 2.0.0
> Reporter: churro morales
> Fix For: 2.0.0
>
> Attachments: HBASE-15321-v1.patch, HBASE-15321-v2.patch,
> HBASE-15321-v3.patch, HBASE-15321.patch
>
>
> Now that hdfs snapshots are here, we started to run our mapreduce jobs over
> hdfs snapshots. The thing is, hdfs snapshots are read-only point-in-time
> copies of the file system. Thus we had to modify the section of code that
> initialized the region internals in HRegion. We have to skip cleanup of
> certain directories if the HRegion is backed by a hdfs snapshot. I have a
> patch for trunk with some basic tests if folks are interested.