[
https://issues.apache.org/jira/browse/HDFS-2802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13271616#comment-13271616
]
Hari Mankude commented on HDFS-2802:
------------------------------------
@Eli,
Regarding scenario #3, consider an HBase setup with a huge dataset in
production. A new app has been developed that needs to be validated against the
production dataset. It is not feasible to copy the entire dataset to a test
setup. At the same time, the app is not ready for production, and it is not
safe to let it modify the data in the production database. One solution to
this type of problem is to take a RW snapshot of the production dataset and
run the development app against the RW snapshot. After the app testing is
done, the RW snapshot is deleted. This assumes that the cluster has sufficient
compute capacity and incremental storage capacity to support RW snapshots.
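As a rough illustration of why only incremental storage is needed, here is a
toy sketch of a writable snapshot. It is not the prototype's design: the
overlay approach, the class ToyRWSnapshot and all of its methods are
hypothetical, and a plain dict stands in for the production dataset. The sketch
also assumes the base data does not change while the snapshot exists.

```python
class ToyRWSnapshot:
    """Hypothetical writable view over a base dataset.

    Writes and deletes are kept in a private overlay, so the base
    (the "production" data) is never modified through the snapshot.
    """

    def __init__(self, base: dict):
        self._base = base        # production data, only ever read here
        self._overlay = {}       # keys written through the snapshot
        self._deleted = set()    # keys deleted through the snapshot

    def get(self, key):
        if key in self._deleted:
            raise KeyError(key)
        if key in self._overlay:
            return self._overlay[key]
        return self._base[key]

    def put(self, key, value):
        self._deleted.discard(key)
        self._overlay[key] = value

    def delete(self, key):
        self._overlay.pop(key, None)
        self._deleted.add(key)

    def extra_storage(self) -> int:
        # Number of entries stored on top of the base data.
        return len(self._overlay)
```

The test app would call put() and delete() on the snapshot only. Dropping the
snapshot object discards the overlay, which corresponds to deleting the RW
snapshot after testing.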
Regarding appends, the current snapshot prototype relies on the file size that
is available at the namenode. So, if a file is appended to after a snapshot is
taken, the append is a no-op from the snapshot's perspective. If a snapshot is
taken of a file that has an append pipeline set up, the inode is of type
under-construction in the NN, and the prototype again uses the file size
recorded on the NN. This might not be perfect, and I have some ideas for
acquiring a more up-to-date file size.
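The length-based behaviour described above can be sketched with a toy model.
This is not the prototype's code: ToyFile, ToySnapshot, take_snapshot and the
report_length flag are hypothetical names, and an in-memory bytearray stands in
for block data. The only idea taken from the comment is that a snapshot freezes
the file length known to the namenode at the time it is taken.

```python
from dataclasses import dataclass, field


@dataclass
class ToyFile:
    """Hypothetical file whose length is tracked by a namenode-like component."""
    data: bytearray = field(default_factory=bytearray)
    recorded_length: int = 0  # length the "namenode" currently knows about

    def append(self, chunk: bytes, report_length: bool = True) -> None:
        self.data.extend(chunk)
        if report_length:
            # Models the updated length reaching the namenode.
            self.recorded_length = len(self.data)


@dataclass(frozen=True)
class ToySnapshot:
    source: ToyFile
    length: int  # frozen when the snapshot is taken

    def read(self) -> bytes:
        # Reads through the snapshot never go past the frozen length.
        return bytes(self.source.data[: self.length])


def take_snapshot(f: ToyFile) -> ToySnapshot:
    return ToySnapshot(source=f, length=f.recorded_length)
```

For example, appending b"hello", taking a snapshot, and then appending
b" world" leaves the snapshot reading only b"hello". Appending with
report_length=False before the snapshot models a file with an open append
pipeline: the snapshot misses bytes the namenode has not yet heard about, which
is the imperfection mentioned above.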
I thought that truncate is not currently supported in trunk. If you are
referring to deletes, the prototype handles deletes correctly.
I will post a more detailed doc after I am done with the HA-related work.
> Support for RW/RO snapshots in HDFS
> -----------------------------------
>
> Key: HDFS-2802
> URL: https://issues.apache.org/jira/browse/HDFS-2802
> Project: Hadoop HDFS
> Issue Type: New Feature
> Components: data-node, name-node
> Affects Versions: 0.24.0
> Reporter: Hari Mankude
> Assignee: Hari Mankude
> Attachments: snapshot-one-pager.pdf
>
>
> Snapshots are point-in-time images of parts of the filesystem or the entire
> filesystem. A snapshot can be a read-only or a read-write point-in-time copy
> of the filesystem. There are several use cases for snapshots in HDFS. I will
> post a detailed write-up soon with more information.