[
https://issues.apache.org/jira/browse/HDFS-11402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15881300#comment-15881300
]
Manoj Govindassamy commented on HDFS-11402:
-------------------------------------------
Thanks for sharing your views [~raviprak]. I will wait for others to review the
patch and share their comments. I would also prefer that this jira _not_
depend on HDFS-11435.
> HDFS Snapshots should capture point-in-time copies of OPEN files
> ----------------------------------------------------------------
>
> Key: HDFS-11402
> URL: https://issues.apache.org/jira/browse/HDFS-11402
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: hdfs
> Affects Versions: 2.6.0
> Reporter: Manoj Govindassamy
> Assignee: Manoj Govindassamy
> Attachments: HDFS-11402.01.patch, HDFS-11402.02.patch
>
>
> *Problem:*
> 1. When files are being written and HDFS Snapshots are taken in parallel,
> the Snapshots do capture all these files, but the open files in the
> Snapshots do not have their point-in-time file length captured. That is,
> these open files are not frozen in HDFS Snapshots. They grow/shrink in
> length, just like the original file, even after the snapshot time.
> 2. At the time of file close or any other metadata modification operation on
> these files, HDFS reconciles the file length and records the modification in
> the last taken Snapshot. All the previously taken Snapshots continue to have
> those open files with no modification recorded. So, all those previous
> snapshots end up using the final modification record in the last snapshot.
> Thus, after the file close, the file lengths in all those snapshots end up
> the same.
> Assume File1 is opened for write and a total of 1MB is written to it. While
> the writes are happening, snapshots are taken in parallel.
> {noformat}
> |---Time---T1-----------T2-------------T3----------------T4------>
> |-----------------------Snap1----------Snap2-------------Snap3--->
> |---File1.open---write---------write-----------close------------->
> {noformat}
> Then at time,
> T2:
> Snap1.File1.length = 0
> T3:
> Snap1.File1.length = 0
> Snap2.File1.length = 0
> <File1 write completed and closed>
> T4:
> Snap1.File1.length = 1MB
> Snap2.File1.length = 1MB
> Snap3.File1.length = 1MB
> *Proposal:*
> 1. At the time of taking Snapshot, {{SnapshotManager#createSnapshot}} can
> optionally request {{DirectorySnapshottableFeature#addSnapshot}} to freeze
> open files.
> 2. {{DirectorySnapshottableFeature#addSnapshot}} can consult with
> {{LeaseManager}} and get a list of {{INodesInPath}} for all open files under
> the snapshot dir.
> 3. After the Snapshot creation, Diff creation and modification time update,
> {{DirectorySnapshottableFeature#addSnapshot}} can invoke
> {{INodeFile#recordModification}} for each of the open files. This way, the
> Snapshot just taken will have a {{FileDiff}} with {{fileSize}} captured for
> each of the open files.
> 4. The above model follows the current Snapshot and Diff protocols and doesn't
> introduce any new disk formats. So, I don't think we will need any new
> FSImage Loader/Saver changes for Snapshots.
> 5. One of the design goals of HDFS Snapshots was the ability to take any
> number of snapshots in O(1) time. Although LeaseManager has all the open
> files with leases in an in-memory map, an iteration is still needed to pick
> out the relevant open files and then run recordModification on each of them.
> So, it will not be strictly O(1) with the above proposal. But it is going to
> be only a marginal increase, as the new order will be
> O(open_files_under_snap_dir). To avoid changing the behavior of HDFS
> Snapshots for open files and to avoid a change in time complexity, this
> improvement can be made under a new config
> {{"dfs.namenode.snapshot.freeze.openfiles"}}, which can default to
> {{false}}.
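
The problem timeline and proposal steps 1 to 3 quoted above can be sketched
with a small toy model. This is only an illustration, not the patch:
ToyFile, ToyDir and run are hypothetical names, the diff lookup is a
simplified stand-in for how a snapshot without its own FileDiff falls through
to a later one, and the toy treats the file length as "bytes written so far".
It does not model how the NameNode learns the length of a block that is still
being written, so only the after-close lengths are comparable to the T4 values
in the description.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

/** Toy model only; none of these classes are the real HDFS types. */
final class OpenFileSnapshotSketch {

    /** Hypothetical stand-in for an open file plus its list of file diffs. */
    static final class ToyFile {
        long length; // bytes written so far (simplifying assumption)
        // snapshotId -> file length recorded for that snapshot
        final TreeMap<Integer, Long> diffs = new TreeMap<>();

        /** Records the current length for a snapshot unless one is already recorded. */
        void recordModification(int snapshotId) {
            diffs.putIfAbsent(snapshotId, length);
        }

        /** Length seen through a snapshot: nearest diff at or after it, else the live file. */
        long lengthIn(int snapshotId) {
            Map.Entry<Integer, Long> e = diffs.ceilingEntry(snapshotId);
            return e != null ? e.getValue() : length;
        }
    }

    /** Hypothetical stand-in for a snapshottable directory holding one open file. */
    static final class ToyDir {
        final boolean freezeOpenFiles; // mirrors the proposed config flag
        final ToyFile file = new ToyFile();
        boolean fileOpen = true;
        int latestSnapshotId = 0;

        ToyDir(boolean freezeOpenFiles) {
            this.freezeOpenFiles = freezeOpenFiles;
        }

        void write(long bytes) {
            file.length += bytes;
        }

        int createSnapshot() {
            int id = ++latestSnapshotId;
            if (freezeOpenFiles && fileOpen) {
                // Proposal step 3: capture the length of each open file now.
                file.recordModification(id);
            }
            return id;
        }

        void close() {
            fileOpen = false;
            if (latestSnapshotId > 0) {
                // Described current behavior: the modification is recorded
                // only in the last taken snapshot.
                file.recordModification(latestSnapshotId);
            }
        }
    }

    /** Replays the timeline from the description and returns the after-close lengths. */
    static Map<String, Long> run(boolean freezeOpenFiles) {
        final long kb = 1024L;
        ToyDir dir = new ToyDir(freezeOpenFiles);
        dir.write(512 * kb);
        int snap1 = dir.createSnapshot(); // T2
        dir.write(512 * kb);
        int snap2 = dir.createSnapshot(); // T3
        dir.close();
        int snap3 = dir.createSnapshot(); // T4
        Map<String, Long> result = new LinkedHashMap<>();
        result.put("Snap1", dir.file.lengthIn(snap1));
        result.put("Snap2", dir.file.lengthIn(snap2));
        result.put("Snap3", dir.file.lengthIn(snap3));
        return result;
    }

    public static void main(String[] args) {
        System.out.println("freeze=false " + run(false));
        System.out.println("freeze=true  " + run(true));
    }
}
```

With the flag off, all three snapshots report 1048576 bytes after the close,
matching the T4 values in the description. With the flag on, Snap1 keeps the
524288 bytes that had been written when it was taken, while Snap2 and Snap3
report 1048576. The extra cost is one recordModification call per open file
at snapshot time, which is the O(open_files_under_snap_dir) term from item 5.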