[ 
https://issues.apache.org/jira/browse/HDFS-4103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13537650#comment-13537650
 ] 

Aaron T. Myers commented on HDFS-4103:
--------------------------------------

I've just finished reviewing the patch. Given that nothing's been committed to 
the branch since this was committed, it seems easiest to me to revert the 
commit, address this feedback, and then commit again with the feedback 
incorporated. If for some reason you'd rather address this feedback via a 
follow-up JIRA, that's fine by me as well.

The patch overall looks very good. I agree with all of the comments that Suresh 
has previously posted. Of the comments below, I think that #16 is probably the 
most serious.

# Some commented out code in TestSnapshot without explanation. Add a comment? 
File a JIRA?
# TestNestedSnapshots - I'm really not a fan of disabling loggers in the tests. 
If the tests fails, it may be impossible to tell why because the relevant log 
message may have been disabled.
# With regard to the comment disabling the append test case of 
TestSnapshotPathINodes - appreciate you adding the TODO, but can you please 
elaborate in that comment about what the problem is currently with append? 
And/or file a follow-up JIRA to fix this issue?
# There's a few "System.out.println("XXX ...")" left throughout the patch. If 
you're going to leave these in, please make it a little clearer what the point 
is, along the lines of "====> Test completed, dumbing tree for debugging <====" 
or something.
# TestSnapshot - there's some commented out code which apparently would have 
tested FileAppend. Add a TODO and/or file a JIRA?
# TestNestedSnapshots#print - please rename parameter "mess" to "message" or 
"msg" or something clearer. Using "mess" threw me off a bit at first.
# TestNestedSnapshots#assertFile - could you please add a method comment here? 
Doesn't have to be a proper javadoc necessarily, but it's a little tough to 
tell what the method's doing at first blush. It's also a little odd that the 
method appears to be written so as to support an arbitrary number of paths to 
verify since it uses varargs, but it in fact only supports verifying the 
presence of exactly 3 paths. Maybe switch to a three-arg method?
# TestNestedSnapshots#assertFile - please move "bar" to a constant, given that 
it's hard-coded both here and in testNestedSnapshots.
# Seems like TestSnapshotReplication#getINodeFile can be made private.
# TestSnapshotPathINodes#testSnapshotPathINodesAfterModification - there's an 
assertFalse(...) which compares two values for mod times. Please consider 
adding an assert message there to print out the actual values of of the mod 
times.
# Considering the many places where we now call INodeDirectory#addChild with 
the third argument being "null", perhaps add a two-arg method which defaults to 
passing null?
# SnapshotDiff#compareTo - please change that_snapshot to "thatSnapshot" or 
"otherSnapshot"
# Considering that the Pair and Triple classes are pretty generic classes and 
are now being used in several somewhat unrelated parts of the code, consider 
moving them out of inner classes of INode and into some util package? Perhaps 
in Common?
# Really good class comment for SnapshotDiff. Very helpful and well-described.
# Having SnapshotDiff as a non-static inner class of 
INodeDirectorySnapshottable makes it a little difficult to follow the scope of 
some of the references. I think it's a sufficiently meaty class to warrant 
being a top-level class, with a reference to the associated 
INodeDirectorySnapshottable. If for some reason you want to leave it as an 
inner class, let's make it a static inner class.
# SnapshotDiff#getSnapshotINode - given that this may end up recursively 
calling getSnapshotINode, it seems like we might easily end up overflowing the 
stack if a directory has many snapshots taken of it with changes between each 
snapshot. I believe by default a StackOverflowException will end up getting 
called after a stack is 1024 method calls deep, yet the limit of how many 
snapshots can exist per snapshottable directory is set to 65536.
# I recommend adding a class comment to INodeDirectoryWithSnapshot to explain a 
little bit about how it does its work by overriding several methods of the 
INodeDirectory class to look back into snapshots as appropriate. Once I 
understood it it was very clear, but I think it could stand to be explained 
with a comment.
                
> Support O(1) snapshot creation
> ------------------------------
>
>                 Key: HDFS-4103
>                 URL: https://issues.apache.org/jira/browse/HDFS-4103
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: namenode
>    Affects Versions: Snapshot (HDFS-2802)
>            Reporter: Tsz Wo (Nicholas), SZE
>            Assignee: Tsz Wo (Nicholas), SZE
>             Fix For: Snapshot (HDFS-2802)
>
>         Attachments: h4103_20121129.patch, h4103_20121202b.patch, 
> h4103_20121202.patch, h4103_20121209b.patch, h4103_20121209.patch, 
> h4103_20121210b.patch, h4103_20121210.patch, h4103_20121211.patch, 
> h4103_20121212b.patch, h4103_20121212.patch, h4103_20121213.patch, 
> h4103_20121215b.patch, h4103_20121215.patch, h4103_20121216.patch, 
> h4103_20121217b.patch, h4103_20121217.patch, h4103_20121218.patch, 
> h4103_20121219.patch, h4103_20121220.patch
>
>
> In our first snapshot implementation, snapshot creation runs in O(N) and 
> occupies O(N) memory space, where N = # files + # directories + # symlinks in 
> the snapshot.  The advantages of the implementation are that there is no 
> additional cost for the modifications after snapshots are created, and it 
> leads to a simple implementation.
> In this JIRA, we optimize snapshot creation to O(1) although it introduces 
> additional cost in the modifications after snapshots are created.  Note that 
> the INode is given as an assumption, otherwise, there is a non-constant cost 
> to find the INode.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Reply via email to