[ http://issues.apache.org/jira/browse/HADOOP-334?page=comments#action_12447565 ]

dhruba borthakur commented on HADOOP-334:
-----------------------------------------

In response to Sameer's comments:
---------------------------------
Regarding the copy-on-write approach: we do not need to traverse the entire 
namespace to reset the clone pointers at the end of the checkpointing process. 
We can keep a lookaside list that contains all the nodes that have a clone 
pointer. We would still have to acquire the global lock at the end of the 
checkpointing process, traverse this lookaside list of cloned nodes, and null 
out their clone pointers.
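
To make the lookaside-list idea concrete, here is a minimal Java sketch. The 
class and method names (CowNode, CowNamespace, setName, and so on) are 
hypothetical and are not the existing dfs classes. It assumes a single global 
lock and a node that carries only a name.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical namespace node, not the actual dfs node class.
class CowNode {
    String name;
    CowNode clone; // pre-modification copy; non-null only during a checkpoint

    CowNode(String name) { this.name = name; }
}

// Hypothetical namespace wrapper that tracks cloned nodes in a lookaside list.
class CowNamespace {
    private final Object globalLock = new Object();
    private final List<CowNode> clonedNodes = new ArrayList<>(); // lookaside list
    private boolean checkpointing = false;

    void beginCheckpoint() {
        synchronized (globalLock) { checkpointing = true; }
    }

    // Modify a node. During a checkpoint, save the old state on first write.
    void setName(CowNode node, String newName) {
        synchronized (globalLock) {
            if (checkpointing && node.clone == null) {
                node.clone = new CowNode(node.name);
                clonedNodes.add(node);
            }
            node.name = newName;
        }
    }

    // The state the checkpointer should write for this node.
    String checkpointView(CowNode node) {
        synchronized (globalLock) {
            return node.clone != null ? node.clone.name : node.name;
        }
    }

    // Take the global lock once and visit only the cloned nodes,
    // not the whole namespace. Returns how many clone pointers were reset.
    int endCheckpoint() {
        synchronized (globalLock) {
            int cleared = clonedNodes.size();
            for (CowNode n : clonedNodes) {
                n.clone = null;
            }
            clonedNodes.clear();
            checkpointing = false;
            return cleared;
        }
    }
}
```

The cost of endCheckpoint() is proportional to the number of nodes modified 
during the checkpoint rather than to the size of the namespace.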

I like the generalized scheme of fine-grained locks (instead of a global lock) 
while traversing the namespace. It is more efficient once implemented 
correctly. Handling "renames" takes quite a few lock-hierarchy tricks, but it 
can be done.
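
One common lock-hierarchy trick for renames, sketched below in Java, is to 
always acquire the two directory locks in a fixed global order so that two 
opposing renames cannot deadlock. This is only an illustration of the general 
technique, not necessarily the scheme Sameer has in mind. LockedDir, Renamer 
and the per-directory id are hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical directory with its own lock and a unique id used for ordering.
class LockedDir {
    private static final AtomicLong NEXT_ID = new AtomicLong();
    final long id = NEXT_ID.getAndIncrement();
    final ReentrantLock lock = new ReentrantLock();
    final Map<String, String> children = new HashMap<>(); // name -> payload
}

class Renamer {
    // Move src/srcName to dst/dstName. Locks are taken in ascending id order,
    // regardless of which directory is the source, so rename(a -> b) and
    // rename(b -> a) running concurrently acquire locks in the same order.
    static boolean rename(LockedDir src, String srcName,
                          LockedDir dst, String dstName) {
        LockedDir first = src.id <= dst.id ? src : dst;
        LockedDir second = (first == src) ? dst : src;
        first.lock.lock();
        try {
            second.lock.lock(); // ReentrantLock, so src == dst is safe
            try {
                if (!src.children.containsKey(srcName)
                        || dst.children.containsKey(dstName)) {
                    return false; // missing source or existing target
                }
                dst.children.put(dstName, src.children.remove(srcName));
                return true;
            } finally {
                second.lock.unlock();
            }
        } finally {
            first.lock.unlock();
        }
    }
}
```

A real namespace would also have to deal with locks on ancestor directories 
and with a directory being renamed underneath one of its own descendants, 
which this sketch ignores.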

The one thing that I am not clear about is whether we get correct semantics if 
the image file and the edits file overlap. If x, y and z are three 
transactions, are you saying that
                        x + y + z is equivalent to x + y + y + z
where y is a single transaction that resides in the image file as well as the 
edits file? Are you proposing something like a global transaction number to 
identify duplicate transactions?
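
To illustrate what I mean by a global transaction number, here is a small 
Java sketch. It assumes each transaction carries a strictly increasing id, 
that the image records the id of the last transaction it contains, and that 
edits are replayed in id order. Txn and ImageState are hypothetical names, 
and the only operation modelled is "create a path".

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical edit-log record carrying a global transaction number.
class Txn {
    final long txid;   // assumed strictly increasing across all transactions
    final String path; // the only operation modelled here: create `path`

    Txn(long txid, String path) {
        this.txid = txid;
        this.path = path;
    }
}

// Hypothetical in-memory state loaded from the image file.
class ImageState {
    long lastAppliedTxid = 0;                        // stored with the image
    final List<String> created = new ArrayList<>();  // one entry per applied create

    // Apply a transaction unless the image already contains it.
    // Returns true if it was applied, false if it was skipped as a duplicate.
    boolean apply(Txn t) {
        if (t.txid <= lastAppliedTxid) {
            return false; // already reflected in the image
        }
        created.add(t.path);
        lastAppliedTxid = t.txid;
        return true;
    }
}
```

With this check, replaying an edits file that starts at y on top of an image 
that already contains x and y yields the same state as x + y + z, even for 
operations that are not idempotent on their own.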

In response to Doug/Owen's comments:
------------------------------------
I was saying precisely the same thing. The only difference in my proposal was 
that we do not need to clone all the nodes on the path from a node to its 
root; we could clone only the node that is being modified.

> Redesign the dfs namespace datastructures to be copy on write
> -------------------------------------------------------------
>
>                 Key: HADOOP-334
>                 URL: http://issues.apache.org/jira/browse/HADOOP-334
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: dfs
>    Affects Versions: 0.4.0
>            Reporter: Owen O'Malley
>         Assigned To: Konstantin Shvachko
>
> The namespace datastructures should be copy on write so that the namespace 
> does not need to be completely locked down from user changes while the 
> checkpoint is being made.

