[ https://issues.apache.org/jira/browse/HADOOP-4044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12641174#action_12641174 ]

Sanjay Radia commented on HADOOP-4044:
--------------------------------------

> I would like to avoid a design that incurs an overhead of an additional RPC 
> every time a link is traversed.

> +1. This will affect not only NNBench but all benchmarks including DFSIO and 
> especially NNThroughputBenchmark.
> GridMix and Sort will probably be less affected, but will suffer too.

+1. I would also like to avoid an extra RPC, since avoiding one is 
straightforward.

Doug > What did you think about my suggestion above that we might use a cache to 
avoid this? First, we implement the naive approach, benchmark it, and, if it's 
too slow, optimize it with a pre-fetch cache of block locations.

Clearly your cache solution deals with the extra-RPC issue.
Generally I see a cache as a way of improving the performance of an already 
good design or algorithm. I don't like using a cache as part of a design to 
make an algorithm work when alternative good designs are available that don't 
need one. Would we have come up with this design if we hadn't had such an 
emotionally charged discussion on exceptions?
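
To be concrete about what we would be signing up to maintain: below is a rough 
sketch of the kind of client-side pre-fetch cache I understand Doug to be 
suggesting. The names, types, and sizes are made up for illustration; it is not 
a patch.

{code}
// Sketch only, not HDFS code: a small LRU map on the client, keyed by path,
// holding whatever the NameNode returned last time (block locations, or the
// target a link resolved to), so the next access can skip one RPC.
import java.util.LinkedHashMap;
import java.util.Map;

class PathResolutionCache<V> {
  private static final int MAX_ENTRIES = 1024;   // arbitrary bound for the sketch

  private final Map<String, V> cache =
      new LinkedHashMap<String, V>(16, 0.75f, true) {
        @Override
        protected boolean removeEldestEntry(Map.Entry<String, V> eldest) {
          return size() > MAX_ENTRIES;           // evict the least-recently-used entry
        }
      };

  /** Returns the cached value for a path, or null if the client must ask the NameNode. */
  synchronized V get(String path) {
    return cache.get(path);
  }

  /** Remembers what a path resolved to, so the next access can skip one RPC. */
  synchronized void put(String path, V value) {
    cache.put(path, value);
  }
}
{code}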

We have a good design: if resolution stops at a symlink, we return that 
information to the caller. It does not require a cache.
We are divided only over how to return this information - use the return status 
or use an exception (a sketch of both options is at the end of this comment). 
The cache solution is a way to avoid making that painful, emotionally charged 
decision for the Hadoop community.
I don't want to explain the reason we use the cache to Hadoop developers again 
and again down the road. 
We should not avoid the decision, but make it. 
A couple of weeks ago I was confident that a compromise vote would pass. I am 
hoping that the same is true now.
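
For reference, the two options we are divided over look roughly like this. 
Neither is the actual patch; the class and field names are illustrative only.

{code}
// Option 1: signal the symlink with an exception that carries what the server
// could not resolve, so the caller can finish resolution itself.
class UnresolvedLinkException extends java.io.IOException {
  private final String linkTarget;   // where the symlink points
  private final String remainder;    // path components left to resolve

  UnresolvedLinkException(String linkTarget, String remainder) {
    this.linkTarget = linkTarget;
    this.remainder = remainder;
  }
  String getLinkTarget() { return linkTarget; }
  String getRemainder()  { return remainder; }
}

// Option 2: return the same information as part of the call's result instead.
class ResolveResult {
  enum Status { RESOLVED, UNRESOLVED_LINK }

  final Status status;
  final String linkTarget;   // non-null only when status == UNRESOLVED_LINK
  final String remainder;

  ResolveResult(Status status, String linkTarget, String remainder) {
    this.status = status;
    this.linkTarget = linkTarget;
    this.remainder = remainder;
  }
}
{code}

Either way the same information comes back to the caller; the disagreement is 
only about whether it arrives as an exception or as part of the return value.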


> Create symbolic links in HDFS
> -----------------------------
>
>                 Key: HADOOP-4044
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4044
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>         Attachments: HADOOP-4044-strawman.patch, symLink1.patch, 
> symLink1.patch, symLink4.patch, symLink5.patch, symLink6.patch, 
> symLink8.patch, symLink9.patch
>
>
> HDFS should support symbolic links. A symbolic link is a special type of file 
> that contains a reference to another file or directory in the form of an 
> absolute or relative path and that affects pathname resolution. Programs 
> which read or write to files named by a symbolic link will behave as if 
> operating directly on the target file. However, archiving utilities can 
> handle symbolic links specially and manipulate them directly.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
