[ 
https://issues.apache.org/jira/browse/HDFS-7036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14148643#comment-14148643
 ] 

Yongjun Zhang commented on HDFS-7036:
-------------------------------------

Hi [~wheat9],

Hope my answers yesterday addressed your questions. In case not, here is 
another attempt:
{quote}
Can you point out what the other applications are that need to access both 
secure and insecure cluster at the same time? To my best knowledge, distcp is 
the only use case. The title of this jira suggests it is focusing on distcp.
{quote}
Even though the goal of HDFS-6776 and this jira is to fix distcp, the real 
issue is that *webhdfs is broken* when accessing an insecure cluster from the 
secure cluster side, for the same reason. Distcp is just one use case. The 
fsshell example I gave in the jira description is another. If a user has two 
clusters (secure and insecure), there is a chance that the user will need to 
write applications that access data in a similar fashion. Do you not agree 
that we should fix all these cases?

{quote}
My message has been consistent since HDFS-6776 – the hack should be contained 
which result minimal damage in the codebase. I cannot +1 for the approach on 
hacking WebHdfsFileSystem just for this issue.
{quote}
Yes, I can see you said this in many comments. But would you please explain, 
with a *real* example, the damage that hacking webhdfs would cause? This is 
what I did not get from your earlier comments, and what I have been asking for.

{quote}
If you still don't get it, it might be helpful to go through the distcp code 
first.
{quote}
Given that there is already message parsing in the webhdfs code (see 
HDFS-7026), and there has been no complaint about it, there seems to be no 
real damage, other than it being a bit hacky.

Given the simplicity of the solution I posted, which fixes all the 
above-mentioned cases and does no real damage, what I really don't get is why 
we would go with the more complex solution (the complexity of fixing distcp, 
plus fsshell, plus any application that a user might write with the same 
need).

After all, it's just a hack that we use to achieve a better user experience, 
and we will take it out when the time comes. If we hack all over the place, 
taking the hack out would be costly too.

Even if it's just for distcp, I think simplicity should be favored if there is 
no real damage.

Thanks.
 

> HDFS-6776 fix requires to upgrade insecure cluster, which means quite some 
> user pain
> ------------------------------------------------------------------------------------
>
>                 Key: HDFS-7036
>                 URL: https://issues.apache.org/jira/browse/HDFS-7036
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: webhdfs
>    Affects Versions: 2.5.1
>            Reporter: Yongjun Zhang
>            Assignee: Yongjun Zhang
>         Attachments: HDFS-7036.001.patch
>
>
> Issuing the command
> {code}
>  hadoop fs -lsr webhdfs://<insecureCluster>
> {code}
> on the secure cluster side fails with the message "Failed to get the token ...", 
> a symptom similar to the one reported in HDFS-6776.
> If the fix of HDFS-6776 is applied to only the secure cluster, doing 
> {code}
> distcp webhdfs://<insecureCluster> <secureCluster>
> {code}
> would fail the same way.
> Basically, running any application in the secure cluster to access the 
> insecure cluster via webhdfs would fail the same way if the HDFS-6776 fix is 
> not applied to the insecure cluster.
> This could cause quite some user pain. Filing this jira for a solution to 
> make users' lives easier.
> One proposed solution was to add a msg-parsing mechanism in webhdfs, which is 
> a bit hacky. The other proposed solution is to do the same kind of hack on 
> the application side, which means the same hack needs to be applied in each 
> application.
> Thanks [~daryn], [~wheat9], [~jingzhao], [~tucu00] and [~atm] for the 
> discussion in HDFS-6776.
>  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
