[ 
https://issues.apache.org/jira/browse/HADOOP-13593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15483686#comment-15483686
 ] 

Yuanbo Liu edited comment on HADOOP-13593 at 9/12/16 9:58 AM:
--------------------------------------------------------------

[~steve_l] Thanks a lot for your comments; they're really helpful.
{quote}
1. please can you stick the full stack trace of the exception in as a comment..
{quote}
Sorry for omitting the stack info; I will edit my first comment to add the 
information.

{quote}
2. anything checking hostnames is going
{quote}
In fact there is a code segment in {{FileUtils#compareFs}} as below:
{code}
String srcHost = srcUri.getHost();
String dstHost = dstUri.getHost();
if (!srcHost.equals(dstHost)) {
        return false;
}
{code}
and I think it covers the case you mentioned above. Using 
"getCanonicalHostName" to double-check whether the hosts are equal seems good, 
but if the host name is an alias, the lookup may throw UnknownHostException 
here. If you don't agree to remove the check, the least we can do is make the 
output message more accurate: "Work path..in different file system" is not right.
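
To make the idea concrete, here is a minimal sketch of a host check that compares the literal host strings first and only falls back to canonical names when they differ. It is not the code in Hadoop or in the attached patches; the class {{HostCompareSketch}}, the {{Resolver}} hook and the method {{sameHost}} are hypothetical names made up for illustration, and treating an unresolvable host as "different" is an assumption.

```java
import java.net.InetAddress;
import java.net.URI;
import java.net.UnknownHostException;

public class HostCompareSketch {

    /** Hypothetical hook so the DNS lookup can be replaced (for example in tests). */
    public interface Resolver {
        String canonical(String host) throws UnknownHostException;
    }

    /** Default resolver backed by the JDK's InetAddress lookup. */
    public static final Resolver DNS =
        host -> InetAddress.getByName(host).getCanonicalHostName();

    /**
     * Returns true when both URIs point at the same host.
     * Identical host strings are accepted without any DNS lookup, so an
     * alias such as "testhadoop.softlayer" never reaches the resolver.
     */
    public static boolean sameHost(URI src, URI dst, Resolver resolver) {
        String srcHost = src.getHost();
        String dstHost = dst.getHost();
        if (srcHost == null || dstHost == null) {
            // Only equal when neither URI carries a host.
            return srcHost == null && dstHost == null;
        }
        if (srcHost.equalsIgnoreCase(dstHost)) {
            return true;
        }
        try {
            // Different strings may still be two names for one machine.
            return resolver.canonical(srcHost)
                .equalsIgnoreCase(resolver.canonical(dstHost));
        } catch (UnknownHostException e) {
            // Assumption: a name that cannot be resolved counts as a different host.
            return false;
        }
    }

    public static void main(String[] args) {
        URI work = URI.create("swift://testhadoop.softlayer/._WIP_tmp546958075");
        URI target = URI.create("swift://testhadoop.softlayer/tmp");
        System.out.println(sameHost(work, target, DNS));
    }
}
```

With this ordering, the work path and target path from the issue description share the host string, so the check passes even though the name is not resolvable through DNS.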

{quote}
3. none of the object stores support atomic renames...
{quote}
Thanks for the info. Yes, you're right: if an object store doesn't support 
atomic rename, it's not appropriate to use `distcp -atomic` here.

{quote}
If there were to be a patch on this, it'd need tests. Here I'd recommend 
{quote}
Thanks for your suggestions. I will investigate them later.
Thanks again for your time!



> `hadoop distcp -atomic` invokes improper host check while copying data from 
> HDFS to Swift
> -----------------------------------------------------------------------------------------
>
>                 Key: HADOOP-13593
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13593
>             Project: Hadoop Common
>          Issue Type: Bug
>            Reporter: Yuanbo Liu
>         Attachments: HADOOP-13593.001.patch, HADOOP-13593.002.patch
>
>
> While copying data from HDFS to Swift by using `hadoop distcp -atomic`, for 
> example:
> {code}
> hadoop distcp -atomic /tmp/100M  swift://testhadoop.softlayer//tmp
> {code}
> it throws
> {code}
> java.lang.IllegalArgumentException: Work path 
> swift://testhadoop.softlayer/._WIP_tmp546958075 and target path 
> swift://testhadoop.softlayer/tmp are in different file system
>       at org.apache.hadoop.tools.DistCp.configureOutputFormat(DistCp.java:351)
> .....
> {code}


