[ 
https://issues.apache.org/jira/browse/IMPALA-10658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltán Borók-Nagy resolved IMPALA-10658.
----------------------------------------
    Fix Version/s: Impala 4.0
       Resolution: Fixed

> LOAD DATA INPATH silently fails between HDFS and Azure ABFS
> -----------------------------------------------------------
>
>                 Key: IMPALA-10658
>                 URL: https://issues.apache.org/jira/browse/IMPALA-10658
>             Project: IMPALA
>          Issue Type: Bug
>            Reporter: Zoltán Borók-Nagy
>            Assignee: Zoltán Borók-Nagy
>            Priority: Major
>             Fix For: Impala 4.0
>
>
> LOAD DATA INPATH silently fails when Impala tries to move files from HDFS to 
> ABFS.
> The problem is that in 'relocateFile()' we try to figure out if 'sourceFile' 
> is on the destination filesystem:
> https://github.com/apache/impala/blob/6b16df9e9a4696b46b6f9c7fe2fc0aaded285623/fe/src/main/java/org/apache/impala/common/FileSystemUtil.java#L246
> We use the following code to decide this:
> https://github.com/apache/impala/blob/6b16df9e9a4696b46b6f9c7fe2fc0aaded285623/fe/src/main/java/org/apache/impala/common/FileSystemUtil.java#L581-L591
> However, the Azure FileSystem implementation doesn't throw an exception in 
> 'fs.makeQualified(path);'. It just happily returns a new Path, replacing the 
> prefix "hdfs://" with "abfs://".
> So in relocateFile() Impala thinks that 'sourceFile' and 'destFile' are on 
> the same filesystem, so it tries to invoke 'destFs.rename()':
> https://github.com/apache/impala/blob/6b16df9e9a4696b46b6f9c7fe2fc0aaded285623/fe/src/main/java/org/apache/impala/common/FileSystemUtil.java#L266
> From 
> https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/filesystem/filesystem.html#boolean_rename.28Path_src.2C_Path_d.29
>  : "In terms of its implementation, it is the one with the most ambiguity 
> regarding when to return false versus raising an exception."
> Seems like the Azure FileSystem implementation doesn't throw an exception on 
> failure, but returns false instead. Unfortunately Impala doesn't check the 
> return value of destFs.rename() (see above), so the error remains silent.
> To fix this issue we need to do two things:
> * fix FileSystemUtil.isPathOnFileSystem()
> * check the return value of destFs.rename() and throw an exception when it's 
> false
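
The two fixes listed above could look roughly like the following sketch. It is
not the actual Impala patch: 'SimpleFs' is a hypothetical, minimal stand-in for
the parts of a Hadoop FileSystem that matter here, paths are modelled as
java.net.URI so the example runs without Hadoop on the classpath, and the host
and account names are made up. The idea is to decide filesystem membership by
comparing scheme and authority directly, rather than relying on
makeQualified() to throw, and to turn a 'false' return from rename() into an
exception.

```java
import java.io.IOException;
import java.net.URI;
import java.util.Objects;

public class RelocateSketch {
  /** Hypothetical stand-in for the subset of a Hadoop FileSystem used here. */
  interface SimpleFs {
    URI getUri();
    // Like FileSystem.rename(): may signal failure by returning false.
    boolean rename(URI src, URI dst) throws IOException;
  }

  /**
   * Returns true if 'path' belongs to 'fs', judged by scheme and authority.
   * Assumption: a path without a scheme is unqualified and is treated as
   * belonging to 'fs'.
   */
  static boolean isPathOnFileSystem(URI path, SimpleFs fs) {
    String scheme = path.getScheme();
    if (scheme == null) return true;
    URI fsUri = fs.getUri();
    return scheme.equalsIgnoreCase(fsUri.getScheme())
        && Objects.equals(path.getAuthority(), fsUri.getAuthority());
  }

  /** Renames 'src' to 'dst' and throws if the filesystem reports failure. */
  static void renameOrThrow(SimpleFs fs, URI src, URI dst) throws IOException {
    if (!fs.rename(src, dst)) {
      throw new IOException("Failed to rename " + src + " to " + dst);
    }
  }

  public static void main(String[] args) {
    // A fake ABFS-like filesystem whose rename() always returns false.
    SimpleFs abfs = new SimpleFs() {
      public URI getUri() {
        return URI.create("abfs://container@account.dfs.core.windows.net");
      }
      public boolean rename(URI src, URI dst) { return false; }
    };
    URI src = URI.create("hdfs://namenode:8020/tmp/data.parq");
    URI dst = URI.create(
        "abfs://container@account.dfs.core.windows.net/warehouse/t/data.parq");

    System.out.println("src on abfs: " + isPathOnFileSystem(src, abfs));
    System.out.println("dst on abfs: " + isPathOnFileSystem(dst, abfs));
    try {
      renameOrThrow(abfs, src, dst);
    } catch (IOException e) {
      System.out.println("error surfaced: " + e.getMessage());
    }
  }
}
```

With a check like this, relocateFile() would see that the HDFS source is not
on the ABFS destination and could fall back to a copy instead of calling
rename(); if a rename is attempted and fails, the failure is reported instead
of being dropped.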



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

