[ https://issues.apache.org/jira/browse/IMPALA-10658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Zoltán Borók-Nagy resolved IMPALA-10658.
----------------------------------------
    Fix Version/s: Impala 4.0
       Resolution: Fixed

> LOAD DATA INPATH silently fails between HDFS and Azure ABFS
> -----------------------------------------------------------
>
>                 Key: IMPALA-10658
>                 URL: https://issues.apache.org/jira/browse/IMPALA-10658
>             Project: IMPALA
>          Issue Type: Bug
>            Reporter: Zoltán Borók-Nagy
>            Assignee: Zoltán Borók-Nagy
>            Priority: Major
>             Fix For: Impala 4.0
>
>
> LOAD DATA INPATH silently fails when Impala tries to move files from HDFS to
> ABFS.
> The problem is that in 'relocateFile()' we try to figure out if 'sourceFile'
> is on the destination filesystem:
> https://github.com/apache/impala/blob/6b16df9e9a4696b46b6f9c7fe2fc0aaded285623/fe/src/main/java/org/apache/impala/common/FileSystemUtil.java#L246
> We use the following code to decide this:
> https://github.com/apache/impala/blob/6b16df9e9a4696b46b6f9c7fe2fc0aaded285623/fe/src/main/java/org/apache/impala/common/FileSystemUtil.java#L581-L591
> However, the Azure FileSystem implementation doesn't throw an exception in
> 'fs.makeQualified(path);'. It simply returns a new Path, replacing the
> "hdfs://" prefix with "abfs://".
> So in relocateFile() Impala thinks that 'sourceFile' and 'destFile' are on the
> same filesystem, and it tries to invoke 'destFs.rename()':
> https://github.com/apache/impala/blob/6b16df9e9a4696b46b6f9c7fe2fc0aaded285623/fe/src/main/java/org/apache/impala/common/FileSystemUtil.java#L266
> From
> https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/filesystem/filesystem.html#boolean_rename.28Path_src.2C_Path_d.29
> : "In terms of its implementation, it is the one with the most ambiguity
> regarding when to return false versus raising an exception."
> It seems that the Azure FileSystem implementation doesn't throw an exception
> on failure, but returns false instead. Unfortunately, Impala doesn't check the
> return value of destFs.rename() (see above), so the error remains silent.
> To fix this issue we need to do two things:
> * fix FileSystemUtil.isPathOnFileSystem()
> * check the return value of destFs.rename() and throw an exception when it's
> false
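The two fixes listed in the issue can be illustrated with a minimal, self-contained sketch. This is not the code that was committed to Impala. It uses only the JDK (java.net.URI) rather than the Hadoop FileSystem API, and the names RelocateSketch, RenameFs and renameOrThrow are hypothetical, introduced here for illustration. The idea is to compare scheme and authority explicitly instead of relying on makeQualified() throwing, and to turn a false return from rename() into an exception.

```java
import java.io.IOException;
import java.net.URI;

// Hypothetical sketch, NOT Impala's actual FileSystemUtil implementation.
final class RelocateSketch {

  /** Hypothetical stand-in for a filesystem whose rename() may return false on failure. */
  interface RenameFs {
    boolean rename(URI src, URI dst) throws IOException;
  }

  /**
   * Returns true if 'path' belongs to the filesystem identified by 'fsUri'.
   * The decision is made by comparing scheme and authority, so it does not
   * depend on any call throwing an exception for a foreign path.
   * Assumption: a path without a scheme (or without an authority) is resolved
   * against the filesystem's defaults and therefore counts as a match.
   */
  static boolean isPathOnFileSystem(URI path, URI fsUri) {
    String scheme = path.getScheme();
    if (scheme == null) return true;
    if (!scheme.equalsIgnoreCase(fsUri.getScheme())) return false;
    String authority = path.getAuthority();
    if (authority == null) return true;
    return authority.equalsIgnoreCase(fsUri.getAuthority());
  }

  /** Calls rename() and raises an IOException if it reports failure by returning false. */
  static void renameOrThrow(RenameFs fs, URI src, URI dst) throws IOException {
    if (!fs.rename(src, dst)) {
      throw new IOException("Failed to rename " + src + " to " + dst);
    }
  }

  public static void main(String[] args) {
    URI abfs = URI.create("abfs://container@account.dfs.core.windows.net/");
    URI src = URI.create("hdfs://namenode:8020/tmp/data.parq");
    URI dst = URI.create("abfs://container@account.dfs.core.windows.net/tbl/data.parq");

    System.out.println("source on destination fs: " + isPathOnFileSystem(src, abfs));

    // A filesystem that signals failure only through its return value.
    RenameFs alwaysFails = (s, d) -> false;
    try {
      renameOrThrow(alwaysFails, src, dst);
    } catch (IOException e) {
      System.out.println("rename failure surfaced: " + e.getMessage());
    }
  }
}
```

With a check of this shape, an "hdfs://" source is no longer mistaken for a file on the "abfs://" destination, and a rename() that returns false can no longer pass unnoticed.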