Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17316 )

Change subject: IMPALA-10658: LOAD DATA INPATH silently fails between HDFS and 
Azure ABFS
......................................................................


Patch Set 1:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/17316/1/fe/src/main/java/org/apache/impala/common/FileSystemUtil.java
File fe/src/main/java/org/apache/impala/common/FileSystemUtil.java:

http://gerrit.cloudera.org:8080/#/c/17316/1/fe/src/main/java/org/apache/impala/common/FileSystemUtil.java@584
PS1, Line 584: isPathOnFileSystem
> This patch adds requirement for checking existence. So we can only used it
It also returns true if the parameter is a qualified path (which is always the 
case currently), and its URI/scheme belongs to the filesystem 
('path.equals(qp))'), but does not exist. So in that sense the name 
'isPathExistsOnFileSystem' could be misleading.

Currently what this method does is:

* For qualified paths: return /*fast-path=*/ path.equals(qualifiedPath) || 
/*slow-path=*/ fs.exists(path)
* For unqualified paths: return /*slow-path=*/ fs.exists(path)


http://gerrit.cloudera.org:8080/#/c/17316/1/fe/src/main/java/org/apache/impala/common/FileSystemUtil.java@588
PS1, Line 588:       return fs.exists(path);
> I think it'd be good if we have unit tests in FileSystemUtilTest on this me
Good idea. We already have the necessary configuration in the dev environment 
to test the different file system implementations, so it wasn't too hard to add 
an extra unit test to FileSystemUtilTest.


http://gerrit.cloudera.org:8080/#/c/17316/1/fe/src/main/java/org/apache/impala/common/FileSystemUtil.java@588
PS1, Line 588: return fs.exists(path);
> I wonder if there is a need to call fs.exists(path) here, since a relative
This would be similar to the 'base' version. It wouldn't work because 
AzureBlobFileSystem doesn't throw an exception when it sees a different 
qualified path than fs. E.g. it happily substitutes 'hdfs://' to 'abfs://'.

Currently we only use this function in relocateFile() to decide whether the 
source file is on the destination file system. So it seemed rational to me to 
check existence, but only if 'path.equals(qp)' returns false.

If we can assume that path is always qualified, then the following code should 
be enough:

 try {
   Path qp = fs.makeQualified(path);
   return path.equals(qp));
 }

Maybe I could add a precondition like this to ensure that we are dealing with a 
qualified path:

 
Precondition.checkState(!path.equals(Path.getPathWithoutSchemeAndAuthority(path));


http://gerrit.cloudera.org:8080/#/c/17316/1/fe/src/main/java/org/apache/impala/common/FileSystemUtil.java@590
PS1, Line 590:       // Path is not on fs.
> Maybe it's worth some logs here. So we know what path (whether qualified or
Added a debug log.



--
To view, visit http://gerrit.cloudera.org:8080/17316
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id807e8a200b83283a09d3a917185cabab930017d
Gerrit-Change-Number: 17316
Gerrit-PatchSet: 1
Gerrit-Owner: Zoltan Borok-Nagy <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Joe McDonnell <[email protected]>
Gerrit-Reviewer: Qifan Chen <[email protected]>
Gerrit-Reviewer: Quanlong Huang <[email protected]>
Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]>
Gerrit-Comment-Date: Tue, 20 Apr 2021 17:02:55 +0000
Gerrit-HasComments: Yes

Reply via email to