Impala Public Jenkins has submitted this change and it was merged.

Change subject: IMPALA-5431: Remove redundant path exists checks during table load
......................................................................

IMPALA-5431: Remove redundant path exists checks during table load

There are multiple places that do an exists() check on a path and then
perform some subsequent action on it. This pattern results in two RPCs
to the NN (one for the exists() check and one for the subsequent
action). We can avoid the exists() check in these cases since most HDFS
methods on paths throw a FileNotFoundException if the path does not
exist. This can save an RPC to NN and improve the metadata loading time.

Testing: Enough tests already cover this code path. This patch passed
core and exhaustive tests. Metadata benchmark shows decent increase in
perf numbers, for ex:

100K-PARTITIONS-1M-FILES-CUSTOM-05-QUERY-AFTER-INV              -20.51%
80-PARTITIONS-250K-FILES-S3-03-RECOVER                          -20.58%
80-PARTITIONS-250K-FILES-11-DROP-PARTITION                      -22.13%
80-PARTITIONS-250K-FILES-S3-08-ADD-PARTITION                    -22.38%
80-PARTITIONS-250K-FILES-S3-12-DROP                             -23.69%
100K-PARTITIONS-1M-FILES-CUSTOM-11-REFRESH-PARTITION            -23.91%
100K-PARTITIONS-1M-FILES-CUSTOM-10-REFRESH-AFTER-ADD-PARTITION  -26.04%
100K-PARTITIONS-1M-FILES-CUSTOM-07-REFRESH                      -26.38%
80-PARTITIONS-250K-FILES-S3-02-CREATE                           -36.47%
100K-PARTITIONS-1M-FILES-CUSTOM-12-QUERY-PARTITIONS             -58.72%
80-PARTITIONS-250K-FILES-S3-01-DROP                             -95.33%
80-PARTITIONS-250K-FILES-01-DROP                                -95.93%

Change-Id: Id10ecf64ea2eda2d0f9299c0aa371933eca22281
Reviewed-on: http://gerrit.cloudera.org:8080/7095
Reviewed-by: Bharath Vissapragada <[email protected]>
Tested-by: Impala Public Jenkins
---
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/common/FileSystemUtil.java
2 files changed, 64 insertions(+), 39 deletions(-)

Approvals:
  Impala Public Jenkins: Verified
  Bharath Vissapragada: Looks good to me, approved

--
To view, visit http://gerrit.cloudera.org:8080/7095
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: Id10ecf64ea2eda2d0f9299c0aa371933eca22281
Gerrit-PatchSet: 8
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bharath Vissapragada <[email protected]>
Gerrit-Reviewer: Alex Behm <[email protected]>
Gerrit-Reviewer: Bharath Vissapragada <[email protected]>
Gerrit-Reviewer: Dimitris Tsirogiannis <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins
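
For readers unfamiliar with the pattern the commit message describes, the
following is a rough sketch of "check then act" versus "act and catch the
not-found exception". It is not the code from this patch: it uses
java.nio.file on a local filesystem as a stand-in for the Hadoop FileSystem
API, and the class and method names (PathListing, listWithExistsCheck,
listWithoutExistsCheck) are hypothetical. On HDFS each filesystem call would
be an RPC to the NameNode, so the second form would issue one call instead
of two.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration only; a local-filesystem analogue of the HDFS pattern.
final class PathListing {
  // Before: two filesystem calls, one for the existence check and one for the listing.
  static List<Path> listWithExistsCheck(Path dir) throws IOException {
    if (!Files.exists(dir)) {
      return Collections.emptyList();
    }
    return list(dir);
  }

  // After: one filesystem call; a missing path surfaces as an exception instead.
  static List<Path> listWithoutExistsCheck(Path dir) throws IOException {
    try {
      return list(dir);
    } catch (NoSuchFileException e) {
      // The path does not exist; treat it the same way the exists() branch did.
      return Collections.emptyList();
    }
  }

  // Collects the entries of a directory; throws NoSuchFileException if dir is missing.
  private static List<Path> list(Path dir) throws IOException {
    List<Path> entries = new ArrayList<>();
    try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
      for (Path entry : stream) {
        entries.add(entry);
      }
    }
    return entries;
  }

  public static void main(String[] args) throws IOException {
    Path dir = Paths.get(args.length > 0 ? args[0] : ".");
    System.out.println(listWithoutExistsCheck(dir).size() + " entries in " + dir);
  }
}
```

Both methods return the same result for existing and missing directories; the
difference is only in how many calls reach the filesystem. A side benefit of
the second form is that it avoids the window between the check and the action
in which the path could be removed by another process.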
