Bharath Vissapragada has posted comments on this change. ( http://gerrit.cloudera.org:8080/12437 )
Change subject: IMPALA-8185: Abstract out real/mock file system operations ...................................................................... Patch Set 5: (4 comments) The general separation of FS stuff looks good to me. http://gerrit.cloudera.org:8080/#/c/12437/4/fe/src/main/java/org/apache/impala/analysis/HdfsFileSystemFacade.java File fe/src/main/java/org/apache/impala/analysis/HdfsFileSystemFacade.java: http://gerrit.cloudera.org:8080/#/c/12437/4/fe/src/main/java/org/apache/impala/analysis/HdfsFileSystemFacade.java@102 PS4, Line 102: public ScanAllocation allocateScans(FeFsTable table, TScanRangeSpec scanRangeSpecs, > Do you know what this is used for? Per my understanding, we use this in cost formulae. Say, if you have a join on top of the scan, the lhs scan determines the number of nodes on which the join runs and we factor in the num nodes to determine the overall join cost (for ex: is inverted join any cheaper?) Yes this can diverge with the scheduler implementation and ideally we need to consolidate. I remember asking Alex exactly this and he confirmed the behavior. http://gerrit.cloudera.org:8080/#/c/12437/5/fe/src/main/java/org/apache/impala/analysis/HdfsFileSystemFacade.java File fe/src/main/java/org/apache/impala/analysis/HdfsFileSystemFacade.java: http://gerrit.cloudera.org:8080/#/c/12437/5/fe/src/main/java/org/apache/impala/analysis/HdfsFileSystemFacade.java@42 PS5, Line 42: new Configuration(); Use FileSystemUtil.getConfiguration(). I vaguely remember that building a new Conf object was expensive and I don't think we need to do it for every query. http://gerrit.cloudera.org:8080/#/c/12437/5/fe/src/main/java/org/apache/impala/analysis/MockFileSystemFacade.java File fe/src/main/java/org/apache/impala/analysis/MockFileSystemFacade.java: http://gerrit.cloudera.org:8080/#/c/12437/5/fe/src/main/java/org/apache/impala/analysis/MockFileSystemFacade.java@40 PS5, Line 40: public Path validatePath(Analyzer analyzer, Path path, Add comments why we pass? http://gerrit.cloudera.org:8080/#/c/12437/5/testdata/workloads/functional-planner/queries/PlannerTest/mt-dop-validation.test File testdata/workloads/functional-planner/queries/PlannerTest/mt-dop-validation.test: http://gerrit.cloudera.org:8080/#/c/12437/5/testdata/workloads/functional-planner/queries/PlannerTest/mt-dop-validation.test@222 PS5, Line 222: | Per-Host Resources: mem-estimate=88.00MB mem-reservation=88.00MB thread-reservation=1 Any idea why the mem-estimates changed with this patch? -- To view, visit http://gerrit.cloudera.org:8080/12437 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1a385923b64c9fb59cc6e700ee7ee14919398e6d Gerrit-Change-Number: 12437 Gerrit-PatchSet: 5 Gerrit-Owner: Paul Rogers <[email protected]> Gerrit-Reviewer: Bharath Vissapragada <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Paul Rogers <[email protected]> Gerrit-Reviewer: Philip Zeyliger <[email protected]> Gerrit-Comment-Date: Thu, 21 Feb 2019 21:56:03 +0000 Gerrit-HasComments: Yes
