Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/9065 )
Change subject: IMPALA-5440 Add planner tests with extreme statistics values ...................................................................... Patch Set 8: (9 comments) thanks for the changes-- its heading in the right direction. I reviewed the patch that fixes the overflow issues in more detail and suggested additional tests to cover the cases that the patch touched. Some of those fixes were already guarded, so IMO, are lower priority than the ones that were not guarded (HdfsScanNode and HdfsTableSink) http://gerrit.cloudera.org:8080/#/c/9065/8/fe/src/test/java/org/apache/impala/planner/PlannerTest.java File fe/src/test/java/org/apache/impala/planner/PlannerTest.java: http://gerrit.cloudera.org:8080/#/c/9065/8/fe/src/test/java/org/apache/impala/planner/PlannerTest.java@543 PS8, Line 543: lets add a test for FULL_OUTER_JOIN in reference to L456 of https://gerrit.cloudera.org/#/c/7084/5/fe/src/main/java/org/apache/impala/planner/JoinNode.java http://gerrit.cloudera.org:8080/#/c/9065/8/fe/src/test/java/org/apache/impala/planner/PlannerTest.java@595 PS8, Line 595: can we add a test to exercise the changes in HdfsScanNode: L699 and L991 https://gerrit.cloudera.org/#/c/7084/5/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java I see that L991 is a memory estimate, which is covered in Jim's post so lets see if we can do it. the expression in question depends on readSize, which is a backend config, so we should be able to set it appropriately to expose the previous issue (incidentally, its not obvious from that piece of code if readSize is allowed to be 0 which would be a div 0 issue). The other components of the expression depend on perThreadIoBuffers and maxScannerThreads. maxScannerThreads depends on num hosts, which I'm not sure how to set at this point. How about L699? that would require that stats are set per partition? I'd make this a higher priority since this one since it was not guarded before the overflow patch. http://gerrit.cloudera.org:8080/#/c/9065/8/fe/src/test/java/org/apache/impala/planner/PlannerTest.java@596 PS8, Line 596: // create a table with negative rows and cardinality should not be negative can we add a test for HdfsTableSink: L99 of https://gerrit.cloudera.org/#/c/7084/5/fe/src/main/java/org/apache/impala/planner/HdfsTableSink.java That, in turn, will exercise L268 of https://gerrit.cloudera.org/#/c/7084/5/fe/src/main/java/org/apache/impala/planner/PlanFragment.java which I think can be tested with extreme NDV settings I'd make this a higher priority since this one since it was not guarded before the overflow patch. http://gerrit.cloudera.org:8080/#/c/9065/8/fe/src/test/java/org/apache/impala/planner/PlannerTest.java@597 PS8, Line 597: addTestTable("CREATE EXTERNAL TABLE tpch.ex_customer_cardinality_neg (" can we add a test for subplan nodes: L88 of https://gerrit.cloudera.org/#/c/7084/5/fe/src/main/java/org/apache/impala/planner/SubplanNode.java look at planner tests for nested collection (you have an example above). http://gerrit.cloudera.org:8080/#/c/9065/8/fe/src/test/java/org/apache/impala/planner/PlannerTest.java@603 PS8, Line 603: } can we make a test for exchange: L96 of https://gerrit.cloudera.org/#/c/7084/5/fe/src/main/java/org/apache/impala/planner/ExchangeNode.java http://gerrit.cloudera.org:8080/#/c/9065/8/fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java File fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java: http://gerrit.cloudera.org:8080/#/c/9065/8/fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java@783 PS8, Line 783: cardinality pls improve the comment. for example, cardinality of what? what is the meaning of the parameters? a previous version included a description of these using javadoc params; would be good to get back that content, just without javadocs. let me know if that's unclear. http://gerrit.cloudera.org:8080/#/c/9065/8/fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java@783 PS8, Line 783: boundary pls be more specific for the bounds that you intend to check. http://gerrit.cloudera.org:8080/#/c/9065/8/fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java@799 PS8, Line 799: node.estimated_stats == null you'll want this check first, otherwise you'll get a null pointer exception with the preceding check in the event this is true. however, it seems you're asserting that estimated_stats must be set for each node. if so, lets check that first and issue a more specific error message. http://gerrit.cloudera.org:8080/#/c/9065/8/fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java@819 PS8, Line 819: function generates the testing query this comment can improved. for example, it computes stats for the given table and issues a query (how is that query related to the table?) another option is that if you're not using this method any where else, consider inlining it the method above. -- To view, visit http://gerrit.cloudera.org:8080/9065 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I86dec47cf1438882cafaec53e97864ccfcdff3cb Gerrit-Change-Number: 9065 Gerrit-PatchSet: 8 Gerrit-Owner: xyutin...@cloudera.com Gerrit-Reviewer: Alex Behm <alex.b...@cloudera.com> Gerrit-Reviewer: Jim Apple <jbapple-imp...@apache.org> Gerrit-Reviewer: Lars Volker <l...@cloudera.com> Gerrit-Reviewer: Vuk Ercegovac <vercego...@cloudera.com> Gerrit-Reviewer: xyutin...@cloudera.com Gerrit-Comment-Date: Wed, 31 Jan 2018 19:15:15 +0000 Gerrit-HasComments: Yes