Vuk Ercegovac has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9065 )

Change subject: IMPALA-5440 Add planner tests with extreme statistics values
......................................................................


Patch Set 8:

(9 comments)

thanks for the changes-- its heading in the right direction.
I reviewed the patch that fixes the overflow issues in more detail and 
suggested additional tests to cover the cases that the patch touched.
Some of those fixes were already guarded, so IMO, are lower priority than the 
ones that were not guarded (HdfsScanNode and HdfsTableSink)

http://gerrit.cloudera.org:8080/#/c/9065/8/fe/src/test/java/org/apache/impala/planner/PlannerTest.java
File fe/src/test/java/org/apache/impala/planner/PlannerTest.java:

http://gerrit.cloudera.org:8080/#/c/9065/8/fe/src/test/java/org/apache/impala/planner/PlannerTest.java@543
PS8, Line 543:
lets add a test for FULL_OUTER_JOIN in reference to L456 of 
https://gerrit.cloudera.org/#/c/7084/5/fe/src/main/java/org/apache/impala/planner/JoinNode.java


http://gerrit.cloudera.org:8080/#/c/9065/8/fe/src/test/java/org/apache/impala/planner/PlannerTest.java@595
PS8, Line 595:
can we add a test to exercise the changes in HdfsScanNode: L699 and L991 
https://gerrit.cloudera.org/#/c/7084/5/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java

I see that L991 is a memory estimate, which is covered in Jim's post so lets 
see if we can do it. the expression in question depends on readSize, which is a 
backend config, so we should be able to set it appropriately to expose the 
previous issue (incidentally, its not obvious from that piece of code if 
readSize is allowed to be 0 which would be a div 0 issue). The other components 
of the expression depend on perThreadIoBuffers and maxScannerThreads. 
maxScannerThreads depends on num hosts, which I'm not sure how to set at this 
point.

How about L699? that would require that stats are set per partition?

I'd make this a higher priority since this one since it was not guarded before 
the overflow patch.


http://gerrit.cloudera.org:8080/#/c/9065/8/fe/src/test/java/org/apache/impala/planner/PlannerTest.java@596
PS8, Line 596:     // create a table with negative rows and cardinality should 
not be negative
can we add a test for HdfsTableSink: L99 of 
https://gerrit.cloudera.org/#/c/7084/5/fe/src/main/java/org/apache/impala/planner/HdfsTableSink.java

That, in turn, will exercise L268 of 
https://gerrit.cloudera.org/#/c/7084/5/fe/src/main/java/org/apache/impala/planner/PlanFragment.java
 which I think can be tested with extreme NDV settings

I'd make this a higher priority since this one since it was not guarded before 
the overflow patch.


http://gerrit.cloudera.org:8080/#/c/9065/8/fe/src/test/java/org/apache/impala/planner/PlannerTest.java@597
PS8, Line 597:     addTestTable("CREATE EXTERNAL TABLE 
tpch.ex_customer_cardinality_neg ("
can we add a test for subplan nodes: L88 of 
https://gerrit.cloudera.org/#/c/7084/5/fe/src/main/java/org/apache/impala/planner/SubplanNode.java

look at planner tests for nested collection (you have an example above).


http://gerrit.cloudera.org:8080/#/c/9065/8/fe/src/test/java/org/apache/impala/planner/PlannerTest.java@603
PS8, Line 603:   }
can we make a test for exchange: L96 of 
https://gerrit.cloudera.org/#/c/7084/5/fe/src/main/java/org/apache/impala/planner/ExchangeNode.java


http://gerrit.cloudera.org:8080/#/c/9065/8/fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java
File fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java:

http://gerrit.cloudera.org:8080/#/c/9065/8/fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java@783
PS8, Line 783: cardinality
pls improve the comment. for example, cardinality of what? what is the meaning 
of the parameters? a previous version included a description of these using 
javadoc params; would be good to get back that content, just without javadocs. 
let me know if that's unclear.


http://gerrit.cloudera.org:8080/#/c/9065/8/fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java@783
PS8, Line 783: boundary
pls be more specific for the bounds that you intend to check.


http://gerrit.cloudera.org:8080/#/c/9065/8/fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java@799
PS8, Line 799: node.estimated_stats == null
you'll want this check first, otherwise you'll get a null pointer exception 
with the preceding check in the event this is true.

however, it seems you're asserting that estimated_stats must be set for each 
node. if so, lets check that first and issue a more specific error message.


http://gerrit.cloudera.org:8080/#/c/9065/8/fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java@819
PS8, Line 819: function generates the testing query
this comment can improved. for example, it computes stats for the given table 
and issues a query (how is that query related to the table?)

another option is that if you're not using this method any where else, consider 
inlining it the method above.



--
To view, visit http://gerrit.cloudera.org:8080/9065
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I86dec47cf1438882cafaec53e97864ccfcdff3cb
Gerrit-Change-Number: 9065
Gerrit-PatchSet: 8
Gerrit-Owner: xyutin...@cloudera.com
Gerrit-Reviewer: Alex Behm <alex.b...@cloudera.com>
Gerrit-Reviewer: Jim Apple <jbapple-imp...@apache.org>
Gerrit-Reviewer: Lars Volker <l...@cloudera.com>
Gerrit-Reviewer: Vuk Ercegovac <vercego...@cloudera.com>
Gerrit-Reviewer: xyutin...@cloudera.com
Gerrit-Comment-Date: Wed, 31 Jan 2018 19:15:15 +0000
Gerrit-HasComments: Yes

Reply via email to