[jira] [Commented] (HIVE-15023) SimpleFetchOptimizer needs to optimize limit=0
[ https://issues.apache.org/jira/browse/HIVE-15023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15651701#comment-15651701 ]

Pengcheng Xiong commented on HIVE-15023:

[~kgyrtkirk], thanks for finding this out. I have pushed the patch to the master. Thanks again.

> SimpleFetchOptimizer needs to optimize limit=0
> --
>
> Key: HIVE-15023
> URL: https://issues.apache.org/jira/browse/HIVE-15023
> Project: Hive
> Issue Type: Sub-task
> Affects Versions: 2.1.0
> Reporter: Pengcheng Xiong
> Assignee: Pengcheng Xiong
> Fix For: 2.2.0
>
> Attachments: HIVE-15023.01.patch, HIVE-15023.02.patch
>
>
> on current master
> {code}
> hive> explain select key from src limit 0;
> OK
> STAGE DEPENDENCIES:
>   Stage-0 is a root stage
> STAGE PLANS:
>   Stage: Stage-0
>     Fetch Operator
>       limit: 0
>       Processor Tree:
>         TableScan
>           alias: src
>           Statistics: Num rows: 500 Data size: 5312 Basic stats: COMPLETE Column stats: NONE
>           Select Operator
>             expressions: key (type: string)
>             outputColumnNames: _col0
>             Statistics: Num rows: 500 Data size: 5312 Basic stats: COMPLETE Column stats: NONE
>             Limit
>               Number of rows: 0
>               Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column stats: NONE
>               ListSink
> Time taken: 7.534 seconds, Fetched: 20 row(s)
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (HIVE-15023) SimpleFetchOptimizer needs to optimize limit=0
[ https://issues.apache.org/jira/browse/HIVE-15023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15650960#comment-15650960 ]

Zoltan Haindrich commented on HIVE-15023:

[~pxiong] it seems to me that there is a qtest which has "evaded" the output update ;) and it's affected by the limit 0 optimization:
https://builds.apache.org/job/PreCommit-HIVE-Build/2046/testReport/org.apache.hadoop.hive.cli/TestSparkCliDriver/testCliDriver_limit_pushdown_/
[jira] [Commented] (HIVE-15023) SimpleFetchOptimizer needs to optimize limit=0
[ https://issues.apache.org/jira/browse/HIVE-15023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15649092#comment-15649092 ]

Pengcheng Xiong commented on HIVE-15023:

pushed to master. Thanks [~ashutoshc] for the review.
[jira] [Commented] (HIVE-15023) SimpleFetchOptimizer needs to optimize limit=0
[ https://issues.apache.org/jira/browse/HIVE-15023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15645547#comment-15645547 ]

Ashutosh Chauhan commented on HIVE-15023:

LGTM +1
[jira] [Commented] (HIVE-15023) SimpleFetchOptimizer needs to optimize limit=0
[ https://issues.apache.org/jira/browse/HIVE-15023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15645537#comment-15645537 ]

Pengcheng Xiong commented on HIVE-15023:

All the tests look good to me except
{code}
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_4] 40 sec 1
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[union_fast_stats] 10 sec 58
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[join_acid_non_acid] 3 sec 58
{code}
They seem unrelated. [~ashutoshc] or [~jcamachorodriguez], could you take a look? Thanks.
[jira] [Commented] (HIVE-15023) SimpleFetchOptimizer needs to optimize limit=0
[ https://issues.apache.org/jira/browse/HIVE-15023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15645404#comment-15645404 ]

Hive QA commented on HIVE-15023:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12837814/HIVE-15023.02.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 10 failed/errored test(s), 10628 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[annotate_stats_limit] (batchId=10)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorization_limit] (batchId=33)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[join_acid_non_acid] (batchId=150)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[limit_pushdown3] (batchId=141)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[limit_pushdown] (batchId=148)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[offset_limit_ppd_optimizer] (batchId=148)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[union_fast_stats] (batchId=145)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_4] (batchId=91)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[vectorization_limit] (batchId=90)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[limit_pushdown] (batchId=121)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2007/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2007/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2007/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 10 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12837814 - PreCommit-HIVE-Build
[jira] [Commented] (HIVE-15023) SimpleFetchOptimizer needs to optimize limit=0
[ https://issues.apache.org/jira/browse/HIVE-15023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15613651#comment-15613651 ]

Ashutosh Chauhan commented on HIVE-15023:

seems like tests didn't run on this. Reattach to trigger tests.
[jira] [Commented] (HIVE-15023) SimpleFetchOptimizer needs to optimize limit=0
[ https://issues.apache.org/jira/browse/HIVE-15023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15592600#comment-15592600 ]

Jesus Camacho Rodriguez commented on HIVE-15023:

Your solution works; as I said, maybe you could just add a comment to the _if_ clause. Expect some additional plan improvements.
[jira] [Commented] (HIVE-15023) SimpleFetchOptimizer needs to optimize limit=0
[ https://issues.apache.org/jira/browse/HIVE-15023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15592592#comment-15592592 ]

Pengcheng Xiong commented on HIVE-15023:

[~jcamachorodriguez], I proposed the same plan before, to move L200 ahead of L119, and I got lots of test case failures. Thus, I think it is not simple to move that line, as you said. :) Does my patch solve all your problems or not? Which one still fails?
[jira] [Commented] (HIVE-15023) SimpleFetchOptimizer needs to optimize limit=0
[ https://issues.apache.org/jira/browse/HIVE-15023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15591142#comment-15591142 ]

Jesus Camacho Rodriguez commented on HIVE-15023:

I checked the code a bit further and I think the approach I proposed above might not be as simple as I thought, since we would need to create the task, and thus it is not simply a matter of moving the clause from one place to another... I think it is OK then to proceed, but if we leave the fix as it is now, please add a comment to the new _if_ statement in SimpleFetchOptimizer explaining why we are bailing out there. Thanks
[jira] [Commented] (HIVE-15023) SimpleFetchOptimizer needs to optimize limit=0
[ https://issues.apache.org/jira/browse/HIVE-15023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15591060#comment-15591060 ]

Jesus Camacho Rodriguez commented on HIVE-15023:

[~pxiong], this indeed solves some of the issues referred to in HIVE-14866. I wrongly thought that one of the problems was that we were not setting the _outerQueryLimit_ variable for simple queries in SemanticAnalyzer, but we do; thanks for discovering that.

To properly understand the fix: _outerQueryLimit=0_ was previously checked in TaskCompiler (L200), but apparently if SimpleFetchOptimizer kicks in, we never reach that line (we enter L119-L137), thus you are proposing to bail out from SimpleFetchOptimizer if limit is 0, is that right?

I think another possibility is to move the check in TaskCompiler at L200 and have it before L119; the advantage of this approach would be that we avoid dealing with the special case _limit=0_ in both places. What do you think? Would that work?
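
For readers following the control flow described in the comment above, here is a minimal, self-contained Java sketch of the ordering problem. It is not Hive code: the class `LimitZeroSketch`, the `QuerySketch` type, the `bailOutOnLimitZero` flag and the returned strings are all hypothetical names made up for illustration. Only the idea comes from the thread: an earlier fetch-conversion branch returns before the later _limit=0_ check is reached, unless the optimizer bails out when the limit is 0.

```java
// Illustrative model only; none of these names exist in Hive.
class LimitZeroSketch {

    /** Hypothetical stand-in for the parts of the parse context that matter here. */
    static final class QuerySketch {
        final int outerQueryLimit;   // -1 would mean "no LIMIT clause"
        final boolean simpleSelect;  // eligible for fetch-only conversion

        QuerySketch(int outerQueryLimit, boolean simpleSelect) {
            this.outerQueryLimit = outerQueryLimit;
            this.simpleSelect = simpleSelect;
        }
    }

    /** Stand-in for the fetch optimizer: true means the query was converted. */
    static boolean tryFetchConversion(QuerySketch q, boolean bailOutOnLimitZero) {
        // The guard discussed in the thread: with limit=0, skip the conversion so
        // that the later limit=0 check in the compiler still gets a chance to run.
        if (bailOutOnLimitZero && q.outerQueryLimit == 0) {
            return false;
        }
        return q.simpleSelect;
    }

    /** Stand-in for the compile step; returns a label for the path taken. */
    static String compile(QuerySketch q, boolean bailOutOnLimitZero) {
        // Earlier branch (L119-L137 in the thread): a converted query returns here.
        if (tryFetchConversion(q, bailOutOnLimitZero)) {
            return "FETCH_CONVERSION";
        }
        // Later check (L200 in the thread): only reached if the branch above was skipped.
        if (q.outerQueryLimit == 0) {
            return "LIMIT_ZERO_SHORTCUT";
        }
        return "REGULAR_PLAN";
    }

    public static void main(String[] args) {
        QuerySketch limitZero = new QuerySketch(0, true);
        // Without the guard, the limit=0 check is never reached.
        System.out.println(compile(limitZero, false));
        // With the guard, the query falls through to the limit=0 check.
        System.out.println(compile(limitZero, true));
    }
}
```

The alternative raised in the comment, moving the _limit=0_ check ahead of the conversion branch, would correspond to swapping the two `if` blocks in `compile`; the sketch does not model why that turned out to be harder in the real code.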
[jira] [Commented] (HIVE-15023) SimpleFetchOptimizer needs to optimize limit=0
[ https://issues.apache.org/jira/browse/HIVE-15023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15590060#comment-15590060 ]

Ashutosh Chauhan commented on HIVE-15023:

This might have some overlap with HIVE-14866