[jira] [Work logged] (HIVE-19653) Incorrect predicate pushdown for groupby with grouping sets

2020-06-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-19653?focusedWorklogId=450098&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-450098
 ]

ASF GitHub Bot logged work on HIVE-19653:
-

Author: ASF GitHub Bot
Created on: 24/Jun/20 00:25
Start Date: 24/Jun/20 00:25
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #354:
URL: https://github.com/apache/hive/pull/354


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 450098)
Time Spent: 2h  (was: 1h 50m)

> Incorrect predicate pushdown for groupby with grouping sets
> ---
>
> Key: HIVE-19653
> URL: https://issues.apache.org/jira/browse/HIVE-19653
> Project: Hive
>  Issue Type: Bug
>  Components: Logical Optimizer
>Reporter: Zhang Li
>Assignee: Zhihua Deng
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-19653.1.patch, HIVE-19653.patch
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> Consider the following query:
> {code:java}
> CREATE TABLE T1(a STRING, b STRING, s BIGINT);
> INSERT OVERWRITE TABLE T1 VALUES ('', '', 123456);
> SELECT * FROM (
> SELECT a, b, sum(s)
> FROM T1
> GROUP BY a, b GROUPING SETS ((), (a), (b), (a, b))
> ) t WHERE a IS NOT NULL;
> {code}
> When hive.optimize.ppd is enabled (and hive.cbo.enable=false), the query will 
> output:
> {code:java}
> NULL  NULL  123456
> NULL        123456
>       NULL  123456
>             123456
> {code}
> We can see that the predicate "a IS NOT NULL" has no effect, which is incorrect:
> the rows in which a is NULL should have been filtered out.
> When performing PPD optimization for a GBY operator, we should make sure that
> every grouping set contains the expression being pushed down. Otherwise the
> value of the expression can change after the GBY (a grouping set that omits a
> column emits NULL for it), and the result is wrong.
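
The condition described in the issue can be sketched in isolation. The following is a minimal illustration, not Hive's implementation: the class GroupingSetPushdown and its methods (setOf, reportedGroupingSets, canPushDown, emittedA) are hypothetical names, and grouping sets are modeled as plain sets of column names rather than Hive's bit sets.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper for illustration only; not part of Hive.
class GroupingSetPushdown {

    // Convenience constructor for a set of column names.
    static Set<String> setOf(String... columns) {
        return new HashSet<>(Arrays.asList(columns));
    }

    // The grouping sets from the reported query: (), (a), (b), (a, b).
    static List<Set<String>> reportedGroupingSets() {
        return Arrays.asList(setOf(), setOf("a"), setOf("b"), setOf("a", "b"));
    }

    // A predicate is safe to push below the group-by only if every
    // grouping set keeps all of the columns the predicate refers to.
    static boolean canPushDown(Set<String> predicateColumns,
                               List<Set<String>> groupingSets) {
        for (Set<String> groupingSet : groupingSets) {
            if (!groupingSet.containsAll(predicateColumns)) {
                return false;
            }
        }
        return true;
    }

    // Values of column "a" emitted for one input row, one per grouping set.
    // A grouping set that omits "a" emits NULL regardless of the input value.
    static List<String> emittedA(String a, List<Set<String>> groupingSets) {
        List<String> out = new ArrayList<>();
        for (Set<String> groupingSet : groupingSets) {
            out.add(groupingSet.contains("a") ? a : null);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Set<String>> sets = reportedGroupingSets();
        // "a IS NOT NULL" refers to column a, which () and (b) do not contain.
        System.out.println(canPushDown(setOf("a"), sets)); // prints "false"
        System.out.println(emittedA("", sets));
    }
}
```

In this model the non-NULL input value of a still produces NULL output rows for the grouping sets () and (b), which is why evaluating "a IS NOT NULL" below the group-by is not equivalent to evaluating it above.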



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-19653) Incorrect predicate pushdown for groupby with grouping sets

2020-06-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-19653?focusedWorklogId=446705&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-446705
 ]

ASF GitHub Bot logged work on HIVE-19653:
-

Author: ASF GitHub Bot
Created on: 16/Jun/20 16:55
Start Date: 16/Jun/20 16:55
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #354:
URL: https://github.com/apache/hive/pull/354#issuecomment-644886564


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.





Issue Time Tracking
---

Worklog Id: (was: 446705)
Time Spent: 1h 50m  (was: 1h 40m)

> Incorrect predicate pushdown for groupby with grouping sets
> ---
>
> Key: HIVE-19653
> URL: https://issues.apache.org/jira/browse/HIVE-19653
> Project: Hive
>  Issue Type: Bug
>  Components: Logical Optimizer
>Reporter: Zhang Li
>Assignee: Zhihua Deng
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-19653.1.patch, HIVE-19653.patch
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h


[jira] [Work logged] (HIVE-19653) Incorrect predicate pushdown for groupby with grouping sets

2020-06-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-19653?focusedWorklogId=444656&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-444656
 ]

ASF GitHub Bot logged work on HIVE-19653:
-

Author: ASF GitHub Bot
Created on: 12/Jun/20 00:42
Start Date: 12/Jun/20 00:42
Worklog Time Spent: 10m 
  Work Description: jcamachor merged pull request #1070:
URL: https://github.com/apache/hive/pull/1070


   





Issue Time Tracking
---

Worklog Id: (was: 444656)
Time Spent: 1h 40m  (was: 1.5h)

> Incorrect predicate pushdown for groupby with grouping sets
> ---
>
> Key: HIVE-19653
> URL: https://issues.apache.org/jira/browse/HIVE-19653
> Project: Hive
>  Issue Type: Bug
>  Components: Logical Optimizer
>Reporter: Zhang Li
>Assignee: Zhihua Deng
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-19653.1.patch, HIVE-19653.patch
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h


[jira] [Work logged] (HIVE-19653) Incorrect predicate pushdown for groupby with grouping sets

2020-06-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-19653?focusedWorklogId=444346&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-444346
 ]

ASF GitHub Bot logged work on HIVE-19653:
-

Author: ASF GitHub Bot
Created on: 11/Jun/20 15:38
Start Date: 11/Jun/20 15:38
Worklog Time Spent: 10m 
  Work Description: dengzhhu653 commented on pull request #1070:
URL: https://github.com/apache/hive/pull/1070#issuecomment-642744366


   @jcamachor Can you take a look at the changes? thanks.





Issue Time Tracking
---

Worklog Id: (was: 444346)
Time Spent: 1.5h  (was: 1h 20m)

> Incorrect predicate pushdown for groupby with grouping sets
> ---
>
> Key: HIVE-19653
> URL: https://issues.apache.org/jira/browse/HIVE-19653
> Project: Hive
>  Issue Type: Bug
>  Components: Logical Optimizer
>Reporter: Zhang Li
>Assignee: Zhihua Deng
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-19653.1.patch, HIVE-19653.patch
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h


[jira] [Work logged] (HIVE-19653) Incorrect predicate pushdown for groupby with grouping sets

2020-06-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-19653?focusedWorklogId=442964&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-442964
 ]

ASF GitHub Bot logged work on HIVE-19653:
-

Author: ASF GitHub Bot
Created on: 09/Jun/20 16:16
Start Date: 09/Jun/20 16:16
Worklog Time Spent: 10m 
  Work Description: dengzhhu653 commented on a change in pull request #1070:
URL: https://github.com/apache/hive/pull/1070#discussion_r437062135



##
File path: ql/src/java/org/apache/hadoop/hive/ql/ppd/OpProcFactory.java
##
@@ -733,6 +736,97 @@ private void applyFilterTransitivity(JoinOperator join, int targetPos, OpWalkerI
     }
   }
 
+  public static class GroupByPPD extends DefaultPPD implements SemanticNodeProcessor {
+
+    @Override
+    public Object process(Node nd, Stack<Node> stack, NodeProcessorCtx procCtx,
+        Object... nodeOutputs) throws SemanticException {
+      super.process(nd, stack, procCtx, nodeOutputs);
+      OpWalkerInfo owi = (OpWalkerInfo) procCtx;
+      GroupByDesc groupByDesc = ((GroupByOperator)nd).getConf();
+      ExprWalkerInfo prunedPred = owi.getPrunedPreds((Operator) nd);
+      if (prunedPred == null || !prunedPred.hasAnyCandidates() ||
+          !groupByDesc.isGroupingSetsPresent()) {
+        return null;
+      }
+
+      List<Long> groupingSets = groupByDesc.getListGroupingSets();
+      Map<String, List<ExprNodeDesc>> candidates = prunedPred.getFinalCandidates();
+      FastBitSet[] fastBitSets = new FastBitSet[groupingSets.size()];
+      int groupingSetPosition = groupByDesc.getGroupingSetPosition();
+      for (int pos = 0; pos < fastBitSets.length; pos ++) {
+        fastBitSets[pos] = GroupByOperator.groupingSet2BitSet(groupingSets.get(pos),
+            groupingSetPosition);
+      }
+      List<ExprNodeDesc> groupByKeys = ((GroupByOperator)nd).getConf().getKeys();
+      Map<ExprNodeDesc, ExprNodeDesc> newToOldExprMap = prunedPred.getNewToOldExprMap();
+      Map<String, List<ExprNodeDesc>> nonFinalCandidates = new HashMap<String, List<ExprNodeDesc>>();
+      for (Iterator<Map.Entry<String, List<ExprNodeDesc>>>

Review comment:
   Done, Thanks @jcamachor for the review!

##
File path: ql/src/java/org/apache/hadoop/hive/ql/ppd/OpProcFactory.java
##
@@ -733,6 +736,97 @@ private void applyFilterTransitivity(JoinOperator join, int targetPos, OpWalkerI
     }
   }
 
+  public static class GroupByPPD extends DefaultPPD implements SemanticNodeProcessor {
+
+    @Override
+    public Object process(Node nd, Stack<Node> stack, NodeProcessorCtx procCtx,
+        Object... nodeOutputs) throws SemanticException {
+      super.process(nd, stack, procCtx, nodeOutputs);
+      OpWalkerInfo owi = (OpWalkerInfo) procCtx;
+      GroupByDesc groupByDesc = ((GroupByOperator)nd).getConf();
+      ExprWalkerInfo prunedPred = owi.getPrunedPreds((Operator) nd);
+      if (prunedPred == null || !prunedPred.hasAnyCandidates() ||
+          !groupByDesc.isGroupingSetsPresent()) {
+        return null;
+      }
+
+      List<Long> groupingSets = groupByDesc.getListGroupingSets();
+      Map<String, List<ExprNodeDesc>> candidates = prunedPred.getFinalCandidates();
+      FastBitSet[] fastBitSets = new FastBitSet[groupingSets.size()];
+      int groupingSetPosition = groupByDesc.getGroupingSetPosition();
+      for (int pos = 0; pos < fastBitSets.length; pos ++) {
+        fastBitSets[pos] = GroupByOperator.groupingSet2BitSet(groupingSets.get(pos),
+            groupingSetPosition);
+      }
+      List<ExprNodeDesc> groupByKeys = ((GroupByOperator)nd).getConf().getKeys();
+      Map<ExprNodeDesc, ExprNodeDesc> newToOldExprMap = prunedPred.getNewToOldExprMap();
+      Map<String, List<ExprNodeDesc>> nonFinalCandidates = new HashMap<String, List<ExprNodeDesc>>();
+      for (Iterator<Map.Entry<String, List<ExprNodeDesc>>>
+           iter = candidates.entrySet().iterator(); iter.hasNext(); ) {
+        Map.Entry<String, List<ExprNodeDesc>> entry = iter.next();
+        List<ExprNodeDesc> residualExprs = new ArrayList<ExprNodeDesc>();
+        List<ExprNodeDesc> finalCandidates = new ArrayList<ExprNodeDesc>();
+        List<ExprNodeDesc> exprs = entry.getValue();
+        for (ExprNodeDesc expr : exprs) {
+          if (canPredPushdown(expr, groupByKeys, fastBitSets, groupingSetPosition)) {
+            finalCandidates.add(expr);
+          } else {
+            residualExprs.add(newToOldExprMap.get(expr));
+          }
+        }
+        if (!residualExprs.isEmpty()) {
+          nonFinalCandidates.put(entry.getKey(), residualExprs);
+        }
+
+        if (finalCandidates.isEmpty()) {
+          iter.remove();
+        } else {
+          exprs.clear();
+          exprs.addAll(finalCandidates);
+        }
+      }
+
+      if (!nonFinalCandidates.isEmpty()) {
+        createFilter((Operator) nd, nonFinalCandidates, owi);
+      }
+      return null;
+    }
+
+    private boolean canPredPushdown(ExprNodeDesc expr, List<ExprNodeDesc> groupByKeys,
+        FastBitSet[] bitSets, int groupingSetPosition) {
+      List<ExprNodeDesc> columns = new ArrayList<ExprNodeDesc>();
+      extractCols(expr, columns);
+      for (ExprNodeDesc col : columns) {
+        int index = groupByKeys.indexOf(col);
+        assert index >= 0;
+        for (FastBitSet bitset : bitSets) {
+          int keyPos =

[jira] [Work logged] (HIVE-19653) Incorrect predicate pushdown for groupby with grouping sets

2020-06-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-19653?focusedWorklogId=442887&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-442887
 ]

ASF GitHub Bot logged work on HIVE-19653:
-

Author: ASF GitHub Bot
Created on: 09/Jun/20 16:09
Start Date: 09/Jun/20 16:09
Worklog Time Spent: 10m 
  Work Description: jcamachor commented on a change in pull request #1070:
URL: https://github.com/apache/hive/pull/1070#discussion_r437021422



##
File path: ql/src/java/org/apache/hadoop/hive/ql/ppd/OpProcFactory.java
##
@@ -733,6 +736,97 @@ private void applyFilterTransitivity(JoinOperator join, int targetPos, OpWalkerI
     }
   }
 
+  public static class GroupByPPD extends DefaultPPD implements SemanticNodeProcessor {
+
+    @Override
+    public Object process(Node nd, Stack<Node> stack, NodeProcessorCtx procCtx,
+        Object... nodeOutputs) throws SemanticException {
+      super.process(nd, stack, procCtx, nodeOutputs);
+      OpWalkerInfo owi = (OpWalkerInfo) procCtx;
+      GroupByDesc groupByDesc = ((GroupByOperator)nd).getConf();
+      ExprWalkerInfo prunedPred = owi.getPrunedPreds((Operator) nd);
+      if (prunedPred == null || !prunedPred.hasAnyCandidates() ||
+          !groupByDesc.isGroupingSetsPresent()) {
+        return null;
+      }
+
+      List<Long> groupingSets = groupByDesc.getListGroupingSets();
+      Map<String, List<ExprNodeDesc>> candidates = prunedPred.getFinalCandidates();
+      FastBitSet[] fastBitSets = new FastBitSet[groupingSets.size()];
+      int groupingSetPosition = groupByDesc.getGroupingSetPosition();
+      for (int pos = 0; pos < fastBitSets.length; pos ++) {
+        fastBitSets[pos] = GroupByOperator.groupingSet2BitSet(groupingSets.get(pos),
+            groupingSetPosition);
+      }
+      List<ExprNodeDesc> groupByKeys = ((GroupByOperator)nd).getConf().getKeys();
+      Map<ExprNodeDesc, ExprNodeDesc> newToOldExprMap = prunedPred.getNewToOldExprMap();
+      Map<String, List<ExprNodeDesc>> nonFinalCandidates = new HashMap<String, List<ExprNodeDesc>>();
+      for (Iterator<Map.Entry<String, List<ExprNodeDesc>>>

Review comment:
   Please use a `while`.
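
For readers unfamiliar with the idiom the reviewer asks for, here is a minimal standalone sketch of iterating with an explicit Iterator in a `while` loop while removing entries. It is an illustration under stated assumptions, not the code that was merged: the class WhileIteratorSketch and the method splitOutOdds are hypothetical, and integers stand in for Hive's expression nodes.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Hypothetical example; integers stand in for predicate expressions.
class WhileIteratorSketch {

    // Keeps even values in each list (in place), drops entries whose list
    // becomes empty, and returns the removed odd values grouped by key.
    static Map<String, List<Integer>> splitOutOdds(Map<String, List<Integer>> candidates) {
        Map<String, List<Integer>> residual = new HashMap<>();
        Iterator<Map.Entry<String, List<Integer>>> iter = candidates.entrySet().iterator();
        while (iter.hasNext()) {
            Map.Entry<String, List<Integer>> entry = iter.next();
            List<Integer> kept = new ArrayList<>();
            List<Integer> removed = new ArrayList<>();
            for (Integer value : entry.getValue()) {
                if (value % 2 == 0) {
                    kept.add(value);
                } else {
                    removed.add(value);
                }
            }
            if (!removed.isEmpty()) {
                residual.put(entry.getKey(), removed);
            }
            if (kept.isEmpty()) {
                // Removing through the iterator avoids ConcurrentModificationException.
                iter.remove();
            } else {
                entry.getValue().clear();
                entry.getValue().addAll(kept);
            }
        }
        return residual;
    }
}
```

Declaring the iterator before the loop keeps the long generic type on its own line and leaves only the `hasNext()` condition in the loop header.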

##
File path: ql/src/java/org/apache/hadoop/hive/ql/ppd/OpProcFactory.java
##
@@ -733,6 +736,97 @@ private void applyFilterTransitivity(JoinOperator join, int targetPos, OpWalkerI
     }
   }
 
+  public static class GroupByPPD extends DefaultPPD implements SemanticNodeProcessor {
+
+    @Override
+    public Object process(Node nd, Stack<Node> stack, NodeProcessorCtx procCtx,
+        Object... nodeOutputs) throws SemanticException {
+      super.process(nd, stack, procCtx, nodeOutputs);
+      OpWalkerInfo owi = (OpWalkerInfo) procCtx;
+      GroupByDesc groupByDesc = ((GroupByOperator)nd).getConf();
+      ExprWalkerInfo prunedPred = owi.getPrunedPreds((Operator) nd);
+      if (prunedPred == null || !prunedPred.hasAnyCandidates() ||
+          !groupByDesc.isGroupingSetsPresent()) {
+        return null;
+      }
+
+      List<Long> groupingSets = groupByDesc.getListGroupingSets();
+      Map<String, List<ExprNodeDesc>> candidates = prunedPred.getFinalCandidates();
+      FastBitSet[] fastBitSets = new FastBitSet[groupingSets.size()];
+      int groupingSetPosition = groupByDesc.getGroupingSetPosition();
+      for (int pos = 0; pos < fastBitSets.length; pos ++) {
+        fastBitSets[pos] = GroupByOperator.groupingSet2BitSet(groupingSets.get(pos),
+            groupingSetPosition);
+      }
+      List<ExprNodeDesc> groupByKeys = ((GroupByOperator)nd).getConf().getKeys();
+      Map<ExprNodeDesc, ExprNodeDesc> newToOldExprMap = prunedPred.getNewToOldExprMap();
+      Map<String, List<ExprNodeDesc>> nonFinalCandidates = new HashMap<String, List<ExprNodeDesc>>();
+      for (Iterator<Map.Entry<String, List<ExprNodeDesc>>>
+           iter = candidates.entrySet().iterator(); iter.hasNext(); ) {
+        Map.Entry<String, List<ExprNodeDesc>> entry = iter.next();
+        List<ExprNodeDesc> residualExprs = new ArrayList<ExprNodeDesc>();
+        List<ExprNodeDesc> finalCandidates = new ArrayList<ExprNodeDesc>();
+        List<ExprNodeDesc> exprs = entry.getValue();
+        for (ExprNodeDesc expr : exprs) {
+          if (canPredPushdown(expr, groupByKeys, fastBitSets, groupingSetPosition)) {
+            finalCandidates.add(expr);
+          } else {
+            residualExprs.add(newToOldExprMap.get(expr));
+          }
+        }
+        if (!residualExprs.isEmpty()) {
+          nonFinalCandidates.put(entry.getKey(), residualExprs);
+        }
+
+        if (finalCandidates.isEmpty()) {
+          iter.remove();
+        } else {
+          exprs.clear();
+          exprs.addAll(finalCandidates);
+        }
+      }
+
+      if (!nonFinalCandidates.isEmpty()) {
+        createFilter((Operator) nd, nonFinalCandidates, owi);
+      }
+      return null;
+    }
+
+    private boolean canPredPushdown(ExprNodeDesc expr, List<ExprNodeDesc> groupByKeys,
+        FastBitSet[] bitSets, int groupingSetPosition) {
+      List<ExprNodeDesc> columns = new ArrayList<ExprNodeDesc>();
+      extractCols(expr, columns);
+      for (ExprNodeDesc col : columns) {
+        int index = groupByKeys.indexOf(col);
+        assert index >= 0;
+        for (FastBitSet bitset : bitSets) {
+          int keyPos = bitset.nextClearBit(0);
+   

[jira] [Work logged] (HIVE-19653) Incorrect predicate pushdown for groupby with grouping sets

2020-06-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-19653?focusedWorklogId=442428&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-442428
 ]

ASF GitHub Bot logged work on HIVE-19653:
-

Author: ASF GitHub Bot
Created on: 07/Jun/20 13:00
Start Date: 07/Jun/20 13:00
Worklog Time Spent: 10m 
  Work Description: dengzhhu653 removed a comment on pull request #1070:
URL: https://github.com/apache/hive/pull/1070#issuecomment-640215254


   The two failed tests seem to be unrelated to the changes...





Issue Time Tracking
---

Worklog Id: (was: 442428)
Time Spent: 1h  (was: 50m)

> Incorrect predicate pushdown for groupby with grouping sets
> ---
>
> Key: HIVE-19653
> URL: https://issues.apache.org/jira/browse/HIVE-19653
> Project: Hive
>  Issue Type: Bug
>  Components: Logical Optimizer
>Reporter: Zhang Li
>Assignee: Zhihua Deng
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-19653.1.patch, HIVE-19653.patch
>
>  Time Spent: 1h
>  Remaining Estimate: 0h


[jira] [Work logged] (HIVE-19653) Incorrect predicate pushdown for groupby with grouping sets

2020-06-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-19653?focusedWorklogId=442427&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-442427
 ]

ASF GitHub Bot logged work on HIVE-19653:
-

Author: ASF GitHub Bot
Created on: 07/Jun/20 12:59
Start Date: 07/Jun/20 12:59
Worklog Time Spent: 10m 
  Work Description: dengzhhu653 commented on pull request #1070:
URL: https://github.com/apache/hive/pull/1070#issuecomment-640215254


   The two failed tests seem to be unrelated to the changes...





Issue Time Tracking
---

Worklog Id: (was: 442427)
Time Spent: 50m  (was: 40m)

> Incorrect predicate pushdown for groupby with grouping sets
> ---
>
> Key: HIVE-19653
> URL: https://issues.apache.org/jira/browse/HIVE-19653
> Project: Hive
>  Issue Type: Bug
>  Components: Logical Optimizer
>Reporter: Zhang Li
>Assignee: Zhihua Deng
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-19653.1.patch, HIVE-19653.patch
>
>  Time Spent: 50m
>  Remaining Estimate: 0h


[jira] [Work logged] (HIVE-19653) Incorrect predicate pushdown for groupby with grouping sets

2020-06-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-19653?focusedWorklogId=442426&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-442426
 ]

ASF GitHub Bot logged work on HIVE-19653:
-

Author: ASF GitHub Bot
Created on: 07/Jun/20 12:55
Start Date: 07/Jun/20 12:55
Worklog Time Spent: 10m 
  Work Description: dengzhhu653 commented on pull request #1070:
URL: https://github.com/apache/hive/pull/1070#issuecomment-640214775


   The two failed tests seem to be unrelated to the changes...





Issue Time Tracking
---

Worklog Id: (was: 442426)
Time Spent: 40m  (was: 0.5h)

> Incorrect predicate pushdown for groupby with grouping sets
> ---
>
> Key: HIVE-19653
> URL: https://issues.apache.org/jira/browse/HIVE-19653
> Project: Hive
>  Issue Type: Bug
>  Components: Logical Optimizer
>Reporter: Zhang Li
>Assignee: Zhihua Deng
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-19653.1.patch, HIVE-19653.patch
>
>  Time Spent: 40m
>  Remaining Estimate: 0h


[jira] [Work logged] (HIVE-19653) Incorrect predicate pushdown for groupby with grouping sets

2020-06-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-19653?focusedWorklogId=442401&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-442401
 ]

ASF GitHub Bot logged work on HIVE-19653:
-

Author: ASF GitHub Bot
Created on: 07/Jun/20 09:30
Start Date: 07/Jun/20 09:30
Worklog Time Spent: 10m 
  Work Description: dengzhhu653 opened a new pull request #1070:
URL: https://github.com/apache/hive/pull/1070


   





Issue Time Tracking
---

Worklog Id: (was: 442401)
Time Spent: 0.5h  (was: 20m)

> Incorrect predicate pushdown for groupby with grouping sets
> ---
>
> Key: HIVE-19653
> URL: https://issues.apache.org/jira/browse/HIVE-19653
> Project: Hive
>  Issue Type: Bug
>  Components: Logical Optimizer
>Reporter: Zhang Li
>Assignee: Zhang Li
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-19653.1.patch, HIVE-19653.patch
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h


[jira] [Work logged] (HIVE-19653) Incorrect predicate pushdown for groupby with grouping sets

2019-10-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-19653?focusedWorklogId=331401&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-331401
 ]

ASF GitHub Bot logged work on HIVE-19653:
-

Author: ASF GitHub Bot
Created on: 21/Oct/19 13:00
Start Date: 21/Oct/19 13:00
Worklog Time Spent: 10m 
  Work Description: richox commented on issue #354: HIVE-19653: Incorrect 
predicate pushdown for groupby with grouping sets
URL: https://github.com/apache/hive/pull/354#issuecomment-544503441
 
 
   > Now I'm facing this problem and I wonder why this pull request is still unmerged.
   
   I'm not interested in Hive any more... Maybe you can try setting hive.cbo.enable
to true and using the new CBO optimizer; this bug won't happen with CBO.
 



Issue Time Tracking
---

Worklog Id: (was: 331401)
Time Spent: 20m  (was: 10m)

> Incorrect predicate pushdown for groupby with grouping sets
> ---
>
> Key: HIVE-19653
> URL: https://issues.apache.org/jira/browse/HIVE-19653
> Project: Hive
>  Issue Type: Bug
>  Components: Logical Optimizer
>Reporter: Zhang Li
>Assignee: Zhang Li
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-19653.1.patch, HIVE-19653.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h


[jira] [Work logged] (HIVE-19653) Incorrect predicate pushdown for groupby with grouping sets

2019-10-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-19653?focusedWorklogId=331291&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-331291
 ]

ASF GitHub Bot logged work on HIVE-19653:
-

Author: ASF GitHub Bot
Created on: 21/Oct/19 09:00
Start Date: 21/Oct/19 09:00
Worklog Time Spent: 10m 
  Work Description: fan624009652 commented on issue #354: HIVE-19653: 
Incorrect predicate pushdown for groupby with grouping sets
URL: https://github.com/apache/hive/pull/354#issuecomment-544419803
 
 
   Now I'm facing this problem and I wonder why this pull request is still 
unmerged.
 



Issue Time Tracking
---

Worklog Id: (was: 331291)
Remaining Estimate: 0h
Time Spent: 10m

> Incorrect predicate pushdown for groupby with grouping sets
> ---
>
> Key: HIVE-19653
> URL: https://issues.apache.org/jira/browse/HIVE-19653
> Project: Hive
>  Issue Type: Bug
>  Components: Logical Optimizer
>Reporter: Zhang Li
>Assignee: Zhang Li
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-19653.1.patch, HIVE-19653.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h