[jira] [Commented] (HIVE-24638) Redundant filter in scalar subquery
[ https://issues.apache.org/jira/browse/HIVE-24638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17265609#comment-17265609 ]

Jesus Camacho Rodriguez commented on HIVE-24638:

{quote}
Other idea was to replace this filter with project
{quote}
Yes, I discussed this with [~mustafaiman], but the problem would be that the trimmer (or some rules) may remove that project column if it is not referenced in subsequent operators in the plan.

> Redundant filter in scalar subquery
>
>                 Key: HIVE-24638
>                 URL: https://issues.apache.org/jira/browse/HIVE-24638
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Mustafa İman
>            Assignee: Vineet Garg
>            Priority: Major
>
> Look at the query and CBO plan in https://issues.apache.org/jira/browse/HIVE-24595 .
> Note that there is a filter to guarantee that the subquery returns only one row:
> "HiveFilter(condition=[<=(sq_count_check($0), 1)])" . This condition is
> redundant, as either sq_count_check fails at runtime or the condition is true for
> all rows.
> Look at the stacktrace
> {code:java}
>   at org.apache.hadoop.hive.ql.udf.generic.GenericUDFSQCountCheck.evaluate(GenericUDFSQCountCheck.java:70)
>   at org.apache.hadoop.hive.ql.udf.generic.GenericUDFSQCountCheck.evaluate(GenericUDFSQCountCheck.java:70)
>   at org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator._evaluate(ExprNodeGenericFuncEvaluator.java:197)
>   at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:80)
>   at org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator$DeferredExprObject.get(ExprNodeGenericFuncEvaluator.java:88)
>   at org.apache.hadoop.hive.ql.udf.generic.GenericUDFOPEqualOrLessThan.evaluate(GenericUDFOPEqualOrLessThan.java:111)
>   at org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator._evaluate(ExprNodeGenericFuncEvaluator.java:197)
>   at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:80)
>   at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:68)
>   at org.apache.hadoop.hive.ql.exec.FilterOperator.process(FilterOperator.java:113)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:888)
>   at org.apache.hadoop.hive.ql.exec.GroupByOperator.forward(GroupByOperator.java:1004)
>   at org.apache.hadoop.hive.ql.exec.GroupByOperator.flush(GroupByOperator.java:1028)
> {code}
> GenericUDFOPEqualOrLessThan is redundant here, as GenericUDFSQCountCheck does
> the same check.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
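The redundancy argument in the issue description can be illustrated with a small standalone model. This is only a sketch under stated assumptions: the class `SqCountCheckSketch`, its methods, and the exception type and message are hypothetical and are not taken from the Hive source. It assumes, as the description says, that the check fails at runtime when the subquery yields more than one row and otherwise passes the count through.

```java
// Simplified stand-in for the behaviour described in this issue.
// NOT the Hive implementation; all names here are hypothetical.
class SqCountCheckSketch {

    // Assumed behaviour: fail when the row count exceeds 1,
    // otherwise return the count unchanged.
    static long sqCountCheck(long rowCount) {
        if (rowCount > 1) {
            throw new IllegalStateException("subquery returned more than one row");
        }
        return rowCount;
    }

    // Mirrors the filter condition <=(sq_count_check($0), 1).
    static boolean filterCondition(long rowCount) {
        return sqCountCheck(rowCount) <= 1;
    }

    public static void main(String[] args) {
        for (long count : new long[] {0, 1, 2}) {
            try {
                System.out.println(count + " -> " + filterCondition(count));
            } catch (IllegalStateException e) {
                System.out.println(count + " -> failed: " + e.getMessage());
            }
        }
    }
}
```

In this model the comparison never evaluates to false: whenever `sqCountCheck` returns at all, its result is already at most 1, so the `<=` adds a second evaluation per row without filtering anything.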
[jira] [Commented] (HIVE-24638) Redundant filter in scalar subquery
[ https://issues.apache.org/jira/browse/HIVE-24638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17265608#comment-17265608 ]

Vineet Garg commented on HIVE-24638:

[~jcamachorodriguez] Other idea was to replace this filter with project, but yes, just changing the UDF to return a boolean and getting rid of the filter condition should be straightforward.
[jira] [Commented] (HIVE-24638) Redundant filter in scalar subquery
[ https://issues.apache.org/jira/browse/HIVE-24638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17265606#comment-17265606 ]

Jesus Camacho Rodriguez commented on HIVE-24638:

[~vgarg], thoughts? This should not be too difficult: {{sq_count_check}} would return a boolean rather than the value itself? Apparently this has an impact on vectorization ([~mustafaiman] can add further details).

Cc [~scarlin]
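The boolean-returning variant suggested in this comment could look roughly like the following. This is a hedged sketch, not the actual change: `SqCountCheckBooleanSketch`, its method, and the exception type and message are hypothetical names introduced for illustration, and the real UDF works on Hive object inspectors rather than a plain `long`.

```java
// Hypothetical model of sq_count_check returning a boolean.
// NOT the Hive implementation; names are illustrative only.
class SqCountCheckBooleanSketch {

    // Assumed behaviour: fail when the row count exceeds 1,
    // otherwise return true so the result can act as the predicate itself.
    static boolean sqCountCheck(long rowCount) {
        if (rowCount > 1) {
            throw new IllegalStateException("subquery returned more than one row");
        }
        return true;
    }

    public static void main(String[] args) {
        // The wrapping <= comparison is no longer needed:
        // the check's own result is the filter decision.
        for (long count : new long[] {0, 1}) {
            System.out.println(count + " -> " + sqCountCheck(count));
        }
    }
}
```

With this shape the predicate is the function call alone, so the extra comparison seen in the stack trace (GenericUDFOPEqualOrLessThan wrapping GenericUDFSQCountCheck) would no longer be evaluated.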