[jira] [Commented] (HIVE-11405) Add early termination for recursion in StatsRulesProcFactory$FilterStatsRule.evaluateExpression for OR expression
[ https://issues.apache.org/jira/browse/HIVE-11405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14658737#comment-14658737 ]

Gunther Hagleitner commented on HIVE-11405:
---

+1

Add early termination for recursion in StatsRulesProcFactory$FilterStatsRule.evaluateExpression for OR expression
--

Key: HIVE-11405
URL: https://issues.apache.org/jira/browse/HIVE-11405
Project: Hive
Issue Type: Bug
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Prasanth Jayachandran
Attachments: HIVE-11405.1.patch, HIVE-11405.2.patch, HIVE-11405.2.patch, HIVE-11405.2.patch, HIVE-11405.2.patch, HIVE-11405.patch

Thanks to [~gopalv] for uncovering this issue as part of HIVE-11330. Quoting him:

The recursion protection works well with an AND expr, but it doesn't work against
(OR a=1 (OR a=2 (OR a=3 (OR ...)
since the rows will never be reduced during recursion due to the nature of the OR.
We need to execute a short-circuit to satisfy the OR properly - no case which matches a=1 qualifies for the rest of the filters.
Recursion should pass in the numRows - branch1Rows for the branch-2.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
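For readers following the thread, the idea in the description (give branch-2 only numRows - branch1Rows) can be sketched in isolation. This is a minimal illustration, not Hive code: the class name, both method names, and the use of a fixed per-branch selectivity are assumptions made for the example, and the real evaluateExpression works on Statistics objects rather than plain numbers.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch only; not the StatsRulesProcFactory implementation. */
final class OrRowEstimateSketch {

  /**
   * Estimates rows matching an OR chain. Each branch is evaluated only
   * against the rows that no earlier branch has already matched, so the
   * running total can never exceed numRows.
   */
  static long estimateOrRows(long numRows, List<Double> selectivities) {
    long remaining = numRows; // rows still unmatched by earlier branches
    long matched = 0;
    for (double selectivity : selectivities) {
      if (remaining <= 0) {
        break; // early termination: nothing left for later branches to match
      }
      long branchRows = Math.min(remaining, Math.round(remaining * selectivity));
      matched += branchRows;
      remaining -= branchRows; // the next branch sees numRows - branchRows
    }
    return matched;
  }

  /** Independent per-branch estimate, summed; can exceed numRows. */
  static long naiveOrRows(long numRows, List<Double> selectivities) {
    long total = 0;
    for (double selectivity : selectivities) {
      total += Math.round(numRows * selectivity);
    }
    return total;
  }

  public static void main(String[] args) {
    List<Double> branches = Arrays.asList(0.5, 0.5, 0.5);
    System.out.println("short-circuit: " + estimateOrRows(1000L, branches));
    System.out.println("naive sum:     " + naiveOrRows(1000L, branches));
  }
}
```

With 1000 input rows and three branches of selectivity 0.5, the short-circuit version yields 875 while the naive sum yields 1500, which is more rows than the input has.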
[jira] [Commented] (HIVE-11405) Add early termination for recursion in StatsRulesProcFactory$FilterStatsRule.evaluateExpression for OR expression
[ https://issues.apache.org/jira/browse/HIVE-11405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14654006#comment-14654006 ]

Hive QA commented on HIVE-11405:

{color:red}Overall{color}: -1 no tests executed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12748583/HIVE-11405.2.patch

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4819/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4819/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4819/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Tests exited with: ExecutionException: java.util.concurrent.ExecutionException: java.io.IOException: Could not create /data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-4819/succeeded/TestParseNegative
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12748583 - PreCommit-HIVE-TRUNK-Build
[jira] [Commented] (HIVE-11405) Add early termination for recursion in StatsRulesProcFactory$FilterStatsRule.evaluateExpression for OR expression
[ https://issues.apache.org/jira/browse/HIVE-11405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14651464#comment-14651464 ]

Hive QA commented on HIVE-11405:

{color:red}Overall{color}: -1 no tests executed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12748383/HIVE-11405.2.patch

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4795/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4795/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4795/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Tests exited with: ExecutionException: java.util.concurrent.ExecutionException: java.io.IOException: Could not create /data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-4795/succeeded/TestFolderPermissions
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12748383 - PreCommit-HIVE-TRUNK-Build
[jira] [Commented] (HIVE-11405) Add early termination for recursion in StatsRulesProcFactory$FilterStatsRule.evaluateExpression for OR expression
[ https://issues.apache.org/jira/browse/HIVE-11405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14649845#comment-14649845 ]

Hive QA commented on HIVE-11405:

{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12748211/HIVE-11405.1.patch

{color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 9279 tests executed

*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_join2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_join3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorization_17
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorization_17
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_ppd_join2
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_ppd_join3
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_vectorization_17
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4778/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4778/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4778/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 7 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12748211 - PreCommit-HIVE-TRUNK-Build
[jira] [Commented] (HIVE-11405) Add early termination for recursion in StatsRulesProcFactory$FilterStatsRule.evaluateExpression for OR expression
[ https://issues.apache.org/jira/browse/HIVE-11405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14648958#comment-14648958 ]

Hive QA commented on HIVE-11405:

{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12748096/HIVE-11405.patch

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 9277 tests executed

*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_null_projection
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorization_7
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorization_7
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4771/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4771/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4771/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12748096 - PreCommit-HIVE-TRUNK-Build
[jira] [Commented] (HIVE-11405) Add early termination for recursion in StatsRulesProcFactory$FilterStatsRule.evaluateExpression for OR expression
[ https://issues.apache.org/jira/browse/HIVE-11405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14648081#comment-14648081 ]

Prasanth Jayachandran commented on HIVE-11405:
--

[~gopalv] are the column stats available for this query? If not, your patch will terminate early because the data size becomes 0 and the AND evaluation terminates early. Also, I am not sure whether this assumption is correct:
{code}
final long branch2Rows = (newNumRows <= branchRows) ? 0 : (newNumRows - branchRows);
{code}
I am still evaluating this change. The idea of mirroring the tree and passing the branchRows to the sibling branch looks good so far.
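To make the questioned expression concrete, here is a small standalone sketch of the clamped subtraction. It assumes the condition is read as newNumRows <= branchRows, and the class and method names are made up for the illustration. It shows only that the remainder handed to the sibling branch cannot go negative; it does not settle whether the assumption is right for Hive's statistics.

```java
/** Hypothetical sketch only; names are not from the Hive code base. */
final class ClampedRemainderSketch {

  /**
   * Rows left over for the sibling branch after one branch has claimed
   * branchRows. Assumes both arguments are non-negative row counts.
   */
  static long remainingRows(long numRows, long branchRows) {
    // Clamp at zero: a branch estimate may exceed the current row count.
    return (numRows <= branchRows) ? 0 : (numRows - branchRows);
  }

  public static void main(String[] args) {
    System.out.println(remainingRows(100L, 30L));
    System.out.println(remainingRows(100L, 150L));
  }
}
```

The clamp matters when a branch estimate is at least as large as the current row count; in that case the sibling branch is evaluated against zero rows.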
[jira] [Commented] (HIVE-11405) Add early termination for recursion in StatsRulesProcFactory$FilterStatsRule.evaluateExpression for OR expression
[ https://issues.apache.org/jira/browse/HIVE-11405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14646863#comment-14646863 ]

Gopal V commented on HIVE-11405:

Thanks [~hsubramaniyan], I'm currently bypassing that with a temporary band-aid which needs attention for correctness:

{code}
--- a/ql/src/java/org/apache/hadoop/hive/ql/optimizer/stats/annotation/StatsRulesProcFactory.java
+++ b/ql/src/java/org/apache/hadoop/hive/ql/optimizer/stats/annotation/StatsRulesProcFactory.java
@@ -325,9 +325,16 @@ private long evaluateExpression(Statistics stats, ExprNodeDesc pred,
             }
           } else if (udf instanceof GenericUDFOPOr) {
             // for OR condition independently compute and update stats
-            for (ExprNodeDesc child : genFunc.getChildren()) {
-              newNumRows = StatsUtils.safeAdd(
-                  evaluateChildExpr(stats, child, aspCtx, neededCols, fop), newNumRows);
+            newNumRows = stats.getNumRows();
+            Statistics orStats = stats.clone();
+            int k = 0;
+
+            for (ExprNodeDesc child : com.google.common.collect.Lists.reverse(genFunc.getChildren())) {
+              final long branchRows = evaluateChildExpr(orStats, child, aspCtx, neededCols, fop);
+              final long branch2Rows = (newNumRows <= branchRows) ? 0 : (newNumRows - branchRows);
+              updateStats(orStats, branch2Rows, true, fop);
+              newNumRows = StatsUtils.safeAdd(branchRows, newNumRows);
             }
           } else if (udf instanceof GenericUDFOPNot) {
             newNumRows = evaluateNotExpr(stats, pred, aspCtx, neededCols, fop);
{code}
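The patch accumulates row counts through StatsUtils.safeAdd. As a rough illustration of what an overflow-safe add of that kind might look like, the sketch below saturates at Long.MAX_VALUE instead of wrapping around. The class and method names are hypothetical, and treating Hive's helper as behaving this way is an assumption, not something this thread confirms.

```java
/** Hypothetical sketch only; stands in for an overflow-safe row-count add. */
final class SaturatingAddSketch {

  /**
   * Adds two row counts without wrapping on overflow. Row counts are
   * assumed non-negative, so any overflow is treated as "too large".
   */
  static long saturatingAdd(long a, long b) {
    try {
      return Math.addExact(a, b); // throws ArithmeticException on overflow
    } catch (ArithmeticException overflow) {
      return Long.MAX_VALUE; // saturate rather than produce a negative count
    }
  }

  public static void main(String[] args) {
    System.out.println(saturatingAdd(40L, 2L));
    System.out.println(saturatingAdd(Long.MAX_VALUE, 1L));
  }
}
```

Saturating keeps a very large estimate large; a plain long addition would wrap to a negative number and could make an oversized OR look tiny to later planning stages.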