[jira] [Commented] (HIVE-22238) PK/FK selectivity estimation underscales estimations
[ https://issues.apache.org/jira/browse/HIVE-22238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16969923#comment-16969923 ]

Hive QA commented on HIVE-22238:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12985234/HIVE-22238.10.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/19344/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/19344/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-19344/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Tests exited with: Exception: Patch URL https://issues.apache.org/jira/secure/attachment/12985234/HIVE-22238.10.patch was found in seen patch url's cache and a test was probably run already on it. Aborting...
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12985234 - PreCommit-HIVE-Build

> PK/FK selectivity estimation underscales estimations
>
> Key: HIVE-22238
> URL: https://issues.apache.org/jira/browse/HIVE-22238
> Project: Hive
> Issue Type: Bug
> Components: Statistics
> Reporter: Zoltan Haindrich
> Assignee: Zoltan Haindrich
> Priority: Major
> Labels: pull-request-available
> Attachments: HIVE-22238.01.patch, HIVE-22238.02.patch,
> HIVE-22238.03.patch, HIVE-22238.04.patch, HIVE-22238.05.patch,
> HIVE-22238.05.patch, HIVE-22238.05.patch, HIVE-22238.05.patch,
> HIVE-22238.05.patch, HIVE-22238.06.patch, HIVE-22238.06.patch,
> HIVE-22238.07.patch, HIVE-22238.09.patch, HIVE-22238.10.patch,
> HIVE-22238.10.patch, HIVE-22238.10.patch, HIVE-22238.10.patch
>
> Time Spent: 40m
> Remaining Estimate: 0h
>
> At [this point|https://github.com/apache/hive/blob/5098d155a1e6a164253f5fa98755273bc34085df/ql/src/java/org/apache/hadoop/hive/ql/optimizer/stats/annotation/StatsRulesProcFactory.java#L2182]
> the parent operator's rownum is scaled according to pkfkselectivity;
> however, [pkfkselectivity is computed|https://github.com/apache/hive/blob/5098d155a1e6a164253f5fa98755273bc34085df/ql/src/java/org/apache/hadoop/hive/ql/optimizer/stats/annotation/StatsRulesProcFactory.java#L2157]
> on a whole subtree.
> Scaling it by that amount counts in again the estimation already used when
> parentstats was calculated, so depending on the number of upstream joins
> this may lead to severe underestimations.
> What happened was:
> * optimization was able to push the filter to the other side of the join
> * as a result the incoming data was already filtered
> * scaling down by the PK selectivity was actually already there, but a new
> "scaling" happened

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
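The double scaling described in the issue can be illustrated with a toy calculation. This is only a sketch under stated assumptions: the row counts, the selectivity value, and the helper names `scale` and `repeated_scaling` are hypothetical and are not taken from Hive's `StatsRulesProcFactory`; the model reduces the "whole subtree" selectivity to a single factor.

```python
# Toy model of the double scaling described in HIVE-22238.
# All numbers and function names are hypothetical, not taken from Hive.

def scale(rows: float, selectivity: float) -> float:
    """Scale a row-count estimate by a selectivity in [0, 1]."""
    return rows * selectivity


fk_rows = 1_000_000      # assumed row count on the FK side of the join
pk_selectivity = 0.25    # assumed fraction of PK rows surviving the filter

# The filter was pushed to the other side of the join, so the parent's
# statistics already reflect the PK selectivity once.
parent_rows = scale(fk_rows, pk_selectivity)

# Scaling the parent's row count by the same selectivity again
# counts the filter twice.
double_scaled = scale(parent_rows, pk_selectivity)


def repeated_scaling(rows: float, selectivity: float, n_joins: int) -> float:
    """Apply the same selectivity once per upstream join (the compounding case)."""
    estimate = rows
    for _ in range(n_joins):
        estimate = scale(estimate, selectivity)
    return estimate


print(parent_rows, double_scaled, repeated_scaling(fk_rows, pk_selectivity, 3))
```

In this simplified model every additional upstream join that re-applies the factor shrinks the estimate by another multiple of the selectivity, which matches the issue's remark that the underestimation depends on the number of upstream joins.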
[jira] [Commented] (HIVE-22238) PK/FK selectivity estimation underscales estimations
[ https://issues.apache.org/jira/browse/HIVE-22238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16969799#comment-16969799 ]

Hive QA commented on HIVE-22238:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12985234/HIVE-22238.10.patch

{color:green}SUCCESS:{color} +1 due to 4 test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 17667 tests passed

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/19340/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/19340/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-19340/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12985234 - PreCommit-HIVE-Build
[jira] [Commented] (HIVE-22238) PK/FK selectivity estimation underscales estimations
[ https://issues.apache.org/jira/browse/HIVE-22238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16969792#comment-16969792 ]

Hive QA commented on HIVE-22238:

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 43s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 20s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 30s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 2m 38s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 4m 11s{color} | {color:blue} ql in master has 1548 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 7m 37s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 25s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 7m 35s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 41s{color} | {color:red} ql: The patch generated 3 new + 103 unchanged - 0 fixed = 106 total (was 103) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 2m 5s{color} | {color:red} root: The patch generated 3 new + 103 unchanged - 0 fixed = 106 total (was 103) {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s{color} | {color:red} The patch has 6 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 4m 24s{color} | {color:red} ql generated 1 new + 1548 unchanged - 0 fixed = 1549 total (was 1548) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 7m 40s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 13s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 64m 14s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:ql |
| | Unread field:AccurateEstimatesCheckerHook.java:[line 61] |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests | asflicense javac javadoc findbugs checkstyle compile |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-19340/dev-support/hive-personality.sh |
| git revision | master / 358e5a9 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.1 |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-19340/yetus/diff-checkstyle-ql.txt |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-19340/yetus/diff-checkstyle-root.txt |
| whitespace | http://104.198.109.242/logs//PreCommit-HIVE-Build-19340/yetus/whitespace-eol.txt |
| findbugs | http://104.198.109.242/logs//PreCommit-HIVE-Build-19340/yetus/new-findbugs-ql.html |
| modules | C: ql . itests U: . |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-19340/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |

This message was automatically generated.
[jira] [Commented] (HIVE-22238) PK/FK selectivity estimation underscales estimations
[ https://issues.apache.org/jira/browse/HIVE-22238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16969397#comment-16969397 ]

Hive QA commented on HIVE-22238:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12985225/HIVE-22238.10.patch

{color:green}SUCCESS:{color} +1 due to 4 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 17667 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.metastore.TestPartitionManagement.testPartitionDiscoveryTransactionalTable (batchId=224)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/19332/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/19332/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-19332/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12985225 - PreCommit-HIVE-Build
[jira] [Commented] (HIVE-22238) PK/FK selectivity estimation underscales estimations
[ https://issues.apache.org/jira/browse/HIVE-22238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16969390#comment-16969390 ]

Hive QA commented on HIVE-22238:

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 49s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 42s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 48s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 2m 47s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 4m 20s{color} | {color:blue} ql in master has 1548 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 7m 56s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 27s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 7m 57s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 45s{color} | {color:red} ql: The patch generated 3 new + 103 unchanged - 0 fixed = 106 total (was 103) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 2m 9s{color} | {color:red} root: The patch generated 3 new + 103 unchanged - 0 fixed = 106 total (was 103) {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s{color} | {color:red} The patch has 6 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 4m 31s{color} | {color:red} ql generated 1 new + 1548 unchanged - 0 fixed = 1549 total (was 1548) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 8m 5s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 15s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 67m 4s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:ql |
| | Unread field:AccurateEstimatesCheckerHook.java:[line 61] |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests | asflicense javac javadoc findbugs checkstyle compile |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-19332/dev-support/hive-personality.sh |
| git revision | master / 358e5a9 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.1 |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-19332/yetus/diff-checkstyle-ql.txt |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-19332/yetus/diff-checkstyle-root.txt |
| whitespace | http://104.198.109.242/logs//PreCommit-HIVE-Build-19332/yetus/whitespace-eol.txt |
| findbugs | http://104.198.109.242/logs//PreCommit-HIVE-Build-19332/yetus/new-findbugs-ql.html |
| modules | C: ql . itests U: . |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-19332/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |

This message was automatically generated.
[jira] [Commented] (HIVE-22238) PK/FK selectivity estimation underscales estimations
[ https://issues.apache.org/jira/browse/HIVE-22238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16969082#comment-16969082 ]

Hive QA commented on HIVE-22238:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12985116/HIVE-22238.10.patch

{color:green}SUCCESS:{color} +1 due to 4 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 17573 tests executed
*Failed tests:*
{noformat}
org.apache.hive.jdbc.TestTriggersTezSessionPoolManager.testTriggerCustomNonExistent (batchId=285)
org.apache.hive.jdbc.TestTriggersTezSessionPoolManager.testTriggerHighBytesRead (batchId=285)
org.apache.hive.jdbc.TestTriggersTezSessionPoolManager.testTriggerSlowQueryElapsedTime (batchId=285)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/19327/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/19327/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-19327/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12985116 - PreCommit-HIVE-Build
[jira] [Commented] (HIVE-22238) PK/FK selectivity estimation underscales estimations
[ https://issues.apache.org/jira/browse/HIVE-22238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16969071#comment-16969071 ]

Hive QA commented on HIVE-22238:

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 47s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 11s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 40s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 2m 42s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 4m 5s{color} | {color:blue} ql in master has 1550 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 7m 49s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 26s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 7m 45s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 43s{color} | {color:red} ql: The patch generated 3 new + 103 unchanged - 0 fixed = 106 total (was 103) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 2m 0s{color} | {color:red} root: The patch generated 3 new + 103 unchanged - 0 fixed = 106 total (was 103) {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s{color} | {color:red} The patch has 6 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 4m 20s{color} | {color:red} ql generated 1 new + 1550 unchanged - 0 fixed = 1551 total (was 1550) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 7m 44s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 13s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 65m 58s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:ql |
| | Unread field:AccurateEstimatesCheckerHook.java:[line 61] |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests | asflicense javac javadoc findbugs checkstyle compile |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-19327/dev-support/hive-personality.sh |
| git revision | master / 4cfb7a0 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.1 |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-19327/yetus/diff-checkstyle-ql.txt |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-19327/yetus/diff-checkstyle-root.txt |
| whitespace | http://104.198.109.242/logs//PreCommit-HIVE-Build-19327/yetus/whitespace-eol.txt |
| findbugs | http://104.198.109.242/logs//PreCommit-HIVE-Build-19327/yetus/new-findbugs-ql.html |
| modules | C: ql . itests U: . |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-19327/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |

This message was automatically generated.
[jira] [Commented] (HIVE-22238) PK/FK selectivity estimation underscales estimations
[ https://issues.apache.org/jira/browse/HIVE-22238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16968518#comment-16968518 ]

Hive QA commented on HIVE-22238:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12985059/HIVE-22238.09.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/19316/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/19316/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-19316/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ date '+%Y-%m-%d %T.%3N'
2019-11-06 17:15:48.258
+ [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]]
+ export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ export PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'MAVEN_OPTS=-Xmx1g '
+ MAVEN_OPTS='-Xmx1g '
+ cd /data/hiveptest/working/
+ tee /data/hiveptest/logs/PreCommit-HIVE-Build-19316/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ date '+%Y-%m-%d %T.%3N'
2019-11-06 17:15:48.262
+ cd apache-github-source-source
+ git fetch origin
From https://github.com/apache/hive
   d0bd071..4cfb7a0  master -> origin/master
+ git reset --hard HEAD
HEAD is now at d0bd071 HIVE-22292: Implement Hypothetical-Set Aggregate Functions (Krisztian Kasa, reviewed by Jesus Camacho Rodriguez)
+ git clean -f -d
Removing standalone-metastore/metastore-server/src/gen/
+ git checkout master
Already on 'master'
Your branch is behind 'origin/master' by 1 commit, and can be fast-forwarded.
  (use "git pull" to update your local branch)
+ git reset --hard origin/master
HEAD is now at 4cfb7a0 HIVE-22311: Propagate min/max column values from statistics to the optimizer for timestamp type (Jesus Camacho Rodriguez, reviewed by Miklos Gergely)
+ git merge --ff-only origin/master
Already up-to-date.
+ date '+%Y-%m-%d %T.%3N'
2019-11-06 17:15:50.484
+ rm -rf ../yetus_PreCommit-HIVE-Build-19316
+ mkdir ../yetus_PreCommit-HIVE-Build-19316
+ git gc
+ cp -R . ../yetus_PreCommit-HIVE-Build-19316
+ mkdir /data/hiveptest/logs/PreCommit-HIVE-Build-19316/yetus
+ patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hiveptest/working/scratch/build.patch
+ [[ -f /data/hiveptest/working/scratch/build.patch ]]
+ chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh
+ /data/hiveptest/working/scratch/smart-apply-patch.sh /data/hiveptest/working/scratch/build.patch
error: patch failed: ql/src/test/results/clientpositive/llap/vector_interval_mapjoin.q.out:240
Falling back to three-way merge...
Applied patch to 'ql/src/test/results/clientpositive/llap/vector_interval_mapjoin.q.out' with conflicts.
error: patch failed: ql/src/test/results/clientpositive/vector_interval_mapjoin.q.out:259
Falling back to three-way merge...
Applied patch to 'ql/src/test/results/clientpositive/vector_interval_mapjoin.q.out' with conflicts.
Going to apply patch with: git apply -p0
/data/hiveptest/working/scratch/build.patch:368: trailing whitespace.
/data/hiveptest/working/scratch/build.patch:370: trailing whitespace.
/data/hiveptest/working/scratch/build.patch:374: trailing whitespace.
/data/hiveptest/working/scratch/build.patch:412: trailing whitespace.
st.type_name='galaxy class'
/data/hiveptest/working/scratch/build.patch:429: trailing whitespace.
st.type_name='galaxy class'
error: patch failed: ql/src/test/results/clientpositive/llap/vector_interval_mapjoin.q.out:240
Falling back to three-way merge...
Applied patch to 'ql/src/test/results/clientpositive/llap/vector_interval_mapjoin.q.out' with conflicts.
error: patch failed: ql/src/test/results/clientpositive/vector_interval_mapjoin.q.out:259
Falling back to three-way merge...
Applied patch to 'ql/src/test/results/clientpositive/vector_interval_mapjoin.q.out' with conflicts.
U ql/src/test/results/clientpositive/llap/vector_interval_mapjoin.q.out
U ql/src/test/results/clientpositive/vector_interval_mapjoin.q.out
warning: squelched 32 whitespace errors
warning: 37 lines add whitespace errors.
+ result=1
+ '[' 1 -ne 0 ']'
+ rm -rf yetus_PreCommit-HIVE-Build-19316
+ exit 1
'
{noformat}

This
[jira] [Commented] (HIVE-22238) PK/FK selectivity estimation underscales estimations
[ https://issues.apache.org/jira/browse/HIVE-22238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16968280#comment-16968280 ]

Hive QA commented on HIVE-22238:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12985034/HIVE-22238.07.patch

{color:green}SUCCESS:{color} +1 due to 4 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 189 failed/errored test(s), 17572 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[annotate_stats_join_pkfk] (batchId=16)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_outer_join6] (batchId=46)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[estimate_pkfk_push] (batchId=174)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[subquery_notin] (batchId=182)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[subquery_scalar] (batchId=173)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_inner_join] (batchId=182)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[cbo_query23] (batchId=300)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[cbo_query6] (batchId=300)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query10] (batchId=300)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query11] (batchId=300)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query12] (batchId=300)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query13] (batchId=300)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query14] (batchId=300)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query15] (batchId=300)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query16] (batchId=300)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query17] (batchId=300)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query18] (batchId=300)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query19] (batchId=300)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query1] (batchId=300)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query1b] (batchId=300)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query20] (batchId=300)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query23] (batchId=300)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query25] (batchId=300)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query26] (batchId=300)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query27] (batchId=300)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query29] (batchId=300)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query2] (batchId=300)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query30] (batchId=300)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query31] (batchId=300)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query32] (batchId=300)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query33] (batchId=300)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query34] (batchId=300)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query35] (batchId=300)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query36] (batchId=300)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query38] (batchId=300)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query3] (batchId=300)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query40] (batchId=300)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query42] (batchId=300)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query43] (batchId=300)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query45] (batchId=300)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query46] (batchId=300)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query47] (batchId=300)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query48] (batchId=300)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query49] (batchId=300)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query4] (batchId=300)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query50] (batchId=300)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query51] (batchId=300)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query52] (batchId=300)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query53] (batchId=300)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query54] (batchId=300)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query55]
[jira] [Commented] (HIVE-22238) PK/FK selectivity estimation underscales estimations
[ https://issues.apache.org/jira/browse/HIVE-22238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16968272#comment-16968272 ]

Hive QA commented on HIVE-22238:

| (x) *{color:red}-1 overall{color}* |
\\ \\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 49s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 9m 12s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 49s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 2m 42s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 4m 11s{color} | {color:blue} ql in master has 1550 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 8m 0s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 25s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 9m 6s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 7m 32s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 42s{color} | {color:red} ql: The patch generated 3 new + 102 unchanged - 0 fixed = 105 total (was 102) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 2m 0s{color} | {color:red} root: The patch generated 3 new + 102 unchanged - 0 fixed = 105 total (was 102) {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s{color} | {color:red} The patch has 6 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 4m 21s{color} | {color:red} ql generated 1 new + 1550 unchanged - 0 fixed = 1551 total (was 1550) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 7m 54s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 15s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 66m 16s{color} | {color:black} {color} |
\\ \\
|| Reason || Tests ||
| FindBugs | module:ql |
| | Unread field:AccurateEstimatesCheckerHook.java:[line 61] |
\\ \\
|| Subsystem || Report/Notes ||
| Optional Tests | asflicense javac javadoc findbugs checkstyle compile |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-19311/dev-support/hive-personality.sh |
| git revision | master / d0bd071 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.1 |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-19311/yetus/diff-checkstyle-ql.txt |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-19311/yetus/diff-checkstyle-root.txt |
| whitespace | http://104.198.109.242/logs//PreCommit-HIVE-Build-19311/yetus/whitespace-eol.txt |
| findbugs | http://104.198.109.242/logs//PreCommit-HIVE-Build-19311/yetus/new-findbugs-ql.html |
| modules | C: ql . itests U: . |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-19311/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |

This message was automatically generated.

> PK/FK selectivity estimation underscales estimations
>
> Key: HIVE-22238
> URL: https://issues.apache.org/jira/browse/HIVE-22238
> Project: Hive
> Issue Type: Bug
> Components: Statistics
> Reporter: Zoltan Haindrich
> Assignee: Zoltan Haindrich
> Priority: Major
> Labels: pull-request-available
> Attachments: HIVE-22238.01.patch, HIVE-22238.02.patch,
[jira] [Commented] (HIVE-22238) PK/FK selectivity estimation underscales estimations
[ https://issues.apache.org/jira/browse/HIVE-22238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16968004#comment-16968004 ]

Jesus Camacho Rodriguez commented on HIVE-22238:

+1

> PK/FK selectivity estimation underscales estimations
>
> Key: HIVE-22238
> URL: https://issues.apache.org/jira/browse/HIVE-22238
> Project: Hive
> Issue Type: Bug
> Components: Statistics
> Reporter: Zoltan Haindrich
> Assignee: Zoltan Haindrich
> Priority: Major
> Labels: pull-request-available
> Attachments: HIVE-22238.01.patch, HIVE-22238.02.patch, HIVE-22238.03.patch, HIVE-22238.04.patch, HIVE-22238.05.patch, HIVE-22238.05.patch, HIVE-22238.05.patch, HIVE-22238.05.patch, HIVE-22238.05.patch, HIVE-22238.06.patch, HIVE-22238.06.patch
> Time Spent: 40m
> Remaining Estimate: 0h
>
> At [this point|https://github.com/apache/hive/blob/5098d155a1e6a164253f5fa98755273bc34085df/ql/src/java/org/apache/hadoop/hive/ql/optimizer/stats/annotation/StatsRulesProcFactory.java#L2182] the parent operator's row count is scaled according to pkfkselectivity.
> However, [pkfkselectivity is computed|https://github.com/apache/hive/blob/5098d155a1e6a164253f5fa98755273bc34085df/ql/src/java/org/apache/hadoop/hive/ql/optimizer/stats/annotation/StatsRulesProcFactory.java#L2157] on a whole subtree.
> Scaling by that amount counts in an estimation that was already used when parentstats was calculated, so depending on the number of upstream joins this may lead to severe underestimation.
> What happened was:
> * the optimization was able to push the filter to the other side of the join
> * as a result, the incoming data was already filtered
> * the scaling down by the PK selectivity had already been applied, but a new "scaling" happened on top of it

--
This message was sent by Atlassian Jira (v8.3.4#803005)
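The double scaling described in the quoted issue can be illustrated with a small numeric sketch. This is not Hive's code: the function name `estimate_rows`, the row count, and the selectivity value are all hypothetical, chosen only to show how applying the same PK selectivity once per upstream join shrinks the estimate geometrically.

```python
def estimate_rows(base_rows: int, selectivity: float, times_applied: int) -> float:
    """Scale base_rows by selectivity once per application (hypothetical helper)."""
    rows = float(base_rows)
    for _ in range(times_applied):
        rows *= selectivity
    return rows


FACT_ROWS = 1_000_000    # hypothetical FK-side row count
PK_SELECTIVITY = 0.25    # hypothetical: the PK-side filter keeps 25% of the keys

# The filter's effect is counted once: this is the intended estimate.
applied_once = estimate_rows(FACT_ROWS, PK_SELECTIVITY, 1)

# The filter was pushed to the other side of the join, so the parent stats
# already reflect it, and the join scales by the same selectivity again.
applied_twice = estimate_rows(FACT_ROWS, PK_SELECTIVITY, 2)

# One more upstream join repeating the scaling makes the gap wider still.
applied_three_times = estimate_rows(FACT_ROWS, PK_SELECTIVITY, 3)

print(applied_once, applied_twice, applied_three_times)
```

Under these assumed numbers, each extra application divides the estimate by another factor of four, which matches the issue's remark that the severity depends on the number of upstream joins.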
[jira] [Commented] (HIVE-22238) PK/FK selectivity estimation underscales estimations
[ https://issues.apache.org/jira/browse/HIVE-22238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16967967#comment-16967967 ]

Hive QA commented on HIVE-22238:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12984978/HIVE-22238.06.patch

{color:green}SUCCESS:{color} +1 due to 4 test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 17570 tests passed

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/19302/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/19302/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-19302/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.
ATTACHMENT ID: 12984978 - PreCommit-HIVE-Build
[jira] [Commented] (HIVE-22238) PK/FK selectivity estimation underscales estimations
[ https://issues.apache.org/jira/browse/HIVE-22238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16967958#comment-16967958 ]

Hive QA commented on HIVE-22238:

| (x) *{color:red}-1 overall{color}* |
\\ \\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 58s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 54s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 46s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 2m 49s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 4m 14s{color} | {color:blue} ql in master has 1550 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 7m 59s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 26s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 7m 45s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 40s{color} | {color:red} ql: The patch generated 6 new + 101 unchanged - 1 fixed = 107 total (was 102) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 2m 0s{color} | {color:red} root: The patch generated 6 new + 101 unchanged - 1 fixed = 107 total (was 102) {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s{color} | {color:red} The patch has 6 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 4m 22s{color} | {color:red} ql generated 1 new + 1550 unchanged - 0 fixed = 1551 total (was 1550) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 7m 56s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 13s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 66m 51s{color} | {color:black} {color} |
\\ \\
|| Reason || Tests ||
| FindBugs | module:ql |
| | Unread field:AccurateEstimatesCheckerHook.java:[line 61] |
\\ \\
|| Subsystem || Report/Notes ||
| Optional Tests | asflicense javac javadoc findbugs checkstyle compile |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-19302/dev-support/hive-personality.sh |
| git revision | master / 65a5935 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.1 |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-19302/yetus/diff-checkstyle-ql.txt |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-19302/yetus/diff-checkstyle-root.txt |
| whitespace | http://104.198.109.242/logs//PreCommit-HIVE-Build-19302/yetus/whitespace-eol.txt |
| findbugs | http://104.198.109.242/logs//PreCommit-HIVE-Build-19302/yetus/new-findbugs-ql.html |
| modules | C: ql . itests U: . |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-19302/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |

This message was automatically generated.

> PK/FK selectivity estimation underscales estimations
>
> Key: HIVE-22238
> URL: https://issues.apache.org/jira/browse/HIVE-22238
> Project: Hive
> Issue Type: Bug
> Components: Statistics
> Reporter: Zoltan Haindrich
> Assignee: Zoltan Haindrich
> Priority: Major
> Labels: pull-request-available
> Attachments: HIVE-22238.01.patch, HIVE-22238.02.patch,
[jira] [Commented] (HIVE-22238) PK/FK selectivity estimation underscales estimations
[ https://issues.apache.org/jira/browse/HIVE-22238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16964335#comment-16964335 ]

Hive QA commented on HIVE-22238:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12984454/HIVE-22238.05.patch

{color:green}SUCCESS:{color} +1 due to 4 test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 17549 tests passed

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/19232/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/19232/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-19232/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.
ATTACHMENT ID: 12984454 - PreCommit-HIVE-Build
[jira] [Commented] (HIVE-22238) PK/FK selectivity estimation underscales estimations
[ https://issues.apache.org/jira/browse/HIVE-22238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16964321#comment-16964321 ]

Hive QA commented on HIVE-22238:

| (x) *{color:red}-1 overall{color}* |
\\ \\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 42s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 22s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 34s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 2m 34s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 4m 7s{color} | {color:blue} ql in master has 1545 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 7m 43s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 25s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 7m 19s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 39s{color} | {color:red} ql: The patch generated 1 new + 93 unchanged - 0 fixed = 94 total (was 93) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 2m 1s{color} | {color:red} root: The patch generated 1 new + 93 unchanged - 0 fixed = 94 total (was 93) {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 1s{color} | {color:red} The patch has 6 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 4m 24s{color} | {color:red} ql generated 1 new + 1545 unchanged - 0 fixed = 1546 total (was 1545) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 7m 47s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 14s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 63m 55s{color} | {color:black} {color} |
\\ \\
|| Reason || Tests ||
| FindBugs | module:ql |
| | Unread field:AccurateEstimatesCheckerHook.java:[line 61] |
\\ \\
|| Subsystem || Report/Notes ||
| Optional Tests | asflicense javac javadoc findbugs checkstyle compile |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-19232/dev-support/hive-personality.sh |
| git revision | master / 244de3b |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.1 |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-19232/yetus/diff-checkstyle-ql.txt |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-19232/yetus/diff-checkstyle-root.txt |
| whitespace | http://104.198.109.242/logs//PreCommit-HIVE-Build-19232/yetus/whitespace-eol.txt |
| findbugs | http://104.198.109.242/logs//PreCommit-HIVE-Build-19232/yetus/new-findbugs-ql.html |
| modules | C: ql . itests U: . |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-19232/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |

This message was automatically generated.

> PK/FK selectivity estimation underscales estimations
>
> Key: HIVE-22238
> URL: https://issues.apache.org/jira/browse/HIVE-22238
> Project: Hive
> Issue Type: Bug
> Components: Statistics
> Reporter: Zoltan Haindrich
> Assignee: Zoltan Haindrich
> Priority: Major
> Labels: pull-request-available
> Attachments: HIVE-22238.01.patch, HIVE-22238.02.patch,
>
[jira] [Commented] (HIVE-22238) PK/FK selectivity estimation underscales estimations
[ https://issues.apache.org/jira/browse/HIVE-22238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16963590#comment-16963590 ]

Hive QA commented on HIVE-22238:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12984346/HIVE-22238.05.patch

{color:green}SUCCESS:{color} +1 due to 4 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 17545 tests executed
*Failed tests:*
{noformat}
TestStatsReplicationScenariosACIDNoAutogather - did not produce a TEST-*.xml file (likely timed out) (batchId=255)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/19219/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/19219/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-19219/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.
ATTACHMENT ID: 12984346 - PreCommit-HIVE-Build
[jira] [Commented] (HIVE-22238) PK/FK selectivity estimation underscales estimations
[ https://issues.apache.org/jira/browse/HIVE-22238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16963582#comment-16963582 ]

Hive QA commented on HIVE-22238:

| (x) *{color:red}-1 overall{color}* |
\\ \\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 36s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 16s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 19s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 2m 42s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 4m 6s{color} | {color:blue} ql in master has 1545 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 7m 32s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 26s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 7m 15s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 43s{color} | {color:red} ql: The patch generated 1 new + 93 unchanged - 0 fixed = 94 total (was 93) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 1m 58s{color} | {color:red} root: The patch generated 1 new + 93 unchanged - 0 fixed = 94 total (was 93) {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s{color} | {color:red} The patch has 6 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 4m 12s{color} | {color:red} ql generated 1 new + 1545 unchanged - 0 fixed = 1546 total (was 1545) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 7m 45s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 14s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 63m 18s{color} | {color:black} {color} |
\\ \\
|| Reason || Tests ||
| FindBugs | module:ql |
| | Unread field:AccurateEstimatesCheckerHook.java:[line 61] |
\\ \\
|| Subsystem || Report/Notes ||
| Optional Tests | asflicense javac javadoc findbugs checkstyle compile |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-19219/dev-support/hive-personality.sh |
| git revision | master / 6a154ee |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.1 |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-19219/yetus/diff-checkstyle-ql.txt |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-19219/yetus/diff-checkstyle-root.txt |
| whitespace | http://104.198.109.242/logs//PreCommit-HIVE-Build-19219/yetus/whitespace-eol.txt |
| findbugs | http://104.198.109.242/logs//PreCommit-HIVE-Build-19219/yetus/new-findbugs-ql.html |
| modules | C: ql . itests U: . |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-19219/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |

This message was automatically generated.

> PK/FK selectivity estimation underscales estimations
>
> Key: HIVE-22238
> URL: https://issues.apache.org/jira/browse/HIVE-22238
> Project: Hive
> Issue Type: Bug
> Components: Statistics
> Reporter: Zoltan Haindrich
> Assignee: Zoltan Haindrich
> Priority: Major
> Labels: pull-request-available
> Attachments: HIVE-22238.01.patch, HIVE-22238.02.patch,
>
[jira] [Commented] (HIVE-22238) PK/FK selectivity estimation underscales estimations
[ https://issues.apache.org/jira/browse/HIVE-22238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16962810#comment-16962810 ] Zoltan Haindrich commented on HIVE-22238: - Unrelated tests are failing, so I attached the same patch for the 3rd time. [~jcamachorodriguez] could you please take a look? > PK/FK selectivity estimation underscales estimations > > > Key: HIVE-22238 > URL: https://issues.apache.org/jira/browse/HIVE-22238 > Project: Hive > Issue Type: Bug > Components: Statistics >Reporter: Zoltan Haindrich >Assignee: Zoltan Haindrich >Priority: Major > Labels: pull-request-available > Attachments: HIVE-22238.01.patch, HIVE-22238.02.patch, > HIVE-22238.03.patch, HIVE-22238.04.patch, HIVE-22238.05.patch, > HIVE-22238.05.patch, HIVE-22238.05.patch, HIVE-22238.05.patch > > Time Spent: 10m > Remaining Estimate: 0h > > at [this > point|https://github.com/apache/hive/blob/5098d155a1e6a164253f5fa98755273bc34085df/ql/src/java/org/apache/hadoop/hive/ql/optimizer/stats/annotation/StatsRulesProcFactory.java#L2182] > the parent operator's rownum is scaled according to pkfkselectivity > however [pkfkselectivity is > computed|https://github.com/apache/hive/blob/5098d155a1e6a164253f5fa98755273bc34085df/ql/src/java/org/apache/hadoop/hive/ql/optimizer/stats/annotation/StatsRulesProcFactory.java#L2157] > on a whole subtree. > Scaling it by that amount will count in an estimation that was already used when > parentstats was calculated, so depending on the number of upstream joins > this may lead to severe underestimations. > What happened was: > * optimization was able to push the filter to the other side of the join > * as a result the incoming data was already filtered > * scaling down by the PK selectivity was actually already there, but a new > "scaling" happened -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HIVE-22238) PK/FK selectivity estimation underscales estimations
[ https://issues.apache.org/jira/browse/HIVE-22238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16962760#comment-16962760 ] Hive QA commented on HIVE-22238: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12984290/HIVE-22238.05.patch {color:green}SUCCESS:{color} +1 due to 4 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 17538 tests executed *Failed tests:* {noformat} TestJdbcWithMiniLlapArrow - did not produce a TEST-*.xml file (likely timed out) (batchId=284) org.apache.hadoop.hive.llap.security.TestLlapSignerImpl.testSigning (batchId=364) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/19205/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/19205/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-19205/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.YetusPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 2 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12984290 - PreCommit-HIVE-Build > PK/FK selectivity estimation underscales estimations > > > Key: HIVE-22238 > URL: https://issues.apache.org/jira/browse/HIVE-22238 > Project: Hive > Issue Type: Bug > Components: Statistics >Reporter: Zoltan Haindrich >Assignee: Zoltan Haindrich >Priority: Major > Labels: pull-request-available > Attachments: HIVE-22238.01.patch, HIVE-22238.02.patch, > HIVE-22238.03.patch, HIVE-22238.04.patch, HIVE-22238.05.patch, > HIVE-22238.05.patch, HIVE-22238.05.patch > > Time Spent: 10m > Remaining Estimate: 0h > > at [this > point|https://github.com/apache/hive/blob/5098d155a1e6a164253f5fa98755273bc34085df/ql/src/java/org/apache/hadoop/hive/ql/optimizer/stats/annotation/StatsRulesProcFactory.java#L2182] > the parent operator's rownum is scaled according to pkfkselectivity > however [pkfkselectivity is > computed|https://github.com/apache/hive/blob/5098d155a1e6a164253f5fa98755273bc34085df/ql/src/java/org/apache/hadoop/hive/ql/optimizer/stats/annotation/StatsRulesProcFactory.java#L2157] > on a whole subtree. > Scaling it by that amount will count in an estimation that was already used when > parentstats was calculated, so depending on the number of upstream joins > this may lead to severe underestimations. > What happened was: > * optimization was able to push the filter to the other side of the join > * as a result the incoming data was already filtered > * scaling down by the PK selectivity was actually already there, but a new > "scaling" happened
[jira] [Commented] (HIVE-22238) PK/FK selectivity estimation underscales estimations
[ https://issues.apache.org/jira/browse/HIVE-22238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16962755#comment-16962755 ] Hive QA commented on HIVE-22238: | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 1s{color} | {color:green} The patch does not contain any @author tags. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 26s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 25s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 30s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 2m 43s{color} | {color:green} master passed {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 4m 0s{color} | {color:blue} ql in master has 1545 extant Findbugs warnings. 
{color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 7m 25s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 24s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 25s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 16s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 7m 16s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 38s{color} | {color:red} ql: The patch generated 1 new + 93 unchanged - 0 fixed = 94 total (was 93) {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 1m 52s{color} | {color:red} root: The patch generated 1 new + 93 unchanged - 0 fixed = 94 total (was 93) {color} | | {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s{color} | {color:red} The patch has 6 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 4m 5s{color} | {color:red} ql generated 1 new + 1545 unchanged - 0 fixed = 1546 total (was 1545) {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 7m 17s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 13s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 62m 7s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | FindBugs | module:ql | | | Unread field:AccurateEstimatesCheckerHook.java:[line 61] | \\ \\ || Subsystem || Report/Notes || | Optional Tests | asflicense javac javadoc findbugs checkstyle compile | | uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux | | Build tool | maven | | Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-19205/dev-support/hive-personality.sh | | git revision | master / 305e710 | | Default Java | 1.8.0_111 | | findbugs | v3.0.1 | | checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-19205/yetus/diff-checkstyle-ql.txt | | checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-19205/yetus/diff-checkstyle-root.txt | | whitespace | http://104.198.109.242/logs//PreCommit-HIVE-Build-19205/yetus/whitespace-eol.txt | | findbugs | http://104.198.109.242/logs//PreCommit-HIVE-Build-19205/yetus/new-findbugs-ql.html | | modules | C: ql . itests U: . | | Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-19205/yetus.txt | | Powered by | Apache Yetushttp://yetus.apache.org | This message was automatically generated. > PK/FK selectivity estimation underscales estimations > > > Key: HIVE-22238 > URL: https://issues.apache.org/jira/browse/HIVE-22238 > Project: Hive > Issue Type: Bug > Components: Statistics >Reporter: Zoltan Haindrich >Assignee: Zoltan Haindrich >Priority: Major > Labels: pull-request-available > Attachments: HIVE-22238.01.patch, HIVE-22238.02.patch, >
[jira] [Commented] (HIVE-22238) PK/FK selectivity estimation underscales estimations
[ https://issues.apache.org/jira/browse/HIVE-22238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16962191#comment-16962191 ] Hive QA commented on HIVE-22238: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12984247/HIVE-22238.05.patch {color:red}ERROR:{color} -1 due to build exiting with an error Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/19193/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/19193/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-19193/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Tests exited with: Exception: Patch URL https://issues.apache.org/jira/secure/attachment/12984247/HIVE-22238.05.patch was found in seen patch url's cache and a test was probably run already on it. Aborting... {noformat} This message is automatically generated. ATTACHMENT ID: 12984247 - PreCommit-HIVE-Build > PK/FK selectivity estimation underscales estimations > > > Key: HIVE-22238 > URL: https://issues.apache.org/jira/browse/HIVE-22238 > Project: Hive > Issue Type: Bug > Components: Statistics >Reporter: Zoltan Haindrich >Assignee: Zoltan Haindrich >Priority: Major > Labels: pull-request-available > Attachments: HIVE-22238.01.patch, HIVE-22238.02.patch, > HIVE-22238.03.patch, HIVE-22238.04.patch, HIVE-22238.05.patch, > HIVE-22238.05.patch > > Time Spent: 10m > Remaining Estimate: 0h > > at [this > point|https://github.com/apache/hive/blob/5098d155a1e6a164253f5fa98755273bc34085df/ql/src/java/org/apache/hadoop/hive/ql/optimizer/stats/annotation/StatsRulesProcFactory.java#L2182] > the parent operators rownum is scaled according to pkfkselectivity > however [pkfkselectivity is > computed|https://github.com/apache/hive/blob/5098d155a1e6a164253f5fa98755273bc34085df/ql/src/java/org/apache/hadoop/hive/ql/optimizer/stats/annotation/StatsRulesProcFactory.java#L2157] > on a whole subtree. 
> Scaling it by that amount will count in an estimation that was already used when > parentstats was calculated, so depending on the number of upstream joins > this may lead to severe underestimations. > What happened was: > * optimization was able to push the filter to the other side of the join > * as a result the incoming data was already filtered > * scaling down by the PK selectivity was actually already there, but a new > "scaling" happened
[jira] [Commented] (HIVE-22238) PK/FK selectivity estimation underscales estimations
[ https://issues.apache.org/jira/browse/HIVE-22238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16962079#comment-16962079 ] Hive QA commented on HIVE-22238: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12984247/HIVE-22238.05.patch {color:green}SUCCESS:{color} +1 due to 4 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 17549 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.ql.lockmgr.TestDbTxnManager2.insertOverwriteCreateAcid (batchId=353) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/19191/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/19191/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-19191/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.YetusPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 1 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12984247 - PreCommit-HIVE-Build > PK/FK selectivity estimation underscales estimations > > > Key: HIVE-22238 > URL: https://issues.apache.org/jira/browse/HIVE-22238 > Project: Hive > Issue Type: Bug > Components: Statistics >Reporter: Zoltan Haindrich >Assignee: Zoltan Haindrich >Priority: Major > Labels: pull-request-available > Attachments: HIVE-22238.01.patch, HIVE-22238.02.patch, > HIVE-22238.03.patch, HIVE-22238.04.patch, HIVE-22238.05.patch, > HIVE-22238.05.patch > > Time Spent: 10m > Remaining Estimate: 0h > > at [this > point|https://github.com/apache/hive/blob/5098d155a1e6a164253f5fa98755273bc34085df/ql/src/java/org/apache/hadoop/hive/ql/optimizer/stats/annotation/StatsRulesProcFactory.java#L2182] > the parent operator's rownum is scaled according to pkfkselectivity > however [pkfkselectivity is > computed|https://github.com/apache/hive/blob/5098d155a1e6a164253f5fa98755273bc34085df/ql/src/java/org/apache/hadoop/hive/ql/optimizer/stats/annotation/StatsRulesProcFactory.java#L2157] > on a whole subtree. > Scaling it by that amount will count in an estimation that was already used when > parentstats was calculated, so depending on the number of upstream joins > this may lead to severe underestimations. > What happened was: > * optimization was able to push the filter to the other side of the join > * as a result the incoming data was already filtered > * scaling down by the PK selectivity was actually already there, but a new > "scaling" happened
[jira] [Commented] (HIVE-22238) PK/FK selectivity estimation underscales estimations
[ https://issues.apache.org/jira/browse/HIVE-22238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16962069#comment-16962069 ] Hive QA commented on HIVE-22238: | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 35s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 15s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 8s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 2m 37s{color} | {color:green} master passed {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 3m 56s{color} | {color:blue} ql in master has 1545 extant Findbugs warnings. 
{color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 7m 42s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 23s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 31s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 11s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 7m 11s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 40s{color} | {color:red} ql: The patch generated 1 new + 93 unchanged - 0 fixed = 94 total (was 93) {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 1m 54s{color} | {color:red} root: The patch generated 1 new + 93 unchanged - 0 fixed = 94 total (was 93) {color} | | {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s{color} | {color:red} The patch has 6 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 4m 12s{color} | {color:red} ql generated 1 new + 1545 unchanged - 0 fixed = 1546 total (was 1545) {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 7m 34s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 13s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 62m 25s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | FindBugs | module:ql | | | Unread field:AccurateEstimatesCheckerHook.java:[line 61] | \\ \\ || Subsystem || Report/Notes || | Optional Tests | asflicense javac javadoc findbugs checkstyle compile | | uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux | | Build tool | maven | | Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-19191/dev-support/hive-personality.sh | | git revision | master / aceb8b6 | | Default Java | 1.8.0_111 | | findbugs | v3.0.1 | | checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-19191/yetus/diff-checkstyle-ql.txt | | checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-19191/yetus/diff-checkstyle-root.txt | | whitespace | http://104.198.109.242/logs//PreCommit-HIVE-Build-19191/yetus/whitespace-eol.txt | | findbugs | http://104.198.109.242/logs//PreCommit-HIVE-Build-19191/yetus/new-findbugs-ql.html | | modules | C: ql . itests U: . | | Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-19191/yetus.txt | | Powered by | Apache Yetushttp://yetus.apache.org | This message was automatically generated. > PK/FK selectivity estimation underscales estimations > > > Key: HIVE-22238 > URL: https://issues.apache.org/jira/browse/HIVE-22238 > Project: Hive > Issue Type: Bug > Components: Statistics >Reporter: Zoltan Haindrich >Assignee: Zoltan Haindrich >Priority: Major > Labels: pull-request-available > Attachments: HIVE-22238.01.patch, HIVE-22238.02.patch, >
[jira] [Commented] (HIVE-22238) PK/FK selectivity estimation underscales estimations
[ https://issues.apache.org/jira/browse/HIVE-22238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16961527#comment-16961527 ] Hive QA commented on HIVE-22238: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12984172/HIVE-22238.04.patch {color:green}SUCCESS:{color} +1 due to 3 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 26 failed/errored test(s), 17549 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[annotate_stats_join_pkfk] (batchId=16) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join33] (batchId=13) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[constprog_partitioner] (batchId=80) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[correlationoptimizer9] (batchId=7) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[join45] (batchId=22) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[join47] (batchId=35) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[mapjoin47] (batchId=66) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[masking_2] (batchId=97) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[masking_disablecbo_2] (batchId=49) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[subquery_exists] (batchId=46) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[subquery_unqualcolumnrefs] (batchId=20) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_interval_mapjoin] (batchId=43) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[explainuser_2] (batchId=159) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[explainuser_1] (batchId=171) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[groupby_groupingset_bug] (batchId=186) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[hybridgrace_hashjoin_2] (batchId=168) 
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[reopt_dpp] (batchId=181) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[reopt_semijoin] (batchId=186) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[retry_failure_reorder] (batchId=172) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[subquery_exists] (batchId=173) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[subquery_in] (batchId=179) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[subquery_multi] (batchId=166) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_interval_mapjoin] (batchId=172) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_mapjoin_reduce] (batchId=184) org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_explainuser_1] (batchId=194) org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[hybridgrace_hashjoin_2] (batchId=111) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/19180/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/19180/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-19180/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.YetusPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 26 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12984172 - PreCommit-HIVE-Build > PK/FK selectivity estimation underscales estimations > > > Key: HIVE-22238 > URL: https://issues.apache.org/jira/browse/HIVE-22238 > Project: Hive > Issue Type: Bug > Components: Statistics >Reporter: Zoltan Haindrich >Assignee: Zoltan Haindrich >Priority: Major > Attachments: HIVE-22238.01.patch, HIVE-22238.02.patch, > HIVE-22238.03.patch, HIVE-22238.04.patch > > > at [this > point|https://github.com/apache/hive/blob/5098d155a1e6a164253f5fa98755273bc34085df/ql/src/java/org/apache/hadoop/hive/ql/optimizer/stats/annotation/StatsRulesProcFactory.java#L2182] > the parent operators rownum is scaled according to pkfkselectivity > however [pkfkselectivity is > computed|https://github.com/apache/hive/blob/5098d155a1e6a164253f5fa98755273bc34085df/ql/src/java/org/apache/hadoop/hive/ql/optimizer/stats/annotation/StatsRulesProcFactory.java#L2157] > on a whole subtree. > Scaling it by that amount will count in estimation already used when > parentstats was calculated...so depending on the number of upstream joins - > this may lead to severe underestimations > what happened was: > * optimization was able to push the
[jira] [Commented] (HIVE-22238) PK/FK selectivity estimation underscales estimations
[ https://issues.apache.org/jira/browse/HIVE-22238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16961516#comment-16961516 ] Hive QA commented on HIVE-22238: | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 32s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 21s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 23s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 2m 56s{color} | {color:green} master passed {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 4m 5s{color} | {color:blue} ql in master has 1545 extant Findbugs warnings. 
{color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 7m 35s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 25s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 9m 8s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 31s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 7m 31s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 52s{color} | {color:red} ql: The patch generated 1 new + 883 unchanged - 0 fixed = 884 total (was 883) {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 2m 5s{color} | {color:red} root: The patch generated 1 new + 900 unchanged - 0 fixed = 901 total (was 900) {color} | | {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s{color} | {color:red} The patch has 12 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 4m 11s{color} | {color:red} ql generated 3 new + 1545 unchanged - 0 fixed = 1548 total (was 1545) {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 7m 36s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 15s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 64m 29s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | FindBugs | module:ql | | | Unread field:AccurateEstimatesCheckerHook.java:[line 61] | | | Dead store to currCS in org.apache.hadoop.hive.ql.optimizer.stats.annotation.StatsRulesProcFactory$JoinStatsRule.xxx1(Operator, ColStatistics) At StatsRulesProcFactory.java:org.apache.hadoop.hive.ql.optimizer.stats.annotation.StatsRulesProcFactory$JoinStatsRule.xxx1(Operator, ColStatistics) At StatsRulesProcFactory.java:[line 2294] | | | Dead store to o in org.apache.hadoop.hive.ql.optimizer.stats.annotation.StatsRulesProcFactory$JoinStatsRule.xxx1(Operator, ColStatistics) At StatsRulesProcFactory.java:org.apache.hadoop.hive.ql.optimizer.stats.annotation.StatsRulesProcFactory$JoinStatsRule.xxx1(Operator, ColStatistics) At StatsRulesProcFactory.java:[line 2295] | \\ \\ || Subsystem || Report/Notes || | Optional Tests | asflicense javac javadoc findbugs checkstyle compile | | uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux | | Build tool | maven | | Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-19180/dev-support/hive-personality.sh | | git revision | master / aceb8b6 | | Default Java | 1.8.0_111 | | findbugs | v3.0.1 | | checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-19180/yetus/diff-checkstyle-ql.txt | | checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-19180/yetus/diff-checkstyle-root.txt | | whitespace | http://104.198.109.242/logs//PreCommit-HIVE-Build-19180/yetus/whitespace-eol.txt | | findbugs | http://104.198.109.242/logs//PreCommit-HIVE-Build-19180/yetus/new-findbugs-ql.html | | modules | C: ql . itests U: . | | Console output |
[jira] [Commented] (HIVE-22238) PK/FK selectivity estimation underscales estimations
[ https://issues.apache.org/jira/browse/HIVE-22238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16961081#comment-16961081 ] Zoltan Haindrich commented on HIVE-22238: - I've written some tests to check how well a particular solution works. I've added 2 small "tweaks": * column stats are marked if they are used in a filter expression * columns are not considered affected by an "is null" filter if the number of nulls in the column stats is 0. There is one test which doesn't pass with accurate estimates on the second join: the first join has a 0.01 selectivity from the two 0.1 branches, but min is used during estimation, so it is kept as 0.1. I think this is okay. I've uploaded a new patch to run Hive QA and will clean up the patch for the next run. > PK/FK selectivity estimation underscales estimations > > > Key: HIVE-22238 > URL: https://issues.apache.org/jira/browse/HIVE-22238 > Project: Hive > Issue Type: Bug > Components: Statistics >Reporter: Zoltan Haindrich >Assignee: Zoltan Haindrich >Priority: Major > Attachments: HIVE-22238.01.patch, HIVE-22238.02.patch, > HIVE-22238.03.patch, HIVE-22238.04.patch > > > at [this > point|https://github.com/apache/hive/blob/5098d155a1e6a164253f5fa98755273bc34085df/ql/src/java/org/apache/hadoop/hive/ql/optimizer/stats/annotation/StatsRulesProcFactory.java#L2182] > the parent operator's rownum is scaled according to pkfkselectivity > however [pkfkselectivity is > computed|https://github.com/apache/hive/blob/5098d155a1e6a164253f5fa98755273bc34085df/ql/src/java/org/apache/hadoop/hive/ql/optimizer/stats/annotation/StatsRulesProcFactory.java#L2157] > on a whole subtree.
> Scaling it by that amount will count in an estimation that was already used when > parentstats was calculated, so depending on the number of upstream joins > this may lead to severe underestimations. > What happened was: > * optimization was able to push the filter to the other side of the join > * as a result the incoming data was already filtered > * scaling down by the PK selectivity was actually already there, but a new > "scaling" happened
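The min-versus-product remark in the comment above can be illustrated with a small Java sketch. It is not Hive code: the class and method names are hypothetical, and the 1,000,000-row input is an assumed figure chosen only to make the arithmetic visible. Two branches with selectivity 0.1 each give 0.01 when multiplied but 0.1 when the minimum is taken, which is why the estimate for the second join stays above the accurate value.

```java
import java.util.Arrays;

/** Hypothetical sketch, not part of Hive: two ways of combining branch selectivities. */
final class SelectivityCombineSketch {

  /** Treats the branches as independent and multiplies their selectivities. */
  static double product(double... selectivities) {
    return Arrays.stream(selectivities).reduce(1.0, (a, b) -> a * b);
  }

  /** Keeps only the most selective branch, mirroring "min is used during estimation". */
  static double min(double... selectivities) {
    return Arrays.stream(selectivities).min().orElse(1.0);
  }

  /** Scales a row count by a selectivity and rounds to whole rows. */
  static long scale(long rows, double selectivity) {
    return Math.round(rows * selectivity);
  }

  public static void main(String[] args) {
    long rows = 1_000_000L; // assumed input size, for illustration only
    System.out.println("product: " + scale(rows, product(0.1, 0.1)));
    System.out.println("min:     " + scale(rows, min(0.1, 0.1)));
  }
}
```

With these assumed numbers the product yields 10,000 rows and the minimum yields 100,000 rows, so keeping the minimum errs on the side of overestimation.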
[jira] [Commented] (HIVE-22238) PK/FK selectivity estimation underscales estimations
[ https://issues.apache.org/jira/browse/HIVE-22238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16957156#comment-16957156 ]

Jesus Camacho Rodriguez commented on HIVE-22238:

Let's work on the more robust solution. It is not worth trading one wrong estimate for another, especially since overestimation will cause regressions in mapjoin transformation. If we have identified that this is due to filter conditions on the join keys being accounted for more than once when selectivity is computed, we can focus on that specific issue. At least for the `getSelectivitySimpleTree` method (no binary operators below), it seems the fix is not too complicated and will improve our estimates.
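The suggestion in the comment above (do not count filter conditions on the join keys twice) could look roughly like the sketch below. This is an illustration only, not the logic of `getSelectivitySimpleTree`: the `FilterPredicate` type, the `pkSelectivity` method, the column names, and the assumption that predicate selectivities simply multiply are all hypothetical.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Set;

public class JoinKeyAwareSelectivity {

    // Hypothetical stand-in for a filter predicate: the column it tests
    // and the fraction of rows it is estimated to keep.
    public static final class FilterPredicate {
        final String column;
        final double selectivity;

        public FilterPredicate(String column, double selectivity) {
            this.column = column;
            this.selectivity = selectivity;
        }
    }

    // Multiplies the selectivities of the PK-side predicates, skipping those
    // on columns whose filter has already been applied to the FK input.
    public static double pkSelectivity(List<FilterPredicate> pkSidePredicates,
                                       Set<String> alreadyAppliedOnFkSide) {
        double result = 1.0;
        for (FilterPredicate p : pkSidePredicates) {
            if (!alreadyAppliedOnFkSide.contains(p.column)) {
                result *= p.selectivity;
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<FilterPredicate> predicates = Arrays.asList(
            new FilterPredicate("d_date_sk", 0.1),  // on the join key, also pushed to the FK side
            new FilterPredicate("d_year", 0.2));    // on a non-key column

        System.out.println("all predicates counted:     "
            + pkSelectivity(predicates, Collections.<String>emptySet()));
        System.out.println("join-key predicate skipped: "
            + pkSelectivity(predicates, Collections.singleton("d_date_sk")));
    }
}
```

The filter on the non-key column still reduces the estimate, so the join is not treated as having no effect at all; only the predicate that was already reflected in the FK input is left out.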
[jira] [Commented] (HIVE-22238) PK/FK selectivity estimation underscales estimations
[ https://issues.apache.org/jira/browse/HIVE-22238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16957002#comment-16957002 ]

Zoltan Haindrich commented on HIVE-22238:

[~jcamachorodriguez]: yes, things like that have crossed my mind...but those things are not readily available, which is why I didn't choose that path. Right now we have an underestimation bug, which this ticket could change into a "sometimes" overestimation issue. This new overestimation in some cases could be improved upon later.
[jira] [Commented] (HIVE-22238) PK/FK selectivity estimation underscales estimations
[ https://issues.apache.org/jira/browse/HIVE-22238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16956461#comment-16956461 ]

Hive QA commented on HIVE-22238:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12983619/HIVE-22238.03.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/19093/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/19093/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-19093/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Tests exited with: Exception: Patch URL https://issues.apache.org/jira/secure/attachment/12983619/HIVE-22238.03.patch was found in seen patch url's cache and a test was probably run already on it. Aborting...
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12983619 - PreCommit-HIVE-Build
[jira] [Commented] (HIVE-22238) PK/FK selectivity estimation underscales estimations
[ https://issues.apache.org/jira/browse/HIVE-22238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16956422#comment-16956422 ]

Jesus Camacho Rodriguez commented on HIVE-22238:

[~kgyrtkirk], `getSelectivitySimpleTree` looks for the TS that is below that operator. Does it find it, or do we go into the logic for multiple operators? If it does, maybe we should exclude from the estimate the predicates that have already been accounted for on the PK side (filter conditions on join keys). Does that make sense? Skipping any reduction performed by a join seems too radical (for instance, if we filter by year but join by any other key, we will not predict any reduction due to the join).
[jira] [Commented] (HIVE-22238) PK/FK selectivity estimation underscales estimations
[ https://issues.apache.org/jira/browse/HIVE-22238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16956408#comment-16956408 ]

Hive QA commented on HIVE-22238:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12983619/HIVE-22238.03.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 17545 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[annotate_stats_join_pkfk] (batchId=16)
org.apache.hadoop.hive.metastore.security.TestHadoopAuthBridge23.testSaslWithHiveMetaStore (batchId=292)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/19091/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/19091/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-19091/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12983619 - PreCommit-HIVE-Build
[jira] [Commented] (HIVE-22238) PK/FK selectivity estimation underscales estimations
[ https://issues.apache.org/jira/browse/HIVE-22238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16956387#comment-16956387 ]

Hive QA commented on HIVE-22238:

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 40s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 6s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 41s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 4m 8s{color} | {color:blue} ql in master has 1547 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 2s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 9s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 9s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 2s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 16s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 26m 3s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests | asflicense javac javadoc findbugs checkstyle compile |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-19091/dev-support/hive-personality.sh |
| git revision | master / c9850b4 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.1 |
| modules | C: ql U: ql |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-19091/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |

This message was automatically generated.
[jira] [Commented] (HIVE-22238) PK/FK selectivity estimation underscales estimations
[ https://issues.apache.org/jira/browse/HIVE-22238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16948296#comment-16948296 ]

Zoltan Haindrich commented on HIVE-22238:

I went after this but forgot to write an update here. What's happening is somewhat both: I now think that the rescaling is accurate and I agree with the logic...but when Calcite pushes the filter predicates to the other branch as well, it ends up downscaling by the same factor again - hence my patch has solved some cases. I'll try to get back to this sooner rather than later :)

My current idea is to somehow identify that the FK column in question has not been filtered so far - so that we may downscale it by the PK factor.
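The idea in the comment above (downscale by the PK factor only when the FK column has not been filtered yet) might be sketched as follows. Everything here is hypothetical: `ColStats`, its `filtered` marker, and `estimateJoinRows` are simplified stand-ins and not Hive's column statistics classes.

```java
public class FilteredColumnGuard {

    // Hypothetical, simplified column statistics carrying a marker that
    // records whether a filter expression has already used the column.
    public static final class ColStats {
        final String name;
        final boolean filtered;

        public ColStats(String name, boolean filtered) {
            this.name = name;
            this.filtered = filtered;
        }
    }

    // Applies the PK selectivity only if the FK column was not filtered upstream.
    public static long estimateJoinRows(long fkRows, ColStats fkColumn, double pkSelectivity) {
        if (fkColumn.filtered) {
            // The reduction is already reflected in fkRows; do not apply it again.
            return fkRows;
        }
        return Math.round(fkRows * pkSelectivity);
    }

    public static void main(String[] args) {
        double pkSelectivity = 0.1;  // hypothetical PK-side filter selectivity

        // FK column untouched so far: the PK factor is applied once.
        System.out.println(estimateJoinRows(
            1_000_000L, new ColStats("ss_sold_date_sk", false), pkSelectivity));

        // FK column already filtered by a pushed-down predicate: no second scaling.
        System.out.println(estimateJoinRows(
            100_000L, new ColStats("ss_sold_date_sk", true), pkSelectivity));
    }
}
```

A boolean marker is the simplest possible choice; it cannot tell which predicate filtered the column, so a filter unrelated to the PK side would also suppress the scaling and could lead to overestimation.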
[jira] [Commented] (HIVE-22238) PK/FK selectivity estimation underscales estimations
[ https://issues.apache.org/jira/browse/HIVE-22238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16943328#comment-16943328 ]

Jesus Camacho Rodriguez commented on HIVE-22238:

[~kgyrtkirk], reading the description of the issue, it seems this is expected since the FK input may be filtered, e.g., by a Filter condition in that input, while the FK-PK join may filter it again based on the condition in the PK side?
[jira] [Commented] (HIVE-22238) PK/FK selectivity estimation underscales estimations
[ https://issues.apache.org/jira/browse/HIVE-22238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16937115#comment-16937115 ] Hive QA commented on HIVE-22238: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12981202/HIVE-22238.01.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 193 failed/errored test(s), 17000 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[updateBasicStats] (batchId=69) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[materialized_view_create_rewrite_3] (batchId=175) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[materialized_view_create_rewrite_4] (batchId=165) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[materialized_view_create_rewrite_5] (batchId=162) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[materialized_view_create_rewrite_rebuild_dummy] (batchId=169) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[materialized_view_create_rewrite_time_window] (batchId=164) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[subquery_scalar] (batchId=172) org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[cbo_query23] (batchId=297) org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query10] (batchId=297) org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query11] (batchId=297) org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query12] (batchId=297) org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query13] (batchId=297) org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query14] (batchId=297) org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query15] (batchId=297) org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query16] (batchId=297) 
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query17] (batchId=297) org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query18] (batchId=297) org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query19] (batchId=297) org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query1] (batchId=297) org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query1b] (batchId=297) org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query20] (batchId=297) org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query21] (batchId=297) org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query22] (batchId=297) org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query23] (batchId=297) org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query24] (batchId=297) org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query25] (batchId=297) org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query26] (batchId=297) org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query27] (batchId=297) org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query29] (batchId=297) org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query30] (batchId=297) org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query31] (batchId=297) org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query32] (batchId=297) org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query33] (batchId=297) org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query34] (batchId=297) org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query35] (batchId=297) org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query36] (batchId=297) org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query37] (batchId=297) org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query38] (batchId=297) 
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query39] (batchId=297) org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query3] (batchId=297) org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query40] (batchId=297) org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query42] (batchId=297) org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query43] (batchId=297) org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query45] (batchId=297) org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query46] (batchId=297) org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query47] (batchId=297) org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query48] (batchId=297) org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query49] (batchId=297) org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query4] (batchId=297)
[jira] [Commented] (HIVE-22238) PK/FK selectivity estimation underscales estimations
[ https://issues.apache.org/jira/browse/HIVE-22238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16937089#comment-16937089 ]

Hive QA commented on HIVE-22238:

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 12m 2s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 49s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 58s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 5m 55s{color} | {color:blue} ql in master has 1567 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 24s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 5m 33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 56s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 15s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 34m 36s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests | asflicense javac javadoc findbugs checkstyle compile |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-18706/dev-support/hive-personality.sh |
| git revision | master / 48ae7ef |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.1 |
| modules | C: ql U: ql |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-18706/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |

This message was automatically generated.