[jira] [Commented] (HIVE-17043) Remove non unique columns from group by keys if not referenced later
[ https://issues.apache.org/jira/browse/HIVE-17043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16649688#comment-16649688 ]

Vineet Garg commented on HIVE-17043:

Pushed to master

> Remove non unique columns from group by keys if not referenced later
>
> Key: HIVE-17043
> URL: https://issues.apache.org/jira/browse/HIVE-17043
> Project: Hive
> Issue Type: Sub-task
> Components: Logical Optimizer
> Affects Versions: 3.0.0
> Reporter: Ashutosh Chauhan
> Assignee: Vineet Garg
> Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-17043.1.patch, HIVE-17043.10.patch,
> HIVE-17043.11.patch, HIVE-17043.12.patch, HIVE-17043.13.patch,
> HIVE-17043.14.patch, HIVE-17043.15.patch, HIVE-17043.16.patch,
> HIVE-17043.2.patch, HIVE-17043.3.patch, HIVE-17043.4.patch,
> HIVE-17043.5.patch, HIVE-17043.6.patch, HIVE-17043.7.patch,
> HIVE-17043.8.patch, HIVE-17043.9.patch
>
> Group by keys may be a mix of unique (or primary) keys and regular columns.
> In such cases the presence of a regular column won't alter the cardinality
> of the groups. So, if the regular columns are not referenced later, they can
> be dropped from the group by keys. Depending on the operator tree, this may
> result in those columns not being read from disk at all in the best case.
> In the worst case, we still avoid shuffling and sorting the regular columns
> from mapper to reducer, which could be a substantial CPU and network saving.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
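The claim in the issue description above (a regular column next to a unique key does not change the groups) can be checked with a small sketch. This is only an illustration in Python, not Hive's optimizer code: the rows, the column layout (id, name, amount) and the helper group_sum are all hypothetical, and the sketch assumes that name is fully determined by id, as it would be if id were a unique key of the table the rows come from.

```python
from collections import defaultdict


def group_sum(rows, key_indexes, value_index):
    """Sum rows[value_index] per group, grouping on the columns in key_indexes."""
    totals = defaultdict(int)
    for row in rows:
        key = tuple(row[i] for i in key_indexes)
        totals[key] += row[value_index]
    return dict(totals)


# Hypothetical rows (id, name, amount); `name` is assumed to depend only on `id`.
rows = [
    (1, "alice", 10),
    (1, "alice", 5),
    (2, "bob", 7),
    (3, "alice", 2),
]

# GROUP BY id, name versus GROUP BY id
by_id_and_name = group_sum(rows, (0, 1), 2)
by_id_only = group_sum(rows, (0,), 2)

# Dropping `name` from each group key leaves exactly the groups keyed by `id`.
projected = {key[:1]: total for key, total in by_id_and_name.items()}
print(projected == by_id_only)
```

Because every id maps to a single name, both groupings produce the same number of groups with the same sums, so the name column only needs to be carried along if something later in the query reads it.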
[jira] [Commented] (HIVE-17043) Remove non unique columns from group by keys if not referenced later
[ https://issues.apache.org/jira/browse/HIVE-17043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16649654#comment-16649654 ]

Hive QA commented on HIVE-17043:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12943846/HIVE-17043.16.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 15079 tests passed

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/14460/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/14460/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-14460/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12943846 - PreCommit-HIVE-Build
[jira] [Commented] (HIVE-17043) Remove non unique columns from group by keys if not referenced later
[ https://issues.apache.org/jira/browse/HIVE-17043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16649631#comment-16649631 ]

Hive QA commented on HIVE-17043:

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 35s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 33s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 2s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 41s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 3m 53s{color} | {color:blue} ql in master has 2318 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 57s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 10s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 2s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 39s{color} | {color:red} ql: The patch generated 2 new + 44 unchanged - 10 fixed = 46 total (was 54) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 5s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 58s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 13s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 23m 46s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests | asflicense javac javadoc findbugs checkstyle compile |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-14460/dev-support/hive-personality.sh |
| git revision | master / 259db56 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-14460/yetus/diff-checkstyle-ql.txt |
| modules | C: itests ql U: . |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-14460/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |

This message was automatically generated.
[jira] [Commented] (HIVE-17043) Remove non unique columns from group by keys if not referenced later
[ https://issues.apache.org/jira/browse/HIVE-17043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16649477#comment-16649477 ]

Hive QA commented on HIVE-17043:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12943059/HIVE-17043.15.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/14452/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/14452/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-14452/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Tests exited with: Exception: Patch URL https://issues.apache.org/jira/secure/attachment/12943059/HIVE-17043.15.patch was found in seen patch url's cache and a test was probably run already on it. Aborting...
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12943059 - PreCommit-HIVE-Build
[jira] [Commented] (HIVE-17043) Remove non unique columns from group by keys if not referenced later
[ https://issues.apache.org/jira/browse/HIVE-17043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16648679#comment-16648679 ]

Hive QA commented on HIVE-17043:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12943059/HIVE-17043.15.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/14397/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/14397/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-14397/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Tests exited with: Exception: Patch URL https://issues.apache.org/jira/secure/attachment/12943059/HIVE-17043.15.patch was found in seen patch url's cache and a test was probably run already on it. Aborting...
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12943059 - PreCommit-HIVE-Build
[jira] [Commented] (HIVE-17043) Remove non unique columns from group by keys if not referenced later
[ https://issues.apache.org/jira/browse/HIVE-17043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16645850#comment-16645850 ]

Hive QA commented on HIVE-17043:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12943059/HIVE-17043.15.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 15072 tests executed

*Failed tests:*
{noformat}
org.apache.hadoop.hive.ql.exec.spark.TestSparkSessionTimeout.testMultiSessionSparkSessionTimeout (batchId=246)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/14358/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/14358/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-14358/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12943059 - PreCommit-HIVE-Build
[jira] [Commented] (HIVE-17043) Remove non unique columns from group by keys if not referenced later
[ https://issues.apache.org/jira/browse/HIVE-17043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16645803#comment-16645803 ]

Hive QA commented on HIVE-17043:

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 49s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 43s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 1s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 38s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 3m 59s{color} | {color:blue} ql in master has 2319 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 56s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 9s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 7s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 7s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 38s{color} | {color:red} ql: The patch generated 2 new + 44 unchanged - 10 fixed = 46 total (was 54) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 9s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 55s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 13s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 25m 13s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests | asflicense javac javadoc findbugs checkstyle compile |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-14358/dev-support/hive-personality.sh |
| git revision | master / 64bef36 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-14358/yetus/diff-checkstyle-ql.txt |
| modules | C: itests ql U: . |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-14358/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |

This message was automatically generated.
[jira] [Commented] (HIVE-17043) Remove non unique columns from group by keys if not referenced later
[ https://issues.apache.org/jira/browse/HIVE-17043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16641364#comment-16641364 ]

Hive QA commented on HIVE-17043:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12942748/HIVE-17043.14.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/14305/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/14305/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-14305/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Tests exited with: Exception: Patch URL https://issues.apache.org/jira/secure/attachment/12942748/HIVE-17043.14.patch was found in seen patch url's cache and a test was probably run already on it. Aborting...
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12942748 - PreCommit-HIVE-Build
[jira] [Commented] (HIVE-17043) Remove non unique columns from group by keys if not referenced later
[ https://issues.apache.org/jira/browse/HIVE-17043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16641332#comment-16641332 ]

Hive QA commented on HIVE-17043:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12942748/HIVE-17043.14.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 15035 tests executed

*Failed tests:*
{noformat}
TestMiniLlapLocalCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=181)
[vector_case_when_1.q,tez_nway_join.q,escape2.q,bucket_map_join_tez1.q,insert_update_delete.q,schema_evol_orc_nonvec_part_all_primitive_llap_io.q,cte_1.q,autoColumnStats_2.q,schema_evol_orc_acid_part_llap_io.q,semijoin6.q,reopt_semijoin.q,materialized_view_rebuild.q,vectorization_0.q,orc_merge8.q,orc_merge_incompat2.q,vector_outer_join4.q,materialized_view_partitioned.q,orc_merge7.q,bucketpruning1.q,schema_evol_orc_acidvec_table.q,vector_grouping_sets.q,vector_outer_join5.q,schema_evol_orc_acidvec_part_update_llap_io.q,groupby_groupingset_bug.q,bucketmapjoin1.q,vector_udf_inline.q,load_dyn_part1.q,results_cache_temptable.q,orc_merge_incompat_writer_version.q,udf_coalesce.q]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[stat_estimate_related_col] (batchId=43)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/14303/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/14303/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-14303/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12942748 - PreCommit-HIVE-Build
[jira] [Commented] (HIVE-17043) Remove non unique columns from group by keys if not referenced later
[ https://issues.apache.org/jira/browse/HIVE-17043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16641316#comment-16641316 ]

Hive QA commented on HIVE-17043:

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 44s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 43s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 4s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 40s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 4m 2s{color} | {color:blue} ql in master has 2320 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 58s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 11s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 8s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 8s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 42s{color} | {color:red} ql: The patch generated 2 new + 44 unchanged - 10 fixed = 46 total (was 54) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 58s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 14s{color} | {color:red} The patch generated 1 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 24m 44s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests | asflicense javac javadoc findbugs checkstyle compile |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-14303/dev-support/hive-personality.sh |
| git revision | master / 61a027a |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-14303/yetus/diff-checkstyle-ql.txt |
| asflicense | http://104.198.109.242/logs//PreCommit-HIVE-Build-14303/yetus/patch-asflicense-problems.txt |
| modules | C: itests ql U: . |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-14303/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |

This message was automatically generated.
[jira] [Commented] (HIVE-17043) Remove non unique columns from group by keys if not referenced later
[ https://issues.apache.org/jira/browse/HIVE-17043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16640968#comment-16640968 ]

Jesus Camacho Rodriguez commented on HIVE-17043:

[~vgarg], latest patch seems to have unrelated changes: {{VectorizedOrcAcidRowBatchReader}}.
[jira] [Commented] (HIVE-17043) Remove non unique columns from group by keys if not referenced later
[ https://issues.apache.org/jira/browse/HIVE-17043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16640967#comment-16640967 ]

Vineet Garg commented on HIVE-17043:

Sorry, that was a typo; thanks for reviewing it. I have uploaded an updated patch.
[jira] [Commented] (HIVE-17043) Remove non unique columns from group by keys if not referenced later
[ https://issues.apache.org/jira/browse/HIVE-17043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16640964#comment-16640964 ]

Jesus Camacho Rodriguez commented on HIVE-17043:

In the latest patch, there are two calls to _generateKeys();_. Please remove the second one, at L140. Once that is done, the patch LGTM. +1 (pending tests)
[jira] [Commented] (HIVE-17043) Remove non unique columns from group by keys if not referenced later
[ https://issues.apache.org/jira/browse/HIVE-17043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16640892#comment-16640892 ]

Jesus Camacho Rodriguez commented on HIVE-17043:

[~vgarg], I left three minor comments in RB; could you take a look? Other than that, I think the patch LGTM.
[jira] [Commented] (HIVE-17043) Remove non unique columns from group by keys if not referenced later
[ https://issues.apache.org/jira/browse/HIVE-17043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16640672#comment-16640672 ]

Hive QA commented on HIVE-17043:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12942591/HIVE-17043.10.patch

SUCCESS: +1 due to 1 test(s) being added or modified.

ERROR: -1 due to 2 failed/errored test(s), 15022 tests executed
*Failed tests:*
{noformat}
TestMiniDruidCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=194) [druidmini_dynamic_partition.q,druidmini_test_ts.q,druidmini_expressions.q,druidmini_test_alter.q,druidmini_test_insert.q]
org.apache.hadoop.hive.ql.exec.spark.TestSparkSessionTimeout.testMultiSparkSessionTimeout (batchId=246)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/14271/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/14271/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-14271/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12942591 - PreCommit-HIVE-Build
[jira] [Commented] (HIVE-17043) Remove non unique columns from group by keys if not referenced later
[ https://issues.apache.org/jira/browse/HIVE-17043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16640649#comment-16640649 ]

Hive QA commented on HIVE-17043:

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
|| || || || master Compile Tests ||
| 0 | mvndep | 0m 44s | Maven dependency ordering for branch |
| +1 | mvninstall | 7m 25s | master passed |
| +1 | compile | 1m 4s | master passed |
| +1 | checkstyle | 0m 37s | master passed |
| 0 | findbugs | 3m 56s | ql in master has 2320 extant Findbugs warnings. |
| +1 | javadoc | 0m 56s | master passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 9s | Maven dependency ordering for patch |
| +1 | mvninstall | 1m 25s | the patch passed |
| +1 | compile | 1m 3s | the patch passed |
| +1 | javac | 1m 3s | the patch passed |
| -1 | checkstyle | 0m 38s | ql: The patch generated 12 new + 51 unchanged - 3 fixed = 63 total (was 54) |
| -1 | whitespace | 0m 0s | The patch has 13 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply |
| +1 | findbugs | 4m 11s | the patch passed |
| +1 | javadoc | 0m 57s | the patch passed |
|| || || || Other Tests ||
| -1 | asflicense | 0m 13s | The patch generated 1 ASF License warnings. |
| | | 23m 53s | |

|| Subsystem || Report/Notes ||
| Optional Tests | asflicense javac javadoc findbugs checkstyle compile |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-14271/dev-support/hive-personality.sh |
| git revision | master / 78e45be |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-14271/yetus/diff-checkstyle-ql.txt |
| whitespace | http://104.198.109.242/logs//PreCommit-HIVE-Build-14271/yetus/whitespace-eol.txt |
| asflicense | http://104.198.109.242/logs//PreCommit-HIVE-Build-14271/yetus/patch-asflicense-problems.txt |
| modules | C: itests ql U: . |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-14271/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |

This message was automatically generated.
[jira] [Commented] (HIVE-17043) Remove non unique columns from group by keys if not referenced later
[ https://issues.apache.org/jira/browse/HIVE-17043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16638538#comment-16638538 ]

Hive QA commented on HIVE-17043:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12942306/HIVE-17043.9.patch

ERROR: -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/14230/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/14230/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-14230/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Tests exited with: Exception: Patch URL https://issues.apache.org/jira/secure/attachment/12942306/HIVE-17043.9.patch was found in seen patch url's cache and a test was probably run already on it. Aborting...
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12942306 - PreCommit-HIVE-Build
[jira] [Commented] (HIVE-17043) Remove non unique columns from group by keys if not referenced later
[ https://issues.apache.org/jira/browse/HIVE-17043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16637445#comment-16637445 ]

Hive QA commented on HIVE-17043:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12942306/HIVE-17043.9.patch

SUCCESS: +1 due to 1 test(s) being added or modified.

ERROR: -1 due to 48 failed/errored test(s), 15011 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[annotate_stats_join_pkfk] (batchId=15)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucket_map_join_spark4] (batchId=1)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[join_vc] (batchId=4)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[runtime_skewjoin_mapjoin_spark] (batchId=59)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[dynamic_semijoin_user_level] (batchId=155)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dynamic_semijoin_reduction] (batchId=171)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dynamic_semijoin_reduction_4] (batchId=166)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dynamic_semijoin_reduction_sw] (batchId=160)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[materialized_view_rewrite_1] (batchId=174)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_dynamic_partition_pruning_6] (batchId=189)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_dynamic_partition_pruning_recursive_mapjoin] (batchId=189)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[join_vc] (batchId=111)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[runtime_skewjoin_mapjoin_spark] (batchId=135)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query17] (batchId=267)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query22] (batchId=267)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query24] (batchId=267)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query25] (batchId=267)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query29] (batchId=267)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query32] (batchId=267)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query45] (batchId=267)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query57] (batchId=267)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query65] (batchId=267)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query66] (batchId=267)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query67] (batchId=267)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query70] (batchId=267)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query72] (batchId=267)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query85] (batchId=267)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query91] (batchId=267)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query92] (batchId=267)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query99] (batchId=267)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query14] (batchId=265)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query17] (batchId=265)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query22] (batchId=265)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query24] (batchId=265)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query25] (batchId=265)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query29] (batchId=265)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query32] (batchId=265)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query45] (batchId=265)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query57] (batchId=265)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query64] (batchId=265)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query65] (batchId=265)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query67] (batchId=265)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query70] (batchId=265)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query72] (batchId=265)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query85] (batchId=265)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query91] (batchId=265)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query92] (batchId=265)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query99] (batchId=265)
{noformat}

Test results:
[jira] [Commented] (HIVE-17043) Remove non unique columns from group by keys if not referenced later
[ https://issues.apache.org/jira/browse/HIVE-17043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16637389#comment-16637389 ]

Hive QA commented on HIVE-17043:

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
|| || || || master Compile Tests ||
| 0 | mvndep | 0m 35s | Maven dependency ordering for branch |
| +1 | mvninstall | 7m 28s | master passed |
| +1 | compile | 1m 3s | master passed |
| +1 | checkstyle | 0m 39s | master passed |
| 0 | findbugs | 3m 54s | ql in master has 2321 extant Findbugs warnings. |
| +1 | javadoc | 0m 55s | master passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 9s | Maven dependency ordering for patch |
| +1 | mvninstall | 1m 23s | the patch passed |
| +1 | compile | 0m 59s | the patch passed |
| +1 | javac | 0m 59s | the patch passed |
| -1 | checkstyle | 0m 38s | ql: The patch generated 11 new + 51 unchanged - 3 fixed = 62 total (was 54) |
| -1 | whitespace | 0m 0s | The patch has 13 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply |
| +1 | findbugs | 4m 8s | the patch passed |
| +1 | javadoc | 0m 56s | the patch passed |
|| || || || Other Tests ||
| +1 | asflicense | 0m 14s | The patch does not generate ASF License warnings. |
| | | 23m 35s | |

|| Subsystem || Report/Notes ||
| Optional Tests | asflicense javac javadoc findbugs checkstyle compile |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-14207/dev-support/hive-personality.sh |
| git revision | master / a06a370 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-14207/yetus/diff-checkstyle-ql.txt |
| whitespace | http://104.198.109.242/logs//PreCommit-HIVE-Build-14207/yetus/whitespace-eol.txt |
| modules | C: itests ql U: . |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-14207/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |

This message was automatically generated.
[jira] [Commented] (HIVE-17043) Remove non unique columns from group by keys if not referenced later
[ https://issues.apache.org/jira/browse/HIVE-17043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16632522#comment-16632522 ]

Hive QA commented on HIVE-17043:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12941620/HIVE-17043.7.patch

SUCCESS: +1 due to 1 test(s) being added or modified.

ERROR: -1 due to 1 failed/errored test(s), 15007 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[bucketmapjoin6] (batchId=189)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/14107/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/14107/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-14107/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12941620 - PreCommit-HIVE-Build
[jira] [Commented] (HIVE-17043) Remove non unique columns from group by keys if not referenced later
[ https://issues.apache.org/jira/browse/HIVE-17043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16632481#comment-16632481 ]

Hive QA commented on HIVE-17043:

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
|| || || || master Compile Tests ||
| 0 | mvndep | 0m 48s | Maven dependency ordering for branch |
| +1 | mvninstall | 8m 5s | master passed |
| +1 | compile | 1m 6s | master passed |
| +1 | checkstyle | 0m 41s | master passed |
| 0 | findbugs | 4m 20s | ql in master has 2322 extant Findbugs warnings. |
| +1 | javadoc | 1m 1s | master passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 11s | Maven dependency ordering for patch |
| +1 | mvninstall | 1m 32s | the patch passed |
| +1 | compile | 1m 9s | the patch passed |
| +1 | javac | 1m 9s | the patch passed |
| -1 | checkstyle | 0m 44s | ql: The patch generated 16 new + 51 unchanged - 3 fixed = 67 total (was 54) |
| -1 | whitespace | 0m 0s | The patch has 13 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply |
| -1 | findbugs | 4m 24s | ql generated 1 new + 2322 unchanged - 0 fixed = 2323 total (was 2322) |
| +1 | javadoc | 0m 59s | the patch passed |
|| || || || Other Tests ||
| -1 | asflicense | 0m 14s | The patch generated 1 ASF License warnings. |
| | | 25m 45s | |

|| Reason || Tests ||
| FindBugs | module:ql |
| | Possible null pointer dereference of left in org.apache.hadoop.hive.ql.optimizer.calcite.stats.EstimateUniqueKeys.getUniqueKeys(HiveJoin, boolean) Dereferenced at EstimateUniqueKeys.java:left in org.apache.hadoop.hive.ql.optimizer.calcite.stats.EstimateUniqueKeys.getUniqueKeys(HiveJoin, boolean) Dereferenced at EstimateUniqueKeys.java:[line 210] |

|| Subsystem || Report/Notes ||
| Optional Tests | asflicense javac javadoc findbugs checkstyle compile |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-14107/dev-support/hive-personality.sh |
| git revision | master / 6e27a53 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-14107/yetus/diff-checkstyle-ql.txt |
| whitespace | http://104.198.109.242/logs//PreCommit-HIVE-Build-14107/yetus/whitespace-eol.txt |
| findbugs | http://104.198.109.242/logs//PreCommit-HIVE-Build-14107/yetus/new-findbugs-ql.html |
| asflicense | http://104.198.109.242/logs//PreCommit-HIVE-Build-14107/yetus/patch-asflicense-problems.txt |
| modules | C: itests ql U: . |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-14107/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |

This message was automatically generated.
[jira] [Commented] (HIVE-17043) Remove non unique columns from group by keys if not referenced later
[ https://issues.apache.org/jira/browse/HIVE-17043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16626919#comment-16626919 ] Hive QA commented on HIVE-17043: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12941108/HIVE-17043.6.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 14997 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[constraints_optimization] (batchId=170) org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query45] (batchId=266) org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query45] (batchId=264) org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query64] (batchId=264) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/14033/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/14033/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-14033/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.YetusPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 4 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12941108 - PreCommit-HIVE-Build > Remove non unique columns from group by keys if not referenced later > > > Key: HIVE-17043 > URL: https://issues.apache.org/jira/browse/HIVE-17043 > Project: Hive > Issue Type: Sub-task > Components: Logical Optimizer >Affects Versions: 3.0.0 >Reporter: Ashutosh Chauhan >Assignee: Vineet Garg >Priority: Major > Attachments: HIVE-17043.1.patch, HIVE-17043.2.patch, > HIVE-17043.3.patch, HIVE-17043.4.patch, HIVE-17043.5.patch, HIVE-17043.6.patch > > > Group by keys may be a mix of unique (or primary) keys and regular columns. > In such cases presence of regular column won't alter cardinality of groups. > So, if regular columns are not referenced later, they can be dropped from > group by keys. Depending on operator tree may result in those columns not > being read at all from disk in best case. In worst case, we will avoid > shuffling and sorting regular columns from mapper to reducer, which still > could be substantial CPU and network savings. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
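As a rough illustration of the rewrite the issue description asks for, the Java sketch below drops group-by columns that are neither part of a unique key contained in the group-by keys nor referenced later. It is not code from any HIVE-17043 patch. The class name `GroupByKeyPruningSketch`, the method `pruneGroupKeys`, and the use of plain column-name strings instead of planner objects are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of Hive or Calcite.
final class GroupByKeyPruningSketch {

  /**
   * Returns the group-by keys that must be kept.
   *
   * @param groupKeys       columns in the GROUP BY clause, in order
   * @param uniqueKeys      known unique keys of the input (each a set of columns)
   * @param referencedLater columns that operators above the group-by still read
   */
  static List<String> pruneGroupKeys(List<String> groupKeys,
                                     List<Set<String>> uniqueKeys,
                                     Set<String> referencedLater) {
    // Look for a unique key that is fully contained in the group-by keys.
    Set<String> covering = null;
    for (Set<String> key : uniqueKeys) {
      if (!key.isEmpty() && groupKeys.containsAll(key)) {
        covering = key;
        break;
      }
    }
    // Without such a key every column may affect the grouping, so keep all.
    if (covering == null) {
      return new ArrayList<>(groupKeys);
    }
    // The unique key already fixes one group per row; other columns are kept
    // only if something later still references them.
    List<String> kept = new ArrayList<>();
    for (String col : groupKeys) {
      if (covering.contains(col) || referencedLater.contains(col)) {
        kept.add(col);
      }
    }
    return kept;
  }
}
```

For example, with group-by keys (id, name, city), a unique key {id}, and only city referenced later, the sketch keeps id and city and drops name.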
[jira] [Commented] (HIVE-17043) Remove non unique columns from group by keys if not referenced later
[ https://issues.apache.org/jira/browse/HIVE-17043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16626863#comment-16626863 ] Hive QA commented on HIVE-17043: | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 30s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 42s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 0s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 40s{color} | {color:green} master passed {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 4m 2s{color} | {color:blue} ql in master has 2326 extant Findbugs warnings. 
{color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 2s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 10s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 22s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 3s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 3s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 39s{color} | {color:red} ql: The patch generated 16 new + 51 unchanged - 3 fixed = 67 total (was 54) {color} | | {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s{color} | {color:red} The patch has 13 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 14s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 59s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 14s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 24m 12s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Optional Tests | asflicense javac javadoc findbugs checkstyle compile | | uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux | | Build tool | maven | | Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-14033/dev-support/hive-personality.sh | | git revision | master / e161b01 | | Default Java | 1.8.0_111 | | findbugs | v3.0.0 | | checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-14033/yetus/diff-checkstyle-ql.txt | | whitespace | http://104.198.109.242/logs//PreCommit-HIVE-Build-14033/yetus/whitespace-eol.txt | | modules | C: itests ql U: . | | Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-14033/yetus.txt | | Powered by | Apache Yetushttp://yetus.apache.org | This message was automatically generated. > Remove non unique columns from group by keys if not referenced later > > > Key: HIVE-17043 > URL: https://issues.apache.org/jira/browse/HIVE-17043 > Project: Hive > Issue Type: Sub-task > Components: Logical Optimizer >Affects Versions: 3.0.0 >Reporter: Ashutosh Chauhan >Assignee: Vineet Garg >Priority: Major > Attachments: HIVE-17043.1.patch, HIVE-17043.2.patch, > HIVE-17043.3.patch, HIVE-17043.4.patch, HIVE-17043.5.patch, HIVE-17043.6.patch > > > Group by keys may be a mix of unique (or primary) keys and regular columns. > In such cases presence of regular column won't alter cardinality of groups. > So, if regular columns are not referenced later, they can be dropped from > group by keys. Depending on operator tree may result in those columns not > being read at all from disk in best case. In worst case, we will avoid > shuffling and sorting regular columns from mapper to reducer, which
[jira] [Commented] (HIVE-17043) Remove non unique columns from group by keys if not referenced later
[ https://issues.apache.org/jira/browse/HIVE-17043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16625288#comment-16625288 ] Hive QA commented on HIVE-17043: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12940852/HIVE-17043.5.patch {color:red}ERROR:{color} -1 due to build exiting with an error Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/14008/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/14008/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-14008/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Tests exited with: Exception: Patch URL https://issues.apache.org/jira/secure/attachment/12940852/HIVE-17043.5.patch was found in seen patch url's cache and a test was probably run already on it. Aborting... {noformat} This message is automatically generated. ATTACHMENT ID: 12940852 - PreCommit-HIVE-Build > Remove non unique columns from group by keys if not referenced later > > > Key: HIVE-17043 > URL: https://issues.apache.org/jira/browse/HIVE-17043 > Project: Hive > Issue Type: Sub-task > Components: Logical Optimizer >Affects Versions: 3.0.0 >Reporter: Ashutosh Chauhan >Assignee: Vineet Garg >Priority: Major > Attachments: HIVE-17043.1.patch, HIVE-17043.2.patch, > HIVE-17043.3.patch, HIVE-17043.4.patch, HIVE-17043.5.patch > > > Group by keys may be a mix of unique (or primary) keys and regular columns. > In such cases presence of regular column won't alter cardinality of groups. > So, if regular columns are not referenced later, they can be dropped from > group by keys. Depending on operator tree may result in those columns not > being read at all from disk in best case. In worst case, we will avoid > shuffling and sorting regular columns from mapper to reducer, which still > could be substantial CPU and network savings. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-17043) Remove non unique columns from group by keys if not referenced later
[ https://issues.apache.org/jira/browse/HIVE-17043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16624958#comment-16624958 ] Hive QA commented on HIVE-17043: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12940852/HIVE-17043.5.patch {color:red}ERROR:{color} -1 due to build exiting with an error Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/13991/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/13991/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-13991/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Tests exited with: Exception: Patch URL https://issues.apache.org/jira/secure/attachment/12940852/HIVE-17043.5.patch was found in seen patch url's cache and a test was probably run already on it. Aborting... {noformat} This message is automatically generated. ATTACHMENT ID: 12940852 - PreCommit-HIVE-Build > Remove non unique columns from group by keys if not referenced later > > > Key: HIVE-17043 > URL: https://issues.apache.org/jira/browse/HIVE-17043 > Project: Hive > Issue Type: Sub-task > Components: Logical Optimizer >Affects Versions: 3.0.0 >Reporter: Ashutosh Chauhan >Assignee: Vineet Garg >Priority: Major > Attachments: HIVE-17043.1.patch, HIVE-17043.2.patch, > HIVE-17043.3.patch, HIVE-17043.4.patch, HIVE-17043.5.patch > > > Group by keys may be a mix of unique (or primary) keys and regular columns. > In such cases presence of regular column won't alter cardinality of groups. > So, if regular columns are not referenced later, they can be dropped from > group by keys. Depending on operator tree may result in those columns not > being read at all from disk in best case. In worst case, we will avoid > shuffling and sorting regular columns from mapper to reducer, which still > could be substantial CPU and network savings. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-17043) Remove non unique columns from group by keys if not referenced later
[ https://issues.apache.org/jira/browse/HIVE-17043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16624505#comment-16624505 ] Hive QA commented on HIVE-17043: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12940852/HIVE-17043.5.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 41 failed/errored test(s), 14994 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[annotate_stats_join_pkfk] (batchId=15) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucket_map_join_spark4] (batchId=1) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[groupby_join_pushdown] (batchId=85) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[join_vc] (batchId=4) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[runtime_skewjoin_mapjoin_spark] (batchId=58) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[dynamic_semijoin_user_level] (batchId=154) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dynamic_semijoin_reduction] (batchId=169) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dynamic_semijoin_reduction_4] (batchId=165) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dynamic_semijoin_reduction_sw] (batchId=158) org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_dynamic_partition_pruning_6] (batchId=188) org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_dynamic_partition_pruning_recursive_mapjoin] (batchId=188) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[join_vc] (batchId=110) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[runtime_skewjoin_mapjoin_spark] (batchId=134) org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query22] (batchId=266) org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query24] 
(batchId=266) org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query45] (batchId=266) org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query54] (batchId=266) org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query57] (batchId=266) org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query58] (batchId=266) org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query65] (batchId=266) org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query66] (batchId=266) org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query67] (batchId=266) org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query70] (batchId=266) org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query91] (batchId=266) org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query99] (batchId=266) org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query14] (batchId=264) org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query22] (batchId=264) org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query24] (batchId=264) org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query45] (batchId=264) org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query54] (batchId=264) org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query57] (batchId=264) org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query58] (batchId=264) org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query64] (batchId=264) org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query65] (batchId=264) org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query67] (batchId=264) org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query70] (batchId=264) org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query91] (batchId=264) org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query99] (batchId=264) 
org.apache.hadoop.hive.metastore.TestHiveMetaStoreAlterColumnPar.org.apache.hadoop.hive.metastore.TestHiveMetaStoreAlterColumnPar (batchId=238) org.apache.hadoop.hive.ql.exec.spark.TestSparkSessionTimeout.testMultiSparkSessionTimeout (batchId=245) org.apache.hive.jdbc.TestJdbcWithMiniLlapArrow.testKillQuery (batchId=251) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/13965/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/13965/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-13965/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.YetusPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 41 tests failed
[jira] [Commented] (HIVE-17043) Remove non unique columns from group by keys if not referenced later
[ https://issues.apache.org/jira/browse/HIVE-17043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16624494#comment-16624494 ] Hive QA commented on HIVE-17043: | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 33s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 29s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 3s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 38s{color} | {color:green} master passed {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 3m 54s{color} | {color:blue} ql in master has 2326 extant Findbugs warnings. 
{color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 57s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 10s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 22s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 4s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 4s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 40s{color} | {color:red} ql: The patch generated 14 new + 46 unchanged - 3 fixed = 60 total (was 49) {color} | | {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s{color} | {color:red} The patch has 13 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 7s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 56s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 14s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 23m 38s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Optional Tests | asflicense javac javadoc findbugs checkstyle compile | | uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux | | Build tool | maven | | Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-13965/dev-support/hive-personality.sh | | git revision | master / cdba00c | | Default Java | 1.8.0_111 | | findbugs | v3.0.0 | | checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-13965/yetus/diff-checkstyle-ql.txt | | whitespace | http://104.198.109.242/logs//PreCommit-HIVE-Build-13965/yetus/whitespace-eol.txt | | modules | C: itests ql U: . | | Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-13965/yetus.txt | | Powered by | Apache Yetushttp://yetus.apache.org | This message was automatically generated. > Remove non unique columns from group by keys if not referenced later > > > Key: HIVE-17043 > URL: https://issues.apache.org/jira/browse/HIVE-17043 > Project: Hive > Issue Type: Sub-task > Components: Logical Optimizer >Affects Versions: 3.0.0 >Reporter: Ashutosh Chauhan >Assignee: Vineet Garg >Priority: Major > Attachments: HIVE-17043.1.patch, HIVE-17043.2.patch, > HIVE-17043.3.patch, HIVE-17043.4.patch, HIVE-17043.5.patch > > > Group by keys may be a mix of unique (or primary) keys and regular columns. > In such cases presence of regular column won't alter cardinality of groups. > So, if regular columns are not referenced later, they can be dropped from > group by keys. Depending on operator tree may result in those columns not > being read at all from disk in best case. In worst case, we will avoid > shuffling and sorting regular columns from mapper to reducer, which still > could be
[jira] [Commented] (HIVE-17043) Remove non unique columns from group by keys if not referenced later
[ https://issues.apache.org/jira/browse/HIVE-17043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16624240#comment-16624240 ] Vineet Garg commented on HIVE-17043: [~jcamachorodriguez] I introduced new logic to compute unique keys based on statistics. Now {{RelMdUniqueKeys}} is only used for computing keys based on constraints. > Remove non unique columns from group by keys if not referenced later > > > Key: HIVE-17043 > URL: https://issues.apache.org/jira/browse/HIVE-17043 > Project: Hive > Issue Type: Sub-task > Components: Logical Optimizer >Affects Versions: 3.0.0 >Reporter: Ashutosh Chauhan >Assignee: Vineet Garg >Priority: Major > Attachments: HIVE-17043.1.patch, HIVE-17043.2.patch, > HIVE-17043.3.patch, HIVE-17043.4.patch, HIVE-17043.5.patch > > > Group by keys may be a mix of unique (or primary) keys and regular columns. > In such cases presence of regular column won't alter cardinality of groups. > So, if regular columns are not referenced later, they can be dropped from > group by keys. Depending on operator tree may result in those columns not > being read at all from disk in best case. In worst case, we will avoid > shuffling and sorting regular columns from mapper to reducer, which still > could be substantial CPU and network savings. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-17043) Remove non unique columns from group by keys if not referenced later
[ https://issues.apache.org/jira/browse/HIVE-17043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16624133#comment-16624133 ] Vineet Garg commented on HIVE-17043: [~jcamachorodriguez] I agree it is ugly. The problem with {{RelMdColumnUniqueness}} is that it only tells you whether a given set of columns is unique or not. For this optimization we need to know the set of unique keys (if there are any for a given input). Therefore {{RelMdColumnUniqueness}} wouldn't really work here. Another possible solution I could think of was calling {{getColumnOrigin}} on each group key to track lineage and build the set, then calling {{getTableOrigin}} to get to the base table, from which we can figure out the keys and get rid of the corresponding columns from the group sets. But this will be pretty expensive (calling getColumnOrigin on all the keys and then calling getTableOrigin). I think we should keep RelMdUniqueKeys for determining unique keys based on the constraints; it seems like it is designed for this. We can write (preferably in a later patch) different logic/methods for getRowCount to use (which will be based on stats), since it only overrides project to determine uniqueness based on statistics. Let me know what you think. > Remove non unique columns from group by keys if not referenced later > > > Key: HIVE-17043 > URL: https://issues.apache.org/jira/browse/HIVE-17043 > Project: Hive > Issue Type: Sub-task > Components: Logical Optimizer >Affects Versions: 3.0.0 >Reporter: Ashutosh Chauhan >Assignee: Vineet Garg >Priority: Major > Attachments: HIVE-17043.1.patch, HIVE-17043.2.patch, > HIVE-17043.3.patch, HIVE-17043.4.patch > > > Group by keys may be a mix of unique (or primary) keys and regular columns. > In such cases presence of regular column won't alter cardinality of groups. > So, if regular columns are not referenced later, they can be dropped from > group by keys. 
Depending on operator tree may result in those columns not > being read at all from disk in best case. In worst case, we will avoid > shuffling and sorting regular columns from mapper to reducer, which still > could be substantial CPU and network savings. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
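To make the point in the comment above concrete, the following Java sketch shows what recovering keys from a yes/no uniqueness check could look like: every non-empty subset of the group-by columns has to be probed, which is 2^n - 1 probes for n columns. This is only an illustration under stated assumptions and does not call Calcite. The class `UniquenessProbeSketch`, the method `keysFromYesNoCheck`, and the bitmask encoding of column subsets are hypothetical. The `Predicate` stands in for a check in the style of a column-uniqueness provider.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical illustration, not part of Hive or Calcite.
final class UniquenessProbeSketch {

  /**
   * Probes every non-empty subset of columns 0..numColumns-1 with a yes/no
   * uniqueness check and returns the subsets reported unique, as bitmasks
   * (bit i set means column i is in the subset).
   *
   * Assumes numColumns is small (at most 30) so the mask fits in an int.
   */
  static List<Integer> keysFromYesNoCheck(int numColumns,
                                          Predicate<Integer> areColumnsUnique) {
    List<Integer> found = new ArrayList<>();
    // 2^numColumns - 1 probes: the cost grows exponentially with the key count.
    for (int mask = 1; mask < (1 << numColumns); mask++) {
      if (areColumnsUnique.test(mask)) {
        found.add(mask);
      }
    }
    return found;
  }
}
```

A provider that returns the set of unique keys directly avoids this enumeration, which is consistent with the preference stated in the comment for keeping {{RelMdUniqueKeys}} for constraint-based keys.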
[jira] [Commented] (HIVE-17043) Remove non unique columns from group by keys if not referenced later
[ https://issues.apache.org/jira/browse/HIVE-17043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16622990#comment-16622990 ] Jesus Camacho Rodriguez commented on HIVE-17043: [~vgarg], I had a quick look at the patch. I believe we should not have different behavior for the method in {{RelMdUniqueKeys}}; this may be misleading moving forward. Instead of changing the semantics of the {{RelMdUniqueKeys}} method, whose result is currently inferred from stats and only used to get the row count, we could determine uniqueness using {{RelMdColumnUniqueness}}. If I remember correctly, {{RelMdColumnUniqueness}} is the metadata provider used by other Calcite optimizations when they need to infer, with guarantees, whether a set of columns contains unique values or not, i.e., not from stats that may be imprecise. > Remove non unique columns from group by keys if not referenced later > > > Key: HIVE-17043 > URL: https://issues.apache.org/jira/browse/HIVE-17043 > Project: Hive > Issue Type: Sub-task > Components: Logical Optimizer >Affects Versions: 3.0.0 >Reporter: Ashutosh Chauhan >Assignee: Vineet Garg >Priority: Major > Attachments: HIVE-17043.1.patch, HIVE-17043.2.patch, > HIVE-17043.3.patch, HIVE-17043.4.patch > > > Group by keys may be a mix of unique (or primary) keys and regular columns. > In such cases presence of regular column won't alter cardinality of groups. > So, if regular columns are not referenced later, they can be dropped from > group by keys. Depending on operator tree may result in those columns not > being read at all from disk in best case. In worst case, we will avoid > shuffling and sorting regular columns from mapper to reducer, which still > could be substantial CPU and network savings. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-17043) Remove non unique columns from group by keys if not referenced later
[ https://issues.apache.org/jira/browse/HIVE-17043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16622698#comment-16622698 ] Hive QA commented on HIVE-17043: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12940514/HIVE-17043.3.patch {color:red}ERROR:{color} -1 due to build exiting with an error Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/13938/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/13938/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-13938/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Tests exited with: Exception: Patch URL https://issues.apache.org/jira/secure/attachment/12940514/HIVE-17043.3.patch was found in seen patch url's cache and a test was probably run already on it. Aborting... {noformat} This message is automatically generated. ATTACHMENT ID: 12940514 - PreCommit-HIVE-Build > Remove non unique columns from group by keys if not referenced later > > > Key: HIVE-17043 > URL: https://issues.apache.org/jira/browse/HIVE-17043 > Project: Hive > Issue Type: Sub-task > Components: Logical Optimizer >Affects Versions: 3.0.0 >Reporter: Ashutosh Chauhan >Assignee: Vineet Garg >Priority: Major > Attachments: HIVE-17043.1.patch, HIVE-17043.2.patch, > HIVE-17043.3.patch > > > Group by keys may be a mix of unique (or primary) keys and regular columns. > In such cases presence of regular column won't alter cardinality of groups. > So, if regular columns are not referenced later, they can be dropped from > group by keys. Depending on operator tree may result in those columns not > being read at all from disk in best case. In worst case, we will avoid > shuffling and sorting regular columns from mapper to reducer, which still > could be substantial CPU and network savings. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-17043) Remove non unique columns from group by keys if not referenced later
[ https://issues.apache.org/jira/browse/HIVE-17043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16622339#comment-16622339 ] Hive QA commented on HIVE-17043: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12940514/HIVE-17043.3.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 14991 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query78] (batchId=264) org.apache.hadoop.hive.ql.exec.spark.TestSparkSessionTimeout.testMultiSparkSessionTimeout (batchId=245) org.apache.hive.jdbc.TestJdbcWithMiniLlapArrow.testKillQuery (batchId=251) org.apache.hive.service.cli.thrift.TestThriftCLIServiceWithHttp.org.apache.hive.service.cli.thrift.TestThriftCLIServiceWithHttp (batchId=247) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/13933/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/13933/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-13933/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.YetusPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 4 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12940514 - PreCommit-HIVE-Build > Remove non unique columns from group by keys if not referenced later > > > Key: HIVE-17043 > URL: https://issues.apache.org/jira/browse/HIVE-17043 > Project: Hive > Issue Type: Sub-task > Components: Logical Optimizer >Affects Versions: 3.0.0 >Reporter: Ashutosh Chauhan >Assignee: Vineet Garg >Priority: Major > Attachments: HIVE-17043.1.patch, HIVE-17043.2.patch, > HIVE-17043.3.patch > > > Group by keys may be a mix of unique (or primary) keys and regular columns. > In such cases presence of regular column won't alter cardinality of groups. > So, if regular columns are not referenced later, they can be dropped from > group by keys. Depending on operator tree may result in those columns not > being read at all from disk in best case. In worst case, we will avoid > shuffling and sorting regular columns from mapper to reducer, which still > could be substantial CPU and network savings. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-17043) Remove non unique columns from group by keys if not referenced later
[ https://issues.apache.org/jira/browse/HIVE-17043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16622276#comment-16622276 ]

Hive QA commented on HIVE-17043:

-1 overall

|| Vote || Subsystem || Runtime || Comment ||
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
|| || || || master Compile Tests ||
| 0 | mvndep | 0m 22s | Maven dependency ordering for branch |
| +1 | mvninstall | 7m 2s | master passed |
| +1 | compile | 1m 2s | master passed |
| +1 | checkstyle | 0m 39s | master passed |
| 0 | findbugs | 3m 52s | ql in master has 2326 extant Findbugs warnings. |
| +1 | javadoc | 0m 56s | master passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 9s | Maven dependency ordering for patch |
| +1 | mvninstall | 1m 26s | the patch passed |
| +1 | compile | 1m 2s | the patch passed |
| +1 | javac | 1m 2s | the patch passed |
| -1 | checkstyle | 0m 39s | ql: The patch generated 6 new + 46 unchanged - 3 fixed = 52 total (was 49) |
| -1 | whitespace | 0m 0s | The patch has 13 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply |
| +1 | findbugs | 4m 5s | the patch passed |
| +1 | javadoc | 0m 56s | the patch passed |
|| || || || Other Tests ||
| +1 | asflicense | 0m 14s | The patch does not generate ASF License warnings. |
| | | 22m 50s | |

|| Subsystem || Report/Notes ||
| Optional Tests | asflicense javac javadoc findbugs checkstyle compile |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-13933/dev-support/hive-personality.sh |
| git revision | master / 9b376a7 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-13933/yetus/diff-checkstyle-ql.txt |
| whitespace | http://104.198.109.242/logs//PreCommit-HIVE-Build-13933/yetus/whitespace-eol.txt |
| modules | C: itests ql U: . |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-13933/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |

This message was automatically generated.

> Remove non unique columns from group by keys if not referenced later
>
> Key: HIVE-17043
> URL: https://issues.apache.org/jira/browse/HIVE-17043
> Project: Hive
> Issue Type: Sub-task
> Components: Logical Optimizer
> Affects Versions: 3.0.0
> Reporter: Ashutosh Chauhan
> Assignee: Vineet Garg
> Priority: Major
> Attachments: HIVE-17043.1.patch, HIVE-17043.2.patch, HIVE-17043.3.patch
>
> Group by keys may be a mix of unique (or primary) keys and regular columns.
> In such cases presence of regular column won't alter cardinality of groups.
> So, if regular columns are not referenced later, they can be dropped from
> group by keys. Depending on operator tree may result in those columns not
> being read at all from disk in best case. In worst case, we will avoid
> shuffling and sorting regular columns from mapper to reducer, which still
> could be substantial CPU and network savings.

--
[jira] [Commented] (HIVE-17043) Remove non unique columns from group by keys if not referenced later
[ https://issues.apache.org/jira/browse/HIVE-17043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16621318#comment-16621318 ]

Vineet Garg commented on HIVE-17043:

Patch (3) adds NOT NULL filter elimination tests

> Remove non unique columns from group by keys if not referenced later
>
> Key: HIVE-17043
> URL: https://issues.apache.org/jira/browse/HIVE-17043
> Project: Hive
> Issue Type: Sub-task
> Components: Logical Optimizer
> Affects Versions: 3.0.0
> Reporter: Ashutosh Chauhan
> Assignee: Vineet Garg
> Priority: Major
> Attachments: HIVE-17043.1.patch, HIVE-17043.2.patch, HIVE-17043.3.patch
>
> Group by keys may be a mix of unique (or primary) keys and regular columns.
> In such cases presence of regular column won't alter cardinality of groups.
> So, if regular columns are not referenced later, they can be dropped from
> group by keys. Depending on operator tree may result in those columns not
> being read at all from disk in best case. In worst case, we will avoid
> shuffling and sorting regular columns from mapper to reducer, which still
> could be substantial CPU and network savings.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
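The reasoning in the quoted description (a regular column next to a unique key does not change the set of groups) can be illustrated with a small Python sketch. This is not Hive's implementation and does not reflect the patch itself; the table, the column names (id, region, amount) and the group_sum helper are hypothetical and exist only for illustration. It assumes id is a unique key of the table.

```python
from collections import defaultdict


def group_sum(rows, keys, value):
    """Group rows (dicts) by the given key columns and sum `value` per group."""
    totals = defaultdict(int)
    for row in rows:
        group_key = tuple(row[k] for k in keys)
        totals[group_key] += row[value]
    return dict(totals)


# Hypothetical table: `id` is assumed to be a unique key, `region` is a
# regular (non-unique) column, `amount` is the aggregated value.
rows = [
    {"id": 1, "region": "east", "amount": 10},
    {"id": 2, "region": "west", "amount": 20},
    {"id": 3, "region": "east", "amount": 5},
]

# GROUP BY id, region
with_region = group_sum(rows, ["id", "region"], "amount")
# GROUP BY id (regular column dropped from the keys)
without_region = group_sum(rows, ["id"], "amount")

# Because `id` is unique, `region` cannot split any group further:
# both groupings have the same number of groups and the same aggregates.
same_cardinality = len(with_region) == len(without_region)
projected = {key[:1]: total for key, total in with_region.items()}
same_aggregates = projected == without_region
```

The equivalence only holds while nothing downstream reads region from the aggregate output, which matches the "if not referenced later" condition in the issue title.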
[jira] [Commented] (HIVE-17043) Remove non unique columns from group by keys if not referenced later
[ https://issues.apache.org/jira/browse/HIVE-17043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16620536#comment-16620536 ]

Hive QA commented on HIVE-17043:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12940333/HIVE-17043.1.patch

SUCCESS: +1 due to 1 test(s) being added or modified.

ERROR: -1 due to 5 failed/errored test(s), 14978 tests executed

*Failed tests:*
{noformat}
TestMiniDruidCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=194)
	[druidmini_masking.q,druidmini_test1.q,druidkafkamini_basic.q,druidmini_joins.q,druid_timestamptz.q]
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[materialized_view_rewrite_9] (batchId=174)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query78] (batchId=266)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query78] (batchId=264)
org.apache.hive.jdbc.TestJdbcWithMiniLlapArrow.testKillQuery (batchId=251)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/13904/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/13904/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-13904/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12940333 - PreCommit-HIVE-Build

> Remove non unique columns from group by keys if not referenced later
>
> Key: HIVE-17043
> URL: https://issues.apache.org/jira/browse/HIVE-17043
> Project: Hive
> Issue Type: Sub-task
> Components: Logical Optimizer
> Affects Versions: 3.0.0
> Reporter: Ashutosh Chauhan
> Assignee: Vineet Garg
> Priority: Major
> Attachments: HIVE-17043.1.patch
>
> Group by keys may be a mix of unique (or primary) keys and regular columns.
> In such cases presence of regular column won't alter cardinality of groups.
> So, if regular columns are not referenced later, they can be dropped from
> group by keys. Depending on operator tree may result in those columns not
> being read at all from disk in best case. In worst case, we will avoid
> shuffling and sorting regular columns from mapper to reducer, which still
> could be substantial CPU and network savings.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-17043) Remove non unique columns from group by keys if not referenced later
[ https://issues.apache.org/jira/browse/HIVE-17043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16620508#comment-16620508 ]

Hive QA commented on HIVE-17043:

-1 overall

|| Vote || Subsystem || Runtime || Comment ||
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
|| || || || master Compile Tests ||
| 0 | mvndep | 0m 45s | Maven dependency ordering for branch |
| +1 | mvninstall | 8m 2s | master passed |
| +1 | compile | 1m 7s | master passed |
| +1 | checkstyle | 0m 39s | master passed |
| 0 | findbugs | 4m 6s | ql in master has 2326 extant Findbugs warnings. |
| +1 | javadoc | 1m 0s | master passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 11s | Maven dependency ordering for patch |
| +1 | mvninstall | 1m 26s | the patch passed |
| +1 | compile | 1m 6s | the patch passed |
| +1 | javac | 1m 7s | the patch passed |
| -1 | checkstyle | 0m 38s | ql: The patch generated 5 new + 46 unchanged - 3 fixed = 51 total (was 49) |
| -1 | whitespace | 0m 0s | The patch has 13 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply |
| +1 | findbugs | 4m 17s | the patch passed |
| +1 | javadoc | 1m 1s | the patch passed |
|| || || || Other Tests ||
| +1 | asflicense | 0m 14s | The patch does not generate ASF License warnings. |
| | | 25m 2s | |

|| Subsystem || Report/Notes ||
| Optional Tests | asflicense javac javadoc findbugs checkstyle compile |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-13904/dev-support/hive-personality.sh |
| git revision | master / 9c90776 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-13904/yetus/diff-checkstyle-ql.txt |
| whitespace | http://104.198.109.242/logs//PreCommit-HIVE-Build-13904/yetus/whitespace-eol.txt |
| modules | C: itests ql U: . |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-13904/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |

This message was automatically generated.

> Remove non unique columns from group by keys if not referenced later
>
> Key: HIVE-17043
> URL: https://issues.apache.org/jira/browse/HIVE-17043
> Project: Hive
> Issue Type: Sub-task
> Components: Logical Optimizer
> Affects Versions: 3.0.0
> Reporter: Ashutosh Chauhan
> Assignee: Vineet Garg
> Priority: Major
> Attachments: HIVE-17043.1.patch
>
> Group by keys may be a mix of unique (or primary) keys and regular columns.
> In such cases presence of regular column won't alter cardinality of groups.
> So, if regular columns are not referenced later, they can be dropped from
> group by keys. Depending on operator tree may result in those columns not
> being read at all from disk in best case. In worst case, we will avoid
> shuffling and sorting regular columns from mapper to reducer, which still
> could be substantial CPU and network savings.

--
This message was sent by Atlassian JIRA