[jira] [Commented] (HIVE-17043) Remove non unique columns from group by keys if not referenced later

2018-10-14 Thread Vineet Garg (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16649688#comment-16649688
 ] 

Vineet Garg commented on HIVE-17043:


Pushed to master

> Remove non unique columns from group by keys if not referenced later
> 
>
> Key: HIVE-17043
> URL: https://issues.apache.org/jira/browse/HIVE-17043
> Project: Hive
> Issue Type: Sub-task
> Components: Logical Optimizer
> Affects Versions: 3.0.0
> Reporter: Ashutosh Chauhan
> Assignee: Vineet Garg
> Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-17043.1.patch, HIVE-17043.10.patch, 
> HIVE-17043.11.patch, HIVE-17043.12.patch, HIVE-17043.13.patch, 
> HIVE-17043.14.patch, HIVE-17043.15.patch, HIVE-17043.16.patch, 
> HIVE-17043.2.patch, HIVE-17043.3.patch, HIVE-17043.4.patch, 
> HIVE-17043.5.patch, HIVE-17043.6.patch, HIVE-17043.7.patch, 
> HIVE-17043.8.patch, HIVE-17043.9.patch
>
>
> Group by keys may be a mix of unique (or primary) keys and regular columns. 
> In such cases, the presence of regular columns won't alter the cardinality of 
> the groups. So, if the regular columns are not referenced later, they can be 
> dropped from the group by keys. Depending on the operator tree, this may 
> result in those columns not being read from disk at all in the best case. In 
> the worst case, we still avoid shuffling and sorting the regular columns from 
> mapper to reducer, which could be substantial CPU and network savings.
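
The equivalence behind this rewrite can be illustrated with a minimal sketch. This is not Hive code and not the patch itself: it uses Python's built-in sqlite3 module and a hypothetical customer/orders schema (every table and column name here is an assumption) only to show that, once a primary key is among the group by keys, adding regular columns of the same table does not change the groups.

```python
import sqlite3

# Hypothetical schema for illustration only: customer.id is a primary key,
# while name and region are regular (non-unique) columns.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, region TEXT);
    CREATE TABLE orders (customer_id INTEGER, amount INTEGER);
    INSERT INTO customer VALUES (1, 'ann', 'east'), (2, 'bob', 'west'), (3, 'ann', 'east');
    INSERT INTO orders VALUES (1, 10), (1, 5), (2, 7), (3, 1);
""")

# Original form: the unique key plus regular columns as group by keys.
# The regular columns are not referenced in the select list.
with_extra_keys = conn.execute("""
    SELECT c.id, SUM(o.amount)
    FROM customer c JOIN orders o ON o.customer_id = c.id
    GROUP BY c.id, c.name, c.region
    ORDER BY c.id
""").fetchall()

# Rewritten form: only the unique key remains in the group by keys.
pk_only = conn.execute("""
    SELECT c.id, SUM(o.amount)
    FROM customer c JOIN orders o ON o.customer_id = c.id
    GROUP BY c.id
    ORDER BY c.id
""").fetchall()

# Both forms produce the same groups and aggregates.
assert with_extra_keys == pk_only
print(pk_only)  # [(1, 15), (2, 7), (3, 1)]
```

Because name and region are functionally determined by id, they never split a group; a query that does select them later would still need them, which is why the description restricts the rewrite to columns that are not referenced afterwards.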



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-17043) Remove non unique columns from group by keys if not referenced later

2018-10-14 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16649654#comment-16649654
 ] 

Hive QA commented on HIVE-17043:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12943846/HIVE-17043.16.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 15079 tests passed

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/14460/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/14460/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-14460/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12943846 - PreCommit-HIVE-Build

> Remove non unique columns from group by keys if not referenced later
> 
>
> Key: HIVE-17043
> URL: https://issues.apache.org/jira/browse/HIVE-17043
> Project: Hive
> Issue Type: Sub-task
> Components: Logical Optimizer
> Affects Versions: 3.0.0
> Reporter: Ashutosh Chauhan
> Assignee: Vineet Garg
> Priority: Major
> Attachments: HIVE-17043.1.patch, HIVE-17043.10.patch, 
> HIVE-17043.11.patch, HIVE-17043.12.patch, HIVE-17043.13.patch, 
> HIVE-17043.14.patch, HIVE-17043.15.patch, HIVE-17043.16.patch, 
> HIVE-17043.2.patch, HIVE-17043.3.patch, HIVE-17043.4.patch, 
> HIVE-17043.5.patch, HIVE-17043.6.patch, HIVE-17043.7.patch, 
> HIVE-17043.8.patch, HIVE-17043.9.patch


[jira] [Commented] (HIVE-17043) Remove non unique columns from group by keys if not referenced later

2018-10-14 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16649631#comment-16649631
 ] 

Hive QA commented on HIVE-17043:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 35s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 33s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  2s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 41s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  3m 53s{color} | {color:blue} ql in master has 2318 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 57s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 10s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  2s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 39s{color} | {color:red} ql: The patch generated 2 new + 44 unchanged - 10 fixed = 46 total (was 54) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m  5s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 58s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 13s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 23m 46s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-14460/dev-support/hive-personality.sh |
| git revision | master / 259db56 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-14460/yetus/diff-checkstyle-ql.txt |
| modules | C: itests ql U: . |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-14460/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Remove non unique columns from group by keys if not referenced later
> 
>
> Key: HIVE-17043
> URL: https://issues.apache.org/jira/browse/HIVE-17043
> Project: Hive
> Issue Type: Sub-task
> Components: Logical Optimizer
> Affects Versions: 3.0.0
> Reporter: Ashutosh Chauhan
> Assignee: Vineet Garg
> Priority: Major
> Attachments: HIVE-17043.1.patch, HIVE-17043.10.patch, 
> HIVE-17043.11.patch, HIVE-17043.12.patch, HIVE-17043.13.patch, 
> HIVE-17043.14.patch, HIVE-17043.15.patch, HIVE-17043.16.patch, 
> HIVE-17043.2.patch, HIVE-17043.3.patch, HIVE-17043.4.patch, 
> HIVE-17043.5.patch, HIVE-17043.6.patch, HIVE-17043.7.patch, 
> HIVE-17043.8.patch, HIVE-17043.9.patch

[jira] [Commented] (HIVE-17043) Remove non unique columns from group by keys if not referenced later

2018-10-14 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16649477#comment-16649477
 ] 

Hive QA commented on HIVE-17043:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12943059/HIVE-17043.15.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/14452/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/14452/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-14452/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Tests exited with: Exception: Patch URL https://issues.apache.org/jira/secure/attachment/12943059/HIVE-17043.15.patch was found in seen patch url's cache and a test was probably run already on it. Aborting...
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12943059 - PreCommit-HIVE-Build

> Remove non unique columns from group by keys if not referenced later
> 
>
> Key: HIVE-17043
> URL: https://issues.apache.org/jira/browse/HIVE-17043
> Project: Hive
> Issue Type: Sub-task
> Components: Logical Optimizer
> Affects Versions: 3.0.0
> Reporter: Ashutosh Chauhan
> Assignee: Vineet Garg
> Priority: Major
> Attachments: HIVE-17043.1.patch, HIVE-17043.10.patch, 
> HIVE-17043.11.patch, HIVE-17043.12.patch, HIVE-17043.13.patch, 
> HIVE-17043.14.patch, HIVE-17043.15.patch, HIVE-17043.2.patch, 
> HIVE-17043.3.patch, HIVE-17043.4.patch, HIVE-17043.5.patch, 
> HIVE-17043.6.patch, HIVE-17043.7.patch, HIVE-17043.8.patch, HIVE-17043.9.patch


[jira] [Commented] (HIVE-17043) Remove non unique columns from group by keys if not referenced later

2018-10-12 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16648679#comment-16648679
 ] 

Hive QA commented on HIVE-17043:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12943059/HIVE-17043.15.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/14397/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/14397/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-14397/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Tests exited with: Exception: Patch URL https://issues.apache.org/jira/secure/attachment/12943059/HIVE-17043.15.patch was found in seen patch url's cache and a test was probably run already on it. Aborting...
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12943059 - PreCommit-HIVE-Build

> Remove non unique columns from group by keys if not referenced later
> 
>
> Key: HIVE-17043
> URL: https://issues.apache.org/jira/browse/HIVE-17043
> Project: Hive
> Issue Type: Sub-task
> Components: Logical Optimizer
> Affects Versions: 3.0.0
> Reporter: Ashutosh Chauhan
> Assignee: Vineet Garg
> Priority: Major
> Attachments: HIVE-17043.1.patch, HIVE-17043.10.patch, 
> HIVE-17043.11.patch, HIVE-17043.12.patch, HIVE-17043.13.patch, 
> HIVE-17043.14.patch, HIVE-17043.15.patch, HIVE-17043.2.patch, 
> HIVE-17043.3.patch, HIVE-17043.4.patch, HIVE-17043.5.patch, 
> HIVE-17043.6.patch, HIVE-17043.7.patch, HIVE-17043.8.patch, HIVE-17043.9.patch


[jira] [Commented] (HIVE-17043) Remove non unique columns from group by keys if not referenced later

2018-10-10 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16645850#comment-16645850
 ] 

Hive QA commented on HIVE-17043:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12943059/HIVE-17043.15.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 15072 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.ql.exec.spark.TestSparkSessionTimeout.testMultiSessionSparkSessionTimeout (batchId=246)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/14358/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/14358/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-14358/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12943059 - PreCommit-HIVE-Build

> Remove non unique columns from group by keys if not referenced later
> 
>
> Key: HIVE-17043
> URL: https://issues.apache.org/jira/browse/HIVE-17043
> Project: Hive
> Issue Type: Sub-task
> Components: Logical Optimizer
> Affects Versions: 3.0.0
> Reporter: Ashutosh Chauhan
> Assignee: Vineet Garg
> Priority: Major
> Attachments: HIVE-17043.1.patch, HIVE-17043.10.patch, 
> HIVE-17043.11.patch, HIVE-17043.12.patch, HIVE-17043.13.patch, 
> HIVE-17043.14.patch, HIVE-17043.15.patch, HIVE-17043.2.patch, 
> HIVE-17043.3.patch, HIVE-17043.4.patch, HIVE-17043.5.patch, 
> HIVE-17043.6.patch, HIVE-17043.7.patch, HIVE-17043.8.patch, HIVE-17043.9.patch


[jira] [Commented] (HIVE-17043) Remove non unique columns from group by keys if not referenced later

2018-10-10 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16645803#comment-16645803
 ] 

Hive QA commented on HIVE-17043:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 49s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 43s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  1s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 38s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  3m 59s{color} | {color:blue} ql in master has 2319 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 56s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  9s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  7s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  7s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 38s{color} | {color:red} ql: The patch generated 2 new + 44 unchanged - 10 fixed = 46 total (was 54) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m  9s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 55s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 13s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 25m 13s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-14358/dev-support/hive-personality.sh |
| git revision | master / 64bef36 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-14358/yetus/diff-checkstyle-ql.txt |
| modules | C: itests ql U: . |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-14358/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Remove non unique columns from group by keys if not referenced later
> 
>
> Key: HIVE-17043
> URL: https://issues.apache.org/jira/browse/HIVE-17043
> Project: Hive
>  Issue Type: Sub-task
>  Components: Logical Optimizer
>Affects Versions: 3.0.0
>Reporter: Ashutosh Chauhan
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-17043.1.patch, HIVE-17043.10.patch, 
> HIVE-17043.11.patch, HIVE-17043.12.patch, HIVE-17043.13.patch, 
> HIVE-17043.14.patch, HIVE-17043.15.patch, HIVE-17043.2.patch, 
> HIVE-17043.3.patch, HIVE-17043.4.patch, HIVE-17043.5.patch, 
> HIVE-17043.6.patch, HIVE-17043.7.patch, HIVE-17043.8.patch, HIVE-17043.9.patch

[jira] [Commented] (HIVE-17043) Remove non unique columns from group by keys if not referenced later

2018-10-07 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16641364#comment-16641364
 ] 

Hive QA commented on HIVE-17043:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12942748/HIVE-17043.14.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/14305/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/14305/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-14305/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Tests exited with: Exception: Patch URL https://issues.apache.org/jira/secure/attachment/12942748/HIVE-17043.14.patch was found in seen patch url's cache and a test was probably run already on it. Aborting...
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12942748 - PreCommit-HIVE-Build

> Remove non unique columns from group by keys if not referenced later
> 
>
> Key: HIVE-17043
> URL: https://issues.apache.org/jira/browse/HIVE-17043
> Project: Hive
> Issue Type: Sub-task
> Components: Logical Optimizer
> Affects Versions: 3.0.0
> Reporter: Ashutosh Chauhan
> Assignee: Vineet Garg
> Priority: Major
> Attachments: HIVE-17043.1.patch, HIVE-17043.10.patch, 
> HIVE-17043.11.patch, HIVE-17043.12.patch, HIVE-17043.13.patch, 
> HIVE-17043.14.patch, HIVE-17043.2.patch, HIVE-17043.3.patch, 
> HIVE-17043.4.patch, HIVE-17043.5.patch, HIVE-17043.6.patch, 
> HIVE-17043.7.patch, HIVE-17043.8.patch, HIVE-17043.9.patch


[jira] [Commented] (HIVE-17043) Remove non unique columns from group by keys if not referenced later

2018-10-07 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16641332#comment-16641332
 ] 

Hive QA commented on HIVE-17043:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12942748/HIVE-17043.14.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 15035 tests executed
*Failed tests:*
{noformat}
TestMiniLlapLocalCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=181)
[vector_case_when_1.q,tez_nway_join.q,escape2.q,bucket_map_join_tez1.q,insert_update_delete.q,schema_evol_orc_nonvec_part_all_primitive_llap_io.q,cte_1.q,autoColumnStats_2.q,schema_evol_orc_acid_part_llap_io.q,semijoin6.q,reopt_semijoin.q,materialized_view_rebuild.q,vectorization_0.q,orc_merge8.q,orc_merge_incompat2.q,vector_outer_join4.q,materialized_view_partitioned.q,orc_merge7.q,bucketpruning1.q,schema_evol_orc_acidvec_table.q,vector_grouping_sets.q,vector_outer_join5.q,schema_evol_orc_acidvec_part_update_llap_io.q,groupby_groupingset_bug.q,bucketmapjoin1.q,vector_udf_inline.q,load_dyn_part1.q,results_cache_temptable.q,orc_merge_incompat_writer_version.q,udf_coalesce.q]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[stat_estimate_related_col] (batchId=43)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/14303/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/14303/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-14303/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12942748 - PreCommit-HIVE-Build

> Remove non unique columns from group by keys if not referenced later
> 
>
> Key: HIVE-17043
> URL: https://issues.apache.org/jira/browse/HIVE-17043
> Project: Hive
> Issue Type: Sub-task
> Components: Logical Optimizer
> Affects Versions: 3.0.0
> Reporter: Ashutosh Chauhan
> Assignee: Vineet Garg
> Priority: Major
> Attachments: HIVE-17043.1.patch, HIVE-17043.10.patch, 
> HIVE-17043.11.patch, HIVE-17043.12.patch, HIVE-17043.13.patch, 
> HIVE-17043.14.patch, HIVE-17043.2.patch, HIVE-17043.3.patch, 
> HIVE-17043.4.patch, HIVE-17043.5.patch, HIVE-17043.6.patch, 
> HIVE-17043.7.patch, HIVE-17043.8.patch, HIVE-17043.9.patch


[jira] [Commented] (HIVE-17043) Remove non unique columns from group by keys if not referenced later

2018-10-07 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16641316#comment-16641316
 ] 

Hive QA commented on HIVE-17043:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 44s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 43s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  4s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 40s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m  2s{color} | {color:blue} ql in master has 2320 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 58s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 11s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  8s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  8s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 42s{color} | {color:red} ql: The patch generated 2 new + 44 unchanged - 10 fixed = 46 total (was 54) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 58s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 14s{color} | {color:red} The patch generated 1 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 24m 44s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-14303/dev-support/hive-personality.sh |
| git revision | master / 61a027a |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-14303/yetus/diff-checkstyle-ql.txt |
| asflicense | http://104.198.109.242/logs//PreCommit-HIVE-Build-14303/yetus/patch-asflicense-problems.txt |
| modules | C: itests ql U: . |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-14303/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Remove non unique columns from group by keys if not referenced later
> 
>
> Key: HIVE-17043
> URL: https://issues.apache.org/jira/browse/HIVE-17043
> Project: Hive
> Issue Type: Sub-task
> Components: Logical Optimizer
> Affects Versions: 3.0.0
> Reporter: Ashutosh Chauhan
> Assignee: Vineet Garg
> Priority: Major
> Attachments: HIVE-17043.1.patch, HIVE-17043.10.patch, 
> HIVE-17043.11.patch, HIVE-17043.12.patch, HIVE-17043.13.patch, 
> HIVE-17043.14.patch, HIVE-17043.2.patch, HIVE-17043.3.patch, 
> HIVE-17043.4.patch, HIVE-17043.5.patch, HIVE-17043.6.patch, 
> HIVE-17043.7.patch, HIVE-17043.8.patch, HIVE-17043.9.patch

[jira] [Commented] (HIVE-17043) Remove non unique columns from group by keys if not referenced later

2018-10-06 Thread Jesus Camacho Rodriguez (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16640968#comment-16640968
 ] 

Jesus Camacho Rodriguez commented on HIVE-17043:


[~vgarg], the latest patch seems to include unrelated changes to 
{{VectorizedOrcAcidRowBatchReader}}.

> Remove non unique columns from group by keys if not referenced later
> 
>
> Key: HIVE-17043
> URL: https://issues.apache.org/jira/browse/HIVE-17043
> Project: Hive
>  Issue Type: Sub-task
>  Components: Logical Optimizer
>Affects Versions: 3.0.0
>Reporter: Ashutosh Chauhan
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-17043.1.patch, HIVE-17043.10.patch, 
> HIVE-17043.11.patch, HIVE-17043.12.patch, HIVE-17043.13.patch, 
> HIVE-17043.2.patch, HIVE-17043.3.patch, HIVE-17043.4.patch, 
> HIVE-17043.5.patch, HIVE-17043.6.patch, HIVE-17043.7.patch, 
> HIVE-17043.8.patch, HIVE-17043.9.patch
>
>


[jira] [Commented] (HIVE-17043) Remove non unique columns from group by keys if not referenced later

2018-10-06 Thread Vineet Garg (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16640967#comment-16640967
 ] 

Vineet Garg commented on HIVE-17043:


Sorry, that was a typo; thanks for reviewing it. I have uploaded an updated 
patch.

> Remove non unique columns from group by keys if not referenced later
> 
>
> Key: HIVE-17043
> URL: https://issues.apache.org/jira/browse/HIVE-17043
> Project: Hive
>  Issue Type: Sub-task
>  Components: Logical Optimizer
>Affects Versions: 3.0.0
>Reporter: Ashutosh Chauhan
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-17043.1.patch, HIVE-17043.10.patch, 
> HIVE-17043.11.patch, HIVE-17043.12.patch, HIVE-17043.13.patch, 
> HIVE-17043.2.patch, HIVE-17043.3.patch, HIVE-17043.4.patch, 
> HIVE-17043.5.patch, HIVE-17043.6.patch, HIVE-17043.7.patch, 
> HIVE-17043.8.patch, HIVE-17043.9.patch
>
>


[jira] [Commented] (HIVE-17043) Remove non unique columns from group by keys if not referenced later

2018-10-06 Thread Jesus Camacho Rodriguez (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16640964#comment-16640964
 ] 

Jesus Camacho Rodriguez commented on HIVE-17043:


In the latest patch, there are two calls to _generateKeys();_. Please remove the 
second one, at L140. Once that is done, the patch LGTM.

+1 (pending tests)

> Remove non unique columns from group by keys if not referenced later
> 
>
> Key: HIVE-17043
> URL: https://issues.apache.org/jira/browse/HIVE-17043
> Project: Hive
>  Issue Type: Sub-task
>  Components: Logical Optimizer
>Affects Versions: 3.0.0
>Reporter: Ashutosh Chauhan
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-17043.1.patch, HIVE-17043.10.patch, 
> HIVE-17043.11.patch, HIVE-17043.12.patch, HIVE-17043.2.patch, 
> HIVE-17043.3.patch, HIVE-17043.4.patch, HIVE-17043.5.patch, 
> HIVE-17043.6.patch, HIVE-17043.7.patch, HIVE-17043.8.patch, HIVE-17043.9.patch
>
>


[jira] [Commented] (HIVE-17043) Remove non unique columns from group by keys if not referenced later

2018-10-06 Thread Jesus Camacho Rodriguez (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16640892#comment-16640892
 ] 

Jesus Camacho Rodriguez commented on HIVE-17043:


[~vgarg], I left three minor comments in RB; could you take a look? Other than 
that, I think the patch LGTM.

> Remove non unique columns from group by keys if not referenced later
> 
>
> Key: HIVE-17043
> URL: https://issues.apache.org/jira/browse/HIVE-17043
> Project: Hive
>  Issue Type: Sub-task
>  Components: Logical Optimizer
>Affects Versions: 3.0.0
>Reporter: Ashutosh Chauhan
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-17043.1.patch, HIVE-17043.10.patch, 
> HIVE-17043.11.patch, HIVE-17043.2.patch, HIVE-17043.3.patch, 
> HIVE-17043.4.patch, HIVE-17043.5.patch, HIVE-17043.6.patch, 
> HIVE-17043.7.patch, HIVE-17043.8.patch, HIVE-17043.9.patch
>
>


[jira] [Commented] (HIVE-17043) Remove non unique columns from group by keys if not referenced later

2018-10-06 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16640672#comment-16640672
 ] 

Hive QA commented on HIVE-17043:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12942591/HIVE-17043.10.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 15022 tests 
executed
*Failed tests:*
{noformat}
TestMiniDruidCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=194)
[druidmini_dynamic_partition.q,druidmini_test_ts.q,druidmini_expressions.q,druidmini_test_alter.q,druidmini_test_insert.q]
org.apache.hadoop.hive.ql.exec.spark.TestSparkSessionTimeout.testMultiSparkSessionTimeout (batchId=246)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/14271/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/14271/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-14271/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12942591 - PreCommit-HIVE-Build

> Remove non unique columns from group by keys if not referenced later
> 
>
> Key: HIVE-17043
> URL: https://issues.apache.org/jira/browse/HIVE-17043
> Project: Hive
>  Issue Type: Sub-task
>  Components: Logical Optimizer
>Affects Versions: 3.0.0
>Reporter: Ashutosh Chauhan
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-17043.1.patch, HIVE-17043.10.patch, 
> HIVE-17043.2.patch, HIVE-17043.3.patch, HIVE-17043.4.patch, 
> HIVE-17043.5.patch, HIVE-17043.6.patch, HIVE-17043.7.patch, 
> HIVE-17043.8.patch, HIVE-17043.9.patch
>
>


[jira] [Commented] (HIVE-17043) Remove non unique columns from group by keys if not referenced later

2018-10-06 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16640649#comment-16640649
 ] 

Hive QA commented on HIVE-17043:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 44s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 25s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  4s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 37s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  3m 56s{color} | {color:blue} ql in master has 2320 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 56s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  9s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  3s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  3s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 38s{color} | {color:red} ql: The patch generated 12 new + 51 unchanged - 3 fixed = 63 total (was 54) {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  0s{color} | {color:red} The patch has 13 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 57s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 13s{color} | {color:red} The patch generated 1 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 23m 53s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-14271/dev-support/hive-personality.sh |
| git revision | master / 78e45be |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-14271/yetus/diff-checkstyle-ql.txt |
| whitespace | http://104.198.109.242/logs//PreCommit-HIVE-Build-14271/yetus/whitespace-eol.txt |
| asflicense | http://104.198.109.242/logs//PreCommit-HIVE-Build-14271/yetus/patch-asflicense-problems.txt |
| modules | C: itests ql U: . |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-14271/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Remove non unique columns from group by keys if not referenced later
> 
>
> Key: HIVE-17043
> URL: https://issues.apache.org/jira/browse/HIVE-17043
> Project: Hive
>  Issue Type: Sub-task
>  Components: Logical Optimizer
>Affects Versions: 3.0.0
>Reporter: Ashutosh Chauhan
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-17043.1.patch, HIVE-17043.10.patch, 
> HIVE-17043.2.patch, HIVE-17043.3.patch, HIVE-17043.4.patch, 
> HIVE-17043.5.patch, HIVE-17043.6.patch, HIVE-17043.7.patch, 
> HIVE-17043.8.patch, HIVE-17043.9.patch
>
>

[jira] [Commented] (HIVE-17043) Remove non unique columns from group by keys if not referenced later

2018-10-04 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16638538#comment-16638538
 ] 

Hive QA commented on HIVE-17043:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12942306/HIVE-17043.9.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/14230/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/14230/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-14230/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Tests exited with: Exception: Patch URL https://issues.apache.org/jira/secure/attachment/12942306/HIVE-17043.9.patch was found in seen patch url's cache and a test was probably run already on it. Aborting...
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12942306 - PreCommit-HIVE-Build

> Remove non unique columns from group by keys if not referenced later
> 
>
> Key: HIVE-17043
> URL: https://issues.apache.org/jira/browse/HIVE-17043
> Project: Hive
>  Issue Type: Sub-task
>  Components: Logical Optimizer
>Affects Versions: 3.0.0
>Reporter: Ashutosh Chauhan
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-17043.1.patch, HIVE-17043.2.patch, 
> HIVE-17043.3.patch, HIVE-17043.4.patch, HIVE-17043.5.patch, 
> HIVE-17043.6.patch, HIVE-17043.7.patch, HIVE-17043.8.patch, HIVE-17043.9.patch
>
>


[jira] [Commented] (HIVE-17043) Remove non unique columns from group by keys if not referenced later

2018-10-03 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16637445#comment-16637445
 ] 

Hive QA commented on HIVE-17043:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12942306/HIVE-17043.9.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 48 failed/errored test(s), 15011 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[annotate_stats_join_pkfk] (batchId=15)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucket_map_join_spark4] (batchId=1)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[join_vc] (batchId=4)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[runtime_skewjoin_mapjoin_spark] (batchId=59)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[dynamic_semijoin_user_level] (batchId=155)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dynamic_semijoin_reduction] (batchId=171)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dynamic_semijoin_reduction_4] (batchId=166)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dynamic_semijoin_reduction_sw] (batchId=160)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[materialized_view_rewrite_1] (batchId=174)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_dynamic_partition_pruning_6] (batchId=189)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_dynamic_partition_pruning_recursive_mapjoin] (batchId=189)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[join_vc] (batchId=111)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[runtime_skewjoin_mapjoin_spark] (batchId=135)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query17] (batchId=267)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query22] (batchId=267)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query24] (batchId=267)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query25] (batchId=267)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query29] (batchId=267)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query32] (batchId=267)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query45] (batchId=267)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query57] (batchId=267)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query65] (batchId=267)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query66] (batchId=267)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query67] (batchId=267)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query70] (batchId=267)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query72] (batchId=267)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query85] (batchId=267)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query91] (batchId=267)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query92] (batchId=267)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query99] (batchId=267)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query14] (batchId=265)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query17] (batchId=265)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query22] (batchId=265)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query24] (batchId=265)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query25] (batchId=265)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query29] (batchId=265)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query32] (batchId=265)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query45] (batchId=265)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query57] (batchId=265)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query64] (batchId=265)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query65] (batchId=265)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query67] (batchId=265)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query70] (batchId=265)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query72] (batchId=265)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query85] (batchId=265)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query91] (batchId=265)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query92] (batchId=265)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query99] (batchId=265)
{noformat}

Test results: 

[jira] [Commented] (HIVE-17043) Remove non unique columns from group by keys if not referenced later

2018-10-03 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16637389#comment-16637389
 ] 

Hive QA commented on HIVE-17043:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 35s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 28s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  3s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 39s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  3m 54s{color} | {color:blue} ql in master has 2321 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 55s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  9s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 59s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 38s{color} | {color:red} ql: The patch generated 11 new + 51 unchanged - 3 fixed = 62 total (was 54) {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  0s{color} | {color:red} The patch has 13 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m  8s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 56s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 14s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 23m 35s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-14207/dev-support/hive-personality.sh |
| git revision | master / a06a370 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-14207/yetus/diff-checkstyle-ql.txt |
| whitespace | http://104.198.109.242/logs//PreCommit-HIVE-Build-14207/yetus/whitespace-eol.txt |
| modules | C: itests ql U: . |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-14207/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Remove non unique columns from group by keys if not referenced later
> 
>
> Key: HIVE-17043
> URL: https://issues.apache.org/jira/browse/HIVE-17043
> Project: Hive
>  Issue Type: Sub-task
>  Components: Logical Optimizer
>Affects Versions: 3.0.0
>Reporter: Ashutosh Chauhan
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-17043.1.patch, HIVE-17043.2.patch, 
> HIVE-17043.3.patch, HIVE-17043.4.patch, HIVE-17043.5.patch, 
> HIVE-17043.6.patch, HIVE-17043.7.patch, HIVE-17043.8.patch, HIVE-17043.9.patch
>
>

[jira] [Commented] (HIVE-17043) Remove non unique columns from group by keys if not referenced later

2018-09-28 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16632522#comment-16632522
 ] 

Hive QA commented on HIVE-17043:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12941620/HIVE-17043.7.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 15007 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[bucketmapjoin6] (batchId=189)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/14107/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/14107/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-14107/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12941620 - PreCommit-HIVE-Build

> Remove non unique columns from group by keys if not referenced later
> 
>
> Key: HIVE-17043
> URL: https://issues.apache.org/jira/browse/HIVE-17043
> Project: Hive
>  Issue Type: Sub-task
>  Components: Logical Optimizer
>Affects Versions: 3.0.0
>Reporter: Ashutosh Chauhan
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-17043.1.patch, HIVE-17043.2.patch, 
> HIVE-17043.3.patch, HIVE-17043.4.patch, HIVE-17043.5.patch, 
> HIVE-17043.6.patch, HIVE-17043.7.patch
>
>


[jira] [Commented] (HIVE-17043) Remove non unique columns from group by keys if not referenced later

2018-09-28 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16632481#comment-16632481
 ] 

Hive QA commented on HIVE-17043:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 48s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m  5s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  6s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 41s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m 20s{color} | {color:blue} ql in master has 2322 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  1s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 11s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  9s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  9s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 44s{color} | {color:red} ql: The patch generated 16 new + 51 unchanged - 3 fixed = 67 total (was 54) {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  0s{color} | {color:red} The patch has 13 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  4m 24s{color} | {color:red} ql generated 1 new + 2322 unchanged - 0 fixed = 2323 total (was 2322) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 59s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 14s{color} | {color:red} The patch generated 1 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 25m 45s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:ql |
|  |  Possible null pointer dereference of left in org.apache.hadoop.hive.ql.optimizer.calcite.stats.EstimateUniqueKeys.getUniqueKeys(HiveJoin, boolean)  Dereferenced at EstimateUniqueKeys.java:left in org.apache.hadoop.hive.ql.optimizer.calcite.stats.EstimateUniqueKeys.getUniqueKeys(HiveJoin, boolean)  Dereferenced at EstimateUniqueKeys.java:[line 210] |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-14107/dev-support/hive-personality.sh |
| git revision | master / 6e27a53 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-14107/yetus/diff-checkstyle-ql.txt |
| whitespace | http://104.198.109.242/logs//PreCommit-HIVE-Build-14107/yetus/whitespace-eol.txt |
| findbugs | http://104.198.109.242/logs//PreCommit-HIVE-Build-14107/yetus/new-findbugs-ql.html |
| asflicense | http://104.198.109.242/logs//PreCommit-HIVE-Build-14107/yetus/patch-asflicense-problems.txt |
| modules | C: itests ql U: . |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-14107/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.




[jira] [Commented] (HIVE-17043) Remove non unique columns from group by keys if not referenced later

2018-09-25 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16626919#comment-16626919
 ] 

Hive QA commented on HIVE-17043:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12941108/HIVE-17043.6.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 14997 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[constraints_optimization] (batchId=170)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query45] (batchId=266)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query45] (batchId=264)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query64] (batchId=264)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/14033/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/14033/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-14033/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12941108 - PreCommit-HIVE-Build



[jira] [Commented] (HIVE-17043) Remove non unique columns from group by keys if not referenced later

2018-09-25 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16626863#comment-16626863
 ] 

Hive QA commented on HIVE-17043:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 30s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 42s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  0s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 40s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m  2s{color} | {color:blue} ql in master has 2326 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  2s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 10s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  3s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  3s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 39s{color} | {color:red} ql: The patch generated 16 new + 51 unchanged - 3 fixed = 67 total (was 54) {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  0s{color} | {color:red} The patch has 13 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 59s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 14s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 24m 12s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-14033/dev-support/hive-personality.sh |
| git revision | master / e161b01 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-14033/yetus/diff-checkstyle-ql.txt |
| whitespace | http://104.198.109.242/logs//PreCommit-HIVE-Build-14033/yetus/whitespace-eol.txt |
| modules | C: itests ql U: . |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-14033/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.




[jira] [Commented] (HIVE-17043) Remove non unique columns from group by keys if not referenced later

2018-09-23 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16625288#comment-16625288
 ] 

Hive QA commented on HIVE-17043:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12940852/HIVE-17043.5.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/14008/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/14008/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-14008/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Tests exited with: Exception: Patch URL 
https://issues.apache.org/jira/secure/attachment/12940852/HIVE-17043.5.patch 
was found in seen patch url's cache and a test was probably run already on it. 
Aborting...
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12940852 - PreCommit-HIVE-Build



[jira] [Commented] (HIVE-17043) Remove non unique columns from group by keys if not referenced later

2018-09-22 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16624958#comment-16624958
 ] 

Hive QA commented on HIVE-17043:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12940852/HIVE-17043.5.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/13991/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/13991/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-13991/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Tests exited with: Exception: Patch URL 
https://issues.apache.org/jira/secure/attachment/12940852/HIVE-17043.5.patch 
was found in seen patch url's cache and a test was probably run already on it. 
Aborting...
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12940852 - PreCommit-HIVE-Build



[jira] [Commented] (HIVE-17043) Remove non unique columns from group by keys if not referenced later

2018-09-21 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16624505#comment-16624505
 ] 

Hive QA commented on HIVE-17043:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12940852/HIVE-17043.5.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 41 failed/errored test(s), 14994 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[annotate_stats_join_pkfk] (batchId=15)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucket_map_join_spark4] (batchId=1)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[groupby_join_pushdown] (batchId=85)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[join_vc] (batchId=4)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[runtime_skewjoin_mapjoin_spark] (batchId=58)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[dynamic_semijoin_user_level] (batchId=154)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dynamic_semijoin_reduction] (batchId=169)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dynamic_semijoin_reduction_4] (batchId=165)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dynamic_semijoin_reduction_sw] (batchId=158)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_dynamic_partition_pruning_6] (batchId=188)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_dynamic_partition_pruning_recursive_mapjoin] (batchId=188)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[join_vc] (batchId=110)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[runtime_skewjoin_mapjoin_spark] (batchId=134)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query22] (batchId=266)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query24] (batchId=266)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query45] (batchId=266)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query54] (batchId=266)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query57] (batchId=266)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query58] (batchId=266)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query65] (batchId=266)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query66] (batchId=266)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query67] (batchId=266)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query70] (batchId=266)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query91] (batchId=266)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query99] (batchId=266)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query14] (batchId=264)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query22] (batchId=264)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query24] (batchId=264)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query45] (batchId=264)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query54] (batchId=264)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query57] (batchId=264)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query58] (batchId=264)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query64] (batchId=264)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query65] (batchId=264)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query67] (batchId=264)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query70] (batchId=264)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query91] (batchId=264)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query99] (batchId=264)
org.apache.hadoop.hive.metastore.TestHiveMetaStoreAlterColumnPar.org.apache.hadoop.hive.metastore.TestHiveMetaStoreAlterColumnPar (batchId=238)
org.apache.hadoop.hive.ql.exec.spark.TestSparkSessionTimeout.testMultiSparkSessionTimeout (batchId=245)
org.apache.hive.jdbc.TestJdbcWithMiniLlapArrow.testKillQuery (batchId=251)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/13965/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/13965/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-13965/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 41 tests failed

[jira] [Commented] (HIVE-17043) Remove non unique columns from group by keys if not referenced later

2018-09-21 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16624494#comment-16624494
 ] 

Hive QA commented on HIVE-17043:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 33s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 29s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  3s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 38s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  3m 54s{color} | {color:blue} ql in master has 2326 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 57s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 10s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  4s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  4s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 40s{color} | {color:red} ql: The patch generated 14 new + 46 unchanged - 3 fixed = 60 total (was 49) {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  0s{color} | {color:red} The patch has 13 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m  7s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 56s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 14s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 23m 38s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-13965/dev-support/hive-personality.sh |
| git revision | master / cdba00c |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-13965/yetus/diff-checkstyle-ql.txt |
| whitespace | http://104.198.109.242/logs//PreCommit-HIVE-Build-13965/yetus/whitespace-eol.txt |
| modules | C: itests ql U: . |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-13965/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.




[jira] [Commented] (HIVE-17043) Remove non unique columns from group by keys if not referenced later

2018-09-21 Thread Vineet Garg (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16624240#comment-16624240
 ] 

Vineet Garg commented on HIVE-17043:


[~jcamachorodriguez] I introduced new logic to compute unique keys based on 
statistics. Now {{RelMdUniqueKeys}} is only used for computing keys based on 
constraints.
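
For readers following the thread, here is a minimal sketch of what "unique keys based on statistics" could look like. It is not the code from the patch: the {{ColumnStats}} class, the function name, and the rule "distinct-value count is at least the row count" are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ColumnStats:
    """Hypothetical per-column statistics; not Hive's actual stats classes."""
    ndv: int  # estimated number of distinct values in the column


def estimate_unique_columns(row_count: int, stats: Dict[str, ColumnStats]) -> List[str]:
    """Return the columns that look unique according to statistics alone.

    Assumed rule: a column is treated as unique when its estimated number of
    distinct values is at least the row count. This is an estimate, not a
    guarantee, because statistics can be stale or approximate.
    """
    if row_count <= 0:
        return []
    return [name for name, col in sorted(stats.items()) if col.ndv >= row_count]
```

With four rows, a column whose NDV is 4 is reported and a column whose NDV is 3 is not. Keys declared through constraints would be handled separately, since those hold regardless of what the statistics say.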



[jira] [Commented] (HIVE-17043) Remove non unique columns from group by keys if not referenced later

2018-09-21 Thread Vineet Garg (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16624133#comment-16624133
 ] 

Vineet Garg commented on HIVE-17043:


[~jcamachorodriguez] I agree it is ugly. The problem with 
{{RelMdColumnUniqueness}} is that it only tells you whether a given set of 
columns is unique. For this optimization we need to know the actual set of 
unique keys (if there are any for a given input), so 
{{RelMdColumnUniqueness}} wouldn't really work here.

Another possible solution I could think of is to call {{getColumnOrigin}} on 
each group key to track lineage and build the set, then call 
{{getTableOrigin}} to get to the base table, from which we can figure out the 
keys and get rid of the corresponding columns from the group sets. But this 
will be pretty expensive (calling {{getColumnOrigin}} on all the keys and then 
calling {{getTableOrigin}}).

I think we should keep {{RelMdUniqueKeys}} for determining unique keys based 
on constraints; it seems to be designed for this. We can write (preferably in 
a later patch) separate stats-based logic/methods for {{getRowCount}} to use, 
since it only overrides Project to determine uniqueness based on statistics.

Let me know what you think.
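
To make the distinction above concrete, here is a small sketch of the two kinds of query and of how a known key set drives the pruning. It is plain Python rather than Calcite code, and the function names and the "smallest covered key wins" choice are assumptions made for illustration.

```python
from typing import FrozenSet, Iterable, List, Set


def columns_are_unique(unique_keys: Iterable[FrozenSet[str]], columns: Set[str]) -> bool:
    """Yes/no query: does `columns` contain at least one unique key?"""
    return any(key <= columns for key in unique_keys)


def prune_group_keys(group_keys: List[str],
                     unique_keys: Iterable[FrozenSet[str]],
                     referenced_later: Set[str]) -> List[str]:
    """Drop group-by columns that are neither part of a covered unique key
    nor referenced later in the plan. Needs the keys themselves, not a yes/no."""
    group_set = set(group_keys)
    # Non-empty keys fully contained in the group-by columns, smallest first;
    # ties are broken by column name so the result is deterministic.
    covered = sorted((k for k in unique_keys if k and k <= group_set),
                     key=lambda k: (len(k), sorted(k)))
    if not covered:
        return list(group_keys)  # no key is covered, nothing can be dropped
    key = covered[0]
    return [c for c in group_keys if c in key or c in referenced_later]
```

For example, with group keys {{id, name, city}}, a unique key {{id}}, and only {{city}} referenced later, the pruned keys are {{id, city}}. The yes/no query would only confirm that {{id, name, city}} is unique without saying which columns are safe to drop.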





[jira] [Commented] (HIVE-17043) Remove non unique columns from group by keys if not referenced later

2018-09-20 Thread Jesus Camacho Rodriguez (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16622990#comment-16622990
 ] 

Jesus Camacho Rodriguez commented on HIVE-17043:


[~vgarg], I had a quick look at the patch. I believe we should not have 
different behavior for the method in {{RelMdUniqueKeys}}; this may be 
misleading moving forward.
Instead of changing the semantics of the {{RelMdUniqueKeys}} method, which 
currently is inferred from stats and only used to get the row count, we could 
determine uniqueness using {{RelMdColumnUniqueness}}. If I remember correctly, 
{{RelMdColumnUniqueness}} is the metadata provider used by other Calcite 
optimizations when they need to infer, with guarantees, whether a set of 
columns contains unique values, i.e., not from stats that may be imprecise.
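
A small illustration of why the guarantee matters for this rewrite, using made-up rows and a hypothetical {{group_count}} helper: if statistics wrongly suggest that {{id}} is unique and {{name}} is dropped from the group-by keys, two groups collapse into one and the query result changes.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Row = Dict[str, object]


def group_count(rows: List[Row], keys: List[str]) -> Dict[Tuple, int]:
    """Count rows per group, like SELECT keys, COUNT(*) ... GROUP BY keys."""
    counts: Dict[Tuple, int] = defaultdict(int)
    for row in rows:
        counts[tuple(row[k] for k in keys)] += 1
    return dict(counts)


rows = [
    {"id": 1, "name": "a"},
    {"id": 2, "name": "b"},
    {"id": 2, "name": "c"},  # duplicate id that stale statistics would miss
]

full = group_count(rows, ["id", "name"])  # original group-by keys
pruned = group_count(rows, ["id"])        # keys after wrongly dropping name
```

Here {{full}} has three groups while {{pruned}} has two, so the rewrite is only safe when uniqueness is guaranteed, for example by a declared constraint.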



[jira] [Commented] (HIVE-17043) Remove non unique columns from group by keys if not referenced later

2018-09-20 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16622698#comment-16622698
 ] 

Hive QA commented on HIVE-17043:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12940514/HIVE-17043.3.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/13938/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/13938/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-13938/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Tests exited with: Exception: Patch URL 
https://issues.apache.org/jira/secure/attachment/12940514/HIVE-17043.3.patch 
was found in seen patch url's cache and a test was probably run already on it. 
Aborting...
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12940514 - PreCommit-HIVE-Build



[jira] [Commented] (HIVE-17043) Remove non unique columns from group by keys if not referenced later

2018-09-20 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16622339#comment-16622339
 ] 

Hive QA commented on HIVE-17043:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12940514/HIVE-17043.3.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 14991 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query78] (batchId=264)
org.apache.hadoop.hive.ql.exec.spark.TestSparkSessionTimeout.testMultiSparkSessionTimeout (batchId=245)
org.apache.hive.jdbc.TestJdbcWithMiniLlapArrow.testKillQuery (batchId=251)
org.apache.hive.service.cli.thrift.TestThriftCLIServiceWithHttp.org.apache.hive.service.cli.thrift.TestThriftCLIServiceWithHttp (batchId=247)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/13933/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/13933/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-13933/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12940514 - PreCommit-HIVE-Build

> Remove non unique columns from group by keys if not referenced later
> 
>
> Key: HIVE-17043
> URL: https://issues.apache.org/jira/browse/HIVE-17043
> Project: Hive
> Issue Type: Sub-task
> Components: Logical Optimizer
> Affects Versions: 3.0.0
> Reporter: Ashutosh Chauhan
> Assignee: Vineet Garg
> Priority: Major
> Attachments: HIVE-17043.1.patch, HIVE-17043.2.patch, 
> HIVE-17043.3.patch
>
>
> Group by keys may be a mix of unique (or primary) keys and regular columns. 
> In such cases presence of regular column won't alter cardinality of groups. 
> So, if regular columns are not referenced later, they can be dropped from 
> group by keys. Depending on operator tree may result in those columns not 
> being read at all from disk in best case. In worst case, we will avoid 
> shuffling and sorting regular columns from mapper to reducer, which still 
> could be substantial CPU and network savings.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-17043) Remove non unique columns from group by keys if not referenced later

2018-09-20 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16622276#comment-16622276
 ] 

Hive QA commented on HIVE-17043:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 22s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m  2s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  2s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 39s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  3m 52s{color} | {color:blue} ql in master has 2326 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 56s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  9s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  2s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 39s{color} | {color:red} ql: The patch generated 6 new + 46 unchanged - 3 fixed = 52 total (was 49) {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  0s{color} | {color:red} The patch has 13 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m  5s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 56s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 14s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 22m 50s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-13933/dev-support/hive-personality.sh |
| git revision | master / 9b376a7 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-13933/yetus/diff-checkstyle-ql.txt |
| whitespace | http://104.198.109.242/logs//PreCommit-HIVE-Build-13933/yetus/whitespace-eol.txt |
| modules | C: itests ql U: . |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-13933/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Remove non unique columns from group by keys if not referenced later
> 
>
> Key: HIVE-17043
> URL: https://issues.apache.org/jira/browse/HIVE-17043
> Project: Hive
> Issue Type: Sub-task
> Components: Logical Optimizer
> Affects Versions: 3.0.0
> Reporter: Ashutosh Chauhan
> Assignee: Vineet Garg
> Priority: Major
> Attachments: HIVE-17043.1.patch, HIVE-17043.2.patch, 
> HIVE-17043.3.patch
>
>
> Group by keys may be a mix of unique (or primary) keys and regular columns. 
> In such cases presence of regular column won't alter cardinality of groups. 
> So, if regular columns are not referenced later, they can be dropped from 
> group by keys. Depending on operator tree may result in those columns not 
> being read at all from disk in best case. In worst case, we will avoid 
> shuffling and sorting regular columns from mapper to reducer, which still 
> could be substantial CPU and network savings.



--

[jira] [Commented] (HIVE-17043) Remove non unique columns from group by keys if not referenced later

2018-09-19 Thread Vineet Garg (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16621318#comment-16621318
 ] 

Vineet Garg commented on HIVE-17043:


Patch (3) adds NOT NULL filter elimination tests

> Remove non unique columns from group by keys if not referenced later
> 
>
> Key: HIVE-17043
> URL: https://issues.apache.org/jira/browse/HIVE-17043
> Project: Hive
> Issue Type: Sub-task
> Components: Logical Optimizer
> Affects Versions: 3.0.0
> Reporter: Ashutosh Chauhan
> Assignee: Vineet Garg
> Priority: Major
> Attachments: HIVE-17043.1.patch, HIVE-17043.2.patch, 
> HIVE-17043.3.patch
>
>
> Group by keys may be a mix of unique (or primary) keys and regular columns. 
> In such cases presence of regular column won't alter cardinality of groups. 
> So, if regular columns are not referenced later, they can be dropped from 
> group by keys. Depending on operator tree may result in those columns not 
> being read at all from disk in best case. In worst case, we will avoid 
> shuffling and sorting regular columns from mapper to reducer, which still 
> could be substantial CPU and network savings.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-17043) Remove non unique columns from group by keys if not referenced later

2018-09-19 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16620536#comment-16620536
 ] 

Hive QA commented on HIVE-17043:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12940333/HIVE-17043.1.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 14978 tests executed
*Failed tests:*
{noformat}
TestMiniDruidCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=194)
[druidmini_masking.q,druidmini_test1.q,druidkafkamini_basic.q,druidmini_joins.q,druid_timestamptz.q]
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[materialized_view_rewrite_9] (batchId=174)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query78] (batchId=266)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query78] (batchId=264)
org.apache.hive.jdbc.TestJdbcWithMiniLlapArrow.testKillQuery (batchId=251)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/13904/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/13904/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-13904/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12940333 - PreCommit-HIVE-Build

> Remove non unique columns from group by keys if not referenced later
> 
>
> Key: HIVE-17043
> URL: https://issues.apache.org/jira/browse/HIVE-17043
> Project: Hive
> Issue Type: Sub-task
> Components: Logical Optimizer
> Affects Versions: 3.0.0
> Reporter: Ashutosh Chauhan
> Assignee: Vineet Garg
> Priority: Major
> Attachments: HIVE-17043.1.patch
>
>
> Group by keys may be a mix of unique (or primary) keys and regular columns. 
> In such cases presence of regular column won't alter cardinality of groups. 
> So, if regular columns are not referenced later, they can be dropped from 
> group by keys. Depending on operator tree may result in those columns not 
> being read at all from disk in best case. In worst case, we will avoid 
> shuffling and sorting regular columns from mapper to reducer, which still 
> could be substantial CPU and network savings.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-17043) Remove non unique columns from group by keys if not referenced later

2018-09-19 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16620508#comment-16620508
 ] 

Hive QA commented on HIVE-17043:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 45s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m  2s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  7s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 39s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m  6s{color} | {color:blue} ql in master has 2326 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  0s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 11s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  6s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  7s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 38s{color} | {color:red} ql: The patch generated 5 new + 46 unchanged - 3 fixed = 51 total (was 49) {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  0s{color} | {color:red} The patch has 13 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  1s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 14s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 25m  2s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-13904/dev-support/hive-personality.sh |
| git revision | master / 9c90776 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-13904/yetus/diff-checkstyle-ql.txt |
| whitespace | http://104.198.109.242/logs//PreCommit-HIVE-Build-13904/yetus/whitespace-eol.txt |
| modules | C: itests ql U: . |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-13904/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Remove non unique columns from group by keys if not referenced later
> 
>
> Key: HIVE-17043
> URL: https://issues.apache.org/jira/browse/HIVE-17043
> Project: Hive
> Issue Type: Sub-task
> Components: Logical Optimizer
> Affects Versions: 3.0.0
> Reporter: Ashutosh Chauhan
> Assignee: Vineet Garg
> Priority: Major
> Attachments: HIVE-17043.1.patch
>
>
> Group by keys may be a mix of unique (or primary) keys and regular columns. 
> In such cases presence of regular column won't alter cardinality of groups. 
> So, if regular columns are not referenced later, they can be dropped from 
> group by keys. Depending on operator tree may result in those columns not 
> being read at all from disk in best case. In worst case, we will avoid 
> shuffling and sorting regular columns from mapper to reducer, which still 
> could be substantial CPU and network savings.



--
This message was sent by Atlassian JIRA