[jira] [Commented] (HIVE-22363) ReduceDeduplication may leave an invalid GroupByOperator behind in some cases

2020-02-12 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17035664#comment-17035664
 ] 

Hive QA commented on HIVE-22363:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12993145/HIVE-22363.04.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 14 failed/errored test(s), 17992 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[autoColumnStats_7] (batchId=73)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[groupby10] (batchId=71)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[groupby11] (batchId=83)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[groupby8] (batchId=85)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[groupby2] (batchId=178)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[ptf] (batchId=168)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_windowing] (batchId=190)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vectorized_ptf] (batchId=183)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[groupby10] (batchId=144)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[groupby11] (batchId=149)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[groupby2] (batchId=138)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[groupby8] (batchId=150)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[ptf] (batchId=124)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vectorized_ptf] (batchId=145)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/20579/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/20579/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-20579/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 14 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12993145 - PreCommit-HIVE-Build

> ReduceDeduplication may leave an invalid GroupByOperator behind in some cases
> -
>
> Key: HIVE-22363
> URL: https://issues.apache.org/jira/browse/HIVE-22363
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer
>Affects Versions: 3.1.2
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22363.01.patch, HIVE-22363.02.patch, 
> HIVE-22363.03.patch, HIVE-22363.04.patch, HIVE-22363.04.patch
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Since HIVE-11387, reduce deduplication may traverse {{GroupByOperators}} [as well|https://github.com/apache/hive/blob/c6626edb65c2cd00576647e54db1995628fe64da/ql/src/java/org/apache/hadoop/hive/ql/optimizer/correlation/CorrelationUtilities.java#L244].
> However, the removal logic only removes the first parent, so if some other operator (a FIL in this case) sits between the sink and the GBY, the removal may not happen [here|https://github.com/apache/hive/blob/c6626edb65c2cd00576647e54db1995628fe64da/ql/src/java/org/apache/hadoop/hive/ql/optimizer/correlation/CorrelationUtilities.java#L458].
> {code}
> set hive.cbo.enable=false;
> drop table if exists xl1;
> create table xl1 as
> select '1' as mdl_yr_desc, 2 as seq_no, '3' as opt_desc1, 4 as opt_desc, 1 as row_num;
> explain
> select trim(base.mdl_yr_desc) mdl_yr_desc, trim(base.opt_desc) opt_desc
> from
> (
>   SELECT trim(mdl_yr_desc) mdl_yr_desc, concat_ws(' ', collect_set(trim(opt_desc1))) AS opt_desc
>   from
>   (
>     select t14304.*
>     from
>     (
>       select * from xl1
>     ) t14304
>     where row_num = 1
>     order by trim(mdl_yr_desc), cast(seq_no as int) asc
>   ) x
>   group by trim(mdl_yr_desc)
> ) base
> inner join
> (
>   select 1 as v
> ) dedup
> on trim(base.mdl_yr_desc) != dedup.v
> group by trim(base.mdl_yr_desc), trim(base.opt_desc);
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-22363) ReduceDeduplication may leave an invalid GroupByOperator behind in some cases

2020-02-12 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17035628#comment-17035628
 ] 

Hive QA commented on HIVE-22363:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 10m  4s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  4s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 43s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  3m 57s{color} | {color:blue} ql in master has 1532 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 57s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 56s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 16s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 25m 51s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-20579/dev-support/hive-personality.sh |
| git revision | master / fcfc71b |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.1 |
| modules | C: ql U: ql |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-20579/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.




[jira] [Commented] (HIVE-22363) ReduceDeduplication may leave an invalid GroupByOperator behind in some cases

2019-10-29 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16962340#comment-16962340
 ] 

Hive QA commented on HIVE-22363:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12984263/HIVE-22363.04.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 14 failed/errored test(s), 17546 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[autoColumnStats_7] (batchId=71)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[groupby10] (batchId=69)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[groupby11] (batchId=80)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[groupby8] (batchId=83)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[groupby2] (batchId=175)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[ptf] (batchId=165)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_windowing] (batchId=187)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vectorized_ptf] (batchId=179)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[groupby10] (batchId=141)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[groupby11] (batchId=146)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[groupby2] (batchId=135)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[groupby8] (batchId=147)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[ptf] (batchId=121)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vectorized_ptf] (batchId=142)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/19195/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/19195/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-19195/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 14 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12984263 - PreCommit-HIVE-Build



[jira] [Commented] (HIVE-22363) ReduceDeduplication may leave an invalid GroupByOperator behind in some cases

2019-10-29 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16962311#comment-16962311
 ] 

Hive QA commented on HIVE-22363:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 51s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  9s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 40s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m  5s{color} | {color:blue} ql in master has 1545 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  2s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  5s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 15s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 25m 12s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-19195/dev-support/hive-personality.sh |
| git revision | master / 6d0f461 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.1 |
| modules | C: ql U: ql |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-19195/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.




[jira] [Commented] (HIVE-22363) ReduceDeduplication may leave an invalid GroupByOperator behind in some cases

2019-10-22 Thread Zoltan Haindrich (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16957011#comment-16957011
 ] 

Zoltan Haindrich commented on HIVE-22363:
-

The optimization is meant to detect a pattern like this:
{code}
pRS-pGBY-cRS-cGBY
{code}

The problematic case looked like this:
{code}
pRS-pGBY-FIL-cRS-cGBY
{code}

Note that it also accepted cases like:
{code}
pRS-xGBY-pGBY-cRS-cGBY
{code}
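
The difference between the first two patterns can be illustrated with a toy model. This is only a sketch: operators are plain string labels rather than Hive's operator classes, and `DedupSketch`, `removeDirectParentOnly`, and `hasLeftoverGroupBy` are hypothetical names invented for the illustration. It assumes the removal step drops cRS and checks only the operator directly above it, as the issue description says; it is not the code from the patch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy model only: operators are labels, not Hive's Operator classes. */
public class DedupSketch {

  /**
   * Removes the child reduce sink (cRS) and, if the operator directly
   * above it is a group-by, that group-by as well. Nothing further up
   * the chain is inspected (the assumed "first parent only" behaviour).
   */
  static List<String> removeDirectParentOnly(List<String> chain) {
    List<String> result = new ArrayList<>(chain);
    int cRs = result.indexOf("cRS");
    if (cRs < 0) {
      return result; // no child reduce sink, nothing to deduplicate
    }
    result.remove(cRs);
    int parent = cRs - 1;
    if (parent >= 0 && result.get(parent).endsWith("GBY")) {
      result.remove(parent);
    }
    return result;
  }

  /** True if a group-by still sits between pRS and the final cGBY. */
  static boolean hasLeftoverGroupBy(List<String> chain) {
    int first = chain.indexOf("pRS");
    int last = chain.lastIndexOf("cGBY");
    for (int i = first + 1; i < last; i++) {
      if (chain.get(i).endsWith("GBY")) {
        return true;
      }
    }
    return false;
  }

  public static void main(String[] args) {
    List<String> expected = Arrays.asList("pRS", "pGBY", "cRS", "cGBY");
    List<String> problem = Arrays.asList("pRS", "pGBY", "FIL", "cRS", "cGBY");

    // pGBY is the direct parent of cRS, so both are removed.
    System.out.println(removeDirectParentOnly(expected)); // [pRS, cGBY]

    // FIL hides pGBY from the direct-parent check, so pGBY stays behind.
    List<String> after = removeDirectParentOnly(problem);
    System.out.println(after);                     // [pRS, pGBY, FIL, cGBY]
    System.out.println(hasLeftoverGroupBy(after)); // true
  }
}
```

In this model the leftover pGBY in the second chain corresponds to the invalid GroupByOperator named in the issue title.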



[jira] [Commented] (HIVE-22363) ReduceDeduplication may leave an invalid GroupByOperator behind in some cases

2019-10-21 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16956595#comment-16956595
 ] 

Hive QA commented on HIVE-22363:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12983639/HIVE-22363.03.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 17545 tests passed

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/19096/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/19096/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-19096/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12983639 - PreCommit-HIVE-Build



[jira] [Commented] (HIVE-22363) ReduceDeduplication may leave an invalid GroupByOperator behind in some cases

2019-10-21 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16956575#comment-16956575
 ] 

Hive QA commented on HIVE-22363:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 56s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  8s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 40s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m  5s{color} | {color:blue} ql in master has 1545 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  4s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  8s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  8s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  4m 17s{color} | {color:red} ql generated 1 new + 1545 unchanged - 0 fixed = 1546 total (was 1545) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  0s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 16s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 25m 13s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:ql |
|  |  Dead store to p in org.apache.hadoop.hive.ql.optimizer.correlation.ReduceSinkDeDuplication$ReducerReducerProc.process(ReduceSinkOperator, GroupByOperator, ReduceSinkDeDuplication$ReduceSinkDeduplicateProcCtx)  At ReduceSinkDeDuplication.java:org.apache.hadoop.hive.ql.optimizer.correlation.ReduceSinkDeDuplication$ReducerReducerProc.process(ReduceSinkOperator, GroupByOperator, ReduceSinkDeDuplication$ReduceSinkDeduplicateProcCtx)  At ReduceSinkDeDuplication.java:[line 328] |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-19096/dev-support/hive-personality.sh |
| git revision | master / 40cd40d |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.1 |
| findbugs | http://104.198.109.242/logs//PreCommit-HIVE-Build-19096/yetus/new-findbugs-ql.html |
| modules | C: ql U: ql |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-19096/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.




[jira] [Commented] (HIVE-22363) ReduceDeduplication may leave an invalid GroupByOperator behind in some cases

2019-10-21 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16956146#comment-16956146
 ] 

Hive QA commented on HIVE-22363:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12983588/HIVE-22363.02.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 8 failed/errored test(s), 17545 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[explainuser_1] (batchId=171)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[groupby2] (batchId=175)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_explainuser_1] (batchId=194)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[groupby10] (batchId=141)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[groupby11] (batchId=146)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[groupby2] (batchId=135)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[groupby8] (batchId=147)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[multi_insert_move_tasks_share_dependencies] (batchId=138)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/19087/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/19087/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-19087/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 8 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12983588 - PreCommit-HIVE-Build

> ReduceDeduplication may leave an invalid GroupByOperator behind in some cases
> -
>
> Key: HIVE-22363
> URL: https://issues.apache.org/jira/browse/HIVE-22363
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer
>Affects Versions: 3.1.2
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22363.01.patch, HIVE-22363.02.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Since HIVE-11387, reducededup may traverse {{GroupByOperators}} [as 
> well|https://github.com/apache/hive/blob/c6626edb65c2cd00576647e54db1995628fe64da/ql/src/java/org/apache/hadoop/hive/ql/optimizer/correlation/CorrelationUtilities.java#L244].
> But the removal logic only removes the first parent, so if there is some 
> other operator (a FIL in this case) between the sink and the gby, the 
> removal may not happen 
> [here|https://github.com/apache/hive/blob/c6626edb65c2cd00576647e54db1995628fe64da/ql/src/java/org/apache/hadoop/hive/ql/optimizer/correlation/CorrelationUtilities.java#L458].
> {code}
> set hive.cbo.enable=false;
> drop table if exists xl1;
> create table xl1 as
> select '1' as mdl_yr_desc, 2 as seq_no,'3' as opt_desc1,4 as opt_desc,1 as 
> row_num;
> explain
> select trim(base.mdl_yr_desc) mdl_yr_desc, trim(base.opt_desc) opt_desc
> from
> (
> SELECT trim(mdl_yr_desc) mdl_yr_desc, concat_ws(' ', 
> collect_set(trim(opt_desc1))) AS opt_desc
> from
> (
> select t14304.* 
> from
> (
> select * from xl1
> ) t14304  
> where row_num = 1
> order by trim(mdl_yr_desc), cast(seq_no as int) asc
> ) x
> group by trim(mdl_yr_desc)
> ) base
> inner join
> (
> select 1 as v
> ) dedup
> on  trim(base.mdl_yr_desc) != dedup.v
> group by trim(base.mdl_yr_desc), trim(base.opt_desc) ;
> {code}
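
The failure mode described above can be pictured with a small toy model. The sketch below is only an illustration under stated assumptions: the Op class, the operator kind strings, and the methods removeDirectParent and removeNearestAncestor are hypothetical names invented here, not Hive's Operator or CorrelationUtilities API, and the second method is not claimed to be what the attached patches do. It shows a removal that inspects only the direct parent of the sink missing a GBY when a FIL sits in between.

```java
import java.util.ArrayList;
import java.util.List;

public class ReduceDedupSketch {
    // Toy single-parent operator node (hypothetical; not Hive's Operator class).
    static final class Op {
        final String kind; // e.g. "TS", "GBY", "FIL", "RS"
        Op parent;

        Op(String kind, Op parent) {
            this.kind = kind;
            this.parent = parent;
        }
    }

    // Lists operator kinds from the sink up to the root, for inspection.
    static List<String> chain(Op sink) {
        List<String> kinds = new ArrayList<>();
        for (Op o = sink; o != null; o = o.parent) {
            kinds.add(o.kind);
        }
        return kinds;
    }

    // Naive removal: unlinks the target only if it is the direct parent.
    static boolean removeDirectParent(Op child, String kind) {
        if (child.parent != null && child.parent.kind.equals(kind)) {
            child.parent = child.parent.parent;
            return true;
        }
        return false;
    }

    // Alternative: walk up past intermediate FIL operators first,
    // then unlink the target from the last FIL (or from the sink itself).
    static boolean removeNearestAncestor(Op sink, String kind) {
        Op child = sink;
        while (child.parent != null && child.parent.kind.equals("FIL")) {
            child = child.parent;
        }
        return removeDirectParent(child, kind);
    }

    // Builds the toy plan TS -> GBY -> FIL -> RS and returns the sink (RS).
    static Op plan() {
        return new Op("RS", new Op("FIL", new Op("GBY", new Op("TS", null))));
    }

    public static void main(String[] args) {
        Op a = plan();
        // The FIL hides the GBY from the naive check, so nothing is removed.
        System.out.println(removeDirectParent(a, "GBY") + " " + chain(a));

        Op b = plan();
        // Skipping the FIL first lets the GBY be found and unlinked.
        System.out.println(removeNearestAncestor(b, "GBY") + " " + chain(b));
    }
}
```

In this toy plan the naive call returns false and leaves the GBY in the chain, while the ancestor-walking variant removes it. Whether skipping an intermediate operator is safe in a real plan depends on that operator's semantics, which this sketch does not model.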



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-22363) ReduceDeduplication may leave an invalid GroupByOperator behind in some cases

2019-10-21 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16956080#comment-16956080
 ] 

Hive QA commented on HIVE-22363:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  9m 
33s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
13s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
40s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m 
10s{color} | {color:blue} ql in master has 1547 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
3s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  4m 
18s{color} | {color:red} ql generated 1 new + 1547 unchanged - 0 fixed = 1548 
total (was 1547) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
3s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
14s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 26m  4s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:ql |
|  |  Dead store to p in 
org.apache.hadoop.hive.ql.optimizer.correlation.ReduceSinkDeDuplication$ReducerReducerProc.process(ReduceSinkOperator,
 GroupByOperator, ReduceSinkDeDuplication$ReduceSinkDeduplicateProcCtx)  At 
ReduceSinkDeDuplication.java:org.apache.hadoop.hive.ql.optimizer.correlation.ReduceSinkDeDuplication$ReducerReducerProc.process(ReduceSinkOperator,
 GroupByOperator, ReduceSinkDeDuplication$ReduceSinkDeduplicateProcCtx)  At 
ReduceSinkDeDuplication.java:[line 328] |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-19087/dev-support/hive-personality.sh
 |
| git revision | master / 1866d7d |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.1 |
| findbugs | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-19087/yetus/new-findbugs-ql.html
 |
| modules | C: ql U: ql |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-19087/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.




[jira] [Commented] (HIVE-22363) ReduceDeduplication may leave an invalid GroupByOperator behind in some cases

2019-10-17 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16954171#comment-16954171
 ] 

Hive QA commented on HIVE-22363:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12983290/HIVE-22363.01.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 14 failed/errored test(s), 17541 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[autoColumnStats_7] 
(batchId=71)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[groupby10] (batchId=69)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[groupby11] (batchId=80)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[groupby8] (batchId=83)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[multi_insert_gby2] 
(batchId=43)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[multi_insert_move_tasks_share_dependencies]
 (batchId=61)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[explainuser_1]
 (batchId=171)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[groupby2] 
(batchId=175)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_explainuser_1]
 (batchId=194)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[groupby10] 
(batchId=141)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[groupby11] 
(batchId=146)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[groupby2] 
(batchId=135)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[groupby8] 
(batchId=147)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[multi_insert_move_tasks_share_dependencies]
 (batchId=138)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/19043/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/19043/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-19043/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 14 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12983290 - PreCommit-HIVE-Build

> ReduceDeduplication may leave an invalid GroupByOperator behind in some cases
> -
>
> Key: HIVE-22363
> URL: https://issues.apache.org/jira/browse/HIVE-22363
> Project: Hive
> Issue Type: Bug
> Components: Physical Optimizer
> Affects Versions: 3.1.2
> Reporter: Zoltan Haindrich
> Assignee: Zoltan Haindrich
> Priority: Major
> Labels: pull-request-available
> Attachments: HIVE-22363.01.patch
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> Since HIVE-11387, reducededup may traverse {{GroupByOperators}} [as 
> well|https://github.com/apache/hive/blob/c6626edb65c2cd00576647e54db1995628fe64da/ql/src/java/org/apache/hadoop/hive/ql/optimizer/correlation/CorrelationUtilities.java#L244].
> But the removal logic only removes the first parent, so if there is some 
> other operator (a FIL in this case) between the sink and the gby, the 
> removal may not happen 
> [here|https://github.com/apache/hive/blob/c6626edb65c2cd00576647e54db1995628fe64da/ql/src/java/org/apache/hadoop/hive/ql/optimizer/correlation/CorrelationUtilities.java#L458].
> {code}
> set hive.cbo.enable=false;
> drop table if exists xl1;
> create table xl1 as
> select '1' as mdl_yr_desc, 2 as seq_no,'3' as opt_desc1,4 as opt_desc,1 as 
> row_num;
> explain
> select trim(base.mdl_yr_desc) mdl_yr_desc, trim(base.opt_desc) opt_desc
> from
> (
> SELECT trim(mdl_yr_desc) mdl_yr_desc, concat_ws(' ', 
> collect_set(trim(opt_desc1))) AS opt_desc
> from
> (
> select t14304.* 
> from
> (
> select * from xl1
> ) t14304  
> where row_num = 1
> order by trim(mdl_yr_desc), cast(seq_no as int) asc
> ) x
> group by trim(mdl_yr_desc)
> ) base
> inner join
> (
> select 1 as v
> ) dedup
> on  trim(base.mdl_yr_desc) != dedup.v
> group by trim(base.mdl_yr_desc), trim(base.opt_desc) ;
> {code}





[jira] [Commented] (HIVE-22363) ReduceDeduplication may leave an invalid GroupByOperator behind in some cases

2019-10-17 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16954164#comment-16954164
 ] 

Hive QA commented on HIVE-22363:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  9m 
24s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
11s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
40s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m 
18s{color} | {color:blue} ql in master has 1548 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
4s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  4m 
26s{color} | {color:red} ql generated 1 new + 1548 unchanged - 0 fixed = 1549 
total (was 1548) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
4s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
17s{color} | {color:red} The patch generated 3 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 26m 18s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:ql |
|  |  Dead store to p in 
org.apache.hadoop.hive.ql.optimizer.correlation.ReduceSinkDeDuplication$ReducerReducerProc.process(ReduceSinkOperator,
 GroupByOperator, ReduceSinkDeDuplication$ReduceSinkDeduplicateProcCtx)  At 
ReduceSinkDeDuplication.java:org.apache.hadoop.hive.ql.optimizer.correlation.ReduceSinkDeDuplication$ReducerReducerProc.process(ReduceSinkOperator,
 GroupByOperator, ReduceSinkDeDuplication$ReduceSinkDeduplicateProcCtx)  At 
ReduceSinkDeDuplication.java:[line 328] |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-19043/dev-support/hive-personality.sh
 |
| git revision | master / c6626ed |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.1 |
| findbugs | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-19043/yetus/new-findbugs-ql.html
 |
| asflicense | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-19043/yetus/patch-asflicense-problems.txt
 |
| modules | C: ql U: ql |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-19043/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.




[jira] [Commented] (HIVE-22363) ReduceDeduplication may leave an invalid GroupByOperator behind in some cases

2019-10-17 Thread Zoltan Haindrich (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16953834#comment-16953834
 ] 

Zoltan Haindrich commented on HIVE-22363:
-

Since HIVE-19940 the issue is no longer present on master, because after that 
patch the problematic FIL operator is pushed further down.

> ReduceDeduplication may leave an invalid GroupByOperator behind in some cases
> -
>
> Key: HIVE-22363
> URL: https://issues.apache.org/jira/browse/HIVE-22363
> Project: Hive
> Issue Type: Bug
> Components: Physical Optimizer
> Affects Versions: 3.1.2
> Reporter: Zoltan Haindrich
> Assignee: Zoltan Haindrich
> Priority: Major
>
> Since HIVE-11387, reducededup may traverse {{GroupByOperators}} [as 
> well|https://github.com/apache/hive/blob/c6626edb65c2cd00576647e54db1995628fe64da/ql/src/java/org/apache/hadoop/hive/ql/optimizer/correlation/CorrelationUtilities.java#L244].
> But the removal logic only removes the first parent, so if there is some 
> other operator (a FIL in this case) between the sink and the gby, the 
> removal may not happen 
> [here|https://github.com/apache/hive/blob/c6626edb65c2cd00576647e54db1995628fe64da/ql/src/java/org/apache/hadoop/hive/ql/optimizer/correlation/CorrelationUtilities.java#L458].
> {code}
> set hive.cbo.enable=false;
> drop table if exists xl1;
> create table xl1 as
> select '1' as mdl_yr_desc, 2 as seq_no,'3' as opt_desc1,4 as opt_desc,1 as 
> row_num;
> explain
> select trim(base.mdl_yr_desc) mdl_yr_desc, trim(base.opt_desc) opt_desc
> from
> (
> SELECT trim(mdl_yr_desc) mdl_yr_desc, concat_ws(' ', 
> collect_set(trim(opt_desc1))) AS opt_desc
> from
> (
> select t14304.* 
> from
> (
> select * from xl1
> ) t14304  
> where row_num = 1
> order by trim(mdl_yr_desc), cast(seq_no as int) asc
> ) x
> group by trim(mdl_yr_desc)
> ) base
> inner join
> (
> select 1 as v
> ) dedup
> on  trim(base.mdl_yr_desc) != dedup.v
> group by trim(base.mdl_yr_desc), trim(base.opt_desc) ;
> {code}


