[jira] [Updated] (HIVE-20329) Long running repl load (incr/bootstrap) causing OOM error

2018-08-12 Thread Sankar Hariappan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-20329:

Resolution: Fixed
Status: Resolved  (was: Patch Available)

> Long running repl load (incr/bootstrap) causing OOM error
> -
>
> Key: HIVE-20329
> URL: https://issues.apache.org/jira/browse/HIVE-20329
> Project: Hive
>  Issue Type: Task
>  Components: repl
>Affects Versions: 3.1.0, 4.0.0
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: DR, replication
> Fix For: 4.0.0, 3.2.0
>
> Attachments: HIVE-20329.01.patch
>
>
> The tasks created in previous iterations of the load are not delinked, which 
> causes heap memory usage to keep growing. The tasks need to be delinked to 
> avoid an OOM error.
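
The retention problem described above can be illustrated with a minimal sketch. It does not reproduce the HIVE-20329 patch: `LoadTask`, `DelinkDemo`, and `delink()` are hypothetical names, and the only assumption is that a long-lived root keeps references to the tasks of every finished iteration unless they are explicitly cleared.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a unit of load work; this is not Hive's Task class.
class LoadTask {
    final String name;
    final byte[] payload;                        // simulates per-task state held on the heap
    final List<LoadTask> children = new ArrayList<>();

    LoadTask(String name, int payloadBytes) {
        this.name = name;
        this.payload = new byte[payloadBytes];
    }

    void addChild(LoadTask child) {
        children.add(child);
    }

    // Drop references to downstream tasks so finished iterations become unreachable.
    void delink() {
        children.clear();
    }

    // Number of tasks reachable from this one, itself included.
    int reachable() {
        int n = 1;
        for (LoadTask c : children) {
            n += c.reachable();
        }
        return n;
    }
}

class DelinkDemo {
    // Runs `iterations` load rounds and returns how many tasks the root still references.
    static int run(int iterations, boolean delinkAfterEachRound) {
        LoadTask root = new LoadTask("root", 0);
        for (int i = 0; i < iterations; i++) {
            LoadTask round = new LoadTask("round-" + i, 1024);
            round.addChild(new LoadTask("copy-" + i, 1024));
            root.addChild(round);
            // ... the round's work would execute here ...
            if (delinkAfterEachRound) {
                root.delink();
            }
        }
        return root.reachable();
    }

    public static void main(String[] args) {
        System.out.println("without delink: " + run(100, false)); // prints 201
        System.out.println("with delink:    " + run(100, true));  // prints 1
    }
}
```

Without delinking, the number of reachable tasks grows linearly with the number of iterations, which is the pattern that eventually exhausts the heap in a long-running load.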



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20329) Long running repl load (incr/bootstrap) causing OOM error

2018-08-12 Thread Sankar Hariappan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16577859#comment-16577859
 ] 

Sankar Hariappan commented on HIVE-20329:
-

01.patch is committed to master and branch-3.

Thanks [~maheshk114] for the patch!

> Long running repl load (incr/bootstrap) causing OOM error
> -
>
> Key: HIVE-20329
> URL: https://issues.apache.org/jira/browse/HIVE-20329
> Project: Hive
>  Issue Type: Task
>  Components: repl
>Affects Versions: 3.1.0, 4.0.0
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: DR, replication
> Fix For: 4.0.0, 3.2.0
>
> Attachments: HIVE-20329.01.patch
>
>
> The tasks created in previous iterations of the load are not delinked, which 
> causes heap memory usage to keep growing. The tasks need to be delinked to 
> avoid an OOM error.





[jira] [Updated] (HIVE-20329) Long running repl load (incr/bootstrap) causing OOM error

2018-08-12 Thread Sankar Hariappan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-20329:

Summary: Long running repl load (incr/bootstrap) causing OOM error  (was: 
Running long running load (incr/bootstrap) causing OOM error)

> Long running repl load (incr/bootstrap) causing OOM error
> -
>
> Key: HIVE-20329
> URL: https://issues.apache.org/jira/browse/HIVE-20329
> Project: Hive
>  Issue Type: Task
>  Components: repl
>Affects Versions: 3.1.0, 4.0.0
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: DR, replication
> Fix For: 4.0.0, 3.2.0
>
> Attachments: HIVE-20329.01.patch
>
>
> The tasks created in previous iterations of the load are not delinked, which 
> causes heap memory usage to keep growing. The tasks need to be delinked to 
> avoid an OOM error.





[jira] [Updated] (HIVE-20329) Running long running load (incr/bootstrap) causing OOM error

2018-08-12 Thread Sankar Hariappan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-20329:

Summary: Running long running load (incr/bootstrap) causing OOM error  
(was: Repl Scale Test : Running long running load (incr/bootstrap) causing OOM 
error)

> Running long running load (incr/bootstrap) causing OOM error
> 
>
> Key: HIVE-20329
> URL: https://issues.apache.org/jira/browse/HIVE-20329
> Project: Hive
>  Issue Type: Task
>  Components: repl
>Affects Versions: 3.1.0, 4.0.0
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: DR, replication
> Fix For: 4.0.0, 3.2.0
>
> Attachments: HIVE-20329.01.patch
>
>
> The tasks created in previous iterations of the load are not delinked, which 
> causes heap memory usage to keep growing. The tasks need to be delinked to 
> avoid an OOM error.





[jira] [Commented] (HIVE-20329) Repl Scale Test : Running long running load (incr/bootstrap) causing OOM error

2018-08-12 Thread Sankar Hariappan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16577858#comment-16577858
 ] 

Sankar Hariappan commented on HIVE-20329:
-

+1

> Repl Scale Test : Running long running load (incr/bootstrap) causing OOM error
> --
>
> Key: HIVE-20329
> URL: https://issues.apache.org/jira/browse/HIVE-20329
> Project: Hive
>  Issue Type: Task
>  Components: repl
>Affects Versions: 3.1.0, 4.0.0
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: DR, replication
> Fix For: 4.0.0, 3.2.0
>
> Attachments: HIVE-20329.01.patch
>
>
> The tasks created in previous iterations of the load are not delinked, which 
> causes heap memory usage to keep growing. The tasks need to be delinked to 
> avoid an OOM error.





[jira] [Commented] (HIVE-20370) Vectorization: Add Native Vector MapJoin hash table optimization for Left/Right Outer Joins when there are no Small Table values

2018-08-12 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16577857#comment-16577857
 ] 

Hive QA commented on HIVE-20370:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 33s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  7s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 46s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m  9s{color} | {color:blue} ql in master has 2306 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  0s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  7s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  7s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 45s{color} | {color:red} ql: The patch generated 106 new + 489 unchanged - 57 fixed = 595 total (was 546) {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  0s{color} | {color:red} The patch has 10 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 58s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 15s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 25m  0s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-13186/dev-support/hive-personality.sh |
| git revision | master / 4a30574 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-13186/yetus/diff-checkstyle-ql.txt |
| whitespace | http://104.198.109.242/logs//PreCommit-HIVE-Build-13186/yetus/whitespace-eol.txt |
| modules | C: ql U: ql |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-13186/yetus.txt |
| Powered by | Apache Yetus   http://yetus.apache.org |


This message was automatically generated.



> Vectorization: Add Native Vector MapJoin hash table optimization for 
> Left/Right Outer Joins when there are no Small Table values
> 
>
> Key: HIVE-20370
> URL: https://issues.apache.org/jira/browse/HIVE-20370
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-20370.01.patch
>
>
> Similar to Native Vector MapJoin's InnerBigOnly optimization that uses an 
> efficient Hash Multi-Set with a counter instead of a Hash Map with an empty 
> value, do the same for Outer joins.
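
The idea in the description above can be sketched as follows. This is an illustration only: it uses `java.util.HashMap` rather than Hive's native vectorized hash tables, and `KeyCountJoinSketch`, `buildMultiSet`, and `leftOuterProbe` are hypothetical names. The assumption is that the query projects no Small Table columns, so the build side only needs a match count per key rather than a map from key to (empty) value rows.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class KeyCountJoinSketch {
    // Build side: keep only a count per join key (a multi-set), no value rows.
    static Map<Long, Integer> buildMultiSet(long[] smallTableKeys) {
        Map<Long, Integer> counts = new HashMap<>();
        for (long k : smallTableKeys) {
            counts.merge(k, 1, Integer::sum);
        }
        return counts;
    }

    // Probe side of a left outer join that outputs only big-table columns:
    // a matched row is repeated once per small-table match,
    // an unmatched row is still emitted exactly once.
    static List<Long> leftOuterProbe(long[] bigTableKeys, Map<Long, Integer> counts) {
        List<Long> out = new ArrayList<>();
        for (long k : bigTableKeys) {
            int repeats = Math.max(1, counts.getOrDefault(k, 0));
            for (int i = 0; i < repeats; i++) {
                out.add(k);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<Long, Integer> counts = buildMultiSet(new long[] {1, 1, 3});
        // Key 1 matches twice, key 2 has no match, key 3 matches once.
        System.out.println(leftOuterProbe(new long[] {1, 2, 3}, counts)); // prints [1, 1, 2, 3]
    }
}
```

Storing a counter instead of a list of empty values keeps the build side smaller while still producing the right number of output rows per big-table row.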





[jira] [Commented] (HIVE-19924) Tag distcp jobs run by Repl Load

2018-08-12 Thread Sankar Hariappan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16577854#comment-16577854
 ] 

Sankar Hariappan commented on HIVE-19924:
-

14.patch is committed to master.

Thanks [~maheshk114] for the patch!

> Tag distcp jobs run by Repl Load
> 
>
> Key: HIVE-19924
> URL: https://issues.apache.org/jira/browse/HIVE-19924
> Project: Hive
>  Issue Type: Task
>  Components: repl
>Affects Versions: 3.1.0, 4.0.0
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: DR, replication
> Fix For: 4.0.0
>
> Attachments: HIVE-19924.01.patch, HIVE-19924.02.patch, 
> HIVE-19924.03.patch, HIVE-19924.04.patch, HIVE-19924.05.patch, 
> HIVE-19924.06.patch, HIVE-19924.07.patch, HIVE-19924.08.patch, 
> HIVE-19924.09.patch, HIVE-19924.10.patch, HIVE-19924.11.patch, 
> HIVE-19924.12.patch, HIVE-19924.13.patch, HIVE-19924.14.patch
>
>
> Add tags in jobconf for distcp related jobs started by replication. This will 
> allow hive to kill these jobs in case beacon retries, or hs2 dies and beacon 
> issues a kill command.
>  * One of the tags should definitely be the query_id that starts the job. 
> With this flow, beacon, before retrying the bootstrap load, will issue a kill 
> command to hs2 with the query id of the previously issued command. hs2 will 
> then kill any running jobs on yarn tagged with that query_id.
>  * To get around the additional failure point mentioned above, the jobs 
> can also be tagged with an additional unique tag_id provided by Beacon in the 
> WITH clause of the repl load command. Enhance the kill 
> api to take the tag as input and kill jobs associated with that tag. The problem 
> here is how to validate the association of the tag with a hive query id to 
> make sure this api is not used to kill jobs run by other components. However, 
> we can provide this capability to admins only, which should be ok in that case.
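
The tag-and-kill flow described above can be sketched as follows. This is a toy in-memory model, not the HIVE-19924 patch and not a YARN or HiveServer2 API: `TaggedJobRegistry`, `submit`, and `killByTag` are hypothetical names, and the job ids and tag values are made up. On Hadoop, job tags are commonly carried in the `mapreduce.job.tags` job property, though whether the patch uses that property is not stated in this thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory stand-in for a cluster's list of running jobs.
class TaggedJobRegistry {
    private final Map<String, Set<String>> tagsByJob = new HashMap<>();

    // Record a job together with the tags it was submitted with
    // (for example the query id and a caller-supplied tag).
    void submit(String jobId, String... tags) {
        tagsByJob.put(jobId, new HashSet<>(Arrays.asList(tags)));
    }

    // Remove every job carrying the tag and return the ids that were killed.
    // Killing by tag is restricted to admins, as the description proposes.
    List<String> killByTag(String tag, boolean callerIsAdmin) {
        if (!callerIsAdmin) {
            throw new SecurityException("kill by tag is restricted to admins");
        }
        List<String> killed = new ArrayList<>();
        tagsByJob.entrySet().removeIf(e -> {
            if (e.getValue().contains(tag)) {
                killed.add(e.getKey());
                return true;
            }
            return false;
        });
        Collections.sort(killed); // deterministic order for callers
        return killed;
    }

    int running() {
        return tagsByJob.size();
    }

    public static void main(String[] args) {
        TaggedJobRegistry registry = new TaggedJobRegistry();
        registry.submit("job_1", "query-A", "beacon-42");
        registry.submit("job_2", "query-A");
        registry.submit("job_3", "query-B");
        System.out.println(registry.killByTag("query-A", true)); // prints [job_1, job_2]
        System.out.println(registry.running());                  // prints 1
    }
}
```

The admin check stands in for the validation concern raised in the second bullet: without a way to prove that a tag belongs to a given query, restricting the call is the simplest guard against killing jobs owned by other components.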





[jira] [Updated] (HIVE-19924) Tag distcp jobs run by Repl Load

2018-08-12 Thread Sankar Hariappan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-19924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-19924:

Fix Version/s: (was: 3.2.0)

> Tag distcp jobs run by Repl Load
> 
>
> Key: HIVE-19924
> URL: https://issues.apache.org/jira/browse/HIVE-19924
> Project: Hive
>  Issue Type: Task
>  Components: repl
>Affects Versions: 3.1.0, 4.0.0
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: DR, replication
> Fix For: 4.0.0
>
> Attachments: HIVE-19924.01.patch, HIVE-19924.02.patch, 
> HIVE-19924.03.patch, HIVE-19924.04.patch, HIVE-19924.05.patch, 
> HIVE-19924.06.patch, HIVE-19924.07.patch, HIVE-19924.08.patch, 
> HIVE-19924.09.patch, HIVE-19924.10.patch, HIVE-19924.11.patch, 
> HIVE-19924.12.patch, HIVE-19924.13.patch, HIVE-19924.14.patch
>
>
> Add tags in jobconf for distcp related jobs started by replication. This will 
> allow hive to kill these jobs in case beacon retries, or hs2 dies and beacon 
> issues a kill command.
>  * One of the tags should definitely be the query_id that starts the job. 
> With this flow, beacon, before retrying the bootstrap load, will issue a kill 
> command to hs2 with the query id of the previously issued command. hs2 will 
> then kill any running jobs on yarn tagged with that query_id.
>  * To get around the additional failure point mentioned above, the jobs 
> can also be tagged with an additional unique tag_id provided by Beacon in the 
> WITH clause of the repl load command. Enhance the kill 
> api to take the tag as input and kill jobs associated with that tag. The problem 
> here is how to validate the association of the tag with a hive query id to 
> make sure this api is not used to kill jobs run by other components. However, 
> we can provide this capability to admins only, which should be ok in that case.





[jira] [Commented] (HIVE-19924) Tag distcp jobs run by Repl Load

2018-08-12 Thread Sankar Hariappan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16577853#comment-16577853
 ] 

Sankar Hariappan commented on HIVE-19924:
-

+1

> Tag distcp jobs run by Repl Load
> 
>
> Key: HIVE-19924
> URL: https://issues.apache.org/jira/browse/HIVE-19924
> Project: Hive
>  Issue Type: Task
>  Components: repl
>Affects Versions: 3.1.0, 4.0.0
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: DR, replication
> Fix For: 4.0.0, 3.2.0
>
> Attachments: HIVE-19924.01.patch, HIVE-19924.02.patch, 
> HIVE-19924.03.patch, HIVE-19924.04.patch, HIVE-19924.05.patch, 
> HIVE-19924.06.patch, HIVE-19924.07.patch, HIVE-19924.08.patch, 
> HIVE-19924.09.patch, HIVE-19924.10.patch, HIVE-19924.11.patch, 
> HIVE-19924.12.patch, HIVE-19924.13.patch, HIVE-19924.14.patch
>
>
> Add tags in jobconf for distcp related jobs started by replication. This will 
> allow hive to kill these jobs in case beacon retries, or hs2 dies and beacon 
> issues a kill command.
>  * One of the tags should definitely be the query_id that starts the job. 
> With this flow, beacon, before retrying the bootstrap load, will issue a kill 
> command to hs2 with the query id of the previously issued command. hs2 will 
> then kill any running jobs on yarn tagged with that query_id.
>  * To get around the additional failure point mentioned above, the jobs 
> can also be tagged with an additional unique tag_id provided by Beacon in the 
> WITH clause of the repl load command. Enhance the kill 
> api to take the tag as input and kill jobs associated with that tag. The problem 
> here is how to validate the association of the tag with a hive query id to 
> make sure this api is not used to kill jobs run by other components. However, 
> we can provide this capability to admins only, which should be ok in that case.





[jira] [Commented] (HIVE-19924) Tag distcp jobs run by Repl Load

2018-08-12 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16577841#comment-16577841
 ] 

Hive QA commented on HIVE-19924:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12935304/HIVE-19924.14.patch

{color:green}SUCCESS:{color} +1 due to 3 test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 14877 tests passed

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/13185/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/13185/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-13185/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12935304 - PreCommit-HIVE-Build

> Tag distcp jobs run by Repl Load
> 
>
> Key: HIVE-19924
> URL: https://issues.apache.org/jira/browse/HIVE-19924
> Project: Hive
>  Issue Type: Task
>  Components: repl
>Affects Versions: 3.1.0, 4.0.0
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: DR, replication
> Fix For: 4.0.0, 3.2.0
>
> Attachments: HIVE-19924.01.patch, HIVE-19924.02.patch, 
> HIVE-19924.03.patch, HIVE-19924.04.patch, HIVE-19924.05.patch, 
> HIVE-19924.06.patch, HIVE-19924.07.patch, HIVE-19924.08.patch, 
> HIVE-19924.09.patch, HIVE-19924.10.patch, HIVE-19924.11.patch, 
> HIVE-19924.12.patch, HIVE-19924.13.patch, HIVE-19924.14.patch
>
>
> Add tags in jobconf for distcp related jobs started by replication. This will 
> allow hive to kill these jobs in case beacon retries, or hs2 dies and beacon 
> issues a kill command.
>  * One of the tags should definitely be the query_id that starts the job. 
> With this flow, beacon, before retrying the bootstrap load, will issue a kill 
> command to hs2 with the query id of the previously issued command. hs2 will 
> then kill any running jobs on yarn tagged with that query_id.
>  * To get around the additional failure point mentioned above, the jobs 
> can also be tagged with an additional unique tag_id provided by Beacon in the 
> WITH clause of the repl load command. Enhance the kill 
> api to take the tag as input and kill jobs associated with that tag. The problem 
> here is how to validate the association of the tag with a hive query id to 
> make sure this api is not used to kill jobs run by other components. However, 
> we can provide this capability to admins only, which should be ok in that case.





[jira] [Comment Edited] (HIVE-20370) Vectorization: Add Native Vector MapJoin hash table optimization for Left/Right Outer Joins when there are no Small Table values

2018-08-12 Thread Gopal V (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16577838#comment-16577838
 ] 

Gopal V edited comment on HIVE-20370 at 8/13/18 5:12 AM:
-

Zero rows on the outer side are often found within MERGE statements with 
dynamic semi-join enabled (i.e., there are no rows to update and it's just a pure 
UPSERT).


was (Author: gopalv):
Zero rows on the outer side are often found within MERGE statements with 
dynamic semi-join enabled (i.e., all rows are new primary keys and there are no 
rows to update).

> Vectorization: Add Native Vector MapJoin hash table optimization for 
> Left/Right Outer Joins when there are no Small Table values
> 
>
> Key: HIVE-20370
> URL: https://issues.apache.org/jira/browse/HIVE-20370
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-20370.01.patch
>
>
> Similar to Native Vector MapJoin's InnerBigOnly optimization that uses an 
> efficient Hash Multi-Set with a counter instead of a Hash Map with an empty 
> value, do the same for Outer joins.





[jira] [Commented] (HIVE-20370) Vectorization: Add Native Vector MapJoin hash table optimization for Left/Right Outer Joins when there are no Small Table values

2018-08-12 Thread Gopal V (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16577838#comment-16577838
 ] 

Gopal V commented on HIVE-20370:


Zero rows on the outer side are often found within MERGE statements with 
dynamic semi-join enabled (i.e., all rows are new primary keys and there are no 
rows to update).

> Vectorization: Add Native Vector MapJoin hash table optimization for 
> Left/Right Outer Joins when there are no Small Table values
> 
>
> Key: HIVE-20370
> URL: https://issues.apache.org/jira/browse/HIVE-20370
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-20370.01.patch
>
>
> Similar to Native Vector MapJoin's InnerBigOnly optimization that uses an 
> efficient Hash Multi-Set with a counter instead of a Hash Map with an empty 
> value, do the same for Outer joins.





[jira] [Updated] (HIVE-20370) Vectorization: Add Native Vector MapJoin hash table optimization for Left/Right Outer Joins when there are no Small Table values

2018-08-12 Thread Matt McCline (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-20370:

Status: Patch Available  (was: Open)

> Vectorization: Add Native Vector MapJoin hash table optimization for 
> Left/Right Outer Joins when there are no Small Table values
> 
>
> Key: HIVE-20370
> URL: https://issues.apache.org/jira/browse/HIVE-20370
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-20370.01.patch
>
>
> Similar to Native Vector MapJoin's InnerBigOnly optimization that uses an 
> efficient Hash Multi-Set with a counter instead of a Hash Map with an empty 
> value, do the same for Outer joins.





[jira] [Updated] (HIVE-20370) Vectorization: Add Native Vector MapJoin hash table optimization for Left/Right Outer Joins when there are no Small Table values

2018-08-12 Thread Matt McCline (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-20370:

Attachment: HIVE-20370.01.patch

> Vectorization: Add Native Vector MapJoin hash table optimization for 
> Left/Right Outer Joins when there are no Small Table values
> 
>
> Key: HIVE-20370
> URL: https://issues.apache.org/jira/browse/HIVE-20370
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-20370.01.patch
>
>
> Similar to Native Vector MapJoin's InnerBigOnly optimization that uses an 
> efficient Hash Multi-Set with a counter instead of a Hash Map with an empty 
> value, do the same for Outer joins.





[jira] [Commented] (HIVE-19924) Tag distcp jobs run by Repl Load

2018-08-12 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16577830#comment-16577830
 ] 

Hive QA commented on HIVE-19924:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 32s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 30s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 16s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 16s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 41s{color} | {color:blue} itests/hive-unit in master has 2 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m  8s{color} | {color:blue} ql in master has 2306 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 41s{color} | {color:blue} service in master has 48 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 45s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  8s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  2m 16s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 22s{color} | {color:red} itests/hive-unit: The patch generated 4 new + 230 unchanged - 0 fixed = 234 total (was 230) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 43s{color} | {color:red} ql: The patch generated 2 new + 306 unchanged - 13 fixed = 308 total (was 319) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  5m 55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 46s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 14s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 35m  2s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-13185/dev-support/hive-personality.sh |
| git revision | master / 4a30574 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-13185/yetus/diff-checkstyle-itests_hive-unit.txt |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-13185/yetus/diff-checkstyle-ql.txt |
| modules | C: itests/hive-unit ql service U: . |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-13185/yetus.txt |
| Powered by | Apache Yetus   http://yetus.apache.org |


This message was automatically generated.



> Tag distcp jobs run by Repl Load
> 
>
> Key: HIVE-19924
> URL: https://issues.apache.org/jira/browse/HIVE-19924
> Project: Hive
>  Issue Type: Task
>  Components: repl
>Affects Versions: 3.1.0, 4.0.0
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: DR, replication
> Fix For: 4.0.0, 3.2.0
>
> Attachments: HIVE-19924.01.patch, HIVE-19924.02.patch, 
> HIVE-19924.03.patch, 

[jira] [Assigned] (HIVE-20370) Vectorization: Add Native Vector MapJoin hash table optimization for Left/Right Outer Joins when there are no Small Table values

2018-08-12 Thread Matt McCline (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline reassigned HIVE-20370:
---


> Vectorization: Add Native Vector MapJoin hash table optimization for 
> Left/Right Outer Joins when there are no Small Table values
> 
>
> Key: HIVE-20370
> URL: https://issues.apache.org/jira/browse/HIVE-20370
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
>
> Similar to Native Vector MapJoin's InnerBigOnly optimization that uses an 
> efficient Hash Multi-Set with a counter instead of a Hash Map with an empty 
> value, do the same for Outer joins.





[jira] [Updated] (HIVE-19924) Tag distcp jobs run by Repl Load

2018-08-12 Thread mahesh kumar behera (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-19924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mahesh kumar behera updated HIVE-19924:
---
Status: Patch Available  (was: Open)

Fixed the one-connection, multi-statement issue.

> Tag distcp jobs run by Repl Load
> 
>
> Key: HIVE-19924
> URL: https://issues.apache.org/jira/browse/HIVE-19924
> Project: Hive
>  Issue Type: Task
>  Components: repl
>Affects Versions: 3.1.0, 4.0.0
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: DR, replication
> Fix For: 4.0.0, 3.2.0
>
> Attachments: HIVE-19924.01.patch, HIVE-19924.02.patch, 
> HIVE-19924.03.patch, HIVE-19924.04.patch, HIVE-19924.05.patch, 
> HIVE-19924.06.patch, HIVE-19924.07.patch, HIVE-19924.08.patch, 
> HIVE-19924.09.patch, HIVE-19924.10.patch, HIVE-19924.11.patch, 
> HIVE-19924.12.patch, HIVE-19924.13.patch, HIVE-19924.14.patch
>
>
> Add tags in jobconf for distcp related jobs started by replication. This will 
> allow hive to kill these jobs in case beacon retries, or hs2 dies and beacon 
> issues a kill command.
>  * One of the tags should definitely be the query_id that starts the job. 
> With this flow, beacon, before retrying the bootstrap load, will issue a kill 
> command to hs2 with the query id of the previously issued command. hs2 will 
> then kill any running jobs on yarn tagged with that query_id.
>  * To get around the additional failure point mentioned above, the jobs 
> can also be tagged with an additional unique tag_id provided by Beacon in the 
> WITH clause of the repl load command. Enhance the kill 
> api to take the tag as input and kill jobs associated with that tag. The problem 
> here is how to validate the association of the tag with a hive query id to 
> make sure this api is not used to kill jobs run by other components. However, 
> we can provide this capability to admins only, which should be ok in that case.





[jira] [Updated] (HIVE-19924) Tag distcp jobs run by Repl Load

2018-08-12 Thread mahesh kumar behera (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-19924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mahesh kumar behera updated HIVE-19924:
---
Attachment: HIVE-19924.14.patch

> Tag distcp jobs run by Repl Load
> 
>
> Key: HIVE-19924
> URL: https://issues.apache.org/jira/browse/HIVE-19924
> Project: Hive
>  Issue Type: Task
>  Components: repl
>Affects Versions: 3.1.0, 4.0.0
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: DR, replication
> Fix For: 4.0.0, 3.2.0
>
> Attachments: HIVE-19924.01.patch, HIVE-19924.02.patch, 
> HIVE-19924.03.patch, HIVE-19924.04.patch, HIVE-19924.05.patch, 
> HIVE-19924.06.patch, HIVE-19924.07.patch, HIVE-19924.08.patch, 
> HIVE-19924.09.patch, HIVE-19924.10.patch, HIVE-19924.11.patch, 
> HIVE-19924.12.patch, HIVE-19924.13.patch, HIVE-19924.14.patch
>
>
> Add tags to the jobconf of distcp-related jobs started by replication. This 
> will allow Hive to kill these jobs if Beacon retries, or if HS2 dies and 
> Beacon issues a kill command.
>  * One of the tags should definitely be the query_id that starts the job. 
> With this flow, before retrying the bootstrap load, Beacon will issue a kill 
> command to HS2 with the query id of the previously issued command. HS2 will 
> then kill any running jobs on YARN tagged with that query_id.
>  * To get around the additional failure point mentioned above, the jobs can 
> also be tagged with an additional unique tag_id that Beacon provides in the 
> WITH clause of the repl load command. Enhance the kill API to take the tag as 
> input and kill the jobs associated with that tag. The open problem is how to 
> validate the association of the tag with a Hive query id, to make sure this 
> API is not used to kill jobs run by other components. However, we can 
> restrict this capability to admins, which should be acceptable.





[jira] [Updated] (HIVE-19924) Tag distcp jobs run by Repl Load

2018-08-12 Thread mahesh kumar behera (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-19924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mahesh kumar behera updated HIVE-19924:
---
Status: Open  (was: Patch Available)

> Tag distcp jobs run by Repl Load
> 
>
> Key: HIVE-19924
> URL: https://issues.apache.org/jira/browse/HIVE-19924
> Project: Hive
>  Issue Type: Task
>  Components: repl
>Affects Versions: 3.1.0, 4.0.0
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: DR, replication
> Fix For: 4.0.0, 3.2.0
>
> Attachments: HIVE-19924.01.patch, HIVE-19924.02.patch, 
> HIVE-19924.03.patch, HIVE-19924.04.patch, HIVE-19924.05.patch, 
> HIVE-19924.06.patch, HIVE-19924.07.patch, HIVE-19924.08.patch, 
> HIVE-19924.09.patch, HIVE-19924.10.patch, HIVE-19924.11.patch, 
> HIVE-19924.12.patch, HIVE-19924.13.patch, HIVE-19924.14.patch
>
>
> Add tags to the jobconf of distcp-related jobs started by replication. This 
> will allow Hive to kill these jobs if Beacon retries, or if HS2 dies and 
> Beacon issues a kill command.
>  * One of the tags should definitely be the query_id that starts the job. 
> With this flow, before retrying the bootstrap load, Beacon will issue a kill 
> command to HS2 with the query id of the previously issued command. HS2 will 
> then kill any running jobs on YARN tagged with that query_id.
>  * To get around the additional failure point mentioned above, the jobs can 
> also be tagged with an additional unique tag_id that Beacon provides in the 
> WITH clause of the repl load command. Enhance the kill API to take the tag as 
> input and kill the jobs associated with that tag. The open problem is how to 
> validate the association of the tag with a Hive query id, to make sure this 
> API is not used to kill jobs run by other components. However, we can 
> restrict this capability to admins, which should be acceptable.





[jira] [Updated] (HIVE-19924) Tag distcp jobs run by Repl Load

2018-08-12 Thread mahesh kumar behera (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-19924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mahesh kumar behera updated HIVE-19924:
---
Status: Patch Available  (was: Open)

> Tag distcp jobs run by Repl Load
> 
>
> Key: HIVE-19924
> URL: https://issues.apache.org/jira/browse/HIVE-19924
> Project: Hive
>  Issue Type: Task
>  Components: repl
>Affects Versions: 3.1.0, 4.0.0
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: DR, replication
> Fix For: 4.0.0, 3.2.0
>
> Attachments: HIVE-19924.01.patch, HIVE-19924.02.patch, 
> HIVE-19924.03.patch, HIVE-19924.04.patch, HIVE-19924.05.patch, 
> HIVE-19924.06.patch, HIVE-19924.07.patch, HIVE-19924.08.patch, 
> HIVE-19924.09.patch, HIVE-19924.10.patch, HIVE-19924.11.patch, 
> HIVE-19924.12.patch, HIVE-19924.13.patch
>
>
> Add tags to the jobconf of distcp-related jobs started by replication. This 
> will allow Hive to kill these jobs if Beacon retries, or if HS2 dies and 
> Beacon issues a kill command.
>  * One of the tags should definitely be the query_id that starts the job. 
> With this flow, before retrying the bootstrap load, Beacon will issue a kill 
> command to HS2 with the query id of the previously issued command. HS2 will 
> then kill any running jobs on YARN tagged with that query_id.
>  * To get around the additional failure point mentioned above, the jobs can 
> also be tagged with an additional unique tag_id that Beacon provides in the 
> WITH clause of the repl load command. Enhance the kill API to take the tag as 
> input and kill the jobs associated with that tag. The open problem is how to 
> validate the association of the tag with a Hive query id, to make sure this 
> API is not used to kill jobs run by other components. However, we can 
> restrict this capability to admins, which should be acceptable.





[jira] [Updated] (HIVE-19924) Tag distcp jobs run by Repl Load

2018-08-12 Thread mahesh kumar behera (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-19924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mahesh kumar behera updated HIVE-19924:
---
Status: Open  (was: Patch Available)

Rerunning the same patch again.

> Tag distcp jobs run by Repl Load
> 
>
> Key: HIVE-19924
> URL: https://issues.apache.org/jira/browse/HIVE-19924
> Project: Hive
>  Issue Type: Task
>  Components: repl
>Affects Versions: 3.1.0, 4.0.0
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: DR, replication
> Fix For: 4.0.0, 3.2.0
>
> Attachments: HIVE-19924.01.patch, HIVE-19924.02.patch, 
> HIVE-19924.03.patch, HIVE-19924.04.patch, HIVE-19924.05.patch, 
> HIVE-19924.06.patch, HIVE-19924.07.patch, HIVE-19924.08.patch, 
> HIVE-19924.09.patch, HIVE-19924.10.patch, HIVE-19924.11.patch, 
> HIVE-19924.12.patch, HIVE-19924.13.patch
>
>
> Add tags to the jobconf of distcp-related jobs started by replication. This 
> will allow Hive to kill these jobs if Beacon retries, or if HS2 dies and 
> Beacon issues a kill command.
>  * One of the tags should definitely be the query_id that starts the job. 
> With this flow, before retrying the bootstrap load, Beacon will issue a kill 
> command to HS2 with the query id of the previously issued command. HS2 will 
> then kill any running jobs on YARN tagged with that query_id.
>  * To get around the additional failure point mentioned above, the jobs can 
> also be tagged with an additional unique tag_id that Beacon provides in the 
> WITH clause of the repl load command. Enhance the kill API to take the tag as 
> input and kill the jobs associated with that tag. The open problem is how to 
> validate the association of the tag with a Hive query id, to make sure this 
> API is not used to kill jobs run by other components. However, we can 
> restrict this capability to admins, which should be acceptable.





[jira] [Commented] (HIVE-20366) TPC-DS query78 stats estimates are off for is null filter

2018-08-12 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16577740#comment-16577740
 ] 

Hive QA commented on HIVE-20366:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12935299/HIVE-20366.3.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 14877 tests passed

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/13184/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/13184/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-13184/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12935299 - PreCommit-HIVE-Build

> TPC-DS query78 stats estimates are off for is null filter
> -
>
> Key: HIVE-20366
> URL: https://issues.apache.org/jira/browse/HIVE-20366
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-20366.1.patch, HIVE-20366.2.patch, 
> HIVE-20366.3.patch
>
>
> In Query 78, there are left outer joins between fact table pairs: store_sales 
> LOJ store_returns, catalog_sales LOJ catalog_returns, and web_sales LOJ 
> web_returns. Each of these joins is estimated to produce only a single row, 
> so the result is BROADCAST, which causes hash table memory errors.
> {code}
>  Reducer 12
>    Execution mode: vectorized, llap
>    Reduce Operator Tree:
>      Map Join Operator
>        condition map:
>             Left Outer Join 0 to 1
>        keys:
>          0 KEY.reducesinkkey0 (type: bigint), KEY.reducesinkkey1 (type: bigint)
>          1 KEY.reducesinkkey0 (type: bigint), KEY.reducesinkkey1 (type: bigint)
>        outputColumnNames: _col0, _col1, _col3, _col4, _col5, _col6, _col8
>        input vertices:
>          1 Map 14
>        Statistics: Num rows: 10282477384 Data size: 534184867432 Basic stats: COMPLETE Column stats: COMPLETE
>        Filter Operator
>          predicate: _col8 is null (type: boolean)
>          *Statistics: Num rows: 1* Data size: 52 Basic stats: COMPLETE Column stats: COMPLETE
> {code}
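
To make the problem in the plan above concrete: after a left outer join, every left-side row without a match carries NULL in the right-side columns, so an `is null` filter on such a column should keep at least those rows. The sketch below is an assumed model of that reasoning, not the logic in the attached patches; `estimate_is_null_rows` and its parameters are hypothetical names chosen for illustration.

```python
def estimate_is_null_rows(left_rows, matched_left_rows, matched_null_rows=0):
    """Estimate rows surviving `right_col IS NULL` after a left outer join.

    left_rows:         estimated rows on the left (preserved) side
    matched_left_rows: estimated left rows that find a join partner
    matched_null_rows: matched rows whose right column is NULL anyway

    All three inputs are assumptions of this sketch; a real optimizer
    would derive them from column statistics.
    """
    # Left rows with no partner are padded with NULLs by the outer join.
    unmatched_rows = max(left_rows - matched_left_rows, 0)
    # Never estimate fewer than one row, mirroring the plan's lower bound.
    return max(unmatched_rows + matched_null_rows, 1)
```

With an estimate of this shape, a join in which a large share of left rows is unmatched would no longer look like a single-row input and would be less likely to be chosen for a broadcast.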





[jira] [Commented] (HIVE-20366) TPC-DS query78 stats estimates are off for is null filter

2018-08-12 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16577737#comment-16577737
 ] 

Hive QA commented on HIVE-20366:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
39s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
10s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
39s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m  
6s{color} | {color:blue} ql in master has 2306 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
59s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
9s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
9s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
37s{color} | {color:red} ql: The patch generated 2 new + 23 unchanged - 0 fixed 
= 25 total (was 23) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
58s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
14s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 24m 47s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-13184/dev-support/hive-personality.sh
 |
| git revision | master / 4a30574 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-13184/yetus/diff-checkstyle-ql.txt
 |
| modules | C: ql U: ql |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-13184/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> TPC-DS query78 stats estimates are off for is null filter
> -
>
> Key: HIVE-20366
> URL: https://issues.apache.org/jira/browse/HIVE-20366
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-20366.1.patch, HIVE-20366.2.patch, 
> HIVE-20366.3.patch
>
>
> In Query 78, there are left outer joins between fact table pairs: store_sales 
> LOJ store_returns, catalog_sales LOJ catalog_returns, and web_sales LOJ 
> web_returns. Each of these joins is estimated to produce only a single row, 
> so the result is BROADCAST, which causes hash table memory errors.
> {code}
>  Reducer 12 |
> | Execution mode: vectorized, llap   |
> | Reduce Operator Tree:  |
> ++
> |  Explain   |
> ++
> |   Map Join Operator|
> | condition map: |
> |  Left Outer Join 0 to 1|
> | keys:  |
> |   0 KEY.reducesinkkey0 (type: bigint), KEY.reducesinkkey1 
> (type: bigint) |
> |   1 KEY.reducesinkkey0 (type: bigint), KEY.reducesinkkey1 
> (type: bigint) |
> | 

[jira] [Commented] (HIVE-20366) TPC-DS query78 stats estimates are off for is null filter

2018-08-12 Thread Vineet Garg (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16577726#comment-16577726
 ] 

Vineet Garg commented on HIVE-20366:


Review board link: https://reviews.apache.org/r/68313/

> TPC-DS query78 stats estimates are off for is null filter
> -
>
> Key: HIVE-20366
> URL: https://issues.apache.org/jira/browse/HIVE-20366
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-20366.1.patch, HIVE-20366.2.patch, 
> HIVE-20366.3.patch
>
>
> In Query 78, there are left outer joins between fact table pairs: store_sales 
> LOJ store_returns, catalog_sales LOJ catalog_returns, and web_sales LOJ 
> web_returns. Each of these joins is estimated to produce only a single row, 
> so the result is BROADCAST, which causes hash table memory errors.
> {code}
>  Reducer 12
>    Execution mode: vectorized, llap
>    Reduce Operator Tree:
>      Map Join Operator
>        condition map:
>             Left Outer Join 0 to 1
>        keys:
>          0 KEY.reducesinkkey0 (type: bigint), KEY.reducesinkkey1 (type: bigint)
>          1 KEY.reducesinkkey0 (type: bigint), KEY.reducesinkkey1 (type: bigint)
>        outputColumnNames: _col0, _col1, _col3, _col4, _col5, _col6, _col8
>        input vertices:
>          1 Map 14
>        Statistics: Num rows: 10282477384 Data size: 534184867432 Basic stats: COMPLETE Column stats: COMPLETE
>        Filter Operator
>          predicate: _col8 is null (type: boolean)
>          *Statistics: Num rows: 1* Data size: 52 Basic stats: COMPLETE Column stats: COMPLETE
> {code}





[jira] [Updated] (HIVE-20366) TPC-DS query78 stats estimates are off for is null filter

2018-08-12 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-20366:
---
Attachment: HIVE-20366.3.patch

> TPC-DS query78 stats estimates are off for is null filter
> -
>
> Key: HIVE-20366
> URL: https://issues.apache.org/jira/browse/HIVE-20366
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-20366.1.patch, HIVE-20366.2.patch, 
> HIVE-20366.3.patch
>
>
> In Query 78, there are left outer joins between fact table pairs: store_sales 
> LOJ store_returns, catalog_sales LOJ catalog_returns, and web_sales LOJ 
> web_returns. Each of these joins is estimated to produce only a single row, 
> so the result is BROADCAST, which causes hash table memory errors.
> {code}
>  Reducer 12
>    Execution mode: vectorized, llap
>    Reduce Operator Tree:
>      Map Join Operator
>        condition map:
>             Left Outer Join 0 to 1
>        keys:
>          0 KEY.reducesinkkey0 (type: bigint), KEY.reducesinkkey1 (type: bigint)
>          1 KEY.reducesinkkey0 (type: bigint), KEY.reducesinkkey1 (type: bigint)
>        outputColumnNames: _col0, _col1, _col3, _col4, _col5, _col6, _col8
>        input vertices:
>          1 Map 14
>        Statistics: Num rows: 10282477384 Data size: 534184867432 Basic stats: COMPLETE Column stats: COMPLETE
>        Filter Operator
>          predicate: _col8 is null (type: boolean)
>          *Statistics: Num rows: 1* Data size: 52 Basic stats: COMPLETE Column stats: COMPLETE
> {code}





[jira] [Commented] (HIVE-20366) TPC-DS query78 stats estimates are off for is null filter

2018-08-12 Thread Vineet Garg (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16577725#comment-16577725
 ] 

Vineet Garg commented on HIVE-20366:


bq. There are a few commented-out asserts. Is that required?
Yes, this assumption is no longer true, so the asserts had to be removed.

 

> TPC-DS query78 stats estimates are off for is null filter
> -
>
> Key: HIVE-20366
> URL: https://issues.apache.org/jira/browse/HIVE-20366
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-20366.1.patch, HIVE-20366.2.patch, 
> HIVE-20366.3.patch
>
>
> In Query 78, there are left outer joins between fact table pairs: store_sales 
> LOJ store_returns, catalog_sales LOJ catalog_returns, and web_sales LOJ 
> web_returns. Each of these joins is estimated to produce only a single row, 
> so the result is BROADCAST, which causes hash table memory errors.
> {code}
>  Reducer 12
>    Execution mode: vectorized, llap
>    Reduce Operator Tree:
>      Map Join Operator
>        condition map:
>             Left Outer Join 0 to 1
>        keys:
>          0 KEY.reducesinkkey0 (type: bigint), KEY.reducesinkkey1 (type: bigint)
>          1 KEY.reducesinkkey0 (type: bigint), KEY.reducesinkkey1 (type: bigint)
>        outputColumnNames: _col0, _col1, _col3, _col4, _col5, _col6, _col8
>        input vertices:
>          1 Map 14
>        Statistics: Num rows: 10282477384 Data size: 534184867432 Basic stats: COMPLETE Column stats: COMPLETE
>        Filter Operator
>          predicate: _col8 is null (type: boolean)
>          *Statistics: Num rows: 1* Data size: 52 Basic stats: COMPLETE Column stats: COMPLETE
> {code}





[jira] [Commented] (HIVE-20366) TPC-DS query78 stats estimates are off for is null filter

2018-08-12 Thread Ashutosh Chauhan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16577718#comment-16577718
 ] 

Ashutosh Chauhan commented on HIVE-20366:
-

* Please remove the unneeded {{import net.jpountz.util.SafeUtils;}}
* Instead of naming the variables {{danglingRows}}, I think {{unmatchedRows}} 
would be better suited.
* There are a few commented-out asserts. Is that required?
* Could you please also create an RB for it?

> TPC-DS query78 stats estimates are off for is null filter
> -
>
> Key: HIVE-20366
> URL: https://issues.apache.org/jira/browse/HIVE-20366
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-20366.1.patch, HIVE-20366.2.patch
>
>
> In Query 78, there are left outer joins between fact table pairs: store_sales 
> LOJ store_returns, catalog_sales LOJ catalog_returns, and web_sales LOJ 
> web_returns. Each of these joins is estimated to produce only a single row, 
> so the result is BROADCAST, which causes hash table memory errors.
> {code}
>  Reducer 12
>    Execution mode: vectorized, llap
>    Reduce Operator Tree:
>      Map Join Operator
>        condition map:
>             Left Outer Join 0 to 1
>        keys:
>          0 KEY.reducesinkkey0 (type: bigint), KEY.reducesinkkey1 (type: bigint)
>          1 KEY.reducesinkkey0 (type: bigint), KEY.reducesinkkey1 (type: bigint)
>        outputColumnNames: _col0, _col1, _col3, _col4, _col5, _col6, _col8
>        input vertices:
>          1 Map 14
>        Statistics: Num rows: 10282477384 Data size: 534184867432 Basic stats: COMPLETE Column stats: COMPLETE
>        Filter Operator
>          predicate: _col8 is null (type: boolean)
>          *Statistics: Num rows: 1* Data size: 52 Basic stats: COMPLETE Column stats: COMPLETE
> {code}





[jira] [Commented] (HIVE-17040) Join elimination in the presence of FK relationship

2018-08-12 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16577702#comment-16577702
 ] 

Hive QA commented on HIVE-17040:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
21s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
11s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
28s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
24s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
33s{color} | {color:blue} common in master has 64 extant Findbugs warnings. 
{color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m  
5s{color} | {color:blue} ql in master has 2306 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  8m 
11s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  
7s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  7m 
38s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
41s{color} | {color:red} ql: The patch generated 10 new + 172 unchanged - 4 
fixed = 182 total (was 176) {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 28 line(s) that end in whitespace. Use 
git apply --whitespace=fix <>. Refer 
https://git-scm.com/docs/git-apply {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  4m 
25s{color} | {color:red} ql generated 1 new + 2306 unchanged - 0 fixed = 2307 
total (was 2306) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  8m  
9s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
12s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 62m 23s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:ql |
|  |  Dead store to p1 in 
org.apache.hadoop.hive.ql.optimizer.calcite.rules.HiveJoinConstraintsRule$EquivalenceClasses.addEquivalenceClass(RexTableInputRef,
 RexTableInputRef)  At 
HiveJoinConstraintsRule.java:org.apache.hadoop.hive.ql.optimizer.calcite.rules.HiveJoinConstraintsRule$EquivalenceClasses.addEquivalenceClass(RexTableInputRef,
 RexTableInputRef)  At HiveJoinConstraintsRule.java:[line 391] |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-13183/dev-support/hive-personality.sh
 |
| git revision | master / 4a30574 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-13183/yetus/diff-checkstyle-ql.txt
 |
| whitespace | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-13183/yetus/whitespace-eol.txt
 |
| findbugs | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-13183/yetus/new-findbugs-ql.html
 |
| modules | C: common . itests ql U: . |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-13183/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Join elimination in the presence of FK relationship
> ---
>
> Key: HIVE-17040
> URL: https://issues.apache.org/jira/browse/HIVE-17040
> Project: Hive
>  Issue Type: Sub-task
>  Components: Logical 

[jira] [Commented] (HIVE-17040) Join elimination in the presence of FK relationship

2018-08-12 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16577701#comment-16577701
 ] 

Hive QA commented on HIVE-17040:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12935296/HIVE-17040.02.patch

{color:green}SUCCESS:{color} +1 due to 10 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 14880 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[list_bucket_dml_2] 
(batchId=114)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/13183/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/13183/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-13183/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12935296 - PreCommit-HIVE-Build

> Join elimination in the presence of FK relationship
> ---
>
> Key: HIVE-17040
> URL: https://issues.apache.org/jira/browse/HIVE-17040
> Project: Hive
>  Issue Type: Sub-task
>  Components: Logical Optimizer
>Affects Versions: 3.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-17040.01.patch, HIVE-17040.02.patch, 
> HIVE-17040.patch
>
>
> If the PK/UK table is not filtered, we can safely remove the join.
> A simple example:
> {code:sql}
> SELECT c_current_cdemo_sk
> FROM customer JOIN customer_address
> ON c_current_addr_sk = ca_address_sk;
> {code}
> As a Calcite rule, we could implement this rewriting by 1) matching a Project 
> on top of a Join operator, 2) checking that only columns from the FK are used 
> in the Project, 3) checking that the join condition matches the FK - PK/UK 
> relationship, 4) pulling all the predicates from the PK/UK side and checking 
> that the input is not filtered, and 5) removing the join, possibly adding an 
> IS NOT NULL condition on the join column from the FK side.
> If the PK/UK table is filtered, we should still transform the Join into a 
> SemiJoin operator.
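
The checks listed in the description can be sketched on a toy plan representation. This is an illustrative model only, not the Calcite rule from the patch: `ForeignKey`, `rewrite_join`, and the tuple-based encoding of columns and join conditions are hypothetical, and the foreign key from `customer.c_current_addr_sk` to `customer_address.ca_address_sk` is assumed for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ForeignKey:
    """Hypothetical description of one FK - PK/UK relationship."""
    fk_table: str
    fk_column: str
    pk_table: str
    pk_column: str


def rewrite_join(projected, join_condition, fk, pk_side_predicates,
                 fk_nullable=True):
    """Decide how a Project-over-Join could be rewritten.

    projected:          list of (table, column) pairs used by the Project
    join_condition:     pair of (table, column) pairs compared for equality
    pk_side_predicates: filters found on the PK/UK input
    Returns (action, extra_filter), where action is 'keep', 'semi_join',
    or 'remove_join'.
    """
    # Step 2: only columns from the FK side may be projected.
    if any(table != fk.fk_table for table, _ in projected):
        return ("keep", None)
    # Step 3: the join condition must be exactly the FK - PK/UK equality.
    expected = {(fk.fk_table, fk.fk_column), (fk.pk_table, fk.pk_column)}
    if set(join_condition) != expected:
        return ("keep", None)
    # Step 4: a filtered PK/UK side can drop rows, so use a semi-join.
    if pk_side_predicates:
        return ("semi_join", None)
    # Step 5: remove the join; NULL FK values would not have matched.
    extra = f"{fk.fk_column} IS NOT NULL" if fk_nullable else None
    return ("remove_join", extra)
```

For the query in the description, with no filter on `customer_address`, this model returns `remove_join` with the extra filter `c_current_addr_sk IS NOT NULL`.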





[jira] [Updated] (HIVE-17040) Join elimination in the presence of FK relationship

2018-08-12 Thread Jesus Camacho Rodriguez (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-17040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-17040:
---
Attachment: HIVE-17040.02.patch

> Join elimination in the presence of FK relationship
> ---
>
> Key: HIVE-17040
> URL: https://issues.apache.org/jira/browse/HIVE-17040
> Project: Hive
>  Issue Type: Sub-task
>  Components: Logical Optimizer
>Affects Versions: 3.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-17040.01.patch, HIVE-17040.02.patch, 
> HIVE-17040.patch
>
>
> If the PK/UK table is not filtered, we can safely remove the join.
> A simple example:
> {code:sql}
> SELECT c_current_cdemo_sk
> FROM customer JOIN customer_address
> ON c_current_addr_sk = ca_address_sk;
> {code}
> As a Calcite rule, we could implement this rewriting by 1) matching a Project 
> on top of a Join operator, 2) checking that only columns from the FK are used 
> in the Project, 3) checking that the join condition matches the FK - PK/UK 
> relationship, 4) pulling all the predicates from the PK/UK side and checking 
> that the input is not filtered, and 5) removing the join, possibly adding an 
> IS NOT NULL condition on the join column from the FK side.
> If the PK/UK table is filtered, we should still transform the Join into a 
> SemiJoin operator.





[jira] [Commented] (HIVE-20150) TopNKey pushdown

2018-08-12 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16577657#comment-16577657
 ] 

Hive QA commented on HIVE-20150:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12935290/HIVE-20150.4.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 14877 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_struct_type_vectorization]
 (batchId=154)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[parquet_complex_types_vectorization]
 (batchId=155)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[parquet_map_type_vectorization]
 (batchId=155)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[parquet_struct_type_vectorization]
 (batchId=151)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucket_groupby]
 (batchId=180)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[topnkey] 
(batchId=107)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[vector_topnkey] 
(batchId=107)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/13182/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/13182/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-13182/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 7 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12935290 - PreCommit-HIVE-Build

> TopNKey pushdown
> 
>
> Key: HIVE-20150
> URL: https://issues.apache.org/jira/browse/HIVE-20150
> Project: Hive
>  Issue Type: Improvement
>  Components: Physical Optimizer
>Affects Versions: 4.0.0
>Reporter: Teddy Choi
>Assignee: Teddy Choi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-20150.1.patch, HIVE-20150.2.patch, 
> HIVE-20150.4.patch
>
>
> TopNKey operator is implemented in HIVE-17896, but it needs more work in 
> pushdown implementation. So this issue covers TopNKey pushdown implementation 
> with proper tests.





[jira] [Commented] (HIVE-20368) Remove VectorTopNKeyOperator lock

2018-08-12 Thread Jesus Camacho Rodriguez (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16577655#comment-16577655
 ] 

Jesus Camacho Rodriguez commented on HIVE-20368:


+1

> Remove VectorTopNKeyOperator lock
> -
>
> Key: HIVE-20368
> URL: https://issues.apache.org/jira/browse/HIVE-20368
> Project: Hive
>  Issue Type: Bug
>Reporter: Teddy Choi
>Assignee: Teddy Choi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-20368.1.patch, HIVE-20368.2.patch
>
>
> VectorTopNKeyOperator has a lock at line 199, as follows.
> {code:java}
> priorityQueue.offer(WritableUtils.clone(keysWritable, getConfiguration()));
> {code}
> WritableUtils.clone calls Configuration.getClassByNameOrNull, which has a 
> synchronized block. So the operator needs to run without taking this lock.
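
One lock-free alternative to a reflective clone is to copy the serialized key bytes directly before queueing them. The sketch below is an assumption about how that could look, not what the attached patches do: `TopNKeyCopySketch` and its methods are hypothetical, it uses only the JDK, and it models keys as plain byte arrays rather than Hadoop `Writable` objects.

```java
import java.util.Arrays;
import java.util.PriorityQueue;

/** Illustrative only: hypothetical class, not the Hive operator. */
final class TopNKeyCopySketch {
    private final int topN;
    // Max-heap on unsigned lexicographic order, so the largest retained key is on top.
    private final PriorityQueue<byte[]> largestFirst;

    TopNKeyCopySketch(int topN) {
        this.topN = topN;
        this.largestFirst = new PriorityQueue<byte[]>(topN + 1, (x, y) -> compareBytes(y, x));
    }

    /** Copies the key out of a reused buffer with a plain array copy: no class lookup, no lock. */
    void offer(byte[] reusedBuffer, int length) {
        byte[] copy = Arrays.copyOf(reusedBuffer, length);
        largestFirst.offer(copy);
        if (largestFirst.size() > topN) {
            largestFirst.poll(); // drop the largest key so only the N smallest remain
        }
    }

    byte[] largestRetained() {
        return largestFirst.peek();
    }

    int size() {
        return largestFirst.size();
    }

    /** Unsigned lexicographic comparison of two byte arrays. */
    static int compareBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int cmp = (a[i] & 0xff) - (b[i] & 0xff);
            if (cmp != 0) {
                return cmp;
            }
        }
        return a.length - b.length;
    }
}
```

Because each key is copied on `offer`, the caller can keep overwriting the same buffer for every row without corrupting the keys already in the queue.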





[jira] [Commented] (HIVE-20150) TopNKey pushdown

2018-08-12 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16577647#comment-16577647
 ] 

Hive QA commented on HIVE-20150:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
21s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
7s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
40s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m  
2s{color} | {color:blue} ql in master has 2306 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
59s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
4s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
4s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
39s{color} | {color:red} ql: The patch generated 49 new + 41 unchanged - 0 
fixed = 90 total (was 41) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  4m 
15s{color} | {color:red} ql generated 1 new + 2306 unchanged - 0 fixed = 2307 
total (was 2306) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
59s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
13s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 24m 29s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:ql |
|  |  Switch statement found in 
org.apache.hadoop.hive.ql.optimizer.TopNKeyPushdownProcessor.pushdown(TopNKeyOperator)
 where default case is missing  At TopNKeyPushdownProcessor.java:[lines 94-104] |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-13182/dev-support/hive-personality.sh
 |
| git revision | master / 4a30574 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-13182/yetus/diff-checkstyle-ql.txt
 |
| findbugs | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-13182/yetus/new-findbugs-ql.html
 |
| modules | C: ql U: ql |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-13182/yetus.txt |
| Powered by | Apache Yetus   http://yetus.apache.org |


This message was automatically generated.
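
The FindBugs finding in the report above concerns a `switch` with no `default` branch. A minimal sketch of the usual remedy follows; `SwitchDefaultSketch`, `OperatorKind`, and `canPushThrough` are placeholder names, and the enum values say nothing about which operators TopNKey can actually be pushed through.

```java
/** Illustrative only: placeholder names, not the code in TopNKeyPushdownProcessor. */
final class SwitchDefaultSketch {

    enum OperatorKind { PROJECT, FORWARD, OTHER }

    static boolean canPushThrough(OperatorKind kind) {
        switch (kind) {
            case PROJECT:
            case FORWARD:
                return true;
            default:
                // An explicit default makes the "do nothing" path deliberate
                // and covers enum values added later.
                return false;
        }
    }
}
```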



> TopNKey pushdown
> 
>
> Key: HIVE-20150
> URL: https://issues.apache.org/jira/browse/HIVE-20150
> Project: Hive
>  Issue Type: Improvement
>  Components: Physical Optimizer
>Affects Versions: 4.0.0
>Reporter: Teddy Choi
>Assignee: Teddy Choi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-20150.1.patch, HIVE-20150.2.patch, 
> HIVE-20150.4.patch
>
>
> TopNKey operator is implemented in HIVE-17896, but it needs more work in 
> pushdown implementation. So this issue covers TopNKey pushdown implementation 
> with proper tests.





[jira] [Commented] (HIVE-20150) TopNKey pushdown

2018-08-12 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16577635#comment-16577635
 ] 

Hive QA commented on HIVE-20150:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12935285/HIVE-20150.3.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 14877 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucket_groupby]
 (batchId=180)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/13181/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/13181/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-13181/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12935285 - PreCommit-HIVE-Build

> TopNKey pushdown
> 
>
> Key: HIVE-20150
> URL: https://issues.apache.org/jira/browse/HIVE-20150
> Project: Hive
>  Issue Type: Improvement
>  Components: Physical Optimizer
>Affects Versions: 4.0.0
>Reporter: Teddy Choi
>Assignee: Teddy Choi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-20150.1.patch, HIVE-20150.2.patch, 
> HIVE-20150.4.patch
>
>
> TopNKey operator is implemented in HIVE-17896, but it needs more work in 
> pushdown implementation. So this issue covers TopNKey pushdown implementation 
> with proper tests.





[jira] [Commented] (HIVE-20150) TopNKey pushdown

2018-08-12 Thread Teddy Choi (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16577624#comment-16577624
 ] 

Teddy Choi commented on HIVE-20150:
---

HIVE-20150.4.patch applies Jesus' feedback. bucket_groupby.q depends on 
HIVE-20368, so it will fail until HIVE-20368 is resolved.

> TopNKey pushdown
> 
>
> Key: HIVE-20150
> URL: https://issues.apache.org/jira/browse/HIVE-20150
> Project: Hive
>  Issue Type: Improvement
>  Components: Physical Optimizer
>Affects Versions: 4.0.0
>Reporter: Teddy Choi
>Assignee: Teddy Choi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-20150.1.patch, HIVE-20150.2.patch, 
> HIVE-20150.4.patch
>
>
> TopNKey operator is implemented in HIVE-17896, but it needs more work in 
> pushdown implementation. So this issue covers TopNKey pushdown implementation 
> with proper tests.





[jira] [Updated] (HIVE-20150) TopNKey pushdown

2018-08-12 Thread Teddy Choi (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Teddy Choi updated HIVE-20150:
--
Attachment: (was: HIVE-20150.3.patch)

> TopNKey pushdown
> 
>
> Key: HIVE-20150
> URL: https://issues.apache.org/jira/browse/HIVE-20150
> Project: Hive
>  Issue Type: Improvement
>  Components: Physical Optimizer
>Affects Versions: 4.0.0
>Reporter: Teddy Choi
>Assignee: Teddy Choi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-20150.1.patch, HIVE-20150.2.patch, 
> HIVE-20150.4.patch
>
>
> TopNKey operator is implemented in HIVE-17896, but it needs more work in 
> pushdown implementation. So this issue covers TopNKey pushdown implementation 
> with proper tests.





[jira] [Updated] (HIVE-20150) TopNKey pushdown

2018-08-12 Thread Teddy Choi (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Teddy Choi updated HIVE-20150:
--
Attachment: HIVE-20150.4.patch

> TopNKey pushdown
> 
>
> Key: HIVE-20150
> URL: https://issues.apache.org/jira/browse/HIVE-20150
> Project: Hive
>  Issue Type: Improvement
>  Components: Physical Optimizer
>Affects Versions: 4.0.0
>Reporter: Teddy Choi
>Assignee: Teddy Choi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-20150.1.patch, HIVE-20150.2.patch, 
> HIVE-20150.4.patch
>
>
> TopNKey operator is implemented in HIVE-17896, but it needs more work in 
> pushdown implementation. So this issue covers TopNKey pushdown implementation 
> with proper tests.





[jira] [Commented] (HIVE-20150) TopNKey pushdown

2018-08-12 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16577620#comment-16577620
 ] 

Hive QA commented on HIVE-20150:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
19s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
8s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
42s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m  
2s{color} | {color:blue} ql in master has 2306 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
59s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
4s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
4s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
39s{color} | {color:red} ql: The patch generated 52 new + 41 unchanged - 0 
fixed = 93 total (was 41) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  4m 
21s{color} | {color:red} ql generated 1 new + 2306 unchanged - 0 fixed = 2307 
total (was 2306) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
0s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
14s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 24m 44s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:ql |
|  |  Switch statement found in 
org.apache.hadoop.hive.ql.optimizer.TopNKeyPushdownProcessor.pushdown(TopNKeyOperator)
 where default case is missing  At TopNKeyPushdownProcessor.java:[lines 94-108] |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-13181/dev-support/hive-personality.sh
 |
| git revision | master / 4a30574 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-13181/yetus/diff-checkstyle-ql.txt
 |
| findbugs | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-13181/yetus/new-findbugs-ql.html
 |
| modules | C: ql U: ql |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-13181/yetus.txt |
| Powered by | Apache Yetus   http://yetus.apache.org |


This message was automatically generated.



> TopNKey pushdown
> 
>
> Key: HIVE-20150
> URL: https://issues.apache.org/jira/browse/HIVE-20150
> Project: Hive
>  Issue Type: Improvement
>  Components: Physical Optimizer
>Affects Versions: 4.0.0
>Reporter: Teddy Choi
>Assignee: Teddy Choi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-20150.1.patch, HIVE-20150.2.patch, 
> HIVE-20150.3.patch
>
>
> TopNKey operator is implemented in HIVE-17896, but it needs more work in 
> pushdown implementation. So this issue covers TopNKey pushdown implementation 
> with proper tests.





[jira] [Updated] (HIVE-20150) TopNKey pushdown

2018-08-12 Thread Teddy Choi (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Teddy Choi updated HIVE-20150:
--
Attachment: HIVE-20150.3.patch

> TopNKey pushdown
> 
>
> Key: HIVE-20150
> URL: https://issues.apache.org/jira/browse/HIVE-20150
> Project: Hive
>  Issue Type: Improvement
>  Components: Physical Optimizer
>Affects Versions: 4.0.0
>Reporter: Teddy Choi
>Assignee: Teddy Choi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-20150.1.patch, HIVE-20150.2.patch, 
> HIVE-20150.3.patch
>
>
> TopNKey operator is implemented in HIVE-17896, but it needs more work in 
> pushdown implementation. So this issue covers TopNKey pushdown implementation 
> with proper tests.





[jira] [Updated] (HIVE-20354) Semijoin hints dont work with merge statements

2018-08-12 Thread Deepak Jaiswal (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Deepak Jaiswal updated HIVE-20354:
--
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed to master and branch-3

Thanks [~ekoifman] for the review.

> Semijoin hints dont work with merge statements
> --
>
> Key: HIVE-20354
> URL: https://issues.apache.org/jira/browse/HIVE-20354
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
>Priority: Major
> Attachments: HIVE-20354.1.patch, HIVE-20354.2.patch, 
> HIVE-20354.3.patch, HIVE-20354.4.patch
>
>
> When a merge statement is rewritten, any comment in the query is ignored, 
> even though it may include hints such as semijoin.
> If a comment contains a hint, it should not be ignored.


