[jira] [Commented] (HIVE-19202) CBO failed due to NullPointerException in HiveAggregate.isBucketedInput()

2018-04-25 Thread Daniel Voros (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16452240#comment-16452240
 ] 

Daniel Voros commented on HIVE-19202:
-

[~qunyan] could you please share an example query that was failing without this 
patch?

> CBO failed due to NullPointerException in HiveAggregate.isBucketedInput()
> -
>
> Key: HIVE-19202
> URL: https://issues.apache.org/jira/browse/HIVE-19202
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Affects Versions: 2.1.1
>Reporter: zhuwei
>Assignee: zhuwei
>Priority: Critical
> Fix For: 3.1.0
>
> Attachments: HIVE-19202.1.patch, HIVE-19202.2.patch
>
>
> I ran a query with a join and a group by using the settings below; CBO failed due 
> to a NullPointerException in HiveAggregate.isBucketedInput():
> set hive.execution.engine=tez;
> set hive.cbo.costmodel.extended=true;
>  
> In class HiveRelMdDistribution, the following functions are implemented:
> public RelDistribution distribution(HiveAggregate aggregate, RelMetadataQuery 
> mq)
> public RelDistribution distribution(HiveJoin join, RelMetadataQuery mq)
>  
> But in HiveAggregate.isBucketedInput, the argument passed to distribution is 
> "this.getInput()", which is incorrect here. The correct argument is "this".
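For context, a minimal hedged sketch of the change the description points at (Calcite's RelMetadataQuery/RelDistribution API is assumed; the method body is illustrative, not the actual Hive source):

{code}
// HiveRelMdDistribution only registers distribution() handlers for HiveAggregate
// and HiveJoin, so asking the metadata provider for the distribution of the
// aggregate's *input* can return null and trigger the NPE described above.
public boolean isBucketedInput() {
  final RelMetadataQuery mq = RelMetadataQuery.instance();
  // before (NPE-prone): RelDistribution d = mq.distribution(this.getInput());
  RelDistribution d = mq.distribution(this);
  return d != null && !d.getKeys().isEmpty();
}
{code}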



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HIVE-22501) Stats reported multiple times during MR execution for UNION queries

2019-11-15 Thread Daniel Voros (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Voros reassigned HIVE-22501:
---


> Stats reported multiple times during MR execution for UNION queries
> ---
>
> Key: HIVE-22501
> URL: https://issues.apache.org/jira/browse/HIVE-22501
> Project: Hive
>  Issue Type: Bug
>Reporter: Daniel Voros
>Assignee: Daniel Voros
>Priority: Major
>  Labels: mapreduce
>
> Take the following example:
> {code}
> set hive.execution.engine=mr;
> create table tb(id string) stored as orc;
> insert into tb values('1');
> create table tb2 like tb stored as orc;
> insert into tb2 select * from tb union all select * from tb;
> {code}
> The last insert results in 2 records in the table, but the 
> {{TOTAL_TABLE_ROWS_WRITTEN}} statistic (and the number of affected rows reported 
> on the console) is 4.
> We seem to traverse the operator graph multiple times, starting from every TS 
> (TableScan) operator, and increment the counters every time we hit the FS 
> (FileSink) operator. UNION-ing the table 3 times results in 9 
> TOTAL_TABLE_ROWS_WRITTEN.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22501) Stats reported multiple times during MR execution for UNION queries

2019-11-15 Thread Daniel Voros (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Voros updated HIVE-22501:

Attachment: HIVE-22501.1.patch

> Stats reported multiple times during MR execution for UNION queries
> ---
>
> Key: HIVE-22501
> URL: https://issues.apache.org/jira/browse/HIVE-22501
> Project: Hive
>  Issue Type: Bug
>Reporter: Daniel Voros
>Assignee: Daniel Voros
>Priority: Major
>  Labels: mapreduce
> Attachments: HIVE-22501.1.patch
>
>
> Take the following example:
> {code}
> set hive.execution.engine=mr;
> create table tb(id string) stored as orc;
> insert into tb values('1');
> create table tb2 like tb stored as orc;
> insert into tb2 select * from tb union all select * from tb;
> {code}
> The last insert results in 2 records in the table, but the 
> {{TOTAL_TABLE_ROWS_WRITTEN}} statistic (and the number of affected rows reported 
> on the console) is 4.
> We seem to traverse the operator graph multiple times, starting from every TS 
> (TableScan) operator, and increment the counters every time we hit the FS 
> (FileSink) operator. UNION-ing the table 3 times results in 9 
> TOTAL_TABLE_ROWS_WRITTEN.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22501) Stats reported multiple times during MR execution for UNION queries

2019-11-15 Thread Daniel Voros (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Voros updated HIVE-22501:

Status: Patch Available  (was: Open)

Attached patch #1 that only lets operators report their stats once.
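To illustrate the idea (a hedged sketch with made-up names, not the attached patch): an operator that publishes runtime stats could remember whether it has already done so, so a second traversal reaching the same FS operator becomes a no-op.

{code}
private transient boolean runtimeStatsPublished = false;

private void publishRowsWritten(Counters counters, long rowsWritten) {
  if (runtimeStatsPublished) {
    return; // already counted on a previous traversal of the operator graph
  }
  counters.findCounter("HIVE", "TOTAL_TABLE_ROWS_WRITTEN").increment(rowsWritten);
  runtimeStatsPublished = true;
}
{code}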

> Stats reported multiple times during MR execution for UNION queries
> ---
>
> Key: HIVE-22501
> URL: https://issues.apache.org/jira/browse/HIVE-22501
> Project: Hive
>  Issue Type: Bug
>Reporter: Daniel Voros
>Assignee: Daniel Voros
>Priority: Major
>  Labels: mapreduce
> Attachments: HIVE-22501.1.patch
>
>
> Take the following example:
> {code}
> set hive.execution.engine=mr;
> create table tb(id string) stored as orc;
> insert into tb values('1');
> create table tb2 like tb stored as orc;
> insert into tb2 select * from tb union all select * from tb;
> {code}
> The last insert results in 2 records in the table, but the 
> {{TOTAL_TABLE_ROWS_WRITTEN}} statistic (and the number of affected rows reported 
> on the console) is 4.
> We seem to traverse the operator graph multiple times, starting from every TS 
> (TableScan) operator, and increment the counters every time we hit the FS 
> (FileSink) operator. UNION-ing the table 3 times results in 9 
> TOTAL_TABLE_ROWS_WRITTEN.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (HIVE-22501) Stats reported multiple times during MR execution for UNION queries

2019-11-15 Thread Daniel Voros (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16975148#comment-16975148
 ] 

Daniel Voros edited comment on HIVE-22501 at 11/15/19 2:46 PM:
---

Attached patch #1 that only lets operators report their stats once.


was (Author: dvoros):
Attached patch #1 that only let's operators report their stats once.

> Stats reported multiple times during MR execution for UNION queries
> ---
>
> Key: HIVE-22501
> URL: https://issues.apache.org/jira/browse/HIVE-22501
> Project: Hive
>  Issue Type: Bug
>Reporter: Daniel Voros
>Assignee: Daniel Voros
>Priority: Major
>  Labels: mapreduce
> Attachments: HIVE-22501.1.patch
>
>
> Take the following example:
> {code}
> set hive.execution.engine=mr;
> create table tb(id string) stored as orc;
> insert into tb values('1');
> create table tb2 like tb stored as orc;
> insert into tb2 select * from tb union all select * from tb;
> {code}
> The last insert results in 2 records in the table, but the 
> {{TOTAL_TABLE_ROWS_WRITTEN}} statistic (and the number of affected rows reported 
> on the console) is 4.
> We seem to traverse the operator graph multiple times, starting from every TS 
> (TableScan) operator, and increment the counters every time we hit the FS 
> (FileSink) operator. UNION-ing the table 3 times results in 9 
> TOTAL_TABLE_ROWS_WRITTEN.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-18858) System properties in job configuration not resolved when submitting MR job

2018-03-19 Thread Daniel Voros (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Voros updated HIVE-18858:

Attachment: HIVE-18858.2.patch

> System properties in job configuration not resolved when submitting MR job
> --
>
> Key: HIVE-18858
> URL: https://issues.apache.org/jira/browse/HIVE-18858
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
> Environment: Hadoop 3.0.0
>Reporter: Daniel Voros
>Assignee: Daniel Voros
>Priority: Major
> Attachments: HIVE-18858.1.patch, HIVE-18858.2.patch
>
>
> Since [this hadoop 
> commit|https://github.com/apache/hadoop/commit/5eb7dbe9b31a45f57f2e1623aa1c9ce84a56c4d1], 
> first released in 3.0.0, Configuration has a restricted mode that disables the 
> resolution of system properties (which normally happens when a configuration 
> option is retrieved).
> This leads to test failures when switching to Hadoop 3.0.0 (instead of 
> 3.0.0-beta1), since we're relying on the [substitution of 
> test.tmp.dir|https://github.com/apache/hive/blob/05d4719eefc56676a3e0e8f706e1c5e5e1f6b345/data/conf/hive-site.xml#L37]
>  during the [maven 
> build|https://github.com/apache/hive/blob/05d4719eefc56676a3e0e8f706e1c5e5e1f6b345/pom.xml#L83].
>  See test results on HIVE-18327.
> When we're passing job configurations to Hadoop, I believe there's no way to 
> disable the restricted mode, since we go through some Hadoop MR calls first, 
> see here:
> {code}
> "HiveServer2-Background-Pool: Thread-105@9500" prio=5 tid=0x69 nid=NA runnable
>   java.lang.Thread.State: RUNNABLE
> at 
> org.apache.hadoop.conf.Configuration.addResourceObject(Configuration.java:970)
> - locked <0x2fe6> (a org.apache.hadoop.mapred.JobConf)
> at 
> org.apache.hadoop.conf.Configuration.addResource(Configuration.java:895)
> at org.apache.hadoop.mapred.JobConf.<init>(JobConf.java:476)
> at 
> org.apache.hadoop.mapred.LocalJobRunner$Job.<init>(LocalJobRunner.java:162)
> at 
> org.apache.hadoop.mapred.LocalJobRunner.submitJob(LocalJobRunner.java:788)
> at 
> org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:254)
> at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1570)
> at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1567)
> at 
> java.security.AccessController.doPrivileged(AccessController.java:-1)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1962)
> at org.apache.hadoop.mapreduce.Job.submit(Job.java:1567)
> at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:576)
> at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:571)
> at 
> java.security.AccessController.doPrivileged(AccessController.java:-1)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1962)
> at 
> org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:571)
> at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:562)
> at 
> org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:415)
> at 
> org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:149)
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:205)
> at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:97)
> at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2314)
> at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1985)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1687)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1438)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1432)
> at 
> org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:248)
> at 
> org.apache.hive.service.cli.operation.SQLOperation.access$700(SQLOperation.java:90)
> at 
> org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:340)
> at 
> java.security.AccessController.doPrivileged(AccessController.java:-1)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1962)
> at 
> org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run(SQLOperation.java:353)
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(Thr

[jira] [Commented] (HIVE-18858) System properties in job configuration not resolved when submitting MR job

2018-03-19 Thread Daniel Voros (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16404969#comment-16404969
 ] 

Daniel Voros commented on HIVE-18858:
-

Attached patch #2. This uses {{Configuration#iterator()}} directly instead of 
{{HiveConf#getProperties()}} to skip the extra conversion. Hadoop version is 
still 3.0.0 to make sure tests will pass. If they do, I'll upload a patch 
without bumping the Hadoop version.
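A minimal sketch of the approach described here, assuming that iterating the configuration and re-reading each key is what triggers variable substitution before the values are handed to MR (illustrative only, not the patch itself):

{code}
// Configuration implements Iterable<Map.Entry<String, String>>, so we can walk
// the raw entries and re-read each key; Configuration#get() performs the ${...}
// substitution that restricted mode would otherwise skip later on.
for (Map.Entry<String, String> entry : hiveConf) {
  String name = entry.getKey();
  String resolved = hiveConf.get(name);   // value with substitution applied
  if (resolved != null) {
    jobConf.set(name, resolved);          // hand the already-resolved value to MR
  }
}
{code}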

> System properties in job configuration not resolved when submitting MR job
> --
>
> Key: HIVE-18858
> URL: https://issues.apache.org/jira/browse/HIVE-18858
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
> Environment: Hadoop 3.0.0
>Reporter: Daniel Voros
>Assignee: Daniel Voros
>Priority: Major
> Attachments: HIVE-18858.1.patch, HIVE-18858.2.patch
>
>
> Since [this hadoop 
> commit|https://github.com/apache/hadoop/commit/5eb7dbe9b31a45f57f2e1623aa1c9ce84a56c4d1], 
> first released in 3.0.0, Configuration has a restricted mode that disables the 
> resolution of system properties (which normally happens when a configuration 
> option is retrieved).
> This leads to test failures when switching to Hadoop 3.0.0 (instead of 
> 3.0.0-beta1), since we're relying on the [substitution of 
> test.tmp.dir|https://github.com/apache/hive/blob/05d4719eefc56676a3e0e8f706e1c5e5e1f6b345/data/conf/hive-site.xml#L37]
>  during the [maven 
> build|https://github.com/apache/hive/blob/05d4719eefc56676a3e0e8f706e1c5e5e1f6b345/pom.xml#L83].
>  See test results on HIVE-18327.
> When we're passing job configurations to Hadoop, I believe there's no way to 
> disable the restricted mode, since we go through some Hadoop MR calls first, 
> see here:
> {code}
> "HiveServer2-Background-Pool: Thread-105@9500" prio=5 tid=0x69 nid=NA runnable
>   java.lang.Thread.State: RUNNABLE
> at 
> org.apache.hadoop.conf.Configuration.addResourceObject(Configuration.java:970)
> - locked <0x2fe6> (a org.apache.hadoop.mapred.JobConf)
> at 
> org.apache.hadoop.conf.Configuration.addResource(Configuration.java:895)
> at org.apache.hadoop.mapred.JobConf.<init>(JobConf.java:476)
> at 
> org.apache.hadoop.mapred.LocalJobRunner$Job.<init>(LocalJobRunner.java:162)
> at 
> org.apache.hadoop.mapred.LocalJobRunner.submitJob(LocalJobRunner.java:788)
> at 
> org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:254)
> at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1570)
> at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1567)
> at 
> java.security.AccessController.doPrivileged(AccessController.java:-1)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1962)
> at org.apache.hadoop.mapreduce.Job.submit(Job.java:1567)
> at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:576)
> at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:571)
> at 
> java.security.AccessController.doPrivileged(AccessController.java:-1)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1962)
> at 
> org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:571)
> at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:562)
> at 
> org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:415)
> at 
> org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:149)
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:205)
> at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:97)
> at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2314)
> at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1985)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1687)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1438)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1432)
> at 
> org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:248)
> at 
> org.apache.hive.service.cli.operation.SQLOperation.access$700(SQLOperation.java:90)
> at 
> org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:340)
> at 
> java.security.AccessController.doPrivileged(AccessController.java:-1)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1962)
> at 
> org.apache.hive.service.cli.op

[jira] [Updated] (HIVE-18858) System properties in job configuration not resolved when submitting MR job

2018-03-20 Thread Daniel Voros (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Voros updated HIVE-18858:

Attachment: HIVE-18858.3.patch

> System properties in job configuration not resolved when submitting MR job
> --
>
> Key: HIVE-18858
> URL: https://issues.apache.org/jira/browse/HIVE-18858
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
> Environment: Hadoop 3.0.0
>Reporter: Daniel Voros
>Assignee: Daniel Voros
>Priority: Major
> Attachments: HIVE-18858.1.patch, HIVE-18858.2.patch, 
> HIVE-18858.3.patch
>
>
> Since [this hadoop 
> commit|https://github.com/apache/hadoop/commit/5eb7dbe9b31a45f57f2e1623aa1c9ce84a56c4d1], 
> first released in 3.0.0, Configuration has a restricted mode that disables the 
> resolution of system properties (which normally happens when a configuration 
> option is retrieved).
> This leads to test failures when switching to Hadoop 3.0.0 (instead of 
> 3.0.0-beta1), since we're relying on the [substitution of 
> test.tmp.dir|https://github.com/apache/hive/blob/05d4719eefc56676a3e0e8f706e1c5e5e1f6b345/data/conf/hive-site.xml#L37]
>  during the [maven 
> build|https://github.com/apache/hive/blob/05d4719eefc56676a3e0e8f706e1c5e5e1f6b345/pom.xml#L83].
>  See test results on HIVE-18327.
> When we're passing job configurations to Hadoop, I believe there's no way to 
> disable the restricted mode, since we go through some Hadoop MR calls first, 
> see here:
> {code}
> "HiveServer2-Background-Pool: Thread-105@9500" prio=5 tid=0x69 nid=NA runnable
>   java.lang.Thread.State: RUNNABLE
> at 
> org.apache.hadoop.conf.Configuration.addResourceObject(Configuration.java:970)
> - locked <0x2fe6> (a org.apache.hadoop.mapred.JobConf)
> at 
> org.apache.hadoop.conf.Configuration.addResource(Configuration.java:895)
> at org.apache.hadoop.mapred.JobConf.<init>(JobConf.java:476)
> at 
> org.apache.hadoop.mapred.LocalJobRunner$Job.<init>(LocalJobRunner.java:162)
> at 
> org.apache.hadoop.mapred.LocalJobRunner.submitJob(LocalJobRunner.java:788)
> at 
> org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:254)
> at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1570)
> at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1567)
> at 
> java.security.AccessController.doPrivileged(AccessController.java:-1)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1962)
> at org.apache.hadoop.mapreduce.Job.submit(Job.java:1567)
> at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:576)
> at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:571)
> at 
> java.security.AccessController.doPrivileged(AccessController.java:-1)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1962)
> at 
> org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:571)
> at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:562)
> at 
> org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:415)
> at 
> org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:149)
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:205)
> at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:97)
> at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2314)
> at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1985)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1687)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1438)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1432)
> at 
> org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:248)
> at 
> org.apache.hive.service.cli.operation.SQLOperation.access$700(SQLOperation.java:90)
> at 
> org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:340)
> at 
> java.security.AccessController.doPrivileged(AccessController.java:-1)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1962)
> at 
> org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run(SQLOperation.java:353)
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at 
> java.util.concurrent.ThreadPoo

[jira] [Commented] (HIVE-18858) System properties in job configuration not resolved when submitting MR job

2018-03-20 Thread Daniel Voros (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16406395#comment-16406395
 ] 

Daniel Voros commented on HIVE-18858:
-

The test failures are unrelated and the affected tests are passing.

Attached patch #3. This is the same as patch #2 without the Hadoop version bump.

[~aihuaxu], [~kgyrtkirk] could you please take a look?

> System properties in job configuration not resolved when submitting MR job
> --
>
> Key: HIVE-18858
> URL: https://issues.apache.org/jira/browse/HIVE-18858
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
> Environment: Hadoop 3.0.0
>Reporter: Daniel Voros
>Assignee: Daniel Voros
>Priority: Major
> Attachments: HIVE-18858.1.patch, HIVE-18858.2.patch, 
> HIVE-18858.3.patch
>
>
> Since [this hadoop 
> commit|https://github.com/apache/hadoop/commit/5eb7dbe9b31a45f57f2e1623aa1c9ce84a56c4d1], 
> first released in 3.0.0, Configuration has a restricted mode that disables the 
> resolution of system properties (which normally happens when a configuration 
> option is retrieved).
> This leads to test failures when switching to Hadoop 3.0.0 (instead of 
> 3.0.0-beta1), since we're relying on the [substitution of 
> test.tmp.dir|https://github.com/apache/hive/blob/05d4719eefc56676a3e0e8f706e1c5e5e1f6b345/data/conf/hive-site.xml#L37]
>  during the [maven 
> build|https://github.com/apache/hive/blob/05d4719eefc56676a3e0e8f706e1c5e5e1f6b345/pom.xml#L83].
>  See test results on HIVE-18327.
> When we're passing job configurations to Hadoop, I believe there's no way to 
> disable the restricted mode, since we go through some Hadoop MR calls first, 
> see here:
> {code}
> "HiveServer2-Background-Pool: Thread-105@9500" prio=5 tid=0x69 nid=NA runnable
>   java.lang.Thread.State: RUNNABLE
> at 
> org.apache.hadoop.conf.Configuration.addResourceObject(Configuration.java:970)
> - locked <0x2fe6> (a org.apache.hadoop.mapred.JobConf)
> at 
> org.apache.hadoop.conf.Configuration.addResource(Configuration.java:895)
> at org.apache.hadoop.mapred.JobConf.<init>(JobConf.java:476)
> at 
> org.apache.hadoop.mapred.LocalJobRunner$Job.<init>(LocalJobRunner.java:162)
> at 
> org.apache.hadoop.mapred.LocalJobRunner.submitJob(LocalJobRunner.java:788)
> at 
> org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:254)
> at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1570)
> at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1567)
> at 
> java.security.AccessController.doPrivileged(AccessController.java:-1)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1962)
> at org.apache.hadoop.mapreduce.Job.submit(Job.java:1567)
> at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:576)
> at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:571)
> at 
> java.security.AccessController.doPrivileged(AccessController.java:-1)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1962)
> at 
> org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:571)
> at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:562)
> at 
> org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:415)
> at 
> org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:149)
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:205)
> at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:97)
> at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2314)
> at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1985)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1687)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1438)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1432)
> at 
> org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:248)
> at 
> org.apache.hive.service.cli.operation.SQLOperation.access$700(SQLOperation.java:90)
> at 
> org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:340)
> at 
> java.security.AccessController.doPrivileged(AccessController.java:-1)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1962)
> at 
> org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run(

[jira] [Commented] (HIVE-18858) System properties in job configuration not resolved when submitting MR job

2018-03-21 Thread Daniel Voros (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16407627#comment-16407627
 ] 

Daniel Voros commented on HIVE-18858:
-

Thank you [~kgyrtkirk]!

> System properties in job configuration not resolved when submitting MR job
> --
>
> Key: HIVE-18858
> URL: https://issues.apache.org/jira/browse/HIVE-18858
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
> Environment: Hadoop 3.0.0
>Reporter: Daniel Voros
>Assignee: Daniel Voros
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HIVE-18858.1.patch, HIVE-18858.2.patch, 
> HIVE-18858.3.patch
>
>
> Since [this hadoop 
> commit|https://github.com/apache/hadoop/commit/5eb7dbe9b31a45f57f2e1623aa1c9ce84a56c4d1], 
> first released in 3.0.0, Configuration has a restricted mode that disables the 
> resolution of system properties (which normally happens when a configuration 
> option is retrieved).
> This leads to test failures when switching to Hadoop 3.0.0 (instead of 
> 3.0.0-beta1), since we're relying on the [substitution of 
> test.tmp.dir|https://github.com/apache/hive/blob/05d4719eefc56676a3e0e8f706e1c5e5e1f6b345/data/conf/hive-site.xml#L37]
>  during the [maven 
> build|https://github.com/apache/hive/blob/05d4719eefc56676a3e0e8f706e1c5e5e1f6b345/pom.xml#L83].
>  See test results on HIVE-18327.
> When we're passing job configurations to Hadoop, I believe there's no way to 
> disable the restricted mode, since we go through some Hadoop MR calls first, 
> see here:
> {code}
> "HiveServer2-Background-Pool: Thread-105@9500" prio=5 tid=0x69 nid=NA runnable
>   java.lang.Thread.State: RUNNABLE
> at 
> org.apache.hadoop.conf.Configuration.addResourceObject(Configuration.java:970)
> - locked <0x2fe6> (a org.apache.hadoop.mapred.JobConf)
> at 
> org.apache.hadoop.conf.Configuration.addResource(Configuration.java:895)
> at org.apache.hadoop.mapred.JobConf.<init>(JobConf.java:476)
> at 
> org.apache.hadoop.mapred.LocalJobRunner$Job.<init>(LocalJobRunner.java:162)
> at 
> org.apache.hadoop.mapred.LocalJobRunner.submitJob(LocalJobRunner.java:788)
> at 
> org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:254)
> at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1570)
> at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1567)
> at 
> java.security.AccessController.doPrivileged(AccessController.java:-1)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1962)
> at org.apache.hadoop.mapreduce.Job.submit(Job.java:1567)
> at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:576)
> at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:571)
> at 
> java.security.AccessController.doPrivileged(AccessController.java:-1)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1962)
> at 
> org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:571)
> at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:562)
> at 
> org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:415)
> at 
> org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:149)
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:205)
> at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:97)
> at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2314)
> at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1985)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1687)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1438)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1432)
> at 
> org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:248)
> at 
> org.apache.hive.service.cli.operation.SQLOperation.access$700(SQLOperation.java:90)
> at 
> org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:340)
> at 
> java.security.AccessController.doPrivileged(AccessController.java:-1)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1962)
> at 
> org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run(SQLOperation.java:353)
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.Futu

[jira] [Assigned] (HIVE-18291) An exception should be raised if the result is outside the range of decimal

2018-03-26 Thread Daniel Voros (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Voros reassigned HIVE-18291:
---

Assignee: (was: Daniel Voros)

> An exception should be raised if the result is outside the range of decimal
> ---
>
> Key: HIVE-18291
> URL: https://issues.apache.org/jira/browse/HIVE-18291
> Project: Hive
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 2.3.0
>Reporter: Marco Gaido
>Priority: Major
>
> Citing SQL:2011 on page 27 available at 
> http://standards.iso.org/ittf/PubliclyAvailableStandards/c053681_ISO_IEC_9075-1_2011.zip:
> {noformat}
> If the result cannot be represented exactly in the result type, then whether 
> it is rounded
> or truncated is implementation-defined. An exception condition is raised if 
> the result is
> outside the range of numeric values of the result type, or if the arithmetic 
> operation
> is not defined for the operands.
> {noformat}
> Currently Hive returns NULL instead of throwing an exception if the result is 
> out of range, e.g.:
> {code}
> > select 100.01*100.01;
> +---+
> |  _c0  |
> +---+
> | NULL  |
> +---+
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HIVE-18646) Update errata.txt for HIVE-18617

2018-02-07 Thread Daniel Voros (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Voros reassigned HIVE-18646:
---


> Update errata.txt for HIVE-18617
> 
>
> Key: HIVE-18646
> URL: https://issues.apache.org/jira/browse/HIVE-18646
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Daniel Voros
>Assignee: Daniel Voros
>Priority: Trivial
>
> HIVE-18617 was committed as HIVE-18671.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-18646) Update errata.txt for HIVE-18617

2018-02-07 Thread Daniel Voros (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Voros updated HIVE-18646:

Status: Patch Available  (was: Open)

Attached patch #1. This adds the line for HIVE-18617 to errata.txt.

> Update errata.txt for HIVE-18617
> 
>
> Key: HIVE-18646
> URL: https://issues.apache.org/jira/browse/HIVE-18646
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Daniel Voros
>Assignee: Daniel Voros
>Priority: Trivial
> Attachments: HIVE-18646.1.patch
>
>
> HIVE-18617 was committed as HIVE-18671.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-18646) Update errata.txt for HIVE-18617

2018-02-07 Thread Daniel Voros (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Voros updated HIVE-18646:

Attachment: HIVE-18646.1.patch

> Update errata.txt for HIVE-18617
> 
>
> Key: HIVE-18646
> URL: https://issues.apache.org/jira/browse/HIVE-18646
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Daniel Voros
>Assignee: Daniel Voros
>Priority: Trivial
> Attachments: HIVE-18646.1.patch
>
>
> HIVE-18617 was committed as HIVE-18671.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-18327) Remove the unnecessary HiveConf dependency for MiniHiveKdc

2018-02-22 Thread Daniel Voros (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16372588#comment-16372588
 ] 

Daniel Voros commented on HIVE-18327:
-

This ticket is necessary to fix the test failures introduced by switching from 
Hadoop 3.0.0-beta1 to 3.0.0 (see HADOOP-15100). I'm linking this to the upgrade 
ticket (HIVE-18319).

> Remove the unnecessary HiveConf dependency for MiniHiveKdc
> --
>
> Key: HIVE-18327
> URL: https://issues.apache.org/jira/browse/HIVE-18327
> Project: Hive
>  Issue Type: Test
>  Components: Test
>Affects Versions: 3.0.0
>Reporter: Aihua Xu
>Priority: Major
>
> MiniHiveKdc takes HiveConf as an input parameter although it's not needed. Remove 
> the unnecessary HiveConf.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HIVE-18327) Remove the unnecessary HiveConf dependency for MiniHiveKdc

2018-02-22 Thread Daniel Voros (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Voros reassigned HIVE-18327:
---

Assignee: Daniel Voros

[~aihuaxu] hope you don't mind if I pick this up, let me assign to myself.

> Remove the unnecessary HiveConf dependency for MiniHiveKdc
> --
>
> Key: HIVE-18327
> URL: https://issues.apache.org/jira/browse/HIVE-18327
> Project: Hive
>  Issue Type: Test
>  Components: Test
>Affects Versions: 3.0.0
>Reporter: Aihua Xu
>Assignee: Daniel Voros
>Priority: Major
>
> MiniHiveKdc takes HiveConf as an input parameter although it's not needed. Remove 
> the unnecessary HiveConf.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-18327) Remove the unnecessary HiveConf dependency for MiniHiveKdc

2018-02-22 Thread Daniel Voros (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Voros updated HIVE-18327:

Attachment: HIVE-18327.1.patch

> Remove the unnecessary HiveConf dependency for MiniHiveKdc
> --
>
> Key: HIVE-18327
> URL: https://issues.apache.org/jira/browse/HIVE-18327
> Project: Hive
>  Issue Type: Test
>  Components: Test
>Affects Versions: 3.0.0
>Reporter: Aihua Xu
>Assignee: Daniel Voros
>Priority: Major
> Attachments: HIVE-18327.1.patch
>
>
> MiniHiveKdc takes HiveConf as an input parameter although it's not needed. Remove 
> the unnecessary HiveConf.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-18327) Remove the unnecessary HiveConf dependency for MiniHiveKdc

2018-02-22 Thread Daniel Voros (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Voros updated HIVE-18327:

Status: Patch Available  (was: Open)

Attached patch #1. This removes HiveConf from MiniHiveKdc and adds a loginUser() 
call at the end of MiniHiveKdc's constructor, to prevent the errors described in 
HADOOP-15100 when updating to Hadoop 3.0.0. To verify this, this first patch also 
updates the Hadoop dependency to 3.0.0, but I don't plan on including that in the 
final patch.
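For illustration only, a rough sketch of the constructor change described above (the helper and constant names are assumptions, not the actual patch):

{code}
// at the end of MiniHiveKdc's constructor, once the KDC and test principals exist:
// log the default test user in right away so UGI is initialized consistently,
// regardless of which test touches UserGroupInformation first (see HADOOP-15100)
loginUser(HIVE_TEST_USER_1);
{code}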

> Remove the unnecessary HiveConf dependency for MiniHiveKdc
> --
>
> Key: HIVE-18327
> URL: https://issues.apache.org/jira/browse/HIVE-18327
> Project: Hive
>  Issue Type: Test
>  Components: Test
>Affects Versions: 3.0.0
>Reporter: Aihua Xu
>Assignee: Daniel Voros
>Priority: Major
> Attachments: HIVE-18327.1.patch
>
>
> MiniHiveKdc takes HiveConf as an input parameter although it's not needed. Remove 
> the unnecessary HiveConf.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HIVE-18784) TestJdbcWithMiniKdcSQLAuthBinary runs with HTTP transport mode instead of binary

2018-02-23 Thread Daniel Voros (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Voros reassigned HIVE-18784:
---


> TestJdbcWithMiniKdcSQLAuthBinary runs with HTTP transport mode instead of 
> binary
> 
>
> Key: HIVE-18784
> URL: https://issues.apache.org/jira/browse/HIVE-18784
> Project: Hive
>  Issue Type: Test
>Affects Versions: 3.0.0
>Reporter: Daniel Voros
>Assignee: Daniel Voros
>Priority: Major
>  Labels: test
>
> TestJdbcWithMiniKdcSQLAuthHttp should run HTTP and 
> TestJdbcWithMiniKdcSQLAuthBinary should run binary, but currently they're 
> both using HTTP.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-18784) TestJdbcWithMiniKdcSQLAuthBinary runs with HTTP transport mode instead of binary

2018-02-23 Thread Daniel Voros (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Voros updated HIVE-18784:

Attachment: HIVE-18784.1.patch

> TestJdbcWithMiniKdcSQLAuthBinary runs with HTTP transport mode instead of 
> binary
> 
>
> Key: HIVE-18784
> URL: https://issues.apache.org/jira/browse/HIVE-18784
> Project: Hive
>  Issue Type: Test
>Affects Versions: 3.0.0
>Reporter: Daniel Voros
>Assignee: Daniel Voros
>Priority: Major
>  Labels: test
> Attachments: HIVE-18784.1.patch
>
>
> TestJdbcWithMiniKdcSQLAuthHttp should run HTTP and 
> TestJdbcWithMiniKdcSQLAuthBinary should run binary, but currently they're 
> both using HTTP.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-18784) TestJdbcWithMiniKdcSQLAuthBinary runs with HTTP transport mode instead of binary

2018-02-23 Thread Daniel Voros (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Voros updated HIVE-18784:

Status: Patch Available  (was: Open)

Attached patch #1. This changes the transport mode to binary in 
TestJdbcWithMiniKdcSQLAuthBinary.
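For reference, a hedged sketch of what such a one-line change could look like, assuming the test configures the transport mode through HiveConf (not necessarily the actual patch):

{code}
// force the binary Thrift transport instead of HTTP for this test
conf.setVar(HiveConf.ConfVars.HIVE_SERVER2_TRANSPORT_MODE, "binary");
{code}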

> TestJdbcWithMiniKdcSQLAuthBinary runs with HTTP transport mode instead of 
> binary
> 
>
> Key: HIVE-18784
> URL: https://issues.apache.org/jira/browse/HIVE-18784
> Project: Hive
>  Issue Type: Test
>Affects Versions: 3.0.0
>Reporter: Daniel Voros
>Assignee: Daniel Voros
>Priority: Major
>  Labels: test
> Attachments: HIVE-18784.1.patch
>
>
> TestJdbcWithMiniKdcSQLAuthHttp should run HTTP and 
> TestJdbcWithMiniKdcSQLAuthBinary should run binary, but currently they're 
> both using HTTP.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-18327) Remove the unnecessary HiveConf dependency for MiniHiveKdc

2018-02-23 Thread Daniel Voros (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Voros updated HIVE-18327:

Attachment: HIVE-18327.2.patch

> Remove the unnecessary HiveConf dependency for MiniHiveKdc
> --
>
> Key: HIVE-18327
> URL: https://issues.apache.org/jira/browse/HIVE-18327
> Project: Hive
>  Issue Type: Test
>  Components: Test
>Affects Versions: 3.0.0
>Reporter: Aihua Xu
>Assignee: Daniel Voros
>Priority: Major
> Attachments: HIVE-18327.1.patch, HIVE-18327.2.patch
>
>
> MiniHiveKdc takes HiveConf as an input parameter although it's not needed. Remove 
> the unnecessary HiveConf.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-18327) Remove the unnecessary HiveConf dependency for MiniHiveKdc

2018-02-23 Thread Daniel Voros (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16374169#comment-16374169
 ] 

Daniel Voros commented on HIVE-18327:
-

Attached patch #2. This fixes the checkstyle issues and does not upgrade the 
Hadoop dependency.

The affected tests have all passed with Hadoop 3.0.0. The newly failing tests in 
the following classes all fail with similar errors ({{output/file.out.index does 
not exist}}):
* TestMetaStoreLimitPartitionRequest
* TestAutoPurgeTables
* TestEmbeddedThriftBinaryCLIService
* TestJdbcWithMiniHS2

An example stack trace from {{TestMetaStoreLimitPartitionRequest}}:

{code}
2018-02-22T15:12:54,852  WARN [Thread-1485] mapred.LocalJobRunner: 
job_local1550446283_0016
java.lang.Exception: 
org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in shuffle 
in localfetcher#8
at 
org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:492) 
~[hadoop-mapreduce-client-common-3.0.0.jar:?]
at 
org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:559) 
[hadoop-mapreduce-client-common-3.0.0.jar:?]
Caused by: org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error 
in shuffle in localfetcher#8
at 
org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:134) 
~[hadoop-mapreduce-client-core-3.0.0.jar:?]
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:377) 
~[hadoop-mapreduce-client-core-3.0.0.jar:?]
at 
org.apache.hadoop.mapred.LocalJobRunner$Job$ReduceTaskRunnable.run(LocalJobRunner.java:347)
 ~[hadoop-mapreduce-client-common-3.0.0.jar:?]
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
~[?:1.8.0_102]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
~[?:1.8.0_102]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) 
~[?:1.8.0_102]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) 
~[?:1.8.0_102]
at java.lang.Thread.run(Thread.java:745) ~[?:1.8.0_102]
Caused by: java.io.FileNotFoundException: File 
/home/hiveptest/35.188.186.120-hiveptest-0/apache-github-source-source/itests/hive-unit/$%7Btest.tmp.dir%7D/hadoop-tmp/mapred/local/localRunner/hiveptest/jobcache/job_local1550446283_0016/attempt_local1550446283_0016_m_00_0/output/file.out.index
 does not exist
at 
org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:641)
 ~[hadoop-common-3.0.0.jar:?]
at 
org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:867)
 ~[hadoop-common-3.0.0.jar:?]
at 
org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:631)
 ~[hadoop-common-3.0.0.jar:?]
at 
org.apache.hadoop.fs.RawLocalFileSystem.open(RawLocalFileSystem.java:211) 
~[hadoop-common-3.0.0.jar:?]
at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:949) 
~[hadoop-common-3.0.0.jar:?]
at 
org.apache.hadoop.io.SecureIOUtils.openFSDataInputStream(SecureIOUtils.java:152)
 ~[hadoop-common-3.0.0.jar:?]
at org.apache.hadoop.mapred.SpillRecord.<init>(SpillRecord.java:71) 
~[hadoop-mapreduce-client-core-3.0.0.jar:?]
at org.apache.hadoop.mapred.SpillRecord.<init>(SpillRecord.java:62) 
~[hadoop-mapreduce-client-core-3.0.0.jar:?]
at org.apache.hadoop.mapred.SpillRecord.<init>(SpillRecord.java:57) 
~[hadoop-mapreduce-client-core-3.0.0.jar:?]
at 
org.apache.hadoop.mapreduce.task.reduce.LocalFetcher.copyMapOutput(LocalFetcher.java:125)
 ~[hadoop-mapreduce-client-core-3.0.0.jar:?]
at 
org.apache.hadoop.mapreduce.task.reduce.LocalFetcher.doCopy(LocalFetcher.java:103)
 ~[hadoop-mapreduce-client-core-3.0.0.jar:?]
at 
org.apache.hadoop.mapreduce.task.reduce.LocalFetcher.run(LocalFetcher.java:86) 
~[hadoop-mapreduce-client-core-3.0.0.jar:?]
{code}

There's no MiniHiveKdc involved in these tests, so I believe this must be a 
separate issue (it might be caused by the same Hadoop commit, though I'm not sure 
yet). I'll continue investigating and will probably open a new ticket to deal 
with these failures.

> Remove the unnecessary HiveConf dependency for MiniHiveKdc
> --
>
> Key: HIVE-18327
> URL: https://issues.apache.org/jira/browse/HIVE-18327
> Project: Hive
>  Issue Type: Test
>  Components: Test
>Affects Versions: 3.0.0
>Reporter: Aihua Xu
>Assignee: Daniel Voros
>Priority: Major
> Attachments: HIVE-18327.1.patch, HIVE-18327.2.patch
>
>
> MiniHiveKdc takes HiveConf as an input parameter although it's not needed. Remove 
> the unnecessary HiveConf.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-18784) TestJdbcWithMiniKdcSQLAuthBinary runs with HTTP transport mode instead of binary

2018-02-24 Thread Daniel Voros (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Voros updated HIVE-18784:

Resolution: Duplicate
Status: Resolved  (was: Patch Available)

Thank you [~kgyrtkirk]! I've actually included this in HIVE-18327 as well, 
since that touches the same lines. I'll close this as duplicate, sorry for the 
inconvenience!

> TestJdbcWithMiniKdcSQLAuthBinary runs with HTTP transport mode instead of 
> binary
> 
>
> Key: HIVE-18784
> URL: https://issues.apache.org/jira/browse/HIVE-18784
> Project: Hive
>  Issue Type: Test
>Affects Versions: 3.0.0
>Reporter: Daniel Voros
>Assignee: Daniel Voros
>Priority: Major
>  Labels: test
> Attachments: HIVE-18784.1.patch
>
>
> TestJdbcWithMiniKdcSQLAuthHttp should run HTTP and 
> TestJdbcWithMiniKdcSQLAuthBinary should run binary, but currently they're 
> both using HTTP.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-17684) HoS memory issues with MapJoinMemoryExhaustionHandler

2018-02-26 Thread Daniel Voros (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16376846#comment-16376846
 ] 

Daniel Voros commented on HIVE-17684:
-

[~aihuaxu], [~stakiar] are the test failures due to Hadoop not resolving system 
properties something you're actively working on? Should I open a new ticket for 
that?

> HoS memory issues with MapJoinMemoryExhaustionHandler
> -
>
> Key: HIVE-17684
> URL: https://issues.apache.org/jira/browse/HIVE-17684
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Misha Dmitriev
>Priority: Major
> Attachments: HIVE-17684.01.patch, HIVE-17684.02.patch
>
>
> We have seen a number of memory issues due to the {{HashSinkOperator}}'s use of 
> the {{MapJoinMemoryExhaustionHandler}}. This handler is meant to detect 
> scenarios where the small table is taking too much space in memory, in which 
> case a {{MapJoinMemoryExhaustionError}} is thrown.
> The configs to control this logic are:
> {{hive.mapjoin.localtask.max.memory.usage}} (default 0.90)
> {{hive.mapjoin.followby.gby.localtask.max.memory.usage}} (default 0.55)
> The handler works by using the {{MemoryMXBean}} and uses the following logic 
> to estimate how much memory the {{HashMap}} is consuming: 
> {{MemoryMXBean#getHeapMemoryUsage().getUsed() / 
> MemoryMXBean#getHeapMemoryUsage().getMax()}}
> The issue is that {{MemoryMXBean#getHeapMemoryUsage().getUsed()}} can be 
> inaccurate. The value returned by this method returns all reachable and 
> unreachable memory on the heap, so there may be a bunch of garbage data, and 
> the JVM just hasn't taken the time to reclaim it all. This can lead to 
> intermittent failures of this check even though a simple GC would have 
> reclaimed enough space for the process to continue working.
> We should re-think the usage of {{MapJoinMemoryExhaustionHandler}} for HoS. 
> In Hive-on-MR this probably made sense to use because every Hive task was run 
> in a dedicated container, so a Hive Task could assume it created most of the 
> data on the heap. However, in Hive-on-Spark there can be multiple Hive Tasks 
> running in a single executor, each doing different things.
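For readers unfamiliar with the check being discussed, a small self-contained sketch of the heap-usage ratio the description refers to (illustrative only; the threshold and behaviour are simplified):

{code}
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;

public class HeapUsageRatio {
  public static void main(String[] args) {
    MemoryMXBean bean = ManagementFactory.getMemoryMXBean();
    double used = bean.getHeapMemoryUsage().getUsed();  // includes unreachable, not-yet-collected objects
    double max = bean.getHeapMemoryUsage().getMax();
    double ratio = used / max;
    double threshold = 0.90;  // default of hive.mapjoin.localtask.max.memory.usage
    if (ratio > threshold) {
      // the handler would raise MapJoinMemoryExhaustionError here, even though a GC
      // might have reclaimed enough space for the task to keep going
      System.out.printf("heap usage %.2f exceeds threshold %.2f%n", ratio, threshold);
    } else {
      System.out.printf("heap usage %.2f within threshold %.2f%n", ratio, threshold);
    }
  }
}
{code}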



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-18327) Remove the unnecessary HiveConf dependency for MiniHiveKdc

2018-02-26 Thread Daniel Voros (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16376951#comment-16376951
 ] 

Daniel Voros commented on HIVE-18327:
-

Thank you for your review [~ashutoshc] and thanks [~kgyrtkirk] for checking the 
failures. I agree, these were not caused by the patch.

> Remove the unnecessary HiveConf dependency for MiniHiveKdc
> --
>
> Key: HIVE-18327
> URL: https://issues.apache.org/jira/browse/HIVE-18327
> Project: Hive
>  Issue Type: Test
>  Components: Test
>Affects Versions: 3.0.0
>Reporter: Aihua Xu
>Assignee: Daniel Voros
>Priority: Major
> Attachments: HIVE-18327.1.patch, HIVE-18327.2.patch
>
>
> MiniHiveKdc takes HiveConf as an input parameter although it's not needed. Remove 
> the unnecessary HiveConf.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HIVE-18858) System properties in job configuration not resolved when submitting MR job

2018-03-05 Thread Daniel Voros (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Voros reassigned HIVE-18858:
---


> System properties in job configuration not resolved when submitting MR job
> --
>
> Key: HIVE-18858
> URL: https://issues.apache.org/jira/browse/HIVE-18858
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
> Environment: Hadoop 3.0.0
>Reporter: Daniel Voros
>Assignee: Daniel Voros
>Priority: Major
>
> Since [this hadoop 
> commit|https://github.com/apache/hadoop/commit/5eb7dbe9b31a45f57f2e1623aa1c9ce84a56c4d1], 
> first released in 3.0.0, Configuration has a restricted mode that disables the 
> resolution of system properties (which normally happens when a configuration 
> option is retrieved).
> This leads to test failures when switching to Hadoop 3.0.0 (instead of 
> 3.0.0-beta1), since we're relying on the [substitution of 
> test.tmp.dir|https://github.com/apache/hive/blob/05d4719eefc56676a3e0e8f706e1c5e5e1f6b345/data/conf/hive-site.xml#L37]
>  during the [maven 
> build|https://github.com/apache/hive/blob/05d4719eefc56676a3e0e8f706e1c5e5e1f6b345/pom.xml#L83].
>  See test results on HIVE-18327.
> When we're passing job configurations to Hadoop, I believe there's no way to 
> disable the restricted mode, since we go through some Hadoop MR calls first, 
> see here:
> {code}
> "HiveServer2-Background-Pool: Thread-105@9500" prio=5 tid=0x69 nid=NA runnable
>   java.lang.Thread.State: RUNNABLE
> at 
> org.apache.hadoop.conf.Configuration.addResourceObject(Configuration.java:970)
> - locked <0x2fe6> (a org.apache.hadoop.mapred.JobConf)
> at 
> org.apache.hadoop.conf.Configuration.addResource(Configuration.java:895)
> at org.apache.hadoop.mapred.JobConf.<init>(JobConf.java:476)
> at 
> org.apache.hadoop.mapred.LocalJobRunner$Job.<init>(LocalJobRunner.java:162)
> at 
> org.apache.hadoop.mapred.LocalJobRunner.submitJob(LocalJobRunner.java:788)
> at 
> org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:254)
> at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1570)
> at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1567)
> at 
> java.security.AccessController.doPrivileged(AccessController.java:-1)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1962)
> at org.apache.hadoop.mapreduce.Job.submit(Job.java:1567)
> at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:576)
> at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:571)
> at 
> java.security.AccessController.doPrivileged(AccessController.java:-1)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1962)
> at 
> org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:571)
> at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:562)
> at 
> org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:415)
> at 
> org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:149)
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:205)
> at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:97)
> at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2314)
> at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1985)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1687)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1438)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1432)
> at 
> org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:248)
> at 
> org.apache.hive.service.cli.operation.SQLOperation.access$700(SQLOperation.java:90)
> at 
> org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:340)
> at 
> java.security.AccessController.doPrivileged(AccessController.java:-1)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1962)
> at 
> org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run(SQLOperation.java:353)
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.ru

[jira] [Updated] (HIVE-18858) System properties in job configuration not resolved when submitting MR job

2018-03-05 Thread Daniel Voros (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Voros updated HIVE-18858:

Attachment: HIVE-18858.1.patch

> System properties in job configuration not resolved when submitting MR job
> --
>
> Key: HIVE-18858
> URL: https://issues.apache.org/jira/browse/HIVE-18858
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
> Environment: Hadoop 3.0.0
>Reporter: Daniel Voros
>Assignee: Daniel Voros
>Priority: Major
> Attachments: HIVE-18858.1.patch
>
>
> Since [this hadoop 
> commit|https://github.com/apache/hadoop/commit/5eb7dbe9b31a45f57f2e1623aa1c9ce84a56c4d1], 
> first released in 3.0.0, Configuration has a restricted mode that disables the 
> resolution of system properties (which normally happens when a configuration 
> option is retrieved).
> This leads to test failures when switching to Hadoop 3.0.0 (instead of 
> 3.0.0-beta1), since we're relying on the [substitution of 
> test.tmp.dir|https://github.com/apache/hive/blob/05d4719eefc56676a3e0e8f706e1c5e5e1f6b345/data/conf/hive-site.xml#L37]
>  during the [maven 
> build|https://github.com/apache/hive/blob/05d4719eefc56676a3e0e8f706e1c5e5e1f6b345/pom.xml#L83].
>  See test results on HIVE-18327.
> When we're passing job configurations to Hadoop, I believe there's no way to 
> disable the restricted mode, since we go through some Hadoop MR calls first, 
> see here:
> {code}
> "HiveServer2-Background-Pool: Thread-105@9500" prio=5 tid=0x69 nid=NA runnable
>   java.lang.Thread.State: RUNNABLE
> at 
> org.apache.hadoop.conf.Configuration.addResourceObject(Configuration.java:970)
> - locked <0x2fe6> (a org.apache.hadoop.mapred.JobConf)
> at 
> org.apache.hadoop.conf.Configuration.addResource(Configuration.java:895)
> at org.apache.hadoop.mapred.JobConf.<init>(JobConf.java:476)
> at 
> org.apache.hadoop.mapred.LocalJobRunner$Job.<init>(LocalJobRunner.java:162)
> at 
> org.apache.hadoop.mapred.LocalJobRunner.submitJob(LocalJobRunner.java:788)
> at 
> org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:254)
> at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1570)
> at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1567)
> at 
> java.security.AccessController.doPrivileged(AccessController.java:-1)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1962)
> at org.apache.hadoop.mapreduce.Job.submit(Job.java:1567)
> at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:576)
> at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:571)
> at 
> java.security.AccessController.doPrivileged(AccessController.java:-1)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1962)
> at 
> org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:571)
> at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:562)
> at 
> org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:415)
> at 
> org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:149)
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:205)
> at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:97)
> at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2314)
> at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1985)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1687)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1438)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1432)
> at 
> org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:248)
> at 
> org.apache.hive.service.cli.operation.SQLOperation.access$700(SQLOperation.java:90)
> at 
> org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:340)
> at 
> java.security.AccessController.doPrivileged(AccessController.java:-1)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1962)
> at 
> org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run(SQLOperation.java:353)
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java

[jira] [Updated] (HIVE-18858) System properties in job configuration not resolved when submitting MR job

2018-03-05 Thread Daniel Voros (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Voros updated HIVE-18858:

Status: Patch Available  (was: Open)

Attached patch #1. This resolves all variables before passing the configuration 
to MR and also updates the Hadoop dependency to 3.0.0 to see if it helps the 
tests. In HIVE-18327 these test cases were failing due to this issue; I'm 
expecting them to pass with this patch in place:

{code}
FAILED org.apache.hadoop.hive.ql.TestAutoPurgeTables#testAutoPurge
FAILED org.apache.hadoop.hive.ql.TestAutoPurgeTables#testNoAutoPurge
FAILED 
org.apache.hadoop.hive.ql.TestAutoPurgeTables#testExternalPartitionedTable
FAILED 
org.apache.hadoop.hive.ql.TestAutoPurgeTables#testTruncatePartitionedNoAutoPurge
FAILED 
org.apache.hadoop.hive.ql.TestAutoPurgeTables#testTruncatePartitionedAutoPurge
FAILED org.apache.hadoop.hive.ql.TestAutoPurgeTables#testAutoPurgeUnset
FAILED org.apache.hadoop.hive.ql.TestAutoPurgeTables#testTruncateNoAutoPurge
FAILED org.apache.hadoop.hive.ql.TestAutoPurgeTables#testAutoPurgeInvalid
FAILED org.apache.hadoop.hive.ql.TestAutoPurgeTables#testPartitionedNoAutoPurge
FAILED org.apache.hadoop.hive.ql.TestAutoPurgeTables#testExternalNoAutoPurge
FAILED org.apache.hadoop.hive.ql.TestAutoPurgeTables#testTruncateUnsetAutoPurge
FAILED org.apache.hadoop.hive.ql.TestAutoPurgeTables#testExternalTable
FAILED org.apache.hadoop.hive.ql.TestAutoPurgeTables#testPartitionedTable
FAILED 
org.apache.hadoop.hive.ql.TestAutoPurgeTables#testPartitionedExternalNoAutoPurge
FAILED 
org.apache.hadoop.hive.ql.TestAutoPurgeTables#testTruncateInvalidAutoPurge
FAILED 
org.apache.hadoop.hive.ql.TestMetaStoreLimitPartitionRequest#testMoreComplexQueryWithDirectSqlTooManyPartitions
FAILED 
org.apache.hadoop.hive.ql.TestMetaStoreLimitPartitionRequest#testSimpleQueryWithDirectSqlTooManyPartitions
FAILED 
org.apache.hadoop.hive.ql.TestMetaStoreLimitPartitionRequest#testQueryWithLikeWithFallbackToORMTooManyPartitions
FAILED 
org.apache.hadoop.hive.ql.TestMetaStoreLimitPartitionRequest#testQueryWithInWithFallbackToORMTooManyPartitions2
FAILED 
org.apache.hadoop.hive.ql.TestMetaStoreLimitPartitionRequest#testQueryWithFallbackToORM1
FAILED 
org.apache.hadoop.hive.ql.TestMetaStoreLimitPartitionRequest#testQueryWithFallbackToORM2
FAILED 
org.apache.hadoop.hive.ql.TestMetaStoreLimitPartitionRequest#testQueryWithFallbackToORM3
FAILED 
org.apache.hadoop.hive.ql.TestMetaStoreLimitPartitionRequest#testMoreComplexQueryWithDirectSql
FAILED 
org.apache.hadoop.hive.ql.TestMetaStoreLimitPartitionRequest#testQueryWithFallbackToORMTooManyPartitions1
FAILED 
org.apache.hadoop.hive.ql.TestMetaStoreLimitPartitionRequest#testQueryWithFallbackToORMTooManyPartitions2
FAILED 
org.apache.hadoop.hive.ql.TestMetaStoreLimitPartitionRequest#testQueryWithFallbackToORMTooManyPartitions3
FAILED 
org.apache.hadoop.hive.ql.TestMetaStoreLimitPartitionRequest#testQueryWithFallbackToORMTooManyPartitions4
FAILED 
org.apache.hadoop.hive.ql.TestMetaStoreLimitPartitionRequest#testQueryWithInWithFallbackToORM
FAILED 
org.apache.hadoop.hive.ql.TestMetaStoreLimitPartitionRequest#testSimpleQueryWithDirectSql
FAILED 
org.apache.hadoop.hive.ql.TestMetaStoreLimitPartitionRequest#testQueryWithInWithFallbackToORMTooManyPartitions
FAILED org.apache.hive.jdbc.TestJdbcWithMiniHS2#testSelectThriftSerializeInTasks
FAILED 
org.apache.hive.jdbc.TestJdbcWithMiniHS2#testEmptyResultsetThriftSerializeInTasks
FAILED org.apache.hive.jdbc.TestJdbcWithMiniHS2#testParallelCompilation2
FAILED org.apache.hive.jdbc.TestJdbcWithMiniHS2#testJoinThriftSerializeInTasks
FAILED org.apache.hive.jdbc.TestJdbcWithMiniHS2#testParallelCompilation
FAILED org.apache.hive.jdbc.TestJdbcWithMiniHS2#testConcurrentStatements
FAILED 
org.apache.hive.jdbc.TestJdbcWithMiniHS2#testFloatCast2DoubleThriftSerializeInTasks
FAILED org.apache.hive.jdbc.TestJdbcWithMiniHS2#testEnableThriftSerializeInTasks
FAILED 
org.apache.hive.service.cli.TestEmbeddedThriftBinaryCLIService#testExecuteStatementParallel
{code}
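
For reference, the gist of the first part of the patch (resolving variables 
before handing the configuration to MR) can be sketched like this; a minimal 
illustration with a made-up helper name, not the committed change:

{code:java}
import java.util.HashMap;
import java.util.Map;

import org.apache.hadoop.conf.Configuration;

public final class ResolveVarsSketch {
  /**
   * Writes every entry back with its resolved value. Configuration#get()
   * expands ${...} references, so setting the returned value "freezes" the
   * substitution while we still hold the non-restricted Hive-side conf.
   */
  public static void resolveAll(Configuration conf) {
    Map<String, String> resolved = new HashMap<>();
    for (Map.Entry<String, String> entry : conf) {
      resolved.put(entry.getKey(), conf.get(entry.getKey()));
    }
    for (Map.Entry<String, String> entry : resolved.entrySet()) {
      if (entry.getValue() != null) {
        conf.set(entry.getKey(), entry.getValue());
      }
    }
  }
}
{code}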

> System properties in job configuration not resolved when submitting MR job
> --
>
> Key: HIVE-18858
> URL: https://issues.apache.org/jira/browse/HIVE-18858
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
> Environment: Hadoop 3.0.0
>Reporter: Daniel Voros
>Assignee: Daniel Voros
>Priority: Major
> Attachments: HIVE-18858.1.patch
>
>
> Since [this hadoop 
> commit|https://github.com/apache/hadoop/commit/5eb7dbe9b31a45f57f2e1623aa1c9ce84a56c4d1]
>  that was first released in 3.0.0, Configuration has a restricted mode that 
> disables the resolution of system properties (that happens when retrieving a 
> configuration option).
> This leads to test failures when switching to Hadoop 3.0.0 (instead of 
> 3.0.0-beta1), since 

[jira] [Commented] (HIVE-17684) HoS memory issues with MapJoinMemoryExhaustionHandler

2018-03-05 Thread Daniel Voros (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16385978#comment-16385978
 ] 

Daniel Voros commented on HIVE-17684:
-

I've opened HIVE-18858 to track the issues caused by Hadoop not resolving 
system properties.

> HoS memory issues with MapJoinMemoryExhaustionHandler
> -
>
> Key: HIVE-17684
> URL: https://issues.apache.org/jira/browse/HIVE-17684
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Misha Dmitriev
>Priority: Major
> Attachments: HIVE-17684.01.patch, HIVE-17684.02.patch
>
>
> We have seen a number of memory issues due the {{HashSinkOperator}} use of 
> the {{MapJoinMemoryExhaustionHandler}}. This handler is meant to detect 
> scenarios where the small table is taking too much space in memory, in which 
> case a {{MapJoinMemoryExhaustionError}} is thrown.
> The configs to control this logic are:
> {{hive.mapjoin.localtask.max.memory.usage}} (default 0.90)
> {{hive.mapjoin.followby.gby.localtask.max.memory.usage}} (default 0.55)
> The handler works by using the {{MemoryMXBean}} and uses the following logic 
> to estimate how much memory the {{HashMap}} is consuming: 
> {{MemoryMXBean#getHeapMemoryUsage().getUsed() / 
> MemoryMXBean#getHeapMemoryUsage().getMax()}}
> The issue is that {{MemoryMXBean#getHeapMemoryUsage().getUsed()}} can be 
> inaccurate. The value returned by this method returns all reachable and 
> unreachable memory on the heap, so there may be a bunch of garbage data, and 
> the JVM just hasn't taken the time to reclaim it all. This can lead to 
> intermittent failures of this check even though a simple GC would have 
> reclaimed enough space for the process to continue working.
> We should re-think the usage of {{MapJoinMemoryExhaustionHandler}} for HoS. 
> In Hive-on-MR this probably made sense to use because every Hive task was run 
> in a dedicated container, so a Hive Task could assume it created most of the 
> data on the heap. However, in Hive-on-Spark there can be multiple Hive Tasks 
> running in a single executor, each doing different things.
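
The estimation described above boils down to a single ratio from the JVM's 
MemoryMXBean; a small self-contained snippet (illustrative only, this is not 
the Hive code) shows why it is fragile:

{code:java}
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

public class HeapUsageRatio {
  public static void main(String[] args) {
    MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
    // getUsed() counts reachable *and* unreachable objects, so this ratio can
    // cross a threshold like 0.90 even when a GC would reclaim most of it.
    double ratio = (double) heap.getUsed() / heap.getMax();
    System.out.printf("heap usage ratio: %.2f%n", ratio);
  }
}
{code}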



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-18858) System properties in job configuration not resolved when submitting MR job

2018-03-08 Thread Daniel Voros (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16390890#comment-16390890
 ] 

Daniel Voros commented on HIVE-18858:
-

Thank you for reviewing, [~kgyrtkirk]! I think we might as well change 
[HiveConf#getProperties()|https://github.com/apache/hive/blob/073dc88083650c414f10b9a9511008f1a68e8282/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java#L4448]
 to call 
[Configuration#getProps()|https://github.com/apache/hadoop/blob/583f4594314b3db25b57b1e46ea8026eab21f932/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/conf/Configuration.java#L2797]
 directly. Currently it calls 
[Configuration.iterator()|https://github.com/apache/hadoop/blob/583f4594314b3db25b57b1e46ea8026eab21f932/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/conf/Configuration.java#L2844]
 and we end up doing a {{Properties->Map->Properties}} conversion.

[~aihuaxu] is it the getAllProperties() part that had you concerned, or the 
resolution of every variable?
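
To make that concrete: a rough sketch only, where the method name and placement 
are assumptions rather than the actual HiveConf code. Since 
Configuration#getProps() is protected, a subclass can hand back the underlying 
Properties object directly:

{code:java}
import java.util.Properties;

import org.apache.hadoop.conf.Configuration;

public class PropsSketch extends Configuration {
  /** Illustrative only: return the backing Properties without copying. */
  public Properties getAllProperties() {
    // avoids the Properties -> Map -> Properties copy done via iterator()
    return getProps();
  }
}
{code}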

> System properties in job configuration not resolved when submitting MR job
> --
>
> Key: HIVE-18858
> URL: https://issues.apache.org/jira/browse/HIVE-18858
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
> Environment: Hadoop 3.0.0
>Reporter: Daniel Voros
>Assignee: Daniel Voros
>Priority: Major
> Attachments: HIVE-18858.1.patch
>
>
> Since [this hadoop 
> commit|https://github.com/apache/hadoop/commit/5eb7dbe9b31a45f57f2e1623aa1c9ce84a56c4d1]
>  that was first released in 3.0.0, Configuration has a restricted mode that 
> disables the resolution of system properties (that happens when retrieving a 
> configuration option).
> This leads to test failures when switching to Hadoop 3.0.0 (instead of 
> 3.0.0-beta1), since we're relying on the [substitution of 
> test.tmp.dir|https://github.com/apache/hive/blob/05d4719eefc56676a3e0e8f706e1c5e5e1f6b345/data/conf/hive-site.xml#L37]
>  during the [maven 
> build|https://github.com/apache/hive/blob/05d4719eefc56676a3e0e8f706e1c5e5e1f6b345/pom.xml#L83].
>  See test results on HIVE-18327.
> When we're passing job configurations to Hadoop, I believe there's no way to 
> disable the restricted mode, since we go through some Hadoop MR calls first, 
> see here:
> {code}
> "HiveServer2-Background-Pool: Thread-105@9500" prio=5 tid=0x69 nid=NA runnable
>   java.lang.Thread.State: RUNNABLE
> at 
> org.apache.hadoop.conf.Configuration.addResourceObject(Configuration.java:970)
> - locked <0x2fe6> (a org.apache.hadoop.mapred.JobConf)
> at 
> org.apache.hadoop.conf.Configuration.addResource(Configuration.java:895)
> at org.apache.hadoop.mapred.JobConf.(JobConf.java:476)
> at 
> org.apache.hadoop.mapred.LocalJobRunner$Job.(LocalJobRunner.java:162)
> at 
> org.apache.hadoop.mapred.LocalJobRunner.submitJob(LocalJobRunner.java:788)
> at 
> org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:254)
> at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1570)
> at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1567)
> at 
> java.security.AccessController.doPrivileged(AccessController.java:-1)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1962)
> at org.apache.hadoop.mapreduce.Job.submit(Job.java:1567)
> at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:576)
> at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:571)
> at 
> java.security.AccessController.doPrivileged(AccessController.java:-1)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1962)
> at 
> org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:571)
> at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:562)
> at 
> org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:415)
> at 
> org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:149)
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:205)
> at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:97)
> at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2314)
> at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1985)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1687)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1438)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1432)
> at 
> org.apache.hive.ser

[jira] [Updated] (HIVE-20022) Upgrade hadoop.version to 3.1.1

2018-08-14 Thread Daniel Voros (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Voros updated HIVE-20022:

Attachment: HIVE-20022.2.patch

> Upgrade hadoop.version to 3.1.1
> ---
>
> Key: HIVE-20022
> URL: https://issues.apache.org/jira/browse/HIVE-20022
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 3.0.0
>Reporter: Daniel Voros
>Assignee: Daniel Voros
>Priority: Blocker
> Attachments: HIVE-20022.1.patch, HIVE-20022.2.patch
>
>
> HIVE-19304 is relying on YARN-7142 and YARN-8122 that will only be released 
> in Hadoop 3.1.1. We should upgrade when possible.
> cc [~gsaha]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20022) Upgrade hadoop.version to 3.1.1

2018-08-14 Thread Daniel Voros (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16579572#comment-16579572
 ] 

Daniel Voros commented on HIVE-20022:
-

Re-uploaded the same patch as patch #2 to trigger a new test run. Hopefully the 
local repo has been updated in the meantime.

> Upgrade hadoop.version to 3.1.1
> ---
>
> Key: HIVE-20022
> URL: https://issues.apache.org/jira/browse/HIVE-20022
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 3.0.0
>Reporter: Daniel Voros
>Assignee: Daniel Voros
>Priority: Blocker
> Attachments: HIVE-20022.1.patch, HIVE-20022.2.patch
>
>
> HIVE-19304 is relying on YARN-7142 and YARN-8122 that will only be released 
> in Hadoop 3.1.1. We should upgrade when possible.
> cc [~gsaha]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20191) PreCommit patch application doesn't fail if patch is empty

2018-08-14 Thread Daniel Voros (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Voros updated HIVE-20191:

Attachment: HIVE-20191.1.patch

> PreCommit patch application doesn't fail if patch is empty
> --
>
> Key: HIVE-20191
> URL: https://issues.apache.org/jira/browse/HIVE-20191
> Project: Hive
>  Issue Type: Bug
>  Components: Testing Infrastructure
>Reporter: Daniel Voros
>Assignee: Daniel Voros
>Priority: Major
> Attachments: HIVE-20191.1.patch
>
>
> I've created some backport tickets to branch-3 (e.g. HIVE-20181) and made the 
> mistake of uploading the patch files with wrong filename ({{.}} instead of 
> {{-}} between version and branch).
> These get applied on master, where they're already present, since {{git 
> apply}} with {{-3}} won't fail if patch is already there. Tests are run on 
> master instead of failing.
> I think the patch application should fail if the patch is empty and branch 
> selection logic should probably fail too if the patch name is malformed.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20191) PreCommit patch application doesn't fail if patch is empty

2018-08-14 Thread Daniel Voros (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Voros updated HIVE-20191:

Status: Patch Available  (was: Open)

Attached patch #1. This changes the patch filename pattern to accept both '-' 
and '.' before the branch name and also makes the patch application fail if the 
patch is empty.
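
For illustration only (the real PreCommit scripts aren't reproduced here, and 
the exact pattern in the patch may differ), the filename matching change 
amounts to accepting either separator before the branch name:

{code:java}
import java.util.regex.Pattern;

public class PatchNamePatternDemo {
  // Accept both HIVE-20181.1-branch-3.patch and HIVE-20181.1.branch-3.patch
  static final Pattern BRANCH_PATCH =
      Pattern.compile("HIVE-\\d+(?:\\.\\d+)?[-.](branch-[\\w.\\-]+)\\.patch");

  public static void main(String[] args) {
    System.out.println(BRANCH_PATCH.matcher("HIVE-20181.1-branch-3.patch").matches()); // true
    System.out.println(BRANCH_PATCH.matcher("HIVE-20181.1.branch-3.patch").matches()); // true
    System.out.println(BRANCH_PATCH.matcher("HIVE-20181.1.patch").matches());          // false
  }
}
{code}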

> PreCommit patch application doesn't fail if patch is empty
> --
>
> Key: HIVE-20191
> URL: https://issues.apache.org/jira/browse/HIVE-20191
> Project: Hive
>  Issue Type: Bug
>  Components: Testing Infrastructure
>Reporter: Daniel Voros
>Assignee: Daniel Voros
>Priority: Major
> Attachments: HIVE-20191.1.patch
>
>
> I've created some backport tickets to branch-3 (e.g. HIVE-20181) and made the 
> mistake of uploading the patch files with wrong filename ({{.}} instead of 
> {{-}} between version and branch).
> These get applied on master, where they're already present, since {{git 
> apply}} with {{-3}} won't fail if patch is already there. Tests are run on 
> master instead of failing.
> I think the patch application should fail if the patch is empty and branch 
> selection logic should probably fail too if the patch name is malformed.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20191) PreCommit patch application doesn't fail if patch is empty

2018-08-21 Thread Daniel Voros (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16587632#comment-16587632
 ] 

Daniel Voros commented on HIVE-20191:
-

[~vihangk1], [~kgyrtkirk] this is what we discussed on the dev list earlier. 
If you could take a look, that would be much appreciated!

> PreCommit patch application doesn't fail if patch is empty
> --
>
> Key: HIVE-20191
> URL: https://issues.apache.org/jira/browse/HIVE-20191
> Project: Hive
>  Issue Type: Bug
>  Components: Testing Infrastructure
>Reporter: Daniel Voros
>Assignee: Daniel Voros
>Priority: Major
> Attachments: HIVE-20191.1.patch
>
>
> I've created some backport tickets to branch-3 (e.g. HIVE-20181) and made the 
> mistake of uploading the patch files with wrong filename ({{.}} instead of 
> {{-}} between version and branch).
> These get applied on master, where they're already present, since {{git 
> apply}} with {{-3}} won't fail if patch is already there. Tests are run on 
> master instead of failing.
> I think the patch application should fail if the patch is empty and branch 
> selection logic should probably fail too if the patch name is malformed.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20022) Upgrade hadoop.version to 3.1.1

2018-09-03 Thread Daniel Voros (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Voros updated HIVE-20022:

Attachment: HIVE-20022.3.patch

> Upgrade hadoop.version to 3.1.1
> ---
>
> Key: HIVE-20022
> URL: https://issues.apache.org/jira/browse/HIVE-20022
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 3.0.0
>Reporter: Daniel Voros
>Assignee: Daniel Voros
>Priority: Blocker
> Attachments: HIVE-20022.1.patch, HIVE-20022.2.patch, 
> HIVE-20022.3.patch
>
>
> HIVE-19304 is relying on YARN-7142 and YARN-8122 that will only be released 
> in Hadoop 3.1.1. We should upgrade when possible.
> cc [~gsaha]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20022) Upgrade hadoop.version to 3.1.1

2018-09-03 Thread Daniel Voros (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16602182#comment-16602182
 ] 

Daniel Voros commented on HIVE-20022:
-

Attached for another try. The build passes locally with an empty local repo and 
no special repo settings.

> Upgrade hadoop.version to 3.1.1
> ---
>
> Key: HIVE-20022
> URL: https://issues.apache.org/jira/browse/HIVE-20022
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 3.0.0
>Reporter: Daniel Voros
>Assignee: Daniel Voros
>Priority: Blocker
> Attachments: HIVE-20022.1.patch, HIVE-20022.2.patch, 
> HIVE-20022.3.patch
>
>
> HIVE-19304 is relying on YARN-7142 and YARN-8122 that will only be released 
> in Hadoop 3.1.1. We should upgrade when possible.
> cc [~gsaha]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HIVE-20498) Support date type for column stats autogather

2018-09-04 Thread Daniel Voros (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Voros reassigned HIVE-20498:
---

Assignee: Daniel Voros

> Support date type for column stats autogather
> -
>
> Key: HIVE-20498
> URL: https://issues.apache.org/jira/browse/HIVE-20498
> Project: Hive
>  Issue Type: Sub-task
>  Components: Statistics
>Reporter: Zoltan Haindrich
>Assignee: Daniel Voros
>Priority: Major
>
> {code}
> set hive.stats.column.autogather=true;
> create table dx2(a int,b int,d date);
> explain insert into dx2 values(1,1,'2011-11-11');
> -- no compute_stats calls
> insert into dx2 values(1,1,'2011-11-11');
> insert into dx2 values(1,1,'2001-11-11');
> explain analyze table dx2 compute statistics for columns;
> -- as expected; has compute_stats calls
> analyze table dx2 compute statistics for columns;
> -- runs ok
> desc formatted dx2 d;
> -- looks good
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20498) Support date type for column stats autogather

2018-09-05 Thread Daniel Voros (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Voros updated HIVE-20498:

Attachment: HIVE-20498.1.patch

> Support date type for column stats autogather
> -
>
> Key: HIVE-20498
> URL: https://issues.apache.org/jira/browse/HIVE-20498
> Project: Hive
>  Issue Type: Sub-task
>  Components: Statistics
>Reporter: Zoltan Haindrich
>Assignee: Daniel Voros
>Priority: Major
> Attachments: HIVE-20498.1.patch
>
>
> {code}
> set hive.stats.column.autogather=true;
> create table dx2(a int,b int,d date);
> explain insert into dx2 values(1,1,'2011-11-11');
> -- no compute_stats calls
> insert into dx2 values(1,1,'2011-11-11');
> insert into dx2 values(1,1,'2001-11-11');
> explain analyze table dx2 compute statistics for columns;
> -- as expected; has compute_stats calls
> analyze table dx2 compute statistics for columns;
> -- runs ok
> desc formatted dx2 d;
> -- looks good
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20498) Support date type for column stats autogather

2018-09-05 Thread Daniel Voros (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Voros updated HIVE-20498:

Status: Patch Available  (was: Open)

Attached patch #1. This adds DATE to the list of supported types for autogather 
column stats.
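
The shape of the change, as a sketch only; the real check lives elsewhere in 
Hive and isn't reproduced here, and the class and constant names below are made 
up:

{code:java}
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class AutogatherTypesSketch {
  // Illustrative whitelist of primitive type names eligible for automatic
  // column statistics collection; the patch adds "date" to such a list.
  static final Set<String> SUPPORTED = new HashSet<>(Arrays.asList(
      "tinyint", "smallint", "int", "bigint", "float", "double", "decimal",
      "string", "varchar", "char", "boolean", "binary", "timestamp",
      "date" // newly supported
  ));

  static boolean canAutogather(String typeName) {
    return SUPPORTED.contains(typeName.toLowerCase());
  }
}
{code}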

> Support date type for column stats autogather
> -
>
> Key: HIVE-20498
> URL: https://issues.apache.org/jira/browse/HIVE-20498
> Project: Hive
>  Issue Type: Sub-task
>  Components: Statistics
>Reporter: Zoltan Haindrich
>Assignee: Daniel Voros
>Priority: Major
> Attachments: HIVE-20498.1.patch
>
>
> {code}
> set hive.stats.column.autogather=true;
> create table dx2(a int,b int,d date);
> explain insert into dx2 values(1,1,'2011-11-11');
> -- no compute_stats calls
> insert into dx2 values(1,1,'2011-11-11');
> insert into dx2 values(1,1,'2001-11-11');
> explain analyze table dx2 compute statistics for columns;
> -- as expected; has compute_stats calls
> analyze table dx2 compute statistics for columns;
> -- runs ok
> desc formatted dx2 d;
> -- looks good
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HIVE-20502) Fix NPE while running skewjoin_mapjoin10.q when column stats is used.

2018-09-05 Thread Daniel Voros (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Voros reassigned HIVE-20502:
---

Assignee: Daniel Voros  (was: Zoltan Haindrich)

> Fix NPE while running skewjoin_mapjoin10.q when column stats is used.
> -
>
> Key: HIVE-20502
> URL: https://issues.apache.org/jira/browse/HIVE-20502
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics
>Reporter: Zoltan Haindrich
>Assignee: Daniel Voros
>Priority: Major
>
> Enabling {{hive.stats.fetch.column.stats}} makes this test fail during:
> {code}
> EXPLAIN
> SELECT a.*, b.* FROM T1_n151 a RIGHT OUTER JOIN T2_n88 b ON a.key = b.key
> {code}
> Seems like joinKeys is null at [this 
> point|https://github.com/apache/hive/blob/48f92c31dee3983f573f2e66baaa213a0196f1ba/ql/src/java/org/apache/hadoop/hive/ql/optimizer/stats/annotation/StatsRulesProcFactory.java#L2169]
> Exception:
> {code}
> 2018-09-04T23:47:02,398 DEBUG [fef236ce-e62e-4c20-b0c0-3b15d2b336f7 main] 
> annotation.StatsRulesProcFactory: STATS-JOIN[15]: detects none/multiple PK 
> parents.
> 2018-09-04T23:47:02,409 ERROR [fef236ce-e62e-4c20-b0c0-3b15d2b336f7 main] 
> ql.Driver: FAILED: NullPointerException null
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hive.ql.optimizer.stats.annotation.StatsRulesProcFactory$JoinStatsRule.isJoinKey(StatsRulesProcFactory.java:2169)
> at 
> org.apache.hadoop.hive.ql.optimizer.stats.annotation.StatsRulesProcFactory$JoinStatsRule.updateNumNulls(StatsRulesProcFactory.java:2210)
> at 
> org.apache.hadoop.hive.ql.optimizer.stats.annotation.StatsRulesProcFactory$JoinStatsRule.updateColStats(StatsRulesProcFactory.java:2276)
> at 
> org.apache.hadoop.hive.ql.optimizer.stats.annotation.StatsRulesProcFactory$JoinStatsRule.process(StatsRulesProcFactory.java:1785)
> at 
> org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
> at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:105)
> at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:89)
> at 
> org.apache.hadoop.hive.ql.lib.LevelOrderWalker.walk(LevelOrderWalker.java:143)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20502) Fix NPE while running skewjoin_mapjoin10.q when column stats is used.

2018-09-05 Thread Daniel Voros (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Voros updated HIVE-20502:

Attachment: HIVE-20502.1.patch

> Fix NPE while running skewjoin_mapjoin10.q when column stats is used.
> -
>
> Key: HIVE-20502
> URL: https://issues.apache.org/jira/browse/HIVE-20502
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics
>Reporter: Zoltan Haindrich
>Assignee: Daniel Voros
>Priority: Major
> Attachments: HIVE-20502.1.patch
>
>
> Enabling {{hive.stats.fetch.column.stats}} makes this test fail during:
> {code}
> EXPLAIN
> SELECT a.*, b.* FROM T1_n151 a RIGHT OUTER JOIN T2_n88 b ON a.key = b.key
> {code}
> Seems like joinKeys is null at [this 
> point|https://github.com/apache/hive/blob/48f92c31dee3983f573f2e66baaa213a0196f1ba/ql/src/java/org/apache/hadoop/hive/ql/optimizer/stats/annotation/StatsRulesProcFactory.java#L2169]
> Exception:
> {code}
> 2018-09-04T23:47:02,398 DEBUG [fef236ce-e62e-4c20-b0c0-3b15d2b336f7 main] 
> annotation.StatsRulesProcFactory: STATS-JOIN[15]: detects none/multiple PK 
> parents.
> 2018-09-04T23:47:02,409 ERROR [fef236ce-e62e-4c20-b0c0-3b15d2b336f7 main] 
> ql.Driver: FAILED: NullPointerException null
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hive.ql.optimizer.stats.annotation.StatsRulesProcFactory$JoinStatsRule.isJoinKey(StatsRulesProcFactory.java:2169)
> at 
> org.apache.hadoop.hive.ql.optimizer.stats.annotation.StatsRulesProcFactory$JoinStatsRule.updateNumNulls(StatsRulesProcFactory.java:2210)
> at 
> org.apache.hadoop.hive.ql.optimizer.stats.annotation.StatsRulesProcFactory$JoinStatsRule.updateColStats(StatsRulesProcFactory.java:2276)
> at 
> org.apache.hadoop.hive.ql.optimizer.stats.annotation.StatsRulesProcFactory$JoinStatsRule.process(StatsRulesProcFactory.java:1785)
> at 
> org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
> at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:105)
> at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:89)
> at 
> org.apache.hadoop.hive.ql.lib.LevelOrderWalker.walk(LevelOrderWalker.java:143)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20502) Fix NPE while running skewjoin_mapjoin10.q when column stats is used.

2018-09-05 Thread Daniel Voros (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Voros updated HIVE-20502:

Status: Patch Available  (was: Open)

Attached patch #1. joinKeys was not cloned when splitting the JOIN into two 
JOINs in SkewJoinOptimizer, which led to an NPE when processing the clone.
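
The bug class in a self-contained form (this is not the Hive code; the class 
and field names are stand-ins for JoinDesc/joinKeys):

{code:java}
public class ShallowSplitBugDemo {
  static class JoinDescLike {
    String[][] joinKeys; // stands in for the joinKeys discussed above

    JoinDescLike splitCopy() {
      JoinDescLike copy = new JoinDescLike();
      // The fix: carry joinKeys over to the copy. Without this line the copy's
      // joinKeys stays null and a later consumer (the stats rule) NPEs on it.
      copy.joinKeys = this.joinKeys == null ? null : this.joinKeys.clone();
      return copy;
    }
  }

  public static void main(String[] args) {
    JoinDescLike original = new JoinDescLike();
    original.joinKeys = new String[][] {{"a.key"}, {"b.key"}};
    System.out.println(original.splitCopy().joinKeys.length); // 2, no NPE
  }
}
{code}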

> Fix NPE while running skewjoin_mapjoin10.q when column stats is used.
> -
>
> Key: HIVE-20502
> URL: https://issues.apache.org/jira/browse/HIVE-20502
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics
>Reporter: Zoltan Haindrich
>Assignee: Daniel Voros
>Priority: Major
> Attachments: HIVE-20502.1.patch
>
>
> Enabling {{hive.stats.fetch.column.stats}} makes this test fail during:
> {code}
> EXPLAIN
> SELECT a.*, b.* FROM T1_n151 a RIGHT OUTER JOIN T2_n88 b ON a.key = b.key
> {code}
> Seems like joinKeys is null at [this 
> point|https://github.com/apache/hive/blob/48f92c31dee3983f573f2e66baaa213a0196f1ba/ql/src/java/org/apache/hadoop/hive/ql/optimizer/stats/annotation/StatsRulesProcFactory.java#L2169]
> Exception:
> {code}
> 2018-09-04T23:47:02,398 DEBUG [fef236ce-e62e-4c20-b0c0-3b15d2b336f7 main] 
> annotation.StatsRulesProcFactory: STATS-JOIN[15]: detects none/multiple PK 
> parents.
> 2018-09-04T23:47:02,409 ERROR [fef236ce-e62e-4c20-b0c0-3b15d2b336f7 main] 
> ql.Driver: FAILED: NullPointerException null
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hive.ql.optimizer.stats.annotation.StatsRulesProcFactory$JoinStatsRule.isJoinKey(StatsRulesProcFactory.java:2169)
> at 
> org.apache.hadoop.hive.ql.optimizer.stats.annotation.StatsRulesProcFactory$JoinStatsRule.updateNumNulls(StatsRulesProcFactory.java:2210)
> at 
> org.apache.hadoop.hive.ql.optimizer.stats.annotation.StatsRulesProcFactory$JoinStatsRule.updateColStats(StatsRulesProcFactory.java:2276)
> at 
> org.apache.hadoop.hive.ql.optimizer.stats.annotation.StatsRulesProcFactory$JoinStatsRule.process(StatsRulesProcFactory.java:1785)
> at 
> org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
> at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:105)
> at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:89)
> at 
> org.apache.hadoop.hive.ql.lib.LevelOrderWalker.walk(LevelOrderWalker.java:143)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20502) Fix NPE while running skewjoin_mapjoin10.q when column stats is used.

2018-09-06 Thread Daniel Voros (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Voros updated HIVE-20502:

Attachment: HIVE-20502.2.patch

> Fix NPE while running skewjoin_mapjoin10.q when column stats is used.
> -
>
> Key: HIVE-20502
> URL: https://issues.apache.org/jira/browse/HIVE-20502
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics
>Reporter: Zoltan Haindrich
>Assignee: Daniel Voros
>Priority: Major
> Attachments: HIVE-20502.1.patch, HIVE-20502.2.patch
>
>
> Enabling {{hive.stats.fetch.column.stats}} makes this test fail during:
> {code}
> EXPLAIN
> SELECT a.*, b.* FROM T1_n151 a RIGHT OUTER JOIN T2_n88 b ON a.key = b.key
> {code}
> Seems like joinKeys is null at [this 
> point|https://github.com/apache/hive/blob/48f92c31dee3983f573f2e66baaa213a0196f1ba/ql/src/java/org/apache/hadoop/hive/ql/optimizer/stats/annotation/StatsRulesProcFactory.java#L2169]
> Exception:
> {code}
> 2018-09-04T23:47:02,398 DEBUG [fef236ce-e62e-4c20-b0c0-3b15d2b336f7 main] 
> annotation.StatsRulesProcFactory: STATS-JOIN[15]: detects none/multiple PK 
> parents.
> 2018-09-04T23:47:02,409 ERROR [fef236ce-e62e-4c20-b0c0-3b15d2b336f7 main] 
> ql.Driver: FAILED: NullPointerException null
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hive.ql.optimizer.stats.annotation.StatsRulesProcFactory$JoinStatsRule.isJoinKey(StatsRulesProcFactory.java:2169)
> at 
> org.apache.hadoop.hive.ql.optimizer.stats.annotation.StatsRulesProcFactory$JoinStatsRule.updateNumNulls(StatsRulesProcFactory.java:2210)
> at 
> org.apache.hadoop.hive.ql.optimizer.stats.annotation.StatsRulesProcFactory$JoinStatsRule.updateColStats(StatsRulesProcFactory.java:2276)
> at 
> org.apache.hadoop.hive.ql.optimizer.stats.annotation.StatsRulesProcFactory$JoinStatsRule.process(StatsRulesProcFactory.java:1785)
> at 
> org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
> at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:105)
> at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:89)
> at 
> org.apache.hadoop.hive.ql.lib.LevelOrderWalker.walk(LevelOrderWalker.java:143)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20502) Fix NPE while running skewjoin_mapjoin10.q when column stats is used.

2018-09-06 Thread Daniel Voros (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16605525#comment-16605525
 ] 

Daniel Voros commented on HIVE-20502:
-

Attached patch #2 that updates the related q.out files. The checkFast3estimations 
failure is unrelated; I was unable to reproduce it locally.

> Fix NPE while running skewjoin_mapjoin10.q when column stats is used.
> -
>
> Key: HIVE-20502
> URL: https://issues.apache.org/jira/browse/HIVE-20502
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics
>Reporter: Zoltan Haindrich
>Assignee: Daniel Voros
>Priority: Major
> Attachments: HIVE-20502.1.patch, HIVE-20502.2.patch
>
>
> Enabling {{hive.stats.fetch.column.stats}} makes this test fail during:
> {code}
> EXPLAIN
> SELECT a.*, b.* FROM T1_n151 a RIGHT OUTER JOIN T2_n88 b ON a.key = b.key
> {code}
> Seems like joinKeys is null at [this 
> point|https://github.com/apache/hive/blob/48f92c31dee3983f573f2e66baaa213a0196f1ba/ql/src/java/org/apache/hadoop/hive/ql/optimizer/stats/annotation/StatsRulesProcFactory.java#L2169]
> Exception:
> {code}
> 2018-09-04T23:47:02,398 DEBUG [fef236ce-e62e-4c20-b0c0-3b15d2b336f7 main] 
> annotation.StatsRulesProcFactory: STATS-JOIN[15]: detects none/multiple PK 
> parents.
> 2018-09-04T23:47:02,409 ERROR [fef236ce-e62e-4c20-b0c0-3b15d2b336f7 main] 
> ql.Driver: FAILED: NullPointerException null
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hive.ql.optimizer.stats.annotation.StatsRulesProcFactory$JoinStatsRule.isJoinKey(StatsRulesProcFactory.java:2169)
> at 
> org.apache.hadoop.hive.ql.optimizer.stats.annotation.StatsRulesProcFactory$JoinStatsRule.updateNumNulls(StatsRulesProcFactory.java:2210)
> at 
> org.apache.hadoop.hive.ql.optimizer.stats.annotation.StatsRulesProcFactory$JoinStatsRule.updateColStats(StatsRulesProcFactory.java:2276)
> at 
> org.apache.hadoop.hive.ql.optimizer.stats.annotation.StatsRulesProcFactory$JoinStatsRule.process(StatsRulesProcFactory.java:1785)
> at 
> org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
> at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:105)
> at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:89)
> at 
> org.apache.hadoop.hive.ql.lib.LevelOrderWalker.walk(LevelOrderWalker.java:143)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20502) Fix NPE while running skewjoin_mapjoin10.q when column stats is used.

2018-09-10 Thread Daniel Voros (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16609209#comment-16609209
 ] 

Daniel Voros commented on HIVE-20502:
-

Thanks  [~kgyrtkirk] for the review and for rerunning the tests!

> Fix NPE while running skewjoin_mapjoin10.q when column stats is used.
> -
>
> Key: HIVE-20502
> URL: https://issues.apache.org/jira/browse/HIVE-20502
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics
>Reporter: Zoltan Haindrich
>Assignee: Daniel Voros
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-20502.1.patch, HIVE-20502.2.patch, 
> HIVE-20502.2.patch, HIVE-20502.2.patch
>
>
> Enabling {{hive.stats.fetch.column.stats}} makes this test fail during:
> {code}
> EXPLAIN
> SELECT a.*, b.* FROM T1_n151 a RIGHT OUTER JOIN T2_n88 b ON a.key = b.key
> {code}
> Seems like joinKeys is null at [this 
> point|https://github.com/apache/hive/blob/48f92c31dee3983f573f2e66baaa213a0196f1ba/ql/src/java/org/apache/hadoop/hive/ql/optimizer/stats/annotation/StatsRulesProcFactory.java#L2169]
> Exception:
> {code}
> 2018-09-04T23:47:02,398 DEBUG [fef236ce-e62e-4c20-b0c0-3b15d2b336f7 main] 
> annotation.StatsRulesProcFactory: STATS-JOIN[15]: detects none/multiple PK 
> parents.
> 2018-09-04T23:47:02,409 ERROR [fef236ce-e62e-4c20-b0c0-3b15d2b336f7 main] 
> ql.Driver: FAILED: NullPointerException null
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hive.ql.optimizer.stats.annotation.StatsRulesProcFactory$JoinStatsRule.isJoinKey(StatsRulesProcFactory.java:2169)
> at 
> org.apache.hadoop.hive.ql.optimizer.stats.annotation.StatsRulesProcFactory$JoinStatsRule.updateNumNulls(StatsRulesProcFactory.java:2210)
> at 
> org.apache.hadoop.hive.ql.optimizer.stats.annotation.StatsRulesProcFactory$JoinStatsRule.updateColStats(StatsRulesProcFactory.java:2276)
> at 
> org.apache.hadoop.hive.ql.optimizer.stats.annotation.StatsRulesProcFactory$JoinStatsRule.process(StatsRulesProcFactory.java:1785)
> at 
> org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
> at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:105)
> at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:89)
> at 
> org.apache.hadoop.hive.ql.lib.LevelOrderWalker.walk(LevelOrderWalker.java:143)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HIVE-20586) Beeline is asking for user/pass when invoked without -u

2018-09-18 Thread Daniel Voros (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Voros reassigned HIVE-20586:
---


> Beeline is asking for user/pass when invoked without -u
> ---
>
> Key: HIVE-20586
> URL: https://issues.apache.org/jira/browse/HIVE-20586
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 3.1.0, 3.0.0
>Reporter: Daniel Voros
>Assignee: Daniel Voros
>Priority: Major
>
> Since HIVE-18963 it's possible to define a default connection URL in 
> beeline-site.xml to be able to use beeline without specifying the HS2 JDBC 
> URL.
> When invoked with no arguments, beeline is asking for username/password on 
> the command line. When running with {{-u}} and the exact same URL as in 
> beeline-site.xml, it does not ask for username/password.
> I think these two should do exactly the same, given that the URL after {{-u}} 
> is the same as in beeline-site.xml:
> {code:java}
> beeline -u URL
> beeline
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20022) Upgrade hadoop.version to 3.1.1

2018-09-19 Thread Daniel Voros (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Voros updated HIVE-20022:

Attachment: HIVE-20022.3.patch

> Upgrade hadoop.version to 3.1.1
> ---
>
> Key: HIVE-20022
> URL: https://issues.apache.org/jira/browse/HIVE-20022
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 3.0.0
>Reporter: Daniel Voros
>Assignee: Daniel Voros
>Priority: Blocker
> Attachments: HIVE-20022.1.patch, HIVE-20022.2.patch, 
> HIVE-20022.3.patch, HIVE-20022.3.patch
>
>
> HIVE-19304 is relying on YARN-7142 and YARN-8122 that will only be released 
> in Hadoop 3.1.1. We should upgrade when possible.
> cc [~gsaha]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20191) PreCommit patch application doesn't fail if patch is empty

2018-10-02 Thread Daniel Voros (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16635161#comment-16635161
 ] 

Daniel Voros commented on HIVE-20191:
-

Updated [https://cwiki.apache.org/confluence/display/Hive/HowToContribute], 
thanks again [~kgyrtkirk]!

> PreCommit patch application doesn't fail if patch is empty
> --
>
> Key: HIVE-20191
> URL: https://issues.apache.org/jira/browse/HIVE-20191
> Project: Hive
>  Issue Type: Bug
>  Components: Testing Infrastructure
>Reporter: Daniel Voros
>Assignee: Daniel Voros
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-20191.1.patch
>
>
> I've created some backport tickets to branch-3 (e.g. HIVE-20181) and made the 
> mistake of uploading the patch files with wrong filename ({{.}} instead of 
> {{-}} between version and branch).
> These get applied on master, where they're already present, since {{git 
> apply}} with {{-3}} won't fail if patch is already there. Tests are run on 
> master instead of failing.
> I think the patch application should fail if the patch is empty and branch 
> selection logic should probably fail too if the patch name is malformed.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HIVE-18291) An exception should be raised if the result is outside the range of decimal

2017-12-18 Thread Daniel Voros (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Voros reassigned HIVE-18291:
---

Assignee: Daniel Voros

> An exception should be raised if the result is outside the range of decimal
> ---
>
> Key: HIVE-18291
> URL: https://issues.apache.org/jira/browse/HIVE-18291
> Project: Hive
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 2.3.0
>Reporter: Marco Gaido
>Assignee: Daniel Voros
>
> Citing SQL:2011 on page 27 available at 
> http://standards.iso.org/ittf/PubliclyAvailableStandards/c053681_ISO_IEC_9075-1_2011.zip:
> {noformat}
> If the result cannot be represented exactly in the result type, then whether 
> it is rounded
> or truncated is implementation-defined. An exception condition is raised if 
> the result is
> outside the range of numeric values of the result type, or if the arithmetic 
> operation
> is not defined for the operands.
> {noformat}
> Currently Hive is returning NULL instead of throwing an exception if the 
> result is out of range, eg.:
> {code}
> > select 100.01*100.01;
> +-------+
> |  _c0  |
> +-------+
> | NULL  |
> +-------+
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (HIVE-18292) correct typo of vector_reduce_groupby_duplicate_cols in testconfiguration.properties

2017-12-18 Thread Daniel Voros (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Voros reassigned HIVE-18292:
---

Assignee: Daniel Voros

> correct typo of vector_reduce_groupby_duplicate_cols in 
> testconfiguration.properties
> 
>
> Key: HIVE-18292
> URL: https://issues.apache.org/jira/browse/HIVE-18292
> Project: Hive
>  Issue Type: Bug
>  Components: Tests
>Reporter: Zoltan Haindrich
>Assignee: Daniel Voros
>
> HIVE-18258 added a test; but the matching properties 
> [entry|https://github.com/apache/hive/blob/82590226a89eeac7aa0ace8c311a8d4f4794c5bc/itests/src/test/resources/testconfiguration.properties#L384]
>  for it is typoed...
> I never thought TestDanglingQOuts#checkDanglingQOut would catch useful things 
> as well... :)



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-18291) An exception should be raised if the result is outside the range of decimal

2018-01-03 Thread Daniel Voros (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16310116#comment-16310116
 ] 

Daniel Voros commented on HIVE-18291:
-

[~sershe] I think that ticket is HIVE-13098. Based on the size of that WIP 
patch, I think I'll put this on hold for now.

> An exception should be raised if the result is outside the range of decimal
> ---
>
> Key: HIVE-18291
> URL: https://issues.apache.org/jira/browse/HIVE-18291
> Project: Hive
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 2.3.0
>Reporter: Marco Gaido
>Assignee: Daniel Voros
>
> Citing SQL:2011 on page 27 available at 
> http://standards.iso.org/ittf/PubliclyAvailableStandards/c053681_ISO_IEC_9075-1_2011.zip:
> {noformat}
> If the result cannot be represented exactly in the result type, then whether 
> it is rounded
> or truncated is implementation-defined. An exception condition is raised if 
> the result is
> outside the range of numeric values of the result type, or if the arithmetic 
> operation
> is not defined for the operands.
> {noformat}
> Currently Hive is returning NULL instead of throwing an exception if the 
> result is out of range, eg.:
> {code}
> > select 100.01*100.01;
> +-------+
> |  _c0  |
> +-------+
> | NULL  |
> +-------+
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (HIVE-18414) upgrade to tez-0.9.1

2018-01-09 Thread Daniel Voros (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Voros reassigned HIVE-18414:
---

Assignee: Daniel Voros

> upgrade to tez-0.9.1
> 
>
> Key: HIVE-18414
> URL: https://issues.apache.org/jira/browse/HIVE-18414
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>Assignee: Daniel Voros
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-18414) upgrade to tez-0.9.1

2018-01-09 Thread Daniel Voros (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Voros updated HIVE-18414:

Attachment: HIVE-18414.1.patch

Attaching patch #1. This upgrades tez.version from 0.9.1-SNAPSHOT to 0.9.1.

> upgrade to tez-0.9.1
> 
>
> Key: HIVE-18414
> URL: https://issues.apache.org/jira/browse/HIVE-18414
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>Assignee: Daniel Voros
> Attachments: HIVE-18414.1.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-18414) upgrade to tez-0.9.1

2018-01-09 Thread Daniel Voros (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Voros updated HIVE-18414:

Status: Patch Available  (was: Open)

> upgrade to tez-0.9.1
> 
>
> Key: HIVE-18414
> URL: https://issues.apache.org/jira/browse/HIVE-18414
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>Assignee: Daniel Voros
> Attachments: HIVE-18414.1.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-10251) HIVE-9664 makes hive depend on ivysettings.xml

2018-01-10 Thread Daniel Voros (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Voros updated HIVE-10251:

   Resolution: Duplicate
Fix Version/s: 1.2.0
   Status: Resolved  (was: Patch Available)

Resolving as this was superseded by HIVE-10267.

> HIVE-9664 makes hive depend on ivysettings.xml
> --
>
> Key: HIVE-10251
> URL: https://issues.apache.org/jira/browse/HIVE-10251
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.2.0
>Reporter: Sushanth Sowmyan
>Assignee: Anant Nag
>  Labels: patch
> Fix For: 1.2.0
>
> Attachments: HIVE-10251.1.patch, HIVE-10251.2.patch, 
> HIVE-10251.3.patch, HIVE-10251.simple.patch
>
>
> HIVE-9664 makes hive depend on the existence of ivysettings.xml, and if it is 
> not present, it makes hive NPE when instantiating a CLISessionState.
> {noformat}
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hive.ql.session.DependencyResolver.(DependencyResolver.java:61)
> at 
> org.apache.hadoop.hive.ql.session.SessionState.(SessionState.java:343)
> at 
> org.apache.hadoop.hive.ql.session.SessionState.(SessionState.java:334)
> at org.apache.hadoop.hive.cli.CliSessionState.(CliSessionState.java:60)
> {noformat}
> This happens because of the following bit:
> {noformat}
> // If HIVE_HOME is not defined or file is not found in HIVE_HOME/conf 
> then load default ivysettings.xml from class loader
> if (ivysettingsPath == null || !(new File(ivysettingsPath).exists())) {
>   ivysettingsPath = 
> ClassLoader.getSystemResource("ivysettings.xml").getFile();
>   _console.printInfo("ivysettings.xml file not found in HIVE_HOME or 
> HIVE_CONF_DIR," + ivysettingsPath + " will be used");
> }
> {noformat}
> This makes it so that an attempt to instantiate CliSessionState without an 
> ivysettings.xml file will cause hive to fail with an NPE. Hive should not 
> have a hard dependency on an ivysettings.xml being present, and this feature 
> should gracefully fail in that case instead.
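
A sketch of the "gracefully fail" behaviour the description asks for 
(illustrative only; this is not the fix that actually went in via HIVE-10267):

{code:java}
import java.io.File;
import java.net.URL;

public class IvySettingsFallbackSketch {
  /** Returns a usable ivysettings.xml path, or null if none is available. */
  static String resolveIvySettings(String ivysettingsPath) {
    if (ivysettingsPath != null && new File(ivysettingsPath).exists()) {
      return ivysettingsPath;
    }
    URL fallback = ClassLoader.getSystemResource("ivysettings.xml");
    // Guard against the NPE described above: the classpath resource may be
    // missing too, in which case dependency resolution should stay disabled
    // instead of failing the whole session.
    return fallback == null ? null : fallback.getFile();
  }
}
{code}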



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (HIVE-19444) Create View - Table not found _dummy_table

2018-12-12 Thread Daniel Voros (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-19444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Voros resolved HIVE-19444.
-
Resolution: Duplicate

I've also seen this with 3.1.1, but wasn't able to reproduce with current 
master where HIVE-20010 is fixed.

> Create View - Table not found _dummy_table
> --
>
> Key: HIVE-19444
> URL: https://issues.apache.org/jira/browse/HIVE-19444
> Project: Hive
>  Issue Type: Bug
>  Components: Views
>Affects Versions: 1.1.0
>Reporter: BELUGA BEHR
>Priority: Major
>
> {code:sql}
> CREATE VIEW view_s1 AS select 1;
> -- FAILED: SemanticException 
> org.apache.hadoop.hive.ql.metadata.InvalidTableException: Table not found 
> _dummy_table
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (HIVE-20872) Creating information_schema and sys schema via schematool fails with parser error

2018-12-12 Thread Daniel Voros (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Voros resolved HIVE-20872.
-
Resolution: Duplicate

> Creating information_schema and sys schema via schematool fails with parser 
> error
> -
>
> Key: HIVE-20872
> URL: https://issues.apache.org/jira/browse/HIVE-20872
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, HiveServer2, Metastore, SQL
>Affects Versions: 3.1.0, 3.1.1
> Environment: Apache Hive (version 3.1.1)
> Hive JDBC (version 3.1.1)
> metastore on derby embedded, derby server, postgres server
> Apache Hadoop (version 2.9.1)
>Reporter: Carsten Steckel
>Priority: Critical
>
> it took quite some time to figure out how to install the "information_schema" 
> and "sys" schemas (thanks to 
> https://issues.apache.org/jira/browse/HIVE-16941) into a hive 3.1.0/3.1.1 on 
> hdfs/hadoop 2.9.1 and I am still unsure if it is the proper way of doing it.
> when I execute:
>  
> {noformat}
> hive@hive-server ~> schematool -metaDbType derby -dbType hive -initSchema 
> -url jdbc:hive2://localhost:1/default -driver 
> org.apache.hive.jdbc.HiveDriver"
> {noformat}
>  I receive an error (from --verbose log):
>  
> {noformat}
> [...]
> Error: Error while compiling statement: FAILED: SemanticException 
> org.apache.hadoop.hive.ql.metadata.InvalidTableException: Table not found 
> _dummy_table (state=42000,code=4)
> org.apache.hadoop.hive.metastore.HiveMetaException: Schema initialization 
> FAILED! Metastore state would be inconsistent !!
> [...]
> {noformat}
>   
> It seems that the last statement during the setup of the sys schema causes the 
> issue. When executing it manually:
>  
>  
> {noformat}
> 0: jdbc:hive2://localhost:1> CREATE OR REPLACE VIEW `VERSION` AS SELECT 1 
> AS `VER_ID`, '3.1.0' AS `SCHEMA_VERSION`, 'Hive release version 3.1.0' AS 
> `VERSION_COMMENT`;
> Error: Error while compiling statement: FAILED: SemanticException 
> org.apache.hadoop.hive.ql.metadata.InvalidTableException: Table not found 
> _dummy_table (state=42000,code=4)
> {noformat}
>  
> I have tried to switch the metastore_db from derby embedded to derby server 
> to postgresql and made sure the changed metadatabases each worked, but 
> setting up the information_schema and sys schemas always delivers the same 
> error.
> Executing only the select part without the create view works:
>  
> {noformat}
> 0: jdbc:hive2://localhost:1> SELECT 1 AS `VER_ID`, '3.1.0' AS 
> `SCHEMA_VERSION`, 'Hive release version 3.1.0' AS `VERSION_COMMENT`;
> +---------+-----------------+------------------------------+
> | ver_id  | schema_version  | version_comment              |
> +---------+-----------------+------------------------------+
> | 1       | 3.1.0           | Hive release version 3.1.0   |
> +---------+-----------------+------------------------------+
> 1 row selected (0.595 seconds)
> {noformat}
> It seems to be related to: HIVE-19444
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20586) Beeline is asking for user/pass when invoked without -u

2018-12-12 Thread Daniel Voros (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Voros updated HIVE-20586:

Resolution: Duplicate
Status: Resolved  (was: Patch Available)

Same issue was solved in HIVE-20734 with a different approach.

> Beeline is asking for user/pass when invoked without -u
> ---
>
> Key: HIVE-20586
> URL: https://issues.apache.org/jira/browse/HIVE-20586
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 3.0.0, 3.1.0
>Reporter: Daniel Voros
>Assignee: Janos Gub
>Priority: Major
> Attachments: HIVE-20586.1.patch, HIVE-20586.1.patch, 
> HIVE-20586.1.patch, HIVE-20586.1.patch, HIVE-20586.2.patch, HIVE-20586.patch
>
>
> Since HIVE-18963 it's possible to define a default connection URL in 
> beeline-site.xml to be able to use beeline without specifying the HS2 JDBC 
> URL.
> When invoked with no arguments, beeline is asking for username/password on 
> the command line. When running with {{-u}} and the exact same URL as in 
> beeline-site.xml, it does not ask for username/password.
> I think these two should do exactly the same, given that the URL after {{-u}} 
> is the same as in beeline-site.xml:
> {code:java}
> beeline -u URL
> beeline
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HIVE-21034) Add option to schematool to drop Hive databases

2018-12-12 Thread Daniel Voros (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Voros reassigned HIVE-21034:
---


> Add option to schematool to drop Hive databases
> ---
>
> Key: HIVE-21034
> URL: https://issues.apache.org/jira/browse/HIVE-21034
> Project: Hive
>  Issue Type: Improvement
>Reporter: Daniel Voros
>Assignee: Daniel Voros
>Priority: Major
>
> An option to remove all Hive managed data could be a useful addition to 
> {{schematool}}.
> I propose to introduce a new flag {{-dropAllDatabases}} that would *drop all 
> databases with CASCADE* to remove all data of managed tables.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21034) Add option to schematool to drop Hive databases

2018-12-12 Thread Daniel Voros (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16719253#comment-16719253
 ] 

Daniel Voros commented on HIVE-21034:
-

Thank you for the feedback. I do agree with you: it's a dangerous operation, 
and we should make sure it's hard to accidentally/maliciously invoke. I was 
hoping to find some documentation on how to secure a deployment with respect 
to restricting access to schematool and other executables, but came back 
empty-handed, so I'm not sure what (if any) protections we already have (or 
suggest to have) in place.

What I had in mind for use-cases was ephemeral cloud workloads, where the users 
might want to drop everything once the job has finished.

> Add option to schematool to drop Hive databases
> ---
>
> Key: HIVE-21034
> URL: https://issues.apache.org/jira/browse/HIVE-21034
> Project: Hive
>  Issue Type: Improvement
>Reporter: Daniel Voros
>Assignee: Daniel Voros
>Priority: Major
>
> An option to remove all Hive managed data could be a useful addition to 
> {{schematool}}.
> I propose to introduce a new flag {{-dropAllDatabases}} that would *drop all 
> databases with CASCADE* to remove all data of managed tables.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21034) Add option to schematool to drop Hive databases

2018-12-13 Thread Daniel Voros (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16719958#comment-16719958
 ] 

Daniel Voros commented on HIVE-21034:
-

This new option could be used for (or during) dropping the HMS. When you're 
dropping that, you might want to remove any data associated with it, since 
you're losing the metadata needed to read it anyway.

By "returning the datastore" do you mean dropping the S3 bucket, for example? 
If so, I was thinking about the use case where you're reusing the datastore 
but want to free up space to save cost.

I believe an inverse of -initSchema could be useful too, both for the security 
concern you've described and simply to clean up the RDBMS behind HMS.

All in all I think we need to work on defining the process(es) of uninstalling 
Hive, keeping cloud workloads in mind. This new schematool option could be the 
first step in that direction.

> Add option to schematool to drop Hive databases
> ---
>
> Key: HIVE-21034
> URL: https://issues.apache.org/jira/browse/HIVE-21034
> Project: Hive
>  Issue Type: Improvement
>Reporter: Daniel Voros
>Assignee: Daniel Voros
>Priority: Major
>
> An option to remove all Hive managed data could be a useful addition to 
> {{schematool}}.
> I propose to introduce a new flag {{-dropAllDatabases}} that would *drop all 
> databases with CASCADE* to remove all data of managed tables.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21034) Add option to schematool to drop Hive databases

2019-01-14 Thread Daniel Voros (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16742060#comment-16742060
 ] 

Daniel Voros commented on HIVE-21034:
-

Thank you both for your replies. Regarding safeguards, I have two proposals:
1) Introduce the new schematool flag ({{-dropAllDatabases}}) as discussed 
above, but only do the deletion if an environment variable is set (e.g. 
{{ALLOW_SCHEMATOOL_UNSAFE=true}}).
2) Instead of extending schematool, introduce a new "tool" that can _not_ be 
invoked from the CLI, only via a hadoop jar command (e.g. {{hadoop jar 
/path/to/hive-cli-*.jar org.apache.hive.some.package.DropDbTool}}).

Since the second approach would also require the user to properly set up the 
HADOOP_CLASSPATH, I'd go with the first one; see the sketch below. Please let 
me know what you think!
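
To make the first proposal concrete, here is a minimal, hypothetical sketch of 
the environment-variable guard. It is not part of any attached patch, and 
{{ALLOW_SCHEMATOOL_UNSAFE}} is only the example name used above:

{code:java}
// Illustrative only: gate a destructive schematool action behind an env variable.
// ALLOW_SCHEMATOOL_UNSAFE is the example name from this comment, not an existing setting.
public class UnsafeGuardSketch {
  static boolean unsafeOperationsAllowed() {
    return "true".equalsIgnoreCase(System.getenv("ALLOW_SCHEMATOOL_UNSAFE"));
  }

  public static void main(String[] args) {
    if (!unsafeOperationsAllowed()) {
      System.err.println("Refusing to drop databases; set ALLOW_SCHEMATOOL_UNSAFE=true to proceed.");
      System.exit(1);
    }
    // ... proceed with the -dropAllDatabases handling ...
  }
}
{code}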



> Add option to schematool to drop Hive databases
> ---
>
> Key: HIVE-21034
> URL: https://issues.apache.org/jira/browse/HIVE-21034
> Project: Hive
>  Issue Type: Improvement
>Reporter: Daniel Voros
>Assignee: Daniel Voros
>Priority: Major
>
> An option to remove all Hive managed data could be a useful addition to 
> {{schematool}}.
> I propose to introduce a new flag {{-dropAllDatabases}} that would *drop all 
> databases with CASCADE* to remove all data of managed tables.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21034) Add option to schematool to drop Hive databases

2019-01-18 Thread Daniel Voros (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Voros updated HIVE-21034:

Attachment: HIVE-21034.1.patch

> Add option to schematool to drop Hive databases
> ---
>
> Key: HIVE-21034
> URL: https://issues.apache.org/jira/browse/HIVE-21034
> Project: Hive
>  Issue Type: Improvement
>Reporter: Daniel Voros
>Assignee: Daniel Voros
>Priority: Major
> Attachments: HIVE-21034.1.patch
>
>
> An option to remove all Hive managed data could be a useful addition to 
> {{schematool}}.
> I propose to introduce a new flag {{-dropAllDatabases}} that would *drop all 
> databases with CASCADE* to remove all data of managed tables.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21034) Add option to schematool to drop Hive databases

2019-01-18 Thread Daniel Voros (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Voros updated HIVE-21034:

Status: Patch Available  (was: Open)

Attached patch #1. It adds the {{-dropAllDatabases}} option to schematool. 
It supports {{-verbose}} and {{-dryRun}} and introduces the new {{-yes}} flag; 
when invoked without {{-yes}}, it asks for confirmation on the command line.

The patch also fixes an NPE in cases where schematool fails before parsing 
the command line.
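
For a rough idea of what "drop all databases with CASCADE" involves, here is a 
minimal, hypothetical sketch against the metastore client API. It is not the 
attached patch; confirmation, {{-dryRun}} handling and error handling are 
omitted, and skipping the default database is an assumption:

{code:java}
// Hypothetical sketch only, not the attached patch: drop every database with CASCADE
// through the metastore client, removing managed table data as well.
import org.apache.hadoop.hive.conf.HiveConf;
import org.apache.hadoop.hive.metastore.HiveMetaStoreClient;
import org.apache.hadoop.hive.metastore.IMetaStoreClient;

public class DropAllDatabasesSketch {
  public static void main(String[] args) throws Exception {
    IMetaStoreClient client = new HiveMetaStoreClient(new HiveConf());
    try {
      for (String db : client.getAllDatabases()) {
        if ("default".equalsIgnoreCase(db)) {
          continue; // the default database itself cannot be dropped
        }
        // deleteData=true, ignoreUnknownDb=true, cascade=true
        client.dropDatabase(db, true, true, true);
      }
    } finally {
      client.close();
    }
  }
}
{code}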

> Add option to schematool to drop Hive databases
> ---
>
> Key: HIVE-21034
> URL: https://issues.apache.org/jira/browse/HIVE-21034
> Project: Hive
>  Issue Type: Improvement
>Reporter: Daniel Voros
>Assignee: Daniel Voros
>Priority: Major
> Attachments: HIVE-21034.1.patch
>
>
> An option to remove all Hive managed data could be a useful addition to 
> {{schematool}}.
> I propose to introduce a new flag {{-dropAllDatabases}} that would *drop all 
> databases with CASCADE* to remove all data of managed tables.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21034) Add option to schematool to drop Hive databases

2019-01-18 Thread Daniel Voros (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Voros updated HIVE-21034:

Attachment: HIVE-21034.2.patch

> Add option to schematool to drop Hive databases
> ---
>
> Key: HIVE-21034
> URL: https://issues.apache.org/jira/browse/HIVE-21034
> Project: Hive
>  Issue Type: Improvement
>Reporter: Daniel Voros
>Assignee: Daniel Voros
>Priority: Major
> Attachments: HIVE-21034.1.patch, HIVE-21034.2.patch
>
>
> An option to remove all Hive managed data could be a useful addition to 
> {{schematool}}.
> I propose to introduce a new flag {{-dropAllDatabases}} that would *drop all 
> databases with CASCADE* to remove all data of managed tables.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21034) Add option to schematool to drop Hive databases

2019-01-18 Thread Daniel Voros (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16746469#comment-16746469
 ] 

Daniel Voros commented on HIVE-21034:
-

Attached patch #2 to address the findbugs warnings.

> Add option to schematool to drop Hive databases
> ---
>
> Key: HIVE-21034
> URL: https://issues.apache.org/jira/browse/HIVE-21034
> Project: Hive
>  Issue Type: Improvement
>Reporter: Daniel Voros
>Assignee: Daniel Voros
>Priority: Major
> Attachments: HIVE-21034.1.patch, HIVE-21034.2.patch
>
>
> An option to remove all Hive managed data could be a useful addition to 
> {{schematool}}.
> I propose to introduce a new flag {{-dropAllDatabases}} that would *drop all 
> databases with CASCADE* to remove all data of managed tables.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21034) Add option to schematool to drop Hive databases

2019-01-19 Thread Daniel Voros (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Voros updated HIVE-21034:

Attachment: HIVE-21034.2.patch

> Add option to schematool to drop Hive databases
> ---
>
> Key: HIVE-21034
> URL: https://issues.apache.org/jira/browse/HIVE-21034
> Project: Hive
>  Issue Type: Improvement
>Reporter: Daniel Voros
>Assignee: Daniel Voros
>Priority: Major
> Attachments: HIVE-21034.1.patch, HIVE-21034.2.patch, 
> HIVE-21034.2.patch
>
>
> An option to remove all Hive managed data could be a useful addition to 
> {{schematool}}.
> I propose to introduce a new flag {{-dropAllDatabases}} that would *drop all 
> databases with CASCADE* to remove all data of managed tables.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21034) Add option to schematool to drop Hive databases

2019-03-01 Thread Daniel Voros (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Voros updated HIVE-21034:

Attachment: HIVE-21034.3.patch

> Add option to schematool to drop Hive databases
> ---
>
> Key: HIVE-21034
> URL: https://issues.apache.org/jira/browse/HIVE-21034
> Project: Hive
>  Issue Type: Improvement
>Reporter: Daniel Voros
>Assignee: Daniel Voros
>Priority: Major
> Attachments: HIVE-21034.1.patch, HIVE-21034.2.patch, 
> HIVE-21034.2.patch, HIVE-21034.3.patch
>
>
> An option to remove all Hive managed data could be a useful addition to 
> {{schematool}}.
> I propose to introduce a new flag {{-dropAllDatabases}} that would *drop all 
> databases with CASCADE* to remove all data of managed tables.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21034) Add option to schematool to drop Hive databases

2019-03-01 Thread Daniel Voros (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16781572#comment-16781572
 ] 

Daniel Voros commented on HIVE-21034:
-

Attached patch #3, which is rebased on master (classes moved in HIVE-21298) and 
also moves the "yes" handling to the task as suggested by Miklos.

[~mgergely] I've seen NPEs there when {{MetastoreSchemaTool#init()}} failed 
before (or while) setting cmdLine; that's why I've added the null check.

[~ashutoshc], [~alangates] could you please also take a look?

> Add option to schematool to drop Hive databases
> ---
>
> Key: HIVE-21034
> URL: https://issues.apache.org/jira/browse/HIVE-21034
> Project: Hive
>  Issue Type: Improvement
>Reporter: Daniel Voros
>Assignee: Daniel Voros
>Priority: Major
> Attachments: HIVE-21034.1.patch, HIVE-21034.2.patch, 
> HIVE-21034.2.patch, HIVE-21034.3.patch
>
>
> An option to remove all Hive managed data could be a useful addition to 
> {{schematool}}.
> I propose to introduce a new flag {{-dropAllDatabases}} that would *drop all 
> databases with CASCADE* to remove all data of managed tables.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21034) Add option to schematool to drop Hive databases

2019-03-01 Thread Daniel Voros (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Voros updated HIVE-21034:

Attachment: HIVE-21034.4.patch

> Add option to schematool to drop Hive databases
> ---
>
> Key: HIVE-21034
> URL: https://issues.apache.org/jira/browse/HIVE-21034
> Project: Hive
>  Issue Type: Improvement
>Reporter: Daniel Voros
>Assignee: Daniel Voros
>Priority: Major
> Attachments: HIVE-21034.1.patch, HIVE-21034.2.patch, 
> HIVE-21034.2.patch, HIVE-21034.3.patch, HIVE-21034.4.patch
>
>
> An option to remove all Hive managed data could be a useful addition to 
> {{schematool}}.
> I propose to introduce a new flag {{-dropAllDatabases}} that would *drop all 
> databases with CASCADE* to remove all data of managed tables.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21034) Add option to schematool to drop Hive databases

2019-03-01 Thread Daniel Voros (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16781836#comment-16781836
 ] 

Daniel Voros commented on HIVE-21034:
-

Attached patch #4 that corrects the checkstyle warnings. Fingers crossed for a 
clean ptest run.

> Add option to schematool to drop Hive databases
> ---
>
> Key: HIVE-21034
> URL: https://issues.apache.org/jira/browse/HIVE-21034
> Project: Hive
>  Issue Type: Improvement
>Reporter: Daniel Voros
>Assignee: Daniel Voros
>Priority: Major
> Attachments: HIVE-21034.1.patch, HIVE-21034.2.patch, 
> HIVE-21034.2.patch, HIVE-21034.3.patch, HIVE-21034.4.patch
>
>
> An option to remove all Hive managed data could be a useful addition to 
> {{schematool}}.
> I propose to introduce a new flag {{-dropAllDatabases}} that would *drop all 
> databases with CASCADE* to remove all data of managed tables.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21034) Add option to schematool to drop Hive databases

2019-03-19 Thread Daniel Voros (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Voros updated HIVE-21034:

Attachment: HIVE-21034.5.patch

> Add option to schematool to drop Hive databases
> ---
>
> Key: HIVE-21034
> URL: https://issues.apache.org/jira/browse/HIVE-21034
> Project: Hive
>  Issue Type: Improvement
>Reporter: Daniel Voros
>Assignee: Daniel Voros
>Priority: Major
> Attachments: HIVE-21034.1.patch, HIVE-21034.2.patch, 
> HIVE-21034.2.patch, HIVE-21034.3.patch, HIVE-21034.4.patch, HIVE-21034.5.patch
>
>
> An option to remove all Hive managed data could be a useful addition to 
> {{schematool}}.
> I propose to introduce a new flag {{-dropAllDatabases}} that would *drop all 
> databases with CASCADE* to remove all data of managed tables.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21034) Add option to schematool to drop Hive databases

2019-03-19 Thread Daniel Voros (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16796118#comment-16796118
 ] 

Daniel Voros commented on HIVE-21034:
-

Attached patch #5, which removes the getter and setter methods and adds 
{{@VisibleForTesting}} to the {{yes}} field.

> Add option to schematool to drop Hive databases
> ---
>
> Key: HIVE-21034
> URL: https://issues.apache.org/jira/browse/HIVE-21034
> Project: Hive
>  Issue Type: Improvement
>Reporter: Daniel Voros
>Assignee: Daniel Voros
>Priority: Major
> Attachments: HIVE-21034.1.patch, HIVE-21034.2.patch, 
> HIVE-21034.2.patch, HIVE-21034.3.patch, HIVE-21034.4.patch, HIVE-21034.5.patch
>
>
> An option to remove all Hive managed data could be a useful addition to 
> {{schematool}}.
> I propose to introduce a new flag {{-dropAllDatabases}} that would *drop all 
> databases with CASCADE* to remove all data of managed tables.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21034) Add option to schematool to drop Hive databases

2019-03-20 Thread Daniel Voros (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Voros updated HIVE-21034:

Attachment: HIVE-21034.5.patch

> Add option to schematool to drop Hive databases
> ---
>
> Key: HIVE-21034
> URL: https://issues.apache.org/jira/browse/HIVE-21034
> Project: Hive
>  Issue Type: Improvement
>Reporter: Daniel Voros
>Assignee: Daniel Voros
>Priority: Major
> Attachments: HIVE-21034.1.patch, HIVE-21034.2.patch, 
> HIVE-21034.2.patch, HIVE-21034.3.patch, HIVE-21034.4.patch, 
> HIVE-21034.5.patch, HIVE-21034.5.patch
>
>
> An option to remove all Hive managed data could be a useful addition to 
> {{schematool}}.
> I propose to introduce a new flag {{-dropAllDatabases}} that would *drop all 
> databases with CASCADE* to remove all data of managed tables.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21034) Add option to schematool to drop Hive databases

2019-03-21 Thread Daniel Voros (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Voros updated HIVE-21034:

Attachment: HIVE-21034.5.patch

> Add option to schematool to drop Hive databases
> ---
>
> Key: HIVE-21034
> URL: https://issues.apache.org/jira/browse/HIVE-21034
> Project: Hive
>  Issue Type: Improvement
>Reporter: Daniel Voros
>Assignee: Daniel Voros
>Priority: Major
> Attachments: HIVE-21034.1.patch, HIVE-21034.2.patch, 
> HIVE-21034.2.patch, HIVE-21034.3.patch, HIVE-21034.4.patch, 
> HIVE-21034.5.patch, HIVE-21034.5.patch, HIVE-21034.5.patch
>
>
> An option to remove all Hive managed data could be a useful addition to 
> {{schematool}}.
> I propose to introduce a new flag {{-dropAllDatabases}} that would *drop all 
> databases with CASCADE* to remove all data of managed tables.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21034) Add option to schematool to drop Hive databases

2019-03-25 Thread Daniel Voros (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Voros updated HIVE-21034:

Attachment: HIVE-21034.5.patch

> Add option to schematool to drop Hive databases
> ---
>
> Key: HIVE-21034
> URL: https://issues.apache.org/jira/browse/HIVE-21034
> Project: Hive
>  Issue Type: Improvement
>Reporter: Daniel Voros
>Assignee: Daniel Voros
>Priority: Major
> Attachments: HIVE-21034.1.patch, HIVE-21034.2.patch, 
> HIVE-21034.2.patch, HIVE-21034.3.patch, HIVE-21034.4.patch, 
> HIVE-21034.5.patch, HIVE-21034.5.patch, HIVE-21034.5.patch, HIVE-21034.5.patch
>
>
> An option to remove all Hive managed data could be a useful addition to 
> {{schematool}}.
> I propose to introduce a new flag {{-dropAllDatabases}} that would *drop all 
> databases with CASCADE* to remove all data of managed tables.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20022) Upgrade hadoop.version to 3.1.1

2018-10-03 Thread Daniel Voros (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Voros updated HIVE-20022:

Attachment: HIVE-20022.4.patch

> Upgrade hadoop.version to 3.1.1
> ---
>
> Key: HIVE-20022
> URL: https://issues.apache.org/jira/browse/HIVE-20022
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 3.0.0
>Reporter: Daniel Voros
>Assignee: Daniel Voros
>Priority: Blocker
> Attachments: HIVE-20022.1.patch, HIVE-20022.2.patch, 
> HIVE-20022.3.patch, HIVE-20022.3.patch, HIVE-20022.4.patch
>
>
> HIVE-19304 is relying on YARN-7142 and YARN-8122 that will only be released 
> in Hadoop 3.1.1. We should upgrade when possible.
> cc [~gsaha]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20022) Upgrade hadoop.version to 3.1.1

2018-10-03 Thread Daniel Voros (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16636561#comment-16636561
 ] 

Daniel Voros commented on HIVE-20022:
-

Attached patch #4, which fixes the failure of 
{{TestReplicationOnHDFSEncryptedZones#targetAndSourceHaveDifferentEncryptionZoneKeys}}.

> Upgrade hadoop.version to 3.1.1
> ---
>
> Key: HIVE-20022
> URL: https://issues.apache.org/jira/browse/HIVE-20022
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 3.0.0
>Reporter: Daniel Voros
>Assignee: Daniel Voros
>Priority: Blocker
> Attachments: HIVE-20022.1.patch, HIVE-20022.2.patch, 
> HIVE-20022.3.patch, HIVE-20022.3.patch, HIVE-20022.4.patch
>
>
> HIVE-19304 is relying on YARN-7142 and YARN-8122 that will only be released 
> in Hadoop 3.1.1. We should upgrade when possible.
> cc [~gsaha]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20022) Upgrade hadoop.version to 3.1.1

2018-10-10 Thread Daniel Voros (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Voros updated HIVE-20022:

Attachment: HIVE-20022.4.patch

> Upgrade hadoop.version to 3.1.1
> ---
>
> Key: HIVE-20022
> URL: https://issues.apache.org/jira/browse/HIVE-20022
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 3.0.0
>Reporter: Daniel Voros
>Assignee: Daniel Voros
>Priority: Blocker
> Attachments: HIVE-20022.1.patch, HIVE-20022.2.patch, 
> HIVE-20022.3.patch, HIVE-20022.3.patch, HIVE-20022.4.patch, HIVE-20022.4.patch
>
>
> HIVE-19304 is relying on YARN-7142 and YARN-8122 that will only be released 
> in Hadoop 3.1.1. We should upgrade when possible.
> cc [~gsaha]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20770) Need improvement in hive for ACID properties and tables

2018-10-18 Thread Daniel Voros (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16654787#comment-16654787
 ] 

Daniel Voros commented on HIVE-20770:
-

Hey [~pritambhandare],

Thank you for your interest! Could you please forward your message to the 
mailing list ([u...@hive.apache.org|mailto:u...@hive.apache.org])? Jira is 
supposed to be used for reporting bugs, and we use the mailing list for 
general questions. You'll also reach a broader audience there.

Thanks,
Daniel

> Need improvement in hive for ACID properties and tables
> ---
>
> Key: HIVE-20770
> URL: https://issues.apache.org/jira/browse/HIVE-20770
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Reporter: pritam
>Assignee: Daniel Voros
>Priority: Major
>
> Hello Team,
> In the current version of Apache Hive, once we set ACID properties they 
> cannot be reverted, and
> Apache Spark does not support Hive ACID tables. If it is possible to revert 
> ACID properties on a Hive table and read/write Hive tables from Spark/Scala, 
> please let me know.
> If there is no provision for the above, it is important to add these features 
> and improvements in the next Apache Hive version. It will be very helpful for 
> everyone doing distributed batch processing.
> I am eager to hear from you. Thank you all in advance for such a great batch 
> processing tool.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-18767) Some alterPartitions invocations throw 'NumberFormatException: null'

2018-10-19 Thread Daniel Voros (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-18767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16656673#comment-16656673
 ] 

Daniel Voros commented on HIVE-18767:
-

[~pvary] HIVE-20191 only changed how the ptest runner figures out the branch 
name. {{HIVE-18767.4.branch-3.1.patch}} should be fine, and from the ptest logs 
it seems the correct branch is picked there:

{code}
+ git checkout branch-3.1
Switched to branch 'branch-3.1'
Your branch is up-to-date with 'origin/branch-3.1'.
+ git reset --hard origin/branch-3.1
HEAD is now at c39b5d1 HIVE-18778: Needs to capture input/output entities in 
explain (Daniel Dai, reviewed by Thejas Nair)
{code}

Can you help me find out how/where Yetus picks the branch?

> Some alterPartitions invocations throw 'NumberFormatException: null'
> 
>
> Key: HIVE-18767
> URL: https://issues.apache.org/jira/browse/HIVE-18767
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 2.3.3, 3.1.0, 4.0.0, 3.2.0
>Reporter: Yuming Wang
>Assignee: Mass Dosage
>Priority: Major
> Fix For: 2.4.0, 4.0.0, 2.3.4
>
> Attachments: HIVE-18767-branch-2.3.patch, HIVE-18767-branch-2.patch, 
> HIVE-18767-branch-3.1.patch, HIVE-18767-branch-3.patch, HIVE-18767.1.patch, 
> HIVE-18767.2-branch-2.3.patch, HIVE-18767.2-branch-2.patch, 
> HIVE-18767.2-branch-3.1.patch, HIVE-18767.2.patch, 
> HIVE-18767.3-branch-3.1.patch, HIVE-18767.3.patch, 
> HIVE-18767.4-branch-3.1.patch, HIVE-18767.4.branch-3.1.patch, 
> HIVE-18767.4.patch, HIVE-18767.5.patch, HIVE-18767.6.patch
>
>
> Error messages:
> {noformat}
> [info] Cause: java.lang.NumberFormatException: null
> [info] at java.lang.Long.parseLong(Long.java:552)
> [info] at java.lang.Long.parseLong(Long.java:631)
> [info] at 
> org.apache.hadoop.hive.metastore.MetaStoreUtils.isFastStatsSame(MetaStoreUtils.java:315)
> [info] at 
> org.apache.hadoop.hive.metastore.HiveAlterHandler.alterPartitions(HiveAlterHandler.java:605)
> [info] at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.alter_partitions_with_environment_context(HiveMetaStore.java:3837)
> [info] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> [info] at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> [info] at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> [info] at java.lang.reflect.Method.invoke(Method.java:498)
> [info] at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:148)
> [info] at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:107)
> [info] at 
> com.sun.proxy.$Proxy23.alter_partitions_with_environment_context(Unknown 
> Source)
> [info] at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.alter_partitions(HiveMetaStoreClient.java:1527)
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-18767) Some alterPartitions invocations throw 'NumberFormatException: null'

2018-10-19 Thread Daniel Voros (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-18767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16656696#comment-16656696
 ] 

Daniel Voros commented on HIVE-18767:
-

No worries, I didn't know that it's a different path; that's why I haven't 
updated Yetus in HIVE-20191. Here's the list of filename formats we support 
in ptests: 
[https://github.com/apache/hive/blob/14b972e964b1fed49c962949e1b0119ebf441bb1/dev-support/jenkins-common.sh#L63-L76]

Please let me know if I can help with the Yetus part!
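
For illustration only, here is a hypothetical sketch of how a patch file name 
could be split into issue key, patch number and target branch. The actual 
accepted formats are the ones defined in jenkins-common.sh linked above; the 
pattern below is just an example for names like 
{{HIVE-18767.4-branch-3.1.patch}}:

{code:java}
// Illustrative only: parse an issue key, patch number and branch out of a patch
// file name. The real accepted formats live in dev-support/jenkins-common.sh.
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PatchNameSketch {
  private static final Pattern NAME =
      Pattern.compile("(HIVE-\\d+)\\.(\\d+)[.-](branch-[\\d.]+)\\.patch");

  public static void main(String[] args) {
    Matcher m = NAME.matcher("HIVE-18767.4-branch-3.1.patch");
    if (m.matches()) {
      System.out.println(m.group(1) + " patch #" + m.group(2) + " against " + m.group(3));
    }
  }
}
{code}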

> Some alterPartitions invocations throw 'NumberFormatException: null'
> 
>
> Key: HIVE-18767
> URL: https://issues.apache.org/jira/browse/HIVE-18767
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 2.3.3, 3.1.0, 4.0.0, 3.2.0
>Reporter: Yuming Wang
>Assignee: Mass Dosage
>Priority: Major
> Fix For: 2.4.0, 4.0.0, 2.3.4
>
> Attachments: HIVE-18767-branch-2.3.patch, HIVE-18767-branch-2.patch, 
> HIVE-18767-branch-3.1.patch, HIVE-18767-branch-3.patch, HIVE-18767.1.patch, 
> HIVE-18767.2-branch-2.3.patch, HIVE-18767.2-branch-2.patch, 
> HIVE-18767.2-branch-3.1.patch, HIVE-18767.2.patch, 
> HIVE-18767.3-branch-3.1.patch, HIVE-18767.3.patch, 
> HIVE-18767.4-branch-3.1.patch, HIVE-18767.4.branch-3.1.patch, 
> HIVE-18767.4.patch, HIVE-18767.5.patch, HIVE-18767.6.patch
>
>
> Error messages:
> {noformat}
> [info] Cause: java.lang.NumberFormatException: null
> [info] at java.lang.Long.parseLong(Long.java:552)
> [info] at java.lang.Long.parseLong(Long.java:631)
> [info] at 
> org.apache.hadoop.hive.metastore.MetaStoreUtils.isFastStatsSame(MetaStoreUtils.java:315)
> [info] at 
> org.apache.hadoop.hive.metastore.HiveAlterHandler.alterPartitions(HiveAlterHandler.java:605)
> [info] at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.alter_partitions_with_environment_context(HiveMetaStore.java:3837)
> [info] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> [info] at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> [info] at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> [info] at java.lang.reflect.Method.invoke(Method.java:498)
> [info] at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:148)
> [info] at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:107)
> [info] at 
> com.sun.proxy.$Proxy23.alter_partitions_with_environment_context(Unknown 
> Source)
> [info] at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.alter_partitions(HiveMetaStoreClient.java:1527)
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (HIVE-20770) Need improvement in hive for ACID properties and tables

2018-11-29 Thread Daniel Voros (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Voros resolved HIVE-20770.
-
Resolution: Invalid

> Need improvement in hive for ACID properties and tables
> ---
>
> Key: HIVE-20770
> URL: https://issues.apache.org/jira/browse/HIVE-20770
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Reporter: pritam
>Assignee: Daniel Voros
>Priority: Major
>
> Hello Team,
> In the current version of Apache Hive, once we set ACID properties they 
> cannot be reverted, and
> Apache Spark does not support Hive ACID tables. If it is possible to revert 
> ACID properties on a Hive table and read/write Hive tables from Spark/Scala, 
> please let me know.
> If there is no provision for the above, it is important to add these features 
> and improvements in the next Apache Hive version. It will be very helpful for 
> everyone doing distributed batch processing.
> I am eager to hear from you. Thank you all in advance for such a great batch 
> processing tool.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HIVE-21724) Nested ARRAY and STRUCT inside MAP don't work with LazySimpleDeserializeRead

2019-05-13 Thread Daniel Voros (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Voros reassigned HIVE-21724:
---


> Nested ARRAY and STRUCT inside MAP don't work with LazySimpleDeserializeRead
> 
>
> Key: HIVE-21724
> URL: https://issues.apache.org/jira/browse/HIVE-21724
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Affects Versions: 3.1.1
>Reporter: Daniel Voros
>Assignee: Daniel Voros
>Priority: Major
>
> The logic during vectorized execution that keeps track of how deep we are in 
> the nested structure doesn't work for ARRAYs and STRUCTs embedded inside maps.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21724) Nested ARRAY and STRUCT inside MAP don't work with LazySimpleDeserializeRead

2019-05-13 Thread Daniel Voros (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Voros updated HIVE-21724:

Description: 
The logic during vectorized execution that keeps track of how deep we are in 
the nested structure doesn't work for ARRAYs and STRUCTs embedded inside maps.

Repro steps (with hive.vectorized.execution.enabled=true):
{code}
CREATE TABLE srctable(a map<int,array<int>>) STORED AS TEXTFILE;
create table desttable(c1 map<int,array<int>>);
insert into srctable values (map(1, array(1, 2, 3)));
insert into desttable select a from srctable;
select * from desttable;
{code}
Will produce:
{code}
{1:[null]}
{code}

  was:The logic during vectorized execution that keeps track of how deep we are 
in the nested structure doesn't work for ARRAYs and STRUCTs embedded inside 
maps.


> Nested ARRAY and STRUCT inside MAP don't work with LazySimpleDeserializeRead
> 
>
> Key: HIVE-21724
> URL: https://issues.apache.org/jira/browse/HIVE-21724
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Affects Versions: 3.1.1
>Reporter: Daniel Voros
>Assignee: Daniel Voros
>Priority: Major
>
> The logic during vectorized execution that keeps track of how deep we are in 
> the nested structure doesn't work for ARRAYs and STRUCTs embedded inside maps.
> Repro steps (with hive.vectorized.execution.enabled=true):
> {code}
> CREATE TABLE srctable(a map<int,array<int>>) STORED AS TEXTFILE;
> create table desttable(c1 map<int,array<int>>);
> insert into srctable values (map(1, array(1, 2, 3)));
> insert into desttable select a from srctable;
> select * from desttable;
> {code}
> Will produce:
> {code}
> {1:[null]}
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21724) Nested ARRAY and STRUCT inside MAP don't work with LazySimpleDeserializeRead

2019-05-13 Thread Daniel Voros (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Voros updated HIVE-21724:

Attachment: HIVE-21724.1.patch

> Nested ARRAY and STRUCT inside MAP don't work with LazySimpleDeserializeRead
> 
>
> Key: HIVE-21724
> URL: https://issues.apache.org/jira/browse/HIVE-21724
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Affects Versions: 3.1.1
>Reporter: Daniel Voros
>Assignee: Daniel Voros
>Priority: Major
> Attachments: HIVE-21724.1.patch
>
>
> The logic during vectorized execution that keeps track of how deep we are in 
> the nested structure doesn't work for ARRAYs and STRUCTs embedded inside maps.
> Repro steps (with hive.vectorized.execution.enabled=true):
> {code}
> CREATE TABLE srctable(a map<int,array<int>>) STORED AS TEXTFILE;
> create table desttable(c1 map<int,array<int>>);
> insert into srctable values (map(1, array(1, 2, 3)));
> insert into desttable select a from srctable;
> select * from desttable;
> {code}
> Will produce:
> {code}
> {1:[null]}
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21724) Nested ARRAY and STRUCT inside MAP don't work with LazySimpleDeserializeRead

2019-05-13 Thread Daniel Voros (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Voros updated HIVE-21724:

Status: Patch Available  (was: Open)

Attached patch #1 that adds the "is my parent a map?" check to LIST and STRUCT. 
(Same checks were already in place for MAP and UNION.)
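
To illustrate why the parent type matters (this is a standalone sketch, not the 
actual LazySimpleDeserializeRead code): in the lazy-simple text layout each 
nesting level uses the next separator byte, but a MAP consumes two levels (one 
separator between entries, one between key and value), so a LIST or STRUCT 
nested inside a map value has to skip one extra level. Assuming the default 
separators:

{code:java}
// Standalone illustration, not the Hive implementation: which separator level a
// child field should use. A MAP parent occupies two separator levels (entry
// separator and key/value separator); LIST/STRUCT/UNION parents occupy one.
public class SeparatorDepthSketch {
  enum Category { LIST, MAP, STRUCT, UNION }

  static int childSeparatorLevel(int parentLevel, Category parentCategory) {
    return parentLevel + (parentCategory == Category.MAP ? 2 : 1);
  }

  public static void main(String[] args) {
    // A top-level map<int,array<int>> column: level 1 (\002) separates entries,
    // level 2 (\003) separates the key from the value.
    int mapLevel = 1;
    int arrayInsideMap = childSeparatorLevel(mapLevel, Category.MAP); // 3 -> \004
    int naivePlusOne   = mapLevel + 1;                                // 2 -> \003, clashes with the key/value separator
    System.out.println("array inside map should use level " + arrayInsideMap
        + ", a plain +1 would give " + naivePlusOne);
  }
}
{code}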

> Nested ARRAY and STRUCT inside MAP don't work with LazySimpleDeserializeRead
> 
>
> Key: HIVE-21724
> URL: https://issues.apache.org/jira/browse/HIVE-21724
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Affects Versions: 3.1.1
>Reporter: Daniel Voros
>Assignee: Daniel Voros
>Priority: Major
> Attachments: HIVE-21724.1.patch
>
>
> The logic during vectorized execution that keeps track of how deep we are in 
> the nested structure doesn't work for ARRAYs and STRUCTs embedded inside maps.
> Repro steps (with hive.vectorized.execution.enabled=true):
> {code}
> CREATE TABLE srctable(a map<int,array<int>>) STORED AS TEXTFILE;
> create table desttable(c1 map<int,array<int>>);
> insert into srctable values (map(1, array(1, 2, 3)));
> insert into desttable select a from srctable;
> select * from desttable;
> {code}
> Will produce:
> {code}
> {1:[null]}
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21724) Nested ARRAY and STRUCT inside MAP don't work with LazySimpleDeserializeRead

2019-05-14 Thread Daniel Voros (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16839482#comment-16839482
 ] 

Daniel Voros commented on HIVE-21724:
-

Thanks [~vgarg] for pointing this out; I missed this before. Based on 
[https://cwiki.apache.org/confluence/display/Hive/Vectorized+Query+Execution] 
it seems it does not, but we should fall back to "row-at-a-time execution". I 
guess this is what's happening here.

> Nested ARRAY and STRUCT inside MAP don't work with LazySimpleDeserializeRead
> 
>
> Key: HIVE-21724
> URL: https://issues.apache.org/jira/browse/HIVE-21724
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Affects Versions: 3.1.1
>Reporter: Daniel Voros
>Assignee: Daniel Voros
>Priority: Major
> Attachments: HIVE-21724.1.patch
>
>
> The logic during vectorized execution that keeps track of how deep we are in 
> the nested structure doesn't work for ARRAYs and STRUCTs embedded inside maps.
> Repro steps (with hive.vectorized.execution.enabled=true):
> {code}
> CREATE TABLE srctable(a map<int,array<int>>) STORED AS TEXTFILE;
> create table desttable(c1 map<int,array<int>>);
> insert into srctable values (map(1, array(1, 2, 3)));
> insert into desttable select a from srctable;
> select * from desttable;
> {code}
> Will produce:
> {code}
> {1:[null]}
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21724) Nested ARRAY and STRUCT inside MAP don't work with LazySimpleDeserializeRead

2019-05-14 Thread Daniel Voros (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Voros updated HIVE-21724:

Attachment: HIVE-21724.2.patch

> Nested ARRAY and STRUCT inside MAP don't work with LazySimpleDeserializeRead
> 
>
> Key: HIVE-21724
> URL: https://issues.apache.org/jira/browse/HIVE-21724
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Affects Versions: 3.1.1
>Reporter: Daniel Voros
>Assignee: Daniel Voros
>Priority: Major
> Attachments: HIVE-21724.1.patch, HIVE-21724.2.patch
>
>
> The logic during vectorized execution that keeps track of how deep we are in 
> the nested structure doesn't work for ARRAYs and STRUCTs embedded inside maps.
> Repro steps (with hive.vectorized.execution.enabled=true):
> {code}
> CREATE TABLE srctable(a map<int,array<int>>) STORED AS TEXTFILE;
> create table desttable(c1 map<int,array<int>>);
> insert into srctable values (map(1, array(1, 2, 3)));
> insert into desttable select a from srctable;
> select * from desttable;
> {code}
> Will produce:
> {code}
> {1:[null]}
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21724) Nested ARRAY and STRUCT inside MAP don't work with LazySimpleDeserializeRead

2019-05-14 Thread Daniel Voros (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16839536#comment-16839536
 ] 

Daniel Voros commented on HIVE-21724:
-

Attached patch #2, which adds the LLAP q.out file that I missed in the first 
patch.

> Nested ARRAY and STRUCT inside MAP don't work with LazySimpleDeserializeRead
> 
>
> Key: HIVE-21724
> URL: https://issues.apache.org/jira/browse/HIVE-21724
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Affects Versions: 3.1.1
>Reporter: Daniel Voros
>Assignee: Daniel Voros
>Priority: Major
> Attachments: HIVE-21724.1.patch, HIVE-21724.2.patch
>
>
> The logic during vectorized execution that keeps track of how deep we are in 
> the nested structure doesn't work for ARRAYs and STRUCTs embedded inside maps.
> Repro steps (with hive.vectorized.execution.enabled=true):
> {code}
> CREATE TABLE srctable(a map<int,array<int>>) STORED AS TEXTFILE;
> create table desttable(c1 map<int,array<int>>);
> insert into srctable values (map(1, array(1, 2, 3)));
> insert into desttable select a from srctable;
> select * from desttable;
> {code}
> Will produce:
> {code}
> {1:[null]}
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21724) Nested ARRAY and STRUCT inside MAP don't work with LazySimpleDeserializeRead

2019-05-16 Thread Daniel Voros (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Voros updated HIVE-21724:

Attachment: HIVE-21724.2.patch

> Nested ARRAY and STRUCT inside MAP don't work with LazySimpleDeserializeRead
> 
>
> Key: HIVE-21724
> URL: https://issues.apache.org/jira/browse/HIVE-21724
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Affects Versions: 3.1.1
>Reporter: Daniel Voros
>Assignee: Daniel Voros
>Priority: Major
> Attachments: HIVE-21724.1.patch, HIVE-21724.2.patch, 
> HIVE-21724.2.patch
>
>
> The logic during vectorized execution that keeps track of how deep we are in 
> the nested structure doesn't work for ARRAYs and STRUCTs embedded inside maps.
> Repro steps (with hive.vectorized.execution.enabled=true):
> {code}
> CREATE TABLE srctable(a map<int,array<int>>) STORED AS TEXTFILE;
> create table desttable(c1 map<int,array<int>>);
> insert into srctable values (map(1, array(1, 2, 3)));
> insert into desttable select a from srctable;
> select * from desttable;
> {code}
> Will produce:
> {code}
> {1:[null]}
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21724) Nested ARRAY and STRUCT inside MAP don't work with LazySimpleDeserializeRead

2019-05-28 Thread Daniel Voros (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Voros updated HIVE-21724:

Attachment: HIVE-21724.2.patch

> Nested ARRAY and STRUCT inside MAP don't work with LazySimpleDeserializeRead
> 
>
> Key: HIVE-21724
> URL: https://issues.apache.org/jira/browse/HIVE-21724
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Affects Versions: 3.1.1
>Reporter: Daniel Voros
>Assignee: Daniel Voros
>Priority: Major
> Attachments: HIVE-21724.1.patch, HIVE-21724.2.patch, 
> HIVE-21724.2.patch, HIVE-21724.2.patch
>
>
> The logic during vectorized execution that keeps track of how deep we are in 
> the nested structure doesn't work for ARRAYs and STRUCTs embedded inside maps.
> Repro steps (with hive.vectorized.execution.enabled=true):
> {code}
> CREATE TABLE srctable(a map<int,array<int>>) STORED AS TEXTFILE;
> create table desttable(c1 map<int,array<int>>);
> insert into srctable values (map(1, array(1, 2, 3)));
> insert into desttable select a from srctable;
> select * from desttable;
> {code}
> Will produce:
> {code}
> {1:[null]}
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-19979) Backport HIVE-19304 to branch-3

2018-07-02 Thread Daniel Voros (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16529938#comment-16529938
 ] 

Daniel Voros commented on HIVE-19979:
-

Thank you [~kgyrtkirk]! Not sure what's wrong with Yetus; from the message it 
seems it's trying to apply the patch on master. I've tried separating the patch 
number from the branch with '-' in HIVE-19978, but that didn't work either.

The ptest failure seems to be unrelated; it has passed locally.

> Backport HIVE-19304 to branch-3
> ---
>
> Key: HIVE-19979
> URL: https://issues.apache.org/jira/browse/HIVE-19979
> Project: Hive
>  Issue Type: Task
>Reporter: Daniel Voros
>Assignee: Daniel Voros
>Priority: Major
> Attachments: HIVE-19979.1.branch-3.patch
>
>
> Needs HIVE-19978 (backport of HIVE-18037) to land first.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HIVE-20066) hive.load.data.owner is compared to full principal

2018-07-03 Thread Daniel Voros (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Voros reassigned HIVE-20066:
---


> hive.load.data.owner is compared to full principal
> --
>
> Key: HIVE-20066
> URL: https://issues.apache.org/jira/browse/HIVE-20066
> Project: Hive
>  Issue Type: Bug
>Reporter: Daniel Voros
>Assignee: Daniel Voros
>Priority: Major
>
> HIVE-19928 compares the user running HS2 to the configured owner 
> (hive.load.data.owner) to check if we're able to move the file with LOAD DATA 
> or need to copy.
> This check compares the full username (that may contain the full kerberos 
> principal) to hive.load.data.owner. We should compare to the short username 
> ({{UGI.getShortUserName()}}) instead. That's used in similar context 
> [here|https://github.com/apache/hive/blob/f519db7eafacb4b4d2d9fe2a9e10e908d8077224/common/src/java/org/apache/hadoop/hive/common/FileUtils.java#L398].
> cc [~djaiswal]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20066) hive.load.data.owner is compared to full principal

2018-07-03 Thread Daniel Voros (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Voros updated HIVE-20066:

Attachment: HIVE-20066.1.patch

> hive.load.data.owner is compared to full principal
> --
>
> Key: HIVE-20066
> URL: https://issues.apache.org/jira/browse/HIVE-20066
> Project: Hive
>  Issue Type: Bug
>Reporter: Daniel Voros
>Assignee: Daniel Voros
>Priority: Major
> Attachments: HIVE-20066.1.patch
>
>
> HIVE-19928 compares the user running HS2 to the configured owner 
> (hive.load.data.owner) to check if we're able to move the file with LOAD DATA 
> or need to copy.
> This check compares the full username (that may contain the full kerberos 
> principal) to hive.load.data.owner. We should compare to the short username 
> ({{UGI.getShortUserName()}}) instead. That's used in similar context 
> [here|https://github.com/apache/hive/blob/f519db7eafacb4b4d2d9fe2a9e10e908d8077224/common/src/java/org/apache/hadoop/hive/common/FileUtils.java#L398].
> cc [~djaiswal]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20066) hive.load.data.owner is compared to full principal

2018-07-03 Thread Daniel Voros (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Voros updated HIVE-20066:

Status: Patch Available  (was: Open)

Attached patch #1. This changes {{UGI.getUserName()}} to 
{{UGI.getShortUserName()}}.
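
For context, a minimal sketch of the difference (the owner value "hive" below 
is just an example, not taken from the patch or the default configuration):

{code:java}
// Illustrative only: on a Kerberized cluster getUserName() may return the full
// principal (e.g. "hive/host.example.com@EXAMPLE.COM"), while getShortUserName()
// returns just "hive", which is what hive.load.data.owner is expected to hold.
import java.io.IOException;
import org.apache.hadoop.security.UserGroupInformation;

public class OwnerCheckSketch {
  public static void main(String[] args) throws IOException {
    String configuredOwner = "hive"; // example value for hive.load.data.owner
    UserGroupInformation ugi = UserGroupInformation.getCurrentUser();
    boolean fullNameMatches  = ugi.getUserName().equals(configuredOwner);      // can be false with Kerberos
    boolean shortNameMatches = ugi.getShortUserName().equals(configuredOwner); // intended comparison
    System.out.println("full=" + fullNameMatches + ", short=" + shortNameMatches);
  }
}
{code}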

> hive.load.data.owner is compared to full principal
> --
>
> Key: HIVE-20066
> URL: https://issues.apache.org/jira/browse/HIVE-20066
> Project: Hive
>  Issue Type: Bug
>Reporter: Daniel Voros
>Assignee: Daniel Voros
>Priority: Major
> Attachments: HIVE-20066.1.patch
>
>
> HIVE-19928 compares the user running HS2 to the configured owner 
> (hive.load.data.owner) to check if we're able to move the file with LOAD DATA 
> or need to copy.
> This check compares the full username (that may contain the full kerberos 
> principal) to hive.load.data.owner. We should compare to the short username 
> ({{UGI.getShortUserName()}}) instead. That's used in similar context 
> [here|https://github.com/apache/hive/blob/f519db7eafacb4b4d2d9fe2a9e10e908d8077224/common/src/java/org/apache/hadoop/hive/common/FileUtils.java#L398].
> cc [~djaiswal]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20066) hive.load.data.owner is compared to full principal

2018-07-03 Thread Daniel Voros (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Voros updated HIVE-20066:

Affects Version/s: 4.0.0
   3.1.0

> hive.load.data.owner is compared to full principal
> --
>
> Key: HIVE-20066
> URL: https://issues.apache.org/jira/browse/HIVE-20066
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.1.0, 4.0.0
>Reporter: Daniel Voros
>Assignee: Daniel Voros
>Priority: Major
> Attachments: HIVE-20066.1.patch
>
>
> HIVE-19928 compares the user running HS2 to the configured owner 
> (hive.load.data.owner) to check if we're able to move the file with LOAD DATA 
> or need to copy.
> This check compares the full username (that may contain the full kerberos 
> principal) to hive.load.data.owner. We should compare to the short username 
> ({{UGI.getShortUserName()}}) instead. That's used in similar context 
> [here|https://github.com/apache/hive/blob/f519db7eafacb4b4d2d9fe2a9e10e908d8077224/common/src/java/org/apache/hadoop/hive/common/FileUtils.java#L398].
> cc [~djaiswal]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20066) hive.load.data.owner is compared to full principal

2018-07-05 Thread Daniel Voros (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16533350#comment-16533350
 ] 

Daniel Voros commented on HIVE-20066:
-

Thank you all!

> hive.load.data.owner is compared to full principal
> --
>
> Key: HIVE-20066
> URL: https://issues.apache.org/jira/browse/HIVE-20066
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.1.0, 4.0.0
>Reporter: Daniel Voros
>Assignee: Daniel Voros
>Priority: Major
> Fix For: 4.0.0, 3.2.0
>
> Attachments: HIVE-20066.1.patch
>
>
> HIVE-19928 compares the user running HS2 to the configured owner 
> (hive.load.data.owner) to check if we're able to move the file with LOAD DATA 
> or need to copy.
> This check compares the full username (that may contain the full kerberos 
> principal) to hive.load.data.owner. We should compare to the short username 
> ({{UGI.getShortUserName()}}) instead. That's used in similar context 
> [here|https://github.com/apache/hive/blob/f519db7eafacb4b4d2d9fe2a9e10e908d8077224/common/src/java/org/apache/hadoop/hive/common/FileUtils.java#L398].
> cc [~djaiswal]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

