[jira] [Commented] (HIVE-17050) Multiline queries that have comment in middle fail when executed via "beeline -e"

2017-07-27 Thread Yibing Shi (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16102927#comment-16102927
 ] 

Yibing Shi commented on HIVE-17050:
---

No, [~pvary], these test failures seem unrelated. 
I cannot reproduce them even on my local machine, and I am not sure why they 
failed.

> Multiline queries that have comment in middle fail when executed via "beeline 
> -e"
> -
>
> Key: HIVE-17050
> URL: https://issues.apache.org/jira/browse/HIVE-17050
> Project: Hive
>  Issue Type: Bug
>Reporter: Yibing Shi
>Assignee: Yibing Shi
> Attachments: HIVE-17050.1.patch, HIVE-17050.2.patch, 
> HIVE-17050.3.PATCH, HIVE-17050.4.patch
>
>
> After applying HIVE-13864, multi-line queries that have a comment at the end 
> of one of the middle lines fail when executed via beeline -e:
> {noformat}
> $ beeline -u "" -e "select 1, --test
> > 2"
> scan complete in 3ms
> ..
> Transaction isolation: TRANSACTION_REPEATABLE_READ
> Error: Error while compiling statement: FAILED: ParseException line 1:9 
> cannot recognize input near '' '' '' in selection target 
> (state=42000,code=4)
> Closing: 0: 
> jdbc:hive2://host-10-17-80-194.coe.cloudera.com:1/default;principal=hive/host-10-17-80-194.coe.cloudera@yshi.com;ssl=true;sslTrustStore=/certs/hive/hive-keystore.jks;trustStorePassword=cloudera
> {noformat}
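The failure above is easier to see with a minimal sketch of the line-based comment handling involved (an illustration only, not Hive's actual code): if a multi-line {{-e}} script is joined into one statement without first stripping each line's trailing {{--}} comment, the comment swallows the rest of the statement.

```python
def flatten_script(script: str) -> str:
    """Join a multi-line SQL script into one statement, dropping each
    line's trailing '--' comment first.

    Simplified sketch: it ignores '--' appearing inside quoted strings,
    which a real tokenizer must handle.
    """
    parts = []
    for line in script.split("\n"):
        comment = line.find("--")
        if comment != -1:
            line = line[:comment]
        line = line.strip()
        if line:
            parts.append(line)
    return " ".join(parts)

# The failing query from the report: naive joining without the stripping
# above would yield "select 1, --test 2", which cannot parse.
print(flatten_script("select 1, --test\n2"))  # -> select 1, 2
```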



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Comment Edited] (HIVE-17050) Multiline queries that have comment in middle fail when executed via "beeline -e"

2017-07-27 Thread Yibing Shi (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16102927#comment-16102927
 ] 

Yibing Shi edited comment on HIVE-17050 at 7/27/17 8:32 AM:


No, [~pvary], these test failures seem unrelated. 


was (Author: yibing):
No, [~pvary], these test failures seem irrelevant. 
I cannot reproduce the test failures even on my local machine. Not sure why 
they have failed.






[jira] [Updated] (HIVE-17050) Multiline queries that have comment in middle fail when executed via "beeline -e"

2017-07-26 Thread Yibing Shi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yibing Shi updated HIVE-17050:
--
Attachment: HIVE-17050.4.patch

Hi [~pvary], I was in a rush and accidentally changed the pom.xml. Sorry for 
the mess!
Resubmitting the patch.






[jira] [Comment Edited] (HIVE-17050) Multiline queries that have comment in middle fail when executed via "beeline -e"

2017-07-26 Thread Yibing Shi (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16102389#comment-16102389
 ] 

Yibing Shi edited comment on HIVE-17050 at 7/26/17 10:30 PM:
-

Hi [~pvary], I was in a rush and accidentally changed the pom.xml. Sorry for 
the mess!
Resubmit the patch.


was (Author: yibing):
Hi [~pvary], I was in a rush and accidentally changed the pom.xml. Sorry for 
the mess!
Resutmit the patch.






[jira] [Updated] (HIVE-17050) Multiline queries that have comment in middle fail when executed via "beeline -e"

2017-07-26 Thread Yibing Shi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yibing Shi updated HIVE-17050:
--
Attachment: HIVE-17050.3.PATCH

Submitting a new patch.






[jira] [Commented] (HIVE-17050) Multiline queries that have comment in middle fail when executed via "beeline -e"

2017-07-25 Thread Yibing Shi (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16101097#comment-16101097
 ] 

Yibing Shi commented on HIVE-17050:
---

Hi [~ychena], I will have a further look at this.






[jira] [Updated] (HIVE-17078) Add more logs to MapredLocalTask

2017-07-17 Thread Yibing Shi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yibing Shi updated HIVE-17078:
--
Attachment: HIVE-17078.5.PATCH

Thank you for the review, [~stakiar_impala_496e]!

I have modified the patch to adopt points #1 and #3.

As for #2: 
bq. Where does l4j print to?
It depends. By default the local task runs in a new process, and the log4j 
appenders are not set up in the child process, so this l4j doesn't print 
anywhere.
But if {{hive.exec.submit.local.task.via.child}} is disabled, the Hive l4j 
variable is used, and the information is printed to the Hive log.
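A toy sketch of that routing (an assumption modelled on the behaviour described above, not Hive's implementation; the helper name is hypothetical): the config flag decides whether the task body runs in a child process, where logger output is effectively lost, or in-process, where it reaches the Hive log.

```python
def local_task_log_destination(conf: dict) -> str:
    """Return where the local task's logger output ends up, following the
    behaviour described above (hypothetical helper, not Hive code)."""
    via_child = conf.get("hive.exec.submit.local.task.via.child", "true")
    if via_child == "true":
        # Child process: log4j appenders are not set up there, so the
        # task's logger output goes nowhere unless explicitly forwarded.
        return "lost"
    # In-process: the task shares Hive's configured logger.
    return "hive-log"

print(local_task_log_destination({}))  # default: task runs in a child process
```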

> Add more logs to MapredLocalTask
> 
>
> Key: HIVE-17078
> URL: https://issues.apache.org/jira/browse/HIVE-17078
> Project: Hive
>  Issue Type: Improvement
>Reporter: Yibing Shi
>Assignee: Yibing Shi
>Priority: Minor
> Attachments: HIVE-17078.1.patch, HIVE-17078.2.patch, 
> HIVE-17078.3.patch, HIVE-17078.4.PATCH, HIVE-17078.5.PATCH
>
>
> By default, {{MapredLocalTask}} is executed in a child process of Hive, so 
> that a local task that uses too many resources does not affect Hive. 
> Currently, the stdout and stderr output of the child process is printed in 
> Hive's stdout/stderr log, which has no timestamp information and is 
> separated from the Hive service logs. This makes it hard to troubleshoot 
> problems in MapredLocalTasks.
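The missing-timestamp problem described above can be sketched away as follows (an illustration in Python with hypothetical names; Hive's actual patch is in Java): read the child's combined stdout/stderr and prefix every line with a timestamp before forwarding it, so child output can be correlated with service logs.

```python
import subprocess
import sys
from datetime import datetime

def run_child_with_timestamps(cmd):
    """Run a child process and forward each of its output lines with a
    timestamp prefix (sketch of the idea, not MapredLocalTask code)."""
    proc = subprocess.Popen(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
    )
    stamped = []
    for line in proc.stdout:
        stamped.append(
            f"{datetime.now().isoformat(timespec='seconds')} {line.rstrip()}"
        )
    proc.wait()
    return proc.returncode, stamped

# Forward a trivial child's output with timestamps attached.
rc, lines = run_child_with_timestamps(
    [sys.executable, "-c", "print('local task started')"]
)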





[jira] [Commented] (HIVE-15767) Hive On Spark is not working on secure clusters from Oozie

2017-07-13 Thread Yibing Shi (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16085634#comment-16085634
 ] 

Yibing Shi commented on HIVE-15767:
---

Thanks for the explanation!
This may be done by YARN instead of Spark. 


> Hive On Spark is not working on secure clusters from Oozie
> --
>
> Key: HIVE-15767
> URL: https://issues.apache.org/jira/browse/HIVE-15767
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: 1.2.1, 2.1.1
>Reporter: Peter Cseh
>Assignee: Peter Cseh
> Attachments: HIVE-15767-001.patch, HIVE-15767-002.patch
>
>
> When a HiveAction is launched from Oozie with Hive On Spark enabled, we're 
> getting errors:
> {noformat}
> Caused by: java.io.IOException: Exception reading 
> file:/yarn/nm/usercache/yshi/appcache/application_1485271416004_0022/container_1485271416004_0022_01_02/container_tokens
> at 
> org.apache.hadoop.security.Credentials.readTokenStorageFile(Credentials.java:188)
> at 
> org.apache.hadoop.mapreduce.security.TokenCache.mergeBinaryTokens(TokenCache.java:155)
> {noformat}
> This is caused by passing the {{mapreduce.job.credentials.binary}} property 
> to the Spark configuration in RemoteHiveSparkClient.





[jira] [Commented] (HIVE-15767) Hive On Spark is not working on secure clusters from Oozie

2017-07-13 Thread Yibing Shi (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16085612#comment-16085612
 ] 

Yibing Shi commented on HIVE-15767:
---

bq. I think the Spark driver will get the tokens afterwards
I really doubt that the Spark driver can do this. In an Oozie environment, it 
is the Oozie server that obtains all the tokens *on behalf of the end user*. 
When the Hive action starts a Spark job, the Spark driver has no access to the 
end user's ticket or keytab file, so I don't think it can obtain the necessary 
tokens. 
I believe we should somehow extract all the tokens from the existing token 
file and pass them on to the Spark driver.






[jira] [Updated] (HIVE-17078) Add more logs to MapredLocalTask

2017-07-13 Thread Yibing Shi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yibing Shi updated HIVE-17078:
--
Attachment: HIVE-17078.4.PATCH






[jira] [Commented] (HIVE-17078) Add more logs to MapredLocalTask

2017-07-13 Thread Yibing Shi (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16085252#comment-16085252
 ] 

Yibing Shi commented on HIVE-17078:
---

Checked the failed tests.

# org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llap_smb] 
fails for reasons unrelated to this patch.
# org.apache.hive.hcatalog.api.TestHCatClient. The failure also has nothing to 
do with our patch.
# org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14/23] fails 
because the output order has changed. Nothing serious. We need to somehow 
update the .out files, but maybe in a separate JIRA.
# The other tests fail because we now have more logs in local tasks. I will 
update the .out files.






[jira] [Updated] (HIVE-17078) Add more logs to MapredLocalTask

2017-07-12 Thread Yibing Shi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yibing Shi updated HIVE-17078:
--
Attachment: HIVE-17078.3.patch

Added a bit more logging.






[jira] [Commented] (HIVE-17078) Add more logs to MapredLocalTask

2017-07-12 Thread Yibing Shi (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16084891#comment-16084891
 ] 

Yibing Shi commented on HIVE-17078:
---

I am trying to keep the current behaviour. With the Hive CLI, Hive logs are 
not printed by default, and some users may rely on the stdout/stderr 
information; I don't want to surprise them.
If you still think it is unnecessary to print the child's stdout/stderr to 
Hive's stdout/stderr, I can remove the corresponding code.






[jira] [Updated] (HIVE-17078) Add more logs to MapredLocalTask

2017-07-12 Thread Yibing Shi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yibing Shi updated HIVE-17078:
--
Attachment: HIVE-17078.2.patch

Recreate the patch






[jira] [Updated] (HIVE-17078) Add more logs to MapredLocalTask

2017-07-12 Thread Yibing Shi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yibing Shi updated HIVE-17078:
--
Attachment: HIVE-17078.1.patch

Attaching a quick patch.
No tests are added, because this feature does not seem testable on a mini 
cluster.






[jira] [Updated] (HIVE-17078) Add more logs to MapredLocalTask

2017-07-12 Thread Yibing Shi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yibing Shi updated HIVE-17078:
--
Status: Patch Available  (was: Open)






[jira] [Assigned] (HIVE-17078) Add more logs to MapredLocalTask

2017-07-12 Thread Yibing Shi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yibing Shi reassigned HIVE-17078:
-

Assignee: Yibing Shi






[jira] [Commented] (HIVE-15767) Hive On Spark is not working on secure clusters from Oozie

2017-07-11 Thread Yibing Shi (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16083374#comment-16083374
 ] 

Yibing Shi commented on HIVE-15767:
---

[~peterceluch], can the tokens in the Oozie launcher application still be 
passed to the Spark job when the {{mapreduce.job.credentials.binary}} property 
is unset? For example, in an environment where HDFS transparent encryption is 
enabled, is the Spark job still able to connect to the KMS servers?

(The change is in {{RemoteHiveSparkClient}}, so Hive on MR shouldn't be 
affected. Oozie actions already make sure the tokens are added to the action 
configuration, which is then passed to MR jobs.)






[jira] [Updated] (HIVE-17050) Multiline queries that have comment in middle fail when executed via "beeline -e"

2017-07-06 Thread Yibing Shi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yibing Shi updated HIVE-17050:
--
Attachment: HIVE-17050.2.patch






[jira] [Commented] (HIVE-17050) Multiline queries that have comment in middle fail when executed via "beeline -e"

2017-07-06 Thread Yibing Shi (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16077393#comment-16077393
 ] 

Yibing Shi commented on HIVE-17050:
---

[~asherman], your change covers what my change does, so I have just added a 
few more tests in this JIRA.







[jira] [Commented] (HIVE-17050) Multiline queries that have comment in middle fail when executed via "beeline -e"

2017-07-06 Thread Yibing Shi (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16076414#comment-16076414
 ] 

Yibing Shi commented on HIVE-17050:
---

The errors seem unrelated. The only failed test possibly affected by this 
change is TestBeeLineDriver.testCliDriver[insert_overwrite_local_directory_1], 
which doesn't contain any comments in its query script.






[jira] [Updated] (HIVE-17052) Remove logging of predicate filters

2017-07-06 Thread Yibing Shi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yibing Shi updated HIVE-17052:
--
Attachment: HIVE-17052.1.patch

Submitting the patch.

> Remove logging of predicate filters
> ---
>
> Key: HIVE-17052
> URL: https://issues.apache.org/jira/browse/HIVE-17052
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.0.0
>Reporter: Barna Zsombor Klara
>Assignee: Yibing Shi
> Attachments: HIVE-17052.1.patch
>
>
> HIVE-16869 added the filter predicate to the debug log of HS2, but since 
> these filters may contain sensitive information they should not be logged out.
> The log statement should be changed back to the original form.
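The point of the revert can be sketched like this (hypothetical names, not Hive's code): log that a predicate was applied without echoing the predicate text itself, since its literals may carry sensitive values.

```python
import logging

logger = logging.getLogger("metastore.filter")

def apply_filter_predicate(table: str, predicate: str) -> str:
    """Apply a filter while logging only the table name, never the
    predicate text, which may embed sensitive literals (sketch only)."""
    logger.debug("Applying a filter predicate on table %s", table)
    # ... real code would evaluate `predicate` here ...
    return table
```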





[jira] [Updated] (HIVE-17052) Remove logging of predicate filters

2017-07-06 Thread Yibing Shi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yibing Shi updated HIVE-17052:
--
Status: Patch Available  (was: Open)






[jira] [Assigned] (HIVE-17052) Remove logging of predicate filters

2017-07-06 Thread Yibing Shi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yibing Shi reassigned HIVE-17052:
-

Assignee: Yibing Shi

> Remove logging of predicate filters
> ---
>
> Key: HIVE-17052
> URL: https://issues.apache.org/jira/browse/HIVE-17052
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.0.0
>Reporter: Barna Zsombor Klara
>Assignee: Yibing Shi
>
> HIVE-16869 added the filter predicate to the debug log of HS2, but since 
> these filters may contain sensitive information, they should not be logged.
> The log statement should be changed back to its original form.





[jira] [Updated] (HIVE-17050) Multiline queries that have comment in middle fail when executed via "beeline -e"

2017-07-06 Thread Yibing Shi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yibing Shi updated HIVE-17050:
--
Assignee: Yibing Shi
  Status: Patch Available  (was: Open)

> Multiline queries that have comment in middle fail when executed via "beeline 
> -e"
> -
>
> Key: HIVE-17050
> URL: https://issues.apache.org/jira/browse/HIVE-17050
> Project: Hive
>  Issue Type: Bug
>Reporter: Yibing Shi
>Assignee: Yibing Shi
> Attachments: HIVE-17050.1.patch
>
>
> After applying HIVE-13864, multi-line queries that have a comment at the end 
> of one of the middle lines fail when executed via {{beeline -e}}:
> {noformat}
> $ beeline -u "" -e "select 1, --test
> > 2"
> scan complete in 3ms
> ..
> Transaction isolation: TRANSACTION_REPEATABLE_READ
> Error: Error while compiling statement: FAILED: ParseException line 1:9 
> cannot recognize input near '' '' '' in selection target 
> (state=42000,code=4)
> Closing: 0: 
> jdbc:hive2://host-10-17-80-194.coe.cloudera.com:1/default;principal=hive/host-10-17-80-194.coe.cloudera@yshi.com;ssl=true;sslTrustStore=/certs/hive/hive-keystore.jks;trustStorePassword=cloudera
> {noformat}
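The failure mode above can be sketched in a few lines (illustrative Python, not BeeLine's actual implementation): stripping {{--}} comments line by line, before the query is collapsed onto a single line, keeps a trailing comment from swallowing the lines that follow it.

```python
def strip_line_comments(query):
    """Remove "--" comments per line BEFORE joining the lines.

    Simplified sketch: it ignores the case where "--" appears inside a
    quoted string literal, which a real implementation must handle.
    """
    stripped = []
    for line in query.split("\n"):
        idx = line.find("--")
        stripped.append(line if idx < 0 else line[:idx])
    return " ".join(stripped).strip()

# The failing query from the report: the comment ends at the newline,
# so "2" survives as part of the select list.
print(strip_line_comments("select 1, --test\n2"))  # select 1,  2
```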





[jira] [Updated] (HIVE-17050) Multiline queries that have comment in middle fail when executed via "beeline -e"

2017-07-06 Thread Yibing Shi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yibing Shi updated HIVE-17050:
--
Attachment: HIVE-17050.1.patch

Submit a patch

> Multiline queries that have comment in middle fail when executed via "beeline 
> -e"
> -
>
> Key: HIVE-17050
> URL: https://issues.apache.org/jira/browse/HIVE-17050
> Project: Hive
>  Issue Type: Bug
>Reporter: Yibing Shi
> Attachments: HIVE-17050.1.patch
>
>
> After applying HIVE-13864, multi-line queries that have a comment at the end 
> of one of the middle lines fail when executed via {{beeline -e}}:
> {noformat}
> $ beeline -u "" -e "select 1, --test
> > 2"
> scan complete in 3ms
> ..
> Transaction isolation: TRANSACTION_REPEATABLE_READ
> Error: Error while compiling statement: FAILED: ParseException line 1:9 
> cannot recognize input near '' '' '' in selection target 
> (state=42000,code=4)
> Closing: 0: 
> jdbc:hive2://host-10-17-80-194.coe.cloudera.com:1/default;principal=hive/host-10-17-80-194.coe.cloudera@yshi.com;ssl=true;sslTrustStore=/certs/hive/hive-keystore.jks;trustStorePassword=cloudera
> {noformat}





[jira] [Updated] (HIVE-16930) HoS should verify the value of Kerberos principal and keytab file before adding them to spark-submit command parameters

2017-06-21 Thread Yibing Shi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yibing Shi updated HIVE-16930:
--
Assignee: Yibing Shi
  Status: Patch Available  (was: Open)

> HoS should verify the value of Kerberos principal and keytab file before 
> adding them to spark-submit command parameters
> ---
>
> Key: HIVE-16930
> URL: https://issues.apache.org/jira/browse/HIVE-16930
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Yibing Shi
>Assignee: Yibing Shi
> Attachments: HIVE-16930.1.patch
>
>
> When Kerberos is enabled, Hive CLI fails to run Hive on Spark queries:
> {noformat}
> >hive -e "set hive.execution.engine=spark; create table if not exists test(a 
> >int); select count(*) from test" --hiveconf hive.root.logger=INFO,console > 
> >/var/tmp/hive_log.txt > /var/tmp/hive_log_2.txt 
> 17/06/16 16:13:13 [main]: ERROR client.SparkClientImpl: Error while waiting 
> for client to connect. 
> java.util.concurrent.ExecutionException: java.lang.RuntimeException: Cancel 
> client 'a5de85d1-6933-43e7-986f-5f8e5c001b5f'. Error: Child process exited 
> before connecting back with error log Error: Cannot load main class from JAR 
> file:/tmp/spark-submit.7196051517706529285.properties 
> Run with --help for usage help or --verbose for debug output 
> at 
> io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:37) 
> at 
> org.apache.hive.spark.client.SparkClientImpl.(SparkClientImpl.java:107) 
> at 
> org.apache.hive.spark.client.SparkClientFactory.createClient(SparkClientFactory.java:80)
>  
> at 
> org.apache.hadoop.hive.ql.exec.spark.RemoteHiveSparkClient.createRemoteClient(RemoteHiveSparkClient.java:100)
>  
> at 
> org.apache.hadoop.hive.ql.exec.spark.RemoteHiveSparkClient.(RemoteHiveSparkClient.java:96)
>  
> at 
> org.apache.hadoop.hive.ql.exec.spark.HiveSparkClientFactory.createHiveSparkClient(HiveSparkClientFactory.java:66)
>  
> at 
> org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionImpl.open(SparkSessionImpl.java:62)
>  
> at 
> org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionManagerImpl.getSession(SparkSessionManagerImpl.java:114)
>  
> at 
> org.apache.hadoop.hive.ql.exec.spark.SparkUtilities.getSparkSession(SparkUtilities.java:111)
>  
> at 
> org.apache.hadoop.hive.ql.exec.spark.SparkTask.execute(SparkTask.java:97) 
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160) 
> at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100) 
> at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1972) 
> at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1685) 
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1421) 
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1205) 
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1195) 
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:220) 
> at 
> org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:172) 
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:383) 
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:318) 
> at 
> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:720) 
> at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:693) 
> at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:628) 
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) 
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  
> at java.lang.reflect.Method.invoke(Method.java:606) 
> at org.apache.hadoop.util.RunJar.run(RunJar.java:221) 
> at org.apache.hadoop.util.RunJar.main(RunJar.java:136) 
> Caused by: java.lang.RuntimeException: Cancel client 
> 'a5de85d1-6933-43e7-986f-5f8e5c001b5f'. Error: Child process exited before 
> connecting back with error log Error: Cannot load main class from JAR 
> file:/tmp/spark-submit.7196051517706529285.properties 
> Run with --help for usage help or --verbose for debug output 
> at 
> org.apache.hive.spark.client.rpc.RpcServer.cancelClient(RpcServer.java:179) 
> at 
> org.apache.hive.spark.client.SparkClientImpl$3.run(SparkClientImpl.java:490) 
> at java.lang.Thread.run(Thread.java:745) 
> 17/06/16 16:13:13 [Driver]: WARN client.SparkClientImpl: Child process exited 
> with code 1 
> {noformat} 
> In the log, the following message shows up:
> 
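What "verify the value of Kerberos principal and keytab file before adding them" could look like, sketched in illustrative Python (a hypothetical helper, not Hive's actual code): only emit the {{--principal}}/{{--keytab}} arguments when both values are non-empty and the keytab file actually exists, since passing bogus values is what derails the spark-submit argument parsing above.

```python
import os

def kerberos_args(principal, keytab):
    """Hypothetical helper: build the Kerberos portion of a spark-submit
    command line.  Returns [] unless both values are usable, so empty or
    missing settings never produce malformed arguments."""
    if not principal or not keytab or not os.path.isfile(keytab):
        return []
    return ["--principal", principal, "--keytab", keytab]

print(kerberos_args("hive/host@EXAMPLE.COM", None))  # []
```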

[jira] [Updated] (HIVE-16930) HoS should verify the value of Kerberos principal and keytab file before adding them to spark-submit command parameters

2017-06-21 Thread Yibing Shi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yibing Shi updated HIVE-16930:
--
Attachment: HIVE-16930.1.patch

Submit a patch.

> HoS should verify the value of Kerberos principal and keytab file before 
> adding them to spark-submit command parameters
> ---
>
> Key: HIVE-16930
> URL: https://issues.apache.org/jira/browse/HIVE-16930
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Yibing Shi
> Attachments: HIVE-16930.1.patch
>
>
> When Kerberos is enabled, Hive CLI fails to run Hive on Spark queries:
> {noformat}
> >hive -e "set hive.execution.engine=spark; create table if not exists test(a 
> >int); select count(*) from test" --hiveconf hive.root.logger=INFO,console > 
> >/var/tmp/hive_log.txt > /var/tmp/hive_log_2.txt 
> 17/06/16 16:13:13 [main]: ERROR client.SparkClientImpl: Error while waiting 
> for client to connect. 
> java.util.concurrent.ExecutionException: java.lang.RuntimeException: Cancel 
> client 'a5de85d1-6933-43e7-986f-5f8e5c001b5f'. Error: Child process exited 
> before connecting back with error log Error: Cannot load main class from JAR 
> file:/tmp/spark-submit.7196051517706529285.properties 
> Run with --help for usage help or --verbose for debug output 
> at 
> io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:37) 
> at 
> org.apache.hive.spark.client.SparkClientImpl.(SparkClientImpl.java:107) 
> at 
> org.apache.hive.spark.client.SparkClientFactory.createClient(SparkClientFactory.java:80)
>  
> at 
> org.apache.hadoop.hive.ql.exec.spark.RemoteHiveSparkClient.createRemoteClient(RemoteHiveSparkClient.java:100)
>  
> at 
> org.apache.hadoop.hive.ql.exec.spark.RemoteHiveSparkClient.(RemoteHiveSparkClient.java:96)
>  
> at 
> org.apache.hadoop.hive.ql.exec.spark.HiveSparkClientFactory.createHiveSparkClient(HiveSparkClientFactory.java:66)
>  
> at 
> org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionImpl.open(SparkSessionImpl.java:62)
>  
> at 
> org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionManagerImpl.getSession(SparkSessionManagerImpl.java:114)
>  
> at 
> org.apache.hadoop.hive.ql.exec.spark.SparkUtilities.getSparkSession(SparkUtilities.java:111)
>  
> at 
> org.apache.hadoop.hive.ql.exec.spark.SparkTask.execute(SparkTask.java:97) 
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160) 
> at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100) 
> at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1972) 
> at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1685) 
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1421) 
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1205) 
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1195) 
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:220) 
> at 
> org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:172) 
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:383) 
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:318) 
> at 
> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:720) 
> at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:693) 
> at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:628) 
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) 
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  
> at java.lang.reflect.Method.invoke(Method.java:606) 
> at org.apache.hadoop.util.RunJar.run(RunJar.java:221) 
> at org.apache.hadoop.util.RunJar.main(RunJar.java:136) 
> Caused by: java.lang.RuntimeException: Cancel client 
> 'a5de85d1-6933-43e7-986f-5f8e5c001b5f'. Error: Child process exited before 
> connecting back with error log Error: Cannot load main class from JAR 
> file:/tmp/spark-submit.7196051517706529285.properties 
> Run with --help for usage help or --verbose for debug output 
> at 
> org.apache.hive.spark.client.rpc.RpcServer.cancelClient(RpcServer.java:179) 
> at 
> org.apache.hive.spark.client.SparkClientImpl$3.run(SparkClientImpl.java:490) 
> at java.lang.Thread.run(Thread.java:745) 
> 17/06/16 16:13:13 [Driver]: WARN client.SparkClientImpl: Child process exited 
> with code 1 
> {noformat} 
> In the log, the following message shows up:
> {noformat}
> 17/06/16 16:13:12 [main]: INFO 

[jira] [Commented] (HIVE-16869) Hive returns wrong result when predicates on non-existing columns are pushed down to Parquet reader

2017-06-09 Thread Yibing Shi (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16045279#comment-16045279
 ] 

Yibing Shi commented on HIVE-16869:
---

The idea of the patch is to change the logic of the {{OR}} predicate. Currently, 
if a child of an {{OR}} predicate returns a null predicate, that child is 
ignored. This is not correct. A null predicate means the condition is on a 
column that doesn't exist in the Parquet file (a partition column, etc.), so 
its truth value is unknown. In such a scenario, the whole {{OR}} should also 
return a null predicate, so that the record is returned for further checking 
(if this {{OR}} is at the top level), or so that the parent predicate can be 
evaluated correctly (if the current {{OR}} is a child of another predicate).
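The rule described above can be sketched like this (illustrative Python, not the actual Parquet filter builder; the tuple encoding of predicates is an assumption for the example):

```python
def build_or(children):
    """Combine child predicates under OR.

    A child of None means its column does not exist in the Parquet file
    (e.g. a partition column), so its truth value is unknown.  The old
    behavior dropped such children and OR'ed the rest, which can filter
    out rows that actually match; the corrected rule makes the whole OR
    unknown (None) so those row groups are kept for later re-evaluation.
    """
    if any(child is None for child in children):
        return None  # unknown child => whole OR is unknown
    return ("or", list(children))

a_eq_999 = ("eq", "a", 999)   # pushable: column "a" exists in the file
p_eq_999 = None               # not pushable: "p" is a partition column

print(build_or([a_eq_999, p_eq_999]))  # None -> no row-group filtering
```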

> Hive returns wrong result when predicates on non-existing columns are pushed 
> down to Parquet reader
> ---
>
> Key: HIVE-16869
> URL: https://issues.apache.org/jira/browse/HIVE-16869
> Project: Hive
>  Issue Type: Bug
>Reporter: Yibing Shi
>Assignee: Yibing Shi
>Priority: Critical
> Attachments: HIVE-16869.1.patch, HIVE-16869.2.patch
>
>
> When {{hive.optimize.ppd}} and {{hive.optimize.index.filter}} are turned on, 
> and a select query has a condition on a column that doesn't exist in the 
> Parquet file (such as a partition column), Hive often returns wrong results.
> Please see below example for details:
> {noformat}
> hive> create table test_parq (a int, b int) partitioned by (p int) stored as 
> parquet;
> OK
> Time taken: 0.292 seconds
> hive> insert overwrite table test_parq partition (p=1) values (1, 2);
> OK
> Time taken: 5.08 seconds
> hive> select * from test_parq where a=1 and p=1;
> OK
> 1 2   1
> Time taken: 0.441 seconds, Fetched: 1 row(s)
> hive> select * from test_parq where (a=1 and p=1) or (a=999 and p=999);
> OK
> 1 2   1
> Time taken: 0.197 seconds, Fetched: 1 row(s)
> hive> set hive.optimize.index.filter=true;
> hive> select * from test_parq where (a=1 and p=1) or (a=999 and p=999);
> OK
> Time taken: 0.167 seconds
> hive> select * from test_parq where (a=1 or a=999) and (a=999 or p=1);
> OK
> Time taken: 0.563 seconds
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16869) Hive returns wrong result when predicates on non-existing columns are pushed down to Parquet reader

2017-06-09 Thread Yibing Shi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yibing Shi updated HIVE-16869:
--
Attachment: HIVE-16869.2.patch

fix the typo in qtest

> Hive returns wrong result when predicates on non-existing columns are pushed 
> down to Parquet reader
> ---
>
> Key: HIVE-16869
> URL: https://issues.apache.org/jira/browse/HIVE-16869
> Project: Hive
>  Issue Type: Bug
>Reporter: Yibing Shi
>Assignee: Yibing Shi
>Priority: Critical
> Attachments: HIVE-16869.1.patch, HIVE-16869.2.patch
>
>
> When {{hive.optimize.ppd}} and {{hive.optimize.index.filter}} are turned on, 
> and a select query has a condition on a column that doesn't exist in the 
> Parquet file (such as a partition column), Hive often returns wrong results.
> Please see below example for details:
> {noformat}
> hive> create table test_parq (a int, b int) partitioned by (p int) stored as 
> parquet;
> OK
> Time taken: 0.292 seconds
> hive> insert overwrite table test_parq partition (p=1) values (1, 2);
> OK
> Time taken: 5.08 seconds
> hive> select * from test_parq where a=1 and p=1;
> OK
> 1 2   1
> Time taken: 0.441 seconds, Fetched: 1 row(s)
> hive> select * from test_parq where (a=1 and p=1) or (a=999 and p=999);
> OK
> 1 2   1
> Time taken: 0.197 seconds, Fetched: 1 row(s)
> hive> set hive.optimize.index.filter=true;
> hive> select * from test_parq where (a=1 and p=1) or (a=999 and p=999);
> OK
> Time taken: 0.167 seconds
> hive> select * from test_parq where (a=1 or a=999) and (a=999 or p=1);
> OK
> Time taken: 0.563 seconds
> {noformat}





[jira] [Updated] (HIVE-16869) Hive returns wrong result when predicates on non-existing columns are pushed down to Parquet reader

2017-06-09 Thread Yibing Shi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yibing Shi updated HIVE-16869:
--
Status: Patch Available  (was: Open)

> Hive returns wrong result when predicates on non-existing columns are pushed 
> down to Parquet reader
> ---
>
> Key: HIVE-16869
> URL: https://issues.apache.org/jira/browse/HIVE-16869
> Project: Hive
>  Issue Type: Bug
>Reporter: Yibing Shi
>Assignee: Yibing Shi
>Priority: Critical
> Attachments: HIVE-16869.1.patch
>
>
> When {{hive.optimize.ppd}} and {{hive.optimize.index.filter}} are turned on, 
> and a select query has a condition on a column that doesn't exist in the 
> Parquet file (such as a partition column), Hive often returns wrong results.
> Please see below example for details:
> {noformat}
> hive> create table test_parq (a int, b int) partitioned by (p int) stored as 
> parquet;
> OK
> Time taken: 0.292 seconds
> hive> insert overwrite table test_parq partition (p=1) values (1, 2);
> OK
> Time taken: 5.08 seconds
> hive> select * from test_parq where a=1 and p=1;
> OK
> 1 2   1
> Time taken: 0.441 seconds, Fetched: 1 row(s)
> hive> select * from test_parq where (a=1 and p=1) or (a=999 and p=999);
> OK
> 1 2   1
> Time taken: 0.197 seconds, Fetched: 1 row(s)
> hive> set hive.optimize.index.filter=true;
> hive> select * from test_parq where (a=1 and p=1) or (a=999 and p=999);
> OK
> Time taken: 0.167 seconds
> hive> select * from test_parq where (a=1 or a=999) and (a=999 or p=1);
> OK
> Time taken: 0.563 seconds
> {noformat}





[jira] [Updated] (HIVE-16869) Hive returns wrong result when predicates on non-existing columns are pushed down to Parquet reader

2017-06-09 Thread Yibing Shi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yibing Shi updated HIVE-16869:
--
Attachment: HIVE-16869.1.patch

Submit a patch

> Hive returns wrong result when predicates on non-existing columns are pushed 
> down to Parquet reader
> ---
>
> Key: HIVE-16869
> URL: https://issues.apache.org/jira/browse/HIVE-16869
> Project: Hive
>  Issue Type: Bug
>Reporter: Yibing Shi
>Assignee: Yibing Shi
>Priority: Critical
> Attachments: HIVE-16869.1.patch
>
>
> When {{hive.optimize.ppd}} and {{hive.optimize.index.filter}} are turned on, 
> and a select query has a condition on a column that doesn't exist in the 
> Parquet file (such as a partition column), Hive often returns wrong results.
> Please see below example for details:
> {noformat}
> hive> create table test_parq (a int, b int) partitioned by (p int) stored as 
> parquet;
> OK
> Time taken: 0.292 seconds
> hive> insert overwrite table test_parq partition (p=1) values (1, 2);
> OK
> Time taken: 5.08 seconds
> hive> select * from test_parq where a=1 and p=1;
> OK
> 1 2   1
> Time taken: 0.441 seconds, Fetched: 1 row(s)
> hive> select * from test_parq where (a=1 and p=1) or (a=999 and p=999);
> OK
> 1 2   1
> Time taken: 0.197 seconds, Fetched: 1 row(s)
> hive> set hive.optimize.index.filter=true;
> hive> select * from test_parq where (a=1 and p=1) or (a=999 and p=999);
> OK
> Time taken: 0.167 seconds
> hive> select * from test_parq where (a=1 or a=999) and (a=999 or p=1);
> OK
> Time taken: 0.563 seconds
> {noformat}





[jira] [Assigned] (HIVE-16869) Hive returns wrong result when predicates on non-existing columns are pushed down to Parquet reader

2017-06-09 Thread Yibing Shi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yibing Shi reassigned HIVE-16869:
-


> Hive returns wrong result when predicates on non-existing columns are pushed 
> down to Parquet reader
> ---
>
> Key: HIVE-16869
> URL: https://issues.apache.org/jira/browse/HIVE-16869
> Project: Hive
>  Issue Type: Bug
>Reporter: Yibing Shi
>Assignee: Yibing Shi
>Priority: Critical
>
> When {{hive.optimize.ppd}} and {{hive.optimize.index.filter}} are turned on, 
> and a select query has a condition on a column that doesn't exist in the 
> Parquet file (such as a partition column), Hive often returns wrong results.
> Please see below example for details:
> {noformat}
> hive> create table test_parq (a int, b int) partitioned by (p int) stored as 
> parquet;
> OK
> Time taken: 0.292 seconds
> hive> insert overwrite table test_parq partition (p=1) values (1, 2);
> OK
> Time taken: 5.08 seconds
> hive> select * from test_parq where a=1 and p=1;
> OK
> 1 2   1
> Time taken: 0.441 seconds, Fetched: 1 row(s)
> hive> select * from test_parq where (a=1 and p=1) or (a=999 and p=999);
> OK
> 1 2   1
> Time taken: 0.197 seconds, Fetched: 1 row(s)
> hive> set hive.optimize.index.filter=true;
> hive> select * from test_parq where (a=1 and p=1) or (a=999 and p=999);
> OK
> Time taken: 0.167 seconds
> hive> select * from test_parq where (a=1 or a=999) and (a=999 or p=1);
> OK
> Time taken: 0.563 seconds
> {noformat}





[jira] [Commented] (HIVE-16660) Not able to add partition for views in hive when sentry is enabled

2017-05-12 Thread Yibing Shi (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16009153#comment-16009153
 ] 

Yibing Shi commented on HIVE-16660:
---

[~ychena], should we solve these two problems in two separate JIRAs? They are 
not related.

> Not able to add partition for views in hive when sentry is enabled
> --
>
> Key: HIVE-16660
> URL: https://issues.apache.org/jira/browse/HIVE-16660
> Project: Hive
>  Issue Type: Bug
>  Components: Parser
>Reporter: Yongzhi Chen
>Assignee: Yongzhi Chen
> Attachments: HIVE-16660.1.patch
>
>
> Repro:
> create table tesnit (a int) partitioned by (p int);
> insert into table tesnit partition (p = 1) values (1);
> insert into table tesnit partition (p = 2) values (1);
> create view test_view partitioned on (p) as select * from tesnit where p =1;
> alter view test_view add partition (p = 2);
> Error: Error while compiling statement: FAILED: SemanticException [Error 
> 10056]: The query does not reference any valid partition. To run this query, 
> set hive.mapred.mode=nonstrict (state=42000,code=10056)





[jira] [Comment Edited] (HIVE-16646) Alias in transform ... as clause shouldn't be case sensitive

2017-05-12 Thread Yibing Shi (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16007695#comment-16007695
 ] 

Yibing Shi edited comment on HIVE-16646 at 5/12/17 6:37 AM:


These errors seem irrelevant. Could you please have a look as well [~ychena]?



was (Author: yibing):
Pulled down the latest master branch, and applied the patch from [~ychena]. The 
failed tests listed above all succeed for me. Can we kick off the test again 
and see how it goes?


> Alias in transform ... as clause shouldn't be case sensitive
> 
>
> Key: HIVE-16646
> URL: https://issues.apache.org/jira/browse/HIVE-16646
> Project: Hive
>  Issue Type: Bug
>  Components: hpl/sql
>Reporter: Yibing Shi
>Assignee: Yibing Shi
> Attachments: HIVE-16646.1.patch, HIVE-16646.2.patch
>
>
> Create a table like below:
> {code:sql}
> CREATE TABLE hive_bug(col1 string);
> {code}
> Run below query in Hive:
> {code}
> from hive_bug select transform(col1) using '/bin/cat' as ( string);
> {code}
> The result would be:
> {noformat}
> 0: jdbc:hive2://localhost:1> from hive_bug select transform(col1) using 
> '/bin/cat' as ( string);
> ..
> INFO  : OK
> +---+--+
> |   |
> +---+--+
> +---+--+
> {noformat}
> The output column name is ** instead of the lowercase .





[jira] [Commented] (HIVE-16646) Alias in transform ... as clause shouldn't be case sensitive

2017-05-12 Thread Yibing Shi (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16007695#comment-16007695
 ] 

Yibing Shi commented on HIVE-16646:
---

Pulled down the latest master branch, and applied the patch from [~ychena]. The 
failed tests listed above all succeed for me. Can we kick off the test again 
and see how it goes?


> Alias in transform ... as clause shouldn't be case sensitive
> 
>
> Key: HIVE-16646
> URL: https://issues.apache.org/jira/browse/HIVE-16646
> Project: Hive
>  Issue Type: Bug
>  Components: hpl/sql
>Reporter: Yibing Shi
>Assignee: Yibing Shi
> Attachments: HIVE-16646.1.patch, HIVE-16646.2.patch
>
>
> Create a table like below:
> {code:sql}
> CREATE TABLE hive_bug(col1 string);
> {code}
> Run below query in Hive:
> {code}
> from hive_bug select transform(col1) using '/bin/cat' as ( string);
> {code}
> The result would be:
> {noformat}
> 0: jdbc:hive2://localhost:1> from hive_bug select transform(col1) using 
> '/bin/cat' as ( string);
> ..
> INFO  : OK
> +---+--+
> |   |
> +---+--+
> +---+--+
> {noformat}
> The output column name is ** instead of the lowercase .





[jira] [Commented] (HIVE-16646) Alias in transform ... as clause shouldn't be case sensitive

2017-05-11 Thread Yibing Shi (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16007339#comment-16007339
 ] 

Yibing Shi commented on HIVE-16646:
---

Thank you, [~ychena]!

> Alias in transform ... as clause shouldn't be case sensitive
> 
>
> Key: HIVE-16646
> URL: https://issues.apache.org/jira/browse/HIVE-16646
> Project: Hive
>  Issue Type: Bug
>  Components: hpl/sql
>Reporter: Yibing Shi
>Assignee: Yibing Shi
> Attachments: HIVE-16646.1.patch, HIVE-16646.2.patch
>
>
> Create a table like below:
> {code:sql}
> CREATE TABLE hive_bug(col1 string);
> {code}
> Run below query in Hive:
> {code}
> from hive_bug select transform(col1) using '/bin/cat' as ( string);
> {code}
> The result would be:
> {noformat}
> 0: jdbc:hive2://localhost:1> from hive_bug select transform(col1) using 
> '/bin/cat' as ( string);
> ..
> INFO  : OK
> +---+--+
> |   |
> +---+--+
> +---+--+
> {noformat}
> The output column name is ** instead of the lowercase .





[jira] [Updated] (HIVE-16646) Alias in transform ... as clause shouldn't be case sensitive

2017-05-11 Thread Yibing Shi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yibing Shi updated HIVE-16646:
--
Assignee: Yibing Shi
  Status: Patch Available  (was: Open)

> Alias in transform ... as clause shouldn't be case sensitive
> 
>
> Key: HIVE-16646
> URL: https://issues.apache.org/jira/browse/HIVE-16646
> Project: Hive
>  Issue Type: Bug
>  Components: hpl/sql
>Reporter: Yibing Shi
>Assignee: Yibing Shi
> Attachments: HIVE-16646.1.patch
>
>
> Create a table like below:
> {code:sql}
> CREATE TABLE hive_bug(col1 string);
> {code}
> Run below query in Hive:
> {code}
> from hive_bug select transform(col1) using '/bin/cat' as ( string);
> {code}
> The result would be:
> {noformat}
> 0: jdbc:hive2://localhost:1> from hive_bug select transform(col1) using 
> '/bin/cat' as ( string);
> ..
> INFO  : OK
> +---+--+
> |   |
> +---+--+
> +---+--+
> {noformat}
> The output column name is ** instead of the lowercase .





[jira] [Updated] (HIVE-16646) Alias in transform ... as clause shouldn't be case sensitive

2017-05-11 Thread Yibing Shi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yibing Shi updated HIVE-16646:
--
Attachment: HIVE-16646.1.patch

Attach a patch

> Alias in transform ... as clause shouldn't be case sensitive
> 
>
> Key: HIVE-16646
> URL: https://issues.apache.org/jira/browse/HIVE-16646
> Project: Hive
>  Issue Type: Bug
>  Components: hpl/sql
>Reporter: Yibing Shi
> Attachments: HIVE-16646.1.patch
>
>
> Create a table like below:
> {code:sql}
> CREATE TABLE hive_bug(col1 string);
> {code}
> Run below query in Hive:
> {code}
> from hive_bug select transform(col1) using '/bin/cat' as ( string);
> {code}
> The result would be:
> {noformat}
> 0: jdbc:hive2://localhost:1> from hive_bug select transform(col1) using 
> '/bin/cat' as ( string);
> ..
> INFO  : OK
> +---+--+
> |   |
> +---+--+
> +---+--+
> {noformat}
> The output column name is ** instead of the lowercase .





[jira] [Commented] (HIVE-16646) Alias in transform ... as clause shouldn't be case sensitive

2017-05-11 Thread Yibing Shi (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16006394#comment-16006394
 ] 

Yibing Shi commented on HIVE-16646:
---

Another query that can show this problem more clearly is:
{code:sql}
select t.col from ( 
select transform(col) using 'cat' as (COL string) from transform3_t1
) t;
{code}

It fails with below error:
{noformat}
FAILED: SemanticException [Error 10002]: Line 1:9 Invalid column reference 'col'
{noformat}

Changing {{as (COL string)}} to {{as (col string)}} makes the query run 
properly.
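A sketch of the case-insensitive alias resolution the title asks for (illustrative Python, not Hive's actual resolver):

```python
def resolve_alias(name, aliases):
    """Look up a column alias ignoring case, so an alias declared as
    "COL" in the transform ... as clause is still found when it is
    referenced as "col" by an outer query."""
    lowered = {alias.lower(): pos for alias, pos in aliases.items()}
    return lowered.get(name.lower())

# Alias declared as COL in the inner query, referenced as t.col outside:
print(resolve_alias("col", {"COL": 0}))  # 0
```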

> Alias in transform ... as clause shouldn't be case sensitive
> 
>
> Key: HIVE-16646
> URL: https://issues.apache.org/jira/browse/HIVE-16646
> Project: Hive
>  Issue Type: Bug
>  Components: hpl/sql
>Reporter: Yibing Shi
>
> Create a table like below:
> {code:sql}
> CREATE TABLE hive_bug(col1 string);
> {code}
> Run below query in Hive:
> {code}
> from hive_bug select transform(col1) using '/bin/cat' as ( string);
> {code}
> The result would be:
> {noformat}
> 0: jdbc:hive2://localhost:1> from hive_bug select transform(col1) using 
> '/bin/cat' as ( string);
> ..
> INFO  : OK
> +---+--+
> |   |
> +---+--+
> +---+--+
> {noformat}
> The output column name is ** instead of the lowercase .





[jira] [Updated] (HIVE-16291) Hive fails when unions a parquet table with itself

2017-04-05 Thread Yibing Shi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yibing Shi updated HIVE-16291:
--
Attachment: HIVE-16291.2.patch

> Hive fails when unions a parquet table with itself
> --
>
> Key: HIVE-16291
> URL: https://issues.apache.org/jira/browse/HIVE-16291
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Yibing Shi
>Assignee: Yibing Shi
> Attachments: HIVE-16291.1.patch, HIVE-16291.2.patch
>
>
> Reproduce commands:
> {code:sql}
> create table tst_unin (col1 int) partitioned by (p_tdate int) stored as 
> parquet;
> insert into tst_unin partition (p_tdate=201603) values (20160312), (20160310);
> insert into tst_unin partition (p_tdate=201604) values (20160412), (20160410);
> select count(*) from (select tst_unin.p_tdate from tst_unin where 
> tst_unin.col1=20160302 union all select tst_unin.p_tdate from tst_unin) t1;
> {code}
> The table is stored in the Parquet format, which is a columnar file format. Hive
> tries to push the required columns down to the table scan operators so that only
> the needed columns are read. This is done by adding the needed column IDs
> to the job configuration under the property "hive.io.file.readcolumn.ids".
> In the above case, the query unions the results of 2 subqueries that select data
> from the same table. The first subquery doesn't need any column from the Parquet
> file, while the second subquery needs the column "col1". Hive has a bug here:
> it ends up setting "hive.io.file.readcolumn.ids" to a value like "0,,0", which
> the method ColumnProjectionUtils.getReadColumnIDs cannot parse.
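The failure mode is easy to reproduce outside Hive. The sketch below (a simplified stand-in for {{ColumnProjectionUtils.getReadColumnIDs}}, not the actual implementation) shows why a value like {{"0,,0"}} breaks a naive split-and-parse:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: parse a comma-separated column-ID string the way a naive
// implementation would. The empty entry in "0,,0" makes
// Integer.parseInt throw, which is the symptom described above.
public class ReadColumnIds {
    public static List<Integer> parse(String confStr) {
        List<Integer> ids = new ArrayList<>();
        for (String s : confStr.split(",")) {
            ids.add(Integer.parseInt(s)); // throws NumberFormatException on ""
        }
        return ids;
    }

    public static void main(String[] args) {
        System.out.println(parse("0,0"));   // fine: [0, 0]
        try {
            parse("0,,0");                  // the buggy value from the report
        } catch (NumberFormatException e) {
            System.out.println("cannot parse the empty entry");
        }
    }
}
```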



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16291) Hive fails when unions a parquet table with itself

2017-04-05 Thread Yibing Shi (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15958307#comment-15958307
 ] 

Yibing Shi commented on HIVE-16291:
---

[~aihuaxu]
Actually, I have just realized that we can change it to the line below:
{code}
String newConfStr = HiveStringUtils.joinIgnoringEmpty(new String[] {id, old}, 
StringUtils.COMMA);
{code}

> Hive fails when unions a parquet table with itself
> --
>
> Key: HIVE-16291
> URL: https://issues.apache.org/jira/browse/HIVE-16291
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Yibing Shi
>Assignee: Yibing Shi
> Attachments: HIVE-16291.1.patch
>
>
> Reproduce commands:
> {code:sql}
> create table tst_unin (col1 int) partitioned by (p_tdate int) stored as 
> parquet;
> insert into tst_unin partition (p_tdate=201603) values (20160312), (20160310);
> insert into tst_unin partition (p_tdate=201604) values (20160412), (20160410);
> select count(*) from (select tst_unin.p_tdate from tst_unin where 
> tst_unin.col1=20160302 union all select tst_unin.p_tdate from tst_unin) t1;
> {code}
> The table is stored in the Parquet format, which is a columnar file format. Hive
> tries to push the required columns down to the table scan operators so that only
> the needed columns are read. This is done by adding the needed column IDs
> to the job configuration under the property "hive.io.file.readcolumn.ids".
> In the above case, the query unions the results of 2 subqueries that select data
> from the same table. The first subquery doesn't need any column from the Parquet
> file, while the second subquery needs the column "col1". Hive has a bug here:
> it ends up setting "hive.io.file.readcolumn.ids" to a value like "0,,0", which
> the method ColumnProjectionUtils.getReadColumnIDs cannot parse.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16291) Hive fails when unions a parquet table with itself

2017-04-05 Thread Yibing Shi (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15958304#comment-15958304
 ] 

Yibing Shi commented on HIVE-16291:
---

[~aihuaxu]
Sorry for the delay! I was totally stuck on other problems and didn't get a
chance to check this.
I submitted my patch trying to minimize the scope of my changes (touching as few
lines as possible). Yes, I agree that the logic is a bit confusing. Your
suggestions look great! I have a slightly modified version below. What do you
think?

{code}
String newConfStr = null;
for (String s : Arrays.asList(id, old)) {
  if (org.apache.commons.lang.StringUtils.isNotBlank(s)) {
newConfStr = newConfStr == null ? s : newConfStr + 
StringUtils.COMMA_STR + s;
  }
}

{code}
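Both the loop above and the suggested {{joinIgnoringEmpty}} aim for the same joining behaviour: blank entries contribute nothing, so the configuration value can never become {{"0,,0"}}. A self-contained sketch (the method name here is illustrative, not necessarily HiveStringUtils' exact signature):

```java
// Sketch of the joining behaviour both variants aim for: skip null/blank
// entries so joining "0" with "" never yields a value like "0,,0".
public class JoinDemo {
    public static String joinIgnoringEmpty(String[] parts, char sep) {
        StringBuilder sb = new StringBuilder();
        for (String p : parts) {
            if (p == null || p.trim().isEmpty()) {
                continue; // the fix: blank entries contribute nothing
            }
            if (sb.length() > 0) {
                sb.append(sep);
            }
            sb.append(p);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(joinIgnoringEmpty(new String[] {"0", "", "0"}, ',')); // 0,0
        System.out.println(joinIgnoringEmpty(new String[] {"1", null}, ','));    // 1
    }
}
```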

> Hive fails when unions a parquet table with itself
> --
>
> Key: HIVE-16291
> URL: https://issues.apache.org/jira/browse/HIVE-16291
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Yibing Shi
>Assignee: Yibing Shi
> Attachments: HIVE-16291.1.patch
>
>
> Reproduce commands:
> {code:sql}
> create table tst_unin (col1 int) partitioned by (p_tdate int) stored as 
> parquet;
> insert into tst_unin partition (p_tdate=201603) values (20160312), (20160310);
> insert into tst_unin partition (p_tdate=201604) values (20160412), (20160410);
> select count(*) from (select tst_unin.p_tdate from tst_unin where 
> tst_unin.col1=20160302 union all select tst_unin.p_tdate from tst_unin) t1;
> {code}
> The table is stored in the Parquet format, which is a columnar file format. Hive
> tries to push the required columns down to the table scan operators so that only
> the needed columns are read. This is done by adding the needed column IDs
> to the job configuration under the property "hive.io.file.readcolumn.ids".
> In the above case, the query unions the results of 2 subqueries that select data
> from the same table. The first subquery doesn't need any column from the Parquet
> file, while the second subquery needs the column "col1". Hive has a bug here:
> it ends up setting "hive.io.file.readcolumn.ids" to a value like "0,,0", which
> the method ColumnProjectionUtils.getReadColumnIDs cannot parse.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16291) Hive fails when unions a parquet table with itself

2017-03-24 Thread Yibing Shi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yibing Shi updated HIVE-16291:
--
Assignee: Yibing Shi
  Status: Patch Available  (was: Open)

> Hive fails when unions a parquet table with itself
> --
>
> Key: HIVE-16291
> URL: https://issues.apache.org/jira/browse/HIVE-16291
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Yibing Shi
>Assignee: Yibing Shi
> Attachments: HIVE-16291.1.patch
>
>
> Reproduce commands:
> {code:sql}
> create table tst_unin (col1 int) partitioned by (p_tdate int) stored as 
> parquet;
> insert into tst_unin partition (p_tdate=201603) values (20160312), (20160310);
> insert into tst_unin partition (p_tdate=201604) values (20160412), (20160410);
> select count(*) from (select tst_unin.p_tdate from tst_unin union all select 
> tst_unin.p_tdate from tst_unin where tst_unin.col1=20160302) t1;
> {code}
> The table is stored in the Parquet format, which is a columnar file format. Hive
> tries to push the required columns down to the table scan operators so that only
> the needed columns are read. This is done by adding the needed column IDs
> to the job configuration under the property "hive.io.file.readcolumn.ids".
> In the above case, the query unions the results of 2 subqueries that select data
> from the same table. The first subquery doesn't need any column from the Parquet
> file, while the second subquery needs the column "col1". Hive has a bug here:
> it ends up setting "hive.io.file.readcolumn.ids" to a value like "0,,0", which
> the method ColumnProjectionUtils.getReadColumnIDs cannot parse.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16291) Hive fails when unions a parquet table with itself

2017-03-24 Thread Yibing Shi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yibing Shi updated HIVE-16291:
--
Attachment: HIVE-16291.1.patch

> Hive fails when unions a parquet table with itself
> --
>
> Key: HIVE-16291
> URL: https://issues.apache.org/jira/browse/HIVE-16291
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Yibing Shi
> Attachments: HIVE-16291.1.patch
>
>
> Reproduce commands:
> {code:sql}
> create table tst_unin (col1 int) partitioned by (p_tdate int) stored as 
> parquet;
> insert into tst_unin partition (p_tdate=201603) values (20160312), (20160310);
> insert into tst_unin partition (p_tdate=201604) values (20160412), (20160410);
> select count(*) from (select tst_unin.p_tdate from tst_unin union all select 
> tst_unin.p_tdate from tst_unin where tst_unin.col1=20160302) t1;
> {code}
> The table is stored in the Parquet format, which is a columnar file format. Hive
> tries to push the required columns down to the table scan operators so that only
> the needed columns are read. This is done by adding the needed column IDs
> to the job configuration under the property "hive.io.file.readcolumn.ids".
> In the above case, the query unions the results of 2 subqueries that select data
> from the same table. The first subquery doesn't need any column from the Parquet
> file, while the second subquery needs the column "col1". Hive has a bug here:
> it ends up setting "hive.io.file.readcolumn.ids" to a value like "0,,0", which
> the method ColumnProjectionUtils.getReadColumnIDs cannot parse.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-15530) Optimize the column stats update logic in table alteration

2017-01-10 Thread Yibing Shi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yibing Shi updated HIVE-15530:
--
Attachment: HIVE-15530.5.patch

Attach a new patch based on [~ctang.ma]'s comment

> Optimize the column stats update logic in table alteration
> --
>
> Key: HIVE-15530
> URL: https://issues.apache.org/jira/browse/HIVE-15530
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Yibing Shi
>Assignee: Yibing Shi
> Attachments: HIVE-15530.1.patch, HIVE-15530.2.patch, 
> HIVE-15530.3.patch, HIVE-15530.4.patch, HIVE-15530.5.patch
>
>
> Currently, when a table is altered, if any of the conditions below is true, HMS
> tries to update the column statistics for the table:
> # the database name is changed
> # the table name is changed
> # the old columns and the new columns are not the same
> As a result, when a column is added to a table, Hive also tries to update the
> column statistics, which is not necessary. We can loosen the last condition by
> checking whether any of the existing columns has changed. If not, we don't
> have to update the stats info.
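A minimal sketch of the proposed check (the names and signature are illustrative, not HMS's actual API): appending columns keeps the existing ones as a prefix, so the stored stats stay valid and no update is needed.

```java
import java.util.Arrays;
import java.util.List;

// Sketch of the loosened condition: only update column stats when the
// database or table name changed, or when some pre-existing column was
// renamed/retyped. Merely appending new columns triggers no update.
public class StatsUpdateCheck {
    public static boolean needsStatsUpdate(String oldDb, String newDb,
                                           String oldTbl, String newTbl,
                                           List<String> oldCols, List<String> newCols) {
        if (!oldDb.equalsIgnoreCase(newDb) || !oldTbl.equalsIgnoreCase(newTbl)) {
            return true;
        }
        // Appending columns keeps the old ones as a prefix; stats stay valid.
        if (newCols.size() >= oldCols.size()
                && newCols.subList(0, oldCols.size()).equals(oldCols)) {
            return false;
        }
        return true;
    }

    public static void main(String[] args) {
        List<String> oldCols = Arrays.asList("col1");
        // Adding a column: existing stats remain valid.
        System.out.println(needsStatsUpdate("db", "db", "t", "t",
                oldCols, Arrays.asList("col1", "col2"))); // false
        // Renaming a column: stats must be updated or removed.
        System.out.println(needsStatsUpdate("db", "db", "t", "t",
                oldCols, Arrays.asList("c1"))); // true
    }
}
```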



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-15530) Optimize the column stats update logic in table alteration

2017-01-10 Thread Yibing Shi (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15814799#comment-15814799
 ] 

Yibing Shi commented on HIVE-15530:
---

You are right that the column stats don't need to be updated if only the column
positions are changed. The current patch doesn't handle this, because I didn't
notice that {{areSameColumns}} also compares column positions. I will upload a
new patch soon.

> Optimize the column stats update logic in table alteration
> --
>
> Key: HIVE-15530
> URL: https://issues.apache.org/jira/browse/HIVE-15530
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Yibing Shi
>Assignee: Yibing Shi
> Attachments: HIVE-15530.1.patch, HIVE-15530.2.patch, 
> HIVE-15530.3.patch, HIVE-15530.4.patch
>
>
> Currently, when a table is altered, if any of the conditions below is true, HMS
> tries to update the column statistics for the table:
> # the database name is changed
> # the table name is changed
> # the old columns and the new columns are not the same
> As a result, when a column is added to a table, Hive also tries to update the
> column statistics, which is not necessary. We can loosen the last condition by
> checking whether any of the existing columns has changed. If not, we don't
> have to update the stats info.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-15530) Optimize the column stats update logic in table alteration

2017-01-09 Thread Yibing Shi (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15812787#comment-15812787
 ] 

Yibing Shi commented on HIVE-15530:
---

Hi [~ctang.ma], thanks for looking into this patch! I believe that the stats
should still be updated in the scenario you described, because it is the column
name (not the ID) that is stored in the stats tables. When a column name is
changed, the existing stats info should be updated, or at least removed.

> Optimize the column stats update logic in table alteration
> --
>
> Key: HIVE-15530
> URL: https://issues.apache.org/jira/browse/HIVE-15530
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Yibing Shi
>Assignee: Yibing Shi
> Attachments: HIVE-15530.1.patch, HIVE-15530.2.patch, 
> HIVE-15530.3.patch, HIVE-15530.4.patch
>
>
> Currently, when a table is altered, if any of the conditions below is true, HMS
> tries to update the column statistics for the table:
> # the database name is changed
> # the table name is changed
> # the old columns and the new columns are not the same
> As a result, when a column is added to a table, Hive also tries to update the
> column statistics, which is not necessary. We can loosen the last condition by
> checking whether any of the existing columns has changed. If not, we don't
> have to update the stats info.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)



[jira] [Updated] (HIVE-15530) Optimize the column stats update logic in table alteration

2017-01-05 Thread Yibing Shi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yibing Shi updated HIVE-15530:
--
Attachment: HIVE-15530.4.patch

Thanks [~aihuaxu] for looking into the patch. I have corrected the license
declaration in the new files based on your suggestion.

> Optimize the column stats update logic in table alteration
> --
>
> Key: HIVE-15530
> URL: https://issues.apache.org/jira/browse/HIVE-15530
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Yibing Shi
>Assignee: Yibing Shi
> Attachments: HIVE-15530.1.patch, HIVE-15530.2.patch, 
> HIVE-15530.3.patch, HIVE-15530.4.patch
>
>
> Currently, when a table is altered, if any of the conditions below is true, HMS
> tries to update the column statistics for the table:
> # the database name is changed
> # the table name is changed
> # the old columns and the new columns are not the same
> As a result, when a column is added to a table, Hive also tries to update the
> column statistics, which is not necessary. We can loosen the last condition by
> checking whether any of the existing columns has changed. If not, we don't
> have to update the stats info.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-15530) Optimize the column stats update logic in table alteration

2017-01-05 Thread Yibing Shi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yibing Shi updated HIVE-15530:
--
Attachment: HIVE-15530.3.patch

Try to fix the broken patch

> Optimize the column stats update logic in table alteration
> --
>
> Key: HIVE-15530
> URL: https://issues.apache.org/jira/browse/HIVE-15530
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Yibing Shi
>Assignee: Yibing Shi
> Attachments: HIVE-15530.1.patch, HIVE-15530.2.patch, 
> HIVE-15530.3.patch
>
>
> Currently, when a table is altered, if any of the conditions below is true, HMS
> tries to update the column statistics for the table:
> # the database name is changed
> # the table name is changed
> # the old columns and the new columns are not the same
> As a result, when a column is added to a table, Hive also tries to update the
> column statistics, which is not necessary. We can loosen the last condition by
> checking whether any of the existing columns has changed. If not, we don't
> have to update the stats info.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-15530) Optimize the column stats update logic in table alteration

2017-01-04 Thread Yibing Shi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yibing Shi updated HIVE-15530:
--
Attachment: HIVE-15530.2.patch

Add unit tests

> Optimize the column stats update logic in table alteration
> --
>
> Key: HIVE-15530
> URL: https://issues.apache.org/jira/browse/HIVE-15530
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Yibing Shi
>Assignee: Yibing Shi
> Attachments: HIVE-15530.1.patch, HIVE-15530.2.patch
>
>
> Currently, when a table is altered, if any of the conditions below is true, HMS
> tries to update the column statistics for the table:
> # the database name is changed
> # the table name is changed
> # the old columns and the new columns are not the same
> As a result, when a column is added to a table, Hive also tries to update the
> column statistics, which is not necessary. We can loosen the last condition by
> checking whether any of the existing columns has changed. If not, we don't
> have to update the stats info.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-15530) Optimize the column stats update logic in table alteration

2017-01-02 Thread Yibing Shi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yibing Shi updated HIVE-15530:
--
Status: Patch Available  (was: Open)

> Optimize the column stats update logic in table alteration
> --
>
> Key: HIVE-15530
> URL: https://issues.apache.org/jira/browse/HIVE-15530
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Yibing Shi
> Attachments: HIVE-15530.1.patch
>
>
> Currently, when a table is altered, if any of the conditions below is true, HMS
> tries to update the column statistics for the table:
> # the database name is changed
> # the table name is changed
> # the old columns and the new columns are not the same
> As a result, when a column is added to a table, Hive also tries to update the
> column statistics, which is not necessary. We can loosen the last condition by
> checking whether any of the existing columns has changed. If not, we don't
> have to update the stats info.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HIVE-15530) Optimize the column stats update logic in table alteration

2017-01-02 Thread Yibing Shi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yibing Shi reassigned HIVE-15530:
-

Assignee: Yibing Shi

> Optimize the column stats update logic in table alteration
> --
>
> Key: HIVE-15530
> URL: https://issues.apache.org/jira/browse/HIVE-15530
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Yibing Shi
>Assignee: Yibing Shi
> Attachments: HIVE-15530.1.patch
>
>
> Currently, when a table is altered, if any of the conditions below is true, HMS
> tries to update the column statistics for the table:
> # the database name is changed
> # the table name is changed
> # the old columns and the new columns are not the same
> As a result, when a column is added to a table, Hive also tries to update the
> column statistics, which is not necessary. We can loosen the last condition by
> checking whether any of the existing columns has changed. If not, we don't
> have to update the stats info.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-15530) Optimize the column stats update logic in table alteration

2017-01-02 Thread Yibing Shi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yibing Shi updated HIVE-15530:
--
Attachment: HIVE-15530.1.patch

> Optimize the column stats update logic in table alteration
> --
>
> Key: HIVE-15530
> URL: https://issues.apache.org/jira/browse/HIVE-15530
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Yibing Shi
> Attachments: HIVE-15530.1.patch
>
>
> Currently, when a table is altered, if any of the conditions below is true, HMS
> tries to update the column statistics for the table:
> # the database name is changed
> # the table name is changed
> # the old columns and the new columns are not the same
> As a result, when a column is added to a table, Hive also tries to update the
> column statistics, which is not necessary. We can loosen the last condition by
> checking whether any of the existing columns has changed. If not, we don't
> have to update the stats info.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-15530) Optimize the column stats update logic in table alteration

2017-01-02 Thread Yibing Shi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yibing Shi updated HIVE-15530:
--
Description: 
Currently when a table is altered, if any of below conditions is true, HMS 
would try to update column statistics for the table:

# database name is changed
# table name is changed
# old columns and new columns are not the same

As a result, when a column is added to a table, Hive also tries to update 
column statistics, which is not necessary. We can loose the last condition by 
checking whether all existing columns are changed or not. If not, we don't have 
to update stats info.

  was:
Currently when a table is altered, if any of below conditions is false, HMS 
would try to update column statistics for the table:

# database name is changed
# table name is changed
# old columns and new columns are not the same

As a result, when a column is added to a table, Hive also tries to update 
column statistics, which is not necessary. We can loose the last condition by 
checking whether all existing columns are changed or not. If not, we don't have 
to update stats info.


> Optimize the column stats update logic in table alteration
> --
>
> Key: HIVE-15530
> URL: https://issues.apache.org/jira/browse/HIVE-15530
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Yibing Shi
>
> Currently, when a table is altered, if any of the conditions below is true, HMS
> tries to update the column statistics for the table:
> # the database name is changed
> # the table name is changed
> # the old columns and the new columns are not the same
> As a result, when a column is added to a table, Hive also tries to update the
> column statistics, which is not necessary. We can loosen the last condition by
> checking whether any of the existing columns has changed. If not, we don't
> have to update the stats info.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-15225) QueryPlan.getJSONValue should code against empty string values

2016-11-16 Thread Yibing Shi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yibing Shi updated HIVE-15225:
--
Status: Patch Available  (was: Open)

> QueryPlan.getJSONValue should code against empty string values
> --
>
> Key: HIVE-15225
> URL: https://issues.apache.org/jira/browse/HIVE-15225
> Project: Hive
>  Issue Type: Bug
>Reporter: Yibing Shi
> Attachments: HIVE-15225.1.patch
>
>
> The current {{QueryPlan.getJSONValue}} implementation is as below:
> {code}
>   public String getJSONValue(Object value) {
> String v = "null";
> if (value != null) {
>   v = value.toString();
>   if (v.charAt(0) != '[' && v.charAt(0) != '{') {
> v = "\"" + v + "\"";
>   }
> }
> return v;
>   }
> {code}
> When {{value.toString()}} returns an empty string, a
> StringIndexOutOfBoundsException is thrown when "v.charAt(0)" is
> evaluated.
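A guarded version of the method (a sketch of the obvious fix, not necessarily the committed patch) simply tests for the empty string before indexing:

```java
// Guarded rewrite of the snippet above: check for the empty string
// before calling charAt(0), which would otherwise throw a
// StringIndexOutOfBoundsException.
public class JsonValue {
    public static String getJSONValue(Object value) {
        String v = "null";
        if (value != null) {
            v = value.toString();
            // Guard: charAt(0) on "" throws; empty strings get quoted too.
            if (v.isEmpty() || (v.charAt(0) != '[' && v.charAt(0) != '{')) {
                v = "\"" + v + "\"";
            }
        }
        return v;
    }

    public static void main(String[] args) {
        System.out.println(getJSONValue(""));      // prints ""
        System.out.println(getJSONValue("[1,2]")); // prints [1,2]
        System.out.println(getJSONValue(null));    // prints null
    }
}
```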



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HIVE-15225) QueryPlan.getJSONValue should code against empty string values

2016-11-16 Thread Yibing Shi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yibing Shi reassigned HIVE-15225:
-

Assignee: Yibing Shi

> QueryPlan.getJSONValue should code against empty string values
> --
>
> Key: HIVE-15225
> URL: https://issues.apache.org/jira/browse/HIVE-15225
> Project: Hive
>  Issue Type: Bug
>Reporter: Yibing Shi
>Assignee: Yibing Shi
> Attachments: HIVE-15225.1.patch
>
>
> The current {{QueryPlan.getJSONValue}} implementation is as below:
> {code}
>   public String getJSONValue(Object value) {
> String v = "null";
> if (value != null) {
>   v = value.toString();
>   if (v.charAt(0) != '[' && v.charAt(0) != '{') {
> v = "\"" + v + "\"";
>   }
> }
> return v;
>   }
> {code}
> When {{value.toString()}} returns an empty string, a
> StringIndexOutOfBoundsException is thrown when "v.charAt(0)" is
> evaluated.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-15225) QueryPlan.getJSONValue should code against empty string values

2016-11-16 Thread Yibing Shi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yibing Shi updated HIVE-15225:
--
Attachment: HIVE-15225.1.patch

Attach a quick patch

> QueryPlan.getJSONValue should code against empty string values
> --
>
> Key: HIVE-15225
> URL: https://issues.apache.org/jira/browse/HIVE-15225
> Project: Hive
>  Issue Type: Bug
>Reporter: Yibing Shi
> Attachments: HIVE-15225.1.patch
>
>
> The current {{QueryPlan.getJSONValue}} implementation is as below:
> {code}
>   public String getJSONValue(Object value) {
> String v = "null";
> if (value != null) {
>   v = value.toString();
>   if (v.charAt(0) != '[' && v.charAt(0) != '{') {
> v = "\"" + v + "\"";
>   }
> }
> return v;
>   }
> {code}
> When {{value.toString()}} returns an empty string, a
> StringIndexOutOfBoundsException is thrown when "v.charAt(0)" is
> evaluated.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14609) HS2 cannot drop a function whose associated jar file has been removed

2016-08-24 Thread Yibing Shi (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15435978#comment-15435978
 ] 

Yibing Shi commented on HIVE-14609:
---

To drop a function, Hive first gets the function definition:
https://github.com/cloudera/hive/blob/cdh5-1.1.0_5.8.0/ql/src/java/org/apache/hadoop/hive/ql/parse/FunctionSemanticAnalyzer.java#L99
{code}
FunctionInfo info = FunctionRegistry.getFunctionInfo(functionName);
if (info == null) {
  if (throwException) {
throw new 
SemanticException(ErrorMsg.INVALID_FUNCTION.getMsg(functionName));
  } else {
// Fail silently
return;
  }
} else if (info.isBuiltIn()) {
  throw new 
SemanticException(ErrorMsg.DROP_NATIVE_FUNCTION.getMsg(functionName));
}
{code}

Unfortunately, {{FunctionRegistry.getFunctionInfo}} tries to load the function
into the registry after getting its definition, which includes the step of
downloading jars and causes the failure. We should be able to fix this by adding
a parameter to the getFunctionInfo method to control whether to add the function
to the registry.

As for why Hive fails silently: "hive.exec.drop.ignorenonexistent" is set to
true by default, so Hive doesn't throw any exception when the failure happens.
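The idea can be sketched as follows (the {{register}} flag and the registry structures here are hypothetical, not Hive's actual signatures): let a caller such as DROP FUNCTION fetch the definition without the side effect of session registration, which is the step that downloads the jar.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch of the suggested change: a lookup that can skip
// registering the function into the session (the step that would
// download jars and load classes), so DROP FUNCTION works even when
// the jar file is gone.
public class FunctionLookup {
    static Map<String, String> catalog = new HashMap<>();         // name -> class
    static Map<String, String> sessionRegistry = new HashMap<>(); // loaded functions

    public static String getFunctionInfo(String name, boolean register) {
        String info = catalog.get(name);
        if (info != null && register) {
            // Only this path would download jars and load classes.
            sessionRegistry.put(name, info);
        }
        return info;
    }

    public static void main(String[] args) {
        catalog.put("yshi.dummy", "com.yshi.hive.udf.DummyUDF");
        // DROP FUNCTION only needs the definition, so it can skip loading:
        System.out.println(getFunctionInfo("yshi.dummy", false));
        System.out.println(sessionRegistry.isEmpty()); // true: nothing was loaded
    }
}
```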

> HS2 cannot drop a function whose associated jar file has been removed
> -
>
> Key: HIVE-14609
> URL: https://issues.apache.org/jira/browse/HIVE-14609
> Project: Hive
>  Issue Type: Bug
>Reporter: Yibing Shi
>Assignee: Chaoyu Tang
>
> Create a permanent function with below command:
> {code:sql}
> create function yshi.dummy as 'com.yshi.hive.udf.DummyUDF' using jar 
> 'hdfs://host-10-17-81-142.coe.cloudera.com:8020/hive/jars/yshi.jar';
> {code}
> After that, delete the HDFS file 
> {{hdfs://host-10-17-81-142.coe.cloudera.com:8020/hive/jars/yshi.jar}}, and 
> *restart HS2 to remove the loaded class*.
> Now the function cannot be dropped:
> {noformat}
> 0: jdbc:hive2://10.17.81.144:1/default> show functions yshi.dummy;
> INFO  : Compiling 
> command(queryId=hive_20160821213434_d0271d77-84d8-45ba-8d92-3da1c143bded): 
> show functions yshi.dummy
> INFO  : Semantic Analysis Completed
> INFO  : Returning Hive schema: 
> Schema(fieldSchemas:[FieldSchema(name:tab_name, type:string, comment:from 
> deserializer)], properties:null)
> INFO  : Completed compiling 
> command(queryId=hive_20160821213434_d0271d77-84d8-45ba-8d92-3da1c143bded); 
> Time taken: 1.259 seconds
> INFO  : Executing 
> command(queryId=hive_20160821213434_d0271d77-84d8-45ba-8d92-3da1c143bded): 
> show functions yshi.dummy
> INFO  : Starting task [Stage-0:DDL] in serial mode
> INFO  : SHOW FUNCTIONS is deprecated, please use SHOW FUNCTIONS LIKE instead.
> INFO  : Completed executing 
> command(queryId=hive_20160821213434_d0271d77-84d8-45ba-8d92-3da1c143bded); 
> Time taken: 0.024 seconds
> INFO  : OK
> +-+--+
> |  tab_name   |
> +-+--+
> | yshi.dummy  |
> +-+--+
> 1 row selected (3.877 seconds)
> 0: jdbc:hive2://10.17.81.144:1/default> drop function yshi.dummy;
> INFO  : Compiling 
> command(queryId=hive_20160821213434_47d14df5-59b3-4ebc-9a48-5e1d9c60c1fc): 
> drop function yshi.dummy
> INFO  : converting to local 
> hdfs://host-10-17-81-142.coe.cloudera.com:8020/hive/jars/yshi.jar
> ERROR : Failed to read external resource 
> hdfs://host-10-17-81-142.coe.cloudera.com:8020/hive/jars/yshi.jar
> java.lang.RuntimeException: Failed to read external resource 
> hdfs://host-10-17-81-142.coe.cloudera.com:8020/hive/jars/yshi.jar
>   at 
> org.apache.hadoop.hive.ql.session.SessionState.downloadResource(SessionState.java:1200)
>   at 
> org.apache.hadoop.hive.ql.session.SessionState.add_resources(SessionState.java:1136)
>   at 
> org.apache.hadoop.hive.ql.session.SessionState.add_resources(SessionState.java:1126)
>   at 
> org.apache.hadoop.hive.ql.exec.FunctionTask.addFunctionResources(FunctionTask.java:304)
>   at 
> org.apache.hadoop.hive.ql.exec.Registry.registerToSessionRegistry(Registry.java:470)
>   at 
> org.apache.hadoop.hive.ql.exec.Registry.getQualifiedFunctionInfo(Registry.java:456)
>   at 
> org.apache.hadoop.hive.ql.exec.Registry.getFunctionInfo(Registry.java:245)
>   at 
> org.apache.hadoop.hive.ql.exec.FunctionRegistry.getFunctionInfo(FunctionRegistry.java:455)
>   at 
> org.apache.hadoop.hive.ql.parse.FunctionSemanticAnalyzer.analyzeDropFunction(FunctionSemanticAnalyzer.java:99)
>   at 
> org.apache.hadoop.hive.ql.parse.FunctionSemanticAnalyzer.analyzeInternal(FunctionSemanticAnalyzer.java:61)
>   at 
> 

[jira] [Commented] (HIVE-14205) Hive doesn't support union type with AVRO file format

2016-07-26 Thread Yibing Shi (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15395038#comment-15395038
 ] 

Yibing Shi commented on HIVE-14205:
---

Thanks [~ctang.ma]!

> Hive doesn't support union type with AVRO file format
> -
>
> Key: HIVE-14205
> URL: https://issues.apache.org/jira/browse/HIVE-14205
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Reporter: Yibing Shi
>Assignee: Yibing Shi
> Fix For: 2.2.0, 2.1.1
>
> Attachments: HIVE-14205.1.patch, HIVE-14205.2.patch, 
> HIVE-14205.3.patch, HIVE-14205.4.patch, HIVE-14205.5.patch, 
> HIVE-14205.6.patch, HIVE-14205.7.patch
>
>
> Reproduce steps:
> {noformat}
> hive> CREATE TABLE avro_union_test
> > PARTITIONED BY (p int)
> > ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
> > STORED AS INPUTFORMAT 
> 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
> > OUTPUTFORMAT 
> 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
> > TBLPROPERTIES ('avro.schema.literal'='{
> >"type":"record",
> >"name":"nullUnionTest",
> >"fields":[
> >   {
> >  "name":"value",
> >  "type":[
> > "null",
> > "int",
> > "long"
> >  ],
> >  "default":null
> >   }
> >]
> > }');
> OK
> Time taken: 0.105 seconds
> hive> alter table avro_union_test add partition (p=1);
> OK
> Time taken: 0.093 seconds
> hive> select * from avro_union_test;
> FAILED: RuntimeException org.apache.hadoop.hive.ql.metadata.HiveException: 
> Failed with exception Hive internal error inside 
> isAssignableFromSettablePrimitiveOI void not supported 
> yet.java.lang.RuntimeException: Hive internal error inside 
> isAssignableFromSettablePrimitiveOI void not supported yet.
>   at 
> org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.isInstanceOfSettablePrimitiveOI(ObjectInspectorUtils.java:1140)
>   at 
> org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.isInstanceOfSettableOI(ObjectInspectorUtils.java:1149)
>   at 
> org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.hasAllFieldsSettable(ObjectInspectorUtils.java:1187)
>   at 
> org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.hasAllFieldsSettable(ObjectInspectorUtils.java:1220)
>   at 
> org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.hasAllFieldsSettable(ObjectInspectorUtils.java:1200)
>   at 
> org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorConverters.getConvertedOI(ObjectInspectorConverters.java:219)
>   at 
> org.apache.hadoop.hive.ql.exec.FetchOperator.setupOutputObjectInspector(FetchOperator.java:581)
>   at 
> org.apache.hadoop.hive.ql.exec.FetchOperator.initialize(FetchOperator.java:172)
>   at 
> org.apache.hadoop.hive.ql.exec.FetchOperator.&lt;init&gt;(FetchOperator.java:140)
>   at 
> org.apache.hadoop.hive.ql.exec.FetchTask.initialize(FetchTask.java:79)
>   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:482)
>   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:311)
>   at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1194)
>   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1289)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1120)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1108)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:218)
>   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:170)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:381)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:773)
>   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:691)
>   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:626)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:497)
>   at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
>   at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
> {noformat}
> Another test case to show this problem is:
> {noformat}
> hive> create table avro_union_test2 (value uniontype) stored as 
> avro;
> OK
> Time taken: 0.053 seconds
> hive> show create table avro_union_test2;
> OK
> CREATE TABLE `avro_union_test2`(
>   `value` uniontype COMMENT '')
> ROW FORMAT SERDE
>   'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
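For context, the ["null", "int", "long"] union in the schema above follows the Avro convention that a "null" branch marks the field as nullable rather than being a real value branch; the remaining branches are what Hive has to map to a uniontype. A minimal Python sketch of that interpretation (an illustration of the Avro convention only, not Hive's actual AvroSerDe code):

```python
import json

# The avro.schema.literal from the bug report above.
schema_literal = '''
{
  "type": "record",
  "name": "nullUnionTest",
  "fields": [
    {"name": "value", "type": ["null", "int", "long"], "default": null}
  ]
}
'''

schema = json.loads(schema_literal)
union = schema["fields"][0]["type"]          # ["null", "int", "long"]

# Avro convention: a "null" branch makes the field nullable; the
# remaining branches are the actual value types of the union.
value_branches = [b for b in union if b != "null"]
nullable = len(value_branches) < len(union)

print(nullable)        # True
print(value_branches)  # ['int', 'long']
```

With only the "null" branch removed, the deserializer is left with a two-branch (int, long) union, which is the case the stack trace above fails on.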

[jira] [Commented] (HIVE-14205) Hive doesn't support union type with AVRO file format

2016-07-25 Thread Yibing Shi (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15391794#comment-15391794
 ] 

Yibing Shi commented on HIVE-14205:
---

These test failures seem unrelated to this patch. 


[jira] [Updated] (HIVE-14205) Hive doesn't support union type with AVRO file format

2016-07-24 Thread Yibing Shi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yibing Shi updated HIVE-14205:
--
Attachment: HIVE-14205.7.patch

Fixed the qtests.


[jira] [Commented] (HIVE-14205) Hive doesn't support union type with AVRO file format

2016-07-24 Thread Yibing Shi (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15391021#comment-15391021
 ] 

Yibing Shi commented on HIVE-14205:
---

It looks like a recent change on the master branch has broken my test. After 
pulling in the latest changes from master, I can reproduce the test failure 
locally. I will look further into it.


[jira] [Updated] (HIVE-14205) Hive doesn't support union type with AVRO file format

2016-07-22 Thread Yibing Shi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yibing Shi updated HIVE-14205:
--
Attachment: HIVE-14205.6.patch

Modified the itests to use text files. Let's see how it goes now.


[jira] [Commented] (HIVE-14205) Hive doesn't support union type with AVRO file format

2016-07-20 Thread Yibing Shi (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15387114#comment-15387114
 ] 

Yibing Shi commented on HIVE-14205:
---

The tests still failed. I will work on a new patch.


[jira] [Updated] (HIVE-14205) Hive doesn't support union type with AVRO file format

2016-07-19 Thread Yibing Shi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yibing Shi updated HIVE-14205:
--
Attachment: HIVE-14205.5.patch

Attached a new patch that includes the latest changes from the master branch. 
If this still doesn't work, I will remove the binary files and use INSERT 
statements instead, as [~ctang.ma] suggested.

[jira] [Updated] (HIVE-14205) Hive doesn't support union type with AVRO file format

2016-07-18 Thread Yibing Shi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yibing Shi updated HIVE-14205:
--
Attachment: HIVE-14205.4.patch

I have verified this patch can be applied:
{noformat}
➜  repo git:(master) patch -p0 <~/Downloads/HIVE-14205.4.patch
File data/files/union_non_nullable.avro: git binary diffs are not supported.
File data/files/union_nullable.avro: git binary diffs are not supported.
patching file ql/src/test/queries/clientnegative/avro_non_nullable_union.q
patching file ql/src/test/queries/clientpositive/avro_nullable_union.q
patching file ql/src/test/results/clientnegative/avro_non_nullable_union.q.out
patching file ql/src/test/results/clientpositive/avro_nullable_union.q.out
patching file 
serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroDeserializer.java
patching file 
serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroSerdeUtils.java
patching file 
serde/src/test/org/apache/hadoop/hive/serde2/avro/TestAvroDeserializer.java
patching file 
serde/src/test/org/apache/hadoop/hive/serde2/avro/TestAvroSerdeUtils.java
{noformat}
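The "git binary diffs are not supported" lines above come from GNU patch, which cannot apply the binary hunks git produces for the .avro fixture files; `git apply` can. A small self-contained sketch of the difference, using a throwaway repo and a stand-in binary file invented for the demo (not the actual Hive patch):

```shell
set -e
tmpdir=$(mktemp -d)
cd "$tmpdir"
git init -q repo
cd repo
git config user.email demo@example.com
git config user.name demo

# Commit a small binary file (a stand-in for data/files/union_nullable.avro).
printf 'Obj\001\000' > union_nullable.avro
git add union_nullable.avro
git commit -qm 'add binary fixture'

# Modify it and capture a binary-capable diff of the change.
printf 'Obj\001\000\002' > union_nullable.avro
cp union_nullable.avro expected.bin
git diff --binary -- union_nullable.avro > bin.patch

# Revert the working tree, then apply the diff with git apply,
# which understands the "GIT binary patch" hunks that GNU patch skips.
git checkout -q -- union_nullable.avro
git apply bin.patch
```

GNU patch on the same diff would report the binary file as unsupported and skip it, which is exactly the message in the output above; the text hunks of a patch still apply either way.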

> Hive doesn't support union type with AVRO file format
> -
>
> Key: HIVE-14205
> URL: https://issues.apache.org/jira/browse/HIVE-14205
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Reporter: Yibing Shi
>Assignee: Yibing Shi
> Attachments: HIVE-14205.1.patch, HIVE-14205.2.patch, 
> HIVE-14205.3.patch, HIVE-14205.4.patch
>
>
> Reproduce steps:
> {noformat}
> hive> CREATE TABLE avro_union_test
> > PARTITIONED BY (p int)
> > ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
> > STORED AS INPUTFORMAT 
> 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
> > OUTPUTFORMAT 
> 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
> > TBLPROPERTIES ('avro.schema.literal'='{
> >"type":"record",
> >"name":"nullUnionTest",
> >"fields":[
> >   {
> >  "name":"value",
> >  "type":[
> > "null",
> > "int",
> > "long"
> >  ],
> >  "default":null
> >   }
> >]
> > }');
> OK
> Time taken: 0.105 seconds
> hive> alter table avro_union_test add partition (p=1);
> OK
> Time taken: 0.093 seconds
> hive> select * from avro_union_test;
> FAILED: RuntimeException org.apache.hadoop.hive.ql.metadata.HiveException: 
> Failed with exception Hive internal error inside 
> isAssignableFromSettablePrimitiveOI void not supported 
> yet.java.lang.RuntimeException: Hive internal error inside 
> isAssignableFromSettablePrimitiveOI void not supported yet.
>   at 
> org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.isInstanceOfSettablePrimitiveOI(ObjectInspectorUtils.java:1140)
>   at 
> org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.isInstanceOfSettableOI(ObjectInspectorUtils.java:1149)
>   at 
> org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.hasAllFieldsSettable(ObjectInspectorUtils.java:1187)
>   at 
> org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.hasAllFieldsSettable(ObjectInspectorUtils.java:1220)
>   at 
> org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.hasAllFieldsSettable(ObjectInspectorUtils.java:1200)
>   at 
> org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorConverters.getConvertedOI(ObjectInspectorConverters.java:219)
>   at 
> org.apache.hadoop.hive.ql.exec.FetchOperator.setupOutputObjectInspector(FetchOperator.java:581)
>   at 
> org.apache.hadoop.hive.ql.exec.FetchOperator.initialize(FetchOperator.java:172)
>   at 
> org.apache.hadoop.hive.ql.exec.FetchOperator.<init>(FetchOperator.java:140)
>   at 
> org.apache.hadoop.hive.ql.exec.FetchTask.initialize(FetchTask.java:79)
>   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:482)
>   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:311)
>   at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1194)
>   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1289)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1120)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1108)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:218)
>   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:170)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:381)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:773)
>   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:691)
>   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:626)
>   at 

[jira] [Commented] (HIVE-14205) Hive doesn't support union type with AVRO file format

2016-07-17 Thread Yibing Shi (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15381670#comment-15381670
 ] 

Yibing Shi commented on HIVE-14205:
---

[~ctang.ma], could you please help check whether you can apply the patch? I 
can apply it on my laptop.

> Hive doesn't support union type with AVRO file format
> -
>
> Key: HIVE-14205
> URL: https://issues.apache.org/jira/browse/HIVE-14205
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Reporter: Yibing Shi
>Assignee: Yibing Shi
> Attachments: HIVE-14205.1.patch, HIVE-14205.2.patch, 
> HIVE-14205.3.patch
>
>
> Reproduce steps: (same as above)

[jira] [Updated] (HIVE-14205) Hive doesn't support union type with AVRO file format

2016-07-17 Thread Yibing Shi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yibing Shi updated HIVE-14205:
--
Attachment: HIVE-14205.3.patch

I created this patch with the command:
{noformat}
git diff --no-prefix --binary HEAD~1 HEAD > ~/Downloads/HIVE-14205.3.patch
{noformat}
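The workflow above (a `git diff --binary` patch that plain {{patch}} rejects but {{git apply}} accepts) can be reproduced end to end. The sketch below is illustrative only — it is not the HIVE-14205 patch itself, and the file name, commit messages, and helper function are invented for the demo:

```python
# Demonstrates why plain `patch` cannot consume "GIT binary patch" hunks and
# that `git apply` re-applies them, using a throwaway repository.
import subprocess
import tempfile
from pathlib import Path

def git(*args, cwd):
    """Run a git subcommand in the given directory, failing loudly on error."""
    subprocess.run(["git", *args], cwd=cwd, check=True)

repo = Path(tempfile.mkdtemp())
git("init", "-q", cwd=repo)
git("config", "user.email", "demo@example.com", cwd=repo)
git("config", "user.name", "demo", cwd=repo)

# Commit two versions of a binary file (the NUL byte makes git treat it as binary).
(repo / "union.avro").write_bytes(b"v1\x00")
git("add", "union.avro", cwd=repo)
git("commit", "-q", "-m", "add binary file", cwd=repo)
(repo / "union.avro").write_bytes(b"v2\x00\x01")
git("add", "union.avro", cwd=repo)
git("commit", "-q", "-m", "modify binary file", cwd=repo)

# --binary embeds a full-index "GIT binary patch" hunk that `patch` cannot read.
diff = subprocess.run(["git", "diff", "--binary", "HEAD~1", "HEAD"],
                      cwd=repo, check=True, capture_output=True).stdout
assert b"GIT binary patch" in diff
(repo / "bin.patch").write_bytes(diff)

git("checkout", "-q", "HEAD~1", "--", "union.avro", cwd=repo)  # roll back
git("apply", "bin.patch", cwd=repo)                            # re-apply it
assert (repo / "union.avro").read_bytes() == b"v2\x00\x01"
```

This is why the patch had to be regenerated with `--binary`: the earlier attempt printed "git binary diffs are not supported" because {{patch -p0}} has no decoder for those hunks.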



> Hive doesn't support union type with AVRO file format
> -
>
> Key: HIVE-14205
> URL: https://issues.apache.org/jira/browse/HIVE-14205
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Reporter: Yibing Shi
>Assignee: Yibing Shi
> Attachments: HIVE-14205.1.patch, HIVE-14205.2.patch, 
> HIVE-14205.3.patch
>
>
> Reproduce steps: (same as above)

[jira] [Updated] (HIVE-14205) Hive doesn't support union type with AVRO file format

2016-07-17 Thread Yibing Shi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yibing Shi updated HIVE-14205:
--
Attachment: (was: HIVE-14205.3.patch)

> Hive doesn't support union type with AVRO file format
> -
>
> Key: HIVE-14205
> URL: https://issues.apache.org/jira/browse/HIVE-14205
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Reporter: Yibing Shi
>Assignee: Yibing Shi
> Attachments: HIVE-14205.1.patch, HIVE-14205.2.patch
>
>
> Reproduce steps: (same as above)

[jira] [Updated] (HIVE-14205) Hive doesn't support union type with AVRO file format

2016-07-17 Thread Yibing Shi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yibing Shi updated HIVE-14205:
--
Attachment: HIVE-14205.3.patch

> Hive doesn't support union type with AVRO file format
> -
>
> Key: HIVE-14205
> URL: https://issues.apache.org/jira/browse/HIVE-14205
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Reporter: Yibing Shi
>Assignee: Yibing Shi
> Attachments: HIVE-14205.1.patch, HIVE-14205.2.patch, 
> HIVE-14205.3.patch
>
>
> Reproduce steps: (same as above)

[jira] [Commented] (HIVE-14205) Hive doesn't support union type with AVRO file format

2016-07-17 Thread Yibing Shi (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15381651#comment-15381651
 ] 

Yibing Shi commented on HIVE-14205:
---

[~ctang.ma], these 2 files are binary Avro files. It looks like they are causing 
trouble for {{git apply}}.
Let me recreate the patch file with the command described 
[here|http://stackoverflow.com/questions/17152171/git-cannot-apply-binary-patch-without-full-index-line]


> Hive doesn't support union type with AVRO file format
> -
>
> Key: HIVE-14205
> URL: https://issues.apache.org/jira/browse/HIVE-14205
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Reporter: Yibing Shi
>Assignee: Yibing Shi
> Attachments: HIVE-14205.1.patch, HIVE-14205.2.patch
>
>
> Reproduce steps: (same as above)

[jira] [Updated] (HIVE-14205) Hive doesn't support union type with AVRO file format

2016-07-16 Thread Yibing Shi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yibing Shi updated HIVE-14205:
--
Attachment: HIVE-14205.2.patch

Submitted a new patch based on code review.

> Hive doesn't support union type with AVRO file format
> -
>
> Key: HIVE-14205
> URL: https://issues.apache.org/jira/browse/HIVE-14205
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Reporter: Yibing Shi
>Assignee: Yibing Shi
> Attachments: HIVE-14205.1.patch, HIVE-14205.2.patch
>
>
> Reproduce steps: (same as above)

[jira] [Issue Comment Deleted] (HIVE-14205) Hive doesn't support union type with AVRO file format

2016-07-15 Thread Yibing Shi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yibing Shi updated HIVE-14205:
--
Comment: was deleted

(was: Just found that the current Hive union type implementation has a 
fundamental conflict with the Avro implementation.

Currently Hive uses {{UnionObject}} as the value of union type columns. For 
example, if we create a table like below:
{noformat}
create table avro_union_test2 (value uniontype<int,bigint>);
{noformat}
We cannot store int or bigint data directly in column "value". Instead, we 
have to use the {{create_union}} UDF to create a {{UnionObject}} value:
{noformat}
insert overwrite table avro_union_test2 select 1 as value; -- this fails
insert overwrite table avro_union_test2 select create_union(0,1,2L) as value; -- this succeeds
{noformat}

If the table uses text file format, the data stored in file is as below:
{noformat}
0:1
{noformat}
where 0 is the tag/offset of the object and 1 is the actual value.
(The 2L argument is used only for type checking and isn't stored in the data 
file at all.)

AvroSerDe stores data in a similar way: it stores the type offset together with 
the actual data. But when reading data, Avro returns the actual data instead of 
a {{UnionObject}}:
https://github.com/apache/avro/blob/master/lang/java/avro/src/main/java/org/apache/avro/generic/GenericDatumReader.java#L179

For above data created by {{create_union}}, the AvroSerDe returns an Integer 
instead of a UnionObject. This makes Hive fail in future operations (writing to 
data files or formatting as Json string).

I will check to see whether we have a way to fix this.)

> Hive doesn't support union type with AVRO file format
> -
>
> Key: HIVE-14205
> URL: https://issues.apache.org/jira/browse/HIVE-14205
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Reporter: Yibing Shi
>Assignee: Yibing Shi
> Attachments: HIVE-14205.1.patch
>
>
> Reproduce steps:
> {noformat}
> hive> CREATE TABLE avro_union_test
> > PARTITIONED BY (p int)
> > ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
> > STORED AS INPUTFORMAT 
> 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
> > OUTPUTFORMAT 
> 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
> > TBLPROPERTIES ('avro.schema.literal'='{
> >"type":"record",
> >"name":"nullUnionTest",
> >"fields":[
> >   {
> >  "name":"value",
> >  "type":[
> > "null",
> > "int",
> > "long"
> >  ],
> >  "default":null
> >   }
> >]
> > }');
> OK
> Time taken: 0.105 seconds
> hive> alter table avro_union_test add partition (p=1);
> OK
> Time taken: 0.093 seconds
> hive> select * from avro_union_test;
> FAILED: RuntimeException org.apache.hadoop.hive.ql.metadata.HiveException: 
> Failed with exception Hive internal error inside 
> isAssignableFromSettablePrimitiveOI void not supported 
> yet.java.lang.RuntimeException: Hive internal error inside 
> isAssignableFromSettablePrimitiveOI void not supported yet.
>   at 
> org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.isInstanceOfSettablePrimitiveOI(ObjectInspectorUtils.java:1140)
>   at 
> org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.isInstanceOfSettableOI(ObjectInspectorUtils.java:1149)
>   at 
> org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.hasAllFieldsSettable(ObjectInspectorUtils.java:1187)
>   at 
> org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.hasAllFieldsSettable(ObjectInspectorUtils.java:1220)
>   at 
> org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.hasAllFieldsSettable(ObjectInspectorUtils.java:1200)
>   at 
> org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorConverters.getConvertedOI(ObjectInspectorConverters.java:219)
>   at 
> org.apache.hadoop.hive.ql.exec.FetchOperator.setupOutputObjectInspector(FetchOperator.java:581)
>   at 
> org.apache.hadoop.hive.ql.exec.FetchOperator.initialize(FetchOperator.java:172)
>   at 
> org.apache.hadoop.hive.ql.exec.FetchOperator.<init>(FetchOperator.java:140)
>   at 
> org.apache.hadoop.hive.ql.exec.FetchTask.initialize(FetchTask.java:79)
>   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:482)
>   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:311)
>   at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1194)
>   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1289)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1120)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1108)
>   at 
> 

[jira] [Commented] (HIVE-14205) Hive doesn't support union type with AVRO file format

2016-07-15 Thread Yibing Shi (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15378929#comment-15378929
 ] 

Yibing Shi commented on HIVE-14205:
---

Just found that the current Hive union type implementation fundamentally conflicts with the Avro implementation.

Currently Hive uses {{UnionObject}} as the value of union type columns. For example, if we create a table like below:
{noformat}
create table avro_union_test2 (value uniontype<int,bigint>);
{noformat}
We cannot simply store int or bigint data in column "value". Instead, we have to use the UDF create_union to create a {{UnionObject}} value:
{noformat}
insert overwrite table avro_union_test2 select 1 as value; -- this fails
insert overwrite table avro_union_test2 select create_union(0,1,2L) as value; -- this succeeds
{noformat}

If the table uses the text file format, the data stored in the file is as below:
{noformat}
0:1
{noformat}
where 0 is the tag/offset of the union member and 1 is the actual value. (The 2L argument is used only for type checking and isn't stored in the data file at all.)

AvroSerDe stores data in a similar way: it stores the type offset together with the actual data. But when reading data, Avro returns the actual value instead of a {{UnionObject}}:
https://github.com/apache/avro/blob/master/lang/java/avro/src/main/java/org/apache/avro/generic/GenericDatumReader.java#L179

For the above data created by {{create_union}}, the AvroSerDe returns an Integer instead of a UnionObject. This makes Hive fail in subsequent operations (writing to data files or formatting as JSON strings).

I will check to see whether we have a way to fix this.
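The mismatch described above can be sketched in a few lines. This is a simplified, hypothetical model (the UnionObject class and the two helper functions below are illustrative stand-ins, not actual Hive or Avro classes): Hive expects a tagged wrapper for union columns, but an Avro-style reader hands back only the raw datum.

```python
class UnionObject:
    """Hive-style tagged union: records which branch (tag) holds the value."""
    def __init__(self, tag, value):
        self.tag = tag
        self.value = value

def hive_text_serialize(u):
    # Text file format stores "tag:value"; create_union(0, 1, 2L) yields "0:1".
    return f"{u.tag}:{u.value}"

def avro_style_read(raw_value):
    # An Avro-style reader resolves the union branch internally and returns
    # only the raw datum -- no tag, no wrapper object.
    return raw_value

u = UnionObject(0, 1)
print(hive_text_serialize(u))       # the "0:1" text encoding

read_back = avro_style_read(1)
# Hive's downstream operators expect a UnionObject here; receiving a bare
# int is what triggers the failures described above.
assert not isinstance(read_back, UnionObject)
```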


[jira] [Commented] (HIVE-14205) Hive doesn't support union type with AVRO file format

2016-07-12 Thread Yibing Shi (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15372699#comment-15372699
 ] 

Yibing Shi commented on HIVE-14205:
---

code review:
https://reviews.apache.org/r/49952/


[jira] [Updated] (HIVE-14205) Hive doesn't support union type with AVRO file format

2016-07-12 Thread Yibing Shi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yibing Shi updated HIVE-14205:
--
Attachment: HIVE-14205.1.patch


[jira] [Updated] (HIVE-14205) Hive doesn't support union type with AVRO file format

2016-07-12 Thread Yibing Shi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yibing Shi updated HIVE-14205:
--
Assignee: Yibing Shi
  Status: Patch Available  (was: Open)


[jira] [Updated] (HIVE-14205) Hive doesn't support union type with AVRO file format

2016-07-11 Thread Yibing Shi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yibing Shi updated HIVE-14205:
--
Attachment: (was: HIVE-14205.1.patch)


[jira] [Updated] (HIVE-14205) Hive doesn't support union type with AVRO file format

2016-07-11 Thread Yibing Shi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yibing Shi updated HIVE-14205:
--
Attachment: HIVE-14205.1.patch


[jira] [Commented] (HIVE-14205) Hive doesn't support union type with AVRO file format

2016-07-11 Thread Yibing Shi (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15370611#comment-15370611
 ] 

Yibing Shi commented on HIVE-14205:
---

Will submit a patch later

>   at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:218)
>   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:170)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:381)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:773)
>   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:691)
>   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:626)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:497)
>   at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
>   at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
> {noformat}
> Another test case to show this problem is:
> {noformat}
> hive> create table avro_union_test2 (value uniontype) stored as 
> avro;
> OK
> Time taken: 0.053 seconds
> hive> show create table avro_union_test2;
> OK
> CREATE TABLE `avro_union_test2`(
>   `value` uniontype COMMENT '')
> ROW FORMAT SERDE
>   'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
> STORED AS INPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
> OUTPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
> LOCATION
>   

[jira] [Commented] (HIVE-13065) Hive throws NPE when writing map type data to a HBase backed table

2016-02-16 Thread Yibing Shi (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15149531#comment-15149531
 ] 

Yibing Shi commented on HIVE-13065:
---

How about the reading part? If we skip the null values, would that affect 
reading the data back?
And what if we have a null value in the key set? That is possible in theory.
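To make the concern concrete, here is a minimal, hypothetical sketch (not the actual HBaseRowSerializer code) of a serializer that silently skips map entries with null values, the behaviour being discussed; a null key would still need its own guard:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class MapSerializeSketch {
    // Hypothetical serializer: joins entries as key=value pairs,
    // skipping entries whose value is null instead of hitting an NPE.
    static String serialize(Map<String, String> m) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : m.entrySet()) {
            if (e.getValue() == null) {
                continue; // skip null values rather than serialize them
            }
            if (sb.length() > 0) {
                sb.append(',');
            }
            sb.append(e.getKey()).append('=').append(e.getValue());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> m = new LinkedHashMap<>();
        m.put("abcd", null); // the failing value from the report above
        m.put("k", "v");
        System.out.println(serialize(m)); // prints "k=v"
    }
}
```

Whether a reader of such data can distinguish a skipped null value from an absent key is exactly the open question above.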

> Hive throws NPE when writing map type data to a HBase backed table
> --
>
> Key: HIVE-13065
> URL: https://issues.apache.org/jira/browse/HIVE-13065
> Project: Hive
>  Issue Type: Bug
>  Components: HBase Handler
>Affects Versions: 1.1.0, 2.0.0
>Reporter: Yongzhi Chen
>Assignee: Yongzhi Chen
> Attachments: HIVE-13065.1.patch
>
>
> Hive throws NPE when writing data to a HBase backed table with below 
> conditions:
> # There is a map type column
> # The map type column has NULL in its values
> Below are the reproduce steps:
> *1) Create a HBase backed Hive table*
> {code:sql}
> create table hbase_test (id bigint, data map<string,string>)
> stored by 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
> with serdeproperties ("hbase.columns.mapping" = ":key,cf:map_col")
> tblproperties ("hbase.table.name" = "hive_test");
> {code}
> *2) insert data into above table*
> {code:sql}
> insert overwrite table hbase_test select 1 as id, map('abcd', null) as data 
> from src limit 1;
> {code}
> The mapreduce job for insert query fails. Error messages are as below:
> {noformat}
> 2016-02-15 02:26:33,225 WARN [main] org.apache.hadoop.mapred.YarnChild: 
> Exception running child : java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row (tag=0) {"key":{},"value":{"_col0":1,"_col1":{"abcd":null}}}
>   at 
> org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:265)
>   at 
> org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:444)
>   at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:392)
>   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
>   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
> Error while processing row (tag=0) 
> {"key":{},"value":{"_col0":1,"_col1":{"abcd":null}}}
>   at 
> org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:253)
>   ... 7 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
> org.apache.hadoop.hive.serde2.SerDeException: java.lang.NullPointerException
>   at 
> org.apache.hadoop.hive.ql.exec.FileSinkOperator.processOp(FileSinkOperator.java:731)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
>   at 
> org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
>   at 
> org.apache.hadoop.hive.ql.exec.LimitOperator.processOp(LimitOperator.java:51)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
>   at 
> org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84)
>   at 
> org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:244)
>   ... 7 more
> Caused by: org.apache.hadoop.hive.serde2.SerDeException: 
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.hive.hbase.HBaseSerDe.serialize(HBaseSerDe.java:286)
>   at 
> org.apache.hadoop.hive.ql.exec.FileSinkOperator.processOp(FileSinkOperator.java:666)
>   ... 14 more
> Caused by: java.lang.NullPointerException
>   at 
> org.apache.hadoop.hive.serde2.lazy.LazyUtils.writePrimitiveUTF8(LazyUtils.java:221)
>   at 
> org.apache.hadoop.hive.hbase.HBaseRowSerializer.serialize(HBaseRowSerializer.java:236)
>   at 
> org.apache.hadoop.hive.hbase.HBaseRowSerializer.serialize(HBaseRowSerializer.java:275)
>   at 
> org.apache.hadoop.hive.hbase.HBaseRowSerializer.serialize(HBaseRowSerializer.java:222)
>   at 
> org.apache.hadoop.hive.hbase.HBaseRowSerializer.serializeField(HBaseRowSerializer.java:194)
>   at 
> org.apache.hadoop.hive.hbase.HBaseRowSerializer.serialize(HBaseRowSerializer.java:118)
>   at 
> org.apache.hadoop.hive.hbase.HBaseSerDe.serialize(HBaseSerDe.java:282)
>   ... 15 more
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11733) UDF GenericUDFReflect cannot find classes added by "ADD JAR"

2015-09-25 Thread Yibing Shi (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14908836#comment-14908836
 ] 

Yibing Shi commented on HIVE-11733:
---

Sorry, got distracted by other stuff. Will add a test case for this.
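For reference, the general fix direction — resolving the class through the thread context class loader, which sees jars added at runtime, rather than the loader `Class.forName(String)` defaults to — can be sketched as follows (a hypothetical illustration, not the actual patch):

```java
public class ReflectLoadSketch {
    // Hypothetical lookup: prefer the thread context class loader, which
    // in Hive is updated when "ADD JAR" extends the session classpath.
    static Class<?> load(String name) {
        try {
            ClassLoader cl = Thread.currentThread().getContextClassLoader();
            if (cl == null) {
                cl = ReflectLoadSketch.class.getClassLoader();
            }
            return Class.forName(name, true, cl);
        } catch (ClassNotFoundException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(load("java.lang.String").getName()); // prints "java.lang.String"
    }
}
```

The stack trace above shows `Class.forName` at GenericUDFReflect.java:105 failing with ClassNotFoundException, which is consistent with the default loader not seeing the added jar.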

> UDF GenericUDFReflect cannot find classes added by "ADD JAR"
> 
>
> Key: HIVE-11733
> URL: https://issues.apache.org/jira/browse/HIVE-11733
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Affects Versions: 1.2.1
>Reporter: Yibing Shi
>Assignee: Yibing Shi
> Attachments: HIVE-11733.1.patch
>
>
> When running the command below:
> {quote}
> hive -e "add jar /root/hive/TestReflect.jar; \
> select reflect('com.yshi.hive.TestReflect', 'testReflect', code) from 
> sample_07 limit 3"
> {quote}
> The following error is thrown:
> {noformat}
> Failed with exception 
> java.io.IOException:org.apache.hadoop.hive.ql.metadata.HiveException: 
> UDFReflect evaluate
> {noformat}
> The full stack trace is:
> {noformat}
> 15/09/04 07:00:37 [main]: INFO compress.CodecPool: Got brand-new decompressor 
> [.bz2]
> Failed with exception 
> java.io.IOException:org.apache.hadoop.hive.ql.metadata.HiveException: 
> UDFReflect evaluate
> 15/09/04 07:00:37 [main]: ERROR CliDriver: Failed with exception 
> java.io.IOException:org.apache.hadoop.hive.ql.metadata.HiveException: 
> UDFReflect evaluate
> java.io.IOException: org.apache.hadoop.hive.ql.metadata.HiveException: 
> UDFReflect evaluate
>   at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:152)
>   at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:1657)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:227)
>   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:159)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:370)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:756)
>   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:675)
>   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:615)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
>   at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: UDFReflect 
> evaluate
>   at 
> org.apache.hadoop.hive.ql.udf.generic.GenericUDFReflect.evaluate(GenericUDFReflect.java:107)
>   at 
> org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator._evaluate(ExprNodeGenericFuncEvaluator.java:185)
>   at 
> org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:77)
>   at 
> org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:65)
>   at 
> org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:77)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
>   at 
> org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:95)
>   at 
> org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:424)
>   at 
> org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:416)
>   at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:138)
>   ... 13 more
> Caused by: java.lang.ClassNotFoundException: com.yshi.hive.TestReflect
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
>   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
>   at java.lang.Class.forName0(Native Method)
>   at java.lang.Class.forName(Class.java:190)
>   at 
> org.apache.hadoop.hive.ql.udf.generic.GenericUDFReflect.evaluate(GenericUDFReflect.java:105)
>   ... 22 more
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HIVE-11733) UDF GenericUDFReflect cannot find classes added by "ADD JAR"

2015-09-04 Thread Yibing Shi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11733?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yibing Shi reassigned HIVE-11733:
-

Assignee: Yibing Shi

> UDF GenericUDFReflect cannot find classes added by "ADD JAR"
> 
>
> Key: HIVE-11733
> URL: https://issues.apache.org/jira/browse/HIVE-11733
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Affects Versions: 1.2.1
>Reporter: Yibing Shi
>Assignee: Yibing Shi
>
> When running the command below:
> {quote}
> hive -e "add jar /root/hive/TestReflect.jar; \
> select reflect('com.yshi.hive.TestReflect', 'testReflect', code) from 
> sample_07 limit 3"
> {quote}
> The following error is thrown:
> {noformat}
> Failed with exception 
> java.io.IOException:org.apache.hadoop.hive.ql.metadata.HiveException: 
> UDFReflect evaluate
> {noformat}
> The full stack trace is:
> {noformat}
> 15/09/04 07:00:37 [main]: INFO compress.CodecPool: Got brand-new decompressor 
> [.bz2]
> Failed with exception 
> java.io.IOException:org.apache.hadoop.hive.ql.metadata.HiveException: 
> UDFReflect evaluate
> 15/09/04 07:00:37 [main]: ERROR CliDriver: Failed with exception 
> java.io.IOException:org.apache.hadoop.hive.ql.metadata.HiveException: 
> UDFReflect evaluate
> java.io.IOException: org.apache.hadoop.hive.ql.metadata.HiveException: 
> UDFReflect evaluate
>   at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:152)
>   at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:1657)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:227)
>   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:159)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:370)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:756)
>   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:675)
>   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:615)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
>   at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: UDFReflect 
> evaluate
>   at 
> org.apache.hadoop.hive.ql.udf.generic.GenericUDFReflect.evaluate(GenericUDFReflect.java:107)
>   at 
> org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator._evaluate(ExprNodeGenericFuncEvaluator.java:185)
>   at 
> org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:77)
>   at 
> org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:65)
>   at 
> org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:77)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
>   at 
> org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:95)
>   at 
> org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:424)
>   at 
> org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:416)
>   at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:138)
>   ... 13 more
> Caused by: java.lang.ClassNotFoundException: com.yshi.hive.TestReflect
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
>   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
>   at java.lang.Class.forName0(Native Method)
>   at java.lang.Class.forName(Class.java:190)
>   at 
> org.apache.hadoop.hive.ql.udf.generic.GenericUDFReflect.evaluate(GenericUDFReflect.java:105)
>   ... 22 more
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11733) UDF GenericUDFReflect cannot find classes added by "ADD JAR"

2015-09-04 Thread Yibing Shi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11733?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yibing Shi updated HIVE-11733:
--
Attachment: HIVE-11733.1.patch

Uploaded the patch.

> UDF GenericUDFReflect cannot find classes added by "ADD JAR"
> 
>
> Key: HIVE-11733
> URL: https://issues.apache.org/jira/browse/HIVE-11733
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Affects Versions: 1.2.1
>Reporter: Yibing Shi
>Assignee: Yibing Shi
> Attachments: HIVE-11733.1.patch
>
>
> When running the command below:
> {quote}
> hive -e "add jar /root/hive/TestReflect.jar; \
> select reflect('com.yshi.hive.TestReflect', 'testReflect', code) from 
> sample_07 limit 3"
> {quote}
> The following error is thrown:
> {noformat}
> Failed with exception 
> java.io.IOException:org.apache.hadoop.hive.ql.metadata.HiveException: 
> UDFReflect evaluate
> {noformat}
> The full stack trace is:
> {noformat}
> 15/09/04 07:00:37 [main]: INFO compress.CodecPool: Got brand-new decompressor 
> [.bz2]
> Failed with exception 
> java.io.IOException:org.apache.hadoop.hive.ql.metadata.HiveException: 
> UDFReflect evaluate
> 15/09/04 07:00:37 [main]: ERROR CliDriver: Failed with exception 
> java.io.IOException:org.apache.hadoop.hive.ql.metadata.HiveException: 
> UDFReflect evaluate
> java.io.IOException: org.apache.hadoop.hive.ql.metadata.HiveException: 
> UDFReflect evaluate
>   at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:152)
>   at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:1657)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:227)
>   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:159)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:370)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:756)
>   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:675)
>   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:615)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
>   at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: UDFReflect 
> evaluate
>   at 
> org.apache.hadoop.hive.ql.udf.generic.GenericUDFReflect.evaluate(GenericUDFReflect.java:107)
>   at 
> org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator._evaluate(ExprNodeGenericFuncEvaluator.java:185)
>   at 
> org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:77)
>   at 
> org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:65)
>   at 
> org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:77)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
>   at 
> org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:95)
>   at 
> org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:424)
>   at 
> org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:416)
>   at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:138)
>   ... 13 more
> Caused by: java.lang.ClassNotFoundException: com.yshi.hive.TestReflect
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
>   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
>   at java.lang.Class.forName0(Native Method)
>   at java.lang.Class.forName(Class.java:190)
>   at 
> org.apache.hadoop.hive.ql.udf.generic.GenericUDFReflect.evaluate(GenericUDFReflect.java:105)
>   ... 22 more
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HIVE-11216) UDF GenericUDFMapKeys throws NPE when a null map value is passed in

2015-07-09 Thread Yibing Shi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yibing Shi reassigned HIVE-11216:
-

Assignee: Yibing Shi

 UDF GenericUDFMapKeys throws NPE when a null map value is passed in
 ---

 Key: HIVE-11216
 URL: https://issues.apache.org/jira/browse/HIVE-11216
 Project: Hive
  Issue Type: Bug
  Components: UDF
Affects Versions: 1.2.0
Reporter: Yibing Shi
Assignee: Yibing Shi

 We can reproduce the problem as below:
 {noformat}
 hive> show create table map_txt;
 OK
 CREATE  TABLE `map_txt`(
   `id` int,
   `content` map<int,string>)
 ROW FORMAT SERDE
   'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
 STORED AS INPUTFORMAT
   'org.apache.hadoop.mapred.TextInputFormat'
 OUTPUTFORMAT
   'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
 ...
 Time taken: 0.233 seconds, Fetched: 18 row(s)
 hive> select * from map_txt;
 OK
 1   NULL
 Time taken: 0.679 seconds, Fetched: 1 row(s)
 hive> select id, map_keys(content) from map_txt;
 
 Error during job, obtaining debugging information...
 Examining task ID: task_1435534231122_0025_m_00 (and more) from job 
 job_1435534231122_0025
 Task with the most failures(4):
 -
 Task ID:
   task_1435534231122_0025_m_00
 URL:
   
 http://host-10-17-80-40.coe.cloudera.com:8088/taskdetails.jsp?jobid=job_1435534231122_0025&tipid=task_1435534231122_0025_m_00
 -
 Diagnostic Messages for this Task:
 Error: java.lang.RuntimeException: 
 org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
 processing row {id:1,content:null}
 at 
 org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:198)
 at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
 at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450)
 at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
 at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:422)
 at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1642)
 at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
 Error while processing row {id:1,content:null}
 at 
 org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:559)
 at 
 org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:180)
 ... 8 more
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating 
 map_keys(content)
 at 
 org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84)
 at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:796)
 at 
 org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:92)
 at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:796)
 at 
 org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:549)
 ... 9 more
 Caused by: java.lang.NullPointerException
 at 
 org.apache.hadoop.hive.ql.udf.generic.GenericUDFMapKeys.evaluate(GenericUDFMapKeys.java:64)
 at 
 org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator._evaluate(ExprNodeGenericFuncEvaluator.java:166)
 at 
 org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:77)
 at 
 org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:65)
 at 
 org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:79)
 ... 13 more
 FAILED: Execution Error, return code 2 from 
 org.apache.hadoop.hive.ql.exec.mr.MapRedTask
 MapReduce Jobs Launched:
 Stage-Stage-1: Map: 1   HDFS Read: 0 HDFS Write: 0 FAIL
 hive>
 {noformat}
 The error is as below (in mappers):
 {noformat}
 Caused by: java.lang.NullPointerException
 at 
 org.apache.hadoop.hive.ql.udf.generic.GenericUDFMapKeys.evaluate(GenericUDFMapKeys.java:64)
 at 
 org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator._evaluate(ExprNodeGenericFuncEvaluator.java:166)
 at 
 org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:77)
 at 
 org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:65)
 at 
 org.apache.hadoop.hive.ql.exec.KeyWrapperFactory$ListKeyWrapper.getNewKey(KeyWrapperFactory.java:113)
 at 
 org.apache.hadoop.hive.ql.exec.GroupByOperator.processOp(GroupByOperator.java:778)
 ... 17 more
 {noformat}
 Looking at the source code:
 {code}
   public Object evaluate(DeferredObject[] arguments) throws HiveException {
 

[jira] [Updated] (HIVE-11216) UDF GenericUDFMapKeys throws NPE when a null map value is passed in

2015-07-09 Thread Yibing Shi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yibing Shi updated HIVE-11216:
--
Attachment: HIVE-11216.patch

 UDF GenericUDFMapKeys throws NPE when a null map value is passed in
 ---

 Key: HIVE-11216
 URL: https://issues.apache.org/jira/browse/HIVE-11216
 Project: Hive
  Issue Type: Bug
  Components: UDF
Affects Versions: 1.2.0
Reporter: Yibing Shi
Assignee: Yibing Shi
 Attachments: HIVE-11216.patch


 We can reproduce the problem as below:
 {noformat}
 hive> show create table map_txt;
 OK
 CREATE  TABLE `map_txt`(
   `id` int,
   `content` map<int,string>)
 ROW FORMAT SERDE
   'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
 STORED AS INPUTFORMAT
   'org.apache.hadoop.mapred.TextInputFormat'
 OUTPUTFORMAT
   'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
 ...
 Time taken: 0.233 seconds, Fetched: 18 row(s)
 hive> select * from map_txt;
 OK
 1   NULL
 Time taken: 0.679 seconds, Fetched: 1 row(s)
 hive> select id, map_keys(content) from map_txt;
 
 Error during job, obtaining debugging information...
 Examining task ID: task_1435534231122_0025_m_00 (and more) from job 
 job_1435534231122_0025
 Task with the most failures(4):
 -
 Task ID:
   task_1435534231122_0025_m_00
 URL:
   
 http://host-10-17-80-40.coe.cloudera.com:8088/taskdetails.jsp?jobid=job_1435534231122_0025&tipid=task_1435534231122_0025_m_00
 -
 Diagnostic Messages for this Task:
 Error: java.lang.RuntimeException: 
 org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
 processing row {id:1,content:null}
 at 
 org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:198)
 at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
 at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450)
 at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
 at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:422)
 at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1642)
 at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
 Error while processing row {id:1,content:null}
 at 
 org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:559)
 at 
 org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:180)
 ... 8 more
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating 
 map_keys(content)
 at 
 org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84)
 at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:796)
 at 
 org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:92)
 at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:796)
 at 
 org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:549)
 ... 9 more
 Caused by: java.lang.NullPointerException
 at 
 org.apache.hadoop.hive.ql.udf.generic.GenericUDFMapKeys.evaluate(GenericUDFMapKeys.java:64)
 at 
 org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator._evaluate(ExprNodeGenericFuncEvaluator.java:166)
 at 
 org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:77)
 at 
 org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:65)
 at 
 org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:79)
 ... 13 more
 FAILED: Execution Error, return code 2 from 
 org.apache.hadoop.hive.ql.exec.mr.MapRedTask
 MapReduce Jobs Launched:
 Stage-Stage-1: Map: 1   HDFS Read: 0 HDFS Write: 0 FAIL
 hive>
 {noformat}
 The error is as below (in mappers):
 {noformat}
 Caused by: java.lang.NullPointerException
 at 
 org.apache.hadoop.hive.ql.udf.generic.GenericUDFMapKeys.evaluate(GenericUDFMapKeys.java:64)
 at 
 org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator._evaluate(ExprNodeGenericFuncEvaluator.java:166)
 at 
 org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:77)
 at 
 org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:65)
 at 
 org.apache.hadoop.hive.ql.exec.KeyWrapperFactory$ListKeyWrapper.getNewKey(KeyWrapperFactory.java:113)
 at 
 org.apache.hadoop.hive.ql.exec.GroupByOperator.processOp(GroupByOperator.java:778)
 ... 17 more
 {noformat}
 Looking at the source code:
 {code}
   public Object 

[jira] [Updated] (HIVE-11216) UDF GenericUDFMapKeys throws NPE when a null map value is passed in

2015-07-09 Thread Yibing Shi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yibing Shi updated HIVE-11216:
--
Attachment: HIVE-11216.1.patch

Attached a new patch.
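The guard the patch needs can be sketched as follows (a hypothetical, standalone illustration rather than the actual `GenericUDFMapKeys` code): when the incoming map is NULL, return an empty key list instead of dereferencing it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class MapKeysSketch {
    // Hypothetical null-safe map_keys: the report's NPE at
    // GenericUDFMapKeys.java:64 comes from dereferencing a NULL map,
    // so a null check must run before touching the key set.
    static List<String> mapKeys(Map<String, ?> m) {
        List<String> keys = new ArrayList<>();
        if (m == null) {
            return keys; // NULL map -> empty key list, no NPE
        }
        keys.addAll(m.keySet());
        return keys;
    }

    public static void main(String[] args) {
        System.out.println(mapKeys(null));           // prints "[]"
        System.out.println(mapKeys(Map.of("a", 1))); // prints "[a]"
    }
}
```

Whether NULL input should instead yield a NULL result (matching SQL semantics) rather than an empty list is a design choice the actual patch has to make.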

 UDF GenericUDFMapKeys throws NPE when a null map value is passed in
 ---

 Key: HIVE-11216
 URL: https://issues.apache.org/jira/browse/HIVE-11216
 Project: Hive
  Issue Type: Bug
  Components: UDF
Affects Versions: 1.2.0
Reporter: Yibing Shi
Assignee: Yibing Shi
 Attachments: HIVE-11216.1.patch, HIVE-11216.patch


 We can reproduce the problem as below:
 {noformat}
 hive> show create table map_txt;
 OK
 CREATE  TABLE `map_txt`(
   `id` int,
   `content` map<int,string>)
 ROW FORMAT SERDE
   'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
 STORED AS INPUTFORMAT
   'org.apache.hadoop.mapred.TextInputFormat'
 OUTPUTFORMAT
   'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
 ...
 Time taken: 0.233 seconds, Fetched: 18 row(s)
 hive> select * from map_txt;
 OK
 1   NULL
 Time taken: 0.679 seconds, Fetched: 1 row(s)
 hive> select id, map_keys(content) from map_txt;
 
 Error during job, obtaining debugging information...
 Examining task ID: task_1435534231122_0025_m_00 (and more) from job 
 job_1435534231122_0025
 Task with the most failures(4):
 -
 Task ID:
   task_1435534231122_0025_m_00
 URL:
   
 http://host-10-17-80-40.coe.cloudera.com:8088/taskdetails.jsp?jobid=job_1435534231122_0025&tipid=task_1435534231122_0025_m_00
 -
 Diagnostic Messages for this Task:
 Error: java.lang.RuntimeException: 
 org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
 processing row {id:1,content:null}
 at 
 org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:198)
 at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
 at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450)
 at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
 at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:422)
 at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1642)
 at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
 Error while processing row {id:1,content:null}
 at 
 org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:559)
 at 
 org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:180)
 ... 8 more
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating 
 map_keys(content)
 at 
 org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84)
 at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:796)
 at 
 org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:92)
 at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:796)
 at 
 org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:549)
 ... 9 more
 Caused by: java.lang.NullPointerException
 at 
 org.apache.hadoop.hive.ql.udf.generic.GenericUDFMapKeys.evaluate(GenericUDFMapKeys.java:64)
 at 
 org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator._evaluate(ExprNodeGenericFuncEvaluator.java:166)
 at 
 org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:77)
 at 
 org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:65)
 at 
 org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:79)
 ... 13 more
 FAILED: Execution Error, return code 2 from 
 org.apache.hadoop.hive.ql.exec.mr.MapRedTask
 MapReduce Jobs Launched:
 Stage-Stage-1: Map: 1   HDFS Read: 0 HDFS Write: 0 FAIL
 hive>
 {noformat}
 The error is as below (in mappers):
 {noformat}
 Caused by: java.lang.NullPointerException
 at 
 org.apache.hadoop.hive.ql.udf.generic.GenericUDFMapKeys.evaluate(GenericUDFMapKeys.java:64)
 at 
 org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator._evaluate(ExprNodeGenericFuncEvaluator.java:166)
 at 
 org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:77)
 at 
 org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:65)
 at 
 org.apache.hadoop.hive.ql.exec.KeyWrapperFactory$ListKeyWrapper.getNewKey(KeyWrapperFactory.java:113)
 at 
 org.apache.hadoop.hive.ql.exec.GroupByOperator.processOp(GroupByOperator.java:778)
 ... 17 more
 {noformat}
 Looking at the source code:
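
The stack trace points at GenericUDFMapKeys dereferencing the map value without a null check. As a hypothetical sketch of the idea behind the fix (assumed names and simplified types, not the actual HIVE-11216 patch), the key-extraction logic can return null for a null map instead of throwing:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified sketch of a null-safe map_keys evaluation
// (not the real GenericUDFMapKeys code): guard against a null map value
// in the row before touching keySet().
public class MapKeysSketch {

    public static List<Object> mapKeys(Map<?, ?> map) {
        if (map == null) {
            // A NULL map column yields a NULL result rather than an NPE.
            return null;
        }
        return new ArrayList<Object>(map.keySet());
    }

    public static void main(String[] args) {
        System.out.println(mapKeys(null));
        System.out.println(mapKeys(Map.of(1, "a")));
    }
}
```

With this guard, the reproduction above (`select id, map_keys(content) from map_txt` on a row whose `content` is NULL) would return NULL for that row instead of failing the mapper.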
 

[jira] [Commented] (HIVE-11150) Remove wrong warning message related to chgrp

2015-06-30 Thread Yibing Shi (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14609303#comment-14609303
 ] 

Yibing Shi commented on HIVE-11150:
---

Should we also fix {{Hadoop20Shims}} and {{Hadoop20SShims}}?
Should we also protect the call to {{chmod}} in a similar way?

 Remove wrong warning message related to chgrp
 -

 Key: HIVE-11150
 URL: https://issues.apache.org/jira/browse/HIVE-11150
 Project: Hive
  Issue Type: Bug
  Components: Shims
Affects Versions: 0.13.0, 0.14.0, 1.0.0, 1.2.0
Reporter: Yongzhi Chen
Assignee: Yongzhi Chen
Priority: Minor
 Attachments: HIVE-11150.1.patch


 When using a file system other than HDFS, users see a warning message 
 related to hdfs chgrp. The warning is annoying and confusing, so we'd better 
 remove it. 
 The warning example:
 {noformat}
 hive> insert overwrite table s3_test select total_emp, salary, description 
 from sample_07 limit 5;
 -chgrp: '' does not match expected pattern for group
 Usage: hadoop fs [generic options] -chgrp [-R] GROUP PATH...
 Total jobs = 1
 Launching Job 1 out of 1
 Number of reduce tasks determined at compile time: 1
 In order to change the average load for a reducer (in bytes):
   set hive.exec.reducers.bytes.per.reducer=<number>
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)