from:"Aihua Xu (JIRA)"

[jira] [Commented] (HIVE-19403) Demote 'Pattern' Logging

2018-06-08 Thread Aihua Xu (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-19403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16506564#comment-16506564
 ] 

Aihua Xu commented on HIVE-19403:
-

I feel we can remove such log entries. What do you think?

> Demote 'Pattern' Logging
> 
>
> Key: HIVE-19403
> URL: https://issues.apache.org/jira/browse/HIVE-19403
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 3.0.0, 2.4.0
>Reporter: BELUGA BEHR
>Assignee: gonglinglei
>Priority: Trivial
>  Labels: noob
> Attachments: HIVE-19403.1.patch
>
>
> In the {{DDLTask}} class, there is some logging that is not helpful to a 
> cluster admin and should be demoted to _debug_ level logging.  In fact, in 
> one place in the code, it already is.
> {code}
> LOG.info("pattern: {}", showDatabasesDesc.getPattern());
> LOG.debug("pattern: {}", pattern);
> LOG.info("pattern: {}", showFuncs.getPattern());
> LOG.info("pattern: {}", showTblStatus.getPattern());
> {code}
> Here is an example... as an admin, I can already see what the pattern is, I 
> do not need this extra logging.  It provides no additional context.
> {code:java|title=Example}
> 2018-05-03 03:08:26,354 INFO  org.apache.hadoop.hive.ql.Driver: 
> [HiveServer2-Background-Pool: Thread-101980]: Executing 
> command(queryId=hive_20180503030808_e53c26ef-2280-4eca-929b-668503105e2e): 
> SHOW TABLE EXTENDED FROM my_db LIKE '*'
> 2018-05-03 03:08:26,355 INFO  hive.ql.exec.DDLTask: 
> [HiveServer2-Background-Pool: Thread-101980]: pattern: *
> {code}
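
The effect of demoting the message can be sketched as follows. This is a minimal, self-contained illustration, not Hive code: it uses java.util.logging in place of the SLF4J logger shown in the report (JUL's FINE level stands in for debug), and the class name PatternLoggingSketch, the logger name, and the collecting handler are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

public class PatternLoggingSketch {

    /** Hypothetical handler that records what would have reached the log file. */
    static final class CollectingHandler extends Handler {
        final List<String> messages = new ArrayList<>();

        @Override
        public void publish(LogRecord record) {
            if (isLoggable(record)) {
                messages.add(record.getLevel() + " " + record.getMessage());
            }
        }

        @Override
        public void flush() { }

        @Override
        public void close() { }
    }

    /** Emits the command at INFO and the pattern at FINE, and returns what was logged. */
    static List<String> run(Level loggerLevel) {
        Logger log = Logger.getLogger("sketch.DDLTask." + loggerLevel);
        log.setUseParentHandlers(false); // keep the sketch quiet on the console
        log.setLevel(loggerLevel);
        CollectingHandler handler = new CollectingHandler();
        log.addHandler(handler);

        String pattern = "*";
        log.info("Executing command: SHOW TABLE EXTENDED FROM my_db LIKE '*'");
        // Demoted message: the supplier is only evaluated when FINE is enabled.
        log.fine(() -> "pattern: " + pattern);

        log.removeHandler(handler);
        return handler.messages;
    }

    public static void main(String[] args) {
        System.out.println("At INFO: " + run(Level.INFO));
        System.out.println("At FINE: " + run(Level.FINE));
    }
}
```

At the INFO level only the command line is recorded. The pattern line appears only once the logger is lowered to FINE, which is the behavior the report asks for.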



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (HIVE-19805) TableScanDesc Use Commons Library

2018-06-08 Thread Aihua Xu (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-19805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16506632#comment-16506632
 ] 

Aihua Xu commented on HIVE-19805:
-

[~belugabehr] That's nice. Do you know how the commons-collections4 dependency 
gets included?

> TableScanDesc Use Commons Library
> -
>
> Key: HIVE-19805
> URL: https://issues.apache.org/jira/browse/HIVE-19805
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Affects Versions: 4.0.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Trivial
> Attachments: HIVE-19805.1.patch
>
>
> Use commons library and remove some code




[jira] [Commented] (HIVE-19809) Remove Deprecated Code From Utilities Class

2018-06-08 Thread Aihua Xu (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-19809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16506644#comment-16506644
 ] 

Aihua Xu commented on HIVE-19809:
-

The change looks good to me. +1.

> Remove Deprecated Code From Utilities Class
> ---
>
> Key: HIVE-19809
> URL: https://issues.apache.org/jira/browse/HIVE-19809
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 3.0.0, 4.0.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Minor
> Attachments: HIVE-19809.1.patch
>
>
> {quote}
> This can go away once hive moves to support only JDK 7  and can use 
> Files.createTempDirectory
> {quote}
> Remove the {{createTempDir}} method from the {{Utilities}} class.
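
For reference, the JDK replacement named in the quoted comment can be used roughly like this. This is a minimal sketch assuming JDK 7 or later; the class name TempDirSketch, the helper createScratchDir, and the "hive-sketch-" prefix are hypothetical and do not come from the patch.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class TempDirSketch {

    /**
     * Creates a uniquely named directory under the default temp location.
     * Files.createTempDirectory picks the unique name itself, so no
     * hand-written retry loop is needed.
     */
    static Path createScratchDir(String prefix) {
        try {
            return Files.createTempDirectory(prefix);
        } catch (IOException e) {
            // Rethrown unchecked only to keep this sketch's callers short.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = createScratchDir("hive-sketch-");
        System.out.println("created " + dir);
        Files.delete(dir); // clean up the empty directory
    }
}
```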




[jira] [Commented] (HIVE-19203) Thread-Safety Issue in HiveMetaStore

2018-06-11 Thread Aihua Xu (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-19203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16508978#comment-16508978
 ] 

Aihua Xu commented on HIVE-19203:
-

+1.

> Thread-Safety Issue in HiveMetaStore
> 
>
> Key: HIVE-19203
> URL: https://issues.apache.org/jira/browse/HIVE-19203
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 3.0.0
>Reporter: BELUGA BEHR
>Assignee: Alice Fan
>Priority: Minor
> Attachments: HIVE-19203.1.patch
>
>
> [https://github.com/apache/hive/blob/550d1e1196b7c801c572092db974a459aac6c249/standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java#L345-L351]
> {code:java}
> private static int nextSerialNum = 0;
> private static ThreadLocal<Integer> threadLocalId = new ThreadLocal<Integer>() {
>   @Override
>   protected Integer initialValue() {
>     return nextSerialNum++;
>   }
> };
> {code}
>  
> {{nextSerialNum}} needs to be an atomic value.
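
One way the suggested change could look is sketched below. This is an illustration under assumptions, not the committed patch: the class name ThreadIdSketch and the helpers currentThreadId and distinctIds are hypothetical, and only the two field names are taken from the quoted snippet.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

public class ThreadIdSketch {

    // getAndIncrement() is one indivisible read-modify-write, whereas
    // nextSerialNum++ on a plain static int is a separate read and write
    // that two threads can interleave.
    private static final AtomicInteger nextSerialNum = new AtomicInteger(0);

    // Each thread draws its id once, on its first call to get().
    private static final ThreadLocal<Integer> threadLocalId =
        ThreadLocal.withInitial(nextSerialNum::getAndIncrement);

    static int currentThreadId() {
        return threadLocalId.get();
    }

    /** Starts the given number of threads and counts how many distinct ids they saw. */
    static int distinctIds(int threads) {
        Set<Integer> ids = ConcurrentHashMap.newKeySet();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> ids.add(currentThreadId()));
            workers[i].start();
        }
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while joining workers", e);
        }
        return ids.size();
    }

    public static void main(String[] args) {
        System.out.println("distinct ids: " + distinctIds(64));
    }
}
```

With the atomic counter, every thread receives a different id, so the number of distinct ids equals the number of threads. With the plain int from the snippet, two threads initializing at the same moment could read the same value.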




[jira] [Updated] (HIVE-19203) Thread-Safety Issue in HiveMetaStore

2018-06-12 Thread Aihua Xu (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-19203:

   Resolution: Fixed
Fix Version/s: 4.0.0
   Status: Resolved  (was: Patch Available)

Pushed to master. Thanks Alice for the work.

> Thread-Safety Issue in HiveMetaStore
> 
>
> Key: HIVE-19203
> URL: https://issues.apache.org/jira/browse/HIVE-19203
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 3.0.0
>Reporter: BELUGA BEHR
>Assignee: Alice Fan
>Priority: Minor
> Fix For: 4.0.0
>
> Attachments: HIVE-19203.1.patch
>
>
> [https://github.com/apache/hive/blob/550d1e1196b7c801c572092db974a459aac6c249/standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java#L345-L351]
> {code:java}
> private static int nextSerialNum = 0;
> private static ThreadLocal<Integer> threadLocalId = new ThreadLocal<Integer>() {
>   @Override
>   protected Integer initialValue() {
>     return nextSerialNum++;
>   }
> };
> {code}
>  
> {{nextSerialNum}} needs to be an atomic value.




[jira] [Commented] (HIVE-19422) Create Docker env for running HoS locally

2018-06-12 Thread Aihua Xu (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-19422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16510370#comment-16510370
 ] 

Aihua Xu commented on HIVE-19422:
-

[~stakiar] I am trying to understand what you have in mind for this jira. Are 
you talking about deploying Spark on Docker and debugging Hive against it? I 
have been able to set up Spark locally and connect Hive to it, so we can debug 
the Hive and Spark processes. If we install Spark on Docker, it becomes harder 
to debug against it.

Can you clarify? 

> Create Docker env for running HoS locally
> -
>
> Key: HIVE-19422
> URL: https://issues.apache.org/jira/browse/HIVE-19422
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Aihua Xu
>Priority: Major
>
> It's really hard to run HoS on a locally installed distribution of Hive built 
> using {{mvn package}}. The only way developers can really run HoS is via the 
> Spark CLI Drivers. However, there are occasions where devs need to run HoS on 
> a proper Hive distribution in order to validate some behavior.
> The docker image will also be useful to users who want to play around with 
> HoS.




[jira] [Updated] (HIVE-19785) Race condition when timeout task is invoked during SASL negotation

2018-06-13 Thread Aihua Xu (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-19785:

Attachment: HIVE-19785.1.patch

> Race condition when timeout task is invoked during SASL negotation
> --
>
> Key: HIVE-19785
> URL: https://issues.apache.org/jira/browse/HIVE-19785
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Aihua Xu
>Priority: Major
> Attachments: HIVE-19785.1.patch
>
>
> There is a race condition that leads to some extraneous exception messages 
> when the timeout task is invoked in {{RpcServer}}.
> If a timeout is triggered by {{RpcServer#registerClient}} the method will 
> remove the {{clientId}} from {{pendingClients}}. However, if the SASL 
> negotiation is in progress when the timeout task is invoked, then 
> {{SaslServerHandler#update}} will throw an {{IllegalArgumentException}} 
> complaining that it can't find the {{clientId}} in the map of 
> {{pendingClients}}.
> The timeout still succeeds, but the logging is confusing and multiple 
> exceptions make this difficult to debug.




[jira] [Updated] (HIVE-19785) Race condition when timeout task is invoked during SASL negotation

2018-06-13 Thread Aihua Xu (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-19785:

Status: Patch Available  (was: Open)

patch-1: after analyzing the logic, when the timeout is triggered we need to 
cancel registerClient. Also changed setSuccess/setFailure to 
trySuccess/tryFailure, since once the task has completed we can't change its 
status.
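
The difference between the set* and try* styles can be illustrated with a small stand-in. This sketch assumes nothing about the actual patch: it uses java.util.concurrent.CompletableFuture rather than the promise type used by RpcServer, and the class TryCompleteSketch and its helper methods are hypothetical. The setSuccess helper mimics a strict completion call that throws when the future is already done, while trySuccess only reports whether it won.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeoutException;

public class TryCompleteSketch {

    /** Strict style: completing an already completed future is treated as an error. */
    static <T> void setSuccess(CompletableFuture<T> future, T value) {
        if (!future.complete(value)) {
            throw new IllegalStateException("already completed");
        }
    }

    /** Lenient style: returns false, without throwing, if someone else finished first. */
    static <T> boolean trySuccess(CompletableFuture<T> future, T value) {
        return future.complete(value);
    }

    /** Lenient failure counterpart. */
    static <T> boolean tryFailure(CompletableFuture<T> future, Throwable cause) {
        return future.completeExceptionally(cause);
    }

    /** Simulates the timeout firing first and the handshake finishing afterwards. */
    static boolean[] race() {
        CompletableFuture<String> registration = new CompletableFuture<>();
        boolean timeoutWon = tryFailure(registration, new TimeoutException("client did not connect"));
        boolean handshakeWon = trySuccess(registration, "client-1");
        return new boolean[] {timeoutWon, handshakeWon};
    }

    public static void main(String[] args) {
        boolean[] outcome = race();
        System.out.println("timeout won: " + outcome[0] + ", handshake won: " + outcome[1]);
    }
}
```

The late handshake simply loses the race and leaves the recorded timeout in place, so no second exception is raised.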

> Race condition when timeout task is invoked during SASL negotation
> --
>
> Key: HIVE-19785
> URL: https://issues.apache.org/jira/browse/HIVE-19785
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Aihua Xu
>Priority: Major
> Attachments: HIVE-19785.1.patch
>
>
> There is a race condition that leads to some extraneous exception messages 
> when the timeout task is invoked in {{RpcServer}}.
> If a timeout is triggered by {{RpcServer#registerClient}} the method will 
> remove the {{clientId}} from {{pendingClients}}. However, if the SASL 
> negotiation is in progress when the timeout task is invoked, then 
> {{SaslServerHandler#update}} will throw an {{IllegalArgumentException}} 
> complaining that it can't find the {{clientId}} in the map of 
> {{pendingClients}}.
> The timeout still succeeds, but the logging is confusing and multiple 
> exceptions make this difficult to debug.




[jira] [Updated] (HIVE-14788) Investigate how to access permanent function with restarting HS2 if load balancer is configured

2018-06-13 Thread Aihua Xu (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-14788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-14788:

Description: 
When load balancer is configured for multiple HS2 servers, seems we need to 
restart each HS2 server to get permanent function to work. Since the command 
"reload function" issued from the client to refresh the global registry may not 
be targeted to a specific HS2 server, some servers may not get refreshed and 
ClassNotFoundException may be thrown later.

Investigate if it's an issue and a good solution for it.

  was:
When load balancer is configured for multiple HS2 servers, seems we need to 
restart each HS2 server to get permanent function to work. Since the command 
"reload function" issued from the client to refresh the global registry may is 
not targeted to a specific HS2 server, some servers may not get refreshed and 
ClassNotFoundException may be thrown later.

Investigate if it's an issue and a good solution for it.


> Investigate how to access permanent function with restarting HS2 if load 
> balancer is configured
> ---
>
> Key: HIVE-14788
> URL: https://issues.apache.org/jira/browse/HIVE-14788
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Aihua Xu
>Assignee: Aihua Xu
>Priority: Major
>
> When load balancer is configured for multiple HS2 servers, seems we need to 
> restart each HS2 server to get permanent function to work. Since the command 
> "reload function" issued from the client to refresh the global registry may 
> not be targeted to a specific HS2 server, some servers may not get refreshed 
> and ClassNotFoundException may be thrown later.
> Investigate if it's an issue and a good solution for it.




[jira] [Commented] (HIVE-18916) SparkClientImpl doesn't error out if spark-submit fails

2018-06-13 Thread Aihua Xu (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-18916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16511568#comment-16511568
 ] 

Aihua Xu commented on HIVE-18916:
-

[~stakiar] One thought on how to get the error: rather than checking the log 
for "Error", can we separate STDERR from STDOUT of the bin/spark-submit 
process so that, when there is an error, we can capture it from STDERR? Is 
that possible?
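
A rough sketch of that idea with plain java.lang.ProcessBuilder follows. It is not how SparkClientImpl launches spark-submit; the class StderrCaptureSketch, the Result holder, and the sh -c command that stands in for bin/spark-submit are all assumptions made for illustration. Both streams are redirected to temporary files so that neither pipe can fill up and block the child.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

public class StderrCaptureSketch {

    /** Exit code plus the two output streams, kept separate. */
    static final class Result {
        final int exitCode;
        final String stdout;
        final String stderr;

        Result(int exitCode, String stdout, String stderr) {
            this.exitCode = exitCode;
            this.stdout = stdout;
            this.stderr = stderr;
        }
    }

    /** Runs the command to completion and returns its exit code, stdout and stderr. */
    static Result run(List<String> command) {
        Path out = null;
        Path err = null;
        try {
            out = Files.createTempFile("child-stdout-", ".log");
            err = Files.createTempFile("child-stderr-", ".log");
            Process child = new ProcessBuilder(command)
                .redirectOutput(out.toFile()) // stdout goes to its own file
                .redirectError(err.toFile())  // stderr is not merged into stdout
                .start();
            int exitCode = child.waitFor();
            return new Result(exitCode, read(out), read(err));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for child", e);
        } finally {
            deleteQuietly(out);
            deleteQuietly(err);
        }
    }

    private static String read(Path file) throws IOException {
        return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
    }

    private static void deleteQuietly(Path file) {
        if (file == null) {
            return;
        }
        try {
            Files.deleteIfExists(file);
        } catch (IOException ignored) {
            // Best-effort cleanup of a temp file.
        }
    }

    public static void main(String[] args) {
        // Stand-in for a failing spark-submit; requires a POSIX shell on the PATH.
        Result r = run(Arrays.asList("sh", "-c", "echo submitting; echo 'Error: bad master' 1>&2; exit 1"));
        if (r.exitCode != 0) {
            System.out.println("child failed with exit code " + r.exitCode + ": " + r.stderr.trim());
        }
    }
}
```

With the exit code and STDERR available separately, the caller can fail fast with the child's own error text instead of waiting for the connection timeout.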

> SparkClientImpl doesn't error out if spark-submit fails
> ---
>
> Key: HIVE-18916
> URL: https://issues.apache.org/jira/browse/HIVE-18916
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>Priority: Major
> Attachments: HIVE-18916.1.WIP.patch, HIVE-18916.2.patch, 
> HIVE-18916.3.patch
>
>
> If {{spark-submit}} returns a non-zero exit code, {{SparkClientImpl}} will 
> simply log the exit code, but won't throw an error. Eventually, the 
> connection timeout will get triggered and an exception like {{Timed out 
> waiting for client connection}} will be logged, which is pretty misleading.




[jira] [Commented] (HIVE-19878) Hive On Spark support AM shut down when there is no job submit

2018-06-13 Thread Aihua Xu (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-19878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16511635#comment-16511635
 ] 

Aihua Xu commented on HIVE-19878:
-

[~windpiger] I checked the patch and it's promising. Do you want to contribute 
the patch? If so, can you assign the issue to yourself and attach the patch 
with the name HIVE-19878.1.patch to trigger the pre-commit build? Thanks.

> Hive On Spark support AM shut down when there is no job submit
> --
>
> Key: HIVE-19878
> URL: https://issues.apache.org/jira/browse/HIVE-19878
> Project: Hive
>  Issue Type: New Feature
>  Components: Spark
>Reporter: Song Jun
>Priority: Minor
> Attachments: HIVE-19878.patch.1
>
>
> The Application Master of Hive on Spark stays alive on YARN as long as the 
> Hive client does not exit (for example, a session in HiveServer2), which 
> occupies a lot of resources. We should shut down the AM when no more jobs 
> are submitted.
> Tez currently uses the parameter `tez.session.am.dag.submit.timeout.secs` to 
> shut down the DAGAppMaster on YARN.
> Spark needs to do the same.
>  




[jira] [Commented] (HIVE-19422) Create Docker env for running HoS locally

2018-06-13 Thread Aihua Xu (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-19422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16511645#comment-16511645
 ] 

Aihua Xu commented on HIVE-19422:
-

[~stakiar] I was checking https://github.com/big-data-europe/docker-spark/ and 
https://github.com/big-data-europe/docker-hive. There are some Docker images 
for Spark and Hive. I can investigate creating a HoS Docker image. Regarding 
debugging and testing HoS locally, maybe we can write some scripts to automate 
the steps (separate from this)?

> Create Docker env for running HoS locally
> -
>
> Key: HIVE-19422
> URL: https://issues.apache.org/jira/browse/HIVE-19422
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Aihua Xu
>Priority: Major
>
> It's really hard to run HoS on a locally installed distribution of Hive built 
> using {{mvn package}}. The only way developers can really run HoS is via the 
> Spark CLI Drivers. However, there are occasions where devs need to run HoS on 
> a proper Hive distribution in order to validate some behavior.
> The docker image will also be useful to users who want to play around with 
> HoS.




[jira] [Assigned] (HIVE-19899) Support stored as JsonFile

2018-06-14 Thread Aihua Xu (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu reassigned HIVE-19899:
---


> Support stored as JsonFile 
> ---
>
> Key: HIVE-19899
> URL: https://issues.apache.org/jira/browse/HIVE-19899
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Affects Versions: 3.0.0
> Environment: This is to add "stored as jsonfile" support for json 
> file format. 
>Reporter: Aihua Xu
>Assignee: Aihua Xu
>Priority: Major
>





[jira] [Updated] (HIVE-19899) Support stored as JsonFile

2018-06-14 Thread Aihua Xu (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-19899:

Attachment: HIVE-19899.1.patch

> Support stored as JsonFile 
> ---
>
> Key: HIVE-19899
> URL: https://issues.apache.org/jira/browse/HIVE-19899
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Affects Versions: 3.0.0
> Environment: This is to add "stored as jsonfile" support for json 
> file format. 
>Reporter: Aihua Xu
>Assignee: Aihua Xu
>Priority: Major
> Attachments: HIVE-19899.1.patch
>
>





[jira] [Updated] (HIVE-19899) Support stored as JsonFile

2018-06-14 Thread Aihua Xu (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-19899:

Status: Patch Available  (was: Open)

Patch-1: add support for "stored as JsonFile" for the JSON file format.

> Support stored as JsonFile 
> ---
>
> Key: HIVE-19899
> URL: https://issues.apache.org/jira/browse/HIVE-19899
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Affects Versions: 3.0.0
> Environment: This is to add "stored as jsonfile" support for json 
> file format. 
>Reporter: Aihua Xu
>Assignee: Aihua Xu
>Priority: Major
> Attachments: HIVE-19899.1.patch
>
>





[jira] [Commented] (HIVE-19899) Support stored as JsonFile

2018-06-14 Thread Aihua Xu (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-19899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16513047#comment-16513047
 ] 

Aihua Xu commented on HIVE-19899:
-

[~belugabehr] Thanks for reviewing it. I will update the doc after the patch 
gets committed. 

Curious why you prefer JSON over JSONFILE. We can go either way, following 
parquet/ORC or RCFile/TextFile. I'm not sure if there is a pattern here, but I 
chose JsonFile since I thought it's clearer. Let me know your thoughts on this.

{{ROW FORMAT JSON STORED AS TEXTFILE}} would require a lot more custom work for 
textfile support, and I feel it's less intuitive.



> Support stored as JsonFile 
> ---
>
> Key: HIVE-19899
> URL: https://issues.apache.org/jira/browse/HIVE-19899
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Affects Versions: 3.0.0
> Environment: This is to add "stored as jsonfile" support for json 
> file format. 
>Reporter: Aihua Xu
>Assignee: Aihua Xu
>Priority: Major
> Attachments: HIVE-19899.1.patch
>
>





[jira] [Updated] (HIVE-19899) Support stored as JsonFile

2018-06-14 Thread Aihua Xu (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-19899:

Attachment: HIVE-19899.2.patch

> Support stored as JsonFile 
> ---
>
> Key: HIVE-19899
> URL: https://issues.apache.org/jira/browse/HIVE-19899
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Affects Versions: 3.0.0
> Environment: This is to add "stored as jsonfile" support for json 
> file format. 
>Reporter: Aihua Xu
>Assignee: Aihua Xu
>Priority: Major
> Attachments: HIVE-19899.1.patch, HIVE-19899.2.patch
>
>





[jira] [Commented] (HIVE-19899) Support stored as JsonFile

2018-06-14 Thread Aihua Xu (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-19899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16513101#comment-16513101
 ] 

Aihua Xu commented on HIVE-19899:
-

Thanks. Uploaded patch-2 to address the comment.

> Support stored as JsonFile 
> ---
>
> Key: HIVE-19899
> URL: https://issues.apache.org/jira/browse/HIVE-19899
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Affects Versions: 3.0.0
> Environment: This is to add "stored as jsonfile" support for json 
> file format. 
>Reporter: Aihua Xu
>Assignee: Aihua Xu
>Priority: Major
> Attachments: HIVE-19899.1.patch, HIVE-19899.2.patch
>
>





[jira] [Commented] (HIVE-19899) Support stored as JsonFile

2018-06-14 Thread Aihua Xu (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-19899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16513279#comment-16513279
 ] 

Aihua Xu commented on HIVE-19899:
-

[~ychena] Seems I changed that by mistake. I will update that.

> Support stored as JsonFile 
> ---
>
> Key: HIVE-19899
> URL: https://issues.apache.org/jira/browse/HIVE-19899
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Affects Versions: 3.0.0
> Environment: This is to add "stored as jsonfile" support for json 
> file format. 
>Reporter: Aihua Xu
>Assignee: Aihua Xu
>Priority: Major
> Attachments: HIVE-19899.1.patch, HIVE-19899.2.patch
>
>





[jira] [Updated] (HIVE-19899) Support stored as JsonFile

2018-06-14 Thread Aihua Xu (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-19899:

Attachment: HIVE-19899.3.patch

> Support stored as JsonFile 
> ---
>
> Key: HIVE-19899
> URL: https://issues.apache.org/jira/browse/HIVE-19899
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Affects Versions: 3.0.0
> Environment: This is to add "stored as jsonfile" support for json 
> file format. 
>Reporter: Aihua Xu
>Assignee: Aihua Xu
>Priority: Major
> Attachments: HIVE-19899.1.patch, HIVE-19899.2.patch, 
> HIVE-19899.3.patch
>
>





[jira] [Updated] (HIVE-19899) Support stored as JsonFile

2018-06-14 Thread Aihua Xu (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-19899:

Attachment: (was: HIVE-19899.3.patch)

> Support stored as JsonFile 
> ---
>
> Key: HIVE-19899
> URL: https://issues.apache.org/jira/browse/HIVE-19899
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Affects Versions: 3.0.0
> Environment: This is to add "stored as jsonfile" support for json 
> file format. 
>Reporter: Aihua Xu
>Assignee: Aihua Xu
>Priority: Major
> Attachments: HIVE-19899.1.patch, HIVE-19899.2.patch, 
> HIVE-19899.3.patch
>
>





[jira] [Updated] (HIVE-19899) Support stored as JsonFile

2018-06-14 Thread Aihua Xu (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-19899:

Attachment: HIVE-19899.3.patch

> Support stored as JsonFile 
> ---
>
> Key: HIVE-19899
> URL: https://issues.apache.org/jira/browse/HIVE-19899
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Affects Versions: 3.0.0
> Environment: This is to add "stored as jsonfile" support for json 
> file format. 
>Reporter: Aihua Xu
>Assignee: Aihua Xu
>Priority: Major
> Attachments: HIVE-19899.1.patch, HIVE-19899.2.patch, 
> HIVE-19899.3.patch
>
>





[jira] [Updated] (HIVE-19785) Race condition when timeout task is invoked during SASL negotation

2018-06-15 Thread Aihua Xu (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-19785:

Status: In Progress  (was: Patch Available)

> Race condition when timeout task is invoked during SASL negotation
> --
>
> Key: HIVE-19785
> URL: https://issues.apache.org/jira/browse/HIVE-19785
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Aihua Xu
>Priority: Major
>
> There is a race condition that leads to some extraneous exception messages 
> when the timeout task is invoked in {{RpcServer}}.
> If a timeout is triggered by {{RpcServer#registerClient}} the method will 
> remove the {{clientId}} from {{pendingClients}}. However, if the SASL 
> negotiation is in progress when the timeout task is invoked, then 
> {{SaslServerHandler#update}} will throw an {{IllegalArgumentException}} 
> complaining that it can't find the {{clientId}} in the map of 
> {{pendingClients}}.
> The timeout still succeeds, but the logging is confusing and multiple 
> exceptions make this difficult to debug.




[jira] [Updated] (HIVE-19785) Race condition when timeout task is invoked during SASL negotation

2018-06-15 Thread Aihua Xu (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-19785:

Attachment: (was: HIVE-19785.1.patch)

> Race condition when timeout task is invoked during SASL negotation
> --
>
> Key: HIVE-19785
> URL: https://issues.apache.org/jira/browse/HIVE-19785
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Aihua Xu
>Priority: Major
>
> There is a race condition that leads to some extraneous exception messages 
> when the timeout task is invoked in {{RpcServer}}.
> If a timeout is triggered by {{RpcServer#registerClient}} the method will 
> remove the {{clientId}} from {{pendingClients}}. However, if the SASL 
> negotiation is in progress when the timeout task is invoked, then 
> {{SaslServerHandler#update}} will throw an {{IllegalArgumentException}} 
> complaining that it can't find the {{clientId}} in the map of 
> {{pendingClients}}.
> The timeout still succeeds, but the logging is confusing and multiple 
> exceptions make this difficult to debug.




[jira] [Updated] (HIVE-19899) Support stored as JsonFile

2018-06-18 Thread Aihua Xu (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-19899:

Attachment: HIVE-19899.4.patch

> Support stored as JsonFile 
> ---
>
> Key: HIVE-19899
> URL: https://issues.apache.org/jira/browse/HIVE-19899
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Affects Versions: 3.0.0
> Environment: This is to add "stored as jsonfile" support for json 
> file format. 
>Reporter: Aihua Xu
>Assignee: Aihua Xu
>Priority: Major
> Attachments: HIVE-19899.1.patch, HIVE-19899.2.patch, 
> HIVE-19899.3.patch, HIVE-19899.4.patch
>
>





[jira] [Updated] (HIVE-19899) Support stored as JsonFile

2018-06-18 Thread Aihua Xu (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-19899:

Attachment: HIVE-19899.4.patch

> Support stored as JsonFile 
> ---
>
> Key: HIVE-19899
> URL: https://issues.apache.org/jira/browse/HIVE-19899
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Affects Versions: 3.0.0
> Environment: This is to add "stored as jsonfile" support for json 
> file format. 
>Reporter: Aihua Xu
>Assignee: Aihua Xu
>Priority: Major
> Attachments: HIVE-19899.1.patch, HIVE-19899.2.patch, 
> HIVE-19899.3.patch, HIVE-19899.4.patch
>
>





[jira] [Updated] (HIVE-19899) Support stored as JsonFile

2018-06-18 Thread Aihua Xu (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-19899:

Attachment: (was: HIVE-19899.4.patch)

> Support stored as JsonFile 
> ---
>
> Key: HIVE-19899
> URL: https://issues.apache.org/jira/browse/HIVE-19899
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Affects Versions: 3.0.0
> Environment: This is to add "stored as jsonfile" support for json 
> file format. 
>Reporter: Aihua Xu
>Assignee: Aihua Xu
>Priority: Major
> Attachments: HIVE-19899.1.patch, HIVE-19899.2.patch, 
> HIVE-19899.3.patch, HIVE-19899.4.patch
>
>





[jira] [Commented] (HIVE-19899) Support stored as JsonFile

2018-06-18 Thread Aihua Xu (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-19899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16516105#comment-16516105
 ] 

Aihua Xu commented on HIVE-19899:
-

patch-4: The TestHCatStorer test failures are related. A private 
storageFormat is also defined in the child class, which causes the issue. 
Removed the duplicate private definition in TestHCatStorer.
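
The kind of problem described here can be reproduced in isolation. The sketch below is only a guess at the shape of the issue, assuming storageFormat is a field: the class names BaseStorerTest, HidingChild and FixedChild are hypothetical and are not the real test classes. It shows how a private field redeclared in a subclass hides the inherited one, so a value set through the parent never reaches code in the child.

```java
public class FieldHidingSketch {

    /** Hypothetical parent that owns and initializes the storage format. */
    static class BaseStorerTest {
        protected String storageFormat;

        void setUp(String format) {
            this.storageFormat = format; // writes the parent's field
        }
    }

    /** Redeclares the field, which hides the inherited one. */
    static class HidingChild extends BaseStorerTest {
        private String storageFormat; // never assigned, stays null

        String formatSeenByChild() {
            return storageFormat; // resolves to the child's own field
        }
    }

    /** Same child with the duplicate declaration removed. */
    static class FixedChild extends BaseStorerTest {
        String formatSeenByChild() {
            return storageFormat; // resolves to the inherited field
        }
    }

    public static void main(String[] args) {
        HidingChild hiding = new HidingChild();
        hiding.setUp("JSONFILE");
        FixedChild fixed = new FixedChild();
        fixed.setUp("JSONFILE");
        System.out.println("hiding child sees: " + hiding.formatSeenByChild());
        System.out.println("fixed child sees: " + fixed.formatSeenByChild());
    }
}
```

Java resolves fields statically by the declaring class, so removing the duplicate declaration is enough for the child to see the value set by the parent.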

> Support stored as JsonFile 
> ---
>
> Key: HIVE-19899
> URL: https://issues.apache.org/jira/browse/HIVE-19899
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Affects Versions: 3.0.0
> Environment: This is to add "stored as jsonfile" support for json 
> file format. 
>Reporter: Aihua Xu
>Assignee: Aihua Xu
>Priority: Major
> Attachments: HIVE-19899.1.patch, HIVE-19899.2.patch, 
> HIVE-19899.3.patch, HIVE-19899.4.patch
>
>





[jira] [Assigned] (HIVE-19936) explain on a query failing in secure cluster whereas query itself works

2018-06-18 Thread Aihua Xu (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu reassigned HIVE-19936:
---

Assignee: Aihua Xu

> explain on a query failing in secure cluster whereas query itself works
> ---
>
> Key: HIVE-19936
> URL: https://issues.apache.org/jira/browse/HIVE-19936
> Project: Hive
>  Issue Type: Bug
>  Components: Hooks
>Reporter: Aihua Xu
>Assignee: Aihua Xu
>Priority: Major
>
> On a secured cluster with Sentry integrated, run the following queries:
> {noformat}
> create table foobar (id int) partitioned by (val int);
> explain alter table foobar add partition (val=50);
> {noformat}
> The explain query will fail with the following exception while the query 
> itself works with no issue.
> Error while compiling statement: FAILED: SemanticException No valid 
> privileges
>  Required privilege( Table) not available in output privileges
>  The required privileges: (state=42000,code=4)




[jira] [Updated] (HIVE-19936) explain on a query failing in secure cluster whereas query itself works

2018-06-18 Thread Aihua Xu (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-19936:

Status: Patch Available  (was: Open)

patch-1: the source change adds the inputs/outputs for the explain command. It 
will affect many test cases, since the inputs/outputs of PREHOOK and POSTHOOK 
get updated. I will update those later if this change makes sense.

> explain on a query failing in secure cluster whereas query itself works
> ---
>
> Key: HIVE-19936
> URL: https://issues.apache.org/jira/browse/HIVE-19936
> Project: Hive
>  Issue Type: Bug
>  Components: Hooks
>Reporter: Aihua Xu
>Assignee: Aihua Xu
>Priority: Major
> Attachments: HIVE-19936.1.patch
>
>
> On a secured cluster with Sentry integrated run the following queries
> {noformat}
> create table foobar (id int) partitioned by (val int);
> explain alter table foobar add partition (val=50);
> {noformat}
> The explain query will fail with the following exception while the query 
> itself works with no issue.
> Error while compiling statement: FAILED: SemanticException No valid 
> privileges
>  Required privilege( Table) not available in output privileges
>  The required privileges: (state=42000,code=4)




[jira] [Updated] (HIVE-19936) explain on a query failing in secure cluster whereas query itself works

2018-06-18 Thread Aihua Xu (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-19936:

Attachment: HIVE-19936.1.patch

> explain on a query failing in secure cluster whereas query itself works
> ---
>
> Key: HIVE-19936
> URL: https://issues.apache.org/jira/browse/HIVE-19936
> Project: Hive
>  Issue Type: Bug
>  Components: Hooks
>Reporter: Aihua Xu
>Assignee: Aihua Xu
>Priority: Major
> Attachments: HIVE-19936.1.patch
>
>
> On a secured cluster with Sentry integrated run the following queries
> {noformat}
> create table foobar (id int) partitioned by (val int);
> explain alter table foobar add partition (val=50);
> {noformat}
> The explain query will fail with the following exception while the query 
> itself works with no issue.
> Error while compiling statement: FAILED: SemanticException No valid 
> privileges
>  Required privilege( Table) not available in output privileges
>  The required privileges: (state=42000,code=4)




[jira] [Comment Edited] (HIVE-19936) explain on a query failing in secure cluster whereas query itself works

2018-06-18 Thread Aihua Xu (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-19936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16516363#comment-16516363
 ] 

Aihua Xu edited comment on HIVE-19936 at 6/18/18 10:27 PM:
---

patch-1: the source code change is to add the inputs/outputs for the explain 
command. For an explain command like {{explain alter table foobar add partition 
(val=50);}}, the inputs/outputs should be the ones from the query itself. I 
checked MySQL and Oracle 
(https://www.vividcortex.com/blog/2014/07/28/what-privileges-does-explain-require/
 and 
https://docs.oracle.com/cd/B19306_01/server.102/b14200/statements_9010.htm); 
both require the same privileges as the query itself. 

It will affect many test cases since the inputs/outputs of PREHOOK and 
POSTHOOK get updated. I will update those later if this change makes sense.


was (Author: aihuaxu):
patch-1: the source code change is to add the inputs/outputs from explain 
command. It will affect many test cases since the inputs/outputs of  PREHOOK 
and POSTHOOK get updated. Will update that later if this change makes sense.

> explain on a query failing in secure cluster whereas query itself works
> ---
>
> Key: HIVE-19936
> URL: https://issues.apache.org/jira/browse/HIVE-19936
> Project: Hive
>  Issue Type: Bug
>  Components: Hooks
>Reporter: Aihua Xu
>Assignee: Aihua Xu
>Priority: Major
> Attachments: HIVE-19936.1.patch
>
>
> On a secured cluster with Sentry integrated run the following queries
> {noformat}
> create table foobar (id int) partitioned by (val int);
> explain alter table foobar add partition (val=50);
> {noformat}
> The explain query will fail with the following exception while the query 
> itself works with no issue.
> Error while compiling statement: FAILED: SemanticException No valid 
> privileges
>  Required privilege( Table) not available in output privileges
>  The required privileges: (state=42000,code=4)




[jira] [Commented] (HIVE-18916) SparkClientImpl doesn't error out if spark-submit fails

2018-06-18 Thread Aihua Xu (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-18916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16516433#comment-16516433
 ] 

Aihua Xu commented on HIVE-18916:
-

Got it. Can you take care of the checkstyle errors above? One file is missing 
a license header. 

> SparkClientImpl doesn't error out if spark-submit fails
> ---
>
> Key: HIVE-18916
> URL: https://issues.apache.org/jira/browse/HIVE-18916
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>Priority: Major
> Attachments: HIVE-18916.1.WIP.patch, HIVE-18916.2.patch, 
> HIVE-18916.3.patch
>
>
> If {{spark-submit}} returns a non-zero exit code, {{SparkClientImpl}} will 
> simply log the exit code, but won't throw an error. Eventually, the 
> connection timeout will get triggered and an exception like {{Timed out 
> waiting for client connection}} will be logged, which is pretty misleading.
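
The fail-fast behavior requested in the description can be sketched roughly as follows. This is only an illustration, not the actual {{SparkClientImpl}} code: the class {{ChildProcessCheck}} and its methods {{checkExitCode}} and {{runOrThrow}} are hypothetical names, and the sketch assumes the child process is launched with a plain {{ProcessBuilder}}.

```java
import java.io.IOException;

// Hypothetical helper, not part of Hive: turns a non-zero exit code into an exception.
class ChildProcessCheck {

    /** Throws if the exit code signals failure, so the caller fails at once instead of waiting for a timeout. */
    static void checkExitCode(int exitCode, String description) {
        if (exitCode != 0) {
            throw new IllegalStateException(description + " exited with code " + exitCode);
        }
    }

    /** Starts the command, waits for it to finish, and surfaces a non-zero exit code as an exception. */
    static void runOrThrow(String description, String... command)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command).inheritIO().start();
        checkExitCode(process.waitFor(), description);
    }
}
```

With a check like this, a failed launch would be reported with its exit code right away rather than showing up later as a connection timeout.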




[jira] [Commented] (HIVE-19403) Demote 'Pattern' Logging

2018-06-19 Thread Aihua Xu (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-19403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16517347#comment-16517347
 ] 

Aihua Xu commented on HIVE-19403:
-

Debug level is enough for such logs. +1.

> Demote 'Pattern' Logging
> 
>
> Key: HIVE-19403
> URL: https://issues.apache.org/jira/browse/HIVE-19403
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 3.0.0, 2.4.0
>Reporter: BELUGA BEHR
>Assignee: gonglinglei
>Priority: Trivial
>  Labels: noob
> Attachments: HIVE-19403.1.patch
>
>
> In the {{DDLTask}} class, there is some logging that is not helpful to a 
> cluster admin and should be demoted to _debug_ level logging.  In fact, in 
> one place in the code, it already is.
> {code}
> LOG.info("pattern: {}", showDatabasesDesc.getPattern());
> LOG.debug("pattern: {}", pattern);
> LOG.info("pattern: {}", showFuncs.getPattern());
> LOG.info("pattern: {}", showTblStatus.getPattern());
> {code}
> Here is an example... as an admin, I can already see what the pattern is, I 
> do not need this extra logging.  It provides no additional context.
> {code:java|title=Example}
> 2018-05-03 03:08:26,354 INFO  org.apache.hadoop.hive.ql.Driver: 
> [HiveServer2-Background-Pool: Thread-101980]: Executing 
> command(queryId=hive_20180503030808_e53c26ef-2280-4eca-929b-668503105e2e): 
> SHOW TABLE EXTENDED FROM my_db LIKE '*'
> 2018-05-03 03:08:26,355 INFO  hive.ql.exec.DDLTask: 
> [HiveServer2-Background-Pool: Thread-101980]: pattern: *
> {code}
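
The effect of demoting the message can be sketched as below. This is an illustration under assumptions, not Hive code: it uses {{java.util.logging}} as a stand-in for the SLF4J logger in {{DDLTask}} (with {{FINE}} playing the role of debug), and the class {{PatternLoggingDemo}} and method {{capture}} are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

// Hypothetical demo, not part of Hive: shows which messages survive at a given threshold.
class PatternLoggingDemo {

    /** Logs one INFO and one FINE (debug-like) message and returns what the logger let through. */
    static List<String> capture(Level threshold) {
        List<String> seen = new ArrayList<>();
        Logger log = Logger.getLogger("demo.DDLTask." + threshold.getName());
        log.setUseParentHandlers(false); // keep the demo output out of the console
        log.setLevel(threshold);
        Handler handler = new Handler() {
            @Override public void publish(LogRecord record) {
                seen.add(record.getLevel() + " " + record.getMessage());
            }
            @Override public void flush() { }
            @Override public void close() { }
        };
        log.addHandler(handler);
        log.info("Executing command: SHOW TABLE EXTENDED FROM my_db LIKE '*'");
        log.fine("pattern: *"); // the demoted message
        log.removeHandler(handler);
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(capture(Level.INFO)); // threshold an admin would typically run with
        System.out.println(capture(Level.FINE)); // threshold used while debugging
    }
}
```

At the {{INFO}} threshold only the command line is recorded, while the pattern line is still available when the threshold is lowered for debugging.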




[jira] [Assigned] (HIVE-19948) HiveCli is not splitting the command by semicolon properly if quotes are inside the string

2018-06-19 Thread Aihua Xu (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu reassigned HIVE-19948:
---

Assignee: Aihua Xu

> HiveCli is not splitting the command by semicolon properly if quotes are 
> inside the string 
> ---
>
> Key: HIVE-19948
> URL: https://issues.apache.org/jira/browse/HIVE-19948
> Project: Hive
>  Issue Type: Bug
>  Components: CLI
>Affects Versions: 2.2.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
>Priority: Major
>
> HIVE-15297 tries to split the command by considering semicolon inside string, 
> but it doesn't consider the case that quotes can also be inside string. 
> For the following command {{insert into escape1 partition (ds='1', part='3') 
> values ("abc' ");}}, it will fail with 
> {noformat}
> 18/06/19 16:37:05 ERROR ql.Driver: FAILED: ParseException line 1:64 
> extraneous input ';' expecting EOF near ''
> org.apache.hadoop.hive.ql.parse.ParseException: line 1:64 extraneous input 
> ';' expecting EOF near ''
>   at 
> org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:220)
>   at org.apache.hadoop.hive.ql.parse.ParseUtils.parse(ParseUtils.java:74)
>   at org.apache.hadoop.hive.ql.parse.ParseUtils.parse(ParseUtils.java:67)
>   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:606)
>   at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1686)
>   at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1633)
>   at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1628)
>   at 
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.compileAndRespond(ReExecDriver.java:126)
>   at 
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:214)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:239)
>   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:188)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:402)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:821)
>   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:759)
>   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:683)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.apache.hadoop.util.RunJar.run(RunJar.java:239)
>   at org.apache.hadoop.util.RunJar.main(RunJar.java:153)
> {noformat}




[jira] [Updated] (HIVE-19899) Support stored as JsonFile

2018-06-20 Thread Aihua Xu (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-19899:

Attachment: HIVE-19899.5.patch

> Support stored as JsonFile 
> ---
>
> Key: HIVE-19899
> URL: https://issues.apache.org/jira/browse/HIVE-19899
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Affects Versions: 3.0.0
> Environment: This is to add "stored as jsonfile" support for json 
> file format. 
>Reporter: Aihua Xu
>Assignee: Aihua Xu
>Priority: Major
> Attachments: HIVE-19899.1.patch, HIVE-19899.2.patch, 
> HIVE-19899.3.patch, HIVE-19899.4.patch, HIVE-19899.5.patch
>
>





[jira] [Updated] (HIVE-19899) Support stored as JsonFile

2018-06-20 Thread Aihua Xu (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-19899:

Environment: (was: This is to add "stored as jsonfile" support for json 
file format. )

> Support stored as JsonFile 
> ---
>
> Key: HIVE-19899
> URL: https://issues.apache.org/jira/browse/HIVE-19899
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Affects Versions: 3.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
>Priority: Major
> Attachments: HIVE-19899.1.patch, HIVE-19899.2.patch, 
> HIVE-19899.3.patch, HIVE-19899.4.patch, HIVE-19899.5.patch
>
>
> This is to support "Create table ... stored as JsonFile" syntax. 




[jira] [Updated] (HIVE-19899) Support stored as JsonFile

2018-06-20 Thread Aihua Xu (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-19899:

Description: This is to support "Create table ... stored as JsonFile" 
syntax. 

> Support stored as JsonFile 
> ---
>
> Key: HIVE-19899
> URL: https://issues.apache.org/jira/browse/HIVE-19899
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Affects Versions: 3.0.0
> Environment: This is to add "stored as jsonfile" support for json 
> file format. 
>Reporter: Aihua Xu
>Assignee: Aihua Xu
>Priority: Major
> Attachments: HIVE-19899.1.patch, HIVE-19899.2.patch, 
> HIVE-19899.3.patch, HIVE-19899.4.patch, HIVE-19899.5.patch
>
>
> This is to support "Create table ... stored as JsonFile" syntax. 




[jira] [Commented] (HIVE-19897) Add more tests for parallel compilation

2018-06-20 Thread Aihua Xu (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-19897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16518401#comment-16518401
 ] 

Aihua Xu commented on HIVE-19897:
-

Looks good to me. +1.

> Add more tests for parallel compilation 
> 
>
> Key: HIVE-19897
> URL: https://issues.apache.org/jira/browse/HIVE-19897
> Project: Hive
>  Issue Type: Test
>  Components: HiveServer2
>Reporter: Yongzhi Chen
>Assignee: Yongzhi Chen
>Priority: Minor
> Attachments: HIVE-19897.1.patch, HIVE-19897.3.patch
>
>
> The two parallel compilation tests in 
> org.apache.hive.jdbc.TestJdbcWithMiniHS2 do not really cover the case of 
> queries compiling concurrently from different connections. Not sure whether 
> it is on purpose or by mistake. Add more tests to cover the case. 




[jira] [Updated] (HIVE-19948) HiveCli is not splitting the command by semicolon properly if quotes are inside the string

2018-06-20 Thread Aihua Xu (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-19948:

Attachment: HIVE-19948.1.patch

> HiveCli is not splitting the command by semicolon properly if quotes are 
> inside the string 
> ---
>
> Key: HIVE-19948
> URL: https://issues.apache.org/jira/browse/HIVE-19948
> Project: Hive
>  Issue Type: Bug
>  Components: CLI
>Affects Versions: 2.2.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
>Priority: Major
> Attachments: HIVE-19948.1.patch
>
>
> HIVE-15297 tries to split the command by considering semicolon inside string, 
> but it doesn't consider the case that quotes can also be inside string. 
> For the following command {{insert into escape1 partition (ds='1', part='3') 
> values ("abc' ");}}, it will fail with 
> {noformat}
> 18/06/19 16:37:05 ERROR ql.Driver: FAILED: ParseException line 1:64 
> extraneous input ';' expecting EOF near ''
> org.apache.hadoop.hive.ql.parse.ParseException: line 1:64 extraneous input 
> ';' expecting EOF near ''
>   at 
> org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:220)
>   at org.apache.hadoop.hive.ql.parse.ParseUtils.parse(ParseUtils.java:74)
>   at org.apache.hadoop.hive.ql.parse.ParseUtils.parse(ParseUtils.java:67)
>   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:606)
>   at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1686)
>   at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1633)
>   at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1628)
>   at 
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.compileAndRespond(ReExecDriver.java:126)
>   at 
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:214)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:239)
>   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:188)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:402)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:821)
>   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:759)
>   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:683)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.apache.hadoop.util.RunJar.run(RunJar.java:239)
>   at org.apache.hadoop.util.RunJar.main(RunJar.java:153)
> {noformat}




[jira] [Updated] (HIVE-19948) HiveCli is not splitting the command by semicolon properly if quotes are inside the string

2018-06-20 Thread Aihua Xu (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-19948:

Status: Patch Available  (was: Open)

patch-1: Handle both single quotes and double quotes when splitting by 
semicolon. 
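
A minimal sketch of quote-aware splitting, to illustrate the idea only. It is not the code in the attached patch: the class {{QuoteAwareSplitter}} and method {{splitOnSemicolons}} are hypothetical names, and treating a backslash as escaping the next character everywhere is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Hive: splits on semicolons that are outside quoted strings.
class QuoteAwareSplitter {

    /**
     * A quote character only closes a string opened by the same character,
     * so a single quote inside a double-quoted string is ordinary text.
     * Pieces are returned untrimmed; a trailing empty piece is dropped.
     */
    static List<String> splitOnSemicolons(String line) {
        List<String> commands = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        char openQuote = 0;      // 0 means "not inside a string"
        boolean escaped = false; // true when the previous character was a backslash

        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (escaped) {
                // Assumption of this sketch: an escaped character is always copied as-is.
                current.append(c);
                escaped = false;
            } else if (c == '\\') {
                current.append(c);
                escaped = true;
            } else if (openQuote != 0) {
                if (c == openQuote) {
                    openQuote = 0; // the matching quote closes the string
                }
                current.append(c);
            } else if (c == '\'' || c == '"') {
                openQuote = c; // remember which quote opened the string
                current.append(c);
            } else if (c == ';') {
                commands.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        if (current.length() > 0) {
            commands.add(current.toString());
        }
        return commands;
    }

    public static void main(String[] args) {
        String line = "insert into escape1 partition (ds='1', part='3') values (\"abc' \"); select 1";
        for (String command : splitOnSemicolons(line)) {
            System.out.println("[" + command + "]");
        }
    }
}
```

Tracking which quote character opened the string is what keeps the single quote in {{"abc' "}} from being read as the start of a new string, so the final semicolon is seen as a real terminator.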

> HiveCli is not splitting the command by semicolon properly if quotes are 
> inside the string 
> ---
>
> Key: HIVE-19948
> URL: https://issues.apache.org/jira/browse/HIVE-19948
> Project: Hive
>  Issue Type: Bug
>  Components: CLI
>Affects Versions: 2.2.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
>Priority: Major
> Attachments: HIVE-19948.1.patch
>
>
> HIVE-15297 tries to split the command by considering semicolon inside string, 
> but it doesn't consider the case that quotes can also be inside string. 
> For the following command {{insert into escape1 partition (ds='1', part='3') 
> values ("abc' ");}}, it will fail with 
> {noformat}
> 18/06/19 16:37:05 ERROR ql.Driver: FAILED: ParseException line 1:64 
> extraneous input ';' expecting EOF near ''
> org.apache.hadoop.hive.ql.parse.ParseException: line 1:64 extraneous input 
> ';' expecting EOF near ''
>   at 
> org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:220)
>   at org.apache.hadoop.hive.ql.parse.ParseUtils.parse(ParseUtils.java:74)
>   at org.apache.hadoop.hive.ql.parse.ParseUtils.parse(ParseUtils.java:67)
>   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:606)
>   at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1686)
>   at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1633)
>   at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1628)
>   at 
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.compileAndRespond(ReExecDriver.java:126)
>   at 
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:214)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:239)
>   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:188)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:402)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:821)
>   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:759)
>   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:683)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.apache.hadoop.util.RunJar.run(RunJar.java:239)
>   at org.apache.hadoop.util.RunJar.main(RunJar.java:153)
> {noformat}




[jira] [Commented] (HIVE-19948) HiveCli is not splitting the command by semicolon properly if quotes are inside the string

2018-06-20 Thread Aihua Xu (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-19948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16518596#comment-16518596
 ] 

Aihua Xu commented on HIVE-19948:
-

[~stakiar] Can you help take a look?

> HiveCli is not splitting the command by semicolon properly if quotes are 
> inside the string 
> ---
>
> Key: HIVE-19948
> URL: https://issues.apache.org/jira/browse/HIVE-19948
> Project: Hive
>  Issue Type: Bug
>  Components: CLI
>Affects Versions: 2.2.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
>Priority: Major
> Attachments: HIVE-19948.1.patch
>
>
> HIVE-15297 tries to split the command by considering semicolon inside string, 
> but it doesn't consider the case that quotes can also be inside string. 
> For the following command {{insert into escape1 partition (ds='1', part='3') 
> values ("abc' ");}}, it will fail with 
> {noformat}
> 18/06/19 16:37:05 ERROR ql.Driver: FAILED: ParseException line 1:64 
> extraneous input ';' expecting EOF near ''
> org.apache.hadoop.hive.ql.parse.ParseException: line 1:64 extraneous input 
> ';' expecting EOF near ''
>   at 
> org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:220)
>   at org.apache.hadoop.hive.ql.parse.ParseUtils.parse(ParseUtils.java:74)
>   at org.apache.hadoop.hive.ql.parse.ParseUtils.parse(ParseUtils.java:67)
>   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:606)
>   at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1686)
>   at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1633)
>   at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1628)
>   at 
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.compileAndRespond(ReExecDriver.java:126)
>   at 
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:214)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:239)
>   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:188)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:402)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:821)
>   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:759)
>   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:683)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.apache.hadoop.util.RunJar.run(RunJar.java:239)
>   at org.apache.hadoop.util.RunJar.main(RunJar.java:153)
> {noformat}




[jira] [Updated] (HIVE-19948) HiveCli is not splitting the command by semicolon properly if quotes are inside the string

2018-06-21 Thread Aihua Xu (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-19948:

Attachment: HIVE-19948.2.patch

> HiveCli is not splitting the command by semicolon properly if quotes are 
> inside the string 
> ---
>
> Key: HIVE-19948
> URL: https://issues.apache.org/jira/browse/HIVE-19948
> Project: Hive
>  Issue Type: Bug
>  Components: CLI
>Affects Versions: 2.2.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
>Priority: Major
> Attachments: HIVE-19948.1.patch, HIVE-19948.2.patch
>
>
> HIVE-15297 tries to split the command by considering semicolon inside string, 
> but it doesn't consider the case that quotes can also be inside string. 
> For the following command {{insert into escape1 partition (ds='1', part='3') 
> values ("abc' ");}}, it will fail with 
> {noformat}
> 18/06/19 16:37:05 ERROR ql.Driver: FAILED: ParseException line 1:64 
> extraneous input ';' expecting EOF near ''
> org.apache.hadoop.hive.ql.parse.ParseException: line 1:64 extraneous input 
> ';' expecting EOF near ''
>   at 
> org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:220)
>   at org.apache.hadoop.hive.ql.parse.ParseUtils.parse(ParseUtils.java:74)
>   at org.apache.hadoop.hive.ql.parse.ParseUtils.parse(ParseUtils.java:67)
>   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:606)
>   at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1686)
>   at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1633)
>   at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1628)
>   at 
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.compileAndRespond(ReExecDriver.java:126)
>   at 
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:214)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:239)
>   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:188)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:402)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:821)
>   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:759)
>   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:683)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.apache.hadoop.util.RunJar.run(RunJar.java:239)
>   at org.apache.hadoop.util.RunJar.main(RunJar.java:153)
> {noformat}




[jira] [Commented] (HIVE-19948) HiveCli is not splitting the command by semicolon properly if quotes are inside the string

2018-06-21 Thread Aihua Xu (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-19948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16519607#comment-16519607
 ] 

Aihua Xu commented on HIVE-19948:
-

patch-2: splitSemiColon needs to be public since it's used in QTestUtils.java

> HiveCli is not splitting the command by semicolon properly if quotes are 
> inside the string 
> ---
>
> Key: HIVE-19948
> URL: https://issues.apache.org/jira/browse/HIVE-19948
> Project: Hive
>  Issue Type: Bug
>  Components: CLI
>Affects Versions: 2.2.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
>Priority: Major
> Attachments: HIVE-19948.1.patch, HIVE-19948.2.patch
>
>
> HIVE-15297 tries to split the command by considering semicolon inside string, 
> but it doesn't consider the case that quotes can also be inside string. 
> For the following command {{insert into escape1 partition (ds='1', part='3') 
> values ("abc' ");}}, it will fail with 
> {noformat}
> 18/06/19 16:37:05 ERROR ql.Driver: FAILED: ParseException line 1:64 
> extraneous input ';' expecting EOF near ''
> org.apache.hadoop.hive.ql.parse.ParseException: line 1:64 extraneous input 
> ';' expecting EOF near ''
>   at 
> org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:220)
>   at org.apache.hadoop.hive.ql.parse.ParseUtils.parse(ParseUtils.java:74)
>   at org.apache.hadoop.hive.ql.parse.ParseUtils.parse(ParseUtils.java:67)
>   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:606)
>   at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1686)
>   at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1633)
>   at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1628)
>   at 
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.compileAndRespond(ReExecDriver.java:126)
>   at 
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:214)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:239)
>   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:188)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:402)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:821)
>   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:759)
>   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:683)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.apache.hadoop.util.RunJar.run(RunJar.java:239)
>   at org.apache.hadoop.util.RunJar.main(RunJar.java:153)
> {noformat}




[jira] [Updated] (HIVE-19899) Support stored as JsonFile

2018-06-21 Thread Aihua Xu (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-19899:

   Resolution: Fixed
Fix Version/s: 4.0.0
 Release Note: Support "create table ... stored as JsonFile" syntax.
   Status: Resolved  (was: Patch Available)

Pushed to master. Thanks [~ychena] and [~belugabehr] for reviewing.

> Support stored as JsonFile 
> ---
>
> Key: HIVE-19899
> URL: https://issues.apache.org/jira/browse/HIVE-19899
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Affects Versions: 3.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-19899.1.patch, HIVE-19899.2.patch, 
> HIVE-19899.3.patch, HIVE-19899.4.patch, HIVE-19899.5.patch
>
>
> This is to support "Create table ... stored as JsonFile" syntax. 




[jira] [Updated] (HIVE-19948) HiveCli is not splitting the command by semicolon properly if quotes are inside the string

2018-06-22 Thread Aihua Xu (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-19948:

Attachment: HIVE-19948.3.patch

> HiveCli is not splitting the command by semicolon properly if quotes are 
> inside the string 
> ---
>
> Key: HIVE-19948
> URL: https://issues.apache.org/jira/browse/HIVE-19948
> Project: Hive
>  Issue Type: Bug
>  Components: CLI
>Affects Versions: 2.2.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
>Priority: Major
> Attachments: HIVE-19948.1.patch, HIVE-19948.2.patch, 
> HIVE-19948.3.patch
>
>
> HIVE-15297 tries to split the command by considering semicolon inside string, 
> but it doesn't consider the case that quotes can also be inside string. 
> For the following command {{insert into escape1 partition (ds='1', part='3') 
> values ("abc' ");}}, it will fail with 
> {noformat}
> 18/06/19 16:37:05 ERROR ql.Driver: FAILED: ParseException line 1:64 
> extraneous input ';' expecting EOF near ''
> org.apache.hadoop.hive.ql.parse.ParseException: line 1:64 extraneous input 
> ';' expecting EOF near ''
>   at 
> org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:220)
>   at org.apache.hadoop.hive.ql.parse.ParseUtils.parse(ParseUtils.java:74)
>   at org.apache.hadoop.hive.ql.parse.ParseUtils.parse(ParseUtils.java:67)
>   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:606)
>   at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1686)
>   at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1633)
>   at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1628)
>   at 
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.compileAndRespond(ReExecDriver.java:126)
>   at 
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:214)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:239)
>   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:188)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:402)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:821)
>   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:759)
>   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:683)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.apache.hadoop.util.RunJar.run(RunJar.java:239)
>   at org.apache.hadoop.util.RunJar.main(RunJar.java:153)
> {noformat}




[jira] [Updated] (HIVE-19948) HiveCli is not splitting the command by semicolon properly if quotes are inside the string

2018-06-25 Thread Aihua Xu (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-19948:

   Resolution: Fixed
Fix Version/s: 4.0.0
   Status: Resolved  (was: Patch Available)

Pushed to master. This test failure seems flaky.

> HiveCli is not splitting the command by semicolon properly if quotes are 
> inside the string 
> ---
>
> Key: HIVE-19948
> URL: https://issues.apache.org/jira/browse/HIVE-19948
> Project: Hive
>  Issue Type: Bug
>  Components: CLI
>Affects Versions: 2.2.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-19948.1.patch, HIVE-19948.2.patch, 
> HIVE-19948.3.patch
>
>
> HIVE-15297 splits the command on semicolons while ignoring semicolons inside 
> strings, but it does not handle quote characters that appear inside a string. 
> For example, the command {{insert into escape1 partition (ds='1', part='3') 
> values ("abc' ");}} fails with 
> {noformat}
> 18/06/19 16:37:05 ERROR ql.Driver: FAILED: ParseException line 1:64 
> extraneous input ';' expecting EOF near ''
> org.apache.hadoop.hive.ql.parse.ParseException: line 1:64 extraneous input 
> ';' expecting EOF near ''
>   at 
> org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:220)
>   at org.apache.hadoop.hive.ql.parse.ParseUtils.parse(ParseUtils.java:74)
>   at org.apache.hadoop.hive.ql.parse.ParseUtils.parse(ParseUtils.java:67)
>   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:606)
>   at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1686)
>   at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1633)
>   at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1628)
>   at 
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.compileAndRespond(ReExecDriver.java:126)
>   at 
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:214)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:239)
>   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:188)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:402)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:821)
>   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:759)
>   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:683)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.apache.hadoop.util.RunJar.run(RunJar.java:239)
>   at org.apache.hadoop.util.RunJar.main(RunJar.java:153)
> {noformat}
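
The splitting problem described above can be sketched as follows. This is only an illustration under assumptions, not the code from the attached patches: the class name CommandSplitter and its methods are hypothetical, and backslash escaping inside strings is assumed. The point is that the splitter remembers which quote character opened the string, so the other quote character inside it is treated as ordinary text.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not Hive's CliDriver code.
final class CommandSplitter {

    /** Splits a line on ';' characters that are outside quoted strings. */
    static List<String> split(String line) {
        List<String> commands = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        char openQuote = 0;       // 0 means "not inside a string"
        boolean escaped = false;  // previous char was a backslash inside a string
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (openQuote != 0) {
                if (escaped) {
                    escaped = false;          // escaped char is plain text
                } else if (c == '\\') {
                    escaped = true;
                } else if (c == openQuote) {
                    openQuote = 0;            // only the opening quote kind closes the string
                }
                current.append(c);
            } else if (c == '\'' || c == '"') {
                openQuote = c;
                current.append(c);
            } else if (c == ';') {
                addIfNotBlank(commands, current);
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        addIfNotBlank(commands, current);
        return commands;
    }

    private static void addIfNotBlank(List<String> out, StringBuilder sb) {
        String s = sb.toString().trim();
        if (!s.isEmpty()) {
            out.add(s);
        }
    }

    public static void main(String[] args) {
        String cmd = "insert into escape1 partition (ds='1', part='3') values (\"abc' \");";
        System.out.println(split(cmd));
    }
}
```

With the command from the description, the single quote inside the double-quoted string no longer opens a new string, so the trailing semicolon is recognized as the command terminator.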




[jira] [Commented] (HIVE-19668) Over 30% of the heap wasted by duplicate org.antlr.runtime.CommonToken's and duplicate strings

2018-06-27 Thread Aihua Xu (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-19668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16525477#comment-16525477
 ] 

Aihua Xu commented on HIVE-19668:
-

[~mi...@cloudera.com] The patch looks good to me. There are some style issues, 
such as a missing Apache header, that are not from your change. Can you fix those as well? 

> Over 30% of the heap wasted by duplicate org.antlr.runtime.CommonToken's and 
> duplicate strings
> --
>
> Key: HIVE-19668
> URL: https://issues.apache.org/jira/browse/HIVE-19668
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 3.0.0
>Reporter: Misha Dmitriev
>Assignee: Misha Dmitriev
>Priority: Major
> Attachments: HIVE-19668.01.patch, image-2018-05-22-17-41-39-572.png
>
>
> I've recently analyzed a HS2 heap dump, obtained when there was a huge memory 
> spike during compilation of some big query. The analysis was done with jxray 
> ([www.jxray.com|http://www.jxray.com/]). It turns out that more than 90% of 
> the 20G heap was used by data structures associated with query parsing 
> ({{org.apache.hadoop.hive.ql.parse.QBExpr}}). There are probably multiple 
> opportunities for optimizations here. One of them is to stop the code from 
> creating duplicate instances of {{org.antlr.runtime.CommonToken}} class. See 
> a sample of these objects in the attached image:
> !image-2018-05-22-17-41-39-572.png|width=879,height=399!
> Looks like these particular {{CommonToken}} objects are constants, that don't 
> change once created. I see some code, e.g. in 
> {{org.apache.hadoop.hive.ql.parse.CalcitePlanner}}, where such objects are 
> apparently repeatedly created with e.g. {{new 
> CommonToken(HiveParser.TOK_INSERT, "TOK_INSERT")}} If these 33 token kinds 
> are instead created once and reused, we will save more than 1/10th of the 
> heap in this scenario. Plus, since these objects are small but very numerous, 
> getting rid of them will remove a great deal of pressure from the GC.
> Another source of waste is duplicate strings, which collectively waste 26.1% 
> of memory. Some of them come from CommonToken objects that have the same text 
> (i.e. for multiple CommonToken objects the contents of their 'text' Strings 
> are the same, but each has its own copy of that String). Other duplicate 
> strings come from other sources, that are easy enough to fix by adding 
> String.intern() calls.
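
The two ideas in the description (create each constant token once and reuse it, and share duplicate strings via String.intern()) can be sketched as below. This is a hedged illustration only: ImmutableToken and TokenCache are hypothetical stand-ins, not ANTLR's CommonToken or the code in the attached patch, and the sketch assumes that a token's text is fully determined by its type and that nobody mutates a shared token.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical cache; assumes tokens of a given type are constants.
final class TokenCache {

    /** Stand-in for an immutable token; not org.antlr.runtime.CommonToken. */
    static final class ImmutableToken {
        final int type;
        final String text;

        ImmutableToken(int type, String text) {
            this.type = type;
            this.text = text;
        }
    }

    private static final Map<Integer, ImmutableToken> CACHE = new ConcurrentHashMap<>();

    /** Returns one shared instance per token type instead of a new object per call. */
    static ImmutableToken get(int type, String text) {
        // intern() makes equal text values share one String instance
        return CACHE.computeIfAbsent(type, t -> new ImmutableToken(t, text.intern()));
    }

    public static void main(String[] args) {
        ImmutableToken a = get(1, "TOK_INSERT");
        ImmutableToken b = get(1, new String("TOK_INSERT"));
        System.out.println(a == b); // same instance is reused
    }
}
```

Sharing is only safe while the shared objects are never modified; if callers set fields on a token after creation, each caller would still need its own copy.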




[jira] [Commented] (HIVE-19668) Over 30% of the heap wasted by duplicate org.antlr.runtime.CommonToken's and duplicate strings

2018-06-28 Thread Aihua Xu (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-19668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16526593#comment-16526593
 ] 

Aihua Xu commented on HIVE-19668:
-

The patch looks good to me. +1 pending tests.

> Over 30% of the heap wasted by duplicate org.antlr.runtime.CommonToken's and 
> duplicate strings
> --
>
> Key: HIVE-19668
> URL: https://issues.apache.org/jira/browse/HIVE-19668
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 3.0.0
>Reporter: Misha Dmitriev
>Assignee: Misha Dmitriev
>Priority: Major
> Attachments: HIVE-19668.01.patch, HIVE-19668.02.patch, 
> image-2018-05-22-17-41-39-572.png
>
>
> I've recently analyzed a HS2 heap dump, obtained when there was a huge memory 
> spike during compilation of some big query. The analysis was done with jxray 
> ([www.jxray.com|http://www.jxray.com/]). It turns out that more than 90% of 
> the 20G heap was used by data structures associated with query parsing 
> ({{org.apache.hadoop.hive.ql.parse.QBExpr}}). There are probably multiple 
> opportunities for optimizations here. One of them is to stop the code from 
> creating duplicate instances of {{org.antlr.runtime.CommonToken}} class. See 
> a sample of these objects in the attached image:
> !image-2018-05-22-17-41-39-572.png|width=879,height=399!
> Looks like these particular {{CommonToken}} objects are constants, that don't 
> change once created. I see some code, e.g. in 
> {{org.apache.hadoop.hive.ql.parse.CalcitePlanner}}, where such objects are 
> apparently repeatedly created with e.g. {{new 
> CommonToken(HiveParser.TOK_INSERT, "TOK_INSERT")}} If these 33 token kinds 
> are instead created once and reused, we will save more than 1/10th of the 
> heap in this scenario. Plus, since these objects are small but very numerous, 
> getting rid of them will remove a great deal of pressure from the GC.
> Another source of waste is duplicate strings, which collectively waste 26.1% 
> of memory. Some of them come from CommonToken objects that have the same text 
> (i.e. for multiple CommonToken objects the contents of their 'text' Strings 
> are the same, but each has its own copy of that String). Other duplicate 
> strings come from other sources, that are easy enough to fix by adding 
> String.intern() calls.




[jira] [Commented] (HIVE-19948) HiveCli is not splitting the command by semicolon properly if quotes are inside the string

2018-06-28 Thread Aihua Xu (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-19948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16526672#comment-16526672
 ] 

Aihua Xu commented on HIVE-19948:
-

[~hagleitn] I may have misunderstood. Should we rerun until the flaky test 
passes, or should we fix the flaky test completely?

> HiveCli is not splitting the command by semicolon properly if quotes are 
> inside the string 
> ---
>
> Key: HIVE-19948
> URL: https://issues.apache.org/jira/browse/HIVE-19948
> Project: Hive
>  Issue Type: Bug
>  Components: CLI
>Affects Versions: 2.2.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-19948.1.patch, HIVE-19948.2.patch, 
> HIVE-19948.3.patch
>
>
> HIVE-15297 splits the command on semicolons while ignoring semicolons inside 
> strings, but it does not handle quote characters that appear inside a string. 
> For example, the command {{insert into escape1 partition (ds='1', part='3') 
> values ("abc' ");}} fails with 
> {noformat}
> 18/06/19 16:37:05 ERROR ql.Driver: FAILED: ParseException line 1:64 
> extraneous input ';' expecting EOF near ''
> org.apache.hadoop.hive.ql.parse.ParseException: line 1:64 extraneous input 
> ';' expecting EOF near ''
>   at 
> org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:220)
>   at org.apache.hadoop.hive.ql.parse.ParseUtils.parse(ParseUtils.java:74)
>   at org.apache.hadoop.hive.ql.parse.ParseUtils.parse(ParseUtils.java:67)
>   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:606)
>   at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1686)
>   at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1633)
>   at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1628)
>   at 
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.compileAndRespond(ReExecDriver.java:126)
>   at 
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:214)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:239)
>   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:188)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:402)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:821)
>   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:759)
>   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:683)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.apache.hadoop.util.RunJar.run(RunJar.java:239)
>   at org.apache.hadoop.util.RunJar.main(RunJar.java:153)
> {noformat}




[jira] [Resolved] (HIVE-20027) TestRuntimeStats.testCleanup is flaky

2018-06-28 Thread Aihua Xu (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu resolved HIVE-20027.
-
Resolution: Duplicate

> TestRuntimeStats.testCleanup is flaky
> -
>
> Key: HIVE-20027
> URL: https://issues.apache.org/jira/browse/HIVE-20027
> Project: Hive
>  Issue Type: Bug
>Reporter: Aihua Xu
>Priority: Major
>
> int deleted = objStore.deleteRuntimeStats(1);
> assertEquals(1, deleted);
> testCleanup can fail if a GC pause happens before deleteRuntimeStats 
> runs: in that case 2 stats get deleted rather than 
> one.
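
The timing dependence can be sketched with an explicit clock. This is an illustration under a loud assumption: it assumes that deleteRuntimeStats(n) removes entries older than n seconds, which the issue text does not state. RetentionStore and its methods are hypothetical and are not Hive's ObjectStore. Passing the current time as a parameter makes the effect of a pause reproducible.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical store; assumes retention-based deletion by age in seconds.
final class RetentionStore {
    private final List<Long> createdAtMillis = new ArrayList<>();

    void add(long nowMillis) {
        createdAtMillis.add(nowMillis);
    }

    /** Deletes entries older than maxRetainSecs at time nowMillis; returns the count. */
    int deleteOlderThan(int maxRetainSecs, long nowMillis) {
        long cutoff = nowMillis - maxRetainSecs * 1000L;
        int deleted = 0;
        for (Iterator<Long> it = createdAtMillis.iterator(); it.hasNext();) {
            if (it.next() < cutoff) {
                it.remove();
                deleted++;
            }
        }
        return deleted;
    }

    public static void main(String[] args) {
        RetentionStore prompt = new RetentionStore();
        prompt.add(0L);
        prompt.add(2000L);
        // delete runs shortly after the second insert: only the first entry is old enough
        System.out.println(prompt.deleteOlderThan(1, 2100L));

        RetentionStore paused = new RetentionStore();
        paused.add(0L);
        paused.add(2000L);
        // a pause of about 1.5 seconds before the delete makes both entries old enough
        System.out.println(paused.deleteOlderThan(1, 3500L));
    }
}
```

A test written against an injected clock like this would not depend on how long the JVM pauses between the insert and the delete.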




[jira] [Commented] (HIVE-20027) TestRuntimeStats.testCleanup is flaky

2018-06-28 Thread Aihua Xu (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16526878#comment-16526878
 ] 

Aihua Xu commented on HIVE-20027:
-

Resolving this as a duplicate.

> TestRuntimeStats.testCleanup is flaky
> -
>
> Key: HIVE-20027
> URL: https://issues.apache.org/jira/browse/HIVE-20027
> Project: Hive
>  Issue Type: Bug
>Reporter: Aihua Xu
>Priority: Major
>
> int deleted = objStore.deleteRuntimeStats(1);
> assertEquals(1, deleted);
> testCleanup can fail if a GC pause happens before deleteRuntimeStats 
> runs: in that case 2 stats get deleted rather than 
> one.




[jira] [Commented] (HIVE-18916) SparkClientImpl doesn't error out if spark-submit fails

2018-06-28 Thread Aihua Xu (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-18916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16527196#comment-16527196
 ] 

Aihua Xu commented on HIVE-18916:
-

[~stakiar] Right now we need a clean test run before committing. Can you rebase 
your patch? Otherwise, the change looks good to me. +1.

> SparkClientImpl doesn't error out if spark-submit fails
> ---
>
> Key: HIVE-18916
> URL: https://issues.apache.org/jira/browse/HIVE-18916
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>Priority: Major
> Attachments: HIVE-18916.1.WIP.patch, HIVE-18916.2.patch, 
> HIVE-18916.3.patch, HIVE-18916.4.patch, HIVE-18916.5.patch, HIVE-18916.6.patch
>
>
> If {{spark-submit}} returns a non-zero exit code, {{SparkClientImpl}} will 
> simply log the exit code, but won't throw an error. Eventually, the 
> connection timeout will get triggered and an exception like {{Timed out 
> waiting for client connection}} will be logged, which is pretty misleading.
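
The behavior requested above (fail immediately on a non-zero exit code instead of only logging it) might look roughly like this. It is a minimal sketch, not the SparkClientImpl change from the attached patches: SubmitExitCheck, check and waitAndCheck are hypothetical names, and the choice of IllegalStateException is an assumption made for the example.

```java
// Hypothetical helper, not Hive's SparkClientImpl.
final class SubmitExitCheck {

    /** Turns a non-zero exit code into an immediate, descriptive failure. */
    static void check(int exitCode) {
        if (exitCode != 0) {
            throw new IllegalStateException(
                "spark-submit exited with non-zero code " + exitCode);
        }
    }

    /** Waits for the child process and fails fast instead of only logging. */
    static void waitAndCheck(Process child) {
        try {
            check(child.waitFor());
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
            throw new IllegalStateException("interrupted while waiting for spark-submit", e);
        }
    }

    public static void main(String[] args) {
        check(0); // exit code 0 passes silently
        try {
            check(1);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Failing at this point surfaces the real cause right away, rather than leaving the caller to hit the later connection timeout.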




[jira] [Assigned] (HIVE-20037) Print root cause exception's toString() rather than getMessage()

2018-06-29 Thread Aihua Xu (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu reassigned HIVE-20037:
---


> Print root cause exception's toString() rather than getMessage()
> 
>
> Key: HIVE-20037
> URL: https://issues.apache.org/jira/browse/HIVE-20037
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Affects Versions: 3.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
>Priority: Trivial
>
> When a HoS (Hive on Spark) job fails, we print the root cause exception's 
> getMessage() rather than its toString(). For some exceptions, e.g., 
> java.lang.NoClassDefFoundError, this drops the exception type 
> information. 
> {noformat}
> Failed to execute Spark task Stage-1, with exception 
> 'org.apache.hadoop.hive.ql.metadata.HiveException(Failed to create Spark 
> client for Spark session cf054497-b073-4327-a315-68c867ce3434: 
> org/apache/spark/SparkConf)'
> {noformat}




[jira] [Updated] (HIVE-20037) Print root cause exception's toString() rather than getMessage()

2018-06-29 Thread Aihua Xu (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-20037:

Description: 
When a HoS (Hive on Spark) job fails, we print the root cause exception's 
getMessage() rather than its toString(). For some exceptions, e.g., 
java.lang.NoClassDefFoundError, this drops the exception type 
information. 

{noformat}
Failed to execute Spark task Stage-1, with exception 
'org.apache.hadoop.hive.ql.metadata.HiveException(Failed to create Spark client 
for Spark session cf054497-b073-4327-a315-68c867ce3434: 
org/apache/spark/SparkConf)'
{noformat}

If we use the exception's toString(), the output is as follows, which makes more sense.
{noformat}
Failed to execute Spark task Stage-1, with exception 
'org.apache.hadoop.hive.ql.metadata.HiveException(Failed to create Spark client 
for Spark session cf054497-b073-4327-a315-68c867ce3434: 
java.lang.NoClassDefFoundError: org/apache/spark/SparkConf)'
{noformat}

  was:
When we run HoS job and if it fails for some errors, we are printing the 
exception message rather than exception toString(), for some exceptions, e.g., 
this java.lang.NoClassDefFoundError, we are missing the exception type 
information. 

{noformat}
Failed to execute Spark task Stage-1, with exception 
'org.apache.hadoop.hive.ql.metadata.HiveException(Failed to create Spark client 
for Spark session cf054497-b073-4327-a315-68c867ce3434: 
org/apache/spark/SparkConf)'
{noformat}




> Print root cause exception's toString() rather than getMessage()
> 
>
> Key: HIVE-20037
> URL: https://issues.apache.org/jira/browse/HIVE-20037
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Affects Versions: 3.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
>Priority: Trivial
>
> When a HoS (Hive on Spark) job fails, we print the root cause exception's 
> getMessage() rather than its toString(). For some exceptions, e.g., 
> java.lang.NoClassDefFoundError, this drops the exception type 
> information. 
> {noformat}
> Failed to execute Spark task Stage-1, with exception 
> 'org.apache.hadoop.hive.ql.metadata.HiveException(Failed to create Spark 
> client for Spark session cf054497-b073-4327-a315-68c867ce3434: 
> org/apache/spark/SparkConf)'
> {noformat}
> If we use the exception's toString(), the output is as follows, which makes more sense.
> {noformat}
> Failed to execute Spark task Stage-1, with exception 
> 'org.apache.hadoop.hive.ql.metadata.HiveException(Failed to create Spark 
> client for Spark session cf054497-b073-4327-a315-68c867ce3434: 
> java.lang.NoClassDefFoundError: org/apache/spark/SparkConf)'
> {noformat}
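
The difference between the two calls can be reproduced in isolation. The sketch below is not the patch itself: RootCauseMessage and rootCause are hypothetical names, and the wrapping RuntimeException merely stands in for whatever exception Hive wraps the cause in. Throwable.toString() prefixes the message with the exception's class name, which is exactly the information missing in the first log line.

```java
// Hypothetical demo class; only standard java.lang.Throwable behavior is used.
final class RootCauseMessage {

    /** Walks the cause chain down to the innermost exception. */
    static Throwable rootCause(Throwable t) {
        Throwable cur = t;
        while (cur.getCause() != null) {
            cur = cur.getCause();
        }
        return cur;
    }

    public static void main(String[] args) {
        Throwable cause = new NoClassDefFoundError("org/apache/spark/SparkConf");
        RuntimeException wrapped = new RuntimeException("Failed to create Spark client", cause);

        // prints: org/apache/spark/SparkConf
        System.out.println(rootCause(wrapped).getMessage());
        // prints: java.lang.NoClassDefFoundError: org/apache/spark/SparkConf
        System.out.println(rootCause(wrapped).toString());
    }
}
```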




[jira] [Updated] (HIVE-20037) Print root cause exception's toString() rather than getMessage()

2018-06-29 Thread Aihua Xu (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-20037:

Status: Patch Available  (was: Open)

> Print root cause exception's toString() rather than getMessage()
> 
>
> Key: HIVE-20037
> URL: https://issues.apache.org/jira/browse/HIVE-20037
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Affects Versions: 3.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
>Priority: Trivial
> Attachments: HIVE-20037.1.patch
>
>
> When a HoS (Hive on Spark) job fails, we print the root cause exception's 
> getMessage() rather than its toString(). For some exceptions, e.g., 
> java.lang.NoClassDefFoundError, this drops the exception type 
> information. 
> {noformat}
> Failed to execute Spark task Stage-1, with exception 
> 'org.apache.hadoop.hive.ql.metadata.HiveException(Failed to create Spark 
> client for Spark session cf054497-b073-4327-a315-68c867ce3434: 
> org/apache/spark/SparkConf)'
> {noformat}
> If we use the exception's toString(), the output is as follows, which makes more sense.
> {noformat}
> Failed to execute Spark task Stage-1, with exception 
> 'org.apache.hadoop.hive.ql.metadata.HiveException(Failed to create Spark 
> client for Spark session cf054497-b073-4327-a315-68c867ce3434: 
> java.lang.NoClassDefFoundError: org/apache/spark/SparkConf)'
> {noformat}




[jira] [Updated] (HIVE-20037) Print root cause exception's toString() rather than getMessage()

2018-06-29 Thread Aihua Xu (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-20037:

Attachment: HIVE-20037.1.patch

> Print root cause exception's toString() rather than getMessage()
> 
>
> Key: HIVE-20037
> URL: https://issues.apache.org/jira/browse/HIVE-20037
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Affects Versions: 3.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
>Priority: Trivial
> Attachments: HIVE-20037.1.patch
>
>
> When a HoS (Hive on Spark) job fails, we print the root cause exception's 
> getMessage() rather than its toString(). For some exceptions, e.g., 
> java.lang.NoClassDefFoundError, this drops the exception type 
> information. 
> {noformat}
> Failed to execute Spark task Stage-1, with exception 
> 'org.apache.hadoop.hive.ql.metadata.HiveException(Failed to create Spark 
> client for Spark session cf054497-b073-4327-a315-68c867ce3434: 
> org/apache/spark/SparkConf)'
> {noformat}
> If we use the exception's toString(), the output is as follows, which makes more sense.
> {noformat}
> Failed to execute Spark task Stage-1, with exception 
> 'org.apache.hadoop.hive.ql.metadata.HiveException(Failed to create Spark 
> client for Spark session cf054497-b073-4327-a315-68c867ce3434: 
> java.lang.NoClassDefFoundError: org/apache/spark/SparkConf)'
> {noformat}




[jira] [Commented] (HIVE-10643) Refactoring Windowing for sum() to pass WindowFrameDef instead of two numbers (1 for number of preceding and 1 for number of following)

2015-05-11 Thread Aihua Xu (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-10643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14537955#comment-14537955
 ] 

Aihua Xu commented on HIVE-10643:
-

All of the failures seem to be unrelated. 

> Refactoring Windowing for sum() to pass WindowFrameDef instead of two numbers 
> (1 for number of preceding and 1 for number of following)
> ---
>
> Key: HIVE-10643
> URL: https://issues.apache.org/jira/browse/HIVE-10643
> Project: Hive
>  Issue Type: Sub-task
>  Components: PTF-Windowing
>Reporter: Aihua Xu
>Assignee: Aihua Xu
>Priority: Minor
> Attachments: HIVE-10643.patch
>
>
> The functionality should not be affected. Instead of passing 2 numbers (1 for 
> # of preceding rows and 1 for # of following rows), we will pass 
> WindowFrameDef object around. In the following subtasks, it will be used for 
> the cases of {{rows between x preceding and y preceding}} and {{rows between 
> x following and y following}}.
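
Why a frame object is more flexible than two row counts can be shown with a small sketch. Everything here is hypothetical: Frame and FrameSum are illustrative stand-ins and do not reflect the actual WindowFrameDef API or the attached patch. The frame stores start and end offsets relative to the current row (negative means preceding), so a window such as "rows between 2 preceding and 1 preceding" becomes expressible, which two non-negative counts of preceding and following rows cannot represent.

```java
// Hypothetical illustration; not Hive's WindowFrameDef or its sum() evaluator.
final class FrameSum {

    /** Frame bounds as offsets from the current row; negative means preceding. */
    static final class Frame {
        final int start;
        final int end;

        Frame(int start, int end) {
            if (start > end) {
                throw new IllegalArgumentException("start must be <= end");
            }
            this.start = start;
            this.end = end;
        }
    }

    /** Sums values over the frame around each row; an empty frame yields 0 here. */
    static long[] windowedSum(long[] values, Frame frame) {
        long[] out = new long[values.length];
        for (int i = 0; i < values.length; i++) {
            int from = Math.max(0, i + frame.start);
            int to = Math.min(values.length - 1, i + frame.end);
            long sum = 0;
            for (int j = from; j <= to; j++) {
                sum += values[j];
            }
            out[i] = sum; // note: SQL would return NULL for an empty frame
        }
        return out;
    }

    public static void main(String[] args) {
        long[] values = {1, 2, 3, 4};
        // rows between 2 preceding and 1 preceding
        System.out.println(java.util.Arrays.toString(windowedSum(values, new Frame(-2, -1))));
        // rows between 1 preceding and 1 following
        System.out.println(java.util.Arrays.toString(windowedSum(values, new Frame(-1, 1))));
    }
}
```

The simple nested loop is chosen for clarity; a production implementation would more likely maintain a running sum as the frame slides.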



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (HIVE-10643) Refactoring Windowing for sum() to pass WindowFrameDef instead of two numbers (1 for number of preceding and 1 for number of following)

2015-05-11 Thread Aihua Xu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-10643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-10643:

Description: The functionality should not be affected. Instead of passing 2 
numbers (1 for # of preceding rows and 1 for # of following rows), we will pass 
WindowFrameDef object around. In the following subtasks, it will be used to 
support additional window like {{rows between x preceding and y preceding}} and 
{{rows between x following and y following}}.  (was: The functionality should 
not be affected. Instead of passing 2 numbers (1 for # of preceding rows and 1 
for # of following rows), we will pass WindowFrameDef object around. In the 
following subtasks, it will be used for the cases of {{rows between x preceding 
and y preceding}} and {{rows between x following and y following}}.)

> Refactoring Windowing for sum() to pass WindowFrameDef instead of two numbers 
> (1 for number of preceding and 1 for number of following)
> ---
>
> Key: HIVE-10643
> URL: https://issues.apache.org/jira/browse/HIVE-10643
> Project: Hive
>  Issue Type: Sub-task
>  Components: PTF-Windowing
>Reporter: Aihua Xu
>Assignee: Aihua Xu
>Priority: Minor
> Attachments: HIVE-10643.patch
>
>
> The functionality should not be affected. Instead of passing 2 numbers (1 for 
> # of preceding rows and 1 for # of following rows), we will pass 
> WindowFrameDef object around. In the following subtasks, it will be used to 
> support additional window like {{rows between x preceding and y preceding}} 
> and {{rows between x following and y following}}.




[jira] [Commented] (HIVE-10643) Refactoring Windowing for sum() to pass WindowFrameDef instead of two numbers (1 for number of preceding and 1 for number of following)

2015-05-11 Thread Aihua Xu (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-10643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14537973#comment-14537973
 ] 

Aihua Xu commented on HIVE-10643:
-

[~ashutoshc], can you help review the code? This is the first step and only 
refactors the code. With this patch we pass WindowFrameDef around, which will 
be used in the next tasks. 

I only refactored the sum()-related code and will work on the sum() fix 
first, since I believe the work for the other UDAF functions will be similar. 
I would prefer to work on one function first rather than refactoring 
everything at once.  

> Refactoring Windowing for sum() to pass WindowFrameDef instead of two numbers 
> (1 for number of preceding and 1 for number of following)
> ---
>
> Key: HIVE-10643
> URL: https://issues.apache.org/jira/browse/HIVE-10643
> Project: Hive
>  Issue Type: Sub-task
>  Components: PTF-Windowing
>Reporter: Aihua Xu
>Assignee: Aihua Xu
>Priority: Minor
> Attachments: HIVE-10643.patch
>
>
> The functionality should not be affected. Instead of passing 2 numbers (1 for 
> # of preceding rows and 1 for # of following rows), we will pass 
> WindowFrameDef object around. In the following subtasks, it will be used to 
> support additional window like {{rows between x preceding and y preceding}} 
> and {{rows between x following and y following}}.




[jira] [Updated] (HIVE-10643) Refactoring Windowing for sum() to pass WindowFrameDef instead of two numbers (1 for number of preceding and 1 for number of following)

2015-05-11 Thread Aihua Xu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-10643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-10643:

Affects Version/s: 1.3.0

> Refactoring Windowing for sum() to pass WindowFrameDef instead of two numbers 
> (1 for number of preceding and 1 for number of following)
> ---
>
> Key: HIVE-10643
> URL: https://issues.apache.org/jira/browse/HIVE-10643
> Project: Hive
>  Issue Type: Sub-task
>  Components: PTF-Windowing
>Affects Versions: 1.3.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
>Priority: Minor
> Attachments: HIVE-10643.patch
>
>
> The functionality should not be affected. Instead of passing 2 numbers (1 for 
> # of preceding rows and 1 for # of following rows), we will pass 
> WindowFrameDef object around. In the following subtasks, it will be used to 
> support additional window like {{rows between x preceding and y preceding}} 
> and {{rows between x following and y following}}.




[jira] [Updated] (HIVE-10643) Refactoring Windowing for sum() to pass WindowFrameDef instead of two numbers (1 for number of preceding and 1 for number of following)

2015-05-11 Thread Aihua Xu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-10643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-10643:

Attachment: HIVE-10643.patch

> Refactoring Windowing for sum() to pass WindowFrameDef instead of two numbers 
> (1 for number of preceding and 1 for number of following)
> ---
>
> Key: HIVE-10643
> URL: https://issues.apache.org/jira/browse/HIVE-10643
> Project: Hive
>  Issue Type: Sub-task
>  Components: PTF-Windowing
>Affects Versions: 1.3.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
>Priority: Minor
> Attachments: HIVE-10643.patch, HIVE-10643.patch
>
>
> The functionality should not be affected. Instead of passing 2 numbers (1 for 
> # of preceding rows and 1 for # of following rows), we will pass 
> WindowFrameDef object around. In the following subtasks, it will be used to 
> support additional window like {{rows between x preceding and y preceding}} 
> and {{rows between x following and y following}}.




[jira] [Updated] (HIVE-10643) Refactoring Windowing for sum() to pass WindowFrameDef instead of two numbers (1 for number of preceding and 1 for number of following)

2015-05-11 Thread Aihua Xu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-10643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-10643:

Attachment: (was: HIVE-10643.patch)

> Refactoring Windowing for sum() to pass WindowFrameDef instead of two numbers 
> (1 for number of preceding and 1 for number of following)
> ---
>
> Key: HIVE-10643
> URL: https://issues.apache.org/jira/browse/HIVE-10643
> Project: Hive
>  Issue Type: Sub-task
>  Components: PTF-Windowing
>Affects Versions: 1.3.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
>Priority: Minor
> Attachments: HIVE-10643.patch
>
>
> The functionality should not be affected. Instead of passing 2 numbers (1 for 
> # of preceding rows and 1 for # of following rows), we will pass 
> WindowFrameDef object around. In the following subtasks, it will be used to 
> support additional window like {{rows between x preceding and y preceding}} 
> and {{rows between x following and y following}}.




[jira] [Commented] (HIVE-10643) Refactoring Windowing for sum() to pass WindowFrameDef instead of two numbers (1 for number of preceding and 1 for number of following)

2015-05-11 Thread Aihua Xu (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-10643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538570#comment-14538570
 ] 

Aihua Xu commented on HIVE-10643:
-

Thanks [~ashutoshc] for reviewing. Uploaded a new patch to address the 
comments. I will make the logic change for the window size along with other 
changes in the next task, so I didn't fix it here.

> Refactoring Windowing for sum() to pass WindowFrameDef instead of two numbers 
> (1 for number of preceding and 1 for number of following)
> ---
>
> Key: HIVE-10643
> URL: https://issues.apache.org/jira/browse/HIVE-10643
> Project: Hive
>  Issue Type: Sub-task
>  Components: PTF-Windowing
>Affects Versions: 1.3.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
>Priority: Minor
> Attachments: HIVE-10643.patch
>
>
> The functionality should not be affected. Instead of passing 2 numbers (1 for 
> # of preceding rows and 1 for # of following rows), we will pass 
> WindowFrameDef object around. In the following subtasks, it will be used to 
> support additional window like {{rows between x preceding and y preceding}} 
> and {{rows between x following and y following}}.




[jira] [Commented] (HIVE-10643) Refactoring Windowing for sum() to pass WindowFrameDef instead of two numbers (1 for number of preceding and 1 for number of following)

2015-05-12 Thread Aihua Xu (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-10643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14539844#comment-14539844
 ] 

Aihua Xu commented on HIVE-10643:
-

Unrelated failures.

> Refactoring Windowing for sum() to pass WindowFrameDef instead of two numbers 
> (1 for number of preceding and 1 for number of following)
> ---
>
> Key: HIVE-10643
> URL: https://issues.apache.org/jira/browse/HIVE-10643
> Project: Hive
>  Issue Type: Sub-task
>  Components: PTF-Windowing
>Affects Versions: 1.3.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
>Priority: Minor
> Attachments: HIVE-10643.patch
>
>
> The functionality should not be affected. Instead of passing 2 numbers (1 for 
> # of preceding rows and 1 for # of following rows), we will pass a 
> WindowFrameDef object around. In the following subtasks, it will be used to 
> support additional windows such as {{rows between x preceding and y preceding}} 
> and {{rows between x following and y following}}.




[jira] [Commented] (HIVE-10643) Refactoring Windowing for sum() to pass WindowFrameDef instead of two numbers (1 for number of preceding and 1 for number of following)

2015-05-12 Thread Aihua Xu (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-10643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14540265#comment-14540265
 ] 

Aihua Xu commented on HIVE-10643:
-

Thanks.

> Refactoring Windowing for sum() to pass WindowFrameDef instead of two numbers 
> (1 for number of preceding and 1 for number of following)
> ---
>
> Key: HIVE-10643
> URL: https://issues.apache.org/jira/browse/HIVE-10643
> Project: Hive
>  Issue Type: Sub-task
>  Components: PTF-Windowing
>Affects Versions: 1.3.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
>Priority: Minor
> Fix For: 1.3.0
>
> Attachments: HIVE-10643.patch
>
>
> The functionality should not be affected. Instead of passing 2 numbers (1 for 
> # of preceding rows and 1 for # of following rows), we will pass a 
> WindowFrameDef object around. In the following subtasks, it will be used to 
> support additional windows such as {{rows between x preceding and y preceding}} 
> and {{rows between x following and y following}}.




[jira] [Updated] (HIVE-10650) Improve sum() function over windowing to support additional range formats

2015-05-12 Thread Aihua Xu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-10650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-10650:

Attachment: HIVE-10650.patch

> Improve sum() function over windowing to support additional range formats
> -
>
> Key: HIVE-10650
> URL: https://issues.apache.org/jira/browse/HIVE-10650
> Project: Hive
>  Issue Type: Sub-task
>  Components: PTF-Windowing
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-10650.patch
>
>





[jira] [Updated] (HIVE-10650) Improve sum() function over windowing to support additional range formats

2015-05-12 Thread Aihua Xu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-10650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-10650:

Description: 
Support the following windowing function {{x preceding and y preceding}} and 
{{x following and y following}}.
e.g.
{noformat} 
select sum(value) over (partition by key order by value rows between 2 
preceding and 1 preceding) from tbl1;
{noformat}

> Improve sum() function over windowing to support additional range formats
> -
>
> Key: HIVE-10650
> URL: https://issues.apache.org/jira/browse/HIVE-10650
> Project: Hive
>  Issue Type: Sub-task
>  Components: PTF-Windowing
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-10650.patch
>
>
> Support the following windowing function {{x preceding and y preceding}} and 
> {{x following and y following}}.
> e.g.
> {noformat} 
> select sum(value) over (partition by key order by value rows between 2 
> preceding and 1 preceding) from tbl1;
> {noformat}




[jira] [Updated] (HIVE-10650) Improve sum() function over windowing to support additional range formats

2015-05-12 Thread Aihua Xu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-10650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-10650:

Description: 
Support the following windowing function {{x preceding and y preceding}} and 
{{x following and y following}}.
e.g.
{noformat} 
select sum(value) over (partition by key order by value rows between 2 
preceding and 1 preceding) from tbl1;
select sum(value) over (partition by key order by value rows between unbounded 
preceding and 1 preceding) from tbl1;
{noformat}

  was:
Support the following windowing function {{x preceding and y preceding}} and 
{{x following and y following}}.
e.g.
{noformat} 
select sum(value) over (partition by key order by value rows between 2 
preceding and 1 preceding) from tbl1;
{noformat}


> Improve sum() function over windowing to support additional range formats
> -
>
> Key: HIVE-10650
> URL: https://issues.apache.org/jira/browse/HIVE-10650
> Project: Hive
>  Issue Type: Sub-task
>  Components: PTF-Windowing
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-10650.patch
>
>
> Support the following windowing function {{x preceding and y preceding}} and 
> {{x following and y following}}.
> e.g.
> {noformat} 
> select sum(value) over (partition by key order by value rows between 2 
> preceding and 1 preceding) from tbl1;
> select sum(value) over (partition by key order by value rows between 
> unbounded preceding and 1 preceding) from tbl1;
> {noformat}




[jira] [Commented] (HIVE-10650) Improve sum() function over windowing to support additional range formats

2015-05-12 Thread Aihua Xu (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-10650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14540839#comment-14540839
 ] 

Aihua Xu commented on HIVE-10650:
-

Initial patch.

> Improve sum() function over windowing to support additional range formats
> -
>
> Key: HIVE-10650
> URL: https://issues.apache.org/jira/browse/HIVE-10650
> Project: Hive
>  Issue Type: Sub-task
>  Components: PTF-Windowing
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-10650.patch
>
>
> Support the following windowing function {{x preceding and y preceding}} and 
> {{x following and y following}}.
> e.g.
> {noformat} 
> select sum(value) over (partition by key order by value rows between 2 
> preceding and 1 preceding) from tbl1;
> select sum(value) over (partition by key order by value rows between 
> unbounded preceding and 1 preceding) from tbl1;
> {noformat}




[jira] [Commented] (HIVE-10650) Improve sum() function over windowing to support additional range formats

2015-05-13 Thread Aihua Xu (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-10650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14541952#comment-14541952
 ] 

Aihua Xu commented on HIVE-10650:
-

+[~ashutoshc] Could you please help review the code? 

In this patch, we handle 'x preceding and y preceding' and 'x following and y 
following' windowing. Depending on the windowing, we generate the intermediate 
results and final results differently, and we may insert 'NULL' before the 
first row or after the last row.

Unit tests are added to test various cases. 
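
As an illustration of the behaviour described above, here is a minimal Python sketch of a sum over a ROWS frame given as two offsets. It is not the Hive implementation: the function name `windowed_sum` and the convention of negative offsets for "preceding" and positive offsets for "following" are assumptions made for this example, and all rows are assumed to belong to one partition, already ordered.

```python
from typing import List, Optional


def windowed_sum(values: List[float], start_offset: int,
                 end_offset: int) -> List[Optional[float]]:
    """Sum over a ROWS frame expressed as offsets from the current row.

    Negative offsets mean 'preceding' and positive ones mean 'following',
    so 'rows between 2 preceding and 1 preceding' is (-2, -1).
    """
    n = len(values)
    results: List[Optional[float]] = []
    for i in range(n):
        # Clamp the frame to the partition boundaries.
        lo = max(i + start_offset, 0)
        hi = min(i + end_offset, n - 1)
        if lo > hi:
            # The frame is empty (before the first or after the last row).
            results.append(None)
        else:
            results.append(sum(values[lo:hi + 1]))
    return results
```

With `values = [1, 2, 3, 4]`, the frame `(-2, -1)` leaves the first row with an empty frame, which the sketch reports as `None` (standing in for NULL), and the frame `(1, 2)` does the same for the last row.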

> Improve sum() function over windowing to support additional range formats
> -
>
> Key: HIVE-10650
> URL: https://issues.apache.org/jira/browse/HIVE-10650
> Project: Hive
>  Issue Type: Sub-task
>  Components: PTF-Windowing
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-10650.patch
>
>
> Support the following windowing function {{x preceding and y preceding}} and 
> {{x following and y following}}.
> e.g.
> {noformat} 
> select sum(value) over (partition by key order by value rows between 2 
> preceding and 1 preceding) from tbl1;
> select sum(value) over (partition by key order by value rows between 
> unbounded preceding and 1 preceding) from tbl1;
> {noformat}




[jira] [Commented] (HIVE-10650) Improve sum() function over windowing to support additional range formats

2015-05-13 Thread Aihua Xu (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-10650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14542639#comment-14542639
 ] 

Aihua Xu commented on HIVE-10650:
-

[~ashutoshc] Thanks for reviewing. You are right. I added such a test case in 
HIVE-10140 and then noticed that this windowing doesn't actually work. I 
mentioned there that the result was incorrect, and this is the patch to fix 
that issue. 

> Improve sum() function over windowing to support additional range formats
> -
>
> Key: HIVE-10650
> URL: https://issues.apache.org/jira/browse/HIVE-10650
> Project: Hive
>  Issue Type: Sub-task
>  Components: PTF-Windowing
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-10650.patch
>
>
> Support the following windowing function {{x preceding and y preceding}} and 
> {{x following and y following}}.
> e.g.
> {noformat} 
> select sum(value) over (partition by key order by value rows between 2 
> preceding and 1 preceding) from tbl1;
> select sum(value) over (partition by key order by value rows between 
> unbounded preceding and 1 preceding) from tbl1;
> {noformat}




[jira] [Commented] (HIVE-10650) Improve sum() function over windowing to support additional range formats

2015-05-14 Thread Aihua Xu (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-10650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14544262#comment-14544262
 ] 

Aihua Xu commented on HIVE-10650:
-

Thanks for reviewing.

> Improve sum() function over windowing to support additional range formats
> -
>
> Key: HIVE-10650
> URL: https://issues.apache.org/jira/browse/HIVE-10650
> Project: Hive
>  Issue Type: Sub-task
>  Components: PTF-Windowing
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-10650.patch
>
>
> Support the following windowing function {{x preceding and y preceding}} and 
> {{x following and y following}}.
> e.g.
> {noformat} 
> select sum(value) over (partition by key order by value rows between 2 
> preceding and 1 preceding) from tbl1;
> select sum(value) over (partition by key order by value rows between 
> unbounded preceding and 1 preceding) from tbl1;
> {noformat}




[jira] [Commented] (HIVE-10720) Pig using HCatLoader to access RCFile and perform join but get incorrect result.

2015-05-18 Thread Aihua Xu (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-10720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14548105#comment-14548105
 ] 

Aihua Xu commented on HIVE-10720:
-

It seems the improvement in HIVE-5193 causes the issue. I may revert the 
change and redo it later.

> Pig using HCatLoader to access RCFile and perform join but get incorrect 
> result.
> 
>
> Key: HIVE-10720
> URL: https://issues.apache.org/jira/browse/HIVE-10720
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog
>Affects Versions: 1.3.0
>Reporter: Aihua Xu
>
> {noformat}
> Create table tbl1 (key string, value string) stored as rcfile;
> Create table tbl2 (key string, value string);
> insert into tbl1 values('1', 'value1');
> insert into tbl2 values('1', 'value2');
> {noformat}
> Pig script:
> {noformat}
> tbl1 = LOAD 'tbl1' USING org.apache.hive.hcatalog.pig.HCatLoader();
> tbl2 = LOAD 'tbl2' USING org.apache.hive.hcatalog.pig.HCatLoader();
> src_tbl1 = FILTER tbl1 BY (key == '1');
> prj_tbl1 = FOREACH src_tbl1 GENERATE
>key as tbl1_key,
>value as tbl1_value,
>'333' as tbl1_v1;
>
> src_tbl2 = FILTER tbl2 BY (key == '1');
> prj_tbl2 = FOREACH src_tbl2 GENERATE
>key as tbl2_key,
>value as tbl2_value;
>
> result = JOIN prj_tbl1 BY (tbl1_key), prj_tbl2 BY (tbl2_key);
> prj_result = FOREACH result 
>   GENERATE  prj_tbl1::tbl1_key AS key1,
> prj_tbl1::tbl1_value AS value1,
> prj_tbl1::tbl1_v1 AS v1,
> prj_tbl2::tbl2_key AS key2,
> prj_tbl2::tbl2_value AS value2;
>
> dump prj_result;
> {noformat}
> With this Pig script, we may see various invalid results, or no result at 
> all even though one should be returned.




[jira] [Assigned] (HIVE-10720) Pig using HCatLoader to access RCFile and perform join but get incorrect result.

2015-05-18 Thread Aihua Xu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-10720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu reassigned HIVE-10720:
---

Assignee: Aihua Xu

> Pig using HCatLoader to access RCFile and perform join but get incorrect 
> result.
> 
>
> Key: HIVE-10720
> URL: https://issues.apache.org/jira/browse/HIVE-10720
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog
>Affects Versions: 1.3.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
>
> {noformat}
> Create table tbl1 (key string, value string) stored as rcfile;
> Create table tbl2 (key string, value string);
> insert into tbl1 values('1', 'value1');
> insert into tbl2 values('1', 'value2');
> {noformat}
> Pig script:
> {noformat}
> tbl1 = LOAD 'tbl1' USING org.apache.hive.hcatalog.pig.HCatLoader();
> tbl2 = LOAD 'tbl2' USING org.apache.hive.hcatalog.pig.HCatLoader();
> src_tbl1 = FILTER tbl1 BY (key == '1');
> prj_tbl1 = FOREACH src_tbl1 GENERATE
>key as tbl1_key,
>value as tbl1_value,
>'333' as tbl1_v1;
>
> src_tbl2 = FILTER tbl2 BY (key == '1');
> prj_tbl2 = FOREACH src_tbl2 GENERATE
>key as tbl2_key,
>value as tbl2_value;
>
> result = JOIN prj_tbl1 BY (tbl1_key), prj_tbl2 BY (tbl2_key);
> prj_result = FOREACH result 
>   GENERATE  prj_tbl1::tbl1_key AS key1,
> prj_tbl1::tbl1_value AS value1,
> prj_tbl1::tbl1_v1 AS v1,
> prj_tbl2::tbl2_key AS key2,
> prj_tbl2::tbl2_value AS value2;
>
> dump prj_result;
> {noformat}
> With this Pig script, we may see various invalid results, or no result at 
> all even though one should be returned.




[jira] [Commented] (HIVE-10720) Pig using HCatLoader to access RCFile and perform join but get incorrect result.

2015-05-19 Thread Aihua Xu (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-10720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14550908#comment-14550908
 ] 

Aihua Xu commented on HIVE-10720:
-

The issue is that the ColumnProjectionUtils class only supports column 
projection for a single table. If the query involves multiple tables, the 
later one overwrites the previous one, which causes incorrect results or no 
values to be retrieved. 

I will revert HIVE-5193 so that HCatalog does not use that class for now, and 
do the correct column projection enhancement later.
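
The overwrite described above can be pictured with a small Python sketch. This is not Hive code: the key name `projected.column.ids`, the helper functions, and the use of a plain dict as the configuration are all hypothetical stand-ins. The per-table variant only shows one way such a collision could be avoided and is not claimed to be the fix that was adopted.

```python
# Hypothetical key name; a stand-in for a single job-wide projection setting.
PROJECTION_KEY = "projected.column.ids"


def set_projection(conf, column_ids):
    """Store the column ids to read under one shared key."""
    conf[PROJECTION_KEY] = ",".join(str(c) for c in column_ids)


def get_projection(conf):
    """Read back whatever projection is currently stored."""
    raw = conf.get(PROJECTION_KEY, "")
    return [int(c) for c in raw.split(",") if c]


def set_projection_per_table(conf, table, column_ids):
    """Alternative: keep one entry per table so settings cannot collide."""
    conf[PROJECTION_KEY + "." + table] = ",".join(str(c) for c in column_ids)


def get_projection_per_table(conf, table):
    """Read back the projection stored for one specific table."""
    raw = conf.get(PROJECTION_KEY + "." + table, "")
    return [int(c) for c in raw.split(",") if c]


shared = {}
set_projection(shared, [0, 1, 2])   # first table wants three columns
set_projection(shared, [1])         # second table overwrites the setting
first_table_sees = get_projection(shared)  # the first table's choice is lost

keyed = {}
set_projection_per_table(keyed, "tbl1", [0, 1, 2])
set_projection_per_table(keyed, "tbl2", [1])
```

In the shared case the first table ends up reading only the columns requested for the second table, which matches the symptom of wrong or missing values in the join.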

> Pig using HCatLoader to access RCFile and perform join but get incorrect 
> result.
> 
>
> Key: HIVE-10720
> URL: https://issues.apache.org/jira/browse/HIVE-10720
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog
>Affects Versions: 1.3.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
>
> {noformat}
> Create table tbl1 (key string, value string) stored as rcfile;
> Create table tbl2 (key string, value string);
> insert into tbl1 values('1', 'value1');
> insert into tbl2 values('1', 'value2');
> {noformat}
> Pig script:
> {noformat}
> tbl1 = LOAD 'tbl1' USING org.apache.hive.hcatalog.pig.HCatLoader();
> tbl2 = LOAD 'tbl2' USING org.apache.hive.hcatalog.pig.HCatLoader();
> src_tbl1 = FILTER tbl1 BY (key == '1');
> prj_tbl1 = FOREACH src_tbl1 GENERATE
>key as tbl1_key,
>value as tbl1_value,
>'333' as tbl1_v1;
>
> src_tbl2 = FILTER tbl2 BY (key == '1');
> prj_tbl2 = FOREACH src_tbl2 GENERATE
>key as tbl2_key,
>value as tbl2_value;
>
> result = JOIN prj_tbl1 BY (tbl1_key), prj_tbl2 BY (tbl2_key);
> prj_result = FOREACH result 
>   GENERATE  prj_tbl1::tbl1_key AS key1,
> prj_tbl1::tbl1_value AS value1,
> prj_tbl1::tbl1_v1 AS v1,
> prj_tbl2::tbl2_key AS key2,
> prj_tbl2::tbl2_value AS value2;
>
> dump prj_result;
> {noformat}
> With this Pig script, we may see various invalid results, or no result at 
> all even though one should be returned.




[jira] [Commented] (HIVE-10752) Revert HIVE-5193

2015-05-19 Thread Aihua Xu (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-10752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14550943#comment-14550943
 ] 

Aihua Xu commented on HIVE-10752:
-

Yes. This comes from a customer bug report. I have included the repro in 
HIVE-10720. Basically, it causes pig+hcatalog not to work at all when 2 or 
more tables are involved, so I think it doesn't matter whether we have such a 
performance enhancement or not.  

> Revert HIVE-5193
> 
>
> Key: HIVE-10752
> URL: https://issues.apache.org/jira/browse/HIVE-10752
> Project: Hive
>  Issue Type: Sub-task
>  Components: HCatalog
>Affects Versions: 1.2.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
>
> Revert HIVE-5193 since it breaks pig+hcatalog.




[jira] [Commented] (HIVE-10752) Revert HIVE-5193

2015-05-19 Thread Aihua Xu (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-10752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14550947#comment-14550947
 ] 

Aihua Xu commented on HIVE-10752:
-

I mean that since it doesn't work, the performance enhancement doesn't matter.

> Revert HIVE-5193
> 
>
> Key: HIVE-10752
> URL: https://issues.apache.org/jira/browse/HIVE-10752
> Project: Hive
>  Issue Type: Sub-task
>  Components: HCatalog
>Affects Versions: 1.2.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
>
> Revert HIVE-5193 since it breaks pig+hcatalog.




[jira] [Updated] (HIVE-10752) Revert HIVE-5193

2015-05-19 Thread Aihua Xu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-10752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-10752:

Attachment: HIVE-10752.patch

Revert HIVE-5193. Please note that there is an additional problem even after 
reverting. I will address it later; I didn't include it in this patch to keep 
the work separate. 

> Revert HIVE-5193
> 
>
> Key: HIVE-10752
> URL: https://issues.apache.org/jira/browse/HIVE-10752
> Project: Hive
>  Issue Type: Sub-task
>  Components: HCatalog
>Affects Versions: 1.2.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-10752.patch
>
>
> Revert HIVE-5193 since it breaks pig+hcatalog.




[jira] [Updated] (HIVE-10754) After reverting HIVE-5193, make pig+hCatalog join working

2015-05-20 Thread Aihua Xu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-10754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-10754:

Attachment: HIVE-10754.patch

> After reverting HIVE-5193, make pig+hCatalog join working 
> --
>
> Key: HIVE-10754
> URL: https://issues.apache.org/jira/browse/HIVE-10754
> Project: Hive
>  Issue Type: Sub-task
>  Components: HCatalog
>Affects Versions: 1.2.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-10754.patch
>
>





[jira] [Updated] (HIVE-10754) Pig+Hcatalog doesn't work properly since we need to clone the Job instance in HCatLoader

2015-05-20 Thread Aihua Xu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-10754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-10754:

Summary: Pig+Hcatalog doesn't work properly since we need to clone the Job 
instance in HCatLoader  (was: After reverting HIVE-5193, make pig+hCatalog join 
working )

> Pig+Hcatalog doesn't work properly since we need to clone the Job instance in 
> HCatLoader
> 
>
> Key: HIVE-10754
> URL: https://issues.apache.org/jira/browse/HIVE-10754
> Project: Hive
>  Issue Type: Sub-task
>  Components: HCatalog
>Affects Versions: 1.2.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-10754.patch
>
>





[jira] [Commented] (HIVE-10754) Pig+Hcatalog doesn't work properly since we need to clone the Job instance in HCatLoader

2015-05-20 Thread Aihua Xu (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-10754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14552999#comment-14552999
 ] 

Aihua Xu commented on HIVE-10754:
-

The Job object needs to be cloned; otherwise we would keep referring to the 
same object afterwards.
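
The aliasing problem mentioned here can be sketched in Python. The `Job` class, the `input.table` key, and both helper functions below are hypothetical and only mimic the idea of a job object carrying a configuration; they are not the HCatLoader code.

```python
import copy


class Job:
    """Hypothetical stand-in for a job object that carries a configuration."""

    def __init__(self):
        self.conf = {}


def set_input_shared(job, table):
    """Writes into the job that was passed in; every caller shares it."""
    job.conf["input.table"] = table
    return job


def set_input_cloned(job, table):
    """Writes into a deep copy, leaving the original job untouched."""
    clone = copy.deepcopy(job)
    clone.conf["input.table"] = table
    return clone


base = Job()
a = set_input_shared(base, "tbl1")
b = set_input_shared(base, "tbl2")   # a and b are the same object as base

base2 = Job()
c = set_input_cloned(base2, "tbl1")
d = set_input_cloned(base2, "tbl2")  # c and d are independent copies
```

Without the copy, the setting made for the second table silently replaces the one made for the first, because both names point at one object. With the copy, each loader keeps its own settings.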

> Pig+Hcatalog doesn't work properly since we need to clone the Job instance in 
> HCatLoader
> 
>
> Key: HIVE-10754
> URL: https://issues.apache.org/jira/browse/HIVE-10754
> Project: Hive
>  Issue Type: Sub-task
>  Components: HCatalog
>Affects Versions: 1.2.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-10754.patch
>
>





[jira] [Commented] (HIVE-10752) Revert HIVE-5193

2015-05-21 Thread Aihua Xu (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-10752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14554271#comment-14554271
 ] 

Aihua Xu commented on HIVE-10752:
-

[~daijy] and [~sushanth], it seems you are more familiar with this part of 
the code. Can you help review this patch and HIVE-10754? The changes are meant 
to make Pig+HCatalog on RCFile work first; after that, I will rework the 
HIVE-5193 enhancement to improve RCFile performance.

HIVE-5193 introduces the issue that if a Pig query involves more than one 
table, the columns to be retrieved for one table may be applied to another 
table, since we save this information in the JobConf and each table overwrites 
the previous one. 

> Revert HIVE-5193
> 
>
> Key: HIVE-10752
> URL: https://issues.apache.org/jira/browse/HIVE-10752
> Project: Hive
>  Issue Type: Sub-task
>  Components: HCatalog
>Affects Versions: 1.2.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-10752.patch
>
>
> Revert HIVE-5193 since it breaks pig+hcatalog.




[jira] [Updated] (HIVE-10702) COUNT(*) over windowing 'x preceding and y preceding' doesn't work properly

2015-05-21 Thread Aihua Xu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-10702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-10702:

Attachment: HIVE-10702.patch

> COUNT(*) over windowing 'x preceding and y preceding' doesn't work properly
> ---
>
> Key: HIVE-10702
> URL: https://issues.apache.org/jira/browse/HIVE-10702
> Project: Hive
>  Issue Type: Sub-task
>  Components: PTF-Windowing
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-10702.patch
>
>
> Given the following query:
> {noformat}
> select ts, f, count(*) over (partition by ts order by f rows between 2 
> preceding and 1 preceding) from over10k limit 100;
> {noformat}
> It returns the result 
> {noformat}
> 2013-03-01 09:11:58.70307   3.17    0
> 2013-03-01 09:11:58.70307   10.89   0
> 2013-03-01 09:11:58.70307   14.54   1
> 2013-03-01 09:11:58.70307   14.78   1
> 2013-03-01 09:11:58.70307   17.85   1
> 2013-03-01 09:11:58.70307   20.61   1
> 2013-03-01 09:11:58.70307   28.69   1
> 2013-03-01 09:11:58.70307   29.22   1
> 2013-03-01 09:11:58.70307   31.17   1
> 2013-03-01 09:11:58.70307   38.35   1
> 2013-03-01 09:11:58.70307   38.61   1
> {noformat}
> For most rows it should return count 2 rather than 1.
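
To make the expected counts concrete, here is a small Python sketch that computes the number of rows in a ROWS frame for each row of a single partition. The function name `windowed_count` and the offset convention (negative for "preceding") are assumptions for this example, not part of Hive.

```python
def windowed_count(num_rows, start_offset, end_offset):
    """Count the rows inside a ROWS frame for every row of one partition.

    Offsets are relative to the current row; negative means 'preceding',
    so 'rows between 2 preceding and 1 preceding' is (-2, -1).
    """
    counts = []
    for i in range(num_rows):
        # Clamp the frame to the partition boundaries.
        lo = max(i + start_offset, 0)
        hi = min(i + end_offset, num_rows - 1)
        # An empty frame (lo > hi) contributes a count of zero.
        counts.append(max(hi - lo + 1, 0))
    return counts
```

Under these assumptions, `windowed_count(5, -2, -1)` gives 0 for the first row, 1 for the second, and 2 for every later row, which is the pattern the report says the query should produce.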




[jira] [Commented] (HIVE-10702) COUNT(*) over windowing 'x preceding and y preceding' doesn't work properly

2015-05-22 Thread Aihua Xu (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-10702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14556029#comment-14556029
 ] 

Aihua Xu commented on HIVE-10702:
-

The test failure seems unrelated.

> COUNT(*) over windowing 'x preceding and y preceding' doesn't work properly
> ---
>
> Key: HIVE-10702
> URL: https://issues.apache.org/jira/browse/HIVE-10702
> Project: Hive
>  Issue Type: Sub-task
>  Components: PTF-Windowing
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-10702.patch
>
>
> Given the following query:
> {noformat}
> select ts, f, count(*) over (partition by ts order by f rows between 2 
> preceding and 1 preceding) from over10k limit 100;
> {noformat}
> It returns the result 
> {noformat}
> 2013-03-01 09:11:58.70307   3.17    0
> 2013-03-01 09:11:58.70307   10.89   0
> 2013-03-01 09:11:58.70307   14.54   1
> 2013-03-01 09:11:58.70307   14.78   1
> 2013-03-01 09:11:58.70307   17.85   1
> 2013-03-01 09:11:58.70307   20.61   1
> 2013-03-01 09:11:58.70307   28.69   1
> 2013-03-01 09:11:58.70307   29.22   1
> 2013-03-01 09:11:58.70307   31.17   1
> 2013-03-01 09:11:58.70307   38.35   1
> 2013-03-01 09:11:58.70307   38.61   1
> {noformat}
> For most rows it should return count 2 rather than 1.




[jira] [Commented] (HIVE-10702) COUNT(*) over windowing 'x preceding and y preceding' doesn't work properly

2015-05-22 Thread Aihua Xu (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-10702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14556031#comment-14556031
 ] 

Aihua Xu commented on HIVE-10702:
-

[~ashutoshc] Can you help me review this patch?

> COUNT(*) over windowing 'x preceding and y preceding' doesn't work properly
> ---
>
> Key: HIVE-10702
> URL: https://issues.apache.org/jira/browse/HIVE-10702
> Project: Hive
>  Issue Type: Sub-task
>  Components: PTF-Windowing
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-10702.patch
>
>
> Given the following query:
> {noformat}
> select ts, f, count(*) over (partition by ts order by f rows between 2 
> preceding and 1 preceding) from over10k limit 100;
> {noformat}
> It returns the result 
> {noformat}
> 2013-03-01 09:11:58.70307   3.17    0
> 2013-03-01 09:11:58.70307   10.89   0
> 2013-03-01 09:11:58.70307   14.54   1
> 2013-03-01 09:11:58.70307   14.78   1
> 2013-03-01 09:11:58.70307   17.85   1
> 2013-03-01 09:11:58.70307   20.61   1
> 2013-03-01 09:11:58.70307   28.69   1
> 2013-03-01 09:11:58.70307   29.22   1
> 2013-03-01 09:11:58.70307   31.17   1
> 2013-03-01 09:11:58.70307   38.35   1
> 2013-03-01 09:11:58.70307   38.61   1
> {noformat}
> For most rows it should return count 2 rather than 1.




[jira] [Updated] (HIVE-10802) Table join query with some constant field in select fails

2015-05-22 Thread Aihua Xu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-10802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-10802:

Description: 
The following query fails:
{noformat}
create table tb1 (year string, month string);
create table tb2(month string);
select unix_timestamp(a.year) 
from (select * from tb1 where year='2001') a join tb2 b on (a.month=b.month);
{noformat}

with the exception {noformat}
Caused by: java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
at java.util.ArrayList.rangeCheck(ArrayList.java:635)
at java.util.ArrayList.get(ArrayList.java:411)
at 
org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector.init(StandardStructObjectInspector.java:118)
at 
org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector.<init>(StandardStructObjectInspector.java:109)
at 
org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorFactory.getStandardStructObjectInspector(ObjectInspectorFactory.java:290)
at 
org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorFactory.getStandardStructObjectInspector(ObjectInspectorFactory.java:275)
at 
org.apache.hadoop.hive.ql.exec.CommonJoinOperator.getJoinOutputObjectInspector(CommonJoinOperator.java:175)
{noformat}

The issue seems to be: during the query compilation, the field in the select 
should be replaced with the constant while it's not for some UDFs.

  was:
The following query fails:
{noformat}
create table tb1 (year string, month string);
create table tb2(month string);
select unix_timestamp(a.year) 
from (select * from tb1 where year='2001') a join tb2 b on (a.month=b.month);
{noformat}

with the exception {noformat}
Caused by: java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
at java.util.ArrayList.rangeCheck(ArrayList.java:635)
at java.util.ArrayList.get(ArrayList.java:411)
at 
org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector.init(StandardStructObjectInspector.java:118)
at 
org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector.<init>(StandardStructObjectInspector.java:109)
at 
org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorFactory.getStandardStructObjectInspector(ObjectInspectorFactory.java:290)
at 
org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorFactory.getStandardStructObjectInspector(ObjectInspectorFactory.java:275)
at 
org.apache.hadoop.hive.ql.exec.CommonJoinOperator.getJoinOutputObjectInspector(CommonJoinOperator.java:175)
{noformat}


> Table join query with some constant field in select fails
> -
>
> Key: HIVE-10802
> URL: https://issues.apache.org/jira/browse/HIVE-10802
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Affects Versions: 1.2.0
>Reporter: Aihua Xu
>
> The following query fails:
> {noformat}
> create table tb1 (year string, month string);
> create table tb2(month string);
> select unix_timestamp(a.year) 
> from (select * from tb1 where year='2001') a join tb2 b on (a.month=b.month);
> {noformat}
> with the exception {noformat}
> Caused by: java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
> at java.util.ArrayList.rangeCheck(ArrayList.java:635)
> at java.util.ArrayList.get(ArrayList.java:411)
> at 
> org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector.init(StandardStructObjectInspector.java:118)
> at 
> org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector.<init>(StandardStructObjectInspector.java:109)
> at 
> org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorFactory.getStandardStructObjectInspector(ObjectInspectorFactory.java:290)
> at 
> org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorFactory.getStandardStructObjectInspector(ObjectInspectorFactory.java:275)
> at 
> org.apache.hadoop.hive.ql.exec.CommonJoinOperator.getJoinOutputObjectInspector(CommonJoinOperator.java:175)
> {noformat}
> The issue seems to be: during the query compilation, the field in the select 
> should be replaced with the constant while it's not for some UDFs.




[jira] [Updated] (HIVE-10802) Table join query with some constant field in select fails

2015-05-22 Thread Aihua Xu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-10802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-10802:

Description: 
The following query fails:
{noformat}
create table tb1 (year string, month string);
create table tb2(month string);
select unix_timestamp(a.year) 
from (select * from tb1 where year='2001') a join tb2 b on (a.month=b.month);
{noformat}

with the exception {noformat}
Caused by: java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
at java.util.ArrayList.rangeCheck(ArrayList.java:635)
at java.util.ArrayList.get(ArrayList.java:411)
at 
org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector.init(StandardStructObjectInspector.java:118)
at 
org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector.<init>(StandardStructObjectInspector.java:109)
at 
org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorFactory.getStandardStructObjectInspector(ObjectInspectorFactory.java:290)
at 
org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorFactory.getStandardStructObjectInspector(ObjectInspectorFactory.java:275)
at 
org.apache.hadoop.hive.ql.exec.CommonJoinOperator.getJoinOutputObjectInspector(CommonJoinOperator.java:175)
{noformat}

The issue seems to be: during the query compilation, the field in the select 
should be replaced with the constant when some UDFs are used.

  was:
The following query fails:
{noformat}
create table tb1 (year string, month string);
create table tb2(month string);
select unix_timestamp(a.year) 
from (select * from tb1 where year='2001') a join tb2 b on (a.month=b.month);
{noformat}

with the exception {noformat}
Caused by: java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
at java.util.ArrayList.rangeCheck(ArrayList.java:635)
at java.util.ArrayList.get(ArrayList.java:411)
at 
org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector.init(StandardStructObjectInspector.java:118)
at 
org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector.<init>(StandardStructObjectInspector.java:109)
at 
org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorFactory.getStandardStructObjectInspector(ObjectInspectorFactory.java:290)
at 
org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorFactory.getStandardStructObjectInspector(ObjectInspectorFactory.java:275)
at 
org.apache.hadoop.hive.ql.exec.CommonJoinOperator.getJoinOutputObjectInspector(CommonJoinOperator.java:175)
{noformat}

The issue seems to be that, during query compilation, the field in the select 
list should be replaced with the constant, but for some UDFs it is not.


> Table join query with some constant field in select fails
> -
>
> Key: HIVE-10802
> URL: https://issues.apache.org/jira/browse/HIVE-10802
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Affects Versions: 1.2.0
>Reporter: Aihua Xu
>
> The following query fails:
> {noformat}
> create table tb1 (year string, month string);
> create table tb2(month string);
> select unix_timestamp(a.year) 
> from (select * from tb1 where year='2001') a join tb2 b on (a.month=b.month);
> {noformat}
> with the exception {noformat}
> Caused by: java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
> at java.util.ArrayList.rangeCheck(ArrayList.java:635)
> at java.util.ArrayList.get(ArrayList.java:411)
> at 
> org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector.init(StandardStructObjectInspector.java:118)
> at 
> org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector.<init>(StandardStructObjectInspector.java:109)
> at 
> org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorFactory.getStandardStructObjectInspector(ObjectInspectorFactory.java:290)
> at 
> org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorFactory.getStandardStructObjectInspector(ObjectInspectorFactory.java:275)
> at 
> org.apache.hadoop.hive.ql.exec.CommonJoinOperator.getJoinOutputObjectInspector(CommonJoinOperator.java:175)
> {noformat}
> The issue seems to be that, during query compilation, the field in the select 
> list should be replaced with the constant, but this does not happen when some 
> UDFs are used.




[jira] [Commented] (HIVE-10752) Revert HIVE-5193

2015-05-22 Thread Aihua Xu (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-10752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14556737#comment-14556737
 ] 

Aihua Xu commented on HIVE-10752:
-

[~mithun] Any updates from you? 

> Revert HIVE-5193
> 
>
> Key: HIVE-10752
> URL: https://issues.apache.org/jira/browse/HIVE-10752
> Project: Hive
>  Issue Type: Sub-task
>  Components: HCatalog
>Affects Versions: 1.2.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-10752.patch
>
>
> Revert HIVE-5193 since it breaks pig+hcatalog.




[jira] [Commented] (HIVE-10702) COUNT(*) over windowing 'x preceding and y preceding' doesn't work properly

2015-05-26 Thread Aihua Xu (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-10702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14559118#comment-14559118
 ] 

Aihua Xu commented on HIVE-10702:
-

Thanks for reviewing and checking in, Ashutosh.

> COUNT(*) over windowing 'x preceding and y preceding' doesn't work properly
> ---
>
> Key: HIVE-10702
> URL: https://issues.apache.org/jira/browse/HIVE-10702
> Project: Hive
>  Issue Type: Sub-task
>  Components: PTF-Windowing
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Fix For: 1.3.0
>
> Attachments: HIVE-10702.patch
>
>
> Given the following query:
> {noformat}
> select ts, f, count(*) over (partition by ts order by f rows between 2 
> preceding and 1 preceding) from over10k limit 100;
> {noformat}
> It returns the result 
> {noformat}
> 2013-03-01 09:11:58.70307   3.17    0
> 2013-03-01 09:11:58.70307   10.89   0
> 2013-03-01 09:11:58.70307   14.54   1
> 2013-03-01 09:11:58.70307   14.78   1
> 2013-03-01 09:11:58.70307   17.85   1
> 2013-03-01 09:11:58.70307   20.61   1
> 2013-03-01 09:11:58.70307   28.69   1
> 2013-03-01 09:11:58.70307   29.22   1
> 2013-03-01 09:11:58.70307   31.17   1
> 2013-03-01 09:11:58.70307   38.35   1
> 2013-03-01 09:11:58.70307   38.61   1
> {noformat}
> For most rows it should return a count of 2 rather than 1.
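
For reference, the expected counts for a frame such as `rows between 2 preceding and 1 preceding` can be modeled in a few lines of Python. This is only a sketch of standard SQL window-frame semantics under the assumption that the start offset is not smaller than the end offset; it is not Hive's implementation, and the function name `count_over_preceding` and the five sample values are illustrative.

```python
def count_over_preceding(rows, start, end):
    """Model COUNT(*) OVER (ROWS BETWEEN start PRECEDING AND end PRECEDING).

    `rows` is a single partition, already sorted by the ORDER BY key.
    Assumes start >= end >= 0 (hypothetical helper, not a Hive API).
    """
    counts = []
    for i in range(len(rows)):
        lo = max(i - start, 0)      # first row of the frame, clamped to the partition start
        hi = max(i - end + 1, 0)    # one past the last row of the frame
        counts.append(max(hi - lo, 0))
    return counts


# Illustrative values only; any sorted partition behaves the same way.
f_values = [3.17, 10.89, 14.54, 14.78, 17.85]
print(count_over_preceding(f_values, 2, 1))  # [0, 1, 2, 2, 2]
```

Under these semantics only the first two rows of a partition see fewer than two preceding rows, so every later row should report 2.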




[jira] [Assigned] (HIVE-10802) Table join query with some constant field in select fails

2015-05-26 Thread Aihua Xu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-10802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu reassigned HIVE-10802:
---

Assignee: Aihua Xu

> Table join query with some constant field in select fails
> -
>
> Key: HIVE-10802
> URL: https://issues.apache.org/jira/browse/HIVE-10802
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Affects Versions: 1.2.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
>
> The following query fails:
> {noformat}
> create table tb1 (year string, month string);
> create table tb2(month string);
> select unix_timestamp(a.year) 
> from (select * from tb1 where year='2001') a join tb2 b on (a.month=b.month);
> {noformat}
> with the exception {noformat}
> Caused by: java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
> at java.util.ArrayList.rangeCheck(ArrayList.java:635)
> at java.util.ArrayList.get(ArrayList.java:411)
> at 
> org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector.init(StandardStructObjectInspector.java:118)
> at 
> org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector.<init>(StandardStructObjectInspector.java:109)
> at 
> org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorFactory.getStandardStructObjectInspector(ObjectInspectorFactory.java:290)
> at 
> org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorFactory.getStandardStructObjectInspector(ObjectInspectorFactory.java:275)
> at 
> org.apache.hadoop.hive.ql.exec.CommonJoinOperator.getJoinOutputObjectInspector(CommonJoinOperator.java:175)
> {noformat}
> The issue seems to be that, during query compilation, the field in the select 
> list should be replaced with the constant, but this does not happen when some 
> UDFs are used.




[jira] [Updated] (HIVE-10826) Support min()/max() functions over x preceding and y preceding windowing

2015-05-27 Thread Aihua Xu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-10826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-10826:

Attachment: HIVE-10826.patch

> Support min()/max() functions over x preceding and y preceding windowing 
> -
>
> Key: HIVE-10826
> URL: https://issues.apache.org/jira/browse/HIVE-10826
> Project: Hive
>  Issue Type: Sub-task
>  Components: PTF-Windowing
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-10826.patch
>
>
> Currently the query 
> {noformat}
> select key, value, min(value) over (partition by key order by value rows 
> between 1 preceding and 1 preceding) from small;
> {noformat}
> doesn't work. It fails with 
> {noformat}
> java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: 
> Hive Runtime Error while processing row (tag=0) 
> {"key":{"reducesinkkey0":"2"},"value":{"_col0":"500"}}
> at 
> org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:256)
> at 
> org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:506)
> at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:447)
> at 
> org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:449)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
> Error while processing row (tag=0) 
> {"key":{"reducesinkkey0":"2"},"value":{"_col0":"500"}}
> at 
> org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:244)
> ... 3 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Internal Error: 
> cannot generate all output rows for a Partition
> at 
> org.apache.hadoop.hive.ql.udf.ptf.WindowingTableFunction.finishPartition(WindowingTableFunction.java:520)
> at 
> org.apache.hadoop.hive.ql.exec.PTFOperator$PTFInvocation.finishPartition(PTFOperator.java:337)
> at 
> org.apache.hadoop.hive.ql.exec.PTFOperator.process(PTFOperator.java:114)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837)
> at 
> org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:88)
> at 
> org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:235)
> {noformat}
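
As a point of comparison for the query above, here is a minimal Python sketch of what `min(value)` over `rows between 1 preceding and 1 preceding` would be expected to return under standard SQL frame semantics: the previous row's value, and NULL (here `None`) when the frame is empty. It is not Hive's implementation; the helper name `min_over_preceding` and the sample values are assumptions made for illustration.

```python
def min_over_preceding(values, start, end):
    """Model MIN(x) OVER (ROWS BETWEEN start PRECEDING AND end PRECEDING).

    `values` is one partition, already sorted by the ORDER BY key.
    Assumes start >= end >= 0 (hypothetical helper, not a Hive API).
    """
    result = []
    for i in range(len(values)):
        lo = max(i - start, 0)      # first row of the frame, clamped to the partition start
        hi = max(i - end + 1, 0)    # one past the last row of the frame
        frame = values[lo:hi]
        # An empty frame yields NULL in SQL; None stands in for it here.
        result.append(min(frame) if frame else None)
    return result


# Illustrative partition; the contents of table `small` are not known.
print(min_over_preceding([10, 20, 20, 50], 1, 1))  # [None, 10, 20, 20]
```

The first row of each partition has no preceding row, so its frame is empty; a correct evaluator still has to emit one output row for it.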




[jira] [Updated] (HIVE-10826) Support min()/max() functions over x preceding and y preceding windowing

2015-05-27 Thread Aihua Xu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-10826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-10826:

Attachment: HIVE-10826.patch

> Support min()/max() functions over x preceding and y preceding windowing 
> -
>
> Key: HIVE-10826
> URL: https://issues.apache.org/jira/browse/HIVE-10826
> Project: Hive
>  Issue Type: Sub-task
>  Components: PTF-Windowing
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-10826.patch, HIVE-10826.patch
>
>
> Currently the query 
> {noformat}
> select key, value, min(value) over (partition by key order by value rows 
> between 1 preceding and 1 preceding) from small;
> {noformat}
> doesn't work. It fails with 
> {noformat}
> java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: 
> Hive Runtime Error while processing row (tag=0) 
> {"key":{"reducesinkkey0":"2"},"value":{"_col0":"500"}}
> at 
> org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:256)
> at 
> org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:506)
> at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:447)
> at 
> org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:449)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
> Error while processing row (tag=0) 
> {"key":{"reducesinkkey0":"2"},"value":{"_col0":"500"}}
> at 
> org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:244)
> ... 3 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Internal Error: 
> cannot generate all output rows for a Partition
> at 
> org.apache.hadoop.hive.ql.udf.ptf.WindowingTableFunction.finishPartition(WindowingTableFunction.java:520)
> at 
> org.apache.hadoop.hive.ql.exec.PTFOperator$PTFInvocation.finishPartition(PTFOperator.java:337)
> at 
> org.apache.hadoop.hive.ql.exec.PTFOperator.process(PTFOperator.java:114)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837)
> at 
> org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:88)
> at 
> org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:235)
> {noformat}




[jira] [Updated] (HIVE-10826) Support min()/max() functions over x preceding and y preceding windowing

2015-05-27 Thread Aihua Xu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-10826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-10826:

Attachment: (was: HIVE-10826.patch)

> Support min()/max() functions over x preceding and y preceding windowing 
> -
>
> Key: HIVE-10826
> URL: https://issues.apache.org/jira/browse/HIVE-10826
> Project: Hive
>  Issue Type: Sub-task
>  Components: PTF-Windowing
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-10826.patch
>
>
> Currently the query 
> {noformat}
> select key, value, min(value) over (partition by key order by value rows 
> between 1 preceding and 1 preceding) from small;
> {noformat}
> doesn't work. It fails with 
> {noformat}
> java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: 
> Hive Runtime Error while processing row (tag=0) 
> {"key":{"reducesinkkey0":"2"},"value":{"_col0":"500"}}
> at 
> org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:256)
> at 
> org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:506)
> at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:447)
> at 
> org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:449)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
> Error while processing row (tag=0) 
> {"key":{"reducesinkkey0":"2"},"value":{"_col0":"500"}}
> at 
> org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:244)
> ... 3 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Internal Error: 
> cannot generate all output rows for a Partition
> at 
> org.apache.hadoop.hive.ql.udf.ptf.WindowingTableFunction.finishPartition(WindowingTableFunction.java:520)
> at 
> org.apache.hadoop.hive.ql.exec.PTFOperator$PTFInvocation.finishPartition(PTFOperator.java:337)
> at 
> org.apache.hadoop.hive.ql.exec.PTFOperator.process(PTFOperator.java:114)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837)
> at 
> org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:88)
> at 
> org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:235)
> {noformat}




[jira] [Updated] (HIVE-10834) Support First_value()/last_value() over x preceding and y preceding windowing

2015-05-27 Thread Aihua Xu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-10834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-10834:

Attachment: HIVE-10834.patch

> Support First_value()/last_value() over x preceding and y preceding windowing
> -
>
> Key: HIVE-10834
> URL: https://issues.apache.org/jira/browse/HIVE-10834
> Project: Hive
>  Issue Type: Sub-task
>  Components: PTF-Windowing
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-10834.patch
>
>
> Currently the following query
> {noformat}
> select ts, f, first_value(f) over (partition by ts order by t rows between 2 
> preceding and 1 preceding) from over10k limit 100;
> {noformat}
> throws an exception:
> {noformat}
> java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: 
> Hive Runtime Error while processing row (tag=0) 
> {"key":{"reducesinkkey0":"2013-03-01 
> 09:11:58.703071","reducesinkkey1":-3},"value":{"_col3":0.83}}
> at 
> org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:256)
> at 
> org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:506)
> at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:447)
> at 
> org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:449)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
> Error while processing row (tag=0) {"key":{"reducesinkkey0":"2013-03-01 
> 09:11:58.703071","reducesinkkey1":-3},"value":{"_col3":0.83}}
> at 
> org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:244)
> ... 3 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Internal Error: 
> cannot generate all output rows for a Partition
> at 
> org.apache.hadoop.hive.ql.udf.ptf.WindowingTableFunction.finishPartition(WindowingTableFunction.java:519)
> at 
> org.apache.hadoop.hive.ql.exec.PTFOperator$PTFInvocation.finishPartition(PTFOperator.java:337)
> at 
> org.apache.hadoop.hive.ql.exec.PTFOperator.process(PTFOperator.java:114)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837)
> at 
> org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:88)
> at 
> org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:235)
> {noformat}
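
To make the intended behavior concrete, the following Python sketch models `first_value` and `last_value` over a frame such as `rows between 2 preceding and 1 preceding`, following standard SQL frame semantics. It is an illustration only, not Hive's code; the helper name `first_last_over_preceding` and the sample values are hypothetical.

```python
def first_last_over_preceding(values, start, end):
    """Model FIRST_VALUE/LAST_VALUE OVER (ROWS BETWEEN start PRECEDING AND end PRECEDING).

    `values` is one partition, already sorted by the ORDER BY key.
    Returns one (first_value, last_value) pair per input row.
    Assumes start >= end >= 0 (hypothetical helper, not a Hive API).
    """
    out = []
    for i in range(len(values)):
        # Frame bounds are clamped so they never reach before the partition start.
        frame = values[max(i - start, 0):max(i - end + 1, 0)]
        # An empty frame yields NULL for both functions; None stands in for it.
        out.append((frame[0], frame[-1]) if frame else (None, None))
    return out


# Illustrative values only.
print(first_last_over_preceding([0.83, 3.17, 10.89, 14.54], 2, 1))
# [(None, None), (0.83, 0.83), (0.83, 3.17), (3.17, 10.89)]
```

Every input row produces exactly one output pair, including the first row, whose frame is empty.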




[jira] [Commented] (HIVE-10826) Support min()/max() functions over x preceding and y preceding windowing

2015-05-27 Thread Aihua Xu (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-10826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14561781#comment-14561781
 ] 

Aihua Xu commented on HIVE-10826:
-

The test failure seems unrelated. 

> Support min()/max() functions over x preceding and y preceding windowing 
> -
>
> Key: HIVE-10826
> URL: https://issues.apache.org/jira/browse/HIVE-10826
> Project: Hive
>  Issue Type: Sub-task
>  Components: PTF-Windowing
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-10826.patch
>
>
> Currently the query 
> {noformat}
> select key, value, min(value) over (partition by key order by value rows 
> between 1 preceding and 1 preceding) from small;
> {noformat}
> doesn't work. It fails with 
> {noformat}
> java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: 
> Hive Runtime Error while processing row (tag=0) 
> {"key":{"reducesinkkey0":"2"},"value":{"_col0":"500"}}
> at 
> org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:256)
> at 
> org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:506)
> at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:447)
> at 
> org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:449)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
> Error while processing row (tag=0) 
> {"key":{"reducesinkkey0":"2"},"value":{"_col0":"500"}}
> at 
> org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:244)
> ... 3 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Internal Error: 
> cannot generate all output rows for a Partition
> at 
> org.apache.hadoop.hive.ql.udf.ptf.WindowingTableFunction.finishPartition(WindowingTableFunction.java:520)
> at 
> org.apache.hadoop.hive.ql.exec.PTFOperator$PTFInvocation.finishPartition(PTFOperator.java:337)
> at 
> org.apache.hadoop.hive.ql.exec.PTFOperator.process(PTFOperator.java:114)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837)
> at 
> org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:88)
> at 
> org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:235)
> {noformat}




[jira] [Commented] (HIVE-10826) Support min()/max() functions over x preceding and y preceding windowing

2015-05-27 Thread Aihua Xu (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-10826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14561794#comment-14561794
 ] 

Aihua Xu commented on HIVE-10826:
-

[~ashutoshc] Can you help review this change as well? Thanks.

> Support min()/max() functions over x preceding and y preceding windowing 
> -
>
> Key: HIVE-10826
> URL: https://issues.apache.org/jira/browse/HIVE-10826
> Project: Hive
>  Issue Type: Sub-task
>  Components: PTF-Windowing
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-10826.patch
>
>
> Currently the query 
> {noformat}
> select key, value, min(value) over (partition by key order by value rows 
> between 1 preceding and 1 preceding) from small;
> {noformat}
> doesn't work. It fails with 
> {noformat}
> java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: 
> Hive Runtime Error while processing row (tag=0) 
> {"key":{"reducesinkkey0":"2"},"value":{"_col0":"500"}}
> at 
> org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:256)
> at 
> org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:506)
> at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:447)
> at 
> org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:449)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
> Error while processing row (tag=0) 
> {"key":{"reducesinkkey0":"2"},"value":{"_col0":"500"}}
> at 
> org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:244)
> ... 3 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Internal Error: 
> cannot generate all output rows for a Partition
> at 
> org.apache.hadoop.hive.ql.udf.ptf.WindowingTableFunction.finishPartition(WindowingTableFunction.java:520)
> at 
> org.apache.hadoop.hive.ql.exec.PTFOperator$PTFInvocation.finishPartition(PTFOperator.java:337)
> at 
> org.apache.hadoop.hive.ql.exec.PTFOperator.process(PTFOperator.java:114)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837)
> at 
> org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:88)
> at 
> org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:235)
> {noformat}




[jira] [Commented] (HIVE-10826) Support min()/max() functions over x preceding and y preceding windowing

2015-05-27 Thread Aihua Xu (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-10826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14561799#comment-14561799
 ] 

Aihua Xu commented on HIVE-10826:
-

[~ashutoshc] Can you help review this change as well? Thanks.

> Support min()/max() functions over x preceding and y preceding windowing 
> -
>
> Key: HIVE-10826
> URL: https://issues.apache.org/jira/browse/HIVE-10826
> Project: Hive
>  Issue Type: Sub-task
>  Components: PTF-Windowing
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-10826.patch
>
>
> Currently the query 
> {noformat}
> select key, value, min(value) over (partition by key order by value rows 
> between 1 preceding and 1 preceding) from small;
> {noformat}
> doesn't work. It fails with 
> {noformat}
> java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: 
> Hive Runtime Error while processing row (tag=0) 
> {"key":{"reducesinkkey0":"2"},"value":{"_col0":"500"}}
> at 
> org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:256)
> at 
> org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:506)
> at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:447)
> at 
> org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:449)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
> Error while processing row (tag=0) 
> {"key":{"reducesinkkey0":"2"},"value":{"_col0":"500"}}
> at 
> org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:244)
> ... 3 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Internal Error: 
> cannot generate all output rows for a Partition
> at 
> org.apache.hadoop.hive.ql.udf.ptf.WindowingTableFunction.finishPartition(WindowingTableFunction.java:520)
> at 
> org.apache.hadoop.hive.ql.exec.PTFOperator$PTFInvocation.finishPartition(PTFOperator.java:337)
> at 
> org.apache.hadoop.hive.ql.exec.PTFOperator.process(PTFOperator.java:114)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837)
> at 
> org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:88)
> at 
> org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:235)
> {noformat}




[jira] [Commented] (HIVE-10720) Pig using HCatLoader to access RCFile and perform join but get incorrect result.

2015-05-28 Thread Aihua Xu (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-10720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14562873#comment-14562873
 ] 

Aihua Xu commented on HIVE-10720:
-

[~viraj] Thanks for the information. Your approach is actually the same as 
what I did in the subtask (reverting HIVE-5193): we no longer optimize the 
retrieval for column-based tables. 

That optimization, which retrieves only the required columns, is still valid 
since it improves performance.  My plan is to revert HIVE-5193 to unblock the 
customer and then rework the optimization. 

> Pig using HCatLoader to access RCFile and perform join but get incorrect 
> result.
> 
>
> Key: HIVE-10720
> URL: https://issues.apache.org/jira/browse/HIVE-10720
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog
>Affects Versions: 1.3.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-10720.patch
>
>
> {noformat}
> Create table tbl1 (key string, value string) stored as rcfile;
> Create table tbl2 (key string, value string);
> insert into tbl1 values('1', 'value1');
> insert into tbl2 values('1', 'value2');
> {noformat}
> Pig script:
> {noformat}
> tbl1 = LOAD 'tbl1' USING org.apache.hive.hcatalog.pig.HCatLoader();
> tbl2 = LOAD 'tbl2' USING org.apache.hive.hcatalog.pig.HCatLoader();
> src_tbl1 = FILTER tbl1 BY (key == '1');
> prj_tbl1 = FOREACH src_tbl1 GENERATE
>key as tbl1_key,
>value as tbl1_value,
>'333' as tbl1_v1;
>
> src_tbl2 = FILTER tbl2 BY (key == '1');
> prj_tbl2 = FOREACH src_tbl2 GENERATE
>key as tbl2_key,
>value as tbl2_value;
>
> result = JOIN prj_tbl1 BY (tbl1_key), prj_tbl2 BY (tbl2_key);
> prj_result = FOREACH result 
>   GENERATE  prj_tbl1::tbl1_key AS key1,
> prj_tbl1::tbl1_value AS value1,
> prj_tbl1::tbl1_v1 AS v1,
> prj_tbl2::tbl2_key AS key2,
> prj_tbl2::tbl2_value AS value2;
>
> dump prj_result;
> {noformat}
> Depending on the Pig script, we see various invalid results, or even no 
> result where one should be returned.
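
For clarity, the result one would expect from the Pig script above can be worked out by hand. The Python sketch below mirrors the FILTER, FOREACH and JOIN steps on the two single-row tables from the description; it is a plain in-memory model, not Pig or HCatalog code, and the function name `expected_join` is hypothetical.

```python
# Table contents taken from the INSERT statements in the description.
tbl1 = [("1", "value1")]
tbl2 = [("1", "value2")]


def expected_join(tbl1, tbl2):
    """Mirror the Pig script: filter on key == '1', project, then inner join on key."""
    # prj_tbl1: key, value, plus the constant column '333'
    prj_tbl1 = [(key, value, "333") for key, value in tbl1 if key == "1"]
    # prj_tbl2: key, value
    prj_tbl2 = [(key, value) for key, value in tbl2 if key == "1"]
    # JOIN prj_tbl1 BY tbl1_key, prj_tbl2 BY tbl2_key
    return [left + right for left in prj_tbl1 for right in prj_tbl2 if left[0] == right[0]]


print(expected_join(tbl1, tbl2))
# [('1', 'value1', '333', '1', 'value2')]
```

Any output other than this single five-column row (key1, value1, v1, key2, value2) would indicate the incorrect behavior reported here.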




[jira] [Commented] (HIVE-10752) Revert HIVE-5193

2015-05-28 Thread Aihua Xu (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-10752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14562917#comment-14562917
 ] 

Aihua Xu commented on HIVE-10752:
-

[~mithun] Are you able to verify that? I want to move forward so that I can 
rework HIVE-5193. Thanks.

> Revert HIVE-5193
> 
>
> Key: HIVE-10752
> URL: https://issues.apache.org/jira/browse/HIVE-10752
> Project: Hive
>  Issue Type: Sub-task
>  Components: HCatalog
>Affects Versions: 1.2.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-10752.patch
>
>
> Revert HIVE-5193 since it breaks pig+hcatalog.



