[jira] [Updated] (HIVE-21205) Tests for replace flag in insert event messages in Metastore notifications.

2019-04-01 Thread Bharathkrishna Guruvayoor Murali (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharathkrishna Guruvayoor Murali updated HIVE-21205:

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Pushed the patch to master and branch-3. Thanks for the review [~vihangk1]

Closing this JIRA.

> Tests for replace flag in insert event messages in Metastore notifications.
> ---
>
> Key: HIVE-21205
> URL: https://issues.apache.org/jira/browse/HIVE-21205
> Project: Hive
>  Issue Type: Test
>Reporter: Bharathkrishna Guruvayoor Murali
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Minor
> Attachments: HIVE-21205.1-branch-3.patch, HIVE-21205.1.patch, 
> HIVE-21205.2.patch
>
>
> The replace flag was initially added in HIVE-16197. It would be good to have 
> some tests in TestDbNotificationListener to validate that the flag is set as 
> expected.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21526) JSONDropDatabaseMessage needs to have the full database object.

2019-04-01 Thread Bharathkrishna Guruvayoor Murali (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharathkrishna Guruvayoor Murali updated HIVE-21526:

   Resolution: Fixed
Fix Version/s: 3.2.0
               4.0.0
       Status: Resolved  (was: Patch Available)

Pushed the patch to master and branch-3. Thanks for the review [~vihangk1]

Closing this JIRA.

> JSONDropDatabaseMessage needs to have the full database object.
> ---
>
> Key: HIVE-21526
> URL: https://issues.apache.org/jira/browse/HIVE-21526
> Project: Hive
>  Issue Type: Improvement
>Reporter: Bharathkrishna Guruvayoor Murali
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Major
> Fix For: 4.0.0, 3.2.0
>
> Attachments: HIVE-21526.1-branch-3.patch, HIVE-21526.1.patch
>
>
> The metastore notification event DROP_DATABASE does not provide full-thrift 
> objects as of now.
> We have added CREATION_TIME to databases in HIVE-21077, and metadata like 
> this would be useful in notification processing. One of the use-cases is 
> IMPALA-8338.





[jira] [Updated] (HIVE-21526) JSONDropDatabaseMessage needs to have the full database object.

2019-04-01 Thread Bharathkrishna Guruvayoor Murali (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharathkrishna Guruvayoor Murali updated HIVE-21526:

Attachment: HIVE-21526.1-branch-3.patch

> JSONDropDatabaseMessage needs to have the full database object.
> ---
>
> Key: HIVE-21526
> URL: https://issues.apache.org/jira/browse/HIVE-21526
> Project: Hive
>  Issue Type: Improvement
>Reporter: Bharathkrishna Guruvayoor Murali
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Major
> Attachments: HIVE-21526.1-branch-3.patch, HIVE-21526.1.patch
>
>
> The metastore notification event DROP_DATABASE does not provide full-thrift 
> objects as of now.
> We have added CREATION_TIME to databases in HIVE-21077, and metadata like 
> this would be useful in notification processing. One of the use-cases is 
> IMPALA-8338.





[jira] [Updated] (HIVE-21205) Tests for replace flag in insert event messages in Metastore notifications.

2019-04-01 Thread Bharathkrishna Guruvayoor Murali (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharathkrishna Guruvayoor Murali updated HIVE-21205:

Attachment: HIVE-21205.1-branch-3.patch

> Tests for replace flag in insert event messages in Metastore notifications.
> ---
>
> Key: HIVE-21205
> URL: https://issues.apache.org/jira/browse/HIVE-21205
> Project: Hive
>  Issue Type: Test
>Reporter: Bharathkrishna Guruvayoor Murali
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Minor
> Attachments: HIVE-21205.1-branch-3.patch, HIVE-21205.1.patch, 
> HIVE-21205.2.patch
>
>
> The replace flag was initially added in HIVE-16197. It would be good to have 
> some tests in TestDbNotificationListener to validate that the flag is set as 
> expected.





[jira] [Commented] (HIVE-21526) JSONDropDatabaseMessage needs to have the full database object.

2019-03-28 Thread Bharathkrishna Guruvayoor Murali (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16804191#comment-16804191
 ] 

Bharathkrishna Guruvayoor Murali commented on HIVE-21526:
-

Tests are passing. I guess the checkstyle error reported can be ignored. ASF 
License errors are unrelated.
[~vihangk1] can you take a final look?

> JSONDropDatabaseMessage needs to have the full database object.
> ---
>
> Key: HIVE-21526
> URL: https://issues.apache.org/jira/browse/HIVE-21526
> Project: Hive
>  Issue Type: Improvement
>Reporter: Bharathkrishna Guruvayoor Murali
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Major
> Attachments: HIVE-21526.1.patch
>
>
> The metastore notification event DROP_DATABASE does not provide full-thrift 
> objects as of now.
> We have added CREATION_TIME to databases in HIVE-21077, and metadata like 
> this would be useful in notification processing. One of the use-cases is 
> IMPALA-8338.





[jira] [Updated] (HIVE-21526) JSONDropDatabaseMessage needs to have the full database object.

2019-03-27 Thread Bharathkrishna Guruvayoor Murali (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharathkrishna Guruvayoor Murali updated HIVE-21526:

Status: Patch Available  (was: Open)

> JSONDropDatabaseMessage needs to have the full database object.
> ---
>
> Key: HIVE-21526
> URL: https://issues.apache.org/jira/browse/HIVE-21526
> Project: Hive
>  Issue Type: Improvement
>Reporter: Bharathkrishna Guruvayoor Murali
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Major
> Attachments: HIVE-21526.1.patch
>
>
> The metastore notification event DROP_DATABASE does not provide full-thrift 
> objects as of now.
> We have added CREATION_TIME to databases in HIVE-21077, and metadata like 
> this would be useful in notification processing. One of the use-cases is 
> IMPALA-8338.





[jira] [Updated] (HIVE-21526) JSONDropDatabaseMessage needs to have the full database object.

2019-03-27 Thread Bharathkrishna Guruvayoor Murali (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharathkrishna Guruvayoor Murali updated HIVE-21526:

Attachment: HIVE-21526.1.patch

> JSONDropDatabaseMessage needs to have the full database object.
> ---
>
> Key: HIVE-21526
> URL: https://issues.apache.org/jira/browse/HIVE-21526
> Project: Hive
>  Issue Type: Improvement
>Reporter: Bharathkrishna Guruvayoor Murali
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Major
> Attachments: HIVE-21526.1.patch
>
>
> The metastore notification event DROP_DATABASE does not provide full-thrift 
> objects as of now.
> We have added CREATION_TIME to databases in HIVE-21077, and metadata like 
> this would be useful in notification processing. One of the use-cases is 
> IMPALA-8338.





[jira] [Assigned] (HIVE-21526) JSONDropDatabaseMessage needs to have the full database object.

2019-03-27 Thread Bharathkrishna Guruvayoor Murali (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharathkrishna Guruvayoor Murali reassigned HIVE-21526:
---


> JSONDropDatabaseMessage needs to have the full database object.
> ---
>
> Key: HIVE-21526
> URL: https://issues.apache.org/jira/browse/HIVE-21526
> Project: Hive
>  Issue Type: Improvement
>Reporter: Bharathkrishna Guruvayoor Murali
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Major
>
> The metastore notification event DROP_DATABASE does not provide full-thrift 
> objects as of now.
> We have added CREATION_TIME to databases in HIVE-21077, and metadata like 
> this would be useful in notification processing. One of the use-cases is 
> IMPALA-8338.





[jira] [Updated] (HIVE-21205) Tests for replace flag in insert event messages in Metastore notifications.

2019-03-20 Thread Bharathkrishna Guruvayoor Murali (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharathkrishna Guruvayoor Murali updated HIVE-21205:

Attachment: HIVE-21205.2.patch

> Tests for replace flag in insert event messages in Metastore notifications.
> ---
>
> Key: HIVE-21205
> URL: https://issues.apache.org/jira/browse/HIVE-21205
> Project: Hive
>  Issue Type: Test
>Reporter: Bharathkrishna Guruvayoor Murali
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Minor
> Attachments: HIVE-21205.1.patch, HIVE-21205.2.patch
>
>
> The replace flag was initially added in HIVE-16197. It would be good to have 
> some tests in TestDbNotificationListener to validate that the flag is set as 
> expected.





[jira] [Assigned] (HIVE-13517) Hive logs in Spark Executor and Driver should show thread-id.

2019-03-18 Thread Bharathkrishna Guruvayoor Murali (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-13517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharathkrishna Guruvayoor Murali reassigned HIVE-13517:
---

Assignee: (was: Bharathkrishna Guruvayoor Murali)

> Hive logs in Spark Executor and Driver should show thread-id.
> -
>
> Key: HIVE-13517
> URL: https://issues.apache.org/jira/browse/HIVE-13517
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Affects Versions: 1.2.1, 2.0.0
>Reporter: Szehon Ho
>Priority: Major
> Attachments: HIVE-13517.1.patch, HIVE-13517.2.patch, 
> executor-driver-log.PNG
>
>
> In Spark, there might be more than one task running in one executor. 
> Similarly, there may be more than one thread running in Driver.
> This makes debugging through the logs a nightmare. It would be great if there 
> could be thread-ids in the logs.





[jira] [Assigned] (HIVE-21205) Tests for replace flag in insert event messages in Metastore notifications.

2019-02-05 Thread Bharathkrishna Guruvayoor Murali (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharathkrishna Guruvayoor Murali reassigned HIVE-21205:
---

Assignee: Bharathkrishna Guruvayoor Murali

> Tests for replace flag in insert event messages in Metastore notifications.
> ---
>
> Key: HIVE-21205
> URL: https://issues.apache.org/jira/browse/HIVE-21205
> Project: Hive
>  Issue Type: Test
>Reporter: Bharathkrishna Guruvayoor Murali
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Minor
> Attachments: HIVE-21205.1.patch
>
>
> The replace flag was initially added in HIVE-16197. It would be good to have 
> some tests in TestDbNotificationListener to validate that the flag is set as 
> expected.





[jira] [Updated] (HIVE-21205) Tests for replace flag in insert event messages in Metastore notifications.

2019-02-05 Thread Bharathkrishna Guruvayoor Murali (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharathkrishna Guruvayoor Murali updated HIVE-21205:

Attachment: HIVE-21205.1.patch
Status: Patch Available  (was: Open)

> Tests for replace flag in insert event messages in Metastore notifications.
> ---
>
> Key: HIVE-21205
> URL: https://issues.apache.org/jira/browse/HIVE-21205
> Project: Hive
>  Issue Type: Test
>Reporter: Bharathkrishna Guruvayoor Murali
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Minor
> Attachments: HIVE-21205.1.patch
>
>
> The replace flag was initially added in HIVE-16197. It would be good to have 
> some tests in TestDbNotificationListener to validate that the flag is set as 
> expected.





[jira] [Updated] (HIVE-21205) Tests for replace flag in insert event messages in Metastore notifications.

2019-02-05 Thread Bharathkrishna Guruvayoor Murali (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharathkrishna Guruvayoor Murali updated HIVE-21205:

Priority: Minor  (was: Major)

> Tests for replace flag in insert event messages in Metastore notifications.
> ---
>
> Key: HIVE-21205
> URL: https://issues.apache.org/jira/browse/HIVE-21205
> Project: Hive
>  Issue Type: Test
>Reporter: Bharathkrishna Guruvayoor Murali
>Priority: Minor
>
> The replace flag was initially added in HIVE-16197. It would be good to have 
> some tests in TestDbNotificationListener to validate that the flag is set as 
> expected.





[jira] [Commented] (HIVE-21115) Add support for object versions in metastore

2019-01-23 Thread Bharathkrishna Guruvayoor Murali (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16750280#comment-16750280
 ] 

Bharathkrishna Guruvayoor Murali commented on HIVE-21115:
-

Hi Alan, sure, we can proceed after discussing this and reaching consensus. The 
main purpose of posting this patch is to give everyone an idea of how we are 
currently thinking of implementing this, and to see whether any unexpected 
tests fail so that problems with the approach are detected early. 
Updating the versions via datanucleus has a problem: the updated version 
number is not reflected in the MetaStoreConf notifications for transactional 
listeners, because the notifications are issued before commitTransaction(). 
Hence, the logic in the attached prototype patch uses SELECT ... FOR UPDATE on 
the respective table/partition row to update the version. As I mentioned, let's 
go ahead and put more thought into this; the patch just serves to give more 
clarity on the idea.

> Add support for object versions in metastore
> 
>
> Key: HIVE-21115
> URL: https://issues.apache.org/jira/browse/HIVE-21115
> Project: Hive
>  Issue Type: Improvement
>Reporter: Vihang Karajgaonkar
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Major
> Attachments: HIVE-21115.1.patch, HIVE-21115.2.patch
>
>
> Currently, metastore objects are identified uniquely by their names (e.g.
> catName, dbName and tblName for a table are unique). Once a table or
> partition is created, it can be altered in many ways. There is currently no
> good way to identify the version of the object once it is altered. For
> example, suppose there are two clients (Hive and Impala) using the same
> metastore. Once some alter operations are performed by one client, another
> client which wants to do an alter operation has no good way to know if the
> object which it has is the same as the one stored in the metastore. The
> metastore updates the {{transient_lastDdlTime}} every time there is a DDL
> operation on the object. However, this value cannot be relied on for all
> clients, since after HIVE-1768 the metastore updates the value only when it
> is not set in the parameters. It is possible that a client which alters the
> object state does not remove the {{transient_lastDdlTime}}, in which case
> the metastore will not update it.
> Secondly, if there is clock skew between multiple HMS instances when HMS-HA
> is configured, time values cannot be relied on to find out the sequence of
> alter operations on a given object.
> This JIRA proposes to use the JDO versioning support in Datanucleus
> http://www.datanucleus.org/products/accessplatform_4_2/jdo/versioning.html to
> generate an incrementing sequence number every time an object is altered. The
> value of this object can be set as one of the values in the parameters. The
> advantage of using Datanucleus is that the versioning can be done across HMS
> instances as part of the database transaction, and it should work for all
> the supported databases.
> In theory, such a version can be used to detect if the client is presenting
> an object which is "stale" when issuing an alter request. The metastore can
> choose to reject such an alter request, since the client may be caching an
> old version of the object and any alter operation on such a stale object can
> potentially overwrite previous operations. However, this can be done in a
> separate JIRA.
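
For readers of the archive, the stale-object check described in the quoted description can be sketched roughly as follows. This is only an illustration under stated assumptions: VersionedStore, Versioned, create, get and alter are hypothetical names for an in-memory stand-in, not the metastore API, the Datanucleus versioning mechanism, or the logic in the attached patches.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory stand-in for the metastore; not Hive's actual API. */
class VersionedStore {

    /** An object definition paired with its incrementing version number. */
    static final class Versioned {
        final String definition;
        final long version;

        Versioned(String definition, long version) {
            this.definition = definition;
            this.version = version;
        }
    }

    private final Map<String, Versioned> objects = new HashMap<>();

    /** Stores a new object at version 1 and returns that version. */
    synchronized long create(String name, String definition) {
        objects.put(name, new Versioned(definition, 1L));
        return 1L;
    }

    /** Returns the stored copy, or null if the name is unknown. */
    synchronized Versioned get(String name) {
        return objects.get(name);
    }

    /**
     * Applies the alter only if the caller's version matches the stored one.
     * Returns the new version, or -1 if the name is unknown or the caller's
     * copy is stale (assumption: a real server would raise an error instead).
     */
    synchronized long alter(String name, long expectedVersion, String newDefinition) {
        Versioned current = objects.get(name);
        if (current == null || current.version != expectedVersion) {
            return -1L;
        }
        long next = current.version + 1;
        objects.put(name, new Versioned(newDefinition, next));
        return next;
    }
}
```

In this sketch, two clients that both read version 1 cannot both succeed: the first alter moves the object to version 2, and the second is rejected because it still presents version 1. The comparison uses a counter rather than a timestamp, so clock skew between server instances does not affect it.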





[jira] [Updated] (HIVE-21115) Add support for object versions in metastore

2019-01-23 Thread Bharathkrishna Guruvayoor Murali (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharathkrishna Guruvayoor Murali updated HIVE-21115:

Attachment: HIVE-21115.2.patch

> Add support for object versions in metastore
> 
>
> Key: HIVE-21115
> URL: https://issues.apache.org/jira/browse/HIVE-21115
> Project: Hive
>  Issue Type: Improvement
>Reporter: Vihang Karajgaonkar
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Major
> Attachments: HIVE-21115.1.patch, HIVE-21115.2.patch
>
>
> Currently, metastore objects are identified uniquely by their names (e.g.
> catName, dbName and tblName for a table are unique). Once a table or
> partition is created, it can be altered in many ways. There is currently no
> good way to identify the version of the object once it is altered. For
> example, suppose there are two clients (Hive and Impala) using the same
> metastore. Once some alter operations are performed by one client, another
> client which wants to do an alter operation has no good way to know if the
> object which it has is the same as the one stored in the metastore. The
> metastore updates the {{transient_lastDdlTime}} every time there is a DDL
> operation on the object. However, this value cannot be relied on for all
> clients, since after HIVE-1768 the metastore updates the value only when it
> is not set in the parameters. It is possible that a client which alters the
> object state does not remove the {{transient_lastDdlTime}}, in which case
> the metastore will not update it.
> Secondly, if there is clock skew between multiple HMS instances when HMS-HA
> is configured, time values cannot be relied on to find out the sequence of
> alter operations on a given object.
> This JIRA proposes to use the JDO versioning support in Datanucleus
> http://www.datanucleus.org/products/accessplatform_4_2/jdo/versioning.html to
> generate an incrementing sequence number every time an object is altered. The
> value of this object can be set as one of the values in the parameters. The
> advantage of using Datanucleus is that the versioning can be done across HMS
> instances as part of the database transaction, and it should work for all
> the supported databases.
> In theory, such a version can be used to detect if the client is presenting
> an object which is "stale" when issuing an alter request. The metastore can
> choose to reject such an alter request, since the client may be caching an
> old version of the object and any alter operation on such a stale object can
> potentially overwrite previous operations. However, this can be done in a
> separate JIRA.





[jira] [Commented] (HIVE-21115) Add support for object versions in metastore

2019-01-22 Thread Bharathkrishna Guruvayoor Murali (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16749551#comment-16749551
 ] 

Bharathkrishna Guruvayoor Murali commented on HIVE-21115:
-

Attaching a prototype that adds "version" to Table and Partition objects, 
partly to see which tests fail.

> Add support for object versions in metastore
> 
>
> Key: HIVE-21115
> URL: https://issues.apache.org/jira/browse/HIVE-21115
> Project: Hive
>  Issue Type: Improvement
>Reporter: Vihang Karajgaonkar
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Major
> Attachments: HIVE-21115.1.patch
>
>
> Currently, metastore objects are identified uniquely by their names (e.g.
> catName, dbName and tblName for a table are unique). Once a table or
> partition is created, it can be altered in many ways. There is currently no
> good way to identify the version of the object once it is altered. For
> example, suppose there are two clients (Hive and Impala) using the same
> metastore. Once some alter operations are performed by one client, another
> client which wants to do an alter operation has no good way to know if the
> object which it has is the same as the one stored in the metastore. The
> metastore updates the {{transient_lastDdlTime}} every time there is a DDL
> operation on the object. However, this value cannot be relied on for all
> clients, since after HIVE-1768 the metastore updates the value only when it
> is not set in the parameters. It is possible that a client which alters the
> object state does not remove the {{transient_lastDdlTime}}, in which case
> the metastore will not update it.
> Secondly, if there is clock skew between multiple HMS instances when HMS-HA
> is configured, time values cannot be relied on to find out the sequence of
> alter operations on a given object.
> This JIRA proposes to use the JDO versioning support in Datanucleus
> http://www.datanucleus.org/products/accessplatform_4_2/jdo/versioning.html to
> generate an incrementing sequence number every time an object is altered. The
> value of this object can be set as one of the values in the parameters. The
> advantage of using Datanucleus is that the versioning can be done across HMS
> instances as part of the database transaction, and it should work for all
> the supported databases.
> In theory, such a version can be used to detect if the client is presenting
> an object which is "stale" when issuing an alter request. The metastore can
> choose to reject such an alter request, since the client may be caching an
> old version of the object and any alter operation on such a stale object can
> potentially overwrite previous operations. However, this can be done in a
> separate JIRA.





[jira] [Updated] (HIVE-21115) Add support for object versions in metastore

2019-01-22 Thread Bharathkrishna Guruvayoor Murali (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharathkrishna Guruvayoor Murali updated HIVE-21115:

Status: Patch Available  (was: Open)

> Add support for object versions in metastore
> 
>
> Key: HIVE-21115
> URL: https://issues.apache.org/jira/browse/HIVE-21115
> Project: Hive
>  Issue Type: Improvement
>Reporter: Vihang Karajgaonkar
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Major
> Attachments: HIVE-21115.1.patch
>
>
> Currently, metastore objects are identified uniquely by their names (e.g.
> catName, dbName and tblName for a table are unique). Once a table or
> partition is created, it can be altered in many ways. There is currently no
> good way to identify the version of the object once it is altered. For
> example, suppose there are two clients (Hive and Impala) using the same
> metastore. Once some alter operations are performed by one client, another
> client which wants to do an alter operation has no good way to know if the
> object which it has is the same as the one stored in the metastore. The
> metastore updates the {{transient_lastDdlTime}} every time there is a DDL
> operation on the object. However, this value cannot be relied on for all
> clients, since after HIVE-1768 the metastore updates the value only when it
> is not set in the parameters. It is possible that a client which alters the
> object state does not remove the {{transient_lastDdlTime}}, in which case
> the metastore will not update it.
> Secondly, if there is clock skew between multiple HMS instances when HMS-HA
> is configured, time values cannot be relied on to find out the sequence of
> alter operations on a given object.
> This JIRA proposes to use the JDO versioning support in Datanucleus
> http://www.datanucleus.org/products/accessplatform_4_2/jdo/versioning.html to
> generate an incrementing sequence number every time an object is altered. The
> value of this object can be set as one of the values in the parameters. The
> advantage of using Datanucleus is that the versioning can be done across HMS
> instances as part of the database transaction, and it should work for all
> the supported databases.
> In theory, such a version can be used to detect if the client is presenting
> an object which is "stale" when issuing an alter request. The metastore can
> choose to reject such an alter request, since the client may be caching an
> old version of the object and any alter operation on such a stale object can
> potentially overwrite previous operations. However, this can be done in a
> separate JIRA.





[jira] [Updated] (HIVE-21115) Add support for object versions in metastore

2019-01-22 Thread Bharathkrishna Guruvayoor Murali (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharathkrishna Guruvayoor Murali updated HIVE-21115:

Attachment: HIVE-21115.1.patch

> Add support for object versions in metastore
> 
>
> Key: HIVE-21115
> URL: https://issues.apache.org/jira/browse/HIVE-21115
> Project: Hive
>  Issue Type: Improvement
>Reporter: Vihang Karajgaonkar
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Major
> Attachments: HIVE-21115.1.patch
>
>
> Currently, metastore objects are identified uniquely by their names (e.g.
> catName, dbName and tblName for a table are unique). Once a table or
> partition is created, it can be altered in many ways. There is currently no
> good way to identify the version of the object once it is altered. For
> example, suppose there are two clients (Hive and Impala) using the same
> metastore. Once some alter operations are performed by one client, another
> client which wants to do an alter operation has no good way to know if the
> object which it has is the same as the one stored in the metastore. The
> metastore updates the {{transient_lastDdlTime}} every time there is a DDL
> operation on the object. However, this value cannot be relied on for all
> clients, since after HIVE-1768 the metastore updates the value only when it
> is not set in the parameters. It is possible that a client which alters the
> object state does not remove the {{transient_lastDdlTime}}, in which case
> the metastore will not update it.
> Secondly, if there is clock skew between multiple HMS instances when HMS-HA
> is configured, time values cannot be relied on to find out the sequence of
> alter operations on a given object.
> This JIRA proposes to use the JDO versioning support in Datanucleus
> http://www.datanucleus.org/products/accessplatform_4_2/jdo/versioning.html to
> generate an incrementing sequence number every time an object is altered. The
> value of this object can be set as one of the values in the parameters. The
> advantage of using Datanucleus is that the versioning can be done across HMS
> instances as part of the database transaction, and it should work for all
> the supported databases.
> In theory, such a version can be used to detect if the client is presenting
> an object which is "stale" when issuing an alter request. The metastore can
> choose to reject such an alter request, since the client may be caching an
> old version of the object and any alter operation on such a stale object can
> potentially overwrite previous operations. However, this can be done in a
> separate JIRA.





[jira] [Assigned] (HIVE-21115) Add support for object versions in metastore

2019-01-22 Thread Bharathkrishna Guruvayoor Murali (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharathkrishna Guruvayoor Murali reassigned HIVE-21115:
---

Assignee: Bharathkrishna Guruvayoor Murali

> Add support for object versions in metastore
> 
>
> Key: HIVE-21115
> URL: https://issues.apache.org/jira/browse/HIVE-21115
> Project: Hive
>  Issue Type: Improvement
>Reporter: Vihang Karajgaonkar
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Major
>
> Currently, metastore objects are identified uniquely by their names (e.g.
> catName, dbName and tblName for a table are unique). Once a table or
> partition is created, it can be altered in many ways. There is currently no
> good way to identify the version of the object once it is altered. For
> example, suppose there are two clients (Hive and Impala) using the same
> metastore. Once some alter operations are performed by one client, another
> client which wants to do an alter operation has no good way to know if the
> object which it has is the same as the one stored in the metastore. The
> metastore updates the {{transient_lastDdlTime}} every time there is a DDL
> operation on the object. However, this value cannot be relied on for all
> clients, since after HIVE-1768 the metastore updates the value only when it
> is not set in the parameters. It is possible that a client which alters the
> object state does not remove the {{transient_lastDdlTime}}, in which case
> the metastore will not update it.
> Secondly, if there is clock skew between multiple HMS instances when HMS-HA
> is configured, time values cannot be relied on to find out the sequence of
> alter operations on a given object.
> This JIRA proposes to use the JDO versioning support in Datanucleus
> http://www.datanucleus.org/products/accessplatform_4_2/jdo/versioning.html to
> generate an incrementing sequence number every time an object is altered. The
> value of this object can be set as one of the values in the parameters. The
> advantage of using Datanucleus is that the versioning can be done across HMS
> instances as part of the database transaction, and it should work for all
> the supported databases.
> In theory, such a version can be used to detect if the client is presenting
> an object which is "stale" when issuing an alter request. The metastore can
> choose to reject such an alter request, since the client may be caching an
> old version of the object and any alter operation on such a stale object can
> potentially overwrite previous operations. However, this can be done in a
> separate JIRA.





[jira] [Commented] (HIVE-21128) hive.version.shortname should be 3.2 on branch-3

2019-01-22 Thread Bharathkrishna Guruvayoor Murali (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16749045#comment-16749045
 ] 

Bharathkrishna Guruvayoor Murali commented on HIVE-21128:
-

+1 pending a green test run.

> hive.version.shortname should be 3.2 on branch-3
> 
>
> Key: HIVE-21128
> URL: https://issues.apache.org/jira/browse/HIVE-21128
> Project: Hive
>  Issue Type: Bug
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Major
> Attachments: HIVE-21128.01.branch-3.patch, 
> HIVE-21128.02.branch-3.patch
>
>
> Since 3.1.0 is already released, the {{hive.version.shortname}} property in 
> the pom.xml of standalone-metastore should be 3.2.0. This version shortname 
> is used to generate the metastore schema version and is used by Schematool to 
> initialize the schema using the correct script. Currently it is using the 
> 3.1.0 schema init script instead of the 3.2.0 init script.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21077) Database and catalogs should have creation time

2019-01-17 Thread Bharathkrishna Guruvayoor Murali (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16745797#comment-16745797
 ] 

Bharathkrishna Guruvayoor Murali commented on HIVE-21077:
-

The changes for the upgrade scripts from 3.1.0 to 3.2.0 look good to me.
+1

> Database and catalogs should have creation time
> ---
>
> Key: HIVE-21077
> URL: https://issues.apache.org/jira/browse/HIVE-21077
> Project: Hive
>  Issue Type: Improvement
>  Components: Standalone Metastore
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Major
> Attachments: HIVE-21077.01.patch, HIVE-21077.02.patch, 
> HIVE-21077.03.patch, HIVE-21077.04.patch, HIVE-21077.05.patch, 
> HIVE-21077.06.patch, HIVE-21077.07.patch, HIVE-21077.08.branch-3.patch, 
> HIVE-21077.09.patch, HIVE-21077.10.patch
>
>
> Currently, databases do not have a creation time like we have for tables and 
> partitions.
> {noformat}
> // namespace for tables
> struct Database {
>   1: string name,
>   2: string description,
>   3: string locationUri,
>   4: map<string, string> parameters, // properties associated with the database
>   5: optional PrincipalPrivilegeSet privileges,
>   6: optional string ownerName,
>   7: optional PrincipalType ownerType,
>   8: optional string catalogName
> }
> {noformat}
> Currently, without creationTime there is no way to identify whether the copy of 
> a Database which a client has is the same as the one on the server if the name 
> is the same. Without object ids, the creationTime value is currently the only way 
> to uniquely identify an instance of a metastore object. It would be good to have 
> a Database creation time as well.
> The same applies to catalogs.
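
To illustrate the argument above: two copies with the same name can only be told apart if a second attribute, such as the creation time, is compared as well. A minimal sketch, assuming a hypothetical `DbSnapshot` holder (not the Thrift-generated `Database` class) and a creation time expressed in seconds:

```java
// Hypothetical client-side view of a database; not the Thrift-generated class.
final class DbSnapshot {
  final String name;
  final long createTimeSeconds; // assumed field; the struct quoted above has none

  DbSnapshot(String name, long createTimeSeconds) {
    this.name = name;
    this.createTimeSeconds = createTimeSeconds;
  }

  // True only if both the name and the creation time match, i.e. the
  // database was not dropped and re-created under the same name.
  boolean sameInstanceAs(DbSnapshot other) {
    return name.equals(other.name) && createTimeSeconds == other.createTimeSeconds;
  }
}
```

With only the name available, a cached copy and a database that was dropped and re-created under that name would compare as identical.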



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21077) Database and catalogs should have creation time

2019-01-15 Thread Bharathkrishna Guruvayoor Murali (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16743320#comment-16743320
 ] 

Bharathkrishna Guruvayoor Murali commented on HIVE-21077:
-

LGTM +1.

> Database and catalogs should have creation time
> ---
>
> Key: HIVE-21077
> URL: https://issues.apache.org/jira/browse/HIVE-21077
> Project: Hive
>  Issue Type: Improvement
>  Components: Standalone Metastore
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Major
> Attachments: HIVE-21077.01.patch, HIVE-21077.02.patch, 
> HIVE-21077.03.patch, HIVE-21077.04.patch, HIVE-21077.05.patch, 
> HIVE-21077.06.patch, HIVE-21077.07.patch
>
>
> Currently, databases do not have a creation time like we have for tables and 
> partitions.
> {noformat}
> // namespace for tables
> struct Database {
>   1: string name,
>   2: string description,
>   3: string locationUri,
>   4: map<string, string> parameters, // properties associated with the database
>   5: optional PrincipalPrivilegeSet privileges,
>   6: optional string ownerName,
>   7: optional PrincipalType ownerType,
>   8: optional string catalogName
> }
> {noformat}
> Currently, without creationTime there is no way to identify whether the copy of 
> a Database which a client has is the same as the one on the server if the name 
> is the same. Without object ids, the creationTime value is currently the only way 
> to uniquely identify an instance of a metastore object. It would be good to have 
> a Database creation time as well.
> The same applies to catalogs.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21027) Add a configuration to include entire thrift objects in HMS notifications

2018-12-20 Thread Bharathkrishna Guruvayoor Murali (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharathkrishna Guruvayoor Murali updated HIVE-21027:

Attachment: HIVE-21027.1.patch

> Add a configuration to include entire thrift objects in HMS notifications
> -
>
> Key: HIVE-21027
> URL: https://issues.apache.org/jira/browse/HIVE-21027
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: 4.0.0
>Reporter: Bharathkrishna Guruvayoor Murali
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Major
> Attachments: HIVE-21027.1.patch
>
>
> Currently, we add the full thrift objects of Table / Partition in the HMS 
> notification messages, starting from HIVE-15180.
> We can have a configuration like NOTIFICATIONS_ADD_THRIFT_OBJECTS to do this 
> under a flag.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21027) Add a configuration to include entire thrift objects in HMS notifications

2018-12-20 Thread Bharathkrishna Guruvayoor Murali (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16725669#comment-16725669
 ] 

Bharathkrishna Guruvayoor Murali commented on HIVE-21027:
-

Re-attaching the patch to run the tests again.

> Add a configuration to include entire thrift objects in HMS notifications
> -
>
> Key: HIVE-21027
> URL: https://issues.apache.org/jira/browse/HIVE-21027
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: 4.0.0
>Reporter: Bharathkrishna Guruvayoor Murali
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Major
> Attachments: HIVE-21027.1.patch
>
>
> Currently, we add the full thrift objects of Table / Partition in the HMS 
> notification messages, starting from HIVE-15180.
> We can have a configuration like NOTIFICATIONS_ADD_THRIFT_OBJECTS to do this 
> under a flag.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21027) Add a configuration to include entire thrift objects in HMS notifications

2018-12-20 Thread Bharathkrishna Guruvayoor Murali (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharathkrishna Guruvayoor Murali updated HIVE-21027:

Attachment: (was: HIVE-21027.1.patch)

> Add a configuration to include entire thrift objects in HMS notifications
> -
>
> Key: HIVE-21027
> URL: https://issues.apache.org/jira/browse/HIVE-21027
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: 4.0.0
>Reporter: Bharathkrishna Guruvayoor Murali
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Major
> Attachments: HIVE-21027.1.patch
>
>
> Currently, we add the full thrift objects of Table / Partition in the HMS 
> notification messages, starting from HIVE-15180.
> We can have a configuration like NOTIFICATIONS_ADD_THRIFT_OBJECTS to do this 
> under a flag.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21027) Add a configuration to include entire thrift objects in HMS notifications

2018-12-19 Thread Bharathkrishna Guruvayoor Murali (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharathkrishna Guruvayoor Murali updated HIVE-21027:

Status: Patch Available  (was: Open)

> Add a configuration to include entire thrift objects in HMS notifications
> -
>
> Key: HIVE-21027
> URL: https://issues.apache.org/jira/browse/HIVE-21027
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: 4.0.0
>Reporter: Bharathkrishna Guruvayoor Murali
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Major
> Attachments: HIVE-21027.1.patch
>
>
> Currently, we add the full thrift objects of Table / Partition in the HMS 
> notification messages, starting from HIVE-15180.
> We can have a configuration like NOTIFICATIONS_ADD_THRIFT_OBJECTS to do this 
> under a flag.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21027) Add a configuration to include entire thrift objects in HMS notifications

2018-12-19 Thread Bharathkrishna Guruvayoor Murali (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharathkrishna Guruvayoor Murali updated HIVE-21027:

Attachment: HIVE-21027.1.patch

> Add a configuration to include entire thrift objects in HMS notifications
> -
>
> Key: HIVE-21027
> URL: https://issues.apache.org/jira/browse/HIVE-21027
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: 4.0.0
>Reporter: Bharathkrishna Guruvayoor Murali
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Major
> Attachments: HIVE-21027.1.patch
>
>
> Currently, we add the full thrift objects of Table / Partition in the HMS 
> notification messages, starting from HIVE-15180.
> We can have a configuration like NOTIFICATIONS_ADD_THRIFT_OBJECTS to do this 
> under a flag.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (HIVE-20993) Update committer list

2018-12-19 Thread Bharathkrishna Guruvayoor Murali (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharathkrishna Guruvayoor Murali resolved HIVE-20993.
-
Resolution: Fixed

> Update committer list
> -
>
> Key: HIVE-20993
> URL: https://issues.apache.org/jira/browse/HIVE-20993
> Project: Hive
>  Issue Type: Task
>Reporter: Bharathkrishna Guruvayoor Murali
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Minor
> Attachments: HIVE-20993.patch
>
>
> Please update committer list:
> Name: Bharath Krishna
> Apache ID: bharos92
> Organization: Cloudera



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HIVE-21027) Add a configuration to include entire thrift objects in HMS notifications

2018-12-10 Thread Bharathkrishna Guruvayoor Murali (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharathkrishna Guruvayoor Murali reassigned HIVE-21027:
---


> Add a configuration to include entire thrift objects in HMS notifications
> -
>
> Key: HIVE-21027
> URL: https://issues.apache.org/jira/browse/HIVE-21027
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: 4.0.0
>Reporter: Bharathkrishna Guruvayoor Murali
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Major
>
> Currently, we add the full thrift objects of Table / Partition in the HMS 
> notification messages, starting from HIVE-15180.
> We can have a configuration like NOTIFICATIONS_ADD_THRIFT_OBJECTS to do this 
> under a flag.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HIVE-20993) Update committer list

2018-11-30 Thread Bharathkrishna Guruvayoor Murali (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharathkrishna Guruvayoor Murali reassigned HIVE-20993:
---

Assignee: Bharathkrishna Guruvayoor Murali

> Update committer list
> -
>
> Key: HIVE-20993
> URL: https://issues.apache.org/jira/browse/HIVE-20993
> Project: Hive
>  Issue Type: Task
>Reporter: Bharathkrishna Guruvayoor Murali
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Minor
> Attachments: HIVE-20993.patch
>
>
> Please update committer list:
> Name: Bharath Krishna
> Apache ID: bharos92
> Organization: Cloudera



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20993) Update committer list

2018-11-30 Thread Bharathkrishna Guruvayoor Murali (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharathkrishna Guruvayoor Murali updated HIVE-20993:

Attachment: HIVE-20993.patch

> Update committer list
> -
>
> Key: HIVE-20993
> URL: https://issues.apache.org/jira/browse/HIVE-20993
> Project: Hive
>  Issue Type: Task
>Reporter: Bharathkrishna Guruvayoor Murali
>Priority: Minor
> Attachments: HIVE-20993.patch
>
>
> Please update committer list:
> Name: Bharath Krishna
> Apache ID: bharos92
> Organization: Cloudera



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-19814) RPC Server port is always random for spark

2018-11-19 Thread Bharathkrishna Guruvayoor Murali (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-19814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharathkrishna Guruvayoor Murali updated HIVE-19814:

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Patch available for 4.0.0

> RPC Server port is always random for spark
> --
>
> Key: HIVE-19814
> URL: https://issues.apache.org/jira/browse/HIVE-19814
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: 2.3.0, 3.0.0, 2.4.0, 4.0.0
>Reporter: bounkong khamphousone
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-19814.1.patch, HIVE-19814.2.patch, 
> HIVE-19814.3.branch-2.patch, HIVE-19814.3.branch-3.patch, HIVE-19814.3.patch
>
>
> The RPC server port is always a random one. The problem is in 
> RpcConfiguration.HIVE_SPARK_RSC_CONFIGS, which doesn't include 
> SPARK_RPC_SERVER_PORT.
>  
> I found this issue while trying to run hive-on-spark inside 
> docker.
>  
> HIVE_SPARK_RSC_CONFIGS is used by HiveSparkClientFactory.initiateSparkConf 
> > SparkSessionManagerImpl.setup, and the latter calls 
> SparkClientFactory.initialize(conf), which initializes the RPC server. This 
> RPCServer is then used to create the sparkClient, which uses the RPC server 
> port as the --remote-port arg. Since initiateSparkConf ignores 
> SPARK_RPC_SERVER_PORT, it will always be a random port.
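
The effect described above can be sketched with a generic allow-list copy. This is an illustration only: `AllowListCopy`, `copyAllowed`, and the keys `rpc.server.address` and `rpc.server.port` are hypothetical and do not reproduce the actual Hive configuration names or code.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class AllowListCopy {
  // Copies only the allow-listed keys from source; every other key is dropped.
  static Map<String, String> copyAllowed(Map<String, String> source, List<String> allowedKeys) {
    Map<String, String> out = new HashMap<>();
    for (String key : allowedKeys) {
      String value = source.get(key);
      if (value != null) {
        out.put(key, value);
      }
    }
    return out;
  }
}
```

If the port key is missing from the allow-list, a port set by the user never reaches the copied configuration, and the server falls back to whatever default it uses (a random port in the case reported here).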



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Reopened] (HIVE-19093) some parts of the Driver runs from the "Background-Pool" in HS2

2018-11-19 Thread Bharathkrishna Guruvayoor Murali (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-19093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharathkrishna Guruvayoor Murali reopened HIVE-19093:
-

> some parts of the Driver runs from the "Background-Pool" in HS2
> ---
>
> Key: HIVE-19093
> URL: https://issues.apache.org/jira/browse/HIVE-19093
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Major
> Fix For: 4.0.0
>
>
> I was looking into perflog results; and the fact that Driver.run open / close 
> happens on a different thread caught my eye - this might cause real problems 
> since {{Session.get()}} will return an entirely different session in the 
> aftermath...most notably there are some lock related calls like: releaseLocks
> {code}
> 2018-04-03T08:36:53,488 DEBUG [2c81c6c1-aa6f-4609-8250-5b1a5360a8ba 
> HiveServer2-Handler-Pool: Thread-16242]: log.PerfLogger 
> (PerfLogger.java:PerfLogBegin(132)) - <PERFLOG method=Driver.run 
> from=org.apache.hadoop.hive.ql.Driver>
> 2018-04-03T08:37:21,791 DEBUG [HiveServer2-Background-Pool: Thread-16247]: 
> log.PerfLogger (PerfLogger.java:PerfLogEnd(172)) - </PERFLOG method=Driver.run 
> start=1522744613488 end=1522744641791 duration=28303 
> from=org.apache.hadoop.hive.ql.Driver>
> {code}
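
The concern above is that state looked up through a thread-local accessor is not visible from a different thread. A minimal sketch, assuming a hypothetical `SESSION` thread-local that merely stands in for the session lookup mentioned in the description:

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class ThreadLocalSessionDemo {
  // Hypothetical stand-in for per-thread session state.
  private static final ThreadLocal<String> SESSION = new ThreadLocal<>();

  // Returns the session bound to the calling thread, or null if none is set.
  static String currentSession() {
    return SESSION.get();
  }

  // Binds a session on the calling ("handler") thread, then reads the
  // thread-local from a separate pool ("background") thread.
  static String sessionSeenByBackgroundThread(String handlerSession) {
    SESSION.set(handlerSession);
    ExecutorService pool = Executors.newSingleThreadExecutor();
    try {
      Callable<String> read = SESSION::get;
      return pool.submit(read).get();
    } catch (InterruptedException | ExecutionException e) {
      throw new IllegalStateException("Background read failed", e);
    } finally {
      pool.shutdown();
    }
  }
}
```

The pool thread has never had a value set, so it does not see the handler thread's session. Code that opens on one thread and closes on another therefore operates on different per-thread state.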



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (HIVE-19093) some parts of the Driver runs from the "Background-Pool" in HS2

2018-11-19 Thread Bharathkrishna Guruvayoor Murali (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-19093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharathkrishna Guruvayoor Murali resolved HIVE-19093.
-
   Resolution: Duplicate
Fix Version/s: 4.0.0

Resolving as Duplicate

> some parts of the Driver runs from the "Background-Pool" in HS2
> ---
>
> Key: HIVE-19093
> URL: https://issues.apache.org/jira/browse/HIVE-19093
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Major
> Fix For: 4.0.0
>
>
> I was looking into perflog results; and the fact that Driver.run open / close 
> happens on a different thread caught my eye - this might cause real problems 
> since {{Session.get()}} will return an entirely different session in the 
> aftermath...most notably there are some lock related calls like: releaseLocks
> {code}
> 2018-04-03T08:36:53,488 DEBUG [2c81c6c1-aa6f-4609-8250-5b1a5360a8ba 
> HiveServer2-Handler-Pool: Thread-16242]: log.PerfLogger 
> (PerfLogger.java:PerfLogBegin(132)) - <PERFLOG method=Driver.run 
> from=org.apache.hadoop.hive.ql.Driver>
> 2018-04-03T08:37:21,791 DEBUG [HiveServer2-Background-Pool: Thread-16247]: 
> log.PerfLogger (PerfLogger.java:PerfLogEnd(172)) - </PERFLOG method=Driver.run 
> start=1522744613488 end=1522744641791 duration=28303 
> from=org.apache.hadoop.hive.ql.Driver>
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (HIVE-18928) HS2: Perflogger has a race condition

2018-11-19 Thread Bharathkrishna Guruvayoor Murali (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-18928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharathkrishna Guruvayoor Murali resolved HIVE-18928.
-
Resolution: Duplicate

> HS2: Perflogger has a race condition
> 
>
> Key: HIVE-18928
> URL: https://issues.apache.org/jira/browse/HIVE-18928
> Project: Hive
>  Issue Type: Bug
>Reporter: Gopal V
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Major
> Attachments: HIVE-18928.1.patch
>
>
> {code}
> Caused by: java.util.ConcurrentModificationException
> at java.util.HashMap$HashIterator.nextNode(HashMap.java:1437) 
> ~[?:1.8.0_112]
> at java.util.HashMap$EntryIterator.next(HashMap.java:1471) 
> ~[?:1.8.0_112]
> at java.util.HashMap$EntryIterator.next(HashMap.java:1469) 
> ~[?:1.8.0_112]
> at java.util.AbstractCollection.toArray(AbstractCollection.java:196) 
> ~[?:1.8.0_112]
> at com.google.common.collect.Iterables.toArray(Iterables.java:316) 
> ~[guava-19.0.jar:?]
> at 
> com.google.common.collect.ImmutableMap.copyOf(ImmutableMap.java:342) 
> ~[guava-19.0.jar:?]
> at 
> com.google.common.collect.ImmutableMap.copyOf(ImmutableMap.java:327) 
> ~[guava-19.0.jar:?]
> at 
> org.apache.hadoop.hive.ql.log.PerfLogger.getEndTimes(PerfLogger.java:218) 
> ~[hive-common-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1561) 
> ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1498) 
> ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:198)
>  ~[hive-service-3.0.0.3.0.0.2-132.jar:3.0.0.3.0.0.2-132]
> {code}
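
The stack trace above shows a copy of a `HashMap` failing because the map was modified while it was being iterated. One common way to avoid this class of failure is to back the map with a `ConcurrentHashMap`, whose iterators are weakly consistent and do not throw `ConcurrentModificationException`. The sketch below is not the actual Hive change (this issue was resolved as a duplicate); `EndTimeRegistry` and its methods are hypothetical names.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class EndTimeRegistry {
  // Safe for concurrent writers and for readers that iterate while writes happen.
  private final Map<String, Long> endTimes = new ConcurrentHashMap<>();

  // Records the end timestamp of a named step.
  void recordEnd(String method, long timestampMillis) {
    endTimes.put(method, timestampMillis);
  }

  // Returns an unmodifiable copy; later writes do not affect the returned map.
  Map<String, Long> snapshot() {
    return Collections.unmodifiableMap(new HashMap<>(endTimes));
  }
}
```

The snapshot may or may not include entries written while the copy is in progress, but the copy itself cannot fail with a `ConcurrentModificationException`.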



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-18928) HS2: Perflogger has a race condition

2018-11-19 Thread Bharathkrishna Guruvayoor Murali (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-18928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharathkrishna Guruvayoor Murali updated HIVE-18928:

Fix Version/s: 4.0.0

> HS2: Perflogger has a race condition
> 
>
> Key: HIVE-18928
> URL: https://issues.apache.org/jira/browse/HIVE-18928
> Project: Hive
>  Issue Type: Bug
>Reporter: Gopal V
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-18928.1.patch
>
>
> {code}
> Caused by: java.util.ConcurrentModificationException
> at java.util.HashMap$HashIterator.nextNode(HashMap.java:1437) 
> ~[?:1.8.0_112]
> at java.util.HashMap$EntryIterator.next(HashMap.java:1471) 
> ~[?:1.8.0_112]
> at java.util.HashMap$EntryIterator.next(HashMap.java:1469) 
> ~[?:1.8.0_112]
> at java.util.AbstractCollection.toArray(AbstractCollection.java:196) 
> ~[?:1.8.0_112]
> at com.google.common.collect.Iterables.toArray(Iterables.java:316) 
> ~[guava-19.0.jar:?]
> at 
> com.google.common.collect.ImmutableMap.copyOf(ImmutableMap.java:342) 
> ~[guava-19.0.jar:?]
> at 
> com.google.common.collect.ImmutableMap.copyOf(ImmutableMap.java:327) 
> ~[guava-19.0.jar:?]
> at 
> org.apache.hadoop.hive.ql.log.PerfLogger.getEndTimes(PerfLogger.java:218) 
> ~[hive-common-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1561) 
> ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1498) 
> ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:198)
>  ~[hive-service-3.0.0.3.0.0.2-132.jar:3.0.0.3.0.0.2-132]
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Reopened] (HIVE-18928) HS2: Perflogger has a race condition

2018-11-19 Thread Bharathkrishna Guruvayoor Murali (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-18928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharathkrishna Guruvayoor Murali reopened HIVE-18928:
-

> HS2: Perflogger has a race condition
> 
>
> Key: HIVE-18928
> URL: https://issues.apache.org/jira/browse/HIVE-18928
> Project: Hive
>  Issue Type: Bug
>Reporter: Gopal V
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Major
> Attachments: HIVE-18928.1.patch
>
>
> {code}
> Caused by: java.util.ConcurrentModificationException
> at java.util.HashMap$HashIterator.nextNode(HashMap.java:1437) 
> ~[?:1.8.0_112]
> at java.util.HashMap$EntryIterator.next(HashMap.java:1471) 
> ~[?:1.8.0_112]
> at java.util.HashMap$EntryIterator.next(HashMap.java:1469) 
> ~[?:1.8.0_112]
> at java.util.AbstractCollection.toArray(AbstractCollection.java:196) 
> ~[?:1.8.0_112]
> at com.google.common.collect.Iterables.toArray(Iterables.java:316) 
> ~[guava-19.0.jar:?]
> at 
> com.google.common.collect.ImmutableMap.copyOf(ImmutableMap.java:342) 
> ~[guava-19.0.jar:?]
> at 
> com.google.common.collect.ImmutableMap.copyOf(ImmutableMap.java:327) 
> ~[guava-19.0.jar:?]
> at 
> org.apache.hadoop.hive.ql.log.PerfLogger.getEndTimes(PerfLogger.java:218) 
> ~[hive-common-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1561) 
> ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1498) 
> ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:198)
>  ~[hive-service-3.0.0.3.0.0.2-132.jar:3.0.0.3.0.0.2-132]
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20916) Fix typo in JSONCreateDatabaseMessage and add test for alter database

2018-11-14 Thread Bharathkrishna Guruvayoor Murali (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16687108#comment-16687108
 ] 

Bharathkrishna Guruvayoor Murali commented on HIVE-20916:
-

+1

> Fix typo in JSONCreateDatabaseMessage and add test for alter database
> -
>
> Key: HIVE-20916
> URL: https://issues.apache.org/jira/browse/HIVE-20916
> Project: Hive
>  Issue Type: Bug
>  Components: Standalone Metastore
>Affects Versions: 4.0.0
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Minor
> Attachments: HIVE-20916.01.patch
>
>
> {code}
> public JSONCreateDatabaseMessage(String server, String servicePrincipal, Database db,
>     Long timestamp) {
>   this.server = server;
>   this.servicePrincipal = servicePrincipal;
>   this.db = db.getName();
>   this.timestamp = timestamp;
>   try {
>     this.dbJson = MessageBuilder.createDatabaseObjJson(db);
>   } catch (TException ex) {
>     throw new IllegalArgumentException("Could not serialize Function object", ex);
>   }
>   checkValid();
> }
> {code}
> The exception message should say Database instead of Function. Also, 
> {{TestDbNotificationListener#createDatabase}} should be modified to make sure 
> that the database object deserialized from the dbJson field matches the 
> original database object.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20512) Improve record and memory usage logging in SparkRecordHandler

2018-11-12 Thread Bharathkrishna Guruvayoor Murali (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharathkrishna Guruvayoor Murali updated HIVE-20512:

Attachment: HIVE-20512.92.patch

> Improve record and memory usage logging in SparkRecordHandler
> -
>
> Key: HIVE-20512
> URL: https://issues.apache.org/jira/browse/HIVE-20512
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Major
> Attachments: HIVE-20512.1.patch, HIVE-20512.2.patch, 
> HIVE-20512.3.patch, HIVE-20512.4.patch, HIVE-20512.5.patch, 
> HIVE-20512.6.patch, HIVE-20512.7.patch, HIVE-20512.8.patch, 
> HIVE-20512.9.patch, HIVE-20512.91.patch, HIVE-20512.92.patch
>
>
> We currently log memory usage and # of records processed in Spark tasks, but 
> we should improve the methodology for how frequently we log this info. 
> Currently we use the following code:
> {code:java}
> private long getNextLogThreshold(long currentThreshold) {
>   // A very simple counter to keep track of number of rows processed by the
>   // reducer. It dumps every 1 million times, and quickly before that.
>   if (currentThreshold >= 1000000) {
>     return currentThreshold + 1000000;
>   }
>   return 10 * currentThreshold;
> }
> {code}
> The issue is that after a while, the increase by 10x factor means that you 
> have to process a huge # of records before this gets triggered.
> A better approach would be to log this info at a given interval. This would 
> help in debugging tasks that are seemingly hung.
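
The interval-based approach suggested above could be sketched as follows. This is an assumption-laden illustration, not the patch attached to this issue: `IntervalLogGate` and `shouldLog` are hypothetical names, and the caller supplies the clock value so the logic stays easy to test.

```java
final class IntervalLogGate {
  private final long intervalMillis;
  private long lastLogMillis;

  IntervalLogGate(long intervalMillis, long startMillis) {
    if (intervalMillis <= 0) {
      throw new IllegalArgumentException("intervalMillis must be positive");
    }
    this.intervalMillis = intervalMillis;
    this.lastLogMillis = startMillis;
  }

  // Returns true if at least one full interval has elapsed since the last
  // time this method returned true (or since startMillis).
  boolean shouldLog(long nowMillis) {
    if (nowMillis - lastLogMillis >= intervalMillis) {
      lastLogMillis = nowMillis;
      return true;
    }
    return false;
  }
}
```

A record-processing loop could call `shouldLog(System.currentTimeMillis())` after each record and emit the record count and memory usage whenever it returns true, so a slow task still logs at a steady rate regardless of how many records it has processed.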



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20512) Improve record and memory usage logging in SparkRecordHandler

2018-11-12 Thread Bharathkrishna Guruvayoor Murali (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharathkrishna Guruvayoor Murali updated HIVE-20512:

Attachment: HIVE-20512.91.patch

> Improve record and memory usage logging in SparkRecordHandler
> -
>
> Key: HIVE-20512
> URL: https://issues.apache.org/jira/browse/HIVE-20512
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Major
> Attachments: HIVE-20512.1.patch, HIVE-20512.2.patch, 
> HIVE-20512.3.patch, HIVE-20512.4.patch, HIVE-20512.5.patch, 
> HIVE-20512.6.patch, HIVE-20512.7.patch, HIVE-20512.8.patch, 
> HIVE-20512.9.patch, HIVE-20512.91.patch
>
>
> We currently log memory usage and # of records processed in Spark tasks, but 
> we should improve the methodology for how frequently we log this info. 
> Currently we use the following code:
> {code:java}
> private long getNextLogThreshold(long currentThreshold) {
>   // A very simple counter to keep track of number of rows processed by the
>   // reducer. It dumps every 1 million times, and quickly before that.
>   if (currentThreshold >= 1000000) {
>     return currentThreshold + 1000000;
>   }
>   return 10 * currentThreshold;
> }
> {code}
> The issue is that after a while, the increase by 10x factor means that you 
> have to process a huge # of records before this gets triggered.
> A better approach would be to log this info at a given interval. This would 
> help in debugging tasks that are seemingly hung.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20512) Improve record and memory usage logging in SparkRecordHandler

2018-11-12 Thread Bharathkrishna Guruvayoor Murali (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharathkrishna Guruvayoor Murali updated HIVE-20512:

Attachment: (was: HIVE-20512.10.patch)

> Improve record and memory usage logging in SparkRecordHandler
> -
>
> Key: HIVE-20512
> URL: https://issues.apache.org/jira/browse/HIVE-20512
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Major
> Attachments: HIVE-20512.1.patch, HIVE-20512.2.patch, 
> HIVE-20512.3.patch, HIVE-20512.4.patch, HIVE-20512.5.patch, 
> HIVE-20512.6.patch, HIVE-20512.7.patch, HIVE-20512.8.patch, HIVE-20512.9.patch
>
>
> We currently log memory usage and # of records processed in Spark tasks, but 
> we should improve the methodology for how frequently we log this info. 
> Currently we use the following code:
> {code:java}
> private long getNextLogThreshold(long currentThreshold) {
>   // A very simple counter to keep track of number of rows processed by the
>   // reducer. It dumps every 1 million times, and quickly before that.
>   if (currentThreshold >= 1000000) {
>     return currentThreshold + 1000000;
>   }
>   return 10 * currentThreshold;
> }
> {code}
> The issue is that after a while, the increase by 10x factor means that you 
> have to process a huge # of records before this gets triggered.
> A better approach would be to log this info at a given interval. This would 
> help in debugging tasks that are seemingly hung.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20512) Improve record and memory usage logging in SparkRecordHandler

2018-11-12 Thread Bharathkrishna Guruvayoor Murali (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharathkrishna Guruvayoor Murali updated HIVE-20512:

Attachment: HIVE-20512.10.patch

> Improve record and memory usage logging in SparkRecordHandler
> -
>
> Key: HIVE-20512
> URL: https://issues.apache.org/jira/browse/HIVE-20512
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Major
> Attachments: HIVE-20512.1.patch, HIVE-20512.10.patch, 
> HIVE-20512.2.patch, HIVE-20512.3.patch, HIVE-20512.4.patch, 
> HIVE-20512.5.patch, HIVE-20512.6.patch, HIVE-20512.7.patch, 
> HIVE-20512.8.patch, HIVE-20512.9.patch
>
>
> We currently log memory usage and # of records processed in Spark tasks, but 
> we should improve the methodology for how frequently we log this info. 
> Currently we use the following code:
> {code:java}
> private long getNextLogThreshold(long currentThreshold) {
>   // A very simple counter to keep track of number of rows processed by the
>   // reducer. It dumps every 1 million times, and quickly before that.
>   if (currentThreshold >= 1000000) {
>     return currentThreshold + 1000000;
>   }
>   return 10 * currentThreshold;
> }
> {code}
> The issue is that after a while, the increase by 10x factor means that you 
> have to process a huge # of records before this gets triggered.
> A better approach would be to log this info at a given interval. This would 
> help in debugging tasks that are seemingly hung.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20512) Improve record and memory usage logging in SparkRecordHandler

2018-11-09 Thread Bharathkrishna Guruvayoor Murali (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharathkrishna Guruvayoor Murali updated HIVE-20512:

Attachment: (was: HIVE-20512.9.patch)

> Improve record and memory usage logging in SparkRecordHandler
> -
>
> Key: HIVE-20512
> URL: https://issues.apache.org/jira/browse/HIVE-20512
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Major
> Attachments: HIVE-20512.1.patch, HIVE-20512.2.patch, 
> HIVE-20512.3.patch, HIVE-20512.4.patch, HIVE-20512.5.patch, 
> HIVE-20512.6.patch, HIVE-20512.7.patch, HIVE-20512.8.patch, HIVE-20512.9.patch
>
>
> We currently log memory usage and # of records processed in Spark tasks, but 
> we should improve the methodology for how frequently we log this info. 
> Currently we use the following code:
> {code:java}
> private long getNextLogThreshold(long currentThreshold) {
> // A very simple counter to keep track of number of rows processed by the
> // reducer. It dumps
> // every 1 million times, and quickly before that
> if (currentThreshold >= 100) {
>   return currentThreshold + 100;
> }
> return 10 * currentThreshold;
>   }
> {code}
> The issue is that after a while, the increase by 10x factor means that you 
> have to process a huge # of records before this gets triggered.
> A better approach would be to log this info at a given interval. This would 
> help in debugging tasks that are seemingly hung.





[jira] [Updated] (HIVE-20512) Improve record and memory usage logging in SparkRecordHandler

2018-11-09 Thread Bharathkrishna Guruvayoor Murali (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharathkrishna Guruvayoor Murali updated HIVE-20512:

Attachment: HIVE-20512.9.patch

> Improve record and memory usage logging in SparkRecordHandler
> -
>
> Key: HIVE-20512
> URL: https://issues.apache.org/jira/browse/HIVE-20512
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Major
> Attachments: HIVE-20512.1.patch, HIVE-20512.2.patch, 
> HIVE-20512.3.patch, HIVE-20512.4.patch, HIVE-20512.5.patch, 
> HIVE-20512.6.patch, HIVE-20512.7.patch, HIVE-20512.8.patch, HIVE-20512.9.patch
>
>
> We currently log memory usage and # of records processed in Spark tasks, but 
> we should improve the methodology for how frequently we log this info. 
> Currently we use the following code:
> {code:java}
> private long getNextLogThreshold(long currentThreshold) {
> // A very simple counter to keep track of number of rows processed by the
> // reducer. It dumps
> // every 1 million times, and quickly before that
> if (currentThreshold >= 100) {
>   return currentThreshold + 100;
> }
> return 10 * currentThreshold;
>   }
> {code}
> The issue is that after a while, the increase by 10x factor means that you 
> have to process a huge # of records before this gets triggered.
> A better approach would be to log this info at a given interval. This would 
> help in debugging tasks that are seemingly hung.





[jira] [Assigned] (HIVE-19925) NPE in SparkTask#printConsoleMetrics

2018-11-08 Thread Bharathkrishna Guruvayoor Murali (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-19925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharathkrishna Guruvayoor Murali reassigned HIVE-19925:
---

Assignee: (was: Bharathkrishna Guruvayoor Murali)

> NPE in SparkTask#printConsoleMetrics
> 
>
> Key: HIVE-19925
> URL: https://issues.apache.org/jira/browse/HIVE-19925
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Bharathkrishna Guruvayoor Murali
>Priority: Major
>
> When running a join query with HOS, as :
> {code:java}
> SELECT a.id FROM sample a JOIN sample b ON (a.id=b.id);{code}
> Got the following exception :
> {code:java}
> Error while processing statement: FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.spark.SparkTask. java.lang.NullPointerException
> at 
> org.apache.hadoop.hive.ql.exec.spark.SparkTask.printConsoleMetrics(SparkTask.java:229)
> at org.apache.hadoop.hive.ql.exec.spark.SparkTask.execute(SparkTask.java:166)
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:205)
> at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:97)
> at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2678)
> at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:2330)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:2001)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1701)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1695)
> at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:157)
> at 
> org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:224)
> at 
> org.apache.hive.service.cli.operation.SQLOperation.access$600(SQLOperation.java:87)
> at 
> org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:315)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1962)
> at 
> org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run(SQLOperation.java:328)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748) (state=08S01,code=1)
> {code}





[jira] [Commented] (HIVE-19354) from_utc_timestamp returns incorrect results for datetime values with timezone

2018-11-08 Thread Bharathkrishna Guruvayoor Murali (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16680429#comment-16680429
 ] 

Bharathkrishna Guruvayoor Murali commented on HIVE-19354:
-

Unassigning this Jira as I am not planning to work on it any time soon. I have 
attached a patch, but there could be a better way to do this.

> from_utc_timestamp returns incorrect results for datetime values with timezone
> --
>
> Key: HIVE-19354
> URL: https://issues.apache.org/jira/browse/HIVE-19354
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.1.0
>Reporter: Bruce Robbins
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Major
> Attachments: HIVE-19354.01.patch
>
>
> On the master branch, from_utc_timestamp returns incorrect results for 
> datetime strings that contain a timezone:
> {noformat}
> hive> select from_utc_timestamp('2000-10-10 00:00:00+00:00', 
> 'America/Los_Angeles');
> OK
> 2000-10-09 10:00:00
> Time taken: 0.294 seconds, Fetched: 1 row(s)
> hive> select from_utc_timestamp('2000-10-10 00:00:00', 'America/Los_Angeles');
> OK
> 2000-10-09 17:00:00
> Time taken: 0.121 seconds, Fetched: 1 row(s)
> hive> 
> {noformat}
> Both inputs are 2000-10-10 00:00:00 in UTC time, but I got two different 
> results.
> In version 2.3.3, from_utc_timestamp doesn't accept timezones in its input 
> strings, so it does not have this bug:
> {noformat}
> hive> select from_utc_timestamp('2000-10-10 00:00:00+00:00', 
> 'America/Los_Angeles');
> OK
> NULL
> Time taken: 5.152 seconds, Fetched: 1 row(s)
> hive> select from_utc_timestamp('2000-10-10 00:00:00', 'America/Los_Angeles');
> OK
> 2000-10-09 17:00:00
> Time taken: 0.069 seconds, Fetched: 1 row(s)
> hive> 
> {noformat}
> Since the function is expecting a UTC datetime value, it probably should 
> continue to reject input that contains a timezone component.
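
For reference, the expected conversion can be reproduced outside Hive with java.time. This is a sketch under assumptions: the class UtcToZoneSketch and its method fromUtc are hypothetical names, and the code does not reflect how Hive implements from_utc_timestamp internally.

```java
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Hypothetical illustration; not Hive's implementation of from_utc_timestamp.
public final class UtcToZoneSketch {
  private static final DateTimeFormatter FMT =
      DateTimeFormatter.ofPattern("uuuu-MM-dd HH:mm:ss");

  // Interprets the input as a UTC wall-clock time and returns the
  // wall-clock time in the requested zone. Input that carries a trailing
  // offset such as "+00:00" does not match the pattern and is rejected
  // with a DateTimeParseException.
  public static String fromUtc(String utcDateTime, String zone) {
    LocalDateTime utc = LocalDateTime.parse(utcDateTime, FMT);
    return utc.atOffset(ZoneOffset.UTC)
        .atZoneSameInstant(ZoneId.of(zone))
        .toLocalDateTime()
        .format(FMT);
  }
}
```

Calling fromUtc("2000-10-10 00:00:00", "America/Los_Angeles") yields "2000-10-09 17:00:00", which matches the result reported above for the input without a timezone.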





[jira] [Assigned] (HIVE-8446) Failing to delete data after dropping a table or database should result in error

2018-11-08 Thread Bharathkrishna Guruvayoor Murali (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-8446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharathkrishna Guruvayoor Murali reassigned HIVE-8446:
--

Assignee: (was: Bharathkrishna Guruvayoor Murali)

> Failing to delete data after dropping a table or database should result in 
> error
> 
>
> Key: HIVE-8446
> URL: https://issues.apache.org/jira/browse/HIVE-8446
> Project: Hive
>  Issue Type: Bug
>Reporter: Brock Noland
>Priority: Major
> Attachments: HIVE-8446.1.patch
>
>
> Currently if we drop a table and it fails to delete the data, the command 
> completes successfully. We should instead return an error to the user.





[jira] [Assigned] (HIVE-13157) MetaStoreEventListener.onAlter triggered for INSERT and SELECT

2018-11-08 Thread Bharathkrishna Guruvayoor Murali (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-13157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharathkrishna Guruvayoor Murali reassigned HIVE-13157:
---

Assignee: (was: Bharathkrishna Guruvayoor Murali)

> MetaStoreEventListener.onAlter triggered for INSERT and SELECT
> --
>
> Key: HIVE-13157
> URL: https://issues.apache.org/jira/browse/HIVE-13157
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 1.2.1, 4.0.0
>Reporter: Eugen Stoianovici
>Priority: Critical
>
> The event onAlter from 
> org.apache.hadoop.hive.metastore.MetaStoreEventListener is triggered when 
> INSERT or SELECT statements are executed on the target table.
> Furthermore, the value of transient_lastDdl is updated in table properties 
> for INSERT statements.





[jira] [Assigned] (HIVE-19354) from_utc_timestamp returns incorrect results for datetime values with timezone

2018-11-08 Thread Bharathkrishna Guruvayoor Murali (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-19354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharathkrishna Guruvayoor Murali reassigned HIVE-19354:
---

Assignee: (was: Bharathkrishna Guruvayoor Murali)

> from_utc_timestamp returns incorrect results for datetime values with timezone
> --
>
> Key: HIVE-19354
> URL: https://issues.apache.org/jira/browse/HIVE-19354
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.1.0
>Reporter: Bruce Robbins
>Priority: Major
> Attachments: HIVE-19354.01.patch
>
>
> On the master branch, from_utc_timestamp returns incorrect results for 
> datetime strings that contain a timezone:
> {noformat}
> hive> select from_utc_timestamp('2000-10-10 00:00:00+00:00', 
> 'America/Los_Angeles');
> OK
> 2000-10-09 10:00:00
> Time taken: 0.294 seconds, Fetched: 1 row(s)
> hive> select from_utc_timestamp('2000-10-10 00:00:00', 'America/Los_Angeles');
> OK
> 2000-10-09 17:00:00
> Time taken: 0.121 seconds, Fetched: 1 row(s)
> hive> 
> {noformat}
> Both inputs are 2000-10-10 00:00:00 in UTC time, but I got two different 
> results.
> In version 2.3.3, from_utc_timestamp doesn't accept timezones in its input 
> strings, so it does not have this bug:
> {noformat}
> hive> select from_utc_timestamp('2000-10-10 00:00:00+00:00', 
> 'America/Los_Angeles');
> OK
> NULL
> Time taken: 5.152 seconds, Fetched: 1 row(s)
> hive> select from_utc_timestamp('2000-10-10 00:00:00', 'America/Los_Angeles');
> OK
> 2000-10-09 17:00:00
> Time taken: 0.069 seconds, Fetched: 1 row(s)
> hive> 
> {noformat}
> Since the function is expecting a UTC datetime value, it probably should 
> continue to reject input that contains a timezone component.





[jira] [Commented] (HIVE-19925) NPE in SparkTask#printConsoleMetrics

2018-11-08 Thread Bharathkrishna Guruvayoor Murali (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16680421#comment-16680421
 ] 

Bharathkrishna Guruvayoor Murali commented on HIVE-19925:
-

Unassigning this Jira as I am not planning to look into this soon.

> NPE in SparkTask#printConsoleMetrics
> 
>
> Key: HIVE-19925
> URL: https://issues.apache.org/jira/browse/HIVE-19925
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Bharathkrishna Guruvayoor Murali
>Priority: Major
>
> When running a join query with HOS, as :
> {code:java}
> SELECT a.id FROM sample a JOIN sample b ON (a.id=b.id);{code}
> Got the following exception :
> {code:java}
> Error while processing statement: FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.spark.SparkTask. java.lang.NullPointerException
> at 
> org.apache.hadoop.hive.ql.exec.spark.SparkTask.printConsoleMetrics(SparkTask.java:229)
> at org.apache.hadoop.hive.ql.exec.spark.SparkTask.execute(SparkTask.java:166)
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:205)
> at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:97)
> at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2678)
> at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:2330)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:2001)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1701)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1695)
> at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:157)
> at 
> org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:224)
> at 
> org.apache.hive.service.cli.operation.SQLOperation.access$600(SQLOperation.java:87)
> at 
> org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:315)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1962)
> at 
> org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run(SQLOperation.java:328)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748) (state=08S01,code=1)
> {code}





[jira] [Commented] (HIVE-20512) Improve record and memory usage logging in SparkRecordHandler

2018-11-08 Thread Bharathkrishna Guruvayoor Murali (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16680039#comment-16680039
 ] 

Bharathkrishna Guruvayoor Murali commented on HIVE-20512:
-

Adding awaitTermination() and shutdownNow() after canceling the thread in 
close().

> Improve record and memory usage logging in SparkRecordHandler
> -
>
> Key: HIVE-20512
> URL: https://issues.apache.org/jira/browse/HIVE-20512
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Major
> Attachments: HIVE-20512.1.patch, HIVE-20512.2.patch, 
> HIVE-20512.3.patch, HIVE-20512.4.patch, HIVE-20512.5.patch, 
> HIVE-20512.6.patch, HIVE-20512.7.patch, HIVE-20512.8.patch, HIVE-20512.9.patch
>
>
> We currently log memory usage and # of records processed in Spark tasks, but 
> we should improve the methodology for how frequently we log this info. 
> Currently we use the following code:
> {code:java}
> private long getNextLogThreshold(long currentThreshold) {
> // A very simple counter to keep track of number of rows processed by the
> // reducer. It dumps
> // every 1 million times, and quickly before that
> if (currentThreshold >= 100) {
>   return currentThreshold + 100;
> }
> return 10 * currentThreshold;
>   }
> {code}
> The issue is that after a while, the increase by 10x factor means that you 
> have to process a huge # of records before this gets triggered.
> A better approach would be to log this info at a given interval. This would 
> help in debugging tasks that are seemingly hung.





[jira] [Updated] (HIVE-20512) Improve record and memory usage logging in SparkRecordHandler

2018-11-08 Thread Bharathkrishna Guruvayoor Murali (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharathkrishna Guruvayoor Murali updated HIVE-20512:

Attachment: HIVE-20512.9.patch

> Improve record and memory usage logging in SparkRecordHandler
> -
>
> Key: HIVE-20512
> URL: https://issues.apache.org/jira/browse/HIVE-20512
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Major
> Attachments: HIVE-20512.1.patch, HIVE-20512.2.patch, 
> HIVE-20512.3.patch, HIVE-20512.4.patch, HIVE-20512.5.patch, 
> HIVE-20512.6.patch, HIVE-20512.7.patch, HIVE-20512.8.patch, HIVE-20512.9.patch
>
>
> We currently log memory usage and # of records processed in Spark tasks, but 
> we should improve the methodology for how frequently we log this info. 
> Currently we use the following code:
> {code:java}
> private long getNextLogThreshold(long currentThreshold) {
> // A very simple counter to keep track of number of rows processed by the
> // reducer. It dumps
> // every 1 million times, and quickly before that
> if (currentThreshold >= 100) {
>   return currentThreshold + 100;
> }
> return 10 * currentThreshold;
>   }
> {code}
> The issue is that after a while, the increase by 10x factor means that you 
> have to process a huge # of records before this gets triggered.
> A better approach would be to log this info at a given interval. This would 
> help in debugging tasks that are seemingly hung.





[jira] [Commented] (HIVE-20512) Improve record and memory usage logging in SparkRecordHandler

2018-11-07 Thread Bharathkrishna Guruvayoor Murali (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16678684#comment-16678684
 ] 

Bharathkrishna Guruvayoor Murali commented on HIVE-20512:
-

awaitTermination causes a delay by blocking the main thread. Since we don't 
actually care whether loggerThread is executed during close, I am avoiding 
awaitTermination by canceling the scheduledFuture after shutdown, as 
suggested by Sahil.
Attaching HIVE-20512.8.patch with this change to see if the tests pass.
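
A minimal sketch of that non-blocking close is shown below. It is illustrative only and is not taken from HIVE-20512.8.patch: the class name NonBlockingCloseSketch and its accessors are hypothetical, and the field name loggerFuture stands in for the scheduledFuture mentioned above.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration; not the code from the attached patch.
public final class NonBlockingCloseSketch {
  private final ScheduledExecutorService executor =
      Executors.newSingleThreadScheduledExecutor();
  private final ScheduledFuture<?> loggerFuture;

  public NonBlockingCloseSketch(Runnable logTask, long intervalMillis) {
    loggerFuture = executor.scheduleAtFixedRate(
        logTask, intervalMillis, intervalMillis, TimeUnit.MILLISECONDS);
  }

  // Returns immediately: there is no awaitTermination, so the caller
  // is never blocked waiting for the logging thread.
  public void close() {
    executor.shutdown();        // accept no further runs
    loggerFuture.cancel(true);  // cancel the periodic task, interrupting it if it is running
  }

  public boolean isShutdown() {
    return executor.isShutdown();
  }

  public boolean isCancelled() {
    return loggerFuture.isCancelled();
  }
}
```

The trade-off is that a log call in flight during close may be interrupted, which is acceptable when the final log line is not important.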

> Improve record and memory usage logging in SparkRecordHandler
> -
>
> Key: HIVE-20512
> URL: https://issues.apache.org/jira/browse/HIVE-20512
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Major
> Attachments: HIVE-20512.1.patch, HIVE-20512.2.patch, 
> HIVE-20512.3.patch, HIVE-20512.4.patch, HIVE-20512.5.patch, 
> HIVE-20512.6.patch, HIVE-20512.7.patch, HIVE-20512.8.patch
>
>
> We currently log memory usage and # of records processed in Spark tasks, but 
> we should improve the methodology for how frequently we log this info. 
> Currently we use the following code:
> {code:java}
> private long getNextLogThreshold(long currentThreshold) {
> // A very simple counter to keep track of number of rows processed by the
> // reducer. It dumps
> // every 1 million times, and quickly before that
> if (currentThreshold >= 100) {
>   return currentThreshold + 100;
> }
> return 10 * currentThreshold;
>   }
> {code}
> The issue is that after a while, the increase by 10x factor means that you 
> have to process a huge # of records before this gets triggered.
> A better approach would be to log this info at a given interval. This would 
> help in debugging tasks that are seemingly hung.





[jira] [Updated] (HIVE-20512) Improve record and memory usage logging in SparkRecordHandler

2018-11-07 Thread Bharathkrishna Guruvayoor Murali (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharathkrishna Guruvayoor Murali updated HIVE-20512:

Attachment: HIVE-20512.8.patch

> Improve record and memory usage logging in SparkRecordHandler
> -
>
> Key: HIVE-20512
> URL: https://issues.apache.org/jira/browse/HIVE-20512
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Major
> Attachments: HIVE-20512.1.patch, HIVE-20512.2.patch, 
> HIVE-20512.3.patch, HIVE-20512.4.patch, HIVE-20512.5.patch, 
> HIVE-20512.6.patch, HIVE-20512.7.patch, HIVE-20512.8.patch
>
>
> We currently log memory usage and # of records processed in Spark tasks, but 
> we should improve the methodology for how frequently we log this info. 
> Currently we use the following code:
> {code:java}
> private long getNextLogThreshold(long currentThreshold) {
> // A very simple counter to keep track of number of rows processed by the
> // reducer. It dumps
> // every 1 million times, and quickly before that
> if (currentThreshold >= 100) {
>   return currentThreshold + 100;
> }
> return 10 * currentThreshold;
>   }
> {code}
> The issue is that after a while, the increase by 10x factor means that you 
> have to process a huge # of records before this gets triggered.
> A better approach would be to log this info at a given interval. This would 
> help in debugging tasks that are seemingly hung.





[jira] [Commented] (HIVE-20512) Improve record and memory usage logging in SparkRecordHandler

2018-11-05 Thread Bharathkrishna Guruvayoor Murali (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16676147#comment-16676147
 ] 

Bharathkrishna Guruvayoor Murali commented on HIVE-20512:
-

I am adding back shutdown and awaitTermination; according to the Javadoc that 
looks cleaner, although I felt shutdownNow would be enough for this specific 
situation. I think the tests timed out because awaitTermination blocks the 
main thread until the timeout expires. I have reduced the awaitTermination 
timeout from 30 seconds to 15; let's see if the tests still fail.
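
For context, the two-phase shutdown pattern recommended in the ExecutorService Javadoc looks roughly like the sketch below. The class and method names are hypothetical and the code is not taken from HIVE-20512.7.patch; the 15-second timeout from the comment above would be passed in by the caller.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration of the Javadoc-style two-phase shutdown.
public final class BoundedShutdownSketch {
  // Returns true if the pool terminated within the allowed time.
  public static boolean shutdownAndAwait(ExecutorService pool, long timeout, TimeUnit unit) {
    pool.shutdown(); // stop accepting new tasks
    try {
      // Block the caller for at most the timeout while queued tasks finish.
      if (!pool.awaitTermination(timeout, unit)) {
        pool.shutdownNow(); // interrupt tasks that are still running
        return pool.awaitTermination(timeout, unit);
      }
      return true;
    } catch (InterruptedException e) {
      pool.shutdownNow();
      Thread.currentThread().interrupt(); // preserve the interrupt status
      return false;
    }
  }
}
```

Because awaitTermination blocks the calling thread for up to the full timeout whenever a task does not finish, a long timeout in close() translates directly into slower task teardown.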

> Improve record and memory usage logging in SparkRecordHandler
> -
>
> Key: HIVE-20512
> URL: https://issues.apache.org/jira/browse/HIVE-20512
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Major
> Attachments: HIVE-20512.1.patch, HIVE-20512.2.patch, 
> HIVE-20512.3.patch, HIVE-20512.4.patch, HIVE-20512.5.patch, 
> HIVE-20512.6.patch, HIVE-20512.7.patch
>
>
> We currently log memory usage and # of records processed in Spark tasks, but 
> we should improve the methodology for how frequently we log this info. 
> Currently we use the following code:
> {code:java}
> private long getNextLogThreshold(long currentThreshold) {
> // A very simple counter to keep track of number of rows processed by the
> // reducer. It dumps
> // every 1 million times, and quickly before that
> if (currentThreshold >= 100) {
>   return currentThreshold + 100;
> }
> return 10 * currentThreshold;
>   }
> {code}
> The issue is that after a while, the increase by 10x factor means that you 
> have to process a huge # of records before this gets triggered.
> A better approach would be to log this info at a given interval. This would 
> help in debugging tasks that are seemingly hung.





[jira] [Updated] (HIVE-20512) Improve record and memory usage logging in SparkRecordHandler

2018-11-05 Thread Bharathkrishna Guruvayoor Murali (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharathkrishna Guruvayoor Murali updated HIVE-20512:

Attachment: HIVE-20512.7.patch

> Improve record and memory usage logging in SparkRecordHandler
> -
>
> Key: HIVE-20512
> URL: https://issues.apache.org/jira/browse/HIVE-20512
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Major
> Attachments: HIVE-20512.1.patch, HIVE-20512.2.patch, 
> HIVE-20512.3.patch, HIVE-20512.4.patch, HIVE-20512.5.patch, 
> HIVE-20512.6.patch, HIVE-20512.7.patch
>
>
> We currently log memory usage and # of records processed in Spark tasks, but 
> we should improve the methodology for how frequently we log this info. 
> Currently we use the following code:
> {code:java}
> private long getNextLogThreshold(long currentThreshold) {
> // A very simple counter to keep track of number of rows processed by the
> // reducer. It dumps
> // every 1 million times, and quickly before that
> if (currentThreshold >= 100) {
>   return currentThreshold + 100;
> }
> return 10 * currentThreshold;
>   }
> {code}
> The issue is that after a while, the increase by 10x factor means that you 
> have to process a huge # of records before this gets triggered.
> A better approach would be to log this info at a given interval. This would 
> help in debugging tasks that are seemingly hung.





[jira] [Commented] (HIVE-20512) Improve record and memory usage logging in SparkRecordHandler

2018-11-02 Thread Bharathkrishna Guruvayoor Murali (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16673291#comment-16673291
 ] 

Bharathkrishna Guruvayoor Murali commented on HIVE-20512:
-

Tests run successfully. [~stakiar], can you please push this patch to master 
if there are no further comments?

> Improve record and memory usage logging in SparkRecordHandler
> -
>
> Key: HIVE-20512
> URL: https://issues.apache.org/jira/browse/HIVE-20512
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Major
> Attachments: HIVE-20512.1.patch, HIVE-20512.2.patch, 
> HIVE-20512.3.patch, HIVE-20512.4.patch, HIVE-20512.5.patch, HIVE-20512.6.patch
>
>
> We currently log memory usage and # of records processed in Spark tasks, but 
> we should improve the methodology for how frequently we log this info. 
> Currently we use the following code:
> {code:java}
> private long getNextLogThreshold(long currentThreshold) {
> // A very simple counter to keep track of number of rows processed by the
> // reducer. It dumps
> // every 1 million times, and quickly before that
> if (currentThreshold >= 100) {
>   return currentThreshold + 100;
> }
> return 10 * currentThreshold;
>   }
> {code}
> The issue is that after a while, the increase by 10x factor means that you 
> have to process a huge # of records before this gets triggered.
> A better approach would be to log this info at a given interval. This would 
> help in debugging tasks that are seemingly hung.





[jira] [Updated] (HIVE-20512) Improve record and memory usage logging in SparkRecordHandler

2018-11-01 Thread Bharathkrishna Guruvayoor Murali (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharathkrishna Guruvayoor Murali updated HIVE-20512:

Attachment: (was: HIVE-20512.6.patch)

> Improve record and memory usage logging in SparkRecordHandler
> -
>
> Key: HIVE-20512
> URL: https://issues.apache.org/jira/browse/HIVE-20512
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Major
> Attachments: HIVE-20512.1.patch, HIVE-20512.2.patch, 
> HIVE-20512.3.patch, HIVE-20512.4.patch, HIVE-20512.5.patch, HIVE-20512.6.patch
>
>
> We currently log memory usage and # of records processed in Spark tasks, but 
> we should improve the methodology for how frequently we log this info. 
> Currently we use the following code:
> {code:java}
> private long getNextLogThreshold(long currentThreshold) {
> // A very simple counter to keep track of number of rows processed by the
> // reducer. It dumps
> // every 1 million times, and quickly before that
> if (currentThreshold >= 100) {
>   return currentThreshold + 100;
> }
> return 10 * currentThreshold;
>   }
> {code}
> The issue is that after a while, the increase by 10x factor means that you 
> have to process a huge # of records before this gets triggered.
> A better approach would be to log this info at a given interval. This would 
> help in debugging tasks that are seemingly hung.





[jira] [Updated] (HIVE-20512) Improve record and memory usage logging in SparkRecordHandler

2018-11-01 Thread Bharathkrishna Guruvayoor Murali (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharathkrishna Guruvayoor Murali updated HIVE-20512:

Attachment: HIVE-20512.6.patch

> Improve record and memory usage logging in SparkRecordHandler
> -
>
> Key: HIVE-20512
> URL: https://issues.apache.org/jira/browse/HIVE-20512
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Major
> Attachments: HIVE-20512.1.patch, HIVE-20512.2.patch, 
> HIVE-20512.3.patch, HIVE-20512.4.patch, HIVE-20512.5.patch, HIVE-20512.6.patch
>
>
> We currently log memory usage and # of records processed in Spark tasks, but 
> we should improve the methodology for how frequently we log this info. 
> Currently we use the following code:
> {code:java}
> private long getNextLogThreshold(long currentThreshold) {
> // A very simple counter to keep track of number of rows processed by the
> // reducer. It dumps
> // every 1 million times, and quickly before that
> if (currentThreshold >= 100) {
>   return currentThreshold + 100;
> }
> return 10 * currentThreshold;
>   }
> {code}
> The issue is that after a while, the increase by 10x factor means that you 
> have to process a huge # of records before this gets triggered.
> A better approach would be to log this info at a given interval. This would 
> help in debugging tasks that are seemingly hung.





[jira] [Updated] (HIVE-20512) Improve record and memory usage logging in SparkRecordHandler

2018-10-31 Thread Bharathkrishna Guruvayoor Murali (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharathkrishna Guruvayoor Murali updated HIVE-20512:

Attachment: HIVE-20512.6.patch

> Improve record and memory usage logging in SparkRecordHandler
> -
>
> Key: HIVE-20512
> URL: https://issues.apache.org/jira/browse/HIVE-20512
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Major
> Attachments: HIVE-20512.1.patch, HIVE-20512.2.patch, 
> HIVE-20512.3.patch, HIVE-20512.4.patch, HIVE-20512.5.patch, HIVE-20512.6.patch
>
>
> We currently log memory usage and # of records processed in Spark tasks, but 
> we should improve the methodology for how frequently we log this info. 
> Currently we use the following code:
> {code:java}
> private long getNextLogThreshold(long currentThreshold) {
> // A very simple counter to keep track of number of rows processed by the
> // reducer. It dumps
> // every 1 million times, and quickly before that
> if (currentThreshold >= 100) {
>   return currentThreshold + 100;
> }
> return 10 * currentThreshold;
>   }
> {code}
> The issue is that after a while, the increase by 10x factor means that you 
> have to process a huge # of records before this gets triggered.
> A better approach would be to log this info at a given interval. This would 
> help in debugging tasks that are seemingly hung.





[jira] [Commented] (HIVE-20512) Improve record and memory usage logging in SparkRecordHandler

2018-10-29 Thread Bharathkrishna Guruvayoor Murali (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16667698#comment-16667698
 ] 

Bharathkrishna Guruvayoor Murali commented on HIVE-20512:
-

Tests run locally, but here the run fails with "did not produce a TEST-*.xml file 
(likely timed out)". Attaching HIVE-20512.5.patch without awaitTermination to 
see whether the tests pass.

> Improve record and memory usage logging in SparkRecordHandler
> -
>
> Key: HIVE-20512
> URL: https://issues.apache.org/jira/browse/HIVE-20512
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Major
> Attachments: HIVE-20512.1.patch, HIVE-20512.2.patch, 
> HIVE-20512.3.patch, HIVE-20512.4.patch, HIVE-20512.5.patch
>
>
> We currently log memory usage and # of records processed in Spark tasks, but 
> we should improve the methodology for how frequently we log this info. 
> Currently we use the following code:
> {code:java}
> private long getNextLogThreshold(long currentThreshold) {
>   // A very simple counter to keep track of number of rows processed by the
>   // reducer. It dumps every 1 million times, and quickly before that
>   if (currentThreshold >= 1000000) {
>     return currentThreshold + 1000000;
>   }
>   return 10 * currentThreshold;
> }
> {code}
> The issue is that after a while, the increase by 10x factor means that you 
> have to process a huge # of records before this gets triggered.
> A better approach would be to log this info at a given interval. This would 
> help in debugging tasks that are seemingly hung.





[jira] [Updated] (HIVE-20512) Improve record and memory usage logging in SparkRecordHandler

2018-10-29 Thread Bharathkrishna Guruvayoor Murali (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharathkrishna Guruvayoor Murali updated HIVE-20512:

Attachment: HIVE-20512.5.patch

> Improve record and memory usage logging in SparkRecordHandler
> -
>
> Key: HIVE-20512
> URL: https://issues.apache.org/jira/browse/HIVE-20512
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Major
> Attachments: HIVE-20512.1.patch, HIVE-20512.2.patch, 
> HIVE-20512.3.patch, HIVE-20512.4.patch, HIVE-20512.5.patch
>
>
> We currently log memory usage and # of records processed in Spark tasks, but 
> we should improve the methodology for how frequently we log this info. 
> Currently we use the following code:
> {code:java}
> private long getNextLogThreshold(long currentThreshold) {
>   // A very simple counter to keep track of number of rows processed by the
>   // reducer. It dumps every 1 million times, and quickly before that
>   if (currentThreshold >= 1000000) {
>     return currentThreshold + 1000000;
>   }
>   return 10 * currentThreshold;
> }
> {code}
> The issue is that after a while, the increase by 10x factor means that you 
> have to process a huge # of records before this gets triggered.
> A better approach would be to log this info at a given interval. This would 
> help in debugging tasks that are seemingly hung.





[jira] [Commented] (HIVE-20512) Improve record and memory usage logging in SparkRecordHandler

2018-10-29 Thread Bharathkrishna Guruvayoor Murali (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16667405#comment-16667405
 ] 

Bharathkrishna Guruvayoor Murali commented on HIVE-20512:
-

Thanks [~stakiar] for the review.
The test failures are all related to Hive-on-Spark, so I definitely want to 
check whether those tests are working. I will follow up on that.

> Improve record and memory usage logging in SparkRecordHandler
> -
>
> Key: HIVE-20512
> URL: https://issues.apache.org/jira/browse/HIVE-20512
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Major
> Attachments: HIVE-20512.1.patch, HIVE-20512.2.patch, 
> HIVE-20512.3.patch, HIVE-20512.4.patch
>
>
> We currently log memory usage and # of records processed in Spark tasks, but 
> we should improve the methodology for how frequently we log this info. 
> Currently we use the following code:
> {code:java}
> private long getNextLogThreshold(long currentThreshold) {
>   // A very simple counter to keep track of number of rows processed by the
>   // reducer. It dumps every 1 million times, and quickly before that
>   if (currentThreshold >= 1000000) {
>     return currentThreshold + 1000000;
>   }
>   return 10 * currentThreshold;
> }
> {code}
> The issue is that after a while, the increase by 10x factor means that you 
> have to process a huge # of records before this gets triggered.
> A better approach would be to log this info at a given interval. This would 
> help in debugging tasks that are seemingly hung.





[jira] [Updated] (HIVE-20512) Improve record and memory usage logging in SparkRecordHandler

2018-10-28 Thread Bharathkrishna Guruvayoor Murali (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharathkrishna Guruvayoor Murali updated HIVE-20512:

Attachment: HIVE-20512.4.patch

> Improve record and memory usage logging in SparkRecordHandler
> -
>
> Key: HIVE-20512
> URL: https://issues.apache.org/jira/browse/HIVE-20512
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Major
> Attachments: HIVE-20512.1.patch, HIVE-20512.2.patch, 
> HIVE-20512.3.patch, HIVE-20512.4.patch
>
>
> We currently log memory usage and # of records processed in Spark tasks, but 
> we should improve the methodology for how frequently we log this info. 
> Currently we use the following code:
> {code:java}
> private long getNextLogThreshold(long currentThreshold) {
>   // A very simple counter to keep track of number of rows processed by the
>   // reducer. It dumps every 1 million times, and quickly before that
>   if (currentThreshold >= 1000000) {
>     return currentThreshold + 1000000;
>   }
>   return 10 * currentThreshold;
> }
> {code}
> The issue is that after a while, the increase by 10x factor means that you 
> have to process a huge # of records before this gets triggered.
> A better approach would be to log this info at a given interval. This would 
> help in debugging tasks that are seemingly hung.





[jira] [Updated] (HIVE-20512) Improve record and memory usage logging in SparkRecordHandler

2018-10-28 Thread Bharathkrishna Guruvayoor Murali (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharathkrishna Guruvayoor Murali updated HIVE-20512:

Attachment: (was: HIVE-20512.4.patch)

> Improve record and memory usage logging in SparkRecordHandler
> -
>
> Key: HIVE-20512
> URL: https://issues.apache.org/jira/browse/HIVE-20512
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Major
> Attachments: HIVE-20512.1.patch, HIVE-20512.2.patch, 
> HIVE-20512.3.patch, HIVE-20512.4.patch
>
>
> We currently log memory usage and # of records processed in Spark tasks, but 
> we should improve the methodology for how frequently we log this info. 
> Currently we use the following code:
> {code:java}
> private long getNextLogThreshold(long currentThreshold) {
>   // A very simple counter to keep track of number of rows processed by the
>   // reducer. It dumps every 1 million times, and quickly before that
>   if (currentThreshold >= 1000000) {
>     return currentThreshold + 1000000;
>   }
>   return 10 * currentThreshold;
> }
> {code}
> The issue is that after a while, the increase by 10x factor means that you 
> have to process a huge # of records before this gets triggered.
> A better approach would be to log this info at a given interval. This would 
> help in debugging tasks that are seemingly hung.





[jira] [Updated] (HIVE-20512) Improve record and memory usage logging in SparkRecordHandler

2018-10-26 Thread Bharathkrishna Guruvayoor Murali (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharathkrishna Guruvayoor Murali updated HIVE-20512:

Attachment: HIVE-20512.4.patch

> Improve record and memory usage logging in SparkRecordHandler
> -
>
> Key: HIVE-20512
> URL: https://issues.apache.org/jira/browse/HIVE-20512
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Major
> Attachments: HIVE-20512.1.patch, HIVE-20512.2.patch, 
> HIVE-20512.3.patch, HIVE-20512.4.patch
>
>
> We currently log memory usage and # of records processed in Spark tasks, but 
> we should improve the methodology for how frequently we log this info. 
> Currently we use the following code:
> {code:java}
> private long getNextLogThreshold(long currentThreshold) {
>   // A very simple counter to keep track of number of rows processed by the
>   // reducer. It dumps every 1 million times, and quickly before that
>   if (currentThreshold >= 1000000) {
>     return currentThreshold + 1000000;
>   }
>   return 10 * currentThreshold;
> }
> {code}
> The issue is that after a while, the increase by 10x factor means that you 
> have to process a huge # of records before this gets triggered.
> A better approach would be to log this info at a given interval. This would 
> help in debugging tasks that are seemingly hung.





[jira] [Updated] (HIVE-20512) Improve record and memory usage logging in SparkRecordHandler

2018-10-24 Thread Bharathkrishna Guruvayoor Murali (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharathkrishna Guruvayoor Murali updated HIVE-20512:

Attachment: HIVE-20512.3.patch

> Improve record and memory usage logging in SparkRecordHandler
> -
>
> Key: HIVE-20512
> URL: https://issues.apache.org/jira/browse/HIVE-20512
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Major
> Attachments: HIVE-20512.1.patch, HIVE-20512.2.patch, 
> HIVE-20512.3.patch
>
>
> We currently log memory usage and # of records processed in Spark tasks, but 
> we should improve the methodology for how frequently we log this info. 
> Currently we use the following code:
> {code:java}
> private long getNextLogThreshold(long currentThreshold) {
>   // A very simple counter to keep track of number of rows processed by the
>   // reducer. It dumps every 1 million times, and quickly before that
>   if (currentThreshold >= 1000000) {
>     return currentThreshold + 1000000;
>   }
>   return 10 * currentThreshold;
> }
> {code}
> The issue is that after a while, the increase by 10x factor means that you 
> have to process a huge # of records before this gets triggered.
> A better approach would be to log this info at a given interval. This would 
> help in debugging tasks that are seemingly hung.





[jira] [Commented] (HIVE-20512) Improve record and memory usage logging in SparkRecordHandler

2018-10-20 Thread Bharathkrishna Guruvayoor Murali (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16657965#comment-16657965
 ] 

Bharathkrishna Guruvayoor Murali commented on HIVE-20512:
-

Added review board link with updated patch addressing the above comments.
[~stakiar] 

> Improve record and memory usage logging in SparkRecordHandler
> -
>
> Key: HIVE-20512
> URL: https://issues.apache.org/jira/browse/HIVE-20512
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Major
> Attachments: HIVE-20512.1.patch, HIVE-20512.2.patch
>
>
> We currently log memory usage and # of records processed in Spark tasks, but 
> we should improve the methodology for how frequently we log this info. 
> Currently we use the following code:
> {code:java}
> private long getNextLogThreshold(long currentThreshold) {
>   // A very simple counter to keep track of number of rows processed by the
>   // reducer. It dumps every 1 million times, and quickly before that
>   if (currentThreshold >= 1000000) {
>     return currentThreshold + 1000000;
>   }
>   return 10 * currentThreshold;
> }
> {code}
> The issue is that after a while, the increase by 10x factor means that you 
> have to process a huge # of records before this gets triggered.
> A better approach would be to log this info at a given interval. This would 
> help in debugging tasks that are seemingly hung.





[jira] [Updated] (HIVE-20512) Improve record and memory usage logging in SparkRecordHandler

2018-10-20 Thread Bharathkrishna Guruvayoor Murali (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharathkrishna Guruvayoor Murali updated HIVE-20512:

Attachment: HIVE-20512.2.patch

> Improve record and memory usage logging in SparkRecordHandler
> -
>
> Key: HIVE-20512
> URL: https://issues.apache.org/jira/browse/HIVE-20512
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Major
> Attachments: HIVE-20512.1.patch, HIVE-20512.2.patch
>
>
> We currently log memory usage and # of records processed in Spark tasks, but 
> we should improve the methodology for how frequently we log this info. 
> Currently we use the following code:
> {code:java}
> private long getNextLogThreshold(long currentThreshold) {
>   // A very simple counter to keep track of number of rows processed by the
>   // reducer. It dumps every 1 million times, and quickly before that
>   if (currentThreshold >= 1000000) {
>     return currentThreshold + 1000000;
>   }
>   return 10 * currentThreshold;
> }
> {code}
> The issue is that after a while, the increase by 10x factor means that you 
> have to process a huge # of records before this gets triggered.
> A better approach would be to log this info at a given interval. This would 
> help in debugging tasks that are seemingly hung.





[jira] [Commented] (HIVE-20659) Update commons-compress to 1.18 due to security issues

2018-10-17 Thread Bharathkrishna Guruvayoor Murali (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16653966#comment-16653966
 ] 

Bharathkrishna Guruvayoor Murali commented on HIVE-20659:
-

Thanks for the review.

> Update commons-compress to 1.18 due to security issues
> --
>
> Key: HIVE-20659
> URL: https://issues.apache.org/jira/browse/HIVE-20659
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.2.1, 3.0.0, 2.3.2, 3.1.0
>Reporter: Jörn Franke
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Critical
> Attachments: HIVE-20659.1.patch
>
>
> Currently most Hive versions depend on commons-compress 1.9 or 1.4. Those 
> versions have several security issues: 
> [https://commons.apache.org/proper/commons-compress/security-reports.html]
> I propose upgrading all commons-compress dependencies in all Hive 
> (sub-)projects to at least 1.18. This will also make it easier for future 
> extensions to Hive (SerDes, UDFs, etc.) that depend on 
> commons-compress (e.g. [https://github.com/zuinnote/hadoopoffice/wiki]) to 
> integrate into Hive without manually upgrading the commons-compress library 
> in the Hive lib folder.





[jira] [Updated] (HIVE-20659) Update commons-compress to 1.18 due to security issues

2018-10-17 Thread Bharathkrishna Guruvayoor Murali (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharathkrishna Guruvayoor Murali updated HIVE-20659:

Resolution: Fixed
Status: Resolved  (was: Patch Available)

> Update commons-compress to 1.18 due to security issues
> --
>
> Key: HIVE-20659
> URL: https://issues.apache.org/jira/browse/HIVE-20659
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.2.1, 3.0.0, 2.3.2, 3.1.0
>Reporter: Jörn Franke
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Critical
> Attachments: HIVE-20659.1.patch
>
>
> Currently most Hive versions depend on commons-compress 1.9 or 1.4. Those 
> versions have several security issues: 
> [https://commons.apache.org/proper/commons-compress/security-reports.html]
> I propose upgrading all commons-compress dependencies in all Hive 
> (sub-)projects to at least 1.18. This will also make it easier for future 
> extensions to Hive (SerDes, UDFs, etc.) that depend on 
> commons-compress (e.g. [https://github.com/zuinnote/hadoopoffice/wiki]) to 
> integrate into Hive without manually upgrading the commons-compress library 
> in the Hive lib folder.





[jira] [Commented] (HIVE-20488) SparkSubmitSparkClient#launchDriver should parse exceptions, not just errors

2018-10-16 Thread Bharathkrishna Guruvayoor Murali (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16652993#comment-16652993
 ] 

Bharathkrishna Guruvayoor Murali commented on HIVE-20488:
-

Attached the same patch again to rerun the tests. All tests are passing now.

> SparkSubmitSparkClient#launchDriver should parse exceptions, not just errors
> 
>
> Key: HIVE-20488
> URL: https://issues.apache.org/jira/browse/HIVE-20488
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Major
> Attachments: HIVE-20488.1.patch
>
>
> In {{SparkSubmitSparkClient#launchDriver}} we parse the stdout / stderr of 
> {{bin/spark-submit}} for strings that contain "Error", but we should also 
> look for "Exception".
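
The check described above can be sketched as a simple line filter. This is only an illustration under stated assumptions: the class and method names are hypothetical, and the real logic in SparkSubmitSparkClient#launchDriver may be structured differently.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class DriverOutputFilter {
  // Hypothetical helper: keeps the output lines that mention "Error" or "Exception".
  static List<String> errorLines(List<String> outputLines) {
    List<String> matches = new ArrayList<>();
    for (String line : outputLines) {
      if (line.contains("Error") || line.contains("Exception")) {
        matches.add(line);
      }
    }
    return matches;
  }

  public static void main(String[] args) {
    // Made-up sample output, not taken from a real spark-submit run.
    List<String> output = Arrays.asList(
        "INFO starting driver",
        "java.lang.IllegalStateException: boom",
        "Error: missing application jar");
    // Prints the second and third sample lines.
    errorLines(output).forEach(System.out::println);
  }
}
```

Matching on "Error" alone would miss the second sample line, which is the gap the issue describes.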





[jira] [Updated] (HIVE-20512) Improve record and memory usage logging in SparkRecordHandler

2018-10-16 Thread Bharathkrishna Guruvayoor Murali (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharathkrishna Guruvayoor Murali updated HIVE-20512:

Attachment: HIVE-20512.1.patch

> Improve record and memory usage logging in SparkRecordHandler
> -
>
> Key: HIVE-20512
> URL: https://issues.apache.org/jira/browse/HIVE-20512
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Major
> Attachments: HIVE-20512.1.patch
>
>
> We currently log memory usage and # of records processed in Spark tasks, but 
> we should improve the methodology for how frequently we log this info. 
> Currently we use the following code:
> {code:java}
> private long getNextLogThreshold(long currentThreshold) {
>   // A very simple counter to keep track of number of rows processed by the
>   // reducer. It dumps every 1 million times, and quickly before that
>   if (currentThreshold >= 1000000) {
>     return currentThreshold + 1000000;
>   }
>   return 10 * currentThreshold;
> }
> {code}
> The issue is that after a while, the increase by 10x factor means that you 
> have to process a huge # of records before this gets triggered.
> A better approach would be to log this info at a given interval. This would 
> help in debugging tasks that are seemingly hung.





[jira] [Updated] (HIVE-20512) Improve record and memory usage logging in SparkRecordHandler

2018-10-16 Thread Bharathkrishna Guruvayoor Murali (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharathkrishna Guruvayoor Murali updated HIVE-20512:

Attachment: (was: HIVE-20512.1.patch)

> Improve record and memory usage logging in SparkRecordHandler
> -
>
> Key: HIVE-20512
> URL: https://issues.apache.org/jira/browse/HIVE-20512
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Major
> Attachments: HIVE-20512.1.patch
>
>
> We currently log memory usage and # of records processed in Spark tasks, but 
> we should improve the methodology for how frequently we log this info. 
> Currently we use the following code:
> {code:java}
> private long getNextLogThreshold(long currentThreshold) {
>   // A very simple counter to keep track of number of rows processed by the
>   // reducer. It dumps every 1 million times, and quickly before that
>   if (currentThreshold >= 1000000) {
>     return currentThreshold + 1000000;
>   }
>   return 10 * currentThreshold;
> }
> {code}
> The issue is that after a while, the increase by 10x factor means that you 
> have to process a huge # of records before this gets triggered.
> A better approach would be to log this info at a given interval. This would 
> help in debugging tasks that are seemingly hung.





[jira] [Updated] (HIVE-20512) Improve record and memory usage logging in SparkRecordHandler

2018-10-16 Thread Bharathkrishna Guruvayoor Murali (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharathkrishna Guruvayoor Murali updated HIVE-20512:

Attachment: HIVE-20512.1.patch
Status: Patch Available  (was: Open)

[~stakiar] Can you please check whether the approach in the attached patch looks good?

I tried adding unit tests, but I am not sure they are needed: a test would have to 
parse the log in a hard-coded way to check for the string "processed " + rowNumber + 
" rows: used memory =" and probably sleep() until the threshold to check that the 
logs are printed again. What do you suggest?
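
For illustration, the hard-coded log check mentioned above might look like the following sketch. The pattern is built from the message prefix quoted in this comment; that a number follows "used memory =" is an assumption, and the class and method names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ProcessedRowsLogMatcher {
  // Assumed message shape: "processed <rows> rows: used memory = <number>".
  private static final Pattern LINE =
      Pattern.compile("processed (\\d+) rows: used memory =\\s*(\\d+)");

  // Returns the row count if the line contains the message, or -1 otherwise.
  static long rowCount(String logLine) {
    Matcher m = LINE.matcher(logLine);
    return m.find() ? Long.parseLong(m.group(1)) : -1L;
  }

  public static void main(String[] args) {
    // Made-up log lines for demonstration.
    System.out.println(rowCount("INFO processed 1000 rows: used memory = 12345")); // prints 1000
    System.out.println(rowCount("INFO closing operators")); // prints -1
  }
}
```

A test built this way is tied to the exact message text, which is the brittleness the comment is worried about.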

> Improve record and memory usage logging in SparkRecordHandler
> -
>
> Key: HIVE-20512
> URL: https://issues.apache.org/jira/browse/HIVE-20512
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Major
> Attachments: HIVE-20512.1.patch
>
>
> We currently log memory usage and # of records processed in Spark tasks, but 
> we should improve the methodology for how frequently we log this info. 
> Currently we use the following code:
> {code:java}
> private long getNextLogThreshold(long currentThreshold) {
>   // A very simple counter to keep track of number of rows processed by the
>   // reducer. It dumps every 1 million times, and quickly before that
>   if (currentThreshold >= 1000000) {
>     return currentThreshold + 1000000;
>   }
>   return 10 * currentThreshold;
> }
> {code}
> The issue is that after a while, the increase by 10x factor means that you 
> have to process a huge # of records before this gets triggered.
> A better approach would be to log this info at a given interval. This would 
> help in debugging tasks that are seemingly hung.





[jira] [Updated] (HIVE-20488) SparkSubmitSparkClient#launchDriver should parse exceptions, not just errors

2018-10-16 Thread Bharathkrishna Guruvayoor Murali (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharathkrishna Guruvayoor Murali updated HIVE-20488:

Attachment: (was: HIVE-20488.1.patch)

> SparkSubmitSparkClient#launchDriver should parse exceptions, not just errors
> 
>
> Key: HIVE-20488
> URL: https://issues.apache.org/jira/browse/HIVE-20488
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Major
> Attachments: HIVE-20488.1.patch
>
>
> In {{SparkSubmitSparkClient#launchDriver}} we parse the stdout / stderr of 
> {{bin/spark-submit}} for strings that contain "Error", but we should also 
> look for "Exception".





[jira] [Updated] (HIVE-20488) SparkSubmitSparkClient#launchDriver should parse exceptions, not just errors

2018-10-16 Thread Bharathkrishna Guruvayoor Murali (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharathkrishna Guruvayoor Murali updated HIVE-20488:

Attachment: HIVE-20488.1.patch

> SparkSubmitSparkClient#launchDriver should parse exceptions, not just errors
> 
>
> Key: HIVE-20488
> URL: https://issues.apache.org/jira/browse/HIVE-20488
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Major
> Attachments: HIVE-20488.1.patch
>
>
> In {{SparkSubmitSparkClient#launchDriver}} we parse the stdout / stderr of 
> {{bin/spark-submit}} for strings that contain "Error", but we should also 
> look for "Exception".





[jira] [Commented] (HIVE-20679) DDL operations on hive might create large messages for DBNotification

2018-10-15 Thread Bharathkrishna Guruvayoor Murali (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16650511#comment-16650511
 ] 

Bharathkrishna Guruvayoor Murali commented on HIVE-20679:
-

Hi,

Can you please also add a review board link for the patch?

> DDL operations on hive might create large messages for DBNotification
> -
>
> Key: HIVE-20679
> URL: https://issues.apache.org/jira/browse/HIVE-20679
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.1.0
>Reporter: anishek
>Assignee: anishek
>Priority: Major
> Attachments: HIVE-20679.1.patch, HIVE-20679.2.patch, 
> HIVE-20679.3.patch, HIVE-20679.4.patch, HIVE-20679.5.patch, a.sql, b.sql
>
>
> Certain types of DDL operations might create large messages as part of 
> DBNotification. This might lead to the RDBMS throwing an error when storing 
> the message because it is too large. It will also increase the RDBMS's 
> space usage. 
> We should try to store compressed messages to handle these situations. 
> Edit: For the notification_log table, the message column for all supported 
> databases can store messages from 2GB to 4GB.
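
As a rough illustration of storing compressed messages, the sketch below gzips a message and Base64-encodes it so that it still fits in a text column. The choice of gzip plus Base64 is an assumption made for this example, not necessarily what the attached patches do, and the class and method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

class MessageCompressionSketch {
  // Gzips the UTF-8 bytes of the message and returns them Base64-encoded.
  static String compress(String message) {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (GZIPOutputStream gzip = new GZIPOutputStream(bytes)) {
      gzip.write(message.getBytes(StandardCharsets.UTF_8));
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return Base64.getEncoder().encodeToString(bytes.toByteArray());
  }

  // Reverses compress(): Base64-decodes, then gunzips back to the original string.
  static String decompress(String encoded) {
    byte[] compressed = Base64.getDecoder().decode(encoded);
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    try (GZIPInputStream gzip = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
      byte[] buffer = new byte[4096];
      int n;
      while ((n = gzip.read(buffer)) != -1) {
        out.write(buffer, 0, n);
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return new String(out.toByteArray(), StandardCharsets.UTF_8);
  }
}
```

Repetitive text compresses well, so large messages of that kind should shrink considerably; very small messages can grow slightly because of the gzip header and the Base64 overhead.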





[jira] [Updated] (HIVE-20488) SparkSubmitSparkClient#launchDriver should parse exceptions, not just errors

2018-10-15 Thread Bharathkrishna Guruvayoor Murali (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharathkrishna Guruvayoor Murali updated HIVE-20488:

Attachment: HIVE-20488.1.patch
Status: Patch Available  (was: Open)

> SparkSubmitSparkClient#launchDriver should parse exceptions, not just errors
> 
>
> Key: HIVE-20488
> URL: https://issues.apache.org/jira/browse/HIVE-20488
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Major
> Attachments: HIVE-20488.1.patch
>
>
> In {{SparkSubmitSparkClient#launchDriver}} we parse the stdout / stderr of 
> {{bin/spark-submit}} for strings that contain "Error", but we should also 
> look for "Exception".





[jira] [Assigned] (HIVE-20488) SparkSubmitSparkClient#launchDriver should parse exceptions, not just errors

2018-10-12 Thread Bharathkrishna Guruvayoor Murali (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharathkrishna Guruvayoor Murali reassigned HIVE-20488:
---

Assignee: Bharathkrishna Guruvayoor Murali

> SparkSubmitSparkClient#launchDriver should parse exceptions, not just errors
> 
>
> Key: HIVE-20488
> URL: https://issues.apache.org/jira/browse/HIVE-20488
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Major
>
> In {{SparkSubmitSparkClient#launchDriver}} we parse the stdout / stderr of 
> {{bin/spark-submit}} for strings that contain "Error", but we should also 
> look for "Exception".





[jira] [Assigned] (HIVE-20512) Improve record and memory usage logging in SparkRecordHandler

2018-10-12 Thread Bharathkrishna Guruvayoor Murali (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharathkrishna Guruvayoor Murali reassigned HIVE-20512:
---

Assignee: Bharathkrishna Guruvayoor Murali

> Improve record and memory usage logging in SparkRecordHandler
> -
>
> Key: HIVE-20512
> URL: https://issues.apache.org/jira/browse/HIVE-20512
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Major
>
> We currently log memory usage and # of records processed in Spark tasks, but 
> we should improve the methodology for how frequently we log this info. 
> Currently we use the following code:
> {code:java}
> private long getNextLogThreshold(long currentThreshold) {
>   // A very simple counter to keep track of number of rows processed by the
>   // reducer. It dumps every 1 million times, and quickly before that
>   if (currentThreshold >= 1000000) {
>     return currentThreshold + 1000000;
>   }
>   return 10 * currentThreshold;
> }
> {code}
> The issue is that after a while, the increase by 10x factor means that you 
> have to process a huge # of records before this gets triggered.
> A better approach would be to log this info at a given interval. This would 
> help in debugging tasks that are seemingly hung.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20512) Improve record and memory usage logging in SparkRecordHandler

2018-10-12 Thread Bharathkrishna Guruvayoor Murali (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16648292#comment-16648292
 ] 

Bharathkrishna Guruvayoor Murali commented on HIVE-20512:
-

Hi Sahil,

What do you mean by logging at a given interval? Do you mean logging at fixed 
intervals of time rather than after a given number of rows?

> Improve record and memory usage logging in SparkRecordHandler
> -
>
> Key: HIVE-20512
> URL: https://issues.apache.org/jira/browse/HIVE-20512
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Sahil Takiar
>Priority: Major
>
> We currently log memory usage and # of records processed in Spark tasks, but 
> we should improve the methodology for how frequently we log this info. 
> Currently we use the following code:
> {code:java}
> private long getNextLogThreshold(long currentThreshold) {
>   // A very simple counter to keep track of number of rows processed by the
>   // reducer. It dumps every 1 million times, and quickly before that
>   if (currentThreshold >= 1000000) {
>     return currentThreshold + 1000000;
>   }
>   return 10 * currentThreshold;
> }
> {code}
> The issue is that after a while, the increase by 10x factor means that you 
> have to process a huge # of records before this gets triggered.
> A better approach would be to log this info at a given interval. This would 
> help in debugging tasks that are seemingly hung.





[jira] [Updated] (HIVE-20659) Update commons-compress to 1.18 due to security issues

2018-10-11 Thread Bharathkrishna Guruvayoor Murali (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharathkrishna Guruvayoor Murali updated HIVE-20659:

Status: Patch Available  (was: Open)

> Update commons-compress to 1.18 due to security issues
> --
>
> Key: HIVE-20659
> URL: https://issues.apache.org/jira/browse/HIVE-20659
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.1.0, 2.3.2, 3.0.0, 1.2.1
>Reporter: Jörn Franke
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Critical
> Attachments: HIVE-20659.1.patch
>
>
> Currently most Hive versions depend on commons-compress 1.9 or 1.4. Those 
> versions have several security issues: 
> [https://commons.apache.org/proper/commons-compress/security-reports.html]
> I propose to upgrade all commons-compress dependencies in all Hive 
> (sub-)projects to at least 1.18. This will also make it easier for future 
> extensions to Hive (serde, udfs, etc.) that depend on 
> commons-compress (e.g. [https://github.com/zuinnote/hadoopoffice/wiki]) to 
> integrate into Hive without upgrading the commons-compress library manually 
> in the Hive lib folder.





[jira] [Updated] (HIVE-20659) Update commons-compress to 1.18 due to security issues

2018-10-11 Thread Bharathkrishna Guruvayoor Murali (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharathkrishna Guruvayoor Murali updated HIVE-20659:

Attachment: HIVE-20659.1.patch

> Update commons-compress to 1.18 due to security issues
> --
>
> Key: HIVE-20659
> URL: https://issues.apache.org/jira/browse/HIVE-20659
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.2.1, 3.0.0, 2.3.2, 3.1.0
>Reporter: Jörn Franke
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Critical
> Attachments: HIVE-20659.1.patch
>
>
> Currently most Hive versions depend on commons-compress 1.9 or 1.4. Those 
> versions have several security issues: 
> [https://commons.apache.org/proper/commons-compress/security-reports.html]
> I propose to upgrade all commons-compress dependencies in all Hive 
> (sub-)projects to at least 1.18. This will also make it easier for future 
> extensions to Hive (serde, udfs, etc.) that depend on 
> commons-compress (e.g. [https://github.com/zuinnote/hadoopoffice/wiki]) to 
> integrate into Hive without upgrading the commons-compress library manually 
> in the Hive lib folder.





[jira] [Assigned] (HIVE-20659) Update commons-compress to 1.18 due to security issues

2018-10-11 Thread Bharathkrishna Guruvayoor Murali (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharathkrishna Guruvayoor Murali reassigned HIVE-20659:
---

Assignee: Bharathkrishna Guruvayoor Murali

> Update commons-compress to 1.18 due to security issues
> --
>
> Key: HIVE-20659
> URL: https://issues.apache.org/jira/browse/HIVE-20659
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.2.1, 3.0.0, 2.3.2, 3.1.0
>Reporter: Jörn Franke
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Critical
>
> Currently most Hive versions depend on commons-compress 1.9 or 1.4. Those 
> versions have several security issues: 
> [https://commons.apache.org/proper/commons-compress/security-reports.html]
> I propose to upgrade all commons-compress dependencies in all Hive 
> (sub-)projects to at least 1.18. This will also make it easier for future 
> extensions to Hive (serde, udfs, etc.) that depend on 
> commons-compress (e.g. [https://github.com/zuinnote/hadoopoffice/wiki]) to 
> integrate into Hive without upgrading the commons-compress library manually 
> in the Hive lib folder.





[jira] [Commented] (HIVE-20600) Metastore connection leak

2018-10-11 Thread Bharathkrishna Guruvayoor Murali (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16646776#comment-16646776
 ] 

Bharathkrishna Guruvayoor Murali commented on HIVE-20600:
-

Hi,
Is this issue fixed yet?

> Metastore connection leak
> -
>
> Key: HIVE-20600
> URL: https://issues.apache.org/jira/browse/HIVE-20600
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 2.3.3
>Reporter: Damon Cortesi
>Priority: Major
> Attachments: HIVE-20600.patch, consume_threads.py
>
>
> Within the execute method of HiveServer2, there appears to be a connection 
> leak. With a fairly straightforward series of INSERT statements, the connection 
> count in the logs continues to increase over time. Under certain loads, this 
> can also consume all underlying threads of the Hive metastore and result in 
> HS2 becoming unresponsive to new connections.
> The log below is the result of some python code executing a single insert 
> statement, and then looping through a series of 10 more insert statements. We 
> can see there's one dangling connection left open after each execution 
> leaving us with 12 open connections (11 from the execute statements + 1 from 
> HS2 startup).
> {code}
> 2018-09-19T17:14:32,108 INFO [main([])]: hive.metastore 
> (HiveMetaStoreClient.java:open(481)) - Opened a connection to metastore, 
> current connections: 1
>  2018-09-19T17:14:48,175 INFO [29049f74-73c4-4f48-9cf7-b4bfe524a85b 
> HiveServer2-Handler-Pool: Thread-31([])]: hive.metastore 
> (HiveMetaStoreClient.java:open(481)) - Opened a connection to metastore, 
> current connections: 2
>  2018-09-19T17:15:05,543 INFO [HiveServer2-Background-Pool: Thread-36([])]: 
> hive.metastore (HiveMetaStoreClient.java:close(564)) - Closed a connection to 
> metastore, current connections: 1
>  2018-09-19T17:15:05,548 INFO [HiveServer2-Background-Pool: Thread-36([])]: 
> hive.metastore (HiveMetaStoreClient.java:open(481)) - Opened a connection to 
> metastore, current connections: 2
>  2018-09-19T17:15:05,932 INFO [HiveServer2-Background-Pool: Thread-36([])]: 
> hive.metastore (HiveMetaStoreClient.java:close(564)) - Closed a connection to 
> metastore, current connections: 1
>  2018-09-19T17:15:05,935 INFO [HiveServer2-Background-Pool: Thread-36([])]: 
> hive.metastore (HiveMetaStoreClient.java:open(481)) - Opened a connection to 
> metastore, current connections: 2
>  2018-09-19T17:15:06,123 INFO [HiveServer2-Background-Pool: Thread-36([])]: 
> hive.metastore (HiveMetaStoreClient.java:close(564)) - Closed a connection to 
> metastore, current connections: 1
>  2018-09-19T17:15:06,126 INFO [HiveServer2-Background-Pool: Thread-36([])]: 
> hive.metastore (HiveMetaStoreClient.java:open(481)) - Opened a connection to 
> metastore, current connections: 2
> ...
>  2018-09-19T17:15:20,626 INFO [29049f74-73c4-4f48-9cf7-b4bfe524a85b 
> HiveServer2-Handler-Pool: Thread-31([])]: hive.metastore 
> (HiveMetaStoreClient.java:open(481)) - Opened a connection to metastore, 
> current connections: 12
>  2018-09-19T17:15:21,153 INFO [HiveServer2-Background-Pool: Thread-162([])]: 
> hive.metastore (HiveMetaStoreClient.java:close(564)) - Closed a connection to 
> metastore, current connections: 11
>  2018-09-19T17:15:21,155 INFO [HiveServer2-Background-Pool: Thread-162([])]: 
> hive.metastore (HiveMetaStoreClient.java:open(481)) - Opened a connection to 
> metastore, current connections: 12
>  2018-09-19T17:15:21,306 INFO [HiveServer2-Background-Pool: Thread-162([])]: 
> hive.metastore (HiveMetaStoreClient.java:close(564)) - Closed a connection to 
> metastore, current connections: 11
>  2018-09-19T17:15:21,308 INFO [HiveServer2-Background-Pool: Thread-162([])]: 
> hive.metastore (HiveMetaStoreClient.java:open(481)) - Opened a connection to 
> metastore, current connections: 12
>  2018-09-19T17:15:21,385 INFO [HiveServer2-Background-Pool: Thread-162([])]: 
> hive.metastore (HiveMetaStoreClient.java:close(564)) - Closed a connection to 
> metastore, current connections: 11
>  2018-09-19T17:15:21,387 INFO [HiveServer2-Background-Pool: Thread-162([])]: 
> hive.metastore (HiveMetaStoreClient.java:open(481)) - Opened a connection to 
> metastore, current connections: 12
>  2018-09-19T17:15:21,541 INFO [HiveServer2-Handler-Pool: Thread-31([])]: 
> hive.metastore (HiveMetaStoreClient.java:open(481)) - Opened a connection to 
> metastore, current connections: 13
>  2018-09-19T17:15:21,542 INFO [HiveServer2-Handler-Pool: Thread-31([])]: 
> hive.metastore (HiveMetaStoreClient.java:close(564)) - Closed a connection to 
> metastore, current connections: 12
> {code}
> Attached is a simple [impyla|https://github.com/cloudera/impyla] script that 
> triggers the condition.





[jira] [Commented] (HIVE-20679) DDL operations on hive might create large messages for DBNotification

2018-10-10 Thread Bharathkrishna Guruvayoor Murali (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16645576#comment-16645576
 ] 

Bharathkrishna Guruvayoor Murali commented on HIVE-20679:
-

Hi [~anishek] ,

Were you able to run any performance tests with the compressed messages? If so, 
please share the results.
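
For anyone who wants a rough feel for the size difference before real numbers are posted, here is a small sketch. It assumes gzip followed by Base64 so that the result still fits a text column; that encoding choice and the `MessageCompression` class are my own illustration and are not taken from the attached patches.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

/** Hypothetical helper: gzip + Base64 round trip for a notification message body. */
final class MessageCompression {

  static String compress(String message) {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (GZIPOutputStream gzip = new GZIPOutputStream(bytes)) {
      gzip.write(message.getBytes(StandardCharsets.UTF_8));
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    // Base64 keeps the compressed bytes storable as text.
    return Base64.getEncoder().encodeToString(bytes.toByteArray());
  }

  static String decompress(String encoded) {
    byte[] compressed = Base64.getDecoder().decode(encoded);
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    try (GZIPInputStream gzip = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
      byte[] buffer = new byte[8192];
      int n;
      while ((n = gzip.read(buffer)) != -1) {
        out.write(buffer, 0, n);
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return new String(out.toByteArray(), StandardCharsets.UTF_8);
  }

  public static void main(String[] args) {
    // Made-up, repetitive JSON standing in for a large DDL notification message.
    StringBuilder json = new StringBuilder("{\"partitions\":[");
    for (int i = 0; i < 1000; i++) {
      json.append("{\"name\":\"ds=2018-10-10/part=").append(i).append("\"},");
    }
    json.append("{}]}");
    String packed = compress(json.toString());
    System.out.println("original chars:   " + json.length());
    System.out.println("compressed chars: " + packed.length());
  }
}
```

This only shows the size effect on a synthetic message; it says nothing about the CPU cost on the metastore side, which is what the performance tests would need to measure.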

> DDL operations on hive might create large messages for DBNotification
> -
>
> Key: HIVE-20679
> URL: https://issues.apache.org/jira/browse/HIVE-20679
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.1.0
>Reporter: anishek
>Assignee: anishek
>Priority: Major
> Attachments: HIVE-20679.1.patch, HIVE-20679.2.patch, 
> HIVE-20679.3.patch, HIVE-20679.4.patch, a.sql, b.sql
>
>
> Certain types of DDL operations might create large messages as part of 
> DBNotification. This might lead to the RDBMS throwing an error when storing 
> the message since its size is too large. It will also increase the footprint 
> of the RDBMS space usage. 
> We should try to store compressed messages to handle these situations. 
> Edit: For the notification_log table, the message column for all supported 
> databases can store messages from 2GB to 4GB.





[jira] [Comment Edited] (HIVE-20545) Ability to exclude potentially large parameters in HMS Notifications

2018-10-08 Thread Bharathkrishna Guruvayoor Murali (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16640211#comment-16640211
 ] 

Bharathkrishna Guruvayoor Murali edited comment on HIVE-20545 at 10/8/18 6:32 
PM:
--

Hi [~anishek] ,

An example would be when Impala writes stats information to Partition objects 
and accesses it, as shown here: 
 [Impala 
stats|https://github.com/apache/impala/blob/d48ffc2d45b2a9d4b9c730bba5677d3096311a25/fe/src/main/java/org/apache/impala/catalog/PartitionStatsUtil.java#L46]

But this information is not relevant as Partition metadata for any other 
purpose. Also, these parameters are considerably large. In some 
preliminary tests, we observed a performance hit as the message size of HMS 
notifications increases, so we can use this configuration to filter out 
parameters like the ones mentioned above.
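
A minimal sketch of the filtering idea, under stated assumptions: `ParameterFilter` is a hypothetical class, the parameter keys are illustrative, and full-match regex semantics are my assumption; how the patch reads the patterns from HiveConf is not shown here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

/** Hypothetical helper: drops parameters whose keys match any exclude pattern. */
final class ParameterFilter {
  private final List<Pattern> excludePatterns = new ArrayList<>();

  ParameterFilter(List<String> regexes) {
    for (String regex : regexes) {
      excludePatterns.add(Pattern.compile(regex));
    }
  }

  /** Returns a filtered copy; the input map is left untouched. */
  Map<String, String> filter(Map<String, String> parameters) {
    Map<String, String> kept = new LinkedHashMap<>();
    for (Map.Entry<String, String> entry : parameters.entrySet()) {
      if (!isExcluded(entry.getKey())) {
        kept.put(entry.getKey(), entry.getValue());
      }
    }
    return kept;
  }

  private boolean isExcluded(String key) {
    for (Pattern pattern : excludePatterns) {
      if (pattern.matcher(key).matches()) { // assumption: the whole key must match
        return true;
      }
    }
    return false;
  }

  public static void main(String[] args) {
    Map<String, String> params = new LinkedHashMap<>();
    params.put("impala_intermediate_stats_chunk0", "...large encoded stats..."); // illustrative key
    params.put("transient_lastDdlTime", "1538760000");
    ParameterFilter filter = new ParameterFilter(Arrays.asList("impala_intermediate_stats_.*"));
    System.out.println(filter.filter(params).keySet());
  }
}
```

Filtering into a copy rather than mutating the Table/Partition parameters in place keeps the stored metadata intact; only the serialized notification would lose the large entries.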

 

Edit: Updated the link, which previously pointed to the wrong page.


was (Author: bharos92):
Hi [~anishek] ,

An example would be when Impala writes stats information to Partition objects 
and accesses it, as shown here: 
[Impala 
stats|https://github.com/apache/impala/blob/d48ffc2d45b2a9d4b9c730bba5677d3096311a25/fe/src/main/java/org/apache/impala/catalog/PartitionStatsUtil.java#L46]

But this information is not relevant as Partition metadata for any other 
purpose. Also, these parameters are considerably large. In some 
preliminary tests, we observed a performance hit as the message size of HMS 
notifications increases, so we can use this configuration to filter out 
parameters like the ones mentioned above.

> Ability to exclude potentially large parameters in HMS Notifications
> 
>
> Key: HIVE-20545
> URL: https://issues.apache.org/jira/browse/HIVE-20545
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: 3.1.0, 4.0.0
>Reporter: Bharathkrishna Guruvayoor Murali
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Major
> Attachments: HIVE-20545.1.patch, HIVE-20545.2.patch, 
> HIVE-20545.3.branch-3.patch, HIVE-20545.3.patch, HIVE-20545.4.patch, 
> HIVE-20545.6.patch, HIVE-20545.7.patch
>
>
> Clients can add large-sized parameters in Table/Partition objects. So we need 
> to enable adding regex patterns through HiveConf to match parameters to be 
> filtered from table and partition objects before serialization in HMS 
> notifications.





[jira] [Comment Edited] (HIVE-20545) Ability to exclude potentially large parameters in HMS Notifications

2018-10-08 Thread Bharathkrishna Guruvayoor Murali (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16640211#comment-16640211
 ] 

Bharathkrishna Guruvayoor Murali edited comment on HIVE-20545 at 10/8/18 6:19 
PM:
--

Hi [~anishek] ,

An example would be when Impala writes stats information to Partition objects 
and accesses it, as shown here: 
[Impala 
stats|https://github.com/apache/impala/blob/d48ffc2d45b2a9d4b9c730bba5677d3096311a25/fe/src/main/java/org/apache/impala/catalog/PartitionStatsUtil.java#L46]

But this information is not relevant as Partition metadata for any other 
purpose. Also, these parameters are considerably large. In some 
preliminary tests, we observed a performance hit as the message size of HMS 
notifications increases, so we can use this configuration to filter out 
parameters like the ones mentioned above.


was (Author: bharos92):
Hi [~anishek] ,

An example would be when Impala writes stats information to Partition objects 
and accesses it, as shown here: 
[Impala 
stats|http://github.mtv.cloudera.com/CDH/Impala/blob/6f2d928734a33ace15ec6abd5659651173b9e69e/fe/src/main/java/org/apache/impala/catalog/PartitionStatsUtil.java#L45]

But this information is not relevant as Partition metadata for any other 
purpose. Also, these parameters are considerably large. In some 
preliminary tests, we observed a performance hit as the message size of HMS 
notifications increases, so we can use this configuration to filter out 
parameters like the ones mentioned above.

> Ability to exclude potentially large parameters in HMS Notifications
> 
>
> Key: HIVE-20545
> URL: https://issues.apache.org/jira/browse/HIVE-20545
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: 3.1.0, 4.0.0
>Reporter: Bharathkrishna Guruvayoor Murali
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Major
> Attachments: HIVE-20545.1.patch, HIVE-20545.2.patch, 
> HIVE-20545.3.branch-3.patch, HIVE-20545.3.patch, HIVE-20545.4.patch, 
> HIVE-20545.6.patch, HIVE-20545.7.patch
>
>
> Clients can add large-sized parameters in Table/Partition objects. So we need 
> to enable adding regex patterns through HiveConf to match parameters to be 
> filtered from table and partition objects before serialization in HMS 
> notifications.





[jira] [Comment Edited] (HIVE-13157) MetaStoreEventListener.onAlter triggered for INSERT and SELECT

2018-10-05 Thread Bharathkrishna Guruvayoor Murali (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-13157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16640468#comment-16640468
 ] 

Bharathkrishna Guruvayoor Murali edited comment on HIVE-13157 at 10/5/18 11:26 
PM:
---

I have observed that 2 alter notifications are created for an insert, while only 
one is actually needed (still present in master, 4.0).
My understanding is that loadTable in Hive.java: 
[Hive.java#L2630|https://github.com/apache/hive/blob/a4b087b18bd5b0b4023bced68c85cf1e16301fed/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java#L2630]
 calls alterTable even when there is a stats change, and the only change that 
happens is to transient_lastDdl, because the actual alter of stats happens in 
the next alter_table event that follows this.

I wanted to change the code in the following way:
{code:java}
if (hasFollowingStatsTask) {
  environmentContext = new EnvironmentContext();
  environmentContext.putToProperties(StatsSetupConst.DO_NOT_UPDATE_STATS, StatsSetupConst.TRUE);
} else {
  alterTable(tbl, false, environmentContext, true);
}
{code}
The change above moves alterTable into the else branch, i.e. it runs only when 
there is no stats task to follow. But I do not know if loadTable is used in a 
different code path where this alterTable is useful irrespective of the stats change.

+ [~sershe] [~vihangk1] [~akolb] [~pvary] What do you think about this? Can you 
think of any case where alterTable should not be in the else branch?


was (Author: bharos92):
This is still present in master. I was looking into the same issue. 
My understanding is that loadTable in Hive.java: 
[Hive.java#L2630|https://github.com/apache/hive/blob/a4b087b18bd5b0b4023bced68c85cf1e16301fed/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java#L2630]
 calls alterTable even when there is a stats change, and the only change that 
happens is to transient_lastDdl, because the actual alter of stats happens in 
the next alter_table event that follows this.

I wanted to change the code in the following way:
{code:java}
if (hasFollowingStatsTask) {
  environmentContext = new EnvironmentContext();
  environmentContext.putToProperties(StatsSetupConst.DO_NOT_UPDATE_STATS, StatsSetupConst.TRUE);
} else {
  alterTable(tbl, false, environmentContext, true);
}
{code}
The change above moves alterTable into the else branch, i.e. it runs only when 
there is no stats task to follow. But I do not know if loadTable is used in a 
different code path where this alterTable is useful irrespective of the stats change.

+ [~sershe] [~vihangk1] [~akolb] [~pvary] What do you think about this? Can you 
think of any case where alterTable should not be in the else branch?

> MetaStoreEventListener.onAlter triggered for INSERT and SELECT
> --
>
> Key: HIVE-13157
> URL: https://issues.apache.org/jira/browse/HIVE-13157
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 1.2.1, 4.0.0
>Reporter: Eugen Stoianovici
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Critical
>
> The event onAlter from 
> org.apache.hadoop.hive.metastore.MetaStoreEventListener is triggered when 
> INSERT or SELECT statements are executed on the target table.
> Furthermore, the value of transient_lastDdl is updated in table properties 
> for INSERT statements.





[jira] [Updated] (HIVE-13157) MetaStoreEventListener.onAlter triggered for INSERT and SELECT

2018-10-05 Thread Bharathkrishna Guruvayoor Murali (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-13157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharathkrishna Guruvayoor Murali updated HIVE-13157:

Affects Version/s: 4.0.0

> MetaStoreEventListener.onAlter triggered for INSERT and SELECT
> --
>
> Key: HIVE-13157
> URL: https://issues.apache.org/jira/browse/HIVE-13157
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 1.2.1, 4.0.0
>Reporter: Eugen Stoianovici
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Critical
>
> The event onAlter from 
> org.apache.hadoop.hive.metastore.MetaStoreEventListener is triggered when 
> INSERT or SELECT statements are executed on the target table.
> Furthermore, the value of transient_lastDdl is updated in table properties 
> for INSERT statements.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HIVE-13157) MetaStoreEventListener.onAlter triggered for INSERT and SELECT

2018-10-05 Thread Bharathkrishna Guruvayoor Murali (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-13157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharathkrishna Guruvayoor Murali reassigned HIVE-13157:
---

Assignee: Bharathkrishna Guruvayoor Murali

> MetaStoreEventListener.onAlter triggered for INSERT and SELECT
> --
>
> Key: HIVE-13157
> URL: https://issues.apache.org/jira/browse/HIVE-13157
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 1.2.1, 4.0.0
>Reporter: Eugen Stoianovici
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Critical
>
> The event onAlter from 
> org.apache.hadoop.hive.metastore.MetaStoreEventListener is triggered when 
> INSERT or SELECT statements are executed on the target table.
> Furthermore, the value of transient_lastDdl is updated in table properties 
> for INSERT statements.





[jira] [Commented] (HIVE-13157) MetaStoreEventListener.onAlter triggered for INSERT and SELECT

2018-10-05 Thread Bharathkrishna Guruvayoor Murali (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-13157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16640468#comment-16640468
 ] 

Bharathkrishna Guruvayoor Murali commented on HIVE-13157:
-

This is still present in master. I was looking into the same issue. 
My understanding is that loadTable in Hive.java: 
[Hive.java#L2630|https://github.com/apache/hive/blob/a4b087b18bd5b0b4023bced68c85cf1e16301fed/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java#L2630]
 calls alterTable even when there is a stats change, and the only change that 
happens is to transient_lastDdl, because the actual alter of stats happens in 
the next alter_table event that follows this.

I wanted to change the code in the following way:
{code:java}
if (hasFollowingStatsTask) {
  environmentContext = new EnvironmentContext();
  environmentContext.putToProperties(StatsSetupConst.DO_NOT_UPDATE_STATS, StatsSetupConst.TRUE);
} else {
  alterTable(tbl, false, environmentContext, true);
}
{code}
The change above moves alterTable into the else branch, i.e. it runs only when 
there is no stats task to follow. But I do not know if loadTable is used in a 
different code path where this alterTable is useful irrespective of the stats change.

+ [~sershe] [~vihangk1] [~akolb] [~pvary] What do you think about this? Can you 
think of any case where alterTable should not be in the else branch?

> MetaStoreEventListener.onAlter triggered for INSERT and SELECT
> --
>
> Key: HIVE-13157
> URL: https://issues.apache.org/jira/browse/HIVE-13157
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 1.2.1
>Reporter: Eugen Stoianovici
>Priority: Critical
>
> The event onAlter from 
> org.apache.hadoop.hive.metastore.MetaStoreEventListener is triggered when 
> INSERT or SELECT statements are executed on the target table.
> Furthermore, the value of transient_lastDdl is updated in table properties 
> for INSERT statements.





[jira] [Commented] (HIVE-20698) Better error instead of NPE when timestamp is null for any row when ingesting to druid

2018-10-05 Thread Bharathkrishna Guruvayoor Murali (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16640393#comment-16640393
 ] 

Bharathkrishna Guruvayoor Murali commented on HIVE-20698:
-

LGTM.
Just a nit: 
{code:java}
Preconditions.checkNotNull(timestamp,"Timestamp column cannot have null value");
{code}
Needs a space after the comma.
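
For readers who want to see what such a check buys, here is a tiny standalone sketch. It swaps in the JDK's `Objects.requireNonNull` for Guava's `Preconditions.checkNotNull` so that it runs without extra dependencies; `NullTimestampCheck` is a hypothetical class and not the DruidSerDe code.

```java
import java.util.Objects;

/** Hypothetical stand-in showing a null check that carries a descriptive message. */
final class NullTimestampCheck {

  static long requireTimestamp(Long timestamp) {
    // Still throws NullPointerException on null, but with a message naming the cause.
    return Objects.requireNonNull(timestamp, "Timestamp column cannot have null value");
  }

  public static void main(String[] args) {
    System.out.println(requireTimestamp(1538760000000L));
    try {
      requireTimestamp(null);
    } catch (NullPointerException e) {
      System.out.println("rejected: " + e.getMessage());
    }
  }
}
```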

> Better error instead of NPE when timestamp is null for any row when ingesting 
> to druid
> --
>
> Key: HIVE-20698
> URL: https://issues.apache.org/jira/browse/HIVE-20698
> Project: Hive
>  Issue Type: Improvement
>  Components: Druid integration
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
>Priority: Major
> Attachments: HIVE-20698.patch
>
>
> Currently when ingesting data to druid we get a weird NPE when timestamp is 
> null for any row. 
> We should provide an error with a better message that helps the user understand 
> what is actually wrong. 
> {code} 
> Caused by: java.lang.NullPointerException
>   at 
> org.apache.hadoop.hive.druid.serde.DruidSerDe.serialize(DruidSerDe.java:364)
>   at 
> org.apache.hadoop.hive.ql.exec.FileSinkOperator.process(FileSinkOperator.java:957)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorFileSinkOperator.process(VectorFileSinkOperator.java:111)
>   at org.apache.hadoop.hive.ql.exec.Operator.vectorForward(Operator.java:965)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:938)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.process(VectorSelectOperator.java:158)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.processVectorGroup(ReduceRecordSource.java:480)
> {code}





[jira] [Updated] (HIVE-20610) TestDbNotificationListener should not use /tmp directory

2018-10-05 Thread Bharathkrishna Guruvayoor Murali (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharathkrishna Guruvayoor Murali updated HIVE-20610:

   Resolution: Fixed
Fix Version/s: 4.0.0
   Status: Resolved  (was: Patch Available)

> TestDbNotificationListener should not use /tmp directory
> 
>
> Key: HIVE-20610
> URL: https://issues.apache.org/jira/browse/HIVE-20610
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0, 4.0.0
>Reporter: Bharathkrishna Guruvayoor Murali
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-20610.1.patch, HIVE-20610.2.patch, 
> HIVE-20610.3.patch, HIVE-20610.4.patch
>
>
> Using the /tmp directory causes exceptions in tests like dropTable:
> {code:java}
> 2018-09-19T06:42:04,818  INFO [main] metastore.HiveMetaStore: 0: drop_table : 
> tbl=hive.default.droptbl
> 2018-09-19T06:42:04,819  INFO [main] HiveMetaStore.audit: ugi=hiveptest   
> ip=unknown-ip-addr  cmd=drop_table : tbl=hive.default.droptbl   
> 2018-09-19T06:42:05,072  WARN [main] fs.FileUtil: Failed to delete file or 
> dir [/tmp/.ICE-unix]: it still exists.
> 2018-09-19T06:42:05,072  WARN [main] fs.FileUtil: Failed to delete file or 
> dir [/tmp/.XIM-unix]: it still exists.
> 2018-09-19T06:42:05,072  WARN [main] fs.FileUtil: Failed to delete file or 
> dir [/tmp/.X11-unix]: it still exists.
> 2018-09-19T06:42:05,072  WARN [main] fs.FileUtil: Failed to delete file or 
> dir [/tmp/hsperfdata_root]: it still exists.
> 2018-09-19T06:42:05,072  WARN [main] fs.FileUtil: Failed to delete file or 
> dir [/tmp/.font-unix]: it still exists.
> 2018-09-19T06:42:05,072  WARN [main] fs.FileUtil: Failed to delete file or 
> dir [/tmp/.Test-unix]: it still exists.
> 2018-09-19T06:42:05,072 ERROR [main] utils.FileUtils: Failed to delete 
> file:/tmp
> 2018-09-19T06:42:05,072 ERROR [main] utils.MetaStoreUtils: Got exception: 
> org.apache.hadoop.hive.metastore.api.MetaException Unable to delete 
> directory: file:/tmp
> org.apache.hadoop.hive.metastore.api.MetaException: Unable to delete 
> directory: file:/tmp
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreFsImpl.deleteDir(HiveMetaStoreFsImpl.java:45)
>  [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.metastore.Warehouse.deleteDir(Warehouse.java:365) 
> [hive-standalone-metastore-common-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.metastore.Warehouse.deleteDir(Warehouse.java:353) 
> [hive-standalone-metastore-common-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.deleteTableData(HiveMetaStore.java:2562)
>  [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.drop_table_core(HiveMetaStore.java:2523)
>  [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.drop_table_with_environment_context(HiveMetaStore.java:2685)
>  [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> ~[?:1.8.0_102]
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> ~[?:1.8.0_102]
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[?:1.8.0_102]
>   at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_102]
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147)
>  [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:108)
>  [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>   at com.sun.proxy.$Proxy33.drop_table_with_environment_context(Unknown 
> Source) [?:?]
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.drop_table_with_environment_context(HiveMetaStoreClient.java:3204)
>  [hive-standalone-metastore-common-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.dropTable(HiveMetaStoreClient.java:1492)
>  [hive-standalone-metastore-common-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.dropTable(HiveMetaStoreClient.java:1432)
>  [hive-standalone-metastore-common-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>   at 
> org.apache.hive.hcatalog.listener.TestDbNotificationListener.dropTable(TestDbNotificationListener.java:522)
>  [test-classes/:?]
>   at 

[jira] [Commented] (HIVE-20610) TestDbNotificationListener should not use /tmp directory

2018-10-05 Thread Bharathkrishna Guruvayoor Murali (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16640213#comment-16640213
 ] 

Bharathkrishna Guruvayoor Murali commented on HIVE-20610:
-

Test failure looks unrelated.

> TestDbNotificationListener should not use /tmp directory
> 
>
> Key: HIVE-20610
> URL: https://issues.apache.org/jira/browse/HIVE-20610
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0, 4.0.0
>Reporter: Bharathkrishna Guruvayoor Murali
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Major
> Attachments: HIVE-20610.1.patch, HIVE-20610.2.patch, 
> HIVE-20610.3.patch, HIVE-20610.4.patch
>
>
> Using the /tmp directory causes exceptions in tests like dropTable:
> {code:java}
> 2018-09-19T06:42:04,818  INFO [main] metastore.HiveMetaStore: 0: drop_table : 
> tbl=hive.default.droptbl
> 2018-09-19T06:42:04,819  INFO [main] HiveMetaStore.audit: ugi=hiveptest   
> ip=unknown-ip-addr  cmd=drop_table : tbl=hive.default.droptbl   
> 2018-09-19T06:42:05,072  WARN [main] fs.FileUtil: Failed to delete file or 
> dir [/tmp/.ICE-unix]: it still exists.
> 2018-09-19T06:42:05,072  WARN [main] fs.FileUtil: Failed to delete file or 
> dir [/tmp/.XIM-unix]: it still exists.
> 2018-09-19T06:42:05,072  WARN [main] fs.FileUtil: Failed to delete file or 
> dir [/tmp/.X11-unix]: it still exists.
> 2018-09-19T06:42:05,072  WARN [main] fs.FileUtil: Failed to delete file or 
> dir [/tmp/hsperfdata_root]: it still exists.
> 2018-09-19T06:42:05,072  WARN [main] fs.FileUtil: Failed to delete file or 
> dir [/tmp/.font-unix]: it still exists.
> 2018-09-19T06:42:05,072  WARN [main] fs.FileUtil: Failed to delete file or 
> dir [/tmp/.Test-unix]: it still exists.
> 2018-09-19T06:42:05,072 ERROR [main] utils.FileUtils: Failed to delete 
> file:/tmp
> 2018-09-19T06:42:05,072 ERROR [main] utils.MetaStoreUtils: Got exception: 
> org.apache.hadoop.hive.metastore.api.MetaException Unable to delete 
> directory: file:/tmp
> org.apache.hadoop.hive.metastore.api.MetaException: Unable to delete 
> directory: file:/tmp
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreFsImpl.deleteDir(HiveMetaStoreFsImpl.java:45)
>  [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.metastore.Warehouse.deleteDir(Warehouse.java:365) 
> [hive-standalone-metastore-common-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.metastore.Warehouse.deleteDir(Warehouse.java:353) 
> [hive-standalone-metastore-common-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.deleteTableData(HiveMetaStore.java:2562)
>  [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.drop_table_core(HiveMetaStore.java:2523)
>  [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.drop_table_with_environment_context(HiveMetaStore.java:2685)
>  [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> ~[?:1.8.0_102]
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> ~[?:1.8.0_102]
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[?:1.8.0_102]
>   at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_102]
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147)
>  [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:108)
>  [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>   at com.sun.proxy.$Proxy33.drop_table_with_environment_context(Unknown 
> Source) [?:?]
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.drop_table_with_environment_context(HiveMetaStoreClient.java:3204)
>  [hive-standalone-metastore-common-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.dropTable(HiveMetaStoreClient.java:1492)
>  [hive-standalone-metastore-common-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.dropTable(HiveMetaStoreClient.java:1432)
>  [hive-standalone-metastore-common-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>   at 
> org.apache.hive.hcatalog.listener.TestDbNotificationListener.dropTable(TestDbNotificationListener.java:522)
>  [test-classes/:?]
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> 

[jira] [Comment Edited] (HIVE-20545) Ability to exclude potentially large parameters in HMS Notifications

2018-10-05 Thread Bharathkrishna Guruvayoor Murali (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16640211#comment-16640211
 ] 

Bharathkrishna Guruvayoor Murali edited comment on HIVE-20545 at 10/5/18 6:51 
PM:
--

Hi [~anishek] ,

An example would be when Impala writes stats information to Partition objects 
and accesses it, as shown: 
[Impala 
stats|http://github.mtv.cloudera.com/CDH/Impala/blob/6f2d928734a33ace15ec6abd5659651173b9e69e/fe/src/main/java/org/apache/impala/catalog/PartitionStatsUtil.java#L45]

But this information is not relevant as Partition metadata for any other 
purpose. Also, these parameters are considerably large. In some preliminary 
tests, we observed a performance hit as the message size of HMS Notifications 
increases, so we can use this configuration to filter out parameters like the 
ones mentioned above.


was (Author: bharos92):
Hi [~anishek] ,

An example would be when Impala writes stats information to Partition objects 
and accesses it, as shown: [Impala reading stats
|http://github.mtv.cloudera.com/CDH/Impala/blob/6f2d928734a33ace15ec6abd5659651173b9e69e/fe/src/main/java/org/apache/impala/catalog/PartitionStatsUtil.java#L45]

But this information is not relevant as Partition metadata for any other 
purpose. Also, these parameters are considerably large. In some preliminary 
tests, we observed a performance hit as the message size of HMS Notifications 
increases, so we can use this configuration to filter out parameters like the 
ones mentioned above.

> Ability to exclude potentially large parameters in HMS Notifications
> 
>
> Key: HIVE-20545
> URL: https://issues.apache.org/jira/browse/HIVE-20545
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: 3.1.0, 4.0.0
>Reporter: Bharathkrishna Guruvayoor Murali
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Major
> Attachments: HIVE-20545.1.patch, HIVE-20545.2.patch, 
> HIVE-20545.3.branch-3.patch, HIVE-20545.3.patch, HIVE-20545.4.patch, 
> HIVE-20545.6.patch, HIVE-20545.7.patch
>
>
> Clients can add large-sized parameters in Table/Partition objects. So we need 
> to enable adding regex patterns through HiveConf to match parameters to be 
> filtered from table and partition objects before serialization in HMS 
> notifications.
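
The filtering described above can be sketched roughly as follows. This is only an illustration of the idea, not the code from the attached patches: the class name, the method names, and the sample parameter keys are hypothetical, and the patterns are passed in directly because this message does not state the configuration key used to supply them.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical helper; not part of Hive.
class ParamFilterSketch {

    // Compile each regex once so the patterns can be reused for every event.
    public static List<Pattern> compile(List<String> regexes) {
        List<Pattern> patterns = new ArrayList<>();
        for (String regex : regexes) {
            patterns.add(Pattern.compile(regex));
        }
        return patterns;
    }

    // Return a copy of params without the keys that fully match any exclude pattern.
    // The input map is left untouched, so the stored object is not modified.
    public static Map<String, String> filter(Map<String, String> params,
                                             List<Pattern> excludes) {
        Map<String, String> kept = new HashMap<>();
        for (Map.Entry<String, String> entry : params.entrySet()) {
            boolean excluded = false;
            for (Pattern pattern : excludes) {
                if (pattern.matcher(entry.getKey()).matches()) {
                    excluded = true;
                    break;
                }
            }
            if (!excluded) {
                kept.put(entry.getKey(), entry.getValue());
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Sample keys are illustrative only.
        Map<String, String> params = new HashMap<>();
        params.put("impala_intermediate_stats_chunk0", "...large value...");
        params.put("transient_lastDdlTime", "1538765460");

        List<Pattern> excludes = compile(Arrays.asList("^impala_intermediate_stats_.*"));
        System.out.println(filter(params, excludes).keySet());
    }
}
```

Note that `Matcher.matches()` requires the whole key to match, so a pattern meant to match a substring needs a leading and trailing `.*`. Filtering a copy rather than the original map keeps the parameters intact for every consumer other than the notification message.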





[jira] [Commented] (HIVE-20545) Ability to exclude potentially large parameters in HMS Notifications

2018-10-05 Thread Bharathkrishna Guruvayoor Murali (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16640211#comment-16640211
 ] 

Bharathkrishna Guruvayoor Murali commented on HIVE-20545:
-

Hi [~anishek] ,

An example would be when Impala writes stats information to Partition objects 
and accesses it, as shown: [Impala reading stats
|http://github.mtv.cloudera.com/CDH/Impala/blob/6f2d928734a33ace15ec6abd5659651173b9e69e/fe/src/main/java/org/apache/impala/catalog/PartitionStatsUtil.java#L45]

But this information is not relevant as Partition metadata for any other 
purpose. Also, these parameters are considerably large. In some preliminary 
tests, we observed a performance hit as the message size of HMS Notifications 
increases, so we can use this configuration to filter out parameters like the 
ones mentioned above.

> Ability to exclude potentially large parameters in HMS Notifications
> 
>
> Key: HIVE-20545
> URL: https://issues.apache.org/jira/browse/HIVE-20545
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: 3.1.0, 4.0.0
>Reporter: Bharathkrishna Guruvayoor Murali
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Major
> Attachments: HIVE-20545.1.patch, HIVE-20545.2.patch, 
> HIVE-20545.3.branch-3.patch, HIVE-20545.3.patch, HIVE-20545.4.patch, 
> HIVE-20545.6.patch, HIVE-20545.7.patch
>
>
> Clients can add large-sized parameters in Table/Partition objects. So we need 
> to enable adding regex patterns through HiveConf to match parameters to be 
> filtered from table and partition objects before serialization in HMS 
> notifications.





[jira] [Updated] (HIVE-20610) TestDbNotificationListener should not use /tmp directory

2018-10-04 Thread Bharathkrishna Guruvayoor Murali (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharathkrishna Guruvayoor Murali updated HIVE-20610:

Attachment: (was: HIVE-20610.4.patch)

> TestDbNotificationListener should not use /tmp directory
> 
>
> Key: HIVE-20610
> URL: https://issues.apache.org/jira/browse/HIVE-20610
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0, 4.0.0
>Reporter: Bharathkrishna Guruvayoor Murali
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Major
> Attachments: HIVE-20610.1.patch, HIVE-20610.2.patch, 
> HIVE-20610.3.patch, HIVE-20610.4.patch
>
>
> Using /tmp directory creates exceptions for tests like dropTable :
> {code:java}
> 2018-09-19T06:42:04,818  INFO [main] metastore.HiveMetaStore: 0: drop_table : 
> tbl=hive.default.droptbl
> 2018-09-19T06:42:04,819  INFO [main] HiveMetaStore.audit: ugi=hiveptest   
> ip=unknown-ip-addr  cmd=drop_table : tbl=hive.default.droptbl   
> 2018-09-19T06:42:05,072  WARN [main] fs.FileUtil: Failed to delete file or 
> dir [/tmp/.ICE-unix]: it still exists.
> 2018-09-19T06:42:05,072  WARN [main] fs.FileUtil: Failed to delete file or 
> dir [/tmp/.XIM-unix]: it still exists.
> 2018-09-19T06:42:05,072  WARN [main] fs.FileUtil: Failed to delete file or 
> dir [/tmp/.X11-unix]: it still exists.
> 2018-09-19T06:42:05,072  WARN [main] fs.FileUtil: Failed to delete file or 
> dir [/tmp/hsperfdata_root]: it still exists.
> 2018-09-19T06:42:05,072  WARN [main] fs.FileUtil: Failed to delete file or 
> dir [/tmp/.font-unix]: it still exists.
> 2018-09-19T06:42:05,072  WARN [main] fs.FileUtil: Failed to delete file or 
> dir [/tmp/.Test-unix]: it still exists.
> 2018-09-19T06:42:05,072 ERROR [main] utils.FileUtils: Failed to delete 
> file:/tmp
> 2018-09-19T06:42:05,072 ERROR [main] utils.MetaStoreUtils: Got exception: 
> org.apache.hadoop.hive.metastore.api.MetaException Unable to delete 
> directory: file:/tmp
> org.apache.hadoop.hive.metastore.api.MetaException: Unable to delete 
> directory: file:/tmp
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreFsImpl.deleteDir(HiveMetaStoreFsImpl.java:45)
>  [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.metastore.Warehouse.deleteDir(Warehouse.java:365) 
> [hive-standalone-metastore-common-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.metastore.Warehouse.deleteDir(Warehouse.java:353) 
> [hive-standalone-metastore-common-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.deleteTableData(HiveMetaStore.java:2562)
>  [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.drop_table_core(HiveMetaStore.java:2523)
>  [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.drop_table_with_environment_context(HiveMetaStore.java:2685)
>  [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> ~[?:1.8.0_102]
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> ~[?:1.8.0_102]
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[?:1.8.0_102]
>   at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_102]
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147)
>  [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:108)
>  [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>   at com.sun.proxy.$Proxy33.drop_table_with_environment_context(Unknown 
> Source) [?:?]
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.drop_table_with_environment_context(HiveMetaStoreClient.java:3204)
>  [hive-standalone-metastore-common-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.dropTable(HiveMetaStoreClient.java:1492)
>  [hive-standalone-metastore-common-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.dropTable(HiveMetaStoreClient.java:1432)
>  [hive-standalone-metastore-common-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>   at 
> org.apache.hive.hcatalog.listener.TestDbNotificationListener.dropTable(TestDbNotificationListener.java:522)
>  [test-classes/:?]
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> ~[?:1.8.0_102]{code}
>  
>  
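
The log above shows the drop trying to delete file:/tmp itself, which suggests the table data location was the shared /tmp directory. One way to avoid that in a test, sketched here under assumptions, is to give the test its own directory and use it as the table location. The class and method names below are hypothetical, and this is not taken from the attached patches.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper; not part of TestDbNotificationListener.
class TestWarehouseDirSketch {

    // Create a fresh, uniquely named directory under java.io.tmpdir.
    // Only this directory is used for table data, never the temp root itself.
    public static Path createWarehouseDir() throws IOException {
        return Files.createTempDirectory("testDbNotif");
    }

    // Delete the directory tree, children before parents.
    public static void deleteRecursively(Path root) throws IOException {
        if (!Files.exists(root)) {
            return;
        }
        List<Path> paths;
        try (Stream<Path> walk = Files.walk(root)) {
            paths = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
        }
        for (Path path : paths) {
            Files.delete(path);
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = createWarehouseDir();
        Files.createFile(dir.resolve("part-0")); // stand-in for table data
        System.out.println("table location: " + dir.toUri());
        deleteRecursively(dir);
        System.out.println("still exists: " + Files.exists(dir));
    }
}
```

With a dedicated directory, dropping the table removes only files the test created, so entries such as /tmp/.X11-unix are never touched.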




[jira] [Updated] (HIVE-20610) TestDbNotificationListener should not use /tmp directory

2018-10-04 Thread Bharathkrishna Guruvayoor Murali (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharathkrishna Guruvayoor Murali updated HIVE-20610:

Attachment: HIVE-20610.4.patch

> TestDbNotificationListener should not use /tmp directory
> 
>
> Key: HIVE-20610
> URL: https://issues.apache.org/jira/browse/HIVE-20610
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0, 4.0.0
>Reporter: Bharathkrishna Guruvayoor Murali
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Major
> Attachments: HIVE-20610.1.patch, HIVE-20610.2.patch, 
> HIVE-20610.3.patch, HIVE-20610.4.patch
>
>


