[jira] [Created] (HIVE-10637) Cleanup TestPassProperties changes introduced due to HIVE-8696

2015-05-06 Thread Thiruvel Thirumoolan (JIRA)
Thiruvel Thirumoolan created HIVE-10637:
---

 Summary: Cleanup TestPassProperties changes introduced due to 
HIVE-8696
 Key: HIVE-10637
 URL: https://issues.apache.org/jira/browse/HIVE-10637
 Project: Hive
  Issue Type: Test
  Components: HCatalog, Tests
Reporter: Thiruvel Thirumoolan
Assignee: Thiruvel Thirumoolan
Priority: Trivial


Follow-up JIRA to clean up the test case as per recommendations from Sushanth.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 33909: HIVE-8696: HCatClientHMSImpl doesn't use a Retrying-HiveMetastoreClient

2015-05-06 Thread Thiruvel Thirumoolan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33909/
---

(Updated May 6, 2015, 11:30 p.m.)


Review request for hive.


Repository: hive-git


Description
---

HIVE-8696: HCatClientHMSImpl doesn't use a Retrying-HiveMetastoreClient


Diffs (updated)
-

  
hcatalog/core/src/main/java/org/apache/hive/hcatalog/common/HiveClientCache.java
 578b6ea 
  
hcatalog/core/src/test/java/org/apache/hive/hcatalog/mapreduce/TestPassProperties.java
 735ab5f 
  
metastore/src/java/org/apache/hadoop/hive/metastore/RetryingMetaStoreClient.java
 1b6487a 

Diff: https://reviews.apache.org/r/33909/diff/


Testing
---


Thanks,

Thiruvel Thirumoolan



Review Request 33909: HIVE-8696: HCatClientHMSImpl doesn't use a Retrying-HiveMetastoreClient

2015-05-06 Thread Thiruvel Thirumoolan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33909/
---

Review request for hive.


Repository: hive-git


Description
---

HIVE-8696: HCatClientHMSImpl doesn't use a Retrying-HiveMetastoreClient


Diffs
-

  
hcatalog/core/src/main/java/org/apache/hive/hcatalog/common/HiveClientCache.java
 578b6ea589069022d1d5f72c582e823822f1d529 
  
metastore/src/java/org/apache/hadoop/hive/metastore/RetryingMetaStoreClient.java
 1b6487af748202d1d0411ac23a7507a9fbd7f251 

Diff: https://reviews.apache.org/r/33909/diff/


Testing
---


Thanks,

Thiruvel Thirumoolan



Re: Review Request 31146: HIVE-9508: MetaStore client socket connection should have a lifetime

2015-04-30 Thread Thiruvel Thirumoolan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/31146/
---

(Updated April 30, 2015, 7:28 p.m.)


Review request for hive.


Repository: hive-git


Description
---

HIVE-9508: MetaStore client socket connection should have a lifetime


Diffs (updated)
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 72e4ff2 
  
itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/TestHiveMetaStore.java
 130fd67 
  
metastore/src/java/org/apache/hadoop/hive/metastore/RetryingMetaStoreClient.java
 5ce58ee 

Diff: https://reviews.apache.org/r/31146/diff/


Testing
---

1. Added a unit test case.
2. Tested on a live deployment with the fix. The client reconnected after 5 
minutes.


Thanks,

Thiruvel Thirumoolan



Re: Review Request 31146: HIVE-9508: MetaStore client socket connection should have a lifetime

2015-04-30 Thread Thiruvel Thirumoolan


 On April 30, 2015, 9:29 p.m., Vaibhav Gumashta wrote:
  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java, line 403
  https://reviews.apache.org/r/31146/diff/3/?file=946416#file946416line403
 
  Can you add a small comment here to indicate that 0s means this config 
  is disabled? You might want to add the new config to HiveConf.metaVars as 
  well.

I have rephrased the whole description. Please let me know if this looks ok.

Good catch, I forgot about the metaVars.
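
For illustration, the semantics the description now tries to convey (a hypothetical 
sketch, not the committed HiveConf change; the property name and plumbing here are 
assumptions):

    import java.util.concurrent.TimeUnit;

    public class SocketLifetimeConfigSketch {
      // Assumed property name; a default of "0s" means the feature is disabled.
      static final String LIFETIME_PROP = "hive.metastore.client.socket.lifetime";

      // Stand-in for however the configured time value is resolved to millis.
      static long resolveLifetimeMs(long configuredSeconds) {
        return TimeUnit.SECONDS.toMillis(configuredSeconds);
      }

      public static void main(String[] args) {
        long lifetimeMs = resolveLifetimeMs(0L); // the "0s" default
        boolean enabled = lifetimeMs > 0;        // 0 disables lifetime-based reconnects
        System.out.println(LIFETIME_PROP + " enabled: " + enabled);
      }
    }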


- Thiruvel


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/31146/#review82203
---


On April 30, 2015, 11:33 p.m., Thiruvel Thirumoolan wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/31146/
 ---
 
 (Updated April 30, 2015, 11:33 p.m.)
 
 
 Review request for hive.
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 HIVE-9508: MetaStore client socket connection should have a lifetime
 
 
 Diffs
 -
 
   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 72e4ff2 
   
 itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/TestHiveMetaStore.java
  130fd67 
   
 metastore/src/java/org/apache/hadoop/hive/metastore/RetryingMetaStoreClient.java
  5ce58ee 
 
 Diff: https://reviews.apache.org/r/31146/diff/
 
 
 Testing
 ---
 
 1. Added a unit test case.
 2. Tested on a live deployment with the fix. The client reconnected after 5 
 minutes.
 
 
 Thanks,
 
 Thiruvel Thirumoolan
 




Re: Review Request 31146: HIVE-9508: MetaStore client socket connection should have a lifetime

2015-04-30 Thread Thiruvel Thirumoolan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/31146/
---

(Updated April 30, 2015, 11:33 p.m.)


Review request for hive.


Repository: hive-git


Description
---

HIVE-9508: MetaStore client socket connection should have a lifetime


Diffs (updated)
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 72e4ff2 
  
itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/TestHiveMetaStore.java
 130fd67 
  
metastore/src/java/org/apache/hadoop/hive/metastore/RetryingMetaStoreClient.java
 5ce58ee 

Diff: https://reviews.apache.org/r/31146/diff/


Testing
---

1. Added a unit test case.
2. Tested on a live deployment with the fix. The client reconnected after 5 
minutes.


Thanks,

Thiruvel Thirumoolan



Re: Review Request 31146: HIVE-9508: MetaStore client socket connection should have a lifetime

2015-04-29 Thread Thiruvel Thirumoolan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/31146/
---

(Updated April 29, 2015, 8:34 p.m.)


Review request for hive.


Repository: hive-git


Description
---

HIVE-9508: MetaStore client socket connection should have a lifetime


Diffs (updated)
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 72e4ff2 
  
itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/TestHiveMetaStore.java
 130fd67 
  
metastore/src/java/org/apache/hadoop/hive/metastore/RetryingMetaStoreClient.java
 5ce58ee 

Diff: https://reviews.apache.org/r/31146/diff/


Testing
---

1. Added a unit test case.
2. Tested on a live deployment with the fix. The client reconnected after 5 
minutes.


Thanks,

Thiruvel Thirumoolan



Re: Review Request 31152: HIVE-9582: HCatalog should use IMetaStoreClient interface

2015-03-04 Thread Thiruvel Thirumoolan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/31152/
---

(Updated March 4, 2015, 11:34 p.m.)


Review request for hive.


Changes
---

Addressed review comments.


Repository: hive-git


Description
---

HIVE-9582: HCatalog should use IMetaStoreClient interface


Diffs (updated)
-

  hcatalog/core/src/main/java/org/apache/hive/hcatalog/common/HCatUtil.java 
63909b893b4be32647a0d91e58bc0dca86bcabd9 
  
hcatalog/core/src/main/java/org/apache/hive/hcatalog/common/HiveClientCache.java
 a001252faaf9949b6d2f0e3110c2b343b9648a91 
  
hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/DefaultOutputCommitterContainer.java
 cead40d6eb7df285987c92b58021246e888dc502 
  
hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/FileOutputCommitterContainer.java
 bf2ba5a1c9135bb99cb12b4111e60e2b0a2ea10f 
  
hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/FileOutputFormatContainer.java
 1cd5306aafb9b3ec61c31fb6504c8082b47ed2ae 
  
hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/HCatOutputFormat.java
 694739821a202780818924d54d10edb707cfbcfa 
  
hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/InitializeInput.java
 1980ef50af42499e0fed8863b6ff7a45f926d9fc 
  hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/Security.java 
39ef86e4c3d521b310f9b2dc2f154ae5a555ab06 
  
hcatalog/core/src/test/java/org/apache/hive/hcatalog/common/TestHiveClientCache.java
 63a55482f7e9115f5626c5cde036597248459118 
  
hcatalog/core/src/test/java/org/apache/hive/hcatalog/mapreduce/TestPassProperties.java
 f8a0af14e3d0b9dc5005f1c2f390f4e2dc054145 
  
hcatalog/hcatalog-pig-adapter/src/main/java/org/apache/hive/hcatalog/pig/PigHCatUtil.java
 48a40b1c11d44c6d53d8f58b7ea91f090e72920f 
  
hcatalog/streaming/src/java/org/apache/hive/hcatalog/streaming/AbstractRecordWriter.java
 1c85ab5944628a388b4983a557600035d6d610b2 
  
hcatalog/streaming/src/java/org/apache/hive/hcatalog/streaming/HiveEndPoint.java
 a08f2f97e4e297873250ac8d16c7679c2de901f0 
  
hcatalog/webhcat/java-client/src/main/java/org/apache/hive/hcatalog/api/HCatClientHMSImpl.java
 cd05254f4e138b7c1ec7d9424c90416b25f93462 
  
hcatalog/webhcat/svr/src/main/java/org/apache/hive/hcatalog/templeton/CompleteDelegator.java
 1b9663d2d0e2e0d94b520ed6760415be441c7ab4 
  
hcatalog/webhcat/svr/src/main/java/org/apache/hive/hcatalog/templeton/SecureProxySupport.java
 8ae61a1e330b56037dd7440fa888e431c65fc158 
  
hcatalog/webhcat/svr/src/main/java/org/apache/hive/hcatalog/templeton/tool/TempletonControllerJob.java
 53eecfa990bcaab247ae8bc4df221742bd166081 
  metastore/src/java/org/apache/hadoop/hive/metastore/IMetaStoreClient.java 
b4bb67944713951f089a9b8c11485fbf46088b49 

Diff: https://reviews.apache.org/r/31152/diff/


Testing
---

All hcatalog unit tests pass.


Thanks,

Thiruvel Thirumoolan



Re: Review Request 31152: HIVE-9582: HCatalog should use IMetaStoreClient interface

2015-02-26 Thread Thiruvel Thirumoolan


 On Feb. 26, 2015, 10:12 p.m., Thejas Nair wrote:
  hcatalog/webhcat/svr/src/main/java/org/apache/hive/hcatalog/templeton/tool/TempletonControllerJob.java,
   line 176
  https://reviews.apache.org/r/31152/diff/1/?file=867387#file867387line176
 
  shouldn't this be client.getDelegationToken(c.getUser(), u) ?

Thanks Thejas, I will check on the first set of comments and update the patch.


- Thiruvel


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/31152/#review74374
---


On Feb. 18, 2015, 6:58 a.m., Thiruvel Thirumoolan wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/31152/
 ---
 
 (Updated Feb. 18, 2015, 6:58 a.m.)
 
 
 Review request for hive.
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 HIVE-9582: HCatalog should use IMetaStoreClient interface
 
 
 Diffs
 -
 
   hcatalog/core/src/main/java/org/apache/hive/hcatalog/common/HCatUtil.java 
 63909b893b4be32647a0d91e58bc0dca86bcabd9 
   
 hcatalog/core/src/main/java/org/apache/hive/hcatalog/common/HiveClientCache.java
  a001252faaf9949b6d2f0e3110c2b343b9648a91 
   
 hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/DefaultOutputCommitterContainer.java
  cead40d6eb7df285987c92b58021246e888dc502 
   
 hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/FileOutputCommitterContainer.java
  bf2ba5a1c9135bb99cb12b4111e60e2b0a2ea10f 
   
 hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/FileOutputFormatContainer.java
  1cd5306aafb9b3ec61c31fb6504c8082b47ed2ae 
   
 hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/HCatOutputFormat.java
  694739821a202780818924d54d10edb707cfbcfa 
   
 hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/InitializeInput.java
  1980ef50af42499e0fed8863b6ff7a45f926d9fc 
   
 hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/Security.java 
 39ef86e4c3d521b310f9b2dc2f154ae5a555ab06 
   
 hcatalog/core/src/test/java/org/apache/hive/hcatalog/common/TestHiveClientCache.java
  63a55482f7e9115f5626c5cde036597248459118 
   
 hcatalog/core/src/test/java/org/apache/hive/hcatalog/mapreduce/TestPassProperties.java
  f8a0af14e3d0b9dc5005f1c2f390f4e2dc054145 
   
 hcatalog/hcatalog-pig-adapter/src/main/java/org/apache/hive/hcatalog/pig/PigHCatUtil.java
  48a40b1c11d44c6d53d8f58b7ea91f090e72920f 
   
 hcatalog/streaming/src/java/org/apache/hive/hcatalog/streaming/AbstractRecordWriter.java
  8c4bca02abeda7eb89ea0deacdfb2e06c9fda7f8 
   
 hcatalog/streaming/src/java/org/apache/hive/hcatalog/streaming/HiveEndPoint.java
  a08f2f97e4e297873250ac8d16c7679c2de901f0 
   
 hcatalog/webhcat/java-client/src/main/java/org/apache/hive/hcatalog/api/HCatClientHMSImpl.java
  cd05254f4e138b7c1ec7d9424c90416b25f93462 
   
 hcatalog/webhcat/svr/src/main/java/org/apache/hive/hcatalog/templeton/CompleteDelegator.java
  1b9663d2d0e2e0d94b520ed6760415be441c7ab4 
   
 hcatalog/webhcat/svr/src/main/java/org/apache/hive/hcatalog/templeton/SecureProxySupport.java
  8ae61a1e330b56037dd7440fa888e431c65fc158 
   
 hcatalog/webhcat/svr/src/main/java/org/apache/hive/hcatalog/templeton/tool/TempletonControllerJob.java
  53eecfa990bcaab247ae8bc4df221742bd166081 
   metastore/src/java/org/apache/hadoop/hive/metastore/IMetaStoreClient.java 
 0aa0f515d9d15d442d31e32a63586d119c30494e 
 
 Diff: https://reviews.apache.org/r/31152/diff/
 
 
 Testing
 ---
 
 All hcatalog unit tests pass.
 
 
 Thanks,
 
 Thiruvel Thirumoolan
 




[jira] [Updated] (HIVE-9582) HCatalog should use IMetaStoreClient interface

2015-02-17 Thread Thiruvel Thirumoolan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvel Thirumoolan updated HIVE-9582:
---
Attachment: HIVE-9582.3.patch

Attaching a rebased patch so the precommit tests can run.

 HCatalog should use IMetaStoreClient interface
 --

 Key: HIVE-9582
 URL: https://issues.apache.org/jira/browse/HIVE-9582
 Project: Hive
  Issue Type: Sub-task
  Components: HCatalog, Metastore
Affects Versions: 0.14.0, 0.13.1
Reporter: Thiruvel Thirumoolan
Assignee: Thiruvel Thirumoolan
  Labels: hcatalog, metastore, rolling_upgrade
 Fix For: 0.14.1

 Attachments: HIVE-9582.1.patch, HIVE-9582.2.patch, HIVE-9582.3.patch, 
 HIVE-9583.1.patch


 Hive uses IMetaStoreClient, which makes using RetryingMetaStoreClient easy: 
 during a failure, the client retries and possibly succeeds. But HCatalog has 
 long been using HiveMetaStoreClient directly, so failures are costly, 
 especially if they occur during the commit stage of a job. It's also not 
 possible to do a rolling upgrade of the MetaStore Server.
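
 For illustration, a minimal sketch of the direction (a hypothetical helper, not 
 the actual HCatUtil/HiveClientCache code; the exact getProxy overload differs 
 between Hive versions):

     import org.apache.hadoop.hive.conf.HiveConf;
     import org.apache.hadoop.hive.metastore.HiveMetaHookLoader;
     import org.apache.hadoop.hive.metastore.HiveMetaStoreClient;
     import org.apache.hadoop.hive.metastore.IMetaStoreClient;
     import org.apache.hadoop.hive.metastore.RetryingMetaStoreClient;
     import org.apache.hadoop.hive.metastore.api.MetaException;

     public class RetryingClientSketch {
       // Callers hold the IMetaStoreClient interface rather than the concrete
       // HiveMetaStoreClient, so the instance can be a retrying proxy and a
       // transient metastore failure (e.g. during job commit) is retried.
       public static IMetaStoreClient newClient(HiveConf conf) throws MetaException {
         HiveMetaHookLoader noHooks = tbl -> null; // no storage-handler hooks needed here
         return RetryingMetaStoreClient.getProxy(
             conf, noHooks, HiveMetaStoreClient.class.getName());
       }
     }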



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Review Request 31146: HIVE-9508: MetaStore client socket connection should have a lifetime

2015-02-17 Thread Thiruvel Thirumoolan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/31146/
---

Review request for hive.


Repository: hive-git


Description
---

HIVE-9508: MetaStore client socket connection should have a lifetime


Diffs
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 
e64e8fc11cbfe3e538440cd9ca344397baf1dc17 
  
itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/TestHiveMetaStore.java
 130fd67175091daf00268495093853fba63e3884 
  
metastore/src/java/org/apache/hadoop/hive/metastore/RetryingMetaStoreClient.java
 b4f02fc1c519096b71a7e0c10567049a9ccdf13e 

Diff: https://reviews.apache.org/r/31146/diff/


Testing
---

1. Added a unit test case.
2. Tested on a live deployment with the fix. The client reconnected after 5 
minutes.


Thanks,

Thiruvel Thirumoolan



[jira] [Commented] (HIVE-9508) MetaStore client socket connection should have a lifetime

2015-02-17 Thread Thiruvel Thirumoolan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14325344#comment-14325344
 ] 

Thiruvel Thirumoolan commented on HIVE-9508:


Review request @ https://reviews.apache.org/r/31146/

 MetaStore client socket connection should have a lifetime
 -

 Key: HIVE-9508
 URL: https://issues.apache.org/jira/browse/HIVE-9508
 Project: Hive
  Issue Type: Sub-task
  Components: CLI, Metastore
Reporter: Thiruvel Thirumoolan
Assignee: Thiruvel Thirumoolan
  Labels: metastore, rolling_upgrade
 Fix For: 1.2.0

 Attachments: HIVE-9508.1.patch, HIVE-9508.2.patch, HIVE-9508.3.patch


 Currently HiveMetaStoreClient (or SessionHMSC) stays connected to one Metastore 
 server until the connection is closed or there is a problem. I would like to 
 introduce the concept of a MetaStore client socket lifetime: the MS client 
 will reconnect once the socket lifetime is reached. This will help during a 
 rolling upgrade of the Metastore.
 When there are multiple Metastore servers behind a VIP (load balancer), it is 
 easy to take one server out of rotation, wait 10+ minutes for all existing 
 connections to die down (if the lifetime is, say, 5 minutes), and then update 
 the server.
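
 A minimal sketch of that lifetime check (illustrative only, not the actual 
 RetryingMetaStoreClient patch):

     import java.util.concurrent.TimeUnit;

     public class SocketLifetimeSketch {
       private final long lifetimeMs;   // e.g. 5 minutes when sitting behind a VIP
       private long connectedAtMs;

       public SocketLifetimeSketch(long lifetimeMs) {
         this.lifetimeMs = lifetimeMs;
         this.connectedAtMs = System.currentTimeMillis();
       }

       // Checked before each metastore call; a real client would close the old
       // Thrift transport and reopen it, landing on a fresh server behind the VIP.
       public boolean shouldReconnect() {
         return lifetimeMs > 0
             && System.currentTimeMillis() - connectedAtMs >= lifetimeMs;
       }

       public void markReconnected() {
         connectedAtMs = System.currentTimeMillis();
       }

       public static void main(String[] args) {
         SocketLifetimeSketch s = new SocketLifetimeSketch(TimeUnit.MINUTES.toMillis(5));
         System.out.println(s.shouldReconnect()); // false right after connecting
       }
     }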



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9582) HCatalog should use IMetaStoreClient interface

2015-02-17 Thread Thiruvel Thirumoolan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14325511#comment-14325511
 ] 

Thiruvel Thirumoolan commented on HIVE-9582:


The test failure is unrelated to this patch. Review request @ 
https://reviews.apache.org/r/31152/

 HCatalog should use IMetaStoreClient interface
 --

 Key: HIVE-9582
 URL: https://issues.apache.org/jira/browse/HIVE-9582
 Project: Hive
  Issue Type: Sub-task
  Components: HCatalog, Metastore
Affects Versions: 0.14.0, 0.13.1
Reporter: Thiruvel Thirumoolan
Assignee: Thiruvel Thirumoolan
  Labels: hcatalog, metastore, rolling_upgrade
 Fix For: 0.14.1

 Attachments: HIVE-9582.1.patch, HIVE-9582.2.patch, HIVE-9582.3.patch, 
 HIVE-9583.1.patch


 Hive uses IMetaStoreClient, which makes using RetryingMetaStoreClient easy: 
 during a failure, the client retries and possibly succeeds. But HCatalog has 
 long been using HiveMetaStoreClient directly, so failures are costly, 
 especially if they occur during the commit stage of a job. It's also not 
 possible to do a rolling upgrade of the MetaStore Server.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Review Request 31152: HIVE-9582: HCatalog should use IMetaStoreClient interface

2015-02-17 Thread Thiruvel Thirumoolan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/31152/
---

Review request for hive.


Repository: hive-git


Description
---

HIVE-9582: HCatalog should use IMetaStoreClient interface


Diffs
-

  hcatalog/core/src/main/java/org/apache/hive/hcatalog/common/HCatUtil.java 
63909b893b4be32647a0d91e58bc0dca86bcabd9 
  
hcatalog/core/src/main/java/org/apache/hive/hcatalog/common/HiveClientCache.java
 a001252faaf9949b6d2f0e3110c2b343b9648a91 
  
hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/DefaultOutputCommitterContainer.java
 cead40d6eb7df285987c92b58021246e888dc502 
  
hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/FileOutputCommitterContainer.java
 bf2ba5a1c9135bb99cb12b4111e60e2b0a2ea10f 
  
hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/FileOutputFormatContainer.java
 1cd5306aafb9b3ec61c31fb6504c8082b47ed2ae 
  
hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/HCatOutputFormat.java
 694739821a202780818924d54d10edb707cfbcfa 
  
hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/InitializeInput.java
 1980ef50af42499e0fed8863b6ff7a45f926d9fc 
  hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/Security.java 
39ef86e4c3d521b310f9b2dc2f154ae5a555ab06 
  
hcatalog/core/src/test/java/org/apache/hive/hcatalog/common/TestHiveClientCache.java
 63a55482f7e9115f5626c5cde036597248459118 
  
hcatalog/core/src/test/java/org/apache/hive/hcatalog/mapreduce/TestPassProperties.java
 f8a0af14e3d0b9dc5005f1c2f390f4e2dc054145 
  
hcatalog/hcatalog-pig-adapter/src/main/java/org/apache/hive/hcatalog/pig/PigHCatUtil.java
 48a40b1c11d44c6d53d8f58b7ea91f090e72920f 
  
hcatalog/streaming/src/java/org/apache/hive/hcatalog/streaming/AbstractRecordWriter.java
 8c4bca02abeda7eb89ea0deacdfb2e06c9fda7f8 
  
hcatalog/streaming/src/java/org/apache/hive/hcatalog/streaming/HiveEndPoint.java
 a08f2f97e4e297873250ac8d16c7679c2de901f0 
  
hcatalog/webhcat/java-client/src/main/java/org/apache/hive/hcatalog/api/HCatClientHMSImpl.java
 cd05254f4e138b7c1ec7d9424c90416b25f93462 
  
hcatalog/webhcat/svr/src/main/java/org/apache/hive/hcatalog/templeton/CompleteDelegator.java
 1b9663d2d0e2e0d94b520ed6760415be441c7ab4 
  
hcatalog/webhcat/svr/src/main/java/org/apache/hive/hcatalog/templeton/SecureProxySupport.java
 8ae61a1e330b56037dd7440fa888e431c65fc158 
  
hcatalog/webhcat/svr/src/main/java/org/apache/hive/hcatalog/templeton/tool/TempletonControllerJob.java
 53eecfa990bcaab247ae8bc4df221742bd166081 
  metastore/src/java/org/apache/hadoop/hive/metastore/IMetaStoreClient.java 
0aa0f515d9d15d442d31e32a63586d119c30494e 

Diff: https://reviews.apache.org/r/31152/diff/


Testing
---

All hcatalog unit tests pass.


Thanks,

Thiruvel Thirumoolan



[jira] [Updated] (HIVE-9582) HCatalog should use IMetaStoreClient interface

2015-02-05 Thread Thiruvel Thirumoolan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvel Thirumoolan updated HIVE-9582:
---
Attachment: HIVE-9582.2.patch

Updated patch.

 HCatalog should use IMetaStoreClient interface
 --

 Key: HIVE-9582
 URL: https://issues.apache.org/jira/browse/HIVE-9582
 Project: Hive
  Issue Type: Sub-task
  Components: HCatalog, Metastore
Affects Versions: 0.14.0, 0.13.1
Reporter: Thiruvel Thirumoolan
Assignee: Thiruvel Thirumoolan
  Labels: hcatalog, metastore, rolling_upgrade
 Fix For: 0.14.1

 Attachments: HIVE-9582.1.patch, HIVE-9582.2.patch, HIVE-9583.1.patch


 Hive uses IMetaStoreClient, which makes using RetryingMetaStoreClient easy: 
 during a failure, the client retries and possibly succeeds. But HCatalog has 
 long been using HiveMetaStoreClient directly, so failures are costly, 
 especially if they occur during the commit stage of a job. It's also not 
 possible to do a rolling upgrade of the MetaStore Server.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9508) MetaStore client socket connection should have a lifetime

2015-02-05 Thread Thiruvel Thirumoolan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvel Thirumoolan updated HIVE-9508:
---
Attachment: HIVE-9508.3.patch

Uploading patch with sane defaults.

 MetaStore client socket connection should have a lifetime
 -

 Key: HIVE-9508
 URL: https://issues.apache.org/jira/browse/HIVE-9508
 Project: Hive
  Issue Type: Sub-task
  Components: CLI, Metastore
Reporter: Thiruvel Thirumoolan
Assignee: Thiruvel Thirumoolan
  Labels: metastore, rolling_upgrade
 Fix For: 1.2.0

 Attachments: HIVE-9508.1.patch, HIVE-9508.2.patch, HIVE-9508.3.patch


 Currently HiveMetaStoreClient (or SessionHMSC) stays connected to one Metastore 
 server until the connection is closed or there is a problem. I would like to 
 introduce the concept of a MetaStore client socket lifetime: the MS client 
 will reconnect once the socket lifetime is reached. This will help during a 
 rolling upgrade of the Metastore.
 When there are multiple Metastore servers behind a VIP (load balancer), it is 
 easy to take one server out of rotation, wait 10+ minutes for all existing 
 connections to die down (if the lifetime is, say, 5 minutes), and then update 
 the server.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9582) HCatalog should use IMetaStoreClient interface

2015-02-05 Thread Thiruvel Thirumoolan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvel Thirumoolan updated HIVE-9582:
---
Attachment: HIVE-9582.1.patch

Uploading a patch that applies cleanly on trunk, with the right file name so the 
tests can run.

 HCatalog should use IMetaStoreClient interface
 --

 Key: HIVE-9582
 URL: https://issues.apache.org/jira/browse/HIVE-9582
 Project: Hive
  Issue Type: Sub-task
  Components: HCatalog, Metastore
Affects Versions: 0.14.0, 0.13.1
Reporter: Thiruvel Thirumoolan
Assignee: Thiruvel Thirumoolan
  Labels: hcatalog, metastore, rolling_upgrade
 Fix For: 0.14.1

 Attachments: HIVE-9582.1.patch, HIVE-9583.1.patch


 Hive uses IMetaStoreClient, which makes using RetryingMetaStoreClient easy: 
 during a failure, the client retries and possibly succeeds. But HCatalog has 
 long been using HiveMetaStoreClient directly, so failures are costly, 
 especially if they occur during the commit stage of a job. It's also not 
 possible to do a rolling upgrade of the MetaStore Server.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-9583) Rolling upgrade of Hive MetaStore Server

2015-02-04 Thread Thiruvel Thirumoolan (JIRA)
Thiruvel Thirumoolan created HIVE-9583:
--

 Summary: Rolling upgrade of Hive MetaStore Server
 Key: HIVE-9583
 URL: https://issues.apache.org/jira/browse/HIVE-9583
 Project: Hive
  Issue Type: Improvement
  Components: HCatalog, Metastore
Affects Versions: 0.14.0
Reporter: Thiruvel Thirumoolan
Assignee: Thiruvel Thirumoolan
 Fix For: 1.2.0


This is an umbrella JIRA to track all rolling upgrade JIRAs w.r.t MetaStore 
server. This will be helpful for users deploying Metastore server and 
connecting to it with HCatalog or Hive CLI interface.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-9582) HCatalog should use IMetaStoreClient interface

2015-02-04 Thread Thiruvel Thirumoolan (JIRA)
Thiruvel Thirumoolan created HIVE-9582:
--

 Summary: HCatalog should use IMetaStoreClient interface
 Key: HIVE-9582
 URL: https://issues.apache.org/jira/browse/HIVE-9582
 Project: Hive
  Issue Type: Improvement
  Components: HCatalog, Metastore
Affects Versions: 0.13.1, 0.14.0
Reporter: Thiruvel Thirumoolan
Assignee: Thiruvel Thirumoolan
 Fix For: 0.14.1


Hive uses IMetaStoreClient, which makes using RetryingMetaStoreClient easy: during 
a failure, the client retries and possibly succeeds. But HCatalog has long been 
using HiveMetaStoreClient directly, so failures are costly, especially if they 
occur during the commit stage of a job. It's also not possible to do a rolling 
upgrade of the MetaStore Server.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9582) HCatalog should use IMetaStoreClient interface

2015-02-04 Thread Thiruvel Thirumoolan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvel Thirumoolan updated HIVE-9582:
---
Attachment: HIVE-9583.1.patch

Uploading a WIP patch

 HCatalog should use IMetaStoreClient interface
 --

 Key: HIVE-9582
 URL: https://issues.apache.org/jira/browse/HIVE-9582
 Project: Hive
  Issue Type: Sub-task
  Components: HCatalog, Metastore
Affects Versions: 0.14.0, 0.13.1
Reporter: Thiruvel Thirumoolan
Assignee: Thiruvel Thirumoolan
  Labels: hcatalog, metastore, rolling_upgrade
 Fix For: 0.14.1

 Attachments: HIVE-9583.1.patch


 Hive uses IMetaStoreClient, which makes using RetryingMetaStoreClient easy: 
 during a failure, the client retries and possibly succeeds. But HCatalog has 
 long been using HiveMetaStoreClient directly, so failures are costly, 
 especially if they occur during the commit stage of a job. It's also not 
 possible to do a rolling upgrade of the MetaStore Server.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9582) HCatalog should use IMetaStoreClient interface

2015-02-04 Thread Thiruvel Thirumoolan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvel Thirumoolan updated HIVE-9582:
---
Issue Type: Sub-task  (was: Improvement)
Parent: HIVE-9583

 HCatalog should use IMetaStoreClient interface
 --

 Key: HIVE-9582
 URL: https://issues.apache.org/jira/browse/HIVE-9582
 Project: Hive
  Issue Type: Sub-task
  Components: HCatalog, Metastore
Affects Versions: 0.14.0, 0.13.1
Reporter: Thiruvel Thirumoolan
Assignee: Thiruvel Thirumoolan
  Labels: hcatalog, metastore, rolling_upgrade
 Fix For: 0.14.1


 Hive uses IMetaStoreClient, which makes using RetryingMetaStoreClient easy: 
 during a failure, the client retries and possibly succeeds. But HCatalog has 
 long been using HiveMetaStoreClient directly, so failures are costly, 
 especially if they occur during the commit stage of a job. It's also not 
 possible to do a rolling upgrade of the MetaStore Server.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8696) HCatClientHMSImpl doesn't use a Retrying-HiveMetastoreClient.

2015-02-04 Thread Thiruvel Thirumoolan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvel Thirumoolan updated HIVE-8696:
---
Issue Type: Sub-task  (was: Bug)
Parent: HIVE-9583

 HCatClientHMSImpl doesn't use a Retrying-HiveMetastoreClient.
 -

 Key: HIVE-8696
 URL: https://issues.apache.org/jira/browse/HIVE-8696
 Project: Hive
  Issue Type: Sub-task
  Components: HCatalog, Metastore
Affects Versions: 0.12.0, 0.13.1
Reporter: Mithun Radhakrishnan
Assignee: Mithun Radhakrishnan
 Fix For: 1.2.0

 Attachments: HIVE-8696.1.patch


 The HCatClientHMSImpl doesn't use a RetryingHiveMetastoreClient. Users of the 
 HCatClient API that log in through keytabs will fail without retry, when 
 their TGTs expire.
 The fix is inbound. 
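
 A minimal sketch of what the retrying behavior buys the caller (illustrative 
 only; the real fix wraps the client in RetryingMetaStoreClient, a dynamic 
 proxy, rather than using a helper like this):

     import java.util.concurrent.Callable;

     public class RetrySketch {
       // Re-attempt a failed metastore call instead of surfacing the first
       // failure; a real client would also reconnect (and re-login from the
       // keytab) before the next attempt.
       public static <T> T callWithRetries(Callable<T> metastoreCall, int maxAttempts)
           throws Exception {
         if (maxAttempts < 1) {
           throw new IllegalArgumentException("maxAttempts must be >= 1");
         }
         Exception last = null;
         for (int attempt = 1; attempt <= maxAttempts; attempt++) {
           try {
             return metastoreCall.call();
           } catch (Exception e) {
             last = e;
           }
         }
         throw last;
       }
     }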



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9508) MetaStore client socket connection should have a lifetime

2015-02-04 Thread Thiruvel Thirumoolan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvel Thirumoolan updated HIVE-9508:
---
Issue Type: Sub-task  (was: Improvement)
Parent: HIVE-9583

 MetaStore client socket connection should have a lifetime
 -

 Key: HIVE-9508
 URL: https://issues.apache.org/jira/browse/HIVE-9508
 Project: Hive
  Issue Type: Sub-task
  Components: CLI, Metastore
Reporter: Thiruvel Thirumoolan
Assignee: Thiruvel Thirumoolan
  Labels: metastore, rolling_upgrade
 Fix For: 1.2.0

 Attachments: HIVE-9508.1.patch, HIVE-9508.2.patch


 Currently HiveMetaStoreClient (or SessionHMSC) stays connected to one Metastore 
 server until the connection is closed or there is a problem. I would like to 
 introduce the concept of a MetaStore client socket lifetime: the MS client 
 will reconnect once the socket lifetime is reached. This will help during a 
 rolling upgrade of the Metastore.
 When there are multiple Metastore servers behind a VIP (load balancer), it is 
 easy to take one server out of rotation, wait 10+ minutes for all existing 
 connections to die down (if the lifetime is, say, 5 minutes), and then update 
 the server.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9582) HCatalog should use IMetaStoreClient interface

2015-02-04 Thread Thiruvel Thirumoolan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvel Thirumoolan updated HIVE-9582:
---
Status: Patch Available  (was: Open)

 HCatalog should use IMetaStoreClient interface
 --

 Key: HIVE-9582
 URL: https://issues.apache.org/jira/browse/HIVE-9582
 Project: Hive
  Issue Type: Sub-task
  Components: HCatalog, Metastore
Affects Versions: 0.13.1, 0.14.0
Reporter: Thiruvel Thirumoolan
Assignee: Thiruvel Thirumoolan
  Labels: hcatalog, metastore, rolling_upgrade
 Fix For: 0.14.1

 Attachments: HIVE-9583.1.patch


 Hive uses IMetaStoreClient, which makes using RetryingMetaStoreClient easy: 
 during a failure, the client retries and possibly succeeds. But HCatalog has 
 long been using HiveMetaStoreClient directly, so failures are costly, 
 especially if they occur during the commit stage of a job. It's also not 
 possible to do a rolling upgrade of the MetaStore Server.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8696) HCatClientHMSImpl doesn't use a Retrying-HiveMetastoreClient.

2015-02-04 Thread Thiruvel Thirumoolan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvel Thirumoolan updated HIVE-8696:
---
Fix Version/s: 1.2.0

 HCatClientHMSImpl doesn't use a Retrying-HiveMetastoreClient.
 -

 Key: HIVE-8696
 URL: https://issues.apache.org/jira/browse/HIVE-8696
 Project: Hive
  Issue Type: Bug
  Components: HCatalog, Metastore
Affects Versions: 0.12.0, 0.13.1
Reporter: Mithun Radhakrishnan
Assignee: Mithun Radhakrishnan
 Fix For: 1.2.0

 Attachments: HIVE-8696.1.patch


 The HCatClientHMSImpl doesn't use a RetryingHiveMetastoreClient. Users of the 
 HCatClient API that log in through keytabs will fail without retry, when 
 their TGTs expire.
 The fix is inbound. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9508) MetaStore client socket connection should have a lifetime

2015-02-03 Thread Thiruvel Thirumoolan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvel Thirumoolan updated HIVE-9508:
---
Attachment: HIVE-9508.2.patch

Uploading another version of the patch with the functionality enabled and a minor 
bug fix.

 MetaStore client socket connection should have a lifetime
 -

 Key: HIVE-9508
 URL: https://issues.apache.org/jira/browse/HIVE-9508
 Project: Hive
  Issue Type: Improvement
  Components: CLI, Metastore
Reporter: Thiruvel Thirumoolan
Assignee: Thiruvel Thirumoolan
  Labels: metastore, rolling_upgrade
 Fix For: 1.2.0

 Attachments: HIVE-9508.1.patch, HIVE-9508.2.patch


 Currently HiveMetaStoreClient (or SessionHMSC) stays connected to one Metastore 
 server until the connection is closed or there is a problem. I would like to 
 introduce the concept of a MetaStore client socket lifetime: the MS client 
 will reconnect once the socket lifetime is reached. This will help during a 
 rolling upgrade of the Metastore.
 When there are multiple Metastore servers behind a VIP (load balancer), it is 
 easy to take one server out of rotation, wait 10+ minutes for all existing 
 connections to die down (if the lifetime is, say, 5 minutes), and then update 
 the server.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9508) MetaStore client socket connection should have a lifetime

2015-01-30 Thread Thiruvel Thirumoolan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvel Thirumoolan updated HIVE-9508:
---
Status: Patch Available  (was: Open)

 MetaStore client socket connection should have a lifetime
 -

 Key: HIVE-9508
 URL: https://issues.apache.org/jira/browse/HIVE-9508
 Project: Hive
  Issue Type: Improvement
  Components: CLI, Metastore
Reporter: Thiruvel Thirumoolan
Assignee: Thiruvel Thirumoolan
  Labels: metastore, rolling_upgrade
 Fix For: 1.2.0

 Attachments: HIVE-9508.1.patch


 Currently HiveMetaStoreClient (or SessionHMSC) stays connected to one Metastore 
 server until the connection is closed or there is a problem. I would like to 
 introduce the concept of a MetaStore client socket lifetime: the MS client 
 will reconnect once the socket lifetime is reached. This will help during a 
 rolling upgrade of the Metastore.
 When there are multiple Metastore servers behind a VIP (load balancer), it is 
 easy to take one server out of rotation, wait 10+ minutes for all existing 
 connections to die down (if the lifetime is, say, 5 minutes), and then update 
 the server.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9508) MetaStore client socket connection should have a lifetime

2015-01-29 Thread Thiruvel Thirumoolan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvel Thirumoolan updated HIVE-9508:
---
Fix Version/s: (was: 0.15.0)
   1.2.0

 MetaStore client socket connection should have a lifetime
 -

 Key: HIVE-9508
 URL: https://issues.apache.org/jira/browse/HIVE-9508
 Project: Hive
  Issue Type: Improvement
  Components: CLI, Metastore
Reporter: Thiruvel Thirumoolan
Assignee: Thiruvel Thirumoolan
  Labels: metastore, rolling_upgrade
 Fix For: 1.2.0


 Currently HiveMetaStoreClient (or SessionHMSC) stays connected to one Metastore 
 server until the connection is closed or there is a problem. I would like to 
 introduce the concept of a MetaStore client socket lifetime: the MS client 
 will reconnect once the socket lifetime is reached. This will help during a 
 rolling upgrade of the Metastore.
 When there are multiple Metastore servers behind a VIP (load balancer), it is 
 easy to take one server out of rotation, wait 10+ minutes for all existing 
 connections to die down (if the lifetime is, say, 5 minutes), and then update 
 the server.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9508) MetaStore client socket connection should have a lifetime

2015-01-29 Thread Thiruvel Thirumoolan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvel Thirumoolan updated HIVE-9508:
---
Attachment: HIVE-9508.1.patch

Attaching basic patch. The connection lifetime is disabled by default so 
existing users should not be affected.

 MetaStore client socket connection should have a lifetime
 -

 Key: HIVE-9508
 URL: https://issues.apache.org/jira/browse/HIVE-9508
 Project: Hive
  Issue Type: Improvement
  Components: CLI, Metastore
Reporter: Thiruvel Thirumoolan
Assignee: Thiruvel Thirumoolan
  Labels: metastore, rolling_upgrade
 Fix For: 1.2.0

 Attachments: HIVE-9508.1.patch


 Currently HiveMetaStoreClient (or SessionHMSC) stays connected to one Metastore 
 server until the connection is closed or there is a problem. I would like to 
 introduce the concept of a MetaStore client socket lifetime: the MS client 
 will reconnect once the socket lifetime is reached. This will help during a 
 rolling upgrade of the Metastore.
 When there are multiple Metastore servers behind a VIP (load balancer), it is 
 easy to take one server out of rotation, wait 10+ minutes for all existing 
 connections to die down (if the lifetime is, say, 5 minutes), and then update 
 the server.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-9508) MetaStore client socket connection should have a lifetime

2015-01-29 Thread Thiruvel Thirumoolan (JIRA)
Thiruvel Thirumoolan created HIVE-9508:
--

 Summary: MetaStore client socket connection should have a lifetime
 Key: HIVE-9508
 URL: https://issues.apache.org/jira/browse/HIVE-9508
 Project: Hive
  Issue Type: Improvement
  Components: CLI, Metastore
Reporter: Thiruvel Thirumoolan
Assignee: Thiruvel Thirumoolan
 Fix For: 0.15.0


Currently HiveMetaStoreClient (or SessionHMSC) stays connected to one Metastore 
server until the connection is closed or there is a problem. I would like to 
introduce the concept of a MetaStore client socket lifetime: the MS client will 
reconnect once the socket lifetime is reached. This will help during a rolling 
upgrade of the Metastore.

When there are multiple Metastore servers behind a VIP (load balancer), it is 
easy to take one server out of rotation, wait 10+ minutes for all existing 
connections to die down (if the lifetime is, say, 5 minutes), and then update the 
server.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-6090) Audit logs for HiveServer2

2015-01-20 Thread Thiruvel Thirumoolan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvel Thirumoolan updated HIVE-6090:
---
Attachment: HIVE-6090.1.patch

Uploading the patch so the unit tests can run. TestJdbcDriver2 passed with the changes.

 Audit logs for HiveServer2
 --

 Key: HIVE-6090
 URL: https://issues.apache.org/jira/browse/HIVE-6090
 Project: Hive
  Issue Type: Improvement
  Components: Diagnosability, HiveServer2
Reporter: Thiruvel Thirumoolan
Assignee: Thiruvel Thirumoolan
 Attachments: HIVE-6090.1.WIP.patch, HIVE-6090.1.patch, HIVE-6090.patch


 HiveMetastore has audit logs, and we would like to audit all queries and 
 requests to HiveServer2 as well. This will help in understanding how the APIs 
 were used, which queries were submitted, by which users, etc.
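
 A hypothetical sketch of the kind of audit line intended (field names and the 
 logger wiring are assumptions, not the attached patch):

     import org.slf4j.Logger;
     import org.slf4j.LoggerFactory;

     public class HS2AuditSketch {
       private static final Logger AUDIT = LoggerFactory.getLogger("HiveServer2.audit");

       // One line per request: caller, remote address, API/operation, and the
       // query text if the request carried one.
       public static void logRequest(String user, String ipAddress, String operation,
           String queryOrNull) {
         AUDIT.info("user={} ip={} operation={} query={}",
             user, ipAddress, operation, queryOrNull);
       }
     }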



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-6090) Audit logs for HiveServer2

2015-01-20 Thread Thiruvel Thirumoolan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvel Thirumoolan updated HIVE-6090:
---
Fix Version/s: 0.15.0
   Labels: audit hiveserver  (was: )
   Status: Patch Available  (was: Open)

 Audit logs for HiveServer2
 --

 Key: HIVE-6090
 URL: https://issues.apache.org/jira/browse/HIVE-6090
 Project: Hive
  Issue Type: Improvement
  Components: Diagnosability, HiveServer2
Reporter: Thiruvel Thirumoolan
Assignee: Thiruvel Thirumoolan
  Labels: hiveserver, audit
 Fix For: 0.15.0

 Attachments: HIVE-6090.1.WIP.patch, HIVE-6090.1.patch, HIVE-6090.patch


 HiveMetastore has audit logs, and we would like to audit all queries and 
 requests to HiveServer2 as well. This will help in understanding how the APIs 
 were used, which queries were submitted, by which users, etc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-8417) round(decimal, negative) errors out/wrong results with reduce side vectorization

2014-10-09 Thread Thiruvel Thirumoolan (JIRA)
Thiruvel Thirumoolan created HIVE-8417:
--

 Summary: round(decimal, negative) errors out/wrong results with 
reduce side vectorization
 Key: HIVE-8417
 URL: https://issues.apache.org/jira/browse/HIVE-8417
 Project: Hive
  Issue Type: Bug
  Components: Vectorization
Affects Versions: 0.14.0
Reporter: Thiruvel Thirumoolan
Assignee: Jitendra Nath Pandey
Priority: Critical


With reduce-side vectorization enabled, the round UDF fails when given a decimal 
value and a negative argument. It passes when there is no reducer or when 
vectorization is turned off.

Simulated with:
create table decimal_tbl (dec decimal(10,0));

Data: just one record, 101

Query: select dec, round(dec, -1) from decimal_tbl order by dec;

This query fails on text and rcfile with an IndexOutOfBoundsException in 
Decimal128.toFormalString(), but returns 101 101 with orc. When the order by is 
removed, it returns 101 100 with orc and rc. When order by dec is replaced with 
order by round(dec, -1), it fails with the same exception with orc too.

Following is the exception thrown:

Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error 
while processing vector batch (tag=0) [Error getting row data with exception 
java.lang.IndexOutOfBoundsException: start 0, end 3, s.length() 2
at 
java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:476)
at java.lang.StringBuilder.append(StringBuilder.java:191)
at 
org.apache.hadoop.hive.common.type.Decimal128.toFormalString(Decimal128.java:1858)
at 
org.apache.hadoop.hive.common.type.Decimal128.toBigDecimal(Decimal128.java:1733)
at 
org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpressionWriterFactory$1.writeValue(VectorExpressionWriterFactory.java:469)
at 
org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpressionWriterFactory$VectorExpressionWriterDecimal.writeValue(VectorExpressionWriterFactory.java:310)
at 
org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch.toString(VectorizedRowBatch.java:159)
at 
org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.processVectors(ReduceRecordSource.java:371)
at 
org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:250)
at 
org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.run(ReduceRecordProcessor.java:168)
at 
org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:164)
at 
org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:142)
at 
org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:324)
at 
org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:176)
at 
org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:168)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1637)
at 
org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:168)
at 
org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:163)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
at java.util.concurrent.FutureTask.run(FutureTask.java:166)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:722)
 ]
at 
org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.processVectors(ReduceRecordSource.java:376)
at 
org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:250)
... 16 more
Caused by: java.lang.IndexOutOfBoundsException: start 0, end 3, s.length() 2
at 
java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:476)
at java.lang.StringBuilder.append(StringBuilder.java:191)
at 
org.apache.hadoop.hive.common.type.Decimal128.toFormalString(Decimal128.java:1858)
at 
org.apache.hadoop.hive.common.type.Decimal128.toBigDecimal(Decimal128.java:1733)
at 
org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpressionWriterFactory$1.writeValue(VectorExpressionWriterFactory.java:469)
at 
org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpressionWriterFactory$VectorExpressionWriterDecimal.writeValue(VectorExpressionWriterFactory.java:310)
at 
org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpressionWriterFactory$VectorExpressionWriterSetter.writeValue(VectorExpressionWriterFactory.java:1153

[jira] [Commented] (HIVE-8371) HCatStorer should fail by default when publishing to an existing partition

2014-10-07 Thread Thiruvel Thirumoolan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14162873#comment-14162873
 ] 

Thiruvel Thirumoolan commented on HIVE-8371:


Having a warehouse-level property that basically sets the default value of the 
immutable property will only help restore the HCatalog behavior. Isn't that 
going to flip the behavior of Hive?

I am very concerned that the default HCatStorer behavior has changed after it 
has been out for a very long time.

 HCatStorer should fail by default when publishing to an existing partition
 --

 Key: HIVE-8371
 URL: https://issues.apache.org/jira/browse/HIVE-8371
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Affects Versions: 0.13.0, 0.14.0, 0.13.1
Reporter: Thiruvel Thirumoolan
Assignee: Thiruvel Thirumoolan
  Labels: hcatalog, partition

 In Hive 0.12 and before (or in previous HCatalog releases) HCatStorer would 
 fail if the partition already exists (whether before launching the job or 
 during commit, depending on the partitioning). HIVE-6406 changed that behavior 
 and by default does an append. This causes data quality issues since a rerun 
 (or duplicate run) won't fail (when it used to) and will just append to the 
 partition.
 A preferable approach would be to leave HCatStorer behavior as is (fail 
 during a duplicate publish) and support append through an option. Overwrite 
 can also be implemented in a similar fashion. E.g.:
 store A into 'db.table' using 
 org.apache.hive.hcatalog.pig.HCatStorer('partspec', '', ' -append');



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-8371) HCatStorer should fail by default when publishing to an existing partition

2014-10-06 Thread Thiruvel Thirumoolan (JIRA)
Thiruvel Thirumoolan created HIVE-8371:
--

 Summary: HCatStorer should fail by default when publishing to an 
existing partition
 Key: HIVE-8371
 URL: https://issues.apache.org/jira/browse/HIVE-8371
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Affects Versions: 0.13.1, 0.13.0, 0.14.0
Reporter: Thiruvel Thirumoolan


In Hive 0.12 and before (or in previous HCatalog releases) HCatStorer would fail 
if the partition already exists (whether before launching the job or during 
commit, depending on the partitioning). HIVE-6406 changed that behavior and by 
default does an append. This causes data quality issues since a rerun (or 
duplicate run) won't fail (when it used to) and will just append to the 
partition.

A preferable approach would be to leave HCatStorer behavior as is (fail during 
a duplicate publish) and support append through an option. Overwrite can also 
be implemented in a similar fashion. E.g.:

store A into 'db.table' using 
org.apache.hive.hcatalog.pig.HCatStorer('partspec', '', ' -append');



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HIVE-8371) HCatStorer should fail by default when publishing to an existing partition

2014-10-06 Thread Thiruvel Thirumoolan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvel Thirumoolan reassigned HIVE-8371:
--

Assignee: Thiruvel Thirumoolan

 HCatStorer should fail by default when publishing to an existing partition
 --

 Key: HIVE-8371
 URL: https://issues.apache.org/jira/browse/HIVE-8371
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Affects Versions: 0.13.0, 0.14.0, 0.13.1
Reporter: Thiruvel Thirumoolan
Assignee: Thiruvel Thirumoolan
  Labels: hcatalog, partition

 In Hive 0.12 and before (or in previous HCatalog releases) HCatStorer would 
 fail if the partition already exists (whether before launching the job or 
 during commit, depending on the partitioning). HIVE-6406 changed that behavior 
 and by default does an append. This causes data quality issues since a rerun 
 (or duplicate run) won't fail (when it used to) and will just append to the 
 partition.
 A preferable approach would be to leave HCatStorer behavior as is (fail 
 during a duplicate publish) and support append through an option. Overwrite 
 can also be implemented in a similar fashion. E.g.:
 store A into 'db.table' using 
 org.apache.hive.hcatalog.pig.HCatStorer('partspec', '', ' -append');



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8371) HCatStorer should fail by default when publishing to an existing partition

2014-10-06 Thread Thiruvel Thirumoolan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14161227#comment-14161227
 ] 

Thiruvel Thirumoolan commented on HIVE-8371:


[~sushanth] Let me know what you think about this.

 HCatStorer should fail by default when publishing to an existing partition
 --

 Key: HIVE-8371
 URL: https://issues.apache.org/jira/browse/HIVE-8371
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Affects Versions: 0.13.0, 0.14.0, 0.13.1
Reporter: Thiruvel Thirumoolan
Assignee: Thiruvel Thirumoolan
  Labels: hcatalog, partition

 In Hive 0.12 and before (or in previous HCatalog releases) HCatStorer would 
 fail if the partition already exists (whether before launching the job or 
 during commit, depending on the partitioning). HIVE-6406 changed that behavior 
 and by default does an append. This causes data quality issues since a rerun 
 (or duplicate run) won't fail (when it used to) and will just append to the 
 partition.
 A preferable approach would be to leave HCatStorer behavior as is (fail 
 during a duplicate publish) and support append through an option. Overwrite 
 can also be implemented in a similar fashion. E.g.:
 store A into 'db.table' using 
 org.apache.hive.hcatalog.pig.HCatStorer('partspec', '', ' -append');



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8264) Math UDFs in Reducer-with-vectorization fail with ArrayIndexOutOfBoundsException

2014-09-26 Thread Thiruvel Thirumoolan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14148981#comment-14148981
 ] 

Thiruvel Thirumoolan commented on HIVE-8264:


Thanks [~mmccline], it appears to fix the problem. After applying the patch in 
HIVE-8171, I don't see any exceptions and the query runs fine.

 Math UDFs in Reducer-with-vectorization fail with 
 ArrayIndexOutOfBoundsException
 

 Key: HIVE-8264
 URL: https://issues.apache.org/jira/browse/HIVE-8264
 Project: Hive
  Issue Type: Bug
  Components: Tez, UDF, Vectorization
Affects Versions: 0.14.0
 Environment: Hive trunk - as of today
 Tez - 0.5.0
 Hadoop - 2.5
Reporter: Thiruvel Thirumoolan
  Labels: mathfunction, tez, vectorization

 The following queries are representative of the exceptions we are seeing with 
 trunk. These queries pass if vectorization is disabled (or if limit is 
 removed, which means no reducer).
 select name, log2(0) from (select name from mytable limit 1) t;
 select name, rand() from (select name from mytable limit 1) t;
 ... similar patterns with other Math UDFs.
 Exception:
 ], TaskAttempt 3 failed, info=[Error: Failure while running 
 task:java.lang.RuntimeException: java.lang.RuntimeException: 
 org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
 processing vector batch (tag=0)
   at 
 org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:177)
   at 
 org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:142)
   at 
 org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:324)
   at 
 org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:180)
   at 
 org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:172)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:415)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1637)
   at 
 org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:172)
   at 
 org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:167)
   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
   at java.util.concurrent.FutureTask.run(FutureTask.java:166)
   at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
   at java.lang.Thread.run(Thread.java:722)
 Caused by: java.lang.RuntimeException: 
 org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
 processing vector batch (tag=0)
   at 
 org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:254)
   at 
 org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.run(ReduceRecordProcessor.java:167)
   at 
 org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:154)
   ... 14 more
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
 Error while processing vector batch (tag=0)
   at 
 org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.processVectors(ReduceRecordSource.java:360)
   at 
 org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:242)
   ... 16 more
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating 
 null
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:127)
   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:801)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorLimitOperator.processOp(VectorLimitOperator.java:47)
   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:801)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:139)
   at 
 org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.processVectors(ReduceRecordSource.java:347)
   ... 17 more
 Caused by: java.lang.ArrayIndexOutOfBoundsException: 0
   at 
 org.apache.hadoop.hive.ql.exec.vector.expressions.ConstantVectorExpression.evaluateLong(ConstantVectorExpression.java:102)
   at 
 org.apache.hadoop.hive.ql.exec.vector.expressions.ConstantVectorExpression.evaluate(ConstantVectorExpression.java:150)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:125)
   ... 22 more



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HIVE-8264) Math UDFs in Reducer-with-vectorization fail with ArrayIndexOutOfBoundsException

2014-09-26 Thread Thiruvel Thirumoolan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvel Thirumoolan resolved HIVE-8264.

Resolution: Duplicate

 Math UDFs in Reducer-with-vectorization fail with 
 ArrayIndexOutOfBoundsException
 

 Key: HIVE-8264
 URL: https://issues.apache.org/jira/browse/HIVE-8264
 Project: Hive
  Issue Type: Bug
  Components: Tez, UDF, Vectorization
Affects Versions: 0.14.0
 Environment: Hive trunk - as of today
 Tez - 0.5.0
 Hadoop - 2.5
Reporter: Thiruvel Thirumoolan
  Labels: mathfunction, tez, vectorization

 The following queries are representative of the exceptions we are seeing with 
 trunk. These queries pass if vectorization is disabled (or if the limit is 
 removed, which means there is no reducer).
 select name, log2(0) from (select name from mytable limit 1) t;
 select name, rand() from (select name from mytable limit 1) t;
 .. similar patterns with other Math UDFs.
 Exception:
 ], TaskAttempt 3 failed, info=[Error: Failure while running 
 task:java.lang.RuntimeException: java.lang.RuntimeException: 
 org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
 processing vector batch (tag=0)
   at 
 org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:177)
   at 
 org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:142)
   at 
 org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:324)
   at 
 org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:180)
   at 
 org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:172)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:415)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1637)
   at 
 org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:172)
   at 
 org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:167)
   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
   at java.util.concurrent.FutureTask.run(FutureTask.java:166)
   at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
   at java.lang.Thread.run(Thread.java:722)
 Caused by: java.lang.RuntimeException: 
 org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
 processing vector batch (tag=0)
   at 
 org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:254)
   at 
 org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.run(ReduceRecordProcessor.java:167)
   at 
 org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:154)
   ... 14 more
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
 Error while processing vector batch (tag=0)
   at 
 org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.processVectors(ReduceRecordSource.java:360)
   at 
 org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:242)
   ... 16 more
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating 
 null
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:127)
   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:801)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorLimitOperator.processOp(VectorLimitOperator.java:47)
   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:801)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:139)
   at 
 org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.processVectors(ReduceRecordSource.java:347)
   ... 17 more
 Caused by: java.lang.ArrayIndexOutOfBoundsException: 0
   at 
 org.apache.hadoop.hive.ql.exec.vector.expressions.ConstantVectorExpression.evaluateLong(ConstantVectorExpression.java:102)
   at 
 org.apache.hadoop.hive.ql.exec.vector.expressions.ConstantVectorExpression.evaluate(ConstantVectorExpression.java:150)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:125)
   ... 22 more



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-8264) Math UDFs in Reducer-with-vectorization fail with ArrayIndexOutOfBoundsException

2014-09-25 Thread Thiruvel Thirumoolan (JIRA)
Thiruvel Thirumoolan created HIVE-8264:
--

 Summary: Math UDFs in Reducer-with-vectorization fail with 
ArrayIndexOutOfBoundsException
 Key: HIVE-8264
 URL: https://issues.apache.org/jira/browse/HIVE-8264
 Project: Hive
  Issue Type: Bug
  Components: Tez, UDF, Vectorization
Affects Versions: 0.14.0
 Environment: Hive trunk - as of today
Tez - 0.5.0
Hadoop - 2.5
Reporter: Thiruvel Thirumoolan
Assignee: Jitendra Nath Pandey


The following queries are representative of the exceptions we are seeing with 
trunk. These queries pass if vectorization is disabled (or if the limit is 
removed, which means there is no reducer).

select name, log2(0) from (select name from mytable limit 1) t;
select name, rand() from (select name from mytable limit 1) t;
.. similar patterns with other Math UDFs.

Exception:

], TaskAttempt 3 failed, info=[Error: Failure while running 
task:java.lang.RuntimeException: java.lang.RuntimeException: 
org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
processing vector batch (tag=0)
at 
org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:177)
at 
org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:142)
at 
org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:324)
at 
org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:180)
at 
org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:172)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1637)
at 
org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:172)
at 
org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:167)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
at java.util.concurrent.FutureTask.run(FutureTask.java:166)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:722)
Caused by: java.lang.RuntimeException: 
org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
processing vector batch (tag=0)
at 
org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:254)
at 
org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.run(ReduceRecordProcessor.java:167)
at 
org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:154)
... 14 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error 
while processing vector batch (tag=0)
at 
org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.processVectors(ReduceRecordSource.java:360)
at 
org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:242)
... 16 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating 
null
at 
org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:127)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:801)
at 
org.apache.hadoop.hive.ql.exec.vector.VectorLimitOperator.processOp(VectorLimitOperator.java:47)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:801)
at 
org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:139)
at 
org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.processVectors(ReduceRecordSource.java:347)
... 17 more
Caused by: java.lang.ArrayIndexOutOfBoundsException: 0
at 
org.apache.hadoop.hive.ql.exec.vector.expressions.ConstantVectorExpression.evaluateLong(ConstantVectorExpression.java:102)
at 
org.apache.hadoop.hive.ql.exec.vector.expressions.ConstantVectorExpression.evaluate(ConstantVectorExpression.java:150)
at 
org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:125)
... 22 more
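
A quick mitigation, since the queries above pass when vectorization is disabled, 
is to switch vectorized execution off for the affected session. A minimal JDBC 
sketch, assuming a HiveServer2 endpoint at jdbc:hive2://host:10000 and that 
hive.vectorized.execution.enabled is the relevant switch; host, user, and table 
names are placeholders:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class VectorizationOffWorkaround {
  public static void main(String[] args) throws Exception {
    Connection conn = DriverManager.getConnection(
        "jdbc:hive2://host:10000/default", "user", "");
    Statement stmt = conn.createStatement();
    // Assumed workaround: disable vectorized execution for this session only,
    // then rerun one of the failing queries from the report.
    stmt.execute("set hive.vectorized.execution.enabled=false");
    ResultSet rs = stmt.executeQuery(
        "select name, log2(0) from (select name from mytable limit 1) t");
    while (rs.next()) {
      System.out.println(rs.getString(1) + "\t" + rs.getString(2));
    }
    rs.close();
    stmt.close();
    conn.close();
  }
}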



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8264) Math UDFs in Reducer-with-vectorization fail with ArrayIndexOutOfBoundsException

2014-09-25 Thread Thiruvel Thirumoolan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvel Thirumoolan updated HIVE-8264:
---
Tags: vectorization,math,udf,tez
Assignee: (was: Jitendra Nath Pandey)
  Labels: mathfunction tez vectorization  (was: )

 Math UDFs in Reducer-with-vectorization fail with 
 ArrayIndexOutOfBoundsException
 

 Key: HIVE-8264
 URL: https://issues.apache.org/jira/browse/HIVE-8264
 Project: Hive
  Issue Type: Bug
  Components: Tez, UDF, Vectorization
Affects Versions: 0.14.0
 Environment: Hive trunk - as of today
 Tez - 0.5.0
 Hadoop - 2.5
Reporter: Thiruvel Thirumoolan
  Labels: mathfunction, tez, vectorization

 The following queries are representative of the exceptions we are seeing with 
 trunk. These queries pass if vectorization is disabled (or if the limit is 
 removed, which means there is no reducer).
 select name, log2(0) from (select name from mytable limit 1) t;
 select name, rand() from (select name from mytable limit 1) t;
 .. similar patterns with other Math UDFs.
 Exception:
 ], TaskAttempt 3 failed, info=[Error: Failure while running 
 task:java.lang.RuntimeException: java.lang.RuntimeException: 
 org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
 processing vector batch (tag=0)
   at 
 org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:177)
   at 
 org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:142)
   at 
 org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:324)
   at 
 org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:180)
   at 
 org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:172)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:415)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1637)
   at 
 org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:172)
   at 
 org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:167)
   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
   at java.util.concurrent.FutureTask.run(FutureTask.java:166)
   at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
   at java.lang.Thread.run(Thread.java:722)
 Caused by: java.lang.RuntimeException: 
 org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
 processing vector batch (tag=0)
   at 
 org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:254)
   at 
 org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.run(ReduceRecordProcessor.java:167)
   at 
 org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:154)
   ... 14 more
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
 Error while processing vector batch (tag=0)
   at 
 org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.processVectors(ReduceRecordSource.java:360)
   at 
 org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:242)
   ... 16 more
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating 
 null
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:127)
   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:801)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorLimitOperator.processOp(VectorLimitOperator.java:47)
   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:801)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:139)
   at 
 org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.processVectors(ReduceRecordSource.java:347)
   ... 17 more
 Caused by: java.lang.ArrayIndexOutOfBoundsException: 0
   at 
 org.apache.hadoop.hive.ql.exec.vector.expressions.ConstantVectorExpression.evaluateLong(ConstantVectorExpression.java:102)
   at 
 org.apache.hadoop.hive.ql.exec.vector.expressions.ConstantVectorExpression.evaluate(ConstantVectorExpression.java:150)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:125)
   ... 22 more



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8264) Math UDFs in Reducer-with-vectorization fail with ArrayIndexOutOfBoundsException

2014-09-25 Thread Thiruvel Thirumoolan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14148537#comment-14148537
 ] 

Thiruvel Thirumoolan commented on HIVE-8264:


[~mmccline] Have you seen this problem before or will it be addressed by any of 
the open jiras you are looking into?

 Math UDFs in Reducer-with-vectorization fail with 
 ArrayIndexOutOfBoundsException
 

 Key: HIVE-8264
 URL: https://issues.apache.org/jira/browse/HIVE-8264
 Project: Hive
  Issue Type: Bug
  Components: Tez, UDF, Vectorization
Affects Versions: 0.14.0
 Environment: Hive trunk - as of today
 Tez - 0.5.0
 Hadoop - 2.5
Reporter: Thiruvel Thirumoolan
  Labels: mathfunction, tez, vectorization

 The following queries are representative of the exceptions we are seeing with 
 trunk. These queries pass if vectorization is disabled (or if the limit is 
 removed, which means there is no reducer).
 select name, log2(0) from (select name from mytable limit 1) t;
 select name, rand() from (select name from mytable limit 1) t;
 .. similar patterns with other Math UDFs.
 Exception:
 ], TaskAttempt 3 failed, info=[Error: Failure while running 
 task:java.lang.RuntimeException: java.lang.RuntimeException: 
 org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
 processing vector batch (tag=0)
   at 
 org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:177)
   at 
 org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:142)
   at 
 org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:324)
   at 
 org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:180)
   at 
 org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:172)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:415)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1637)
   at 
 org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:172)
   at 
 org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:167)
   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
   at java.util.concurrent.FutureTask.run(FutureTask.java:166)
   at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
   at java.lang.Thread.run(Thread.java:722)
 Caused by: java.lang.RuntimeException: 
 org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
 processing vector batch (tag=0)
   at 
 org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:254)
   at 
 org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.run(ReduceRecordProcessor.java:167)
   at 
 org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:154)
   ... 14 more
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
 Error while processing vector batch (tag=0)
   at 
 org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.processVectors(ReduceRecordSource.java:360)
   at 
 org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:242)
   ... 16 more
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating 
 null
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:127)
   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:801)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorLimitOperator.processOp(VectorLimitOperator.java:47)
   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:801)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:139)
   at 
 org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.processVectors(ReduceRecordSource.java:347)
   ... 17 more
 Caused by: java.lang.ArrayIndexOutOfBoundsException: 0
   at 
 org.apache.hadoop.hive.ql.exec.vector.expressions.ConstantVectorExpression.evaluateLong(ConstantVectorExpression.java:102)
   at 
 org.apache.hadoop.hive.ql.exec.vector.expressions.ConstantVectorExpression.evaluate(ConstantVectorExpression.java:150)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:125)
   ... 22 more



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-6090) Audit logs for HiveServer2

2014-09-16 Thread Thiruvel Thirumoolan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvel Thirumoolan updated HIVE-6090:
---
Attachment: HIVE-6090.1.WIP.patch

Uploading a work-in-progress patch that should apply cleanly. Will test against a 
live cluster (Kerberos) and then submit it for precommit tests.

 Audit logs for HiveServer2
 --

 Key: HIVE-6090
 URL: https://issues.apache.org/jira/browse/HIVE-6090
 Project: Hive
  Issue Type: Improvement
  Components: Diagnosability, HiveServer2
Reporter: Thiruvel Thirumoolan
Assignee: Thiruvel Thirumoolan
 Attachments: HIVE-6090.1.WIP.patch, HIVE-6090.patch


 HiveMetastore has audit logs, and we would like to audit all queries and 
 requests to HiveServer2 as well. This will help in understanding how the APIs 
 were used, which queries were submitted, by which users, etc.
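
As a rough sketch of the kind of audit line being discussed (illustrative only; 
the logger name and fields below are assumptions, not taken from the attached 
patch):

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public final class HiveServer2AuditLog {
  // Hypothetical audit category; the actual patch may use a different name.
  private static final Logger AUDIT = LoggerFactory.getLogger("HiveServer2.audit");

  private HiveServer2AuditLog() {
  }

  // One line per request: who connected, from where, and what they asked for.
  public static void log(String user, String ipAddress, String operation, String statement) {
    AUDIT.info("user={} ip={} op={} stmt={}", user, ipAddress, operation, statement);
  }
}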



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-6090) Audit logs for HiveServer2

2014-09-13 Thread Thiruvel Thirumoolan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14132902#comment-14132902
 ] 

Thiruvel Thirumoolan commented on HIVE-6090:


Thanks [~farisa], will rebase and upload.

 Audit logs for HiveServer2
 --

 Key: HIVE-6090
 URL: https://issues.apache.org/jira/browse/HIVE-6090
 Project: Hive
  Issue Type: Improvement
  Components: Diagnosability, HiveServer2
Reporter: Thiruvel Thirumoolan
Assignee: Thiruvel Thirumoolan
 Attachments: HIVE-6090.patch


 HiveMetastore has audit logs, and we would like to audit all queries and 
 requests to HiveServer2 as well. This will help in understanding how the APIs 
 were used, which queries were submitted, by which users, etc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-7604) Add Metastore API to fetch one or more partition names

2014-08-26 Thread Thiruvel Thirumoolan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14111437#comment-14111437
 ] 

Thiruvel Thirumoolan commented on HIVE-7604:


[~ashutoshc] Do you have any comments on the API?

 Add Metastore API to fetch one or more partition names
 --

 Key: HIVE-7604
 URL: https://issues.apache.org/jira/browse/HIVE-7604
 Project: Hive
  Issue Type: New Feature
  Components: Metastore
Reporter: Thiruvel Thirumoolan
Assignee: Thiruvel Thirumoolan
 Fix For: 0.14.0

 Attachments: Design_HIVE_7604.txt


 We need a new API in the Metastore to address the following use cases. Both 
 use cases arise from having tables with hundreds of thousands or, in some 
 cases, millions of partitions.
 1. It should be quick and easy to obtain the distinct values of a partition 
 key, e.g. all dates for which partitions are available. This can be used by 
 tools/frameworks programmatically to understand gaps in partitions before 
 reprocessing them. Currently one has to run Hive queries (JDBC or CLI) to 
 obtain this information, which is unfriendly and heavyweight. And for tables 
 with a large number of partitions, it takes a long time to run the queries 
 and also requires a large heap.
 2. Typically users would like to know the list of partitions available and 
 would run queries that only involve partition keys (select distinct partkey1 
 from table), or would obtain the latest date partition from a dimension table 
 to join against another fact table (select * from fact_table join select 
 max(dt) from dimension_table). Those queries (metadata-only queries) can be 
 pushed to the metastore and need not be run even locally in Hive. If the 
 queries can be converted into queries against the metastore database, the 
 clients can be lightweight and need not fetch all partition names. The 
 results can be obtained much faster with fewer resources.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7604) Add Metastore API to fetch one or more partition names

2014-08-26 Thread Thiruvel Thirumoolan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvel Thirumoolan updated HIVE-7604:
---

Attachment: Design_HIVE_7604.1.txt

Thanks [~ashutoshc], uploading a revised document with additional information on 
the return values. Let me know if it's unclear.

 Add Metastore API to fetch one or more partition names
 --

 Key: HIVE-7604
 URL: https://issues.apache.org/jira/browse/HIVE-7604
 Project: Hive
  Issue Type: New Feature
  Components: Metastore
Reporter: Thiruvel Thirumoolan
Assignee: Thiruvel Thirumoolan
 Fix For: 0.14.0

 Attachments: Design_HIVE_7604.1.txt, Design_HIVE_7604.txt


 We need a new API in Metastore to address the following use cases. Both use 
 cases arise from having tables with hundreds of thousands or in some cases 
 millions of partitions.
 1. It should be quick and easy to obtain distinct values of a partition. Eg: 
 Obtain all dates for which partitions are available. This can be used by 
 tools/frameworks programmatically to understand gaps in partitions before 
 reprocessing them. Currently one has to run Hive queries (JDBC or CLI) to 
 obtain this information which is unfriendly and heavy weight. And for tables 
 which have large number of partitions, it takes a long time to run the 
 queries and it also requires large heap space.
 2. Typically users would like to know the list of partitions available and 
 would run queries that would only involve partition keys (select distinct 
 partkey1 from table) Or to obtain the latest date partition from a dimension 
 table to join against another fact table (select * from fact_table join 
 select max(dt) from dimension_table). Those queries (metadata only queries) 
 can be pushed to metastore and need not be run even locally in Hive. If the 
 queries can be converted into database based queries, the clients can be 
 light weight and need not fetch all partition names. The results can be 
 obtained much faster with less resources.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6093) table creation should fail when user does not have permissions on db

2014-08-19 Thread Thiruvel Thirumoolan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14102368#comment-14102368
 ] 

Thiruvel Thirumoolan commented on HIVE-6093:


Thanks [~thejas]

 table creation should fail when user does not have permissions on db
 

 Key: HIVE-6093
 URL: https://issues.apache.org/jira/browse/HIVE-6093
 Project: Hive
  Issue Type: Bug
  Components: Authorization, HCatalog, Metastore
Affects Versions: 0.12.0, 0.13.0
Reporter: Thiruvel Thirumoolan
Assignee: Thiruvel Thirumoolan
Priority: Minor
  Labels: authorization, metastore, security
 Fix For: 0.14.0

 Attachments: HIVE-6093-1.patch, HIVE-6093.1.patch, HIVE-6093.1.patch, 
 HIVE-6093.patch


 It's possible to create a table under a database where the user does not have 
 write permission. It can be done by specifying a LOCATION where the user has 
 write access (say /tmp/foo). This should be restricted.
 HdfsAuthorizationProvider (which typically runs on the client) checks the 
 database directory during table creation, but 
 StorageBasedAuthorizationProvider does not.
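
The essence of the fix under discussion is to verify write access on the database 
directory itself, not only on whatever LOCATION the user supplies. A simplified 
sketch of that kind of check against the Hadoop FileSystem API (illustrative 
only; the real change lives in StorageBasedAuthorizationProvider and must also 
handle group membership and impersonation):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.permission.FsAction;

public class DbDirWriteCheck {
  // Owner/other bits only, to keep the sketch short; a real check covers groups too.
  public static boolean canWrite(Configuration conf, Path dbDir, String user) throws Exception {
    FileSystem fs = dbDir.getFileSystem(conf);
    FileStatus status = fs.getFileStatus(dbDir);
    if (user.equals(status.getOwner())) {
      return status.getPermission().getUserAction().implies(FsAction.WRITE);
    }
    return status.getPermission().getOtherAction().implies(FsAction.WRITE);
  }
}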



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6093) table creation should fail when user does not have permissions on db

2014-08-14 Thread Thiruvel Thirumoolan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14097401#comment-14097401
 ] 

Thiruvel Thirumoolan commented on HIVE-6093:


Review request @ https://reviews.apache.org/r/24705/

 table creation should fail when user does not have permissions on db
 

 Key: HIVE-6093
 URL: https://issues.apache.org/jira/browse/HIVE-6093
 Project: Hive
  Issue Type: Bug
  Components: Authorization, HCatalog, Metastore
Reporter: Thiruvel Thirumoolan
Assignee: Thiruvel Thirumoolan
Priority: Minor
 Attachments: HIVE-6093-1.patch, HIVE-6093.patch


 It's possible to create a table under a database where the user does not have 
 write permission. It can be done by specifying a LOCATION where the user has 
 write access (say /tmp/foo). This should be restricted.
 HdfsAuthorizationProvider (which typically runs on the client) checks the 
 database directory during table creation, but 
 StorageBasedAuthorizationProvider does not.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6093) table creation should fail when user does not have permissions on db

2014-08-14 Thread Thiruvel Thirumoolan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvel Thirumoolan updated HIVE-6093:
---

Fix Version/s: 0.14.0
   Labels: authorization metastore security  (was: )
Affects Version/s: 0.12.0
   0.13.0
 Release Note: One cannot create table (whether or not they provide a 
LOCATION) if they do not have WRITE permission on the database directory.
   Status: Patch Available  (was: Open)

 table creation should fail when user does not have permissions on db
 

 Key: HIVE-6093
 URL: https://issues.apache.org/jira/browse/HIVE-6093
 Project: Hive
  Issue Type: Bug
  Components: Authorization, HCatalog, Metastore
Affects Versions: 0.13.0, 0.12.0
Reporter: Thiruvel Thirumoolan
Assignee: Thiruvel Thirumoolan
Priority: Minor
  Labels: security, authorization, metastore
 Fix For: 0.14.0

 Attachments: HIVE-6093-1.patch, HIVE-6093.patch


 It's possible to create a table under a database where the user does not have 
 write permission. It can be done by specifying a LOCATION where the user has 
 write access (say /tmp/foo). This should be restricted.
 HdfsAuthorizationProvider (which typically runs on the client) checks the 
 database directory during table creation, but 
 StorageBasedAuthorizationProvider does not.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7604) Add Metastore API to fetch one or more partition names

2014-08-13 Thread Thiruvel Thirumoolan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvel Thirumoolan updated HIVE-7604:
---

Attachment: Design_HIVE_7604.txt

Attaching a file that describes the API and the rationale behind it.

I have an alpha implementation which obtains the distinct values of partition 
keys. To start with, this is ORM-only, and its approach is very similar to 
ExpressionTree.java (using the substring and indexOf string functions). Tested 
this with a table containing about a million partitions, partitioned by 6 keys, 
and using Oracle as the backend. It takes 2-4 seconds to obtain the unique values 
of a partition key. Hope this gives a rough idea of the latency for large tables.
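
For readers unfamiliar with the trick: the stored partition name looks like 
dt=2014-08-01/region=us, and the distinct values come from slicing that string 
rather than instantiating partition objects. A plain-Java rendering of the same 
extraction, purely as an illustration (the real implementation expresses this at 
the query layer; this hypothetical helper also ignores value escaping):

public class PartNameUtil {
  // Given a partition name such as "dt=2014-08-01/region=us", return the value
  // stored for the requested partition key, or null if the key is not present.
  public static String extractPartitionValue(String partName, String key) {
    String marker = key + "=";
    int slash = partName.startsWith(marker) ? 0 : partName.indexOf("/" + marker);
    if (slash < 0) {
      return null;
    }
    int start = partName.indexOf(marker, slash) + marker.length();
    int end = partName.indexOf('/', start);
    return end < 0 ? partName.substring(start) : partName.substring(start, end);
  }
}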

 Add Metastore API to fetch one or more partition names
 --

 Key: HIVE-7604
 URL: https://issues.apache.org/jira/browse/HIVE-7604
 Project: Hive
  Issue Type: New Feature
  Components: Metastore
Reporter: Thiruvel Thirumoolan
Assignee: Thiruvel Thirumoolan
 Fix For: 0.14.0

 Attachments: Design_HIVE_7604.txt


 We need a new API in Metastore to address the following use cases. Both use 
 cases arise from having tables with hundreds of thousands or in some cases 
 millions of partitions.
 1. It should be quick and easy to obtain distinct values of a partition. Eg: 
 Obtain all dates for which partitions are available. This can be used by 
 tools/frameworks programmatically to understand gaps in partitions before 
 reprocessing them. Currently one has to run Hive queries (JDBC or CLI) to 
 obtain this information which is unfriendly and heavy weight. And for tables 
 which have large number of partitions, it takes a long time to run the 
 queries and it also requires large heap space.
 2. Typically users would like to know the list of partitions available and 
 would run queries that would only involve partition keys (select distinct 
 partkey1 from table) Or to obtain the latest date partition from a dimension 
 table to join against another fact table (select * from fact_table join 
 select max(dt) from dimension_table). Those queries (metadata only queries) 
 can be pushed to metastore and need not be run even locally in Hive. If the 
 queries can be converted into database based queries, the clients can be 
 light weight and need not fetch all partition names. The results can be 
 obtained much faster with less resources.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-7604) Add Metastore API to fetch one or more partition names

2014-08-13 Thread Thiruvel Thirumoolan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14096002#comment-14096002
 ] 

Thiruvel Thirumoolan commented on HIVE-7604:


Thanks [~sershe]. I will reuse as much as possible.

 Add Metastore API to fetch one or more partition names
 --

 Key: HIVE-7604
 URL: https://issues.apache.org/jira/browse/HIVE-7604
 Project: Hive
  Issue Type: New Feature
  Components: Metastore
Reporter: Thiruvel Thirumoolan
Assignee: Thiruvel Thirumoolan
 Fix For: 0.14.0

 Attachments: Design_HIVE_7604.txt


 We need a new API in Metastore to address the following use cases. Both use 
 cases arise from having tables with hundreds of thousands or in some cases 
 millions of partitions.
 1. It should be quick and easy to obtain distinct values of a partition. Eg: 
 Obtain all dates for which partitions are available. This can be used by 
 tools/frameworks programmatically to understand gaps in partitions before 
 reprocessing them. Currently one has to run Hive queries (JDBC or CLI) to 
 obtain this information which is unfriendly and heavy weight. And for tables 
 which have large number of partitions, it takes a long time to run the 
 queries and it also requires large heap space.
 2. Typically users would like to know the list of partitions available and 
 would run queries that would only involve partition keys (select distinct 
 partkey1 from table) Or to obtain the latest date partition from a dimension 
 table to join against another fact table (select * from fact_table join 
 select max(dt) from dimension_table). Those queries (metadata only queries) 
 can be pushed to metastore and need not be run even locally in Hive. If the 
 queries can be converted into database based queries, the clients can be 
 light weight and need not fetch all partition names. The results can be 
 obtained much faster with less resources.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6089) Add metrics to HiveServer2

2014-08-13 Thread Thiruvel Thirumoolan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14096508#comment-14096508
 ] 

Thiruvel Thirumoolan commented on HIVE-6089:


Hi [~jaideepdhok]. Sorry about the delay; I was hoping to get back to this. We 
have been using metrics internally and would like to update this patch with what 
we have learnt.

 Add metrics to HiveServer2
 --

 Key: HIVE-6089
 URL: https://issues.apache.org/jira/browse/HIVE-6089
 Project: Hive
  Issue Type: Improvement
  Components: Diagnosability, HiveServer2
Affects Versions: 0.12.0
Reporter: Thiruvel Thirumoolan
Assignee: Thiruvel Thirumoolan
 Fix For: 0.14.0

 Attachments: HIVE-6089_prototype.patch


 We would like to collect metrics about HiveServer2's usage, such as active 
 connections, total requests, etc.
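
A bare-bones sketch of the counters meant here, just to pin down the idea (the 
attached prototype wires this into HiveServer2 and a reporting layer; the class 
and method names below are made up):

import java.util.concurrent.atomic.AtomicLong;

public class HiveServer2Metrics {
  private final AtomicLong activeConnections = new AtomicLong();
  private final AtomicLong totalRequests = new AtomicLong();

  public void connectionOpened() { activeConnections.incrementAndGet(); }
  public void connectionClosed() { activeConnections.decrementAndGet(); }
  public void requestReceived()  { totalRequests.incrementAndGet(); }

  public long getActiveConnections() { return activeConnections.get(); }
  public long getTotalRequests()     { return totalRequests.get(); }
}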



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6093) table creation should fail when user does not have permissions on db

2014-08-12 Thread Thiruvel Thirumoolan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvel Thirumoolan updated HIVE-6093:
---

Attachment: HIVE-6093-1.patch

Updated the patch for trunk and also added unit tests. The 
TestMetastoreAuthorizationProvider and 
TestStorageBasedMetastoreAuthorizationProvider unit tests passed. Running the 
complete suite.

 table creation should fail when user does not have permissions on db
 

 Key: HIVE-6093
 URL: https://issues.apache.org/jira/browse/HIVE-6093
 Project: Hive
  Issue Type: Bug
  Components: Authorization, HCatalog, Metastore
Reporter: Thiruvel Thirumoolan
Assignee: Thiruvel Thirumoolan
Priority: Minor
 Attachments: HIVE-6093-1.patch, HIVE-6093.patch


 It's possible to create a table under a database where the user does not have 
 write permission. It can be done by specifying a LOCATION where the user has 
 write access (say /tmp/foo). This should be restricted.
 HdfsAuthorizationProvider (which typically runs on the client) checks the 
 database directory during table creation, but 
 StorageBasedAuthorizationProvider does not.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HIVE-7604) Add Metastore API to fetch one or more partition names

2014-08-04 Thread Thiruvel Thirumoolan (JIRA)
Thiruvel Thirumoolan created HIVE-7604:
--

 Summary: Add Metastore API to fetch one or more partition names
 Key: HIVE-7604
 URL: https://issues.apache.org/jira/browse/HIVE-7604
 Project: Hive
  Issue Type: New Feature
  Components: Metastore
Reporter: Thiruvel Thirumoolan
Assignee: Thiruvel Thirumoolan
 Fix For: 0.14.0


We need a new API in the Metastore to address the following use cases. Both use 
cases arise from having tables with hundreds of thousands or, in some cases, 
millions of partitions.

1. It should be quick and easy to obtain the distinct values of a partition key, 
e.g. all dates for which partitions are available. This can be used by 
tools/frameworks programmatically to understand gaps in partitions before 
reprocessing them. Currently one has to run Hive queries (JDBC or CLI) to obtain 
this information, which is unfriendly and heavyweight. And for tables with a 
large number of partitions, it takes a long time to run the queries and also 
requires a large heap.

2. Typically users would like to know the list of partitions available and would 
run queries that only involve partition keys (select distinct partkey1 from 
table), or would obtain the latest date partition from a dimension table to join 
against another fact table (select * from fact_table join select max(dt) from 
dimension_table). Those queries (metadata-only queries) can be pushed to the 
metastore and need not be run even locally in Hive. If the queries can be 
converted into queries against the metastore database, the clients can be 
lightweight and need not fetch all partition names. The results can be obtained 
much faster with fewer resources.
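
To make the motivation concrete, this is roughly what a caller has to do today to 
answer a question like "which dates have partitions": pull every partition name 
to the client and parse it there. A sketch against the existing client API (the 
database, table, and dt key are placeholders); the proposed API would push this 
work into the metastore's database instead:

import java.util.List;
import java.util.TreeSet;

import org.apache.hadoop.hive.conf.HiveConf;
import org.apache.hadoop.hive.metastore.HiveMetaStoreClient;

public class DistinctPartitionDates {
  public static void main(String[] args) throws Exception {
    HiveMetaStoreClient client = new HiveMetaStoreClient(new HiveConf());
    // Fetches every partition name over the wire, e.g. "dt=2014-08-01/region=us".
    List<String> names = client.listPartitionNames("mydb", "mytable", (short) -1);
    TreeSet<String> dates = new TreeSet<String>();
    for (String name : names) {
      for (String kv : name.split("/")) {
        // Client-side string parsing; this is the heavyweight step the new API avoids.
        if (kv.startsWith("dt=")) {
          dates.add(kv.substring("dt=".length()));
        }
      }
    }
    System.out.println(dates);
    client.close();
  }
}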



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-7604) Add Metastore API to fetch one or more partition names

2014-08-04 Thread Thiruvel Thirumoolan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14085064#comment-14085064
 ] 

Thiruvel Thirumoolan commented on HIVE-7604:


[~ashutoshc] Thanks. I will post an API signature today.

 Add Metastore API to fetch one or more partition names
 --

 Key: HIVE-7604
 URL: https://issues.apache.org/jira/browse/HIVE-7604
 Project: Hive
  Issue Type: New Feature
  Components: Metastore
Reporter: Thiruvel Thirumoolan
Assignee: Thiruvel Thirumoolan
 Fix For: 0.14.0


 We need a new API in Metastore to address the following use cases. Both use 
 cases arise from having tables with hundreds of thousands or in some cases 
 millions of partitions.
 1. It should be quick and easy to obtain distinct values of a partition. Eg: 
 Obtain all dates for which partitions are available. This can be used by 
 tools/frameworks programmatically to understand gaps in partitions before 
 reprocessing them. Currently one has to run Hive queries (JDBC or CLI) to 
 obtain this information which is unfriendly and heavy weight. And for tables 
 which have large number of partitions, it takes a long time to run the 
 queries and it also requires large heap space.
 2. Typically users would like to know the list of partitions available and 
 would run queries that would only involve partition keys (select distinct 
 partkey1 from table) Or to obtain the latest date partition from a dimension 
 table to join against another fact table (select * from fact_table join 
 select max(dt) from dimension_table). Those queries (metadata only queries) 
 can be pushed to metastore and need not be run even locally in Hive. If the 
 queries can be converted into database based queries, the clients can be 
 light weight and need not fetch all partition names. The results can be 
 obtained much faster with less resources.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-7558) HCatLoader reuses credentials across jobs

2014-07-31 Thread Thiruvel Thirumoolan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14081476#comment-14081476
 ] 

Thiruvel Thirumoolan commented on HIVE-7558:


Thanks [~daijy]. Review link @ https://reviews.apache.org/r/24163/

 HCatLoader reuses credentials across jobs
 -

 Key: HIVE-7558
 URL: https://issues.apache.org/jira/browse/HIVE-7558
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Affects Versions: 0.13.1
Reporter: Thiruvel Thirumoolan
Assignee: Thiruvel Thirumoolan
 Fix For: 0.14.0

 Attachments: HIVE-7558.patch


 HCatLoader reuses the credentials of stage 1 in stage 2 for some Pig queries. 
 This causes stage 2 to fail if it runs for more than 10 minutes. Pig queries 
 that load data using HCatLoader, filter only by partition columns, and do an 
 order by will run into this problem. Exceptions will be very similar to the 
 following:
 2014-07-22 17:28:49,337 [main] ERROR org.apache.pig.tools.grunt.GruntParser - 
 ERROR 2997: Unable to recreate exception from backed error: 
 AttemptID:attemptid Info:RemoteTrace: 
 org.apache.hadoop.security.token.SecretManager$InvalidToken: token 
 (HDFS_DELEGATION_TOKEN token tokenid for user) can't be found in cache
   at org.apache.hadoop.ipc.Client.call(Client.java:1095)
   at 
 org.apache.hadoop.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:195)
   at $Proxy7.getFileInfo(Unknown Source)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:601)
   at 
 org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:102)
   at 
 org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:67)
   at $Proxy7.getFileInfo(Unknown Source)
   at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1305)
   at 
 org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:734)
   at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:176)
   at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:51)
   at org.apache.hadoop.yarn.util.FSDownload$1.run(FSDownload.java:284)
   at org.apache.hadoop.yarn.util.FSDownload$1.run(FSDownload.java:282)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:415)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1300)
   at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:281)
   at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:51)
   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
   at java.util.concurrent.FutureTask.run(FutureTask.java:166)
   at 
 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
   at java.util.concurrent.FutureTask.run(FutureTask.java:166)
   at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
   at java.lang.Thread.run(Thread.java:722)
  at LocalTrace: 
   org.apache.hadoop.yarn.exceptions.impl.pb.YarnRemoteExceptionPBImpl: 
 token (HDFS_DELEGATION_TOKEN token tokenid for user) can't be found in 
 cache
   at 
 org.apache.hadoop.yarn.server.nodemanager.api.protocolrecords.impl.pb.LocalResourceStatusPBImpl.convertFromProtoFormat(LocalResourceStatusPBImpl.java:217)
   at 
 org.apache.hadoop.yarn.server.nodemanager.api.protocolrecords.impl.pb.LocalResourceStatusPBImpl.getException(LocalResourceStatusPBImpl.java:147)
   at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.update(ResourceLocalizationService.java:823)
   at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerTracker.processHeartbeat(ResourceLocalizationService.java:497)
   at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService.heartbeat(ResourceLocalizationService.java:224)
   at 
 org.apache.hadoop.yarn.server.nodemanager.api.impl.pb.service.LocalizationProtocolPBServiceImpl.heartbeat(LocalizationProtocolPBServiceImpl.java:46)
   at 
 org.apache.hadoop.yarn.proto.LocalizationProtocol$LocalizationProtocolService$2.callBlockingMethod(LocalizationProtocol.java:57

[jira] [Created] (HIVE-7558) HCatLoader reuses credentials across jobs

2014-07-30 Thread Thiruvel Thirumoolan (JIRA)
Thiruvel Thirumoolan created HIVE-7558:
--

 Summary: HCatLoader reuses credentials across jobs
 Key: HIVE-7558
 URL: https://issues.apache.org/jira/browse/HIVE-7558
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Affects Versions: 0.13.1
Reporter: Thiruvel Thirumoolan
 Fix For: 0.14.0


HCatLoader reuses the credentials of stage 1 in stage 2 for some Pig queries. 
This causes stage 2 to fail if it runs for more than 10 minutes. Pig queries that 
load data using HCatLoader, filter only by partition columns, and do an order by 
will run into this problem. Exceptions will be very similar to the following:

2014-07-22 17:28:49,337 [main] ERROR org.apache.pig.tools.grunt.GruntParser - 
ERROR 2997: Unable to recreate exception from backed error: 
AttemptID:attemptid Info:RemoteTrace: 
org.apache.hadoop.security.token.SecretManager$InvalidToken: token 
(HDFS_DELEGATION_TOKEN token tokenid for user) can't be found in cache
at org.apache.hadoop.ipc.Client.call(Client.java:1095)
at 
org.apache.hadoop.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:195)
at $Proxy7.getFileInfo(Unknown Source)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:601)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:102)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:67)
at $Proxy7.getFileInfo(Unknown Source)
at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1305)
at 
org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:734)
at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:176)
at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:51)
at org.apache.hadoop.yarn.util.FSDownload$1.run(FSDownload.java:284)
at org.apache.hadoop.yarn.util.FSDownload$1.run(FSDownload.java:282)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1300)
at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:281)
at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:51)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
at java.util.concurrent.FutureTask.run(FutureTask.java:166)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
at java.util.concurrent.FutureTask.run(FutureTask.java:166)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
at java.lang.Thread.run(Thread.java:722)
 at LocalTrace: 
org.apache.hadoop.yarn.exceptions.impl.pb.YarnRemoteExceptionPBImpl: 
token (HDFS_DELEGATION_TOKEN token tokenid for user) can't be found in cache
at 
org.apache.hadoop.yarn.server.nodemanager.api.protocolrecords.impl.pb.LocalResourceStatusPBImpl.convertFromProtoFormat(LocalResourceStatusPBImpl.java:217)
at 
org.apache.hadoop.yarn.server.nodemanager.api.protocolrecords.impl.pb.LocalResourceStatusPBImpl.getException(LocalResourceStatusPBImpl.java:147)
at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.update(ResourceLocalizationService.java:823)
at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerTracker.processHeartbeat(ResourceLocalizationService.java:497)
at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService.heartbeat(ResourceLocalizationService.java:224)
at 
org.apache.hadoop.yarn.server.nodemanager.api.impl.pb.service.LocalizationProtocolPBServiceImpl.heartbeat(LocalizationProtocolPBServiceImpl.java:46)
at 
org.apache.hadoop.yarn.proto.LocalizationProtocol$LocalizationProtocolService$2.callBlockingMethod(LocalizationProtocol.java:57)
at 
org.apache.hadoop.yarn.ipc.ProtoOverHadoopRpcEngine$Server.call(ProtoOverHadoopRpcEngine.java:353)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1476)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1472)
at java.security.AccessController.doPrivileged(Native Method

[jira] [Assigned] (HIVE-7558) HCatLoader reuses credentials across jobs

2014-07-30 Thread Thiruvel Thirumoolan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvel Thirumoolan reassigned HIVE-7558:
--

Assignee: Thiruvel Thirumoolan

 HCatLoader reuses credentials across jobs
 -

 Key: HIVE-7558
 URL: https://issues.apache.org/jira/browse/HIVE-7558
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Affects Versions: 0.13.1
Reporter: Thiruvel Thirumoolan
Assignee: Thiruvel Thirumoolan
 Fix For: 0.14.0


 HCatLoader reuses the credentials of stage 1 in stage 2 for some Pig queries. 
 This causes stage 2 to fail if it runs for more than 10 minutes. Pig queries 
 that load data using HCatLoader, filter only by partition columns, and do an 
 order by will run into this problem. Exceptions will be very similar to the 
 following:
 2014-07-22 17:28:49,337 [main] ERROR org.apache.pig.tools.grunt.GruntParser - 
 ERROR 2997: Unable to recreate exception from backed error: 
 AttemptID:attemptid Info:RemoteTrace: 
 org.apache.hadoop.security.token.SecretManager$InvalidToken: token 
 (HDFS_DELEGATION_TOKEN token tokenid for user) can't be found in cache
   at org.apache.hadoop.ipc.Client.call(Client.java:1095)
   at 
 org.apache.hadoop.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:195)
   at $Proxy7.getFileInfo(Unknown Source)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:601)
   at 
 org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:102)
   at 
 org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:67)
   at $Proxy7.getFileInfo(Unknown Source)
   at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1305)
   at 
 org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:734)
   at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:176)
   at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:51)
   at org.apache.hadoop.yarn.util.FSDownload$1.run(FSDownload.java:284)
   at org.apache.hadoop.yarn.util.FSDownload$1.run(FSDownload.java:282)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:415)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1300)
   at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:281)
   at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:51)
   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
   at java.util.concurrent.FutureTask.run(FutureTask.java:166)
   at 
 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
   at java.util.concurrent.FutureTask.run(FutureTask.java:166)
   at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
   at java.lang.Thread.run(Thread.java:722)
  at LocalTrace: 
   org.apache.hadoop.yarn.exceptions.impl.pb.YarnRemoteExceptionPBImpl: 
 token (HDFS_DELEGATION_TOKEN token tokenid for user) can't be found in 
 cache
   at 
 org.apache.hadoop.yarn.server.nodemanager.api.protocolrecords.impl.pb.LocalResourceStatusPBImpl.convertFromProtoFormat(LocalResourceStatusPBImpl.java:217)
   at 
 org.apache.hadoop.yarn.server.nodemanager.api.protocolrecords.impl.pb.LocalResourceStatusPBImpl.getException(LocalResourceStatusPBImpl.java:147)
   at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.update(ResourceLocalizationService.java:823)
   at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerTracker.processHeartbeat(ResourceLocalizationService.java:497)
   at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService.heartbeat(ResourceLocalizationService.java:224)
   at 
 org.apache.hadoop.yarn.server.nodemanager.api.impl.pb.service.LocalizationProtocolPBServiceImpl.heartbeat(LocalizationProtocolPBServiceImpl.java:46)
   at 
 org.apache.hadoop.yarn.proto.LocalizationProtocol$LocalizationProtocolService$2.callBlockingMethod(LocalizationProtocol.java:57)
   at 
 org.apache.hadoop.yarn.ipc.ProtoOverHadoopRpcEngine$Server.call(ProtoOverHadoopRpcEngine.java:353

[jira] [Updated] (HIVE-7558) HCatLoader reuses credentials across jobs

2014-07-30 Thread Thiruvel Thirumoolan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvel Thirumoolan updated HIVE-7558:
---

Attachment: HIVE-7558.patch

Attaching a patch: do not copy the job's credentials into HCatLoader's objects.
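
In other words, the failure mode is a loader that snapshots the Credentials of 
the first job and replays the same (soon to expire or be cancelled) delegation 
tokens into the second stage. A contrived sketch of that anti-pattern, not 
HCatLoader's actual code:

import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.security.Credentials;

class TokenCachingLoader {
  // Anti-pattern: tokens captured while planning stage 1 ...
  private Credentials cached;

  void setLocation(Job job) {
    if (cached == null) {
      cached = new Credentials(job.getCredentials());
    } else {
      // ... are copied into the stage 2 job, instead of letting stage 2
      // pick up its own, still-valid delegation tokens.
      job.getCredentials().addAll(cached);
    }
  }
}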

 HCatLoader reuses credentials across jobs
 -

 Key: HIVE-7558
 URL: https://issues.apache.org/jira/browse/HIVE-7558
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Affects Versions: 0.13.1
Reporter: Thiruvel Thirumoolan
Assignee: Thiruvel Thirumoolan
 Fix For: 0.14.0

 Attachments: HIVE-7558.patch


 HCatLoader reuses the credentials of stage 1 in stage 2 for some Pig queries. 
 This causes stage 2 to fail if it runs for more than 10 minutes. Pig queries 
 that load data using HCatLoader, filter only by partition columns, and do an 
 order by will run into this problem. Exceptions will be very similar to the 
 following:
 2014-07-22 17:28:49,337 [main] ERROR org.apache.pig.tools.grunt.GruntParser - 
 ERROR 2997: Unable to recreate exception from backed error: 
 AttemptID:attemptid Info:RemoteTrace: 
 org.apache.hadoop.security.token.SecretManager$InvalidToken: token 
 (HDFS_DELEGATION_TOKEN token tokenid for user) can't be found in cache
   at org.apache.hadoop.ipc.Client.call(Client.java:1095)
   at 
 org.apache.hadoop.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:195)
   at $Proxy7.getFileInfo(Unknown Source)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:601)
   at 
 org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:102)
   at 
 org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:67)
   at $Proxy7.getFileInfo(Unknown Source)
   at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1305)
   at 
 org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:734)
   at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:176)
   at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:51)
   at org.apache.hadoop.yarn.util.FSDownload$1.run(FSDownload.java:284)
   at org.apache.hadoop.yarn.util.FSDownload$1.run(FSDownload.java:282)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:415)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1300)
   at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:281)
   at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:51)
   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
   at java.util.concurrent.FutureTask.run(FutureTask.java:166)
   at 
 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
   at java.util.concurrent.FutureTask.run(FutureTask.java:166)
   at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
   at java.lang.Thread.run(Thread.java:722)
  at LocalTrace: 
   org.apache.hadoop.yarn.exceptions.impl.pb.YarnRemoteExceptionPBImpl: 
 token (HDFS_DELEGATION_TOKEN token tokenid for user) can't be found in 
 cache
   at 
 org.apache.hadoop.yarn.server.nodemanager.api.protocolrecords.impl.pb.LocalResourceStatusPBImpl.convertFromProtoFormat(LocalResourceStatusPBImpl.java:217)
   at 
 org.apache.hadoop.yarn.server.nodemanager.api.protocolrecords.impl.pb.LocalResourceStatusPBImpl.getException(LocalResourceStatusPBImpl.java:147)
   at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.update(ResourceLocalizationService.java:823)
   at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerTracker.processHeartbeat(ResourceLocalizationService.java:497)
   at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService.heartbeat(ResourceLocalizationService.java:224)
   at 
 org.apache.hadoop.yarn.server.nodemanager.api.impl.pb.service.LocalizationProtocolPBServiceImpl.heartbeat(LocalizationProtocolPBServiceImpl.java:46)
   at 
 org.apache.hadoop.yarn.proto.LocalizationProtocol$LocalizationProtocolService$2.callBlockingMethod(LocalizationProtocol.java:57

[jira] [Commented] (HIVE-6089) Add metrics to HiveServer2

2014-01-03 Thread Thiruvel Thirumoolan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13861841#comment-13861841
 ] 

Thiruvel Thirumoolan commented on HIVE-6089:


[~jaideepdhok] Thanks for the feedback. As this is the first metrics patch, I 
will add everything that's straightforward. Will add others in a followup JIRA.

 Add metrics to HiveServer2
 --

 Key: HIVE-6089
 URL: https://issues.apache.org/jira/browse/HIVE-6089
 Project: Hive
  Issue Type: Improvement
  Components: Diagnosability, HiveServer2
Affects Versions: 0.12.0
Reporter: Thiruvel Thirumoolan
Assignee: Thiruvel Thirumoolan
 Fix For: 0.13.0

 Attachments: HIVE-6089_prototype.patch


 Would like to collect metrics about HiveServer's usage, like active 
 connections, total requests etc.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HIVE-5268) HiveServer2 accumulates orphaned OperationHandle objects when a client fails while executing query

2013-10-22 Thread Thiruvel Thirumoolan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13802192#comment-13802192
 ] 

Thiruvel Thirumoolan commented on HIVE-5268:


Thanks Brock and Carl for the comments. I posted the initial patch as a rough 
sketch of the approach I had for branch-10; it was only a first dig at this problem. The 
intention is to separate the physical disconnect from the session timeout, as Carl 
mentioned.

 HiveServer2 accumulates orphaned OperationHandle objects when a client fails 
 while executing query
 --

 Key: HIVE-5268
 URL: https://issues.apache.org/jira/browse/HIVE-5268
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Reporter: Vaibhav Gumashta
Assignee: Thiruvel Thirumoolan
 Fix For: 0.13.0

 Attachments: HIVE-5268_prototype.patch


 When queries are executed against the HiveServer2 an OperationHandle object 
 is stored in the OperationManager.handleToOperation HashMap. Currently its 
 the duty of the JDBC client to explicitly close to cleanup the entry in the 
 map. But if the client fails to close the statement then the OperationHandle 
 object is never cleaned up and gets accumulated in the server.
 This can potentially cause OOM on the server over time. This also can be used 
 as a loophole by a malicious client to bring down the Hive server.



--
This message was sent by Atlassian JIRA
(v6.1#6144)
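
As the HIVE-5268 description above notes, the server-side OperationHandle is only removed when the client explicitly closes its statement. A minimal client-side sketch of the close that keeps the handleToOperation map from growing (the HS2 URL and user are assumptions):

{code:java}
// Hypothetical JDBC usage sketch; endpoint and credentials are assumptions.
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class ExplicitCloseSketch {
  public static void main(String[] args) throws Exception {
    Class.forName("org.apache.hive.jdbc.HiveDriver");
    Connection conn = DriverManager.getConnection(
        "jdbc:hive2://localhost:10000/default", "user", "");
    Statement stmt = conn.createStatement();
    try {
      ResultSet rs = stmt.executeQuery("SELECT COUNT(*) FROM src");
      while (rs.next()) {
        System.out.println(rs.getLong(1));
      }
    } finally {
      stmt.close(); // releases the server-side OperationHandle
      conn.close(); // releases the server-side session
    }
  }
}
{code}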


[jira] [Updated] (HIVE-5268) HiveServer2 accumulates orphaned OperationHandle objects when a client fails while executing query

2013-10-21 Thread Thiruvel Thirumoolan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvel Thirumoolan updated HIVE-5268:
---

Attachment: HIVE-5268_prototype.patch

Attaching a preliminary patch for branch 12. As mentioned before, this patch is 
aggressive (it was a starting point) in cleaning up resources on the server side. As soon as a 
client disconnects, the resources are cleaned up on HS2 (if a query is running 
during disconnection, the resources are cleaned up at the end of the query). 
This approach was designed for Hive10 and I am working on porting it to trunk, 
and a patch will be available for Hive12 too. The newer approach will handle 
disconnects during async query execution and also have timeouts after which 
handles/sessions will be cleaned up, instead of the existing aggressive approach.

Vaibhav, can I assign this to myself if you aren't working on this? Thanks!

 HiveServer2 accumulates orphaned OperationHandle objects when a client fails 
 while executing query
 --

 Key: HIVE-5268
 URL: https://issues.apache.org/jira/browse/HIVE-5268
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Reporter: Vaibhav Gumashta
Assignee: Vaibhav Gumashta
 Fix For: 0.13.0

 Attachments: HIVE-5268_prototype.patch


 When queries are executed against the HiveServer2 an OperationHandle object 
 is stored in the OperationManager.handleToOperation HashMap. Currently its 
 the duty of the JDBC client to explicitly close to cleanup the entry in the 
 map. But if the client fails to close the statement then the OperationHandle 
 object is never cleaned up and gets accumulated in the server.
 This can potentially cause OOM on the server over time. This also can be used 
 as a loophole by a malicious client to bring down the Hive server.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Assigned] (HIVE-5268) HiveServer2 accumulates orphaned OperationHandle objects when a client fails while executing query

2013-10-21 Thread Thiruvel Thirumoolan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvel Thirumoolan reassigned HIVE-5268:
--

Assignee: Thiruvel Thirumoolan  (was: Vaibhav Gumashta)

 HiveServer2 accumulates orphaned OperationHandle objects when a client fails 
 while executing query
 --

 Key: HIVE-5268
 URL: https://issues.apache.org/jira/browse/HIVE-5268
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Reporter: Vaibhav Gumashta
Assignee: Thiruvel Thirumoolan
 Fix For: 0.13.0

 Attachments: HIVE-5268_prototype.patch


 When queries are executed against the HiveServer2 an OperationHandle object 
 is stored in the OperationManager.handleToOperation HashMap. Currently its 
 the duty of the JDBC client to explicitly close to cleanup the entry in the 
 map. But if the client fails to close the statement then the OperationHandle 
 object is never cleaned up and gets accumulated in the server.
 This can potentially cause OOM on the server over time. This also can be used 
 as a loophole by a malicious client to bring down the Hive server.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


Review Request 14809: HIVE-5268: HiveServer2 accumulates orphaned OperationHandle objects when a client fails while executing query

2013-10-21 Thread Thiruvel Thirumoolan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/14809/
---

Review request for hive and Vaibhav Gumashta.


Bugs: HIVE-5268
https://issues.apache.org/jira/browse/HIVE-5268


Repository: hive-git


Description
---

This is a prototype of the patch we have to clean up resources on HS2 on client 
disconnects. This has worked for Hive-10.

An updated patch to handle async query execution and session timeouts is on the 
way.


Diffs
-

  
common/src/java/org/apache/hadoop/hive/common/thrift/HiveThriftChainedEventHandler.java
 PRE-CREATION 
  
jdbc/src/test/org/apache/hive/service/cli/thrift/TestDisconnectCleanupEventHandler.java
 PRE-CREATION 
  service/src/java/org/apache/hive/service/cli/CLIService.java 1a7f338 
  service/src/java/org/apache/hive/service/cli/session/SessionManager.java 
f392d62 
  
service/src/java/org/apache/hive/service/cli/thrift/DisconnectCleanupEventHandler.java
 PRE-CREATION 
  
service/src/java/org/apache/hive/service/cli/thrift/ThriftBinaryCLIService.java 
9c8f5c1 
  service/src/java/org/apache/hive/service/cli/thrift/ThriftCLIService.java 
857e627 

Diff: https://reviews.apache.org/r/14809/diff/


Testing
---

A new test case to test preliminaries.
Manual testing: Start HS2 on a machine and launch a job through JDBC. Before 
the job is done, kill the client. The server will clean up all resources, 
scratch directory etc. at the end of the query. 


Thanks,

Thiruvel Thirumoolan



[jira] [Commented] (HIVE-5268) HiveServer2 accumulates orphaned OperationHandle objects when a client fails while executing query

2013-10-21 Thread Thiruvel Thirumoolan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13801331#comment-13801331
 ] 

Thiruvel Thirumoolan commented on HIVE-5268:


[~vgumashta] Here it is https://reviews.apache.org/r/14809/
Let me dig in and come up with a design.

 HiveServer2 accumulates orphaned OperationHandle objects when a client fails 
 while executing query
 --

 Key: HIVE-5268
 URL: https://issues.apache.org/jira/browse/HIVE-5268
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Reporter: Vaibhav Gumashta
Assignee: Thiruvel Thirumoolan
 Fix For: 0.13.0

 Attachments: HIVE-5268_prototype.patch


 When queries are executed against the HiveServer2 an OperationHandle object 
 is stored in the OperationManager.handleToOperation HashMap. Currently its 
 the duty of the JDBC client to explicitly close to cleanup the entry in the 
 map. But if the client fails to close the statement then the OperationHandle 
 object is never cleaned up and gets accumulated in the server.
 This can potentially cause OOM on the server over time. This also can be used 
 as a loophole by a malicious client to bring down the Hive server.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HIVE-5486) HiveServer2 should create base scratch directories at startup

2013-10-15 Thread Thiruvel Thirumoolan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13796260#comment-13796260
 ] 

Thiruvel Thirumoolan commented on HIVE-5486:


[~prasadm] Can we set the permission to 1777, with the sticky bit? What do you 
think?

 HiveServer2 should create base scratch directories at startup
 -

 Key: HIVE-5486
 URL: https://issues.apache.org/jira/browse/HIVE-5486
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Affects Versions: 0.11.0, 0.12.0
Reporter: Prasad Mujumdar
Assignee: Prasad Mujumdar
 Attachments: HIVE-5486.2.patch, HIVE-5486.3.patch


 With impersonation enabled, the same base directory  is used by all 
 sessions/queries. For a new deployment, this directory gets created on first 
 invocation by the user running that session. This would cause directory 
 permission conflict for other users.
 HiveServer2 should create the base scratch dirs if it doesn't exist.



--
This message was sent by Atlassian JIRA
(v6.1#6144)
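
A rough sketch of the startup step being discussed, creating the base scratch directory with 1777 (world-writable plus the sticky bit) so per-user subdirectories can be created but not removed by other users; the path and the exact permission handling are assumptions, not the HIVE-5486 patch:

{code:java}
// Hypothetical sketch; the scratch path and permission handling are assumptions.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.permission.FsPermission;

public class ScratchDirSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Path scratch = new Path("/tmp/hive"); // assumed base scratch dir
    FileSystem fs = scratch.getFileSystem(conf);
    FsPermission perm = new FsPermission((short) 01777); // rwxrwxrwt
    if (!fs.exists(scratch)) {
      fs.mkdirs(scratch, perm);
    }
    // mkdirs applies the process umask, so set the permission explicitly as well
    fs.setPermission(scratch, perm);
  }
}
{code}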


[jira] [Commented] (HIVE-5268) HiveServer2 accumulates orphaned OperationHandle objects when a client fails while executing query

2013-10-14 Thread Thiruvel Thirumoolan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13794612#comment-13794612
 ] 

Thiruvel Thirumoolan commented on HIVE-5268:


Sorry, my bad. Let me upload what I have.

 HiveServer2 accumulates orphaned OperationHandle objects when a client fails 
 while executing query
 --

 Key: HIVE-5268
 URL: https://issues.apache.org/jira/browse/HIVE-5268
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Reporter: Vaibhav Gumashta
Assignee: Vaibhav Gumashta
 Fix For: 0.13.0


 When queries are executed against the HiveServer2 an OperationHandle object 
 is stored in the OperationManager.handleToOperation HashMap. Currently its 
 the duty of the JDBC client to explicitly close to cleanup the entry in the 
 map. But if the client fails to close the statement then the OperationHandle 
 object is never cleaned up and gets accumulated in the server.
 This can potentially cause OOM on the server over time. This also can be used 
 as a loophole by a malicious client to bring down the Hive server.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HIVE-5268) HiveServer2 accumulates orphaned OperationHandle objects when a client fails while executing query

2013-09-25 Thread Thiruvel Thirumoolan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13778106#comment-13778106
 ] 

Thiruvel Thirumoolan commented on HIVE-5268:


Thanks for raising this Vaibhav. We have a similar patch which cleans up 
session-related info when network issues cause client disconnection or clients 
fail to close sessions. The patch is available for Hive-11 and I am porting it 
to Hive12 and trunk. Unfortunately I didn't create the JIRA earlier. The patch 
cleans up aggressively as soon as the client disconnects. Based on Carl's 
feedback from a Hive meetup, we would like to have a session timeout after 
which all idle/disconnected sessions are cleaned up. I was working towards that.

Have you started working on this? If not, can I start by uploading the 
aggressive patch I have and then go forward with the improvements?

 HiveServer2 accumulates orphaned OperationHandle objects when a client fails 
 while executing query
 --

 Key: HIVE-5268
 URL: https://issues.apache.org/jira/browse/HIVE-5268
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Reporter: Vaibhav Gumashta
Assignee: Vaibhav Gumashta
 Fix For: 0.13.0


 When queries are executed against the HiveServer2 an OperationHandle object 
 is stored in the OperationManager.handleToOperation HashMap. Currently its 
 the duty of the JDBC client to explicitly close to cleanup the entry in the 
 map. But if the client fails to close the statement then the OperationHandle 
 object is never cleaned up and gets accumulated in the server.
 This can potentially cause OOM on the server over time. This also can be used 
 as a loophole by a malicious client to bring down the Hive server.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


Re: Operators && and || do not work

2013-09-19 Thread Thiruvel Thirumoolan
Hi Amareshwari/Ashutosh,

Ashutosh is probably right, I doubt if this ever worked. I couldn't find a
clientpositive test case which uses && or ||.

I also modified a unit test case in Hive9 to use && instead of AND and
that failed with the same error Amareshwari saw. Hive9 does not have
HIVE-2439.

-Thiruvel

On 9/19/13 7:21 AM, Ashutosh Chauhan hashut...@apache.org wrote:

I have not tested it on historical versions, so don't know on which
versions it used to work (if ever), but possibly antlr upgrade [1] may
have
impacted this.

[1] : https://issues.apache.org/jira/browse/HIVE-2439

Ashutosh


On Thu, Sep 19, 2013 at 4:52 AM, amareshwari sriramdasu 
amareshw...@gmail.com wrote:

 Hello,

 Though the documentation
 https://cwiki.apache.org/Hive/languagemanual-udf.html says they are the same as
 AND and OR, they do not even get parsed. The user gets a parse error when they are
 used. Was that intentional or is it a regression?

 hive> select key from src where key=a || key =b;
 FAILED: Parse Error: line 1:33 cannot recognize input near '|' 'key' '=' in
 expression specification

 hive> select key from src where key=a && key =b;
 FAILED: Parse Error: line 1:33 cannot recognize input near '&' 'key' '=' in
 expression specification

 Thanks
 Amareshwari




[jira] [Created] (HIVE-5214) Dynamic partitions don't inherit groupname from table's directory

2013-09-04 Thread Thiruvel Thirumoolan (JIRA)
Thiruvel Thirumoolan created HIVE-5214:
--

 Summary: Dynamic partitions don't inherit groupname from table's 
directory
 Key: HIVE-5214
 URL: https://issues.apache.org/jira/browse/HIVE-5214
 Project: Hive
  Issue Type: Bug
  Components: Authorization, Security
Affects Versions: 0.12.0
Reporter: Thiruvel Thirumoolan


When dynamic partitions are created, the files/partitions don't inherit the 
group name.

The query (say, insert overwrite table select *) uses the scratch directory for 
creating the temporary data. The temporary data's perm/group is inherited from 
scratch directory. Finally, the MoveTask does a rename of the temporary dir to 
be the target partition directory and an explicit group/perm change does not 
happen.

HIVE-3756 fixed it for Load data, dynamic partitions has to be handled.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
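
For illustration of the inheritance step the description says is missing, a hedged sketch (not the HIVE-5214 fix itself) that re-applies the table directory's group to a newly moved partition directory and its immediate files:

{code:java}
// Hypothetical helper; method and class names are assumptions.
import java.io.IOException;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class InheritGroupSketch {
  static void inheritGroup(FileSystem fs, Path tableDir, Path newPartitionDir)
      throws IOException {
    String group = fs.getFileStatus(tableDir).getGroup();
    fs.setOwner(newPartitionDir, null, group); // null user = leave owner unchanged
    for (FileStatus f : fs.listStatus(newPartitionDir)) {
      fs.setOwner(f.getPath(), null, group);
    }
  }
}
{code}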


[jira] [Updated] (HIVE-5214) Dynamic partitions/insert overwrite don't inherit groupname from table's directory

2013-09-04 Thread Thiruvel Thirumoolan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvel Thirumoolan updated HIVE-5214:
---

Description: 
When dynamic partitions are created or insert overwrite without partitions, the 
files/partition-dirs don't inherit the group name.

The query (say, insert overwrite table select *) uses the scratch directory for 
creating the temporary data. The temporary data's perm/group is inherited from 
scratch directory. Finally, the MoveTask does a rename of the temporary 
dir/files to be the target directory and an explicit group/perm change does not 
happen.

HIVE-3756 fixed it for Load data, dynamic partitions/inserts have to be handled.

  was:
When dynamic partitions are created, the files/partitions don't inherit the 
group name.

The query (say, insert overwrite table select *) uses the scratch directory for 
creating the temporary data. The temporary data's perm/group is inherited from 
scratch directory. Finally, the MoveTask does a rename of the temporary dir to 
be the target partition directory and an explicit group/perm change does not 
happen.

HIVE-3756 fixed it for Load data, dynamic partitions has to be handled.

Summary: Dynamic partitions/insert overwrite don't inherit groupname 
from table's directory  (was: Dynamic partitions don't inherit groupname from 
table's directory)

 Dynamic partitions/insert overwrite don't inherit groupname from table's 
 directory
 --

 Key: HIVE-5214
 URL: https://issues.apache.org/jira/browse/HIVE-5214
 Project: Hive
  Issue Type: Bug
  Components: Authorization, Security
Affects Versions: 0.12.0
Reporter: Thiruvel Thirumoolan

 When dynamic partitions are created or insert overwrite without partitions, 
 the files/partition-dirs don't inherit the group name.
 The query (say, insert overwrite table select *) uses the scratch directory 
 for creating the temporary data. The temporary data's perm/group is inherited 
 from scratch directory. Finally, the MoveTask does a rename of the temporary 
 dir/files to be the target directory and an explicit group/perm change does 
 not happen.
 HIVE-3756 fixed it for Load data, dynamic partitions/inserts have to be 
 handled.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-3591) set hive.security.authorization.enabled can be executed by any user

2013-08-21 Thread Thiruvel Thirumoolan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13746980#comment-13746980
 ] 

Thiruvel Thirumoolan commented on HIVE-3591:


[~lmccay] The first approach to authorization was client side. [~sushanth] has 
also enabled this on the server side (HCatalog/Metastore) through HIVE-3705.

We enable these features on our HCatalog deployments. Even if the user unsets 
these properties, server side changes still take effect and the user can't drop 
tables etc. We have tested this for HDFS based authorization. The properties we 
used on the HCatalog server are:

<property>
  <name>hive.security.metastore.authorization.manager</name>
  <value>org.apache.hadoop.hive.ql.security.authorization.StorageBasedAuthorizationProvider</value>
</property>

<property>
  <name>hive.security.metastore.authenticator.manager</name>
  <value>org.apache.hadoop.hive.ql.security.HadoopDefaultMetastoreAuthenticator</value>
</property>

<property>
  <name>hive.metastore.pre.event.listeners</name>
  <value>org.apache.hadoop.hive.ql.security.authorization.AuthorizationPreEventListener</value>
</property>

 set hive.security.authorization.enabled can be executed by any user
 ---

 Key: HIVE-3591
 URL: https://issues.apache.org/jira/browse/HIVE-3591
 Project: Hive
  Issue Type: Bug
  Components: Authorization, CLI, Clients, JDBC
Affects Versions: 0.7.1
 Environment: RHEL 5.6
 CDH U3
Reporter: Dev Gupta
  Labels: Authorization, Security

 The property hive.security.authorization.enabled can be set to true or false, 
 by any user on the CLI, thus circumventing any previously set grants and 
 authorizations. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


Re: [Discuss] project chop up

2013-08-21 Thread Thiruvel Thirumoolan
+1 Thanks Edward.

On 8/20/13 11:35 PM, amareshwari sriramdasu amareshw...@gmail.com
wrote:

Sounds great! Looking forward !


On Tue, Aug 20, 2013 at 7:58 PM, Edward Capriolo
edlinuxg...@gmail.comwrote:

 Just an update. This is going very well:

 [INFO] Nothing to compile - all classes are up to date
 [INFO]
 
 [INFO] Reactor Summary:
 [INFO]
 [INFO] Apache Hive ... SUCCESS
[0.002s]
 [INFO] hive-shims-x .. SUCCESS
[1.210s]
 [INFO] hive-shims-20 . SUCCESS
[0.125s]
 [INFO] hive-common ... SUCCESS
[0.082s]
 [INFO] hive-serde  SUCCESS
[2.521s]
 [INFO] hive-metastore  SUCCESS
 [10.818s]
 [INFO] hive-exec . SUCCESS
[4.521s]
 [INFO] hive-avro . SUCCESS
[1.582s]
 [INFO] hive-zookeeper  SUCCESS
[0.519s]
 [INFO]
 
 [INFO] BUILD SUCCESS
 [INFO]
 
 [INFO] Total time: 21.613s
 [INFO] Finished at: Tue Aug 20 10:23:34 EDT 2013
 [INFO] Final Memory: 39M/408M


 Though I did some short cuts and disabled some tests. We can build hive
 very fast, including incremental builds. Also we are using maven
plugins to
 compile antlr, thrift, protobuf, datanucleas and building those every
time.


 On Fri, Aug 16, 2013 at 11:16 PM, Xuefu Zhang xzh...@cloudera.com
wrote:

  Thanks, Edward.
 
  I'm big +1 to mavenize Hive. Hive has long reached a point where it's
 hard
  to manage its build using ant. I'd like to help on this too.
 
  Thanks,
  Xuefu
 
 
  On Fri, Aug 16, 2013 at 7:31 PM, Edward Capriolo
edlinuxg...@gmail.com
  wrote:
 
   For those interested in pitching in.
   https://github.com/edwardcapriolo/hive
  
  
  
   On Fri, Aug 16, 2013 at 11:58 AM, Edward Capriolo 
 edlinuxg...@gmail.com
   wrote:
  
Summary from hive-irc channel. Minor edits for spell
check/grammar.
   
The last 10 lines are a summary of the key points.
   
[10:59:17] ecapriolo noland: et all. Do you want to talk about
hive
  in
maven?
[11:01:06] smonchi [~
ro...@host34-189-dynamic.23-79-r.retail.telecomitalia.it] has quit
  IRC:
Quit: ... 'cause there is no patch for human stupidity ...
[11:10:04] noland ecapriolo: yeah that sounds good to me!
[11:10:22] noland I saw you created the jira but haven't had
time
 to
   look
[11:10:32] ecapriolo So I found a few things
[11:10:49] ecapriolo In common there is one or two testats that
   actually
fork a process :)
[11:10:56] ecapriolo and use build.test.resources
[11:11:12] ecapriolo Some serde, uses some methods from ql in
 testing
[11:11:27] ecapriolo and shims really needs a separate hadoop
test
  shim
[11:11:32] ecapriolo But that is all simple stuff
[11:11:47] ecapriolo The biggest problem is I do not know how to
  solve
shims with maven
[11:11:50] ecapriolo do you have any ideas
[11:11:52] ecapriolo ?
[11:13:00] noland That one is going to be a challenge. It might
be
  that
in that section we have to drop down to ant
[11:14:44] noland Is it a requirement that we build both the .20
 and
   .23
shims for a package as we do today?
[11:16:46] ecapriolo I was thinking we can do it like a JDBC
driver
[11:16:59] ecapriolo Se separate out the interface of shims
[11:17:22] ecapriolo And then at runtime we drop in a driver
   implementing
[11:17:34] Wertax [~wer...@wolfkamp.xs4all.nl] has quit IRC:
Remote
  host
closed the connection
[11:17:36] ecapriolo That or we could use maven's profile system
[11:18:09] ecapriolo It seems that everything else can actually
 link
against hadoop-0.20.2 as a provided dependency
[11:18:37] noland Yeah either would work. The driver method
would
probably require use to use ant build both the drivers?
[11:18:44] noland I am a fan of mvn profiles
[11:19:05] ecapriolo I was thinking we kinda separate the shim
out
  into
its own project,, not a module
[11:19:10] ecapriolo to achive that jdbc thing
[11:19:27] ecapriolo But I do not have a solution yet, I was
 looking
  to
farm that out to someone smart...like you :)
[11:19:33] noland :)
[11:19:47] ecapriolo All I know is that we need a test shim
because
HadoopShim requires hadoop-test jars
[11:20:10] ecapriolo then the Mini stuff is only used in qtest
 anyway
[11:20:48] ecapriolo Is this something you want to help with? I
was
thinking of spinning up a github
[11:20:50] noland I think that the separate projects would work
and
perhaps nicely.
[11:21:01] noland Yeah I'd be interested in helping!

Re: [ANNOUNCE] New Hive Committer - Thejas Nair

2013-08-20 Thread Thiruvel Thirumoolan
Congrats Thejas!

On Aug 20, 2013, at 8:00 AM, Bill Graham 
billgra...@gmail.commailto:billgra...@gmail.com wrote:

Congrats Thejas!


On Tue, Aug 20, 2013 at 7:32 AM, Jarek Jarcec Cecho 
jar...@apache.orgmailto:jar...@apache.org wrote:
Congratulations Thejas!

Jarcec

On Tue, Aug 20, 2013 at 03:31:48AM -0700, Carl Steinbach wrote:
 The Apache Hive PMC has voted to make Thejas Nair a committer on the Apache
 Hive project.

 Please join me in congratulating Thejas!



--
Note that I'm no longer using my Yahoo! email address. Please email me at 
billgra...@gmail.commailto:billgra...@gmail.com going forward.


[jira] [Commented] (HIVE-4513) disable hivehistory logs by default

2013-08-12 Thread Thiruvel Thirumoolan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13737711#comment-13737711
 ] 

Thiruvel Thirumoolan commented on HIVE-4513:


Thanks [~thejas]. +1. I guess we can close the duplicates HIVE-1708 and 
HIVE-3779.

We back-ported this to Hive10 and it works as expected. [~cdrome] had some 
comments on the patch. These fall in the vicinity but should be addressed as a 
separate JIRA.

1. HiveHistoryViewer.java: It's good as a private void init().
2. HiveHistoryUtil.java: The parseLine() method is not thread-safe. It uses 
parseBuffer, which could be modified by multiple threads. Currently only HWA 
uses it.
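
For illustration, one way to make such a parser thread-safe is to keep the parse state local to each call instead of in a shared buffer; this is a hedged sketch with assumed names, not the change proposed in HIVE-5071:

{code:java}
// Hypothetical sketch -- not the actual HiveHistoryUtil code.
import java.util.HashMap;
import java.util.Map;

public class HistoryParserSketch {
  // Build the field map per invocation, so concurrent callers never share state.
  public static Map<String, String> parseLine(String line) {
    Map<String, String> fields = new HashMap<String, String>();
    for (String token : line.split(" ")) {
      int eq = token.indexOf('=');
      if (eq > 0) {
        fields.put(token.substring(0, eq), token.substring(eq + 1));
      }
    }
    return fields;
  }
}
{code}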

 disable hivehistory logs by default
 ---

 Key: HIVE-4513
 URL: https://issues.apache.org/jira/browse/HIVE-4513
 Project: Hive
  Issue Type: Bug
  Components: Configuration, Logging
Reporter: Thejas M Nair
Assignee: Thejas M Nair
 Attachments: HIVE-4513.1.patch, HIVE-4513.2.patch, HIVE-4513.3.patch, 
 HIVE-4513.4.patch, HIVE-4513.5.patch, HIVE-4513.6.patch


 HiveHistory log files (hive_job_log_hive_*.txt files) store information about 
 hive query such as query string, plan , counters and MR job progress 
 information.
 There is no mechanism to delete these files and as a result they get 
 accumulated over time, using up lot of disk space. 
 I don't think this is used by most people, so I think it would better to turn 
 this off by default. Jobtracker logs already capture most of this 
 information, though it is not as structured as history logs.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4513) disable hivehistory logs by default

2013-08-12 Thread Thiruvel Thirumoolan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13737720#comment-13737720
 ] 

Thiruvel Thirumoolan commented on HIVE-4513:


 Chris Drome had some comments on the patch. These fall in the vicinity but 
 should be addressed as a separate JIRA.

Created HIVE-5071 to address thread safety issues.

 disable hivehistory logs by default
 ---

 Key: HIVE-4513
 URL: https://issues.apache.org/jira/browse/HIVE-4513
 Project: Hive
  Issue Type: Bug
  Components: Configuration, Logging
Reporter: Thejas M Nair
Assignee: Thejas M Nair
 Attachments: HIVE-4513.1.patch, HIVE-4513.2.patch, HIVE-4513.3.patch, 
 HIVE-4513.4.patch, HIVE-4513.5.patch, HIVE-4513.6.patch


 HiveHistory log files (hive_job_log_hive_*.txt files) store information about 
 hive query such as query string, plan , counters and MR job progress 
 information.
 There is no mechanism to delete these files and as a result they get 
 accumulated over time, using up lot of disk space. 
 I don't think this is used by most people, so I think it would better to turn 
 this off by default. Jobtracker logs already capture most of this 
 information, though it is not as structured as history logs.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4513) disable hivehistory logs by default

2013-08-07 Thread Thiruvel Thirumoolan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13733021#comment-13733021
 ] 

Thiruvel Thirumoolan commented on HIVE-4513:


Hi Thejas, are you working on this patch?

 disable hivehistory logs by default
 ---

 Key: HIVE-4513
 URL: https://issues.apache.org/jira/browse/HIVE-4513
 Project: Hive
  Issue Type: Bug
  Components: Configuration, Logging
Reporter: Thejas M Nair
Assignee: Thejas M Nair
 Attachments: HIVE-4513.1.patch, HIVE-4513.2.patch, HIVE-4513.3.patch, 
 HIVE-4513.4.patch


 HiveHistory log files (hive_job_log_hive_*.txt files) store information about 
 hive query such as query string, plan , counters and MR job progress 
 information.
 There is no mechanism to delete these files and as a result they get 
 accumulated over time, using up lot of disk space. 
 I don't think this is used by most people, so I think it would better to turn 
 this off by default. Jobtracker logs already capture most of this 
 information, though it is not as structured as history logs.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4835) Methods in Metrics class could avoid throwing IOException

2013-07-09 Thread Thiruvel Thirumoolan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13703965#comment-13703965
 ] 

Thiruvel Thirumoolan commented on HIVE-4835:


I think the method should at least return a boolean indicating whether an 
increment succeeded, so a corresponding decrement can be avoided if the 
increment failed. An incr might succeed and a later decr fail, but that's again 
best effort to stay consistent.
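
A hedged sketch of the kind of best-effort, boolean-returning API being suggested here (the backing store and names are assumptions, not the real Metrics class):

{code:java}
// Hypothetical best-effort counter; not org.apache.hadoop.hive.common.metrics.Metrics.
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

public class BestEffortMetrics {
  private static final ConcurrentHashMap<String, AtomicLong> COUNTERS =
      new ConcurrentHashMap<String, AtomicLong>();

  /** Returns whether the increment was recorded; never throws. */
  public static boolean incrementCounter(String name) {
    try {
      AtomicLong counter = COUNTERS.get(name);
      if (counter == null) {
        AtomicLong fresh = new AtomicLong();
        counter = COUNTERS.putIfAbsent(name, fresh);
        if (counter == null) {
          counter = fresh;
        }
      }
      counter.incrementAndGet();
      return true;
    } catch (RuntimeException e) {
      // best effort: record the failure and keep going
      System.err.println("Unable to increment counter " + name + ": " + e);
      return false;
    }
  }
}
{code}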

 Methods in Metrics class could avoid throwing IOException
 -

 Key: HIVE-4835
 URL: https://issues.apache.org/jira/browse/HIVE-4835
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.11.0
Reporter: Arup Malakar
Priority: Minor

 I see that most of the methods in the Metrics class throws exception:
 {code:java}
 public void resetMetrics() throws IOException {
 public void open() throws IOException {
 public void close() throws IOException {
 public void reopen() throws IOException {
 public static void init() throws Exception {
 public static Long incrementCounter(String name) throws IOException{
 public static Long incrementCounter(String name, long increment) throws 
 IOException{
 public static void set(String name, Object value) throws IOException{
 public static Object get(String name) throws IOException{
 public static void initializeScope(String name) throws IOException {
 public static MetricsScope startScope(String name) throws IOException{
 public static MetricsScope getScope(String name) throws IOException {
 public static void endScope(String name) throws IOException{
 {code}
 I believe Metrics should be best effort and the Metrics system should just 
 log error messages in case it is unable to capture the Metrics. Throwing 
 exception makes the caller code unnecessarily lengthy. Also the caller would 
 never want to stop execution because of failure to capture metrics, so it 
 ends up just logging the exception. 
 The kind of code we see is like:
 {code:java}
   // Snippet from HiveMetaStore.java
   try {
     Metrics.startScope(function);
   } catch (IOException e) {
     LOG.debug("Exception when starting metrics scope"
         + e.getClass().getName() + " " + e.getMessage());
     MetaStoreUtils.printStackTrace(e);
   }
 {code} 
 which could have been:
 {code:java}
 Metrics.startScope(function);
 {code}
 Thoughts?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4291) Test HiveServer2 crash based on max thrift threads

2013-06-17 Thread Thiruvel Thirumoolan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13686259#comment-13686259
 ] 

Thiruvel Thirumoolan commented on HIVE-4291:


I wrote the test case based on the patches in THRIFT-692. I haven't had a 
chance to modify the tests based on THRIFT-1869.
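
The scenario the test exercises can be outlined roughly as below; this is a hedged sketch, not the attached TestHS2ThreadAllocation.java, and the endpoint and worker-thread limit are assumptions:

{code:java}
// Hypothetical outline: hold more connections than the server's worker threads,
// then verify a later connection still succeeds after one is released.
import java.sql.Connection;
import java.sql.DriverManager;
import java.util.ArrayList;
import java.util.List;

public class ThreadExhaustionSketch {
  public static void main(String[] args) throws Exception {
    Class.forName("org.apache.hive.jdbc.HiveDriver");
    String url = "jdbc:hive2://localhost:10000/default"; // assumed HS2 endpoint
    int maxWorkerThreads = 5;                            // assumed server limit
    List<Connection> held = new ArrayList<Connection>();
    for (int i = 0; i < maxWorkerThreads; i++) {
      held.add(DriverManager.getConnection(url, "user", ""));
    }
    // All worker threads are now busy; the server may queue or reject further
    // connections, but it must not exit. Release one and connect again.
    held.remove(0).close();
    Connection late = DriverManager.getConnection(url, "user", "");
    late.close();
    for (Connection c : held) {
      c.close();
    }
  }
}
{code}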

 Test HiveServer2 crash based on max thrift threads
 --

 Key: HIVE-4291
 URL: https://issues.apache.org/jira/browse/HIVE-4291
 Project: Hive
  Issue Type: Test
  Components: HiveServer2
Reporter: Thiruvel Thirumoolan
Assignee: Thiruvel Thirumoolan
 Attachments: TestHS2ThreadAllocation.java


 This test case ensures HS2 does not shutdown/crash when the thrift threads 
 have been depleted. This is due to an issue fixed in THRIFT-1869. This test 
 should pass post HIVE-4224. This test case ensures the crash doesn't happen 
 due to any changes in Thrift behavior.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4547) A complex create view statement fails with new Antlr 3.4

2013-06-03 Thread Thiruvel Thirumoolan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13673739#comment-13673739
 ] 

Thiruvel Thirumoolan commented on HIVE-4547:


Sure, will take a look.

 A complex create view statement fails with new Antlr 3.4
 

 Key: HIVE-4547
 URL: https://issues.apache.org/jira/browse/HIVE-4547
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.10.0
Reporter: Prasad Mujumdar
Assignee: Prasad Mujumdar
 Fix For: 0.12.0

 Attachments: HIVE-4547-1.patch, HIVE-4547-repro.tar


 A complex create view statement with CAST in join condition fails with 
 IllegalArgumentException error. This is exposed by the Antlr 3.4 upgrade 
 (HIVE-2439). The same statement works fine with Hive 0.9

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4467) HiveConnection does not handle failures correctly

2013-05-16 Thread Thiruvel Thirumoolan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13659904#comment-13659904
 ] 

Thiruvel Thirumoolan commented on HIVE-4467:


[~cwsteinbach] Does the updated patch look good?

 HiveConnection does not handle failures correctly
 -

 Key: HIVE-4467
 URL: https://issues.apache.org/jira/browse/HIVE-4467
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2, JDBC
Affects Versions: 0.11.0, 0.12.0
Reporter: Thiruvel Thirumoolan
Assignee: Thiruvel Thirumoolan
 Attachments: HIVE-4467_1.patch, HIVE-4467.patch


 HiveConnection uses Utils.verifySuccess* routines to check if there is any 
 error from the server side. This is not handled well. In 
 Utils.verifySuccess() when withInfo is 'false', the condition evaluates to 
 'false' and no SQLexception is thrown even though there could be a problem on 
 the server.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-4467) HiveConnection does not handle failures correctly

2013-05-09 Thread Thiruvel Thirumoolan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvel Thirumoolan updated HIVE-4467:
---

Attachment: HIVE-4467_1.patch

Updated patch on phabricator and https://reviews.facebook.net/D10629 and also 
uploaded here (HIVE-4467_1.patch).

 HiveConnection does not handle failures correctly
 -

 Key: HIVE-4467
 URL: https://issues.apache.org/jira/browse/HIVE-4467
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2, JDBC
Affects Versions: 0.11.0, 0.12.0
Reporter: Thiruvel Thirumoolan
Assignee: Thiruvel Thirumoolan
 Attachments: HIVE-4467_1.patch, HIVE-4467.patch


 HiveConnection uses Utils.verifySuccess* routines to check if there is any 
 error from the server side. This is not handled well. In 
 Utils.verifySuccess() when withInfo is 'false', the condition evaluates to 
 'false' and no SQLexception is thrown even though there could be a problem on 
 the server.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-4467) HiveConnection does not handle failures correctly

2013-05-09 Thread Thiruvel Thirumoolan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvel Thirumoolan updated HIVE-4467:
---

Status: Patch Available  (was: Open)

 HiveConnection does not handle failures correctly
 -

 Key: HIVE-4467
 URL: https://issues.apache.org/jira/browse/HIVE-4467
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2, JDBC
Affects Versions: 0.11.0, 0.12.0
Reporter: Thiruvel Thirumoolan
Assignee: Thiruvel Thirumoolan
 Attachments: HIVE-4467_1.patch, HIVE-4467.patch


 HiveConnection uses Utils.verifySuccess* routines to check if there is any 
 error from the server side. This is not handled well. In 
 Utils.verifySuccess() when withInfo is 'false', the condition evaluates to 
 'false' and no SQLexception is thrown even though there could be a problem on 
 the server.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HIVE-4467) HiveConnection does not handle failures correctly

2013-05-01 Thread Thiruvel Thirumoolan (JIRA)
Thiruvel Thirumoolan created HIVE-4467:
--

 Summary: HiveConnection does not handle failures correctly
 Key: HIVE-4467
 URL: https://issues.apache.org/jira/browse/HIVE-4467
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2, JDBC
Affects Versions: 0.11.0, 0.12.0
Reporter: Thiruvel Thirumoolan
Assignee: Thiruvel Thirumoolan
 Fix For: 0.11.0, 0.12.0


HiveConnection uses Utils.verifySuccess* routines to check if there is any 
error from the server side. This is not handled well. In Utils.verifySuccess() 
when withInfo is 'false', the condition evaluates to 'false' and no 
SQLexception is thrown even though there could be a problem on the server.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-4467) HiveConnection does not handle failures correctly

2013-05-01 Thread Thiruvel Thirumoolan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvel Thirumoolan updated HIVE-4467:
---

Attachment: HIVE-4467.patch

Attaching patch, I have made the functions straightforward and not preserved 
the boolean in the Utils.verifySuccess() methods. I am unsure where 
TStatusCode.SUCCESS_WITH_INFO_STATUS is set in the HS2 code and couldn't find 
any occurrences. Is there any reason/intention for checking for that status?
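
For context, a hedged sketch of the kind of check the description implies the client should make, where anything other than SUCCESS (or SUCCESS_WITH_INFO when info is allowed) raises SQLException; this is not necessarily the committed patch:

{code:java}
// Hypothetical sketch; accessors follow the generated TCLIService thrift types.
import java.sql.SQLException;
import org.apache.hive.service.cli.thrift.TStatus;
import org.apache.hive.service.cli.thrift.TStatusCode;

public class VerifySuccessSketch {
  static void verifySuccess(TStatus status, boolean withInfo) throws SQLException {
    boolean ok = status.getStatusCode() == TStatusCode.SUCCESS_STATUS
        || (withInfo && status.getStatusCode() == TStatusCode.SUCCESS_WITH_INFO_STATUS);
    if (!ok) {
      throw new SQLException(status.getErrorMessage(), status.getSqlState(),
          status.getErrorCode());
    }
  }
}
{code}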

 HiveConnection does not handle failures correctly
 -

 Key: HIVE-4467
 URL: https://issues.apache.org/jira/browse/HIVE-4467
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2, JDBC
Affects Versions: 0.11.0, 0.12.0
Reporter: Thiruvel Thirumoolan
Assignee: Thiruvel Thirumoolan
 Fix For: 0.11.0, 0.12.0

 Attachments: HIVE-4467.patch


 HiveConnection uses Utils.verifySuccess* routines to check if there is any 
 error from the server side. This is not handled well. In 
 Utils.verifySuccess() when withInfo is 'false', the condition evaluates to 
 'false' and no SQLexception is thrown even though there could be a problem on 
 the server.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4467) HiveConnection does not handle failures correctly

2013-05-01 Thread Thiruvel Thirumoolan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13646996#comment-13646996
 ] 

Thiruvel Thirumoolan commented on HIVE-4467:


Uploaded patch to phabricator: https://reviews.facebook.net/D10629

 HiveConnection does not handle failures correctly
 -

 Key: HIVE-4467
 URL: https://issues.apache.org/jira/browse/HIVE-4467
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2, JDBC
Affects Versions: 0.11.0, 0.12.0
Reporter: Thiruvel Thirumoolan
Assignee: Thiruvel Thirumoolan
 Fix For: 0.11.0, 0.12.0

 Attachments: HIVE-4467.patch


 HiveConnection uses Utils.verifySuccess* routines to check if there is any 
 error from the server side. This is not handled well. In 
 Utils.verifySuccess() when withInfo is 'false', the condition evaluates to 
 'false' and no SQLexception is thrown even though there could be a problem on 
 the server.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-3620) Drop table using hive CLI throws error when the total number of partition in the table is around 50K.

2013-04-18 Thread Thiruvel Thirumoolan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13635676#comment-13635676
 ] 

Thiruvel Thirumoolan commented on HIVE-3620:


[~sho.shimauchi] Did you have any special parameters for datanucleus to get 
this working? I tried disabling datanucleus cache and also set connection 
pools, but that does not seem to help. Will also post a snapshot of memory dump 
I have. BTW, I tried dropping a table with 45k partitions with the batch size 
configured to 100 and 1000.
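
For reference, the batched approach looks roughly like the sketch below when driven from the metastore client; the batch size and database name are assumptions, and this is a client-side workaround rather than a fix:

{code:java}
// Hypothetical client-side workaround: drop partitions in small batches so each
// metastore call stays short, then drop the (now empty) table.
import java.util.List;
import org.apache.hadoop.hive.conf.HiveConf;
import org.apache.hadoop.hive.metastore.HiveMetaStoreClient;
import org.apache.hadoop.hive.metastore.api.Partition;

public class BatchedDropSketch {
  public static void main(String[] args) throws Exception {
    HiveMetaStoreClient client = new HiveMetaStoreClient(new HiveConf());
    String db = "default";                // assumed database
    String table = "load_test_table_2_0"; // table from the report
    while (true) {
      List<Partition> batch = client.listPartitions(db, table, (short) 100);
      if (batch.isEmpty()) {
        break;
      }
      for (Partition p : batch) {
        client.dropPartition(db, table, p.getValues(), true /* deleteData */);
      }
    }
    client.dropTable(db, table);
    client.close();
  }
}
{code}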

 Drop table using hive CLI throws error when the total number of partition in 
 the table is around 50K.
 -

 Key: HIVE-3620
 URL: https://issues.apache.org/jira/browse/HIVE-3620
 Project: Hive
  Issue Type: Bug
Reporter: Arup Malakar

 hive> drop table load_test_table_2_0; 
  
 FAILED: Error in metadata: org.apache.thrift.transport.TTransportException: 
 java.net.SocketTimeoutException: Read timedout
   
   
 FAILED: Execution Error, return code 1 from 
 org.apache.hadoop.hive.ql.exec.DDLTask 
 The DB used is Oracle and hive had only one table:
 select COUNT(*) from PARTITIONS;
 54839
 I can try and play around with the parameter 
 hive.metastore.client.socket.timeout if that is what is being used. But it is 
 200 seconds as of now, and 200 seconds for a drop table calls seems high 
 already.
 Thanks,
 Arup

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-3620) Drop table using hive CLI throws error when the total number of partition in the table is around 50K.

2013-04-18 Thread Thiruvel Thirumoolan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvel Thirumoolan updated HIVE-3620:
---

Attachment: Hive-3620_HeapDump.jpg

 Drop table using hive CLI throws error when the total number of partition in 
 the table is around 50K.
 -

 Key: HIVE-3620
 URL: https://issues.apache.org/jira/browse/HIVE-3620
 Project: Hive
  Issue Type: Bug
Reporter: Arup Malakar
 Attachments: Hive-3620_HeapDump.jpg


 hive> drop table load_test_table_2_0; 
  
 FAILED: Error in metadata: org.apache.thrift.transport.TTransportException: 
 java.net.SocketTimeoutException: Read timedout
   
   
 FAILED: Execution Error, return code 1 from 
 org.apache.hadoop.hive.ql.exec.DDLTask 
 The DB used is Oracle and hive had only one table:
 select COUNT(*) from PARTITIONS;
 54839
 I can try and play around with the parameter 
 hive.metastore.client.socket.timeout if that is what is being used. But it is 
 200 seconds as of now, and 200 seconds for a drop table calls seems high 
 already.
 Thanks,
 Arup

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-3620) Drop table using hive CLI throws error when the total number of partition in the table is around 50K.

2013-04-09 Thread Thiruvel Thirumoolan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13627283#comment-13627283
 ] 

Thiruvel Thirumoolan commented on HIVE-3620:


I have had this problem in the past (in my case 0.2 million partitions, while 
stress testing dynamic partitions). The metastore crashed badly, though maybe mine was a 
rare situation. The workaround I used was to drop one hierarchy of partitions at a time. 
In my case, there were many partition keys and I would drop the topmost one 
instead of dropping the table.

Maybe it's worthwhile to visit HIVE-3214 and see if there is anything we could 
do at datanucleus end.

 Drop table using hive CLI throws error when the total number of partition in 
 the table is around 50K.
 -

 Key: HIVE-3620
 URL: https://issues.apache.org/jira/browse/HIVE-3620
 Project: Hive
  Issue Type: Bug
Reporter: Arup Malakar

 hive> drop table load_test_table_2_0; 
  
 FAILED: Error in metadata: org.apache.thrift.transport.TTransportException: 
 java.net.SocketTimeoutException: Read timedout
   
   
 FAILED: Execution Error, return code 1 from 
 org.apache.hadoop.hive.ql.exec.DDLTask 
 The DB used is Oracle and hive had only one table:
 select COUNT(*) from PARTITIONS;
 54839
 I can try and play around with the parameter 
 hive.metastore.client.socket.timeout if that is what is being used. But it is 
 200 seconds as of now, and 200 seconds for a drop table calls seems high 
 already.
 Thanks,
 Arup

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4291) Test HiveServer2 crash based on max thrift threads

2013-04-05 Thread Thiruvel Thirumoolan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13623999#comment-13623999
 ] 

Thiruvel Thirumoolan commented on HIVE-4291:


Thanks Brock, I will also reduce the time delays so the entire test runs in 
less than 20 seconds.

 Test HiveServer2 crash based on max thrift threads
 --

 Key: HIVE-4291
 URL: https://issues.apache.org/jira/browse/HIVE-4291
 Project: Hive
  Issue Type: Test
  Components: HiveServer2
Reporter: Thiruvel Thirumoolan
Assignee: Thiruvel Thirumoolan
 Attachments: TestHS2ThreadAllocation.java


 This test case ensures HS2 does not shutdown/crash when the thrift threads 
 have been depleted. This is due to an issue fixed in THRIFT-1869. This test 
 should pass post HIVE-4224. This test case ensures the crash doesn't happen 
 due to any changes in Thrift behavior.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Resolved] (HIVE-4049) local_mapred_error_cache.q with hadoop 23.x fails with additional warning messages

2013-04-03 Thread Thiruvel Thirumoolan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvel Thirumoolan resolved HIVE-4049.


Resolution: Duplicate

HIVE-3428 has already fixed this.

 local_mapred_error_cache.q with hadoop 23.x fails with additional warning 
 messages
 --

 Key: HIVE-4049
 URL: https://issues.apache.org/jira/browse/HIVE-4049
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.10.0
Reporter: Thiruvel Thirumoolan
 Fix For: 0.10.1


 When run on branch10 with 23.x, the test fails. An additional warning message 
 leads to failure. The test should be independent of these things.
 Diff output:
 [junit] 16d15
 [junit]  WARNING: org.apache.hadoop.metrics.jvm.EventCounter is 
 deprecated. Please use org.apache.hadoop.log.metrics.EventCounter in all the 
 log4j.properties files.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HIVE-4291) Test HiveServer2 crash based on max thrift threads

2013-04-03 Thread Thiruvel Thirumoolan (JIRA)
Thiruvel Thirumoolan created HIVE-4291:
--

 Summary: Test HiveServer2 crash based on max thrift threads
 Key: HIVE-4291
 URL: https://issues.apache.org/jira/browse/HIVE-4291
 Project: Hive
  Issue Type: Test
  Components: HiveServer2
Reporter: Thiruvel Thirumoolan
Assignee: Thiruvel Thirumoolan


This test case ensures HS2 does not shutdown/crash when the thrift threads have 
been depleted. This is due to an issue fixed in THRIFT-1869. This test should 
pass post HIVE-4224. This test case ensures the crash doesn't happen due to any 
changes in Thrift behavior.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-4291) Test HiveServer2 crash based on max thrift threads

2013-04-03 Thread Thiruvel Thirumoolan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvel Thirumoolan updated HIVE-4291:
---

Attachment: TestHS2ThreadAllocation.java

A WIP patch, will clean it up and post it on review board. I tested this with a 
custom built Thrift 0.9.0 library with THRIFT-692 changes, will retest with 
THRIFT-1.0 and update.

 Test HiveServer2 crash based on max thrift threads
 --

 Key: HIVE-4291
 URL: https://issues.apache.org/jira/browse/HIVE-4291
 Project: Hive
  Issue Type: Test
  Components: HiveServer2
Reporter: Thiruvel Thirumoolan
Assignee: Thiruvel Thirumoolan
 Attachments: TestHS2ThreadAllocation.java


 This test case ensures HS2 does not shutdown/crash when the thrift threads 
 have been depleted. This is due to an issue fixed in THRIFT-1869. This test 
 should pass post HIVE-4224. This test case ensures the crash doesn't happen 
 due to any changes in Thrift behavior.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4228) Bump up hadoop2 version in trunk

2013-03-26 Thread Thiruvel Thirumoolan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13614548#comment-13614548
 ] 

Thiruvel Thirumoolan commented on HIVE-4228:


Patch on Phabricator - https://reviews.facebook.net/D9723

 Bump up hadoop2 version in trunk
 

 Key: HIVE-4228
 URL: https://issues.apache.org/jira/browse/HIVE-4228
 Project: Hive
  Issue Type: Improvement
  Components: Build Infrastructure
Affects Versions: 0.11.0
Reporter: Thiruvel Thirumoolan
Assignee: Thiruvel Thirumoolan
 Fix For: 0.11.0

 Attachments: HIVE-4228.patch


 Hive builds with hadoop 2.0.0-alpha now. Bumping up to hadoop-2.0.3-alpha. 
 Have raised JIRAs with hive10-hadoop23.6 unit tests. Most of them should fix 
 any new failures due to this bump. [I am guessing this should also help 
 HCatalog].

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-4228) Bump up hadoop2 version in trunk

2013-03-26 Thread Thiruvel Thirumoolan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvel Thirumoolan updated HIVE-4228:
---

Status: Patch Available  (was: Open)

 Bump up hadoop2 version in trunk
 

 Key: HIVE-4228
 URL: https://issues.apache.org/jira/browse/HIVE-4228
 Project: Hive
  Issue Type: Improvement
  Components: Build Infrastructure
Affects Versions: 0.11.0
Reporter: Thiruvel Thirumoolan
Assignee: Thiruvel Thirumoolan
 Fix For: 0.11.0

 Attachments: HIVE-4228.patch


 Hive builds with hadoop 2.0.0-alpha now. Bumping up to hadoop-2.0.3-alpha. 
 Have raised JIRAs with hive10-hadoop23.6 unit tests. Most of them should fix 
 any new failures due to this bump. [I am guessing this should also help 
 HCatalog].

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HIVE-4228) Bump up hadoop2 version in trunk

2013-03-25 Thread Thiruvel Thirumoolan (JIRA)
Thiruvel Thirumoolan created HIVE-4228:
--

 Summary: Bump up hadoop2 version in trunk
 Key: HIVE-4228
 URL: https://issues.apache.org/jira/browse/HIVE-4228
 Project: Hive
  Issue Type: Improvement
  Components: Build Infrastructure
Affects Versions: 0.11.0
Reporter: Thiruvel Thirumoolan
Assignee: Thiruvel Thirumoolan
 Fix For: 0.11.0


Hive currently builds with hadoop 2.0.0-alpha; this bumps it up to 
hadoop-2.0.3-alpha. I have raised JIRAs from running the Hive 0.10 unit tests 
against hadoop 0.23.6, and most of them should also fix any new failures caused 
by this bump. [I am guessing this should also help HCatalog].

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-4228) Bump up hadoop2 version in trunk

2013-03-25 Thread Thiruvel Thirumoolan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvel Thirumoolan updated HIVE-4228:
---

Attachment: HIVE-4228.patch

 Bump up hadoop2 version in trunk
 

 Key: HIVE-4228
 URL: https://issues.apache.org/jira/browse/HIVE-4228
 Project: Hive
  Issue Type: Improvement
  Components: Build Infrastructure
Affects Versions: 0.11.0
Reporter: Thiruvel Thirumoolan
Assignee: Thiruvel Thirumoolan
 Fix For: 0.11.0

 Attachments: HIVE-4228.patch


 Hive currently builds with hadoop 2.0.0-alpha; this bumps it up to 
 hadoop-2.0.3-alpha. I have raised JIRAs from running the Hive 0.10 unit tests 
 against hadoop 0.23.6, and most of them should also fix any new failures 
 caused by this bump. [I am guessing this should also help HCatalog].

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HIVE-4047) skewjoin.q unit test inconsistently fails with Hadoop 0.23.x on branch 10

2013-02-21 Thread Thiruvel Thirumoolan (JIRA)
Thiruvel Thirumoolan created HIVE-4047:
--

 Summary: skewjoin.q unit test inconsistently fails with Hadoop 
0.23.x on branch 10
 Key: HIVE-4047
 URL: https://issues.apache.org/jira/browse/HIVE-4047
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.10.0
Reporter: Thiruvel Thirumoolan
Assignee: Thiruvel Thirumoolan
 Fix For: 0.10.1




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HIVE-4049) local_mapred_error_cache.q with hadoop 23.x fails with additional warning messages

2013-02-21 Thread Thiruvel Thirumoolan (JIRA)
Thiruvel Thirumoolan created HIVE-4049:
--

 Summary: local_mapred_error_cache.q with hadoop 23.x fails with 
additional warning messages
 Key: HIVE-4049
 URL: https://issues.apache.org/jira/browse/HIVE-4049
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.10.0
Reporter: Thiruvel Thirumoolan
 Fix For: 0.10.1


When run on branch10 against hadoop 23.x, this test fails. An additional 
deprecation warning in the output causes the failure; the test's output 
comparison should not be sensitive to such warnings.

Diff output:
[junit] 16d15
[junit] < WARNING: org.apache.hadoop.metrics.jvm.EventCounter is 
deprecated. Please use org.apache.hadoop.log.metrics.EventCounter in all the 
log4j.properties files.
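
One common way to make such tests robust is to strip known-noise lines from the 
captured output before diffing it against the expected file. The sketch below 
illustrates the general idea only; it is not Hive's actual QTestUtil masking 
code, and the pattern list is an assumption based on the warning shown above.

{code:java}
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

public class OutputNoiseFilter {

  // Known-noise patterns; the single entry below matches the deprecation
  // warning in the diff above, and the list would grow as new noise shows up.
  private static final List<Pattern> NOISE = new ArrayList<Pattern>();
  static {
    NOISE.add(Pattern.compile(
        "^WARNING: org\\.apache\\.hadoop\\.metrics\\.jvm\\.EventCounter is deprecated.*"));
  }

  /** Returns the given output lines with any known-noise lines removed. */
  public static List<String> filter(List<String> rawLines) {
    List<String> cleaned = new ArrayList<String>();
    for (String line : rawLines) {
      boolean noisy = false;
      for (Pattern p : NOISE) {
        if (p.matcher(line).matches()) {
          noisy = true;
          break;
        }
      }
      if (!noisy) {
        cleaned.add(line);
      }
    }
    return cleaned;
  }
}
{code}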

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-3911) udaf_percentile_approx.q fails with Hadoop 0.23.5 when map-side aggr is disabled.

2013-02-20 Thread Thiruvel Thirumoolan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvel Thirumoolan updated HIVE-3911:
---

Attachment: HIVE-3911_branch10.patch

Attaching HIVE-3911_branch10.patch. This should make the test consistent. I have 
just removed the queries that cause the differing output and fail this test.
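
For reference, a hypothetical way to reproduce the discrepancy outside the .q 
test framework (not part of this patch; the JDBC URL, driver class, table and 
column are placeholders) is to run the same percentile_approx query with 
map-side aggregation toggled on and off and compare the two results:

{code:java}
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class PercentileApproxRepro {
  public static void main(String[] args) throws Exception {
    Class.forName("org.apache.hive.jdbc.HiveDriver");
    Connection conn =
        DriverManager.getConnection("jdbc:hive2://localhost:10000/default", "", "");
    Statement stmt = conn.createStatement();

    // Run the identical query with map-side aggregation on and then off;
    // with the problem described below, the two printed values differ.
    for (String mapAggr : new String[] { "true", "false" }) {
      stmt.execute("set hive.map.aggr=" + mapAggr);
      ResultSet rs = stmt.executeQuery(
          "SELECT percentile_approx(CAST(key AS double), 0.5, 100) FROM src");
      while (rs.next()) {
        System.out.println("hive.map.aggr=" + mapAggr + " -> " + rs.getDouble(1));
      }
      rs.close();
    }

    stmt.close();
    conn.close();
  }
}
{code}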

 udaf_percentile_approx.q fails with Hadoop 0.23.5 when map-side aggr is 
 disabled.
 -

 Key: HIVE-3911
 URL: https://issues.apache.org/jira/browse/HIVE-3911
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.10.0
Reporter: Thiruvel Thirumoolan
 Fix For: 0.11.0

 Attachments: HIVE-3911_branch10.patch, HIVE-3911.patch


 I am running the Hive 0.10 unit tests against Hadoop 0.23.5, and 
 udaf_percentile_approx.q fails with a different value when map-side aggr is 
 disabled, and only when the 3rd argument to this UDAF is 100. The output 
 matches the expected output when map-side aggr is enabled for the same 
 arguments. This test passes when hadoop.version is 1.1.1 and fails when it is 
 0.23.x, 2.0.0-alpha, or 2.0.2-alpha.
 [junit] 20c20
 [junit] < 254.083331
 [junit] ---
 [junit] > 252.77
 [junit] 47c47
 [junit] < 254.083331
 [junit] ---
 [junit] > 252.77
 [junit] 74c74
 [junit] < [23.358,254.083331,477.0625,489.54667]
 [junit] ---
 [junit] > [24.07,252.77,476.9,487.82]
 [junit] 101c101
 [junit] < [23.358,254.083331,477.0625,489.54667]
 [junit] ---
 [junit] > [24.07,252.77,476.9,487.82]

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

