[jira] [Created] (HIVE-22306) Use nonblocking thrift server for metastore

2019-10-08 Thread Qinghui Xu (Jira)
Qinghui Xu created HIVE-22306:
-

 Summary: Use nonblocking thrift server for metastore
 Key: HIVE-22306
 URL: https://issues.apache.org/jira/browse/HIVE-22306
 Project: Hive
  Issue Type: Improvement
  Components: Metastore
Reporter: Qinghui Xu


Currently the Hive metastore's threads block on network I/O (it uses 
`TThreadPoolServer` behind the scenes). As the number of use cases grows (in 
our tech stack several services rely on it: HiveServer2, Spark, Presto, and 
more, all with a significant number of users), handling all connections 
requires either one big thread pool or many instances with smaller thread 
pools. Often those metastores see their thread pools saturated while CPU usage 
is still quite low, simply because most connections stay idle and only run a 
query from time to time. This is a great misuse of computation resources.

I therefore propose to use a non-blocking threading model and run computation 
asynchronously. 
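
To make the threading idea concrete, here is a minimal plain-Java sketch, not 
metastore code. It only models the dispatch side: many mostly idle 
"connections" share a small worker pool, and a worker is occupied only while a 
request is being computed. The class `AsyncDispatchSketch` and the methods 
`handle` and `serve` are hypothetical names introduced for illustration. In 
Thrift itself, selector-based servers such as `TNonblockingServer`, 
`THsHaServer` or `TThreadedSelectorServer` follow this general model; which 
one would fit the metastore is left open here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class AsyncDispatchSketch {
    /** Hypothetical stand-in for a metastore call; it just echoes the request id. */
    static String handle(int requestId) {
        return "response-" + requestId;
    }

    /**
     * Simulates one request from each of `connections` clients. Requests are
     * submitted to a pool of `workerThreads` threads, so the number of threads
     * is independent of the number of open connections.
     */
    static List<String> serve(int connections, int workerThreads) {
        ExecutorService workers = Executors.newFixedThreadPool(workerThreads);
        try {
            List<CompletableFuture<String>> pending = new ArrayList<>();
            for (int i = 0; i < connections; i++) {
                final int id = i;
                // The submitting thread does not wait here; work runs on the pool.
                pending.add(CompletableFuture.supplyAsync(() -> handle(id), workers));
            }
            List<String> results = new ArrayList<>();
            for (CompletableFuture<String> f : pending) {
                results.add(f.join());
            }
            return results;
        } finally {
            workers.shutdown();
        }
    }
}
```

For example, `AsyncDispatchSketch.serve(1000, 4)` answers 1000 simulated 
connections with 4 worker threads, whereas a thread-per-connection server 
would need a thread for every open connection, idle or not.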

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-22305) Add the kudu-handler to the packaging module

2019-10-08 Thread Grant Henke (Jira)
Grant Henke created HIVE-22305:
--

 Summary: Add the kudu-handler to the packaging module
 Key: HIVE-22305
 URL: https://issues.apache.org/jira/browse/HIVE-22305
 Project: Hive
  Issue Type: Sub-task
Reporter: Grant Henke
Assignee: Grant Henke


The hive-kudu-handler needs to be added to the packaging module to ensure the 
jars are packaged into the tar distribution.





Review Request 71589: Create read-only transactions

2019-10-08 Thread Denys Kuzmenko via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71589/
---

Review request for hive, Laszlo Pinter and Peter Vary.


Bugs: HIVE-21114
https://issues.apache.org/jira/browse/HIVE-21114


Repository: hive-git


Description
---

With HIVE-21036 we have a way to indicate that a txn is read only.
We should (at least in auto-commit mode) determine if the single stmt is a read 
and mark the txn accordingly.
Then we can optimize TxnHandler.commitTxn() so that it doesn't do any checks in 
write_set etc.

TxnHandler.commitTxn() already starts with lockTransactionRecord(stmt, txnid, 
TXN_OPEN) so it can read the txn type in the same SQL stmt.

HiveOperation only has QUERY, which covers both INSERT and SELECT, so this 
requires figuring out how to determine whether a query is a SELECT. By the time 
Driver.openTransaction() is called, we have already parsed the query, so there 
should be a way to know if the statement only reads.

For multi-stmt txns (once these are supported) we should allow the user to 
indicate that a txn is read-only and then reject any statement that could make 
modifications in this txn. This should be a separate Jira.


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/Driver.java bcd4600683 
  ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java fcf499d53a 
  ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbTxnManager.java 943aa383bb 
  ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DummyTxnManager.java ac813c8288 
  ql/src/java/org/apache/hadoop/hive/ql/lockmgr/HiveTxnManager.java 1c53426966 
  ql/src/test/org/apache/hadoop/hive/ql/lockmgr/TestDbTxnManager.java cc86afedbf 
  ql/src/test/org/apache/hadoop/hive/ql/parse/TestParseUtils.java PRE-CREATION 


Diff: https://reviews.apache.org/r/71589/diff/1/


Testing
---

Unit + manual test


Thanks,

Denys Kuzmenko



[jira] [Created] (HIVE-22304) Upgrade ORC version to 1.6.0

2019-10-08 Thread David Lavati (Jira)
David Lavati created HIVE-22304:
---

 Summary: Upgrade ORC version to 1.6.0
 Key: HIVE-22304
 URL: https://issues.apache.org/jira/browse/HIVE-22304
 Project: Hive
  Issue Type: Improvement
Reporter: David Lavati
Assignee: David Lavati








[jira] [Created] (HIVE-22303) TestObjectStore starts some deadline timers which are never stopped

2019-10-08 Thread Zoltan Haindrich (Jira)
Zoltan Haindrich created HIVE-22303:
---

 Summary: TestObjectStore starts some deadline timers which are 
never stopped
 Key: HIVE-22303
 URL: https://issues.apache.org/jira/browse/HIVE-22303
 Project: Hive
  Issue Type: Improvement
Reporter: Zoltan Haindrich
Assignee: Zoltan Haindrich


Because these timers are never stopped, they may linger in a thread-local and 
eventually time out, since the disarm logic is missing.

https://github.com/apache/hive/blob/d907dfe68ed84714d62a22e5191efa616eab2b24/standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/TestObjectStore.java#L373
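
The usual way to guarantee that a thread-local timer is disarmed is to pair 
start and stop in a try/finally block. The sketch below shows only that 
pattern. `DeadlineSketch` and its methods are hypothetical stand-ins written 
for this example; they are not the metastore's own deadline API and not the 
fix applied to the test.

```java
import java.util.function.Supplier;

class DeadlineSketch {
    // Start time of the current thread's timer in nanoseconds; absent when disarmed.
    private static final ThreadLocal<Long> START = new ThreadLocal<>();

    static void startTimer() {
        START.set(System.nanoTime());
    }

    static void stopTimer() {
        // remove() clears the thread-local so nothing leaks into later work on this thread.
        START.remove();
    }

    static boolean isArmed() {
        return START.get() != null;
    }

    /** Runs `body` with the timer armed and always disarms it afterwards. */
    static <T> T withTimer(Supplier<T> body) {
        startTimer();
        try {
            return body.get();
        } finally {
            stopTimer(); // runs on normal return and when body throws
        }
    }
}
```

With this shape, a test that throws halfway through still leaves the thread 
without an armed timer, so a later test on the same thread cannot hit a stale 
timeout.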







[jira] [Created] (HIVE-22302) Add some smoke tests

2019-10-08 Thread Zoltan Haindrich (Jira)
Zoltan Haindrich created HIVE-22302:
---

 Summary: Add some smoke tests
 Key: HIVE-22302
 URL: https://issues.apache.org/jira/browse/HIVE-22302
 Project: Hive
  Issue Type: Improvement
Reporter: Zoltan Haindrich
Assignee: Zoltan Haindrich


I'm not sure how achievable this is, but we sometimes let metastore upgrade 
bugs and similar problems slip in by mistake.

It would be great to have something that:

* compiles and deploys hive
* runs some trivial cases
* ...and runs them against multiple kinds of metastore DBs

I think Travis can be convinced to do something like this.





Re: Review Request 71575: HIVE-22284: Improve LLAP CacheContentsTracker to collect and display correct statistics

2019-10-08 Thread Adam Szita via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71575/
---

(Updated Oct. 8, 2019, 7:53 a.m.)


Review request for hive.


Bugs: HIVE-22284
https://issues.apache.org/jira/browse/HIVE-22284


Repository: hive-git


Description
---

When keeping track of which buffers correspond to what Hive objects, 
CacheContentsTracker relies on cache tags.

Currently a tag is a simple String that ideally holds the DB and table name 
plus a partition spec, joined by `.` and `/`. The information is derived from 
the Path of the file that is being cached. This sometimes produces a wrong tag, 
especially for external tables.

Also there's a bug when calculating aggregated stats for a 'parent' tag 
(corresponding to the table of the partition) because the overall maxCount and 
maxSize do not add up to the sum of those in the partitions. This happens when 
buffers get removed from the cache.
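
As an illustration of the two points above, here is a minimal sketch and not 
the code in this patch. It uses a structured tag instead of a concatenated 
String and applies every add and remove to the partition tag and its parent 
table tag alike, so the parent's current totals always equal the sum over its 
partitions. `TagStatsSketch`, `Tag` and all method names are hypothetical, and 
only current count and size are tracked here, not the max values.

```java
import java.util.HashMap;
import java.util.Map;

class TagStatsSketch {
    /** Hypothetical structured tag: table name plus an optional partition spec. */
    static final class Tag {
        final String table;      // e.g. "db.tbl"
        final String partition;  // e.g. "p=1", or null for the table-level tag

        Tag(String table, String partition) {
            this.table = table;
            this.partition = partition;
        }

        /** The table-level tag for a partition tag, or null for a table tag. */
        Tag parent() {
            return partition == null ? null : new Tag(table, null);
        }

        String key() {
            return partition == null ? table : table + "/" + partition;
        }
    }

    // key -> {current buffer count, current total size in bytes}
    private final Map<String, long[]> stats = new HashMap<>();

    void bufferAdded(Tag tag, long size) {
        update(tag, 1, size);
    }

    void bufferRemoved(Tag tag, long size) {
        update(tag, -1, -size);
    }

    // Applies the same delta to the tag and to every ancestor.
    private void update(Tag tag, long countDelta, long sizeDelta) {
        for (Tag t = tag; t != null; t = t.parent()) {
            long[] s = stats.computeIfAbsent(t.key(), k -> new long[2]);
            s[0] += countDelta;
            s[1] += sizeDelta;
        }
    }

    long count(Tag tag) {
        long[] s = stats.get(tag.key());
        return s == null ? 0 : s[0];
    }

    long size(Tag tag) {
        long[] s = stats.get(tag.key());
        return s == null ? 0 : s[1];
    }
}
```

Because removals walk the same parent chain as additions, evicting a buffer 
from one partition lowers the table-level totals by exactly the same amount.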


Diffs (updated)
-

  llap-common/src/java/org/apache/hadoop/hive/llap/LlapUtil.java a351a193c6bc558bb420049c54b7657cd7d04b7c 
  llap-server/src/java/org/apache/hadoop/hive/llap/cache/CacheContentsTracker.java 64c0125833af100fd7012b9751d075ab536ad1b0 
  llap-server/src/java/org/apache/hadoop/hive/llap/cache/LlapCacheableBuffer.java f91a5d91a5b739dcbee98a1485ad4c59f6a9057b 
  llap-server/src/java/org/apache/hadoop/hive/llap/cache/LlapDataBuffer.java 405fca2d4fae9fe0e3fd6d6d1345d55255d6df78 
  llap-server/src/java/org/apache/hadoop/hive/llap/cache/LowLevelCache.java 4dd3826a67dfff66ce9c90027d61a9012c0a15e8 
  llap-server/src/java/org/apache/hadoop/hive/llap/cache/LowLevelCacheImpl.java 62d7e5534486b53634de332875c5fd5d336c29b4 
  llap-server/src/java/org/apache/hadoop/hive/llap/cache/SerDeLowLevelCacheImpl.java 2a39d2d32807a51346baad28b04d87670381b6d5 
  llap-server/src/java/org/apache/hadoop/hive/llap/cache/SimpleBufferManager.java 41855e171eaa5bf8da638bc62bce3d0d49dc4bae 
  llap-server/src/java/org/apache/hadoop/hive/llap/io/api/impl/LlapIoImpl.java c63ee5f79b4f9fc356f033960e0af1a7b0058038 
  llap-server/src/java/org/apache/hadoop/hive/llap/io/api/impl/LlapRecordReader.java 1378a01f44ef774a15f769460833064c6305b2d6 
  llap-server/src/java/org/apache/hadoop/hive/llap/io/decode/OrcColumnVectorProducer.java 2a0c5ca92f3c7431f3c399f309a538f47eb27597 
  llap-server/src/java/org/apache/hadoop/hive/llap/io/encoded/OrcEncodedDataReader.java 85a42f945624c3ca468790772f52363b4064d8fc 
  llap-server/src/java/org/apache/hadoop/hive/llap/io/encoded/SerDeEncodedDataReader.java d414b1405b7672767196b3eaad02baa516169288 
  llap-server/src/java/org/apache/hadoop/hive/llap/io/metadata/MetadataCache.java 8400fe98411ed07bd525a51a223fc35423136efb 
  llap-server/src/java/org/apache/hadoop/hive/llap/io/metadata/OrcFileEstimateErrors.java 30dc1b9da2002689b8b1917f46ae3ca24194f3be 
  llap-server/src/test/org/apache/hadoop/hive/llap/cache/TestCacheContentsTracker.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/llap/LlapCacheAwareFs.java af04a51b5536550b2d2f7d3e008cf2b2dea607d4 
  ql/src/java/org/apache/hadoop/hive/llap/LlapHiveUtils.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/encoded/EncodedReaderImpl.java 241a3001e6e0002377736d6d0e820fde004b0bac 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/encoded/Reader.java 210c987b7f580dacda5bdb487af9cf234a738b79 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/encoded/ReaderImpl.java a9a9f101948a970e0dbf2f77eeb6f688a88d1cbd 
  ql/src/java/org/apache/hadoop/hive/ql/io/parquet/vector/VectorizedParquetRecordReader.java 61e2556b08fe4247f35673f24378505ada20a605 
  storage-api/src/java/org/apache/hadoop/hive/common/io/CacheTag.java PRE-CREATION 
  storage-api/src/java/org/apache/hadoop/hive/common/io/DataCache.java 2ac0a18a5026e76e65c3c3a8b81d5a844c472ed2 
  storage-api/src/java/org/apache/hadoop/hive/common/io/FileMetadataCache.java d7de3619380d24a1aeea2bac9a66485d7d468517 


Diff: https://reviews.apache.org/r/71575/diff/4/

Changes: https://reviews.apache.org/r/71575/diff/3-4/


Testing
---


Thanks,

Adam Szita