[jira] [Created] (HIVE-23240) loadDynamicPartition complains about static partitions even when they are provided in the description

2020-04-17 Thread Reza Safi (Jira)
Reza Safi created HIVE-23240:


 Summary: loadDynamicPartition complains about static partitions 
even when they are provided in the description 
 Key: HIVE-23240
 URL: https://issues.apache.org/jira/browse/HIVE-23240
 Project: Hive
  Issue Type: Bug
Reporter: Reza Safi


Hive computes the valid dynamic partitions here:
https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java#L2853
However, it later uses the specification provided by the client here:
https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java#L2879
(partSpec is exactly what the client provided, and partSpec.keySet() contains 
both static and dynamic partition keys.)
As a result, makeSpecFromName expects both static and dynamic 
partition keys in requiredKeys:
https://github.com/apache/hive/blob/master/standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/Warehouse.java#L580
However, the curPath passed to that method looks like 
"somePath/dynamicPart=value", which lacks the static partitions. The method 
therefore does not find the static partition keys, logs a warning that they 
are missing, and returns false. Back in Hive.java, a warning that 
"dynamicPart=value" is an invalid partition is then logged, even though the 
dynamic partition was validated earlier:
https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java#L2880
 
This can cause silent data corruption in some clients. For example, Spark 
is affected when working with the Hive metastore from the master branch.
If the goal was only to warn the client, there is no need to 
ignore the valid dynamic partition.
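
The mismatch can be illustrated with a small self-contained sketch. It does not call Hive's Warehouse.makeSpecFromName; parseSpec below is a hypothetical stand-in that only mimics the behavior described above (parse key=value path components, then require every key in requiredKeys). The partition keys "country" and "dt", the path, and the convention that a dynamic key has a null value in the client spec are assumptions made for the example.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for the check described above; NOT Hive's Warehouse code.
class PartSpecSketch {

  // Parses "key=value" components of the path and copies the required ones
  // into spec. Returns true only if every key in requiredKeys was found.
  static boolean parseSpec(Map<String, String> spec, String path, Set<String> requiredKeys) {
    Map<String, String> found = new LinkedHashMap<>();
    for (String component : path.split("/")) {
      int eq = component.indexOf('=');
      if (eq > 0) {
        found.put(component.substring(0, eq), component.substring(eq + 1));
      }
    }
    for (String key : requiredKeys) {
      if (!found.containsKey(key)) {
        return false; // a required key is missing from the path
      }
      spec.put(key, found.get(key));
    }
    return true;
  }

  public static void main(String[] args) {
    // Client-provided spec: "country" is static, "dt" is dynamic (no value yet).
    Map<String, String> partSpec = new LinkedHashMap<>();
    partSpec.put("country", "US");
    partSpec.put("dt", null);

    // Only the dynamic part appears in the path.
    String curPath = "somePath/dt=2020-04-17";

    // Requiring all client keys fails because "country" is not in the path.
    boolean allKeys = parseSpec(new LinkedHashMap<>(), curPath, partSpec.keySet());

    // Requiring only the dynamic keys succeeds.
    Set<String> dynamicKeys = new LinkedHashSet<>();
    for (Map.Entry<String, String> e : partSpec.entrySet()) {
      if (e.getValue() == null) {
        dynamicKeys.add(e.getKey());
      }
    }
    boolean dynamicOnly = parseSpec(new LinkedHashMap<>(), curPath, dynamicKeys);

    System.out.println("all keys required: " + allKeys);
    System.out.println("dynamic keys only: " + dynamicOnly);
  }
}
```

In this sketch the first call fails and the second succeeds, which is the distinction the report is pointing at: the path can only ever satisfy the dynamic keys.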



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-23239) Remove snakeyaml lib from Hive distribution via transitive dependency

2020-04-17 Thread Roohi Syeda (Jira)
Roohi Syeda created HIVE-23239:

 Summary: Remove snakeyaml lib from Hive distribution via 
transitive dependency
 Key: HIVE-23239
 URL: https://issues.apache.org/jira/browse/HIVE-23239
 Project: Hive
  Issue Type: Bug
Reporter: Roohi Syeda
Assignee: Roohi Syeda








Re: [RESULT][VOTE] Apache Hive 2.3.7 Release Candidate 0

2020-04-17 Thread Alan Gates
With three +1s (Alan, Owen, and Peter) and no other votes this vote
passes.  Thanks Owen and Peter for voting.

Alan.

On Fri, Apr 17, 2020 at 12:05 AM Peter Vary 
wrote:

> +1 for the release.
> - Downloaded the artifacts
> - Verified the signatures
> - Built the project (failed 2 times with parallel compilation - I was
> afraid that I had found a problem, but it finally compiled without parallel
> compilation :) )
> - Ran some tests
>
> > On Apr 16, 2020, at 22:33, Owen O'Malley  wrote:
> >
> > I'm +1 for the release.
> >
> >   - I checked the signature & sha.
> >   - I built the project.
> >   - I ran a handful of unit tests.
> >
> > .. Owen
> >
> > On Tue, Apr 7, 2020 at 8:34 PM Hyukjin Kwon  wrote:
> >
> >> Thank you so much Alan for doing this.
> >>
> >> On Wed, Apr 8, 2020 at 9:26 AM, Alan Gates wrote:
> >>
> >>> Apache Hive 2.3.7 Release Candidate 0 is available here:
> >>> https://people.apache.org/~gates/apache-hive-2.3.7-rc0/
> >>>
> >>> Maven artifacts are available here:
> >>> https://repository.apache.org/content/repositories/orgapachehive-1100/
> >>>
> >>> The tag release-2.3.7-rc0 has been applied to the source for this
> >>> release in GitHub, you can see it at
> >>> https://github.com/apache/hive/tree/release-2.3.7-rc0
> >>>
> >>> Voting will conclude in 72 hours (or whenever I scrounge together enough
> >>> votes).
> >>>
> >>> Hive PMC Members: Please test and vote.
> >>>
> >>> Thanks.
> >>>
> >>>
> >>> Alan.
> >>>
> >>
>
>


[jira] [Created] (HIVE-23238) FIX PreemptionQueueComparator edge cases

2020-04-17 Thread Panagiotis Garefalakis (Jira)
Panagiotis Garefalakis created HIVE-23238:

 Summary: FIX PreemptionQueueComparator edge cases
 Key: HIVE-23238
 URL: https://issues.apache.org/jira/browse/HIVE-23238
 Project: Hive
  Issue Type: Improvement
Reporter: Panagiotis Garefalakis
Assignee: Panagiotis Garefalakis
 Fix For: llap


Properly handle preemption comparator edge cases where tasks are of the same 
type and have the same number of upstream tasks.





Re: Review Request 72380: HIVE-23207 Create integration tests for TxnManager for different rdbms metastores

2020-04-17 Thread Peter Vary via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72380/#review220352
---



Thanks Peter for the patch!
This fix is long overdue!

I do not understand one thing, see below.

Also, I would like to ask Denys to confirm that running the init SQL scripts 
again and again will not add too much overhead to the test runtime, as he 
mentioned in our discussion. (5-minute tests are fine, 1-hour tests are not :))

Thanks!


itests/util/src/main/java/org/apache/hadoop/hive/ql/QTestMetaStoreHandler.java
Lines 67 (patched)


Why is this needed?



itests/util/src/main/java/org/apache/hadoop/hive/ql/QTestMetaStoreHandler.java
Lines 119 (patched)


Why is this needed? Again?


- Peter Vary


On Apr. 17, 2020, 3:24 p.m., Peter Varga wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72380/
> ---
> 
> (Updated Apr. 17, 2020, 3:24 p.m.)
> 
> 
> Review request for hive, Denys Kuzmenko and Zoltan Chovan.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> In the final version, prepDb creates the transactional tables with the 
> init schema (and ignores the others if they exist). cleanDb resets the 
> database to the starting point, so between test cases the cleanDb call is 
> enough. If prepDb is called unnecessarily, it just checks whether the txns 
> table exists and then returns, so it will be fast.
> 
> 
> Diffs
> -
> 
>   itests/hive-blobstore/pom.xml 09955c55f3
>   itests/qtest-accumulo/pom.xml a35d2a8a10
>   itests/qtest-druid/pom.xml cc0cceff68
>   itests/qtest-kudu/pom.xml f23399fa37
>   itests/util/src/main/java/org/apache/hadoop/hive/ql/QTestMetaStoreHandler.java b86d736a89
>   pom.xml 90e39702a1
>   ql/pom.xml d1846c9245
>   ql/src/test/org/apache/hadoop/hive/metastore/txn/TestCompactionTxnHandler.java 15fcfc0e35
>   ql/src/test/org/apache/hadoop/hive/metastore/txn/TestTxnHandler.java 1d211857bf
>   ql/src/test/org/apache/hadoop/hive/metastore/txn/TestTxnHandlerNoConnectionPool.java ebe4880e3a
>   ql/src/test/org/apache/hadoop/hive/ql/lockmgr/ITestDbTxnManager.java PRE-CREATION
>   ql/src/test/org/apache/hadoop/hive/ql/lockmgr/TestDbTxnManager2.java 73d3b91585
>   standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/DatabaseProduct.java 3e56ad513c
>   standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnDbUtil.java a66e16973f
>   standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java 962a63d418
>   standalone-metastore/metastore-server/src/main/sql/derby/hive-schema-4.0.0.derby.sql 1ace9d3ef0
>   standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/TestHiveMetaStoreTxns.java 5f3db52c2f
>   standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/dbinstall/rules/Mysql.java c537d95470
> 
> 
> Diff: https://reviews.apache.org/r/72380/diff/1/
> 
> 
> Testing
> ---
> 
> On my machine the 50 tests in TestDbTxnManager2 on postgres run in under 5 
> minutes.
> 
> 
> Thanks,
> 
> Peter Varga
> 
>



[jira] [Created] (HIVE-23237) Display HiveServer2 hostname in the operation logs

2020-04-17 Thread Miklos Szurap (Jira)
Miklos Szurap created HIVE-23237:


 Summary: Display HiveServer2 hostname in the operation logs
 Key: HIVE-23237
 URL: https://issues.apache.org/jira/browse/HIVE-23237
 Project: Hive
  Issue Type: Improvement
Reporter: Miklos Szurap


Hive deployments often have an external load balancer in front of multiple 
HiveServer2 instances. 
In such cases the client does not know which HiveServer2 it is connected to. If 
there is an issue, all HiveServer2 logs have to be searched for clues 
instead of going directly to the right host. It would be great if the HS2 
hostname was logged to the client logs (for example, to beeline's output). 
We can work around this by executing "set hive.server2.thrift.bind.host;" to 
print this information; however, that requires an explicit modification 
to every application. 
Can we print this information in the operation logs so that it is streamed 
back to the client? 
Some users or customers will likely not want to expose this, so the behavior 
should be configurable.
This could make investigating issues and errors much easier.





[jira] [Created] (HIVE-23236) Remove the global lock from acquireLock

2020-04-17 Thread Marton Bod (Jira)
Marton Bod created HIVE-23236:

 Summary: Remove the global lock from acquireLock
 Key: HIVE-23236
 URL: https://issues.apache.org/jira/browse/HIVE-23236
 Project: Hive
  Issue Type: Improvement
Reporter: Marton Bod
Assignee: Marton Bod


Currently we take a global lock (NEXT_LOCK_ID) when running enqueueLock, 
because the algorithm in checkLock requires the locks to have a well-defined 
order, and also requires that every lock component is already stored in the 
RDBMS before the locks are checked.

Proposed approach:
 * Enqueue lock without a global lock, using a sequence for getting the next 
lock ID
 * Insert the locks as usual, but with an additional timestamp (enqueued_at)
 * Check lock should check all enqueued locks for the given db/table/partition 
but order should be determined based on the enqueued_at time instead of the 
lockID
 * If there is no blocking lock, update the state to acquired as usual
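
The proposed approach can be sketched in memory as follows. This is only an illustration under stated assumptions, not the TxnHandler implementation: an AtomicLong stands in for the RDBMS sequence, a list stands in for the lock table, every lock is treated as exclusive, and ties on the enqueue time are broken by lock ID. All class and method names are hypothetical.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.AtomicLong;

// In-memory sketch only: the names, the exclusive-only lock model, and the
// tie-break on lock ID are assumptions, not Hive's TxnHandler implementation.
class LockQueueSketch {
  enum State { WAITING, ACQUIRED }

  static final class Lock {
    final long id;
    final String resource;  // e.g. "db/table/partition"
    final long enqueuedAt;  // stand-in for the proposed enqueued_at column
    volatile State state = State.WAITING;

    Lock(long id, String resource, long enqueuedAt) {
      this.id = id;
      this.resource = resource;
      this.enqueuedAt = enqueuedAt;
    }
  }

  // Stand-in for an RDBMS sequence.
  private final AtomicLong sequence = new AtomicLong();
  // Stand-in for the lock table.
  private final List<Lock> locks = new CopyOnWriteArrayList<>();

  // Enqueue without a global lock: the ID comes from the sequence and the
  // lock is stored together with its enqueue time.
  Lock enqueue(String resource, long enqueuedAt) {
    Lock lock = new Lock(sequence.incrementAndGet(), resource, enqueuedAt);
    locks.add(lock);
    return lock;
  }

  // A lock is blocked by any other lock on the same resource that is already
  // acquired or was enqueued earlier; ordering uses enqueuedAt, not the ID.
  boolean checkLock(Lock candidate) {
    for (Lock other : locks) {
      if (other == candidate || !other.resource.equals(candidate.resource)) {
        continue;
      }
      boolean earlier = other.enqueuedAt < candidate.enqueuedAt
          || (other.enqueuedAt == candidate.enqueuedAt && other.id < candidate.id);
      if (other.state == State.ACQUIRED || earlier) {
        return false;
      }
    }
    candidate.state = State.ACQUIRED; // no blocking lock: update the state as usual
    return true;
  }

  void release(Lock lock) {
    locks.remove(lock);
  }

  public static void main(String[] args) {
    LockQueueSketch queue = new LockQueueSketch();
    Lock first = queue.enqueue("db/tbl/p=1", 200L);
    Lock second = queue.enqueue("db/tbl/p=1", 100L); // larger ID, earlier enqueue time
    System.out.println("first acquired: " + queue.checkLock(first));
    System.out.println("second acquired: " + queue.checkLock(second));
  }
}
```

Note that checkLock here is a plain check-then-act and is not race-free on its own; in the real system the check would presumably run inside a database transaction.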





Review Request 72380: HIVE-23207 Create integration tests for TxnManager for different rdbms metastores

2020-04-17 Thread Peter Varga via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72380/
---

Review request for hive, Denys Kuzmenko and Zoltan Chovan.


Repository: hive-git


Description
---

In the final version, prepDb creates the transactional tables with the init 
schema (and ignores the others if they exist). cleanDb resets the database 
to the starting point, so between test cases the cleanDb call is enough. If 
prepDb is called unnecessarily, it just checks whether the txns table exists 
and then returns, so it will be fast.


Diffs
-

  itests/hive-blobstore/pom.xml 09955c55f3
  itests/qtest-accumulo/pom.xml a35d2a8a10
  itests/qtest-druid/pom.xml cc0cceff68
  itests/qtest-kudu/pom.xml f23399fa37
  itests/util/src/main/java/org/apache/hadoop/hive/ql/QTestMetaStoreHandler.java b86d736a89
  pom.xml 90e39702a1
  ql/pom.xml d1846c9245
  ql/src/test/org/apache/hadoop/hive/metastore/txn/TestCompactionTxnHandler.java 15fcfc0e35
  ql/src/test/org/apache/hadoop/hive/metastore/txn/TestTxnHandler.java 1d211857bf
  ql/src/test/org/apache/hadoop/hive/metastore/txn/TestTxnHandlerNoConnectionPool.java ebe4880e3a
  ql/src/test/org/apache/hadoop/hive/ql/lockmgr/ITestDbTxnManager.java PRE-CREATION
  ql/src/test/org/apache/hadoop/hive/ql/lockmgr/TestDbTxnManager2.java 73d3b91585
  standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/DatabaseProduct.java 3e56ad513c
  standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnDbUtil.java a66e16973f
  standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java 962a63d418
  standalone-metastore/metastore-server/src/main/sql/derby/hive-schema-4.0.0.derby.sql 1ace9d3ef0
  standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/TestHiveMetaStoreTxns.java 5f3db52c2f
  standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/dbinstall/rules/Mysql.java c537d95470


Diff: https://reviews.apache.org/r/72380/diff/1/


Testing
---

On my machine the 50 tests in TestDbTxnManager2 on postgres run in under 5 
minutes.


Thanks,

Peter Varga



[jira] [Created] (HIVE-23235) Checkpointing in repl dump failing for orc format

2020-04-17 Thread Aasha Medhi (Jira)
Aasha Medhi created HIVE-23235:

 Summary: Checkpointing in repl dump failing for orc format
 Key: HIVE-23235
 URL: https://issues.apache.org/jira/browse/HIVE-23235
 Project: Hive
  Issue Type: Bug
Reporter: Aasha Medhi
Assignee: Aasha Medhi








Re: Review Request 72378: HIVE-23201: Improve logging in locking

2020-04-17 Thread Peter Vary via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72378/#review220347
---



Mostly agree, with a few comments.
I would like to ask you to go through them and, in any place where 
the problem should not happen, use at least info level.

Thanks,
Peter


standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
Lines 4492 (patched)


nit: Maybe remove the last ','?



standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
Line 4810 (original), 4800 (patched)


I prefer the original info level for these logs.


- Peter Vary


On Apr. 17, 2020, 11:30 a.m., Marton Bod wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72378/
> ---
> 
> (Updated Apr. 17, 2020, 11:30 a.m.)
> 
> 
> Review request for hive, Denys Kuzmenko and Peter Vary.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-23201: Improve logging in locking
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbLockManager.java 4b6bc3e1e3
>   standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java 962a63d418
>   standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnUtils.java b3a1f826bb
> 
> 
> Diff: https://reviews.apache.org/r/72378/diff/1/
> 
> 
> Testing
> ---
> 
> Green build: https://builds.apache.org/job/PreCommit-HIVE-Build/21717/
> 
> 
> Thanks,
> 
> Marton Bod
> 
>



Review Request 72379: HIVE-23230: HiveWarehouseConnector executeQuery api for query having "limit" clause returns more rows

2020-04-17 Thread Adesh Rao

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72379/
---

Review request for hive and Sankar Hariappan.


Repository: hive-git


Description
---

This issue occurs when there are multiple LLAP daemons running.


Diffs
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 7b3acad511
  itests/hive-unit/src/test/java/org/apache/hive/jdbc/AbstractTestJdbcGenericUDTFGetSplits.java 8cbca69737
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDTFGetSplits.java 00a6c89b1e


Diff: https://reviews.apache.org/r/72379/diff/1/


Testing
---

Waiting for Hive QA run.


Thanks,

Adesh Rao



Review Request 72378: HIVE-23201: Improve logging in locking

2020-04-17 Thread Marton Bod

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72378/
---

Review request for hive, Denys Kuzmenko and Peter Vary.


Repository: hive-git


Description
---

HIVE-23201: Improve logging in locking


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbLockManager.java 4b6bc3e1e3
  standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java 962a63d418
  standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnUtils.java b3a1f826bb


Diff: https://reviews.apache.org/r/72378/diff/1/


Testing
---

Green build: https://builds.apache.org/job/PreCommit-HIVE-Build/21717/


Thanks,

Marton Bod



[jira] [Created] (HIVE-23234) Optimize TxnHandler::allocateTableWriteIds

2020-04-17 Thread Marton Bod (Jira)
Marton Bod created HIVE-23234:

 Summary: Optimize TxnHandler::allocateTableWriteIds
 Key: HIVE-23234
 URL: https://issues.apache.org/jira/browse/HIVE-23234
 Project: Hive
  Issue Type: Improvement
Reporter: Marton Bod


Table write ID allocation should be examined and optimized. One low-hanging 
fruit is batching all the PreparedStatement inserts, but there might be other 
opportunities as well.
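
As a rough illustration of what batching the inserts could look like, here is a minimal JDBC sketch. It is not taken from TxnHandler: the two-column row shape (txnId, writeId), the class and method names, and the batch size are all hypothetical. It only relies on the standard PreparedStatement.addBatch() and executeBatch() calls.

```java
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

// Sketch of JDBC batching; the two-column (txnId, writeId) row shape is a
// hypothetical example, not the actual TxnHandler schema or code.
class BatchInsertSketch {

  // Binds each {txnId, writeId} pair, queues it with addBatch(), and flushes
  // with executeBatch() every batchSize rows. Returns the number of flushes.
  static int insertInBatches(PreparedStatement ps, List<long[]> rows, int batchSize)
      throws SQLException {
    if (batchSize <= 0) {
      throw new IllegalArgumentException("batchSize must be positive");
    }
    int flushes = 0;
    int pending = 0;
    for (long[] row : rows) {
      ps.setLong(1, row[0]);
      ps.setLong(2, row[1]);
      ps.addBatch();
      if (++pending == batchSize) {
        ps.executeBatch();
        flushes++;
        pending = 0;
      }
    }
    if (pending > 0) {
      // Flush the final, partially filled batch.
      ps.executeBatch();
      flushes++;
    }
    return flushes;
  }
}
```

The caller would prepare a single two-parameter INSERT statement and pass it in together with the rows, so the number of round trips depends on the batch size rather than on the number of rows.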





[jira] [Created] (HIVE-23233) Using default operation logs location cause hive service session testing failed

2020-04-17 Thread RuiChen (Jira)
RuiChen created HIVE-23233:

 Summary: Using default operation logs location cause hive service 
session testing failed
 Key: HIVE-23233
 URL: https://issues.apache.org/jira/browse/HIVE-23233
 Project: Hive
  Issue Type: Bug
Reporter: RuiChen


TestSessionCleanup and TestSessionManagerMetrics both use the default 
operation logs location 
"ConfVars.HIVE_SERVER2_LOGGING_OPERATION_LOG_LOCATION", which is the same OS 
path for both. In TestSessionManagerMetrics, SessionManager deletes the 
operation logs directory when the test exits, while TestSessionCleanup checks 
the file count in the operation logs directory, so if these test cases are 
executed in parallel, they interfere with each other.

I ran the Hive tests on my local machine with Maven parallel mode enabled via 
the options "-T 4 -DforkCount=4", which runs test classes in separate 
processes and executes them in parallel. If you run the tests several 
times, this issue should occur.
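
One possible way to avoid the interference is to give each test class its own operation logs directory. The sketch below is only an illustration under assumptions: it uses java.util.Properties as a stand-in for HiveConf, and the class and helper names are hypothetical. The property key is the one behind HIVE_SERVER2_LOGGING_OPERATION_LOG_LOCATION.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;

// Sketch only: Properties stands in for HiveConf, and the helper is hypothetical.
class OperationLogDirSketch {
  static final String LOG_LOCATION_KEY = "hive.server2.logging.operation.log.location";

  // Creates a fresh temporary directory named after the test class and points
  // the operation log location at it, so parallel test classes never share it.
  static Path isolateOperationLogs(Properties conf, Class<?> testClass) {
    try {
      Path dir = Files.createTempDirectory(testClass.getSimpleName() + "_operation_logs_");
      dir.toFile().deleteOnExit();
      conf.setProperty(LOG_LOCATION_KEY, dir.toString());
      return dir;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    Properties conf = new Properties();
    Path dir = isolateOperationLogs(conf, OperationLogDirSketch.class);
    System.out.println(LOG_LOCATION_KEY + " = " + dir);
  }
}
```

Each test class would call the helper in its setup before creating the SessionManager, so deleting the directory in one test cannot affect the file count checked in another.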





[jira] [Created] (HIVE-23232) Fix flaky TestJdbcWithServiceDiscovery.testKillQueryWithDifferentServerZKTurnedOff

2020-04-17 Thread Peter Varga (Jira)
Peter Varga created HIVE-23232:

 Summary: Fix flaky 
TestJdbcWithServiceDiscovery.testKillQueryWithDifferentServerZKTurnedOff
 Key: HIVE-23232
 URL: https://issues.apache.org/jira/browse/HIVE-23232
 Project: Hive
  Issue Type: Bug
Reporter: Peter Varga
Assignee: Peter Varga


The test sometimes fails with an error in the TEZ environment; most likely the 
root cause is that two MiniHS2 instances are using the same TEZ environment in 
parallel. 

Sample Exception

tExecute expected null, but was:





Re: [VOTE] Apache Hive 2.3.7 Release Candidate 0

2020-04-17 Thread Peter Vary
+1 for the release.
- Downloaded the artifacts
- Verified the signatures
- Built the project (failed 2 times with parallel compilation - I was afraid 
that I had found a problem, but it finally compiled without parallel 
compilation :) )
- Ran some tests

> On Apr 16, 2020, at 22:33, Owen O'Malley  wrote:
> 
> I'm +1 for the release.
> 
>   - I checked the signature & sha.
>   - I built the project.
>   - I ran a handful of unit tests.
> 
> .. Owen
> 
> On Tue, Apr 7, 2020 at 8:34 PM Hyukjin Kwon  wrote:
> 
>> Thank you so much Alan for doing this.
>> 
>> On Wed, Apr 8, 2020 at 9:26 AM, Alan Gates wrote:
>> 
>>> Apache Hive 2.3.7 Release Candidate 0 is available here:
>>> https://people.apache.org/~gates/apache-hive-2.3.7-rc0/
>>> 
>>> Maven artifacts are available here:
>>> https://repository.apache.org/content/repositories/orgapachehive-1100/
>>> 
>>> The tag release-2.3.7-rc0 has been applied to the source for this
>>> release in GitHub, you can see it at
>>> https://github.com/apache/hive/tree/release-2.3.7-rc0
>>> 
>>> Voting will conclude in 72 hours (or whenever I scrounge together enough
>>> votes).
>>> 
>>> Hive PMC Members: Please test and vote.
>>> 
>>> Thanks.
>>> 
>>> 
>>> Alan.
>>> 
>>