Review Request 68496: Optimized & cleaned up HBaseQTest runner

2018-08-24 Thread denys kuzmenko via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68496/
---

Review request for hive, Marta Kuczora and Peter Vary.


Bugs: HIVE-20394
https://issues.apache.org/jira/browse/HIVE-20394


Repository: hive-git


Description
---

1. Set proper cluster destroy order
2. Propagated proper HBaseTestContext
3. Ported downstream fixes (CDH-63695)
4. General clean up


Diffs
-

  hbase-handler/src/test/queries/negative/cascade_dbdrop.q 
48be8cd07018d005a28ad8070e13c51a46cf6d06 
  hbase-handler/src/test/queries/positive/hbase_handler_snapshot.q 
e4290717a85a462a45bc14e6295fa7eccad8a51d 
  hbase-handler/src/test/results/negative/cascade_dbdrop.q.out 
803e35e40681df017a2f4e6d017b72749417d461 
  
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/TestLocationQueries.java
 31195c4523a35caf473eeecd8c8142cd1265ca18 
  
itests/util/src/main/java/org/apache/hadoop/hive/accumulo/AccumuloQTestUtil.java
 956478d778be03070b242ae45d340b5a99ac9316 
  
itests/util/src/main/java/org/apache/hadoop/hive/cli/control/AbstractCoreBlobstoreCliDriver.java
 3cf5ebb3df3455f9ebe701b79f5814e40f34ccc5 
  
itests/util/src/main/java/org/apache/hadoop/hive/cli/control/CoreCliDriver.java 
1ead1448d1e1a981068c4bdf5484346babb72ec0 
  
itests/util/src/main/java/org/apache/hadoop/hive/cli/control/CoreCompareCliDriver.java
 6b4c6c6a794da754bcaf8c4374fd9f85d51f318f 
  
itests/util/src/main/java/org/apache/hadoop/hive/cli/control/CoreHBaseCliDriver.java
 70cbf04823bb3d7208df245074eb9b732eadd18e 
  
itests/util/src/main/java/org/apache/hadoop/hive/cli/control/CoreHBaseNegativeCliDriver.java
 c76a70e72c13c2b42c32526398c639f041e3 
  
itests/util/src/main/java/org/apache/hadoop/hive/cli/control/CoreNegativeCliDriver.java
 07ae6ac206926c450c86b9519da469fc51f1d55d 
  
itests/util/src/main/java/org/apache/hadoop/hive/cli/control/CorePerfCliDriver.java
 55e744e0f3d7b266593f4c8fb82eb8539ac0a563 
  itests/util/src/main/java/org/apache/hadoop/hive/hbase/HBaseQTestUtil.java 
07df0c9d1ed7fa3bf28325a1c37acf043d8ca848 
  itests/util/src/main/java/org/apache/hadoop/hive/hbase/HBaseTestSetup.java 
7b203a928110101ba0de695acacadef1f1484b44 
  itests/util/src/main/java/org/apache/hadoop/hive/ql/QTestArguments.java 
PRE-CREATION 
  itests/util/src/main/java/org/apache/hadoop/hive/ql/QTestUtil.java 
5adbb63693b1f2fafa47939b2833310f5dd96bf2 
  
itests/util/src/main/java/org/apache/hadoop/hive/ql/parse/CoreParseNegative.java
 8f5744d2f1c8c059b4148aafde0558d76b723b65 


Diff: https://reviews.apache.org/r/68496/diff/1/


Testing
---

fixed existing tests


Thanks,

denys kuzmenko



Re: Review Request 68496: Optimized & cleaned up HBaseQTest runner

2018-08-24 Thread denys kuzmenko via Review Board


> On Aug. 24, 2018, 8:17 a.m., Peter Vary wrote:
> > itests/util/src/main/java/org/apache/hadoop/hive/cli/control/CoreHBaseCliDriver.java
> > Line 82 (original), 82 (patched)
> > 
> >
> > Since we removed clearTestSideEffects here, are we sure that 
> > clearPostTestEffects does the same removal?

This is how it was initially before the session lifecycle change (that is 
unrelated to cleanup, same way it's done downstream):
HIVE-19882: Fix QTestUtil session lifecycle (Zoltan Haindrich reviewed by Jason 
Dere) Zoltan Haindrich 2018. 06. 21. 6:15


> On Aug. 24, 2018, 8:17 a.m., Peter Vary wrote:
> > itests/util/src/main/java/org/apache/hadoop/hive/cli/control/CoreHBaseNegativeCliDriver.java
> > Line 78 (original), 77 (patched)
> > 
> >
> > Same as above. Does clearPostEffects is enough?

This is how it was initially before the session lifecycle change (that is 
unrelated to cleanup, same way it's done downstream):
HIVE-19882: Fix QTestUtil session lifecycle (Zoltan Haindrich reviewed by Jason 
Dere) Zoltan Haindrich 2018. 06. 21. 6:15


- denys


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68496/#review207858
---


On Aug. 24, 2018, 7:28 a.m., denys kuzmenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68496/
> ---
> 
> (Updated Aug. 24, 2018, 7:28 a.m.)
> 
> 
> Review request for hive, Marta Kuczora and Peter Vary.
> 
> 
> Bugs: HIVE-20394
> https://issues.apache.org/jira/browse/HIVE-20394
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> 1. Set proper cluster destroy order
> 2. Propagated proper HBaseTestContext
> 3. Ported downstream fixes (CDH-63695)
> 4. General clean up
> 
> 
> Diffs
> -
> 
>   hbase-handler/src/test/queries/negative/cascade_dbdrop.q 
> 48be8cd07018d005a28ad8070e13c51a46cf6d06 
>   hbase-handler/src/test/queries/positive/hbase_handler_snapshot.q 
> e4290717a85a462a45bc14e6295fa7eccad8a51d 
>   hbase-handler/src/test/results/negative/cascade_dbdrop.q.out 
> 803e35e40681df017a2f4e6d017b72749417d461 
>   
> itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/TestLocationQueries.java
>  31195c4523a35caf473eeecd8c8142cd1265ca18 
>   
> itests/util/src/main/java/org/apache/hadoop/hive/accumulo/AccumuloQTestUtil.java
>  956478d778be03070b242ae45d340b5a99ac9316 
>   
> itests/util/src/main/java/org/apache/hadoop/hive/cli/control/AbstractCoreBlobstoreCliDriver.java
>  3cf5ebb3df3455f9ebe701b79f5814e40f34ccc5 
>   
> itests/util/src/main/java/org/apache/hadoop/hive/cli/control/CoreCliDriver.java
>  1ead1448d1e1a981068c4bdf5484346babb72ec0 
>   
> itests/util/src/main/java/org/apache/hadoop/hive/cli/control/CoreCompareCliDriver.java
>  6b4c6c6a794da754bcaf8c4374fd9f85d51f318f 
>   
> itests/util/src/main/java/org/apache/hadoop/hive/cli/control/CoreHBaseCliDriver.java
>  70cbf04823bb3d7208df245074eb9b732eadd18e 
>   
> itests/util/src/main/java/org/apache/hadoop/hive/cli/control/CoreHBaseNegativeCliDriver.java
>  c76a70e72c13c2b42c32526398c639f041e3 
>   
> itests/util/src/main/java/org/apache/hadoop/hive/cli/control/CoreNegativeCliDriver.java
>  07ae6ac206926c450c86b9519da469fc51f1d55d 
>   
> itests/util/src/main/java/org/apache/hadoop/hive/cli/control/CorePerfCliDriver.java
>  55e744e0f3d7b266593f4c8fb82eb8539ac0a563 
>   itests/util/src/main/java/org/apache/hadoop/hive/hbase/HBaseQTestUtil.java 
> 07df0c9d1ed7fa3bf28325a1c37acf043d8ca848 
>   itests/util/src/main/java/org/apache/hadoop/hive/hbase/HBaseTestSetup.java 
> 7b203a928110101ba0de695acacadef1f1484b44 
>   itests/util/src/main/java/org/apache/hadoop/hive/ql/QTestArguments.java 
> PRE-CREATION 
>   itests/util/src/main/java/org/apache/hadoop/hive/ql/QTestUtil.java 
> 5adbb63693b1f2fafa47939b2833310f5dd96bf2 
>   
> itests/util/src/main/java/org/apache/hadoop/hive/ql/parse/CoreParseNegative.java
>  8f5744d2f1c8c059b4148aafde0558d76b723b65 
> 
> 
> Diff: https://reviews.apache.org/r/68496/diff/1/
> 
> 
> Testing
> ---
> 
> fixed existing tests
> 
> 
> Thanks,
> 
> denys kuzmenko
> 
>



Re: Review Request 68518: ProxyFileSystem.listStatusIterator function override required once migrated to Hadoop 3.2.0+

2018-08-28 Thread denys kuzmenko via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68518/
---

(Updated Aug. 28, 2018, 11:34 a.m.)


Review request for hive, Marta Kuczora and Peter Vary.


Bugs: HIVE-20465
https://issues.apache.org/jira/browse/HIVE-20465


Repository: hive-git


Description
---

Few list_bucket_query_oneskew_*.q tests started to fail. It is the side effect 
of this Hadoop change:
http://github.mtv.cloudera.com/CDH/hadoop/commit/073a38ee09f40b25677cc49eff777241c8fb2eba#diff-4bf68d76a459a69bbb0affbf579ebcf3

In Hive there is the ProxyFileSystem class which has the 'swizzleParamPath' 
method. This method will replace the 'pfile' prefix to 'file' in the path. The 
ProxyFileSystem will be used as file system during the test execution and this 
class implements some methods of the hadoop FileSystem class. Before the linked 
Hadoop patch, the file status are fetched with the 'listStatus' method call 
which is implemented in the ProxyFileSystem. But the patch changed this code 
path and now it calls the FileSystem.listStatusIterator method which is not 
implemented in the ProxyFileSystem, so it will fall back to the Hadoop 
implementation which will throw error for the 'pfile' prefix.


Diffs (updated)
-

  shims/common/src/main/java/org/apache/hadoop/fs/ProxyFileSystem.java 
608c7e0578 
  shims/common/src/main/test/org/apache/hadoop/fs/TestProxyFileSystem.java 
PRE-CREATION 


Diff: https://reviews.apache.org/r/68518/diff/2/

Changes: https://reviews.apache.org/r/68518/diff/1-2/


Testing
---


Thanks,

denys kuzmenko



Re: Review Request 68496: Optimized & cleaned up HBaseQTest runner

2018-08-28 Thread denys kuzmenko via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68496/
---

(Updated Aug. 28, 2018, 11:33 a.m.)


Review request for hive, Zoltan Haindrich, Zoltan Haindrich, Marta Kuczora, and 
Peter Vary.


Bugs: HIVE-20394
https://issues.apache.org/jira/browse/HIVE-20394


Repository: hive-git


Description
---

1. Set proper cluster destroy order
2. Propagated proper HBaseTestContext
3. Ported downstream fixes (CDH-63695)
4. General clean up


Diffs (updated)
-

  hbase-handler/src/test/queries/negative/cascade_dbdrop.q 48be8cd070 
  hbase-handler/src/test/queries/positive/hbase_handler_snapshot.q e4290717a8 
  hbase-handler/src/test/results/negative/cascade_dbdrop.q.out 803e35e406 
  
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/TestLocationQueries.java
 31195c4523 
  
itests/util/src/main/java/org/apache/hadoop/hive/accumulo/AccumuloQTestUtil.java
 956478d778 
  
itests/util/src/main/java/org/apache/hadoop/hive/cli/control/AbstractCoreBlobstoreCliDriver.java
 3cf5ebb3df 
  
itests/util/src/main/java/org/apache/hadoop/hive/cli/control/CoreCliDriver.java 
1ead1448d1 
  
itests/util/src/main/java/org/apache/hadoop/hive/cli/control/CoreCompareCliDriver.java
 6b4c6c6a79 
  
itests/util/src/main/java/org/apache/hadoop/hive/cli/control/CoreHBaseCliDriver.java
 70cbf04823 
  
itests/util/src/main/java/org/apache/hadoop/hive/cli/control/CoreHBaseNegativeCliDriver.java
 c76a70e7dd 
  
itests/util/src/main/java/org/apache/hadoop/hive/cli/control/CoreNegativeCliDriver.java
 07ae6ac206 
  
itests/util/src/main/java/org/apache/hadoop/hive/cli/control/CorePerfCliDriver.java
 55e744e0f3 
  itests/util/src/main/java/org/apache/hadoop/hive/hbase/HBaseQTestUtil.java 
07df0c9d1e 
  itests/util/src/main/java/org/apache/hadoop/hive/hbase/HBaseTestSetup.java 
7b203a9281 
  itests/util/src/main/java/org/apache/hadoop/hive/ql/QTestArguments.java 
PRE-CREATION 
  itests/util/src/main/java/org/apache/hadoop/hive/ql/QTestUtil.java 5adbb63693 
  
itests/util/src/main/java/org/apache/hadoop/hive/ql/parse/CoreParseNegative.java
 8f5744d2f1 


Diff: https://reviews.apache.org/r/68496/diff/2/

Changes: https://reviews.apache.org/r/68496/diff/1-2/


Testing
---

fixed existing tests


Thanks,

denys kuzmenko



Re: Review Request 68523: Improve org.apache.hadoop.hive.ql.exec.FunctionTask Experience

2018-08-28 Thread denys kuzmenko via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68523/
---

(Updated Aug. 28, 2018, 1:50 p.m.)


Review request for hive, Marta Kuczora, Peter Vary, and Adam Szita.


Bugs: HIVE-20466
https://issues.apache.org/jira/browse/HIVE-20466


Repository: hive-git


Description
---

When a create function statement is submitted, it may fail with the following 
error:

Error while processing statement: FAILED: Execution Error, return code 1 from 
org.apache.hadoop.hive.ql.exec.FunctionTask

This is not a user-friendly error message.


Diffs (updated)
-

  ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java 44591842bb 
  ql/src/java/org/apache/hadoop/hive/ql/exec/Registry.java 6f8a8f5504 
  ql/src/test/queries/clientnegative/create_unknown_permanent_udf.q 
PRE-CREATION 
  ql/src/test/results/clientnegative/create_function_nonexistent_class.q.out 
77467f66e3 
  ql/src/test/results/clientnegative/create_unknown_permanent_udf.q.out 
PRE-CREATION 


Diff: https://reviews.apache.org/r/68523/diff/2/

Changes: https://reviews.apache.org/r/68523/diff/1-2/


Testing
---

added negative qtest to cover this scenario


Thanks,

denys kuzmenko



Review Request 68518: ProxyFileSystem.listStatusIterator function override required once migrated to Hadoop 3.2.0+

2018-08-27 Thread denys kuzmenko via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68518/
---

Review request for hive, Marta Kuczora and Peter Vary.


Bugs: HIVE-20465
https://issues.apache.org/jira/browse/HIVE-20465


Repository: hive-git


Description
---

Few list_bucket_query_oneskew_*.q tests started to fail. It is the side effect 
of this Hadoop change:
http://github.mtv.cloudera.com/CDH/hadoop/commit/073a38ee09f40b25677cc49eff777241c8fb2eba#diff-4bf68d76a459a69bbb0affbf579ebcf3

In Hive there is the ProxyFileSystem class which has the 'swizzleParamPath' 
method. This method will replace the 'pfile' prefix to 'file' in the path. The 
ProxyFileSystem will be used as file system during the test execution and this 
class implements some methods of the hadoop FileSystem class. Before the linked 
Hadoop patch, the file status are fetched with the 'listStatus' method call 
which is implemented in the ProxyFileSystem. But the patch changed this code 
path and now it calls the FileSystem.listStatusIterator method which is not 
implemented in the ProxyFileSystem, so it will fall back to the Hadoop 
implementation which will throw error for the 'pfile' prefix.


Diffs
-

  shims/common/src/main/java/org/apache/hadoop/fs/ProxyFileSystem.java 
a82b2f0564 


Diff: https://reviews.apache.org/r/68518/diff/1/


Testing
---


Thanks,

denys kuzmenko



Review Request 68523: Improve org.apache.hadoop.hive.ql.exec.FunctionTask Experience

2018-08-27 Thread denys kuzmenko via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68523/
---

Review request for hive, Marta Kuczora and Peter Vary.


Bugs: HIVE-20466
https://issues.apache.org/jira/browse/HIVE-20466


Repository: hive-git


Description
---

When a create function statement is submitted, it may fail with the following 
error:

Error while processing statement: FAILED: Execution Error, return code 1 from 
org.apache.hadoop.hive.ql.exec.FunctionTask

This is not a user-friendly error message.


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java 44591842bb 
  ql/src/java/org/apache/hadoop/hive/ql/exec/Registry.java 6f8a8f5504 
  ql/src/test/queries/clientnegative/create_unknown_permanent_udf.q 
PRE-CREATION 
  ql/src/test/results/clientnegative/create_unknown_permanent_udf.q.out 
PRE-CREATION 


Diff: https://reviews.apache.org/r/68523/diff/1/


Testing
---

added negative qtest to cover this scenario


Thanks,

denys kuzmenko



Review Request 68683: Add new configuration to set the size of the global compile lock

2018-09-11 Thread denys kuzmenko via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68683/
---

Review request for hive, Zoltan Haindrich, Zoltan Haindrich, and Peter Vary.


Bugs: HIVE-20535
https://issues.apache.org/jira/browse/HIVE-20535


Repository: hive-git


Description
---

When removing the compile lock, it is quite risky to remove it entirely.

It would be good to provide a pool size for the concurrent compilation, so the 
administrator can limit the load


Diffs
-

  
common/src/java/org/apache/hadoop/hive/common/metrics/common/MetricsConstant.java
 af0f87bac3 
  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 8c39de3e77 
  ql/src/java/org/apache/hadoop/hive/ql/Driver.java 737debd2ad 
  ql/src/test/org/apache/hadoop/hive/ql/CompileLockTest.java PRE-CREATION 


Diff: https://reviews.apache.org/r/68683/diff/1/


Testing
---

Added CompileLockTest


Thanks,

denys kuzmenko



Re: Review Request 69022: HIVE-20737: Local SparkContext is shared between user sessions and should be closed only when there is no active

2018-10-15 Thread denys kuzmenko via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69022/
---

(Updated Oct. 15, 2018, 7:21 p.m.)


Review request for hive, Sahil Takiar and Adam Szita.


Changes
---

updated the patch file.


Summary (updated)
-

HIVE-20737: Local SparkContext is shared between user sessions and should be 
closed only when there is no active


Bugs: HIVE-20737
https://issues.apache.org/jira/browse/HIVE-20737


Repository: hive-git


Description (updated)
---

1. Local SparkContext is shared between user sessions and should be closed only 
when there is no active. 
2. Possible race condition in SparkSession.open() in case when user queries run 
in parallel within the same session.


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/LocalHiveSparkClient.java 
72ff53e3bd 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/spark/session/SparkSessionImpl.java 
bb50129518 
  
ql/src/test/org/apache/hadoop/hive/ql/exec/spark/TestLocalHiveSparkClient.java 
PRE-CREATION 


Diff: https://reviews.apache.org/r/69022/diff/2/


Testing
---

Added TestLocalHiveSparkClient test


File Attachments


HIVE-20737.7.patch
  
https://reviews.apache.org/media/uploaded/files/2018/10/15/9cf8a2b3-9ec1-4316-81d0-3cd124b1a9fd__HIVE-20737.7.patch


Thanks,

denys kuzmenko



Review Request 69022: HIVE-20737: LocalHiveSparkClient.close() and SparkSession.open() race condition fix

2018-10-15 Thread denys kuzmenko via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69022/
---

Review request for hive, Sahil Takiar and Adam Szita.


Bugs: HIVE-20737
https://issues.apache.org/jira/browse/HIVE-20737


Repository: hive-git


Description
---

SparkContext is shared between user sessions and should be closed only when 
there is no active one.
Possible race condition in SparkSession.open()


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/LocalHiveSparkClient.java 
72ff53e3bd 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/spark/session/SparkSessionImpl.java 
bb50129518 
  
ql/src/test/org/apache/hadoop/hive/ql/exec/spark/TestLocalHiveSparkClient.java 
PRE-CREATION 


Diff: https://reviews.apache.org/r/69022/diff/1/


Testing
---

Added TestLocalHiveSparkClient test


Thanks,

denys kuzmenko



Re: Review Request 69022: HIVE-20737: Local SparkContext is shared between user sessions and should be closed only when there is no active

2018-10-18 Thread denys kuzmenko via Review Board


> On Oct. 16, 2018, 1:47 p.m., Sahil Takiar wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/spark/session/SparkSessionImpl.java
> > Line 125 (original), 120 (patched)
> > 
> >
> > we might need to re-think how we are synchronizing this method a bit. I 
> > think we want to support the use case where we call `close()` while 
> > `open()` is being run. The offers a way for the user to cancel a session 
> > while it is being opened, which can be useful if opening a session takes a 
> > long time, which can happen on a busy cluster where there aren't enough 
> > resources to open a session.
> > 
> > fixing that might be out of the scope of this JIRA, so I would 
> > recommend using a separate lock to guard against multiple users calling 
> > open on the same session.
> 
> Sahil Takiar wrote:
> Tracking the aformentioned fix in HIVE-20519, unless you want to fix it 
> in this patch.
> 
> denys kuzmenko wrote:
> i think it should be addressed in another JIRA, right now we need to have 
> working at least basic use-case
> 
> Sahil Takiar wrote:
> okay, still recommend using a separate lock
> 
> denys kuzmenko wrote:
> open() and close() both manipulate with the shared variable (isOpen), so 
> they have to be synchronized on a same monitor (at least in current approach).
> I am not sure whether SparkContext supports instant interruption 
> (Thread.interrupt or sc.stop()). However when closing session that is in 
> progress, you have to take care of SparkContext.
> 
> Sahil Takiar wrote:
> yes, but we need to support calling `close()` while `open()` is being 
> run, as I described in my first comment. the (2) bullet point in the RB 
> description states that you want to guard against multiple callers invoking 
> the `open()` method, so logically you should just have a single lock to 
> handle this behavior. i suggest you use a separate lock for this scenario 
> because the `closeLock` is meant to handle synchronization of the `close()` 
> method, it was not meant to handle synchronization of the `open()` method by 
> multiple callers. IMO by re-using the lock we make the code harder to 
> understand because we are using the `closeLock` for functionality that should 
> be outside its scope.
> 
> I don't feel strongly about this though given we will need to re-factor 
> this code later anyway to support calling `close()` while `open()` is 
> running, so I'll leave it up to you.

Let's have a seperate JIRA for this use-case.


- denys


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69022/#review209617
---


On Oct. 16, 2018, 4:49 p.m., denys kuzmenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69022/
> ---
> 
> (Updated Oct. 16, 2018, 4:49 p.m.)
> 
> 
> Review request for hive, Sahil Takiar and Adam Szita.
> 
> 
> Bugs: HIVE-20737
> https://issues.apache.org/jira/browse/HIVE-20737
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> 1. Local SparkContext is shared between user sessions and should be closed 
> only when there is no active. 
> 2. Possible race condition in SparkSession.open() in case when user queries 
> run in parallel within the same session.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/LocalHiveSparkClient.java 
> 72ff53e3bd 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/spark/session/SparkSessionImpl.java
>  bb50129518 
>   
> ql/src/test/org/apache/hadoop/hive/ql/exec/spark/TestLocalHiveSparkClient.java
>  PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/69022/diff/3/
> 
> 
> Testing
> ---
> 
> Added TestLocalHiveSparkClient test
> 
> 
> File Attachments
> 
> 
> HIVE-20737.7.patch
>   
> https://reviews.apache.org/media/uploaded/files/2018/10/15/9cf8a2b3-9ec1-4316-81d0-3cd124b1a9fd__HIVE-20737.7.patch
> 
> 
> Thanks,
> 
> denys kuzmenko
> 
>



Re: Review Request 69022: HIVE-20737: Local SparkContext is shared between user sessions and should be closed only when there is no active

2018-10-16 Thread denys kuzmenko via Review Board


> On Oct. 16, 2018, 1:47 p.m., Sahil Takiar wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/spark/LocalHiveSparkClient.java
> > Line 73 (original), 73 (patched)
> > 
> >
> > if we expect multiple sessions to access this, should we make this 
> > `volatile`?
> 
> denys kuzmenko wrote:
> it's being accesed only inside of the critical section (within the lock 
> boundaries)
> 
> Sahil Takiar wrote:
> does java guarantee that non-volatile variables accessed inside a 
> critical section are not cached locally by a CPU?

In short - yes.

JSR 133 (Java Memory Model)

Synchronization ensures that memory writes by a thread before or during a 
synchronized block are made visible in a predictable manner to other threads 
which synchronize on the same monitor. After we exit a synchronized block, we 
release the monitor, which has the effect of flushing the cache to main memory, 
so that writes made by this thread can be visible to other threads. Before we 
can enter a synchronized block, we acquire the monitor, which has the effect of 
invalidating the local processor cache so that variables will be reloaded from 
main memory. We will then be able to see all of the writes made visible by the 
previous release.


- denys


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69022/#review209617
---


On Oct. 16, 2018, 4:49 p.m., denys kuzmenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69022/
> ---
> 
> (Updated Oct. 16, 2018, 4:49 p.m.)
> 
> 
> Review request for hive, Sahil Takiar and Adam Szita.
> 
> 
> Bugs: HIVE-20737
> https://issues.apache.org/jira/browse/HIVE-20737
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> 1. Local SparkContext is shared between user sessions and should be closed 
> only when there is no active. 
> 2. Possible race condition in SparkSession.open() in case when user queries 
> run in parallel within the same session.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/LocalHiveSparkClient.java 
> 72ff53e3bd 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/spark/session/SparkSessionImpl.java
>  bb50129518 
>   
> ql/src/test/org/apache/hadoop/hive/ql/exec/spark/TestLocalHiveSparkClient.java
>  PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/69022/diff/3/
> 
> 
> Testing
> ---
> 
> Added TestLocalHiveSparkClient test
> 
> 
> File Attachments
> 
> 
> HIVE-20737.7.patch
>   
> https://reviews.apache.org/media/uploaded/files/2018/10/15/9cf8a2b3-9ec1-4316-81d0-3cd124b1a9fd__HIVE-20737.7.patch
> 
> 
> Thanks,
> 
> denys kuzmenko
> 
>



Re: Review Request 69022: HIVE-20737: Local SparkContext is shared between user sessions and should be closed only when there is no active

2018-10-16 Thread denys kuzmenko via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69022/
---

(Updated Oct. 16, 2018, 4:49 p.m.)


Review request for hive, Sahil Takiar and Adam Szita.


Bugs: HIVE-20737
https://issues.apache.org/jira/browse/HIVE-20737


Repository: hive-git


Description
---

1. Local SparkContext is shared between user sessions and should be closed only 
when there is no active. 
2. Possible race condition in SparkSession.open() in case when user queries run 
in parallel within the same session.


Diffs (updated)
-

  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/LocalHiveSparkClient.java 
72ff53e3bd 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/spark/session/SparkSessionImpl.java 
bb50129518 
  
ql/src/test/org/apache/hadoop/hive/ql/exec/spark/TestLocalHiveSparkClient.java 
PRE-CREATION 


Diff: https://reviews.apache.org/r/69022/diff/3/

Changes: https://reviews.apache.org/r/69022/diff/2-3/


Testing
---

Added TestLocalHiveSparkClient test


File Attachments


HIVE-20737.7.patch
  
https://reviews.apache.org/media/uploaded/files/2018/10/15/9cf8a2b3-9ec1-4316-81d0-3cd124b1a9fd__HIVE-20737.7.patch


Thanks,

denys kuzmenko



Re: Review Request 69022: HIVE-20737: Local SparkContext is shared between user sessions and should be closed only when there is no active

2018-10-16 Thread denys kuzmenko via Review Board


> On Oct. 16, 2018, 1:47 p.m., Sahil Takiar wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/spark/session/SparkSessionImpl.java
> > Line 352 (original)
> > 
> >
> > why remove this?
> 
> denys kuzmenko wrote:
> it's not required. close() method is covered with the lock, and 
> activeJobs is a concurrent collection
> 
> Sahil Takiar wrote:
> what happens if a job is submitted after `hasTimedOut` returns true?
> 
> denys kuzmenko wrote:
> I see. However existing lock won't help as it doesn't prevent other 
> threads from adding new queries. 
> 
> public void onQuerySubmission(String queryId) {
> activeJobs.add(queryId);
> }
> 
> we might need to cover this with separate lock (onQueryCompletion, 
> onQuerySubmission, triggerTimeout)
> What do you think?
> 
> denys kuzmenko wrote:
> what happens if a job is submitted after hasTimedOut returns true? 
> current Session will be closed and a new one will be opened.
> 
> denys kuzmenko wrote:
> there might be an issue when 2nd session checkes state and get isOpen and 
> before it's reaching the submit phase, 1st one calls the close.
> I think we need synchronization for active sessions manipulation.

fixed. reverted back to the original locking. Above tricky case will be handled 
by preventing new queries to execute open() before close() is complete.


- denys


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69022/#review209617
---


On Oct. 15, 2018, 7:21 p.m., denys kuzmenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69022/
> ---
> 
> (Updated Oct. 15, 2018, 7:21 p.m.)
> 
> 
> Review request for hive, Sahil Takiar and Adam Szita.
> 
> 
> Bugs: HIVE-20737
> https://issues.apache.org/jira/browse/HIVE-20737
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> 1. Local SparkContext is shared between user sessions and should be closed 
> only when there is no active. 
> 2. Possible race condition in SparkSession.open() in case when user queries 
> run in parallel within the same session.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/LocalHiveSparkClient.java 
> 72ff53e3bd 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/spark/session/SparkSessionImpl.java
>  bb50129518 
>   
> ql/src/test/org/apache/hadoop/hive/ql/exec/spark/TestLocalHiveSparkClient.java
>  PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/69022/diff/2/
> 
> 
> Testing
> ---
> 
> Added TestLocalHiveSparkClient test
> 
> 
> File Attachments
> 
> 
> HIVE-20737.7.patch
>   
> https://reviews.apache.org/media/uploaded/files/2018/10/15/9cf8a2b3-9ec1-4316-81d0-3cd124b1a9fd__HIVE-20737.7.patch
> 
> 
> Thanks,
> 
> denys kuzmenko
> 
>



Re: Review Request 69022: HIVE-20737: Local SparkContext is shared between user sessions and should be closed only when there is no active

2018-10-16 Thread denys kuzmenko via Review Board


> On Oct. 16, 2018, 1:47 p.m., Sahil Takiar wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/spark/session/SparkSessionImpl.java
> > Line 352 (original)
> > 
> >
> > why remove this?
> 
> denys kuzmenko wrote:
> it's not required. close() method is covered with the lock, and 
> activeJobs is a concurrent collection
> 
> Sahil Takiar wrote:
> what happens if a job is submitted after `hasTimedOut` returns true?
> 
> denys kuzmenko wrote:
> I see. However existing lock won't help as it doesn't prevent other 
> threads from adding new queries. 
> 
> public void onQuerySubmission(String queryId) {
> activeJobs.add(queryId);
> }
> 
> we might need to cover this with separate lock (onQueryCompletion, 
> onQuerySubmission, triggerTimeout)
> What do you think?
> 
> denys kuzmenko wrote:
> what happens if a job is submitted after hasTimedOut returns true? 
> current Session will be closed and a new one will be opened.
> 
> denys kuzmenko wrote:
> there might be an issue when 2nd session checkes state and get isOpen and 
> before it's reaching the submit phase, 1st one calls the close.
> I think we need synchronization for active sessions manipulation.
> 
> denys kuzmenko wrote:
> fixed. reverted back to the original locking. Above tricky case will be 
> handled by preventing new queries to execute open() before close() is 
> complete.

@before triggerTimeout() is complete


- denys


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69022/#review209617
---


On Oct. 16, 2018, 4:49 p.m., denys kuzmenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69022/
> ---
> 
> (Updated Oct. 16, 2018, 4:49 p.m.)
> 
> 
> Review request for hive, Sahil Takiar and Adam Szita.
> 
> 
> Bugs: HIVE-20737
> https://issues.apache.org/jira/browse/HIVE-20737
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> 1. Local SparkContext is shared between user sessions and should be closed 
> only when there is no active. 
> 2. Possible race condition in SparkSession.open() in case when user queries 
> run in parallel within the same session.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/LocalHiveSparkClient.java 
> 72ff53e3bd 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/spark/session/SparkSessionImpl.java
>  bb50129518 
>   
> ql/src/test/org/apache/hadoop/hive/ql/exec/spark/TestLocalHiveSparkClient.java
>  PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/69022/diff/3/
> 
> 
> Testing
> ---
> 
> Added TestLocalHiveSparkClient test
> 
> 
> File Attachments
> 
> 
> HIVE-20737.7.patch
>   
> https://reviews.apache.org/media/uploaded/files/2018/10/15/9cf8a2b3-9ec1-4316-81d0-3cd124b1a9fd__HIVE-20737.7.patch
> 
> 
> Thanks,
> 
> denys kuzmenko
> 
>



Re: Review Request 69022: HIVE-20737: Local SparkContext is shared between user sessions and should be closed only when there is no active

2018-10-16 Thread denys kuzmenko via Review Board


> On Oct. 16, 2018, 1:47 p.m., Sahil Takiar wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/spark/session/SparkSessionImpl.java
> > Line 125 (original), 120 (patched)
> > 
> >
> > we might need to re-think how we are synchronizing this method a bit. I 
> > think we want to support the use case where we call `close()` while 
> > `open()` is being run. The offers a way for the user to cancel a session 
> > while it is being opened, which can be useful if opening a session takes a 
> > long time, which can happen on a busy cluster where there aren't enough 
> > resources to open a session.
> > 
> > fixing that might be out of the scope of this JIRA, so I would 
> > recommend using a separate lock to guard against multiple users calling 
> > open on the same session.
> 
> Sahil Takiar wrote:
> Tracking the aformentioned fix in HIVE-20519, unless you want to fix it 
> in this patch.
> 
> denys kuzmenko wrote:
> i think it should be addressed in another JIRA, right now we need to have 
> working at least basic use-case
> 
> Sahil Takiar wrote:
> okay, still recommend using a separate lock

open() and close() both manipulate with the shared variable (isOpen), so they 
have to be synchronized on a same monitor (at least in current approach).
I am not sure whether SparkContext supports instant interruption 
(Thread.interrupt or sc.stop()). However when closing session that is in 
progress, you have to take care of SparkContext.


- denys


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69022/#review209617
---


On Oct. 16, 2018, 4:49 p.m., denys kuzmenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69022/
> ---
> 
> (Updated Oct. 16, 2018, 4:49 p.m.)
> 
> 
> Review request for hive, Sahil Takiar and Adam Szita.
> 
> 
> Bugs: HIVE-20737
> https://issues.apache.org/jira/browse/HIVE-20737
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> 1. Local SparkContext is shared between user sessions and should be closed 
> only when there is no active. 
> 2. Possible race condition in SparkSession.open() in case when user queries 
> run in parallel within the same session.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/LocalHiveSparkClient.java 
> 72ff53e3bd 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/spark/session/SparkSessionImpl.java
>  bb50129518 
>   
> ql/src/test/org/apache/hadoop/hive/ql/exec/spark/TestLocalHiveSparkClient.java
>  PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/69022/diff/3/
> 
> 
> Testing
> ---
> 
> Added TestLocalHiveSparkClient test
> 
> 
> File Attachments
> 
> 
> HIVE-20737.7.patch
>   
> https://reviews.apache.org/media/uploaded/files/2018/10/15/9cf8a2b3-9ec1-4316-81d0-3cd124b1a9fd__HIVE-20737.7.patch
> 
> 
> Thanks,
> 
> denys kuzmenko
> 
>



Re: Review Request 69022: HIVE-20737: SparkContext is shared between user sessions and should be closed only when there is no active one

2018-10-15 Thread denys kuzmenko via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69022/
---

(Updated Oct. 15, 2018, 4:54 p.m.)


Review request for hive, Sahil Takiar and Adam Szita.


Summary (updated)
-

HIVE-20737: SparkContext is shared between user sessions and should be closed 
only when there is no active one


Bugs: HIVE-20737
https://issues.apache.org/jira/browse/HIVE-20737


Repository: hive-git


Description (updated)
---

1. SparkContext is shared between user sessions and should be closed only when 
there is no active one. 
2. Possible race condition in SparkSession.open() in case of user queries run 
in parallel within a single session.


Diffs (updated)
-

  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/LocalHiveSparkClient.java 
72ff53e3bd 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/spark/session/SparkSessionImpl.java 
bb50129518 
  
ql/src/test/org/apache/hadoop/hive/ql/exec/spark/TestLocalHiveSparkClient.java 
PRE-CREATION 


Diff: https://reviews.apache.org/r/69022/diff/2/

Changes: https://reviews.apache.org/r/69022/diff/1-2/


Testing
---

Added TestLocalHiveSparkClient test


File Attachments (updated)


HIVE-20737.7.patch
  
https://reviews.apache.org/media/uploaded/files/2018/10/15/9cf8a2b3-9ec1-4316-81d0-3cd124b1a9fd__HIVE-20737.7.patch


Thanks,

denys kuzmenko



Re: Review Request 69022: HIVE-20737: Local SparkContext is shared between user sessions and should be closed only when there is no active

2018-10-16 Thread denys kuzmenko via Review Board


> On Oct. 16, 2018, 1:47 p.m., Sahil Takiar wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/spark/LocalHiveSparkClient.java
> > Line 73 (original), 73 (patched)
> > 
> >
> > if we expect multiple sessions to access this, should we make this 
> > `volatile`?

it's being accesed only inside of the critical section (within the lock 
boundaries)


> On Oct. 16, 2018, 1:47 p.m., Sahil Takiar wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/spark/LocalHiveSparkClient.java
> > Lines 74 (patched)
> > 
> >
> > should probably make this `volatile` in case multiple threads try to 
> > get an instance

it's being accesed only inside of the critical section (within the lock 
boundaries)


> On Oct. 16, 2018, 1:47 p.m., Sahil Takiar wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/spark/session/SparkSessionImpl.java
> > Lines 112-116 (original)
> > 
> >
> > do we have unit tests that cover this?

queryCompleted and (lastSparkJobCompletionTime = 0) are complementary 
conditions that are checked and set at the same place


> On Oct. 16, 2018, 1:47 p.m., Sahil Takiar wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/spark/session/SparkSessionImpl.java
> > Line 125 (original), 120 (patched)
> > 
> >
> > we might need to re-think how we are synchronizing this method a bit. I 
> > think we want to support the use case where we call `close()` while 
> > `open()` is being run. The offers a way for the user to cancel a session 
> > while it is being opened, which can be useful if opening a session takes a 
> > long time, which can happen on a busy cluster where there aren't enough 
> > resources to open a session.
> > 
> > fixing that might be out of the scope of this JIRA, so I would 
> > recommend using a separate lock to guard against multiple users calling 
> > open on the same session.
> 
> Sahil Takiar wrote:
> Tracking the aformentioned fix in HIVE-20519, unless you want to fix it 
> in this patch.

i think it should be addressed in another JIRA, right now we need to have 
working at least basic use-case


> On Oct. 16, 2018, 1:47 p.m., Sahil Takiar wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/spark/session/SparkSessionImpl.java
> > Line 352 (original)
> > 
> >
> > why remove this?

it's not required. close() method is covered with the lock, and activeJobs is a 
concurrent collection


- denys


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69022/#review209617
---


On Oct. 15, 2018, 7:21 p.m., denys kuzmenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69022/
> ---
> 
> (Updated Oct. 15, 2018, 7:21 p.m.)
> 
> 
> Review request for hive, Sahil Takiar and Adam Szita.
> 
> 
> Bugs: HIVE-20737
> https://issues.apache.org/jira/browse/HIVE-20737
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> 1. Local SparkContext is shared between user sessions and should be closed 
> only when there is no active. 
> 2. Possible race condition in SparkSession.open() in case when user queries 
> run in parallel within the same session.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/LocalHiveSparkClient.java 
> 72ff53e3bd 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/spark/session/SparkSessionImpl.java
>  bb50129518 
>   
> ql/src/test/org/apache/hadoop/hive/ql/exec/spark/TestLocalHiveSparkClient.java
>  PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/69022/diff/2/
> 
> 
> Testing
> ---
> 
> Added TestLocalHiveSparkClient test
> 
> 
> File Attachments
> 
> 
> HIVE-20737.7.patch
>   
> https://reviews.apache.org/media/uploaded/files/2018/10/15/9cf8a2b3-9ec1-4316-81d0-3cd124b1a9fd__HIVE-20737.7.patch
> 
> 
> Thanks,
> 
> denys kuzmenko
> 
>



Re: Review Request 69022: HIVE-20737: Local SparkContext is shared between user sessions and should be closed only when there is no active

2018-10-16 Thread denys kuzmenko via Review Board


> On Oct. 16, 2018, 1:47 p.m., Sahil Takiar wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/spark/session/SparkSessionImpl.java
> > Lines 112-116 (original)
> > 
> >
> > do we have unit tests that cover this?
> 
> denys kuzmenko wrote:
> queryCompleted and (lastSparkJobCompletionTime = 0) are complementary 
> conditions that are checked and set at the same place
> 
> denys kuzmenko wrote:
> queryCompleted and (lastSparkJobCompletionTime > 0)

we do have bunch of tests (TestSparkSession*, TestLocalSparkClient) that are 
covering this


- denys


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69022/#review209617
---


On Oct. 15, 2018, 7:21 p.m., denys kuzmenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69022/
> ---
> 
> (Updated Oct. 15, 2018, 7:21 p.m.)
> 
> 
> Review request for hive, Sahil Takiar and Adam Szita.
> 
> 
> Bugs: HIVE-20737
> https://issues.apache.org/jira/browse/HIVE-20737
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> 1. Local SparkContext is shared between user sessions and should be closed 
> only when there is no active. 
> 2. Possible race condition in SparkSession.open() in case when user queries 
> run in parallel within the same session.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/LocalHiveSparkClient.java 
> 72ff53e3bd 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/spark/session/SparkSessionImpl.java
>  bb50129518 
>   
> ql/src/test/org/apache/hadoop/hive/ql/exec/spark/TestLocalHiveSparkClient.java
>  PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/69022/diff/2/
> 
> 
> Testing
> ---
> 
> Added TestLocalHiveSparkClient test
> 
> 
> File Attachments
> 
> 
> HIVE-20737.7.patch
>   
> https://reviews.apache.org/media/uploaded/files/2018/10/15/9cf8a2b3-9ec1-4316-81d0-3cd124b1a9fd__HIVE-20737.7.patch
> 
> 
> Thanks,
> 
> denys kuzmenko
> 
>



Re: Review Request 69022: HIVE-20737: Local SparkContext is shared between user sessions and should be closed only when there is no active

2018-10-16 Thread denys kuzmenko via Review Board


> On Oct. 16, 2018, 1:47 p.m., Sahil Takiar wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/spark/session/SparkSessionImpl.java
> > Lines 112-116 (original)
> > 
> >
> > do we have unit tests that cover this?
> 
> denys kuzmenko wrote:
> queryCompleted and (lastSparkJobCompletionTime = 0) are complementary 
> conditions that are checked and set at the same place

queryCompleted and (lastSparkJobCompletionTime > 0)


- denys


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69022/#review209617
---


On Oct. 15, 2018, 7:21 p.m., denys kuzmenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69022/
> ---
> 
> (Updated Oct. 15, 2018, 7:21 p.m.)
> 
> 
> Review request for hive, Sahil Takiar and Adam Szita.
> 
> 
> Bugs: HIVE-20737
> https://issues.apache.org/jira/browse/HIVE-20737
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> 1. Local SparkContext is shared between user sessions and should be closed 
> only when there is no active. 
> 2. Possible race condition in SparkSession.open() in case when user queries 
> run in parallel within the same session.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/LocalHiveSparkClient.java 
> 72ff53e3bd 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/spark/session/SparkSessionImpl.java
>  bb50129518 
>   
> ql/src/test/org/apache/hadoop/hive/ql/exec/spark/TestLocalHiveSparkClient.java
>  PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/69022/diff/2/
> 
> 
> Testing
> ---
> 
> Added TestLocalHiveSparkClient test
> 
> 
> File Attachments
> 
> 
> HIVE-20737.7.patch
>   
> https://reviews.apache.org/media/uploaded/files/2018/10/15/9cf8a2b3-9ec1-4316-81d0-3cd124b1a9fd__HIVE-20737.7.patch
> 
> 
> Thanks,
> 
> denys kuzmenko
> 
>



Re: Review Request 69022: HIVE-20737: Local SparkContext is shared between user sessions and should be closed only when there is no active

2018-10-16 Thread denys kuzmenko via Review Board


> On Oct. 16, 2018, 1:47 p.m., Sahil Takiar wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/spark/session/SparkSessionImpl.java
> > Line 352 (original)
> > 
> >
> > why remove this?
> 
> denys kuzmenko wrote:
> it's not required. close() method is covered with the lock, and 
> activeJobs is a concurrent collection
> 
> Sahil Takiar wrote:
> what happens if a job is submitted after `hasTimedOut` returns true?

I see. However existing lock won't help as it doesn't prevent other threads 
from adding new queries. 

public void onQuerySubmission(String queryId) {
activeJobs.add(queryId);
}

we might need to cover this with separate lock (onQueryCompletion, 
onQuerySubmission, triggerTimeout)
What do you think?


- denys


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69022/#review209617
---


On Oct. 15, 2018, 7:21 p.m., denys kuzmenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69022/
> ---
> 
> (Updated Oct. 15, 2018, 7:21 p.m.)
> 
> 
> Review request for hive, Sahil Takiar and Adam Szita.
> 
> 
> Bugs: HIVE-20737
> https://issues.apache.org/jira/browse/HIVE-20737
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> 1. Local SparkContext is shared between user sessions and should be closed 
> only when there is no active. 
> 2. Possible race condition in SparkSession.open() in case when user queries 
> run in parallel within the same session.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/LocalHiveSparkClient.java 
> 72ff53e3bd 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/spark/session/SparkSessionImpl.java
>  bb50129518 
>   
> ql/src/test/org/apache/hadoop/hive/ql/exec/spark/TestLocalHiveSparkClient.java
>  PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/69022/diff/2/
> 
> 
> Testing
> ---
> 
> Added TestLocalHiveSparkClient test
> 
> 
> File Attachments
> 
> 
> HIVE-20737.7.patch
>   
> https://reviews.apache.org/media/uploaded/files/2018/10/15/9cf8a2b3-9ec1-4316-81d0-3cd124b1a9fd__HIVE-20737.7.patch
> 
> 
> Thanks,
> 
> denys kuzmenko
> 
>



Re: Review Request 69022: HIVE-20737: Local SparkContext is shared between user sessions and should be closed only when there is no active

2018-10-16 Thread denys kuzmenko via Review Board


> On Oct. 16, 2018, 1:47 p.m., Sahil Takiar wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/spark/session/SparkSessionImpl.java
> > Line 352 (original)
> > 
> >
> > why remove this?
> 
> denys kuzmenko wrote:
> it's not required. close() method is covered with the lock, and 
> activeJobs is a concurrent collection
> 
> Sahil Takiar wrote:
> what happens if a job is submitted after `hasTimedOut` returns true?
> 
> denys kuzmenko wrote:
> I see. However existing lock won't help as it doesn't prevent other 
> threads from adding new queries. 
> 
> public void onQuerySubmission(String queryId) {
> activeJobs.add(queryId);
> }
> 
> we might need to cover this with separate lock (onQueryCompletion, 
> onQuerySubmission, triggerTimeout)
> What do you think?

what happens if a job is submitted after hasTimedOut returns true? current 
Session will be closed and a new one will be opened.


- denys


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69022/#review209617
---


On Oct. 15, 2018, 7:21 p.m., denys kuzmenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69022/
> ---
> 
> (Updated Oct. 15, 2018, 7:21 p.m.)
> 
> 
> Review request for hive, Sahil Takiar and Adam Szita.
> 
> 
> Bugs: HIVE-20737
> https://issues.apache.org/jira/browse/HIVE-20737
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> 1. Local SparkContext is shared between user sessions and should be closed 
> only when there is no active. 
> 2. Possible race condition in SparkSession.open() in case when user queries 
> run in parallel within the same session.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/LocalHiveSparkClient.java 
> 72ff53e3bd 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/spark/session/SparkSessionImpl.java
>  bb50129518 
>   
> ql/src/test/org/apache/hadoop/hive/ql/exec/spark/TestLocalHiveSparkClient.java
>  PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/69022/diff/2/
> 
> 
> Testing
> ---
> 
> Added TestLocalHiveSparkClient test
> 
> 
> File Attachments
> 
> 
> HIVE-20737.7.patch
>   
> https://reviews.apache.org/media/uploaded/files/2018/10/15/9cf8a2b3-9ec1-4316-81d0-3cd124b1a9fd__HIVE-20737.7.patch
> 
> 
> Thanks,
> 
> denys kuzmenko
> 
>



Re: Review Request 69022: HIVE-20737: Local SparkContext is shared between user sessions and should be closed only when there is no active

2018-10-16 Thread denys kuzmenko via Review Board


> On Oct. 16, 2018, 1:47 p.m., Sahil Takiar wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/spark/session/SparkSessionImpl.java
> > Line 352 (original)
> > 
> >
> > why remove this?
> 
> denys kuzmenko wrote:
> it's not required. close() method is covered with the lock, and 
> activeJobs is a concurrent collection
> 
> Sahil Takiar wrote:
> what happens if a job is submitted after `hasTimedOut` returns true?
> 
> denys kuzmenko wrote:
> I see. However existing lock won't help as it doesn't prevent other 
> threads from adding new queries. 
> 
> public void onQuerySubmission(String queryId) {
> activeJobs.add(queryId);
> }
> 
> we might need to cover this with separate lock (onQueryCompletion, 
> onQuerySubmission, triggerTimeout)
> What do you think?
> 
> denys kuzmenko wrote:
> what happens if a job is submitted after hasTimedOut returns true? 
> current Session will be closed and a new one will be opened.

there might be an issue when 2nd session checkes state and get isOpen and 
before it's reaching the submit phase, 1st one calls the close.
I think we need synchronization for active sessions manipulation.


- denys


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69022/#review209617
---


On Oct. 15, 2018, 7:21 p.m., denys kuzmenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69022/
> ---
> 
> (Updated Oct. 15, 2018, 7:21 p.m.)
> 
> 
> Review request for hive, Sahil Takiar and Adam Szita.
> 
> 
> Bugs: HIVE-20737
> https://issues.apache.org/jira/browse/HIVE-20737
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> 1. Local SparkContext is shared between user sessions and should be closed 
> only when there is no active. 
> 2. Possible race condition in SparkSession.open() in case when user queries 
> run in parallel within the same session.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/LocalHiveSparkClient.java 
> 72ff53e3bd 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/spark/session/SparkSessionImpl.java
>  bb50129518 
>   
> ql/src/test/org/apache/hadoop/hive/ql/exec/spark/TestLocalHiveSparkClient.java
>  PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/69022/diff/2/
> 
> 
> Testing
> ---
> 
> Added TestLocalHiveSparkClient test
> 
> 
> File Attachments
> 
> 
> HIVE-20737.7.patch
>   
> https://reviews.apache.org/media/uploaded/files/2018/10/15/9cf8a2b3-9ec1-4316-81d0-3cd124b1a9fd__HIVE-20737.7.patch
> 
> 
> Thanks,
> 
> denys kuzmenko
> 
>



Re: Review Request 69167: HIVE-20796: jdbc URL can contain sensitive information that should not be logged

2018-10-26 Thread Denys Kuzmenko via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69167/#review210088
---


Ship it!




Ship It!

- Denys Kuzmenko


On Oct. 26, 2018, 10:34 a.m., Laszlo Pinter wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69167/
> ---
> 
> (Updated Oct. 26, 2018, 10:34 a.m.)
> 
> 
> Review request for hive and Peter Vary.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-20796: jdbc URL can contain sensitive information that should not be 
> logged
> 
> 
> Diffs
> -
> 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
>  9c158040497cd3d2762620ce35e2b46bb6d5fffe 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/utils/MetaStoreServerUtils.java
>  f3b38665676391fec9b85eb9a405c14632340dc6 
>   
> standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/utils/TestMetaStoreServerUtils.java
>  f4bdd734dc4e731dda01e6031a4115cde5571baf 
> 
> 
> Diff: https://reviews.apache.org/r/69167/diff/2/
> 
> 
> Testing
> ---
> 
> New unit test created.
> 
> 
> Thanks,
> 
> Laszlo Pinter
> 
>



Re: Review Request 69167: HIVE-20796: jdbc URL can contain sensitive information that should not be logged

2018-10-25 Thread denys kuzmenko via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69167/#review210044
---




standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/utils/MetaStoreServerUtils.java
Lines 1163 (patched)


Strings are immutable


- denys kuzmenko


On Oct. 25, 2018, 1:36 p.m., Laszlo Pinter wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69167/
> ---
> 
> (Updated Oct. 25, 2018, 1:36 p.m.)
> 
> 
> Review request for hive and Peter Vary.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-20796: jdbc URL can contain sensitive information that should not be 
> logged
> 
> 
> Diffs
> -
> 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
>  9c158040497cd3d2762620ce35e2b46bb6d5fffe 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/utils/MetaStoreServerUtils.java
>  f3b38665676391fec9b85eb9a405c14632340dc6 
>   
> standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/utils/TestMetaStoreServerUtils.java
>  f4bdd734dc4e731dda01e6031a4115cde5571baf 
> 
> 
> Diff: https://reviews.apache.org/r/69167/diff/1/
> 
> 
> Testing
> ---
> 
> New unit test created.
> 
> 
> Thanks,
> 
> Laszlo Pinter
> 
>



Re: Review Request 69167: HIVE-20796: jdbc URL can contain sensitive information that should not be logged

2018-10-25 Thread denys kuzmenko via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69167/#review210045
---




standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/utils/MetaStoreServerUtils.java
Lines 1164 (patched)


Opinion: I would go with a list of attributes to be masked


- denys kuzmenko


On Oct. 25, 2018, 1:36 p.m., Laszlo Pinter wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69167/
> ---
> 
> (Updated Oct. 25, 2018, 1:36 p.m.)
> 
> 
> Review request for hive and Peter Vary.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-20796: jdbc URL can contain sensitive information that should not be 
> logged
> 
> 
> Diffs
> -
> 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
>  9c158040497cd3d2762620ce35e2b46bb6d5fffe 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/utils/MetaStoreServerUtils.java
>  f3b38665676391fec9b85eb9a405c14632340dc6 
>   
> standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/utils/TestMetaStoreServerUtils.java
>  f4bdd734dc4e731dda01e6031a4115cde5571baf 
> 
> 
> Diff: https://reviews.apache.org/r/69167/diff/1/
> 
> 
> Testing
> ---
> 
> New unit test created.
> 
> 
> Thanks,
> 
> Laszlo Pinter
> 
>



Re: Review Request 69167: HIVE-20796: jdbc URL can contain sensitive information that should not be logged

2018-10-25 Thread Denys Kuzmenko via Review Board


> On Oct. 25, 2018, 10:46 p.m., Denys Kuzmenko wrote:
> > standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/utils/MetaStoreServerUtils.java
> > Lines 1172 (patched)
> > 
> >
> > Could be optimized to:
> > 
> > String[] sensitiveData = {"user", "password"};
> > String regex = "([;,\?&]" + String.join("|", sensitiveData) + 
> > ")=.*?([;,&\)]+)";
> > 
> > String result = 
> > Pattern.compile(regex).matcher(connectionURL).replaceAll("$1=***$2");
> 
> Denys Kuzmenko wrote:
> or just connectionURL.replaceAll(regex, "$1=***$2");

regex = "([;,?&]" + String.join("|", sensitiveData) + ")=.*?([;,&)]?)";


- Denys


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69167/#review210059
---


On Oct. 25, 2018, 1:36 p.m., Laszlo Pinter wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69167/
> ---
> 
> (Updated Oct. 25, 2018, 1:36 p.m.)
> 
> 
> Review request for hive and Peter Vary.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-20796: jdbc URL can contain sensitive information that should not be 
> logged
> 
> 
> Diffs
> -
> 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
>  9c158040497cd3d2762620ce35e2b46bb6d5fffe 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/utils/MetaStoreServerUtils.java
>  f3b38665676391fec9b85eb9a405c14632340dc6 
>   
> standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/utils/TestMetaStoreServerUtils.java
>  f4bdd734dc4e731dda01e6031a4115cde5571baf 
> 
> 
> Diff: https://reviews.apache.org/r/69167/diff/1/
> 
> 
> Testing
> ---
> 
> New unit test created.
> 
> 
> Thanks,
> 
> Laszlo Pinter
> 
>



Re: Review Request 69167: HIVE-20796: jdbc URL can contain sensitive information that should not be logged

2018-10-25 Thread Denys Kuzmenko via Review Board


> On Oct. 25, 2018, 10:46 p.m., Denys Kuzmenko wrote:
> > standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/utils/MetaStoreServerUtils.java
> > Lines 1172 (patched)
> > 
> >
> > Could be optimized to:
> > 
> > String[] sensitiveData = {"user", "password"};
> > String regex = "([;,\?&]" + String.join("|", sensitiveData) + 
> > ")=.*?([;,&\)]+)";
> > 
> > String result = 
> > Pattern.compile(regex).matcher(connectionURL).replaceAll("$1=***$2");

or just connectionURL.replaceAll(regex, "$1=***$2");


- Denys


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69167/#review210059
---


On Oct. 25, 2018, 1:36 p.m., Laszlo Pinter wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69167/
> ---
> 
> (Updated Oct. 25, 2018, 1:36 p.m.)
> 
> 
> Review request for hive and Peter Vary.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-20796: jdbc URL can contain sensitive information that should not be 
> logged
> 
> 
> Diffs
> -
> 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
>  9c158040497cd3d2762620ce35e2b46bb6d5fffe 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/utils/MetaStoreServerUtils.java
>  f3b38665676391fec9b85eb9a405c14632340dc6 
>   
> standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/utils/TestMetaStoreServerUtils.java
>  f4bdd734dc4e731dda01e6031a4115cde5571baf 
> 
> 
> Diff: https://reviews.apache.org/r/69167/diff/1/
> 
> 
> Testing
> ---
> 
> New unit test created.
> 
> 
> Thanks,
> 
> Laszlo Pinter
> 
>



Re: Review Request 69167: HIVE-20796: jdbc URL can contain sensitive information that should not be logged

2018-10-25 Thread Denys Kuzmenko via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69167/#review210059
---




standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/utils/MetaStoreServerUtils.java
Lines 1172 (patched)


Could be optimized to:

String[] sensitiveData = {"user", "password"};
String regex = "([;,\?&]" + String.join("|", sensitiveData) + 
")=.*?([;,&\)]+)";

String result = 
Pattern.compile(regex).matcher(connectionURL).replaceAll("$1=***$2");


- Denys Kuzmenko


On Oct. 25, 2018, 1:36 p.m., Laszlo Pinter wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69167/
> ---
> 
> (Updated Oct. 25, 2018, 1:36 p.m.)
> 
> 
> Review request for hive and Peter Vary.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-20796: jdbc URL can contain sensitive information that should not be 
> logged
> 
> 
> Diffs
> -
> 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
>  9c158040497cd3d2762620ce35e2b46bb6d5fffe 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/utils/MetaStoreServerUtils.java
>  f3b38665676391fec9b85eb9a405c14632340dc6 
>   
> standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/utils/TestMetaStoreServerUtils.java
>  f4bdd734dc4e731dda01e6031a4115cde5571baf 
> 
> 
> Diff: https://reviews.apache.org/r/69167/diff/1/
> 
> 
> Testing
> ---
> 
> New unit test created.
> 
> 
> Thanks,
> 
> Laszlo Pinter
> 
>



Re: Review Request 69022: HIVE-20737: Local SparkContext is shared between user sessions and should be closed only when there is no active

2018-10-18 Thread denys kuzmenko via Review Board


> On Oct. 16, 2018, 1:47 p.m., Sahil Takiar wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/spark/session/SparkSessionImpl.java
> > Line 352 (original)
> > 
> >
> > why remove this?
> 
> denys kuzmenko wrote:
> it's not required. close() method is covered with the lock, and 
> activeJobs is a concurrent collection
> 
> Sahil Takiar wrote:
> what happens if a job is submitted after `hasTimedOut` returns true?
> 
> denys kuzmenko wrote:
> I see. However existing lock won't help as it doesn't prevent other 
> threads from adding new queries. 
> 
> public void onQuerySubmission(String queryId) {
> activeJobs.add(queryId);
> }
> 
> we might need to cover this with separate lock (onQueryCompletion, 
> onQuerySubmission, triggerTimeout)
> What do you think?
> 
> denys kuzmenko wrote:
> what happens if a job is submitted after hasTimedOut returns true? 
> current Session will be closed and a new one will be opened.
> 
> denys kuzmenko wrote:
> there might be an issue when 2nd session checkes state and get isOpen and 
> before it's reaching the submit phase, 1st one calls the close.
> I think we need synchronization for active sessions manipulation.
> 
> denys kuzmenko wrote:
> fixed. reverted back to the original locking. Above tricky case will be 
> handled by preventing new queries to execute open() before close() is 
> complete.
> 
> denys kuzmenko wrote:
> @before triggerTimeout() is complete
> 
> Sahil Takiar wrote:
> A little confused by all the comments here. Looks like the change has 
> been reverted. Do we agree that the current code is sufficient, or are their 
> other edge cases you think need to be covered?

Sorry for that, that was just my thoughts. 
I think, current code is sufficient for existing use case, however it's 
definetely not designed to support case where close() is called and could be 
executed while open() is being run


- denys


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69022/#review209617
---


On Oct. 16, 2018, 4:49 p.m., denys kuzmenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69022/
> ---
> 
> (Updated Oct. 16, 2018, 4:49 p.m.)
> 
> 
> Review request for hive, Sahil Takiar and Adam Szita.
> 
> 
> Bugs: HIVE-20737
> https://issues.apache.org/jira/browse/HIVE-20737
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> 1. Local SparkContext is shared between user sessions and should be closed 
> only when there is no active. 
> 2. Possible race condition in SparkSession.open() in case when user queries 
> run in parallel within the same session.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/LocalHiveSparkClient.java 
> 72ff53e3bd 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/spark/session/SparkSessionImpl.java
>  bb50129518 
>   
> ql/src/test/org/apache/hadoop/hive/ql/exec/spark/TestLocalHiveSparkClient.java
>  PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/69022/diff/3/
> 
> 
> Testing
> ---
> 
> Added TestLocalHiveSparkClient test
> 
> 
> File Attachments
> 
> 
> HIVE-20737.7.patch
>   
> https://reviews.apache.org/media/uploaded/files/2018/10/15/9cf8a2b3-9ec1-4316-81d0-3cd124b1a9fd__HIVE-20737.7.patch
> 
> 
> Thanks,
> 
> denys kuzmenko
> 
>



Re: Review Request 69341: HIVE-20891: Call alter_partition in batch when dynamically loading partitions

2018-11-15 Thread Denys Kuzmenko via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69341/#review210568
---




ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
Lines 2692 (patched)


By default parallelStream uses the ForkJoinPool.commonPool(), a Thread Pool 
shared by the entire application. You might consider creating a custom one.


- Denys Kuzmenko


On Nov. 15, 2018, 8:52 a.m., Laszlo Pinter wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69341/
> ---
> 
> (Updated Nov. 15, 2018, 8:52 a.m.)
> 
> 
> Review request for hive.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-20891: Call alter_partition in batch when dynamically loading partitions
> 
> 
> Diffs
> -
> 
>   
> ql/src/java/org/apache/hadoop/hive/metastore/SynchronizedMetaStoreClient.java 
> e8f362357537e73502f743a9df189dec9be2da5d 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 
> e185bf49d42da9d1497643c20bbd71edaf071bf1 
> 
> 
> Diff: https://reviews.apache.org/r/69341/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Laszlo Pinter
> 
>



Re: Review Request 69341: HIVE-20891: Call alter_partition in batch when dynamically loading partitions

2018-11-15 Thread Denys Kuzmenko via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69341/#review210569
---




ql/src/java/org/apache/hadoop/hive/metastore/SynchronizedMetaStoreClient.java
Lines 90 (patched)


3 first method params could be grouped into Table object.


- Denys Kuzmenko


On Nov. 15, 2018, 8:52 a.m., Laszlo Pinter wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69341/
> ---
> 
> (Updated Nov. 15, 2018, 8:52 a.m.)
> 
> 
> Review request for hive.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-20891: Call alter_partition in batch when dynamically loading partitions
> 
> 
> Diffs
> -
> 
>   
> ql/src/java/org/apache/hadoop/hive/metastore/SynchronizedMetaStoreClient.java 
> e8f362357537e73502f743a9df189dec9be2da5d 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 
> e185bf49d42da9d1497643c20bbd71edaf071bf1 
> 
> 
> Diff: https://reviews.apache.org/r/69341/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Laszlo Pinter
> 
>



Re: Review Request 69341: HIVE-20891: Call alter_partition in batch when dynamically loading partitions

2018-11-15 Thread Denys Kuzmenko via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69341/#review210570
---




ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
Lines 2350 (patched)


I would extract result to liocal variable for better readability.


- Denys Kuzmenko


On Nov. 15, 2018, 8:52 a.m., Laszlo Pinter wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69341/
> ---
> 
> (Updated Nov. 15, 2018, 8:52 a.m.)
> 
> 
> Review request for hive.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-20891: Call alter_partition in batch when dynamically loading partitions
> 
> 
> Diffs
> -
> 
>   
> ql/src/java/org/apache/hadoop/hive/metastore/SynchronizedMetaStoreClient.java 
> e8f362357537e73502f743a9df189dec9be2da5d 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 
> e185bf49d42da9d1497643c20bbd71edaf071bf1 
> 
> 
> Diff: https://reviews.apache.org/r/69341/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Laszlo Pinter
> 
>



Re: Review Request 69341: HIVE-20891: Call alter_partition in batch when dynamically loading partitions

2018-11-15 Thread Denys Kuzmenko via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69341/#review210571
---




ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
Line 2310 (original), 2319 (patched)


This method could reuse new partitions method to avoid code duplication.


- Denys Kuzmenko


On Nov. 15, 2018, 8:52 a.m., Laszlo Pinter wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69341/
> ---
> 
> (Updated Nov. 15, 2018, 8:52 a.m.)
> 
> 
> Review request for hive.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-20891: Call alter_partition in batch when dynamically loading partitions
> 
> 
> Diffs
> -
> 
>   
> ql/src/java/org/apache/hadoop/hive/metastore/SynchronizedMetaStoreClient.java 
> e8f362357537e73502f743a9df189dec9be2da5d 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 
> e185bf49d42da9d1497643c20bbd71edaf071bf1 
> 
> 
> Diff: https://reviews.apache.org/r/69341/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Laszlo Pinter
> 
>



Re: Review Request 69341: HIVE-20891: Call alter_partition in batch when dynamically loading partitions

2018-11-15 Thread Denys Kuzmenko via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69341/#review210572
---




ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
Lines 2350 (patched)


You might consider using a custom thread pool.


- Denys Kuzmenko


On Nov. 15, 2018, 8:52 a.m., Laszlo Pinter wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69341/
> ---
> 
> (Updated Nov. 15, 2018, 8:52 a.m.)
> 
> 
> Review request for hive.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-20891: Call alter_partition in batch when dynamically loading partitions
> 
> 
> Diffs
> -
> 
>   
> ql/src/java/org/apache/hadoop/hive/metastore/SynchronizedMetaStoreClient.java 
> e8f362357537e73502f743a9df189dec9be2da5d 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 
> e185bf49d42da9d1497643c20bbd71edaf071bf1 
> 
> 
> Diff: https://reviews.apache.org/r/69341/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Laszlo Pinter
> 
>



Re: Review Request 68895: HadoopVer was ignored in QTestUtil

2018-10-05 Thread denys kuzmenko via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68895/
---

(Updated Oct. 5, 2018, 10 a.m.)


Review request for hive, Zoltan Haindrich and Peter Vary.


Bugs: HIVE-20647
https://issues.apache.org/jira/browse/HIVE-20647


Repository: hive-git


Description
---

- Removed obsolete hadoopVer from QTestUtil

- Cleaned up QTestUtil, QTestArgumentsBuilder

- Refactored AccumuloTestSetup, AccumuloQTestUtil due to findbugs violation:
(UR_UNINIT_READ_CALLED_FROM_SUPER_CONSTRUCTOR: Uninitialized read of field 
method called from constructor of superclass)


Diffs (updated)
-

  
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/TestLocationQueries.java
 b7e563aebf 
  itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/TestMTQueries.java 
3d8eb83b04 
  
itests/util/src/main/java/org/apache/hadoop/hive/accumulo/AccumuloQTestUtil.java
 060e0cd3e2 
  
itests/util/src/main/java/org/apache/hadoop/hive/accumulo/AccumuloTestSetup.java
 47cf7ac79a 
  
itests/util/src/main/java/org/apache/hadoop/hive/cli/control/AbstractCoreBlobstoreCliDriver.java
 64f29194d5 
  
itests/util/src/main/java/org/apache/hadoop/hive/cli/control/CoreAccumuloCliDriver.java
 0d64cfa1ac 
  
itests/util/src/main/java/org/apache/hadoop/hive/cli/control/CoreBlobstoreCliDriver.java
 bdb15b3869 
  
itests/util/src/main/java/org/apache/hadoop/hive/cli/control/CoreBlobstoreNegativeCliDriver.java
 801c44b40c 
  
itests/util/src/main/java/org/apache/hadoop/hive/cli/control/CoreCliDriver.java 
841344438e 
  
itests/util/src/main/java/org/apache/hadoop/hive/cli/control/CoreCompareCliDriver.java
 8ce43495e6 
  itests/util/src/main/java/org/apache/hadoop/hive/cli/control/CoreDummy.java 
4d40ef9047 
  
itests/util/src/main/java/org/apache/hadoop/hive/cli/control/CoreHBaseCliDriver.java
 252e9f6b90 
  
itests/util/src/main/java/org/apache/hadoop/hive/cli/control/CoreHBaseNegativeCliDriver.java
 c009cec57b 
  
itests/util/src/main/java/org/apache/hadoop/hive/cli/control/CoreNegativeCliDriver.java
 0807da1c7c 
  
itests/util/src/main/java/org/apache/hadoop/hive/cli/control/CorePerfCliDriver.java
 badb4a5888 
  itests/util/src/main/java/org/apache/hadoop/hive/hbase/HBaseQTestUtil.java 
6a6b1003b6 
  itests/util/src/main/java/org/apache/hadoop/hive/hbase/HBaseTestSetup.java 
cc63a6cc83 
  itests/util/src/main/java/org/apache/hadoop/hive/ql/QTestArguments.java 
18269ebc92 
  itests/util/src/main/java/org/apache/hadoop/hive/ql/QTestUtil.java 0e8b82930e 
  
itests/util/src/main/java/org/apache/hadoop/hive/ql/parse/CoreParseNegative.java
 77de3faa1f 


Diff: https://reviews.apache.org/r/68895/diff/2/

Changes: https://reviews.apache.org/r/68895/diff/1-2/


Testing
---


Thanks,

denys kuzmenko



Review Request 68960: Race Condition when Multi-Threading in SessionState.createRootHDFSDir

2018-10-09 Thread denys kuzmenko via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68960/
---

Review request for hive, Jason Dere and Peter Vary.


Bugs: HIVE-20711
https://issues.apache.org/jira/browse/HIVE-20711


Repository: hive-git


Description
---

java.util.concurrent.ExecutionException: java.lang.RuntimeException: The root 
scratch dir: 
/home/hiveptest/hive-ptest-cloudera-slaves-17e5-13.gce.cloudera.com-hiveptest-0/cdh-source/itests/hive-unit/target/tmp/scratchdir
 on HDFS should be writable. Current permissions are: rwxr-xr-x at 
org.apache.hadoop.hive.ql.session.SessionState.createRootHDFSDir(SessionState.java:714)
 at 
org.apache.hadoop.hive.ql.session.SessionState.createSessionDirs(SessionState.java:637)
 at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:567) 
at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:532) 
at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:512) 
at


Diffs
-

  
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/exec/spark/TestSparkSessionTimeout.java
 c887297bc2 
  ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 76a30eb912 


Diff: https://reviews.apache.org/r/68960/diff/1/


Testing
---

TestSparkSessionTimeout


Thanks,

denys kuzmenko



Re: Review Request 68683: Add new configuration to set the size of the global compile lock

2018-10-01 Thread denys kuzmenko via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68683/
---

(Updated Oct. 1, 2018, 11:42 a.m.)


Review request for hive, Zoltan Haindrich, Zoltan Haindrich, Naveen Gangam, and 
Peter Vary.


Bugs: HIVE-20535
https://issues.apache.org/jira/browse/HIVE-20535


Repository: hive-git


Description
---

When removing the compile lock, it is quite risky to remove it entirely.

It would be good to provide a pool size for the concurrent compilation, so the 
administrator can limit the load


Diffs (updated)
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java d1e6631975 
  ql/src/java/org/apache/hadoop/hive/ql/Driver.java dad2035362 
  ql/src/java/org/apache/hadoop/hive/ql/lock/CompileLock.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/lock/CompileLockFactory.java 
PRE-CREATION 
  ql/src/test/org/apache/hadoop/hive/ql/CompileLockTest.java PRE-CREATION 


Diff: https://reviews.apache.org/r/68683/diff/8/

Changes: https://reviews.apache.org/r/68683/diff/7-8/


Testing
---

Added CompileLockTest


File Attachments


HIVE-20535.1.patch
  
https://reviews.apache.org/media/uploaded/files/2018/09/13/41f5a84a-70e5-4882-99c1-1cf98c4364e4__HIVE-20535.1.patch
HIVE-20535.14.patch
  
https://reviews.apache.org/media/uploaded/files/2018/09/25/335b0f4b-ea94-41d4-881a-ec8bb870a376__HIVE-20535.14.patch
HIVE-20535.14.patch
  
https://reviews.apache.org/media/uploaded/files/2018/09/25/a92b6da2-eeba-46ee-9409-162653826172__HIVE-20535.14.patch
HIVE-20535.14.patch
  
https://reviews.apache.org/media/uploaded/files/2018/09/25/9db4cf76-9188-48fb-bd3d-5b28e43a791b__HIVE-20535.14.patch


Thanks,

denys kuzmenko



Review Request 68895: HadoopVer was ignored in QTestUtil

2018-10-02 Thread denys kuzmenko via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68895/
---

Review request for hive, Zoltan Haindrich and Peter Vary.


Bugs: HIVE-20647
https://issues.apache.org/jira/browse/HIVE-20647


Repository: hive-git


Description
---

- Removed obsolete hadoopVer from QTestUtil

- Cleaned up QTestUtil, QTestArgumentsBuilder

- Refactored AccumuloTestSetup, AccumuloQTestUtil due to findbugs violation:
(UR_UNINIT_READ_CALLED_FROM_SUPER_CONSTRUCTOR: Uninitialized read of field 
method called from constructor of superclass)


Diffs
-

  
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/TestLocationQueries.java
 b7e563aebf 
  itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/TestMTQueries.java 
3d8eb83b04 
  
itests/util/src/main/java/org/apache/hadoop/hive/accumulo/AccumuloQTestUtil.java
 060e0cd3e2 
  
itests/util/src/main/java/org/apache/hadoop/hive/accumulo/AccumuloTestSetup.java
 47cf7ac79a 
  
itests/util/src/main/java/org/apache/hadoop/hive/cli/control/AbstractCoreBlobstoreCliDriver.java
 64f29194d5 
  
itests/util/src/main/java/org/apache/hadoop/hive/cli/control/CoreAccumuloCliDriver.java
 0d64cfa1ac 
  
itests/util/src/main/java/org/apache/hadoop/hive/cli/control/CoreBlobstoreCliDriver.java
 bdb15b3869 
  
itests/util/src/main/java/org/apache/hadoop/hive/cli/control/CoreBlobstoreNegativeCliDriver.java
 801c44b40c 
  
itests/util/src/main/java/org/apache/hadoop/hive/cli/control/CoreCliDriver.java 
841344438e 
  
itests/util/src/main/java/org/apache/hadoop/hive/cli/control/CoreCompareCliDriver.java
 8ce43495e6 
  itests/util/src/main/java/org/apache/hadoop/hive/cli/control/CoreDummy.java 
4d40ef9047 
  
itests/util/src/main/java/org/apache/hadoop/hive/cli/control/CoreHBaseCliDriver.java
 252e9f6b90 
  
itests/util/src/main/java/org/apache/hadoop/hive/cli/control/CoreHBaseNegativeCliDriver.java
 c009cec57b 
  
itests/util/src/main/java/org/apache/hadoop/hive/cli/control/CoreNegativeCliDriver.java
 0807da1c7c 
  
itests/util/src/main/java/org/apache/hadoop/hive/cli/control/CorePerfCliDriver.java
 badb4a5888 
  itests/util/src/main/java/org/apache/hadoop/hive/hbase/HBaseQTestUtil.java 
6a6b1003b6 
  itests/util/src/main/java/org/apache/hadoop/hive/hbase/HBaseTestSetup.java 
cc63a6cc83 
  itests/util/src/main/java/org/apache/hadoop/hive/ql/QTestArguments.java 
18269ebc92 
  itests/util/src/main/java/org/apache/hadoop/hive/ql/QTestUtil.java 0e8b82930e 
  
itests/util/src/main/java/org/apache/hadoop/hive/ql/parse/CoreParseNegative.java
 77de3faa1f 


Diff: https://reviews.apache.org/r/68895/diff/1/


Testing
---


Thanks,

denys kuzmenko



Re: Review Request 68767: HIVE-20551: Create PreparedStatement query dynamically when IN clause is used

2018-09-20 Thread denys kuzmenko via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68767/#review208827
---




standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
Line 311 (original), 311 (patched)


Is it a good idea to use PreparedStatement for this particular use-case? 
Prepared statement must be recompiled each time you call it with a 
different number of parameters. Aside from the added time from recompiling the 
prepared statement, other queries may be removed from the pool of available 
statements causing them to be recompiled again too.


- denys kuzmenko


On Sept. 19, 2018, 9:46 a.m., Laszlo Pinter wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68767/
> ---
> 
> (Updated Sept. 19, 2018, 9:46 a.m.)
> 
> 
> Review request for hive, Alexander Kolbasov, Peter Vary, and Vihang 
> Karajgaonkar.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-20551: Create PreparedStatement query dynamically when IN clause is used
> 
> 
> Diffs
> -
> 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
>  571c789eddfd2b1a27c65c48bdc6dccfafaaf676 
> 
> 
> Diff: https://reviews.apache.org/r/68767/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Laszlo Pinter
> 
>



Re: Review Request 68683: Add new configuration to set the size of the global compile lock

2018-09-26 Thread denys kuzmenko via Review Board


> On Sept. 26, 2018, 11:47 a.m., Antal Sinkovits wrote:
> > common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
> > Lines 3062 (patched)
> > 
> >
> > Why is the default value -1? All the checks seems to go against >0. 
> > What happens when the value is 0?

If the degree of parallelism is lower than 1, there won't be any restrictions 
on number of parallel compilations. "-1" means unbounded. "0" doesn't make 
sence, so it's just ignored (and will be unbounded).  Otherwise, number of 
quotas will be equal to the "hive.driver.parallel.compilation.global.limit" 
value.


> On Sept. 26, 2018, 11:47 a.m., Antal Sinkovits wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/lock/CompileLock.java
> > Lines 12 (patched)
> > 
> >
> > I think there is a typo here, and the it should be CompileLock.class.

Thanks for the catch!


- denys


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68683/#review209017
---


On Sept. 26, 2018, 1:08 p.m., denys kuzmenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68683/
> ---
> 
> (Updated Sept. 26, 2018, 1:08 p.m.)
> 
> 
> Review request for hive, Zoltan Haindrich, Zoltan Haindrich, Naveen Gangam, 
> and Peter Vary.
> 
> 
> Bugs: HIVE-20535
> https://issues.apache.org/jira/browse/HIVE-20535
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> When removing the compile lock, it is quite risky to remove it entirely.
> 
> It would be good to provide a pool size for the concurrent compilation, so 
> the administrator can limit the load
> 
> 
> Diffs
> -
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 8c39de3e77 
>   ql/src/java/org/apache/hadoop/hive/ql/Driver.java 737debd2ad 
>   ql/src/java/org/apache/hadoop/hive/ql/lock/CompileLock.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/lock/CompileLockFactory.java 
> PRE-CREATION 
>   ql/src/test/org/apache/hadoop/hive/ql/CompileLockTest.java PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/68683/diff/7/
> 
> 
> Testing
> ---
> 
> Added CompileLockTest
> 
> 
> File Attachments
> 
> 
> HIVE-20535.1.patch
>   
> https://reviews.apache.org/media/uploaded/files/2018/09/13/41f5a84a-70e5-4882-99c1-1cf98c4364e4__HIVE-20535.1.patch
> HIVE-20535.14.patch
>   
> https://reviews.apache.org/media/uploaded/files/2018/09/25/335b0f4b-ea94-41d4-881a-ec8bb870a376__HIVE-20535.14.patch
> HIVE-20535.14.patch
>   
> https://reviews.apache.org/media/uploaded/files/2018/09/25/a92b6da2-eeba-46ee-9409-162653826172__HIVE-20535.14.patch
> HIVE-20535.14.patch
>   
> https://reviews.apache.org/media/uploaded/files/2018/09/25/9db4cf76-9188-48fb-bd3d-5b28e43a791b__HIVE-20535.14.patch
> 
> 
> Thanks,
> 
> denys kuzmenko
> 
>



Re: Review Request 68683: Add new configuration to set the size of the global compile lock

2018-09-26 Thread denys kuzmenko via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68683/
---

(Updated Sept. 26, 2018, 1:08 p.m.)


Review request for hive, Zoltan Haindrich, Zoltan Haindrich, Naveen Gangam, and 
Peter Vary.


Bugs: HIVE-20535
https://issues.apache.org/jira/browse/HIVE-20535


Repository: hive-git


Description
---

When removing the compile lock, it is quite risky to remove it entirely.

It would be good to provide a pool size for the concurrent compilation, so the 
administrator can limit the load


Diffs (updated)
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 8c39de3e77 
  ql/src/java/org/apache/hadoop/hive/ql/Driver.java 737debd2ad 
  ql/src/java/org/apache/hadoop/hive/ql/lock/CompileLock.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/lock/CompileLockFactory.java 
PRE-CREATION 
  ql/src/test/org/apache/hadoop/hive/ql/CompileLockTest.java PRE-CREATION 


Diff: https://reviews.apache.org/r/68683/diff/7/

Changes: https://reviews.apache.org/r/68683/diff/6-7/


Testing
---

Added CompileLockTest


File Attachments


HIVE-20535.1.patch
  
https://reviews.apache.org/media/uploaded/files/2018/09/13/41f5a84a-70e5-4882-99c1-1cf98c4364e4__HIVE-20535.1.patch
HIVE-20535.14.patch
  
https://reviews.apache.org/media/uploaded/files/2018/09/25/335b0f4b-ea94-41d4-881a-ec8bb870a376__HIVE-20535.14.patch
HIVE-20535.14.patch
  
https://reviews.apache.org/media/uploaded/files/2018/09/25/a92b6da2-eeba-46ee-9409-162653826172__HIVE-20535.14.patch
HIVE-20535.14.patch
  
https://reviews.apache.org/media/uploaded/files/2018/09/25/9db4cf76-9188-48fb-bd3d-5b28e43a791b__HIVE-20535.14.patch


Thanks,

denys kuzmenko



Re: Review Request 68683: Add new configuration to set the size of the global compile lock

2018-09-25 Thread denys kuzmenko via Review Board


> On Sept. 24, 2018, 11:14 p.m., Peter Vary wrote:
> > Hi Denys,
> > 
> > Could you please think a little about separating the Manager/Factory and 
> > the tryAcquire mess?
> > 
> > Incomplete thoughts, but I had to run
> > 
> > Thanks, and sorry :(
> > Peter

Please review new patch. Really appreciate your and Zoltan's suggestions, now 
code looks much better.


> On Sept. 24, 2018, 11:14 p.m., Peter Vary wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/CompileLockManager.java
> > Lines 130 (patched)
> > 
> >
> > nit: I do prefer creating static final variables at the begining of the 
> > class, or at the first use. Do not create a new patch because of this, but 
> > if you have to do a new one please move the declaration up to the line ~51

done!


> On Sept. 24, 2018, 11:14 p.m., Peter Vary wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/Driver.java
> > Line 1854 (original), 1849-1850 (patched)
> > 
> >
> > This still makes me itching...
> > I think we should separate the Manager / Factory and the actual lock 
> > object.
> > I would prefer the following:
> > - CompileLockManager should create the lock object
> > - Use the lock object as Zoltan suggested (try-with-resources)
> > - If we decide to keep tryAcquire - can we do it as a wrapper around 
> > the tryLock method

Please review new path! Thank you!


- denys


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68683/#review208968
---


On Sept. 25, 2018, 10:19 a.m., denys kuzmenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68683/
> ---
> 
> (Updated Sept. 25, 2018, 10:19 a.m.)
> 
> 
> Review request for hive, Zoltan Haindrich, Zoltan Haindrich, Naveen Gangam, 
> and Peter Vary.
> 
> 
> Bugs: HIVE-20535
> https://issues.apache.org/jira/browse/HIVE-20535
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> When removing the compile lock, it is quite risky to remove it entirely.
> 
> It would be good to provide a pool size for the concurrent compilation, so 
> the administrator can limit the load
> 
> 
> Diffs
> -
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 8c39de3e77 
>   ql/src/java/org/apache/hadoop/hive/ql/Driver.java 737debd2ad 
>   ql/src/java/org/apache/hadoop/hive/ql/lock/CompileLock.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/lock/CompileLockFactory.java 
> PRE-CREATION 
>   ql/src/test/org/apache/hadoop/hive/ql/CompileLockTest.java PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/68683/diff/6/
> 
> 
> Testing
> ---
> 
> Added CompileLockTest
> 
> 
> File Attachments
> 
> 
> HIVE-20535.1.patch
>   
> https://reviews.apache.org/media/uploaded/files/2018/09/13/41f5a84a-70e5-4882-99c1-1cf98c4364e4__HIVE-20535.1.patch
> HIVE-20535.14.patch
>   
> https://reviews.apache.org/media/uploaded/files/2018/09/25/335b0f4b-ea94-41d4-881a-ec8bb870a376__HIVE-20535.14.patch
> HIVE-20535.14.patch
>   
> https://reviews.apache.org/media/uploaded/files/2018/09/25/a92b6da2-eeba-46ee-9409-162653826172__HIVE-20535.14.patch
> HIVE-20535.14.patch
>   
> https://reviews.apache.org/media/uploaded/files/2018/09/25/9db4cf76-9188-48fb-bd3d-5b28e43a791b__HIVE-20535.14.patch
> 
> 
> Thanks,
> 
> denys kuzmenko
> 
>



Re: Review Request 68683: Add new configuration to set the size of the global compile lock

2018-09-25 Thread denys kuzmenko via Review Board


> On Sept. 17, 2018, 9:15 a.m., Zoltan Haindrich wrote:
> > I'm not sure but I feel that it would be probably simpler to add something 
> > which covers some reentrant-s and semaphores.
> > It feels like this lock handling is a littlebit scattered around...I think 
> > it would be better to have them outside of the Driver class.
> 
> denys kuzmenko wrote:
> moved logic to CompileLockManager

splitted and extracted functionality of CompileLock and CompileLockFactory


> On Sept. 17, 2018, 9:15 a.m., Zoltan Haindrich wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/Driver.java
> > Lines 247 (patched)
> > 
> >
> > I'm not sure we gain anything by having these strings in a static block 
> > - they are only used as log messages at debug level..
> 
> denys kuzmenko wrote:
> It's a clean code practice (String literals)

refactored


> On Sept. 17, 2018, 9:15 a.m., Zoltan Haindrich wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/Driver.java
> > Lines 252 (patched)
> > 
> >
> > final
> 
> denys kuzmenko wrote:
> there is conditional logic, default value is serializableCompileLock;

Fixed.


> On Sept. 17, 2018, 9:15 a.m., Zoltan Haindrich wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/Driver.java
> > Lines 271 (patched)
> > 
> >
> > it seems to me that this class is not the lock itself...it instead the 
> > "thing that locks"...
> > 
> > but getInstance() gives the feeling that it's something like a 
> > singleton...this is a little bit confusing to me
> 
> denys kuzmenko wrote:
> externalized to CompileLockManager class

refactored, added Factory and Lock


> On Sept. 17, 2018, 9:15 a.m., Zoltan Haindrich wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/Driver.java
> > Lines 380 (patched)
> > 
> >
> > my first comment was: why do we use 2 locks now?
> > 
> > I now understand why...I feel that probably trying to replace the 
> > existing logic with a decent one which could handle all these cases would 
> > make it more straight.
> > If you don't think that would be appropriate - that's okay; just drop 
> > this issue...
> 
> denys kuzmenko wrote:
> it's just a first steps in compile lock refactoring.

New locks could be created with CompileLockFactory.


> On Sept. 17, 2018, 9:15 a.m., Zoltan Haindrich wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/Driver.java
> > Line 1860 (original), 2017 (patched)
> > 
> >
> > I think it would be better to use try-with-resouces instead of manual 
> > control...that would also take care of the unlock/release/etc as well
> > 
> > I feel that it's easier to follow - if a lock has a scope..
> 
> denys kuzmenko wrote:
> I would have to remember the result of tryAcquire method (aquired lock or 
> not) and supply it to AutoClosable.close(){if(locked) lock.unlock()} . I 
> think it would complicate the logic.

Fixed


- denys


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68683/#review208625
---


On Sept. 25, 2018, 10:19 a.m., denys kuzmenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68683/
> ---
> 
> (Updated Sept. 25, 2018, 10:19 a.m.)
> 
> 
> Review request for hive, Zoltan Haindrich, Zoltan Haindrich, Naveen Gangam, 
> and Peter Vary.
> 
> 
> Bugs: HIVE-20535
> https://issues.apache.org/jira/browse/HIVE-20535
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> When removing the compile lock, it is quite risky to remove it entirely.
> 
> It would be good to provide a pool size for the concurrent compilation, so 
> the administrator can limit the load
> 
> 
> Diffs
> -
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 8c39de3e77 
>   ql/src/java/org/apache/hadoop/hive/ql/Driver.java 737debd2ad 
>   ql/src/java/org/apache/hadoop/hive/ql/lock/CompileLock.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/lock/CompileLockFactory.java 
> PRE-CREATION 
>   ql/src/test/org/apache/hadoop/hive/ql/CompileLockTest.java PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/68683/diff/6/
> 
> 
> Testing
> ---
> 
> Added CompileLockTest
> 
> 
> File Attachments
> 
> 
> HIVE-20535.1.patch
>   
> https://reviews.apache.org/media/uploaded/files/2018/09/13/41f5a84a-70e5-4882-99c1-1cf98c4364e4__HIVE-20535.1.patch
> HIVE-20535.14.patch
>   
> 

Re: Review Request 68683: Add new configuration to set the size of the global compile lock

2018-09-25 Thread denys kuzmenko via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68683/
---

(Updated Sept. 25, 2018, 10:19 a.m.)


Review request for hive, Zoltan Haindrich, Zoltan Haindrich, Naveen Gangam, and 
Peter Vary.


Bugs: HIVE-20535
https://issues.apache.org/jira/browse/HIVE-20535


Repository: hive-git


Description
---

When removing the compile lock, it is quite risky to remove it entirely.

It would be good to provide a pool size for the concurrent compilation, so the 
administrator can limit the load


Diffs (updated)
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 8c39de3e77 
  ql/src/java/org/apache/hadoop/hive/ql/Driver.java 737debd2ad 
  ql/src/java/org/apache/hadoop/hive/ql/lock/CompileLock.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/lock/CompileLockFactory.java 
PRE-CREATION 
  ql/src/test/org/apache/hadoop/hive/ql/CompileLockTest.java PRE-CREATION 


Diff: https://reviews.apache.org/r/68683/diff/6/

Changes: https://reviews.apache.org/r/68683/diff/5-6/


Testing
---

Added CompileLockTest


File Attachments (updated)


HIVE-20535.1.patch
  
https://reviews.apache.org/media/uploaded/files/2018/09/13/41f5a84a-70e5-4882-99c1-1cf98c4364e4__HIVE-20535.1.patch
HIVE-20535.14.patch
  
https://reviews.apache.org/media/uploaded/files/2018/09/25/335b0f4b-ea94-41d4-881a-ec8bb870a376__HIVE-20535.14.patch
HIVE-20535.14.patch
  
https://reviews.apache.org/media/uploaded/files/2018/09/25/a92b6da2-eeba-46ee-9409-162653826172__HIVE-20535.14.patch
HIVE-20535.14.patch
  
https://reviews.apache.org/media/uploaded/files/2018/09/25/9db4cf76-9188-48fb-bd3d-5b28e43a791b__HIVE-20535.14.patch


Thanks,

denys kuzmenko



Re: Review Request 68474: HIVE-20440: Create better cache eviction policy for SmallTableCache

2018-09-20 Thread denys kuzmenko via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68474/#review208793
---




ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SmallTableCache.java
Lines 94 (patched)


Just a remark, note from google documentation:
"Because of the performance implications of using soft references, we 
generally recommend using the more predictable maximum cache size instead."


- denys kuzmenko


On Sept. 19, 2018, 11:14 p.m., Antal Sinkovits wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68474/
> ---
> 
> (Updated Sept. 19, 2018, 11:14 p.m.)
> 
> 
> Review request for hive, Naveen Gangam, Sahil Takiar, Adam Szita, and Xuefu 
> Zhang.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> I've modified the SmallTableCache to use guava cache, with soft references.
> By using a value loader, I've also eliminated the synchronization on the 
> intern-ed string of the path.
> 
> 
> Diffs
> -
> 
>   ql/pom.xml d73deba440702ec39fc5610df28e0fe54baef025 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HashTableLoader.java 
> cf27e92bafdc63096ec0fa8c3106657bab52f370 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SmallTableCache.java 
> 3293100af96dc60408c53065fa89143ead98f818 
>   ql/src/test/org/apache/hadoop/hive/ql/exec/spark/TestSmallTableCache.java 
> PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/68474/diff/2/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Antal Sinkovits
> 
>



Review Request 69801: HADOOP_CREDSTORE_PASSWORD is not populated under yarn.app.mapreduce.am.admin.user.env

2019-01-21 Thread Denys Kuzmenko via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69801/
---

Review request for hive, Peter Vary and Vihang Karajgaonkar.


Bugs: HIVE-21116
https://issues.apache.org/jira/browse/HIVE-21116


Repository: hive-git


Description
---

HADOOP_CREDSTORE_PASSWORD is not populated under 
yarn.app.mapreduce.am.admin.user.env


Diffs
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConfUtil.java ae6fa43df0 
  common/src/java/org/apache/hive/common/util/HiveStringUtils.java a4923f9f1b 
  ql/src/test/org/apache/hadoop/hive/ql/exec/TestHiveCredentialProviders.java 
4f49190df0 


Diff: https://reviews.apache.org/r/69801/diff/1/


Testing
---

TestHiveCredentialProviders.java


Thanks,

Denys Kuzmenko



Re: Review Request 69801: HADOOP_CREDSTORE_PASSWORD is not populated under yarn.app.mapreduce.am.admin.user.env

2019-01-21 Thread Denys Kuzmenko via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69801/
---

(Updated Jan. 21, 2019, 1:13 p.m.)


Review request for hive, Peter Vary and Vihang Karajgaonkar.


Bugs: HIVE-21116
https://issues.apache.org/jira/browse/HIVE-21116


Repository: hive-git


Description
---

HADOOP_CREDSTORE_PASSWORD is not populated under 
yarn.app.mapreduce.am.admin.user.env


Diffs (updated)
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConfUtil.java ae6fa43df0 
  common/src/java/org/apache/hive/common/util/HiveStringUtils.java a4923f9f1b 
  ql/src/test/org/apache/hadoop/hive/ql/exec/TestHiveCredentialProviders.java 
4f49190df0 


Diff: https://reviews.apache.org/r/69801/diff/2/

Changes: https://reviews.apache.org/r/69801/diff/1-2/


Testing
---

TestHiveCredentialProviders.java


Thanks,

Denys Kuzmenko



Re: Review Request 69801: HADOOP_CREDSTORE_PASSWORD is not populated under yarn.app.mapreduce.am.admin.user.env

2019-01-21 Thread Denys Kuzmenko via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69801/
---

(Updated Jan. 21, 2019, 1:34 p.m.)


Review request for hive, Peter Vary and Vihang Karajgaonkar.


Bugs: HIVE-21116
https://issues.apache.org/jira/browse/HIVE-21116


Repository: hive-git


Description
---

HADOOP_CREDSTORE_PASSWORD is not populated under 
yarn.app.mapreduce.am.admin.user.env


Diffs (updated)
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConfUtil.java ae6fa43df0 
  common/src/java/org/apache/hive/common/util/HiveStringUtils.java a4923f9f1b 
  ql/src/test/org/apache/hadoop/hive/ql/exec/TestHiveCredentialProviders.java 
4f49190df0 


Diff: https://reviews.apache.org/r/69801/diff/3/

Changes: https://reviews.apache.org/r/69801/diff/2-3/


Testing
---

TestHiveCredentialProviders.java


Thanks,

Denys Kuzmenko



Re: Review Request 69550: Add credential store env properties redaction in JobConf

2018-12-12 Thread Denys Kuzmenko via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69550/
---

(Updated Dec. 12, 2018, 3:31 p.m.)


Review request for hive, Peter Vary and Vihang Karajgaonkar.


Changes
---

added tests


Bugs: HIVE-21030
https://issues.apache.org/jira/browse/HIVE-21030


Repository: hive-git


Description
---

Credstore decryption password should be redacted in JobConf


Diffs (updated)
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConfUtil.java 
2ad5f9ee39f376d3466994a24cc9f7850be902ae 
  ql/src/test/org/apache/hadoop/hive/ql/exec/TestHiveCredentialProviders.java 
62eb9e4c3c14cfc233f5184d58171e298e25bd9a 


Diff: https://reviews.apache.org/r/69550/diff/2/

Changes: https://reviews.apache.org/r/69550/diff/1-2/


Testing
---


Thanks,

Denys Kuzmenko



Re: Review Request 69560: HIVE-21035: Race condition in SparkUtilities#getSparkSession

2018-12-12 Thread Denys Kuzmenko via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69560/#review211237
---


Ship it!




Ship It!

- Denys Kuzmenko


On Dec. 12, 2018, 3:13 p.m., Antal Sinkovits wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69560/
> ---
> 
> (Updated Dec. 12, 2018, 3:13 p.m.)
> 
> 
> Review request for hive, Denys Kuzmenko, Peter Vary, and Adam Szita.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> It can happen, that when in one given session, multiple queries are executed, 
> that due to a race condition, multiple spark application master gets kicked 
> off.
> In this case, the one that started earlier, will not be killed, when the hive 
> session closes, consuming resources.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkUtilities.java 
> d384ed6db6f4630ee2ef309854236e8f12926688 
>   ql/src/test/org/apache/hadoop/hive/ql/exec/spark/TestSparkUtilities.java 
> PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/69560/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Antal Sinkovits
> 
>



Re: Review Request 69550: Add credential store env properties redaction in JobConf

2018-12-11 Thread Denys Kuzmenko via Review Board


> On Dec. 11, 2018, 4:30 p.m., Vihang Karajgaonkar wrote:
> > common/src/java/org/apache/hadoop/hive/conf/HiveConfUtil.java
> > Lines 217 (patched)
> > 
> >
> > Do we need to do redact for the Spark config as well?

no, it's already handled in Spark:
https://github.com/apache/spark/commit/66636ef0b046e5d1f340c3b8153d7213fa9d19c7


- Denys


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69550/#review211199
---


On Dec. 11, 2018, 12:43 p.m., Denys Kuzmenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69550/
> ---
> 
> (Updated Dec. 11, 2018, 12:43 p.m.)
> 
> 
> Review request for hive, Peter Vary and Vihang Karajgaonkar.
> 
> 
> Bugs: HIVE-21030
> https://issues.apache.org/jira/browse/HIVE-21030
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Credstore decryption password should be redacted in JobConf
> 
> 
> Diffs
> -
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConfUtil.java 
> 2ad5f9ee39f376d3466994a24cc9f7850be902ae 
> 
> 
> Diff: https://reviews.apache.org/r/69550/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Denys Kuzmenko
> 
>



Review Request 69550: Add credential store env properties redaction in JobConf

2018-12-11 Thread Denys Kuzmenko via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69550/
---

Review request for hive, Peter Vary and Vihang Karajgaonkar.


Repository: hive-git


Description
---

Credstore decryption password should be redacted in JobConf


Diffs
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConfUtil.java 
2ad5f9ee39f376d3466994a24cc9f7850be902ae 


Diff: https://reviews.apache.org/r/69550/diff/1/


Testing
---


Thanks,

Denys Kuzmenko



Re: Review Request 69550: Add credential store env properties redaction in JobConf

2018-12-11 Thread Denys Kuzmenko via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69550/
---

(Updated Dec. 11, 2018, 12:43 p.m.)


Review request for hive, Peter Vary and Vihang Karajgaonkar.


Bugs: HIVE-21030
https://issues.apache.org/jira/browse/HIVE-21030


Repository: hive-git


Description
---

Credstore decryption password should be redacted in JobConf


Diffs
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConfUtil.java 
2ad5f9ee39f376d3466994a24cc9f7850be902ae 


Diff: https://reviews.apache.org/r/69550/diff/1/


Testing
---


Thanks,

Denys Kuzmenko



Re: Review Request 69432: HIVE-20964 Create a test that checks the level of the parallel compilation

2018-11-22 Thread Denys Kuzmenko via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69432/#review210800
---


Ship it!




Ship It!

- Denys Kuzmenko


On Nov. 22, 2018, 3:19 p.m., Peter Vary wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69432/
> ---
> 
> (Updated Nov. 22, 2018, 3:19 p.m.)
> 
> 
> Review request for hive, Denys Kuzmenko, Marta Kuczora, and Adam Szita.
> 
> 
> Bugs: HIVE-20964
> https://issues.apache.org/jira/browse/HIVE-20964
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> * Created 2 query types in the TestCompileLock mock driver. The original 
> SHORT_QUERY is finishing in 0.5s as before, but the new LONG_QUERY will 
> finish only after 5s.
> * With using the new 5s query I have created a new test where the compile 
> quota is 4 and the parallel request number is 10. So the test expects that 6 
> query will fail with timeout.
> * Added a new verifyThatTimedOutCompileOpsCount method to validate the number 
> of the timed out queries.
> * The other changes are just pushing down the query string so the 
> compileAndRespond method can decide which query to run.
> 
> 
> Diffs
> -
> 
>   ql/src/test/org/apache/hadoop/hive/ql/TestCompileLock.java 8dc05ff480 
> 
> 
> Diff: https://reviews.apache.org/r/69432/diff/1/
> 
> 
> Testing
> ---
> 
> Run the new test, and all the old tests in TestCompileLock
> 
> 
> Thanks,
> 
> Peter Vary
> 
>



Re: Review Request 68523: Improve org.apache.hadoop.hive.ql.exec.FunctionTask Experience

2018-09-17 Thread denys kuzmenko via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68523/
---

(Updated Sept. 17, 2018, 3:24 p.m.)


Review request for hive, Marta Kuczora, Peter Vary, and Adam Szita.


Bugs: HIVE-20466
https://issues.apache.org/jira/browse/HIVE-20466


Repository: hive-git


Description
---

When a create function statement is submitted, it may fail with the following 
error:

Error while processing statement: FAILED: Execution Error, return code 1 from 
org.apache.hadoop.hive.ql.exec.FunctionTask

This is not a user-friendly error message.


Diffs (updated)
-

  ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java 44591842bb 
  ql/src/java/org/apache/hadoop/hive/ql/exec/Registry.java 6f8a8f5504 
  ql/src/test/results/clientnegative/create_function_nonexistent_class.q.out 
77467f66e3 


Diff: https://reviews.apache.org/r/68523/diff/3/

Changes: https://reviews.apache.org/r/68523/diff/2-3/


Testing
---

added negative qtest to cover this scenario


Thanks,

denys kuzmenko



Re: Review Request 68683: Add new configuration to set the size of the global compile lock

2018-09-17 Thread denys kuzmenko via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68683/
---

(Updated Sept. 17, 2018, 5:55 p.m.)


Review request for hive, Zoltan Haindrich, Zoltan Haindrich, Naveen Gangam, and 
Peter Vary.


Bugs: HIVE-20535
https://issues.apache.org/jira/browse/HIVE-20535


Repository: hive-git


Description
---

When removing the compile lock, it is quite risky to remove it entirely.

It would be good to provide a pool size for the concurrent compilation, so the 
administrator can limit the load


Diffs (updated)
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 3fb8e76 
  ql/src/java/org/apache/hadoop/hive/ql/CompileLockManager.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/Driver.java dad2035 
  ql/src/test/org/apache/hadoop/hive/ql/CompileLockTest.java PRE-CREATION 


Diff: https://reviews.apache.org/r/68683/diff/4/

Changes: https://reviews.apache.org/r/68683/diff/3-4/


Testing
---

Added CompileLockTest


File Attachments


HIVE-20535.1.patch
  
https://reviews.apache.org/media/uploaded/files/2018/09/13/41f5a84a-70e5-4882-99c1-1cf98c4364e4__HIVE-20535.1.patch


Thanks,

denys kuzmenko



Re: Review Request 68683: Add new configuration to set the size of the global compile lock

2018-09-13 Thread denys kuzmenko via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68683/
---

(Updated Sept. 13, 2018, 8:18 p.m.)


Review request for hive, Zoltan Haindrich, Zoltan Haindrich, and Peter Vary.


Bugs: HIVE-20535
https://issues.apache.org/jira/browse/HIVE-20535


Repository: hive-git


Description
---

When removing the compile lock, it is quite risky to remove it entirely.

It would be good to provide a pool size for the concurrent compilation, so the 
administrator can limit the load


Diffs (updated)
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java aa58d74 
  ql/src/java/org/apache/hadoop/hive/ql/Driver.java dad2035 
  ql/src/test/org/apache/hadoop/hive/ql/CompileLockTest.java PRE-CREATION 


Diff: https://reviews.apache.org/r/68683/diff/2/

Changes: https://reviews.apache.org/r/68683/diff/1-2/


Testing
---

Added CompileLockTest


File Attachments (updated)


HIVE-20535.1.patch
  
https://reviews.apache.org/media/uploaded/files/2018/09/13/41f5a84a-70e5-4882-99c1-1cf98c4364e4__HIVE-20535.1.patch


Thanks,

denys kuzmenko



Re: Review Request 68683: Add new configuration to set the size of the global compile lock

2018-09-19 Thread denys kuzmenko via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68683/
---

(Updated Sept. 19, 2018, 9:37 a.m.)


Review request for hive, Zoltan Haindrich, Zoltan Haindrich, Naveen Gangam, and 
Peter Vary.


Bugs: HIVE-20535
https://issues.apache.org/jira/browse/HIVE-20535


Repository: hive-git


Description
---

When removing the compile lock, it is quite risky to remove it entirely.

It would be good to provide a pool size for the concurrent compilation, so the 
administrator can limit the load


Diffs (updated)
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 8c39de3e77 
  ql/src/java/org/apache/hadoop/hive/ql/CompileLockManager.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/Driver.java 737debd2ad 
  ql/src/test/org/apache/hadoop/hive/ql/CompileLockTest.java PRE-CREATION 


Diff: https://reviews.apache.org/r/68683/diff/5/

Changes: https://reviews.apache.org/r/68683/diff/4-5/


Testing
---

Added CompileLockTest


File Attachments


HIVE-20535.1.patch
  
https://reviews.apache.org/media/uploaded/files/2018/09/13/41f5a84a-70e5-4882-99c1-1cf98c4364e4__HIVE-20535.1.patch


Thanks,

denys kuzmenko



Re: Review Request 68683: Add new configuration to set the size of the global compile lock

2018-09-19 Thread denys kuzmenko via Review Board


> On Sept. 17, 2018, 9:15 a.m., Zoltan Haindrich wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/Driver.java
> > Line 507 (original), 666 (patched)
> > 
> >
> > please don't make this method more visible; use compile("sel") or 
> > something...it should work
> 
> denys kuzmenko wrote:
> it's impossible to mock and test compile lock behaviour. Entry point is 
> Driver.compileAndRespond("query"). I do not want to use PowerMock. Actually I 
> tried and faced many issues with hadoop classes.
> 
> Peter Vary wrote:
> What about @VisibleForTesting annotation? It could show the intention at 
> least...

Added @VisibleForTesting annotation


- denys


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68683/#review208625
---


On Sept. 19, 2018, 9:37 a.m., denys kuzmenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68683/
> ---
> 
> (Updated Sept. 19, 2018, 9:37 a.m.)
> 
> 
> Review request for hive, Zoltan Haindrich, Zoltan Haindrich, Naveen Gangam, 
> and Peter Vary.
> 
> 
> Bugs: HIVE-20535
> https://issues.apache.org/jira/browse/HIVE-20535
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> When removing the compile lock, it is quite risky to remove it entirely.
> 
> It would be good to provide a pool size for the concurrent compilation, so 
> the administrator can limit the load
> 
> 
> Diffs
> -
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 8c39de3e77 
>   ql/src/java/org/apache/hadoop/hive/ql/CompileLockManager.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/Driver.java 737debd2ad 
>   ql/src/test/org/apache/hadoop/hive/ql/CompileLockTest.java PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/68683/diff/5/
> 
> 
> Testing
> ---
> 
> Added CompileLockTest
> 
> 
> File Attachments
> 
> 
> HIVE-20535.1.patch
>   
> https://reviews.apache.org/media/uploaded/files/2018/09/13/41f5a84a-70e5-4882-99c1-1cf98c4364e4__HIVE-20535.1.patch
> 
> 
> Thanks,
> 
> denys kuzmenko
> 
>



Re: Review Request 68683: Add new configuration to set the size of the global compile lock

2018-09-17 Thread denys kuzmenko via Review Board


> On Sept. 17, 2018, 9:15 a.m., Zoltan Haindrich wrote:
> > I'm not sure but I feel that it would be probably simpler to add something 
> > which covers some reentrant-s and semaphores.
> > It feels like this lock handling is a littlebit scattered around...I think 
> > it would be better to have them outside of the Driver class.

moved logic to CompileLockManager


> On Sept. 17, 2018, 9:15 a.m., Zoltan Haindrich wrote:
> > common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
> > Line 3062 (original), 3062 (patched)
> > 
> >
> > I think we don't want to start reinterpreting an existing option
> > 
> > consider adding:
> > 
> > * session level compilation limit
> > * global compilation limit
> > 
> > 
> > right now it seems to me that the "within the same session" is not 
> > possible right now...; because it acquires the sessionstate reentrant 
> > during quota checks

"within the same session" option is not possible right now, that is why I 
removed it from the property description as it was misleading.
Currently session level compilation limit is always 1. 

At this moment, we wanted to introduce just a global compilation limit - not to 
overwhelm the cluster. As of now, we don't have any restrictions for the number 
of compilation quaries in case parralel compilation option is set to true.

reverted.


> On Sept. 17, 2018, 9:15 a.m., Zoltan Haindrich wrote:
> > common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
> > Lines 3063 (patched)
> > 
> >
> > If we want to do session level parallelism; I think this should be 
> > renamed to:
> > 
> > hive.driver.parallel.compilation.global.limit

OK


> On Sept. 17, 2018, 9:15 a.m., Zoltan Haindrich wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/Driver.java
> > Lines 247 (patched)
> > 
> >
> > I'm not sure we gain anything by having these strings in a static block 
> > - they are only used as log messages at debug level..

It's a clean code practice (String literals)


> On Sept. 17, 2018, 9:15 a.m., Zoltan Haindrich wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/Driver.java
> > Lines 252 (patched)
> > 
> >
> > final

there is conditional logic, default value is serializableCompileLock;


> On Sept. 17, 2018, 9:15 a.m., Zoltan Haindrich wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/Driver.java
> > Lines 271 (patched)
> > 
> >
> > it seems to me that this class is not the lock itself...it instead the 
> > "thing that locks"...
> > 
> > but getInstance() gives the feeling that it's something like a 
> > singleton...this is a little bit confusing to me

externalized to CompileLockManager class


> On Sept. 17, 2018, 9:15 a.m., Zoltan Haindrich wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/Driver.java
> > Lines 275 (patched)
> > 
> >
> > intention/use mismatch:
> > 
> > this method is private; however it seems to be called from compile()

changed to public


> On Sept. 17, 2018, 9:15 a.m., Zoltan Haindrich wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/Driver.java
> > Lines 292 (patched)
> > 
> >
> > I think we don't need this "try first"
> > 
> > java.util.concurrent.locks.AbstractQueuedSynchronizer#tryAcquireNanos 
> > seems to be already doing this trick...
> > 
> > ```
> > return tryAcquire(arg) ||
> > doAcquireNanos(arg, nanosTimeout);
> > ```
> > 
> > ...or there are some other benefits?

Completely agree with this!!! I would remove this "try first" in first place, 
however I noticed that this logic was introduced intentionally: 
HIVE-14263 : Log message when HS2 query is waiting on compile lock (Tao Li via 
Thejas Nair)

Not sure how important is that and whether we can remove it???


> On Sept. 17, 2018, 9:15 a.m., Zoltan Haindrich wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/Driver.java
> > Lines 339 (patched)
> > 
> >
> > is there any reason that this Lock is an enum? :)

It's a more cleaner way to implement a singleton, than having double-ckecked 
locking. This class initiatlizes a global semaphore used to controll compile 
limit.


> On Sept. 17, 2018, 9:15 a.m., Zoltan Haindrich wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/Driver.java
> > Lines 341 (patched)
> > 
> >
> > will there be a 

Re: Review Request 68683: Add new configuration to set the size of the global compile lock

2018-09-17 Thread denys kuzmenko via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68683/
---

(Updated Sept. 17, 2018, 12:52 p.m.)


Review request for hive, Zoltan Haindrich, Zoltan Haindrich, Naveen Gangam, and 
Peter Vary.


Bugs: HIVE-20535
https://issues.apache.org/jira/browse/HIVE-20535


Repository: hive-git


Description
---

When removing the compile lock, it is quite risky to remove it entirely.

It would be good to provide a pool size for the concurrent compilation, so the 
administrator can limit the load


Diffs (updated)
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 8c39de3e77 
  ql/src/java/org/apache/hadoop/hive/ql/CompileLockManager.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/Driver.java 737debd2ad 
  ql/src/test/org/apache/hadoop/hive/ql/CompileLockTest.java PRE-CREATION 


Diff: https://reviews.apache.org/r/68683/diff/3/

Changes: https://reviews.apache.org/r/68683/diff/2-3/


Testing
---

Added CompileLockTest


File Attachments


HIVE-20535.1.patch
  
https://reviews.apache.org/media/uploaded/files/2018/09/13/41f5a84a-70e5-4882-99c1-1cf98c4364e4__HIVE-20535.1.patch


Thanks,

denys kuzmenko



Re: Review Request 70934: HIVE-18735: Create table like loses transactional attribute.

2019-06-24 Thread Denys Kuzmenko via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70934/#review216090
---




ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
Lines 13593 (patched)


You do not need return statement here. tblProps reference is used under 
updateDefaultTblProps.


- Denys Kuzmenko


On June 24, 2019, 1:01 p.m., Laszlo Pinter wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/70934/
> ---
> 
> (Updated June 24, 2019, 1:01 p.m.)
> 
> 
> Review request for hive, Eugene Koifman, Marta Kuczora, Peter Vary, and Adam 
> Szita.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-18735: Create table like loses transactional attribute.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 
> d395db1b59d021789b1bb47c7f09ff337cba2dd0 
> 
> 
> Diff: https://reviews.apache.org/r/70934/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Laszlo Pinter
> 
>



Re: Review Request 71589: Create read-only transactions

2019-10-31 Thread Denys Kuzmenko via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71589/
---

(Updated Oct. 31, 2019, 3:21 p.m.)


Review request for hive, Laszlo Pinter and Peter Vary.


Bugs: HIVE-21114
https://issues.apache.org/jira/browse/HIVE-21114


Repository: hive-git


Description
---

With HIVE-21036 we have a way to indicate that a txn is read only.
We should (at least in auto-commit mode) determine if the single stmt is a read 
and mark the txn accordingly.
Then we can optimize TxnHandler.commitTxn() so that it doesn't do any checks in 
write_set etc.

TxnHandler.commitTxn() already starts with lockTransactionRecord(stmt, txnid, 
TXN_OPEN) so it can read the txn type in the same SQL stmt.

HiveOperation only has QUERY, which includes Insert and Select, so this 
requires figuring out how to determine if a query is a SELECT. By the time 
Driver.openTransaction(); is called, we have already parsed the query so there 
should be a way to know if the statement only reads.

For multi-stmt txns (once these are supported) we should allow user to indicate 
that a txn is read-only and then not allow any statements that can make 
modifications in this txn. This should be a different jira.


Diffs (updated)
-

  ql/src/java/org/apache/hadoop/hive/ql/Driver.java 91910d1c0c 
  ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java fcf499d53a 
  ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbTxnManager.java 943aa383bb 
  ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DummyTxnManager.java ac813c8288 
  ql/src/java/org/apache/hadoop/hive/ql/lockmgr/HiveTxnManager.java 1c53426966 
  ql/src/test/org/apache/hadoop/hive/ql/lockmgr/TestDbTxnManager.java 
cc86afedbf 
  ql/src/test/org/apache/hadoop/hive/ql/parse/TestParseUtils.java PRE-CREATION 
  
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
 355c4f5374 
  
standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/TestHiveMetaStoreTxns.java
 b5f22092a9 


Diff: https://reviews.apache.org/r/71589/diff/7/

Changes: https://reviews.apache.org/r/71589/diff/6-7/


Testing
---

Unit + manual test


File Attachments


HIVE-21114.1.patch
  
https://reviews.apache.org/media/uploaded/files/2019/10/10/0929ed4a-17be-4098-8c61-0819a30613fd__HIVE-21114.1.patch
HIVE-21114.5.patch
  
https://reviews.apache.org/media/uploaded/files/2019/10/17/80cbb092-97d6-48d2-b603-24213141cb5e__HIVE-21114.5.patch
HIVE-21114.8.patch
  
https://reviews.apache.org/media/uploaded/files/2019/10/22/b14eedb4-a2f1-4f77-9676-c258b6804b98__HIVE-21114.8.patch
HIVE-21114.8.patch
  
https://reviews.apache.org/media/uploaded/files/2019/10/22/9096f402-3d2e-4cd2-9f85-df1dfeb25863__HIVE-21114.8.patch
HIVE-21114.8.patch
  
https://reviews.apache.org/media/uploaded/files/2019/10/28/a001316c-bcf4-43a2-83fa-7d49183b2a7f__HIVE-21114.8.patch
HIVE-21114.10.patch
  
https://reviews.apache.org/media/uploaded/files/2019/10/31/93854078-7b63-46ec-95a0-62ab783ee54c__HIVE-21114.10.patch


Thanks,

Denys Kuzmenko



Re: Review Request 71589: Create read-only transactions

2019-10-30 Thread Denys Kuzmenko via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71589/
---

(Updated Oct. 30, 2019, 12:23 p.m.)


Review request for hive, Laszlo Pinter and Peter Vary.


Bugs: HIVE-21114
https://issues.apache.org/jira/browse/HIVE-21114


Repository: hive-git


Description
---

With HIVE-21036 we have a way to indicate that a txn is read only.
We should (at least in auto-commit mode) determine if the single stmt is a read 
and mark the txn accordingly.
Then we can optimize TxnHandler.commitTxn() so that it doesn't do any checks in 
write_set etc.

TxnHandler.commitTxn() already starts with lockTransactionRecord(stmt, txnid, 
TXN_OPEN) so it can read the txn type in the same SQL stmt.

HiveOperation only has QUERY, which includes Insert and Select, so this 
requires figuring out how to determine if a query is a SELECT. By the time 
Driver.openTransaction(); is called, we have already parsed the query so there 
should be a way to know if the statement only reads.

For multi-stmt txns (once these are supported) we should allow user to indicate 
that a txn is read-only and then not allow any statements that can make 
modifications in this txn. This should be a different jira.


Diffs (updated)
-

  ql/src/java/org/apache/hadoop/hive/ql/Driver.java 91910d1c0c 
  ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java fcf499d53a 
  ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbTxnManager.java 943aa383bb 
  ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DummyTxnManager.java ac813c8288 
  ql/src/java/org/apache/hadoop/hive/ql/lockmgr/HiveTxnManager.java 1c53426966 
  ql/src/test/org/apache/hadoop/hive/ql/lockmgr/TestDbTxnManager.java 
cc86afedbf 
  
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
 355c4f5374 
  
standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/TestHiveMetaStoreTxns.java
 b5f22092a9 


Diff: https://reviews.apache.org/r/71589/diff/6/

Changes: https://reviews.apache.org/r/71589/diff/5-6/


Testing
---

Unit + manual test


File Attachments


HIVE-21114.1.patch
  
https://reviews.apache.org/media/uploaded/files/2019/10/10/0929ed4a-17be-4098-8c61-0819a30613fd__HIVE-21114.1.patch
HIVE-21114.5.patch
  
https://reviews.apache.org/media/uploaded/files/2019/10/17/80cbb092-97d6-48d2-b603-24213141cb5e__HIVE-21114.5.patch
HIVE-21114.8.patch
  
https://reviews.apache.org/media/uploaded/files/2019/10/22/b14eedb4-a2f1-4f77-9676-c258b6804b98__HIVE-21114.8.patch
HIVE-21114.8.patch
  
https://reviews.apache.org/media/uploaded/files/2019/10/22/9096f402-3d2e-4cd2-9f85-df1dfeb25863__HIVE-21114.8.patch
HIVE-21114.8.patch
  
https://reviews.apache.org/media/uploaded/files/2019/10/28/a001316c-bcf4-43a2-83fa-7d49183b2a7f__HIVE-21114.8.patch


Thanks,

Denys Kuzmenko



Re: Review Request 71589: Create read-only transactions

2019-10-31 Thread Denys Kuzmenko via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71589/
---

(Updated Oct. 31, 2019, 3:20 p.m.)


Review request for hive, Laszlo Pinter and Peter Vary.


Bugs: HIVE-21114
https://issues.apache.org/jira/browse/HIVE-21114


Repository: hive-git


Description
---

With HIVE-21036 we have a way to indicate that a txn is read only.
We should (at least in auto-commit mode) determine if the single stmt is a read 
and mark the txn accordingly.
Then we can optimize TxnHandler.commitTxn() so that it doesn't do any checks in 
write_set etc.

TxnHandler.commitTxn() already starts with lockTransactionRecord(stmt, txnid, 
TXN_OPEN) so it can read the txn type in the same SQL stmt.

HiveOperation only has QUERY, which includes Insert and Select, so this 
requires figuring out how to determine if a query is a SELECT. By the time 
Driver.openTransaction(); is called, we have already parsed the query so there 
should be a way to know if the statement only reads.

For multi-stmt txns (once these are supported) we should allow user to indicate 
that a txn is read-only and then not allow any statements that can make 
modifications in this txn. This should be a different jira.


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/Driver.java 91910d1c0c 
  ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java fcf499d53a 
  ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbTxnManager.java 943aa383bb 
  ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DummyTxnManager.java ac813c8288 
  ql/src/java/org/apache/hadoop/hive/ql/lockmgr/HiveTxnManager.java 1c53426966 
  ql/src/test/org/apache/hadoop/hive/ql/lockmgr/TestDbTxnManager.java 
cc86afedbf 
  
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
 355c4f5374 
  
standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/TestHiveMetaStoreTxns.java
 b5f22092a9 


Diff: https://reviews.apache.org/r/71589/diff/6/


Testing
---

Unit + manual test


File Attachments (updated)


HIVE-21114.1.patch
  
https://reviews.apache.org/media/uploaded/files/2019/10/10/0929ed4a-17be-4098-8c61-0819a30613fd__HIVE-21114.1.patch
HIVE-21114.5.patch
  
https://reviews.apache.org/media/uploaded/files/2019/10/17/80cbb092-97d6-48d2-b603-24213141cb5e__HIVE-21114.5.patch
HIVE-21114.8.patch
  
https://reviews.apache.org/media/uploaded/files/2019/10/22/b14eedb4-a2f1-4f77-9676-c258b6804b98__HIVE-21114.8.patch
HIVE-21114.8.patch
  
https://reviews.apache.org/media/uploaded/files/2019/10/22/9096f402-3d2e-4cd2-9f85-df1dfeb25863__HIVE-21114.8.patch
HIVE-21114.8.patch
  
https://reviews.apache.org/media/uploaded/files/2019/10/28/a001316c-bcf4-43a2-83fa-7d49183b2a7f__HIVE-21114.8.patch
HIVE-21114.10.patch
  
https://reviews.apache.org/media/uploaded/files/2019/10/31/93854078-7b63-46ec-95a0-62ab783ee54c__HIVE-21114.10.patch


Thanks,

Denys Kuzmenko



Re: Review Request 71792: COMPLETED_TXN_COMPONENTS table is never cleaned up unless Compactor runs

2019-11-21 Thread Denys Kuzmenko via Review Board


> On Nov. 20, 2019, 3:19 p.m., Denys Kuzmenko wrote:
> > Not ready. Need to handle aborted and currently active compactions.
> 
> Denys Kuzmenko wrote:
> Handling above cases would complicate the Initiator logic and make 
> preliminare check longer. Not sure how critial it is that in case of 
> unsuccessful compaction attempt, on next run we won't retry unless there is 
> some change to the selected table/partiotion. Any thoughts on this?

Changed findPotentialCompactions query to: 

select distinct ctc_database, ctc_table, ctc_partition from 
COMPLETED_TXN_COMPONENTS where 
(select CC_STATE from COMPLETED_COMPACTIONS where ctc_database = CC_DATABASE 
and ctc_table = CC_TABLE and (ctc_partition is null or ctc_partition = 
cc_partition)
order by cc_id desc limit 1) IN ('a', 'f') || ctc_timestamp < current_timestamp

however this still won't cover skipped compactions due to already running one


- Denys


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71792/#review218723
---


On Nov. 20, 2019, 12:20 p.m., Denys Kuzmenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71792/
> ---
> 
> (Updated Nov. 20, 2019, 12:20 p.m.)
> 
> 
> Review request for hive, Laszlo Pinter and Peter Vary.
> 
> 
> Bugs: HIVE-21917
> https://issues.apache.org/jira/browse/HIVE-21917
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> The Initiator thread in the metastore repeatedly loops over entries in the 
> COMPLETED_TXN_COMPONENTS table to determine which partitions / tables might 
> need to be compacted. However, entries are never removed from this table 
> except by a completed Compactor run.
> 
> In a cluster where most tables / partitions are write-once read-many, this 
> results in stale entries in this table never being cleaned up. In a small 
> test cluster, we have observed approximately 45k entries in this table 
> (virtually equal to the number of partitions in the cluster) while < 100 of 
> these tables have delta files at all. Since most of the tables will never get 
> enough writes to trigger a compaction (and in fact have only ever been 
> written to once), the initiator thread keeps trying to evaluate them on every 
> loop.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Initiator.java 
> 610cf05204 
>   
> ql/src/test/org/apache/hadoop/hive/metastore/txn/TestCompactionTxnHandler.java
>  b28b57779b 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/CompactionTxnHandler.java
>  8253ccb9c9 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
>  6281208247 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnStore.java
>  e840758c9d 
> 
> 
> Diff: https://reviews.apache.org/r/71792/diff/1/
> 
> 
> Testing
> ---
> 
> Unit tests
> 
> 
> Thanks,
> 
> Denys Kuzmenko
> 
>



Re: Review Request 71792: COMPLETED_TXN_COMPONENTS table is never cleaned up unless Compactor runs

2019-11-21 Thread Denys Kuzmenko via Review Board


> On Nov. 20, 2019, 3:19 p.m., Denys Kuzmenko wrote:
> > Not ready. Need to handle aborted and currently active compactions.

Handling above cases would complicate the Initiator logic and make preliminare 
check longer. Not sure how critial it is that in case of unsuccessful 
compaction attempt, on next run we won't retry unless there is some change to 
the selected table/partiotion. Any thoughts on this?


- Denys


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71792/#review218723
---


On Nov. 20, 2019, 12:20 p.m., Denys Kuzmenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71792/
> ---
> 
> (Updated Nov. 20, 2019, 12:20 p.m.)
> 
> 
> Review request for hive, Laszlo Pinter and Peter Vary.
> 
> 
> Bugs: HIVE-21917
> https://issues.apache.org/jira/browse/HIVE-21917
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> The Initiator thread in the metastore repeatedly loops over entries in the 
> COMPLETED_TXN_COMPONENTS table to determine which partitions / tables might 
> need to be compacted. However, entries are never removed from this table 
> except by a completed Compactor run.
> 
> In a cluster where most tables / partitions are write-once read-many, this 
> results in stale entries in this table never being cleaned up. In a small 
> test cluster, we have observed approximately 45k entries in this table 
> (virtually equal to the number of partitions in the cluster) while < 100 of 
> these tables have delta files at all. Since most of the tables will never get 
> enough writes to trigger a compaction (and in fact have only ever been 
> written to once), the initiator thread keeps trying to evaluate them on every 
> loop.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Initiator.java 
> 610cf05204 
>   
> ql/src/test/org/apache/hadoop/hive/metastore/txn/TestCompactionTxnHandler.java
>  b28b57779b 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/CompactionTxnHandler.java
>  8253ccb9c9 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
>  6281208247 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnStore.java
>  e840758c9d 
> 
> 
> Diff: https://reviews.apache.org/r/71792/diff/1/
> 
> 
> Testing
> ---
> 
> Unit tests
> 
> 
> Thanks,
> 
> Denys Kuzmenko
> 
>



Re: Review Request 71792: COMPLETED_TXN_COMPONENTS table is never cleaned up unless Compactor runs

2019-11-21 Thread Denys Kuzmenko via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71792/
---

(Updated Nov. 21, 2019, 5:35 p.m.)


Review request for hive, Laszlo Pinter and Peter Vary.


Bugs: HIVE-21917
https://issues.apache.org/jira/browse/HIVE-21917


Repository: hive-git


Description
---

The Initiator thread in the metastore repeatedly loops over entries in the 
COMPLETED_TXN_COMPONENTS table to determine which partitions / tables might 
need to be compacted. However, entries are never removed from this table except 
by a completed Compactor run.

In a cluster where most tables / partitions are write-once read-many, this 
results in stale entries in this table never being cleaned up. In a small test 
cluster, we have observed approximately 45k entries in this table (virtually 
equal to the number of partitions in the cluster) while < 100 of these tables 
have delta files at all. Since most of the tables will never get enough writes 
to trigger a compaction (and in fact have only ever been written to once), the 
initiator thread keeps trying to evaluate them on every loop.


Diffs (updated)
-

  ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Initiator.java 610cf05204 
  
ql/src/test/org/apache/hadoop/hive/metastore/txn/TestCompactionTxnHandler.java 
b28b57779b 
  
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/CompactionTxnHandler.java
 8253ccb9c9 
  
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
 268038795b 
  
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnStore.java
 e840758c9d 


Diff: https://reviews.apache.org/r/71792/diff/2/

Changes: https://reviews.apache.org/r/71792/diff/1-2/


Testing
---

Unit tests


Thanks,

Denys Kuzmenko



Review Request 71888: HIVE-22568: Process compaction candidates in parallel by the Initiator

2019-12-06 Thread Denys Kuzmenko via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71888/
---

Review request for hive, Laszlo Pinter and Peter Vary.


Bugs: HIVE-22568
https://issues.apache.org/jira/browse/HIVE-22568


Repository: hive-git


Description
---

`checkForCompaction` includes many file metadata checks and may be expensive. 
Therefore, make sense using a thread pool here and running 
`checkForCompactions` in parallel.


Diffs
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 4393a2825e 
  ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Initiator.java 7a0e32463d 
  ql/src/test/org/apache/hadoop/hive/ql/txn/compactor/TestInitiator.java 
564839324f 


Diff: https://reviews.apache.org/r/71888/diff/1/


Testing
---

unit test


Thanks,

Denys Kuzmenko



Re: Review Request 71589: Create read-only transactions

2019-10-28 Thread Denys Kuzmenko via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71589/
---

(Updated Oct. 28, 2019, 1:40 p.m.)


Review request for hive, Laszlo Pinter and Peter Vary.


Bugs: HIVE-21114
https://issues.apache.org/jira/browse/HIVE-21114


Repository: hive-git


Description
---

With HIVE-21036 we have a way to indicate that a txn is read only.
We should (at least in auto-commit mode) determine if the single stmt is a read 
and mark the txn accordingly.
Then we can optimize TxnHandler.commitTxn() so that it doesn't do any checks in 
write_set etc.

TxnHandler.commitTxn() already starts with lockTransactionRecord(stmt, txnid, 
TXN_OPEN) so it can read the txn type in the same SQL stmt.

HiveOperation only has QUERY, which includes Insert and Select, so this 
requires figuring out how to determine if a query is a SELECT. By the time 
Driver.openTransaction(); is called, we have already parsed the query so there 
should be a way to know if the statement only reads.

For multi-stmt txns (once these are supported) we should allow user to indicate 
that a txn is read-only and then not allow any statements that can make 
modifications in this txn. This should be a different jira.


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/Driver.java 91910d1c0c 
  ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java fcf499d53a 
  ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbTxnManager.java 943aa383bb 
  ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DummyTxnManager.java ac813c8288 
  ql/src/java/org/apache/hadoop/hive/ql/lockmgr/HiveTxnManager.java 1c53426966 
  ql/src/test/org/apache/hadoop/hive/ql/lockmgr/TestDbTxnManager.java 
cc86afedbf 
  ql/src/test/org/apache/hadoop/hive/ql/parse/TestParseUtils.java PRE-CREATION 
  
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
 504e6b12a1 
  
standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/TestHiveMetaStoreTxns.java
 46fc3ed1e0 


Diff: https://reviews.apache.org/r/71589/diff/5/


Testing
---

Unit + manual test


File Attachments (updated)


HIVE-21114.1.patch
  
https://reviews.apache.org/media/uploaded/files/2019/10/10/0929ed4a-17be-4098-8c61-0819a30613fd__HIVE-21114.1.patch
HIVE-21114.5.patch
  
https://reviews.apache.org/media/uploaded/files/2019/10/17/80cbb092-97d6-48d2-b603-24213141cb5e__HIVE-21114.5.patch
HIVE-21114.8.patch
  
https://reviews.apache.org/media/uploaded/files/2019/10/22/b14eedb4-a2f1-4f77-9676-c258b6804b98__HIVE-21114.8.patch
HIVE-21114.8.patch
  
https://reviews.apache.org/media/uploaded/files/2019/10/22/9096f402-3d2e-4cd2-9f85-df1dfeb25863__HIVE-21114.8.patch
HIVE-21114.8.patch
  
https://reviews.apache.org/media/uploaded/files/2019/10/28/a001316c-bcf4-43a2-83fa-7d49183b2a7f__HIVE-21114.8.patch


Thanks,

Denys Kuzmenko



Re: Review Request 71589: Create read-only transactions

2019-10-22 Thread Denys Kuzmenko via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71589/
---

(Updated Oct. 22, 2019, 1:24 p.m.)


Review request for hive, Laszlo Pinter and Peter Vary.


Bugs: HIVE-21114
https://issues.apache.org/jira/browse/HIVE-21114


Repository: hive-git


Description
---

With HIVE-21036 we have a way to indicate that a txn is read only.
We should (at least in auto-commit mode) determine if the single stmt is a read 
and mark the txn accordingly.
Then we can optimize TxnHandler.commitTxn() so that it doesn't do any checks in 
write_set etc.

TxnHandler.commitTxn() already starts with lockTransactionRecord(stmt, txnid, 
TXN_OPEN) so it can read the txn type in the same SQL stmt.

HiveOperation only has QUERY, which includes Insert and Select, so this 
requires figuring out how to determine if a query is a SELECT. By the time 
Driver.openTransaction(); is called, we have already parsed the query so there 
should be a way to know if the statement only reads.

For multi-stmt txns (once these are supported) we should allow user to indicate 
that a txn is read-only and then not allow any statements that can make 
modifications in this txn. This should be a different jira.


Diffs (updated)
-

  ql/src/java/org/apache/hadoop/hive/ql/Driver.java 91910d1c0c 
  ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java fcf499d53a 
  ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbTxnManager.java 943aa383bb 
  ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DummyTxnManager.java ac813c8288 
  ql/src/java/org/apache/hadoop/hive/ql/lockmgr/HiveTxnManager.java 1c53426966 
  ql/src/test/org/apache/hadoop/hive/ql/lockmgr/TestDbTxnManager.java 
cc86afedbf 
  ql/src/test/org/apache/hadoop/hive/ql/parse/TestParseUtils.java PRE-CREATION 
  
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
 504e6b12a1 
  
standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/TestHiveMetaStoreTxns.java
 46fc3ed1e0 


Diff: https://reviews.apache.org/r/71589/diff/5/

Changes: https://reviews.apache.org/r/71589/diff/4-5/


Testing
---

Unit + manual test


File Attachments (updated)


HIVE-21114.1.patch
  
https://reviews.apache.org/media/uploaded/files/2019/10/10/0929ed4a-17be-4098-8c61-0819a30613fd__HIVE-21114.1.patch
HIVE-21114.5.patch
  
https://reviews.apache.org/media/uploaded/files/2019/10/17/80cbb092-97d6-48d2-b603-24213141cb5e__HIVE-21114.5.patch
HIVE-21114.8.patch
  
https://reviews.apache.org/media/uploaded/files/2019/10/22/b14eedb4-a2f1-4f77-9676-c258b6804b98__HIVE-21114.8.patch
HIVE-21114.8.patch
  
https://reviews.apache.org/media/uploaded/files/2019/10/22/9096f402-3d2e-4cd2-9f85-df1dfeb25863__HIVE-21114.8.patch


Thanks,

Denys Kuzmenko



Re: Review Request 71589: Create read-only transactions

2019-10-17 Thread Denys Kuzmenko via Review Board


> On Oct. 17, 2019, 11:10 a.m., Peter Vary wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/Driver.java
> > Line 992 (original), 1003 (patched)
> > 
> >
> > Why is this a List of Pairs, why not just a Map? Is the order important?

Have no idea :) Originally I refactored it to the Map and than reverted back to 
List of Pairs as it required some more refactoring in Driver. If you are 
raising same concern, probably makes sence to change this.


- Denys


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71589/#review218258
---


On Oct. 17, 2019, 8:41 a.m., Denys Kuzmenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71589/
> ---
> 
> (Updated Oct. 17, 2019, 8:41 a.m.)
> 
> 
> Review request for hive, Laszlo Pinter and Peter Vary.
> 
> 
> Bugs: HIVE-21114
> https://issues.apache.org/jira/browse/HIVE-21114
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> With HIVE-21036 we have a way to indicate that a txn is read only.
> We should (at least in auto-commit mode) determine if the single stmt is a 
> read and mark the txn accordingly.
> Then we can optimize TxnHandler.commitTxn() so that it doesn't do any checks 
> in write_set etc.
> 
> TxnHandler.commitTxn() already starts with lockTransactionRecord(stmt, txnid, 
> TXN_OPEN) so it can read the txn type in the same SQL stmt.
> 
> HiveOperation only has QUERY, which includes Insert and Select, so this 
> requires figuring out how to determine if a query is a SELECT. By the time 
> Driver.openTransaction(); is called, we have already parsed the query so 
> there should be a way to know if the statement only reads.
> 
> For multi-stmt txns (once these are supported) we should allow user to 
> indicate that a txn is read-only and then not allow any statements that can 
> make modifications in this txn. This should be a different jira.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/Driver.java 91910d1c0c 
>   ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java fcf499d53a 
>   ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbTxnManager.java 943aa383bb 
>   ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DummyTxnManager.java 
> ac813c8288 
>   ql/src/java/org/apache/hadoop/hive/ql/lockmgr/HiveTxnManager.java 
> 1c53426966 
>   ql/src/test/org/apache/hadoop/hive/ql/lockmgr/TestDbTxnManager.java 
> cc86afedbf 
>   ql/src/test/org/apache/hadoop/hive/ql/parse/TestParseUtils.java 
> PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/71589/diff/3/
> 
> 
> Testing
> ---
> 
> Unit + manual test
> 
> 
> File Attachments
> 
> 
> HIVE-21114.1.patch
>   
> https://reviews.apache.org/media/uploaded/files/2019/10/10/0929ed4a-17be-4098-8c61-0819a30613fd__HIVE-21114.1.patch
> 
> 
> Thanks,
> 
> Denys Kuzmenko
> 
>



Re: Review Request 71589: Create read-only transactions

2019-10-17 Thread Denys Kuzmenko via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71589/
---

(Updated Oct. 17, 2019, 12:37 p.m.)


Review request for hive, Laszlo Pinter and Peter Vary.


Bugs: HIVE-21114
https://issues.apache.org/jira/browse/HIVE-21114


Repository: hive-git


Description
---

With HIVE-21036 we have a way to indicate that a txn is read only.
We should (at least in auto-commit mode) determine if the single stmt is a read 
and mark the txn accordingly.
Then we can optimize TxnHandler.commitTxn() so that it doesn't do any checks in 
write_set etc.

TxnHandler.commitTxn() already starts with lockTransactionRecord(stmt, txnid, 
TXN_OPEN) so it can read the txn type in the same SQL stmt.

HiveOperation only has QUERY, which includes Insert and Select, so this 
requires figuring out how to determine if a query is a SELECT. By the time 
Driver.openTransaction(); is called, we have already parsed the query so there 
should be a way to know if the statement only reads.

For multi-stmt txns (once these are supported) we should allow user to indicate 
that a txn is read-only and then not allow any statements that can make 
modifications in this txn. This should be a different jira.


Diffs (updated)
-

  ql/src/java/org/apache/hadoop/hive/ql/Driver.java 91910d1c0c 
  ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java fcf499d53a 
  ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbTxnManager.java 943aa383bb 
  ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DummyTxnManager.java ac813c8288 
  ql/src/java/org/apache/hadoop/hive/ql/lockmgr/HiveTxnManager.java 1c53426966 
  ql/src/test/org/apache/hadoop/hive/ql/lockmgr/TestDbTxnManager.java 
cc86afedbf 
  ql/src/test/org/apache/hadoop/hive/ql/parse/TestParseUtils.java PRE-CREATION 


Diff: https://reviews.apache.org/r/71589/diff/4/

Changes: https://reviews.apache.org/r/71589/diff/3-4/


Testing
---

Unit + manual test


File Attachments (updated)


HIVE-21114.1.patch
  
https://reviews.apache.org/media/uploaded/files/2019/10/10/0929ed4a-17be-4098-8c61-0819a30613fd__HIVE-21114.1.patch
HIVE-21114.5.patch
  
https://reviews.apache.org/media/uploaded/files/2019/10/17/80cbb092-97d6-48d2-b603-24213141cb5e__HIVE-21114.5.patch


Thanks,

Denys Kuzmenko



Re: Review Request 71589: Create read-only transactions

2019-10-17 Thread Denys Kuzmenko via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71589/
---

(Updated Oct. 17, 2019, 8:41 a.m.)


Review request for hive, Laszlo Pinter and Peter Vary.


Bugs: HIVE-21114
https://issues.apache.org/jira/browse/HIVE-21114


Repository: hive-git


Description
---

With HIVE-21036 we have a way to indicate that a txn is read only.
We should (at least in auto-commit mode) determine if the single stmt is a read 
and mark the txn accordingly.
Then we can optimize TxnHandler.commitTxn() so that it doesn't do any checks in 
write_set etc.

TxnHandler.commitTxn() already starts with lockTransactionRecord(stmt, txnid, 
TXN_OPEN) so it can read the txn type in the same SQL stmt.

HiveOperation only has QUERY, which includes Insert and Select, so this 
requires figuring out how to determine if a query is a SELECT. By the time 
Driver.openTransaction(); is called, we have already parsed the query so there 
should be a way to know if the statement only reads.

For multi-stmt txns (once these are supported) we should allow user to indicate 
that a txn is read-only and then not allow any statements that can make 
modifications in this txn. This should be a different jira.


Diffs (updated)
-

  ql/src/java/org/apache/hadoop/hive/ql/Driver.java 91910d1c0c 
  ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java fcf499d53a 
  ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbTxnManager.java 943aa383bb 
  ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DummyTxnManager.java ac813c8288 
  ql/src/java/org/apache/hadoop/hive/ql/lockmgr/HiveTxnManager.java 1c53426966 
  ql/src/test/org/apache/hadoop/hive/ql/lockmgr/TestDbTxnManager.java 
cc86afedbf 
  ql/src/test/org/apache/hadoop/hive/ql/parse/TestParseUtils.java PRE-CREATION 


Diff: https://reviews.apache.org/r/71589/diff/3/

Changes: https://reviews.apache.org/r/71589/diff/2-3/


Testing
---

Unit + manual test


File Attachments


HIVE-21114.1.patch
  
https://reviews.apache.org/media/uploaded/files/2019/10/10/0929ed4a-17be-4098-8c61-0819a30613fd__HIVE-21114.1.patch


Thanks,

Denys Kuzmenko



Review Request 71792: COMPLETED_TXN_COMPONENTS table is never cleaned up unless Compactor runs

2019-11-20 Thread Denys Kuzmenko via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71792/
---

Review request for hive, Laszlo Pinter and Peter Vary.


Bugs: HIVE-21917
https://issues.apache.org/jira/browse/HIVE-21917


Repository: hive-git


Description
---

The Initiator thread in the metastore repeatedly loops over entries in the 
COMPLETED_TXN_COMPONENTS table to determine which partitions / tables might 
need to be compacted. However, entries are never removed from this table except 
by a completed Compactor run.

In a cluster where most tables / partitions are write-once read-many, this 
results in stale entries in this table never being cleaned up. In a small test 
cluster, we have observed approximately 45k entries in this table (virtually 
equal to the number of partitions in the cluster) while < 100 of these tables 
have delta files at all. Since most of the tables will never get enough writes 
to trigger a compaction (and in fact have only ever been written to once), the 
initiator thread keeps trying to evaluate them on every loop.


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Initiator.java 610cf05204 
  
ql/src/test/org/apache/hadoop/hive/metastore/txn/TestCompactionTxnHandler.java 
b28b57779b 
  
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/CompactionTxnHandler.java
 8253ccb9c9 
  
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
 6281208247 
  
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnStore.java
 e840758c9d 


Diff: https://reviews.apache.org/r/71792/diff/1/


Testing
---

Unit tests


Thanks,

Denys Kuzmenko



Re: Review Request 71792: COMPLETED_TXN_COMPONENTS table is never cleaned up unless Compactor runs

2019-11-20 Thread Denys Kuzmenko via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71792/#review218723
---



Not ready. Need to handle aborted and currently active compactions.

- Denys Kuzmenko


On Nov. 20, 2019, 12:20 p.m., Denys Kuzmenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71792/
> ---
> 
> (Updated Nov. 20, 2019, 12:20 p.m.)
> 
> 
> Review request for hive, Laszlo Pinter and Peter Vary.
> 
> 
> Bugs: HIVE-21917
> https://issues.apache.org/jira/browse/HIVE-21917
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> The Initiator thread in the metastore repeatedly loops over entries in the 
> COMPLETED_TXN_COMPONENTS table to determine which partitions / tables might 
> need to be compacted. However, entries are never removed from this table 
> except by a completed Compactor run.
> 
> In a cluster where most tables / partitions are write-once read-many, this 
> results in stale entries in this table never being cleaned up. In a small 
> test cluster, we have observed approximately 45k entries in this table 
> (virtually equal to the number of partitions in the cluster) while < 100 of 
> these tables have delta files at all. Since most of the tables will never get 
> enough writes to trigger a compaction (and in fact have only ever been 
> written to once), the initiator thread keeps trying to evaluate them on every 
> loop.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Initiator.java 
> 610cf05204 
>   
> ql/src/test/org/apache/hadoop/hive/metastore/txn/TestCompactionTxnHandler.java
>  b28b57779b 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/CompactionTxnHandler.java
>  8253ccb9c9 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
>  6281208247 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnStore.java
>  e840758c9d 
> 
> 
> Diff: https://reviews.apache.org/r/71792/diff/1/
> 
> 
> Testing
> ---
> 
> Unit tests
> 
> 
> Thanks,
> 
> Denys Kuzmenko
> 
>



Re: Review Request 71812: HIVE-22534: ACID: Improve Compactor thread logging

2019-11-28 Thread Denys Kuzmenko via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71812/#review218847
---




ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/MmMajorQueryCompactor.java
Line 132 (original), 132 (patched)


Redundant


- Denys Kuzmenko


On Nov. 25, 2019, 12:18 p.m., Laszlo Pinter wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71812/
> ---
> 
> (Updated Nov. 25, 2019, 12:18 p.m.)
> 
> 
> Review request for hive, Denys Kuzmenko and Peter Vary.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-22534: ACID: Improve Compactor thread logging
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorMR.java 
> ee2c0f3e23ed716f3de0a2740a96a7ec39251bc2 
>   
> ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/MajorQueryCompactor.java 
> 10681c0202a32c338e58b3e2eede03657a00774f 
>   
> ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/MmMajorQueryCompactor.java
>  f7e0a85c1f595bb4f112aa051779db3f00c8e572 
>   ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/QueryCompactor.java 
> 80119de22f602d9e3cb7a1f60b48e05a37c6a047 
>   
> ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/QueryCompactorFactory.java
>  41cb4b64fbc79dcf81919769c567b26a2e18cfe5 
>   ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Worker.java 
> 3270175a80992e0efb1e0bfd1f33ffd8a96fcf87 
> 
> 
> Diff: https://reviews.apache.org/r/71812/diff/2/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Laszlo Pinter
> 
>



Re: Review Request 71705: HIVE-22420: DbTxnManager.stopHeartbeat() should be thread-safe

2019-11-04 Thread Denys Kuzmenko via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71705/#review218490
---




ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbTxnManager.java
Lines 539 (patched)


should it be thread safe?



ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbTxnManager.java
Lines 677 (patched)


Magic number. Why exactly 31 sec?


- Denys Kuzmenko


On Nov. 4, 2019, 10:02 a.m., Aron Hamvas wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71705/
> ---
> 
> (Updated Nov. 4, 2019, 10:02 a.m.)
> 
> 
> Review request for hive, Denys Kuzmenko, Marta Kuczora, Laszlo Pinter, and 
> Peter Vary.
> 
> 
> Bugs: HIVE-22420
> https://issues.apache.org/jira/browse/HIVE-22420
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> DbTxnManager.stopHeartbeat() should be thread-safe, rollbackTxn() should 
> check before calling MS client, if txn has been closed previously, to avoid 
> sending 0 as txnId.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbTxnManager.java 943aa383bb 
> 
> 
> Diff: https://reviews.apache.org/r/71705/diff/2/
> 
> 
> Testing
> ---
> 
> unit test
> 
> 
> Thanks,
> 
> Aron Hamvas
> 
>



Re: Review Request 71705: HIVE-22420: DbTxnManager.stopHeartbeat() should be thread-safe

2019-11-04 Thread Denys Kuzmenko via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71705/#review218491
---



LGTM, just minor comments. Also you mentioned unit test, do not see it in a 
patch.

- Denys Kuzmenko


On Nov. 4, 2019, 10:02 a.m., Aron Hamvas wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71705/
> ---
> 
> (Updated Nov. 4, 2019, 10:02 a.m.)
> 
> 
> Review request for hive, Denys Kuzmenko, Marta Kuczora, Laszlo Pinter, and 
> Peter Vary.
> 
> 
> Bugs: HIVE-22420
> https://issues.apache.org/jira/browse/HIVE-22420
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> DbTxnManager.stopHeartbeat() should be thread-safe, rollbackTxn() should 
> check before calling MS client, if txn has been closed previously, to avoid 
> sending 0 as txnId.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbTxnManager.java 943aa383bb 
> 
> 
> Diff: https://reviews.apache.org/r/71705/diff/2/
> 
> 
> Testing
> ---
> 
> unit test
> 
> 
> Thanks,
> 
> Aron Hamvas
> 
>



Re: Review Request 71589: Create read-only transactions

2019-10-10 Thread Denys Kuzmenko via Review Board


> On Oct. 10, 2019, 7:46 a.m., Peter Vary wrote:
> > ql/src/test/org/apache/hadoop/hive/ql/parse/TestParseUtils.java
> > Lines 47 (patched)
> > 
> >
> > What about CREATE TABLE AS SELECT * FROM...?
> > We still might miss some cases.
> > 
> > I would create an assertion on generating writeId for read only 
> > transactions (this would be usefull anyway), and use the ptest to run on 
> > all of the test cases to see if this assertion breaks anything.
> > 
> > What do you think?

Hi Peter, yes, I was thinking about something similar, however most propably it 
would be one time check (won't be commited). 
Somewhere in Driver.compile method, after SemanticAnalyzer add assert if 
transaction marked as ReadOnly doesn't have assosiated write ids. 
This way we could make sure we do not misclasify some of the existing queries. 
Does this makes sence?


- Denys


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71589/#review218174
---


On Oct. 8, 2019, 2:27 p.m., Denys Kuzmenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71589/
> ---
> 
> (Updated Oct. 8, 2019, 2:27 p.m.)
> 
> 
> Review request for hive, Laszlo Pinter and Peter Vary.
> 
> 
> Bugs: HIVE-21114
> https://issues.apache.org/jira/browse/HIVE-21114
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> With HIVE-21036 we have a way to indicate that a txn is read only.
> We should (at least in auto-commit mode) determine if the single stmt is a 
> read and mark the txn accordingly.
> Then we can optimize TxnHandler.commitTxn() so that it doesn't do any checks 
> in write_set etc.
> 
> TxnHandler.commitTxn() already starts with lockTransactionRecord(stmt, txnid, 
> TXN_OPEN) so it can read the txn type in the same SQL stmt.
> 
> HiveOperation only has QUERY, which includes Insert and Select, so this 
> requires figuring out how to determine if a query is a SELECT. By the time 
> Driver.openTransaction(); is called, we have already parsed the query so 
> there should be a way to know if the statement only reads.
> 
> For multi-stmt txns (once these are supported) we should allow user to 
> indicate that a txn is read-only and then not allow any statements that can 
> make modifications in this txn. This should be a different jira.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/Driver.java bcd4600683 
>   ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java fcf499d53a 
>   ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbTxnManager.java 943aa383bb 
>   ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DummyTxnManager.java 
> ac813c8288 
>   ql/src/java/org/apache/hadoop/hive/ql/lockmgr/HiveTxnManager.java 
> 1c53426966 
>   ql/src/test/org/apache/hadoop/hive/ql/lockmgr/TestDbTxnManager.java 
> cc86afedbf 
>   ql/src/test/org/apache/hadoop/hive/ql/parse/TestParseUtils.java 
> PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/71589/diff/1/
> 
> 
> Testing
> ---
> 
> Unit + manual test
> 
> 
> Thanks,
> 
> Denys Kuzmenko
> 
>



Re: Review Request 71589: Create read-only transactions

2019-10-10 Thread Denys Kuzmenko via Review Board


> On Oct. 10, 2019, 7:46 a.m., Peter Vary wrote:
> > ql/src/test/org/apache/hadoop/hive/ql/parse/TestParseUtils.java
> > Lines 47 (patched)
> > 
> >
> > What about CREATE TABLE AS SELECT * FROM...?
> > We still might miss some cases.
> > 
> > I would create an assertion on generating writeId for read only 
> > transactions (this would be usefull anyway), and use the ptest to run on 
> > all of the test cases to see if this assertion breaks anything.
> > 
> > What do you think?
> 
> Denys Kuzmenko wrote:
> Hi Peter, yes, I was thinking about something similar, however most 
> propably it would be one time check (won't be commited). 
> Somewhere in Driver.compile method, after SemanticAnalyzer add assert if 
> transaction marked as ReadOnly doesn't have assosiated write ids. 
> This way we could make sure we do not misclasify some of the existing 
> queries. Does this makes sence?
> 
> Peter Vary wrote:
> I would vote for the check to be committed for several reasons:
> - We might cause strange/flaky errors if we assume that a transacion is 
> RO, but in reality it writes something. Easier to catch this if we fail fast.
> - When introducing new commands, it would be easy to forget to update 
> this check, but if the assertion is there we will catch them compile time - 
> again fail fast
> 
> Just my 2 cents

Agreed! Thank you, Peter!


- Denys


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71589/#review218174
---


On Oct. 8, 2019, 2:27 p.m., Denys Kuzmenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71589/
> ---
> 
> (Updated Oct. 8, 2019, 2:27 p.m.)
> 
> 
> Review request for hive, Laszlo Pinter and Peter Vary.
> 
> 
> Bugs: HIVE-21114
> https://issues.apache.org/jira/browse/HIVE-21114
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> With HIVE-21036 we have a way to indicate that a txn is read only.
> We should (at least in auto-commit mode) determine if the single stmt is a 
> read and mark the txn accordingly.
> Then we can optimize TxnHandler.commitTxn() so that it doesn't do any checks 
> in write_set etc.
> 
> TxnHandler.commitTxn() already starts with lockTransactionRecord(stmt, txnid, 
> TXN_OPEN) so it can read the txn type in the same SQL stmt.
> 
> HiveOperation only has QUERY, which includes Insert and Select, so this 
> requires figuring out how to determine if a query is a SELECT. By the time 
> Driver.openTransaction(); is called, we have already parsed the query so 
> there should be a way to know if the statement only reads.
> 
> For multi-stmt txns (once these are supported) we should allow user to 
> indicate that a txn is read-only and then not allow any statements that can 
> make modifications in this txn. This should be a different jira.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/Driver.java bcd4600683 
>   ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java fcf499d53a 
>   ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbTxnManager.java 943aa383bb 
>   ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DummyTxnManager.java 
> ac813c8288 
>   ql/src/java/org/apache/hadoop/hive/ql/lockmgr/HiveTxnManager.java 
> 1c53426966 
>   ql/src/test/org/apache/hadoop/hive/ql/lockmgr/TestDbTxnManager.java 
> cc86afedbf 
>   ql/src/test/org/apache/hadoop/hive/ql/parse/TestParseUtils.java 
> PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/71589/diff/1/
> 
> 
> Testing
> ---
> 
> Unit + manual test
> 
> 
> Thanks,
> 
> Denys Kuzmenko
> 
>



Re: Review Request 71589: Create read-only transactions

2019-10-10 Thread Denys Kuzmenko via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71589/
---

(Updated Oct. 10, 2019, 4:09 p.m.)


Review request for hive, Laszlo Pinter and Peter Vary.


Bugs: HIVE-21114
https://issues.apache.org/jira/browse/HIVE-21114


Repository: hive-git


Description
---

With HIVE-21036 we have a way to indicate that a txn is read only.
We should (at least in auto-commit mode) determine if the single stmt is a read 
and mark the txn accordingly.
Then we can optimize TxnHandler.commitTxn() so that it doesn't do any checks in 
write_set etc.

TxnHandler.commitTxn() already starts with lockTransactionRecord(stmt, txnid, 
TXN_OPEN) so it can read the txn type in the same SQL stmt.

HiveOperation only has QUERY, which includes Insert and Select, so this 
requires figuring out how to determine if a query is a SELECT. By the time 
Driver.openTransaction(); is called, we have already parsed the query so there 
should be a way to know if the statement only reads.

For multi-stmt txns (once these are supported) we should allow user to indicate 
that a txn is read-only and then not allow any statements that can make 
modifications in this txn. This should be a different jira.


Diffs (updated)
-

  ql/src/java/org/apache/hadoop/hive/ql/Driver.java bcd4600683 
  ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java fcf499d53a 
  ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbTxnManager.java 943aa383bb 
  ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DummyTxnManager.java ac813c8288 
  ql/src/java/org/apache/hadoop/hive/ql/lockmgr/HiveTxnManager.java 1c53426966 
  ql/src/test/org/apache/hadoop/hive/ql/lockmgr/TestDbTxnManager.java 
cc86afedbf 
  ql/src/test/org/apache/hadoop/hive/ql/parse/TestParseUtils.java PRE-CREATION 


Diff: https://reviews.apache.org/r/71589/diff/2/

Changes: https://reviews.apache.org/r/71589/diff/1-2/


Testing
---

Unit + manual test


File Attachments


HIVE-21114.1.patch
  
https://reviews.apache.org/media/uploaded/files/2019/10/10/0929ed4a-17be-4098-8c61-0819a30613fd__HIVE-21114.1.patch


Thanks,

Denys Kuzmenko



Re: Review Request 71589: Create read-only transactions

2019-10-10 Thread Denys Kuzmenko via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71589/
---

(Updated Oct. 10, 2019, 4:09 p.m.)


Review request for hive, Laszlo Pinter and Peter Vary.


Bugs: HIVE-21114
https://issues.apache.org/jira/browse/HIVE-21114


Repository: hive-git


Description
---

With HIVE-21036 we have a way to indicate that a txn is read only.
We should (at least in auto-commit mode) determine if the single stmt is a read 
and mark the txn accordingly.
Then we can optimize TxnHandler.commitTxn() so that it doesn't do any checks in 
write_set etc.

TxnHandler.commitTxn() already starts with lockTransactionRecord(stmt, txnid, 
TXN_OPEN) so it can read the txn type in the same SQL stmt.

HiveOperation only has QUERY, which includes Insert and Select, so this 
requires figuring out how to determine if a query is a SELECT. By the time 
Driver.openTransaction(); is called, we have already parsed the query so there 
should be a way to know if the statement only reads.

For multi-stmt txns (once these are supported) we should allow user to indicate 
that a txn is read-only and then not allow any statements that can make 
modifications in this txn. This should be a different jira.


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/Driver.java bcd4600683 
  ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java fcf499d53a 
  ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbTxnManager.java 943aa383bb 
  ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DummyTxnManager.java ac813c8288 
  ql/src/java/org/apache/hadoop/hive/ql/lockmgr/HiveTxnManager.java 1c53426966 
  ql/src/test/org/apache/hadoop/hive/ql/lockmgr/TestDbTxnManager.java 
cc86afedbf 
  ql/src/test/org/apache/hadoop/hive/ql/parse/TestParseUtils.java PRE-CREATION 


Diff: https://reviews.apache.org/r/71589/diff/1/


Testing
---

Unit + manual test


File Attachments (updated)


HIVE-21114.1.patch
  
https://reviews.apache.org/media/uploaded/files/2019/10/10/0929ed4a-17be-4098-8c61-0819a30613fd__HIVE-21114.1.patch


Thanks,

Denys Kuzmenko



Review Request 71589: Create read-only transactions

2019-10-08 Thread Denys Kuzmenko via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71589/
---

Review request for hive, Laszlo Pinter and Peter Vary.


Bugs: HIVE-21114
https://issues.apache.org/jira/browse/HIVE-21114


Repository: hive-git


Description
---

With HIVE-21036 we have a way to indicate that a txn is read only.
We should (at least in auto-commit mode) determine if the single stmt is a read 
and mark the txn accordingly.
Then we can optimize TxnHandler.commitTxn() so that it doesn't do any checks in 
write_set etc.

TxnHandler.commitTxn() already starts with lockTransactionRecord(stmt, txnid, 
TXN_OPEN) so it can read the txn type in the same SQL stmt.

HiveOperation only has QUERY, which includes Insert and Select, so this 
requires figuring out how to determine if a query is a SELECT. By the time 
Driver.openTransaction(); is called, we have already parsed the query so there 
should be a way to know if the statement only reads.

For multi-stmt txns (once these are supported) we should allow user to indicate 
that a txn is read-only and then not allow any statements that can make 
modifications in this txn. This should be a different jira.


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/Driver.java bcd4600683 
  ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java fcf499d53a 
  ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbTxnManager.java 943aa383bb 
  ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DummyTxnManager.java ac813c8288 
  ql/src/java/org/apache/hadoop/hive/ql/lockmgr/HiveTxnManager.java 1c53426966 
  ql/src/test/org/apache/hadoop/hive/ql/lockmgr/TestDbTxnManager.java 
cc86afedbf 
  ql/src/test/org/apache/hadoop/hive/ql/parse/TestParseUtils.java PRE-CREATION 


Diff: https://reviews.apache.org/r/71589/diff/1/


Testing
---

Unit + manual test


Thanks,

Denys Kuzmenko



Re: Review Request 72129: HIVE-22850: Optimise lock acquisition in TxnHandler

2020-02-13 Thread Denys Kuzmenko via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72129/#review219572
---




standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
Lines 4546 (patched)


We shouldn't remove NULL partitions (lock on db/table level).


- Denys Kuzmenko


On Feb. 13, 2020, 11:40 a.m., Rajesh Balamohan wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72129/
> ---
> 
> (Updated Feb. 13, 2020, 11:40 a.m.)
> 
> 
> Review request for hive, Gopal V, Peter Vary, and Zoltan Chovan.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> - Main change is in TxnHandler::checkLock. 
> - When all incoming requests are SHARED_READ, we can add a condition in 
> the query to retrieve only relevant rows. This avoids significant number of 
> rows fetched in the form of "SHARED_READ + ACQUIRED". There is a corner 
> condition of "SHARED_WRITE --> SHARED_READ::ACQUIRED", which is misleading in 
> the jumpttable. This condition can be optimised later.
> - Also, removed the "HL_PARTITION IN" clause which could potentially 
> overflow for oracle. Partition details can be filtered out, if the earlier 
> query actually returned any rows.
> - Rest of the changes, are related to refactoring 
> "TxnHandler::enqueueLockWithRetry" to reduce lock scope.
> 
> 
> Diffs
> -
> 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
>  f53aebe4ad 
> 
> 
> Diff: https://reviews.apache.org/r/72129/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Rajesh Balamohan
> 
>



Re: Review Request 72129: HIVE-22850: Optimise lock acquisition in TxnHandler

2020-02-13 Thread Denys Kuzmenko via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72129/#review219575
---




standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
Lines 4546 (patched)


Hi Rajesh, could you please explain, what is the reason of doing partition 
filtering on HMS side, not backend db?


- Denys Kuzmenko


On Feb. 13, 2020, 1:22 p.m., Rajesh Balamohan wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72129/
> ---
> 
> (Updated Feb. 13, 2020, 1:22 p.m.)
> 
> 
> Review request for hive, Gopal V, Peter Vary, and Zoltan Chovan.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> - Main change is in TxnHandler::checkLock. 
> - When all incoming requests are SHARED_READ, we can add a condition in 
> the query to retrieve only relevant rows. This avoids significant number of 
> rows fetched in the form of "SHARED_READ + ACQUIRED". There is a corner 
> condition of "SHARED_WRITE --> SHARED_READ::ACQUIRED", which is misleading in 
> the jumpttable. This condition can be optimised later.
> - Also, removed the "HL_PARTITION IN" clause which could potentially 
> overflow for oracle. Partition details can be filtered out, if the earlier 
> query actually returned any rows.
> - Rest of the changes, are related to refactoring 
> "TxnHandler::enqueueLockWithRetry" to reduce lock scope.
> 
> 
> Diffs
> -
> 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
>  f53aebe4ad 
> 
> 
> Diff: https://reviews.apache.org/r/72129/diff/2/
> 
> 
> Testing
> ---
> 
> 
> File Attachments
> 
> 
> HIVE-22850.5.patch
>   
> https://reviews.apache.org/media/uploaded/files/2020/02/13/74ec6cbd-c552-4d46-b5a6-e2fa6da41bdc__HIVE-22850.5.patch
> 
> 
> Thanks,
> 
> Rajesh Balamohan
> 
>



Re: Review Request 72151: HIVE-22376: Cancelled query still prints exception if it was stuck in waiting for lock

2020-02-19 Thread Denys Kuzmenko via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72151/#review219612
---




ql/src/java/org/apache/hadoop/hive/ql/Driver.java
Lines 400 (patched)


Could you please elaborate on terminal state. Maybe I am lacking the 
context (is it only when query is cancelled), but it's not obvious for me what 
might lead us to this state.


- Denys Kuzmenko


On Feb. 19, 2020, 9:21 a.m., Aron Hamvas wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72151/
> ---
> 
> (Updated Feb. 19, 2020, 9:21 a.m.)
> 
> 
> Review request for hive, Denys Kuzmenko, Marta Kuczora, Laszlo Pinter, and 
> Peter Vary.
> 
> 
> Bugs: HIVE-22376
> https://issues.apache.org/jira/browse/HIVE-22376
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> If the reason for lock acquisition failure is that the query is canceled, the 
> exception should be ignored.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/Driver.java 
> 48ebc4f87071bae4cc39309ada8d90dfc5c64f5b 
> 
> 
> Diff: https://reviews.apache.org/r/72151/diff/2/
> 
> 
> Testing
> ---
> 
> 
> File Attachments
> 
> 
> HIVE-22376.patch
>   
> https://reviews.apache.org/media/uploaded/files/2020/02/19/41c20e4c-57f6-46cc-988c-3b76041aa7ee__HIVE-22376.patch
> 
> 
> Thanks,
> 
> Aron Hamvas
> 
>



Re: Review Request 72129: HIVE-22850: Optimise lock acquisition in TxnHandler

2020-02-14 Thread Denys Kuzmenko via Review Board


> On Feb. 13, 2020, 3:08 p.m., Denys Kuzmenko wrote:
> > standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
> > Lines 4546 (patched)
> > 
> >
> > Hi Rajesh, could you please explain, what is the reason of doing 
> > partition filtering on HMS side, not backend db?
> 
> Rajesh Balamohan wrote:
> By adding all the partition details, the query can become large and has 
> the issue of overflowing in the case of oracle (i,e have to batch with 1000 
> entries). Also, its incurs parsing in sql server side, as it is executed as 
> Statement. Given that we have additional filter now, it would make it a lot 
> simpler to do this in client side.  This was pointed out in the JIRA by Gopal.

Can we rewrite the query with JOIN operator? Somethinkg like:
https://issues.apache.org/jira/browse/HIVE-22888


- Denys


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72129/#review219575
---


On Feb. 14, 2020, 1:27 a.m., Rajesh Balamohan wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72129/
> ---
> 
> (Updated Feb. 14, 2020, 1:27 a.m.)
> 
> 
> Review request for hive, Gopal V, Peter Vary, and Zoltan Chovan.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> - Main change is in TxnHandler::checkLock. 
> - When all incoming requests are SHARED_READ, we can add a condition in 
> the query to retrieve only relevant rows. This avoids significant number of 
> rows fetched in the form of "SHARED_READ + ACQUIRED". There is a corner 
> condition of "SHARED_WRITE --> SHARED_READ::ACQUIRED", which is misleading in 
> the jumpttable. This condition can be optimised later.
> - Also, removed the "HL_PARTITION IN" clause which could potentially 
> overflow for oracle. Partition details can be filtered out, if the earlier 
> query actually returned any rows.
> - Rest of the changes, are related to refactoring 
> "TxnHandler::enqueueLockWithRetry" to reduce lock scope.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbLockManager.java a8b9653411 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
>  f53aebe4ad 
> 
> 
> Diff: https://reviews.apache.org/r/72129/diff/3/
> 
> 
> Testing
> ---
> 
> 
> File Attachments
> 
> 
> HIVE-22850.5.patch
>   
> https://reviews.apache.org/media/uploaded/files/2020/02/13/74ec6cbd-c552-4d46-b5a6-e2fa6da41bdc__HIVE-22850.5.patch
> 
> 
> Thanks,
> 
> Rajesh Balamohan
> 
>



Re: Review Request 71988: HIVE-22703: Compaction configuration check when starting HMS/HS2

2020-01-15 Thread Denys Kuzmenko via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71988/#review219268
---


Ship it!




Ship It!

- Denys Kuzmenko


On Jan. 13, 2020, 2:12 p.m., Laszlo Pinter wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71988/
> ---
> 
> (Updated Jan. 13, 2020, 2:12 p.m.)
> 
> 
> Review request for hive, Denys Kuzmenko, Karen Coppage, and Peter Vary.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-22703: Compaction configuration check when starting HMS/HS2
> 
> 
> Diffs
> -
> 
>   service/src/java/org/apache/hive/service/server/HiveServer2.java 
> a93cc1b7e1aadf6e2724d667b6e4c9c9ecc38a75 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
>  94698e6771890a050b00b374ca0ee926f768aa0e 
> 
> 
> Diff: https://reviews.apache.org/r/71988/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Laszlo Pinter
> 
>



Re: Review Request 71988: HIVE-22703: Compaction configuration check when starting HMS/HS2

2020-01-13 Thread Denys Kuzmenko via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71988/#review219237
---



LGTM, just minor comments


service/src/java/org/apache/hive/service/server/HiveServer2.java
Lines 437 (patched)


do we have constant for "hs2" literal?



standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
Lines 10135 (patched)


small nit: you can use static import to reduce log command length (i.e. use 
LOG instead of HMSHandler.LOG)


- Denys Kuzmenko


On Jan. 13, 2020, 2:12 p.m., Laszlo Pinter wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71988/
> ---
> 
> (Updated Jan. 13, 2020, 2:12 p.m.)
> 
> 
> Review request for hive, Denys Kuzmenko, Karen Coppage, and Peter Vary.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-22703: Compaction configuration check when starting HMS/HS2
> 
> 
> Diffs
> -
> 
>   service/src/java/org/apache/hive/service/server/HiveServer2.java 
> a93cc1b7e1aadf6e2724d667b6e4c9c9ecc38a75 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
>  94698e6771890a050b00b374ca0ee926f768aa0e 
> 
> 
> Diff: https://reviews.apache.org/r/71988/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Laszlo Pinter
> 
>



Re: Review Request 72028: HIVE-22729: Provide a failure reason for failed compactions

2020-01-20 Thread Denys Kuzmenko via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72028/#review219335
---


Ship it!




LGTM, 1 minor comment


ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Initiator.java
Lines 127 (patched)


Wouldn't it be helpul to add treshold param for more clarity ("Compaction 
is not initiated since last 
HiveConf.ConfVars.COMPACTOR_INITIATOR_FAILED_THRESHOLD consecutive compaction 
attemps failed) ?


- Denys Kuzmenko


On Jan. 20, 2020, 12:51 p.m., Laszlo Pinter wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72028/
> ---
> 
> (Updated Jan. 20, 2020, 12:51 p.m.)
> 
> 
> Review request for hive, Denys Kuzmenko, Karen Coppage, and Peter Vary.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-22729: Provide a failure reason for failed compactions
> 
> 
> Diffs
> -
> 
>   
> ql/src/java/org/apache/hadoop/hive/ql/ddl/process/show/compactions/ShowCompactionsDesc.java
>  9348efc5a12b50f55f5952094882e941158405fd 
>   
> ql/src/java/org/apache/hadoop/hive/ql/ddl/process/show/compactions/ShowCompactionsOperation.java
>  517d88237cc3b8f0316727bf1eebfc6535152fae 
>   ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Cleaner.java 
> 6f642901203ab73699ed694009d48ca77263fb10 
>   ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Initiator.java 
> 6017fd31b1c180c144c0d37dddc56c44d9d8a05b 
>   ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Worker.java 
> 5aff71e0e981c429f85663300d3e5c21089529a9 
>   
> ql/src/test/org/apache/hadoop/hive/metastore/txn/TestCompactionTxnHandler.java
>  e5895547e6006f30a37b5ba0b1ce42253129d3b6 
>   
> standalone-metastore/metastore-common/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/CompactionInfoStruct.java
>  4aee45ce5f0e534823194bc84d13b88210ce0b3c 
>   
> standalone-metastore/metastore-common/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/ShowCompactResponseElement.java
>  8a5682a013b24f8dcf7ad3fdb0b0b606d82cc7c0 
>   
> standalone-metastore/metastore-common/src/gen/thrift/gen-php/metastore/Types.php
>  093ad4be273f3544e013c608ba3ca0a66cfaea20 
>   
> standalone-metastore/metastore-common/src/gen/thrift/gen-py/hive_metastore/ttypes.py
>  0dcca59b6812a892055d5d0615c1c479e8d8d030 
>   
> standalone-metastore/metastore-common/src/gen/thrift/gen-rb/hive_metastore_types.rb
>  8d7c32a7658535b2a55695813f5e98a95f79f0a4 
>   standalone-metastore/metastore-common/src/main/thrift/hive_metastore.thrift 
> 72ccdd1a0f62ed1c034b17dfbd84a8ab3f49befd 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/CompactionInfo.java
>  ba45f3945274853fdc84487d93c4c00ff2982541 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/CompactionTxnHandler.java
>  aded6f5486cc840f397347b39049310009fd3bad 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnDbUtil.java
>  da5dd61d08e2ca8fe5e80ffdf9fb4a6f4c4d0ba3 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
>  1dc3867929fcdfef53da4859fbe8de2e89435166 
>   
> standalone-metastore/metastore-server/src/main/sql/derby/hive-schema-4.0.0.derby.sql
>  67102718867233f29ddb2ea8ec3fbcb6560c6c30 
>   
> standalone-metastore/metastore-server/src/main/sql/derby/upgrade-3.2.0-to-4.0.0.derby.sql
>  ae0a32541a4bb9179b2bb71ae9f9098d7b35a88e 
>   
> standalone-metastore/metastore-server/src/main/sql/mssql/hive-schema-4.0.0.mssql.sql
>  221d4f1fffb682aaec3af22a339e7a3077a75f6a 
>   
> standalone-metastore/metastore-server/src/main/sql/mssql/upgrade-3.2.0-to-4.0.0.mssql.sql
>  bc98d5fc4a5637988c97f2e5a0e02d3be16ae0cb 
>   
> standalone-metastore/metastore-server/src/main/sql/mysql/hive-schema-4.0.0.mysql.sql
>  dd761a66db4826580a67d64879e4c85278b8e20c 
>   
> standalone-metastore/metastore-server/src/main/sql/mysql/upgrade-3.2.0-to-4.0.0.mysql.sql
>  6a040a6a64c2086b5eb68a397697c9e2d2ca4d76 
>   
> standalone-metastore/metastore-server/src/main/sql/oracle/hive-schema-4.0.0.oracle.sql
>  f5ec1ba1aff89d02b66d6a2cd1da8de1b3b08d06 
>   
> standalone-metastore/metastore-server/src/main/sql/oracle/upgrade-3.2.0-to-4.0.0.oracle.sql
>  c7738be2732b839aa2b460733c092e368909f935 
>   
> standalone-metastore/metastore-server/src/main/sql/postgres/hive-schema-4.0.0.postgres.sql
>  455f98b72578ff977e29301cd2fc595ae80ee4ca 
>   
> standalone-metastore/metastore-server/src/main/sql/postgres/upgrade-3.2.0-to-4.0.0.postgres.sql
>  5c39b0d9f4d27ab82ef44392818c1810cb7664ce 
> 
> 
> Diff: https://reviews.apache.org/r/72028/diff/1/
> 
> 
> Testing
> ---
> 
> 
> 

Re: Review Request 72324: HIVE-22750: Consolidate LockType naming

2020-04-08 Thread Denys Kuzmenko via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72324/#review220256
---




metastore/scripts/upgrade/hive/hive-schema-4.0.0.hive.sql
Line 1421 (original), 1421 (patched)


where is excl_write?


- Denys Kuzmenko


On April 6, 2020, 1:58 p.m., Marton Bod wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72324/
> ---
> 
> (Updated April 6, 2020, 1:58 p.m.)
> 
> 
> Review request for hive, Denys Kuzmenko, Peter Vary, and Zoltan Chovan.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-22750: Consolidate LockType naming
> 
> 
> Diffs
> -
> 
>   
> hcatalog/streaming/src/java/org/apache/hive/hcatalog/streaming/HiveEndPoint.java
>  e249b7775e 
>   
> hcatalog/streaming/src/java/org/apache/hive/hcatalog/streaming/mutate/client/lock/Lock.java
>  52eb6133e7 
>   metastore/scripts/upgrade/hive/hive-schema-4.0.0.hive.sql 03540bba4d 
>   metastore/scripts/upgrade/hive/upgrade-3.1.0-to-4.0.0.hive.sql fa518747de 
>   ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java 17e6cdf162 
>   ql/src/test/org/apache/hadoop/hive/metastore/txn/TestTxnHandler.java 
> 72f095d264 
>   ql/src/test/org/apache/hadoop/hive/ql/lockmgr/TestDbTxnManager2.java 
> 80fb1aff78 
>   
> standalone-metastore/metastore-common/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/LockType.java
>  8ae4351129 
>   
> standalone-metastore/metastore-common/src/gen/thrift/gen-php/metastore/Types.php
>  db4cfb996a 
>   
> standalone-metastore/metastore-common/src/gen/thrift/gen-py/hive_metastore/ttypes.py
>  cf3137928f 
>   
> standalone-metastore/metastore-common/src/gen/thrift/gen-rb/hive_metastore_types.rb
>  849970eb56 
>   
> standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/LockComponentBuilder.java
>  c739d4d196 
>   standalone-metastore/metastore-common/src/main/thrift/hive_metastore.thrift 
> 098ddec5dc 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
>  7d0db0c3a0 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnUtils.java
>  da38a6bbd3 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/utils/LockTypeUtil.java
>  PRE-CREATION 
>   
> standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/TestHiveMetaStoreTxns.java
>  1dfc105958 
>   
> standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/utils/LockTypeUtilTest.java
>  PRE-CREATION 
>   streaming/src/java/org/apache/hive/streaming/TransactionBatch.java 
> d44065018f 
> 
> 
> Diff: https://reviews.apache.org/r/72324/diff/2/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Marton Bod
> 
>



Re: Review Request 72324: HIVE-22750: Consolidate LockType naming

2020-04-08 Thread Denys Kuzmenko via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72324/#review220258
---



LGTM, just few more comments


standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/LockComponentBuilder.java
Line 60 (original), 60 (patched)


add builder for ExclWrite



standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/utils/LockTypeUtil.java
Lines 14 (patched)


why not enum?



standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/utils/LockTypeUtil.java
Lines 34 (patched)


you can move inverse into static and do no calc every time


- Denys Kuzmenko


On April 8, 2020, 3:09 p.m., Marton Bod wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72324/
> ---
> 
> (Updated April 8, 2020, 3:09 p.m.)
> 
> 
> Review request for hive, Denys Kuzmenko, Peter Vary, and Zoltan Chovan.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-22750: Consolidate LockType naming
> 
> 
> Diffs
> -
> 
>   
> hcatalog/streaming/src/java/org/apache/hive/hcatalog/streaming/HiveEndPoint.java
>  e249b7775e 
>   
> hcatalog/streaming/src/java/org/apache/hive/hcatalog/streaming/mutate/client/lock/Lock.java
>  52eb6133e7 
>   metastore/scripts/upgrade/hive/hive-schema-4.0.0.hive.sql 03540bba4d 
>   metastore/scripts/upgrade/hive/upgrade-3.1.0-to-4.0.0.hive.sql fa518747de 
>   ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java 17e6cdf162 
>   ql/src/test/org/apache/hadoop/hive/metastore/txn/TestTxnHandler.java 
> 72f095d264 
>   ql/src/test/org/apache/hadoop/hive/ql/lockmgr/TestDbTxnManager2.java 
> 80fb1aff78 
>   
> standalone-metastore/metastore-common/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/LockType.java
>  8ae4351129 
>   
> standalone-metastore/metastore-common/src/gen/thrift/gen-php/metastore/Types.php
>  db4cfb996a 
>   
> standalone-metastore/metastore-common/src/gen/thrift/gen-py/hive_metastore/ttypes.py
>  cf3137928f 
>   
> standalone-metastore/metastore-common/src/gen/thrift/gen-rb/hive_metastore_types.rb
>  849970eb56 
>   
> standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/LockComponentBuilder.java
>  c739d4d196 
>   standalone-metastore/metastore-common/src/main/thrift/hive_metastore.thrift 
> 098ddec5dc 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
>  7d0db0c3a0 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnUtils.java
>  da38a6bbd3 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/utils/LockTypeUtil.java
>  PRE-CREATION 
>   
> standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/TestHiveMetaStoreTxns.java
>  1dfc105958 
>   
> standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/utils/LockTypeUtilTest.java
>  PRE-CREATION 
>   streaming/src/java/org/apache/hive/streaming/TransactionBatch.java 
> d44065018f 
> 
> 
> Diff: https://reviews.apache.org/r/72324/diff/3/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Marton Bod
> 
>



Re: Review Request 72324: HIVE-22750: Consolidate LockType naming

2020-04-09 Thread Denys Kuzmenko via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72324/#review220265
---


Ship it!




Ship It!

- Denys Kuzmenko


On April 8, 2020, 3:09 p.m., Marton Bod wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72324/
> ---
> 
> (Updated April 8, 2020, 3:09 p.m.)
> 
> 
> Review request for hive, Denys Kuzmenko, Peter Vary, and Zoltan Chovan.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-22750: Consolidate LockType naming
> 
> 
> Diffs
> -
> 
>   
> hcatalog/streaming/src/java/org/apache/hive/hcatalog/streaming/HiveEndPoint.java
>  e249b7775e 
>   
> hcatalog/streaming/src/java/org/apache/hive/hcatalog/streaming/mutate/client/lock/Lock.java
>  52eb6133e7 
>   metastore/scripts/upgrade/hive/hive-schema-4.0.0.hive.sql 03540bba4d 
>   metastore/scripts/upgrade/hive/upgrade-3.1.0-to-4.0.0.hive.sql fa518747de 
>   ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java 17e6cdf162 
>   ql/src/test/org/apache/hadoop/hive/metastore/txn/TestTxnHandler.java 
> 72f095d264 
>   ql/src/test/org/apache/hadoop/hive/ql/lockmgr/TestDbTxnManager2.java 
> 80fb1aff78 
>   
> standalone-metastore/metastore-common/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/LockType.java
>  8ae4351129 
>   
> standalone-metastore/metastore-common/src/gen/thrift/gen-php/metastore/Types.php
>  db4cfb996a 
>   
> standalone-metastore/metastore-common/src/gen/thrift/gen-py/hive_metastore/ttypes.py
>  cf3137928f 
>   
> standalone-metastore/metastore-common/src/gen/thrift/gen-rb/hive_metastore_types.rb
>  849970eb56 
>   
> standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/LockComponentBuilder.java
>  c739d4d196 
>   standalone-metastore/metastore-common/src/main/thrift/hive_metastore.thrift 
> 098ddec5dc 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
>  7d0db0c3a0 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnUtils.java
>  da38a6bbd3 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/utils/LockTypeUtil.java
>  PRE-CREATION 
>   
> standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/TestHiveMetaStoreTxns.java
>  1dfc105958 
>   
> standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/utils/LockTypeUtilTest.java
>  PRE-CREATION 
>   streaming/src/java/org/apache/hive/streaming/TransactionBatch.java 
> d44065018f 
> 
> 
> Diff: https://reviews.apache.org/r/72324/diff/4/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Marton Bod
> 
>



Re: Review Request 72324: HIVE-22750: Consolidate LockType naming

2020-04-07 Thread Denys Kuzmenko via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72324/#review220239
---



LGTM, some comments


hcatalog/streaming/src/java/org/apache/hive/hcatalog/streaming/HiveEndPoint.java
Line 1017 (original), 1017 (patched)


Insert & Select both acquire same type of the lock  - Shared. SharedRead 
for Insert sounds odd, isn't it? That's minor, just to consider.



standalone-metastore/metastore-common/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/LockType.java
Lines 18 (patched)


Exclusive is more restrictive than EXCL_WRITE. We have lock comparator that 
relies on id.


- Denys Kuzmenko


On April 6, 2020, 1:58 p.m., Marton Bod wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72324/
> ---
> 
> (Updated April 6, 2020, 1:58 p.m.)
> 
> 
> Review request for hive, Denys Kuzmenko, Peter Vary, and Zoltan Chovan.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-22750: Consolidate LockType naming
> 
> 
> Diffs
> -
> 
>   
> hcatalog/streaming/src/java/org/apache/hive/hcatalog/streaming/HiveEndPoint.java
>  e249b7775e 
>   
> hcatalog/streaming/src/java/org/apache/hive/hcatalog/streaming/mutate/client/lock/Lock.java
>  52eb6133e7 
>   metastore/scripts/upgrade/hive/hive-schema-4.0.0.hive.sql 03540bba4d 
>   metastore/scripts/upgrade/hive/upgrade-3.1.0-to-4.0.0.hive.sql fa518747de 
>   ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java 17e6cdf162 
>   ql/src/test/org/apache/hadoop/hive/metastore/txn/TestTxnHandler.java 
> 72f095d264 
>   ql/src/test/org/apache/hadoop/hive/ql/lockmgr/TestDbTxnManager2.java 
> 80fb1aff78 
>   
> standalone-metastore/metastore-common/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/LockType.java
>  8ae4351129 
>   
> standalone-metastore/metastore-common/src/gen/thrift/gen-php/metastore/Types.php
>  db4cfb996a 
>   
> standalone-metastore/metastore-common/src/gen/thrift/gen-py/hive_metastore/ttypes.py
>  cf3137928f 
>   
> standalone-metastore/metastore-common/src/gen/thrift/gen-rb/hive_metastore_types.rb
>  849970eb56 
>   
> standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/LockComponentBuilder.java
>  c739d4d196 
>   standalone-metastore/metastore-common/src/main/thrift/hive_metastore.thrift 
> 098ddec5dc 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
>  e8a988cca8 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnUtils.java
>  da38a6bbd3 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/utils/LockTypeUtil.java
>  PRE-CREATION 
>   
> standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/TestHiveMetaStoreTxns.java
>  1dfc105958 
>   
> standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/utils/LockTypeUtilTest.java
>  PRE-CREATION 
>   streaming/src/java/org/apache/hive/streaming/TransactionBatch.java 
> d44065018f 
> 
> 
> Diff: https://reviews.apache.org/r/72324/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Marton Bod
> 
>



Re: Review Request 72387: Locks: Add new lock implementations for always zero-wait readers

2020-04-20 Thread Denys Kuzmenko via Review Board


> On April 20, 2020, 4:55 p.m., Peter Vary wrote:
> > common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
> > Lines 2727-2730 (original), 2727-2730 (patched)
> > 
> >
> > Is this config still needed in this case?

yes, in order to be backward compatible, when user goes with an exclusive 
writes and desides to lower the lock level for the insert overwrite from 
exclusive to exclusive write (insert overwrite can execute in parallel with 
insert - current functionality)


> On April 20, 2020, 4:55 p.m., Peter Vary wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java
> > Line 3027 (original), 3027-3028 (patched)
> > 
> >
> > Maybe get once, and store?

fixed


- Denys


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72387/#review220374
---


On April 20, 2020, 8:50 p.m., Denys Kuzmenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72387/
> ---
> 
> (Updated April 20, 2020, 8:50 p.m.)
> 
> 
> Review request for hive, Marton Bod and Peter Vary.
> 
> 
> Bugs: HIVE-19369
> https://issues.apache.org/jira/browse/HIVE-19369
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Hive Locking with Micro-managed and full-ACID tables needs a better locking 
> implementation which allows for no-wait readers always.
> 
> EXCL_DROP
> EXCL_WRITE
> SHARED_WRITE
> SHARED_READ
> 
> Short write-up
> 
> EXCL_DROP is a "drop partition" or "drop table" and waits for all others to 
> exit
> EXCL_WRITE excludes all writes and will wait for all existing SHARED_WRITE to 
> exit.
> SHARED_WRITE allows all SHARED_WRITES to go through, but will wait for an 
> EXCL_WRITE & EXCL_DROP (waiting so that you can do drop + insert in different 
> threads).
> 
> SHARED_READ does not wait for any lock - it fails fast for a pending 
> EXCL_DROP, because even if there is an EXCL_WRITE or SHARED_WRITE pending, 
> there's no semantic reason to wait for them to succeed before going ahead 
> with a SHARED_WRITE.
> 
> a select * => SHARED_READ
> an insert into => SHARED_WRITE
> an insert overwrite or MERGE => EXCL_WRITE
> a drop table => EXCL_DROP
> 
> 
> Diffs
> -
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 16bae920df 
>   ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java 77878ca40b 
>   ql/src/test/org/apache/hadoop/hive/metastore/txn/TestTxnHandler.java 
> 1d211857bf 
>   ql/src/test/org/apache/hadoop/hive/ql/lockmgr/TestDbTxnManager2.java 
> 73d3b91585 
>   ql/src/test/queries/clientpositive/explain_locks.q 3c11560c5f 
>   ql/src/test/results/clientpositive/llap/explain_locks.q.out 3183533478 
>   
> standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/LockRequestBuilder.java
>  22902a9c20 
>   
> standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/LockTypeComparator.java
>  PRE-CREATION 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
>  d080df417b 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/utils/LockTypeUtil.java
>  f928bf781b 
>   
> standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/utils/LockTypeUtilTest.java
>  3d69a6e7dc 
> 
> 
> Diff: https://reviews.apache.org/r/72387/diff/2/
> 
> 
> Testing
> ---
> 
> Added number of tests under TestDbTxnManager2, TestTxnHandler & extended 
> explain_locks.q test.
> 
> 
> Thanks,
> 
> Denys Kuzmenko
> 
>



Re: Review Request 72387: Locks: Add new lock implementations for always zero-wait readers

2020-04-20 Thread Denys Kuzmenko via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72387/
---

(Updated April 20, 2020, 8:50 p.m.)


Review request for hive, Marton Bod and Peter Vary.


Bugs: HIVE-19369
https://issues.apache.org/jira/browse/HIVE-19369


Repository: hive-git


Description
---

Hive Locking with Micro-managed and full-ACID tables needs a better locking 
implementation which allows for no-wait readers always.

EXCL_DROP
EXCL_WRITE
SHARED_WRITE
SHARED_READ

Short write-up

EXCL_DROP is a "drop partition" or "drop table" and waits for all others to exit
EXCL_WRITE excludes all writes and will wait for all existing SHARED_WRITE to 
exit.
SHARED_WRITE allows all SHARED_WRITES to go through, but will wait for an 
EXCL_WRITE & EXCL_DROP (waiting so that you can do drop + insert in different 
threads).

SHARED_READ does not wait for any lock - it fails fast for a pending EXCL_DROP, 
because even if there is an EXCL_WRITE or SHARED_WRITE pending, there's no 
semantic reason to wait for them to succeed before going ahead with a 
SHARED_WRITE.

a select * => SHARED_READ
an insert into => SHARED_WRITE
an insert overwrite or MERGE => EXCL_WRITE
a drop table => EXCL_DROP


Diffs (updated)
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 16bae920df 
  ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java 77878ca40b 
  ql/src/test/org/apache/hadoop/hive/metastore/txn/TestTxnHandler.java 
1d211857bf 
  ql/src/test/org/apache/hadoop/hive/ql/lockmgr/TestDbTxnManager2.java 
73d3b91585 
  ql/src/test/queries/clientpositive/explain_locks.q 3c11560c5f 
  ql/src/test/results/clientpositive/llap/explain_locks.q.out 3183533478 
  
standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/LockRequestBuilder.java
 22902a9c20 
  
standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/LockTypeComparator.java
 PRE-CREATION 
  
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
 d080df417b 
  
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/utils/LockTypeUtil.java
 f928bf781b 
  
standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/utils/LockTypeUtilTest.java
 3d69a6e7dc 


Diff: https://reviews.apache.org/r/72387/diff/2/

Changes: https://reviews.apache.org/r/72387/diff/1-2/


Testing
---

Added number of tests under TestDbTxnManager2, TestTxnHandler & extended 
explain_locks.q test.


Thanks,

Denys Kuzmenko



Re: Review Request 72387: Locks: Add new lock implementations for always zero-wait readers

2020-04-20 Thread Denys Kuzmenko via Review Board


> On April 20, 2020, 4:55 p.m., Peter Vary wrote:
> > common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
> > Lines 2732 (patched)
> > 
> >
> > Maybe rephrase?
> > Manages concurrency levels for ACID resoruces. Enables users enable 
> > parallel queries by enabling write-write conflict resolution happen only at 
> > commit phase 
> > - If true - no commit phase conflict resolution: 
> >- INSERT OVERWRITE requests EXCLUSIVE locks
> >- UPDATE/DELETE requests EXCL_WRITE lock
> >- INSERT requests SHARED_READ lock
> > - If false - write might fail when committed on conflict check: 
> >- INSERT OVERWRITE requests EXCL_WRITE locks
> >- UPDATE/DELETE/INSERT requests SHARED_READ lock

fixed


- Denys


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72387/#review220374
---


On April 20, 2020, 8:50 p.m., Denys Kuzmenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72387/
> ---
> 
> (Updated April 20, 2020, 8:50 p.m.)
> 
> 
> Review request for hive, Marton Bod and Peter Vary.
> 
> 
> Bugs: HIVE-19369
> https://issues.apache.org/jira/browse/HIVE-19369
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Hive Locking with Micro-managed and full-ACID tables needs a better locking 
> implementation which allows for no-wait readers always.
> 
> EXCL_DROP
> EXCL_WRITE
> SHARED_WRITE
> SHARED_READ
> 
> Short write-up
> 
> EXCL_DROP is a "drop partition" or "drop table" and waits for all others to 
> exit
> EXCL_WRITE excludes all writes and will wait for all existing SHARED_WRITE to 
> exit.
> SHARED_WRITE allows all SHARED_WRITES to go through, but will wait for an 
> EXCL_WRITE & EXCL_DROP (waiting so that you can do drop + insert in different 
> threads).
> 
> SHARED_READ does not wait for any lock - it fails fast for a pending 
> EXCL_DROP, because even if there is an EXCL_WRITE or SHARED_WRITE pending, 
> there's no semantic reason to wait for them to succeed before going ahead 
> with a SHARED_WRITE.
> 
> a select * => SHARED_READ
> an insert into => SHARED_WRITE
> an insert overwrite or MERGE => EXCL_WRITE
> a drop table => EXCL_DROP
> 
> 
> Diffs
> -
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 16bae920df 
>   ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java 77878ca40b 
>   ql/src/test/org/apache/hadoop/hive/metastore/txn/TestTxnHandler.java 
> 1d211857bf 
>   ql/src/test/org/apache/hadoop/hive/ql/lockmgr/TestDbTxnManager2.java 
> 73d3b91585 
>   ql/src/test/queries/clientpositive/explain_locks.q 3c11560c5f 
>   ql/src/test/results/clientpositive/llap/explain_locks.q.out 3183533478 
>   
> standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/LockRequestBuilder.java
>  22902a9c20 
>   
> standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/LockTypeComparator.java
>  PRE-CREATION 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
>  d080df417b 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/utils/LockTypeUtil.java
>  f928bf781b 
>   
> standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/utils/LockTypeUtilTest.java
>  3d69a6e7dc 
> 
> 
> Diff: https://reviews.apache.org/r/72387/diff/2/
> 
> 
> Testing
> ---
> 
> Added number of tests under TestDbTxnManager2, TestTxnHandler & extended 
> explain_locks.q test.
> 
> 
> Thanks,
> 
> Denys Kuzmenko
> 
>



Re: Review Request 72388: HIVE-23048 Use sequences for TXN_ID generation

2020-04-21 Thread Denys Kuzmenko via Review Board


> On April 20, 2020, 10:40 a.m., Denys Kuzmenko wrote:
> > standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
> > Line 172 (original), 225 (patched)
> > 
> >
> > Please rebase, this code is no longer here
> 
> Peter Varga wrote:
> which commit removed this? I don't see it.

oh, sorry, I confused it with the lock type. Not an issue ignore.


> On April 20, 2020, 10:40 a.m., Denys Kuzmenko wrote:
> > standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
> > Lines 902 (patched)
> > 
> >
> > Can it be null? Could we add constraint on db level instead of multiple 
> > checks in backend code?
> 
> Peter Varga wrote:
> It should not be null, but the constraint is that there must be always 
> one record in the table, not a nonnull column.

ok, don't you cover this with if (!maxTxnIdRs.next()) { .. } ?


- Denys


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72388/#review220367
---


On April 21, 2020, 7:53 a.m., Peter Varga wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72388/
> ---
> 
> (Updated April 21, 2020, 7:53 a.m.)
> 
> 
> Review request for hive and Peter Vary.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> * Use sequences for TXN_ID generation.
>   * Make it possible to open transactions in parallel
>   * drop Oracle 11g support
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Initiator.java 
> 37a5862791 
>   
> ql/src/test/org/apache/hadoop/hive/metastore/txn/TestCompactionTxnHandler.java
>  15fcfc0e35 
>   ql/src/test/org/apache/hadoop/hive/metastore/txn/TestTxnHandler.java 
> 1d211857bf 
>   ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands2.java 2c13e8dd03 
>   ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands3.java 51b0fa336f 
>   ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommandsForMmTable.java 
> 6525ffc00a 
>   ql/src/test/org/apache/hadoop/hive/ql/TxnCommandsBaseForTests.java 
> 1435269ed3 
>   
> ql/src/test/org/apache/hadoop/hive/ql/lockmgr/DbTxnManagerEndToEndTestBase.java
>  PRE-CREATION 
>   ql/src/test/org/apache/hadoop/hive/ql/lockmgr/TestDbTxnManager2.java 
> 73d3b91585 
>   
> ql/src/test/org/apache/hadoop/hive/ql/lockmgr/TestDbTxnManagerIsolationProperties.java
>  PRE-CREATION 
>   ql/src/test/org/apache/hadoop/hive/ql/txn/compactor/TestInitiator.java 
> 1151466f8c 
>   
> standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/conf/MetastoreConf.java
>  a874121e12 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/tools/SQLGenerator.java
>  49b737ecf9 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/CompactionTxnHandler.java
>  2344c2d5f6 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnDbUtil.java
>  bb29410e7d 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
>  e7910c1c5d 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnStore.java
>  87130a519d 
>   
> standalone-metastore/metastore-server/src/main/sql/derby/hive-schema-4.0.0.derby.sql
>  1ace9d3ef0 
>   
> standalone-metastore/metastore-server/src/main/sql/derby/upgrade-3.2.0-to-4.0.0.derby.sql
>  8a3cd56658 
>   
> standalone-metastore/metastore-server/src/main/sql/mssql/hive-schema-4.0.0.mssql.sql
>  2e0177723d 
>   
> standalone-metastore/metastore-server/src/main/sql/mssql/upgrade-3.2.0-to-4.0.0.mssql.sql
>  9f3951575b 
>   
> standalone-metastore/metastore-server/src/main/sql/mysql/hive-schema-4.0.0.mysql.sql
>  0512a45cad 
>   
> standalone-metastore/metastore-server/src/main/sql/mysql/upgrade-3.2.0-to-4.0.0.mysql.sql
>  4b82e36ab4 
>   
> standalone-metastore/metastore-server/src/main/sql/oracle/hive-schema-4.0.0.oracle.sql
>  db398e5f66 
>   
> standalone-metastore/metastore-server/src/main/sql/oracle/upgrade-3.2.0-to-4.0.0.oracle.sql
>  1be83fc4a9 
>   
> standalone-metastore/metastore-server/src/main/sql/postgres/hive-schema-4.0.0.postgres.sql
>  e6e30160af 
>   
> standalone-metastore/metastore-server/src/main/sql/postgres/upgrade-3.2.0-to-4.0.0.postgres.sql
>  b90cecb173 
>   
> standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/dbinstall/DbInstallBase.java
>  c1a1629548 
>   
> 

  1   2   >