Review Request 69058: HIVE-19647 use bitvectors in IN operators

2018-10-16 Thread Zoltan Haindrich

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69058/
---

Review request for hive and Ashutosh Chauhan.


Bugs: HIVE-19647
https://issues.apache.org/jira/browse/HIVE-19647


Repository: hive-git


Description
---

Enables selectivity estimation for IN operators by using the bitvector


Diffs
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 
29958b3e50e577a593d3a953156668e2dc82bfd1 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/stats/annotation/HiveMurmur3Adapter.java
 PRE-CREATION 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/stats/annotation/StatsRulesProcFactory.java
 32fba6c8ff80befdde55542a4ae83b619256632e 
  ql/src/java/org/apache/hadoop/hive/ql/plan/ColStatistics.java 
a31f965a5fbe123f4cfec0ba8fe1171796ab7b5c 
  ql/src/java/org/apache/hadoop/hive/ql/plan/ExprNodeDesc.java 
7b8c5d12332b34d3500e68cf84300b9edc7df095 
  ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUtils.java 
b7adc485a70e148e71feb594f311bfad1763479d 
  ql/src/test/org/apache/hadoop/hive/ql/plan/mapping/TestStatEstimations.java 
e5233ced3fd5910d6e5045b94e120caddb818012 
  ql/src/test/queries/clientpositive/in_bitvector_filter.q PRE-CREATION 
  ql/src/test/results/clientpositive/in_bitvector_filter.q.out PRE-CREATION 


Diff: https://reviews.apache.org/r/69058/diff/1/


Testing
---


Thanks,

Zoltan Haindrich



Review Request 69031: Changed default config of hive.tez.llap.min.reducer.per.executor to 0.33

2018-10-16 Thread Ashutosh Chauhan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69031/
---

Review request for hive.


Bugs: HIVE-20572
https://issues.apache.org/jira/browse/HIVE-20572


Repository: hive-git


Description
---

Changed default config of hive.tez.llap.min.reducer.per.executor to 0.33


Diffs
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 29958b3e50 
  ql/src/test/results/clientpositive/llap/bucket_groupby.q.out 433e033b6e 
  ql/src/test/results/clientpositive/llap/cbo_limit.q.out 0d5c8f0e36 
  ql/src/test/results/clientpositive/llap/cbo_rp_limit.q.out 0d5c8f0e36 
  ql/src/test/results/clientpositive/llap/cbo_rp_views.q.out 878a767a19 
  ql/src/test/results/clientpositive/llap/cbo_views.q.out 214574ed61 
  ql/src/test/results/clientpositive/llap/cluster.q.out 056c4dac15 
  ql/src/test/results/clientpositive/llap/constraints_optimization.q.out 
b45b7c409f 
  ql/src/test/results/clientpositive/llap/correlationoptimizer1.q.out 
0d32c395c6 
  ql/src/test/results/clientpositive/llap/cte_1.q.out 044fb70cbc 
  ql/src/test/results/clientpositive/llap/dp_counter_mm.q.out 4ca60ba5ce 
  ql/src/test/results/clientpositive/llap/dp_counter_non_mm.q.out 101b343506 
  ql/src/test/results/clientpositive/llap/except_distinct.q.out ea0224c888 
  ql/src/test/results/clientpositive/llap/hybridgrace_hashjoin_2.q.out 
da2be55462 
  ql/src/test/results/clientpositive/llap/intersect_all.q.out 1f6b0b872b 
  ql/src/test/results/clientpositive/llap/intersect_distinct.q.out b4c69b1505 
  ql/src/test/results/clientpositive/llap/lateral_view.q.out c1bca18a07 
  ql/src/test/results/clientpositive/llap/lineage2.q.out f56b100046 
  ql/src/test/results/clientpositive/llap/llap_decimal64_reader.q.out 
945dfd6a37 
  ql/src/test/results/clientpositive/llap/llap_smb.q.out ed10999f8f 
  ql/src/test/results/clientpositive/llap/materialized_view_create.q.out 
36a3d8c3bf 
  
ql/src/test/results/clientpositive/llap/materialized_view_create_rewrite_2.q.out
 d7c1ee15d2 
  ql/src/test/results/clientpositive/llap/materialized_view_describe.q.out 
2928fcfb9b 
  ql/src/test/results/clientpositive/llap/multi_count_distinct_null.q.out 
a049b02fda 
  ql/src/test/results/clientpositive/llap/parquet_types.q.out 508ac16878 
  ql/src/test/results/clientpositive/llap/parquet_types_vectorization.q.out 
4cc93bdd2a 
  ql/src/test/results/clientpositive/llap/partition_multilevels.q.out 
00d0a14515 
  ql/src/test/results/clientpositive/llap/reduce_deduplicate_extended.q.out 
53d0f3192d 
  ql/src/test/results/clientpositive/llap/results_cache_1.q.out 86a110bb83 
  ql/src/test/results/clientpositive/llap/results_cache_with_masking.q.out 
ba3b2804be 
  ql/src/test/results/clientpositive/llap/skiphf_aggr.q.out 667692a2d7 
  ql/src/test/results/clientpositive/llap/subquery_notin.q.out 083ad3074a 
  ql/src/test/results/clientpositive/llap/tez_input_counters.q.out 8f4f0551cd 
  ql/src/test/results/clientpositive/llap/tez_smb_reduce_side.q.out 9ee3dc161d 
  ql/src/test/results/clientpositive/llap/tez_union2.q.out b00c36ebb8 
  ql/src/test/results/clientpositive/llap/udaf_collect_set_2.q.out 4d6a68404e 
  ql/src/test/results/clientpositive/llap/unionDistinct_3.q.out 6cef15adc6 
  ql/src/test/results/clientpositive/llap/vector_complex_all.q.out f0f5fe7a8a 
  ql/src/test/results/clientpositive/llap/vector_grouping_sets.q.out 78de6807d3 
  ql/src/test/results/clientpositive/llap/vector_partitioned_date_time.q.out 
4711f35165 
  ql/src/test/results/clientpositive/llap/vector_ptf_part_simple.q.out 
6fa48e88a4 
  ql/src/test/results/clientpositive/llap/vector_windowing_expressions.q.out 
49daa409e3 
  
ql/src/test/results/clientpositive/llap/vector_windowing_multipartitioning.q.out
 7596c9a8c7 
  
ql/src/test/results/clientpositive/llap/vector_windowing_range_multiorder.q.out 
b906d156b5 
  ql/src/test/results/clientpositive/llap/vectorized_distinct_gby.q.out 
0cffc4e750 
  ql/src/test/results/clientpositive/llap/vectorized_parquet.q.out db7262f078 


Diff: https://reviews.apache.org/r/69031/diff/1/


Testing
---


Thanks,

Ashutosh Chauhan



[jira] [Created] (HIVE-20759) hive.server2.enable.doAs=false is not honored

2018-10-16 Thread Xiaoyu Yao (JIRA)
Xiaoyu Yao created HIVE-20759:
-

 Summary: hive.server2.enable.doAs=false is not honored 
 Key: HIVE-20759
 URL: https://issues.apache.org/jira/browse/HIVE-20759
 Project: Hive
  Issue Type: Bug
Reporter: Xiaoyu Yao


During a recent Ozone test, we found that the Hive proxy user was not set up properly. 
When hive.server2.enable.doAs=false, Hive still sends a proxy user for 
FileSystem-related RPC calls. 

With the changes from 
[HIVE-11029|https://github.com/apache/hive/commit/8ed337749261ad78becb46a16a350ef23d9f305f],
 in the non-Kerberos case, Hive now always sends an impersonated proxy UGI along 
with the RPC calls. As a result, we have to add 
hadoop.proxyuser.hive.users/groups/hosts=* to core-site.xml to make it work 
even though hive.server2.enable.doAs=false.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-20758) Constraints: Show create table does not show constraints

2018-10-16 Thread Gopal V (JIRA)
Gopal V created HIVE-20758:
--

 Summary: Constraints: Show create table does not show constraints
 Key: HIVE-20758
 URL: https://issues.apache.org/jira/browse/HIVE-20758
 Project: Hive
  Issue Type: Bug
Reporter: Gopal V


Even though DESC FORMATTED shows the constraints, SHOW CREATE TABLE does not.

{code}
| # Primary Key | NULL | NULL |
| Table: | tpcds_bin_partitioned_orc_1.inventory | NULL |
| Constraint Name: | pk_in | NULL |
| Column Names: | inv_date_sk | inv_item_sk |
| | NULL | NULL |
| # Foreign Keys | NULL | NULL |
| Table: | tpcds_bin_partitioned_orc_1.inventory | NULL |
| Constraint Name: | inv_d | NULL |
| Parent Column Name:tpcds_bin_partitioned_orc_1.date_dim.d_date_sk | Column Name:inv_date_sk | Key Sequence:1 |
| | NULL | NULL |
| Constraint Name: | inv_i | NULL |
| Parent Column Name:tpcds_bin_partitioned_orc_1.item.i_item_sk | Column Name:inv_item_sk | Key Sequence:1 |
| | NULL | NULL |
| Constraint Name: | inv_w | NULL |
| Parent Column Name:tpcds_bin_partitioned_orc_1.warehouse.w_warehouse_sk | Column Name:inv_warehouse_sk | Key Sequence:1 |
| | NULL | NULL |
{code}

But 

{code}
++
|   createtab_stmt   |
++
| CREATE TABLE `inventory`(  |
|   `inv_item_sk` bigint,|
|   `inv_warehouse_sk` bigint,   |
|   `inv_quantity_on_hand` int,  |
|   `inv_date_sk` bigint)|
| ROW FORMAT SERDE   |
|   'org.apache.hadoop.hive.ql.io.orc.OrcSerde'  |
| STORED AS INPUTFORMAT  |
|   'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat'  |
| OUTPUTFORMAT   |
|   'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat' |
| LOCATION   |
|   'hdfs:///warehouse/tablespace/managed/hive/tpcds_bin_partitioned_orc_1.db/inventory' |
| TBLPROPERTIES (|
|   'bucketing_version'='2', |
|   'transactional'='true',  |
|   'transactional_properties'='default',|
|   'transient_lastDdlTime'='1539710410')|
++
{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: Review Request 69054: HIVE-20740 : Remove global lock in ObjectStore.setConf method

2018-10-16 Thread Vihang Karajgaonkar via Review Board


> On Oct. 16, 2018, 8:59 p.m., Andrew Sherman wrote:
> > standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
> > Lines 291 (patched)
> > 
> >
> > What if someone has set ConfVars.MANAGER_FACTORY_CLASS to some 
> > non-default value? Is this still correct?

Yes, it looks like it will fail in that case, although I am not sure of the 
use cases where you would use a different PersistenceManagerFactory. This code 
predates the patch and is not changed by it. Perhaps we can look at this in a 
separate JIRA.


- Vihang


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69054/#review209662
---


On Oct. 16, 2018, 10:47 p.m., Vihang Karajgaonkar wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69054/
> ---
> 
> (Updated Oct. 16, 2018, 10:47 p.m.)
> 
> 
> Review request for hive, Andrew Sherman, Alan Gates, and Peter Vary.
> 
> 
> Bugs: HIVE-20740
> https://issues.apache.org/jira/browse/HIVE-20740
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-20740 : Remove global lock in ObjectStore.setConf method
> 
> 
> Diffs
> -
> 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
>  66977d79c946f1ac57aacfbe8704d37bfbac3ea3 
>   
> standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/TestObjectStore.java
>  b74c3048fa2e18adc7f0d7cc813a180d4466fa36 
> 
> 
> Diff: https://reviews.apache.org/r/69054/diff/2/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Vihang Karajgaonkar
> 
>



Re: Review Request 69054: HIVE-20740 : Remove global lock in ObjectStore.setConf method

2018-10-16 Thread Vihang Karajgaonkar via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69054/
---

(Updated Oct. 16, 2018, 10:47 p.m.)


Review request for hive, Andrew Sherman, Alan Gates, and Peter Vary.


Changes
---

Added suggested changes from Andrew


Bugs: HIVE-20740
https://issues.apache.org/jira/browse/HIVE-20740


Repository: hive-git


Description
---

HIVE-20740 : Remove global lock in ObjectStore.setConf method


Diffs (updated)
-

  
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
 66977d79c946f1ac57aacfbe8704d37bfbac3ea3 
  
standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/TestObjectStore.java
 b74c3048fa2e18adc7f0d7cc813a180d4466fa36 


Diff: https://reviews.apache.org/r/69054/diff/2/

Changes: https://reviews.apache.org/r/69054/diff/1-2/


Testing
---


Thanks,

Vihang Karajgaonkar



Re: Review Request 69054: HIVE-20740 : Remove global lock in ObjectStore.setConf method

2018-10-16 Thread Andrew Sherman via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69054/#review209662
---



This all looks good, I just have annoying questions


standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
Lines 290 (patched)


Can you use  ConfVars.MANAGER_FACTORY_CLASS.getVarname() instead of the 
string?



standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
Lines 291 (patched)


What if someone has set ConfVars.MANAGER_FACTORY_CLASS to some non-default 
value? Is this still correct?



standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/TestObjectStore.java
Lines 918 (patched)


Add a comment explaining what the test does



standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/TestObjectStore.java
Lines 923 (patched)


numThreads and numIterations seem small to me. Can we make them higher 
without the test taking a long time?



standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/TestObjectStore.java
Lines 927 (patched)


ArrayList<>(numThreads) ?



standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/TestObjectStore.java
Lines 948 (patched)


nit: add a timeout to get() so that you know the test can never hang
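
The last two comments (pre-sizing the list of futures and bounding the wait on get()) can be illustrated with a minimal sketch. This is not the code from the patch: the class ConcurrentCallSketch, the method runConcurrently, and the 30-second limit are assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for a concurrency test helper; not taken from TestObjectStore.
class ConcurrentCallSketch {

  // Runs `task` numIterations times on each of numThreads threads and
  // returns the total number of iterations that completed.
  static int runConcurrently(int numThreads, int numIterations, Runnable task) {
    ExecutorService pool = Executors.newFixedThreadPool(numThreads);
    // The number of futures is known up front, so the list can be pre-sized.
    List<Future<Integer>> futures = new ArrayList<>(numThreads);
    try {
      for (int t = 0; t < numThreads; t++) {
        futures.add(pool.submit(() -> {
          for (int i = 0; i < numIterations; i++) {
            task.run();
          }
          return numIterations;
        }));
      }
      int completed = 0;
      for (Future<Integer> f : futures) {
        // Bounded wait: a deadlock shows up as a TimeoutException
        // instead of a test that hangs forever. 30 seconds is an arbitrary choice.
        completed += f.get(30, TimeUnit.SECONDS);
      }
      return completed;
    } catch (Exception e) {
      // Wrap checked exceptions (interrupt, task failure, timeout) so callers stay simple.
      throw new IllegalStateException("concurrent run failed", e);
    } finally {
      pool.shutdownNow();
    }
  }
}
```

A test would pass the call under test as the Runnable and assert on the returned count; whether higher thread and iteration counts keep the real test fast would still need to be measured.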


- Andrew Sherman


On Oct. 16, 2018, 8:36 p.m., Vihang Karajgaonkar wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69054/
> ---
> 
> (Updated Oct. 16, 2018, 8:36 p.m.)
> 
> 
> Review request for hive, Andrew Sherman, Alan Gates, and Peter Vary.
> 
> 
> Bugs: HIVE-20740
> https://issues.apache.org/jira/browse/HIVE-20740
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-20740 : Remove global lock in ObjectStore.setConf method
> 
> 
> Diffs
> -
> 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
>  66977d79c946f1ac57aacfbe8704d37bfbac3ea3 
>   
> standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/TestObjectStore.java
>  b74c3048fa2e18adc7f0d7cc813a180d4466fa36 
> 
> 
> Diff: https://reviews.apache.org/r/69054/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Vihang Karajgaonkar
> 
>



Review Request 69054: HIVE-20740 : Remove global lock in ObjectStore.setConf method

2018-10-16 Thread Vihang Karajgaonkar via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69054/
---

Review request for hive, Andrew Sherman, Alan Gates, and Peter Vary.


Bugs: HIVE-20740
https://issues.apache.org/jira/browse/HIVE-20740


Repository: hive-git


Description
---

HIVE-20740 : Remove global lock in ObjectStore.setConf method


Diffs
-

  
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
 66977d79c946f1ac57aacfbe8704d37bfbac3ea3 
  
standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/TestObjectStore.java
 b74c3048fa2e18adc7f0d7cc813a180d4466fa36 


Diff: https://reviews.apache.org/r/69054/diff/1/


Testing
---


Thanks,

Vihang Karajgaonkar



Re: Review Request 69022: HIVE-20737: Local SparkContext is shared between user sessions and should be closed only when there is no active

2018-10-16 Thread denys kuzmenko via Review Board


> On Oct. 16, 2018, 1:47 p.m., Sahil Takiar wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/spark/session/SparkSessionImpl.java
> > Line 125 (original), 120 (patched)
> > 
> >
> > We might need to re-think how we are synchronizing this method a bit. I 
> > think we want to support the use case where we call `close()` while 
> > `open()` is being run. This offers a way for the user to cancel a session 
> > while it is being opened, which can be useful if opening a session takes a 
> > long time, as can happen on a busy cluster where there aren't enough 
> > resources to open a session.
> > 
> > Fixing that might be out of the scope of this JIRA, so I would 
> > recommend using a separate lock to guard against multiple users calling 
> > open on the same session.
> 
> Sahil Takiar wrote:
> Tracking the aforementioned fix in HIVE-20519, unless you want to fix it 
> in this patch.
> 
> denys kuzmenko wrote:
> I think it should be addressed in another JIRA; right now we need to have 
> at least the basic use case working
> 
> Sahil Takiar wrote:
> okay, still recommend using a separate lock

open() and close() both manipulate the shared variable (isOpen), so they 
have to be synchronized on the same monitor (at least in the current approach).
I am not sure whether SparkContext supports instant interruption 
(Thread.interrupt or sc.stop()). However, when closing a session whose open is 
still in progress, you have to take care of the SparkContext.
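
To illustrate the point about sharing one monitor, here is a minimal sketch. It is not SparkSessionImpl: the class SessionSketch, the openCount field, and the helper openFromManyThreads are hypothetical names used only to show that open() and close() must lock the same object when both read and write isOpen.

```java
// Hypothetical sketch; only the isOpen flag and the open()/close() names come from the discussion.
class SessionSketch {
  private final Object lock = new Object();
  private boolean isOpen;  // guarded by lock
  private int openCount;   // guarded by lock; counts how often the session was really opened

  void open() {
    synchronized (lock) {
      if (isOpen) {
        return;            // a concurrent caller already opened the session
      }
      openCount++;         // stands in for the expensive session start-up
      isOpen = true;
    }
  }

  void close() {
    synchronized (lock) {  // same monitor as open(), so the two cannot interleave
      if (!isOpen) {
        return;
      }
      isOpen = false;
    }
  }

  boolean isOpen() {
    synchronized (lock) {
      return isOpen;
    }
  }

  int openCount() {
    synchronized (lock) {
      return openCount;
    }
  }

  // Calls open() from several threads at once and waits for all of them.
  static SessionSketch openFromManyThreads(int threads) {
    SessionSketch session = new SessionSketch();
    Thread[] callers = new Thread[threads];
    for (int i = 0; i < threads; i++) {
      callers[i] = new Thread(session::open);
      callers[i].start();
    }
    for (Thread t : callers) {
      try {
        t.join();
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        throw new IllegalStateException(e);
      }
    }
    return session;
  }
}
```

With a single monitor, close() simply waits until a running open() finishes. Cancelling an open() that is still in progress, as suggested in the review, would need a different design and is not shown here.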


- denys


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69022/#review209617
---


On Oct. 16, 2018, 4:49 p.m., denys kuzmenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69022/
> ---
> 
> (Updated Oct. 16, 2018, 4:49 p.m.)
> 
> 
> Review request for hive, Sahil Takiar and Adam Szita.
> 
> 
> Bugs: HIVE-20737
> https://issues.apache.org/jira/browse/HIVE-20737
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> 1. Local SparkContext is shared between user sessions and should be closed 
> only when there is no active. 
> 2. Possible race condition in SparkSession.open() in case when user queries 
> run in parallel within the same session.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/LocalHiveSparkClient.java 
> 72ff53e3bd 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/spark/session/SparkSessionImpl.java
>  bb50129518 
>   
> ql/src/test/org/apache/hadoop/hive/ql/exec/spark/TestLocalHiveSparkClient.java
>  PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/69022/diff/3/
> 
> 
> Testing
> ---
> 
> Added TestLocalHiveSparkClient test
> 
> 
> File Attachments
> 
> 
> HIVE-20737.7.patch
>   
> https://reviews.apache.org/media/uploaded/files/2018/10/15/9cf8a2b3-9ec1-4316-81d0-3cd124b1a9fd__HIVE-20737.7.patch
> 
> 
> Thanks,
> 
> denys kuzmenko
> 
>



Re: Review Request 69022: HIVE-20737: Local SparkContext is shared between user sessions and should be closed only when there is no active

2018-10-16 Thread denys kuzmenko via Review Board


> On Oct. 16, 2018, 1:47 p.m., Sahil Takiar wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/spark/LocalHiveSparkClient.java
> > Line 73 (original), 73 (patched)
> > 
> >
> > if we expect multiple sessions to access this, should we make this 
> > `volatile`?
> 
> denys kuzmenko wrote:
> it's being accessed only inside of the critical section (within the lock 
> boundaries)
> 
> Sahil Takiar wrote:
> Does Java guarantee that non-volatile variables accessed inside a 
> critical section are not cached locally by a CPU?

In short - yes.

JSR 133 (Java Memory Model)

Synchronization ensures that memory writes by a thread before or during a 
synchronized block are made visible in a predictable manner to other threads 
which synchronize on the same monitor. After we exit a synchronized block, we 
release the monitor, which has the effect of flushing the cache to main memory, 
so that writes made by this thread can be visible to other threads. Before we 
can enter a synchronized block, we acquire the monitor, which has the effect of 
invalidating the local processor cache so that variables will be reloaded from 
main memory. We will then be able to see all of the writes made visible by the 
previous release.
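
In the terms of the Java Language Specification, releasing a monitor happens-before every subsequent acquisition of that same monitor; the cache-flushing description above is an informal way to picture it. A minimal sketch of the consequence follows. The class GuardedCounter and the helper countWith are hypothetical and not part of the patch.

```java
// Hypothetical sketch: a plain (non-volatile) field that is safe to share
// because every access happens while holding the same lock.
class GuardedCounter {
  private final Object lock = new Object();
  private long value;  // deliberately not volatile; guarded by lock

  void increment() {
    synchronized (lock) {
      value++;         // the write becomes visible to the next thread that acquires lock
    }
  }

  long get() {
    synchronized (lock) {
      return value;    // acquiring lock guarantees we see all earlier guarded writes
    }
  }

  // Increments one shared counter from several threads and returns the final value.
  static long countWith(int threads, int perThread) {
    GuardedCounter counter = new GuardedCounter();
    Thread[] workers = new Thread[threads];
    for (int i = 0; i < threads; i++) {
      workers[i] = new Thread(() -> {
        for (int n = 0; n < perThread; n++) {
          counter.increment();
        }
      });
      workers[i].start();
    }
    for (Thread w : workers) {
      try {
        w.join();
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        throw new IllegalStateException(e);
      }
    }
    return counter.get();
  }
}
```

The guarantee only holds when every reader and writer synchronizes on the same monitor; an unsynchronized read of value elsewhere would not be covered.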


- denys


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69022/#review209617
---





[jira] [Created] (HIVE-20757) Autogather stats doesn't work when SDPO (sort dynamic partition optimization) is ON

2018-10-16 Thread Vineet Garg (JIRA)
Vineet Garg created HIVE-20757:
--

 Summary: Autogather stats doesn't work when SDPO (sort dynamic 
partition optimization) is ON
 Key: HIVE-20757
 URL: https://issues.apache.org/jira/browse/HIVE-20757
 Project: Hive
  Issue Type: Bug
  Components: Statistics
Affects Versions: 4.0.0
Reporter: Vineet Garg


*Reproducer*
{code:sql}
set hive.optimize.sort.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=nonstrict;
set hive.stats.autogather=true;

create table t11(i int, j int) partitioned by (s string);
insert into t11 partition(s) values(3,4, 'p1'),(4,5, 'p2'),(6,9,'p3');

hive> desc formatted t11 j;
OK
col_namej
data_type   int
min
max
num_nulls
distinct_count
avg_col_len
max_col_len
num_trues
num_falses
bitVector
comment from deserializer
COLUMN_STATS_ACCURATE   {}
{code}

{code:sql}
hive> explain insert into t11 partition(s) values(3,4, 'p1'),(4,5, 
'p2'),(6,9,'p3');

STAGE PLANS:
  Stage: Stage-1
Tez
  DagId: vgarg_20181016113701_f3aa9f8f-b38b-47a8-8149-b5521bf072f6:13
  Edges:
Reducer 2 <- Map 1 (SIMPLE_EDGE)
  DagName: vgarg_20181016113701_f3aa9f8f-b38b-47a8-8149-b5521bf072f6:13
  Vertices:
Map 1
Map Operator Tree:
TableScan
  alias: _dummy_table
  Row Limit Per Split: 1
  Statistics: Num rows: 1 Data size: 10 Basic stats: COMPLETE 
Column stats: COMPLETE
  Select Operator
expressions: array(const struct(3,4,'p1'),const 
struct(4,5,'p2'),const struct(6,9,'p3')) (type: 
array>)
outputColumnNames: _col0
Statistics: Num rows: 1 Data size: 64 Basic stats: COMPLETE 
Column stats: COMPLETE
UDTF Operator
  Statistics: Num rows: 1 Data size: 64 Basic stats: 
COMPLETE Column stats: COMPLETE
  function name: inline
  Select Operator
expressions: col1 (type: int), col2 (type: int), col3 
(type: string)
outputColumnNames: _col0, _col1, _col2
Statistics: Num rows: 1 Data size: 8 Basic stats: 
COMPLETE Column stats: COMPLETE
Reduce Output Operator
  key expressions: _col2 (type: string)
  sort order: +
  Map-reduce partition columns: _col2 (type: string)
  Statistics: Num rows: 1 Data size: 8 Basic stats: 
COMPLETE Column stats: COMPLETE
  value expressions: _col0 (type: int), _col1 (type: 
int)
Reducer 2
Execution mode: vectorized
Reduce Operator Tree:
  Select Operator
expressions: VALUE._col0 (type: int), VALUE._col1 (type: int), 
KEY._col2 (type: string)
outputColumnNames: _col0, _col1, _col2
Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE 
Column stats: COMPLETE
File Output Operator
  compressed: false
  Dp Sort State: PARTITION_SORTED
  Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE 
Column stats: COMPLETE
  table:
  input format: org.apache.hadoop.mapred.TextInputFormat
  output format: 
org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
  serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
  name: default.t11

  Stage: Stage-2
Dependency Collection

  Stage: Stage-0
Move Operator
  tables:
  partition:
s
  replace: false
  table:
  input format: org.apache.hadoop.mapred.TextInputFormat
  output format: 
org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
  serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
  name: default.t11

  Stage: Stage-3
Stats Work
  Basic Stats Work:
  Column Stats Desc:
  Columns: i, j
  Column Types: int, int
  Table: default.t11
{code}

Notice that the explain plan is missing the autogather stats branch.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-20756) Disable SARG leaf creation for date column until ORC-135

2018-10-16 Thread Chiran Ravani (JIRA)
Chiran Ravani created HIVE-20756:


 Summary: Disable SARG leaf creation for date column until ORC-135
 Key: HIVE-20756
 URL: https://issues.apache.org/jira/browse/HIVE-20756
 Project: Hive
  Issue Type: Bug
Affects Versions: 2.1.1
Reporter: Chiran Ravani
Assignee: Prasanth Jayachandran


Until ORC-135 is committed and orc version is updated in hive, disable SARG 
creation for timestamp columns in hive. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-20755) Add the ability to push Dynamic Between and Bloom filters to JDBC handler

2018-10-16 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-20755:
--

 Summary: Add the ability to push Dynamic Between and Bloom filters 
to JDBC handler
 Key: HIVE-20755
 URL: https://issues.apache.org/jira/browse/HIVE-20755
 Project: Hive
  Issue Type: New Feature
  Components: StorageHandler
Reporter: Jesus Camacho Rodriguez


HIVE-20683 has done some work to push semijoin reduction to Druid. We could use 
a similar model to push these filters to JDBC sources, which would be quite 
useful, e.g., for joins between Hive and JDBC sources.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: Review Request 69050: HIVE-20720: Add partition column option to JDBC handler

2018-10-16 Thread Daniel Dai

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69050/
---

(Updated Oct. 16, 2018, 5:08 p.m.)


Review request for hive.


Repository: hive-git


Description
---

See HIVE-20720.


Diffs (updated)
-

  common/src/java/org/apache/hadoop/hive/conf/Constants.java 1190679 
  itests/src/test/resources/testconfiguration.properties 9a87464 
  jdbc-handler/src/main/java/org/apache/hive/storage/jdbc/JdbcInputFormat.java 
74999db 
  jdbc-handler/src/main/java/org/apache/hive/storage/jdbc/JdbcInputSplit.java 
3a6ada8 
  jdbc-handler/src/main/java/org/apache/hive/storage/jdbc/JdbcRecordReader.java 
1da6213 
  jdbc-handler/src/main/java/org/apache/hive/storage/jdbc/JdbcSerDe.java 
5947628 
  
jdbc-handler/src/main/java/org/apache/hive/storage/jdbc/conf/JdbcStorageConfigManager.java
 18e2397 
  
jdbc-handler/src/main/java/org/apache/hive/storage/jdbc/dao/DatabaseAccessor.java
 fdaa794 
  
jdbc-handler/src/main/java/org/apache/hive/storage/jdbc/dao/GenericJdbcDatabaseAccessor.java
 abdc5f0 
  
jdbc-handler/src/main/java/org/apache/hive/storage/jdbc/dao/JdbcRecordIterator.java
 a95aca2 
  
jdbc-handler/src/main/java/org/apache/hive/storage/jdbc/dao/JethroDatabaseAccessor.java
 db0454e 
  
jdbc-handler/src/main/java/org/apache/hive/storage/jdbc/dao/MySqlDatabaseAccessor.java
 86fde7c 
  
jdbc-handler/src/main/java/org/apache/hive/storage/jdbc/spitter/DateIntervalSplitter.java
 PRE-CREATION 
  
jdbc-handler/src/main/java/org/apache/hive/storage/jdbc/spitter/DecimalIntervalSplitter.java
 PRE-CREATION 
  
jdbc-handler/src/main/java/org/apache/hive/storage/jdbc/spitter/DoubleIntervalSplitter.java
 PRE-CREATION 
  
jdbc-handler/src/main/java/org/apache/hive/storage/jdbc/spitter/IntervalSplitter.java
 PRE-CREATION 
  
jdbc-handler/src/main/java/org/apache/hive/storage/jdbc/spitter/IntervalSplitterFactory.java
 PRE-CREATION 
  
jdbc-handler/src/main/java/org/apache/hive/storage/jdbc/spitter/LongIntervalSpitter.java
 PRE-CREATION 
  
jdbc-handler/src/main/java/org/apache/hive/storage/jdbc/spitter/TimestampIntervalSplitter.java
 PRE-CREATION 
  
jdbc-handler/src/test/java/org/apache/hive/storage/jdbc/TestJdbcInputFormat.java
 b146633 
  
jdbc-handler/src/test/java/org/apache/hive/storage/jdbc/dao/TestGenericJdbcDatabaseAccessor.java
 34f061e 
  ql/src/test/queries/clientpositive/external_jdbc_table_partition.q 
PRE-CREATION 
  ql/src/test/queries/clientpositive/external_jdbc_table_typeconversion.q 
PRE-CREATION 
  ql/src/test/results/clientpositive/llap/external_jdbc_table_partition.q.out 
PRE-CREATION 
  
ql/src/test/results/clientpositive/llap/external_jdbc_table_typeconversion.q.out
 PRE-CREATION 


Diff: https://reviews.apache.org/r/69050/diff/2/

Changes: https://reviews.apache.org/r/69050/diff/1-2/


Testing
---


Thanks,

Daniel Dai



Review Request 69050: HIVE-20720: Add partition column option to JDBC handler

2018-10-16 Thread Daniel Dai

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69050/
---

Review request for hive.


Repository: hive-git


Description
---

See HIVE-20720.


Diffs
-

  common/src/java/org/apache/hadoop/hive/conf/Constants.java 1190679 
  itests/src/test/resources/testconfiguration.properties 9a87464 
  jdbc-handler/src/main/java/org/apache/hive/storage/jdbc/JdbcInputFormat.java 
74999db 
  jdbc-handler/src/main/java/org/apache/hive/storage/jdbc/JdbcInputSplit.java 
3a6ada8 
  jdbc-handler/src/main/java/org/apache/hive/storage/jdbc/JdbcRecordReader.java 
1da6213 
  jdbc-handler/src/main/java/org/apache/hive/storage/jdbc/JdbcSerDe.java 
5947628 
  
jdbc-handler/src/main/java/org/apache/hive/storage/jdbc/conf/JdbcStorageConfigManager.java
 18e2397 
  
jdbc-handler/src/main/java/org/apache/hive/storage/jdbc/dao/DatabaseAccessor.java
 fdaa794 
  
jdbc-handler/src/main/java/org/apache/hive/storage/jdbc/dao/GenericJdbcDatabaseAccessor.java
 abdc5f0 
  
jdbc-handler/src/main/java/org/apache/hive/storage/jdbc/dao/JdbcRecordIterator.java
 a95aca2 
  
jdbc-handler/src/main/java/org/apache/hive/storage/jdbc/dao/JethroDatabaseAccessor.java
 db0454e 
  
jdbc-handler/src/main/java/org/apache/hive/storage/jdbc/dao/MySqlDatabaseAccessor.java
 86fde7c 
  
jdbc-handler/src/main/java/org/apache/hive/storage/jdbc/spitter/DateIntervalSplitter.java
 PRE-CREATION 
  
jdbc-handler/src/main/java/org/apache/hive/storage/jdbc/spitter/DecimalIntervalSplitter.java
 PRE-CREATION 
  
jdbc-handler/src/main/java/org/apache/hive/storage/jdbc/spitter/DoubleIntervalSplitter.java
 PRE-CREATION 
  
jdbc-handler/src/main/java/org/apache/hive/storage/jdbc/spitter/IntervalSplitter.java
 PRE-CREATION 
  
jdbc-handler/src/main/java/org/apache/hive/storage/jdbc/spitter/IntervalSplitterFactory.java
 PRE-CREATION 
  
jdbc-handler/src/main/java/org/apache/hive/storage/jdbc/spitter/LongIntervalSpitter.java
 PRE-CREATION 
  
jdbc-handler/src/main/java/org/apache/hive/storage/jdbc/spitter/TimestampIntervalSplitter.java
 PRE-CREATION 
  
jdbc-handler/src/test/java/org/apache/hive/storage/jdbc/TestJdbcInputFormat.java
 b146633 
  
jdbc-handler/src/test/java/org/apache/hive/storage/jdbc/dao/TestGenericJdbcDatabaseAccessor.java
 34f061e 
  ql/src/test/queries/clientpositive/external_jdbc_table_partition.q 
PRE-CREATION 
  ql/src/test/queries/clientpositive/external_jdbc_table_typeconversion.q 
PRE-CREATION 
  ql/src/test/results/clientpositive/llap/external_jdbc_table_partition.q.out 
PRE-CREATION 
  
ql/src/test/results/clientpositive/llap/external_jdbc_table_typeconversion.q.out
 PRE-CREATION 


Diff: https://reviews.apache.org/r/69050/diff/1/


Testing
---


Thanks,

Daniel Dai



[jira] [Created] (HIVE-20754) JDBC: Add some missing classes to jdbc standalone jar (follow up to HIVE-19801)

2018-10-16 Thread Vaibhav Gumashta (JIRA)
Vaibhav Gumashta created HIVE-20754:
---

 Summary: JDBC: Add some missing classes to jdbc standalone jar 
(follow up to HIVE-19801)
 Key: HIVE-20754
 URL: https://issues.apache.org/jira/browse/HIVE-20754
 Project: Hive
  Issue Type: Bug
  Components: JDBC
Affects Versions: 3.1.0
Reporter: Vaibhav Gumashta


Some more classes are needed in a secure cluster



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: Review Request 69022: HIVE-20737: Local SparkContext is shared between user sessions and should be closed only when there is no active

2018-10-16 Thread denys kuzmenko via Review Board


> On Oct. 16, 2018, 1:47 p.m., Sahil Takiar wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/spark/session/SparkSessionImpl.java
> > Line 352 (original)
> > 
> >
> > why remove this?
> 
> denys kuzmenko wrote:
> it's not required. close() method is covered with the lock, and 
> activeJobs is a concurrent collection
> 
> Sahil Takiar wrote:
> what happens if a job is submitted after `hasTimedOut` returns true?
> 
> denys kuzmenko wrote:
> I see. However existing lock won't help as it doesn't prevent other 
> threads from adding new queries. 
> 
> public void onQuerySubmission(String queryId) {
> activeJobs.add(queryId);
> }
> 
> we might need to cover this with separate lock (onQueryCompletion, 
> onQuerySubmission, triggerTimeout)
> What do you think?
> 
> denys kuzmenko wrote:
> what happens if a job is submitted after hasTimedOut returns true? 
> current Session will be closed and a new one will be opened.
> 
> denys kuzmenko wrote:
> there might be an issue when the 2nd session checks the state and gets isOpen, 
> and before it reaches the submit phase, the 1st one calls close().
> I think we need synchronization for active sessions manipulation.
> 
> denys kuzmenko wrote:
> Fixed: reverted to the original locking. The tricky case above will be 
> handled by preventing new queries from executing open() before close() is 
> complete.

@before triggerTimeout() is complete


- denys


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69022/#review209617
---


On Oct. 16, 2018, 4:49 p.m., denys kuzmenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69022/
> ---
> 
> (Updated Oct. 16, 2018, 4:49 p.m.)
> 
> 
> Review request for hive, Sahil Takiar and Adam Szita.
> 
> 
> Bugs: HIVE-20737
> https://issues.apache.org/jira/browse/HIVE-20737
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> 1. Local SparkContext is shared between user sessions and should be closed 
> only when there is no active. 
> 2. Possible race condition in SparkSession.open() in case when user queries 
> run in parallel within the same session.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/LocalHiveSparkClient.java 
> 72ff53e3bd 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/spark/session/SparkSessionImpl.java
>  bb50129518 
>   
> ql/src/test/org/apache/hadoop/hive/ql/exec/spark/TestLocalHiveSparkClient.java
>  PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/69022/diff/3/
> 
> 
> Testing
> ---
> 
> Added TestLocalHiveSparkClient test
> 
> 
> File Attachments
> 
> 
> HIVE-20737.7.patch
>   
> https://reviews.apache.org/media/uploaded/files/2018/10/15/9cf8a2b3-9ec1-4316-81d0-3cd124b1a9fd__HIVE-20737.7.patch
> 
> 
> Thanks,
> 
> denys kuzmenko
> 
>



Re: Review Request 69022: HIVE-20737: Local SparkContext is shared between user sessions and should be closed only when there is no active

2018-10-16 Thread denys kuzmenko via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69022/
---

(Updated Oct. 16, 2018, 4:49 p.m.)


Review request for hive, Sahil Takiar and Adam Szita.


Bugs: HIVE-20737
https://issues.apache.org/jira/browse/HIVE-20737


Repository: hive-git


Description
---

1. Local SparkContext is shared between user sessions and should be closed only 
when there is no active. 
2. Possible race condition in SparkSession.open() in case when user queries run 
in parallel within the same session.


Diffs (updated)
-

  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/LocalHiveSparkClient.java 
72ff53e3bd 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/spark/session/SparkSessionImpl.java 
bb50129518 
  
ql/src/test/org/apache/hadoop/hive/ql/exec/spark/TestLocalHiveSparkClient.java 
PRE-CREATION 


Diff: https://reviews.apache.org/r/69022/diff/3/

Changes: https://reviews.apache.org/r/69022/diff/2-3/


Testing
---

Added TestLocalHiveSparkClient test


File Attachments


HIVE-20737.7.patch
  
https://reviews.apache.org/media/uploaded/files/2018/10/15/9cf8a2b3-9ec1-4316-81d0-3cd124b1a9fd__HIVE-20737.7.patch


Thanks,

denys kuzmenko



Re: Review Request 69022: HIVE-20737: Local SparkContext is shared between user sessions and should be closed only when there is no active

2018-10-16 Thread denys kuzmenko via Review Board


> On Oct. 16, 2018, 1:47 p.m., Sahil Takiar wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/spark/session/SparkSessionImpl.java
> > Line 352 (original)
> > 
> >
> > why remove this?
> 
> denys kuzmenko wrote:
> it's not required. close() method is covered with the lock, and 
> activeJobs is a concurrent collection
> 
> Sahil Takiar wrote:
> what happens if a job is submitted after `hasTimedOut` returns true?
> 
> denys kuzmenko wrote:
> I see. However existing lock won't help as it doesn't prevent other 
> threads from adding new queries. 
> 
> public void onQuerySubmission(String queryId) {
> activeJobs.add(queryId);
> }
> 
> we might need to cover this with separate lock (onQueryCompletion, 
> onQuerySubmission, triggerTimeout)
> What do you think?
> 
> denys kuzmenko wrote:
> what happens if a job is submitted after hasTimedOut returns true? 
> current Session will be closed and a new one will be opened.
> 
> denys kuzmenko wrote:
> there might be an issue when the 2nd session checks the state and gets isOpen, 
> and before it reaches the submit phase, the 1st one calls close().
> I think we need synchronization for active sessions manipulation.

Fixed: reverted to the original locking. The tricky case above will be handled 
by preventing new queries from executing open() before close() is complete.


- denys


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69022/#review209617
---


On Oct. 15, 2018, 7:21 p.m., denys kuzmenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69022/
> ---
> 
> (Updated Oct. 15, 2018, 7:21 p.m.)
> 
> 
> Review request for hive, Sahil Takiar and Adam Szita.
> 
> 
> Bugs: HIVE-20737
> https://issues.apache.org/jira/browse/HIVE-20737
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> 1. Local SparkContext is shared between user sessions and should be closed 
> only when there is no active. 
> 2. Possible race condition in SparkSession.open() in case when user queries 
> run in parallel within the same session.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/LocalHiveSparkClient.java 
> 72ff53e3bd 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/spark/session/SparkSessionImpl.java
>  bb50129518 
>   
> ql/src/test/org/apache/hadoop/hive/ql/exec/spark/TestLocalHiveSparkClient.java
>  PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/69022/diff/2/
> 
> 
> Testing
> ---
> 
> Added TestLocalHiveSparkClient test
> 
> 
> File Attachments
> 
> 
> HIVE-20737.7.patch
>   
> https://reviews.apache.org/media/uploaded/files/2018/10/15/9cf8a2b3-9ec1-4316-81d0-3cd124b1a9fd__HIVE-20737.7.patch
> 
> 
> Thanks,
> 
> denys kuzmenko
> 
>



Re: Review Request 69022: HIVE-20737: Local SparkContext is shared between user sessions and should be closed only when there is no active

2018-10-16 Thread denys kuzmenko via Review Board


> On Oct. 16, 2018, 1:47 p.m., Sahil Takiar wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/spark/session/SparkSessionImpl.java
> > Line 352 (original)
> > 
> >
> > why remove this?
> 
> denys kuzmenko wrote:
> it's not required. close() method is covered with the lock, and 
> activeJobs is a concurrent collection
> 
> Sahil Takiar wrote:
> what happens if a job is submitted after `hasTimedOut` returns true?
> 
> denys kuzmenko wrote:
> I see. However existing lock won't help as it doesn't prevent other 
> threads from adding new queries. 
> 
> public void onQuerySubmission(String queryId) {
> activeJobs.add(queryId);
> }
> 
> we might need to cover this with separate lock (onQueryCompletion, 
> onQuerySubmission, triggerTimeout)
> What do you think?
> 
> denys kuzmenko wrote:
> what happens if a job is submitted after hasTimedOut returns true? 
> current Session will be closed and a new one will be opened.

There might be an issue when the 2nd session checks the state and gets isOpen, 
and before it reaches the submit phase, the 1st one calls close().
I think we need synchronization for active sessions manipulation.


- denys


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69022/#review209617
---


On Oct. 15, 2018, 7:21 p.m., denys kuzmenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69022/
> ---
> 
> (Updated Oct. 15, 2018, 7:21 p.m.)
> 
> 
> Review request for hive, Sahil Takiar and Adam Szita.
> 
> 
> Bugs: HIVE-20737
> https://issues.apache.org/jira/browse/HIVE-20737
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> 1. Local SparkContext is shared between user sessions and should be closed 
> only when there is no active. 
> 2. Possible race condition in SparkSession.open() in case when user queries 
> run in parallel within the same session.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/LocalHiveSparkClient.java 
> 72ff53e3bd 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/spark/session/SparkSessionImpl.java
>  bb50129518 
>   
> ql/src/test/org/apache/hadoop/hive/ql/exec/spark/TestLocalHiveSparkClient.java
>  PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/69022/diff/2/
> 
> 
> Testing
> ---
> 
> Added TestLocalHiveSparkClient test
> 
> 
> File Attachments
> 
> 
> HIVE-20737.7.patch
>   
> https://reviews.apache.org/media/uploaded/files/2018/10/15/9cf8a2b3-9ec1-4316-81d0-3cd124b1a9fd__HIVE-20737.7.patch
> 
> 
> Thanks,
> 
> denys kuzmenko
> 
>



Re: Review Request 69022: HIVE-20737: Local SparkContext is shared between user sessions and should be closed only when there is no active

2018-10-16 Thread denys kuzmenko via Review Board


> On Oct. 16, 2018, 1:47 p.m., Sahil Takiar wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/spark/session/SparkSessionImpl.java
> > Line 352 (original)
> > 
> >
> > why remove this?
> 
> denys kuzmenko wrote:
> it's not required. close() method is covered with the lock, and 
> activeJobs is a concurrent collection
> 
> Sahil Takiar wrote:
> what happens if a job is submitted after `hasTimedOut` returns true?
> 
> denys kuzmenko wrote:
> I see. However existing lock won't help as it doesn't prevent other 
> threads from adding new queries. 
> 
> public void onQuerySubmission(String queryId) {
> activeJobs.add(queryId);
> }
> 
> we might need to cover this with separate lock (onQueryCompletion, 
> onQuerySubmission, triggerTimeout)
> What do you think?

What happens if a job is submitted after hasTimedOut returns true? The current 
session will be closed and a new one will be opened.


- denys


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69022/#review209617
---


On Oct. 15, 2018, 7:21 p.m., denys kuzmenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69022/
> ---
> 
> (Updated Oct. 15, 2018, 7:21 p.m.)
> 
> 
> Review request for hive, Sahil Takiar and Adam Szita.
> 
> 
> Bugs: HIVE-20737
> https://issues.apache.org/jira/browse/HIVE-20737
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> 1. Local SparkContext is shared between user sessions and should be closed 
> only when there is no active. 
> 2. Possible race condition in SparkSession.open() in case when user queries 
> run in parallel within the same session.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/LocalHiveSparkClient.java 
> 72ff53e3bd 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/spark/session/SparkSessionImpl.java
>  bb50129518 
>   
> ql/src/test/org/apache/hadoop/hive/ql/exec/spark/TestLocalHiveSparkClient.java
>  PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/69022/diff/2/
> 
> 
> Testing
> ---
> 
> Added TestLocalHiveSparkClient test
> 
> 
> File Attachments
> 
> 
> HIVE-20737.7.patch
>   
> https://reviews.apache.org/media/uploaded/files/2018/10/15/9cf8a2b3-9ec1-4316-81d0-3cd124b1a9fd__HIVE-20737.7.patch
> 
> 
> Thanks,
> 
> denys kuzmenko
> 
>



Re: Review Request 69022: HIVE-20737: Local SparkContext is shared between user sessions and should be closed only when there is no active

2018-10-16 Thread denys kuzmenko via Review Board


> On Oct. 16, 2018, 1:47 p.m., Sahil Takiar wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/spark/session/SparkSessionImpl.java
> > Line 352 (original)
> > 
> >
> > why remove this?
> 
> denys kuzmenko wrote:
> it's not required. close() method is covered with the lock, and 
> activeJobs is a concurrent collection
> 
> Sahil Takiar wrote:
> what happens if a job is submitted after `hasTimedOut` returns true?

I see. However, the existing lock won't help, as it doesn't prevent other 
threads from adding new queries.

public void onQuerySubmission(String queryId) {
  activeJobs.add(queryId);
}

We might need to cover this with a separate lock (onQueryCompletion, 
onQuerySubmission, triggerTimeout).
What do you think?
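
A minimal sketch of the separate-lock idea, under stated assumptions: the
GuardedSession class, the jobLock field and the closed flag are hypothetical
names used only for illustration, and this is not the SparkSessionImpl code.
The point is that submission, completion and the timeout check all take the
same lock, so a job cannot slip in between "no active jobs" and "close".

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration only; names do not come from the Hive code base.
class GuardedSession {
  private final Object jobLock = new Object();
  private final Set<String> activeJobs = new HashSet<>();
  private boolean closed = false;

  /** Registers a query; returns false if the session has already been closed. */
  boolean onQuerySubmission(String queryId) {
    synchronized (jobLock) {
      if (closed) {
        return false; // caller would have to open a new session
      }
      activeJobs.add(queryId);
      return true;
    }
  }

  /** Removes a finished query from the active set. */
  void onQueryCompletion(String queryId) {
    synchronized (jobLock) {
      activeJobs.remove(queryId);
    }
  }

  /** Closes the session only when no job is active; returns the closed state. */
  boolean triggerTimeout() {
    synchronized (jobLock) {
      if (!closed && activeJobs.isEmpty()) {
        closed = true;
      }
      return closed;
    }
  }
}
```

With this shape, a submission that races with the timeout either registers
before the check (and the timeout is skipped) or is rejected after the close.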


- denys


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69022/#review209617
---


On Oct. 15, 2018, 7:21 p.m., denys kuzmenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69022/
> ---
> 
> (Updated Oct. 15, 2018, 7:21 p.m.)
> 
> 
> Review request for hive, Sahil Takiar and Adam Szita.
> 
> 
> Bugs: HIVE-20737
> https://issues.apache.org/jira/browse/HIVE-20737
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> 1. Local SparkContext is shared between user sessions and should be closed 
> only when there is no active. 
> 2. Possible race condition in SparkSession.open() in case when user queries 
> run in parallel within the same session.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/LocalHiveSparkClient.java 
> 72ff53e3bd 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/spark/session/SparkSessionImpl.java
>  bb50129518 
>   
> ql/src/test/org/apache/hadoop/hive/ql/exec/spark/TestLocalHiveSparkClient.java
>  PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/69022/diff/2/
> 
> 
> Testing
> ---
> 
> Added TestLocalHiveSparkClient test
> 
> 
> File Attachments
> 
> 
> HIVE-20737.7.patch
>   
> https://reviews.apache.org/media/uploaded/files/2018/10/15/9cf8a2b3-9ec1-4316-81d0-3cd124b1a9fd__HIVE-20737.7.patch
> 
> 
> Thanks,
> 
> denys kuzmenko
> 
>



Re: Review Request 68474: HIVE-20440

2018-10-16 Thread Sahil Takiar

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68474/#review209628
---



Could we add some more E2E integration tests? I'm thinking they could be at the 
granularity of a `MapJoinOperator`? For example, confirm that starting a new 
query actually evicts everything from the cache? We want to make sure we aren't 
accidentally leaking small tables.

- Sahil Takiar


On Oct. 10, 2018, 1:20 p.m., Antal Sinkovits wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68474/
> ---
> 
> (Updated Oct. 10, 2018, 1:20 p.m.)
> 
> 
> Review request for hive, Naveen Gangam, Sahil Takiar, Adam Szita, and Xuefu 
> Zhang.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> I've modified the SmallTableCache to use guava cache, with soft references.
> By using a value loader, I've also eliminated the synchronization on the 
> intern-ed string of the path.
> 
> 
> Diffs
> -
> 
>   ql/pom.xml d73deba440702ec39fc5610df28e0fe54baef025 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java 
> da1dd426c9155290e30fd1e3ae7f19a5479a8967 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/AbstractMapJoinTableContainer.java
>  9e65fd98d6e4451421641b1429ccf334fe9a9586 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/HybridHashTableContainer.java
>  54377428eafdb79e1bbdc8a182eafb46f8febd23 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinBytesTableContainer.java
>  0e4b8df036724bd83e85fc3cc70f534272dab4c4 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinTableContainer.java
>  74e0b120ea3560a6a2a0074e6c0026b4874b3d5e 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinTableContainerSerDe.java
>  24b8fea33815867ce544fd284437c4d02a21f1a3 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HashTableLoader.java 
> cf27e92bafdc63096ec0fa8c3106657bab52f370 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SmallTableCache.java 
> 3293100af96dc60408c53065fa89143ead98f818 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/fast/VectorMapJoinFastTableContainer.java
>  e8dcbf18cb09b190536f920a53d6e9fa870ce33b 
>   ql/src/test/org/apache/hadoop/hive/ql/exec/spark/TestSmallTableCache.java 
> PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/68474/diff/3/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Antal Sinkovits
> 
>



Re: Review Request 68474: HIVE-20440

2018-10-16 Thread Sahil Takiar

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68474/#review209626
---




ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SmallTableCache.java
Lines 117 (patched)


nit: if you want to leave the `@return` section empty, then just remove it 
entirely



ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SmallTableCache.java
Lines 127 (patched)


nit: same as above



ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinTableContainer.java
Lines 178-190 (patched)


what about changing this to something like `getKey()` and just returning a 
`String`. I don't think the interface needs to be tied to reading data to a 
folder on HDFS.
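
A rough sketch of what such an interface could look like, assuming
hypothetical names (KeyedContainer, PathBackedContainer, KeyedCache) that are
not part of the patch. The cache only sees an opaque String key; whether that
key happens to be an HDFS path is an implementation detail of the container.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical interface: exposes only an opaque cache key.
interface KeyedContainer {
  String getKey();
}

// One possible implementation that happens to use a path string as its key.
class PathBackedContainer implements KeyedContainer {
  private final String path;

  PathBackedContainer(String path) {
    this.path = path;
  }

  @Override
  public String getKey() {
    return path;
  }
}

// Hypothetical cache that depends only on getKey(), not on any file system.
class KeyedCache {
  private final Map<String, KeyedContainer> entries = new ConcurrentHashMap<>();

  void put(KeyedContainer container) {
    entries.put(container.getKey(), container);
  }

  KeyedContainer get(String key) {
    return entries.get(key);
  }
}
```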



ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SmallTableCache.java
Lines 131 (patched)


why do we run the action just for the l2 cache?


- Sahil Takiar


On Oct. 10, 2018, 1:20 p.m., Antal Sinkovits wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68474/
> ---
> 
> (Updated Oct. 10, 2018, 1:20 p.m.)
> 
> 
> Review request for hive, Naveen Gangam, Sahil Takiar, Adam Szita, and Xuefu 
> Zhang.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> I've modified the SmallTableCache to use guava cache, with soft references.
> By using a value loader, I've also eliminated the synchronization on the 
> intern-ed string of the path.
> 
> 
> Diffs
> -
> 
>   ql/pom.xml d73deba440702ec39fc5610df28e0fe54baef025 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java 
> da1dd426c9155290e30fd1e3ae7f19a5479a8967 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/AbstractMapJoinTableContainer.java
>  9e65fd98d6e4451421641b1429ccf334fe9a9586 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/HybridHashTableContainer.java
>  54377428eafdb79e1bbdc8a182eafb46f8febd23 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinBytesTableContainer.java
>  0e4b8df036724bd83e85fc3cc70f534272dab4c4 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinTableContainer.java
>  74e0b120ea3560a6a2a0074e6c0026b4874b3d5e 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinTableContainerSerDe.java
>  24b8fea33815867ce544fd284437c4d02a21f1a3 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HashTableLoader.java 
> cf27e92bafdc63096ec0fa8c3106657bab52f370 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SmallTableCache.java 
> 3293100af96dc60408c53065fa89143ead98f818 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/fast/VectorMapJoinFastTableContainer.java
>  e8dcbf18cb09b190536f920a53d6e9fa870ce33b 
>   ql/src/test/org/apache/hadoop/hive/ql/exec/spark/TestSmallTableCache.java 
> PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/68474/diff/3/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Antal Sinkovits
> 
>



[jira] [Created] (HIVE-20753) Derby thread interrupt during ptest

2018-10-16 Thread Zoltan Haindrich (JIRA)
Zoltan Haindrich created HIVE-20753:
---

 Summary: Derby thread interrupt during ptest
 Key: HIVE-20753
 URL: https://issues.apache.org/jira/browse/HIVE-20753
 Project: Hive
  Issue Type: Bug
Reporter: Zoltan Haindrich


I've had another failed ptest execution. It seems Derby caught an unexpected 
interrupt call, which hung the execution; after that nothing happened for 
about half an hour, after which the batch timeout occurred.

{code}
Caused by: ERROR XSDG9: Derby thread received an interrupt during a disk I/O 
operation, please check your application for the source of the interrupt.
at org.apache.derby.iapi.error.StandardException.newException(Unknown 
Source)
at 
org.apache.derby.impl.jdbc.SQLExceptionFactory.wrapArgsForTransportAcrossDRDA(Unknown
 Source)
... 42 more
{code}

full stack trace:

{code}
2018-10-16T06:47:29,355 ERROR [Heartbeater-3] lockmgr.DbTxnManager: Failed 
trying to heartbeat queryId=null, currentUser: hiveptest (auth:SIMPLE): null
java.lang.reflect.UndeclaredThrowableException: null
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1700)
 ~[hadoop-common-3.1.0.jar:?]
at 
org.apache.hadoop.hive.ql.lockmgr.DbTxnManager$Heartbeater.run(DbTxnManager.java:955)
 [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
[?:1.8.0_102]
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) 
[?:1.8.0_102]
at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
 [?:1.8.0_102]
at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
 [?:1.8.0_102]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) 
[?:1.8.0_102]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) 
[?:1.8.0_102]
at java.lang.Thread.run(Thread.java:745) [?:1.8.0_102]
Caused by: org.apache.hadoop.hive.ql.lockmgr.LockException: Error communicating 
with the metastore(txnid:15,lockid:0 queryId=null txnid:0)
at 
org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.heartbeat(DbTxnManager.java:590) 
~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.lockmgr.DbTxnManager$Heartbeater.lambda$run$0(DbTxnManager.java:956)
 ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at java.security.AccessController.doPrivileged(Native Method) 
~[?:1.8.0_102]
at javax.security.auth.Subject.doAs(Subject.java:422) ~[?:1.8.0_102]
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1682)
 ~[hadoop-common-3.1.0.jar:?]
... 8 more
Caused by: org.apache.hadoop.hive.metastore.api.MetaException: Unable to select 
from transaction database java.sql.SQLException: Derby thread received an 
interrupt during a disk I/O operation, please check your application for the 
source of the interrupt.
at 
org.apache.derby.impl.jdbc.SQLExceptionFactory.getSQLException(Unknown Source)
at 
org.apache.derby.impl.jdbc.SQLExceptionFactory.getSQLException(Unknown Source)
at org.apache.derby.impl.jdbc.Util.seeNextException(Unknown Source)
at 
org.apache.derby.impl.jdbc.TransactionResourceImpl.wrapInSQLException(Unknown 
Source)
at 
org.apache.derby.impl.jdbc.TransactionResourceImpl.handleException(Unknown 
Source)
at org.apache.derby.impl.jdbc.EmbedConnection.handleException(Unknown 
Source)
at org.apache.derby.impl.jdbc.ConnectionChild.handleException(Unknown 
Source)
at org.apache.derby.impl.jdbc.EmbedStatement.execute(Unknown Source)
at org.apache.derby.impl.jdbc.EmbedStatement.executeLargeUpdate(Unknown 
Source)
at org.apache.derby.impl.jdbc.EmbedStatement.executeUpdate(Unknown 
Source)
at 
com.zaxxer.hikari.pool.ProxyStatement.executeUpdate(ProxyStatement.java:117)
at 
com.zaxxer.hikari.pool.HikariProxyStatement.executeUpdate(HikariProxyStatement.java)
at 
org.apache.hadoop.hive.metastore.txn.TxnHandler.heartbeatTxn(TxnHandler.java:4405)
at 
org.apache.hadoop.hive.metastore.txn.TxnHandler.heartbeat(TxnHandler.java:2659)
at 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.heartbeat(HiveMetaStore.java:7430)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147)

Re: Review Request 69022: HIVE-20737: Local SparkContext is shared between user sessions and should be closed only when there is no active

2018-10-16 Thread Sahil Takiar


> On Oct. 16, 2018, 1:47 p.m., Sahil Takiar wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/spark/LocalHiveSparkClient.java
> > Line 73 (original), 73 (patched)
> > 
> >
> > if we expect multiple sessions to access this, should we make this 
> > `volatile`?
> 
> denys kuzmenko wrote:
> it's being accessed only inside of the critical section (within the lock 
> boundaries)

does java guarantee that non-volatile variables accessed inside a critical 
section are not cached locally by a CPU?
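
For reference, the Java Memory Model specifies that releasing a monitor
happens-before every later acquisition of the same monitor, so writes made
inside a critical section are visible to the next thread that takes the same
lock, even when the field is not volatile. This only holds if every access
goes through that same lock. A minimal sketch, assuming a hypothetical
LockedCounter class that is unrelated to the patch:

```java
// Hypothetical illustration: a plain (non-volatile) field guarded by one lock.
class LockedCounter {
  private final Object lock = new Object();
  private int value; // deliberately not volatile

  void increment() {
    synchronized (lock) {
      value++;
    }
  }

  int get() {
    synchronized (lock) {
      return value;
    }
  }

  /** Runs several threads that increment under the lock and returns the total. */
  static int runDemo(int threads, int perThread) {
    LockedCounter counter = new LockedCounter();
    Thread[] workers = new Thread[threads];
    for (int i = 0; i < threads; i++) {
      workers[i] = new Thread(() -> {
        for (int j = 0; j < perThread; j++) {
          counter.increment();
        }
      });
      workers[i].start();
    }
    try {
      for (Thread t : workers) {
        t.join();
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
    return counter.get();
  }
}
```

Because every read and write of value happens under the same lock, runDemo
returns threads * perThread; a read outside the lock would have no such
guarantee.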


> On Oct. 16, 2018, 1:47 p.m., Sahil Takiar wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/spark/session/SparkSessionImpl.java
> > Lines 112-116 (original)
> > 
> >
> > do we have unit tests that cover this?
> 
> denys kuzmenko wrote:
> queryCompleted and (lastSparkJobCompletionTime = 0) are complementary 
> conditions that are checked and set at the same place
> 
> denys kuzmenko wrote:
> queryCompleted and (lastSparkJobCompletionTime > 0)
> 
> denys kuzmenko wrote:
> we do have a bunch of tests (TestSparkSession*, TestLocalSparkClient) that 
> are covering this

i don't think we have a test that explicitly checks what happens when a timeout 
is triggered before the first HoS query is run, but i think i added some in 
HIVE-20519 already anyway


> On Oct. 16, 2018, 1:47 p.m., Sahil Takiar wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/spark/session/SparkSessionImpl.java
> > Line 125 (original), 120 (patched)
> > 
> >
> > we might need to re-think how we are synchronizing this method a bit. I 
> > think we want to support the use case where we call `close()` while 
> > `open()` is being run. This offers a way for the user to cancel a session 
> > while it is being opened, which can be useful if opening a session takes a 
> > long time, which can happen on a busy cluster where there aren't enough 
> > resources to open a session.
> > 
> > fixing that might be out of the scope of this JIRA, so I would 
> > recommend using a separate lock to guard against multiple users calling 
> > open on the same session.
> 
> Sahil Takiar wrote:
> Tracking the aforementioned fix in HIVE-20519, unless you want to fix it 
> in this patch.
> 
> denys kuzmenko wrote:
> i think it should be addressed in another JIRA; right now we need to have 
> at least the basic use case working

okay, still recommend using a separate lock


> On Oct. 16, 2018, 1:47 p.m., Sahil Takiar wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/spark/session/SparkSessionImpl.java
> > Line 352 (original)
> > 
> >
> > why remove this?
> 
> denys kuzmenko wrote:
> it's not required. close() method is covered with the lock, and 
> activeJobs is a concurrent collection

what happens if a job is submitted after `hasTimedOut` returns true?


- Sahil


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69022/#review209617
---


On Oct. 15, 2018, 7:21 p.m., denys kuzmenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69022/
> ---
> 
> (Updated Oct. 15, 2018, 7:21 p.m.)
> 
> 
> Review request for hive, Sahil Takiar and Adam Szita.
> 
> 
> Bugs: HIVE-20737
> https://issues.apache.org/jira/browse/HIVE-20737
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> 1. Local SparkContext is shared between user sessions and should be closed 
> only when there is no active. 
> 2. Possible race condition in SparkSession.open() in case when user queries 
> run in parallel within the same session.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/LocalHiveSparkClient.java 
> 72ff53e3bd 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/spark/session/SparkSessionImpl.java
>  bb50129518 
>   
> ql/src/test/org/apache/hadoop/hive/ql/exec/spark/TestLocalHiveSparkClient.java
>  PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/69022/diff/2/
> 
> 
> Testing
> ---
> 
> Added TestLocalHiveSparkClient test
> 
> 
> File Attachments
> 
> 
> HIVE-20737.7.patch
>   
> https://reviews.apache.org/media/uploaded/files/2018/10/15/9cf8a2b3-9ec1-4316-81d0-3cd124b1a9fd__HIVE-20737.7.patch
> 
> 
> Thanks,
> 
> denys kuzmenko
> 
>



Re: Review Request 69022: HIVE-20737: Local SparkContext is shared between user sessions and should be closed only when there is no active

2018-10-16 Thread denys kuzmenko via Review Board


> On Oct. 16, 2018, 1:47 p.m., Sahil Takiar wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/spark/session/SparkSessionImpl.java
> > Lines 112-116 (original)
> > 
> >
> > do we have unit tests that cover this?
> 
> denys kuzmenko wrote:
> queryCompleted and (lastSparkJobCompletionTime = 0) are complementary 
> conditions that are checked and set at the same place
> 
> denys kuzmenko wrote:
> queryCompleted and (lastSparkJobCompletionTime > 0)

we do have a bunch of tests (TestSparkSession*, TestLocalSparkClient) that are 
covering this


- denys


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69022/#review209617
---


On Oct. 15, 2018, 7:21 p.m., denys kuzmenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69022/
> ---
> 
> (Updated Oct. 15, 2018, 7:21 p.m.)
> 
> 
> Review request for hive, Sahil Takiar and Adam Szita.
> 
> 
> Bugs: HIVE-20737
> https://issues.apache.org/jira/browse/HIVE-20737
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> 1. Local SparkContext is shared between user sessions and should be closed 
> only when there is no active. 
> 2. Possible race condition in SparkSession.open() in case when user queries 
> run in parallel within the same session.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/LocalHiveSparkClient.java 
> 72ff53e3bd 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/spark/session/SparkSessionImpl.java
>  bb50129518 
>   
> ql/src/test/org/apache/hadoop/hive/ql/exec/spark/TestLocalHiveSparkClient.java
>  PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/69022/diff/2/
> 
> 
> Testing
> ---
> 
> Added TestLocalHiveSparkClient test
> 
> 
> File Attachments
> 
> 
> HIVE-20737.7.patch
>   
> https://reviews.apache.org/media/uploaded/files/2018/10/15/9cf8a2b3-9ec1-4316-81d0-3cd124b1a9fd__HIVE-20737.7.patch
> 
> 
> Thanks,
> 
> denys kuzmenko
> 
>



Re: Review Request 69022: HIVE-20737: Local SparkContext is shared between user sessions and should be closed only when there is no active

2018-10-16 Thread denys kuzmenko via Review Board


> On Oct. 16, 2018, 1:47 p.m., Sahil Takiar wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/spark/session/SparkSessionImpl.java
> > Lines 112-116 (original)
> > 
> >
> > do we have unit tests that cover this?
> 
> denys kuzmenko wrote:
> queryCompleted and (lastSparkJobCompletionTime = 0) are complementary 
> conditions that are checked and set at the same place

queryCompleted and (lastSparkJobCompletionTime > 0)


- denys


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69022/#review209617
---





Re: Review Request 69022: HIVE-20737: Local SparkContext is shared between user sessions and should be closed only when there is no active

2018-10-16 Thread denys kuzmenko via Review Board


> On Oct. 16, 2018, 1:47 p.m., Sahil Takiar wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/spark/LocalHiveSparkClient.java
> > Line 73 (original), 73 (patched)
> > 
> >
> > if we expect multiple sessions to access this, should we make this 
> > `volatile`?

it's being accessed only inside the critical section (within the lock 
boundaries)


> On Oct. 16, 2018, 1:47 p.m., Sahil Takiar wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/spark/LocalHiveSparkClient.java
> > Lines 74 (patched)
> > 
> >
> > should probably make this `volatile` in case multiple threads try to 
> > get an instance

it's being accessed only inside the critical section (within the lock 
boundaries)
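
To illustrate the argument in the two replies above: a field that is read and written only while one lock is held does not need `volatile`, because releasing and re-acquiring the same lock already gives the next thread a consistent view. The sketch below also counts active sessions so the shared object is dropped only when the last one releases it, as the description asks. It is not the code from the patch; `SharedClientHolder`, `acquire()` and `release()` are made-up names, and a plain `Object` stands in for the shared context.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch; class and method names are illustrative, not from the patch.
final class SharedClientHolder {
  private static final ReentrantLock LOCK = new ReentrantLock();

  // Neither field is volatile: both are read and written only while LOCK is held.
  private static Object client;
  private static int activeSessions;

  private SharedClientHolder() {}

  /** Returns the shared instance, creating it on first use. */
  static Object acquire() {
    LOCK.lock();
    try {
      if (client == null) {
        client = new Object(); // stands in for creating the shared context
      }
      activeSessions++;
      return client;
    } finally {
      LOCK.unlock();
    }
  }

  /** Returns true if this call released the last session and dropped the instance. */
  static boolean release() {
    LOCK.lock();
    try {
      if (activeSessions == 0) {
        return false; // nothing to release
      }
      activeSessions--;
      if (activeSessions == 0) {
        client = null; // stands in for closing the shared context
        return true;
      }
      return false;
    } finally {
      LOCK.unlock();
    }
  }
}
```

If any read of these fields happened outside the lock, this reasoning would no longer hold and `volatile` (or a wider lock) would be needed.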


> On Oct. 16, 2018, 1:47 p.m., Sahil Takiar wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/spark/session/SparkSessionImpl.java
> > Lines 112-116 (original)
> > 
> >
> > do we have unit tests that cover this?

queryCompleted and (lastSparkJobCompletionTime = 0) are complementary 
conditions that are checked and set at the same place


> On Oct. 16, 2018, 1:47 p.m., Sahil Takiar wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/spark/session/SparkSessionImpl.java
> > Line 125 (original), 120 (patched)
> > 
> >
> > we might need to re-think how we are synchronizing this method a bit. I 
> > think we want to support the use case where we call `close()` while 
> > `open()` is being run. This offers a way for the user to cancel a session 
> > while it is being opened, which can be useful if opening a session takes a 
> > long time, which can happen on a busy cluster where there aren't enough 
> > resources to open a session.
> > 
> > fixing that might be out of the scope of this JIRA, so I would 
> > recommend using a separate lock to guard against multiple users calling 
> > open on the same session.
> 
> Sahil Takiar wrote:
> Tracking the aforementioned fix in HIVE-20519, unless you want to fix it 
> in this patch.

I think it should be addressed in another JIRA; right now we need to have 
at least the basic use case working
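
For reference, the "separate lock" suggestion quoted above could look roughly like the sketch below: one lock serializes callers of `open()`, and a second, short-lived lock guards the state flag, so `close()` never waits behind a slow `open()`. This is only an assumption about the shape of such a change. `SessionSketch`, `openLock`, `stateLock` and `initCount` are hypothetical names, and actually cancelling an in-flight `open()` (the HIVE-20519 part) is not handled here.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch; not the SparkSessionImpl code from the patch.
final class SessionSketch {
  private final Object openLock = new Object();  // serializes open() callers only
  private final Object stateLock = new Object(); // guards isOpen, held briefly
  private boolean isOpen;

  // Counts how many times the expensive initialization actually ran.
  final AtomicInteger initCount = new AtomicInteger();

  void open() {
    synchronized (openLock) {
      synchronized (stateLock) {
        if (isOpen) {
          return; // another caller already opened the session
        }
      }
      initCount.incrementAndGet(); // stands in for the slow client startup
      synchronized (stateLock) {
        isOpen = true;
      }
    }
  }

  void close() {
    // Does not take openLock, so it is not blocked by an open() in progress.
    synchronized (stateLock) {
      isOpen = false;
    }
  }

  boolean isOpen() {
    synchronized (stateLock) {
      return isOpen;
    }
  }
}
```

Note that in this sketch a `close()` issued during `open()` is overwritten once `open()` finishes; handling that properly is the follow-up work.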


> On Oct. 16, 2018, 1:47 p.m., Sahil Takiar wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/spark/session/SparkSessionImpl.java
> > Line 352 (original)
> > 
> >
> > why remove this?

It's not required: the close() method is covered by the lock, and activeJobs is 
a concurrent collection
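
The reasoning in this reply can be illustrated with a small sketch: if job ids live in a concurrent set, individual add/remove calls are already thread-safe, and only the compound shutdown step needs the lock. `ActiveJobsSketch` and its methods are hypothetical names, not the fields or methods touched by the patch.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch; not the actual SparkSessionImpl members.
final class ActiveJobsSketch {
  // Concurrent set: add/remove need no external synchronization.
  private final Set<String> activeJobs = ConcurrentHashMap.newKeySet();

  void onJobStarted(String jobId) {
    activeJobs.add(jobId);
  }

  void onJobCompleted(String jobId) {
    activeJobs.remove(jobId);
  }

  // The compound shutdown step is the part covered by the lock.
  synchronized void close() {
    activeJobs.clear();
  }

  int activeCount() {
    return activeJobs.size();
  }
}
```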


- denys


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69022/#review209617
---





Re: Review Request 69022: HIVE-20737: Local SparkContext is shared between user sessions and should be closed only when there is no active

2018-10-16 Thread Sahil Takiar


> On Oct. 16, 2018, 1:47 p.m., Sahil Takiar wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/spark/session/SparkSessionImpl.java
> > Line 125 (original), 120 (patched)
> > 
> >
> > we might need to re-think how we are synchronizing this method a bit. I 
> > think we want to support the use case where we call `close()` while 
> > `open()` is being run. This offers a way for the user to cancel a session 
> > while it is being opened, which can be useful if opening a session takes a 
> > long time, which can happen on a busy cluster where there aren't enough 
> > resources to open a session.
> > 
> > fixing that might be out of the scope of this JIRA, so I would 
> > recommend using a separate lock to guard against multiple users calling 
> > open on the same session.

Tracking the aforementioned fix in HIVE-20519, unless you want to fix it in this 
patch.


- Sahil


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69022/#review209617
---





Re: Review Request 69022: HIVE-20737: Local SparkContext is shared between user sessions and should be closed only when there is no active

2018-10-16 Thread Sahil Takiar

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69022/#review209617
---




ql/src/java/org/apache/hadoop/hive/ql/exec/spark/LocalHiveSparkClient.java
Line 73 (original), 73 (patched)


if we expect multiple sessions to access this, should we make this 
`volatile`?



ql/src/java/org/apache/hadoop/hive/ql/exec/spark/LocalHiveSparkClient.java
Lines 74 (patched)


should probably make this `volatile` in case multiple threads try to get an 
instance



ql/src/java/org/apache/hadoop/hive/ql/exec/spark/session/SparkSessionImpl.java
Lines 112-116 (original)


do we have unit tests that cover this?



ql/src/java/org/apache/hadoop/hive/ql/exec/spark/session/SparkSessionImpl.java
Line 125 (original), 120 (patched)


we might need to re-think how we are synchronizing this method a bit. I 
think we want to support the use case where we call `close()` while `open()` is 
being run. This offers a way for the user to cancel a session while it is being 
opened, which can be useful if opening a session takes a long time, which can 
happen on a busy cluster where there aren't enough resources to open a session.

fixing that might be out of the scope of this JIRA, so I would recommend 
using a separate lock to guard against multiple users calling open on the same 
session.



ql/src/java/org/apache/hadoop/hive/ql/exec/spark/session/SparkSessionImpl.java
Line 352 (original)


why remove this?


- Sahil Takiar





[jira] [Created] (HIVE-20752) In case of LLAP start failure add info how to find YARN logs

2018-10-16 Thread Miklos Gergely (JIRA)
Miklos Gergely created HIVE-20752:
-

 Summary: In case of LLAP start failure add info how to find YARN 
logs
 Key: HIVE-20752
 URL: https://issues.apache.org/jira/browse/HIVE-20752
 Project: Hive
  Issue Type: Improvement
  Components: HiveServer2
Affects Versions: 4.0.0
Reporter: Miklos Gergely
Assignee: Miklos Gergely
 Fix For: 4.0.0






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-20751) Upgrade arrow version to 0.10.0

2018-10-16 Thread Shubham Chaurasia (JIRA)
Shubham Chaurasia created HIVE-20751:


 Summary: Upgrade arrow version to 0.10.0
 Key: HIVE-20751
 URL: https://issues.apache.org/jira/browse/HIVE-20751
 Project: Hive
  Issue Type: Bug
  Components: llap
Affects Versions: 3.1.0
Reporter: Shubham Chaurasia
Assignee: Shubham Chaurasia


Need to upgrade the arrow version, as spark is moving to arrow version 0.10.0 in 
its upcoming release 2.4.0



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] hive pull request #447: HIVE-20708: Load an external table as an external ta...

2018-10-16 Thread ashutosh-bapat
GitHub user ashutosh-bapat opened a pull request:

https://github.com/apache/hive/pull/447

HIVE-20708: Load an external table as an external table on target with the 
same location as on the source

Dump an external table as an external table.

When loading an external table, set the location of the target table to the 
same location as on the source,
but relative to the file system of the target location. In other words, the 
scheme and authority of the target
location are those of the target file system, while the path relative to the 
file system is the same as on the
source.
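
As a rough illustration of the rule just described (scheme and authority from the target file system, path from the source), here is a sketch using plain `java.net.URI`. It is not the code from the pull request; `ExternalLocationSketch`, `relocate()` and the host names are made up for the example.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical sketch; not the implementation in the pull request.
final class ExternalLocationSketch {
  private ExternalLocationSketch() {}

  /**
   * Builds the target table location: scheme and authority come from the
   * target file system, the path is taken unchanged from the source location.
   */
  static URI relocate(URI sourceLocation, URI targetFileSystem) {
    try {
      return new URI(
          targetFileSystem.getScheme(),
          targetFileSystem.getAuthority(),
          sourceLocation.getPath(),
          null,  // no query
          null); // no fragment
    } catch (URISyntaxException e) {
      throw new IllegalArgumentException("Cannot build target location", e);
    }
  }

  public static void main(String[] args) {
    URI source = URI.create("hdfs://nn1.example.com:8020/warehouse/ext/t1");
    URI targetFs = URI.create("hdfs://nn2.example.com:8020/");
    System.out.println(relocate(source, targetFs));
  }
}
```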

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ashutosh-bapat/hive hive20708

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/hive/pull/447.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #447


commit c076bbbd2b0fd1b193ac51a1595911a80324b923
Author: Ashutosh Bapat 
Date:   2018-10-15T05:09:05Z

HIVE-20708: Load an external table as an external table on target with the 
same location as
on the source

Dump an external table as an external table.

When loading an external table, set the location of the target table to the 
same location as on the source,
but relative to the file system of the target location. In other words, the 
scheme and authority of the target
location are those of the target file system, while the path relative to the 
file system is the same as on the
source.




---


Re: Review Request 68946: HIVE-20707: Automatic MSCK REPAIR for external tables

2018-10-16 Thread j . prasanth . j

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68946/
---

(Updated Oct. 16, 2018, 7:02 a.m.)


Review request for hive, Ashutosh Chauhan and Jason Dere.


Changes
---

Addressed review comment and updated druid test golden files.


Bugs: HIVE-20707
https://issues.apache.org/jira/browse/HIVE-20707


Repository: hive-git


Description
---

HIVE-20707: Automatic partition management


Diffs (updated)
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 92a1c31 
  hbase-handler/src/test/results/positive/external_table_ppd.q.out edcbe7e 
  hbase-handler/src/test/results/positive/hbase_binary_storage_queries.q.out 
1209c88 
  hbase-handler/src/test/results/positive/hbase_ddl.q.out ccd4148 
  hbase-handler/src/test/results/positive/hbase_queries.q.out eeb97f0 
  hbase-handler/src/test/results/positive/hbasestats.q.out 5a4aea9 
  
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCompactor.java
 a9d7468 
  ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java 807f159 
  ql/src/java/org/apache/hadoop/hive/ql/exec/ExplainTask.java 46bf088 
  ql/src/java/org/apache/hadoop/hive/ql/metadata/CheckResult.java 0b4240f 
  ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveMetaStoreChecker.java 
598bb2e 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java cff32d3 
  ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java 29f6ecf 
  ql/src/java/org/apache/hadoop/hive/ql/plan/CreateTableDesc.java 27f677e 
  
ql/src/test/org/apache/hadoop/hive/ql/exec/TestMsckCreatePartitionsInBatches.java
 ce2b186 
  
ql/src/test/org/apache/hadoop/hive/ql/exec/TestMsckDropPartitionsInBatches.java 
9480d38 
  ql/src/test/org/apache/hadoop/hive/ql/metadata/TestHiveMetaStoreChecker.java 
a2a0583 
  ql/src/test/queries/clientpositive/msck_repair_acid.q PRE-CREATION 
  ql/src/test/queries/clientpositive/partition_discovery.q PRE-CREATION 
  ql/src/test/results/clientpositive/create_like.q.out f4a5ed5 
  ql/src/test/results/clientpositive/create_like_view.q.out 870f280 
  ql/src/test/results/clientpositive/default_file_format.q.out 0adf5ae 
  ql/src/test/results/clientpositive/druid/druidkafkamini_basic.q.out 883994c 
  ql/src/test/results/clientpositive/druid/druidmini_expressions.q.out 9c9af44 
  ql/src/test/results/clientpositive/druid_topn.q.out 179902a 
  ql/src/test/results/clientpositive/explain_locks.q.out ed7f1e8 
  ql/src/test/results/clientpositive/llap/external_table_purge.q.out 24c778e 
  ql/src/test/results/clientpositive/llap/mm_exim.q.out ee6cf06 
  ql/src/test/results/clientpositive/llap/strict_managed_tables2.q.out f3b6152 
  ql/src/test/results/clientpositive/llap/whroot_external1.q.out cac158c 
  ql/src/test/results/clientpositive/msck_repair_acid.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/msck_repair_drop.q.out 2456734 
  ql/src/test/results/clientpositive/partition_discovery.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/rename_external_partition_location.q.out 
02cd814 
  ql/src/test/results/clientpositive/repl_2_exim_basic.q.out b2bcd51 
  ql/src/test/results/clientpositive/show_create_table_alter.q.out 2c75c36 
  ql/src/test/results/clientpositive/show_create_table_partitioned.q.out 
e554a18 
  ql/src/test/results/clientpositive/show_create_table_serde.q.out 8b95c9b 
  ql/src/test/results/clientpositive/spark/stats_noscan_2.q.out 2d713a8 
  ql/src/test/results/clientpositive/stats_noscan_2.q.out 182820f 
  ql/src/test/results/clientpositive/temp_table_display_colstats_tbllvl.q.out 
2a442b4 
  
standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/CheckResult.java
 PRE-CREATION 
  
standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/Warehouse.java
 294dfb7 
  
standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/api/MetastoreException.java
 PRE-CREATION 
  
standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/conf/MetastoreConf.java
 7b01678 
  
standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/utils/MetaStoreUtils.java
 16f4a50 
  
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStoreChecker.java
 PRE-CREATION 
  
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/Msck.java
 PRE-CREATION 
  
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/MsckInfo.java
 PRE-CREATION 
  
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/MsckPartitionExpressionProxy.java
 PRE-CREATION 
  
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
 66977d7 
  

Re: Review Request 68946: HIVE-20707: Automatic MSCK REPAIR for external tables

2018-10-16 Thread j . prasanth . j


> On Oct. 16, 2018, 2:28 a.m., Jason Dere wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java
> > Lines 4761 (patched)
> > 
> >
> > Should this be on by default? If there are a lot of external tables 
> > (especially on s3), the metastore could be spending a lot of time doing 
> > auto-discovery. This could also affect the running of other MetastoreTaskThreads.
> 
> Prasanth_J wrote:
> Yeah. I think this should be the default. This will remove the manual msck step 
> or periodic msck query (via cron job). This thread kicks in once every 5 
> minutes, but if the previous attempt is not done yet it will skip an attempt 
> so as to avoid queuing up background tasks. Also, it will use a high batch 
> size by default so that in most cases there should be 1 MS request per table. 
> The MSCK thread also runs in a thread pool. The only place this background thread 
> could be blocked is when an exclusive lock is obtained on a table (which gets 
> released after the txn timeout of 300s). 
> We could probably restrict this to EXTERNAL table types only (currently 
> it defaults to both EXTERNAL and MANAGED). Since managed is ACID by default, 
> we can avoid scanning managed tables. Changing the ACID table layout out of band 
> is shooting themselves in the foot anyway.

Actually, thinking about it again, I think I will leave the table types 
(metastore.partition.management.table.types) as is. They are required for 
partition retention: if a user adds a retention period to a managed table, they 
don't have to do anything else. 

Added a fix in the new patch that won't acquire the lock unless it is required 
(change detected, adding or dropping partitions).
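
The "skip an attempt if the previous one is not done yet" behaviour mentioned in the quoted comment can be sketched as below. This is not the metastore task from the patch; `SkipIfRunningTask` and `skippedCount()` are hypothetical names, and how the task is triggered (every 5 minutes in the quoted comment) is left to whatever scheduler calls `run()`.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch; not the actual partition management task.
final class SkipIfRunningTask implements Runnable {
  private final AtomicBoolean running = new AtomicBoolean(false);
  private final AtomicInteger skipped = new AtomicInteger();
  private final Runnable work;

  SkipIfRunningTask(Runnable work) {
    this.work = work;
  }

  @Override
  public void run() {
    // Only one attempt may be in flight; later triggers are dropped, not queued.
    if (!running.compareAndSet(false, true)) {
      skipped.incrementAndGet();
      return;
    }
    try {
      work.run();
    } finally {
      running.set(false);
    }
  }

  int skippedCount() {
    return skipped.get();
  }
}
```

Dropping the overlapping trigger instead of queuing it keeps slow scans (for example on object stores) from piling up behind each other.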


- Prasanth_J


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68946/#review209582
---


On Oct. 16, 2018, 12:21 a.m., Prasanth_J wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68946/
> ---
> 
> (Updated Oct. 16, 2018, 12:21 a.m.)
> 
> 
> Review request for hive, Ashutosh Chauhan and Jason Dere.
> 
> 
> Bugs: HIVE-20707
> https://issues.apache.org/jira/browse/HIVE-20707
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-20707: Automatic partition management
> 
> 
> Diffs
> -
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 92a1c31 
>   hbase-handler/src/test/results/positive/external_table_ppd.q.out edcbe7e 
>   hbase-handler/src/test/results/positive/hbase_binary_storage_queries.q.out 
> 1209c88 
>   hbase-handler/src/test/results/positive/hbase_ddl.q.out ccd4148 
>   hbase-handler/src/test/results/positive/hbase_queries.q.out eeb97f0 
>   hbase-handler/src/test/results/positive/hbasestats.q.out 5a4aea9 
>   
> itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCompactor.java
>  a9d7468 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java 807f159 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/ExplainTask.java 46bf088 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/CheckResult.java 0b4240f 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveMetaStoreChecker.java 
> 598bb2e 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java cff32d3 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java 
> 29f6ecf 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/CreateTableDesc.java 27f677e 
>   
> ql/src/test/org/apache/hadoop/hive/ql/exec/TestMsckCreatePartitionsInBatches.java
>  ce2b186 
>   
> ql/src/test/org/apache/hadoop/hive/ql/exec/TestMsckDropPartitionsInBatches.java
>  9480d38 
>   
> ql/src/test/org/apache/hadoop/hive/ql/metadata/TestHiveMetaStoreChecker.java 
> a2a0583 
>   ql/src/test/queries/clientpositive/msck_repair_acid.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/partition_discovery.q PRE-CREATION 
>   ql/src/test/results/clientpositive/create_like.q.out f4a5ed5 
>   ql/src/test/results/clientpositive/create_like_view.q.out 870f280 
>   ql/src/test/results/clientpositive/default_file_format.q.out 0adf5ae 
>   ql/src/test/results/clientpositive/druid_topn.q.out 179902a 
>   ql/src/test/results/clientpositive/explain_locks.q.out ed7f1e8 
>   ql/src/test/results/clientpositive/llap/external_table_purge.q.out 24c778e 
>   ql/src/test/results/clientpositive/llap/mm_exim.q.out ee6cf06 
>   ql/src/test/results/clientpositive/llap/strict_managed_tables2.q.out 
> f3b6152 
>   ql/src/test/results/clientpositive/llap/whroot_external1.q.out cac158c 
>   ql/src/test/results/clientpositive/msck_repair_acid.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/msck_repair_drop.q.out 2456734 
>   ql/src/test/results/clientpositive/partition_discovery.q.out PRE-CREATION