Re: Review Request 59205: HIVE-16579: CachedStore: improvements to partition col stats caching and cache column stats for unpartitioned table

2017-05-14 Thread Daniel Dai


> On May 14, 2017, 7:38 a.m., Daniel Dai wrote:
> > metastore/src/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java
> > Lines 528 (patched)
> > 
> >
> > If using dirty flag, databaseCacheLock can be removed, right?
> 
> Thejas Nair wrote:
> We still need some locking/synchronization to prevent things like this -
> 
> 
> 1. If (isDirtyFlag == false) {
> 2.  // if db gets updated in another thread while this thread is here, we 
> lose the update.
> 3.updateDBCache()
> 2. }
> 
> A read/write lock similar to what is used here appropriate for that. The 
> background thread is the one that holds the write lock.
> However, not sure if we need to busy-wait with trylock though. (I figure, 
> the intent is to give this one higher priority). The update DbCache part 
> should not take that long. (The busy-wait can be dealt with later also).

A simpler solution is move dirtyFlag to SharedCache. 
SharedCache.refreshDatabases will check the flag and 
SharedCache.addDatabaseToCache/alterDatabaseInCache/removeDatabaseFromCache 
will set the flag. There is no synchronization issue in SharedCache since every 
method is synchronized.


- Daniel


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/59205/#review174907
---


On May 14, 2017, 5:19 a.m., Vaibhav Gumashta wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/59205/
> ---
> 
> (Updated May 14, 2017, 5:19 a.m.)
> 
> 
> Review request for hive, Daniel Dai and Thejas Nair.
> 
> 
> Bugs: HIVE-16579
> https://issues.apache.org/jira/browse/HIVE-16579
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> https://issues.apache.org/jira/browse/HIVE-16579
> 
> 
> Diffs
> -
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java d6a80ae 
>   
> itests/hcatalog-unit/src/test/java/org/apache/hive/hcatalog/listener/DummyRawStoreFailEvent.java
>  91a3a38 
>   itests/util/src/main/java/org/apache/hadoop/hive/ql/QTestUtil.java b897ffa 
>   metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java 
> b96c27e 
>   metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreUtils.java 
> 870896c 
>   metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java 
> ed19f42 
>   metastore/src/java/org/apache/hadoop/hive/metastore/RawStore.java c1af690 
>   metastore/src/java/org/apache/hadoop/hive/metastore/cache/CacheUtils.java 
> 668499b 
>   metastore/src/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java 
> 5a187d8 
>   metastore/src/java/org/apache/hadoop/hive/metastore/cache/SharedCache.java 
> 7beee42 
>   metastore/src/java/org/apache/hadoop/hive/metastore/hbase/HBaseStore.java 
> 0c7d8bb 
>   
> metastore/src/test/org/apache/hadoop/hive/metastore/DummyRawStoreControlledCommit.java
>  f613c30 
>   
> metastore/src/test/org/apache/hadoop/hive/metastore/DummyRawStoreForJdoConnection.java
>  1720e37 
>   
> metastore/src/test/org/apache/hadoop/hive/metastore/cache/TestCachedStore.java
>  0ab20d6 
> 
> 
> Diff: https://reviews.apache.org/r/59205/diff/4/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Vaibhav Gumashta
> 
>



Re: Review Request 59205: HIVE-16579: CachedStore: improvements to partition col stats caching and cache column stats for unpartitioned table

2017-05-14 Thread Thejas Nair


> On May 14, 2017, 7:38 a.m., Daniel Dai wrote:
> > metastore/src/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java
> > Line 381 (original), 526 (patched)
> > 
> >
> > I think we only need a dirty flag, it will be reset by the next 
> > refresh, not end of individual call.

I agree, this activeDatabaseUpdateCalls params are not going to prevent the 
race-condition. Its better to use dirty flag instead.


- Thejas


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/59205/#review174907
---


On May 14, 2017, 5:19 a.m., Vaibhav Gumashta wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/59205/
> ---
> 
> (Updated May 14, 2017, 5:19 a.m.)
> 
> 
> Review request for hive, Daniel Dai and Thejas Nair.
> 
> 
> Bugs: HIVE-16579
> https://issues.apache.org/jira/browse/HIVE-16579
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> https://issues.apache.org/jira/browse/HIVE-16579
> 
> 
> Diffs
> -
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java d6a80ae 
>   
> itests/hcatalog-unit/src/test/java/org/apache/hive/hcatalog/listener/DummyRawStoreFailEvent.java
>  91a3a38 
>   itests/util/src/main/java/org/apache/hadoop/hive/ql/QTestUtil.java b897ffa 
>   metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java 
> b96c27e 
>   metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreUtils.java 
> 870896c 
>   metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java 
> ed19f42 
>   metastore/src/java/org/apache/hadoop/hive/metastore/RawStore.java c1af690 
>   metastore/src/java/org/apache/hadoop/hive/metastore/cache/CacheUtils.java 
> 668499b 
>   metastore/src/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java 
> 5a187d8 
>   metastore/src/java/org/apache/hadoop/hive/metastore/cache/SharedCache.java 
> 7beee42 
>   metastore/src/java/org/apache/hadoop/hive/metastore/hbase/HBaseStore.java 
> 0c7d8bb 
>   
> metastore/src/test/org/apache/hadoop/hive/metastore/DummyRawStoreControlledCommit.java
>  f613c30 
>   
> metastore/src/test/org/apache/hadoop/hive/metastore/DummyRawStoreForJdoConnection.java
>  1720e37 
>   
> metastore/src/test/org/apache/hadoop/hive/metastore/cache/TestCachedStore.java
>  0ab20d6 
> 
> 
> Diff: https://reviews.apache.org/r/59205/diff/4/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Vaibhav Gumashta
> 
>



Re: Review Request 59205: HIVE-16579: CachedStore: improvements to partition col stats caching and cache column stats for unpartitioned table

2017-05-14 Thread Thejas Nair


> On May 14, 2017, 7:38 a.m., Daniel Dai wrote:
> > metastore/src/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java
> > Lines 528 (patched)
> > 
> >
> > If using dirty flag, databaseCacheLock can be removed, right?

We still need some locking/synchronization to prevent things like this -


1. If (isDirtyFlag == false) {
2.  // if db gets updated in another thread while this thread is here, we lose 
the update.
3.updateDBCache()
2. }

A read/write lock similar to what is used here appropriate for that. The 
background thread is the one that holds the write lock.
However, not sure if we need to busy-wait with trylock though. (I figure, the 
intent is to give this one higher priority). The update DbCache part should not 
take that long. (The busy-wait can be dealt with later also).


- Thejas


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/59205/#review174907
---


On May 14, 2017, 5:19 a.m., Vaibhav Gumashta wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/59205/
> ---
> 
> (Updated May 14, 2017, 5:19 a.m.)
> 
> 
> Review request for hive, Daniel Dai and Thejas Nair.
> 
> 
> Bugs: HIVE-16579
> https://issues.apache.org/jira/browse/HIVE-16579
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> https://issues.apache.org/jira/browse/HIVE-16579
> 
> 
> Diffs
> -
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java d6a80ae 
>   
> itests/hcatalog-unit/src/test/java/org/apache/hive/hcatalog/listener/DummyRawStoreFailEvent.java
>  91a3a38 
>   itests/util/src/main/java/org/apache/hadoop/hive/ql/QTestUtil.java b897ffa 
>   metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java 
> b96c27e 
>   metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreUtils.java 
> 870896c 
>   metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java 
> ed19f42 
>   metastore/src/java/org/apache/hadoop/hive/metastore/RawStore.java c1af690 
>   metastore/src/java/org/apache/hadoop/hive/metastore/cache/CacheUtils.java 
> 668499b 
>   metastore/src/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java 
> 5a187d8 
>   metastore/src/java/org/apache/hadoop/hive/metastore/cache/SharedCache.java 
> 7beee42 
>   metastore/src/java/org/apache/hadoop/hive/metastore/hbase/HBaseStore.java 
> 0c7d8bb 
>   
> metastore/src/test/org/apache/hadoop/hive/metastore/DummyRawStoreControlledCommit.java
>  f613c30 
>   
> metastore/src/test/org/apache/hadoop/hive/metastore/DummyRawStoreForJdoConnection.java
>  1720e37 
>   
> metastore/src/test/org/apache/hadoop/hive/metastore/cache/TestCachedStore.java
>  0ab20d6 
> 
> 
> Diff: https://reviews.apache.org/r/59205/diff/4/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Vaibhav Gumashta
> 
>



Re: Review Request 59205: HIVE-16579: CachedStore: improvements to partition col stats caching and cache column stats for unpartitioned table

2017-05-14 Thread Thejas Nair

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/59205/#review174906
---




metastore/src/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java
Lines 341 (patched)


this call has already been made in updateTables. Should we just get it from 
the cache to reduce the rawStore call ? This can be done as part of follow up 
jira if necessary.



metastore/src/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java
Line 300 (original), 374 (patched)


This still doesn't prevent the race condition I mentioned (in offline 
discussion).
If another thread calls a method that updates a database object, when this 
thread is here (let say that calls starts and finishes when this thread is at 
this line), it would still go ahead and update the cache. That would result in 
local changes getting lost.

This is not a regression in this patch, we can address that in a follow up 
patch.
To address that, we need a dirty-cache check after acquring the lock.



metastore/src/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java
Line 1131 (original), 1433 (patched)


this would be just like getPartition() . We get it from cache, if its there 
in cache. Otherwise, assume its not there (yet). This can be done as part of 
follow up jira if necessary.


- Thejas Nair


On May 14, 2017, 5:19 a.m., Vaibhav Gumashta wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/59205/
> ---
> 
> (Updated May 14, 2017, 5:19 a.m.)
> 
> 
> Review request for hive, Daniel Dai and Thejas Nair.
> 
> 
> Bugs: HIVE-16579
> https://issues.apache.org/jira/browse/HIVE-16579
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> https://issues.apache.org/jira/browse/HIVE-16579
> 
> 
> Diffs
> -
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java d6a80ae 
>   
> itests/hcatalog-unit/src/test/java/org/apache/hive/hcatalog/listener/DummyRawStoreFailEvent.java
>  91a3a38 
>   itests/util/src/main/java/org/apache/hadoop/hive/ql/QTestUtil.java b897ffa 
>   metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java 
> b96c27e 
>   metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreUtils.java 
> 870896c 
>   metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java 
> ed19f42 
>   metastore/src/java/org/apache/hadoop/hive/metastore/RawStore.java c1af690 
>   metastore/src/java/org/apache/hadoop/hive/metastore/cache/CacheUtils.java 
> 668499b 
>   metastore/src/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java 
> 5a187d8 
>   metastore/src/java/org/apache/hadoop/hive/metastore/cache/SharedCache.java 
> 7beee42 
>   metastore/src/java/org/apache/hadoop/hive/metastore/hbase/HBaseStore.java 
> 0c7d8bb 
>   
> metastore/src/test/org/apache/hadoop/hive/metastore/DummyRawStoreControlledCommit.java
>  f613c30 
>   
> metastore/src/test/org/apache/hadoop/hive/metastore/DummyRawStoreForJdoConnection.java
>  1720e37 
>   
> metastore/src/test/org/apache/hadoop/hive/metastore/cache/TestCachedStore.java
>  0ab20d6 
> 
> 
> Diff: https://reviews.apache.org/r/59205/diff/4/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Vaibhav Gumashta
> 
>



Re: Review Request 59205: HIVE-16579: CachedStore: improvements to partition col stats caching and cache column stats for unpartitioned table

2017-05-14 Thread Daniel Dai

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/59205/#review174907
---




metastore/src/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java
Line 381 (original), 526 (patched)


I think we only need a dirty flag, it will be reset by the next refresh, 
not end of individual call.



metastore/src/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java
Lines 528 (patched)


If using dirty flag, databaseCacheLock can be removed, right?


- Daniel Dai


On May 14, 2017, 5:19 a.m., Vaibhav Gumashta wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/59205/
> ---
> 
> (Updated May 14, 2017, 5:19 a.m.)
> 
> 
> Review request for hive, Daniel Dai and Thejas Nair.
> 
> 
> Bugs: HIVE-16579
> https://issues.apache.org/jira/browse/HIVE-16579
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> https://issues.apache.org/jira/browse/HIVE-16579
> 
> 
> Diffs
> -
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java d6a80ae 
>   
> itests/hcatalog-unit/src/test/java/org/apache/hive/hcatalog/listener/DummyRawStoreFailEvent.java
>  91a3a38 
>   itests/util/src/main/java/org/apache/hadoop/hive/ql/QTestUtil.java b897ffa 
>   metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java 
> b96c27e 
>   metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreUtils.java 
> 870896c 
>   metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java 
> ed19f42 
>   metastore/src/java/org/apache/hadoop/hive/metastore/RawStore.java c1af690 
>   metastore/src/java/org/apache/hadoop/hive/metastore/cache/CacheUtils.java 
> 668499b 
>   metastore/src/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java 
> 5a187d8 
>   metastore/src/java/org/apache/hadoop/hive/metastore/cache/SharedCache.java 
> 7beee42 
>   metastore/src/java/org/apache/hadoop/hive/metastore/hbase/HBaseStore.java 
> 0c7d8bb 
>   
> metastore/src/test/org/apache/hadoop/hive/metastore/DummyRawStoreControlledCommit.java
>  f613c30 
>   
> metastore/src/test/org/apache/hadoop/hive/metastore/DummyRawStoreForJdoConnection.java
>  1720e37 
>   
> metastore/src/test/org/apache/hadoop/hive/metastore/cache/TestCachedStore.java
>  0ab20d6 
> 
> 
> Diff: https://reviews.apache.org/r/59205/diff/4/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Vaibhav Gumashta
> 
>



Re: Review Request 59205: HIVE-16579: CachedStore: improvements to partition col stats caching and cache column stats for unpartitioned table

2017-05-14 Thread Thejas Nair

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/59205/#review174905
---




metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
Lines 1387 (patched)


There is no order-by in the SQL query above. Is it guaranteed that the list 
is ordered ?
Should we add a "order by" to above SQL query ?


- Thejas Nair


On May 14, 2017, 5:19 a.m., Vaibhav Gumashta wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/59205/
> ---
> 
> (Updated May 14, 2017, 5:19 a.m.)
> 
> 
> Review request for hive, Daniel Dai and Thejas Nair.
> 
> 
> Bugs: HIVE-16579
> https://issues.apache.org/jira/browse/HIVE-16579
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> https://issues.apache.org/jira/browse/HIVE-16579
> 
> 
> Diffs
> -
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java d6a80ae 
>   
> itests/hcatalog-unit/src/test/java/org/apache/hive/hcatalog/listener/DummyRawStoreFailEvent.java
>  91a3a38 
>   itests/util/src/main/java/org/apache/hadoop/hive/ql/QTestUtil.java b897ffa 
>   metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java 
> b96c27e 
>   metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreUtils.java 
> 870896c 
>   metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java 
> ed19f42 
>   metastore/src/java/org/apache/hadoop/hive/metastore/RawStore.java c1af690 
>   metastore/src/java/org/apache/hadoop/hive/metastore/cache/CacheUtils.java 
> 668499b 
>   metastore/src/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java 
> 5a187d8 
>   metastore/src/java/org/apache/hadoop/hive/metastore/cache/SharedCache.java 
> 7beee42 
>   metastore/src/java/org/apache/hadoop/hive/metastore/hbase/HBaseStore.java 
> 0c7d8bb 
>   
> metastore/src/test/org/apache/hadoop/hive/metastore/DummyRawStoreControlledCommit.java
>  f613c30 
>   
> metastore/src/test/org/apache/hadoop/hive/metastore/DummyRawStoreForJdoConnection.java
>  1720e37 
>   
> metastore/src/test/org/apache/hadoop/hive/metastore/cache/TestCachedStore.java
>  0ab20d6 
> 
> 
> Diff: https://reviews.apache.org/r/59205/diff/4/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Vaibhav Gumashta
> 
>