sankarh commented on a change in pull request #2128:
URL: https://github.com/apache/hive/pull/2128#discussion_r636821684



##########
File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java
##########
@@ -918,6 +938,29 @@ private void updateAllTableConstraints(RawStore rawStore, String catName, String
      }
    }
 
+    private void updateValidWriteIdList(TxnStore txnStore, String catName, String dbName, String tblName) {
+      catName = StringUtils.normalizeIdentifier(catName);
+      dbName = StringUtils.normalizeIdentifier(dbName);
+      tblName = StringUtils.normalizeIdentifier(tblName);
+      LOG.debug("CachedStore: updating cached table validWriteIdlist for catalog: {}, database: {}, table: {}", catName,
+          dbName, tblName);
+      ValidWriteIdList validWriteIdList = null;
+      try {
+        Deadline.startTimer("getValidWriteIds");
+        validWriteIdList = TxnCommonUtils.createValidReaderWriteIdList(txnStore.getValidWriteIds(

Review comment:
      A curious question: I think we recently added writeId allocation logic for DDL operations. Do we roll back the changes made in the metastore if the DDL op fails?

##########
File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java
##########
@@ -918,6 +938,29 @@ private void updateAllTableConstraints(RawStore rawStore, String catName, String
      }
    }
 
+    private void updateValidWriteIdList(TxnStore txnStore, String catName, String dbName, String tblName) {
+      catName = StringUtils.normalizeIdentifier(catName);
+      dbName = StringUtils.normalizeIdentifier(dbName);
+      tblName = StringUtils.normalizeIdentifier(tblName);
+      LOG.debug("CachedStore: updating cached table validWriteIdlist for catalog: {}, database: {}, table: {}", catName,
+          dbName, tblName);
+      ValidWriteIdList validWriteIdList = null;
+      try {
+        Deadline.startTimer("getValidWriteIds");
+        validWriteIdList = TxnCommonUtils.createValidReaderWriteIdList(txnStore.getValidWriteIds(

Review comment:
      Probably, getting the snapshot before fetching the metadata objects might help handle some cases:
  1. Take the snapshot from the current thread and cache it -> ValidWriteIdList=<tbl>:HWM=10:
  2. Fetch the table object --> writeId=10
  3. Another thread writes the table object --> writeId=11
  4. In this case, the new snapshot would be ValidWriteIdList=<tbl>:HWM=11:, which doesn't match the cached snapshot, so the stale entry is detected

##########
File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java
##########
@@ -1266,7 +1309,7 @@ public Table getTable(String catName, String dbName, String tblName, String vali
    catName = normalizeIdentifier(catName);

Review comment:
      @Override
    public Table getTable(String catName, String dbName, String tblName, String validWriteIds) throws MetaException {
      return getTable(catName, dbName, tblName, null, -1);
    }
  
  Here, validWriteIds is not passed down.
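The dropped argument is easy to model in isolation. A minimal sketch (plain Java; `StoreSketch` and its methods are hypothetical stand-ins for CachedStore's overload chain, not Hive APIs) showing why forwarding the caller's snapshot matters:

```java
// Hypothetical stand-in for CachedStore's getTable overload chain.
class StoreSketch {
    // Buggy delegation: the caller's snapshot is silently replaced with null,
    // so no staleness check can ever run on this path.
    String getTableBuggy(String tblName, String validWriteIds) {
        return getTableImpl(tblName, null, -1);
    }

    // Fixed delegation: the snapshot is forwarded to the real lookup.
    String getTableFixed(String tblName, String validWriteIds) {
        return getTableImpl(tblName, validWriteIds, -1);
    }

    // Stand-in for the multi-argument lookup; just echoes what it received.
    String getTableImpl(String tblName, String validWriteIds, long tableId) {
        return tblName + "@" + validWriteIds;
    }
}
```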

##########
File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java
##########
@@ -918,6 +938,29 @@ private void updateAllTableConstraints(RawStore rawStore, String catName, String
      }
    }
 
+    private void updateValidWriteIdList(TxnStore txnStore, String catName, String dbName, String tblName) {
+      catName = StringUtils.normalizeIdentifier(catName);
+      dbName = StringUtils.normalizeIdentifier(dbName);
+      tblName = StringUtils.normalizeIdentifier(tblName);
+      LOG.debug("CachedStore: updating cached table validWriteIdlist for catalog: {}, database: {}, table: {}", catName,
+          dbName, tblName);
+      ValidWriteIdList validWriteIdList = null;
+      try {
+        Deadline.startTimer("getValidWriteIds");
+        validWriteIdList = TxnCommonUtils.createValidReaderWriteIdList(txnStore.getValidWriteIds(
+            new GetValidWriteIdsRequest(Collections.singletonList(TableName.getDbTable(dbName, tblName))))
+            .getTblValidWriteIds().get(0));
+        Deadline.stopTimer();
+      } catch (Exception e) {
+        LOG.info("Updating CachedStore: unable to update table validWriteIdList of catalog: " + catName + ", database: "
+            + dbName + ", table: " + tblName, e);
+      }
+      if (validWriteIdList != null) {
+        sharedCache.refreshValidWriteIdListInCache(catName, dbName, tblName, validWriteIdList);
+        LOG.debug("CachedStore: updated cached table validWriteIdList for catalog: {}, database: {}, table: {}",

Review comment:
       Also, log the ValidWriteIdList.

##########
File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/cache/SharedCache.java
##########
@@ -2476,6 +2498,25 @@ public boolean isTableConstraintValid(String catName, String dbName, String tblN
    return isValid;
  }
 
+  public boolean isTableCacheStale(String catName, String dbName, String tblName, String validWriteIdList) {
+    boolean isStale = true;
+
+    if (StringUtils.isEmpty(validWriteIdList))
+      return isStale;

Review comment:
      It sounds like the cache is always marked invalid if we cannot get the ValidWriteIdList. Can we skip populating the table metadata into the cache if fetching the ValidWriteIdList fails for any reason?
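The conservative behavior being questioned can be sketched as a small helper. This is a hedged toy model (hypothetical `StalenessSketch`; the real method is `SharedCache.isTableCacheStale`, and real staleness also involves open/aborted write ids, not just string equality): without a snapshot on either side, the check can only answer "stale", so populating the cache without one buys nothing.

```java
// Hedged sketch: staleness approximated as a comparison of
// ValidWriteIdList snapshot strings.
final class StalenessSketch {
    static boolean isTableCacheStale(String cachedWriteIds, String requestWriteIds) {
        // No snapshot from the caller: nothing to compare against, assume stale.
        if (requestWriteIds == null || requestWriteIds.isEmpty()) {
            return true;
        }
        // Nothing cached with a snapshot: a refresh is required anyway.
        if (cachedWriteIds == null || cachedWriteIds.isEmpty()) {
            return true;
        }
        return !cachedWriteIds.equals(requestWriteIds);
    }
}
```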

##########
File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java
##########
@@ -918,6 +938,29 @@ private void updateAllTableConstraints(RawStore rawStore, String catName, String
      }
    }
 
+    private void updateValidWriteIdList(TxnStore txnStore, String catName, String dbName, String tblName) {
+      catName = StringUtils.normalizeIdentifier(catName);
+      dbName = StringUtils.normalizeIdentifier(dbName);
+      tblName = StringUtils.normalizeIdentifier(tblName);
+      LOG.debug("CachedStore: updating cached table validWriteIdlist for catalog: {}, database: {}, table: {}", catName,
+          dbName, tblName);
+      ValidWriteIdList validWriteIdList = null;
+      try {
+        Deadline.startTimer("getValidWriteIds");
+        validWriteIdList = TxnCommonUtils.createValidReaderWriteIdList(txnStore.getValidWriteIds(

Review comment:
      The ValidWriteIdList obtained here doesn't guarantee the validity of the cached metadata objects.
  Consider the sequence below:
  1. Fetch the table object --> writeId=10
  2. Another thread writes the table object --> writeId=11
  3. Take the snapshot from the current thread and cache it -> ValidWriteIdList=<tbl>:HWM=11:
  4. Now the table object is stale, but we still mark it as valid for any new txn.
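The race above, together with the snapshot-first ordering suggested in an earlier comment, can be modeled with a toy sketch (plain Java; a `Runnable` stands in for the interleaved writer thread and write ids are plain high-water marks — every name here is illustrative, not a Hive API):

```java
// Toy model of the fetch/snapshot ordering race. "Valid" is approximated as
// "the fetched object's writeId is covered by the snapshot's HWM".
final class WriteIdRaceSketch {
    long hwm = 10;          // simulated txn-store high-water mark
    long tableWriteId = 10; // writeId stamped on the stored table object

    void concurrentWrite() { tableWriteId = 11; hwm = 11; }

    // Buggy ordering: fetch the table first, snapshot after. A write landing
    // in between makes the stale object look valid under the newer HWM.
    boolean fetchThenSnapshotMarksValid(Runnable interleaved) {
        long fetched = tableWriteId; // fetch       --> writeId=10
        interleaved.run();           // other write --> writeId=11
        long snapshot = hwm;         // snapshot    --> HWM=11
        return fetched <= snapshot;  // 10 <= 11: stale object marked valid
    }

    // Snapshot-first ordering: a write landing after the snapshot pushes the
    // fetched writeId past the snapshot's HWM, so the mismatch is detectable
    // and the entry is not cached as valid.
    boolean snapshotThenFetchMarksValid(Runnable interleaved) {
        long snapshot = hwm;         // snapshot    --> HWM=10
        interleaved.run();           // other write --> writeId=11
        long fetched = tableWriteId; // fetch       --> writeId=11
        return fetched <= snapshot;  // 11 <= 10: mismatch detected
    }
}
```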

##########
File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java
##########
@@ -1266,7 +1309,7 @@ public Table getTable(String catName, String dbName, String tblName, String vali
    catName = normalizeIdentifier(catName);
    dbName = StringUtils.normalizeIdentifier(dbName);
    tblName = StringUtils.normalizeIdentifier(tblName);
-    if (!shouldCacheTable(catName, dbName, tblName) || (canUseEvents && rawStore.isActiveTransaction())) {
+    if (shouldGetTableFromRawStore(catName, dbName, tblName, validWriteIds)) {

Review comment:
      We should use "shouldGetTableFromRawStore" for getPartitions and column stats as well.
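A rough sketch of sharing one guard across the getters (hypothetical `GuardSketch` with a plain map; the real guard is `shouldGetTableFromRawStore` in CachedStore, and the real getters would consult SharedCache and the raw store):

```java
import java.util.HashMap;
import java.util.Map;

// Hedged sketch: one staleness-aware guard reused by every metadata getter,
// so partitions and column stats fall back to the raw store under the same
// conditions as getTable.
final class GuardSketch {
    private final Map<String, String> cachedSnapshots = new HashMap<>();

    void cacheTable(String tbl, String validWriteIds) {
        cachedSnapshots.put(tbl, validWriteIds);
    }

    boolean shouldGetFromRawStore(String tbl, String validWriteIds) {
        String cached = cachedSnapshots.get(tbl);
        // Uncached table or mismatching snapshot: bypass the cache.
        return cached == null || !cached.equals(validWriteIds);
    }

    String getPartitions(String tbl, String validWriteIds) {
        return shouldGetFromRawStore(tbl, validWriteIds) ? "rawstore" : "cache";
    }

    String getColumnStats(String tbl, String validWriteIds) {
        return shouldGetFromRawStore(tbl, validWriteIds) ? "rawstore" : "cache";
    }
}
```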

##########
File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java
##########
@@ -573,6 +577,18 @@ static void prewarm(RawStore rawStore) {
              Deadline.stopTimer();
              cacheObjects.setTableConstraints(tableConstraints);
 
+              try {
+                Deadline.startTimer("getValidWriteIds");
+                ValidWriteIdList validWriteIdList = TxnCommonUtils.createValidReaderWriteIdList(txnStore
+                    .getValidWriteIds(
+                        new GetValidWriteIdsRequest(Collections.singletonList(TableName.getDbTable(dbName, tblName))))
+                    .getTblValidWriteIds().get(0));
+                Deadline.stopTimer();
+                cacheObjects.setValidWriteIdList(validWriteIdList);
+              } catch (Exception e) {
+                LOG.debug(ExceptionUtils.getStackTrace(e));

Review comment:
      This should be at least an info log.

##########
File path: standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/cache/TestCachedStore.java
##########
@@ -38,10 +38,11 @@
 import org.apache.hadoop.hive.metastore.Warehouse;
 import org.apache.hadoop.hive.metastore.annotation.MetastoreCheckinTest;
 import org.apache.hadoop.hive.metastore.api.*;
-import org.apache.hadoop.hive.metastore.api.utils.DecimalUtils;
 import org.apache.hadoop.hive.metastore.client.builder.DatabaseBuilder;
 import org.apache.hadoop.hive.metastore.columnstats.cache.LongColumnStatsDataInspector;
 import org.apache.hadoop.hive.metastore.conf.MetastoreConf;
+import org.apache.hadoop.hive.metastore.txn.TxnStore;

Review comment:
      Add tests to check that invalidation happens only for stale data. Also, after a refresh the entry should be valid again.
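The two behaviors being asked for can be pinned down against a toy cache first (hypothetical `SnapshotCacheSketch`; the real tests would drive CachedStore/SharedCache through a TxnStore):

```java
// Toy cache keyed by a single ValidWriteIdList string, capturing exactly the
// two properties the review asks tests for: (1) only a moved snapshot
// invalidates the entry, (2) a refresh makes it valid again.
final class SnapshotCacheSketch {
    private String cachedWriteIds;

    void populate(String writeIds) { cachedWriteIds = writeIds; }

    boolean isStale(String requestWriteIds) {
        return cachedWriteIds == null || !cachedWriteIds.equals(requestWriteIds);
    }

    void refresh(String writeIds) { cachedWriteIds = writeIds; }
}
```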




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


