kishendas commented on a change in pull request #1186:
URL: https://github.com/apache/hive/pull/1186#discussion_r447465847



##########
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java
##########
@@ -2264,6 +2255,8 @@ private MergedColumnStatsForPartitions mergeColStatsForPartitions(String catName
       if (colStatsMap.size() < 1) {
         LOG.debug("No stats data found for: dbName={} tblName= {} partNames= {} colNames= ", dbName, tblName, partNames,
             colNames);
+        // TODO: If we don't find any stats then most likely we should return null. Returning an empty object will not
+        // trigger the lookup in the raw store and we will end up with missing stats.

Review comment:
       Please create a JIRA for this TODO and add a reference in the comment. 
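
       For context, here is a minimal sketch of the behaviour the TODO describes, assuming the caller treats a null result as "not cached" and only then falls back to the raw store. The class, the stub, and the method names below are hypothetical and are not the actual CachedStore code.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class StatsFallbackSketch {

    /** Hypothetical stand-in for the raw store; counts how often it is queried. */
    static class RawStoreStub {
        int calls = 0;

        List<String> getColumnStats(String tableName) {
            calls++;
            return Collections.singletonList("raw-store-stats:" + tableName);
        }
    }

    /** Hypothetical cache lookup. On a miss it returns either null or an empty list. */
    static List<String> lookupInCache(Map<String, List<String>> cache, String tableName,
                                      boolean returnNullOnMiss) {
        List<String> cached = cache.get(tableName);
        if (cached != null) {
            return cached;
        }
        return returnNullOnMiss ? null : Collections.<String>emptyList();
    }

    /** Assumed caller contract: null means "not cached", so fall back to the raw store. */
    static List<String> getColumnStats(Map<String, List<String>> cache, RawStoreStub rawStore,
                                       String tableName, boolean returnNullOnMiss) {
        List<String> cached = lookupInCache(cache, tableName, returnNullOnMiss);
        if (cached == null) {
            return rawStore.getColumnStats(tableName);
        }
        // An empty list is returned as-is, so the raw store is never consulted.
        return cached;
    }

    public static void main(String[] args) {
        Map<String, List<String>> emptyCache = new HashMap<>();
        RawStoreStub rawStore = new RawStoreStub();
        // Empty object on a miss: no fallback happens and the stats stay missing.
        System.out.println(getColumnStats(emptyCache, rawStore, "tbl", false));
        // Null on a miss: the raw store is queried.
        System.out.println(getColumnStats(emptyCache, rawStore, "tbl", true));
    }
}
```

       Under that assumption, only the null-returning variant ever reaches the raw store, which is the gap the JIRA should track.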

##########
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java
##########
@@ -851,8 +851,6 @@ private void updateTableColStats(RawStore rawStore, String catName, String dbNam
             sharedCache.refreshTableColStatsInCache(StringUtils.normalizeIdentifier(catName),
                 StringUtils.normalizeIdentifier(dbName), StringUtils.normalizeIdentifier(tblName),
                 tableColStats.getStatsObj());
-            // Update the table to get consistent stats state.
-            sharedCache.alterTableInCache(catName, dbName, tblName, table);

Review comment:
       Sorry, I am a bit confused by this diff. The original issue seems to be:
"Metastore's update service wrongly strips partition column stats from the
cache in an attempt to update them." How does not updating the stats in the
cache fix this issue? Wouldn't the right fix be to ensure that
sharedCache.alterTableInCache and sharedCache.alterPartitionInCache do the
right thing and do not incorrectly remove partition column stats?
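
       To illustrate the alternative I have in mind, here is a rough sketch in which the alter operation carries the previously cached column stats over instead of replacing the entry wholesale. This is a hypothetical toy cache, not the real SharedCache; I am assuming the stats are lost because the cached entry is replaced, which may not match the actual implementation.

```java
import java.util.HashMap;
import java.util.Map;

public class AlterKeepsStatsSketch {

    /** Hypothetical cache entry: a table definition plus its cached column stats. */
    static class CachedTable {
        final String definition;
        final Map<String, String> colStats;

        CachedTable(String definition, Map<String, String> colStats) {
            this.definition = definition;
            this.colStats = colStats;
        }
    }

    /** Hypothetical cache; not the real SharedCache. */
    static class TableCache {
        final Map<String, CachedTable> tables = new HashMap<>();

        /** Replaces the entry wholesale, which drops any stats cached earlier. */
        void alterDroppingStats(String name, String newDefinition) {
            tables.put(name, new CachedTable(newDefinition, new HashMap<String, String>()));
        }

        /** Replaces the definition but carries the previously cached stats over. */
        void alterKeepingStats(String name, String newDefinition) {
            CachedTable old = tables.get(name);
            Map<String, String> stats =
                (old == null) ? new HashMap<String, String>() : old.colStats;
            tables.put(name, new CachedTable(newDefinition, stats));
        }
    }

    public static void main(String[] args) {
        TableCache cache = new TableCache();
        Map<String, String> stats = new HashMap<>();
        stats.put("col1", "ndv=10");
        cache.tables.put("tbl", new CachedTable("v1", stats));

        cache.alterKeepingStats("tbl", "v2");
        System.out.println(cache.tables.get("tbl").colStats.size()); // stats survive

        cache.alterDroppingStats("tbl", "v3");
        System.out.println(cache.tables.get("tbl").colStats.size()); // stats are gone
    }
}
```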

##########
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java
##########
@@ -900,13 +898,6 @@ private void updateTablePartitionColStats(RawStore rawStore, String catName, Str
               rawStore.getPartitionColumnStatistics(catName, dbName, tblName, partNames, colNames, CacheUtils.HIVE_ENGINE);
           Deadline.stopTimer();
           sharedCache.refreshPartitionColStatsInCache(catName, dbName, tblName, partitionColStats);
-          Deadline.startTimer("getPartitionsByNames");
-          List<Partition> parts = rawStore.getPartitionsByNames(catName, dbName, tblName, partNames);
-          Deadline.stopTimer();
-          // Also save partitions for consistency as they have the stats state.
-          for (Partition part : parts) {
-            sharedCache.alterPartitionInCache(catName, dbName, tblName, part.getValues(), part);

Review comment:
       Same concern here as in the previous comment. How do we fix this issue
by not updating the cache at all?






