----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/65634/#review198507 -----------------------------------------------------------
Changes are pretty big, I didn't go through all of them 0 some comments below. standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/CacheUtils.java Lines 62 (patched) <https://reviews.apache.org/r/65634/#comment278651> Please add Javadoc comment, explaining what this function does. standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/CacheUtils.java Lines 63 (patched) <https://reviews.apache.org/r/65634/#comment278653> It would be cleaner and easier to read to rewrite this as ``` public static String buildKey(List<String> partVals) { if (partVals == null || partVals.isEmpty()) { return ""; } return String.join(delimit, partVals); } ``` standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/CacheUtils.java Lines 70 (patched) <https://reviews.apache.org/r/65634/#comment278652> 1) Please add Javadoc comment, explaining what this function does. 2) Is overloading really useful here? standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/CacheUtils.java Lines 71 (patched) <https://reviews.apache.org/r/65634/#comment278654> why not just `return buildKey(partVals) + delimit + colName` can colName be empty here or not? standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/CacheUtils.java Line 62 (original), 75 (patched) <https://reviews.apache.org/r/65634/#comment278655> This method is never used standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java Lines 128 (patched) <https://reviews.apache.org/r/65634/#comment278656> 1) Please add units in the name and use constant for the default value. 2) Please document what is `cacheRefreshPeriod`. standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java Line 223 (original), 137 (patched) <https://reviews.apache.org/r/65634/#comment278657> Why do you need an empty public constructor? standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java Line 226 (original), 140 (patched) <https://reviews.apache.org/r/65634/#comment278660> Please document this method - in particular how does it gets cache implementaiton from config. standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java Line 236 (original), 150 (patched) <https://reviews.apache.org/r/65634/#comment278659> Please used internal formatting for LOG: LOG.debug("CachedStore is not enabled; using {}", clazzName) standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java Line 237 (original), 151 (patched) <https://reviews.apache.org/r/65634/#comment278658> This return is not needed. standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java Line 259 (original), 172 (patched) <https://reviews.apache.org/r/65634/#comment278661> Can this be an else part of the prior if? standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java Line 262 (original), 175 (patched) <https://reviews.apache.org/r/65634/#comment278663> This doesn't look correct: 1) initBlackListWhiteList() will not update any existing whitelist or blacklist, only add one if it wasn't there. 2) initBlackListWhiteList() is calling `Collections.reverse(blacklistPatterns)` which doesn't make sense when configuration is set to a new value. standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java Lines 176 (patched) <https://reviews.apache.org/r/65634/#comment278662> Looks like every time someone calls setConf() a new thread is started - isn't it a threda leak? In general it isn't a good practice to add such side-effects for config changes like setConf - it is better to explicitly call a method which will do whatever is needed after conf update. standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java Lines 179 (patched) <https://reviews.apache.org/r/65634/#comment278664> Please document this method. standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java Lines 180 (patched) <https://reviews.apache.org/r/65634/#comment278665> This seem to repeat the code from setConf. is there any way to avoid code copy? The only difference seems to be the call to startCacheUpdateService() which shows that it isn't a good idea to have it there. standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java Line 266 (original), 202 (patched) <https://reviews.apache.org/r/65634/#comment278669> Please document this method. Among other things - can prewarm() be called multiple times? If not, should it be somehow enforced? standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java Line 270 (original), 206 (patched) <https://reviews.apache.org/r/65634/#comment278668> Please use built-in {} formatting for og. standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java Line 271 (original), 207 (patched) <https://reviews.apache.org/r/65634/#comment278667> Please remove <Database> and add explicit size: `List<Database> databases = new ArrayList<>(dbNames.size());` standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java Lines 209 (patched) <https://reviews.apache.org/r/65634/#comment278671> This line isn't adding anything, why not just for (String dbName : dbNames) { databases.add(rawStore.getDatabase(dbName)); } standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java Lines 213 (patched) <https://reviews.apache.org/r/65634/#comment278674> Why is this an info log and not debug? standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java Line 272 (original), 215 (patched) <https://reviews.apache.org/r/65634/#comment278675> Why not use something like the for loop before? `for (String dbName : dbNames) {...}` in particular, dbNames.size() doesn't change in the loop so it can be cached. standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java Line 285 (original), 217 (patched) <https://reviews.apache.org/r/65634/#comment278678> It is quite possible that there is another metastore instance running and someone removes this database, so this call will fail due to missing database. I think this code should continue prewarm for other databases in such cases. standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java Line 286 (original), 218 (patched) <https://reviews.apache.org/r/65634/#comment278676> tblNames may contain thousands of entries - it isn't a good idea to log it. standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java Line 287 (original), 219 (patched) <https://reviews.apache.org/r/65634/#comment278677> Again, why not just a simple iterator over tblNames? standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java Line 290 (original), 222 (patched) <https://reviews.apache.org/r/65634/#comment278679> Why not just use normal dbname.tblName notation? `"LOG.info("table {}.{} is not cached", dbName, tblName);"` Also, is there value in info logging this? There is already debug logging for shouldCacheTable() standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java Line 299 (original), 229 (patched) <https://reviews.apache.org/r/65634/#comment278680> Wouldn't getTable throw execption in this case? standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java Line 302 (original), 232 (patched) <https://reviews.apache.org/r/65634/#comment278681> Why set it to null and not to the actual value (which is done a few lines below)? standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java Line 303 (original), 233 (patched) <https://reviews.apache.org/r/65634/#comment278682> I think you can simplify the code and move this to where they are used if you modify if statement below. standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java Line 317 (original), 242 (patched) <https://reviews.apache.org/r/65634/#comment278683> Here it would be better to do if (!table.isSetPartitionKeys())) { return what should be returned in this case } standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java Line 321 (original), 243 (patched) <https://reviews.apache.org/r/65634/#comment278684> This may fail as well in which case you want to continue with other tables. standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java Lines 245 (patched) <https://reviews.apache.org/r/65634/#comment278685> This can fail. Also you can get a mismatch between partition names and actual partitions. I think it is better to just get all partitions and then get their names from partition list to avoid such inconsistency. standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java Line 322 (original), 252 (patched) <https://reviews.apache.org/r/65634/#comment278686> 1) please use isEmpty() instead. Also shouldn't you check partitions list rather then partition names? 2) It is more readable if you just return whatever is needed if this condition is false. standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java Line 324 (original), 255 (patched) <https://reviews.apache.org/r/65634/#comment278687> This can fail standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java Line 329 (original), 261 (patched) <https://reviews.apache.org/r/65634/#comment278689> Replace <String> with <> and add list size (partKeys.size()) standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java Line 330 (original), 262 (patched) <https://reviews.apache.org/r/65634/#comment278691> This can use Collections.nCopies(partKeys.size(), defaultPartitionValue) standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java Lines 274 (patched) <https://reviews.apache.org/r/65634/#comment278692> There may be thousands of tables per DB - is there a value in info logging this? standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java Line 360 (original), 296 (patched) <https://reviews.apache.org/r/65634/#comment278693> 1) Please reorder modifiers 2) Please add comment describing what this method does standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java Line 368 (original), 305 (patched) <https://reviews.apache.org/r/65634/#comment278694> ANy reason not to use lambda here? standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java Line 409 (original), 353 (patched) <https://reviews.apache.org/r/65634/#comment278695> this can be package-private standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java Line 425 (original), 369 (patched) <https://reviews.apache.org/r/65634/#comment278696> It owuld be easier to read if you reverse conditions: if (!shouldRunPrewarm) { update(); return; } standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java Line 426 (original), 370 (patched) <https://reviews.apache.org/r/65634/#comment278697> What protects isCachePrewarmed()? standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java Line 430 (original), 374 (patched) <https://reviews.apache.org/r/65634/#comment278699> Should the retry logic live here or in prewarm() itself? standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java Line 439 (original), 382 (patched) <https://reviews.apache.org/r/65634/#comment278698> Note that this is a regular static variable which is modified here. It should be either atomic or volatile. Also, this is static field and here non-static method modifies static field which is a bad practice. standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java Line 453 (original), 395 (patched) <https://reviews.apache.org/r/65634/#comment278701> how can it return null without thrwing an exception? standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java Line 460 (original), 401 (patched) <https://reviews.apache.org/r/65634/#comment278700> this can fail - do you want to continue with other databases? standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java Line 746 (original), 595 (patched) <https://reviews.apache.org/r/65634/#comment278702> Wouldn;t dropDatabase also throw exception if it fails? standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java Line 758 (original), 599 (patched) <https://reviews.apache.org/r/65634/#comment278703> This is rather useless since in case of failure tu'll throw MetaException anyway. standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java Line 764 (original), 605 (patched) <https://reviews.apache.org/r/65634/#comment278704> See comments fro dropDatabase. standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java Line 902 (original), 701 (patched) <https://reviews.apache.org/r/65634/#comment278705> tbl would never be null, there will be an exception if the call above fails. standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/SharedCache.java Lines 687 (patched) <https://reviews.apache.org/r/65634/#comment278672> This should be done outside try-block - Alexander Kolbasov On March 1, 2018, 11:09 a.m., Vaibhav Gumashta wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/65634/ > ----------------------------------------------------------- > > (Updated March 1, 2018, 11:09 a.m.) > > > Review request for hive, Daniel Dai and Thejas Nair. > > > Bugs: HIVE-18264 > https://issues.apache.org/jira/browse/HIVE-18264 > > > Repository: hive-git > > > Description > ------- > > https://issues.apache.org/jira/browse/HIVE-18264 > > > Diffs > ----- > > > itests/hcatalog-unit/src/test/java/org/apache/hive/hcatalog/listener/DummyRawStoreFailEvent.java > a3725c5395 > service/src/java/org/apache/hive/service/server/HiveServer2.java 86c9c2b33c > > standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java > ac71d0882f > > standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java > 7b44df4128 > > standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/RawStore.java > f500d63725 > > standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/CacheUtils.java > f0f650ddcf > > standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java > 0d132f2074 > > standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/SharedCache.java > 32ea17495f > > standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/utils/MetaStoreUtils.java > 50f873a013 > > standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/DummyRawStoreControlledCommit.java > 75ea8c4a77 > > standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/DummyRawStoreForJdoConnection.java > 207d842f94 > > standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/cache/TestCachedStore.java > ab6feb6f0b > standalone-metastore/src/test/resources/log4j2.properties 365687e1c9 > > > Diff: https://reviews.apache.org/r/65634/diff/4/ > > > Testing > ------- > > > Thanks, > > Vaibhav Gumashta > >