Alex Behm has posted comments on this change. Change subject: IMPALA-5259: Add REFRESH FUNCTIONS <db> statement ......................................................................
Patch Set 1: (12 comments) http://gerrit.cloudera.org:8080/#/c/6878/1/fe/src/main/java/org/apache/impala/analysis/ResetMetadataStmt.java File fe/src/main/java/org/apache/impala/analysis/ResetMetadataStmt.java: Line 30: * Representation of a REFRESH/INVALIDATE METADATA statement. Can you spell out the different variants to make this clearer? Line 45: public ResetMetadataStmt(TableName name, boolean isRefresh, This c'tor is starting to look scary. Please make it private and add static create() functions for the different use cases, e.g.: createInvalidateStmt(); createRefreshStmt(); createRefreshFunctionsStmt(); http://gerrit.cloudera.org:8080/#/c/6878/1/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java: Line 616: List<org.apache.hadoop.hive.metastore.api.Function> javaFns = Let's move as much of the heavy work as possible outside of the global lock, in particular, those parts that talk to HMS. We only need to acquire the catalogLock_ when once we access/update the DB in our cache. Line 624: Db db = dbCache_.get().get(dbName); need to handle db == null Line 627: Db tmpDb = new Db(dbName, this, msDb); I understand this is convenient, but it would be preferable to not RPC to HMS to get the msDb. That's a rather high price which adds extra potential failure weirdness. Can't you just construct a new msDb from the info we have in db? Line 636: for (String fn_name: tmpDb.getAllFunctions().keySet()) { easier to read if you iterate over the entrySet() Line 638: boolean result = db.addFunction(fn); We need to handle the !result case. We could have a totally different function with the same name, and we need to update that function in db. The easiest way is probably to first db.removeFunction() and then re-add if something was removed. Line 648: for (String fn_name: db.getAllFunctions().keySet()) { easier to read if you iterate over the entrySet() Line 653: toRemove.add(fn); why not remove immediately here? http://gerrit.cloudera.org:8080/#/c/6878/1/fe/src/main/java/org/apache/impala/catalog/Function.java File fe/src/main/java/org/apache/impala/catalog/Function.java: Line 318: result.setCatalog_version(getCatalogVersion()); using catalogVersion_ is a little clearer to me http://gerrit.cloudera.org:8080/#/c/6878/1/fe/src/main/java/org/apache/impala/service/Frontend.java File fe/src/main/java/org/apache/impala/service/Frontend.java: Line 225: if (req.is_delta) { single line and remove the else block http://gerrit.cloudera.org:8080/#/c/6878/1/tests/custom_cluster/test_permanent_udfs.py File tests/custom_cluster/test_permanent_udfs.py: Line 220: REFRESH_COMMANDS = ["INVALIDATE METADATA", Let's think a little about how to test these scenarios: 1. Hive dropped + re-added a different function with the same name. See that REFRESH returns the new function. 2. Test that REFRESH works as expected for native functions. -- To view, visit http://gerrit.cloudera.org:8080/6878 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I3625c88bb51cca833f3293c224d3f0feb00e6e0b Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Taras Bobrovytsky <[email protected]> Gerrit-Reviewer: Alex Behm <[email protected]> Gerrit-HasComments: Yes
