Bharath Vissapragada has posted comments on this change. ( http://gerrit.cloudera.org:8080/12118 )
Change subject: IMPALA-7970 : Add support for automatic invalidates by polling metastore events ...................................................................... Patch Set 8: (46 comments) Posting my initial set of comments. Haven't looked at the tests yet. Will take another pass after code refactoring. (Also, please update your profile name in gerrit settings). http://gerrit.cloudera.org:8080/#/c/12118/8//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/12118/8//COMMIT_MSG@7 PS8, Line 7: IMPALA-7970 : Add support for automatic invalidates by polling metastore events nit: wrap lines to less than 72 chars. http://gerrit.cloudera.org:8080/#/c/12118/8//COMMIT_MSG@21 PS8, Line 21: Is the self-event detection coming in a separate patch? http://gerrit.cloudera.org:8080/#/c/12118/8/be/src/common/global-flags.cc File be/src/common/global-flags.cc: http://gerrit.cloudera.org:8080/#/c/12118/8/be/src/common/global-flags.cc@246 PS8, Line 246: invalidate or refresh This only invalidates right? (refresh has a different meaning in Impala's context) http://gerrit.cloudera.org:8080/#/c/12118/8/be/src/common/global-flags.cc@247 PS8, Line 247: cached metadata about tables nit: rephrase to ...cached table metadata based on... http://gerrit.cloudera.org:8080/#/c/12118/8/be/src/common/global-flags.cc@251 PS8, Line 251: ..has.. http://gerrit.cloudera.org:8080/#/c/12118/8/be/src/common/global-flags.cc@253 PS8, Line 253: "Additionally, " Since this is a user-facing message, we should probably clarify that this is for "external" events. http://gerrit.cloudera.org:8080/#/c/12118/8/be/src/common/global-flags.cc@258 PS8, Line 258: To enable this feature, a " : "non-zero " : "value of the flag must be applied to catalogd and impalad. Looks like this is already mentioned above. Also, is it needed on the coordinators? Doesn't look like it. http://gerrit.cloudera.org:8080/#/c/12118/8/be/src/common/global-flags.cc@260 PS8, Line 260: "value of the flag must be applied to catalogd and impalad."); nit: format to fewer lines like others. http://gerrit.cloudera.org:8080/#/c/12118/8/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java: http://gerrit.cloudera.org:8080/#/c/12118/8/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@229 PS8, Line 229: private MetastoreEventsProcessor metastoreEventProcessor; Add a oneliner comment. Also, we use an "_" prefix for class variables. foo_ http://gerrit.cloudera.org:8080/#/c/12118/8/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@292 PS8, Line 292: if configured nit: sounds vague, rephrase? http://gerrit.cloudera.org:8080/#/c/12118/8/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@292 PS8, Line 292: In order to determine if : * event polling is enabled, this method looks at the value of HMS polling frequency in : * <code>BackendConfig</code>. I think we can get rid of the implementation detail here, remove? Also, I don't think you are doing that check here? Return early (and log) if it is disabled? http://gerrit.cloudera.org:8080/#/c/12118/8/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@304 PS8, Line 304: LOG.error( : "Unable to fetch the current notification event id from metastore." : + "Metastore event processing will be disabled."); Isn't this a fatal error? (Include the stack trace in the error log) http://gerrit.cloudera.org:8080/#/c/12118/8/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java File fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java: http://gerrit.cloudera.org:8080/#/c/12118/8/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java@45 PS8, Line 45: events Probably good to preface this with a what an "event" is with a pointer to Hive's class. Basically when is an event generated etc (since that isn't quite obvious for Impala folks). Also probably worth mentioning that the goal is to apply changes external to Impala. http://gerrit.cloudera.org:8080/#/c/12118/8/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java@47 PS8, Line 47: The polling of the events is from a : * given start event id at the construction time. Subsequently, it updates the last event : * id until which it has already synced so that next poll is from the last synced event : * id. Maybe just rephrase to "we keep track of last synced event in each iteration..." http://gerrit.cloudera.org:8080/#/c/12118/8/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java@51 PS8, Line 51: 1000 MAX_EVENTS.... http://gerrit.cloudera.org:8080/#/c/12118/8/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java@51 PS8, Line 51: In case there is a use-case in the future, we can change the batch size to a : * configurable value. The polling frequency can be configured using the backend config : * hms_event_polling_frequency_s Remove? http://gerrit.cloudera.org:8080/#/c/12118/8/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java@56 PS8, Line 56: * metastore. In case of CREATE_TABLE/CREATE_DATABASE events, a new table/database is nit: Make this bullet points instead? (supported / non-supported) http://gerrit.cloudera.org:8080/#/c/12118/8/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java@64 PS8, Line 64: In order to function correctly, it assumes that Hive metastore is configured correctly : * to generate the events. Following configurations should be set in metastore : * hive-site.xml so that events have sufficient information and actions can be taken base : * on them. : * : * <code>hive.metastore.notifications.add.thrift.objects</code> should be set to : * <code>true</code> so that event messages contain thrift object and can be used to : * create new objects in Catalog <code>hive.metastore.alter.notifications.basic</code> : * should be set to <code>true</code> so that all the needed ALTER events are generated I think this is implementation detail can be omitted in the class comment (since you already mentioned it in the init methods). http://gerrit.cloudera.org:8080/#/c/12118/8/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java@84 PS8, Line 84: METASTORE_NOTIFICATIONS_ADD_THRIFT_OBJECTS _CONFIG_KEY http://gerrit.cloudera.org:8080/#/c/12118/8/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java@84 PS8, Line 84: static final String METASTORE_NOTIFICATIONS_ADD_THRIFT_OBJECTS = for consistency, use public/private? http://gerrit.cloudera.org:8080/#/c/12118/8/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java@88 PS8, Line 88: private long lastSyncedEventId_; Observability: expose this as a catalog server metric? (look at <catalog-webui>/metrics). Also helps with testing. http://gerrit.cloudera.org:8080/#/c/12118/8/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java@93 PS8, Line 93: private final ScheduledExecutorService scheduler = nit: "_" for class members (multiple places) http://gerrit.cloudera.org:8080/#/c/12118/8/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java@109 PS8, Line 109: private static final int EVENTS_BATCH_SIZE = 1000; oneliner comment. http://gerrit.cloudera.org:8080/#/c/12118/8/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java@113 PS8, Line 113: Preconditions.checkNotNull(catalog); : this.catalog_ = catalog; this.catalog_ = Preconditions.checkNotNull(catalog); http://gerrit.cloudera.org:8080/#/c/12118/8/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java@122 PS8, Line 122: if (pollingFrequencyInSec > 0) { : scheduleAtFixedDelayRate(pollingFrequencyInSec); : } Probably worth moving this check to the parent class and skip the c'tor if this is not set. http://gerrit.cloudera.org:8080/#/c/12118/8/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java@128 PS8, Line 128: of update. http://gerrit.cloudera.org:8080/#/c/12118/8/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java@139 PS8, Line 139: Starting metastore event polling with interval " + pollingFrequencyInSec : + " seconds." nit: String.format() http://gerrit.cloudera.org:8080/#/c/12118/8/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java@147 PS8, Line 147: ); include stack trace http://gerrit.cloudera.org:8080/#/c/12118/8/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java@156 PS8, Line 156: private long getCurrentNotificationID() { Looks like the only caller is processHMSNotificationEvents(), can we reuse the msClient? (do we need a separate method? can't we just move it to the try-catch block inside processHMSNotificationEvents()?) http://gerrit.cloudera.org:8080/#/c/12118/8/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java@174 PS8, Line 174: lastSyncedEventId_ = : lastSyncedEventId_ == -1 ? getCurrentNotificationID() : lastSyncedEventId_; : if (lastSyncedEventId_ == -1) { : LOG.warn( : "Unable to fetch current notification event id. Cannot sync with metastore"); : Like I commented below, it appears to me that (lastSyncedEventId_ != -1) is a precondition. Thoughts? http://gerrit.cloudera.org:8080/#/c/12118/8/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java@182 PS8, Line 182: CurrentNotificationEventId currentNotificationEventId = : msClient.getHiveClient().getCurrentNotificationEventId(); : long currentEventId = currentNotificationEventId.getEventId(); : if (currentEventId > lastSyncedEventId_) { Can we skip this and directly do a single RPC to fetch the latest events? events = getEvents(); if (events.size() == 0) break; http://gerrit.cloudera.org:8080/#/c/12118/8/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java@187 PS8, Line 187: EVENTS_BATCH_SIZE Looks like we are only doing only a single batch per iteration. Should we rename this to MAX_EVENTS_PER_RPC or something? http://gerrit.cloudera.org:8080/#/c/12118/8/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java@191 PS8, Line 191: // update the lastSyncedEventId before processing the event itself Looks like this event is lost forever. Any idea how this can be surfaced back to the user? http://gerrit.cloudera.org:8080/#/c/12118/8/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java@196 PS8, Line 196: processEvent(event); If a single event processing throws an exception, we seem to be breaking out of the loop. Does that mean everything after that is lost too? http://gerrit.cloudera.org:8080/#/c/12118/8/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java@212 PS8, Line 212: dbName This seems to be optional in the thrift spec and the next line there is a (db != null) check. Are there events where this is not set in the response? Should we add those tests? http://gerrit.cloudera.org:8080/#/c/12118/8/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java@215 PS8, Line 215: case "ADD_PARTITION": Strings look risky, is there no EventType enum or some such? http://gerrit.cloudera.org:8080/#/c/12118/8/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java@221 PS8, Line 221: case "CREATE_TABLE": This seems to contain too much code and branches. Can we define an EventHandler interface (or abstract class) and implement it for various Events and register them at init? That simplifies the code here to dispatch. Thoughts? http://gerrit.cloudera.org:8080/#/c/12118/8/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java@222 PS8, Line 222: catalog_.addTable(dbName, tblName); What if this races with an addTable() from the local cluster? http://gerrit.cloudera.org:8080/#/c/12118/8/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java@235 PS8, Line 235: catalog_.removeTable(dbName, tblName); Same as above. What if there are races? (multiple cases below) http://gerrit.cloudera.org:8080/#/c/12118/8/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java@257 PS8, Line 257: /*if (db != null) { : msDb = MetastoreEventsProcessor.getDatabaseFromMessage(event); : db.setMSDb(msDb); : }*/ Is this intentional? (given you changed the Db class locking protocol) http://gerrit.cloudera.org:8080/#/c/12118/8/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java@287 PS8, Line 287: private boolean processRenameTableEvent(NotificationEvent event) It looks like to me like these should call into the CatalogOpExecutor (modulo apply*() calls) primarily for - Locking considerations (races) - To avoid code duplication http://gerrit.cloudera.org:8080/#/c/12118/8/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java@347 PS8, Line 347: Check if %s is set to true in metastore configuration Lets force this check? http://gerrit.cloudera.org:8080/#/c/12118/8/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java@401 PS8, Line 401: startSyncFromId If startSyncFromId == -1, shouldn't that be a fatal error? http://gerrit.cloudera.org:8080/#/c/12118/8/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java@402 PS8, Line 402: if (INSTANCE == null) { Do we need to validate the HiveClient config? property> <name>hive.metastore.notifications.add.thrift.objects</name> <value>true</value> </property> <property> <name>hive.metastore.alter.notifications.basic</name> <value>false</value> </property> http://gerrit.cloudera.org:8080/#/c/12118/8/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java@412 PS8, Line 412: static class MetastoreNotificationException extends ImpalaException { Move to its own class file http://gerrit.cloudera.org:8080/#/c/12118/8/fe/src/test/resources/postgresql-hive-site.xml.template File fe/src/test/resources/postgresql-hive-site.xml.template: http://gerrit.cloudera.org:8080/#/c/12118/8/fe/src/test/resources/postgresql-hive-site.xml.template@223 PS8, Line 223: <!-- This property is required to issue invalidates based on metastore events. See IMPALA-7954 for details --> nit:wrap lines. -- To view, visit http://gerrit.cloudera.org:8080/12118 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7 Gerrit-Change-Number: 12118 Gerrit-PatchSet: 8 Gerrit-Owner: Anonymous Coward <[email protected]> Gerrit-Reviewer: Anonymous Coward <[email protected]> Gerrit-Reviewer: Bharath Vissapragada <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Comment-Date: Wed, 26 Dec 2018 02:23:46 +0000 Gerrit-HasComments: Yes
