Vihang Karajgaonkar has posted comments on this change. ( http://gerrit.cloudera.org:8080/12118 )
Change subject: IMPALA-7970 : Add support for metastore event based automatic invalidate ...................................................................... Patch Set 20: (36 comments) http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java: http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@307 PS21, Line 307: BackendConfig.INSTANCE.getHMSPollingIntervalInSeconds() > eventPollingInterval Done http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@319 PS21, Line 319: LOG.error("Unable to fetch the current notification event id from metastore." : + "Metastore event processing will be disabled.", : e); > nit: Multiple places in the code, try to format in fewer lines. Something l I used the command given in https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=65868536 as advised by some of the other members of the team. Unfortunately, it doesn't like some of these lines. http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@324 PS21, Line 324: e); > Will the code that handles this exception write to the log both the message The LOG.error above logs the trace as well since it takes in the underlying cause exception as the second argument. The exception is uncaught in this particular case on the caller side, since we want CatalogD to not come up if it is configured to using event processing but for some reasons isn't able to instantiate one. http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1206 PS21, Line 1206: // Even though we get the current notification event id before stopping the event > clarification: reset() is equivalent to invalidating everything and fetchin This is similar to Tim's comment and our discussion. The line 1216 gets the currentEventID (latest) from HMS. And the event processing begins from that id. So the event processing jumps to the latest event id once the reset() is complete. There is still a slight chance that we will reprocess some of the events which are generated during reset() execution. It is better than missing those events which would lead to inconsistent state between catalog and HMS as far as event processing is concerned. The comment gives details of that case. http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1216 PS21, Line 1216: long currentEventId = metastoreEventProcessor_.getCurrentEventId(); > Should this be here? Or in the event processor itself? If here, why is it n It needs to be here to avoid the race condition which can lead to possible missed events. We want to get the latestEventId before triggering the reset() so that we can then restart the event processing after reset from this eventId. If we move the code in the start() method, there is a chance that we will miss processing some of the events which happen during after reset is complete, but before we started the event processing. See Tim's comment in the earlier review comments which gives a nice example. http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1360 PS21, Line 1360: Reference<Boolean> dbWasFound, Reference<Boolean> tblWasFound) { > I still think that using the reference is non-standard Java. A simple solut used the second option of return boolean and throw exception when db is not found. http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1363 PS21, Line 1363: Db db = getDb(dbName); > Let's think about this. We return the flags because we want to know if the if getting db before acquiring the lock has a race then it looks like addTable and renameTable also has a race. I will create separate JIRAs for fixing them http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1372 PS21, Line 1372: tblWasFound.setRef(true); > I wonder, do we really need to know if the table existed or not? Is it even consider a case where user do create, drop, create with the same table name from Impala. In case of if the create event from the first create statement, the table presented in the event is stale and should not be used to add to the catalog. I think it is useful to log this information to make sure that the create event was ignored in such a case. http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1475 PS21, Line 1475: Reference<Boolean> tblWasfound, Reference<Boolean> tblMatched) { > Same comments as above. Moved the getDb() call in the block where lock is acquired first. I would like to defer the suggestion of changing the Reference to later patch in the interest of time. http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1804 PS21, Line 1804: Table incompleteTable = IncompleteTable.createUninitializedTable(db, tblName); > move it after L1807? Done http://gerrit.cloudera.org:8080/#/c/12118/20/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java: http://gerrit.cloudera.org:8080/#/c/12118/20/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java@130 PS20, Line 130: public static abstract class MetastoreEvent { > The new event class hierarchy looks very good. Done http://gerrit.cloudera.org:8080/#/c/12118/20/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java@138 PS20, Line 138: protected final Logger LOG = Logger.getLogger(this.getClass()); > protected static final ... MetastoreEvent.class); this.getClass() gives us the ability to log the implementation class name which is what I thought would be useful to debug. Else you will have to grep the statement in this class to locate the line of the code. http://gerrit.cloudera.org:8080/#/c/12118/20/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java@160 PS20, Line 160: if (!event.getEventType().equalsIgnoreCase("CREATE_DATABASE") > Move this code to the constructor for the suggested MetastoreTableEvent cla Done http://gerrit.cloudera.org:8080/#/c/12118/20/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java@182 PS20, Line 182: } > Would move to the suggested MetastoreTableEvent class. Done http://gerrit.cloudera.org:8080/#/c/12118/20/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java@199 PS20, Line 199: private final String tblName_; > These two fields shadow fields of the same name in the base class. This is yeah, I missed cleaning this up in one of the many refactors this class went through. http://gerrit.cloudera.org:8080/#/c/12118/20/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java@241 PS20, Line 241: LOG.info(String.format("Added a table %s.%s after processing event " > Nit: to make it easier to read/parse log records, maybe: introduced a debugString method. Bharath had suggested the same. http://gerrit.cloudera.org:8080/#/c/12118/20/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java@271 PS20, Line 271: throw new MetastoreNotificationException("Unable to parse the alter table " > This is a round-about way to do error handling. Generally, you EITHER Would like to defer this to later in the interest of time. http://gerrit.cloudera.org:8080/#/c/12118/20/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java@293 PS20, Line 293: if (!invalidateCatalogTable(dbName_, tblName_)) { > If this is a method, then no need to pass the fields since the method has v missed that during refactor http://gerrit.cloudera.org:8080/#/c/12118/20/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java@318 PS20, Line 318: throw new MetastoreNotificationException(String.format("Could not rename " > Let's think about error handling a bit. There are two broad cases: If the table rename did not happen, there is something very unexpected going on in the catalog. I think its better to halt the event processing in such a case. If the rename did not happen and the table in catalog still has the old name, it may be possible the future events act on the old name leading to inconsistent catalog state. There is lot more to do in this feature and I would like to move on to the next sub-tasks. I would like to defer this suggestion to later unless you think it is critical for functionality of the patch. http://gerrit.cloudera.org:8080/#/c/12118/20/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java@333 PS20, Line 333: throw new MetastoreNotificationException(e); > Suggestion: have MetastoreNotificationException take this class, or the mes Good suggestion but in the interest of time the debugString method which does something similar is a good substitute for what your are suggesting here. http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java: http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java@155 PS21, Line 155: event %d of type %s on table %s > I think it is cleaner to define a debugString(NotificationEvent event) {} t Good suggestion. Added a util method which gets the debug string for a given list of message and arguments. It appends EventId and EventType information to the given message. http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java@160 PS21, Line 160: if (!event.getEventType().equalsIgnoreCase("CREATE_DATABASE") : && !event.getEventType().equalsIgnoreCase("DROP_DATABASE")) { : tblName_ = Preconditions.checkNotNull(event.getTableName()); : } else { : tblName_ = null; : } > nit: Reformat to the following? (fewer branches and use enum) The tblName_ is a final object so you can assign to it only once. http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java@181 PS21, Line 181: tblName_ > checkNotNull()? Done http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java@198 PS21, Line 198: private final String dbName_; : private final String tblName_; > aren't these set in the parent c'tor? yes, Looks like I missed removing it during one of the many iterations of the patch. Thanks for catching this. http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java@347 PS21, Line 347: droppedTable_ > isn't this more like tblTobeDropped_ ? It depends on whether you see it as a object which is dropped in metastore or which needs to be dropped in Catalog. Since this object is retrieved by parsing the NotificationEvent and the type of the object is metastore.api.Table not impala.catalog.Table, I thought the name made sense. http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java@434 PS21, Line 434: // > nit: remove, you are already in a comment block. Done http://gerrit.cloudera.org:8080/#/c/12118/20/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java: http://gerrit.cloudera.org:8080/#/c/12118/20/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java@205 PS20, Line 205: eventProcessorStatus_ = EventProcessorStatus.ACTIVE; > Should we write something to the log, perhaps at DEBUG or TRACE levels, tha added a info log. http://gerrit.cloudera.org:8080/#/c/12118/20/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java@265 PS20, Line 265: return metaStoreClient.getHiveClient().getCurrentNotificationEventId().getEventId(); > How is this used? How is it synchronized? Initially this was part of start() method. But we needed to get the currentNotificationID from metastore in catalogServiceCatalog reset() method. So instead of duplicating the code, I created a method which is called from start() method. http://gerrit.cloudera.org:8080/#/c/12118/20/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java@285 PS20, Line 285: LOG.info(String.format("Metastore event processing restarted. Last synced event id " > Isn't start() called only once? If not, what is the use case for pausing an the start() method starts the event processing from whatever is the current event id in metastore. This is done when catalog service is started. start(eventId) resumes the event processing from the given eventId. THis is done from the reset() method. WHen a user issues invalidate metadata, the catalog state is cleared and we can jump to the latest eventId. Its also good to stop event processing during reset() to avoid races between events from reset() http://gerrit.cloudera.org:8080/#/c/12118/20/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java@360 PS20, Line 360: Preconditions.checkState(lastProcessedEvent != null); > The catch-block handles two types of exceptions. It also handles two source good point. However, adding two catch blocks is not sufficient since event process can also throws MetastoreNotificationException. I added a simple check of lastProcessedEvent to find out if the getNextMetastoreEvents failed or the event.process() http://gerrit.cloudera.org:8080/#/c/12118/20/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java@371 PS20, Line 371: private void dumpEventInfoOnErrors(NotificationEvent event) { > The name indicates why the method might be called. But, the method itself d Done http://gerrit.cloudera.org:8080/#/c/12118/20/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java@373 PS20, Line 373: .append("Event id : " + event.getEventId() + "\n") > Nit: If using a StringBuilder, write each field as a separate .append() cal Done http://gerrit.cloudera.org:8080/#/c/12118/20/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java@376 PS20, Line 376: .append("Database name : " + event.getDbName() + "\n"); > Another nit: colons usually have no leading space: "Database name: foo", no Done http://gerrit.cloudera.org:8080/#/c/12118/20/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java@397 PS20, Line 397: public static synchronized ExternalEventsProcessor getOrCreate( > typically called "instance()" or "getInstance()". renamed the method and changed the INSTANCE to instance. However, I do think that this method needs to be synchronized to avoid the risk of double instantiation in the future this is called from multiple threads http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java: http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java@310 PS21, Line 310: if (currentEventId <= lastSyncedEventId_) { : return Collections.emptyList(); : } > nit: single line Done http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java@373 PS21, Line 373: .append("Event id : " + event.getEventId() + "\n") > weird formatting. Done -- To view, visit http://gerrit.cloudera.org:8080/12118 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7 Gerrit-Change-Number: 12118 Gerrit-PatchSet: 20 Gerrit-Owner: Vihang Karajgaonkar <[email protected]> Gerrit-Reviewer: Bharath Vissapragada <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Paul Rogers <[email protected]> Gerrit-Reviewer: Tim Armstrong <[email protected]> Gerrit-Reviewer: Vihang Karajgaonkar <[email protected]> Gerrit-Comment-Date: Sat, 19 Jan 2019 01:54:56 +0000 Gerrit-HasComments: Yes
