Vihang Karajgaonkar has posted comments on this change. ( http://gerrit.cloudera.org:8080/12938 )
Change subject: IMPALA-8338 : Check CREATION_TIME of databases in event processor to avoid incorrect/redundant invalidates ...................................................................... Patch Set 4: (11 comments) http://gerrit.cloudera.org:8080/#/c/12938/4//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/12938/4//COMMIT_MSG@8 PS4, Line 8: incorrect/redundant invalidates I think this is not accurate. creation_time is not used for invalidating but rather for checking if the exact same instance of Db exists before dropping. So the a more apt description should be "... to avoid incorrect delete of databases" http://gerrit.cloudera.org:8080/#/c/12938/4//COMMIT_MSG@15 PS4, Line 15: we do not need : to invalidate it in this case. suggest you to change it to ".. and we can skip the event instead" which is more accurate. http://gerrit.cloudera.org:8080/#/c/12938/4/be/src/util/event-metrics.h File be/src/util/event-metrics.h: http://gerrit.cloudera.org:8080/#/c/12938/4/be/src/util/event-metrics.h@40 PS4, Line 40: static IntCounter* NUM_EVENTS_FILTERED_COUNTER; Is there any specific value in exposing this metric on the Catalog metrics page? The end-user most likely does not care about this metric or can take any decisions based on this metric. If this metric is only to improve debugging and observability of this feature then I would suggest it to keep it visible only in the /events page which gives more detailed view of the event processor. http://gerrit.cloudera.org:8080/#/c/12938/4/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java: http://gerrit.cloudera.org:8080/#/c/12938/4/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@953 PS4, Line 953: public Integer getDbCreationTime(String dbName) { like suggested in my other comment related to race conditions , this API in itself is not very useful since no action can be taken based on creationTime itself without exposing races. May be change this to removeDbIfExists(Database msDb) which checks the creationTime and removes if it matches. http://gerrit.cloudera.org:8080/#/c/12938/4/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java: http://gerrit.cloudera.org:8080/#/c/12938/4/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@208 PS4, Line 208: metrics_.getCounter(MetastoreEventsProcessor.EVENTS_FILTERED_METRIC).inc(numFilteredEvents); > line too long (98 > 90) Just having this line is sufficient to expose the metric in /events page. You don't need the other changes in the event-metrics.h etc if you decide to skip exposing this metric to the metrics page. http://gerrit.cloudera.org:8080/#/c/12938/4/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@898 PS4, Line 898: the latest and we do not need to invalidate it in this case The process method does not invalidate so this description needs to be updated. suggest you to change it "... in the catalog is a later version and we can skip the event" http://gerrit.cloudera.org:8080/#/c/12938/4/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@903 PS4, Line 903: int dbCreationTimeFromEvent = droppedDatabase_.getCreateTime(); : Integer dbCreationTimeFromCatalog = catalog_.getDbCreationTime(dbName_); : : if (dbCreationTimeFromCatalog == null) { : infoLog("Database {} not present in the catalog, hence no " + : "action is required for DROP_DATABASE event.", dbName_); : return; : } : : if (dbCreationTimeFromCatalog > dbCreationTimeFromEvent) { : infoLog("Not removing database {} from catalog as it already has a" + : " DB object which was created later.", dbName_); : metrics_.getCounter(MetastoreEventsProcessor.EVENTS_FILTERED_METRIC).inc(); This code has potential race conditions. For example, the createTime of the database which is fetched on line 903 could change by the time it is checked at line 912. Both the creation_time check and database removal should be done atomically while holding the write lock. See CatalogServiceCatalog#removeTableIfExists as an example. http://gerrit.cloudera.org:8080/#/c/12938/4/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java File fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java: http://gerrit.cloudera.org:8080/#/c/12938/4/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@272 PS4, Line 272: public void testCreateDropCreateDatabase() throws TException { add javadoc to describe what exactly this test is testing. Specifically, describe around why checking this sequence of events is important. http://gerrit.cloudera.org:8080/#/c/12938/4/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@284 PS4, Line 284: public add javadoc http://gerrit.cloudera.org:8080/#/c/12938/4/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@298 PS4, Line 298: assertEquals(2, eventsProcessor_.getMetrics() hmm, when a create_db, drop_db, create_db is found, I thought only the first create_db is filtered out. Why is the count 2? http://gerrit.cloudera.org:8080/#/c/12938/4/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@302 PS4, Line 302: dropDatabaseCascadeFromImpala isn't the @After method already doing this? -- To view, visit http://gerrit.cloudera.org:8080/12938 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8fd3685cf7261e4953f4f884850489a47c5bbd6c Gerrit-Change-Number: 12938 Gerrit-PatchSet: 4 Gerrit-Owner: Bharath Krishna <[email protected]> Gerrit-Reviewer: Anurag Mantripragada <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Vihang Karajgaonkar <[email protected]> Gerrit-Comment-Date: Tue, 09 Apr 2019 17:59:51 +0000 Gerrit-HasComments: Yes
