Vihang Karajgaonkar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12938 )

Change subject: IMPALA-8338 : Check CREATION_TIME of databases in event 
processor to avoid incorrect/redundant invalidates
......................................................................


Patch Set 4:

(11 comments)

http://gerrit.cloudera.org:8080/#/c/12938/4//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/12938/4//COMMIT_MSG@8
PS4, Line 8: incorrect/redundant invalidates
I think this is not accurate. creation_time is not used for invalidating but 
rather for checking if the exact same instance of Db exists before dropping. So 
the a more apt description should be "... to avoid incorrect delete of 
databases"


http://gerrit.cloudera.org:8080/#/c/12938/4//COMMIT_MSG@15
PS4, Line 15: we do not need
            : to invalidate it in this case.
suggest you to change it to ".. and we can skip the event instead" which is 
more accurate.


http://gerrit.cloudera.org:8080/#/c/12938/4/be/src/util/event-metrics.h
File be/src/util/event-metrics.h:

http://gerrit.cloudera.org:8080/#/c/12938/4/be/src/util/event-metrics.h@40
PS4, Line 40: static IntCounter* NUM_EVENTS_FILTERED_COUNTER;
Is there any specific value in exposing this metric on the Catalog metrics 
page? The end-user most likely does not care about this metric or can take any 
decisions based on this metric. If this metric is only to improve debugging and 
observability of this feature then I would suggest it to keep it visible only 
in the /events page which gives more detailed view of the event processor.


http://gerrit.cloudera.org:8080/#/c/12938/4/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/12938/4/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@953
PS4, Line 953:   public Integer getDbCreationTime(String dbName) {
like suggested in my other comment related to race conditions , this API in 
itself is not very useful since no action can be taken based on creationTime 
itself without exposing races. May be change this to removeDbIfExists(Database 
msDb) which checks the creationTime and removes if it matches.


http://gerrit.cloudera.org:8080/#/c/12938/4/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java:

http://gerrit.cloudera.org:8080/#/c/12938/4/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@208
PS4, Line 208:       
metrics_.getCounter(MetastoreEventsProcessor.EVENTS_FILTERED_METRIC).inc(numFilteredEvents);
> line too long (98 > 90)
Just having this line is sufficient to expose the metric in /events page. You 
don't need the other changes in the event-metrics.h etc if you decide to skip 
exposing this metric to the metrics page.


http://gerrit.cloudera.org:8080/#/c/12938/4/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@898
PS4, Line 898: the latest and we do not need to invalidate it in this case
The process method does not invalidate so this description needs to be updated.

suggest you to change it "... in the catalog is a later version and we can skip 
the event"


http://gerrit.cloudera.org:8080/#/c/12938/4/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@903
PS4, Line 903: int dbCreationTimeFromEvent = droppedDatabase_.getCreateTime();
             :       Integer dbCreationTimeFromCatalog = 
catalog_.getDbCreationTime(dbName_);
             :
             :       if (dbCreationTimeFromCatalog == null) {
             :         infoLog("Database {} not present in the catalog, hence 
no " +
             :                 "action is required for DROP_DATABASE event.", 
dbName_);
             :         return;
             :       }
             :
             :       if (dbCreationTimeFromCatalog > dbCreationTimeFromEvent) {
             :         infoLog("Not removing database {} from catalog as it 
already has a" +
             :                 " DB object which was created later.", dbName_);
             :         
metrics_.getCounter(MetastoreEventsProcessor.EVENTS_FILTERED_METRIC).inc();
This code has potential race conditions. For example, the createTime of the 
database which is fetched on line 903 could change by the time it is checked at 
line 912. Both the creation_time check and database removal should be done 
atomically while holding the write lock. See 
CatalogServiceCatalog#removeTableIfExists as an example.


http://gerrit.cloudera.org:8080/#/c/12938/4/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
File 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java:

http://gerrit.cloudera.org:8080/#/c/12938/4/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@272
PS4, Line 272:   public void testCreateDropCreateDatabase() throws TException {
add javadoc to describe what exactly this test is testing. Specifically, 
describe around why checking this sequence of events is important.


http://gerrit.cloudera.org:8080/#/c/12938/4/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@284
PS4, Line 284: public
add javadoc


http://gerrit.cloudera.org:8080/#/c/12938/4/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@298
PS4, Line 298:     assertEquals(2, eventsProcessor_.getMetrics()
hmm, when a create_db, drop_db, create_db is found, I thought only the first 
create_db is filtered out. Why is the count 2?


http://gerrit.cloudera.org:8080/#/c/12938/4/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@302
PS4, Line 302: dropDatabaseCascadeFromImpala
isn't the @After method already doing this?



--
To view, visit http://gerrit.cloudera.org:8080/12938
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8fd3685cf7261e4953f4f884850489a47c5bbd6c
Gerrit-Change-Number: 12938
Gerrit-PatchSet: 4
Gerrit-Owner: Bharath Krishna <[email protected]>
Gerrit-Reviewer: Anurag Mantripragada <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Vihang Karajgaonkar <[email protected]>
Gerrit-Comment-Date: Tue, 09 Apr 2019 17:59:51 +0000
Gerrit-HasComments: Yes

Reply via email to