Vihang Karajgaonkar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17308 )

Change subject: IMPALA-10502: Handle CREATE/DROP events correctly
......................................................................


Patch Set 6:

(24 comments)

http://gerrit.cloudera.org:8080/#/c/17308/5/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
File fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java:

http://gerrit.cloudera.org:8080/#/c/17308/5/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java@a689
PS5, Line 689:
> Will removing these break third party extensions? If not, we can remove the
yes, you are right. Thanks for catching that!


http://gerrit.cloudera.org:8080/#/c/17308/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/17308/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@874
PS5, Line 874:     loadFileMetadataForPartitions(client, addedPartBuilders, 
/*isRefresh=*/false);
> Can we refactor this with the above method? The major logics are the same.
Done


http://gerrit.cloudera.org:8080/#/c/17308/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@883
PS5, Line 883:
             :   private HdfsPartition.Builder
> nit: can fit into one line
Done


http://gerrit.cloudera.org:8080/#/c/17308/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@2436
PS5, Line 2436:  value.
> nit: I think we prefer ++i and some spaces
Done. I never realized that we prefer ++i v/s i++ in the loops. A quick grep 
does indeed so ++i has 319 instances and i++ has only 61. Did some reading to 
know if this is more of a stylistic preference or something more :)


http://gerrit.cloudera.org:8080/#/c/17308/5/fe/src/main/java/org/apache/impala/catalog/TableLoader.java
File fe/src/main/java/org/apache/impala/catalog/TableLoader.java:

http://gerrit.cloudera.org:8080/#/c/17308/5/fe/src/main/java/org/apache/impala/catalog/TableLoader.java@78
PS5, Line 78:       synchronized (metastoreAccessLock_) {
> hmm.. not related to this patch, I think we don't need this anymore. We hav
yeah, I agree. I filed IMPALA-10706


http://gerrit.cloudera.org:8080/#/c/17308/5/fe/src/main/java/org/apache/impala/catalog/TableLoader.java@112
PS5, Line 112:   throw new TableLoadingExceptio
> I'm not clear on the purpose of this. Do we depend on the createEventId whe
The table has a new field createEventId which is used to track the event. When 
a table is created from Impala we create a IncompleteTable which has the 
createEventId set to corresponding CREATE_TABLE event id from the HMS. But it 
is possible that when the table is loaded, the table was dropped and recreated 
outside Impala. Events processor doesn't update the createEventId of the table 
since this table is in unloaded state. Hence we need to update the eventId to 
the latest CREATE_TABLE event id when we load the table so that the drop and 
create table event which events processor receives on this loaded table is 
ignored.

I will update the comment with more details to make it more readable.


http://gerrit.cloudera.org:8080/#/c/17308/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/17308/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@744
PS5, Line 744:           LOG.debug("EventId: {} Table {} was not added since "
> nit: Log table name as well?
Done


http://gerrit.cloudera.org:8080/#/c/17308/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@787
PS5, Line 787:       Table tblBefore = null;
> nit: could you move this to line 805 since it's only used there?
Done


http://gerrit.cloudera.org:8080/#/c/17308/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@809
PS5, Line 809: origin
> nit: "original" makes more sense for me :)
Done


http://gerrit.cloudera.org:8080/#/c/17308/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1305
PS5, Line 1305:    * Alters an existing view's definition in the metastore. 
Throws an exception
> Can we refactor this with the above method? The major logics are the same.
Done


http://gerrit.cloudera.org:8080/#/c/17308/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1388
PS5, Line 1388:    * Updates table property 'impala.lastComputeStatsTime' for 
COMPUTE (INCREMENTAL) STATS,
> nit: redundant blank line
Done


http://gerrit.cloudera.org:8080/#/c/17308/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1773
PS5, Line 1773:           ((CreateDatabaseEvent) event).getDatabase());
              :     } catch (MetastoreNotificationException e) {
              :       throw new CatalogException("Unable to create a metastore 
event ", e);
              :     }
              :   }
              :
              :   /**
> Yeah, I think handling the IF NOT EXISTS case in this situation is a nice-t
It was not too complex. I implemented it for create table and create database. 
For partitions it was already being handled.


http://gerrit.cloudera.org:8080/#/c/17308/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1802
PS5, Line 1802:           .get(events.get(events.size() - 1));
              :       Preconditions.checkState(event instanceof 
CreateTableEvent);
              :
> nit: this can be one-line statement
Done


http://gerrit.cloudera.org:8080/#/c/17308/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@2257
PS5, Line 2257:         } else {
> Does it make more sense to either
yeah, I could have done either of those but I found that the code was becoming 
a lot more convoluted because of the need to handle the TException if I move 
this out of the try-with-resources block or if I pass the client to the 
getNextMetastoreEvents. I thought this perf benefit is unlikely to be the 
common case and hence wasn't worth the extra complexity. The only time I can 
see this would benefit is when the dropDatabase is using the last metastore 
client of the pool and getNextMetastoreEvents is creating a new client. 
However, at that point it is clear that catalogd needs more clients than it is 
configured with due to possibly increased concurrency of ddls and this kind of 
bottleneck would be observed at many places.


http://gerrit.cloudera.org:8080/#/c/17308/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@2260
PS5, Line 2260:         }
> Why are we filtering DROP_TABLE_EVENT_TYPE here? Isn't it being handled at 
we are interested in filtering out the drop table events also here because this 
drop database operation could be a drop database cascade command which would 
drop the underlying tables as well. line 2548 is to handle the drop table 
command. In both the cases we need to update the deleteEventLog. Hope that 
makes sense.


http://gerrit.cloudera.org:8080/#/c/17308/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3093
PS5, Line 3093:                         
.equalsIgnoreCase(event.getTableName()));
> Just confirming: For any ddl operations, HMS adds entry to NotificationEven
yes, HMS events are generated as part of the RDBMS transaction which adds the 
table to the HMS.


http://gerrit.cloudera.org:8080/#/c/17308/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3198
PS5, Line 3198: e
> Thoughts?
Yeah, I think we should change it to .equals() But perhaps in a separate change 
so that we can port it to any other branches if needed.


http://gerrit.cloudera.org:8080/#/c/17308/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3209
PS5, Line 3209:       if (cacheOp != null && cacheOp.isSet_cached()) {
> Thoughts?
yeah ,I have updated my patch to remove this additional getTable call. I will 
see if this passes the tests to make sure.


http://gerrit.cloudera.org:8080/#/c/17308/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3255
PS5, Line 3255:           "ifNotExists is true.", tableName));
> nit: redundant blank line
Done


http://gerrit.cloudera.org:8080/#/c/17308/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@4015
PS5, Line 4015:                 + "
> nit: can we remove this and the parameter in the method if we don't need it
makes sense.


http://gerrit.cloudera.org:8080/#/c/17308/5/fe/src/main/java/org/apache/impala/service/JniCatalog.java
File fe/src/main/java/org/apache/impala/service/JniCatalog.java:

http://gerrit.cloudera.org:8080/#/c/17308/5/fe/src/main/java/org/apache/impala/service/JniCatalog.java@185
PS5, Line 185:     "Fatal error while ini
> Throwing the execption will fail the process. Based on the aboved logs, it
Thanks for pointing this out. I think it is valid to throw CatalogException 
here since the user configured to use the events processor but we cannot 
instantiate it. I will update the log message.


http://gerrit.cloudera.org:8080/#/c/17308/5/tests/custom_cluster/test_event_processing.py
File tests/custom_cluster/test_event_processing.py:

http://gerrit.cloudera.org:8080/#/c/17308/5/tests/custom_cluster/test_event_processing.py@285
PS5, Line 285: t Exception, e:
> No needed? It seems 'q' is already a concrete string.
Thanks for catching that. Done.


http://gerrit.cloudera.org:8080/#/c/17308/5/tests/custom_cluster/test_event_processing.py@286
PS5, Line 286:   print("Failed
> nit: might be helpful to print the exception as well
Done


http://gerrit.cloudera.org:8080/#/c/17308/5/tests/custom_cluster/test_event_processing.py@591
PS5, Line 591: r is not updated. Inste
> nit: process it
Done



--
To view, visit http://gerrit.cloudera.org:8080/17308
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia2c5e96b48abac015240f20295b3ec3b1d71f24a
Gerrit-Change-Number: 17308
Gerrit-PatchSet: 6
Gerrit-Owner: Vihang Karajgaonkar <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Quanlong Huang <[email protected]>
Gerrit-Reviewer: Sourabh Goyal <[email protected]>
Gerrit-Reviewer: Vihang Karajgaonkar <[email protected]>
Gerrit-Comment-Date: Tue, 18 May 2021 00:36:52 +0000
Gerrit-HasComments: Yes

Reply via email to