Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/20022 )
Change subject: IMPALA-11535: Skip older events in the event processor based on the latestRefreshEventID ...................................................................... Patch Set 3: (11 comments) http://gerrit.cloudera.org:8080/#/c/20022/3/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java: http://gerrit.cloudera.org:8080/#/c/20022/3/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1285 PS3, Line 1285: // update refresh event id at partition level This and the comment at L1298 confuse me since no codes after them are actually updating the refresh event id. Can we remove them? http://gerrit.cloudera.org:8080/#/c/20022/3/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1923 PS3, Line 1923: long latestEventId = getLatestEventId(client); I think we should get this before loadFileMetadataForPartitions() as well. http://gerrit.cloudera.org:8080/#/c/20022/3/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1930 PS3, Line 1930: p.setLastRefreshEventId(latestEventId); We'd better check and only update it if 'latestEventId' is not -1 or lower than the current tracked refresh event id. http://gerrit.cloudera.org:8080/#/c/20022/3/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1932 PS3, Line 1932: LOG.info("Setting the latest refresh event id to {} for the loaded partitions", Let's add the table name in the log. http://gerrit.cloudera.org:8080/#/c/20022/3/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@2947 PS3, Line 2947: public long getLatestEventId(IMetaStoreClient client) { This seems a static method and can be moved to MetastoreEventsProcessor. Actually, there is a similar method: MetastoreEventsProcessor.getCurrentEventId(). We can add a new one like getCurrentEventIdNoThrow() for this. http://gerrit.cloudera.org:8080/#/c/20022/3/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java: http://gerrit.cloudera.org:8080/#/c/20022/3/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1053 PS3, Line 1053: infoLog("Incremented events skipped counter to {}", We'd better add more info about why an event is skipped or not skipped. They would be useful in debugging issues when we lose track of the event ids, e.g. IMPALA-12256 We already have partition level logs at https://github.com/apache/impala/blob/fed883379383b368bbce4f3b36b95e01a7378578/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java#L3903 So we just need to add the table level refresh event id. http://gerrit.cloudera.org:8080/#/c/20022/3/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java: http://gerrit.cloudera.org:8080/#/c/20022/3/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1529 PS3, Line 1529: eventId = getCurrentEventId(msClient); This happens after we reload the table metadata which is unsafe if there are external DDLs finish between we reload the HMS schema and get the latest event id here. Those DDLs will be incorrectly skipped. To be safe, I think we should get the latest event id before L1522. BTW, do we need this extra RPC call for non-hdfs tables, e.g. kudu table? http://gerrit.cloudera.org:8080/#/c/20022/3/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1537 PS3, Line 1537: throw new CatalogException("Unable to load table metadata.", ex); It seems too strict to fail the DDL just for failures in getting the latest event id. It'd be better to just log a warning about older events might not be skipped due to failures in fetching the latest event id. http://gerrit.cloudera.org:8080/#/c/20022/3/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1615 PS3, Line 1615: eventId = getCurrentEventId(msClient); The same as above, I think this should run before the load() and exceptions in it should not fail the DDL. http://gerrit.cloudera.org:8080/#/c/20022/3/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java File fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java: http://gerrit.cloudera.org:8080/#/c/20022/3/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@3219 PS3, Line 3219: setEnableSyncToLatestEventOnDdls(true) Is this required to turn on this optimization? http://gerrit.cloudera.org:8080/#/c/20022/3/tests/custom_cluster/test_events_custom_configs.py File tests/custom_cluster/test_events_custom_configs.py: http://gerrit.cloudera.org:8080/#/c/20022/3/tests/custom_cluster/test_events_custom_configs.py@316 PS3, Line 316: --enable_reload_events=true Do we need this flag? -- To view, visit http://gerrit.cloudera.org:8080/20022 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic0dc5c7396d80616680d8a5805ce80db293b72e1 Gerrit-Change-Number: 20022 Gerrit-PatchSet: 3 Gerrit-Owner: Sai Hemanth Gantasala <[email protected]> Gerrit-Reviewer: Csaba Ringhofer <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Quanlong Huang <[email protected]> Gerrit-Reviewer: Sai Hemanth Gantasala <[email protected]> Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]> Gerrit-Comment-Date: Mon, 31 Jul 2023 02:40:25 +0000 Gerrit-HasComments: Yes
