Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/20367 )
Change subject: IMPALA-10976: Sync db/table to latest HMS event for all DDL/DMLs ...................................................................... Patch Set 21: (28 comments) Thanks for the hard work on this great feature! The patch looks much better now. I just have some minor comments. http://gerrit.cloudera.org:8080/#/c/20367/21//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/20367/21//COMMIT_MSG@27 PS21, Line 27: Set 'enable_sync_to_latest_event_on_ddls' to true. nit: more contents here? http://gerrit.cloudera.org:8080/#/c/20367/21/be/src/catalog/catalog-server.cc File be/src/catalog/catalog-server.cc: http://gerrit.cloudera.org:8080/#/c/20367/21/be/src/catalog/catalog-server.cc@139 PS21, Line 139: "cache_directive_id, cache_replication", "This configuration is used to whitelist " nit: could you mention this change in the commit message? http://gerrit.cloudera.org:8080/#/c/20367/21/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java: http://gerrit.cloudera.org:8080/#/c/20367/21/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@2676 PS21, Line 2676: tbl.setLastRefreshEventId(eventId, isSetLastSyncEventId); Why do we still update lastRefreshEventId if isSetLastSyncEventId=false in this branch? If the event is the latest event, currentHmsEventId will be eventId then go into this branch. BTW, 'isSetLastSyncEventId' sounds like lastSyncedEventId is already set. 'isFullReload' might be better. http://gerrit.cloudera.org:8080/#/c/20367/21/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java: http://gerrit.cloudera.org:8080/#/c/20367/21/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@2825 PS21, Line 2825: accessLevel_ = getAvailableAccessLevel(getFullName(), tblLocation, This could be a heavy operation for large tables that have many partitions since we are passing an empty permission cache. We can remove this in IMPALA-7539 if we decide to remove the support on permission checks. http://gerrit.cloudera.org:8080/#/c/20367/21/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java: http://gerrit.cloudera.org:8080/#/c/20367/21/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1623 PS21, Line 1623: !Objects.equals(beforeSd.getSerdeInfo(), afterSd.getSerdeInfo())) { For supportability, we'd better add logs about why we return true here. http://gerrit.cloudera.org:8080/#/c/20367/20/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java: http://gerrit.cloudera.org:8080/#/c/20367/20/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3162 PS20, Line 3162: trySyncToLatestEventId(table))) { > For a truncate table operation, the alter_table() HMS API is only called if Ack. Thanks for the explanation! http://gerrit.cloudera.org:8080/#/c/20367/20/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@6612 PS20, Line 6612: boolean syncToLatestEventId = not used http://gerrit.cloudera.org:8080/#/c/20367/20/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@7098 PS20, Line 7098: if (update.isSetDebug_action()) { We can remove this check since nulls are checked inside DebugUtils.executeDebugAction(). http://gerrit.cloudera.org:8080/#/c/20367/21/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java: http://gerrit.cloudera.org:8080/#/c/20367/21/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@404 PS21, Line 404: private final Metrics metrics_ = new Metrics(); We will use this in concurrent DDL/DMLs. Based on the class comment, Metrics is not thread-safe: https://github.com/apache/impala/blob/5af8fef199b60fb7725971b419596a36e48b1eec/fe/src/main/java/org/apache/impala/common/Metrics.java#L33 But looking into the underlying MetricRegistry, it's using ConcurrentMap and CopyOnWriteArrayList which are all thread-safe. Not sure if any underlying classes are not thread-safe, e.g. Guage, Counter, Meter, etc. It might be safer to create new instance for each operation. http://gerrit.cloudera.org:8080/#/c/20367/21/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1147 PS21, Line 1147: if (!syncToLatestEventId) { Could you add a comment saying that HdfsTable is not updated in alterTableAddPartitions() if this flag is on? http://gerrit.cloudera.org:8080/#/c/20367/21/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1187 PS21, Line 1187: if (!syncToLatestEventId) { Please also add a comment saying that HdfsTable is not updated in alterTableDropPartition() if this flag is on. http://gerrit.cloudera.org:8080/#/c/20367/21/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@4362 PS21, Line 4362: Also creates and adds new : * HdfsPartitions to the corresponding HdfsTable. nit: add "if enableSyncToLatestEventOnDdls is false". http://gerrit.cloudera.org:8080/#/c/20367/21/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@5234 PS21, Line 5234: * Also drops the corresponding partitions from its Hdfs table. nit: add "if enableSyncToLatestEventOnDdls is false" http://gerrit.cloudera.org:8080/#/c/20367/21/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java File fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java: http://gerrit.cloudera.org:8080/#/c/20367/21/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@1498 PS21, Line 1498: BackendConfig.INSTANCE.setInvalidateCatalogdHMSCacheOnDDLs(false); nit: do we still need to set this flag in this test? http://gerrit.cloudera.org:8080/#/c/20367/21/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@1511 PS21, Line 1511: eventsProcessor_.processEvents(); Can we also verify all events are skipped? We can add a wrapper for this, e.g. processEventsAndAssertAllSkipped(). long lastReceivedEvents = eventsProcessor_.getMetrics() .getCounter(MetastoreEventsProcessor.EVENTS_RECEIVED_METRIC).getCount(); long lastSkippedEvents = eventsProcessor_.getMetrics() .getCounter(MetastoreEventsProcessor.EVENTS_SKIPPED_METRIC).getCount(); eventsProcessor_.processEvents(); long currReceivedEvents = eventsProcessor_.getMetrics() .getCounter(MetastoreEventsProcessor.EVENTS_RECEIVED_METRIC).getCount(); long currSkippedEvents = eventsProcessor_.getMetrics() .getCounter(MetastoreEventsProcessor.EVENTS_SKIPPED_METRIC).getCount(); assertEquals("All events should be skipped", currSkippedEvents - lastSkippedEvents, currReceivedEvents - lastReceivedEvents); http://gerrit.cloudera.org:8080/#/c/20367/21/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@1513 PS21, Line 1513: tbl.getLastSyncedEventId() == eventsProcessor_.getCurrentEventId()); Can we get the value of tbl.getLastSyncedEventId() before processEvents()? The reason is processEvents() will also update the last synced event id of the table. What we want to verify is it's already synced before processEvents(). long tblLastSyncedEventId = tbl.getLastSyncedEventId(); eventsProcessor_.processEvents(); assertEquals("Table not synced to latest event id of event processor", tblLastSyncedEventId, eventsProcessor_.getCurrentEventId()); BTW, use assertEquals() which shows the values when it fails. http://gerrit.cloudera.org:8080/#/c/20367/21/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@1535 PS21, Line 1535: insertFromImpala(testTblName, true, "p1=2022", "p2=2023", To have more coverage, can we also insert into a new partition, i.e. "p1=2022" & "p2=2024" ? http://gerrit.cloudera.org:8080/#/c/20367/21/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@3533 PS21, Line 3533: throws ImpalaException nit: remove "throws ImpalaException" http://gerrit.cloudera.org:8080/#/c/20367/21/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@3585 PS21, Line 3585: throws ImpalaException nit: remove "throws ImpalaException" http://gerrit.cloudera.org:8080/#/c/20367/21/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@3638 PS21, Line 3638: throws ImpalaException { nit: remove "throws ImpalaException" http://gerrit.cloudera.org:8080/#/c/20367/21/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@3772 PS21, Line 3772: throws ImpalaException nit: remove "throws ImpalaException" http://gerrit.cloudera.org:8080/#/c/20367/21/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@3801 PS21, Line 3801: throws ImpalaException nit: remove "throws ImpalaException" http://gerrit.cloudera.org:8080/#/c/20367/21/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@3889 PS21, Line 3889: throws ImpalaException nit: remove "throws ImpalaException" http://gerrit.cloudera.org:8080/#/c/20367/21/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@3918 PS21, Line 3918: throws ImpalaException nit: remove "throws ImpalaException". http://gerrit.cloudera.org:8080/#/c/20367/21/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@3947 PS21, Line 3947: throws ImpalaException { nit: remove "throws ImpalaException" http://gerrit.cloudera.org:8080/#/c/20367/21/fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java File fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java: http://gerrit.cloudera.org:8080/#/c/20367/21/fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java@247 PS21, Line 247: prevSyncedEventId); Both processEvents() at L232 and execDdlRequest() will update last synced event id of the db. We can't prove which one does the change. I think we should update 'prevSyncedEventId' after L232. http://gerrit.cloudera.org:8080/#/c/20367/21/tests/custom_cluster/test_sync_to_latest_hms_events.py File tests/custom_cluster/test_sync_to_latest_hms_events.py: http://gerrit.cloudera.org:8080/#/c/20367/21/tests/custom_cluster/test_sync_to_latest_hms_events.py@47 PS21, Line 47: Each test in : this class should start a hms_server using the catalogd flag --start_hms_server=true, : enable_sync_to_latest_event_on_ddls=true, invalidate_hms_cache_on_ddls=false. This becomes stale now. Only enable_sync_to_latest_event_on_ddls=true is needed. http://gerrit.cloudera.org:8080/#/c/20367/21/tests/custom_cluster/test_sync_to_latest_hms_events.py@60 PS21, Line 60: def test_truncate_cleans_hdfs_files(self, unique_database): These codes are copied from test_ddl.py. We'd better reuse the method instead of copying the codes. So they won't be inconsistent in future code changes. We can extract the codes into a method and import it from test_ddl.py @classmethod def test_truncate_cleans_hdfs_files(client, filesystem_client, unique_database): BTW, to import tests.metadata.test_ddl, we need a __init__.py file inside the tests/metadata folder. See tests/util/__init__.py as an example. Other test methods in this file, e.g. test_sync_event_based_replication, also need refactor. -- To view, visit http://gerrit.cloudera.org:8080/20367 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia250d0a943838086c187e5cb7c60035e5a564bbf Gerrit-Change-Number: 20367 Gerrit-PatchSet: 21 Gerrit-Owner: Sai Hemanth Gantasala <[email protected]> Gerrit-Reviewer: Anonymous Coward <[email protected]> Gerrit-Reviewer: Csaba Ringhofer <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Quanlong Huang <[email protected]> Gerrit-Reviewer: Sai Hemanth Gantasala <[email protected]> Gerrit-Comment-Date: Fri, 29 Dec 2023 10:49:24 +0000 Gerrit-HasComments: Yes
