Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/23174 )
Change subject: IMPALA-14227: In HA failover, passive catalogd should apply pending HMS events before being active
......................................................................


Patch Set 7:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/23174/4//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/23174/4//COMMIT_MSG@11
PS4, Line 11: However, it could still use
            : a stale metadata cache when some pending HMS events generated by the
            : previous active catalogd are not applied yet.
> . For storages like S3 that don't have block locations, reload might have the
> same performance as the initial loading since both time is dominated in file
> listing.
This is true for external tables, but for Hive ACID tables we could skip file listing if validWriteIdList didn't change. For Iceberg tables we don't need file listing at all. I will think this through and create a Jira.

Marking tables stale could also be very useful for reducing load during event processing. For example:
1. mark a table stale if no catalog operation has touched it for N minutes
2. while a table is stale, HMS events for it are ignored (with the exception of drop table/rename), but the cached data is kept in memory in catalogd
3. the staleness is propagated from catalogd to the coordinators through the statestore
4. if a coordinator wants to use a stale table, it has to request the data from the catalog again
5. the new request to the catalog "revives" the table, leading to a REFRESH that assumes the existing file descriptors are still valid

This could be really efficient, for example, for Iceberg tables that are written frequently but read rarely by Impala. Between the rare reads the table could go stale, so the write events would not trigger a re-read of the Iceberg metadata every time. When Impala reads the table again, only the freshest Iceberg snapshot would need to be read, and block locations would need to be fetched only for new files. The cost is that catalogd <-> coordinator traffic would increase.
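
To make the five steps above concrete, here is a minimal sketch of the stale-table lifecycle. It is only an illustration of the idea, not Impala code: `StaleTableTracker`, `CachedTable`, the event-type strings and the `now` parameter are all hypothetical names, and the statestore propagation in step 3 is reduced to a returned list. It also assumes that applied HMS events do not count as "catalog operations", since otherwise a frequently written table would never go stale.

```python
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical event names; events of these types are applied even to stale tables.
ALWAYS_APPLIED = {"DROP_TABLE", "RENAME_TABLE"}


@dataclass
class CachedTable:
    name: str
    last_access: float       # time of the last catalog operation (assumed: reads only)
    stale: bool = False
    applied_events: int = 0  # how many HMS events were actually processed
    refresh_count: int = 0   # how many times the table was revived


class StaleTableTracker:
    """Toy model of the proposal; not an Impala class."""

    def __init__(self, idle_seconds: float):
        self.idle_seconds = idle_seconds
        self.tables: Dict[str, CachedTable] = {}

    def add_table(self, name: str, now: float) -> None:
        self.tables[name] = CachedTable(name, last_access=now)

    def mark_idle_tables_stale(self, now: float) -> List[str]:
        # Step 1: mark tables stale after N idle seconds.
        newly_stale = []
        for table in self.tables.values():
            if not table.stale and now - table.last_access >= self.idle_seconds:
                table.stale = True
                newly_stale.append(table.name)
        # Step 3: the caller would publish this list to the coordinators.
        return newly_stale

    def on_hms_event(self, name: str, event_type: str) -> bool:
        # Step 2: ignore events on stale tables, except drop/rename.
        table = self.tables.get(name)
        if table is None:
            return False
        if table.stale and event_type not in ALWAYS_APPLIED:
            return False
        if event_type == "DROP_TABLE":
            del self.tables[name]
            return True
        table.applied_events += 1
        return True

    def on_coordinator_request(self, name: str, now: float) -> CachedTable:
        # Steps 4 and 5: a coordinator request revives a stale table,
        # which stands in for the REFRESH described above.
        table = self.tables[name]
        if table.stale:
            table.refresh_count += 1
            table.stale = False
        table.last_access = now
        return table
```

In this model a table that is written every minute but read once a day would process events only for the first N minutes after each read, and pay for one refresh per read.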
-- 
To view, visit http://gerrit.cloudera.org:8080/23174
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Icf4fcb0e27c14197f79625749949b47c033a5f31
Gerrit-Change-Number: 23174
Gerrit-PatchSet: 7
Gerrit-Owner: Quanlong Huang <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Quanlong Huang <[email protected]>
Gerrit-Reviewer: Riza Suminto <[email protected]>
Gerrit-Reviewer: Wenzhe Zhou <[email protected]>
Gerrit-Comment-Date: Thu, 17 Jul 2025 10:56:58 +0000
Gerrit-HasComments: Yes
