Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/23174 )
Change subject: IMPALA-14227: In HA failover, passive catalogd should apply pending HMS events before being active
......................................................................


Patch Set 2:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/23174/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/23174/2//COMMIT_MSG@15
PS2, Line 15: This patch adds a wait during HA failover to ensure HMS events before
            : the failover happens are all applied on the new active catalogd.
> This may become problematic in case the event processor is lagging - if the
Yeah, that could be a problem. I'm assuming the passive catalogd won't have a long lag, since it doesn't run any DDLs that could block event processing. I agree that adding a timeout is useful when external systems (e.g. HMS) are slow.

I plan to add an improvement where EventProcessor goes into a "catching up" state in which it just invalidates tables when processing events. Such a state can be used in this failover scenario, or when the active catalogd starts to have a long lag. It'd be a larger change, so I'll do it in a separate patch.


http://gerrit.cloudera.org:8080/#/c/23174/2/be/src/catalog/catalog-server.cc
File be/src/catalog/catalog-server.cc:

http://gerrit.cloudera.org:8080/#/c/23174/2/be/src/catalog/catalog-server.cc@873
PS2, Line 873:       SleepForMs(FLAGS_hms_event_polling_interval_s * 1000L);
> Wouldn't it be better to do an HMS RPC here to get the latest id, and wait
I tried to avoid adding new JNI methods. The sleep is just 1s by default. But yeah, fetching the latest event id from HMS directly is more robust. I'll change this.


http://gerrit.cloudera.org:8080/#/c/23174/2/be/src/catalog/catalog-server.cc@879
PS2, Line 879:       while (last_synced_hms_event_id < latest_hms_event_id)
> What will happen if the event processor runs into an error state? Will this
Yeah, I'm assuming EventProcessor won't go into the error state after IMPALA-12832. But there could still be some unhandled cases. I'll add a timeout for this.

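
For illustration, the bounded wait discussed in the last comment could look roughly like the sketch below. This is not the code in the patch: WaitForEventSync, its parameters, and the callback that reports the last synced event id are all hypothetical names, and only standard C++ is used instead of Impala utilities such as SleepForMs.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <functional>
#include <thread>

// Hypothetical helper, not part of Impala. Polls `get_last_synced_id` until it
// reaches `target_event_id` or until `timeout` has elapsed.
// Returns true if the target was reached, false if the wait timed out.
bool WaitForEventSync(int64_t target_event_id,
    const std::function<int64_t()>& get_last_synced_id,
    std::chrono::milliseconds timeout,
    std::chrono::milliseconds poll_interval) {
  assert(poll_interval.count() > 0);
  // steady_clock is monotonic, so wall-clock adjustments cannot extend the wait.
  const auto deadline = std::chrono::steady_clock::now() + timeout;
  while (get_last_synced_id() < target_event_id) {
    // Give up once the deadline passes, e.g. when the event processor is
    // stuck or in an error state and the synced id never advances.
    if (std::chrono::steady_clock::now() >= deadline) return false;
    std::this_thread::sleep_for(poll_interval);
  }
  return true;
}
```

Returning false rather than blocking forever leaves the caller to decide what to do when the wait times out, which is the open question raised in the comments above.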
--
To view, visit http://gerrit.cloudera.org:8080/23174
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Icf4fcb0e27c14197f79625749949b47c033a5f31
Gerrit-Change-Number: 23174
Gerrit-PatchSet: 2
Gerrit-Owner: Quanlong Huang <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Quanlong Huang <[email protected]>
Gerrit-Reviewer: Riza Suminto <[email protected]>
Gerrit-Reviewer: Wenzhe Zhou <[email protected]>
Gerrit-Comment-Date: Tue, 15 Jul 2025 14:30:10 +0000
Gerrit-HasComments: Yes
