Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/23174 )

Change subject: IMPALA-14227: In HA failover, passive catalogd should apply 
pending HMS events before being active
......................................................................


Patch Set 2:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/23174/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/23174/2//COMMIT_MSG@15
PS2, Line 15: This patch adds a wait during HA failover to ensure HMS events before
            : the failover happens are all applied on the new active catalogd.
This may become problematic if the event processor is lagging: if the
passive catalogd is 1 hour behind HMS, does this mean the failover will take
1 hour, leaving the cluster without an active catalogd for a prolonged time?


http://gerrit.cloudera.org:8080/#/c/23174/2/be/src/catalog/catalog-server.cc
File be/src/catalog/catalog-server.cc:

http://gerrit.cloudera.org:8080/#/c/23174/2/be/src/catalog/catalog-server.cc@873
PS2, Line 873:   SleepForMs(FLAGS_hms_event_polling_interval_s * 1000L);
Wouldn't it be better to do an HMS RPC here to get the latest event id, and
wait until last_synced_hms_event_id reaches it?
I have the following problems with the sleep:
- if there are not many writes to HMS at the moment, catalogd may sleep
unnecessarily, making failover slower
- if the polling is delayed (e.g. by a slow HMS RPC), sleeping this long may
not be enough.
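
A rough sketch of the alternative, only to illustrate the idea. All names
below are hypothetical and not existing Impala code: fetch_latest_hms_event_id
stands in for an HMS RPC that returns the current notification event id, and
the atomic stands in for whatever tracks the last event id applied by the
event processor.

```cpp
#include <atomic>
#include <chrono>
#include <cstdint>
#include <functional>
#include <thread>

// Hypothetical helper (an assumption for illustration, not Impala's API).
// Fetches the latest HMS event id once, then polls until the locally applied
// event id reaches it. Returns the target id that was waited for.
inline int64_t WaitUntilSyncedToLatest(
    const std::function<int64_t()>& fetch_latest_hms_event_id,
    const std::atomic<int64_t>& last_synced_hms_event_id,
    std::chrono::milliseconds poll_interval) {
  // Capture the target once, so events created after the failover started
  // do not keep extending the wait.
  const int64_t target = fetch_latest_hms_event_id();
  while (last_synced_hms_event_id.load() < target) {
    std::this_thread::sleep_for(poll_interval);
  }
  return target;
}
```

If HMS is idle, the target is already reached and this returns without
sleeping. If polling is slow, it keeps waiting instead of assuming one
polling interval was enough. On its own it has no upper bound on the wait.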


http://gerrit.cloudera.org:8080/#/c/23174/2/be/src/catalog/catalog-server.cc@879
PS2, Line 879: while (last_synced_hms_event_id < latest_hms_event_id)
What will happen if the event processor runs into an error state? Will this
loop wait forever?

It may be useful to have a timeout for syncing events, and if it expires,
fall back to globally invalidating metadata.
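
Something along these lines, as a sketch only. The names are hypothetical:
processor_in_error stands in for a check of the event processor status, and
SyncResult is made up for this example.

```cpp
#include <atomic>
#include <chrono>
#include <cstdint>
#include <functional>
#include <thread>

// Hypothetical result type, for illustration only.
enum class SyncResult { kSynced, kProcessorError, kTimedOut };

// Waits until the applied event id reaches target_event_id, but gives up if
// the event processor reports an error or the timeout expires.
inline SyncResult WaitForEventSync(
    int64_t target_event_id,
    const std::atomic<int64_t>& last_synced_hms_event_id,
    const std::function<bool()>& processor_in_error,
    std::chrono::milliseconds timeout,
    std::chrono::milliseconds poll_interval) {
  const auto deadline = std::chrono::steady_clock::now() + timeout;
  while (last_synced_hms_event_id.load() < target_event_id) {
    // A processor in an error state will never catch up, so stop waiting.
    if (processor_in_error()) return SyncResult::kProcessorError;
    if (std::chrono::steady_clock::now() >= deadline) {
      return SyncResult::kTimedOut;
    }
    std::this_thread::sleep_for(poll_interval);
  }
  return SyncResult::kSynced;
}
```

The caller would become active normally on kSynced and fall back to a
global invalidate on the other two results. The timeout also bounds the
failover time when the passive catalogd is far behind HMS.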



--
To view, visit http://gerrit.cloudera.org:8080/23174
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Icf4fcb0e27c14197f79625749949b47c033a5f31
Gerrit-Change-Number: 23174
Gerrit-PatchSet: 2
Gerrit-Owner: Quanlong Huang <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Quanlong Huang <[email protected]>
Gerrit-Reviewer: Riza Suminto <[email protected]>
Gerrit-Reviewer: Wenzhe Zhou <[email protected]>
Gerrit-Comment-Date: Tue, 15 Jul 2025 12:12:39 +0000
Gerrit-HasComments: Yes
