github-actions[bot] commented on code in PR #64813:
URL: https://github.com/apache/doris/pull/64813#discussion_r3474109727
##########
fe/fe-core/src/main/java/org/apache/doris/cloud/CloudWarmUpJob.java:
##########
@@ -990,13 +999,17 @@ private void runEventDrivenJob() throws Exception {
hasTableFilter() ? getCurrentTableIdNames().size() :
"all");
TWarmUpTabletsResponse response =
entry.getValue().warmUpTablets(request);
if (response.getStatus().getStatusCode() != TStatusCode.OK) {
+ hasError = true;
if (!response.getStatus().getErrorMsgs().isEmpty()) {
errMsg = response.getStatus().getErrorMsgs().get(0);
}
LOG.warn("send warm up request failed. job_id={}, event={}, err={}",
jobId, syncEvent, errMsg);
}
}
+ if (!hasError && resetErrMsg()) {
+ Env.getCurrentEnv().getEditLog().logModifyCloudWarmUpJob(this);
Review Comment:
This reset also fires when the refreshed source BE set is empty.
`refreshEventDrivenBeToThriftAddress()` replaces `beToThriftAddress` with
whatever `getBackendsByClusterName(srcClusterName)` returns. If that list is
empty, `initClients()` leaves `beToClient` empty, so the loop sends zero
`SET_JOB` RPCs and `hasError` stays false. This branch then clears the old
error and journals the change, which makes `SHOW WARM UP JOB` look recovered
even though the event subscription was not installed on any BE. Please clear
the error only after at least one `SET_JOB` succeeds, or treat an empty
refreshed source-BE set as a failure and keep the previous error. A unit test
in which `getBackendsByClusterName(srcClusterName)` returns an empty list
would cover this.
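
A minimal, standalone sketch of the guard suggested above, not the actual
`CloudWarmUpJob` code. The class `WarmUpResetSketch`, the method
`shouldClearError`, and the `sendSetJob` predicate are hypothetical stand-ins
for the real client loop. It only illustrates that an empty source-BE list, or
a round with zero successful `SET_JOB` calls, should not clear the stored
error.

```java
import java.util.List;
import java.util.function.Predicate;

// Hypothetical illustration only; names do not exist in Doris.
class WarmUpResetSketch {
    /**
     * Decides whether the previously recorded error may be cleared.
     *
     * @param sourceBackends refreshed source BEs (may be empty)
     * @param sendSetJob     stand-in for the SET_JOB RPC; returns true on OK
     */
    static boolean shouldClearError(List<String> sourceBackends,
                                    Predicate<String> sendSetJob) {
        // An empty refreshed set means no subscription was installed,
        // so it is treated as a failure and the old error is kept.
        if (sourceBackends.isEmpty()) {
            return false;
        }
        boolean hasError = false;
        int succeeded = 0;
        for (String be : sourceBackends) {
            if (sendSetJob.test(be)) {
                succeeded++;
            } else {
                hasError = true;
            }
        }
        // Clear only when every RPC succeeded and at least one was sent.
        return !hasError && succeeded > 0;
    }
}
```

With this shape, the condition in the diff would become something like
`shouldClearError(...) && resetErrMsg()`, so the edit log entry is written only
after a real success.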
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]