Apache9 commented on PR #6129:
URL: https://github.com/apache/hbase/pull/6129#issuecomment-2260471977
> @Apache9 - the problem with master start code is this: suppose we are running
> 5 masters, and the master with the highest start code creates the procedure
> and then aborts. Since all other masters have a start code smaller than that
> of the last active master (which has now aborted), this check
>
> ```
> if (masterStartCodeFromProc > server.getStartcode()) {
>   // procedure is initiated by new active master but report received on different
>   throw new MasterNotRunningException("Another master is active");
> }
> ```
>
> will always pass and we will always throw the exception. So instead of using
> the master start code, we should be using the master active time for this
> check. The misunderstanding, I believe, was that the start code changes when a
> master becomes active, which is false. Only the master active time is updated
> when a master becomes active.

OK, so the problem here is that a backup master may have a lower start code
than the current active master, and once the backup master becomes the active
master, it cannot accept the report for a procedure that was scheduled by the
previous active master.

Makes sense. We should use the timestamp at which the master became the active
master.
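
To make the difference concrete, here is a minimal sketch of the two checks side by side. It is not the actual patch: the `MasterFencingSketch` class, the `Master` holder, and both method names are hypothetical and only model the comparison discussed above, assuming the start code is fixed at process start while the active time is set on promotion.

```java
// Hypothetical model of the fencing check; none of these types are HBase classes.
final class MasterFencingSketch {

    /** Minimal stand-in for a master: start code never changes, active time is set on promotion. */
    static final class Master {
        final long startCode;   // timestamp of process start
        final long activeTime;  // timestamp at which this master became active

        Master(long startCode, long activeTime) {
            this.startCode = startCode;
            this.activeTime = activeTime;
        }
    }

    /** Original check: rejects when the procedure carries a larger start code than the receiver's. */
    static boolean rejectsByStartCode(long startCodeFromProc, Master receiver) {
        return startCodeFromProc > receiver.startCode;
    }

    /** Proposed check: rejects only when the procedure was scheduled by a more recently activated master. */
    static boolean rejectsByActiveTime(long activeTimeFromProc, Master receiver) {
        return activeTimeFromProc > receiver.activeTime;
    }

    public static void main(String[] args) {
        // Previous active master: started last (highest start code), became active first, then aborted.
        Master previous = new Master(300L, 1_000L);
        // Backup master: started earlier (lower start code), promoted after the abort.
        Master promoted = new Master(100L, 2_000L);

        // Start-code check wrongly rejects the report for the old master's procedure.
        System.out.println(rejectsByStartCode(previous.startCode, promoted));
        // Active-time check accepts it, because the promotion happened later.
        System.out.println(rejectsByActiveTime(previous.activeTime, promoted));
    }
}
```

With these made-up timestamps, the start-code comparison rejects the report on the newly promoted master, while the active-time comparison accepts it and still rejects a report whose procedure was scheduled by a master that became active later than the receiver.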