Tanya-W opened a new issue, #17824:
URL: https://github.com/apache/doris/issues/17824

   ### Search before asking
   
   - [X] I had searched in the [issues](https://github.com/apache/doris/issues?q=is%3Aissue) and found no similar issues.
   
   
   ### Version
   
   master
   
   ### What's Wrong?
   
   - Phenomenon:
   When schema change jobs on the same table run concurrently, the FE can fail to restart because edit log replay fails.
   
   - Analysis:
   When the current schema change job executes `runRunningJob`, it first sets the table's state back to NORMAL and releases the table write lock, and only then marks the job as finished and writes the edit log. If the edit log write is slow, there is a window in which the table is already NORMAL and unlocked, so other schema change jobs for the same table can be created successfully. If one of those jobs writes its edit log before the current job writes its finished-state edit log, replaying the schema change jobs fails when the FE restarts.
   
   ```
       protected void runRunningJob() throws AlterCancelException {
           ......
           tbl.writeLockOrAlterCancelException();
           ......
   
           try {
              ......
               onFinished(tbl); // changes the table's state to NORMAL
           } finally {
               tbl.writeUnlock(); // unlock
           }
   
           pruneMeta();
           this.jobState = JobState.FINISHED;
           this.finishedTimeMs = System.currentTimeMillis();
   
           Env.getCurrentEnv().getEditLog().logAlterJob(this); // write slowly
           LOG.info("schema change job finished: {}", jobId);
       }
   ```
   
   - For example (edit log write order on the left, replay on restart on the right):
   ```
   [job 1] pending state edit log (table_state: SCHEMA_CHANGE)   =>  [job 1] replay pending state edit log
   [job 1] wait_txn state edit log (table_state: SCHEMA_CHANGE)  =>  [job 1] replay wait_txn state edit log
   [job 2] pending state edit log (table_state: SCHEMA_CHANGE)   =>  [job 2] replay pending state edit log (check table_state failed, table not normal)
   [job 1] finished state edit log (table_state: NORMAL)         =>  [job 1] replay finished state edit log
   ```
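
   The replay failure in the sequence above can be reproduced with a small stand-alone model. This is only a sketch under assumptions: `ReplayOrderSketch`, `Entry`, and `firstFailingEntry` are hypothetical names, not Doris classes, and the only rule modelled is the one described in this issue (a pending entry requires the table to be NORMAL, and a finished entry puts it back to NORMAL).

   ```java
   import java.util.List;

   // Hypothetical model of edit log replay; not Doris code.
   public class ReplayOrderSketch {
       enum TableState { NORMAL, SCHEMA_CHANGE }
       enum JobState { PENDING, WAIT_TXN, FINISHED }

       // One edit log entry: which job wrote it and which job state it records.
       static final class Entry {
           final int jobId;
           final JobState jobState;

           Entry(int jobId, JobState jobState) {
               this.jobId = jobId;
               this.jobState = jobState;
           }
       }

       // Replays entries in log order. Returns the index of the first entry
       // that fails the table-state check, or -1 if the whole log replays.
       static int firstFailingEntry(List<Entry> log) {
           TableState tableState = TableState.NORMAL;
           for (int i = 0; i < log.size(); i++) {
               switch (log.get(i).jobState) {
                   case PENDING:
                       // Assumed rule: a new job needs a NORMAL table.
                       if (tableState != TableState.NORMAL) {
                           return i;
                       }
                       tableState = TableState.SCHEMA_CHANGE;
                       break;
                   case FINISHED:
                       tableState = TableState.NORMAL;
                       break;
                   default:
                       // WAIT_TXN does not touch the table state in this model.
                       break;
               }
           }
           return -1;
       }

       public static void main(String[] args) {
           // Log order from the example: job 2's pending entry lands before
           // job 1's finished entry.
           List<Entry> interleaved = List.of(
                   new Entry(1, JobState.PENDING),
                   new Entry(1, JobState.WAIT_TXN),
                   new Entry(2, JobState.PENDING),
                   new Entry(1, JobState.FINISHED));
           System.out.println(firstFailingEntry(interleaved));
       }
   }
   ```

   In this model the third entry (index 2, job 2's pending entry) is the one that fails, which matches the "table not normal" line in the example.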
   
   ### What You Expected?
   
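   The table's state should change back to NORMAL only after the schema change job has finished, so that no other schema change job on the same table can be created before the current job's finished-state edit log is written.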
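
   One possible ordering that satisfies this is sketched below. It is an illustration under assumptions, not the actual patch: `FinishOrderSketch`, `Table`, and `finishJob` are hypothetical stand-ins, and a plain list plays the role of the edit log. The finished-state entry is written while the table is still in SCHEMA_CHANGE and the write lock is still held, and only then is the table set back to NORMAL.

   ```java
   import java.util.ArrayList;
   import java.util.List;
   import java.util.concurrent.locks.ReentrantReadWriteLock;

   // Hypothetical model of the finishing step; not Doris code.
   public class FinishOrderSketch {
       enum TableState { NORMAL, SCHEMA_CHANGE }

       static final class Table {
           final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
           TableState state = TableState.SCHEMA_CHANGE;
       }

       // Finishes a job so that the table never looks NORMAL before the
       // finished-state entry is in the (stand-in) edit log.
       static void finishJob(long jobId, Table tbl, List<String> editLog) {
           tbl.lock.writeLock().lock();
           try {
               // 1. Persist the finished state first, while the table is still
               //    SCHEMA_CHANGE and no other job can be created on it.
               editLog.add("job " + jobId + " FINISHED (table_state: " + tbl.state + ")");
               // 2. Only now make the table available to other jobs.
               tbl.state = TableState.NORMAL;
           } finally {
               tbl.lock.writeLock().unlock();
           }
       }

       public static void main(String[] args) {
           Table tbl = new Table();
           List<String> editLog = new ArrayList<>();
           finishJob(1L, tbl, editLog);
           System.out.println(editLog.get(0));
           System.out.println(tbl.state);
       }
   }
   ```

   The trade-off in this sketch is that the write lock is held across a potentially slow log write, so other writers to the table wait longer. Whether that is acceptable, or whether the state change should instead be deferred without holding the lock, is a design choice for the actual fix.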
   
   ### How to Reproduce?
   
   _No response_
   
   ### Anything Else?
   
   _No response_
   
   ### Are you willing to submit PR?
   
   - [X] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

