crazycarry opened a new issue #4226:
URL: https://github.com/apache/incubator-dolphinscheduler/issues/4226
### After upgrading DS to 1.3.3, I found a bug like the one reported before, in the class MasterSchedulerService
```
public void run() {
    logger.info("master scheduler started");
    while (Stopper.isRunning()){
        InterProcessMutex mutex = null;
        try {
            boolean runCheckFlag = OSUtils.checkResource(masterConfig.getMasterMaxCpuloadAvg(),
                    masterConfig.getMasterReservedMemory());
            if(!runCheckFlag) {
                Thread.sleep(Constants.SLEEP_TIME_MILLIS);
                continue;
            }
            if (zkMasterClient.getZkClient().getState() == CuratorFrameworkState.STARTED) {
                mutex = zkMasterClient.blockAcquireMutex();
                int activeCount = masterExecService.getActiveCount();
                // make sure to scan and delete command table in one transaction
                Command command = processService.findOneCommand();
                if (command != null) {
                    logger.info("find one command: id: {}, type: {}",
                            command.getId(),command.getCommandType());
                    try{
                        ProcessInstance processInstance = processService.handleCommand(logger,
                                getLocalAddress(),
                                this.masterConfig.getMasterExecThreads() - activeCount, command);
                        if (processInstance != null) {
                            logger.info("start master exec thread , split DAG ...");
                            masterExecService.execute(new MasterExecThread(processInstance, processService, nettyRemotingClient));
                        }
                    }catch (Exception e){
                        logger.error("scan command error ", e);
                        processService.moveToErrorCommand(command, e.toString());
                    }
                } else{
                    //indicate that no command ,sleep for 1s
                    Thread.sleep(Constants.SLEEP_TIME_MILLIS);
                }
            }
        } catch (Exception e){
            logger.error("master scheduler thread error",e);
        } finally{
            zkMasterClient.releaseMutex(mutex);
        }
    }
}
```
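When the DB returns an error or some other exception is thrown, the loop has
nothing to slow it down or stop it: the outer catch block only logs the error,
so the next iteration starts immediately.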
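
One possible direction, as a minimal standalone sketch rather than the actual DolphinScheduler fix: sleep in the outer catch block before the next iteration. The names `BackoffLoopSketch`, `Step` and `runWithBackoff` are hypothetical and exist only for this illustration; in `run()` the sleep would presumably reuse `Constants.SLEEP_TIME_MILLIS`.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BooleanSupplier;

// Hypothetical sketch, not DolphinScheduler code.
class BackoffLoopSketch {

    /** Stand-in for the body of one loop iteration in run(). */
    @FunctionalInterface
    interface Step {
        void run() throws Exception;
    }

    /**
     * Runs step while running is true. After a failed iteration it sleeps
     * for sleepMillis before retrying, so a persistent error (for example a
     * DB outage) does not turn into a busy loop.
     * Returns the number of failed iterations.
     */
    static int runWithBackoff(BooleanSupplier running, Step step, long sleepMillis) {
        int failures = 0;
        while (running.getAsBoolean()) {
            try {
                step.run();
            } catch (InterruptedException ie) {
                // Restore the interrupt flag and leave the loop.
                Thread.currentThread().interrupt();
                break;
            } catch (Exception e) {
                failures++;
                System.err.println("master scheduler thread error: " + e);
                try {
                    // The back-off that the reported loop is missing.
                    Thread.sleep(sleepMillis);
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    break;
                }
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        AtomicInteger calls = new AtomicInteger();
        // Simulated outage: every iteration fails; stop after three calls.
        int failures = runWithBackoff(
                () -> calls.get() < 3,
                () -> {
                    calls.incrementAndGet();
                    throw new IllegalStateException("db down");
                },
                10L);
        System.out.println("failed iterations: " + failures);
    }
}
```

With the sleep in place, a persistent failure costs one log line per sleep interval instead of flooding the log and spinning the CPU. Whether the loop should also give up after repeated failures is a separate design choice that this sketch does not address.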