TongWenbin commented on PR #10541: URL: https://github.com/apache/dolphinscheduler/pull/10541#issuecomment-1843101025
> This happens when the retry task appears in another task.

Maybe something is wrong with the retry system. To protect historical logs from loss, the retry task must be sent to the same executor. I haven't found out whether something is wrong with our setup. Unfortunately, this bug happens every time a timeout causes a failure.

I can understand why you didn't change the start time, but the way DolphinScheduler identifies timeouts may cause unexpected failures. In big data systems, it is very common for tasks not to finish as expected because of network problems or a blocked queue, so we want the task to retry automatically. Under the current mechanism, however, our retry is useless: the retry does nothing, because DolphinScheduler considers the task to have timed out at the very beginning of the first retry.
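
To illustrate the timeout behavior described above, here is a minimal Python sketch. It assumes, as the comment suggests but does not confirm in code, that the timeout check measures elapsed time from the task's original start time and that this start time is not reset on retry. The function `is_timed_out`, the timestamps, and the 10-minute timeout are hypothetical and are not taken from DolphinScheduler.

```python
from datetime import datetime, timedelta


def is_timed_out(reference_start, now, timeout):
    """Return True when the time elapsed since reference_start exceeds timeout."""
    return now - reference_start > timeout


# Hypothetical values chosen only for illustration.
timeout = timedelta(minutes=10)
first_start = datetime(2023, 12, 6, 12, 0, 0)

# The first attempt ran past the timeout and failed; the retry starts a minute later.
retry_start = first_start + timedelta(minutes=11)

# Assumed current behavior: the retry is still measured against the original
# start time, so it is already over the limit at the moment it begins.
timed_out_with_original_start = is_timed_out(first_start, retry_start, timeout)

# Possible alternative: measure each attempt from its own start time,
# so the retry gets a full timeout window.
timed_out_with_retry_start = is_timed_out(retry_start, retry_start, timeout)

print(timed_out_with_original_start, timed_out_with_retry_start)
```

Under these assumptions the first check reports a timeout immediately, which matches the report that the retry never gets a chance to run, while the second check does not.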
