TongWenbin commented on PR #10541:
URL: 
https://github.com/apache/dolphinscheduler/pull/10541#issuecomment-1843101025

   > It will be happened when the retry task appear in another task
   
   Maybe there is something wrong with the retry system. To protect historical 
logs from being lost, a retry task must be sent to the same executor. I haven't 
yet found out whether there is something wrong with our setup.
   Unfortunately, this bug happens every time a timeout causes a failure.
   I can understand why you didn't change the start time, but the way 
DolphinScheduler identifies timeouts may cause unexpected failures. In big data 
systems, it's very common for some tasks not to finish as expected because of 
network problems or a blocked queue. So we want the task to retry 
automatically, but under the current mechanism our retry is useless: the retry 
doesn't do anything, because DolphinScheduler considers the task already timed 
out at the beginning of the first retry.
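
   A minimal sketch of what I think is happening, assuming the timeout is 
measured from the start time of the first attempt and that this start time is 
kept across retries. The class name, method name, and timestamps below are 
hypothetical and only illustrate the behavior; this is not DolphinScheduler's 
actual code.

```java
import java.time.Duration;
import java.time.Instant;

public class RetryTimeoutSketch {

    // Hypothetical check: a task counts as timed out when the time elapsed
    // since the given start time is longer than the configured timeout.
    static boolean isTimedOut(Instant startTime, Instant now, Duration timeout) {
        return Duration.between(startTime, now).compareTo(timeout) > 0;
    }

    public static void main(String[] args) {
        Duration timeout = Duration.ofMinutes(10);

        // Illustrative timestamps: the first attempt fails on timeout,
        // and the retry begins 11 minutes after the first attempt started.
        Instant firstStart = Instant.parse("2023-12-06T00:00:00Z");
        Instant retryStart = firstStart.plus(Duration.ofMinutes(11));

        // Start time kept from the first attempt: the retry is already
        // considered timed out the moment it begins.
        System.out.println(isTimedOut(firstStart, retryStart, timeout)); // true

        // Start time reset when the retry begins: the retry gets a full
        // timeout window.
        System.out.println(isTimedOut(retryStart, retryStart, timeout)); // false
    }
}
```

   If the check works roughly like the first call, every retry after a timeout 
failure fails immediately, which matches what we see.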


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
