geosmart edited a comment on issue #6880: URL: https://github.com/apache/dolphinscheduler/issues/6880#issuecomment-972478393
The worker hits the same restart problem. After a master or worker is stopped, it must wait a while before starting again, and the wait must be longer than `zookeeper.session.timeout`. In my prod env the config is `zookeeper.session.timeout=60000`, so the restart procedure is:

1. Stop the master or worker.
2. Sleep for 70 seconds.
3. Start the master or worker.

So ops can work around the bug. But a bug is still a bug: we need to check that the server does not exist under the `nodes` path and then add it to the `dead-servers` path, both in one atomic operation.
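To make the two ideas above concrete, here is a minimal sketch, not the DolphinScheduler implementation. It assumes an in-memory stand-in for the ZooKeeper registry; the names `restart_delay_seconds`, `Registry`, `register`, and `mark_dead_if_absent`, and the 10-second margin, are all hypothetical and chosen only to match the 60000 ms / 70 s numbers in this comment. A real fix would need the same check-and-add guarantee enforced on the ZooKeeper side, which this sketch does not show.

```python
import threading


def restart_delay_seconds(session_timeout_ms, margin_seconds=10):
    """Return how long to wait between stop and start.

    The wait must exceed the ZooKeeper session timeout so the old
    ephemeral node has expired before the server registers again.
    The margin is an assumption, not a project default.
    """
    return session_timeout_ms // 1000 + margin_seconds


class Registry:
    """Hypothetical in-memory model of the `nodes` and `dead-servers` paths."""

    def __init__(self):
        self._lock = threading.Lock()
        self.nodes = set()
        self.dead_servers = set()

    def register(self, server):
        # A server that is starting up adds itself under `nodes`.
        with self._lock:
            self.nodes.add(server)

    def mark_dead_if_absent(self, server):
        """Add `server` to dead-servers only if it is not under `nodes`.

        The check and the add happen under one lock, so a concurrent
        register() cannot slip in between them.
        """
        with self._lock:
            if server in self.nodes:
                return False
            self.dead_servers.add(server)
            return True
```

With `zookeeper.session.timeout=60000`, `restart_delay_seconds(60000)` gives the 70-second sleep used in the workaround. In the sketch, a server that has already re-registered under `nodes` is never written to `dead-servers`, which is the behavior the proposed fix asks for.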
