waixiaoyu opened a new issue #6577: Tasks got stuck in "Pending" states forever caused by empty callback
URL: https://github.com/apache/incubator-druid/issues/6577

One day, while our Druid (0.11.0) cluster was restarting, I found some tasks stuck in the pending task list. Several days later the cluster was working well, but these tasks were still stuck there. Every module was normal except for these tasks. In fact, according to the Overlord log, these tasks had finished in the SUCCESS state, and their data had been ingested into Druid and appeared in the coordinator console.

After analyzing the source code, I found what looks like a small bug. Normally, the Overlord restarts in the following steps:

1. It waits for leader selection.
2. It waits several seconds before starting manage() in the TaskQueue thread.
3. It restores tasks and attaches the normal callback function.
4. It adds the tasks to the pending task list.

The MiddleManager can restore all tasks that were running before the restart and register them in Zookeeper. The Overlord then receives CHILD_ADDED events for the running tasks and updates its running task list. But if the Overlord receives a Zookeeper task-running event before manage() has started, it registers an empty callback, and the pending task list can never be cleared.

The bug can be reproduced by setting a breakpoint in TaskQueue and waiting for the Zookeeper event.

The bug can be recognized by two symptoms:

1. Tasks are stuck in the pending task list forever.
2. The normal callbacks registered in Map<String, ListenableFuture<TaskStatus>> taskFutures can never be cleaned up. (TaskQueue:260)

Even though it is only a display bug, it is quite confusing for users. It can be fixed by registering a valid callback, instead of an empty one, when a new RemoteTaskRunnerWorkItem is created on CHILD_ADDED and CHILD_UPDATE. (RemoteTaskRunner:978)
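
The race described above can be modeled in a few lines. This is a minimal sketch of one reading of the issue, not Druid's actual code: SketchRunner, SketchQueue, PendingTaskSketch and all of their methods are hypothetical names, task statuses are plain strings, and CompletableFuture stands in for ListenableFuture<TaskStatus>. The shareFuture flag switches between the behavior the issue reports (the work item created by the early event is not linked to the queue's callback) and the proposed behavior (both sides share one result future).

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for the task runner; not Druid's RemoteTaskRunner.
final class SketchRunner {
    private final boolean shareFuture;
    private final Map<String, CompletableFuture<String>> workItems = new ConcurrentHashMap<>();

    SketchRunner(boolean shareFuture) {
        this.shareFuture = shareFuture;
    }

    // Models the CHILD_ADDED event: a worker announces a task that is already running.
    void onTaskAnnounced(String taskId) {
        workItems.putIfAbsent(taskId, new CompletableFuture<>());
    }

    // Models the queue asking the runner for the task's result future.
    CompletableFuture<String> run(String taskId) {
        if (shareFuture) {
            // Proposed behavior: reuse the future of any existing work item.
            return workItems.computeIfAbsent(taskId, id -> new CompletableFuture<>());
        }
        // Reported behavior (assumed): a work item created by an earlier event
        // is not linked to the future handed back to the caller.
        CompletableFuture<String> fresh = new CompletableFuture<>();
        workItems.putIfAbsent(taskId, fresh);
        return fresh;
    }

    // Models the worker reporting a final status for the task.
    void onTaskFinished(String taskId, String status) {
        CompletableFuture<String> result = workItems.get(taskId);
        if (result != null) {
            result.complete(status);
        }
    }
}

// Hypothetical stand-in for the task queue; not Druid's TaskQueue.
final class SketchQueue {
    final Set<String> pending = ConcurrentHashMap.newKeySet();

    // Models manage(): list the task as pending until its result future completes.
    void manage(SketchRunner runner, String taskId) {
        pending.add(taskId);
        runner.run(taskId).thenAccept(status -> pending.remove(taskId));
    }
}

final class PendingTaskSketch {
    // Replays the ordering from the issue: the event arrives before manage() runs.
    static boolean stuckAfterEarlyEvent(boolean shareFuture) {
        SketchRunner runner = new SketchRunner(shareFuture);
        SketchQueue queue = new SketchQueue();
        runner.onTaskAnnounced("task-1");           // Zookeeper event arrives first
        queue.manage(runner, "task-1");             // manage() starts afterwards
        runner.onTaskFinished("task-1", "SUCCESS"); // task completes successfully
        return queue.pending.contains("task-1");
    }

    public static void main(String[] args) {
        System.out.println("stuck with detached future: " + stuckAfterEarlyEvent(false));
        System.out.println("stuck with shared future: " + stuckAfterEarlyEvent(true));
    }
}
```

In this model the task stays in the pending set when the future is detached and is removed when the future is shared, which matches the symptom and the suggested fix at a high level. Whether the real code path fails in exactly this way would need to be confirmed against the Druid source.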
