Good job, I like this feature very much. Please contact your mentor to get further advice. Thanks.
Best Regards

---------------
Apache DolphinScheduler PMC Chair
David Dai
[email protected]
Linkedin: https://www.linkedin.com/in/dailidong
Twitter: @WorkflowEasy <https://twitter.com/WorkflowEasy>
---------------

On Wed, Aug 4, 2021 at 5:26 PM Yin Rui <[email protected]> wrote:

> *1. The necessity of the task group queue*
> A task group queue (TGQ) provides cross-project and cross-process
> concurrency control of tasks, reducing resource pressure on the scheduling
> system or on other big data clusters.
> TGQ also supports priority-based control, which ensures that important
> tasks are executed first. Users can also force a task to run,
> bypassing the TGQ.
>
> *2. The details of TGQ*
> TGQ is essentially a flow limiter. Tasks must obtain resources from the
> TGQ before they run. This limits the total resources held by concurrent
> tasks and keeps pressure off the worker nodes.
> A database optimistic lock is used to solve the thread-safety problem
> in the distributed concurrent scenario.
> Note that some tasks are not bound by a TGQ:
> 1. Tasks that do not need to be executed by workers;
> 2. Tasks that do not belong to any TGQ;
> 3. Tasks that are forcibly started by the user.
>
> *2.1 Init a TGQ*
> The user creates a TGQ manually and specifies its size.
>
> *2.2 How a TGQ works*
> Each task configured with a TGQ applies for a resource from the TGQ before
> being dispatched to the worker. If the TGQ has no available resources, the
> task is not delivered to the worker; it waits for a resource to be
> released and then resends its request to the TGQ.
>
> *2.3 Recycle resources*
> After receiving the response from the worker, the TGQ releases the
> resources held by the task.
>
> *2.4 Fault tolerance*
> In a distributed architecture, the fault tolerance mechanism is
> important.
> When a worker node goes offline, the tasks running on it that have the
> fault tolerance mechanism enabled are re-executed by the master. To
> prevent the same task from applying for resources twice, the TGQ should
> check, whenever a task applies for resources, whether the task is already
> in the TGQ. If so, the task is resent to the worker without a new
> allocation; if not, resources are allocated.
>
> yinrui_ustb
> [email protected]
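
For reference, the apply / wait / release flow from sections 2.2 to 2.4 of the quoted proposal could be sketched roughly as below. This is only an in-memory Python illustration, not DolphinScheduler code: the class `TaskGroupQueue`, its methods, and the rule that a larger `priority` number goes first are all assumptions made for the example, and the database optimistic lock is left out.

```python
import heapq
import itertools


class TaskGroupQueue:
    """Toy in-memory flow limiter. All names here are hypothetical."""

    def __init__(self, size):
        self.size = size              # capacity chosen by the user (2.1)
        self.holders = set()          # task ids that currently hold a slot
        self.waiting = []             # heap of (-priority, seq, task_id)
        self.queued = set()           # task ids with a live entry in the heap
        self._seq = itertools.count() # tie-breaker: first come, first served

    def acquire(self, task_id, priority=0, force=False):
        """Return True if the task may be dispatched to a worker now."""
        if force:
            return True               # forced start bypasses the queue
        if task_id in self.holders:
            return True               # re-apply after failover: reuse the slot (2.4)
        if len(self.holders) < self.size:
            self.holders.add(task_id)
            self.queued.discard(task_id)
            return True
        if task_id not in self.queued:
            # Assumption: a larger priority number means "more important".
            self.queued.add(task_id)
            heapq.heappush(self.waiting, (-priority, next(self._seq), task_id))
        return False                  # no free slot: wait for a release (2.2)

    def release(self, task_id):
        """Free the slot after the worker responds (2.3).

        Returns the id of the waiting task that should re-apply, or None.
        """
        if task_id not in self.holders:
            return None               # e.g. a forced task never held a slot
        self.holders.discard(task_id)
        while self.waiting:
            _, _, next_id = heapq.heappop(self.waiting)
            if next_id in self.queued:    # skip stale heap entries
                self.queued.discard(next_id)
                return next_id
        return None
```

With `size=1`, a second and third task calling `acquire` are queued, a repeated `acquire` by the task that already holds the slot succeeds without taking another slot, and `release` hands back the waiting task with the highest priority so it can re-apply.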
