1. The necessity of a task group queue

A task group queue (TGQ) provides cross-project and cross-process concurrency control for tasks, reducing resource pressure on the scheduling system and on other big data clusters. A TGQ also supports priority-based control, which ensures that important tasks are executed first. Users can also force a task to start, ignoring the TGQ.

2. The details of the TGQ

A TGQ is essentially a flow limiter. The TGQ manages a pool of resources, and tasks obtain resources from it. In this way, the total resources held by multiple tasks are limited, and excessive pressure on worker nodes is avoided.
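The flow-limiting idea above can be sketched as a small in-memory model. This is only an illustration under stated assumptions, not the proposed implementation: the class name TaskGroupQueue, the methods try_acquire and release, and the rule that a larger number means a higher priority are all hypothetical. As a simplification, the sketch hands a freed slot straight to the highest-priority waiting task instead of having that task resend its request.

```python
import heapq
import itertools


class TaskGroupQueue:
    """Hypothetical in-memory model of a task group queue (TGQ)."""

    def __init__(self, size):
        self.size = size               # user-specified number of slots
        self.holders = set()           # ids of tasks currently holding a slot
        self.waiting = []              # heap of (-priority, seq, task_id)
        self._seq = itertools.count()  # FIFO tie-breaker for equal priorities

    def try_acquire(self, task_id, priority=0, force=False):
        """Return True if the task may be sent to a worker now."""
        if force:
            # Assumption: a forcibly started task bypasses the TGQ
            # entirely and does not consume a slot.
            return True
        if len(self.holders) < self.size:
            self.holders.add(task_id)
            return True
        # No free slot: remember the task, highest priority first.
        heapq.heappush(self.waiting, (-priority, next(self._seq), task_id))
        return False

    def release(self, task_id):
        """Free the task's slot and return the id of the waiting task
        that receives it, or None if nobody is waiting."""
        self.holders.discard(task_id)
        if self.waiting and len(self.holders) < self.size:
            _, _, next_id = heapq.heappop(self.waiting)
            self.holders.add(next_id)
            return next_id
        return None


tgq = TaskGroupQueue(size=1)
print(tgq.try_acquire("a"))               # slot is free
print(tgq.try_acquire("b", priority=1))   # queue is full, "b" waits
print(tgq.try_acquire("c", priority=5))   # "c" waits with higher priority
print(tgq.try_acquire("d", force=True))   # forced start ignores the TGQ
print(tgq.release("a"))                   # freed slot goes to "c"
```

With a size of 1, task "a" takes the only slot, "b" and "c" wait, the forced task "d" starts regardless, and releasing "a" passes the slot to "c" because it has the higher priority.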
A database optimistic lock is used to solve the thread-safety problem in distributed concurrent scenarios.

Note that some tasks are not bound by a TGQ:
  1. Tasks that do not need to be executed by workers;
  2. Tasks that do not belong to any TGQ;
  3. Tasks that are forcibly started by the user.

2.1 Initializing a TGQ

The user creates a TGQ manually and specifies its size.

2.2 How a TGQ works

Each task configured with a TGQ applies for a resource from the TGQ before being dispatched to a worker. If the TGQ has no available resources, the task is not delivered to the worker. Instead, it waits for a resource to be released and then resends its request to the TGQ.

2.3 Recycling resources

After receiving the response from the worker, the TGQ releases the resources held by the corresponding task.

2.4 Fault tolerance

In a distributed architecture, a fault tolerance mechanism is essential. When a worker node goes offline, the tasks running on that node that have fault tolerance enabled are re-executed by the master. To prevent the same task from applying for resources more than once, whenever a task applies for resources, the TGQ first checks whether that task is already in the TGQ. If so, the task is resent to the worker without a new allocation; if not, resources are allocated as usual.

yinrui_ustb
