1. The necessity of the task group queue
A task group queue (TGQ) provides concurrency control for tasks across projects 
and across processes, reducing resource pressure on the scheduling system or 
other big data clusters.
A TGQ also supports priority-based control, which ensures that important tasks 
are executed first. Users can also force a task to run, bypassing the TGQ.
2. Details of the TGQ
A TGQ is essentially a flow limiter. The TGQ manages a limited set of 
resources, and tasks obtain their resources from the TGQ. In this way, the 
total resources held by multiple tasks are limited and excessive pressure on 
the worker nodes is avoided.

A database optimistic lock is used to ensure thread safety in distributed 
concurrent scenarios.
Note that some tasks are not bound by a TGQ:
1. Tasks that do not need to be executed by workers;
2. Tasks that do not belong to any TGQ;
3. Tasks that are forcibly started by the user.
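
To illustrate the optimistic lock mentioned above, here is a minimal sketch in 
Python using an in-memory SQLite database. The table name task_group, its 
columns (group_size, use_size, version) and the function try_acquire are 
assumptions made for this example only; they are not taken from the actual 
implementation. The idea is that the UPDATE succeeds only if the version read 
earlier is still current, so two masters cannot both take the last slot.

```python
import sqlite3


def try_acquire(conn, group_id):
    """Try to take one slot; return False if the group is full or the race was lost."""
    row = conn.execute(
        "SELECT use_size, group_size, version FROM task_group WHERE id = ?",
        (group_id,),
    ).fetchone()
    if row is None:
        return False
    use_size, group_size, version = row
    if use_size >= group_size:
        return False  # no free slot in this group
    # Optimistic lock: the UPDATE matches only if the row still carries the
    # version we read, i.e. nobody else modified it in the meantime.
    cur = conn.execute(
        "UPDATE task_group SET use_size = use_size + 1, version = version + 1 "
        "WHERE id = ? AND version = ?",
        (group_id, version),
    )
    conn.commit()
    return cur.rowcount == 1


# Hypothetical schema: one row per task group, with a size of 2.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE task_group (id INTEGER PRIMARY KEY, group_size INTEGER, "
    "use_size INTEGER, version INTEGER)"
)
conn.execute("INSERT INTO task_group VALUES (1, 2, 0, 0)")
conn.commit()

results = [try_acquire(conn, 1) for _ in range(3)]
```

A caller that gets False because of a lost race (rather than a full group) 
would simply read the row again and retry.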
2.1 Initializing a TGQ
The user manually creates a TGQ. The size of the TGQ is specified by the user.

2.2 How a TGQ works
Each task configured with a TGQ applies for a resource from the TGQ before 
being dispatched to the worker. If the TGQ has no available resources, the task 
is not delivered to the worker; instead, it waits for a resource to be released 
and then resends its request to the TGQ.
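
The flow above, together with priority and forced start, could look roughly 
like the following single-process Python sketch. The class TaskGroupQueue and 
its methods acquire and release are hypothetical names chosen for illustration. 
A real implementation would keep this state in the database rather than in 
memory, and the wake-up policy (highest priority first, then arrival order) is 
an assumption.

```python
import heapq
import itertools


class TaskGroupQueue:
    def __init__(self, size):
        self.size = size                # number of slots, set by the user
        self.running = set()            # ids of tasks that hold a slot
        self._waiting = []              # heap of (-priority, seq, task_id)
        self._seq = itertools.count()   # tie-breaker: arrival order

    def acquire(self, task_id, priority=0, force=False):
        """Return True if the task may be dispatched to a worker now."""
        if force:
            return True  # forced tasks bypass the TGQ and hold no slot
        if len(self.running) < self.size:
            self.running.add(task_id)
            return True
        # No free slot: remember the task so it can be woken on release.
        heapq.heappush(self._waiting, (-priority, next(self._seq), task_id))
        return False

    def release(self, task_id):
        """Free the task's slot; return the waiting task to wake, if any."""
        self.running.discard(task_id)
        if self._waiting:
            _, _, next_task = heapq.heappop(self._waiting)
            return next_task
        return None


tgq = TaskGroupQueue(size=1)
log = [
    tgq.acquire("a"),               # takes the only slot
    tgq.acquire("b", priority=1),   # queue is full, must wait
    tgq.acquire("c", priority=5),   # also waits, but with higher priority
]
woken = tgq.release("a")            # worker finished "a"
log.append(tgq.acquire(woken))      # the woken task resends its request
forced = tgq.acquire("x", force=True)
```

In this sketch the released slot goes to the higher-priority task "c" even 
though "b" asked first, and the forced task "x" runs without occupying a slot.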

2.3 Recycling resources
After receiving the response from the worker, the TGQ releases the resources 
corresponding to the task.

2.4 Fault tolerance
In a distributed architecture, a fault tolerance mechanism is important. 
When a worker node goes offline, the fault-tolerant tasks that were running on 
that node are re-executed by the master. To prevent the same task from being 
allocated resources twice, when a task applies for resources the TGQ should 
first check whether the task is already in the TGQ. If so, the task is resent 
to the worker without a new allocation. If not, resources are allocated.
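
The check described above amounts to making the resource request idempotent 
per task. A minimal Python sketch follows, assuming a hypothetical in-memory 
TaskGroupQueue class whose acquire method reports one of three outcomes. The 
names and return values are illustrative only.

```python
class TaskGroupQueue:
    def __init__(self, size):
        self.size = size
        self.holders = set()  # ids of tasks that currently hold a slot

    def acquire(self, task_id):
        """Return 'resend', 'allocated', or 'wait' for the given task."""
        if task_id in self.holders:
            # Failover case: the task already holds a slot, so it is resent
            # to a worker without consuming a second slot.
            return "resend"
        if len(self.holders) < self.size:
            self.holders.add(task_id)
            return "allocated"
        return "wait"


tgq = TaskGroupQueue(size=1)
first = tgq.acquire("t1")   # normal first request
retry = tgq.acquire("t1")   # master re-executes t1 after a worker failure
other = tgq.acquire("t2")   # a different task finds the queue full
```

Because the retry of "t1" is recognized, the queue still counts a single 
holder and "t2" is correctly kept waiting.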


yinrui_ustb