HxpSerein opened a new issue, #3960:
URL: https://github.com/apache/incubator-streampark/issues/3960

   ### Search before asking
   
   - [X] I had searched in the 
[feature](https://github.com/apache/incubator-streampark/issues?q=is%3Aissue+label%3A%22Feature%22)
 and found no similar feature requirement.
   
   
   ### Description
   
   Distribute each task to a designated server, using consistent hashing as 
the default algorithm.
   
   The previous design proposed separating task startup from monitoring: a 
task could be started on any server, while monitoring would be distributed 
across servers to balance load. At the current stage, however, task startup 
and monitoring depend on each other, and separating them would require 
additional design work. Fault recovery would also require reallocating task 
startup, adding further complexity. To minimize changes in the first stage, 
startup, shutdown, and monitoring of a task will all run on the same 
designated server.
   
   Todo List:
   
   - **Provide the `ConsistentHash` utility class**: Support adding and 
deleting servers, and hash tasks to the specified server based on the task ID.
   - **Implement a unified `TaskManager` interface**: Include functionality for 
obtaining a list of monitored tasks, starting/stopping tasks, and 
redistributing tasks.
   - **Create a new table `task_command` in the database for message 
communication**: Use the producer-consumer model, writing a record to the table 
when a task starts/stops. TaskManager will poll the table to obtain task 
records and use ConsistentHash to determine if the task belongs to it. If not, 
it takes no action; if it does, it performs the relevant operations.
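
The producer-consumer flow described in the last item could look roughly like 
the sketch below. It is only an illustration under stated assumptions: the 
`TaskCommand` fields, the in-memory `TaskCommandTable` (a stand-in for the 
proposed `task_command` database table), and the `CommandPoller` class are all 
hypothetical names, and the ownership check is passed in as a function so that 
any lookup, such as one backed by ConsistentHash, can be plugged in.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.LongFunction;

/** Hypothetical row of the proposed task_command table; field names are assumptions. */
final class TaskCommand {
    final long id;        // monotonically increasing record id
    final long taskId;    // task the command applies to
    final String action;  // "START" or "STOP"

    TaskCommand(long id, long taskId, String action) {
        this.id = id;
        this.taskId = taskId;
        this.action = action;
    }
}

/** In-memory stand-in for the task_command table (a real version would use the database). */
final class TaskCommandTable {
    private final List<TaskCommand> rows = new ArrayList<>();
    private long nextId = 1;

    /** Producer side: write one record when a task should start or stop. */
    synchronized void publish(long taskId, String action) {
        rows.add(new TaskCommand(nextId++, taskId, action));
    }

    /** Consumer side: return every record with an id greater than lastSeenId. */
    synchronized List<TaskCommand> fetchAfter(long lastSeenId) {
        List<TaskCommand> out = new ArrayList<>();
        for (TaskCommand c : rows) {
            if (c.id > lastSeenId) {
                out.add(c);
            }
        }
        return out;
    }
}

/** One poller per server: acts only on commands for tasks that hash to this server. */
final class CommandPoller {
    private final String serverId;
    private final TaskCommandTable table;
    private final LongFunction<String> ownerOf; // assumed lookup: task id -> owning server
    private long lastSeenId = 0;
    final List<String> executed = new ArrayList<>(); // records what this server acted on

    CommandPoller(String serverId, TaskCommandTable table, LongFunction<String> ownerOf) {
        this.serverId = serverId;
        this.table = table;
        this.ownerOf = ownerOf;
    }

    /** One polling round: perform owned commands, take no action on the rest. */
    void pollOnce() {
        for (TaskCommand c : table.fetchAfter(lastSeenId)) {
            lastSeenId = c.id;
            if (serverId.equals(ownerOf.apply(c.taskId))) {
                executed.add(c.action + ":" + c.taskId);
            }
        }
    }
}
```

In this sketch every server reads every record and simply skips the ones it 
does not own, which matches the "takes no action" behaviour described above. 
How processed records are cleaned up is left open here.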
   
   ### Function
   
   StreamPark (SP) relies on task distribution in the following three ways:
   
   - Use the producer-consumer model to start/stop tasks.
   - When there is no server addition or deletion, ConsistentHash ensures the 
stability of the hash result, with each task being assigned to only one 
specified server.
   - When a server is added or removed, the registry center triggers 
TaskManager to reallocate tasks.
   
   ### How to use
   StreamPark distributes tasks automatically; no user action or 
configuration is required.
   
   ### Module
   
   **ConsistentHash:**
   
   Add a utility class for the task distribution algorithm.
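
A minimal sketch of what such a utility class might look like is shown below. 
It is not the project's implementation: the method names (`addServer`, 
`removeServer`, `serverFor`), the use of 100 virtual nodes per server, and the 
MD5-based hash are all assumptions made for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical consistent-hash ring; names and parameters are illustrative only. */
class ConsistentHash {
    // Assumption: 100 virtual nodes per server to even out the distribution.
    private static final int VIRTUAL_NODES = 100;

    // Ring positions (hash values) mapped to the server that owns each position.
    private final TreeMap<Long, String> ring = new TreeMap<>();

    /** Places the server's virtual nodes on the ring. */
    void addServer(String serverId) {
        for (int i = 0; i < VIRTUAL_NODES; i++) {
            ring.put(hash(serverId + "#" + i), serverId);
        }
    }

    /** Removes the server's virtual nodes from the ring. */
    void removeServer(String serverId) {
        for (int i = 0; i < VIRTUAL_NODES; i++) {
            ring.remove(hash(serverId + "#" + i));
        }
    }

    /** Returns the server that owns the task, or null when no server is registered. */
    String serverFor(long taskId) {
        if (ring.isEmpty()) {
            return null;
        }
        // First virtual node at or after the task's position, wrapping around the ring.
        Map.Entry<Long, String> entry = ring.ceilingEntry(hash("task-" + taskId));
        return (entry != null ? entry : ring.firstEntry()).getValue();
    }

    /** Folds the first 8 bytes of an MD5 digest into a long ring position. */
    private static long hash(String key) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5")
                    .digest(key.getBytes(StandardCharsets.UTF_8));
            long h = 0;
            for (int i = 0; i < 8; i++) {
                h = (h << 8) | (digest[i] & 0xFF);
            }
            return h;
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is unavailable", e);
        }
    }
}
```

With a ring like this, a task keeps the same server as long as the server set 
is unchanged, and adding a server only moves tasks onto the new server rather 
than reshuffling tasks among the existing ones.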
   
   **TaskManager:**
   
   This module defines the interfaces involved in task distribution. The key 
interfaces are:
   
   1. **Get Monitored Task List**: Through this interface, the watcher obtains 
the list of tasks that need to be monitored.
   2. **Task Start/Stop**: This interface polls the database for task 
records and executes the corresponding operations.
   3. **Task Redistribution**: This interface handles task redistribution when 
server nodes are added or removed.
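
The three interfaces above might be grouped as in the following sketch. All 
method names and the `LocalTaskManager` class are hypothetical; the sketch 
assumes that task ownership is resolved by a function from task ID to server 
ID (for example, a ConsistentHash lookup) and keeps the monitored set in 
memory only.

```java
import java.util.Collection;
import java.util.Set;
import java.util.TreeSet;
import java.util.function.LongFunction;

/** Hypothetical shape of the unified interface; method names are assumptions. */
interface TaskManager {
    /** Tasks this server is currently responsible for monitoring. */
    Set<Long> getMonitoredTasks();

    /** Starts a task on this server and begins monitoring it. */
    void startTask(long taskId);

    /** Stops a task on this server and stops monitoring it. */
    void stopTask(long taskId);

    /** Recomputes ownership after a server is added or removed. */
    void redistribute(Collection<Long> allTaskIds);
}

/** In-memory illustration only; a real implementation would talk to the cluster. */
class LocalTaskManager implements TaskManager {
    private final String serverId;
    private final LongFunction<String> ownerOf; // assumed lookup: task id -> owning server
    private final Set<Long> monitored = new TreeSet<>();

    LocalTaskManager(String serverId, LongFunction<String> ownerOf) {
        this.serverId = serverId;
        this.ownerOf = ownerOf;
    }

    @Override
    public Set<Long> getMonitoredTasks() {
        return new TreeSet<>(monitored); // defensive copy for the watcher
    }

    @Override
    public void startTask(long taskId) {
        monitored.add(taskId);
    }

    @Override
    public void stopTask(long taskId) {
        monitored.remove(taskId);
    }

    @Override
    public void redistribute(Collection<Long> allTaskIds) {
        // Keep only the tasks that map to this server under the current server set.
        monitored.clear();
        for (long taskId : allTaskIds) {
            if (serverId.equals(ownerOf.apply(taskId))) {
                monitored.add(taskId);
            }
        }
    }
}
```

In this sketch the registry center would call `redistribute` on each server 
after a membership change, passing the IDs of all tasks that should be running.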
   
   
   ### Usage Scenario
   
   _No response_
   
   ### Related issues
   
   _No response_
   
   ### Are you willing to submit a PR?
   
   - [X] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of 
Conduct](https://www.apache.org/foundation/policies/conduct)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.