HxpSerein commented on issue #3905: URL: https://github.com/apache/incubator-streampark/issues/3905#issuecomment-2260409378
> In general, database has distributed consistency and high availability as well.
>
> As I understand, StreamPark relies heavily on the database. All user information and job information are stored in the database, so the StreamPark cannot work after database is crashed even if the registry center is using zookeeper.
>
> If we introduced the zookeeper, StreamPark will be unavailable whenever either Zookeeper or database crashes. This increases the maintenance cost for users and makes StreamPark more likely to be unavailable.

If the database itself provides distributed consistency and high availability, it is indeed simpler to use a database.

> As my example mentioned before, all operations of same key should be forwarded to the same server for database system. But I still don't understand why same job should be monitored in the same server?
>
> Or my question is: if job is monitored by one random server, does it works? If yes, there is no need to introduce complex consistent hashing.

First, I agree with @SbloodyS's viewpoint that only job monitoring needs to be distributed among servers.

Second, I believe that consistent hashing does not add much complexity to the architecture. When job monitoring needs to be migrated and reallocated, we can simply invoke an algorithm that returns the allocation plan. This algorithm could be consistent hashing, greedy, or random. Regardless of the algorithm, our framework only needs to call it through a common interface.

Finally, I believe that the primary goal should be to implement the overall distributed framework. The registry center and the allocation algorithm are both options that can be swapped within the framework. Considering the workload and complexity, it is feasible to first implement a registry center using a database. Once the framework is in place, implementing the allocation algorithms will be relatively straightforward, and both greedy and consistent hashing algorithms can be considered.
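To make the "common interface" point concrete, here is a minimal sketch. All names (`JobAllocator`, `ConsistentHashAllocator`, `GreedyAllocator`, the `allocate` signature) are hypothetical and not taken from StreamPark's code; the sketch also assumes jobs are identified by a `Long` id and servers by a `String` id. It only shows that the framework can ask for an allocation plan without knowing which algorithm produced it.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical interface: the framework would depend only on this.
interface JobAllocator {
    /** Returns a plan mapping each server id to the job ids it should monitor. */
    Map<String, List<Long>> allocate(List<Long> jobIds, List<String> servers);
}

/** Consistent hashing: servers are placed on a sorted ring via virtual nodes. */
class ConsistentHashAllocator implements JobAllocator {
    private final int virtualNodes;

    ConsistentHashAllocator(int virtualNodes) {
        this.virtualNodes = virtualNodes;
    }

    @Override
    public Map<String, List<Long>> allocate(List<Long> jobIds, List<String> servers) {
        Map<String, List<Long>> plan = new LinkedHashMap<>();
        if (servers.isEmpty()) {
            return plan;
        }
        TreeMap<Long, String> ring = new TreeMap<>();
        for (String server : servers) {
            plan.put(server, new ArrayList<>());
            for (int i = 0; i < virtualNodes; i++) {
                ring.put(hash(server + "#" + i), server);
            }
        }
        for (Long jobId : jobIds) {
            // Owner is the first ring position at or after the job's hash,
            // wrapping around to the start of the ring if there is none.
            Map.Entry<Long, String> owner = ring.ceilingEntry(hash("job-" + jobId));
            if (owner == null) {
                owner = ring.firstEntry();
            }
            plan.get(owner.getValue()).add(jobId);
        }
        return plan;
    }

    // First 8 bytes of an MD5 digest, used only to spread keys over the ring.
    private static long hash(String key) {
        try {
            byte[] d = MessageDigest.getInstance("MD5")
                    .digest(key.getBytes(StandardCharsets.UTF_8));
            long h = 0;
            for (int i = 0; i < 8; i++) {
                h = (h << 8) | (d[i] & 0xFF);
            }
            return h;
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }
}

/** Greedy: each job goes to the server that currently has the fewest jobs. */
class GreedyAllocator implements JobAllocator {
    @Override
    public Map<String, List<Long>> allocate(List<Long> jobIds, List<String> servers) {
        Map<String, List<Long>> plan = new LinkedHashMap<>();
        for (String server : servers) {
            plan.put(server, new ArrayList<>());
        }
        if (plan.isEmpty()) {
            return plan;
        }
        for (Long jobId : jobIds) {
            List<Long> least = null;
            for (List<Long> jobs : plan.values()) {
                if (least == null || jobs.size() < least.size()) {
                    least = jobs;
                }
            }
            least.add(jobId);
        }
        return plan;
    }
}
```

With something like this, the framework would hold a `JobAllocator` reference and call `allocate(jobIds, servers)` whenever a server joins or leaves. The difference between the algorithms shows up in the plan: with the consistent hashing variant, jobs already assigned to surviving servers stay where they are when one server is removed, while the greedy variant balances load evenly but may move more jobs.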
Please correct me if any understanding is wrong, thanks~

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at: [email protected]
