jerqi commented on PR #1652: URL: https://github.com/apache/incubator-uniffle/pull/1652#issuecomment-2082017411
> > If one server becomes a faulty server, all tasks will change the assignment, won't they? Why do we need to record every task for a new assignment?
>
> I want to clarify that receivingFailureServer should be scoped to partition block data rather than to tasks. Sometimes a server reaches its high watermark because of too many requests, and that affects the partition data being sent to it at that time. That partition data should then be reassigned to another server. If this does not happen for other partitions, their assignment will not change.
>
> > Why do we need to record every task for a new assignment?
>
> I don't catch your thought about task -> assignment.

I got your point. You record just one reassignment, but you re-balance the partitions according to hash or range. It's OK that we store one assignment.

But we should consider two class names.

```
receivingFailureServer
```

Could we return a high load error code to the server when the shuffle server's load is too high? Is it a failure when we just return a high load error code?

```
TaskAssignment
```

Maybe we couldn't change this class name. Should we have a strategy class to handle the difference between faulty servers and high load servers?
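
To make the strategy-class question concrete, here is a rough sketch of the shape it could take. Every name in it (`ReassignSketch`, `ServerIssue`, `ReassignStrategy`, and both strategy classes) is hypothetical and not taken from the Uniffle code base, and partitions are reduced to plain integer ids. It assumes a faulty server loses all of its partitions, while a high load server loses only the partitions whose sends failed.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch only: none of these types exist in Uniffle.
final class ReassignSketch {

    /** Assumed reasons a shuffle server may reject or fail a write. */
    enum ServerIssue { FAULTY, HIGH_LOAD }

    /** Decides which partitions should be moved off a problematic server. */
    interface ReassignStrategy {
        List<Integer> partitionsToReassign(List<Integer> assigned, List<Integer> failed);
    }

    /** Faulty server: every partition assigned to it is moved. */
    static final class FaultyServerStrategy implements ReassignStrategy {
        @Override
        public List<Integer> partitionsToReassign(List<Integer> assigned, List<Integer> failed) {
            return new ArrayList<>(assigned);
        }
    }

    /** High load server: only partitions whose sends failed are moved. */
    static final class HighLoadStrategy implements ReassignStrategy {
        @Override
        public List<Integer> partitionsToReassign(List<Integer> assigned, List<Integer> failed) {
            List<Integer> result = new ArrayList<>(failed);
            // Ignore failures reported for partitions this server does not own.
            result.retainAll(assigned);
            return result;
        }
    }

    /** Picks the strategy that matches the reported issue. */
    static ReassignStrategy strategyFor(ServerIssue issue) {
        return issue == ServerIssue.FAULTY ? new FaultyServerStrategy() : new HighLoadStrategy();
    }

    public static void main(String[] args) {
        List<Integer> assigned = List.of(0, 1, 2, 3);
        List<Integer> failed = List.of(2);
        // prints [0, 1, 2, 3]
        System.out.println(strategyFor(ServerIssue.FAULTY).partitionsToReassign(assigned, failed));
        // prints [2]
        System.out.println(strategyFor(ServerIssue.HIGH_LOAD).partitionsToReassign(assigned, failed));
    }
}
```

With something like this, the high load case would not have to be recorded as a failure at all; it would only select a narrower reassignment.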
