xianjingfeng commented on PR #2022:
URL: https://github.com/apache/incubator-uniffle/pull/2022#issuecomment-2276997871
> > In general, we only need to check the information of a particular application, not all of them.
>
> Could you elaborate a bit more on this? Based on your umbrella issue(#1941), I think we still need all the application infos to detect abnormal apps.

Scenario 1: When we find that the cluster load is high, we need to know whether the shuffle data volume of some applications is too large. In this scenario, we need brief information about all applications, such as the total shuffle data volume of each.

Scenario 2: When we find that an application is very slow, we need to quickly determine whether the application has data skew. In this scenario, we need the detailed information of that one application, including the amount of data in each partition.

In general, we only query this data when something unusual happens, so there is no need to report it to the coordinator regularly.
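
To make the two scenarios concrete, here is a rough sketch of the kind of on-demand check I have in mind. It is not Uniffle code: the function names, the input dictionaries, and the "5x the median" skew threshold are all assumptions chosen only for illustration.

```python
from statistics import median


def top_apps_by_shuffle_bytes(app_total_bytes, n=3):
    """Scenario 1: rank applications by total shuffle data volume.

    app_total_bytes is a hypothetical mapping of app id -> total shuffle bytes.
    """
    ranked = sorted(app_total_bytes.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:n]


def find_skewed_partitions(partition_bytes, ratio_threshold=5.0):
    """Scenario 2: flag partitions that are much larger than the median.

    partition_bytes is a hypothetical mapping of partition id -> bytes
    for a single application. The threshold is an arbitrary assumption.
    """
    if not partition_bytes:
        return []
    med = median(partition_bytes.values())
    # Use at least 1 byte as the baseline so an all-empty median does not
    # make the comparison meaningless.
    baseline = max(med, 1)
    return sorted(
        pid
        for pid, size in partition_bytes.items()
        if size > ratio_threshold * baseline
    )


if __name__ == "__main__":
    print(top_apps_by_shuffle_bytes({"app-a": 10, "app-b": 300, "app-c": 50}, n=2))
    print(find_skewed_partitions({0: 100, 1: 120, 2: 110, 3: 5000}))
```

The point is that Scenario 1 only needs one number per application, while Scenario 2 needs per-partition detail for a single application, so both can be fetched when someone asks rather than pushed periodically.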
