advancedxy commented on PR #2022:
URL: https://github.com/apache/incubator-uniffle/pull/2022#issuecomment-2277018296
I see, so these are essentially throwaway messages for diagnostic purposes. I'm neutral on this change then. However, I still think it would be better to report application info back, since that would cover all the use cases you have described. The additional overhead might be negligible compared to the other info.

> Scenario 1: When we find that the cluster load is high, we need to know whether the shuffle data volume of some applications is too large. In this scenario, we need brief information about all applications, such as the total shuffle data volume.
>
> Scenario 2: When we find that an application is very slow, we need to quickly determine whether the application has data skew. In this scenario, we need detailed information about this application, including the amount of data in each partition.
>
> In general, we only query this data when something unusual happens, and there is no need to report the data to the coordinator regularly.
