gaborgsomogyi commented on pull request #30151: URL: https://github.com/apache/spark/pull/30151#issuecomment-717816270
Let me create a stream-stream join app to test, and then we can discuss the details of what/how/where to aggregate. Some preliminary opinions:

> see the overall memory usage end users have to accumulate these values by theirselves

I agree, it would be good to show a summary, but independent graphs are also needed to see which one is problematic.

> Having graphs per state store may be helpful on stream-stream join when there's a skew between left side and right side (either volume of the inputs or difference on evict condition), but probably can be hidden by default and shown on demand of "details". (separate page?)

Yeah, having 3-4 operators would make the UI a horror. I'll start to experiment w/ a separate page per operator approach.

> Btw I guess loadedMapCacheHitCount graph can be dropped unless on demand, as if things are working without crash or Spark's bug it will always increment properly.

`loadedMapCacheHitCount` is coming from the custom metrics, which have been taken over as-is: https://github.com/apache/spark/pull/30151/files#diff-e2de3487a935d91466e94189dc6d74dfe545a80a2a24a6da73cffbc55e32f6eaR261

If we want to show such values selectively, maybe we can create a blacklist config for it (of course that would be a separate jira). Just a rapid idea: `spark.sql.streaming.ui.disabledCustomMetrics=foo,bar`. WDYT?
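
To make the rapid idea a bit more concrete, here is a rough, language-agnostic sketch (written in Python only for brevity) of how a comma-separated deny list could be applied to custom metric names before graphs are drawn. Everything here is an assumption for discussion: the config key is just the proposal above and does not exist in Spark, and `parse_disabled_metrics` / `visible_custom_metrics` are hypothetical helpers, not code from this PR.

```python
from typing import Dict, Set

# Hypothetical config key from the proposal above; it is NOT an existing Spark config.
DISABLED_METRICS_KEY = "spark.sql.streaming.ui.disabledCustomMetrics"


def parse_disabled_metrics(conf: Dict[str, str]) -> Set[str]:
    """Read the comma-separated deny list; a missing or empty value disables nothing."""
    raw = conf.get(DISABLED_METRICS_KEY, "")
    # Trim whitespace around names and skip empty entries such as a trailing comma.
    return {name.strip() for name in raw.split(",") if name.strip()}


def visible_custom_metrics(
    custom_metrics: Dict[str, int], conf: Dict[str, str]
) -> Dict[str, int]:
    """Keep only the custom metrics that are not on the deny list."""
    disabled = parse_disabled_metrics(conf)
    return {
        name: value
        for name, value in custom_metrics.items()
        if name not in disabled
    }


# Made-up sample values, only to show the filtering.
sample_conf = {DISABLED_METRICS_KEY: "loadedMapCacheHitCount, foo"}
sample_metrics = {"loadedMapCacheHitCount": 12, "loadedMapCacheMissCount": 3}
shown = visible_custom_metrics(sample_metrics, sample_conf)
```

With this shape, an unset config keeps today's behaviour (every custom metric gets a graph), and names on the list that match no metric (like `foo` here) are simply ignored.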
