HeartSaVioR commented on pull request #35548: URL: https://github.com/apache/spark/pull/35548#issuecomment-1066031036
Sorry, I was too busy handling other things. I'm handling the blocker issues for the release, and for others' work I'm focusing my reviews on bug fixes.

In addition, as I made clear in a previous comment, I'm not deeply involved in the UI and I don't think I can sign off on this one by myself. If you'd like to move this forward, please try to get consensus about the necessity of this feature and find some supporters.

My comments are all general ones and subject to change if experts or other supporters have objections:

> I understand the concern of sending the response as store objects which blocks the evolution of their structures between versions. However, these objects are also part of the existing programmatic monitoring APIs as mentioned here - https://spark.apache.org/docs/latest/structured-streaming-programming-guide.html#reading-metrics-interactively. Wouldn't the versioning issue also affect those APIs? Do the contracts apply only for REST APIs and not metrics objects or programmatic API responses?

We annotated all models with `@Evolving` and do not guarantee compatibility. We tend not to break backward compatibility, but we won't be blocked if breaking it is unavoidable to make an improvement. This causes fewer problems for custom listeners, since the compiler can detect the incompatibility when a custom listener is built against a new Spark version. Would it be OK for the REST API as well?

> Achieving the same via DropWizard / other alternatives would add the overhead of building one more REST layer and we lose the real-time experience.

I'm not sure I understand correctly - there is already a strong ecosystem built around Dropwizard, a time-series DB, and a UI. Once Spark sends its metrics via Dropwizard to a time-series DB, Spark can defer the concern to the time-series DB and UI layers. This mechanism has been working well for a decade (I worked on Apache Storm, which also reports metrics to Dropwizard).

> First one is real-time monitoring, these APIs can be used to build more sophisticated UIs than the one that is present under the Structured Streaming tab. For example, we can build a live version of it that plots the data on the client side by frequent refreshes without reloading the page.

I agree with this, but if we think a live version of the page would help with monitoring and you have an idea for it, why not introduce that improvement to the Spark UI and include this as part of it?
