Github user zsxwing commented on the pull request:
https://github.com/apache/spark/pull/13259#issuecomment-221747509
> @zsxwing thank you for the suggestion, but I have two concerns:
> 1. There could be multiple output ops in one BatchTime, and these ops could have different batchDurations. Which one should I use?
> 2. The input rate graph is generated from the input size column of the batch table. If we do the aggregation, the meaning of the input rate graph will change from "number of input events from source" to "number of processed events".
How about adding a `lastBatchTime` field that stores the time of the last submitted batch, and then using the aggregated input infos between the last batch and the current batch? We need to aggregate them because we only have batch infos for batches that actually have jobs.
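
For illustration only, here is a minimal Scala sketch of that idea; the names (`LastBatchAggregationSketch`, `recordsPerBatch`, `onBatchSubmitted`) are hypothetical stand-ins, not the actual StreamingJobProgressListener / InputInfoTracker code in this PR:

```scala
import scala.collection.mutable

// Minimal sketch: keep the time of the last submitted batch and, when the
// next batch with jobs is submitted, aggregate the input sizes of every
// batch time that fell in between (including batches that had no jobs).
object LastBatchAggregationSketch {
  type Time = Long // batch time in milliseconds

  // Hypothetical store of input sizes reported per batch time,
  // including batch times that never produced jobs.
  val recordsPerBatch = mutable.Map.empty[Time, Long]

  // Time of the last batch that was actually submitted (had jobs).
  private var lastBatchTime: Option[Time] = None

  // Returns the aggregated input size for (lastBatchTime, currentBatchTime],
  // so events from skipped batches are not lost from the batch table.
  def onBatchSubmitted(currentBatchTime: Time): Long = {
    val from = lastBatchTime.getOrElse(Long.MinValue)
    val aggregated = recordsPerBatch.collect {
      case (t, records) if t > from && t <= currentBatchTime => records
    }.sum
    lastBatchTime = Some(currentBatchTime)
    aggregated
  }
}
```

The aggregation step is what addresses the "batch infos only exist for batches with jobs" problem: input reported for batch times that never produced jobs is rolled into the next submitted batch instead of being dropped.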