[GitHub] [spark] HeartSaVioR commented on pull request #31944: [SPARK-34854][SQL][SS] Expose source metrics via progress report and add Kafka use-case to report delay.

GitBox Thu, 29 Apr 2021 14:35:21 -0700


HeartSaVioR commented on pull request #31944:
URL: https://github.com/apache/spark/pull/31944#issuecomment-829611545



   Yes, that sounds like a good rationale for the real-world case. Thanks!
   
   I looked into the changes on the API side, and I feel both #30988 and this PR 
can co-exist. #30988 covers the specific case where the data source can provide 
the latest offset in the Offset format, and this PR covers more general 
("arbitrary" might fit better) cases where the information the data source wants 
to provide is not limited to the latest offset.
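
   To make the distinction concrete, here is a minimal sketch of the two API 
shapes. It is not the actual code from either PR: every name below (`Offset`, 
`ReportsLatestOffset`, `ReportsMetrics`, `DemoSource`, and the metric keys) is a 
hypothetical stand-in chosen for illustration, and the offset is assumed to be a 
plain integer encoded as a string.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Protocol


@dataclass(frozen=True)
class Offset:
    """Hypothetical stand-in for a source-specific offset; real formats vary by source."""
    value: str


class ReportsLatestOffset(Protocol):
    """Narrow contract (assumed shape): the source reports only its latest offset."""

    def latest_offset(self) -> Offset: ...


class ReportsMetrics(Protocol):
    """General contract (assumed shape): the source reports arbitrary key/value metrics."""

    def metrics(self, latest_consumed: Optional[Offset]) -> Dict[str, str]: ...


class DemoSource:
    """Toy source that satisfies both contracts, showing that they can co-exist."""

    def __init__(self, latest_available: int) -> None:
        self._latest_available = latest_available

    def latest_offset(self) -> Offset:
        return Offset(str(self._latest_available))

    def metrics(self, latest_consumed: Optional[Offset]) -> Dict[str, str]:
        # Treat "nothing consumed yet" as offset 0 (an assumption of this sketch).
        consumed = int(latest_consumed.value) if latest_consumed else 0
        return {
            # The narrow case can be expressed through the general map...
            "latestOffset": str(self._latest_available),
            # ...and the map can also carry information beyond the latest offset.
            "offsetsBehindLatest": str(self._latest_available - consumed),
        }


source = DemoSource(latest_available=120)
print(source.latest_offset())
print(source.metrics(Offset("100")))
```

   The general map can carry the latest offset as just another entry, but a 
consumer then has to know the key and parse the string, whereas the narrow 
contract gives every source the same typed accessor.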
   
   For sure, the actual behavioral change in #30988 can be implemented with the 
API being added here, but providing a general output across data sources would 
ideally be more useful, e.g. for plotting in the UI. (I know the technical 
limitation in making it general: the format of "Offset" varies across data 
sources, and the consumer has to take care of that.)
   
   For the newly added Kafka metrics, they still make sense when the target 
persona is a human (convenient to check), but otherwise I agree with @viirya that 
they sound redundant. Although the code change is not huge, it is probably good 
to split this into two PRs with two JIRA issues, 1) API changes and 2) Kafka 
metrics, and to finalize reviewing 1) first, as there seems to be no outstanding 
concern about the API changes.

