yijiacui-db edited a comment on pull request #31944: URL: https://github.com/apache/spark/pull/31944#issuecomment-806074981
> The latest offset in source progress is the latest offset available in the source, not the latest consumed offset by the stream.

@viirya That's a good point. I referred to the latest consumed offset in the metrics method without realizing that the latest available offset is already reported by Kafka through reportLatestOffset. The metrics interface is meant to be more general: sources that don't implement reportLatestOffset can do some computation based on the consumed offset and report stats back. It's definitely true that for the Kafka source this API isn't really necessary, because the latest offset is already reported.

@zsxwing @tdas Do you think we should remove this API from the Kafka source because it's somewhat duplicated? And if so, do we still want to merge only the metrics API into apache/spark?
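
As a rough illustration of the use case for sources without reportLatestOffset, here is a minimal Python sketch. It is not the Spark interface: `HypotheticalSource`, `total_records`, and the `recordsBehindLatest` key are made-up names, and the source is assumed to track its own record count internally.

```python
from typing import Dict, Optional


class HypotheticalSource:
    """Illustrative source that cannot report a latest available offset."""

    def __init__(self, total_records: int) -> None:
        # Assumed internal bookkeeping: how many records the source knows about.
        self.total_records = total_records

    def metrics(self, latest_consumed_offset: Optional[int]) -> Dict[str, str]:
        # Nothing consumed yet: there is nothing to compute, so report no metrics.
        if latest_consumed_offset is None:
            return {}
        # Derive a backlog figure from the consumed offset and report it as a string.
        backlog = max(self.total_records - latest_consumed_offset, 0)
        return {"recordsBehindLatest": str(backlog)}


source = HypotheticalSource(total_records=100)
print(source.metrics(40))
```

The point of the sketch is only that the stat is computed from the consumed offset handed to the source, rather than from an offset the source reports itself.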
