GitHub user arunmahadevan opened a pull request:
https://github.com/apache/spark/pull/21819
[SPARK-24863][SS] Report Kafka offset lag as a custom metrics
## What changes were proposed in this pull request?
This builds on top of SPARK-24748 to report 'offset lag' as a custom
metric for the Kafka structured streaming source.
This lag is the difference between the latest offsets in Kafka at the time the
metric is reported (just after a micro-batch completes) and the latest offsets
Spark has processed. It can be 0 (or close to 0) if Spark keeps up with the
rate at which messages are ingested into Kafka topics in steady state. It
measures how far the Spark source has fallen behind (per partition) and
can aid in tuning the application.
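To illustrate the definition above, here is a minimal sketch of the per-partition lag computation. It is not the code in this patch: the function name `offset_lag`, the `Offsets` alias, the sample topic name, and the choice to treat an unseen partition as processed up to offset 0 are all assumptions made for the example.

```python
from typing import Dict, Tuple

# Hypothetical alias: maps (topic, partition) to an offset.
Offsets = Dict[Tuple[str, int], int]


def offset_lag(latest_in_kafka: Offsets, processed_by_spark: Offsets) -> Offsets:
    """Return, per partition, latest offset in Kafka minus latest offset processed."""
    lag: Offsets = {}
    for topic_partition, latest in latest_in_kafka.items():
        # Assumption: a partition with no processed offset yet counts as offset 0.
        processed = processed_by_spark.get(topic_partition, 0)
        # Clamp at zero so a stale "latest" reading never produces a negative lag.
        lag[topic_partition] = max(latest - processed, 0)
    return lag


# Sample offsets taken just after a micro-batch completes (made-up values).
latest = {("events", 0): 120, ("events", 1): 95}
processed = {("events", 0): 120, ("events", 1): 80}
lag = offset_lag(latest, processed)
```

In this sample, partition 0 has kept up (lag 0) while partition 1 is 15 offsets behind, which is the kind of per-partition signal that could guide tuning.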
## How was this patch tested?
Existing and new unit tests
Please review http://spark.apache.org/contributing.html before opening a
pull request.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/arunmahadevan/spark SPARK-24863
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/21819.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #21819
----
commit 29919fe07191cf75f5a7651f8ac9434dc79c119d
Author: Arun Mahadevan <arunm@...>
Date: 2018-07-06T01:51:50Z
[SPARK-24748][SS] Support for reporting custom metrics via Streaming Query
Progress
commit 43190e9112c3d87e482d81ac8c56097c5c513012
Author: Arun Mahadevan <arunm@...>
Date: 2018-07-06T18:07:28Z
Add error reporting API for custom metrics and address review comments
commit 6d4165efc9c49f73141292b6c0f318f6a3cafb23
Author: Arun Mahadevan <arunm@...>
Date: 2018-07-11T17:42:17Z
Added support for custom metrics in Sink and use MemorySinkV2 as an example
commit bca054f978406b257bfa4c4010e7655144fc820f
Author: Arun Mahadevan <arunm@...>
Date: 2018-07-11T17:59:54Z
remove kafka source metrics outside the scope of this PR
commit 5e732cba85a5c2e3ed3f0487c70c1ebe4c20b75d
Author: Arun Mahadevan <arunm@...>
Date: 2018-07-11T18:48:41Z
Fix scala style issues
Change-Id: I831719f1e9ef1437d9df2b3529bf0a288ef5d0fa
commit c1fc3ca1ec2e2698d1d83ca2bd3ecbecd4da76a6
Author: Arun Mahadevan <arunm@...>
Date: 2018-07-19T20:14:40Z
[SPARK-24863][SS] Report Kafka offset lag as a custom metrics
----