[https://issues.apache.org/jira/browse/BEAM-11213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17313850#comment-17313850]
Ismaël Mejía edited comment on BEAM-11213 at 4/2/21, 8:50 PM:
--------------------------------------------------------------
This is a new feature for Beam 2.29.0, but previous PRs introduced an API issue
for the Spark Runner that we should fix before the release. I opened
[https://github.com/apache/beam/pull/14409] for this. I will leave this open
until we get the cherry-pick merged.
was (Author: iemejia):
This is a new feature, but it is introducing an API issue for the Spark Runner
that we should fix before the release. I opened
[https://github.com/apache/beam/pull/14409] for this. I will let you know about
the cherry-pick once it is reviewed and merged.
> Beam metrics should be displayed in Spark UI
> --------------------------------------------
>
> Key: BEAM-11213
> URL: https://issues.apache.org/jira/browse/BEAM-11213
> Project: Beam
> Issue Type: Wish
> Components: runner-spark
> Reporter: Kyle Weaver
> Assignee: Tomasz Szerszen
> Priority: P2
> Labels: portability-spark
> Fix For: 2.29.0
>
> Time Spent: 14h 20m
> Remaining Estimate: 0h
>
> All Beam metrics are visible in the Spark UI in a single accumulator value
> (in the "Accumulators" tab), which is a large, hard-to-read blob. Originally,
> this blob was rendered in a bespoke format
> (https://github.com/apache/beam/blob/ead80b469ffeeddcd8e9e5c8dc462eec0b0ffc6b/sdks/java/core/src/main/java/org/apache/beam/sdk/metrics/MetricQueryResults.java#L63-L72).
> I changed the format to JSON so it could be easily deserialized (BEAM-9600).
> But then an issue was filed (BEAM-10294) reporting that the new JSON format
> was harder to read than the original bespoke format. The temporary fix was to
> revert to the bespoke format in Spark, while allowing Flink to continue to
> use JSON. However, if Beam metrics are only visible as an accumulator, then
> they are also unreadable because the payloads are in binary form (BEAM-10719).
> Having metrics visible in Spark's "Metrics" tab would A) make metrics easier
> to read (even compared to the bespoke accumulator string format) and bring
> them closer to what users of Spark without Beam expect, and B) free us to use
> the accumulator however we wish for Beam-internal purposes, without worrying
> about readability.
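For context, the accumulator blob is essentially a rendering of MetricQueryResults. Below is a minimal sketch, assuming only the Beam Java metrics API (the BeamMetricsDump class is illustrative, not the runner's actual code), of how those results can be queried and turned into a readable string close to the bespoke format linked above:

{code:java}
import org.apache.beam.sdk.PipelineResult;
import org.apache.beam.sdk.metrics.MetricQueryResults;
import org.apache.beam.sdk.metrics.MetricResult;
import org.apache.beam.sdk.metrics.MetricsFilter;

/** Illustrative helper: render the Beam metrics of a finished pipeline as a readable string. */
public class BeamMetricsDump {

  public static String render(PipelineResult result) {
    // Query all metrics reported by the pipeline (empty filter = everything).
    MetricQueryResults metrics =
        result.metrics().queryMetrics(MetricsFilter.builder().build());

    // Roughly the bespoke "MetricQueryResults(Counters(...))" style from MetricQueryResults#toString.
    StringBuilder sb = new StringBuilder("MetricQueryResults(Counters(");
    for (MetricResult<Long> counter : metrics.getCounters()) {
      sb.append(counter.getName()).append(": ").append(counter.getAttempted()).append(", ");
    }
    return sb.append("))").toString();
  }
}
{code}

The JSON variant introduced in BEAM-9600 is easier to deserialize programmatically, but as BEAM-10294 noted, it is harder to scan by eye than a string like the one above.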
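As for B), one possible direction (a sketch only, assuming Spark's Dropwizard-based org.apache.spark.metrics.source.Source API; the class and method names below are hypothetical, not the Spark runner's actual ones) is to expose Beam counter values as gauges in a custom metrics source:

{code:java}
import com.codahale.metrics.Gauge;
import com.codahale.metrics.MetricRegistry;
import java.util.function.Supplier;
import org.apache.spark.metrics.source.Source;

/** Hypothetical Spark metrics source backed by Beam counter values. */
public class BeamMetricsSource implements Source {

  private final MetricRegistry registry = new MetricRegistry();

  @Override
  public String sourceName() {
    return "Beam";
  }

  @Override
  public MetricRegistry metricRegistry() {
    return registry;
  }

  /** Expose one Beam counter as a Dropwizard gauge; the supplier re-reads the current value. */
  public void registerCounter(String name, Supplier<Long> currentValue) {
    registry.register(name, (Gauge<Long>) currentValue::get);
  }
}
{code}

Once such a source is registered with Spark's metrics system (for example via SparkEnv.get().metricsSystem().registerSource(...) on the driver), the values reach whatever sinks the deployment configures, instead of being buried in a single accumulator string.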