[ 
https://issues.apache.org/jira/browse/BEAM-11213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17313850#comment-17313850
 ] 

Ismaël Mejía edited comment on BEAM-11213 at 4/2/21, 8:50 PM:
--------------------------------------------------------------

This is a new feature for Beam 2.29.0, but previous PRs introduced an API 
issue for the Spark Runner that we should fix before the release. I opened 
[https://github.com/apache/beam/pull/14409] for this. I will leave this issue 
open until the cherry-pick is merged.


was (Author: iemejia):
This is a new feature, but it introduces an API issue for the Spark Runner 
that we should fix before the release. I opened 
[https://github.com/apache/beam/pull/14409] for this. I will let you know 
about the cherry-pick once it is reviewed and merged.

> Beam metrics should be displayed in Spark UI
> --------------------------------------------
>
>                 Key: BEAM-11213
>                 URL: https://issues.apache.org/jira/browse/BEAM-11213
>             Project: Beam
>          Issue Type: Wish
>          Components: runner-spark
>            Reporter: Kyle Weaver
>            Assignee: Tomasz Szerszen
>            Priority: P2
>              Labels: portability-spark
>             Fix For: 2.29.0
>
>          Time Spent: 14h 20m
>  Remaining Estimate: 0h
>
> All Beam metrics are visible in the Spark UI in a single accumulator value 
> (in the "Accumulators" tab), which is a large, hard-to-read blob. Originally, 
> this blob was rendered in a bespoke format 
> (https://github.com/apache/beam/blob/ead80b469ffeeddcd8e9e5c8dc462eec0b0ffc6b/sdks/java/core/src/main/java/org/apache/beam/sdk/metrics/MetricQueryResults.java#L63-L72).
> I changed the format to JSON so it could be easily deserialized (BEAM-9600). 
> But then an issue was filed (BEAM-10294) reporting that the new JSON format 
> was harder to read than the original bespoke format. The temporary fix was to 
> revert to the bespoke format in Spark, while allowing Flink to continue to 
> use JSON. However, if Beam metrics are only visible as an accumulator, then 
> they are also unreadable because the payloads are in binary form (BEAM-10719).
> Having metrics visible in Spark's "Metrics" tab would A) make metrics easier 
> to read (even compared to the bespoke accumulator string format), and closer 
> to what users of Beamless Spark expect, and B) free us to use the accumulator 
> however we wish for Beam internal purposes, without worrying about 
> readability.



