[
https://issues.apache.org/jira/browse/BEAM-7605?focusedWorklogId=271693&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-271693
]
ASF GitHub Bot logged work on BEAM-7605:
----------------------------------------
Author: ASF GitHub Bot
Created on: 03/Jul/19 15:32
Start Date: 03/Jul/19 15:32
Worklog Time Spent: 10m
Work Description: steveniemitz commented on issue #8913: [BEAM-7605]
Allow user-code to read counters from the dataflow worker
URL: https://github.com/apache/beam/pull/8913#issuecomment-508143755
I spent a little time trying to adapt MetricsPusher into the Dataflow
worker. The largest hurdle seems to be adapting many of the Dataflow metrics
into the "sdk" metrics (MetricsContainer, etc.) and figuring out how to map
things like CounterSet and CounterFactory onto their Beam equivalents. The
right way to do it seems to be moving everything over to the Beam versions,
but that's a pretty large project.
I'm not going to have the time to take on a large refactor like that right
now. We can close this PR if you don't think there's a chance of it getting
in. I'm fine maintaining our own fork of the worker jar with this change for
the time being, until there's a better solution for getting metrics out of the
worker.
Issue Time Tracking
-------------------
Worklog Id: (was: 271693)
Time Spent: 2.5h (was: 2h 20m)
> Provide a way for user code to read dataflow runner stats
> ---------------------------------------------------------
>
> Key: BEAM-7605
> URL: https://issues.apache.org/jira/browse/BEAM-7605
> Project: Beam
> Issue Type: Improvement
> Components: runner-dataflow
> Reporter: Steve Niemitz
> Assignee: Steve Niemitz
> Priority: Major
> Time Spent: 2.5h
> Remaining Estimate: 0h
>
> The Dataflow runner collects (and publishes to the Dataflow service) a large
> number of useful stats. While these can be polled from the Dataflow service
> via its API, this approach has a few downsides:
> * it requires another process to poll and collect the stats
> * the stats are aggregated across all workers, so per-worker stats are lost
> It would be simple to provide a hook that lets users receive stats updates
> directly on the worker and then do whatever they want with them.
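
For illustration only, the kind of hook the description asks for might look roughly like the sketch below. Nothing here is taken from the PR or from the Dataflow worker: `WorkerStatsListener`, `WorkerStatsPublisher`, the `workerId` parameter, and the counter names are all hypothetical, and a plain `Map<String, Long>` stands in for the worker's real counter types.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CopyOnWriteArrayList;

public class WorkerStatsHookSketch {

    /** Hypothetical callback that user code would implement to receive per-worker stats. */
    interface WorkerStatsListener {
        void onStatsUpdate(String workerId, Map<String, Long> counters);
    }

    /** Hypothetical worker-side publisher; the real worker would feed it from its own counters. */
    static final class WorkerStatsPublisher {
        // CopyOnWriteArrayList lets listeners register while an update is being delivered.
        private final List<WorkerStatsListener> listeners = new CopyOnWriteArrayList<>();

        void register(WorkerStatsListener listener) {
            listeners.add(listener);
        }

        void publish(String workerId, Map<String, Long> counters) {
            // Hand out a read-only copy so listeners cannot mutate the worker's state.
            Map<String, Long> snapshot = Collections.unmodifiableMap(new HashMap<>(counters));
            for (WorkerStatsListener listener : listeners) {
                listener.onStatsUpdate(workerId, snapshot);
            }
        }
    }

    public static void main(String[] args) {
        WorkerStatsPublisher publisher = new WorkerStatsPublisher();
        // User code decides what to do with the update; here it is simply printed.
        publisher.register((workerId, counters) ->
                System.out.println(workerId + " -> " + counters));
        publisher.publish("worker-0", Collections.singletonMap("elementsRead", 42L));
    }
}
```

Because the listener is invoked on each worker, the per-worker breakdown is preserved and no separate polling process is needed, which are the two downsides listed above.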