[ https://issues.apache.org/jira/browse/BEAM-7605?focusedWorklogId=263932&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-263932 ]
ASF GitHub Bot logged work on BEAM-7605:
----------------------------------------

Author: ASF GitHub Bot
Created on: 20/Jun/19 17:33
Start Date: 20/Jun/19 17:33
Worklog Time Spent: 10m
Work Description: lukecwik commented on issue #8913: [BEAM-7605] Allow user-code to read counters from the dataflow worker
URL: https://github.com/apache/beam/pull/8913#issuecomment-504115136

I can understand why you would want this. Two previous contributors work in this space more than I do, and it would be valuable to get their input: @echauchot, who designed the MetricsPusher, and @ajamato, who has done a lot of metrics-related work with Dataflow.

I would like to give some useful context pointers/notes in the meantime:
* you should really integrate and use MetricsPusher: https://github.com/apache/beam/blob/master/runners/core-java/src/main/java/org/apache/beam/runners/core/metrics/MetricsPusher.java
* see Jira issues BEAM-3310 and specifically BEAM-3926 for Dataflow
* the original metrics pusher thread (includes design docs): https://lists.apache.org/thread.html/01a80d62f2df6b84bfa41f05e15fda900178f882877c294fed8be91e@%3Cdev.beam.apache.org%3E
* it would be good to integrate this in both the batch and streaming dataflow workers, plus sdks/java/harness, which is the portability container that is meant to be used across all runners for Java jobs.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
-------------------

Worklog Id: (was: 263932)
Time Spent: 20m (was: 10m)

> Provide a way for user code to read dataflow runner stats
> ---------------------------------------------------------
>
> Key: BEAM-7605
> URL: https://issues.apache.org/jira/browse/BEAM-7605
> Project: Beam
> Issue Type: Improvement
> Components: runner-dataflow
> Reporter: Steve Niemitz
> Assignee: Steve Niemitz
> Priority: Major
> Time Spent: 20m
> Remaining Estimate: 0h
>
> The dataflow runner collects (and publishes to the dataflow service) a large
> number of useful stats. While these can be polled from the dataflow service
> via its API, there are a few downsides to this:
> * it requires another process to poll and collect the stats
> * the stats are aggregated across all workers, so per-worker stats are lost
> It would be simple to provide a hook to allow users to receive stats updates
> as well, and then do whatever they want with them.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)