[jira] [Work logged] (BEAM-3310) Push metrics to a backend in an runner agnostic way

ASF GitHub Bot (JIRA) Thu, 23 Aug 2018 08:28:15 -0700


     [ 
https://issues.apache.org/jira/browse/BEAM-3310?focusedWorklogId=137423&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-137423
 ]


ASF GitHub Bot logged work on BEAM-3310:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 23/Aug/18 15:27
            Start Date: 23/Aug/18 15:27
    Worklog Time Spent: 10m 
      Work Description: echauchot commented on issue #4548: [BEAM-3310] Metrics 
pusher
URL: https://github.com/apache/beam/pull/4548#issuecomment-415459174
 
 
   @JozoVilcek I understand and I agree with the importance of detached job 
submission. 
   Please note that the Beam Flink runner (like the other runners) translates 
Beam code into native code but still relies on the Flink engine to run that 
native code. For metrics, it relies on native Flink accumulators to handle 
parallelism and merging. In detached mode, the Flink engine itself disables 
many features, not only accumulators: the `DetachedEnvironment` class in Flink 
(regular Flink, not the Beam Flink runner) throws an exception when you try to 
access Flink accumulators in detached mode. The message states:
   `Job was submitted in detached mode. Results of job execution, such as 
accumulators, runtime, job id etc. are not available. Please make sure your 
program doesn't call an eager execution function [collect, print, printToErr, 
count].`
   So, AFAIK, there is nothing in native Flink for the Beam metrics to bind to 
in detached mode. 
   BTW: the current PR is about runner-agnostic metrics **extraction**, not collection.
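
   To make the detached-mode limitation concrete, here is a minimal, self-contained sketch. It does not use Flink or Beam classes: `JobResult`, `AttachedResult`, `DetachedResult` and `readMetrics` are hypothetical names invented for illustration, and `IllegalStateException` is only a stand-in for whatever exception Flink actually throws. It models a result object that refuses accumulator access in detached mode, and a caller that consequently has no metrics to read.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

public class DetachedMetricsSketch {

    /** Hypothetical stand-in for a job result; not a Flink or Beam type. */
    interface JobResult {
        Map<String, Long> accumulators();
    }

    /** Attached submission: accumulator values are available after the job ends. */
    static final class AttachedResult implements JobResult {
        private final Map<String, Long> values;

        AttachedResult(Map<String, Long> values) {
            this.values = values;
        }

        @Override
        public Map<String, Long> accumulators() {
            return values;
        }
    }

    /** Detached submission: any access to accumulators is rejected. */
    static final class DetachedResult implements JobResult {
        @Override
        public Map<String, Long> accumulators() {
            // Assumption: exception type chosen for the sketch only.
            throw new IllegalStateException(
                "Job was submitted in detached mode. Results of job execution, "
                    + "such as accumulators, runtime, job id etc. are not available.");
        }
    }

    /** Returns the metrics if they can be read, or an empty map otherwise. */
    static Map<String, Long> readMetrics(JobResult result) {
        try {
            return result.accumulators();
        } catch (IllegalStateException e) {
            // Nothing to bind to in detached mode.
            return Collections.emptyMap();
        }
    }

    public static void main(String[] args) {
        Map<String, Long> counters = new HashMap<>();
        counters.put("elementsRead", 42L);

        System.out.println(readMetrics(new AttachedResult(counters)));
        System.out.println(readMetrics(new DetachedResult()));
    }
}
```

   In this model the detached case can only return an empty map, which is the point above: without accumulators there is no source to extract metrics from.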
   
   @aljoscha, @StephanEwen do you have any idea how we could support Beam 
metrics in Flink in detached mode (if that is possible with the current Flink 
architecture)?
   
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


Issue Time Tracking
-------------------

    Worklog Id:     (was: 137423)
    Time Spent: 15h 20m  (was: 15h 10m)

> Push metrics to a backend in an runner agnostic way
> ---------------------------------------------------
>
>                 Key: BEAM-3310
>                 URL: https://issues.apache.org/jira/browse/BEAM-3310
>             Project: Beam
>          Issue Type: New Feature
>          Components: runner-extensions-metrics, sdk-java-core
>            Reporter: Etienne Chauchot
>            Assignee: Etienne Chauchot
>            Priority: Major
>          Time Spent: 15h 20m
>  Remaining Estimate: 0h
>
> The idea is to avoid relying on the runners to provide access to the metrics 
> (either at the end of the pipeline or while it runs) because they do not all 
> have the same metrics capabilities (e.g. the Spark runner configures sinks 
> such as CSV, Graphite, or in-memory sinks through the Spark engine conf). The 
> goal is to push the metrics from the common runner code so that, whatever 
> runner is chosen, users can get their metrics out of Beam.
> Here is the link to the discussion thread on the dev ML: 
> https://lists.apache.org/thread.html/01a80d62f2df6b84bfa41f05e15fda900178f882877c294fed8be91e@%3Cdev.beam.apache.org%3E
> And the design doc:
> https://s.apache.org/runner_independent_metrics_extraction



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
