Hi @all,

From a quick debugging session, I conclude that the wiring is in place for the Flink Runner. There is a ProgressReporter that reports MonitoringInfos to Flink, in a similar fashion to the "legacy" Runner.

The bundle duration metrics are 0, but the element count gets reported correctly. It appears to be an issue in the Python/Java harness, because the ProcessBundleProgressResponse contains only zero values for the bundle duration.
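To make the symptom concrete, here is a minimal sketch (plain Python, not the harness code): it models a progress response as a list of (URN, value) pairs and sums them per URN. The element-count and msec URNs match the metric name discussed in this thread; the data structures are simplified assumptions, not Beam's protos.

```python
# Conceptual sketch only -- NOT the Beam harness implementation.
# Models the monitoring infos of a ProcessBundleProgressResponse as
# (urn, value) pairs, illustrating the reported symptom: element
# counts are populated while the msec counter stays at zero.

ELEMENT_COUNT_URN = "beam:metric:element_count:v1"
MSECS_URN = "beam:metric:pardo_execution_time:process_bundle_msecs:v1"

def summarize_progress(monitoring_infos):
    """Collect per-URN totals from a list of (urn, value) pairs."""
    totals = {}
    for urn, value in monitoring_infos:
        totals[urn] = totals.get(urn, 0) + value
    return totals

# A response resembling what is described above: counts present, msecs zero.
response = [
    (ELEMENT_COUNT_URN, 42),
    (ELEMENT_COUNT_URN, 8),
    (MSECS_URN, 0),
]
totals = summarize_progress(response)
print(totals[ELEMENT_COUNT_URN])  # 50
print(totals[MSECS_URN])          # 0
```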

Thanks,
Max

On 04.04.19 19:54, Mikhail Gryzykhin wrote:
Hi everyone,

Quick summary on the Python SDK and the Dataflow Runner:
Python SDK already reports:
- MSec
- User metrics (int64 and distribution)
- PCollection Element Count
- Work on MeanByteCount for PCollections is ongoing here <https://github.com/apache/beam/pull/8062>.
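For readers following along, the two user-metric types listed above (int64 counter and distribution) accumulate values like this. This is a stdlib sketch of the accumulated values, not Beam's MetricCell classes; the class names here are made up for illustration.

```python
# Conceptual model of the two user-metric types: an int64 counter
# and a distribution (sum/count/min/max). Not Beam code.

class CounterCell:
    def __init__(self):
        self.value = 0

    def inc(self, n=1):
        self.value += n

class DistributionCell:
    def __init__(self):
        self.sum = 0
        self.count = 0
        self.min = None
        self.max = None

    def update(self, v):
        self.sum += v
        self.count += 1
        self.min = v if self.min is None else min(self.min, v)
        self.max = v if self.max is None else max(self.max, v)

    @property
    def mean(self):
        return self.sum / self.count if self.count else 0

c = CounterCell()
c.inc()
c.inc(5)
d = DistributionCell()
for v in (10, 20, 30):
    d.update(v)
print(c.value, d.sum, d.count, d.min, d.max)  # 6 60 3 10 30
```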

Dataflow Runner:
- all metrics listed above are passed through to Dataflow.

Ryan can give more information on the Flink Runner. I also see Maximilian on some of the relevant PRs, so he might comment on this as well.

Regards,
Mikhail.


On Thu, Apr 4, 2019 at 10:43 AM Pablo Estrada <[email protected] <mailto:[email protected]>> wrote:

    Hello guys!
    Alex, Mikhail and Ryan are working on support for metrics in the
    portability framework. The support on the SDK is pretty advanced
    AFAIK*, and the next step is to get the metrics back into the
    runner. Lukasz and I are working on a project that depends on
    this too, so I'm adding everyone so we can get an idea of what's
    missing.

    I believe:
    - User metrics are fully wired up in the SDK
    - State sampler (timing) metrics are wired up as well (is that
    right, +Alex Amato <mailto:[email protected]>?)
    - Work is ongoing to send the updates back to Flink.
    - What is the plan for making metrics queryable from Flink? +Ryan
    Williams <mailto:[email protected]>

    Thanks!
    -P.



    On Wed, Apr 3, 2019 at 12:02 PM Thomas Weise <[email protected]
    <mailto:[email protected]>> wrote:

        I believe this is where the metrics are supplied:
        https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/worker/operations.py

        git grep process_bundle_msecs yields results for the Dataflow
        worker only.

        There isn't any test coverage for the Flink runner:
        https://github.com/apache/beam/blob/d38645ae8758d834c3e819b715a66dd82c78f6d4/sdks/python/apache_beam/runners/portability/flink_runner_test.py#L181



        On Wed, Apr 3, 2019 at 10:45 AM Akshay Balwally
        <[email protected] <mailto:[email protected]>> wrote:

            Should have added- I'm using Python sdk, Flink runner

            On Wed, Apr 3, 2019 at 10:32 AM Akshay Balwally
            <[email protected] <mailto:[email protected]>> wrote:

                Hi,
                I'm hoping to get metrics on the amount of time spent on
                each operator, so it seems like the stat

                {organization_specific_prefix}.operator.beam-metric-pardo_execution_time-process_bundle_msecs-v1.gauge.mean

                would be pretty helpful. But in practice, this stat
                always shows 0, which I interpret as 0 milliseconds
                spent per bundle, which can't be correct (other stats
                show that the operators are running, and timers within
                the operators show more reasonable times). Is this a
                known bug?
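One way to cross-check the always-zero gauge is to time the bundle manually with a monotonic clock. A hedged workaround sketch, where process_element() is a hypothetical stand-in for the operator's real per-element work:

```python
# Workaround sketch: accumulate per-bundle wall time manually to
# compare against the process_bundle_msecs gauge.
import time

def process_element(x):
    # Placeholder for real per-element work.
    return x * x

def timed_bundle(elements):
    """Process a bundle and return (results, elapsed_msecs)."""
    start = time.monotonic()
    results = [process_element(e) for e in elements]
    elapsed_msecs = int((time.monotonic() - start) * 1000)
    return results, elapsed_msecs

results, msecs = timed_bundle(range(1000))
# For nontrivial work, msecs will be nonzero, unlike the reported gauge.
```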


                --
                *Akshay Balwally*
                Software Engineer
                937.271.6469 <tel:+19372716469>
                Lyft <http://www.lyft.com/>



