Installing Cython in the application environment fixed the issue. Now I am
able to see the operator metric
({organization_specific_prefix}.operator.beam-metric-pardo_execution_time-process_bundle_msecs-v1.gauge.mean).

Thanks, Ankur, for looking into it and providing support.

I am going to close https://issues.apache.org/jira/browse/BEAM-7058 if no
one has any objections.
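For anyone hitting the same symptom, a quick sanity check is whether Cython is importable in the environment that runs the SDK harness; the Beam Python SDK's compiled extensions need it at build time, and without them the worker uses pure-Python code paths. This is a minimal sketch of such a check (the helper name is my own, not part of Beam):

```python
import importlib.util

def cython_available() -> bool:
    # True if the Cython package can be imported in this environment.
    # find_spec() locates the module without actually importing it.
    return importlib.util.find_spec("Cython") is not None

print(cython_available())
```

If this prints False in the container or virtualenv the harness runs in, installing Cython there (as described above) is the likely fix.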


On Thu, Apr 11, 2019 at 7:13 AM Thomas Weise <[email protected]> wrote:

> Tracked as https://issues.apache.org/jira/browse/BEAM-7058
>
>
> On Wed, Apr 10, 2019 at 11:38 AM Pablo Estrada <[email protected]> wrote:
>
>> This sounds like a bug then? +Alex Amato <[email protected]>
>>
>> On Wed, Apr 10, 2019 at 3:59 AM Maximilian Michels <[email protected]>
>> wrote:
>>
>>> Hi @all,
>>>
>>> From a quick debugging session, I conclude that the wiring is in place
>>> for the Flink Runner. There is a ProgressReporter that reports
>>> MonitoringInfos to Flink, in a similar fashion as the "legacy" Runner.
>>>
>>> The bundle duration metrics are 0, but the element count gets reported
>>> correctly. It appears to be an issue in the Python/Java harness, because
>>> "ProcessBundleProgressResponse" contains only 0 values for the bundle
>>> duration.
>>>
>>> Thanks,
>>> Max
>>>
>>> On 04.04.19 19:54, Mikhail Gryzykhin wrote:
>>> > Hi everyone,
>>> >
>>> > Quick summary on python and Dataflow Runner:
>>> > Python SDK already reports:
>>> > - MSec
>>> > - User metrics (int64 and distribution)
>>> > - PCollection Element Count
>>> > - Work on MeanByteCount for PCollection is ongoing here
>>> > <https://github.com/apache/beam/pull/8062>.
>>> >
>>> > Dataflow Runner:
>>> > - all metrics listed above are passed through to Dataflow.
>>> >
>>> > Ryan can give more information on the Flink Runner. I also see
>>> > Maximilian on some of the relevant PRs, so he might comment on this as
>>> > well.
>>> >
>>> > Regards,
>>> > Mikhail.
>>> >
>>> >
>>> > On Thu, Apr 4, 2019 at 10:43 AM Pablo Estrada <[email protected]
>>> > <mailto:[email protected]>> wrote:
>>> >
>>> >     Hello guys!
>>> >     Alex, Mikhail and Ryan are working on support for metrics in the
>>> >     portability framework. The support on the SDK is pretty advanced
>>> >     AFAIK*, and the next step is to get the metrics back into the
>>> >     runner. Lukazs and I are working on a project that depends on
>>> >     this too, so I'm adding everyone so we can get an idea of what's
>>> >     missing.
>>> >
>>> >     I believe:
>>> >     - User metrics are fully wired up in the SDK
>>> >     - State sampler (timing) metrics are wired up as well (is that
>>> >     right, +Alex Amato <mailto:[email protected]>?)
>>> >     - Work is ongoing to send the updates back to Flink.
>>> >     - What is the plan for making metrics queriable from Flink? +Ryan
>>> >     Williams <mailto:[email protected]>
>>> >
>>> >     Thanks!
>>> >     -P.
>>> >
>>> >
>>> >
>>> >     On Wed, Apr 3, 2019 at 12:02 PM Thomas Weise <[email protected]
>>> >     <mailto:[email protected]>> wrote:
>>> >
>>> >         I believe this is where the metrics are supplied:
>>> >
>>> >         https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/worker/operations.py
>>> >
>>> >         git grep process_bundle_msecs yields results for the Dataflow
>>> >         worker only.
>>> >
>>> >         There isn't any test coverage for the Flink runner:
>>> >
>>> >
>>> >         https://github.com/apache/beam/blob/d38645ae8758d834c3e819b715a66dd82c78f6d4/sdks/python/apache_beam/runners/portability/flink_runner_test.py#L181
>>> >
>>> >
>>> >
>>> >         On Wed, Apr 3, 2019 at 10:45 AM Akshay Balwally
>>> >         <[email protected] <mailto:[email protected]>> wrote:
>>> >
>>> >             Should have added- I'm using Python sdk, Flink runner
>>> >
>>> >             On Wed, Apr 3, 2019 at 10:32 AM Akshay Balwally
>>> >             <[email protected] <mailto:[email protected]>> wrote:
>>> >
>>> >                 Hi,
>>> >                 I'm hoping to get metrics on the amount of time spent
>>> >                 on each operator, so it seems like the stat
>>> >
>>> >                 {organization_specific_prefix}.operator.beam-metric-pardo_execution_time-process_bundle_msecs-v1.gauge.mean
>>> >
>>> >                 would be pretty helpful. But in practice, this stat
>>> >                 always shows 0, which I interpret as 0 milliseconds
>>> >                 spent per bundle, which can't be correct (other stats
>>> >                 show that the operators are running, and timers within
>>> >                 the operators show more reasonable times). Is this a
>>> >                 known bug?
>>> >
>>> >
>>> >                 --
>>> >                 *Akshay Balwally*
>>> >                 Software Engineer
>>> >                 937.271.6469 <tel:+19372716469>
>>> >                 Lyft <http://www.lyft.com/>
>>> >
>>> >
>>> >
>>> >             --
>>> >             *Akshay Balwally*
>>> >             Software Engineer
>>> >             937.271.6469 <tel:+19372716469>
>>> >             Lyft <http://www.lyft.com/>
>>> >
>>>
>>
