Re: ParDo Execution Time stat is always 0

2019-07-15 Thread Pablo Estrada
State sampler is the only state provider for the Python SDK. This means that the Metrics module relies on it to attribute metrics to each step; and the logging module also uses it to attribute logs to each step. statesampler_slow does not implement the actual sampling, but it does implement the

Re: ParDo Execution Time stat is always 0

2019-07-15 Thread Alex Amato
Perhaps no metric at all should be returned, instead of 0, which is an incorrect value. Also, is there a reason to have statesampler_slow at all then, if it's not intended to be implemented? On Mon, Jul 15, 2019 at 5:03 PM Kyle Weaver wrote: > Pablo, what about setting a lower sampling rate?

Re: ParDo Execution Time stat is always 0

2019-07-15 Thread Kyle Weaver
Pablo, what about setting a lower sampling rate? Or would that lead to poor results? On Mon, Jul 15, 2019 at 4:44 PM Pablo Estrada wrote: > @Thomas do you think this is a problem of documentation, or a

Re: ParDo Execution Time stat is always 0

2019-07-15 Thread Pablo Estrada
@Thomas do you think this is a problem of documentation, or a missing feature? We did not add support for it without Cython because the cost of locking and checking every 200 ms in Python would be too high - that's why this is only implemented in the optimized Cython codepath. I think it makes
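To make the sampling idea above concrete, here is a minimal pure-Python sketch of a state sampler. It is an illustration under stated assumptions, not Beam's implementation: the class name SketchStateSampler, its methods, and the step names are all hypothetical, and only the 200 ms period is taken from the message above. A background thread wakes once per period, takes a lock, and charges the elapsed time to whichever step is current. If that thread never runs, nothing is ever added to the counters, which would be consistent with the 0 reported in this thread.

```python
import threading
import time
from collections import defaultdict


class SketchStateSampler:
    """Toy sampler (hypothetical, not Beam's class): charges wall time
    to whichever state is current when a sample is taken."""

    def __init__(self, period_ms=200):
        self.period_ms = period_ms
        self.current_state = None      # name of the step now running
        self.msecs = defaultdict(int)  # step name -> sampled milliseconds
        self._lock = threading.Lock()
        self._stop = threading.Event()
        self._thread = None

    def enter(self, state):
        """Mark `state` as current; return the previous state."""
        with self._lock:
            previous, self.current_state = self.current_state, state
        return previous

    def sample_once(self, elapsed_ms):
        """Charge `elapsed_ms` to the current state, if there is one."""
        with self._lock:
            if self.current_state is not None:
                self.msecs[self.current_state] += elapsed_ms

    def _run(self):
        last = time.monotonic()
        # Event.wait returns False on timeout and True once stop() is called.
        while not self._stop.wait(self.period_ms / 1000.0):
            now = time.monotonic()
            self.sample_once(int((now - last) * 1000))
            last = now

    def start(self):
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def stop(self):
        self._stop.set()
        if self._thread is not None:
            self._thread.join()
```

The lock is taken on every state transition as well as on every sample, which is the per-element overhead the message above describes as too costly in plain Python.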

Re: ParDo Execution Time stat is always 0

2019-07-15 Thread Thomas Weise
That's great, but I think the JIRA needs to remain open since w/o Cython the metric still doesn't work. It would however be helpful to add a comment regarding your findings. On Mon, Jul 15, 2019 at 1:46 PM Rakesh Kumar wrote: > > Installing cython in the application environment fixed the

Re: ParDo Execution Time stat is always 0

2019-07-15 Thread Rakesh Kumar
Installing Cython in the application environment fixed the issue. Now I am able to see the operator metrics ({organization_specific_prefix}.operator.beam-metric-pardo_execution_time-process_bundle_msecs-v1.gauge.mean). Thanks Ankur for looking into it and providing support. I am going to close
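A quick way to check whether an environment has the compiled sampler is to try importing it. The sketch below is hedged: the helper name compiled_sampler_available is hypothetical, and the default module path apache_beam.runners.worker.statesampler_fast is an assumption (this thread only names statesampler_slow), so adjust it to match the installed SDK version.

```python
import importlib


def compiled_sampler_available(
        module_name="apache_beam.runners.worker.statesampler_fast"):
    """Return True if `module_name` can be imported in this environment.

    The default module path is an assumption about the SDK layout;
    pass a different name if your version differs.
    """
    try:
        importlib.import_module(module_name)
        return True
    except ImportError:
        # Covers a missing package as well as a missing compiled extension.
        return False


if __name__ == "__main__":
    if compiled_sampler_available():
        print("Compiled state sampler found.")
    else:
        print("Compiled state sampler not importable; "
              "execution-time metrics may stay at 0.")
```

Run it inside the same environment the SDK workers use, since the driver and the workers can have different packages installed.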

Re: ParDo Execution Time stat is always 0

2019-04-11 Thread Thomas Weise
Tracked as https://issues.apache.org/jira/browse/BEAM-7058 On Wed, Apr 10, 2019 at 11:38 AM Pablo Estrada wrote: > This sounds like a bug then? +Alex Amato > > On Wed, Apr 10, 2019 at 3:59 AM Maximilian Michels wrote: > >> Hi @all, >> >> From a quick debugging session, I conclude that the

Re: ParDo Execution Time stat is always 0

2019-04-10 Thread Pablo Estrada
This sounds like a bug then? +Alex Amato On Wed, Apr 10, 2019 at 3:59 AM Maximilian Michels wrote: > Hi @all, > > From a quick debugging session, I conclude that the wiring is in place > for the Flink Runner. There is a ProgressReporter that reports > MonitoringInfos to Flink, in a similar

Re: ParDo Execution Time stat is always 0

2019-04-04 Thread Mikhail Gryzykhin
Hi everyone,
Quick summary on Python and Dataflow Runner:
Python SDK already reports:
- MSec
- User metrics (int64 and distribution)
- PCollection Element Count
- Work on MeanByteCount for PCollection is ongoing here.
Dataflow Runner:
- all metrics

Re: ParDo Execution Time stat is always 0

2019-04-04 Thread Pablo Estrada
Hello guys! Alex, Mikhail and Ryan are working on support for metrics in the portability framework. The support on the SDK is pretty advanced AFAIK*, and the next step is to get the metrics back into the runner. Lukazs and myself are working on a project that depends on this too, so I'm adding

Re: ParDo Execution Time stat is always 0

2019-04-03 Thread Thomas Weise
I believe this is where the metrics are supplied: https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/worker/operations.py
git grep process_bundle_msecs yields results for the Dataflow worker only.
There isn't any test coverage for the Flink runner:

Re: ParDo Execution Time stat is always 0

2019-04-03 Thread Akshay Balwally
Should have added: I'm using the Python SDK with the Flink runner. On Wed, Apr 3, 2019 at 10:32 AM Akshay Balwally wrote: > Hi, > I'm hoping to get metrics on the amount of time spent on each operator, so > it seems like the stat > > >