Re: [portablility] metrics interrogations

Etienne Chauchot Tue, 11 Sep 2018 02:43:29 -0700

On Monday, 10 September 2018 at 09:42 -0700, Lukasz Cwik wrote:
> Alex is out on vacation for the next 3 weeks.
> Alex had proposed the types of metrics[1] but not the exact protocol as to 
> what the SDK and runner do. I could
> envision Alex proposing that the SDK harness only sends diffs or dirty 
> metrics in intermediate updates and all metrics
> values in the final update.
> Robert is referring to an integration that happened to an older set of 
> messages[2] that preceded Alex's proposal, and
> that integration with Dataflow, which is still incomplete, works as you 
> described in #2.


Thanks Luke and Robert for the confirmation.
> Robin had recently been considering adding an accessor to DoFns that would 
> allow you to get access to the job
> information from within the pipeline (current state, poll for metrics, invoke 
> actions like cancel / drain, ...). He
> wanted it so he could poll for attempted metrics to be able to test 
> @RequiresStableInput. 
Yes, I remember, I voted +1 to his proposal.
> Integrating the MetricsPusher or something like that on the SDK side to be 
> able to poll metrics over the job
> information accessor could be useful.

Well, in the design discussion, we decided to host the MetricsPusher as close as 
possible to the actual engine (inside the
runner code rather than the SDK code) so that the runner can also send system 
metrics in the future. 
> 1: https://s.apache.org/beam-fn-api-metrics
> 2: 
> https://github.com/apache/beam/blob/9b68f926628d727e917b6a33ccdafcfe693eef6a/model/fn-execution/src/main/proto/beam_fn_api.proto#L410

Besides, in his PR Alex talks about deprecated metrics. As he is away, can you 
tell me a little more about them? Which
metrics will be deprecated once the portability framework is 100% operational 
on all the runners?
Thanks,
Etienne
> 
> On Mon, Sep 10, 2018 at 8:41 AM Robert Burke <[email protected]> wrote:
> > The way I entered them into the Go SDK is #2 (SDK sends diffs per bundle) 
> > and the Java Runner Harness appears to
> > aggregate them correctly from there.
> > On Mon, Sep 10, 2018, 2:07 AM Etienne Chauchot <[email protected]> wrote:
> > > Hi all,
> > > @Luke, @Alex I have a general question related to metrics in the Fn API. 
> > > Communication between the runner
> > > harness and the SDK harness is done on a per-bundle basis. When the runner 
> > > harness sends data to the SDK harness to
> > > execute a transform that contains metrics, does it:
> > > 1 - send metrics values (for the ones defined in the transform) along 
> > > with the data and receive updated values of
> > > the metrics from the SDK harness when the bundle has finished 
> > > processing, or
> > > 2 - send only the data, and the SDK
> > > harness responds with a diff of the metrics so that the runner can 
> > > update them on its side?
> > > My bet is option 2. But can you confirm?
> > > 
> > > 
> > > Thanks
> > > 
> > > Etienne
> > > > On Thursday, 19 July 2018 at 15:10 +0200, Etienne Chauchot wrote:
> > > > > Thanks for the confirmations Luke.
> > > > > On Wednesday, 18 July 2018 at 07:56 -0700, Lukasz Cwik wrote:
> > > > > On Wed, Jul 18, 2018 at 7:01 AM Etienne Chauchot 
> > > > > <[email protected]> wrote:
> > > > > > Hi,
> > > > > > > Luke, Alex, I have some questions about portable metrics; can you 
> > > > > > > confirm the following points?
> > > > > > 
> > > > > > > 1 - As it is the SDK harness that will run the code of the UDFs, if 
> > > > > > > a UDF defines a metric, then the SDK
> > > > > > > harness will give updates through gRPC calls to the runner so that 
> > > > > > > the runner can update its metrics cells,
> > > > > > > right?
> > > > > 
> > > > > Yes. 
> > > > > > > 2 - Alex, you mentioned in the proto and design doc that there will be 
> > > > > > > no aggregation of metrics. But some
> > > > > > > runners (Spark/Flink) rely on accumulators, and when they are 
> > > > > > > merged, it triggers the merging of the whole
> > > > > > > chain down to the metric cells. I know that Dataflow does not do the 
> > > > > > > same: it uses non-aggregated metrics and
> > > > > > > sends them to an aggregation service. Will there be a change of 
> > > > > > > paradigm with portability for runners that
> > > > > > > do the merging themselves?
> > > > > 
> > > > > There will be local aggregation of metrics scoped to a bundle; after 
> > > > > the bundle is finished processing they
> > > > > are discarded. This will require some kind of global aggregation 
> > > > > support from a runner, whether that runner
> > > > > does it via accumulators or via an aggregation service is up to the 
> > > > > runner.
> > > > > > 3 - Please confirm that the distinction between attempted and 
> > > > > > committed metrics is not the business of
> > > > > > portable metrics. Indeed, it does not involve communication between 
> > > > > > the runner harness and the SDK harness
> > > > > > > as it is a runner-only matter. I mean, when a runner commits a 
> > > > > > > bundle it just updates its committed metrics
> > > > > > > and does not need to inform the SDK harness. But, of course, when the 
> > > > > > user requests committed metrics through
> > > > > > the SDK, then the SDK harness will ask the runner harness to give 
> > > > > > them.
> > > > > > 
> > > > > > 
> > > > > >  You are correct in saying that during execution, the SDK does not 
> > > > > > differentiate between attempted and
> > > > > > committed metrics and only the runner does. We still lack an API 
> > > > > > definition and contract for how an SDK would
> > > > > > query for metrics from a runner, but you're right in saying that an SDK 
> > > > > > could request committed metrics and the
> > > > > > runner would supply them somehow. 
> > > > > > Thanks
> > > > > > > Best
> > > > > > > Etienne
> > > > > > 
> > > > > >
