[ 
https://issues.apache.org/jira/browse/BEAM-3926?focusedWorklogId=104882&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-104882
 ]

ASF GitHub Bot logged work on BEAM-3926:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 22/May/18 23:54
            Start Date: 22/May/18 23:54
    Worklog Time Spent: 10m 
      Work Description: robertwb commented on a change in pull request #5437: 
[BEAM-3926] Add new metrics protos based on "Defining and adding SDK Metrics" 
htt…
URL: https://github.com/apache/beam/pull/5437#discussion_r190088024
 
 

 ##########
 File path: model/fn-execution/src/main/proto/beam_fn_api.proto
 ##########
 @@ -257,6 +262,122 @@ message ProcessBundleProgressRequest {
   string instruction_reference = 1;
 }
 
+message MonitoringInfo {
+  // The name defining the metric or monitored state.
+  string urn = 1;
+
+  // This is specified as a URN that implies:
+  // A message class: (Distribution, Counter, Extrema, MonitoringDataTable).
+  // Sub types like field formats - int64, double, string.
+  // Aggregation methods - SUM, LATEST, TOP-N, BOTTOM-N, DISTRIBUTION
+  // valid values are:
+  // beam:metrics:[SumInt64|LatestInt64|Top-NInt64|Bottom-NInt64|
+  //     SumDouble|LatestDouble|Top-NDouble|Bottom-NDouble|DistributionInt64|
+  //     DistributionDouble|MonitoringDataTable]
+  string type = 2;
+
+  // The Metric or monitored state.
+  oneof monitoring_status {
+    MonitoringTableData monitored_table_data = 3;
+    Metric metric = 4;
+  }
+
+  // A set of key+value labels which define the scope of the metric.
+  // Either a well defined entity id for the keys:
+  // “transform”, “pcollection”, “windowing_strategy”,
+  // “coder”, “environment” or any arbitrary label
+  // set by a custom metric or user metric.
+  // A monitoring system is expected to be able to aggregate the metric together
+  // for all updates having the same URN and labels.
+  // Some systems such as Stackdriver will be able to aggregate the metric
+  // using a subset of the provided labels
+  map<string, string> labels = 5;
+}
+
+message Metric {
+  // (Required) The data for this metric.
+  oneof data {
+    CounterData counter_data = 1;
+    DistributionData distribution_data = 2;
+    Extrema extrema_data = 3;
+  }
+}
+
+// Data associated with a Counter or Gauge metric.
+// This is designed to be compatible with metric collection
+// systems such as DropWizard.
+message CounterData {
+   oneof value {
+     int64 int64_value = 1;
+     string string_value = 2;
+     double double_value = 3;
+   }
+}
+
+// Extrema messages are used for calculating
+// Top-N/Bottom-N metrics.
+message Extrema {
+  // Only one of the two should be specified.
+  // Note: oneof is not allowed on repeated fields.
+  repeated int64 int_values = 1;
+  repeated double double_values = 2;
 
 Review comment:
   Top and bottom strings make sense as well. (Actually, one of the most useful extrema is MostFrequent.)
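
   As a rough illustration of the semantics under discussion, the sketch below shows (a) combining updates that share the same URN and labels, as the `labels` comment in the diff describes, and (b) Top-N, Bottom-N and MostFrequent over string values, as the review comment suggests. It is not Beam code: the function names, the tuple layout of an update, and the URN `example:element_count` are all assumptions made up for this example.

```python
from collections import Counter, defaultdict
import heapq


def label_key(urn, labels):
    """Build a hashable key from a URN and its label map (hypothetical helper)."""
    return (urn, tuple(sorted(labels.items())))


def aggregate_sum(updates):
    """Sum values of all updates that share the same URN and labels.

    `updates` is an iterable of (urn, labels, value) tuples; this layout is an
    assumption for the sketch, not the wire format from the proto.
    """
    totals = defaultdict(int)
    for urn, labels, value in updates:
        totals[label_key(urn, labels)] += value
    return dict(totals)


def top_n(values, n):
    """Largest n values; works for strings as well as numbers."""
    return heapq.nlargest(n, values)


def bottom_n(values, n):
    """Smallest n values; works for strings as well as numbers."""
    return heapq.nsmallest(n, values)


def most_frequent(values, n=1):
    """The n values that occur most often."""
    return [value for value, _ in Counter(values).most_common(n)]


updates = [
    ("example:element_count", {"transform": "t1"}, 3),
    ("example:element_count", {"transform": "t1"}, 4),
    ("example:element_count", {"transform": "t2"}, 5),
]
totals = aggregate_sum(updates)
```

   Here `totals` holds one entry per distinct (URN, labels) pair, so the two `t1` updates are merged while `t2` stays separate. Sorting the label items before building the key makes the result independent of the order in which labels were set.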

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 104882)
    Time Spent: 1h 10m  (was: 1h)

> Support MetricsPusher in Dataflow Runner
> ----------------------------------------
>
>                 Key: BEAM-3926
>                 URL: https://issues.apache.org/jira/browse/BEAM-3926
>             Project: Beam
>          Issue Type: Sub-task
>          Components: runner-dataflow
>            Reporter: Scott Wegner
>            Assignee: Pablo Estrada
>            Priority: Major
>          Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> See [relevant email 
> thread|https://lists.apache.org/thread.html/2e87f0adcdf8d42317765f298e3e6fdba72917a72d4a12e71e67e4b5@%3Cdev.beam.apache.org%3E].
>  From [~echauchot]:
>   
> _AFAIK Dataflow being a cloud hosted engine, the related runner is very 
> different from the others. It just submits a job to the cloud hosted engine. 
> So, no access to metrics container etc... from the runner. So I think that 
> the MetricsPusher (component responsible for merging metrics and pushing them 
> to a sink backend) must not be instantiated in DataflowRunner otherwise it 
> would be more a client (driver) piece of code and we will lose all the 
> interest of being close to the execution engine (among other things 
> instrumentation of the execution of the pipelines).  I think that the 
> MetricsPusher needs to be instantiated in the actual Dataflow engine._
>  
>   



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
