Hello Beamer:

We have been aware of some of the shortcomings with reporting, so we made it 
part of our batch code: we include Beam Counters for what we want to track 
(standardizing on “elements” and “errors” as the main ones), and then with the 
outcome of

result = p.run()

we call the following function, which stores a JSON representation of the 
counters for the Direct Runner, Flink Runner and Dataflow Runner in a somewhat 
standardized way. Note that we added support for POSTing to an endpoint, and 
the function returns the metrics so the caller can print them to stdout if 
desired.

# =============================================================================
#
# Report Metrics in a JSON Payload to a local path or remote post
#
# =============================================================================
def reportMetrics(result,
                  reportTarget=None,
                  counterNames=("errors", "elements")):
    import re
    import json
    import requests
    import apache_beam as beam

    # Build the JSON entry for a single counter
    def toJsonEntry(metric):
        step = metric.key.step

        # Check whether the step uses Flink Runner naming
        m = re.match(r"ref_AppliedPTransform_(.+)_\d+$", step)
        if m is not None:
            step = m.group(1).replace("-", " ")

        # Standardize on a lower-case step name
        step = step.lower().strip()

        return {
            "kind": "counter",
            "step": step,
            "name": metric.key.metric.name,
            "value": metric.committed
        }

    counters = result.metrics().query(beam.metrics.MetricsFilter())['counters']
    metrics = [
        toJsonEntry(counter)
        for counter in counters
        if counter.key.metric.name in counterNames
    ]

    # Report to an external resource
    if reportTarget is not None:
        payload = json.dumps(metrics)

        # If the target is an HTTP(S) URL, POST the payload
        if reportTarget.startswith("http"):
            requests.post(reportTarget, data=payload)
        else:
            with open(reportTarget, "w") as fd:
                fd.write(payload)

    return metrics
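
To show the shape of the payload without running a pipeline, here is a minimal 
sketch of the same normalization and entry format. The namedtuples are 
hypothetical stand-ins that only mimic the attributes read above (key.step, 
key.metric.name, committed); they are not Beam classes, and the step names and 
values are made-up samples.

```python
import json
import re
from collections import namedtuple

# Hypothetical stand-ins mimicking only the attributes reportMetrics reads;
# these are NOT Beam classes.
MetricName = namedtuple("MetricName", ["namespace", "name"])
MetricKey = namedtuple("MetricKey", ["step", "metric"])
MetricResult = namedtuple("MetricResult", ["key", "committed"])


def normalize_step(step):
    """Strip the Flink-style prefix/suffix, then lower-case and trim."""
    m = re.match(r"ref_AppliedPTransform_(.+)_\d+$", step)
    if m is not None:
        step = m.group(1).replace("-", " ")
    return step.lower().strip()


def to_json_entry(metric):
    """Build one counter entry in the same shape as toJsonEntry above."""
    return {
        "kind": "counter",
        "step": normalize_step(metric.key.step),
        "name": metric.key.metric.name,
        "value": metric.committed,
    }


# Made-up sample counters: one with Flink-style naming, one without.
samples = [
    MetricResult(
        MetricKey("ref_AppliedPTransform_Parse-Rows_12",
                  MetricName("batch", "elements")),
        1000),
    MetricResult(
        MetricKey("Parse Rows", MetricName("batch", "errors")),
        3),
]

payload = json.dumps([to_json_entry(s) for s in samples], indent=2)
print(payload)
```

Both sample step names normalize to the same value, which is what lets the 
reports from different runners be compared.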



It sure would be nice to have a standardized way to get this out of the box.




From: LDesire <[email protected]>
Sent: Friday, March 15, 2024 5:30 AM
To: [email protected]
Subject: [EXTERNAL] [QUESTION] about Metric REST spec




Hello Apache Beam community.

I'm reading the programming guide in the official documentation, and I'm 
looking at 
<https://beam.apache.org/documentation/programming-guide/#export-metrics>

There it states the following:



"""

As for now only the REST HTTP and the Graphite sinks are supported and only 
Flink and Spark runners support metrics export.

"""

Is there any specification for this REST?



Thank you.


