> Also, I do have feedback that the current list of metrics, and what they
> track, is not really that useful

Fully agree.

> (I mean, there is so much that one can do for metrics like operator failures 
> and ti failures - since they don’t post any context specific information) - 
> so while we may be working with making OpenTelemetry available for airflow, 
> we might also investigate and try improvements on reviewing these metrics and 
> really verify whether these metrics are helpful, and if there can be 
> additional metrics that we can instrument while doing this.

Oh yeah.

> I think when we are designing distributed traces for Airflow, we
> should also work on defining what kinds of traces would be useful and how to
> come up with a better naming convention, to make things clear and easy to
> understand.

Absolutely! I think we have a very clear "separation" of work that is
actually "complementary", and we should indeed do it together!

1) In the "internship project" that we are doing together with Melody,
the focus is more on the engineering side: "how we can easily
integrate OpenTelemetry" with Airflow, seamlessly, in a modular
fashion, and in a way that is easy to use and test in a "development
environment". It is more about solving all the engineering obstacles
of the integration (for example, what we are learning now is that
OpenTelemetry requires some custom code to account for a "forking"
model). It is also about exposing a lot of low-level metrics that are
not Airflow-specific (Flask, DB access, etc.: something that really
allows debugging "any" application deployment, not only Airflow).
Then we thought about simply adding the "current" metrics that we
have in statsd as custom ones.

2) And I understand that your focus is more on "how we can actually
make a really useful set of Airflow metrics", which ideally
complements the "engineering" part: once we get OT fully integrated,
we can add not only (or maybe even not at all) the current metrics
but, once you help define "better" metrics, we can simply implement
them in OT, including some example dashboards, etc.
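
To make the "forking" point from 1) a bit more concrete, here is a
minimal sketch, assuming the obstacle is that an exporter's background
thread and buffers do not carry over cleanly into a forked child.
Exporter, init_telemetry and get_exporter are hypothetical names for
illustration, not the OpenTelemetry API; only os.register_at_fork is a
real standard-library call (Python 3.7+, Unix only).

```python
import os

_exporter = None  # one telemetry pipeline per process


class Exporter:
    """Hypothetical stand-in for an exporter that owns a flush thread."""

    def __init__(self):
        self.owner_pid = os.getpid()  # process that created this exporter
        self.pending = []             # metrics buffered but not yet sent


def init_telemetry():
    """Build (or rebuild) the exporter for the current process."""
    global _exporter
    _exporter = Exporter()
    return _exporter


def get_exporter():
    """Return the exporter that belongs to the current process."""
    return _exporter


init_telemetry()

# Threads are not copied by fork(), so each child rebuilds its own
# pipeline instead of reusing the parent's half-alive one.
if hasattr(os, "register_at_fork"):
    os.register_at_fork(after_in_child=init_telemetry)
```

With a hook like this, any process forked after import starts with a
fresh exporter and an empty buffer, so nothing buffered in the parent
gets sent twice from the child.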

Happy to collaborate on that!

J.


> - Howard
>
