I like the vision and the effect - even though we do not have much
experience with OTEL, I saw what it can do and I like what it "provides".
And I found it very appealing when I saw it in "action" during some talks.

Maybe it would be useful to see a somewhat realistic example of what can be
done with it? For instance, showing what a user could do in a DAG, with
screenshots (or mock-ups) of how the result could look? I think that would
give others a quick way of assessing what users will get out of it.
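
To make that concrete, here is a rough sketch of the sort of thing a user
might put in a task. It uses only the plain OpenTelemetry Python API
(the `opentelemetry-api` package), not any actual provider interface, since
that is not defined yet. `custom_span`, `load_rows` and the attribute names
are made up for illustration, and the fallback exists only so the sketch runs
without OTEL installed.

```python
"""Sketch: a task callable that adds its own span, attributes and event."""
from contextlib import contextmanager

try:
    from opentelemetry import trace

    # The tracer name is arbitrary; without an SDK configured this is a no-op.
    _tracer = trace.get_tracer("example_dag")
except ImportError:  # keeps the sketch runnable without opentelemetry-api
    _tracer = None


@contextmanager
def custom_span(name, attributes):
    """Hypothetical helper: open a span called `name` with `attributes`."""
    if _tracer is None:
        yield None
        return
    with _tracer.start_as_current_span(name) as span:
        for key, value in attributes.items():
            span.set_attribute(key, value)
        yield span


def load_rows(rows):
    """Hypothetical task body: drops empty rows and records the counts."""
    with custom_span("load_rows", {"rows.total": len(rows)}) as span:
        loaded = [row for row in rows if row is not None]
        if span is not None:
            span.set_attribute("rows.loaded", len(loaded))
            span.add_event(
                "load finished", {"rows.skipped": len(rows) - len(loaded)}
            )
        return loaded
```

A function like `load_rows` could be called from any Python task. Whether its
span ends up nested under the task instance span would depend on how the
provider and AIP-49 propagate context, which this sketch does not cover.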

But I believe there is no "fundamental" issue with adding an OTEL provider
- as this is an open industry standard, popular and getting even more
popular nowadays. I like how it fits into the "Airflow as a platform"
approach - especially since it will be a provider, not a "core" modification.

J.


On Wed, Jul 31, 2024 at 9:28 PM Howard Yoo <howard...@gmail.com> wrote:

> As part of AIP-49, task logs are actually included, as span events.
> This can be turned on or off; when it is enabled, the content of the task
> log is included as span events and emitted.
> Since task logs can be quite large, the log contents can be truncated as
> necessary, depending on the OTEL backend.
>
> An OTEL-instrumented DAG would create a trace of the DAG run, based on
> which task instances got invoked and executed, and when. Much like the
> flow chart in Airflow, this lets you monitor the DAG's execution via any
> OpenTelemetry-compatible backend. Aside from the DAG run, there will be a
> span link that connects a particular task instance run with the
> instance (or loop) of the scheduler, so that you would also know the
> relationship between what was happening on the scheduler's side and the DAG
> run itself - connected together. Instrumentation would also allow users to
> attach custom attributes or spans to this DAG run graph
> (which can be achieved easily using the OTEL provider), so that they can
> see not only the OOTB DAG run graph, but also have their own spans and
> instrumentation be part of it, making the monitoring richer.
>
> With this data, the hope is that in larger Airflow environments it will
> be much easier to monitor failures, incidents, and performance issues,
> since traces, metrics, and logs can be collected in a single place and
> easily correlated.
>
> On Thu, Jul 25, 2024 at 5:45 PM Vikram Koka <vik...@astronomer.io.invalid>
> wrote:
>
> > Howard,
> >
> > I am intrigued by this, but unclear on what this would actually look like
> > and what benefits it would add.
> >
> > Specifically, I believe that AIP-49 adds support for OTEL emission of
> > metrics and traces, but NOT task logs from Airflow.
> >
> > I am probably being dense here, but I don't quite understand what the
> > OTEL instrumented DAG would look like, and how/when the instrumentation
> > would be utilized. Can you please elaborate on this?
> >
> > Best regards,
> > Vikram
> >
> >
> > On Wed, Jul 24, 2024 at 12:51 PM Kaxil Naik <kaxiln...@gmail.com> wrote:
> >
> > > @howard: What sort of Operators or Hooks are you planning for the OTEL
> > > provider?
> > >
> > > I am in favour of deeper integration for OTEL and Airflow but I don't know
> > > what Operators, Hooks or other things will be part of the provider.
> > >
> > > Regards,
> > > Kaxil
> > >
> > > On Wed, 24 Jul 2024 at 01:00, Jarek Potiuk <ja...@potiuk.com> wrote:
> > >
> > > > +1. I like the idea of how it will add a possibility to customize OTEL
> > > > metrics and spans. With Airflow 2.10 I would also love to see some
> > > > guidelines and description, and maybe some kind of simple how-to on how
> > > > you can make "more" use of OTEL - for example users could use
> > > > auto-instrumentation for sqlalchemy, flask and other libraries we are
> > > > using, monitoring memory, cpu, processes etc. (this is all available
> > > > out-of-the-box in OTEL). Documentation describing a number of such
> > > > options and what users can do with them would be great - and the
> > > > provider seems to be a good place, maybe even to have some ways to
> > > > enable those things more easily.
> > > >
> > > > Maybe just loosely related - but one thing that I am particularly
> > > > looking forward to is the ability for our users to make a snapshot of
> > > > a problem they see (with traces) and send it to us. I know Jaeger has
> > > > such an option, and I saw what you can do, especially if you capture a
> > > > lot of information. This would be a way around what we always tell all
> > > > our users: "But I have no way to inspect your system - so I can't tell
> > > > you what is wrong". Having such a snapshot that you can load locally,
> > > > especially with a lot of auto-instrumentation enabled, might be a
> > > > fantastic way to help our users - but in order to do that, we need to
> > > > give them some easy-to-follow instructions.
> > > >
> > > > If that would be part of the work then I am even +10 on that.
> > > >
> > > > J.
> > > >
> > > > On Tue, Jul 23, 2024 at 2:16 AM Howard Yoo <howard...@gmail.com>
> > > > wrote:
> > > >
> > > > > Hi Apache Airflow Community,
> > > > >
> > > > > I hope this message finds you well.
> > > > >
> > > > > I am writing to propose the addition of a new provider to Apache Airflow
> > > > > for OpenTelemetry (https://opentelemetry.io). OpenTelemetry is an
> > > > > emerging standard for instrumentation of services and applications, and
> > > > > has recently matured and gained huge popularity.
> > > > >
> > > > > Recently, AIP (Airflow Improvement Proposal) 49 was created to
> > > > > implement OpenTelemetry support for Apache Airflow, which will enable
> > > > > Airflow to emit metrics, traces, and task logs in OpenTelemetry
> > > > > (PRs: https://github.com/apache/airflow/pull/37948,
> > > > > https://github.com/apache/airflow/pull/40802).
> > > > >
> > > > > Since this feature is to be released soon in an upcoming Airflow
> > > > > version, having this provider will give users more means to instrument
> > > > > their DAGs. This OTEL provider can work independently from Airflow's
> > > > > OTEL implementation, as well as in conjunction with it if the feature is
> > > > > available and enabled. Any DAGs instrumented with the OTEL provider will
> > > > > work with Airflow versions that may not have OTEL support, but also
> > > > > seamlessly with Airflow versions that support OTEL, providing OTEL for
> > > > > everybody.
> > > > >
> > > > > I am willing to contribute to the development and integration effort to
> > > > > ensure a smooth and effective implementation. Please let me know if
> > > > > there are any specific guidelines or processes that I should follow to
> > > > > initiate this proposal.
> > > > >
> > > > > Thanks and regards,
> > > > > Howard Yoo
> > > > >
> > > >
> > >
> >
>
