ferruzzi commented on code in PR #37948:
URL: https://github.com/apache/airflow/pull/37948#discussion_r1516714159
##########
airflow/config_templates/config.yml:
##########
@@ -1130,6 +1130,56 @@ metrics:
type: string
example: ~
default: "False"
+traces:
+ description: |
+ Distributed traces integration settings.
+ options:
+ otel_on:
+ description: |
+ Enables sending traces to OpenTelemetry.
+ version_added: 2.9.0
+ type: string
+ example: ~
+ default: "False"
+ otel_host:
+ description: |
+ Specifies the hostname or IP address of the OpenTelemetry Collector to
which Airflow sends
+ traces.
Review Comment:
If you are using OTel for metrics and traces, how often will the host/port
be different? Does it make sense to reuse the same host and port that is set
in the metrics section above? Perhaps adding some logic in get_otel_tracer
where this is imported
[[here](https://github.com/apache/airflow/pull/37948/files#diff-33c40b9d4b606572f96dc46b44de7523314373a26a93bd1c82bcf5650599e3e2R284)]
that if this is not set, then try looking for metrics.otel_host, and note here
that that is the fallback? (same comment goes for the other new config options
where this may make sense)
##########
airflow/traces/otel_tracer.py:
##########
@@ -0,0 +1,333 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied. See the License for the
+# specific language governing permissions and limitations
+# under the License.
+from __future__ import annotations
+
+import logging
+import random
+
+from opentelemetry import trace
+from opentelemetry.context import create_key
+from opentelemetry.exporter.otlp.proto.http.trace_exporter import
OTLPSpanExporter
+from opentelemetry.sdk.resources import HOST_NAME, SERVICE_NAME, Resource
+from opentelemetry.sdk.trace import Span, Tracer as OpenTelemetryTracer,
TracerProvider
+from opentelemetry.sdk.trace.export import BatchSpanProcessor,
ConsoleSpanExporter
+from opentelemetry.sdk.trace.id_generator import IdGenerator
+from opentelemetry.trace import Link, NonRecordingSpan, SpanContext,
TraceFlags, Tracer
+from opentelemetry.trace.span import INVALID_SPAN_ID, INVALID_TRACE_ID
+
+from airflow.configuration import conf
+from airflow.traces import (
+ TRACEPARENT,
+ TRACESTATE,
+)
+from airflow.traces.utils import (
+ gen_dag_span_id,
+ gen_span_id,
+ gen_trace_id,
+ parse_traceparent,
+ parse_tracestate,
+)
+from airflow.utils.net import get_hostname
+
+log = logging.getLogger(__name__)
+
+_NEXT_ID = create_key("next_id")
+
+
+class OtelTrace:
+ """
+ OpenTelemetry Tracing Class.
+
+ Handles all tracing requirements such as getting the tracer, and starting
a new span.
+ When OTEL is enabled, the Trace class will be replaced by this class.
+ """
Review Comment:
I'm surprised this docstring passes the mypy tests. This appears to violate
mypy rule D401 and should maybe be:
```suggestion
"""
Handle all tracing requirements such as getting the tracer, and starting
a new span.
When OTEL is enabled, the Trace class will be replaced by this class.
"""
```
The rule is somewhat opaque and confusing, but essentially a docstring
should not tell me what I am looking at, it should tell the class or method
what to do.
"OpenTelemetry Tracing Class." would be dropped because I can tell that by
the fact that I am looking at a class called OtelTrace. And so on.
##########
scripts/ci/docker-compose/integration-otel.yml:
##########
@@ -54,14 +54,31 @@ services:
- ./grafana/volume/dashboards:/grafana/dashboards
- ./grafana/volume/provisioning:/grafana/provisioning
+ jaeger:
+ image: jaegertracing/all-in-one
+ container_name: "breeze-jaeger"
+ environment:
+ COLLECTOR_OTLP_ENABLED: true
+ COLLLECTOR_ZIPKIN_HOST_PORT: :9411
Review Comment:
Just want to confirm that the colon is intended to be there? It looks
suspicious.
##########
airflow/traces/otel_tracer.py:
##########
@@ -0,0 +1,333 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied. See the License for the
+# specific language governing permissions and limitations
+# under the License.
+from __future__ import annotations
+
+import logging
+import random
+
+from opentelemetry import trace
+from opentelemetry.context import create_key
+from opentelemetry.exporter.otlp.proto.http.trace_exporter import
OTLPSpanExporter
+from opentelemetry.sdk.resources import HOST_NAME, SERVICE_NAME, Resource
+from opentelemetry.sdk.trace import Span, Tracer as OpenTelemetryTracer,
TracerProvider
+from opentelemetry.sdk.trace.export import BatchSpanProcessor,
ConsoleSpanExporter
+from opentelemetry.sdk.trace.id_generator import IdGenerator
+from opentelemetry.trace import Link, NonRecordingSpan, SpanContext,
TraceFlags, Tracer
+from opentelemetry.trace.span import INVALID_SPAN_ID, INVALID_TRACE_ID
+
+from airflow.configuration import conf
+from airflow.traces import (
+ TRACEPARENT,
+ TRACESTATE,
+)
+from airflow.traces.utils import (
+ gen_dag_span_id,
+ gen_span_id,
+ gen_trace_id,
+ parse_traceparent,
+ parse_tracestate,
+)
+from airflow.utils.net import get_hostname
+
+log = logging.getLogger(__name__)
+
+_NEXT_ID = create_key("next_id")
+
+
+class OtelTrace:
+ """
+ OpenTelemetry Tracing Class.
+
+ Handles all tracing requirements such as getting the tracer, and starting
a new span.
+ When OTEL is enabled, the Trace class will be replaced by this class.
+ """
+
+ def __init__(self, span_exporter: ConsoleSpanExporter | OTLPSpanExporter,
tags=None):
+ self.span_exporter = span_exporter
+ self.span_processor = BatchSpanProcessor(self.span_exporter)
+ self.tags = tags
+ self.otel_service = conf.get("traces", "otel_service")
+
+ def get_tracer(self, component: str) -> OpenTelemetryTracer | Tracer:
+ """Get tracer from a given component."""
+ resource = Resource(attributes={HOST_NAME: get_hostname(),
SERVICE_NAME: self.otel_service})
+ tracer_provider = TracerProvider(resource=resource)
+ # span_processor = BatchSpanProcessor(self.span_exporter)
Review Comment:
Presumably this should be removed?
##########
airflow/traces/otel_tracer.py:
##########
@@ -0,0 +1,333 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied. See the License for the
+# specific language governing permissions and limitations
+# under the License.
+from __future__ import annotations
+
+import logging
+import random
+
+from opentelemetry import trace
+from opentelemetry.context import create_key
+from opentelemetry.exporter.otlp.proto.http.trace_exporter import
OTLPSpanExporter
+from opentelemetry.sdk.resources import HOST_NAME, SERVICE_NAME, Resource
+from opentelemetry.sdk.trace import Span, Tracer as OpenTelemetryTracer,
TracerProvider
+from opentelemetry.sdk.trace.export import BatchSpanProcessor,
ConsoleSpanExporter
+from opentelemetry.sdk.trace.id_generator import IdGenerator
+from opentelemetry.trace import Link, NonRecordingSpan, SpanContext,
TraceFlags, Tracer
+from opentelemetry.trace.span import INVALID_SPAN_ID, INVALID_TRACE_ID
+
+from airflow.configuration import conf
+from airflow.traces import (
+ TRACEPARENT,
+ TRACESTATE,
+)
+from airflow.traces.utils import (
+ gen_dag_span_id,
+ gen_span_id,
+ gen_trace_id,
+ parse_traceparent,
+ parse_tracestate,
+)
+from airflow.utils.net import get_hostname
+
+log = logging.getLogger(__name__)
+
+_NEXT_ID = create_key("next_id")
+
+
+class OtelTrace:
+ """
+ OpenTelemetry Tracing Class.
+
+ Handles all tracing requirements such as getting the tracer, and starting
a new span.
+ When OTEL is enabled, the Trace class will be replaced by this class.
+ """
+
+ def __init__(self, span_exporter: ConsoleSpanExporter | OTLPSpanExporter,
tags=None):
+ self.span_exporter = span_exporter
+ self.span_processor = BatchSpanProcessor(self.span_exporter)
+ self.tags = tags
+ self.otel_service = conf.get("traces", "otel_service")
Review Comment:
Please add type hinting. It may be less important for local variables in
the methods, but I think it's extremely helpful in an init method for a class.
##########
airflow/traces/otel_tracer.py:
##########
@@ -0,0 +1,333 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied. See the License for the
+# specific language governing permissions and limitations
+# under the License.
+from __future__ import annotations
+
+import logging
+import random
+
+from opentelemetry import trace
+from opentelemetry.context import create_key
+from opentelemetry.exporter.otlp.proto.http.trace_exporter import
OTLPSpanExporter
+from opentelemetry.sdk.resources import HOST_NAME, SERVICE_NAME, Resource
+from opentelemetry.sdk.trace import Span, Tracer as OpenTelemetryTracer,
TracerProvider
+from opentelemetry.sdk.trace.export import BatchSpanProcessor,
ConsoleSpanExporter
+from opentelemetry.sdk.trace.id_generator import IdGenerator
+from opentelemetry.trace import Link, NonRecordingSpan, SpanContext,
TraceFlags, Tracer
+from opentelemetry.trace.span import INVALID_SPAN_ID, INVALID_TRACE_ID
+
+from airflow.configuration import conf
+from airflow.traces import (
+ TRACEPARENT,
+ TRACESTATE,
+)
+from airflow.traces.utils import (
+ gen_dag_span_id,
+ gen_span_id,
+ gen_trace_id,
+ parse_traceparent,
+ parse_tracestate,
+)
+from airflow.utils.net import get_hostname
+
+log = logging.getLogger(__name__)
+
+_NEXT_ID = create_key("next_id")
+
+
+class OtelTrace:
+ """
+ OpenTelemetry Tracing Class.
+
+ Handles all tracing requirements such as getting the tracer, and starting
a new span.
+ When OTEL is enabled, the Trace class will be replaced by this class.
+ """
+
+ def __init__(self, span_exporter: ConsoleSpanExporter | OTLPSpanExporter,
tags=None):
+ self.span_exporter = span_exporter
+ self.span_processor = BatchSpanProcessor(self.span_exporter)
+ self.tags = tags
+ self.otel_service = conf.get("traces", "otel_service")
+
+ def get_tracer(self, component: str) -> OpenTelemetryTracer | Tracer:
+ """Get tracer from a given component."""
+ resource = Resource(attributes={HOST_NAME: get_hostname(),
SERVICE_NAME: self.otel_service})
+ tracer_provider = TracerProvider(resource=resource)
+ # span_processor = BatchSpanProcessor(self.span_exporter)
+ tracer_provider.add_span_processor(self.span_processor)
+ tracer = tracer_provider.get_tracer(component)
+ return tracer
+
+ def get_tracer_with_id(
+ self, component: str, trace_id: int | None = None, span_id: int | None
= None
+ ) -> OpenTelemetryTracer | Tracer:
+ """Tracer that will use special AirflowOtelIdGenerator to control
producing certain span and trace id."""
+ resource = Resource(attributes={HOST_NAME: get_hostname(),
SERVICE_NAME: self.otel_service})
+ tracer_provider = TracerProvider(
+ resource=resource,
id_generator=AirflowOtelIdGenerator(span_id=span_id, trace_id=trace_id)
+ )
+ # span_processor = BatchSpanProcessor(self.span_exporter,
schedule_delay_millis=1)
Review Comment:
This should presumably be removed
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]