Hello Brian,

Jaeger would be a good choice because it is very common (almost the standard
with OpenTelemetry).
Have you looked at OpenLineage (https://openlineage.io/)? It might be of
interest.

Thanks
Uwe

> On 23.05.2023 at 04:57, Brian Putt <puttbr...@gmail.com> wrote:
>
> Hello Joe / All,
>
> Jaeger or Grafana (w/ Tempo) offer comparable tools to visualize the trace
> data. I believe additional tools will be needed to get the most out of the
> trace data. We've been experimenting with a number of open source products
> to see what works best for the amount of trace data that NiFi emits. So
> far, Grafana Tempo, VictoriaMetrics, and ClickHouse seem to offer a good
> set of features to cover searching / viewing the traces along with
> summarizing certain flowfile attributes. As long as the trace data is in
> OTEL's format, the collector offers flexibility in exporting the data to a
> number of services with ease.
>
> I would expect a PR to OTEL's Java auto-instrumentation project over the
> next few months that adds NiFi to its list of instrumentations. If the NiFi
> committers would like a demo / tech exchange to go over the current state
> of the tracing agent, we'd be happy to accommodate. As it stands, the agent
> utilizes flowfile attributes to pass along the tracestate so trace
> propagation can occur across NiFi to NiFi boundaries.
>
> Thanks,
>
> Brian
>
>> On Wed, May 17, 2023 at 1:05 PM Joe Witt <joe.w...@gmail.com> wrote:
>>
>> Brian Putt, All
>>
>> Are you aware of any good tools/services that can ingest the traces and
>> provide an interesting view/story/reporting on it?
>>
>> I could see us emitting OTEL events instead of our current provenance
>> mechanism and using that both internally to do what we already do but also
>> have a clear/spec friendly way of exporting it to others.
>>
>> Thanks
>>
>> On Sat, Jul 30, 2022 at 7:43 AM u...@moosheimer.com <u...@moosheimer.com>
>> wrote:
>>
>>> Hello Brian, Bryan, Greg, NiFi devs,
>>>
>>> Integrating OpenTelemetry is a very good idea, especially since the major
>>> cloud providers also rely on it. This could also be interesting for
>>> Stateless NiFi.
>>>
>>> I have a suggestion that I would like to put up for discussion.
>>>
>>> Would it be useful to make a list of what extensions or new development
>>> would be helpful for a complete integration of OpenTelemetry?
>>>
>>> I'm thinking of ConsumeMQTT and PublishMQTT, for example. Currently these
>>> support MQTT only up to version 3.1.1, but version 5 introduced User
>>> Properties, which are similar to the HTTP header fields.
>>> Thus one could implement OpenTelemetry in the MQTT processors much as
>>> in HTTP.
>>>
>>> With a list we could make an overview of the "necessary" adjustments and
>>> advertise for support.
>>>
>>> If what I write is nonsense, then I may not have understood something and
>>> I take it all back :)
>>>
>>> Best regards
>>> Kay-Uwe Moosheimer
>>>
>>>> On 29.07.2022 at 05:09, Brian Putt <puttbr...@gmail.com> wrote:
>>>>
>>>> Hello Bryan / Greg / NiFi devs,
>>>>
>>>> Distributed tracing (DT) is similar to provenance in that it shows the
>>>> path a particular flowfile travels, but its core selling point is that it
>>>> supports tracing across multiple systems/services regardless of what's
>>>> receiving the data. Provenance is a fantastic feature and there are
>>>> instances where one might want to draw that bigger picture of identifying
>>>> bottlenecks as data flows from one system to another and that system
>>>> may/may not be using NiFi.
>>>>
>>>> DT utilizes three ids: traceId, parentId, and spanId. While a tree can be
>>>> built using two ids, the third id (traceId) helps bring all of the
>>>> relevant information out of a datastore more easily.
>>>> DT is focused more on performance and identifying bottlenecks in one or
>>>> more systems. Imagine if NiFi were receiving data from various sources
>>>> (e.g. HTTP, Kafka, SQS) and NiFi egressed to other destinations (HTTP,
>>>> Kafka, NiFi).
>>>> DT provides a spec that we'd be able to follow and correlate the data as
>>>> it traverses from system to system. Each system that participates in the
>>>> DT ecosystem would simply emit information (a trace is made up of one or
>>>> more spans) and there'd be a collection system which would aggregate all
>>>> of these spans and would draw a bigger picture of the path that data went
>>>> through and could help identify key bottlenecks.
>>>>
>>>> OpenTelemetry (OTEL) provides clients (across many languages, including
>>>> Java) where developers can instrument their library's APIs and
>>>> participate in a DT ecosystem as it adheres to the tracing spec. Egressing
>>>> trace data is possible without using OTEL, but then we may find ourselves
>>>> having to reinvent the wheel, although the result could be optimized for
>>>> NiFi.
>>>>
>>>> Creating a reporting task could certainly be a path, but I have a few
>>>> concerns with that:
>>>>
>>>> 1. If provenance is disabled, will provenance events still be emitted and
>>>> be collected by a new reporting task?
>>>> 2. There'll be an impact on performance; how much is unknown. OTEL is
>>>> gaining traction across industry and there are ways to mitigate the
>>>> performance impact, mainly sampling and the fact that *tracing is best
>>>> effort*. Spans would be emitted from NiFi via UDP to a collector on the
>>>> same network.
>>>> 3. Would there be any issues with appending a flowfile attribute that is
>>>> carried throughout the flow where it maintains the traceId, parentSpanId,
>>>> and trace flags? See below for more details.
>>>>
>>>> There's a W3C spec (Trace Context) which includes a formatted string that
>>>> would be propagated to services (HTTP, Kafka, etc...). So if NiFi were to
>>>> put information onto Kafka, any consumers of that data would be able to
>>>> continue the trace and help draw the bigger picture.
>>>>
>>>> W3C Spec: https://www.w3.org/TR/trace-context/#traceparent-header
>>>>
>>>> For #2, since DT is focused on performance, sampling can help alleviate
>>>> chatter over the wire and ideally, 0.01% would draw the same picture as
>>>> 1% or 10%+. This is certainly different from provenance as DT is focused
>>>> on performance over quality of the data and should not be thought of as
>>>> auditing.
>>>>
>>>> https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/trace/sdk.md#sampler
>>>>
>>>>> On Thu, Jul 28, 2022 at 5:01 PM Bryan Bende <bbe...@gmail.com> wrote:
>>>>>
>>>>> Hi Greg,
>>>>>
>>>>> I don't really know anything about OpenTelemetry, but from the
>>>>> perspective of integrating something into the framework, some things
>>>>> to consider...
>>>>>
>>>>> Is there some way to piggy-back on provenance and use a ReportingTask
>>>>> to process provenance events and report something to OpenTelemetry?
>>>>>
>>>>> If something new does need to be added, it should probably be an
>>>>> extension point where there is an interface in the framework-api and
>>>>> different implementations can be plugged in.
>>>>> Ideally the framework itself wouldn't have any knowledge of
>>>>> OpenTelemetry specifically, it would only be reporting some
>>>>> information, which could then be used in some way by the OpenTelemetry
>>>>> implementation.
>>>>>
>>>>> How does NiFi actually communicate with OpenTelemetry? Are you
>>>>> expecting to send data to OpenTelemetry in this new method you are
>>>>> suggesting?
>>>>> That would likely have a significant impact on the performance of the
>>>>> flow.
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Bryan
>>>>>
>>>>>> On Thu, Jul 28, 2022 at 3:17 PM glma...@uwe.nsa.gov <glma...@uwe.nsa.gov>
>>>>>> wrote:
>>>>>>
>>>>>> NiFi Devs,
>>>>>>
>>>>>> My team and I are looking for guidance on how we can extend Apache
>>>>>> NiFi's capabilities. Specifically we're looking to include distributed
>>>>>> tracing.
>>>>>> We'll approach this effort as if we're the tracing experts and
>>>>>> simply seeking implementation guidance. Our developers have good
>>>>>> exposure to working with NiFi and creating custom processors. We plan to
>>>>>> fork the project to begin this effort but want to make sure we approach
>>>>>> this with the best possible direction for community adoption.
>>>>>>
>>>>>> Our initial thoughts on this approach would be to piggyback on how
>>>>>> Provenance was implemented. We essentially want to include a subroutine
>>>>>> or method that gets implicitly invoked upon a processor's 'onTrigger'
>>>>>> method. From there we would analyze the FlowFile's attributes to check
>>>>>> for the existence of 'traceId' and/or propagate one if found.
>>>>>>
>>>>>> We can expound upon all of these tracing/observability details if that
>>>>>> helps by any means. We're able to provide a more detailed scope of this
>>>>>> task as well, but for now we just want to get feedback on our overall
>>>>>> goal and proposed approach.
>>>>>>
>>>>>> Thanks,
>>>>>> Greg Marshall
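
For readers following the thread, here is a minimal sketch of the idea
discussed above: check a flowfile's attributes for an existing trace context
and either continue that trace or start a new one, using the W3C traceparent
layout (version-traceid-parentid-flags). It is written in Python only for
brevity and is not the tracing agent mentioned in the thread. The attribute
name "traceparent" and the helper names parse_traceparent and
next_traceparent are assumptions made for illustration, and the parser is
deliberately strict (it accepts only the four-field layout).

```python
import re
import secrets

# W3C Trace Context "traceparent": version-traceid-parentid-flags, lowercase hex.
TRACEPARENT_RE = re.compile(r"([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})")


def parse_traceparent(value):
    """Return (trace_id, parent_span_id, flags), or None if the value is invalid."""
    match = TRACEPARENT_RE.fullmatch(value or "")
    if match is None:
        return None
    version, trace_id, span_id, flags = match.groups()
    # The spec forbids version "ff" and all-zero trace or span ids.
    if version == "ff" or trace_id == "0" * 32 or span_id == "0" * 16:
        return None
    return trace_id, span_id, int(flags, 16)


def next_traceparent(attributes, sampled=True):
    """Continue the trace found in the attributes, or start a new one.

    `attributes` is a plain dict standing in for flowfile attributes
    (hypothetical; the attribute key "traceparent" is an assumption).
    """
    parsed = parse_traceparent(attributes.get("traceparent"))
    if parsed is not None:
        # Keep the incoming trace id and flags; only the span id changes.
        trace_id, _parent_span_id, flags = parsed
    else:
        # No usable context: start a new trace with a random 16-byte id.
        trace_id = secrets.token_hex(16)
        flags = 0x01 if sampled else 0x00
    span_id = secrets.token_hex(8)  # new 8-byte id for the current span
    return f"00-{trace_id}-{span_id}-{flags:02x}"
```

A processor hook could then write the returned string back to the same
attribute before the flowfile is transferred, so that a downstream NiFi
instance or Kafka consumer can pick up the trace where it left off.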