potiuk commented on PR #40802:
URL: https://github.com/apache/airflow/pull/40802#issuecomment-2468032255

   > However, I know some of the airflow users would like to have task logs sent
   out as span events - and would see those as good value (hence the
   implementation).
   
   I see the value of it indeed, but I agree there should be a limit. I think 
that in a huge percentage of cases (9X%) the logs will be really small and very 
useful to see immediately in the OTEL span context, and for the rest it would 
still be super useful to see only the beginning of the logs to get a bit more 
context.
   
   How about having a reasonable (say 2K?) limit for the log size being sent, 
with some indication (ellipsis) that it's not complete? Maybe later we could 
also connect it with our logging framework, so that such a message could also 
contain (maybe in structured form) a link to the log message accessible in 
whatever remote logging we have configured (and, for task logs, links to the 
Airflow UI where you could see the logs from tasks).
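
   A minimal sketch of what such a limit could look like. Everything here is 
an assumption made for illustration: the function names, the attribute keys, 
the reading of "2K" as 2048 characters (not bytes), and the three-dot marker 
are hypothetical and are not existing Airflow or OpenTelemetry names.

```python
from typing import Optional

# Hypothetical defaults: "2K" is read here as 2048 characters.
DEFAULT_LIMIT = 2048
ELLIPSIS = "..."


def truncate_log(message: str, limit: int = DEFAULT_LIMIT, marker: str = ELLIPSIS) -> str:
    """Return the message unchanged if it fits, else its beginning plus a marker."""
    if limit < len(marker):
        raise ValueError("limit must be at least as long as the marker")
    if len(message) <= limit:
        return message
    # Reserve room for the marker so the result never exceeds `limit`.
    return message[: limit - len(marker)] + marker


def span_event_attributes(
    message: str,
    log_url: Optional[str] = None,
    limit: int = DEFAULT_LIMIT,
) -> dict:
    """Build flat attributes for a span event carrying a possibly truncated log.

    The attribute keys are illustrative only, not an agreed convention.
    """
    attrs = {
        "log.message": truncate_log(message, limit),
        "log.truncated": len(message) > limit,
        "log.original_length": len(message),
    }
    if log_url is not None:
        # Link to the full log in remote logging or in the Airflow UI.
        attrs["log.url"] = log_url
    return attrs
```

   The resulting dict only holds strings, booleans and integers, so it could 
be passed as the `attributes` argument of OpenTelemetry's `span.add_event(...)`. 
The `log.truncated` flag and the optional link give the reader of the span a 
way to tell that the log is incomplete and where to find the rest.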
   
   That would make OTEL spans really, really useful as the first / main tool 
for solving "application debugging" problems.
   
   I really see OTEL as the main way that will a) make debugging of problems 
with Airflow easier and b) make it easier for us to help our users. One of the 
great features of OTEL and tools like Jaeger is that they have export 
capabilities. Similarly to py-spy and memray flamegraphs, such OTEL exports can 
be sent to us for further analysis in case our users have OTEL enabled. Seeing 
even limited logs included in such exports would be a fantastic aid that would 
allow us to open such an export using Jaeger, for example, and diagnose many 
issues much faster.
   
   I think eventually we should even provide our users with some information on 
how they can set up some OTEL tools (Jaeger seems like an easy one) and how to 
create such exports so that we can analyse them (likely with some 
anonymisation/obfuscation options for sensitive names, for users who care about 
it, but I guess that should be possible with tools like Jaeger).
   
   This is really part of https://github.com/apache/airflow/issues/40975 - 
"Improve Airflow's debugging story" - which, as the survey run by @omkar-foss 
clearly showed, needs improvement. I see OTEL as a big chance to give our users 
a fantastic tool with an easy-to-set-up configuration, so they can provide us 
with much more data about the problems they are experiencing and allow us to 
diagnose and fix them way faster.
   
   
   
   

