jscheffl commented on PR #36029: URL: https://github.com/apache/airflow/pull/36029#issuecomment-1990936322
> > "niche" in context we came to version 2.9.0 > > The feature was actually merged for 2.8 and then reverted due to failing tests so it missed the 2.8 cut. Its not a recent new change. I'd rather we make an attepmt to solve the pain if we can then wait for 2.10 > > We can find solutions and warn about possible performance issues like having the feature off by default and users must set it explicitly in the settings... I am not so worried about this part. We always find ways to make everyone happy :) I was thinking of use cases where persistence is needed and still have doubts. The use cases that came into my mind: - Any custom logic --> Python code is sufficient - Retry cases --> Like example can use the task instance context - Context information --> Can use existing XCom - Different strategies based on DAG run (e.g. high/low prio executions) --> Use DagRun.conf - Option to switch the logic --> I'd propose to rather use multiple implementations for the variation each static - Use a runtime parameter --> Can use existing Variable infrastructure - Priority based on time (low prio during day) --> Use standard Python code - Priority based on external condition (e.g. weather) --> SHOULD NOT be used, big performance risk - Some data from other DAGs/Tasks --> You can make a query in the DB, be careful what you do not to impact performance - Other cases --> If really needed then I'd rather recommend to expend the DB Schema for these cases like we have the "task_reschedule" or "rendered_task_instance" structures and not put the additional field into every tuple I understand that it was merged and reverted in 2.8 - but this was regarding security. Performance concerns stay and this is valid to be experimental. My concerns here are mainly performance. Adding millions of Blob into the biggest table in the DB Scheme needs some considerations. Otherwise all Airflow users need to throw additional hardware on the environment to compensate the overhead of this feature. Extending the DB scheme is nothing that is optional for all users. It might impact all. Means: I am totally fine except persistence. (plus the other comment about parameter rename) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
