This is an automated email from the ASF dual-hosted git repository.

ephraimanierobi pushed a commit to branch v2-6-test
in repository https://gitbox.apache.org/repos/asf/airflow.git
commit ea1534696623f16c9b1bf750403a2a68f9f6ff2f
Author: Ephraim Anierobi <[email protected]>
AuthorDate: Fri Apr 14 14:05:24 2023 +0100

    Add release notes
---
 RELEASE_NOTES.rst                   | 168 ++++++++++++++++++++++++++++++++++++
 newsfragments/28172.misc.rst        |   1 -
 newsfragments/28538.misc.rst        |   1 -
 newsfragments/28892.improvement.rst |   1 -
 newsfragments/29506.significant.rst |   6 --
 newsfragments/29933.improvement.rst |   1 -
 newsfragments/30076.significant.rst |   3 -
 newsfragments/30152.significant.rst |   6 --
 newsfragments/30374.significant.rst |   5 --
 newsfragments/30375.significant.rst |   9 --
 10 files changed, 168 insertions(+), 33 deletions(-)

diff --git a/RELEASE_NOTES.rst b/RELEASE_NOTES.rst
index 6da321c799..51a04677d5 100644
--- a/RELEASE_NOTES.rst
+++ b/RELEASE_NOTES.rst
@@ -21,6 +21,174 @@

 .. towncrier release notes start

+Airflow 2.6.0 (2023-04-20)
+--------------------------
+
+Significant Changes
+^^^^^^^^^^^^^^^^^^^
+
+Default permissions of file task handler log directories and files has been changed to "owner + group" writeable (#29506).
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+ Default setting handles case where impersonation is needed and both users (airflow and the impersonated user)
+ have the same group set as main group. Previously the default was also other-writeable and the user might choose
+ to use the other-writeable setting if they wish by configuring ``file_task_handler_new_folder_permissions``
+ and ``file_task_handler_new_file_permissions`` in ``logging`` section.
+
+SLA callbacks no longer add files to the dag processor manager's queue (#30076)
+"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+ This stops SLA callbacks from keeping the dag processor manager permanently busy. It means reduced CPU,
+ and fixes issues where SLAs stop the system from seeing changes to existing dag files. Additional metrics added to help track queue state.
+
+The ``cleanup()`` method in BaseTrigger is now defined as asynchronous (following async/await) pattern (#30152).
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+ This is potentially a breaking change for any custom trigger implementations that override the ``cleanup()``
+ method and uses synchronous code, however using synchronous operations in cleanup was technically wrong,
+ because the method was executed in the main loop of the Triggerer and it was introducing unnecessary delays
+ impacting other triggers. The change is unlikely to affect any existing trigger implementations.
+
+The gauge ``scheduler.tasks.running`` no longer exist (#30374)
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+ The gauge has never been working and its value has always been 0. Having an accurate
+ value for this metric is complex so it has been decided that removing this gauge makes
+ more sense than fixing it with no certainty of the correctness of its value.
+
+Consolidate handling of tasks stuck in queued under new ``task_queued_timeout`` config (#30375)
+"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+ Logic for handling tasks stuck in the queued state has been consolidated, and the all configurations
+ responsible for timing out stuck queued tasks have been deprecated and merged into
+ ``[scheduler] task_queued_timeout``. The configurations that have been deprecated are
+ ``[kubernetes] worker_pods_pending_timeout``, ``[celery] stalled_task_timeout``, and
+ ``[celery] task_adoption_timeout``. If any of these configurations are set, the longest timeout will be
+ respected. For example, if ``[celery] stalled_task_timeout`` is 1200, and ``[scheduler] task_queued_timeout``
+ is 600, Airflow will set ``[scheduler] task_queued_timeout`` to 1200.
+
+Improvement Changes
+^^^^^^^^^^^^^^^^^^^
+
+Display only the running configuration in configurations view (#28892)
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+ The configurations view now only displays the running configuration. Previously, the default configuration
+ was displayed at the top but it was not obvious whether this default configuration was overridden or not.
+ Subsequently, the non-documented endpoint ``/configuration?raw=true`` is deprecated and will be removed in
+ Airflow 3.0. The HTTP response now returns an additional ``Deprecation`` header. The ``/config`` endpoint on
+ the REST API is the standard way to fetch Airflow configuration programmatically.
+
+Explicit skipped states list for ExternalTaskSensor (#29933)
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+ ExternalTaskSensor now has an explicit ``skipped_states`` list
+
+Miscellaneous Changes
+^^^^^^^^^^^^^^^^^^^^^
+
+Handle OverflowError on exponential backoff in next_run_calculation (#28172)
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+ Maximum retry task delay is set to be 24h (86400s) by default. You can change it globally via ``core.max_task_retry_delay``
+ parameter.
+
+Move Hive macros to the provider (#28538)
+"""""""""""""""""""""""""""""""""""""""""
+ The Hive Macros (``hive.max_partition``, ``hive.closest_ds_partition``) are available only when Hive Provider is
+ installed. Please install Hive Provider > 5.1.0 when using those macros.
+
+New Features
+^^^^^^^^^^^^
+- Add serializer for pandas dataframe (#30390)
+- Deferrable ``TriggerDagRunOperator`` (#30292)
+- Adding ContinuousTimetable and support for @continuous schedule_interval (#29909)
+- Allow customized rules to check if a file has dag (#30104)
+- Add a new Airflow conf to specify a SSL ca cert for Kubernetes client (#30048)
+- Bash sensor has an explicit retry code (#30080)
+- Add filter task upstream/downstream to grid view (#29885)
+- Add testing a connection via Airflow CLI (#29892)
+- Support deleting the local log files when using remote logging (#29772)
+- ``Blocklist`` to disable specific metric tags or metric names (#29881)
+- Add a new graph inside of the grid view (#29413)
+- Add database ``check_migrations`` config (#29714)
+- add output format arg for ``cli.dags.trigger`` (#29224)
+- Make json and yaml available in templates (#28930)
+- Enable tagged metric names for existing Statsd metric publishing events | influxdb-statsd support (#29093)
+- Add HttpHookAsync for deferrable implementation (#29038)
+- Add arg --yes to ``db export-archived`` command.
(#29485)
+- Make the policy functions pluggable (#28558)
+- Add ``airflow db drop-archived`` command (#29309)
+- Enable individual trigger logging (#27758)
+- Implement new filtering options in graph view (#29226)
+- Add triggers for ExternalTask (#29313)
+- Add command to export purged records to CSV files (#29058)
+- Add ``FileTrigger`` (#29265)
+- Emit DataDog statsd metrics with metadata tags (#28961)
+- add some statsd metrics for dataset (#28907)
+- Add a new SSM hook and use it in the System Test context builder (#28755)
+- Add --overwrite option to ``connections import`` CLI command (#28738)
+- Add ``ImpalaHook`` (#26970)
+- Add general-purpose "notifier" concept to DAGs (#28569)
+- add a new conf to wait past_deps before skipping a task (#27710)
+- Add Flink on K8s Operator (#28512)
+- Allow Users to disable SwaggerUI via configuration (#28354)
+- Show mapped task groups in graph (#28392)
+- Log FileTaskHandler to work with KubernetesExecutor's multi_namespace_mode (#28436)
+- Add Amazon Elastic Container Registry (ECR) Hook (#28279)
+- Add a new config for adapting masked secrets to make it easier to prevent secret leakage in logs (#28239)
+- List specific config section and its values using the cli (#28334)
+- KubernetesExecutor multi_namespace_mode can use namespace list to avoid requiring cluster role (#28047)
+- Add ``FTPFileTransmitOperator`` (#26974)
+- Automatically save and allow restore of recent DAG run configs (#27805)
+- Added exclude_microseconds to cli (#27640)
+
+Improvements
+""""""""""""
+- Speed up TaskGroups with caching property of group_id (#30284)
+- Use the engine provided in the session (#29804)
+- Type related import optimization for Executors (#30361)
+- Add more type hints to the code base (#30503)
+- some fixes to metrics doc (#30290)
+- Always use self.appbuilder.get_session in security managers (#30233)
+- Refactor out xcom constants from models (#30180)
+- Add exception class name to DAG-parsing error message (#30105)
+-
Rename statsd_allow_list and statsd_block_list to ``metrics_*_list`` (#30174)
+- Improve serialization of tuples and sets (#29019)
+- Make cleanup method in trigger an async one (#30152)
+- Lazy load serialization modules (#30094)
+- SLA callbacks no longer add files to the dag_processing manager queue (#30076)
+- Add task.trigger rule to grid_data (#30130)
+- Speed up log template sync by avoiding ORM (#30119)
+- Separate cli_parser.py into two modules (#29962)
+- Explicit skipped states list for ExternalTaskSensor (#29933)
+- Add task state hover highlighting to new graph (#30100)
+- Store grid tabs in url params (#29904)
+- Use custom Connexion resolver to load lazily (#29992)
+- Delay Kubernetes import in secret masker (#29993)
+- Delay ConnectionModelView init until it's accessed (#29946)
+- Scheduler, make stale DAG deactivation threshold configurable instead of using dag processing timeout (#29446)
+- Improve grid view height calculations (#29563)
+- Avoid importing executor during conf validation (#29569)
+- Make permissions for FileTaskHandler group-writeable and configurable (#29506)
+- Add colors in help outputs of Airflow CLI commands #28789 (#29116)
+- Add a param for get_dags endpoint to list only unpaused dags (#28713)
+- Expose updated_at filter for dag run and task instance endpoints (#28636)
+- Increase length of user identifier columns (#29061)
+- Update gantt chart UI to display queued state of tasks (#28686)
+- Add index on log.dttm (#28944)
+- Display only the running configuration in configurations view (#28892)
+- css, cap dropdown menu size dynamically (#28736)
+- added JSON linter to connection edit / add UI for field extra. On connection edit screen, existing extra data will be displayed indented (#28583)
+- Use labels instead of pod name for pod log read in k8s exec (#28546)
+- Use time not tries for queued & running re-checks.
(#28586)
+- CustomTTYColoredFormatter should inherit TimezoneAware formatter (#28439)
+- Improve past depends handling in Airflow CLI tasks.run command (#28113)
+- Support using a list of callbacks in ``on_*_callback/sla_miss_callbacks`` (#28469)
+- Better table name validation for db clean (#28246)
+- Use object instead of array in config.yml for config template (#28417)
+- Add markdown rendering for task notes. (#28245)
+- Show mapped task groups in grid view (#28208)
+- Add ``renamed`` and ``previous_name`` in config sections (#28324)
+- Speed up most Users/Role CLI commands (#28259)
+- Speed up Airflow role list command (#28244)
+- Refactor serialization (#28067)
+- Allow longer pod names for k8s executor / KPO (#27736)
+- Updates health check endpoint to include ``triggerer`` status (#27755)
+
+
 Airflow 2.5.3 (2023-04-01)
 --------------------------

diff --git a/newsfragments/28172.misc.rst b/newsfragments/28172.misc.rst
deleted file mode 100644
index 8b47c9749c..0000000000
--- a/newsfragments/28172.misc.rst
+++ /dev/null
@@ -1 +0,0 @@
-Maximum retry task delay is set to be 24h (86400s) by default. You can change it globally via ``core.max_task_retry_delay`` parameter.
diff --git a/newsfragments/28538.misc.rst b/newsfragments/28538.misc.rst
deleted file mode 100644
index 5b929d8448..0000000000
--- a/newsfragments/28538.misc.rst
+++ /dev/null
@@ -1 +0,0 @@
-The Hive Macros (``hive.max_partition``, ``hive.closest_ds_partition``) are available only when Hive Provider is installed. Please install Hive Provider > 5.1.0 when using those macros.
diff --git a/newsfragments/28892.improvement.rst b/newsfragments/28892.improvement.rst
deleted file mode 100644
index ee27b97d5a..0000000000
--- a/newsfragments/28892.improvement.rst
+++ /dev/null
@@ -1 +0,0 @@
-The configurations view now only displays the running configuration. Previously, the default configuration was displayed at the top but it wasn't obvious whether this default configuration was overridden or not.
Subsequently, the non-documented endpoint ``/configuration?raw=true`` is deprecated and will be removed in Airflow 3.0. The HTTP response now returns an additional ``Deprecation`` header. The ``/config`` endpoint on the REST API is the standard way to fetch Airflow configuration [...]
diff --git a/newsfragments/29506.significant.rst b/newsfragments/29506.significant.rst
deleted file mode 100644
index 1e0555c13a..0000000000
--- a/newsfragments/29506.significant.rst
+++ /dev/null
@@ -1,6 +0,0 @@
-Default permissions of file task handler log directories and files has been changed to "owner + group" writeable.
-
-Default setting handles case where impersonation is needed and both users (airflow and the impersonated user)
-have the same group set as main group. Previously the default was also other-writeable and the user might choose
-to use the other-writeable setting if they wish by configuring ``file_task_handler_new_folder_permissions``
-and ``file_task_handler_new_file_permissions`` in ``logging`` section.
diff --git a/newsfragments/29933.improvement.rst b/newsfragments/29933.improvement.rst
deleted file mode 100644
index fd3de0f713..0000000000
--- a/newsfragments/29933.improvement.rst
+++ /dev/null
@@ -1 +0,0 @@
-ExternalTaskSensor now has an explicit ``skipped_states`` list
diff --git a/newsfragments/30076.significant.rst b/newsfragments/30076.significant.rst
deleted file mode 100644
index 83805af118..0000000000
--- a/newsfragments/30076.significant.rst
+++ /dev/null
@@ -1,3 +0,0 @@
-SLA callbacks no longer add files to the dag processor manager's queue
-
-This stops SLA callbacks from keeping the dag processor manager permanently busy. It means reduced CPU, and fixes issues where SLAs stop the system from seeing changes to existing dag files. Additional metrics added to help track queue state.
diff --git a/newsfragments/30152.significant.rst b/newsfragments/30152.significant.rst
deleted file mode 100644
index 5b0325bbe8..0000000000
--- a/newsfragments/30152.significant.rst
+++ /dev/null
@@ -1,6 +0,0 @@
-The ``cleanup()`` method in BaseTrigger is now defined as asynchronous (following async/await) pattern.
-
-This is potentially a breaking change for any custom trigger implementations that override the ``cleanup()``
-method and uses synchronous code, however using synchronous operations in cleanup was technically wrong,
-because the method was executed in the main loop of the Triggerer and it was introducing unnecessary delays
-impacting other triggers. The change is unlikely to affect any existing trigger implementations.
diff --git a/newsfragments/30374.significant.rst b/newsfragments/30374.significant.rst
deleted file mode 100644
index d6c32cbdae..0000000000
--- a/newsfragments/30374.significant.rst
+++ /dev/null
@@ -1,5 +0,0 @@
-The gauge ``scheduler.tasks.running`` no longer exist
-
-The gauge has never been working and its value has always been 0. Having an accurate
-value for this metric is complex so it has been decided that removing this gauge makes
-more sense than fixing it with no certainty of the correctness of its value.
diff --git a/newsfragments/30375.significant.rst b/newsfragments/30375.significant.rst
deleted file mode 100644
index d2fd1a87f2..0000000000
--- a/newsfragments/30375.significant.rst
+++ /dev/null
@@ -1,9 +0,0 @@
-Consolidate handling of tasks stuck in queued under new ``task_queued_timeout`` config
-
-Logic for handling tasks stuck in the queued state has been consolidated, and the all configurations
-responsible for timing out stuck queued tasks have been deprecated and merged into
-``[scheduler] task_queued_timeout``. The configurations that have been deprecated are
-``[kubernetes] worker_pods_pending_timeout``, ``[celery] stalled_task_timeout``, and
-``[celery] task_adoption_timeout``.
 If any of these configurations are set, the longest timeout will be
-respected. For example, if ``[celery] stalled_task_timeout`` is 1200, and ``[scheduler] task_queued_timeout``
-is 600, Airflow will set ``[scheduler] task_queued_timeout`` to 1200.
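
Note on the ``task_queued_timeout`` entry (#30375): the "longest timeout will be respected" rule can be sketched in plain Python as follows. This is only an illustration of the rule as worded in the release note, not Airflow's actual implementation; the function name ``effective_queued_timeout`` is hypothetical, and ``None`` is assumed to stand for "option not set".

```python
from typing import Optional


def effective_queued_timeout(
    task_queued_timeout: float,
    worker_pods_pending_timeout: Optional[float] = None,
    stalled_task_timeout: Optional[float] = None,
    task_adoption_timeout: Optional[float] = None,
) -> float:
    """Return the timeout to apply: the longest of all values that are set.

    Hypothetical helper; None means the deprecated option is not configured.
    """
    deprecated = (
        worker_pods_pending_timeout,
        stalled_task_timeout,
        task_adoption_timeout,
    )
    # Ignore unset options, then take the maximum of what remains.
    return max([task_queued_timeout, *(v for v in deprecated if v is not None)])


# The example from the release note: 1200 (deprecated) beats 600 (new option).
print(effective_queued_timeout(600, stalled_task_timeout=1200))  # 1200
```

With none of the deprecated options set, the helper simply returns ``task_queued_timeout`` unchanged.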
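
Note on the ``cleanup()`` entry (#30152): a custom trigger that previously did blocking work in a synchronous ``cleanup()`` might be adapted roughly as below. This is a sketch under stated assumptions: ``BaseTriggerStandIn`` is a hypothetical stand-in so the snippet runs without Airflow installed (a real trigger would subclass Airflow's ``BaseTrigger``), and ``FileHandleTrigger`` and ``_close_blocking`` are invented names. Pushing the blocking call to the default executor is one possible way to keep the event loop free, not a requirement stated in the release note.

```python
import asyncio


class BaseTriggerStandIn:
    """Hypothetical stand-in for a base class with an async cleanup() hook."""

    async def cleanup(self) -> None:
        """Default cleanup does nothing."""


class FileHandleTrigger(BaseTriggerStandIn):
    """Hypothetical trigger that owns a resource needing a blocking close."""

    def __init__(self) -> None:
        self.closed = False

    def _close_blocking(self) -> None:
        # Placeholder for synchronous work (closing a file, a socket, ...).
        self.closed = True

    async def cleanup(self) -> None:
        # Run the blocking call in the default thread pool so that the
        # event loop stays free for other coroutines while it completes.
        loop = asyncio.get_running_loop()
        await loop.run_in_executor(None, self._close_blocking)


trigger = FileHandleTrigger()
asyncio.run(trigger.cleanup())
print(trigger.closed)  # True
```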
