Arunodoy18 opened a new pull request, #59813:
URL: https://github.com/apache/airflow/pull/59813
This PR fixes a critical issue where the Apache Airflow scheduler crashes
when the external_executor_id field exceeds 250 characters. This
particularly affects the AWS Lambda Executor and other executors that
manage tasks with long identifiers.
Problem
The external_executor_id column in the task_instance and
task_instance_history tables is currently limited to VARCHAR(250). When
using executors with long dag_id, task_id, and run_id combinations, the
generated external executor IDs can easily exceed this limit, causing:
sqlalchemy.exc.DataError: (psycopg2.errors.StringDataRightTruncation)
value too long for type character varying(250)
[SQL: UPDATE task_instance SET updated_at=%(updated_at)s,
external_executor_id=%(external_executor_id)s
WHERE task_instance.id = %(task_instance_id)s]
[parameters: {'external_executor_id': '{"dag_id": "ZZ_YY_ZZZZZZZZZ_YYYYYYY",
"task_id": "aaaaaaa.bbbbbbbbbbbbbbbbbbb",...'}]
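To make the overflow concrete, here is a minimal sketch of how a JSON-encoded identifier can pass 250 characters. The exact ID format is executor-specific and is truncated in the error above, so the run_id and try_number fields, the helper names, and all sample values are assumptions for illustration only.

```python
import json

# Column limits before and after this change, per the PR description.
OLD_LIMIT = 250
NEW_LIMIT = 1000


def build_external_executor_id(dag_id: str, task_id: str, run_id: str, try_number: int) -> str:
    """Serialize task identifiers to a JSON string (hypothetical format)."""
    return json.dumps(
        {"dag_id": dag_id, "task_id": task_id, "run_id": run_id, "try_number": try_number}
    )


def fits(value: str, limit: int) -> bool:
    """Return True if the value fits in a VARCHAR(limit) column."""
    return len(value) <= limit


# Made-up but plausible identifiers: nested task groups and a long run_id.
example_id = build_external_executor_id(
    dag_id="team_a_daily_customer_orders_reconciliation_pipeline_v2",
    task_id=(
        "ingest_section.transform_group."
        "validate_and_enrich_customer_order_records.load_partitioned_output"
    ),
    run_id="scheduled__2025-01-01T00:00:00+00:00__backfill_customer_orders_reconciliation",
    try_number=1,
)

print(len(example_id), fits(example_id, OLD_LIMIT), fits(example_id, NEW_LIMIT))
```

The JSON keys and punctuation alone add a few dozen characters, so even moderately long dag_id, task_id, and run_id values push the total past the old limit.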
Solution
Increased the external_executor_id column length from 250 to 1000
characters to accommodate longer identifiers while maintaining reasonable
database performance.
Changes
Database Migration - 0094_3_2_0_increase_external_executor_id_length.py:
Alters the external_executor_id column in both the task_instance and
task_instance_history tables from VARCHAR(250) to VARCHAR(1000)
Includes upgrade and downgrade paths
Uses batch operations for SQLite compatibility
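Batch operations matter on SQLite because its ALTER TABLE cannot change a column's declared type, so the table has to be recreated and its rows copied over. The sketch below illustrates that copy-and-rename idea with the standard-library sqlite3 module. It is not the migration code from this PR: the two-column table is a simplified stand-in for the real task_instance schema, and the helper names are hypothetical.

```python
import sqlite3


def widen_column_by_copy(conn: sqlite3.Connection, table: str, new_length: int) -> None:
    """Recreate `table` with a wider external_executor_id column and copy rows over."""
    tmp = f"_tmp_{table}"
    # 1. Create a replacement table with the new declared length.
    conn.execute(
        f'CREATE TABLE "{tmp}" '
        f"(id TEXT PRIMARY KEY, external_executor_id VARCHAR({new_length}))"
    )
    # 2. Copy every existing row into the replacement table.
    conn.execute(
        f'INSERT INTO "{tmp}" (id, external_executor_id) '
        f'SELECT id, external_executor_id FROM "{table}"'
    )
    # 3. Swap the replacement table in under the original name.
    conn.execute(f'DROP TABLE "{table}"')
    conn.execute(f'ALTER TABLE "{tmp}" RENAME TO "{table}"')
    conn.commit()


def declared_type(conn: sqlite3.Connection, table: str, column: str) -> str:
    """Return the declared type of `column` as reported by PRAGMA table_info."""
    for _cid, name, col_type, *_rest in conn.execute(f'PRAGMA table_info("{table}")'):
        if name == column:
            return col_type
    raise KeyError(column)


# Simplified stand-in for the real table (assumption: only two columns).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE task_instance (id TEXT PRIMARY KEY, external_executor_id VARCHAR(250))"
)
conn.execute("INSERT INTO task_instance VALUES ('ti-1', 'abc')")

widen_column_by_copy(conn, "task_instance", 1000)
print(declared_type(conn, "task_instance", "external_executor_id"))
```

In a real Alembic migration this work is normally delegated to its batch mode rather than written by hand; the sketch only shows why a plain in-place ALTER is not enough on SQLite.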
Model Updates:
Updated TaskInstance.external_executor_id to use StringID(length=1000)
Updated TaskInstanceHistory.external_executor_id to use
StringID(length=1000)
Impact
Executors Affected: AWS Lambda, AWS ECS, AWS Batch, Kubernetes, and any
custom executors with long identifiers
Backward Compatibility: Fully backward compatible - existing data preserved
Performance: Minimal impact - VARCHAR length increase has negligible
performance cost in modern databases
Breaking Changes: None
Testing
Migration tested with batch alter operations
Model definitions validated against schema
Downgrade path included for rollback scenarios
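One caveat worth checking before a rollback: once IDs longer than 250 characters have been stored, shrinking the column back may fail or truncate data, depending on the database backend. The following is a rough sketch of a pre-downgrade check, not something included in this PR. It uses sqlite3 and a simplified two-column table purely for illustration, and the function name is hypothetical.

```python
import sqlite3

# Column limit that a downgrade would restore, per the PR description.
OLD_LIMIT = 250


def rows_exceeding_limit(
    conn: sqlite3.Connection, table: str, limit: int = OLD_LIMIT
) -> list[str]:
    """Return ids of rows whose external_executor_id is longer than `limit`."""
    cur = conn.execute(
        f'SELECT id FROM "{table}" WHERE length(external_executor_id) > ? ORDER BY id',
        (limit,),
    )
    return [row[0] for row in cur.fetchall()]


# Simplified stand-in for the real table (assumption: only two columns).
check_conn = sqlite3.connect(":memory:")
check_conn.execute(
    "CREATE TABLE task_instance (id TEXT PRIMARY KEY, external_executor_id VARCHAR(1000))"
)
check_conn.executemany(
    "INSERT INTO task_instance VALUES (?, ?)",
    [("ti-1", "short-id"), ("ti-2", "x" * 300)],
)

too_long = rows_exceeding_limit(check_conn, "task_instance")
print(too_long)
```

If the list is non-empty, those rows would need to be cleared or shortened before downgrading on a backend that enforces the VARCHAR length.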
Migration
Users can apply this change with:
airflow db migrate
Closes this issue.