BharathPESU opened a new pull request, #57861:
URL: https://github.com/apache/airflow/pull/57861

   Fixes a critical bug in `CloudRunExecuteJobOperator` where tasks run with
   `deferrable=True` are incorrectly marked as successful when the Cloud Run
   Job is canceled or has failed tasks.
   
   Problem
   When using `CloudRunExecuteJobOperator` with `deferrable=True`, the
   operator would mark tasks as successful even when:
   
   Jobs were canceled - Not all tasks completed execution
   Tasks failed - Some tasks within the job failed
   
   In addition, malformed trigger events with missing required fields could
   raise a KeyError or produce unclear error messages.
   
   The root cause was that the trigger only checked `operation.done` without
   validating the actual execution state (task completion counts), and the
   operator did not defensively validate the event payload.
   
   Solution
   This PR adds defensive validation at two layers, the trigger and the
   operator, and backs both with new tests:
   
   1. Trigger Layer (`cloud_run.py` trigger)
   Extracts execution details (`task_count`, `succeeded_count`,
   `failed_count`) from the operation response using protobuf `Unpack()`
   Validates the `Unpack()` return value and raises a clear exception if
   deserialization fails
   Includes the operation name and job name in error messages for debugging
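   A minimal sketch of the trigger-side check described above, not the code from this PR. To stay runnable without extra dependencies it uses two hypothetical stand-ins: `FakeExecution` for the Cloud Run execution message and `FakeAny` for the packed operation response. `FakeAny.Unpack()` only mimics the protobuf contract of returning `False` on a type mismatch. The helper name `build_success_event` and the event keys are assumptions as well.

```python
from dataclasses import dataclass


@dataclass
class FakeExecution:
    """Hypothetical stand-in for the Cloud Run execution message."""

    task_count: int = 0
    succeeded_count: int = 0
    failed_count: int = 0


class FakeAny:
    """Hypothetical stand-in mimicking the Unpack() contract:
    fill the target and return True, or return False on a type mismatch."""

    def __init__(self, payload=None):
        self._payload = payload

    def Unpack(self, target) -> bool:
        if not isinstance(self._payload, type(target)):
            return False
        target.__dict__.update(self._payload.__dict__)
        return True


def build_success_event(response, operation_name: str, job_name: str) -> dict:
    """Build the trigger event, failing loudly if the response cannot be unpacked."""
    execution = FakeExecution()
    if not response.Unpack(execution):
        # Name the operation and the job so the failure is easy to trace.
        raise RuntimeError(
            f"Could not unpack execution details for operation "
            f"{operation_name!r} (job {job_name!r})"
        )
    return {
        "status": "success",  # assumed status value
        "job_name": job_name,
        "task_count": execution.task_count,
        "succeeded_count": execution.succeeded_count,
        "failed_count": execution.failed_count,
    }
```

   Passing the counts through the event lets the operator decide whether the job really succeeded, instead of trusting that a finished operation means a successful one.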
   2. Operator Layer (`cloud_run.py` operator)
   Defensive event validation: Uses `event.get()` to prevent KeyError on
   malformed events
   Execution state validation:
      Checks `succeeded_count + failed_count == task_count` to detect
      canceled jobs
      Checks `failed_count > 0` to detect failed tasks
      Validates presence of required execution fields
   Improved error messages: Replaces None/empty error details with
   meaningful defaults ("Unknown", "Unknown error")
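   The operator-side checks listed above could look roughly like the following. This is a sketch under assumptions, not the PR's implementation: the function name `validate_trigger_event`, the event keys (`status`, `operation_error_code`, `operation_error_message`) and the use of built-in exceptions in place of Airflow's own exception type are all illustrative.

```python
def validate_trigger_event(event: dict) -> dict:
    """Raise unless the event describes a fully successful job execution."""
    # event.get() avoids a KeyError when the trigger sends a malformed event.
    status = event.get("status")
    if status is None:
        raise ValueError("Trigger event is missing the 'status' field")

    if status != "success":
        # Fall back to readable defaults when error details are None or empty.
        code = event.get("operation_error_code") or "Unknown"
        message = event.get("operation_error_message") or "Unknown error"
        raise RuntimeError(f"Operation failed with code [{code}]: {message}")

    required = ("task_count", "succeeded_count", "failed_count")
    missing = [key for key in required if event.get(key) is None]
    if missing:
        raise ValueError(f"Trigger event is missing execution fields: {missing}")

    task_count = event["task_count"]
    succeeded = event["succeeded_count"]
    failed = event["failed_count"]

    # Fewer finished tasks than scheduled tasks suggests the job was canceled.
    if succeeded + failed != task_count:
        raise RuntimeError(
            f"Job incomplete: {succeeded + failed} of {task_count} tasks finished"
        )
    if failed > 0:
        raise RuntimeError(f"Job failed: {failed} of {task_count} tasks failed")
    return event
```

   Checking completeness before failures follows the order in which the checks are listed above; either order rejects both canceled jobs and jobs with failed tasks.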
   3. Test Coverage
   Added comprehensive test coverage for:
   
   ✅ Successful job execution with all tasks completed
   ✅ Canceled jobs (incomplete task execution)
   ✅ Jobs with failed tasks
   ✅ Malformed events missing status field
   ✅ Malformed events missing execution fields
   ✅ Operation failures with missing error details (None values)
   ✅ Protobuf Unpack() failure scenarios
   Changes
   Modified Files:
   
   
   `cloud_run.py`
   `cloud_run.py`
   `test_cloud_run.py`
   `test_cloud_run.py`
   Statistics:
   
   2 files changed in source code
   2 test files enhanced
   236 insertions total across both commits
   4 deletions
   Testing
   All changes include comprehensive unit test coverage. Tests verify:
   
   Success path with valid execution details
   Failure detection for canceled jobs
   Failure detection for jobs with failed tasks
   Error handling for malformed trigger events
   Meaningful error messages for missing operation error details
   Protobuf Unpack() failure handling
   Example Error Messages
   Before this fix:
   
   After this fix:
   
   Checklist
    Bug fix (non-breaking change which fixes an issue)
    New tests added to cover the changes
    All tests pass locally
    Follows code style guidelines
    Includes appropriate error handling
    Error messages are clear and actionable
   Related Issues
   Note: This fix prevents false positives in production environments where 
canceled or partially completed Cloud Run Jobs would be incorrectly reported as 
successful, potentially leading to data inconsistencies or missed error 
conditions.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
