[I] [Bug] [Master] Dependent nodes failed when the upstream rerun without the task relied on [dolphinscheduler]

via GitHub Sat, 06 Jul 2024 20:25:32 -0700


starrysxy opened a new issue, #16285:
URL: https://github.com/apache/dolphinscheduler/issues/16285


   ### Search before asking
   
   - [X] I had searched in the 
[issues](https://github.com/apache/dolphinscheduler/issues?q=is%3Aissue) and 
found no similar issues.
   
   
   ### What happened
   
   First of all, I'm not sure whether this is a code bug. I checked the code, and 
the behavior is consistent with the current code logic, but I don't think it is 
reasonable from a business perspective. If necessary, please help change the 
label.
   
   Suppose there are two workflows, A and B. Workflow A has tasks A-1, A-2, and 
A-3. Workflow B depends on task A-3 in workflow A. After workflow A has 
finished, if task A-1 is rerun on its own before the workflow B instance of the 
same cycle has run, the dependent node in workflow B fails, which causes the 
workflow B instance to fail.
   
   The code looks up the workflow instance with the latest endTime in each 
cycle, so it finds the workflow instance in which only A-1 was rerun. That 
instance does not contain task A-3, which the downstream workflow B depends on, 
so no A-3 task instance is found when traversing its task instances. Because 
that workflow A instance is also in a completed state, the dependent node is 
marked as failed, and workflow B is then marked as failed.
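   
   The behavior described above can be sketched roughly as follows. This is a 
simplified model based only on my reading of the code, not the actual 
DolphinScheduler classes: `LatestInstanceCheckSketch`, `WorkflowRun`, `TaskRun`, 
`Result`, and `check` are hypothetical names, and how a found-but-failed task is 
handled is an assumption.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Hypothetical, simplified model of the check described in this issue.
// None of these types are DolphinScheduler's real classes.
class LatestInstanceCheckSketch {
    enum Result { SUCCESS, FAILED, WAITING }

    static final class TaskRun {
        final String name;
        final boolean success;
        TaskRun(String name, boolean success) {
            this.name = name;
            this.success = success;
        }
    }

    static final class WorkflowRun {
        final long endTime;
        final boolean finished;
        final List<TaskRun> tasks;
        WorkflowRun(long endTime, boolean finished, TaskRun... tasks) {
            this.endTime = endTime;
            this.finished = finished;
            this.tasks = Arrays.asList(tasks);
        }
    }

    // Only the run with the latest endTime in the cycle is inspected.
    static Result check(List<WorkflowRun> runsInCycle, String dependedTask) {
        Optional<WorkflowRun> latest = runsInCycle.stream()
                .max(Comparator.comparingLong((WorkflowRun r) -> r.endTime));
        if (!latest.isPresent()) {
            return Result.WAITING; // assumption: no run yet means keep waiting
        }
        WorkflowRun run = latest.get();
        for (TaskRun task : run.tasks) {
            if (task.name.equals(dependedTask)) {
                return task.success ? Result.SUCCESS : Result.FAILED;
            }
        }
        // Task not found in the latest run: a finished run yields FAILED.
        return run.finished ? Result.FAILED : Result.WAITING;
    }

    public static void main(String[] args) {
        WorkflowRun fullRun = new WorkflowRun(100L, true,
                new TaskRun("A-1", true), new TaskRun("A-2", true), new TaskRun("A-3", true));
        WorkflowRun rerunOfA1 = new WorkflowRun(200L, true, new TaskRun("A-1", true));
        // The rerun has the later endTime and no A-3, so this prints FAILED.
        System.out.println(check(Arrays.asList(fullRun, rerunOfA1), "A-3"));
    }
}
```

   In this model the earlier full run, where A-3 succeeded, is never consulted 
once a later run exists in the same cycle.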
   
   In my opinion, this logic exists to make the 'ALL' option for dependent 
tasks convenient: instead of checking the status of each task in the upstream 
workflow, the code checks the status of the upstream workflow as a whole. In 
practice, however, workflows inevitably get modified, and individual tasks 
inevitably get rerun after a modification. Rerunning the entire workflow 
instead can delay results and waste machine resources. The logic here could 
therefore be optimized.
   
   ### What you expected to happen
   
   If the depended-on task in the upstream workflow has succeeded in any 
workflow instance of a cycle, the dependent node in the downstream workflow of 
the same cycle should be marked as successful.
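   
   A minimal sketch of the expected behavior, under the same kind of simplified 
model. `AnySuccessCheckSketch`, `WorkflowRun`, `TaskRun`, `Result`, and `check` 
are hypothetical names rather than DolphinScheduler's API, and the fallback 
when no run contains a successful task instance (wait while any run is 
unfinished, otherwise fail) is an assumption of mine.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified model of the proposed check.
// None of these types are DolphinScheduler's real classes.
class AnySuccessCheckSketch {
    enum Result { SUCCESS, FAILED, WAITING }

    static final class TaskRun {
        final String name;
        final boolean success;
        TaskRun(String name, boolean success) {
            this.name = name;
            this.success = success;
        }
    }

    static final class WorkflowRun {
        final long endTime;
        final boolean finished;
        final List<TaskRun> tasks;
        WorkflowRun(long endTime, boolean finished, TaskRun... tasks) {
            this.endTime = endTime;
            this.finished = finished;
            this.tasks = Arrays.asList(tasks);
        }
    }

    // Every run of the cycle is scanned, not only the latest one.
    static Result check(List<WorkflowRun> runsInCycle, String dependedTask) {
        boolean anyUnfinished = runsInCycle.isEmpty();
        for (WorkflowRun run : runsInCycle) {
            for (TaskRun task : run.tasks) {
                if (task.name.equals(dependedTask) && task.success) {
                    return Result.SUCCESS; // the task has succeeded at least once
                }
            }
            if (!run.finished) {
                anyUnfinished = true;
            }
        }
        // Assumption: keep waiting while something may still produce the task.
        return anyUnfinished ? Result.WAITING : Result.FAILED;
    }

    public static void main(String[] args) {
        WorkflowRun fullRun = new WorkflowRun(100L, true,
                new TaskRun("A-1", true), new TaskRun("A-2", true), new TaskRun("A-3", true));
        WorkflowRun rerunOfA1 = new WorkflowRun(200L, true, new TaskRun("A-1", true));
        // A-3 succeeded in the earlier full run, so this prints SUCCESS.
        System.out.println(check(Arrays.asList(fullRun, rerunOfA1), "A-3"));
    }
}
```

   One trade-off of scanning every run is that an old success can mask a later 
failed rerun of the same task; whether that is acceptable is open for 
discussion.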
   
   ### How to reproduce
   
   1. Create workflow A with tasks A-1, A-2, and A-3.
   2. Create workflow B with a dependent node that depends on task A-3 in 
workflow A.
   3. Run workflow A to completion.
   4. Rerun only task A-1 in workflow A before the workflow B instance of the 
same cycle has run.
   5. Run workflow B: the dependent node fails, which causes the workflow B 
instance to fail.
   
   ### Anything else
   
   _No response_
   
   ### Version
   
   3.1.x
   
   ### Are you willing to submit PR?
   
   - [X] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of 
Conduct](https://www.apache.org/foundation/policies/conduct)
   

