potiuk commented on issue #40196: URL: https://github.com/apache/airflow/issues/40196#issuecomment-2212522330
> Now I had trouble finding this behavior explained in documentation as well, and I have definitely made heavy use of task_groups returning values, but looking now I can see that _TaskGroupFactory.__call__ does explain this behavior in it's docstring and some inline comments clarify things further in _TaskGroupFactory._create_task_group. The behavior makes sense given these comments and yet doesn't feel quite right given this issue was raised in the first place. Ahhh. I see exactly what happened. I looked a bit at the history of it and did a little investigation and here is what happend (@uranusjr -> would love to get you confirm my understanding and see if you agree with my assesment of what should be done here): * The original design of task_group() was that it returned the last task in the group to be able to make another task depend on it, so that you could write: ``` tg >> next_task ``` However that does not work well for this case: ``` prev_task >> tg ``` Precisely, because of the case you describe - that the previous task would be upstream of the last task in a group. (and @uranusjr correctly mentioned in https://github.com/apache/airflow/issues/19903#issuecomment-1005849107 it could be done by returning `[ task_start, task_end]` - but it was completely non-obvious. Then the task group-decorated function return value have been improved by @uranusjr here https://github.com/apache/airflow/pull/20671 -> where it was allowed that task group-decorated function returns nothing which is equivalent of returning the task_group - which works for both sides of the dependencies (>> tg will add downstream to first task in the group where tg >> will add upstream dependency from the last task in the group) It was supposed to be documented in https://github.com/apache/airflow/pull/20671 which was created as a follow-up task - but it has been closed as "completed" by https://github.com/apache/airflow/pull/26028 - but in fact that PR only adds somme examples and does not **really** describe this behaviour in the docs (and it does not describe the differences depending on what is returned by the decorated function). I honestly find it hard to justify the the behaviour when `task` is returned - it only makes sense for `tg >> next_task` case, but if we just return none (or tg directly), it will behave exactly the same for this case (and it will not create a confusion because it will also correctly (intuitively) work as intuitively expected also in `prev_task >> tg`. So my proposal is that we should issue a deprecation warning when `task` is returned from `@task_group` decorated function and possibly `fail` if task is returned in such method in Airflow 3 - because I think returning single task from such decorated method makes very little sense and creates confusion (as proven by this Issue). Then it should of course be documented (but I I would rather see it documented as deprecated behaviour if others agree with me). @uranusjr (and also @eladkal - who had already fixed our example dag in the past that was exhibiting that confusing behaviour - https://github.com/apache/airflow/pull/21240 - what do you think ? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
