potiuk commented on issue #14529:
URL: https://github.com/apache/airflow/issues/14529#issuecomment-1133066140

   > @potiuk I hope you don't feel uncomfortable with my follow-up questions.
   
   Of course I don't mind. And I am not arguing. I am just trying to explain
the complexity, precisely because you as a user might not know the
complexities involved. My role here is to make you aware of them and to set
some expectations. You, on the other hand, think things are easy precisely
because you do not know the details, so I am trying to explain them to you. I
hope you can learn something from that - for example, that hastily calling
something "easy" without a good understanding of it is usually very premature.
It's fine if you have to follow your own judgment and "pay" the price of a
mistake in assessing the difficulty yourself. That's why I very considerately
hand the discussion back to you: "if you think it is easy - do it". Plain and
simple. This is precisely what I (deliberately) do with many people I have
mentored or worked with - when I think something is complex, I explain why and
we discuss it. If the other person is not convinced, I very consciously and
deliberately say to that person - ok, if you are not convinced, please feel
free to do it and let's see. Sometimes the best inventions are made by people
who thought that things were simple when everyone else thought they were
complex.
   
   Sometimes in such a situation I am positively surprised; sometimes the other
person has to bite the bullet and learn the hard way that things are harder
than they look on the surface. This is precisely such a moment - feel free to
do it. You will learn. If you do it and it takes far less time and turns out
to be simpler than I thought (assuming it passes all the testing, works at
scale and so on) - even better. I will learn something new and we will get a
cool feature. I really like it when this happens - so by all means, please do
it. Implement it if you think it is simple. This will be a great feature to
have!
   
   And if not - then we have a great source of information on things to look at
when someone actually WILL start implementing it - they can look at the points
raised here and check whether their solution addresses them.
   
   BTW, I am also not sure you are aware that the discussion here is pretty
academic. Until there is a person who wants to implement it, it will simply
not happen. No matter what. If you want this feature in - there is no-one at
the "top" to convince that this should be done. There is no-one here who
"decides" what should be done. Things are done here because someone does
them. So in a way it does not matter whether we agree here or not - the surest
way to get this feature (if you really want it) is ... to implement it. That's
it. So my comment really is - yeah, go ahead and implement it. If you think
it's easy - cool. I just pointed out the problems that you (or anyone else
here) will have to solve.
   
   But answering your question:
   
   > Web UI has a feature to clear a task and its all downstream tasks. What 
happens if there are 100s or 1000s of downstream tasks?
   
   Yes. It's very, very, very different. When you clear the tasks which are
downstream, you can do it in one big transaction (i.e. quickly) because they
depend directly on the original task and their state depends directly on the
state of your task, so you can clear them, mathematically speaking, in
"topological" order.
   
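   To illustrate the downstream case, here is a minimal sketch in plain
Python. It assumes a toy DAG stored as a dict (`DOWNSTREAM`) and two
hypothetical helpers, `downstream_closure` and `clear_order`; none of these
names come from Airflow, and this is not how Airflow actually implements
clearing. It only shows that "a task plus everything downstream of it" always
has a well-defined topological order.

```python
from collections import deque
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical toy DAG: each task maps to its direct downstream tasks.
DOWNSTREAM = {
    "extract": ["transform", "validate"],
    "transform": ["load"],
    "validate": ["load"],
    "load": [],
    "unrelated": [],
}


def downstream_closure(dag, start):
    """Return start plus every task reachable from it."""
    seen = {start}
    queue = deque([start])
    while queue:
        task = queue.popleft()
        for child in dag[task]:
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


def clear_order(dag, start):
    """Order the downstream closure so each task comes after its upstreams."""
    selected = downstream_closure(dag, start)
    # TopologicalSorter expects node -> predecessors, so invert the edges.
    predecessors = {task: set() for task in selected}
    for task in selected:
        for child in dag[task]:
            predecessors[child].add(task)
    return list(TopologicalSorter(predecessors).static_order())


order = clear_order(DOWNSTREAM, "extract")
```

   In this sketch `order` starts with `extract` and ends with `load`, and
`unrelated` never appears, because the selection is closed under "downstream
of".
   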
   A Task Group has no such notion. Tasks in a task group might be arbitrarily
connected or not connected at all. They can have various inter-dependencies,
and you cannot do it reliably, because even if the tasks are grouped in the UI
they might belong to various parts of the tree with some interdependencies
between them, and there is no clear topological order in which you can clear
all the tasks in a single transaction. Rather than a direct downstream
dependency, there might be multiple levels of hierarchy - i.e. you can have a
task hierarchy A -> B -> C where A and C belong to the task group and B does
not. What happens when you clear the group? Clearing A might in turn cause
transitive clearing of B, which in turn might cause "starting" of C (depending
on the type of dependency), but then you also attempt to clear C at the same
time. Should the C task be started or cleared in this case? Are there other
cases like that? Likely. That's why I think trying to do it in one big
transaction is likely to lead to deadlocks.
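   
   The A -> B -> C case can be made concrete with a small sketch in plain
Python. It assumes a toy DAG stored as a dict and a hypothetical helper,
`tasks_between_group_members`; these are illustrative names, not Airflow
APIs. The helper only detects the situation described above (a task outside
the group sitting on a path between two group members); it does not decide
what clearing should do in that case.

```python
# Hypothetical toy DAG from the example above: A -> B -> C.
DOWNSTREAM = {"A": ["B"], "B": ["C"], "C": []}
GROUP = {"A", "C"}  # B is deliberately outside the group


def reachable(dag, start):
    """Return every task strictly downstream of start."""
    seen = set()
    stack = list(dag[start])
    while stack:
        task = stack.pop()
        if task not in seen:
            seen.add(task)
            stack.extend(dag[task])
    return seen


def tasks_between_group_members(dag, group):
    """Find tasks outside the group that lie on a path between two members."""
    between = set()
    for member in group:
        for task in reachable(dag, member) - group:
            # task is downstream of a member; is another member downstream of it?
            if reachable(dag, task) & group:
                between.add(task)
    return between


gaps = tasks_between_group_members(DOWNSTREAM, GROUP)
```

   Here `gaps` contains only `B`. Whenever this set is non-empty, the group
is not closed under "downstream of", so clearing it is not the same problem
as clearing a task together with its downstream tasks.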
   
   But if you think otherwise, have thought it through and have answers to all
the possible edge cases here - by all means, please implement it. Make a PR.
Let's review it :)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
