Hello,
This is a repost from Slack, as Jarek suggested this mailing list would be a 
better place to discuss it.

I have a concern about the introduction of a config option to control the 
default value of deferrable (https://github.com/apache/airflow/pull/31712).
A good portion of operators had a parameter to control whether they wait or not 
(wait_until_finished, wait_for_completion, wait_for_termination, or just wait).
I think most implementations of deferrable treat it as an override for this 
setting (i.e. if deferrable = True, the value of wait no longer matters).
My concern is that if users have operators that do not wait (either because the 
default value for wait was False, or they set it manually to False), and they 
set default_deferrable=True, then all their operators will start waiting, which 
might not be what they expect.
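
To make the concern concrete, here is a minimal sketch of the override pattern 
described above. It is a simplified model, not the code of any specific 
operator: the function name and the returned labels are hypothetical, and 
wait_for_completion stands in for whichever wait parameter an operator uses.

```python
def current_behavior(deferrable: bool, wait_for_completion: bool) -> str:
    """Simplified model: deferrable overrides the wait parameter."""
    if deferrable:
        # The task defers and only finishes once the resource is ready,
        # regardless of what wait_for_completion says.
        return "defer"
    if wait_for_completion:
        # Legacy synchronous wait, occupying a worker slot.
        return "wait_in_worker"
    # Fire and forget: the task returns right after submitting the request.
    return "no_wait"


# The same operator, configured not to wait, before and after the default
# for deferrable flips to True.
before = current_behavior(deferrable=False, wait_for_completion=False)
after = current_behavior(deferrable=True, wait_for_completion=False)
print(before, "->", after)
```

Under this model, an operator that used to return immediately starts waiting 
as soon as the default changes, even though the user never asked for it.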
I think it would be nicer if setting the default only changed how operators 
wait (deferring instead of blocking a worker), not whether they wait at all.
It might be too late to change this because there has been a release since 
then, but we could say that:
  • if deferrable is manually set to True, we always wait & defer
  • else if wait_for_completion is False (default value, or set manually), we 
    don’t wait
  • in other cases (wait_for_completion is True and no value set for 
    deferrable), we rely on the config.
That way, users setting this config wouldn’t see their DAGs suddenly taking 
more time because they start waiting where they weren’t before.
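
The resolution order proposed above could be sketched roughly as follows. This 
is only an illustration under a few assumptions: the function name and the 
returned labels are hypothetical, the operator is assumed to be able to tell 
"deferrable not set" apart from an explicit value (modeled here with None), 
and the case of deferrable explicitly set to False with wait_for_completion 
True, which the list above does not cover, is assumed to keep the legacy wait.

```python
from typing import Optional


def resolve_mode(
    deferrable: Optional[bool],
    wait_for_completion: bool,
    default_deferrable: bool,
) -> str:
    """Pick a mode following the proposed precedence (hypothetical helper).

    deferrable is None when the user did not set it on the operator.
    default_deferrable is the value coming from the config.
    """
    if deferrable is True:
        # Explicit opt-in on the operator: always wait and defer.
        return "defer"
    if not wait_for_completion:
        # The operator was not waiting before, so it still does not wait,
        # whatever the config says.
        return "no_wait"
    if deferrable is None:
        # Waiting is requested and deferrable is unset: rely on the config.
        return "defer" if default_deferrable else "wait_in_worker"
    # Assumption: explicit deferrable=False keeps the legacy in-worker wait.
    return "wait_in_worker"


# A non-waiting operator is unaffected by turning the config on.
print(resolve_mode(None, wait_for_completion=False, default_deferrable=True))
# A waiting operator switches from blocking a worker to deferring.
print(resolve_mode(None, wait_for_completion=True, default_deferrable=True))
```

With this ordering, the config only changes how a waiting operator waits, and 
never turns a non-waiting operator into a waiting one.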

This would impact any operator that has both a legacy mechanism to wait 
synchronously and a deferrable mode, especially those where the wait parameter 
defaulted to False, such as RedshiftCreateClusterOperator (but there are 
others).
