A few thoughts:
* in our environment at Lyft, cleared tasks do get picked up by the
scheduler. Is there an issue open for the bug you are referring to? Is
that on 1.9.0?
* "clearing" in both the web UI and CLI also flips the DagRun state back to
running, since the intent of clearing is usually to get the scheduler to
pick it up. There may be caveats when the DagRun doesn't exist or is a
DagRun of the "backfill" type; maybe `clear` should make sure that there's
a proper DagRun
* "clear" is confusing as a verb to use when the actual intent is to
reprocess...
* another [backward compatible] option is to put the feature behind a
feature flag and set that flag to False in your environment
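To make the "clear should make sure that there's a proper DagRun" idea above concrete, here is a rough sketch of what that check could look like. This is plain illustrative Python with stand-in classes, not actual Airflow code; the state names mirror Airflow's but everything else is hypothetical:

```python
# Illustrative sketch only -- stand-in classes, not the real Airflow models.
RUNNING, FAILED = "running", "failed"

class DagRun:
    def __init__(self, run_type="scheduled", state=FAILED):
        self.run_type = run_type
        self.state = state

def clear_task(dag_runs, execution_date):
    """After clearing task instances, ensure a usable DagRun exists."""
    run = dag_runs.get(execution_date)
    if run is None:
        # Caveat from the discussion: no DagRun exists yet, so create one
        # so the scheduler has something to pick up.
        run = DagRun()
        dag_runs[execution_date] = run
    if run.run_type != "backfill":
        # Flip the state back to running so the scheduler re-examines it;
        # backfill-type runs are left alone, per the caveat above.
        run.state = RUNNING
    return run

runs = {}
run = clear_task(runs, "2018-06-29")
print(run.state)  # running
```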
Whether the web server and the upcoming REST API should be able to talk to
an executor is a bigger question. We may want Airflow to allow running
arbitrary DAGs (through the upcoming DagFetcher abstraction) on arbitrary
executors, and the REST API (web server) may need to communicate with the
executor for that purpose. Though that's far enough ahead that it's mostly
unrelated to your current concern.
I'll go a bit further and say that eventually we should have a way to run
local, "in-development" DAGs on a remote executor in an ad hoc fashion. In
next-gen Airflow that would go through publishing a DAG to the DagFetcher
abstraction (say git://{org}/{repo}/{gitref_say_a_branch}/{path/to/dag.py}),
then running say `airflow test` or `airflow backfill` through the REST API
and getting that to run remotely on k8s (through the KubernetesExecutor)
for instance. I think this may require the REST API talking to the executor.
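For what it's worth, parsing a DagFetcher-style URI like the one above is straightforward. None of this exists in Airflow today; the function name and the returned fields are purely hypothetical:

```python
# Hypothetical sketch of splitting a DagFetcher URI of the form
# git://{org}/{repo}/{gitref}/{path/to/dag.py} into its parts.
from urllib.parse import urlparse

def parse_dag_ref(uri):
    parsed = urlparse(uri)
    org = parsed.netloc                       # git://<org>/...
    repo, gitref, *path = parsed.path.lstrip("/").split("/")
    return {
        "org": org,
        "repo": repo,
        "gitref": gitref,                     # e.g. a branch name
        "path": "/".join(path),               # path to the DAG file
    }

ref = parse_dag_ref("git://my-org/my-repo/my-branch/dags/example.py")
print(ref["path"])  # dags/example.py
```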
Max
On Fri, Jun 29, 2018 at 8:03 PM Feng Lu <[email protected]> wrote:
> Re-attaching the image..
>
> On Fri, Jun 29, 2018 at 4:54 PM Feng Lu <[email protected]> wrote:
>
>> Hi all,
>>
>> Please take a look at our proposal to deprecate Run task in Airflow
>> webUI.
>>
>> *What?*
>> Deprecate Run task support in the Airflow webUI and make it a no-op for
>> now.
>>
>> [image: xGAcOrcLJE4.png]
>>
>> *Why?*
>>
>> 1. It only works with CeleryExecutor
>>
>> <https://github.com/apache/incubator-airflow/blob/master/airflow/www/views.py#L1001-L1003>
>> and renders an inconsistent experience for other types of Executors.
>> 2. It requires the Airflow webserver to have a direct connection to the
>> message backend of CeleryExecutor, which opens more vulnerabilities in
>> the system. In many cases, users may want to restrict access to the
>> celery messaging backend as much as possible.
>>
>> *Mitigation:*
>> This Run task feature mainly exists to re-execute a previously running
>> task that got stuck in the running state and was deleted manually.
>> It's currently a two-step process:
>> 1. Navigate to the task instance view page and delete the running task
>> that's stuck
>> 2. Go back to DAG/task view and click "Run"
>>
>> We propose to combine the two steps: after a running task is deleted,
>> the Airflow scheduler will automatically re-schedule the task (which it
>> does today) and re-queue it (there's a bug that needs to be fixed first).
>>
>> *Fix:*
>> The scheduler currently doesn't automatically re-queue the task even
>> though the task instance has changed from the running to the scheduled
>> state. The heartbeat check incorrectly returns a success in this case.
>> The root cause is that LocalTaskJob doesn't set the job state to failed
>> (details
>> <https://github.com/apache/incubator-airflow/blob/master/airflow/jobs.py#L2674-L2681>)
>> when a running task is externally deleted, which confuses the heartbeat
>> check
>> <https://github.com/apache/incubator-airflow/blob/master/airflow/models.py#L443>.
>> Once this is fixed, a killed running task instance will be
>> automatically re-scheduled and enqueued for execution (verified locally).
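The fix described in the quoted proposal can be modeled in miniature like this. These are simplified stand-in classes, not the real LocalTaskJob or heartbeat code; only the state names mirror Airflow's:

```python
# Simplified model of the described fix -- not actual Airflow code.
RUNNING, SCHEDULED, FAILED = "running", "scheduled", "failed"

class TaskInstance:
    def __init__(self):
        self.state = RUNNING

class LocalTaskJob:
    def __init__(self, task_instance):
        self.task_instance = task_instance
        self.state = RUNNING

    def heartbeat_callback(self):
        # The bug: when the task instance is no longer RUNNING (e.g. it was
        # deleted externally), the job terminated without updating its own
        # state, so the heartbeat check still saw a "healthy" job and the
        # scheduler never re-queued the task.
        if self.task_instance.state != RUNNING:
            self.state = FAILED  # the proposed fix: record the failure

def heartbeat_is_healthy(job):
    # Scheduler-side check: a job counts as alive only while RUNNING.
    return job.state == RUNNING

ti = TaskInstance()
job = LocalTaskJob(ti)
ti.state = SCHEDULED       # running task externally deleted / reset
job.heartbeat_callback()   # with the fix, the job marks itself FAILED
print(heartbeat_is_healthy(job))  # False -> scheduler can re-queue the task
```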
>>
>> Thank you.
>>
>> Feng
>>
>>
>>