Hello again :)

The vote has passed with 3 binding +1 votes (and 4 non-binding +1 votes):

Jarek Potiuk +1 (binding)
Tomasz Urbaszek +1 (binding)
Kaxil Naik +1 (binding)
Bartłomiej Hirsz +1 (non-binding)
Eugen Kosteev +1 (non-binding)
Mars Group +1 (non-binding)
Łukasz Wyszomirski +1 (non-binding)

Thanks everyone!

We will proceed with the implementation.

Kind regards,
Mateusz Nojek


śr., 23 lut 2022 o 21:04 Kaxil Naik <[email protected]> napisał(a):

> Sounds good. While I don't think we should enforce the "watcher" task for
> all the example DAGs, it isn't a blocker.
>
> +1 (binding)
>
> On Tue, 22 Feb 2022 at 07:43, Mateusz Nojek <[email protected]> wrote:
>
>> Hello Kaxil and thanks for the questions.
>> Let me answer them...
>>
>>    1. We don't, but since we want to have one file that will be an
>>    example DAG and also a system test at the same time, we need to decide
>>    where to store them. We have 2 options: store them as tests somewhere in
>>    `tests` directory or store them as example DAGs (e.g. in providers
>>    directory). Since they fulfill their 'example' role using dynamic
>>    documentation that is generated from code (no matter where the files are
>>    stored), this purpose we have already covered. The 'test' role, on the
>>    other hand, is fulfilled by storing them in the right place, indicating
>>    that these are actually tests that can be executed. Also, we decided to
>>    move them because since we are planning to do the redesign, we can find a
>>    better place for them. `tests/system/providers` seems like a proper place
>>    for system tests.
>>    2. I am not for or against setting up the DebugExecutor inside the
>>    environment variable, but I would like to see strong arguments against
>>    using it the way it is described in the AIP, so that it can possibly break
>>    the execution in some specific cases. Examples would be very desired.
>>    3. You're right, if all tasks do not have any trigger rules, we don't
>>    need the watcher task. But since probably most of them will have teardown
>>    tasks (tasks with trigger_rule set to "all_done" to do the cleanup after
>>    test) we need to introduce this watcher task to handle those. We actually
>>    don't need to use the watcher task in DAGs without teardown tasks, but 
>> it's
>>    actually better to use it always because it doesn't provide anything other
>>    than to support the possibility of having a teardown task, so if anyone in
>>    the future decides to add some task to remove some resources, watcher task
>>    will be already there to handle it. It doesn't bring that much complexity
>>    to the code and computational consumption. Also, we are going to have a
>>    precommit hook that will check against absence of this piece of code and
>>    assure that it's there in every test. Of course, it's only my opinion and
>>    if we later decide to not use watcher task where it is not essentially
>>    needed, we can avoid adding it, but having it everywhere creates a kind of
>>    'framework' for our tests and saves us from possible mistakes.
>>
>> One of the concepts of the new design is to minimize the amount of
>> environment variables to maintain. When I was fixing and refactoring system
>> tests, the part that I found the most difficult was maintaining the list of
>> environment variables. Even if every possible variable has the "good"
>> default, it was not running successfully on the first run. I needed to
>> check all the values of variables and see if they are sensible. They
>> weren't. The problem is that they age very fast and with every change to
>> the Airflow, they become brittle, outdated, useless. One of the main
>> pillars of the new design of system tests is to minimize the number of
>> environment variables needed to be maintained so that the DAGs just work
>> without that much hassle required to configure them.
>>
>> I hope it delivers answers your concerns.
>>
>> Kind regards,
>> Mateusz Nojek
>>
>>
>> pon., 21 lut 2022 o 18:31 Kaxil Naik <[email protected]> napisał(a):
>>
>>> Apologies for chiming in late. I read the AIP now, please find my
>>> questions/comments below:
>>>
>>>    1. Why do we need to move the DAGs from example_dags ?
>>>    2. Hardcoding DebugExecutor using environment variables like that
>>>    doesn't feel right. Might cause issues when running with other executors.
>>>    3. The Watcher task - why do we need that? If the example DAGs don't
>>>    set any Trigger rules, the default Trigger rule is "all_success" so the
>>>    DagRun will fail if any task fails.
>>>
>>> The only thing I think is needed to make the example DAGs actually run
>>> in CI/locally is making sure the values comes from Environment Variables
>>> while the DAG itself have sensible defaults.
>>>
>>> For instance, the following example should work fine if the environment
>>> variables are set.
>>>
>>> AIRFLOW__CORE__EXECUTOR=DebugExecutor
>>> python 
>>> airflow/providers/google/cloud/example_dags/example_gcs_timespan_file_transform.py
>>>
>>> What would be ideal (according to me) is to have airflow test run work
>>> without execution_date which will take care of running it via
>>> DebugExecutor. (docs
>>> <https://airflow.apache.org/docs/apache-airflow/stable/cli-and-env-variables-ref.html#test>
>>> )
>>>
>>> This opens up the possibility for us in future to provide more useful
>>> debug logs (like import time, number of DAGs, warn when a DAG doesn't
>>> follow best practices etc) via the "airflow test" command. While this is
>>> out of scope of the current AIP, I feel it might be a great value add.
>>>
>>> Just my 2-cents.
>>>
>>> Regards,
>>> Kaxil
>>>
>>>
>>>
>>> On Mon, 21 Feb 2022 at 07:20, Mateusz Nojek <[email protected]> wrote:
>>>
>>>> Hello everyone,
>>>>
>>>> Since there was no response from the community during the weekend on
>>>> this voting, I am extending it for the next 48 hours, so it will end on
>>>> Wednesday 23rd February 9 AM CET.
>>>>
>>>> Current state is:
>>>> 2/3 required binding votes,
>>>> additional 4 non-binding votes.
>>>>
>>>> Kind regards,
>>>> Mateusz Nojek
>>>>
>>>>
>>>> niedz., 20 lut 2022 o 12:01 Mateusz Nojek <[email protected]>
>>>> napisał(a):
>>>>
>>>>> Hi everyone,
>>>>>
>>>>> It's less than 24 hours till the end of the voting.
>>>>> If anyone of you still hesitates, this is the time to share your
>>>>> opinion with a vote on this AIP :)
>>>>> Thank you!
>>>>>
>>>>> Kind regards,
>>>>> Mateusz Nojek
>>>>>
>>>>>
>>>>> pt., 18 lut 2022 o 21:36 Tomasz Urbaszek <[email protected]>
>>>>> napisał(a):
>>>>>
>>>>>> +1 (binding)
>>>>>>
>>>>>> Tomek
>>>>>>
>>>>>

Reply via email to