Re: Airflow Feature Design Feedback

Deng Xiaodong Sun, 18 Oct 2020 05:49:58 -0700

I have the same concern as Yu Qian does, and a bit more elaborations follow
below:


- "log" table
<https://github.com/apache/airflow/blob/master/airflow/models/log.py> has
fields: *id*, *dttm*, *dag_id*, *task_id*, *event*, *execution_date*,
*owner*, *extra*. Seems to me only "*event*" or "*extra*" can provide some
room for this use case. But does it support more complicated cases well?
For example, for the same task instance, there may be back and forth tries,
and I may mark it Success/Failure for multiple times back and forth, with
multiple comments.

- As demonstrated by the field list of the "log" table, it's at "task"
level (it has field "taks_id"). Ace has mentioned that "*Users want to be
able to add an optional reason/description when they click on the Mark
Success/Failed for both task level in graph views or dag level in tree
views*". So my question would be how the "dag level" should be supported if
we use the "log" table to implement this feature?


XD

On Sun, Oct 18, 2020 at 11:13 AM Yu Qian <[email protected]> wrote:

> Did anyone consider having a more generic comments feature? The feature
> Ace described is an example of a comment being useful. E.g. when a user
> clears a task or marks it success/failed, he probably wants to include a
> comment for his action when he clicks OK. If a task failed and the user
> decided to do nothing about it, he may also wants to leave a comment with
> the reason.
>
> It'll be nice if these comments can optionally be displayed on the UI
> somewhere. To do that, it's likely cleaner to create a dedicated comments
> table where each comment has a foreign key to associate it with the
> TaskInstance.
>
> On Sat, Oct 17, 2020 at 1:30 AM Vikram Koka <[email protected]> wrote:
>
>> +1 for feature. Strong preference for storing it in the log table, rather
>> than in the task instance.
>>
>>
>>
>> On Fri, Oct 16, 2020 at 10:22 AM Tomasz Urbaszek <[email protected]>
>> wrote:
>>
>>> +1, I know a customer who was requesting it too
>>>
>>> On Fri, Oct 16, 2020 at 6:52 PM Kaxil Naik <[email protected]> wrote:
>>>
>>>> +1
>>>>
>>>> On Fri, Oct 16, 2020 at 5:48 PM Jarek Potiuk <[email protected]>
>>>> wrote:
>>>>
>>>>> +1
>>>>>
>>>>> On Fri, Oct 16, 2020 at 6:42 PM Sumit Maheshwari <
>>>>> [email protected]> wrote:
>>>>>
>>>>>> +1 for the feature. Looking at the schema of the log table, I think
>>>>>> it's perfect to store such kind of information.
>>>>>>
>>>>>> On Fri, Oct 16, 2020 at 9:55 PM Daniel Imberman <
>>>>>> [email protected]> wrote:
>>>>>>
>>>>>>> This could be pretty valuable for future audits. I’d personally
>>>>>>> rather avoid adding fields to the DB in general. Could we store it 
>>>>>>> wherever
>>>>>>> we store normal failure messages?
>>>>>>>
>>>>>>> via Newton Mail
>>>>>>> <https://cloudmagic.com/k/d/mailapp?ct=dx&cv=10.0.51&pv=10.15.6&source=email_footer_2>
>>>>>>>
>>>>>>> On Fri, Oct 16, 2020 at 9:21 AM, Ace Haidrey
>>>>>>> <[email protected]> wrote:
>>>>>>>
>>>>>>> Hi team,
>>>>>>>
>>>>>>> My name is Ace, I’m a data engineer at Pinterest, and I wanted to
>>>>>>> get some feedback from the community on how they’d propose designing a
>>>>>>> feature that has been asked for us multiple times, that we’d like to
>>>>>>> implement in-house.
>>>>>>>
>>>>>>> The ask is the following:
>>>>>>> - Users want to be able to add an optional reason/description when
>>>>>>> they click on the Mark Success/Failed for both task level in graph 
>>>>>>> views or
>>>>>>> dag level in tree views. This is to clearly state why the actions were
>>>>>>> taken on higher priority flows for when others are stepping in to look 
>>>>>>> in
>>>>>>> it.
>>>>>>>
>>>>>>> For us the feature makes sense to have, and I was wondering if this
>>>>>>> is something the greater group would like to have upstream as well. And 
>>>>>>> if
>>>>>>> so then we wanted to know what was preferred.
>>>>>>>
>>>>>>> 1. We can add this information in the UI for the confirmation view
>>>>>>> and an optional textbook to add details similar to how users can add 
>>>>>>> conf
>>>>>>> variables when triggering a dag option is selected in the UI.
>>>>>>>
>>>>>>> Then after this, storing this information is the question. Do we
>>>>>>> want to store this information in the action logging extra blob and then
>>>>>>> add a view in the dag view where it has actions just pertaining to this 
>>>>>>> dag
>>>>>>> and its tasks. (We plan to add this personally anyways for ourselves as
>>>>>>> going to the big list of actions for all dags and searching is not
>>>>>>> convenient always).
>>>>>>>
>>>>>>> Or do we want to add a new column in the task Instance model and dag
>>>>>>> run model to store the descriptions and retrieve the information from
>>>>>>> there.
>>>>>>>
>>>>>>> Tradeoffs here include the addition of a schema change, where the
>>>>>>> new column would generally be sparse, and the data is more resultant 
>>>>>>> from
>>>>>>> an action vs necessarily being stored as part of the task run/dag run.
>>>>>>>
>>>>>>> Please let me know on any feedback, concerns, ideas, and more!
>>>>>>>
>>>>>>> Cheers,
>>>>>>> Ace
>>>>>>>
>>>>>>>
>>>>>
>>>>> --
>>>>>
>>>>> Jarek Potiuk
>>>>> Polidea <https://www.polidea.com/> | Principal Software Engineer
>>>>>
>>>>> M: +48 660 796 129 <+48660796129>
>>>>> [image: Polidea] <https://www.polidea.com/>
>>>>>
>>>>>

Re: Airflow Feature Design Feedback

Reply via email to