s/dagbag import exception/dag import timeout exception/

On Wed, Jul 31, 2019 at 11:17 PM Kevin Yang <yrql...@gmail.com> wrote:

> Hi Jonathan, for your problem, aside from waiting for AIP-24 for the long
> term, you can try setting the dagbag_import_timeout
> <https://github.com/apache/airflow/blob/master/airflow/config_templates/default_airflow.cfg#L162>
> to a smaller value so that parsing of those slow DAG files ends faster. Also,
> I don't think parsing one DAG can block the parsing of other DAG files, even
> if we parse all of them in a single thread in the webserver. All exceptions,
> including the dagbag import exception, will be captured and logged
> <https://github.com/apache/airflow/blob/master/airflow/models/dagbag.py#L197-L203>
> .
>
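To illustrate the point about captured exceptions, here is a minimal sketch
(not Airflow's actual code) of a single-threaded parse loop in which every
exception, including a timeout, is caught and logged per file, so one slow
file cannot stop the others from loading. The names ImportTimeout, parse_all
and the two fake parsers are hypothetical and exist only for this example.

```python
import logging

log = logging.getLogger("dag_parsing_sketch")


class ImportTimeout(Exception):
    """Hypothetical stand-in for a DAG import timeout exception."""


def parse_all(parsers):
    """Run each file's parser in turn; record failures instead of raising.

    `parsers` maps a file name to a zero-argument callable that returns a
    list of DAG ids or raises an exception.
    """
    dags, import_errors = {}, {}
    for filename, parse in parsers.items():
        try:
            dags[filename] = parse()
        except Exception as exc:  # captured and logged, never re-raised
            log.warning("Failed to import %s: %r", filename, exc)
            import_errors[filename] = str(exc)
    return dags, import_errors


def slow_file():
    # Simulates a file whose parsing ran past the configured timeout.
    raise ImportTimeout("parsing exceeded timeout")


def good_file():
    return ["example_dag"]


dags, import_errors = parse_all({"slow.py": slow_file, "good.py": good_file})
```

With a smaller timeout the slow file simply fails sooner; the loop still moves
on to good.py either way.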
> Love Dan's ideas and agree with Fokko to start small and expand.
>
> I scanned through PR5701 <https://github.com/apache/airflow/pull/5701> and
> it is exactly what I care about most for this AIP--to me the others come
> naturally after we have a DAG serialization pattern defined. Good job Zhou
> and pardon me for not having enough bandwidth to review it thoroughly. +1
> for limiting the scope of this AIP to items 1 and 3 in your proposed timeline.
>
>
> Cheers,
> Kevin Y
>
> On Wed, Jul 31, 2019 at 5:11 PM Zhou Fang <zhouf...@google.com.invalid>
> wrote:
>
>> I implemented the first version of the DAG serialization part of AIP-24:
>> https://github.com/apache/airflow/pull/5701. Please take a look if you are
>> interested @all. Thanks!
>>
>> It contains almost all fields of DAGs and tasks in the serialization (an
>> example of serialized DAG here:
>>
>> https://github.com/apache/airflow/blob/35e38f19b09646a0f85a2a7866a8d9aacc345252/tests/dags/test_dag_serialization.py#L100
>> ).
>> So basically the webserver can still treat them as before. No webserver UI
>> code change is needed. The benefit is that we can use it for 1.10.*.
>>
>> Of course, it is a short-term fix compared to many long-term proposals.
>>
>> It only contains serialization. I verified its usage in the UI end-to-end by
>> using the Async DAG Loader in https://github.com/apache/airflow/pull/5594
>> .
>> I split the DAG serialization out of 5594 since the Async DAG Loader is
>> optional. (I suddenly recall that with N webserver processes + 1 async DAG
>> loading process, it may solve the webserver inconsistency problem??)
>>
>>
>> On Wed, Jul 31, 2019 at 10:33 AM Tao Feng <fengta...@gmail.com> wrote:
>>
>> > hey Zhou,
>> >
>> > Great to see this happen and be made backward compatible. I think
>> > persisting DAGs into the DB is definitely needed. And it will make
>> > migration easier with a lightweight approach. At Lyft we sometimes observe
>> > a nondeterministic increase in scheduling delay once users add some
>> > dynamically generated large DAGs with thousands of tasks.
>> >
>> > I will spend some time looking at your proposal in more detail. But I
>> > agree that this is the most important pain point that we should address.
>> > And let me know if there is anything I can do to help facilitate this.
>> >
>> >
>> > On Mon, Jul 29, 2019 at 2:13 PM Zhou Fang <zhouf...@google.com.invalid>
>> > wrote:
>> >
>> > > Thanks everyone for the discussion. The comments are very helpful.
>> > >
>> > > AIP-24 that we proposed here is really a short-term one that minimizes
>> > > the change, for fast launch and compatibility. I agree with the benefits
>> > > of the long-term proposals. It would be great if AIP-24 can be a first
>> > > step (if we can agree on the basic serialization approach). Then we can
>> > > gradually apply long-term fixes.
>> > >
>> > > I summarized a few long-term proposals (from Fokko and Ash) and added a
>> > > 'timeline' in AIP-24 (to make things clearer):
>> > >
>> > > *Terms*
>> > >
>> > >    - (this) stringified DAG: a patch to the current DAG that can be
>> > >    JSONified
>> > >    - (long-term) serialized DAG: a new serializable DAG class used by
>> > >    webserver/scheduler
>> > >
>> > > *Proposed timeline*
>> > >
>> > >    1. (this) JSON Serialization of DAGs
>> > >       1. will be out with https://github.com/apache/airflow/pull/5594
>> > >
>> > >    2. (this, optional) Asynchronous DAG loading in webserver
>> > >       1. webserver process uses a background process to collect DAGs,
>> > >       solving the scalability issue before DAG persistence in DB is out
>> > >       2. webserver process itself does not need to restart every 30s to
>> > >       collect DAGs
>> > >       3. will be out with https://github.com/apache/airflow/pull/5594
>> > >
>> > >    3. (this) DAG persistence in DB for webserver
>> > >       1. minimal Airflow code change
>> > >       2. an optional feature enabled via configuration
>> > >       3. rolled out with Airflow 1.10.5
>> > >
>> > >    4. (this, optional) Using DAG cached in DB for scheduling
>> > >
>> > >    5. (long-term) Defining serialized DAG for webserver
>> > >       1. this proposal keeps all fields of DAG/Operator; however, some
>> > >       fields are not used by webserver or scheduler
>> > >       2. trimming these fields is easy, just provide a list of fields to
>> > >       include or exclude (Sec 2.3): _serialize_object(x, visited_dags)
>> > >       => _serialize_object(x, visited_dags, include=['foo'],
>> > >       exclude=['bar'])
>> > >       3. we should carefully check all webserver/scheduler code to make
>> > >       sure trimmed fields are not used, e.g., *task.owner* is used in
>> > >       webserver
>> > >
>> > >    6. (long-term) Defining serialized DAG for scheduler
>> > >       1. once we have 'stringified DAG' or 'serialized DAG',
>> > >       SimpleDAG/SimpleTaskInstance used by scheduler are not needed
>> > >       2. adding more fields to stringified DAGs to be compatible with
>> > >       scheduler
>> > >
>> > >    7. (long-term) Directly reading DAGs from DB in webserver
>> > >       1. let webserver process fetch data from DB, instead of making a
>> > >       DAG bag and refreshing it
>> > >       2. it solves the webserver inconsistency issue
>> > >
>> > >    8. (long-term) Event-driven DAG parsing
>> > >       1. instead of polling DAG files for updating/deleting DAGs,
>> > >       event-based approaches, *e.g.*, inotify (
>> > >       https://pypi.org/project/inotify_simple/) can be used
>> > >
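The include/exclude trimming described in item 5 of the timeline can be
sketched roughly as follows. This is only an illustration under my own
assumptions, not the code in the PR: serialize_object, _to_jsonable and
FakeTask are hypothetical names, and the real _serialize_object also takes a
visited_dags argument that is left out here.

```python
import json


def _to_jsonable(value):
    """Convert a value into something json.dumps accepts."""
    if isinstance(value, (str, int, float, bool)) or value is None:
        return value
    if isinstance(value, (list, tuple)):
        return [_to_jsonable(v) for v in value]
    if isinstance(value, dict):
        return {str(k): _to_jsonable(v) for k, v in value.items()}
    return str(value)  # fallback: keep a readable string form


def serialize_object(obj, include=None, exclude=None):
    """Turn an object's public attributes into a JSON-friendly dict.

    `include` keeps only the named fields; `exclude` drops the named fields.
    """
    fields = {k: v for k, v in vars(obj).items() if not k.startswith("_")}
    if include is not None:
        fields = {k: v for k, v in fields.items() if k in include}
    if exclude is not None:
        fields = {k: v for k, v in fields.items() if k not in exclude}
    return {k: _to_jsonable(v) for k, v in fields.items()}


class FakeTask:
    """Stand-in for an operator; only here to exercise the helper."""

    def __init__(self, task_id, owner, retries):
        self.task_id = task_id
        self.owner = owner
        self.retries = retries


task = FakeTask("t1", "airflow", 3)
full = json.dumps(serialize_object(task), sort_keys=True)
trimmed = json.dumps(serialize_object(task, exclude=["retries"]), sort_keys=True)
only_id = serialize_object(task, include=["task_id"])
```

Trimming by exclusion keeps unknown fields by default, which is the safer
choice while it is still unclear which fields the webserver reads.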
>> > >
>> > >
>> > >
>> > >
>> > > On Mon, Jul 29, 2019 at 3:23 AM Kaxil Naik <kaxiln...@gmail.com>
>> wrote:
>> > >
>> > > > Thanks all for the input and thanks Zhou too for the detailed AIP.
>> > > >
>> > > > The WIP PR can be a good first step to overall optimization.
>> > > >
>> > > > Let's sync up on the progress you have already made & what we want to
>> > > > target.
>> > > >
>> > > > @Jarek Potiuk <jarek.pot...@polidea.com> & @Fokko - If we manage to
>> > > > make it entirely backward-compatible with an enable/disable flag as we
>> > > > mentioned, we can think of including it in 1.10.5, but I am in favor of
>> > > > removing / cleaning up stuff like pickles, dropping Py 2.0, cutting
>> > > > Airflow 2.0 and including this change there.
>> > > >
>> > > >
>> > > >
>> > > >
>> > > > On Mon, Jul 29, 2019 at 1:03 PM Jarek Potiuk <
>> jarek.pot...@polidea.com
>> > >
>> > > > wrote:
>> > > >
>> > > > > Actually, I have also been doing a lot of v1-10-test merges during
>> > > > > the last few months (probably several tens of them already). In
>> > > > > fact, the conflicts are rarely difficult to resolve. We usually have
>> > > > > small, localised changes, and until we go for full Black file
>> > > > > re-formatting, we should be ok (and the change from Zhou seems
>> > > > > rather small and localised).
>> > > > >
>> > > > > J.
>> > > > >
>> > > > > On Mon, Jul 29, 2019 at 9:25 AM Driesprong, Fokko
>> > <fo...@driesprong.frl
>> > > >
>> > > > > wrote:
>> > > > >
>> > > > > > I would be hesitant to merge it into 1.10.5. When I try to
>> > > > > > backport anything into the 1.x branch, I get a whole bunch of
>> > > > > > merge conflicts, even on the trivial tickets. For me, the only one
>> > > > > > who can really comment on this would be Ash, since he's doing the
>> > > > > > bulk of the conflict resolving. Apart from that, I'm really
>> > > > > > excited to make this happen!
>> > > > > >
>> > > > > > Cheers, Fokko
>> > > > > >
>> > > > > >
>> > > > > >
>> > > > > > On Sun, Jul 28, 2019 at 8:23 PM Jarek Potiuk <
>> > > > > > jarek.pot...@polidea.com> wrote:
>> > > > > >
>> > > > > > > Some thoughts I have after looking at the proposal from Zhou.
>> > > > > > >
>> > > > > > > I think this is one of the most important things feature-wise
>> > > > > > > for Airflow. It looks like we have several in-progress attempts
>> > > > > > > to solve the problem and I guess we should agree on a common
>> > > > > > > approach.
>> > > > > > >
>> > > > > > > I like the approach of Zhou (AIP-24) very much. It does seem to
>> > > > > > > minimise the changes needed in Airflow, and it means that, with
>> > > > > > > some optimisations (the caching mentioned by Fokko), it can
>> > > > > > > solve the major pain points, I think relatively quickly, and is
>> > > > > > > potentially portable to 1.10.5 if we have it.
>> > > > > > >
>> > > > > > > I wonder how much it overlaps with/differs from Kaxil's and
>> > > > > > > Ash's ideas. If I read it correctly, it sounds like this idea
>> > > > > > > will contain some more "fundamental" changes: ones that are
>> > > > > > > likely less backwards-compatible and potentially take longer to
>> > > > > > > implement and test, while likely solving some of the problems
>> > > > > > > better or even solving other problems. Am I right with my
>> > > > > > > assumptions?
>> > > > > > >
>> > > > > > > I think more information on this might be helpful so that we
>> > > > > > > all know whether those are two different AIPs or whether they
>> > > > > > > can be joined in one effort, and how they relate to
>> > > > > > > AIP-18/AIP-19 (should those be deprecated or independently
>> > > > > > > implemented?). Also, since the 2.0.0 release is half a year
>> > > > > > > ahead, we should consider how it impacts the roadmap.
>> > > > > > >
>> > > > > > > I can see three approaches here that we as a community can
>> > > > > > > follow (maybe I am missing some :) ):
>> > > > > > >
>> > > > > > > 1) focus our work on a single "complete" solution that will take
>> > > > > > > longer and targets 2.0.0.
>> > > > > > > 2) work on two of them: one quick/fast, potentially portable to
>> > > > > > > 1.10.5, and one longer-term for 2.0.0.
>> > > > > > > 3) decide that the simple solution we have from Zhou (maybe with
>> > > > > > > some modifications) is our target solution (for both 1.10.5, if
>> > > > > > > we have it, and 2.0.0).
>> > > > > > >
>> > > > > > > J.
>> > > > > > >
>> > > > > > > On Sat, Jul 27, 2019 at 11:43 AM Kevin Yang <
>> yrql...@gmail.com>
>> > > > wrote:
>> > > > > > >
>> > > > > > > > Nice job Zhou!
>> > > > > > > >
>> > > > > > > > Really excited, exactly what we wanted for the webserver
>> > > > > > > > scaling issue. I want to add another big driver that
>> > > > > > > > previously got Airbnb thinking about this, in support of the
>> > > > > > > > effort: it can bring consistency not only between webservers
>> > > > > > > > but also between the webserver and scheduler/workers. It may
>> > > > > > > > be less of a problem if the total DAG parsing time is small,
>> > > > > > > > but for us the total DAG parsing time is 15+ mins and we had
>> > > > > > > > to set the webserver (gunicorn subprocesses) restart interval
>> > > > > > > > to 20 mins, which leads to a worst-case 15+20+15=50 mins delay
>> > > > > > > > between the scheduler starting to schedule things and users
>> > > > > > > > being able to see their deployed DAGs/changes...
>> > > > > > > >
>> > > > > > > > I'm not so sure about the scheduler performance improvement:
>> > > > > > > > currently we already feed the main scheduler process with
>> > > > > > > > SimpleDag through DagFileProcessorManager running in a
>> > > > > > > > subprocess--in the future we would feed it with data from the
>> > > > > > > > DB, which is likely slower (though the difference should have
>> > > > > > > > negligible impact on scheduler performance). In fact, if we
>> > > > > > > > keep the existing behavior of trying to schedule only freshly
>> > > > > > > > parsed DAGs, then we may need to deal with a consistency
>> > > > > > > > issue--the dag processor and the scheduler race to update the
>> > > > > > > > flag indicating whether the DAG is newly parsed. No big deal
>> > > > > > > > there, just some thoughts off the top of my head that will
>> > > > > > > > hopefully be helpful.
>> > > > > > > >
>> > > > > > > > And good idea on pre-rendering the template; I believe
>> > > > > > > > template rendering was the biggest concern in the previous
>> > > > > > > > discussion. We've also chosen the pre-rendering+JSON approach
>> > > > > > > > in our smart sensor API
>> > > > > > > > <https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-17+Airflow+sensor+optimization>
>> > > > > > > > and it seems to be working fine--a supporting case for your
>> > > > > > > > proposal ;) There's a WIP PR
>> > > > > > > > <https://github.com/apache/airflow/pull/5499> for it just in
>> > > > > > > > case you are interested--maybe we can even share some logic.
>> > > > > > > >
>> > > > > > > > Thumbs-up again for this and please don't hesitate to reach
>> > > > > > > > out if you want to discuss further with us or need any help
>> > > > > > > > from us.
>> > > > > > > >
>> > > > > > > >
>> > > > > > > > Cheers,
>> > > > > > > > Kevin Y
>> > > > > > > >
>> > > > > > > > On Sat, Jul 27, 2019 at 12:54 AM Driesprong, Fokko
>> > > > > > <fo...@driesprong.frl
>> > > > > > > >
>> > > > > > > > wrote:
>> > > > > > > >
>> > > > > > > > > Looks great Zhou,
>> > > > > > > > >
>> > > > > > > > > I have one thing that pops into my mind while reading the
>> > > > > > > > > AIP: whether we should keep the caching at the webserver
>> > > > > > > > > level. As the famous quote goes: *"There are only two hard
>> > > > > > > > > things in Computer Science: cache invalidation and naming
>> > > > > > > > > things." -- Phil Karlton*
>> > > > > > > > >
>> > > > > > > > > Right now, the fundamental change that is being proposed in
>> > > > > > > > > the AIP is fetching the DAGs from the database in a
>> > > > > > > > > serialized format, and not parsing the Python files all the
>> > > > > > > > > time. This will already give a great performance improvement
>> > > > > > > > > on the webserver side because it removes a lot of the
>> > > > > > > > > processing. However, since we're still fetching the DAGs
>> > > > > > > > > from the database at a regular interval and caching them in
>> > > > > > > > > the local process, we still have the two issues that Airflow
>> > > > > > > > > is suffering from right now:
>> > > > > > > > >
>> > > > > > > > >    1. No snappy UI because it is still polling the database
>> > > > > > > > >    at a regular interval.
>> > > > > > > > >    2. Inconsistency between webservers because they might
>> > > > > > > > >    poll at different intervals, I think we've all seen this:
>> > > > > > > > >    https://www.youtube.com/watch?v=sNrBruPS3r4
>> > > > > > > > >
>> > > > > > > > > As I also mentioned in the Slack channel, I strongly feel
>> > > > > > > > > that we should be able to render most views from the tables
>> > > > > > > > > in the database, so without touching the blob. For specific
>> > > > > > > > > views, we could just pull the blob from the database. In
>> > > > > > > > > this case we always have the latest version, and we tackle
>> > > > > > > > > the second point above.
>> > > > > > > > >
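A rough sketch of the "render from tables, pull the blob only for specific
views" idea, using sqlite3 purely for illustration. The table name
serialized_dag, its columns, and the helper functions are assumptions made up
for this example, not an existing Airflow schema.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE serialized_dag (dag_id TEXT PRIMARY KEY, data TEXT NOT NULL)"
)


def write_dag(dag_id, dag_dict):
    """Upsert the JSON blob for one DAG (what a parsing process would do)."""
    conn.execute(
        "INSERT OR REPLACE INTO serialized_dag (dag_id, data) VALUES (?, ?)",
        (dag_id, json.dumps(dag_dict)),
    )
    conn.commit()


def list_dag_ids():
    """A list view reads only the plain column and never touches the blob."""
    rows = conn.execute("SELECT dag_id FROM serialized_dag ORDER BY dag_id")
    return [row[0] for row in rows]


def read_dag(dag_id):
    """A detail view pulls the blob at request time, with no local cache."""
    row = conn.execute(
        "SELECT data FROM serialized_dag WHERE dag_id = ?", (dag_id,)
    ).fetchone()
    return json.loads(row[0]) if row else None


write_dag("example", {"tasks": ["a"]})
write_dag("example", {"tasks": ["a", "b"]})  # a later re-parse overwrites the row
```

Because nothing is cached in the webserver process, every webserver reading
the same database sees the same row, which is the consistency argument made
above.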
>> > > > > > > > > To tackle the first one, I also have an idea. We should
>> > > > > > > > > change the DAG parser from a loop to something that uses
>> > > > > > > > > inotify https://pypi.org/project/inotify_simple/. This will
>> > > > > > > > > change it from polling to an event-driven design, which is
>> > > > > > > > > much more performant and less resource hungry. But this
>> > > > > > > > > would be an AIP on its own.
>> > > > > > > > >
>> > > > > > > > > Again, great design and a comprehensive AIP, but I would
>> > > > > > > > > include the caching on the webserver to greatly improve the
>> > > > > > > > > user experience in the UI. Looking forward to the opinion of
>> > > > > > > > > others on this.
>> > > > > > > > >
>> > > > > > > > > Cheers, Fokko
>> > > > > > > > >
>> > > > > > > > >
>> > > > > > > > >
>> > > > > > > > >
>> > > > > > > > >
>> > > > > > > > >
>> > > > > > > > >
>> > > > > > > > >
>> > > > > > > > > On Sat, Jul 27, 2019 at 1:44 AM Zhou Fang
>> > > > > > > > > <zhouf...@google.com.invalid> wrote:
>> > > > > > > > >
>> > > > > > > > > > Hi Kaxil,
>> > > > > > > > > >
>> > > > > > > > > > Just sent out the AIP:
>> > > > > > > > > >
>> > > > > > > > > > https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-24+DAG+Persistence+in+DB+using+JSON+for+Airflow+Webserver+and+%28optional%29+Scheduler
>> > > > > > > > > >
>> > > > > > > > > > Thanks!
>> > > > > > > > > > Zhou
>> > > > > > > > > >
>> > > > > > > > > >
>> > > > > > > > > > On Fri, Jul 26, 2019 at 1:33 PM Zhou Fang <
>> > > > > > > > > > zhouf...@google.com> wrote:
>> > > > > > > > > >
>> > > > > > > > > > > Hi Kaxil,
>> > > > > > > > > > >
>> > > > > > > > > > > We are also working on persisting DAGs into the DB
>> > > > > > > > > > > using JSON for the Airflow webserver in Google Composer.
>> > > > > > > > > > > We aim to minimize the change to the current Airflow
>> > > > > > > > > > > code. Happy to get synced on this!
>> > > > > > > > > > >
>> > > > > > > > > > > Here is our progress:
>> > > > > > > > > > > (1) Serializing DAGs using Pickle to be used in webserver
>> > > > > > > > > > > It has been launched in Composer. I am working on the PR
>> > > > > > > > > > > to upstream it:
>> > > > > > > > > > > https://github.com/apache/airflow/pull/5594
>> > > > > > > > > > > Currently it does not support non-Airflow operators and
>> > > > > > > > > > > we are working on a fix.
>> > > > > > > > > > >
>> > > > > > > > > > > (2) Caching Pickled DAGs in DB to be used by webserver
>> > > > > > > > > > > We have a proof-of-concept implementation and are working
>> > > > > > > > > > > on an AIP now.
>> > > > > > > > > > >
>> > > > > > > > > > > (3) Using JSON instead of Pickle in (1) and (2)
>> > > > > > > > > > > We decided to use JSON because Pickle is neither secure
>> > > > > > > > > > > nor human-readable. The serialization approach is very
>> > > > > > > > > > > similar to (1).
>> > > > > > > > > > >
>> > > > > > > > > > > I will update the PR (
>> > > > > > > > > > > https://github.com/apache/airflow/pull/5594) to replace
>> > > > > > > > > > > Pickle with JSON, and send our design of (2) as an AIP
>> > > > > > > > > > > next week. Glad to check together whether our
>> > > > > > > > > > > implementation makes sense and make improvements on it.
>> > > > > > > > > > >
>> > > > > > > > > > > Thanks!
>> > > > > > > > > > > Zhou
>> > > > > > > > > > >
>> > > > > > > > > > >
>> > > > > > > > > > > On Fri, Jul 26, 2019 at 7:37 AM Kaxil Naik <
>> > > > > > > > > > > kaxiln...@gmail.com> wrote:
>> > > > > > > > > > >
>> > > > > > > > > > >> Hi all,
>> > > > > > > > > > >>
>> > > > > > > > > > >> We, at Astronomer, are going to spend time working on
>> > > > > > > > > > >> DAG Serialisation. There are 2 AIPs that are somewhat
>> > > > > > > > > > >> related to what we plan to work on:
>> > > > > > > > > > >>
>> > > > > > > > > > >>    - AIP-18 Persist all information from DAG file in DB
>> > > > > > > > > > >>    <https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-18+Persist+all+information+from+DAG+file+in+DB>
>> > > > > > > > > > >>    - AIP-19 Making the webserver stateless
>> > > > > > > > > > >>    <https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-19+Making+the+webserver+stateless>
>> > > > > > > > > > >>
>> > > > > > > > > > >> We plan to use JSON as the Serialisation format and
>> > > > > > > > > > >> store it as a blob in the metadata DB.
>> > > > > > > > > > >>
>> > > > > > > > > > >> *Goals:*
>> > > > > > > > > > >>
>> > > > > > > > > > >>    - Make Webserver Stateless
>> > > > > > > > > > >>    - Use the same version of the DAG across Webserver &
>> > > > > > > > > > >>    Scheduler
>> > > > > > > > > > >>    - Keep backward compatibility and have a flag
>> > > > > > > > > > >>    (globally & at DAG level) to turn this feature on/off
>> > > > > > > > > > >>    - Enable DAG Versioning (extended Goal)
>> > > > > > > > > > >>
>> > > > > > > > > > >>
>> > > > > > > > > > >> We will be preparing a proposal (AIP) after some
>> > > > > > > > > > >> research and some initial work and will open it for
>> > > > > > > > > > >> suggestions from the community.
>> > > > > > > > > > >>
>> > > > > > > > > > >> We already had some good brainstorming sessions with
>> > > > > > > > > > >> Twitter folks (DanD & Sumit), folks from GoDataDriven
>> > > > > > > > > > >> (Fokko & Bas) & Alex (from Uber), which will be a good
>> > > > > > > > > > >> starting point for us.
>> > > > > > > > > > >>
>> > > > > > > > > > >> If anyone in the community is interested in it or has
>> > > > > > > > > > >> some experience with it and wants to collaborate, please
>> > > > > > > > > > >> let me know and join the #dag-serialisation channel on
>> > > > > > > > > > >> Airflow Slack.
>> > > > > > > > > > >>
>> > > > > > > > > > >> Regards,
>> > > > > > > > > > >> Kaxil
>> > > > > > > > > > >>
>> > > > > > >
>> > > > > > > --
>> > > > > > >
>> > > > > > > Jarek Potiuk
>> > > > > > > Polidea <https://www.polidea.com/> | Principal Software
>> Engineer
>> > > > > > >
>> > > > > > > M: +48 660 796 129 <+48660796129>
>> > > > > > > [image: Polidea] <https://www.polidea.com/>
>> > > >
>> > > > --
>> > > > *Kaxil Naik*
>> > > > *Big Data Consultant | DevOps Data Engineer*
>> > > > *Certified *Google Cloud Data Engineer | *Certified* Apache Spark &
>> > Neo4j
>> > > > Developer
>> > > > *LinkedIn*: https://www.linkedin.com/in/kaxil
