It wouldn't really be serialization. You would still need to watch all the dependent code unless you wanted a continuous parse going on.
On Mon, Apr 24, 2017 at 3:19 PM, Bolke de Bruin <bdbr...@gmail.com> wrote: > That would be close to serialization which you could do with marshmallow > (which works better than pickle). > > B. > > Sent from my iPhone > > > On 25 Apr 2017, at 00:07, Alex Guziel <alex.guz...@airbnb.com.INVALID> > wrote: > > > > You can also use reflection in Python to read the modules all the way > down. > > > > On Mon, Apr 24, 2017 at 3:05 PM, Dan Davydov <dan.davy...@airbnb.com. > invalid > >> wrote: > > > >> Was talking with Alex about the DB case offline, for those we could > support > >> a force refresh arg with an interval param. > >> > >> Manifests would need to be hierarchal but I feel like it would spin out > >> into a full blown build system inevitably. > >> > >> On Mon, Apr 24, 2017 at 3:02 PM, Arthur Wiedmer < > arthur.wied...@gmail.com> > >> wrote: > >> > >>> What if the DAG actually depends on configuration that only exists in a > >>> database and is retrieved by the Python code generating the DAG? > >>> > >>> Just asking because we have this case in production here. It is slowly > >>> changing, so still fits within the Airflow framework, but you cannot > just > >>> watch a file... > >>> > >>> Best, > >>> Arthur > >>> > >>> On Mon, Apr 24, 2017 at 2:55 PM, Bolke de Bruin <bdbr...@gmail.com> > >> wrote: > >>> > >>>> Inotify can work without a daemon. Just fire a call to the API when a > >>> file > >>>> changes. Just a few lines in bash. > >>>> > >>>> If you bundle you dependencies in a zip you should be fine with the > >>> above. > >>>> Or if we start using manifests that list the files that are needed in > a > >>>> dag... > >>>> > >>>> > >>>> Sent from my iPhone > >>>> > >>>>> On 24 Apr 2017, at 22:46, Dan Davydov <dan.davy...@airbnb.com. > >> INVALID> > >>>> wrote: > >>>>> > >>>>> One idea to solve this is to use a daemon that uses inotify to watch > >>> for > >>>>> changes in files and then reprocesses just those files. The hard part > >>> is > >>>>> without any kind of dependency/build system for DAGs it can be hard > >> to > >>>> tell > >>>>> which DAGs depend on which files. > >>>>> > >>>>> On Mon, Apr 24, 2017 at 1:21 PM, Gerard Toonstra < > >> gtoons...@gmail.com> > >>>>> wrote: > >>>>> > >>>>>> Hey, > >>>>>> > >>>>>> I've seen some people complain about DAG file processing times. An > >>> issue > >>>>>> was raised about this today: > >>>>>> > >>>>>> https://issues.apache.org/jira/browse/AIRFLOW-1139 > >>>>>> > >>>>>> I attempted to provide a good explanation what's going on. Feel free > >>> to > >>>>>> validate and comment. > >>>>>> > >>>>>> > >>>>>> I'm noticing that the file processor is a bit naive in the way it > >>>>>> reprocesses DAGs. It doesn't look at the DAG interval for example, > >> so > >>> it > >>>>>> looks like it reprocesses all files continuously in one big batch, > >>> even > >>>> if > >>>>>> we can determine that the next "schedule" for all its dags are in > >> the > >>>>>> future? > >>>>>> > >>>>>> > >>>>>> Wondering if a change in the DagFileProcessingManager could optimize > >>>> things > >>>>>> a bit here. > >>>>>> > >>>>>> In the part where it gets the simple_dags from a file it's currently > >>>>>> processing: > >>>>>> > >>>>>> for simple_dag in processor.result: > >>>>>> simple_dags.append(simple_dag) > >>>>>> > >>>>>> the file_path is in the context and the simple_dags should be able > >> to > >>>>>> provide the next interval date for each dag in the file. > >>>>>> > >>>>>> The idea is to add files to a sorted deque by > >> "next_schedule_datetime" > >>>> (the > >>>>>> minimum next interval date), so that when we build the list > >>>>>> "files_paths_to_queue", it can remove files that have dags that we > >>> know > >>>>>> won't have a new dagrun for a while. > >>>>>> > >>>>>> One gotcha to resolve after that is to deal with files getting > >> updated > >>>> with > >>>>>> new dags or changed dag definitions and renames and different > >> interval > >>>>>> schedules. > >>>>>> > >>>>>> Worth a PR to glance over? > >>>>>> > >>>>>> Rgds, > >>>>>> > >>>>>> Gerard > >>>>>> > >>>> > >>> > >> >