Hi Anita,

Thank you, that saves me some time, as I hadn't tested my fixes yet... :-X

Mit freundlichen Grüßen / Best regards

Dr. Thomas Niebler
Data Scientist
Sales Data Lab, Analytics DC-IH/SDL1

Tel. +49 9352 18-2392
Fax +49 9352 18-0
thomas.nieb...@boschrexroth.de
www.boschrexroth.com

Bosch Rexroth AG
Partensteiner Straße 23
97816 Lohr am Main
GERMANY

Registered office: Stuttgart, court of registration: Amtsgericht Stuttgart HRB 23192
Board of management: Rolf Najork (chairman), Dr. Markus Forschner, Dr. Heiner Lang,
Reinhard Schäfer, Dr. Marc Wucherer
Chairman of the supervisory board: Christoph Kübel

-----Original Message-----
From: Anita Fronczak <akar...@gmail.com>
Sent: Monday, May 4, 2020 9:15 AM
To: dev@airflow.apache.org
Subject: Re: Potential issue with serialized DAGs in decoupled Webserver and
Scheduler

Hello,

Your scenario is possible. I have a PR ready that sets store_serialized_dags to the
appropriate value in all places (web server, CLI, experimental API); with that, DAG
serialization works without access to the DAG files (tested). I will push the PR
today.
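
For clarity, the "appropriate value" is simply the existing [core] setting, read
roughly like this (an illustrative sketch, not the exact diff from the PR):

    from airflow.configuration import conf

    # Use the configured value instead of the hard-coded default of False.
    store_serialized_dags = conf.getboolean(
        'core', 'store_serialized_dags', fallback=False)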

Anita

Mon, 4 May 2020, 08:42, Niebler Thomas (DC-IH/SDL1)
<thomas.nieb...@boschrexroth.de.invalid> wrote:

> Hi all,
>
> I have a probably rather unusual use case:
> Using Airflow 1.10.10, I would like to physically decouple the
> Webserver and the Scheduler for secure-access reasons. According
> to https://airflow.apache.org/docs/stable/dag-serialization.html, this 
> should be a piece of cake with Airflow 1.10.10, since DAGs are stored 
> in the Metadata database and the webserver does not need to access the 
> DAG files anymore. The metadata database is of course reachable by 
> both Airflow
> instances:
>
> Docker Image Instance: Airflow Webserver <----> Docker Image Instance:
> Metadata database <----> Physical Machine: Airflow Scheduler
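>
> For completeness, DAG serialization is enabled on both instances via the
> [core] section of airflow.cfg (a sketch of the relevant settings; the
> update interval shown is just the 1.10.10 default):
>
>   [core]
>   store_serialized_dags = True
>   min_serialized_dag_update_interval = 30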
>
> However, every time I start a DAG manually, it crashes with a rather 
> lengthy error message:
>
>   File "/usr/local/lib/python3.7/site-packages/airflow/www/views.py", 
> line 1255, in trigger
>     external_trigger=True
>   File "/usr/local/lib/python3.7/site-packages/airflow/utils/db.py", 
> line 74, in wrapper
>     return func(*args, **kwargs)
>   File "/usr/local/lib/python3.7/site-packages/airflow/models/dag.py",
> line 1818, in create_dagrun
>     return self.get_dag().create_dagrun(run_id=run_id,
> AttributeError: 'NoneType' object has no attribute 'create_dagrun'
>
> This basically boils down to self.get_dag() being called without the
> store_serialized_dags flag set to True (or whatever value the config is
> set to), so it always uses the default value of False.
> Airflow then attempts to read the DAG from its file, ignores the
> serialized DAG entry in the database and returns None, which obviously
> has no attribute create_dagrun.
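>
> A sketch of the kind of change I would expect (names such as dag_model,
> run_id and execution_date are placeholders, and I am assuming that
> DagModel.get_dag accepts a store_serialized_dags argument in 1.10.10;
> simplified, not the actual Airflow code):
>
>     from airflow.configuration import conf
>     from airflow.utils.state import State
>
>     # Pass the configured flag instead of relying on the default (False),
>     # so the DAG is loaded from the serialized DAG table in the metadata DB.
>     store_serialized = conf.getboolean(
>         'core', 'store_serialized_dags', fallback=False)
>     dag = dag_model.get_dag(store_serialized_dags=store_serialized)
>     dag.create_dagrun(
>         run_id=run_id,
>         execution_date=execution_date,
>         state=State.RUNNING,
>         external_trigger=True,
>     )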
>
> I’ve got several questions here now:
>
>   1.  Is my scenario even possible or am I overlooking something 
> rather obvious?
>   2.  Is this crashing behavior intended? It seems more like a bug
> to me.
>   3.  Is it worth fixing this issue (if it is one) for Airflow 1.10.x, 
> considering that Airflow 2.0.0 does not even contain the corresponding 
> classes anymore and takes a different path?
>
> Mit freundlichen Grüßen / Best regards
>
> Dr. Thomas Niebler
> Data Scientist
> Sales Data Lab, Analytics DC-IH/SDL1
>
> Tel. +49 9352 18-2392
> Fax +49 9352 18-0
> thomas.nieb...@boschrexroth.de
> www.boschrexroth.com
>
> Bosch Rexroth AG
> Partensteiner Straße 23
> 97816 Lohr am Main
> GERMANY
>
>
> Registered office: Stuttgart, court of registration: Amtsgericht Stuttgart HRB 23192
> Board of management: Rolf Najork (chairman), Dr. Markus Forschner, Dr. Heiner Lang,
> Reinhard Schäfer, Dr. Marc Wucherer
> Chairman of the supervisory board: Christoph Kübel
>
