[
https://issues.apache.org/jira/browse/AIRFLOW-2762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16547113#comment-16547113
]
Kevin Yang edited comment on AIRFLOW-2762 at 7/17/18 9:13 PM:
--------------------------------------------------------------
[~ashb] Thanks a lot for sharing your opinion. I think that is a good idea,
since it would also provide some consistency between the scheduler and the
webserver. To do that, though, we would need to store more of the info the
webserver needs in the DagModel, e.g. the dependencies. I am also not sure how
much extra load that would place on the DB. If we go this route, we might want
to build a DAG parsing component that parses DAGs for both the scheduler and
the webserver. Before we decide to do that, we can try parallelizing the
parsing on the webserver. That work can be reused once we have the DAG parsing
service, since the webserver will be using the serializable info of the DAG
instead of the DAG object in both cases.
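To make the idea concrete, here is a minimal sketch of what parallel parsing
could look like. It is not Airflow code: summarize_dag_file and
summarize_dag_files are hypothetical helpers, and the summary only lists
top-level assigned names as a stand-in for whatever serializable info the
webserver would actually need.

```python
import ast
from concurrent.futures import ThreadPoolExecutor


def summarize_dag_file(path):
    """Hypothetical helper: return a plain-dict summary of one file.

    Stands in for real DAG parsing. It only records the names assigned at
    module top level, so the result is trivially serializable.
    """
    with open(path, encoding="utf-8") as handle:
        tree = ast.parse(handle.read(), filename=path)
    names = sorted(
        target.id
        for node in tree.body
        if isinstance(node, ast.Assign)
        for target in node.targets
        if isinstance(target, ast.Name)
    )
    return {"path": path, "top_level_names": names}


def summarize_dag_files(paths, max_workers=4, executor_cls=ThreadPoolExecutor):
    """Summarize many files concurrently, preserving the input order."""
    with executor_cls(max_workers=max_workers) as pool:
        return list(pool.map(summarize_dag_file, paths))
```

Since real parsing is CPU-bound, passing
concurrent.futures.ProcessPoolExecutor as executor_cls is probably the better
fit in practice. The workers return plain dicts, so nothing but serializable
data crosses the process boundary.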
was (Author: yrqls21):
[~ashb] Thanks for the opinions. I think that is a good idea, since it would
also provide some consistency between the scheduler and the webserver. To do
that, though, we would need to store more of the info the webserver needs in
the DagModel, e.g. the dependencies. I am also not sure how much extra load
that would place on the DB. If we go this route, we might want to build a DAG
parsing component that parses DAGs for both the scheduler and the webserver.
Before we decide to do that, we can try parallelizing the parsing on the
webserver. That work can be reused once we have the DAG parsing service, since
the webserver will be using the serializable info of the DAG instead of the
DAG object in both cases.
> Parallelize DAG parsing in webserver
> ------------------------------------
>
> Key: AIRFLOW-2762
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2762
> Project: Apache Airflow
> Issue Type: Improvement
> Reporter: Kevin Yang
> Priority: Major
>
> Currently the webserver parses the DagBag in a single-threaded fashion, which
> makes startup slow when we have a large number of DAG files. The webserver
> should not need the actual DAG objects, and the parsing should be parallelized.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)