wolfier commented on a change in pull request #15183:
URL: https://github.com/apache/airflow/pull/15183#discussion_r617705667



##########
File path: docs/apache-airflow/faq.rst
##########
@@ -159,72 +210,206 @@ simple dictionary.
         other_dag_id = f'bar_{i}'
         globals()[other_dag_id] = create_dag(other_dag_id)
 
-What are all the ``airflow tasks run`` commands in my process list?
--------------------------------------------------------------------
+Even though Airflow supports multiple DAG definitions per Python file, dynamically generated or otherwise, this is
+not recommended: Airflow prefers better isolation between DAGs from a fault and deployment perspective, and
+multiple DAGs in the same file work against that.
 
-There are many layers of ``airflow tasks run`` commands, meaning it can call 
itself.
 
-- Basic ``airflow tasks run``: fires up an executor, and tell it to run an
-  ``airflow tasks run --local`` command. If using Celery, this means it puts a
-  command in the queue for it to run remotely on the worker. If using
-  LocalExecutor, that translates into running it in a subprocess pool.
-- Local ``airflow tasks run --local``: starts an ``airflow tasks run --raw``
-  command (described below) as a subprocess and is in charge of
-  emitting heartbeats, listening for external kill signals
-  and ensures some cleanup takes place if the subprocess fails.
-- Raw ``airflow tasks run --raw`` runs the actual operator's execute method and
-  performs the actual work.
+Is top-level Python code allowed?
+---------------------------------
 
+While it is not recommended to write code beyond defining Airflow constructs, Airflow does support arbitrary
+Python code, as long as it does not break the DAG file processor or prolong file processing past the
+:ref:`config:core__dagbag_import_timeout` value.
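
The parse-time cost described above can be sketched without Airflow at all. In this minimal stdlib-only sketch, ``expensive_query`` is a hypothetical stand-in for a database call; the point is that work placed at module top level runs on every parse of the DAG file, while work deferred into a callable runs only when a task executes:

```python
# Hypothetical sketch (no Airflow imports): why expensive work belongs
# inside a task callable rather than at module top level.
import time


def expensive_query():
    # Stand-in for a slow external call, e.g. a database query.
    time.sleep(0.01)
    return ["table_a", "table_b"]


# Anti-pattern: this line would run on EVERY parse of the DAG file,
# counting against the DAG file processor's import timeout.
# tables = expensive_query()

# Better: defer the work until the task actually executes.
def build_task_callable():
    def _run():
        # Executed only at task run time, not at parse time.
        return expensive_query()

    return _run


run = build_task_callable()
print(run())  # ['table_a', 'table_b']
```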
 
-How can my airflow dag run faster?
-----------------------------------
+A common example is violating this time limit when building a dynamic DAG, which usually requires querying data
+from another service such as a database. Meanwhile, that service is swamped with requests from DAG file
+processors fetching the data needed to process each file. These unintended interactions may degrade the service
+and eventually cause DAG file processing to fail.
 
-There are a few variables we can control to improve airflow dag performance:
+Refer to :ref:`DAG writing best practices<best_practice:writing_a_dag>` for 
more information.
 
-- ``parallelism``: This variable controls the number of task instances that 
runs simultaneously across the whole Airflow cluster. User could increase the 
``parallelism`` variable in the ``airflow.cfg``.
-- ``concurrency``: The Airflow scheduler will run no more than ``concurrency`` 
task instances for your DAG at any given time. Concurrency is defined in your 
Airflow DAG. If you do not set the concurrency on your DAG, the scheduler will 
use the default value from the ``dag_concurrency`` entry in your 
``airflow.cfg``.
-- ``task_concurrency``: This variable controls the number of concurrent 
running task instances across ``dag_runs`` per task.
-- ``max_active_runs``: the Airflow scheduler will run no more than 
``max_active_runs`` DagRuns of your DAG at a given time. If you do not set the 
``max_active_runs`` in your DAG, the scheduler will use the default value from 
the ``max_active_runs_per_dag`` entry in your ``airflow.cfg``.
-- ``pool``: This variable controls the number of concurrent running task 
instances assigned to the pool.
 
-How can we reduce the airflow UI page load time?
-------------------------------------------------
+Do Macros resolve in another Jinja template?
+--------------------------------------------
 
-If your dag takes long time to load, you could reduce the value of 
``default_dag_run_display_number`` configuration in ``airflow.cfg`` to a 
smaller value. This configurable controls the number of dag run to show in UI 
with default value 25.
+It is not possible to render :ref:`Macros<macros>` or any Jinja template 
within another Jinja template. This is
+commonly attempted in ``user_defined_macros``.
 
+.. code-block:: python
 
-How to fix Exception: Global variable explicit_defaults_for_timestamp needs to 
be on (1)?
------------------------------------------------------------------------------------------
+        dag = DAG(
+            ...,
+            user_defined_macros={
+                'my_custom_macro': 'day={{ ds }}'
+            }
+        )
 
-This means ``explicit_defaults_for_timestamp`` is disabled in your mysql 
server and you need to enable it by:
+        bo = BashOperator(
+            task_id='my_task',
+            bash_command="echo {{ my_custom_macro }}",
+            dag=dag
+        )
 
-#. Set ``explicit_defaults_for_timestamp = 1`` under the ``mysqld`` section in 
your ``my.cnf`` file.
-#. Restart the Mysql server.
+This will echo "day={{ ds }}" instead of "day=2020-01-01" for a DAG run with the execution date 2020-01-01 00:00:00.
 
+.. code-block:: python
 
-How to reduce airflow dag scheduling latency in production?
------------------------------------------------------------
+        bo = BashOperator(
+            task_id='my_task',
+            bash_command="echo day={{ ds }}",
+            dag=dag
+        )
+
+By using the ``ds`` macro directly in the templated field, the rendered value results in "day=2020-01-01".
 
-Airflow 2 has low DAG scheduling latency out of the box (particularly when 
compared with Airflow 1.10.x),
-however if you need more throughput you can :ref:`start multiple 
schedulers<scheduler:ha>`.
 
-Why next_ds or prev_ds might not contain expected values?
----------------------------------------------------------
+Why might ``next_ds`` or ``prev_ds`` not contain the expected values?
+----------------------------------------------------------------------
 
 - When scheduling a DAG, ``next_ds``, ``next_ds_nodash``, ``prev_ds`` and ``prev_ds_nodash`` are calculated from
   ``execution_date`` and ``schedule_interval``. If you set ``schedule_interval`` to ``None`` or ``@once``,
   the ``next_ds``, ``next_ds_nodash``, ``prev_ds`` and ``prev_ds_nodash`` values will be ``None``.
 - When manually triggering a DAG, the schedule is ignored and ``prev_ds == next_ds == ds``.
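
The relationship in the first bullet can be sketched with a simplified helper. This is an illustrative stdlib-only approximation, not Airflow's code: it assumes ``schedule_interval`` is either a ``timedelta`` or ``None``, whereas Airflow also accepts cron strings and presets such as ``@once``:

```python
# Illustrative sketch (stdlib only): how prev_ds / next_ds relate to
# execution_date and schedule_interval for a simple timedelta schedule.
from datetime import datetime, timedelta


def neighbor_ds(execution_date, schedule_interval):
    # Assumed simplification: schedule_interval is a timedelta or None.
    if schedule_interval is None:
        # No schedule: prev_ds and next_ds are None.
        return None, None
    prev_ds = (execution_date - schedule_interval).strftime("%Y-%m-%d")
    next_ds = (execution_date + schedule_interval).strftime("%Y-%m-%d")
    return prev_ds, next_ds


print(neighbor_ds(datetime(2020, 1, 1), timedelta(days=1)))
# ('2019-12-31', '2020-01-02')
print(neighbor_ds(datetime(2020, 1, 1), None))  # (None, None)
```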
 
+
+Task execution interactions
+^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Why does a sensor never complete?
+---------------------------------
+
+When waiting on another task's result, it is recommended to raise the priority of the tasks being waited on.
+Otherwise, scheduling can deadlock, with the waiting task taking up all available slots, leaving no slots for the

Review comment:
       I will remove this faq for now.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

