mik-laj commented on a change in pull request #10303:
URL: https://github.com/apache/airflow/pull/10303#discussion_r475822497
##########
File path: docs/modules_management.rst
##########
@@ -0,0 +1,194 @@
+ .. Licensed to the Apache Software Foundation (ASF) under one
+    or more contributor license agreements. See the NOTICE file
+    distributed with this work for additional information
+    regarding copyright ownership. The ASF licenses this file
+    to you under the Apache License, Version 2.0 (the
+    "License"); you may not use this file except in compliance
+    with the License. You may obtain a copy of the License at
+
+ .. http://www.apache.org/licenses/LICENSE-2.0
+
+ .. Unless required by applicable law or agreed to in writing,
+    software distributed under the License is distributed on an
+    "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+    KIND, either express or implied. See the License for the
+    specific language governing permissions and limitations
+    under the License.
+
+
+
+Modules Management
+==================
+
+Airflow allows you to use your own Python modules in the DAG and in the Airflow configuration. The following article
+will describe how you can create your own module so that Airflow can load it correctly, as well as diagnose problems
+when modules are not loaded properly.
+
+This article is the last one for you if you need to adapt Airflow to the needs of your organization.
+
+Packages Loading in Python
+--------------------------
+
+The list of directories from which Python tries to load the module is given by the variable :any:`sys.path`. Python
+really tries to `intelligently determine the contents of <https://stackoverflow.com/a/38403654>`_ of this variable,
+including depending on the operating system and how Python is installed.
+
+You can check the contents of this variable for the current Python environment by running an interactive terminal as in
+the example below:
+
+.. code-block:: pycon
+
+    >>> import sys
+    >>> from pprint import pprint
+    >>> pprint(sys.path)
+    ['',
+     '/home/arch/.pyenv/versions/3.7.4/lib/python37.zip',
+     '/home/arch/.pyenv/versions/3.7.4/lib/python3.7',
+     '/home/arch/.pyenv/versions/3.7.4/lib/python3.7/lib-dynload',
+     '/home/arch/venvs/airflow/lib/python3.7/site-packages']
+
+``sys.path`` is initialized during program startup. The first precedence is given to the current directory,
+i.e, ``path[0]`` is the directory containing the current script that was used to invoke or an empty string in case
+it was an interactive shell. Second precedence is given to the ``PYTHONPATH`` if provided, followed by installation-dependent
+default paths which is managed by `site <https://docs.python.org/3/library/site.html#module-site>`_ module.
+
+``sys.path`` can also be modified during a Python session by simply using append
+(for example, ``sys.path.append("/path/to/custom/package")``). Python will start searching for packages in the newer
+paths once they're added. Airflow makes use of this feature as described in the further sections.

Review comment:
> in the further sections

Can you add a link to section?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]
