potiuk commented on a change in pull request #10303:
URL: https://github.com/apache/airflow/pull/10303#discussion_r476264200



##########
File path: docs/modules_management.rst
##########
@@ -0,0 +1,194 @@
+ .. Licensed to the Apache Software Foundation (ASF) under one
+    or more contributor license agreements.  See the NOTICE file
+    distributed with this work for additional information
+    regarding copyright ownership.  The ASF licenses this file
+    to you under the Apache License, Version 2.0 (the
+    "License"); you may not use this file except in compliance
+    with the License.  You may obtain a copy of the License at
+
+ ..   http://www.apache.org/licenses/LICENSE-2.0
+
+ .. Unless required by applicable law or agreed to in writing,
+    software distributed under the License is distributed on an
+    "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+    KIND, either express or implied.  See the License for the
+    specific language governing permissions and limitations
+    under the License.
+
+
+
+Modules Management
+==================
+
+Airflow allows you to use your own Python modules in the DAG and in the 
Airflow configuration. The following article
+will describe how you can create your own module so that Airflow can load it 
correctly, as well as diagnose problems
+when modules are not loaded properly.
+
+This article is the last one for you if you need to adapt Airflow to the needs 
of your organization.
+
+Packages Loading in Python
+--------------------------
+
+The list of directories from which Python tries to load a module is given by the variable :any:`sys.path`. Python
+tries to `intelligently determine the contents <https://stackoverflow.com/a/38403654>`_ of this variable,
+depending on the operating system and how Python is installed.
+
+You can check the contents of this variable for the current Python environment 
by running an interactive terminal as in
+the example below:
+
+.. code-block:: pycon
+
+    >>> import sys
+    >>> from pprint import pprint
+    >>> pprint(sys.path)
+    ['',
+     '/home/arch/.pyenv/versions/3.7.4/lib/python37.zip',
+     '/home/arch/.pyenv/versions/3.7.4/lib/python3.7',
+     '/home/arch/.pyenv/versions/3.7.4/lib/python3.7/lib-dynload',
+     '/home/arch/venvs/airflow/lib/python3.7/site-packages']
+
+``sys.path`` is initialized during program startup. The first entry, ``path[0]``, is the directory containing the
+script that was used to invoke the interpreter, or an empty string if Python was started as an interactive shell.
+Second precedence is given to ``PYTHONPATH`` if provided, followed by installation-dependent
+default paths, which are managed by the `site <https://docs.python.org/3/library/site.html#module-site>`_ module.
+
+``sys.path`` can also be modified during a Python session by appending to it
+(for example, ``sys.path.append("/path/to/custom/package")``). Python will start searching for packages in the new
+paths once they're added. Airflow makes use of this feature as described in the following sections.
+
+``sys.path`` includes a ``site-packages`` directory, which contains the installed **external packages**.
+This means you can install packages with ``pip`` or ``conda`` and use them in Airflow. In the next section,
+you will learn how to create your own simple installable package and how to specify additional directories to be added
+to ``sys.path`` using the environment variable :envvar:`PYTHONPATH`.
+

Review comment:
       I think it would be worth adding a section about .pth files 
(https://docs.python.org/3/library/site.html#module-site). I often find them 
invaluable (especially in production installations) for modularizing access to 
different parts of the code. Big organisations often have a lot of independent 
modules and components that are not installed as "pip" packages (for 
various reasons: compilation needs, the necessity to use code from sources, etc.), 
and in those cases listing the paths to search in .pth files is a really nice way of 
modularising such access. You then just need to drop the .pth file into one of 
the site directories (for example site-packages). A .pth file also has the nice 
property that it can contain executable lines (those starting with "import") 
that are run at every Python interpreter start. It is also used 
by big packages that need to be installed from sources (for example, ROS uses .pth 
files extensively: http://wiki.ros.org/rospy).
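
A minimal sketch of the mechanism, in case it helps for the docs. The names `company_src`, `company_utils` and `company.pth` are hypothetical, and the example calls `site.addsitedir` on a temporary directory instead of touching a real site-packages, so it only approximates what the interpreter does at startup:

```python
import importlib
import site
import sys
import tempfile
from pathlib import Path


def load_via_pth():
    """Create a throwaway site directory with a .pth file and import through it."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp).resolve()
        site_dir = root / "site_dir"      # stands in for site-packages
        src_dir = root / "company_src"    # hypothetical source checkout
        site_dir.mkdir()
        src_dir.mkdir()
        (src_dir / "company_utils.py").write_text("ANSWER = 42\n")

        # Each non-comment line of a .pth file names a directory to add to sys.path.
        (site_dir / "company.pth").write_text(f"{src_dir}\n")

        before = list(sys.path)
        site.addsitedir(str(site_dir))    # processes the *.pth files in site_dir
        try:
            importlib.invalidate_caches()
            module = importlib.import_module("company_utils")
            return module.ANSWER, str(src_dir) in sys.path
        finally:
            # Undo the changes made for the demonstration.
            sys.path[:] = before
            sys.modules.pop("company_utils", None)


if __name__ == "__main__":
    print(load_via_pth())
```

In a real installation the .pth file would simply live in site-packages and be picked up automatically, with no call to `site.addsitedir` needed.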




