potiuk commented on pull request #15317:
URL: https://github.com/apache/airflow/pull/15317#issuecomment-817300262


   I agree with @mik-laj we should  be careful with namespaces. The question is 
what we want to achieve with it. 
   
   Do we really want to convert the test/charts into a namespace package ? Why? 
 Is the only reason because we want to run all regular tests and chart test 
with the same pytest command (I believe we should not do it because those are 
completely different sets of tests that SHOULD be run separately and they are 
part of the "chart" 'sub-project' not airflow ( they even never import airflow 
and never should). 
   
   There are couple of reasons we should be careful with the implicit namespace 
packages.
   
   1. Not all the tools are handling implicit namespace packages properly..
   
   For example we have to dynamically add  (and remove) __init__.py now when we 
are running airflow docs: 
(https://github.com/apache/airflow/blob/master/docs/exts/provider_init_hack.py) 
and pylint 
https://github.com/apache/airflow/blob/master/scripts/in_container/run_pylint.sh#L22
  
   
   We even have a pre-commit to check if it is not a left-over so that you 
would not committ it accidentally:
   
https://github.com/apache/airflow/blob/master/scripts/ci/pre_commit/pre_commit_check_providers_init.sh
   
   2. Namespace packages should be used to separate whole "packages" not just a 
folder in a subdirectory
   
   This is where namespace packages are described 
https://packaging.python.org/guides/packaging-namespace-packages/. If you look 
at the examples, you will see why they are introduced in a first place.
   
   There is a misconception (which I also believed in for some time) that after 
you move to python3.3+, you can simply remove all the __init__.py. This is not 
true. Namespace packages are really useful when you want to separate your 
projects into smaller "separately installable" and versioned packages. Very 
similar to what we do with providers. With the current providers - at least for 
now - we chose a bit different path where we have a bit hybrid solution (we 
have namespaces but they are not "complete" separate packages on their own. we 
might eventually converge into full namespace-compliant way though. That will 
require a few more iterations, moving the setup dependencies, docs inside each 
package etc.  - we are not very far from that, but there is a number of 
non-trivial changes to implement to make it happen (likely something we might 
consider for 2.1).  
   
   The namespace packages are not really good for separating a single tests 
subdirectory. Why?
   
   3. Consequences of having separate namespaces. 
   
   There are a number of implications of not having __init__.py (i.e. having an 
implicit package) - like where they can be installed, how long it takes to 
search for the packages when you import them (much longer), but also what does 
it mean for the test code. I had the case when you place the tests in a 
different namespace and for example you wanted to check if two classes were 
objects of the same clas/inheritnce - they were different because they were 
imported in  a different namespace. Those are non-obvious problems that might 
be alien to an average python developer.
   
   If you look at PEP420 
https://www.python.org/dev/peps/pep-0420/#specification  the regular packages 
are still supposed to contain `__init__.py`:
   
   ```
   Regular packages will continue to have an __init__.py and will reside in a 
single directory.
   ```
   And 
   ``
   Namespace packages cannot contain an __init__.py. 
   ```
   so i think converting "chart/tests" into a namespace package is not a good 
idea.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to