I've read a bit more about PEP-0420 so some more extracted information: - implicit namespace packages are very similar to regular packages. The notable difference is that they can be loaded from several directories (so you can have modules in the same package but coming from different physical directories), so no __file__ is defined for the package (it is still defined for modules),
- pytest should have no problems. It has indeed <https://stackoverflow.com/questions/50174130/how-do-i-pytest-a-project-using-pep-420-namespace-packages> problems with implicit packages when tests are co-located with their tested modules (module.py, module_test.py) but we are not in this situation. - I do not see how it could be slower - I have not found any benchmarking yet (but I have not found any complaints about it either). I looked at the loading algorithm for PEP-420 - it traverses all the directory structure available on the PYTHONPATH bo so does the "standard" mechanism. And we would not have to parse and load all the 140 licence-only __init__.py. The presence of module in the package is based on the presence of "<module>.py" in the right folder. If we would have several packages elsewhere that might become a bit slower, but I think reading files is so much slower than scanning directories that - if anything - we will be faster (just opening and closing those __init__.files will take quite some time :). We can do some benchmarking before and after to be sure. Side comment: I am not 100% sure, but after reading PEP-420 intuition tells me that maybe implicit packages can help in solving relative imports problems that some of our user raised . I saw recently several users had problems with relative imports used in DAGs. We are loading the DAGs in a specific way and in implicit packages, the __repr__ of modules parent is dynamic - based on the location of the file rather than based on the module of the file, so it's likely if we encourage dag developers also to use implicit packages, it might actually solve the relative imports case. J. On Fri, Nov 22, 2019 at 4:29 PM Daniel Imberman <[email protected]> wrote: > Fine by me if it doesn’t break anything. I’m always on the side of less > code :). > > via Newton Mail [ > https://cloudmagic.com/k/d/mailapp?ct=dx&cv=10.0.32&pv=10.14.5&source=email_footer_2 > ] > On Fri, Nov 22, 2019 at 8:12 AM, Ash Berlin-Taylor <[email protected]> wrote: > Probably worth it, but things to check: > > - if nose or pytest needs them under tests/ to find all the tests > - if not having the init files slows down pythons package importer as it > searches more paths? > > -a > > > On 22 Nov 2019, at 12:06, Jarek Potiuk <[email protected]> wrote: > > > > There is one implementation detail in AIP-21 that I continue to have > > questions about, and I wanted to ask community about it. > > > > Since we moved to Python 3, we have the option of using implicit > namespace > > packages. > > https://www.python.org/dev/peps/pep-0420/ . We have now grouping per > folder > > for services, so it would require a lot more __init__.py files if we > > continue using them. > > > > The implicit naming boils down to not requiring __init__.py files if no > > initialisation of package is needed. We could do it consistently for all > > code which is "internal" to airflow. This means that most packages will > not > > have __init__.py there will be few exceptions like 'airflow', > > 'airflow.modules' and likely few others with the __init__.py. > > > > I did a quick check and we have only 25 _init_.py with some logic - > > remaining ~ 140 could be removed as they only contain comments. > > > > find . -name '__init__.py' | xargs grep -le '^[^#].*' | wc -l > > 25 > > find . -name '__init__.py' |wc -l > > 164 > > > > Those are the files that will be left: > > ./tests/__init__.py > > ./tests/utils/log/elasticmock/__init__.py > > ./tests/utils/log/elasticmock/utilities/__init__.py > > ./tests/models/__init__.py > > ./tests/task/__init__.py > > ./airflow/sensors/__init__.py > > ./airflow/operators/__init__.py > > ./airflow/_vendor/nvd3/__init__.py > > ./airflow/_vendor/slugify/__init__.py > > ./airflow/serialization/__init__.py > > ./airflow/__init__.py > > ./airflow/models/__init__.py > > > ./airflow/www/node_modules/npm/node_modules/node-gyp/gyp/pylib/gyp/__init__.py > > ./airflow/ti_deps/deps/__init__.py > > ./airflow/macros/__init__.py > > ./airflow/executors/__init__.py > > ./airflow/lineage/__init__.py > > ./airflow/lineage/backend/__init__.py > > ./airflow/lineage/backend/atlas/__init__.py > > ./airflow/hooks/__init__.py > > ./airflow/task/task_runner/__init__.py > > ./airflow/api/__init__.py > > ./airflow/api/common/experimental/__init__.py > > ./airflow/api/client/__init__.py > > ./airflow/jobs/__init__.py > > > > WDYT? > > > > J > > > > -- > > > > Jarek Potiuk > > Polidea <https://www.polidea.com/> | Principal Software Engineer > > > > M: +48 660 796 129 <+48660796129> > > [image: Polidea] <https://www.polidea.com/> -- Jarek Potiuk Polidea <https://www.polidea.com/> | Principal Software Engineer M: +48 660 796 129 <+48660796129> [image: Polidea] <https://www.polidea.com/>
