Thank you for spending time on this to clarify it for all of us! Much appreciated.
On Sun, Nov 10, 2019 at 3:45 PM Chad Dombrova <chad...@gmail.com> wrote: > Hi all, > > >> The sdist step creates a package that should be installed into each >> tox environment. If the tox environment has cython when this apache >> beam package is installed, it should be used. Nose (or whatever) >> should then run the tests. >> > I spent some time this weekend trying to understand the Beam python build > process, and here’s an overview of what I’ve learned: > > - the :sdks:python:sdist gradle task creates the source tarball (no > surprises there) > - the protobuf stubs are generated during this process > - the sdist is provided to tox, which installs it into the the > virtualenv for that task > - for *-cython tasks, tox installs the cython dep and, as Ahmet > asserted, python setup.py nosetests performs the cythonization. > - this cythonized build overrides the one installed by tox > > Here’s what I learned about the current status of tests wrt cython: > > - cython tox tasks *are* using cython (good!) > - non-cython tox tasks *are not* using cython (good!) > - none of the GCP or integration tests are using cython (bad?) > > This is intentional with the recent change to drop base only tests. Otherwise we would not have coverage for non-cythonized tests. > > - This is because the build is only cythonized when python setup.py > nosetests is used in conjunction with tox (tox installs cython, python > setup.py nosetests compiles it). > - GCP tests don't install cython. ITs don't use tox. > > To confirm my understanding of this, I created a PR [1] to assert that a > cythonized or pure-python build is being used. A cythonized build is > expected by default on linux systems unless a special flag is provided to > inform the test otherwise. It appears as though the portable tests passed > (i.e. used cython), but I forgot to add the assertion for those; like the > other ITs they are not using cython. > > *Questions:* > > - Is the lack of cython for ITs expected and/or desired? > - Why aren't ITs using tox? It's quite possible to pass arguments > into tox to control it's behavior. For example, it seems reasonable that > run_integration_test.sh could be inside tox > > For ITs the primary execution will happen on a remote system. As long as those remote workers have cython installed the tests will run in a cythonized environment. This is true for any IT tests, that runs in a container based on the Dockerfile we have in beam (which installs cython), and also true for legacy Dataflow environments that are not yet using this Beam provided Dockerfile. > > *Next Steps:*There has been some movement in the python community to > solve problems around build dependencies [2] and toolchains [3]. I hope to > have a proposal for how to simplify this process soon. > Thank you! [1] https://github.com/apache/beam/pull/10058 > [2] https://www.python.org/dev/peps/pep-0517/ > [3] https://www.python.org/dev/peps/pep-0518/ > > -chad >