For now, we should run the Py3 and Py2 tests alongside each other to get a side-by-side comparison. I suggest we open a Jira ticket to investigate the difference in performance. We have limited performance test coverage on Python 3 in Beam, so more Py3 tests would help a lot here. Thanks for adding them.

On Fri, Dec 6, 2019 at 9:43 AM Robert Bradshaw <rober...@google.com> wrote:
> This is very surprising--I would expect the times to be quite similar. Do
> you have profiles for where the (difference in) time is spent? With
> differences like these, I wonder if there are issues with container
> setup (e.g. some things not being installed or cached) for Python 3.
>
> On Fri, Dec 6, 2019 at 9:06 AM Kamil Wasilewski
> <kamil.wasilew...@polidea.com> wrote:
> >
> > Hi all,
> >
> > Python 2.7 won't be maintained past 2020 and that's why we want to
> > migrate all Python performance tests in Beam from Python 2.7 to Python 3.7.
> > However, I was surprised by seeing that after switching Dataflow tests to
> > Python 3.x they are a few times slower. For example, the same ParDo test
> > that takes approx. 8 minutes to run on Python 2.7 needs approx. 21 minutes
> > on Python 3.x. You can find all the results I gathered and the setup here.
> >
> > Do you know any possible reason for this? This issue makes it impossible
> > to do the migration, because of the limited resources on Jenkins (almost
> > every job would be aborted).
> >
> > Thanks,
> > Kamil