Hello! My name is Britney Whittington, and I'm an intern at Quansight working with Nathan Goldbaum. This is my first time working with an open-source project like NumPy, and I'm excited to be here!
For my project, I am working on making the NumPy test suite more thread safe, to improve its support for the free-threaded build of Python 3.14. To do this, I have been using `pytest-run-parallel` [1]. You can see the README of the plugin for more details, but briefly, it runs each test in a test suite many times in a thread pool. This exercise results in failures in the test suite, often due to thread-safety issues in pytest itself, or thread-safety issues in NumPy caused by global state in the NumPy implementation. As we work on making the test suite more thread safe, we have exposed thread-safety issues in NumPy.

Part of the difficulty of using `pytest-run-parallel` is that pytest itself and the constructs it provides are not thread safe [2], so it takes some effort to make a test suite as big as NumPy's run under `pytest-run-parallel` without any failures.

We've already merged some work along these lines. You may have seen some PRs I've submitted so far related to this project [3]. We are getting close to turning on `pytest-run-parallel` in CI. Before we do that, we want to make sure there aren't any objections.

So far, I've made two major changes to the test suite:

----------------------------------------------------
1. Refactoring xunit Setup and Teardown Methods
----------------------------------------------------

#### Problem
- NumPy makes heavy use of pytest's xunit setup and teardown methods [4], which are mentioned in the testing guidelines [5].
- When using `pytest-run-parallel`, pytest will try to run the teardown before all threads have finished running a test.

#### Solutions
- Replace setup/teardown with fixtures [6]. While fixtures play more nicely with threads, they are still shared between threads. If one thread mutates a fixture, this mutation will carry over to all the other threads. Additionally, dependency injection via a fixture is fundamentally *hard to debug and understand at a glance*.
- Use **explicit setup**.
Instead of pytest calling setup methods, we modify the setup methods so that each test calls them manually. This fixes the teardown issue, and allows us to declare variables locally and not worry about mutations between threads.

Currently we are favoring explicit setup. Of course, this may not work for every xunit setup. For more complex cases, fixtures or context managers may be more useful.

-------------------------------------------
2. Refactoring Global np.random Calls
-------------------------------------------

#### Problem
- Calls to `np.random` use the same global instance. This can cause errors in tests that rely heavily on seeded results, because threads share the same global RNG state. Even where this does not cause failures, we feel that it's fundamentally bad practice for tests to rely on global state like this.

#### Solution
- Instead of `np.random`, each test should use a local instance of `np.random.RandomState`, so that threads can step through their own local RNG stream. I have made this change to the tests that fail under `pytest-run-parallel` [7], and am working on making all test calls to `np.random` local. Note that `RandomState` uses the same Mersenne Twister (MT) RNG that the global RNG uses, so the RNG streams are the same as before.

===============================

-----------------------------------------
> What Does This Mean Going Forward?
-----------------------------------------

If we refactor the test suite to be more thread safe and add `pytest-run-parallel` to CI, contributors may need to write tests in a somewhat different style.

It is possible for `pytest-run-parallel` to fix some of these issues on its side, such as making xunit setup run properly. However, with the current state of pytest and the plugin, this would require a lot of work, time, and maintenance. It may also make it more difficult to improve the thread safety of pytest itself in the future. See this issue [8] for further discussion.
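To make the two changes described above more concrete, here is a minimal sketch of what the "different style" could look like. It is not code from the NumPy test suite: `TestScalingXunit`, `TestScaling`, `_create_data`, and `run_in_threads` are made-up names for illustration, and `run_in_threads` only loosely imitates what `pytest-run-parallel` does rather than reproducing the plugin's implementation.

```python
import threading

import numpy as np


class TestScalingXunit:
    # "Before" (hypothetical): pytest calls setup_method, which seeds the
    # global RNG and stores state on self. That state may end up shared by
    # every thread running the test, so the in-place update below can race.
    def setup_method(self):
        np.random.seed(1234)
        self.data = np.random.uniform(size=(4, 3))

    def test_double(self):
        expected = self.data * 2.0
        self.data *= 2.0
        np.testing.assert_allclose(self.data, expected)


class TestScaling:
    # "After" (hypothetical): explicit setup. _create_data is a plain helper,
    # not a pytest hook; each test calls it and gets its own local objects.
    def _create_data(self):
        rng = np.random.RandomState(1234)  # local RNG instead of np.random
        data = rng.uniform(size=(4, 3))
        return data, data * 2.0

    def test_double(self):
        data, expected = self._create_data()
        data *= 2.0  # only mutates this call's local array
        np.testing.assert_allclose(data, expected)


def run_in_threads(test_func, n_threads=4):
    """Call test_func from several threads and return any exceptions raised."""
    errors = []

    def worker():
        try:
            test_func()
        except Exception as exc:  # collect failures from worker threads
            errors.append(exc)

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return errors


errors = run_in_threads(TestScaling().test_double)

# A freshly seeded RandomState draws the same stream as the seeded global
# RNG, so expected values in existing tests should not need to change.
np.random.seed(1234)
global_draw = np.random.uniform(size=3)
local_draw = np.random.RandomState(1234).uniform(size=3)
```

In the explicit-setup version nothing is stored on `self`, so every thread works on its own array and its own RNG stream. The xunit-style class is shown only for contrast and is not run in threads here.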
We think that a refactor of the NumPy test suite is more straightforward for now, and it can always be reverted once `pytest-run-parallel` develops a way to handle thread-unsafe setup fixtures.

--------------------
> Testing Guidelines
--------------------

In addition to making the tests thread safe, I'd like to update the testing guidelines [9]. Some things can be clarified (such as NumPy's opinion on fixture usage), and it would be good to update the guidelines with current best practices and, if folks are open to it, guidance on writing thread-safe tests.

[1] https://github.com/Quansight-Labs/pytest-run-parallel
[2] https://docs.pytest.org/en/stable/explanation/flaky.html#thread-safety
[3] https://github.com/numpy/numpy/issues/29552 (mentioned by all PRs related to this project)
[4] https://docs.pytest.org/en/stable/how-to/xunit_setup.html
[5] https://numpy.org/doc/stable/reference/testing.html#easier-setup-and-teardown-functions-methods
[6] https://docs.pytest.org/en/stable/explanation/fixtures.html
[7] https://github.com/numpy/numpy/pull/29729
[8] https://github.com/Quansight-Labs/pytest-run-parallel/issues/14
[9] https://numpy.org/doc/stable/reference/testing.html

_______________________________________________
NumPy-Discussion mailing list
https://mail.python.org/mailman3//lists/numpy-discussion.python.org
