Hello! My name is Britney Whittington, and I'm an intern at Quansight working 
with Nathan Goldbaum. This is my first time working with an open-source project 
like NumPy, and I'm excited to be here!

For my project, I am working on making the NumPy test suite more thread safe, 
to improve its support of Python's free-threaded build of Python 3.14. To do 
this, I have been using `pytest-run-parallel` [1]. You can see the README of 
the plugin for more details, but briefly it runs each test in a test suite many 
times in a thread pool. This exercise results in failures in the test suite, 
often due to thread-safety issues in pytest itself, or thread-safety issues 
with NumPy due to use of global state in the NumPy implementation. As we work 
on making the test suite more thread safe, we have exposed thread safety issues 
in NumPy.

Party of the difficulty of using `pytest-run-parallel` is that Pytest itself 
and the constructs provided by it are not thread safe [2], so it takes some 
effort to make a test suite as big as NumPy's run under `pytest-run-parallel` 
without any failures.

We've already merged some work along these lines. You may have seen some PRs 
I've submitted so far related to this project [3]. We are getting close to 
turning on `pytest-run-parallel` in CI. Before we do that, we want to make sure 
there aren't any objections.

So far, I've made two major changes to the test suite:

----------------------------------------------------
1. Refactoring xunit Setup and Teardown Methods
----------------------------------------------------
#### Problem
- NumPy makes heavy use of pytest's xunit setup and teardown methods [4], which 
are mentioned in the testing guidelines [5].
- When using `pytest-run-parallel`, pytest will try to run the teardown before 
all threads complete running a test.

#### Solutions
- Replace setup/teardown with fixtures. While fixtures play more nicely with 
threads, they are still shared between threads. If one thread mutates a 
fixture, this mutation will carry over to all the other threads. Additionally, 
dependency injection via a fixture is fundamentally *hard to debug and 
understand at a glance*.
- Use **explicit setup**. Instead of pytest calling setup methods, we modify 
the setup methods to be manually called in each test. This fixes the teardown 
issue, and allows us to declare variables locally and not worry about mutations 
between threads.

Currently we are favoring the usage of explicit setup. Of course, this may not 
work for every xunit setup. For more complex cases, fixtures may be more 
useful, or context managers.

-------------------------------------------
2. Refactoring Global np.random Calls
-------------------------------------------
#### Problem
- Calls to `np.random` use the same global instance. This results in errors 
with tests that heavily rely on seeded results, due to threads sharing the same 
global RNG state. While this may not necessarily cause failures, we also feel 
that it's a fundamentally bad practice for tests to rely on global state like 
this.

#### Solution
-  Instead of `np.random`, each test should use a local instance of 
`np.random.RandomState`, so that threads can increment through their own local 
RNG stream.

I have made this change to the tests that fail under `pytest-run-parallel` [7], 
and am working on making all test calls to `np.random` local.
Note that `RandomState` uses the same MT RNG that the global RNG uses, so the 
RNG streams are the same as before.

===============================

-----------------------------------------
> What Does This Mean Going Forward?
-----------------------------------------
By refactoring the test suite to be more thread safe, if we'd like to add 
`pytest-run-parallel` CI, contributors may need to write tests in a somewhat 
different style.

It is possible for `pytest-run-parallel` to fix some of these issues on its 
side, such as making it so xunit setup runs properly. However, with the current 
state of pytest and the plugin, this will require a lot of work, time, and 
maintenance. It also may make it more difficult to improve the thread safety of 
pytest itself in the future. See this issue [8] for further discussion. We 
think that a refactor of the NumPy test suite is more straightforward for now, 
and can always be reverted once `pytest-run-parallel` develops a way to handle 
thread-unsafe setup fixtures.

--------------------
> Testing Guidelines
--------------------
In addition to make the tests thread safe, I'd like to update the testing 
guidelines [9]. Some things can be clarified (such as NumPy's opinion on 
fixture usage) and it would be good to update it with current best practices, 
and, if folks are open to it, guidelines on writing thread-safe tests.

[1] https://github.com/Quansight-Labs/pytest-run-parallel
[2] https://docs.pytest.org/en/stable/explanation/flaky.html#thread-safety
[3] https://github.com/numpy/numpy/issues/29552 (mentioned by all PRs related 
to this project)
[4] https://docs.pytest.org/en/stable/how-to/xunit_setup.html
[5] 
https://numpy.org/doc/stable/reference/testing.html#easier-setup-and-teardown-functions-methods
[6] https://docs.pytest.org/en/stable/explanation/fixtures.html
[7] https://github.com/numpy/numpy/pull/29729
[8] https://github.com/Quansight-Labs/pytest-run-parallel/issues/14
[9] https://numpy.org/doc/stable/reference/testing.html
_______________________________________________
NumPy-Discussion mailing list -- [email protected]
To unsubscribe send an email to [email protected]
https://mail.python.org/mailman3//lists/numpy-discussion.python.org
Member address: [email protected]

Reply via email to