Gregory P. Smith added the comment:

I suspect the flakiness is due to parallel test execution.  Could some other 
test (test_compileall?) be running at the same time and removing __pycache__ 
directories or .pyc files in order to recreate them?  If the test were 
adjusted to point to a .py file of its own, generated in a temporary 
directory, that would avoid the problem.
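
A rough sketch of that idea, not the actual test: the helper name, the
module name sample_mod.py, and its contents are all made up for illustration.
It writes a throwaway .py file into a fresh temporary directory and
byte-compiles it there, so nothing it touches is shared with tests running in
parallel.

```python
import os
import py_compile
import tempfile


def compile_private_module(source="x = 1\n"):
    """Byte-compile a throwaway module inside its own temporary directory.

    Hypothetical helper: returns (pyc_exists, pyc_is_inside_tmpdir).
    """
    with tempfile.TemporaryDirectory() as tmpdir:
        # The .py file belongs to this test alone; no other test knows the path.
        py_path = os.path.join(tmpdir, "sample_mod.py")
        with open(py_path, "w", encoding="utf-8") as f:
            f.write(source)

        # Pin the output path explicitly so the .pyc also stays in tmpdir,
        # rather than in a __pycache__ location another test might clean up.
        pyc_path = py_compile.compile(
            py_path,
            cfile=os.path.join(tmpdir, "sample_mod.pyc"),
            doraise=True,
        )

        exists = os.path.exists(pyc_path)
        inside = os.path.dirname(os.path.realpath(pyc_path)) == os.path.realpath(tmpdir)
        return exists, inside
```

The temporary directory is removed when the with block exits, so the test
leaves nothing behind for other tests to trip over either.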


