Hi,
The fastest way to reproduce the problem described in
https://cygwin.com/pipermail/cygwin/2024-January/255267.html and
https://cygwin.com/pipermail/cygwin/2024-January/255273.html seems to be
to run `pip install ...` on a version of `pip` that uses its vendored
`rich` dependency to draw progress bars. (The hang reliably occurs at 0%
on the *second* progress bar, and `--progress-bar off` avoids it.)
Examining what `pip` is doing *may* be sufficient to investigate this.
However, I was able to make a *fairly* simple script that reliably
produces it, at least on my machine (and on GitHub Actions runners). It
seems to me that this script may give some insight. In case it's useful:

    import hashlib
    import threading
    import time

    t1 = threading.Thread(target=lambda: print("hello"))
    t2 = threading.Thread(target=lambda: print("goodbye"))
    t1.start()
    time.sleep(1)
    print("in between")
    t2.start()
    t1.join()
    t2.join()

The interesting thing here is that the `hashlib` import is required.
Even though that import is not used, the script does not trigger the
problem if it is removed.
As discussed at
https://github.com/gitpython-developers/GitPython/pull/1814, this script
is motivated by code in GitPython that produces the hang when unit tests
are run. The script hangs when attempting to execute `t2.start()`. The
effect appears specific to Python 3.9.18 on Cygwin. Running that script
with Python 3.9.16 on Cygwin, or on either Python 3.9.16 or Python
3.9.18 on either Ubuntu 22.04 LTS or macOS 13, does not produce the
problem. (I don't have native Windows builds of those versions to test
with at this time.)
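
For anyone comparing environments, the following is a minimal sketch of a check for the one combination reported above as affected. The function name `matches_reported_setup` is hypothetical, and the check encodes only the configurations tested in this message; other versions or platforms may or may not be affected.

```python
import sys


def matches_reported_setup(platform=sys.platform, version=sys.version_info[:3]):
    """Return True only for Python 3.9.18 on Cygwin (hypothetical helper).

    This reflects only the combinations tested in this report, not a
    complete list of affected setups.
    """
    # sys.platform is "cygwin" when running under Cygwin's Python.
    return platform == "cygwin" and tuple(version) == (3, 9, 18)


if __name__ == "__main__":
    print(sys.platform, sys.version_info[:3], matches_reported_setup())
```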
`t1` can be joined before `t2` is started, and the problem still
reliably occurs. With that change, the sleep can also be omitted, though
the problem then occurs only some of the time. Running the statements in
a REPL also produces the problem without requiring a sleep (presumably
the delay from typing them is sufficient). The child threads and main
thread don't have to print to produce the problem; I included the
printing to make it clearer what's going on. I have not tested
non-blocking delays.
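
The variation described above, joining `t1` before starting `t2`, might look roughly like the sketch below. The `run_variant` function and its `use_sleep` parameter are hypothetical names introduced for illustration. The threads append to a list instead of printing, on the assumption that this behaves the same way, since printing is not needed to trigger the problem. This exact form has not been verified to hang.

```python
import hashlib  # unused; in the original script the hang depends on this import
import threading
import time


def run_variant(use_sleep=True):
    """Start t1, join it, and only then start t2 (hypothetical helper)."""
    events = []
    t1 = threading.Thread(target=lambda: events.append("hello"))
    t2 = threading.Thread(target=lambda: events.append("goodbye"))
    t1.start()
    t1.join()  # t1 has fully finished before t2 is started
    if use_sleep:
        time.sleep(1)  # blocking delay, as in the original script
    events.append("in between")
    t2.start()  # on an affected setup, the hang would be expected here
    t2.join()
    return events


if __name__ == "__main__":
    print(run_variant())
```

On an unaffected setup this returns after about a second. Passing `use_sleep=False` corresponds to the case where the problem occurs only sometimes.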
I named the script `simple.py` and ran it in various ways to verify that
it triggers the problem, but I think the most important ways to report
are:

    /usr/bin/python3.9 simple.py

And:

    strace -o strace.out /usr/bin/python3.9 simple.py

By the time I killed the process in the strace run, `strace.out` had
grown to 1819328 lines, most of which were:

    --- Process 25112 (pid: 20768), exception c0000005 at 0000000000000000

(This is the same pattern Daniel Abrahamsson reported when running
`pip install` with strace.)
I made a copy of the first 6610 lines as `truncated.out`, but even that
is 828 KiB, so I've posted it here rather than attaching it:
https://gist.github.com/EliahKagan/04143302056426d72c7a617d65890dda
The last 8 lines of `truncated.out` are identical, and the original
`strace.out` continued that way.
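
To check how much of an strace log consists of such exception lines, a small script along these lines could be used. This is only a sketch: the `count_exceptions` function is a hypothetical helper, and the regular expression assumes the exception lines have the same shape as the one quoted above.

```python
import re
import sys
from collections import Counter

# Assumed line shape, based on the single line quoted in this message.
EXCEPTION_LINE = re.compile(
    r"--- Process \d+ \(pid: \d+\), exception ([0-9a-fA-F]+) at ([0-9a-fA-F]+)"
)


def count_exceptions(lines):
    """Count exception lines, keyed by (exception code, address)."""
    counts = Counter()
    for line in lines:
        match = EXCEPTION_LINE.search(line)
        if match:
            counts[(match.group(1), match.group(2))] += 1
    return counts


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python count_exceptions.py strace.out
    with open(sys.argv[1], errors="replace") as log:
        for (code, address), n in count_exceptions(log).most_common():
            print(f"exception {code} at {address}: {n} lines")
```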
(Although the strace output shows that this was run from a directory
related to GitPython, this was not done with any virtual environment
activated, nothing from GitPython was imported or otherwise used, and
neither GitPython nor its distinctive dependencies gitdb and smmap were
installed in the global environment.)
That GitHub Gist also includes `simple.py` for convenience, and
`cygcheck.out` in case that would somehow be useful.
-Eliah