The recent changes in Cygwin's pipe logic are a gift that keeps on
giving.

The recurring pattern is that bugs are fixed, but with many of these bug
fixes, new bugs are produced, and we are stuck in this seemingly endless
tunnel of fixing bugs caused by bug fixes.

In cbfaeba4f7 (Cygwin: pipe: Fix incorrect write length in raw_write(),
2024-11-06), for example, a segmentation fault was fixed. Which is good!
However, a bug was introduced, where e.g. Git for Windows' `git clone`
via SSH frequently hangs indefinitely for large-ish repositories, see
https://github.com/git-for-windows/git/issues/5688.

That commit removed logic whereby in non-blocking mode, not only the
chunk size, but also the overall count of bytes to write (`len`) was
clamped to whatever is indicated as the `WriteQuotaAvailable`. Now only
`chunk` is clamped to that, but `len` is left at its original number.
However, the following `while` loop expects `len - chunk` (which is
assigned to `len1`) not to be positive in non-blocking mode.

Let's reinstate that clamping logic and implicitly fix this SSH hang.

Fixes: cbfaeba4f7 (Cygwin: pipe: Fix incorrect write length in raw_write(), 
2024-11-06)
Signed-off-by: Johannes Schindelin <[email protected]>
---
Published-As: 
https://github.com/dscho/msys2-runtime/releases/tag/fix-ssh-hangs-reloaded-v1
Fetch-It-Via: git fetch https://github.com/dscho/msys2-runtime 
fix-ssh-hangs-reloaded-v1

        This commit message will like any love you can give it. For
        example, I do not _quite_ understand why the `while` loop skips
        large chunks of code unless `len1 > 0`, and what the exact idea
        was behind having that `while` loop even for non-blocking mode.
        Could anyone help me understand that `raw_write()` method? Is
        there any good reason why the non-blocking mode runs into a
        `while` loop? Is it supposed to be run only once in non-blocking
        mode, is _that_ the big secret that allows the code to be shared
        between blocking and non-blocking mode? If so, wouldn't it be much
        better to refactor out that logic and then have non-blocking mode
        take a short-cut, for clarity's sake and peace of readers' mind?

        What I am quite confident is that this works around the problems.

        I would have put more work into the commit message, if it weren't
        for two counter-acting points:

        1. This seems to be a pretty bad regression by which many Git for
           Windows users are affected. So I do feel quite the pressure to
           get a fix out to those users.

        2. Despite my pleas, the commit messages in the pipe-related
           changes keep having too many gaps, still leave way too much
           unclear for me to make any sense of them, and I have to admit
           that I do not want to be the only person in that space to put
           in a large effort to write stellar commit messages. Therefore I
           left this here commit message in a state I consider "good
           enough", even if I am more than willing to improve it should
           someone enlighten me as to the questions I raised above.

 winsup/cygwin/fhandler/pipe.cc | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/winsup/cygwin/fhandler/pipe.cc b/winsup/cygwin/fhandler/pipe.cc
index e35d523bbc..13af7f2ae1 100644
--- a/winsup/cygwin/fhandler/pipe.cc
+++ b/winsup/cygwin/fhandler/pipe.cc
@@ -542,6 +542,8 @@ fhandler_pipe_fifo::raw_write (const void *ptr, size_t len)
 
   if (len <= (size_t) avail)
     chunk = len;
+  else if (is_nonblocking ())
+    chunk = len = avail;
   else
     chunk = avail;
 

base-commit: 1186791e9f404644832023b8fa801227c2995ab7
-- 
2.50.0.windows.1

Reply via email to