Re: [PATCH] wireproto: explicitly flush stdio to prevent stalls on Windows
On Sun, 11 Mar 2018 10:10:21 -0700, Gregory Szorc wrote: > On Sun, Mar 11, 2018 at 7:52 AM, Yuya Nishihara wrote: > > > On Sun, 11 Mar 2018 08:55:41 -0400, Matt Harbison wrote: > > > > > > > On Mar 11, 2018, at 5:43 AM, Yuya Nishihara wrote: > > > > > > > >> On Sun, 11 Mar 2018 00:04:45 -0500, Matt Harbison wrote: > > > >> # HG changeset patch > > > >> # User Matt Harbison > > > >> # Date 1520744281 18000 > > > >> # Sat Mar 10 23:58:01 2018 -0500 > > > >> # Node ID 1145671119a40e82932f63f83ad4049727416174 > > > >> # Parent 963b4223d14fa419df2a82fbe47cd55075707b6a > > > >> wireproto: explicitly flush stdio to prevent stalls on Windows > > > >> > > > >> This is the key to fixing the hangs on Windows in D2720[1]. I put > > flushes in a > > > >> bunch of other places that didn't help, but I suspect that's more a > > lack of test > > > >> coverage than anything else. > > > >> > > > >> Chasing down stuff like this is pretty painful. I'm wondering if we > > can put a > > > >> proxy around sys.stderr (and sys.stdout?) on Windows (only when > > daemonized?) > > > >> that will flush on every write (or at least every write with a '\n'). > > > >> > > > >> [1] https://www.mercurial-scm.org/pipermail/mercurial-devel/ > > 2018-March/113352.html > > > >> > > > >> diff --git a/mercurial/debugcommands.py b/mercurial/debugcommands.py > > > >> --- a/mercurial/debugcommands.py > > > >> +++ b/mercurial/debugcommands.py > > > >> @@ -2769,6 +2769,8 @@ def debugwireproto(ui, repo, **opts): > > > >> > > > >> # Now perform actions based on the parsed wire language > > instructions. > > > >> for action, lines in blocks: > > > >> +util.stdout.flush() > > > > > > > > This one is suspicious. Can you clarify which data is flushed? I think > > > > fileobjectobserver should flush log per method call. > > > > > > Yeah, this made no sense to me. I would have expected that it needed to > > be flushed on the server side. But, here’s where it stalls without this: > > > > > > https://www.mercurial-scm.org/pipermail/mercurial-devel/ > > 2018-March/113388.html > > > > Ok, if the server stalls, that might be because of the --noreadstderr > > option. > > The client must read stderr so the server can write into the pipe. > > > > That's probably it. We're using unbuffered pipes to the process running the > SSH protocol server. So we could very easily get into a deadlock situation > if the client isn't reading from a pipe when the other end writes to it. > > Maybe the correct thing to do is have the client always read from stderr > but not print the data unless explicitly instructed to. That would make the > code a bit more complicated. But I'm not sure how else to get deterministic > results here, since the very nature of polling two pipes is > non-deterministic. The only way to be deterministic is to perform blocking > reads for deterministic amounts - either readline() or read(n). Maybe we can redirect server's stderr to /dev/null if autoreadstderr is off. I don't know that's a correct thing, though. ___ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
Re: [PATCH] wireproto: explicitly flush stdio to prevent stalls on Windows
On Sun, 11 Mar 2018 13:10:21 -0400, Gregory Szorc wrote: On Sun, Mar 11, 2018 at 7:52 AM, Yuya Nishihara wrote: On Sun, 11 Mar 2018 08:55:41 -0400, Matt Harbison wrote: > > > On Mar 11, 2018, at 5:43 AM, Yuya Nishihara wrote: > > > >> On Sun, 11 Mar 2018 00:04:45 -0500, Matt Harbison wrote: > >> # HG changeset patch > >> # User Matt Harbison > >> # Date 1520744281 18000 > >> # Sat Mar 10 23:58:01 2018 -0500 > >> # Node ID 1145671119a40e82932f63f83ad4049727416174 > >> # Parent 963b4223d14fa419df2a82fbe47cd55075707b6a > >> wireproto: explicitly flush stdio to prevent stalls on Windows > >> > >> This is the key to fixing the hangs on Windows in D2720[1]. I put flushes in a > >> bunch of other places that didn't help, but I suspect that's more a lack of test > >> coverage than anything else. > >> > >> Chasing down stuff like this is pretty painful. I'm wondering if we can put a > >> proxy around sys.stderr (and sys.stdout?) on Windows (only when daemonized?) > >> that will flush on every write (or at least every write with a '\n'). > >> > >> [1] https://www.mercurial-scm.org/pipermail/mercurial-devel/ 2018-March/113352.html > >> > >> diff --git a/mercurial/debugcommands.py b/mercurial/debugcommands.py > >> --- a/mercurial/debugcommands.py > >> +++ b/mercurial/debugcommands.py > >> @@ -2769,6 +2769,8 @@ def debugwireproto(ui, repo, **opts): > >> > >> # Now perform actions based on the parsed wire language instructions. > >> for action, lines in blocks: > >> +util.stdout.flush() > > > > This one is suspicious. Can you clarify which data is flushed? I think > > fileobjectobserver should flush log per method call. > > Yeah, this made no sense to me. I would have expected that it needed to be flushed on the server side. But, here’s where it stalls without this: > > https://www.mercurial-scm.org/pipermail/mercurial-devel/ 2018-March/113388.html Ok, if the server stalls, that might be because of the --noreadstderr option. The client must read stderr so the server can write into the pipe. That's probably it. We're using unbuffered pipes to the process running the SSH protocol server. So we could very easily get into a deadlock situation if the client isn't reading from a pipe when the other end writes to it. The other issue I think is that there's no line buffering on Windows- it's full buffered. The missing data that caused the hangs was actually missing before D2720 too. But the deterministic read and hang made it harder to ignore. :) Maybe the correct thing to do is have the client always read from stderr but not print the data unless explicitly instructed to. That would make the code a bit more complicated. But I'm not sure how else to get deterministic results here, since the very nature of polling two pipes is non-deterministic. The only way to be deterministic is to perform blocking reads for deterministic amounts - either readline() or read(n). I know that (at least on Windows), writing to stdout/stderr can hang if the other side isn't reading the buffer, so it's probably a good idea to tackle at some point. But I don't think we are close to 4K of stderr (I think that's the default size- maybe it's only 1K). The problem here was too little data, so it wasn't flushed. I'm sorry this test is causing trouble for you, Matt. Hopefully the end state of having low-level test coverage of the wire protocol is worth it. FWIW I think the tests have already been useful for finding oddities with stdio stream handling during hooks and identifying wonky behavior of the existing wire protocol. Not a problem, and I agree that the low level tests are better. So hopefully that didn't come off as a criticism of your work, because it wasn't meant to be. I was just thinking out loud that the full buffering on Windows is a PITA, and maybe there's something we can do other than sprinkling flushes all over. The other patch with the hook flushing was something I had to hunt down a year or two ago (I remember wondering if it should go in the finally block, but the tests passed, and I wasn't sure if it would change the order there). And there were several other cases of needing a random flush here or there to get Windows tests aligned with everything else. Wrapping stderr in a proxy is beyond my python ability, and IDK what the downsides are. So I guess we can just watch and see if any more problems pop up in the short term? But I don't have a lot of confidence that we can find every place that needs to be flushed manually. ___ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
Re: [PATCH] wireproto: explicitly flush stdio to prevent stalls on Windows
On Sun, 11 Mar 2018 05:43:13 -0400, Yuya Nishihara wrote: On Sun, 11 Mar 2018 00:04:45 -0500, Matt Harbison wrote: # HG changeset patch # User Matt Harbison # Date 1520744281 18000 # Sat Mar 10 23:58:01 2018 -0500 # Node ID 1145671119a40e82932f63f83ad4049727416174 # Parent 963b4223d14fa419df2a82fbe47cd55075707b6a wireproto: explicitly flush stdio to prevent stalls on Windows This is the key to fixing the hangs on Windows in D2720[1]. I put flushes in a bunch of other places that didn't help, but I suspect that's more a lack of test coverage than anything else. Chasing down stuff like this is pretty painful. I'm wondering if we can put a proxy around sys.stderr (and sys.stdout?) on Windows (only when daemonized?) that will flush on every write (or at least every write with a '\n'). [1] https://www.mercurial-scm.org/pipermail/mercurial-devel/2018-March/113352.html diff --git a/mercurial/debugcommands.py b/mercurial/debugcommands.py --- a/mercurial/debugcommands.py +++ b/mercurial/debugcommands.py @@ -2769,6 +2769,8 @@ def debugwireproto(ui, repo, **opts): # Now perform actions based on the parsed wire language instructions. for action, lines in blocks: +util.stdout.flush() This one is suspicious. Can you clarify which data is flushed? I think fileobjectobserver should flush log per method call. Yep, fileobjectobserver flushes worked without this. I've got no idea how this flush broke things free in the first place. Maybe the server was waiting for more client requests, this gave it to the server, which then wrote enough more output to flush the server buffers? ___ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
Re: [PATCH] wireproto: explicitly flush stdio to prevent stalls on Windows
On Sun, Mar 11, 2018 at 7:52 AM, Yuya Nishihara wrote: > On Sun, 11 Mar 2018 08:55:41 -0400, Matt Harbison wrote: > > > > > On Mar 11, 2018, at 5:43 AM, Yuya Nishihara wrote: > > > > > >> On Sun, 11 Mar 2018 00:04:45 -0500, Matt Harbison wrote: > > >> # HG changeset patch > > >> # User Matt Harbison > > >> # Date 1520744281 18000 > > >> # Sat Mar 10 23:58:01 2018 -0500 > > >> # Node ID 1145671119a40e82932f63f83ad4049727416174 > > >> # Parent 963b4223d14fa419df2a82fbe47cd55075707b6a > > >> wireproto: explicitly flush stdio to prevent stalls on Windows > > >> > > >> This is the key to fixing the hangs on Windows in D2720[1]. I put > flushes in a > > >> bunch of other places that didn't help, but I suspect that's more a > lack of test > > >> coverage than anything else. > > >> > > >> Chasing down stuff like this is pretty painful. I'm wondering if we > can put a > > >> proxy around sys.stderr (and sys.stdout?) on Windows (only when > daemonized?) > > >> that will flush on every write (or at least every write with a '\n'). > > >> > > >> [1] https://www.mercurial-scm.org/pipermail/mercurial-devel/ > 2018-March/113352.html > > >> > > >> diff --git a/mercurial/debugcommands.py b/mercurial/debugcommands.py > > >> --- a/mercurial/debugcommands.py > > >> +++ b/mercurial/debugcommands.py > > >> @@ -2769,6 +2769,8 @@ def debugwireproto(ui, repo, **opts): > > >> > > >> # Now perform actions based on the parsed wire language > instructions. > > >> for action, lines in blocks: > > >> +util.stdout.flush() > > > > > > This one is suspicious. Can you clarify which data is flushed? I think > > > fileobjectobserver should flush log per method call. > > > > Yeah, this made no sense to me. I would have expected that it needed to > be flushed on the server side. But, here’s where it stalls without this: > > > > https://www.mercurial-scm.org/pipermail/mercurial-devel/ > 2018-March/113388.html > > Ok, if the server stalls, that might be because of the --noreadstderr > option. > The client must read stderr so the server can write into the pipe. > That's probably it. We're using unbuffered pipes to the process running the SSH protocol server. So we could very easily get into a deadlock situation if the client isn't reading from a pipe when the other end writes to it. Maybe the correct thing to do is have the client always read from stderr but not print the data unless explicitly instructed to. That would make the code a bit more complicated. But I'm not sure how else to get deterministic results here, since the very nature of polling two pipes is non-deterministic. The only way to be deterministic is to perform blocking reads for deterministic amounts - either readline() or read(n). I'm sorry this test is causing trouble for you, Matt. Hopefully the end state of having low-level test coverage of the wire protocol is worth it. FWIW I think the tests have already been useful for finding oddities with stdio stream handling during hooks and identifying wonky behavior of the existing wire protocol. ___ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
Re: [PATCH] wireproto: explicitly flush stdio to prevent stalls on Windows
On Sun, 11 Mar 2018 08:55:41 -0400, Matt Harbison wrote: > > > On Mar 11, 2018, at 5:43 AM, Yuya Nishihara wrote: > > > >> On Sun, 11 Mar 2018 00:04:45 -0500, Matt Harbison wrote: > >> # HG changeset patch > >> # User Matt Harbison > >> # Date 1520744281 18000 > >> # Sat Mar 10 23:58:01 2018 -0500 > >> # Node ID 1145671119a40e82932f63f83ad4049727416174 > >> # Parent 963b4223d14fa419df2a82fbe47cd55075707b6a > >> wireproto: explicitly flush stdio to prevent stalls on Windows > >> > >> This is the key to fixing the hangs on Windows in D2720[1]. I put flushes > >> in a > >> bunch of other places that didn't help, but I suspect that's more a lack > >> of test > >> coverage than anything else. > >> > >> Chasing down stuff like this is pretty painful. I'm wondering if we can > >> put a > >> proxy around sys.stderr (and sys.stdout?) on Windows (only when > >> daemonized?) > >> that will flush on every write (or at least every write with a '\n'). > >> > >> [1] > >> https://www.mercurial-scm.org/pipermail/mercurial-devel/2018-March/113352.html > >> > >> diff --git a/mercurial/debugcommands.py b/mercurial/debugcommands.py > >> --- a/mercurial/debugcommands.py > >> +++ b/mercurial/debugcommands.py > >> @@ -2769,6 +2769,8 @@ def debugwireproto(ui, repo, **opts): > >> > >> # Now perform actions based on the parsed wire language instructions. > >> for action, lines in blocks: > >> +util.stdout.flush() > > > > This one is suspicious. Can you clarify which data is flushed? I think > > fileobjectobserver should flush log per method call. > > Yeah, this made no sense to me. I would have expected that it needed to be > flushed on the server side. But, here’s where it stalls without this: > > https://www.mercurial-scm.org/pipermail/mercurial-devel/2018-March/113388.html Ok, if the server stalls, that might be because of the --noreadstderr option. The client must read stderr so the server can write into the pipe. ___ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
Re: [PATCH] wireproto: explicitly flush stdio to prevent stalls on Windows
> On Mar 11, 2018, at 5:43 AM, Yuya Nishihara wrote: > >> On Sun, 11 Mar 2018 00:04:45 -0500, Matt Harbison wrote: >> # HG changeset patch >> # User Matt Harbison >> # Date 1520744281 18000 >> # Sat Mar 10 23:58:01 2018 -0500 >> # Node ID 1145671119a40e82932f63f83ad4049727416174 >> # Parent 963b4223d14fa419df2a82fbe47cd55075707b6a >> wireproto: explicitly flush stdio to prevent stalls on Windows >> >> This is the key to fixing the hangs on Windows in D2720[1]. I put flushes >> in a >> bunch of other places that didn't help, but I suspect that's more a lack of >> test >> coverage than anything else. >> >> Chasing down stuff like this is pretty painful. I'm wondering if we can put >> a >> proxy around sys.stderr (and sys.stdout?) on Windows (only when daemonized?) >> that will flush on every write (or at least every write with a '\n'). >> >> [1] >> https://www.mercurial-scm.org/pipermail/mercurial-devel/2018-March/113352.html >> >> diff --git a/mercurial/debugcommands.py b/mercurial/debugcommands.py >> --- a/mercurial/debugcommands.py >> +++ b/mercurial/debugcommands.py >> @@ -2769,6 +2769,8 @@ def debugwireproto(ui, repo, **opts): >> >> # Now perform actions based on the parsed wire language instructions. >> for action, lines in blocks: >> +util.stdout.flush() > > This one is suspicious. Can you clarify which data is flushed? I think > fileobjectobserver should flush log per method call. Yeah, this made no sense to me. I would have expected that it needed to be flushed on the server side. But, here’s where it stalls without this: https://www.mercurial-scm.org/pipermail/mercurial-devel/2018-March/113388.html I’ll try the fileobjectobserver flush (I think I have that already, trying to get the stderr data to show up). ___ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
Re: [PATCH] wireproto: explicitly flush stdio to prevent stalls on Windows
On Sun, 11 Mar 2018 00:04:45 -0500, Matt Harbison wrote: > # HG changeset patch > # User Matt Harbison > # Date 1520744281 18000 > # Sat Mar 10 23:58:01 2018 -0500 > # Node ID 1145671119a40e82932f63f83ad4049727416174 > # Parent 963b4223d14fa419df2a82fbe47cd55075707b6a > wireproto: explicitly flush stdio to prevent stalls on Windows > > This is the key to fixing the hangs on Windows in D2720[1]. I put flushes in > a > bunch of other places that didn't help, but I suspect that's more a lack of > test > coverage than anything else. > > Chasing down stuff like this is pretty painful. I'm wondering if we can put a > proxy around sys.stderr (and sys.stdout?) on Windows (only when daemonized?) > that will flush on every write (or at least every write with a '\n'). > > [1] > https://www.mercurial-scm.org/pipermail/mercurial-devel/2018-March/113352.html > > diff --git a/mercurial/debugcommands.py b/mercurial/debugcommands.py > --- a/mercurial/debugcommands.py > +++ b/mercurial/debugcommands.py > @@ -2769,6 +2769,8 @@ def debugwireproto(ui, repo, **opts): > > # Now perform actions based on the parsed wire language instructions. > for action, lines in blocks: > +util.stdout.flush() This one is suspicious. Can you clarify which data is flushed? I think fileobjectobserver should flush log per method call. ___ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel