Re: stable ordering of test output

2017-04-20 Thread Yuya Nishihara
On Wed, 19 Apr 2017 15:22:42 -0400, Augie Fackler wrote:
> On Sat, Apr 15, 2017 at 1:04 PM, Augie Fackler  wrote:
> >
> >> On Apr 15, 2017, at 5:48 AM, Yuya Nishihara  wrote:
> >>
> >> On Thu, 13 Apr 2017 16:17:34 -0400, Augie Fackler wrote:
> >>> # HG changeset patch
> >>> # User Augie Fackler 
> >>> # Date 1492114180 14400
> >>> #  Thu Apr 13 16:09:40 2017 -0400
> >>> # Node ID ec81fd7580f3e31aa92e8834ffbcf2a8e80e72e3
> >>> # Parent  35afb54dbb4df2975dbbf0e1525b98611f18ba85
> >>> sshpeer: try harder to snag stderr when stdout closes unexpectedly
> >>>
> >>> Resolves test failures on FreeBSD, but I'm not happy about the fix.
> >>>
> >>> diff --git a/mercurial/sshpeer.py b/mercurial/sshpeer.py
> >>> --- a/mercurial/sshpeer.py
> >>> +++ b/mercurial/sshpeer.py
> >>> @@ -110,9 +110,17 @@ class doublepipe(object):
> >>> if mainready:
> >>> meth = getattr(self._main, methname)
> >>> if data is None:
> >>> -return meth()
> >>> +r = meth()
> >>> else:
> >>> -return meth(data)
> >>> +r = meth(data)
> >>> +if not r and data != 0:
> >>
> >> I'm not sure what this condition is intended for. It's always true for
> >> write() because r is None and data is a str.
> >
> > This forwarder is also used for read(), where data is the number of bytes 
> > to be read. At least, I think that’s right, now I’m doubting myself.
> 
> Should I go ahead and mail this patch? Perhaps with some extra
> comments? Or do people object to this fix?

Seems fine, but can you update the patch so that write() doesn't always call
_forwardoutput()? Perhaps this hack could be moved to read() and readline().
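
One way to picture the suggestion above is a minimal Python 3 sketch (POSIX only, since it calls select() on pipes). It is not Mercurial's code: DoublePipeSketch, forward_output and the list used as a sink are hypothetical names invented for illustration. Only read() and readline() take the extra look at the side channel, and write() is left alone.

```python
import os
import select


def forward_output(sink, side):
    """Append any bytes currently readable on `side` to `sink` as remote: lines."""
    fd = side.fileno()
    while select.select([fd], [], [], 0)[0]:
        chunk = os.read(fd, 4096)
        if not chunk:  # the side channel is at EOF as well
            break
        sink.extend(b"remote: " + line for line in chunk.splitlines())


class DoublePipeSketch(object):
    """Hypothetical stand-in for sshpeer.doublepipe, not Mercurial's code."""

    def __init__(self, sink, main, side):
        self._sink = sink
        self._main = main
        self._side = side

    def _drainoneof(self, r, requested=None):
        # An empty result from a non-empty read request suggests the main
        # pipe closed, so take one last look at the side channel.
        if not r and requested != 0:
            forward_output(self._sink, self._side)
        return r

    def read(self, size):
        return self._drainoneof(self._main.read(size), size)

    def readline(self):
        return self._drainoneof(self._main.readline())

    def write(self, data):
        # No extra drain here: the result of write() says nothing about EOF.
        return self._main.write(data)
```

With this shape the `data != 0` special case only matters for read(0), and write() never pays for a side-channel check.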
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


Re: stable ordering of test output

2017-04-19 Thread Jun Wu
I'm +1 on this patch. It does not look harmful.

Excerpts from Augie Fackler's message of 2017-04-19 15:22:42 -0400:
> >> I'm not sure what this condition is intended for. It's always true for
> >> write() because r is None and data is a str.
> >
> > This forwarder is also used for read(), where data is the number of
> > bytes to be read. At least, I think that’s right, now I’m doubting
> > myself.
> 
> Should I go ahead and mail this patch? Perhaps with some extra
> comments? Or do people object to this fix?


Re: stable ordering of test output

2017-04-19 Thread Augie Fackler
On Sat, Apr 15, 2017 at 1:04 PM, Augie Fackler  wrote:
>
>> On Apr 15, 2017, at 5:48 AM, Yuya Nishihara  wrote:
>>
>> On Thu, 13 Apr 2017 16:17:34 -0400, Augie Fackler wrote:
>>> # HG changeset patch
>>> # User Augie Fackler 
>>> # Date 1492114180 14400
>>> #  Thu Apr 13 16:09:40 2017 -0400
>>> # Node ID ec81fd7580f3e31aa92e8834ffbcf2a8e80e72e3
>>> # Parent  35afb54dbb4df2975dbbf0e1525b98611f18ba85
>>> sshpeer: try harder to snag stderr when stdout closes unexpectedly
>>>
>>> Resolves test failures on FreeBSD, but I'm not happy about the fix.
>>>
>>> diff --git a/mercurial/sshpeer.py b/mercurial/sshpeer.py
>>> --- a/mercurial/sshpeer.py
>>> +++ b/mercurial/sshpeer.py
>>> @@ -110,9 +110,17 @@ class doublepipe(object):
>>> if mainready:
>>> meth = getattr(self._main, methname)
>>> if data is None:
>>> -return meth()
>>> +r = meth()
>>> else:
>>> -return meth(data)
>>> +r = meth(data)
>>> +if not r and data != 0:
>>
>> I'm not sure what this condition is intended for. It's always true for
>> write() because r is None and data is a str.
>
> This forwarder is also used for read(), where data is the number of bytes to 
> be read. At least, I think that’s right, now I’m doubting myself.

Should I go ahead and mail this patch? Perhaps with some extra
comments? Or do people object to this fix?

>
>>
>>> +# We've observed a condition that indicates the
>>> +# stdout closed unexpectedly. Check stderr one
>>> +# more time and snag anything that's there before
>>> +# letting anyone know the main part of the pipe
>>> +# closed prematurely.
>>> +_forwardoutput(self._ui, self._side)
>>> +return r
>


Re: stable ordering of test output

2017-04-15 Thread Augie Fackler

> On Apr 15, 2017, at 5:48 AM, Yuya Nishihara  wrote:
> 
> On Thu, 13 Apr 2017 16:17:34 -0400, Augie Fackler wrote:
>> # HG changeset patch
>> # User Augie Fackler 
>> # Date 1492114180 14400
>> #  Thu Apr 13 16:09:40 2017 -0400
>> # Node ID ec81fd7580f3e31aa92e8834ffbcf2a8e80e72e3
>> # Parent  35afb54dbb4df2975dbbf0e1525b98611f18ba85
>> sshpeer: try harder to snag stderr when stdout closes unexpectedly
>> 
>> Resolves test failures on FreeBSD, but I'm not happy about the fix.
>> 
>> diff --git a/mercurial/sshpeer.py b/mercurial/sshpeer.py
>> --- a/mercurial/sshpeer.py
>> +++ b/mercurial/sshpeer.py
>> @@ -110,9 +110,17 @@ class doublepipe(object):
>> if mainready:
>> meth = getattr(self._main, methname)
>> if data is None:
>> -return meth()
>> +r = meth()
>> else:
>> -return meth(data)
>> +r = meth(data)
>> +if not r and data != 0:
> 
> I'm not sure what this condition is intended for. It's always true for
> write() because r is None and data is a str.

This forwarder is also used for read(), where data is the number of bytes to be 
read. At least, I think that’s right, now I’m doubting myself.

> 
>> +# We've observed a condition that indicates the
>> +# stdout closed unexpectedly. Check stderr one
>> +# more time and snag anything that's there before
>> +# letting anyone know the main part of the pipe
>> +# closed prematurely.
>> +_forwardoutput(self._ui, self._side)
>> +return r



Re: stable ordering of test output

2017-04-15 Thread Yuya Nishihara
On Thu, 13 Apr 2017 16:17:34 -0400, Augie Fackler wrote:
> # HG changeset patch
> # User Augie Fackler 
> # Date 1492114180 14400
> #  Thu Apr 13 16:09:40 2017 -0400
> # Node ID ec81fd7580f3e31aa92e8834ffbcf2a8e80e72e3
> # Parent  35afb54dbb4df2975dbbf0e1525b98611f18ba85
> sshpeer: try harder to snag stderr when stdout closes unexpectedly
> 
> Resolves test failures on FreeBSD, but I'm not happy about the fix.
> 
> diff --git a/mercurial/sshpeer.py b/mercurial/sshpeer.py
> --- a/mercurial/sshpeer.py
> +++ b/mercurial/sshpeer.py
> @@ -110,9 +110,17 @@ class doublepipe(object):
>  if mainready:
>  meth = getattr(self._main, methname)
>  if data is None:
> -return meth()
> +r = meth()
>  else:
> -return meth(data)
> +r = meth(data)
> +if not r and data != 0:

I'm not sure what this condition is intended for. It's always true for
write() because r is None and data is a str.

> +# We've observed a condition that indicates the
> +# stdout closed unexpectedly. Check stderr one
> +# more time and snag anything that's there before
> +# letting anyone know the main part of the pipe
> +# closed prematurely.
> +_forwardoutput(self._ui, self._side)
> +return r
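
The concern about the condition can be checked in isolation. This is a small sketch, assuming (as stated in this message) that write() returns None and receives a string; would_drain is a hypothetical helper that only mirrors the patch's test.

```python
def would_drain(r, data):
    """Mirror the patch's test: `if not r and data != 0`."""
    return (not r) and data != 0


# write(): returns None and data is a string, so the drain always runs
assert would_drain(None, "some bytes")
# read(4) at EOF: empty result, non-zero request, so the drain runs
assert would_drain("", 4)
# read(4) that returned data: no drain
assert not would_drain("abcd", 4)
# read(0): an empty result is expected here, so no drain
assert not would_drain("", 0)
```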


Re: stable ordering of test output

2017-04-13 Thread Matt Harbison

On Thu, 13 Apr 2017 16:17:34 -0400, Augie Fackler  wrote:

> On Thu, Apr 13, 2017 at 3:55 PM, Augie Fackler  wrote:
>> On Wed, Mar 8, 2017 at 10:44 AM, Yuya Nishihara  wrote:
>>> On Tue, 7 Mar 2017 17:56:58 +0100, Pierre-Yves David wrote:
>>>> On the other hand, this is probably not so bundle2 specific. We have
>>>> some "select" logic to read stdout and stderr as soon as possible. This
>>>> is the main suspect as it is possible that this logic behaves differently
>>>> under linux and other unix (not too much effort has been put into it).
>>>
>>> posix.poll() waits for every type of operation, whether or not the fd is
>>> e.g. writable. IIRC, this doesn't always work on FreeBSD since the
>>> underlying resource of read/write ends might be shared in the kernel.
>>>
>>> But I don't think this is the source of the unstable output.
>>
>> I've had a little time today between things to try and debug this.
>> What I've found so far:
>>
>> 1) when the test passes, the remote: output is printed by the
>> _forwardoutput method in sshpeer, presumably since stderr makes it to
>> the client before the close of stdout.
>> 2) When the test fails (as on BSD, and I guess Solaris), the client
>> notices that stdout closed before stderr. It then aborts the
>> transaction and sshpeer.cleanup() notices some data chilling on stderr
>> and ensures it gets read and printed.
>>
>> I'm not really sure why BSD systems would be quicker at communicating
>> the closed FD than other systems. I'm poking at dummyssh now to see if
>> maybe it's weirdness from there...
>
> Here's a patch that seems to work. I'm not happy about it, but it
> makes the behavior consistent, and it looks mostly harmless.

This fixes it for Windows too.  Thanks!

> # HG changeset patch
> # User Augie Fackler 
> # Date 1492114180 14400
> #  Thu Apr 13 16:09:40 2017 -0400
> # Node ID ec81fd7580f3e31aa92e8834ffbcf2a8e80e72e3
> # Parent  35afb54dbb4df2975dbbf0e1525b98611f18ba85
> sshpeer: try harder to snag stderr when stdout closes unexpectedly
>
> Resolves test failures on FreeBSD, but I'm not happy about the fix.
>
> diff --git a/mercurial/sshpeer.py b/mercurial/sshpeer.py
> --- a/mercurial/sshpeer.py
> +++ b/mercurial/sshpeer.py
> @@ -110,9 +110,17 @@ class doublepipe(object):
>              if mainready:
>                  meth = getattr(self._main, methname)
>                  if data is None:
> -                    return meth()
> +                    r = meth()
>                  else:
> -                    return meth(data)
> +                    r = meth(data)
> +                if not r and data != 0:
> +                    # We've observed a condition that indicates the
> +                    # stdout closed unexpectedly. Check stderr one
> +                    # more time and snag anything that's there before
> +                    # letting anyone know the main part of the pipe
> +                    # closed prematurely.
> +                    _forwardoutput(self._ui, self._side)
> +                return r
>
>      def close(self):
>          return self._main.close()


Re: stable ordering of test output

2017-04-13 Thread Danek Duvall
Augie Fackler wrote:

> On Thu, Apr 13, 2017 at 3:55 PM, Augie Fackler  wrote:
> > On Wed, Mar 8, 2017 at 10:44 AM, Yuya Nishihara  wrote:
> >> On Tue, 7 Mar 2017 17:56:58 +0100, Pierre-Yves David wrote:
> >>> On the other hand, this is probably not so bundle2 specific. We have
> >>> some "select" logic to read stdout and stderr as soon as possible. This
> >>> is the main suspect as it is possible that this logic behaves differently
> >>> under linux and other unix (not too much effort has been put into it).
> >>
> >> posix.poll() waits for every type of operation, whether or not the fd is
> >> e.g. writable. IIRC, this doesn't always work on FreeBSD since the
> >> underlying resource of read/write ends might be shared in the kernel.
> >>
> >> But I don't think this is the source of the unstable output.
> >
> > I've had a little time today between things to try and debug this.
> > What I've found so far:
> >
> > 1) when the test passes, the remote: output is printed by the
> > _forwardoutput method in sshpeer, presumably since stderr makes it to
> > the client before the close of stdout.
> > 2) When the test fails (as on BSD, and I guess Solaris), the client
> > notices that stdout closed before stderr. It then aborts the
> > transaction and sshpeer.cleanup() notices some data chilling on stderr
> > and ensures it gets read and printed.
> >
> > I'm not really sure why BSD systems would be quicker at communicating
> > the closed FD than other systems. I'm poking at dummyssh now to see if
> > maybe it's weirdness from there...
> 
> Here's a patch that seems to work. I'm not happy about it, but it
> makes the behavior consistent, and it looks mostly harmless.

Confirmed that it fixes the problem on Solaris, too.

Thanks!

Danek


Re: stable ordering of test output

2017-04-13 Thread Augie Fackler
On Thu, Apr 13, 2017 at 3:55 PM, Augie Fackler  wrote:
> On Wed, Mar 8, 2017 at 10:44 AM, Yuya Nishihara  wrote:
>> On Tue, 7 Mar 2017 17:56:58 +0100, Pierre-Yves David wrote:
>>> On the other hand, this is probably not so bundle2 specific. We have
>>> some "select" logic to read stdout and stderr as soon as possible. This
>>> is the main suspect as it is possible that this logic behaves differently
>>> under linux and other unix (not too much effort has been put into it).
>>
>> posix.poll() waits for every type of operation, whether or not the fd is
>> e.g. writable. IIRC, this doesn't always work on FreeBSD since the
>> underlying resource of read/write ends might be shared in the kernel.
>>
>> But I don't think this is the source of the unstable output.
>
> I've had a little time today between things to try and debug this.
> What I've found so far:
>
> 1) when the test passes, the remote: output is printed by the
> _forwardoutput method in sshpeer, presumably since stderr makes it to
> the client before the close of stdout.
> 2) When the test fails (as on BSD, and I guess Solaris), the client
> notices that stdout closed before stderr. It then aborts the
> transaction and sshpeer.cleanup() notices some data chilling on stderr
> and ensures it gets read and printed.
>
> I'm not really sure why BSD systems would be quicker at communicating
> the closed FD than other systems. I'm poking at dummyssh now to see if
> maybe it's weirdness from there...

Here's a patch that seems to work. I'm not happy about it, but it
makes the behavior consistent, and it looks mostly harmless.

# HG changeset patch
# User Augie Fackler 
# Date 1492114180 14400
#  Thu Apr 13 16:09:40 2017 -0400
# Node ID ec81fd7580f3e31aa92e8834ffbcf2a8e80e72e3
# Parent  35afb54dbb4df2975dbbf0e1525b98611f18ba85
sshpeer: try harder to snag stderr when stdout closes unexpectedly

Resolves test failures on FreeBSD, but I'm not happy about the fix.

diff --git a/mercurial/sshpeer.py b/mercurial/sshpeer.py
--- a/mercurial/sshpeer.py
+++ b/mercurial/sshpeer.py
@@ -110,9 +110,17 @@ class doublepipe(object):
             if mainready:
                 meth = getattr(self._main, methname)
                 if data is None:
-                    return meth()
+                    r = meth()
                 else:
-                    return meth(data)
+                    r = meth(data)
+                if not r and data != 0:
+                    # We've observed a condition that indicates the
+                    # stdout closed unexpectedly. Check stderr one
+                    # more time and snag anything that's there before
+                    # letting anyone know the main part of the pipe
+                    # closed prematurely.
+                    _forwardoutput(self._ui, self._side)
+                return r

     def close(self):
         return self._main.close()


Re: stable ordering of test output

2017-04-13 Thread Augie Fackler
On Wed, Mar 8, 2017 at 10:44 AM, Yuya Nishihara  wrote:
> On Tue, 7 Mar 2017 17:56:58 +0100, Pierre-Yves David wrote:
>> On the other hand, this is probably not so bundle2 specific. We have
>> some "select" logic to read stdout and stderr as soon as possible. This
>> is the main suspect as it is possible that this logic behaves differently
>> under linux and other unix (not too much effort has been put into it).
>
> posix.poll() waits for every type of operation, whether or not the fd is
> e.g. writable. IIRC, this doesn't always work on FreeBSD since the
> underlying resource of read/write ends might be shared in the kernel.
>
> But I don't think this is the source of the unstable output.

I've had a little time today between things to try and debug this.
What I've found so far:

1) when the test passes, the remote: output is printed by the
_forwardoutput method in sshpeer, presumably since stderr makes it to
the client before the close of stdout.
2) When the test fails (as on BSD, and I guess Solaris), the client
notices that stdout closed before stderr. It then aborts the
transaction and sshpeer.cleanup() notices some data chilling on stderr
and ensures it gets read and printed.

I'm not really sure why BSD systems would be quicker at communicating
the closed FD than other systems. I'm poking at dummyssh now to see if
maybe it's weirdness from there...
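
The two orderings described above can be modelled deterministically. This is only a sketch in Python 3 (POSIX only): run_client and its flag are hypothetical, two local pipes stand in for the server's stdout and stderr, and the real race depends on when the kernel delivers each event, which this does not attempt to reproduce.

```python
import os
import select


def run_client(drain_side_before_eof_report):
    """Return the client's output lines for one of the two orderings."""
    out_r, out_w = os.pipe()  # stands in for the server's stdout
    err_r, err_w = os.pipe()  # stands in for the server's stderr
    os.write(err_w, b"abort: incompatible Mercurial client; bundle2 required\n")
    os.close(err_w)
    os.close(out_w)  # the server goes away: stdout hits EOF

    log = []

    def drain():
        # Read whatever is available on stderr without blocking.
        while select.select([err_r], [], [], 0)[0]:
            chunk = os.read(err_r, 4096)
            if not chunk:
                break
            log.extend("remote: " + l for l in chunk.decode().splitlines())

    data = os.read(out_r, 4)
    if not data:
        if drain_side_before_eof_report:
            drain()  # case 1: stderr is seen before the EOF is reported
        log.append("abort: stream ended unexpectedly (got 0 bytes, expected 4)")
    drain()  # case 2: leftover stderr is only picked up during cleanup
    os.close(out_r)
    os.close(err_r)
    return log
```

Calling run_client(True) yields the remote: line first, as in the passing test; run_client(False) yields it after the abort line, as in the failing one.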


Re: stable ordering of test output

2017-03-08 Thread Yuya Nishihara
On Tue, 7 Mar 2017 17:56:58 +0100, Pierre-Yves David wrote:
> On the other hand, this is probably not so bundle2 specific. We have
> some "select" logic to read stdout and stderr as soon as possible. This
> is the main suspect as it is possible that this logic behaves differently
> under linux and other unix (not too much effort has been put into it).

posix.poll() waits for every type of operation, whether or not the fd is
e.g. writable. IIRC, this doesn't always work on FreeBSD since the underlying
resource of read/write ends might be shared in the kernel.

But I don't think this is the source of the unstable output.
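
To illustrate the difference, here is a sketch in Python 3 (POSIX only). It assumes a helper that passes the same descriptors in all three select() sets, which is how this message describes posix.poll(); poll_any and poll_readable are hypothetical names, not Mercurial functions.

```python
import os
import select


def poll_any(fds):
    """Block until any activity (read, write or exception) on `fds`."""
    readable, writable, errored = select.select(fds, fds, fds)
    return sorted(set(readable + writable + errored))


def poll_readable(fds, timeout=None):
    """Block until at least one of `fds` is readable (or the timeout expires)."""
    return select.select(fds, [], [], timeout)[0]


r, w = os.pipe()
# The write end of an empty pipe is writable, so poll_any() returns at once
# even though there is nothing to read yet.
print(w in poll_any([r, w]))
# Asking only for readability reports nothing until data arrives.
print(poll_readable([r], 0))
```

Whether the read end is also reported as writable is platform-dependent, which is the kind of difference mentioned above for FreeBSD.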


Re: stable ordering of test output

2017-03-07 Thread Matt Harbison
On Fri, 03 Mar 2017 17:45:56 -0500, Danek Duvall wrote:

> I frequently get failures like this:
>
> --- .../mercurial.hg/tests/test-bundle2-exchange.t
> +++ .../mercurial.hg/tests/test-bundle2-exchange.t.err
> @@ -1042,11 +1042,11 @@
>    $ hg --config devel.legacy.exchange=bundle1 clone ssh://user@dummy/bundle2onlyserver not-bundle2-ssh
>    requesting all changes
>    adding changesets
> -  remote: abort: incompatible Mercurial client; bundle2 required
> -  remote: (see https://www.mercurial-scm.org/wiki/IncompatibleClient)
>    transaction abort!
>    rollback completed
>    abort: stream ended unexpectedly (got 0 bytes, expected 4)
> +  remote: abort: incompatible Mercurial client; bundle2 required
> +  remote: (see https://www.mercurial-scm.org/wiki/IncompatibleClient)
>    [255]
>
>    $ cat > bundle2onlyserver/.hg/hgrc << EOF
>
> ERROR: test-bundle2-exchange.t output changed
>
> It's usually fairly consistent, at least for a period of time, and then it
> goes away.  Presumably it's some sort of fairly stable timing issue, and
> possibly unique to the environment I'm running in (at least, I assume that
> the official tests aren't showing this).

The symptoms seem similar to what was happening on Windows a few years ago.

https://www.mercurial-scm.org/repo/hg/rev/83f6c4733ecc
https://www.mercurial-scm.org/repo/hg/rev/2abbf4750915

The commit referenced in the first link's message might also be of interest.

> I could patch the tests locally to reorder the lines, but if it's really an
> environmental issue, then that's not guaranteed to work consistently even
> for me.
>
> Is this (ever? frequently?) an issue for anyone else?
>
> I can't think of any particularly satisfying solution to this, other than
> perhaps separating the remote lines from the local lines, and comparing
> each of those independently.  Would that make sense?
>
> Thanks,
> Danek
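
The idea of comparing remote and local lines independently could look roughly like the Python sketch below. It is not part of Mercurial's test runner; split_streams and same_modulo_interleaving are hypothetical helpers, and the "remote: " prefix is taken from the test output quoted above.

```python
def split_streams(lines, prefix="remote: "):
    """Split output lines into (local, remote), preserving order in each."""
    local, remote = [], []
    for line in lines:
        target = remote if line.strip().startswith(prefix) else local
        target.append(line.strip())
    return local, remote


def same_modulo_interleaving(expected, actual):
    """True if both outputs agree once remote and local lines are separated."""
    return split_streams(expected) == split_streams(actual)
```

This accepts any interleaving of the two streams while still requiring each stream to appear in the expected order. The trade-off is that a real ordering regression between local and remote output would no longer be caught.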


Re: stable ordering of test output

2017-03-07 Thread Danek Duvall
Pierre-Yves David wrote:

> >>It's also a problem on the FreeBSD buildbot. I don't know enough about
> >>the bundle2 code to understand how to fix it, but maybe we can figure
> >>out a way to get marmoute an account on a machine that would help him
> >>diagnose? Danek, do you have a solaris machine you could get him
> >>access to for testing purposes?
> >
> >Yeah, I tried to set up the BSD machine on the gcc compile farm to debug
> >this, but the machine has such ancient versions of everything else that I
> >finally gave up (after recompiling my own python and a couple of
> >dependencies). I have not had time to spend on this since that last
> >attempt. Having an account on something showing the issue would help.
> >
> >On the other hand, this is probably not so bundle2 specific. We have
> >some "select" logic to read stdout and stderr as soon as possible. This
> >is the main suspect as it is possible that this logic behaves differently
> >under linux and other unix (not too much effort has been put into it).
> >So there is no need for deep knowledge of bundle2 to debug this if
> >someone else wants to give it a shot.
> 
> As per Augie's request on IRC, let me be more specific: I would have a look
> at the behavior of the logic in sshpeer.py that uses the "doublepipe" class
> and the associated "util.poll".
> 
> It is responsible for reading the first stream that has data (of stderr and
> stdout). This kind of change in output seems to imply that either the server
> is flushing the stream in different order or that the "ready to read"
> detection is disabled or behaving differently.

It is in fact not specific to bundle2, as I'm also currently seeing the
problem in test-ssh-bundle1.

I would think this is an inherently racy problem, as with any asynchronous
execution, though I would then expect more variance in the results, so
perhaps I'm mistaken here.

It looks like my home machine reproduces the problem.  I'll set you up with
an account and send you the login info.

Thanks,
Danek


Re: stable ordering of test output

2017-03-07 Thread Pierre-Yves David



On 03/07/2017 05:56 PM, Pierre-Yves David wrote:
> On 03/07/2017 05:49 PM, Augie Fackler wrote:
>> On Fri, Mar 03, 2017 at 04:37:54PM -0800, Jun Wu wrote:
>>> Excerpts from Danek Duvall's message of 2017-03-03 14:45:56 -0800:
>>>> I frequently get failures like this:
>>>>
>>>> --- .../mercurial.hg/tests/test-bundle2-exchange.t
>>>> +++ .../mercurial.hg/tests/test-bundle2-exchange.t.err
>>>> @@ -1042,11 +1042,11 @@
>>>>    $ hg --config devel.legacy.exchange=bundle1 clone ssh://user@dummy/bundle2onlyserver not-bundle2-ssh
>>>>    requesting all changes
>>>>    adding changesets
>>>> -  remote: abort: incompatible Mercurial client; bundle2 required
>>>> -  remote: (see https://www.mercurial-scm.org/wiki/IncompatibleClient)
>>>>    transaction abort!
>>>>    rollback completed
>>>>    abort: stream ended unexpectedly (got 0 bytes, expected 4)
>>>> +  remote: abort: incompatible Mercurial client; bundle2 required
>>>> +  remote: (see https://www.mercurial-scm.org/wiki/IncompatibleClient)
>>>>    [255]
>>>>
>>>>    $ cat > bundle2onlyserver/.hg/hgrc << EOF
>>>>
>>>> ERROR: test-bundle2-exchange.t output changed
>>>>
>>>> It's usually fairly consistent, at least for a period of time, and then it
>>>> goes away.  Presumably it's some sort of fairly stable timing issue, and
>>>> possibly unique to the environment I'm running in (at least, I assume that
>>>> the official tests aren't showing this).
>>>>
>>>> I could patch the tests locally to reorder the lines, but if it's really an
>>>> environmental issue, then that's not guaranteed to work consistently even
>>>> for me.
>>>>
>>>> Is this (ever? frequently?) an issue for anyone else?
>>>
>>> Yes. We have seen this on our OSX tests. I guess it's "select()" returning
>>> different things.
>>
>> It's also a problem on the FreeBSD buildbot. I don't know enough about
>> the bundle2 code to understand how to fix it, but maybe we can figure
>> out a way to get marmoute an account on a machine that would help him
>> diagnose? Danek, do you have a solaris machine you could get him
>> access to for testing purposes?
>
> Yeah, I tried to set up the BSD machine on the gcc compile farm to debug
> this, but the machine has such ancient versions of everything else that I
> finally gave up (after recompiling my own python and a couple of
> dependencies). I have not had time to spend on this since that last
> attempt. Having an account on something showing the issue would help.
>
> On the other hand, this is probably not so bundle2 specific. We have
> some "select" logic to read stdout and stderr as soon as possible. This
> is the main suspect as it is possible that this logic behaves differently
> under linux and other unix (not too much effort has been put into it).
> So there is no need for deep knowledge of bundle2 to debug this if
> someone else wants to give it a shot.

As per Augie's request on IRC, let me be more specific: I would have a
look at the behavior of the logic in sshpeer.py that uses the "doublepipe"
class and the associated "util.poll".

It is responsible for reading the first stream that has data (of stderr
and stdout). This kind of change in output seems to imply that either
the server is flushing the streams in a different order or that the "ready
to read" detection is disabled or behaving differently.

>>> The fix would be pretty straightforward while figuring out the root cause
>>> (what race condition it is) needs time.
>>>
>>>> I can't think of any particularly satisfying solution to this, other than
>>>> perhaps separating the remote lines from the local lines, and comparing
>>>> each of those independently.  Would that make sense?
>>>>
>>>> Thanks,
>>>> Danek





--
Pierre-Yves David
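
For readers unfamiliar with that logic, the "read the first stream that has data" loop can be sketched as follows in Python 3 (POSIX only). This is a simplified model and not the doublepipe implementation: wait and read_main_forwarding_side are hypothetical names, raw file descriptors replace the pipe objects, and a list collects what would be printed as remote: output.

```python
import os
import select


def wait(main_fd, side_fd, timeout=None):
    """Report which of the two descriptors is ready to read."""
    ready, _, _ = select.select([main_fd, side_fd], [], [], timeout)
    return main_fd in ready, side_fd in ready


def read_main_forwarding_side(main_fd, side_fd, size, forwarded):
    """Read from main, forwarding side-channel data that shows up meanwhile."""
    while True:
        mainready, sideready = wait(main_fd, side_fd)
        if sideready:
            chunk = os.read(side_fd, 4096)
            if chunk:
                forwarded.append(chunk)
        if mainready:
            return os.read(main_fd, size)
```

If the operating system reports the main pipe as ready (or closed) before the side channel's data becomes visible, the side output is not forwarded by this loop and only surfaces later, which matches the reordered lines in the failing test.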


Re: stable ordering of test output

2017-03-07 Thread Pierre-Yves David



On 03/07/2017 05:49 PM, Augie Fackler wrote:
> On Fri, Mar 03, 2017 at 04:37:54PM -0800, Jun Wu wrote:
>> Excerpts from Danek Duvall's message of 2017-03-03 14:45:56 -0800:
>>> I frequently get failures like this:
>>>
>>> --- .../mercurial.hg/tests/test-bundle2-exchange.t
>>> +++ .../mercurial.hg/tests/test-bundle2-exchange.t.err
>>> @@ -1042,11 +1042,11 @@
>>>    $ hg --config devel.legacy.exchange=bundle1 clone ssh://user@dummy/bundle2onlyserver not-bundle2-ssh
>>>    requesting all changes
>>>    adding changesets
>>> -  remote: abort: incompatible Mercurial client; bundle2 required
>>> -  remote: (see https://www.mercurial-scm.org/wiki/IncompatibleClient)
>>>    transaction abort!
>>>    rollback completed
>>>    abort: stream ended unexpectedly (got 0 bytes, expected 4)
>>> +  remote: abort: incompatible Mercurial client; bundle2 required
>>> +  remote: (see https://www.mercurial-scm.org/wiki/IncompatibleClient)
>>>    [255]
>>>
>>>    $ cat > bundle2onlyserver/.hg/hgrc << EOF
>>>
>>> ERROR: test-bundle2-exchange.t output changed
>>>
>>> It's usually fairly consistent, at least for a period of time, and then it
>>> goes away.  Presumably it's some sort of fairly stable timing issue, and
>>> possibly unique to the environment I'm running in (at least, I assume that
>>> the official tests aren't showing this).
>>>
>>> I could patch the tests locally to reorder the lines, but if it's really an
>>> environmental issue, then that's not guaranteed to work consistently even
>>> for me.
>>>
>>> Is this (ever? frequently?) an issue for anyone else?
>>
>> Yes. We have seen this on our OSX tests. I guess it's "select()" returning
>> different things.
>
> It's also a problem on the FreeBSD buildbot. I don't know enough about
> the bundle2 code to understand how to fix it, but maybe we can figure
> out a way to get marmoute an account on a machine that would help him
> diagnose? Danek, do you have a solaris machine you could get him
> access to for testing purposes?

Yeah, I tried to set up the BSD machine on the gcc compile farm to debug
this, but the machine has such ancient versions of everything else that I
finally gave up (after recompiling my own python and a couple of
dependencies). I have not had time to spend on this since that last
attempt. Having an account on something showing the issue would help.

On the other hand, this is probably not so bundle2 specific. We have
some "select" logic to read stdout and stderr as soon as possible. This
is the main suspect as it is possible that this logic behaves differently
under linux and other unix (not too much effort has been put into it).
So there is no need for deep knowledge of bundle2 to debug this if
someone else wants to give it a shot.

>> The fix would be pretty straightforward while figuring out the root cause
>> (what race condition it is) needs time.
>>
>>> I can't think of any particularly satisfying solution to this, other than
>>> perhaps separating the remote lines from the local lines, and comparing
>>> each of those independently.  Would that make sense?
>>>
>>> Thanks,
>>> Danek



--
Pierre-Yves David


Re: stable ordering of test output

2017-03-07 Thread Augie Fackler
On Fri, Mar 03, 2017 at 04:37:54PM -0800, Jun Wu wrote:
> Excerpts from Danek Duvall's message of 2017-03-03 14:45:56 -0800:
> > I frequently get failures like this:
> >
> > --- .../mercurial.hg/tests/test-bundle2-exchange.t
> > +++ .../mercurial.hg/tests/test-bundle2-exchange.t.err
> > @@ -1042,11 +1042,11 @@
> >$ hg --config devel.legacy.exchange=bundle1 clone 
> > ssh://user@dummy/bundle2onlyserver not-bundle2-ssh
> >requesting all changes
> >adding changesets
> > -  remote: abort: incompatible Mercurial client; bundle2 required
> > -  remote: (see https://www.mercurial-scm.org/wiki/IncompatibleClient )
> >transaction abort!
> >rollback completed
> >abort: stream ended unexpectedly (got 0 bytes, expected 4)
> > +  remote: abort: incompatible Mercurial client; bundle2 required
> > +  remote: (see https://www.mercurial-scm.org/wiki/IncompatibleClient )
> >[255]
> >
> >$ cat > bundle2onlyserver/.hg/hgrc << EOF
> >
> > ERROR: test-bundle2-exchange.t output changed
> >
> > It's usually fairly consistent, at least for a period of time, and then it
> > goes away.  Presumably it's some sort of fairly stable timing issue, and
> > possibly unique to the environment I'm running in (at least, I assume that
> > the official tests aren't showing this).
> >
> > I could patch the tests locally to reorder the lines, but if it's really an
> > environmental issue, then that's not guaranteed to work consistently even
> > for me.
> >
> > Is this (ever? frequently?) an issue for anyone else?
>
> Yes. We have seen this on our OSX tests. I guess it's "select()" returning
> different things.

It's also a problem on the FreeBSD buildbot. I don't know enough about
the bundle2 code to understand how to fix it, but maybe we can figure
out a way to get marmoute an account on a machine that would help him
diagnose? Danek, do you have a solaris machine you could get him
access to for testing purposes?

>
> The fix would be pretty straightforward while figuring out the root cause
> (what race condition it is) needs time.
>
> > I can't think of any particularly satisfying solution to this, other than
> > perhaps separating the remote lines from the local lines, and comparing
> > each of those independently.  Would that make sense?
> >
> > Thanks,
> > Danek


Re: stable ordering of test output

2017-03-03 Thread Jun Wu
Excerpts from Danek Duvall's message of 2017-03-03 14:45:56 -0800:
> I frequently get failures like this:
> 
> --- .../mercurial.hg/tests/test-bundle2-exchange.t
> +++ .../mercurial.hg/tests/test-bundle2-exchange.t.err
> @@ -1042,11 +1042,11 @@
>$ hg --config devel.legacy.exchange=bundle1 clone 
> ssh://user@dummy/bundle2onlyserver not-bundle2-ssh
>requesting all changes
>adding changesets
> -  remote: abort: incompatible Mercurial client; bundle2 required
> -  remote: (see https://www.mercurial-scm.org/wiki/IncompatibleClient )
>transaction abort!
>rollback completed
>abort: stream ended unexpectedly (got 0 bytes, expected 4)
> +  remote: abort: incompatible Mercurial client; bundle2 required
> +  remote: (see https://www.mercurial-scm.org/wiki/IncompatibleClient )
>[255]
>  
>$ cat > bundle2onlyserver/.hg/hgrc << EOF
> 
> ERROR: test-bundle2-exchange.t output changed
> 
> It's usually fairly consistent, at least for a period of time, and then it
> goes away.  Presumably it's some sort of fairly stable timing issue, and
> possibly unique to the environment I'm running in (at least, I assume that
> the official tests aren't showing this).
> 
> I could patch the tests locally to reorder the lines, but if it's really an
> environmental issue, then that's not guaranteed to work consistently even
> for me.
> 
> Is this (ever? frequently?) an issue for anyone else?

Yes. We have seen this on our OSX tests. I guess it's "select()" returning
different things.

The fix would be pretty straightforward while figuring out the root cause
(what race condition it is) needs time.

> I can't think of any particularly satisfying solution to this, other than
> perhaps separating the remote lines from the local lines, and comparing
> each of those independently.  Would that make sense?
> 
> Thanks,
> Danek