RE: [PATCH bisected regression] input_available_p() sometimes says 'no' when it should say 'yes'

2015-05-05 Thread Chris Purvis
Apologies all.  Nic and I did not realise that an internal email had been 
cross-posted to a public mail list (let alone one with strict email formatting 
rules!), and were having a hard time understanding the context for the emails 
we were receiving.

I have a certain amount of experience of asynchronous communication and 
protocol design: we aren't novices in this area.  One would hope in the kind of 
intra-machine protocol that we're using here that, if we know the sender is 
halted, there should be a way of clearing the contents of the channel so that 
the receiver can get hold of whatever has been put into it.
Can tcflush() (or some similar API) be used to resolve our debugging scenario?

Regards,
Chris
-- 
Chris Purvis
Senior Development Manager 

Micro Focus

chris.pur...@microfocus.com
The Lawn, 22-30 Old Bath Road
Newbury, Berkshire, RG14 1QN, UK
Direct:+44 1635 565282



This message has been scanned for malware by Websense. www.websense.com


RE: [PATCH bisected regression] input_available_p() sometimes says 'no' when it should say 'yes'

2015-05-05 Thread Chris Purvis
Peter,

So is calling tcflush() a solution here?

Regards,
Chris
-- 
Chris Purvis
Senior Development Manager 

Micro Focus

chris.pur...@microfocus.com
The Lawn, 22-30 Old Bath Road
Newbury, Berkshire, RG14 1QN, UK
Direct:+44 1635 565282


-----Original Message-----
From: Peter Hurley [mailto:pe...@hurleysoftware.com] 
Sent: 05 May 2015 14:29
To: Nic Percival; Michael Matz; Kevin Fletcher; Paul Matthews; Chris Purvis
Cc: NeilBrown; Greg Kroah-Hartman; Jiri Slaby; linux-kernel@vger.kernel.org
Subject: Re: [PATCH bisected regression] input_available_p() sometimes says 
'no' when it should say 'yes'

Stop top-posting.

On 05/05/2015 08:03 AM, Nic Percival wrote:
> 
> There is only ever one debuggee process.
> My original demo (and indeed the original test failure) is not threaded. The 
> debugger is multi-threaded.
> 
> I've brought in Chris, Fletch and Paul, my immediate colleagues, into the 
> discussion.
> 
> The email thread is getting a little tangled, however, from my standpoint I 
> have..
> 
> 1) poll tells us we have nothing to read on a pty, when we know something was 
> written into the other end.

You're using a synchronization mechanism (ptrace) to validate an
asynchronous process (tty i/o). That's not going to work.

> 2) Given that 'poll' is not telling us that data has been written into the 
> pty, what can we use? Surely that is what poll is for.

poll() doesn't tell you that nothing has been written.

You're inferring that using a broken understanding of terminal i/o:
ttys are not synchronous pipes.

> 3) If a debuggee program has displayed 'how old are you?' and then hit a 
> breakpoint on the 'ACCEPT' response, then the question might very well not be 
> displayed, despite the debugger  sitting on the statement some way subsequent 
> to the display. 

Let's extend your logic process here to a general-purpose debugger
that can control all output devices.

1. The debugger and debuggee are running on X-Windows.
2. The debuggee outputs 'how old are you?'
3. The debugger immediately halts the debuggee and all output devices.

The output will not appear on the monitor because X-Windows
output is asynchronous. So is terminal i/o.

 
> 4) If I understand correctly, the modification is a performance enhancement. 
> Obviously in the case of 'ptrace' debugging, performance is not a requirement.

Nothing obvious about it. Not all uses of ptrace are interactive, and we
certainly don't want alternate behavior based on whether the process is ptraced.

> 5) Given 'xterm' use pty's, could a scenario happen where a user is prompted 
> 'How old are you?' in the xterm, but an input (getchar, whatever) is hit 
> before that output is displayed? With or without ptrace?

Of course. It's called typeahead. Since tty i/o is buffered, the following is 
possible:

1. The user types '15\r'
2. The process writes 'How old are you?'
3. The process reads '15\n'

Processes that don't want typeahead call tcflush() before reading input.

Regards,
Peter Hurley

> Thanks,
> Nic
> 
> 
> 
> -----Original Message-----
> From: Peter Hurley [mailto:pe...@hurleysoftware.com] 
> Sent: 05 May 2015 12:19
> To: Nic Percival; Michael Matz
> Cc: NeilBrown; Greg Kroah-Hartman; Jiri Slaby; linux-kernel@vger.kernel.org
> Subject: Re: [PATCH bisected regression] input_available_p() sometimes says 
> 'no' when it should say 'yes'
> 
> 
> A: No.
> Q: Should I include quotations after my reply?
> 
> http://daringfireball.net/2007/07/on_top
> 
> 
> On 05/05/2015 04:20 AM, Nic Percival wrote:
>> Michael is correct.
>> Our COBOL debugger has a test feature whereby we can drive it to step 
>> through debugging code, hitting breakpoints and so on.
>> The debugger maintains a 'user screen' which is what the 'debuggee' process 
>> has displayed.
>> This is communicated to the debugger with pseudo-tty's.
>> The state of this user screen is checked as part of this (and other) tests.
> 
> So the debugger doesn't display output from other non-TRACEME threads or 
> child processes of the debuggee, right?
> 
> When that's fixed, you'll see that the "test failure" has gone away.
> 
>> The actual test failure is a failure of some text to be displayed on the 
>> debuggee user screen when we know, given it has hit a certain breakpoint, 
>> that the text has been written.
>>
>> What is worse is it's non-deterministic.
> 
> That your test is non-deterministic stems from the fact that the i/o is 
> asynchronous.
> 
> You would experience the same problem if your test setup was a tty in 
> loopback.
> 
>> Sometimes the text makes it and is displayed, so it wouldn't even be 
>> practical to modify the test to make it pass.
>> We wouldn't really want to do that anyway - the test is just fine on ot

RE: [PATCH bisected regression] input_available_p() sometimes says 'no' when it should say 'yes'

2015-05-05 Thread Chris Purvis
A way of giving context.




Re: [PATCH bisected regression] input_available_p() sometimes says 'no' when it should say 'yes'

2015-05-05 Thread Peter Hurley
On 05/05/2015 09:34 AM, Chris Purvis wrote:
> Peter,
> 
> So is calling tcflush() a solution here?
> 
> Regards,
> Chris

What is with the top-posting?


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH bisected regression] input_available_p() sometimes says 'no' when it should say 'yes'

2015-05-05 Thread Peter Hurley
Stop top-posting.

On 05/05/2015 08:03 AM, Nic Percival wrote:
> 
> There is only ever one debuggee process.
> My original demo (and indeed the original test failure) is not threaded. The 
> debugger is multi-threaded.
> 
> I've brought in Chris, Fletch and Paul, my immediate colleagues, into the 
> discussion.
> 
> The email thread is getting a little tangled, however, from my standpoint I 
> have..
> 
> 1) poll tells us we have nothing to read on a pty, when we know something was 
> written into the other end.

You're using a synchronization mechanism (ptrace) to validate an
asynchronous process (tty i/o). That's not going to work.

> 2) Given that 'poll' is not telling us that data has been written into the 
> pty, what can we use? Surely that is what poll is for.

poll() doesn't tell you that nothing has been written.

You're inferring that using a broken understanding of terminal i/o:
ttys are not synchronous pipes.
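
To make the point above concrete, here is a minimal sketch (not the debugger's actual code) of a reader that waits for pty output with a timeout instead of treating a single poll() result as proof that nothing was written. The helper name read_when_ready and the one-second timeout are assumptions chosen for illustration only.

```python
import os
import pty
import select

def read_when_ready(fd, timeout_ms):
    """Wait up to timeout_ms for fd to become readable, then read once."""
    poller = select.poll()
    poller.register(fd, select.POLLIN)
    if not poller.poll(timeout_ms):
        return b""  # nothing arrived within the timeout
    return os.read(fd, 4096)

# The slave end stands in for the debuggee, the master end for the debugger.
master_fd, slave_fd = pty.openpty()
os.write(slave_fd, b"how old are you?")

# Delivery to the master side is asynchronous, so wait rather than sample once.
screen = read_when_ready(master_fd, 1000)

os.close(slave_fd)
os.close(master_fd)
```

A timeout only bounds how long the reader waits. It does not turn tty i/o into a synchronous channel.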

> 3) If a debuggee program has displayed 'how old are you?' and then hit a 
> breakpoint on the 'ACCEPT' response, then the question might very well not be 
> displayed, despite the debugger  sitting on the statement some way subsequent 
> to the display. 

Let's extend your logic process here to a general-purpose debugger
that can control all output devices.

1. The debugger and debuggee are running on X-Windows.
2. The debuggee outputs 'how old are you?'
3. The debugger immediately halts the debuggee and all output devices.

The output will not appear on the monitor because X-Windows
output is asynchronous. So is terminal i/o.

 
> 4) If I understand correctly, the modification is a performance enhancement. 
> Obviously in the case of 'ptrace' debugging, performance is not a requirement.

Nothing obvious about it. Not all uses of ptrace are interactive, and we
certainly don't want alternate behavior based on whether the process is ptraced.

> 5) Given 'xterm' use pty's, could a scenario happen where a user is prompted 
> 'How old are you?' in the xterm, but an input (getchar, whatever) is hit 
> before that output is displayed? With or without ptrace?

Of course. It's called typeahead. Since tty i/o is buffered, the following is 
possible:

1. The user types '15\r'
2. The process writes 'How old are you?'
3. The process reads '15\n'

Processes that don't want typeahead call tcflush() before reading input.
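
As a rough illustration of the sentence above, the sketch below uses a pty pair to show tcflush() with TCIFLUSH discarding input that was typed before the prompt. Note that it throws the pending input away; it does not deliver it to the reader. The helper wait_readable and the variable names are assumptions for this example, not anything from the code under discussion.

```python
import os
import pty
import select
import termios

def wait_readable(fd, timeout_ms):
    """Return True if fd reports readable input within timeout_ms."""
    poller = select.poll()
    poller.register(fd, select.POLLIN)
    return bool(poller.poll(timeout_ms))

# The master end plays the terminal (user), the slave end plays the process.
master_fd, slave_fd = pty.openpty()

# The "user" types ahead before any prompt has been written.
os.write(master_fd, b"15\n")
had_typeahead = wait_readable(slave_fd, 1000)

# Discard received-but-unread input, then prompt.
termios.tcflush(slave_fd, termios.TCIFLUSH)
os.write(slave_fd, b"How old are you? ")
still_pending = wait_readable(slave_fd, 0)

os.close(slave_fd)
os.close(master_fd)
```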

Regards,
Peter Hurley

> Thanks,
> Nic
> 
> 
> 
> -----Original Message-----
> From: Peter Hurley [mailto:pe...@hurleysoftware.com] 
> Sent: 05 May 2015 12:19
> To: Nic Percival; Michael Matz
> Cc: NeilBrown; Greg Kroah-Hartman; Jiri Slaby; linux-kernel@vger.kernel.org
> Subject: Re: [PATCH bisected regression] input_available_p() sometimes says 
> 'no' when it should say 'yes'
> 
> 
> A: No.
> Q: Should I include quotations after my reply?
> 
> http://daringfireball.net/2007/07/on_top
> 
> 
> On 05/05/2015 04:20 AM, Nic Percival wrote:
>> Michael is correct.
>> Our COBOL debugger has a test feature whereby we can drive it to step 
>> through debugging code, hitting breakpoints and so on.
>> The debugger maintains a 'user screen' which is what the 'debuggee' process 
>> has displayed.
>> This is communicated to the debugger with pseudo-tty's.
>> The state of this user screen is checked as part of this (and other) tests.
> 
> So the debugger doesn't display output from other non-TRACEME threads or 
> child processes of the debuggee, right?
> 
> When that's fixed, you'll see that the "test failure" has gone away.
> 
>> The actual test failure is a failure of some text to be displayed on the 
>> debuggee user screen when we know, given it has hit a certain breakpoint, 
>> that the text has been written.
>>
>> What is worse is it's non-deterministic.
> 
> That your test is non-deterministic stems from the fact that the i/o is 
> asynchronous.
> 
> You would experience the same problem if your test setup was a tty in 
> loopback.
> 
>> Sometimes the text makes it and is displayed, so it wouldn't even be 
>> practical to modify the test to make it pass.
>> We wouldn't really want to do that anyway - the test is just fine on other 
>> earlier SUSE, on RedHat (intel and 390), HP/UX, AIX and Solaris.
> 
> There is a reason Linux is the platform of choice for scalability.
> 
> Regards,
> Peter Hurley
> 
>> -----Original Message-----
>> From: Michael Matz [mailto:m...@suse.de]
>> Sent: 04 May 2015 13:24
>> To: Peter Hurley
>> Cc: NeilBrown; Nic Percival; Greg Kroah-Hartman; Jiri Slaby; 
>> linux-kernel@vger.kernel.org
>> Subject: Re: [PATCH bisected regression] input_available_p() sometimes says 
>> 'no' when it should say 'yes'
>>
>

RE: [PATCH bisected regression] input_available_p() sometimes says 'no' when it should say 'yes'

2015-05-05 Thread Nic Percival

There is only ever one debuggee process.
My original demo (and indeed the original test failure) is not threaded. The 
debugger is multi-threaded.

I've brought in Chris, Fletch and Paul, my immediate colleagues, into the 
discussion.

The email thread is getting a little tangled, however, from my standpoint I 
have..

1) poll tells us we have nothing to read on a pty, when we know something was 
written into the other end.

2) Given that 'poll' is not telling us that data has been written into the pty, 
what can we use? Surely that is what poll is for.

3) If a debuggee program has displayed 'how old are you?' and then hit a 
breakpoint on the 'ACCEPT' response, then the question might very well not be 
displayed, despite the debugger  sitting on the statement some way subsequent 
to the display. 

4) If I understand correctly, the modification is a performance enhancement. 
Obviously in the case of 'ptrace' debugging, performance is not a requirement.

5) Given 'xterm' use pty's, could a scenario happen where a user is prompted 
'How old are you?' in the xterm, but an input (getchar, whatever) is hit before 
that output is displayed? With or without ptrace?

Thanks,
Nic



-----Original Message-----
From: Peter Hurley [mailto:pe...@hurleysoftware.com] 
Sent: 05 May 2015 12:19
To: Nic Percival; Michael Matz
Cc: NeilBrown; Greg Kroah-Hartman; Jiri Slaby; linux-kernel@vger.kernel.org
Subject: Re: [PATCH bisected regression] input_available_p() sometimes says 
'no' when it should say 'yes'


A: No.
Q: Should I include quotations after my reply?

http://daringfireball.net/2007/07/on_top


On 05/05/2015 04:20 AM, Nic Percival wrote:
> Michael is correct.
> Our COBOL debugger has a test feature whereby we can drive it to step through 
> debugging code, hitting breakpoints and so on.
> The debugger maintains a 'user screen' which is what the 'debuggee' process 
> has displayed.
> This is communicated to the debugger with pseudo-tty's.
> The state of this user screen is checked as part of this (and other) tests.

So the debugger doesn't display output from other non-TRACEME threads or child 
processes of the debuggee, right?

When that's fixed, you'll see that the "test failure" has gone away.

> The actual test failure is a failure of some text to be displayed on the 
> debuggee user screen when we know, given it has hit a certain breakpoint, 
> that the text has been written.
> 
> What is worse is it's non-deterministic.

That your test is non-deterministic stems from the fact that the i/o is 
asynchronous.

You would experience the same problem if your test setup was a tty in loopback.

> Sometimes the text makes it and is displayed, so it wouldn't even be 
> practical to modify the test to make it pass.
> We wouldn't really want to do that anyway - the test is just fine on other 
> earlier SUSE, on RedHat (intel and 390), HP/UX, AIX and Solaris.

There is a reason Linux is the platform of choice for scalability.

Regards,
Peter Hurley

> -----Original Message-----
> From: Michael Matz [mailto:m...@suse.de]
> Sent: 04 May 2015 13:24
> To: Peter Hurley
> Cc: NeilBrown; Nic Percival; Greg Kroah-Hartman; Jiri Slaby; 
> linux-kernel@vger.kernel.org
> Subject: Re: [PATCH bisected regression] input_available_p() sometimes says 
> 'no' when it should say 'yes'
> 
> Hi,
> 
> On Fri, 1 May 2015, Peter Hurley wrote:
> 
>> I don't think this a real bug, in the sense that pty i/o is not 
>> synchronous, in the same way that tty i/o is not synchronous.
> 
> Here's what I wrote internally about my speculations about this being a bug 
> or not:
> 
>>> I also never hit it with pipes (remove the USEPTY define), also not 
>>> on sle12, so it must be some change specific to the pty implementation.
>>>
>>> Now, all of this is of course unspecified.  There are two 
>>> asynchronous processes involved, and a buffered tube between them.
>>> Just because one process filled one end of the tube (the breakpoint 
>>> was hit) doesn't mean the contents have to appear at that instant at 
>>> the other end.  So the change in behaviour in sle12 is not a genuine 
>>> bug.  It _might_ be an unintended change, though, that's why kernel 
>>> people should comment on this.  If there are no terribly good 
>>> reasons for this change I'd consider it a quality-of-implementation 
>>> regression in sle12.
> 
> So, I'd accept this being declared a non-bug, but it is certainly a change in 
> behaviour that's visible for our debugger team.
> 
>> However, that said, if this is a regression (regression as in "it 
>> broke something that used to work", not regression as in "this new 
>> thing I'm writing doesn't behave the way I want it to" :) )
>>
>> Help me u

Re: [PATCH bisected regression] input_available_p() sometimes says 'no' when it should say 'yes'

2015-05-05 Thread Peter Hurley

A: No.
Q: Should I include quotations after my reply?

http://daringfireball.net/2007/07/on_top


On 05/05/2015 04:20 AM, Nic Percival wrote:
> Michael is correct.
> Our COBOL debugger has a test feature whereby we can drive it to step through 
> debugging code, hitting breakpoints and so on.
> The debugger maintains a 'user screen' which is what the 'debuggee' process 
> has displayed.
> This is communicated to the debugger with pseudo-tty's.
> The state of this user screen is checked as part of this (and other) tests.

So the debugger doesn't display output from other non-TRACEME threads or
child processes of the debuggee, right?

When that's fixed, you'll see that the "test failure" has gone away.

> The actual test failure is a failure of some text to be displayed on the 
> debuggee user screen when we know, given it has hit a certain breakpoint, 
> that the text has been written.
> 
> What is worse is it's non-deterministic.

That your test is non-deterministic stems from the fact that
the i/o is asynchronous.

You would experience the same problem if your test setup
was a tty in loopback.

> Sometimes the text makes it and is displayed, so it wouldn't even be 
> practical to modify the test to make it pass.
> We wouldn't really want to do that anyway - the test is just fine on other 
> earlier SUSE, on RedHat (intel and 390), HP/UX, AIX and Solaris.

There is a reason Linux is the platform of choice for scalability.

Regards,
Peter Hurley

> -----Original Message-----
> From: Michael Matz [mailto:m...@suse.de] 
> Sent: 04 May 2015 13:24
> To: Peter Hurley
> Cc: NeilBrown; Nic Percival; Greg Kroah-Hartman; Jiri Slaby; 
> linux-kernel@vger.kernel.org
> Subject: Re: [PATCH bisected regression] input_available_p() sometimes says 
> 'no' when it should say 'yes'
> 
> Hi,
> 
> On Fri, 1 May 2015, Peter Hurley wrote:
> 
>> I don't think this a real bug, in the sense that pty i/o is not 
>> synchronous, in the same way that tty i/o is not synchronous.
> 
> Here's what I wrote internally about my speculations about this being a bug 
> or not:
> 
>>> I also never hit it with pipes (remove the USEPTY define), also not 
>>> on sle12, so it must be some change specific to the pty implementation.
>>>
>>> Now, all of this is of course unspecified.  There are two 
>>> asynchronous processes involved, and a buffered tube between them.  
>>> Just because one process filled one end of the tube (the breakpoint 
>>> was hit) doesn't mean the contents have to appear at that instant at 
>>> the other end.  So the change in behaviour in sle12 is not a genuine 
>>> bug.  It _might_ be an unintended change, though, that's why kernel 
>>> people should comment on this.  If there are no terribly good 
>>> reasons for this change I'd consider it a quality-of-implementation 
>>> regression in sle12.
> 
> So, I'd accept this being declared a non-bug, but it is certainly a change in 
> behaviour that's visible for our debugger team.
> 
>> However, that said, if this is a regression (regression as in "it 
>> broke something that used to work", not regression as in "this new 
>> thing I'm writing doesn't behave the way I want it to" :) )
>>
>> Help me understand the use-case here: are you using pty i/o to debug 
>> the debugger?
> 
> Nic is working on the Cobol debugger, but I think this pty i/o is rather a 
> part of the normal interaction between a debugged Cobol process and the 
> debugger; that's just a theory, Nic is authoritative here.  But this change in 
> behaviour _did_ result in real testsuite regressions, so it's not something 
> that he wanted to write from scratch.
> 
> (FWIW: I do think it's a better QoI factor if something returns data from a 
> tube if we can know via side channels (break points) that something must have 
> been written locally to the other end of the tube, if that can be ensured 
> without too much other work)
> 
> 
> Ciao,
> Michael.
> 
> 
> This message has been scanned for malware by Websense. www.websense.com
> 



RE: [PATCH bisected regression] input_available_p() sometimes says 'no' when it should say 'yes'

2015-05-05 Thread Nic Percival
Michael is correct.
Our COBOL debugger has a test feature whereby we can drive it to step through 
debugging code, hitting breakpoints and so on.
The debugger maintains a 'user screen' which is what the 'debuggee' process has 
displayed.
This is communicated to the debugger with pseudo-tty's.
The state of this user screen is checked as part of this (and other) tests.

The actual test failure is a failure of some text to be displayed on the 
debuggee user screen when we know, given it has hit a certain breakpoint, that 
the text has been written.

What is worse is it's non-deterministic. Sometimes the text makes it and is 
displayed, so it wouldn't even be practical to modify the test to make it pass.
We wouldn't really want to do that anyway - the test is just fine on other 
earlier SUSE, on RedHat (intel and 390), HP/UX, AIX and Solaris.

Thanks,
Nic

-----Original Message-----
From: Michael Matz [mailto:m...@suse.de] 
Sent: 04 May 2015 13:24
To: Peter Hurley
Cc: NeilBrown; Nic Percival; Greg Kroah-Hartman; Jiri Slaby; 
linux-kernel@vger.kernel.org
Subject: Re: [PATCH bisected regression] input_available_p() sometimes says 
'no' when it should say 'yes'

Hi,

On Fri, 1 May 2015, Peter Hurley wrote:

> I don't think this a real bug, in the sense that pty i/o is not 
> synchronous, in the same way that tty i/o is not synchronous.

Here's what I wrote internally about my speculations about this being a bug or 
not:

> > I also never hit it with pipes (remove the USEPTY define), also not 
> > on sle12, so it must be some change specific to the pty implementation.
> > 
> > Now, all of this is of course unspecified.  There are two 
> > asynchronous processes involved, and a buffered tube between them.  
> > Just because one process filled one end of the tube (the breakpoint 
> > was hit) doesn't mean the contents have to appear at that instant at 
> > the other end.  So the change in behaviour in sle12 is not a genuine 
> > bug.  It _might_ be an unintended change, though, that's why kernel 
> > people should comment on this.  If there are no terribly good 
> > reasons for this change I'd consider it a quality-of-implementation 
> > regression in sle12.

So, I'd accept this being declared a non-bug, but it is certainly a change in 
behaviour that's visible for our debugger team.
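
For contrast with the pty case, the pipe behaviour mentioned above can be sketched as follows: once write() on a pipe has returned, the data sits in the pipe buffer, so a zero-timeout poll on the read end reports it. This is only an illustration of the pipe side of the comparison; the helper name has_input is an assumption and is not taken from the test program.

```python
import os
import select

def has_input(fd):
    """Zero-timeout poll: True if fd has data ready to read right now."""
    poller = select.poll()
    poller.register(fd, select.POLLIN)
    return any(events & select.POLLIN for _, events in poller.poll(0))

read_fd, write_fd = os.pipe()
before = has_input(read_fd)   # nothing has been written yet
os.write(write_fd, b"x")
after = has_input(read_fd)    # the byte is already in the pipe buffer

os.close(write_fd)
os.close(read_fd)
```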

> However, that said, if this is a regression (regression as in "it 
> broke something that used to work", not regression as in "this new 
> thing I'm writing doesn't behave the way I want it to" :) )
> 
> Help me understand the use-case here: are you using pty i/o to debug 
> the debugger?

Nic is working on the Cobol debugger, but I think this pty i/o is rather a part 
of the normal interaction between a debugged Cobol process and the debugger; 
that's just a theory, Nic is authoritative here.  But this change in behaviour 
_did_ result in real testsuite regressions, so it's not something that he 
wanted to write from scratch.

(FWIW: I do think it's a better QoI factor if something returns data from a 
tube if we can know via side channels (break points) that something must have 
been written locally to the other end of the tube, if that can be ensured 
without too much other work)


Ciao,
Michael.




RE: [PATCH bisected regression] input_available_p() sometimes says 'no' when it should say 'yes'

2015-05-05 Thread Chris Purvis
Peter,

So is calling tcflush() a solution here?

Regards,
Chris
-- 
Chris Purvis
Senior Development Manager 

Micro Focus

chris.pur...@microfocus.com
The Lawn, 22-30 Old Bath Road
Newbury, Berkshire, RG14 1QN, UK
Direct:+44 1635 565282


-Original Message-
From: Peter Hurley [mailto:pe...@hurleysoftware.com] 
Sent: 05 May 2015 14:29
To: Nic Percival; Michael Matz; Kevin Fletcher; Paul Matthews; Chris Purvis
Cc: NeilBrown; Greg Kroah-Hartman; Jiri Slaby; linux-kernel@vger.kernel.org
Subject: Re: [PATCH bisected regression] input_available_p() sometimes says 
'no' when it should say 'yes'

Stop top-posting.

On 05/05/2015 08:03 AM, Nic Percival wrote:
 
> There is only ever one debuggee process.
> My original demo (and indeed the original test failure) is not threaded. The 
> debugger is multi-threaded.
> 
> I've brought in Chris, Fletch and Paul, my immediate colleagues, into the 
> discussion.
> 
> The email thread is getting a little tangled, however, from my standpoint I 
> have..
> 
> 1) poll tells us we have nothing to read on a pty, when we know something was 
> written into the other end.

You're using a synchronization mechanism (ptrace) to validate an
asynchronous process (tty i/o). That's not going to work.

> 2) Given that 'poll' is not telling us that data has been written into the 
> pty, what can we use? Surely that is what poll is for.

poll() doesn't tell you that nothing has been written.

You're inferring that using a broken understanding of terminal i/o:
ttys are not synchronous pipes.

> 3) If a debuggee program has displayed 'how old are you?' and then hit a 
> breakpoint on the 'ACCEPT' response, then the question might very well not be 
> displayed, despite the debugger  sitting on the statement some way subsequent 
> to the display. 

Let's extend your logic process here to a general-purpose debugger
that can control all output devices.

1. The debugger and debuggee are running on X-Windows.
2. The debuggee outputs 'how old are you?'
3. The debugger immediately halts the debuggee and all output devices.

The output will not appear on the monitor because X-Windows
output is asynchronous. So is terminal i/o.

 
> 4) If I understand correctly, the modification is a performance enhancement. 
> Obviously in the case of 'ptrace' debugging, performance is not a requirement.

Nothing obvious about it. Not all uses of ptrace are interactive, and we
certainly don't want alternate behavior based on whether the process is
ptraced.

> 5) Given 'xterm' use pty's, could a scenario happen where a user is prompted 
> 'How old are you?' in the xterm, but an input (getchar, whatever) is hit 
> before that output is displayed? With or without ptrace?

Of course. It's called typeahead. Since tty i/o is buffered, the following is 
possible:

1. The user types '15\r'
2. The process writes 'How old are you?'
3. The process reads '15\n'

Processes that don't want typeahead call tcflush() before reading input.
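
As an illustration of the tcflush() idiom mentioned above, here is a minimal
sketch. The function name prompt_without_typeahead() and its parameters are
made up for this example; it assumes tty_fd is an open terminal. Note that
TCIFLUSH discards the pending input; it does not hand it to the reader.

```c
#include <string.h>
#include <termios.h>
#include <unistd.h>

/* Hypothetical helper: throw away input queued on tty_fd, then print a
 * prompt and read the reply. Returns the number of bytes read, or -1 on
 * error (for example when tty_fd is not a terminal). */
ssize_t prompt_without_typeahead(int tty_fd, const char *prompt,
				 char *reply, size_t cap)
{
	if (tcflush(tty_fd, TCIFLUSH) < 0)	/* discard unread input */
		return -1;
	if (write(tty_fd, prompt, strlen(prompt)) < 0)
		return -1;
	return read(tty_fd, reply, cap);
}
```

On anything that is not a terminal (a pipe, a regular file) the tcflush()
call fails, so the helper returns -1 without prompting.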

Regards,
Peter Hurley

> Thanks,
> Nic
 
 
 

RE: [PATCH bisected regression] input_available_p() sometimes says 'no' when it should say 'yes'

2015-05-05 Thread Nic Percival

There is only ever one debuggee process.
My original demo (and indeed the original test failure) is not threaded. The 
debugger is multi-threaded.

I've brought in Chris, Fletch and Paul, my immediate colleagues, into the 
discussion.

The email thread is getting a little tangled, however, from my standpoint I 
have..

1) poll tells us we have nothing to read on a pty, when we know something was 
written into the other end.

2) Given that 'poll' is not telling us that data has been written into the pty, 
what can we use? Surely that is what poll is for.

3) If a debuggee program has displayed 'how old are you?' and then hit a 
breakpoint on the 'ACCEPT' response, then the question might very well not be 
displayed, despite the debugger  sitting on the statement some way subsequent 
to the display. 

4) If I understand correctly, the modification is a performance enhancement. 
Obviously in the case of 'ptrace' debugging, performance is not a requirement.

5) Given 'xterm' use pty's, could a scenario happen where a user is prompted 
'How old are you?' in the xterm, but an input (getchar, whatever) is hit before 
that output is displayed? With or without ptrace?

Thanks,
Nic




RE: [PATCH bisected regression] input_available_p() sometimes says 'no' when it should say 'yes'

2015-05-05 Thread Nic Percival
Michael is correct.
Our COBOL debugger has a test feature whereby we can drive it to step through 
debugging code, hitting breakpoints and so on.
The debugger maintains a 'user screen' which is what the 'debuggee' process has 
displayed.
This is communicated to the debugger with pseudo-tty's.
The state of this user screen is checked as part of this (and other) tests.

The actual test failure is a failure of some text to be displayed on the 
debuggee user screen when we know, given it has hit a certain breakpoint, that 
the text has been written.

What is worse, it's non-deterministic. Sometimes the text makes it and is 
displayed, so it wouldn't even be practical to modify the test to make it pass.
We wouldn't really want to do that anyway - the test is just fine on other 
earlier SUSE, on RedHat (intel and 390), HP/UX, AIX and Solaris.

Thanks,
Nic



Re: [PATCH bisected regression] input_available_p() sometimes says 'no' when it should say 'yes'

2015-05-05 Thread Peter Hurley

A: No.
Q: Should I include quotations after my reply?

http://daringfireball.net/2007/07/on_top


On 05/05/2015 04:20 AM, Nic Percival wrote:
> Michael is correct.
> Our COBOL debugger has a test feature whereby we can drive it to step through 
> debugging code, hitting breakpoints and so on.
> The debugger maintains a 'user screen' which is what the 'debuggee' process 
> has displayed.
> This is communicated to the debugger with pseudo-tty's.
> The state of this user screen is checked as part of this (and other) tests.

So the debugger doesn't display output from other non-TRACEME threads or
child processes of the debuggee, right?

When that's fixed, you'll see that the test failure has gone away.

> The actual test failure is a failure of some text to be displayed on the 
> debuggee user screen when we know, given it has hit a certain breakpoint, 
> that the text has been written.
> 
> What is worse is its non-deterministic.

That your test is non-deterministic stems from the fact that
the i/o is asynchronous.

You would experience the same problem if your test setup
was a tty in loopback.

> Sometimes the text makes it and is displayed, so it wouldn't even be 
> practical to modify the test to make it pass.
> We wouldn't really want to do that anyway - the test is just fine on other 
> earlier SUSE, on RedHat (intel and 390), HP/UX, AIX and Solaris.

There is a reason Linux is the platform of choice for scalability.

Regards,
Peter Hurley



Re: [PATCH bisected regression] input_available_p() sometimes says 'no' when it should say 'yes'

2015-05-04 Thread Peter Hurley
On 05/04/2015 12:56 PM, Michael Matz wrote:
> Hi,
> 
> On Mon, 4 May 2015, Peter Hurley wrote:
> 
>> I think it would be a shame if ptrace() usage forced a whole class of 
>> i/o to be synchronous.
> 
> I think Neils patch doesn't do that, does it?

Yes. Non-blocking i/o would now have to wait until the input worker has
completed (at a time when the input worker may not even be running on a
cpu yet).

>  If it has an indication 
> that in fact some data must be there but isn't it pulls it, otherwise it's 
> the same as after your patch/current state?

At the point when the write() returns from the child process, the input
worker may not even have run yet. The patch will now force the reader to
wait until the input worker has started and run to completion.

Moreover, prior to 4.1, the read i/o loop is sloppy wrt kicking the
input worker, so there is every likelihood that this patch would
force extra waits on a non-blocking reader even though no input was actually
being written.

> (Leaving the debugger question to Nic; but I guess it's similar interface 
> like gdb.  Once you come back into the debugger (breakpoint hit) it looks 
> once for input on the tubes of the debuggee and then enters a prompt; it 
> doesn't continue looking for input until you continue the debuggee (1).

That's exactly what gdb does. Below is a simple test jig [2] where the child
outputs while at the gdb prompt.

  
> ptys would be used because it's Cobol, the programs contain data entry 
> masks presumably needing a real tty, not just pipes.  That usecase would 
> be broken now; the tty provided by the debugger doesn't reflect the real 
> screen that the debuggee actually generated before the breakpoint.  Note 
> how pipes in my test program _are_ synchronous in this sense, just ptys 
> aren't.)

What's interesting about that expectation is that it would never work
on an actual tty.


> (1) And in a single threaded debugger (no matter if the debuggee is 
> multithreaded) it would be awkward to implement.  After read returns 0 
> you'd again call poll, which indicates data is there, and read again.  
> You repeat that until $SOMEWHEN.  But when is it enough?  Two loops, 10, 
> 1000?  To not sit in a tight loop you'd also add some nanosleeps, but that 
> would add unnecessary lags.

That's not what's happening.

poll() with a timeout of 0 returns immediately, even if no file descriptors
are ready for i/o. The poll() is returning 0 because there is no data to read,
so the loop is exiting.

If poll() returned non-zero and read() returned no data, that would
definitely be a bug.
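
To make the zero-timeout behavior concrete, a minimal sketch (the helper
name input_ready_now() is invented for this example): poll() is asked about
a single descriptor with a timeout of 0, so it reports the state at that
instant and never waits.

```c
#include <poll.h>

/* Hypothetical helper: return 1 if fd has data ready to read right now,
 * 0 if it does not, -1 on error. */
int input_ready_now(int fd)
{
	struct pollfd pfd = { .fd = fd, .events = POLLIN };
	int n = poll(&pfd, 1, 0);	/* timeout 0: return immediately */

	if (n < 0)
		return -1;
	return (n == 1 && (pfd.revents & POLLIN)) ? 1 : 0;
}
```

A return of 0 only says that nothing was readable at the moment of the call;
it says nothing about what a writer on the other end has already done.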

> Basically, whenever poll indicates that read won't block then it should 
> also return some data, not 0, if at all reasonably implementable; i.e. 
> some progress should be guaranteed.

ttys have a further refinement of the behavior of read() and poll(), which
is controlled by the VMIN and VTIME values in the termios structure.

But note: the pty master does not allow its termios to be programmed --
a pty master read() is always non-blocking.
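
For reference, a minimal sketch of where VMIN and VTIME live. The helper
set_read_thresholds() is a made-up name; it only edits a struct termios in
memory and assumes the caller obtains and applies it with tcgetattr() and
tcsetattr() on a descriptor that accepts termios settings.

```c
#include <termios.h>

/* Hypothetical helper: select non-canonical input and set the two read
 * thresholds. vmin is a byte count, vtime is in tenths of a second. */
void set_read_thresholds(struct termios *t, cc_t vmin, cc_t vtime)
{
	t->c_lflag &= ~ICANON;	/* VMIN/VTIME apply in non-canonical mode */
	t->c_cc[VMIN] = vmin;
	t->c_cc[VTIME] = vtime;
}
```

Typical use would be tcgetattr(fd, &t), set_read_thresholds(&t, 1, 0), then
tcsetattr(fd, TCSANOW, &t). How the combinations of zero and non-zero values
behave is described in termios(3).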

>  I realize that this isn't always the 
> case, but here it is.  In code, this loop:
> 
>   while (poll ([fd, POLLIN], 0) == 1)
>     // So, read won't block, yippie
>     if (read (fd, ...) == 0)
>       continue;
> 
> shouldn't become a tight loop, without the read making progress but the 
> kernel continuously stating "yep, there's data available", until some 
> random point in the future.

Yeah, that would be broken but, again, that's not what's happening here.

Regards,
Peter Hurley


--- >% ---
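#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Trap into the debugger (x86 only). */
#define BRKPT  asm("int $3")

/* Child: print after a delay, so the output arrives while the
 * parent is already stopped at its breakpoint. */
void child()
{
	sleep(1);
	printf("Hello, world!");
}

int main()
{
	int child_id;

	setbuf(stdout, NULL);	/* unbuffered: printf() writes immediately */

	child_id = fork();
	switch (child_id) {
	case -1:
		printf("fork: %s (code: %d)\n", strerror(errno), errno);
		exit(EXIT_FAILURE);

	case 0:
		child();
		break;

	default: /* parent */
		BRKPT;
		break;
	}

	return 0;
}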



Re: [PATCH bisected regression] input_available_p() sometimes says 'no' when it should say 'yes'

2015-05-04 Thread Michael Matz
Hi,

On Mon, 4 May 2015, Peter Hurley wrote:

> I think it would be a shame if ptrace() usage forced a whole class of 
> i/o to be synchronous.

I think Neil's patch doesn't do that, does it?  If it has an indication 
that in fact some data must be there but isn't, it pulls it; otherwise it's 
the same as after your patch/current state?

(Leaving the debugger question to Nic; but I guess it's a similar interface 
to gdb.  Once you come back into the debugger (breakpoint hit) it looks 
once for input on the tubes of the debuggee and then enters a prompt; it 
doesn't continue looking for input until you continue the debuggee (1).  
ptys would be used because it's Cobol, the programs contain data entry 
masks presumably needing a real tty, not just pipes.  That usecase would 
be broken now; the tty provided by the debugger doesn't reflect the real 
screen that the debuggee actually generated before the breakpoint.  Note 
how pipes in my test program _are_ synchronous in this sense, just ptys 
aren't.)


Ciao,
Michael.

(1) And in a single threaded debugger (no matter if the debuggee is 
multithreaded) it would be awkward to implement.  After read returns 0 
you'd again call poll, which indicates data is there, and read again.  
You repeat that until $SOMEWHEN.  But when is it enough?  Two loops, 10, 
1000?  To not sit in a tight loop you'd also add some nanosleeps, but that 
would add unnecessary lags.

Basically, whenever poll indicates that read won't block then it should 
also return some data, not 0, if at all reasonably implementable; i.e. 
some progress should be guaranteed.  I realize that this isn't always the 
case, but here it is.  In code, this loop:

  while (poll ([fd, POLLIN], 0) == 1)
    // So, read won't block, yippie
    if (read (fd, ...) == 0)
      continue;

shouldn't become a tight loop, without the read making progress but the 
kernel continuously stating "yep, there's data available", until some 
random point in the future.
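
Spelled out in real C, the loop above might look like the following sketch.
The name drain_available() and the buffer parameters are invented here, and
the choice to stop when read() returns 0 is an assumption of this example
rather than anything the kernel promises for ptys.

```c
#include <poll.h>
#include <unistd.h>

/* Hypothetical helper: read whatever is available on fd right now,
 * without blocking. Returns the number of bytes stored in buf
 * (possibly 0), or -1 on error. */
ssize_t drain_available(int fd, char *buf, size_t cap)
{
	size_t used = 0;

	while (used < cap) {
		struct pollfd pfd = { .fd = fd, .events = POLLIN };
		int n = poll(&pfd, 1, 0);

		if (n < 0)
			return -1;
		if (n == 0 || !(pfd.revents & POLLIN))
			break;		/* nothing (more) to read at this instant */

		ssize_t got = read(fd, buf + used, cap - used);
		if (got < 0)
			return -1;
		if (got == 0)
			break;		/* no progress: stop instead of spinning */
		used += (size_t)got;
	}
	return (ssize_t)used;
}
```

Breaking out on a zero-length read is what keeps this from becoming the
tight loop described above.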


Re: [PATCH bisected regression] input_available_p() sometimes says 'no' when it should say 'yes'

2015-05-04 Thread Peter Hurley
On 05/04/2015 08:24 AM, Michael Matz wrote:
> Hi,
> 
> On Fri, 1 May 2015, Peter Hurley wrote:
> 
>> I don't think this a real bug, in the sense that pty i/o is not 
>> synchronous, in the same way that tty i/o is not synchronous.
> 
> Here's what I wrote internally about my speculations about this being a 
> bug or not:
> 
>>> I also never hit it with pipes (remove the USEPTY define), also not on 
>>> sle12, so it must be some change specific to the pty implementation.
>>>
>>> Now, all of this is of course unspecified.  There are two asynchronous 
>>> processes involved, and a buffered tube between them.  Just because 
>>> one process filled one end of the tube (the breakpoint was hit) 
>>> doesn't mean the contents have to appear at that instant at the other 
>>> end.  So the change in behaviour in sle12 is not a genuine bug.  It 
>>> _might_ be an unintented change, though, that's why kernel people 
>>> should comment on this.  If there are no terribly good reasons for 
>>> this change I'd consider it a quality-of-implementation regression in 
>>> sle12.
> 
> So, I'd accept this being declared a non-bug, but it is certainly a change 
> in behaviour that's visible for our debugger team.
> 
>> However, that said, if this is a regression (regression as in "it broke 
>> something that used to work", not regression as in "this new thing I'm 
>> writing doesn't behave the way I want it to" :) )
>>
>> Help me understand the use-case here: are you using pty i/o to debug the 
>> debugger?
> 
> Nic is working on the Cobol debugger, but I think this pty i/o is rather a 
> part of the normal interaction between a debugged Cobol process and the 
> debugger; that's just a theory, Nic is authorative here.  But this change 
> in behaviour _did_ result in real testsuite regressions, so it's not 
> something that he wanted to write from scratch.

I'd like to understand why the debugger cares about when pty i/o shows up
and why there is a testsuite to check for that.

Does the debuggee know about the debugger, or is the pty i/o just stdout/stderr?

This doesn't seem stable in the face of multiple threads of execution in
the debuggee (or grandchild processes); IOW, pty slave writes from the
debuggee may continue from other non-TRACEME threads. Presumably that i/o
isn't being read either.

> (FWIW: I do think it's a better QoI factor if something returns data from 
> a tube if we can know via side channels (break points) that something must 
> have been written locally to the other end of the tube, if that can be 
> ensured without too much other work)

Well, if the debugger simply continues to monitor the pty master, the i/o
will arrive.
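
One way a reader could keep monitoring without spinning, as a rough sketch
only: poll() with a non-zero timeout and keep reading until the descriptor
has been quiet for that long. The name read_until_quiet() and the quiet_ms
parameter are invented for this example, and picking a suitable quiet period
is left to the caller.

```c
#include <poll.h>
#include <unistd.h>

/* Hypothetical helper: collect data from fd until cap bytes are stored or
 * no new data shows up for quiet_ms milliseconds. Returns the number of
 * bytes stored, or -1 if poll() fails. */
ssize_t read_until_quiet(int fd, char *buf, size_t cap, int quiet_ms)
{
	size_t used = 0;

	while (used < cap) {
		struct pollfd pfd = { .fd = fd, .events = POLLIN };
		int n = poll(&pfd, 1, quiet_ms);	/* sleeps in the kernel */

		if (n < 0)
			return -1;
		if (n == 0)
			break;		/* quiet period elapsed */

		ssize_t got = read(fd, buf + used, cap - used);
		if (got <= 0)
			break;		/* end of file or read error */
		used += (size_t)got;
	}
	return (ssize_t)used;
}
```

This trades a bounded delay for not having to guess how many times to
retry; it does not make the i/o synchronous.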

I think it would be a shame if ptrace() usage forced a whole class of
i/o to be synchronous.

Regards,
Peter Hurley



Re: [PATCH bisected regression] input_available_p() sometimes says 'no' when it should say 'yes'

2015-05-04 Thread Michael Matz
Hi,

On Fri, 1 May 2015, Peter Hurley wrote:

> I don't think this a real bug, in the sense that pty i/o is not 
> synchronous, in the same way that tty i/o is not synchronous.

Here's what I wrote internally about my speculations about this being a 
bug or not:

> > I also never hit it with pipes (remove the USEPTY define), also not on 
> > sle12, so it must be some change specific to the pty implementation.
> > 
> > Now, all of this is of course unspecified.  There are two asynchronous 
> > processes involved, and a buffered tube between them.  Just because 
> > one process filled one end of the tube (the breakpoint was hit) 
> > doesn't mean the contents have to appear at that instant at the other 
> > end.  So the change in behaviour in sle12 is not a genuine bug.  It 
> > _might_ be an unintended change, though, that's why kernel people 
> > should comment on this.  If there are no terribly good reasons for 
> > this change I'd consider it a quality-of-implementation regression in 
> > sle12.

So, I'd accept this being declared a non-bug, but it is certainly a change 
in behaviour that's visible for our debugger team.

> However, that said, if this is a regression (regression as in "it broke 
> something that used to work", not regression as in "this new thing I'm 
> writing doesn't behave the way I want it to" :) )
> 
> Help me understand the use-case here: are you using pty i/o to debug the 
> debugger?

Nic is working on the Cobol debugger, but I think this pty i/o is rather a 
part of the normal interaction between a debugged Cobol process and the 
debugger; that's just a theory, Nic is authoritative here.  But this change 
in behaviour _did_ result in real testsuite regressions, so it's not 
something that he wanted to write from scratch.

(FWIW: I do think it's a better QoI factor if something returns data from 
a tube if we can know via side channels (break points) that something must 
have been written locally to the other end of the tube, if that can be 
ensured without too much other work)


Ciao,
Michael.



Re: [PATCH bisected regression] input_available_p() sometimes says 'no' when it should say 'yes'

2015-05-04 Thread Peter Hurley
On 05/04/2015 08:24 AM, Michael Matz wrote:
 Hi,
 
 On Fri, 1 May 2015, Peter Hurley wrote:
 
 I don't think this is a real bug, in the sense that pty i/o is not 
 synchronous, in the same way that tty i/o is not synchronous.
 
 Here's what I wrote internally about my speculations about this being a 
 bug or not:
 
 I also never hit it with pipes (remove the USEPTY define), also not on 
 sle12, so it must be some change specific to the pty implementation.

 Now, all of this is of course unspecified.  There are two asynchronous 
 processes involved, and a buffered tube between them.  Just because 
 one process filled one end of the tube (the breakpoint was hit) 
 doesn't mean the contents have to appear at that instant at the other 
 end.  So the change in behaviour in sle12 is not a genuine bug.  It 
 _might_ be an unintended change, though; that's why kernel people 
 should comment on this.  If there are no terribly good reasons for 
 this change I'd consider it a quality-of-implementation regression in 
 sle12.
 
 So, I'd accept this being declared a non-bug, but it is certainly a change 
 in behaviour that's visible for our debugger team.
 
 However, that said, if this is a regression (regression as in it broke 
 something that used to work, not regression as in this new thing I'm 
 writing doesn't behave the way I want it to :) )

 Help me understand the use-case here: are you using pty i/o to debug the 
 debugger?
 
 Nic is working on the Cobol debugger, but I think this pty i/o is rather a 
 part of the normal interaction between a debugged Cobol process and the 
 debugger; that's just a theory, Nic is authoritative here.  But this change 
 in behaviour _did_ result in real testsuite regressions, so it's not 
 something that he wanted to write from scratch.

I'd like to understand why the debugger cares about when pty i/o shows up
and why there is a testsuite to check for that.

Does the debuggee know about the debugger, or is the pty i/o just stdout/stderr?

This doesn't seem stable in the face of multiple threads of execution in
the debuggee (or grandchild processes); IOW, pty slave writes from the
debuggee may continue from other non-TRACEME threads. Presumably that i/o
isn't being read either.

 (FWIW: I do think it's a better QoI factor if something returns data from 
 a tube if we can know via side channels (break points) that something must 
 have been written locally to the other end of the tube, if that can be 
 ensured without too much other work)

Well, if the debugger simply continues to monitor the pty master, the i/o
will arrive.
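
One way a debugger could "continue to monitor" without spinning is to poll the pty master with a small, finite timeout instead of a timeout of 0, and stop once the descriptor has been quiet for that long. The following is only a sketch under that assumption: drain_until_quiet() is a hypothetical helper, and the length of the quiet window is a choice left to the caller; neither comes from this thread.

```c
#include <errno.h>
#include <poll.h>
#include <unistd.h>

/*
 * Hypothetical helper: read from fd until it has been quiet for
 * quiet_ms milliseconds, end of file is seen, or buf is full.
 * Returns the total number of bytes read, or -1 on error.
 */
static ssize_t drain_until_quiet(int fd, char *buf, size_t cap, int quiet_ms)
{
    size_t total = 0;

    while (total < cap) {
        struct pollfd p = { .fd = fd, .events = POLLIN, .revents = 0 };
        int rc = poll(&p, 1, quiet_ms);

        if (rc < 0) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal: retry */
            return -1;
        }
        if (rc == 0)
            break;              /* nothing arrived within the window */
        if (!(p.revents & POLLIN))
            break;              /* hangup or error with no data */

        ssize_t n = read(fd, buf + total, cap - total);
        if (n < 0) {
            if (errno == EINTR || errno == EAGAIN)
                continue;
            return -1;
        }
        if (n == 0)
            break;              /* end of file */
        total += (size_t)n;
    }
    return (ssize_t)total;
}
```

The trade-off is the one raised earlier in the thread: a longer window makes it less likely that late output is missed, but adds that much lag before the prompt appears.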

I think it would be a shame if ptrace() usage forced a whole class of
i/o to be synchronous.

Regards,
Peter Hurley



Re: [PATCH bisected regression] input_available_p() sometimes says 'no' when it should say 'yes'

2015-05-04 Thread Michael Matz
Hi,

On Mon, 4 May 2015, Peter Hurley wrote:

 I think it would be a shame if ptrace() usage forced a whole class of 
 i/o to be synchronous.

I think Neil's patch doesn't do that, does it?  If it has an indication 
that in fact some data must be there but isn't, it pulls it; otherwise it's 
the same as after your patch/current state?

(Leaving the debugger question to Nic; but I guess it's an interface 
similar to gdb's.  Once you come back into the debugger (breakpoint hit) it looks 
once for input on the tubes of the debuggee and then enters a prompt; it 
doesn't continue looking for input until you continue the debuggee (1).  
ptys would be used because it's Cobol, the programs contain data entry 
masks presumably needing a real tty, not just pipes.  That usecase would 
be broken now; the tty provided by the debugger doesn't reflect the real 
screen that the debuggee actually generated before the breakpoint.  Note 
how pipes in my test program _are_ synchronous in this sense, just ptys 
aren't.)


Ciao,
Michael.

(1) And in a single threaded debugger (no matter if the debuggee is 
multithreaded) it would be awkward to implement.  After read returns 0 
you'd again call poll, which indicates data is there, and read again.  
You repeat that until $SOMEWHEN.  But when is it enough?  Two loops, 10, 
1000?  To not sit in a tight loop you'd also add some nanosleeps, but that 
would add unnecessary lags.

Basically, whenever poll indicates that read won't block then it should 
also return some data, not 0, if at all reasonably implementable; i.e. 
some progress should be guaranteed.  I realize that this isn't always the 
case, but here it is.  In code, this loop:

  while (poll ([fd, POLLIN], 0) == 1)
    // So, read won't block, yippie
    if (read (fd, ...) == 0)
      continue;

shouldn't become a tight loop, without the read making progress but the 
kernel continuously stating yep, there's data available, until some 
random point in the future.


Re: [PATCH bisected regression] input_available_p() sometimes says 'no' when it should say 'yes'

2015-05-04 Thread Peter Hurley
On 05/04/2015 12:56 PM, Michael Matz wrote:
 Hi,
 
 On Mon, 4 May 2015, Peter Hurley wrote:
 
 I think it would be a shame if ptrace() usage forced a whole class of 
 i/o to be synchronous.
 
 I think Neil's patch doesn't do that, does it?

Yes. Non-blocking i/o would now have to wait until the input worker has
completed (at a time when the input worker may not even be running on a
cpu yet).

  If it has an indication 
 that in fact some data must be there but isn't, it pulls it; otherwise it's 
 the same as after your patch/current state?

At the point when the write() returns from the child process, the input
worker may not even have run yet. The patch will now force the reader to
wait until the input worker has started and run to completion.

Moreover, prior to 4.1, the read i/o loop is sloppy wrt kicking the
input worker, so there is every likelihood that this patch would
force extra waits on a non-blocking reader even though no input was actually
being written.

 (Leaving the debugger question to Nic; but I guess it's an interface 
 similar to gdb's.  Once you come back into the debugger (breakpoint hit) it looks 
 once for input on the tubes of the debuggee and then enters a prompt; it 
 doesn't continue looking for input until you continue the debuggee (1).

That's exactly what gdb does. Below is a simple test jig [2] where the child
outputs while at the gdb prompt.

  
 ptys would be used because it's Cobol, the programs contain data entry 
 masks presumably needing a real tty, not just pipes.  That usecase would 
 be broken now; the tty provided by the debugger doesn't reflect the real 
 screen that the debuggee actually generated before the breakpoint.  Note 
 how pipes in my test program _are_ synchronous in this sense, just ptys 
 aren't.)

What's interesting about that expectation is that it would never work
on an actual tty.


 (1) And in a single threaded debugger (no matter if the debuggee is 
 multithreaded) it would be awkward to implement.  After read returns 0 
 you'd again call poll, which indicates data is there, and read again.  
 You repeat that until $SOMEWHEN.  But when is it enough?  Two loops, 10, 
 1000?  To not sit in a tight loop you'd also add some nanosleeps, but that 
 would add unnecessary lags.

That's not what's happening.

poll() with a timeout of 0 returns immediately, even if no file descriptors
are ready for i/o. The poll() is returning 0 because there is no data to read,
so the loop is exiting.

If poll() returned non-zero and read() returned no data, that would
definitely be a bug.
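
The zero-timeout behaviour described above is easy to observe on an ordinary pipe. This is a minimal sketch in C, not taken from the thread: readable_now() is a hypothetical helper that asks poll() whether a descriptor has data at this instant, without waiting.

```c
#include <poll.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns 1 if fd has data to read right now,
 * 0 if it does not, and -1 if poll() itself fails.  A timeout of 0
 * tells poll() not to wait at all.
 */
static int readable_now(int fd)
{
    struct pollfd p = { .fd = fd, .events = POLLIN, .revents = 0 };
    int rc = poll(&p, 1, 0);

    if (rc < 0)
        return -1;
    return (rc == 1 && (p.revents & POLLIN)) ? 1 : 0;
}
```

On a freshly created pipe this returns 0; after a byte has been written to the other end it returns 1, and it goes back to 0 once that byte has been read. A return of 0 therefore only says that nothing was readable at the moment of the call.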

 Basically, whenever poll indicates that read won't block then it should 
 also return some data, not 0, if at all reasonably implementable; i.e. 
 some progress should be guaranteed.

ttys have a further refinement of the behavior of read() and poll(), which
is controlled by the VMIN and VTIME values in the termios structure.

But note: the pty master does not allow its termios to be programmed --
a pty master read() is always non-blocking.
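
For reference, VMIN and VTIME only take effect in non-canonical mode: VMIN is the minimum number of bytes a read() waits for, and VTIME is a timeout in tenths of a second. The sketch below, in C, only edits a struct termios in memory; set_noncanonical() is a hypothetical helper, and a real program would obtain the structure with tcgetattr() and apply it with tcsetattr() on a descriptor that accepts it.

```c
#include <termios.h>

/*
 * Hypothetical helper: switch a termios description to non-canonical
 * mode and set the two read() tuning values.  Other flags are left
 * untouched.
 */
static void set_noncanonical(struct termios *t, cc_t vmin, cc_t vtime)
{
    t->c_lflag &= ~(tcflag_t)ICANON;  /* disable line-at-a-time input */
    t->c_cc[VMIN] = vmin;             /* minimum bytes per read() */
    t->c_cc[VTIME] = vtime;           /* timeout, in tenths of a second */
}
```

With VMIN = 1 and VTIME = 0, for example, a blocking read() returns as soon as at least one byte is available.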

  I realize that this isn't always the 
 case, but here it is.  In code, this loop:
 
   while (poll ([fd, POLLIN], 0) == 1)
     // So, read won't block, yippie
     if (read (fd, ...) == 0)
       continue;
 
 shouldn't become a tight loop, without the read making progress but the 
 kernel continuously stating yep, there's data available, until some 
 random point in the future.

Yeah, that would be broken but, again, that's not what's happening here.

Regards,
Peter Hurley


--- % ---
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>
#include <unistd.h>

#define BRKPT  asm("int $3")

void child()
{
	sleep(1);
	printf("Hello, world!");
}

int main()
{
	int child_id;

	setbuf(stdout, NULL);

	child_id = fork();
	switch (child_id) {
	case -1:
		printf("fork: %s (code: %d)\n", strerror(errno), errno);
		exit(EXIT_FAILURE);

	case 0:
		child();
		break;

	default: /* parent */
		BRKPT;
		break;
	}

	return 0;
}



Re: [PATCH bisected regression] input_available_p() sometimes says 'no' when it should say 'yes'

2015-05-01 Thread Peter Hurley
Hi Neil,

On 05/01/2015 02:20 AM, NeilBrown wrote:
> 
> Hi Peter,
>  I recently had a report of a regression in 3.12.  I bisected it down to your
>  patch
>   f95499c3030f ("n_tty: Don't wait for buffer work in read() loop")
> 
>  Sometimes a poll on a master-pty will report there is nothing to read after
>  the slave has written something.
>  A test program is below.
>  On a kernel prior to your commit, this program never reports
> 
> Total bytes read is 0. PollRC=0
> 
>  On a kernel subsequent to your commit, that message is produced quite often.
> 
>  This was found while working on a debugger.
> 
>  Following the test program is my proposed patch which allows the program to
>  run as it used to.  It re-introduces the call to tty_flush_to_ldisc(), but
>  only if it appears that there is nothing to read.
> 
>  Do you think this is a suitable fix?  Do you even agree that it is a real
>  bug?

I don't think this is a real bug, in the sense that pty i/o is not synchronous,
in the same way that tty i/o is not synchronous.

However, that said, if this is a regression (regression as in "it broke
something that used to work", not regression as in "this new thing I'm writing
doesn't behave the way I want it to" :) )

Help me understand the use-case here: are you using pty i/o to debug the
debugger?

Regards,
Peter Hurley
 
> --
> #define _XOPEN_SOURCE
> #include <unistd.h>
> #include <stdlib.h>
> #include <stdio.h>
> #include <stdlib.h>
> #include <string.h>
> #include <errno.h>
> #include <sys/wait.h>
> #include <sys/types.h>
> #include <sys/ptrace.h>
> #include <asm/ptrace.h>
> #include <fcntl.h>
> #include <sys/poll.h>
> 
> 
> #define USEPTY
> #define COUNT_MAX (500)
> #define MY_BREAKPOINT { asm("int $3"); }
> #define PTRACE_IGNORED (void *)0
> 
> /*
> ** Open a pseudo-tty pair.
> **
> ** Return the master fd, set spty to the slave fd.
> */
> int
> my_openpt(int *spty)
> {
> int mfd = posix_openpt(O_RDWR | O_NOCTTY);
> char *slavedev;
> int sfd=-1;
> if(mfd == -1) return -1;
> if(grantpt(mfd) == -1) return -1;
> if(unlockpt(mfd) == -1) return -1;
> 
> slavedev = (char *)ptsname(mfd);
> 
> if((sfd = open(slavedev, O_RDWR | O_NOCTTY)) == -1)
> {
>   close(mfd);
>   return -1;
> }
> if(spty != NULL)
> {
>   *spty = sfd;
> }
> return mfd;
> }
> 
> 
> /*
> ** Read from the provided file descriptor if poll says there's
> ** anything there..
> */
> int
> DoPollRead(int mpty)
> {
> struct pollfd fds;
> int pollrc;
> ssize_t bread=0, totread=0;
> char readbuf[101];
> 
> /*
> ** Set up the poll.
> */
> fds.fd = mpty;
> fds.events = POLLIN | POLLRDNORM | POLLRDBAND | POLLPRI;
> 
> /*
> ** poll for any output.
> */
> 
> while((pollrc = poll(&fds, 1, 0)) == 1)
> {
>   if(fds.revents & POLLIN)
>   {
>   bread = read( mpty, readbuf, 100 );
>   totread += bread;
>   if(bread > 0)
>   {
>   //printf("Read %d bytes.\n", (int)bread);
>   readbuf[bread] = '\0';
>   //printf("\t%s", readbuf);
>   } else
>   {
>   //puts("Nothing read.\n");
>   }
>   } else if (fds.revents & (POLLHUP | POLLERR | POLLNVAL)) {
>   printf ("hangup/error/invalid on poll\n");
>   return totread;
>   } else { printf("No POLLIN, revents=%d\n", fds.revents); };
> }
> 
> /*
> ** This sometimes happens - we're expecting input on the pty, 
> ** but nothing is there.
> */
> if(totread == 0)
>   printf("Total bytes read is 0. PollRC=%d\n", pollrc);
> 
> return totread;
> }
> 
> static
> void writeall (int fd, const char *buf, size_t count)
> {
>   while (count)
> {
>   ssize_t r = write (fd, buf, count);
>   if (r == 0)
>   break;
>   if (r < 0 && errno == EINTR)
>   continue;
>   if (r < 0)
>   exit (2);
>   count -= r;
>   buf += r;
> }
> }
> 
> int
> thechild(void)
> {
> unsigned int i;
> 
> writeall (1, "debuggee starts\n", strlen ("debuggee starts\n"));
> 
> for(i=0 ; i<COUNT_MAX ; i++)
> {
> char buf[100];
>   sprintf(buf, "This is the debuggee. Count is %d\n", i);
>   writeall (1, buf, strlen (buf));
> 
>   MY_BREAKPOINT
> }
> 
> writeall (1, "debuggee finishing now.\n", strlen ("debuggee finishing now.\n"));
> exit (0);
> }
> 
> int
> main()
> {
> int rv, status, i=0;
> pid_t pid;
> int sfd = -1;
> int mfd;
> #ifdef USEPTY
> mfd = my_openpt(&sfd); /* Get a pseudo-tty pair. */
> if(mfd < 0)
> {
>   fprintf(stderr, "Failed to create pty\n");
>   return(1);
> }
> #else
> int pipefd[2];
> if (pipe (pipefd) < 0)
>   {
>   perror ("pipe");
>   return 1;
>   }
> mfd = pipefd[0];
> sfd = pipefd[1];
> #endif
> 
> /*
> ** Create a child process.
> */
> pid = fork();
> switch(pid)
> {
>   case -1:/* failed fork  */
>   return -1;
>   case 0: /* 

[PATCH bisected regression] input_available_p() sometimes says 'no' when it should say 'yes'

2015-05-01 Thread NeilBrown

Hi Peter,
 I recently had a report of a regression in 3.12.  I bisected it down to your
 patch
  f95499c3030f ("n_tty: Don't wait for buffer work in read() loop")

 Sometimes a poll on a master-pty will report there is nothing to read after
 the slave has written something.
 A test program is below.
 On a kernel prior to your commit, this program never reports

Total bytes read is 0. PollRC=0

 On a kernel subsequent to your commit, that message is produced quite often.

 This was found while working on a debugger.

 Following the test program is my proposed patch which allows the program to
 run as it used to.  It re-introduces the call to tty_flush_to_ldisc(), but
 only if it appears that there is nothing to read.

 Do you think this is a suitable fix?  Do you even agree that it is a real
 bug?
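
The "check, flush only if empty, check again" idea described here can be illustrated outside the kernel. What follows is a toy user-space model in C, not kernel code: struct toy_tty, toy_flush() and toy_input_available() are hypothetical names, with pending standing in for data the slave has written that the input worker has not yet processed, and committed for data the reader can already see.

```c
#include <stddef.h>

/* Toy stand-in for the two places written data can be. */
struct toy_tty {
    size_t pending;    /* written, but not yet moved by the worker */
    size_t committed;  /* already visible to the reader */
};

/* Toy stand-in for waiting until the worker has moved everything. */
static void toy_flush(struct toy_tty *t)
{
    t->committed += t->pending;
    t->pending = 0;
}

/*
 * First pass: cheap check only.  Second pass, taken only when the
 * first pass found nothing: flush, then check again.
 */
static int toy_input_available(struct toy_tty *t)
{
    int i;
    int ret = 0;

    for (i = 0; !ret && i < 2; i++) {
        if (i)
            toy_flush(t);
        ret = t->committed > 0;
    }
    return ret;
}
```

In this model the flush is skipped whenever the cheap check already finds data, so the extra wait is only paid in the "nothing to read" case.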

Thanks,
NeilBrown



--
#define _XOPEN_SOURCE
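#include <unistd.h>
#include <stdlib.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>
#include <sys/wait.h>
#include <sys/types.h>
#include <sys/ptrace.h>
#include <asm/ptrace.h>
#include <fcntl.h>
#include <sys/poll.h>


#define USEPTY
#define COUNT_MAX   (500)
#define MY_BREAKPOINT   { asm("int $3"); }
#define PTRACE_IGNORED  (void *)0

/*
** Open a pseudo-tty pair.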
**
** Return the master fd, set spty to the slave fd.
*/
int
my_openpt(int *spty)
{
int mfd = posix_openpt(O_RDWR | O_NOCTTY);
char *slavedev;
int sfd=-1;
if(mfd == -1) return -1;
if(grantpt(mfd) == -1) return -1;
if(unlockpt(mfd) == -1) return -1;

slavedev = (char *)ptsname(mfd);

if((sfd = open(slavedev, O_RDWR | O_NOCTTY)) == -1)
{
close(mfd);
return -1;
}
if(spty != NULL)
{
*spty = sfd;
}
return mfd;
}


/*
** Read from the provided file descriptor if poll says there's
** anything there..
*/
int
DoPollRead(int mpty)
{
struct pollfd fds;
int pollrc;
ssize_t bread=0, totread=0;
char readbuf[101];

/*
** Set up the poll.
*/
fds.fd = mpty;
fds.events = POLLIN | POLLRDNORM | POLLRDBAND | POLLPRI;

/*
** poll for any output.
*/

while((pollrc = poll(&fds, 1, 0)) == 1)
{
if(fds.revents & POLLIN)
{
bread = read( mpty, readbuf, 100 );
totread += bread;
if(bread > 0)
{
//printf("Read %d bytes.\n", (int)bread);
readbuf[bread] = '\0';
//printf("\t%s", readbuf);
} else
{
//puts("Nothing read.\n");
}
} else if (fds.revents & (POLLHUP | POLLERR | POLLNVAL)) {
printf ("hangup/error/invalid on poll\n");
return totread;
} else { printf("No POLLIN, revents=%d\n", fds.revents); };
}

/*
** This sometimes happens - we're expecting input on the pty, 
** but nothing is there.
*/
if(totread == 0)
printf("Total bytes read is 0. PollRC=%d\n", pollrc);

return totread;
}

static
void writeall (int fd, const char *buf, size_t count)
{
  while (count)
{
  ssize_t r = write (fd, buf, count);
  if (r == 0)
break;
  if (r < 0 && errno == EINTR)
continue;
  if (r < 0)
exit (2);
  count -= r;
  buf += r;
}
}

int
thechild(void)
{
unsigned int i;

writeall (1, "debuggee starts\n", strlen ("debuggee starts\n"));

for(i=0 ; i
Subject: [PATCH] n_tty: Sometimes wait for buffer work in read() loop

Since commit
  f95499c3030f ("n_tty: Don't wait for buffer work in read() loop")

it has been possible for poll to report that there is no data to read
on a master-pty even if a write to the slave has actually completed.

That patch removes a 'wait' when the wait isn't really necessary.
Unfortunately it also removed it in the case when it *is* necessary.
If the simple tests show that there is nothing to read, we really need
to flush the work queue in case there is something ready but which
hasn't arrived yet.

This patch restores the wait, but only if simple tests suggest there
is nothing ready.

Reported-by: Nic Percival 
Reported-by: Michael Matz 
Fixes: f95499c3030f ("n_tty: Don't wait for buffer work in read() loop")
Signed-off-by: NeilBrown 

diff --git a/drivers/tty/n_tty.c b/drivers/tty/n_tty.c
index cf6e0f2e1331..9884091819b6 100644
--- a/drivers/tty/n_tty.c
+++ b/drivers/tty/n_tty.c
@@ -1942,11 +1942,18 @@ static inline int input_available_p(struct tty_struct *tty, int poll)
 {
struct n_tty_data *ldata = tty->disc_data;
int amt = poll && !TIME_CHAR(tty) && MIN_CHAR(tty) ? MIN_CHAR(tty) : 1;
-
-   if (ldata->icanon && !L_EXTPROC(tty))
-   return ldata->canon_head != ldata->read_tail;
-   else
-   return ldata->commit_head - ldata->read_tail >= amt;
+   int i;
+   int ret = 0;
+
+   for (i = 0; !ret && i < 2; i++) {
+   if (i)
+   tty_flush_to_ldisc(tty);
+   if (ldata->icanon && !L_EXTPROC(tty))
+   ret = (ldata->canon_head != 

[PATCH bisected regression] input_available_p() sometimes says 'no' when it should say 'yes'

2015-05-01 Thread NeilBrown

Hi Peter,
 I recently had a report of a regression in 3.12.  I bisected it down to your
 patch
  f95499c3030f ("n_tty: Don't wait for buffer work in read() loop")

 Sometimes a poll on a master-pty will report there is nothing to read after
 the slave has written something.
 A test program is below.
 On a kernel prior to your commit, this program never reports

Total bytes read is 0. PollRC=0

 On a kernel subsequent to your commit, that message is produced quite often.

 This was found while working on a debugger.

 Following the test program is my proposed patch which allows the program to
 run as it used to.  It re-introduces the call to tty_flush_to_ldisc(), but
 only if it appears that there is nothing to read.

 Do you think this is a suitable fix?  Do you even agree that it is a real
 bug?

Thanks,
NeilBrown



--
#define _XOPEN_SOURCE
#include <unistd.h>
#include <stdlib.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>
#include <sys/wait.h>
#include <sys/types.h>
#include <sys/ptrace.h>
#include <asm/ptrace.h>
#include <fcntl.h>
#include <sys/poll.h>


#define USEPTY
#define COUNT_MAX   (500)
#define MY_BREAKPOINT   { asm("int $3"); }
#define PTRACE_IGNORED  (void *)0

/*
** Open a pseudo-tty pair.
**
** Return the master fd, set spty to the slave fd.
*/
int
my_openpt(int *spty)
{
int mfd = posix_openpt(O_RDWR | O_NOCTTY);
char *slavedev;
int sfd=-1;
if(mfd == -1) return -1;
if(grantpt(mfd) == -1) return -1;
if(unlockpt(mfd) == -1) return -1;

slavedev = (char *)ptsname(mfd);

if((sfd = open(slavedev, O_RDWR | O_NOCTTY)) == -1)
{
close(mfd);
return -1;
}
if(spty != NULL)
{
*spty = sfd;
}
return mfd;
}


/*
** Read from the provided file descriptor if poll says there's
** anything there..
*/
int
DoPollRead(int mpty)
{
struct pollfd fds;
int pollrc;
ssize_t bread=0, totread=0;
char readbuf[101];

/*
** Set up the poll.
*/
fds.fd = mpty;
fds.events = POLLIN | POLLRDNORM | POLLRDBAND | POLLPRI;

/*
** poll for any output.
*/

while((pollrc = poll(&fds, 1, 0)) == 1)
{
if(fds.revents & POLLIN)
{
bread = read( mpty, readbuf, 100 );
totread += bread;
if(bread > 0)
{
//printf("Read %d bytes.\n", (int)bread);
readbuf[bread] = '\0';
//printf("\t%s", readbuf);
} else
{
//puts("Nothing read.\n");
}
} else if (fds.revents & (POLLHUP | POLLERR | POLLNVAL)) {
printf ("hangup/error/invalid on poll\n");
return totread;
} else { printf("No POLLIN, revents=%d\n", fds.revents); };
}

/*
** This sometimes happens - we're expecting input on the pty, 
** but nothing is there.
*/
if(totread == 0)
printf("Total bytes read is 0. PollRC=%d\n", pollrc);

return totread;
}

static
void writeall (int fd, const char *buf, size_t count)
{
  while (count)
{
  ssize_t r = write (fd, buf, count);
  if (r == 0)
break;
  if (r < 0 && errno == EINTR)
continue;
  if (r < 0)
exit (2);
  count -= r;
  buf += r;
}
}

int
thechild(void)
{
unsigned int i;

writeall (1, "debuggee starts\n", strlen ("debuggee starts\n"));

for(i=0 ; i<COUNT_MAX ; i++)
{
char buf[100];
sprintf(buf, "This is the debuggee. Count is %d\n", i);
writeall (1, buf, strlen (buf));

MY_BREAKPOINT
}

writeall (1, "debuggee finishing now.\n", strlen ("debuggee finishing now.\n"));
exit (0);
}

int
main()
{
int rv, status, i=0;
pid_t pid;
int sfd = -1;
int mfd;
#ifdef USEPTY
mfd = my_openpt(&sfd);  /* Get a pseudo-tty pair. */
if(mfd < 0)
{
fprintf(stderr, "Failed to create pty\n");
return(1);
}
#else
int pipefd[2];
if (pipe (pipefd) < 0)
  {
perror ("pipe");
return 1;
  }
mfd = pipefd[0];
sfd = pipefd[1];
#endif

/*
** Create a child process.
*/
pid = fork();
switch(pid)
{
case -1:/* failed fork  */
return -1;
case 0: /* child process*/

close (mfd);
/*
** Close stdout, use the slave pty for output.
*/
dup2(sfd, STDOUT_FILENO);


/*
** Set 'TRACEME' so this child process can be traced by the
** parent process. 
*/
ptrace(PTRACE_TRACEME,
PTRACE_IGNORED, PTRACE_IGNORED, PTRACE_IGNORED);
thechild ();
break;

default:/* parent process drops out of switch   */
close (sfd);
break;
}

/*
** Wait for the debuggee to hit the traceme.
** When we see this, immediately send a PTRACE_CONT