RE: [PATCH bisected regression] input_available_p() sometimes says 'no' when it should say 'yes'

2015-05-05 Thread Nic Percival

There is only ever one debuggee process.
My original demo (and indeed the original test failure) is not threaded. The 
debugger is multi-threaded.

I've brought Chris, Fletch and Paul, my immediate colleagues, into the 
discussion.

The email thread is getting a little tangled; however, from my standpoint I 
have:

1) poll tells us we have nothing to read on a pty, when we know something was 
written into the other end.

2) Given that 'poll' is not telling us that data has been written into the pty, 
what can we use? Surely that is what poll is for.

3) If a debuggee program has displayed 'how old are you?' and then hit a 
breakpoint on the 'ACCEPT' response, then the question might very well not be 
displayed, despite the debugger sitting on the statement some way subsequent 
to the display.

4) If I understand correctly, the modification is a performance enhancement. 
Obviously in the case of 'ptrace' debugging, performance is not a requirement.

5) Given that 'xterm' uses ptys, could a scenario happen where a user is 
prompted 'How old are you?' in the xterm, but an input (getchar, whatever) is 
hit before that output is displayed? With or without ptrace?
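
As a hedged illustration of points 1) and 2), here is a minimal Python sketch 
(not the debugger's code, and not the reproducer discussed in this thread). It 
writes to the slave end of a pty and then polls the master end twice: once 
with a zero timeout and once with a bounded wait. The helper names 
`poll_readable` and `demo`, and the 1000 ms timeout, are assumptions chosen 
for illustration only.

```python
import os
import select


def poll_readable(fd, timeout_ms):
    """Return True if poll() reports fd readable within timeout_ms."""
    poller = select.poll()
    poller.register(fd, select.POLLIN)
    return bool(poller.poll(timeout_ms))


def demo():
    """Write to the slave end of a pty, then poll the master end twice."""
    master_fd, slave_fd = os.openpty()
    try:
        # No newline, so terminal output processing leaves the bytes alone.
        message = b"How old are you?"
        os.write(slave_fd, message)

        # Zero timeout: asks "is data visible right now?". Because pty i/o
        # is asynchronous, this may be True or False.
        immediate = poll_readable(master_fd, 0)

        # Bounded wait (assumed 1000 ms): gives the data time to arrive.
        eventual = poll_readable(master_fd, 1000)

        data = os.read(master_fd, 1024) if eventual else b""
        return immediate, eventual, data
    finally:
        os.close(slave_fd)
        os.close(master_fd)


if __name__ == "__main__":
    print(demo())
```

The zero-timeout result is deliberately not relied on here; only the bounded 
wait is expected to see the data.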

Thanks,
Nic



-----Original Message-----
From: Peter Hurley [mailto:pe...@hurleysoftware.com] 
Sent: 05 May 2015 12:19
To: Nic Percival; Michael Matz
Cc: NeilBrown; Greg Kroah-Hartman; Jiri Slaby; linux-kernel@vger.kernel.org
Subject: Re: [PATCH bisected regression] input_available_p() sometimes says 
'no' when it should say 'yes'


A: No.
Q: Should I include quotations after my reply?

http://daringfireball.net/2007/07/on_top


On 05/05/2015 04:20 AM, Nic Percival wrote:
> Michael is correct.
> Our COBOL debugger has a test feature whereby we can drive it to step through 
> debugging code, hitting breakpoints and so on.
> The debugger maintains a 'user screen' which is what the 'debuggee' process 
> has displayed.
> This is communicated to the debugger with pseudo-ttys.
> The state of this user screen is checked as part of this (and other) tests.

So the debugger doesn't display output from other non-TRACEME threads or child 
processes of the debuggee, right?

When that's fixed, you'll see that the "test failure" has gone away.

> The actual test failure is a failure of some text to be displayed on the 
> debuggee user screen when we know, given it has hit a certain breakpoint, 
> that the text has been written.
> 
> What is worse is it's non-deterministic.

That your test is non-deterministic stems from the fact that the i/o is 
asynchronous.

You would experience the same problem if your test setup was a tty in loopback.
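
To make the point about asynchronous i/o concrete, here is a hedged sketch of 
how a test harness could wait for expected text with a deadline instead of 
checking the pty once. It is an assumption about one possible approach, not 
something proposed in this thread; the function name `wait_for_text` and the 
default 2 second timeout are hypothetical.

```python
import os
import select
import time


def wait_for_text(fd, expected, timeout_s=2.0):
    """Read from fd until `expected` appears or the deadline passes.

    Returns a (found, received_bytes) tuple.
    """
    poller = select.poll()
    poller.register(fd, select.POLLIN)
    deadline = time.monotonic() + timeout_s
    received = b""
    while expected not in received:
        remaining_ms = int((deadline - time.monotonic()) * 1000)
        if remaining_ms <= 0:
            break  # deadline reached without seeing the expected text
        if poller.poll(remaining_ms):
            try:
                chunk = os.read(fd, 4096)
            except OSError:
                break  # e.g. the other end of the pty has been closed
            if not chunk:
                break  # end of file
            received += chunk
    return expected in received, received


if __name__ == "__main__":
    master_fd, slave_fd = os.openpty()
    os.write(slave_fd, b"How old ")
    os.write(slave_fd, b"are you?")
    print(wait_for_text(master_fd, b"How old are you?"))
    os.close(slave_fd)
    os.close(master_fd)
```

The trade-off is that a missing prompt is only reported after the full 
timeout, so a failing test runs slower than a passing one.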

> Sometimes the text makes it and is displayed, so it wouldn't even be 
> practical to modify the test to make it pass.
> We wouldn't really want to do that anyway - the test is just fine on other 
> earlier SUSE, on RedHat (intel and 390), HP/UX, AIX and Solaris.

There is a reason Linux is the platform of choice for scalability.

Regards,
Peter Hurley

> -----Original Message-----
> From: Michael Matz [mailto:m...@suse.de]
> Sent: 04 May 2015 13:24
> To: Peter Hurley
> Cc: NeilBrown; Nic Percival; Greg Kroah-Hartman; Jiri Slaby; 
> linux-kernel@vger.kernel.org
> Subject: Re: [PATCH bisected regression] input_available_p() sometimes says 
> 'no' when it should say 'yes'
> 
> Hi,
> 
> On Fri, 1 May 2015, Peter Hurley wrote:
> 
>> I don't think this is a real bug, in the sense that pty i/o is not 
>> synchronous, in the same way that tty i/o is not synchronous.
> 
> Here's what I wrote internally about my speculations about this being a bug 
> or not:
> 
>>> I also never hit it with pipes (remove the USEPTY define), also not 
>>> on sle12, so it must be some change specific to the pty implementation.
>>>
>>> Now, all of this is of course unspecified.  There are two 
>>> asynchronous processes involved, and a buffered tube between them.
>>> Just because one process filled one end of the tube (the breakpoint 
>>> was hit) doesn't mean the contents have to appear at that instant at 
>>> the other end.  So the change in behaviour in sle12 is not a genuine 
>>> bug.  It _might_ be an unintended change, though, that's why kernel 
>>> people should comment on this.  If there are no terribly good 
>>> reasons for this change I'd consider it a quality-of-implementation 
>>> regression in sle12.
> 
> So, I'd accept this being declared a non-bug, but it is certainly a change in 
> behaviour that's visible for our debugger team.
> 
>> However, that said, if this is a regression (regression as in "it 
>> broke something that used to work", not regression as in "this new 
>> thing I'm writing doesn't behave the way I want it to" :) )
>>
>> Help me u

RE: [PATCH bisected regression] input_available_p() sometimes says 'no' when it should say 'yes'

2015-05-05 Thread Nic Percival

Michael is correct.
Our COBOL debugger has a test feature whereby we can drive it to step through 
debugging code, hitting breakpoints and so on.
The debugger maintains a 'user screen' which is what the 'debuggee' process has 
displayed.
This is communicated to the debugger with pseudo-ttys.
The state of this user screen is checked as part of this (and other) tests.

The actual test failure is a failure of some text to be displayed on the 
debuggee user screen when we know, given it has hit a certain breakpoint, that 
the text has been written.

What is worse is it's non-deterministic. Sometimes the text makes it and is 
displayed, so it wouldn't even be practical to modify the test to make it pass.
We wouldn't really want to do that anyway - the test is just fine on other 
earlier SUSE, on RedHat (intel and 390), HP/UX, AIX and Solaris.

Thanks,
Nic

-----Original Message-----
From: Michael Matz [mailto:m...@suse.de] 
Sent: 04 May 2015 13:24
To: Peter Hurley
Cc: NeilBrown; Nic Percival; Greg Kroah-Hartman; Jiri Slaby; 
linux-kernel@vger.kernel.org
Subject: Re: [PATCH bisected regression] input_available_p() sometimes says 
'no' when it should say 'yes'

Hi,

On Fri, 1 May 2015, Peter Hurley wrote:

> I don't think this is a real bug, in the sense that pty i/o is not 
> synchronous, in the same way that tty i/o is not synchronous.

Here's what I wrote internally about my speculations about this being a bug or 
not:

> > I also never hit it with pipes (remove the USEPTY define), also not 
> > on sle12, so it must be some change specific to the pty implementation.
> > 
> > Now, all of this is of course unspecified.  There are two 
> > asynchronous processes involved, and a buffered tube between them.  
> > Just because one process filled one end of the tube (the breakpoint 
> > was hit) doesn't mean the contents have to appear at that instant at 
> > the other end.  So the change in behaviour in sle12 is not a genuine 
> > bug.  It _might_ be an unintended change, though, that's why kernel 
> > people should comment on this.  If there are no terribly good 
> > reasons for this change I'd consider it a quality-of-implementation 
> > regression in sle12.

So, I'd accept this being declared a non-bug, but it is certainly a change in 
behaviour that's visible for our debugger team.

> However, that said, if this is a regression (regression as in "it 
> broke something that used to work", not regression as in "this new 
> thing I'm writing doesn't behave the way I want it to" :) )
> 
> Help me understand the use-case here: are you using pty i/o to debug 
> the debugger?

Nic is working on the Cobol debugger, but I think this pty i/o is rather a part 
of the normal interaction between a debugged Cobol process and the debugger; 
that's just a theory, Nic is authoritative here.  But this change in behaviour 
_did_ result in real testsuite regressions, so it's not something that he 
wanted to write from scratch.

(FWIW: I do think it's a better QoI factor if something returns data from a 
tube if we can know via side channels (break points) that something must have 
been written locally to the other end of the tube, if that can be ensured 
without too much other work)


Ciao,
Michael.



