RE: [PATCH bisected regression] input_available_p() sometimes says 'no' when it should say 'yes'
Apologies all. Nic and I did not realise that an internal email had been cross-posted to a public mail list (let alone one with strict email formatting rules!), and were having a hard time understanding the context for the emails we were receiving. I have a certain amount of experience of asynchronous communication and protocol design: we aren't novices in this area. One would hope in the kind of intra-machine protocol that we're using here that, if we know the sender is halted, there should be a way of clearing the contents of the channel so that the receiver can get hold of whatever has been put in to it. Can tcflush() (or some similar API) be used to resolve our debugging scenario? Regards, Chris -- Chris Purvis Senior Development Manager Micro Focus chris.pur...@microfocus.com The Lawn, 22-30 Old Bath Road Newbury, Berkshire, RG14 1QN, UK Direct:+44 1635 565282 This message has been scanned for malware by Websense. www.websense.com
RE: [PATCH bisected regression] input_available_p() sometimes says 'no' when it should say 'yes'
Peter, So is calling tcflush() a solution here? Regards, Chris -- Chris Purvis Senior Development Manager Micro Focus chris.pur...@microfocus.com The Lawn, 22-30 Old Bath Road Newbury, Berkshire, RG14 1QN, UK Direct:+44 1635 565282 -Original Message- From: Peter Hurley [mailto:pe...@hurleysoftware.com] Sent: 05 May 2015 14:29 To: Nic Percival; Michael Matz; Kevin Fletcher; Paul Matthews; Chris Purvis Cc: NeilBrown; Greg Kroah-Hartman; Jiri Slaby; linux-kernel@vger.kernel.org Subject: Re: [PATCH bisected regression] input_available_p() sometimes says 'no' when it should say 'yes' Stop top-posting. On 05/05/2015 08:03 AM, Nic Percival wrote: > > There is only ever one debuggee process. > My original demo (and indeed the original test failure) is not threaded. The > debugger is multi-threaded. > > I've brought in Chris, Fletch and Paul, my immediate colleagues, into the > discussion. > > The email thread is getting a little tangled, however, from my standpoint I > have.. > > 1) poll tells us we have nothing to read on a pty, when we know something was > written into the other end. You're using a synchronization mechanism (ptrace) to validate an asynchronous process (tty i/o). That's not going to work. > 2) Given that 'poll' is not telling us that data has been written into the > pty, what can we use? Surely that is what poll is for. poll() doesn't tell you that nothing has been written. You're inferring that using a broken understanding of terminal i/o: ttys are not synchronous pipes. > 3) If a debuggee program has displayed 'how old are you?' and then hit a > breakpoint on the 'ACCEPT' response, then the question might very well not be > displayed, despite the debugger sitting on the statement some way subsequent > to the display. Let's extend your logic process here to a general-purpose debugger that can control all output devices. 1. The debugger and debuggee are running on X-Windows. 2. The debuggee outputs 'how old are you?' 3. The debugger immediately halts the debuggee and all output devices. The output will not appear on the monitor because X-Windows output is asynchronous. So is terminal i/o. > 4) If I understand correctly, the modification is a performance enhancement. > Obviously in the case of 'ptrace' debugging, performance is not a requirement. Nothing obvious about it. Not all uses of ptrace are interactive, and certainly don't want alternate behavior based on whether the process is ptraced. > 5) Given 'xterm' use pty's, could a scenario happen where a user is prompted > 'How old are you?' in the xterm, but an input (getchar, whatever) is hit > before that output is displayed? With or without ptrace? Of course. It's called typeahead. Since tty i/o is buffered, the following is possible: 1. The user types '15\r' 2. The process writes 'How old are you?' 3. The process reads '15\n' Processes that don't want typeahead call tcflush() before reading input. Regards, Peter Hurley > Thanks, > Nic > > > > -Original Message- > From: Peter Hurley [mailto:pe...@hurleysoftware.com] > Sent: 05 May 2015 12:19 > To: Nic Percival; Michael Matz > Cc: NeilBrown; Greg Kroah-Hartman; Jiri Slaby; linux-kernel@vger.kernel.org > Subject: Re: [PATCH bisected regression] input_available_p() sometimes says > 'no' when it should say 'yes' > > > A: No. > Q: Should I include quotations after my reply? > > http://daringfireball.net/2007/07/on_top > > > On 05/05/2015 04:20 AM, Nic Percival wrote: >> Michael is correct. >> Our COBOL debugger has a test feature whereby we can drive it to step >> through debugging code, hitting breakpoints and so on. >> The debugger maintains a 'user screen' which is what the 'debuggee' process >> has displayed. >> This is communicated to the debugger with pseudo-tty's. >> The state of this user screen is checked as part of this (and other) tests. > > So the debugger doesn't display output from other non-TRACEME threads or > child processes of the debuggee, right? > > When that's fixed, you'll see that the "test failure" has gone away. > >> The actual test failure is a failure of some text to be displayed on the >> debuggee user screen when we know, given it has hit a certain breakpoint, >> that the text has been written. >> >> What is worse is its non-deterministic. > > That your test is non-deterministic stems from the fact that the i/o is > asynchronous. > > You would experience the same problem if your test setup was a tty in > loopback. > >> Sometimes the text makes it and is displayed, so it wouldn't even be >> practical to modify the test to make it pass. >> We wouldn't really want to do that anyway - the test is just fine on ot
RE: [PATCH bisected regression] input_available_p() sometimes says 'no' when it should say 'yes'
A way of giving context. This message has been scanned for malware by Websense. www.websense.com N�r��yb�X��ǧv�^�){.n�+{zX����ܨ}���Ơz�:+v���zZ+��+zf���h���~i���z��w���?�&�)ߢf��^jǫy�m��@A�a��� 0��h���i
Re: [PATCH bisected regression] input_available_p() sometimes says 'no' when it should say 'yes'
On 05/05/2015 09:34 AM, Chris Purvis wrote: > Peter, > > So is calling tcflush() a solution here? > > Regards, > Chris What is with the top-posting? -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH bisected regression] input_available_p() sometimes says 'no' when it should say 'yes'
Stop top-posting. On 05/05/2015 08:03 AM, Nic Percival wrote: > > There is only ever one debuggee process. > My original demo (and indeed the original test failure) is not threaded. The > debugger is multi-threaded. > > I've brought in Chris, Fletch and Paul, my immediate colleagues, into the > discussion. > > The email thread is getting a little tangled, however, from my standpoint I > have.. > > 1) poll tells us we have nothing to read on a pty, when we know something was > written into the other end. You're using a synchronization mechanism (ptrace) to validate an asynchronous process (tty i/o). That's not going to work. > 2) Given that 'poll' is not telling us that data has been written into the > pty, what can we use? Surely that is what poll is for. poll() doesn't tell you that nothing has been written. You're inferring that using a broken understanding of terminal i/o: ttys are not synchronous pipes. > 3) If a debuggee program has displayed 'how old are you?' and then hit a > breakpoint on the 'ACCEPT' response, then the question might very well not be > displayed, despite the debugger sitting on the statement some way subsequent > to the display. Let's extend your logic process here to a general-purpose debugger that can control all output devices. 1. The debugger and debuggee are running on X-Windows. 2. The debuggee outputs 'how old are you?' 3. The debugger immediately halts the debuggee and all output devices. The output will not appear on the monitor because X-Windows output is asynchronous. So is terminal i/o. > 4) If I understand correctly, the modification is a performance enhancement. > Obviously in the case of 'ptrace' debugging, performance is not a requirement. Nothing obvious about it. Not all uses of ptrace are interactive, and certainly don't want alternate behavior based on whether the process is ptraced. > 5) Given 'xterm' use pty's, could a scenario happen where a user is prompted > 'How old are you?' in the xterm, but an input (getchar, whatever) is hit > before that output is displayed? With or without ptrace? Of course. It's called typeahead. Since tty i/o is buffered, the following is possible: 1. The user types '15\r' 2. The process writes 'How old are you?' 3. The process reads '15\n' Processes that don't want typeahead call tcflush() before reading input. Regards, Peter Hurley > Thanks, > Nic > > > > -Original Message- > From: Peter Hurley [mailto:pe...@hurleysoftware.com] > Sent: 05 May 2015 12:19 > To: Nic Percival; Michael Matz > Cc: NeilBrown; Greg Kroah-Hartman; Jiri Slaby; linux-kernel@vger.kernel.org > Subject: Re: [PATCH bisected regression] input_available_p() sometimes says > 'no' when it should say 'yes' > > > A: No. > Q: Should I include quotations after my reply? > > http://daringfireball.net/2007/07/on_top > > > On 05/05/2015 04:20 AM, Nic Percival wrote: >> Michael is correct. >> Our COBOL debugger has a test feature whereby we can drive it to step >> through debugging code, hitting breakpoints and so on. >> The debugger maintains a 'user screen' which is what the 'debuggee' process >> has displayed. >> This is communicated to the debugger with pseudo-tty's. >> The state of this user screen is checked as part of this (and other) tests. > > So the debugger doesn't display output from other non-TRACEME threads or > child processes of the debuggee, right? > > When that's fixed, you'll see that the "test failure" has gone away. > >> The actual test failure is a failure of some text to be displayed on the >> debuggee user screen when we know, given it has hit a certain breakpoint, >> that the text has been written. >> >> What is worse is its non-deterministic. > > That your test is non-deterministic stems from the fact that the i/o is > asynchronous. > > You would experience the same problem if your test setup was a tty in > loopback. > >> Sometimes the text makes it and is displayed, so it wouldn't even be >> practical to modify the test to make it pass. >> We wouldn't really want to do that anyway - the test is just fine on other >> earlier SUSE, on RedHat (intel and 390), HP/UX, AIX and Solaris. > > There is a reason Linux is the platform of choice for scalability. > > Regards, > Peter Hurley > >> -----Original Message- >> From: Michael Matz [mailto:m...@suse.de] >> Sent: 04 May 2015 13:24 >> To: Peter Hurley >> Cc: NeilBrown; Nic Percival; Greg Kroah-Hartman; Jiri Slaby; >> linux-kernel@vger.kernel.org >> Subject: Re: [PATCH bisected regression] input_available_p() sometimes says >> 'no' when it should say 'yes' >> >
RE: [PATCH bisected regression] input_available_p() sometimes says 'no' when it should say 'yes'
There is only ever one debuggee process. My original demo (and indeed the original test failure) is not threaded. The debugger is multi-threaded. I've brought in Chris, Fletch and Paul, my immediate colleagues, into the discussion. The email thread is getting a little tangled, however, from my standpoint I have.. 1) poll tells us we have nothing to read on a pty, when we know something was written into the other end. 2) Given that 'poll' is not telling us that data has been written into the pty, what can we use? Surely that is what poll is for. 3) If a debuggee program has displayed 'how old are you?' and then hit a breakpoint on the 'ACCEPT' response, then the question might very well not be displayed, despite the debugger sitting on the statement some way subsequent to the display. 4) If I understand correctly, the modification is a performance enhancement. Obviously in the case of 'ptrace' debugging, performance is not a requirement. 5) Given 'xterm' use pty's, could a scenario happen where a user is prompted 'How old are you?' in the xterm, but an input (getchar, whatever) is hit before that output is displayed? With or without ptrace? Thanks, Nic -Original Message- From: Peter Hurley [mailto:pe...@hurleysoftware.com] Sent: 05 May 2015 12:19 To: Nic Percival; Michael Matz Cc: NeilBrown; Greg Kroah-Hartman; Jiri Slaby; linux-kernel@vger.kernel.org Subject: Re: [PATCH bisected regression] input_available_p() sometimes says 'no' when it should say 'yes' A: No. Q: Should I include quotations after my reply? http://daringfireball.net/2007/07/on_top On 05/05/2015 04:20 AM, Nic Percival wrote: > Michael is correct. > Our COBOL debugger has a test feature whereby we can drive it to step through > debugging code, hitting breakpoints and so on. > The debugger maintains a 'user screen' which is what the 'debuggee' process > has displayed. > This is communicated to the debugger with pseudo-tty's. > The state of this user screen is checked as part of this (and other) tests. So the debugger doesn't display output from other non-TRACEME threads or child processes of the debuggee, right? When that's fixed, you'll see that the "test failure" has gone away. > The actual test failure is a failure of some text to be displayed on the > debuggee user screen when we know, given it has hit a certain breakpoint, > that the text has been written. > > What is worse is its non-deterministic. That your test is non-deterministic stems from the fact that the i/o is asynchronous. You would experience the same problem if your test setup was a tty in loopback. > Sometimes the text makes it and is displayed, so it wouldn't even be > practical to modify the test to make it pass. > We wouldn't really want to do that anyway - the test is just fine on other > earlier SUSE, on RedHat (intel and 390), HP/UX, AIX and Solaris. There is a reason Linux is the platform of choice for scalability. Regards, Peter Hurley > -Original Message- > From: Michael Matz [mailto:m...@suse.de] > Sent: 04 May 2015 13:24 > To: Peter Hurley > Cc: NeilBrown; Nic Percival; Greg Kroah-Hartman; Jiri Slaby; > linux-kernel@vger.kernel.org > Subject: Re: [PATCH bisected regression] input_available_p() sometimes says > 'no' when it should say 'yes' > > Hi, > > On Fri, 1 May 2015, Peter Hurley wrote: > >> I don't think this a real bug, in the sense that pty i/o is not >> synchronous, in the same way that tty i/o is not synchronous. > > Here's what I wrote internally about my speculations about this being a bug > or not: > >>> I also never hit it with pipes (remove the USEPTY define), also not >>> on sle12, so it must be some change specific to the pty implementation. >>> >>> Now, all of this is of course unspecified. There are two >>> asynchronous processes involved, and a buffered tube between them. >>> Just because one process filled one end of the tube (the breakpoint >>> was hit) doesn't mean the contents have to appear at that instant at >>> the other end. So the change in behaviour in sle12 is not a genuine >>> bug. It _might_ be an unintented change, though, that's why kernel >>> people should comment on this. If there are no terribly good >>> reasons for this change I'd consider it a quality-of-implementation >>> regression in sle12. > > So, I'd accept this being declared a non-bug, but it is certainly a change in > behaviour that's visible for our debugger team. > >> However, that said, if this is a regression (regression as in "it >> broke something that used to work", not regression as in "this new >> thing I'm writing doesn't behave the way I want it to" :) ) >> >> Help me u
Re: [PATCH bisected regression] input_available_p() sometimes says 'no' when it should say 'yes'
A: No. Q: Should I include quotations after my reply? http://daringfireball.net/2007/07/on_top On 05/05/2015 04:20 AM, Nic Percival wrote: > Michael is correct. > Our COBOL debugger has a test feature whereby we can drive it to step through > debugging code, hitting breakpoints and so on. > The debugger maintains a 'user screen' which is what the 'debuggee' process > has displayed. > This is communicated to the debugger with pseudo-tty's. > The state of this user screen is checked as part of this (and other) tests. So the debugger doesn't display output from other non-TRACEME threads or child processes of the debuggee, right? When that's fixed, you'll see that the "test failure" has gone away. > The actual test failure is a failure of some text to be displayed on the > debuggee user screen when we know, given it has hit a certain breakpoint, > that the text has been written. > > What is worse is its non-deterministic. That your test is non-deterministic stems from the fact that the i/o is asynchronous. You would experience the same problem if your test setup was a tty in loopback. > Sometimes the text makes it and is displayed, so it wouldn't even be > practical to modify the test to make it pass. > We wouldn't really want to do that anyway - the test is just fine on other > earlier SUSE, on RedHat (intel and 390), HP/UX, AIX and Solaris. There is a reason Linux is the platform of choice for scalability. Regards, Peter Hurley > -Original Message- > From: Michael Matz [mailto:m...@suse.de] > Sent: 04 May 2015 13:24 > To: Peter Hurley > Cc: NeilBrown; Nic Percival; Greg Kroah-Hartman; Jiri Slaby; > linux-kernel@vger.kernel.org > Subject: Re: [PATCH bisected regression] input_available_p() sometimes says > 'no' when it should say 'yes' > > Hi, > > On Fri, 1 May 2015, Peter Hurley wrote: > >> I don't think this a real bug, in the sense that pty i/o is not >> synchronous, in the same way that tty i/o is not synchronous. > > Here's what I wrote internally about my speculations about this being a bug > or not: > >>> I also never hit it with pipes (remove the USEPTY define), also not >>> on sle12, so it must be some change specific to the pty implementation. >>> >>> Now, all of this is of course unspecified. There are two >>> asynchronous processes involved, and a buffered tube between them. >>> Just because one process filled one end of the tube (the breakpoint >>> was hit) doesn't mean the contents have to appear at that instant at >>> the other end. So the change in behaviour in sle12 is not a genuine >>> bug. It _might_ be an unintented change, though, that's why kernel >>> people should comment on this. If there are no terribly good >>> reasons for this change I'd consider it a quality-of-implementation >>> regression in sle12. > > So, I'd accept this being declared a non-bug, but it is certainly a change in > behaviour that's visible for our debugger team. > >> However, that said, if this is a regression (regression as in "it >> broke something that used to work", not regression as in "this new >> thing I'm writing doesn't behave the way I want it to" :) ) >> >> Help me understand the use-case here: are you using pty i/o to debug >> the debugger? > > Nic is working on the Cobol debugger, but I think this pty i/o is rather a > part of the normal interaction between a debugged Cobol process and the > debugger; that's just a theory, Nic is authorative here. But this change in > behaviour _did_ result in real testsuite regressions, so it's not something > that he wanted to write from scratch. > > (FWIW: I do think it's a better QoI factor if something returns data from a > tube if we can know via side channels (break points) that something must have > been written locally to the other end of the tube, if that can be ensured > without too much other work) > > > Ciao, > Michael. > > > This message has been scanned for malware by Websense. www.websense.com > -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: [PATCH bisected regression] input_available_p() sometimes says 'no' when it should say 'yes'
Michael is correct. Our COBOL debugger has a test feature whereby we can drive it to step through debugging code, hitting breakpoints and so on. The debugger maintains a 'user screen' which is what the 'debuggee' process has displayed. This is communicated to the debugger with pseudo-tty's. The state of this user screen is checked as part of this (and other) tests. The actual test failure is a failure of some text to be displayed on the debuggee user screen when we know, given it has hit a certain breakpoint, that the text has been written. What is worse is its non-deterministic. Sometimes the text makes it and is displayed, so it wouldn't even be practical to modify the test to make it pass. We wouldn't really want to do that anyway - the test is just fine on other earlier SUSE, on RedHat (intel and 390), HP/UX, AIX and Solaris. Thanks, Nic -Original Message- From: Michael Matz [mailto:m...@suse.de] Sent: 04 May 2015 13:24 To: Peter Hurley Cc: NeilBrown; Nic Percival; Greg Kroah-Hartman; Jiri Slaby; linux-kernel@vger.kernel.org Subject: Re: [PATCH bisected regression] input_available_p() sometimes says 'no' when it should say 'yes' Hi, On Fri, 1 May 2015, Peter Hurley wrote: > I don't think this a real bug, in the sense that pty i/o is not > synchronous, in the same way that tty i/o is not synchronous. Here's what I wrote internally about my speculations about this being a bug or not: > > I also never hit it with pipes (remove the USEPTY define), also not > > on sle12, so it must be some change specific to the pty implementation. > > > > Now, all of this is of course unspecified. There are two > > asynchronous processes involved, and a buffered tube between them. > > Just because one process filled one end of the tube (the breakpoint > > was hit) doesn't mean the contents have to appear at that instant at > > the other end. So the change in behaviour in sle12 is not a genuine > > bug. It _might_ be an unintented change, though, that's why kernel > > people should comment on this. If there are no terribly good > > reasons for this change I'd consider it a quality-of-implementation > > regression in sle12. So, I'd accept this being declared a non-bug, but it is certainly a change in behaviour that's visible for our debugger team. > However, that said, if this is a regression (regression as in "it > broke something that used to work", not regression as in "this new > thing I'm writing doesn't behave the way I want it to" :) ) > > Help me understand the use-case here: are you using pty i/o to debug > the debugger? Nic is working on the Cobol debugger, but I think this pty i/o is rather a part of the normal interaction between a debugged Cobol process and the debugger; that's just a theory, Nic is authorative here. But this change in behaviour _did_ result in real testsuite regressions, so it's not something that he wanted to write from scratch. (FWIW: I do think it's a better QoI factor if something returns data from a tube if we can know via side channels (break points) that something must have been written locally to the other end of the tube, if that can be ensured without too much other work) Ciao, Michael. This message has been scanned for malware by Websense. www.websense.com -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH bisected regression] input_available_p() sometimes says 'no' when it should say 'yes'
Stop top-posting. On 05/05/2015 08:03 AM, Nic Percival wrote: There is only ever one debuggee process. My original demo (and indeed the original test failure) is not threaded. The debugger is multi-threaded. I've brought in Chris, Fletch and Paul, my immediate colleagues, into the discussion. The email thread is getting a little tangled, however, from my standpoint I have.. 1) poll tells us we have nothing to read on a pty, when we know something was written into the other end. You're using a synchronization mechanism (ptrace) to validate an asynchronous process (tty i/o). That's not going to work. 2) Given that 'poll' is not telling us that data has been written into the pty, what can we use? Surely that is what poll is for. poll() doesn't tell you that nothing has been written. You're inferring that using a broken understanding of terminal i/o: ttys are not synchronous pipes. 3) If a debuggee program has displayed 'how old are you?' and then hit a breakpoint on the 'ACCEPT' response, then the question might very well not be displayed, despite the debugger sitting on the statement some way subsequent to the display. Let's extend your logic process here to a general-purpose debugger that can control all output devices. 1. The debugger and debuggee are running on X-Windows. 2. The debuggee outputs 'how old are you?' 3. The debugger immediately halts the debuggee and all output devices. The output will not appear on the monitor because X-Windows output is asynchronous. So is terminal i/o. 4) If I understand correctly, the modification is a performance enhancement. Obviously in the case of 'ptrace' debugging, performance is not a requirement. Nothing obvious about it. Not all uses of ptrace are interactive, and certainly don't want alternate behavior based on whether the process is ptraced. 5) Given 'xterm' use pty's, could a scenario happen where a user is prompted 'How old are you?' in the xterm, but an input (getchar, whatever) is hit before that output is displayed? With or without ptrace? Of course. It's called typeahead. Since tty i/o is buffered, the following is possible: 1. The user types '15\r' 2. The process writes 'How old are you?' 3. The process reads '15\n' Processes that don't want typeahead call tcflush() before reading input. Regards, Peter Hurley Thanks, Nic -Original Message- From: Peter Hurley [mailto:pe...@hurleysoftware.com] Sent: 05 May 2015 12:19 To: Nic Percival; Michael Matz Cc: NeilBrown; Greg Kroah-Hartman; Jiri Slaby; linux-kernel@vger.kernel.org Subject: Re: [PATCH bisected regression] input_available_p() sometimes says 'no' when it should say 'yes' A: No. Q: Should I include quotations after my reply? http://daringfireball.net/2007/07/on_top On 05/05/2015 04:20 AM, Nic Percival wrote: Michael is correct. Our COBOL debugger has a test feature whereby we can drive it to step through debugging code, hitting breakpoints and so on. The debugger maintains a 'user screen' which is what the 'debuggee' process has displayed. This is communicated to the debugger with pseudo-tty's. The state of this user screen is checked as part of this (and other) tests. So the debugger doesn't display output from other non-TRACEME threads or child processes of the debuggee, right? When that's fixed, you'll see that the test failure has gone away. The actual test failure is a failure of some text to be displayed on the debuggee user screen when we know, given it has hit a certain breakpoint, that the text has been written. What is worse is its non-deterministic. That your test is non-deterministic stems from the fact that the i/o is asynchronous. You would experience the same problem if your test setup was a tty in loopback. Sometimes the text makes it and is displayed, so it wouldn't even be practical to modify the test to make it pass. We wouldn't really want to do that anyway - the test is just fine on other earlier SUSE, on RedHat (intel and 390), HP/UX, AIX and Solaris. There is a reason Linux is the platform of choice for scalability. Regards, Peter Hurley -Original Message- From: Michael Matz [mailto:m...@suse.de] Sent: 04 May 2015 13:24 To: Peter Hurley Cc: NeilBrown; Nic Percival; Greg Kroah-Hartman; Jiri Slaby; linux-kernel@vger.kernel.org Subject: Re: [PATCH bisected regression] input_available_p() sometimes says 'no' when it should say 'yes' Hi, On Fri, 1 May 2015, Peter Hurley wrote: I don't think this a real bug, in the sense that pty i/o is not synchronous, in the same way that tty i/o is not synchronous. Here's what I wrote internally about my speculations about this being a bug or not: I also never hit it with pipes (remove the USEPTY define), also not on sle12, so it must be some change specific to the pty implementation. Now, all of this is of course unspecified. There are two asynchronous
RE: [PATCH bisected regression] input_available_p() sometimes says 'no' when it should say 'yes'
A way of giving context. This message has been scanned for malware by Websense. www.websense.com N�r��yb�X��ǧv�^�){.n�+{zX����ܨ}���Ơz�j:+v���zZ+��+zf���h���~i���z��w���?��)ߢf��^jǫy�m��@A�a��� 0��h���i
Re: [PATCH bisected regression] input_available_p() sometimes says 'no' when it should say 'yes'
On 05/05/2015 09:34 AM, Chris Purvis wrote: Peter, So is calling tcflush() a solution here? Regards, Chris What is with the top-posting? -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: [PATCH bisected regression] input_available_p() sometimes says 'no' when it should say 'yes'
Apologies all. Nic and I did not realise that an internal email had been cross-posted to a public mail list (let alone one with strict email formatting rules!), and were having a hard time understanding the context for the emails we were receiving. I have a certain amount of experience of asynchronous communication and protocol design: we aren't novices in this area. One would hope in the kind of intra-machine protocol that we're using here that, if we know the sender is halted, there should be a way of clearing the contents of the channel so that the receiver can get hold of whatever has been put in to it. Can tcflush() (or some similar API) be used to resolve our debugging scenario? Regards, Chris -- Chris Purvis Senior Development Manager Micro Focus chris.pur...@microfocus.com The Lawn, 22-30 Old Bath Road Newbury, Berkshire, RG14 1QN, UK Direct:+44 1635 565282 This message has been scanned for malware by Websense. www.websense.com
RE: [PATCH bisected regression] input_available_p() sometimes says 'no' when it should say 'yes'
Peter, So is calling tcflush() a solution here? Regards, Chris -- Chris Purvis Senior Development Manager Micro Focus chris.pur...@microfocus.com The Lawn, 22-30 Old Bath Road Newbury, Berkshire, RG14 1QN, UK Direct:+44 1635 565282 -Original Message- From: Peter Hurley [mailto:pe...@hurleysoftware.com] Sent: 05 May 2015 14:29 To: Nic Percival; Michael Matz; Kevin Fletcher; Paul Matthews; Chris Purvis Cc: NeilBrown; Greg Kroah-Hartman; Jiri Slaby; linux-kernel@vger.kernel.org Subject: Re: [PATCH bisected regression] input_available_p() sometimes says 'no' when it should say 'yes' Stop top-posting. On 05/05/2015 08:03 AM, Nic Percival wrote: There is only ever one debuggee process. My original demo (and indeed the original test failure) is not threaded. The debugger is multi-threaded. I've brought in Chris, Fletch and Paul, my immediate colleagues, into the discussion. The email thread is getting a little tangled, however, from my standpoint I have.. 1) poll tells us we have nothing to read on a pty, when we know something was written into the other end. You're using a synchronization mechanism (ptrace) to validate an asynchronous process (tty i/o). That's not going to work. 2) Given that 'poll' is not telling us that data has been written into the pty, what can we use? Surely that is what poll is for. poll() doesn't tell you that nothing has been written. You're inferring that using a broken understanding of terminal i/o: ttys are not synchronous pipes. 3) If a debuggee program has displayed 'how old are you?' and then hit a breakpoint on the 'ACCEPT' response, then the question might very well not be displayed, despite the debugger sitting on the statement some way subsequent to the display. Let's extend your logic process here to a general-purpose debugger that can control all output devices. 1. The debugger and debuggee are running on X-Windows. 2. The debuggee outputs 'how old are you?' 3. The debugger immediately halts the debuggee and all output devices. The output will not appear on the monitor because X-Windows output is asynchronous. So is terminal i/o. 4) If I understand correctly, the modification is a performance enhancement. Obviously in the case of 'ptrace' debugging, performance is not a requirement. Nothing obvious about it. Not all uses of ptrace are interactive, and certainly don't want alternate behavior based on whether the process is ptraced. 5) Given 'xterm' use pty's, could a scenario happen where a user is prompted 'How old are you?' in the xterm, but an input (getchar, whatever) is hit before that output is displayed? With or without ptrace? Of course. It's called typeahead. Since tty i/o is buffered, the following is possible: 1. The user types '15\r' 2. The process writes 'How old are you?' 3. The process reads '15\n' Processes that don't want typeahead call tcflush() before reading input. Regards, Peter Hurley Thanks, Nic -Original Message- From: Peter Hurley [mailto:pe...@hurleysoftware.com] Sent: 05 May 2015 12:19 To: Nic Percival; Michael Matz Cc: NeilBrown; Greg Kroah-Hartman; Jiri Slaby; linux-kernel@vger.kernel.org Subject: Re: [PATCH bisected regression] input_available_p() sometimes says 'no' when it should say 'yes' A: No. Q: Should I include quotations after my reply? http://daringfireball.net/2007/07/on_top On 05/05/2015 04:20 AM, Nic Percival wrote: Michael is correct. Our COBOL debugger has a test feature whereby we can drive it to step through debugging code, hitting breakpoints and so on. The debugger maintains a 'user screen' which is what the 'debuggee' process has displayed. This is communicated to the debugger with pseudo-tty's. The state of this user screen is checked as part of this (and other) tests. So the debugger doesn't display output from other non-TRACEME threads or child processes of the debuggee, right? When that's fixed, you'll see that the test failure has gone away. The actual test failure is a failure of some text to be displayed on the debuggee user screen when we know, given it has hit a certain breakpoint, that the text has been written. What is worse is its non-deterministic. That your test is non-deterministic stems from the fact that the i/o is asynchronous. You would experience the same problem if your test setup was a tty in loopback. Sometimes the text makes it and is displayed, so it wouldn't even be practical to modify the test to make it pass. We wouldn't really want to do that anyway - the test is just fine on other earlier SUSE, on RedHat (intel and 390), HP/UX, AIX and Solaris. There is a reason Linux is the platform of choice for scalability. Regards, Peter Hurley -Original Message- From: Michael Matz [mailto:m...@suse.de] Sent: 04 May 2015 13:24 To: Peter Hurley Cc: NeilBrown; Nic Percival; Greg Kroah-Hartman; Jiri Slaby; linux-kernel
RE: [PATCH bisected regression] input_available_p() sometimes says 'no' when it should say 'yes'
There is only ever one debuggee process. My original demo (and indeed the original test failure) is not threaded. The debugger is multi-threaded. I've brought in Chris, Fletch and Paul, my immediate colleagues, into the discussion. The email thread is getting a little tangled, however, from my standpoint I have.. 1) poll tells us we have nothing to read on a pty, when we know something was written into the other end. 2) Given that 'poll' is not telling us that data has been written into the pty, what can we use? Surely that is what poll is for. 3) If a debuggee program has displayed 'how old are you?' and then hit a breakpoint on the 'ACCEPT' response, then the question might very well not be displayed, despite the debugger sitting on the statement some way subsequent to the display. 4) If I understand correctly, the modification is a performance enhancement. Obviously in the case of 'ptrace' debugging, performance is not a requirement. 5) Given 'xterm' use pty's, could a scenario happen where a user is prompted 'How old are you?' in the xterm, but an input (getchar, whatever) is hit before that output is displayed? With or without ptrace? Thanks, Nic -Original Message- From: Peter Hurley [mailto:pe...@hurleysoftware.com] Sent: 05 May 2015 12:19 To: Nic Percival; Michael Matz Cc: NeilBrown; Greg Kroah-Hartman; Jiri Slaby; linux-kernel@vger.kernel.org Subject: Re: [PATCH bisected regression] input_available_p() sometimes says 'no' when it should say 'yes' A: No. Q: Should I include quotations after my reply? http://daringfireball.net/2007/07/on_top On 05/05/2015 04:20 AM, Nic Percival wrote: Michael is correct. Our COBOL debugger has a test feature whereby we can drive it to step through debugging code, hitting breakpoints and so on. The debugger maintains a 'user screen' which is what the 'debuggee' process has displayed. This is communicated to the debugger with pseudo-tty's. The state of this user screen is checked as part of this (and other) tests. So the debugger doesn't display output from other non-TRACEME threads or child processes of the debuggee, right? When that's fixed, you'll see that the test failure has gone away. The actual test failure is a failure of some text to be displayed on the debuggee user screen when we know, given it has hit a certain breakpoint, that the text has been written. What is worse is its non-deterministic. That your test is non-deterministic stems from the fact that the i/o is asynchronous. You would experience the same problem if your test setup was a tty in loopback. Sometimes the text makes it and is displayed, so it wouldn't even be practical to modify the test to make it pass. We wouldn't really want to do that anyway - the test is just fine on other earlier SUSE, on RedHat (intel and 390), HP/UX, AIX and Solaris. There is a reason Linux is the platform of choice for scalability. Regards, Peter Hurley -Original Message- From: Michael Matz [mailto:m...@suse.de] Sent: 04 May 2015 13:24 To: Peter Hurley Cc: NeilBrown; Nic Percival; Greg Kroah-Hartman; Jiri Slaby; linux-kernel@vger.kernel.org Subject: Re: [PATCH bisected regression] input_available_p() sometimes says 'no' when it should say 'yes' Hi, On Fri, 1 May 2015, Peter Hurley wrote: I don't think this a real bug, in the sense that pty i/o is not synchronous, in the same way that tty i/o is not synchronous. Here's what I wrote internally about my speculations about this being a bug or not: I also never hit it with pipes (remove the USEPTY define), also not on sle12, so it must be some change specific to the pty implementation. Now, all of this is of course unspecified. There are two asynchronous processes involved, and a buffered tube between them. Just because one process filled one end of the tube (the breakpoint was hit) doesn't mean the contents have to appear at that instant at the other end. So the change in behaviour in sle12 is not a genuine bug. It _might_ be an unintented change, though, that's why kernel people should comment on this. If there are no terribly good reasons for this change I'd consider it a quality-of-implementation regression in sle12. So, I'd accept this being declared a non-bug, but it is certainly a change in behaviour that's visible for our debugger team. However, that said, if this is a regression (regression as in it broke something that used to work, not regression as in this new thing I'm writing doesn't behave the way I want it to :) ) Help me understand the use-case here: are you using pty i/o to debug the debugger? Nic is working on the Cobol debugger, but I think this pty i/o is rather a part of the normal interaction between a debugged Cobol process and the debugger; that's just a theory, Nic is authorative here. But this change in behaviour _did_ result in real testsuite regressions, so it's not something that he
RE: [PATCH bisected regression] input_available_p() sometimes says 'no' when it should say 'yes'
Michael is correct. Our COBOL debugger has a test feature whereby we can drive it to step through debugging code, hitting breakpoints and so on. The debugger maintains a 'user screen' which is what the 'debuggee' process has displayed. This is communicated to the debugger with pseudo-tty's. The state of this user screen is checked as part of this (and other) tests. The actual test failure is a failure of some text to be displayed on the debuggee user screen when we know, given it has hit a certain breakpoint, that the text has been written. What is worse is its non-deterministic. Sometimes the text makes it and is displayed, so it wouldn't even be practical to modify the test to make it pass. We wouldn't really want to do that anyway - the test is just fine on other earlier SUSE, on RedHat (intel and 390), HP/UX, AIX and Solaris. Thanks, Nic -Original Message- From: Michael Matz [mailto:m...@suse.de] Sent: 04 May 2015 13:24 To: Peter Hurley Cc: NeilBrown; Nic Percival; Greg Kroah-Hartman; Jiri Slaby; linux-kernel@vger.kernel.org Subject: Re: [PATCH bisected regression] input_available_p() sometimes says 'no' when it should say 'yes' Hi, On Fri, 1 May 2015, Peter Hurley wrote: I don't think this a real bug, in the sense that pty i/o is not synchronous, in the same way that tty i/o is not synchronous. Here's what I wrote internally about my speculations about this being a bug or not: I also never hit it with pipes (remove the USEPTY define), also not on sle12, so it must be some change specific to the pty implementation. Now, all of this is of course unspecified. There are two asynchronous processes involved, and a buffered tube between them. Just because one process filled one end of the tube (the breakpoint was hit) doesn't mean the contents have to appear at that instant at the other end. So the change in behaviour in sle12 is not a genuine bug. It _might_ be an unintented change, though, that's why kernel people should comment on this. If there are no terribly good reasons for this change I'd consider it a quality-of-implementation regression in sle12. So, I'd accept this being declared a non-bug, but it is certainly a change in behaviour that's visible for our debugger team. However, that said, if this is a regression (regression as in it broke something that used to work, not regression as in this new thing I'm writing doesn't behave the way I want it to :) ) Help me understand the use-case here: are you using pty i/o to debug the debugger? Nic is working on the Cobol debugger, but I think this pty i/o is rather a part of the normal interaction between a debugged Cobol process and the debugger; that's just a theory, Nic is authorative here. But this change in behaviour _did_ result in real testsuite regressions, so it's not something that he wanted to write from scratch. (FWIW: I do think it's a better QoI factor if something returns data from a tube if we can know via side channels (break points) that something must have been written locally to the other end of the tube, if that can be ensured without too much other work) Ciao, Michael. This message has been scanned for malware by Websense. www.websense.com -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH bisected regression] input_available_p() sometimes says 'no' when it should say 'yes'
A: No. Q: Should I include quotations after my reply? http://daringfireball.net/2007/07/on_top On 05/05/2015 04:20 AM, Nic Percival wrote: Michael is correct. Our COBOL debugger has a test feature whereby we can drive it to step through debugging code, hitting breakpoints and so on. The debugger maintains a 'user screen' which is what the 'debuggee' process has displayed. This is communicated to the debugger with pseudo-tty's. The state of this user screen is checked as part of this (and other) tests. So the debugger doesn't display output from other non-TRACEME threads or child processes of the debuggee, right? When that's fixed, you'll see that the test failure has gone away. The actual test failure is a failure of some text to be displayed on the debuggee user screen when we know, given it has hit a certain breakpoint, that the text has been written. What is worse is its non-deterministic. That your test is non-deterministic stems from the fact that the i/o is asynchronous. You would experience the same problem if your test setup was a tty in loopback. Sometimes the text makes it and is displayed, so it wouldn't even be practical to modify the test to make it pass. We wouldn't really want to do that anyway - the test is just fine on other earlier SUSE, on RedHat (intel and 390), HP/UX, AIX and Solaris. There is a reason Linux is the platform of choice for scalability. Regards, Peter Hurley -Original Message- From: Michael Matz [mailto:m...@suse.de] Sent: 04 May 2015 13:24 To: Peter Hurley Cc: NeilBrown; Nic Percival; Greg Kroah-Hartman; Jiri Slaby; linux-kernel@vger.kernel.org Subject: Re: [PATCH bisected regression] input_available_p() sometimes says 'no' when it should say 'yes' Hi, On Fri, 1 May 2015, Peter Hurley wrote: I don't think this a real bug, in the sense that pty i/o is not synchronous, in the same way that tty i/o is not synchronous. Here's what I wrote internally about my speculations about this being a bug or not: I also never hit it with pipes (remove the USEPTY define), also not on sle12, so it must be some change specific to the pty implementation. Now, all of this is of course unspecified. There are two asynchronous processes involved, and a buffered tube between them. Just because one process filled one end of the tube (the breakpoint was hit) doesn't mean the contents have to appear at that instant at the other end. So the change in behaviour in sle12 is not a genuine bug. It _might_ be an unintented change, though, that's why kernel people should comment on this. If there are no terribly good reasons for this change I'd consider it a quality-of-implementation regression in sle12. So, I'd accept this being declared a non-bug, but it is certainly a change in behaviour that's visible for our debugger team. However, that said, if this is a regression (regression as in it broke something that used to work, not regression as in this new thing I'm writing doesn't behave the way I want it to :) ) Help me understand the use-case here: are you using pty i/o to debug the debugger? Nic is working on the Cobol debugger, but I think this pty i/o is rather a part of the normal interaction between a debugged Cobol process and the debugger; that's just a theory, Nic is authorative here. But this change in behaviour _did_ result in real testsuite regressions, so it's not something that he wanted to write from scratch. (FWIW: I do think it's a better QoI factor if something returns data from a tube if we can know via side channels (break points) that something must have been written locally to the other end of the tube, if that can be ensured without too much other work) Ciao, Michael. This message has been scanned for malware by Websense. www.websense.com -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH bisected regression] input_available_p() sometimes says 'no' when it should say 'yes'
On 05/04/2015 12:56 PM, Michael Matz wrote: > Hi, > > On Mon, 4 May 2015, Peter Hurley wrote: > >> I think it would be a shame if ptrace() usage forced a whole class of >> i/o to be synchronous. > > I think Neils patch doesn't do that, does it? Yes. Non-blocking i/o would now have to wait until the input worker has completed (at a time when the input worker may not even be running on a cpu yet). > If it has an indication > that in fact some data must be there but isn't it pulls it, otherwise it's > the same as after your patch/current state? At the point when the write() returns from the child process, the input worker may not even have run yet. The patch will now force the reader to wait until the input worker has started and run to completion. Moreover, prior to 4.1, the read i/o loop is sloppy wrt kicking the the input worker, so there is every likelihood that this patch would force extra waits on a non-blocking reader even though no input was actually being written. > (Leaving the debugger question to Nic; but I guess it's similar interface > like gdb. Once you come back into the debugger (breakpoint hit) it looks > once for input on the tubes of the debuggee and then enters a prompt; it > doesn't continue looking for input until you continue the debuggee (1). That's exactly what gdb does. Below is simple test jig [2] where the child outputs while at the gdb prompt. > ptys would be used because it's Cobol, the programs contain data entry > masks presumably needing a real tty, not just pipes. That usecase would > be broken now; the tty provided by the debugger doesn't reflect the real > screen that the debuggee actually generated before the breakpoint. Note > how pipes in my test program _are_ synchronuous in this sense, just ptys > aren't.) What's interesting about that expectation is that it would never work on an actual tty. > (1) And in a single threaded debugger (no matter if the debuggee is > multithreaded) it would be awkward to implement. After read returns 0 > you'd again call poll, which indicates data is there, and read again. > You repeat that until $SOMEWHEN. But when is it enough? Two loops, 10, > 1000? To not sit in a tight loop you'd also add some nanosleeps, but that > would add unnecessary lags. That's not what's happening. poll() with a timeout of 0 returns immediately, even if no file descriptors are ready for i/o. The poll() is returning 0 because there is no data to read, so the loop is exiting. If poll() returned non-zero and read() returned no data, that would definitely be a bug. > Basically, whenever poll indicates that read won't block then it should > also return some data, not 0, if at all reasonably implementable; i.e. > some progress should be guaranteed. ttys have a further refinement of the behavior of read() and poll(), which is controlled by the VMIN and VTIME values in the termios structure. But note: the pty master does not allow its termios to be programmed -- a pty master read() is always non-blocking. > I realize that this isn't always the > case, but here it is. In code, this loop: > > while (poll ([fd, POLLIN], 0) == 1) > // So, read won't block, yippie > if (read (fd, ...) == 0) > continue; > > shouldn't become a tight loop, without the read making progress but the > kernel continuously stating "yep, there's data available", until some > random point in the future. Yeah, that would be broken but, again, that's not what's happening here. Regards, Peter Hurley --- >% --- #include #include #include #include #include #define BRKPT asm("int $3") void child() { sleep(1); printf("Hello, world!"); } int main() { int child_id; setbuf(stdout, NULL); child_id = fork(); switch (child_id) { case -1: printf("fork: %s (code: %d)\n", strerror(errno), errno); exit(EXIT_FAILURE); case 0: child(); break; default: /* parent */ BRKPT; break; } return 0; } -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH bisected regression] input_available_p() sometimes says 'no' when it should say 'yes'
Hi, On Mon, 4 May 2015, Peter Hurley wrote: > I think it would be a shame if ptrace() usage forced a whole class of > i/o to be synchronous. I think Neils patch doesn't do that, does it? If it has an indication that in fact some data must be there but isn't it pulls it, otherwise it's the same as after your patch/current state? (Leaving the debugger question to Nic; but I guess it's similar interface like gdb. Once you come back into the debugger (breakpoint hit) it looks once for input on the tubes of the debuggee and then enters a prompt; it doesn't continue looking for input until you continue the debuggee (1). ptys would be used because it's Cobol, the programs contain data entry masks presumably needing a real tty, not just pipes. That usecase would be broken now; the tty provided by the debugger doesn't reflect the real screen that the debuggee actually generated before the breakpoint. Note how pipes in my test program _are_ synchronuous in this sense, just ptys aren't.) Ciao, Michael. (1) And in a single threaded debugger (no matter if the debuggee is multithreaded) it would be awkward to implement. After read returns 0 you'd again call poll, which indicates data is there, and read again. You repeat that until $SOMEWHEN. But when is it enough? Two loops, 10, 1000? To not sit in a tight loop you'd also add some nanosleeps, but that would add unnecessary lags. Basically, whenever poll indicates that read won't block then it should also return some data, not 0, if at all reasonably implementable; i.e. some progress should be guaranteed. I realize that this isn't always the case, but here it is. In code, this loop: while (poll ([fd, POLLIN], 0) == 1) // So, read won't block, yippie if (read (fd, ...) == 0) continue; shouldn't become a tight loop, without the read making progress but the kernel continuously stating "yep, there's data available", until some random point in the future. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH bisected regression] input_available_p() sometimes says 'no' when it should say 'yes'
On 05/04/2015 08:24 AM, Michael Matz wrote: > Hi, > > On Fri, 1 May 2015, Peter Hurley wrote: > >> I don't think this a real bug, in the sense that pty i/o is not >> synchronous, in the same way that tty i/o is not synchronous. > > Here's what I wrote internally about my speculations about this being a > bug or not: > >>> I also never hit it with pipes (remove the USEPTY define), also not on >>> sle12, so it must be some change specific to the pty implementation. >>> >>> Now, all of this is of course unspecified. There are two asynchronous >>> processes involved, and a buffered tube between them. Just because >>> one process filled one end of the tube (the breakpoint was hit) >>> doesn't mean the contents have to appear at that instant at the other >>> end. So the change in behaviour in sle12 is not a genuine bug. It >>> _might_ be an unintented change, though, that's why kernel people >>> should comment on this. If there are no terribly good reasons for >>> this change I'd consider it a quality-of-implementation regression in >>> sle12. > > So, I'd accept this being declared a non-bug, but it is certainly a change > in behaviour that's visible for our debugger team. > >> However, that said, if this is a regression (regression as in "it broke >> something that used to work", not regression as in "this new thing I'm >> writing doesn't behave the way I want it to" :) ) >> >> Help me understand the use-case here: are you using pty i/o to debug the >> debugger? > > Nic is working on the Cobol debugger, but I think this pty i/o is rather a > part of the normal interaction between a debugged Cobol process and the > debugger; that's just a theory, Nic is authorative here. But this change > in behaviour _did_ result in real testsuite regressions, so it's not > something that he wanted to write from scratch. I'd like to understand why the debugger cares about when pty i/o shows up and why there is a testsuite to check for that. Does the debuggee know about the debugger, or is the pty i/o just stdout/stderr? This doesn't seem stable in the face of multiple threads of execution in the debuggee (or grandchild processes); IOW, pty slave writes from the debuggee may continue from other non-TRACEME threads. Presumably that i/o isn't being read either. > (FWIW: I do think it's a better QoI factor if something returns data from > a tube if we can know via side channels (break points) that something must > have been written locally to the other end of the tube, if that can be > ensured without too much other work) Well, if the debugger simply continues to monitor the pty master, the i/o will arrive. I think it would be a shame if ptrace() usage forced a whole class of i/o to be synchronous. Regards, Peter Hurley -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH bisected regression] input_available_p() sometimes says 'no' when it should say 'yes'
Hi, On Fri, 1 May 2015, Peter Hurley wrote: > I don't think this a real bug, in the sense that pty i/o is not > synchronous, in the same way that tty i/o is not synchronous. Here's what I wrote internally about my speculations about this being a bug or not: > > I also never hit it with pipes (remove the USEPTY define), also not on > > sle12, so it must be some change specific to the pty implementation. > > > > Now, all of this is of course unspecified. There are two asynchronous > > processes involved, and a buffered tube between them. Just because > > one process filled one end of the tube (the breakpoint was hit) > > doesn't mean the contents have to appear at that instant at the other > > end. So the change in behaviour in sle12 is not a genuine bug. It > > _might_ be an unintented change, though, that's why kernel people > > should comment on this. If there are no terribly good reasons for > > this change I'd consider it a quality-of-implementation regression in > > sle12. So, I'd accept this being declared a non-bug, but it is certainly a change in behaviour that's visible for our debugger team. > However, that said, if this is a regression (regression as in "it broke > something that used to work", not regression as in "this new thing I'm > writing doesn't behave the way I want it to" :) ) > > Help me understand the use-case here: are you using pty i/o to debug the > debugger? Nic is working on the Cobol debugger, but I think this pty i/o is rather a part of the normal interaction between a debugged Cobol process and the debugger; that's just a theory, Nic is authorative here. But this change in behaviour _did_ result in real testsuite regressions, so it's not something that he wanted to write from scratch. (FWIW: I do think it's a better QoI factor if something returns data from a tube if we can know via side channels (break points) that something must have been written locally to the other end of the tube, if that can be ensured without too much other work) Ciao, Michael. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH bisected regression] input_available_p() sometimes says 'no' when it should say 'yes'
Hi, On Fri, 1 May 2015, Peter Hurley wrote: I don't think this a real bug, in the sense that pty i/o is not synchronous, in the same way that tty i/o is not synchronous. Here's what I wrote internally about my speculations about this being a bug or not: I also never hit it with pipes (remove the USEPTY define), also not on sle12, so it must be some change specific to the pty implementation. Now, all of this is of course unspecified. There are two asynchronous processes involved, and a buffered tube between them. Just because one process filled one end of the tube (the breakpoint was hit) doesn't mean the contents have to appear at that instant at the other end. So the change in behaviour in sle12 is not a genuine bug. It _might_ be an unintented change, though, that's why kernel people should comment on this. If there are no terribly good reasons for this change I'd consider it a quality-of-implementation regression in sle12. So, I'd accept this being declared a non-bug, but it is certainly a change in behaviour that's visible for our debugger team. However, that said, if this is a regression (regression as in it broke something that used to work, not regression as in this new thing I'm writing doesn't behave the way I want it to :) ) Help me understand the use-case here: are you using pty i/o to debug the debugger? Nic is working on the Cobol debugger, but I think this pty i/o is rather a part of the normal interaction between a debugged Cobol process and the debugger; that's just a theory, Nic is authorative here. But this change in behaviour _did_ result in real testsuite regressions, so it's not something that he wanted to write from scratch. (FWIW: I do think it's a better QoI factor if something returns data from a tube if we can know via side channels (break points) that something must have been written locally to the other end of the tube, if that can be ensured without too much other work) Ciao, Michael. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH bisected regression] input_available_p() sometimes says 'no' when it should say 'yes'
On 05/04/2015 08:24 AM, Michael Matz wrote: Hi, On Fri, 1 May 2015, Peter Hurley wrote: I don't think this a real bug, in the sense that pty i/o is not synchronous, in the same way that tty i/o is not synchronous. Here's what I wrote internally about my speculations about this being a bug or not: I also never hit it with pipes (remove the USEPTY define), also not on sle12, so it must be some change specific to the pty implementation. Now, all of this is of course unspecified. There are two asynchronous processes involved, and a buffered tube between them. Just because one process filled one end of the tube (the breakpoint was hit) doesn't mean the contents have to appear at that instant at the other end. So the change in behaviour in sle12 is not a genuine bug. It _might_ be an unintented change, though, that's why kernel people should comment on this. If there are no terribly good reasons for this change I'd consider it a quality-of-implementation regression in sle12. So, I'd accept this being declared a non-bug, but it is certainly a change in behaviour that's visible for our debugger team. However, that said, if this is a regression (regression as in it broke something that used to work, not regression as in this new thing I'm writing doesn't behave the way I want it to :) ) Help me understand the use-case here: are you using pty i/o to debug the debugger? Nic is working on the Cobol debugger, but I think this pty i/o is rather a part of the normal interaction between a debugged Cobol process and the debugger; that's just a theory, Nic is authorative here. But this change in behaviour _did_ result in real testsuite regressions, so it's not something that he wanted to write from scratch. I'd like to understand why the debugger cares about when pty i/o shows up and why there is a testsuite to check for that. Does the debuggee know about the debugger, or is the pty i/o just stdout/stderr? This doesn't seem stable in the face of multiple threads of execution in the debuggee (or grandchild processes); IOW, pty slave writes from the debuggee may continue from other non-TRACEME threads. Presumably that i/o isn't being read either. (FWIW: I do think it's a better QoI factor if something returns data from a tube if we can know via side channels (break points) that something must have been written locally to the other end of the tube, if that can be ensured without too much other work) Well, if the debugger simply continues to monitor the pty master, the i/o will arrive. I think it would be a shame if ptrace() usage forced a whole class of i/o to be synchronous. Regards, Peter Hurley -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH bisected regression] input_available_p() sometimes says 'no' when it should say 'yes'
Hi, On Mon, 4 May 2015, Peter Hurley wrote: I think it would be a shame if ptrace() usage forced a whole class of i/o to be synchronous. I think Neils patch doesn't do that, does it? If it has an indication that in fact some data must be there but isn't it pulls it, otherwise it's the same as after your patch/current state? (Leaving the debugger question to Nic; but I guess it's similar interface like gdb. Once you come back into the debugger (breakpoint hit) it looks once for input on the tubes of the debuggee and then enters a prompt; it doesn't continue looking for input until you continue the debuggee (1). ptys would be used because it's Cobol, the programs contain data entry masks presumably needing a real tty, not just pipes. That usecase would be broken now; the tty provided by the debugger doesn't reflect the real screen that the debuggee actually generated before the breakpoint. Note how pipes in my test program _are_ synchronuous in this sense, just ptys aren't.) Ciao, Michael. (1) And in a single threaded debugger (no matter if the debuggee is multithreaded) it would be awkward to implement. After read returns 0 you'd again call poll, which indicates data is there, and read again. You repeat that until $SOMEWHEN. But when is it enough? Two loops, 10, 1000? To not sit in a tight loop you'd also add some nanosleeps, but that would add unnecessary lags. Basically, whenever poll indicates that read won't block then it should also return some data, not 0, if at all reasonably implementable; i.e. some progress should be guaranteed. I realize that this isn't always the case, but here it is. In code, this loop: while (poll ([fd, POLLIN], 0) == 1) // So, read won't block, yippie if (read (fd, ...) == 0) continue; shouldn't become a tight loop, without the read making progress but the kernel continuously stating yep, there's data available, until some random point in the future. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH bisected regression] input_available_p() sometimes says 'no' when it should say 'yes'
On 05/04/2015 12:56 PM, Michael Matz wrote: Hi, On Mon, 4 May 2015, Peter Hurley wrote: I think it would be a shame if ptrace() usage forced a whole class of i/o to be synchronous. I think Neils patch doesn't do that, does it? Yes. Non-blocking i/o would now have to wait until the input worker has completed (at a time when the input worker may not even be running on a cpu yet). If it has an indication that in fact some data must be there but isn't it pulls it, otherwise it's the same as after your patch/current state? At the point when the write() returns from the child process, the input worker may not even have run yet. The patch will now force the reader to wait until the input worker has started and run to completion. Moreover, prior to 4.1, the read i/o loop is sloppy wrt kicking the the input worker, so there is every likelihood that this patch would force extra waits on a non-blocking reader even though no input was actually being written. (Leaving the debugger question to Nic; but I guess it's similar interface like gdb. Once you come back into the debugger (breakpoint hit) it looks once for input on the tubes of the debuggee and then enters a prompt; it doesn't continue looking for input until you continue the debuggee (1). That's exactly what gdb does. Below is simple test jig [2] where the child outputs while at the gdb prompt. ptys would be used because it's Cobol, the programs contain data entry masks presumably needing a real tty, not just pipes. That usecase would be broken now; the tty provided by the debugger doesn't reflect the real screen that the debuggee actually generated before the breakpoint. Note how pipes in my test program _are_ synchronuous in this sense, just ptys aren't.) What's interesting about that expectation is that it would never work on an actual tty. (1) And in a single threaded debugger (no matter if the debuggee is multithreaded) it would be awkward to implement. After read returns 0 you'd again call poll, which indicates data is there, and read again. You repeat that until $SOMEWHEN. But when is it enough? Two loops, 10, 1000? To not sit in a tight loop you'd also add some nanosleeps, but that would add unnecessary lags. That's not what's happening. poll() with a timeout of 0 returns immediately, even if no file descriptors are ready for i/o. The poll() is returning 0 because there is no data to read, so the loop is exiting. If poll() returned non-zero and read() returned no data, that would definitely be a bug. Basically, whenever poll indicates that read won't block then it should also return some data, not 0, if at all reasonably implementable; i.e. some progress should be guaranteed. ttys have a further refinement of the behavior of read() and poll(), which is controlled by the VMIN and VTIME values in the termios structure. But note: the pty master does not allow its termios to be programmed -- a pty master read() is always non-blocking. I realize that this isn't always the case, but here it is. In code, this loop: while (poll ([fd, POLLIN], 0) == 1) // So, read won't block, yippie if (read (fd, ...) == 0) continue; shouldn't become a tight loop, without the read making progress but the kernel continuously stating yep, there's data available, until some random point in the future. Yeah, that would be broken but, again, that's not what's happening here. Regards, Peter Hurley --- % --- #include stdio.h #include stdlib.h #include string.h #include errno.h #include unistd.h #define BRKPT asm(int $3) void child() { sleep(1); printf(Hello, world!); } int main() { int child_id; setbuf(stdout, NULL); child_id = fork(); switch (child_id) { case -1: printf(fork: %s (code: %d)\n, strerror(errno), errno); exit(EXIT_FAILURE); case 0: child(); break; default: /* parent */ BRKPT; break; } return 0; } -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH bisected regression] input_available_p() sometimes says 'no' when it should say 'yes'
Hi Neil, On 05/01/2015 02:20 AM, NeilBrown wrote: > > Hi Peter, > I recently had a report of a regression in 3.12. I bisected it down to your > patch > f95499c3030f ("n_tty: Don't wait for buffer work in read() loop") > > Sometimes a poll on a master-pty will report there is nothing to read after > the slave has written something. > As test program is below. > On a kernel prior to your commit, this program never reports > > Total bytes read is 0. PollRC=0 > > On a kernel subsequent to your commit, that message is produced quite often. > > This was found while working on a debugger. > > Following the test program is my proposed patch which allows the program to > run as it used to. It re-introduces the call to tty_flush_to_ldisc(), but > only if it appears that there is nothing to read. > > Do you think this is a suitable fix? Do you even agree that it is a real > bug? I don't think this a real bug, in the sense that pty i/o is not synchronous, in the same way that tty i/o is not synchronous. However, that said, if this is a regression (regression as in "it broke something that used to work", not regression as in "this new thing I'm writing doesn't behave the way I want it to" :) ) Help me understand the use-case here: are you using pty i/o to debug the debugger? Regards, Peter Hurley > -- > #define _XOPEN_SOURCE > #include > #include > #include > #include > #include > #include > #include > #include > #include > #include > #include > #include > > > #define USEPTY > #define COUNT_MAX (500) > #define MY_BREAKPOINT { asm("int $3"); } > #define PTRACE_IGNORED(void *)0 > > /* > ** Open s pseudo-tty pair. > ** > ** Return the master fd, set spty to the slave fd. > */ > int > my_openpt(int *spty) > { > int mfd = posix_openpt(O_RDWR | O_NOCTTY); > char *slavedev; > int sfd=-1; > if(mfd == -1) return -1; > if(grantpt(mfd) == -1) return -1; > if(unlockpt(mfd) == -1) return -1; > > slavedev = (char *)ptsname(mfd); > > if((sfd = open(slavedev, O_RDWR | O_NOCTTY)) == -1) > { > close(mfd); > return -1; > } > if(spty != NULL) > { > *spty = sfd; > } > return mfd; > } > > > /* > ** Read from the provided file descriptor if poll says there's > ** anything there.. > */ > int > DoPollRead(int mpty) > { > struct pollfd fds; > int pollrc; > ssize_t bread=0, totread=0; > char readbuf[101]; > > /* > ** Set up the poll. > */ > fds.fd = mpty; > fds.events = POLLIN | POLLRDNORM | POLLRDBAND | POLLPRI; > > /* > ** poll for any output. > */ > > while((pollrc = poll(, 1, 0)) == 1) > { > if(fds.revents & POLLIN) > { > bread = read( mpty, readbuf, 100 ); > totread += bread; > if(bread > 0) > { > //printf("Read %d bytes.\n", (int)bread); > readbuf[bread] = '\0'; > //printf("\t%s", readbuf); > } else > { > //puts("Nothing read.\n"); > } > } else if (fds.revents & (POLLHUP | POLLERR | POLLNVAL)) { > printf ("hangup/error/invalid on poll\n"); > return totread; > } else { printf("No POLLIN, revents=%d\n", fds.revents); }; > } > > /* > ** This sometimes happens - we're expecting input on the pty, > ** but nothing is there. > */ > if(totread == 0) > printf("Total bytes read is 0. PollRC=%d\n", pollrc); > > return totread; > } > > static > void writeall (int fd, const char *buf, size_t count) > { > while (count) > { > ssize_t r = write (fd, buf, count); > if (r == 0) > break; > if (r < 0 && errno == EINTR) > continue; > if (r < 0) > exit (2); > count -= r; > buf += r; > } > } > > int > thechild(void) > { > unsigned int i; > > writeall (1, "debuggee starts\n", strlen ("debuggee starts\n")); > > for(i=0 ; i { > char buf[100]; > sprintf(buf, "This is the debuggee. Count is %d\n", i); > writeall (1, buf, strlen (buf)); > > MY_BREAKPOINT > } > > writeall (1, "debuggee finishing now.\n", strlen ("debuggee finishing > now.\n")); > exit (0); > } > > int > main() > { > int rv, status, i=0; > pid_t pid; > int sfd = -1; > int mfd; > #ifdef USEPTY > mfd = my_openpt();/* Get a pseudo-tty pair. */ > if(mfd < 0) > { > fprintf(stderr, "Failed to create pty\n"); > return(1); > } > #else > int pipefd[2]; > if (pipe (pipefd) < 0) > { > perror ("pipe"); > return 1; > } > mfd = pipefd[0]; > sfd = pipefd[1]; > #endif > > /* > ** Create a child process. > */ > pid = fork(); > switch(pid) > { > case -1:/* failed fork */ > return -1; > case 0: /*
[PATCH bisected regression] input_available_p() sometimes says 'no' when it should say 'yes'
Hi Peter, I recently had a report of a regression in 3.12. I bisected it down to your patch f95499c3030f ("n_tty: Don't wait for buffer work in read() loop") Sometimes a poll on a master-pty will report there is nothing to read after the slave has written something. As test program is below. On a kernel prior to your commit, this program never reports Total bytes read is 0. PollRC=0 On a kernel subsequent to your commit, that message is produced quite often. This was found while working on a debugger. Following the test program is my proposed patch which allows the program to run as it used to. It re-introduces the call to tty_flush_to_ldisc(), but only if it appears that there is nothing to read. Do you think this is a suitable fix? Do you even agree that it is a real bug? Thanks, NeilBrown -- #define _XOPEN_SOURCE #include #include #include #include #include #include #include #include #include #include #include #include #define USEPTY #define COUNT_MAX (500) #define MY_BREAKPOINT { asm("int $3"); } #define PTRACE_IGNORED (void *)0 /* ** Open s pseudo-tty pair. ** ** Return the master fd, set spty to the slave fd. */ int my_openpt(int *spty) { int mfd = posix_openpt(O_RDWR | O_NOCTTY); char *slavedev; int sfd=-1; if(mfd == -1) return -1; if(grantpt(mfd) == -1) return -1; if(unlockpt(mfd) == -1) return -1; slavedev = (char *)ptsname(mfd); if((sfd = open(slavedev, O_RDWR | O_NOCTTY)) == -1) { close(mfd); return -1; } if(spty != NULL) { *spty = sfd; } return mfd; } /* ** Read from the provided file descriptor if poll says there's ** anything there.. */ int DoPollRead(int mpty) { struct pollfd fds; int pollrc; ssize_t bread=0, totread=0; char readbuf[101]; /* ** Set up the poll. */ fds.fd = mpty; fds.events = POLLIN | POLLRDNORM | POLLRDBAND | POLLPRI; /* ** poll for any output. */ while((pollrc = poll(, 1, 0)) == 1) { if(fds.revents & POLLIN) { bread = read( mpty, readbuf, 100 ); totread += bread; if(bread > 0) { //printf("Read %d bytes.\n", (int)bread); readbuf[bread] = '\0'; //printf("\t%s", readbuf); } else { //puts("Nothing read.\n"); } } else if (fds.revents & (POLLHUP | POLLERR | POLLNVAL)) { printf ("hangup/error/invalid on poll\n"); return totread; } else { printf("No POLLIN, revents=%d\n", fds.revents); }; } /* ** This sometimes happens - we're expecting input on the pty, ** but nothing is there. */ if(totread == 0) printf("Total bytes read is 0. PollRC=%d\n", pollrc); return totread; } static void writeall (int fd, const char *buf, size_t count) { while (count) { ssize_t r = write (fd, buf, count); if (r == 0) break; if (r < 0 && errno == EINTR) continue; if (r < 0) exit (2); count -= r; buf += r; } } int thechild(void) { unsigned int i; writeall (1, "debuggee starts\n", strlen ("debuggee starts\n")); for(i=0 ; i Subject: [PATCH] n_tty: Sometimes wait for buffer work in read() loop Since commit f95499c3030f ("n_tty: Don't wait for buffer work in read() loop") it as been possible for poll to report that there is no data to read on a master-pty even if a write to the slave has actually completed. That patch removes a 'wait' when the wait isn't really necessary. Unfortunately it also removed it in the case when it *is* necessary. If the simple tests show that there is nothing to read, we really need to flush the work queue in case there is something ready but which hasn't arrived yet. This patch restores the wait, but only if simple tests suggest there is nothing ready. Reported-by: Nic Percival Reported-by: Michael Matz Fixes: f95499c3030f ("n_tty: Don't wait for buffer work in read() loop") Signed-off-by: NeilBrown diff --git a/drivers/tty/n_tty.c b/drivers/tty/n_tty.c index cf6e0f2e1331..9884091819b6 100644 --- a/drivers/tty/n_tty.c +++ b/drivers/tty/n_tty.c @@ -1942,11 +1942,18 @@ static inline int input_available_p(struct tty_struct *tty, int poll) { struct n_tty_data *ldata = tty->disc_data; int amt = poll && !TIME_CHAR(tty) && MIN_CHAR(tty) ? MIN_CHAR(tty) : 1; - - if (ldata->icanon && !L_EXTPROC(tty)) - return ldata->canon_head != ldata->read_tail; - else - return ldata->commit_head - ldata->read_tail >= amt; + int i; + int ret = 0; + + for (i = 0; !ret && i < 2; i++) { + if (i) + tty_flush_to_ldisc(tty); + if (ldata->icanon && !L_EXTPROC(tty)) + ret = (ldata->canon_head !=
Re: [PATCH bisected regression] input_available_p() sometimes says 'no' when it should say 'yes'
Hi Neil, On 05/01/2015 02:20 AM, NeilBrown wrote: Hi Peter, I recently had a report of a regression in 3.12. I bisected it down to your patch f95499c3030f (n_tty: Don't wait for buffer work in read() loop) Sometimes a poll on a master-pty will report there is nothing to read after the slave has written something. As test program is below. On a kernel prior to your commit, this program never reports Total bytes read is 0. PollRC=0 On a kernel subsequent to your commit, that message is produced quite often. This was found while working on a debugger. Following the test program is my proposed patch which allows the program to run as it used to. It re-introduces the call to tty_flush_to_ldisc(), but only if it appears that there is nothing to read. Do you think this is a suitable fix? Do you even agree that it is a real bug? I don't think this a real bug, in the sense that pty i/o is not synchronous, in the same way that tty i/o is not synchronous. However, that said, if this is a regression (regression as in it broke something that used to work, not regression as in this new thing I'm writing doesn't behave the way I want it to :) ) Help me understand the use-case here: are you using pty i/o to debug the debugger? Regards, Peter Hurley -- #define _XOPEN_SOURCE #includeunistd.h #includestdlib.h #includestdio.h #includestdlib.h #includestring.h #includeerrno.h #includesys/wait.h #includesys/types.h #includesys/ptrace.h #includeasm/ptrace.h #includefcntl.h #includesys/poll.h #define USEPTY #define COUNT_MAX (500) #define MY_BREAKPOINT { asm(int $3); } #define PTRACE_IGNORED(void *)0 /* ** Open s pseudo-tty pair. ** ** Return the master fd, set spty to the slave fd. */ int my_openpt(int *spty) { int mfd = posix_openpt(O_RDWR | O_NOCTTY); char *slavedev; int sfd=-1; if(mfd == -1) return -1; if(grantpt(mfd) == -1) return -1; if(unlockpt(mfd) == -1) return -1; slavedev = (char *)ptsname(mfd); if((sfd = open(slavedev, O_RDWR | O_NOCTTY)) == -1) { close(mfd); return -1; } if(spty != NULL) { *spty = sfd; } return mfd; } /* ** Read from the provided file descriptor if poll says there's ** anything there.. */ int DoPollRead(int mpty) { struct pollfd fds; int pollrc; ssize_t bread=0, totread=0; char readbuf[101]; /* ** Set up the poll. */ fds.fd = mpty; fds.events = POLLIN | POLLRDNORM | POLLRDBAND | POLLPRI; /* ** poll for any output. */ while((pollrc = poll(fds, 1, 0)) == 1) { if(fds.revents POLLIN) { bread = read( mpty, readbuf, 100 ); totread += bread; if(bread 0) { //printf(Read %d bytes.\n, (int)bread); readbuf[bread] = '\0'; //printf(\t%s, readbuf); } else { //puts(Nothing read.\n); } } else if (fds.revents (POLLHUP | POLLERR | POLLNVAL)) { printf (hangup/error/invalid on poll\n); return totread; } else { printf(No POLLIN, revents=%d\n, fds.revents); }; } /* ** This sometimes happens - we're expecting input on the pty, ** but nothing is there. */ if(totread == 0) printf(Total bytes read is 0. PollRC=%d\n, pollrc); return totread; } static void writeall (int fd, const char *buf, size_t count) { while (count) { ssize_t r = write (fd, buf, count); if (r == 0) break; if (r 0 errno == EINTR) continue; if (r 0) exit (2); count -= r; buf += r; } } int thechild(void) { unsigned int i; writeall (1, debuggee starts\n, strlen (debuggee starts\n)); for(i=0 ; iCOUNT_MAX ; i++) { char buf[100]; sprintf(buf, This is the debuggee. Count is %d\n, i); writeall (1, buf, strlen (buf)); MY_BREAKPOINT } writeall (1, debuggee finishing now.\n, strlen (debuggee finishing now.\n)); exit (0); } int main() { int rv, status, i=0; pid_t pid; int sfd = -1; int mfd; #ifdef USEPTY mfd = my_openpt(sfd);/* Get a pseudo-tty pair. */ if(mfd 0) { fprintf(stderr, Failed to create pty\n); return(1); } #else int pipefd[2]; if (pipe (pipefd) 0) { perror (pipe); return 1; } mfd = pipefd[0]; sfd = pipefd[1]; #endif /* ** Create a child process. */ pid = fork(); switch(pid) { case -1:/* failed fork */ return -1; case 0: /* child process*/ close (mfd); /* ** Close stdout, use the slave pty
[PATCH bisected regression] input_available_p() sometimes says 'no' when it should say 'yes'
Hi Peter, I recently had a report of a regression in 3.12. I bisected it down to your patch f95499c3030f (n_tty: Don't wait for buffer work in read() loop) Sometimes a poll on a master-pty will report there is nothing to read after the slave has written something. As test program is below. On a kernel prior to your commit, this program never reports Total bytes read is 0. PollRC=0 On a kernel subsequent to your commit, that message is produced quite often. This was found while working on a debugger. Following the test program is my proposed patch which allows the program to run as it used to. It re-introduces the call to tty_flush_to_ldisc(), but only if it appears that there is nothing to read. Do you think this is a suitable fix? Do you even agree that it is a real bug? Thanks, NeilBrown -- #define _XOPEN_SOURCE #includeunistd.h #includestdlib.h #includestdio.h #includestdlib.h #includestring.h #includeerrno.h #includesys/wait.h #includesys/types.h #includesys/ptrace.h #includeasm/ptrace.h #includefcntl.h #includesys/poll.h #define USEPTY #define COUNT_MAX (500) #define MY_BREAKPOINT { asm(int $3); } #define PTRACE_IGNORED (void *)0 /* ** Open s pseudo-tty pair. ** ** Return the master fd, set spty to the slave fd. */ int my_openpt(int *spty) { int mfd = posix_openpt(O_RDWR | O_NOCTTY); char *slavedev; int sfd=-1; if(mfd == -1) return -1; if(grantpt(mfd) == -1) return -1; if(unlockpt(mfd) == -1) return -1; slavedev = (char *)ptsname(mfd); if((sfd = open(slavedev, O_RDWR | O_NOCTTY)) == -1) { close(mfd); return -1; } if(spty != NULL) { *spty = sfd; } return mfd; } /* ** Read from the provided file descriptor if poll says there's ** anything there.. */ int DoPollRead(int mpty) { struct pollfd fds; int pollrc; ssize_t bread=0, totread=0; char readbuf[101]; /* ** Set up the poll. */ fds.fd = mpty; fds.events = POLLIN | POLLRDNORM | POLLRDBAND | POLLPRI; /* ** poll for any output. */ while((pollrc = poll(fds, 1, 0)) == 1) { if(fds.revents POLLIN) { bread = read( mpty, readbuf, 100 ); totread += bread; if(bread 0) { //printf(Read %d bytes.\n, (int)bread); readbuf[bread] = '\0'; //printf(\t%s, readbuf); } else { //puts(Nothing read.\n); } } else if (fds.revents (POLLHUP | POLLERR | POLLNVAL)) { printf (hangup/error/invalid on poll\n); return totread; } else { printf(No POLLIN, revents=%d\n, fds.revents); }; } /* ** This sometimes happens - we're expecting input on the pty, ** but nothing is there. */ if(totread == 0) printf(Total bytes read is 0. PollRC=%d\n, pollrc); return totread; } static void writeall (int fd, const char *buf, size_t count) { while (count) { ssize_t r = write (fd, buf, count); if (r == 0) break; if (r 0 errno == EINTR) continue; if (r 0) exit (2); count -= r; buf += r; } } int thechild(void) { unsigned int i; writeall (1, debuggee starts\n, strlen (debuggee starts\n)); for(i=0 ; iCOUNT_MAX ; i++) { char buf[100]; sprintf(buf, This is the debuggee. Count is %d\n, i); writeall (1, buf, strlen (buf)); MY_BREAKPOINT } writeall (1, debuggee finishing now.\n, strlen (debuggee finishing now.\n)); exit (0); } int main() { int rv, status, i=0; pid_t pid; int sfd = -1; int mfd; #ifdef USEPTY mfd = my_openpt(sfd); /* Get a pseudo-tty pair. */ if(mfd 0) { fprintf(stderr, Failed to create pty\n); return(1); } #else int pipefd[2]; if (pipe (pipefd) 0) { perror (pipe); return 1; } mfd = pipefd[0]; sfd = pipefd[1]; #endif /* ** Create a child process. */ pid = fork(); switch(pid) { case -1:/* failed fork */ return -1; case 0: /* child process*/ close (mfd); /* ** Close stdout, use the slave pty for output. */ dup2(sfd, STDOUT_FILENO); /* ** Set 'TRACEME' so this child process can be traced by the ** parent process. */ ptrace(PTRACE_TRACEME, PTRACE_IGNORED, PTRACE_IGNORED, PTRACE_IGNORED); thechild (); break; default:/* parent process drops out of switch */ close (sfd); break; } /* ** Wait for the debuggee to hit the traceme. ** When we see this, immediately send a PTRACE_CONT