RE: cygwin source-patch fixing deadlock while writing to serial port
-Original Message- From: [EMAIL PROTECTED] On Behalf Of Brian Ford On Tue, 20 Jan 2004, H. Henning Schmidt wrote: Ok, I will be happy to learn what the proper fix looks like in your opinion. When I can get to that code, I'll post it. I'll just second the notion that a crude fake-timeout-counter is a bit too kludgey, and there will be a better solution. However, I have a hard time accepting the fact that I cannot write OUT the serial port because my INput buffer overflows. At least when I have switched off all kinds of flow control (which I have), then I consider these to be two independent streams that happen to share one filedesc. Or should I perhaps open two independent filedescriptors in this case, one write-only, one read-only? Are you saying that the underlying Win-API enforces what you state above, or does this really make sense in some way and I just didn't get it yet? Your statement sounds to me like I am forced to keep a second thread around, and if only to get me out of that trap, once I get hung there ... does not sound too elegant to me ... Yes. I was stating the limitations of the Win-API. I've also encountered this behaviour. Opening separate handles unfortunately isn't an answer (at least in win32, anyway) because you're required to open serial port descriptors in exclusive access mode, at least according to MSDN. (I don't know if cygwin does any kind of multiplexing of multiple posix fds to a single win32 serial handle under the bonnet - if so, it might work on cygwin.) You also can't do independent reads and writes from separate threads: the operations are serialized, and only one input or output can be in process at a time. Which means that if you've got a blocking read in progress, any write you try at the same time will wait for the read to complete :( This is pretty poor behaviour from the win32 api IMO, I remember that even some of the earliest 300 baud modems were capable of *full* duplex! However, this all goes out the window if you open the serial port handle for overlapped I/O, and use overlapped reads and writes. You then get exactly the sort of behaviour that I think Henning is looking for. I think this might be the long term solution Brian's thinking of; it seems to me it'd be possible to use overlapped I/O to provide all the features of the posix termios api. cheers, DaveK -- Can't think of a witty .sigline today -- Unsubscribe info: http://cygwin.com/ml/#unsubscribe-simple Problem reports: http://cygwin.com/problems.html Documentation: http://cygwin.com/docs.html FAQ: http://cygwin.com/faq/
Re: cygwin source-patch fixing deadlock while writing to serial port
On Mon, Jan 19, 2004 at 10:08:39PM +0100, H. Henning Schmidt wrote: I found a potential deadlock while writing to a serial port (e.g. /dev/com1) that has been opened as O_RDWR. The deadlock occurs from time to time (not sure about exact conditions) when I write to that port, while there is data coming in (e.g. from an external device) and I do not read away that data fast enough from the port. I did provide a test case a while ago in http://sources.redhat.com/ml/cygwin/2003-03/msg01529.html. I digged into the issue some more now and found that the executing thread got sometimes deadlocked in fhandler_serial::raw_write(). It basically ends up in a for(;;) loop and just never hits the break; Exactly. When the input buffer overflows, all serial communications cease and calls exit with ERROR_OPERATION_ABORTED. If you only call write, then the ClearCommError() necessary to start things up again is never called, and you stick in that infinite loop. Ok, I will be happy to learn what the proper fix looks like in your opinion. When I can get to that code, I'll post it. However, I have a hard time accepting the fact that I cannot write OUT the serial port because my INput buffer overflows. At least when I have switched off all kinds of flow control (which I have), then I consider these to be two independent streams that happen to share one filedesc. Or should I perhaps open two independent filedescriptors in this case, one write-only, one read-only? Are you saying that the underlying Win-API enforces what you state above, or does this really make sense in some way and I just didn't get it yet? Your statement sounds to me like I am forced to keep a second thread around, and if only to get me out of that trap, once I get hung there ... does not sound too elegant to me ... Yes. I was stating the limitations of the Win-API. The applied patch adds a safety exit to that for(;;) loop. This fixes the testcase referenced above. Yuck! No, this is not the proper fix. This might not be the last problem lingering in the serial access code (there are some FIXME tokens still around ...), but it is definitely an improvement for me. I thought I'd share that with you. Can you convince me that this isn't just a band-aid? I don't understand why cygwin *shouldn't* hang in a situation like this. There are certainly similar situations where this happens on linux. Perhaps we need a low_priority_sleep (10) in the loop in that situation or something. No. I have a partial patch for the above, but I am in the process of getting a new Windows box and shuffling all my data. I'll try to submit it when things settle if no one beats me to it. who am I to beat you ... but be assured that I'll be waiting eagerly for this patch. This issue has been kicking me for a long time (letting my app hang up every once in a while), and I really want get this fixed. So now that I know that there is a better way to do it ... that's what I'm longing for then :-) While you're waiting, why don't you try what I said. Put a ClearCommError call in the switch for the error stated. I bet you get 90+% of what you want. ok. did this. it works. thanks a lot for the tip. With this change, my deadlock-prevention (which I still left in there, paranoyed as I am) just never comes into action. I attach my modified patch to this message for reference (to be applied to the original 1.5.5-1 sources) ;Henning diff -cr cygwin-1.5.5-1.orig/winsup/cygwin/fhandler_serial.cc cygwin-1.5.5-1/winsup/cygwin/fhandler_serial.cc *** cygwin-1.5.5-1.orig/winsup/cygwin/fhandler_serial.ccSat Jun 21 02:12:35 2003 --- cygwin-1.5.5-1/winsup/cygwin/fhandler_serial.cc Tue Jan 20 10:57:31 2004 *** *** 153,174 int fhandler_serial::raw_write (const void *ptr, size_t len) { ! DWORD bytes_written; OVERLAPPED write_status; memset (write_status, 0, sizeof (write_status)); write_status.hEvent = CreateEvent (sec_none_nih, TRUE, FALSE, NULL); ProtectHandle (write_status.hEvent); ! for (;;) { if (WriteFile (get_handle (), ptr, len, bytes_written, write_status)) break; switch (GetLastError ()) { ! case ERROR_OPERATION_ABORTED: ! continue; case ERROR_IO_PENDING: break; default: --- 153,179 int fhandler_serial::raw_write (const void *ptr, size_t len) { ! DWORD bytes_written = 0; OVERLAPPED write_status; memset (write_status, 0, sizeof (write_status)); write_status.hEvent = CreateEvent (sec_none_nih, TRUE, FALSE, NULL); ProtectHandle (write_status.hEvent); ! for (int prevent_deadlock=0 ; prevent_deadlock10 ; prevent_deadlock++) { if (WriteFile (get_handle (), ptr, len, bytes_written, write_status))
cygwin source-patch fixing deadlock while writing to serial port
I found a potential deadlock while writing to a serial port (e.g. /dev/com1) that has been opened as O_RDWR. The deadlock occurs from time to time (not sure about exact conditions) when I write to that port, while there is data coming in (e.g. from an external device) and I do not read away that data fast enough from the port. I did provide a test case a while ago in http://sources.redhat.com/ml/cygwin/2003-03/msg01529.html. I digged into the issue some more now and found that the executing thread got sometimes deadlocked in fhandler_serial::raw_write(). It basically ends up in a for(;;) loop and just never hits the break; The applied patch adds a safety exit to that for(;;) loop. This fixes the testcase referenced above. This might not be the last problem lingering in the serial access code (there are some FIXME tokens still around ...), but it is definitely an improvement for me. I thought I'd share that with you. ;Henning -- H. Henning Schmidt email: [EMAIL PROTECTED] phone: +49 (0) 6155 / 899 283 fax: +49 (0) 6155 / 899 284 *** cygwin-1.5.5-1/winsup/cygwin/fhandler_serial.cc Sat Jun 21 02:12:35 2003 --- cygwin-1.5.5-1.corrected/winsup/cygwin/fhandler_serial.cc Mon Jan 19 17:32:04 2004 *** *** 153,173 int fhandler_serial::raw_write (const void *ptr, size_t len) { ! DWORD bytes_written; OVERLAPPED write_status; memset (write_status, 0, sizeof (write_status)); write_status.hEvent = CreateEvent (sec_none_nih, TRUE, FALSE, NULL); ProtectHandle (write_status.hEvent); ! for (;;) { if (WriteFile (get_handle (), ptr, len, bytes_written, write_status)) break; switch (GetLastError ()) { case ERROR_OPERATION_ABORTED: continue; case ERROR_IO_PENDING: break; --- 153,174 int fhandler_serial::raw_write (const void *ptr, size_t len) { ! DWORD bytes_written = 0; OVERLAPPED write_status; memset (write_status, 0, sizeof (write_status)); write_status.hEvent = CreateEvent (sec_none_nih, TRUE, FALSE, NULL); ProtectHandle (write_status.hEvent); ! for (int prevent_deadlock=0 ; prevent_deadlock10 ; prevent_deadlock++) { if (WriteFile (get_handle (), ptr, len, bytes_written, write_status)) break; switch (GetLastError ()) { case ERROR_OPERATION_ABORTED: + bytes_written = 0; continue; case ERROR_IO_PENDING: break; -- Unsubscribe info: http://cygwin.com/ml/#unsubscribe-simple Problem reports: http://cygwin.com/problems.html Documentation: http://cygwin.com/docs.html FAQ: http://cygwin.com/faq/
Re: cygwin source-patch fixing deadlock while writing to serial port
On Mon, Jan 19, 2004 at 10:08:39PM +0100, H. Henning Schmidt wrote: I found a potential deadlock while writing to a serial port (e.g. /dev/com1) that has been opened as O_RDWR. The deadlock occurs from time to time (not sure about exact conditions) when I write to that port, while there is data coming in (e.g. from an external device) and I do not read away that data fast enough from the port. I did provide a test case a while ago in http://sources.redhat.com/ml/cygwin/2003-03/msg01529.html. I digged into the issue some more now and found that the executing thread got sometimes deadlocked in fhandler_serial::raw_write(). It basically ends up in a for(;;) loop and just never hits the break; The applied patch adds a safety exit to that for(;;) loop. This fixes the testcase referenced above. This might not be the last problem lingering in the serial access code (there are some FIXME tokens still around ...), but it is definitely an improvement for me. I thought I'd share that with you. Can you convince me that this isn't just a band-aid? I don't understand why cygwin *shouldn't* hang in a situation like this. There are certainly similar situations where this happens on linux. Perhaps we need a low_priority_sleep (10) in the loop in that situation or something. cgf -- Unsubscribe info: http://cygwin.com/ml/#unsubscribe-simple Problem reports: http://cygwin.com/problems.html Documentation: http://cygwin.com/docs.html FAQ: http://cygwin.com/faq/
Re: cygwin source-patch fixing deadlock while writing to serial port
On Mon, 19 Jan 2004, Christopher Faylor wrote: On Mon, Jan 19, 2004 at 10:08:39PM +0100, H. Henning Schmidt wrote: I found a potential deadlock while writing to a serial port (e.g. /dev/com1) that has been opened as O_RDWR. The deadlock occurs from time to time (not sure about exact conditions) when I write to that port, while there is data coming in (e.g. from an external device) and I do not read away that data fast enough from the port. I did provide a test case a while ago in http://sources.redhat.com/ml/cygwin/2003-03/msg01529.html. I digged into the issue some more now and found that the executing thread got sometimes deadlocked in fhandler_serial::raw_write(). It basically ends up in a for(;;) loop and just never hits the break; Exactly. When the input buffer overflows, all serial communications cease and calls exit with ERROR_OPERATION_ABORTED. If you only call write, then the ClearCommError() necessary to start things up again is never called, and you stick in that infinite loop. The applied patch adds a safety exit to that for(;;) loop. This fixes the testcase referenced above. Yuck! No, this is not the proper fix. This might not be the last problem lingering in the serial access code (there are some FIXME tokens still around ...), but it is definitely an improvement for me. I thought I'd share that with you. Can you convince me that this isn't just a band-aid? I don't understand why cygwin *shouldn't* hang in a situation like this. There are certainly similar situations where this happens on linux. Perhaps we need a low_priority_sleep (10) in the loop in that situation or something. No. I have a partial patch for the above, but I am in the process of getting a new Windows box and shuffling all my data. I'll try to submit it when things settle if no one beats me to it. -- Brian Ford Senior Realtime Software Engineer VITAL - Visual Simulation Systems FlightSafety International Phone: 314-551-8460 Fax: 314-551-8444 -- Unsubscribe info: http://cygwin.com/ml/#unsubscribe-simple Problem reports: http://cygwin.com/problems.html Documentation: http://cygwin.com/docs.html FAQ: http://cygwin.com/faq/
Re: cygwin source-patch fixing deadlock while writing to serial port
On Mon, Jan 19, 2004 at 10:08:39PM +0100, H. Henning Schmidt wrote: I found a potential deadlock while writing to a serial port (e.g. /dev/com1) that has been opened as O_RDWR. The deadlock occurs from time to time (not sure about exact conditions) when I write to that port, while there is data coming in (e.g. from an external device) and I do not read away that data fast enough from the port. I did provide a test case a while ago in http://sources.redhat.com/ml/cygwin/2003-03/msg01529.html. I digged into the issue some more now and found that the executing thread got sometimes deadlocked in fhandler_serial::raw_write(). It basically ends up in a for(;;) loop and just never hits the break; Exactly. When the input buffer overflows, all serial communications cease and calls exit with ERROR_OPERATION_ABORTED. If you only call write, then the ClearCommError() necessary to start things up again is never called, and you stick in that infinite loop. Ok, I will be happy to learn what the proper fix looks like in your opinion. However, I have a hard time accepting the fact that I cannot write OUT the serial port because my INput buffer overflows. At least when I have switched off all kinds of flow control (which I have), then I consider these to be two independent streams that happen to share one filedesc. Or should I perhaps open two independent filedescriptors in this case, one write-only, one read-only? Are you saying that the underlying Win-API enforces what you state above, or does this really make sense in some way and I just didn't get it yet? Your statement sounds to me like I am forced to keep a second thread around, and if only to get me out of that trap, once I get hung there ... does not sound too elegant to me ... The applied patch adds a safety exit to that for(;;) loop. This fixes the testcase referenced above. Yuck! No, this is not the proper fix. This might not be the last problem lingering in the serial access code (there are some FIXME tokens still around ...), but it is definitely an improvement for me. I thought I'd share that with you. Can you convince me that this isn't just a band-aid? I don't understand why cygwin *shouldn't* hang in a situation like this. There are certainly similar situations where this happens on linux. Perhaps we need a low_priority_sleep (10) in the loop in that situation or something. No. I have a partial patch for the above, but I am in the process of getting a new Windows box and shuffling all my data. I'll try to submit it when things settle if no one beats me to it. who am I to beat you ... but be assured that I'll be waiting eagerly for this patch. This issue has been kicking me for a long time (letting my app hang up every once in a while), and I really want get this fixed. So now that I know that there is a better way to do it ... that's what I'm longing for then :-) -- Unsubscribe info: http://cygwin.com/ml/#unsubscribe-simple Problem reports: http://cygwin.com/problems.html Documentation: http://cygwin.com/docs.html FAQ: http://cygwin.com/faq/
Re: cygwin source-patch fixing deadlock while writing to serial port
On Tue, 20 Jan 2004, H. Henning Schmidt wrote: On Mon, Jan 19, 2004 at 10:08:39PM +0100, H. Henning Schmidt wrote: I found a potential deadlock while writing to a serial port (e.g. /dev/com1) that has been opened as O_RDWR. The deadlock occurs from time to time (not sure about exact conditions) when I write to that port, while there is data coming in (e.g. from an external device) and I do not read away that data fast enough from the port. I did provide a test case a while ago in http://sources.redhat.com/ml/cygwin/2003-03/msg01529.html. I digged into the issue some more now and found that the executing thread got sometimes deadlocked in fhandler_serial::raw_write(). It basically ends up in a for(;;) loop and just never hits the break; Exactly. When the input buffer overflows, all serial communications cease and calls exit with ERROR_OPERATION_ABORTED. If you only call write, then the ClearCommError() necessary to start things up again is never called, and you stick in that infinite loop. Ok, I will be happy to learn what the proper fix looks like in your opinion. When I can get to that code, I'll post it. However, I have a hard time accepting the fact that I cannot write OUT the serial port because my INput buffer overflows. At least when I have switched off all kinds of flow control (which I have), then I consider these to be two independent streams that happen to share one filedesc. Or should I perhaps open two independent filedescriptors in this case, one write-only, one read-only? Are you saying that the underlying Win-API enforces what you state above, or does this really make sense in some way and I just didn't get it yet? Your statement sounds to me like I am forced to keep a second thread around, and if only to get me out of that trap, once I get hung there ... does not sound too elegant to me ... Yes. I was stating the limitations of the Win-API. The applied patch adds a safety exit to that for(;;) loop. This fixes the testcase referenced above. Yuck! No, this is not the proper fix. This might not be the last problem lingering in the serial access code (there are some FIXME tokens still around ...), but it is definitely an improvement for me. I thought I'd share that with you. Can you convince me that this isn't just a band-aid? I don't understand why cygwin *shouldn't* hang in a situation like this. There are certainly similar situations where this happens on linux. Perhaps we need a low_priority_sleep (10) in the loop in that situation or something. No. I have a partial patch for the above, but I am in the process of getting a new Windows box and shuffling all my data. I'll try to submit it when things settle if no one beats me to it. who am I to beat you ... but be assured that I'll be waiting eagerly for this patch. This issue has been kicking me for a long time (letting my app hang up every once in a while), and I really want get this fixed. So now that I know that there is a better way to do it ... that's what I'm longing for then :-) While you're waiting, why don't you try what I said. Put a ClearCommError call in the switch for the error stated. I bet you get 90+% of what you want. -- Brian Ford Senior Realtime Software Engineer VITAL - Visual Simulation Systems FlightSafety International Phone: 314-551-8460 Fax: 314-551-8444 -- Unsubscribe info: http://cygwin.com/ml/#unsubscribe-simple Problem reports: http://cygwin.com/problems.html Documentation: http://cygwin.com/docs.html FAQ: http://cygwin.com/faq/