RE: cygwin source-patch fixing deadlock while writing to serial port

2004-01-20 Thread Dave Korn

 -Original Message-
 From: [EMAIL PROTECTED] On Behalf Of Brian Ford
 On Tue, 20 Jan 2004, H. Henning Schmidt wrote:
 
  Ok, I will be happy to learn what the proper fix looks like 
 in your opinion.
 
 When I can get to that code, I'll post it.

  I'll just second the notion that a crude fake-timeout-counter is a bit too
kludgey, and there will be a better solution.

  However, I have a hard time accepting the fact that I 
 cannot write OUT 
  the serial port because my INput buffer overflows. At least when I 
  have switched off all kinds of flow control (which I have), then I 
  consider these to be two independent streams that happen to 
 share one 
  filedesc. Or should I perhaps open two independent 
 filedescriptors in 
  this case, one write-only, one read-only?
 
  Are you saying that the underlying Win-API enforces what you state 
  above, or does this really make sense in some way and I just didn't 
  get it yet? Your statement sounds to me like I am forced to keep a 
  second thread around, and if only to get me out of that 
 trap, once I get hung there ... does not sound too elegant to me ...
 
 Yes.  I was stating the limitations of the Win-API.

  I've also encountered this behaviour.  Opening separate handles
unfortunately isn't an answer (at least in win32, anyway) because you're
required to open serial port descriptors in exclusive access mode, at least
according to MSDN.  (I don't know if cygwin does any kind of multiplexing of
multiple posix fds to a single win32 serial handle under the bonnet - if so,
it might work on cygwin.)

  You also can't do independent reads and writes from separate threads: the
operations are serialized, and only one input or output can be in process at
a time.  Which means that if you've got a blocking read in progress, any
write you try at the same time will wait for the read to complete :(  This
is pretty poor behaviour from the win32 api IMO, I remember that even some
of the earliest 300 baud modems were capable of *full* duplex!

  However, this all goes out the window if you open the serial port handle
for overlapped I/O, and use overlapped reads and writes.  You then get
exactly the sort of behaviour that I think Henning is looking for.  I think
this might be the long term solution Brian's thinking of; it seems to me
it'd be possible to use overlapped I/O to provide all the features of the
posix termios api.


cheers, 
  DaveK
-- 
Can't think of a witty .sigline today


--
Unsubscribe info:  http://cygwin.com/ml/#unsubscribe-simple
Problem reports:   http://cygwin.com/problems.html
Documentation: http://cygwin.com/docs.html
FAQ:   http://cygwin.com/faq/



Re: cygwin source-patch fixing deadlock while writing to serial port

2004-01-20 Thread H. Henning Schmidt
On Mon, Jan 19, 2004 at 10:08:39PM +0100, H. Henning Schmidt wrote:
I found a potential deadlock while writing to a serial port (e.g.
/dev/com1) that has been opened as O_RDWR. The deadlock occurs 
from time
to time (not sure about exact conditions) when I write to that 
port,
while there is data coming in (e.g. from an external device) 
and I do
not read away that data fast enough from the port.

I did provide a test case a while ago in
http://sources.redhat.com/ml/cygwin/2003-03/msg01529.html. I 
digged into
the issue some more now and found that the executing thread got
sometimes deadlocked in fhandler_serial::raw_write(). It 
basically ends
up in a for(;;) loop and just never hits the break;

   Exactly.  When the input buffer overflows, all serial 
communications cease
   and calls exit with ERROR_OPERATION_ABORTED.  If you only call 
write, then
   the ClearCommError() necessary to start things up again is never 
called,
   and you stick in that infinite loop.
 
  Ok, I will be happy to learn what the proper fix looks like in your 
opinion.
 
 When I can get to that code, I'll post it.

  However, I have a hard time accepting the fact that I cannot write 
OUT the
  serial port because my INput buffer overflows. At least when I have 
switched
  off all kinds of flow control (which I have), then I consider these 
to be two
  independent streams that happen to share one filedesc. Or should I 
perhaps
  open two independent filedescriptors in this case, one write-only, one
  read-only?
 
  Are you saying that the underlying Win-API enforces what you state 
above, or does
  this really make sense in some way and I just didn't get it yet? 
Your statement
  sounds to me like I am forced to keep a second thread around, and 
if only to get
  me out of that trap, once I get hung there ... does not sound too 
elegant to me ...
 
 Yes.  I was stating the limitations of the Win-API.

The applied patch adds a safety exit to that for(;;) loop.
This fixes the testcase referenced above.

   Yuck!  No, this is not the proper fix.
  
This might not be the last problem lingering in the serial 
access code
(there are some FIXME tokens still around ...), but it is 
definitely an
improvement for me. I thought I'd share that with you.
   
Can you convince me that this isn't just a band-aid?  I don't 
understand
why cygwin *shouldn't* hang in a situation like this.  There are
certainly similar situations where this happens on linux.
   
Perhaps we need a low_priority_sleep (10) in the loop in that 
situation
or something.
   
   No.  I have a partial patch for the above, but I am in the process of
   getting a new Windows box and shuffling all my data.  I'll try to 
submit
   it when things settle if no one beats me to it.
 
  who am I to beat you ...
  but be assured that I'll be waiting eagerly for this patch.
  This issue has been kicking me for a long time (letting my app hang 
up every once in
  a while), and I really want get this fixed. So now that I know that 
there is a better
  way to do it ... that's what I'm longing for then :-)
 
 While you're waiting, why don't you try what I said.  Put a 
ClearCommError
 call in the switch for the error stated.  I bet you get 90+% of what you
 want.

ok. did this. it works. thanks a lot for the tip.
With this change, my deadlock-prevention (which I still left in there, 
paranoyed as I am) just never comes into action.
I attach my modified patch to this message for reference (to be applied 
to the original 1.5.5-1 sources)

;Henning


diff -cr cygwin-1.5.5-1.orig/winsup/cygwin/fhandler_serial.cc 
cygwin-1.5.5-1/winsup/cygwin/fhandler_serial.cc
*** cygwin-1.5.5-1.orig/winsup/cygwin/fhandler_serial.ccSat Jun 21 02:12:35 
2003
--- cygwin-1.5.5-1/winsup/cygwin/fhandler_serial.cc Tue Jan 20 10:57:31 2004
***
*** 153,174 
  int
  fhandler_serial::raw_write (const void *ptr, size_t len)
  {
!   DWORD bytes_written;
OVERLAPPED write_status;
  
memset (write_status, 0, sizeof (write_status));
write_status.hEvent = CreateEvent (sec_none_nih, TRUE, FALSE, NULL);
ProtectHandle (write_status.hEvent);
  
!   for (;;)
  {
if (WriteFile (get_handle (), ptr, len, bytes_written, write_status))
break;
  
switch (GetLastError ())
{
!   case ERROR_OPERATION_ABORTED:
! continue;
case ERROR_IO_PENDING:
  break;
default:
--- 153,179 
  int
  fhandler_serial::raw_write (const void *ptr, size_t len)
  {
!   DWORD bytes_written = 0;
OVERLAPPED write_status;
  
memset (write_status, 0, sizeof (write_status));
write_status.hEvent = CreateEvent (sec_none_nih, TRUE, FALSE, NULL);
ProtectHandle (write_status.hEvent);
  
!   for (int prevent_deadlock=0 ; prevent_deadlock10 ; prevent_deadlock++)
  {
if (WriteFile (get_handle (), ptr, len, bytes_written, write_status))
   

cygwin source-patch fixing deadlock while writing to serial port

2004-01-19 Thread H. Henning Schmidt
I found a potential deadlock while writing to a serial port (e.g. 
/dev/com1) that has been opened as O_RDWR. The deadlock occurs from time 
to time (not sure about exact conditions) when I write to that port, 
while there is data coming in (e.g. from an external device) and I do 
not read away that data fast enough from the port.

I did provide a test case a while ago in 
http://sources.redhat.com/ml/cygwin/2003-03/msg01529.html. I digged into 
the issue some more now and found that the executing thread got 
sometimes deadlocked in fhandler_serial::raw_write(). It basically ends 
up in a for(;;) loop and just never hits the break;

The applied patch adds a safety exit to that for(;;) loop.
This fixes the testcase referenced above.
This might not be the last problem lingering in the serial access code 
(there are some FIXME tokens still around ...), but it is definitely an 
improvement for me. I thought I'd share that with you.

;Henning

--
H. Henning Schmidt
email: [EMAIL PROTECTED]
phone: +49 (0) 6155 / 899 283
fax:   +49 (0) 6155 / 899 284
*** cygwin-1.5.5-1/winsup/cygwin/fhandler_serial.cc Sat Jun 21 02:12:35 2003
--- cygwin-1.5.5-1.corrected/winsup/cygwin/fhandler_serial.cc   Mon Jan 19 17:32:04 
2004
***
*** 153,173 
  int
  fhandler_serial::raw_write (const void *ptr, size_t len)
  {
!   DWORD bytes_written;
OVERLAPPED write_status;
  
memset (write_status, 0, sizeof (write_status));
write_status.hEvent = CreateEvent (sec_none_nih, TRUE, FALSE, NULL);
ProtectHandle (write_status.hEvent);
  
!   for (;;)
  {
   if (WriteFile (get_handle (), ptr, len, bytes_written, write_status))
break;
  
switch (GetLastError ())
{
case ERROR_OPERATION_ABORTED:
  continue;
case ERROR_IO_PENDING:
  break;
--- 153,174 
  int
  fhandler_serial::raw_write (const void *ptr, size_t len)
  {
!   DWORD bytes_written = 0;
OVERLAPPED write_status;
  
memset (write_status, 0, sizeof (write_status));
write_status.hEvent = CreateEvent (sec_none_nih, TRUE, FALSE, NULL);
ProtectHandle (write_status.hEvent);
  
!   for (int prevent_deadlock=0 ; prevent_deadlock10 ; prevent_deadlock++)
  {
   if (WriteFile (get_handle (), ptr, len, bytes_written, write_status))
break;
  
switch (GetLastError ())
{
case ERROR_OPERATION_ABORTED:
+ bytes_written = 0;
  continue;
case ERROR_IO_PENDING:
  break;

--
Unsubscribe info:  http://cygwin.com/ml/#unsubscribe-simple
Problem reports:   http://cygwin.com/problems.html
Documentation: http://cygwin.com/docs.html
FAQ:   http://cygwin.com/faq/

Re: cygwin source-patch fixing deadlock while writing to serial port

2004-01-19 Thread Christopher Faylor
On Mon, Jan 19, 2004 at 10:08:39PM +0100, H. Henning Schmidt wrote:
I found a potential deadlock while writing to a serial port (e.g. 
/dev/com1) that has been opened as O_RDWR. The deadlock occurs from time 
to time (not sure about exact conditions) when I write to that port, 
while there is data coming in (e.g. from an external device) and I do 
not read away that data fast enough from the port.

I did provide a test case a while ago in 
http://sources.redhat.com/ml/cygwin/2003-03/msg01529.html. I digged into 
the issue some more now and found that the executing thread got 
sometimes deadlocked in fhandler_serial::raw_write(). It basically ends 
up in a for(;;) loop and just never hits the break;

The applied patch adds a safety exit to that for(;;) loop.
This fixes the testcase referenced above.

This might not be the last problem lingering in the serial access code 
(there are some FIXME tokens still around ...), but it is definitely an 
improvement for me. I thought I'd share that with you.

Can you convince me that this isn't just a band-aid?  I don't understand
why cygwin *shouldn't* hang in a situation like this.  There are
certainly similar situations where this happens on linux.

Perhaps we need a low_priority_sleep (10) in the loop in that situation
or something.

cgf

--
Unsubscribe info:  http://cygwin.com/ml/#unsubscribe-simple
Problem reports:   http://cygwin.com/problems.html
Documentation: http://cygwin.com/docs.html
FAQ:   http://cygwin.com/faq/



Re: cygwin source-patch fixing deadlock while writing to serial port

2004-01-19 Thread Brian Ford
On Mon, 19 Jan 2004, Christopher Faylor wrote:

 On Mon, Jan 19, 2004 at 10:08:39PM +0100, H. Henning Schmidt wrote:
 I found a potential deadlock while writing to a serial port (e.g.
 /dev/com1) that has been opened as O_RDWR. The deadlock occurs from time
 to time (not sure about exact conditions) when I write to that port,
 while there is data coming in (e.g. from an external device) and I do
 not read away that data fast enough from the port.
 
 I did provide a test case a while ago in
 http://sources.redhat.com/ml/cygwin/2003-03/msg01529.html. I digged into
 the issue some more now and found that the executing thread got
 sometimes deadlocked in fhandler_serial::raw_write(). It basically ends
 up in a for(;;) loop and just never hits the break;
 
Exactly.  When the input buffer overflows, all serial communications cease
and calls exit with ERROR_OPERATION_ABORTED.  If you only call write, then
the ClearCommError() necessary to start things up again is never called,
and you stick in that infinite loop.

 The applied patch adds a safety exit to that for(;;) loop.
 This fixes the testcase referenced above.
 
Yuck!  No, this is not the proper fix.

 This might not be the last problem lingering in the serial access code
 (there are some FIXME tokens still around ...), but it is definitely an
 improvement for me. I thought I'd share that with you.

 Can you convince me that this isn't just a band-aid?  I don't understand
 why cygwin *shouldn't* hang in a situation like this.  There are
 certainly similar situations where this happens on linux.

 Perhaps we need a low_priority_sleep (10) in the loop in that situation
 or something.

No.  I have a partial patch for the above, but I am in the process of
getting a new Windows box and shuffling all my data.  I'll try to submit
it when things settle if no one beats me to it.

-- 
Brian Ford
Senior Realtime Software Engineer
VITAL - Visual Simulation Systems
FlightSafety International
Phone: 314-551-8460
Fax:   314-551-8444

--
Unsubscribe info:  http://cygwin.com/ml/#unsubscribe-simple
Problem reports:   http://cygwin.com/problems.html
Documentation: http://cygwin.com/docs.html
FAQ:   http://cygwin.com/faq/



Re: cygwin source-patch fixing deadlock while writing to serial port

2004-01-19 Thread H. Henning Schmidt
 On Mon, Jan 19, 2004 at 10:08:39PM +0100, H. Henning Schmidt wrote:
 I found a potential deadlock while writing to a serial port (e.g.
 /dev/com1) that has been opened as O_RDWR. The deadlock occurs from time
 to time (not sure about exact conditions) when I write to that port,
 while there is data coming in (e.g. from an external device) and I do
 not read away that data fast enough from the port.
 
 I did provide a test case a while ago in
 http://sources.redhat.com/ml/cygwin/2003-03/msg01529.html. I digged into
 the issue some more now and found that the executing thread got
 sometimes deadlocked in fhandler_serial::raw_write(). It basically ends
 up in a for(;;) loop and just never hits the break;
 
Exactly.  When the input buffer overflows, all serial communications cease
and calls exit with ERROR_OPERATION_ABORTED.  If you only call write, then
the ClearCommError() necessary to start things up again is never called,
and you stick in that infinite loop.
Ok, I will be happy to learn what the proper fix looks like in your opinion.

However, I have a hard time accepting the fact that I cannot write OUT the 
serial port because my INput buffer overflows. At least when I have switched
off all kinds of flow control (which I have), then I consider these to be two
independent streams that happen to share one filedesc. Or should I perhaps
open two independent filedescriptors in this case, one write-only, one read-only?

Are you saying that the underlying Win-API enforces what you state above, or does
this really make sense in some way and I just didn't get it yet? Your statement
sounds to me like I am forced to keep a second thread around, and if only to get
me out of that trap, once I get hung there ... does not sound too elegant to me ...
 The applied patch adds a safety exit to that for(;;) loop.
 This fixes the testcase referenced above.
 
Yuck!  No, this is not the proper fix.
 This might not be the last problem lingering in the serial access code
 (there are some FIXME tokens still around ...), but it is definitely an
 improvement for me. I thought I'd share that with you.

 Can you convince me that this isn't just a band-aid?  I don't understand
 why cygwin *shouldn't* hang in a situation like this.  There are
 certainly similar situations where this happens on linux.

 Perhaps we need a low_priority_sleep (10) in the loop in that situation
 or something.

No.  I have a partial patch for the above, but I am in the process of
getting a new Windows box and shuffling all my data.  I'll try to submit
it when things settle if no one beats me to it.
who am I to beat you ... 
but be assured that I'll be waiting eagerly for this patch.
This issue has been kicking me for a long time (letting my app hang up every once in 
a while), and I really want get this fixed. So now that I know that there is a better
way to do it ... that's what I'm longing for then :-)



--
Unsubscribe info:  http://cygwin.com/ml/#unsubscribe-simple
Problem reports:   http://cygwin.com/problems.html
Documentation: http://cygwin.com/docs.html
FAQ:   http://cygwin.com/faq/


Re: cygwin source-patch fixing deadlock while writing to serial port

2004-01-19 Thread Brian Ford
On Tue, 20 Jan 2004, H. Henning Schmidt wrote:

   On Mon, Jan 19, 2004 at 10:08:39PM +0100, H. Henning Schmidt wrote:
   I found a potential deadlock while writing to a serial port (e.g.
   /dev/com1) that has been opened as O_RDWR. The deadlock occurs from time
   to time (not sure about exact conditions) when I write to that port,
   while there is data coming in (e.g. from an external device) and I do
   not read away that data fast enough from the port.
   
   I did provide a test case a while ago in
   http://sources.redhat.com/ml/cygwin/2003-03/msg01529.html. I digged into
   the issue some more now and found that the executing thread got
   sometimes deadlocked in fhandler_serial::raw_write(). It basically ends
   up in a for(;;) loop and just never hits the break;
   
  Exactly.  When the input buffer overflows, all serial communications cease
  and calls exit with ERROR_OPERATION_ABORTED.  If you only call write, then
  the ClearCommError() necessary to start things up again is never called,
  and you stick in that infinite loop.

 Ok, I will be happy to learn what the proper fix looks like in your opinion.

When I can get to that code, I'll post it.

 However, I have a hard time accepting the fact that I cannot write OUT the
 serial port because my INput buffer overflows. At least when I have switched
 off all kinds of flow control (which I have), then I consider these to be two
 independent streams that happen to share one filedesc. Or should I perhaps
 open two independent filedescriptors in this case, one write-only, one
 read-only?

 Are you saying that the underlying Win-API enforces what you state above, or does
 this really make sense in some way and I just didn't get it yet? Your statement
 sounds to me like I am forced to keep a second thread around, and if only to get
 me out of that trap, once I get hung there ... does not sound too elegant to me ...

Yes.  I was stating the limitations of the Win-API.

   The applied patch adds a safety exit to that for(;;) loop.
   This fixes the testcase referenced above.
   
  Yuck!  No, this is not the proper fix.
 
   This might not be the last problem lingering in the serial access code
   (there are some FIXME tokens still around ...), but it is definitely an
   improvement for me. I thought I'd share that with you.
  
   Can you convince me that this isn't just a band-aid?  I don't understand
   why cygwin *shouldn't* hang in a situation like this.  There are
   certainly similar situations where this happens on linux.
  
   Perhaps we need a low_priority_sleep (10) in the loop in that situation
   or something.
  
  No.  I have a partial patch for the above, but I am in the process of
  getting a new Windows box and shuffling all my data.  I'll try to submit
  it when things settle if no one beats me to it.

 who am I to beat you ...
 but be assured that I'll be waiting eagerly for this patch.
 This issue has been kicking me for a long time (letting my app hang up every once in
 a while), and I really want get this fixed. So now that I know that there is a better
 way to do it ... that's what I'm longing for then :-)

While you're waiting, why don't you try what I said.  Put a ClearCommError
call in the switch for the error stated.  I bet you get 90+% of what you
want.

-- 
Brian Ford
Senior Realtime Software Engineer
VITAL - Visual Simulation Systems
FlightSafety International
Phone: 314-551-8460
Fax:   314-551-8444

--
Unsubscribe info:  http://cygwin.com/ml/#unsubscribe-simple
Problem reports:   http://cygwin.com/problems.html
Documentation: http://cygwin.com/docs.html
FAQ:   http://cygwin.com/faq/