I'm not familiar with the Comm calls, but here's one thing I notice.  The second version of the code waits with a 1,000 millisecond timeout and then immediately calls GetOverlappedResult.  I can't see what you do after that, but perhaps in most cases the timeout isn't reached and the overlapped operation has finished.  In a few cases the timeout is reached, you fetch an overlapped result that is not complete, and the code below what you have posted uses it as if the I/O had completed.  That is what actually causes the GPF.
 
Just a guess since we can't see it.
 
With 3 calls coming in per second it might be quite a while before this bug is triggered, but if a little network hiccup lasts 1 second, your timeout would be hit without a comm event having come in.
 
To test this you can (a) put a check on that overlapped to see whether it has completed, log it if it hasn't, and then let your code continue.  If you get a log entry just before the GPF, you'll know that's the cause.  To make it quicker you can also (b) decrease the wait time so that smaller hiccups trigger the timeout.
 
Or you can (c) just increase the timeout to something like a minute, and after it has run fine for about a week you can *assume* that was the problem.
 
Of course, if you prove that the timeout is a factor in this problem, then to fix it you'll have to change your code to check for, and appropriately handle, the case where the overlapped operation has not completed.
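
Here is a rough sketch of the kind of check I mean, not a drop-in fix.  It looks at the return value of WaitForSingleObject before trusting GetOverlappedResult, and it logs the "still pending" case so that test (a) falls out for free.  The names ClassifyCommWait, WaitForPendingCommEvent and CommWaitOutcome are made up for illustration, and since I can't see your calling code, what to do in the pending case (keep waiting on the same OVERLAPPED rather than calling WaitCommEvent again or reading dwEvent would be my guess) is an assumption.  The non-Windows branch only repeats the documented constant values so the decision logic can be compiled anywhere.

```cpp
#include <cassert>
#include <cstdint>

#ifdef _WIN32
#include <windows.h>
#else
// Documented Win32 values, repeated here only so the decision logic
// below can be compiled and exercised off Windows.
typedef std::uint32_t DWORD;
typedef int BOOL;
const DWORD WAIT_OBJECT_0       = 0x00000000;
const DWORD WAIT_TIMEOUT        = 0x00000102;
const DWORD ERROR_IO_INCOMPLETE = 996;
#endif

// Hypothetical outcome type (not part of the posted code).
enum CommWaitOutcome
{
    OutcomeEventReady,    // I/O completed; dwEvent may be used
    OutcomeStillPending,  // I/O not finished; do not use dwEvent yet
    OutcomeRealError      // the wait or the I/O itself failed
};

// Pure decision logic: dwWait is the WaitForSingleObject() result,
// bResult/dwError come from GetOverlappedResult()/GetLastError().
// bResult and dwError are ignored when the wait timed out.
CommWaitOutcome ClassifyCommWait(DWORD dwWait, BOOL bResult, DWORD dwError)
{
    if (dwWait == WAIT_TIMEOUT)
        return OutcomeStillPending;          // the case the posted code ignores
    if (dwWait != WAIT_OBJECT_0)
        return OutcomeRealError;             // e.g. WAIT_FAILED
    if (bResult)
        return OutcomeEventReady;
    return (dwError == ERROR_IO_INCOMPLETE) ? OutcomeStillPending
                                            : OutcomeRealError;
}

#ifdef _WIN32
// Hypothetical helper; call it after WaitCommEvent() has failed with
// ERROR_IO_PENDING.  Parameter names mirror the posted code.
CommWaitOutcome WaitForPendingCommEvent(HANDLE hPortHandle,
                                        OVERLAPPED& rsNotifyOverlapData,
                                        DWORD dwTimeoutMs)
{
    assert(rsNotifyOverlapData.hEvent != NULL);

    DWORD dwWait  = ::WaitForSingleObject(rsNotifyOverlapData.hEvent, dwTimeoutMs);
    BOOL  bResult = FALSE;
    DWORD dwError = 0;

    // Only ask for the result if the event was actually signalled.
    if (dwWait == WAIT_OBJECT_0)
    {
        DWORD dwDummy = 0;
        bResult = ::GetOverlappedResult(hPortHandle, &rsNotifyOverlapData,
                                        &dwDummy, FALSE);
        if (!bResult)
            dwError = ::GetLastError();
    }

    CommWaitOutcome outcome = ClassifyCommWait(dwWait, bResult, dwError);
    if (outcome == OutcomeStillPending)
        ::OutputDebugStringA("WaitCommEvent still pending after timeout\n");
    return outcome;
}
#endif
```

If the log line shows up shortly before a crash, that would point at the timeout path.  The caller would then loop on WaitForPendingCommEvent (checking bTerminate each time round) until it returns something other than OutcomeStillPending.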
 
/dev
----- Original Message -----
Sent: Tuesday, February 01, 2005 7:02 AM
Subject: [msvc] Overlapped Comms - It Came Back To Bite Me

OK... just when everything was looking good, testing revealed that all was not well. I still need to test for longer but early indications are, that the code changes in this one routine are the cause.

(HTML alert)

This piece of code:

---
        if (!::WaitCommEvent(rsThreadData.hPortHandle, &dwEvent,
                                                        &rsNotifyOverlapData) )
        {
                dwError = ::GetLastError();
                if (dwError == ERROR_IO_PENDING)
                {
                        bDone = FALSE ;

                        // Wait for completion of WaitCommEvent().
                        while (!rsThreadData.bTerminate && !bDone)
                        {
                                bDone = ::GetOverlappedResult(rsThreadData.hPortHandle,
                                                              &rsNotifyOverlapData,
                                                              &dwDummy, FALSE);
                                dwError = ::GetLastError();
                                if (!bDone && dwError != ERROR_IO_INCOMPLETE)
                                        bDone = TRUE ;  // Real error.
                        }
                }
        }
        // V1.6 Beta 01: Manually reset it now.
        ::ResetEvent(rsNotifyOverlapData.hEvent);
---

works just fine - it's been running for two hours non-stop now, no problems. But change it to this:

---
        if (!::WaitCommEvent(rsThreadData.hPortHandle, &dwEvent,
                                                        &rsNotifyOverlapData) )
        {
                dwError = ::GetLastError();
                if (dwError == ERROR_IO_PENDING)
                {
                        // V1.6 Beta 01: Use a wait here, not repeated calls!

                        bDone = FALSE ;

                        // Wait for completion of WaitCommEvent().
                        ::WaitForSingleObject(rsNotifyOverlapData.hEvent, 1000);

                        bDone = ::GetOverlappedResult(rsThreadData.hPortHandle,
                                                      &rsNotifyOverlapData,
                                                      &dwDummy, FALSE);
                }
        }
        // V1.6 Beta 01: Manually reset it now.
        ::ResetEvent(rsNotifyOverlapData.hEvent);
---

causes the module / app to crash after an hour maximum (bear in mind comms calls are coming at around 3 per second, so some 10,000 calls per hour).

I've coloured the code that was removed / added in grey, to try and make the simplicity of the change clearer.

Does anyone have any ideas of what I might have done wrong? It seems such a simple, innocent change that I can't believe it causes this, but it does - repeatably. I've run it about four times now and it crashes every time before getting anywhere near two hours.

<POUT>

--
Jason Teagle
[EMAIL PROTECTED]
 


_______________________________________________
msvc mailing list
[email protected]
See http://beginthread.com/mailman/listinfo/msvc_beginthread.com for subscription changes, and list archive.