On Jan 31, 2007, at 6:46 AM, Brian Ford wrote:
On Fri, 26 Jan 2007, Peter Rehley wrote:
Hello,
I tried the latest release of cygwin1.dll (1.5.24-1) and it still is
hanging in the same way. I've tried to debug further with gdb, but
so far I haven't got any useful information out of gdb.
I'll keep trying to get some debug information, but if anyone else
can reproduce the problem I would be most appreciative.
I can reproduce a problem. Your descriptions of it are a bit hard to
follow, so I'm not sure if it is your problem or not. Unfortunately, I
don't have time to debug it right now. I do have a few comments,
though.
Hmm, rereading those descriptions, I see what you mean. I'll try to
clarify.
1) happens when pthread_create fails, basically because resources are
used up. It's a normal error condition.
2) happens when the fork doesn't return. The last message that is
seen is "forking". No messages following it are seen, and no
messages from the main program are seen.
3) happens when the fork returns but has failed. The last message
that is seen is "done here" after the "Unable to fork".
I've tracked what happens after the "done here" message, and the
thread is exiting. So it would seem that the hang is in the main
program.
Why are you creating a thread just to fork/exec another process?
Our main application handles requests from a named socket. Some of
the requests call shell scripts. Most of these shell scripts can
send more requests to the application (I didn't write this, I just
have to maintain it). So for those requests that call shell scripts,
the application has to create a thread and, within the thread, fork
and then exec.
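
For anyone following along, the pattern described above can be sketched
roughly as below. This is only a minimal illustration, not the real
application or the attached test case: run_in_thread, run_script_thread
and struct script_request are hypothetical names, and the sketch joins
the thread right away, whereas the real program presumably goes on
serving requests while the script runs.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

/* Hypothetical per-request record; not taken from the real application. */
struct script_request {
    char *const *argv;  /* argv[0] is the path of the script to run */
    int exit_status;    /* set by the worker thread, -1 on failure */
};

/* Thread body: fork, exec the script in the child, wait in the parent. */
static void *run_script_thread(void *arg)
{
    struct script_request *req = arg;
    int status;
    pid_t pid;

    req->exit_status = -1;
    pid = fork();
    if (pid < 0) {
        perror("Unable to fork");
        return NULL;
    }
    if (pid == 0) {
        execve(req->argv[0], req->argv, environ);
        _exit(127);     /* only reached if execve failed */
    }
    if (waitpid(pid, &status, 0) == pid && WIFEXITED(status))
        req->exit_status = WEXITSTATUS(status);
    return NULL;
}

/* Runs one script on its own thread; returns its exit status or -1. */
int run_in_thread(char *const argv[])
{
    struct script_request req = { argv, -1 };
    pthread_t tid;

    assert(argv != NULL && argv[0] != NULL);
    if (pthread_create(&tid, NULL, run_script_thread, &req) != 0)
        return -1;      /* out of resources: a normal error condition */
    pthread_join(tid, NULL);
    return req.exit_status;
}
```

Calling run_in_thread with { "/bin/sh", "-c", "exit 3", NULL } should
give back 3 on a typical POSIX system.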
Pedantically, I believe you are supposed to call _exit, not exit, if
fork fails, as stated here in the Solaris man page for fork:
An application should call _exit() rather than exit(3C) if
it cannot execve(), since exit() will flush and close standard
I/O channels and thereby corrupt the parent process's
standard I/O data structures. Using exit(3C) will flush
buffered data twice. See exit(2).
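
The double flush the man page warns about can be seen with a small
experiment. This is a sketch under the assumption of a typical POSIX
stdio (I have not checked what Cygwin does here); copies_after_fork is
a made-up helper, not part of any test case in this thread. It leaves
one line sitting in a stdio buffer, forks, and lets the child end with
either exit() or _exit().

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns how many copies of the buffered line reach the file,
 * or -1 on error. child_uses_exit selects exit() (1) or _exit() (0). */
int copies_after_fork(int child_uses_exit)
{
    FILE *fp = tmpfile();
    char buf[64];
    int copies = 0;
    pid_t pid;

    assert(child_uses_exit == 0 || child_uses_exit == 1);
    if (fp == NULL)
        return -1;
    fputs("pending\n", fp);     /* stays in the buffer, not yet written */

    pid = fork();
    if (pid < 0) {
        fclose(fp);
        return -1;
    }
    if (pid == 0) {
        if (child_uses_exit)
            exit(1);            /* flushes the child's copy of the buffer */
        _exit(1);               /* skips stdio cleanup entirely */
    }
    waitpid(pid, NULL, 0);

    fflush(fp);                 /* the parent writes its own copy */
    rewind(fp);
    while (fgets(buf, sizeof buf, fp) != NULL)
        if (strcmp(buf, "pending\n") == 0)
            copies++;
    fclose(fp);
    return copies;
}
```

With exit() in the child I would expect two copies of the line in the
file, and with _exit() just one, because parent and child share the
file offset but each has its own copy of the buffer.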
This is good to know because the same application also runs on
Solaris, although it seems to run fine there.
I don't know, however, if this is really true in Cygwin, but it might
explain some misdiagnosed hangs on your part.
Also, the execve call appears to be suspect. Again, the Solaris man
page for execve states:
The value in
argv[0] should point to a filename that is associated with
the process being started by one of the exec functions.
[snip]
As indicated, argc is at least one and the
first member of the array points to a string containing the
name of the file.
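
One way to make sure argv always meets that requirement is to build it
from the path itself. The helper below is only a sketch; make_argv is a
hypothetical name and is not taken from either test case.

```c
#include <assert.h>
#include <stdlib.h>

/* Returns a malloc'd, NULL-terminated argument vector whose first entry
 * is the path itself, followed by the caller's extra arguments. The
 * caller frees the array (not the strings). Returns NULL if malloc fails. */
char **make_argv(const char *path, const char *const extra[])
{
    size_t n = 0, i;
    char **argv;

    assert(path != NULL);
    while (extra != NULL && extra[n] != NULL)
        n++;

    argv = malloc((n + 2) * sizeof *argv);
    if (argv == NULL)
        return NULL;

    argv[0] = (char *)path;         /* argv[0] names the file being run */
    for (i = 0; i < n; i++)
        argv[i + 1] = (char *)extra[i];
    argv[n + 1] = NULL;             /* execve needs the terminator */
    return argv;
}
```

The result can then be handed to execve(path, argv, envp), so argv[0]
and the path cannot drift apart.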
Attached is a modified test case that fixes a few of these issues, but
still hangs (or stutters; it does appear to proceed after long periods
of time).
I've modified my test case to make sure that execve has valid
arguments, but I still get the hang. FWIW, execve is being used
because of the shell scripts being called.
Peter