Just to add my 2 cents:
I have been looking at the ZServer code. The only times fork or system (which I presume invokes execve) are called are at startup, to either
a) run a cmdline
b) daemonize
Here's a snippet of strace output from
strace -o strace.txt -f -e trace=fork,execve ./runzope
run on a ZEO instance:
15298 execve("./runzope", ["./runzope"], [/* 23 vars */]) = 0
15298 execve("/var/dev/vision/python233/bin/python", ["/var/dev/vision/python233/bin/py"..., "/var/dev/vision/Zope27/lib/python
which creates a child process with id 15299.
At this point the asyncore main loop thread has not even started, so it is safe to assume that the parent does not start the asyncore loop for any servers it creates; that happens in the forked child. This means we probably cannot have multiple asyncore main loops running.
The ZEO client causes threads to be created, but there are no fork or system calls as far as I (or strace, for that matter) can tell.
Can you please point out where in the ZEO code forking occurs? I will try to duplicate this condition.
-ty
sathya
[Dieter Maurer]
The problem occurred in a ZEO client which called "asyncore.poll" in the forked subprocess. This "poll" deterministically stole ZEO server invalidation messages from the parent.
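To make the reported mechanism concrete, here is a minimal sketch of how a forked child can consume data that was meant for its parent. It does not use ZEO or asyncore at all: it only assumes a Unix system where os.fork() is available, and the function name fork_and_steal and the payload are made up for illustration. The point is just that the child inherits the parent's socket, so whichever process reads first gets the bytes.

```python
import os
import socket


def fork_and_steal(payload=b"invalidate oid 42"):
    """Return (child_read, parent_read) for one message on a shared socket.

    Hypothetical helper for illustration only; it is not ZEO code.
    """
    sender, receiver = socket.socketpair()
    report_r, report_w = os.pipe()
    sender.sendall(payload)  # the message is now queued on `receiver`

    pid = os.fork()
    if pid == 0:
        # Child: `receiver` was inherited, so this read drains the
        # very same kernel buffer the parent would have read from.
        try:
            os.write(report_w, receiver.recv(1024))
        finally:
            os._exit(0)

    # Parent: wait for the child, then see what is left for us.
    os.close(report_w)
    os.waitpid(pid, 0)
    child_read = os.read(report_r, 1024)
    os.close(report_r)

    receiver.setblocking(False)
    try:
        parent_read = receiver.recv(1024)
    except BlockingIOError:
        parent_read = b""  # nothing left: the child consumed it
    sender.close()
    receiver.close()
    return child_read, parent_read
```

In this sketch the parent deliberately waits for the child before reading, which makes the loss deterministic; with both processes polling concurrently, which one wins each message would depend on scheduling.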
I'm sorry, but this is still too vague to guess what happened.
- Which operating system was in use?
- Which thread package?
- In the ZEO client that called fork(), did it call fork() directly, or indirectly as the result of a system() or popen() call? Or what? I'd like to understand a specific failure before rushing to generalization.
- In the ZEO client that called fork() (whether directly or indirectly), was fork called *from* the thread running ZEO's asyncore loop, or from a different thread?
I read the Linux "fork" manual page and found:
fork creates a child process that differs from the parent process only in its PID and PPID, and in the fact that resource utilizations are set to 0. File locks and pending signals are not inherited.
...
The fork call conforms to SVr4, SVID, POSIX, X/OPEN, BSD 4.3
If it conforms to POSIX (as it says it does), then fork() also has to satisfy the huge list of requirements I referenced before:
http://www.opengroup.org/onlinepubs/009695399/functions/fork.html
That page is the current POSIX spec for fork().
I concluded that if the only difference is in the PID/PPID
and resource utilizations, there is no difference in the threads between parent
and child.
Except that if you're running non-POSIX LinuxThreads, a thread *is* a process (there's a one-to-one relationship under LinuxThreads, not the many-to-one relationship in POSIX), in which case "no difference in threads" is trivially true.
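For contrast, the POSIX rule is that the child of a multithreaded process contains only a replica of the thread that called fork(). The following is a rough sketch of how one might check this from Python on a given box, assuming a Unix system with os.fork(); the function name thread_counts_across_fork is made up, and the count reported is Python's own view of its threads in the child, not a direct query of the kernel.

```python
import os
import threading
import warnings


def thread_counts_across_fork():
    """Return (threads_in_parent, threads_in_child) around a fork().

    Hypothetical helper for illustration only.
    """
    stop = threading.Event()
    worker = threading.Thread(target=stop.wait, daemon=True)
    worker.start()  # the parent now has at least two threads
    parent_count = threading.active_count()

    read_fd, write_fd = os.pipe()
    with warnings.catch_warnings():
        # Recent Pythons warn about fork() in a multithreaded process.
        warnings.simplefilter("ignore", DeprecationWarning)
        pid = os.fork()
    if pid == 0:
        # Child: report how many threads Python sees here, then exit
        # without running the parent's cleanup handlers.
        try:
            os.write(write_fd, str(threading.active_count()).encode())
        finally:
            os._exit(0)

    os.close(write_fd)
    os.waitpid(pid, 0)
    child_count = int(os.read(read_fd, 16).decode())
    os.close(read_fd)

    stop.set()
    worker.join()
    return parent_count, child_count
```

If the child reports a single thread, then a thread running an asyncore loop in the parent is not silently carried over into the child; the child only polls the inherited sockets if its own code calls into asyncore.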
This would mean that the widespread "asyncore.mainloop" threads could suffer the same message loss and message duplication.
That's why all sane <wink> threading implementations do what POSIX does on a fork(). fork() and threading don't really mix well under POSIX either, but the "fork+exec" model for starting a new process is an historical burden that bristles with subtle problems in a multithreaded world; POSIX introduced posix_spawn() and posix_spawnp() for sane(r) process creation, ironically moving closer to what most non-Unix systems have always done to create a new process.
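As a sketch of what the posix_spawn() route looks like from Python: newer Python versions (3.8 and later, on platforms that provide it) expose the call as os.posix_spawn(), which was not available in the Python of this thread. The helper name spawn_and_wait is made up for illustration. The new process is created directly from the program path and argument list, without the caller ever running Python code in a forked copy of itself.

```python
import os
import sys


def spawn_and_wait(argv):
    """Start argv[0] with os.posix_spawn() and return its exit code.

    Hypothetical helper for illustration only. Returns None if the
    child did not exit normally (for example, it was killed by a signal).
    """
    pid = os.posix_spawn(argv[0], argv, dict(os.environ))
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return None


if __name__ == "__main__":
    # Run a trivial child interpreter and show its exit code.
    print(spawn_and_wait([sys.executable, "-c", "raise SystemExit(3)"]))
```

Inherited file descriptors are still an issue with this approach unless they are marked close-on-exec, so it narrows the window for the problems discussed here rather than removing them.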
I did not observe a message loss/duplication in any application with an "asyncore.mainloop" thread.
I don't understand. You said that you *have* seen message loss/duplication in a ZEO client, and I assume the ZEO client was running an asyncore thread. If so, then you have seen loss/duplication in an application with an asyncore thread.
Or are you saying that you haven't seen loss/duplication under the specific Linux flavor whose man page you quoted, but have seen it under some other (so far unidentified) system?
Maybe the Linux "fork" manual page is just imprecise with respect to threads, and the problem does not occur in applications with a standard "asyncore.mainloop" thread.
That "fork" manpage is clearly missing a mountain of crucial details
(or it's not telling the truth about being POSIX-compliant). fork()
is historically poorly documented, though.
_______________________________________________
Zope-Dev maillist - [EMAIL PROTECTED]
http://mail.zope.org/mailman/listinfo/zope-dev
** No cross posts or HTML encoding! **
(Related lists - http://mail.zope.org/mailman/listinfo/zope-announce
http://mail.zope.org/mailman/listinfo/zope )
-- =================================================== CEO ZeOmega Open minds' Open Solutions
Plano, Texas, USA Bangalore, India 972-731-6750 (O) 214-733-3467 (M) http://www.zeomega.com
Open source content management and workflow solutions
====================================================