Re: SMP: inter-process communication
On 02/23/2010 04:30 AM, Tsantilas Christos wrote:

> - master process communicates with one-direction pipes with kids, sending various commands ... Master/kids communication using pipes is used by the Apache worker server. I am also using it in my ICAP server. I am not seeing any problem.

I would prefer to have a single IPC mechanism. The parent/kid pipe is a candidate. Here are the properties of the ideal IPC mechanism:

1) Can be used to pass descriptors among processes (so that two kids can listen on the same port and so that we can implement an acceptor/worker design if we want to).
2) Can be used to access any squidN process (to get hits from future cacheN processes and to implement a multiple-acceptors/workers design if we want to).
3) Is based on select(2)-able descriptors.
4) Can be ported to or emulated on Windows and other corner-case OSes.

If I missed any, please add.

Let's evaluate parent/kid pipes as the IPC:

Property 1 can be satisfied due to fork() side effects. However, this restricts descriptor sharing or exchange to the parent/kid relationship. You cannot share or exchange descriptors among siblings.
Property 2 cannot be satisfied directly but can be emulated by kids sending the command to the parent and the parent routing/relaying the command to the other kid.
Property 3 is satisfied.
Property 4 can be satisfied, I guess, but some emulation may be necessary.

Here is a similar evaluation for Unix domain sockets:

Property 1 is satisfied.
Property 2 is satisfied.
Property 3 is satisfied.
Property 4 can be satisfied, but only with emulation using Windows-specific IPC mechanisms.

So, neither parent/kid pipes nor UDS are ideal. Both are candidates. I lean towards UDS because they make communication between kids or even unrelated processes possible. I am worried about restricting all communications to communication with a parent.

What do you think?

Thank you,

Alex.

P.S. To have multiple processes, we will need to fork one way or the other.
Some IPCs (e.g., parent/kid pipes) may dictate certain forking methods. A forking daemon suggested by Robert sounds like a good optimization.
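As an aside, Property 1 for UDS (descriptor passing between otherwise unrelated processes) can be sketched in a few lines. This is only an illustration of the POSIX SCM_RIGHTS mechanism, not Squid code; it assumes Python 3.9+ on a Unix system:

```python
import os
import socket

# The IPC channel: a connected pair of Unix domain sockets.
parent_sock, child_sock = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)

# An arbitrary descriptor to share -- here, the write end of a pipe.
read_fd, write_fd = os.pipe()

pid = os.fork()
if pid == 0:
    # Kid: receive the descriptor over the Unix domain socket and use it.
    parent_sock.close()
    msg, fds, _flags, _addr = socket.recv_fds(child_sock, 1024, 1)
    os.write(fds[0], b"hello from the kid")
    os._exit(0)

# Parent: pass the write end to the kid via SCM_RIGHTS, then drop our copy.
child_sock.close()
socket.send_fds(parent_sock, [b"take this fd"], [write_fd])
os.close(write_fd)
os.waitpid(pid, 0)
data = os.read(read_fd, 64)
```

The same mechanism works between siblings or unrelated processes, which is exactly what parent/kid pipes cannot offer.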
SMP: inter-process communication
On 02/21/2010 06:10 PM, Amos Jeffries wrote:
> On Mon, 22 Feb 2010 02:03:31 +0100, Henrik Nordström hen...@henriknordstrom.net wrote:
>> On Mon 2010-02-22 at 11:44 +1100, Robert Collins wrote:
>>> command protocol for it would be pretty similar to the SHM disk IO helper, but for processes. Something like:
>>>   squid-helper: spawn stderrfd argv (escaped/encoded to be NULLZ-string safe)
>>>   helper-squid: pid, stdinfd, stdoutfd
>>> Which requires UNIX domain sockets for fd passing, and an unknown implementation on Windows. This would permit several interesting things:
>>> - starting helpers would no longer need massive VM overhead
>>> - we won't need to worry about vfork, at least for a while
>>> - starting helpers can be really async from Squid core processing (at the moment everything gets synchronised)
>> Yes. +1 in general, with the reservation that I want to hear back from Guido on what options there may be on the Windows platform.
> +1 with the same reservations.

The above was posted on the "immortal helpers" thread. I am responding to it under the SMP subject because the discussion is related to the IPC mechanism selection as well.

I agree that we need Guido's guidance on how to pass descriptors on Windows. However, I believe that:

1) There is just no single mechanism that would work well on both Windows and Unix.
2) The Windows portability problem has been solved by ACE and possibly the Apache folks, so we can solve it as well. I have seen references to similar-but-different descriptor sharing features available on Windows.

Thus, we can proceed without sufficient Windows expertise, as long as the IPC mechanism is declared as a Squid API (and implemented as a wrapper for relevant OS-specific features such as UDS on Unix). I believe ACE uses the same approach. When somebody wants to support SMP on Windows, they would need to provide the right implementation of the IPC API on that platform, but the rest of the code will remain virtually unchanged.

Do you agree?

Thank you,

Alex.
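The "declare an IPC API, wrap the OS-specific feature behind it" idea could look something like the sketch below. All names here (`IpcChannel`, `UnixSocketChannel`, `send_message`, `recv_message`) are hypothetical illustrations, not an actual Squid API:

```python
import abc
import socket

class IpcChannel(abc.ABC):
    """Hypothetical portable IPC API. A Windows port would supply its own
    subclass built on whatever the platform offers; callers never change."""
    @abc.abstractmethod
    def send_message(self, data: bytes) -> None: ...
    @abc.abstractmethod
    def recv_message(self, max_size: int = 4096) -> bytes: ...

class UnixSocketChannel(IpcChannel):
    """POSIX backend built on a Unix domain socketpair.
    Single-process loopback demo: one end sends, the other receives."""
    def __init__(self):
        self._send_end, self._recv_end = socket.socketpair(
            socket.AF_UNIX, socket.SOCK_DGRAM)
    def send_message(self, data: bytes) -> None:
        self._send_end.send(data)
    def recv_message(self, max_size: int = 4096) -> bytes:
        return self._recv_end.recv(max_size)

chan = UnixSocketChannel()
chan.send_message(b"reconfigure")
cmd = chan.recv_message()
```

The point is only that the caller-visible interface carries no Unix specifics, so swapping the backend does not ripple through the rest of the code.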
Re: SMP: inter-process communication
> So, neither parent/kid pipes nor UDS are ideal. Both are candidates. I lean towards UDS because they make communication between kids or even unrelated processes possible. I am worried about restricting all communications to communication with a parent.

I would also lean towards UDS, including as the IPC for shutdown/reconfigure etc. (with a fallback on signals in case the process is stuck, busy-looping, or similar).

--
/kinkie
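The "UDS command channel with a signal fallback" idea can be sketched as follows. This is a toy illustration (the command names and timeout are made up), demonstrated against our own process since no real worker is running:

```python
import os
import signal
import socket

received = []
signal.signal(signal.SIGTERM, lambda signum, frame: received.append(signum))

# The UDS command channel; nothing ever answers on the other end here.
cmd_sock, _worker_end = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)
cmd_sock.settimeout(0.1)

def request_shutdown(sock, pid):
    """Try the UDS command channel first; fall back to a signal if the
    process does not acknowledge in time (stuck, busy-looping, ...)."""
    try:
        sock.send(b"shutdown")
        sock.recv(16)                 # wait briefly for an acknowledgement
        return "ipc"
    except socket.timeout:
        os.kill(pid, signal.SIGTERM)  # the signal fallback path
        return "signal"

# The worker never acknowledges, so the fallback fires (on ourselves here).
path = request_shutdown(cmd_sock, os.getpid())
```

A real implementation would of course direct the signal at the stuck kid's pid rather than its own.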
Re: SMP: inter-process communication
On Tue, 23 Feb 2010 09:45:56 -0700, Alex Rousskov rouss...@measurement-factory.com wrote:

> Thus, we can proceed without sufficient Windows expertise, as long as the IPC mechanism is declared as a Squid API (and implemented as a wrapper for relevant OS-specific features such as UDS on Unix). I believe ACE uses the same approach. When somebody wants to support SMP on Windows, they would need to provide the right implementation for the IPC on that platform but the rest of the code will remain virtually unchanged. Do you agree?

Yes, but... we need to ensure that the wrapper API realistically fits the capabilities of the possible backends.

Amos
Re: SMP: inter-process communication
On 02/21/2010 02:45 PM, Henrik Nordström wrote:

> I guess the main question to ask is interaction between processes. Mainly sharing of cache etc. How do these impact the chosen model? In the longer-term model I see that we will have several cooperating processes, for example:
> - N processes monitoring http_port, forwarding requests. Maybe several different configurations used among these processes.
> - M processes maintaining caches (object, ip, etc.) shared by some/all of the above. The exact model of how this is done is yet to be determined.
> - X shared data resources of different kinds with no dedicated process.

I agree that the above are realistic use cases that will need to be supported one way or another. I was planning to post a separate message about that...

The only inter-process cooperation I plan to support initially is N processes monitoring the same http_port (and doing everything else). I know of two ways to support that specific use case:

A) One dedicated process starts listening and fork()s other processes that can then also listen on the same http_port socket descriptor.
B) One process starts listening and sends the others the open socket descriptor via UNIX domain sockets, STREAMS, doors, etc.

I am working on option (B). While more complex, I think (B) is much more powerful and flexible than (A). For example, (A) cannot efficiently support reconfiguration when http_ports need changing.

Within option (B), I am leaning towards UNIX domain sockets as the IPC mechanism. I would prefer to use something that would work on Windows as well, but I do not know what that would be. Eventually, we can support more than one IPC mechanism, of course, and Windows has something similar to Unix domain sockets.

If you think a different approach would be better, please let me know.

Thank you,

Alex.

P.S.
For data sharing between the cache and other processes, mmaps may be an attractive option, but I have not investigated the details yet (including selection of the best synchronization mechanism).
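For concreteness, anonymous shared mmaps do give cooperating processes a common page with no copying; a minimal fork-based sketch (deliberately omitting the synchronization question raised above):

```python
import mmap
import os
import struct

# Anonymous shared mapping: on Unix this is MAP_SHARED|MAP_ANON, and the
# mapping is inherited across fork(), so parent and kid see the same bytes.
shm = mmap.mmap(-1, 8)

pid = os.fork()
if pid == 0:
    # Kid: publish a value (say, a hit counter) into the shared page.
    shm.seek(0)
    shm.write(struct.pack("Q", 12345))
    os._exit(0)

os.waitpid(pid, 0)
shm.seek(0)
value, = struct.unpack("Q", shm.read(8))
```

A real shared cache would need locks or atomic operations on top of this; the sketch only shows the data-sharing substrate.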
Re: SMP: inter-process communication
On Sun 2010-02-21 at 17:10 -0700, Alex Rousskov wrote:

> The only inter-process cooperation I plan to support initially is N processes monitoring the same http_port (and doing everything else).

I guess there will be no shared cache then?

> I am working on option (B). While more complex, I think (B) is much more powerful and flexible than (A). For example, (A) cannot efficiently support reconfiguration when http_ports need changing.

Not without restarting the worker processes, no.

> If you think a different approach would be better, please let me know.

If things get broken down as follows:

* Forwarding processes. Listen on an http_port, process protocols, forward requests. Limited internal caching.
* Persistent object caches, disk and/or memory.
* ICP/HTCP.
* DNS cache.

each with their own process, then for most purposes (A) would work fine. A config change involving http_port changes then sets up new worker processes for those ports and tells the existing ones to shut down.

Regards
Henrik
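Option (A), listen-before-fork, is worth seeing in miniature: the kids inherit the listening descriptor and can all accept() on it with no descriptor passing at all. A toy sketch (TCP on a loopback port standing in for http_port):

```python
import os
import socket

# Option (A): the parent opens the listening socket before forking...
lsock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
lsock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
lsock.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
lsock.listen(8)
port = lsock.getsockname()[1]

kids = []
for _ in range(2):
    pid = os.fork()
    if pid == 0:
        # ...and each kid inherits the descriptor and can accept() directly.
        conn, _ = lsock.accept()
        conn.sendall(b"kid says hi")
        conn.close()
        os._exit(0)
    kids.append(pid)

# Exercise the kids: one connection per kid, since each accepts exactly once.
replies = []
for _ in range(2):
    client = socket.create_connection(("127.0.0.1", port))
    replies.append(client.recv(64))
    client.close()
for pid in kids:
    os.waitpid(pid, 0)
```

The limitation discussed above is also visible here: the set of shared sockets is fixed at fork() time, so an http_port change means starting replacement kids.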
Re: SMP: inter-process communication
On Sun, 21 Feb 2010 17:10:34 -0700, Alex Rousskov rouss...@measurement-factory.com wrote:

> I am working on option (B). While more complex, I think (B) is much more powerful and flexible than (A). For example, (A) cannot efficiently support reconfiguration when http_ports need changing. Within option (B), I am leaning towards UNIX domain sockets as an IPC mechanism.

I think Robert's concept of a spawn process offers some benefits for the implementation of this: we end up with a process that does all the spawning and sends back a control FD for each child process (of any type). The interior API would be something along the lines of spawn_process(binary-path, struct-to-receive-FDs, [callback-to-receive-FDs]).

The problem, as Henrik mentioned, is Windows, so all this is pending Guido's feedback on what the possibilities there are. Although with a simple API interface like the above, we could plug in several fork implementations where available.

Amos
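The core of such a spawn helper is a small fork/exec routine that hands back the child's control descriptors. A sketch, borrowing the spawn_process name from the message above (the signature here is simplified and hypothetical; a real dedicated spawner would return the FDs over a UDS rather than in-process):

```python
import os

def spawn_process(binary_path, argv):
    """Fork/exec the binary with fresh stdin/stdout pipes and return
    (pid, write-to-child fd, read-from-child fd) as the control FDs."""
    in_r, in_w = os.pipe()    # squid -> helper
    out_r, out_w = os.pipe()  # helper -> squid
    pid = os.fork()
    if pid == 0:
        os.dup2(in_r, 0)      # helper reads commands on stdin
        os.dup2(out_w, 1)     # helper writes replies on stdout
        for fd in (in_r, in_w, out_r, out_w):
            os.close(fd)
        os.execv(binary_path, argv)
        os._exit(127)         # exec failed
    os.close(in_r)
    os.close(out_w)
    return pid, in_w, out_r

# Demo with /bin/cat as a stand-in echo helper.
pid, to_child, from_child = spawn_process("/bin/cat", ["cat"])
os.write(to_child, b"ping\n")
os.close(to_child)            # EOF lets cat flush and exit
line = os.read(from_child, 64)
os.waitpid(pid, 0)
```

This captures the "helpers start without massive VM overhead" point only partially (a dedicated spawner process, forked early while Squid is small, is what actually avoids it), but the returned-FDs interface is the same.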
Re: SMP: inter-process communication
On 02/21/2010 06:10 PM, Henrik Nordström wrote:

> I guess there will be no shared cache then?

Not initially, but that is the second-step goal.

> If things get broken down as follows ... each with their own process, then for most purposes (A) would work fine. A config change involving http_port changes then sets up new worker processes for those ports and tells the existing ones to shut down.

I agree in general, although I am not sure using processes dedicated to ICP or DNS caching is worth the overheads. These details are not important for now, though.

If we go with (A), that is, create all the sockets you need to share and then fork() processes, the processes still need a mechanism to communicate with each other. For example, a forwarding process needs to communicate with the cache process. This can be done using those pre-created [Unix domain] sockets, I guess.

What does (A) buy us? (A) makes it easier to pre-arrange how processes communicate with each other, right? Any other advantages?

As we discussed, (A) makes it more difficult to support some reconfigurations, as some processes need to finish all their current activities and then quit nicely while other, new processes start up in their place. This affects significant portions of the code and is not something we currently support (at least not nicely). Also, it may significantly increase CPU and RAM requirements while the two process categories co-exist (the old dying ones and the new ones).

Is (A) really better than (B)?

Thank you,

Alex.
Re: SMP: inter-process communication
On Sun, 2010-02-21 at 20:18 -0700, Alex Rousskov wrote:

> > I guess there will be no shared cache then?
> Not initially, but that is the second-step goal.

I suggest using CARP then, to route to backends.

-Rob
Re: SMP: inter-process communication
On 02/21/2010 08:21 PM, Robert Collins wrote:

> I suggest using CARP then, to route to backends.

In the initial implementation, there are no backends. You get N processes that do the same thing, including listening on the same http_port.

Whether it would be more efficient to do CARP or a distributed-but-shared cache among the same squidN processes is an open question for the second step. I would not be surprised if the answer depends on the traffic/environment.

Alex.
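For readers unfamiliar with the CARP suggestion: the routing idea is rendezvous-style hashing, where each member scores a URL and the highest score wins, so the mapping stays stable while the member set does. A toy illustration only (real CARP specifies its own hash function and load factors, and this is not Squid's implementation):

```python
import hashlib

def carp_route(url, members):
    """Pick the member with the highest hash score for this URL."""
    def score(member):
        digest = hashlib.sha1((member + url).encode()).digest()
        return int.from_bytes(digest[:8], "big")
    return max(members, key=score)

members = ["squid1", "squid2", "squid3"]
backend = carp_route("http://example.com/a", members)
```

The stability property is what matters for a cache array: the same URL always lands on the same squidN process, giving each process an effectively partitioned cache without any shared state.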