Dan Melomedman writes:
> Sam Varshavchik writes:
>
>> Dan Melomedman writes:
>>> Sam Varshavchik writes:
>>>> Dan Melomedman writes:
>>>>> I could be wrong here, but running a process per connection is
>>>>> wasteful on very busy servers in any case. In other words,
>>>>> correctly designed multi-threaded (not necessarily on Linux,
>>>>> and not necessarily with pthreads) servers conserve memory and
>>>>> require less kernel scheduling overhead, if any.
>>>
>>>> Only on platforms where processes are expensive. I really don't see
>>>> much of a difference between a process and a thread.
>>> Expensive in comparison to what?
>> A thread. There are some platforms <cough>Solaris</cough>
>> <cough>NT</cough> where, for some weird reason, creating a process takes
>> much longer than creating a thread, even though there isn't much of a
>> technical difference between the two.
>
> What kind of threads? Kernel threads? User-space threads?
User-space threads.
> In libraries like state threads you have a number of threads running in
> one process. With Linux threads one thread is one process with all its
> scheduling overheads.
The ONLY difference between a thread and a process is that one uses a shared
address space, the other doesn't. Scheduling is not a factor. Now, you may
use different scheduling logic for threads versus processes, but that's just
a design choice. Either way, whether you're scheduling a process or a
thread, you have to do a context switch. The only difference is which
address space you are doing a context switch into. Doing a context switch
between two threads does not involve loading a new MMU map, and I'm sure
Linux is smart enough not to go through the overhead of reloading a new MMU
map if it's switching between two processes in the same address space.
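To make the shared-address-space point concrete, here is a minimal sketch in Python (an illustration only, assuming a Unix-like system where os.fork() is available; the names counter and bump are made up for this example). A thread's write to a variable is visible to the rest of the process, while a forked child's write lands in its own copy:

```python
import os
import threading

counter = 0  # lives in this process's address space


def bump():
    """Increment the counter in whatever address space we are running in."""
    global counter
    counter += 1


# A thread shares the address space, so its write is visible here.
t = threading.Thread(target=bump)
t.start()
t.join()
after_thread = counter

# A forked child gets its own copy of the address space, so its write
# never reaches the parent.
pid = os.fork()
if pid == 0:
    bump()        # changes the child's private copy only
    os._exit(0)   # leave the child without running parent cleanup code
os.waitpid(pid, 0)
after_fork = counter

print(after_thread, after_fork)
```

Both cases still involve the kernel creating and scheduling something runnable; the sketch only shows what differs in terms of memory visibility.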
>> Actually it should scale better than trying to do everything
>> yourself. When you let the OS handle threading, the OS can
>> schedule one thread for execution when another thread is waiting
>> for I/O. You do not have this flexibility if you run the threading
>> code yourself.
>
> It does not scale better.
> Pthreads mutexes waste CPU time. Additionally, thread safety is not a
In a particularly bad implementation of pthreads, perhaps. In a good
implementation the CPU will go ahead and do something else while a thread is
waiting on a mutex.
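A rough way to check this on a given system is to compare wall-clock time with CPU time while a thread sits blocked on a lock. The sketch below uses Python's threading.Lock as a stand-in (an assumption: it is not a raw pthreads mutex, and how it is implemented depends on the interpreter and platform). If the waiter were spinning, the CPU time would be close to the wall time:

```python
import threading
import time

lock = threading.Lock()


def waiter():
    """Block until the main thread releases the lock, then return."""
    with lock:
        pass


lock.acquire()                      # main thread holds the lock
t = threading.Thread(target=waiter)

cpu_start = time.process_time()     # CPU time used by the whole process
wall_start = time.monotonic()

t.start()
time.sleep(0.5)                     # waiter stays blocked for about half a second
lock.release()
t.join()

cpu_used = time.process_time() - cpu_start
wall_used = time.monotonic() - wall_start
print(f"wall {wall_used:.2f}s, cpu {cpu_used:.3f}s")
```

On an implementation that puts the waiting thread to sleep, the CPU figure should be a small fraction of the wall figure.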
> concern with state threads. State threads do not have these problems
> (they support thread-specific private data), but yes, if one thread
> blocks, all your threads are blocked in state threads. This is the only
> problem.
> But as a work-around you prefork helper processes (which could be
> multithreaded with pthreads or some other library, or not) to do disk
> reads or any I/O that's not scheduled by state threads. This way only
> helper processes block, not the main processes and their threads.
More complexity, and more potential sources of bugs. A saying comes to mind:
the more you overthink the plumbing, the easier it is to stop up the drain.
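For a sense of how much plumbing that workaround involves, here is a rough sketch in Python of a helper process doing a blocking disk read on behalf of a main loop. It assumes a Unix-like system, and it is not the state threads API: start_helper and main_loop are hypothetical names, and select() stands in for whatever the user-space scheduler would poll.

```python
import os
import select
import tempfile


def start_helper(path):
    """Fork a helper that does the blocking read and pipes the data back."""
    r, w = os.pipe()
    pid = os.fork()
    if pid == 0:
        try:
            os.close(r)
            with open(path, "rb") as f, os.fdopen(w, "wb") as out:
                out.write(f.read())   # only the helper blocks on disk here
        finally:
            os._exit(0)               # never fall back into the parent's code
    os.close(w)
    return pid, r


def main_loop(path):
    """Poll the helper's pipe; between polls the main process is free."""
    pid, r = start_helper(path)
    chunks = []
    idle_ticks = 0
    while True:
        ready, _, _ = select.select([r], [], [], 0.01)
        if not ready:
            idle_ticks += 1           # other user-space threads would run here
            continue
        chunk = os.read(r, 4096)
        if not chunk:                 # helper closed its end: all data received
            break
        chunks.append(chunk)
    os.close(r)
    os.waitpid(pid, 0)
    return b"".join(chunks), idle_ticks


with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello from disk\n" * 1000)
data, idle = main_loop(tmp.name)
os.unlink(tmp.name)
print(len(data), "bytes read via helper")
```

Even this toy version has to deal with pipe ends, child exit paths, and reaping the helper, and a real server would add error reporting and a pool of helpers on top.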
> Also, here's the project that attempts to change the Linux threading
> model:
> http://oss.software.ibm.com/developerworks/opensource/pthreads/
>
> "The goal of this project is to attempt to solve the problems associated
> with the use of the pthreads library on Linux. It will add M:N
> threading capability and improve significantly on the POSIX compliance of
> pthreads on Linux. This will allow significant performance improvements
> for all applications that make use of the pthreads library, particularly
> on SMP machines. It will also enable Linux to provide threading services
> that are more in line with the capabilities of the commercial Unix
> operating system such as IBM AIX and SGI IRIX."
Great. More power to them. But until that happens, processes will work
just fine, and even after any remaining glitches in threading are
eliminated, cheap processes will still work just as well as they did before.
--
Sam