Sam Varshavchik writes:

> Dan Melomedman writes:       
>       
>> Sam Varshavchik writes:        
>>       
>>> Dan Melomedman writes:      
>>>> I could be wrong here, but running a process per connection is wasteful       
>>>> on very busy servers in any case. In other words, correctly designed        
>>>> multi-threaded (not necessarily on Linux, and not necessarily with        
>>>> pthreads) servers conserve memory and require less kernel scheduling        
>>>> overhead if any.      
>>       
>>> Only on platforms where processes are expensive.  I really don't see       
>>> much of a difference between a process and a thread.      
>>       
>> Expensive in comparison to what?      
>       
> A thread.  There are some platforms <cough>Solaris</cough>       
> <cough>NT</cough> where for some weird reasons process creation takes a       
> much greater time than creating a thread, even though there isn't much of       
> a technical difference between the two.       

What kind of threads? Kernel threads? User-space threads?

AFAIK thread scheduling on NT is much faster than kernel-thread scheduling
on Linux. Process creation on NT is slower than it is on Linux, yes, if not
extremely slow by comparison. However, a thread in state threads or a
similar portable user-space library is still MUCH lighter than a process
overall.

In libraries like state threads you have a number of threads running in one
process. With Linux threads, each thread is a process of its own, with all
of the scheduling overhead that implies.
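
To make the "many threads in one process" point concrete, here is a rough
sketch in Python. It is only an illustration of cooperative user-space
scheduling, not how state threads is implemented; the names worker and run
are made up for this example. Each "thread" is a generator, and the kernel
only ever sees the one process that runs the loop.

```python
from collections import deque


def worker(name, steps, log):
    """A hypothetical user-space 'thread': runs one step, then yields."""
    for i in range(steps):
        log.append((name, i))
        # Hand control back to the scheduler. No kernel involvement here.
        yield


def run(tasks):
    """Round-robin scheduler living entirely inside one process."""
    ready = deque(tasks)
    switches = 0
    while ready:
        task = ready.popleft()
        try:
            next(task)          # resume the task until its next yield
        except StopIteration:
            continue            # task finished, drop it
        ready.append(task)      # still alive, requeue it
        switches += 1
    return switches


log = []
switches = run([worker("a", 2, log), worker("b", 2, log)])
```

The two workers interleave without the kernel scheduling anything, which is
also why one blocking call inside a worker would stall all of them.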

>> There's a difference between a thread that the kernel doesn't even see
>> (user-space threading library), and a thread that the kernel has to
>> actually schedule. In the process case you have the kernel bookkeeping
>> structure per process.
>>       
>> If you are talking about Linux and its default native threading
>> library, creating a thread means creating a process, so there's no
>> difference. This does NOT scale well.
>       
> Actually it should scale better than trying to do everything yourself.        
> When you let the OS handle threading, the OS can schedule one thread for       
> execution when another thread is waiting for I/O.  You do not have this       
> flexibility if you run the threading code yourself.       

It does not scale better.
Pthreads mutexes waste CPU time. Additionally, thread safety is not a
concern with state threads. State threads do not have these problems (and
they support thread-specific private data). But yes, if one thread blocks,
all the threads in that process block. This is the only problem. As a
workaround you prefork helper processes (which could be multithreaded with
pthreads or some other library, or not) to do disk reads or any other I/O
that is not scheduled by state threads. This way only the helper processes
block, not the main processes and their threads.
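
A minimal sketch of the helper-process workaround, in Python rather than C
and without state threads itself. The function names are hypothetical, and
for brevity it forks a helper per request instead of preforking a pool. The
helper does the blocking disk read and hands the data back over a pipe,
while the main process only ever waits in select() with a timeout.

```python
import os
import select
import tempfile


def spawn_helper(path):
    """Fork a helper that performs the blocking read on our behalf."""
    rfd, wfd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: the only process allowed to block on disk I/O.
        os.close(rfd)
        try:
            with open(path, "rb") as f:
                view = memoryview(f.read())
            while view:
                written = os.write(wfd, view)
                view = view[written:]
        finally:
            os._exit(0)
    os.close(wfd)
    return pid, rfd


def read_via_helper(path):
    """Main process: poll the pipe instead of blocking on the file."""
    pid, rfd = spawn_helper(path)
    chunks = []
    idle_ticks = 0
    while True:
        ready, _, _ = select.select([rfd], [], [], 0.01)
        if not ready:
            idle_ticks += 1     # here the main process could run other threads
            continue
        chunk = os.read(rfd, 65536)
        if not chunk:           # helper closed its end: all data received
            break
        chunks.append(chunk)
    os.close(rfd)
    os.waitpid(pid, 0)
    return b"".join(chunks), idle_ticks


with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello from the helper\n")
result, idle_ticks = read_via_helper(tmp.name)
os.unlink(tmp.name)
```

In a real server the pipe descriptor would simply be one more descriptor in
the threading library's event loop, next to the client sockets.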

Also, here's the project that attempts to change the Linux threading model:
http://oss.software.ibm.com/developerworks/opensource/pthreads/ 

"The goal of this project is to attempt to solve the problems associated 
with the use of the pthreads library on Linux.    It will add M:N threading 
capability and improve significantly on the POSIX compliance of pthreads on 
Linux.    This will allow significant performance improvements for all 
applications that make use of the pthreads library, particularly on SMP 
machines.   It will also enable Linux to provide threading services that are 
more in line with the capabilities of the commerical Unix operating system 
such as IBM AIX and SGI IRIX." 

Also see http://www.gnu.org/software/pth/ by Ralf S. Engelschall.
Also see http://www.annexia.org/freeware/pthrlib/.
IMO these are very cool, though not multi-process like state threads, so on
SMP machines you will need to run as many server processes as there are
CPUs to scale.
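
Running one server per CPU can be as simple as the sketch below. It assumes
a hypothetical serve(index) function that runs a single-process,
user-space-threaded server. start_workers and wait_all are names invented
for this example and are not part of any of the libraries above.

```python
import os


def start_workers(serve, count=None):
    """Fork one server process per CPU (or `count` if given)."""
    count = count or os.cpu_count() or 1
    pids = []
    for index in range(count):
        pid = os.fork()
        if pid == 0:
            # Child: run the server loop, then exit without returning
            # into the parent's code.
            try:
                serve(index)
            finally:
                os._exit(0)
        pids.append(pid)
    return pids


def wait_all(pids):
    """Reap every worker and collect its raw wait status."""
    return [os.waitpid(pid, 0)[1] for pid in pids]


# Demo with a do-nothing server and a fixed count of two workers.
pids = start_workers(lambda index: None, count=2)
statuses = wait_all(pids)
```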

-- 
Dan
Three days of testing can save 10
minutes reading manuals.
