On 11/07/2010 20:29, Philippe Sigaud wrote:
> On Sun, Jul 11, 2010 at 20:00, div0 <d...@users.sourceforge.net> wrote:
>> On 11/07/2010 15:28, Philippe Sigaud wrote:
>>> - Why is a 2-thread version repeatedly thrice as fast as a no-thread version? I thought it'd be only twice as fast.
>>
>> Well, if you are running on Windows, my guess is that your 2nd CPU is completely free of tasks, so the thread running on that one is more efficient. I'm pretty sure that the Windows task scheduler tries as much as possible to keep threads running on the last CPU they were on, to reduce cache flushes and memory paging. On your first CPU you'll be paying for the task switching amongst the OS threads & programs. Even if they aren't using much actual CPU time, there's still a very high cost to perform the task swaps. Your program is obviously very artificial; with a more normal program you'll see the ratio drop back down to less than twice as fast.
>
> OK, I get it. Thanks for the explanation!
>
>>> - It's fun to see the process running under Windows: first only 50% CPU (1 thread), then it jumps to ~100%, while the memory is slowly growing. Then brutally back to 50% CPU (no thread).
>>> - 1024 threads are OK, but I cannot reach 2048. Why? What is the limit for the number of spawns I can do? Would that be different if each thread spawned two sub-threads instead of the main one generating 2048?
>>
>> How many did you expect to run? Under 'doze each thread by default gets a megabyte of virtual address space for its stack. So at about 1000 threads you'll be using all of your program's 2GB of address space, then the thread spawn will fail.
>
> That's it, I can get much higher than 1000, for 2 GB of RAM. Then it fails with a core.thread.ThreadException: could not create thread.
There's always an OS-specific limit on the number of real threads available to a process. It varies a lot, but you probably shouldn't count on any more than a thousand or so. See the docs for whatever platform you want to run on.
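For what it's worth, the address-space arithmetic can be checked with a back-of-the-envelope sketch, written in Python only for brevity. The 1 MiB default stack and the 2 GiB of address space are the figures mentioned in this thread; the function name is made up for illustration. The result is only an upper bound, since code, heap and loaded libraries also take address space.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3

def max_threads_by_stack(address_space_bytes, stack_bytes):
    """Upper bound on thread count if every thread reserves a full stack."""
    return address_space_bytes // stack_bytes

# Figures from this thread: 2 GiB of address space, 1 MiB stack per thread.
limit = max_threads_by_stack(2 * GIB, 1 * MIB)
print(limit)  # 2048
```

That would fit with 1024 threads working while 2048 cannot be reached.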
> I tried this because I was reading an article on Scala's actors, where they talk about millions of actors. I guess they are quite different.
Yes, a lot of languages use (what I call) fake threads. That is, you have a real thread which the language runtime uses to simulate multithreading.
So Scala, Erlang and various other languages do that, and you can obviously roll your own implementation if you really need to.
However, fake threads are basically cooperative multithreading, and unless the runtime specifically handles it (e.g. Python or some other interpreted language), a fake thread could stall all the other fake threads that are running in the same real thread.
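To make the "fake thread" idea concrete, here is a minimal sketch of cooperative scheduling using Python generators. It is not how Scala or Erlang implement their actors or processes; the names `worker` and `run_round_robin` are invented for this illustration.

```python
from collections import deque

def worker(name, steps, log):
    """A 'fake thread': it runs until it voluntarily yields."""
    for i in range(steps):
        log.append(f"{name}:{i}")
        yield  # hand control back to the scheduler

def run_round_robin(tasks):
    """Run generator tasks in turn, all on the one real thread."""
    queue = deque(tasks)
    while queue:
        task = queue.popleft()
        try:
            next(task)          # resume the task until its next yield
            queue.append(task)  # not finished yet, so reschedule it
        except StopIteration:
            pass                # task finished, drop it

log = []
run_round_robin([worker("a", 2, log), worker("b", 2, log)])
print(log)  # ['a:0', 'b:0', 'a:1', 'b:1']
```

If one worker looped or blocked without ever reaching its `yield`, the scheduler would never get control back and every other worker would stall, which is the problem described above.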
>> Not sure about Linux, but a similar limitation must apply. The rule of thumb is: don't bother spawning more threads than you have CPUs. You're mostly just wasting resources.
>
> OK. So it means for Joe Average not much more than 2-8 threads?
Well, query for the number of CPUs at runtime and then spawn that many threads. Even today, you can buy a PC with anywhere from 1 to 8 cores (and therefore 1 to 16 real threads running at the same time), so a hard-coded limit seems like a poor idea.
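As a rough sketch of that approach (in Python rather than D, just to keep it short): ask the OS how many logical CPUs there are and start one thread per CPU. `run_one_thread_per_cpu` is a made-up helper name. Note also that in standard CPython threads do not speed up CPU-bound work, so this only shows the sizing logic.

```python
import os
import threading

def run_one_thread_per_cpu(work):
    """Start one thread per logical CPU and collect each thread's result."""
    n = os.cpu_count() or 1  # cpu_count() may return None; fall back to 1
    results = [None] * n

    def runner(index):
        results[index] = work(index)

    threads = [threading.Thread(target=runner, args=(i,)) for i in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results

squares = run_one_thread_per_cpu(lambda i: i * i)
print(len(squares), "threads ran")
```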
I'd have a configuration file that specifies the number of threads, so people can tune the performance for their specific machine and requirements. Even if you've got many cores, your end user might not actually want your application to hammer them all.
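One possible shape for that, again as a hedged Python sketch: read an optional thread count from an INI-style config and fall back to the CPU count when it is absent. The `[threads]` section and `count` key are invented for this example, not an established convention.

```python
import configparser
import os

def thread_count_from_config(text):
    """Return the configured thread count, defaulting to the CPU count."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    default = os.cpu_count() or 1
    # Hypothetical layout: a [threads] section with a 'count' key.
    n = parser.getint("threads", "count", fallback=default)
    return max(1, n)  # never go below one thread

print(thread_count_from_config("[threads]\ncount = 4\n"))  # 4
```

A user who wants the application to leave some cores free just sets a smaller `count`; with no setting, the program uses whatever the machine reports.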
-- My enormous talent is exceeded only by my outrageous laziness. http://www.ssTk.co.uk