Thank you for the explanation. That does make sense: when I measure the time spent running the tasklets, running two (identical) tasklets takes more than twice as long as running one, so the added 30% is definitely not being spent in my number-crunching tasklets.

I'll be reimplementing my execution environment using processes sometime soon :-)

Best regards
Mads

Kristján Valur Jónsson wrote:
There are probably two reasons for this.
a) The GIL is released for the duration of any time-consuming system call.  
This allows time for another thread to step in.
b) Acquiring the lock, at least on Windows, will cause the thread to do a few 
hundred trylock spins.  In fact, this should be removed on Windows since it is 
not appropriate for a resource normally occupied...

The effect of b) is probably small.  But a) is real, and it suggests that a 
large portion of the time is spent outside of Python, performing system calls 
such as send() and recv(), which is hardly surprising.
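
A minimal sketch of point a), not taken from the original discussion: two threads that each block for the same duration finish in roughly the time of one, because CPython releases the GIL while a thread waits. Here time.sleep() is an assumed stand-in for a blocking system call such as recv(), and the names blocking_call and wall_time_in_threads are hypothetical.

```python
import threading
import time


def blocking_call(seconds=0.3):
    # time.sleep() stands in for a blocking system call such as recv();
    # CPython releases the GIL while the thread waits.
    time.sleep(seconds)


def wall_time_in_threads(target, count=2):
    """Run `target` in `count` threads and return the elapsed wall-clock time."""
    threads = [threading.Thread(target=target) for _ in range(count)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


elapsed = wall_time_in_threads(blocking_call)
print("two blocking threads took %.2fs" % elapsed)
```

If the GIL were held during the wait, the two calls would run back to back and take about twice as long.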

K

-----Original Message-----
From: [email protected] [mailto:[email protected]] 
On Behalf Of Mads Darø Kristensen
Sent: 25. mars 2009 08:29
To: stackless list
Subject: Re: [Stackless] question on preemtive scheduling semantics

Replying to myself here...

I have now tested it more thoroughly, and I get some surprising results
(surprising to me at least). When running a single-threaded stackless
scheduler I get the expected 100% CPU load when I try to stress it, but
running two threads on my dual core machine yielded a CPU load of
approximately 130%? What gives?

Since the global interpreter lock should get in the way of utilizing
more than one core, shouldn't two threads (and two schedulers) yield
the same 100% CPU load as a single thread did?
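
One way to put a number on this, sketched under assumptions: divide the CPU time consumed by the process by the elapsed wall-clock time while one or two threads run a pure-Python loop. The functions crunch and cpu_utilization are hypothetical names, plain threads stand in for threads running stackless schedulers, and the figures will vary by machine and interpreter.

```python
import threading
import time


def crunch(iterations=2_000_000):
    # Pure-Python number crunching; the thread holds the GIL while it runs.
    total = 0
    for i in range(iterations):
        total += i * i
    return total


def cpu_utilization(thread_count):
    """Return process CPU time divided by wall-clock time
    while `thread_count` threads run crunch()."""
    threads = [threading.Thread(target=crunch) for _ in range(thread_count)]
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    return cpu / wall


for n in (1, 2):
    print("%d thread(s): %.0f%% CPU" % (n, 100 * cpu_utilization(n)))
```

A ratio near 1.0 corresponds to the 100% load described above; a ratio well above 1.0 with two threads means CPU time is being spent somewhere that does not require the GIL.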

I'm not here to start another "global interpreter lock" discussion, so
if there are obvious answers to be found in the mailing list archives
just tell me to RTFM :)

Best regards
Mads

Mads Darø Kristensen wrote:
Hi Jeff.

Jeff Senn wrote:
Hm. Do you mean "thread" or "process"? Because of the GIL you cannot use
threads to overlap python
execution within one interpreter (this has been discussed at great
length here many times...) --
depending on how you are measuring, perhaps you would aspire to get
200%, 400% ...etc for multicore....
I mean thread, not process. And what I meant by 100% utilization was
200% for the 2-core Mac I tested on... At least that was what I thought
I saw - I'll have to test that again some time :-)

Best regards
Mads

_______________________________________________
Stackless mailing list
[email protected]
http://www.stackless.com/mailman/listinfo/stackless


--
Med venlig hilsen / Best regards
Mads D. Kristensen

Blog: http://kedeligdata.blogspot.com/
Work homepage: http://www.daimi.au.dk/~madsk
