Re: basic thread question

2009-08-25 Thread Piet van Oostrum
>>>>> sturlamolden <sturlamol...@yahoo.no> (s) wrote:

>s> On 25 Aug, 01:26, Piet van Oostrum <p...@cs.uu.nl> wrote:
>>> That's because it doesn't use copy-on-write. Thereby losing most of its
>>> advantages. I don't know SUA, but I have vaguely heard about it.

>s> SUA is a version of UNIX hidden inside Windows Vista and Windows 7
>s> (except in Home and Home Premium), but very few seem to know of it.
>s> SUA (Subsystem for UNIX-based Applications) was formerly known as
>s> Interix, which is a certified version of UNIX based on OpenBSD. If you
>s> go to http://www.interopsystems.com (a website run by Interop Systems
>s> Inc., a company owned by Microsoft), you will find a lot of common
>s> unix tools prebuilt for SUA, including Python 2.6.2.

>s> The NT-kernel supports copy-on-write fork with a special system call
>s> (ZwCreateProcess in ntdll.dll), which is what SUA's implementation of
>s> fork() uses.

I have heard about that also, but is there a Python implementation that
uses this? (Just curious, I am not using Windows.)
-- 
Piet van Oostrum p...@cs.uu.nl
URL: http://pietvanoostrum.com [PGP 8DAE142BE17999C4]
Private email: p...@vanoostrum.org
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: basic thread question

2009-08-25 Thread sturlamolden
On 25 Aug, 13:33, Piet van Oostrum <p...@cs.uu.nl> wrote:

> I have heard about that also, but is there a Python implementation that
> uses this? (Just curious, I am not using Windows.)

On Windows we have three different versions of Python 2.6:

* Python 2.6 for Win32/64 (from python.org) does not have os.fork.

* Python 2.6 for Cygwin has os.fork, but it is non-COW and sluggish.

* Python 2.6 for SUA has a fast os.fork with copy-on-write.

You get Python 2.6.2 for SUA prebuilt by Microsoft from 
http://www.interopsystems.com.

Using Python 2.6 for SUA is not without surprises. For example, the
process does not run in the Win32 subsystem, so the Windows API is
inaccessible. That means we cannot use the native Windows GUI. Instead
we must run an X11 server in the Windows subsystem (e.g. X-Win32) and
use the Xlib that SUA has installed. You can compare SUA to a
stripped-down Linux distro, on which you have to build and install
most of the software you want to use. I do not recommend using Python
for SUA instead of Python for Windows unless you absolutely need a
fast os.fork or have a program that otherwise requires Unix. But for
running Unix apps on Windows, SUA is clearly superior to Cygwin.
Licensing is also better: programs compiled against the Cygwin
libraries are GPL (unless you buy a commercial license). Programs
compiled against the SUA libraries are not.



Sturla Molden


Re: basic thread question

2009-08-24 Thread Piet van Oostrum
>>>>> Dennis Lee Bieber <wlfr...@ix.netcom.com> (DLB) wrote:

>DLB> On Sun, 23 Aug 2009 22:14:17 -0700, John Nagle <na...@animats.com>
>DLB> declaimed the following in gmane.comp.python.general:

>>> Multiple Python processes can run concurrently, but each process
>>> has a copy of the entire Python system, so the memory and cache
>>> footprints are far larger than for multiple threads.

>DLB> One would think a smart enough OS would be able to share the
>DLB> executable (interpreter) code, and only create a new stack/heap
>DLB> allocation for data.

Of course they do, but a significant portion of a Python system consists
of imported modules, and these are data as far as the OS is concerned.
Only the modules written in C, which are loaded as DLLs (shared libs),
and of course the interpreter executable will be shared.
-- 
Piet van Oostrum p...@cs.uu.nl
URL: http://pietvanoostrum.com [PGP 8DAE142BE17999C4]
Private email: p...@vanoostrum.org


Re: basic thread question

2009-08-24 Thread Dave Angel

Dennis Lee Bieber wrote:
> On Sun, 23 Aug 2009 22:14:17 -0700, John Nagle <na...@animats.com>
> declaimed the following in gmane.comp.python.general:
>
>> Multiple Python processes can run concurrently, but each process
>> has a copy of the entire Python system, so the memory and cache
>> footprints are far larger than for multiple threads.
>
> One would think a smart enough OS would be able to share the
> executable (interpreter) code, and only create a new stack/heap
> allocation for data.
  
That's what fork is all about.  (See os.fork(), available on most 
Unix/Linux)  The two processes start out sharing their state, and only 
the things subsequently written need separate swap space.
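
A minimal sketch of the fork behaviour described above, assuming a
Unix-like system and Python 3 (the thread itself concerns Python 2.6).
The names run_in_child, child_work and big_table are made up for
illustration. The child reads data that the parent built before the
fork, without any explicit copy, and sends its answer back through a
pipe.

```python
import os

def run_in_child(func):
    """Fork, run func() in the child, and return its string result to the parent."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child process: starts with the parent's memory pages shared.
        os.close(read_fd)
        try:
            os.write(write_fd, func().encode())
        finally:
            os._exit(0)  # leave immediately, skipping the parent's cleanup handlers
    # Parent process: read until the child closes its end of the pipe.
    os.close(write_fd)
    with os.fdopen(read_fd) as reader:
        result = reader.read()
    os.waitpid(pid, 0)
    return result

# Built once in the parent, before any fork.
big_table = list(range(100000))

def child_work():
    # Hypothetical workload: only reads data inherited from the parent.
    return str(sum(big_table))
```

Calling run_in_child(child_work) returns the sum computed in the child.
In CPython even reading an object updates its reference count, which
writes to the page holding it, so the amount of memory actually shared
shrinks as the child touches more objects.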


In Windows (and probably Unix/Linux), the swap space taken by the
executable and DLLs (shared libraries) is minimal.  Each DLL may have a
preferred load address, and if that part of the address space is
available, it takes no swap space at all, except for static variables,
which are usually allocated together.  I don't know whether the standard
build of CPython (python.exe and the .pyd libraries) uses such a linker
option, but I'd bet it does.  It also speeds startup time.


On my system, a minimal Python program uses about 50k of swap space.  But
I'm sure that goes way up with lots of imports.



DaveA


Re: basic thread question

2009-08-24 Thread Piet van Oostrum
>>>>> Dave Angel <da...@ieee.org> (DA) wrote:

>DA> Dennis Lee Bieber wrote:
>>> On Sun, 23 Aug 2009 22:14:17 -0700, John Nagle <na...@animats.com>
>>> declaimed the following in gmane.comp.python.general:
>>>
>>>> Multiple Python processes can run concurrently, but each process
>>>> has a copy of the entire Python system, so the memory and cache
>>>> footprints are far larger than for multiple threads.
>>>
>>> One would think a smart enough OS would be able to share the
>>> executable (interpreter) code, and only create a new stack/heap
>>> allocation for data.

>DA> That's what fork is all about.  (See os.fork(), available on most
>DA> Unix/Linux)  The two processes start out sharing their state, and only the
>DA> things subsequently written need separate swap space.

But os.fork() is not available on Windows. And I guess refcounts et al.
will soon destroy the sharing.
-- 
Piet van Oostrum p...@cs.uu.nl
URL: http://pietvanoostrum.com [PGP 8DAE142BE17999C4]
Private email: p...@vanoostrum.org


Re: basic thread question

2009-08-24 Thread sturlamolden
On 18 Aug, 22:10, Derek Martin <c...@pizzashack.org> wrote:

> I have some simple threaded code...  If I run this
> with an arg of 1 (start one thread), it pegs one cpu, as I would
> expect.  If I run it with an arg of 2 (start 2 threads), it uses both
> CPUs, but utilization of both is less than 50%.  Can anyone explain
> why?

Access to the Python interpreter is serialized by the global
interpreter lock (GIL). You created two threads the OS could schedule,
but they competed for access to the Python interpreter. If you want to
utilize more than one CPU, you have to release the GIL or use multiple
processes instead (os.fork since you are using Linux).

This is how the GIL can be released:

* Many functions in Python's standard library, particularly all
blocking I/O functions, release the GIL. This covers by far the most
common use of threads.

* In C or C++ extensions, use the macros Py_BEGIN_ALLOW_THREADS and
Py_END_ALLOW_THREADS.

* With ctypes, functions called through a CDLL release the GIL, whereas
functions called through a PyDLL do not.

* In f2py, declaring a Fortran function threadsafe in a .pyf file or
cf2py comment releases the GIL.

* In Cython or Pyrex extensions, use a with nogil: block to execute
code without holding the GIL.
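
To illustrate the first point in the list, here is a small sketch,
assuming Python 3 and using time.sleep as a stand-in for a blocking
call (it releases the GIL while it waits). The helper name
timed_sleepers is hypothetical. If the GIL were held during the sleep,
four threads would need about four times the duration of a single
sleep. Instead the sleeps overlap.

```python
import threading
import time

def timed_sleepers(n_threads, seconds):
    """Start n_threads that each block for `seconds`; return the total wall time."""
    threads = [threading.Thread(target=time.sleep, args=(seconds,))
               for _ in range(n_threads)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start
```

For example, timed_sleepers(4, 0.2) should come back well under the
0.8 seconds that four serialized sleeps would take.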


Sturla Molden


Re: basic thread question

2009-08-24 Thread sturlamolden
On 24 Aug, 13:21, Piet van Oostrum <p...@cs.uu.nl> wrote:

> But os.fork() is not available on Windows. And I guess refcounts et al.
> will soon destroy the sharing.

Well, there is os.fork in Cygwin and SUA (SUA is the Unix subsystem in
Windows Vista Professional). Cygwin's fork is a bit sluggish.

Multiprocessing works on Windows and Linux alike.
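
For the original poster's goal of keeping every CPU busy, a
multiprocessing version might look like the sketch below. It is
written for Python 3, count_up and run_parallel are illustrative
names, and the loop is bounded so that the example terminates. Each
worker is a separate process with its own interpreter and its own GIL.

```python
import multiprocessing

def count_up(n):
    """Pure-Python busy loop, the kind of code the GIL serializes across threads."""
    x = 0
    while x < n:
        x += 1
    return x

def run_parallel(n_procs, n_iter, start_method=None):
    """Run count_up(n_iter) in n_procs worker processes and collect the results.

    start_method=None selects the platform default (for example "spawn" on Windows).
    """
    ctx = multiprocessing.get_context(start_method)
    pool = ctx.Pool(processes=n_procs)
    try:
        return pool.map(count_up, [n_iter] * n_procs)
    finally:
        pool.close()
        pool.join()
```

On Windows, call run_parallel from under an if __name__ == '__main__':
guard, because the worker processes re-import the main module there.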

Apart from that, how are you going to use threads? The GIL will not be
a problem if it can be released. Mostly, the GIL is a hypothetical
problem. It is only a problem for compute-bound code written in pure
Python. But very few use Python for that. However, if you do and can
afford the 200x speed penalty from using Python (instead of C, C++,
Fortran, Cython), you can just as well accept that only one CPU is
used.


Sturla Molden


Re: basic thread question

2009-08-24 Thread Piet van Oostrum
>>>>> sturlamolden <sturlamol...@yahoo.no> (s) wrote:

>s> On 24 Aug, 13:21, Piet van Oostrum <p...@cs.uu.nl> wrote:
>>> But os.fork() is not available on Windows. And I guess refcounts et al.
>>> will soon destroy the sharing.

>s> Well, there is os.fork in Cygwin and SUA (SUA is the Unix subsystem in
>s> Windows Vista Professional). Cygwin's fork is a bit sluggish.

That's because it doesn't use copy-on-write. Thereby losing most of its
advantages. I don't know SUA, but I have vaguely heard about it.
-- 
Piet van Oostrum p...@cs.uu.nl
URL: http://pietvanoostrum.com [PGP 8DAE142BE17999C4]
Private email: p...@vanoostrum.org


Re: basic thread question

2009-08-24 Thread sturlamolden
On 25 Aug, 01:26, Piet van Oostrum <p...@cs.uu.nl> wrote:

> That's because it doesn't use copy-on-write. Thereby losing most of its
> advantages. I don't know SUA, but I have vaguely heard about it.

SUA is a version of UNIX hidden inside Windows Vista and Windows 7
(except in Home and Home Premium), but very few seem to know of it.
SUA (Subsystem for UNIX-based Applications) was formerly known as
Interix, which is a certified version of UNIX based on OpenBSD. If you
go to http://www.interopsystems.com (a website run by Interop Systems
Inc., a company owned by Microsoft), you will find a lot of common
unix tools prebuilt for SUA, including Python 2.6.2.

The NT-kernel supports copy-on-write fork with a special system call
(ZwCreateProcess in ntdll.dll), which is what SUA's implementation of
fork() uses.


Re: basic thread question

2009-08-23 Thread John Nagle

Jan Kaliszewski wrote:
> 18-08-2009 at 22:10:15 Derek Martin <c...@pizzashack.org> wrote:
>
>> I have some simple threaded code...  If I run this
>> with an arg of 1 (start one thread), it pegs one cpu, as I would
>> expect.  If I run it with an arg of 2 (start 2 threads), it uses both
>> CPUs, but utilization of both is less than 50%.  Can anyone explain
>> why?
>>
>> I do not pretend it's impeccable code, and I'm not looking for a
>> critique of the code per se, excepting the case where what I've written
>> is actually *wrong*. I hacked this together in a couple of minutes,
>> with the intent of pegging my CPUs.  Performance with two threads is
>> actually *worse* than with one, which is highly unintuitive.  I can
>> accomplish my goal very easily with bash, but I still want to
>> understand what's going on here...
>>
>> The OS is Linux 2.6.24, on a Ubuntu base.  Here's the code:
>
> Python threads can't benefit from multiple processors (because of GIL,
> see: http://docs.python.org/glossary.html#term-global-interpreter-lock).


This is a CPython implementation restriction.  It's not inherent in
the language.

Multiple threads make overall performance worse because Python's
approach to thread locking produces a large number of context switches.
The interpreter unlocks the Global Interpreter Lock every N interpreter
cycles and on any system call that can block, which, if there is a
thread waiting, causes a context switch.
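
The "every N interpreter cycles" setting can be inspected from Python.
A small sketch follows, with the caveat that the API differs by
version: CPython 2.x exposes the instruction count through
sys.getcheckinterval() (100 by default), whereas CPython 3.2 and later
use a time-based interval read with sys.getswitchinterval(). The
function name gil_switch_setting is made up here.

```python
import sys

def gil_switch_setting():
    """Return (unit, value) describing how often a thread switch is requested."""
    if hasattr(sys, "getswitchinterval"):
        # CPython 3.2 and later: the value is a duration in seconds.
        return ("seconds", sys.getswitchinterval())
    # CPython 2.x: the value is a number of bytecode instructions.
    return ("instructions", sys.getcheckinterval())

unit, value = gil_switch_setting()
print("thread switch requested every %s %s" % (value, unit))
```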

Multiple Python processes can run concurrently, but each process
has a copy of the entire Python system, so the memory and cache footprints are
far larger than for multiple threads.

John Nagle


Re: basic thread question

2009-08-19 Thread Sean DiZazzo
On Aug 18, 4:58 pm, birdsong <david.birds...@gmail.com> wrote:
> On Aug 18, 3:18 pm, Derek Martin <c...@pizzashack.org> wrote:
>
>> On Tue, Aug 18, 2009 at 03:10:15PM -0500, Derek Martin wrote:
>>> I have some simple threaded code...  If I run this
>>> with an arg of 1 (start one thread), it pegs one cpu, as I would
>>> expect.  If I run it with an arg of 2 (start 2 threads), it uses both
>>> CPUs, but utilization of both is less than 50%.  Can anyone explain
>>> why?
>>
>> Ah, searching while waiting for an answer (the e-mail gateway is a bit
>> slow, it seems...) I discovered that the GIL is the culprit.
>> Evidently this question comes up a lot.  It would probably save a lot
>> of time on the part of those who answer questions here, as well as
>> those implementing solutions in Python, if whoever is maintaining the
>> docs these days would put a blurb about this in the docs in big bold
>> letters...  Concurrency being perhaps the primary reason to use
>> threading, essentially it means that Python is not useful for the
>> sorts of problems that one would be inclined to solve the way my code
>> works (or rather, was meant to).  It would be very helpful to know
>> that *before* one tried to implement a solution that way... especially
>> for solutions significantly less trivial than mine. ;-)
>>
>> Thanks
>>
>> --
>> Derek D. Martin    http://www.pizzashack.org/
>> GPG Key ID: 0x81CFE75D
>
> I would still watch that video which will explain a bit more about the
> GIL.

Thank you for the video!  It's good to know, but it raises lots of
other questions in my mind.  Lots of examples would have helped.

~Sean


Re: basic thread question

2009-08-18 Thread birdsong
On Aug 18, 1:10 pm, Derek Martin <c...@pizzashack.org> wrote:
> I have some simple threaded code...  If I run this
> with an arg of 1 (start one thread), it pegs one cpu, as I would
> expect.  If I run it with an arg of 2 (start 2 threads), it uses both
> CPUs, but utilization of both is less than 50%.  Can anyone explain
> why?
>
> I do not pretend it's impeccable code, and I'm not looking for a
> critique of the code per se, excepting the case where what I've written
> is actually *wrong*. I hacked this together in a couple of minutes,
> with the intent of pegging my CPUs.  Performance with two threads is
> actually *worse* than with one, which is highly unintuitive.  I can
> accomplish my goal very easily with bash, but I still want to
> understand what's going on here...
>
> The OS is Linux 2.6.24, on a Ubuntu base.  Here's the code:
>
> Thanks
>
> -=-=-=-=-

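> #!/usr/bin/python
>
> import thread, sys, time
>
> def busy(thread):
>     # Spin forever in pure Python to keep a CPU busy.
>     x=0
>     while True:
>         x+=1
>
> if __name__ == '__main__':
>     try:
>         cpus = int(sys.argv[1])
>     except ValueError:
>         cpus = 1
>     print "cpus = %d, argv[1] = %s\n" % (cpus, sys.argv[1])
>     i=0
>     thread_list = []
>     # Start one busy thread per requested CPU.
>     while i < cpus:
>         x = thread.start_new_thread(busy, (i,))
>         thread_list.append(x)
>         i+=1
>     # Keep the main thread alive (this loop also spins).
>     while True:
>         pass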

> --
> Derek D. Martin    http://www.pizzashack.org/
> GPG Key ID: 0x81CFE75D

watch this and all your findings will be explained: http://blip.tv/file/2232410

this talk marked a pivotal moment in my understanding of python
threads and signal handling in threaded programs.


Re: basic thread question

2009-08-18 Thread birdsong
On Aug 18, 3:18 pm, Derek Martin <c...@pizzashack.org> wrote:
> On Tue, Aug 18, 2009 at 03:10:15PM -0500, Derek Martin wrote:
>> I have some simple threaded code...  If I run this
>> with an arg of 1 (start one thread), it pegs one cpu, as I would
>> expect.  If I run it with an arg of 2 (start 2 threads), it uses both
>> CPUs, but utilization of both is less than 50%.  Can anyone explain
>> why?
>
> Ah, searching while waiting for an answer (the e-mail gateway is a bit
> slow, it seems...) I discovered that the GIL is the culprit.
> Evidently this question comes up a lot.  It would probably save a lot
> of time on the part of those who answer questions here, as well as
> those implementing solutions in Python, if whoever is maintaining the
> docs these days would put a blurb about this in the docs in big bold
> letters...  Concurrency being perhaps the primary reason to use
> threading, essentially it means that Python is not useful for the
> sorts of problems that one would be inclined to solve the way my code
> works (or rather, was meant to).  It would be very helpful to know
> that *before* one tried to implement a solution that way... especially
> for solutions significantly less trivial than mine. ;-)
>
> Thanks
>
> --
> Derek D. Martin    http://www.pizzashack.org/
> GPG Key ID: 0x81CFE75D

I would still watch that video which will explain a bit more about the
GIL.