New submission from Hugh Secker-Walker <hug...@users.sourceforge.net>:

Popen seems to be *very* costly, CPU wise, on Mac OS X.

I need to read and process bytes from the stdout of another process that 
outputs bytes in a rate-limited fashion, e.g. 80000 bytes per second.  If I use 
a shell pipeline and a simple Python script to read the bytes and log some 
details then top and time both agree that the Python script is using roughly 1 
second of CPU time for every 2000 seconds of elapsed time.

If I change the script to use Popen to start the first process and to read from 
proc.stdout, then top and time both agree that the Python script uses roughly 1 
second of CPU time for every 6 seconds of elapsed time.  That is, using Popen 
uses 333 times as much CPU as using a shell pipeline.

In practice, this means that using a shell pipeline has no measureable impact 
on system performance, and I can run 300 concurrent pipeline jobs, whereas 
using Popen limits me to running 6 such concurrent jobs.

This is on an up-to-date Mac OS X 10.5.8 system.  The behavior happens with 
both the stock Python 2.5 from Apple and with Python 2.6 from Macports.

On Ubuntu Linux the Popen solution is perhaps 2% slower than using a shell 
pipeline.

I'll attach three scripts:
bytestream -- simple program that sends bytes to its stdout
pull0 -- Python script that reads its stdin and logs a couple of stats
pull1 -- Python script that uses Popen to start bytestream and pulls from 
proc.stdout and runs abysmally slowly.

Use this with a concurrent top to see the shell pipeline behavior:
$ ./bytestream 80000 4 | time ./pull0 11000

Use this with a concurrent top to see the Popen behavior:
$ time ./pull1 ./bytestream 80000 4 11000

In both cases you'll see that system and user time are approximately equal, but 
that they're roughly 333 times higher for pull1.  Is there a clue in the fact 
that both times go up by about the same huge factor?

-Hugh

----------
components: Library (Lib)
files: bytestream
messages: 106862
nosy: hughsw
priority: normal
severity: normal
status: open
title: Popen uses 333 times as much CPU as a shell pipe on Mac OS X
type: behavior
versions: Python 2.5, Python 2.6
Added file: http://bugs.python.org/file17517/bytestream

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue8873>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com

Reply via email to