Re: [Python-Dev] Fixing the new GIL

2010-03-15 Thread Martin v. Löwis
 As for the argument that an application with cpu intensive work being
 driven by the IO itself will work itself out...  No it won't, it can
 get into beat patterns where it is handling requests quite rapidly up
 until one that causes a long computation to start comes in.  At that
 point it'll stop performing well on other requests for as long (it
 could be a significant amount of time) as the cpu intensive request
 threads are running.  That is not a graceful degradation in serving
 capacity / latency as one would normally expect.  It is a sudden drop
 off.

Why do you say that? The other threads continue to be served - and
Python couldn't use more than one CPU, anyway. Can you demonstrate that
in an example?

Regards,
Martin



Re: [Python-Dev] Fixing the new GIL

2010-03-15 Thread Martin v. Löwis
 So, just to be clear about my bug report, it is directly related
 to the problem of overlapping I/O requests with CPU-bound processing.
 This kind of scenario comes up in the context of many
 applications--especially those based on cooperating processes,
 multiprocessing, and message passing.

How so? If you have cooperating processes, multiprocessing, and message
passing, you *don't* have CPU bound threads along with IO bound
threads in the same process - you may not have threads at all!!!

 In any case, if the GIL can be improved in a way that is simple and
 which either improves or doesn't negatively impact the performance of
 existing applications, why wouldn't you want to do it?  Seems like a
 no-brainer.

Unfortunately, a simple improvement doesn't really seem to exist.

Regards,
Martin


Re: [Python-Dev] (no subject)

2010-03-15 Thread David Beazley


On Mon 15/03/10 4:34 AM, Martin v. Löwis mar...@v.loewis.de sent:
  So, just to be clear about my bug report, it
 is directly related to the problem of overlapping I/O requests with
 CPU-bound processing. This kind of scenario comes up in the context of
 many applications--especially those based on
 cooperating processes, multiprocessing, and message passing.
 
 How so? If you have cooperating processes, multiprocessing, and message
 passing, you *don't* have CPU bound threads along with IO bound
 threads in the same process - you may not have threads at all!!!
 

You're right in that the end user will probably not be using threads in this
situation.  However, threads are still often used inside the associated
programming libraries and frameworks that provide communication.  For example,
threads might be used to implement queuing, background monitoring, event
notification, routing, and other similar low-level features.  Just as an
example, the multiprocessing module currently uses background threads as part
of its implementation of queues.  In a similar vein, threads are sometimes used
in asynchronous I/O frameworks (e.g., Twisted) to handle certain kinds of
deferred operations.  Bottom line: just because a user isn't directly
programming with threads doesn't mean that threads aren't there.
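
As a concrete illustration of that point (this is not code from the thread,
just a sketch): multiprocessing.Queue.put() hands the data to a hidden feeder
thread, so even a program whose author never creates a thread has that feeder
competing for the GIL with any CPU-bound Python code in the same process.
Timings will vary by interpreter version and platform.

    # Sketch only: a "thread-free" program that nevertheless shares the GIL
    # with a background thread, because multiprocessing.Queue starts an
    # internal feeder thread on first put().
    import multiprocessing
    import threading
    import time

    def cpu_work(n=10000000):
        # Pure-Python CPU-bound loop; it holds the GIL except at switch points.
        total = 0
        for i in range(n):
            total += i
        return total

    def consumer(q):
        # Runs in a separate process; drains the queue until it sees None.
        while q.get() is not None:
            pass

    if __name__ == "__main__":
        q = multiprocessing.Queue()
        p = multiprocessing.Process(target=consumer, args=(q,))
        p.start()

        # A CPU-bound thread in *this* process competes for the GIL with the
        # queue's feeder thread, which must run to actually send each item.
        t = threading.Thread(target=cpu_work)
        t.start()

        start = time.time()
        for i in range(1000):
            q.put(i)        # fast: only buffers; the feeder does the real I/O
        q.put(None)
        p.join()            # returns once everything has been delivered
        print("delivery took %.2fs alongside the CPU-bound thread"
              % (time.time() - start))
        t.join()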

  In any case, if the GIL can be improved in a way
 that is simple and which either improves or doesn't negatively
 impact the performance of existing applications, why wouldn't you want to
 do it?  Seems like a no-brainer.
 
 Unfortunately, a simple improvement doesn't really seem to exist.
 

Well, I think this problem can be fixed in a manner that is pretty well
isolated to just one specific part of Python (the GIL)--especially since
several prototypes, including my own, have already demonstrated that it's
possible.  In any case, as it stands now, the current new GIL ignores about
40 years of work concerning operating system thread/process scheduling and
the efficient handling of I/O requests.  I suspect that I'm not the only one
who would be disappointed if the Python community simply blew it off and said
it's not an issue.  Trying to fix it is a worthwhile goal.
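
For readers who want the scheduling reference made concrete, the toy model
below illustrates the decades-old policy in question (priority decay plus an
I/O wakeup boost).  It is emphatically not Dave's prototype and not CPython's
GIL code; every name in it is hypothetical.

    # Toy model of classic OS scheduling, purely to illustrate the idea above.
    class Task(object):
        def __init__(self, name, priority=0):
            self.name = name
            self.priority = priority    # lower number means "run me sooner"

    def on_quantum_expired(task):
        # A task that burned its whole timeslice looks CPU-bound: demote it.
        task.priority += 1

    def on_io_wakeup(task):
        # A task that blocked voluntarily and just became runnable looks
        # I/O-bound: boost it so its response is not stuck behind compute work.
        task.priority = max(task.priority - 1, -5)

    def pick_next(runnable):
        # Always run the best-priority runnable task, so a freshly woken
        # I/O-bound task preempts a long-running CPU-bound one promptly.
        return min(runnable, key=lambda t: t.priority)

    # Example: a demoted cruncher loses to a just-boosted responder.
    cruncher, responder = Task("cruncher"), Task("responder")
    on_quantum_expired(cruncher)
    on_io_wakeup(responder)
    assert pick_next([cruncher, responder]) is responder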

Cheers,
Dave





Re: [Python-Dev] __file__ and bytecode-only

2010-03-15 Thread Barry Warsaw
On Mar 14, 2010, at 12:17 AM, Nick Coghlan wrote:

> Hmm - methinks the PEP actually needs to talk explicitly about the
> py_compile and compileall modules. These compile the files directly
> rather than using the import system's side-effect, so they'll need to
> understand the intricacies of the new system.

Good point.  I'll add this to the PEP.

> While it's probably OK if the import side-effects only create files
> using the new scheme, the standard library modules will likely need to
> support both schemes (although I'm not sure if "same as import system"
> or "same as Python 3.1" makes more sense as the default semantics -
> probably the former).

I don't understand this point.

compileall probably /could/ be extended to understand bytecode-only
(i.e. legacy or 3.2) layout.  I've added that to the PEP too.
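
For readers following along, a hedged sketch of what driving both layouts
explicitly might look like.  The cache_from_source() helper and the legacy
flag come from the PEP 3147 proposal and its implementation work; treat the
exact spellings here as assumptions rather than something settled in this
thread.

    # Sketch only: explicit byte-compilation under the two layouts discussed.
    # Assumes a foo.py file and a mypkg/ package tree exist.
    import compileall
    import imp
    import py_compile

    # PEP 3147 layout: bytecode lands in __pycache__/foo.cpython-32.pyc
    py_compile.compile('foo.py')
    print(imp.cache_from_source('foo.py'))      # where import will look for it

    # Legacy (pre-3.2, and bytecode-only) layout: foo.pyc next to foo.py
    py_compile.compile('foo.py', cfile='foo.pyc')

    # Whole trees via compileall, choosing the scheme explicitly
    compileall.compile_dir('mypkg')                # __pycache__ layout
    compileall.compile_dir('mypkg', legacy=True)   # old side-by-side layout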

-Barry




Re: [Python-Dev] __file__ and bytecode-only

2010-03-15 Thread Barry Warsaw
On Mar 14, 2010, at 04:37 PM, Paul Moore wrote:

> The bdist_wininst installer also compiles modules explicitly on
> install (as does the python.org Windows MSI installer). I've always
> assumed that this worked via compileall, but haven't checked.
> Regardless, these should probably also be covered in the PEP.

Thanks; added to PEP.
-Barry




Re: [Python-Dev] __file__ and bytecode-only

2010-03-15 Thread Barry Warsaw
On Mar 15, 2010, at 07:43 AM, Nick Coghlan wrote:
> He did (in favour of keeping the directory visible).
>
> http://www.mail-archive.com/python-dev@python.org/msg45203.html

(added to the PEP)

-Barry




Re: [Python-Dev] Fixing the new GIL

2010-03-15 Thread Cameron Simpson
On 15Mar2010 09:28, Martin v. Löwis mar...@v.loewis.de wrote:
|  As for the argument that an application with cpu intensive work being
|  driven by the IO itself will work itself out...  No it won't, it can
|  get into beat patterns where it is handling requests quite rapidly up
|  until one that causes a long computation to start comes in.  At that
|  point it'll stop performing well on other requests for as long (it
|  could be a significant amount of time) as the cpu intensive request
|  threads are running.  That is not a graceful degradation in serving
|  capacity / latency as one would normally expect.  It is a sudden drop
|  off.
| 
| Why do you say that? The other threads continue to be served - and
| Python couldn't use more than one CPU, anyway. Can you demonstrate that
| in an example?

Real example:

I have a FuncMultiQueue class which manages a pool of worker threads.
Other tasks can make requests of it by passing a callable (usually
obtained via partial()) to its .call(), .bgcall() or .qbgcall() methods
depending on how they want to collect the result of the callable.

The idea here is that one has a few threads receiving requests (e.g. a
daemon watching a socket or monitoring a db queue table) which then use
the FuncMultiQueue to manage how many actual requests are processed
in parallel (yes, a semaphore can cover a lot of this, but not the
asynchronous call modes).

So, suppose a couple of CPU-intensive callables get queued which work for a
substantial time, and meanwhile a bunch of tiny tiny cheap requests arrive.
Their timely response will be impacted by this issue.
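
Not Cameron's actual FuncMultiQueue, just a minimal sketch based on his
description: a pool of worker threads, a blocking .call() and a
fire-and-forget .bgcall() (error handling and the .qbgcall() variant are
omitted).  The tail of the example sets up the scenario he describes: a cheap
request queued while CPU-bound callables are running, with a worker thread
free but the GIL contended.

    import threading
    from functools import partial
    from queue import Queue          # Queue.Queue on Python 2

    class FuncMultiQueue(object):
        """Toy stand-in for the class described above: a fixed pool of
        worker threads pulling callables off a shared queue."""

        def __init__(self, size=4):
            self._q = Queue()
            for _ in range(size):
                t = threading.Thread(target=self._worker)
                t.daemon = True
                t.start()

        def _worker(self):
            while True:
                func, box, done = self._q.get()
                box.append(func())       # no error handling in this sketch
                done.set()

        def call(self, func):
            """Queue func and block until its result is ready."""
            box, done = [], threading.Event()
            self._q.put((func, box, done))
            done.wait()
            return box[0]

        def bgcall(self, func):
            """Queue func and return an Event that is set once it has run."""
            box, done = [], threading.Event()
            self._q.put((func, box, done))
            return done

    def spin(n=10000000):
        # Pure-Python CPU-bound work: holds the GIL between switch points.
        total = 0
        for i in range(n):
            total += i
        return total

    if __name__ == "__main__":
        import time
        fq = FuncMultiQueue(size=3)
        fq.bgcall(spin)                   # two long, CPU-bound callables...
        fq.bgcall(spin)
        t0 = time.time()
        fq.call(partial(sum, [1, 2, 3]))  # ...contend with this tiny request,
                                          # even though a worker sits idle
        print("cheap request took %.3fs" % (time.time() - t0))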

Cheers,
-- 
Cameron Simpson c...@zip.com.au DoD#743
http://www.cskk.ezoshosting.com/cs/

hybrid rather than pure; compromising rather than clean;
distorted rather than straightforward; ambiguous rather than
articulated; both-and rather than either-or; the difficult
unity of inclusion rather than the easy unity of exclusion.
- Paul Barton-Davis pa...@cs.washington.edu


Re: [Python-Dev] Fixing the new GIL

2010-03-15 Thread Cameron Simpson
On 14Mar2010 19:32, Martin v. Löwis mar...@v.loewis.de wrote:
|  Speaking for myself, I have an app with a daemon mode which I expect
|  will sometimes behave as described; it answers requests and thus has I/O
|  bound threads waiting for requests and dispatching replies, and threads
|  doing data handling, which make constant use of the zlib library.
| 
| This is then already different from the scenario described. Every call
| into zlib will release the GIL, for the period of the zlib computation,
| allowing other threads to run.
| 
|  On the
|  client side the same app is often throughput bound by a data examination
|  process that is compute bound; I can easily see it having compute bound
|  threads and I/O bound threads talking to daemon instances.
| 
| I can't see that. I would expect that typically (and specifically
| including your application), the compute bound threads will synchronize
| with the IO bound ones, asking for more requests to perform.

In the single threaded case, sure. The usual command line archive mode fits
this. But...

| That's the whole point of using the bound adjective (?): execution
| time is *bound* by the computation time. It can't get faster than what
| the computation can process.

If it's lurking behind a filesystem interface or in its daemon mode
(remote archive store), multiple client processes can be using it at once,
and it will be processing multiple tasks somewhat in parallel. Here one
can get a compute bound thread answering one request, impacting quick
response to other parallel-and-cheap requests.

Cheers,
-- 
Cameron Simpson c...@zip.com.au DoD#743
http://www.cskk.ezoshosting.com/cs/

Here was a man who not only had a ready mind and a quick wit,
but could also sing.  - _Rope_