Re: [Python-Dev] Dynamic module namspaces

2006-07-17 Thread Josiah Carlson

Andrew Bennetts [EMAIL PROTECTED] wrote:
 On Sat, Jul 15, 2006 at 03:38:04PM -0300, Johan Dahlin wrote:
  In an effort to reduce the memory usage used by GTK+ applications 
  written in python I've recently added a feature that allows attributes 
  to be lazy loaded in a module namespace. The gtk python module contains 
  quite a few attributes (around 850) of which many are classes or 
  interfaces (150+)
 
 Have you seen the demandload hack that Mercurial uses?  You can find it 
 here:
 http://selenic.com/repo/hg?f=cb4715847a81;file=mercurial/demandload.py
 
 You can see an example use of it here:
 http://selenic.com/repo/hg?f=d276571f2c4b;file=mercurial/commands.py

The problem with that particular method is that it requires that you use
a string to describe the set of modules you would like to import, rather
than a name.

In the py3k mailing list I recently described a mechanism (though not
implementation) for a somewhat related method to support automatic
reloading when a file has changed (for things like web frameworks), but
by removing that file change check, you get the equivalent to what you
describe, though you don't need to use strings, you can use names...

from dynamicreload import DR

and used like...

DR.os.path.join(...)

As long as you use the DR.-prefixed name, you get automatic reloading 
(if desired) on access.
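
A prefix object of this kind could be sketched roughly as follows. The dynamicreload module was only described, not implemented, so everything here (the `_LazyModule` and `_Root` class names, the use of `importlib`) is an assumption for illustration. The file-change check and reloading are left out, which leaves only import-on-access.

```python
import importlib

_MISSING = object()


class _LazyModule:
    """Stand-in for a module that is imported when an attribute is requested."""

    def __init__(self, name):
        self._name = name

    def __getattr__(self, attr):
        # Only reached for names that are not found on the proxy itself.
        module = importlib.import_module(self._name)
        value = getattr(module, attr, _MISSING)
        if value is not _MISSING:
            return value
        # Not an attribute yet: it may be a submodule that was never imported.
        qualified = f"{self._name}.{attr}"
        try:
            importlib.import_module(qualified)
        except ImportError:
            raise AttributeError(
                f"module {self._name!r} has no attribute {attr!r}"
            ) from None
        return _LazyModule(qualified)


class _Root:
    """The DR object: every attribute access yields a lazy module proxy."""

    def __getattr__(self, name):
        return _LazyModule(name)


DR = _Root()

print(DR.os.path.basename("some/dir/file.txt"))
```

Every access through `DR` goes through `__getattr__` again, so the lookup cost is paid each time rather than once.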


 The advantage for an interactive command line tool isn't so much memory
 consumption as speed.  Why waste hundreds of milliseconds importing code that
 isn't used?  There's an experimental branch to use the same demandload code in
 bzr, the reported results are 400ms for 'bzr rocks' down to 100ms, 'bzr root'
 from 400ms => 200ms, etc. (according to
 http://permalink.gmane.org/gmane.comp.version-control.bazaar-ng.general/13967)
 
 Over half the runtime wasted on importing unused code!  There's a definite need
 for a nice solution to this, and it should be included in the standard
 batteries that come with Python.

Well, just starting up Python without loading any modules can be time
consuming; perhaps even dwarfing the few hundred ms saved by dynamic
loading.


 If we can address related problems at the same time, like emitting deprecation
 warnings for accessing certain module attributes, then even better!

__deprecated__ = ['name1', ...]

Make your dynamic load/reload aware of the __deprecated__ attribute, and
you are done.
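
One way a loader could honour such a list is sketched below. This is an assumption about how `__deprecated__` might be consumed, not an existing Python feature: the `DeprecationAwareModule` class is hypothetical, and the `foo` module is built by hand to keep the example self-contained.

```python
import types
import warnings


class DeprecationAwareModule(types.ModuleType):
    """Module type that warns on access to names listed in __deprecated__."""

    def __getattribute__(self, name):
        # Fetch the real namespace without re-entering this method's checks.
        namespace = super().__getattribute__("__dict__")
        if name in namespace.get("__deprecated__", ()):
            warnings.warn(
                f"{namespace['__name__']}.{name} is deprecated",
                DeprecationWarning,
                stacklevel=2,
            )
        return super().__getattribute__(name)


# Build a throwaway module by hand instead of importing one from disk.
foo = DeprecationAwareModule("foo")
foo.SOME_CONSTANT = 5
foo.__deprecated__ = ["SOME_CONSTANT"]
```

Reading `foo.SOME_CONSTANT` still returns 5 but emits a DeprecationWarning first; other attributes are untouched.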


 - Josiah

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Dynamic module namspaces

2006-07-17 Thread Andrew Bennetts
On Sun, Jul 16, 2006 at 11:52:48PM -0700, Josiah Carlson wrote:
 Andrew Bennetts [EMAIL PROTECTED] wrote:
[...]
  
  Have you seen the demandload hack that Mercurial uses?  You can find it 
  here:
  http://selenic.com/repo/hg?f=cb4715847a81;file=mercurial/demandload.py
  
  You can see an example use of it here:
  http://selenic.com/repo/hg?f=d276571f2c4b;file=mercurial/commands.py
 
 The problem with that particular method is that it requires that you use
 a string to describe the set of modules you would like to import, rather
 than a name.

I agree, it's ugly.  I'd like there to be a nicer solution.

 In the py3k mailing list I recently described a mechanism (though not
 implementation) for a somewhat related method to support automatic
 reloading when a file has changed (for things like web frameworks), but
 by removing that file change check, you get the equivalent to what you
 describe, though you don't need to use strings, you can use names...
 
 from dynamicreload import DR
 
 and used like...
 
 DR.os.path.join(...)
 
 As long as you use the DR.-prefixed name, you get automatic reloading 
 (if desired) on access.

Aside from also being ugly in its own way, a magic prefix like this would add a
small performance penalty to places that use it.  I believe demandload has the
nice feature that once a demandloaded object is accessed, it is replaced with
the actual object rather than the demandload gunk, so there's only a one-off
performance penalty.

  The advantage for an interactive command line tool isn't so much memory
  consumption as speed.  Why waste hundreds of milliseconds importing code
  that isn't used?  There's an experimental branch to use the same demandload
  code in bzr, the reported results are 400ms for 'bzr rocks' down to 100ms,
  'bzr root' from 400ms => 200ms, etc. (according to
  http://permalink.gmane.org/gmane.comp.version-control.bazaar-ng.general/13967)
  
  Over half the runtime wasted on importing unused code!  There's a definite
  need for a nice solution to this, and it should be included in the standard
  batteries that come with Python.
 
 Well, just starting up Python without loading any modules can be time
 consuming; perhaps even dwarfing the few hundred ms saved by dynamic
 loading.

Well, it's only about 10ms on my laptop running Ubuntu (it varies up to 90ms,
but I expect that's just noise), and the -S switch makes no obvious difference
(tested with python -c '').  10ms overhead to start python I can live with.
It takes about that long to run svn --version.
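
A measurement like this can be repeated with a short script along the following lines. It is a sketch, not what produced the numbers above: the `startup_time` helper is made up for illustration, and it times the interpreter running the script (`sys.executable`) rather than whatever `python` is on the PATH.

```python
import subprocess
import sys
import time


def startup_time(extra_args=(), repeats=5):
    """Return the best wall-clock time, in seconds, to run `python -c ''`."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        subprocess.run([sys.executable, *extra_args, "-c", ""], check=True)
        best = min(best, time.perf_counter() - start)
    return best


if __name__ == "__main__":
    print(f"default:  {startup_time() * 1000:.1f} ms")
    # -S skips the automatic import of the site module.
    print(f"with -S:  {startup_time(('-S',)) * 1000:.1f} ms")
```

Taking the minimum over several runs filters out most of the noise from process scheduling and disk caching.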

  If we can address related problems at the same time, like emitting
  deprecation warnings for accessing certain module attributes, then even
  better!
 
 __deprecated__ = ['name1', ...]
 
 Make your dynamic load/reload aware of the __deprecated__ attribute, and
 you are done.

I'm fine with that syntax.  But regardless of how it's spelled, I'd really like
the standard library or interpreter to have support for it, rather than having a
non-standard 3rd-party system for it.  I want there to be one-- and preferably
only one --obvious way to do it.

-Andrew.



Re: [Python-Dev] Dynamic module namspaces

2006-07-17 Thread Johan Dahlin
Andrew Bennetts wrote:
 On Sat, Jul 15, 2006 at 03:38:04PM -0300, Johan Dahlin wrote:
 In an effort to reduce the memory usage used by GTK+ applications 
 written in python I've recently added a feature that allows attributes 
 to be lazy loaded in a module namespace. The gtk python module contains 
 quite a few attributes (around 850) of which many are classes or 
 interfaces (150+)
 
 Have you seen the demandload hack that Mercurial uses?  You can find it 
 here:
 http://selenic.com/repo/hg?f=cb4715847a81;file=mercurial/demandload.py

It seems quite similar to Phillip's Importing package. It doesn't completely
solve the problem I'm having, since I do something like this:

class LazyNamespace(ModuleType):
    def __init__(self, realmodule, locals):
        attributes = {}
        for attr in realmodule._get_lazy_attribute_names():
            attributes[attr] = None

        # Closure over `attributes` and `realmodule`; `_` is the module instance.
        def __getattribute__(_, name):
            if name in attributes:
                value = realmodule._construct_lazy_attribute(name)
                ...
                return value
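
A self-contained variant of the same idea is sketched below. It is not the PyGTK code: it uses `__getattr__` instead of `__getattribute__`, and the `factories` mapping (attribute name to a zero-argument callable) is a hypothetical replacement for the `_get_lazy_attribute_names()` and `_construct_lazy_attribute()` hooks in the excerpt.

```python
import types


class LazyNamespace(types.ModuleType):
    """Module whose attributes are built one at a time, on first access."""

    def __init__(self, name, factories):
        super().__init__(name)
        self._factories = dict(factories)

    def __getattr__(self, name):
        # Only called when the name is not already in the module's __dict__.
        factories = self.__dict__.get("_factories", {})
        try:
            factory = factories[name]
        except KeyError:
            raise AttributeError(name) from None
        value = factory()
        setattr(self, name, value)  # cache, so the factory runs only once
        return value

    def __dir__(self):
        # Advertise lazy names even before they have been constructed.
        return sorted(set(super().__dir__()) | set(self._factories))


created = []


def make_class(name):
    def factory():
        created.append(name)
        return type(name, (object,), {})
    return factory


gtk_like = LazyNamespace(
    "gtk_like", {"Button": make_class("Button"), "Window": make_class("Window")}
)
gtk_like.Button   # builds Button only
print(created)
```

Because each value is stored in the module's real `__dict__` after construction, code that reads the dictionary directly sees only the attributes that have been touched so far.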

There are almost 1000 symbols in the gtk namespace; creating all of them at
import time wastes memory and time. I've mainly been looking at memory
consumption, but speed will also benefit. Importing gtk at this point is
quite fast actually, less than 200ms on my fairly old box.

GUI programs do not need to be as responsive as command line applications
such as hg & bzr; users seem to accept that it takes a second or a few to
start up a GUI application.

-- 
Johan Dahlin [EMAIL PROTECTED]
Async Open Source


Re: [Python-Dev] Dynamic module namspaces

2006-07-17 Thread Johan Dahlin
James Y Knight wrote:
 
 On Jul 15, 2006, at 2:38 PM, Johan Dahlin wrote:
 What I want to ask, is it possible to have a sanctioned way to implement
 a dynamic module/namespace in python?

 For instance, it could be implemented to allow you to replace the
 __dict__ attribute in a module with a user provided object which
 implements the dictionary protocol.
 
 I'd like this, as well, although my use case is different: I'd like to
 be able to deprecate attributes in a module. That is, if I have:
 
 foo.py:
 SOME_CONSTANT = 5
 
 I'd like to be able to do something such that any time anyone accessed
 foo.SOME_CONSTANT, it'd emit a DeprecationWarning.

Agreed, this would be another nice feature to have.

I did something similar a while ago for PyGTK as well. While less elegant
than your proposed solution, it seems to be working fairly well:

DeprecatedConstant can be found here:
http://cvs.gnome.org/viewcvs/pygtk/gtk/deprecation.py?view=markup

-- 
Johan Dahlin [EMAIL PROTECTED]
Async Open Source


Re: [Python-Dev] Dynamic module namspaces

2006-07-17 Thread Johan Dahlin
Phillip J. Eby wrote:

 Just as a point of reference, the Importing package does something very
 similar, to support weak and lazy imports:
 
 http://cheeseshop.python.org/pypi/Importing

Interesting, I was not aware of that, thanks for the pointer.
Another reason for including this feature in the standard library ;-)

 The things most likely to be problems are tools like pydoc or other
 inspect-using code that expects modules to be exactly ModuleType and
 don't use isinstance().  Apart from that, I've been using the technique
 since the early days of Python 2.2 without encountering any problems
 until the PEP 302 reload() bug came along, but that was fixed in
 2.3.5.  I haven't seen any other problems since.

I'd argue that pydoc & friends are broken if they assume that a module will
always be of ModuleType and not a subclass of it.

 On the other hand, the Importing package takes a much more conservative
 approach than you are doing; it simply runs reload() on a module when
 __getattribute__ is called (after restoring the old version of
 __getattribute__).  So, as soon as you touch a lazily loaded module, it
 ceases to be particularly special, and it has a real __dict__.  It's
 possible that what you're doing could have more side-effects than what
 I'm doing.

This is an interesting approach, I thought of using that but I didn't quite
manage to find the time to implement it properly.
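
For comparison, later Python versions (3.5 and up) ship `importlib.util.LazyLoader`, which works in a similar spirit: the module object exists immediately, and its code runs on first attribute access. The sketch below follows the pattern from the importlib documentation; the `lazy_import` helper name and the choice of `colorsys` as a demo module are assumptions made here for illustration.

```python
import importlib.util
import sys


def lazy_import(name):
    """Return module `name`, deferring execution of its code until first use."""
    if name in sys.modules:
        return sys.modules[name]
    spec = importlib.util.find_spec(name)
    if spec is None:
        raise ModuleNotFoundError(f"No module named {name!r}")
    loader = importlib.util.LazyLoader(spec.loader)
    spec.loader = loader
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    # With LazyLoader this only arms the module; the body runs on first access.
    loader.exec_module(module)
    return module


colorsys = lazy_import("colorsys")
print(colorsys.rgb_to_hsv(1.0, 0.0, 0.0))
```

As with the approach quoted above, the whole module is loaded at once on first touch, so it does not give per-attribute laziness.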

However, for the gtk namespace this won't be enough, since I want to avoid
creating all the types when the first one is accessed; when gtk.Button is
accessed, gtk.Window will still not be created.

 What I want to ask, is it possible to have a sanctioned way to implement
 a dynamic module/namespace in python?
 
 That would be nice, but I think that what you and I are doing are
 probably the One Obvious Ways to do the respective things we're doing.

I consider __getattribute__ a hack, being able to override __dict__ is less
hackish, IMHO.

-- 
Johan Dahlin [EMAIL PROTECTED]
Async Open Source


Re: [Python-Dev] Dynamic module namspaces

2006-07-17 Thread glyph
On Mon, 17 Jul 2006 10:29:22 -0300, Johan Dahlin [EMAIL PROTECTED] wrote:

I consider __getattribute__ a hack, being able to override __dict__ is less
hackish, IMHO.

Why do you feel one is more hackish than the other?  In my experience the
opposite is true: certain C APIs expect __dict__ to be a real dictionary,
and if you monkey with it they won't call the overridden functions you expect,
whereas things accessing attributes will generally call through all the
appropriate Python-level APIs.

This makes sense to me for efficiency reasons and for clarity as well; if you're
trawling around in a module's __dict__ then you'd better be ready for what
you're going to get - *especially* if the module is actually a package.  Even in
normal python code, packages can have names which would be bound if they were
imported, but aren't yet.
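
The effect described above can be seen on ordinary instances in CPython. The sketch below is only an analogy (it uses an instance `__dict__`, not a module's), and the `LoudDict` and `Thing` names are invented for the demonstration.

```python
class LoudDict(dict):
    """dict subclass that records every Python-level item lookup."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.lookups = []

    def __getitem__(self, key):
        self.lookups.append(key)
        return super().__getitem__(key)


class Thing:
    pass


thing = Thing()
namespace = LoudDict(x=1)
thing.__dict__ = namespace

# Attribute access reads the dict through the C API in CPython,
# so the overridden __getitem__ is not consulted.
print(thing.x, namespace.lookups)

# An explicit subscript does go through the override.
print(namespace["x"], namespace.lookups)
```

Other Python implementations are free to behave differently here, which is part of why relying on an overridden `__dict__` is fragile.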


Re: [Python-Dev] Dynamic module namspaces

2006-07-16 Thread James Y Knight

On Jul 15, 2006, at 2:38 PM, Johan Dahlin wrote:
 What I want to ask, is it possible to have a sanctioned way to implement
 a dynamic module/namespace in python?

 For instance, it could be implemented to allow you to replace the
 __dict__ attribute in a module with a user provided object which
 implements the dictionary protocol.

I'd like this, as well, although my use case is different: I'd like  
to be able to deprecate attributes in a module. That is, if I have:

foo.py:
SOME_CONSTANT = 5

I'd like to be able to do something such that any time anyone  
accessed foo.SOME_CONSTANT, it'd emit a DeprecationWarning.
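
For reference, Python 3.7 and later allow a module to define its own `__getattr__` function (PEP 562), which covers this use case as long as the deprecated name is kept out of the module's normal namespace. The sketch below builds `foo` from a source string so that it runs as a single file; the `_SOME_CONSTANT` name and the warning text are illustrative assumptions.

```python
import types
import warnings

FOO_SOURCE = '''
import warnings

_SOME_CONSTANT = 5  # private home of the deprecated value


def __getattr__(name):
    # Called only for names missing from the module namespace.
    if name == "SOME_CONSTANT":
        warnings.warn(
            "foo.SOME_CONSTANT is deprecated",
            DeprecationWarning,
            stacklevel=2,
        )
        return _SOME_CONSTANT
    raise AttributeError(f"module {__name__!r} has no attribute {name!r}")
'''

# Stand-in for a real foo.py on disk.
foo = types.ModuleType("foo")
exec(FOO_SOURCE, foo.__dict__)

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    value = foo.SOME_CONSTANT

print(value, [str(w.message) for w in caught])
```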

James


Re: [Python-Dev] Dynamic module namspaces

2006-07-16 Thread Andrew Bennetts
On Sat, Jul 15, 2006 at 03:38:04PM -0300, Johan Dahlin wrote:
 In an effort to reduce the memory usage used by GTK+ applications 
 written in python I've recently added a feature that allows attributes 
 to be lazy loaded in a module namespace. The gtk python module contains 
 quite a few attributes (around 850) of which many are classes or 
 interfaces (150+)

Have you seen the demandload hack that Mercurial uses?  You can find it here:
http://selenic.com/repo/hg?f=cb4715847a81;file=mercurial/demandload.py

You can see an example use of it here:
http://selenic.com/repo/hg?f=d276571f2c4b;file=mercurial/commands.py

The advantage for an interactive command line tool isn't so much memory
consumption as speed.  Why waste hundreds of milliseconds importing code that
isn't used?  There's an experimental branch to use the same demandload code in
bzr, the reported results are 400ms for 'bzr rocks' down to 100ms, 'bzr root'
from 400ms => 200ms, etc. (according to
http://permalink.gmane.org/gmane.comp.version-control.bazaar-ng.general/13967)

Over half the runtime wasted on importing unused code!  There's a definite need
for a nice solution to this, and it should be included in the standard
batteries that come with Python.

If we can address related problems at the same time, like emitting deprecation
warnings for accessing certain module attributes, then even better!

-Andrew.



[Python-Dev] Dynamic module namspaces

2006-07-15 Thread Johan Dahlin
In an effort to reduce the memory used by GTK+ applications written in
Python, I've recently added a feature that allows attributes to be lazily
loaded in a module namespace. The gtk Python module contains quite a few
attributes (around 850), of which many are classes or interfaces (150+).

The changes to PyGTK I had to make can not be considered anything but a 
hack; I had to put a subclass of ModuleType in sys.modules and 
override __getattribute__ to be able to get old code which accessed 
gtk.__dict__ directly to still work (PyModule_GetDict requires that).
However, even if I didn't have to use __getattribute__, overriding 
sys.modules is rather unpleasant and I'm afraid it'll cause problems in 
the future.
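
The sys.modules technique itself is small. The following is a minimal sketch, not the PyGTK implementation: the `DynamicModule` class, the `lazy_demo` module name, and the placeholder classes it creates are all invented for illustration.

```python
import sys
import types


class DynamicModule(types.ModuleType):
    """ModuleType subclass that creates attributes the first time they are used."""

    def __getattr__(self, name):
        # Called only when normal lookup in the module's __dict__ fails.
        if name.startswith("__"):
            raise AttributeError(name)
        value = type(name, (object,), {})  # placeholder for a costly real object
        setattr(self, name, value)
        return value


# Register the instance under the name that client code will import.
sys.modules["lazy_demo"] = DynamicModule("lazy_demo")

import lazy_demo  # the import system hands back the object installed above

print(type(lazy_demo).__name__, lazy_demo.Button)
```

Client code keeps writing `import lazy_demo; lazy_demo.Button`, but tools that test `type(module) is ModuleType` rather than using `isinstance()` will not recognise the object as a module.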

My point is that I consider this to be a valid use case: the amount of 
saved memory is significant, and I could not find another way of doing it 
while keeping the gtk interface (import gtk; gtk.Button) backwards 
compatible.

What I want to ask, is it possible to have a sanctioned way to implement 
a dynamic module/namespace in python?

For instance, it could be implemented to allow you to replace the 
__dict__ attribute in a module with a user provided object which 
implements the dictionary protocol.

Johan



Re: [Python-Dev] Dynamic module namspaces

2006-07-15 Thread Giovanni Bajo
Johan Dahlin wrote:

 My point is that I consider this to be a valid use case, the amount of
 saved memory is significant, and I could not find another way of doing
 it and still keep the gtk interface (import gtk; gtk.Button) to still be
 backwards compatible.

You may want to have a look at SIP/PyQt. They implement the full Qt
interface, which is rather large, but import time is blazingly fast and
memory usage grows by only 4-5 MB at import time. The trick is that
methods are somehow generated dynamically at their first use (but dir()
and introspection still work...).

SIP is free and generic btw, you may want to consider it as a tool.
-- 
Giovanni Bajo
