On Thu, Jul 31, 2008 at 10:02 PM, Robert Kern <[EMAIL PROTECTED]> wrote:
> On Thu, Jul 31, 2008 at 05:43, Andrew Dalke <[EMAIL PROTECTED]> wrote:
>> On Jul 31, 2008, at 12:03 PM, Robert Kern wrote:
>>
>>> But you still can't remove them since they are being used inside
>>> numerictypes. That's why I labeled them "internal utility functions"
>>> instead of leaving them with minimal docstrings such that you would
>>> have to guess.
>>
>> My proposal is to replace that code with a table mapping
>> the type name to the uppercase/lowercase/capitalized forms,
>> thus eliminating the (small) amount of time needed to
>> import string.
>>
>> It makes adding new types slightly more difficult.
>>
>> I know it's a tradeoff.
>
> Probably not a bad one. Write up the patch, and then we'll see how
> much it affects the import time.
>
> I would much rather that we discuss concrete changes like this rather
> than rehash the justifications of old decisions. Regardless of the
> merits of the old decisions (and I agreed with your position at the
> time), it's a pointless and irrelevant conversation. The decisions
> were made, and now we have a user base to whom we have promised not to
> break their code so egregiously again. The relevant conversation is
> what changes we can make now.
>
> Some general guidelines:
>
> 1) Everything exposed by "from numpy import *" still needs to work.
>    a) The layout of everything under numpy.core is an implementation detail.
>    b) _underscored functions and explicitly labeled internal functions
>       can probably be modified.
>    c) Ask about specific functions when in doubt.
>
> 2) The improvement in import times should be substantial. Feel free to
> bundle up the optimizations for consideration.
>
> 3) Moving imports from module-level down into the functions where they
> are used is generally okay if we get a reasonable win from it. The
> local imports should be commented, explaining that they are made local
> in order to improve the import times.
>
> 4) __import__ hacks are off the table.
>
> 5) Proxy objects ... I would really like to avoid proxy objects. They
> have caused fragility in the past.
>
> 6) I'm not a fan of having environment variables control the way numpy
> gets imported, but I'm willing to consider it. For example, I might go
> for having proxy objects for linalg et al. *only* if a particular
> environment variable were set. But there had better be a very large
> improvement in import times.
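[Editor's note: a hedged sketch of the table approach Andrew proposes above.
The table contents and the helper name `english_capitalize` are illustrative
only, not numpy's actual numerictypes data; the point is that a precomputed
mapping replaces the startup-time `import string` work.]

    # Illustrative sketch: precompute each type name's capitalized form in a
    # table so the module no longer needs "import string" at startup.
    _capitalized = {
        "int8": "Int8",
        "uint8": "UInt8",        # irregular: plain .capitalize() gives "Uint8"
        "float64": "Float64",
        "complex128": "Complex128",
    }

    def english_capitalize(name):
        # Table lookup first; fall back to str.capitalize() so a type
        # missing from the table still gets a reasonable form.
        return _capitalized.get(name, name.capitalize())

As Andrew notes, the tradeoff is that each new type requires a table entry
(or must be acceptable under the fallback).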
I just want to say that I agree with Andrew that slow imports just
suck. But it's not really that bad; for example, on my system:

In [1]: %time import numpy
CPU times: user 0.11 s, sys: 0.01 s, total: 0.12 s
Wall time: 0.12 s

so that's ok. For comparison:

In [1]: %time import sympy
CPU times: user 0.12 s, sys: 0.02 s, total: 0.14 s
Wall time: 0.14 s

But I am still unhappy about it. I'd like the package to import much
faster, because it adds up: when you need to import 7 packages like
that, it's suddenly 1s, and that's just too much. But of course
everything within the constraints that Robert has outlined.

From a theoretical point of view, I don't understand why Python cannot
just "import numpy" (or any other package) immediately, and only
perform the real import at the moment the user actually accesses
something. Mercurial uses a lazy import module that does exactly this.
Maybe that's an option? Look into mercurial/demandimport.py. Use it
like this:

In [1]: import demandimport

In [2]: demandimport.enable()

In [3]: %time import numpy
CPU times: user 0.00 s, sys: 0.00 s, total: 0.00 s
Wall time: 0.00 s

That's pretty good, huh?
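[Editor's note: a minimal sketch of the demand-import idea described above.
This is not Mercurial's actual implementation -- the real
mercurial/demandimport.py also patches __import__ and handles
"from x import y" -- it only illustrates the core trick: a cheap
placeholder object that performs the real import on first attribute
access.]

    import importlib

    class LazyModule:
        """Placeholder that imports the real module on first attribute access."""

        def __init__(self, name):
            self._name = name      # dotted module name, e.g. "numpy"
            self._module = None    # filled in lazily

        def __getattr__(self, attr):
            # __getattr__ only fires when normal lookup fails, so it is never
            # triggered for _name/_module, only for the wrapped module's names.
            if self._module is None:
                self._module = importlib.import_module(self._name)
            return getattr(self._module, attr)

    math = LazyModule("math")   # instantaneous: nothing is imported yet
    print(math.sqrt(4.0))       # the real "import math" happens here

This is also where the fragility Robert mentions under point 5 comes from:
anything that inspects the proxy instead of the real module (isinstance
checks, reload, module-level attribute binding) can observe the difference.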
:)

Unfortunately, numpy cannot work with lazy import (yet):

In [5]: %time from numpy import array
ERROR: An unexpected error occurred while tokenizing input
The following traceback may be corrupted or invalid
The error message is: ('EOF in multi-line statement', (17, 0))

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)

[skip]

/usr/lib/python2.5/site-packages/numpy/lib/index_tricks.py in <module>()
     14 import function_base
     15 import numpy.core.defmatrix as matrix
---> 16 makemat = matrix.matrix
     17
     18 # contributed by Stefan van der Walt

/home/ondra/ext/sympy/demandimport.pyc in __getattribute__(self, attr)
     73             return object.__getattribute__(self, attr)
     74         self._load()
---> 75         return getattr(self._module, attr)
     76     def __setattr__(self, attr, val):
     77         self._load()

AttributeError: 'module' object has no attribute 'matrix'

BTW, neither can SymPy. However, maybe it shows some possibilities,
and maybe it's possible to fix numpy to work with such a lazy import.
On the other hand, I can imagine it could bring a lot more trouble, so
it should probably only be optional.

Ondrej
_______________________________________________
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion
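[Editor's note: the failing line in the traceback above, "makemat =
matrix.matrix", touches a submodule attribute at import time, which forces
the demandimport proxy to resolve a module before it is fully initialized.
A hedged sketch of the lazy-friendly refactoring -- using the stdlib's json
rather than numpy's defmatrix, purely for illustration -- is to defer the
attribute lookup into the function that needs it, which is also what
Robert's guideline 3 permits:]

    # Eager style -- the attribute access runs as soon as this module is
    # imported, which defeats a demand-import proxy:
    #
    #   import json
    #   _loads = json.loads
    #
    # Lazy-friendly style -- the attribute is resolved on first call instead:
    def loads(text):
        import json          # local import: cheap sys.modules lookup after the first call
        return json.loads(text)

    print(loads('{"a": 1}'))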