Re: [Numpy-discussion] import numpy is slow

2008-08-05 Thread Christopher Barker
Robert Kern wrote:

 
 It's still pretty bad, though. I do recommend running Disk Repair like Bill 
 did.

I did that, and it found nothing and did nothing -- I suspect the repair had 
already run when I re-booted, since the reboot did take a while.

However, this is pretty consistently what I'm getting now:

$ time python -c "import numpy"

real    0m0.728s
user    0m0.327s
sys     0m0.398s

Which is apparently pretty slow. Robert gets:

$ time python -c "import numpy"
python -c import numpy  0.18s user 0.46s system 88% cpu 0.716 total

Is that on a similar machine??? Are you running Universal binaries? 
Would that make any difference? I wouldn't think so, I'm just grasping 
at straws here.

This is a Dual 1.8GHz G5 desktop, running OS-X 10.4.11, Python 2.5.2 
(python.org build), numpy 1.1.1 (from binary on sourceforge)

I just tried this on a colleague's machine that is identical, and got 
about 0.4 seconds real -- so faster than mine, but still slow.

This still feels blazingly fast to me, as I was getting something like 
7+ seconds!

thanks for all the help,

-Chris







-- 
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/ORR             (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

[EMAIL PROTECTED]
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] import numpy is slow

2008-08-04 Thread Christopher Barker
Robert Kern wrote:
 It isn't. The problem is on Chris's file system.

Thanks for all your help, Robert. Interestingly, I haven't noticed any 
problems anywhere else, but who knows?

I guess this is what Linus Torvalds meant when he said that OS-X's file 
system was brain dead.

 Whatever is wrong
 with his file system (Bill Spotz's identical problem suggests too many
 temporary but unused inodes)

I didn't see anything about Bill having similar issues -- was it on this 
list?

 But the problem really is his disk; it's not
 a problem with numpy or Python or anything else.

So the question is: what can I do about it? Do I have any other choice 
than wiping the disk and re-installing?

-Chris





Re: [Numpy-discussion] import numpy is slow

2008-08-04 Thread Robert Kern
On Mon, Aug 4, 2008 at 14:24, Christopher Barker [EMAIL PROTECTED] wrote:
 Robert Kern wrote:
 It isn't. The problem is on Chris's file system.

 Thanks for all your help, Robert. Interestingly, I haven't noticed any
 problems anywhere else, but who knows?

 I guess this is what Linus Torvalds meant when he said that OS-X's file
 system was brain dead.

 Whatever is wrong
 with his file system (Bill Spotz's identical problem suggests too many
 temporary but unused inodes)

 I didn't see anything about Bill having similar issues -- was it on this
 list?

From my earlier message in this thread:


Looking at the Shark results you sent me, it looks like all of your
time is getting sucked up by the system call getdirentries(). Googling
for some of the function names in that stack brings me to the message
"Slow python initialization" on the Pythonmac-SIG:

 http://mail.python.org/pipermail/pythonmac-sig/2005-December/015542.html

The ultimate resolution was that Bill Spotz, the original poster, ran
Disk Utility and used the Disk Repair option to clean up a large
number of unused inodes. This solved the problem for him:

 http://mail.python.org/pipermail/pythonmac-sig/2005-December/015548.html


-- 
Robert Kern

I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth.
 -- Umberto Eco


Re: [Numpy-discussion] import numpy is slow

2008-08-04 Thread Christopher Barker
OK,

So I'm an idiot. After reading this, I thought "I haven't rebooted for a 
while." It turns out it's been 35 days. I think I've been having slow 
startup for longer than that, but nonetheless, I re-booted (which 
took a long time), and presto:

$ time python -c "import numpy"

real    0m0.686s
user    0m0.322s
sys     0m0.363s

much better!

I suspect OS-X did some disk-cleaning on re-boot.

Frankly, 35 days is pretty pathetic for an uptime, but as I said, I 
think this issue has been going on longer. Perhaps OS-X runs a disk 
check every n re-boots, like some Linux distros do.

Sorry about the noise, and thanks, particularly to Robert, for taking an 
interest in this.

-Chris





Re: [Numpy-discussion] import numpy is slow

2008-08-04 Thread Robert Kern
On Mon, Aug 4, 2008 at 18:01, Christopher Barker [EMAIL PROTECTED] wrote:
 OK,

 So I'm an idiot. After reading this, I thought "I haven't rebooted for a
 while." It turns out it's been 35 days. I think I've been having slow
 startup for longer than that, but nonetheless, I re-booted (which
 took a long time), and presto:

 $ time python -c "import numpy"

 real    0m0.686s
 user    0m0.322s
 sys     0m0.363s

 much better!

It's still pretty bad, though. I do recommend running Disk Repair like Bill did.

-- 
Robert Kern



Re: [Numpy-discussion] import numpy is slow

2008-08-02 Thread Robert Kern
On Sat, Aug 2, 2008 at 00:06, David Cournapeau
[EMAIL PROTECTED] wrote:
 Christopher Barker wrote:

 OK, I just installed wxPython, and whoa!

 time python -c "import numpy"

 real    0m2.793s
 user    0m0.294s
 sys     0m2.494s

 so it's taking almost two seconds more to import numpy, now that
 wxPython is installed. I haven't even imported it yet. importing wx
 isn't as bad:

 $ time python -c "import wx"

 real    0m1.589s
 user    0m0.274s
 sys     0m1.000s

 Since the import time of numpy without wx plus the import time of wx adds
 up to the numpy import time with wx installed, this suggests that numpy
 may import wx. Which it shouldn't, obviously. There is something strange
 happening here. Please check whether wx really is imported when you do
 import numpy:

 python -c "import numpy; import sys; print sys.modules"

 And if it is, we have to know why it is imported at all when doing
 import numpy.

It isn't. The problem is on Chris's file system. Whatever is wrong
with his file system (Bill Spotz's identical problem suggests too many
temporary but unused inodes) increases the traversal of the file
system. wx has a .pth file which adds entries to sys.path. Every time
one tries to import something, the entries on sys.path are examined
for the module. So increasing the number of entries on sys.path
exacerbates the problem. But the problem really is his disk; it's not
a problem with numpy or Python or anything else.
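
Robert's point about sys.path can be checked directly. The following is a minimal sketch, written for a current Python 3 rather than the Python 2.5 used in this thread, that times a plain directory listing of each sys.path entry. The helper name scan_sys_path is hypothetical, and a directory listing is only a rough proxy for what the import machinery does, so treat the numbers as a hint and not as a profile.

```python
import os
import sys
import time


def scan_sys_path(paths=None):
    """Return (entry, n_items, seconds) for each sys.path entry.

    n_items and seconds are None when the entry is not a readable
    directory (for example a zipped egg or a path that no longer exists).
    """
    entries = list(sys.path) if paths is None else list(paths)
    rows = []
    for entry in entries:
        # An empty string on sys.path stands for the current directory.
        target = entry or os.getcwd()
        if not os.path.isdir(target):
            rows.append((entry, None, None))
            continue
        start = time.perf_counter()
        try:
            n_items = len(os.listdir(target))
        except OSError:
            rows.append((entry, None, None))
            continue
        rows.append((entry, n_items, time.perf_counter() - start))
    return rows


if __name__ == "__main__":
    for entry, n_items, seconds in scan_sys_path():
        if seconds is None:
            print("skipped   %s" % entry)
        else:
            print("%8.4fs %6d items  %s" % (seconds, n_items, entry))
```

If one entry takes far longer than the others for a similar number of items, that would be consistent with a file system problem of the kind described above, although it does not prove it.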

-- 
Robert Kern



Re: [Numpy-discussion] import numpy is slow

2008-08-02 Thread David Cournapeau
Robert Kern wrote:

 It isn't. The problem is on Chris's file system. Whatever is wrong
 with his file system (Bill Spotz's identical problem suggests too many
 temporary but unused inodes) increases the traversal of the file
 system.

Ah, I did not think it could indeed affect the whole fs. This seems much
more likely, then. I guess I was confused because wx caused me some
problems a long time ago, with scipy, and thought maybe there were some
leftovers in Chris' system.

It would also explain why import numpy is still kind of slow on his
machine. I don't remember the numbers, but I think it was quicker on my
PPC Mac mini (under Mac OS X) than on his computer.

  wx has a .pth file which adds entries to sys.path. Every time
 one tries to import something, the entries on sys.path are examined
 for the module. So increasing the number of entries on sys.path
 exacerbates the problem. But the problem really is his disk; it's not
 a problem with numpy or Python or anything else.
   

It was an fs problem, after all. I am a bit surprised this can happen in
such an aggravated manner, though.

cheers,

David



Re: [Numpy-discussion] import numpy is slow

2008-08-02 Thread Andrew Dalke
I've got a proof of concept that takes the time to import numpy on my
machine from 0.21 seconds down to 0.08 seconds.  Doing that
required some somewhat awkward things, like deferring all 'import re'
statements.  I don't think that's stable in the long run because
people will blithely import re in the future and not care that it
takes 0.02 seconds to import.  I don't blame them for complaining; I
was curious about how fast I could get things.

Note that when I started complaining about this a month ago the  
import time on my machine was about 0.3 seconds.

I'll work on patches within the next couple of days.  Here's an  
outline of what I did, along with some questions about what's feasible.

1) don't import 'numpy.testing'.  Savings = 0.012s.
Doing so required patches like

-from numpy.testing import Tester
-test = Tester().test
-bench = Tester().bench
+def test(label='fast', verbose=1, extra_argv=None, doctests=False,
+         coverage=False, **kwargs):
+    from testing import Tester
+    import numpy
+    Tester(numpy).test(label, verbose, extra_argv, doctests,
+                       coverage, **kwargs)
+def bench(label='fast', verbose=1, extra_argv=None):
+    from testing import Tester
+    import numpy
+    Tester(numpy).bench(label, verbose, extra_argv)

QUESTION: since numpy is moving to nose, and the documentation only  
describes doing 'import numpy; numpy.test()', can I remove all other  
definitions of test and bench?


2)  removing 'import ctypeslib' in top-level - 0.023 seconds

QUESTION: is this considered part of the API that must be preserved?   
The primary use case is supposed to be to help interactive users.  I  
don't think interactive users spend much time using ctypes, and those  
that do are also those that aren't confused about needing an extra  
import statement.

3) removing 'import string' in numerictypes.py - 0.008 seconds.
This requires some ugly but simple changes to the code.

4) remove the 'import re' in _internal, numpy/lib/, function_base,  
and other places.  This reduced my overall startup cost by 0.013.

5) defer bzip and gzip imports in _datasource: 0.009 s.  This will  
require non-trivial code changes.

6) defer 'format' from io.py: 0.007 s

7) _datasource imports shutil in order to use shutil.rmtree in a
__del__.  I don't think this can be deferred, because I don't want to
do an import during system shutdown, which is when the __del__ might
be called.  It would save 0.004s.

8) If I can remove 'import doc' from the top-level numpy (is that  
part of the required API?) then I can save 0.004s.

9) defer urlparse in _datasource: about 0.003s

10) If I get rid of the top-level cPickle import in numeric.py then I
can save 0.006 seconds.

11) not importing add_newdocs saves 0.005 s.  This might be possible  
by moving all of the docstrings to the actual functions.  I haven't  
looked into this much and it might not be possible.

Those millisecond improvements add up!  When I do an interactive  
'import numpy' on my system I don't notice the import time like I did  
before.
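
Per-module savings like the ones listed above can be estimated by timing a fresh interpreter for each import. This is a minimal sketch for a current Python 3, not the method Andrew used; the helper names time_import and import_overhead are hypothetical. Each measurement includes interpreter startup, so the second helper subtracts a baseline run that imports only sys, which is always loaded already.

```python
import subprocess
import sys
import time


def time_import(module_name, repeats=3):
    """Best wall-clock seconds for running: python -c "import <module_name>"."""
    best = None
    for _ in range(repeats):
        start = time.perf_counter()
        subprocess.run([sys.executable, "-c", "import " + module_name],
                       check=True)
        elapsed = time.perf_counter() - start
        if best is None or elapsed < best:
            best = elapsed
    return best


def import_overhead(module_name, repeats=3):
    """Import time minus the time of an interpreter that imports nothing extra.

    The result can be slightly negative for very cheap modules because of
    measurement noise.
    """
    baseline = time_import("sys", repeats)
    return time_import(module_name, repeats) - baseline


if __name__ == "__main__":
    for name in ("re", "string", "gzip", "shutil"):
        print("%-8s %.4f s" % (name, import_overhead(name)))
```

Modules that the interpreter already loads at startup will show an overhead near zero, so the numbers are only meaningful for imports that are not part of startup.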

Andrew
[EMAIL PROTECTED]






Re: [Numpy-discussion] import numpy is slow

2008-08-01 Thread Gael Varoquaux
On Fri, Aug 01, 2008 at 09:18:48AM -0700, Christopher Barker wrote:
  What does python -c "import sys; print sys.path" say ?

 A lot! 41 entries, and lots of eggs -- are eggs an issue? I'm also 
 wondering how the order is determined -- if it looked in site-packages 
 first, it would find numpy a whole lot faster.

AFAIK this is a setuptools issue. From what I hear, it might be fixed in
the svn version of setuptools, but they still have to make a release that
has this feature.

The two issues I can see are: import path priority, which shouldn't be screwed
up like it is, and speed. Speed is obviously a hard problem.

 I suspect the thing to do is to re-install from scratch, and only add in 
 packages I'm really using now.

Avoid eggs if you can. This has been my policy. I am not sure how much
this is just superstition or a real problem, though.

I realize that you are on a Mac, and that Mac OS X, unlike some Linux
distributions, does not have a good dependency tracking system. Thus
setuptools and eggs are a great temptation. They come at a cost, but it can
probably be improved. If you care about this problem, you could try and work
with the setuptools developers to improve the situation. I must say that I am
on Ubuntu, and I don't have the dependency problem at all, so
setuptools does not answer an important need for me. I however realize
that not everybody wants to use Ubuntu and I thus care about the problem,
maybe not enough to invest much time in setuptools, but at least enough
to try to report problems and track solutions. Do not underestimate how
difficult it is to get a package manager that works well.

If you ever do verify that it is indeed eggs that are slowing down your
import, I'd be interested in having the confirmation, just so that I am
sure I am not blaming them for nothing.

Cheers,

Gaël


Re: [Numpy-discussion] import numpy is slow

2008-08-01 Thread Robert Kern
On Fri, Aug 1, 2008 at 11:53, Gael Varoquaux
[EMAIL PROTECTED] wrote:
 On Fri, Aug 01, 2008 at 09:18:48AM -0700, Christopher Barker wrote:
  What does python -c "import sys; print sys.path" say ?

 A lot! 41 entries, and lots of eggs -- are eggs an issue? I'm also
 wondering how the order is determined -- if it looked in site-packages
 first, it would find numpy a whole lot faster.

 AFAIK this is a setuptools issue. From what I hear, it might be fixed in
 the svn version of setuptools, but they still have to make a release that
 has this feature.

 The two issues I can see are: import path priority, which shouldn't be screwed
 up like it is, and speed. Speed is obviously a hard problem.

 I suspect the thing to do is to re-install from scratch, and only add in
 packages I'm really using now.

 Avoid eggs if you can. This has been my policy. I am not sure how much
 this is just superstition or a real problem, though.

Superstition.

[~]$ python -c "import sys; print len(sys.path)"
269

[~]$ python -v -v -c "import numpy" 2> foo.txt
[~]$ wc -l foo.txt
42500 foo.txt

[~]$ time python -c "import numpy"
python -c import numpy  0.18s user 0.46s system 88% cpu 0.716 total

So cut it out.

Chris, please profile your import so we actually have some real
information to work with instead of prejudices.

-- 
Robert Kern



Re: [Numpy-discussion] import numpy is slow

2008-08-01 Thread Ondrej Certik
On Thu, Jul 31, 2008 at 10:02 PM, Robert Kern [EMAIL PROTECTED] wrote:
 On Thu, Jul 31, 2008 at 05:43, Andrew Dalke [EMAIL PROTECTED] wrote:
 On Jul 31, 2008, at 12:03 PM, Robert Kern wrote:

 But you still can't remove them since they are being used inside
 numerictypes. That's why I labeled them internal utility functions
 instead of leaving them with minimal docstrings such that you would
 have to guess.

 My proposal is to replace that code with a table mapping
 the type name to the uppercase/lowercase/capitalized forms,
 thus eliminating the (small) amount of time needed to
 import string.

 It makes adding new types slightly more difficult.

 I know it's a tradeoff.

 Probably not a bad one. Write up the patch, and then we'll see how
 much it affects the import time.

 I would much rather that we discuss concrete changes like this rather
 than rehash the justifications of old decisions. Regardless of the
 merits about the old decisions (and I agreed with your position at the
 time), it's a pointless and irrelevant conversation. The decisions
 were made, and now we have a user base to whom we have promised not to
 break their code so egregiously again. The relevant conversation is
 what changes we can make now.

 Some general guidelines:

 1) Everything exposed by "from numpy import *" still needs to work.
  a) The layout of everything under numpy.core is an implementation detail.
  b) _underscored functions and explicitly labeled internal functions
 can probably be modified.
  c) Ask about specific functions when in doubt.

 2) The improvement in import times should be substantial. Feel free to
 bundle up the optimizations for consideration.

 3) Moving imports from module-level down into the functions where they
 are used is generally okay if we get a reasonable win from it. The
 local imports should be commented, explaining that they are made local
 in order to improve the import times.

 4) __import__ hacks are off the table.

 5) Proxy objects ... I would really like to avoid proxy objects. They
 have caused fragility in the past.

 6) I'm not a fan of having environment variables control the way numpy
 gets imported, but I'm willing to consider it. For example, I might go
 for having proxy objects for linalg et al. *only* if a particular
 environment variable were set. But there had better be a very large
 improvement in import times.
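
Guideline 3 above (module-level imports moved into the functions that use them, with a comment saying why) might look like the following. This is an illustrative sketch for a current Python 3 and not code from numpy; compress_payload and decompress_payload are hypothetical names, and bz2 merely stands in for any import that is costly and rarely needed.

```python
import sys


def compress_payload(data):
    # Local import: bz2 is only needed here, so it is kept out of module
    # scope to keep importing this module cheap.
    import bz2
    return bz2.compress(data)


def decompress_payload(blob):
    # Local import for the same reason as in compress_payload().
    import bz2
    return bz2.decompress(blob)


if __name__ == "__main__":
    print("bz2 loaded before first call:", "bz2" in sys.modules)
    blob = compress_payload(b"numpy " * 100)
    print("bz2 loaded after first call: ", "bz2" in sys.modules)
    assert decompress_payload(blob) == b"numpy " * 100
```

After the first call the module is cached in sys.modules, so a repeated local import costs only a dictionary lookup.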


I just want to say that I agree with Andrew that slow imports just
suck. But it's not really that bad, for example on my system:

In [1]: %time import numpy
CPU times: user 0.11 s, sys: 0.01 s, total: 0.12 s
Wall time: 0.12 s

so that's ok. For comparison:

In [1]: %time import sympy
CPU times: user 0.12 s, sys: 0.02 s, total: 0.14 s
Wall time: 0.14 s

But I am still unhappy about it, I'd like if the package could import
much faster, because it adds up, when you need to import 7 packages
like that, it's suddenly 1s and that's just too much.

But of course everything within the constraints that Robert has
outlined. From the theoretical point of view, I don't understand why
python cannot just import numpy (or any other package) immediately,
and only at the moment the user actually accesses something, import
it for real. Mercurial uses a lazy import module that does exactly
this. Maybe that's an option?

Look into mercurial/demandimport.py.

Use it like this:

In [1]: import demandimport

In [2]: demandimport.enable()

In [3]: %time import numpy
CPU times: user 0.00 s, sys: 0.00 s, total: 0.00 s
Wall time: 0.00 s


That's pretty good, huh? :)

Unfortunately, numpy cannot work with lazy import (yet):

In [5]: %time from numpy import array
ERROR: An unexpected error occurred while tokenizing input
The following traceback may be corrupted or invalid
The error message is: ('EOF in multi-line statement', (17, 0))

---
AttributeError                            Traceback (most recent call last)

[skip]


/usr/lib/python2.5/site-packages/numpy/lib/index_tricks.py in <module>()
     14 import function_base
     15 import numpy.core.defmatrix as matrix
---> 16 makemat = matrix.matrix
     17
     18 # contributed by Stefan van der Walt

/home/ondra/ext/sympy/demandimport.pyc in __getattribute__(self, attr)
     73             return object.__getattribute__(self, attr)
     74         self._load()
---> 75         return getattr(self._module, attr)
     76     def __setattr__(self, attr, val):
     77         self._load()

AttributeError: 'module' object has no attribute 'matrix'




BTW, neither can SymPy. However, maybe it shows some possibilities and
maybe it's possible to fix numpy to work with such a lazy import.

On the other hand, I can imagine it can bring a lot more trouble, so
it should probably only be optional.
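
For readers on a current Python 3, the lazy-import idea can be sketched with importlib.util.LazyLoader from the standard library. This is not Mercurial's demandimport and did not exist when this thread was written; the helper name lazy_import is hypothetical, and the body follows the recipe in the importlib documentation. The module's code runs on first attribute access, not at import time.

```python
import importlib.util
import sys


def lazy_import(name):
    """Return a module object whose code runs on first attribute access."""
    if name in sys.modules:
        # Already imported: nothing to defer.
        return sys.modules[name]
    spec = importlib.util.find_spec(name)
    if spec is None or spec.loader is None:
        raise ImportError("cannot find module %r" % name)
    loader = importlib.util.LazyLoader(spec.loader)
    spec.loader = loader
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    # With LazyLoader this only arranges for loading on first access.
    loader.exec_module(module)
    return module


if __name__ == "__main__":
    colorsys = lazy_import("colorsys")      # cheap: nothing executed yet
    print(colorsys.rgb_to_hsv(1.0, 0.0, 0.0))  # first access triggers the load
```

The concern raised in this thread still applies: an error in the deferred module surfaces at first use instead of at the import statement, which makes failures harder to trace.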


Ondrej


Re: [Numpy-discussion] import numpy is slow

2008-08-01 Thread Christopher Barker

David Cournapeau wrote:

IOW, I don't think the problem is the numbers themselves. It has to be
something else. A simple profiling like

python -m cProfile -o foo.stats foo.py

and then:

python -c "import pstats; p = pstats.Stats('foo.stats');
p.sort_stats('cumulative').print_stats(50)"
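
The same two steps can be done in one process. The following is a minimal sketch for a current Python 3, using only the standard cProfile, pstats and io modules; the helper name profile_import is hypothetical. It profiles a single import and returns the report sorted by cumulative time.

```python
import cProfile
import io
import pstats


def profile_import(module_name, top=50):
    """Profile 'import <module_name>' and return the pstats report as text.

    If the module is already in sys.modules the import is a cache hit and
    the report will be nearly empty, so run this in a fresh interpreter.
    """
    profiler = cProfile.Profile()
    profiler.enable()
    __import__(module_name)
    profiler.disable()
    buf = io.StringIO()
    stats = pstats.Stats(profiler, stream=buf)
    stats.sort_stats("cumulative").print_stats(top)
    return buf.getvalue()


if __name__ == "__main__":
    print(profile_import("json", top=20))
```

As with the command-line version, module-level code shows up as <module> entries, so the cumulative column shows which imported files account for most of the time.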


OK, see the results -- I think (though I may be wrong) this means that 
the problem isn't in finding the numpy package:


As for Shark, I'm sorry I missed that message, but I'm trying to see if 
I can do that now -- I don't seem to have Shark installed, and the ADC 
site doesn't seem to be working, but I'll keep looking.


Thanks for all your help with this...

-Chris



Fri Aug  1 15:14:10 2008    ImportNumpy.stats

         26987 function calls (26098 primitive calls) in 5.150 CPU seconds

   Ordered by: cumulative time
   List reduced from 631 to 50 due to restriction 50

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    5.151    5.151 {execfile}
        1    0.036    0.036    5.151    5.151 ImportNumpy.py:1(<module>)
        1    0.146    0.146    5.115    5.115 /Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/__init__.py:63(<module>)
        1    0.026    0.026    3.941    3.941 /Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/add_newdocs.py:9(<module>)
        1    0.064    0.064    3.903    3.903 /Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/lib/__init__.py:1(<module>)
        1    0.179    0.179    2.077    2.077 /Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/lib/io.py:1(<module>)
        1    0.483    0.483    1.735    1.735 /Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/lib/_datasource.py:33(<module>)
        1    0.035    0.035    1.582    1.582 /Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/lib/type_check.py:3(<module>)
        1    0.112    0.112    1.547    1.547 /Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/core/__init__.py:2(<module>)
        1    0.010    0.010    1.348    1.348 /Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/core/defmatrix.py:1(<module>)
        1    0.302    0.302    1.338    1.338 /Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/lib/utils.py:1(<module>)
        1    0.518    0.518    1.236    1.236 /Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/urllib2.py:74(<module>)
        1    0.012    0.012    0.696    0.696 /Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/testing/__init__.py:2(<module>)
        1    0.327    0.327    0.683    0.683 /Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/testing/numpytest.py:1(<module>)
        1    0.011    0.011    0.681    0.681 /Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/compiler/__init__.py:22(<module>)
        1    0.447    0.447    0.650    0.650 /Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/httplib.py:67(<module>)
        1    0.351    0.351    0.356    0.356 /Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/compiler/transformer.py:9(<module>)
        1    0.012    0.012    0.314    0.314 /Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/compiler/pycodegen.py:1(<module>)
        1    0.181    0.181    0.300    0.300 /Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/compiler/pyassem.py:1(<module>)
        1    0.162    0.162    0.205    0.205 /Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/inspect.py:24(<module>)
        1    0.061    0.061    0.194    0.194 /Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/testing/utils.py:3(<module>)
        1    0.163    0.163    0.163    0.163 /Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/mimetools.py:1(<module>)
        1    0.131    0.131    0.163    0.163 /Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/tempfile.py:18(<module>)
        1    0.161    0.161    0.162    0.162 /Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/unittest.py:45(<module>)
        1    0.131    0.131    0.149    0.149 /Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/pydoc.py:35(<module>)
        1    0.117    0.117    0.132    0.132 /Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/difflib.py:29(<module>)
        1    0.061    0.061    0.122    0.122 /Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/_import_tools.py:2(<module>)
   

Re: [Numpy-discussion] import numpy is slow

2008-08-01 Thread Christopher Barker
Robert Kern wrote:
 File/Save As..., pick a file name. When asked about whether to embed
 source files or strip them out, choose Strip. Then email the resulting
 .mshark file to me.

I've done that, and sent it to you directly -- it's too big to put in 
the mailing list.

 It looks like your Python just takes a truly inordinate amount of time
 to execute any code. Some of the problematic modules like httplib have
 been moved to local imports, but the time it takes for your Python to
 execute the code in that module is still ridiculously large. Can you
 profile just importing httplib instead of numpy?

I've got to go catch a bus now, and I don't have a Mac at home, so this 
will have to wait 'till next Monday -- thanks for all your time on this.

-Chris






Re: [Numpy-discussion] import numpy is slow

2008-08-01 Thread David Cournapeau
On Sat, Aug 2, 2008 at 5:33 AM, Ondrej Certik [EMAIL PROTECTED] wrote:

 But I am still unhappy about it, I'd like if the package could import
 much faster, because it adds up, when you need to import 7 packages
 like that, it's suddenly 1s and that's just too much.

Too much for what? We need more information on the kind of things
people who complain about numpy startup cost are doing. I suggested
lazy import a few weeks ago when this discussion started (with the
example of bzr instead of hg), but I am less convinced that it would
be that useful, because numpy is fundamentally different from bzr/hg.
As Robert said, it would bring some complexity, and in an area where
python is already fishy.

When you import numpy, you expect some core things to be available,
and they are the ones that take the most time. In bzr/hg, you use a
*program*, and you can relatively easily change the API because not
many people use it. But numpy is essentially an API, not a tool, so we
don't have this freedom. Also, it means it is relatively easy for
bzr/hg developers to control lazy import, because they are the users,
and users of bzr/hg don't deal with python directly. If our own lazy
import has some bugs, it will impact many people who will not be able
to trace it.

The main advantage I see with lazy imports is that they prevent someone
else from breaking the speed-up work by re-importing globally a costly
package.

 But of course everything within the constraints that Robert has
 outlined. From the theoretical point of view, I don't understand why
 python cannot just import numpy (or any other package) immediately,
 and only at the moment the user actually accesses something, import
 it for real.

I guess because it would be complex to do everywhere while keeping all
the semantics of python import. Also, like everything lazy, it means
it is more complicated to follow what's happening. Your examples show
that it would be complex to do.

As I see it, there are some things in numpy we could do a bit
differently to cut import times significantly (a few tens of percent),
without changing much. Let's try that first.

 Mercurial uses a lazy import module, that does exactly
 this. Maybe that's an option?

Note that mercurial is under the GPL :)

cheers,

David


Re: [Numpy-discussion] import numpy is slow

2008-07-31 Thread Andrew Dalke
On Jul 31, 2008, at 3:53 AM, David Cournapeau wrote:
 You are supposed to run the tests on an installed numpy, not in the
 sources:

 import numpy
 numpy.test(verbose = 10)

Doesn't that make things more cumbersome to test?  That is, if I were
to make a change I would need to:
   - python setup.py build  (to put the code into the build/*
subdirectory)
   - cd to the build directory, or switch to a terminal which was
already there
   - manually do the import/test code you wrote, or write a two-line
program for it

I would rather do 'nosetests' in the source tree, if at all feasible,  
although that might only be possible for the Python source.

Hmm. And it looks like testing/nosetester.py (which implements the
'test' function above) is meant to make it easier to run nose, except
my feeling is the extra level of wrapping makes things more
complicated.  The nosetests command line appears to be more flexible,
with support for, for example, dropping into the debugger on errors,
and resetting the coverage test files.

I'm speaking out of ignorance, btw.

Cheers,


Andrew
[EMAIL PROTECTED]




Re: [Numpy-discussion] import numpy is slow

2008-07-31 Thread Hanni Ali
Hi All,

I've been reading this discussion with interest.

I would just like to highlight a use of numpy other than interactive use. We
have a cluster of machines which process tasks on an individual basis, where
a master task may spawn 600 slave tasks to be processed. These tasks are
spread across the cluster and processed as scripts in an individual Python
thread. Although reducing the process time by 300 seconds for the master
task is only about a 1.5% speedup (total time can be in excess of 24000s), we
process a large number of these tasks in any given year and every little
helps!

Hanni



2008/7/31 Stéfan van der Walt [EMAIL PROTECTED]

 2008/7/31 Andrew Dalke [EMAIL PROTECTED]:
  The user base for numpy might be .. 10,000 people?  100,000 people?
  Let's go with the latter, and assume that with command-line scripts,
  CGI scripts, and the other programs that people write in order to
  help do research means that numpy is started on average 10 times a day.
 
  100,000 people * 10 times / day * 0.1 seconds per startup
 = almost 28 people-hours spent each day waiting for numpy to start.

 I don't buy that argument.  No single person is agile enough to do
 anything useful in the half a second or so it takes to start up NumPy.
  No one is *waiting* for NumPy to start.  Just by answering this
 e-mail I could have (and maybe should have) started NumPy three
 hundred and sixty times.

 I don't want to argue about this, though.  Write the patches, file a
 ticket, and hopefully someone will deem them important enough to apply
 them.

 Stéfan


Re: [Numpy-discussion] import numpy is slow

2008-07-31 Thread Nathan Bell
On Thu, Jul 31, 2008 at 7:31 AM, Hanni Ali [EMAIL PROTECTED] wrote:

 I would just like to highlight an alternate use of numpy to interactive use. We
 have a cluster of machines which process tasks on an individual basis, where
 a master task may spawn 600 slave tasks to be processed. These tasks are
 spread across the cluster and processed as scripts in an individual python
 thread. Reducing the process time by 300 seconds for the master task is
 only about a 1.5% speedup (total time can be in excess of 24000s), but we
 process a large number of these tasks in any given year and every little
 helps!


There are other components of NumPy/SciPy that are more worthy of
optimization.  Given that programmer time is a scarce resource, it's
more sensible to direct our efforts towards making the other 98.5% of
the computation faster.

/law of diminishing returns

-- 
Nathan Bell [EMAIL PROTECTED]
http://graphics.cs.uiuc.edu/~wnbell/


Re: [Numpy-discussion] import numpy is slow

2008-07-31 Thread David Cournapeau
Nathan Bell wrote:

 There are other components of NumPy/SciPy that are more worthy of
 optimization.  Given that programmer time is a scarce resource, it's
 more sensible to direct our efforts towards making the other 98.5% of
 the computation faster.
   

To be fair, when I took a look at the problem last month, it took a few
of us (Robert and me, IIRC) at most 2 man-hours altogether to halve
numpy import times on Linux, without altering the API at all. Maybe
there are more things which can be done to get to a more 'flat' profile.

cheers,

David


Re: [Numpy-discussion] import numpy is slow

2008-07-31 Thread Nathan Bell
On Thu, Jul 31, 2008 at 5:36 AM, Andrew Dalke [EMAIL PROTECTED] wrote:

 The user base for numpy might be .. 10,000 people?  100,000 people?
 Let's go with the latter, and assume that with command-line scripts,
 CGI scripts, and the other programs that people write in order to
 help do research means that numpy is started on average 10 times a day.

 100,000 people * 10 times / day * 0.1 seconds per startup
= almost 28 people-hours spent each day waiting for numpy to start.

 I'm willing to spend a few days to achieve that.


 Perhaps there's fewer people than I'm estimating.  OTOH, perhaps
 there are more imports of numpy per day.  An order of magnitude less
 time is still a couple of hours each day as the world waits to import
 all of the numpy libraries.

 If on average people import numpy 10 times a day and it could be made
 0.1 seconds faster then that's 1 second per person per day.  If it
 takes on average 5 minutes to learn to import the module directly and
 the onus is all on numpy, then after 1 year of use the efficiency has
 made up for it, and the benefits continue to grow.
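
For what it's worth, the arithmetic in the estimate above can be checked with a few lines of Python. This is only a sketch: the user count, imports per day and saving per import are the guesses stated in the message, not measurements, and the function names are made up for illustration.

```python
# Back-of-the-envelope check of the estimate above. All inputs are the
# assumptions from the message (guesses, not measured values).
USERS = 100000          # assumed number of numpy users
STARTS_PER_DAY = 10     # assumed numpy imports per person per day
SAVING_S = 0.1          # assumed seconds saved per import


def hours_saved_per_day(users=USERS, starts=STARTS_PER_DAY, saving_s=SAVING_S):
    """Total time saved across all users, in hours per day."""
    return users * starts * saving_s / 3600.0


def days_to_break_even(learning_cost_s=5 * 60, starts=STARTS_PER_DAY,
                       saving_s=SAVING_S):
    """Days until one person's daily saving repays a one-off learning cost."""
    return learning_cost_s / (starts * saving_s)
```

With these inputs the first function gives just under 28 hours per day, and the second gives about 300 days, which is roughly the "after 1 year" figure quoted above.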


Just think of the savings that could be achieved if all 2.1 million
Walmart employees were outfitted with colostomy bags.

   0.5 hours / day for bathroom breaks * 2,100,000 employees * 365
days/year * $7/hour = $2,682,750,000/year

Granted, I'm probably not the first to run these numbers.

-- 
Nathan Bell [EMAIL PROTECTED]
http://graphics.cs.uiuc.edu/~wnbell/


Re: [Numpy-discussion] import numpy is slow

2008-07-31 Thread Gael Varoquaux
On Thu, Jul 31, 2008 at 03:41:15PM +0900, David Cournapeau wrote:
 Yes. Nothing that an easy make file cannot solve, nonetheless (I am sure
 I am not the only one with a makefile/script which automates the above,
 to test a new svn updated numpy in one command).

That's why distutils has a test target. You can do python setup.py
test, and if you have set up your setup.py properly it should work
(obviously it is easy to make this statement, and harder to get the thing
working).

Gaël


Re: [Numpy-discussion] import numpy is slow

2008-07-31 Thread Gael Varoquaux
On Thu, Jul 31, 2008 at 12:43:17PM +0200, Andrew Dalke wrote:
 Startup performance has not been a numpy concern.  It is a concern for  
 me, and it has been (for other packages) a concern for some of my  
 clients.

I am curious: if startup performance is a problem, I guess it is because
you are running lots of little scripts where startup time is big compared
to run time. Did you think of forking them from an already started
process? I had this same problem (with libraries way slower than numpy to
load) and used os.fork with great success.
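
A minimal sketch of the forking idea, assuming a Unix system (os.fork is not available on Windows). The helper name run_in_child is made up for illustration; the point is only that the child inherits every module the parent has already imported, so the import cost is paid once.

```python
import os
import sys
import traceback


def run_in_child(task, *args):
    """Fork the already warmed-up process, run task(*args) in the child,
    and return the child's exit code to the parent (0 means success)."""
    # Flush first so buffered output is not written twice after the fork.
    sys.stdout.flush()
    sys.stderr.flush()
    pid = os.fork()
    if pid == 0:
        # Child: all modules imported by the parent are already loaded here.
        status = 1
        try:
            task(*args)
            status = 0
        except Exception:
            traceback.print_exc()
        finally:
            sys.stdout.flush()
            sys.stderr.flush()
            os._exit(status)  # leave without running the parent's cleanup
    # Parent: wait for the child and report how it ended.
    _, wait_status = os.waitpid(pid, 0)
    if os.WIFEXITED(wait_status):
        return os.WEXITSTATUS(wait_status)
    return -1
```

The parent would do its slow imports once at startup and then call run_in_child for each small job, instead of launching a new interpreter per job.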

Gaël


Re: [Numpy-discussion] import numpy is slow

2008-07-31 Thread Alan McIntyre
On Thu, Jul 31, 2008 at 2:12 AM, Andrew Dalke [EMAIL PROTECTED] wrote:
 Hmm. And it looks like testing/nosetester.py (which implements the
 'test' function above) is meant to make it easier to run nose, except
 my feeling is the extra level of wrapping makes things more
 complicated.  The nosetest command-line appears to be more flexible,
 with support for, for example, dropping into the debugger on errors,
 and resetting the coverage test files.

You can actually pass those sorts of options to nose through the
extra_argv parameter in test(). That might be a little cumbersome, but
(as far as I know) it's something I'm going to do so infrequently it's
not a big deal.


Re: [Numpy-discussion] import numpy is slow

2008-07-31 Thread David Cournapeau
Gael Varoquaux wrote:

 That's why distutils have a test target. You can do python setup.py
 test, and if you have set up your setup.py properly it should work
 (obviously it is easy to make this statement, and harder to get the thing
 working).
   

I have already seen some discussion about distutils like this, if you
mean something like this:

http://blog.ianbicking.org/pythons-makefile.html

but I would take rake and make over this anytime. I just don't
understand why something like rake does not exist in python, but well,
let's not go there.

David


Re: [Numpy-discussion] import numpy is slow

2008-07-31 Thread Gael Varoquaux
On Thu, Jul 31, 2008 at 11:05:33PM +0900, David Cournapeau wrote:
 Gael Varoquaux wrote:

  That's why distutils have a test target. You can do python setup.py
  test, and if you have set up your setup.py properly it should work
  (obviously it is easy to make this statement, and harder to get the thing
  working).


 I have already seen some discussion about distutils like this, if you
 mean something like this:

 http://blog.ianbicking.org/pythons-makefile.html

 but I would take rake and make over this anytime. I just don't
 understand why something like rake does not exist in python, but well,
 let's not go there.

Well, actually, in the Enthought tool suite we use setuptools for
packaging (I don't want to start a controversy, I am not advocating the
use of setuptools, just stating a fact) and nose for testing, and getting
setup.py test to work, including doing the build, running the tests and
downloading nose if it is not there, is a matter of adding these two
settings to the setup.py:

tests_require = [
    'nose >= 0.10.3',
    ],
test_suite = 'nose.collector',

Obviously, the build part has to be well-tuned for the machinery to work,
but there is a lot of value here.

Gaël


Re: [Numpy-discussion] import numpy is slow

2008-07-31 Thread David Cournapeau
Gael Varoquaux wrote:
 Obviously, the build part has to be well-tuned for the machinery to work,
 but there is a lot of value here.
   

Ah yes, setuptools does have this. But this is specific to setuptools,
bare distutils does not have this test command, right ?

cheers,

David


Re: [Numpy-discussion] import numpy is slow

2008-07-31 Thread Kevin Jacobs [EMAIL PROTECTED]
On Thu, Jul 31, 2008 at 10:14 AM, Gael Varoquaux 
[EMAIL PROTECTED] wrote:

 On Thu, Jul 31, 2008 at 12:43:17PM +0200, Andrew Dalke wrote:
  Startup performance has not been a numpy concern.  It is a concern for
  me, and it has been (for other packages) a concern for some of my
  clients.

 I am curious: if startup performance is a problem, I guess it is because
 you are running lots of little scripts where startup time is big compared
 to run time. Did you think of forking them from an already started
 process? I had this same problem (with libraries way slower than numpy to
 load) and used os.fork with great success.


Start up time is an issue for me, but in a larger sense than just numpy.  I
do run many scripts, some that are ephemeral and some that take significant
amounts of time.  However, numpy is just one of many many libraries that I
must import, so improvements, even minor ones, are appreciated.

The moral of this discussion, for me, is that just because _you_ don't care
about a particular aspect or feature, doesn't mean that others don't or
shouldn't.  Your workarounds may not be viable for me and vice-versa.  So
let's just go with the spirit of open source and encourage those motivated
to contribute to do so, provided their suggestions are sensible and do not
break code.

-Kevin


Re: [Numpy-discussion] import numpy is slow

2008-07-31 Thread Gael Varoquaux
On Thu, Jul 31, 2008 at 11:16:12PM +0900, David Cournapeau wrote:
 Gael Varoquaux wrote:
  Obviously, the build part has to be well-tuned for the machinery to work,
  but there is a lot of value here.


 Ah yes, setuptools does have this. But this is specific to setuptools,
 bare distutils does not have this test command, right ?

Dunno, sorry. The scale of my ignorance of distutils and related subjects
would probably impress you :).

Gaël, looking forward to your tutorial on scons.


Re: [Numpy-discussion] import numpy is slow

2008-07-31 Thread Gael Varoquaux
On Thu, Jul 31, 2008 at 10:34:04AM -0400, Kevin Jacobs [EMAIL PROTECTED] 
wrote:
The moral of this discussion, for me, is that just because _you_ don't
care about a particular aspect or feature, doesn't mean that others don't
or shouldn't.  Your workarounds may not be viable for me and vice-versa.
So let's just go with the spirit of open source and encourage those
motivated to contribute to do so, provided their suggestions are sensible
and do not break code.

I fully agree here. And if people improve numpy's startup time without
breaking or obfuscating stuff, I am very happy. I was just trying to help
:).

Yes, the value of open source is that different people improve the same
tools to meet different goals, thus we should always keep an open ear to
other people's requirements, especially if they come up with high-quality
code.

Gaël


Re: [Numpy-discussion] import numpy is slow

2008-07-31 Thread Christopher Barker
Andrew Dalke wrote:
 If I had my way, remove things like (in numpy/__init__.py)
 
  import linalg
  import fft
  import random
  import ctypeslib
  import ma

as a side benefit, this might help folks using py2exe, py2app and 
friends -- as it stands all those sub-modules need to be included in 
your app bundle regardless of whether they are used.

I recall having to explicitly add them by hand, too, though that may 
have been a matplotlib.numerix issue.

 but leave the list of submodules in __all__ so that from numpy  
 import * works.

Of course, no one should be doing that anyway ;-)

And for what it's worth, I've found myself very frustrated by how long 
it takes to start up python and import numpy. I often do whip out the 
interpreter to do something fast, and I didn't use to have to wait for it.

On my OS-X box (10.4.11, python2.5, numpy '1.1.1rc2'), it takes about 7 
seconds to import numpy!

-Chris


-- 
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/ORR(206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115   (206) 526-6317   main reception

[EMAIL PROTECTED]


Re: [Numpy-discussion] import numpy is slow

2008-07-31 Thread David Cournapeau
Christopher Barker wrote:
 On my OS-X box (10.4.11, python2.5, numpy '1.1.1rc2'), it takes about 7 
 seconds to import numpy!

   

Hot or cold ? If hot, there is something horribly wrong with your setup.
On my macbook, it takes ~ 180 ms to do python -c "import numpy", and ~
100 ms on linux (same machine).

cheers,

David


Re: [Numpy-discussion] import numpy is slow

2008-07-31 Thread Christopher Barker
Stéfan van der Walt wrote:
  No one is *waiting* for NumPy to start. 

I am, and probably 10 times a day, yes.

And it's a major issue for CGI, though maybe no one's using that anymore 
anyway.

  Just by answering this
 e-mail I could have (and maybe should have) started NumPy three
 hundred and sixty times.

sure, but I like wasting my time on mailing lists

-Chris



-- 
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/ORR(206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115   (206) 526-6317   main reception

[EMAIL PROTECTED]



Re: [Numpy-discussion] import numpy is slow

2008-07-31 Thread Christopher Barker
David Cournapeau wrote:
 Christopher Barker wrote:
 On my OS-X box (10.4.11, python2.5, numpy '1.1.1rc2'), it takes about 7 
 seconds to import numpy!
 
 Hot or cold ? If hot, there is something horribly wrong with your setup.

hot -- it takes about 10 cold.

I've been wondering about that.

time python -c "import numpy"

real    0m8.383s
user    0m0.320s
sys     0m7.805s

and similar results if run multiple times in a row.

Any idea what could be wrong? I have no clue where to start, though I 
suppose a complete clean out and re-install of python comes to mind.

oh, and this is a dual G5 PPC (which should have a faster disk than your 
Macbook)


-Chris


-- 
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/ORR(206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115   (206) 526-6317   main reception

[EMAIL PROTECTED]


Re: [Numpy-discussion] import numpy is slow

2008-07-31 Thread Nils Wagner
On Thu, 31 Jul 2008 10:12:22 -0700
  Christopher Barker [EMAIL PROTECTED] wrote:
 David Cournapeau wrote:
 Christopher Barker wrote:
 On my OS-X box (10.4.11, python2.5, numpy '1.1.1rc2'), it takes about 7
 seconds to import numpy!
 
 Hot or cold ? If hot, there is something horribly wrong with your setup.
 
 hot -- it takes about 10 cold.
 
 I've been wondering about that.
 
 time python -c "import numpy"
 
 real    0m8.383s
 user    0m0.320s
 sys     0m7.805s
 
 and similar results if run multiple times in a row.
 
 Any idea what could be wrong? I have no clue where to start, though I
 suppose a complete clean out and re-install of python comes to mind.
 
 oh, and this is a dual G5 PPC (which should have a faster disk than your
 Macbook)
 
 
 -Chris
 
  
No idea, but for comparison
  time /usr/bin/python -c "import numpy"

real    0m0.295s
user    0m0.236s
sys     0m0.050s
[EMAIL PROTECTED]:~/svn/matplotlib cat /proc/cpuinfo
processor       : 0
vendor_id       : AuthenticAMD
cpu family      : 6
model           : 10
model name      : mobile AMD Athlon (tm) 2500+
stepping        : 0
cpu MHz         : 662.592
cache size      : 512 KB
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 1
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 sep mtrr pge mca cmov pat pse36 mmx fxsr sse pni syscall mp mmxext 3dnowext 3dnow
bogomips        : 1316.57

Nils



Re: [Numpy-discussion] import numpy is slow

2008-07-31 Thread David Huard
On Thu, Jul 31, 2008 at 1:12 PM, Christopher Barker
[EMAIL PROTECTED] wrote:

 David Cournapeau wrote:
  Christopher Barker wrote:
  On my OS-X box (10.4.11, python2.5, numpy '1.1.1rc2'), it takes about 7
  seconds to import numpy!
 
  Hot or cold ? If hot, there is something horribly wrong with your setup.

 hot -- it takes about 10 cold.

 I've been wondering about that.

 time python -c "import numpy"

 real    0m8.383s
 user    0m0.320s
 sys     0m7.805s

 and similar results if run multiple times in a row.

 Any idea what could be wrong? I have no clue where to start, though I
 suppose a complete clean out and re-install of python comes to mind.


Is only 'import numpy' slow, or do other packages import slowly too?
Are there remote directories in your PYTHONPATH?
Do you have old eggs in the site-packages directory that point to remote
directories (installed with setuptools develop)?
Try cleaning the site-packages directory. That did the trick for me once.

David


 oh, and this is a dual G5 PPC (which should have a faster disk than your
 Macbook)


 -Chris


 --
 Christopher Barker, Ph.D.
 Oceanographer

 Emergency Response Division
 NOAA/NOS/ORR(206) 526-6959   voice
 7600 Sand Point Way NE   (206) 526-6329   fax
 Seattle, WA  98115   (206) 526-6317   main reception

 [EMAIL PROTECTED]


Re: [Numpy-discussion] import numpy is slow

2008-07-31 Thread David Cournapeau

 hot -- it takes about 10 cold.

 I've been wondering about that.

 time python -c "import numpy"

 real    0m8.383s
 user    0m0.320s
 sys     0m7.805s

 and similar results if run multiple times in a row.

What does python -c "import sys; print sys.path" say ?

 Any idea what could be wrong? I have no clue where to start, though I
 suppose a complete clean out and re-install of python comes to mind.

 oh, and this is a dual G5 PPC (which should have a faster disk than your
 Macbook)

disk should not matter. If hot, everything should be in the IO buffer, and
opening a file is of the order of a few microseconds (that's
certainly the order on Linux;  the VM on Mac OS X is likely not as
good, but still).
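
To separate hot from cold numbers, something along these lines can time several fresh interpreters in a row. It is a rough sketch for a current Python 3; the function name time_import is made up, and the first run is only "cold" if nothing has touched the files recently.

```python
import subprocess
import sys
import time


def time_import(module, runs=5, python=sys.executable):
    """Return the wall-clock seconds taken by each of `runs` fresh
    interpreters executing `import <module>`."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        # check=True raises if the import fails in the child interpreter.
        subprocess.run([python, "-c", "import " + module], check=True)
        timings.append(time.perf_counter() - start)
    return timings

# Example: results = time_import("numpy")
# results[0] may include cold-cache effects; min(results[1:]) is the hot time.
```

Note that each figure includes interpreter startup, so timing an empty module the same way gives the baseline to subtract.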

cheers,

David


Re: [Numpy-discussion] import numpy is slow

2008-07-31 Thread Gael Varoquaux
On Thu, Jul 31, 2008 at 10:12:22AM -0700, Christopher Barker wrote:
 I've been wondering about that.

 time python -c "import numpy"

 real    0m8.383s
 user    0m0.320s
 sys     0m7.805s

I don't know what is wrong, but this is plain wrong, unless you are on a
distant file system, or something unusual.

On the box I am currently on, I get:

python -c "import numpy"  0.10s user 0.03s system 101% cpu 0.122 total

And this matches my overall experience.

Gaël


Re: [Numpy-discussion] import numpy is slow

2008-07-31 Thread Scott Ransom
On Thu, Jul 31, 2008 at 07:46:20AM -0500, Nathan Bell wrote:
 On Thu, Jul 31, 2008 at 7:31 AM, Hanni Ali [EMAIL PROTECTED] wrote:
 
  I would just like to highlight an alternate use of numpy to interactive use. We
  have a cluster of machines which process tasks on an individual basis, where
  a master task may spawn 600 slave tasks to be processed. These tasks are
  spread across the cluster and processed as scripts in an individual python
  thread. Reducing the process time by 300 seconds for the master task is
  only about a 1.5% speedup (total time can be in excess of 24000s), but we
  process a large number of these tasks in any given year and every little
  helps!
 
 
 There are other components of NumPy/SciPy that are more worthy of
 optimization.  Given that programmer time is a scarce resource, it's
 more sensible to direct our efforts towards making the other 98.5% of
 the computation faster.

This is true in general, but I have a different use case for one
of my programs that uses numpy on a cluster.  Basically, the
program gets called thousands of times per day and the runtime for
each is only a second or two.  In this case I am much more
dominated by numpy's import time.

Scott

PS: Yes, I could change the way that the routine works so that it
is called many fewer times, however, that would be very difficult
(although not impossible).  A free speedup due to faster numpy
import would be very nice.


-- 
Scott M. RansomAddress:  NRAO
Phone:  (434) 296-0320   520 Edgemont Rd.
email:  [EMAIL PROTECTED] Charlottesville, VA 22903 USA
GPG Fingerprint: 06A9 9553 78BE 16DB 407B  FFCA 9BFA B6FF FFD3 2989


Re: [Numpy-discussion] import numpy is slow

2008-07-31 Thread Robert Kern
On Thu, Jul 31, 2008 at 05:43, Andrew Dalke [EMAIL PROTECTED] wrote:
 On Jul 31, 2008, at 12:03 PM, Robert Kern wrote:

 But you still can't remove them since they are being used inside
 numerictypes. That's why I labeled them internal utility functions
 instead of leaving them with minimal docstrings such that you would
 have to guess.

 My proposal is to replace that code with a table mapping
 the type name to the uppercase/lowercase/capitalized forms,
 thus eliminating the (small) amount of time needed to
 import string.

 It makes adding new types slightly more difficult.

 I know it's a tradeoff.

Probably not a bad one. Write up the patch, and then we'll see how
much it affects the import time.
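
To make the proposal concrete, a table-based version might look roughly like the sketch below. The names _CASE_FORMS and case_forms are hypothetical, and the rows are only a small illustrative subset, not the actual list of names in numerictypes.

```python
# Hypothetical lookup table: type name -> (lower, upper, capitalized).
# Only a few illustrative rows; a real table would list every type name.
_CASE_FORMS = {
    "bool":    ("bool", "BOOL", "Bool"),
    "int":     ("int", "INT", "Int"),
    "uint":    ("uint", "UINT", "Uint"),
    "float":   ("float", "FLOAT", "Float"),
    "complex": ("complex", "COMPLEX", "Complex"),
}


def case_forms(name):
    """Return the (lower, upper, capitalized) forms recorded for a type
    name, without importing the string module."""
    try:
        return _CASE_FORMS[name]
    except KeyError:
        # This is the tradeoff mentioned above: new types need a new row.
        raise KeyError("no case forms recorded for %r; add a row to "
                       "_CASE_FORMS" % (name,))
```

The explicit error for a missing row is one way to make the "adding new types is slightly more difficult" cost visible right away.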

I would much rather that we discuss concrete changes like this rather
than rehash the justifications of old decisions. Regardless of the
merits about the old decisions (and I agreed with your position at the
time), it's a pointless and irrelevant conversation. The decisions
were made, and now we have a user base to whom we have promised not to
break their code so egregiously again. The relevant conversation is
what changes we can make now.

Some general guidelines:

1) Everything exposed by from numpy import * still needs to work.
  a) The layout of everything under numpy.core is an implementation detail.
  b) _underscored functions and explicitly labeled internal functions
can probably be modified.
  c) Ask about specific functions when in doubt.

2) The improvement in import times should be substantial. Feel free to
bundle up the optimizations for consideration.

3) Moving imports from module-level down into the functions where they
are used is generally okay if we get a reasonable win from it. The
local imports should be commented, explaining that they are made local
in order to improve the import times.

4) __import__ hacks are off the table.

5) Proxy objects ... I would really like to avoid proxy objects. They
have caused fragility in the past.

6) I'm not a fan of having environment variables control the way numpy
gets imported, but I'm willing to consider it. For example, I might go
for having proxy objects for linalg et al. *only* if a particular
environment variable were set. But there had better be a very large
improvement in import times.
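
As an illustration of guideline 3, a function-local import with the requested comment might look like this. The function save_to_temp is invented for the example and is not numpy code.

```python
def save_to_temp(data):
    """Write bytes to a new temporary file and return its path."""
    # Imported locally on purpose: tempfile pulls in several other modules,
    # and only callers of this function need it. Keeping it out of module
    # level improves the import time of the enclosing package.
    import tempfile

    with tempfile.NamedTemporaryFile(delete=False) as handle:
        handle.write(data)
        return handle.name
```

After the first call the module is cached in sys.modules, so repeated calls pay only a dictionary lookup for the local import.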

-- 
Robert Kern

I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth.
 -- Umberto Eco


Re: [Numpy-discussion] import numpy is slow

2008-07-31 Thread David Cournapeau
On Fri, Aug 1, 2008 at 5:02 AM, Robert Kern [EMAIL PROTECTED] wrote:


 5) Proxy objects ... I would really like to avoid proxy objects. They
 have caused fragility in the past.

One recurrent problem with import time optimization is that it takes
some work to improve it, but only one line to destroy it all. For
example, the inspect import came back, and this alone is ~10-15 % of my
import time on mac os x (from ~ 180 ms to ~ 160 ms).

This would be the main advantage of lazy import; but is it really
worth the trouble, since it brings some complexity, as you mentioned
last time we had this discussion ? Maybe a simple test script to check
for known costly imports would be enough (run from time to time ?).
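
Such a check could be as small as the sketch below, written for a current Python 3. The watch list and the function name costly_imports are assumptions for illustration; the idea is to import the package in a fresh interpreter and report which watched modules were pulled in as a side effect.

```python
import subprocess
import sys

# Hypothetical watch list; inspect and ctypes are the hotspots named in
# this thread.
COSTLY = ("inspect", "ctypes")


def costly_imports(package, watched=COSTLY, python=sys.executable):
    """Import `package` in a fresh interpreter and return the watched
    modules that were newly loaded as a result."""
    code = (
        "import sys\n"
        "before = set(sys.modules)\n"
        "import {pkg}\n"
        "print(' '.join(m for m in {watched!r}"
        " if m in sys.modules and m not in before))"
    ).format(pkg=package, watched=tuple(watched))
    result = subprocess.run([python, "-c", code], check=True,
                            capture_output=True, text=True)
    return result.stdout.split()

# Example: a regression test could assert costly_imports("numpy") == []
```

Run from the test suite or a buildbot, this would flag the one-line change that brings a costly import back.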

Maybe ctypes can be loaded on the fly, too. Those are the two
obvious hotspots (~ 25 % altogether) with a recent SVN checkout.

 6) I'm not a fan of having environment variables control the way numpy
 gets imported, but I'm willing to consider it. For example, I might go
 for having proxy objects for linalg et al. *only* if a particular
 environment variable were set. But there had better be a very large
 improvement in import times.

linalg does not seem to have a huge impact. It is typically much
faster to load than ctypeslib or inspect.

cheers,

David


Re: [Numpy-discussion] import numpy is slow

2008-07-30 Thread Andrew Dalke
On Jul 4, 2008, at 2:22 PM, Andrew Dalke wrote:
 [josiah:numpy/build/lib.macosx-10.3-fat-2.5] dalke% time python -c  
 'pass'
 0.015u 0.042s 0:00.06 83.3% 0+0k 0+0io 0pf+0w
 [josiah:numpy/build/lib.macosx-10.3-fat-2.5] dalke% time python -c  
 'import numpy'
 0.084u 0.231s 0:00.33 93.9% 0+0k 0+8io 0pf+0w
 [josiah:numpy/build/lib.macosx-10.3-fat-2.5] dalke%

 For one of my clients I wrote a tool to analyze import times.  I  
 don't have it, but here's something similar I just now whipped up:

Based on those results I've been digging into the code trying to  
figure out why numpy imports so many files, and at the same time I've  
been trying to guess at the use case Robert Kern regards as typical  
when he wrote:

 Your use case isn't so typical and so suffers on the import
 time end of the balance

and trying to figure out what code would break if those modules  
weren't all eagerly imported and were instead written as most other  
Python modules are written.


I have two thoughts for why mega-importing might be useful:

   - interactive users get to do tab complete and see everything
(eg, import numpy means numpy.fft.ifft works, without
 having to do import numpy.fft manually)

   - class inspectors don't need to do directory checks to find  
possible modules
(This is a stretch, since every general purpose inspector I  
know of
 has to know how to frob the directories to find directories.)

Are these the reasons numpy imports everything or are there other  
reasons?

The first guess comes from the comment in numpy/__init__.py

  The following sub-packages must be explicitly imported:

meaning, I take it, that the other modules (core, lib, random,  
linalg, fft, testing)
do not need to be explicitly imported.

Is the numpy recommendation that people should do:

   import numpy
   numpy.fft.ifft(data)

?  If so, the documentation should be updated to say that random,  
ma, ctypeslib and several other libraries are included in that  
list.  Why is the last so important that it should be in the top- 
level namespace?

In my opinion, this assistance is counter to standard practice in  
effectively every other Python package.  I don't see the benefit.





You may ask if there are possible improvements.  There's no obvious  
place taking up a bunch of time but there are plenty of small places  
which add up.

For examples:

1) I wondered why 'cPickle' needed to be imported.  One of the places  
it's used is numpy.lib.format which is only imported by  
numpy.lib.io.  It's easy to defer the 'import format' to be inside  
the functions which need it.  Note that io.py already defers the  
import of zipfile, so function-local imports are not inappropriate.

'io' imports 'tempfile', needing 0.016 seconds.  This can be a  
deferred cost only incurred by those who use io.savez, which already  
has some function-local imports.  The reason for the high import  
costs?  Here's what tempfile itself imports.

tempfile: 0.016 (io)
 errno: 0.000 (tempfile)
 random: 0.010 (tempfile)
  binascii: 0.003 (random)
  _random: 0.003 (random)
 fcntl: 0.003 (tempfile)
 thread: 0.000 (tempfile)

(This is read as 'tempfile' is imported by 'io' and takes 0.016  
seconds total, including all children, and the directly imported  
children of 'tempfile' are 'errno', 'random', 'fcntl' and 'thread'.   
'random' imports 'binascii' and '_random'.)
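
For readers who want to produce a similar listing, here is a rough sketch that wraps the built-in __import__ hook. It is written for a current Python 3 and is not the tool mentioned above; the name trace_imports is made up. Relative imports and imports done from C code are not timed separately, so their cost shows up under the parent.

```python
import builtins
import sys
import time


def trace_imports(statement):
    """Execute an import statement and return a list of
    (module, imported_by, seconds) records for first-time imports.
    Each time includes the module's own children."""
    records = []
    stack = ["<top>"]
    original = builtins.__import__

    def timed_import(name, globals=None, locals=None, fromlist=(), level=0):
        if level != 0 or name in sys.modules:
            # Relative or already-cached import: nothing new to time.
            return original(name, globals, locals, fromlist, level)
        parent = stack[-1]
        stack.append(name)
        start = time.perf_counter()
        try:
            return original(name, globals, locals, fromlist, level)
        finally:
            elapsed = time.perf_counter() - start
            stack.pop()
            records.append((name, parent, elapsed))

    builtins.__import__ = timed_import
    try:
        exec(statement, {})
    finally:
        builtins.__import__ = original  # always restore the real hook
    return records
```

Calling trace_imports("import tempfile") in a fresh interpreter gives the raw data for a tree like the one above. Python 3.7 and later also offer a built-in alternative, python -X importtime -c "import numpy".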


BTW, the load and save commands in io do an incorrect check.

 if isinstance(file, type("")):
     fid = _file(file, "rb")
 else:
     fid = file

Filenames can be unicode strings.  This test should either be
   isinstance(file, basestring)
or
   not hasattr(file, 'read')
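
A sketch of the second variant, adapted to a current Python 3 (where basestring no longer exists and str already covers unicode). The helper name open_for_reading is invented for the example and is not the numpy function.

```python
import os


def open_for_reading(file):
    """Return a binary file object for a path or an already-open file."""
    if hasattr(file, "read"):
        # Already file-like: use it as-is.
        return file
    if isinstance(file, (str, bytes, os.PathLike)):
        # Any kind of path, whether text, bytes or a path object.
        return open(file, "rb")
    raise TypeError("expected a path or a file-like object, got %r"
                    % (type(file),))
```

Checking for the read attribute first means any file-like object is accepted, which is the behaviour the duck-typed test above is after.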



2) What's the point of add_newdocs?  According to the top of the  
module

# This is only meant to add docs to objects defined in C- 
extension modules.
# The purpose is to allow easier editing of the docstrings without
# requiring a re-compile.

which implies this aids development, but not deployment.  The import  
takes a minuscule 0.006 seconds of the 0.225 (import lib and its  
subimports takes 0.141 seconds) but seems to add no direct end-user  
benefit.  Shouldn't this documentation be pushed into the C code at  
least for each release?

3) I see that numpy/core/numerictypes.py imports 'string', which  
takes 0.008 seconds.  I wondered why.  It's part of english_lower,  
english_upper, and english_capitalize, which are functions  
defined in that module.  The implementation can't be improved, and  
using string.translate is the right approach.

However,
3a) these functions have no leading underscore and have
docstrings, which implies that they are part of the public API (although
they are not included in __all__).  Are they meant for general use?
Note that english_capitalize is over-engineered for the use-case in
that file.  There are no empty type names, so the test 'if s' is
never false.

3b) there are only 33 

Re: [Numpy-discussion] import numpy is slow

2008-07-30 Thread Stéfan van der Walt
2008/7/30 Andrew Dalke [EMAIL PROTECTED]:
 Based on those results I've been digging into the code trying to
 figure out why numpy imports so many files, and at the same time I've
 been trying to guess at the use case Robert Kern regards as typical
 when he wrote:

 Your use case isn't so typical and so suffers on the import
 time end of the balance

I.e. most people don't start up NumPy all the time -- they import
NumPy, and then do some calculations, which typically take longer than
the import time.

 and trying to figure out what code would break if those modules
 weren't all eagerly imported and were instead written as most other
 Python modules are written.

For a benefit of 0.03s, I don't think it's worth it.

 I have two thoughts for why mega-importing might be useful:

   - interactive users get to do tab complete and see everything
(eg, import numpy means numpy.fft.ifft works, without
 having to do import numpy.fft manually)

Numpy has a very flat namespace, for better or worse, which implies
many imports.  This can't be easily changed without modifying the API.

 Is the numpy recommendation that people should do:

   import numpy
   numpy.fft.ifft(data)

That's the way many people use it.

 ?  If so, the documentation should be updated to say that random,
 ma, ctypeslib and several other libraries are included in that
 list.

Thanks for pointing that out, I'll edit the documentation wiki.

 Why is the last so important that it should be in the top-
 level namespace?

It's a single Python file -- does it make much of a difference?

 In my opinion, this assistance is counter to standard practice in
 effectively every other Python package.  I don't see the benefit.

How do you propose we change this?

 BTW, the load and save commands in io do an incorrect check.

 if isinstance(file, type("")):
     fid = _file(file, "rb")
 else:
     fid = file

Thanks, fixed.
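
A minimal sketch of the duck-typed check discussed above, written for present-day Python 3 (where 'basestring' no longer exists). The helper name open_for_reading is hypothetical and not part of numpy; it only illustrates testing for a 'read' attribute instead of testing the type of the filename.

```python
import io
import os
import tempfile


def open_for_reading(file):
    """Return (file_object, owned) for a filename or an open file-like object.

    Hypothetical helper: 'owned' is True when this function opened the file,
    so the caller is responsible for closing it.
    """
    if hasattr(file, "read"):
        # Already file-like; use it as-is.
        return file, False
    # Anything else is treated as a path (str, bytes, or os.PathLike).
    return open(file, "rb"), True


# Usage: an in-memory stream passes through untouched.
stream = io.BytesIO(b"abc")
fid, owned = open_for_reading(stream)

# Usage: a filename is opened in binary mode.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"xyz")
path_fid, path_owned = open_for_reading(tmp.name)
data = path_fid.read()
path_fid.close()
os.remove(tmp.name)
```

Checking for 'read' accepts any kind of path string without enumerating string types, which is the point of the suggestion.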

[snip lots of suggestions]

 Getting rid of these functions, and thus getting rid of the import
 speeds numpy startup time by 3.5%.

While I appreciate you taking the time to find these niggles, we
are short on developer time as it is.  Asking them to spend their
precious time on making a 3.5% improvement in startup time does not
make much sense.  If you provide a patch, on the other hand, it would
only take a matter of seconds to decide whether to apply or not.
You've already done most of the sleuth work.

 I could probably get another 0.05 seconds if I dug around more, but I
 can't without knowing what use case numpy is trying to achieve.  Why
 are all those ancillary modules (testing, ctypeslib) eagerly loaded
 when there seems no need for that feature?

Need is relative.  You need fast startup time, but most of our users
need quick access to whichever functions they want (and often use from
an interactive terminal).  I agree that testing and ctypeslib do
not belong in that category, but they don't seem to do much harm
either.

Regards
Stéfan
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] import numpy is slow

2008-07-30 Thread Andrew Dalke
On Jul 30, 2008, at 10:59 PM, Stéfan van der Walt wrote:
 I.e. most people don't start up NumPy all the time -- they import
 NumPy, and then do some calculations, which typically take longer than
 the import time.

Is that interactively, or is that through programs?

 For a benefit of 0.03s, I don't think it's worth it.

The final number with all the hundredths of a second added up to 0.08  
seconds, which was about 30% of the 'import numpy' cost.

 Numpy has a very flat namespace, for better or worse, which implies
 many imports.


I don't get the feeling that numpy is flat.  Python's stdlib is flat.  
Numpy has many 2- and 3-level modules.

 Is the numpy recommendation that people should do:

   import numpy
   numpy.fft.ifft(data)

 That's the way many people use it.

The normal Python way is:

  from numpy import fft
   fft.ifft(data)

because in most packages, parent modules don't import all of their  
children.  I acknowledge that existing numpy code will break with my  
desired change, as this example from the tutorial


   import numpy
   import pylab
   # Build a vector of 1 normal deviates with variance 0.5^2 and  
mean 2
   mu, sigma = 2, 0.5
   v = numpy.random.normal(mu,sigma,1)

and I am not saying to change this code.  Instead, I am asking for  
limits on the eagerness, with a long-term goal of minimizing its use.


 Why is [ctypeslib] so important that it should be in the top-
 level namespace?

 It's a single Python file -- does it make much of a difference?

The file imports other files.  Here's the import chain:

  ctypeslib: 0.047 (numpy)
   ctypes: -1.000 (ctypeslib)
_ctypes: 0.003 (ctypes)
gestalt: -1.000 (ctypes)
ma: 0.005 (numpy)
 extras: 0.001 (ma)
  numpy.lib.index_tricks: 0.000 (extras)
  numpy.lib.polynomial: 0.000 (extras)

(The -1.000 indicates a bug in my instrumentation script, which I  
worked around with a -1.0 value.)

Every numpy program, because it eagerly imports 'ctypeslib' to make  
it be accessible as a top-level variable, ends up importing ctypes.

>>> if 1:
...   t1 = time.time()
...   import ctypes
...   t2 = time.time()
...
>>> t2-t1
0.032159090042114258

That's 10% of the import time.


 In my opinion, this assistance is counter to standard practice in
 effectively every other Python package.  I don't see the benefit.

 How do you propose we change this?

If I had my way, remove things like (in numpy/__init__.py)

 import linalg
 import fft
 import random
 import ctypeslib
 import ma

but leave the list of submodules in __all__ so that 'from numpy
import *' works.  Perhaps add a top-level function 'import_all()'
which mimics the current behavior, and have IPython know about it so
interactive users get it automatically.  Or something like that.
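
One way the idea could look in present-day Python 3, as a sketch only. The class name LazySubmodules is hypothetical, and the standard-library 'json' package stands in for numpy; nothing here is numpy's actual mechanism. Submodules are imported on first attribute access, and import_all() restores the eager behavior for those who want it.

```python
import importlib


class LazySubmodules:
    """Namespace that imports the named submodules of a package on first use."""

    def __init__(self, package, names):
        self._package = package
        self._names = frozenset(names)

    def __getattr__(self, name):
        # Only reached when normal attribute lookup fails, i.e. before the
        # submodule has been loaded and cached on the instance.
        if name.startswith("_") or name not in self._names:
            raise AttributeError(name)
        module = importlib.import_module(self._package + "." + name)
        setattr(self, name, module)  # cache so later lookups skip __getattr__
        return module

    def import_all(self):
        """Eagerly load every listed submodule."""
        for name in sorted(self._names):
            getattr(self, name)


# The standard-library 'json' package stands in for a large package here.
lazy_json = LazySubmodules("json", ["decoder", "encoder"])
loaded_before = sorted(k for k in vars(lazy_json) if not k.startswith("_"))
decoder_class = lazy_json.decoder.JSONDecoder  # triggers the import
loaded_after = sorted(k for k in vars(lazy_json) if not k.startswith("_"))
```

Since Python 3.7 a package can get a similar effect inside its own __init__.py with a module-level __getattr__ (PEP 562), so no wrapper object is needed.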


Yes, I know the numpy team won't change this behavior.  I want to
know what you all would consider changing.


Something more concrete: change the top-level definitions in 'numpy'
from

 from testing import Tester
 test = Tester().test
 bench = Tester().bench

to

def test(label='fast', verbose=1, extra_argv=None, doctests=False,
         coverage=False, **kwargs):
    from testing import Tester
    return Tester().test(label, verbose, extra_argv, doctests, coverage, **kwargs)

and do something similar for 'bench'.  Note that numpy currently  
implements

   numpy.test  -- this is a Tester().test
   numpy.testing.test -- another Tester().test bound method

so there's some needless and distracting, but extremely minor,  
duplication.



 Getting rid of these functions, and thus getting rid of the import
 speeds numpy startup time by 3.5%.

 While I appreciate you taking the time to find these niggles, we
 are short on developer time as it is.  Asking them to spend their
 precious time on making a 3.5% improvement in startup time does not
 make much sense.  If you provide a patch, on the other hand, it would
 only take a matter of seconds to decide whether to apply or not.
 You've already done most of the sleuth work.

I wrote that I don't know the reasons for why the design was as it  
is.  Are those functions (english_upper, english_lower,  
english_capitalize) expected as part of the public interface for  
the module?  The lack of a _ prefix and their verbose docstrings  
implies that they are for general use.  In that case, they can't  
easily be gotten rid of.  Yet it doesn't make sense for them to be  
part of 'numerictypes'.

Why would I submit a patch if there's no way those definitions will  
disappear, for reasons I am not aware of?

I am not asking you all to make these changes.  I'm asking about how  
much change is acceptable, what are the restrictions, and why are  
they there?


I also haven't yet figured out how to get the regression tests to  
run, and I'm not going to contribute patches without at least passing  
that bare minimum.  BTW, how do I do that?  In the top-level there's  
a 'test.sh' command but when I run it I get:

% mkdir 

Re: [Numpy-discussion] import numpy is slow

2008-07-30 Thread Andrew Dalke
On Jul 30, 2008, at 10:51 PM, Alan McIntyre wrote:
 I suppose it's necessary for providing the test() and bench()
 functions in subpackages, but I think that isn't a good reason to impose
 upon all users the time required to set up numpy.testing.

I just posted this in my reply to Stéfan, but I'll say it again here.

numpy defines

  numpy.test
  numpy.bench

and

  numpy.testing.test

The two 'test's use the same implementation.  This is a likely  
unneeded duplication and one should be removed. The choice depends on  
if people think the name should be 'numpy.test' or 'numpy.testing.test'.


BTW, where's the on-line documentation for these functions?  They are  
actually bound methods, and I wondered if the doc programs handle  
them okay.

If they should be top-level functions then I would prefer they be
actual functions to hide an import.  In that case, replace

 from testing import Tester
 test = Tester().test

with

def test(label='fast', verbose=1, extra_argv=None, doctests=False,
         coverage=False, **kwargs):
    from testing import Tester
    return Tester().test(label, verbose, extra_argv, doctests, coverage, **kwargs)

or something similar.  This would keep the API unchanged (assuming  
those are important in the top-level) and reduce the number of imports.
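
The same pattern as a small runnable sketch in Python 3, with the standard-library 'difflib' standing in for an expensive module. The wrapper name close_matches is made up for illustration; the point is only that the import cost moves from package import time to the first call.

```python
import sys


def close_matches(word, candidates):
    """Thin wrapper that loads 'difflib' on first call, not at import time."""
    import difflib  # deferred: only callers of this function pay the cost
    return difflib.get_close_matches(word, candidates)


result = close_matches("nump", ["numpy", "scipy"])
difflib_loaded = "difflib" in sys.modules  # True once the wrapper has run
```

After the first call the module is cached in sys.modules, so later calls pay only a dictionary lookup.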

Else I would keep/move them in 'numpy.testing' and require that if  
someone wants to use 'test' or 'bench' then to get them after a 'from  
numpy import testing'.


 Thanks for taking the time to find those; I just removed the unused
 glob and delayed the import of shlex, difflib, and inspect in
 numpy.testing.

Thanks!

Andrew
[EMAIL PROTECTED]




Re: [Numpy-discussion] import numpy is slow

2008-07-30 Thread David Cournapeau
On Thu, 2008-07-31 at 02:07 +0200, Andrew Dalke wrote:
 On Jul 30, 2008, at 10:59 PM, Stéfan van der Walt wrote:
  I.e. most people don't start up NumPy all the time -- they import
  NumPy, and then do some calculations, which typically take longer than
  the import time.
 
 Is that interactively, or is that through programs?

Most people use it interactively, or for long-running programs. Import
time only matters for interactive commands that depend on numpy.

 
 and I am not saying to change this code.  Instead, I am asking for  
 limits on the eagerness, with a long-term goal of minimizing its use.

For new API, this is never done, and is a bug if it is. In scipy,
typically, import scipy does not import the whole subpackages list.


 I also haven't yet figured out how to get the regression tests to  
 run, and I'm not going to contribute patches without at least passing  
 that bare minimum.  BTW, how do I do that?  In the top-level there's  
 a 'test.sh' command but when I run it I get:

Argh, this file should never have ended up here; that's entirely my fault.
It was a merge from a (at the time) experimental branch. I can't remove
it now because my company does not allow subversion access, but I will
fix this tonight. Sorry for the confusion.

 
 and when I run 'nosetests' in the top-level directory I get:
 
 ImportError: Error importing numpy: you should not try to import  
 numpy from
  its source directory; please exit the numpy source tree, and  
 relaunch
  your python intepreter from there.
 
 I couldn't find (in a cursory search) instructions for running self- 
 tests or regression tests.

You are supposed to run the tests on an installed numpy, not in the
sources:

import numpy
numpy.test(verbose = 10)

You can't really run numpy without installing it first (which
is what the message is about).

cheers,

David



Re: [Numpy-discussion] import numpy is slow

2008-07-30 Thread Alan McIntyre
On Wed, Jul 30, 2008 at 8:19 PM, Andrew Dalke [EMAIL PROTECTED] wrote:
 numpy defines

  numpy.test
  numpy.bench

 and

  numpy.testing.test

 The two 'test's use the same implementation.  This is a likely
 unneeded duplication and one should be removed. The choice depends on
 if people think the name should be 'numpy.test' or 'numpy.testing.test'.

They actually do two different things; numpy.test() runs tests for all
of numpy, and numpy.testing.test() runs tests for numpy.testing only.
There are similar functions in numpy.lib, numpy.core, etc.


Re: [Numpy-discussion] import numpy is slow

2008-07-30 Thread Andrew Dalke
On Jul 31, 2008, at 4:21 AM, Alan McIntyre wrote:
 They actually do two different things; numpy.test() runs test for all
 of numpy, and numpy.testing.test() runs tests for numpy.testing only.
 There are similar functions in numpy.lib, numpy.core, etc.

Really?  This is the code from numpy/__init__.py:

 from testing import Tester
 test = Tester().test
 bench = Tester().bench

This is the code from numpy/testing/__init__.py:

test = Tester().test


... ahhh, here's the magic, from testing/nosetester.py:NoseTester

 if package is None:
 f = sys._getframe(1)
 package = f.f_locals.get('__file__', None)
 assert package is not None
 package = os.path.dirname(package)
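
A small Python 3 sketch of the frame trick in the excerpt above, under stated assumptions: caller_file is a hypothetical stand-in, not numpy's code, and sys._getframe is a CPython implementation detail. It shows why the lookup works when called at module level and finds nothing when the call is made from inside a function.

```python
import sys


def caller_file():
    """Return the caller's '__file__' as seen in its local namespace, or None."""
    frame = sys._getframe(1)  # the frame that called this function
    return frame.f_locals.get("__file__")


# Module level: locals and globals are the same dict, so '__file__' is visible.
module_ns = {"__file__": "/pkg/sub/__init__.py", "caller_file": caller_file}
exec("found = caller_file()", module_ns)

# Inside a function: the local namespace has no '__file__', so nothing is found.
exec(
    "def wrapper():\n"
    "    return caller_file()\n"
    "found_in_function = wrapper()",
    module_ns,
)
```

This is why the same 'Tester().test' expression can cover different packages depending on which __init__.py evaluates it.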

Why are 'test' and 'bench' part of the general API instead of something
only used during testing?

Andrew
[EMAIL PROTECTED]




Re: [Numpy-discussion] import numpy is slow

2008-07-04 Thread Andrew Dalke
On Jul 3, 2008, at 9:06 AM, Robert Kern wrote:
 Can you try the SVN trunk?

Sure.  Though did you know it's not easy to find how to get numpy  
from SVN?  I had to go to the second page of Google, which linked to  
someone's talk.

I expected to find a link to it at http://numpy.scipy.org/ .
Just like I expected to find a link to the numpy mailing list.

Okay, compiled.

[josiah:numpy/build/lib.macosx-10.3-fat-2.5] dalke% time python -c  
'pass'
0.015u 0.042s 0:00.06 83.3% 0+0k 0+0io 0pf+0w
[josiah:numpy/build/lib.macosx-10.3-fat-2.5] dalke% time python -c  
'import numpy'
0.084u 0.231s 0:00.33 93.9% 0+0k 0+8io 0pf+0w
[josiah:numpy/build/lib.macosx-10.3-fat-2.5] dalke%

Previously it took 0.44 seconds so it's now 24% faster.


 I would be interested to know how significantly it improves your  
 use case.


For one of my clients I wrote a tool to analyze import times.  I  
don't have it, but here's something similar I just now whipped up:

import time

seen = set()
import_order = []
elapsed_times = {}
level = 0
parent = None
children = {}

def new_import(name, globals, locals, fromlist):
    # Time only the first import of each module; repeats pass straight through.
    global level, parent
    if name in seen:
        return old_import(name, globals, locals, fromlist)
    seen.add(name)
    import_order.append((name, level, parent))
    t1 = time.time()
    old_parent = parent
    parent = name
    level += 1
    module = old_import(name, globals, locals, fromlist)
    level -= 1
    parent = old_parent
    t2 = time.time()
    # Elapsed time includes everything the module imported in turn.
    elapsed_times[name] = t2-t1
    return module

# Swap the timing wrapper in for the built-in import hook.
old_import = __builtins__.__import__

__builtins__.__import__ = new_import

import numpy

parents = {}
for name, level, parent in import_order:
    parents[name] = parent

print "== Tree =="
for name, level,parent in import_order:
    print "%s%s: %.3f (%s)" % (" "*level, name, elapsed_times[name],
                               parent)

print "\n"
print "== Slowest (including children) =="
slowest = sorted((t, name) for (name, t) in elapsed_times.items())[-20:]
for elapsed_time, name in slowest[::-1]:
    print "%.3f %s (%s)" % (elapsed_time, name, parents[name])


The result using the version out of subversion is

== Tree ==
numpy: 0.237 (None)
  numpy.__config__: 0.000 (numpy)
  version: 0.000 (numpy)
   os: 0.000 (version)
   imp: 0.000 (version)
  _import_tools: 0.024 (numpy)
   sys: 0.000 (_import_tools)
   glob: 0.024 (_import_tools)
fnmatch: 0.020 (glob)
 re: 0.018 (fnmatch)
  sre_compile: 0.009 (re)
   _sre: 0.000 (sre_compile)
   sre_constants: 0.004 (sre_compile)
  sre_parse: 0.006 (re)
  copy_reg: 0.000 (re)
  add_newdocs: 0.156 (numpy)
   lib: 0.150 (add_newdocs)
info: 0.000 (lib)
numpy.version: 0.000 (lib)
type_check: 0.091 (lib)

   ... many lines removed ...

  mtrand: 0.021 (numpy)
  ctypeslib: 0.024 (numpy)
   ctypes: 0.023 (ctypeslib)
_ctypes: 0.003 (ctypes)
gestalt: 0.013 (ctypes)
ctypes._endian: 0.001 (ctypes)
   numpy.core._internal: 0.000 (ctypeslib)
  ma: 0.005 (numpy)
   extras: 0.001 (ma)
numpy.lib.index_tricks: 0.000 (extras)
numpy.lib.polynomial: 0.000 (extras)


== Slowest (including children) ==
0.237 numpy (None)
0.156 add_newdocs (numpy)
0.150 lib (add_newdocs)
0.091 type_check (lib)
0.090 numpy.core.numeric (type_check)
0.049 io (lib)
0.048 numpy.testing (numpy.core.numeric)
0.024 _import_tools (numpy)
0.024 ctypeslib (numpy)
0.024 glob (_import_tools)
0.023 ctypes (ctypeslib)
0.022 utils (numpy.testing)
0.022 difflib (utils)
0.021 mtrand (numpy)
0.020 fnmatch (glob)
0.020 _datasource (io)
0.020 tempfile (io)
0.018 re (fnmatch)
0.018 heapq (difflib)
0.013 gestalt (ctypes)

This only reports the first time a module is imported so fixing, say,  
the 'glob' in _import_tools doesn't mean it won't appear elsewhere.
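
For readers on Python 3, a rough equivalent of the timing hook might look like the sketch below. It is an assumption-laden port, not the script used for the numbers in this message: the function name time_imports is made up, 'json' stands in for numpy, and only the first absolute import of each name is timed.

```python
import builtins
import time


def time_imports(statement):
    """Run an import statement; return {module name: seconds incl. children}."""
    timings = {}
    original_import = builtins.__import__

    def timed_import(name, globals=None, locals=None, fromlist=(), level=0):
        # Relative imports and names already seen pass straight through.
        if level or name in timings:
            return original_import(name, globals, locals, fromlist, level)
        timings[name] = 0.0  # mark as seen before recursing into children
        start = time.perf_counter()
        try:
            return original_import(name, globals, locals, fromlist, level)
        finally:
            timings[name] = time.perf_counter() - start

    builtins.__import__ = timed_import
    try:
        exec(statement, {})
    finally:
        builtins.__import__ = original_import  # always restore the hook
    return timings


hook_before = builtins.__import__
timings = time_imports("import json")
slowest = sorted(timings.items(), key=lambda item: item[1], reverse=True)[:5]
```

CPython 3.7 and later also provide 'python -X importtime -c "import numpy"', which prints a per-module breakdown without any custom hook.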


Andrew
[EMAIL PROTECTED]




Re: [Numpy-discussion] import numpy is slow

2008-07-03 Thread Robert Kern
On Mon, Jun 30, 2008 at 18:32, Andrew Dalke [EMAIL PROTECTED] wrote:
 Why does numpy/__init__.py need to import all of these other modules
 and submodules?  Any chance of cutting down on the number, in order
 to improve startup costs?

Can you try the SVN trunk? In another thread (it must be "numpy
imports slowly!" week), David Cournapeau found some optimizations that
could be done that don't affect the API. They seem to cut down my
import times (on OS X) by about 1/3; on his Linux machine, it seems to
be more. I would be interested to know how significantly it improves
your use case.

-- 
Robert Kern

I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth.
 -- Umberto Eco


Re: [Numpy-discussion] import numpy is slow

2008-07-01 Thread Hanni Ali
Would it not be possible to import just the numpy modules your
application needs?

i.e.

import numpy.core

or whatever you're using

you could even do:

import numpy.core as numpy

I think, to simplify your code, I'm no expert though.

Hanni


2008/7/1 Andrew Dalke [EMAIL PROTECTED]:

 On Jul 1, 2008, at 2:22 AM, Robert Kern wrote:
  Your use case isn't so typical and so suffers on the import time
  end of the
  balance.

 I'm working on my presentation for EuroSciPy.  Isn't so typical
 seems to be a good summary of my first slide.  :)

  Any chance of cutting down on the number, in order
  to improve startup costs?
 
  Not at this point in time, no. That would break too much code.

 Understood.

 Thanks for the response,

Andrew
[EMAIL PROTECTED]




Re: [Numpy-discussion] import numpy is slow

2008-07-01 Thread Matthieu Brucher
Hi,

IIRC, if you do 'import numpy.core as numpy', it starts by importing
numpy, so it will be even slower.

Matthieu

2008/7/1 Hanni Ali [EMAIL PROTECTED]:
 Would it not be possible to import just the necessary module of numpy to
 meet the necessary functionality of your application.

 i.e.

 import numpy.core

 or whatever you're using

 you could even do:

 import numpy.core as numpy

 I think, to simplify your code, I'm no expert though.

 Hanni







-- 
French PhD student
Website : http://matthieu-brucher.developpez.com/
Blogs : http://matt.eifelle.com and http://blog.developpez.com/?blog=92
LinkedIn : http://www.linkedin.com/in/matthieubrucher


Re: [Numpy-discussion] import numpy is slow

2008-07-01 Thread Hanni Ali
You are correct, it appears to take slightly longer to import numpy.core and
longer again to import numpy.core as numpy

I should obviously check first in future.

Hanni

2008/7/1 Matthieu Brucher [EMAIL PROTECTED]:

 Hi,

 IIRC, if you do 'import numpy.core as numpy', it starts by importing
 numpy, so it will be even slower.

 Matthieu



Re: [Numpy-discussion] import numpy is slow

2008-07-01 Thread Andrew Dalke
2008/7/1 Hanni Ali [EMAIL PROTECTED]:
 Would it not be possible to import just the necessary module of  
 numpy to
 meet the necessary functionality of your application.

Matthieu Brucher responded:
 IIRC, if you do 'import numpy.core as numpy', it starts by importing
 numpy, so it will be even slower.

which you can see if you start python with the -v option to display  
imports.

>>> import numpy.core
import numpy # directory /Library/Frameworks/Python.framework/ 
Versions/2.5/lib/python2.5/site-packages/numpy
# /Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/ 
site-packages/numpy/__init__.pyc matches /Library/Frameworks/ 
Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ 
__init__.py
import numpy # precompiled from /Library/Frameworks/Python.framework/ 
Versions/2.5/lib/python2.5/site-packages/numpy/__init__.pyc
# /Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/ 
site-packages/numpy/__config__.pyc matches /Library/Frameworks/ 
Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ 
__config__.py
import numpy.__config__ # precompiled from /Library/Frameworks/ 
Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ 
__config__.pyc

   ...
and many more
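
The same behavior can be checked without reading the -v output. The sketch below uses the standard-library 'xml' package as a stand-in for numpy (an assumption made so it runs anywhere): importing a submodule always runs each parent package's __init__ first, and 'import a.b as c' only changes the name that gets bound.

```python
import importlib
import sys

# Importing the innermost module loads every parent package on the way.
importlib.import_module("xml.dom.minidom")
parents_loaded = [
    name in sys.modules for name in ("xml", "xml.dom", "xml.dom.minidom")
]

# The 'as' form binds the submodule itself, but the parents were still run.
import xml.dom.minidom as minidom
alias_is_submodule = minidom is sys.modules["xml.dom.minidom"]
```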


Andrew
[EMAIL PROTECTED]




[Numpy-discussion] import numpy is slow

2008-06-30 Thread Andrew Dalke
(Trying again now that I'm subscribed.  BTW, there's no link to the  
subscription page from numpy.scipy.org .)


The initial 'import numpy' loads a huge number of modules, even when  
I don't need them.

Python 2.5 (r25:51918, Sep 19 2006, 08:49:13)
[GCC 4.0.1 (Apple Computer, Inc. build 5341)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> len(sys.modules)
28
>>> import numpy
>>> len(sys.modules)
256
>>> len([s for s in sorted(sys.modules) if 'numpy' in s])
127
>>> numpy.__version__
'1.1.0'
>>>

As a result, I assume that's the reason my program's startup cost is  
quite high.
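
A sketch of how the module count could be measured reproducibly in Python 3, using a fresh interpreter for each measurement so earlier imports cannot skew the result. The helper name modules_added_by is hypothetical, and 'json' is used instead of numpy so the example has no third-party dependency.

```python
import subprocess
import sys


def modules_added_by(statement):
    """Count the modules a statement adds to sys.modules in a fresh interpreter."""
    code = (
        "import sys\n"
        "before = set(sys.modules)\n"
        f"{statement}\n"
        "print(len(set(sys.modules) - before))\n"
    )
    completed = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True, text=True, check=True,
    )
    return int(completed.stdout.strip())


baseline = modules_added_by("pass")
json_count = modules_added_by("import json")
```

Passing "import numpy" as the statement would give the figure corresponding to the session above, on a machine where numpy is installed.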

[josiah:~/src/fp] dalke% time python -c 'a=4'
0.014u 0.038s 0:00.05 80.0% 0+0k 0+1io 0pf+0w
[josiah:~/src/fp] dalke% time python -c 'import numpy'
0.161u 0.279s 0:00.44 97.7% 0+0k 0+9io 0pf+0w

My total runtime is something like 1.4 seconds, and the only thing  
I'm using NumPy for is to make an array of doubles that I can pass to  
a C extension.  (I could use the array module or ctypes, but figured  
numpy is more useful for downstream code.)

Why does numpy/__init__.py need to import all of these other modules  
and submodules?  Any chance of cutting down on the number, in order  
to improve startup costs?

Andrew
[EMAIL PROTECTED]



Re: [Numpy-discussion] import numpy is slow

2008-06-30 Thread Robert Kern
On Mon, Jun 30, 2008 at 18:32, Andrew Dalke [EMAIL PROTECTED] wrote:
 (Trying again now that I'm subscribed.  BTW, there's no link to the
 subscription page from numpy.scipy.org .)


 The initial 'import numpy' loads a huge number of modules, even when
 I don't need them.

 Python 2.5 (r25:51918, Sep 19 2006, 08:49:13)
 [GCC 4.0.1 (Apple Computer, Inc. build 5341)] on darwin
 Type help, copyright, credits or license for more information.
   import sys
   len(sys.modules)
 28
   import numpy
   len(sys.modules)
 256
   len([s for s in sorted(sys.modules) if 'numpy' in s])
 127
   numpy.__version__
 '1.1.0'
  

 As a result, I assume that's the reason my program's startup cost is
 quite high.

 [josiah:~/src/fp] dalke% time python -c 'a=4'
 0.014u 0.038s 0:00.05 80.0% 0+0k 0+1io 0pf+0w
 [josiah:~/src/fp] dalke% time python -c 'import numpy'
 0.161u 0.279s 0:00.44 97.7% 0+0k 0+9io 0pf+0w

 My total runtime is something like 1.4 seconds, and the only thing
 I'm using NumPy for is to make an array of doubles that I can pass to
 a C extension.  (I could use the array module or ctypes, but figured
 numpy is more useful for downstream code.)

 Why does numpy/__init__.py need to import all of these other modules
 and submodules?

Strictly speaking, there is no *need* for any of it. It was a judgment
call trading off import time for the convenience in fairly typical use
cases which do use functions across the breadth of the library. Your
use case isn't so typical and so suffers on the import time end of the
balance.

 Any chance of cutting down on the number, in order
 to improve startup costs?

Not at this point in time, no. That would break too much code.

-- 
Robert Kern

I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth.
 -- Umberto Eco


Re: [Numpy-discussion] import numpy is slow

2008-06-30 Thread Andrew Dalke
On Jul 1, 2008, at 2:22 AM, Robert Kern wrote:
 Your use case isn't so typical and so suffers on the import time  
 end of the
 balance.

I'm working on my presentation for EuroSciPy.  "Isn't so typical"
seems to be a good summary of my first slide.  :)

 Any chance of cutting down on the number, in order
 to improve startup costs?

 Not at this point in time, no. That would break too much code.

Understood.

Thanks for the response,

Andrew
[EMAIL PROTECTED]

