[Numpy-discussion] Examples for numpy.genfromtxt

2009-01-20 Thread Nils Wagner
Hi all,

Where can I find some sophisticated examples of how to use 
numpy.genfromtxt?

Nils


Re: [Numpy-discussion] ANN: Numexpr 1.1, an efficient array evaluator

2009-01-20 Thread Francesc Alted
On Tuesday 20 January 2009, Andrew Collette wrote:
 Hi Francesc,

 Looks like a cool project!  However, I'm not able to achieve the
 advertised speed-ups.  I wrote a simple script to try three
 approaches to this kind of problem:

 1) Native Python code (i.e. will try to do everything at once using
 temp arrays)
 2) Straightforward numexpr evaluation
 3) Simple chunked evaluation using array.flat views.  (This solves
 the memory problem and allows the use of arbitrary Python
 expressions.)

 I've attached the script; here's the output for the expression
 63 + (a*b) + (c**2) + sin(b)
 along with a few combinations of shapes/dtypes.  As expected, using
 anything other than f8 (double) results in a performance penalty.
 Surprisingly, it seems that using chunks via array.flat results in
 similar performance for f8, and even better performance for other
 dtypes.
[clip]

Well, there were two issues there.  The first is that when 
transcendental functions are used (like sin() above), the bottleneck is 
the CPU rather than memory bandwidth, so numexpr's speed-ups are not as 
high as usual.  The other was an actual bug in the numexpr code 
that forced a copy of all multidimensional arrays (I normally only use 
one-dimensional arrays for benchmarks).  This has been fixed in 
trunk (r39).
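
For reference, a minimal sketch of the three approaches being compared
(the shapes, chunk size, and variable names are illustrative, not the
exact attached script):

import numpy as np
import numexpr as ne

shape = (100, 100, 100)
a = np.random.rand(*shape)
b = np.random.rand(*shape)
c = np.random.rand(*shape)

# 1) Plain numpy: every intermediate result allocates a full temporary
out1 = 63 + (a * b) + (c ** 2) + np.sin(b)

# 2) numexpr: compiles the expression and evaluates it in one pass
out2 = ne.evaluate("63 + (a * b) + (c ** 2) + sin(b)")

# 3) Chunked evaluation: work on contiguous 1-d views one slice at a
#    time, so the temporaries stay small enough to fit in cache
out3 = np.empty(shape)
ar, br, cr, outr = a.ravel(), b.ravel(), c.ravel(), out3.ravel()
chunk = 8192  # illustrative chunk size
for i in xrange(0, ar.size, chunk):
    s = slice(i, i + chunk)
    outr[s] = 63 + (ar[s] * br[s]) + (cr[s] ** 2) + np.sin(br[s])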

So, with the fix on, the timings are:

(100, 100, 100) f4 (average of 10 runs)
Simple:  0.0426136016846
Numexpr:  0.11350851059
Chunked:  0.0635252952576
(100, 100, 100) f8 (average of 10 runs)
Simple:  0.119254398346
Numexpr:  0.10092959404
Chunked:  0.128384995461

The speed-up is now a mere 20% (for f8), but at least it is not slower.  
With the patches that Georg recently contributed for using Intel's VML, 
the acceleration is a bit better:

(100, 100, 100) f4 (average of 10 runs)
Simple:  0.0417867898941
Numexpr:  0.0944641113281
Chunked:  0.0636183023453
(100, 100, 100) f8 (average of 10 runs)
Simple:  0.120059680939
Numexpr:  0.0832288980484
Chunked:  0.128114104271

i.e. the speed-up is around 45% (for f8).

Moreover, if I get rid of the sin() function and use the expression:

63 + (a*b) + (c**2) + b

I get:

(100, 100, 100) f4 (average of 10 runs)
Simple:  0.0119329929352
Numexpr:  0.0198570966721
Chunked:  0.0338240146637
(100, 100, 100) f8 (average of 10 runs)
Simple:  0.0255623102188
Numexpr:  0.00832500457764
Chunked:  0.0340095996857

which has a 3.1x speedup (for f8).

 FYI, the current tar file (1.1-1) has a glitch related to the VERSION
 file; I added it to the bug report at Google Code.

Thanks.  Will focus on that ASAP.  Mmm, seems like there is enough stuff 
for another release of numexpr.  I'll try to do it soon.

Cheers,

-- 
Francesc Alted


Re: [Numpy-discussion] Please don't use google code for hosting

2009-01-20 Thread Fernando Perez
On Mon, Jan 19, 2009 at 11:20 AM, Matthew Brett matthew.br...@gmail.com wrote:
 Hi,

 Do you also know what the situation is with sourceforge/launchpad/trac...
 and other popular hosting systems?
 Do they also have these restrictions?

 I've not noticed any problems with sourceforge or launchpad - I'm
 using them regularly from here.  You'd hope that was the case for
 launchpad - when my browser started this morning, it reminded me of
 the Ubuntu / Canonical mission - http://www.canonical.com/aboutus

1)  delivering the world's best free software platform
2)  ensuring its availability to everyone

Also, remember that launchpad makes it pretty trivial to set up a bzr
branch off an external SVN or other repo.  So it should be very easy
to create launchpad projects that track existing google code ones you
are interested in.

Not arguing with your original point, just providing more info on a
viable workaround in the meantime, especially since it's quite likely
that projects already on google code will not switch hosts due to this
issue.

Best,

f


Re: [Numpy-discussion] ANN: Numexpr 1.1, an efficient array evaluator

2009-01-20 Thread Andrew Collette
Works much, much better with the current svn version. :) Numexpr now
outperforms everything except the simple technique, and then only
for small data sets.

Along the lines you mentioned, I noticed that simply changing from a
shape of (100*100*100,) to (100, 100, 100) results in nearly a factor
of 2 worse performance, a factor that seems constant as the size of
the data set changes.  Is this related to the way numexpr handles
broadcasting rules?  It would seem the memory contents should be
identical for these two cases.
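
For what it's worth, a quick sketch of what I mean by identical memory
contents - reshaping a contiguous array only changes the shape
metadata, not the buffer:

import numpy as np

a1 = np.empty(100 * 100 * 100)       # 1-d
a3 = a1.reshape(100, 100, 100)       # same buffer, new shape metadata
assert a3.base is a1                 # reshape of a contiguous array is a view
assert a1.flags['C_CONTIGUOUS'] and a3.flags['C_CONTIGUOUS']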

Andrew

On Tue, Jan 20, 2009 at 6:13 AM, Francesc Alted fal...@pytables.org wrote:
 On Tuesday 20 January 2009, Andrew Collette wrote:
 Hi Francesc,

 Looks like a cool project!  However, I'm not able to achieve the
 advertised speed-ups.  I wrote a simple script to try three
 approaches to this kind of problem:

 1) Native Python code (i.e. will try to do everything at once using
 temp arrays)
 2) Straightforward numexpr evaluation
 3) Simple chunked evaluation using array.flat views.  (This solves
 the memory problem and allows the use of arbitrary Python
 expressions.)

 I've attached the script; here's the output for the expression
 63 + (a*b) + (c**2) + sin(b)
 along with a few combinations of shapes/dtypes.  As expected, using
 anything other than f8 (double) results in a performance penalty.
 Surprisingly, it seems that using chunks via array.flat results in
 similar performance for f8, and even better performance for other
 dtypes.
 [clip]

 Well, there were two issues there.  The first is that when
 transcendental functions are used (like sin() above), the bottleneck is
 the CPU rather than memory bandwidth, so numexpr's speed-ups are not as
 high as usual.  The other was an actual bug in the numexpr code
 that forced a copy of all multidimensional arrays (I normally only use
 one-dimensional arrays for benchmarks).  This has been fixed in
 trunk (r39).

 So, with the fix on, the timings are:

 (100, 100, 100) f4 (average of 10 runs)
 Simple:  0.0426136016846
 Numexpr:  0.11350851059
 Chunked:  0.0635252952576
 (100, 100, 100) f8 (average of 10 runs)
 Simple:  0.119254398346
 Numexpr:  0.10092959404
 Chunked:  0.128384995461

 The speed-up is now a mere 20% (for f8), but at least it is not slower.
 With the patches that Georg recently contributed for using Intel's VML,
 the acceleration is a bit better:

 (100, 100, 100) f4 (average of 10 runs)
 Simple:  0.0417867898941
 Numexpr:  0.0944641113281
 Chunked:  0.0636183023453
 (100, 100, 100) f8 (average of 10 runs)
 Simple:  0.120059680939
 Numexpr:  0.0832288980484
 Chunked:  0.128114104271

 i.e. the speed-up is around 45% (for f8).

 Moreover, if I get rid of the sin() function and use the expression:

 63 + (a*b) + (c**2) + b

 I get:

 (100, 100, 100) f4 (average of 10 runs)
 Simple:  0.0119329929352
 Numexpr:  0.0198570966721
 Chunked:  0.0338240146637
 (100, 100, 100) f8 (average of 10 runs)
 Simple:  0.0255623102188
 Numexpr:  0.00832500457764
 Chunked:  0.0340095996857

 which has a 3.1x speedup (for f8).

 FYI, the current tar file (1.1-1) has a glitch related to the VERSION
 file; I added it to the bug report at Google Code.

 Thanks.  Will focus on that ASAP.  Mmm, seems like there is enough stuff
 for another release of numexpr.  I'll try to do it soon.

 Cheers,

 --
 Francesc Alted



Re: [Numpy-discussion] Examples for numpy.genfromtxt

2009-01-20 Thread Pierre GM
Till I write some proper doc, you can check the examples in
tests/test_io (the TestFromTxt test suite).
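
In the meantime, a minimal sketch of the kind of thing genfromtxt
handles (the data and options here are made up for illustration):

from StringIO import StringIO
import numpy as np

data = StringIO("""name,height,weight
alice,165.0,58.5
bob,180.5,
carol,,72.0""")

# names=True: take field names from the first row;
# dtype=None: guess each column's type;
# empty fields get replaced by filling_values
arr = np.genfromtxt(data, delimiter=",", names=True, dtype=None,
                    filling_values=-999.0)
print arr['height']   # access fields by name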


On Jan 20, 2009, at 4:17 AM, Nils Wagner wrote:

 Hi all,

 Where can I find some sophisticated examples of how to use
 numpy.genfromtxt?


 Nils



Re: [Numpy-discussion] Please don't use google code for hosting

2009-01-20 Thread Tim Michelsen
Hello,
last year there was a discussion about this same issue on the OSGEO list.

You may check osgeo.discuss at Gmane or Nabble for it.

Kind regards,
Timmie



Re: [Numpy-discussion] Please don't use google code for hosting

2009-01-20 Thread Robert Kern
On Mon, Jan 19, 2009 at 12:26, Tim Michelsen
timmichel...@gmx-topmail.de wrote:
 Hello,
 last year there was a discussion about this same issue on the OSGEO list.

 You may check osgeo.discuss at Gmane or Nabble for it.

I must say that Wilfred L. Guerin's opinions on the subject were quite
entertaining, and either the result of paranoid delusions or the work
of a certain Mark V. Shaney.

-- 
Robert Kern

I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth.
  -- Umberto Eco


[Numpy-discussion] python numpy code many times slower than c++

2009-01-20 Thread Neal Becker
I tried a little experiment, implementing some code in numpy (usually I 
build modules in c++ to interface to python).  Since these operations are 
all on large vectors, I hoped it would be reasonably efficient.

The code in question is simple.  It is a model of an amplifier, modeled by 
its AM/AM and AM/PM characteristics.

The function in question is the __call__ operator.  The test program plots a 
spectrum, calling this operator 1024 times, each time with a vector of 4096.

Any ideas?  The code is not too big, so I'll try to attach it.

from linear_interp import linear_interp
from numpy import *
from math import pi
from uvector import vector_Complex

def db_to_pwr (db):
    return 10**(0.1*db)

def db_to_volt (db):
    return 10**(0.05*db)

def db_to_gain (db_in, db_out):
    return 10**(0.05*(db_out-db_in))

def from_gain_and_phase (gain, phase):
    # build a complex array from polar (gain, phase) data
    out = empty (len (gain), dtype=complex)
    out.real = cos (phase) * gain
    out.imag = sin (phase) * gain
    return out

class ampl (object):
    def __init__ (self, pin, pout, deg, max_pin, delta_v, ibo):
        """pin, pout in dB normalized to sat, ibo in dB"""
        ampl_interp = linear_interp (vectorize (db_to_volt) (pin), db_to_volt (pout))
        phase_interp = linear_interp (vectorize (db_to_volt)(pin), array(deg)*pi/180)
        ## These would be used if input was linear instead of db
        ## ampl_interp = linear_interp (sqrt (pin), sqrt (pout))
        ## phase_interp = linear_interp (sqrt (pin), array(deg)*pi/180)
        eps = 1e-6
        max_vin = sqrt (max_pin)
        vin = arange (0, max_vin+eps, delta_v)
        gain = vectorize (ampl_interp) (vin)/vin
        ##gain = vectorize (ampl_interp) (vin**2) / vin
        #gain[0] = 1
        gain[0] = gain[1]   # avoid singularity

        phase = vectorize (phase_interp) (vin**2)
        self.cmplx_gain = from_gain_and_phase (gain, phase)
        self.delta_v = delta_v
        self.ibo = 10**(-0.05*ibo)
        self._phase_comp = from_gain_and_phase (ones (len (phase)), -phase)

    def __call__ (self, zin):
        zin_ibo = zin * self.ibo
        vin = abs (array (zin_ibo, dtype=complex))
        # locate each input amplitude in the gain table
        index = vin / self.delta_v
        lower = floor (index).astype (int)
        upper = lower + 1
        assert (alltrue (upper < len (self.cmplx_gain)))
        delta = (index - lower)

        # linear interpolation between adjacent table entries
        cmplx_gain = self.cmplx_gain[lower]*(1-delta) + self.cmplx_gain[upper]*delta

        return vector_Complex (cmplx_gain * zin_ibo)

    def phase_comp (self, zin):
        zin_ibo = zin * self.ibo
        vin = abs (array (zin_ibo, dtype=complex))
        index = vin / self.delta_v
        lower = floor (index).astype (int)
        upper = lower + 1
        assert (alltrue (upper < len (self._phase_comp)))

        delta = (index - lower)
        return vector_Complex (self._phase_comp[lower]*(1-delta) + self._phase_comp[upper]*delta)

if __name__ == '__main__':
    x = array ((-20, -10, 0, 10), dtype=float)
    y = array ((-20, -10, 0, 0), dtype=float)
    p = array ((-20, -10, 0, 0), dtype=float)
    t = ampl (x, y, p, 10, .01, 0)

    def pwr_to_cmplx (db):
        return complex (10**(0.05*db))

    db = (-20, -10, 0, 1)
    res = array (t (vectorize (pwr_to_cmplx) (db)), dtype=complex)

    #print res
    def mag_sqr (z):
        return z.real*z.real + z.imag*z.imag

    db_out = log10 (vectorize(mag_sqr) (res)) * 10
    phase_out = arctan2 (res.imag, res.real)*180/pi
    print db_out, phase_out

class linear_interp (object):
    def __init__ (self, x, y):
        assert (len (x) == len (y))
        self.n = len (x)
        self.x = tuple (x)
        self.y = tuple (y)

    def __call__ (self, _x):
        # bisection search for the table interval containing _x
        klo = 1
        khi = self.n
        while (khi - klo > 1):
            k = (khi + klo) >> 1
            if (self.x[k - 1] > _x):
                khi = k
            else:
                klo = k

        h = float (self.x[khi - 1]-self.x[klo - 1])
        if (h == 0.0):
            raise ValueError, "Bad x input to routine linear_interp"
        # weights for linear interpolation between the bracketing points
        a = (self.x[khi - 1]-_x) / h
        b = (_x - self.x[klo - 1]) / h
        return a * self.y[klo - 1] + b * self.y[khi - 1]

if (__name__ == '__main__'):
    x = (xrange (10))
    y = [e+1 for e in x]
    l = linear_interp (x, y)

    print l (0)
    print l (1)
    print l (0.5)


import sys
sys.path.append ('../mod')

from constellation import *
from boost_rand import rng, pn, normal_c, normal, uniform_real, uniform_int, gold_code_generator, beta
from nyquist import *
from fir import coef_from_func_double, FIR_Complex_double, InterpFIR_Complex_double, DecimFIR_Complex_double
from ampl import ampl
#from polar_ampl import ampl
from fft3 import *
from uvector import vector_double, vector_Complex, mag_sqr, log10
from stats import stat_Complex
from limit import Limit
import math

from optparse import OptionParser
parser = OptionParser()
parser.add_option ("--sps", type="int", default=4)
parser.add_option ("-n", "--pts", type="int", 

Re: [Numpy-discussion] python numpy code many times slower than c++

2009-01-20 Thread Robert Kern
2009/1/20 Neal Becker ndbeck...@gmail.com:
 I tried a little experiment, implementing some code in numpy (usually I
 build modules in c++ to interface to python).  Since these operations are
 all on large vectors, I hoped it would be reasonably efficient.

 The code in question is simple.  It is a model of an amplifier, modeled by
 its AM/AM and AM/PM characteristics.

 The function in question is the __call__ operator.  The test program plots a
 spectrum, calling this operator 1024 times, each time with a vector of 4096.

 Any ideas?  The code is not too big, so I'll try to attach it.

Any chance you can make it self-contained?

-- 
Robert Kern

I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth.
  -- Umberto Eco


Re: [Numpy-discussion] python numpy code many times slower than c++

2009-01-20 Thread Robert Kern
2009/1/20 Neal Becker ndbeck...@gmail.com:
 I tried a little experiment, implementing some code in numpy (usually I
 build modules in c++ to interface to python).  Since these operations are
 all on large vectors, I hoped it would be reasonably efficient.

 The code in question is simple.  It is a model of an amplifier, modeled
 by its AM/AM and AM/PM characteristics.

 The function in question is the __call__ operator.  The test program plots a
 spectrum, calling this operator 1024 times, each time with a vector of 4096.

If you want to find out what lines in that function are taking the
most time, you can try my line_profiler module:

http://www.enthought.com/~rkern/cgi-bin/hgwebdir.cgi/line_profiler/

That might give us a better idea in the absence of a self-contained example.

-- 
Robert Kern

I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth.
  -- Umberto Eco


Re: [Numpy-discussion] python numpy code many times slower than c++

2009-01-20 Thread Neal Becker
Robert Kern wrote:

 2009/1/20 Neal Becker ndbeck...@gmail.com:
 I tried a little experiment, implementing some code in numpy (usually I
 build modules in c++ to interface to python).  Since these operations are
 all on large vectors, I hoped it would be reasonably efficient.

 The code in question is simple.  It is a model of an amplifier, modeled
 by its AM/AM and AM/PM characteristics.

 The function in question is the __call__ operator.  The test program
 plots a spectrum, calling this operator 1024 times, each time with a
 vector of 4096.
 
 If you want to find out what lines in that function are taking the
 most time, you can try my line_profiler module:
 
 http://www.enthought.com/~rkern/cgi-bin/hgwebdir.cgi/line_profiler/
 
 That might give us a better idea in the absence of a self-contained
 example.
 
Sounds interesting, I'll give that a try.  But I'm not sure how to use it.

If my main script is plot_spectrum.py, and I want to profile the 
ampl.__call__ function (defined in ampl.py), what do I need to do?  I tried 
running kernprof.py plot_spectrum.py after adding @profile decorators in 
ampl.py, but that didn't work:
  File "../mod/ampl.py", line 43, in ampl
@profile
NameError: name 'profile' is not defined




Re: [Numpy-discussion] python numpy code many times slower than c++

2009-01-20 Thread Robert Kern
On Tue, Jan 20, 2009 at 20:44, Neal Becker ndbeck...@gmail.com wrote:
 Robert Kern wrote:

 2009/1/20 Neal Becker ndbeck...@gmail.com:
 I tried a little experiment, implementing some code in numpy (usually I
 build modules in c++ to interface to python).  Since these operations are
 all on large vectors, I hoped it would be reasonably efficient.

 The code in question is simple.  It is a model of an amplifier, modeled
 by its AM/AM and AM/PM characteristics.

 The function in question is the __call__ operator.  The test program
 plots a spectrum, calling this operator 1024 times, each time with a
 vector of 4096.

 If you want to find out what lines in that function are taking the
 most time, you can try my line_profiler module:

 http://www.enthought.com/~rkern/cgi-bin/hgwebdir.cgi/line_profiler/

 That might give us a better idea in the absence of a self-contained
 example.

 Sounds interesting, I'll give that a try.  But I'm not sure how to use it.

 If my main script is plot_spectrum.py, and I want to profile the
 ampl.__call__ function (defined in ampl.py), what do I need to do?  I tried
 running kernprof.py plot_spectrum.py after adding @profile decorators in
 ampl.py, but that didn't work:
  File "../mod/ampl.py", line 43, in ampl
@profile
 NameError: name 'profile' is not defined

kernprof.py --line-by-line plot_spectrum.py
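
For reference, a minimal sketch of the whole workflow.  When run under
kernprof with --line-by-line, the name 'profile' is injected into the
builtins, which is why running the script directly raises that
NameError.  The function below is only an illustration:

# in ampl.py -- mark the hot function; no import needed under kernprof
@profile
def hot_call (a, b):
    c = a * b          # each line gets its own timing
    return c.sum ()

Then run:

kernprof.py --line-by-line plot_spectrum.py
python -m line_profiler plot_spectrum.py.lprof   # view the results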

-- 
Robert Kern

I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth.
  -- Umberto Eco


Re: [Numpy-discussion] python numpy code many times slower than c++

2009-01-20 Thread Neal Becker
Robert Kern wrote:

 2009/1/20 Neal Becker ndbeck...@gmail.com:
 I tried a little experiment, implementing some code in numpy (usually I
 build modules in c++ to interface to python).  Since these operations are
 all on large vectors, I hoped it would be reasonably efficient.

 The code in question is simple.  It is a model of an amplifier, modeled
 by its AM/AM and AM/PM characteristics.

 The function in question is the __call__ operator.  The test program
 plots a spectrum, calling this operator 1024 times, each time with a
 vector of 4096.
 
 If you want to find out what lines in that function are taking the
 most time, you can try my line_profiler module:
 
 http://www.enthought.com/~rkern/cgi-bin/hgwebdir.cgi/line_profiler/
 
 That might give us a better idea in the absence of a self-contained
 example.
 
I see the problem.  Thanks for the great profiler!  You ought to make this 
more widely known.

It seems the big chunks of time are spent in data conversion between numpy 
and my own vector classes.  Mine are wrappers around boost::ublas.  The 
conversion must be falling back on a very inefficient method, since there is 
no special code to handle numpy vectors.

Not sure what the best solution is.  It would be _great_ if I could make 
boost::python objects that export a buffer interface, but I have absolutely 
no idea how to do this (and so far no one else has volunteered any info on 
this).
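
To illustrate what I'm after, here is the numpy side of the handshake
(FakeVector is a made-up stand-in for my ublas wrapper; the part I
don't know is how to export this interface from boost::python):

import numpy as np

class FakeVector (object):
    """Stand-in for a C++ vector wrapper that exposes its buffer."""
    def __init__ (self, n):
        self._store = np.zeros (n)   # pretend this is the ublas storage

    @property
    def __array_interface__ (self):
        # describe the raw buffer (pointer, shape, typestr) to numpy
        return self._store.__array_interface__

v = FakeVector (5)
a = np.asarray (v)    # zero-copy: numpy wraps the existing buffer
a[0] = 42.0
print v._store[0]     # 42.0 -- same memory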




Re: [Numpy-discussion] python numpy code many times slower than c++

2009-01-20 Thread Robert Kern
On Tue, Jan 20, 2009 at 20:57, Neal Becker ndbeck...@gmail.com wrote:

 I see the problem.  Thanks for the great profiler!  You ought to make this
 more widely known.

I'll be making a release shortly.

 It seems the big chunks of time are spent in data conversion between numpy
 and my own vector classes.  Mine are wrappers around boost::ublas.  The
 conversion must be falling back on a very inefficient method, since there is
 no special code to handle numpy vectors.

 Not sure what the best solution is.  It would be _great_ if I could make
 boost::python objects that export a buffer interface, but I have absolutely
 no idea how to do this (and so far no one else has volunteered any info on
 this).

Who's not volunteering information, boost::python or us?

-- 
Robert Kern

I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth.
  -- Umberto Eco


Re: [Numpy-discussion] python numpy code many times slower than c++

2009-01-20 Thread T J
On Tue, Jan 20, 2009 at 6:57 PM, Neal Becker ndbeck...@gmail.com wrote:
 It seems the big chunks of time are spent in data conversion between numpy
 and my own vector classes.  Mine are wrappers around boost::ublas.  The
 conversion must be falling back on a very inefficient method, since there is
 no special code to handle numpy vectors.

 Not sure what the best solution is.  It would be _great_ if I could make
 boost::python objects that export a buffer interface, but I have absolutely
 no idea how to do this (and so far no one else has volunteered any info on
 this).


I'm not sure if I've understood everything here, but I think that
pyublas provides exactly what you need.

http://tiker.net/doc/pyublas/