Re: [Numpy-discussion] "Extended" Outer Product

2007-09-19 Thread Anne Archibald
On 19/09/2007, Travis E. Oliphant <[EMAIL PROTECTED]> wrote:
> Anne Archibald wrote:
> > vectorize, of course, is a good example of my point above: it really
> > just loops, in python IIRC, but conceptually it's extremely handy for
> > doing exactly what the OP wanted. Unfortunately vectorize() does not
> > yield a sufficiently ufunc-like object to support .outer(), as that
> > would be extremely tidy.
> >
> I'm not sure what you mean by sufficiently ufunc-like.  In fact,
> vectorize is a ufunc (it's just an object-based one).  Thus, it should
> produce what you want (as long as you use newaxis so that the
> broadcasting is done).  If you just want it to support the .outer
> method, that could easily be done (since under the covers it is a real ufunc).
>
> I just overlooked adding these methods to the result of vectorize.
> The purpose of vectorize is to create a ufunc out of a scalar-based
> function, so I don't see any problem in giving them the methods of
> ufuncs as well (as long as the signature is right --- 2 inputs and 1
> output).

Ah. You got it in one: I was missing the methods. It would be handy to
have them back, not least because then I could just remember the rule
"all binary ufuncs have .outer()".

Do ternary ufuncs support outer()? It would presumably just generate a
higher-rank array, for example
U.outer(arange(10),arange(11),arange(12)) would produce an array of
shape (10,11,12)... maybe there aren't any ternary ufuncs yet, apart
from the ones that are generated by vectorize(). I suppose ix_
provides an alternative, so that you could have

    def outer(self, *args):
        # ix_ returns a tuple of arrays; unpack it so each is its own argument
        return self(*ix_(*args))

Still, I think for conceptual tidiness it would be nice if the ufuncs
vectorize() makes supported the methods.
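
A minimal sketch of the ix_ idea above, assuming modern NumPy imported as `np`. The helper name `generalized_outer` and the three-argument function `clamp` are hypothetical, chosen only for illustration. The arrays returned by `np.ix_` are unpacked with `*` so that each one is passed as a separate argument.

```python
import numpy as np

def generalized_outer(func, *args):
    """Apply a broadcasting callable to every combination of the 1-D inputs."""
    # np.ix_ reshapes the k-th input so that it varies only along axis k.
    return func(*np.ix_(*args))

# Hypothetical ternary scalar function, wrapped so that it broadcasts.
clamp = np.vectorize(lambda x, lo, hi: min(max(x, lo), hi))

result = generalized_outer(clamp, np.arange(10), np.arange(11), np.arange(12))
print(result.shape)  # one axis per input
```

The same helper works for any number of inputs, since broadcasting combines the reshaped arrays into one axis per argument.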

Thanks,
Anne
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] "Extended" Outer Product

2007-09-19 Thread Travis E. Oliphant
Anne Archibald wrote:
> On 21/08/07, Timothy Hochberg <[EMAIL PROTECTED]> wrote:
>
>   
>> This is just a general comment on recent threads of this type and not
>> directed specifically at Chuck or anyone else.
>>
>> IMO, the emphasis on avoiding FOR loops at all costs is misplaced. It is
>> often more memory friendly and thus faster to vectorize only the inner loop
>> and leave outer loops alone. Everything varies with the specific case of
>> course, but trying to avoid FOR loops on principle is not a good strategy.
>> 
>
> Yes and no. From a performance point of view, you are certainly right;
> vectorizing is definitely not always a speedup. But for me, the main
> advantage of vectorized operations is generally clarity: C = A*B is
> clearer and simpler than C = [a*b for (a,b) in zip(A,B)]. When it's
> not clearer and simpler, I feel no compunction about falling back to
> list comprehensions and for loops.
>
> That said, it would often be nice to have something like
> map(f,arange(10)) for arrays; the best I've found is
> vectorize(f)(arange(10)).
>
> vectorize, of course, is a good example of my point above: it really
> just loops, in python IIRC, but conceptually it's extremely handy for
> doing exactly what the OP wanted. Unfortunately vectorize() does not
> yield a sufficiently ufunc-like object to support .outer(), as that
> would be extremely tidy.
>   
I'm not sure what you mean by sufficiently ufunc-like.  In fact, 
vectorize is a ufunc (it's just an object-based one).  Thus, it should 
produce what you want (as long as you use newaxis so that the 
broadcasting is done).  If you just want it to support the .outer
method, that could easily be done (since under the covers it is a real ufunc).

I just overlooked adding these methods to the result of vectorize.
The purpose of vectorize is to create a ufunc out of a scalar-based 
function, so I don't see any problem in giving them the methods of 
ufuncs as well (as long as the signature is right --- 2 inputs and 1 
output).
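
A short sketch of the newaxis approach described above, assuming modern NumPy imported as `np`. The function `scalar_f` is a hypothetical stand-in for a scalar-only function (it branches, so it cannot be applied to arrays directly).

```python
import numpy as np

def scalar_f(a, b):
    # Branching on a scalar value: this would fail on whole arrays.
    return a + b if a < 2 else a * b

vf = np.vectorize(scalar_f)

x = np.array([1, 2, 3])
y = np.array([10, 100, 1000])

# x[:, np.newaxis] has shape (3, 1); broadcasting against y (shape (3,))
# yields one result per (x[i], y[j]) pair, i.e. an outer-product layout.
table = vf(x[:, np.newaxis], y)
print(table)
```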

-Travis



Re: [Numpy-discussion] "Extended" Outer Product

2007-08-22 Thread Gael Varoquaux
On Tue, Aug 21, 2007 at 02:14:00PM -0700, Timothy Hochberg wrote:
>I suppose someone should fix that someday. However, I still think
>vectorize is an attractive nuisance in the sense that someone has a
>function that they want to apply to an array and they get sucked into
>throwing vectorize at the problem. More often than not, vectorize makes
>things slower than they need to be. If you don't care about performance,
>that's fine, but I live in fear of code like:

>   def f(a, b):
>       return sin(a*b + a**2)
>   f = vectorize(f)

>The original function f is a perfectly acceptable vectorized function
>(assuming one uses numpy.sin), but now it's been replaced by a slower
>version by passing it through vectorize. To be sure, this isn't always the
>case; in cases where you have to make choices, things get messier. Still,
>I'm not convinced that vectorize doesn't hurt more than it helps.

I often have code that runs through a large number of nested loops,
something like:

# A function to return the optical field at each point (pseudocode):

def optical_field( (x, y, z) ):
    loop over an array of laser wave-vectors
    return optical field

# Evaluate the optical field on a grid to plot it:

x, y, z = mgrid[-10:10, -10:10, -10:10]
field = optical_field( (x, y, z) )

In such code every single operation could be vectorized, but the
problem is that each function assumes the input array to be of a certain
dimension: I may be using some code like:

    r = c_[x, y, z]
    cross(r, r_o)

So implementing loops with arrays is not that convenient, because I have
to add dimensions to my arrays, and to make sure that my inner functions
are robust to these extra dimensions.

Looking at some of my code where I had this kind of problem, I see
functions similar to:

def delta(r, v, k):
    # Parentheses keep the multi-line sum a single expression
    return (dot(r, transpose(k))
            + Gaussian_beam(r)
            + dot(v, transpose(k)))

I am starting to realize that the real problem is that there is no
information about what the expected shapes of the input and output
arguments should be. Given such information, the function could reshape
its input and output arguments.

Maybe some clever decorators could be written to address this issue,
something like:

@inputsize( (3, -1), (3, -1), (3, -1) )

which would reshape every input positional argument to the shape given in
the list of shapes, and reshape the output argument to the shape of the
first input argument.
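
A rough sketch of how such a decorator might look, assuming modern NumPy imported as `np`. The name `inputsize` comes from the text above; everything else (the wrapper, the example function `add_vectors`) is hypothetical. It also assumes the output has as many elements as the first input, otherwise the final reshape fails.

```python
import functools
import numpy as np

def inputsize(*shapes):
    """Hypothetical decorator: reshape each positional argument to a given shape."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args):
            arrays = [np.asarray(a) for a in args]
            # Reshape argument k to shapes[k]; -1 lets NumPy infer that axis.
            reshaped = [a.reshape(s) for a, s in zip(arrays, shapes)]
            out = np.asarray(func(*reshaped))
            # Give the result the layout of the first input, as proposed above.
            return out.reshape(arrays[0].shape)
        return wrapper
    return decorator

@inputsize((3, -1), (3, -1))
def add_vectors(r, v):
    # The body can rely on both inputs having shape (3, N).
    return r + v

r = np.zeros((3, 4, 5))
v = np.ones((3, 4, 5))
out = add_vectors(r, v)
print(out.shape)
```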

Since I worked around these problems in my code, I cannot say whether these
decorators would get rid of them (I had not had the idea at the time). I
like the idea, though, and I will try it next time I run into these problems.

I just wanted to point out that replacing for loops with arrays is not
always that simple, and that using "vectorize" is sometimes a quick and
dirty way to get things done.

Gaël


Re: [Numpy-discussion] "Extended" Outer Product

2007-08-21 Thread Robert Kern
Timothy Hochberg wrote:
> On 8/21/07, Anne Archibald <[EMAIL PROTECTED]> wrote:
>
> > but conceptually it's extremely handy for
> > doing exactly what the OP wanted. Unfortunately vectorize() does not
> > yield a sufficiently ufunc-like object to support .outer(), as that
> > would be extremely tidy.
>
> I suppose someone should fix that someday.

Not much to fix. There is already frompyfunc() which does make a real ufunc.
However, (and it's a big "however"), those ufuncs only output object arrays.
That's why I didn't mention it earlier.
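
A small sketch of the frompyfunc() route and its object-array caveat, assuming modern NumPy imported as `np`. The function `scalar_f` is a hypothetical example; the explicit `astype` call is one way to get back to a numeric dtype when the results are known to be integers.

```python
import numpy as np

def scalar_f(a, b):
    # Any plain Python function of two scalars.
    return a * 10 + b

# frompyfunc(func, nin, nout) builds a real ufunc, so .outer is available.
uf = np.frompyfunc(scalar_f, 2, 1)

obj_table = uf.outer([1, 2, 3], [10, 100, 1000])
print(obj_table.dtype)  # object, as noted above

# Convert explicitly once the element type is known.
table = obj_table.astype(np.int64)
print(table)
```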

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
 that is made terrible by our own mad attempt to interpret it as though it had
 an underlying truth."
  -- Umberto Eco


Re: [Numpy-discussion] "Extended" Outer Product

2007-08-21 Thread Timothy Hochberg
On 8/21/07, Anne Archibald <[EMAIL PROTECTED]> wrote:
>
> On 21/08/07, Timothy Hochberg <[EMAIL PROTECTED]> wrote:
>
> > This is just a general comment on recent threads of this type and not
> > directed specifically at Chuck or anyone else.
> >
> > IMO, the emphasis on avoiding FOR loops at all costs is misplaced. It is
> > often more memory friendly and thus faster to vectorize only the inner loop
> > and leave outer loops alone. Everything varies with the specific case of
> > course, but trying to avoid FOR loops on principle is not a good strategy.
>
> Yes and no. From a performance point of view, you are certainly right;
> vectorizing is definitely not always a speedup. But for me, the main
> advantage of vectorized operations is generally clarity: C = A*B is
> clearer and simpler than C = [a*b for (a,b) in zip(A,B)]. When it's
> not clearer and simpler, I feel no compunction about falling back to
> list comprehensions and for loops.


I always assume that in these cases performance is a driver of the question.
It would be straightforward to code an outer equivalent in Python to hide
this for anyone who cares. Since no one who asks these questions ever does,
I assume they must be primarily motivated by performance.

> That said, it would often be nice to have something like
> map(f,arange(10)) for arrays; the best I've found is
> vectorize(f)(arange(10)).
>
> vectorize, of course, is a good example of my point above: it really
> just loops, in python IIRC,


I used to think that too, but then I looked at it and I believe it actually
grabs the code object out of the function and loops in C. You still have to
run the code object at each point though so it's not that fast. It's been a
while since I did that looking so I may be totally wrong.

> but conceptually it's extremely handy for
> doing exactly what the OP wanted. Unfortunately vectorize() does not
> yield a sufficiently ufunc-like object to support .outer(), as that
> would be extremely tidy.


I suppose someone should fix that someday. However, I still think vectorize
is an attractive nuisance in the sense that someone has a function that they
want to apply to an array and they get sucked into throwing vectorize at the
problem. More often than not, vectorize makes things slower than they need
to be. If you don't care about performance, that's fine, but I live in fear
of code like:

   def f(a, b):
       return sin(a*b + a**2)
   f = vectorize(f)

The original function f is a perfectly acceptable vectorized function
(assuming one uses numpy.sin), but now it's been replaced by a slower
version by passing it through vectorize. To be sure, this isn't always the
case; in cases where you have to make choices, things get messier. Still,
I'm not convinced that vectorize doesn't hurt more than it helps.
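
To illustrate the point, a brief sketch assuming modern NumPy imported as `np`: the function below is built only from operators and a ufunc, so it already broadcasts, and wrapping it in vectorize gives the same values while calling the Python function once per element. The input arrays are arbitrary examples; no timings are claimed here since they depend on the machine and array sizes.

```python
import numpy as np

def f(a, b):
    # Operators and np.sin all broadcast, so f works on arrays as written.
    return np.sin(a * b + a ** 2)

a = np.linspace(0.0, 1.0, 4)[:, np.newaxis]   # shape (4, 1)
b = np.linspace(0.0, 2.0, 5)                  # shape (5,)

direct = f(a, b)                  # whole-array arithmetic
wrapped = np.vectorize(f)(a, b)   # same values, one Python call per element

print(direct.shape, np.allclose(direct, wrapped))
```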



-- 
.  __
.   |-\
.
.  [EMAIL PROTECTED]


Re: [Numpy-discussion] "Extended" Outer Product

2007-08-21 Thread Anne Archibald
On 21/08/07, Timothy Hochberg <[EMAIL PROTECTED]> wrote:

> This is just a general comment on recent threads of this type and not
> directed specifically at Chuck or anyone else.
>
> IMO, the emphasis on avoiding FOR loops at all costs is misplaced. It is
> often more memory friendly and thus faster to vectorize only the inner loop
> and leave outer loops alone. Everything varies with the specific case of
> course, but trying to avoid FOR loops on principle is not a good strategy.

Yes and no. From a performance point of view, you are certainly right;
vectorizing is definitely not always a speedup. But for me, the main
advantage of vectorized operations is generally clarity: C = A*B is
clearer and simpler than C = [a*b for (a,b) in zip(A,B)]. When it's
not clearer and simpler, I feel no compunction about falling back to
list comprehensions and for loops.

That said, it would often be nice to have something like
map(f,arange(10)) for arrays; the best I've found is
vectorize(f)(arange(10)).

vectorize, of course, is a good example of my point above: it really
just loops, in python IIRC, but conceptually it's extremely handy for
doing exactly what the OP wanted. Unfortunately vectorize() does not
yield a sufficiently ufunc-like object to support .outer(), as that
would be extremely tidy.

Anne


Re: [Numpy-discussion] "Extended" Outer Product

2007-08-21 Thread Timothy Hochberg
On 8/21/07, Geoffrey Zhu <[EMAIL PROTECTED]> wrote:
>
> On 8/21/07, Timothy Hochberg <[EMAIL PROTECTED]> wrote:
> >
> >
> >
> > On 8/21/07, Charles R Harris <[EMAIL PROTECTED]> wrote:
> > >
> > >
> > >
> > > On 8/20/07, Geoffrey Zhu < [EMAIL PROTECTED]> wrote:
> > > > Hi Everyone,
> > > >
> > > > I am wondering if there is an "extended" outer product. Take the
> > > > example in "Guide to Numpy." Instead of doing an multiplication, I
> > > > want to call a custom function for each pair.
> > > >
> > > > >>> print outer([1,2,3],[10,100,1000])
> > > >
> > > > [[ 10 100 1000]
> > > > [ 20 200 2000]
> > > > [ 30 300 3000]]
> > > >
> > > >
> > > > So I want:
> > > >
> > > > [
> > > > [f(1,10), f(1,100), f(1,1000)],
> > > > [f(2,10), f(2, 100), f(2, 1000)],
> > > > [f(3,10), f(3, 100), f(3,1000)]
> > > > ]
> > >
> > >
> > > Maybe something like
> > >
> > > In [15]: f = lambda x,y : x*sin(y)
> > >
> > > In [16]: a = array([[f(i,j) for i in range(3)] for j in range(3)])
> > >
> > > In [17]: a
> > > Out[17]:
> > > > array([[ 0.,  0.,  0.],
> > > >        [ 0.,  0.84147098,  1.68294197],
> > > >        [ 0.,  0.90929743,  1.81859485]])
> > > >
> > > > I don't know if nested list comprehensions are faster than two nested
> > > > loops, but at least they avoid array indexing.
> >
> > This is just a general comment on recent threads of this type and not
> > directed specifically at Chuck or anyone else.
> >
> > > IMO, the emphasis on avoiding FOR loops at all costs is misplaced. It is
> > > often more memory friendly and thus faster to vectorize only the inner loop
> > > and leave outer loops alone. Everything varies with the specific case of
> > > course, but trying to avoid FOR loops on principle is not a good strategy.
> >
>
> I agree. My original post asked for solutions without using two nested
> for loops because I already know the two for loop solution. Besides, I
> was hoping that some version of 'outer' will take in a function
> reference and call the function instead of doing multiplication.


A specific example would help here. There are ways to deal with certain
subclasses of problems that won't necessarily generalize. For example, are
you aware of the outer methods on ufuncs (add.outer, subtract.outer, etc.)?
Typical dimensions also matter, since some approaches work well for certain
shapes, but are pretty miserable for others. FWIW, I often have very good
luck with removing the inner for-loop in favor of vector operations. This
tends to be simpler than trying to vectorize everything and often has better
performance since it's often more memory friendly. However, it all depends
on specifics of the problem.
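
A brief sketch of both suggestions, assuming modern NumPy imported as `np`: the built-in ufunc outer methods, and a loop over the outer dimension with only the inner loop vectorized. The helper `f_row` is a hypothetical example function.

```python
import numpy as np

x = np.array([1, 2, 3])
y = np.array([10, 100, 1000])

# Built-in binary ufuncs already provide .outer
sums = np.add.outer(x, y)         # sums[i, j] == x[i] + y[j]
diffs = np.subtract.outer(x, y)   # diffs[i, j] == x[i] - y[j]

def f_row(xi, y):
    # Vectorized over y only; xi is a single scalar.
    return xi * np.sin(y)

# Plain Python loop over the outer dimension, vector operations inside.
rows = np.array([f_row(xi, y) for xi in x])

print(sums)
print(rows.shape)
```

Only one row of temporaries is alive at a time inside the loop, which is the memory-friendliness referred to above.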

Regards,

-tim





-- 
.  __
.   |-\
.
.  [EMAIL PROTECTED]


Re: [Numpy-discussion] "Extended" Outer Product

2007-08-21 Thread Geoffrey Zhu
On 8/21/07, Timothy Hochberg <[EMAIL PROTECTED]> wrote:
>
>
>
> On 8/21/07, Charles R Harris <[EMAIL PROTECTED]> wrote:
> >
> >
> >
> > On 8/20/07, Geoffrey Zhu < [EMAIL PROTECTED]> wrote:
> > > Hi Everyone,
> > >
> > > I am wondering if there is an "extended" outer product. Take the
> > > example in "Guide to Numpy." Instead of doing an multiplication, I
> > > want to call a custom function for each pair.
> > >
> > > >>> print outer([1,2,3],[10,100,1000])
> > >
> > > [[ 10 100 1000]
> > > [ 20 200 2000]
> > > [ 30 300 3000]]
> > >
> > >
> > > So I want:
> > >
> > > [
> > > [f(1,10), f(1,100), f(1,1000)],
> > > [f(2,10), f(2, 100), f(2, 1000)],
> > > [f(3,10), f(3, 100), f(3,1000)]
> > > ]
> >
> >
> > Maybe something like
> >
> > In [15]: f = lambda x,y : x*sin(y)
> >
> > In [16]: a = array([[f(i,j) for i in range(3)] for j in range(3)])
> >
> > In [17]: a
> > Out[17]:
> > > array([[ 0.,  0.,  0.],
> > >        [ 0.,  0.84147098,  1.68294197],
> > >        [ 0.,  0.90929743,  1.81859485]])
> > >
> > > I don't know if nested list comprehensions are faster than two nested
> > > loops, but at least they avoid array indexing.
>
> This is just a general comment on recent threads of this type and not
> directed specifically at Chuck or anyone else.
>
> IMO, the emphasis on avoiding FOR loops at all costs is misplaced. It is
> often more memory friendly and thus faster to vectorize only the inner loop
> and leave outer loops alone. Everything varies with the specific case of
> course, but trying to avoid FOR loops on principle is not a good strategy.
>

I agree. My original post asked for solutions without two nested
for loops because I already know the two-loop solution. Besides, I
was hoping that some version of 'outer' would take a function
reference and call the function instead of doing multiplication.


Re: [Numpy-discussion] "Extended" Outer Product

2007-08-21 Thread Timothy Hochberg
On 8/21/07, Charles R Harris <[EMAIL PROTECTED]> wrote:
>
>
>
> On 8/20/07, Geoffrey Zhu <[EMAIL PROTECTED]> wrote:
> >
> > Hi Everyone,
> >
> > I am wondering if there is an "extended" outer product. Take the
> > example in "Guide to Numpy." Instead of doing an multiplication, I
> > want to call a custom function for each pair.
> >
> > >>> print outer([1,2,3],[10,100,1000])
> >
> > [[ 10 100 1000]
> > [ 20 200 2000]
> > [ 30 300 3000]]
> >
> >
> > So I want:
> >
> > [
> > [f(1,10), f(1,100), f(1,1000)],
> > [f(2,10), f(2, 100), f(2, 1000)],
> > [f(3,10), f(3, 100), f(3,1000)]
> > ]
>
>
> Maybe something like
>
> In [15]: f = lambda x,y : x*sin(y)
>
> In [16]: a = array([[f(i,j) for i in range(3)] for j in range(3)])
>
> In [17]: a
> Out[17]:
> array([[ 0.,  0.,  0.],
>        [ 0.,  0.84147098,  1.68294197],
>        [ 0.,  0.90929743,  1.81859485]])
>
> I don't know if nested list comprehensions are faster than two nested
> loops, but at least they avoid array indexing.
>

This is just a general comment on recent threads of this type and not
directed specifically at Chuck or anyone else.

IMO, the emphasis on avoiding FOR loops at all costs is misplaced. It is
often more memory friendly and thus faster to vectorize only the inner loop
and leave outer loops alone. Everything varies with the specific case of
course, but trying to avoid FOR loops on principle is not a good strategy.


-- 
.  __
.   |-\
.
.  [EMAIL PROTECTED]


Re: [Numpy-discussion] "Extended" Outer Product

2007-08-21 Thread Charles R Harris
On 8/20/07, Geoffrey Zhu <[EMAIL PROTECTED]> wrote:
>
> Hi Everyone,
>
> I am wondering if there is an "extended" outer product. Take the
> example in "Guide to Numpy." Instead of doing a multiplication, I
> want to call a custom function for each pair.
>
> >>> print outer([1,2,3],[10,100,1000])
>
> [[  10  100 1000]
>  [  20  200 2000]
>  [  30  300 3000]]
>
>
> So I want:
>
> [
> [f(1,10), f(1,100), f(1,1000)],
> [f(2,10), f(2, 100), f(2, 1000)],
> [f(3,10), f(3, 100), f(3,1000)]
> ]


Maybe something like

In [15]: f = lambda x,y : x*sin(y)

In [16]: a = array([[f(i,j) for i in range(3)] for j in range(3)])

In [17]: a
Out[17]:
array([[ 0.,  0.,  0.],
       [ 0.,  0.84147098,  1.68294197],
       [ 0.,  0.90929743,  1.81859485]])

I don't know if nested list comprehensions are faster than two nested loops,
but at least they avoid array indexing.

Chuck


Re: [Numpy-discussion] "Extended" Outer Product

2007-08-20 Thread Charles R Harris
On 8/20/07, Geoffrey Zhu <[EMAIL PROTECTED]> wrote:
>
> Hi Everyone,
>
> I am wondering if there is an "extended" outer product. Take the
> example in "Guide to Numpy." Instead of doing a multiplication, I
> want to call a custom function for each pair.
>
> >>> print outer([1,2,3],[10,100,1000])
>
> [[  10  100 1000]
>  [  20  200 2000]
>  [  30  300 3000]]
>
>
> So I want:
>
> [
> [f(1,10), f(1,100), f(1,1000)],
> [f(2,10), f(2, 100), f(2, 1000)],
> [f(3,10), f(3, 100), f(3,1000)]
> ]


You could make two matrices like so:

In [46]: a = arange(3)

In [47]: b = a.reshape(1,3).repeat(3,0)

In [48]: c = a.reshape(3,1).repeat(3,1)

In [49]: b
Out[49]:
array([[0, 1, 2],
       [0, 1, 2],
       [0, 1, 2]])

In [50]: c
Out[50]:
array([[0, 0, 0],
       [1, 1, 1],
       [2, 2, 2]])

which will give you all pairs. You can then make a function of these in
various ways, for example

In [52]: c**b
Out[52]:
array([[1, 0, 0],
       [1, 1, 1],
       [1, 2, 4]])

That is a bit clumsy, though. I don't know how to do what you want in a
direct way.
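
For comparison, a short sketch assuming modern NumPy imported as `np`: the same two index matrices can be produced with `np.meshgrid(..., indexing="ij")`, and for functions built from operators and ufuncs, broadcasting gives the same result without materializing them at all.

```python
import numpy as np

a = np.arange(3)

# The repeat-based construction from the session above.
b = a.reshape(1, 3).repeat(3, 0)   # b[i, j] == a[j]
c = a.reshape(3, 1).repeat(3, 1)   # c[i, j] == a[i]

# meshgrid with "ij" indexing returns the same pair of matrices.
c_grid, b_grid = np.meshgrid(a, a, indexing="ij")

# Broadcasting a column against a row avoids building b and c.
power = a[:, np.newaxis] ** a

print(power)
```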

Chuck


Re: [Numpy-discussion] "Extended" Outer Product

2007-08-20 Thread Christopher Barker
Robert Kern wrote:
> If you can code your function such that it only uses operations that broadcast
> (i.e. operators and ufuncs) and avoids things like branching or loops, then you
> can just use numpy.newaxis on the first array.
> 
>   from numpy import array, newaxis
>   x = array([1, 2, 3])
>   y = array([10, 100, 1000])
>   f(x[:,newaxis], y)

In fact, it may make sense to just make your x a column vector anyway:
 >>> x
 array([1, 2, 3])
 >>> y
 array([10, 11, 12])
 >>> x.shape = (-1,1)
 >>> x
 array([[1],
        [2],
        [3]])
 >>> x * y
 array([[10, 11, 12],
        [20, 22, 24],
        [30, 33, 36]])

Broadcasting is VERY cool!

-Chris


-- 
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R(206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115   (206) 526-6317   main reception

[EMAIL PROTECTED]


Re: [Numpy-discussion] "Extended" Outer Product

2007-08-20 Thread Robert Kern
Geoffrey Zhu wrote:
> Hi Everyone,
> 
> I am wondering if there is an "extended" outer product. Take the
> example in "Guide to Numpy." Instead of doing a multiplication, I
> want to call a custom function for each pair.
> 
> >>> print outer([1,2,3],[10,100,1000])
> 
> [[  10  100 1000]
>  [  20  200 2000]
>  [  30  300 3000]]
> 
> 
> So I want:
> 
> [
>  [f(1,10), f(1,100), f(1,1000)],
>  [f(2,10), f(2, 100), f(2, 1000)],
>  [f(3,10), f(3, 100), f(3,1000)]
> ]
> 
> Does anyone know how to do this without using a double loop?

If you can code your function such that it only uses operations that broadcast
(i.e. operators and ufuncs) and avoids things like branching or loops, then you
can just use numpy.newaxis on the first array.

  from numpy import array, newaxis
  x = array([1, 2, 3])
  y = array([10, 100, 1000])
  f(x[:,newaxis], y)

Otherwise, you can use numpy.vectorize() to turn your function into one that
will do that broadcasting for you.

  from numpy import array, newaxis, vectorize
  x = array([1, 2, 3])
  y = array([10, 100, 1000])
  f = vectorize(f)
  f(x[:,newaxis], y)
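
A sketch of the first option applied to a function that branches, assuming modern NumPy imported as `np`. Both function names are hypothetical. Here the scalar `if` is replaced by `np.where`, which is a swapped-in technique not mentioned above; note that `np.where` evaluates both alternatives for every element, so it only suits branches that are safe to compute everywhere.

```python
import numpy as np

def f_scalar(a, b):
    # Scalar-only version: the "if" cannot be applied to a whole array.
    if a % 2 == 0:
        return a + b
    return a * b

def f_broadcast(a, b):
    # Same logic expressed with operations that broadcast.
    return np.where(a % 2 == 0, a + b, a * b)

x = np.array([1, 2, 3])
y = np.array([10, 100, 1000])

via_where = f_broadcast(x[:, np.newaxis], y)
via_vectorize = np.vectorize(f_scalar)(x[:, np.newaxis], y)

print(via_where)
```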

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
 that is made terrible by our own mad attempt to interpret it as though it had
 an underlying truth."
  -- Umberto Eco