Re: [Numpy-discussion] speed of numpy.ndarray compared to Numeric.array

2011-01-10 Thread EMMEL Thomas
To John:

 Did you try larger arrays/tuples? I would guess that makes a significant
 difference.

No I didn't, because these values are coordinates in 3D (x, y, z).
In fact I work with a list/array/tuple of arrays with 10 to 1M elements
or more.
What I need to do is calculate the distance of each of these elements
(coordinates) to a given coordinate and filter for the nearest.
The brute force method would look like this:


#~
from math import sqrt
from operator import itemgetter

def bruteForceSearch(points, point):
    minpt = min([(vec2Norm(pt, point), pt, i)
                 for i, pt in enumerate(points)], key=itemgetter(0))
    return sqrt(minpt[0]), minpt[1], minpt[2]

#~~
def vec2Norm(pt1, pt2):
    xDis = pt1[0] - pt2[0]
    yDis = pt1[1] - pt2[1]
    zDis = pt1[2] - pt2[2]
    return xDis*xDis + yDis*yDis + zDis*zDis

I have a more clever method, but it still takes a lot of time in the
vec2Norm function.
If you like I can attach a running example.
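For reference, a self-contained run of the brute-force version (the sample coordinates below are made up for illustration) looks like this:

```python
from math import sqrt
from operator import itemgetter

def vec2Norm(pt1, pt2):
    # squared euclidean distance between two 3D points
    xDis = pt1[0] - pt2[0]
    yDis = pt1[1] - pt2[1]
    zDis = pt1[2] - pt2[2]
    return xDis*xDis + yDis*yDis + zDis*zDis

def bruteForceSearch(points, point):
    # smallest (squared distance, point, index) triple, compared on distance
    minpt = min([(vec2Norm(pt, point), pt, i)
                 for i, pt in enumerate(points)], key=itemgetter(0))
    return sqrt(minpt[0]), minpt[1], minpt[2]

points = [(0., 0., 0.), (1., 1., 1.), (2., 2., 2.)]
dist, nearest, idx = bruteForceSearch(points, (0.9, 1.0, 1.1))
# nearest is (1., 1., 1.) at index 1
```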

To Ben:

 Don't know how much of an impact it would have, but those timeit statements
 for array creation include the import process, which are going to be
 different for each module and are probably not indicative of the speed of
 array creation.

No, each timeit statement counts only the time for the statement in its
first argument; the import in the setup argument isn't included in the
timing.
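A quick check makes this visible: the setup string runs once, outside the timed loop, so even an expensive setup barely shows up in the result (the 0.2 s sleep below is an arbitrary illustration):

```python
import timeit

# the setup sleeps for 0.2 s, but only the statement 'x + 1' is timed
t = timeit.Timer('x + 1',
                 'import time; time.sleep(0.2); x = 1').timeit(number=1000)
print(t)  # far below 0.2 s, so the setup cost is not counted
```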

Thomas

This email and any attachments are intended solely for the use of the 
individual or entity to whom it is addressed and may be confidential and/or 
privileged. If you are not one of the named recipients or have received this 
email in error, (i) you should not read, disclose, or copy it, (ii) please 
notify sender of your receipt by reply email and delete this email and all 
attachments, (iii) Dassault Systemes does not accept or assume any liability or 
responsibility for any use of or reliance on this email. For other languages, go 
to http://www.3ds.com/terms/email-disclaimer.
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] speed of numpy.ndarray compared to Numeric.array

2011-01-10 Thread Sebastian Berg
Hey,

On Mon, 2011-01-10 at 08:09 +, EMMEL Thomas wrote:
 #~
 def bruteForceSearch(points, point):
     minpt = min([(vec2Norm(pt, point), pt, i)
                  for i, pt in enumerate(points)], key=itemgetter(0))
     return sqrt(minpt[0]), minpt[1], minpt[2]

 #~~
 def vec2Norm(pt1, pt2):
     xDis = pt1[0] - pt2[0]
     yDis = pt1[1] - pt2[1]
     zDis = pt1[2] - pt2[2]
     return xDis*xDis + yDis*yDis + zDis*zDis

 I have a more clever method but it still takes a lot of time in the
 vec2Norm function.
 If you like I can attach a running example.
 

if you use the vec2Norm function as you wrote it there, this code is not
vectorized at all, so of course numpy would be slowest: it has the most
overhead and no advantages for non-vectorized code. You simply can't write
Python code like that and expect it to be fast for these kinds of
calculations.

Your function should look more like this:

import numpy as np
from math import sqrt

def bruteForceSearch(points, point):
    dists = points - point
    # this may need point[None, :] or such for broadcasting to work
    dists *= dists
    dists = dists.sum(1)
    I = np.argmin(dists)
    return sqrt(dists[I]), points[I], I

If points is small, this may not help much (though compared to this
exact code my guess is it probably would); if points is larger it should
speed things up tremendously (unless you run into RAM problems). It may
be that you need to fiddle around with axes, I did not check the code.
If this is not good enough for you, you will need to port it (and maybe
the next outer loop as well) to Cython, or write it in C/C++ and make
sure the compiler can optimize it properly. Also I think somewhere in
scipy there are some distance tools that may already be in C and nicely
fast, but I am not sure.
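The scipy tools in question live in scipy.spatial; a minimal sketch of the same nearest-neighbour query with a kd-tree (this assumes scipy is installed; the data below is just example input):

```python
import numpy as np
from scipy.spatial import cKDTree  # assumes scipy is available

rng = np.random.RandomState(0)
points = rng.rand(1000, 3)          # 1000 random 3D coordinates
point = np.array([0.5, 0.5, 0.5])

tree = cKDTree(points)       # build once, then reuse for many queries
dist, i = tree.query(point)  # euclidean distance and index of the nearest point
```

Building the tree costs O(n log n) once, after which each query is roughly O(log n), which is what makes it attractive when many nearest-point searches run against the same point set.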

I hope I got this right and it helps,

Sebastian

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] speed of numpy.ndarray compared to Numeric.array

2011-01-10 Thread EMMEL Thomas
Hey back...
 
 [Sebastian's reply quoted in full; snipped]

I see the point, and it was very helpful for understanding the behavior of the 
arrays a bit better. Your approach improved bruteForceSearch, which is now up 
to 6 times faster.
But in the case of a leaf in a kd-tree you end up with 50, 20, 10 or fewer 
points, and there the speed-up is reversed. In this particular case 34000 runs 
take 90s with your method and 50s with mine (not the brute force).
I now see the limits of the arrays, but of course I also see the chances, and, 
coming back to my original question, it seems that Numeric arrays were faster 
for my kind of application, though they might be slower for larger amounts of 
data.

Regards

Thomas





Re: [Numpy-discussion] speed of numpy.ndarray compared to Numeric.array

2011-01-10 Thread Pascal
Hi,

On 01/10/2011 09:09 AM, EMMEL Thomas wrote:

 No I didn't, due to the fact that these values are coordinates in 3D (x,y,z).
 In fact I work with a list/array/tuple of arrays with 10 to 1M of 
 elements or more.
 What I need to do is to calculate the distance of each of these elements 
 (coordinates)
 to a given coordinate and filter for the nearest.
 The brute force method would look like this:


 [bruteForceSearch / vec2Norm code snipped]


I am not sure I understood the problem properly, but here is what I would 
use to calculate distances from horizontally stacked vectors (big):

import numpy

ref = numpy.array([0.1, 0.2, 0.3])
big = numpy.random.randn(100, 3)

big = numpy.add(big, -ref)
distsquared = numpy.sum(big**2, axis=1)
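To then pick out the nearest point from distsquared, a possible follow-up (a sketch; variable names are made up) could be:

```python
import numpy

ref = numpy.array([0.1, 0.2, 0.3])
big = numpy.random.randn(100, 3)

diff = big - ref
distsquared = numpy.sum(diff**2, axis=1)

i = numpy.argmin(distsquared)       # index of the nearest row
nearest = big[i]
dist = numpy.sqrt(distsquared[i])   # actual distance, if needed
```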

Pascal


Re: [Numpy-discussion] speed of numpy.ndarray compared to Numeric.array

2011-01-10 Thread René Dudfield
Hi,

Spatial hashes are the common solution.

Another common optimization is using the squared distance for collision
detection, since you then don't need the expensive sqrt for that
calculation.
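A minimal sketch of a spatial hash in pure Python (the cell size and names below are made up): bucket each point by its integer grid cell, then only test candidates from the 27 cells around the query point.

```python
from collections import defaultdict

def build_grid(points, cell):
    """Map each integer grid cell to the indices of the points inside it."""
    grid = defaultdict(list)
    for i, (x, y, z) in enumerate(points):
        grid[(int(x // cell), int(y // cell), int(z // cell))].append(i)
    return grid

def candidates(grid, cell, point):
    """Indices of points in the query point's cell and its 26 neighbours."""
    cx, cy, cz = (int(c // cell) for c in point)
    found = []
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            for dz in (-1, 0, 1):
                found.extend(grid.get((cx + dx, cy + dy, cz + dz), ()))
    return found

points = [(0.1, 0.1, 0.1), (0.9, 0.9, 0.9), (5.0, 5.0, 5.0)]
grid = build_grid(points, cell=1.0)
near = candidates(grid, 1.0, (0.2, 0.2, 0.2))
# near contains indices 0 and 1, but not the far-away point 2
```

Only the surviving candidates then need the exact squared-distance test, so the expensive comparison runs on a handful of points instead of the whole set.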

cu.



On Mon, Jan 10, 2011 at 3:25 PM, Pascal pascal...@parois.net wrote:
 [Pascal's reply quoted in full; snipped]


[Numpy-discussion] speed of numpy.ndarray compared to Numeric.array

2011-01-07 Thread EMMEL Thomas
Hi,

There have been some discussions of the speed of numpy compared to Numeric on 
this list, but there is one topic I don't understand in detail; maybe someone 
can enlighten me...
I use Python 2.6 on a SuSE installation and tested this:

#Python 2.6 (r26:66714, Mar 30 2010, 00:29:28)
#[GCC 4.3.2 [gcc-4_3-branch revision 141291]] on linux2
#Type "help", "copyright", "credits" or "license" for more information.

import timeit

#creation of arrays and tuples (timeit number=1000000 by default)

timeit.Timer('a((1.,2.,3.))','from numpy import array as a').timeit()
#8.2061841487884521
timeit.Timer('a((1.,2.,3.))','from Numeric import array as a').timeit()
#9.6958281993865967
timeit.Timer('a((1.,2.,3.))','a=tuple').timeit()
#0.13814711570739746

#Result: tuples - of course - are much faster than arrays, and numpy is a bit
#faster in creating arrays than Numeric

#working with arrays

timeit.Timer('d=x1-x2;sum(d*d)','from Numeric import array as a; x1=a((1.,2.,3.));x2=a((2.,4.,6.))').timeit()
#3.263314962387085
timeit.Timer('d=x1-x2;sum(d*d)','from numpy import array as a; x1=a((1.,2.,3.));x2=a((2.,4.,6.))').timeit()
#9.7236979007720947

#Result: Numeric is three times faster than numpy! Why?

#working with components:

timeit.Timer('d0=x1[0]-x2[0];d1=x1[1]-x2[1];d2=x1[2]-x2[2];d0*d0+d1*d1+d2*d2','a=tuple; x1=a((1.,2.,3.));x2=a((2.,4.,6.))').timeit()
#0.64785194396972656
timeit.Timer('d0=x1[0]-x2[0];d1=x1[1]-x2[1];d2=x1[2]-x2[2];d0*d0+d1*d1+d2*d2','from numpy import array as a; x1=a((1.,2.,3.));x2=a((2.,4.,6.))').timeit()
#3.4181499481201172
timeit.Timer('d0=x1[0]-x2[0];d1=x1[1]-x2[1];d2=x1[2]-x2[2];d0*d0+d1*d1+d2*d2','from Numeric import array as a; x1=a((1.,2.,3.));x2=a((2.,4.,6.))').timeit()
#0.97426199913024902

Result: tuples are again the fastest variant, Numeric is faster than numpy, 
and both are faster than the variant above using the high-level functions!
Why?

For various reasons I need to use numpy in the future where I used Numeric 
before.
Is there any better solution in numpy I missed?

Kind regards and thanks in advance

Thomas





Re: [Numpy-discussion] speed of numpy.ndarray compared to Numeric.array

2011-01-07 Thread John Salvatier
Did you try larger arrays/tuples? I would guess that makes a significant
difference.

On Fri, Jan 7, 2011 at 7:58 AM, EMMEL Thomas thomas.em...@3ds.com wrote:

 [original post quoted in full; snipped]




Re: [Numpy-discussion] speed of numpy.ndarray compared to Numeric.array

2011-01-07 Thread Benjamin Root
On Fri, Jan 7, 2011 at 9:58 AM, EMMEL Thomas thomas.em...@3ds.com wrote:

 [original post quoted in full; snipped]


Don't know how much of an impact it would have, but those timeit statements
for array creation include the import process, which is going to be
different for each module and is probably not indicative of the speed of
array creation.

Ben Root