Re: [Numpy-discussion] dot() performance depends on data?

2010-09-11 Thread Hagen Fürstenau
> Anyway, seems it is indeed a denormal issue, as adding a small (1e-10)
> constant gives same speed for both timings.

Whether I add 1e-10 or clip to 0 at 1e-150, I still get a slowdown of
about 30% compared with the random arrays. Any explanation for that?

Cheers,
Hagen



___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] dot() performance depends on data?

2010-09-11 Thread Matthieu Brucher
Denormal numbers are a tricky beast. You may have to change the clip
or the shift depending on the processor you have.
It's no wonder that processors and thus compilers have options to
round denormals to zero.
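
A minimal sketch of the two workarounds mentioned in this thread (adding a small constant, or clipping tiny magnitudes to zero), assuming double-precision NumPy arrays. The helper names `count_subnormals` and `flush_small` are hypothetical, and the 1e-150 threshold is simply the value quoted in the thread; whether either workaround restores full speed depends on the processor and BLAS in use.

```python
import numpy as np

def count_subnormals(a):
    """Count nonzero entries whose magnitude is below the smallest normal float."""
    tiny = np.finfo(a.dtype).tiny  # smallest positive normal number
    mag = np.abs(a)
    return int(np.count_nonzero((mag > 0) & (mag < tiny)))

def flush_small(a, threshold=1e-150):
    """Return a copy of `a` with entries of magnitude below `threshold` set to 0."""
    out = a.copy()
    out[np.abs(out) < threshold] = 0.0
    return out

# Hypothetical data: ordinary values mixed with denormal (subnormal) ones.
a = np.array([1.0, 1e-310, -3e-320, 0.0, 2.5])
clipped = flush_small(a)   # clip to 0 below the threshold
shifted = a + 1e-10        # shift away from the denormal range
```

Counting subnormals before and after either step is a cheap way to confirm that the input really contains denormals before blaming them for a slowdown.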

Matthieu

-- 
Information System Engineer, Ph.D.
Blog: http://matt.eifelle.com
LinkedIn: http://www.linkedin.com/in/matthieubrucher


[Numpy-discussion] scan array to extract min-max values (with if condition)

2010-09-11 Thread Massimo Di Stefano
Hello All,

I need to extract the points of an array that lie inside a
rectangular area defined as:

N, S, E, W = 234560.94503118, 234482.56929822, 921336.53116178, 921185.3779625

The data are in a CSV (comma-delimited text file with 3 columns: X, Y, Z):

#X,Y,Z
3020081.5500,76.3100,0.0300
3020086.2000,769991.6500,0.4600
3020099.6600,769996.2700,0.9000
...
...

I read it using numpy.loadtxt.

Data:

http://www.geofemengineering.it/data/csv.txt (5.3 MB, 158735 rows)

To extract the data inside the bounding box (N, S, E, W), I'm using a
loop inside a function like this:

import numpy as np

def getMinMaxBB(data, N, S, E, W):
    mydata = data * 0.3048006096012
    for i in range(len(mydata)):
        if mydata[i,0] < E or mydata[i,0] > W or mydata[i,1] < N or mydata[i,1] > S:
            if i == 0:
                newdata = np.array((mydata[i,0], mydata[i,1], mydata[i,2]), float)
            else:
                newdata = np.vstack((newdata, (mydata[i,0], mydata[i,1], mydata[i,2])))
    results = {}
    results['Max_Z'] = newdata.max(0)[2]
    results['Min_Z'] = newdata.min(0)[2]
    results['Num_P'] = len(newdata)
    return results


N, S, E, W = 234560.94503118, 234482.56929822, 921336.53116178, 921185.3779625
data = '/Users/sasha/csv.txt'
mydata = np.loadtxt(data, comments='#', delimiter=',')
out = getMinMaxBB(mydata, N, S, E, W)

print(out)


This method works, but it is not very fast. Do you have any hints on how to
improve the code's performance?

thanks!!!

Massimo.


Re: [Numpy-discussion] scan array to extract min-max values (with if condition)

2010-09-11 Thread Brett Olsen
Use boolean arrays to index the parts of your array that you want to look at:

def newGetMinMax(data, N, S, E, W):
    mydata = data * 0.3048006096012
    mask = np.zeros(mydata.shape[0], dtype=bool)
    mask |= mydata[:,0] < E
    mask |= mydata[:,0] > W
    mask |= mydata[:,1] < N
    mask |= mydata[:,1] > S
    results = {}
    results['Max_Z'] = mydata[mask,2].max()
    results['Min_Z'] = mydata[mask,2].min()
    results['Num_P'] = mask.sum()
    return results

This runs about 5000 times faster on my machine.
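
For readers new to boolean indexing, here is a small sketch on made-up data (the array `pts` and its values are hypothetical, not taken from the posted file). Each comparison yields one boolean per row, `|=` ORs a condition into the mask in place, and indexing with the mask keeps only the rows where it is True.

```python
import numpy as np

# Made-up points, one (x, y, z) triple per row.
pts = np.array([[0.0, 0.0, 5.0],
                [1.0, 4.0, 7.0],
                [3.0, 1.0, 2.0],
                [6.0, 6.0, 9.0]])

mask = np.zeros(pts.shape[0], dtype=bool)  # start with nothing selected
mask |= pts[:, 0] > 2.0                    # OR in rows with x > 2
mask |= pts[:, 1] > 3.0                    # OR in rows with y > 3

selected_z = pts[mask, 2]        # z values of the selected rows
num_selected = int(mask.sum())   # True counts as 1
```

The whole selection runs in compiled NumPy loops, with no per-row Python overhead and no repeated `vstack` copies, which is plausibly where most of the speedup comes from.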

Brett


Re: [Numpy-discussion] scan array to extract min-max values (with if condition)

2010-09-11 Thread Massimo Di Stefano
That's awesome!

Masked arrays are definitely what I need!

Thanks for pointing them out to me!

best regards,

Massimo.




Re: [Numpy-discussion] ANNOUNCE: mahotas 0.5

2010-09-11 Thread Luis Pedro Coelho
On Friday, September 10, 2010 03:40:33 am Sebastian Haase wrote:
> Hi Luis,
>
> thanks for the announcement. How would you compare mahotas to scipy's
> ndimage ? Are you using ndimage in mahotas at all ?

Hi Sebastian,

In general there is little overlap (there are 1 or 2 functions which are 
replicated). I wrote mahotas mostly to get functions that I wasn't finding 
elsewhere (or, occasionally, to make them faster). Ndimage, in some ways, 
contains more basic functions, mahotas has a bit more advanced functions (at 
the cost of having a somewhat idiosyncratically chosen set of functions).

I do use ndimage in a couple of places (mostly to do convolution). So, they 
complement each other rather than compete.

Here are a couple of differences in philosophy:

 - ndimage is *always* n-D. Mahotas is mostly n-D but some functions are 
specialised to 2-D (patches always welcome if you want to extend them).

 - mahotas is written in templated C++, ndimage is C with macros.

 - ndimage tries to be very generic in its interfaces (which makes some of 
them hard to use at first), mahotas tries to have natural interfaces.

HTH,
Luis




Re: [Numpy-discussion] scan array to extract min-max values (with if condition)

2010-09-11 Thread Massimo Di Stefano
Brett,


I tried a different way to solve the problem, using:

import os

fpath = '/Users/sasha/py/'
input_fp = open( os.path.join(fpath, 'BE3730072600WC20050817.txt'), 'r' )
input_file = input_fp.readlines()

N = 234560.94503118
S = 234482.56929822
E = 921336.53116178
W = 921185.3779625

xL = []
yL = []
zL = []

for index, line in enumerate( input_file ):
    if index == 0:
        print('skipping header line...')
    else:
        x, y, z = line.split(',')
        xL.append( float(x) * 0.3048006096012 )
        yL.append( float(y) * 0.3048006096012 )
        zL.append( float(z) * 0.3048006096012 )

xLr = []
yLr = []
zLr = []

for coords in zip(xL, yL, zL):
    if W < coords[0] < E and S < coords[1] < N:
        xLr.append( coords[0] )
        yLr.append( coords[1] )
        zLr.append( coords[2] )

elements = len(xLr)
minZ = min(zLr)
maxZ = max(zLr)



Using the same input file I posted earlier,
it gives me 966 elements
instead of the 158734 elements given by your mask example.

The input file contains 158734 elements, which means the mask code:


   mask |= mydata[:,0] < E
   mask |= mydata[:,0] > W
   mask |= mydata[:,1] < N
   mask |= mydata[:,1] > S

is not working as expected.


Do you have any hints on how to get the mask code working?
As it is now, it picks all the points in the mydata array.
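
A likely explanation, sketched on made-up coordinates: since W is smaller than E, every x is either below E or above W, so the OR-ed mask is True for every row, whereas the chained comparison in the pure-Python loop keeps only points strictly inside the box. The sample x values below are hypothetical.

```python
import numpy as np

E, W = 921336.53116178, 921185.3779625  # bounds from the post (E > W)
x = np.array([921000.0, 921200.0, 921300.0, 921500.0])  # made-up x values

# Condition used in the posted mask: true for any x because W < E.
or_mask = (x < E) | (x > W)

# Condition used in the pure-Python loop: strictly inside the interval.
and_mask = (x > W) & (x < E)
```

Here `or_mask` selects all four values, while `and_mask` selects only the two that fall between W and E.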


thanks!

Massimo.



Re: [Numpy-discussion] scan array to extract min-max values (with if condition)

2010-09-11 Thread Pierre GM

On Sep 11, 2010, at 9:53 PM, Massimo Di Stefano wrote:

> have you hints on how to get working the MASK code ?
> as it is now it pick all the points in the mydata array.

Brett's code for the mask matched the loop of your post. However, taking a 
second look at it, I don't see why it would work. Mind trying something like

# Selection on the NS axis (Y is column 1)
yselect = (mydata[:,1] <= N) & (mydata[:,1] >= S)
# Selection on the EW axis (X is column 0)
xselect = (mydata[:,0] <= E) & (mydata[:,0] >= W)
# Global selection
selected_data = mydata[xselect & yselect]

Now you can check in selected_data that the coordinates are indeed in the 
rectangle that you want, and take the min/max of your data as needed.
Let me know how it goes.
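
A sketch of that check on made-up points, assuming column 0 holds X and column 1 holds Y as in the CSV header (the sample coordinates are hypothetical):

```python
import numpy as np

N, S = 234560.94503118, 234482.56929822
E, W = 921336.53116178, 921185.3779625

# Made-up points (x, y, z), already converted to metres.
mydata = np.array([[921200.0, 234500.0, 1.5],   # inside
                   [921400.0, 234500.0, 9.0],   # too far east
                   [921300.0, 234550.0, 0.2],   # inside
                   [921250.0, 234400.0, 7.0]])  # too far south

yselect = (mydata[:, 1] <= N) & (mydata[:, 1] >= S)
xselect = (mydata[:, 0] <= E) & (mydata[:, 0] >= W)
selected_data = mydata[xselect & yselect]

# Verify every selected point lies in the rectangle, then reduce Z.
inside = ((selected_data[:, 0] >= W) & (selected_data[:, 0] <= E) &
          (selected_data[:, 1] >= S) & (selected_data[:, 1] <= N)).all()
min_z, max_z = selected_data[:, 2].min(), selected_data[:, 2].max()
```

Note the parentheses around each comparison: `&` binds more tightly than `<=` and `>=`, so leaving them out changes the meaning of the expression.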

P.




Re: [Numpy-discussion] scan array to extract min-max values (with if condition)

2010-09-11 Thread Massimo Di Stefano
Thanks Pierre,

I tried it and it all works well and fast.

My apologies :-(

I used the wrong if statement to express what I need:

if mydata[i,0] < E or mydata[i,0] > W or mydata[i,1] < N or mydata[i,1] > S :

^^ totally wrong for my needs ^^


This if statement instead:

if W < mydata[i,0] < E and S < mydata[i,1] < N:

should match your example:

yselect = (data[:,1] <= N) & (data[:,1] >= S)
xselect = (data[:,0] <= E) & (data[:,0] >= W)
selected_data = data[xselect & yselect]


One question: how should I code a masked array,
as in Brett's code, to reflect the new (correct) if statement?

I'm asking in order to learn how to use masked arrays.
Thanks a lot for your support!!!

Massimo.



Re: [Numpy-discussion] lexsort() of datetime objects doesn't work

2010-09-11 Thread dyamins
P
Sent via BlackBerry from T-Mobile

-Original Message-
From: Charles R Harris charlesr.har...@gmail.com
Sender: numpy-discussion-boun...@scipy.org
Date: Thu, 19 Aug 2010 18:03:29 
To: Discussion of Numerical Python numpy-discussion@scipy.org
Reply-To: Discussion of Numerical Python numpy-discussion@scipy.org
Subject: Re: [Numpy-discussion] lexsort() of datetime objects doesn't work



Re: [Numpy-discussion] scan array to extract min-max values (with if condition)

2010-09-11 Thread Brett Olsen
On Sat, Sep 11, 2010 at 4:46 PM, Massimo Di Stefano
massimodisa...@gmail.com wrote:
> a question, how to code a masked array,
> as in the Brett's code, to reflect the new (right) if statment ?

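Just replace the lines

  mask |= mydata[:,0] < E
  mask |= mydata[:,0] > W
  mask |= mydata[:,1] < N
  mask |= mydata[:,1] > S

with

  mask &= mydata[:,0] < E
  mask &= mydata[:,0] > W
  mask &= mydata[:,1] < N
  mask &= mydata[:,1] > S

and initialize the mask with np.ones(mydata.shape[0], dtype=bool) rather than
np.zeros, since and-ing conditions into an all-False mask would select nothing.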

Sorry, I wasn't paying attention to what you were actually trying to
do and just duplicated the function of the code you supplied.

There's a good primer on how to index with boolean arrays at
http://www.scipy.org/Tentative_NumPy_Tutorial#head-d55e594d46b4f347c20efe1b4c65c92779f06268
that will explain why this works.
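
Putting the pieces together, here is a sketch of the corrected function, assuming the mask starts as all True so that each `&=` can only narrow the selection. The name `get_min_max_inside`, the empty-selection guard, and the sample array are hypothetical additions; the conversion factor is the one from the original post.

```python
import numpy as np

FT_TO_M = 0.3048006096012  # conversion factor used in the original post

def get_min_max_inside(data, N, S, E, W):
    """Min/max Z and point count for rows strictly inside the N/S/E/W box."""
    mydata = data * FT_TO_M
    mask = np.ones(mydata.shape[0], dtype=bool)  # start with everything selected
    mask &= mydata[:, 0] < E
    mask &= mydata[:, 0] > W
    mask &= mydata[:, 1] < N
    mask &= mydata[:, 1] > S
    if not mask.any():
        # No point falls inside the box; min()/max() would raise on an empty array.
        return {'Max_Z': None, 'Min_Z': None, 'Num_P': 0}
    z = mydata[mask, 2]
    return {'Max_Z': z.max(), 'Min_Z': z.min(), 'Num_P': int(mask.sum())}

# Made-up input in feet; only the first two rows fall inside the box.
N, S, E, W = 234560.94503118, 234482.56929822, 921336.53116178, 921185.3779625
sample = np.array([[921200.0, 234500.0, 1.0],
                   [921300.0, 234550.0, 3.0],
                   [921500.0, 234500.0, 8.0]]) / FT_TO_M
out = get_min_max_inside(sample, N, S, E, W)
```

Returning `None` for an empty selection is only one possible choice; raising an exception would be equally reasonable, depending on how the caller wants to handle a box that contains no points.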

Brett