Re: [Numpy-discussion] dot() performance depends on data?
Anyway, it seems it is indeed a denormal issue, as adding a small (1e-10) constant gives the same speed for both timings.

With adding 1e-10 or clipping to 0 at 1e-150, I still get a slowdown of about 30% compared with the random arrays. Any explanation for that?

Cheers,
Hagen

_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] dot() performance depends on data?
Denormal numbers are a tricky beast. You may have to change the clip or the shift depending on the processor you have. It's no wonder that processors, and thus compilers, have options to round denormals to zero.

Matthieu

2010/9/11 Hagen Fürstenau ha...@zhuliguan.net:
> Anyway, seems it is indeed a denormal issue, as adding a small (1e-10)
> constant gives same speed for both timings.
>
> With adding 1e-10 or clipping to 0 at 1e-150, I still get a slowdown of
> about 30% compared with the random arrays. Any explanation for that?

--
Information System Engineer, Ph.D.
Blog: http://matt.eifelle.com
LinkedIn: http://www.linkedin.com/in/matthieubrucher
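For readers who want to check whether their own arrays contain denormals before timing dot(), here is a minimal sketch. The helper names `count_subnormals` and `flush_small_to_zero` are made up for this illustration and are not part of NumPy; the 1e-150 threshold is simply the value Hagen mentions, and whether it is the right cut-off depends on the data and the processor.

```python
import numpy as np

# Smallest positive *normal* float64 (about 2.2e-308); anything nonzero
# below this magnitude is a denormal (subnormal) number.
tiny = np.finfo(np.float64).tiny

def count_subnormals(a):
    """Count entries that are nonzero but smaller in magnitude than `tiny`."""
    mag = np.abs(a)
    return int(np.count_nonzero((mag > 0) & (mag < tiny)))

def flush_small_to_zero(a, threshold=1e-150):
    """Return a copy with entries of magnitude below `threshold` set to 0.0."""
    out = np.array(a, dtype=np.float64, copy=True)
    out[np.abs(out) < threshold] = 0.0
    return out

# Invented sample values: one ordinary number, one very small normal
# number, two denormals and an exact zero.
x = np.array([1.0, 1e-200, 1e-310, 0.0, -5e-320])

print(count_subnormals(x))                       # denormals before flushing
print(count_subnormals(flush_small_to_zero(x)))  # denormals after flushing
```

Note that this only inspects the inputs. Denormals can also appear in intermediate products inside dot(), which this check cannot see.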
[Numpy-discussion] scan array to extract min-max values (with if condition)
Hello All,

I need to extract the data from an array that fall inside a rectangular area defined as:

    N, S, E, W = 234560.94503118, 234482.56929822, 921336.53116178, 921185.3779625

The data are in a CSV (comma-delimited text file with 3 columns: X, Y, Z):

    #X,Y,Z
    3020081.5500,76.3100,0.0300
    3020086.2000,769991.6500,0.4600
    3020099.6600,769996.2700,0.9000
    ...
    ...

I read it using numpy.loadtxt.

Data: http://www.geofemengineering.it/data/csv.txt (5.3 MB, 158735 rows)

To extract the data that are inside the bounding-box area (N, S, E, W) I'm using a loop inside a function like:

    import numpy as np

    def getMinMaxBB(data, N, S, E, W):
        mydata = data * 0.3048006096012
        for i in range(len(mydata)):
            if mydata[i,0] < E or mydata[i,0] > W or mydata[i,1] < N or mydata[i,1] > S :
                if i == 0:
                    newdata = np.array((mydata[i,0],mydata[i,1],mydata[i,2]), float)
                else :
                    newdata = np.vstack((newdata,(mydata[i,0], mydata[i,1], mydata[i,2])))
        results = {}
        results['Max_Z'] = newdata.max(0)[2]
        results['Min_Z'] = newdata.min(0)[2]
        results['Num_P'] = len(newdata)
        return results

    N, S, E, W = 234560.94503118, 234482.56929822, 921336.53116178, 921185.3779625
    data = '/Users/sasha/csv.txt'
    mydata = np.loadtxt(data, comments='#', delimiter=',')
    out = getMinMaxBB(mydata, N, S, E, W)
    print(out)

This method works, but it may not be very fast. Do you have any hints on how to improve the code for better performance?

Thanks!!!

Massimo.
Re: [Numpy-discussion] scan array to extract min-max values (with if condition)
On Sat, Sep 11, 2010 at 7:45 AM, Massimo Di Stefano massimodisa...@gmail.com wrote:
> to extract data that are inside the boundy-box area (N, S, E, W)
> i'm using a loop inside a function like :
>
>     for i in range(len(mydata)):
>         if mydata[i,0] < E or mydata[i,0] > W or mydata[i,1] < N or mydata[i,1] > S :

Use boolean arrays to index the parts of your array that you want to look at:

    def newGetMinMax(data, N, S, E, W):
        mydata = data * 0.3048006096012
        mask = np.zeros(mydata.shape[0], dtype=bool)
        mask |= mydata[:,0] < E
        mask |= mydata[:,0] > W
        mask |= mydata[:,1] < N
        mask |= mydata[:,1] > S
        results = {}
        results['Max_Z'] = mydata[mask,2].max()
        results['Min_Z'] = mydata[mask,2].min()
        results['Num_P'] = mask.sum()
        return results

This runs about 5000 times faster on my machine.

Brett
Re: [Numpy-discussion] scan array to extract min-max values (with if condition)
That's awesome! Masked arrays are definitely what I need! Thanks for pointing them out to me!

Best regards,
Massimo.

On 11 Sep 2010, at 16:19, Brett Olsen wrote:
> Use boolean arrays to index the parts of your array that you want to look at:
>
> This runs about 5000 times faster on my machine.
Re: [Numpy-discussion] ANNOUNCE: mahotas 0.5
On Friday, September 10, 2010 03:40:33 am Sebastian Haase wrote:
> Hi Luis,
> thanks for the announcement. How would you compare mahotas to scipy's
> ndimage ? Are you using ndimage in mahotas at all ?

Hi Sebastian,

In general there is little overlap (there are 1 or 2 functions which are replicated). I wrote mahotas mostly to get functions that I wasn't finding elsewhere (or, occasionally, to make them faster). Ndimage, in some ways, contains more basic functions; mahotas has somewhat more advanced functions (at the cost of having a somewhat idiosyncratically chosen set of functions). I do use ndimage in a couple of places (mostly to do convolution). So, they complement each other rather than compete.

Here are a couple of differences in philosophy:

- ndimage is *always* n-D. Mahotas is mostly n-D, but some functions are specialised to 2-D (patches always welcome if you want to extend them).
- mahotas is written in templated C++; ndimage is C with macros.
- ndimage tries to be very generic in its interfaces (which makes some of them hard to use at first); mahotas tries to have natural interfaces.

HTH,
Luis
Re: [Numpy-discussion] scan array to extract min-max values (with if condition)
Brett,

I tried a different way to solve the problem, using:

    import os

    fpath = '/Users/sasha/py/'
    input_fp = open( os.path.join(fpath, 'BE3730072600WC20050817.txt'), 'r' )
    input_file = input_fp.readlines()

    N = 234560.94503118
    S = 234482.56929822
    E = 921336.53116178
    W = 921185.3779625

    xL = []
    yL = []
    zL = []
    for index, line in enumerate( input_file ):
        if index == 0:
            print('skipping header line...')
        else:
            x, y, z = line.split(',')
            xL.append( float(x) * 0.3048006096012 )
            yL.append( float(y) * 0.3048006096012 )
            zL.append( float(z) * 0.3048006096012 )

    xLr = []
    yLr = []
    zLr = []
    for coords in zip(xL, yL, zL):
        if W < coords[0] < E and S < coords[1] < N:
            xLr.append( coords[0] )
            yLr.append( coords[1] )
            zLr.append( coords[2] )

    elements = len(xLr)
    minZ = min(zLr)
    maxZ = max(zLr)

Using the same input file I posted earlier, it gives me 966 elements instead of the 158734 elements given by your mask example. The input file contains 158734 elements, which means the mask code:

    mask |= mydata[:,0] < E
    mask |= mydata[:,0] > W
    mask |= mydata[:,1] < N
    mask |= mydata[:,1] > S

is not working as expected. Do you have any hints on how to get the mask code working? As it is now, it picks all the points in the mydata array.

Thanks!

Massimo.

On 11 Sep 2010, at 16:19, Brett Olsen wrote:
> Use boolean arrays to index the parts of your array that you want to look at:
Re: [Numpy-discussion] scan array to extract min-max values (with if condition)
On Sep 11, 2010, at 9:53 PM, Massimo Di Stefano wrote:
> have you hints on how to get working the MASK code ?
> as it is now it pick all the points in the mydata array.

Brett's code for the mask matched the loop of your post. However, taking a second look at it, I don't see why it would work. Mind trying something like

    # Selection on the NS axis (y is column 1)
    yselect = (mydata[:,1] <= N) & (mydata[:,1] >= S)
    # Selection on the EW axis (x is column 0)
    xselect = (mydata[:,0] <= E) & (mydata[:,0] >= W)
    # Global selection
    selected_data = mydata[xselect & yselect]

Now you can check in selected_data that the coordinates are indeed in the rectangle that you want, and take the min/max of your data as needed.

Let me know how it goes.

P.
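The selection described above can be tried on a small made-up array before running it on the real file. This is only a sketch: the function name `select_in_box`, its keyword arguments and the sample points are all invented for this illustration, and it assumes the column order x, y, z used in the thread.

```python
import numpy as np

def select_in_box(points, north, south, east, west):
    """Return the rows of `points` (columns x, y, z) with
    west <= x <= east and south <= y <= north."""
    x, y = points[:, 0], points[:, 1]
    inside = (x >= west) & (x <= east) & (y >= south) & (y <= north)
    return points[inside]

# Made-up points, not taken from the CSV in the thread.
pts = np.array([
    [1.0,  1.0, 10.0],   # inside the box
    [2.0,  3.0, 20.0],   # inside the box
    [5.0,  1.0, 30.0],   # x too far east
    [1.5, -1.0, 40.0],   # y too far south
])

box = select_in_box(pts, north=4.0, south=0.0, east=3.0, west=0.0)
summary = {
    "Max_Z": box[:, 2].max(),
    "Min_Z": box[:, 2].min(),
    "Num_P": len(box),
}
print(summary)
```

If no point falls inside the rectangle, `box` is empty and `.max()` / `.min()` raise a ValueError, so real code would probably want to check `len(box)` first.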
Re: [Numpy-discussion] scan array to extract min-max values (with if condition)
Thanks Pierre, I tried it and it all works fine and fast.

My apologies :-( I used the wrong if statement to represent my needs:

    if mydata[i,0] < E or mydata[i,0] > W or mydata[i,1] < N or mydata[i,1] > S :

^^ totally wrong for my needs ^^

This if instead:

    if W < mydata[i,0] < E and S < mydata[i,1] < N:

should reflect your example:

    yselect = (data[:,1] <= N) & (data[:,1] >= S)
    xselect = (data[:,0] <= E) & (data[:,0] >= W)
    selected_data = data[xselect & yselect]

A question: how do I code a masked array, as in Brett's code, to reflect the new (right) if statement? I'm asking this to try to learn how to use masked arrays.

Thanks a lot for your support!!!

Massimo.

On 11 Sep 2010, at 23:00, Pierre GM wrote:
> Now you can check in selected data that the coordinates are indeed in
> the rectangle that you want, and take the min/max of your data as needed.
Re: [Numpy-discussion] lexsort() of datetime objects doesn't work
P

Sent via BlackBerry from T-Mobile

-----Original Message-----
From: Charles R Harris charlesr.har...@gmail.com
Sender: numpy-discussion-boun...@scipy.org
Date: Thu, 19 Aug 2010 18:03:29
To: Discussion of Numerical Python numpy-discussion@scipy.org
Reply-To: Discussion of Numerical Python numpy-discussion@scipy.org
Subject: Re: [Numpy-discussion] lexsort() of datetime objects doesn't work
Re: [Numpy-discussion] scan array to extract min-max values (with if condition)
On Sat, Sep 11, 2010 at 4:46 PM, Massimo Di Stefano massimodisa...@gmail.com wrote:
> a question, how to code a masked array, as in the Brett's code, to
> reflect the new (right) if statment ?

Just replace the lines

    mask |= mydata[:,0] < E
    mask |= mydata[:,0] > W
    mask |= mydata[:,1] < N
    mask |= mydata[:,1] > S

with

    mask = mydata[:,0] < E
    mask &= mydata[:,0] > W
    mask &= mydata[:,1] < N
    mask &= mydata[:,1] > S

Sorry, I wasn't paying attention to what you were actually trying to do and just duplicated the function of the code you supplied. There's a good primer on how to index with boolean arrays at http://www.scipy.org/Tentative_NumPy_Tutorial#head-d55e594d46b4f347c20efe1b4c65c92779f06268 that will explain why this works.

Brett
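To see why the OR version picked every point while the AND version does not, the two masks can be compared on a few invented coordinates. The sketch below uses the N, S, E, W bounds from the thread, but the sample points and `z` values are made up. It also shows, as a side note, that a `numpy.ma` masked array is a separate tool from the boolean index mask used in this thread: its mask marks the entries to ignore.

```python
import numpy as np

# Bounds from the thread; the sample coordinates below are invented.
N, S, E, W = 234560.94503118, 234482.56929822, 921336.53116178, 921185.3779625

xy = np.array([
    [921200.0, 234500.0],  # inside the box
    [921000.0, 234500.0],  # west of the box
    [921200.0, 234600.0],  # north of the box
])
x, y = xy[:, 0], xy[:, 1]

# OR of the four tests: because W < E, every x is either below E or
# above W, so this mask is True for every point.
or_mask = (x < E) | (x > W) | (y < N) | (y > S)

# AND of the four tests: True only for points inside the rectangle.
and_mask = (x < E) & (x > W) & (y < N) & (y > S)

print(or_mask)
print(and_mask)

# numpy.ma works the other way round: mask=True means "ignore this entry",
# so the boolean index mask has to be inverted.
z = np.array([1.0, 2.0, 3.0])
masked_z = np.ma.masked_array(z, mask=~and_mask)
print(masked_z.max())
```

For this use case plain boolean indexing (`mydata[mask, 2]`) is enough; `numpy.ma` is mainly useful when the array shape has to be preserved while some entries are excluded from computations.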