Brett,
I tried a different way to solve the problem, using a plain Python loop:

#############
import os

fpath = '/Users/sasha/py/'
input_fp = open(os.path.join(fpath, 'BE3730072600WC20050817.txt'), 'r')
input_file = input_fp.readlines()
input_fp.close()

# bounding box
N = 234560.94503118
S = 234482.56929822
E = 921336.53116178
W = 921185.3779625

# read every point and scale it by the conversion factor
xL = []
yL = []
zL = []
for index, line in enumerate(input_file):
    if index == 0:
        print('skipping header line...')
    else:
        x, y, z = line.split(',')
        xL.append(float(x) * 0.3048006096012)
        yL.append(float(y) * 0.3048006096012)
        zL.append(float(z) * 0.3048006096012)

# keep only the points that fall inside the bounding box
xLr = []
yLr = []
zLr = []
for coords in zip(xL, yL, zL):
    if W < coords[0] < E and S < coords[1] < N:
        xLr.append(coords[0])
        yLr.append(coords[1])
        zLr.append(coords[2])

elements = len(xLr)
minZ = min(zLr)
maxZ = max(zLr)
############

Using the same input file I posted earlier, it gives me 966 elements
instead of the 158734 elements given by your "MASK" example.

The input file contains 158734 elements, which means the mask code:

> mask |= mydata[:,0] < E
> mask |= mydata[:,0] > W
> mask |= mydata[:,1] < N
> mask |= mydata[:,1] > S

is not working as expected.

Do you have any hints on how to get the "MASK" code working?
As it is now, it picks all the points in the "mydata" array.

Thanks!

Massimo.

On 11 Sep 2010, at 16:19, Brett Olsen wrote:

> On Sat, Sep 11, 2010 at 7:45 AM, Massimo Di Stefano
> <massimodisa...@gmail.com> wrote:
>> Hello All,
>>
>> i need to extract data from an array, that are inside a
>> rectangle area defined as :
>>
>> N, S, E, W = 234560.94503118, 234482.56929822, 921336.53116178,
>> 921185.3779625
>>
>> the data are in a csv (comma delimited text file, with 3 columns X,Y,Z)
>>
>> #X,Y,Z
>> 3020081.5500,769999.3100,0.0300
>> 3020086.2000,769991.6500,0.4600
>> 3020099.6600,769996.2700,0.9000
>> ...
>> ...
>>
>> i read it using " numpy.loadtxt "
>>
>> data :
>>
>> http://www.geofemengineering.it/data/csv.txt 5,3 mb (158735 rows)
>>
>> to extract data that are inside the boundy-box area (N, S, E, W) i'm using a
>> loop
>> inside a function like :
>>
>> import numpy as np
>>
>> def getMinMaxBB(data, N, S, E, W):
>>     mydata = data * 0.3048006096012
>>     for i in range(len(mydata)):
>>         if mydata[i,0] < E or mydata[i,0] > W or mydata[i,1] < N or mydata[i,1] > S :
>>             if i == 0:
>>                 newdata = np.array((mydata[i,0],mydata[i,1],mydata[i,2]), float)
>>             else :
>>                 newdata = np.vstack((newdata,(mydata[i,0], mydata[i,1], mydata[i,2])))
>>     results = {}
>>     results['Max_Z'] = newdata.max(0)[2]
>>     results['Min_Z'] = newdata.min(0)[2]
>>     results['Num_P'] = len(newdata)
>>     return results
>>
>>
>> N, S, E, W = 234560.94503118, 234482.56929822, 921336.53116178, 921185.3779625
>> data = '/Users/sasha/csv.txt'
>> mydata = np.loadtxt(data, comments='#', delimiter=',')
>> out = getMinMaxBB(mydata, N, S, E, W)
>>
>> print out
>
> Use boolean arrays to index the parts of your array that you want to look at:
>
> def newGetMinMax(data, N, S, E, W):
>     mydata = data * 0.3048006096012
>     mask = np.zeros(mydata.shape[0], dtype=bool)
>     mask |= mydata[:,0] < E
>     mask |= mydata[:,0] > W
>     mask |= mydata[:,1] < N
>     mask |= mydata[:,1] > S
>     results = {}
>     results['Max_Z'] = mydata[mask,2].max()
>     results['Min_Z'] = mydata[mask,2].min()
>     results['Num_P'] = mask.sum()
>     return results
>
> This runs about 5000 times faster on my machine.
>
> Brett
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
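
A likely reason the quoted mask selects every row: it starts from all False and ORs the four tests together, so a point is kept as soon as any one test is true, and every x is either below E or above W. The plain loop above instead requires all four tests at once. Below is a minimal sketch of the same mask idea using AND, which should agree with the loop. The function name get_min_max_bb, the scale parameter, the handling of an empty selection and the tiny demo array are assumptions made for illustration; they are not from the thread and the demo is not the real data file.

```python
import numpy as np

def get_min_max_bb(data, N, S, E, W, scale=0.3048006096012):
    """Z range and point count for points strictly inside the box."""
    mydata = data * scale  # same conversion factor as in the thread

    # Start from all True and AND the conditions together, so a point
    # is kept only if it satisfies every one of the four tests.
    mask = np.ones(mydata.shape[0], dtype=bool)
    mask &= mydata[:, 0] < E
    mask &= mydata[:, 0] > W
    mask &= mydata[:, 1] < N
    mask &= mydata[:, 1] > S

    if not mask.any():
        # Assumption: report an empty selection instead of raising.
        return {'Max_Z': None, 'Min_Z': None, 'Num_P': 0}

    z = mydata[mask, 2]
    return {'Max_Z': float(z.max()),
            'Min_Z': float(z.min()),
            'Num_P': int(mask.sum())}

# Made-up demo points, with scale=1.0 to keep the numbers readable.
demo = np.array([
    [1.0,  1.0, 10.0],   # inside the box
    [2.0,  2.0, 20.0],   # inside the box
    [5.0,  1.0, 30.0],   # x is east of E
    [1.0, -1.0, 40.0],   # y is south of S
])
result = get_min_max_bb(demo, N=3.0, S=0.0, E=3.0, W=0.0, scale=1.0)
print(result)

# For comparison, OR-ing the same four tests keeps every row.
or_mask = ((demo[:, 0] < 3.0) | (demo[:, 0] > 0.0) |
           (demo[:, 1] < 3.0) | (demo[:, 1] > 0.0))
print(int(or_mask.sum()))
```

The strict inequalities match the `W < x < E and S < y < N` test in the loop; if points lying exactly on the border should count, `<=` and `>=` would be needed in both versions.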