On Wed, Dec 8, 2010 at 12:12 PM, Gregory, Matthew <[email protected]> wrote: > Hi all, > > Likely a very newbie type of question. I'm using numpy with GDAL to > calculate zonal statistics on images. The basic approach is that I have a > zone raster and a value raster which are aligned spatially and I am storing > each zone's corresponding values in a dictionary, then calculating the > statistics on that population. (I'm well aware that this approach may have > memory issues with large rasters ...) > > GDAL ReadAsArray gives you a chunk of raster data as a numpy array. > Currently I'm iterating over rows and columns of that chunk, but I'm guessing > there's a better (and more numpy-like) way. > > zone_stats = {} > zone_block = zone_band.ReadAsArray(x_off, y_off, x_size, y_size) > value_block = value_band.ReadAsArray(x_off, y_off, x_size, y_size) > for row in xrange(y_size): > for col in xrange(x_size): > zone = zone_block[row][col] > value = value_block[row][col] > try: > zone_stats[zone].append(value) > except KeyError: > zone_stats[zone] = [value] > > # Then calculate stats per zone > ...
Just a thought since I'm not doing spatial statistics. If you can create (integer) labels that assigns each point to a zone, then you can treat it essentially as a 1d grouped data, and you could use np.bincount to calculate some statistics, or alternatively scipy.ndimage.measurements for some additional statistics. This would avoid any python loop, but require a full label array. Josef > > Thanks for all suggestions on how to make this better, especially if the > initial approach I'm taking is flawed. > > matt > > > > _______________________________________________ > NumPy-Discussion mailing list > [email protected] > http://mail.scipy.org/mailman/listinfo/numpy-discussion > _______________________________________________ NumPy-Discussion mailing list [email protected] http://mail.scipy.org/mailman/listinfo/numpy-discussion
