On Mon, Oct 24, 2011 at 6:57 AM, Nadav Horesh <nad...@visionsense.com> wrote:
> I am trying to replace old code (a bilateral filter) that relies on
> ndimage.generic_filter with the neighborhood iterator. In the old code,
> generic_filter generates a contiguous copy of the neighborhood, so the
> (Cython) code could use a C loop to iterate over the neighbourhood copy. In
> the new code version, PyArrayNeighborhoodIter_Next must be called to
> retrieve every neighbourhood item. The results of rough benchmarking to
> compare bilateral filtering on a 1000x1000 array:
>
> Old code (ndimage.generic_filter): 16.5 sec
> New code (neighborhood iteration): 60.5 sec
> New code with PyArrayNeighborhoodIter_Next omitted: 1.5 sec
>
> * The last benchmark is not "real" since the omitted call is a must. It just
>   demonstrates the iterator overhead.
> * I assume the main overhead in the old code is the Python function callback
>   process. There are instructions in the manual on how to wrap C code for a
>   faster callback, but I would rather use the neighbourhood iterator, as I
>   consider it more generic.
>
I am afraid the cost is unavoidable: you are really trading CPU for memory.
When using PyArrayNeighborhoodIter_Next, there is a loop with a conditional
inside it, and I don't think those can easily be avoided without losing
generality. Which mode are you using when creating the neighborhood iterator?

There used to be a PyArrayNeighborhoodIter_Next2d; I don't know why I
commented it out. You could try it and see whether it is faster.

> If the PyArrayNeighborhoodIter_Reset could (optionally) copy the relevant
> data (as the generic_filter does) it would provide a major speed up in many
> cases.

Optionally copying may be an option, but it would make more sense to do it at
creation time than during reset, no? Something like a binary AND with the
current mode flag.

cheers,

David
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
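
A rough illustration of the copy-at-creation idea from the reply above, written
in pure Python rather than against the NumPy C API. Everything here is
hypothetical: the class NeighborhoodIter2D, the flag names ZERO_PADDING and
COPY_DATA, and their values are invented for the sketch and are not NumPy
identifiers. It only shows where the per-item conditional sits and how a copy
option could be chosen once by masking the mode flag.

```python
# Hypothetical flag values for illustration only; these are not NumPy constants.
ZERO_PADDING = 0x01  # out-of-bounds neighbours read as 0
COPY_DATA = 0x10     # assumed extra bit: copy each neighbourhood into a buffer


class NeighborhoodIter2D:
    """Toy 2-D neighbourhood iterator over a list-of-lists image."""

    def __init__(self, image, radius, mode):
        self.image = image
        self.rows = len(image)
        self.cols = len(image[0])
        self.radius = radius
        # The copy option is decided once, at creation, with a binary AND
        # against the mode flag.
        self.copy = bool(mode & COPY_DATA)

    def _get(self, r, c):
        # The bounds test that a generic per-item "next" has to perform.
        if 0 <= r < self.rows and 0 <= c < self.cols:
            return self.image[r][c]
        return 0  # zero padding

    def _items(self, r0, c0):
        # Yield neighbours one at a time, with a bounds check on every item.
        k = self.radius
        for r in range(r0 - k, r0 + k + 1):
            for c in range(c0 - k, c0 + k + 1):
                yield self._get(r, c)

    def reset(self, r0, c0):
        """Return the neighbours of (r0, c0).

        With COPY_DATA set the result is a flat list that the caller can
        loop over directly; otherwise it is a lazy generator.
        """
        if self.copy:
            return list(self._items(r0, c0))
        return self._items(r0, c0)
```

Usage would look like NeighborhoodIter2D(image, 1, ZERO_PADDING | COPY_DATA),
after which each reset(r, c) hands back a flat buffer. In this toy version the
buffer is still filled item by item, so it saves nothing by itself; the gain
suggested in the thread would come from the caller running a tight loop over
the contiguous copy.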