Sebastian, I think the core issue you pointed out is indeed correct, but your detailed explanation is backwards, since `maximum(arr[:, 0], arr[:, 1])` implements `arr.max(axis=1)` instead of `arr.max(axis=0)`. So OP's transpose method is essentially approach 1, which for this array shape has less overhead than approach 2 (and since 2 is small enough, it doesn't matter that we are paying this overhead in Python instead of in C).
>From my tests, when n = 1_000_000, the advantage of the transpose method persists up to m = 6. But for smaller n the normal method becomes better again, so it's not that clear cut (although it would be nice to have a way to use the transpose method but pay the overhead in C instead of Python!) Test code: import numpy as np from IPython import get_ipython def calculate_bbox_normal(vertices: np.ndarray): return vertices.min(axis=0), vertices.max(axis=0) def calculate_bbox_transpose(vertices: np.ndarray): return np.array([col.min() for col in vertices.T]), np.array([col.max() for col in vertices.T]) n = 1_000_000 for m in range(4, 9): print(f"shape: {n, m}") vertices = np.random.random((n, m)) vmin, vmax = calculate_bbox_normal(vertices) vmin_t, vmax_t = calculate_bbox_transpose(vertices) assert (vmin == vmin_t).all() assert (vmax == vmax_t).all() print("Normal: ", end="") get_ipython().run_line_magic("timeit", "calculate_bbox_normal(vertices)") print("Transpose: ", end="") get_ipython().run_line_magic("timeit", "calculate_bbox_transpose(vertices)") print() - Fang On Sat, Mar 22, 2025 at 7:10 PM Sebastian Berg <sebast...@sipsolutions.net> wrote: > On Fri, 2025-03-21 at 23:22 +0100, Tiziano Zito via NumPy-Discussion > wrote: > > Hi George, > > > > what you see is due to the memory layout of numpy arrays. If you > > switch your array to F-order you'll see that the two functions have > > the same timings, i.e. both are fast (on my machine 25 times faster > > for the 1_000_000 points case). > > > Memory order is indeed very important, but it should be indirectly so! > > The thing is that NumPy tries to re-order most operations for memory > access order (not particularly advanced). > > I.e. NumPy can calculate this type of operation in two ways (the inner- > loop is always 1-D, so in C first, outer, loop here is separate): > 1. for i in arr.shape[0]: res[i] = arr[i].max() > 2. for i in arr.shape[1]: res[:] = maximum(res, arr[i]) > (For i == 0, you would just copy `arr[0]` or initialize `res` first) > > NumPy chooses the first form here, precisely because it is (normally) > better for the memory layout, here. > But for 1 `arr[i].max()` is actually a function call inside NumPy (deep > in C, not on an actual NumPy array object of course). > > So for `arr.shape[1] == 2`, that just gets in the way because of > overheads! > Writing it as 2. is much better (memory order doesn't even matter), > calculating it as `maximum(arr[0], arr[1])` would be the best. > > But even for other small values for `arr.shape[1]` it would probably be > better to use the second version, memory order will start to matter > more and more (I could imagine even for arr.shape[1] == 3 or 4 it > dominates quickly for huge arrays, but didn't try). > > So what you see should be overheads of using approach 1. Taking the > maximum of 2 elements is simply fast compared to even a lightweight > outer for loop and just calling the function to take the maximum. > (And the core function does a bit of extra work to optimize for many > elements additionally [1]) > You really don't need large inner-loops to hide most of those > overheads, but 2 elements just isn't enough. > > Anyway, of course it could be cool to add a heuristic to pick approach > 2 here when we obviously have `arr.shape[1] == 2` or maybe `arr.shape < > small_value`. > There are also probably some low-hanging fruits to just reduce the > overheads here (for this specific case I know the outer iteration can > be optimized, although I am not sure it would improve things, also, > some inner-loop optimizations could be moved out of the inner-loop > since a few year now -- but only casts actually do so). > > In the end, `maximum(arr[:, 0], arr[:, 1])` will always the fastest way > for this specific thing, because it explicitly picks the clearly best > approach. > It would be good to close that gap at least a bit, though. > > - Sebastian > > > [1] locally I can actually see a small speed-up if I pick less > optimized loops at run-time via `NPY_DISABLE_CPU_FEATURES=AVX2`. > > > > > Try: > > > > vertices = np.array(np.random.random((n, 2)), order='F') > > > > When your array doesn't fit in L1-cache anymore, either order 'C' or > > order 'F' becomes (much) more efficient depending on which dimension > > you are internally looping through. > > > > You can read more about it here: > > > https://numpy.org/doc/stable/dev/internals.html#internal-organization-of-numpy-arrays > > and > > > https://numpy.org/doc/stable/reference/arrays.ndarray.html#internal-memory-layout-of-an-ndarray > > > > Hope that helps, > > Tiziano > > > > > > On Fri 21 Mar, 12:10 +0200, George Tsiamasiotis via NumPy-Discussion > > <numpy-discussion@python.org> wrote: > > > Hello NumPy community! > > > > > > I was writing a function that calculates the bounding box of a > > > polygon (the smallest rectangle that fully contains the polygon, > > > and who's sides are parallel to the x and y axes). The input is a > > > (N,2) array containing the vertices of the polygon, and the output > > > is a 4-tuple containing the vertices of the 2 corners of the > > > bounding box. > > > > > > I found two ways to do that using the np.min() and np.max() > > > methods, and I was surprised to see a significant speed difference, > > > even though they seemingly do the same thing. > > > > > > While for small N the speed is essentially the same, the difference > > > becomes noticeable for larger N. >From my testing, the difference > > > seems to plateau, with the one way being around 4-5 times faster > > > than the other. > > > > > > Is there an explanation for this? > > > > > > Here is a small benchmark I wrote (must be executed with IPython): > > > > > > import numpy as np > > > from IPython import get_ipython > > > > > > vertices = np.random.random((1000, 2)) > > > > > > def calculate_bbox_normal(vertices: np.ndarray) -> > > > tuple[np.float64]: > > > xmin, ymin = vertices.min(axis=0) > > > xmax, ymax = vertices.max(axis=0) > > > return xmin, ymin, xmax, ymax > > > > > > def calculate_bbox_transpose(vertices: np.ndarray) -> > > > tuple[np.float64]: > > > xmin = vertices.T[0].min() > > > xmax = vertices.T[0].max() > > > ymin = vertices.T[1].min() > > > ymax = vertices.T[1].max() > > > return xmin, ymin, xmax, ymax > > > > > > bbox_normal = calculate_bbox_normal(vertices) > > > bbox_transpose = calculate_bbox_transpose(vertices) > > > > > > print(f"Equality: {bbox_normal == bbox_transpose}") > > > > > > for n in [10, 100, 1000, 10_000, 100_000, 1_000_000]: > > > print(f"Number of points: {n}") > > > vertices = np.random.random((n, 2)) > > > print("Normal: ", end="") > > > get_ipython().run_line_magic("timeit", > > > "calculate_bbox_normal(vertices)") > > > print("Transpose: ", end="") > > > get_ipython().run_line_magic("timeit", > > > "calculate_bbox_transpose(vertices)") > > > print() > > > > > > > > _______________________________________________ > > > NumPy-Discussion mailing list -- numpy-discussion@python.org > > > To unsubscribe send an email to numpy-discussion-le...@python.org > > > https://mail.python.org/mailman3/lists/numpy-discussion.python.org/ > > > Member address: opossumn...@gmail.com > > > > _______________________________________________ > > NumPy-Discussion mailing list -- numpy-discussion@python.org > > To unsubscribe send an email to numpy-discussion-le...@python.org > > https://mail.python.org/mailman3/lists/numpy-discussion.python.org/ > > Member address: sebast...@sipsolutions.net > > _______________________________________________ > NumPy-Discussion mailing list -- numpy-discussion@python.org > To unsubscribe send an email to numpy-discussion-le...@python.org > https://mail.python.org/mailman3/lists/numpy-discussion.python.org/ > Member address: fan...@umich.edu >
_______________________________________________ NumPy-Discussion mailing list -- numpy-discussion@python.org To unsubscribe send an email to numpy-discussion-le...@python.org https://mail.python.org/mailman3/lists/numpy-discussion.python.org/ Member address: arch...@mail-archive.com