Shasang, My main concern is that if there is a legitimate bug in someone's code that is causing mismatched array sizes, this can mask that bug in certain situations where the mismatch just so happens to produce arrays of certain shapes. I am intrigued by the idea, though.
Ben Root On Tue, Mar 25, 2025 at 9:29 AM Shasang Thummar via NumPy-Discussion < numpy-discussion@python.org> wrote: > Dear NumPy Developers, > > > I hope you are doing well. I am writing to propose an enhancement to NumPy’s > broadcasting mechanism that could make the library even more powerful, > intuitive, and flexible while maintaining its memory efficiency. > > ## **Current Broadcasting Rule and Its Limitation** > As per NumPy's current broadcasting rules, if two arrays have different > shapes, the smaller array can be expanded **only if one of its dimensions is > 1**. This allows memory-efficient expansion of data without copying. However, > if a dimension is greater than 1, NumPy **does not** allow expansion to a > larger size, even when the larger size is a multiple of the smaller size. > > ### **Example of Current Behavior (Allowed Expansion)** > ```python > import numpy as np > > A = np.array([[1, 2, 3]]) # Shape (1,3) > B = np.array([[4, 5, 6], # Shape (2,3) > [7, 8, 9]]) > > C = A + B # Broadcasting works because (1,3) can expand to (2,3) > print(C) > > *Output:* > > [[ 5 7 9] > [ 8 10 12]] > > Here, A has shape (1,3), which is automatically expanded to (2,3) without > copying because *a dimension of size 1 can be stretched*. > > However, *NumPy fails when trying to expand a dimension where N > 1, even > if the larger size is a multiple of N*. > *Example of a Case That Fails (Even Though It Could Work)* > > A = np.array([[1, 2, 3], # Shape (2,3) > [4, 5, 6]]) > > B = np.array([[10, 20, 30], # Shape (4,3) > [40, 50, 60], > [70, 80, 90], > [100, 110, 120]]) > > C = A + B # Error! NumPy does not allow (2,3) to (4,3) > > *This fails with the error:* > > ValueError: operands could not be broadcast together with shapes (2,3) (4,3) > > *Why Should This Be Allowed?* > > If *a larger dimension is an exact multiple of a smaller one*, then > logically, the smaller array can be *repeated* along that axis *without > physically copying data*—just like NumPy does when broadcasting (1,M) to > (N,M). > > In the above example, > > - A has shape (2,3), and B has shape (4,3). > - Since 4 is *a multiple of* 2 (4 % 2 == 0), A could be *logically > repeated 2 times* to become (4,3). > - NumPy *already does* a similar expansion when a dimension is 1, so > why not extend the same logic? > > *Proposed Behavior (Expanding N → M When M % N == 0)* > > Allow an axis with size N to be broadcast to size M *if and only if M is > an exact multiple of N (M % N == 0)*. This is *just as memory-efficient* > as the current broadcasting rules because it can be done using *stride > tricks instead of copying data*. > *Example of the Proposed Behavior* > > If NumPy allowed this new form of broadcasting: > > A = np.array([[1, 2, 3], # Shape (2,3) > [4, 5, 6]]) > > B = np.array([[10, 20, 30], # Shape (4,3) > [40, 50, 60], > [70, 80, 90], > [100, 110, 120]]) > > # Proposed new broadcasting rule > C = A + B > > print(C) > > *Expected Output:* > > [[ 11 22 33] > [ 44 55 66] > [ 71 82 93] > [104 115 126]] > > This works by *logically repeating A* to match B’s shape (4,3). > *Why This is a Natural Extension of Broadcasting* > > - *Memory Efficiency:* Just like broadcasting (1,M) to (N,M), this > expansion does *not* require physically copying data. Instead, strides > can be adjusted to *logically repeat elements*, making this as > efficient as current broadcasting. > - *Intuitiveness:* Right now, broadcasting already surprises new > users. If (1,3) can become (2,3), why not (2,3) to (4,3) when 4 is a > multiple of 2? This would be a more intuitive rule. > - *Extends Current Functionality:* This is *not* a new concept—it *extends > the existing rule* where 1 can be stretched to any value. We are > generalizing it to *any factor relationship* (N → M when M % N == 0). > > *Implementation Considerations* > > The logic behind NumPy’s current broadcasting already uses *stride tricks* > for memory-efficient expansion. Extending it to handle (N, M) → (K, M) > (where K % N == 0) would require: > > - Updating np.broadcast_shapes(), np.broadcast_to(), and related > functions. > - Extending the existing logic that handles expanding 1 to support > factors as well. > - Ensuring backward compatibility and maintaining performance. > > *Conclusion* > > I strongly believe this enhancement would make NumPy’s broadcasting *more > powerful, more intuitive, and just as efficient as before*. The fact that > we can implement this *without unnecessary copying* makes it a natural > extension of the existing feature. > > I would love to hear the NumPy team’s thoughts on this! If the community > is open to it, I am also interested in contributing to its implementation. > > Looking forward to your feedback! > > Best regards, > *Shasang Thummar* > shasangthumm...@gmail.com > > _______________________________________________ > NumPy-Discussion mailing list -- numpy-discussion@python.org > To unsubscribe send an email to numpy-discussion-le...@python.org > https://mail.python.org/mailman3/lists/numpy-discussion.python.org/ > Member address: ben.v.r...@gmail.com >
_______________________________________________ NumPy-Discussion mailing list -- numpy-discussion@python.org To unsubscribe send an email to numpy-discussion-le...@python.org https://mail.python.org/mailman3/lists/numpy-discussion.python.org/ Member address: arch...@mail-archive.com