On Mon, May 28, 2018 at 4:26 PM, Stephan Hoyer <sho...@gmail.com> wrote: > On Mon, May 21, 2018 at 5:42 PM Matti Picus <matti.pi...@gmail.com> wrote: >> >> - create a wrapper that can convince the ufunc mechanism to call >> __array_ufunc__ even on functions that are not true ufuncs > > > I am somewhat opposed to this approach, because __array_ufunc__ is about > overloading ufuncs, and as soon as we relax this guarantee the set of > invariants __array_ufunc__ implementors rely on becomes much more limited. > > We really should have another mechanism for arbitrary function overloading > in NumPy (NEP to follow shortly!). > >> >> - expand the semantics of core signatures so that a single matmul ufunc >> can implement matrix-matrix, vector-matrix, matrix-vector, and >> vector-vector multiplication. > > > I was initially concerned that adding optional dimensions for gufuncs would > introduce additional complexity for only the benefit of a single function > (matmul), but I'm now convinced that it makes sense: > 1. All other arithmetic overloads use __array_ufunc__, and it would be nice > to keep @/matmul in the same place. > 2. There's a common family of gufuncs for which optional dimensions like > np.matmul make sense: matrix functions where 1D arrays should be treated as > 2D row- or column-vectors. > > One example of this class of behavior would be np.linalg.solve, which could > support vectors like Ax=b and matrices like Ax=B with the signature > (m,m),(m,n?)->(m,n?). We couldn't immediately make np.linalg.solve a gufunc > since it uses a subtly different dispatching rule, but it's the same > use-case.
Specifically, np.linalg.solve uses a unique rule where solve(a, b) assumes that b is a stack of vectors if (a.ndim - 1 == b.ndim), and otherwise assumes that it's a stack of matrices. This is pretty confusing. You'd think that solve(a, b) should be equivalent to (inv(a) @ b), but it isn't. Say a.shape == (10, 3, 3) and b.shape == (3,). Then inv(a) @ b works, and does what you'd expect: for each of the ten 3x3 matrices in a, it computes the inverse and multiplies it by the 1-d vector in b (treated as a column vector). But solve(a, b) is an error, because the dimension aren't lined up to trigger the special handling for 1-d vectors. Or, say a.shape == (10, 3, 3) and b.shape == (3, 3). Then again inv(a) @ b works, and does what you'd expect: for each of the ten 3x3 matrices in a, it computes the inverse and multiplies it by the 3x3 matrix in b. But again solve(a, b) is an error -- this time because the special handling for 1-d vectors *does* kick in, even though it doesn't make sense: it tries to match up the ten 3x3 matrices in a against the three one-dimensional vectors in b, and 10 != 3 so the broadcasting fails. This also points to even more confusing possibilities: if a.shape == (3, 3) or (3, 3, 3, 3) and b.shape == (3, 3), then inv(a) @ b and solve(a, b) both work and do the same thing. But if a.shape == (3, 3, 3), then inv(a) @ b and solve(a, b) both work, and do totally *different* things. I wonder if we should deprecate these corner cases, and eventually migrate to making inv(a) @ b and solve(a, b) the same in all situations. If we did, then solve(a, b) would actually be a gufunc with signature (m,m),(m,n?)->(m,n?). I think the cases that would need to be changed are those where (a.ndim - 1 == b.ndim and b.ndim > 1). My guess is that this happens very rarely in existing code, especially since (IIRC) this behavior was only added a few years ago, when we gufunc-ified numpy.linalg. > Another example would be the "matrix transpose" function that has been > occasionally proposed, to swap the last two dimensions of an array. It could > have the signature (m?,n)->(n,m?), which ensure that it is still well > defined (as the identity) on 1d arrays. Unfortunately I don't think we could make "broadcasting matrix transpose" be literally a gufunc, since it should return a view. But I guess there'd still be some value in having the notation available just when talking about it, so we could say "this operation is *like* a gufunc with signature (m?,n)->(n,m?), except that it returns a view". -n -- Nathaniel J. Smith -- https://vorpus.org _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion@python.org https://mail.python.org/mailman/listinfo/numpy-discussion