On Thu, Sep 11, 2014 at 11:18 PM, Charles R Harris
<[email protected]> wrote:
>
> On Thu, Sep 11, 2014 at 8:49 PM, Nathaniel Smith <[email protected]> wrote:
>>
>> On Thu, Sep 11, 2014 at 10:12 PM, Charles R Harris
>> <[email protected]> wrote:
>> >
>> > On Thu, Sep 11, 2014 at 8:01 PM, Nathaniel Smith <[email protected]> wrote:
>> >>
>> >> On Thu, Sep 11, 2014 at 12:10 PM, Charles R Harris
>> >> <[email protected]> wrote:
>> >> >
>> >> > On Wed, Sep 10, 2014 at 10:08 PM, Nathaniel Smith <[email protected]>
>> >> > wrote:
>> >> >>
>> >> >> My vote is:
>> >> >>
>> >> >> __matmul__/__rmatmul__ do the standard dispatch stuff that all __op__
>> >> >> methods do (so I guess check __array_priority__ or whatever it is we
>> >> >> always do). I'd also be okay with ignoring __array_priority__ on the
>> >> >> grounds that __numpy_ufunc__ is better, and there's no existing code
>> >> >> relying on __array_priority__ support in __matmul__.
>> >> >>
>> >> >> Having decided that we are actually going to run, they dispatch
>> >> >> unconditionally to np.newdot(a, b) (or newdot(a, b, out=a) for the
>> >> >> in-place version), similarly to how e.g. __add__ dispatches to np.add.
>> >> >>
>> >> >> newdot acts like a standard gufunc with all the standard niceties,
>> >> >> including __numpy_ufunc__ dispatch.
>> >> >>
>> >> >> ("newdot" here is intended as a placeholder name, maybe it should be
>> >> >> np.linalg.matmul or something else to be bikeshed later. I also vote
>> >> >> that eventually 'dot' become an alias for this function, but whether
>> >> >> to do that is an orthogonal discussion for later.)
>> >> >>
>> >> > If we went the ufunc route, I think we would want three of them,
>> >> > matxvec, vecxmat, and matxmat, because the best inner loops would be
>> >> > different in the three cases,
>> >>
>> >> Couldn't we write a single inner loop like:
>> >>
>> >> void ufunc_loop(blah blah) {
>> >>     if (arg1_shape[0] == 1 && arg2_shape[1] == 1) {
>> >>         call DOT
>> >>     } else if (arg2_shape[0] == 1) {
>> >>         call GEMV
>> >>     } else if (...) {
>> >>         ...
>> >>     } else {
>> >>         call GEMM
>> >>     }
>> >> }
>> >>
>> >> ?
>> >
>> > Not for generalized ufuncs, different signatures, or if linearized, more
>> > info on dimensions.
>>
>> This sentence no verb, but I think the point you might be raising is:
>> we don't actually have the technical capability to define a single
>> gufunc for @, because the matmat, matvec, vecmat, and vecvec forms
>> have different gufunc signatures ("mn,nk->mk", "mn,n->m", "n,nk->k",
>> and "n,n->" respectively, I think)?
>>
>> This is true, but Jaime has said he's willing to look at fixing this :-):
>>
>> http://thread.gmane.org/gmane.comp.python.numeric.general/58669/focus=58670
>
> Don't see the need here, the loops are not complicated.
>
>> ...and fundamentally, it's very difficult to solve this anywhere else
>> except the ufunc internals. When we first enter a function like
>> newdot, we need to check for special overloads like __numpy_ufunc__
>> *before* we do any array_like->ndarray coercion. But we have to do
>> array_like->ndarray coercion before we know what the shape/ndim of
>> our inputs is.
>
> True.
>
>> And in the wrapper-around-multiple-ufuncs approach, we have to check
>> the shape/ndim before we choose which ufunc to dispatch to.
>
> True. I don't see a problem there and the four ufuncs would be useful in
> themselves. I think they should be part of the multiarray module methods
> if we go that way.
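(For concreteness, here's a rough pure-Python model of the single-entry-point
idea -- promote 1-d operands by adding one axis, funnel everything through
the matmat case, then drop the added axes again. Just a sketch: np.dot stands
in for the DOT/GEMV/GEMM inner loops, and "newdot_model" is an illustrative
name, not anything that exists in numpy. The real thing would be a gufunc
doing this at the shape/strides level, so it would also broadcast properly
over stacked dimensions.)

    import numpy as np

    def newdot_model(a, b):
        # Shape-level model of the 2-d/1-d cases only; np.dot plays the
        # role of the BLAS calls, and full broadcasting over stacked
        # dimensions is left to the hypothetical real gufunc.
        a = np.asarray(a)
        b = np.asarray(b)
        if a.ndim == 0 or b.ndim == 0:
            raise ValueError("matmul is undefined for scalars")
        a_was_vec = (a.ndim == 1)
        b_was_vec = (b.ndim == 1)
        if a_was_vec:
            a = a[np.newaxis, :]   # (n,) -> (1, n): one extra shape/strides entry
        if b_was_vec:
            b = b[:, np.newaxis]   # (n,) -> (n, 1)
        out = np.dot(a, b)         # always the matmat case now
        if a_was_vec:              # drop the axes we added, so that
            out = out[0]           # vec @ mat -> vec, mat @ vec -> vec,
        if b_was_vec:              # and vec @ vec -> scalar
            out = out[..., 0]
        return out
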
Not sure what you mean about multiarray module methods -- it's kinda
tricky to expose ufuncs from multiarray.so isn't it, b/c of the split
between multiarray.so and umath.so?

>> So __numpy_ufunc__ won't get checked until after we've already
>> coerced to ndarray, oops... OTOH the ufunc internals already know how
>> to do this dance, so it's at least straightforward (if not necessarily
>> trivial) to insert the correct logic once in the correct place.
>
> I'm thinking that the four ufuncs being overridden by user subtypes
> should be sufficient.

Maybe I don't understand what you're proposing. Suppose we get handed a
random 3rd party type that has a __numpy_ufunc__ attribute, but we know
nothing else about it. What do we do? Pick one of the 4 ufuncs at random
and call it?

>> > What you show is essentially what dot does now for cblas enabled
>> > functions. But note, we need more than the simple '@', we also need
>> > stacks of vectors, and turning vectors into matrices, and then back
>> > into vectors seems unnecessarily complicated.
>>
>> By the time we get into the data/shape/strides world that ufuncs work
>> with, converting vectors->matrices is literally just adding a single
>> entry to the shape+strides arrays. This feels trivial and cheap to me?
>
> And moving things around, then removing. It's an option if we just want
> to use matrix multiplication for everything. I don't think there is any
> speed advantage one way or the other, although there are currently size
> limitations that are easier to block in the 1D case than the matrix
> case. In practice 2**31 per dimension for floating point is probably
> plenty.

I guess we'll have to implement the 2d blocking sooner or later, and then
it won't much matter whether the 1d blocking is simpler, because
implementing *anything* for 1d blocking will still be more complicated
than just using the 2d blocking code. (Assuming DGEMM is just as fast as
DDOT/DGEMV, which seems likely.)

-n

-- 
Nathaniel J. Smith
Postdoctoral researcher - Informatics - University of Edinburgh
http://vorpus.org
_______________________________________________
NumPy-Discussion mailing list
[email protected]
http://mail.scipy.org/mailman/listinfo/numpy-discussion
