On Sat, Mar 15, 2014 at 11:30 PM, Charles R Harris <charlesr.har...@gmail.com> wrote:
>
> On Sat, Mar 15, 2014 at 7:20 PM, <josef.p...@gmail.com> wrote:
>
>> On Fri, Mar 14, 2014 at 11:41 PM, Nathaniel Smith <n...@pobox.com> wrote:
>>
>>> Hi all,
>>>
>>> Here's the main blocker for adding a matrix multiply operator '@' to
>>> Python: we need to decide what we think its precedence and associativity
>>> should be. I'll explain what that means so we're on the same page, and what
>>> the choices are, and then we can all argue about it. But even better would
>>> be if we could get some data to guide our decision, and this would be a lot
>>> easier if some of you all can help; I'll suggest some ways you might be
>>> able to do that.
>>>
>>> So! Precedence and left- versus right-associativity. If you already know
>>> what these are you can skim down until you see CAPITAL LETTERS.
>>>
>>> We all know what precedence is. Code like this:
>>>     a + b * c
>>> gets evaluated as:
>>>     a + (b * c)
>>> because * has higher precedence than +. It "binds more tightly", as they
>>> say. Python's complete precedence table is here:
>>>
>>>     http://docs.python.org/3/reference/expressions.html#operator-precedence
>>>
>>> Associativity, in the parsing sense, is less well known, though it's
>>> just as important. It's about deciding how to evaluate code like this:
>>>     a * b * c
>>> Do we use
>>>     a * (b * c)    # * is "right associative"
>>> or
>>>     (a * b) * c    # * is "left associative"
>>> ? Here all the operators have the same precedence (because, uh...
>>> they're the same operator), so precedence doesn't help. And mostly we can
>>> ignore this in day-to-day life, because both versions give the same answer,
>>> so who cares. But a programming language has to pick one (consider what
>>> happens if one of those objects has a non-default __mul__ implementation).
>>> And of course it matters a lot for non-associative operations like
>>>     a - b - c
>>> or
>>>     a / b / c
>>> So when figuring out the order of evaluation, what you do first is check
>>> the precedence, and then if you have multiple operators next to each other
>>> with the same precedence, you check their associativity. Notice that this
>>> means that if you have different operators that share the same precedence
>>> level (like + and -, or * and /), then they have to all have the same
>>> associativity. All else being equal, it's generally considered nice to have
>>> fewer precedence levels, because these have to be memorized by users.
>>>
>>> Right now in Python, every precedence level is left-associative, except
>>> for '**'. If you write these formulas without any parentheses, then what
>>> the interpreter will actually execute is:
>>>     (a * b) * c
>>>     (a - b) - c
>>>     (a / b) / c
>>> but
>>>     a ** (b ** c)
>>>
>>> Okay, that's the background. Here's the question. We need to decide on
>>> precedence and associativity for '@'.
>>> In particular, there are three
>>> different options that are interesting:
>>>
>>> OPTION 1 FOR @:
>>> Precedence: same as *
>>> Associativity: left
>>> My shorthand name for it: "same-left" (yes, very creative)
>>>
>>> This means that if you don't use parentheses, you get:
>>>     a @ b @ c  ->  (a @ b) @ c
>>>     a * b @ c  ->  (a * b) @ c
>>>     a @ b * c  ->  (a @ b) * c
>>>
>>> OPTION 2 FOR @:
>>> Precedence: more-weakly-binding than *
>>> Associativity: right
>>> My shorthand name for it: "weak-right"
>>>
>>> This means that if you don't use parentheses, you get:
>>>     a @ b @ c  ->  a @ (b @ c)
>>>     a * b @ c  ->  (a * b) @ c
>>>     a @ b * c  ->  a @ (b * c)
>>>
>>> OPTION 3 FOR @:
>>> Precedence: more-tightly-binding than *
>>> Associativity: right
>>> My shorthand name for it: "tight-right"
>>>
>>> This means that if you don't use parentheses, you get:
>>>     a @ b @ c  ->  a @ (b @ c)
>>>     a * b @ c  ->  a * (b @ c)
>>>     a @ b * c  ->  (a @ b) * c
>>>
>>> We need to pick which of these options we think is best, based on
>>> whatever reasons we can think of, ideally more than "hmm, weak-right gives
>>> me warm fuzzy feelings" ;-). (In principle the other 2 possible options are
>>> tight-left and weak-left, but there doesn't seem to be any argument in
>>> favor of either, so we'll leave them out of the discussion.)
>>>
>>> Some things to consider:
>>>
>>> * and @ are actually not associative (in the math sense) with respect to
>>> each other, i.e., (a * b) @ c and a * (b @ c) in general give different
>>> results when 'a' is not a scalar. So considering the two expressions
>>> 'a * b @ c' and 'a @ b * c', we can see that each of these three options
>>> produces different results in some cases.
>>>
>>> "Same-left" is the easiest to explain and remember, because it's just,
>>> "@ acts like * and /".
>>> So we already have to know the rule in order to
>>> understand other non-associative expressions like a / b / c or a - b - c,
>>> and it'd be nice if the same rule applied to things like a * b @ c so we
>>> only had to memorize *one* rule. (Of course there's ** which uses the
>>> opposite rule, but I guess everyone internalized that one in secondary
>>> school; that's not true for * versus @.) This is definitely the default we
>>> should choose unless we have a good reason to do otherwise.
>>>
>>> BUT: there might indeed be a good reason to do otherwise, which is the
>>> whole reason this has come up. Consider:
>>>     Mat1 @ Mat2 @ vec
>>> Obviously this will execute much more quickly if we do
>>>     Mat1 @ (Mat2 @ vec)
>>> because that results in two cheap matrix-vector multiplies, while
>>>     (Mat1 @ Mat2) @ vec
>>> starts out by doing an expensive matrix-matrix multiply. So: maybe @
>>> should be right associative, so that we get the fast behaviour without
>>> having to use explicit parentheses! /If/ these kinds of expressions are
>>> common enough that having to remember to put explicit parentheses in all
>>> the time is more of a programmer burden than having to memorize a special
>>> associativity rule for @. Obviously Mat @ Mat @ vec is more common than
>>> vec @ Mat @ Mat, but maybe they're both so rare that it doesn't matter in
>>> practice -- I don't know.
>>>
>>> Also, if we do want @ to be right associative, then I can't think of any
>>> clever reasons to prefer weak-right over tight-right, or vice-versa. For
>>> the scalar multiplication case, I believe both options produce the same
>>> result in the same amount of time. For the non-scalar case, they give
>>> different answers. Do people have strong intuitions about what expressions
>>> like
>>>     a * b @ c
>>>     a @ b * c
>>> should do actually? (I'm guessing not, but hey, you never know.)
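
As a rough illustration of the cost argument quoted above, the sketch below counts scalar multiplications for the two groupings, assuming a naive (schoolbook) matrix product. The helper names and the size n = 1000 are hypothetical, chosen only for this example; timings with a real BLAS-backed np.dot will differ.

```python
def matmul_cost(shape_a, shape_b):
    """Return (scalar multiplications, result shape) for a naive product.

    Shapes are (rows, cols) tuples; a column vector is written (n, 1).
    """
    rows, inner = shape_a
    inner_b, cols = shape_b
    if inner != inner_b:
        raise ValueError("inner dimensions do not match")
    return rows * inner * cols, (rows, cols)


def chain_cost_left(a, b, c):
    """Cost of (a @ b) @ c, the left-associative grouping."""
    cost_ab, ab = matmul_cost(a, b)
    cost_abc, _ = matmul_cost(ab, c)
    return cost_ab + cost_abc


def chain_cost_right(a, b, c):
    """Cost of a @ (b @ c), the right-associative grouping."""
    cost_bc, bc = matmul_cost(b, c)
    cost_abc, _ = matmul_cost(a, bc)
    return cost_bc + cost_abc


n = 1000  # hypothetical size, for illustration only
mat, vec = (n, n), (n, 1)
left = chain_cost_left(mat, mat, vec)    # (Mat1 @ Mat2) @ vec
right = chain_cost_right(mat, mat, vec)  # Mat1 @ (Mat2 @ vec)
print(left, right)
```

Under these assumptions the left grouping costs n**3 + n**2 multiplications and the right grouping 2 * n**2. Passing a (1, n) row vector as the first operand (vec @ Mat @ Mat) reverses the ranking, so neither grouping is cheaper in every case.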
>>>
>>> And, while intuition is useful, it would be really *really* nice to be
>>> basing these decisions on more than *just* intuition, since whatever we
>>> decide will be subtly influencing the experience of writing linear algebra
>>> code in Python for the rest of time. So here's where I could use some help.
>>> First, of course, if you have any other reasons why one or the other of
>>> these options is better, then please share! But second, I think we need to
>>> know something about how often the Mat @ Mat @ vec type cases arise in
>>> practice. How often do non-scalar * and np.dot show up in the same
>>> expression? How often does it look like a * np.dot(b, c), and how often
>>> does it look like np.dot(a * b, c)? How often do we see expressions like
>>> np.dot(np.dot(a, b), c), and how often do we see expressions like
>>> np.dot(a, np.dot(b, c))? This would really help guide the debate. I don't
>>> have this data, and I'm not sure the best way to get it. A super-fancy
>>> approach would be to write a little script that uses the 'ast' module to
>>> count things automatically. A less fancy approach would be to just pick
>>> some code you've written, or a well-known package, grep through for calls
>>> to 'dot', and make notes on what you see. (An advantage of the less-fancy
>>> approach is that as a human you might be able to tell the difference
>>> between scalar and non-scalar *, or check whether it actually matters what
>>> order the 'dot' calls are done in.)
>>>
>>> -n
>>>
>>> --
>>> Nathaniel J. Smith
>>> Postdoctoral researcher - Informatics - University of Edinburgh
>>> http://vorpus.org
>>>
>>> _______________________________________________
>>> NumPy-Discussion mailing list
>>> NumPy-Discussion@scipy.org
>>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>>>
>>
>> I'm in favor of same-left because it's the easiest to remember.
>> with scalar factors it is how I read formulas.
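
The 'ast'-based counting suggested in the quoted message could be sketched roughly as follows. This is only a minimal illustration, not an agreed-upon tool: the function names and the sample source are hypothetical, it only looks at the two-argument form dot(x, y) or np.dot(x, y), and it ignores method chains such as a.dot(b).dot(c).

```python
import ast
from collections import Counter


def is_dot_call(node):
    """True for calls spelled dot(...) or <anything>.dot(...)."""
    if not isinstance(node, ast.Call):
        return False
    func = node.func
    if isinstance(func, ast.Name):
        return func.id == "dot"
    if isinstance(func, ast.Attribute):
        return func.attr == "dot"
    return False


def is_mult(node):
    """True for a binary expression of the form x * y."""
    return isinstance(node, ast.BinOp) and isinstance(node.op, ast.Mult)


def count_dot_nesting(source):
    """Count how two-argument dot calls nest and mix with '*'."""
    counts = Counter()
    for node in ast.walk(ast.parse(source)):
        if not is_dot_call(node) or len(node.args) != 2:
            continue
        counts["dot calls"] += 1
        first, second = node.args
        if is_dot_call(first):
            counts["dot(dot(a, b), c)"] += 1  # left-nested
        if is_dot_call(second):
            counts["dot(a, dot(b, c))"] += 1  # right-nested
        if is_mult(first) or is_mult(second):
            counts["dot with a * argument"] += 1
    return counts


# Hypothetical sample input; in practice this would be a file's contents.
sample = """
H = np.dot(results.model.pinv_wexog, scale[:, None] * results.model.pinv_wexog.T)
left = np.dot(np.dot(a, b), c)
right = np.dot(a, np.dot(b, c))
"""
print(count_dot_nesting(sample))
```

A syntax-only count like this cannot tell scalar from non-scalar *, which is the limitation of the automated approach already noted in the quoted text.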
>>
>
> Note that if there are no (interior) vectors involved then the two methods
> of association give theoretically identical results. But when there is a
> vector on the right and no vector on the left, then right association is
> more efficient and likely more numerically accurate.
>

What's so special about a vector on the right? What if I have a vector on
the left, or, as is pretty common, a quadratic form?

Having a different associativity rule between a * b * c and A @ B @ C looks
confusing to me.

There is something special about the last array (in numpy):

    np.arange(5).dot(np.diag(np.ones(5))).dot(np.arange(10).reshape(5, 2, order="F"))
    np.arange(5).dot(np.diag(np.ones(5))).dot(np.arange(5))
    np.arange(5).dot(np.diag(np.ones(5))).dot(np.arange(5).reshape(5, 1))

Chains go left to right.

Josef

>
>> Both calculating dot @ first or calculating elementwise * first sound
>> logical, but I wouldn't know which should go first. (My "feeling" would be
>> @ first.)
>>
>> two cases I remembered in statsmodels
>>     H = np.dot(results.model.pinv_wexog,
>>                scale[:,None] * results.model.pinv_wexog.T)
>>     se = (exog * np.dot(covb, exog.T).T).sum(1)
>>
>> we are mixing * and dot pretty freely in all combinations AFAIR
>>
>> my guess is that I wouldn't trust any sequence without parentheses for a
>> long time.
>> (and I don't trust a sequence of dots @ without parentheses either, in
>> our applications.)
>>
>>     x @ (W.T @ W) @ x      ( W.shape = (10000, 5) )
>> or
>>     x * (W.T @ W) * x
>>
>
> Judicious use of parentheses is definitely recommended no matter what is
> decided.
>
>>     (w * x) @ x      weighted sum of squares
>>
>
> Chuck