Re: [sympy] Double dispatch

Ronan Lamy Sat, 28 Jul 2012 10:59:11 -0700

Le samedi 28 juillet 2012 à 10:37 +0200, Joachim Durchholz a écrit :
> Am 27.07.2012 22:55, schrieb Ronan Lamy:
> > I'm experimenting with a system implementing a perfectly sound
> > double-dispatch, even in the face of multiple inheritance,
> 
> A perfectly sound double-dispatch is like the quadrature of the circle: 
> it's mathematically impossible but too tempting not to try.
> (More on that below.)


We do need double-dispatch, no matter what, though. We already do it, in
a rather haphazard fashion.
> 
>  > bypassing Python's byzantine and subtly incorrect rules.
> 
> Yep, that's not uncommon for incomplete solutions.
> 
>  > It is based on a callable
> > binary_operation object that stores implementations for pairs of types.
> > When it is called, it finds the implementation corresponding to the most
> > derived pair of types or raises an error in case of ambiguity.
> 
> There is no unique "most derived pair of types", so you'll need a 
> disambiguation rule.
> Whatever rule you choose, the routine it selects for a given combination 
> of types will fluctuate wildly depending on what slots in the grid have 
> or don't have an implementation.

Well, my implementation doesn't have a disambiguation rule. It raises an
error instead.
> 
> For example, consider this grid (x means an implementation is provided, 
> _ means no implementation present, uppercase letters indicating types):
> 
>    A B C
> A x x x
> B x _ _
> C x _ _
> 
> There is no implementation for C^C. There is no single most derived 
> implementation, disambiguation would have to choose either A^C or C^A.
> 
> The problem here is whatever disambiguation rule you choose, it will 
> sometimes select unintuitively.
> Consider the following three grids:
> 
>    A B C     A B C     A B C
> A x x x   A x x x   A x x x
> B x _ x   B x _ _   B x x _
> C x _ _   C x x _   C x _ _
> 
> I'm pretty sure that different people will have different intuitions 
> about what the "right" selection is, so there's no way to choose an 
> intuitive disambiguation rule. That's bad for maintenance and bugfixing 
> because people reading the code will then make wrong intuitive 
> assumptions about which code will actually run when the operator is 
> called with two C parameters at runtime.

What they will get is an exception telling them which rules apply, i.e.
{(C, A), (B, C)}, {(C, B), (A, C)} or {(C, A), (B, B), (A, C)},
respectively. It's then up to the user to fix their code.
> 
> It gets worse. If the type hierarchy gets extended (say, a new class is 
> inserted between B and C), you'll find that with almost all 
> disambiguation rules, there will be cases where an existing call will 
> now disambiguate to different code than it did before.

You can't do that in Python (or, at least, you really shouldn't).

> One way to deal with this all is to insist that all slots in the grid be 
> filled explicitly.
> To the very least, you'll need to run a full set of tests for every grid 
> slot, just to make sure that whatever algorithm is chosen for each 
> combination of parameter types, the result is correct.

We should already be testing every slot of the grid for correctness, so
there isn't much of an additional testing burden. With this
implementation, it's easy to check that every combination dispatches
correctly. 

> The quadratic blowup isn't exactly nice, but type hierarchies typically 
> don't get deep enough to make that a really serious problem (not on 
> modern hardware anyway). Also, you can cut corners by leaving grid slots 
> open at the bottom of the hierarchy - but extending a type hierarchy 
> downwards means that those slots have to be filled, so this needs to be 
> explained in comments.

sympy has a lot of classes, and runtime class generation doesn't help,
so the quadratic blowup in the number of possible interactions is a
concern. 
> 
> The other way to deal with this would be to restructure the code so that 
> the formerly double-dispatching operator has a single implementation, 
> and calls single-dispatching code on each operand.
> For example, in a physics simulation, an impulse transfer operator for 
> collision handling won't double-dispatch on both operands, it will ask 
> each operand for their mass and speed (single-dispatch operations), 
> calculate an impulse change, and ask each operand to apply that change 
> to itself (again, just single dispatch on each operand).
> For arithmetic operators, the usual way out is to convert the more 
> specialized operand to the more general one.

That's fine for numerical code, but sympy has too many completely
different kinds of objects for this strategy to be applicable. Between a
Symbol, a HilbertSpace, a Ket and a Set, which one is the most general?
How can you cast one to the other? What attributes can they have in
common?
> 
> There's also the approach that the specializations are supposed to have 
> the exact same semantics, just be more efficient. In that case, 
> unintuitive disambiguation doesn't hurt - but you want to iterate 
> through all combinations of types in the tests and see that they give 
> the same result as the unoptimized version does.
> 
> 
> Apologies for the wall of text.
> I didn't know how to cover all aspects in less characters.
> 
> Regards,
> Jo
> 


-- 
You received this message because you are subscribed to the Google Groups 
"sympy" group.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to 
[email protected].
For more options, visit this group at 
http://groups.google.com/group/sympy?hl=en.

Re: [sympy] Double dispatch

Reply via email to