Re: Optimizing PMC-based MMD
chromatic wrote:
> Within the cmp op bodies, we *know* the arity and most of the types of MMD-participant arguments at compile time. We can get the types of PMC participants within the body of the op itself. Thus we could avoid most of the argument marshalling and counting and analysis if we had a way to perform cached MMD lookup without constructing a CallSignature PMC. That would clear up a third of the work.

This we should open up to general discussion. The consequence of short-cutting like this is that individual PMCs will no longer be able to override 'cmp' to do something other than multi-dispatch.

At the moment, developers still have the option of providing their own quick comparison, which gives an even more extreme speedup than this shortcut. So, question for language developers and other PMC developers, how important is the ability to define a 'cmp' vtable function that's called when the 'cmp' opcode is invoked? Or, is defining a 'cmp' multi for your PMC type enough?

> Another area for optimization is invoking a Sub from a signature PMC; I believe we're throwing away and recalculating valuable information, though we may have to wait for dramatic improvements until we can unify contexts and CallSignature.

Providing a new way of invoking Subs that uses CallSignatures all the way down is already planned in the coming series of calling conventions refactors.

> The final opportunity for optimization is making the PMC multis defined in PMCs use PCC instead of C calling conventions. Corresponding multis written in PIR already use PCC, and we want to support that, so we should unify our approach. That would remove the NCI expense here, though that's probably minor in comparison to the CallSignature PMC expense.

Changing all NCI calls to something more like PCC calls is already planned in the coming series of calling conventions refactors. Changing the Pmc2c generator to build PCC subs instead of NCI Subs is a quick change that could happen now.
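For discussion, here is a minimal C sketch of the shortcut quoted above: a two-argument dispatch cache keyed directly on the operands' type IDs, so that a cache hit never builds a signature object. Every name in it (`dispatch_cmp`, `slow_find_candidate`, `cmp_cache`, the integer type IDs, the one-entry-per-slot cache) is hypothetical and does not correspond to Parrot's actual MMD cache or API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not Parrot's real types or API. */
typedef int (*cmp_fn)(long left, long right);

enum { CACHE_SLOTS = 64 };  /* power of two, so a mask picks the slot */

typedef struct {
    int    left_type;
    int    right_type;
    cmp_fn candidate;       /* NULL marks an empty slot */
} cmp_cache_entry;

cmp_cache_entry cmp_cache[CACHE_SLOTS];  /* zero-initialized: all slots empty */
int slow_lookups = 0;                    /* counts trips through the slow path */

/* Map a pair of type IDs to a cache slot. */
size_t cache_slot(int left_type, int right_type) {
    unsigned h = (unsigned)left_type * 31u + (unsigned)right_type;
    return (size_t)(h & (CACHE_SLOTS - 1u));
}

/* The candidate a full search would pick for two integer-like operands. */
int int_cmp(long left, long right) {
    return (left > right) - (left < right);
}

/* Stand-in for the expensive path (signature object, candidate list, sort). */
cmp_fn slow_find_candidate(int left_type, int right_type) {
    (void)left_type;
    (void)right_type;
    slow_lookups++;
    return int_cmp;
}

/* The op body already knows the arity (2) and can read both type IDs,
 * so it can probe the cache before doing any marshalling. */
int dispatch_cmp(int left_type, long left, int right_type, long right) {
    cmp_cache_entry *e = &cmp_cache[cache_slot(left_type, right_type)];

    if (e->candidate == NULL
        || e->left_type != left_type
        || e->right_type != right_type) {
        cmp_fn found = slow_find_candidate(left_type, right_type);
        assert(found != NULL);
        e->left_type  = left_type;
        e->right_type = right_type;
        e->candidate  = found;
    }
    return e->candidate(left, right);
}
```

In this sketch only the first call for a given type pair pays for the slow path; later calls with the same pair go straight to the cached function pointer. Note that the shortcut bypasses any per-PMC vtable override entirely, which is the trade-off raised above.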
The calling conventions refactors are non-critical (some will likely land after 1.0), because the interface will stay the same, it's only the internals that will change.

Allison
Re: Optimizing PMC-based MMD
On Wed, Dec 24, 2008 at 09:55:58AM -0600, Allison Randal wrote:
> > Within the cmp op bodies, we *know* the arity and most of the types of MMD-participant arguments at compile time. We can get the types of PMC participants within the body of the op itself. Thus we could avoid most of the argument marshalling and counting and analysis if we had a way to perform cached MMD lookup without constructing a CallSignature PMC. That would clear up a third of the work.
>
> This we should open up to general discussion. The consequence of short-cutting like this is that individual PMCs will no longer be able to override 'cmp' to do something other than multi-dispatch.

Does "individual PMCs" here mean PMC instances or PMC classes? I.e., are you saying that a specific PMC instance could choose to override the cmp opcode for that individual PMC? If so, do we have any examples where this is being done now?

> At the moment, developers still have the option of providing their own quick comparison, which gives an even more extreme speedup than this shortcut. So, question for language developers and other PMC developers, how important is the ability to define a 'cmp' vtable function that's called when the 'cmp' opcode is invoked? Or, is defining a 'cmp' multi for your PMC type enough?

From a Rakudo perspective, the ability to define custom 'cmp' vtable functions doesn't appear to be at all important. Comparisons are almost invariably done by invoking :multi Sub PMCs of one form or another and letting those handle the MMD dispatch. The opcode form seems to impose too many limitations to be used directly.

To turn the question around a bit: I can tell that a lot of work has gone into Parrot to make MMD possible at the vtable level, but I haven't seen how vtable MMD is at all useful or usable in languages where operator overloading is possible from the HLL itself. And most dynamic languages I'm looking at seem to support that in one form or another.
If someone (Allison) could make an example of how vtable MMD is intended to improve things -- i.e., taking an HLL statement and showing how that translates to PIR that is improved by vtable MMD -- that would be very helpful.

> The calling conventions refactors are non-critical (some will likely land after 1.0), because the interface will stay the same, it's only the internals that will change.

Oh, I'm very disappointed to hear this. Named and positional argument handling still has an odd behavior [*], and Perl 6 still really needs the :lookahead option described earlier in the year. I thought that was going to be made possible by the refactor, which is partly why PDS had calling conventions scheduled for the December 2008 release.

[*] Currently named parameters are filled from any leftover positionals in the argument list -- there's no way to declare an argument that can _only_ be filled by name, short of defining a :slurpy array that grabs any extra positional arguments and then checking that the slurpy is empty.

And, Jonathan can correct me on this if I'm mistaken, but I suspect the other big reason that the calling convention refactor was scheduled for the December 2008 release is that it's likely a blocker or important component for the custom dispatcher that Jonathan will be creating for Rakudo as part of his funded grant. That's due to be completed by the end of January, IIRC.

Pm
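To make the footnote [*] concrete, here is a toy C model of the binding behavior it describes. It is only a sketch under stated assumptions: the `signature` and `binding` structs and both functions are hypothetical, the model counts arguments rather than binding real values, and it is not how Parrot's argument-passing code is written.

```c
#include <stddef.h>

/* Hypothetical description of a sub's declared parameters. */
typedef struct {
    size_t positional_params;  /* declared positional parameters */
    size_t named_params;       /* declared named parameters */
    int    has_slurpy;         /* nonzero: a slurpy array precedes the named ones */
} signature;

/* Where the leftover positional arguments ended up. */
typedef struct {
    size_t named_filled_by_position;  /* named params filled without a name */
    size_t slurpy_count;              /* extras captured by the slurpy array */
} binding;

/* Model of the behavior in the footnote: positional arguments beyond the
 * declared positional parameters spill into named parameters, unless a
 * slurpy array is declared to catch them first. */
binding bind_positionals(const signature *sig, size_t positional_args) {
    binding b = { 0, 0 };
    size_t leftover = positional_args > sig->positional_params
                    ? positional_args - sig->positional_params
                    : 0;

    if (sig->has_slurpy)
        b.slurpy_count = leftover;
    else
        b.named_filled_by_position =
            leftover < sig->named_params ? leftover : sig->named_params;

    return b;
}

/* The workaround: declare the slurpy, then reject the call if it caught
 * anything. Without a slurpy this toy check has nothing to test and
 * accepts the call. */
int call_is_valid(const signature *sig, size_t positional_args) {
    binding b = bind_positionals(sig, positional_args);
    return !(sig->has_slurpy && b.slurpy_count > 0);
}
```

With one positional and one named parameter, passing two positional arguments fills the named parameter by position in the unguarded case; with the slurpy guard the extra argument lands in the slurpy and the emptiness check rejects the call.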
Optimizing PMC-based MMD
The following code performs far more work than it has to, mostly due to crossing the C/PCC boundary multiple times, as well as throwing away known information:

    $P0 = box 10
    $I0 = cmp $P0, 10

This:

- calls VTABLE_cmp on $P0, reaching VTABLE_cmp in the Default PMC
- calls Parrot_mmd_multi_dispatch_from_c_args
    - passing 'cmp', 'PP-I' signature, and args as varargs
    - builds sig object from varargs
        - loops through signature string
        - creates a new CallSignature PMC
        - creates a new return PMC for all return arguments
        - creates a new CPointer for each return argument
        - pushes arguments onto the CallSignature PMC
    - builds a type tuple for MMD
        - loops through signature stored in CallSignature to find MMD-participant arguments
        - loops through type signature to set argument types
    - checks MMD cache
        - uses cached candidate if possible
    - finds new candidate
        - creates new array PMC for candidate list
        - searches CallSignature's namespace for candidates (?)
        - searches global MULTI namespace for candidates
        - sorts candidate list by MMD type tuple
            - loops through candidate list
            - calculates distance to each candidate
                - loops through each argument (parallel iteration over type tuple and argument list)
                - loops over all elements in MRO for each argument type
    - calls Parrot_pcc_invoke_sub_from_sig_object
        - converts CallSignature string to C string
        - creates array PMCs for arguments and results
        - counts number of arguments and return values (looping over signature string)
        - sets up input parameters in current context
            - loops over the C signature
            - assigns each parameter to the appropriate context
        - invokes the Parrot sub (NCI)
            - calls the NCI thunk (pcf_I_JPP)
                - calls Parrot_init_arg_nci
                    - inits data structures
                    - calls Parrot_init_arg_indexes_and_sig_pmc
                    - calls Parrot_init_arg_sig
                - calls C function
                - calls set_nci_I to store return value
                    - converts argument to INTVAL if necessary
                    - stores argument into register
        - assigns return values from the context to the CallSignature
            - loops over the C signature
            - assigns each return value appropriately

The default Integer case performs a C-level comparison. Most of this codepath is new as of the MMD branch merge.

Within the cmp op bodies, we *know* the arity and most of the types of MMD-participant arguments at compile time. We can get the types of PMC participants within the body of the op itself. Thus we could avoid most of the argument marshalling and counting and analysis if we had a way to perform cached MMD lookup without constructing a CallSignature PMC. That would clear up a third of the work.

Another area for optimization is invoking a Sub from a signature PMC; I believe we're throwing away and recalculating valuable information, though we may have to wait for dramatic improvements until we can unify contexts and CallSignature.

The final opportunity for optimization is making the PMC multis defined in PMCs use PCC instead of C calling conventions. Corresponding multis written in PIR already use PCC, and we want to support that, so we should unify our approach. That would remove the NCI expense here, though that's probably minor in comparison to the CallSignature PMC expense.

-- c
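The "calculates distance to each candidate" step in the list above is where the nested loops over arguments and MRO entries come from. A minimal C sketch of that step follows, assuming the distance is the sum over arguments of how far along each argument type's MRO the candidate's parameter type appears. The `type_info` struct, the function names, and the integer type IDs are hypothetical, and the exact distance metric is an assumption rather than a statement about Parrot's implementation.

```c
#include <stddef.h>

enum { MAX_MRO = 8, NOT_APPLICABLE = -1 };

/* Hypothetical type record: the MRO lists the type itself first,
 * then its parents in resolution order. */
typedef struct {
    int    mro[MAX_MRO];
    size_t mro_len;
} type_info;

/* Inner loop: position of param_type in the argument type's MRO,
 * or NOT_APPLICABLE if the argument is not an instance of it. */
int mro_distance(const type_info *arg_type, int param_type) {
    for (size_t i = 0; i < arg_type->mro_len; i++) {
        if (arg_type->mro[i] == param_type)
            return (int)i;
    }
    return NOT_APPLICABLE;
}

/* Outer loop: parallel iteration over the argument types and the
 * candidate's parameter types, summing the per-argument distances. */
int candidate_distance(const type_info *const *arg_types,
                       const int *param_types,
                       size_t arity) {
    int total = 0;

    for (size_t i = 0; i < arity; i++) {
        int d = mro_distance(arg_types[i], param_types[i]);
        if (d == NOT_APPLICABLE)
            return NOT_APPLICABLE;
        total += d;
    }
    return total;
}
```

The candidate with the smallest non-negative total would win. Even in this simplified form the cost is roughly arity times MRO depth per candidate, paid on every uncached dispatch, which is why keeping the result in a cache matters.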