> >
> >I have a minor issue with a proliferation of createArray. In perl5 we
> >used the Stack for just about everything minus physically setting @x =
> >(1,2,3). The creation of a dynamic array is a memory hog.
>
> Less of a hog in many ways than using a stack. Worth the times when it's not.
I don't see how. A stack is "supposed" to be pre-existing and maintain
headroom for growth. The simplest stack (which I think is plenty peppy)
is one that checks bounds and realloc's when it grows too much. If it
grew that big once, then it would surely grow that bug in the future,
which alleviates memory fragmentation.
There's already a stack structure in interpreter_t. I'm not sure how
parameter passing is being implemented yet, but anything other than some
type of stack is going to have problems. (which can only be addressed via
dynamic allocation of arrays.. which are just slower-parameter-stacks).
>
> >ESPECIALLY with
> >mark-and-sweep dynamic memory (which is sounds like perl6 is aiming for).
>
> I don't see the connection here.
The connection had to do with the allocation / deallocation of these
intermediate dynamic arrays. It was just a call-of-attention to the ever
present point of performance degredation.
>
> If the compiler generates code like that and I'm happy about it, someone
> needs to smack me. *hard*. :)
>
> That ought to be either a series of print ops or, more likely, a set of
> push ops and a final prints.
But pushing onto what? A PMC which points to a dynamic array? Pushing is
arguable slower than pre-extending an array and statically assigning to an
index, since you avoid multiple increments. Not to mention if you push a
million entries, you might have to regrow the array 20 or more times.
I stated why I used the array for the print, but you could substitude
print for "@y = map { .. } (1,2,@x)" which would require an intermediate
array of memory. We just didn't have asembly available for that code.
You could set P1 to the new array and concat the three terms, but that
ultimately involves iterating through the array @x and "pushing values".
So you've not escaped the use of a stack. What's more since this is a
temporary stack, you deal with memory management.
Once again, the alternative is a general purpose stack, who's memory
management is minimal.
> >In any case, what we've done here is dynamically create an array each time
> >through the loop.
>
> Yes, but we're doing this with a stack-based system anyway. It's just an
> anonymous pesudo-array (i.e. the stack top), but an array nonetheless.
I'm not convinced that we can't /shouldn't implement persistent
general-purpose stacks, which include parameter passing. Obviously this
would require frame-support (since otherwise @_ would have to be a newly
allocated array), and it seems that the direction is towards
the exclusive use of registers. I just don't know that this doesn't
provide even greater performance / memory penalties than for non-static
parameter functions (such as print / map / grep or non-proto-typed
subroutines).
The way I see such a stack being used:
interp:
..
Stack stack;
Stack frame;
AUTO_OP push_x {
if ( stack->top >= stack->size ) {
grow_stack(stack);
}
setPMC( stack->data[ stack->top++ ], P1 );
}
AUTO_OP get_stack_frame_size {
REG_INT(P1) = stack->top - frame->data[ frame->top ];
}
AUTO_OP push_frame {
// auto-extend frame-stack
frame->data[ frame->top++ ] = stack->top;
}
AUTO_OP get_stack_X_i {
REG_X[P1] = [ stack->data[ frame->data[ frame->top ] + P2 ];
}
// set_stack_X_i
// alloc_stackn
AUTO_OP pop_frame {
stack->top = frame->data[ --frame->top ];
}
Internal c-code could even get pointers to the base-frame so as to avoid
all the indirections. So print, map, grep, etc would be able to read
quickly. You could additionally have efficient op-codes that copy in PMC
arrays to the stack.
That said, if we're exclusively using registers for parameter passing then
there's currently the issue of efficiently passing and returning values.
Currently the only way to pass parameters is to duplicate a given register
set, then move parameters around so they'll be where a subroutine expects
it. The called function would then push any additional
register-segments-stacks that weren't given parameters so it wouldn't
overwrite the caller's values. When complete, it would restore those
reg-stacks that it pushed and would leave return values as presents in P1,
I1, etc. But now the caller has to pop it's register-stacks to get back
to it's old values. But in doing so, it'll lose the return value(s).
There are two ways I can see this working. The first involves doing a
half-push / half-pop. Where you have a 16register input window and a 16
register output window. If a function can accomplish it's tasks with
whatever is left-over, then it doesn't need to do another half-push (it
just can't touch P1..16). On return, the return-values are in P17..P32.
The caller would have to copy the values back out then do a half-pop.
The other method uses an external return stack (which might also have
functioned as the parameter stack). Now it can be a simple PMC variable
w/in interpreter_t which is assigned to either a scalar or a dynamically
allocated array. But then that performance stone is once again stepped
on.
The alternative is to push values onto a persistent stack, push a new
reference frame, make the function call (who inspects the call stack,
possibly adds it's own reference frame), possibly makes another call which
potentially appends to the stack (thereby avoiding data-copying),
ultimately clears the calling stack-frame and or replaces it with the
return values (on the same frame). The caller then inspects the return
values and eventually pops-the frame.
This call-stack would only be used for values when the number of arguments
is not constant (or exceeds the number of parameter registers (which is at
most 16 for return values, and 16 for calling values if the policy
sanctions the above method)). If a caller knows they won't need a return
value, then they don't have to setup a stack-frame.
Another alternative follows the SPARC register windows. In the special
case that only PMC-scalars can be used as parameters / return values, then
you have two PMC register-stacks. One contains 32 locals (which are only
pushed at the beginning of a block if more are needed). The other
contains 32 registers where 16 are input and 16 are output. This stack
gets pushed by a function just before calling another function if they
needed to save parameters (just like sparc). This way you really have 48
registers at your disposal. Additionally you could push the IO stack as
many times as you needed so-as to produce a dynamic number of arguments.
You'd just have to specify somewhere how many values there are.
Note that the net effect of this last method is a persistent call-stack
(since existing code for register-sets maintain poped segments in a
linked-list).
Parrot is going to be littered with special cases, so I'm not suggesting
that this is how things could actually be done. But assuming the special
cases can be resolved, I'm of the impression that this is fast and simple.
None of this is going to matter until 0.1 anyway.
-Michael