Hi guys,
I'm in the middle of writing an RFC regarding Perl's use of the C
stack. If any of you has the time, please take a look at what I've
written below and send me any comments or corrections. I haven't got
to the implementation section yet, but it will involve smallish,
fixed-size stack frames for recursive runops(), a per-runlevel SV
arena, and judicious use of C<volatile>.
Thanks
-John
=head1 TITLE
Binary stack layout to support accurate garbage collection and
continuations
=head1 VERSION
=head1 ABSTRACT
C was not designed for, and is not well suited to, high-level language
interpreters or compiler output. However, it can be bludgeoned into
allowing enough stack inspection to implement accurate mark-and-sweep
garbage collection and Lisp/Scheme-style continuations. Perl might
gain performance from the former and a desirable feature in the
latter.
=head1 DESCRIPTION
=head2 Perl 5 Stack Management
Perl's virtual machine makes limited use of the machine stack as it
runs Perl code. That is, it normally is not very deep in C function
calls. This is a good thing, because C stack frames are a nuisance to
high-level languages. This aspect of Perl 5 is somewhat formalized in
its C<PERL_SI> and C<JMPENV> structures.
A C<PERL_SI> (``stack info'') structure is allocated whenever an op
must recursively call back into Perl. Such calls occur in C<sort>,
object destructors, tied variable accessor methods, overloaded
operator invocations, signal handlers (including C<$SIG{__WARN__}> and
C<$SIG{__DIE__}>), and the Perl API function C<require_pv>. There is
always a current, innermost C<PERL_SI> structure, pointed to by
C<PL_curstackinfo>, and the outer ones are reachable from it through a
linked list.
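To make the linked-list arrangement concrete, here is a minimal sketch
in plain C. It is a simplified model, not Perl's real layout: the
C<stack_info> type, its fields, and the functions C<push_stack_info>,
C<pop_stack_info>, and C<runlevel_depth> are all hypothetical names
invented for illustration, with C<cur_si> standing in for
C<PL_curstackinfo>.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for PERL_SI; not Perl's real layout. */
typedef struct stack_info {
    struct stack_info *prev;   /* next-outer runlevel, NULL for the outermost */
    const char        *reason; /* why Perl was re-entered, e.g. "sort" */
} stack_info;

/* Plays the role of PL_curstackinfo: always the innermost structure. */
static stack_info main_si = { NULL, "main" };
static stack_info *cur_si = &main_si;

/* Called when an op must call back into the interpreter. */
static void push_stack_info(const char *reason)
{
    stack_info *si = malloc(sizeof *si);
    assert(si != NULL);
    si->prev = cur_si;
    si->reason = reason;
    cur_si = si;
}

/* Called when the nested call returns. */
static void pop_stack_info(void)
{
    stack_info *si = cur_si;
    assert(si->prev != NULL);   /* never pop the outermost structure */
    cur_si = si->prev;
    free(si);
}

/* Number of runlevels, counted by walking outward from the innermost. */
static int runlevel_depth(void)
{
    int depth = 0;
    for (const stack_info *si = cur_si; si != NULL; si = si->prev)
        depth++;
    return depth;
}
```

In this model a C<sort> comparison routine that triggers a destructor
would give a depth of three: the main runlevel, one for C<sort>, and
one for the destructor.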
A C<JMPENV> (``jump environment'') structure is allocated when
I<foreign> C code, such as an XSub or embedding program, wishes to
enter the Perl interpreter and guarantee that control returns to the
point of entry. If no C<JMPENV> is used, non-local jump operations
like C<die> cause control to be transferred directly to an outer
C<eval> or equivalent frame, or they make the process terminate.
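The guarantee a jump environment provides can be modelled in plain C
with C<setjmp> and C<longjmp>. The sketch below is only a model under
stated assumptions: C<jump_env>, C<model_die>, C<model_perl_code>,
C<model_xsub>, and C<call_protected> are hypothetical names, and the
real C<JMPENV> carries more state than this. It shows an inner
environment catching a C<die> so that control returns to the point of
entry instead of escaping to an outer frame.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for JMPENV. */
typedef struct jump_env {
    jmp_buf          buf;   /* where to resume on a non-local jump */
    struct jump_env *prev;  /* next-outer jump environment */
    volatile int     code;  /* volatile: written between setjmp and longjmp */
} jump_env;

static jump_env *top_env = NULL;

/* Model of die: unwind to the innermost jump environment. */
static void model_die(int code)
{
    assert(top_env != NULL);
    top_env->code = code;
    longjmp(top_env->buf, 1);
}

/* Stand-in for running some Perl code that may or may not die. */
static void model_perl_code(int should_die)
{
    if (should_die)
        model_die(3);
}

/* Foreign C entry point: returns 0 on normal completion, else the die code. */
static int call_protected(void (*fn)(int), int arg)
{
    jump_env env;

    env.prev = top_env;
    env.code = 0;
    top_env = &env;
    if (setjmp(env.buf) == 0)
        fn(arg);
    top_env = env.prev;   /* reached on both the normal and the die path */
    return env.code;
}

/* XSub-like C function that re-enters "Perl" and wants control back. */
static int nested_result = -1;

static void model_xsub(int should_die)
{
    nested_result = call_protected(model_perl_code, should_die);
}
```

Calling C<call_protected(model_xsub, 1)> returns 0 to the outer caller
while C<nested_result> records the inner failure: the C<die> was
stopped at the point where the XSub entered the interpreter.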
C<PERL_SI> structures are not especially useful outside of Perl, and
neither they nor C<JMPENV>s are part of the public API. A C<JMPENV>
is used by API functions such as C<perl_parse>, C<perl_run>,
C<perl_destroy>, and C<eval_pv>. The C<eval_sv> family of API
functions lets one request that a C<JMPENV> be used by specifying
C<G_EVAL> in the flags.
=head2 Accurate garbage collection, continuations: impossible?
A tricky problem for garbage collection is to find live pointers in
the stack. C makes this difficult by hiding the details of stack
layout behind its support for functions and local (``automatic'')
variables. Typically, interpreters written in C maintain expensive,
global lists of objects to be released when a stack frame is unwound
(i.e., discarded through function return or an exception). The Perl 5
API provides the macros C<ENTER>, C<LEAVE>, C<SAVETMPS>, and
C<FREETMPS>, which serve this purpose (among others).
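The bookkeeping behind such a global list can be sketched roughly as
follows. This is a toy model, not Perl's implementation: C<value>,
C<temps>, C<make_temp>, C<save_temps>, and C<free_temps> are
hypothetical names, and returning the mark to the caller is a
simplification chosen for the sketch rather than a description of how
the real macros work.

```c
#include <assert.h>
#include <stdlib.h>

enum { MAX_TEMPS = 64 };

/* Hypothetical reference-counted value. */
typedef struct value { int refcnt; } value;

static value *temps[MAX_TEMPS];  /* global list of pending releases */
static int    temps_top = 0;     /* number of entries in use */
static int    live_values = 0;   /* allocation counter, for illustration */

static value *new_value(void)
{
    value *v = malloc(sizeof *v);
    assert(v != NULL);
    v->refcnt = 1;
    live_values++;
    return v;
}

static void release(value *v)
{
    if (--v->refcnt == 0) {
        free(v);
        live_values--;
    }
}

/* Roughly like making a value mortal: schedule one release for later. */
static value *make_temp(value *v)
{
    assert(temps_top < MAX_TEMPS);
    temps[temps_top++] = v;
    return v;
}

/* Roughly like SAVETMPS: remember where this frame's temporaries begin. */
static int save_temps(void)
{
    return temps_top;
}

/* Roughly like FREETMPS: release everything registered since the mark. */
static void free_temps(int mark)
{
    while (temps_top > mark)
        release(temps[--temps_top]);
}
```

Every temporary costs a store into the global array when it is created
and a second visit when the frame is unwound, which is the overhead
under discussion here.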
If the unwinding code knew where in the machine stack to find the Perl
temporaries, the global lists could be jettisoned. This might benefit
reference-counting garbage collection as well as the proposed new
mark-and-sweep systems.
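One way to picture this is a chain of fixed-size frame records that
live in C local variables, so that a collector can find every
temporary by walking the chain. The sketch below illustrates that
general technique (sometimes called a shadow stack) and is only an
illustration, not necessarily the implementation this RFC will
propose. The names C<root_frame>, C<frame_enter>, C<frame_leave>, and
C<mark_roots> are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum { FRAME_SLOTS = 4 };

/* Hypothetical collectable value with a mark bit. */
typedef struct value { int marked; } value;

/* Hypothetical fixed-size frame record, declared as a C local variable. */
typedef struct root_frame {
    struct root_frame *prev;               /* frame of the calling C function */
    value             *slot[FRAME_SLOTS];  /* temporaries owned by this frame */
} root_frame;

static root_frame *top_frame = NULL;

/* Link a frame record into the chain on function entry. */
static void frame_enter(root_frame *f)
{
    for (int i = 0; i < FRAME_SLOTS; i++)
        f->slot[i] = NULL;
    f->prev = top_frame;
    top_frame = f;
}

/* Unlink it on function exit; no global list needs to be trimmed. */
static void frame_leave(root_frame *f)
{
    assert(top_frame == f);
    top_frame = f->prev;
}

/* Root scan of a mark phase: mark every value held in any live frame.
 * Returns the number of values newly marked. */
static int mark_roots(void)
{
    int found = 0;
    for (root_frame *f = top_frame; f != NULL; f = f->prev) {
        for (int i = 0; i < FRAME_SLOTS; i++) {
            value *v = f->slot[i];
            if (v != NULL && !v->marked) {
                v->marked = 1;
                found++;
            }
        }
    }
    return found;
}
```

Unwinding a frame is then a single pointer assignment, and the
collector learns exactly which machine-stack locations hold values
without guessing at the C compiler's frame layout.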
What's more, being able to correlate machine stack addresses with Perl
values would facilitate the introduction of I<continuations> to Perl.
[INCOMPLETE]