RFC 1 (v2) Implementation of Threads in Perl

Perl6 RFC Librarian Fri, 04 Aug 2000 22:35:23 -0700
This and other RFCs are available on the web at
  http://tmtowtdi.perl.org/rfc/

=head1 TITLE

Implementation of Threads in Perl

=head1 VERSION

    Maintainer: Bryan C. Warnock <[EMAIL PROTECTED]>
    Date: 04 Aug 2000
    Version: 2
    Mailing List: [EMAIL PROTECTED]
    Number: 1

=head1 ABSTRACT

Perl 6 should be built around threads from the beginning.

=head1 DESCRIPTION

Perl 5 attempted (with relatively good success) to implement threads
atop the current architecture.  It did, unfortunately, leave several gaps,
traps, and "features" under heavy concurrent use.  These weaknesses could
be fixed if Perl were built with threading from the start.

All Perl programs are threaded.  Most just have only one thread.

=head1 MOTIVATORS

Impatience, Hubris, and Laziness, in that order.

=head1 IMPLEMENTATION

Build thread constructs into the internals, and let a Thread module add
user-level thread constructs safely and robustly, without penalizing
single-threaded programs.

=head2 SUMMARY OF IMPLEMENTATION

This summary is based on the current Perl 5 architecture.  As the internal
structure changes (for example, by adopting vtables), the thread design
will have to change with it.

=over 4

=item *

Create an additional pseudo-global stash, one per thread created, that is
local to that thread.  This stash would be the default space for
non-lexical variables.  Within one thread, C<$main::foo> and C<$foo> name
the same variable; in different threads, C<$main::foo> names a different
variable in each.  There need be no way to specify a particular
thread-space, as it should be visible only to the owning thread.

=item *

The Thread module should add a C<global> keyword or function that explicitly
accesses a variable in the program-global stash.

    global $main::foo = $foo;   # Let another thread know what my $foo is.
    global $main::foo = \$foo;  # Share my local foo.  Dangerous!
    $foo = global $main::foo;   # Localize this instance of $main::foo.

=item *

The Thread module should, on inclusion, also set the optree flag that triggers
mutex locking on variables within the perl core itself (as distinct from a
user-created and user-controlled mutex).  This is to guarantee that the
above constructs will actually work - user-created race conditions aside.

=item *

Populate the thread-space stash with the built-ins, rather than the
program-global stash.  Very few of the built-ins are meaningless in this
threaded construct.  Most are truly independent, and those that aren't,
like C<$^O>, should probably be read-only anyway.

=back
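
The lookup rules above can be modelled in plain Perl.  The following is a
minimal sketch only, not a proposed implementation: it creates no real
threads, and the C<ThreadSpace> package, its method names, and the
C<%GLOBAL> hash are hypothetical stand-ins for a thread-space stash, the
proposed C<global> keyword, and the program-global stash.

```perl
use strict;
use warnings;

# Hypothetical model only: no real threads are created here.
# %GLOBAL stands in for the program-global stash; each ThreadSpace
# object stands in for one thread's private stash.
our %GLOBAL;

package ThreadSpace;

sub new {
    my ($class) = @_;
    return bless { stash => {} }, $class;
}

# Plain access ($foo or $main::foo in the RFC) touches only this
# thread's own stash.
sub set {
    my ($self, $name, $value) = @_;
    $self->{stash}{$name} = $value;
    return $value;
}

sub get {
    my ($self, $name) = @_;
    return $self->{stash}{$name};
}

# "global" access reaches the program-global stash instead.
sub global_set {
    my ($self, $name, $value) = @_;
    $main::GLOBAL{$name} = $value;
    return $value;
}

sub global_get {
    my ($self, $name) = @_;
    return $main::GLOBAL{$name};
}

package main;

my $t1 = ThreadSpace->new;
my $t2 = ThreadSpace->new;

$t1->set(foo => 'one');    # $main::foo as seen by thread 1
$t2->set(foo => 'two');    # a different $main::foo in thread 2

$t1->global_set(foo => $t1->get('foo'));    # global $main::foo = $foo;
$t2->set(foo => $t2->global_get('foo'));    # $foo = global $main::foo;

print "thread 1 sees: ", $t1->get('foo'), "\n";
print "thread 2 sees: ", $t2->get('foo'), "\n";
```

In this model the two stashes stay independent until one side publishes a
value through the global stash and the other copies it back, mirroring the
first and third C<global> examples above.  Mutex locking is deliberately
left out of the sketch.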

=head2 IMPACT

=over 4

=item *

Impact on Perl on a non-thread-supporting architecture.  None.  (The mutex
locking code would be no-opped out, and the Thread module would fail on
inclusion, preventing any of the global semantics from being invoked.  The
thread space would appear to the program to be a standard global stash.)

=item *

Impact on Perl built for non-threaded use.  None.  Same as above.

=item *

Impact on a single-threaded program under a multi-threaded Perl.  None, most
likely, for the above reasons.  (There would be an additional flag check
rather than, I believe, the automatic mutex locking of the current scheme.)

=item *

Impact on multi-threaded scripts under a multi-threaded Perl.  Some.  Mutex
locking would occur much as it does today.  Current Perl scripts, written
without knowledge of global versus thread space, would find data-sharing
broken.  Threads have been declared experimental, and I believe the benefits
of simplifying threads in general outweigh the heartache of those who would
have to change their programs (and who would also benefit).  In addition,
see the notes about module inclusion below.

=item *

Impact on Perl 5.  Possible mutual compatibility between Perl 5 and Perl 6,
with the exception of C<use Thread> and the semantics it would add.  See the
notes below about module inclusion.  (Other changes to the language
notwithstanding, obviously.)

=back
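
The "additional flag check" mentioned above can be pictured as follows.
This is a rough sketch under stated assumptions: C<$THREADS_LOADED> is a
hypothetical stand-in for the optree flag that loading the Thread module
would set, C<guarded_store> is an invented helper, and a counter takes the
place of real mutex operations.

```perl
use strict;
use warnings;

# Hypothetical stand-ins: $THREADS_LOADED models the optree flag set when
# the Thread module is loaded; $lock_count counts simulated mutex
# acquisitions instead of taking a real lock.
our $THREADS_LOADED = 0;
our $lock_count     = 0;

# Store a value through a reference, "locking" only when the flag is set.
sub guarded_store {
    my ($var_ref, $value) = @_;
    if ($THREADS_LOADED) {
        $lock_count++;    # a real core would lock and unlock a mutex here
    }
    $$var_ref = $value;
    return $value;
}

my $counter;
guarded_store(\$counter, 1);    # flag off: only the flag check is paid
$THREADS_LOADED = 1;
guarded_store(\$counter, 2);    # flag on: the simulated lock is taken
print "simulated locks taken: $lock_count\n";
```

A program that never loads the Thread module pays for the test of the flag
and nothing else, which is the "None, most likely" case described above.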

=head2 UNKNOWNS

=over 4

=item *

Probably the biggest unknown, and the one with the largest potential impact,
will be exactly how module inclusion will work with threads under Perl 6.

Currently, modules are parsed and interpreted at compile time in a global
scope.  Under the above architecture, this will populate the primary thread
by default, as secondary threads are a run-time issue.

So how do secondary threads C<use> a module?  How do a module's symbols
find their way into the proper thread space, instead of cramming the primary
thread space at compile time, or, worse yet, completely undoing the entire
point of threads by making everything global?

Certainly, the global approach is not the desired default: you lose any
ability to make your interface reentrant unless it is specifically designed
and tested for thread use.  In that case, the module would need to
C<use Thread>, which would then initiate multi-threading (assuming the core
and platform supported it) even if the original program didn't.  Evil,
evil, evil.

Another possibility, and another one I do not like, is that a thread
inherits the entire stash of the parent thread.  Now you either need to
duplicate the entire stash, or resign yourself to automatically sharing
all the data.  Neither one is acceptable, for what I hope are obvious
reasons.

So that means that C<use Thread> must now also define a method for runtime
inclusion of modules.  This, in and of itself, should not be too difficult.
A possible syntax might be to include the necessary module names as arguments
to the spawning call.  But lexical scoping across multiple threads could
still be a problem.

Lastly, how do other compile-time constructs, such as C<BEGIN> and C<END>
blocks, deal with handling thread-space?  Is there going to be a need to
support similar constructs for thread creation?

=item *

Mutex locking of a hash or array versus the scalars it contains: does
locking the container lock its elements, and vice versa?

=item *

Mutex locking of a reference and its referent.

=item *

Limitations or assumptions on threading schemes other than those in pthreads,
due to the author's lack of experience with anything but.

=back
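
The idea of passing module names to the spawning call, raised under module
inclusion above, might look roughly like the sketch below.  It is only an
illustration: C<spawn_with_modules> is a hypothetical name, no thread is
actually created, and the code shows just the run-time loading step using
Perl 5's existing C<require>.

```perl
use strict;
use warnings;

# Hypothetical helper: load each named module at run time, as a spawning
# call might do before running the new thread's code.  No thread is
# created; the code reference is simply called in the current interpreter.
sub spawn_with_modules {
    my ($modules, $code) = @_;
    for my $module (@$modules) {
        # Accept only plain package names such as Foo or Foo::Bar.
        die "bad module name: $module\n"
            unless $module =~ /\A\w+(?:::\w+)*\z/;
        # Turn Foo::Bar into Foo/Bar.pm so require() accepts a run-time name.
        (my $file = "$module.pm") =~ s{::}{/}g;
        require $file;
    }
    return $code->();
}

my $total = spawn_with_modules(
    ['List::Util'],
    sub { List::Util::sum(1, 2, 3) },
);
print "total: $total\n";
```

Because C<require> records each loaded file in C<%INC> and loads it only
once per interpreter, this sketch also shows the open question: the
module's symbols land in one shared place, not in a per-thread space.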

=head1 REFERENCES

   None, currently.

=head1 CHANGES

=over 4

=item *

Added module inclusion lament under L<"UNKNOWNS">.

=back