This and other RFCs are available on the web at http://dev.perl.org/rfc/ =head1 TITLE Lightweight Threads =head1 VERSION Maintainer: Steven McDougall <[EMAIL PROTECTED]> Date: 30 Aug 2000 Mailing List: [EMAIL PROTECTED] Version: 1 Number: 178 =head1 ABSTRACT A lightweight thread model for Perl. =over 4 =item * All threads see the same compiled subroutines =item * All threads share the same global variables =item * Threads can create thread-local storage by C<local>izing global variables =item * All threads share the same file-scoped lexicals =item * Each thread gets its own copy of block-scoped lexicals upon execution of C<my> =item * Open code can only be executed by a thread that compiles it =item * The interpreter guarantees data coherence =back =head1 DESCRIPTION The overriding design principle in this model is that there is one program executing in multiple threads. One body of code; one set of global variables; many threads of execution. I like this model because =over 4 =item * I understand it =item * it does what I want =item * I think it can be implemented =back =head2 Notation =over 4 =item I<main> and I<spawned> threads We'll call the first thread that executes in a program the I<main> thread. It isn't distinguished in any other way. All other threads are called I<spawned> threads. =item I<open code> Code that isn't contained in a BLOCK. =back Examples are written in Perl5, and use the thread programming model documented in C<Thread.pm>. Discussions of performance and implementation issues are all based on the Perl5 internals; obviously, these are subject to change. =head2 All threads see the same compiled subroutines Subroutines are typically defined during the initial compilation of a program. C<use>, C<require>, C<do>, and C<eval> can later define additional subroutines or redefine existing ones. Regardless, at any point in its execution, a program has one and only one collection of defined subroutines, and all threads see this collection. Example 1 sub foo { print 1 } sub hack_foo { eval 'sub foo { print 2 }' } foo(); Thread->new(\&hack_foo)->join; foo(); Output: 12. The main thread executes C<foo>; the spawned thread redefines C<foo>; the main thread executes the redefined subroutine. Example 2 sub foo { print 1 } sub hack_foo { eval 'sub foo { print 2 }' } foo(); Thread->new(\&hack_foo); foo(); Output: 11 or 12, according as the main thread does or does not make the second call to C<foo()> before the spawned thread redefines it. If the user cares which happens first, then they are responsible for doing their own synchronization, for example, with C<join>, as shown in Example 1. Code refs (like all Perl data objects) are reference counted. Threads increment the reference count upon entry to a subroutine, and decrement it upon exit. This ensures that the op tree won't be garbage collected while the thread is executing it. =head2 All threads share the same global variables Example 3 #!/my/path/to/perl $a = 1; Thread->new(\&foo)->join; print $a; sub foo { $a++ } Output: 2. C<$a> is a global, and it is the I<same> global in both the main thread and the spawned thread. =head2 Threads can create thread-local storage by C<local>izing global variables Example 4 #!/my/path/to/perl $a = 1; Thread->new(\&foo); print $a; sub foo { local $a = 2 } Output: 1. The spawned thread gets it's own copy of C<$a>. The copy of C<$a> in the main thread is unaffected. It doesn't matter whether the assignment in C<foo> executes before or after the C<print> in the main thread. It doesn't matter whether the copy of C<$a> goes out of scope before or after the C<print> executes. As in Perl5, C<local>ized variables are visible to any subroutines called while they remain in scope. Example 5 #!/my/path/to/perl $a = 1; Thread->new(\&foo); bar(); sub foo { local $a = 2; bar(); } sub bar { print $a } Output: 12 or 21, depending on the order in which the calls to C<bar> execute. Dynamic scopes are not inherited by spawned threads. Example 6 #!/my/path/to/perl $a = 1; foo(); sub foo { local $a = 2; Thread->new(\&bar)->join; } sub bar { print $a } Output: 1. The spawned thread sees the original value of C<$a>. =head2 All threads share the same file-scoped lexicals Example 7 #!/my/path/to/perl my $a = 1; Thread->new(\&foo)->join; print $a; sub foo { $a = 2 } Output: 2. C<$a> is a file-scoped lexical. It is the same variable in both the main thread and the spawned thread. =head2 Each thread gets its own copy of block-scoped lexicals upon execution of C<my> Example 8 #!/my/path/to/perl foo(); Thread->new(\&foo); sub foo { my $a = 1; print $a++; } Output: 11. This result is guaranteed, even if the statements execute in this order Main thread Spawned thread my $a = 1; my $a = 1; print $a++; print $a++ C<$a> is a block-scoped lexical variable. Every time a thread executes the C<my>, a new variable is created, completely unrelated to any other variable in any thread. =head2 Open code can only be executed by a thread that compiles it Threads execute BLOCKs new Thread \&foo new Thread sub { ... } async Thread { ... } This means that code that is not contained in a BLOCK can only be executed by a thread that compiles it. Example 9 #!/my/path/to/perl Thread->new(\&foo)->join; Thread->new(\&foo)->join; print $a; sub foo { require Bar; } # Bar.pm $a++; Output: 1. C<require> won't compile the same file twice, so the increment only executes in the first spawned thread. Example 10 #!/my/path/to/perl Thread->new(\&foo)->join; Thread->new(\&foo)->join; print $a; sub foo { do 'Bar.pm'; } # Bar.pm $a++; Output: 2. C<do> will compile the same file repeatedly, so the increment executes in both spawned threads. Example 11 #!/my/path/to/perl Thread->new(\&foo)->join; Thread->new(\&foo)->join; sub foo { do 'Bar.pm'; } # Bar.pm my $a = 1; print $a++; Output: 11. The C<my> creates a new file-scoped lexical each time it executes. Example 12 #!/my/path/to/perl $a++; async { do $0 } if $a < 2; print $a; Output: 12 or 21. Evil, but straightforward. The main thread and the spawned thread both compile and execute the program. =head2 The interpreter guarantees data coherence Any implementation of threads implicitly defines a boundary between the interpreter and the programmer: a boundary between what the interpreter guarantees about the execution of threads and what the programmer must ensure through synchronization mechanisms. This boundary must be =over * =item * precise =item * intuitive =back Otherwise, programmers will forever be guessing (wrong) about what exactly they can rely on the interpreter for. This RFC proposes that the boundary be placed at I<data coherence>. Data coherence means that primitive operations in Perl always produce a result that could be obtained if the operation executed atomically. Example 13 #!/my/path/to/perl $a = "abcdef"; $b = "123456"; my $t1 = async { $a = $b }; my $t2 = async { $b = $a }; $t1->join; $t2->join; print "$a $b"; Output: `abcdef abcdef' or `123456 123456' Assignment is a primitive operation. Even though the code doesn't synchronize execution of the two C<async> blocks, data coherence guarantees that each assignment executes as if it were atomic. In particular, we are guaranteed not to get output like `ab34ef ab34ef'. Controlling the order in which the two assignment statements execute is a I<data synchronization> problem. The interpreter doesn't make any guarantees regarding data synchronization. If the programmer cares about this, they must add synchronization mechanisms (e.g. mutexes) to the code. To fully specify a data coherence model, we have to decide which operations are primitive. As a starting point, we can take all the operators documented in C<perlop.pod> and all the functions documented in C<perlfunc.pod> as primitive. Example 14 #!/my/path/to/perl my $t1 = async { push @a, (1, 2, 3 ) }; my $t2 = async { push @a, (4, 5, 6 ) }; $t1->join; $t2->join; print @a; Output: 123456 or 456123 C<push> is a primitive operation. Data coherence guarantees that we won't get output like `142536'. Example 15 #!/my/path/to/perl $a = 'aa'; my $t1 = async { $a = 'bb' }; my $t2 = async { $a = 'cc' }; print "$a $a $a"; Possible output: `aa cc bb' Variable interpolation is a primitive operation. There are three interpolations in the C<print> statement; C<$a> is liable to change between any of them. Data coherence does guarantee that we won't get output like `ac cc bb'. =head1 IMPLEMENTATION Perl6 could have either cooperative threads or preemptive threads. =head2 Cooperative threads RFC 47 proposes that "there...be one event loop for all of Perl". This event loop would dispatch op codes, deliver signals, and invoke callbacks. It would be a natural extension of this architecture for the event loop to dispatch op codes for multiple threads of a Perl program. The big advantage of cooperative threads is that the Perl interpreter remains a single-threaded program. A Perl program may have many threads, but the interpreter has only one: it runs an event loop and it dispatches op codes. Because it is single-threaded, the interpreter is not subject to race conditions, and requires no synchronization code. Cooperative threads have several disadvantages =over 4 =item * The interpreter has to do asynchronous I/O. But Perl6 may support asynchronous I/O per RFC 47, and the interpreter has an event loop to run the callbacks... =item * The interpreter can't preempt XSUBs. But XSUBs don't I<have> to be ill-behaved. The XS interface could expose a C<yield()> call for XSUBs to call during time-consuming operations. =item * We don't get Symmetric MultiProcessing (SMP). No way around this one. Boo, Hiss. =back =head2 Preemptive threads The interpreter is implemented on top of a native threading package, such as PThreads. Each Perl thread runs in its own native thread. We get SMP, and we always get control back from XSUBs. (Although XSUBs can still crash the interpreter...) The big drawback of preemptive threads is that the interpreter itself becomes a multi-threaded program, with all attendant synchronization requirements. If Perl6 gets preemptive threads, expect race conditions to become the kind of ongoing headache that memory leaks were for Perl4 and Perl5. =head2 C<local> C<local> needs some new functionality in a multi-threaded program. Global variables are bound to globs at compile time. The glob holds a pointer to the actual data value. Perl source Internal data representation $a = 'abcd'; *a -> 'abcd' In a single-threaded program, C<local> creates a new data value and installs it in the glob. The new data value holds a pointer to the old value, so that the old value can be restored when the new one goes out of scope. { local $a = 42; } *a -> 42 -> "abcd" In a multi-threaded program, each thread can create its own local values for a global. The glob can keep these separate by having an array of pointers, indexed by thread ID. The glob also needs a pointer to the original value from the main thread, because that is the value that spawned threads see if they don't localize the variable. $a = 1; # ID 0; main thread { local $a = 2; async { $a = 3 }; # ID 1; sets original value async { local $a = 4 }; # ID 2 } *foo ORIGINAL --------+ | [0] -> 2 -> 3 <--+ [2] -> 4 Threads come and go, but the IDs keep incrementing, so the array will usually be sparse. A B* tree might be a suitable implementation. Threads that don't localize the variable don't get an entry in the array; instead, they follow the ORIGINAL pointer. However, the interpreter still has to search the array to discover this. =head2 Primitive operations Primitive operations typically lock their operands to avoid race conditions Perl source C Implementation $a = $b lock(a.mutex); lock(b.mutex); free(a.pData); a.length = b.length; a.pData = malloc(a.length); memcpy(a.pData, b.pData, a.length); unlock(a.mutex); unlock(b.mutex); Two threads that need to lock the same variables could deadlock. One way to avoid this is to lock variables in storage order if (a < b) { lock(a.mutex); lock(b.mutex); } else { lock(b.mutex); lock(a.mutex); } =head1 DISCUSSION =head2 Globals and Reentrancy RFC1 "Implementation of Threads in Perl" proposes that, by default, threads be isolated in separate data spaces. =over 4 =item * Each thread gets its own copy of all global variables. A special stash named C<global::> provides shared storage between threads. $a # different in different threads $global::a # shared between different threads =item * Each thread reC<use>'s all its modules, so that it any module data can be reinitialized for that thread. =back Discussion on perl6-language-flow has further suggested that each thread get its own copy of file-scoped lexicals. A C<:shared> attribute could be used to declare file-scoped lexical that are shared between threads. my $a # different in different threads my $a : shared # shared between different threads We'll call this an I<isolated> data model. The rational for adopting an isolated data model is that it will make existing Perl5 modules reentrant. This RFC proposes that Perl not take any special steps to isolate threads in separate data spaces. Globals are shared unless localized, and file-scoped lexicals are shared unless a thread recompiles the file. We'll call this a I<shared> data model. I prefer a shared data model because =over 4 =item * It does what I want. =item * One of the goals of Perl6 is to get out from under the backwards compatibility constraints that have boxed in Perl5. Organizing the threading model around the need to make Perl5 modules reentrant seems inconsistent with this. =item * The collection of Perl5 modules that an isolated data model can rescue from reentrancy problems may be vanishingly small; conversely, it may break modules that genuinely need global data. =back This isn't something that we can argue about with thought experiments. The modules are out there on CPAN; we have to look and see how they behave. I took a quick stroll through the modules that are installed on my own system; here is a small, non-random sample of what I found. =over 4 =item C<Sys::Hostname> C<Sys::Hostname> gets the system hostname and caches it in C<$Sys::Hostname::host>. This will work correctly in a shared data model, even without any synchronization mechanism. An isolated data model will defeat the cache, forcing every thread to look up the hostname itself. =item C<Set::IntSpan> Set::IntSpan uses one global: C<$Set::IntSpan::Empty_String>. All C<Set::IntSpan> must see the same value for this global. Applications typically set this global once and then leave it untouched; methods in C<Set::IntSpan> read it, but do not write it. This works correctly in a shared data model; it breaks in an isolated data model. =item C<Time::Local> Time::Local caches the start times of months in C<%Time::Local::cheat>. This works correctly in shared data model; an isolated data model defeats the cache. Some methods in C<Time::Local> store temporary values in package globals, e.g. C<$Time::Local::ym>. This works correctly in an isolated data model, and breaks in a shared data model. =item C<File::Find> C<File::Find> stores the name of the current file in C<$File::Find::name>, and the current directory path in C<$File::Find::path>. This works in an isolated data model, and breaks in a shared data model. However, C<File::Find> also C<cd>s to the directory where the current file is. This isn't reentrant, and it can't be made reentrant, because a process has only one CWD, which is shared by all threads. This means that the C<File::Find> interface is intrinsically broken under threads. =item C<Term::Complete> C<Term::Complete> stores key codes in globals: C<$Term::Complete::complete>, C<$Term::Complete::kill>, C<$Term::Complete::erase1>, and C<$Term::Complete::erase2>. This is reentrant in an isolated data model, and not in a shared data model. However, C<Term::Complete> isn't even reentrant I<under Perl5>. If two different parts of an application both use C<Term::Complete>, they don't need threads to fight over the values of its globals. I'm hard pressed to see that the design of Perl6 should be driven by the need to fix modules that are broken in Perl5. =back Again, this sample of modules isn't large, or random, or representative. But it does show that =over 4 =item * globals don't necessarily cause concurrency problems =item * not all concurrency problems can be fixed with an isolated data model =back =head2 Other concurrency mechanisms RFCs 27 and 31 discuss coroutines. RFC 47 discusses asynchronous I/O. I'm happy to have other concurrency mechanisms in Perl, but I want threads, and I don't want to give up any features of threads on the grounds that you can do the same thing with some other concurrency mechanism. =head1 REFERENCES RFC 1: Implementation of Threads in Perl RFC 27: Coroutines for Perl RFC 31: Co-routines RFC 47: Universal Asynchronous I/O RFC xxx: Thread Programming Model
