IRIX64 tests
> Just out of interest, what are the tests looking like on IRIX?

mmm...not so good.

- SWM

world:~/src/Perl/parrot>uname -a
IRIX64 world 6.2 03131016 IP19
world:~/src/Perl/parrot>make test
perl t/harness
t/op/basic..ok 1/2skip() is UNIMPLEMENTED! at /home/abhaile/swmcd/perl/lib/site_perl/5.6.1/Test/More.pm line 505.
t/op/basic..dubious
        Test returned status 255 (wstat 65280, 0xff00)
DIED. FAILED test 2
        Failed 1/2 tests, 50.00% okay
t/op/integerok
t/op/number.ok 2/23Confused test output: test 2 answered after test 6
t/op/number.NOK 7Test output counter mismatch [test 7]
t/op/number.ok 2/23Confused test output: test 2 answered after test 7
t/op/number.NOK 8Test output counter mismatch [test 8]
t/op/number.ok 2/23Confused test output: test 2 answered after test 8
t/op/number.NOK 9Test output counter mismatch [test 9]
t/op/number.ok 2/23Confused test output: test 2 answered after test 9
t/op/number.NOK 10Test output counter mismatch [test 10]
t/op/number.ok 2/23Confused test output: test 2 answered after test 10
t/op/number.NOK 11Test output counter mismatch [test 11]
t/op/number.ok 2/23Confused test output: test 2 answered after test 11
t/op/number.NOK 12Test output counter mismatch [test 12]
t/op/number.ok 2/23Confused test output: test 2 answered after test 12
t/op/number.NOK 13Test output counter mismatch [test 13]
t/op/number.ok 2/23Confused test output: test 2 answered after test 13
t/op/number.NOK 14Test output counter mismatch [test 14]
t/op/number.ok 2/23Confused test output: test 2 answered after test 14
t/op/number.NOK 15Test output counter mismatch [test 15]
t/op/number.ok 2/23Confused test output: test 2 answered after test 15
t/op/number.NOK 16Test output counter mismatch [test 16]
t/op/number.ok 2/23Confused test output: test 2 answered after test 16
t/op/number.NOK 17Test output counter mismatch [test 17]
t/op/number.ok 2/23Confused test output: test 2 answered after test 17
t/op/number.NOK 18Test output counter mismatch [test 18]
t/op/number.ok 2/23Confused test output: test 2 answered after test 18
t/op/number.NOK 19skip() is UNIMPLEMENTED! at /home/abhaile/swmcd/perl/lib/site_perl/5.6.1/Test/More.pm line 505.
Test output counter mismatch [test 19]
t/op/number.dubious
        Test returned status 255 (wstat 65280, 0xff00)
DIED. FAILED tests 1-23
        Failed 23/23 tests, 0.00% okay
t/op/string.ok 3/5skip() is UNIMPLEMENTED! at /home/abhaile/swmcd/perl/lib/site_perl/5.6.1/Test/More.pm line 505.
t/op/string.dubious
        Test returned status 255 (wstat 65280, 0xff00)
DIED. FAILED tests 4-5
        Failed 2/5 tests, 60.00% okay
t/op/trans..dubious
        Test returned status 2 (wstat 512, 0x200)
DIED. FAILED tests 13, 18
        Failed 2/18 tests, 88.89% okay
Failed Test    Stat Wstat Total Fail  Failed  List of Failed
---
t/op/basic.t    255 65280     2    1  50.00%  2
t/op/number.t   255 65280    23  144 626.09%  1-23
t/op/string.t   255 65280     5    2  40.00%  4-5
t/op/trans.t      2   512    18    2  11.11%  13 18
Failed 4/5 test scripts, 20.00% okay. 6/74 subtests failed, 91.89% okay.
gmake: *** [test] Error 2
world:~/src/Perl/parrot>
Re: [PATCH] Fix IRIX64 warnings
> -opcode_t *(*(*opcode_funcs)[2048])(); /* Opcode */
> -                                      /* function table */
> -STRING_FUNCS *(*(*string_funcs)[64])(); /* String function table */
> +opcode_t *(**opcode_funcs)(); /* Opcode function table */
> +STRING_FUNCS *(**string_funcs)(); /* String function table */
>
> I'm a little unsure about this - where have those array declarations
> gone and why?

If you strip off the return type and argument list, the declaration of
opcode_funcs is

    *(*opcode_funcs)[2048]

which is a pointer to an array of function pointers (3 levels of
indirection). But if you look in interpreter.c, you find

    foo = mem_sys_allocate(2048 * sizeof(void *));
    ...
    interpreter->opcode_funcs = (void*)foo;

which allocates the array and assigns it directly to opcode_funcs
(2 levels of indirection), and the DO_OP macro has

    x = z->opcode_funcs; \\
    y = x[*w]; \\
    w = (y)(w,z); \\

which expands to

    code = (interpreter->opcode_funcs[*code])(code, interpreter);

(again, 2 levels of indirection).

So the declaration of opcode_funcs was at a different level of
indirection than its allocation and use. The compilers weren't
complaining about this because of all the (void *) casts. The IRIX64
compiler did complain, not about indirection levels, but about
assigning data pointers to function pointers.

For dynamic allocation of the opcode_funcs array (as in current code)
the appropriate declaration of opcode_funcs is

    opcode_t *(**opcode_funcs)();

For static allocation, write

    opcode_t *(*opcode_funcs[2048])();

and drop the mem_sys_allocate.

string_funcs isn't currently used, but I changed its declaration to
match opcode_funcs.

- SWM
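The two-level declaration described above can be tried out in isolation. The following is a minimal sketch, not Parrot's code: the `struct interp`, the two toy ops, and `run_program` are hypothetical names, `opcode_t` is assumed to be a plain integer type, and the function pointers carry a full prototype (the mail uses an empty parameter list). The table is sized with the function pointer type rather than `void *`, so no cast between object and function pointers is needed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef long opcode_t;                /* assumption: stand-in for Parrot's opcode_t */
struct interp;                        /* hypothetical, minimal interpreter */

/* One op: takes the current code pointer, returns the next one (NULL = stop). */
typedef opcode_t *(*op_func_t)(opcode_t *code, struct interp *in);

struct interp {
    op_func_t *opcode_funcs;          /* same shape as opcode_t *(**opcode_funcs)() */
    long acc;                         /* toy state touched by the ops */
};

enum { OP_END = 0, OP_INC = 1, NUM_OPS = 2 };

static opcode_t *op_end(opcode_t *code, struct interp *in)
{
    (void)code; (void)in;
    return NULL;
}

static opcode_t *op_inc(opcode_t *code, struct interp *in)
{
    in->acc++;
    return code + 1;
}

/* Build the table dynamically, run the program, return acc (-1 if malloc fails). */
long run_program(opcode_t *code)
{
    struct interp in = { NULL, 0 };

    /* Allocate an array of function pointers; no (void *) casts anywhere. */
    in.opcode_funcs = malloc(NUM_OPS * sizeof *in.opcode_funcs);
    if (in.opcode_funcs == NULL)
        return -1;
    in.opcode_funcs[OP_END] = op_end;
    in.opcode_funcs[OP_INC] = op_inc;

    while (code != NULL) {
        assert(*code >= 0 && *code < NUM_OPS);
        /* Two levels of indirection: index the table, then call. */
        code = in.opcode_funcs[*code](code, &in);
    }

    free(in.opcode_funcs);
    return in.acc;
}
```

Calling `run_program` on the program `{ OP_INC, OP_INC, OP_INC, OP_END }` runs three increments and then stops. For the static variant, the member would instead be an array, `op_func_t opcode_funcs[NUM_OPS];`, and the `malloc`/`free` pair goes away.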
[PATCH] Fix IRIX64 warnings
IRIX64 6.2 cc -n32 issues 123 warnings (one per op code) complaining that

interpreter.c, line 219: warning(1048): cast between pointer-to-object and
          pointer-to-function
      BUILD_TABLE(foo);
      ^

This patch makes them go away.

- SWM

Index: build_interp_starter.pl
===
RCS file: /home/perlcvs/parrot/build_interp_starter.pl,v
retrieving revision 1.11
diff -u -u -r1.11 build_interp_starter.pl
--- build_interp_starter.pl 2001/09/19 20:05:06 1.11
+++ build_interp_starter.pl 2001/09/24 01:59:04
@@ -21,7 +21,7 @@
 my $opcode_fingerprint = Parrot::Opcode::fingerprint();
 
 for my $name (sort {$opcodes{$a}{CODE} <=> $opcodes{$b}{CODE}} keys %opcodes) {
-    print INTERP "\tx[$opcodes{$name}{CODE}] = (void*)$name; \\\n";
+    print INTERP "\tx[$opcodes{$name}{CODE}] = $name; \\\n";
 }
 print INTERP "} while (0);\n";
 
@@ -61,8 +61,8 @@
 print INTERP <<EOI;
 
 #define DO_OP(w,x,y,z) do { \\
-    x = (void *)z->opcode_funcs; \\
-    y = (opcode_t* (*)())x[*w]; \\
+    x = z->opcode_funcs; \\
+    y = x[*w]; \\
     w = (y)(w,z); \\
   } while (0);
 EOI
Index: interpreter.c
===
RCS file: /home/perlcvs/parrot/interpreter.c,v
retrieving revision 1.18
diff -u -u -r1.18 interpreter.c
--- interpreter.c 2001/09/19 20:05:06 1.18
+++ interpreter.c 2001/09/24 01:59:04
@@ -48,8 +48,8 @@
 runops_notrace_core (struct Parrot_Interp *interpreter, opcode_t *code, IV code_size) {
     /* Move these out of the inner loop. No need to redeclare 'em each
        time through */
-    opcode_t *(*func)();
-    void **temp;
+    opcode_t *(* func)();
+    opcode_t *(**temp)();
     opcode_t *code_start;
 
     code_start = code;
@@ -95,8 +95,8 @@
 runops_trace_core (struct Parrot_Interp *interpreter, opcode_t *code, IV code_size) {
     /* Move these out of the inner loop. No need to redeclare 'em each
        time through */
-    opcode_t *(*func)();
-    void **temp;
+    opcode_t *( *func)();
+    opcode_t *(**temp)();
     opcode_t *code_start;
 
     code_start = code;
@@ -213,7 +213,7 @@
     /* The default opcode function table would be a good thing here... */
     {
-        void **foo;
+        opcode_t *(**foo)();
         foo = mem_sys_allocate(2048 * sizeof(void *));
 
         BUILD_TABLE(foo);
Index: include/parrot/interpreter.h
===
RCS file: /home/perlcvs/parrot/include/parrot/interpreter.h,v
retrieving revision 1.3
diff -u -u -r1.3 interpreter.h
--- include/parrot/interpreter.h 2001/09/19 20:04:45 1.3
+++ include/parrot/interpreter.h 2001/09/24 01:59:04
@@ -30,9 +30,8 @@
                                           /* variable area */
     struct Arenas *arena_base;            /* Pointer to this */
                                           /* interpreter's arena */
-    opcode_t *(*(*opcode_funcs)[2048])(); /* Opcode */
-                                          /* function table */
-    STRING_FUNCS *(*(*string_funcs)[64])(); /* String function table */
+    opcode_t *(**opcode_funcs)();         /* Opcode function table */
+    STRING_FUNCS *(**string_funcs)();     /* String function table */
     IV flags;                             /* Various interpreter flags that
                                              signal that runops should do
                                              something */
RFCs for thread models
RFC 178 proposes a shared data model for Perl6 threads. In a shared
data model
- globals are shared unless localized
- file-scoped lexicals are shared unless the thread recompiles the file
- block scoped lexicals may be shared by
  - passing a reference to them
  - closures
  - declaring one subroutine within the scope of another

In short, lots of stuff is shared, and just about everything can be
shared.

To prevent the interpreter from crashing in a shared data model, every
access to a named variable must be protected by a mutex

    lock(b.mutex)
    fetch(b)
    unlock(b.mutex)

    lock(a.mutex)
    store(a)
    unlock(a.mutex)

It has been argued on perl6-internals that
- acquiring mutexes takes time
- most variables aren't shared
- we should optimize for the common case
by requiring a :shared attribute on shared variables. Variables
declared without a :shared attribute would be isolated: each thread
gets its own value for the variable. In this model, the user incurs
the cost of mutexes only for data that is actually shared between
threads.

This is a valid argument. However, an isolated data model has its own
costs, and we need to understand these, so that we can compare them to
the costs of a shared data model.

The first interesting question is:

    How does a thread get access to its own value for a variable?

We can break the problem into two broad cases
- All threads execute the same op tree
- Each thread executes its own copy of the op tree

Let's look at these in detail

1. All threads execute the same op tree

Consider an op, like

    fetch(b)

If you actually compile a Perl program, like

    $a = $b

and then look at the op tree, you won't find the symbol "$b", or "b"
anywhere in it. The fetch() op does not have the name of the variable
$b; rather, it holds a pointer to the value for $b.

If each thread is to have its own value for $b, then the fetch() op
can't hold a pointer to *the* value. Instead, it must hold a pointer
to a map that indexes from thread ID to the value of $b for that
thread. Thread IDs tend to be sparse, so the map can't be implemented
as an array. It will have to be a hash, or a B*-tree, or a balanced
B-tree, or the like.

We can do this: we can build maps. But they take space to build, and
they take time to search, and we incur that space for every variable,
and we incur that time for every variable access.

2. Each thread executes its own copy of the op tree

This breaks down further according to how much of the op tree we copy,
and when we copy it. Here are several possibilities

2.1 Copy everything at thread creation

This is simple and straightforward. We copy the op tree for every
subroutine in the entire program at thread creation. As we copy the
ops, we create new values for all the variables, and set the new ops
to point to the new values.

Obviously, this takes space and time.

2.2 Copy subroutines on demand

We could defer copying subroutines until they are actually called by
the new thread. However, this leads to a problem analogous to the one
discussed in case 1 above. The entersub() op can no longer hold a
pointer to *the* coderef for the subroutine. Instead, it must hold a
pointer to a map that indexes from thread ID to the coderef.

The first time a thread calls a subroutine, it finds that there is no
entry for it in the map, makes a copy of the subroutine for itself,
and enters it into the map. Subsequent calls find the entry in the map
and call it immediately. All subroutine calls must search the map to
find the coderef.

2.3 Copy just the subroutines we need at thread creation

We could do a control flow analysis to determine the collection of
subroutines that can be called by a thread, and copy just those
subroutines when the thread is created. In this implementation, there
is no thread ID map: the entersub() op holds a pointer to the coderef.
This trades a more complex implementation for greater run-time
efficiency.

Constructs like

    $foo()

are likely to complicate control flow analysis. We could probably punt
on hard cases and make them go through a thread ID map.

RFC 178 describes a shared data model, and there has been enough
discussion of it on perl6-internals that we have some understanding of
its performance characteristics.

RFCs for other thread models would allow us to discuss them in
definite terms, and come to some understanding of their performance
characteristics, as well. This would then be a basis for choosing one
model over another.

Any volunteers?

- SWM
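To make the cost in case 1 concrete, here is a rough sketch of the kind of per-variable map a fetch() op would have to consult in an isolated data model. It is an illustration only: `struct tmap`, `tmap_lookup`, the fixed capacity, and the linear-probing scheme are all assumptions of this sketch, not part of any RFC or of Perl's internals, and a real map would itself need protection against concurrent insertion.

```c
#include <assert.h>
#include <stddef.h>

#define TMAP_SLOTS 64                 /* assumption: small fixed capacity */

struct tmap_entry {
    unsigned long tid;                /* thread ID owning this value */
    long value;                       /* that thread's value of the variable */
    int used;                         /* nonzero once the slot is occupied */
};

/* One of these per variable: maps sparse thread IDs to per-thread values. */
struct tmap {
    struct tmap_entry slot[TMAP_SLOTS];
};

/*
 * Return a pointer to the calling thread's value, creating a
 * zero-initialised entry on first access. Returns NULL if the table is
 * full. Entries are never deleted, so linear probing can stop at the
 * first unused slot.
 */
long *tmap_lookup(struct tmap *m, unsigned long tid)
{
    size_t start = (size_t)(tid % TMAP_SLOTS);

    for (size_t i = 0; i < TMAP_SLOTS; i++) {
        struct tmap_entry *e = &m->slot[(start + i) % TMAP_SLOTS];

        if (!e->used) {               /* first access by this thread */
            e->used = 1;
            e->tid = tid;
            e->value = 0;
            return &e->value;
        }
        if (e->tid == tid)            /* later accesses: search, then hit */
            return &e->value;
    }
    return NULL;
}
```

Every fetch or store of the variable pays for one `tmap_lookup` call, and every variable carries its own `struct tmap`; those are the time and space costs the text refers to. The same structure, with coderefs as values, would serve the entersub() map in case 2.2.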
Re: RFCs for thread models
SWM> If you actually compile a Perl program, like
SWM>     $a = $b
SWM> and then look at the op tree, you won't find the symbol "$b", or "b"
SWM> anywhere in it. The fetch() op does not have the name of the variable
SWM> $b; rather, it holds a pointer to the value for $b.

> Where did you get this idea from? P5 currently does many lookups for
> names. All globals. Lexicals live elsewhere.

From perlmod.pod, Symbol Tables:

    the following have the same effect, though the first is more
    efficient because it does the symbol table lookups at compile
    time:

        local *main::foo = *main::bar;
        local $main::{foo} = $main::{bar};

Perhaps I misinterpreted it.

> You are imagining an implementation and then arguing against it.

Yes.

> Here is my current 'guess'.
> [...]
> Now where
>     sub recursive() { my $a :shared; ; return recursive() }
> would put $a or even which $a is meant, is left as an exercise

My point is that we can't work with guesses and exercises. We need a
specific, detailed proposal that we can discuss and evaluate. I'm
hoping that someone will submit an RFC for one.

- SWM
Re: RFC 178 (v2) Lightweight Threads
> You aren't being clear here.
>
>     fetch($a)       fetch($a)
>     fetch($b)       ...
>     add             ...
>     store($a)       store($a)
>
> Now all of the perl internals are done 'safely' but the result is
> garbage. You don't even know the result of the addition.

Sorry you are right, I wasn't clear. You are correct - the final value
of $a will depend on the exact ordering of the FETCHEs and STOREs in
the two threads.

...I hadn't been thinking in terms of the stack machine. OK, we could
put the internal locks around fetch and store. Now, can everyone deal
with these examples

Example

    $a = 0;
    $thread = new Thread sub { $a++ };
    $a++;
    $thread->join;
    print $a;

Output: 1 or 2

Example

    @a = ();
    async { push @a, (1, 2, 3) };
    push @a, (4, 5, 6);
    print @a;

Possible output: 142536

- SWM
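The "1 or 2" outcome of the first example can be reproduced without real threads. The sketch below is a deterministic simulation under the assumption stated in the mail, namely that only the individual fetch and store steps are locked; `run_schedule` and its schedule-string format are invented for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Simulate two threads that each execute "$a++" as fetch, add, store.
 * schedule is a string of '1' and '2' saying which thread takes its
 * next step. Each step is atomic on its own (as if wrapped in
 * lock/unlock), but nothing holds a lock across the whole increment.
 */
long run_schedule(const char *schedule)
{
    long a = 0;                 /* the shared variable $a */
    long reg[2] = { 0, 0 };     /* each thread's private fetched value */
    int step[2] = { 0, 0 };     /* 0 = fetch next, 1 = store next, 2 = done */

    for (const char *p = schedule; *p != '\0'; p++) {
        int t = *p - '1';

        assert(t == 0 || t == 1);
        if (step[t] == 0) {             /* lock; fetch; unlock */
            reg[t] = a;
            step[t] = 1;
        } else if (step[t] == 1) {      /* add privately, then lock; store; unlock */
            a = reg[t] + 1;
            step[t] = 2;
        }
    }
    return a;
}
```

With the schedule "1122" one thread finishes before the other starts and the result is 2; with "1212" both threads fetch 0 before either stores, and the result is 1. No interleaving corrupts `a`, which is the internal guarantee; getting 2 every time is left to user-level locking.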
Re: RFC 178 (v2) Lightweight Threads
> Example
>
>     @a = ();
>     async { push @a, (1, 2, 3) };
>     push @a, (4, 5, 6);
>     print @a;
>
> Possible output: 142536

Actually, I'm not sure I understand this. Can someone show how to
program push() on a stack machine?

- SWM
Re: RFC 178 (v2) Lightweight Threads
> I think there may be a necessity for more than just a work area to be
> non-shared. There has been no meaningful discussion so far related to
> the fact that the vast majority of perl6 modules will *NOT* be
> threaded, but that people will want to use them in threaded programs.
> That is a non-trivial problem that may best be solved by keeping the
> entirety of such modules private to a single thread. In that case the
> optree might also have to be private, and with that and private work
> area it looks very much like a full interpreter to me.

RFC 1 proposes this model, and there was some discussion of it on
perl6-language-flow. RFC 178 argues against it, under DISCUSSION,
Globals and Reentrancy.

- SWM
Re: RFC 178 (v2) Lightweight Threads
DS> Some things we can guarantee to be atomic.

> This is going to be tricky. A list of atomic guarantees by perl will
> be needed.

From RFC 178

    ...we have to decide which operations are [atomic]. As a starting
    point, we can take all the operators documented in C<perlop.pod>
    and all the functions documented in C<perlfunc.pod> as [atomic].

- SWM
Re: RFC 178 (v2) Lightweight Threads
> what if i do $i++ and overflow into the float (or bigint) domain? that
> is enough work that you would need to have a lock around the ++. so
> then all ++ would have implied locks and their baggage. i say no
> atomic ops in perl.

From RFC 178

    [Atomic] operations typically lock their operands to avoid race
    conditions

        Perl source     C Implementation
        $a = $b         lock(a.mutex);
                        lock(b.mutex);
                        free(a.pData);
                        a.length = b.length;
                        a.pData = malloc(a.length);
                        memcpy(a.pData, b.pData, a.length);
                        unlock(a.mutex);
                        unlock(b.mutex);

> leave the locking to the coder and keep perl clean.

If we don't provide this level of locking internally, then

    async { $a = $b }

is liable to crash the interpreter.

- SWM
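For readers who want to compile the RFC's pseudocode, the following is one possible rendering with POSIX mutexes. It is a sketch under assumptions: `struct scalar`, `scalar_init`, and `scalar_assign` are hypothetical names, buffers are assumed to be NUL-terminated and created only by these two functions, and the address-ordered locking is an addition of this sketch (the RFC text locks a and then b) meant to keep a concurrent `$a = $b` and `$b = $a` from deadlocking.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical scalar with the fields used in the RFC's pseudocode. */
struct scalar {
    pthread_mutex_t mutex;
    size_t length;
    char *pData;                /* length bytes plus a trailing NUL */
};

/* Initialise a scalar from a C string. Returns 0 on success. */
int scalar_init(struct scalar *s, const char *text)
{
    s->length = strlen(text);
    s->pData = malloc(s->length + 1);
    if (s->pData == NULL)
        return -1;
    memcpy(s->pData, text, s->length + 1);
    return pthread_mutex_init(&s->mutex, NULL);
}

/* $a = $b with both operands locked for the duration of the copy. */
int scalar_assign(struct scalar *a, struct scalar *b)
{
    if (a == b)
        return 0;

    /* Assumption of this sketch: always take the lower address first. */
    struct scalar *first = (uintptr_t)a < (uintptr_t)b ? a : b;
    struct scalar *second = (first == a) ? b : a;
    int rc = -1;

    pthread_mutex_lock(&first->mutex);
    pthread_mutex_lock(&second->mutex);

    /* Allocate before freeing, so a failed malloc leaves $a intact. */
    char *copy = malloc(b->length + 1);
    if (copy != NULL) {
        memcpy(copy, b->pData, b->length + 1);
        free(a->pData);
        a->pData = copy;
        a->length = b->length;
        rc = 0;
    }

    pthread_mutex_unlock(&second->mutex);
    pthread_mutex_unlock(&first->mutex);
    return rc;
}
```

Without the two locks, another thread could free or resize `a->pData` between the `malloc` and the `memcpy`, which is the interpreter crash the mail warns about.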
Re: RFC 178 (v1) Lightweight Threads
SWM> Question: Can the interpreter determine when a variable becomes
SWM> shared?
SWM> Answer: No. Then neglecting to put a :shared attribute on a shared
SWM> variable will crash the interpreter. This doesn't seem very Perlish.

> Err, no. It won't crash the interpreter. It'll make the script operate
> incorrectly.

This is just the distinction that I am concerned about. In RFC 178, I
used the term *data coherence* to mean that the interpreter won't
crash or corrupt its internal data representation, and the term *data
synchronization* to mean that the program actually does what the user
wants. Perhaps we could use the terms *internal* thread-safety and
*external* thread-safety, instead.

Here is an example of what can happen without internal thread-safety

Perl source:
    Thread1                         Thread2
    $a = 'a';                       $a = 'a' x 1_000_000;

Execution trace (C language pseudo code)
    a.size = 1;
    a.data = malloc(a.size);
                                    a.size = 1000000;
    memset(a.data, 'a', a.size);

Thread1's memset now writes a.size bytes into a 1-byte buffer.
Crash. Crash Burn. Do not pass Go, Do not collect $200.

Here is an example of what can happen without external thread-safety

Perl source:
    Thread1                         Thread2
    $a = 'a';                       $a = 'a' x 1_000_000;
    print "Thread1 $a";

Execution trace (Perl code)
    $a = 'a';
                                    $a = 'a' x 1_000_000;
    print $a;

and the user gets 999,999 more characters of output than they expect.

If the user cares about external thread-safety, they have to do their
own synchronization

Perl source:
    Thread1                         Thread2
    lock $a;
    $a = 'a';                       $a = 'a' x 1_000_000;
    print "Thread1 $a";

Now the output is guaranteed to be `a'.

All I want the language to guarantee is internal thread-safety.
Everything else is the user's problem.

- SWM
Re: RFC 178 (v2) Lightweight Threads
What I'm trying to do in RFC178 is take the thread model that you get
in compiled languages like C and C++, and combine it with the Perl5
programming model in a way that makes sense.

There may be reasons not to follow RFC178 in Perl6. Maybe
- it's too hard to implement
- there are performance problems
- Perl6 can actually do more for the user
- it just doesn't make sense for Perl6

But RFC178 is the thread model that I'd like to program in, and I'm
spec'ing it in the hopes that I'll actually get it in Perl6.

PRL> All threads see the same compiled subroutines

> Why? Why not allow two different threads to have a different view of
> the universe?

1. That's how it works in compiled languages. You have one .exe, and
   all threads run it.
2. Thread programming is difficult to begin with. A language where
   different threads see different code could be *very* difficult to
   program in.

PRL> All threads share the same global variables

> _All_ or only as requested by the user (ala :shared)?

All.

PRL> Each thread gets its own copy of block-scoped lexicals upon
PRL> execution of C<my>

> Why? Perhaps I want a shared my?

Different invocations of a subroutine within the same thread get their
own lexicals. It seems a natural extension to say that different
invocations of a subroutine in different threads also get their own
lexicals.

PRL> Threads can share block-scoped lexicals by passing a reference to a
PRL> lexical into a thread, by declaring one subroutine within the scope
PRL> of another, or with closures.

> Sounds complex to me. Why not make it simply visible by marking it as
> such?

These are the ways in which one subroutine can get access to the
lexical variables of another in Perl5. RFC178 specifies that these
mechanisms work across threads.

PRL> The interpreter guarantees data coherence

> It can't, don't even try. What if I need two or more variables kept in
> sync. The user has to mediate. Perl can't determine this.

Data coherence just means that the interpreter won't crash or corrupt
its internal data representation. RFC178 uses the term *data
synchronization* for coordinating access to multiple variables between
threads.

> Perhaps, I'm archaic, but I really wouldn't mind if the thread model
> basically copied the fork() model and required those variables that
> have to live across threads to be marked as :shared.

Sigh...if that's the best I can get, I'll take it.

- SWM
Re: RFC 178 (v1) Lightweight Threads
> Single thingee access mediation, should be done automatically by perl.
> The multi-thingee complex mediation should have the user step in,
> since solving it (correctly and efficiently) is a complex problem.

I'm not sure we have a common understanding of the terms we are using.
Can you give some examples showing what happens
- with single thingee access mediation
- without single thingee access mediation
- with multi-thingee complex mediation
- without multi-thingee complex mediation

- SWM
Re: RFC 178 (v1) Lightweight Threads
I think we are talking about the same issues, but we can't seem to get
in sync on the terminology. I'm going to try to get off the
merry-go-round by recapitulating the two approaches.

RFC178
- globals are shared unless localized
- file-scoped lexicals are shared by all code in the file
- block-scoped lexicals can be shared through @_, closures, or sub.

fork() on steroids
Variables are only shared if declared with :shared. If a variable is
not declared with :shared, then each thread gets a separate copy of
the data value.

Example

    my $a;
    my $b :shared;
    $a++;
    $b++;
    Thread->new(\&foo)->join;
    print "$a$b";
    sub foo { $a++; $b++; }

Output: 12

SWM> - without single thingee access mediation

>     my $a;
>
> Perl simply ignores locking. Thread gets the value of the winner in a
> race condition. Perl does _not_ crash and burn. Internal structures,
> mallocs, and accesses are properly mutexed.

I don't understand this. Is $a shared between threads or isn't it? If
it isn't, then every thread has its own copy of the data value, and
there isn't any need for locking. If it is, then these two statements
seem directly contradictory:
- Perl simply ignores locking.
- Internal structures, mallocs, and accesses are properly mutexed

- SWM
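The "fork() on steroids" example has a close analogue in C11, which may help pin down what "each thread gets a separate copy" means. This is only an analogy under assumptions: `_Thread_local` stands in for an unmarked variable, an ordinary static guarded by a mutex stands in for `:shared`, and the names `foo`, `b_mutex`, and `run_example` are invented here. One difference worth noting is that a new C thread sees the initializer value of a thread-local variable, not a copy of the parent's current value; the printed result is the same in this example either way.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>

static _Thread_local int a;     /* like "my $a": one copy per thread */
static int b;                   /* like "my $b :shared": one copy in total */
static pthread_mutex_t b_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Increment the shared variable under its mutex. */
static void incr_b(void)
{
    pthread_mutex_lock(&b_mutex);
    b++;
    pthread_mutex_unlock(&b_mutex);
}

/* Body of the second thread: sub foo { $a++; $b++; } */
static void *foo(void *arg)
{
    (void)arg;
    a++;                        /* touches this thread's own a only */
    incr_b();
    return NULL;
}

/* Runs the example and formats "$a$b" into out. Returns -1 on failure. */
int run_example(char *out, size_t n)
{
    pthread_t t;

    a = 0;
    b = 0;
    a++;
    incr_b();
    if (pthread_create(&t, NULL, foo, NULL) != 0)
        return -1;
    if (pthread_join(t, NULL) != 0)
        return -1;
    return snprintf(out, n, "%d%d", a, b);
}
```

The main thread's `a` is incremented once and the other thread's increment lands on a different copy, while both increments of `b` hit the same storage, which gives the same "12" as the Perl example. Only `b` needs a mutex; accesses to `a` need none, which is the answer the last question in the mail is driving at.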