Re: Generators -- Icon, Python, YAML and YATL
On Mon, 31 Dec 2001, Clark C . Evans wrote:

> Hello. I was wondering if Parrot is going to support Generators. A generator is a function that returns multiple times and, I believe, was first made available in the language Icon. Now, Icon may have taken it a bit too far (everything is a generator); however, Python's newest version supports generators. The generator PEP contains a more complete discussion:
>
> http://python.sourceforge.net/peps/pep-0255.html

After reading that I'm only left wondering how this concept connects with continuations. Something tells me that if we implement continuations then coroutines and generators will fall out nearly for free. On the other hand, if we don't do continuations then I think this will be quite hard.

Simply put, either we figure out how to package up program state into a nice package or we don't. Um, I think.

-sam
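To make "packaging up program state" a little more concrete, here is a minimal sketch in C of a generator written by hand, without continuations: the values that would normally live on the stack are moved into a struct so the function can be resumed and can "return" several times. The names `countdown_gen`, `countdown_init` and `countdown_next` are invented for this illustration and are not Parrot APIs.

```c
#include <assert.h>
#include <stdbool.h>

/* Everything a "yield" must preserve lives here instead of on the C stack. */
typedef struct {
    int  current;  /* next value to hand out */
    int  step;     /* amount subtracted between yields */
    bool done;     /* set once the sequence is exhausted */
} countdown_gen;

void countdown_init(countdown_gen *g, int start, int step) {
    assert(g != 0 && step > 0);
    g->current = start;
    g->step = step;
    g->done = (start < 0);
}

/* Resume the generator: store the next value in *out and return true,
 * or return false once there is nothing left to yield. */
bool countdown_next(countdown_gen *g, int *out) {
    if (g->done)
        return false;
    *out = g->current;
    g->current -= g->step;
    if (g->current < 0)
        g->done = true;
    return true;
}
```

A caller loops with `while (countdown_next(&g, &v)) { ... }`. With real continuations the struct would be captured automatically; here the programmer does the packaging by hand, which is roughly the difficulty the message is pointing at.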
Re: [PATCH] The Code Police [1/
On Sat, 29 Dec 2001, Boris Tschirschwitz wrote:

> I suggest
>
>     opcode_t* code_start

So what does this declare:

    opcode_t* code_start, code_end;

If you said "two pointers to opcode_t" then you just got fooled by your notation! Only code_start is a pointer; code_end is a plain opcode_t. If you want to move the '*' then it has to go to the RHS, next to the name, since that's where it is in the grammar.

-sam
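The point about the declaration can be checked mechanically. The sketch below assumes a stand-in `typedef long opcode_t` (the real definition is not shown in this thread) and uses C11 `_Generic` to report the type of the second declarator.

```c
#include <assert.h>
#include <stddef.h>

typedef long opcode_t;   /* stand-in type, assumed for this example only */

/* Returns 1 when the second name in "opcode_t* a, b;" is NOT a pointer. */
int second_declarator_is_plain_value(void) {
    opcode_t* code_start, code_end;   /* code_end is an opcode_t, not a pointer */
    code_end = 42;
    code_start = &code_end;
    assert(*code_start == 42);
    return _Generic(code_end, opcode_t: 1, default: 0);
}

/* Writing the '*' next to each name makes the intent explicit. */
int both_are_pointers(void) {
    opcode_t storage = 0;
    opcode_t *code_start = &storage, *code_end = &storage;
    return code_start == code_end;
}
```

The '*' binds to the declarator, not to the type name, so each pointer variable in a shared declaration needs its own '*'.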
Re: Thoughts on vtables...
On Sun, 2 Dec 2001, Michael L Maraist wrote:

> On Sunday 02 December 2001 02:47 pm, Brent Dax wrote:
> > Quick comment: I've been thinking about constructing an 'OpaqueHandle' PMC type. All the programmer is supposed to know about it is that it points to Something (usually a C struct, but they don't have to know that). I'm not sure what trying to access it results in--perhaps it'll look like a reference from the outside (ParrotInterp=HANDLE(0xDECAF)), or maybe it'll throw a tantrum.
>
> Perl5 just used a string as the generic c-struct handle as far as I know..

Actually, most XS I've seen uses an SvIV and just stores a pointer using PTR2IV. Then when you need to access the pointer you just do an INT2PTR and get it back out. Since Perl5's IV is guaranteed to be large enough for a pointer this all works fine.

-sam
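The store-a-pointer-in-an-integer trick described above can be sketched in plain C. This is an analogy only: it uses the standard `intptr_t` type rather than Perl's own macros, and `handle_t`, `handle_to_int` and `int_to_handle` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

typedef struct { int fd; } handle_t;   /* hypothetical C struct behind the handle */

/* Pack a pointer into an integer that is wide enough to hold it. */
intptr_t handle_to_int(handle_t *h) {
    assert(sizeof(intptr_t) >= sizeof(void *));
    return (intptr_t)(void *)h;
}

/* Recover the original pointer from the stored integer. */
handle_t *int_to_handle(intptr_t iv) {
    return (handle_t *)(void *)iv;
}
```

The round trip is only safe because the integer type is at least as wide as a pointer, which is the same guarantee the message relies on for Perl5's IV.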
RE: [PATCH] Don't count on snprintf
On Sat, 1 Dec 2001, Dan Sugalski wrote:

> At 01:20 PM 12/1/2001 -0500, Jeff G wrote:
> > I'll probably pull out perl5's snprintf function and add it to the vtables.
>
> Unfortunately I'll say the dreaded L word here... Licensing.

Wow, are we still running this project without a license? That's insane. What do we need to do to remedy this situation? Whatever it is we should do it now.

-sam
Re: Yet another switch/goto implementation
On Thu, 8 Nov 2001, Dan Sugalski wrote:

> Gack. Looks like a mis-placed optimization in perl 5. The list of a foreach is *supposed* to flatten at loop start and be static. Apparently not. :) Care to file the perl 5 bug report, or shall I?

It's not a bug. Check out the "Foreach Loops" section in perlsyn, where you'll find:

    If any element of LIST is an lvalue, you can modify it by modifying VAR inside the loop. Conversely, if any element of LIST is NOT an lvalue, any attempt to modify that element will fail. In other words, the foreach loop index variable is an implicit alias for each item in the list that you're looping over.

-sam
RE: Yet another switch/goto implementation
On Thu, 8 Nov 2001, Brent Dax wrote:

> That doesn't support your argument. The point is that in the statement:
>
>     foreach(@array) { ... }
>
> @array should only be evaluated once, at the beginning of the loop. In effect (using := here, but otherwise Perl 5 code):

Oops, you're right - I pulled the wrong section from perlsyn. I think this is the correct paragraph:

    If any part of LIST is an array, foreach will get very confused if you add or remove elements within the loop body, for example with splice. So don't do that.

;)

-sam
Re: Yet another switch/goto implementation
On Thu, 8 Nov 2001, Dan Sugalski wrote:

> Yes, it is a bug. There's an array in list context--it's supposed to be flattened before the foreach loop gets evaluated. (And if there are multiple arrays in the list it works as you'd expect)

Sorry, I quoted the wrong section. It really isn't a bug - the next paragraph in perlsyn explains:

    If any part of LIST is an array, foreach will get very confused if you add or remove elements within the loop body, for example with splice. So don't do that.

-sam
Re: Vtables fixed, scalar started
On Sun, 28 Oct 2001, Simon Cozens wrote:

> You are all encouraged to write implementations of the vtable functions in scalarclass.c

Cool. So, what needs to get done first? By that I mean, what is standing in the way of our creating tests for scalar PMCs? Maybe I'm anal retentive, but I don't know if I'm ready to start writing methods I can't test!

-sam
Re: Parameter passing conventions
On Fri, 26 Oct 2001, Dan Sugalski wrote:

> Okay, here are the conventions.

Great. Anyone want to offer up some examples or should I just wait for Jako support to see this in action?

-sam
Re: String rationale
On Thu, 25 Oct 2001, Dan Sugalski wrote:

> The only bits of the interpreter that much care about the string data are the regex engine parts, and those only operate on fixed-sized data.

Care to elaborate? I thought the mandate from Larry was to have regexes compile down to a stream of string ops. Doesn't that mean it should work regardless of the encoding of the string?

> The interpreter can only peek inside a string if that string is of fixed length, and the interpreter doesn't actually care about the character set the data is in.

Why is this necessary at all? Wouldn't it be preferable to have all access go through the String vtable regardless of the encoding?

> =item encoding
>
> Pointer to the library that handles the string encoding. Encoding is basically how the stream of bytes pointed to by C<bufstart> can be turned into a stream of 32-bit codepoints. Examples include UTF-8, Big 5, or Shift JIS. Unicode, Ascii, or EBCDIC are B<not> encodings.first

.first?

Aside from the above, this was a nice refresher.

-sam
Re: Resync your CVS...
Fresh checkout won't compile on Redhat Linux 7.1:

    string.c: In function `string_compare':
    string.c:161: warning: passing arg 1 of pointer to function from incompatible pointer type
    string.c:161: too few arguments to function
    string.c:164: warning: passing arg 1 of pointer to function from incompatible pointer type
    string.c:164: too few arguments to function
    make: *** [string.o] Error 1

-sam
Why is make test so slow?
Why is make test so durn slow? Our tools run individually seem pretty snappy on my low-end box (P200/64MB) but running make test is like watching paint dry. I'm seeing something like 1 test per second! Please feel free to point me to TFM if this question is answered there. -sam
[PATCH] Fixes logical ops in Parrot Scheme compiler
Here's a patch to fix logical ops in the Parrot Scheme compiler. The patch:

- Implements (min) and (max) which had stubs and some =pod'd out code which I couldn't understand.
- Fixes (=), (<), (>), (<=) and (>=) to work with more than 2 operands.

Added tests where they were missing and fixed tests that were incorrectly passing. After this patch make test has no failures on my box.

Question:

- Why does Scheme::Generator::_save(2) return an array of three elements? I tried to fix _save() and that just broke things... I must be missing something!

-sam

PS: Can we get this into languages/scheme?

diff -ru scheme.orig/Scheme/Generator.pm scheme/Scheme/Generator.pm
--- scheme.orig/Scheme/Generator.pm  Thu Oct 18 15:44:43 2001
+++ scheme/Scheme/Generator.pm  Sat Oct 20 17:36:19 2001
@@ -244,8 +244,9 @@
   $self->_add_inst('','set',["I$return",0]);
   $self->_generate($node->{children}[0],$temp[0]);
   for(1..$#{$node->{children}}) {
-    $self->_generate($node->{children}[1],$temp[1]);
+    $self->_generate($node->{children}[$_],$temp[1]);
     $self->_add_inst('','ne',["I$temp[0]","I$temp[1]","DONE_$label"]);
+    ($temp[0], $temp[1]) = ($temp[1], $temp[0]);
   }
   $self->_add_inst('','set',["I$return",1]);
   $self->_add_inst("DONE_$label");
@@ -260,8 +261,9 @@
   $self->_add_inst('','set',["I$return",0]);
   $self->_generate($node->{children}[0],$temp[0]);
   for(1..$#{$node->{children}}) {
-    $self->_generate($node->{children}[1],$temp[1]);
+    $self->_generate($node->{children}[$_],$temp[1]);
     $self->_add_inst('','ge',["I$temp[0]","I$temp[1]","DONE_$label"]);
+    ($temp[0], $temp[1]) = ($temp[1], $temp[0]);
   }
   $self->_add_inst('','set',["I$return",1]);
   $self->_add_inst("DONE_$label");
@@ -276,8 +278,9 @@
   $self->_add_inst('','set',["I$return",0]);
   $self->_generate($node->{children}[0],$temp[0]);
   for(1..$#{$node->{children}}) {
-    $self->_generate($node->{children}[1],$temp[1]);
+    $self->_generate($node->{children}[$_],$temp[1]);
     $self->_add_inst('','le',["I$temp[0]","I$temp[1]","DONE_$label"]);
+    ($temp[0], $temp[1]) = ($temp[1], $temp[0]);
   }
   $self->_add_inst('','set',["I$return",1]);
   $self->_add_inst("DONE_$label");
@@ -292,8 +295,9 @@
   $self->_add_inst('','set',["I$return",0]);
   $self->_generate($node->{children}[0],$temp[0]);
   for(1..$#{$node->{children}}) {
-    $self->_generate($node->{children}[1],$temp[1]);
+    $self->_generate($node->{children}[$_],$temp[1]);
     $self->_add_inst('','gt',["I$temp[0]","I$temp[1]","DONE_$label"]);
+    ($temp[0], $temp[1]) = ($temp[1], $temp[0]);
   }
   $self->_add_inst('','set',["I$return",1]);
   $self->_add_inst("DONE_$label");
@@ -308,8 +312,9 @@
   $self->_add_inst('','set',["I$return",0]);
   $self->_generate($node->{children}[0],$temp[0]);
   for(1..$#{$node->{children}}) {
-    $self->_generate($node->{children}[1],$temp[1]);
+    $self->_generate($node->{children}[$_],$temp[1]);
     $self->_add_inst('','lt',["I$temp[0]","I$temp[1]","DONE_$label"]);
+    ($temp[0], $temp[1]) = ($temp[1], $temp[0]);
   }
   $self->_add_inst('','set',["I$return",1]);
   $self->_add_inst("DONE_$label");
@@ -385,59 +390,32 @@
 sub _op_max {
   my ($self,$node,$return) = @_;
-  my $label = $self->_gensym();
-
-  my @temp = _save(1);
-  $self->_generate($node->{children}[0],$return);
-
-  _restore(@temp);
+  my ($targ) = _save(1);
-=pod
-
-  $self->__build_children($node);
-
-  $self->_add_inst('','set',[$return,$registers[0]]);
-
-  $self->_add_inst('', 'lt', [$registers[0],$return,$label]);
-  $self->_add_inst('', 'set',[$return,$registers[0]]);
-  for(1..$#registers) {
-    my $tmp_label = "NEXT_".$self->_gensym();
-    $self->_add_inst($label,'lt' ,[$registers[$_],$return,$tmp_label]);
-    $self->_add_inst('','set',[$return,$registers[$_]]);
-    $label = $tmp_label;
+  $self->_generate($node->{children}[0], $return);
+  for (1 .. $#{$node->{children}}) {
+    my $label = $self->_gensym();
+    $self->_generate($node->{children}[$_], $targ);
+    $self->_add_inst('','le',["I$targ","I$return","SKIP_$label"]);
+    $self->_add_inst('','set',["I$return","I$targ"]);
+    $self->_add_inst("SKIP_$label");
   }
-  $self->_add_inst($label);
-
-=cut
-
+  _restore($targ);
 }
 sub _op_min {
+  my ($self,$node,$return) = @_;
+  my ($targ) = _save(1);
-=pod
-
-  my $self = shift;
-  my $node = shift;
-  my $return = "I$node->{register}";
-
-  $self->__build_children($node);
-  my @registers = map { "I$_->{register}" } @{$node->{children}};
-
-  $self->_add_inst('','set',[$return,$registers[0]]);
-
-  my $label = "NEXT_".$self->_gensym();
-  $self->_add_inst('', 'gt', [$registers[0],$return,$label]);
-  $self->_add_inst('', 'set',[$return,$registers[0]]);
-  for(1..$#registers) {
-    my $tmp_label = "NEXT_".$self->_gensym();
-    $self->_add_inst($label,'gt' ,[$registers[$_],$return,$tmp_label]);
Re: Why is make test so slow?
On 20 Oct 2001, Gregor N. Purdy wrote:

> I want to libify everything to the point where Perl wrappers around the libs allow you to pass the .pasm stuff as a string and get back a packfile that you can pass on to the interpreter, without firing off separate processes and writing files.

Sounds like a good idea. It'll be necessary to have something like this to support string eval() anyway, right?

-sam
Re: Fetching the PC?
On Thu, 11 Oct 2001, Dan Sugalski wrote:

> Did we put a patch into parrot that lets you fetch the current PC and store it in an integer register? I seem to recall someone did, but I can't find it.

That's the '@' thing I was talking about making a doc patch for. I then realized that I didn't understand it well enough to explain it! The little I grok is that a '@' in the assembly gets replaced by the value of $op_pc. Jako does stuff like:

    set I31, [ printit - @ - 3 ]

for function calls. As I thought about documenting '@' I realized I had no idea what those brackets are for... Seems to me whoever put in this hack should be the one to document it.

-sam
Re: [PROPOSED] Crystalizing loader
On Sat, 6 Oct 2001, Gregor N. Purdy wrote:

> After the bytecode is loaded, but before it is executed, put it through a stage of processing that requires about as much information as a disassembler would (which is why my op_info stuff from one of my previous patches is required). This process converts opcodes into pointers to the op functions, and arguments to pointers to the constant values or register entries. This means that we amortize the dereferences over all invocations of the op at each PC, which when tight loops are involved should make for noticeable savings.

It seems that this would interfere with trying to share the bytecode using mmap(). We might gain a small performance increase but lose a significant advantage in memory reduction in large systems. Of course, if it's configurable then people can make that trade-off for themselves.

-sam
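A rough sketch of the "crystalizing" idea, under stated assumptions: a made-up three-op instruction set (`end`, `set`, `add`), a hypothetical `op_info` table, and operands kept as plain values rather than converted to pointers as the proposal describes. None of these names come from Parrot. Note that `crystalize` writes into a separate `cell` array, which is exactly the per-process, non-shareable copy the mmap() objection is about.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_REGS 4

typedef union cell cell;
typedef cell *(*op_fn)(cell *pc, long *regs);  /* runs one op, returns next pc */

union cell {
    op_fn fn;     /* resolved op function (replaces the opcode number) */
    long  value;  /* operand: register index or constant */
};

/* set Rx, CONST */
static cell *op_set(cell *pc, long *regs) { regs[pc[1].value] = pc[2].value; return pc + 3; }
/* add Rx, Ry  (Rx += Ry) */
static cell *op_add(cell *pc, long *regs) { regs[pc[1].value] += regs[pc[2].value]; return pc + 3; }
/* end: stop the run loop */
static cell *op_end(cell *pc, long *regs) { (void)pc; (void)regs; return NULL; }

/* Per-opcode info, the same kind of data a disassembler would need. */
static const struct { op_fn fn; int nargs; } op_info[] = {
    { op_end, 0 }, { op_set, 2 }, { op_add, 2 },
};

/* One pass over the raw stream: replace each opcode number by its function.
 * Returns the number of cells written, or 0 on an unknown or truncated op. */
size_t crystalize(const long *raw, size_t n, cell *out) {
    size_t i = 0;
    assert(raw != NULL && out != NULL);
    while (i < n) {
        long op = raw[i];
        if (op < 0 || op >= (long)(sizeof op_info / sizeof op_info[0]))
            return 0;
        if ((size_t)op_info[op].nargs > n - i - 1)
            return 0;
        out[i].fn = op_info[op].fn;
        for (int a = 1; a <= op_info[op].nargs; a++)
            out[i + a].value = raw[i + a];
        i += 1 + (size_t)op_info[op].nargs;
    }
    return i;
}

/* The run loop no longer looks anything up: it just calls through the cell.
 * The program must finish with an "end" op, and register indices are trusted. */
void run(cell *code, long *regs) {
    cell *pc = code;
    while (pc != NULL)
        pc = pc->fn(pc, regs);
}
```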
@ undocumented?
Am I correct that '@' in the assembly syntax is undocumented? The only place I could find it was in source comments in Parrot::Assembler. I had to search for @ to find it. Believe me, the last thing you want to search for in a Perl module is @. I think a patch to docs/parrot_assembly.pod would be appropriate. Sound good? -sam
Re: Bytecode safety
On Tue, 18 Sep 2001, Damien Neil wrote:

> Proposed: Parrot should never crash due to malformed bytecode. When choosing between execution speed and bytecode safety, safety should always win.

I don't see this as a safety issue. There's nothing unsafe about crashing. It's just not as pretty as putting up a big ol' "YOUR BYTECODE IS SNOOKERED" message. Having that message is not worth even a 1% degradation in runtime speed.

> Careful op design and possibly a validation pass before execution will hopefully keep the speed penalty to a minimum.

Sounds great, but make it a separate program that gets run optionally with the default being to not run it. Validating variable-length instructions takes non-trivial time and having it outside the normal execution path will allow you to indulge in whatever costly code analysis you desire.

-sam
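For a sense of what a separate validation pass over variable-length instructions might involve, here is a small sketch. The three-op instruction set and the `shapes` table are hypothetical and stand in for whatever op metadata the real assembler would provide.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical op table: operand count and which operands name registers. */
typedef struct { int nargs; bool arg_is_reg[2]; } op_shape;

static const op_shape shapes[] = {
    { 0, { false, false } },  /* 0: end            */
    { 2, { true,  false } },  /* 1: set Rx, CONST  */
    { 2, { true,  true  } },  /* 2: add Rx, Ry     */
};

/* Walk the stream once, instruction by instruction. Rejects unknown opcodes,
 * instructions that run past the end of the buffer, out-of-range register
 * numbers, and programs whose last instruction is not "end". */
bool bytecode_is_valid(const long *code, size_t n, long num_regs) {
    const size_t num_ops = sizeof shapes / sizeof shapes[0];
    size_t i = 0;
    long last_op = -1;
    assert(code != NULL || n == 0);
    while (i < n) {
        long op = code[i];
        if (op < 0 || (size_t)op >= num_ops)
            return false;                      /* unknown opcode */
        const op_shape *s = &shapes[op];
        if ((size_t)s->nargs > n - i - 1)
            return false;                      /* truncated instruction */
        for (int a = 0; a < s->nargs; a++) {
            long v = code[i + 1 + a];
            if (s->arg_is_reg[a] && (v < 0 || v >= num_regs))
                return false;                  /* register out of range */
        }
        last_op = op;
        i += 1 + (size_t)s->nargs;
    }
    return last_op == 0;
}
```

Because the pass is a pure function of the byte stream, it can live in a standalone tool and stay off the interpreter's normal execution path, as the message suggests.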
Re: parrot compilation failure in Tru64
On Sat, 15 Sep 2001, Philip Kendall wrote:

> My personal view would be that the gains due to portable bytecode would be outweighed by the amount of cruft we'd have to put into the interpreter to get them. As Nicholas Clark and someone else whose name I've forgotten[1] mentioned, there are platforms (eg Cray) which don't have a native 32-bit integer type.

No cruft need go into the interpreter. We can preprocess any given bytecode format into a native format. These preprocessors can be entirely separate from the interpreter.

The gains here are pretty important - with a portable bytecode modules can be distributed pre-compiled, bytecode can be generated on one platform and used on another (think embedding) and all manner of network computing silliness becomes possible.

Yes, some platforms will need some costly pre-processing in order to use other platforms' bytecode. However, the 95% case of just needing some byte-swapping will work with almost no penalty at all. The remaining 5% that need to resize the data will still work, albeit with a startup penalty. Easy things easy, hard things possible, right?

-sam
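The byte-swapping case can be sketched as a small preprocessor that lives outside the interpreter. This is an illustration only: it assumes the file is a sequence of 32-bit words whose first word is a magic number, and `EXAMPLE_MAGIC` and `bytecode_to_native` are names made up here. How Parrot actually detects byte order is not stated in this thread, and the word-resizing case is not covered.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EXAMPLE_MAGIC 0x013155a1u   /* placeholder magic number for this sketch */

/* Reverse the four bytes of one 32-bit word. */
static uint32_t swap32(uint32_t w) {
    return (w >> 24) | ((w >> 8) & 0x0000ff00u)
         | ((w << 8) & 0x00ff0000u) | (w << 24);
}

/* Convert a buffer of 32-bit words to native byte order in place.
 * Returns 0 if it was already native, 1 if every word was swapped,
 * and -1 if the magic number is not recognised in either order. */
int bytecode_to_native(uint32_t *words, size_t n) {
    assert(words != NULL);
    if (n == 0)
        return -1;
    if (words[0] == EXAMPLE_MAGIC)
        return 0;
    if (swap32(words[0]) != EXAMPLE_MAGIC)
        return -1;
    for (size_t i = 0; i < n; i++)
        words[i] = swap32(words[i]);
    return 1;
}
```

The common case costs one comparison; only foreign-order files pay for the single linear pass.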
Re: Parrot 0.0.1 is released.
> Patches should be sent to the perl6-internals mailing list, where I'll take a look at them and apply them to the CVS tree.

Ooo, ooo - me first. Since you turned on -Wall in the Makefile I thought it would be nice if it compiled without warnings. Below is a patch that does that on my system. This is the output from "cvs -q diff -u". Is this the best way to send a multi-file patch from the CVS checkout?

-sam

Index: basic_opcodes.ops
===================================================================
RCS file: /home/perlcvs/parrot/basic_opcodes.ops,v
retrieving revision 1.4
diff -u -r1.4 basic_opcodes.ops
--- basic_opcodes.ops  2001/09/10 15:48:36  1.4
+++ basic_opcodes.ops  2001/09/10 21:28:39
@@ -60,7 +60,7 @@
 
 // PRINT Ix
 AUTO_OP print_i {
-  printf("I reg %i is %i\n", P1, INT_REG(P1));
+  printf("I reg %li is %li\n", P1, INT_REG(P1));
 }
 
 // BRANCH CONSTANT
@@ -152,7 +152,7 @@
 
 // PRINT Nx
 AUTO_OP print_n {
-  printf("N reg %i is %Lf\n", P1, NUM_REG(P1));
+  printf("N reg %li is %Lf\n", P1, NUM_REG(P1));
 }
 
 // INC Nx
@@ -257,7 +257,7 @@
 // PRINT Sx
 AUTO_OP print_s {
   STRING *s = STR_REG(P1);
-  printf("S reg %i is %.*s\n", P1, string_length(s), s->bufstart);
+  printf("S reg %li is %.*s\n", P1, (int) string_length(s), (char *) s->bufstart);
 }
 
 // LEN Ix, Sx
@@ -272,4 +272,4 @@
 
 // NOOP
 AUTO_OP noop {
-}
\ No newline at end of file
+}
Index: bytecode.c
===================================================================
RCS file: /home/perlcvs/parrot/bytecode.c,v
retrieving revision 1.3
diff -u -r1.3 bytecode.c
--- bytecode.c  2001/09/10 09:50:39  1.3
+++ bytecode.c  2001/09/10 21:28:39
@@ -93,7 +93,7 @@
     }
     num--;
     if (len < 0 || (len > 0 && num == 0)) {
-      printf("Bytecode error: string constant segment corrupted: %i, %i\n", len, num);
+      printf("Bytecode error: string constant segment corrupted: %i, %i\n", (int) len, (int) num);
       exit(1);
     }
   }
Index: parrot.h
===================================================================
RCS file: /home/perlcvs/parrot/parrot.h,v
retrieving revision 1.2
diff -u -r1.2 parrot.h
--- parrot.h  2001/09/07 15:23:40  1.2
+++ parrot.h  2001/09/10 21:28:39
@@ -25,6 +25,7 @@
 #include <sys/stat.h>
 #include <fcntl.h>
 #include <errno.h>
+#include <string.h>
 
 #define NUM_REGISTERS 32
 #define PARROT_MAGIC 0x13155a1
Index: register.c
===================================================================
RCS file: /home/perlcvs/parrot/register.c,v
retrieving revision 1.2
diff -u -r1.2 register.c
--- register.c  2001/09/10 15:49:27  1.2
+++ register.c  2001/09/10 21:28:39
@@ -10,10 +10,10 @@
   struct IRegChunk *chunk_base;
 
   chunk_base = CHUNK_BASE(interpreter->int_reg);
-  printf("Chunk base is %x for %x\n", chunk_base, interpreter->int_reg);
+  printf("Chunk base is %x for %x\n", (unsigned int) chunk_base, (unsigned int) interpreter->int_reg);
   /* Do we have any slots left in the current chunk? */
   if (chunk_base->free) {
-    printf("Free was %i\n", chunk_base->free);
+    printf("Free was %i\n", (int) chunk_base->free);
     interpreter->int_reg = &chunk_base->IReg[chunk_base->used++];
     chunk_base->free--;
   }
Index: test_main.c
===================================================================
RCS file: /home/perlcvs/parrot/test_main.c,v
retrieving revision 1.2
diff -u -r1.2 test_main.c
--- test_main.c  2001/09/10 10:05:23  1.2
+++ test_main.c  2001/09/10 21:28:40
@@ -35,19 +35,19 @@
     int i;
     time_t foo;
 
-    printf("String %p has length %i: %.*s\n", s, string_length(s), string_length(s), s->bufstart);
+    printf("String %p has length %i: %.*s\n", s, (int) string_length(s), (int) string_length(s), (char *) s->bufstart);
     string_concat(s, t, 0);
-    printf("String %p has length %i: %.*s\n", s, string_length(s), string_length(s), s->bufstart);
+    printf("String %p has length %i: %.*s\n", s, (int) string_length(s), (int) string_length(s), (char *) s->bufstart);
     string_chopn(s, 4);
-    printf("String %p has length %i: %.*s\n", s, string_length(s), string_length(s), s->bufstart);
+    printf("String %p has length %i: %.*s\n", s, (int) string_length(s), (int) string_length(s), (char *) s->bufstart);
     string_chopn(s, 4);
-    printf("String %p has length %i: %.*s\n", s, string_length(s), string_length(s), s->bufstart);
+    printf("String %p has length %i: %.*s\n", s, (int) string_length(s), (int) string_length(s), (char *) s->bufstart);
     foo = time(0);
     for (i = 0; i < 1; i++) {
       string_concat(s, t, 0);
       string_chopn(s, 4);
     }
-    printf("1000 concats and chops took %i seconds.\n", time(0)-foo);
+    printf("1000 concats and chops took %li seconds.\n", time(0)-foo);
     string_destroy(s);
   }
   /* Otherwise load in the program they gave and try that */
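The warnings this patch silences come from printf arguments whose types do not match their conversion specifiers. As a side note rather than part of the patch, the sketch below shows forms that are portable in standard C: `%ld` for a `long`, an `int` precision for `%.*s`, and `%p` with a `void *` argument (casting a pointer to `unsigned int` for `%x` loses bits where pointers are wider than `int`). The function names are invented for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Format "<length>: <text>" from a counted buffer that need not be
 * NUL-terminated. Returns what snprintf returns. */
int describe_string(char *out, size_t outsz, const void *bufstart, long length) {
    assert(out != NULL && bufstart != NULL && length >= 0);
    /* %ld matches long; the precision consumed by %.*s must be an int */
    return snprintf(out, outsz, "%ld: %.*s",
                    length, (int)length, (const char *)bufstart);
}

/* Print a pointer with %p, which expects a void * argument.
 * The exact text produced is implementation-defined. */
int describe_pointer(char *out, size_t outsz, const void *p) {
    assert(out != NULL);
    return snprintf(out, outsz, "%p", (void *)p);
}
```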
Re: An overview of the Parrot interpreter
On Sun, 2 Sep 2001, Simon Cozens wrote:

> For instance, the Parrot VM will have a register architecture, rather than a stack architecture.

s/rather than/as well as/; # we've got a stack of register frames, right?

> There will be global and private opcode tables; that is to say, an area of the bytecode can define a set of custom operations that it will use. These areas will roughly map to compilation units of the original source; each precompiled module will have its own opcode table.

Side note: this isn't making sense to me. I'm looking forward to further explanation!

> If our PMC is a string and has a vtable which implements Perl-like string operations, this will return the length of the string. If, on the other hand, the PMC is an array, we might get back the number of elements in the array. (If that's what we want it to do.)

Ok, so one example of a PMC is a Perl string...

> Parrot provides a programmer-friendly view of strings. The Parrot string handling subsection handles all the work of memory allocation, expansion, and so on behind the scenes. It also deals with some of the encoding headaches that can plague Unicode-aware languages.

Or not! Are Perl strings PMCs or not? Why does Parrot want to handle Unicode? Shouldn't that go in a specific language's string PMC vtables?

-sam
Re: An overview of the Parrot interpreter
On Mon, 3 Sep 2001, Nathan Torkington wrote:

> > Ok, so one example of a PMC is a Perl string...
>
> If you grok vtables, think of a PMC as the thing a vtable hangs off.
>
> Another way to think of it is that a PMC is an object. To the outside (the interpreter that is manipulating data values) its contents are opaque. All you can do is call methods (vtable entries) on it. So if you have an object/PMC that implements a string, the length method/vtable-entry will return the length of the string. For an object/PMC that implements an array, the length method/vtable-entry will return the number of things in the array.

I think I understand this. What I don't understand is how this relates to the next section about Parrot's special relationship with strings. If Parrot has a string type and string handling functions, why use a PMC to implement a string? What does it mean to have a PMC that implements a string and also have a string type in Parrot?

-sam
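The "PMC is the thing a vtable hangs off" description can be sketched in C as a struct carrying a pointer to a table of function pointers. The layout below is an assumption made for illustration (the `pmc` and `vtable` types and the `make_*` helpers are invented names), not Parrot's actual data structure.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct pmc pmc;

/* The vtable: one function pointer per operation the interpreter may ask for. */
typedef struct {
    size_t (*length)(const pmc *self);
} vtable;

/* The PMC itself: opaque data plus the vtable that knows how to read it. */
struct pmc {
    const vtable *vt;
    void         *data;
    size_t        count;  /* element count, used only by the array flavour */
};

static size_t string_length_impl(const pmc *self) {
    return strlen((const char *)self->data);
}
static size_t array_length_impl(const pmc *self) {
    return self->count;
}

static const vtable string_vtable = { string_length_impl };
static const vtable array_vtable  = { array_length_impl };

pmc make_string_pmc(const char *s) {
    pmc p = { &string_vtable, (void *)s, 0 };
    return p;
}

pmc make_array_pmc(int *elems, size_t n) {
    pmc p = { &array_vtable, elems, n };
    return p;
}

/* What a "length" op would do: it never inspects the concrete type. */
size_t pmc_length(const pmc *p) {
    assert(p != NULL && p->vt != NULL);
    return p->vt->length(p);
}
```

The caller of `pmc_length` gets the string length or the element count depending solely on which vtable the value carries.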
Re: An overview of the Parrot interpreter
On Mon, 3 Sep 2001, Dan Sugalski wrote:

> Basically chunks of perl code can define opcodes on the fly--they might be perl subs that meet the proper criteria, or opcode functions defined by C code with magic stuck in the parser, or wacky optimizer extensions or whatever.
>
> There won't be a single global table of these, since we can potentially be loading in precompiled code. (Modules, say) Each compilation unit has its own table of opcode number->function maps. If you want to think of it C-ishly, each object module would have its own opcode table.

Ok, I think I understand. This is some kind of code-compression hack to avoid using a "call" opcode all over the place, right?

Speaking of subroutines, what are Parrot's calling conventions? Obviously we're no longer in PUSH/POP land...

> > Or not! Are Perl strings PMCs or not? Why does Parrot want to handle Unicode? Shouldn't that go in a specific language's string PMC vtables?
>
> Strings are a step below PMCs.

I feel like I almost understand this. So when you call the length() vtable method on a PMC representing a Perl scalar, the length op is eventually going to call another length() op, this time on an underlying Parrot string. Right?

I'm still not sure I understand why Parrot is doing string ops at all. Do all our target languages have identical semantics for string operations? When you bring Unicode into the mix I start to wonder.

-sam
Re: An overview of the Parrot interpreter
On Mon, 3 Sep 2001, Dan Sugalski wrote:

> > avoid using a "call" opcode all over the place, right?
>
> No, more a "try and leave the bytecode sections read-only" hack.
>
> Imagine, if you will, building LWP and bytecode compiling it. It uses private opcodes 1024-1160. Then you later build, say, MIME::Lite, which uses opcodes 1024-1090.

I was referring to the practice of having compilation units create private opcodes. Am I wrong in thinking this is a new technique deserving of an excuse for existence?

> Up until now, I didn't know, so consider yourself the first to find out. :)

I'm honored...

> * Integer, String, and Number registers 0-x are used to pass parameters when the compiler calls routines.

s/compiler/interpreter/, right?

> * Subs may have variable number, or unknown number, of PMC parameters. (Basically Parrot variables) They may *not* take a variable or unknown number of integer, string, or number parameters.

I don't understand this restriction. Won't it make implementing variadic functions more difficult?

> Don't consider this list final until I've had a chance to run it past Larry. He might be thinking of allowing prototypes to change, or spring into existence relatively late in the game. (In which case we probably get a call_in_list and call_in_registers form of sub call)

Or those other language designers you're wooing, right? The prototype stuff sounds pretty Perl specific.

-sam
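The private opcode tables discussed in this thread can be pictured as a per-unit lookup: two compilation units may both use opcode 1024, and the number is resolved against the table of whichever unit is executing, so the bytecode itself never needs renumbering. The sketch below is hypothetical throughout (`comp_unit`, `resolve_op` and the toy ops are invented); only the starting number 1024 is taken from the example in the message.

```c
#include <assert.h>
#include <stddef.h>

#define PRIVATE_OP_BASE 1024   /* first private opcode number, as in the example */

typedef int (*op_fn)(int);

typedef struct {
    const char  *name;
    const op_fn *private_ops;  /* indexed by (opcode - PRIVATE_OP_BASE) */
    size_t       num_private;
} comp_unit;

static int op_noop(int x)   { return x; }
static int op_double(int x) { return 2 * x; }
static int op_negate(int x) { return -x; }

static const op_fn global_ops[] = { op_noop };    /* shared by every unit */
static const op_fn unit_a_ops[] = { op_double };  /* unit A's opcode 1024 */
static const op_fn unit_b_ops[] = { op_negate };  /* unit B's opcode 1024 */

const comp_unit unit_a = { "unit_a", unit_a_ops, 1 };
const comp_unit unit_b = { "unit_b", unit_b_ops, 1 };

/* Resolve an opcode number in the context of the unit being executed.
 * Returns NULL when the number is not defined for that unit. */
op_fn resolve_op(const comp_unit *unit, long opcode) {
    assert(unit != NULL);
    if (opcode < 0)
        return NULL;
    if (opcode < PRIVATE_OP_BASE) {
        size_t n = sizeof global_ops / sizeof global_ops[0];
        return (size_t)opcode < n ? global_ops[opcode] : NULL;
    }
    size_t idx = (size_t)(opcode - PRIVATE_OP_BASE);
    return idx < unit->num_private ? unit->private_ops[idx] : NULL;
}
```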
Re: An overview of the Parrot interpreter
On Mon, 3 Sep 2001, Dan Sugalski wrote:

> I'm not entirely sure how much this'll be used, but I really, *really* want to be able to call any sub that qualifies as an op rather than as a sub.

What would a sub have to do (be?) to qualify?

> > I don't understand this restriction. Won't it make implementing variadic functions more difficult?
>
> Variadic functions that take actual integers, floats, or strings, yes. Variadic functions that take parrot variables (i.e. PMCs) no.

Right, so why make the former hard? Is there an upside to the restriction?

-sam
RE: Should MY:: be a real symbol table?
On Mon, 3 Sep 2001, Brent Dax wrote:

> Now is where the temp() stuff I was talking about earlier comes in.
>
>     sub foo { my($bar); foo(); }
>
> is basically equivalent to
>
>     sub foo { temp($MY::bar); foo(); }

Oh, you're pitching softballs to yourself. Try a hard one:

    my @numbers;
    for (0 .. 10) {
        my $num = $_;
        push(@numbers, sub { $num });
    }
    for (0 .. 10) {
        local $num = $_;
        push(@numbers, sub { $num });
    }
    print join(', ', map { $_->() } @numbers), "\n";

It's in Perl5, but the analogy to Perl6 should be clear enough. This is a good example of the different natures of lexical and dynamic variables.

-sam
RE: Should MY:: be a real symbol table?
On Sun, 2 Sep 2001, Brent Dax wrote:

> but in that case the inner my($x) could be translated to temp($MY::x)--the behavior is basically the same. (Actually, if pads are replaced with stashes, is there any situation where my($x) can't be translated to temp($MY::x)? Hmmm...)

Closures, for one. File-scoped lexicals for another. Lexical variables are very different beasts from package variables. They are not compatible in some significant ways.

Now, that said, we'll need to do something better than pads to support %MY. If that means full-blown symbol tables for every scope... Well, I'd be surprised. There's a reason lexical variables are faster than package variables and I imagine we'd like to keep it that way.

-sam
Re: Final, no really, Final draft: Conventions and Guidelines for Perl Source Code
On Wed, 29 Aug 2001, Simon Cozens wrote:

> It's almost time to start coding, people, almost.

Not to be an ass, but is it? It seems like we're still a long way from having a language spec.

-sam
Re: Final, no really, Final draft: Conventions and Guidelines for Perl Source Code
On Thu, 30 Aug 2001, Simon Cozens wrote:

> That's not entirely relevant any more. Parrot has a semi-autonomous existence as a generic bytecode interpreter. We may be a long way from having a language spec, but we're pretty damned close to having a spec for the interpreter.

I look forward to reading it!

-sam
RE: Draft assembly PDD
On Mon, 6 Aug 2001, Dan Sugalski wrote:

> No, he's right. Not dirtying cache lines is pretty much always faster than dirtying them, and not twiddling with memory's faster than twiddling. And unfortunately we can't really do fully platform-dependent code, since it'll be the actual bytecode that'll need to be different.

Ok, I'll go back to lurking - I definitely don't have the education to try to argue the point. I've got some half-formed idea about a stack-based opcode set that compiles down to register references at runtime, but I'm definitely a few books short of articulate on this subject.

Choose wisely then - if we want this thing to run well on the Palm and on the Athlon we'll have to!

> We're actually doing the appropriate amount of optimization here. When dealing with low-level constructs it's appropriate to consider low-level effects and algorithms that handle low-level machinery.

Lo tho we walk through the valley of the shadow of the JVM... Is anyone else nervous that we seem to be trying to replace GCC here? Is register allocation really something the Perl community has expertise in?

-sam
RE: Draft assembly PDD
On Mon, 6 Aug 2001, Hong Zhang wrote:

> It is not just for performance, the stack size and cache locality are also big issues.

Cache sizes and timings vary from machine to machine. Maybe we should make it configurable at compile-time? If we do that then there's no reason to try to guess at the right number now - we can test later and get an answer we can trust.

[ insert routine comment about premature optimization causing the death of the dinosaurs here ]

-sam
Re: Modules, Versioning, and Beyond
On Mon, 30 Jul 2001, Dan Sugalski wrote:

> When you actually use a module, the simple name (like IO) will be internally expanded out to the three value thing. So if you have two modules that each use a different version of the same module, they won't interact because each will be dealing with a separate thing.

How will this work with XS modules that load external libraries? Won't trying to load two versions of mysql.so cause symbol collisions?

-sam
Re: [DDJ] Fast and Small Resizable Arrays
On Fri, 8 Jun 2001, Jarkko Hietaniemi wrote:

> An interesting article (in the July DDJ) in the Algorithm Alley: "Fast and Small Resizable Arrays", presents a data structure that promises just what the subject says.

The first thing I thought of after reading the article was "use less memory"... I wonder how hard it would be to use this technique for AVs in Perl5.

-sam
Re: Generating Perl 6 source with Perl
On Fri, 16 Feb 2001, Simon Cozens wrote:

> On Fri, Feb 16, 2001 at 08:52:03PM +, Nicholas Clark wrote:
> > macro languages and symbolic debuggers don't mix well.
>
> Generated output would be Real Life C. I'm thinking something along the lines of
>
>     perl vtable.pl vtable.spec > vtable.c
>
> which would work just fine with symbolic debuggers.

I think he meant that using a symbolic debugger is hard, not that it wouldn't work. After all, when GDB is telling you that:

    (*fooz).blazt[10].mark[0]->set(fungle(10));

is causing a seg fault and all you wrote was:

    $fooz->set(10);

you've got to get pretty smart to figure out what's going south.

-sam
Re: GC: what is better, reuse or avoid cloning?
On Sat, 10 Feb 2001, Branden wrote:

> Suppose I have a string stored in $foo, say, "abcbca", and then I do:
>
>     $bar = $foo;
>     $foo .= "xyzyzx";
>
> I see two ways of doing this: one is allowing a string value to be shared by two or more variables, and the other one not.

Why would you want to share the string value? Why did you assign the value of $foo to $bar if you really wanted to:

    $bar = \$foo;

Or actually closer to what you seem to want:

    *bar = \$foo;

Although a little birdy told me we're dropping globs for Perl6. Don't most programmers do assignment for a reason? Why should we second-guess them?

> Given mark-and-sweep or other advanced GC, which of them is better? Sharing the value or cloning on each assignment?

I don't believe implementing copy-on-write for scalars has anything to do with garbage collection. Any garbage collector that will work for Perl will need to work with references. All you're suggesting is a beneath-the-covers reference, right? What's the point? You seem to be engaging in some extreme and bizarre form of premature optimization.

-sam
Re: GC: what is better, reuse or avoid cloning?
On Sat, 10 Feb 2001, Buddha M Buck wrote:

> I think what he's thinking (in C terms) would be more like the following:

Right. It already has a technical name - copy-on-write. I should have made it more clear that I recognized the intended mechanism.

I was trying to demonstrate that Perl-level mechanisms already existed to do value aliasing. I was also trying to show that what he is suggesting is a lot like aliasing with some simple copy-on-write STORE magic. For some reason I thought that by pointing that out I could relieve him of his bizarre worries about garbage collecting things with references.

I probably should have just gone to bed instead.

-sam
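For readers who want the copy-on-write mechanism spelled out, here is a minimal reference-counted sketch in C. It is not the code from the quoted message (which is not reproduced in this archive) and not how Perl implements scalars; `cow_buf`, `cow_str` and the `cow_*` functions are invented for the illustration. Assignment shares the buffer, and an append to a shared buffer makes a private copy first.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    size_t refs;   /* how many variables currently share this buffer */
    size_t len;
    char  *bytes;  /* NUL-terminated for convenience */
} cow_buf;

typedef struct { cow_buf *buf; } cow_str;

/* Allocate a buffer holding a copy of s, with room for `extra` more bytes. */
static cow_buf *buf_create(const char *s, size_t len, size_t extra) {
    cow_buf *b = malloc(sizeof *b);
    if (b == NULL)
        return NULL;
    b->bytes = malloc(len + extra + 1);
    if (b->bytes == NULL) {
        free(b);
        return NULL;
    }
    memcpy(b->bytes, s, len);
    b->bytes[len] = '\0';
    b->len = len;
    b->refs = 1;
    return b;
}

cow_str cow_new(const char *s) {
    cow_str v = { buf_create(s, strlen(s), 0) };
    return v;
}

/* "$bar = $foo": share the buffer and bump the reference count. */
cow_str cow_assign(cow_str src) {
    assert(src.buf != NULL);
    src.buf->refs++;
    return src;
}

/* "$foo .= suffix": unshare first if anyone else can see the buffer.
 * Returns 0 on success, -1 if memory could not be allocated. */
int cow_append(cow_str *dst, const char *suffix) {
    assert(dst != NULL && dst->buf != NULL);
    size_t add = strlen(suffix);
    cow_buf *b = dst->buf;
    if (b->refs > 1) {
        cow_buf *copy = buf_create(b->bytes, b->len, add);
        if (copy == NULL)
            return -1;
        b->refs--;
        dst->buf = b = copy;
    } else {
        char *grown = realloc(b->bytes, b->len + add + 1);
        if (grown == NULL)
            return -1;
        b->bytes = grown;
    }
    memcpy(b->bytes + b->len, suffix, add + 1);  /* copies the NUL as well */
    b->len += add;
    return 0;
}

/* Drop one reference and free the buffer when the last one goes away. */
void cow_release(cow_str *s) {
    if (s->buf != NULL && --s->buf->refs == 0) {
        free(s->buf->bytes);
        free(s->buf);
    }
    s->buf = NULL;
}
```

The reference count here is bookkeeping for sharing only; it says nothing about which garbage collector reclaims the buffers, which is the distinction the message draws.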
Re: Speaking of signals...
On Wed, 3 Jan 2001, Dan Sugalski wrote: I think one of the things we might want to do is figure out what people use signals for and see if we can abstract out some of that functionality without actually exposing signals. (From an internals standpoint, at least) Well, one thing people use signals for is IPC with non-Perl processes. I think that if you want to continue to support that then we need an interface to real, broken though they may be, POSIX signals. Maybe the code gets pushed out of the core, but it still needs to be somewhere. -sam
Re: Perl6 compatibility with non-C environments
On Thu, 7 Dec 2000, Bradley M. Kuhn wrote: Now, I would agree that there are more C hackers about. However, many people are graduating college with computer science degrees having worked mostly in Java and very little in C. In 6 years or so, we may find that there are more Java hackers than C hackers about. But, I agree this alone isn't a reason to pick Java. That's an interesting theory. Want to hear another one? People that only learn Java in college and never go on to learn other languages aren't going to be developing Perl. Why? They didn't learn Perl in college either. On the other hand, the smart ones will realize pretty quickly that a huge proportion of the best software is written in C. It's not too hard to figure out that C might be a good language to learn. Bottom line: is C's lifespan finite? Certainly. Is Java the replacement? I seriously doubt it. I already knew that writing the canonical Perl6 implementation in Java was likely a lost cause. ;) However, I hope we won't confuse this issue with the one of making it possible to port Perl to non-C environments. Such environments do exist, and they do matter, IMO. I'm a jerk, so I have to ask: do they exist? What platform are you talking about where there exists a JVM and where no C compiler can target the architecture? How did they write the JVM with no C compiler? However, point taken. The easier we can make porting Perl the better. I don't know if that is proven. We still lack a port to the JVM, while our "sister languages" like Python, Scheme, Tcl, Eiffel and the like all have JVM ports. Little languages are much more portable than big ones. Ok, I don't really know if that covers Eiffel and Python but I'm pretty sure it covers Tcl and Scheme. You can write interpreters for those languages in surprisingly little code. -sam
RE: Meta-design
On Wed, 6 Dec 2000, Dan Sugalski wrote: What I'm thinking is that we'll have a scoped destruct stack that gets pointers to variables that explicitly need destruction, and as we exit levels of scope we call the destructors of those variables that need it. (They can still be GC'd later to pick up their now-free memory) Most things won't get tossed on there, since most variables don't have any destruction behaviour. If you don't reference count how do you protect yourself from DESTROYing objects that are still referenced: my $new_dog; { my $dog = new Dog; $new_dog = \$dog; } Did $dog just get erroneously collected by your destruct stack? No? Without reference counting? Frankly these hybrid GC schemes look more like the *worst* of both worlds than best - all the predictable performance problems of reference counting with the unpredictable performance problems of mark and sweep! -sam
RE: Meta-design
On Wed, 6 Dec 2000, Dan Sugalski wrote: my $new_dog; { my $dog = new Dog; $new_dog = \$dog; } That would hoist the Dog reference into an outer level of scope--in this case the one containing $new_dog. Or so my thinking goes at the moment, though there may be (almost inevitably are) problems with that. "hoist"? I thought this was a stack you were talking about. You're going to do an O(n) operation on a stack every time a reference is passed out of scope? What happens when two references diverge: my $newDog1; { my $newDog2; { my $dog = new Dog; $newDog2 = $newDog1 = \$dog; } $newDog1 = undef; } How does a non-refcounting scheme know that $newDog2 is the last ref to $dog when the first block is done? As far as claims about mark-and-sweep improving performance, I guess we'd need a test implementation in order to find out. Considering the non-deterministic character of many mark-and-sweep systems performance testing can be a delicate matter. -sam
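The diverging-references case in this message can be traced with a toy reference count. This is a simplified model in Python, not how Perl 5 or any proposed Perl 6 collector works: `take_ref`, `drop_ref` and the `destroyed` list are hypothetical stand-ins, and the extra level of indirection in `\$dog` is collapsed into one counted reference per variable.

```python
destroyed = []  # stand-in for "DESTROY was called on this object"


class Dog:
    def __init__(self):
        self.refcount = 0


def take_ref(obj):
    """Record one more variable pointing at obj."""
    obj.refcount += 1
    return obj


def drop_ref(obj):
    """Record that one variable stopped pointing at obj."""
    obj.refcount -= 1
    if obj.refcount == 0:
        destroyed.append(obj)


# my $dog = new Dog;
dog = take_ref(Dog())
# $newDog2 = $newDog1 = \$dog;
new_dog1 = take_ref(dog)
new_dog2 = take_ref(dog)

# Innermost block ends: $dog goes out of scope.
drop_ref(dog)
after_inner_block = list(destroyed)

# $newDog1 = undef;
drop_ref(new_dog1)
after_undef = list(destroyed)

# Outer block ends: $newDog2 goes out of scope, and only now is the Dog destroyed.
drop_ref(new_dog2)
```

With a count, the moment of destruction falls out of the bookkeeping. A scope-exit destruct stack without a count has no equivalent signal at the end of the innermost block, which is the question the message raises.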
RE: Meta-design
On Wed, 6 Dec 2000, Dan Sugalski wrote: Sure, but only objects. (or, to be really paranoid, things referred to) Nothing else needs refcounting. All the refcounting code can be isolated in the reference creation and deletion code, and we don't have to pay it otherwise. Good point. I hadn't considered the predominance of non-objects in Perl programs. I suppose that any system that allowed us to get away without having to refcount every simple SV would be a pretty big win even if blessed SVs still had to be counted. -sam
Re: Perl6 in Java? (was Re: Meta-design)
On Wed, 6 Dec 2000, Bradley M. Kuhn wrote: And, it will make the barrier for entry for new internals hacker lower. Really? Do you honestly believe there are more Java programmers than C programmers? Particularly in the Perl development community! I would note that if we write in Java, we aren't targeting a single compiler, although at the moment, the only efficient compiler for Java might be GCJ. It speaks volumes that the only efficient compiler is vaporware. How fast is it? Who knows - it's not finished! Does it have major flaws? Probably - but just wait and see how fast it is! Let's move on - C is our only real option. It's portable enough (proven by perl5), it's fast enough (or nothing is) and we already have in-house experts to lead the implementation. Are we seriously considering anything else? PIL? If someone here has a real proposal for a better-than-cpp macro language then let's hear it. It's worth considering in a purely abstract keep-an-open-mind sense. -sam
Re: Proposal for groups
On Tue, 5 Dec 2000, Alan Burlison wrote: How about writing the documents in XML and having a 'perl specification' DTD? ... Death to POD! Can we *please* not re-fight this war? I know you remember the last couple incarnations of XML VS POD. Just replay them in your mind and enjoy the show. Spare us the angst. -sam
Re: Proposal for groups
On Sat, 2 Dec 2000, Nathan Torkington wrote: * it's difficult for the design to happen through the questions Is that really true? Have we tried? As far as I can tell we've got a lot of well-intentioned people that for whatever reason are spending very little time making Perl 6 happen. Let me explain why I think this is a useful comment instead of just slander from the sidelines. I'm someone who's reasonably knowledgeable about compiler technology and about Perl internals. Still, I'm not so expert that I feel comfortable leading the design of the Perl 6 internals. I'd hoped to be involved as a skilled helper - able to develop and debug proposed systems. The problem is that I can't be of much use until the people that really are qualified to design this stuff start producing designs. So, here's my opinion: we have enough structure. All the people are here that are going to show up. Now it's time to do the work and that means the experts have got to dedicate some serious time to sketching out the skeleton of this beast. Once you've done that then I think you'll find there are more people around to add the needed muscles, skin and brains. If you need to go off in a room alone to do that, well, I guess that's your option. I just don't think you've actually given the existing structure much of a trial yet. -sam How about we do this to design the architecture and API: perl6-internals-design is for a team of no more than 10 people. These people should have experience either with perl5 or with a similar system. Mail to this list goes to perl6-internals-design and to perl6-internals. perl6-internals is a public access list, where folks can feel free to question and kibitz. The design team will probably want to have a few people on the public list as well. This is where the consciousness of the rest of us can be raised. We can see what they're doing, ask questions, and make suggestions.
Because the meta discussion happens off the -design list, designers will be able to tune it out if they have to focus on the task at hand. This lets us satisfy these goals: * open process, both for visibility and participation * small team doing the design (elephant is a mouse designed by committee, etc) Make sense? Nat
Re: To get things started...
On Fri, 24 Nov 2000, Nicholas Clark wrote: I think Dan was suggesting that the (user side) regex doesn't change at all (so that's no new syntax there) It's just that the innards of perl gains a tied scalar that doesn't actually read in and buffer the file immediately, but defers it as long as it can get away with. And that the regex engine knows about these lazy scalars and provokes the read-more when needed. Right. And I was suggesting that while this might solve our problem it wouldn't do much for all the other people that have to solve the same problem. I'd like to see a general solution accessible from Perl. If that solution is some tied-scalar magic, fine. If it's more involved than that (and I think it will be) then we'll need to think about the syntax a bit. I don't think that this differs from the current parser. If it encounters open " but never a close ", it will read and buffer to the end of file before realising that there's a problem. (because strictly there isn't a problem until EOF is encountered before the closing ") I'm not certain there's anything that can actually be done to avert the need to buffer a lot of script in these situations. You mustn't attempt to seek the script file handle as it might be from something unseekable such as a pipe (or socket. BEGIN {socket STDIN...}) I suppose that's true. I was imagining something less extreme than the absolute failure case of missing a closing ". I'm imagining a failure that is recoverable but still requires running the regex to the end of the "string" to find that out. Are there any like this? Perhaps not. Perhaps this just isn't a reasonable criticism of regex parsers since normal parsers do it all the time anyway! -sam
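The lazy-scalar idea discussed in this message can be approximated in a few lines. This is a naive sketch in Python under stated assumptions, not a description of any Perl mechanism: `LazyText` and `lazy_search` are hypothetical names, and because Python's standard `re` module has no partial-match mode, the search is simply rerun from the start each time the buffer grows. A greedy pattern may therefore return a shorter match than a search over the full input would.

```python
import io
import re


class LazyText:
    """Hypothetical buffer that pulls from a handle only when asked."""

    def __init__(self, handle, chunk_size=16):
        self.handle = handle
        self.chunk_size = chunk_size
        self.buffer = ""
        self.eof = False

    def read_more(self):
        """Append one chunk to the buffer; return False once input is exhausted."""
        chunk = self.handle.read(self.chunk_size)
        if chunk:
            self.buffer += chunk
        else:
            self.eof = True
        return bool(chunk)


def lazy_search(pattern, text):
    """Search text.buffer, reading more input only while the answer is unsettled."""
    regex = re.compile(pattern)
    while True:
        match = regex.search(text.buffer)
        # Accept a match that ends before the buffer does (or at true EOF);
        # a match touching the end of the buffer might still grow.
        if match and (match.end() < len(text.buffer) or text.eof):
            return match
        if not text.read_more():
            return regex.search(text.buffer)


hit_text = LazyText(io.StringIO("x" * 100 + "needle" + "y" * 1000))
hit = lazy_search(r"needle", hit_text)

miss_text = LazyText(io.StringIO("x" * 1000))
miss = lazy_search(r"needle", miss_text)
```

The successful search stops reading shortly after the match, while the failing search has to pull in the entire input before it can report failure. That is the buffering cost the message is asking about.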
Re: RFC 125 (v2) Components in the Perl Core Should Have Well-Defined APIs and Behavior
On Tue, 10 Oct 2000, Tim Bunce wrote: A very complete UML tool in Java is ArgoUML: http://argouml.tigris.org/ Umm, it might be interesting for someone to add a Perl code generator for it... I've played with the idea of adding Perl code-generation to my design tools (Visio2000 and ObjectDomain). One question that comes up immediately: what code should be generated for class files? There's no single standard way to lay out a Perl class beyond the obvious "h2xs" format. In particular there are numerous ways to declare attributes and methods. An awful lot of what makes a code generator useful for languages like C++ and Java doesn't apply to Perl. We don't have to manage disjoint sets of declarations (header files / interface files) and implementation files (class files). We already have a flexible and simple documentation format - adding a POD interface to a UML tool would be cute but hardly a huge time saver. I enjoy working with visual UML tools but I'm unconvinced that Perl code generation would be of much practical use. Do you have ideas for functionality that I haven't considered? -sam
Re: RFC 227 (v1) Extend the window to turn on taint mode
On 14 Sep 2000, Chaim Frenkel wrote: (Someone remind me, What is the point of -T if not running setuid?) All you need to get root is an unprivileged shell on anything but a fully patched machine. A dumb Perl CGI running without -T is all you need to get a shell. Besides, I bet most online stores keep our credit card numbers in databases accessible by 'nobody'. You probably wouldn't even need root in most cases if you were after card numbers. -sam
Re: RFC 155 - Remove geometric functions from core
On Tue, 29 Aug 2000, David L. Nicol wrote: Well then. It is impossible to rearchitect it to make it shared text? Perhaps the first instance of perl sets up some vast shared memory segments and a way for the newcomers to link in to it and look at the modules that have been loaded, somewhere on this system, and use the common copy? That approach invites big security problems. Any system that involves one program trusting another program to load executable code into their memory space is vulnerable to attack. This kind of thing works for forking daemons running identical code since the forked process trusts the parent process. In the general case of a second perl program starting on a machine, why would this second program trust the first program to not load a poison module? I don't believe you can simply "rearchitect it to make it shared text". This sounds like a problem to be fixed. Relax, Tom, we'll take it from here. Are you so sure? From where I'm sitting he's got some pretty tough points there. If you've got a solution then I'm quite surprised, which would be great. If not then I suggest you avoid writing the proverbial bad check. -sam
Re: RFC 155 - Remove geometric functions from core
On Tue, 29 Aug 2000, David L. Nicol wrote: does sysV shm not support the equivalent security as the file system? Well, no, I don't think it does. It supports permissions on individual segments but it doesn't support anything like directory permissions. It might be enough, and it might not be. A user can run two programs and not expect one to have an automatic exploit on the other just because they're both Perl! Think "nobody". Yes, you'd provide a paranoid mode for experts to use to avoid the problems to which most users would be exposed. Great. Did I not just describe how a .so or a DLL works currently? Certainly not. You wrote only a few sentences. I'm no expert but I don't think that shared libraries are that simple. I also don't think they're implemented using SysV IPC shared memory, but you might know differently. In the ever-imminent vaporware implementation, this whole thing may be represented as a big file into which we can seek() to locate stuff. Zuh? What are you talking about? Is this some kind of Inline.pm-esque system? -sam
Re: RFC 155 - Remove geometric functions from core
On Tue, 29 Aug 2000, Nick Ing-Simmons wrote: David L . Nicol [EMAIL PROTECTED] writes: does sysV shm not support the equivalent security as the file system? mmap() has the file system. I wasn't aware that mmap() was part of SysV shared memory. My mistake? It's not on the SysV IPC man pages on my Linux system. The mmap manpage doesn't mention SysV IPC either. -sam
Re: Do threads support SMP?
On 20 Aug 2000, Chaim Frenkel wrote: SWM Does Perl6 support Symmetric MultiProcessing (SMP)? Perl5 does - see 'fork'. I'm guessing that Perl6 will have at least that much support. SWM This is a *huge* issue. It affects everything else that we do with SWM threads. Most operating systems' thread implementations treat threads like processes - they get scheduled independently on whatever CPUs are available. Some OSs do a better job than others, but there's nothing Perl can do about that. So, this is a pretty small issue for us. -sam
Re: inline mania
On Tue, 1 Aug 2000, John Tobey wrote: The people here are rightly skeptical about the effectiveness of using the 5.6 code base as a starting point for v6, but I have a pretty clear vision of how to do it, and I am committed to giving it a try, even if no one else will. In fact, I'll give you all a tentative schedule: Wait, you're going to develop Perl 6 ALONE? Wasn't this going to be "the community's rewrite of Perl"? Shouldn't you be trying to rally support for your vision before issuing schedules? I'm not trying to knock you - I'm not at all against hearing your plans and possibly helping out. This just seems like a pretty strange way to approach a community effort. 15 August 2000 - detailed draft spec to perl6-internals. 31 August 2000 - revised spec after discussion. What? You're expecting all the various perl6-* lists to come up with final RFCs by the end of the month? And you're expecting to have Larry's final plans by then? Or are you going to implement Perl 6 without knowing what it is? Unicode and threading would become integrable only after a lot of morphing (refactoring). The morphing would probably destroy any traces of v5 unicode support (since well under 20% of test scripts will notice), and of course 5005threads will be the first to go. With any luck, a compatible, well-integrated replacement will eventually take its place. This sounds hopeful, but mostly unfounded. Without starting with threading and Unicode as primary features you're going to be fighting an uphill battle ala Perl 5. -sam