Re: PDL-P: Re: Hooks for array notation (was Re: Ramblings on baseclass for SV etc.)
On Thu, 10 Aug 2000, Jeremy Howard wrote:
> At 07:07 PM 8/9/00 -0400, Karl Glazebrook wrote:
> Dan Sugalski wrote:
> At 11:27 PM 8/9/00 +1000, Jeremy Howard wrote:
>
> 5- Compact array storage: RFC still coming
>
> I hope this RFC will be "Arrays should be sparse when possible, and compact" and just about nothing else. :)
>
> Why? Because it's awfully premature to be thinking about how arrays should be sparsely stored. I feel bad looking at a well thought-out RFC and going "Nope, doesn't fit".
>
> Sure. Is this obvious enough that it doesn't need an RFC at all? Basically the RFC would say:
>
> - If we have arrays, and
> - If we allow strong typing, then
> - Arrays of strongly typed variables should be sparse when possible, and compact

Strong typing and sparse arrays are orthogonal--no reason we shouldn't use them if someone does:

    $foo[time]

or something of the sort. (People like huge arrays with few scalars in them too... :)

Dan
Re: Program lifecycle
At 09:57 PM 8/9/00 -0700, Matthew Cline wrote:
> On Wed, 09 Aug 2000, Nathan Torkington wrote:
> It seems to me that a perl5 program exists as several things:
> - pure source code (ASCII or Unicode)
> - a stream of tokens from the parser
> - a munged stream of tokens from the parser (e.g., use Foo has become BEGIN { require Foo; Foo->import })
> - an unthreaded and unoptimized optree
>
> Isn't there a tree of whatchamacallits between a token stream and the optree, and also a symbol table? I'm not too up on compilers...

I think so. There are some thingamabobs in there too. :)

I think we'll see at least a syntax tree, a bytecode stream, and an optree in perl 6, depending on where you look. That's still sort of up in the air, though. (We might see machine code too, if I can convince myself that it can be done portably.)

Dan

--------------------------------------"it's like this"-------------------
Dan Sugalski                          even samurai
[EMAIL PROTECTED]                     have teddy bears and even
                                      teddy bears get drunk
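As a loose analogy only (this is Python's toolchain, not perl5's or anything proposed for perl 6), the "several things" a program exists as can be observed with Python's standard library: a token stream, a syntax tree, and a compiled code object, all derived from the same source text. The variable names are illustrative.

```python
import ast
import io
import tokenize

source = "x = 1 + 2\n"

# 1. Token stream taken straight from the source text
#    (layout-only tokens such as NEWLINE and ENDMARKER are filtered out).
tokens = [
    tok.string
    for tok in tokenize.generate_tokens(io.StringIO(source).readline)
    if tok.type in (tokenize.NAME, tokenize.NUMBER, tokenize.OP)
]

# 2. Syntax tree built from the same text.
tree = ast.parse(source)

# 3. Compiled code object (bytecode) produced from the tree.
code = compile(tree, "<demo>", "exec")

# 4. Execution of the compiled form.
namespace = {}
exec(code, namespace)
```

Each intermediate form is a separate object that a tool could inspect or rewrite before the next step runs, which is the property the thread is after.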
Re: Program lifecycle
At 10:01 PM 8/9/00 -0600, Nathan Torkington wrote:
> Would it make sense for the parsing of a Perl program to be done as:
> - tokenize without rewriting (e.g., use stays as it is)
> - structure without rewriting (e.g., constant subs are unfolded)
> - rewrite for optimizations and actual ops

The structure I've been thinking of looks like:

      Program Text
           |
           |
           |
           V
     +-----------+
     | Lex/parse |
     +-----------+
           |
      Syntax tree
           |
           V
     +-----------+
     | Bytecoder |
     +-----------+
           |
       Bytecodes
           |
           V
     +-----------+
     | Optimizer |
     +-----------+
           |
       Optimized
       bytecodes
           |
           V
     +-----------+
     | Execution |
     |  Engine   |
     +-----------+

With each box being replaceable, and the process being freezable between boxes. The lexer and parser probably ought to be separated, thinking about it, and we probably want to allow folks to wedge at least C code into each bit. (I'm not sure whether allowing you to write part of the optimizer in perl would be a win, but I suppose if it was saving the byte stream to disk...)

Dan
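A minimal sketch of the "replaceable boxes, freezable between boxes" idea, written in Python purely for illustration. Everything here is an assumption: the toy language (integers joined by "+"), the stage functions, the tuple-based bytecode, and the use of pickle as the freezing mechanism are invented for the example and are not part of any proposed perl 6 design.

```python
import pickle

def lex_parse(text):
    """Toy 'syntax tree': the list of integer operands in 'a + b + c'."""
    return [int(part) for part in text.split("+")]

def bytecoder(tree):
    """Toy bytecode: PUSH every operand, then ADD until one value is left."""
    return [("PUSH", n) for n in tree] + [("ADD", None)] * (len(tree) - 1)

def optimizer(ops):
    """Fold the whole constant expression into a single PUSH."""
    return [("PUSH", sum(arg for op, arg in ops if op == "PUSH"))]

def execute(ops):
    """Tiny stack machine standing in for the execution engine."""
    stack = []
    for op, arg in ops:
        if op == "PUSH":
            stack.append(arg)
        else:  # ADD
            stack.append(stack.pop() + stack.pop())
    return stack.pop()

PIPELINE = [
    ("lex/parse", lex_parse),
    ("bytecoder", bytecoder),
    ("optimizer", optimizer),
    ("execute", execute),
]

def run(stages, data, stop_after=None):
    """Feed data through the stages, optionally stopping after a named box."""
    for name, stage in stages:
        data = stage(data)
        if name == stop_after:
            break
    return data

def replace(stages, name, new_stage):
    """Return a pipeline with one box swapped for another implementation."""
    return [(n, new_stage if n == name else s) for n, s in stages]

# Freeze between boxes: stop after the bytecoder, serialize, resume later.
frozen = pickle.dumps(run(PIPELINE, "1 + 2 + 3", stop_after="bytecoder"))
resumed = run(PIPELINE[2:], pickle.loads(frozen))

# Replace a box: swap the optimizer for a do-nothing stage.
unoptimized = run(replace(PIPELINE, "optimizer", lambda ops: ops), "1 + 2 + 3")
```

Because each box only sees the output of the previous one, the frozen byte stream and the swapped optimizer both yield the same final result as the unmodified pipeline.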
Re: Program lifecycle
You may also want to be able to short circuit some of the steps. Especially where the startup time may outweigh the win of optimization.

And if there could be different execution engines. Machine level, bytecode, (and perhaps straight out of the syntax tree.)

Hmm, might that make some debugging easier?

chaim

"DS" == Dan Sugalski [EMAIL PROTECTED] writes:

DS> The structure I've been thinking of looks like:
DS> [diagram: Program Text -> Lex/parse -> (syntax tree) -> Bytecoder -> (bytecodes) -> Optimizer -> (optimized bytecodes) -> Execution Engine]

DS> With each box being replaceable, and the process being freezable between
DS> boxes. The lexer and parser probably ought to be separated, thinking about
DS> it, and we probably want to allow folks to wedge at least C code into each
DS> bit. (I'm not sure whether allowing you to write part of the optimizer in
DS> perl would be a win, but I suppose if it was saving the byte stream to disk...)

--
Chaim Frenkel                          Nonlinear Knowledge, Inc.
[EMAIL PROTECTED]                      +1-718-236-0183
Re: Program lifecycle
"NT" == Nathan Torkington [EMAIL PROTECTED] writes: NT - source filters munge the pure source code NT - cpp-like macros would work with token streams NT - pretty printers need unmunged tokens in an unoptimized tree, which NTmay well be unfeasible I was thinking of macros as being passed some arguments but then can either manipulate the raw source code or ask the lexer/parser for parsed tokens. chaim -- Chaim FrenkelNonlinear Knowledge, Inc. [EMAIL PROTECTED] +1-718-236-0183
Re: Program lifecycle
At 03:36 PM 8/10/00 -0400, Chaim Frenkel wrote:
> You may also want to be able to short circuit some of the steps.
> Especially where the startup time may outweigh the win of optimization.

The only one that's skippable is the optimizer, really. I'd planned on having to pass it some indicator of how aggressive it should be.

> And if there could be different execution engines. Machine level,
> bytecode, (and perhaps straight out of the syntax tree.)

Yup. Hence the "replaceable" bit. :) The boxes would all have a fixed and well-defined interface, and the various streams (syntax tree and bytecode) would also be well-defined. If you wanted to build an execution box that instead dumped out java bytecodes, well, sounds like a good plan to me. :)

> Hmm, might that make some debugging easier?

Might. Hard to say, though if we get them as black boxes at least it'll make debugging more compartmentalized.

Dan
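A rough sketch of two points from this message, in Python for illustration only: an optimizer that takes an aggressiveness indicator (0 skips it entirely), and two interchangeable "execution boxes" that accept the same bytecode stream. The bytecode format, the level semantics, and the function names are all hypothetical; the thread does not specify any of them.

```python
def fold_once(ops):
    """One peephole pass: rewrite 'PUSH a, PUSH b, ADD' as 'PUSH a+b'."""
    out, i, changed = [], 0, False
    while i < len(ops):
        window = ops[i:i + 3]
        if [op for op, _ in window] == ["PUSH", "PUSH", "ADD"]:
            out.append(("PUSH", window[0][1] + window[1][1]))
            i += 3
            changed = True
        else:
            out.append(ops[i])
            i += 1
    return out, changed

def optimize(ops, level):
    """level 0: skip; level 1: a single pass; level 2+: repeat until stable."""
    if level <= 0:
        return list(ops)
    passes = 1 if level == 1 else len(ops)
    for _ in range(passes):
        ops, changed = fold_once(ops)
        if not changed:
            break
    return ops

def interpret(ops):
    """Execution box #1: run the bytecode on a small stack machine."""
    stack = []
    for op, arg in ops:
        if op == "PUSH":
            stack.append(arg)
        else:  # ADD
            stack.append(stack.pop() + stack.pop())
    return stack.pop()

def dump(ops):
    """Execution box #2: same input, but emit a listing instead of running."""
    return "\n".join(op if arg is None else f"{op} {arg}" for op, arg in ops)

PROGRAM = [("PUSH", 1), ("PUSH", 2), ("ADD", None), ("PUSH", 3), ("ADD", None)]
```

A short-running script could call `optimize(PROGRAM, 0)` to avoid the optimizer's startup cost, while a long-running one could ask for level 2; either result can be handed to `interpret` or `dump` unchanged.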
Re: Method call optimization.
Dan Sugalski [EMAIL PROTECTED] writes:
> At 03:35 PM 8/9/00 -0700, Damien Neil wrote:
> On Wed, Aug 09, 2000 at 03:32:41PM -0400, Chaim Frenkel wrote:
>
> Each sub is assigned an index. This index is unique for the package the sub is in, and all ancestor packages.

Add all sibling packages of all the packages involved ;-)

If we are not careful we can end up making the compile NP complete. We just had all the numbers nicely sorted and then someone reads in:

    package Foo;
    use base qw(Meth_is_1 Other_is_1);
    sub Meth ...
    sub Other ...

And now we have to recompute the whole tree so that Meth and Other don't share the index.

> The first runtime reassignment of @ISA shoots this one down hard. Sorry. (MI also makes it more difficult, since dependency trees will have to be built...)

Yes - this is why Malcolm dodged MI with 'fields' module.

--
Nick Ing-Simmons
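The collision described above can be sketched as follows, in Python for illustration. The numbering scheme (each package numbers its own subs from 1) is an assumption made to reproduce the example; only the package and method names come from the message.

```python
def assign_indices(methods_by_package):
    """Number each package's subs independently, starting from 1."""
    return {
        pkg: {name: slot for slot, name in enumerate(names, start=1)}
        for pkg, names in methods_by_package.items()
    }

def collisions(parents, indices):
    """Find slot numbers that different method names claim across the parents."""
    seen, clashes = {}, {}
    for parent in parents:
        for name, slot in indices[parent].items():
            if slot in seen and seen[slot] != name:
                clashes.setdefault(slot, {seen[slot]}).add(name)
            seen.setdefault(slot, name)
    return clashes

# Two unrelated packages that each, quite reasonably, use index 1.
indices = assign_indices({"Meth_is_1": ["Meth"], "Other_is_1": ["Other"]})

# package Foo; use base qw(Meth_is_1 Other_is_1);
foo_clashes = collisions(["Meth_is_1", "Other_is_1"], indices)
```

With single inheritance the check comes back empty; as soon as Foo inherits from both packages, slot 1 is claimed twice and one side has to be renumbered, along with everything that already depends on it.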
Re: Method call optimization.
At 08:16 PM 8/10/00 +, Nick Ing-Simmons wrote:
> The first runtime reassignment of @ISA shoots this one down hard. Sorry. (MI also makes it more difficult, since dependency trees will have to be built...)
>
> Yes - this is why Malcolm dodged MI with 'fields' module.

I'm not sure we can, or should, ultimately dodge MI. We have it, and it's ours to keep. (Lucky us...) Which isn't to say the answer can't be "MI sucks, here's the best band-aid we can manage", of course. :)

Dan
Re: Method call optimization.
"GB" == Graham Barr [EMAIL PROTECTED] writes: GB On Thu, Aug 10, 2000 at 04:54:25PM -0400, Dan Sugalski wrote: At 08:31 PM 8/10/00 +, Nick Ing-Simmons wrote: You just re-invented "look up the name in a hash table" ;-) You now have one big hash table rather than several small ones. Which _may_ give side benefits - but I doubt it. If we prefigure a bunch of things (hash values of method names, store package stash pointers in objects, and pre-lookup CVs for objects that are typed) it'll save us maybe one level of pointer indirection and a type comparison. If the object isn't the same type we pay for a type comparison and hashtable lookup. Not free, but not expensive either. 'Specially if we get way too clever and cache the new CV and type in the opcode for the next time around, presuming we'll have the same type the next time through. GB Or if someone has defined a new sub, you don't know it was not the GB one you stashed. Leave a stub behind at the old address that would trigger the repointing. (I'm not clear what to do for refcounting the old address) GB I am not sure it got into perl5, but pre-computing the hash value of GB the method name and storing it in the op is one thing. Maybe also trying GB a bit harder to keep package hashes more flat. GB Another thing that may help is if all the keys in package hashes are shared GB and also shared with constant method names in the op tree. Then when GB scanning the chain you only need do a pointer comparison and not a GB string compare. I don't think anything should be in the op tree. The optree (or whatever the engine is) should only be operations. Data should be either in the object, vtbl, or stack. chaim -- Chaim FrenkelNonlinear Knowledge, Inc. [EMAIL PROTECTED] +1-718-236-0183
Re: PDL-P: Re: Hooks for array notation (was Re: Ramblings on baseclass for SV etc.)
Dan Sugalski wrote:
> Strong typing and sparse arrays are orthogonal--no reason we shouldn't use them if someone does: $foo[time] or something of the sort. (People like huge arrays with few scalars in them too... :)

Good point. It also occurs to me that we would want some syntax to say "Don't make this sparse". That way, arrays that are, for example, read from a file could be stored contiguously and accessed without traversing extra pointers.
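To make the two storage strategies concrete, here is a minimal Python sketch. SparseArray is a hypothetical class that stores only the cells that were assigned, which suits the $foo[time] case; the standard library's array module stands in for a compact, contiguous array of one declared element type. Neither is meant as the eventual perl 6 representation.

```python
from array import array

class SparseArray:
    """Stores only assigned cells; unassigned indices read back as None."""
    def __init__(self):
        self._cells = {}
        self._length = 0

    def __setitem__(self, index, value):
        self._cells[index] = value
        self._length = max(self._length, index + 1)

    def __getitem__(self, index):
        return self._cells.get(index)

    def __len__(self):
        return self._length

    def stored(self):
        """Number of cells that actually occupy memory."""
        return len(self._cells)

# One element at a timestamp-sized index: huge logical length, one stored cell.
sparse = SparseArray()
sparse[965_000_000] = "now"

# "Don't make this sparse": typed doubles packed contiguously, no per-element pointers.
compact = array("d", [1.0, 2.0, 3.0])
```

The sparse version pays a hash lookup per access but never allocates the empty cells; the compact version is the layout you would want for data read in bulk from a file.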
Re: re rfc 15 typing
Michael Fowler wrote:

Which then raises a few more problems (whew): how do you coax user input (which is an SV) into a value $foo can accept with very-strict on?

You run it through an explicit conversion process, like using atoi() in a C program.

Unfortunately, this involves more cooperation from the compiler; it has to provide a way of declaring the return value of a subroutine. I'm not sure if this is out of the question, it may be more generally useful outside of type-checking.

Look how easily (?) everyone started using "my" variables instead of just variables. With the carrot of Compiletime-Bound-Speed for the _low_low_cost_ of declaring things you know are going to be working together with object names (no complex declarations required, this is perl, the objects spring into existence based on consistent usage w/in the code in question) it can't help but catch on fairly quickly. Even if it means the parser has to do a linking pass.

; what happens when an external function (say, from a module) is being very-strict, but is passed arguments from code that doesn't do type checking?

This is documented in the module's documentation, so (I say this at the risk of bringing on the wrath of those who hate C++ casting) a conversion method must be called.

I had also thought of the ability to write wrapper subroutines, with the appropriately declared parameters and return value, for those modules and subroutines that don't provide them. The point is to leave it up to the person wanting type checking to make sure it's working everywhere, and not force it on anyone else in -any- way.

Maybe objects that fail to provide impliable interfaces to and from CSTRING and DOUBLE could generate a compile-time warning, instead of (or in addition to) just stringifying into RESTAURANT::INDIAN(0xFF23D) and zero, respectively.

Michael
--
Administrator                      www.shoebox.net
Programmer, System Administrator   www.gallanttech.com
--

--
David Nicol 816.235.1187 [EMAIL PROTECTED]
:wq
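A hedged sketch of the two mechanisms discussed here, in Python for illustration: an explicit conversion step for untyped input, and a wrapper that adds declared parameter conversions to a subroutine that does not provide them. The names to_int and typed are hypothetical. Unlike C's atoi(), which quietly yields 0 for text it cannot parse, this conversion raises an error, which is closer to what a very-strict mode would presumably want.

```python
def to_int(text):
    """Explicit conversion from user input; fails loudly on bad input."""
    try:
        return int(text.strip())
    except ValueError:
        raise TypeError(f"cannot convert {text!r} to int") from None

def typed(*converters):
    """Wrap a function so every positional argument is converted on the way in."""
    def decorate(func):
        def wrapper(*args):
            if len(args) != len(converters):
                raise TypeError(f"expected {len(converters)} arguments, got {len(args)}")
            return func(*(conv(arg) for conv, arg in zip(converters, args)))
        return wrapper
    return decorate

@typed(to_int, to_int)
def add(a, b):
    """Inside the wrapper boundary, both arguments are guaranteed to be ints."""
    return a + b

total = add("2", " 40 ")  # strings from user input, converted at the boundary

try:
    add("2", "forty")
    rejected = False
except TypeError:
    rejected = True
```

The untyped caller is not forced to change anything; only the code that wants the checking pays for it, at the point where values cross into it.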