Re: RFC 61 (v2) Interfaces for linking C objects into pe
Hildo Biersma quoted RFC 62 and then went on thusly:

> XS is an excellent medium for sharing the glory of the internals of
> Perl with the C programming community. It is hoped that the interface
> described herein will become an excellent medium for sharing the
> glory of the internals of well-written C libraries with the Perl
> programming community.
>
> At TPC, Chip made the excellent suggestion that we should look at
> supporting two different interfaces: one for C and one for C++. For
> various reasons, these interfaces could offer the same functionality
> in quite different ways. Any ideas on that?
>
> Hildo

That is the direction I intend RFC 61 to go in; when I refer to
"external*" in it I am suggesting that the Fortran linkage be
C<use externalf> and so on. I think we should offer any number of
interfaces, one for each language -- C, C++, Fortran, Lego Power Motor
Language, anything that can be defined -- and have a clear expansion
path for defining new definition languages and linking them in. That is
why I refer to the DefinitionStructure as a type of object in its own
right, analogous to a compiled regular expression in perl5: I hope that
DefinitionStructure compilers will appear for various languages.

C is the compiled language I am most familiar with, and I have been led
to believe that its data structure linkage is very well defined, making
it suitable as the example language, and the first language. I believe
the RFC is a template for further, or more general, specs for other
languages, but C would be a good one to do first. Due to the similarity
of C and C++ data structure definition syntax, maybe both of those at
the same time -- do I get to choose the letter for C++? I'd like to
leave that to the implementor, but C<externalcpp> is the first thing
that comes to mind, along with C<externalhpp> for class and function
definitions.
Would tying a hash to an external C++ class import all the methods and
bless the hash into the appropriate package, or would two steps be
required, in order to better support variant types? I think the two
steps. (It is possible to tie something to one class and bless it into
another, is it not?) If we are to keep the package == namespace
correlation, we'll need to define strong subpackages in order to use
object methods that exist within a namespace. That's my thought on
applying RFC 61 version 2 to C++.

At your service,
David Nicol
-- 
David Nicol 816.235.1187 [EMAIL PROTECTED]
http://dev.perl.org/rfc/37.pod
http://dev.perl.org/rfc/37.pod

And if we adopt complex data structures (how complex? maybe just
wrappers for C structs, turned into very fast hashes) as suggested in
RFC 61, we could return those special, limited pseudohashes with only
the relevant names, resolvable into offsets at compile time, instead of
the fully hashed hashes suggested in RFC 37. For backwards
compatibility, they could still return their list in array context.
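A minimal C sketch of that "names resolvable into offsets" idea. All
names here (C<struct point>, C<field_offset>) are illustrative, not
from any RFC: the point is that a limited key set can be mapped to
fixed struct offsets once, after which element access is plain pointer
arithmetic instead of a hash lookup.

```c
#include <stddef.h>
#include <string.h>
#include <assert.h>

/* Hypothetical C struct such a wrapper "pseudohash" might expose. */
struct point { double x, y; };

/* Name -> offset table: the analogue of resolving the limited set of
   hash keys into fixed offsets at compile time. */
struct field { const char *name; size_t offset; };

static const struct field point_fields[] = {
    { "x", offsetof(struct point, x) },
    { "y", offsetof(struct point, y) },
};

static size_t field_offset(const char *name)
{
    size_t i;
    for (i = 0; i < sizeof point_fields / sizeof *point_fields; i++)
        if (strcmp(point_fields[i].name, name) == 0)
            return point_fields[i].offset;
    return (size_t)-1;          /* key not in the limited name set */
}

/* once the offset is known, access is pointer arithmetic, no hashing */
static double get_field(const struct point *p, const char *name)
{
    return *(const double *)((const char *)p + field_offset(name));
}
```

A real compiler would do the C<field_offset> lookup once at compile
time and emit only the pointer addition.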
selectively invalidating cached inherited methods (as Re: Method call optimiza
Dan Sugalski wrote:

> ... not work. I think we're going to have to have a doubly-linked
> list going for @ISA, so when a parent package changes the child
> packages get changed too. It'll make updates to @ISA more expensive,
> but if you do that then you ought to be prepared to take a hit.
>
> Dan

Well said. What about if methods (in parent classes) keep a list of
which other classes have inherited the method -- by method -- so
they'll know who to signal when one gets redefined?

Speaking as a well-connected perl6 class, when I change my @ISA I would
have to:

 * clear from my cache those methods which were inherited

 * if applicable, reset the function pointers in any op-nodes that
   refer to my inherited classes (see below) back from (function
   address 0xwhatever) to (ask me again)

 * inform (issue a "definitioninvalid" message to) all the classes
   that have actively inherited methods from me (we have a reference
   list for this too) -- on either all of those methods or only the
   ones that have changed. If all I did was switch from a CAR::SPORTS
   to a CAR::SPORTS::CANADIAN, maybe all the methods are the same, so
   I could check the methods that have been inherited from me and only
   issue definitioninvalid messages for the ones that actually changed.

(see below): these would be ops that were tagged at compile time as
going to be working with specific classes and are therefore
double-optimized.

-- 
David Nicol 816.235.1187 [EMAIL PROTECTED]
:wq
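A C sketch of that push-invalidation scheme: a parent method keeps a
list of classes that cached it, and redefinition clears exactly those
caches. Everything here is hypothetical scaffolding, not perl
internals; a real version would key the subscriber list by method name
and grow it dynamically.

```c
#include <stddef.h>
#include <assert.h>

#define MAX_SUBSCRIBERS 8

struct klass {
    void (*cached_meth)(void);  /* NULL means "look it up again" */
};

struct method {
    void (*impl)(void);
    struct klass *subscribers[MAX_SUBSCRIBERS]; /* who inherited us */
    int nsubs;
};

/* a child caching the method also registers for invalidation */
static void inherit(struct method *m, struct klass *k)
{
    k->cached_meth = m->impl;
    if (m->nsubs < MAX_SUBSCRIBERS)
        m->subscribers[m->nsubs++] = k;
}

/* redefinition issues the "definitioninvalid" message: every cache
   holding the old pointer is cleared */
static void redefine(struct method *m, void (*new_impl)(void))
{
    int i;
    m->impl = new_impl;
    for (i = 0; i < m->nsubs; i++)
        m->subscribers[i]->cached_meth = NULL;
}

/* two dummy implementations for demonstration */
static void old_impl(void) {}
static void new_impl_fn(void) {}
```

The next method call on an invalidated child sees the NULL cache and
walks @ISA afresh, paying the lookup cost only when something actually
changed.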
Re: re rfc 15 typing
Michael Fowler wrote:

> Which then raises a few more problems (whew): how do you coax user
> input (which is an SV) into a value $foo can accept with very-strict
> on?

You run it through an explicit conversion process, like using
C<atoi()> in a C program.

> Unfortunately, this involves more cooperation from the compiler; it
> has to provide a way of declaring the return value of a subroutine.
> I'm not sure if this is out of the question; it may be more generally
> useful outside of type-checking.

Look how easily (?) everyone started using "my" variables instead of
just variables. With the carrot of compile-time-bound speed for the
_low_low_cost_ of declaring things you know are going to be working
together with object names (no complex declarations required, this is
perl, the objects spring into existence based on consistent usage
within the code in question) it can't help but catch on fairly
quickly. Even if it means the parser has to do a linking pass.

> What happens when an external function (say, from a module) is being
> very-strict, but is passed arguments from code that doesn't do type
> checking?

This is documented in the module's documentation, so (I say this at
the risk of bringing on the wrath of those who hate C++ casting) a
conversion method must be called. I had also thought of the ability to
write wrapper subroutines, with the appropriately declared parameters
and return value, for those modules and subroutines that don't provide
them. The point is to leave it up to the person wanting type checking
to make sure it's working everywhere, and not force it on anyone else
in -any- way.

Maybe objects that fail to provide implied interfaces to and from
C<STRING> and C<DOUBLE> could generate a compile-time warning, instead
of (or in addition to) just stringifying into
RESTAURANT::INDIAN(0xFF23D) and zero, respectively.

-- 
David Nicol 816.235.1187 [EMAIL PROTECTED]
:wq
string types proposal
Larry Wall wrote:

> By the way, don't take this as a final design of string types
> either. :-)

If string types are a tree of partially full nodes of string data, the
representation of each sNode could be independent of the others. The
original idea behind using partially full nodes is, you can do
substitutions that affect the length in the middle without needing to
rewrite the whole thing. But with variable string data types, each
node can have a variable type. Instead of redefining the whole thing
from utf8 to utf16 when a chr(3852) arrives (I have no idea how that
looks from a data stream perspective, but I am assuming that it is
robustly defined) we can just redefine the sNode that the big
character lives in.

Also, to optimize most calls to eq (as well as all the other basic
comparison operators), the immediate data area defined in the SV
structure that is used for holding numeric data could hold, for pure
string/raw data, the first letter or two.
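A minimal C sketch of what such an sNode might look like; the names
and the encoding list are my own guesses, not any settled design. Each
node carries its own encoding, so one wide character only forces its
own node to a wider representation.

```c
#include <stddef.h>
#include <assert.h>

/* illustrative per-node encodings */
enum s_enc { ENC_RAW, ENC_UTF8, ENC_UTF16 };

struct s_node {
    enum s_enc enc;        /* this node's own representation    */
    size_t used, cap;      /* partially full: used <= cap, so a
                              mid-string edit only touches here */
    unsigned char *data;
    struct s_node *next;
};

/* total byte length is the sum over the chain; a real implementation
   would also track character counts per node */
static size_t s_bytes(const struct s_node *n)
{
    size_t total = 0;
    for (; n; n = n->next)
        total += n->used;
    return total;
}
```

When a chr(3852) arrives, only the node containing it would be
re-encoded from ENC_UTF8 to ENC_UTF16; every other node keeps its
cheaper representation.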
Re: RFC 83 (v2) Make constants look like variables
Internally, how to do it: ASSIGNMENT, as in
change-my-values-without-changing-my-type, is in the vtable for the
object. When an object becomes constant, its ASSIGNMENT function gets
replaced with

    $heres_what_it_gets_replaced_with =
        sub { throw "ERROR-ASSIGNMENT-TO-CONSTANT" };

or something very much like that. Constanting an array or hash as
described in RFC 83 will require an iteration over the container.

> It is proposed that a new syntax for declaring constants be
> introduced:
>
>     my $PI : constant = 3.1415926;
>     my @FIB : constant = (1,1,2,3,5,8,13,21);
>     my %ENG_ERRORS : constant = (E_UNDEF => 'undefined',
>                                  E_FAILED => 'failed');
>
> Constants can be lexically or globally scoped (or any other new
> scoping level yet to be defined). If an array or hash is marked
> constant, it cannot be assigned to, and its elements cannot be
> assigned to:
>
>     @FIB = (1,2,3);                      # Compile time error
>     $FIB[0] = 2;                         # Compile time error
>     %ENG_ERRORS = ();                    # Compile time error
>     $ENG_ERRORS{E_UNDEF} = 'No problem'; # Compile time error
>
> To create a reference to a constant use the reference operator:
>
>     my $ref_pi = \$PI;
>
> To create a constant reference use a reference operator in the
> declaration:
>
>     my $a = 'Nothing to declare';
>     my $const_ref : constant = \$a;
>
> Note that this does not make the scalar referenced become constant:
>
>     $$const_ref = 'Jewellery';  # No problems
>     $const_ref = \4;            # Compile time error
>
> =head1 IMPLEMENTATION
>
> Constants should have the same behaviour as they do now. They should
> be inlined, and constant expressions should be calculated at compile
> time.
>
> =head1 EXTENSIONS
>
> It may be desirable to have a way to remove constness from a value.
> This will not be covered in this RFC--if it is required a separate
> RFC should be written referencing this one.

-- 
David Nicol 816.235.1187 [EMAIL PROTECTED]
Damian Conway for president
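The vtable trick can be shown in a few lines of C. This is a sketch
under my own names (C<assign_fn>, C<make_constant>), not perl's actual
vtable layout: making a value constant swaps its assignment slot for
one that refuses.

```c
#include <assert.h>

struct scalar;
typedef int (*assign_fn)(struct scalar *, double);

struct scalar {
    double    value;
    assign_fn assign;        /* the ASSIGNMENT vtable slot */
};

static int assign_normal(struct scalar *s, double v)
{
    s->value = v;
    return 0;
}

/* the replacement slot: stand-in for throwing
   "ERROR-ASSIGNMENT-TO-CONSTANT" */
static int assign_constant(struct scalar *s, double v)
{
    (void)s; (void)v;
    return -1;
}

/* constanting a scalar is one pointer store; constanting an array or
   hash would iterate this over the container, as the post says */
static void make_constant(struct scalar *s)
{
    s->assign = assign_constant;
}
```

No per-assignment "is it constant?" check is needed anywhere: the cost
is paid once, at the moment the value becomes constant.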
Re: RFC 130 (v1) Transaction-enabled variables for Perl6
I wrote a transaction-enabled database in perl 5, using fork() for my
multithreading and flock() for my mutexes. It continues to work just
fine.

Threading gives us mutexes. Are you just saying that every variable
should have its own mutex, and perl's = assignment operator should
implicitly set the mutex? Giving every variable its own mutex would
mean we could have greater control if needed -- something like
flock(SH|EX|NB|UN) semantics on anything that can be an lvalue.

How about rewriting it as an extension to flock() involving mutexes
and sending it to perl6-language?

-- 
David Nicol 816.235.1187 [EMAIL PROTECTED]
Damian Conway for president
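A C sketch of "every variable gets its own mutex, and assignment
implicitly takes it", using POSIX threads; the struct and function
names are mine. The non-blocking variant mirrors flock's LOCK_NB
semantics.

```c
#include <pthread.h>
#include <assert.h>

/* hypothetical variable body carrying its own mutex */
struct shared_var {
    pthread_mutex_t lock;
    long            value;
};

/* the implicit-lock assignment the post imagines '=' doing */
static void var_assign(struct shared_var *v, long n)
{
    pthread_mutex_lock(&v->lock);
    v->value = n;
    pthread_mutex_unlock(&v->lock);
}

/* flock(NB)-style semantics: fail instead of blocking */
static int var_assign_nb(struct shared_var *v, long n)
{
    if (pthread_mutex_trylock(&v->lock) != 0)
        return -1;               /* somebody else holds it */
    v->value = n;
    pthread_mutex_unlock(&v->lock);
    return 0;
}
```

The cost argument writes itself from this sketch: every plain
assignment pays a lock/unlock pair, which is why making it optional
(per-variable, as an attribute) is attractive.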
incorporate VMS into perl entirely
512 byte pages, stored on a permanent device and paged in as required, for everything -- David Nicol 816.235.1187 [EMAIL PROTECTED] Does despair.com sell a discordian calendar?
Re: Vtable speed worry
> No, because each table lookup takes less time than comparing one
> letter of a text string.

    sv->vtable->svpvx

Isn't this going to really, really hurt?

-- 
David Nicol 816.235.1187 [EMAIL PROTECTED]
Does despair.com sell a discordian calendar?
extremely general top level threaded loop
John Porter wrote:

> "All flow control will be implemented as longjumps."
>     -- John Porter

    # language_description.pl has a lot to do.
    # this is a general, threaded, top-level loop.
    # a no-brainer.

    $rearrange = sub {              # so this can be redefined
        my ($t, $n);
        for (my $i = 0; $i < @_; $i++) {
            $t = $_[$n = rand @_];
            $_[$n] = $_[$i];
            $_[$i] = $t;
        }
        @_;
    };

    require "language_description.pl";  # must define the following:
    push @threads, tokens2thread tokenize();

    while ($rearrange->(@threads)) {
        $_->load_run_unload for @threads;
    }

-- 
David Nicol 816.235.1187 [EMAIL PROTECTED]
Re: RFC 146 (v1) Remove socket functions from core
Nathan Torkington wrote:

> Moving getprotobyname() to a module isn't the same as moving open().

And it can be transparent, if it isn't already. Why does perl need to
be monolithic? I thought I selected to build it as shared libraries;
splitting that into several shared libraries might be entirely
painless. So if my program has getprotobyname in it, clarifying that
token will take a moment, but then all the other weird socket calls
will be there to use.

How about automatic library search before syntax error?

-- 
David Nicol 816.235.1187 [EMAIL PROTECTED]
safety first: Republicans for Nader in 2000
multidim. containers
You can make multidimensional containers in perl5 by settling on a
syntax for combining all the dimensions into a key value and using
that as the key in a hash.

If arrays as we know them are implemented by using a key space
restricted to integers, I think a reasonable way to get matrices would
be to open up their key space to lists of integers. A comparison
operator that works on lists of integers --

    until ($a[$n] != $b[$n]) { $n++ }  # checking they both still exist

-- to use within the tree structure would work; that would group data
by last listed dimension. This is for the sparse case; as soon as it
gets more than half full (do we do some simulating to determine the
optimal percentage? have it settable as a container attribute?) we
copy it to a big C array of data or pointers, depending on whether the
data are constantly sized.

Have I got the idea?

-- 
David Nicol 816.235.1187 [EMAIL PROTECTED]
safety first: Republicans for Nader in 2000
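Both halves of the idea fit in a few lines of C; the separator, buffer
size, and function names below are arbitrary choices of mine. The
first function is the perl5 combine-the-dimensions-into-one-key trick;
the second is the lists-of-integers comparison for the sparse tree
case.

```c
#include <stdio.h>
#include <string.h>
#include <assert.h>

/* the $matrix{"$i;$j"} trick: join the dimensions into a single key */
static void dim_key(char *buf, size_t buflen, int i, int j)
{
    snprintf(buf, buflen, "%d;%d", i, j);
}

/* compare two integer key lists: scan until the coordinates differ,
   checking both lists still exist; shorter prefix sorts first */
static int key_cmp(const int *a, size_t alen, const int *b, size_t blen)
{
    size_t n = 0;
    while (n < alen && n < blen) {
        if (a[n] != b[n])
            return a[n] < b[n] ? -1 : 1;
        n++;
    }
    return (alen < blen) ? -1 : (alen > blen) ? 1 : 0;
}
```

Note that comparing last dimension first, rather than first dimension
first as here, would give the group-by-last-listed-dimension ordering
the post describes.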
Re: Structuring perl's guts (Or: If a function is removed from an interpreter, but you can still use it transparently, is it really gone?)
Dan Sugalski wrote:

> If it's decreed that fork is just there without having to do
> something special, then it will be no matter what magic needs to be
> done.

    package refimpl::builtins;

    sub fork {
        $refimpl::threads::deepcopy_target =
            new refimpl::perprocglobaltop;
        push @main::threads,
            refimpl::threads::deepcopy(@main::threads);
    }
Re: Why except and always should be a seperate RFC.
perl5 sort of already has an C<always>, in that DESTROY() methods are
called on any blessed lexicals when the scope closes. Taking advantage
of that for closing a file works if you hide your files in an object
class or equivalent chicanery. Allowing user code into the list of
things that perl does on shutting down a scope should be a breeze.
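The DESTROY-on-scope-exit pattern can be approximated in C with the
GCC/Clang C<cleanup> variable attribute (a compiler extension, not
standard C); the names here are illustrative. The handler runs
whenever the variable leaves scope, which is exactly the
close-the-file-no-matter-what behavior described above.

```c
#include <stdio.h>
#include <assert.h>

static int closed_count = 0;    /* evidence the "destructor" ran */

static void close_it(FILE **fp)
{
    if (*fp) {
        fclose(*fp);
        closed_count++;
    }
}

static void use_file(void)
{
    /* like a blessed lexical: cleanup fires on scope exit */
    __attribute__((cleanup(close_it))) FILE *f = tmpfile();
    fprintf(f, "scoped\n");
}   /* close_it(&f) runs here, like DESTROY when the scope closes */
```

An C<always> block would generalize this from one variable's
destructor to arbitrary user code attached to the scope.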
Re: RFC 143 (v1) Case ignoring eq and cmp operators
> nothing to do with 119 vs 88 discussion.

No, it isn't in any discussion; it's just how I imagine a
tokenizer/clarifier would work.

Any subroutine declaration, for instance

    sub Cmp:infix($$){ return uc($_[0]) cmp uc($_[1]) };

implicitly sets up a "catch unknown-keyword:Cmp" routine; that is, it
installs the name of the function in a place the clarifier will know
to look for the definition. It doesn't convert it to opcodes, doesn't
"parse" it yet, just stores the token string.

Later, while parsing some expression or other, Cmp is encountered.
BAREWORD:Cmp is looked up in the whaddayaknow table, and there it is,
an infix subroutine taking two scalar arguments, so if that makes
sense with what is in front of and behind it, it gets evaluated as
such. It's an exception in that it is not in the short list of
functions I've used very recently, or something like that.

Nathan Torkington wrote:

> David L. Nicol writes:
> > If we use exceptions of some kind to handle syntax, encountering
> > an exception of type "unknown-keyword:Cmp" could result in the
> > subroutine definition getting run to clarify this piece of code.
>
> I'm nervous about this. I'm trying to picture what happens, and
> having trouble. Could you post some hypothetical code that would
> trigger the exception (including the loading of the module that
> defines the exception) so I can better see what you're proposing?
> If this was in last week's discussion, please send me a pointer to
> the archives.
>
> Thanks, Nat

-- 
David Nicol 816.235.1187 [EMAIL PROTECTED]
safety first: seat-belt wearers for Nader in 2000
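A C sketch of that whaddayaknow table, under hypothetical names: a
declaration stores the raw token string unparsed, and the clarifier
only marks it compiled the first time the bareword is actually
encountered.

```c
#include <string.h>
#include <stddef.h>
#include <assert.h>

struct deferred {
    const char *name;     /* e.g. "Cmp"                          */
    const char *source;   /* unparsed definition text            */
    int         parsed;   /* has the clarifier compiled it yet?  */
};

/* what "sub Cmp:infix($$){...}" would install: just the token string */
static struct deferred table[] = {
    { "Cmp", "sub Cmp:infix($$){ uc($_[0]) cmp uc($_[1]) }", 0 },
};

/* BAREWORD:Cmp is looked up here; a hit is compiled on demand, a
   miss is a genuine unknown keyword */
static struct deferred *clarify_bareword(const char *bareword)
{
    size_t i;
    for (i = 0; i < sizeof table / sizeof *table; i++)
        if (strcmp(table[i].name, bareword) == 0) {
            table[i].parsed = 1;    /* stand-in for parsing it now */
            return &table[i];
        }
    return NULL;
}
```

The "exception" framing above is just this lookup failing for names
that were never installed.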
Re: RFC 155 - Remove geometric functions from core
Well then. Is it impossible to rearchitect it to make it shared text?
Perhaps the first instance of perl sets up some vast shared memory
segments and a way for the newcomers to link in to it and look at the
modules that have been loaded, somewhere on this system, and use the
common copy? This handwringing naysaying is depressing.

Tom Christiansen wrote:

> Disastrously, you will then also lose the shared text component,
> which is what makes all this cheap when Perl loads. Since the
> modules will have to be pasted in the data segment of each process
> that wants them, they aren't going to be in a shared region, except
> perhaps for some of the non-perl parts of them on certain
> architectures. But certainly the Perl parts are *NEVER* shared.

This sounds like a problem to be fixed. Relax, Tom, we'll take it from
here.

> That's why the whole CGI.pm or IO::whatever.pm stuff hurts so badly:
> you run with 10 copies of Perl on your system (as many people do, if
> not much more than that), then you have to load them, from disk,
> into each process that wants them, and the result of what you've
> loaded cannot be shared, since you loaded and compiled source code
> into non-shared parse trees. This is completely abysmal. Loading
> bytecode is no win: it's not shared text.
>
> --tom

-- 
David Nicol 816.235.1187 [EMAIL PROTECTED]
Ask me about sidewalk eggs
Re: RFC 155 - Remove geometric functions from core
Sam Tregar wrote:

> On Tue, 29 Aug 2000, David L. Nicol wrote:
> > Well then. Is it impossible to rearchitect it to make it shared
> > text? Perhaps the first instance of perl sets up some vast shared
> > memory segments and a way for the newcomers to link in to it and
> > look at the modules that have been loaded, somewhere on this
> > system, and use the common copy?
>
> That approach invites big security problems. Any system that
> involves one program trusting another program to load executable
> code into their memory space is vulnerable to attack. This kind of
> thing works for forking daemons running identical code since the
> forked process trusts the parent process. In the general case of a
> second perl program starting on a machine, why would this second
> program trust the first program to not load a poison module?

Does SysV shm not support security equivalent to the file system's?
Did I not just describe how a .so or a DLL works currently?

Yes, the later perls would have to trust the first one to load the
modules into the shared space correctly, and none of them would be
allowed to barf on the couch. A paranoid mode would be required in
which you don't use the shared pre-loaded module pool.

In the ever-imminent vaporware implementation, this whole thing may be
represented as a big file into which we can seek() to locate stuff.

-- 
David Nicol 816.235.1187 [EMAIL PROTECTED]
Yum, sidewalk eggs!
Re: RFC 146 (v1) Remove socket functions from core
Nick Ing-Simmons wrote:

> We need to distinguish "module", "overlay", "loadable", ... if we
> are going to get into this type of discussion. Here is my 2¢:
>
> Module   - separately distributable Perl and/or C code
>            (e.g. Tk800.022.tar.gz)
> Loadable - OS loadable binary, e.g. Tk.so or Tk.dll
> Overlay  - tightly coupled ancillary loadable which is no use
>            without its "base" -- e.g. Tk/Canvas.so, which can only
>            be used when a particular Tk.so has already been loaded.

I know I've got helium karma around here these days, but I don't like
"overlay"; it is reminiscent of old IBM machines swapping parts of the
program out because there isn't enough core.

Linux kernel modules have dependencies on each other, and sometimes
you have to load the more basic ones first or else get
symbol-undefined errors. So why not follow that lead and call overlays
"dependent modules"? If a dependent module knows what it depends on,
that module can be loaded on demand for the dependent one.

-- 
David Nicol 816.235.1187 [EMAIL PROTECTED]
Yum, sidewalk eggs!
Re: RFC 146 (v1) Remove socket functions from core
Dan Sugalski wrote:

> > Oh, and then they will be unloaded if we need the space for
> > something else. I understand now, thanks.
>
> Well, probably not, though that could be reasonable for a particular
> platform. It's only relevant for a persistent interpreter anyway --
> for ones fired up fresh it doesn't matter, since they won't have
> anything loaded to start.

Still, if there was some networking code at the beginning, but only
the beginning, and other code later, explicitly marking the
now-unreachable code as recyclable could be a win, if it isn't much
trouble. But that's what LRU paging does anyway -- what platforms are
we talking about that don't have LRU paging?

-- 
David Nicol 816.235.1187 [EMAIL PROTECTED]
Subroutine one-arg, him called no-arg, get $_-arg. Ug.
Re: the C JIT
Ken Fox wrote:

> Trolling?

No, I'm not; it's the direction that RFC 61 ends up if you let it take
you there. Fast perl6 becomes, as well as slicing, dicing and
scratching your back, a drop-in replacement for gcc.

-- 
David Nicol 816.235.1187 [EMAIL PROTECTED]
Kansas City Perl Mongers will meet Sept. 20th at 7:00 in Westport Flea
Market Bar & Grill  http://tipjar.com/kcpm
Re: A tentative list of vtable functions
Dan Sugalski wrote:

> Okay, here's a list of functions I think should go into variable
> vtables. All the math functions are in here.

Can the entries that my type does not use be replaced with other
functions that my type does use?

> Functions marked with a * will take an optional type offset so we
> can handle asking for various permutations of the basic type.

These aren't going to be that huge then, with each one taking
void (*thismuch[30])() or so; why don't we skip the "optional offset"
and make subclasses keep their whole own copy? Will 240 bytes per type
blow out a modern cache?

> type name
> get_bool
> get_string *
> get_int *
> get_float *
> get_value
> set_string *
> set_int *
> set_float *
> set_value
> add *
> subtract *
> multiply *
> divide *
> modulus *
> clone        (returns a new copy of the thing in question)
> new          (creates a new thing)
> concatenate
> is_equal     (true if this thing is equal to the parameter thing)
> is_same      (true if this thing is the same thing as the parameter
>               thing)
> logical_or
> logical_and
> logical_not
> bind         (for =~)
> repeat       (for x)
>
> Anyone got anything to add before I throw together the base vtable
> RFC?

clarify (where does this type go to resolve uncached method names?) --
or is that better kept in a global clarifier that keeps its own
mapping?

I think the list is too long, for the base type. Could the base type
be something that just knows how to return its type name, in order to
build types that do not have defined STRING methods?

> Dan
>
> --------------------------------------"it's like this"---------------
> Dan Sugalski                          even samurai
> [EMAIL PROTECTED]                     have teddy bears and even
>                                       teddy bears get drunk

-- 
David Nicol 816.235.1187 [EMAIL PROTECTED]
Kansas City Perl Mongers will meet Sept. 20th at 7:00 in Westport Flea
Market Bar & Grill  http://tipjar.com/kcpm
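A cut-down rendering of the list above as a C struct of function
pointers; only a few slots are shown and the signatures are my
guesses, not Dan's actual proposal. The "240 bytes per type" estimate
is just 30 pointers at 8 bytes each.

```c
#include <assert.h>

struct thing;          /* opaque variable body */

/* illustrative subset of the proposed slots; starred entries take
   the optional type offset */
struct vtable {
    const char *type_name;
    int    (*get_bool)(struct thing *self);
    long   (*get_int)(struct thing *self, int type_offset);
    double (*get_float)(struct thing *self, int type_offset);
    void   (*add)(struct thing *dest, struct thing *a, struct thing *b,
                  int type_offset);
    struct thing *(*clone)(struct thing *self);
    struct thing *(*new_thing)(void);
};
```

The minimal-base-type argument in the reply amounts to shrinking this
struct to little more than C<type_name>, at the price of a
does-this-slot-exist check before every dispatch.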
Re: the C JIT
Ken Fox wrote:

> ... The real problems of exception handling, closures, dynamic
> scoping, etc. are just not possible to solve using simple C code.
>
> - Ken

I'm not talking about translating perl to C code, I'm talking about
translating perl to machine language. C is babytalk compared to Perl,
when it comes to being something which is translatable to machine
language. Ug.

-- 
David Nicol 816.235.1187 [EMAIL PROTECTED]
Kansas City Perl Mongers will meet Sept. 20th at 7:00 in Westport Flea
Market Bar & Grill  http://tipjar.com/kcpm
Re: A tentative list of vtable functions
Dan Sugalski wrote:

> We're shooting for speed here. Any common operation that could be
> affected by the type of the variable should be represented so a
> custom function can be called that does exactly what needs to be
> done.
>
> Dan

So if I want to make up a type that is strictly a 16-bit integer, I
overload everything except the math operations with pointers to
errors? That's the direction I'm going (off by myself), to merging C
in, and that's why an even more limited base type appeals to me. But
of course, with a more limited base type, every call to plus would
have to check to see if plus was there before resolving, instead of
just hopping over the edge and trusting the top end of the rope to be
tied.
Re: the C JIT
David Corbin wrote:

> A C JIT is an interesting idea. I think that a project works best
> when it has a set of goals (I haven't seen one yet really for Perl
> 6). Unless this is one of the goals, I can easily see how this could
> become a serious distraction to what I perceive as the likely goals
> of Perl 6.
>
> -- David Corbin

What is and what is not a goal? The danger of getting semantic about
what the conversation is about -- arguments over, is it a function, a
subroutine, or a method, and why, for instance -- is very real.

Perl looks, and AFAIK has always looked, like "C plus line noise" to
many people. To adopt that as a listed goal -- yet another extended C
-- may not be a new goal but rather a slightly different viewpoint for
including several previously stated goals, including:

    strong typing
    polymorphism
    run-time efficiency

The ability to parse various input syntaxes _is_ on the perl6 agenda,
since LW mentioned it in his initial announcement.

The idea of a "C to Perl translator" has been kicked around as a funny
joke in various forums, such as the FunWithPerl list for one. The "C
to Perl translator" is funny (with current perl) for these reasons:

 1: Efficiencywise, it is backwards. C is speedier and is to be
    preferred, when you have something that works in C.
 2: It seems like it would be trivial to accomplish.
 3: If you already have working C code, why would you want to
    translate it to Perl rather than just use it as is?

One of the more recently stated goals is for perl6 to be fast, fast,
fast. If we have a C language front end for it, we will be able to
compare its approach with the mature compilers -- we may very well get
something that can take you from source code to running process faster
than `make; make run`.
Since C is very well defined and is very similar to perl -- the matching brackets are mostly the same, for instance, and the idea of what can be a variable name is very similar if not identical -- developing C mode might be easier than developing Python mode as an alternate mode for the "different front ends" goal. ... -- David Nicol 816.235.1187 [EMAIL PROTECTED]
Re: Perl Implementation Language
Dan Sugalski wrote: 1) How fast is the C (or whatever) code it emits likely to be? The perl-in-perl interpreter would not be a deliverable. Speed would not be its goal. It would be a reference implementation that would be easier to break and repair. An internals tutorial, if you will. So you don't have to go explaining what you mean by "vtable" freshly to every new person who figures it out on their own and gets it subtly wrong, for instance. It's a documentation exercise, the serious implementors would port it to the target language and would raise hell when something difficult to port shows up in it. -- David Nicol 816.235.1187 [EMAIL PROTECTED] perl -e'map{sleep print$w[rand@w]}@w=' ~/nsmail/Inbox
Re: RFC 302 (v1) Unrolling loops and tail recursion
Simon Cozens [EMAIL PROTECTED] formally RFC'd:

> I have no idea how to implement tail recursion elimination, and I'd
> dearly love to learn. Unrolling loops with constant indices
> shouldn't be too hard.

As I understand it, you trigger your destructors on the appearance of
the C<return> keyword rather than on the exiting of the block. The
problem is the life of locals that are referred to in the returned
expression, but that is not an issue if you mark them as exempt from
destruction or copy their values onto the new parameter list first. If
we are hastier with our destructions on returns, we get tail
optimizations for free. At what cost? I don't see much cost; please,
someone who understands the greater trickiness, explain it.

    # we've got some kind of data structure implemented in
    # a big, ugly hash %buh, and $buh{$key}{next} is the next
    # element from $key, if defined
    sub recursive_list_end($) {
        my $element = $buh{$_[0]};
        return $_[0] unless defined $element->{next};
        # warning: reference loops kill!
        return recursive_list_end($element->{next});
    }

I guess right now, C<return> works like a normal function up until it
does its thing and exits the scope, triggering destruction. It is
normal in that its arguments are evaluated fully. So all these
subroutine scopes stack up, waiting for their last argument, and it is
common to run out of memory or at least get seriously into your swap
space.

I think that tail recursion could be made possible by simply making
C<return> a more magical keyword: each thread would maintain a global
Thing_That_Is_Being_Returned pointer, and on entry into a C<return>
call, as opposed to on completion of the gathering of the parameter
list, That_Which_Is_Being_Returned starts pointing to the expression
which is C<return>'s argument list, and destruction of the current
scope happens (except for elements referred to from TWIBR's
parameters). We do not have to run complex heuristics to recognize
tail-optimizable situations if we always do things this way (early
destruction on return).
CLASSIC RECURSION: return is just another function, gathering all its
data before "running" and leaving the scope, at which time destruction
occurs.

TAIL-OPTIMIZED RECURSION: return triggers destruction, except for
things passed to functions named in its expressions. Memory which
would be tied up in stack frames and other undestroyed structures in
the classic model is available for reuse.

-- 
David Nicol 816.235.1187 [EMAIL PROTECTED]
"The most powerful force in the universe is gossip"
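The %buh example above, redone in C for comparison (structure and
names are mine): the recursive call is the last thing the function
does, so a compiler performing tail-call elimination (gcc or clang at
-O2) reuses the current stack frame instead of stacking a new one,
which is exactly the early-reclamation the post is after.

```c
#include <stddef.h>
#include <assert.h>

struct node { int key; struct node *next; };

/* walk to the end of the list recursively; beware reference loops,
   just like the perl version */
static int list_end(const struct node *n)
{
    if (n->next == NULL)
        return n->key;
    return list_end(n->next);   /* tail position: nothing left to do
                                   in this frame after the call */
}
```

Without the optimization, a million-element list means a million live
frames; with it, the function runs in constant stack, which is the
"memory available for reuse" of the tail-optimized model.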
Re: RFC 334 (v1) Perl should allow specially attributed subs to be called as C functions
Dan Sugalski wrote: If there's no hit, I'd love to have all perl functions callable from outside. I'm not sure that'll be the case, though I'm all for it... With the 334 infrastructure, the -o option to generate a linkable object from a perl program/library (RFC 121) will be most do-able: "specially attributed" functions get put in the .h file and linker symbol table, and normal functions still require conversion to/from PerlData before calling. It would be nice to add as much automatic conversion as possible based on information in prototypes. A C++ programmer could define some conversions from the types in their strongly typed compiled type system to the PerlData types, for instance. -- David Nicol 816.235.1187 [EMAIL PROTECTED] "After jotting these points down, we felt better."
Re: RFC 334 (v1) I'm {STILL} trying to understand this...
Dan Sugalski wrote:

> At 08:57 PM 10/12/00 +0100, Simon Cozens wrote:
> > On Thu, Oct 12, 2000 at 03:43:07PM -0400, Dan Sugalski wrote:
> > > Doing this also means someone writing an app with an embedded
> > > perl interpreter can call into perl code the same way as they
> > > call into any C library.
> >
> > Of course, the problem comes that we can't have anonymous
> > functions in C.
>
> Sure we do. You can get a pointer to a function, and then call that
> function through the pointer. (Though argument handling's rather
> dodgy)
>
> > That is, if we want to call Perl sub "foo", we'll really need to
> > call something like
> >     call_perl("foo", ..args... );
> > whereas we'd much rather do this:
> >     foo(..args..)
> > (Especially since C's handling of varargs is, well, unpleasant.)

    cat <<END >rfc334.h
    /* based on http://www.eskimo.com/~scs/C-faq/q15.4.html */
    void *call_perl(char *PerlFuncName, ...);
    END

Which then makes the RFC 121 -oh output simple and easy: perl routines
which have been marked with an RFC 334 attribute indicating their C
calling convention get two lines written to standard output (or the
designated header file). One is a macro, which will call call_perl
directly with the name and the args,

    #define fooDIRECT(A,B,C) call_perl("foo", A, B, C)

and the other is a wrapper:

    perlval *foo(int A, perlval *B, char *C);

-- 
David Nicol 816.235.1187 [EMAIL PROTECTED]
"After jotting these points down, we felt better."
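A toy stand-in for that call_perl() interface, showing the varargs
shape plus the generated DIRECT macro. The real function would look up
the named sub and marshal arguments into perl values; this sketch just
sums ints, with an explicit argument count because C varargs cannot
discover it on their own.

```c
#include <stdarg.h>
#include <assert.h>

/* hypothetical stand-in: name lookup and marshalling elided */
static long call_perl_toy(const char *func_name, int argc, ...)
{
    va_list ap;
    long sum = 0;
    (void)func_name;            /* a real version would dispatch on it */
    va_start(ap, argc);
    while (argc-- > 0)
        sum += va_arg(ap, int); /* real version: build perl values */
    va_end(ap);
    return sum;
}

/* the kind of macro the -o pass would emit per attributed sub */
#define fooDIRECT(A, B, C) call_perl_toy("foo", 3, (A), (B), (C))
```

The macro form gives callers the C<foo(..args..)> spelling Simon
wanted while everything still funnels through one varargs entry point.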
Re: (COPY) compile-time taint checking and the halting problem
Steve Fink wrote:

> It's standard semantic analysis. Both your taintedness analysis and
> my reachability analyses can be fully described by specifying what
> things generate the characteristic you're analyzing, what things
> block (in the literature, "kill") it, and the transfer rules. It's
> often not the best way of implementing it, since the fully general
> framework can be pretty slow, but it's a concise way of describing
> things that you'll find in many textbooks. They talk in terms of
> maintaining GEN and KILL sets and describing when to add, subtract,
> union, and intersect the incoming and outgoing GENs and KILLs.

So what would be involved in adding hooks for arbitrary semantic
analysis? What language features can be described to allow for
introduction of arbitrary SA passes? At what levels? What's a good
introductory glossary of SA terms? Things get confusing with many
reinvented wheels rolling around.
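The GEN/KILL formulation above fits in two lines of C on bitsets: one
bit per variable, the textbook transfer function out = gen | (in &
~kill) per basic block, and union as the meet at join points. The
taintedness reading of the bits is my illustration.

```c
#include <assert.h>

typedef unsigned int varset;   /* bit i set: variable i is tainted */

/* the standard dataflow transfer function for one basic block:
   what the block generates, plus what came in and wasn't killed */
static varset transfer(varset in, varset gen, varset kill)
{
    return gen | (in & ~kill);
}

/* at a control-flow join, taintedness merges by union: a value is
   tainted if it is tainted on any incoming path */
static varset meet(varset a, varset b)
{
    return a | b;
}
```

An "arbitrary SA pass" hook would then amount to letting user code
supply the gen and kill sets per op plus the meet operator, with the
iterate-to-fixpoint driver shared by all passes.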
How to tell (in perl5) if friz is a core command or not?
Anyone remember when I posted the top level of a language prototyping
environment? Well, that system has now grown a lexical parser that can
understand arbitrarily deep doublequotes, and I'm working on a
perl5-into-perl5 capability for it.

Is there a way to determine if a word is defined as a core command? (A
few methods come to mind, including getting a list from the
documentation and making a hash of it.)

-- 
David Nicol 816.235.1187 [EMAIL PROTECTED]
:syntax on
Re: interesting read: why the EROS project has switched from C++ to C
Simon Cozens wrote: On Tue, Jan 16, 2001 at 08:49:57PM +, David L. Nicol wrote: http://www.eros-os.org/pipermail/eros-arch/2001-January/002683.html Uhm. That's not *why* they're doing it, it's how they're doing it. Did you get the right URL? I thought I did -- now that message is http://www.eros-os.org/pipermail/eros-arch/2001-January/002666.html maybe pipermail is dynamic and gives new numbers each time it generates a list; that would suck. Here's the one I meant to give a pointer to:

[EROS-Arch] Conversion to C Jonathan S. Shapiro [EMAIL PROTECTED] Sat, 13 Jan 2001 10:47:19 -0500

Pardon my ignorance. I haven't been following the list too closely to understand the reasons EROS is being rewritten in C. I now have a more thorough answer, so I'm replying in detail. For a record of the earlier conversation on this topic, see the email archives at http://www.eros-os.org/mailman/listinfo The messages are archived in the eros-arch archive. Look for the early messages with the "Conversion to C" subject. Is it a performance issue? Indirectly, C++ is a performance problem. The C++ compiler generates code for exceptions unless this feature is disabled. The exception handling code has performance consequences. In particular, it restricts various kinds of code motion that the optimizer would like to perform. In a more immediate sense, however, the exception handling code also leads to greater contention in the instruction cache, and we are fighting fairly hard to keep the entire EROS working set in a small fraction of the I-cache. The EROS kernel does not generate exceptions (ever). Unfortunately, the compiler must compile under the assumption that operator new will be called somewhere and must therefore generate exception handling code. Typically, about 1/3 to 1/2 of the total code generated is exception handling code.
Bjarne argues that this code is never executed, and he is mostly correct. Unfortunately, this code is interspersed with the ordinary code, so it changes the cache contention behavior. [Note that this could be fixed by collecting the exception code into a separate code segment at the end of the application, and perhaps I should suggest this to the G++ team.] At the moment, we disable exception generation by passing extra options to the G++ compiler. The problem with this is that we are no longer compiling in standard C++. We are compiling in an odd extension that happens to be supported by G++. I am concerned both about code portability and about code clarity. Finally, name mangling schemes in C++ are not consistent across compilers, so writing assembly code that calls C++ with some semblance of portability is a problem. You can call a procedure that uses C linkage convention easily enough, and this can call the corresponding C++ procedure, but the extra call is undesirable overhead. Did you find that even with C++ your code was largely functional? No, but it is almost entirely *procedural*, which is probably what you meant. While there are many member functions, these could equally well be coded as C procedures taking a struct pointer. The current kernel makes essentially NO use of inheritance, which is what you would expect in a microkernel -- if you have complex enough structure to need inheritance, you aren't building a microkernel. The one place where inheritance was used (there is a Link structure for doubly linked list chains, but this doesn't qualify as serious) was in the driver code. Here it proved to be a mistake, because drivers need something more flexible -- something much closer to Java-style interface pointers. The difficulty here is that the C++ type checking is actually too strong. You can declare a pointer to a member function, but there is no way to declare "pointer to object s.t. object has a member function of type T".
In fact, what you really want to declare is a pair of the form (X *, RetType (X::*someFunc)(...args...)) without being obliged to ever say what X would be. Note that the compiler doesn't need to know. It really only needs to know that X* will be an appropriate thing to use as a /this/ pointer in an invocation of X::someFunc(). What's needed here is something like ML pattern matching, and C++ doesn't have it. The result is that there is no way to capture interface dispatch gracefully. In C, it is easier to do this because you can declare X* to be a void pointer. Actually, what is really going on in this case is a violation of the static type system in C++. In summary, using inheritance was a mistake. A secondary issue concerns the Process class. The vast majority of the kernel code (not surprisingly) is concerned in some form with man
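The "Java-style interface pointer" Shapiro wants is easy to spell in C precisely because the object pointer can be a void *: an interface is just a pair of (opaque object, operations table). A minimal sketch, with all the driver names invented for illustration:

```c
/* Java-style interface dispatch in C: a pair (void *self, ops table).
 * The caller dispatches through the table without ever knowing the
 * concrete object type.  All names below are invented. */
#include <assert.h>

typedef struct {
    int (*read_block)(void *self, int blockno);
} driver_ops;

typedef struct { void *self; const driver_ops *ops; } driver_iface;

/* One concrete "driver": scales the block number by a per-device factor. */
typedef struct { int scale; } scaled_dev;

static int scaled_read(void *self, int blockno)
{
    return ((scaled_dev *)self)->scale * blockno;
}

static const driver_ops scaled_ops = { scaled_read };

/* Generic code: dispatch through the interface pair. */
int iface_read(driver_iface d, int blockno)
{
    return d.ops->read_block(d.self, blockno);
}
```

The void * cast inside scaled_read is exactly the "violation of the static type system" the email describes: C lets you make it quietly, where C++ has no way to declare the pair without naming X.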
Re: Autovivification behavior
"Deven T. Corzine" wrote: On Sat, 23 Dec 2000, Graham Barr wrote: This has been discussed on p5p many many times. And many times I have agreed with what you wrote. However one thing you did not mention, but does need to be considered is func($x{1}{2}{3}) at this point you do not know if this is a read or write access as the sub could do $_[0] = 'fred'. If this can be handled in some way so that the autoviv does not happen until the assignment then OK, but otherwise I think we would have to stick with perl5 semantics This is a good point. Similar arguments apply to grep, map, foreach, etc. Could we have some sort of "lazy evaluation" mode for lvalues where a reference to an undefined substructure sets a flag and saves the expression for the lvalue, returning undef whenever evaluated in a read-only context and autovivifying when necessary to write to the lvalue? Deven Lazy autovivification could work by adding an on-assignment magic to the previously nonexistent SV, which details the creation of the intervening structures. This will have to chain: What if func defers the question of read or write to another layer down? An addition of a magic, so that values in previously undefined structures evaluate as undef but will create interveners as needed if and when they are assigned to (even if it is an assignment of undef), would work. The magic would have to consult the appropriate symbol tables when it runs, to avoid problems of two magics getting set up referring to the same nonexistent structure:

$first = \($$$hash{one}{two}{three});  # a reference to an undefined
                                       # scalar, with write-magic
$second = \($$$hash{one}{two}{four});  # undef with very similar write-magic
$$first = "first";
$$second = 2;  # magic runs but doesn't do anything since
               # %{$$hash{one}{two}} got vivified already.

-- David Nicol 816.235.1187 [EMAIL PROTECTED] "Live fast, die young, and leave a beautiful corpse"
Re: PDD for code comments ????
Jarkko Hietaniemi wrote: Some sort of simple markup embedded within the C comments. Hey, let's extend pod! Hey, let's use XML! Hey, let's use SGML! Hey, let's use XHTML! Hey, let's use lout! Hey, ... Either run pod through a pod puller before the C preprocessor gets to the code, or figure out a set of macros that can quote and ignore pod. The second is Yet Another Halting Problem so we go with the first? Which means a little program to depod the source before building it, or a -HASPOD extension to gcc. Or just getting in the habit of writing /* =pod and =cut */ -- David Nicol 816.235.1187 [EMAIL PROTECTED] "Nothing in the definition of the word `word' says that a word has to be in a dictionary to be called one." -- Anu Garg
Re: defined: Short-cutting on || with undef only.
I think "defined" should be altered so that it only looks like a function, but in effect alters the tests being made by the thing that is looking at it. if (defined $x){ # slower than if ($x){ # or if($x or defined($x)) could be made faster by propagating the "defined" question up the parse tree to the decision that is being made based on it, and having that decision only look as far as definition. In Perl 5, it would be written `defined($x) ? $x : "N/A"', but this has the problem that $x is evaluated twice, so it doesn't work if instead of $x we have a function call (or even if $x is tied...). With the propagation approach, there's no speed penalty for defined $x or $x = "N/A"; # slower than $x ||= "N/A" in perl 5 It's a perl5 speed optimization, not a perl6 language change -- David Nicol 816.235.1187 [EMAIL PROTECTED] Soon to take out full page ads looking for venture capitalists
Re: PDD for code comments ????
David Mitchell wrote: 4. Are we all agreed that in addition to anything else (eg rfc281), at least some of the standard commentary should appear actually within the src file itself? s/at least some/most, if not all/ 5. Do *all* these comments need to be extractable, or only ones related to published APIs etc? Initially the automaton creates them all extractable, and the coder can depod the ones that are implementation details and should not be published, as a measure more extreme than noting "This function is an implementation detail" in the mandatory comment for the function. Which implies an at least implicit standard lexicon of flexibility to use in these comments, ranging from "implements ISO standard interface" down to "experimental - failed, remove during clean-up phase" 6. Can we leave the details of pod/apidoc/rfc281 until 1..5 have been agreed? I can't. Can you? Altering a C prettyprinter to insert an extensible standard comment template before each function definition would be even easier than writing one from scratch. But what goes in that block of text, beyond /* =pod =head1 function $FunctionName returning $ReturnType =head1 Named arguments: @ArgNameList =cut */
Re: Tolkein (was Re: PDD for code comments ????)
Simply Hao wrote: Douglas Adams does seem rather more appropriate a source of quotes for software (anyone's, alas) than Pratchett. But Adams already has a software company. And Sirius pioneered the GPP in Perl 6.
Re: wacko idea
Uri Guttman wrote: i was looking at dan's PMC arena allocator struct and it reminded me of something an older language can do (which MJD has likened to an early perl :). ever heard of AREA in PL/I? it was a large chunk of ram dedicated to allocate memory from. what was special is that all pointers generated by these malloc calls (i forget the pl/1 syntax but it doesn't matter) were OFFSETS into the area and not direct pointers. the reason for this was that you can make a complex tree structure out of this AREA and then write the whole thing to disk as a single chunk, and in another program read it back in and your structure is back. since all the pointers are relative to the base of the AREA, you can read it back into any address in your virtual space. i like it. How do I indicate that my variable is to be taken from the (which) persistent space at a language level? Or would this be an internal infrastructure service which other things could build on. -- David Nicol 816.235.1187 [EMAIL PROTECTED] Described as awesome by users
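The AREA trick can be sketched in a few lines of C: allocation hands back offsets into a buffer instead of raw pointers, and links between records are stored as offsets, so the whole structure survives being copied (or written to disk and mapped back at a different address). All names below are invented for illustration:

```c
/* PL/I AREA sketch: an arena whose "pointers" are offsets from the
 * buffer base, so the structure is position-independent.  Offset 0 is
 * reserved as the null offset.  All names are invented. */
#include <assert.h>
#include <string.h>

#define AREA_SIZE 1024
#define AREA_NULL 0u

typedef struct { unsigned char buf[AREA_SIZE]; unsigned used; } area_t;
typedef struct { int value; unsigned next; /* offset, not a pointer */ } node_t;

static unsigned area_alloc(area_t *a, unsigned size)
{
    unsigned off = a->used ? a->used : (unsigned)sizeof(unsigned); /* skip 0 */
    if (off + size > AREA_SIZE) return AREA_NULL;
    a->used = off + size;
    return off;
}

#define NODE(a, off) ((node_t *)((a)->buf + (off)))

/* Push onto an offset-linked list; returns the new head offset. */
static unsigned push(area_t *a, unsigned head, int value)
{
    unsigned off = area_alloc(a, sizeof(node_t));
    NODE(a, off)->value = value;
    NODE(a, off)->next = head;
    return off;
}

static int sum(const area_t *a, unsigned head)
{
    int s = 0;
    while (head != AREA_NULL) {
        const node_t *n = (const node_t *)(a->buf + head);
        s += n->value;
        head = n->next;
    }
    return s;
}
```

Because traversal only ever adds an offset to the current base address, memcpy-ing the whole area_t (or reading it back from disk into any address) leaves the list fully intact, which is the property Uri is describing.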
deferred vtable assignment?
Dan Sugalski wrote: 2) Anyway, even resizing vtables we would need some more indirection to determine in which position of the vtable is which operator. No. Each operator goes in a fixed position in the vtable, and it's the same for each table. Anything else is nasty, error prone, and slow. What if the decision in-vtable or not-in-vtable is deferred? The size of the vtable could be chosen late in the compilation. There could be hints. I am right now imagining vtable slots analogous to register entries for data in a C function. That way we can also deal with the aliasing of comparison operators to a variety of cmp/== (or not) on a case-by-case basis. If it is supposed to be an optimization, keep it an optimization, with a fall-back to a non-optimized paradigm. -- David Nicol 816.235.1187 [EMAIL PROTECTED] and they all say yodelahihu
Re: deferred vtable assignment?
Dan Sugalski wrote: What if the decision in-vtable or not-in-vtable is deferred? That's doable, I think, though I can see some issues. How about a two-tiered vtable, where a single high bit, if set, indicates extended handling, or at least consultation of a different table. I guess that amounts to the same as having a set number of extended entries that indicate check elsewhere to decide what to do now. Which again causes mind-exploding possibilities, except that there is no reason to keep all possibilities in mind, just open up the Pandora's box and let all the evil out. teddy bears get drunk and they all say yodelahihu
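The two-tiered scheme is small enough to sketch: hot operators sit at fixed slots in a compact table, and a slot code with the high bit set means "consult the extended table instead." Everything below (the operator set, the int-to-int signature) is invented for illustration:

```c
/* Two-tiered vtable sketch: codes 0..0x7f index the fixed table of
 * common operators; codes with the high bit set index a colder,
 * extended table.  Operators here just map int -> int; all names
 * are invented. */
#include <assert.h>

#define EXTENDED 0x80u

typedef int (*op_fn)(int);

static int op_negate(int x) { return -x; }
static int op_double(int x) { return 2 * x; }
static int op_cube(int x)   { return x * x * x; } /* a rarely-used op */

static op_fn fixed_slots[2]    = { op_negate, op_double };
static op_fn extended_slots[1] = { op_cube };

/* Dispatch on a slot code, falling through to the extended table
 * when the high bit is set. */
int dispatch(unsigned code, int arg)
{
    if (code & EXTENDED)
        return extended_slots[code & ~EXTENDED](arg);
    return fixed_slots[code](arg);
}
```

The common path stays one indexed call; the extra branch and table only cost anything for operators that opted out of the fixed layout, which is the "keep it an optimization with a fall-back" property argued for above.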
continuations are prerequisite for coroutines
Dan Sugalski wrote: We don't have continuations yet... But there's nothing at the lowest levels of the interpreter that prevent that. You could, if you chose, get a stream of bytecode that would do you continuations. Slowly and awkwardly, perhaps, but still do them. (I'm all up for doing them better in perl 6) my currently accreting coroutines module, which will, if all goes well, appear in TPJ #21, will implement its continuations by rewriting all lexical variables and all controlled loop constructs containing yield calls in terms of explicitly named and recoverable structures, allowing coro drivers to jump to any of several entry points as the first thing on entry into a coro. For announcements and discussion as this project develops, I have just now ordered up an EZMLM list; to subscribe send a message to [EMAIL PROTECTED] If someone else is working on a continuation syntax or semantics, I'd like to have mine resemble yours, please contact me re: it. -- David Nicol 816.235.1187 [EMAIL PROTECTED] Parse, munge, repeat.
Re: Please shoot down this GC idea...
Damien Neil wrote:

sub foo {
    my Dog $spot = shift;
    my $fh = IO::File->new(file);
    $spot->eat_homework($fh);
}

Even with the object type declared, the compiler can make no assumptions about whether a reference to $fh will be held or not. Perhaps the Poodle subclass of Dog will hold a reference, and the Bulldog subclass will not. what if two GCs are maintained, a refcounting one for any lexical that is a reference and is passed as a parameter, or that has a reference taken of it and passed as a parameter, and a lexical-analysis one for anything that is safe from these reference hazards.

sub foo {
    my Dog $spot = shift;
    my $homework = shift;
    my $fh = IO::File->new($homework);  # pass-by-value, safe to LA-GC
    $spot->eat_homework($fh);           # $fh is no longer fair game for LA-GC
}

-- David Nicol 816.235.1187 [EMAIL PROTECTED] Parse, munge, repeat.
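The two-GC split proposed above amounts to tracking one extra bit per value: whether a reference to it ever escaped the scope. A toy sketch, with every structure invented for illustration:

```c
/* Toy hybrid GC: values that never escape their scope are freed
 * immediately at scope exit (lexical-analysis GC); values whose
 * references were passed out fall back to reference counting.
 * All names are invented. */
#include <assert.h>
#include <stdlib.h>

typedef struct {
    int refcount;
    int escaped;   /* set once a reference leaves the scope */
} value_t;

value_t *value_new(void)
{
    value_t *v = calloc(1, sizeof *v);
    v->refcount = 1;   /* the owning scope's reference */
    return v;
}

/* Called when a reference to v is passed as a parameter. */
void value_escape(value_t *v) { v->escaped = 1; v->refcount++; }

/* Scope exit: returns 1 if the value was freed immediately by the
 * lexical-analysis path, 0 if it remains refcount-managed. */
int scope_exit(value_t *v)
{
    if (!v->escaped) { free(v); return 1; }
    if (--v->refcount == 0) free(v);
    return 0;
}
```

The interesting question in the email is exactly the one the bit cannot answer statically: whether Poodle or Bulldog holds the reference is only known at the value_escape() call site, at run time.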
Re: Please shoot down this GC idea...
David L. Nicol wrote: i'm swearing off sort-by-subject. Sorry.
Re: Stacks, registers, and bytecode. (Oh, my!)
Larry Wall wrote: Sure, you can download the object code for this 5 line Perl program into your toaster...but you'll also have to download this 5 gigabyte regex interpreter before it'll run. That's a scenario I'd love to avoid. And if we can manage to store regex opcodes and state using mechanisms similar to ordinary opcodes, maybe we'll not fall back into the situation where the regex engine is understood by only three people, plus or minus four. Larry Does anyone have on-their-shelves a regex-into-non-regex-perl translator? run time is not an issue, correct behavior is -- David Nicol 816.235.1187 [EMAIL PROTECTED] Obi-Wan taught me mysticism -- Luke Housego
Re: Stacks, registers, and bytecode. (Oh, my!)
Jarkko Hietaniemi wrote: Err...a regex that isn't a regex, is this a Zen koan...? Ahhh, you want to emulate the state machine in Pure Perl. Okay... next thing you want to do is to write symbolic assembler in C...? :-) I have my reasons :) Actually, I want to write a c into perl compiler -- I have for years -- one of these long manic afternoons I'll do it, too
Re: Stacks, registers, and bytecode. (Oh, my!)
Graham Barr wrote: I think there are a lot of benefits to the re engine not to be separate from the core perl ops. So does it start with a split(//,$bound_thing) or does it use substr(...) with explicit offsets?
Re: Should we care much about this Unicode-ish criticism?
Russ Allbery wrote: a caseless character wouldn't show up in either IsLower or IsUpper. maybe an IsCaseless is warranted -- or Is[Upper|Lower] could return UNKNOWN instead of TRUE|FALSE, if the extended boolean attributes allow transbinary truth values.
Re: -g vs. -O
Benjamin Stuhl wrote: (eg. I solemnly swear to never use symbolic references, count on specific op patterns, or use any number large enough to require bignums.) These are things (aside from the number limit, but overflow catching is needed anyhow, so switching to bignums instead of crashing and burning seems like a reasonable default behavior) that could be easily identified and flagged (well, use of symbolic reference) at first-pass time. Except for ${$thing} if we don't know if $thing is a reference or a name -- but can that be figured out on first-pass? Is there an entry point between this line and $thing's last use as an l-value, and can the expression that is getting assigned there be seen to clearly be a reference? Do we even care? if symbolic reference only gets fallen back to when, oops, something that is not a reference gets used as one, what exactly do we save? The check to verify that something is in fact a reference? Could that check be deferred into an exception handler? -- David Nicol 816.235.1187 It's widely known that the 'F' in RTFM is silent. -- Olie
Re: PDD 4, v1.3 Perl's internal data types (Final version)
Dan Sugalski wrote: The C structure that represents a bigint is:

struct bigint {
    void *buffer;
    UV length;
    IV exponent;
    UV flags;
}

=begin question Should we scrap the buffer pointer and just tack the buffer on the end of the structure? Saves a level of indirection, but means if we need to make the buffer bigger we have to adjust anything pointing to it. =end question Absolutely not. Keep as much static-sized as possible, so you can trivially recycle it. Nobody much liked the suggestion of tracking precision at the lowest levels, but here I am repeating it anyway. Perl has a single internal string form: =item unused Filler. Here to make sure we're both exactly double the size of a bigint/bigfloat header and to make sure we don't cross cache lines on any modern processor. Is this explicitly guaranteed to remain unused, so that it may be safely used for arbitrary user-magic (as long as they don't step on each other's toes) and semantic analysis flags, and so forth? Or would that kind of thing be better included into whatever is containing these guys -- along with reference counts and other details of additional systems which are not referred to w/in this document. =item Class Class refers to a higher-level piece of perl data. Each class has its own vtable, which is a class' distinguishing mark. Classes live one step below the perl source level, and should not be confused with perl packages. Does this imply that perl packages will continue to be called perl packages, even when they start getting introduced with a class keyword?
Re: -g vs. -O
Dan Sugalski wrote: At 12:51 PM 7/6/2001 -0500, David L. Nicol wrote: Benjamin Stuhl wrote: (eg. I solemnly swear to never use symbolic references, count on specific op patterns, or use any number large enough to require bignums.) Would these promises be better stated as in-code pragmata instead of compiler switches? { no symrefs; ${$thing}; # assuming $thing is a reference is now safe }
Lexically scoped optimization hints
Dan Sugalski wrote: At 01:59 PM 7/6/2001 -0500, David L. Nicol wrote: in-code pragmata instead of compiler switches? Lexically scoped optimization hints seem like rather a tricky thing to deal with. I know I'm naive but here's how I see it: - we design a linkage standard that is the same at all optimization levels. - How far to optimize anything is set with flags. - Compilation occurs by block, innermost first, as far as is possible (but no farther, until Godot arrives.) Optimization flags set things that can be done in advance, checks to skip and so forth. With this mind set, lexically scoped optimization is the only way to go, and maintain current flexibility. What if my symref-free code wants to use a module that uses symrefs? With a --promise command line switch, the module would need to override it or break, which gives us LSO the other way. What am I missing, I wonder? That the block structure goes away?
Re: CLOS multiple dispatch
Dan Sugalski wrote: [... massive sniping snippage ...] The problem I was talking about was those cases where we have a good but not perfect match at compile time. In the case you gave, we assume that @A are full of fish, so dispatch to the multiple fish parameter version of list_medication. But we can't be sure, since what happens if at runtime we install a list_medication function that takes an Aquarium array as a single parameter? (And we won't deal with the case where the function is there at compile time but deleted before we hit it) We can do exact matching at compile time, but nothing else. Best superclass matching can't be done at compile time, since we can't be sure that something better won't come along later. Dan what if: * there is a way to say that no new classes will be introduced * parameters have some additional flags, either as more elements of structure that ref() returns or as a second parallel array of lexical knowledge * dispatch functions have rights to rewrite themselves The second one is really what I meant about a more complex interface If we know that the second element is always going to be a puppy, the dispatcher can be just a little bit simpler because it does not need to consider the second element at run time. At an op level, if function calls are all the same size, a function call to a general dispatcher that knows the address of the function that called it, and any other information could replace the call to itself with a call to an arbitrarily simpler dispatch routine -- perhaps even a generic dynamic dispatcher that knows it can't optimize any so it optimizes away the pass to check for optimizability. In perlperl (imagine...) 
, that means turning a call foo($A) into {$Dispatcher[macro_increment(N)] ||= sub { goto Dispatch_Default(macro(N),foo,$A) }} assuming that macro_increment will increment every time it is seen in the source code -- that would be easy to do with a source filter :) then Dispatch_Default could replace $Dispatcher[N] with a more specific call, to a simpler dispatcher or an alternate dispatcher or, in the rare and elegant case where we actually have all the type information ahead of time, directly to the function. --it's like this--- Dan Sugalski even samurai [EMAIL PROTECTED] have teddy bears and even teddy bears get drunk -- David Nicol 816.235.1187 Keep that Sugalski character away from my stuffed animals!
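The self-rewriting call site described here is recognizable as a monomorphic inline cache, and it can be sketched in C with a function pointer that patches itself after the first dispatch. All names and types below are invented for illustration:

```c
/* Self-rewriting dispatch sketch: each call site is a slot that
 * starts out pointing at a generic dispatcher.  On first use the
 * dispatcher resolves the target and patches the slot, so later
 * calls skip dispatch entirely (a monomorphic inline cache).
 * All names are invented. */
#include <assert.h>

typedef int (*target_fn)(int);
typedef struct site { target_fn fn; int (*call)(struct site *, int); } site_t;

/* The resolved target (the "multiple fish" list_medication, say). */
static int list_medication_fish(int n) { return n + 1; }

/* Fast path after patching: straight through the cached pointer. */
static int call_direct(struct site *s, int arg) { return s->fn(arg); }

/* Generic dispatcher: "looks up" the target (trivially here), then
 * rewrites the call site so the next call goes direct. */
static int call_dispatch(struct site *s, int arg)
{
    s->fn = list_medication_fish;   /* result of the (omitted) lookup */
    s->call = call_direct;          /* patch the call site */
    return s->fn(arg);
}

int call_site(site_t *s, int arg) { return s->call(s, arg); }
```

If a new class or function is installed later, invalidation just means resetting s->call back to call_dispatch, which matches the "fall back to the generic dynamic dispatcher" escape hatch in the email.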
A discussion of writing to the GCC front end
http://cobolforgcc.sourceforge.net/cobol_14.html