Re: Calling parrot from C?
Leopold Toetsch writes:
Luke Palmer [EMAIL PROTECTED] wrote:
How does one call a parrot Sub from C and get the return value(s)? Is it even possible, given CPS, to do this generally? If not, how can I check when it is?

Good question. It's very similar to classes/Eval.pmc:invoke(). You would do:
- call runops_int(interp, offset)

Okay, now I've got a pretty nice XS interface for embedding and introspecting into a parrot interp from Perl. I just figured out what you meant by offset, but the way I compute it is, well:

    void
    _run_sub (interp, pmc)
        Parrot_Interp interp
        PMC* pmc
      INIT:
        PMC* retc = pmc_new_noinit(interp, enum_class_RetContinuation);
      CODE:
        VTABLE_init(interp, retc);
        VTABLE_set_integer_native(interp, retc, 0);
        interp->pmc_reg.registers[0] = pmc;
        interp->pmc_reg.registers[1] = retc;
        runops_int(interp, ((void*)VTABLE_invoke(interp, pmc, NULL) - (void*)interp->code->byte_code)>>2);

Is there a better way to do it (other than stylistic concerns)? Oh, don't worry about parameter passing, I've that handled in the Perl front end.

Also, I can't seem to figure out how to make a CSub which calls to a user-provided perl sub. It seems I need more metadata, specifically, the SV containing the sub. Should I just subclass CSub and include it in pmc_val or somesuch? Or does CSub already do what I need?

Thanks again,
Luke

- returning from the sub is by invoke'ing the return continuation with address 0, which you have put in P1
- param passing is like PCC (pdd03)

Thanks,
Luke

leo
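On the "is there a better way" question, one possible answer is to let the compiler do the division. The following is only a sketch, assuming that the offset passed to runops_int is counted in opcode-sized units (which the shift by 2 suggests); opcode_t here is a hypothetical stand-in, and opcode_offset is not a Parrot function.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for Parrot's opcode type. Its width differs
 * between platforms, which is why a hard-coded shift is fragile. */
typedef long opcode_t;

/* Offset of `dest` from the start of the bytecode, counted in opcodes.
 * Subtracting two opcode_t pointers already divides the byte distance
 * by sizeof(opcode_t), so no ">> 2" is needed. */
static size_t opcode_offset(const opcode_t *code_start, const opcode_t *dest)
{
    assert(dest >= code_start);
    return (size_t)(dest - code_start);
}
```

With typed pointers the same expression stays correct whether an opcode is 4 or 8 bytes wide, and it avoids arithmetic on void*, which is a compiler extension rather than standard C.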
Re: [CVS ci] exit opcode
Luke Palmer [EMAIL PROTECTED] wrote:
Those things are interpreter exceptions -- the program did something the interpreter didn't expect. But I think the idea is to make C<exit> a control exception, much like Perl 6's C<leave> or C<next>.

Yep yep. C<exit> already is a (control) exception.

Luke

leo
Re: [PATCH] File Spec
Vladimir Lipskiy [EMAIL PROTECTED] wrote:
[ my first answer seems to be missing ]

From: Leopold Toetsch [EMAIL PROTECTED]
Subject: TWEAKS: Takers Wanted - Effort And Knowledge Sought

Platform code - We need some functions to deal with paths and files like File::Spec. For loading include files or runtime extension some search path should be available to locate these files (a la use lib LIST;). For now runtime/parrot/{include,dynext} and the current working directory would be sufficient.

I ain't 100% sure what Leo wanted there, and I'm afraid that my patch is out of place. It does, though, present rudimentary support for Parrot File::Spec-like functions, which are as follows: curdir, catdir, catfile.

Albeit File::Spec is using catfile and catdir, I don't like the function names ("cat file" is on *nix what "type file" is on Win*). Maybe concat_pathname and concat_filename is better.

I should warn you that the patch lacks any documentation. Examples of usage can be found in file_spec.t. Nevertheless, does it need some documentation written for non-perl folks, and if it does, where should I put it? The docs directory?

docs/dev is the place for documents about internal functionality and design decisions.

WRT the patch - please can people having experience with different platforms have a look at it, if the functionality would be able to cope with all platform weirdness.

=head1 NAME

[ can you switch your mailer to plain text, thanks ]
[ WRT diff: make a copy of your original tree, do modifications there and then cd ..; diff -urN parrot parrot-modified ]

Thanks,
leo
Re: Calling parrot from C?
Luke Palmer [EMAIL PROTECTED] wrote:
Also, I can't seem to figure out how to make a CSub which calls to a user-provided perl sub.

Please use the NCI interface. You can call arbitrary C functions with it. See classes/parrotio.pmc or Parrot_compreg() and the docs.

Thanks again,
Luke

leo
Re: [PATCH] File Spec
Leo wrote:
Albeit File::Spec is using catfile and catdir, I don't like the function names ("cat file" is on *nix what "type file" is on Win*). Maybe concat_pathname and concat_filename is better.

Yes, indeed. I'm for having only concat_pathname, since neither this patch nor the File::Spec module makes any distinction between concatenating paths and concatenating files (though I may be mistaken on account of VMS, Dan? (~:). So catdir and catfile give the same result. Moreover, catfile is sort of a wrapper around catdir and does nothing smarter than just calling catdir on all platforms. We can bring in concat_filename too (I don't mind), but as an alias of concat_pathname. I don't know how to implement this (I mean aliasing) in terms of parrot, though. Can we do it in some elegant way? However, for consistency's sake, I really, really want us to have only concat_pathname, because whether we concatenate dirs or dirs and a file, we always do the same thing -- concatenate a path.

docs/dev is the place for documents about internal functionality and design decisions.

Okay.

WRT the patch - please can people having experience with different platforms have a look at it, if the functionality would be able to cope with all platform weirdness.

For the time being, it works properly only on windows and unix platforms. Why is that? I feel I should give some explanation of how it works. There is only one generic function, catdir, rather than many as we have in File::Spec. And there are some filters[1], which we can assign to an array Filters.

    typedef void (*ParrotFSFilter)(struct Parrot_Interp *, STRING **);
    ParrotFSFilter Filters[] = { filter_1, filter_2, ... , filter_n };

When we have PASM code such as

    set S0, "foo_dir"
    set S1, "bar_dir"
    catdir S0, S1

it first calls the file_spec_catdir() function, which only glues the parts together with an OS-specific directory separator and then passes control to another function, file_spec_filter().

No doubt after the gluing a path can contain some trash like successive slashes. That's why we always call file_spec_filter, which in its turn calls each function registered in the Filters array. Filters can be OS specific: there is no sense in registering the filter that does the

    # xx////xx -> xx/xx

change when you are working on cygwin. Another question is how we can add an OS-specific filter. There is nothing to it:

    ParrotFSFilter Filters[] = {
        file_spec_some_filter,
    #ifndef PARROT_OS_NAME_IS_CYGWIN
        file_spec_successive_slashes_filter,
    #endif
        file_spec_filter_which_deletes_redundant_root_direct,
    #ifdef VMS
        file_spec_vms_specific_filter,
    #endif
        file_spec_yet_another_filter,
        /* and so on */
    };

If somebody can imagine a plan that could manage without macros, you know, ideas are always welcome.

Now that you know how it's supposed to work, I can return to the question of why it works properly only on windows and unix platforms. The answer is: the filters haven't been implemented yet, because I am still hesitating over what would be the best solution for find 'n' replace actions. I wish I could have heard some comments on that. To clarify what the heck I'm talking about, here is the following fragment, cut from my initial mail:

Next. In the future I'll need to be able to do some find 'n' replace actions in order to clean the trash out of paths. The perl version uses regexes like these:

    $path =~ s|/+|/|g unless($^O eq 'cygwin');   # xx////xx  -> xx/xx
    $path =~ s|(/\.)+/|/|g;                      # xx/././xx -> xx/xx
    $path =~ s|^(\./)+||s unless $path eq "./";  # ./xx      -> xx
    $path =~ s|^/(\.\./)+|/|s;                   # /../../xx -> xx
    $path =~ s|/\Z(?!\n)|| unless $path eq "/";  # xx/       -> xx

The question is whether I should take advantage of string_str_index, string_replace and friends, or whether there is a better solution. In any case it never handles long paths, so we won't be heavily penalized whichever find 'n' replace scheme we use.

There is one more thing to be said: in some cases a result obtained with the parrot file spec will diverge from the result obtained with the perl one. For instance,

    set S0, ""
    set S1, ""
    concat_pathname S0, S1
    print S1

prints "", but File::Spec's equivalent

    my $path = catdir("", "");
    print $path;

prints "/" on UNIX, windows, and so forth. I don't think that is the Right result, though you can argue with me on that account. I'm gonna document all divergences.

[ can you switch your mailer to plain text, thanks ]

Yep. I regularly do that. But sometimes my MTA outwits me.

[ WRT diff: make a copy of your original tree, do modifications there and then cd ..; diff -urN parrot parrot-modified ]

Thanks, indeed. I'll try that as soon as I prepare a new patch.
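The glue-then-filter design described in this message can be sketched without regexes, using only the C string functions. This is a minimal illustration on plain char buffers, not the patch itself: every name below is hypothetical, the compiler's predefined __CYGWIN__ macro stands in for PARROT_OS_NAME_IS_CYGWIN, and only three of the five Perl substitutions are covered.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* A filter rewrites a NUL-terminated path in place. */
typedef void (*path_filter)(char *path);

/* "xx////xx" -> "xx/xx": collapse runs of slashes into one. */
static void filter_successive_slashes(char *path)
{
    char *out = path;
    for (const char *in = path; *in; in++) {
        if (*in == '/' && out > path && out[-1] == '/')
            continue;               /* skip a slash that follows a slash */
        *out++ = *in;
    }
    *out = '\0';
}

/* "xx/././xx" -> "xx/xx": drop every "/." that is followed by '/'. */
static void filter_dot_segments(char *path)
{
    char *hit;
    while ((hit = strstr(path, "/./")) != NULL)
        memmove(hit, hit + 2, strlen(hit + 2) + 1);
}

/* "xx/" -> "xx": drop one trailing slash, but keep a bare "/". */
static void filter_trailing_slash(char *path)
{
    size_t len = strlen(path);
    if (len > 1 && path[len - 1] == '/')
        path[len - 1] = '\0';
}

/* Registered filters, run in order. OS-specific entries are selected
 * with the preprocessor, as in the message above. */
static const path_filter filters[] = {
#ifndef __CYGWIN__
    filter_successive_slashes,
#endif
    filter_dot_segments,
    filter_trailing_slash,
};

/* Glue two parts with '/', then run every registered filter.
 * `out` must hold at least strlen(a) + strlen(b) + 2 bytes. */
static void concat_pathname(char *out, size_t out_size,
                            const char *a, const char *b)
{
    assert(strlen(a) + strlen(b) + 2 <= out_size);
    strcpy(out, a);
    strcat(out, "/");
    strcat(out, b);
    for (size_t i = 0; i < sizeof filters / sizeof filters[0]; i++)
        filters[i](out);
}
```

Because paths are short, the repeated strstr and memmove passes cost little, which matches the remark that no find 'n' replace scheme will be heavily penalized here.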
RE: Notifications
Tim Bunce wrote:
On Thu, Aug 28, 2003 at 07:26:25PM -0400, Dan Sugalski wrote:
How does it work? Simple. When a watched resource does what we're watching for (it changes, an entry is deleted, an entry is added [...]

Only after the action being watched is performed I presume.

It's also useful to have notifications before an operation--or even during the operation, if the notification is an end unto itself. Typically, notifications are named with future tense (DogWillPoop) and past tense (DogDidPoop or DogPooped) to indicate whether they precede or follow the action. Sometimes, an object will actually support both. Consider:

Your neighbors are complaining of poor response time leading to residual odor. To help improve the situation, you add a DogWillPoop notification handler so that you can cache a pooper scooper handle while the dog poops asynchronously. But you still need to wait for the DogDidPoop notification before employing the pooper scooper, else you run the risk of a race condition with the dog. If you win the race condition, then the problem of residual odor will be worsened and the next homeowners association meeting will not end favorably for you. So you need both notifications.

Speaking generally, the only safe statement here is to say that notification occurs when the subject issues a notification.

we post an event to the event queue. When that event is processed, whatever notification routines were registered are run. Very simple.

The async nature of this approach needs to be kept in mind. It will often be important that the 'thread' handling the event queue runs at a high priority. (Perhaps it would help to have a simple flag on each watch to say if a yield() should be performed after posting an event for that watch.)

Parrot uses high-priority event dispatch for signal handling. Imagine that notifications would be high-priority events as well. Imagine an event dequeue applied just after the notification event is enqueued. Optimize that: Don't bother with enqueue-dequeue, and simply feed the event straight to the handler. So event dispatch mechanics could be leveraged, but the notification could in fact be dispatched synchronously.

Common notification vocab:

- Subject: An object which is a source of notifications.
- Notification: The combination of a subject and a notification identifier (name, number--implementation detail).
- Observer: An object which asks to be informed when a notification fires. (Note: Subjects are generally *much* more common than observers.)
- Notification center: An object which maps {subject, notification name} [that is, notifications] onto zero or more {observer, handler} tuples, and handles dispatching notifications. The NC needs to know both how to remove all notifications for a given subject, and how to remove all handlers for a given observer.

Since observers are generally uncommon, it's often cheaper at a systems level to have a huge master notification center than it is to even reserve a single list head in every subject for chaining notifications off of them. Of course, such a global notification center here has to be threadsafe.

Now, what notifications issued during DoD would be usefully able to do is another question, probably more what Tim was getting at than what I responded to. It strikes me that the handlers would be unable to do much more than nullify their weak pointer, and would have to be written in C:

- If the notification handler tried to allocate an object, it could invoke recursive DoD. That's bad, right? Parrot code pretty much can't run without allocating memory. If DoD was for memory exhaustion, then parrot just screwed itself with full generality.
- Dereferencing another weak reference from that code would be dangerous, too: Weak ref A and weak ref B are found to be invalid during the same DoD run. Weak ref A's "subject died" notification fires first. One of its handlers happens to dereference weak ref B. Is that reference guaranteed valid until DoD completes? And for recursive DoD? In the case of memory exhaustion?

Seems to me like weak references need to be nulled out in one big atomic sweep, primarily for the first reason. Afterwards, and once GC is run, death notifications (now an entirely separate feature) could fire. (But how to identify an object which has been GC'd? Java always encapsulates weak references within a WeakRef instance, so that instance can be a surrogate identity for the collected object...)

I also have to wonder at adding the expense of any of the following to DoD:

1. Enqueuing an event for each dying object.
2. Adding space for a listhead to every object so that a notification observer list can be built.
3. Checking a notification center for every dying object.

Tangentially: Notification centers are themselves an example of why dying object notifications are useful: Both
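The notification-center vocabulary above can be made concrete with a small sketch. This is a hedged illustration rather than anything from Parrot: all names are invented for the example, dispatch is synchronous (the "skip the enqueue-dequeue" variant), the table has a fixed size, and the code is not thread-safe, which a real global center would have to be.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef void (*handler_fn)(void *observer, void *subject, const char *name);

/* One {subject, name} -> {observer, handler} mapping. */
struct registration {
    void *subject;
    const char *name;
    void *observer;
    handler_fn handler;
};

#define MAX_REGISTRATIONS 32

/* A single central table, instead of a list head in every subject. */
struct notification_center {
    struct registration regs[MAX_REGISTRATIONS];
    size_t count;
};

/* Register interest; returns 0 on success, -1 if the table is full. */
static int nc_observe(struct notification_center *nc, void *subject,
                      const char *name, void *observer, handler_fn handler)
{
    assert(handler != NULL);
    if (nc->count == MAX_REGISTRATIONS)
        return -1;
    nc->regs[nc->count++] =
        (struct registration){ subject, name, observer, handler };
    return 0;
}

/* Synchronous dispatch: call every matching handler immediately and
 * return how many were called. */
static size_t nc_post(struct notification_center *nc, void *subject,
                      const char *name)
{
    size_t fired = 0;
    for (size_t i = 0; i < nc->count; i++) {
        struct registration *r = &nc->regs[i];
        if (r->subject == subject && strcmp(r->name, name) == 0) {
            r->handler(r->observer, subject, name);
            fired++;
        }
    }
    return fired;
}

/* Drop every registration for a subject (for example when it dies). */
static void nc_remove_subject(struct notification_center *nc,
                              const void *subject)
{
    size_t kept = 0;
    for (size_t i = 0; i < nc->count; i++)
        if (nc->regs[i].subject != subject)
            nc->regs[kept++] = nc->regs[i];
    nc->count = kept;
}

/* Drop every handler belonging to an observer. */
static void nc_remove_observer(struct notification_center *nc,
                               const void *observer)
{
    size_t kept = 0;
    for (size_t i = 0; i < nc->count; i++)
        if (nc->regs[i].observer != observer)
            nc->regs[kept++] = nc->regs[i];
    nc->count = kept;
}

/* Example handler: the observer is just an int counter. */
static void count_handler(void *observer, void *subject, const char *name)
{
    (void)subject;
    (void)name;
    ++*(int *)observer;
}
```

An observer would register separately for "DogWillPoop" and "DogDidPoop" on the same subject, and the subject posts each name at the appropriate moment. Note that the handlers here allocate nothing, which is the property the DoD discussion above says such handlers would need.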
Re: What the heck is active data?
At 10:57 AM +0200 8/29/03, Leopold Toetsch wrote:
Dan Sugalski [EMAIL PROTECTED] wrote:
Most objects in Parrot will be dealt with by reference,

Do you have anything about references? I'm thinking about putting in a default C<Ref> PMC class, which delegates almost all its methods to C<SELF->cache.pmc_val>, autogenerated inside pmc2c.pl.

Yeah, that was the way I was planning on going. (I'm not sure if I sent this already, but a resend never hurts :)

--
Dan

--it's like this---
Dan Sugalski                          even samurai
[EMAIL PROTECTED]                     have teddy bears and even
                                      teddy bears get drunk
[CVS ci] Ref class-1 (was: What the heck is active data?)
Dan Sugalski [EMAIL PROTECTED] wrote:
At 10:57 AM +0200 8/29/03, Leopold Toetsch wrote:
Do you have anything about references? I'm thinking about putting in a default C<Ref> PMC class, which delegates almost all its methods to C<SELF->cache.pmc_val>, autogenerated inside pmc2c.pl.

Yeah, that was the way I was planning on going. (I'm not sure if I sent this already, but a resend never hurts :)

Done. As a reference delegates assign too, there is currently no way to rebind a Ref PMC to another variable. (We could implement a bind method for this or not delegate set_pmc).

Please re^Wsend design docs ;-)

leo
Re: lvalue cast warnings
On Fri, 29 Aug 2003, Nicholas Clark wrote:
Andy Dougherty [EMAIL PROTECTED] wrote:
closure.pmc, line 21: warning: a cast does not yield an lvalue

I think that the appended patch will work around the problem, by doing the cast on the pointer (which is an RVALUE) and then dereferencing.

    --- include/parrot/sub.h.orig 2003-08-29 21:09:19.0 +0100
    +++ include/parrot/sub.h 2003-08-29 21:14:26.0 +0100
    @@ -33,7 +33,7 @@ typedef struct Parrot_Sub {
         char *packed; /* to simplify packing Constant Subs */
     } * parrot_sub_t;

    -#define PMC_sub(pmc) ((parrot_sub_t)((pmc)->cache.pmc_val))
    +#define PMC_sub(pmc) (*((parrot_sub_t *)&((pmc)->cache.pmc_val)))

     /* the first entries must match Parrot_Sub, so we can cast
      * these two to the other type

Thanks. That worked just fine.

Andy Dougherty [EMAIL PROTECTED]
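For readers who have not met this warning, the workaround can be shown in isolation. The sketch below uses made-up types (struct cell and struct sub are not Parrot's) and assumes, as the patch does, that void * and the struct pointer share a representation. Strictly speaking the access goes through a differently typed lvalue, so it leans on compiler behaviour rather than on a guarantee of the C standard.

```c
#include <assert.h>
#include <stddef.h>

struct sub { int arity; };

/* Hypothetical stand-in for a PMC: the payload is stored as void *. */
struct cell { void *val; };

/* A cast produces a value, not an object, so
 *     ((struct sub *)(c)->val) = s;
 * is rejected by conforming compilers ("a cast does not yield an
 * lvalue"). Taking the field's address, casting that pointer, and
 * dereferencing it gives an expression that can be assigned to. */
#define CELL_SUB(c) (*(struct sub **)&(c)->val)

static void cell_set_sub(struct cell *c, struct sub *s)
{
    assert(c != NULL);
    CELL_SUB(c) = s;        /* the macro is usable on the left-hand side */
}

static struct sub *cell_get_sub(struct cell *c)
{
    assert(c != NULL);
    return CELL_SUB(c);     /* and as an ordinary value */
}
```

A union of the pointer types, or a pair of plain getter and setter functions, would avoid the cast entirely; the macro form is shown only because it mirrors the shape of the patch.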
Re: [RfT] Request for Test: build system changes
On Sat, 30 Aug 2003, Juergen Boemmels wrote:
the make shipped with Borland C++ builder doesn't like the makefiles in the current way. I had to tweak the buildfiles a little in order to get it to Configure and compile. (It still does not link but that's another story). I replaced the appearances of && in the Makefiles with ${make_and}, which is defined to '&&' on all platforms but bcc, and simplified the cd dir && make && cd .. case by using a perl-replacement for the -C commandline option of make on windows platforms.

make -C is non-portable even among Unix systems. I think it's a GNU-specific extension. It's not in Solaris make nor in BSD make, as far as I can tell. In short, if you have a portable perl-replacement for it, perhaps you should just use your replacement everywhere.

--
Andy Dougherty [EMAIL PROTECTED]
Re: [RfT] Request for Test: build system changes
Andy Dougherty [EMAIL PROTECTED] writes:
On Sat, 30 Aug 2003, Juergen Boemmels wrote:
the make shipped with Borland C++ builder doesn't like the makefiles in the current way. I had to tweak the buildfiles a little in order to get it to Configure and compile. (It still does not link but that's another story). I replaced the appearances of && in the Makefiles with ${make_and}, which is defined to '&&' on all platforms but bcc, and simplified the cd dir && make && cd .. case by using a perl-replacement for the -C commandline option of make on windows platforms.

make -C is non-portable even among Unix systems. I think it's a GNU-specific extension. It's not in Solaris make nor in BSD make, as far as I can tell. In short, if you have a portable perl-replacement for it, perhaps you should just use your replacement everywhere.

Ok, will look into this. I need to fix the quoting to work under Unix and Windows. Should the name of the replacement script still be $(MAKE_C), or should I use some other name?

bye
boe
--
Juergen Boemmels [EMAIL PROTECTED]
Fachbereich Physik              Tel: ++49-(0)631-205-2817
Universitaet Kaiserslautern     Fax: ++49-(0)631-205-3906
PGP Key fingerprint = 9F 56 54 3D 45 C1 32 6F 23 F6 C7 2F 85 93 DD 47
The reason for scads of keyed variants
I should read the list and respond to the outstanding stuff, but I should also get this done, and since the former probably precludes the latter... Why, exactly, have I spec'd (nay, demanded!) that every darned operation in a PMC's vtable have a keyed variant? Simple. A combination of speed and space. (And yes, I know we're going to fight over this one) Now, for aggregates that hold PMCs and are relatively passive containers (that is, they don't get in the way of anything besides load and store operations, if that) the keyed variants provide no benefit to speak of. Less opcode dispatch overhead, but there's not a whole lot of that in general anyway, and on JITted cores there's no win at all. For aggregates that *don't* hold PMCs, though, that's where the win is. Those are the sorts of aggregates we're aggressively targeting as well. One of the big issues with perl, python, and ruby is that the base variable data structure is big. (And we're not making it any smaller with Parrot--our PMCs are pretty hefty still) Optimizing the size of an individual scalar isn't much of a win, but optimizing the size of arrays and hashes of scalars is a win. (This is one of the lessons learned from Chip's Topaz project) Many hashes and arrays don't need full-fledged PMCs in them, nor the space that full PMCs take. A string, integer, or bool array is sufficient for many of the uses that aggregates are put to. This is a good argument for abstract aggregate element access, which we have now, and that's good. Saves us space, potentially. Yay us. How this argues for keyed access to the operations on aggregate elements, however, is less obvious, but I think it's worth it. 
If we don't have direct operations on aggregate elements but instead have to do a fetch and perform the operation on what we fetch, it means we have to create a temporary PMC for each 'fake' entry, one potentially with a fair amount of knowledge about how the aggregate works, which means that every aggregate will need to provide two vtables, one for itself and one for an aggregate entry.

Going with direct operations on keyed aggregates, then, makes the code somewhat more dense, since we only need to access one vtable per operand rather than two (one to fetch the data and one to operate on it). That's balanced somewhat by having to have two sets of PMC opcodes, one that operates on all parts keyed and one that doesn't.

The integrated approach *also* makes it easier to optimize the operation. For example, this code:

    foo[12] = foo[12] + bar[15];

or the more compact

    foo[12] += bar[15];

has the destination identical to one of the sources. In the keyed access scheme, that's a single operation, one where foo's vtable function can *easily* determine that the destination and one of the sources are identical. (It's a pair of comparisons) If accessing foo is somewhat expensive--for example requiring a DB access--having knowledge that the source and destination are identical can allow for optimizations that wouldn't otherwise be allowable. (For example, if foo and bar represented fields in two tables, knowing that the destination and one of the sources are identical can potentially cut out one of the DB accesses) While this is doable if we pass in plain scalar PMCs, it'll have to be more expensive, and as such is less likely to be done.

Yes, I also know that this potentially breaks down in more complex expressions. That's fine--it means that we can optimize some accesses rather than all accesses. I'm OK with that, as it's better than optimizing no accesses.
More information's available if anyone wants it, but this *is* the way we're going unless someone proves that it's suboptimal in the common case. -- Dan --it's like this--- Dan Sugalski even samurai [EMAIL PROTECTED] have teddy bears and even teddy bears get drunk
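The "pair of comparisons" argument can be illustrated with a toy aggregate. This is a sketch under stated assumptions, not Parrot's vtable interface: int_array and the ia_ functions are invented for the example, and the accesses counter merely stands in for an expensive operation such as a database round trip.

```c
#include <assert.h>
#include <stddef.h>

/* A compact aggregate: raw longs, no boxed object per element. */
struct int_array {
    long *data;
    size_t len;
    unsigned accesses;      /* counts simulated expensive accesses */
};

static long ia_fetch(struct int_array *a, size_t key)
{
    assert(key < a->len);
    a->accesses++;
    return a->data[key];
}

static void ia_store(struct int_array *a, size_t key, long value)
{
    assert(key < a->len);
    a->accesses++;
    a->data[key] = value;
}

/* Read-modify-write in one access, like "UPDATE ... SET x = x + ?". */
static void ia_increment(struct int_array *a, size_t key, long delta)
{
    assert(key < a->len);
    a->accesses++;
    a->data[key] += delta;
}

/* Keyed add: dest[dkey] = left[lkey] + right[rkey] in a single call.
 * Since aggregate and key arrive together, two comparisons are enough
 * to notice that the destination is also the left-hand source. */
static void ia_add_keyed(struct int_array *dest, size_t dkey,
                         struct int_array *left, size_t lkey,
                         struct int_array *right, size_t rkey)
{
    long rhs = ia_fetch(right, rkey);
    if (dest == left && dkey == lkey)
        ia_increment(dest, dkey, rhs);                     /* one access  */
    else
        ia_store(dest, dkey, ia_fetch(left, lkey) + rhs);  /* two accesses */
}
```

In the foo[12] += bar[15] case the destination aggregate is touched once instead of twice. With a fetch-into-temporary design the add would see only two anonymous scalars, and detecting the overlap would need extra bookkeeping in each temporary.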
Parrot - 100% Gnu.NET ?
Hi there!

I'm a java programmer and I'm not really experienced with perl. But I've searched a long time for a system like .NET that can't be controlled by Microsoft through patents.

Imagine 10,000 apps need the .NET api running on mono, and microsoft prohibits cloning the .NET api. Maybe 5,000 of these 10,000 apps are unmaintained, and even using these apps with an older version of mono would not be legal. The other 5,000 will need weeks to create a patent-free version of their software, until MS finds another point which could be destroyed using patents. And after the first change Mono won't be compatible with .NET at all, so why not create a patent-free .NET from the beginning?

I searched a long time for such a project, and after all I found it where I never expected it: Perl ;-)

I really like languages which are deployed compiled to bytecode: no problems with ABI changes or with compilers; once compiled, link everywhere. That's the reason why I like Java very much, but there are some points which are really showstoppers:
*Java isn't free,
*Swing is too slow to be useful.

In my opinion runtimes only make sense when they don't need to rely on native code. It doesn't make sense that parrot is platform independent and I need to link against native libraries through bindings, because perl's classpath lacks e.g. GUI functionality. Mono solves this problem like Java - they just include every class which could be useful. Maybe this design is too heavyweight.

Hmm, I talked too long to myself, I know ;-)

So, here's my question: I think that parrot could be the Gnu-version of .NET and could be a real benefit for the whole opensource-world. No 20 runtimes need to be installed on a system - parrot would do the job better than each could alone (because if many apps rely on parrot the JIT will be tuned by guys from IBM ;-) ). But in my opinion parrot needs a more complete classpath than perl5 currently has.

Parrot is on another level than perl5: perl5 was fine for scripting and even bigger apps worked great, but it never tried to be a complete framework for many languages. Parrot is in my eyes a way to go away from C to higher level languages.

So, my question: Is it planned to include a complete classpath into parrot, including gui, network, db, sound functionality, or will it have only the really needed things like perl5 had? I know that everything is already available for perl5 by installing bindings, but in my eyes this doesn't solve the problem. Many users don't want to install separate libraries, they simply want to use parrot based apps with nice frontends, etc.

I hope I didn't make you angry, and please don't misunderstand me, I think what you do is great! Please let me know what you think about the idea to include a complete classpath into parrot.

lg Clemens
Re: serialisation (was Re: [RfC] vtable-dump)
On Sat, Aug 30, 2003 at 10:13:02PM -0400, Benjamin Goldberg wrote:
Nicholas Clark wrote:
The attacker can craft a bogus CGITempFile object that refers to any file on the system, and when this object is destroyed it will attempt to delete that file at whatever privilege level the CGI runs at. And because that object is getting destroyed inside the deserialise routine of Storable, this all happens without the user written code getting any chance to inspect the data. And even Storable can't do anything about it, because by the time it encounters the repeated hash key, it has already deserialised this time-bomb. How does it defuse it?

The simplest solution *I* can think of, is to have storable copy the taint flag from the input string/stream onto every single string that it produces. Taint checking doesn't solve *all* security problems, of course, but it can catch many of them, and it certainly would catch this one (if $$self were tainted).

I don't believe that it would. A quick test suggests that in perl5 destructors get to run after a taint violation is detected:

    $ perl -Tle 'sub DESTROY {print "We still get to run arbitrary code after taint violations"}; bless ($a = []); `cat /etc/motd`'
    Insecure $ENV{PATH} while running with -T switch at -e line 1.
    We still get to run arbitrary code after taint violations
    $

One defense against following a bomb with malformed data, might be to have Storable save up the SV*s and the names with which to bless them, and only do the blessing *after* the data is fully deserialized, as a last step before returning it to the user. This way, if there's malformed data, no destructors get called. The user still needs to validate the returned data, though, and rebless anything which might result in an evil destructor being called.

Another defense is to run deserialization and validation inside of a Safe object. Make sure that if the object fails to validate, it is completely destructed before we exit the Safe compartment.
I think that these could work well. For backwards compatibility with perl5, parrot will quite likely support taint checking, safe.pm, and ops.pm. I don't see why support is needed at core parrot level. I believe that the plan is to provide XS-level Perl 5 compatibility via ponie, in which case much of this stuff would be up to ponie, not core parrot. Tainting sounds to me like the sort of things that would be a property on a scalar, checked by custom ponie ops, which would be as parrot core as python's ops. A brief inspection suggests that ops.pm is entirely a parser level thing, so I don't see the need for core parrot support there. Safe.pm as implemented is an unreliable mess. It's also problematic as it describes actions solely in terms of perl 5 ops. I've no idea quite how close Arthur, Rafael and the other ponie conspirators think that ponie-generated parrot bytecode will be to the perl 5 optree structure, but I wouldn't like to place any bets right now. I think that safe execution compartments are part of Dan's plan, but I'm not sure if anyone yet knows how any of this fits together (even Dan or Arthur) Nicholas Clark
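The first defense quoted in this message, deferring the blessing until the whole stream has been read, amounts to a two-phase load. The sketch below models it in C with invented names: a "class" is reduced to a destructor pointer, a NULL or over-long record counts as malformed input, and tempfile_destroy only counts calls where the real CGITempFile::DESTROY would unlink a file.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct object;
typedef void (*destructor_fn)(struct object *self);

struct object {
    char payload[64];
    destructor_fn destroy;  /* NULL until the object is "blessed" */
};

/* Stand-in for a dangerous destructor; it only counts its calls. */
static int destroy_calls;

static void tempfile_destroy(struct object *self)
{
    (void)self;
    destroy_calls++;
}

/* Run the destructor, if the object ever received one. */
static void object_release(struct object *o)
{
    if (o->destroy != NULL)
        o->destroy(o);
    o->destroy = NULL;
}

/* Returns 0 on success, -1 on malformed input. */
static int deserialize(const char *const *records, size_t n,
                       struct object *out, destructor_fn cls)
{
    assert(records != NULL && out != NULL);

    /* Phase 1: copy plain data only. Nothing is blessed yet, so
     * bailing out here cannot trigger any destructor. */
    for (size_t i = 0; i < n; i++) {
        out[i].destroy = NULL;
        if (records[i] == NULL ||
            strlen(records[i]) >= sizeof out[i].payload) {
            for (size_t j = 0; j < i; j++)
                object_release(&out[j]);    /* no-op: still unblessed */
            return -1;
        }
        strcpy(out[i].payload, records[i]);
    }

    /* Phase 2: the whole stream parsed cleanly; attach the class. */
    for (size_t i = 0; i < n; i++)
        out[i].destroy = cls;
    return 0;
}
```

As the message notes, this only protects against a bomb followed by malformed data. If the stream is well formed the objects are still blessed, so the caller has to validate them before they are released.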
Re: Parrot - 100% Gnu.NET ?
Clemens Eisserer writes:
Hi there!

I'm a java programmer

Uh oh :-)

and I'm not really experienced with perl.

[...]

I think that parrot could be the Gnu-version of .NET and could be a real benefit for the whole opensource-world. No 20 runtimes need to be installed on a system - parrot would do the job better than each could alone (Because if many apps rely on parrot the JIT will be tuned by guys from IBM ;-) ).

But in my opinion parrot needs a more complete "classpath" than perl5 currently has.

There is a big problem with that: it kinda precludes the whole "community" thing that made everyone love Perl 5. In particular, CPAN.

The plan for Perl 6, at least, is to include almost nothing in the base distribution, so administrators are forced to install some stuff from CPAN in order for Perl to be useful at all[1].

We need to harness the community work force of CPAN, for one because we don't have the work force ourselves to put together such a library. But the bigger reason is to enable the community to write modules, so we always have forward progress on the language.

Parrot is another level like perl5, perl5 was fine for scripting and even bigger apps worked great, but it never tried to be a complete framework for many languages. Parrot is in my eyes a way to go away from C to higher level languages.

So, my question: Is it planned to include a complete classpath into parrot, including gui,

maybe

network,

yes

db,

no

sound functionality

no

or will it have only the "really needed" things like perl5 had? I know that everything is already available for perl5 installing bindings, but in my eyes this doesn't solve the problem. Many users don't want to install separate libraries, they simply want to use parrot based apps with nice frontends, etc.

Ahh, I see what you mean. You want to distribute your app and parrot and have it Just Work. You can always include the needed modules in your distribution and install them when you get there... but that kills the platform independence thing. You could tell parrot to install them when you get there off the CPAN automatically, but from a commercial standpoint, that puts all your users in the hands of a -- possibly mischievous -- module author (though I don't know any authors who would do such a thing, it's still a concern).

There will be a solution to the commercial thing. One of the (smaller) goals of Perl 6 is to be used commercially to some extent.

A bigger goal, though, is to keep the open source support we already have. Including a big default classpath makes people feel like everything is "easy enough," which, ironically, isn't easy enough. It doesn't motivate people to write/install modules.

So there you go, we're including everything we need to make the Easy Things Possible, and the community will worry about making them Easy.

Luke

I hope I didn't make you angry, and please don't misunderstand me, I think what you do is great!

Please let me know what you think about the idea to include a "complete" classpath into parrot.

lg Clemens

[1] This was a big problem in Perl 5: admins would install Perl, and assume that it's ready because it already comes with, like, 40 modules. And then nobody could use it, because there were no good modules installed.
Re: serialisation (was Re: [RfC] vtable-dump)
Nicholas Clark wrote: On Sat, Aug 30, 2003 at 10:13:02PM -0400, Benjamin Goldberg wrote: Nicholas Clark wrote: The attacker can craft a bogus CGITempFile object that refers to any file on the system, and when this object is destroyed it will attempt to delete that file at whatever privilege level the CGI runs at. And because that object is getting destroyed inside the deserialise routine of Storable, this all happens without the user-written code getting any chance to inspect the data. And even Storable can't do anything about it, because by the time it encounters the repeated hash key, it has already deserialised this time-bomb. How does it defuse it? The simplest solution *I* can think of, is to have storable copy the taint flag from the input string/stream onto every single string that it produces. Taint checking doesn't solve *all* security problems, of course, but it can catch many of them, and it certainly would catch this one (if $$self were tainted). I don't believe that it would. A quick test suggests that in perl5 destructors get to run after a taint violation is detected: $ perl -Tle 'sub DESTROY {print "We still get to run arbitrary code after taint violations"}; bless ($a = []); `cat /etc/motd`' Insecure $ENV{PATH} while running with -T switch at -e line 1. We still get to run arbitrary code after taint violations $ That wasn't what I meant. The taint violation in your example does *not* correspond to Storable turning on the taint flag in the strings it produces. At best, it corresponds to Storable dying due to a malformed input file, resulting in destructors being called. If sub DESTROY tries to unlink a file, and that filename is a tainted string, then the DESTROY will die. Currently, Storable *doesn't* turn on the taint flag in the SVPVs that it produces; because of this, $$self in CGITempFile::DESTROY isn't tainted. If $$self in CGITempFile::DESTROY *were* tainted, then obviously that DESTROY would die, and the file wouldn't get unlinked. 
Thus, we would avoid that security hole. One defense against following a bomb with malformed data, might be to have Storable save up the SV*s and the names with which to bless them, and only do the blessing *after* the data is fully deserialized, as a last step before returning it to the user. This way, if there's malformed data, no destructors get called. The user still needs to validate the returned data, though, and rebless anything which might result in an evil destructor being called. Another defense is to run deserialization and validation inside of a Safe object. Make sure that if the object fails to validate, it is completely destructed before we exit the Safe compartment. I think that these could work well. For backwards compatibility with perl5, parrot will quite likely support taint checking, safe.pm, and ops.pm. I don't see why support is needed at core parrot level. I believe that the plan is to provide XS-level Perl 5 compatibility via ponie, in which case much of this stuff would be up to ponie, not core parrot. Tainting sounds to me like the sort of things that would be a property on a scalar, checked by custom ponie ops, which would be as parrot core as python's ops. A brief inspection suggests that ops.pm is entirely a parser level thing, so I don't see the need for core parrot support there. Safe.pm as implemented is an unreliable mess. It's also problematic as it describes actions solely in terms of perl 5 ops. I've no idea quite how close Arthur, Rafael and the other ponie conspirators think that ponie-generated parrot bytecode will be to the perl 5 optree structure, but I wouldn't like to place any bets right now. 
I think that safe execution compartments are part of Dan's plan, but I'm not sure if anyone yet knows how any of this fits together (even Dan or Arthur) Nicholas Clark -- $a=24;split//,240513;s/\B/ = /for@@=qw(ac ab bc ba cb ca );{push(@b,$a),($a-=6)^=1 for 2..$a/6x--$|;print [EMAIL PROTECTED] ]\n;((6=($a-=6))?$a+=$_[$a%6]-$a%6:($a=pop @b))redo;}
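[Editor's sketch] The deferred-blessing defence described in the message above can be illustrated roughly as follows. This is written in Python rather than Perl for brevity, and every name in it is hypothetical: the one-record-per-line input format, `thaw_deferred`, `REGISTRY` and the `TempFile` class are stand-ins and do not model Storable's real internals. The only point is the ordering: plain data is collected first, and classes are attached (the analogue of bless) as a last step, so malformed input cannot leave a half-built object whose destructor then runs.

```python
deleted = []  # records the paths that destructors would have removed


class TempFile:
    """Stand-in for CGITempFile: its destructor 'deletes' a path."""

    def __init__(self, path):
        self.path = path

    def __del__(self):
        deleted.append(self.path)


# Hypothetical whitelist of classes the deserialiser may instantiate.
REGISTRY = {"TempFile": TempFile}


def thaw_deferred(lines):
    """Parse every record first; create objects only if all records parse."""
    pending = []
    for line in lines:
        name, arg = line.split(" ", 1)  # ValueError on a malformed record
        if name not in REGISTRY:
            raise ValueError(f"unknown class {name!r}")
        pending.append((REGISTRY[name], arg))
    # Last step: only now are any objects with destructors created.
    return [cls(arg) for cls, arg in pending]
```

As the message notes, this only keeps destructors from firing on malformed input; the caller still has to validate whatever is returned.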
Re: Parrot Z-machine
On Thu, Aug 28, 2003 at 06:17:07AM -0700, Amir Karger wrote: Hi. Hugely newbie at Parroting, but think it's the coolest. Good stuff. I hope it stays that way with the inevitable setbacks and annoyances that will come while gaining experience. - Is it not being ported because of a lack of tuits, or because it's extremely hard? I think it's partly because of tuits, and partly because to be done properly it requires a couple of big features to be added to parrot, notably 1: dynamic opcode loading 2: dynamic bytecode conversion (This is the point where someone tells me that dynamic opcode loading now works) We'd need dynamic opcode loading because we don't want to have the Z-machine specific opcodes compiled into every parrot (or every other specialist set of opcodes) We'd want dynamic bytecode conversion because we want parrot to be able to directly load Z-code files, rather than having to first run an external program to convert them. Both these features are wanted to seamlessly run any alien bytecode, such as Python's bytecode, Java bytecode or .NET bytecode. However, I don't think that we'd need them in place to begin working on a Z-code port. (We'd just have to arrange to link in any specialist Z-code ops for a while, and to convert Z-code before loading it) - A Perl 6 Summary from last year claimed Josh Wilmes was going to work on it. Does anyone know if he is and, if so, how far he's gotten? I have no idea - Whether or not it's extremely hard, would it be useful to have some of the easy parts done by a newbie who can hack assembly but not well enough to put into the parrot core? In that case, which would be the easy parts? I've no idea. How familiar are you with Z-code? About all that I know (and I may be wrong) is that it fits in a virtual machine with 128K total memory, that it has continuations (which we now have), and that the Hitch-Hiker's Guide to the Galaxy adventure game was written in it. Is the virtual machine stack based, or register based? 
(I guess from your next paragraph that it's stack based) Do you know enough Z-code to create Z-code regression tests that progressively exercise features of the Z-machine (along with the expected correct output) so that any implementor (possibly yourself) knows when they've got it right? - I saw that Dan wanted to create a library to handle stack-based languages. I don't suppose that's been done at all? If not, I could steal from, e.g., befunge, which would be way better than starting from scratch. I would offer to create the library, but I'm not really confident enough about my (as-yet nonexistent) pasm-writing skills to write a library a bunch of other people use. I don't think you'd necessarily need to know actual PASM. I think that the tricky part is thinking about how to map from a stack machine to a register machine. I've no idea what academic (or other) literature there is on this. The simplest way to run a stack based language on parrot would be to ignore most of the registers and just use parrot's stack. However, this isn't going to be efficient. So researching/working out how to efficiently map stack operations to use more than a couple of registers would be very useful for all sorts of stack based languages. Getting started on that would probably be very helpful - you don't need to actually write the implementation PASM if you're able to describe what is needed to a co-volunteer. Nicholas Clark
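[Editor's sketch] One common way to approach the stack-to-register mapping raised above is to simulate the operand stack at translation time, so that stack slot i always lives in register i and no stack traffic is left at run time. The sketch below is in Python and assumes a made-up three-instruction stack language (`push`, `add`, `mul`) and made-up register-op tuples; it is neither Z-code nor PASM, and it only handles straight-line code. A real translator would also need the stack depth to agree wherever branches join, and would have to spill once the depth exceeds the registers available.

```python
def stack_to_registers(program):
    """Translate stack ops into three-address register ops.

    The operand stack is simulated while translating: the value at
    stack depth i is always held in register "I<i>".
    Returns (register_code, final_stack_depth).
    """
    depth = 0
    out = []
    for op, *args in program:
        if op == "push":
            out.append(("set", f"I{depth}", args[0]))
            depth += 1
        elif op in ("add", "mul"):
            if depth < 2:
                raise ValueError("stack underflow")
            depth -= 1
            # Combine the top two slots; the result replaces the lower one.
            out.append((op, f"I{depth - 1}", f"I{depth - 1}", f"I{depth}"))
        else:
            raise ValueError(f"unknown op {op!r}")
    return out, depth


def run(code):
    """Tiny interpreter for the register ops, used to check the translation."""
    regs = {}
    for op, dst, *src in code:
        if op == "set":
            regs[dst] = src[0]
        elif op == "add":
            regs[dst] = regs[src[0]] + regs[src[1]]
        elif op == "mul":
            regs[dst] = regs[src[0]] * regs[src[1]]
    return regs
```

For example, translating `push 2; push 3; add; push 4; mul` yields five register ops that leave the result of (2 + 3) * 4 in `I0`.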
Re: [RfC] vtable-dump
Leopold Toetsch wrote: Benjamin Goldberg [EMAIL PROTECTED] wrote: class freezer { class thawer { class cloner { [ big snip ] Do you expect that these are overridden by some languages using parrot? I.e. that ponie tries to implement a freezer that writes output for Storable? I'm not entirely sure... For ponie to implement something that deals with Storable, then of course it needs to output data in the same file format as Storable does. I haven't looked at the guts of storable, so I don't know what that format is. Further: having clone implemented in terms of freeze + thaw needs additional memory for the intermediate frozen image. Isn't that suboptimal? Only slightly -- It's just *one single* PMC's data that's stored in that additional memory. And if we have a separate clone vtable method, then there's a chance that our cloning procedure will drift apart from our freezing/thawing procedure. A general traverse routine or iterator seems to be more flexible. leo -- $a=24;split//,240513;s/\B/ = /for@@=qw(ac ab bc ba cb ca );{push(@b,$a),($a-=6)^=1 for 2..$a/6x--$|;print [EMAIL PROTECTED] ]\n;((6=($a-=6))?$a+=$_[$a%6]-$a%6:($a=pop @b))redo;}
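[Editor's sketch] The "clone in terms of freeze + thaw" idea debated above can be shown in a few lines. This uses Python's `pickle` module purely as a stand-in for the freeze/thaw vtable methods under discussion; the function names are illustrative, not Parrot's API. Note that this version freezes the whole object graph into one image before thawing, so it shows the memory cost Leo asks about but does not model the one-PMC-at-a-time streaming Benjamin describes.

```python
import pickle


def freeze(obj):
    """Serialise obj (and everything it references) to a byte string."""
    return pickle.dumps(obj)


def thaw(image):
    """Rebuild an object graph from a frozen image."""
    return pickle.loads(image)


def clone(obj):
    """Deep copy implemented only in terms of freeze and thaw."""
    image = freeze(obj)  # the intermediate image is the extra memory cost
    return thaw(image)
```

The appeal of this design is that clone cannot drift apart from freeze/thaw, because it has no code path of its own.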
Class libraries deployed with parrot.
Hi again! Wow, thanks for thinking about my ideas. I expected that you would call me a troll, but it seems that there are cool people here ;-) O.K. let's simply call it class-library. It doesn't seem to be the Perl way to limit yourself to one option only (There's more than one way to do it). Of course we wouldn't want five different implementations of Unicode, and it makes sense to ship _one_. However, people might want to use different GUIs -- e.g. wxWindows, TK, native Windows GUI, Aqua, ... In my opinion wxWindows would be fine with a nice api-binding. The C++-Api is terrible, but the toolkit itself seems to offer quite cool features. I would prefer wxWindows, because they did what Sun didn't continue. One API many toolkits, small library-size. Of course, many people have different tastes, but look at Java (that's the world where I come from). swing is slow and ugly, but nearly 97% of all java programs use it because it's shipped with java since 1.2... Look at TCL, TK is used because it integrates well with TCL. If one really doesn't like wxWindows, he can of course provide other libraries. But the mainstream will use the libraries provided with the runtime. Sound is a whole different beast -- implementations vary wildly across systems, and I'm not sure whether it is possible to have a high-performance cross-platform implementation that satisfies 90% of users. SDL shows that this is possible. Of course, it's often a problem of portability. But developers also have to think about portability. If the class-libraries shipped with a runtime-environment are available across most ports, they don't have to use 3 different bindings to different native libs. They can simply use the api implemented in the class-library and work with it. For the developers of the class library this is of course hard work and the runtime often gets big. - Parrot bytecode executables might be packages that contain the necessary libraries in bytecode format (e.g. wxWindows). 
- Installers might include libraries (in bytecode) and install them if needed (install = simple copy) Bytecode-packages are typically no problem. They are small and only depend on libraries provided by the environment. For binary (platform-dependent) packages: Usually, people just ship these statically linked against those libs that can't be typically expected to be installed on the system. So I guess for Parrot apps distributed in binary, there's not much of a problem. For apps shipped as bytecode, it still needs to be discussed what is going to be provided: - Preferably a small package ( 5 - 10 MB) that lets people use Parrot apps quickly. - A huge package complete with Perl/Python/PHP/Befunge/hq9+/... support so that everybody will have 95% of everything they are ever gonna need - Or both? Of course, both would be best ;-). In my opinion it would be best to provide a native basis. Lightweight-libraries could do the rest. However, if the user has to install native libraries, he has the good old problems: ABI problems and dependencies. Of course, in parts where high-performance is needed, this won't be a good solution. Of course it would be fine to have a package that only weighs 5 MB, but what if the user needs to install 3 different libraries so that he can use his program? Imagine how much fits in a 15 MB RPM. A very important thing is in my eyes that bindings can be created without the need of native code. e.g. JNI-bindings need to be written using C, there's no way to do everything in Java. Maybe I can help creating such a class-library? How hard do you think it will be to create a library which works well for many languages? E.g. Java has interfaces, I don't know about perl or python. But I'm sure perl has language features that java doesn't have. Mochts as guat, Clemens *real Austrian slang*
Re: Parrot Z-machine
Nicholas Clark [EMAIL PROTECTED] wrote: 2: dynamic bytecode conversion (This is the point where someone tells me that dynamic opcode loading now works) No it doesn't. Albeit I have posted a proof of concept standalone program months ago. Nicholas Clark leo
Re: The reason for scads of keyed variants
Dan Sugalski [EMAIL PROTECTED] wrote: [ heavily snipped ] Now, for aggregates that hold PMCs ... ... and on JITted cores there's no win at all. For aggregates that *don't* hold PMCs, though, that's where the win is. If we don't have direct operations on aggregate elements but instead have to do a fetch and perform the operation on what we fetch, it means we have to create a temporary PMC for each 'fake' entry, one potentially with a fair amount of knowledge about how the aggregate works, which means that every aggregate will need to provide two vtables, one for itself and one for an aggregate entry. I don't see the point here, especially why we would need a temporary PMC. If we have an array of packed ints, I just need a pointer to the element to work on it. This is very similar to the C<key> opcode I had in some of my proposals. Going with direct operations on keyed aggregates, then, makes the code somewhat more dense, since we only need to access one vtable per operand rather than two (one to fetch the data and one to operate on it). That's balanced somewhat by having to have two sets of PMC opcodes, one that operates on all parts keyed and one that doesn't. Not when you need 64 opcodes for the keyed variants. 64:1 isn't somewhat balanced. More information's available if anyone wants it, but this *is* the way we're going unless someone proves that it's suboptimal in the common case. Implementation details wanted ;-) leo
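[Editor's sketch] The two designs being contrasted above, for an aggregate of packed ints, look roughly like this. It is a Python illustration only: `ElementProxy`, `add_via_fetch` and `add_keyed` are hypothetical names, not Parrot vtable entries or opcodes. The first path creates a temporary "fake entry" that must know how to write back into its aggregate (the cost Dan describes); the second operates on the element in place, as a keyed op would. Leo's counter-proposal, a raw pointer to the element, has no direct Python equivalent and is not shown.

```python
from array import array


class ElementProxy:
    """Temporary 'fake entry' that knows how to write back to its aggregate."""

    def __init__(self, agg, index):
        self.agg = agg
        self.index = index

    def add(self, n):
        self.agg[self.index] += n


def add_via_fetch(agg, index, n):
    """Fetch-then-operate: one temporary object per operation."""
    proxy = ElementProxy(agg, index)
    proxy.add(n)


def add_keyed(agg, index, n):
    """Keyed operation: works on the aggregate element directly."""
    agg[index] += n
```

Both paths leave the array in the same state; they differ only in whether an intermediate object is allocated along the way.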
This week's Perl 6 Summary
The Perl 6 Summary for the week ending 20030831

Welcome to this week's Perl 6 summary. This week, for one week only I'm going to break with a long established summary tradition. No, that doesn't mean I won't be mentioning Leon Brocard this week. Nope, this week we're going to start by discussing what's been said on the perl6-language list this week.

Now that's out of the way, we'll continue with a summary of the internals list.

Continuation Passing is becoming the One True Calling Style

Jos Visser had some code that broke in an interesting fashion when find_lex threw an exception, so he asked the list about it. Leo Tötsch explained that exceptions and old style subs don't play at all well together. It seems to me that the total and utter deprecation of subs using the old style calling conventions is not far distant.
http://xrl.us/rtk

Embedding Parrot in Perl

Luke Palmer has started to learn XS in the service of his project to embed Parrot in Perl. Unsurprisingly, he had a few questions. He got a few answers.
http://xrl.us/rtl

Implementing ISA

Leo Tötsch has implemented isa. The unfashionably lowercased chromatic argued that what Leo had implemented should actually be called does. Chris Dutton thought does should be an alias for has. Piers Cawley thinks he might be missing something.
http://xrl.us/rtm

More on constant PMCs and classes

Leo Tötsch's RFC on constant PMCs and classes from last week continued to attract comments about possible interfaces and implementations.
http://xrl.us/qxk

A miscellany of newbie questions

James Michael DuPont asked a bunch of questions and made suggestions about bytecode emission, the JIT and about possibly extracting the Parrot object system into a separate library. Leo supplied answers.
http://xrl.us/rtn

What the heck is active data?

Dan clarified what he'd meant when he talked about Active Data. His one sentence definition being 'Active Data is data that takes some active role in its use -- reading, writing, modifying, deleting or using in some activity'. The consequences of such data are far reaching; even if your code has no active data in it, Dan points out that you still have to take the possibility into account, or solve the halting problem.

Benjamin Goldberg seemed to think that you didn't need to solve the halting problem, you could just add scads of metadata to everything and do dataflow analysis at compile time. I look forward with interest to his implementation.

Matt Fowles wondered why active data necessitated keyed variants of all the ops, asking instead why we couldn't have a prepkeyed op to return an appropriate specialized PMC to use in the next op. Dan agreed that such an approach was possible, but not necessarily very efficient. Leo Tötsch disagreed with him though.

TOGoS wondered if this meant that we wouldn't know whether set Px, Py did binding or morphing until runtime. (It doesn't. set always simply copies a pointer). In an IRC conversation with Dan we realised that some of this confusion arises from the fact that set_string and friends behave as if they were called assign_string; to get the expected set_string semantics you'd have to do:

    new Px, .PerlUndef
    set_string Px, "some string"

Hopefully this is going to get fixed.
http://xrl.us/rto

Mission haiku

    Nicholas Clark
    To make some kind of mark
    Committed haiku.
    Don't you.

Yes, I know that's not a haiku. It's a Clerihew. I suggest that anyone else who feels tempted to perpetrate verse on list restrict themselves to a sestina or a villanelle, or maybe a sonnet. I also note that POD is a lousy format for setting poetry in.
http://xrl.us/rtp

Jürgen gets De-Warnocked

Jürgen Bömmels had been caught on the horns of Warnock's Dilemma over a patch he submitted a while back. It turns out that he'd been Warnocked in part because both Leo and Dan thought he already had commit rights. So that got fixed. Welcome to the ranks of Parrot committers Jürgen, you've deserved it for a while.
http://xrl.us/rtq

Parrot Z-machine

New face Amir Karger wants to write the Parrot Z-machine implementation and had a few questions about stuff. So far he's been Warnocked.
http://xrl.us/rtr

Notifications

Dan described how Parrot's notification system would work, and what that means for weak references. Michael Schwern thought the outlined notification system would also be awfully useful for debugger watch expressions. Tim Bunce worried about some edge cases.
http://xrl.us/rts

MSVC++ complaints

Vladimir Lipskiy (who's been doing some stellar work recently on various build issues amongst other things) found some problems trying to build Parrot with MSVC++