RE: [RfC] vtable-dump
Dan Sugalski [EMAIL PROTECTED] wrote: On Tue, 9 Sep 2003, Gordon Henriksen wrote: Random thought: There's some discussion on perl-qa right now about how Test::More should implement is_deeply, which executes a code block and tests that the return value is equivalent to a particular nested data structure. (The question posed on that list is with regard to how to handle tie()'d objects, which is not what I'm addressing here.) Regardless of that discussion's outcome, should the serialization API being discussed here enable the following behavior? ok(freeze($expected) eq freeze($actual)); I bring this up because using object addresses as IDs in the serialization entirely prevents this usage. Good. Having waded through one of the threads on p5p just a minute ago, I'm not willing to guarantee this. In fact, I'm willing to explicitly not guarantee this. If we want to compare two frozen structures, string equality is *not* the way to go. Each freeze could, theoretically, choose a different freezing method yet still represent the original. What do you mean by a different freezing method? That key-value pairs from two externally identical hashes could be frozen in different orders? I can see that. Sorting large hashes can be expensive, and certainly requires memory allocation. Or do you mean that freeze($objA) and freeze($objB)--freezing (definitionally identical!) object graphs, with no intervening code between the two calls to freeze--could internally and arbitrarily select a significantly divergent object graph encoding? I can't see that at ALL... Over time (and revisions), I certainly can see a desire not to marry parrot-freeze to a specific binary representation. That's not the question I intended to raise--I asked a question only of repeatability, not of permanent format invariance. This is a Good Place for a black-box comparison op, which string equality definitely is not. (At which point do extremely complex routines cease to be operators?)
A black-box comparison of the (live) object graphs, or black-box comparison of the serializations themselves? I can see the former--while it's precisely the problem that consistent traversal output avoids, a deep == could have utility and avoid the serialization overhead. But how do you implement concurrent traversals without private seen tables? (e.g., if the graphs overlap, one of the traversals will find the other traversal's visited flag and fail to visit the entire graph.) Reentrant traversal again. Pathological case:

    $aa = [1];
    $ab = [1];
    %ha = ( a => \$aa, b => \$ab );
    %hb = ( a => \$ab, b => \$aa );
    # %ha and %hb overlap and are deep-identical,
    # but not deeply reference-identical.

Comparing serialized object graphs strikes me as tremendously esoteric, i.e. a maintenance burden to be used by very few clients. (One of the few significant uses being that of replacing string-equals for the testing of the serializer itself.) It also strikes me as very, very, very complicated should the freeze methods diverge even slightly. I've never seen any such mechanism in any other environment. To compare graphs that I had saved in long-term storage, I as a caller would expect to need to deserialize the graphs and use a deep equals--and I would expect to implement deep equals (another feature I've never before seen as a specific feature of any environment) by serializing the deserialized graphs again and doing a string (or file) comparison. E.g., if I had read $a or $b from a file, I would expect to have to write:

    freeze(thaw($a)) eq freeze(thaw($b))

instead of:

    $a eq $b

That synthesizes the functionality from a minimal set of operations--freeze and thaw--with a lot less maintenance for parrot. (At the cost of a bit of memory and runtime speed. But anybody who performs deep-compares in truly performance-critical code needs to meet my Clue Bat[tm].)

--
Gordon Henriksen
IT Manager
ICLUBcentral Inc.
[EMAIL PROTECTED]
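[Editorial sketch] Gordon's round-trip normalization--re-freezing both thawed graphs so a plain string compare becomes a deep equals--can be illustrated with a stand-in serializer. JSON is used here purely as a hypothetical freeze format; the names freeze/thaw mirror the post, not any Parrot API.

```python
import json

# json stands in for Parrot's freeze/thaw; sort_keys makes the *current*
# encoder deterministic, which is all the round-trip trick requires.
def freeze(graph):
    return json.dumps(graph, sort_keys=True)

def thaw(frozen):
    return json.loads(frozen)

# Two serializations of deep-identical graphs, produced at different times
# or by different encoder choices, need not be byte-identical...
a = '{"b": [1], "a": [1]}'
b = '{"a": [1], "b": [1]}'
assert a != b

# ...but thawing and re-freezing re-encodes both in the current format,
# after which string equality is a deep equals.
assert freeze(thaw(a)) == freeze(thaw(b))
```

The point is that deep comparison is synthesized from freeze and thaw alone; no byte-level compatibility between old serializations is ever assumed.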
RE: [RfC] vtable-dump
On Tue, 9 Sep 2003, Gordon Henriksen wrote: Dan Sugalski [EMAIL PROTECTED] wrote: On Tue, 9 Sep 2003, Gordon Henriksen wrote: Random thought: There's some discussion on perl-qa right now about how Test::More should implement is_deeply, which executes a code block and tests that the return value is equivalent to a particular nested data structure. (The question posed on that list is with regard to how to handle tie()'d objects, which is not what I'm addressing here.) Regardless of that discussion's outcome, should the serialization API being discussed here enable the following behavior? ok(freeze($expected) eq freeze($actual)); I bring this up because using object addresses as IDs in the serialization entirely prevents this usage. Good. Having waded through one of the threads on p5p just a minute ago, I'm not willing to guarantee this. In fact, I'm willing to explicitly not guarantee this. If we want to compare two frozen structures, string equality is *not* the way to go. Each freeze could, theoretically, choose a different freezing method yet still represent the original. What do you mean by a different freezing method? That key-value pairs from two externally identical hashes could be frozen in different orders? I can see that. Sorting large hashes can be expensive, and certainly requires memory allocation. That's possible, yes. We may well decide to have each hash have a separate randomness seed. While a bit excessive, it's not too unreasonable. Or do you mean that freeze($objA) and freeze($objB)--freezing (definitionally identical!) object graphs, with no intervening code between the two calls to freeze--could internally and arbitrarily select a significantly divergent object graph encoding? I can't see that at ALL... This is, in fact, what I mean.
(Well, actually, what I mean is that freeze($a) and freeze($a), with no intervening changes at all to $a, may produce different freeze representations.) There are two reasons for this: 1) We make no promises on the format for the freeze, just that it will be able to reconstitute to the original structure. That means that the system may, if it chooses, use one of several formats. The freeze format can be chosen for any number of reasons including current free memory, phase of the moon, or sheer random chance. 2) We may choose to use internal characteristics, including addresses of immovable structures, as unique identifiers for the freezing. #2 gets relatively minor changes between two otherwise functionally identical graphs, but nonetheless significant enough to make string equality testing infeasible. #1 means that the first freeze may give you the graph in the internal binary format and the second freeze gives you the graph in YAML. (And the third potentially in XML.) Over time (and revisions), I certainly can see a desire not to marry parrot-freeze to a specific binary representation. That's not the question I intended to raise--I asked a question only of repeatability, not of permanent format invariance. I don't want to make any guarantees of repeatability. Past history suggests that when behaviour isn't guaranteed it's best to actively randomize the behaviour if there's no significant penalty to doing so, as otherwise people will count on the non-promised behaviour to the point where the installed base makes it infeasible to change when the future rolls around. This is a Good Place for a black-box comparison op, which string equality definitely is not. (At which point do extremely complex routines cease to be operators?) A black-box comparison of the (live) object graphs, or black-box comparison of the serializations themselves? Both, though I was thinking the latter, which is significantly more useful.
Comparing serialized object graphs strikes me as tremendously esoteric, i.e. a maintenance burden to be used by very few clients. It actually makes things potentially much simpler, if you consider comparison of live objects for equivalence a special case of this. (One of the few significant uses being that of replacing string-equals for the testing of the serializer itself.) It also strikes me as very, very, very complicated should the freeze methods diverge even slightly. I've never seen any such mechanism in any other environment. Which is fine. I suppose we get to do at least one new thing with Parrot, though I expect this isn't anywhere near new. It's also going to be necessary to handle duplicate detection when getting objects over the network or from long-term storage. To compare graphs that I had saved in long-term storage, I as a caller would expect to need to deserialize the graphs and use a deep equals Why? This one makes no sense to me. If I have two black-box serialized representations of objects, I'd expect to *not* have to reconstitute them, and would actively not want to do so, since
RE: [RfC] vtable-dump
Random thought: There's some discussion on perl-qa right now about how Test::More should implement is_deeply, which executes a code block and tests that the return value is equivalent to a particular nested data structure. (The question posed on that list is with regard to how to handle tie()'d objects, which is not what I'm addressing here.) Regardless of that discussion's outcome, should the serialization API being discussed here enable the following behavior? ok(freeze($expected) eq freeze($actual)); I bring this up because using object addresses as IDs in the serialization entirely prevents this usage.

--
Gordon Henriksen
IT Manager
ICLUBcentral Inc.
[EMAIL PROTECTED]
RE: [RfC] vtable-dump
On Tue, 9 Sep 2003, Gordon Henriksen wrote: Random thought: There's some discussion on perl-qa right now about how Test::More should implement is_deeply, which executes a code block and tests that the return value is equivalent to a particular nested data structure. (The question posed on that list is with regard to how to handle tie()'d objects, which is not what I'm addressing here.) Regardless of that discussion's outcome, should the serialization API being discussed here enable the following behavior? ok(freeze($expected) eq freeze($actual)); I bring this up because using object addresses as IDs in the serialization entirely prevents this usage. Good. Having waded through one of the threads on p5p just a minute ago, I'm not willing to guarantee this. In fact, I'm willing to explicitly not guarantee this. If we want to compare two frozen structures, string equality is *not* the way to go. Each freeze could, theoretically, choose a different freezing method yet still represent the original. This is a Good Place for a black-box comparison op, which string equality definitely is not. Dan
RE: [RfC] vtable-dump
Nicholas Clark wrote: Garrett Goebel wrote: Leopold Toetsch wrote: Garrett Goebel [EMAIL PROTECTED] wrote: As I was googling to see how other people have approached these problems I ran across: http://blogs.gotdotnet.com/cbrumme/CategoryView.aspx/CLR What did you find there, besides a description that a certain company is producing b0rken software ;-) The thoughts of a CLR developer on implementation issues: mistakes made, lessons learned, nasty things to keep in mind, etc. Discussions of lock-free thread-safe coding issues, balancing the CLR's memory model with cpu memory models, GC, finalization, exceptions, MP, NUMA, JIT, security policies, managed code, typing, shared vtables, marshaling, serialization. But not hugely indexed and with a lot of Win32 specific detail (Although there are section headers). For example the term lock-free seems to occur three times, and none seem to be discussions on how to go about writing lock-free code, merely references to it. (Or did you mean that we should read some of the comments too?) Are there specific permalinks of blog entries that discuss individual items in your list above, to save us all trying to wade through masses of text? The comments do hone in on some of the issues... but no, I can't give anything more specific than that the last 2 entries looked to cover many of the same issues that have been discussed recently on this list. If you're looking for something specifically on threading issues, perhaps a doctoral thesis on Implementation Issues in Concurrent Programming Languages: www.cs.usfca.edu/benson/diss.ps? -- Garrett Goebel IS Development Specialist ScriptPro Direct: 913.403.5261 5828 Reeds Road Main: 913.384.1008 Mission, KS 66202 Fax: 913.384.2180 www.scriptpro.com garrett at scriptpro dot com
RE: [RfC] vtable-dump
Garrett Goebel wrote: Nicholas Clark wrote: Garrett Goebel wrote: Leopold Toetsch wrote: Garrett Goebel [EMAIL PROTECTED] wrote: As I was googling to see how other people have approached these problems I ran across: http://blogs.gotdotnet.com/cbrumme/CategoryView.aspx/CLR What did you find there, besides a description that a certain company is producing b0rken software ;-) The thoughts of a CLR developer on implementation issues: mistakes made, lessons learned, nasty things to keep in mind, etc. Discussions of lock-free thread-safe coding issues, balancing the CLR's memory model with cpu memory models, GC, finalization, exceptions, MP, NUMA, JIT, security policies, managed code, typing, shared vtables, marshaling, serialization. But not hugely indexed and with a lot of Win32 specific detail (although there are section headers). For example the term lock-free seems to occur three times, and none seem to be discussions on how to go about writing lock-free code, merely references to it. (Or did you mean that we should read some of the comments too?) Are there specific permalinks of blog entries that discuss individual items in your list above, to save us all trying to wade through masses of text? The comments do hone in on some of the issues... but no, I can't give anything more specific than that the last 2 entries looked to cover many of the same issues that have been discussed recently on this list. Sorry, I was looking at: http://blogs.gotdotnet.com/cbrumme/ which truncates old blogs... App Domains: http://blogs.gotdotnet.com/cbrumme/PermaLink.aspx/56dd7611-a199-4a1f-adae-6fac4019f11b Memory Model: http://blogs.gotdotnet.com/cbrumme/PermaLink.aspx/480d3a6d-1aa8-4694-96db-c69f01d7ff2b

--
Garrett Goebel
IS Development Specialist
ScriptPro
Direct: 913.403.5261  Main: 913.384.1008  Fax: 913.384.2180
5828 Reeds Road, Mission, KS 66202
www.scriptpro.com  garrett at scriptpro dot com
Re: [RfC] vtable-dump
On Sep-08, Leopold Toetsch wrote: Nicholas Clark [EMAIL PROTECTED] wrote: On Fri, Sep 05, 2003 at 09:34:04AM -0400, Dan Sugalski wrote: On Fri, 5 Sep 2003, Peter Haworth wrote: With the seen hash approach, I wouldn't expect the hash itself to take nearly as much space as the frozen structure; My experience tracing variable graphs in Perl 5 indicates otherwise. I think it may be worth listening to Dan on this one. I don't know why Storable.xs is using a general HV for mapping an address to an integer (more isn't needed to serialize all IMHO - Dan's next_for_GC approach doesn't provide more info). Perl5 does convert the address to a string and uses all weirdness of hv.c code. This isn't necessary for this special kind of problem. Parrot has some DOD counters for possibly active PMCs. A simple and probably fixed-size[1] hash based on such a count with just a mapping of {pmc => id} takes 12 bytes per entry on 32-bit architectures and it's really fast. It's late so I'm probably being an idiot, but is all that is needed (1) a unique id for every pmc address, and (2) whether or not the PMC has been visited yet? If so, why not use the address itself for the id (or some variant, like the index of the pmc in its arena, added to/or-ed with some base id for the arena itself)? And for the flag, have arenas preallocate a bit vector of seen flags for all their PMCs? (Or if you have the arena base ids, then you can have a single bit vector for all arenas.) Or is there no simple way of going from a PMC address to its arena? Perhaps it's no win to pollute the cache with chunks of the bit vector instead of the actual referenced PMC flagpoles.
But if that's an issue then I can't help but think that most of the time the seen bit vector would be quite sparse, so we could go to a multi-level scheme: use a conservative vector that can answer "definitely not seen" vs "maybe seen", and then look at the actual PMC flagpole (or, again, a separate bit vector if it somehow helps multithreading) for the true answer. An eighth of a bit per entry ought to be small enough, no? I suppose I ought to pay more attention to the conversation (and the current code) before butting in with suggestions. I've a feeling I'm solving the wrong problem.
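[Editorial sketch] Steve's multi-level scheme--a conservative bit vector that cheaply answers "definitely not seen", falling back to an exact check for "maybe seen"--can be sketched as below. The 16-byte slot granularity and bucket count are assumptions for illustration, not Parrot's actual arena layout, and a Python set stands in for the exact per-PMC flag.

```python
class SeenSet:
    """Two-level 'seen' test: a coarse bit vector gives a cheap
    'definitely not seen' answer; an exact record resolves 'maybe seen'.
    The bit count and 16-byte slot granularity are illustrative."""

    def __init__(self, bits=1024):
        self.coarse = bytearray(bits // 8)  # one conservative bit per bucket
        self.exact = set()                  # ground truth (stands in for PMC flags)

    def _bit(self, addr):
        # Map an address to a bucket; distinct addresses may collide,
        # which is exactly why the coarse answer is only conservative.
        return (addr // 16) % (len(self.coarse) * 8)

    def mark(self, addr):
        i = self._bit(addr)
        self.coarse[i >> 3] |= 1 << (i & 7)
        self.exact.add(addr)

    def seen(self, addr):
        i = self._bit(addr)
        if not self.coarse[i >> 3] & (1 << (i & 7)):
            return False            # fast path: bit clear, definitely new
        return addr in self.exact   # slow path: bit set, confirm exactly

s = SeenSet()
s.mark(0x1000)
assert s.seen(0x1000)
assert not s.seen(0x2000)
```

A colliding address (one that maps to the same coarse bit as a marked one) takes the slow path but still gets the right answer, which is the scheme's whole point.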
Re: [RfC] vtable-dump
On Sunday, September 7, 2003, at 06:21 , Nicholas Clark wrote: On Fri, Sep 05, 2003 at 09:34:04AM -0400, Dan Sugalski wrote: On Fri, 5 Sep 2003, Peter Haworth wrote: With the seen hash approach, I wouldn't expect the hash itself to take nearly as much space as the frozen structure; My experience tracing variable graphs in Perl 5 indicates otherwise. I think it may be worth listening to Dan on this one. Appended patch to Storable implements an option to count the size of the seen hashes. (There seem to be two, one for objects, one for classes.) It's disabled by default - compile Storable with -DDEBUGME or change the #if 0 at line 23 of Storable.xs to enable. You'll also need Devel::Size installed. Taking a relatively complex structure, such as the POSIX exports:

$ /usr/local/bin/perl5.8.0 -Mblib -MStorable -MPOSIX -MDevel::Size -le \
  'package POSIX; foreach (\@EXPORT, \@EXPORT_OK, \%EXPORT_TAGS) {
     print Devel::Size::total_size $_;
     print length Storable::freeze $_;
     print $Storable::SEENSIZE;
     print "" }'

reformatted, this gives:

              size in memory  size frozen  seen hash size
@EXPORT                22680         5477           19068
@EXPORT_OK              1943          422            1864
%EXPORT_TAGS           23686         5963           20472

Not very surprised by this result at all. Hash table entries are going to be roughly comparable in size to the original object graph, since so many objects in Perl are really quite tiny. The issues with the alternatives are stunningly painful, though. They've been hashed out already. I'm less concerned by transient memory consumption than by the sabotage of threading. Perhaps it would make sense to use a specialized data structure if memory consumption is really at the top of anybody's list of concerns. Remember the hash tables without linked lists chained off of them from undergrad CS? They're much more compact (literally a single table in memory), and most of their flukes are avoidable when entries can't be removed.

Gordon Henriksen
[EMAIL PROTECTED]
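[Editorial sketch] The chain-free hash table Gordon alludes to is open addressing: one flat array probed in place of per-entry chain nodes, workable here precisely because a seen table never deletes entries mid-traversal. A minimal linear-probing address-to-id map follows; the capacity and hash constant are illustrative, not anything from Parrot.

```python
class SeenTable:
    """Open-addressed (linear probing) address -> id map: a single flat
    table, no chained entry nodes. Safe because a seen table only ever
    inserts. Capacity must be a power of two; in practice it would be
    pre-sized from a live-object estimate (e.g. Parrot's DOD counters)."""

    def __init__(self, capacity=256):
        self.keys = [None] * capacity
        self.ids = [0] * capacity
        self.next_id = 0

    def id_for(self, addr):
        """Return the id already assigned to addr, or assign the next one."""
        mask = len(self.keys) - 1
        i = (addr * 2654435761) & mask   # multiplicative hash of the address
        while self.keys[i] is not None:
            if self.keys[i] == addr:
                return self.ids[i]       # already seen: reuse its id
            i = (i + 1) & mask           # collision: probe the next slot
        self.keys[i] = addr              # first visit: record and number it
        self.next_id += 1
        self.ids[i] = self.next_id
        return self.next_id

t = SeenTable()
assert t.id_for(0x1000) == 1
assert t.id_for(0x1000) == 1   # revisiting yields the same id
assert t.id_for(0x2000) == 2   # a new address gets a fresh id
```

With entries never removed, tombstones and chain maintenance disappear, which is what makes the structure so compact.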
Re: [RfC] vtable-dump
On Sunday, September 7, 2003, at 01:41 , Dan Sugalski wrote: At 8:56 PM -0600 9/4/03, Luke Palmer wrote: Gordon Henriksen writes: What you're suggesting also has significant side-effects: It halts hypothetical multithreaded programs, suspends DoD, prevents the traversal mechanism from calling back into parrot code, requires fine-grained locks which are extremely expensive and have been summarily abandoned to great acclaim in all prior works... and for that, still doesn't provide a useful form of thread safety in the general case anyhow. Is there a problem with providing a mechanism which would suspend all threads except for the current one, as to ensure that the serialize operates, er, serially. Could it be implemented with a top-priority event post that acquires a global lock? The biggest problem with a lock of this sort is that it requires all the interpreters to actually *check* it to be meaningful. (Not that there aren't other problems--there are, quite a number of them--but this is the big one) Locks nobody gets aren't useful, and for this one to be useful we'd need to have some assurance that the other threads were looking at it regularly, and that those threads were all paused when we wanted to do our stuff. You end up with either long delays waiting, or a lot of contention on a single global lock. Neither is particularly good. I think Luke wasn't suggesting a global lock, but rather suggesting that the competing threads be suspended using operating system facilities to temporarily prevent them from being scheduled. Gordon Henriksen [EMAIL PROTECTED]
Re: [RfC] vtable-dump
Steve Fink [EMAIL PROTECTED] wrote: On Sep-08, Leopold Toetsch wrote: Parrot has some DOD counters for possibly active PMCs. A simple and probably fixed-size[1] hash based on such a count with just a mapping of {pmc => id} takes 12 bytes per entry on 32-bit architectures and it's really fast. It's late so I'm probably being an idiot, but is all that is needed (1) a unique id for every pmc address, and (2) whether or not the PMC has been visited yet? If so, why not use the address itself for the id Yep. The address can serve as the ID (it would with Dan's scheme). Or is there no simple way of going from a PMC address to its arena? With ARENA_DOD_FLAGS enabled: GET_ARENA(pmc). When it's disabled it would need arena back pointers or a lookup like contained_in_pool(). Both methods can generate an index usable for indexing into the bit vector. But I wouldn't want to use arena memory for the bit vector. This leads to the same multithreading issues as Dan's scheme has. A bit vector instead of the hash, allocated on the heap, sounds very promising and space efficient. Perhaps it's no win to pollute the cache with chunks of the bit vector instead of the actual referenced PMC flagpoles. clone()/freeze()/dump() pull in the whole PMC. DOD doesn't for the fast case of just setting the live flag on a scalar. But ... An eighth of a bit per entry ought to be small enough, no? Yep. Think so. I suppose I ought to pay more attention to the conversation (and the current code) before butting in with suggestions. I've a feeling I'm solving the wrong problem. No. The idea is fine and would work. leo
Re: [RfC] vtable-dump
On Monday, September 8, 2003, at 03:34 , Steve Fink wrote: On Sep-08, Leopold Toetsch wrote: I don't know why Storable.xs is using a general HV for mapping an address to an integer (more isn't needed to serialize all IMHO - Dan's next_for_GC approach doesn't provide more info). Perl5 does convert the address to a string and uses all weirdness of hv.c code. This isn't necessary for this special kind of problem. Parrot has some DOD counters for possibly active PMCs. A simple and probably fixed-size[1] hash based on such a count with just a mapping of {pmc => id} takes 12 bytes per entry on 32-bit architectures and it's really fast. It's late so I'm probably being an idiot, but is all that is needed (1) a unique id for every pmc address, and (2) whether or not the PMC has been visited yet? If so, why not use the address itself for the id (or some variant, like the index of the pmc in its arena, added to/or-ed with some base id for the arena itself)? A down side to using the address as a key is that it would cause even the simplest serialization to have nondeterministic (or at least unpredictable) output, which certainly makes testing more difficult. And for the flag, have arenas preallocate a bit vector of seen flags for all their PMCs? (Or if you have the arena base ids, then you can have a single bit vector for all arenas.) Consider concurrent serialization from multiple threads. The seen hash (table, whatever) is local to the serialization, and thus thread-safe. The programmer can guarantee that the traversed structures are not concurrently updated--the runtime flatly cannot. Any global structure is inherently not thread-safe. Furthermore, global structures are persistently extant, and must be paid for continually. Most programs will never serialize even one object graph, much less spend any significant portion of their CPU time in serialization. Gordon Henriksen [EMAIL PROTECTED]
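[Editorial sketch] Gordon's thread-safety argument--seen state carried privately by each traversal rather than flagged on the objects themselves--can be demonstrated directly. The graph shapes and names below are invented for illustration.

```python
import threading

# Each traversal owns a private seen set, so two threads can walk
# overlapping graphs concurrently. A "visited" flag stored *on the
# object* would let one traversal hide the shared node from the other.
def traverse(node, seen, out):
    if id(node) in seen:
        return                    # cycle/diamond guard, local to this walk
    seen.add(id(node))
    out.append(node["name"])
    for child in node["kids"]:
        traverse(child, seen, out)

shared = {"name": "shared", "kids": []}   # node reachable from both graphs
g1 = {"name": "g1", "kids": [shared]}
g2 = {"name": "g2", "kids": [shared]}

r1, r2 = [], []
t1 = threading.Thread(target=traverse, args=(g1, set(), r1))
t2 = threading.Thread(target=traverse, args=(g2, set(), r2))
t1.start(); t2.start()
t1.join(); t2.join()

# Both traversals reached the shared node exactly once.
assert r1.count("shared") == 1 and r2.count("shared") == 1
```

The private table costs memory per traversal, but nothing at all when no traversal is running, which is Gordon's "paid for continually" point in reverse.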
Re: [RfC] vtable-dump
At 8:56 PM -0600 9/4/03, Luke Palmer wrote: Gordon Henriksen writes: What you're suggesting also has significant side-effects: It halts hypothetical multithreaded programs, suspends DoD, prevents the traversal mechanism from calling back into parrot code, requires fine-grained locks which are extremely expensive and have been summarily abandoned to great acclaim in all prior works... and for that, still doesn't provide a useful form of thread safety in the general case anyhow. Is there a problem with providing a mechanism which would suspend all threads except for the current one, as to ensure that the serialize operates, er, serially. Could it be implemented with a top-priority event post that acquires a global lock? The biggest problem with a lock of this sort is that it requires all the interpreters to actually *check* it to be meaningful. (Not that there aren't other problems--there are, quite a number of them--but this is the big one) Locks nobody gets aren't useful, and for this one to be useful we'd need to have some assurance that the other threads were looking at it regularly, and that those threads were all paused when we wanted to do our stuff. You end up with either long delays waiting, or a lot of contention on a single global lock. Neither is particularly good. -- Dan --it's like this--- Dan Sugalski even samurai [EMAIL PROTECTED] have teddy bears and even teddy bears get drunk
Re: [RfC] vtable-dump
Nicholas Clark [EMAIL PROTECTED] wrote: On Fri, Sep 05, 2003 at 09:34:04AM -0400, Dan Sugalski wrote: On Fri, 5 Sep 2003, Peter Haworth wrote: With the seen hash approach, I wouldn't expect the hash itself to take nearly as much space as the frozen structure; My experience tracing variable graphs in Perl 5 indicates otherwise. I think it may be worth listening to Dan on this one. I don't know why Storable.xs is using a general HV for mapping an address to an integer (more isn't needed to serialize all IMHO - Dan's next_for_GC approach doesn't provide more info). Perl5 does convert the address to a string and uses all weirdness of hv.c code. This isn't necessary for this special kind of problem. Parrot has some DOD counters for possibly active PMCs. A simple and probably fixed-size[1] hash based on such a count with just a mapping of {pmc => id} takes 12 bytes per entry on 32-bit architectures and it's really fast. [1] no DOD interference here: just malloc'ed. leo
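[Editorial sketch] Leo's 12-bytes-per-entry figure is consistent with an entry of three 32-bit words: the PMC pointer, the id assigned to it, and a next link for bucket chaining. That layout is an assumption reconstructed from the sizes in the post, not taken from Parrot source; the sketch just verifies the arithmetic.

```python
import struct

# Hypothetical 32-bit entry behind the "12 bytes per entry" figure:
# a PMC address, its assigned id, and a next-entry index for the chain.
# Field order is assumed for illustration only.
ENTRY = struct.Struct("<III")            # 3 x 4-byte unsigned = 12 bytes
assert ENTRY.size == 12

packed = ENTRY.pack(0xDEADBEEF, 1, 0)    # one {pmc => id} pair, chain end
pmc, pmc_id, nxt = ENTRY.unpack(packed)
assert (pmc, pmc_id, nxt) == (0xDEADBEEF, 1, 0)
```

Compare Perl 5's HV route, where each entry drags in an HE, a shared key string for the stringified address, and an SV for the value--the overhead Leo is arguing against.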
Re: [RfC] vtable-dump
On Fri, Sep 05, 2003 at 09:34:04AM -0400, Dan Sugalski wrote: On Fri, 5 Sep 2003, Peter Haworth wrote: With the seen hash approach, I wouldn't expect the hash itself to take nearly as much space as the frozen structure; My experience tracing variable graphs in Perl 5 indicates otherwise. I think it may be worth listening to Dan on this one. Appended patch to Storable implements an option to count the size of the seen hashes. (There seem to be two, one for objects, one for classes.) It's disabled by default - compile Storable with -DDEBUGME or change the #if 0 at line 23 of Storable.xs to enable. You'll also need Devel::Size installed. Taking a relatively complex structure, such as the POSIX exports:

$ /usr/local/bin/perl5.8.0 -Mblib -MStorable -MPOSIX -MDevel::Size -le \
  'package POSIX; foreach (\@EXPORT, \@EXPORT_OK, \%EXPORT_TAGS) {
     print Devel::Size::total_size $_;
     print length Storable::freeze $_;
     print $Storable::SEENSIZE;
     print "" }'

reformatted, this gives:

              size in memory  size frozen  seen hash size
@EXPORT                22680         5477           19068
@EXPORT_OK              1943          422            1864
%EXPORT_TAGS           23686         5963           20472

So it seems that the seen hash is about 3 to 4 times the size of the frozen object, and comparable in size to the original structure. I was surprised by these results - I had taken the common sense view that this hash would be small. I'm wondering if a documented version of this patch should go into Storable. It could be useful for experimenting.

Nicholas Clark

--- Storable.pm.orig	2003-07-29 08:06:35.0 +0100
+++ Storable.pm	2003-09-07 22:53:24.0 +0100
@@ -361,6 +361,19 @@ sub thaw {
 	return $self;
 }
 
+# We take advantage of autoload - this reference to SEENSIZE won't get noticed
+# by perl unless we're called, which doesn't happen unless the user has
+# already made reference to the variable.
+sub _add_seensize {
+    undef $Storable::SEENSIZE;
+    eval {
+	require Devel::Size;
+	foreach (@_) {
+	    $SEENSIZE += Devel::Size::size ($_) if defined $_;
+	}
+    };
+    warn $@ if $@;
+}
 1;
 __END__
--- Storable.xs.orig	2003-09-05 19:42:41.0 +0100
+++ Storable.xs	2003-09-07 23:05:28.0 +0100
@@ -1181,10 +1181,6 @@ static void init_store_context(
 	 *
 	 * It is reported fixed in 5.005, hence the #if.
 	 */
-#if PERL_VERSION >= 5
-#define HBUCKETS	4096		/* Buckets for %hseen */
-	HvMAX(cxt->hseen) = HBUCKETS - 1;	/* keys %hseen = $HBUCKETS; */
-#endif
 
 	/*
 	 * The `hclass' hash uses the same settings as `hseen' above, but it is
@@ -1197,7 +1193,18 @@ static void init_store_context(
 	cxt->hclass = newHV();		/* Where seen classnames are stored */
 
 #if PERL_VERSION >= 5
-	HvMAX(cxt->hclass) = HBUCKETS - 1;	/* keys %hclass = $HBUCKETS; */
+#ifdef DEBUGME
+	/* Only pre-stretch the buckets if we're not interested in sizes.
+	   You may want to tweak this to still stretch the buckets even
+	   if we are interested in size.  */
+	if (!perl_get_sv("Storable::SEENSIZE", FALSE))
+#endif
+	{
+#define HBUCKETS	4096		/* Buckets for %hseen */
+
+		HvMAX(cxt->hseen) = HBUCKETS - 1;	/* keys %hseen = $HBUCKETS; */
+		HvMAX(cxt->hclass) = HBUCKETS - 1;	/* keys %hclass = $HBUCKETS; */
+	}
 #endif
 
 	/*
@@ -1250,6 +1257,28 @@ static void clean_store_context(stcxt_t
 		HeVAL(he) = PL_sv_undef;
 	}
 
+#ifdef DEBUGME
+	if (perl_get_sv("Storable::SEENSIZE", FALSE)) {
+		/* User would like a count of the seen size.  */
+		dSP;
+
+		ENTER;
+		SAVETMPS;
+
+		PUSHMARK(SP);
+		XPUSHs(cxt->hseen ? sv_2mortal(newRV_inc((SV *)cxt->hseen))
+				  : &PL_sv_undef);
+		XPUSHs(cxt->hclass ? sv_2mortal(newRV_inc((SV *)cxt->hclass))
+				   : &PL_sv_undef);
+		PUTBACK;
+
+		call_pv("Storable::_add_seensize", G_DISCARD);
+
+		FREETMPS;
+		LEAVE;
+	}
+#endif
+
 	/*
 	 * And now dispose of them...
 	 *
Re: [RfC] vtable-dump
On Thu, Sep 04, 2003 at 08:43:50AM -0500, Garrett Goebel wrote: Leopold Toetsch wrote: Garrett Goebel [EMAIL PROTECTED] wrote: As I was googling to see how other people have approached these problems I ran across: http://blogs.gotdotnet.com/cbrumme/CategoryView.aspx/CLR What did you find there, besides a description that a certain company is producing b0rken software ;-) The thoughts of a CLR developer on implementation issues: mistakes made, lessons learned, nasty things to keep in mind, etc. Discussions of lock-free thread-safe coding issues, balancing the CLR's memory model with cpu memory models, GC, finalization, exceptions, MP, NUMA, JIT, security policies, managed code, typing, shared vtables, marshaling, serialization. But not hugely indexed and with a lot of Win32 specific detail (Although there are section headers). For example the term lock-free seems to occur three times, and none seem to be discussions on how to go about writing lock-free code, merely references to it. (Or did you mean that we should read some of the comments too?) Are there specific permalinks of blog entries that discuss individual items in your list above, to save us all trying to wade through masses of text? Nicholas Clark
Re: [RfC] vtable-dump
On Thursday, September 4, 2003, at 10:56 , Luke Palmer wrote: Is there a problem with providing a mechanism which would suspend all threads except for the current one, as to ensure that the serialize operates, er, serially. Could it be implemented with a top-priority event post that acquires a global lock? Problem? Well, it is antithetical to threading... Imagine a massively multithreaded server program. Stopping the entire thing to so much as dump(undef)? The overhead of halting the threads in predictable states far exceeds the cost of the serialization itself. A server which ever availed itself of dump would become unable to saturate a multiprocessor system, and most likely unable to take advantage of a uniprocessor as well. To serialize the entire interpreter (including thread state and continuations), entering exclusive execution by halting all threads is probably necessary. In any other case? Ill-advised... Gordon Henriksen [EMAIL PROTECTED]
RE: [RfC] vtable-dump
On Thu, 4 Sep 2003 09:02:18 -0400 (EDT), Dan Sugalski wrote: On Wed, 3 Sep 2003, Gordon Henriksen wrote: A seen hash is also threadsafe, can be interrupted by DoD, and is safe for recursion. (By threadsafe, I mean that unchanged data structures can be safely serialized from multiple threads without ill effect or possibility of deadlock.) While a seen hash is DOD-interruptable (which, admittedly, the scheme I'm preferring isn't, or is with a *lot* of care) it's also slower and requires a lot more resources when running. If you're freezing a large graph of PMCs, that's going to take up a lot of memory, however you do it (other than streaming to disk). What happens when you run out, and need to garbage collect? With your scheme, this is a big problem, since as you say, it's either not possible or needs a lot of care. Actually, it seems like it would need unfeasible amounts of care to me, such as recording which PMCs are flagged so the flags can be restored after DOD. Then where's the benefit over a seen hash? With the seen hash approach, I wouldn't expect the hash itself to take nearly as much space as the frozen structure; a GC run in the middle of freezing should only be a problem if you run out of *unreclaimable* memory. -- Peter Haworth [EMAIL PROTECTED] Nothing in the definition of the word `word' says that a word has to be in a dictionary to be called one. -- Anu Garg
RE: [RfC] vtable-dump
On Fri, 5 Sep 2003, Peter Haworth wrote: On Thu, 4 Sep 2003 09:02:18 -0400 (EDT), Dan Sugalski wrote: On Wed, 3 Sep 2003, Gordon Henriksen wrote: A seen hash is also threadsafe, can be interrupted by DoD, and is safe for recursion. (By threadsafe, I mean that unchanged data structures can be safely serialized from multiple threads without ill effect or possibility of deadlock.) While a seen hash is DOD-interruptable (which, admittedly, the scheme I'm preferring isn't, or is with a *lot* of care) it's also slower and requires a lot more resources when running. If you're freezing a large graph of PMCs, that's going to take up a lot of memory, however you do it (other than streaming to disk). What happens when you run out, and need to garbage collect? With your scheme, this is a big problem, since as you say, it's either not possible or needs a lot of care. Note that using the pointers in the PMCs means *DOD* must be disabled, not *GC*. Doing a GC run to reclaim memory is perfectly fine, though there still is the potential to have large amounts of memory tied up in dead but not reaped Pobj things. With the seen hash approach, I wouldn't expect the hash itself to take nearly as much space as the frozen structure; My experience tracing variable graphs in Perl 5 indicates otherwise. Dan
RE: [RfC] vtable-dump
What's a dump stream actually going to look like? E.g.,

    $a = [1, 2, 3];
    print dump($a);

    1: PMC PerlArray
    2: 3              # element count
    3: PMC PerlInt    # [0]
    4: 1              # value
    5: PMC PerlInt    # [1]
    6: 2              # value
    7: PMC PerlInt    # [2]
    8: 3              # value

Fine and good. But dump must handle object graphs, which can be recursive or self-referential or whatnot:

    $a = { b => undef };
    $b = { a => $a };
    $a->{b} = $b;
    print dump($a);

    1: PMC PerlHash
    2: 1              # key-value count
    3: PMC PerlString # keys[0]
    4: b              # value
    5: PMC PerlHash   # values[0]
    6: 1              # key-value count
    7: PMC PerlString # keys[0]
    8: a              # value
    9: !!!

Panic! Dump needs to refer to line 4 again, or else recurse! How? Most serializers use a seen hash not only to indicate that an object has been seen, but also to remember a name or number for that object. If dump had remembered $a = line 4, it could finish off with something like...

    9: Again 4        # values[0]

A seen hash is also threadsafe, can be interrupted by DoD, and is safe for recursion. (By threadsafe, I mean that unchanged data structures can be safely serialized from multiple threads without ill effect or possibility of deadlock.) -- Gordon Henriksen IT Manager ICLUBcentral Inc. [EMAIL PROTECTED]
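The seen-hash idea above can be sketched in Python (hypothetical names; real Parrot would do this in C against PMC vtables). The seen hash maps each object to the output line where its record began, so a revisit emits an "Again" back-reference instead of recursing; this sketch numbers from the record's first line, so the circular hash example resolves to "Again 1":

```python
def dump_pmc(obj, out=None, seen=None):
    """Toy seen-hash dumper: first visit records the 1-based output line
    at which an object's record starts; later visits emit 'Again <line>'."""
    if out is None:
        out = []
    if seen is None:
        seen = {}
    if id(obj) in seen:
        out.append("Again %d" % seen[id(obj)])   # back-reference, no recursion
        return out
    if isinstance(obj, dict):
        seen[id(obj)] = len(out) + 1             # remember where this record begins
        out.append("PMC PerlHash")
        out.append(str(len(obj)))                # key-value count
        for k, v in obj.items():
            dump_pmc(k, out, seen)
            dump_pmc(v, out, seen)
    elif isinstance(obj, str):
        out.append("PMC PerlString")
        out.append(obj)
    else:
        out.append("PMC PerlInt")
        out.append(str(obj))
    return out
```

Because the seen hash is a local variable of one dump, two threads can dump the same unchanged structure at once without stepping on each other's bookkeeping.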
Re: [RfC] vtable-dump
Ok, this thread has now 45+ entries ;-) so I'll try to summarize the proposed PMC traversal schemes.

0) As the subject still applies, we want to be able to dump() or pretty_print() a PMC (and possibly aggregate members of that). This turns out to be a special case of a deep traversal over nested PMCs of any kind, which might contain self-references and the like.

1) There are more such traverse-like operations, namely DOD, clone(), freeze(). thaw() is different, as we run over the frozen byte-stream and regenerate the PMC structures from that.

2) Proposals for DOD, freeze(), clone(), dump():

2a) Implement clone() and dump() on top of freeze(): clone = freeze + thaw; dump() is generated in the debugger sub-system.

2b) A general traverse() vtable that calls freeze/clone/dump vtables. DOD is a separate traversal (mark).[1]

2c) A more general traverse(), which is our current DOD mark() functionality using next_for_GC.[2]

    Comparison                                     2a    2b    2c
    clone needs intermediate frozen string[3]      yes   -     -
    clone involves 2 steps                         yes   -     -
    dump has to know PMCs internals[4]             yes   -     -
    duplicates recognition overhead                Hash  Hash  -
    Traversals per aggregate for clone/freeze[5]   1     1     3
    Thread-safe[6]                                 yes   yes   -
    Increases PMC size[7]                          -     -     yes
    Adds complexity to DOD[8]                      -     -     yes
    Cache friendly[9]                              -     yes   -
    Code duplication[10]                           -     -     yes

[1] I consider Benjamin's proposal a generalization of this for now, and a mixture of 2a) and 2b) (clone = freeze/thaw).
[2] mark() as of Parrot 0.0.6; each PMC gets on the next_for_GC list.
[3] e.g. for clone(Array[1]) the intermediate frozen image is first generated in memory, then scanned again during thaw().
[4] what about dynamically loaded PMCs?
[5] 2c): 1. mark() per PMC, 2. clone() or freeze(), 3. clear next_for_GC. mark() is called inside clone/freeze again, which additionally kills data cache coherency.
[6] While none of the schemes is thread-safe from user level (e.g. manually sorting an array containing shared PMCs while it gets frozen), 2c) isn't thread-safe at low level, as the next_for_GC pointer inside the PMC is used as a duplicate marker. But if a user changes shared resources, that's a user problem. We only guarantee atomic updates per PMC (s. P6E p. 86f by Dan).
[7] next_for_GC (and others, like properties) are in the PMC_EXT structure now. Plain PMC scalars don't contain other PMCs, so they can be marked live directly by setting just their live_flag.
[8] Additionally DOD has some shortcuts for marking, e.g. a PMC pointing to another or to an array of PMCs.
[9] 2a): the intermediate frozen image for clone(); 2c): 3 passes over aggregates; each PMC gets pulled in during DOD.
[10] 2a): mark + freeze; 2b): mark + traverse; 2c): all traverse functions duplicated.

This is of course my personal POV. If anything is wrong above please correct me, as well as WRT missing issues. My major concerns are regarding footnotes [3], [5], [7], [8], and [9]. These haven't been commented on/answered yet at all. Thanks for reading 'til here ;-) leo
Re: [RfC] vtable-dump
Gordon Henriksen [EMAIL PROTECTED] wrote: Panic! Dump needs to refer to line 4 again, or else recurse! How? Don't Panic! Here is the point where either next_for_GC or the seen hash is used. The former is/was used to mark (visit) PMCs: if they have been visited, they have that pointer set; else not. This can be used during serialization. The latter is in my proposal. A seen hash is also threadsafe, can be interrupted by DoD, and is safe for recursion. (By threadsafe, I mean that unchanged data structures can be safely serialized from multiple threads without ill effect or possibility of deadlock.) Yes. That's what I'm saying. leo
Re: [RfC] vtable-dump
Garrett Goebel [EMAIL PROTECTED] wrote: As I was googling to see how other people have approached these problems I ran across: http://blogs.gotdotnet.com/cbrumme/CategoryView.aspx/CLR What did you find there, besides a description that a certain company is producing b0rken software ;-) leo
RE: [RfC] vtable-dump
Leopold Toetsch wrote: Garrett Goebel [EMAIL PROTECTED] wrote: As I was googling to see how other people have approached these problems I ran across: http://blogs.gotdotnet.com/cbrumme/CategoryView.aspx/CLR What did you find there, besides a description that a certain company is producing b0rken software ;-) The thoughts of a CLR developer on implementation issues: mistakes made, lessons learned, nasty things to keep in mind, etc. Discussions of lock-free thread-safe coding issues, balancing the CLR's memory model with cpu memory models, GC, finalization, exceptions, MP, NUMA, JIT, security policies, managed code, typing, shared vtables, marshaling, serialization. -- Garrett Goebel IS Development Specialist ScriptPro Direct: 913.403.5261 5828 Reeds Road Main: 913.384.1008 Mission, KS 66202 Fax: 913.384.2180 www.scriptpro.com garrett at scriptpro dot com
Re: [RfC] vtable-dump
Gordon Henriksen writes: What you're suggesting also has significant side-effects: It halts hypothetical multithreaded programs, suspends DoD, prevents the traversal mechanism from calling back into parrot code, requires fine-grained locks which are extremely expensive and have been summarily abandoned to great acclaim in all prior works... and for that, still doesn't provide a useful form of thread safety in the general case anyhow. Is there a problem with providing a mechanism which would suspend all threads except for the "current" one, as to ensure that the serialize operates, er, serially. Could it be implemented with a top-priority event post that acquires a global lock? Forgive my ignorance. I'm pretty naïve when it comes to threads. Luke
RE: [RfC] vtable-dump
Luke Palmer: # Is there a problem with providing a mechanism which would suspend all # threads except for the current one, as to ensure that the serialize # operates, er, serially. Could it be implemented with a top-priority # event post that acquires a global lock? What about the thread that draws the GUI progress bar so people don't think the computer froze up while serializing 42gb of data? Or worse, the thread that handles the cancel button? --Brent Dax [EMAIL PROTECTED] Perl and Parrot hacker Yeah, and my underwear is flame-retardant--that doesn't mean I'm gonna set myself on fire to prove it.
Re: [RfC] vtable-dump
Dan Sugalski [EMAIL PROTECTED] wrote: At 8:38 PM +0200 9/2/03, Leopold Toetsch wrote: [ next_for_GC ] That's easy, then, as it is. :) Besides, from looking at pobj.h, it's already there, so I'm not sure what we'd need to do. Perhaps I need to resync. The PMC_EXT structure is only allocated for PMCs that access PMC_data(). Adding properties adds this structure on the fly. Plain PerlScalars don't have the PMC_EXT structure and therefore no next_for_GC pointer. As these scalars can't refer to other PMCs, there is no need to visit these twice in the DOD run. They are marked by setting their live flag directly. Done. ... We don't care about the live flag for this; what we care about is that the PMC is put on the traversal list. You are right, live is live. When you have a PerlArray containing 100,000 plain PerlInts, you have your linked list with 100,001 entries or one entry. You have polluted caches or not - that's it (I'm speaking of DOD here). So you shortcut in the normal course of DOD? Sure, as described above. ... I suppose--I presume you've benchmarked to see if testing all the flags of the contained PMCs when marking the container is faster than just throwing them on the list and checking them when you get to them? Yes I did benchmark this. The main problems were data cache misses and PMC size. Putting the PMC on the next list pulls the whole structure into the cache. Just setting the live_flag needs half a byte, which is located in the PMC's arena. ... I can see that being the case in some cases, but I'm concerned that it'll consume more memory and take more time in some reasonably common cases. For small sets of PMCs, this is equally fast as before. It trades some extra cycles (calculate the arena for a PMC) against cache pollution. As CPUs get faster and the more data you are processing, the more this scheme wins.
And putting the next_for_GC back into the main PMC structure accounts for +20% memory overhead for almost all PMCs (I expect the vast majority to be plain scalars w/o properties). And I still don't see how you will handle duplicate lookup of self-referential structures for freeze and thaw with a linked list of items alone. The same way the DOD sweep handles it, unless you've changed things beyond recognition. The way it used to work is that when you marked a PMC as live, it got put on the end of the chain unless the PMC's next pointer field is non-NULL, in which case it's already been put on the list, and nothing happens. I see ... It used to work that way in 0.0.6. I'm not sure if I changed that, or it happened during the split of resources.c. Anyway, at that time we had mark_used() for marking PMCs and buffer_lives() for marking Buffers. These are united (my changes ;-), there is only one pobject_lives(), which of course could put PMCs on the next-list again if we really want that. Ok, then back to your proposed scheme for freeze (and with nested and big structures in mind):

1) call mark() on the PMC - that is one walk through the aggregate
2) call freeze() on the PMC - i.e. a second walk through the same aggregate
3) if inside 2) we find a PMC: if it has next_for_GC set, it's a duplicate; else call mark() on it. This pulls in a potentially big aggregate and pollutes caches.
4) get item from next list, goto 2)
5) reset next_for_GC for all PMCs

This means we have 2 alternating runs through the same aggregates, interspersed with mark runs through other aggregates. Then we have another run for all PMCs processed, resetting the next_for_GC pointer. 5) is for free on DOD runs (we walk the arenas anyway to destroy dead objects). It's not free for freeze et al. This all is for sure cache unfriendly. Then: we want a solution for clone(), dump() (or pretty_print()), freeze(), and thaw().
That means we would need 3 more routines in aggregates that all do pretty much the same thing: walk through the aggregate's items and call mark() for PMCs, then do something. thaw() is different anyway. One general traverse() vtable is just the simpler interface IMHO. Running through each aggregate just once plus updating a seen_hash seems simpler and faster to me and doesn't have any negative impact on PMC size and DOD speed. leo
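The one-pass traverse() Leo proposes can be sketched in Python (hypothetical helpers; Parrot would do this in C with a vtable, and `children()` stands in for each aggregate's item walk). The key property is that the seen hash lives in a local variable of the traversal rather than in the PMC, so the walk is reentrant and does not disturb DOD flags:

```python
def children(pmc):
    """Hypothetical child enumeration: dicts and lists stand in for
    aggregate PMCs; everything else is a plain scalar."""
    if isinstance(pmc, dict):
        return [x for kv in pmc.items() for x in kv]
    if isinstance(pmc, (list, tuple)):
        return list(pmc)
    return []

def traverse(pmc, visit, seen=None):
    # One depth-first pass; duplicate detection via a seen hash kept in
    # an automatic variable, not via flags inside the PMC, so concurrent
    # traversals cannot clobber each other's bookkeeping.
    if seen is None:
        seen = set()
    if id(pmc) in seen:
        visit(pmc, duplicate=True)   # emit a back-reference, don't descend
        return
    seen.add(id(pmc))
    visit(pmc, duplicate=False)
    for child in children(pmc):
        traverse(child, visit, seen)
```

freeze(), dump(), and clone() would then just be different `visit` callbacks over the same single pass. (Using `id()` on immutable scalars is a sketch simplification; Parrot would key on the PMC's address.)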
Re: [RfC] vtable-dump
Benjamin Goldberg [EMAIL PROTECTED] wrote: Leopold Toetsch wrote: [ optimize DOD before thaw/clone ] Even with a depth restriction, a recursive estimate can produce misleading results due to circular references. Only actually walking the structure can get the number right. However, walking the structure *twice* would be silly. So, begin with an estimate of unknown (serialize an integer -1), and then, after the whole thing has been frozen, we seek backwards (if that's possible) and replace that unknown with the actual number of pmcs that were serialized. Yep. That's good for thaw. Doing clone by freeze/thaw OTOH doesn't help here. Neither freezing everything first (yielding an exact count - but generating the whole intermediate frozen image) nor freeze/thaw per PMC (with smaller images) does help here. Maybe we can get some statistics during a DOD run and keep that in an aggregate's header. leo
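Benjamin's backpatching idea is a standard stream trick and can be sketched in Python (hypothetical function name; the integer payloads stand in for real frozen PMC records): write a -1 placeholder for the count, freeze everything, then seek back and overwrite the placeholder with the number actually serialized.

```python
import io
import struct

def freeze_with_count(pmcs):
    """Emit a placeholder count, serialize, then backpatch the count."""
    out = io.BytesIO()
    count_pos = out.tell()
    out.write(struct.pack("<i", -1))       # placeholder: count unknown
    n = 0
    for pmc in pmcs:
        out.write(struct.pack("<i", pmc))  # stand-in for a real freeze()
        n += 1
    end = out.tell()
    out.seek(count_pos)
    out.write(struct.pack("<i", n))        # backpatch the real count
    out.seek(end)
    return out.getvalue()
```

As the thread notes, this only helps thaw() (which can pre-allocate headers from the count); it requires a seekable target, so a pure streaming freeze would have to keep the -1.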
Re: [RfC] vtable-dump
On Wed, 3 Sep 2003 09:24:22 +0200, Leopold Toetsch wrote: One general traverse() vtable is just the simpler interface IMHO. Running through each aggregate just once plus updating a seen_hash seems simpler and faster to me and doesn't have any negative impact on PMC size and DOD speed. And also makes it threadsafe? If a shared PMC is frozen simultaneously by two threads under Dan's scheme, will the right thing happen? I can't help being worried about interactions between any two operations which use flags on the PMCs to mark their progress (such as freezing and DOD, or two freezes). Bear in mind that I have never used threads and can't remember how they work in parrot, so this may be an irrelevant question. -- Peter Haworth [EMAIL PROTECTED] You're not going to watch the eclipse in yesterday's underpants?
Re: [RfC] vtable-dump
Peter Haworth [EMAIL PROTECTED] wrote: On Wed, 3 Sep 2003 09:24:22 +0200, Leopold Toetsch wrote: One general traverse() vtable is just the simpler interface IMHO. Running through each aggregate just once plus updating a seen_hash seems simpler and faster to me and doesn't have any negative impact on PMC size and DOD speed. And also makes it threadsafe? If a shared PMC is frozen simultaneously by two threads under Dan's scheme, will the right thing happen? I can't help being worried about interactions between any two operations which use flags on the PMCs to mark their progress (such as freezing and DOD, or two freezes). Good question. We don't have threads yet; they might exist somewhere inside Dan's brain, though. What I know (or think I know) is: as in Dan's scheme next_for_GC is inside the PMC and used as flag/id for freeze, this can't be threadsafe (if one thread has visited a PMC, its next_for_GC will be set, and a second thread would write just the PMC id, because the next_for_GC isn't cleared). So all these operations (clone too) would have to acquire a global lock first AFAIK. For my scheme I don't see *this* kind of problem (all state is kept in automatic variables in the interpreter). Albeit I don't like to know what happens when a shared aggregate gets sorted on one side and the other is freezing it in the meanwhile. leo
Re: [RfC] vtable-dump
- Original Message - From: Leopold Toetsch [EMAIL PROTECTED] To: Peter Haworth [EMAIL PROTECTED] Cc: [EMAIL PROTECTED] Sent: Wednesday, September 03, 2003 4:38 PM Subject: Re: [RfC] vtable-dump Peter Haworth [EMAIL PROTECTED] wrote: On Wed, 3 Sep 2003 09:24:22 +0200, Leopold Toetsch wrote: One general traverse() vtable is just the simpler interface IMHO. Running through each aggregate just once plus updating a seen_hash seems simpler and faster to me and doesn't have any negative impact on PMC size and DOD speed. And also makes it threadsafe? If a shared PMC is frozen simultaneously by two threads under Dan's scheme, will the right thing happen? I can't help being worried about interactions between any two operations which use flags on the PMCs to mark their progress (such as freezing and DOD, or two freezes). Good question. We don't have threads yet; they might exist somewhere inside Dan's brain, though. What I know (or think I know) is: as in Dan's scheme next_for_GC is inside the PMC and used as flag/id for freeze, this can't be threadsafe (if one thread has visited a PMC, its next_for_GC will be set, and a second thread would write just the PMC id, because the next_for_GC isn't cleared). So all these operations (clone too) would have to acquire a global lock first AFAIK. For my scheme I don't see *this* kind of problem (all state is kept in automatic variables in the interpreter). Albeit I don't like to know what happens when a shared aggregate gets sorted on one side and the other is freezing it in the meanwhile. leo just a thought: aren't shared objects/data/whatever protected by a mutex? To me, that's the obvious solution for accessing shared data. As I said, just a thought. Klaas-Jan
Re: [RfC] vtable-dump
K Stol [EMAIL PROTECTED] wrote: just a thought: aren't shared objects/data/whatever protected by a mutex? To me, that's the obvious solution for accessing shared data. As I said, just a thought. They will be protected by a mutex, yes. That helps against two threads updating the same PMC at the same time. But if one thread sets pmc->next_for_GC, the next thread might encounter this PMC later (albeit for this thread at the first time) and think it has visited this PMC already, and just flush out the object ID and not the whole PMC. Klaas-Jan leo
Re: [RfC] vtable-dump
On Wed, 3 Sep 2003, Leopold Toetsch wrote: K Stol [EMAIL PROTECTED] wrote: just a thought: aren't shared objects/data/whatever protected by a mutex? To me, that's the obvious solution for accessing shared data. As I said, just a thought. They will be protected by a mutex, yes. That helps against two threads updating the same PMC at the same time. But if one thread sets pmc->next_for_GC, the next thread might encounter this PMC later (albeit for this thread at the first time) and think it has visited this PMC already, and just flush out the object ID and not the whole PMC. The freezing routine can just lock the PMCs as it freezes them. As long as it keeps the original chain head around, it can then just walk the list again and unlock them. Dan
Re: [RfC] vtable-dump
Dan Sugalski [EMAIL PROTECTED] wrote: On Wed, 3 Sep 2003, Leopold Toetsch wrote: But if one thread sets pmc->next_for_GC, the next thread might encounter this PMC later (albeit for this thread at the first time) and think it has visited this PMC already, and just flush out the object ID and not the whole PMC. The freezing routine can just lock the PMCs as it freezes them. As long as it keeps the original chain head around, it can then just walk the list again and unlock them. Then we might get deadlocks. Dan leo
Re: [RfC] vtable-dump
On Wed, 3 Sep 2003, Leopold Toetsch wrote: Dan Sugalski [EMAIL PROTECTED] wrote: On Wed, 3 Sep 2003, Leopold Toetsch wrote: But if one thread sets pmc-next_for_GC the next thread might encounter this PMC later (albeit for this thread at the first time) and think, I've visited this PMC already and just flush out the object ID and not the whole PMC. The freezing routine can just lock the PMCs as it freezes them. As long as it keeps the original chain head around, it can then just walk the list again and unlock them. Then we might get deadlocks. This is always a possibility, and there isn't anything we can do about it at the interpreter level. Deadlocks are an unfortunate fact of life in a threaded environment. Dan
Re: [RfC] vtable-dump
On Wed, Sep 03, 2003 at 02:07:10PM -0400, Dan Sugalski wrote: On Wed, 3 Sep 2003, Leopold Toetsch wrote: Dan Sugalski [EMAIL PROTECTED] wrote: On Wed, 3 Sep 2003, Leopold Toetsch wrote: But if one thread sets pmc->next_for_GC, the next thread might encounter this PMC later (albeit for this thread at the first time) and think it has visited this PMC already, and just flush out the object ID and not the whole PMC. The freezing routine can just lock the PMCs as it freezes them. As long as it keeps the original chain head around, it can then just walk the list again and unlock them. Then we might get deadlocks. This is always a possibility, and there isn't anything we can do about it at the interpreter level. Deadlocks are an unfortunate fact of life in a threaded environment. Can we at least detect this class of deadlock? (which I think is where two threads are simultaneously freezing, but walk the shared objects in a different order, so that thread 0 has already locked object A and is waiting on B, while thread 1 has B locked and is waiting on A) If so, does that help? Is this a stupid question which demonstrates my lack of knowledge about threads? Nicholas Clark
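The usual alternative to detecting the deadlock Nicholas describes is to make it impossible: every freezer acquires its locks in one canonical order, e.g. sorted by the PMC's address. A minimal Python sketch of that discipline (the `PMC` class and `id()`-as-address are assumptions of the sketch, not Parrot's API):

```python
import threading

class PMC:
    """Hypothetical stand-in for a shared PMC carrying its own mutex."""
    def __init__(self):
        self.lock = threading.Lock()

def lock_all(pmcs):
    # Acquire every lock in one canonical order (by id(), standing in for
    # the PMC's address). Two concurrent freezes that share PMCs then
    # always contend on the same lock first, so the A-waits-B / B-waits-A
    # cycle cannot form.
    ordered = sorted(pmcs, key=id)
    for pmc in ordered:
        pmc.lock.acquire()
    return ordered

def unlock_all(ordered):
    # Release in reverse acquisition order.
    for pmc in reversed(ordered):
        pmc.lock.release()
```

This only prevents freeze-vs-freeze deadlock; it requires knowing the lock set up front, which is exactly the prescan problem Dan raises later in the thread.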
Re: [RfC] vtable-dump
Dan Sugalski [EMAIL PROTECTED] wrote: On Wed, 3 Sep 2003, Leopold Toetsch wrote: Then we might get deadlocks. This is always a possibility, and there isn't anything we can do about it at the interpreter level. Deadlocks are an unfortunate fact of life in a threaded environment. clone() isn't supposed to deadlock, nor freeze(), nor dump(). They officially do not change (or shouldn't change) anything in the PMCs they operate on. These methods just passively iterate over PMCs. To work around that, we would have to implement shared PMCs as:

- first COWed (updating next_for_GC doesn't change this state)
- if any thread updates the PMC, unCOW the PMCs and
- communicate such changes to other threads via events - horrid.

Dan, that clone/freeze/dump... scheme doesn't work. Dan leo
Re: [RfC] vtable-dump
On Wed, 3 Sep 2003, Leopold Toetsch wrote: Dan Sugalski [EMAIL PROTECTED] wrote: On Wed, 3 Sep 2003, Leopold Toetsch wrote: Then we might get deadlocks. This is always a possibility, and there isn't anything we can do about it at the interpreter level. Deadlocks are an unfortunate fact of life in a threaded environment. clone() isn't supposed to deadlock, nor freeze(), nor dump(). They officially do not change (or shouldn't change) anything in the PMCs they operate on. These methods just passively iterate over PMCs. It's impossible to have clone in a threaded environment guarantee that it won't deadlock. Can't do it. If there's more than one shared PMC in the set of PMCs that need freezing, then there exists the possibility of deadlock. You can't prescan the PMC list either, since you can't guarantee that shared PMCs won't change between the time of the scan and the actual clone. (Unless you lock, and then you're running the risk of deadlock) There's no such thing as passively iterating over shared PMCs. Dan
RE: [RfC] vtable-dump
On Wed, 3 Sep 2003, Brent Dax wrote: Dan Sugalski: # It's impossible to have clone in a threaded environment guarantee that it # won't deadlock. If I recall correctly, all shared variables are owned by a single shared interpreter. Nope. Shared variables are owned by their creating interpreter. We could make them all owned by a single shared interpreter, but then you get into issues with active data's home interpreters, loaded and accessible libraries, and other stuff. Big nasty mess. Dan
RE: [RfC] vtable-dump
Leopold Toetsch wrote: Dan Sugalski [EMAIL PROTECTED] wrote: On Wed, 3 Sep 2003, Leopold Toetsch wrote: Then we might get deadlocks. This is always a possibility, and there isn't anything we can do about it at the interpreter level. Deadlocks are an unfortunate fact of life in a threaded environment. As I was googling to see how other people have approached these problems I ran across: http://blogs.gotdotnet.com/cbrumme/CategoryView.aspx/CLR -- Garrett Goebel IS Development Specialist ScriptPro Direct: 913.403.5261 5828 Reeds Road Main: 913.384.1008 Mission, KS 66202 Fax: 913.384.2180 www.scriptpro.com garrett at scriptpro dot com
Re: [RfC] vtable-dump
At 12:03 PM +0200 8/31/03, Leopold Toetsch wrote: Benjamin Goldberg [EMAIL PROTECTED] wrote: class freezer { class thawer { class cloner { [ big snip ] Do you expect that these are overridden by some languages using parrot? I.e. that ponie tries to implement a freezer that writes output for Storable? Languages can't override these on a per-PMC class basis. At best they can override the representation of frozen or thawed PMCs on a global basis if, for example, someone really wanted output in XML or YAML instead of a packed binary format. -- Dan --it's like this--- Dan Sugalski even samurai [EMAIL PROTECTED] have teddy bears and even teddy bears get drunk
Re: [RfC] vtable-dump
At 6:37 PM +0200 8/29/03, Leopold Toetsch wrote: Dan Sugalski [EMAIL PROTECTED] wrote: On Fri, 29 Aug 2003, Leopold Toetsch wrote: I think we need a general solution for freeze, dump and clone. As shown, the latter is broken. That would be IMHO an iterator interface with a callback function taking and returning a void*. Right, what I want is for the system freeze/thaw routines to have a standard way to encode a variety of core types (ints, floats, strings), lists, and hashes, as well as a way to handle self-referential, circular, and shared data in the dumped PMC data. It's the only way to safely do this, since we can't count on all the different PMC classes to handle dumping arrays of arrays of references to the same end PMC properly. Which we have to if we want this to work right. The freeze(), dump() and clone() would be vtables which are called by the iterator, so these functions should have the same signature. No iterator. When you call freeze on a PMC, that PMC is responsible for any sort of iteration that may have to be done. (Though there is the issue of blowing stack, which we'll have to address) Aren't you saying the opposite of above here? I want to be able to traverse from a given start point (be it the own interpreter or some PMC) as deeply down as there is something. You did say that we don't recursively mark() because of recursion depth. clone(), dump(), freeze() all have these same issues (plus self-referential structures). No, I'm just not being clear, so let's fix that. We don't need any special iterator, since we already have one--the DOD iterator, which is not only sufficient for our needs, it's likely the only thing that *can* have sufficient information to do the iteration. The -freeze or -thaw methods in the vtables are sufficient to freeze or thaw a PMC, but aren't going to be driving the freezing/thawing. The way things work is this: The freeze opcode takes a PMC as a parameter. The DOD tracing system is set up. 
The first PMC's freeze vtable function is called. When it returns (and we'll get to what it does later) the freeze op follows the chain and calls the next PMC's freeze, then the next, and so forth until it runs out of PMCs to freeze. This may, for simple PMCs, stop after the first--there may be no traversal at all. The freeze vtable method of a PMC is responsible for handing off parts of the PMC to the freezing and thawing encoding subsystem (which itself turns those parts into bits on the wire or characters in a file or string). The encoding subsystem should have a way to encode:

    Single values:
        STRING * (so we can save encoding, charset, and language)
        Integers
        Floats
        Binary data
        PMCs
        Frozen bytecode
    Name/value pairs (where the value is one of the above bits, or a set of name/value pairs, or a list of values)
    Lists of values (where the list can be a set of name/value pairs, or lists of values)
    Meta data--internal structure stuff (flags, PMC class name, and whatnot)

Not that complex -- this gets 95% of what any PMC class would need, and the remaining 5% can get shoved as binary data into a value thing. -- Dan --it's like this--- Dan Sugalski even samurai [EMAIL PROTECTED] have teddy bears and even teddy bears get drunk
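The split Dan describes - PMCs hand typed parts to an encoder, and only the encoder knows the wire format - can be sketched in Python. All names here (`Encoder`, `freeze_perlint`, the tag tuples) are hypothetical illustrations, not Parrot's API:

```python
class Encoder:
    """Hypothetical encoding subsystem: PMC freeze() methods hand off
    typed parts; swapping this class for an XML or YAML encoder would
    change the output format without touching any PMC class."""
    def __init__(self):
        self.parts = []
    def integer(self, i):
        self.parts.append(("I", i))
    def number(self, f):
        self.parts.append(("N", f))
    def string(self, s):
        self.parts.append(("S", s))
    def pair(self, name, value):
        self.parts.append(("P", name, value))
    def list_of(self, values):
        self.parts.append(("L", list(values)))

def freeze_perlint(value, enc):
    # A PerlInt-like PMC would hand off only its metadata and value;
    # it never sees bytes, files, or strings.
    enc.pair("flags", 0)
    enc.integer(value)
```

This is the design choice that lets the packed-binary format be replaced globally (as Dan notes elsewhere in the thread) while PMC classes stay unchanged.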
Re: [RfC] vtable-dump
Dan Sugalski [EMAIL PROTECTED] wrote: At 6:37 PM +0200 8/29/03, Leopold Toetsch wrote: Aren't you saying the opposite of above here? I want to be able to traverse from a given start point (be it the own interpreter or some PMC) as deeply down as there is something. You did say that we don't recursively mark() because of recursion depth. clone(), dump(), freeze() all have these same issues (plus self-referential structures). No, I'm just not being clear, so let's fix that. We don't need any special iterator, since we already have one--the DOD iterator, which is not only sufficient for our needs, it's likely the only thing that *can* have sufficient information to do the iteration. While a general iterator would be very similar to the DOD iterator, we can't use it IMHO:

- this prohibits DOD runs during clone/thaw, when we run out of free headers[1]
- it would need an extra function pointer to call the actual vtable (mark, freeze, clone ...)
- we currently have shortcuts for DOD marking (is_PMC_ptr ..) which probably don't play nicely with it

and most importantly:

- we must track duplicates (or restore pointers to these during thaw/clone), while marking (pobject_lives) just returns here. This would slow down DOD runs remarkably.

... The encoding subsystem should have a way to encode: Basically packfile extensions. Yep.

[1] when we want to thaw/clone big structures, we should have some means to estimate the number of needed headers. If we will not have enough, we do a DOD run before clone/thaw and then turn DOD off - it will not yield any more free headers anyway. This can avoid a couple of DOD runs that do just nothing except burning a lot of cycles and causing massive cache pollution. To achieve this, we might call aggregates.elements() first by means of the iterator again, or with some depth restriction, returning when we reach the free-header limit. leo
Re: [RfC] vtable-dump
On Tue, 2 Sep 2003, Leopold Toetsch wrote: and no, not that one inside DOD, that one doesn't handle duplicates. Yes, yes it *does* handle duplicates. Otherwise it'd get caught in infinite loops every time it came across a circular data structure. That's what the next pointer in the PObj header gets used for, amongst other things. Dan
Re: [RfC] vtable-dump
Dan Sugalski [EMAIL PROTECTED] wrote: On Tue, 2 Sep 2003, Leopold Toetsch wrote: and no, not that one inside DOD, that one doesn't handle duplicates. Yes, yes it *does* handle duplicates. Otherwise it'd get caught in infinite loops every time it came across a circular data structure. That's what the next pointer in the PObj header gets used for, amongst other things. It does handle duplicates like so: void pobject_lives(struct Parrot_Interp *interpreter, PObj *obj) { /* if object is live or on free list return */ if (PObj_is_live_or_free_TESTALL(obj)) { return; } /* mark it live */ PObj_live_SET(obj); So if it was mark()ed already, we return. That's not possible for freeze, thaw, dump, clone, or whatever. These must keep track of already visited objects via a hash for freeze, dump, and clone, and via an ID array for thaw. Dan leo
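The "seen" hash Leo asks for differs from the DOD's live flag in one important way: besides answering "visited before?", it must hand back a stable ID so the freezer can emit a back-reference. A minimal C sketch, with all names hypothetical (a toy open-addressed pointer set, not any real Parrot structure):

```c
#include <stddef.h>
#include <string.h>

/* Toy fixed-size pointer set with IDs; stands in for the seen hash a
 * freezer would keep per traversal.  SEEN_SLOTS must be a power of two. */
#define SEEN_SLOTS 256

typedef struct {
    const void *key[SEEN_SLOTS];
    size_t      id[SEEN_SLOTS];
    size_t      next_id;
} SeenTable;

static void seen_init(SeenTable *t) {
    memset(t, 0, sizeof *t);
    t->next_id = 1;            /* 0 means "not seen yet" */
}

/* Returns the existing ID if obj was visited before (caller emits a
 * back-reference), or 0 after recording obj with a fresh ID (caller
 * serializes the object and remembers that ID). */
static size_t seen_lookup_or_add(SeenTable *t, const void *obj) {
    size_t i = ((size_t)obj >> 4) & (SEEN_SLOTS - 1);
    while (t->key[i] != NULL) {
        if (t->key[i] == obj)
            return t->id[i];   /* duplicate */
        i = (i + 1) & (SEEN_SLOTS - 1);  /* linear probe */
    }
    t->key[i] = obj;
    t->id[i]  = t->next_id++;
    return 0;                  /* first visit */
}
```

This is exactly the bookkeeping that pobject_lives gets "for free" from the live flag, but with the extra ID needed to reconstitute shared references on thaw.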
Re: [RfC] vtable-dump
On Tue, 2 Sep 2003, Leopold Toetsch wrote: Dan Sugalski [EMAIL PROTECTED] wrote: On Tue, 2 Sep 2003, Leopold Toetsch wrote: and no, not that one inside DOD, that one doesn't handle duplicates. Yes, yes it *does* handle duplicates. Otherwise it'd get caught in infinite loops every time it came across a circular data structure. That's what the next pointer in the PObj header gets used for, amongst other things. It does handle duplicates like so: void pobject_lives(struct Parrot_Interp *interpreter, PObj *obj) { /* if object is live or on free list return */ if (PObj_is_live_or_free_TESTALL(obj)) { return; } /* mark it live */ PObj_live_SET(obj); So if it was mark()ed already we return. That's not possible for freeze, thaw, dump, clone whatever. These must keep track of already visited objects via an hash for freeze, dump, clone, and via an ID array for thaw. No, this isn't necessary. What you need to do when freezing an already frozen PMC is to emit a marker label that refers to the original version of the PMC, or always emit the marker when a PMC freezes another PMC and defer actual freezing of the PMC until the master freeze function walks over the PMC to freeze it. Dan
Re: [RfC] vtable-dump
Dan Sugalski [EMAIL PROTECTED] wrote: On Tue, 2 Sep 2003, Leopold Toetsch wrote: So if it was mark()ed already we return. That's not possible for freeze, thaw, dump, clone whatever. These must keep track of already visited objects via a hash for freeze, dump, clone, and via an ID array for thaw. No, this isn't necessary. What you need to do when freezing an already frozen PMC is to emit a marker label that refers to the original version of the PMC, So I have to do a lookup to get the marker, or... ... or always emit the marker when a PMC freezes another PMC and defer actual freezing of the PMC until the master freeze function walks over the PMC to freeze it. on first freeze generate a marker (store it where in the PMC?) and then... When I defer freezing of the aggregate, the thawing doesn't work. When everything gets deferred, you have to walk the whole deferred list again, for simple types too, which would already have been done. Then, in another run, the marker has to get cleared. *If* that works, it still adds extra complexity to DOD. Dan leo
Re: [RfC] vtable-dump
On Tue, 2 Sep 2003, Leopold Toetsch wrote: Dan Sugalski [EMAIL PROTECTED] wrote: On Tue, 2 Sep 2003, Leopold Toetsch wrote: So if it was mark()ed already we return. That's not possible for freeze, thaw, dump, clone whatever. These must keep track of already visited objects via a hash for freeze, dump, clone, and via an ID array for thaw. No, this isn't necessary. What you need to do when freezing an already frozen PMC is to emit a marker label that refers to the original version of the PMC, So I have to do a lookup to get the marker, or... Or nothing. Freezing a PMC that refers to one or more PMCs will have to handle this. ... or always emit the marker when a PMC freezes another PMC and defer actual freezing of the PMC until the master freeze function walks over the PMC to freeze it. on first freeze generate a marker (store it where in the PMC?) and then... When I defer freezing of the aggregate, the thawing doesn't work. When everything gets deferred, you have to walk the whole deferred list again, for simple types too, which would already have been done. Then, in another run, the marker has to get cleared. Nope. First off, the freeze and thaw code *must* deal with potentially self-referential or circular data structures. They're depressingly common, and to not be able to handle something as simple as: $foo = \$foo; freeze($foo); would be a big problem. As would this: $foo[0] = \$bar; $foo[1] = \$bar; freeze(@foo); generating a structure that, when reconstituted, had elements 0 and 1 pointing to different PMCs. So with that in mind, handling labels and deferred or already-instantiated PMCs is a necessity. Doing the walk is *also* easy. You clear the next PMC pointer, just as with a normal DOD run setup, call the DOD mark on the initial PMC, and call its freeze routine. Inside the freeze for the PMC, it calls mark on any PMCs it needs frozen, which will then go on the list if they've not already been frozen.
A marker for that PMC, probably based on the PMC header address, goes in the frozen data structure. When the initial freeze routine exits, the freeze code just follows the next for DOD pointer chain, calling freeze on each of those PMCs, until it runs off the end of the chain, just as the DOD does. Now, granted, we won't use the DOD's intimate knowledge of base array/hash types, rather deferring to their freeze routines, but that's fine and as it should be. Reconstituting these things may be somewhat problematic, but there's not much for it--reconstituting circular data structures that require some sort of active setup is problematic in general. *If* that works it still adds extra complexity to DOD. It doesn't add anything to the DOD. Worst case it turns out that disabling a DOD sweep during a freeze is untenable, which is possible. In that case we'll need an alternate pointer chain. Dan
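Dan's chain walk can be sketched in a few lines of C. Everything here is illustrative (a toy PMC struct and a single outgoing reference, not the real Parrot API); the point is the mechanism: marking appends a node to the next_for_GC chain at most once, and the master loop follows the chain until it runs off the end, so cycles terminate naturally:

```c
#include <stddef.h>

/* Toy stand-in for a PMC: one chain pointer, one outgoing reference. */
typedef struct FakePMC {
    struct FakePMC *next_for_GC;   /* NULL means "not yet queued" */
    struct FakePMC *child;         /* one reference, for the demo */
    int frozen;
} FakePMC;

static FakePMC *chain_tail;

/* Queue a PMC for freezing exactly once: if its next pointer is set, or
 * it is the current tail, it is already on the chain. */
static void queue_for_freeze(FakePMC *p) {
    if (p->next_for_GC != NULL || p == chain_tail)
        return;
    chain_tail->next_for_GC = p;
    chain_tail = p;
}

/* A PMC's freeze routine: emit its own data (here: set a flag), then
 * mark any PMCs it references. */
static void fake_freeze(FakePMC *p) {
    p->frozen = 1;
    if (p->child)
        queue_for_freeze(p->child);
}

/* The master loop: freeze the root, then follow the chain, freezing
 * whatever earlier freezes queued, until the chain ends. */
static void master_freeze(FakePMC *root) {
    root->next_for_GC = NULL;
    chain_tail = root;
    for (FakePMC *p = root; p != NULL; p = p->next_for_GC)
        fake_freeze(p);
}
```

With two PMCs referring to each other, the second queue_for_freeze call sees a non-NULL next pointer (or the tail) and does nothing, which is the duplicate handling Dan describes.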
Re: [RfC] vtable-dump
Dan Sugalski [EMAIL PROTECTED] wrote: Doing the walk is *also* easy. You clear the next PMC pointer, just as with a normal DOD run setup, call the DOD mark on the initial PMC there is no mark() for a plain PMC scalar (and no next pointer inside the PMC). If the PMC has a mark routine, this calls pobject_lives() (setting the live flag) and puts complex PMCs (but *not all* of them) on the next_for_GC pointer. Some just get their live flag set directly. Using the DOD functionality doesn't match freeze, thaw, dump at all and interferes with DOD runs. Please have a look at dod.c Dan leo
Re: [RfC] vtable-dump
On Tue, 2 Sep 2003, Leopold Toetsch wrote: Dan Sugalski [EMAIL PROTECTED] wrote: Doing the walk is *also* easy. You clear the next PMC pointer, just as with a normal DOD run setup. call the DOD mark on the inital PMC there is no mark() for a plain PMC scalar (and no next pointer inside the PMC). If the PMC has a mark routine this calls pobject_lives() (setting the live flag) and puts complex PMCs but *not all* on the next for GC pointer. Some get their live flag directly set. Every PMC should have a next_for_GC pointer. You moved it to the ext structure, but it's there, and they all ought to have one. Any PMC that gets frozen *will* have one, as it needs to have it for traversal to work properly. Using the DOD functionality doesn't match freeze, thaw, dump at all and interferes with DOD runs. Oh, nonsense, it doesn't at all interfere with things. I designed it with this as one of the applications for the DOD system. It should also work reasonably well for destruction ordering. If it's been changed in a way that makes this untenable, it needs to be fixed so it does. Dan
Re: [RfC] vtable-dump
Dan Sugalski [EMAIL PROTECTED] wrote: Every PMC should have a next_for_GC pointer. You moved it to the ext structure, ... I moved it there for a good reason: speed. The smaller a PMC, the faster the interpreter. ... but it's there, and they all ought to have one. Any PMC that gets frozen *will* have one, as it needs to have it for traversal to work properly. OK. It's a macro, and can be back again in a second *if* needed. Using the DOD functionality doesn't match freeze, thaw, dump at all and interferes with DOD runs. Oh, nonsense, it doesn't at all interfere with things. Calling mark sets the live flag on PMC headers. When you use mark(), you have to reset the whole arena after a single clone. I expect a lot of clone()s to be emitted by the HLL. If we don't use mark(), we are almost at my proposal: have an extra traverse() vtable that gets a vtable pointer doing freeze, thaw, clone, or dump. If you want to have a general traverse() for DOD too (the current mark?), then you have to pass in a function pointer - or put each and every item on the next_for_GC list. When you have a PerlArray containing 100,000 plain PerlInts, you have your linked list with 100,001 entries, or one entry. You have polluted caches or not - that's it (I'm speaking of DOD here). ... I designed it with this as one of the applications for the DOD system. It should also work reasonably well for destruction ordering. If it's been changed in a way that makes this untenable, it needs to be fixed so it does. Or the design ;-) And I still don't see how you will handle duplicate lookup of self-referential structures for freeze and thaw with a linked list of items alone. Dan leo
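Leo's proposal, one traverse() routine parameterized by the action to perform, boils down to passing a function pointer into a single generic walker. A minimal C sketch with hypothetical names (a linked list stands in for a PMC aggregate; no Parrot API is used):

```c
#include <stddef.h>

/* A simple list node stands in for an element of an aggregate PMC. */
typedef struct Node {
    int          value;
    struct Node *next;
} Node;

/* The action (freeze, thaw, clone, dump ...) is just a callback. */
typedef void (*visit_fn)(Node *n, void *info);

/* One generic walker; no traversal code is duplicated per action. */
static void traverse(Node *start, visit_fn action, void *info) {
    for (Node *n = start; n != NULL; n = n->next)
        action(n, info);
}

/* Two example "actions" sharing the same traversal. */
static void sum_action(Node *n, void *info)   { *(int *)info += n->value; }
static void count_action(Node *n, void *info) { *(int *)info += 1; }
```

The trade-off Leo raises is visible here: the indirect call through the function pointer happens once per element, which is exactly the overhead he wants to keep out of the hot DOD path.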
Re: [RfC] vtable-dump
Leopold Toetsch wrote: [snip] [1] when we want to thaw/clone big structures, we should have some means to estimate the amount of needed headers. If we will not have enough, we do a DOD run before clone/thaw and then turn DOD off - it will not yield any more free headers anyway. This can avoid a couple of DOD runs that do just nothing except burning a lot of cycles and massive cache pollution. To achieve this, we might call aggregates.elements() first by means of the iterator again or with some depth restriction and returning, when we reach the free-header limit. Even with a depth restriction, a recursive estimate can produce misleading results due to circular references. Only actually walking the structure can get the number right. However, walking the structure *twice* would be silly. So, begin with an estimate of unknown (serialize an integer -1), and then, after the whole thing has been frozen, we seek backwards (if that's possible) and replace that unknown with the actual number of pmcs that were serialized. -- $a=24;split//,240513;s/\B/ = /for@@=qw(ac ab bc ba cb ca );{push(@b,$a),($a-=6)^=1 for 2..$a/6x--$|;print [EMAIL PROTECTED] ]\n;((6=($a-=6))?$a+=$_[$a%6]-$a%6:($a=pop @b))redo;}
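The back-patching idea above, serialize a placeholder count, then seek back and overwrite it once the real number is known, can be sketched with plain stdio. All names are illustrative; a raw long at a fixed offset stands in for whatever integer encoding the freezer would actually use:

```c
#include <stdio.h>

/* Write a placeholder count of -1, "freeze" n_items integers, then seek
 * back and patch the placeholder with the real count.  Returns the
 * patched count, or -1 on I/O failure. */
static long freeze_with_backpatch(FILE *out, int n_items) {
    long count_pos = ftell(out);
    long count = -1;                       /* "unknown" placeholder */
    if (fwrite(&count, sizeof count, 1, out) != 1)
        return -1;

    for (int i = 0; i < n_items; i++)      /* stand-in for freezing PMCs */
        fwrite(&i, sizeof i, 1, out);

    count = n_items;                       /* now the count is known */
    fseek(out, count_pos, SEEK_SET);
    fwrite(&count, sizeof count, 1, out);  /* patch the placeholder */
    fseek(out, 0, SEEK_END);               /* restore position */
    return count;
}
```

As the email notes, this only works when the output is seekable ("if that's possible"); for a pipe or socket the -1 placeholder would have to stay, and the reader would fall back to treating the count as unknown.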
Re: [RfC] vtable-dump
At 8:38 PM +0200 9/2/03, Leopold Toetsch wrote: Dan Sugalski [EMAIL PROTECTED] wrote: Every PMC should have a next_for_GC pointer. You moved it to the ext structure, ... I moved it there for a good reason: speed. The smaller a PMC the faster is the interpreter. Right, and this part is only needed by the DOD sweep, the freeze/thaw code, and the destruction ordering code, so since it's in the ext struct it won't get in the way for normal usage. ... but it's there, and they all ought to have one. Any PMC that gets frozen *will* have one, as it needs to have it for traversal to work properly. Ok. Its a macro and back again in a second *if* needed. That's easy, then, as it is. :) Besides, from looking at pobj.h, it's already there, so I'm not sure what we'd need to do. Perhaps I need to resync. Using the DOD functionality doesn't match freeze, thaw, dump at all and interferes with DOD runs. Oh, nonsense, it doesn't at all interfere with things. Calling mark sets the live flag on PMC headers. When you use mark() you have to reset the whole arena after a single clone. I expect a lot of clone()s to be emitted by the HLL. I expect very few clones to be emitted. Past history indicates that it's an unusual operation. Besides, there's nothing wrong with setting the live flag, since the PMCs *are* alive. We don't care about the live flag for this, what we care is that the PMC is put on the traversal list. If we don't use mark() we are almost at my proposal: have an extra traverse() vtable, that gets a vtable pointer doing freeze, thaw, clone, or dump. If you want to have a general traverse() for DOD too (currents mark?) then you have to pass in a function pointer - or put each and every item on the next_for_GC list. When you have a PerlArray containing 100.000 plain PerlInts, you have your linked list with 100.001 entries or one entry. You have polluted caches or not - that's it (I'm speaking of DOD here). So you shortcut in the normal course of DOD? 
I suppose--I presume you've benchmarked to see if testing all the flags of the contained PMCs when marking the container is faster than just throwing them on the list and checking them when you get to them? I can see that being the case in some cases, but I'm concerned that it'll consume more memory and take more time in some reasonably common cases. ... I designed it with this as one of the applications for the DOD system. It should also work reasonably well for destruction ordering. If it's been changed in a way that makes this untenable, it needs to be fixed so it does. Or the design ;-) Yep. In this case, though, design flaws turn out not to be the case. :-P And I still don't see, how you will handle duplicate lookup of self-referential structures for freeze and thaw with a linked list of items alone. The same way the DOD sweep handles it, unless you've changed things beyond recognition. The way it used to work is that when you marked a PMC as live, it got put on the end of the chain unless the PMC's next pointer field is non-NULL, in which case it's already been put on the list, and nothing happens. -- Dan --it's like this--- Dan Sugalski even samurai [EMAIL PROTECTED] have teddy bears and even teddy bears get drunk
Re: serialisation (was Re: [RfC] vtable-dump)
On Sat, Aug 30, 2003 at 10:13:02PM -0400, Benjamin Goldberg wrote: Nicholas Clark wrote: The attacker can craft a bogus CGITempFile object that refers to any file on the system, and when this object is destroyed it will attempt to delete that file at whatever privilege level the CGI runs at. And because that object is getting destroyed inside the deserialise routine of Storable, this all happens without the user written code getting any chance to inspect the data. And even Storable can't do anything about it, because by the time it encounters the repeated hash key, it has already deserialised this time-bomb. How does it defuse it? The simplest solution *I* can think of, is to have Storable copy the taint flag from the input string/stream onto every single string that it produces. Taint checking doesn't solve *all* security problems, of course, but it can catch many of them, and it certainly would catch this one (if $$self were tainted). I don't believe that it would. A quick test suggests that in perl5 destructors get to run after a taint violation is detected: $ perl -Tle 'sub DESTROY {print "We still get to run arbitrary code after taint violations"}; bless ($a = []); `cat /etc/motd`' Insecure $ENV{PATH} while running with -T switch at -e line 1. We still get to run arbitrary code after taint violations $ One defense against following a bomb with malformed data, might be to have Storable save up the SV*s and the names with which to bless them, and only do the blessing *after* the data is fully deserialized, as a last step before returning it to the user. This way, if there's malformed data, no destructors get called. The user still needs to validate the returned data, though, and rebless anything which might result in an evil destructor being called. Another defense is to run deserialization and validation inside of a Safe object. Make sure that if the object fails to validate, it is completely destructed before we exit the Safe compartment.
I think that these could work well. For backwards compatibility with perl5, parrot will quite likely support taint checking, safe.pm, and ops.pm. I don't see why support is needed at core parrot level. I believe that the plan is to provide XS-level Perl 5 compatibility via ponie, in which case much of this stuff would be up to ponie, not core parrot. Tainting sounds to me like the sort of things that would be a property on a scalar, checked by custom ponie ops, which would be as parrot core as python's ops. A brief inspection suggests that ops.pm is entirely a parser level thing, so I don't see the need for core parrot support there. Safe.pm as implemented is an unreliable mess. It's also problematic as it describes actions solely in terms of perl 5 ops. I've no idea quite how close Arthur, Rafael and the other ponie conspirators think that ponie-generated parrot bytecode will be to the perl 5 optree structure, but I wouldn't like to place any bets right now. I think that safe execution compartments are part of Dan's plan, but I'm not sure if anyone yet knows how any of this fits together (even Dan or Arthur) Nicholas Clark
Re: serialisation (was Re: [RfC] vtable-dump)
Nicholas Clark wrote: On Sat, Aug 30, 2003 at 10:13:02PM -0400, Benjamin Goldberg wrote: Nicholas Clark wrote: The attacker can craft a bogus CGITempFile object that refers to any file on the system, and when this object is destroyed it will attempt to delete that file at whatever privilege level the CGI runs at. And because that object is getting destroyed inside the deserialise routine of Storable, this all happens without the user written code getting any chance to inspect the data. And even Storable can't do anything about it, because by the time it encounters the repeated hash key, it has already deserialised this time-bomb. How does it defuse it? The simplest solution *I* can think of, is to have Storable copy the taint flag from the input string/stream onto every single string that it produces. Taint checking doesn't solve *all* security problems, of course, but it can catch many of them, and it certainly would catch this one (if $$self were tainted). I don't believe that it would. A quick test suggests that in perl5 destructors get to run after a taint violation is detected: $ perl -Tle 'sub DESTROY {print "We still get to run arbitrary code after taint violations"}; bless ($a = []); `cat /etc/motd`' Insecure $ENV{PATH} while running with -T switch at -e line 1. We still get to run arbitrary code after taint violations $ That wasn't what I meant. The taint violation in your example does *not* correspond to Storable turning on the taint flag in the strings it produces. At best, it corresponds to Storable dying due to a malformed input file, resulting in destructors being called. If sub DESTROY tries to unlink a file, and that filename is a tainted string, then the DESTROY will die. Currently, Storable *doesn't* turn on the taint flag in the SVPVs that it produces; because of this, $$self in CGITempFile::DESTROY isn't tainted. If $$self in CGITempFile::DESTROY *were* tainted, then obviously that DESTROY would die, and the file wouldn't get unlinked.
Thus, we would avoid that security hole. One defense against following a bomb with malformed data, might be to have Storable save up the SV*s and the names with which to bless them, and only do the blessing *after* the data is fully deserialized, as a last step before returning it to the user. This way, if there's malformed data, no destructors get called. The user still needs to validate the returned data, though, and rebless anything which might result in an evil destructor being called. Another defense is to run deserialization and validation inside of a Safe object. Make sure that if the object fails to validate, it is completely destructed before we exit the Safe compartment. I think that these could work well. For backwards compatibility with perl5, parrot will quite likely support taint checking, safe.pm, and ops.pm. I don't see why support is needed at core parrot level. I believe that the plan is to provide XS-level Perl 5 compatibility via ponie, in which case much of this stuff would be up to ponie, not core parrot. Tainting sounds to me like the sort of things that would be a property on a scalar, checked by custom ponie ops, which would be as parrot core as python's ops. A brief inspection suggests that ops.pm is entirely a parser level thing, so I don't see the need for core parrot support there. Safe.pm as implemented is an unreliable mess. It's also problematic as it describes actions solely in terms of perl 5 ops. I've no idea quite how close Arthur, Rafael and the other ponie conspirators think that ponie-generated parrot bytecode will be to the perl 5 optree structure, but I wouldn't like to place any bets right now. 
I think that safe execution compartments are part of Dan's plan, but I'm not sure if anyone yet knows how any of this fits together (even Dan or Arthur) Nicholas Clark
Re: [RfC] vtable-dump
Leopold Toetsch wrote: Benjamin Goldberg [EMAIL PROTECTED] wrote: class freezer { class thawer { class cloner { [ big snip ] Do you expect that these are overridden by some languages using parrot? I.e. that ponie tries to implement a freezer that writes output for Storable? I'm not entirely sure... For ponie to implement something that deals with Storable, then of course it needs to output data in the same file format as Storable does. I haven't looked at the guts of Storable, so I don't know what that format is. Further: having clone implemented in terms of freeze + thaw needs additional memory for the intermediate frozen image. Isn't that suboptimal? Only slightly -- it's just *one single* PMC's data that's stored in that additional memory. And if we have a separate clone vtable method, then there's a chance that our cloning procedure will drift apart from our freezing/thawing procedure. A general traverse routine or iterator seems to be more flexible. leo
Re: [RfC] vtable-dump
Leopold Toetsch wrote: Benjamin Goldberg [EMAIL PROTECTED] wrote: Actually, I think the following interface would be better: void freeze(PMC *freezer); void thaw (PMC *thawer); I'm thinking of (in horrible pseudo code ;-): [snip] Above functionality does IMHO match your description. The only difference is, that I'd like to use that not only for freeze/thaw, but for clone and dump too. Well, I wasn't precisely thinking of my interface being *solely* for freeze/thaw, but for clone and dump, too. Cloning would be like freezing followed immediately by thawing, without writing to disk. That is, I was thinking of something like the following pseudocode. class freezer { fifo<PMC*> queue; set<INTVAL> seen; outputer o; void push_integer(INTVAL i) { o.print_int(i); } void push_number(NUMVAL n) { o.print_num(n); } void push_string(STRING*s) { o.print_str(s); } void push_bigint(BIGINT*b) { o.print_big(b); } void push_pmc(PMC *p) { INTVAL i = (INTVAL)p; o.print_int(i); if( !seen.contains(i) ) { seen.add(i); queue.push(p); } } } void do_freeze(PMC *p, outputer o) { PMC * f = new_freezer_dumper(o); f.push_pmc(p); while( f.queue.notempty() ) { p = f.queue.pop(); o.print_int( p.getclass() ); p.freeze(f); } } class thawer { fifo<PMC*> queue; map<INTVAL, PMC*> prep; inputter i; INTVAL shift_integer() { return i.read_int(); } NUMVAL shift_number() { return i.read_num(); } STRING*shift_string() { return i.read_str(); } BIGNUM*shift_bignum() { return i.read_big(); } PMC* shift_pmc() { INTVAL j = i.read_int(); if( prep.contains(j) ) { return prep.get(j); } PMC * n = new_placeholder(); prep.put(j, n); queue.push(n); return n; } } PMC* do_thaw(inputter i) { PMC * t = new_thawer(i); PMC * ret = t.shift_pmc(); PMC * p = ret; do { p.setclass( i.read_int() ); p.thaw(t); p = t.queue.pop(); } while(p); return ret; } class cloner { fifo<pair<PMC*,PMC*>> queue; map<PMC*,PMC*> prep; fifo<PMC*> tempdata; void push_integer(INTVAL i) { tempdata.push_integer(i); } void push_number (NUMVAL n) { tempdata.push_number (n);
} void push_string (STRING*s) { tempdata.push_string (s); } void push_bigint (BIGINT*b) { tempdata.push_bigint (b); } void push_pmc(PMC *p) { if( prep.contains(p) ) { tempdata.push_pmc(prep.get(p)); } else { PMC *p2 = new_placeholder(); prep.put(p, p2); queue.push( pair(p, p2) ); tempdata.push_pmc(p2); } } INTVAL shift_integer() { return tempdata.shift_integer(); } NUMVAL shift_number () { return tempdata.shift_number (); } STRING*shift_string () { return tempdata.shift_string (); } BIGINT*shift_bigint () { return tempdata.shift_bigint (); } PMC* shift_pmc() { return tempdata.shift_pmc(); } } PMC * do_clone(PMC *p) { PMC *c = new_cloner(); PMC *r = new_placeholder(); c.prep.put(p, r); p.freeze(c); r.setclass( p.getclass() ); r.thaw(c); assert( c.tempdata.isempty() ); while( c.queue.notempty() ) { pair<PMC*,PMC*> x = c.queue.pop(); x.first.freeze(c); x.second.setclass( x.first.getclass() ); x.second.thaw(c); assert( c.tempdata.isempty() ); } return r; } I've omitted the pseudocode for dumping/pretty-printing since I'm not sure how best it would be done. It does require quite a bit more information than freezing (since the computer can tell when/where a pmc class starts and ends, but a human wouldn't be able to), so merely calling do_freeze() with an outputer which sprintfs intvals and numvals nicely wouldn't be sufficient. However, if you replaced the o.print_int inside of push_pmc with an o.print_pmc_pointer, and o.print_int in the main loop with o.print_pmc_class, then the outputter *would* have enough info to make something vaguely human-readable.
Re: serialisation (was Re: [RfC] vtable-dump)
Nicholas Clark wrote: On Fri, Aug 29, 2003 at 05:30:37PM +0200, Leopold Toetsch wrote: I think, we need a general solution for freeze, dump and clone. As shown I don't know if this is relevant here, but I'll mention it in case. For perl5 there isn't a single good generic clone system. Probably the best (in terms of quality of what it can cope with, if not speed) is Storable. Storable provides a dclone method which internally serialises the passed in reference, and then deserialises a new reference. This is a lot of extra effort, but it does work. Obviously this fails parrot's aim - speed. I don't know whether this is relevant either: The internal representation of Storable's serialisation is a stream of tagged data. There was a discussion on p5p a while back about whether it would be possible to make a callback interface to the tags, so that people could write custom deserialisers (eg vetting the data as it came in) This sort of interface to hooking deserialisation would be useful. Likewise being able to hook into the traversal/serialisation interface to provide custom output (eg XML when we're serialising as standard to compressed YAML) would save people having to reinvent traversal/introspection routines. (eg perl has both Data::Dumper and Storable in core doing separate traversal routines) I suspect that this is relevant: You can't trust your data deserialiser. It can do evil on you before it returns. I forget who worked this attack out, but Jonathan Stowe presented a talk on it at YAPC::EU in Amsterdam. The specific attack form on Storable is as follows, but it should generalise to any system capable of deserialising objects: You create a serialisation which holds a hash. The hash happens to have 2 identical keys (the serialisation format does not prevent this) The second key is innocent. When Storable writes this key into the hash it's deserialising, perl's hash API automatically frees any previous value for that key. This is how hashes are supposed to work.
Obviously while deserialising there should never have been a previous key (a legitimate file generated from a real hash could not have had repeating keys) The problem is that things like perl let you (de)serialise objects, and objects have destructors that can run arbitrary code. So your attacker puts an object as the value associated with the first copy of the key which is an object with attributes that cause it to do something interesting. The suggestion for attacking a Perl 5 CGI was to use a CGITempFile object. The destructor looks like this: sub DESTROY { my($self) = @_; unlink $$self; # get rid of the file } The attacker can craft a bogus CGITempFile object that refers to any file on the system, and when this object is destroyed it will attempt to delete that file at whatever privilege level the CGI runs at. And because that object is getting destroyed inside the deserialise routine of Storable, this all happens without the user written code getting any chance to inspect the data. And even Storable can't do anything about it, because by the time it encounters the repeated hash key, it has already deserialised this time-bomb. How does it defuse it? The simplest solution *I* can think of, is to have storable copy the taint flag from the input string/stream onto every single string that it produces. Taint checking doesn't solve *all* security problems, of course, but it can catch many of them, and it certainly would catch this one (if $$self were tainted). Simple solutions such as checking the hash keys are unique don't work either. All the attacker does then increase the abstraction slightly. Put the time-bomb as the first element in an array. Put one of these bogus hashes as the second object. Fine, you can realise that you've got bad keys in the bogus hash and never build it. But at this point the time-bomb object already exists. You'd have to validate the entire serialised stream before continuing. 
And if deserialisers for objects are allowed to fail, then you're still stuffed, because the attacker then crafts a time-bomb object, and a second malformed object that is known to cause its class's deserialiser to fail. One defense against following a bomb with malformed data, might be to have Storable save up the SV*s and the names with which to bless them, and only do the blessing *after* the data is fully deserialized, as a last step before returning it to the user. This way, if there's malformed data, no destructors get called. The user still needs to validate the returned data, though, and rebless anything which might result in an evil destructor being called. Another defense is to run deserialization and validation inside of a Safe object. Make sure that if the object fails to validate, it is completely destructed before we exit the Safe compartment. These defenses still aren't perfect: One could embed a time-bomb inside of a
Re: [RfC] vtable-dump
Benjamin Goldberg [EMAIL PROTECTED] wrote: class freezer { class thawer { class cloner { [ big snip ] Do you expect that these are overridden by some languages using parrot? I.e. that ponie tries to implement a freezer that writes output for Storable? Further: having clone implemented in terms of freeze + thaw needs additional memory for the intermediate frozen image. Isn't that suboptimal? A general traverse routine or iterator seems to be more flexible. leo
Re: [RfC] vtable-dump
Dan Sugalski wrote: On Fri, 29 Aug 2003, Leopold Toetsch wrote: Dan Sugalski [EMAIL PROTECTED] wrote: I think we'd be better served getting the freeze/thaw stuff in and We were just discussing this in the f'up. I read those, but I wanted to make sure the discussion went this way. :) If we make the initial dump scheme something mildly human readable (XML, YAML, whatever) then the debugger doesn't have to worry about formatting it for a while. (And we're going to want a pluggable dump scheme anyway, to make experimentation easier) Pluggable implies a dump() vtable method, doesn't it? Nope. Pluggable implies freezethaw.c provides all the functionality to freeze or thaw a PMC, with code in there to handle the right final encoding or decoding of the data. I thought we had a freeze and thaw entry in the vtables, but I see not. The signatures (since I don't have source handy to check them in) should be: STRING *freeze() PMC* thaw(STRING*) With thaw ignoring its initial PMC parameter. (It's a class method, essentially) Actually, I think the following interface would be better: void freeze(PMC *freezer); void thaw (PMC *thawer); When you freeze a pmc, it's passed a handle into the freezing mechanism, which it can then use to record the data that will be necessary to reconstitute itself later. This is done using the VTABLE_push_type methods. Pushing a pmc does *not* instantly result in that other pmc being frozen; that only happens *after* the freeze() function returns. In effect, it marks the other pmc as needing serialization, which then happens later. During thawing, a pmc header is created for you, morphed into your class, and then has VTABLE_thaw called on it. It can then reconstitute itself using VTABLE_shift_type on the handle to the thawer that's been passed in to it. Just as serialization of external pmcs in freeze wasn't immediate, any pmc which you shift_ out of the Thawer object might not be fully thawed...
it might be a pmc header which has been created as a placeholder, and which has not had VTABLE_thaw called on it yet. Freezing and thawing would have algorithms similar to the mark-and-sweep of the DoD (but simple fifos, or simple stacks, not any kind of priority queue ;)). I'm not *entirely* sure, but I think we might need/want to allow STRING*s returned by shift_ to be placeholders too. This allows the freezer to print out the string data after the pmcs if it chooses to. If it does, it can perhaps write the strings in a more compact manner. Almost certainly moving the strings to the end would make compression by gzip or whatever easier. (I vaguely recall hearing that the following problem is NP-hard, but I don't remember. Does anyone know for sure? Given a set of N strings, construct a single larger string, of which all N strings are substrings. This allows all N strings to be represented on disk using nothing more than the large string, and an offset and length for each substring.)
Re: [RfC] vtable-dump
Benjamin Goldberg [EMAIL PROTECTED] wrote: Actually, I think the following interface would be better: void freeze(PMC *freezer); void thaw (PMC *thawer); I'm thinking of (in horrible pseudo code ;-):

typedef struct {
    size_t  action_func_nr;      // clone, freeze, thaw, dump vtable#
    PMC    *todo_list_fifo;
    PMC    *seen;
    size_t  id;
    UINTVAL flags;
    PMC    *other;               // for clone and thaw
    void   *result;              // freeze, dump append string here
    void   *other_things_needed;
} Traverse_info;

INTVAL traverse(Traverse_info *info) {
    (SELF->vtable[info->action_func_nr])(INTERP, SELF, info);
    foreach item (SELF.elements) {
        if (item->needs_cloning)
            VTABLE_push(INTERP, info->todo_list_fifo, item);
        else
            ; // copy native type
    }
    return 0; // ok
}

INTVAL clone(Traverse_info *info) {
    if (SELF.can("clone"))       // overridden?
        return SELF.invoke(SELF.find_method("clone"), info);
    if (info->flags & DO_CLONE)
        copy(info->other->state, SELF.state);
    else
        ; // other pointer already done for self-ref aggregates
    return 0; // ok
}

freeze_thaw_clone_dump.c  // ;-)

PMC *do_clone(PMC *pmc) {
    return (PMC*) do_traverse(action_func_nr = VTABLE.clone - VTABLE[0], CLONE, pmc);
}

void *do_traverse(func, what, pmc) {
    info.action_func_nr = func;
    info.todo_list_fifo = new Fifo;
    info.flags = what;
    info.seen = new Hash;
    info.result = NULL;
    push(info.todo_list_fifo, pmc);
    while (info.todo_list_fifo.elements()) {
        todo = shift info.todo_list_fifo;
        if (seen{todo}) {        // set flags
            info.flags |= DO_NOT_CLONE;
            info.other = todo;
        }
        else {
            seen{todo} = ++ID;
            switch (what) {
            case CLONE:
                info.other = new todo.type();
                if (!info.result)
                    info.result = info.other;  // top clone
                break;
            ...
            }
        }
        info.id = ID;
        if (todo->vtable->traverse(..., info))
            ; // error
    }
    return (void*) info->result;
}

[ big snip ] The above functionality IMHO matches your description. The only difference is that I'd like to use it not only for freeze/thaw, but for clone and dump too. The thaw part is still missing in the above pseudocode. It needs an additional ID-array for lookup + some more things.
It's better done in a separate subroutine. There is one more indirection through the passed-in vtable, but that doesn't matter very much. Constructing new PMCs or STRINGs takes much more time. And I'd like to have a vtable->dump too. Dan's approach of first freezing the PMC and then letting the debugger construct the dump doesn't work IMHO: we already have dynamically loaded PMC classes. The debugger doesn't know anything about such classes, so these classes have to provide this functionality themselves. We could rename dump() to pretty_print() though. leo
[RfC] vtable-dump
Tracing through Parrot programs shows the type and value of variables. This doesn't work very well for PMCs: trace_pmc_dump() tries to extract some info from the PMC, but it would be much better if the PMC itself could return a string representation of itself.

    STRING* dump(INTERP, SELF, STRING *, INTVAL flags);

append and return a printable representation of SELF, with flags being:

    0x1     ... name (whoami)
    0x2     ... id (enum)
    0x4     ... value(s) (non aggregates)
    0x8     ... address
    ...     e.g. newlines, tabs in output for aggregates
    0x100...n ... dump of aggregate items at nesting (n-0x100)

Comments welcome leo
Re: [RfC] vtable-dump
Leopold Toetsch [EMAIL PROTECTED] writes: Tracing through Parrot programs shows the type and value of variables. This doesn't work very well for PMCs: trace_pmc_dump() tries to extract some info from the PMC, but it would be much better if the PMC itself could return a string representation of itself. ACK, wanted something like this forever. STRING* dump(INTERP, SELF, STRING *, INTVAL flags); What is the STRING * parameter good for? I assume line headers. append and return a printable representation of SELF, with flags being: 0x1 ... name (whoami) 0x2 ... id (enum) 0x4 ... value(s) (non aggregates) 0x8 ... address ... e.g. newlines, tabs in output for aggregates 0x100...n ... dump of aggregate items at nesting (n-0x100) Use the lower bits for nesting depth. This way you can use n = flags & 0xff to obtain the nesting depth, and still define the flags which can be passed to the dump of the children: dump(..., (flags & ~0xff) | (n-1)); Comments welcome leo always boe -- Juergen Boemmels [EMAIL PROTECTED] Fachbereich Physik Tel: ++49-(0)631-205-2817 Universitaet Kaiserslautern Fax: ++49-(0)631-205-3906 PGP Key fingerprint = 9F 56 54 3D 45 C1 32 6F 23 F6 C7 2F 85 93 DD 47
Re: [RfC] vtable-dump
Juergen Boemmels [EMAIL PROTECTED] wrote: Leopold Toetsch [EMAIL PROTECTED] writes: STRING* dump(INTERP, SELF, STRING *, INTVAL flags); What is the STRING * parameter good for? I assume line headers. The place where to append SELF.dump. Appending doesn't consume a new header for each piece. Use the lower bits for nesting depth. This way you can use n = flags & 0xff to obtain the nesting depth, and still define the flags which can be passed to the dump of the children: dump(..., (flags & ~0xff) | (n-1)); That would restrict the nesting depth. 2 more shifts don't harm; dump()ing will be slow enough. BTW what about self-referential structures (same problem as with clone):

$ cat rclone.pasm
    new P0, .PerlArray
    new P1, .PerlInt
    push P0, P1
    push P0, P0
    clone P2, P0
    end
$ ulimit -m 5000   # don't forget, if you haven't set any
$ parrot rclone.pasm
Segmentation fault (core dumped)

I'm thinking of dump to be:
- if (flag) return
- set a flag on SELF
- dump children
- reset flag

But this doesn't help against very deeply nested structures. An alternative would be to have a general iterator with a todo-list like the current DOD mark_ptr. We could use a generalization of the mark scheme by passing a function argument for DOD too, with some (?) impact on DOD performance. When we want to use this for serialization too, we need some means to refer to an already visited PMC; passing an optional hash would do it IMHO. boe leo
Re: [RfC] vtable-dump
Leopold Toetsch [EMAIL PROTECTED] writes: Juergen Boemmels [EMAIL PROTECTED] wrote: Leopold Toetsch [EMAIL PROTECTED] writes: STRING* dump(INTERP, SELF, STRING *, INTVAL flags); What is the STRING * parameter good for? I assume line headers. The place where to append SELF.dump. Appending doesn't consume a new header for each piece. Ah, ok. How do you want to solve indentation in multiline dumps? With an additional indent parameter? Use the lower bits for nesting depth. This way you can use n = flags & 0xff to obtain the nesting depth, and still define the flags which can be passed to the dump of the children: dump(..., (flags & ~0xff) | (n-1)); That would restrict the nesting depth. 2 more shifts don't harm; dump()ing will be slow enough. Ok, then use a mask of 0xffff. Or even 0xff (I don't want to see a dump of depth 4 million on my screen). I wanted to keep the opportunity to pass the other flags like multiline or name-only down to the children. BTW what about self-referential structures (same problem as with clone):

$ cat rclone.pasm
    new P0, .PerlArray
    new P1, .PerlInt
    push P0, P1
    push P0, P0
    clone P2, P0
    end
$ ulimit -m 5000   # don't forget, if you haven't set any
$ parrot rclone.pasm
Segmentation fault (core dumped)

I'm thinking of dump to be:
- if (flag) return
- set a flag on SELF
- dump children
- reset flag

But this doesn't help against very deeply nested structures. An alternative would be to have a general iterator with a todo-list like the current DOD mark_ptr. We could use a generalization of the mark scheme by passing a function argument for DOD too, with some (?) impact on DOD performance. A general nesting iterator would be a good idea. It might be useful in quite a few places: DOD, destruction ordering, dumping, serialisation, cloning. But DOD is the most timing-critical: it's called often and has to visit all active objects, whereas the others only visit the affected objects (the ones which get destructed, dumped, serialized, cloned).
When we want to use this for serialization too, we need some means to refer to an already visited PMC; passing an optional hash would do it IMHO. When using callbacks they often have a void *params. The interpretation of this *params is the job of the callback. boe -- Juergen Boemmels [EMAIL PROTECTED] Fachbereich Physik Tel: ++49-(0)631-205-2817 Universitaet Kaiserslautern Fax: ++49-(0)631-205-3906 PGP Key fingerprint = 9F 56 54 3D 45 C1 32 6F 23 F6 C7 2F 85 93 DD 47
Re: [RfC] vtable-dump
On Fri, 29 Aug 2003, Leopold Toetsch wrote: Tracing through Parrot programs shows the type and value of variables. This doesn't work very well for PMCs: trace_pmc_dump() tries to extract some info from the PMC, but it would be much better if the PMC itself could return a string representation of itself. STRING* dump(INTERP, SELF, STRING *, INTVAL flags); append and return a printable representation of SELF, with flags being: 0x1 ... name (whoami) 0x2 ... id (enum) 0x4 ... value(s) (non aggregates) 0x8 ... address ... e.g. newlines, tabs in output for aggregates 0x100...n ... dump of aggregate items at nesting (n-0x100) Comments welcome I think we'd be better served getting the freeze/thaw stuff in and teaching the debugger how to dump a frozen PMC. This'll get us two benefits: 1) We'll have a single way to dump a PMC rather than two 2) We'll have a mechanism in place to handle PMC constants better If we make the initial dump scheme something mildly human readable (XML, YAML, whatever) then the debugger doesn't have to worry about formatting it for a while. (And we're going to want a pluggable dump scheme anyway, to make experimentation easier) Dan
Re: [RfC] vtable-dump
On Fri, 29 Aug 2003, Leopold Toetsch wrote: Dan Sugalski [EMAIL PROTECTED] wrote: I think we'd be better served getting the freeze/thaw stuff in and We were just discussing this in the f'up. I read those, but I wanted to make sure the discussion went this way. :) If we make the initial dump scheme something mildly human readable (XML, YAML, whatever) then the debugger doesn't have to worry about formatting it for a while. (And we're going to want a pluggable dump scheme anyway, to make experimentation easier) Pluggable implies a dump() vtable method, doesn't it? Nope. Pluggable implies freezethaw.c provides all the functionality to freeze or thaw a PMC, with code in there to handle the right final encoding or decoding of the data. I thought we had a freeze and thaw entry in the vtables, but I see not. The signatures (since I don't have the source handy to check them) should be: STRING *freeze() PMC* thaw(STRING*) With thaw ignoring its initial PMC parameter. (It's a class method, essentially.) Dan
Re: [RfC] vtable-dump
Dan Sugalski [EMAIL PROTECTED] wrote: I think we'd be better served getting the freeze/thaw stuff in and We were just discussing this in the f'up. teaching the debugger how to dump a frozen PMC. This'll get us two benefits: 1) We'll have a single way to dump a PMC rather than two 2) We'll have a mechanism in place to handle PMC constants better Yep. If we make the initial dump scheme something mildly human readable (XML, YAML, whatever) then the debugger doesn't have to worry about formatting it for a while. (And we're going to want a pluggable dump scheme anyway, to make experimentation easier) Pluggable implies a dump() vtable method, doesn't it? Dan leo
Re: [RfC] vtable-dump
Dan Sugalski [EMAIL PROTECTED] wrote: On Fri, 29 Aug 2003, Leopold Toetsch wrote: I think we need a general solution for freeze, dump and clone. As shown, the latter is broken. That would be IMHO an iterator interface with a callback function taking and returning a void*. Right, what I want is for the system freeze/thaw routines to have a standard way to encode a variety of core types (ints, floats, strings), lists, and hashes, as well as a way to handle self-referential, circular, and shared data in the dumped PMC data. It's the only way to safely do this, since we can't count on all the different PMC classes to handle dumping arrays of arrays of references to the same end PMC properly. Which we have to if we want this to work right. The freeze(), dump() and clone() would be vtables which are called by the iterator. So these functions should have the same signature. No iterator. When you call freeze on a PMC, that PMC is responsible for any sort of iteration that may have to be done. (Though there is the issue of blowing the stack, which we'll have to address) Aren't you saying the opposite of the above here? I want to be able to traverse from a given start point (be it the interpreter itself or some PMC) as deeply down as there is something. You did say that we don't recursively mark() because of recursion depth. clone(), dump(), freeze() all have these same issues (plus self-referential structures). That's why I'm speaking of an iterator. An array can't clone() itself safely. It can clone() the SELF pmc and then put array items on a TODO list, which then get cloned by the framework. It's like interpreter->mark_ptr holding a TODO list of items still to mark, which can put other items onto this list too. So I'm speaking of a framework, which calls a vtable->traverse() method that calls vtable methods (freeze, clone, dump, or whatever) on SELF and then puts aggregate items (if any) on a TODO list. This functionality is IMHO that of an iterator.
This does address stack blowing and self-referential structures. The TODO list can be avoided for plain items, same thing as we do in pobject_lives(). Dan leo