RE: [RfC] vtable-dump

2003-09-10 Thread Gordon Henriksen
Dan Sugalski [EMAIL PROTECTED] wrote:

 On Tue, 9 Sep 2003, Gordon Henriksen wrote:
 
  Random thought
  
  There's some discussion on perl-qa right now about how Test::More
  should implement is_deeply, which executes a code block and tests
  that the return value is equivalent to a particular nested data 
  structure. (The question posed on that list is with regard to how to
  handle tie()'d objects, which is not what I'm addressing here.)
  Regardless of that discussion's outcome, should the serialization API
  being discussed here enable the following behavior?
  
  ok(freeze($expected) eq freeze($actual));
  
  I bring this up because using object addresses as IDs in the
  serialization entirely prevents this usage.
 
 Good. Having waded through one of the threads on p5p just a minute
 ago, I'm not willing to guarantee this. In fact, I'm willing to 
 explicitly not guarantee this. If we want to compare two frozen
 structures, string equality is *not* the way to go.
 
 Each freeze could, theoretically, choose a different freezing method
 yet still represent the original.

What do you mean by a different freezing method? That key-value pairs
from two externally identical hashes could be frozen in different
orders? I can see that. Sorting large hashes can be expensive, and
certainly requires memory allocation.

Or do you mean that freeze($objA) and freeze($objB)--freezing
(definitionally identical!) object graphs with no intervening code
between the two calls to freeze--could internally and arbitrarily select
a significantly divergent object graph encoding? I can't see that at
ALL...

Over time (and revisions), I certainly can see a desire not to marry
parrot-freeze to a specific binary representation. That's not the
question I intended to raise--I asked a question only of repeatability,
not of permanent format invariance.


 This is a Good Place for a black-box comparison op, which string
 equality definitely is not.

(At which point do extremely complex routines cease to be operators?)

A black-box comparison of the (live) object graphs, or black-box
comparison of the serializations themselves?

I can see the former--while it's precisely the problem that consistent
traversal output avoids, a deep == could have utility and avoid the
serialization overhead. But how do you implement concurrent traversals
without private seen tables? (e.g., if the graphs overlap, one of the
traversals will find the other traversal's visited flag and fail to
visit the entire graph.) Reentrant traversal again. Pathological case:

$aa = [1];
$ab = [1];
%ha = ( a => \$aa, b => \$ab );
%hb = ( a => \$ab, b => \$aa );
# %ha and %hb overlap and are deep-identical,
# but not deeply reference-identical.
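
The pathological case above can be handled by giving each comparison its own private seen table keyed on *pairs* of nodes, rather than a visited flag stored in the objects themselves (editor's sketch in Python; `id()` stands in for PMC addresses, and `deep_equal` is an invented name, not a proposed parrot API):

```python
def deep_equal(a, b, seen=None):
    """Structural equality with a private, per-comparison seen table of
    (address, address) pairs, so overlapping or cyclic graphs never trip
    over a shared visited flag."""
    if seen is None:
        seen = set()                    # private to this traversal: reentrant
    if a is b:
        return True
    if (id(a), id(b)) in seen:          # already comparing this pair
        return True
    if type(a) is not type(b):
        return False
    if isinstance(a, list):
        seen.add((id(a), id(b)))
        return len(a) == len(b) and all(
            deep_equal(x, y, seen) for x, y in zip(a, b))
    if isinstance(a, dict):
        seen.add((id(a), id(b)))
        return (a.keys() == b.keys() and
                all(deep_equal(a[k], b[k], seen) for k in a))
    return a == b

aa, ab = [1], [1]
ha = {"a": aa, "b": ab}
hb = {"a": ab, "b": aa}                 # overlapping, deep-identical graphs
assert deep_equal(ha, hb)
assert ha["a"] is not hb["a"]           # but not reference-identical
```

Because the seen set lives on the comparison's own stack, two threads can run such comparisons over overlapping graphs concurrently.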

Comparing serialized object graphs strikes me as tremendously esoteric,
e.g. a maintenance burden to be used by very few clients. (One of the
few significant uses being that of replacing string-equals for the
testing of the serializer itself.) It also strikes me as very, very,
very complicated should the freeze methods diverge even slightly. I've
never seen any such mechanism in any other environment. To compare
graphs that I had saved in long-term storage, I as a caller would expect
to need to deserialize the graphs and use a deep equals--and I would
expect to implement deep equals (another feature I've never before seen
as a specific feature of any environment) by serializing the
deserialized graphs again and doing a string (or file) comparison. E.g.,
if I had read $a or $b from a file, I would expect to have to:

freeze(thaw($a)) eq freeze(thaw($b))

instead of:

$a eq $b

That synthesizes the functionality from a minimal set of
operations--freeze and thaw--and with a lot less maintenance for parrot.
(At the cost of a bit of memory and runtime speed. But anybody who
performs deep-compares in truly performance-critical code needs to meet
my Clue Bat[tm].)
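
The round-trip normalization can be sketched concretely (editor's illustration; JSON is only a stand-in for parrot's freeze format, and the "two divergent encodings" here are just two formattings of the same data): re-freezing after a thaw collapses encoding differences, so a plain string compare works again.

```python
import json

def freeze(graph):
    """Stand-in for freeze: one canonical serialization of a graph."""
    return json.dumps(graph, sort_keys=True)

def thaw(s):
    """Stand-in for thaw: reconstitute the graph."""
    return json.loads(s)

# Two serializations of the same graph that differ only in encoding:
a = json.dumps({"x": 1, "y": [2, 3]}, indent=2)   # one "freeze method"
b = json.dumps({"y": [2, 3], "x": 1})             # another
assert a != b                                     # naive string-equals fails
assert freeze(thaw(a)) == freeze(thaw(b))         # round-trip normalizes
```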

--
 
Gordon Henriksen
IT Manager
ICLUBcentral Inc.
[EMAIL PROTECTED]




RE: [RfC] vtable-dump

2003-09-10 Thread Dan Sugalski
On Tue, 9 Sep 2003, Gordon Henriksen wrote:

 Dan Sugalski [EMAIL PROTECTED] wrote:
 
  On Tue, 9 Sep 2003, Gordon Henriksen wrote:
  
   Random thought
   
   There's some discussion on perl-qa right now about how Test::More
   should implement is_deeply, which executes a code block and tests
   that the return value is equivalent to a particular nested data 
   structure. (The question posed on that list is with regard to how to
   handle tie()'d objects, which is not what I'm addressing here.)
   Regardless of that discussion's outcome, should the serialization API
   being discussed here enable the following behavior?
   
   ok(freeze($expected) eq freeze($actual));
   
   I bring this up because using object addresses as IDs in the
   serialization entirely prevents this usage.
  
  Good. Having waded through one of the threads on p5p just a minute
  ago, I'm not willing to guarantee this. In fact, I'm willing to 
  explicitly not guarantee this. If we want to compare two frozen
  structures, string equality is *not* the way to go.
  
  Each freeze could, theoretically, choose a different freezing method
  yet still represent the original.
 
 What do you mean by a different freezing method? That key-value pairs
 from two externally identical hashes could be frozen in different
 orders? I can see that. Sorting large hashes can be expensive, and
 certainly requires memory allocation.

That's possible, yes. We may well decide to have each hash have a separate 
randomness seed. While a bit excessive, it's not too unreasonable.
 
 Or do you mean that freeze($objA) and freeze($objB)--freezing
 (definitionally identical!) object graphs with no intervening code
 between the two calls to freeze--could internally and arbitrarily select
 a significantly divergent object graph encoding? I can't see that at
 ALL...

This is, in fact, what I mean. (Well, actually, what I mean is that 
freeze($a) and freeze($a), with no intervening changes at all to $a, may 
produce different freeze representations)

There are two reasons for this:

1) We make no promises on the format for the freeze, just that it will be 
able to reconstitute to the original structure. That means that the system 
may, if it chooses, use one of several formats. The freeze format can be 
chosen for any number of reasons including current free memory, phase of 
the moon, or sheer random chance.

2) We may choose to use internal characteristics, including addresses of
immovable structures, as unique identifiers for the freezing.

#2 gets relatively minor changes between two otherwise functionally 
identical graphs, but nonetheless significant enough to make string 
equality testing infeasible.

#1 means that the first freeze may give you the graph in the internal 
binary format and the second freeze gives you the graph in YAML. (And the 
third potentially in XML)
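
Reason #1 can be sketched as a format-tagged freeze (editor's illustration; the one-byte tag scheme and both encoders are invented, with JSON stand-ins for the binary/YAML formats Dan mentions): thaw dispatches on the tag, so two freezes of the same graph may legitimately produce different bytes yet reconstitute identically.

```python
import json

# Hypothetical format registry: tag byte -> encoder.
FORMATS = {
    "J": lambda g: json.dumps(g),             # stand-in for "internal binary"
    "P": lambda g: json.dumps(g, indent=1),   # stand-in for "YAML"
}

def freeze(graph, fmt):
    """Freeze using whichever format the system picked this time."""
    return fmt + FORMATS[fmt](graph)

def thaw(blob):
    """Dispatch on the format tag, then reconstitute."""
    tag, body = blob[0], blob[1:]
    assert tag in FORMATS
    return json.loads(body)

g = {"a": [1, 2]}
f1, f2 = freeze(g, "J"), freeze(g, "P")
assert f1 != f2                    # string equality fails between formats...
assert thaw(f1) == thaw(f2)        # ...but both reconstitute the original
```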
 
 Over time (and revisions), I certainly can see a desire not to marry
 parrot-freeze to a specific binary representation. That's not the
 question I intended to raise--I asked a question only of repeatability,
 not of permanent format invariance.

I don't want to make any guarantees of repeatability. Past history 
suggests that when behaviour isn't guaranteed it's best to actively 
randomize the behaviour if there's no significant penalty to doing so, as 
otherwise people will count on the non-promised behaviour to the point 
where the installed base makes it infeasible to change when the future 
rolls around.

  This is a Good Place for a black-box comparison op, which string
  equality definitely is not.
 
 (At which point do extremely complex routines cease to be operators?)
 
 A black-box comparison of the (live) object graphs, or black-box
 comparison of the serializations themselves?

Both, though I was thinking the latter, which is significantly more 
useful.

 Comparing serialized object graphs strikes me as tremendously esoteric,
 e.g. a maintenance burden to be used by very few clients.

It actually makes things potentially much simpler, if you consider 
comparison of live objects for equivalence a special case of this.

 (One of the
 few significant uses being that of replacing string-equals for the
 testing of the serializer itself.) It also strikes me as very, very,
 very complicated should the freeze methods diverge even slightly. I've
 never seen any such mechanism in any other environment.

Which is fine. I suppose we get to do at least one new thing with Parrot, 
though I expect this isn't anywhere near new. It's also going to be 
necessary to handle duplicate detection when getting objects over the 
network or from long-term storage.

 To compare
 graphs that I had saved in long-term storage, I as a caller would expect
 to need to deserialize the graphs and use a deep equals

Why? This one makes no sense to me. If I have two black-box serialized 
representations of objects, I'd expect to *not* have to reconstitute them, 
and would actively not want to do so, since 

RE: [RfC] vtable-dump

2003-09-09 Thread Gordon Henriksen
Random thought

There's some discussion on perl-qa right now about how Test::More should
implement is_deeply, which executes a code block and tests that the
return value is equivalent to a particular nested data structure. (The
question posed on that list is with regard to how to handle tie()'d
objects, which is not what I'm addressing here.) Regardless of that
discussion's outcome, should the serialization API being discussed here
enable the following behavior?

ok(freeze($expected) eq freeze($actual));

I bring this up because using object addresses as IDs in the
serialization entirely prevents this usage.

--
 
Gordon Henriksen
IT Manager
ICLUBcentral Inc.
[EMAIL PROTECTED]




RE: [RfC] vtable-dump

2003-09-09 Thread Dan Sugalski
On Tue, 9 Sep 2003, Gordon Henriksen wrote:

 Random thought
 
 There's some discussion on perl-qa right now about how Test::More should
 implement is_deeply, which executes a code block and tests that the
 return value is equivalent to a particular nested data structure. (The
 question posed on that list is with regard to how to handle tie()'d
 objects, which is not what I'm addressing here.) Regardless of that
 discussion's outcome, should the serialization API being discussed here
 enable the following behavior?
 
 ok(freeze($expected) eq freeze($actual));
 
 I bring this up because using object addresses as IDs in the
 serialization entirely prevents this usage.

Good. Having waded through one of the threads on p5p just a minute ago,
I'm not willing to guarantee this. In fact, I'm willing to explicitly not
guarantee this. If we want to compare two frozen structures, string
equality is *not* the way to go. Each freeze could, theoretically, choose
a different freezing method yet still represent the original.

This is a Good Place for a black-box comparison op, which string equality 
definitely is not.

Dan



RE: [RfC] vtable-dump

2003-09-09 Thread Garrett Goebel
Nicholas Clark wrote:
 Garrett Goebel wrote:
  Leopold Toetsch wrote:
   Garrett Goebel [EMAIL PROTECTED] wrote:
   
As I was googling to see how other people have approached 
these problems I ran across:
   
http://blogs.gotdotnet.com/cbrumme/CategoryView.aspx/CLR
   
   What did you find there, besides a description that a certain
   company is producing b0rken software ;-)
  
  The thoughts of a CLR developer on implementation issues: 
  mistakes made, lessons learned, nasty things to keep in mind,
  etc. Discussions of lock-free thread-safe coding issues,
  balancing the CLR's memory model with cpu memory models, GC,
  finalization, exceptions, MP, NUMA, JIT, security policies,
  managed code, typing, shared vtables, marshaling, serialization.
 
 But not hugely indexed and with a lot of Win32 specific detail
 (Although there are section headers). For example the term
 lock-free seems to occur three times, and none seem to be
 discussions on how to go about writing lock-free code, merely
 references to it. (Or did you mean that we should read some 
 of the comments too?) Are there specific permalinks of blog
 entries that discuss individual items in your list above, to
 save us all trying to wade through masses of text?

The comments do hone in on some of the issues... but no, I can't give
anything more specific than that the last 2 entries looked to cover many of
the same issues that have been discussed recently on this list.

If you're looking for something specifically on threading issues, perhaps a
doctoral thesis on Implementation Issues in Concurrent Programming
Languages: www.cs.usfca.edu/benson/diss.ps?

--
Garrett Goebel
IS Development Specialist

ScriptPro   Direct: 913.403.5261
5828 Reeds Road   Main: 913.384.1008
Mission, KS 66202  Fax: 913.384.2180
www.scriptpro.com  garrett at scriptpro dot com


RE: [RfC] vtable-dump

2003-09-09 Thread Garrett Goebel
Garrett Goebel wrote:
 Nicholas Clark wrote:
  Garrett Goebel wrote:
   Leopold Toetsch wrote:
Garrett Goebel [EMAIL PROTECTED] wrote:

 As I was googling to see how other people have approached 
 these problems I ran across:

 http://blogs.gotdotnet.com/cbrumme/CategoryView.aspx/CLR

What did you find there, besides a description that a certain
company is producing b0rken software ;-)
   
   The thoughts of a CLR developer on implementation issues: 
   mistakes made, lessons learned, nasty things to keep in mind,
   etc. Discussions of lock-free thread-safe coding issues,
   balancing the CLR's memory model with cpu memory models, GC,
   finalization, exceptions, MP, NUMA, JIT, security policies,
   managed code, typing, shared vtables, marshaling, serialization.
  
  But not hugely indexed and with a lot of Win32 specific detail
  (Although there are section headers). For example the term
  lock-free seems to occur three times, and none seem to be
  discussions on how to go about writing lock-free code, merely
  references to it. (Or did you mean that we should read some 
  of the comments too?) Are there specific permalinks of blog
  entries that discuss individual items in your list above, to
  save us all trying to wade through masses of text?
 
 The comments do hone in on some of the issues... but no, I can't
 give anything more specific than that the last 2 entries looked
 to cover many of the same issues that have been discussed
 recently on this list.

Sorry, I was looking at: http://blogs.gotdotnet.com/cbrumme/ which truncates
old blogs...

App Domains
http://blogs.gotdotnet.com/cbrumme/PermaLink.aspx/56dd7611-a199-4a1f-adae-6fac4019f11b

Memory Model
http://blogs.gotdotnet.com/cbrumme/PermaLink.aspx/480d3a6d-1aa8-4694-96db-c69f01d7ff2b

--
Garrett Goebel
IS Development Specialist

ScriptPro   Direct: 913.403.5261
5828 Reeds Road   Main: 913.384.1008
Mission, KS 66202  Fax: 913.384.2180
www.scriptpro.com  garrett at scriptpro dot com


Re: [RfC] vtable-dump

2003-09-08 Thread Steve Fink
On Sep-08, Leopold Toetsch wrote:
 Nicholas Clark [EMAIL PROTECTED] wrote:
  On Fri, Sep 05, 2003 at 09:34:04AM -0400, Dan Sugalski wrote:
  On Fri, 5 Sep 2003, Peter Haworth wrote:
 
   With the seen hash approach, I wouldn't expect the hash itself to take
   nearly as much space as the frozen structure;
 
  My experience tracing variable graphs in Perl 5 indicates otherwise.
 
  I think it may be worth listening to Dan on this one.
 
 I don't know why Storable.xs is using a general HV for mapping an
 address to an integer (more isn't needed to serialize all IMHO - Dan's
 next_for_GC approach doesn't provide more info). Perl5 does convert the
 address to a string and uses all weirdness of hv.c code. This isn't
 necessary for this special kind of problem.
 
 Parrot has some DOD counters for possibly active PMCs. A simple and
 probably fixed-size[1] hash based on such a count with just a mapping of
 {pmc => id} takes 12 bytes per entry on 32-bit architectures and it's
 really fast.

It's late so I'm probably being an idiot, but is all that is needed
(1) a unique id for every pmc address, and (2) whether or not the PMC
has been visited yet? If so, why not use the address itself for the id
(or some variant, like index of the pmc in its arena, added to/or-ed
with some base id for the arena itself)? And for the flag, have arenas
preallocate a bit vector of seen flags for all their PMCs? (Or if you
have the arena base ids, then you can have a single bit vector for all
arenas.)

Or is there no simple way of going from a PMC address to its arena?

Perhaps it's no win to pollute the cache with chunks of the bit vector
instead of the actual referenced PMC flagpoles. But if that's an issue
then I can't help but think that most of the time the seen bit vector
would be quite sparse, so we could go to a multi-level scheme: use a
conservative vector that can answer "definitely not seen" vs "maybe seen",
and then look at the actual PMC flagpole (or, again, a separate bit
vector if it somehow helps multithreading) for the true answer. An
eighth of a bit per entry ought to be small enough, no?
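
The multi-level scheme can be sketched like so (editor's illustration; the class, the table sizes, and the address hashing are all invented, with a plain set standing in for the per-PMC flags): a coarse bitmap where several objects share one bit answers "definitely not seen" cheaply, and only a "maybe seen" hit pays for the precise lookup.

```python
class TwoLevelSeen:
    """Conservative filter: one bit shared by 8 object ids answers
    'definitely not seen' vs 'maybe seen'; the precise per-object
    store is consulted only on 'maybe'."""
    def __init__(self):
        self.coarse = bytearray(1 << 16)    # 64 KB conservative bitmap
        self.exact = set()                  # stand-in for per-PMC flags

    def _bucket(self, addr):
        h = (addr >> 3) & ((1 << 19) - 1)   # 8 neighbouring ids share a bit
        return h >> 3, 1 << (h & 7)

    def mark(self, addr):
        byte, bit = self._bucket(addr)
        self.coarse[byte] |= bit
        self.exact.add(addr)

    def seen(self, addr):
        byte, bit = self._bucket(addr)
        if not (self.coarse[byte] & bit):   # fast path: definitely not seen
            return False
        return addr in self.exact           # maybe seen: precise check
```

The fast path touches only the compact bitmap, keeping cache pollution low for the common not-yet-seen case.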

I suppose I ought to pay more attention to the conversation (and the
current code) before butting in with suggestions. I've a feeling I'm
solving the wrong problem.


Re: [RfC] vtable-dump

2003-09-08 Thread Gordon Henriksen
On Sunday, September 7, 2003, at 06:21 , Nicholas Clark wrote:

On Fri, Sep 05, 2003 at 09:34:04AM -0400, Dan Sugalski wrote:

On Fri, 5 Sep 2003, Peter Haworth wrote:

With the seen hash approach, I wouldn't expect the hash itself to 
take nearly as much space as the frozen structure;
My experience tracing variable graphs in Perl 5 indicates otherwise.
I think it may be worth listening to Dan on this one.

Appended patch to Storable implements an option to count the size of 
the seen hashes. (There seem to be two, one for objects, one for 
classes) It's disabled by default - compile Storable with -DDEBUGME or 
change the #if 0 at line 23 of Storable.xs to enable. You'll also need 
Devel::Size installed.

Taking a relatively complex structure, such as the POSIX exports:

$ /usr/local/bin/perl5.8.0 -Mblib -MStorable -MPOSIX -MDevel::Size -le 
'package POSIX; foreach (\@EXPORT, \@EXPORT_OK, \%EXPORT_TAGS) { print 
Devel::Size::total_size $_; print length Storable::freeze $_; print 
$Storable::SEENSIZE; print }'

reformatted, this gives:

                size in memory   size frozen   seen hash size
@EXPORT         22680            5477          19068
@EXPORT_OK      1943             422           1864
%EXPORT_TAGS    23686            5963          20472

Not very surprised by this result at all. Hash table entries are going 
to be roughly comparable in size to the original object graph, since so 
many objects in Perl are really quite tiny. The issues with the 
alternatives are stunningly painful, though, and they've been hashed out 
already. I'm less concerned by transient memory consumption than 
by the sabotage of threading.

Perhaps it would make sense to use a specialized data structure if 
memory consumption is really at the top of anybody's list of concerns. 
Remember the hash tables without linked lists chained off of them from 
undergrad CS? They're much more compact (literally a single table in 
memory), and most of their flukes are avoidable when entries can't be 
removed.
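
That compact structure is open addressing (editor's sketch; the class name, fixed capacity, and (id, first-visit) return convention are invented for illustration): a single flat array of (address, id) entries probed linearly, with no chain nodes allocated per entry, and no tombstone problems because seen entries are never removed.

```python
class OpenSeenTable:
    """Open-addressed (linear-probe) address -> id table: one flat array,
    no per-entry chain nodes. Entries are never removed, so the classic
    deletion quirks of open addressing don't arise."""
    def __init__(self, capacity=1024):
        # Capacity must exceed the number of entries; a real version
        # would grow and rehash when it fills up.
        self.slots = [None] * capacity      # (address, id) pairs
        self.count = 0

    def id_for(self, addr):
        """Return (id, first_visit) for addr, inserting if new."""
        n = len(self.slots)
        i = addr % n
        while True:
            slot = self.slots[i]
            if slot is None:                # empty slot: first visit
                self.slots[i] = (addr, self.count)
                self.count += 1
                return self.slots[i][1], True
            if slot[0] == addr:             # already seen
                return slot[1], False
            i = (i + 1) % n                 # linear probing on collision
```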



Gordon Henriksen
[EMAIL PROTECTED]


Re: [RfC] vtable-dump

2003-09-08 Thread Gordon Henriksen
On Sunday, September 7, 2003, at 01:41 , Dan Sugalski wrote:

At 8:56 PM -0600 9/4/03, Luke Palmer wrote:

Gordon Henriksen writes:

 What you're suggesting also has significant side-effects: It halts  
hypothetical multithreaded programs, suspends DoD, prevents the 
traversal mechanism from calling back into parrot code, requires 
fine-grained locks which are extremely expensive and have been 
summarily abandoned to great acclaim in all prior works... and for 
that, still doesn't provide a useful form of thread safety in the 
general case anyhow.

Is there a problem with providing a mechanism which would suspend all 
threads except for the current one, as to ensure that the serialize 
operates, er, serially.  Could it be implemented with a top-priority 
event post that acquires a global lock?

The biggest problem with a lock of this sort is that it requires all 
the interpreters to actually *check* it to be meaningful. (Not that 
there aren't other problems--there are, quite a number of them--but 
this is the big one) Locks nobody gets aren't useful, and for this one 
to be useful we'd need to have some assurance that the other threads 
were looking at it  regularly, and that those threads were all paused 
when we wanted to do our stuff.

You end up with either long delays waiting, or a lot of contention on a 
single global lock. Neither is particularly good.

I think Luke wasn't suggesting a global lock, but rather suggesting that 
the competing threads be suspended using operating system facilities to 
temporarily prevent them from being scheduled.



Gordon Henriksen
[EMAIL PROTECTED]


Re: [RfC] vtable-dump

2003-09-08 Thread Leopold Toetsch
Steve Fink [EMAIL PROTECTED] wrote:
 On Sep-08, Leopold Toetsch wrote:

 Parrot has some DOD counters for possibly active PMCs. A simple and
 probably fixed-size[1] hash based on such a count with just a mapping of
 {pmc => id} takes 12 bytes per entry on 32-bit architectures and it's
 really fast.

 It's late so I'm probably being an idiot, but is all that is needed
 (1) a unique id for every pmc address, and (2) whether or not the PMC
 has been visited yet? If so, why not use the address itself for the id

Yep. The address can serve as the ID (it would with Dan's scheme).

 Or is there no simple way of going from a PMC address to its arena?

With ARENA_DOD_FLAGS enabled: GET_ARENA(pmc). When it's disabled it would
need arena back pointers or a lookup like contained_in_pool(). Both
methods can generate an index usable for indexing into the bit vector.

But I wouldn't want to use arena memory for the bit vector. This leads
to the same multi threading issues, as Dan's scheme has.

A bit vector instead of the hash allocated on the heap sounds very
promising and space efficient.

 Perhaps it's no win to pollute the cache with chunks of the bit vector
 instead of the actual referenced PMC flagpoles.

clone()/freeze()/dump() pull in the whole PMC. DOD doesn't for the fast
case of just setting the live flag on a scalar. But ...

 ...  An
 eighth of a bit per entry ought to be small enough, no?

Yep. Think so.

 I suppose I ought to pay more attention to the conversation (and the
 current code) before butting in with suggestions. I've a feeling I'm
 solving the wrong problem.

No. The idea is fine and would work.

leo


Re: [RfC] vtable-dump

2003-09-08 Thread Gordon Henriksen
On Monday, September 8, 2003, at 03:34 , Steve Fink wrote:

On Sep-08, Leopold Toetsch wrote:

I don't know why Storable.xs is using a general HV for mapping an 
address to an integer (more isn't needed to serialize all IMHO - Dan's 
next_for_GC approach doesn't provide more info). Perl5 does convert 
the address to a string and uses all weirdness of hv.c code. This 
isn't necessary for this special kind of problem.

Parrot has some DOD counters for possibly active PMCs. A simple and 
probably fixed-size[1] hash based on such a count with just a mapping 
of {pmc => id} takes 12 bytes per entry on 32-bit architectures and 
it's really fast.

It's late so I'm probably being an idiot, but is all that is needed (1) 
a unique id for every pmc address, and (2) whether or not the PMC has 
been visited yet? If so, why not use the address itself for the id (or 
some variant, like index of the pmc in its arena, added to/or-ed with 
some base id for the arena itself)?

A down side to using the address as a key is that it would cause even 
the simplest serialization to have nondeterministic (or at least 
unpredictable) output, which certainly makes testing more difficult.
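
One common way around that nondeterminism (editor's sketch; `assign_ids` is an invented name, with Python `id()` standing in for PMC addresses): use the address only as a private lookup key, but emit ids in visitation order, so the same graph always serializes the same way run to run.

```python
def assign_ids(roots):
    """Assign ids in traversal order instead of by memory address, so the
    same graph yields the same ids on every run."""
    ids = {}                        # private seen table: address -> small int
    order = []

    def visit(obj):
        if not isinstance(obj, list):
            return                  # only containers get ids in this sketch
        key = id(obj)               # address used only as a lookup key...
        if key in ids:
            return
        ids[key] = len(ids)         # ...the emitted id is the visit counter
        order.append(ids[key])
        for child in obj:
            visit(child)

    for r in roots:
        visit(r)
    return order

shared = [1, 2]
g = [shared, [shared]]              # shared node is visited only once
assert assign_ids([g]) == [0, 1, 2]
```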

And for the flag, have arenas preallocate a bit vector of seen flags 
for all their PMCs? (Or if you have the arena base ids, then you can 
have a single bit vector for all arenas.)

Consider concurrent serialization from multiple threads. The seen hash 
(table, whatever) is local to the serialization, and thus thread-safe. 
The programmer can guarantee that the traversed structures are not 
concurrently updated; the runtime flatly cannot. Any global structure is 
inherently not thread-safe. Furthermore, global structures are 
persistently extant, and must be paid for continually. Most programs 
will never serialize even one object graph, much less spend any 
significant portion of their CPU time in serialization.



Gordon Henriksen
[EMAIL PROTECTED]


Re: [RfC] vtable-dump

2003-09-07 Thread Dan Sugalski
At 8:56 PM -0600 9/4/03, Luke Palmer wrote:
Gordon Henriksen writes:
 What you're suggesting also has significant side-effects: It halts
 hypothetical multithreaded programs, suspends DoD, prevents the
 traversal mechanism from calling back into parrot code, requires
 fine-grained locks which are extremely expensive and have been summarily
 abandoned to great acclaim in all prior works... and for that, still
 doesn't provide a useful form of thread safety in the general case
 anyhow.
Is there a problem with providing a mechanism which would suspend all
threads except for the current one, as to ensure that the serialize
operates, er, serially.  Could it be implemented with a top-priority
event post that acquires a global lock?

The biggest problem with a lock of this sort is that it requires all 
the interpreters to actually *check* it to be meaningful. (Not that 
there aren't other problems--there are, quite a number of them--but 
this is the big one) Locks nobody gets aren't useful, and for this 
one to be useful we'd need to have some assurance that the other 
threads were looking at it  regularly, and that those threads were 
all paused when we wanted to do our stuff.

You end up with either long delays waiting, or a lot of contention on 
a single global lock. Neither is particularly good.
--
Dan

--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Re: [RfC] vtable-dump

2003-09-07 Thread Leopold Toetsch
Nicholas Clark [EMAIL PROTECTED] wrote:
 On Fri, Sep 05, 2003 at 09:34:04AM -0400, Dan Sugalski wrote:
 On Fri, 5 Sep 2003, Peter Haworth wrote:

  With the seen hash approach, I wouldn't expect the hash itself to take
  nearly as much space as the frozen structure;

 My experience tracing variable graphs in Perl 5 indicates otherwise.

 I think it may be worth listening to Dan on this one.

I don't know why Storable.xs is using a general HV for mapping an
address to an integer (more isn't needed to serialize all IMHO - Dan's
next_for_GC approach doesn't provide more info). Perl5 does convert the
address to a string and uses all weirdness of hv.c code. This isn't
necessary for this special kind of problem.

Parrot has some DOD counters for possibly active PMCs. A simple and
probably fixed-size[1] hash based on such a count with just a mapping of
{pmc => id} takes 12 bytes per entry on 32-bit architectures and it's
really fast.

[1] no DOD interference here: just malloc'ed.

leo


Re: [RfC] vtable-dump

2003-09-07 Thread Nicholas Clark
On Fri, Sep 05, 2003 at 09:34:04AM -0400, Dan Sugalski wrote:
 On Fri, 5 Sep 2003, Peter Haworth wrote:

  With the seen hash approach, I wouldn't expect the hash itself to take
  nearly as much space as the frozen structure; 
 
 My experience tracing variable graphs in Perl 5 indicates otherwise. 

I think it may be worth listening to Dan on this one.

Appended patch to Storable implements an option to count the size of
the seen hashes. (There seem to be two, one for objects, one for classes)
It's disabled by default - compile Storable with -DDEBUGME or change the
#if 0 at line 23 of Storable.xs to enable. You'll also need Devel::Size
installed.

Taking a relatively complex structure, such as the POSIX exports:

$ /usr/local/bin/perl5.8.0 -Mblib -MStorable -MPOSIX -MDevel::Size -le 'package POSIX; 
foreach (\@EXPORT, \@EXPORT_OK, \%EXPORT_TAGS) { print 
Devel::Size::total_size $_; print length Storable::freeze $_; print 
$Storable::SEENSIZE; print }'

reformatted, this gives:

                size in memory   size frozen   seen hash size
@EXPORT         22680            5477          19068
@EXPORT_OK      1943             422           1864
%EXPORT_TAGS    23686            5963          20472


So it seems that the seen hash is about 3 to 4 times the size of the frozen
object, and comparable in size to the original structure. I was surprised
by these results - I had taken the common sense view that this hash
would be small.

I'm wondering if a documented version of this patch should go into
Storable. It could be useful for experimenting.

Nicholas Clark

--- Storable.pm.orig2003-07-29 08:06:35.0 +0100
+++ Storable.pm 2003-09-07 22:53:24.0 +0100
@@ -361,6 +361,19 @@ sub thaw {
return $self;
 }
 
+# We take advantage of autoload - this reference to SEENSIZE won't get noticed
+# by perl unless we're called, which doesn't happen unless the user has
+# already made reference to the variable.
+sub _add_seensize {
+  undef $Storable::SEENSIZE;
+  eval {
+require Devel::Size;
+foreach (@_) {
+  $SEENSIZE += Devel::Size::size ($_) if defined $_;
+}
+  };
+  warn $@ if $@;
+}
 1;
 __END__
 
--- Storable.xs.orig2003-09-05 19:42:41.0 +0100
+++ Storable.xs 2003-09-07 23:05:28.0 +0100
@@ -1181,10 +1181,6 @@ static void init_store_context(
 *
 * It is reported fixed in 5.005, hence the #if.
 */
-#if PERL_VERSION >= 5
-#define HBUCKETS   4096    /* Buckets for %hseen */
-   HvMAX(cxt->hseen) = HBUCKETS - 1;   /* keys %hseen = $HBUCKETS; */
-#endif
 
/*
 * The `hclass' hash uses the same settings as `hseen' above, but it is
@@ -1197,7 +1193,18 @@ static void init_store_context(
 cxt->hclass = newHV();  /* Where seen classnames are stored */
 
 #if PERL_VERSION >= 5
-   HvMAX(cxt->hclass) = HBUCKETS - 1;  /* keys %hclass = $HBUCKETS; */
+#ifdef DEBUGME
+   /* Only pre-stretch the buckets if we're not interested in sizes.
+  You may want to tweak this to still stretch the buckets even
+  if we are interested in size.  */
+   if (!perl_get_sv("Storable::SEENSIZE", FALSE))
+#endif
+ {
+#define HBUCKETS   4096    /* Buckets for %hseen */
+
+   HvMAX(cxt->hseen) = HBUCKETS - 1;   /* keys %hseen = $HBUCKETS; */
+   HvMAX(cxt->hclass) = HBUCKETS - 1;  /* keys %hclass = $HBUCKETS; */
+ }
 #endif
 
/*
@@ -1250,6 +1257,28 @@ static void clean_store_context(stcxt_t 
HeVAL(he) = PL_sv_undef;
}
 
+#ifdef DEBUGME
+   if (perl_get_sv("Storable::SEENSIZE", FALSE)) {
+ /* User would like a count of the seen size.  */
+ dSP;
+
+ ENTER;
+ SAVETMPS;
+
+ PUSHMARK(SP);
+ XPUSHs(cxt->hseen ? sv_2mortal(newRV_inc((SV *)cxt->hseen))
+: PL_sv_undef);
+ XPUSHs(cxt->hclass ? sv_2mortal(newRV_inc((SV *)cxt->hclass))
+  : PL_sv_undef);
+ PUTBACK;
+
+ call_pv("Storable::_add_seensize", G_DISCARD);
+
+ FREETMPS;
+ LEAVE;
+   }
+#endif
+
/*
 * And now dispose of them...
 *


Re: [RfC] vtable-dump

2003-09-06 Thread Nicholas Clark
On Thu, Sep 04, 2003 at 08:43:50AM -0500, Garrett Goebel wrote:
 Leopold Toetsch wrote:
  Garrett Goebel [EMAIL PROTECTED] wrote:
  
   As I was googling to see how other people have approached 
   these problems I ran across:
  
   http://blogs.gotdotnet.com/cbrumme/CategoryView.aspx/CLR
  
  What did you find there, besides a description that a certain
  company is producing b0rken software ;-)
 
 The thoughts of a CLR developer on implementation issues: mistakes made,
 lessons learned, nasty things to keep in mind, etc. Discussions of lock-free
 thread-safe coding issues, balancing the CLR's memory model with cpu memory
 models, GC, finalization, exceptions, MP, NUMA, JIT, security policies,
 managed code, typing, shared vtables, marshaling, serialization.

But it's not hugely indexed, and has a lot of Win32-specific detail
(although there are section headers).
For example, the term "lock-free" seems to occur three times, and none
of the occurrences seem to be discussions of how to go about writing
lock-free code, merely references to it. (Or did you mean that we should
read some of the comments too?) Are there specific permalinks of blog
entries that discuss individual items in your list above, to save us all
trying to wade through masses of text?

Nicholas Clark


Re: [RfC] vtable-dump

2003-09-05 Thread Gordon Henriksen
On Thursday, September 4, 2003, at 10:56 , Luke Palmer wrote:

Is there a problem with providing a mechanism which would suspend all 
threads except for the current one, as to ensure that the serialize 
operates, er, serially.  Could it be implemented with a top-priority 
event post that acquires a global lock?
Problem? Well, it's antithetical to threading...

Imagine a massively multithreaded server program. Stopping the entire 
thing to so much as dump(undef)? The overhead of halting the threads 
in predictable states far exceeds the cost of the serialization itself. 
A server which ever availed itself of dump would become unable to 
saturate a multiprocessor system, and most likely unable to take 
advantage of a uniprocessor as well.

To serialize the entire interpreter (including thread state & 
continuations), entering exclusive execution by halting all threads is 
probably necessary. In any other case? Ill-advised...



Gordon Henriksen
[EMAIL PROTECTED]


RE: [RfC] vtable-dump

2003-09-05 Thread Peter Haworth
On Thu, 4 Sep 2003 09:02:18 -0400 (EDT), Dan Sugalski wrote:
 On Wed, 3 Sep 2003, Gordon Henriksen wrote:
  A seen hash is also threadsafe, can be interrupted by DoD, and is safe
  for recursion. (By threadsafe, I mean that unchanged data structures
  can be safely serialized from multiple threads without ill effect or
  possibility of deadlock.)

 While a seen hash is DOD-interruptable (which, admittedly, the scheme I'm
 preferring isn't, or is with a *lot* of care) it's also slower and
 requires a lot more resources when running.

If you're freezing a large graph of PMCs, that's going to take up a lot of
memory, however you do it (other than streaming to disk). What happens when
you run out, and need to garbage collect? With your scheme, this is a big
problem, since as you say, it's either not possible or needs a lot of care.
Actually, it seems like it would need unfeasible amounts of care to me, such
as recording which PMCs are flagged so the flags can be restored after DOD.
Then where's the benefit over a seen hash?

With the seen hash approach, I wouldn't expect the hash itself to take
nearly as much space as the frozen structure; a GC run in the middle of
freezing should only be a problem if you run out of *unreclaimable* memory.

-- 
Peter Haworth   [EMAIL PROTECTED]
Nothing in the definition of the word `word' says
 that a word has to be in a dictionary to be called one.
-- Anu Garg


RE: [RfC] vtable-dump

2003-09-05 Thread Dan Sugalski
On Fri, 5 Sep 2003, Peter Haworth wrote:

 On Thu, 4 Sep 2003 09:02:18 -0400 (EDT), Dan Sugalski wrote:
  On Wed, 3 Sep 2003, Gordon Henriksen wrote:
   A seen hash is also threadsafe, can be interrupted by DoD, and is safe
   for recursion. (By threadsafe, I mean that unchanged data structures
   can be safely serialized from multiple threads without ill effect or
   possibility of deadlock.)
 
  While a seen hash is DOD-interruptable (which, admittedly, the scheme I'm
  preferring isn't, or is with a *lot* of care) it's also slower and
  requires a lot more resources when running.
 
 If you're freezing a large graph of PMCs, that's going to take up a lot of
 memory, however you do it (other than streaming to disk). What happens when
 you run out, and need to garbage collect? With your scheme, this is a big
 problem, since as you say, it's either not possible or needs a lot of care.

Note that using the pointers in the PMCs means *DOD* must be disabled, not 
*GC*. Doing a GC run to reclaim memory is perfectly fine, though there 
still is the potential to have large amounts of memory tied up in dead but 
not reaped Pobj things.

 With the seen hash approach, I wouldn't expect the hash itself to take
 nearly as much space as the frozen structure; 

My experience tracing variable graphs in Perl 5 indicates otherwise. 

Dan



RE: [RfC] vtable-dump

2003-09-04 Thread Gordon Henriksen
What's a dump stream actually going to look like? E.g.,

$a = [1, 2, 3];
print dump($a);

1: PMC PerlArray
2: 3              # element count
3: PMC PerlInt    # [0]
4: 1              #   value
5: PMC PerlInt    # [1]
6: 2              #   value
7: PMC PerlInt    # [2]
8: 3              #   value

Fine and good. But dump must handle object graphs, which can be
recursive or self-referential or whatnot:

$a = { b => undef };
$b = { a => $a };
$a->{b} = $b;
print dump($a);

1: PMC PerlHash
2: 1              # key-value count
3: PMC PerlString # keys[0]
4: "b"            #   value
5: PMC PerlHash   # values[0]
6: 1              #   key-value count
7: PMC PerlString #   keys[0]
8: "a"            #     value
9: !!!

Panic! Dump needs to refer to line 1 again, or else recurse! How?

Most serializers use a seen hash not only to indicate that an object has
been seen, but also to remember a name or number for that object. If
dump had remembered $a => line 1, it could finish off with something
like...

9: Again 1        #   values[0]

A seen hash is also threadsafe, can be interrupted by DoD, and is safe
for recursion. (By threadsafe, I mean that unchanged data structures
can be safely serialized from multiple threads without ill effect or
possibility of deadlock.)
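A minimal sketch of the seen-hash scheme described above (Python, with invented record shapes -- not the proposed Parrot API). Each container is recorded in the seen hash under the stream position of its first appearance, so a revisit emits a back-reference instead of recursing:

```python
def dump(obj, out=None, seen=None):
    """Serialize nested lists/dicts to a flat record stream, emitting an
    ("Again", n) back-reference instead of recursing into a container
    that has already been seen."""
    out = [] if out is None else out
    seen = {} if seen is None else seen
    if isinstance(obj, (list, dict)):
        if id(obj) in seen:
            out.append(("Again", seen[id(obj)]))   # refer back, don't recurse
            return out
        seen[id(obj)] = len(out)                   # remember stream position
        if isinstance(obj, list):
            out.append(("Array", len(obj)))
            for item in obj:
                dump(item, out, seen)
        else:
            out.append(("Hash", len(obj)))
            for key, value in obj.items():
                dump(key, out, seen)
                dump(value, out, seen)
    else:
        out.append(("Scalar", obj))
    return out

# The mutually referential hashes from the example above:
a = {"b": None}
b = {"a": a}
a["b"] = b
frozen = dump(a)   # final record is a back-reference to $a
```

Running this on the `$a`/`$b` cycle terminates with an `("Again", 0)` record where the naive dump would have recursed forever.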

--
 
Gordon Henriksen
IT Manager
ICLUBcentral Inc.
[EMAIL PROTECTED]




Re: [RfC] vtable-dump

2003-09-04 Thread Leopold Toetsch
Ok, this thread has 45+ entries now, so I'll try to summarize the
proposed PMC traversal schemes.

0) As the subject still applies ;-) we want to be able to dump() or
pretty_print() a PMC (and possibly aggregate members of that). This
turns out to be a special case of a deep traversal over nested PMCs
of any kind, which might contain self-references and the like.

1) There are more such traverse-like operations namely DOD, clone(),
freeze(). thaw() is different, as we run over the frozen byte-stream
and regenerate the PMC structures from that.

2) Proposals for DOD, freeze(), clone(), dump()

2a) Implement clone() & dump() on top of freeze(): clone=freeze+thaw,
dump() is generated in the debugger sub-system

2b) A general traverse() vtable that calls freeze/clone/dump-vtables.
DOD is a separate traversal (mark).[1]

2c) A more general traverse(), which is our current DOD mark()
functionality using next_for_GC.[2]

Comparison                                      2a    2b    2c

clone needs intermediate frozen string[3]       yes   -     -
clone involves 2 steps                          yes   -     -
dump has to know PMC internals[4]               yes   -     -
duplicate recognition overhead                  Hash  Hash  -
Traversals per aggregate for clone/freeze[5]    1     1     3
Thread-safe[6]                                  yes   yes   -
Increases PMC size[7]                           -     -     yes
Adds complexity to DOD[8]                       -     -     yes
Cache friendly[9]                               -     yes   -
Code duplication[10]                            -     -     yes

[1] I consider Benjamin's proposal as some generalization of this for
now and a mixture of 2a) and 2b) (clone=freeze/thaw).

[2] mark() as of Parrot 0.0.6, each PMC gets on the next_for_GC list

[3] e.g. for clone(Array[1]) the intermediate frozen image is
first generated in memory, then scanned again during thaw().

[4] what about dynamically loaded PMCs?

[5] 2c): 1. mark() per PMC, 2. clone() or freeze(), 3. clear
next_for_GC. mark is called inside clone/freeze again, which additionally
kills data cache coherency.

[6] While all schemes aren't thread-safe from user level (e.g.
manually sorting an array containing shared PMCs, while it gets
frozen), 2c) isn't thread-safe at low-level, as the next_for_GC
pointer inside the PMC is used as a duplicate marker. But if a user
changes shared resources, it's a user problem. We only guarantee atomic
updates per PMC (s. P6E p 86f by Dan).

[7] next_for_GC (and others like properties) are in the PMC_EXT
structure now. Plain PMC scalars don't contain other PMCs, so they can
be marked live directly by setting just their live_flag.

[8] Additionally DOD has some shortcuts for marking e.g. a PMC
pointing to another or to an array of PMCs.

[9] 2a): The intermediate frozen image for clone()
2c): 3 passes over aggregates; each PMC gets pulled in during DOD

[10] 2a): mark + freeze
 2b): mark + traverse
 2c): all traverse functions duplicated

This is of course my personal POV. If anything is wrong above, please
correct me, likewise with respect to any missing issues.

My major concerns are the ones raised in footnotes [3], [5], [7], [8],
and [9]. These haven't been commented on or answered at all yet.

Thanks for reading til here ;-)
leo


Re: [RfC] vtable-dump

2003-09-04 Thread Leopold Toetsch
Gordon Henriksen [EMAIL PROTECTED] wrote:

 Panic! Dump needs to refer to line 1 again, or else recurse! How?

Don't Panic! This is the point where either next_for_GC or the seen
hash comes in. The former is/was used to mark (visit) PMCs: if they
have been visited, they have that pointer set; otherwise not. This can
be used during serialization. The latter is what my proposal uses.

 A seen hash is also threadsafe, can be interrupted by DoD, and is safe
 for recursion. (By threadsafe, I mean that unchanged data structures
 can be safely serialized from multiple threads without ill effect or
 possibility of deadlock.)

Yes. That's what I'm saying.

leo


Re: [RfC] vtable-dump

2003-09-04 Thread Leopold Toetsch
Garrett Goebel [EMAIL PROTECTED] wrote:

 As I was googling to see how other people have approached these problems I
 ran across:

 http://blogs.gotdotnet.com/cbrumme/CategoryView.aspx/CLR

What did you find there, besides a description that a certain company
is producing b0rken software ;-)

leo


RE: [RfC] vtable-dump

2003-09-04 Thread Garrett Goebel
Leopold Toetsch wrote:
 Garrett Goebel [EMAIL PROTECTED] wrote:
 
  As I was googling to see how other people have approached 
  these problems I ran across:
 
  http://blogs.gotdotnet.com/cbrumme/CategoryView.aspx/CLR
 
 What did you find there, besides a description that a certain
 company is producing b0rken software ;-)

The thoughts of a CLR developer on implementation issues: mistakes made,
lessons learned, nasty things to keep in mind, etc. Discussions of lock-free
thread-safe coding issues, balancing the CLR's memory model with cpu memory
models, GC, finalization, exceptions, MP, NUMA, JIT, security policies,
managed code, typing, shared vtables, marshaling, serialization.

--
Garrett Goebel
IS Development Specialist

ScriptPro   Direct: 913.403.5261
5828 Reeds Road   Main: 913.384.1008
Mission, KS 66202  Fax: 913.384.2180
www.scriptpro.com  garrett at scriptpro dot com


Re: [RfC] vtable-dump

2003-09-04 Thread Luke Palmer
Gordon Henriksen writes:
 What you're suggesting also has significant side-effects: It halts
 hypothetical multithreaded programs, suspends DoD, prevents the
 traversal mechanism from calling back into parrot code, requires
 fine-grained locks which are extremely expensive and have been summarily
 abandoned to great acclaim in all prior works... and for that, still
 doesn't provide a useful form of thread safety in the general case
 anyhow.

Is there a problem with providing a mechanism which would suspend all
threads except for the "current" one, as to ensure that the serialize
operates, er, serially.  Could it be implemented with a top-priority
event post that acquires a global lock?

Forgive my ignorance.  I'm pretty naïve when it comes to threads.

Luke

RE: [RfC] vtable-dump

2003-09-04 Thread Brent Dax
Luke Palmer:
# Is there a problem with providing a mechanism which would suspend all
# threads except for the current one, as to ensure that the serialize
# operates, er, serially.  Could it be implemented with a top-priority
# event post that acquires a global lock?

What about the thread that draws the GUI progress bar so people don't
think the computer froze up while serializing 42gb of data?  Or worse,
the thread that handles the cancel button?

--Brent Dax [EMAIL PROTECTED]
Perl and Parrot hacker
 
Yeah, and my underwear is flame-retardant--that doesn't mean I'm gonna
set myself on fire to prove it.





Re: [RfC] vtable-dump

2003-09-03 Thread Leopold Toetsch
Dan Sugalski [EMAIL PROTECTED] wrote:
 At 8:38 PM +0200 9/2/03, Leopold Toetsch wrote:

[ next_for_GC ]
 That's easy, then, as it is. :) Besides, from looking at pobj.h, it's
 already there, so I'm not sure what we'd need to do. Perhaps I need
 to resync.

The PMC_EXT structure is only allocated for PMCs that access PMC_data().
Adding properties adds this structure on the fly.

Plain PerlScalars don't have the PMC_EXT structure and therefore no
next_for_GC pointer. As these scalars can't refer to other PMCs, there
is no need to visit them twice in the DOD run. They are marked by
setting their live flag directly. Done.

 ... We don't care
 about the live flag for this, what we care is that the PMC is put on
 the traversal list.

You are right, live is live.

When you have a PerlArray containing 100,000 plain PerlInts, you have
your linked list with 100,001 entries or one entry. You have polluted
caches or not - that's it (I'm speaking of DOD here).

 So you shortcut in the normal course of DOD?

Sure, as described above.

 ... I suppose--I presume
 you've benchmarked to see if testing all the flags of the contained
 PMCs when marking the container is faster than just throwing them on
 the list and checking them when you get to them?

Yes, I did benchmark this. The main problems were data cache misses and
PMC size.  Putting the PMC on the next list pulls the whole structure
into the cache. Just setting the live_flag needs half a byte, which is
located in the PMC's arena.

 ... I can see that being
 the case in some cases, but I'm concerned that it'll consume more
 memory and take more time in some reasonably common cases.

For small sets of PMCs, this is as fast as before. It trades some extra
cycles (calculating the arena for a PMC) against cache pollution. The
faster CPUs get, and the more data you are processing, the more this
scheme wins.

And putting the next_for_GC back into the main PMC structure accounts
for +20% memory overhead for almost all PMCs (I expect the vast majority
to be plain scalars w/o properties).

And I still don't see, how you will handle duplicate lookup of
self-referential structures for freeze and thaw with a linked list of
items alone.

 The same way the DOD sweep handles it, unless you've changed things
 beyond recognition. The way it used to work is that when you marked a
 PMC as live, it got put on the end of the chain unless the PMC's next
 pointer field is non-NULL, in which case it's already been put on the
 list, and nothing happens.

I see ... It used to work that way in 0.0.6. I'm not sure if I changed
that, or whether it happened during the split of resources.c. Anyway, at that
time, we had mark_used() for marking PMCs and buffer_lives() for marking
Buffers.

These are united (my changes ;-), there is only one pobject_lives(),
which of course could put PMCs on the next-list again if we really want
that.
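The chain behaviour Dan describes -- a PMC is appended to the traversal list only if its next pointer is still NULL -- can be sketched like this (Python, with a toy PMC layout; names are invented, not Parrot's):

```python
class PMC:
    """Toy PMC with an intrusive next_for_GC marker (hypothetical layout)."""
    def __init__(self, children=()):
        self.children = list(children)
        self.next_for_GC = None

def mark_chain(root):
    """Build the traversal list: a PMC is appended only if its
    next_for_GC pointer is still None, so shared or self-referential
    PMCs appear exactly once and nothing recurses."""
    chain = [root]
    root.next_for_GC = root        # non-None means "already on the list"
    i = 0
    while i < len(chain):
        for child in chain[i].children:
            if child.next_for_GC is None:
                child.next_for_GC = child
                chain.append(child)
        i += 1
    return chain

shared = PMC()
agg = PMC([shared, shared, PMC([shared])])
chain = mark_chain(agg)            # shared shows up once, not three times
```

This is exactly the duplicate detection being debated: the marker lives inside the PMC, which is what makes the scheme cheap but not safe for two concurrent traversals.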

Ok then back to your proposed scheme for freeze (and with nested and
big structures in mind):

1) call mark() on the PMC, that is one walk through the aggregate
2) call freeze() on the PMC - i.e. a second walk through the
   same aggregate
3) if inside 2) we find a PMC: if it has next_for_GC set, it's a
   duplicate; else call mark() on it. This pulls in
   a potentially big aggregate and pollutes caches.
4) get item from next list, goto 2)

5) reset next_for_GC for all PMCs

This means we have 2 alternating runs through the same aggregates,
interspersed with mark runs through other aggregates. Then we have
another run for all PMCs processed, resetting the next_for_GC pointer.

5) is for free on DOD runs (we walk the arenas anyway to destroy dead
objects). It's not free for freeze et al.

This all is for sure cache unfriendly.

Then: we want a solution for clone(), dump() (or pretty_print()),
freeze(), and thaw(). That means we would need 3 more routines in
aggregates that all do pretty much the same thing: walk through the
aggregate's items and call mark() for PMCs, then do something.
thaw() is different anyway.

One general traverse() vtable is just the simpler interface IMHO.

Running through each aggregate just once plus updating a seen_hash seems
simpler and faster to me and doesn't have any negative impact on PMC
size and DOD speed.
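The single-traversal-plus-callback idea argued for here can be sketched as follows (Python; the callback shape is an illustration of the proposed interface, not Parrot's actual vtable signature):

```python
def traverse(pmc, visit, seen=None):
    """One generic traversal that takes the per-PMC action as a
    callback; freeze, clone and dump would differ only in the callback.
    A seen set (standing in for the seen_hash) suppresses duplicates."""
    seen = set() if seen is None else seen
    if id(pmc) in seen:            # duplicate: already visited
        return
    seen.add(id(pmc))
    visit(pmc)
    if isinstance(pmc, list):      # aggregates traverse their items
        for child in pmc:
            traverse(child, visit, seen)

dumped = []
graph = [1, 2]
graph.append(graph)                # self-referential structure
traverse(graph, dumped.append)     # visits each node once, no infinite loop
```

Because all traversal state lives in local variables rather than in the PMCs themselves, two threads can run this over the same (unchanging) structure without interfering -- the thread-safety property claimed for the seen-hash scheme.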

leo


Re: [RfC] vtable-dump

2003-09-03 Thread Leopold Toetsch
Benjamin Goldberg [EMAIL PROTECTED] wrote:
 Leopold Toetsch wrote:

[ optimize DOD before thaw/clone ]

 Even with a depth restriction, a recursive estimate can produce
 misleading results due to circular references.  Only actually walking
 the structure can get the number right.  However, walking the structure
 *twice* would be silly.  So, begin with an estimate of unknown
 (serialize an integer -1), and then, after the whole thing has been
 frozen, we seek backwards (if that's possible) and replace that
 unknown with the actual number of pmcs that were serialized.

Yep. That's good for thaw. Doing clone by freeze/thaw OTOH doesn't help
here. Neither freezing everything first (yielding an exact count - but
generating the whole intermediate frozen image) nor freeze/thaw per PMC
(with smaller images) does help here.

Maybe we can get some statistics during a DOD run and keep that in an
aggregate's header.
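Benjamin's back-patching idea quoted above can be sketched like this (Python; the byte format is invented for illustration):

```python
import io
import struct

def freeze_with_count(pmcs):
    """Write a -1 placeholder for the PMC count, freeze everything,
    then seek back and patch in the real count. As noted above, this
    only works on a seekable stream."""
    buf = io.BytesIO()
    buf.write(struct.pack("<i", -1))        # count still unknown
    count = 0
    for pmc in pmcs:
        data = repr(pmc).encode()           # stand-in for a frozen PMC
        buf.write(struct.pack("<i", len(data)))
        buf.write(data)
        count += 1
    buf.seek(0)
    buf.write(struct.pack("<i", count))     # back-patch the actual count
    return buf.getvalue()

image = freeze_with_count([1, "two", [3, 4]])
```

The thawing side can then pre-allocate headers from the patched count; as leo notes, that helps thaw but not a freeze/thaw-based clone, which never sees the finished image up front.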

leo


Re: [RfC] vtable-dump

2003-09-03 Thread Peter Haworth
On Wed, 3 Sep 2003 09:24:22 +0200, Leopold Toetsch wrote:
 One general traverse() vtable is just the simpler interface IMHO.

 Running through each aggregate just once plus updating a seen_hash seems
 simpler and faster to me and doesn't have any negative impact on PMC size
 and DOD speed.

And also makes it threadsafe? If a shared PMC is frozen simultaneously by
two threads under Dan's scheme, will the right thing happen? I can't help
being worried about interactions between any two operations which use
flags on the PMCs to mark their progress (such as freezing and DOD, or
two freezes).

Bear in mind that I have never used threads and can't remember how they work
in parrot, so this may be an irrelevant question.

-- 
Peter Haworth   [EMAIL PROTECTED]
You're not going to watch the eclipse in yesterday's underpants?


Re: [RfC] vtable-dump

2003-09-03 Thread Leopold Toetsch
Peter Haworth [EMAIL PROTECTED] wrote:
 On Wed, 3 Sep 2003 09:24:22 +0200, Leopold Toetsch wrote:
 One general traverse() vtable is just the simpler interface IMHO.

 Running through each aggregate just once plus updating a seen_hash seems
 simpler and faster to me and doesn't have any negative impact on PMC size
 and DOD speed.

 And also makes it threadsafe? If a shared PMC is frozen simultaneously by
 two threads under Dan's scheme, will the right thing happen? I can't help
 being worried about interactions between any two operations which use
 flags on the PMCs to makr their progress (such as freezing and DOD, or
 two freezes).

Good question. We don't have threads yet, they might exist somewhere
inside Dan's brain, though.

What I know (or think I know) is:

As in Dan's scheme next_for_GC is inside the PMC and used as flag/id for
freeze, this can't be threadsafe (If one thread has visited a PMC its
next_for_GC will be set, a second thread would write just the PMC id,
because the next_for_GC isn't cleared).

So all these operations (clone too) would have to acquire a global lock
first AFAIK.

For my scheme I don't see *this* kind of problem (all state is kept in
automatic variables in the interpreter). Albeit I don't want to know
what happens when a shared aggregate gets sorted on one side while the
other is freezing it.

leo


Re: [RfC] vtable-dump

2003-09-03 Thread K Stol

- Original Message -
From: Leopold Toetsch [EMAIL PROTECTED]
To: Peter Haworth [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Sent: Wednesday, September 03, 2003 4:38 PM
Subject: Re: [RfC] vtable-dump


 Peter Haworth [EMAIL PROTECTED] wrote:
  On Wed, 3 Sep 2003 09:24:22 +0200, Leopold Toetsch wrote:
  One general traverse() vtable is just the simpler interface IMHO.
 
  Running through each aggregate just once plus updating a seen_hash
seems
  simpler and faster to me and doesn't have any negative impact on PMC
size
  and DOD speed.

  And also makes it threadsafe? If a shared PMC is frozen simultaneously
by
  two threads under Dan's scheme, will the right thing happen? I can't
help
  being worried about interactions between any two operations which use
  flags on the PMCs to mark their progress (such as freezing and DOD, or
  two freezes).

 Good question. We don't have threads yet, they might exist somewhere
 inside Dan's brain, though.

 What I know (or think to know is):

 As in Dan's scheme next_for_GC is inside the PMC and used as flag/id for
 freeze, this can't be threadsafe (If one thread has visited a PMC its
 next_for_GC will be set, a second thread would write just the PMC id,
 because the next_for_GC isn't cleared).

 So all these operations (clone too) would have to acquire a global lock
 first AFAIK.

 For my scheme I don't see *this* kind of problems (all state is kept in
 automatic variables in the interpreter). Albeit I don't like to know
 what happens, when a shared aggregate gets sorted on one side and the
 other is freezing it in the meanwhile.

 leo

just a thought: aren't shared objects/data/whatever protected by a mutex? To
me, that's the obvious solution for accessing shared data. As I said, just a
thought.

Klaas-Jan




Re: [RfC] vtable-dump

2003-09-03 Thread Leopold Toetsch
K Stol [EMAIL PROTECTED] wrote:

 just a thought: aren't shared objects/data/whatever protected by a
 mutex? To me, that's the obvious solution for accessing shared data.
 As I said, just a thought.

They will be protected by a mutex yes. That helps against two threads
updating the same PMC at the same time.

But if one thread sets pmc->next_for_GC the next thread might encounter
this PMC later (albeit for this thread at the first time) and think, I've
visited this PMC already and just flush out the object ID and not the
whole PMC.

 Klaas-Jan

leo


Re: [RfC] vtable-dump

2003-09-03 Thread Dan Sugalski
On Wed, 3 Sep 2003, Leopold Toetsch wrote:

 K Stol [EMAIL PROTECTED] wrote:
 
  just a thought: aren't shared objects/data/whatever protected by a
  mutex? To me, that's the obvious solution for accessing shared data.
  As I said, just a thought.
 
 They will be protected by a mutex yes. That helps against two threads
 updating the same PMC at the same time.
 
 But if one thread sets pmc->next_for_GC the next thread might encounter
 this PMC later (albeit for this thread at the first time) and think, I've
 visited this PMC already and just flush out the object ID and not the
 whole PMC.

The freezing routine can just lock the PMCs as it freezes them. As long as 
it keeps the original chain head around, it can then just walk the list 
again and unlock them.

Dan



Re: [RfC] vtable-dump

2003-09-03 Thread Leopold Toetsch
Dan Sugalski [EMAIL PROTECTED] wrote:
 On Wed, 3 Sep 2003, Leopold Toetsch wrote:

 But if one thread sets pmc->next_for_GC the next thread might encounter
 this PMC later (albeit for this thread at the first time) and think, I've
 visited this PMC already and just flush out the object ID and not the
 whole PMC.

 The freezing routine can just lock the PMCs as it freezes them. As long as
 it keeps the original chain head around, it can then just walk the list
 again and unlock them.

Then we might get deadlocks.

   Dan

leo


Re: [RfC] vtable-dump

2003-09-03 Thread Dan Sugalski
On Wed, 3 Sep 2003, Leopold Toetsch wrote:

 Dan Sugalski [EMAIL PROTECTED] wrote:
  On Wed, 3 Sep 2003, Leopold Toetsch wrote:
 
  But if one thread sets pmc->next_for_GC the next thread might encounter
  this PMC later (albeit for this thread at the first time) and think, I've
  visited this PMC already and just flush out the object ID and not the
  whole PMC.
 
  The freezing routine can just lock the PMCs as it freezes them. As long as
  it keeps the original chain head around, it can then just walk the list
  again and unlock them.
 
 Then we might get deadlocks.

This is always a possibility, and there isn't anything we can do about it
at the interpreter level. Deadlocks are an unfortunate fact of life in a
threaded environment.

Dan



Re: [RfC] vtable-dump

2003-09-03 Thread Nicholas Clark
On Wed, Sep 03, 2003 at 02:07:10PM -0400, Dan Sugalski wrote:
 On Wed, 3 Sep 2003, Leopold Toetsch wrote:
 
  Dan Sugalski [EMAIL PROTECTED] wrote:
   On Wed, 3 Sep 2003, Leopold Toetsch wrote:
  
   But if one thread sets pmc->next_for_GC the next thread might encounter
   this PMC later (albeit for this thread at the first time) and think, I've
   visited this PMC already and just flush out the object ID and not the
   whole PMC.
  
   The freezing routine can just lock the PMCs as it freezes them. As long as
   it keeps the original chain head around, it can then just walk the list
   again and unlock them.
  
  Then we might get deadlocks.
 
 This is always a possibility, and there isn't anything we can do about it
 at the interpreter level. Deadlocks are an unfortunate fact of life in a
 threaded environment.

Can we at least detect this class of deadlock?

(which I think is where two threads are simultaneously freezing, but walk the
shared objects in a different order, so that thread 0 has already locked
object A and is waiting on B, while thread 1 has B locked and is waiting on A)

If so, does that help?
Is this a stupid question which demonstrates my lack of knowledge about
threads?
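The deadlock Nicholas describes (thread 0 holds A and waits on B; thread 1 holds B and waits on A) is the classic lock-ordering case, and the classic answer is to acquire all locks in one canonical order. A sketch (Python; this is the textbook fix, not a statement about Parrot's actual locking design):

```python
import threading

def freeze_all(pmcs, locks):
    """Lock every PMC in a canonical order (here: by id) before
    freezing, so two concurrent freezers can never each hold a lock
    the other is waiting on."""
    ordered = sorted(pmcs, key=id)
    for pmc in ordered:
        locks[id(pmc)].acquire()
    try:
        return [repr(pmc) for pmc in pmcs]   # stand-in for freezing
    finally:
        for pmc in reversed(ordered):
            locks[id(pmc)].release()

a, b = {"n": 1}, {"n": 2}
locks = {id(a): threading.Lock(), id(b): threading.Lock()}
results = []
# Two threads freeze the same shared PMCs in opposite orders:
t1 = threading.Thread(target=lambda: results.append(freeze_all([a, b], locks)))
t2 = threading.Thread(target=lambda: results.append(freeze_all([b, a], locks)))
t1.start(); t2.start(); t1.join(); t2.join()
```

The catch, and presumably why this wasn't proposed outright: a canonical order requires knowing the full lock set before freezing starts, which is exactly the prescan Dan says can't be done safely while the structures may still change.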

Nicholas Clark



Re: [RfC] vtable-dump

2003-09-03 Thread Leopold Toetsch
Dan Sugalski [EMAIL PROTECTED] wrote:
 On Wed, 3 Sep 2003, Leopold Toetsch wrote:

 Then we might get deadlocks.

 This is always a possibility, and there isn't anything we can do about it
 at the interpreter level. Deadlocks are an unfortunate fact of life in a
 threaded environment.

clone() isn't supposed to deadlock, nor freeze(), nor dump(). They
officially do not change (or shouldn't change) anything in the PMCs they
operate on. These methods just passively iterate over PMCs.

To work against that, we would have to implement shared PMCs as:
- first COWed (updating next_for_GC doesn't change this state)
- if any thread updates the PMC unCOW the PMCs and
- communicate via events such changes to other threads - horrid.

Dan, that clone/freeze/dump... scheme doesn't work.

   Dan

leo


Re: [RfC] vtable-dump

2003-09-03 Thread Dan Sugalski
On Wed, 3 Sep 2003, Leopold Toetsch wrote:

 Dan Sugalski [EMAIL PROTECTED] wrote:
  On Wed, 3 Sep 2003, Leopold Toetsch wrote:
 
  Then we might get deadlocks.
 
  This is always a possibility, and there isn't anything we can do about it
  at the interpreter level. Deadlocks are an unfortunate fact of life in a
  threaded environment.
 
 clone() isn't supposed to deadlock, nor freeze(), nor dump(). They
 officially do not change (or shouldn't change) anything in the PMCs they
 operate on. These methods just passively iterate over PMCs.

It's impossible to have clone in a threaded environment guarantee that it 
won't deadlock. Can't do it. If there's more than one shared PMC in the 
set of PMCs that need freezing, then there exists the possibility of 
deadlock. You can't prescan the PMC list either, since you can't guarantee 
that shared PMCs won't change between the time of the scan and the actual 
clone. (Unless you lock, and then you're running the risk of deadlock)
 
There's no such thing as passively iterating over shared PMCs. 

Dan



RE: [RfC] vtable-dump

2003-09-03 Thread Dan Sugalski
On Wed, 3 Sep 2003, Brent Dax wrote:

 Dan Sugalski:
 # It's impossible to have clone in a threaded environment guarantee that
 it
 # won't deadlock.
 
 If I recall correctly, all shared variables are owned by a single shared
 interpreter.

Nope. Shared variables are owned by their creating interpreter. We could 
make them all owned by a single shared interpreter, but then you get into 
issues with active data's home interpreters, loaded and accessible 
libraries, and other stuff. Big nasty mess.

Dan



RE: [RfC] vtable-dump

2003-09-03 Thread Garrett Goebel
Leopold Toetsch wrote:
 Dan Sugalski [EMAIL PROTECTED] wrote:
  On Wed, 3 Sep 2003, Leopold Toetsch wrote:
 
  Then we might get deadlocks.
 
  This is always a possibility, and there isn't anything we 
  can do about it at the interpreter level. Deadlocks are
  an unfortunate fact of life in a threaded environment.

As I was googling to see how other people have approached these problems I
ran across:

http://blogs.gotdotnet.com/cbrumme/CategoryView.aspx/CLR

--
Garrett Goebel
IS Development Specialist

ScriptPro   Direct: 913.403.5261
5828 Reeds Road   Main: 913.384.1008
Mission, KS 66202  Fax: 913.384.2180
www.scriptpro.com  garrett at scriptpro dot com



Re: [RfC] vtable-dump

2003-09-02 Thread Dan Sugalski
At 12:03 PM +0200 8/31/03, Leopold Toetsch wrote:
Benjamin Goldberg [EMAIL PROTECTED] wrote:
class freezer {
class thawer {
class cloner {
[ big snip ]

Do you expect that these are overridden by some languages using parrot?
I.e. that ponie tries to implement a freezer that writes output for
Storable?
Languages can't override these on a per-PMC class basis. At best they 
can override the representation of frozen or thawed PMCs on a global 
basis if, for example, someone really wanted output in XML or YAML 
instead of a packed binary format.
--
Dan

--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Re: [RfC] vtable-dump

2003-09-02 Thread Dan Sugalski
At 6:37 PM +0200 8/29/03, Leopold Toetsch wrote:
Dan Sugalski [EMAIL PROTECTED] wrote:
 On Fri, 29 Aug 2003, Leopold Toetsch wrote:

 I think, we need a general solution for freeze, dump and clone. As shown
 the latter is broken. That would be IMHO an iterator interface with a
 callback function taking and returning a void*.

 Right, what I want is for the system freeze/thaw routines to have a
 standard way to encode a variety of core types (ints, floats, strings),
 lists, and hashes, as well as a way to handle self-referential, circular,
 and shared data in the dumped PMC data. It's the only way to safely do
 this, since we can't count on all the different PMC classes to handle
 dumping arrays of arrays of references to the same end PMC properly. Which
 we have to if we want this to work right.

 The freeze(), dump() and clone() would be vtables which are called by
 the iterator. So these functions should have the same signature.

 No iterator. When you call freeze on a PMC, that PMC is responsible for
 any sort of iteration that may have to be done. (Though there is the issue
 of blowing stack, which we'll have to address)
Aren't you saying the opposite of above here? I want to be able to
traverse from a given start point (be it the interpreter itself or some
PMC) as deeply down as there is something. You did say that we don't
recursively mark() because of recursion depth. clone(), dump(), freeze()
all have these same issues (plus self-referential structures).
No, I'm just not being clear, so lets fix that.

We don't need any special iterator, since we already have one--the 
DOD iterator, which is not only sufficient for our needs but likely 
the only thing that *can* have sufficient information to do the 
iteration.

The -freeze or -thaw methods in the vtables are sufficient to 
freeze or thaw a PMC, but aren't going to be driving the 
freezing/thawing. The way things work is this:

The freeze opcode takes a PMC as a parameter. The DOD tracing system 
is set up. The first PMC's freeze vtable function is called. When it 
returns (and we'll get to what it does later) the freeze op follows 
the chain and calls the next PMC's freeze, then the next, and so 
forth until it runs out of PMCs to freeze. This may, for simple PMCs, 
stop after the first--there may be no traversal at all.

The freeze vtable method of a PMC is responsible for handing off 
parts of the PMC to the freezing and thawing encoding subsystem 
(which itself turns those parts into bits on the wire or characters 
in a file or string). The encoding subsystem should have a way to 
encode:

 Single values:
   STRING * (so we can save encoding, charset, and language)
   Integers
   Floats
   Binary data
   PMCs
   frozen bytecode
 Name/value pair (where the value is one of the above bits, or a set 
of name/value pairs, or a list of values)

 Lists of values (where the list can be a set of name/value pairs, or 
lists of values)

 Meta data--Internal structure stuff (flags, PMC class name, and whatnot)

Not that complex -- this gets 95% of what any PMC class would need, 
and the remaining 5% can get shoved as binary data into a value thing.
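As a concrete sketch of such an encoding subsystem, here is a toy tagged-record writer. This is illustrative Python, not Parrot code: the tag letters, the fixed-width wire layout, and the `Encoder` class are all invented for the example.

```python
import struct

class Encoder:
    """Toy tagged-record writer: one tag byte, then a fixed-width payload."""
    def __init__(self):
        self.buf = bytearray()
    def put_int(self, i):
        self.buf += b'I' + struct.pack('>q', i)       # 64-bit integer
    def put_num(self, n):
        self.buf += b'N' + struct.pack('>d', n)       # IEEE double
    def put_str(self, s):
        raw = s.encode('utf-8')                       # charset info elided here
        self.buf += b'S' + struct.pack('>I', len(raw)) + raw
    def put_pair(self, name, emit_value):
        self.put_str(name)                            # name, then the value,
        emit_value(self)                              # which may itself nest
    def put_list(self, items, emit_item):
        self.put_int(len(items))                      # length-prefixed list
        for item in items:
            emit_item(self, item)

# e.g. a PMC's freeze method might hand over a flags word and a list:
enc = Encoder()
enc.put_pair('flags', lambda e: e.put_int(0x3))
enc.put_list([1.5, 2.5], lambda e, x: e.put_num(x))
```

A matching decoder would dispatch on the tag byte to rebuild the same parts during thaw.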
--
Dan

--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Re: [RfC] vtable-dump

2003-09-02 Thread Leopold Toetsch
Dan Sugalski [EMAIL PROTECTED] wrote:
 At 6:37 PM +0200 8/29/03, Leopold Toetsch wrote:

Aren't you saying the opposite of above here? I want to be able to
traverse from a given start point (being it the own interpreter or some
PMC) as deeply down as there is something. You did say, that we don't
recursively mark() because of recursion depth. clone(), dump(), freeze()
all have these same issues (plus self-referential structures).

 No, I'm just not being clear, so let's fix that.

 We don't need any special iterator, since we already have one--the
 DOD iterator, which is not only sufficient for our needs, it's likely
 the only thing that *can* have sufficient information to do the
 iteration.

While a general iterator would be very similar to the DOD iterator, we
can't use it IMHO:

- this prohibits DOD runs during clone/thaw, when we run out of
  free headers[1]
- it would need an extra function pointer to call the actual vtable
  (mark, freeze, clone ...)
- we currently have shortcuts for DOD marking (is_PMC_ptr ..) which
  probably don't play nicely with it
and most importantly:
- we must track duplicates (or restore pointers to these during
  thaw/clone) while marking (pobject_lives) just returns here.

This would slow down DOD runs considerably.

 ... The encoding subsystem should have a way to
 encode:

Basically packfile extensions. Yep.

[1] when we want to thaw/clone big structures, we should have some means
to estimate the amount of needed headers. If we will not have enough, we
do a DOD run before clone/thaw and then turn DOD off - it will not yield
any more free headers anyway. This can avoid a couple of DOD runs that
do just nothing except burning a lot of cycles and massive cache
pollution.
To achieve this, we might call aggregates.elements() first by means of
the iterator again or with some depth restriction and returning, when we
reach the free-header limit.

leo


Re: [RfC] vtable-dump

2003-09-02 Thread Dan Sugalski
On Tue, 2 Sep 2003, Leopold Toetsch wrote:

 and no, not that one inside DOD, that one doesn't handle duplicates.

Yes, yes it *does* handle duplicates. Otherwise it'd get caught in 
infinite loops every time it came across a circular data structure. That's 
what the next pointer in the PObj header gets used for, amongst other 
things.

Dan



Re: [RfC] vtable-dump

2003-09-02 Thread Leopold Toetsch
Dan Sugalski [EMAIL PROTECTED] wrote:
 On Tue, 2 Sep 2003, Leopold Toetsch wrote:

 and no, not that one inside DOD, that one doesn't handle duplicates.

 Yes, yes it *does* handle duplicates. Otherwise it'd get caught in
 infinite loops every time it came across a circular data structure. That's
 what the next pointer in the PObj header gets used for, amongst other
 things.

It does handle duplicates like so:

void pobject_lives(struct Parrot_Interp *interpreter, PObj *obj)
{
    /* if object is live or on free list return */
    if (PObj_is_live_or_free_TESTALL(obj)) {
        return;
    }
    /* mark it live */
    PObj_live_SET(obj);

So if it was mark()ed already we return. That's not possible for freeze,
thaw, dump, clone whatever. These must keep track of already visited
objects via a hash for freeze, dump, clone, and via an ID array for
thaw.
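The distinction Leopold draws can be sketched in a few lines. This is illustrative Python, not Parrot's API: marking may simply return for an already-visited object, but freeze must still *emit* something for it -- a back-reference -- so it needs a table mapping object identity to an ID.

```python
def freeze(root):
    seen = {}                  # object identity -> serial number
    out = []                   # flat record stream, stand-in for the wire
    def visit(obj):
        key = id(obj)
        if key in seen:
            out.append(('ref', seen[key]))    # duplicate: back-reference only
            return
        seen[key] = len(seen)                 # first visit: assign an ID
        out.append(('pmc', seen[key]))
        if isinstance(obj, list):             # stand-in for an aggregate PMC
            for child in obj:
                visit(child)
    visit(root)
    return out

shared = [42]
records = freeze([shared, shared])   # second occurrence becomes a 'ref'
```

Thawing would use the mirror image of `seen` -- an ID-to-object array -- to resolve each `'ref'` record back to the already-reconstructed object.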

   Dan

leo


Re: [RfC] vtable-dump

2003-09-02 Thread Dan Sugalski
On Tue, 2 Sep 2003, Leopold Toetsch wrote:

 Dan Sugalski [EMAIL PROTECTED] wrote:
  On Tue, 2 Sep 2003, Leopold Toetsch wrote:
 
  and no, not that one inside DOD, that one doesn't handle duplicates.
 
  Yes, yes it *does* handle duplicates. Otherwise it'd get caught in
  infinite loops every time it came across a circular data structure. That's
  what the next pointer in the PObj header gets used for, amongst other
  things.
 
 It does handle duplicates like so:
 
 void pobject_lives(struct Parrot_Interp *interpreter, PObj *obj)
 {
     /* if object is live or on free list return */
     if (PObj_is_live_or_free_TESTALL(obj)) {
         return;
     }
     /* mark it live */
     PObj_live_SET(obj);
 
 So if it was mark()ed already we return. That's not possible for freeze,
 thaw, dump, clone whatever. These must keep track of already visited
 objects via a hash for freeze, dump, clone, and via an ID array for
 thaw.

No, this isn't necessary. What you need to do when freezing an already 
frozen PMC is to emit a marker label that refers to the original version 
of the PMC, or always emit the marker when a PMC freezes another PMC and 
defer actual freezing of the PMC until the master freeze function walks 
over the PMC to freeze it.

Dan



Re: [RfC] vtable-dump

2003-09-02 Thread Leopold Toetsch
Dan Sugalski [EMAIL PROTECTED] wrote:
 On Tue, 2 Sep 2003, Leopold Toetsch wrote:

 So if it was mark()ed already we return. That's not possible for freeze,
 thaw, dump, clone whatever. These must keep track of already visited
 objects via a hash for freeze, dump, clone, and via an ID array for
 thaw.

 No, this isn't necessary. What you need to do when freezing an already
 frozen PMC is to emit a marker label that refers to the original version
 of the PMC,

So I have to do a lookup to get the marker, or...

 ... or always emit the marker when a PMC freezes another PMC and
 defer actual freezing of the PMC until the master freeze function walks
 over the PMC to freeze it.

on first freeze generate a marker (store it where in the PMC?) and
then... When I defer freezing of the aggregate, the thawing doesn't work.
When everything gets deferred, you have to walk the whole deferred list
again, for simple types too, which would already have been done. Then in
another run, the marker has to get cleared.

*If* that works it still adds extra complexity to DOD.

   Dan

leo


Re: [RfC] vtable-dump

2003-09-02 Thread Dan Sugalski
On Tue, 2 Sep 2003, Leopold Toetsch wrote:

 Dan Sugalski [EMAIL PROTECTED] wrote:
  On Tue, 2 Sep 2003, Leopold Toetsch wrote:
 
  So if it was mark()ed already we return. That's not possible for freeze,
  thaw, dump, clone whatever. These must keep track of already visited
  objects via a hash for freeze, dump, clone, and via an ID array for
  thaw.
 
  No, this isn't necessary. What you need to do when freezing an already
  frozen PMC is to emit a marker label that refers to the original version
  of the PMC,
 
 So I have to do a lookup to get the marker, or...

Or nothing. Freezing a PMC that refers to one or more PMCs will have to 
handle this.
 
  ... or always emit the marker when a PMC freezes another PMC and
  defer actual freezing of the PMC until the master freeze function walks
  over the PMC to freeze it.
 
 on first freeze generate a marker (store it where in the PMC?) and
 then... When I defer freezing of the aggregate, the thawing doesn't work.
 When everything gets deferred, you have to walk the whole deferred list
 again, for simple types too, which would already have been done. Then in
 another run, the marker has to get cleared.

Nope.

First off, the freeze and thaw code *must* deal with potentially 
self-referential or circular data structures. They're depressingly common, 
and to not be able to handle something as simple as:

   $foo = \$foo;
   freeze($foo);

would be a big problem. As would this:

   $foo[0] = \$bar;
   $foo[1] = \$bar;
   freeze(@foo);

generating a structure that, when reconstituted, had elements 0 and 1 
pointing to different PMCs. So with that in mind, handling labels and 
deferred or already-instantiated PMCs is a necessity.

Doing the walk is *also* easy. You clear the next PMC pointer, just as
with a normal DOD run setup, call the DOD mark on the initial PMC, and call
its freeze routine. Inside the freeze for the PMC, it calls mark on any 
PMCs it needs frozen, which will then go on the list if they've not 
already been frozen. A marker for that PMC, probably based on the PMC 
header address, goes in the frozen data structure. When the initial freeze 
routine exits, the freeze code just follows the next for DOD pointer 
chain, calling freeze on each of those PMCs, until it runs off the end of 
the chain, just as the DOD does.
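The walk Dan describes can be sketched as follows. This is illustrative Python with invented names; in Parrot this would be C reusing the DOD's next-for-GC field. mark() appends a PMC to the chain only if its next pointer is still NULL, and the master loop walks the chain calling freeze on each PMC -- no recursion, and each PMC is queued (and frozen) exactly once.

```python
class PMC:
    TAIL = object()                      # sentinel: on the chain, last entry
    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)
        self.next_for_GC = None          # NULL: not on the chain (setup state)
    def freeze(self, ctx):
        ctx.emit(('pmc', self.name))     # this PMC's own data
        for c in self.children:
            ctx.emit(('label', id(c)))   # marker based on the header address
            ctx.mark(c)                  # queue the child; do not recurse

class FreezeContext:
    def __init__(self):
        self.head = self.tail = None
        self.records = []
    def emit(self, rec):
        self.records.append(rec)
    def mark(self, pmc):
        if pmc.next_for_GC is not None:  # already queued or frozen: no-op
            return
        pmc.next_for_GC = PMC.TAIL
        if self.tail is None:
            self.head = self.tail = pmc
        else:
            self.tail.next_for_GC = pmc  # append to the end of the chain
            self.tail = pmc

def do_freeze(root):
    ctx = FreezeContext()
    ctx.mark(root)
    p = ctx.head
    while p is not None and p is not PMC.TAIL:
        p.freeze(ctx)                    # may append more PMCs to the chain
        p = p.next_for_GC
    return ctx.records

x = PMC('x')
records = do_freeze(PMC('root', [x, x]))  # x queued and frozen only once
```

Circular structures terminate for the same reason duplicates are handled: a PMC already on the chain has a non-NULL next pointer, so mark() does nothing the second time.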

Now, granted, we won't use the DOD's intimate knowledge of base array/hash 
types, rather deferring to their freeze routines, but that's fine and as 
it should be.

Reconstituting these things may be somewhat problematic, but there's not 
much for it--reconstituting circular data structures that require some 
sort of active setup is problematic in general.

 *If* that works it still adds extra complexity to DOD.

It doesn't add anything to the DOD. Worst case it turns out that 
disabling a DOD sweep during a freeze is untenable, which is possible. In 
that case we'll need an alternate pointer chain.

Dan



Re: [RfC] vtable-dump

2003-09-02 Thread Leopold Toetsch
Dan Sugalski [EMAIL PROTECTED] wrote:

 Doing the walk is *also* easy. You clear the next PMC pointer, just as
 with a normal DOD run setup, call the DOD mark on the initial PMC

there is no mark() for a plain PMC scalar (and no next pointer inside
the PMC). If the PMC has a mark routine, it calls pobject_lives()
(setting the live flag) and puts complex PMCs -- but *not all* of them --
on the next-for-GC pointer chain. Some just get their live flag set directly.

Using the DOD functionality doesn't match freeze, thaw, dump at all and
interferes with DOD runs.

Please have a look at dod.c

   Dan

leo


Re: [RfC] vtable-dump

2003-09-02 Thread Dan Sugalski
On Tue, 2 Sep 2003, Leopold Toetsch wrote:

 Dan Sugalski [EMAIL PROTECTED] wrote:
 
  Doing the walk is *also* easy. You clear the next PMC pointer, just as
  with a normal DOD run setup, call the DOD mark on the initial PMC
 
 there is no mark() for a plain PMC scalar (and no next pointer inside
 the PMC). If the PMC has a mark routine this calls pobject_lives()
 (setting the live flag) and puts complex PMCs but *not all* on the next
 for GC pointer. Some get their live flag directly set.

Every PMC should have a next_for_GC pointer. You moved it to the ext 
structure, but it's there, and they all ought to have one. Any PMC that 
gets frozen *will* have one, as it needs to have it for traversal to work 
properly.
 
 Using the DOD functionality doesn't match freeze, thaw, dump at all and
 interferes with DOD runs.

Oh, nonsense, it doesn't at all interfere with things. I designed it with
this as one of the applications for the DOD system. It should also work
reasonably well for destruction ordering. If it's been changed in a way
that makes this untenable, it needs to be fixed so it does.

Dan



Re: [RfC] vtable-dump

2003-09-02 Thread Leopold Toetsch
Dan Sugalski [EMAIL PROTECTED] wrote:

 Every PMC should have a next_for_GC pointer. You moved it to the ext
 structure, ...

I moved it there for a good reason: speed. The smaller a PMC the faster
is the interpreter.

 ... but it's there, and they all ought to have one. Any PMC that
 gets frozen *will* have one, as it needs to have it for traversal to work
 properly.

Ok. It's a macro, and it's back again in a second *if* needed.

 Using the DOD functionality doesn't match freeze, thaw, dump at all and
 interferes with DOD runs.

 Oh, nonsense, it doesn't at all interfere with things.

Calling mark sets the live flag on PMC headers. When you use mark() you
have to reset the whole arena after a single clone. I expect a lot of
clone()s to be emitted by the HLL.

If we don't use mark() we are almost at my proposal: have an extra
traverse() vtable, that gets a vtable pointer doing freeze, thaw, clone,
or dump. If you want to have a general traverse() for DOD too (currents
mark?) then you have to pass in a function pointer - or put each and
every item on the next_for_GC list.

When you have a PerlArray containing 100,000 plain PerlInts, you have
your linked list with 100,001 entries, or one entry. You have polluted
caches or not - that's it (I'm speaking of DOD here).

 ... I designed it with
 this as one of the applications for the DOD system. It should also work
 reasonably well for destruction ordering. If it's been changed in a way
 that makes this untenable, it needs to be fixed so it does.

Or the design ;-)

And I still don't see, how you will handle duplicate lookup of
self-referential structures for freeze and thaw with a linked list of
items alone.

   Dan

leo


Re: [RfC] vtable-dump

2003-09-02 Thread Benjamin Goldberg
Leopold Toetsch wrote:
[snip]
 [1] when we want to thaw/clone big structures, we should have some means
 to estimate the amount of needed headers. If we will not have enough, we
 do a DOD run before clone/thaw and then turn DOD off - it will not yield
 any more free headers anyway. This can avoid a couple of DOD runs that
 do just nothing except burning a lot of cycles and massive cache
 pollution.
 To achieve this, we might call aggregates.elements() first by means of
 the iterator again or with some depth restriction and returning, when we
 reach the free-header limit.

Even with a depth restriction, a recursive estimate can produce
misleading results due to circular references.  Only actually walking
the structure can get the number right.  However, walking the structure
*twice* would be silly.  So, begin with an estimate of unknown
(serialize an integer -1), and then, after the whole thing has been
frozen, we seek backwards (if that's possible) and replace that
unknown with the actual number of pmcs that were serialized.
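Benjamin's back-patching idea, assuming a seekable output stream, might look like the sketch below. This is illustrative Python; the record format and function name are invented for the example.

```python
import io
import struct

def freeze_with_count(objs):
    """Write a 'count unknown' placeholder (-1), freeze everything,
    then seek back and patch in the real number of records written."""
    out = io.BytesIO()                        # a seekable stream is required
    count_pos = out.tell()
    out.write(struct.pack('>i', -1))          # placeholder: count unknown
    n = 0
    for obj in objs:                          # stand-in for a real traversal
        data = repr(obj).encode('utf-8')      # stand-in for a frozen record
        out.write(struct.pack('>I', len(data)) + data)
        n += 1
    out.seek(count_pos)
    out.write(struct.pack('>i', n))           # back-patch the actual count
    return out.getvalue()

frozen = freeze_with_count([1, 'two', [3]])
```

The thaw side can then read the count up front and preallocate headers (or trigger a single DOD run) before reconstructing anything; a count of -1 would mean the producer could not seek and the consumer must fall back to incremental allocation.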

-- 
$a=24;split//,240513;s/\B/ = /for@@=qw(ac ab bc ba cb ca
);{push(@b,$a),($a-=6)^=1 for 2..$a/6x--$|;print [EMAIL PROTECTED]
]\n;((6=($a-=6))?$a+=$_[$a%6]-$a%6:($a=pop @b))redo;}


Re: [RfC] vtable-dump

2003-09-02 Thread Dan Sugalski
At 8:38 PM +0200 9/2/03, Leopold Toetsch wrote:
Dan Sugalski [EMAIL PROTECTED] wrote:

 Every PMC should have a next_for_GC pointer. You moved it to the ext
 structure, ...
I moved it there for a good reason: speed. The smaller a PMC the faster
is the interpreter.
Right, and this part is only needed by the DOD sweep, the freeze/thaw 
code, and the destruction ordering code, so since it's in the ext 
struct it won't get in the way for normal usage.

  ... but it's there, and they all ought to have one. Any PMC that
 gets frozen *will* have one, as it needs to have it for traversal to work
 properly.
Ok. It's a macro, and it's back again in a second *if* needed.
That's easy, then, as it is. :) Besides, from looking at pobj.h, it's 
already there, so I'm not sure what we'd need to do. Perhaps I need 
to resync.

  Using the DOD functionality doesn't match freeze, thaw, dump at all and
 interferes with DOD runs.

 Oh, nonsense, it doesn't at all interfere with things.
Calling mark sets the live flag on PMC headers. When you use mark() you
have to reset the whole arena after a single clone. I expect a lot of
clone()s to be emitted by the HLL.
I expect very few clones to be emitted. Past history indicates that 
it's an unusual operation. Besides, there's nothing wrong with 
setting the live flag, since the PMCs *are* alive. We don't care 
about the live flag for this, what we care is that the PMC is put on 
the traversal list.

If we don't use mark() we are almost at my proposal: have an extra
traverse() vtable, that gets a vtable pointer doing freeze, thaw, clone,
or dump. If you want to have a general traverse() for DOD too (currents
mark?) then you have to pass in a function pointer - or put each and
every item on the next_for_GC list.
When you have a PerlArray containing 100,000 plain PerlInts, you have
your linked list with 100,001 entries, or one entry. You have polluted
caches or not - that's it (I'm speaking of DOD here).
So you shortcut in the normal course of DOD? I suppose--I presume 
you've benchmarked to see if testing all the flags of the contained 
PMCs when marking the container is faster than just throwing them on 
the list and checking them when you get to them? I can see that being 
the case in some cases, but I'm concerned that it'll consume more 
memory and take more time in some reasonably common cases.

  ... I designed it with
 this as one of the applications for the DOD system. It should also work
 reasonably well for destruction ordering. If it's been changed in a way
 that makes this untenable, it needs to be fixed so it does.
Or the design ;-)
Yep. In this case, though, design flaws turn out not to be the case. :-P

And I still don't see, how you will handle duplicate lookup of
self-referential structures for freeze and thaw with a linked list of
items alone.
The same way the DOD sweep handles it, unless you've changed things 
beyond recognition. The way it used to work is that when you marked a 
PMC as live, it got put on the end of the chain unless the PMC's next 
pointer field is non-NULL, in which case it's already been put on the 
list, and nothing happens.
--
Dan



Re: serialisation (was Re: [RfC] vtable-dump)

2003-09-01 Thread Nicholas Clark
On Sat, Aug 30, 2003 at 10:13:02PM -0400, Benjamin Goldberg wrote:
 Nicholas Clark wrote:

  The attacker can craft a bogus CGITempFile object that refers to any
  file on the system, and when this object is destroyed it will attempt to
  delete that file at whatever privilege level the CGI runs at. And
  because that object is getting destroyed inside the deserialise routine
  of Storable, this all happens without the user written code getting any
  chance to inspect the data. And even Storable can't do anything about
  it, because by the time it encounters the repeated hash key, it has
  already deserialised this time-bomb. How does it defuse it?
 
 The simplest solution *I* can think of, is to have storable copy the
 taint
 flag from the input string/stream onto every single string that it
 produces.
 
 Taint checking doesn't solve *all* security problems, of course, but it
 can catch many of them, and it certainly would catch this one (if $$self
 were tainted).

I don't believe that it would. A quick test suggests that in perl5
destructors get to run after a taint violation is detected:

$ perl -Tle 'sub DESTROY {print "We still get to run arbitrary code after taint
violations"}; bless ($a = []); `cat /etc/motd`'
Insecure $ENV{PATH} while running with -T switch at -e line 1.
We still get to run arbitrary code after taint violations
$

 One defense against following a bomb with malformed data, might be to
 have Storable save up the SV*s and the names with which to bless them,
 and only do the blessing *after* the data is fully deserialized, as a
 last step before returning it to the user.  This way, if there's
 malformed data, no destructors get called.  The user still needs to
 validate the returned data, though, and rebless anything which might
 result in an evil destructor being called.
 
 Another defense is to run deserialization and validation inside of a
 Safe object.  Make sure that if the object fails to validate, it is
 completely destructed before we exit the Safe compartment.

I think that these could work well.

 For backwards compatibility with perl5, parrot will quite likely support
 taint checking, safe.pm, and ops.pm.

I don't see why support is needed at core parrot level. I believe that
the plan is to provide XS-level Perl 5 compatibility via ponie, in which
case much of this stuff would be up to ponie, not core parrot.

Tainting sounds to me like the sort of things that would be a property on
a scalar, checked by custom ponie ops, which would be as parrot core as
python's ops. A brief inspection suggests that ops.pm is entirely a parser
level thing, so I don't see the need for core parrot support there.

Safe.pm as implemented is an unreliable mess. It's also problematic as
it describes actions solely in terms of perl 5 ops. I've no idea quite
how close Arthur, Rafael and the other ponie conspirators think that
ponie-generated parrot bytecode will be to the perl 5 optree structure,
but I wouldn't like to place any bets right now. I think that safe
execution compartments are part of Dan's plan, but I'm not sure if
anyone yet knows how any of this fits together (even Dan or Arthur).

Nicholas Clark


Re: serialisation (was Re: [RfC] vtable-dump)

2003-09-01 Thread Benjamin Goldberg


Nicholas Clark wrote:
 
 On Sat, Aug 30, 2003 at 10:13:02PM -0400, Benjamin Goldberg wrote:
  Nicholas Clark wrote:
 
   The attacker can craft a bogus CGITempFile object that refers to any
   file on the system, and when this object is destroyed it will attempt to
   delete that file at whatever privilege level the CGI runs at. And
   because that object is getting destroyed inside the deserialise routine
   of Storable, this all happens without the user written code getting any
   chance to inspect the data. And even Storable can't do anything about
   it, because by the time it encounters the repeated hash key, it has
   already deserialised this time-bomb. How does it defuse it?
 
  The simplest solution *I* can think of, is to have storable copy the
  taint
  flag from the input string/stream onto every single string that it
  produces.
 
  Taint checking doesn't solve *all* security problems, of course, but it
  can catch many of them, and it certainly would catch this one (if $$self
  were tainted).
 
 I don't believe that it would. A quick test suggests that in perl5
 destructors get to run after a taint violation is detected:
 
 $ perl -Tle 'sub DESTROY {print "We still get to run arbitrary code after taint
 violations"}; bless ($a = []); `cat /etc/motd`'
 Insecure $ENV{PATH} while running with -T switch at -e line 1.
 We still get to run arbitrary code after taint violations
 $

That wasn't what I meant.

The taint violation in your example does *not* correspond to Storable
turning on the taint flag in the strings it produces.  At best, it
corresponds to Storable dieing due to a malformed input file, resulting
in destructors being called.

If sub DESTROY tries to unlink a file, and that filename is a tainted
string, then the DESTROY will die.

Currently, Storable *doesn't* turn on the taint flag in the SVPVs that it
produces; because of this, $$self in CGITempFile::DESTROY isn't tainted.

If $$self in CGITempFile::DESTROY *were* tainted, then obviously that
DESTROY would die, and the file wouldn't get unlinked.  Thus, we would
avoid that security hole.

  One defense against following a bomb with malformed data, might be to
  have Storable save up the SV*s and the names with which to bless them,
  and only do the blessing *after* the data is fully deserialized, as a
  last step before returning it to the user.  This way, if there's
  malformed data, no destructors get called.  The user still needs to
  validate the returned data, though, and rebless anything which might
  result in an evil destructor being called.
 
  Another defense is to run deserialization and validation inside of a
  Safe object.  Make sure that if the object fails to validate, it is
  completely destructed before we exit the Safe compartment.
 
 I think that these could work well.
 
  For backwards compatibility with perl5, parrot will quite likely support
  taint checking, safe.pm, and ops.pm.
 
 I don't see why support is needed at core parrot level. I believe that
 the plan is to provide XS-level Perl 5 compatibility via ponie, in which
 case much of this stuff would be up to ponie, not core parrot.
 
 Tainting sounds to me like the sort of things that would be a property on
 a scalar, checked by custom ponie ops, which would be as parrot core as
 python's ops. A brief inspection suggests that ops.pm is entirely a parser
 level thing, so I don't see the need for core parrot support there.
 
 Safe.pm as implemented is an unreliable mess. It's also problematic as
 it describes actions solely in terms of perl 5 ops. I've no idea quite
 how close Arthur, Rafael and the other ponie conspirators think that
 ponie-generated parrot bytecode will be to the perl 5 optree structure,
 but I wouldn't like to place any bets right now. I think that safe
 execution compartments are part of Dan's plan, but I'm not sure if
 anyone yet knows how any of this fits together (even Dan or Arthur)
 
 Nicholas Clark



Re: [RfC] vtable-dump

2003-09-01 Thread Benjamin Goldberg
Leopold Toetsch wrote:
 
 Benjamin Goldberg [EMAIL PROTECTED] wrote:
 class freezer {
 class thawer {
 class cloner {
 
 [ big snip ]
 
 Do you expect that these are overridden by some languages using parrot?
 I.e. that ponie tries to implement a freezer that writes output for
 Storable?

I'm not entirely sure...

For ponie to implement something that deals with Storable, then of
course it needs to output data in the same file format as Storable
does.  I haven't looked at the guts of storable, so I don't know what
that format is.

 Further: having clone implemented in terms of freeze + thaw needs
 additional memory for the intermediate frozen image. Isn't that
 suboptimal?

Only slightly -- it's just *one single* PMC's data that's stored in that
additional memory.

And if we have a seperate clone vtable method, then there's a chance
that our cloning procedure will drift apart from our freezing/thawing
procedure.

 A general traverse routine or iterator seems to be more flexible.
 
 leo




Re: [RfC] vtable-dump

2003-08-31 Thread Benjamin Goldberg
Leopold Toetsch wrote:
 
 Benjamin Goldberg [EMAIL PROTECTED] wrote:
  Actually, I think the following interface would be better:
 
 void freeze(PMC *freezer);
 void thaw  (PMC *thawer);
 
 I'm thinking of (in horrible pseudo code ;-):
[snip]
 Above functionality does IMHO match your description. The only
 difference is, that I'd like to use that not only for freeze/thaw, but
 for clone and dump too.

Well, I wasn't precisely thinking of my interface being *solely* for
freeze/thaw, but for clone and dump, too.

Cloning would be like freezing followed immediately by thawing, without
writing to disk.

That is, I was thinking of something like the following pseudocode.

   class freezer {
       fifo<PMC*> queue;
       set<INTVAL> seen;
       outputer o;
       void push_integer(INTVAL i) { o.print_int(i); }
       void push_number(NUMVAL n)  { o.print_num(n); }
       void push_string(STRING *s) { o.print_str(s); }
       void push_bigint(BIGINT *b) { o.print_big(b); }
       void push_pmc(PMC *p) {
           INTVAL i = (INTVAL)p;
           o.print_int(i);
           if( !seen.contains(i) ) { seen.add(i); queue.push(p); }
       }
   }
   void do_freeze(PMC *p, outputer o) {
       PMC *f = new_freezer_dumper(o);
       f.push_pmc(p);
       while( f.queue.notempty() ) {
           p = f.queue.pop();
           o.print_int( p.getclass() );
           p.freeze(f);
       }
   }

   class thawer {
       fifo<PMC*> queue;
       map<INTVAL, PMC*> prep;
       inputter i;
       INTVAL  shift_integer() { return i.read_int(); }
       NUMVAL  shift_number()  { return i.read_num(); }
       STRING *shift_string()  { return i.read_str(); }
       BIGNUM *shift_bignum()  { return i.read_big(); }
       PMC    *shift_pmc() {
           INTVAL j = i.read_int();
           if( prep.contains(j) ) { return prep.get(j); }
           PMC *n = new_placeholder();
           prep.put(j, n);
           queue.push(n);
           return n;
       }
   }
   PMC* do_thaw(inputter i) {
       PMC *t = new_thawer(i);
       PMC *ret = t.shift_pmc();
       PMC *p = ret;
       do {
           p.setclass( i.read_int() );
           p.thaw(t);
           p = t.queue.pop();
       } while(p);
       return ret;
   }

   class cloner {
       fifo< pair<PMC*,PMC*> > queue;
       map<PMC*,PMC*> prep;
       fifo<PMC*> tempdata;
       void push_integer(INTVAL i) { tempdata.push_integer(i); }
       void push_number (NUMVAL n) { tempdata.push_number (n); }
       void push_string (STRING *s) { tempdata.push_string (s); }
       void push_bigint (BIGINT *b) { tempdata.push_bigint (b); }
       void push_pmc(PMC *p) {
           if( prep.contains(p) ) {
               tempdata.push_pmc(prep.get(p));
           } else {
               PMC *p2 = new_placeholder();
               prep.put(p, p2);
               queue.push( pair(p, p2) );
               tempdata.push_pmc(p2);
           }
       }
       INTVAL  shift_integer() { return tempdata.shift_integer(); }
       NUMVAL  shift_number () { return tempdata.shift_number (); }
       STRING *shift_string () { return tempdata.shift_string (); }
       BIGINT *shift_bigint () { return tempdata.shift_bigint (); }
       PMC    *shift_pmc()    { return tempdata.shift_pmc(); }
   }
   PMC * do_clone(PMC *p) {
       PMC *c = new_cloner();
       PMC *r = new_placeholder();
       c.prep.put(p, r);
       p.freeze(c);
       r.setclass( p.getclass() );
       r.thaw(c);
       assert( c.tempdata.isempty() );
       while( c.queue.notempty() ) {
           pair<PMC*,PMC*> x = c.queue.pop();
           x.first.freeze(c);
           x.second.setclass( x.first.getclass() );
           x.second.thaw(c);
           assert( c.tempdata.isempty() );
       }
       return r;
   }

I've omitted the pseudocode for dumping/pretty-printing since I'm not
sure how best it would be done.  It does require quite a bit more
information than freezing (since the computer can tell when/where a pmc
class starts and ends, but a human wouldn't be able to), so merely
calling do_freeze() with an outputer which sprintfs intvals and numvals
nicely wouldn't be sufficient.

However, if you replaced the o.print_int inside of push_pmc with a
o.print_pmc_pointer, and o.print_int in the main loop with
o.print_pmc_class, then the outputter *would* have enough info to make
something vaguely human-readable.
 


Re: serialisation (was Re: [RfC] vtable-dump)

2003-08-31 Thread Benjamin Goldberg
Nicholas Clark wrote:
 
 On Fri, Aug 29, 2003 at 05:30:37PM +0200, Leopold Toetsch wrote:
 I think, we need a general solution for freeze, dump and clone. As
 shown
 
 I don't know if this is relevant here, but I'll mention it in case.
 For perl5 there isn't a single good generic clone system. Probably the
 best (in terms of quality of what it can cope with, if not speed) is
 Storable.
 Storable provides a dclone method which internally serialises the passed
 in reference, and then deserialises a new reference. This is a lot of
 extra effort, but it does work. Obviously this fails parrot's aim -
 speed.
 
 I don't know whether this is relevant either:
 The internal representation of Storable's serialisation is stream of
 tagged data. There was a discussion on p5p a while back about whether it
 would be possible to make a callback interface to the tags, so that
 people could write custom deserialisers (eg vetting the data as it came
 in). This sort of interface to hooking deserialisation would be useful.
 Likewise being able to hook into the traversal/serialisation interface
 to provide custom output (eg XML when we're serialising as standard to
 compressed YAML) would save people having to reinvent
 traversal/introspection routines. (eg perl as both Data::Dumper and
 Storable in core doing separate traversal routines)
 
 I suspect that this is relevant:
 You can't trust you data deserialiser. It can do evil on you before it
 returns.
 
 I forget who worked this attack out, but Jonathan Stowe presented a talk
 on it at YAPC::EU in Amsterdam. The specific attack form on Storable is
 as follows, but it should generalise to any system capable of
 deserialising objects:
 
 You create a serialisation which holds a hash. The hash happens to have
 2 identical keys (the serialisation format does not prevent this).
 The second key is innocent. When Storable writes this key into the hash
 it's deserialising, perl's hash API automatically frees any previous
 value for that key. This is how hashes are supposed to work.
 
 Obviously, while deserialising there should never have been a previous
 key (a legitimate file generated from a real hash could not have had
 repeating keys). The problem is that things like perl let you
 (de)serialise objects, and objects have destructors that can run
 arbitrary code. So your attacker puts, as the value associated with the
 first copy of the key, an object with attributes that cause it to do
 something interesting. The suggestion for attacking a Perl 5 CGI was to
 use a CGITempFile object. The destructor looks like this:
 
 sub DESTROY {
     my($self) = @_;
     unlink $$self;  # get rid of the file
 }
 
 The attacker can craft a bogus CGITempFile object that refers to any
 file on the system, and when this object is destroyed it will attempt to
 delete that file at whatever privilege level the CGI runs at. And
 because that object is getting destroyed inside the deserialise routine
 of Storable, this all happens without the user-written code getting any
 chance to inspect the data. And even Storable can't do anything about
 it, because by the time it encounters the repeated hash key, it has
 already deserialised this time-bomb. How does it defuse it?
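The time-bomb mechanism described above can be modelled in a few lines of Python (an illustrative sketch, not Storable's API; `TempFile` and `deserialise` are hypothetical stand-ins, and CPython's reference counting plays the role of perl's immediate value freeing):

```python
deleted = []  # records which files the 'destructor' would have unlinked

class TempFile:
    """Stand-in for Perl's CGITempFile: destruction 'unlinks' a file."""
    def __init__(self, path):
        self.path = path
    def __del__(self):
        deleted.append(self.path)  # models: unlink $$self

def deserialise(pairs):
    """Toy deserialiser: builds a hash from an untrusted key/value stream."""
    h = {}
    for key, value in pairs:
        h[key] = value  # a repeated key silently frees the previous value
    return h

def hostile_stream():
    # Two identical keys; the first value is the time-bomb object.
    yield ("k", TempFile("/etc/passwd"))
    yield ("k", "innocent")

result = deserialise(hostile_stream())
# The 'destructor' already ran inside deserialise(), before the caller
# had any chance to validate the data.
print(deleted)  # ['/etc/passwd']
print(result)   # {'k': 'innocent'}
```

The point is the timing: the dangerous code runs while deserialisation is still in progress, so no amount of checking the *returned* structure helps.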

The simplest solution *I* can think of is to have Storable copy the
taint flag from the input string/stream onto every single string that it
produces.

Taint checking doesn't solve *all* security problems, of course, but it
can catch many of them, and it certainly would catch this one (if $$self
were tainted).

 Simple solutions such as checking the hash keys are unique don't work
 either. All the attacker does then is increase the abstraction slightly.
 Put the time-bomb as the first element in an array. Put one of these
 bogus hashes as the second element. Fine, you can realise that you've
 got bad keys in the bogus hash and never build it. But at this
 point the time-bomb object already exists. You'd have to validate the
 entire serialised stream before continuing. And if deserialisers for
 objects are allowed to fail, then you're still stuffed, because the
 attacker then crafts a time-bomb object, and a second malformed object
 that is known to cause its class's deserialiser to fail.

One defense against following a bomb with malformed data might be to
have Storable save up the SV*s and the names with which to bless them,
and only do the blessing *after* the data is fully deserialized, as a
last step before returning it to the user.  This way, if there's
malformed data, no destructors get called.  The user still needs to
validate the returned data, though, and rebless anything which might
result in an evil destructor being called.
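The deferred-blessing idea can be sketched in Python (illustrative only; `Placeholder`, `REGISTRY`, and `thaw` are hypothetical names, and Python classes stand in for blessed Perl objects):

```python
REGISTRY = {}   # class name -> real class (stand-in for known classes)

class Placeholder:
    """Inert stand-in: carries class name + state, has no destructor."""
    def __init__(self, classname, state):
        self.classname = classname
        self.state = state

def thaw(records):
    """Phase 1: build only inert placeholders.  Malformed data raises
    here, while no destructor-capable object exists yet."""
    pending = []
    for classname, state in records:
        if classname not in REGISTRY:
            raise ValueError("unknown class: " + classname)
        pending.append(Placeholder(classname, state))
    # Phase 2: 'bless' into real classes as the very last step.
    return [REGISTRY[p.classname](**p.state) for p in pending]

class Point:
    def __init__(self, x=0, y=0):
        self.x, self.y = x, y

REGISTRY["Point"] = Point
objs = thaw([("Point", {"x": 1, "y": 2})])
print(objs[0].x, objs[0].y)    # 1 2
```

Because a malformed record aborts phase 1 and placeholders carry no destructors, nothing dangerous can fire during the failure path.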

Another defense is to run deserialization and validation inside of a
Safe object.  Make sure that if the object fails to validate, it is
completely destructed before we exit the Safe compartment.

These defenses still aren't perfect:  One could embed a time-bomb inside
of a 

Re: [RfC] vtable-dump

2003-08-31 Thread Leopold Toetsch
Benjamin Goldberg [EMAIL PROTECTED] wrote:
class freezer {
class thawer {
class cloner {

[ big snip ]

Do you expect that these are overridden by some languages using parrot?
I.e. that ponie tries to implement a freezer that writes output for
Storable?

Further: having clone implemented in terms of freeze + thaw needs
additional memory for the intermediate frozen image. Isn't that
suboptimal?

A general traverse routine or iterator seems to be more flexible.

leo


Re: [RfC] vtable-dump

2003-08-30 Thread Benjamin Goldberg
Dan Sugalski wrote:
 
 On Fri, 29 Aug 2003, Leopold Toetsch wrote:
 
 Dan Sugalski [EMAIL PROTECTED] wrote:
 I think we'd be better served getting the freeze/thaw stuff in and

 We were just discussing this in the f'up.
 
 I read those, but I wanted to make sure the discussion went this way. :)
 
  If we make the initial dump scheme something mildly human readable
  (XML, YAML, whatever) then the debugger doesn't have to worry about
  formatting it for a while. (And we're going to want a pluggable dump
  scheme anyway, to make experimentation easier)

 Pluggable implies a dump() vtable method, doesn't it?
 
 Nope. Pluggable implies freezethaw.c provides all the functionality to
 freeze or thaw a PMC, with code in there to handle the right final
 encoding or decoding of the data.
 
 I thought we had a freeze and thaw entry in the vtables, but I see not.
 The signatures (since I don't have source handy to check them in) should
 be:
 
   STRING *freeze()
   PMC* thaw(STRING*)
 
 With thaw ignoring its initial PMC parameter. (It's a class method,
 essentially)

Actually, I think the following interface would be better:

   void freeze(PMC *freezer);
   void thaw  (PMC *thawer);

When you freeze a pmc, it's passed a handle into the freezing mechanism,
which it can then use to record the data that will be necessary to
reconstitute itself later.  This is done using the VTABLE_push_type
methods.

Pushing a pmc does *not* instantly result in that other pmc being
frozen; that only happens *after* the freeze() function returns.  In
effect, it marks the other pmc as needing serialization, which then
happens later.

During thawing, a pmc header is created for you, morphed into your
class, and then has VTABLE_thaw called on it.  It can then reconstitute
itself using VTABLE_shift_type on the handle to the thawer that's been
passed in to it.

Just as serialization of external pmcs in freeze wasn't immediate, any
pmc which you shift_ out of the Thawer object might not be fully
thawed... it might be a pmc header which has been created as a
placeholder, and which has not had VTABLE_thaw called on it yet.

Freezing and thawing would have algorithms similar to the mark-and-sweep
of the DoD (but simple fifos, or simple stacks, not any kind of priority
queue;)).
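Assuming the push/shift interface sketched above, the freezer's driver loop might look like this (a Python model, not Parrot code; it handles only lists and integers, and `id()` stands in for the PMC pointer):

```python
from collections import deque

class Freezer:
    """Drains a FIFO of objects.  An object's freeze step pushes its
    children instead of recursing, so cycles and deep nests are safe."""
    def __init__(self):
        self.todo = deque()
        self.ids = {}      # id(obj) -> small serialisation id
        self.out = []      # flat frozen stream

    def ref(self, obj):
        """Record a reference; enqueue the object the first time only."""
        key = id(obj)
        if key not in self.ids:
            self.ids[key] = len(self.ids)
            self.todo.append(obj)
        return self.ids[key]

    def run(self, root):
        self.ref(root)
        while self.todo:
            obj = self.todo.popleft()
            if isinstance(obj, list):   # 'aggregate PMC'
                self.out.append(("list", [self.ref(x) for x in obj]))
            else:                       # 'plain PMC'
                self.out.append(("int", obj))
        return self.out

# A self-referential aggregate: an array containing an int and itself.
p0 = [1]
p0.append(p0)
stream = Freezer().run(p0)
print(stream)  # [('list', [1, 0]), ('int', 1)]
```

Note how the shared/self-referential pmc is frozen once and afterwards appears only as its id (0), exactly the "mark now, serialise later" behaviour described above.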

I'm not *entirely* sure, but I think we might need/want to allow
STRING*s returned by shift_ to be placeholders too.  This allows the
freezer to print out the string data after the pmcs if it chooses to.
If it does, it can perhaps write the strings in a more compact manner.
Almost certainly moving the strings to the end would make compression by
gzip or whatever easier.

(I vaguely recall hearing that the following problem is NP-hard, but I
don't remember.  Does anyone know for sure?  Given a set of N strings,
construct a single larger string, of which all N strings are
substrings.  This allows all N strings to be represented on disk using
nothing more than the large string, and an offset and length for each
substring.)
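For what it's worth, the exact version of that problem (shortest common superstring) is indeed NP-hard, but a cheap greedy packing already captures the offset+length idea. A sketch (illustrative, not proposed Parrot code):

```python
def pack(strings):
    """Greedy packing: reuse an existing occurrence when possible,
    otherwise append.  Not optimal (the exact shortest-common-superstring
    problem is NP-hard), but every input string still becomes just an
    (offset, length) pair into one big string."""
    big = ""
    table = {}
    for s in sorted(strings, key=len, reverse=True):  # long strings first
        i = big.find(s)
        if i < 0:                    # not already a substring: append it
            i, big = len(big), big + s
        table[s] = (i, len(s))
    return big, table

big, table = pack(["freeze", "eze", "thaw", "aw"])
print(big)            # freezethaw
print(table["eze"])   # (3, 3) -- found inside "freeze", stored for free
```

A smarter variant would also merge overlapping prefixes/suffixes, but even this version deduplicates every full substring.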



Re: [RfC] vtable-dump

2003-08-30 Thread Leopold Toetsch
Benjamin Goldberg [EMAIL PROTECTED] wrote:
 Actually, I think the following interface would be better:

void freeze(PMC *freezer);
void thaw  (PMC *thawer);

I'm thinking of (in horrible pseudo code ;-):

typedef struct {
    size_t  action_func_nr;     // clone, freeze, thaw, dump vtable#
    PMC    *todo_list_fifo;
    PMC    *seen;
    size_t  id;
    UINTVAL flags;
    PMC    *other;              // for clone and thaw
    void   *result;             // freeze, dump append string here
    void   *other_things_needed;
} Traverse_info;


INTVAL traverse(Traverse_info *info) {
    (SELF->vtable[info->action_func_nr])(INTERP, SELF, info);
    foreach item (SELF.elements) {
        if (item->needs_cloning)
            VTABLE_push(INTERP, info->todo_list_fifo, item);
        else
            ;   // copy native type
    }
    return 0;   // ok
}


INTVAL clone(Traverse_info *info) {
    if (SELF.can("clone"))      // overridden?
        return SELF.invoke(SELF.find_method("clone"), info);
    if (info->flags & DO_CLONE)
        copy(info->other->state, SELF.state);
    else
        ;   // other pointer already done for self-ref aggregates
    return 0;   // ok
}

freeze_thaw_clone_dump.c    // ;-)

PMC *do_clone(PMC *pmc) {
    return (PMC*)
        do_traverse(action_func_nr = (VTABLE.clone - VTABLE[0]), CLONE, pmc);
}

void *do_traverse(func, what, pmc) {
    info.action_func_nr = func;
    info.todo_list_fifo = new Fifo;
    info.flags = what;
    info.seen = new Hash;
    info.result = NULL;
    push(info.todo_list_fifo, pmc);
    while (info.todo_list_fifo.elements()) {
        todo = shift info.todo_list_fifo;
        if (seen{todo}) {
            // already visited: set flags, reuse the existing copy
            info.flags |= DO_NOT_CLONE;
            info.other = todo;
        }
        else {
            seen{todo} = ++ID;
            switch (what) {
              case CLONE:
                info.other = new todo.type();
                if (!info.result)
                    info.result = info.other;   // top clone
                break;
              ...
            }
        }
        info.id = ID;
        if (todo->vtable->traverse(..., info))
            ;   // error
    }
    return (void*) info.result;
}
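The clone path of such a framework can be modelled as a small runnable Python sketch (illustrative; lists stand in for aggregate PMCs, `id()` for the PMC pointer). It survives the self-referential P0-contains-P0 case that currently segfaults:

```python
from collections import deque

def do_clone(root):
    """Worklist clone in the spirit of the pseudocode above: a FIFO plus
    a 'seen' map replaces recursion, so self-referential aggregates and
    very deep nests cannot blow the stack."""
    if not isinstance(root, list):
        return root                 # plain item: nothing to traverse
    seen = {id(root): []}           # original -> (empty) clone
    todo = deque([root])
    while todo:
        obj = todo.popleft()
        clone = seen[id(obj)]
        for item in obj:
            if isinstance(item, list):
                if id(item) not in seen:
                    seen[id(item)] = []     # make the header first...
                    todo.append(item)       # ...fill it in later
                clone.append(seen[id(item)])
            else:
                clone.append(item)  # copy native type
    return seen[id(root)]

# The self-referential case: P0 = [P1, P0]
p0 = [1]
p0.append(p0)
p2 = do_clone(p0)
print(p2[0], p2[1] is p2)   # 1 True -- the clone points at itself
```

The "seen" map is doing the same job as the ++ID bookkeeping above: a revisited aggregate is not cloned again, it is wired to the copy that already exists.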

[ big snip ]

The above functionality does IMHO match your description. The only
difference is that I'd like to use it not only for freeze/thaw, but
for clone and dump too.

The thaw part is still missing in the above pseudocode. It needs an
additional ID-array for lookup + some more things. It's better done in a
separate subroutine.

There is one more indirection through the passed in vtable but that
doesn't matter very much. Constructing new PMCs or STRINGs takes much
more time.

And I'd like to have a vtable->dump too. Dan's approach of first
freezing the PMC and then letting the debugger construct the dump
doesn't work IMHO: we already have dynamically loaded PMC classes. The
debugger doesn't know anything about such classes, so these classes have
to provide this functionality themselves. We could rename dump() to
pretty_print() though.

leo


[RfC] vtable-dump

2003-08-29 Thread Leopold Toetsch
Tracing through Parrot programs shows the type and value of variables.
This doesn't work very well for PMCs: trace_pmc_dump() tries to extract
some info from the PMC, but it would be much better if the PMC itself
could return a string representation of itself.

STRING* dump(INTERP, SELF, STRING *, INTVAL flags);

append and return a printable representation of SELF, with flags being:
0x1 ... name (whoami)
0x2 ... id (enum)
0x4 ... value(s) (non aggregates)
0x8 ... address
... e.g. newlines, tabs in output for aggregates
0x100...n ... dump of aggregate items at nesting (n-0x100)
Comments welcome
leo


Re: [RfC] vtable-dump

2003-08-29 Thread Juergen Boemmels
Leopold Toetsch [EMAIL PROTECTED] writes:

 Tracing through Parrot programs shows the type and value of
 variables. This doesn't work very well for PMCs: trace_pmc_dump()
 tries to extract some info from the PMC, but it would be much better
 if the PMC itself could return a string representation of itself.

ACK, wanted something like this forever.

 STRING* dump(INTERP, SELF, STRING *, INTVAL flags);

What is the STRING * parameter good for? I assume line headers.

 append and return a printable representation of SELF, with flags being:
 0x1 ... name (whoami)
 0x2 ... id (enum)
 0x4 ... value(s) (non aggregates)
 0x8 ... address
 ... e.g. newlines, tabs in output for aggregates
 0x100...n ... dump of aggregate items at nesting (n-0x100)

Use the lower bits for nesting depth. This way you can use
n = flags & 0xff to obtain the nesting depth, and still define
the flags which can be passed down to the dump of the children:
dump (..., (flags & ~0xff) | (n-1));
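That low-byte depth scheme can be sketched like so (Python; the flag values are hypothetical, shifted up one byte from Leo's proposal to make room for the depth):

```python
# Hypothetical flag bits, moved up one byte so the low byte holds depth.
NAME, ID, VALUES, ADDRESS = 0x100, 0x200, 0x400, 0x800
DEPTH_MASK = 0xff

def child_flags(flags):
    """Keep the high flag bits, decrement the depth in the low byte."""
    n = flags & DEPTH_MASK
    return (flags & ~DEPTH_MASK) | (n - 1) if n else flags

flags = NAME | VALUES | 3       # dump names and values, 3 levels deep
print(hex(child_flags(flags)))  # 0x502 -- same flags, depth 2
```

At depth 0 the flags pass through unchanged, so the recursion naturally bottoms out.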

 Comments welcome 
 leo

always
boe
-- 
Juergen Boemmels[EMAIL PROTECTED]
Fachbereich Physik  Tel: ++49-(0)631-205-2817
Universitaet Kaiserslautern Fax: ++49-(0)631-205-3906
PGP Key fingerprint = 9F 56 54 3D 45 C1 32 6F  23 F6 C7 2F 85 93 DD 47


Re: [RfC] vtable-dump

2003-08-29 Thread Leopold Toetsch
Juergen Boemmels [EMAIL PROTECTED] wrote:
 Leopold Toetsch [EMAIL PROTECTED] writes:

 STRING* dump(INTERP, SELF, STRING *, INTVAL flags);

 what is the STRING * parameter good for. I assume lineheaders.

The place where to append SELF.dump. Appending doesn't consume a new
header for each piece.

 Use the lower bits for nesting depth. This way you can use
 n = flags & 0xff to obtain the nesting depth, and still define
 the flags which can be passed down to the dump of the children:
 dump (..., (flags & ~0xff) | (n-1));

That would restrict the nesting depth. 2 more shifts don't harm,
dump()ing will be slow enough.

BTW what about self-referential structures (same problem as with clone):

$ cat rclone.pasm
new P0, .PerlArray
new P1, .PerlInt
push P0, P1
push P0, P0
clone P2, P0
end

$ ulimit -m 5000 # don't forget, if you haven't any
$ parrot rclone.pasm
Segmentation fault (core dumped)

I'm thinking of dump to be:
- if (flag) return
- set a flag on SELF
- dump children
- reset flag
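A minimal model of that flag-based dump (Python; a visited set plays the role of the per-PMC flag, and lists stand in for aggregates):

```python
def dump(obj, visiting=None):
    """Set a flag on SELF on the way down, dump the children, reset the
    flag on the way up; a node whose flag is already set is a cycle."""
    if visiting is None:
        visiting = set()
    if not isinstance(obj, list):
        return repr(obj)
    if id(obj) in visiting:         # flag already set: self-reference
        return "<self-ref>"
    visiting.add(id(obj))           # set a flag on SELF
    body = ", ".join(dump(x, visiting) for x in obj)  # dump children
    visiting.discard(id(obj))       # reset flag
    return "[" + body + "]"

# The rclone.pasm shape: P0 = [P1, P0]
p0 = [1]
p0.append(p0)
print(dump(p0))  # [1, <self-ref>]
```

Note this version is still recursive, so, exactly as the next paragraph says, it terminates on cycles but can still blow the stack on very deeply nested (acyclic) structures.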

But this doesn't help against very deeply nested structures. An
alternative would be to have a general iterator with a todo-list like
the current DOD mark_ptr. We could use a generalization of the mark
scheme by passing a function argument for DOD too, with some(?) impact
on DOD performance.

When we want to use this for serialization too, we need some means to
refer to an already visited PMC, passing an optional hash would do it
IMHO.

 boe

leo


Re: [RfC] vtable-dump

2003-08-29 Thread Juergen Boemmels
Leopold Toetsch [EMAIL PROTECTED] writes:

 Juergen Boemmels [EMAIL PROTECTED] wrote:
  Leopold Toetsch [EMAIL PROTECTED] writes:
 
  STRING* dump(INTERP, SELF, STRING *, INTVAL flags);
 
  what is the STRING * parameter good for. I assume lineheaders.
 
 The place, where to append SELF.dump. Appending doesn't consume a new
 header for each piece.

Ah, ok.

How do you want to solve indentation in multiline dumps?
With an additional indent parameter?

  Use the lower bits for nesting depth. This way you can use
  n = flags & 0xff to obtain the nesting depth, and still define
  the flags which can be passed down to the dump of the children:
  dump (..., (flags & ~0xff) | (n-1));
 
 That would restrict the nesting depth. 2 more shifts don't harm,
 dump()ing will be slow enough.

Ok, then use a mask of 0x. Or even 0xff (I don't want to see a
dump of depth 4 million on my screen). I wanted to keep the
opportunity to pass the other flags, like multiline or name-only, down
to the children.

 BTW what about self-referential structures (same problem as with clone):
 
 $ cat rclone.pasm
 new P0, .PerlArray
 new P1, .PerlInt
 push P0, P1
 push P0, P0
 clone P2, P0
 end
 
 $ ulimit -m 5000 # don't forget, if you haven't any
 $ parrot rclone.pasm
 Segmentation fault (core dumped)
 
 I'm thinking of dump to be:
 - if (flag) return
 - set a flag on SELF
 - dump children
 - reset flag
 
 But this doesn't help against very deeply nested structures. An
 alternative would be to have a general iterator with a todo-list like the
 current DOD mark_ptr. We could use a generalization of the mark-scheme
 by passing a function argument for DOD too with some? impact on DOD
 performance.

A general nesting iterator would be a good idea. It might be useful in
quite a few places: DOD, destruction ordering, dumping, serialisation,
cloning. But DOD is the most timing-critical: it's called often and
has to visit all active objects, whereas the others only visit the
affected objects (the ones which get destroyed, dumped, serialized,
cloned).

 When we want to use this for serialization too, we need some means to
 refer to an already visited PMC, passing an optional hash would do it
 IMHO.

When using callbacks, they often take a void *param. The
interpretation of this param is the job of the callback.

boe


Re: [RfC] vtable-dump

2003-08-29 Thread Dan Sugalski
On Fri, 29 Aug 2003, Leopold Toetsch wrote:

 Tracing through Parrot programs shows the type and value of variables.
 This doesn't work very well for PMCs: trace_pmc_dump() tries to extract
 some info from the PMC, but it would be much better if the PMC itself
 could return a string representation of itself.
 
 STRING* dump(INTERP, SELF, STRING *, INTVAL flags);
 
 append and return a printable representation of SELF, with flags being:
 0x1 ... name (whoami)
 0x2 ... id (enum)
 0x4 ... value(s) (non aggregates)
 0x8 ... address
 ... e.g. newlines, tabs in output for aggregates
 0x100...n ... dump of aggregate items at nesting (n-0x100)
 
 Comments welcome

I think we'd be better served getting the freeze/thaw stuff in and 
teaching the debugger how to dump a frozen PMC. This'll get us two 
benefits:

1) We'll have a single way to dump a PMC rather than two
2) We'll have a mechanism in place to handle PMC constants better

If we make the initial dump scheme something mildly human readable (XML, 
YAML, whatever) then the debugger doesn't have to worry about formatting 
it for a while. (And we're going to want a pluggable dump scheme anyway, 
to make experimentation easier)

Dan



Re: [RfC] vtable-dump

2003-08-29 Thread Dan Sugalski
On Fri, 29 Aug 2003, Leopold Toetsch wrote:

 Dan Sugalski [EMAIL PROTECTED] wrote:
  I think we'd be better served getting the freeze/thaw stuff in and
 
 We were just discussing this in the f'up.

I read those, but I wanted to make sure the discussion went this way. :)
 
  If we make the initial dump scheme something mildly human readable (XML,
  YAML, whatever) then the debugger doesn't have to worry about formatting
  it for a while. (And we're going to want a pluggable dump scheme anyway,
  to make experimentation easier)
 
 Pluggable implies a dump() vtable method, doesn't it?

Nope. Pluggable implies freezethaw.c provides all the functionality to 
freeze or thaw a PMC, with code in there to handle the right final 
encoding or decoding of the data. 

I thought we had a freeze and thaw entry in the vtables, but I see not. 
The signatures (since I don't have source handy to check them in) should 
be:

  STRING *freeze()
  PMC* thaw(STRING*)

With thaw ignoring its initial PMC parameter. (It's a class method, 
essentially)

Dan



Re: [RfC] vtable-dump

2003-08-29 Thread Leopold Toetsch
Dan Sugalski [EMAIL PROTECTED] wrote:
 I think we'd be better served getting the freeze/thaw stuff in and

We were just discussing this in the f'up.

 teaching the debugger how to dump a frozen PMC. This'll get us two
 benefits:

 1) We'll have a single way to dump a PMC rather than two
 2) We'll have a mechanism in place to handle PMC constants better

Yep.

 If we make the initial dump scheme something mildly human readable (XML,
 YAML, whatever) then the debugger doesn't have to worry about formatting
 it for a while. (And we're going to want a pluggable dump scheme anyway,
 to make experimentation easier)

Pluggable implies a dump() vtable method, doesn't it?

   Dan

leo


Re: [RfC] vtable-dump

2003-08-29 Thread Leopold Toetsch
Dan Sugalski [EMAIL PROTECTED] wrote:
 On Fri, 29 Aug 2003, Leopold Toetsch wrote:

 I think, we need a general solution for freeze, dump and clone. As shown
 the latter is broken. That would be IMHO an iterator interface with a
 callback function taking and returning a void*.

 Right, what I want is for the system freeze/thaw routines to have a
 standard way to encode a variety of core types (ints, floats, strings),
 lists, and hashes, as well as a way to handle self-referential, circular,
 and shared data in the dumped PMC data. It's the only way to safely do
 this, since we can't count on all the different PMC classes to handle
 dumping arrays of arrays of references to the same end PMC properly. Which
 we have to if we want this to work right.

 The freeze(), dump() and clone() would be vtables which are called by
 the iterator. So these functions should have the same signature.

 No iterator. When you call freeze on a PMC, that PMC is responsible for
 any sort of iteration that may have to be done. (Though there is the issue
 of blowing stack, which we'll have to address)

Aren't you saying the opposite of the above here? I want to be able to
traverse from a given start point (be it the interpreter itself or some
PMC) as deeply down as there is something. You did say that we don't
recursively mark() because of recursion depth. clone(), dump(), freeze()
all have these same issues (plus self-referential structures).

That's why I'm speaking of an iterator. An array can't clone() itself
safely. It can clone() the SELF pmc and then put array items on a TODO
list, which then get cloned by the framework.

It's like interpreter->mark_ptr holding a TODO list of items still to
mark, which can put other items onto this list too.

So I'm speaking of a framework which calls a vtable->traverse() method
that calls vtable methods (freeze, clone, dump, or whatever) on SELF
and then puts aggregate items (if any) on a TODO list. This
functionality is IMHO that of an iterator. This does address stack
blowing and self-referential structures.

The TODO list can be avoided for plain items, same thing as we do in
pobject_lives().

   Dan

leo