cvsuser 03/11/20 02:29:05
Modified: . MANIFEST
Added: docs/dev pmc_freeze.pod
examples/benchmarks freeze.pasm freeze.pl
Log:
pmc_freeze devel doc and a benchmark
Revision Changes Path
1.505 +3 -0 parrot/MANIFEST
Index: MANIFEST
===================================================================
RCS file: /cvs/public/parrot/MANIFEST,v
retrieving revision 1.504
retrieving revision 1.505
diff -u -w -r1.504 -r1.505
--- MANIFEST 19 Nov 2003 15:43:17 -0000 1.504
+++ MANIFEST 20 Nov 2003 10:29:00 -0000 1.505
@@ -182,6 +182,7 @@
docs/debugger.pod [main]doc
docs/dev/byteorder.dev [main]doc
docs/dev/dod.dev [devel]doc
+docs/dev/pmc_freeze.pod [devel]doc
docs/dev/infant.dev [devel]doc
docs/dev/jit_i386.dev [main]doc
docs/dev/longopt.dev [main]doc
@@ -285,6 +286,8 @@
examples/benchmarks/bench_newp.pasm [main]doc
examples/benchmarks/fib.imc [main]doc
examples/benchmarks/fib.pl [main]doc
+examples/benchmarks/freeze.pl [main]doc
+examples/benchmarks/freeze.pasm [main]doc
examples/benchmarks/gc_alloc_new.pasm [main]doc
examples/benchmarks/gc_alloc_reuse.pasm [main]doc
examples/benchmarks/gc_generations.pasm [main]doc
1.1 parrot/docs/dev/pmc_freeze.pod
Index: pmc_freeze.pod
===================================================================
=head1 TITLE
pmc_freeze.c - design notes
=head1 Overview
Freezing or serializing arbitrary PMCs is an interesting problem.
Aggregates can hold other aggregates nested deeply, so that a
recursive approach easily could blow the stack especially on embedded
systems. And aggregates can be self-referential, they can hold
pointers to the aggregate itself, so that working on such structures
could create infinite loops.
=head1 Coverage
Albeit the file is named pmc_freeze.c it finally will deal with all
kinds of operations, that deeply traverse arbitrary structures, that
is:
=over 4
=item freeze
Called from user code to serialize the state of a PMC into some
possibly binary representation held in a STRING.
=item freeze_at_destruct
A variant of above, possibly called from an exception handler or on
resource shortage before interpreter shutdown, to save some data
before dying. It must not consume any additional resources.
=item thaw
The opposite of above: Reconstruct all PMCs to generate an identically
copy of the original frozen PMC. This is also called from user code.
=item dclone
Deeply clone an aggregate, which is basically thaw(freeze(p)).
=item dump, pretty_print
Create a visual representation of an aggregate.
=item destruction ordering
Find logical dependencies of PMCs, so that they can get destructed in
appropriate order. This is also called on interpreter shutdown.
=item mark
Mark all objects being live by calling B<pobject_lives> called from
DOD. While the functionality is the same, it will not be implemented
on top of this general scheme for performance reasons. This leads to
some code duplication, but DOD is run permanently and deserves all
speed it can get.
=back
=head1 Description
The basic scheme of operation looks like this:
info = init()
push todo_list, pmc
while (todo_list)
current = shift todo_list
current->visit(info)
done.
=head2 The visit_info structure
This structure holds all necessary information and function pointers
specific to the desired functionality. It gets passed on to all vtable
methods and callback functions.
=head2 Working loops
These are labeled B<visit_loop_*>. There are currently to schemes to
handle the B<todo_list>.
=over 4
=item next_for_GC
All PMCs that can contain other PMCs have the B<next_for_GC> pointer
in the PMCs extended data area. The B<todo_list> gets built by
appending (or prepending) the current PMC to a B<mark_ptr>, which then
points to the current PMC forming a linked list of items.
This pointer is also used during DOD's B<mark()> functionality, so
that DOD has to be turned off during operations using this scheme.
As the B<next_for_GC> pointer is inside the PMC, this scheme isn't
thread-safe at low-level, because shared PMCs also would share this
pointer, so that there can be only one operation at a time.
=item todo list
A B<List> called B<todo> holds items still to be worked on. This
method is slower, consumes more resources, but doesn't interfer with
DOD runs and is thread-safe.
=back
=head2 Putting items on the todo list
This is done by a callback function inside the B<visit_info> structure
called B<visit_child_function>. It gets called initially to put the
first item on the list and is called therafter from all PMCs for
contained PMCs inside the B<visit> vtable method.
=head2 The visit() vtable
The general scheme above shows that this method is called for all
items on the B<todo_list>. B<visit> has to call
B<visit_child_function> for all contained PMCs, which then get
visitted until all is done.
=head2 The visit_child_function() callback
The basic operation is:
(seen, id) = was_already_seen(pmc)
do_specific_action(pmc, seen, id)
if (!seen)
pmc->visit_function()
=head2 Avoiding duplicates
As stated in the introduction strucutures can be self-referential,
they can contain (at an arbitrary depth) PMCs, that where already
processed. Just following these PMCs would lead to endless loops. So
already B<seen> PMCs have to be remembered.
=over 4
=item The B<seen> hash
Using a B<Hash> is one method to avoid duplicates. The B<seen> hash
holds keys being the address of the PMC and values being a PMC B<id>,
which is unique for this PMC. While this is straight forward, it consumes
16 bytes per PMC (plus overhead, 32-bit system assumed). Hash lookups
also take a considerable amount of time.
=item B<next_for_GC>
The pointer used for the B<todo_list> handling itself can serve as a
marker that this item was already processed. There are some issues
with this though: Plain scalars (not being able to contain other PMCs)
don't have a B<next_for_GC> pointer. This is an optimization reducing
the size of scalars and increasing performance considerably.
Second, the B<next_for_GC> pointers have to be cleared beforehand. DOD
uses only a nibble-sized flag area located inside the PMCs arena to
manage, if a PMC was seen already by checking the live bit. The
B<next_for_GC> pointer is just set and never cleared to avoid touching
a PMCs memory and polluting caches when possible.
Finally, generating a PMCs B<id> isn't as simple as just incrementing
a counter used with the B<seen> hash approach.
=item PMC B<id>s
We could of course use the PMCs address as its own B<id>, its for sure
unique. But then we get a suboptimal operation during thawing. To
manage duplicates during B<thaw> we basically need a mapping
B<PMC_in_image =E<gt> newly_constructed_PMC>. When now the
B<PMC_in_image> (the B<id>) is the address, we have to use a hash
again, for B<thaw()> with all the negative impact on resources and
speed.
So both schemes are using small B<id> values and the seen handling
inside B<thaw> is done via a list lookup, which is a lot faster and
takes less resources.
The B<seen> hash approach just has a counter for PMC B<id>s, the
B<next_for_GC> approach calculates the B<id> from the address of the
PMC in its arena, again yielding a small and unique number. The two lo
bits of PMC B<id>s are used as flags.
=back
=head2 The actual action
So after all we finally arrived at the point to actually perform the
desired functionality. First the PMC-specific part is done inside
F<pmc_freeze.c> then the specific vtable method B<freeze>, B<thaw>,
whatever, is called, again via a function pointer called
B<visit_function> (albeit B<action_function> might be a better name
for it).
=head1 Freeze and thaw
As stated PMCs are currently processed inside the core, PMC-specific
parts are done by calling the PMCs vtable method. This parts could of
course be moved to F<default.pmc> too, so that its simpler to override
the functionality.
=head2 Serializer interface
During initialisation the B<visit_info>s B<image_io> data pointer is
filled with an object having B<vtable> methods that remarkably look
like a PMCs vtable. So B<io-E<gt>vtable-E<gt>push_integer> spits out
an INTVAL to the frozen B<image>, while B<shift_integer> gets an
INTVAL from the frozen stream.
This simplifies final changes when B<image_io> becomes just a PMC of some
serializer class. There are currently two serializers:
=over 4
=item Plain text
This serializer is mainly intended for testing. Having a readable
representation of the image simplifies debugging a lot.
=item Parrot Byte Code
We already have a platform-independent way of reading and writing
opcodes, string-, and number-constants. So this serializer uses
functionality of the packfile routines. The produced image isn't as
dense as it could be though, because all data are aligned at
B<opcode_t> boundaries.
=back
=head2 Image data format
PMC B<id>s ranging from 1 to N-PMCs are shifted left by two, so that
the 2 lo bits can serve as flags:
id + 0x1 ... PMC was seen
id + 0x2 ... PMC has same type as previous PMC
id + 0x3 ... escape flag
A PMCs image generally looks like:
<id><type><pmc-specific-data>
The text prepresentation of the array
P0 = [P1=666, P2=777, P0]
may look like:
0xdf4 30 3 0xdf8 33 666 0xdf2 777 0xdf5
0xdf4 ... PMC id (with "0x" in front for clarity)
30 ... enum_class_PerlArray
3 ... elements count
0xdf8 ... id of first element
33 ... enum_class_PerlInt
666 ... value
0xdf2 ... id of second element, same type as prev element
777 ... value
0xdf5 ... id of array itself with lo bit set
[ To be continued ]
=head1 Author
Leopold Toetsch <[EMAIL PROTECTED]>
1.1 parrot/examples/benchmarks/freeze.pasm
Index: freeze.pasm
===================================================================
new P0, .PerlArray
set I0, 100000
time N0
lp1:
push P0, I0
dec I0
if I0, lp1
time N1
sub N1, N0
print "constr.time "
print N1
print "\n"
time N0
freeze S0, P0
time N1
sub N1, N0
print "freeze time "
print N1
print "\n"
# print S0
# print "\n"
time N0
thaw P10, S0
time N1
sub N1, N0
print " thaw time "
print N1
print "\n"
# time N0
# clone P11, P0
# time N1
# sub N1, N0
# print " clone time "
# print N1
# print "\n"
print "Image len "
length I0, S0
print I0
print "\n"
typeof S10, P10
print S10
print " "
set I11, P10
print I11
print "\n"
end
1.1 parrot/examples/benchmarks/freeze.pl
Index: freeze.pl
===================================================================
#!/usr/bin/perl -w
use strict;
use Storable qw( freeze thaw dclone );
use Time::HiRes qw( time );
my (@a);
my ($s, $e);
$s = time();
for my $i (0..99999) {
push @a, $i;
};
$e = time();
printf "constr.time %.6f\n", $e-$s;
$s = time();
my $image = freeze([EMAIL PROTECTED]);
$e = time();
printf "freeze time %.6f\n", $e-$s;
$s = time();
my @b = @{ thaw $image };
$e = time();
printf " thaw time %.6f\n", $e-$s;
#$s = time();
#my $c = dclone [EMAIL PROTECTED];
#$e = time();
#printf " clone time %.6f\n", $e-$s;
print "Image len ", length($image), "\n";
print "array size ", scalar(@b), "\n";