Re: [perl #15942] UNICOS/mk new unhappiness: hash.c

2002-08-04 Thread Sean O'Rourke

On Sun, 4 Aug 2002, Mike Lambert wrote:
> Unfortunately, this causes different semantics for whether you are storing
> primitives or pointers (primitives copy, whereas pointers are shallow). Of
> course, one could argue that the previous one didn't work at all. :)
>
> Thoughts?

Well, it's certainly wrong (though very efficiently so!) -- it needs to
call string_copy for strings and vtable->clone for pmc's (attached).

Are hashes the only (non-packed) containers we'll have to worry about
holding things other than PMC's, or will arrays need this same snippet?

/s


Index: hash.c
===
RCS file: /cvs/public/parrot/hash.c,v
retrieving revision 1.12
diff -p -u -w -r1.12 hash.c
--- hash.c  3 Aug 2002 07:49:23 -   1.12
+++ hash.c  4 Aug 2002 07:38:20 -
@@ -436,8 +436,31 @@ hash_clone(struct Parrot_Interp * interp
     for (i = 0; i < hash->num_buckets; i++) {
         HASHBUCKET * b = table[i];
         while (b) {
-            /* XXX: does b->key need to be copied? */
-            hash_put(interp, ret, b->key, &(b->value));
+            KEY_ATOM valtmp;
+            switch (b->value.type) {
+            case enum_key_undef:
+            case enum_key_int:
+            case enum_key_num:
+                valtmp = b->value;
+                break;
+
+            case enum_key_string:
+                valtmp.type = enum_key_string;
+                valtmp.val.string_val
+                    = string_copy(interp, b->value.val.string_val);
+                break;
+
+            case enum_key_pmc:
+                valtmp.type = enum_key_pmc;
+                valtmp.val.pmc_val = b->value.val.pmc_val->vtable->clone(
+                    interp, b->value.val.pmc_val);
+                break;
+
+            default:
+                internal_exception(-1, "hash corruption: type = %d\n",
+                                   b->value.type);
+            };
+            hash_put(interp, ret, b->key, &valtmp);
             b = b->next;
         }
     }
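
The copy semantics the patch is after can be shown in isolation. This is a minimal sketch, not Parrot code: Atom, AtomType and atom_clone are hypothetical stand-ins for KEY_ATOM and the switch in hash_clone, and a plain malloc'd C string stands in for a Parrot STRING.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for Parrot's KEY_ATOM; not the real definitions. */
typedef enum { ATOM_UNDEF, ATOM_INT, ATOM_NUM, ATOM_STRING } AtomType;

typedef struct {
    AtomType type;
    union {
        long   int_val;
        double num_val;
        char  *string_val;   /* owned, NUL-terminated */
    } val;
} Atom;

/* Copy src into *dst. Primitives are copied by value; strings get
 * their own storage, so the clone never aliases the original.
 * Returns 0 on success, -1 on allocation failure or unknown type. */
static int atom_clone(Atom *dst, const Atom *src)
{
    assert(dst != NULL && src != NULL);
    switch (src->type) {
    case ATOM_UNDEF:
    case ATOM_INT:
    case ATOM_NUM:
        *dst = *src;
        return 0;
    case ATOM_STRING: {
        size_t n = strlen(src->val.string_val) + 1;
        char *copy = malloc(n);
        if (copy == NULL)
            return -1;
        memcpy(copy, src->val.string_val, n);
        dst->type = ATOM_STRING;
        dst->val.string_val = copy;
        return 0;
    }
    default:
        return -1;   /* corrupt tag: refuse rather than copy garbage */
    }
}
```

With this shape, changing the original string after cloning leaves the clone untouched, which is the property the shallow hash_put call lacked.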



Re: Unifying PMCs and Buffers for GC

2002-08-04 Thread Mike Lambert

> Okay, I finally give. For purposes of liveness tracing and GC, we're
> going to unify PMCs and strings/buffers. This means we trace through
> strings and buffers if the flags are right, and we need to add a GC
> link pointer to strings/buffers. It'll make things a bit larger,
> which I don't like, but it lifts some restrictions I see looming,
> which I do like.
>
> Anyone care to take a shot at this?

I've started on this task, although it seems to be rather involved. :)

What follows is basically a brain dump on my current ideas that I'm
tossing around in an attempt to resolve the unification issues while
retaining the current speed.

a) The current hash implementation works against the GC, not with it.
Since we currently need a PerlHash PMC surrounding every buffer, these are
directly related by such a unification, and it would be good to allow for
them, and other such data structures.

I'm currently favoring allowing for header pools on a per-type basis, not
just a per-size basis. This would give us a 'hash' pool. The pool
structure would contain function pointers for collection and/or dod
purposes. (stuff that would otherwise be in a PMC vtable.)

Since collection phases are done on a per-header-pool basis already, it
wouldn't be difficult to make per-pool collection functions that are
responsible for iterating over their elements and handling them.

This would help speed up hashes, and make them easier to implement, since
they could update their internal pointers on hash relocation, while it's
all still in the cache.


However, dod functions are a bit harder to handle. mark_used currently
calls pmc->vtable->mark to handle its behavior, and buffers don't do
anything special. This is what prevents hashes from being implemented as
buffers, GC-wise...they need special collection logic. Currently, any
buffer that contains pointers *must* be surrounded by a PMC which
indicates its behavior, or it's considered a dumb data pointer, like
strings.

One idea, which is most closely in line with the current semantics, is to
add a pool pointer to every header. I've found a few times in the past
where such a pointer would have come in handy. This would allow us to call
the pool's mark() function, to handle stuff like pointing-to-buffers, etc.

Its main drawback is the additional size of the pointer in the header.
I believe this might be okay for a few reasons:

a) our main types, pmc and string, are already quite large. This isn't
that much in their scale of things.

b) it allows us to make new types of buffer-like headers on par with
existing structures. This should hopefully make the core GC code change
less often, and push it out onto the implementation of the headers.

c) currently pmc's have a vtable pointer. If we're really concerned about
the additional data element, we could do something like:
pmc->pool.vtable->add_used instead of the traditional vtable-> . I'm not
convinced of the merit of this idea, and if the 'add' is deemed too slow,
we can just keep a vtable *and* pool pointer in the PMC header.
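
A minimal sketch of the pool-pointer idea, under stated assumptions: the names Header, Pool, HashHeader, mark_plain and mark_hash are made up for illustration and do not match any Parrot structure. Each header carries a pointer to its pool, and the pool supplies the type-specific mark function.

```c
#include <assert.h>
#include <stddef.h>

struct Pool;

/* Every header starts with a pointer to the pool that owns it. */
typedef struct Header {
    struct Pool *pool;
    int          live;     /* set by the mark phase */
} Header;

/* Per-type behaviour lives in the pool rather than in each header. */
typedef struct Pool {
    const char *name;                 /* debugging tag */
    size_t      header_size;
    void      (*mark)(Header *h);     /* how to trace this type */
} Pool;

typedef struct HashHeader {
    Header  base;        /* must be first so Header* casts are valid */
    Header *value;       /* a hash points at other headers */
} HashHeader;

static void mark_used(Header *h);

/* Plain data (for example a string) points at nothing else. */
static void mark_plain(Header *h) { (void)h; }

/* A hash must also mark whatever it refers to. */
static void mark_hash(Header *h)
{
    HashHeader *hh = (HashHeader *)h;
    if (hh->value != NULL)
        mark_used(hh->value);
}

/* Generic entry point: no knowledge of the concrete type is needed. */
static void mark_used(Header *h)
{
    assert(h != NULL && h->pool != NULL);
    if (h->live)
        return;          /* already visited; also stops cycles */
    h->live = 1;
    h->pool->mark(h);
}
```

The core marking code only ever touches h->pool->mark, so a new buffer-like type needs a new pool, not a change to the collector.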


One implication of c) is that every pmc type has its own pool. This means:

a) no pmc type morphing. once in a pool, it stays in a pool. I don't see
this as a big loss, since type morphing is error-prone to begin with, imo.

b) data members! Since not all pmcs are the same size, pmcs are able to
store data elements in their structure. This allows us to make a SV-like
PMC which stores str-value, int-value, float-value, etc. All without
imposing on the base PMC buffer size. (no, data and cache aren't enough to
handle the above three values, without having the data point to a header
pointing to a buffer containing the values.)


Thoughts on all of the above? The main drawback that I see is that we can
have a lot more pools. Currently, we don't take advantage of sized header
pools, so making them per-type won't hurt us. However, by making different
pools for different pmc types, an explosion in base pmc types could cause
an explosion in pools and create wasteful memory usage as each pool stores
'extra' headers for allocation. This can probably be tuned in some form
to reduce over-allocation's effect, but I thought it wise to bring it up.


Finally, the unification of buffers and PMCs means that buffers can now
point to things of their own accord, without requiring that they be
surrounded by an accompanying PMC type. (This is a separate question from
the above discussion, as this problem occurs regardless of what we do
above.) This imposes additional work on the DOD, since instead of just
buffer_lives-ing a buffer, it must now stick it on the DOD list so that it
can be properly traced later. This then requires that each buffer contain
a next_for_GC pointer, so it can be added to the to-do list. Alternately,
we can use pool-specific memory to handle the various pointers that are
required for DOD, but the point remains that this further increases the
memory footprint of buffers, and I wanted to verify that i

Re: Unifying PMCs and Buffers for GC

2002-08-04 Thread Mike Lambert

Mike Lambert wrote:

> One idea, which is most closely in line with the current semantics, is to
> add a pool pointer to every header. I've found a few times in the past
> where such a pointer would have come in handy. This would allow us to call
> the pool's mark() function, to handle stuff like pointing-to-buffers, etc.

Oh, I meant to mention an alternative to the pool pointer, but forgot...

At one point, we had a mem_alloc_aligned, which guaranteed the start of a
block of memory given any pointer into the contents of the block. If we
store a pointer to the pool at the beginning of each set of headers, then
we avoid the need for a per-header pool pointer, at the cost of a bit
more math and an additional dereference to get at it.

The benefits to this are the drawbacks to the aforementioned approach, but
the drawbacks include:

- additional cpu, and/or cache misses in getting to the pool. for dod,
this might be very inefficient.

- it imposes additional memory requirements in order to align the block of
memory, and imposes a bit more in this 'header header' at the beginning of
the block of headers.
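
A minimal sketch of the aligned-block lookup, assuming a C11 library with aligned_alloc and a block size that is a power of two. BLOCK_SIZE, BlockPrefix, block_new and pool_of are illustrative names, not the old mem_alloc_aligned interface.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define BLOCK_SIZE 4096u   /* assumed; must be a power of two */

typedef struct Pool { const char *name; } Pool;

/* The 'header header' stored at the start of every aligned block. */
typedef struct BlockPrefix { Pool *pool; } BlockPrefix;

/* Allocate one block aligned to its own size and tag it with its pool. */
static void *block_new(Pool *pool)
{
    BlockPrefix *b = aligned_alloc(BLOCK_SIZE, BLOCK_SIZE);
    if (b != NULL)
        b->pool = pool;
    return b;
}

/* Recover the owning pool from any pointer into the block:
 * mask off the low bits to find the block start, then dereference. */
static Pool *pool_of(const void *p)
{
    uintptr_t start = (uintptr_t)p & ~(uintptr_t)(BLOCK_SIZE - 1);
    assert(start != 0);
    return ((const BlockPrefix *)start)->pool;
}
```

The mask plus the extra load in pool_of is the "bit more math and an additional dereference" traded against a pointer in every header.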

Mike Lambert




Re: [perl #15942] UNICOS/mk new unhappiness: hash.c

2002-08-04 Thread Leopold Toetsch

Sean O'Rourke wrote:

> On Sun, 4 Aug 2002, Mike Lambert wrote:
> 
>>Unfortunately, this causes different semantics for whether you are storing
>>primitives or pointers (primitives copy, whereas pointers are shallow). Of
>>course, one could argue that the previous one didn't work at all. :)
>>
>>Thoughts?
>>
> 
> Well, it's certainly wrong (though very efficiently so!) -- it needs to
> call string_copy for strings and vtable->clone for pmc's (attached).
> 
> Are hashes the only (non-packed) containers we'll have to worry about
> holding things other than PMC's, or will arrays need this same snippet?


[ patch ]
I would first hash_put the values, which copies primitives, and
afterwards copy strings or clone PMCs.

perlarrays currently store PMC only, which get cloned, so no problem.

Currently perlhash has KEY_ATOM values (perl6 only uses PMC) and 
perlarray has PMC values.

But what about (exe2):
my int @a is dim(1_000_000);
This should definitely be an array of natural ints, not perlints, aka
PMCs. Using KEY_ATOMs would be wasting space too.


> /s

leo




Re: PARROT QUESTIONS: The PDDs

2002-08-04 Thread Bryan C. Warnock

Sorry for the Wayback Machine...

On Mon, 2002-07-15 at 01:13, Ashley Winters wrote:
> I decided my next step should be to take a look at the PDDs so I know what's 
> going on. I would expect them to be like a writer's canon for a TV show. I'll 
> write my impressions as I go on.
> 
> PDD00:
> Does PDD still mean 'Perl Design Document', or should it mean 'Parrot ...'? 
> The documents seem to all refer to the interpreter.
> 
> While I'm thinking about it, where will 'Parrot' leave off and 'Perl6' begin? 
> At some point, it will be inappropriate to discuss the Parrot interpreter on 
> a Perl6 list, since Perl6 might have JVM/CLR backends, and Parrot might have 
> Python/Ruby frontends.

These are both questions I asked a long time ago, for which I received
no sufficient answers.  So PDD 0, at least, remained as it was.

From a coding/design perspective, there was at least a thread starting
at http:[EMAIL PROTECTED]/msg03748.html

As a matter of fact, looking at it more closely indicates that this was
actually annotated with CVS version 1.2 of PDD 0.  Original threads:
http:[EMAIL PROTECTED]/msg08677.html
http:[EMAIL PROTECTED]/msg08678.html

As you can see, I was dilemma'd about it, too. :-)

-- 
Bryan C. Warnock
bwarnock@(gtemail.net|raba.com)




Re: Unifying PMCs and Buffers for GC

2002-08-04 Thread Peter Gibbs

Mike Lambert wrote:

> I'm currently favoring allowing for header pools on a per-type basis, not
> just a per-size basis. This would give us a 'hash' pool. The pool
> structure would contain function pointers for collection and/or dod
> purposes. (stuff that would otherwise be in a PMC vtable.)
I am very much in agreement with this concept in principle. I would like you
to consider adding a name/tag/id field to all pool headers, containing a
short text description of the pool, for debugging purposes.
>
>
> One idea, which is most closely in line with the current semantics, is to
> add a pool pointer to every header. I've found a few times in the past
> where such a pointer would have come in handy. This would allow us to call
> the pool's mark() function, to handle stuff like pointing-to-buffers, etc.
This is something I have done in my personal version, for buffer headers
only at present (I have been mainly ignoring PMCs, as I believe they are
still immature). I use it for my latest version COW code, as well as to
allow buffer headers to be returned to the correct pool when they are
detected as free in code that is not resource-pool driven.

> b) it allows us to make new types of buffer-like headers on par with
> existing structures.
On this subject, I would like to see the string structure changed to include
a buffer header structure, rather than duplicating the fields. This would
mean a lot of changes (e.g. all s->bufstart to s->buffer.bufstart), but
would be safer and more consistent. Of course, strings may not even
warrant existence outside of a generic String pmc any more.

>
> a) no pmc type morphing. once in a pool, it stays in a pool. I don't see
> this as a big loss, since type morphing is error-prone to begin with, imo.
The main issue here would be the definition of pmc type, in an untyped
language. We may need a PerlScalar pmc type, as that is what most Perl
variables really are - if we stick to using pmc types based on current
content, then we need to be able to morph between the different
subclasses of  PerlScalar as the contents change.

>
> b) data members! Since not all pmcs are the same size, pmcs are able to
> store data elements in their structure. This allows us to make a SV-like
> PMC which stores str-value, int-value, float-value, etc. All without
Okay, you were obviously thinking the same way!

>
> Thoughts on all of the above? The main drawback that I see is that we can
> have a lot more pools. Currently, we don't take advantage of sized header
> pools, so making them per-type won't hurt us. However, by making different
> pools for different pmc types, an explosion in base pmc types could cause
> an explosion in pools and create wasteful memory usage as each pool stores
> 'extra' headers for allocation. This can probably be tuned in some form
> to reduce over-allocation's effect, but I thought it wise to bring it up.
One option would be to use a limited set of physical sizes (only multiples
of 16 bytes or something) and have free lists per physical size, rather than
per individual pool. This would waste some space in each header, but may
be more efficient overall.
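
A minimal sketch of such size classes, assuming a 16-byte granularity and a small fixed number of classes. SIZE_QUANTUM, NUM_CLASSES, round_up and size_class are illustrative names only.

```c
#include <assert.h>
#include <stddef.h>

#define SIZE_QUANTUM 16u      /* assumed granularity, a power of two */
#define NUM_CLASSES  8u       /* assumed: headers up to 128 bytes */

/* Round a header size up to the next multiple of SIZE_QUANTUM. */
static size_t round_up(size_t n)
{
    return (n + SIZE_QUANTUM - 1) & ~(size_t)(SIZE_QUANTUM - 1);
}

/* Index of the free list that serves headers of n bytes. */
static size_t size_class(size_t n)
{
    assert(n > 0 && n <= SIZE_QUANTUM * NUM_CLASSES);
    return round_up(n) / SIZE_QUANTUM - 1;
}
```

Two header types of 20 and 32 bytes land in the same class and can share one free list; the 12 bytes lost in the smaller one is the per-header waste mentioned above.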

>
> Finally, the unification of buffers and PMCs means that buffers can now
> point to things of their own accord, without requiring that they be
> surrounded by an accompanying PMC type.
How about the other way round? If the one-size-fits-all PMCs were to be
replaced by custom structures, then everything could be a PMC, and
buffer headers as a separate resource could just disappear!

--
Peter Gibbs
EmKel Systems




Re: Unifying PMCs and Buffers for GC

2002-08-04 Thread Peter Gibbs

Mike Lambert wrote:

> At one point, we had a mem_alloc_aligned, which guaranteed the start of a
> block of memory given any pointer into the contents of the block. If we
> store a pointer to the pool at the beginning of each set of headers, then
> we avoid the need for a per-header pool pointer, at the cost of a bit
> more math and an additional dereference to get at it.
>
> - it imposes additional memory requirements in order to align the block of
> memory, and imposes a bit more in this 'header header' at the beginning of
> the block of headers.
I considered this option also, but dismissed it as you need to allocate twice
the required size to get guaranteed alignment, so you are better off with the
pointer per header. To use this method without the memory overhead would
require implementing another allocator: if you want for example a 1K
aligned block, first allocate 16K, discard the amount before the alignment
point, and dish out the rest as 15 (or 16 if you're really lucky) 1K aligned
pages. I seriously considered this when I changed my buffer memory to be
paged instead of a single allocation per memory pool; but I haven't actually
implemented it yet.
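
A minimal sketch of that carving scheme, assuming 1K pages cut from a 16K malloc'd chunk. Chunk, chunk_init and chunk_page are hypothetical names, and free-list handling for the pages is left out.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define PAGE_SIZE   1024u              /* must be a power of two */
#define CHUNK_SIZE  (16u * PAGE_SIZE)

typedef struct {
    void  *raw;        /* what malloc returned; pass this to free() */
    char  *first;      /* first PAGE_SIZE-aligned address inside raw */
    size_t npages;     /* whole aligned pages available */
} Chunk;

/* Allocate one chunk and work out where the aligned pages start.
 * Returns 0 on success, -1 if malloc fails. */
static int chunk_init(Chunk *c)
{
    c->raw = malloc(CHUNK_SIZE);
    if (c->raw == NULL)
        return -1;
    uintptr_t base    = (uintptr_t)c->raw;
    uintptr_t aligned = (base + PAGE_SIZE - 1) & ~(uintptr_t)(PAGE_SIZE - 1);
    size_t    skipped = (size_t)(aligned - base);   /* discarded prefix */
    c->first  = (char *)c->raw + skipped;
    c->npages = (CHUNK_SIZE - skipped) / PAGE_SIZE; /* 15, or 16 if lucky */
    return 0;
}

/* Address of the i-th aligned page in the chunk. */
static void *chunk_page(const Chunk *c, size_t i)
{
    assert(i < c->npages);
    return c->first + i * PAGE_SIZE;
}
```

The wasted space is at most one page per chunk instead of one page per page, which is what makes the scheme cheaper than aligning each allocation separately.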

--
Peter Gibbs
EmKel Systems





Re: [perl #15574] [PATCH] RECALL renamed to AVOID

2002-08-04 Thread Tanton Gibbs

How Freudian can you get.  The subject on this email should have been RECALL
renamed to AGAIN.  It took me until now to realize this.

Sorry,
Tanton
- Original Message -
From: "Tanton Gibbs (via RT)" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Thursday, July 25, 2002 2:36 PM
Subject: [perl #15574] [PATCH] RECALL renamed to AVOID


> # New Ticket Created by  "Tanton Gibbs"
> # Please include the string:  [perl #15574]
> # in the subject line of all future correspondence about this issue.
> # http://rt.perl.org/rt2/Ticket/Display.html?id=15574 >
>
>
> This patch implements the AGAIN pmc preprocessor command.  AGAIN should be
> used after a PMC type change.  For example, the following function
> illustrates AGAIN:
>
> # in perlint.pmc
> void set_string( PMC* string ) {
>   CHANGE_TYPE( SELF, PerlString );
>   AGAIN;
> }
>
> In this situation, AGAIN calls set_string on SELF's
> vtable which now points to a perl string PMC type.  This
> ensures that the same semantics are kept no matter what PMC type is used and
> if any problems are found, they only have to be fixed in one place.
>
> I also added comments to pmc2c.pl to explain what
> the program was doing.  I added a sample grammar
> at the top, etc..
>
> Finally, I made some minor style modifications to perlint.pmc and
> perlnum.pmc.
>
> Thanks,
> Tanton
>
>
> -- attachment  1 --
> url: http://rt.perl.org/rt2/attach/31636/26319/21d14f/diff.out
>
>




[perl #15971] [BUG] pbc2c / OpTrans capturecc:p

2002-08-04 Thread via RT

# New Ticket Created by  Leopold Toetsch 
# Please include the string:  [perl #15971]
# in the subject line of all future correspondence about this issue. 
# http://rt.perl.org/rt2/Ticket/Display.html?id=15971 >


Hi,

when trying to compile a perl6 program to native c, pbc2c fails on the 
capturecc_p op.

$ perl6 -w -C mops.6
.
Use of uninitialized value at lib/Parrot/OpTrans/Compiled.pm line 94.
/home/lt/src/parrot-007/languages/perl6/mops.c: In function `main':
/home/lt/src/parrot-007/languages/perl6/mops.c:229: warning: passing arg 
3 of `PackFile_unpack' from incompatible pointer type
core.ops: In function `run_compiled':
core.ops:4292: label `PC_725' used but not defined

After applying this patch, the undefined label gets generated:
(I don't know whether this is correct.)

--- pbc2c.pl  Mon Apr 22 22:56:23 2002
+++ /home/lt/src/parrot-007/pbc2c.pl  Sun Aug  4 17:55:58 2002
@@ -134,6 +134,10 @@
 my $offset = $1;
 $is_branch = 1;
 }
+   # -lt capturecc
+   if ($src =~ /{{\^\+(.*?)}}/g) {
+   $is_branch = 1;
+   }
 # relative branch
 while($src =~ /{{(\-|\+)=(.*?)}}/g){
 my $dir = $1;
@@ -232,7 +236,7 @@
  }
  interpreter->code = pf;
  runops(interpreter, pf, 0);
-exit(1);
+exit(0);
  }

But the program still doesn't run, now it SIGSEGVs:
#0  0x808be70 in runops_fast_core (interpreter=0x814ffe8, pc=0x81661c8)
 at runops_cores.c:34
34  DO_OP(pc, interpreter);
(gdb) bac
#0  0x808be70 in runops_fast_core (interpreter=0x814ffe8, pc=0x81661c8)
 at runops_cores.c:34

I hope, someone can check this.

TIA,
leo






Re: Lexical variables, scratchpads, closures, ...

2002-08-04 Thread Jerome Vouillon

On Fri, Aug 02, 2002 at 09:06:31PM +0100, Nicholas Clark wrote:
> On Fri, Aug 02, 2002 at 06:43:49PM +0200, Jerome Vouillon wrote:
> > Allocating a hash will certainly remain a lot more expensive than
> > allocating an array.  And we are going to allocate pads pretty
> > often...
> 
> Are we? Or are we just going to copy them a lot at runtime?
> [but actually allocate the beasts only at compile time, or when someone
> mucks around with them using string eval or %MY]

When we enter a block we need to:
- create a new pad
- copy the current pad array and add the new pad to the copy
- allocate a PMC for each lexical variable introduced in the block and
  insert the PMCs in the new pad
The pad could be created either from scratch or by copying a template
pad.  Copying would not make any difference compared to allocating from
scratch for an array, but it must be faster for a hash.  Still, I
think using an array will remain faster.

> > I really hope %MY and caller.MY will not be able to extend the lexical
> > scope: we would get a really huge speed penalty for a feature which
> > would hardly ever be used.
> 
> I think we need to be aware of this, but I don't think it has to be that
> painful. If all scopes have a pad pointer, but lexical-free scopes start
> with it null, then there won't be a really huge speed penalty.
> [it might actually be faster, as this way all pads *are* equal, so there
> doesn't need to be special case code distinguishing between scopes with
> lexicals and scopes without.]

I think this makes a large difference for variable access.  Here is a
comparison.

Consider the following example:

  $x = 1;
  my $y = 1;
  sub foo () {
    bar ();
    print "$x $y\n";
  }

We consider the two possibilities:

* The lexical scopes cannot be extended

We can implement the lexical scopes with an array of arrays.  For this
example, we don't need to allocate a pad for the subroutine foo, as it
has no lexical variable.  So, in subroutine foo, the layout of the
lexical scopes is the following :

   outer
   array    pad
   +--+    +--+
   | -+--->| -+--> $y PMC
   +--+    +--+

So, accessing the $y variable requires 5 memory reads (5 assembly
instructions on an i386 machine):
- one read to get the address of the outer array from a parrot register
- two reads to get the address of the pad
- two reads to get the address of the PMC.

This could be further improved:
- the address of the outer array could probably be placed in a machine
  register, which would save one memory read
- we may be able to save one memory read per array look-up if we can
  avoid wrapping them in a PMC.
This would get us down to 2 memory reads.

On architectures which are not register starved (about anything but
the i386 architecture), it may be worthwhile to store some of the pad
addresses in machine register: this would save yet another memory
read.

I believe accessing a global variable such as $x can be made hardly
more costly: just one or two more memory reads.  In particular, we
can avoid a hash look-up.
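
A minimal sketch of the array-of-arrays layout, assuming the compiler has already resolved each lexical to a (depth, slot) pair. PMC, Pad, Scope and lexical_fetch are placeholder names, and the PMC wrapping of the arrays is left out.

```c
#include <assert.h>
#include <stddef.h>

typedef struct PMC { long value; } PMC;      /* placeholder for a real PMC */

typedef struct Pad   { PMC **slots; size_t nslots; } Pad;
typedef struct Scope { Pad **pads;  size_t npads;  } Scope;

/* Both indices are assumed to be fixed at compile time, so a variable
 * access is two bounds-checked array reads and no hashing. */
static PMC *lexical_fetch(const Scope *scope, size_t depth, size_t slot)
{
    assert(depth < scope->npads);
    const Pad *pad = scope->pads[depth];
    assert(slot < pad->nslots);
    return pad->slots[slot];
}
```

In the example, subroutine foo has no pad of its own, so $y would be fetched as depth 0, slot 0 of the outer array.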

* The lexical scopes can be extended

I think we need to implement pads as hashes, not arrays, in this case.
Also, we need a pad even for blocks that do not introduce any
lexical variable, because a variable may be inserted later.  So, for
the example, the layout of the lexical scopes looks like:

     Pads
   (hashes)
   +--+    +--+
   | -+--->|  |  (empty pad for subroutine foo)
   +--+    +--+
   | -+-
   +--+ \  +--+
         ->| -+--> $y PMC
           +--+

Now, if we want to access $y, we first need to perform a look-up in
the first pad, usually empty, but which may contain a lexical variable
$y if bar is defined as:
   sub bar () {
 caller.MY{$y} = 2;
   }
Then, if this fails, we perform a look-up in the outer scope.

Most of the time, lexical variables will be defined in one of the
first pads, so I think we can expect the cost of accessing a lexical
variable to be around one or two hash lookups.  This is already a lot
more costly than a few memory reads.

For global variables ($x and &print in this example), I think this is
worse: we must look into all the hashes, just in case they become
shadowed by a lexical variable, for instance if bar is defined as:
   sub bar () {
 caller.MY{$x} = 2;
   }
or as:
   sub bar () {
 caller.OUTER.MY{$x} = 2;
   }
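
A minimal sketch of name-based lookup through extensible pads, to contrast with fixed-slot access. A linked list stands in for each hash to keep the example short; DynPad, Entry, pad_insert and pad_lookup are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct Entry {
    const char   *name;
    long          value;
    struct Entry *next;
} Entry;

/* A pad that can grow at run time. A linked list stands in for the hash. */
typedef struct DynPad {
    Entry         *entries;
    struct DynPad *outer;     /* enclosing scope, NULL at the outermost */
} DynPad;

/* What caller.MY{...} = ... would do: add a variable to an existing pad. */
static void pad_insert(DynPad *pad, Entry *e)
{
    e->next = pad->entries;
    pad->entries = e;
}

/* Search innermost to outermost; every pad must be consulted because
 * any of them may have gained a shadowing entry since compile time. */
static Entry *pad_lookup(const DynPad *pad, const char *name)
{
    for (; pad != NULL; pad = pad->outer)
        for (Entry *e = pad->entries; e != NULL; e = e->next)
            if (strcmp(e->name, name) == 0)
                return e;
    return NULL;
}
```

A name that is in no pad falls through every level before the global table can be tried, which is the worst case described for $x and &print.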

-- Jerome



Re: ARM Jit v2

2002-08-04 Thread Daniel Grunblatt

On Sat, 3 Aug 2002, Nicholas Clark wrote:

> I wasn't actually expecting you to apply that :-)
> It was more a "where I am at now" informational patch.
Sorry :)

>
> I think that this patch is at good point to pause and take stock. I believe
> it JITs just about every integer op (including some i386 isn't JITting yet)
Great job!

> OK, it doesn't JIT the logical xor op, but that one scares me, and I'm unsure
> how useful it is.
>
> I've not done the floating point ops (or anything else) partly because I
> don't have a good reference for the format of the floating point instructions.
> [However, it's not that hard, as I have source code to both point emulators
> supplied with ARM Linux, so I can see the decode code :-)]
> But more because I'm not sure that it will give such a speedup.
>
> I feel I've demonstrated to myself that it will be possible to generate all
> forms of parrot ops without undue problems. However, I've hardly used any
> registers in my ops so far (at most 3, but I only actually needed 2) when
> there are up to 12 at my disposal. I've no real idea which will turn out to
> be the most useful in real parrot programs, and hence where the effort in
> JITting will get most reward, so I think it best to wait now and see what
> is needed most.
>
> My other thought is that with the current JIT architecture I'm loading
> everything from RAM at the start of the ops (1 or 2 instructions), and save it
> back at the end (1 instruction) with only 1 or 2 instructions needed to actually
> do the work. With 10 registers spare, and 60% of my instructions shifting data
> around like some job creation scheme for out of work electrons, I think that
> it might be best to wait and see what we (*) learn from JITs on other
> platforms, then use that to design a third generation JIT that is capable of
> mapping parrot registers onto hardware CPU registers.
>
> * Er, "we" is probably just Daniel as I confess I don't feel motivated to
> attempt to learn other assembly language to write JITs for hardware I don't
> own. Hey, all you Mac fans, where's the PPC JIT? 

I'm working on it, as I'm working on the register allocator too.

Daniel Grunblatt.




Re: Lexical variables, scratchpads, closures, ...

2002-08-04 Thread Jerome Vouillon

On Sat, Aug 03, 2002 at 01:42:06AM -0400, Melvin Smith wrote:
> Here is an attempt. I'm assuming the stack of pads is a singly-linked list
> of pads with children pointing to parents, and multiple children can refer
> to the same parent. If they are created on the fly, the creation of
> a Sub or Closure will simply hold a pointer to its parent, and pads will
> be garbage collected.

This should work.  I think you still need a "real" stack of pads.  The
opcode "new_pad" creates a new pad whose parent is the top of the
stack, and replaces this parent by the new pad in the stack. The opcode
"invoke" pushes the subroutine's pad onto the stack.  The opcode "ret" pops
the top of the stack.
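
A minimal sketch of those three opcodes, assuming a fixed-size stack and pads that only record their parent. PadStack, op_new_pad, op_invoke and op_ret are illustrative names; restoring the parent pad on block exit and garbage collection of pads are not shown.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_DEPTH 64          /* assumed call-depth limit for the sketch */

typedef struct Pad { struct Pad *parent; } Pad;

typedef struct PadStack {
    Pad   *items[MAX_DEPTH];
    size_t depth;
} PadStack;

/* new_pad: link `fresh` to the current top, then replace the top. */
static void op_new_pad(PadStack *s, Pad *fresh)
{
    assert(s->depth > 0);
    fresh->parent = s->items[s->depth - 1];
    s->items[s->depth - 1] = fresh;
}

/* invoke: push the pad the subroutine was created with. */
static void op_invoke(PadStack *s, Pad *sub_pad)
{
    assert(s->depth < MAX_DEPTH);
    s->items[s->depth++] = sub_pad;
}

/* ret: drop the callee's pad, restoring the caller's. */
static void op_ret(PadStack *s)
{
    assert(s->depth > 0);
    s->depth--;
}
```

The parent pointers form the singly-linked list of scopes, while the stack only tracks which pad is current in each active call.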

-- Jerome



Re: Unifying PMCs and Buffers for GC

2002-08-04 Thread Mike Lambert

Peter Gibbs wrote:

> I am very much in agreement with this concept in principle. I would like you
> to consider adding a name/tag/id field to all pool headers, containing a
> short text description of the pool, for debugging purposes.

I don't have a problem with that. And yes, it'd definitely help debugging
(as opposed to printing out the various pool addresses and comparing them ;)

> > One idea, which is most closely in line with the current semantics, is to
> > add a pool pointer to every header. I've found a few times in the past
> > where such a pointer would have come in handy. This would allow us to call
> > the pool's mark() function, to handle stuff like pointing-to-buffers, etc.
> This is something I have done in my personal version, for buffer headers
> only at present (I have been mainly ignoring PMCs, as I believe they are
> still immature). I use it for my latest version COW code, as well as to
> allow buffer headers to be returned to the correct pool when they are
> detected as free in code that is not resource-pool driven.

Re: DOD immaturity: Yeah, I agree to some extent. It's somewhat difficult
to test DOD efficiency because every string is directly traceable from the
root, thus avoiding mark_used for the most part. Perhaps some GC-PMC
benchmarks are needed to weed out remaining issues.

Re: COW code. Ooohh! You've kept it up to date with the current code? I was
working on applying your old patch (ticket 607 at
http://bugs6.perl.org/rt2/Ticket/Display.html?id=607), but if you've got
COW code in the current build, that's even better.

One question: does your current code utilize bufstart as the beginning of
the buffer, or the beginning of the string?

> > b) it allows us to make new types of buffer-like headers on par with
> > existing structures.
> On this subject, I would like to see the string structure changed to include
> a buffer header structure, rather than duplicating the fields. This would
> mean a lot of changes (e.g. all s->bufstart to s->buffer.bufstart), but
> would be safer and more consistent. Of course, strings may not even
> warrant existence outside of a generic String pmc any more.

Again, I agree. If the COW code forces all the string usage to use
strstart and strlen, then bufstart and buflen essentially are used a *lot*
less. This should make the mental transition easier.

> One option would be to use a limited set of physical sizes (only multiples
> of 16 bytes or something) and have free lists per physical size, rather than
> per individual pool. This would waste some space in each header, but may
> be more efficient overall.

I suppose this allows us to mix and match entries of different types in
same pools, since each header would have a pointer to its own pool,
regardless of its neighbors. However, the number 16 could be tuned to 4 or
1 to achieve slightly better mem usage. (Or even POINTER_ALIGNMENT).

> > Finally, the unification of buffers and PMCs means that buffers can now
> > point to things of their own accord, without requiring that they be
> > surrounded by an accompanying PMC type.
> How about the other way round? If the one-size-fits-all PMCs were to be
> replaced by custom structures, then everything could be a PMC, and
> buffer headers as a separate resource could just disappear!

I think you misunderstood me here. I agree that making the buffer headers
a distinct resource is unnecessary. However, this does mean that all
headers need to be traced now. For pure strings, this can hurt
performance, although one can argue that it helps performance in the
general case of the PMC containing buffer data (a couple less
indirections needed on usage).

We could make a new header flag, BUFFER_has_pointers_FLAG, which specifies
that this buffer contains pointers to other data structures, and should be
traced. If this is unset, the buffer doesn't get added onto the free list.

Since adding it to the free list requires adjusting next_for_GC, it's
already going to reference memory there. Checking the flag would merely
prevent traversing the memory again in the 'process' portion.
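
A minimal sketch of the flag check, assuming a per-buffer flags word and a singly-linked list threaded through next_for_GC. BUFFER_has_pointers_FLAG and next_for_GC are taken from the text above; the live flag name, TraceList and this buffer_lives body are assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>

enum {
    BUFFER_live_FLAG         = 1 << 0,   /* assumed name for the mark bit */
    BUFFER_has_pointers_FLAG = 1 << 1
};

typedef struct Buffer {
    unsigned       flags;
    struct Buffer *next_for_GC;   /* link in the to-be-traced list */
} Buffer;

typedef struct TraceList { Buffer *head; size_t length; } TraceList;

/* Mark a buffer live. Only buffers that can contain pointers are queued
 * for the later tracing pass; plain data buffers stop here. */
static void buffer_lives(TraceList *todo, Buffer *b)
{
    assert(todo != NULL && b != NULL);
    if (b->flags & BUFFER_live_FLAG)
        return;                       /* already seen */
    b->flags |= BUFFER_live_FLAG;
    if (b->flags & BUFFER_has_pointers_FLAG) {
        b->next_for_GC = todo->head;
        todo->head = b;
        todo->length++;
    }
}
```

A pure string is marked with one flag write and never touched again, so the extra cost of unification falls only on buffers that opt in.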

Thanks for the quick reply,
Mike Lambert




Re: PARROT QUESTIONS: The PDDs

2002-08-04 Thread Dan Sugalski

At 8:56 AM -0400 8/3/02, Bryan C. Warnock wrote:
>Sorry for the Wayback Machine...
>
>On Mon, 2002-07-15 at 01:13, Ashley Winters wrote:
>>  I decided my next step should be to take a look at the PDDs so I know what's
>>  going on. I would expect them to be like a writer's canon for a TV show. I'll
>>  write my impressions as I go on.
>>
>>  PDD00:
>>  Does PDD still mean 'Perl Design Document', or should it mean 'Parrot ...'?
>>  The documents seem to all refer to the interpreter.
>>
>>  While I'm thinking about it, where will 'Parrot' leave off and 'Perl6' begin?
>>  At some point, it will be inappropriate to discuss the Parrot interpreter on
>>  a Perl6 list, since Perl6 might have JVM/CLR backends, and Parrot might have
>>  Python/Ruby frontends.
>
>These are both questions I asked a long time ago, for which I received
>no sufficient answers.  So PDD 0, at least, remained as it was.

D'oh! I need to catch up with my mail.

Feel free to throw patches to the list about it, though.

> From a coding/design perspective, there was at least a thread starting
>at http:[EMAIL PROTECTED]/msg03748.html
>
>As a matter of fact, looking at it more closely indicates that this was
>actually annotated with CVS version 1.2 of PDD 0.  Original threads:
>http:[EMAIL PROTECTED]/msg08677.html
>http:[EMAIL PROTECTED]/msg08678.html
>
>As you can see, I was dilemma'd about it, too. :-)

Oh, the irony! :)
-- 
 Dan

--"it's like this"---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
   teddy bears get drunk



Re: resize_array (PerlArray)

2002-08-04 Thread Benjamin Stuhl

At 04:28 PM 8/2/2002 +0200, Haegl wrote:
>On 2002/08/02 16:11:26 Nicholas Clark <[EMAIL PROTECTED]> wrote:
>
> >It does on reading. I forget the eloquent explanation about the how or
> >why, but all references bar the leftmost are vivified. (Even inside
> >defined). In effect, all bar the last reference are in lvalue context -
> >only the rightmost is rvalue.
>
>The explanation is the part that would have been the most interesting...
>Everyone: Is this just some unwanted but unavoidable behaviour that
>results from the lvalue/rvalue processing or is there a real reason
>behind it? (I don't see one, that's why I ask :-)

IIRC, it's basically because Perl5 doesn't have multidimensional keys,
so $a[0]{"foo"}[24] becomes (in pseudo-perl5-assembler, assume each
op pops its args and pushes its results)

push $a
array_fetch 0
hash_fetch "foo"
array_fetch 24

So, in order that it not segfault in the middle by trying to do a fetch out of
a NULL hash or array, the first two fetches are called in lvalue context (or
rather, "this element must exist - create it if you have to" context). With
multidimensional keys, Perl6 can avoid this trap, but we really still need an
lvalue fetch - one reason being references. We need to support

$a = \@b[2];
$$a = 4;

which only does a fetch on @b[2], but @b[2] had better autovivify, or the
deref may segfault.
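
A minimal sketch of the two fetch contexts, assuming a simple growable array of longs. IntArray, array_fetch and array_fetch_lvalue are hypothetical names, not PerlArray vtable entries.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct IntArray { long *items; size_t len; } IntArray;

/* rvalue fetch: never creates anything; NULL means "no such element". */
static const long *array_fetch(const IntArray *a, size_t i)
{
    assert(a != NULL);
    return i < a->len ? &a->items[i] : NULL;
}

/* lvalue fetch: "this element must exist - create it if you have to".
 * Grows the array so that the returned pointer is valid to write through.
 * Returns NULL only if memory runs out. */
static long *array_fetch_lvalue(IntArray *a, size_t i)
{
    assert(a != NULL);
    if (i >= a->len) {
        long *grown = realloc(a->items, (i + 1) * sizeof *grown);
        if (grown == NULL)
            return NULL;
        for (size_t k = a->len; k <= i; k++)
            grown[k] = 0;             /* new elements start out empty */
        a->items = grown;
        a->len = i + 1;
    }
    return &a->items[i];
}
```

Taking a reference corresponds to calling array_fetch_lvalue and keeping the pointer. In this sketch a later growth may move the storage, so a real implementation would need stable element addresses.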

-- BKS