Re: C89

2004-10-29 Thread Leopold Toetsch
Bill Coffman [EMAIL PROTECTED] wrote:
 Thanks for the info...

 Apparently,

gcc -ansi -pedantic

 is supposed to be ANSI C '89.

Not really. It's pedantic ;)

 Incidentally, I tried adding -ansi and -pedantic and I got lots of
 warnings, like long long not supported by ANSI C'89, etc. (how can
 you do 64 bit ints then?).

A C compiler on a 64-bit machine uses long.
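
As a small illustration of the point about long (a sketch, not Parrot code): C89 has no long long, but limits.h lets a program find out how wide unsigned long actually is on the platform it is compiled for. The helper names below are made up for this example. On some 64-bit platforms long is still 32 bits, so checking is safer than assuming.

```c
#include <limits.h>

/* Count the value bits of unsigned long by shifting ULONG_MAX down to
 * zero. Uses only C89 features. */
static int ulong_value_bits(void)
{
    unsigned long v = ULONG_MAX;
    int bits = 0;

    while (v != 0) {
        bits++;
        v >>= 1;
    }
    return bits;
}

/* Non-zero if unsigned long can hold a 64-bit value. The preprocessor
 * comparison is done at compile time in the widest unsigned type. */
static int long_is_at_least_64_bits(void)
{
#if ULONG_MAX > 0xFFFFFFFFUL
    return ulong_value_bits() >= 64;
#else
    return 0;
#endif
}
```

A configure-time probe could call long_is_at_least_64_bits() and fall back to a compiler extension only when it returns 0.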

 ... I also got errors that caused outright
 failure.  Perhaps it's best to forget the whole C'89 thing.

Not the C'89 thing, but the -ansi thing of gcc.

 -Bill

leo


Re: register allocation questions

2004-10-29 Thread Leopold Toetsch
Bill Coffman [EMAIL PROTECTED] wrote:

 Currently, here's how the register allocator is doing.

 Failed Test         Stat Wstat Total Fail  Failed  List of Failed
 -------------------------------------------------------------------
 t/library/dumper.t     5  1280    13    5  38.46%  1-2 5 8 13
 4 tests and 51 subtests skipped.
 Failed 1/123 test scripts, 99.19% okay. 5/1956 subtests failed, 99.74% okay.

 I recall Leo, or someone, saying that the data dumper routines are not
 following the calling convention properly.

I didn't look too closely, but it's probably only the entry points:

  .sub _dumper
  _global_dumper()

The sub is missing C<.param> statements, so it declares no parameters.

 I've learned a lot about how the compiler works at this point, and I'd
 like to contribute more :)

Great. Thanks.

 Would you like a patch?  Should I fix the data dumper routines first?

Definitely - dumper.t tests are currently disabled, don't worry.

 What is all this talk about deferred registers?  What should I do
 next?

"Deferred registers" doesn't ring a bell. What do you mean by
that? - Send the patch with an explanation of the algorithm.

 Yes, I think we are kind of doing this.  It's best to pass the
 registers straight through though.  Like when a variable will be used
 as a parameter, give it the appropriate reg num.  Sort of outside the
 immediate scope of register coloring, but as I've learned, one must go
 a little beyond, to see the input and output for each sub.

Well, it's not really outside of register coloring. It's part of
parrot's calling conventions. You can think of it as part of the Parrot
machine ABI. When you write a compiler for darwin-PPC, you have to pass
function arguments in r3, r4, ... and you get a return value in r3. If
you don't do that, you'll not be able to make any C library call.

In Parrot we have similar calling conventions and the register allocator
must be aware of that. E.g. when you have:

some_function()   # (i, j) = some_function()
$I0 = I5
$I1 = I6

you know that I5 and I6 are return results. The live range or the
previous usage of I5 and I6 is cut by the function call.

Using the return values directly is of course an optimization and not
strictly necessary; nevertheless, the allocator has to be aware that the
function call invalidates the previous contents of I5 and I6.
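
A minimal sketch of that bookkeeping, assuming (as in the I5/I6 example) that integer results come back in consecutive registers starting at I5. The bitmask representation and the function names are invented for illustration and are not imcc's data structures.

```c
/* Assumption taken from the example above: results start at I5. */
enum { FIRST_RESULT_REG = 5 };

/* Bitmask of the registers that a call returning n_results values
 * overwrites. Bit r stands for register Ir. */
static unsigned long call_result_mask(int n_results)
{
    unsigned long mask = 0;
    int i;

    for (i = 0; i < n_results; i++)
        mask |= 1UL << (FIRST_RESULT_REG + i);
    return mask;
}

/* Given the registers holding live values before the call, return the
 * ones whose old value is still valid afterwards. The live range of
 * anything in a result register is cut at the call. */
static unsigned long values_surviving_call(unsigned long live_before,
                                           int n_results)
{
    return live_before & ~call_result_mask(n_results);
}
```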

 But the idea is to have each sub declare how many registers to
 save/restore.

Don't worry about save/restore. That's already changed. imcc doesn't
emit any savetop/restoretop or similar opcodes any more. Registers are
preserved now by allocating a new register frame for the subroutine.

 We can also minimize this number to match the physical architecture
 that parrot is running on (for an arch specific optimization).

Yes. I did that some time ago in imcc/jit.c, which produced register
mapping for the underlying hardware CPU. Parrot registers 0.. n-1 were
given negative numbers and src/jit.c used these directly as mapping for
CPU registers. This vastly reduced JIT startup time.

 Yes, yes, renaming!  I want to do register renaming!

Go for it please.

 p31 holds all the spill stuff.  It's a pain.  Maybe I'll move that
 around, but if p31 is used, it means that there is no more room for
 symbols, in at least one of the reg sets.

I'd say that with register renaming, spilling will be very rare. But
there is of course no need to use P31 for it. If we really have to spill
we can optimize that a bit.

 - Bill Coffman

leo


[PATCH] PPC JIT failure for t/pmc/threads_8.pasm

2004-10-29 Thread Jeff Clites
I was getting a failure under JIT on PPC for t/pmc/threads_8.pasm, and 
the problem turned out to be that emitting a restart op takes 26 
instructions, or 104 bytes, and we were hitting the grow-the-arena 
logic just shy of what would have triggered a resize, then running off 
the end.

The below patch fixes this; really that magic number (200, now) needs 
to be bigger than the amount of space we'd ever need to emit the JIT 
code for a single op (plus saving registers and such), but with the 
possibility of dynamically loadable op libs (with JIT?), it's hard to 
say what number is guaranteed to be large enough. Or, we can pick a 
reasonable, largish number that works for the built-in ops (empirically 
determined, as now), and document that loadable JITted ops which could 
take more than this, need to make sure to grow the arena as necessary. 
(And we could provide a utility function to make this easy.)
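
Such a utility function might look roughly like the sketch below. The struct and function names are hypothetical, and a real JIT arena would have to come from executable memory rather than realloc; the point is only the "make sure N bytes are free before emitting" contract.

```c
#include <stdlib.h>

/* Hypothetical stand-in for the JIT arena bookkeeping. */
typedef struct {
    unsigned char *start;  /* beginning of the code buffer */
    size_t size;           /* bytes allocated */
    size_t used;           /* bytes already emitted (used <= size) */
} arena_sketch;

/* Ensure at least `need` free bytes remain, growing by doubling.
 * Returns 0 on success, -1 if the allocation fails. */
static int arena_reserve(arena_sketch *arena, size_t need)
{
    size_t new_size;
    unsigned char *p;

    if (arena->size - arena->used >= need)
        return 0;

    new_size = arena->size ? arena->size : 64;
    while (new_size - arena->used < need)
        new_size *= 2;

    p = (unsigned char *)realloc(arena->start, new_size);
    if (p == NULL)
        return -1;
    arena->start = p;
    arena->size = new_size;
    return 0;
}
```

An op that knows it emits 104 bytes would call arena_reserve(&arena, 104) first, instead of relying on a global safety margin.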

JEff
Index: src/jit.c
===
RCS file: /cvs/public/parrot/src/jit.c,v
retrieving revision 1.95
diff -u -b -r1.95 jit.c
--- src/jit.c   25 Oct 2004 10:24:14 -  1.95
+++ src/jit.c   29 Oct 2004 07:50:09 -
@@ -1395,7 +1395,7 @@
         while (cur_op <= cur_section->end) {
             /* Grow the arena early */
             if (jit_info->arena.size <
-                    (jit_info->arena.op_map[jit_info->op_i].offset + 100)) {
+                    (jit_info->arena.op_map[jit_info->op_i].offset + 200)) {
 #if REQUIRES_CONSTANT_POOL
 Parrot_jit_extend_arena(jit_info);
 #else



AIX PPC JIT warning

2004-10-29 Thread Jeff Clites
Recently config/gen/platform/darwin/asm.s was added, containing 
Parrot_ppc_jit_restore_nonvolatile_registers(). Corresponding code also 
needs to be added to config/gen/platform/aix/asm.s -- Parrot should 
fail to link on AIX currently, without this. I didn't try to update the 
AIX asm.s myself, since I wasn't confident that I could do this 
correctly without having a way to test.

So, someone with AIX asm expertise, please take a look.
Thanks,
JEff


Re: register allocation questions

2004-10-29 Thread Leopold Toetsch
Leopold Toetsch [EMAIL PROTECTED] wrote:
 Bill Coffman [EMAIL PROTECTED] wrote:

 t/library/dumper.t     5  1280    13    5  38.46%  1-2 5 8 13

 I didn't look too close, but it's probably only the entry points:

   .sub _dumper
   _global_dumper()

Fixed.

leo


Re: [PATCH] PPC JIT failure for t/pmc/threads_8.pasm

2004-10-29 Thread Leopold Toetsch
Jeff Clites [EMAIL PROTECTED] wrote:
 I was getting a failure under JIT on PPC for t/pmc/threads_8.pasm, and
 the problem turned out to be that emitting a restart op takes 26
 instructions, or 104 bytes, and we were hitting the grow-the-arena
 logic just shy of what would have triggered a resize, then running off
 the end.

Duh. It's probably time to generate a JITted restart function.

 The below patch fixes this; really that magic number (200, now) needs
 to be bigger than the amount of space we'd ever need to emit the JIT
 code for a single op (plus saving registers and such), but with the
 possibility of dynamically loadable op libs (with JIT?),

Not easily. Or rather, it would not be too hard: given a fairly complete
implementation of core.jit, we have a function table (per platform) that
has slots for every processor instruction (or almost: Parrot doesn't have
vector opcodes yet). So a new opcode could be written in terms of existing
opcodes, which would easily allow the generation of JITted variants.

If a new opcode is too complex, it could be written (partly) in C, which
would need just one new JIT opcode call_c_func. And that is exactly what
Parrot_jit_build_call_func on i386 is already doing.

 ... it's hard to
 say what number is guaranteed to be large enough.

If we ever have loadable JIT code, we will have an interface to set that
magic number.

Thanks, applied.

 JEff

leo


Re: pmc_type

2004-10-29 Thread Leopold Toetsch
Paolo Molaro [EMAIL PROTECTED] wrote:
 On 10/27/04 Luke Palmer wrote:

 Ugh, yeah, but what does that buy you?  In dynamic languages pure
 derivational typechecking is very close to useless.

 Actually, if I were to write a perl runtime for parrot, mono or
 even the JVM I'd experiment with the same pattern.

For the latter two yes, but as Luke has outlined that doesn't really
help for languages where methods are changing under the hood.

 You would assign small integer IDs to the names of the methods
 and build a vtable indexed by the id.

Well, we already got a nice method cache, which makes lookup a
vtable-like operation, i.e. an array lookup. But that's runtime only
(and it needs invalidation still). So actually just the first method
lookup is a hash operation.

 ... There are a number of optimizations that
 can be done to reduce the vtable size, but I'm not sure this would
 matter in parrot as long as bytecode values are as big as C ints:-)

That ought to come ;) Cachegrind shows no problem with opcode fetch and
you know, when it's compiled to JIT bytecode size doesn't matter anyway.
We just avoid the opcode and operand decoding.

 lupus

leo


[perl #32208] [PATCH] Register allocation patch - scales better to more symbols

2004-10-29 Thread via RT
# New Ticket Created by  Bill Coffman 
# Please include the string:  [perl #32208]
# in the subject line of all future correspondence about this issue. 
# URL: http://rt.perl.org:80/rt3/Ticket/Display.html?id=32208 


Patch does the following:

- Applied Matula/Chaitin/Briggs algorithm for register allocation.
- Color the graph all at once, and spill all symbols with high colors.
 Spill all at once to speed things up.
- Remove several of the functions, which are incorporated into the new
algorithm.

- Shortcoming: doesn't use score anymore, but the algorithm is smart
enough that I hope it's okay to do that.
- Failed 2 tests for latest CVS.  (See earlier posting.)
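
For readers unfamiliar with the approach, here is a self-contained sketch of smallest-last ordering followed by greedy coloring, which is the general idea behind the algorithm named above. It is not the code from the patch: the fixed MAX_NODES limit, the byte-per-edge adjacency matrix, and all names are simplifications chosen for the example.

```c
#define MAX_NODES 64  /* arbitrary limit for this sketch */

/* adj is an n x n matrix; adj[i*n + j] != 0 means i and j interfere.
 * On return color[i] holds a color in 0..k-1 and k is returned
 * (-1 if n exceeds MAX_NODES). */
static int smallest_last_color(int n, const unsigned char *adj, int *color)
{
    int deg[MAX_NODES], removed[MAX_NODES], order[MAX_NODES];
    int i, j, step, ncolors = 0;

    if (n > MAX_NODES)
        return -1;

    for (i = 0; i < n; i++) {
        removed[i] = 0;
        color[i] = -1;
        deg[i] = 0;
        for (j = 0; j < n; j++)
            if (j != i && adj[i * n + j])
                deg[i]++;
    }

    /* Repeatedly remove a node of minimum remaining degree. */
    for (step = 0; step < n; step++) {
        int best = -1;
        for (i = 0; i < n; i++)
            if (!removed[i] && (best < 0 || deg[i] < deg[best]))
                best = i;
        removed[best] = 1;
        order[step] = best;
        for (j = 0; j < n; j++)
            if (j != best && !removed[j] && adj[best * n + j])
                deg[j]--;
    }

    /* Color in reverse removal order with the lowest free color. */
    for (step = n - 1; step >= 0; step--) {
        int v = order[step], c;
        for (c = 0; ; c++) {
            int clash = 0;
            for (j = 0; j < n; j++)
                if (j != v && adj[v * n + j] && color[j] == c) {
                    clash = 1;
                    break;
                }
            if (!clash)
                break;
        }
        color[v] = c;
        if (c + 1 > ncolors)
            ncolors = c + 1;
    }
    return ncolors;
}

/* Nodes colored k or higher do not fit into k registers, so they are
 * the candidates to spill in one go. */
static int count_spill_candidates(int n, const int *color, int k)
{
    int i, spills = 0;
    for (i = 0; i < n; i++)
        if (color[i] >= k)
            spills++;
    return spills;
}
```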

WANT TO DO:
- Apparently, there's a memory leak which prevents coloring
graphs with more than a few hundred registers.  I suspect this is in
the spill, or update_life routine.  Not sure if it's mine or
pre-existing.
- Interference graph is using 8 times the memory it needs to use. 
This is still trivial compared to lost data in above bug.
- Smarten up algorithm to use score again.  A good way to do so is
commented in the code.
- Create spilling score, that prints out with a debug option.  This
can be a metric to compare various algorithms.
- Improve spill to spill all registers at once, adding speed.
- Introduce proper analysis of flow graph, to create less conservative
interference graph.
- Color each of the four register types separately.  Be sure to
compare gains with losses for this, as it is not entirely clear.
- Introduce register renaming.  When variable is reassigned, it might
as well be considered a new symbol... well, much of the time, anyway.
- Introduce variable register size, in coordination with subroutine
calls, to reduce copy cost.  Coordinate with Dan and Leo on this.
- Improve flow-graph, basic block calculation, etc.  Make it all a
little easier to understand, and more efficient.  Knuth style,
literate programming.  Well, just good comments, and a couple of
decent pods.
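
On the memory item above: one bit per (i, j) pair is the usual lower bound for a dense interference matrix. The sketch below shows that general technique; the ig_sketch_* names are invented here and are not the patch's ig_* functions, and it makes no claim about where the factor of 8 in the patch comes from.

```c
#include <limits.h>
#include <stdlib.h>

#define IG_WORD_BITS (sizeof(unsigned int) * CHAR_BIT)

/* Number of unsigned int words needed for an n x n bit matrix. */
static size_t ig_sketch_words(size_t n)
{
    return (n * n + IG_WORD_BITS - 1) / IG_WORD_BITS;
}

/* Allocate a zeroed bit matrix; the caller frees it with free(). */
static unsigned int *ig_sketch_new(size_t n)
{
    size_t words = ig_sketch_words(n);
    return (unsigned int *)calloc(words ? words : 1, sizeof(unsigned int));
}

static void ig_sketch_set_bit(unsigned int *g, size_t bit)
{
    g[bit / IG_WORD_BITS] |= 1u << (bit % IG_WORD_BITS);
}

/* Interference is symmetric, so record both (i, j) and (j, i). */
static void ig_sketch_add_edge(unsigned int *g, size_t n, size_t i, size_t j)
{
    ig_sketch_set_bit(g, i * n + j);
    ig_sketch_set_bit(g, j * n + i);
}

static int ig_sketch_has_edge(const unsigned int *g, size_t n,
                              size_t i, size_t j)
{
    size_t bit = i * n + j;
    return (int)((g[bit / IG_WORD_BITS] >> (bit % IG_WORD_BITS)) & 1u);
}
```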
Index: imcc/reg_alloc.c
===
RCS file: /cvs/public/parrot/imcc/reg_alloc.c,v
retrieving revision 1.22
diff -u -r1.22 reg_alloc.c
--- imcc/reg_alloc.c	30 Sep 2004 16:00:37 -	1.22
+++ imcc/reg_alloc.c	29 Oct 2004 04:39:07 -
@@ -41,15 +41,63 @@
 static void compute_du_chain(IMC_Unit * unit);
 static void compute_one_du_chain(SymReg * r, IMC_Unit * unit);
 static int interferes(IMC_Unit *, SymReg * r0, SymReg * r1);
-static int map_colors(IMC_Unit *, int x, unsigned int * graph, int colors[], int typ);
-#ifdef DO_SIMPLIFY
-static int simplify (IMC_Unit *);
-#endif
 static void compute_spilling_costs (Parrot_Interp, IMC_Unit *);
-static void order_spilling (IMC_Unit *);
 static void spill (Interp *, IMC_Unit * unit, int);
-static int try_allocate(Parrot_Interp, IMC_Unit *);
-static void restore_interference_graph(IMC_Unit *);
+
+/* New graph algorithm stuff */
+static void ig_color_graph(void);
+static void apply_coloring(IMC_Unit *);
+static void ig_precolor(IMC_Unit *);
+static int ig_init_graph(int num_nodes, unsigned* edge_bits);
+static void ig_clear_graph(void);
+static int spill_registers(Parrot_Interp interpreter, IMC_Unit * unit);
+
+typedef struct {
+int deg; /* degree of node (# neighbors) */
+int col; /* color assigned to this node */
+int rank;/* position within the below D array */
+char in; /* boolean, indicating if removed yet */
+} node;
+
+typedef struct {
+int n;   /* number of nodes */
+node* V; /* array of nodes */
+int* D;  /* sorted nodes by degree */
+unsigned* E; /* edge data, adjacency matrix */
+int k;   /* maximum color used in graph (0 means uncolored) */
+} graph;
+
+graph G; /* must have as global to use qsort, 
+but there's only one at a time -- FIXME minimize this global */
+
+#define Dbg_level 0  /* FIXME -- must be a better way to implement this */
+#include <stdarg.h>
+static void my_message(const char *pat, ...)
+{
+    va_list args;
+    va_start(args, pat);
+#if Dbg_level >= 1
+    vfprintf(stderr,pat,args);
+#endif
+    va_end(args);
+}
+static void my_message2(const char *pat, ...)
+{
+    va_list args;
+    va_start(args, pat);
+#if Dbg_level >= 2
+    vfprintf(stderr,pat,args);
+#endif
+    va_end(args);
+}
+/*#define Dbg printf*/
+#define Dbg my_message
+#define Dbg2 my_message2
+
+
+/**/
+
+
 #if 0
 static int neighbours(int node);
 #endif
@@ -57,7 +105,7 @@
 extern int pasm_file;
 /* XXX FIXME: Globals: */
 
-static IMCStack nodeStack;
+static IMCStack nodeStack;  /* FIXME -- this is used in a silly way */
 
 static unsigned int* ig_get_word(int i, int j, int N, unsigned int* graph,
  int* bit_ofs)
@@ -74,12 +122,14 @@
     *word |= (1 << bit_ofs);
 }
 
+/* currently unused.
 static void ig_clear(int i, int j, int N, unsigned int* 

Re: [perl #32196] Yet Another GC Crash (YAGC)

2004-10-29 Thread Leopold Toetsch
Matt Diephouse [EMAIL PROTECTED] wrote:
 #0  0x0003d420 in pobject_lives (interpreter=0xd00140, obj=0x0) at
 src/dod.c:198
 #1  0x48f0 in mark_1_seg (interpreter=0xd00140, cs=0xd01fd0) at
 src/packfile.c:360

Ah. The code assumed that there is always a valid subroutine name, which
might not be true for dynamically created subs, like in forth.

I've put a check for a NULL name in front.
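
The shape of such a guard, as a sketch with made-up types (the real code in src/packfile.c and src/dod.c works on Parrot's own structures):

```c
#include <stddef.h>

/* Hypothetical stand-ins for a GC-managed object and a Sub. */
typedef struct { int live; } obj_sketch;
typedef struct {
    obj_sketch *name;  /* may be NULL for dynamically created subs */
} sub_sketch;

static void mark_live(obj_sketch *obj)
{
    obj->live = 1;
}

/* Mark the sub's name only if it has one. Returns 1 if something was
 * marked, 0 if the name was NULL and the mark was skipped. */
static int mark_sub_name(sub_sketch *sub)
{
    if (sub->name == NULL)
        return 0;
    mark_live(sub->name);
    return 1;
}
```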

Thanks,
leo


Re: JIT and platforms warning

2004-10-29 Thread Leopold Toetsch
Leopold Toetsch [EMAIL PROTECTED] wrote:

 arm, mips, and sun4 JIT platforms definitely need some work to even keep
 up with the current state of the JIT interface.

That's actually wrong - sorry. The Sun4 JIT is fairly complete and up to
date and shouldn't have been in the above sentence.

I mixed that up with the arm platform, which doesn't use register
mappings at all: all opcodes reload from and store to Parrot
registers. Mips only has 3 JITted opcodes.

Sorry again Stéphane,
leo


Traceback or call chain

2004-10-29 Thread Leopold Toetsch
For quite some time now we have had the current subroutine and the current
continuation in the interpreter context structure. With that at hand, we
should now be able to generate function tracebacks in the error case, and we
need the call chain too, to optimize register frame recycling.

Whenever a continuation is created, we have to walk up the call chain 
and mark all return continuations as non-recyclable.
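
A sketch of that walk, assuming each context keeps a pointer to its caller and a recyclable flag. Both field names are invented for the example and are not the actual layout of the interpreter context.

```c
#include <stddef.h>

/* Hypothetical call-chain node: one per active subroutine call. */
typedef struct ctx_sketch {
    struct ctx_sketch *caller;  /* NULL at the outermost call */
    int recyclable;             /* may the register frame be reused? */
} ctx_sketch;

/* Called when a continuation is created in `ctx`: every frame up the
 * chain may be re-entered later, so none of them can be recycled.
 * Returns how many frames changed state. */
static int mark_chain_non_recyclable(ctx_sketch *ctx)
{
    int changed = 0;

    for (; ctx != NULL; ctx = ctx->caller) {
        if (ctx->recyclable) {
            ctx->recyclable = 0;
            changed++;
        }
    }
    return changed;
}
```

The same caller links would also give the traceback: walking them and printing each frame's subroutine name yields the call chain.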

Should the traceback object be available as a PMC?
What information should be included in the traceback (object)?
Comments welcome,
leo


Re: [perl #32208] [PATCH] Register allocation patch - scales better to more symbols

2004-10-29 Thread Leopold Toetsch
Bill Coffman (via RT) wrote:

 Patch does the following:

 - Applied Matula/Chaitin/Briggs algorithm for register allocation.
 - Color the graph all at once, and spill all symbols with high colors.
   Spill all at once to speed things up.

Good. Hopefully Dan can provide some compile number compares.

 - Shortcoming: doesn't use score anymore, but the algorithm is smart
   enough that I hope it's okay to do that.
 - Failed 2 tests for latest CVS.  (See earlier posting.)

I've fixed dumper.t in CVS. Only streams_11 is currently failing here.

 WANT TO DO:
 - Apparently, there's a memory leak which prevents coloring
   graphs with more than a few hundred registers.  I suspect this is in
   the spill, or update_life routine.  Not sure if it's mine or
   pre-existing.

There probably were already some leaks. But we really have to get rid of
memory leaks altogether.

 - Interference graph is using 8 times the memory it needs to use.
   This is still trivial compared to lost data in above bug.

That might kill Dan's 6000-liner.

 - Color each of the four register types separately.  Be sure to
   compare gains with losses for this, as it is not entirely clear.

That would reduce memory, wouldn't it?

 - Introduce register renaming.  When variable is reassigned, it might
   as well be considered a new symbol... well, much of the time, anyway.

Number 1 in my priority list.

 - Introduce variable register size, in coordination with subroutine
   calls, to reduce copy cost.  Coordinate with Dan and Leo on this.

Not needed. We don't copy registers anymore.

 - Improve flow-graph, basic block calculation, etc.

Yeah. And create some means to test it.

Some more notes WRT the patch:

* the Dbg and Dbg2 debug macros aren't needed. Just use the existing
  debug(interp, level, ...) function in src/debug.c. If you need some
  extra levels, you can use some more bits in imcc/debug.h

* The global G is a no-no, and I don't think you need it for qsort (if
  you need it you should just use the global around the qsort). We finally
  have to have a reentrant compiler. Yes I know, there are still some
  other globals around, they are being reduced ...

* all functions should have an Interp* and an IMC_Unit* argument to allow
  reentrancy. I.e. all state should be in the unit structure.

* Variable names should be a bit more verbose, G.V is too terse.

* alloca() isn't portable and not available everywhere
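
One way to address the qsort and alloca() notes together, sketched under the assumption that the only thing the comparator needs is each node's degree: copy (node, degree) pairs into a heap-allocated scratch array, sort that, and free it. No global is touched. All names are hypothetical, and malloc/free stand in for whatever allocation functions the tree prefers.

```c
#include <stdlib.h>

typedef struct {
    int node;    /* index of the node in the graph */
    int degree;  /* its number of neighbours */
} node_degree;

/* Comparator that needs no outside state: everything it compares is
 * carried inside the array elements. */
static int by_degree(const void *a, const void *b)
{
    const node_degree *x = (const node_degree *)a;
    const node_degree *y = (const node_degree *)b;

    if (x->degree != y->degree)
        return (x->degree > y->degree) - (x->degree < y->degree);
    /* Tie-break on node index so the result is deterministic. */
    return (x->node > y->node) - (x->node < y->node);
}

/* Fill order[0..n-1] with node indices in ascending degree order.
 * Returns 0 on success, -1 if the scratch allocation fails. */
static int order_by_degree(const int *degree, int n, int *order)
{
    node_degree *tmp;
    int i;

    if (n <= 0)
        return 0;
    tmp = (node_degree *)malloc((size_t)n * sizeof *tmp);
    if (tmp == NULL)
        return -1;
    for (i = 0; i < n; i++) {
        tmp[i].node = i;
        tmp[i].degree = degree[i];
    }
    qsort(tmp, (size_t)n, sizeof *tmp, by_degree);
    for (i = 0; i < n; i++)
        order[i] = tmp[i].node;
    free(tmp);
    return 0;
}
```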
I'm waiting for Dan's comments on usability.
Thanks for the patch,
leo


Q: newsub opcodes

2004-10-29 Thread Leopold Toetsch
When PIR code has a function call syntax:
  foo(i, j)
the created code has currently (amongst other) a line:
  newsub Px, .Sub, foo
where the label foo is a relative branch offset.
This is suboptimal for several reasons:
- it creates a new PMC for every call albeit in 99.99% of cases the PMC 
constant for the sub could be used directly [1]

- the created subroutine PMC lacks information: only the start label is
known when the PMC is created. The subroutine's name and the end of the
opcodes for that sub aren't in Px. Obtaining that information would be a
costly O(n) lookup in the fixup segment of the bytecode (or in the
constants, which is probably still larger). Subroutine length and name
information is needed for introspection and for bounds checking in safe
run cores.

So I think, we should do instead something like this:
  get_sub Px, foo       # find the PMC with label foo in constants
                        # at compile time and
                        # replace foo with the index in constants
  clone Py, Px          # if Px would be modified, clone it first [1]
  find_global Px, foo   # we have that already, but hash lookup!
The current syntax:
  newsub Px, .Closure, foo
could remain unchanged, except that again under the hood, the label foo 
is replaced with the index in the constant table. For closures it's 
probably best to actually return a new object per default, as a closure 
might have different state in the lexical pad in each invocation.
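
To make the compile-time versus run-time split concrete, here is a sketch with an invented constant-table layout. The O(n) name search happens once, when the label is resolved; afterwards fetching the sub (with its name and end offset) is a bounds-checked array access.

```c
#include <string.h>

/* Hypothetical constant-table entry for a subroutine. */
typedef struct {
    const char *name;
    int start;  /* offset of the first opcode */
    int end;    /* offset just past the last opcode */
} sub_const_sketch;

/* Compile time: turn a label into an index into the constant table.
 * Returns -1 if no sub with that name exists. */
static int resolve_sub_index(const sub_const_sketch *consts, int n,
                             const char *label)
{
    int i;

    for (i = 0; i < n; i++)
        if (strcmp(consts[i].name, label) == 0)
            return i;
    return -1;
}

/* Run time: fetch by index, no hashing or searching.
 * Returns NULL for an out-of-range index. */
static const sub_const_sketch *fetch_sub(const sub_const_sketch *consts,
                                         int n, int index)
{
    if (index < 0 || index >= n)
        return NULL;
    return &consts[index];
}
```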

Is that reasonable?
leo
[1] a few tests attach properties to the Sub PMC


Re: C89

2004-10-29 Thread Bryan Donlan
On Thu, 28 Oct 2004 19:22:02 -0700, Bill Coffman [EMAIL PROTECTED] wrote:
 Thanks for the info...
 
 Apparently,
 
gcc -ansi -pedantic
 
 is supposed to be ANSI C '89.  Equiv to -std=c89.  Also, my
 Configure.pl generated make file uses neither -ansi nor -pedantic.  I
 do have access to a K&R C v2, but it doesn't look like it's going to
 match the actual practice.  Oh well.  So long as my code works, I'm
 happy.
 
 Incidentally, I tried adding -ansi and -pedantic and I got lots of
 warnings, like long long not supported by ANSI C'89, etc. (how can
 you do 64 bit ints then?).  I also got errors that caused outright
 failure.  Perhaps it's best to forget the whole C'89 thing.  But maybe
 someone should remove that from the documentation?  Just a thought.

I thought long long was only defined in C99, not C89?

-- 
bd


Re: [perl #32208] [PATCH] Register allocation patch - scales better to more symbols

2004-10-29 Thread Dan Sugalski
At 12:30 PM +0200 10/29/04, Leopold Toetsch wrote:
 Bill Coffman (via RT) wrote:

  Patch does the following:

  - Applied Matula/Chaitin/Briggs algorithm for register allocation.
  - Color the graph all at once, and spill all symbols with high colors.
    Spill all at once to speed things up.

 Good. Hopefully Dan can provide some compile number compares.

I'll give it a shot as soon as I can.

  WANT TO DO:
  - Apparently, there's a memory leak which prevents coloring
    graphs with more than a few hundred registers.  I suspect this is in
    the spill, or update_life routine.  Not sure if it's mine or
    pre-existing.

 There probably were already some leaks. But we really have to get
 rid of memory leaks altogether.

  - Interference graph is using 8 times the memory it needs to use.
    This is still trivial compared to lost data in above bug.

 That might kill Dan's 6000-liner.

I should point out, for the folks following along at home, that it's
6K lines of source in the original language (DecisionPlus). The
actual PIR code generated runs to 84k lines in the biggest sub.

 Some more notes WRT the patch:

 * the Dbg and Dbg2 debug macros aren't needed. Just use the existing
   debug(interp, level, ...) function in src/debug.c. If you need some
   extra levels, you can use some more bits in imcc/debug.h

 * The global G is a no-no, and I don't think you need it for qsort
   (if you need it you should just use the global around the qsort). We
   finally have to have a reentrant compiler. Yes I know, there are
   still some other globals around, they are being reduced ...

I'd like to get us down to a single global for all of Parrot. I don't
think it's possible to safely go any lower than that, though I
suppose we could if we really, really tried, and didn't mind things
crashing and burning in some really odd fringe edge cases.

 * all functions should have an Interp* and an IMC_Unit* argument to
   allow reentrancy. I.e. all state should be in the unit structure.

Definitely.

 * Variable names should be a bit more verbose, G.V is too terse.

Yeah. This stuff is abstruse enough as it is -- take pity on those of
us with Very Little Brain. :)

 * alloca() isn't portable and not available everywhere

Yep. This is a gcc-ism. Use Parrot's memory allocation functions instead.

 I'm waiting for Dan's comments on usability.

I'd like the code issues cleaned up before it gets committed. I'll
let you know the timing as soon as I can, though it'll probably take
a few hours.
--
Dan

--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Re: Q: newsub opcodes

2004-10-29 Thread Dan Sugalski
At 2:46 PM +0200 10/29/04, Leopold Toetsch wrote:
 When PIR code has a function call syntax:

   foo(i, j)

 the created code has currently (amongst other) a line:

   newsub Px, .Sub, foo

 where the label foo is a relative branch offset.
 This is suboptimal for several reasons:

[snip]

 So I think, we should do instead something like this:

   get_sub Px, foo       # find the PMC with label foo in constants
                         # at compile time and
                         # replace foo with the index in constants
   clone Py, Px          # if Px would be modified, clone it first [1]
   find_global Px, foo   # we have that already, but hash lookup!

[snip]

 Is that reasonable?

Yeah, but I think I've a better approach. Instead of doing this,
let's just get PMC constants implemented. (I know -- "just" he says
:) Each sub in a bytecode segment can get a slot in the constant
table, and we can map in sub fetches to the (as of now nonexistent)
set_p_pc op.

While we're at it we should see about adding in an integer constant
table that can be fixed up on load (to take care of those pesky "what
number did my PMC class map to" problems more quickly than the hash
lookup) but we can put that off a bit.
--
Dan



Re: [perl #32208] [PATCH] Register allocation patch - scales better to more symbols

2004-10-29 Thread Dan Sugalski
At 12:30 PM +0200 10/29/04, Leopold Toetsch wrote:
 Bill Coffman (via RT) wrote:

  Patch does the following:

  - Applied Matula/Chaitin/Briggs algorithm for register allocation.
  - Color the graph all at once, and spill all symbols with high colors.
    Spill all at once to speed things up.

 Good. Hopefully Dan can provide some compile number compares.

The numbers are... not good.

I took one of the mid-sized programs and threw it at the new code.
Parrot in CVS takes about 10 minutes to run through this program. The
main sub's about 30Klines of code, and the stat from a parrot -v is:

sub _MAIN:
        registers in .imc:       I2875, N0, S868, P7615
        0 labels, 0 lines deleted, 0 if_branch, 0 branch_branch
        0 used once deleted
        0 invariants_moved
        registers needed:        I2883, N0, S873, P7741
        registers in .pasm:      I31, N0, S31, P32 - 37 spilled
        5845 basic_blocks, 47622 edges

I applied the patch to a copy of parrot and ran it. After 37 minutes
I killed the thing. It had 1.6G of RAM allocated at the time of
death, too.
--
Dan



Re: pmc_type

2004-10-29 Thread Paolo Molaro
On 10/29/04 Leopold Toetsch wrote:
  Ugh, yeah, but what does that buy you?  In dynamic languages pure
  derivational typechecking is very close to useless.
 
  Actually, if I were to write a perl runtime for parrot, mono or
  even the JVM I'd experiment with the same pattern.
 
 For the latter two yes, but as Luke has outlined that doesn't really
 help for languages where methods are changing under the hood.

If a method changes you just replace the pointer in the vtable
to point to the new method implementation. Invalidation is the
same, you just replace it with a method that gives the 
method not found error/exception.

  You would assign small integer IDs to the names of the methods
  and build a vtable indexed by the id.
 
 Well, we already got a nice method cache, which makes lookup a
 vtable-like operation, i.e. an array lookup. But that's runtime only
 (and it needs invalidation still). So actually just the first method
 lookup is a hash operation.

And where is it cached and how? Take (sorry, still perl5 syntax:-):

foreach $i (@list) {
    $i->method ();
}

With the vtable idea, the low-level operations are (in pseudo-C):

    vtable = $i->vtable; // just a memory dereference
    code = vtable [method-constant-id]; // another mem deref
    run_code (code);

From your description it seems it would look like:

    vtable = $i->vtable;
    code = vtable->method_lookup (method); // C function call
    run_code (code);

Note that $i may be of different type for each loop iteration.
Even a cached lookup is going to be slower than a simple memory 
dereference. Of course this only matters if the lookup is 
actually a bottleneck of your function call speed.
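
A compilable sketch of the vtable-indexed-by-method-id idea, including the two operations described above: redefining a method by swapping the slot, and invalidating it by installing a stub that reports "method not found". The method ids, types, and names are all invented for the example; this is not Parrot's or Mono's actual layout.

```c
#include <stddef.h>

/* Hypothetical small integer ids assigned to method names. */
enum { METHOD_DOUBLE, METHOD_NEGATE, METHOD_COUNT };

/* A method returns 0 and stores its result, or -1 for "not found". */
typedef int (*method_fn)(int self_value, int *result);

typedef struct {
    method_fn vtable[METHOD_COUNT];  /* indexed by method id */
} class_sketch;

typedef struct {
    class_sketch *cls;
    int value;
} object_sketch;

static int method_double(int v, int *r)  { *r = 2 * v; return 0; }
static int method_triple(int v, int *r)  { *r = 3 * v; return 0; }
static int method_negate(int v, int *r)  { *r = -v;    return 0; }
static int method_missing(int v, int *r) { (void)v; (void)r; return -1; }

/* Dispatch is two memory dereferences and an indirect call; the
 * object's class may differ on every call. */
static int call_method(const object_sketch *obj, int method_id, int *result)
{
    return obj->cls->vtable[method_id](obj->value, result);
}
```

Redefinition is then `cls->vtable[METHOD_DOUBLE] = method_triple;` and invalidation is `cls->vtable[METHOD_NEGATE] = method_missing;`, with no cache to flush.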

  matter in parrot as long as bytecode values are as big as C ints:-)
 
 That ought to come ;) Cachegrind shows no problem with opcode fetch and
 you know, when it's compiled to JIT bytecode size doesn't matter anyway.
 We just avoid the opcode and operand decoding.

If you use a JIT, decode overhead is already very small:-) AFAIK, alpha
is the only interesting architecture that doesn't do byte access (at least
on older processors) and so it may be a little inefficient there. But I think
you should optimize for the common case. On my machine going through a byte
opcode array is faster than an int one by about 15% (more if level 2 cache
or mem is needed to hold it). The only issue is when you need to load int
values that don't fit in a byte, but those are less common than register
numbers, which currently take a whole int in your bytecode and could just
use a byte.
Anyway, the two approaches may also balance out if the opcodes are
in ro memory. The issue is that in perl, for example, so much is
supposed to happen at runtime, because the 'use' operator changes the
compiling environment, so you actually need to compile at runtime in many 
cases, not only eval. That means emitting parrot bytecode in memory and
this bytecode is per-process, so it increases memory usage and eventually
swapping activity. As you say, since you jit, this memory is wasted, since
it goes unused soon after it is written.
Another issue is disk-load time: when you have small test apps it doesn't
matter, but when you start having bigger apps it might (even mmapping
has its cost, if you need a larger working set to load bytecodes).

BTW, in the computed goto code, make the array of address labels const:
it helps reduce the rw working set, at least when parrot is built as an
executable.

lupus

-- 
-
[EMAIL PROTECTED] debian/rules
[EMAIL PROTECTED] Monkeys do it better


Re: Q: newsub opcodes

2004-10-29 Thread Leopold Toetsch
Dan Sugalski [EMAIL PROTECTED] wrote:

   get_sub Px, foo   # find the PMC with label foo in constants

 Yeah, but I think I've a better approach. Instead of doing this,
 let's just get PMC constants implemented.

Well, they are implemented, at least partly. Sub PMCs are in the
constant table. The funny C<get_sub> opcode is actually a ...

 set_p_pc op.

... with the small difference, that at compile time, the integer
argument is a label (offset).

What about the other part: closures - should they be created via new
always?

 While we're at it we should see about adding in an integer constant
 table that can be fixed up on load (to take care of those pesky what
 number did my PMC class map to problems more quickly than the hash
 lookup) but we can put that off a bit.

You are thinking of a 2-stage lookup?

  I1 = dynamic_type I0   # lookup runtime type mapping
  $P0 = new I1

A normal IntList Array can do that too.
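
A sketch of what such a two-stage mapping amounts to, with invented names: the bytecode refers to a type by slot index, and a table filled in once at load time maps each slot to whatever type number the class received in this run. Here the run-time type number is simply the position in a list of loaded class names, which is an assumption made only to keep the example self-contained.

```c
#include <string.h>

/* Hypothetical slot: class name from the bytecode plus the type
 * number it resolved to at load time (-1 if unresolved). */
typedef struct {
    const char *class_name;
    int runtime_type;
} type_slot_sketch;

/* Load time: resolve every slot once. Returns the number of slots
 * that could not be resolved. */
static int fixup_type_slots(type_slot_sketch *slots, int n_slots,
                            const char *const *loaded, int n_loaded)
{
    int i, j, unresolved = 0;

    for (i = 0; i < n_slots; i++) {
        slots[i].runtime_type = -1;
        for (j = 0; j < n_loaded; j++) {
            if (strcmp(slots[i].class_name, loaded[j]) == 0) {
                slots[i].runtime_type = j;
                break;
            }
        }
        if (slots[i].runtime_type < 0)
            unresolved++;
    }
    return unresolved;
}

/* Run time: an array access instead of a hash lookup. */
static int dynamic_type_sketch(const type_slot_sketch *slots, int n_slots,
                               int index)
{
    if (index < 0 || index >= n_slots)
        return -1;
    return slots[index].runtime_type;
}
```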

leo


Mostly a Perl task for the interested

2004-10-29 Thread Leopold Toetsch
classes/*.c is created by the bytecode compiler classes/pmc2c2.pl. Most 
of the actual code is in lib/Parrot/Pmc2c.pm.

The created C code could need some improvements:
* the temp_base_vtable should be const.
  This is currently not possible, because items like .whoami are 
changed in the temp_base_vtable. But we don't have to do that, as the 
vtable is cloned a few lines below anyway. So we should create a const 
table and do the rest of the init stuff in the cloned table.

* same with the MMD init table.
* All constant strings in classes (whoami, isa_str, does_str) and method 
names in the delegate.c should use the CONST_STRING() macro. That would 
need some Makefile tweaks too, to add a dependency on the .str file.
Note: foo = CONST_STRING(interpreter, "foo"); should always be on its
own line and not inside a multiline expression.
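
The first item could end up generating code shaped like the following sketch. The struct is a made-up miniature, not the real VTABLE, and malloc/memcpy stand in for whatever cloning function the generated code actually calls; the point is that the template stays const and only the clone is written to.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical miniature of a vtable. */
typedef struct {
    const char *whoami;
    int base_type;
    int flags;
} vtable_sketch;

/* The template can live in read-only memory because nothing below
 * ever writes to it. */
static const vtable_sketch temp_base_vtable_sketch = { NULL, 0, 0 };

/* Clone the const template and do all per-class initialization on
 * the clone. Returns NULL if allocation fails; the caller frees. */
static vtable_sketch *clone_and_init(const vtable_sketch *base,
                                     const char *whoami, int type)
{
    vtable_sketch *vt = (vtable_sketch *)malloc(sizeof *vt);

    if (vt == NULL)
        return NULL;
    memcpy(vt, base, sizeof *vt);
    vt->whoami = whoami;
    vt->base_type = type;
    return vt;
}
```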

Thanks,
leo


Re: [perl #32208] [PATCH] Register allocation patch - scales better to more symbols

2004-10-29 Thread Bill Coffman
Sounds like the memory leak.  Let me try to fix this, and address the
other issues.  I'll get back to you.

Thanks,
-Bill

On Fri, 29 Oct 2004 10:15:34 -0400, Dan Sugalski [EMAIL PROTECTED] wrote:
 At 12:30 PM +0200 10/29/04, Leopold Toetsch wrote:
 Bill Coffman (via RT) wrote:
 
 Patch does the following:
 
 - Applied Matula/Chaitin/Briggs algorithm for register allocation.
 - Color the graph all at once, and spill all symbols with high colors.
   Spill all at once to speed things up.
 
 Good. Hopefully Dan can provide some compile number compares.
 
 The numbers are... not good.
 
 I took one of the mid-sized programs and threw it at the new code.
 Parrot in CVS takes about 10 minutes to run through this program. The
 main sub's about 30Klines of code, and the stat from a parrot -v is:
 
 sub _MAIN:
  registers in .imc:   I2875, N0, S868, P7615
  0 labels, 0 lines deleted, 0 if_branch, 0 branch_branch
  0 used once deleted
  0 invariants_moved
  registers needed:I2883, N0, S873, P7741
  registers in .pasm:  I31, N0, S31, P32 - 37 spilled
  5845 basic_blocks, 47622 edges
 
 I applied the patch to a copy of parrot and ran it. After 37 minutes
 I killed the thing. It had 1.6G of RAM allocated at the time of
 death, too.
 --
 
 
 Dan
 
 --it's like this---
 Dan Sugalski  even samurai
 [EMAIL PROTECTED] have teddy bears and even
teddy bears get drunk



Re: Q: newsub opcodes

2004-10-29 Thread Dan Sugalski
At 4:36 PM +0200 10/29/04, Leopold Toetsch wrote:
Dan Sugalski [EMAIL PROTECTED] wrote:
   get_sub Px, foo   # find the PMC with label foo in constants

 Yeah, but I think I've a better approach. Instead of doing this,
 let's just get PMC constants implemented.
Well, they are implemented, at least partly. Sub PMCs are in the
constant table. The funny get_sub opcode is actually a ...
 set_p_pc op.
... with the small difference, that at compile time, the integer
argument is a label (offset).
Then we should toss the difference and have a single op to access the 
PMC constant table. We're going to need to do this for real PMC 
constants, and I don't see any point to have two ways to do the 
identical same thing.

What about the other part: closures - should they be created via new
always?
Yeah, I think so. They always need to capture a lexical scope, so I 
think they're going to have to.

  While we're at it we should see about adding in an integer constant
 table that can be fixed up on load (to take care of those pesky "what
 number did my PMC class map to" problems more quickly than the hash
 lookup) but we can put that off a bit.
You are thinking of a 2-stage lookup?
  I1 = dynamic_type I0   # lookup runtime type mapping
  $P0 = new I1
More like what we do right now with all the other constant types. 
Integers aren't in a constant table since we just inline them, but 
the nice thing about a constant table is you can do fixups on it 
while not touching the actual bytecode, leaving it readonly and 
mmapped and all that.

While I'd prefer to leave integers inlined in general, having an 
integer section of the constant table that can be accessed when 
necessary makes the things that need integer fixup easier.

And yeah, this implies that our constant table isn't necessarily 
constant. I'm OK with that, though. :)

A normal IntList Array can do that too.
Sure, it could. But we're trying to make sure we provide all the 
standard facilities in one place so all the different compiler 
writers don't have to bother. Fixed-up integer constants is a 
reasonable foundation piece for us to provide.
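
A minimal sketch of the fixup idea, assuming the bytecode stores only a slot index and the loader patches the table: IntConstTable, fixup_int_constant() and fetch_int_constant() are hypothetical names made up for illustration, and INTVAL is typedef'd locally as a stand-in.

```c
#include <assert.h>
#include <stddef.h>

typedef long INTVAL;   /* local stand-in for Parrot's INTVAL */

/* Hypothetical integer section of a constant table. */
typedef struct IntConstTable {
    INTVAL *values;
    size_t  count;
} IntConstTable;

/* Read-only "bytecode": each operand is an index into the table. */
static const size_t bytecode_operands[] = { 0, 1 };

/* Load-time fixup: patch a table slot, e.g. with the type number a
 * dynamically loaded PMC class was assigned in this run. */
static void fixup_int_constant(IntConstTable *t, size_t slot, INTVAL v)
{
    assert(slot < t->count);
    t->values[slot] = v;
}

/* Run-time fetch: the bytecode itself is never written to. */
static INTVAL fetch_int_constant(const IntConstTable *t, size_t operand_idx)
{
    size_t slot = bytecode_operands[operand_idx];
    assert(slot < t->count);
    return t->values[slot];
}
```

The point of the indirection is that bytecode_operands can stay const (and, in the real thing, mmapped read-only) while the table absorbs all per-run differences.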
--
Dan

--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Re: Q: newsub opcodes

2004-10-29 Thread Leopold Toetsch
Dan Sugalski wrote:
At 4:36 PM +0200 10/29/04, Leopold Toetsch wrote:

Well, they are implemented, at least partly. Sub PMCs are in the
constant table. The funny get_sub opcode is actually a ...
 set_p_pc op.

... with the small difference, that at compile time, the integer
argument is a label (offset).

Then we should toss the difference and have a single op to access the 
PMC constant table. 
I'm all for that. We just need set_p_pc. OTOH we need some syntax bits 
for what this PMC constant in _pc denotes and how it's constructed. For 
subroutine PMCs it's quite simple: the subroutine label is defining the 
Sub PMC. So it's probably something like:

  .pmc_constant .Sub, foo
A complex PMC could be
  .pmc_constant .Complex, 2+3i
Putting PMC constants into the constant table isn't the problem here, 
nor is changing the format on disc to use freeze/thaw; it's the 
construction of more or less arbitrary PMC constants that needs some 
thought.

We could probably just define that the new_extended vtable of a class 
is responsible for constructing an appropriate object from a given 
string. And that gets frozen to bytecode.

[ integer constants ]
More like what we do right now with all the other constant types. 
Ah, sounds good.
Integers aren't in a constant table since we just inline them, but the 
nice thing about a constant table is you can do fixups on it while not 
touching the actual bytecode, leaving it readonly and mmapped and all that.

While I'd prefer to leave integers inlined in general, having an integer 
section of the constant table that can be accessed when necessary makes 
the things that need integer fixup easier.
Well, we can't have both schemes coexisting. When running add_i_ic or 
new_p_ic, we have to know whether the integer is inlined or in the 
constant table. For JIT code on RISC CPUs a constant table is better 
even for integers.

And having integer constants in the constant table would open the path 
to compile an INTVAL=64bit configuration on a 32-bit machine, where 
opcode_t is 32 bits.

That leads again to my warnocked proposal to just toss all variants of 
opcodes that have constants too. With all possible PMC constants in the 
constants table, we get another (estimated) times two opcode count increase.

When we now have only:
  add_p_p_p
we'd get:
  add_p_p_pc
  add_p_pc_p
  add_p_pc_pc
additionally.
We'll start blowing caches. The code gets too big (think compile 
problems with CGoto). JIT maintainers have to support all these opcodes.

Please consider reducing constant usage to 4 opcodes:
  set_i_ic
  set_n_nc
  set_s_sc
  set_p_pc
(yes, that imposes a bit more pressure on the register allocator, but 
these constants are reloadable all the time and don't need spilling)

And yeah, this implies that our constant table isn't necessarily 
constant. I'm OK with that, though. :)
Well, constant Sub PMCs have already offsets relative to their code 
segment in the PBC. On loading the segment, this gets converted to 
absolute code addresses. Forget the constantness of constant segments ;)
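
The existing Sub fixup described here amounts to something like the following sketch. SubConst and fixup_sub_constants() are hypothetical names and opcode_t is a local stand-in; the real packfile loader is organized differently.

```c
#include <assert.h>
#include <stddef.h>

typedef long opcode_t;   /* local stand-in for Parrot's opcode_t */

/* Hypothetical view of a constant Sub entry. */
typedef struct SubConst {
    size_t    offset;    /* as stored in the PBC: relative to segment start */
    opcode_t *address;   /* absolute address, filled in on load             */
} SubConst;

/* On loading a code segment, turn each relative offset into an
 * absolute address inside that segment. */
static void fixup_sub_constants(SubConst *subs, size_t n,
                                opcode_t *code_start, size_t code_len)
{
    size_t i;
    for (i = 0; i < n; i++) {
        assert(subs[i].offset < code_len);
        subs[i].address = code_start + subs[i].offset;
    }
}
```

So the "constant" segment is already written to once per load; a PMC type fixup would be one more pass of the same kind.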

We'll do yet another fixup and I really like the idea WRT PMC types.
leo


Re: Q: newsub opcodes

2004-10-29 Thread Dan Sugalski
At 10:17 PM +0200 10/29/04, Leopold Toetsch wrote:
Dan Sugalski wrote:
At 4:36 PM +0200 10/29/04, Leopold Toetsch wrote:

Well, they are implemented, at least partly. Sub PMCs are in the
constant table. The funny get_sub opcode is actually a ...
 set_p_pc op.

... with the small difference, that at compile time, the integer
argument is a label (offset).

Then we should toss the difference and have a single op to access 
the PMC constant table.
I'm all for that. We just need set_p_pc. OTOH we need some syntax 
bits for what this PMC constant in _pc denotes and how it's 
constructed. For subroutine PMCs it's quite simple: the subroutine 
label is defining the Sub PMC. So it's probably something like:

  .pmc_constant .Sub, foo
For now I'm fine with restricting it to sub PMCs and opening it up to 
more later. :)

[ integer constants ]
More like what we do right now with all the other constant types.
Ah, sounds good.
Integers aren't in a constant table since we just inline them, but 
the nice thing about a constant table is you can do fixups on it 
while not touching the actual bytecode, leaving it readonly and 
mmapped and all that.

While I'd prefer to leave integers inlined in general, having an 
integer section of the constant table that can be accessed when 
necessary makes the things that need integer fixup easier.
Well, we can't have both schemes coexisting.
Right, and I don't think it's a good idea to switch out from what we 
have now. For integer constants I think we ought to have an explicit 
op for fetching them:

   getconstant Ix, Iconstantnumber
or something like that. We can have a corresponding setconstant op 
for constants that... aren't. And for code to set up the constant 
table in the first place.

And having integer constants in the constant table would open the 
path to compile an INTVAL=64bit configuration on a 32-bit machine, 
where opcode_t is 32 bits.

That leads again to my warnocked proposal to just toss all variants 
of opcodes that have constants too. With all possible PMC constants 
in the constants table, we get another (estimated) times two opcode 
count increase.

Please consider reducing constant usage to 4 opcodes:
I have, and no. (though we can toss all the two-constant I, S, and N 
forms) We leave things as-is. When we run up to our 1.0 release we 
can run some analysis on the different compilers to see what ops 
aren't being used and pare out the list at that point.

And yeah, this implies that our constant table isn't necessarily 
constant. I'm OK with that, though. :)
Well, constant Sub PMCs have already offsets relative to their code 
segment in the PBC. On loading the segment, this gets converted to 
absolute code addresses. Forget the constantness of constant 
segments ;)
:) Works for me. I want to revisit packfile formats and whatnot again 
soon anyway -- I want us to start adding in source line numbers and 
source lines to the packfiles and have them available as we run so we 
can start throwing more informative error messages and making the 
debugger more useful.
--
Dan

--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Re: Mostly a Perl task for the interested

2004-10-29 Thread Nicholas Clark
On Fri, Oct 29, 2004 at 05:47:55PM +0200, Leopold Toetsch wrote:
 classes/*.c is created by the bytecode compiler classes/pmc2c2.pl. Most 
 of the actual code is in lib/Parrot/Pmc2c.pm.
 
 The created C code could use some improvements:

Can I add a fourth - one I said to Dan I intended to do, but so far haven't
managed:

* The created C code could benefit from #line directives that record which
  C code came from the input .pmc file, so that compiler errors are reported
  against the original .pmc file. Perl 5's xsubpp does this well, using #line
  directives to switch between foo.c and foo.xs, depending on whether that
  section of code was human written or autogenerated. It makes things much
  easier while developing.
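
For anyone who hasn't used #line: a small sketch of what the generated file might look like. The file names foo.pmc and foo.c and the line numbers 42 and 100 are made up for illustration; a real generator would emit the actual positions.

```c
#include <assert.h>
#include <string.h>

/* Hand-written body, copied from line 42 of a hypothetical foo.pmc. */
#line 42 "foo.pmc"
static int pmc_body_line(void) { return __LINE__; }
static const char *pmc_body_file(void) { return __FILE__; }

/* Back to autogenerated glue; 100 is an illustrative line number. */
#line 100 "foo.c"
static int generated_line(void) { return __LINE__; }

/* Self-check: the compiler now attributes each function to the file
 * and line named by the preceding #line directive. */
static void check_line_mapping(void)
{
    assert(pmc_body_line() == 42);
    assert(strcmp(pmc_body_file(), "foo.pmc") == 0);
    assert(generated_line() == 100);
}
```

Any diagnostic for the first two functions would be reported against foo.pmc, which is exactly the behaviour wanted for the human-written sections.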


Of course, I may find time to do this before anyone else does, but anyone
is welcome to beat me to it.

Nicholas Clark


Re: AIX PPC JIT warning

2004-10-29 Thread Adam Thomason
On Fri, 29 Oct 2004 01:05:18 -0700, Jeff Clites [EMAIL PROTECTED] wrote:
 Recently config/gen/platform/darwin/asm.s was added, containing
 Parrot_ppc_jit_restore_nonvolatile_registers(). Corresponding code also
 needs to be added to config/gen/platform/aix/asm.s -- Parrot should
 fail to link on AIX currently, without this. I didn't try to update the
 AIX asm.s myself, since I wasn't confident that I could do this
 correctly without having a way to test.
 
 So, someone with AIX asm expertise, please take a look.
 
 Thanks,
 
 JEff
 
 

Worry not, it's already broken.  I've been unable to test the AIX/PPC
JIT since ICU went in.  The configuration for ICU (at least as of 2.6)
supports only a 64-bit build, while aix/asm.s is 32-bit only (the
linker claims the .o is corrupt if assembled with OBJECT_MODE=64).

To get it working again, one of three things needs to happen:

1. ICU becomes optional again (please!).
2. PPC64 JIT code is written which can be morphed into POWER code.
Transforming PPC32->POWER was mostly straightforward, so hopefully
64-bit will be as well.
3. ICU's configure starts to support 32-bit compiles.  This might
happen with 3.0/CVS already, but I haven't checked.

1 is necessary anyway, but it doesn't seem like a high priority.  2 is
best in the long run, but requires somebody who knows more about PPC64
ASM than I do to get started.  I don't know if 3 has any chance of
happening upstream, but I doubt there's anybody working on Parrot who
wants to deal with it.

If somebody can help with one or more of these, I can try to get it
going on AIX 4.3.3 once again.

Adam


[perl #32223] [PATCH] Build dynclasses by default.

2004-10-29 Thread via RT
# New Ticket Created by  Will Coleda 
# Please include the string:  [perl #32223]
# in the subject line of all future correspondence about this issue. 
# URL: http://rt.perl.org:80/rt3/Ticket/Display.html?id=32223 


I think we should be building dynclasses by default.

oolong:~/research/parrot/config/gen/makefiles coke$ cvs diff root.in 
Index: root.in
===
RCS file: /cvs/public/parrot/config/gen/makefiles/root.in,v
retrieving revision 1.254
diff -b -u -r1.254 root.in
--- root.in 12 Oct 2004 09:00:16 -  1.254
+++ root.in 30 Oct 2004 03:31:44 -
@@ -463,7 +463,7 @@
 #
 ###
 
-all : flags_dummy $(TEST_PROG) runtime/parrot/include/parrotlib.pbc runtime/parrot/include/config.fpmc docs $(LIBNCI_SO) $(GEN_LIBRARY)
+all : flags_dummy $(TEST_PROG) runtime/parrot/include/parrotlib.pbc runtime/parrot/include/config.fpmc docs $(LIBNCI_SO) $(GEN_LIBRARY) dynclasses_dummy
 
 .SUFFIXES : .c .h .pmc .dump $(O) .str .imc .pbc
 
@@ -581,6 +581,9 @@
@echo Compiling with:
@$(PERL) tools/dev/cc_flags.pl ./CFLAGS echo $(CC) $(CFLAGS) -I$(@D) 
${cc_o_out} xx$(O) -c xx.c
 
+dynclasses_dummy :
+   cd dynclasses && $(MAKE)
+
 runtime/parrot/include/parrotlib.pbc: runtime/parrot/library/parrotlib.imc 
$(TEST_PROG)
./parrot -o $@ runtime/parrot/library/parrotlib.imc
 



[perl #25255] IMCC - no warning on duplicate .local vars

2004-10-29 Thread Will Coleda via RT
 [coke - Sat Jan 24 19:32:16 2004]:
 
 It would be helpful if IMCC complained about duplicate .local labels, 
 so that the attached wouldn't compile, rather than dying at runtime.
 

A naive pass at this is:

oolong:~/research/parrot coke$ cvs diff imcc/symreg.c
Index: imcc/symreg.c
===
RCS file: /cvs/public/parrot/imcc/symreg.c,v
retrieving revision 1.55
diff -b -u -r1.55 symreg.c
--- imcc/symreg.c   17 Jul 2004 08:07:27 -  1.55
+++ imcc/symreg.c   30 Oct 2004 04:45:21 -
@@ -287,6 +287,11 @@
 ident->next = namespace->idents;
 namespace->idents = ident;
 }
+if (_get_sym(cur_unit->hash,fullname)) {
+fataly(1, sourcefile, line,
+"duplicate .local or .sym: '%s'",
+ fullname);
+}
 r = mk_symreg(fullname, t);
 r->type = VTIDENTIFIER;
 free(name);

This causes a few tests to fail:

t/library/dumper.t  13  332813   13 100.00%  1-13
t/library/parrotlib.t1   256 61  16.67%  3
t/library/streams.t 12  307221   12  57.14%  2 4-5 8 10-12 14-17 20
t/pmc/iter.t 1   256441   2.27%  11


/Some/ of these seem to be valid errors. But it also seems to not like
having .subs with the same name as a .local inside that sub.

Also, the error message isn't reporting properly.

Help?


Re: [perl #25255] IMCC - no warning on duplicate .local vars

2004-10-29 Thread William Coleda
That is to say, the file and line number appear to be off.
Will Coleda via RT wrote:
Also, the errors message isn't reporting properly.
Help?


Re: Q: newsub opcodes

2004-10-29 Thread Brent 'Dax' Royal-Gordon
Dan Sugalski [EMAIL PROTECTED] wrote:
 At 10:17 PM +0200 10/29/04, Leopold Toetsch wrote:
 That leads again to my warnocked proposal to just toss all variants
 of opcodes that have constants too. With all possible PMC constants
 in the constants table, we get another (estimated) times two opcode
 count increase.
 
 Please consider reducing constant usage to 4 opcodes:
 
 I have, and no. (though we can toss all the two-constant I, S, and N
 forms) We leave things as-is. When we run up to our 1.0 release we
 can run some analysis on the different compilers to see what ops
 aren't being used and pare out the list at that point.

As a compromise, it strikes me that we're using 32-bit numbers to
encode registers, but only using five bits of that 32.  Could we not
have constants indicated by numbers starting at 32 or something
similar?  We could probably do something very clever to abstract it,
like load all the constants into a reserved, dynamically-sized set of
registers starting at [INSP]32.

A scheme like this would allow us to consolidate the constant and
register variants of all the ops, while still allowing us to use
constants whenever we wanted.  I'm not sure if the cost--allocating
more register banks and loading the constants into those registers--is
worth it, but it might be worth thinking about at least.
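
To make that concrete, here is a rough sketch for the integer bank only. Everything in it (IntBank, read_int_operand(), op_add_i()) is a hypothetical illustration of the idea, not existing Parrot code, and INTVAL is typedef'd locally.

```c
#include <assert.h>
#include <stddef.h>

typedef long INTVAL;        /* local stand-in for Parrot's INTVAL */
#define NUM_REGISTERS 32

/* Hypothetical integer bank: 32 real registers, with operand numbers
 * from 32 upward mapping onto the constants. */
typedef struct IntBank {
    INTVAL        regs[NUM_REGISTERS];
    const INTVAL *constants;
    size_t        n_constants;
} IntBank;

/* One decode path for both kinds of source operand. */
static INTVAL read_int_operand(const IntBank *b, size_t operand)
{
    if (operand < NUM_REGISTERS)
        return b->regs[operand];
    assert(operand - NUM_REGISTERS < b->n_constants);
    return b->constants[operand - NUM_REGISTERS];
}

/* A single add op covers register and constant sources alike. */
static void op_add_i(IntBank *b, size_t dest, size_t lhs, size_t rhs)
{
    assert(dest < NUM_REGISTERS);   /* constants are not writable */
    b->regs[dest] = read_int_operand(b, lhs) + read_int_operand(b, rhs);
}
```

Here "add I0, I1, 20" would be encoded as op_add_i(bank, 0, 1, 33) if 20 sits in constant slot 1. The sketch pays with a branch per operand read; loading the constants into an extended register file at startup, as suggested above, would trade that branch for memory.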

-- 
Brent 'Dax' Royal-Gordon [EMAIL PROTECTED]
Perl and Parrot hacker

There is no cabal.