Re: cvs commit: parrot/src runops_cores.c

2003-10-25 Thread Leopold Toetsch
Dan Sugalski <[EMAIL PROTECTED]> wrote:
>   "jsr N" where N is an immediate address rather than a register
>   generates bad code.

jsr takes an absolute bytecode address, which IMHO never can be an
immediate integer. The op should be defined as jsr(invar INT).

leo


Re: cvs commit: parrot/src runops_cores.c

2003-10-25 Thread Leopold Toetsch
Dan Sugalski <[EMAIL PROTECTED]> wrote:
>   Runtime code generation works mostly, but for some reason the sequence

Another note: jsr/ret are not prepared to do inter-segment branches. A
compiled code segment is currently an entirely new packfile (it should
eventually be only a new packfile directory) with its own const table and
bytecode segment. The call opcodes (jsr and the like) have no means to switch
to a different bytecode segment. Also, ret gets just a simple address
and can't switch back to the segment of the caller.

I only see 2 things:

1) you use invokecc/invoke P1, that is you compile:

   .pcc_sub _compiled_word_n:
   dosomething
   invoke P1

  after compilation you can obtain the Sub via:

   find_global P0, "_compiled_word_n"
   invokecc # call it

2) We expand the jsr syntax:

   jsr "_the_sub" # push addr_next; push current_seg

   which could find the Sub PMC and branch to it, and ret has to
   check that there are 2 return words on the stack, or we have a new
   opcode that does it.

leo




Re: [perl #24289] [PATCH] Make parrot/languages/Makefile happy

2003-10-25 Thread Leopold Toetsch
Bernhard Schmalhofer <[EMAIL PROTECTED]> wrote:

> 'make all' failed at the target 'cola'. The reason was a 'cd ../imcc; make'.

Removed some more imcc trails.

> 'make test' failed at 'perl6.test'. For some reason 'perl/t/harness' returns an
> error 29, at least on my Linux notebook.

I don't have that. perl6 tests run, though 3 are failing.

> CU, Bernhard

Thanks for reporting,
leo


[RfC] key strings/numbers/PMCs

2003-10-25 Thread Leopold Toetsch
Currently there is no simple way to pack out a Key that has number or
string key members. PackFile_Constant_pack() for PFC_KEY does a linear
lookup (find_in_const) to get at the index of the string or number in
the constant table. This is really ugly.

So I'd change that so that key strings/numbers (and PMCs) hold the
constant table index of the item rather than the actual
string/number/PMC. This avoids the linear search for the index.

This also means that key_new_cstring (and _number, if one needs that)
generates a constant table entry and that key_string returns the item
from the constant table[1]. To avoid many duplicates it would be
best to cache these entries and do a hash lookup first.
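
To make the proposal concrete, here is a minimal sketch in Python (not Parrot code). ConstTable, intern, find_linear, make_key and key_values are hypothetical names invented for illustration; the only point is that a key holding indices packs trivially, and that a hash cache replaces the linear search.

```python
class ConstTable:
    """Hypothetical constant table that interns entries via a hash lookup."""

    def __init__(self):
        self.entries = []  # index -> constant value
        self._index = {}   # (type name, value) -> index: the proposed cache

    def intern(self, value):
        # Key on the type as well as the value so that 1 and 1.0 stay distinct.
        cache_key = (type(value).__name__, value)
        idx = self._index.get(cache_key)
        if idx is None:
            idx = len(self.entries)
            self.entries.append(value)
            self._index[cache_key] = idx
        return idx

    def find_linear(self, value):
        # What packing has to do today: scan the whole table for a match.
        for idx, entry in enumerate(self.entries):
            if type(entry) is type(value) and entry == value:
                return idx
        return -1


def make_key(table, *parts):
    # A key stores constant-table indices, not the values themselves.
    return tuple(table.intern(part) for part in parts)


def key_values(table, key):
    # Reading a key member is a plain index into the constant table.
    return [table.entries[idx] for idx in key]


table = ConstTable()
key = make_key(table, "key", "bla", "key")
```

Here "key" is stored once although it is used twice, and packing the key needs no search at all because the indices are already in it.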

This is all also related to the discussion of constant PMCs/STRINGs
we had some time ago, "[RfC] constant PMCs and classes", and to the lack
of "get_string_keyed_str" and friends for hash lookup.

E.g. pmc_register() currently generates 3 (three) intermediate key PMCs 
for hash lookup in the classname_hash(). The class and object code has 
such lookup all over the place and it will be more in the long run.

Comments welcome,
leo

[1] in combination with the PObj_constant_FLAG - a really dynamic
variant would work like now, albeit by far most usage of such keys
is truly constant.



Q: hash entries

2003-10-25 Thread Leopold Toetsch
I have generalized the hash a bit. There is now a variant that can use C
strings as keys too.

But what I always wanted to know is: do we really need the HASH_ENTRY as
storage for hash items, or just PMCs as Array/PerlArray does? I think
that the entry adds some overhead (type + union = 3 or 4 words) to hash
usage, all the more when only PMCs (or pointers) will be stored in the hash.

leo



Re: [RfC] key strings/numbers/PMCs

2003-10-25 Thread Dan Sugalski
At 2:19 PM +0200 10/25/03, Leopold Toetsch wrote:
>Currently there is no simple way, to packout a Key that has number
>or string key members. PackFile_Constant_pack() for PFC_KEY does a
>linear lookup (find_in_const) to get at the index of the string or
>number in the constant table. This is really ugly.

I'd always assumed that key constants, when being reconstituted from
the bytecode files, would just get strings built for them. Going with
the constant table offset works, but it has the issue of not being
able to pass keys across bytecode segments and needing entries made
in the constant tables for dynamically created keys.

>This all is also related with the discussion of constant
>PMCs/STRINGs, we had some time ago "[RfC] constant PMCs and classes"
>and with the lack of "get_string_keyed_str" and friends for hash
>lookup.

The lack of the string keyed variant is on purpose. Or, rather, the
_int variant was a fast optimization put in to drop the overhead for
array lookup. My assumption was, at the time, that hash lookups were
expensive enough that there wasn't enough win in a shortcut to
justify the extra space in all the vtables.
--
Dan

--"it's like this"---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Re: cvs commit: parrot/src runops_cores.c

2003-10-25 Thread Dan Sugalski
At 10:10 AM +0200 10/25/03, Leopold Toetsch wrote:
>Dan Sugalski <[EMAIL PROTECTED]> wrote:
>>   "jsr N" where N is an immediate address rather than a register
>>   generates bad code.
>
>jsr takes an absolute bytecode address, which IMHO never can be an
>immediate integer. The op should be defined as jsr(invar INT).

Oh, it certainly can be an absolute address, if you know what the
address is when you're generating the code. Assuming fixed-address
bytecode segments, of course, but I think we're not pondering GC
possibly moving bytecode. (I had actually considered it, but decided
that was a bit much even for me :)

That's what actually triggered all this--the forth compiler generates 
code on the fly. There are a set of base words that are their own 
chunks of assembly, then there are the user-defined words which get 
their bytecode generated on the fly by creating little pieces of 
assembly that jsr into the base words.

2drop, for example, drops the two words on the top of the stack and
is really just "drop drop", and can be built from existing words
(drop, in this case). The generated code for the word *should* look
like:

   jsr .drop
   jsr .drop
   ret

where drop is the address of the drop code. (I use a real value
rather than the constant, but this is for illustrative purposes.) This
doesn't work now, for lack of immediate jsrs, and the code is instead

   set .scratchI, .drop
   jsr .scratchI
   set .scratchI, .drop
   jsr .scratchI
   ret
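
For readers who don't know Forth, the scheme can be modelled in a few lines of Python. This is a toy, not Parrot: the op names "jsr", "ret" and "pop", the emit helper and the run loop are assumptions made for illustration only. It shows a user-defined word compiled as nothing but immediate-address calls into a base word.

```python
def run(code, entry, data_stack):
    """Toy interpreter: 'jsr' pushes a return address, 'ret' pops it."""
    return_stack = []
    pc = entry
    while True:
        op, arg = code[pc]
        if op == "jsr":              # arg is an immediate absolute address
            return_stack.append(pc + 1)
            pc = arg
        elif op == "ret":
            if not return_stack:     # returning from the entry word: done
                return data_stack
            pc = return_stack.pop()
        elif op == "pop":            # primitive behind the base word "drop"
            data_stack.pop()
            pc += 1
        else:
            raise ValueError(f"unknown op {op!r}")


code = []


def emit(op, arg=None):
    """Append one instruction and return its address."""
    code.append((op, arg))
    return len(code) - 1


# Base word "drop": its own chunk of code.
DROP = emit("pop")
emit("ret")

# User-defined word "2drop", generated on the fly as "drop drop".
TWO_DROP = emit("jsr", DROP)
emit("jsr", DROP)
emit("ret")

result = run(code, TWO_DROP, [1, 2, 3])
```

The address of drop is known when 2drop is generated, so it can be baked into the instruction stream; no scratch register is needed.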

To snag in the other message at the same time, I am aware of the
issues involved in crossing packfile segments with this. (You'll note
some mildly vicious hackery in the code because of this.) That's
something we need to address--I'm OK both with different bytecode
segments sharing a constant table and with mandating that new
segments may, if given the appropriate flags at creation time,
share the existing constant segment. (FWIW, the code as it stands
does, even with me doing Mildly Evil Things, work and handle the
nasty segment crossing. Kudos are in order all around because of
that. :)

I *don't* want to use the full-on calling conventions here, because 
they are unnecessary, and arguably downright incorrect for 
forth--when you create new words you bind the current definition, not 
the new definition. (I think. I might be wrong about that, in which 
case Plan B may be in order) The local calling conventions are also 
very different, and I'd rather not use them except when I have to. 
(PDD03's pretty clear that the calling conventions don't have to be 
followed for internal calls)

Being able to do what I'm doing is very useful for languages with 
lightweight normal calling needs and dynamic compilation. (Like, say, 
Forth) I'm willing to lose the capability if it just doesn't work 
out, but I'd rather not if we can avoid it.
--
Dan



PMC initialization

2003-10-25 Thread Dan Sugalski
Okay, it's time to address this. It's damned useful to be able to 
pass in initialization data to a PMC--much more sensible to do it all 
in one go, rather than separate new/init methods. And, unfortunately 
that's somewhat problematic at the moment, as there are all sorts of 
reasonable ways to pass in init data. So, time for a decision. Or, 
rather, reopening of the discussion.

*Unless* someone comes up with a Better Idea (this is your chance!),
let's go with two init methods. The first takes no parameters, as the
plain init does now, and builds an empty PMC. The second assumes its
parameters are in the registers, with standard calling conventions,
and goes from there.
--
Dan



Re: Ordered destruction and object graph traversal

2003-10-25 Thread Gordon Henriksen
On Monday, October 20, 2003, at 11:40 , Jeff Clites wrote:

> My solution was to define a new vtable method--I've called it visit(),
> though the name's not the important part--to which you pass a callback
> (plus an optional context argument). Its job is to invoke that
> callback on each of its referenced "children" (also passing the
> context object with each invocation), and that's it. The PMC doesn't
> know or care what the callback does--in a GC context the callback may
> set a flag on what's passed in and stash it for later processing; in
> another context it may be used inside of an iterator which vends out
> objects one-by-one and guards against loops.

Great! This is exactly the fundamental operation I'd been musing would
be the best building block to enable serialization and pretty-print
operations, and I'm sad to see that your post has been Warnocked. :(

This mechanism is excellent in that it enables both breadth-first and 
depth-first traversals, and is neutral to whether a pointer is traversed 
or not: The client (callback) can decide that based upon the live bit or 
the ordered destruction bit or a seen table or the phase of the moon.
Nor does this by itself imply any resource consumption. And it doesn't
affect concurrency; threading is preserved. Great!
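
As a rough illustration of that neutrality, here is a Python sketch (not Parrot code; Node, depth_first and breadth_first are hypothetical names). The visit() method only hands each child to the callback, and the two clients build depth-first and breadth-first walks, each with its own seen table, on top of it.

```python
from collections import deque


class Node:
    """Stand-in for a PMC that references other PMCs."""

    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)

    def visit(self, callback, context):
        # Hand each directly referenced child to the callback; nothing more.
        for child in self.children:
            callback(child, context)


def depth_first(root):
    order, seen = [], set()

    def callback(obj, context):
        if id(obj) in seen:       # the client owns the seen table
            return
        seen.add(id(obj))
        order.append(obj.name)
        obj.visit(callback, context)  # recurse immediately

    callback(root, None)
    return order


def breadth_first(root):
    order, seen, queue = [], {id(root)}, deque([root])

    def callback(obj, pending):
        if id(obj) not in seen:
            seen.add(id(obj))
            pending.append(obj)   # stash for later processing

    while queue:
        obj = queue.popleft()
        order.append(obj.name)
        obj.visit(callback, queue)
    return order


# A small graph with a shared child and a cycle back to the root.
d = Node("d")
b = Node("b", [d])
c = Node("c", [d])
a = Node("a", [b, c])
d.children.append(a)
```

Both walks terminate on the cycle because the guard lives in the callback, not in visit() itself.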

Serialization is a very natural client of this API, and at least some of 
the arguments surrounding serialization should be lessened—because the 
core technology is amenable to use by various possible algorithms, and 
so deciding upon those algorithms early becomes less vital. 
Serialization simply becomes an easier problem with this in place.

I would venture, though, that DoD may well need a separate and heavily 
optimized implementation, purely for efficiency's sake.

—

Gordon Henriksen
[EMAIL PROTECTED]


Re: cvs commit: parrot/src runops_cores.c

2003-10-25 Thread Leopold Toetsch
Dan Sugalski <[EMAIL PROTECTED]> wrote:
> At 10:10 AM +0200 10/25/03, Leopold Toetsch wrote:

> Oh, it certainly can be an absolute address, if you know what the
> address is when you're generating the code.

Did you ever try what the assembler considers as needing fixup: a
_non_local label? I don't know if that's the problem. But local labels
(w/o underscores) are fixed up immediately, while global
labels are fixed up later.

> 2drop, for example, drops the two words on the top of the stack and
> is really just "drop drop", and can be build from existing words.
> (drop, in this case) The generated code for the word *should* look
> like:

> jsr .drop
> jsr .drop
> ret

Yep, that's right. It would be slightly simpler if there were one or
two tests in the forth directory. Anyway, after sneaking into "info
gforth", I figured out that ": bla 2 + ; 1 bla ." would compile the
"bla" opcode and print 3 :)

I didn't look at that immediate thing yet, but it would be fine if you
could extract some PASM lines, that exhibit this inconsistency. But IIRC
gets the integer argument fixed up, while the constant isn't. This is
due to the fact, that opcodes don't have any information, which argument
is a branch offset or address. There is only one assumption, that, when
there is a branch, the last argument of one opcode is the branch offset.

So during label fixup there are some hardwired checks like "is this a
set_addr" or such, and if yes, the second argument gets fixed up.

We should really have a syntax in core.ops that clearly states: here is
a label or such.
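
A sketch of what such a declaration would buy, in Python rather than core.ops syntax. The OP_SIGNATURES table and its "label" tag are hypothetical and do not reflect the real op definitions; the point is only that the fixup pass consults the declared argument kinds instead of hardwired op names.

```python
# Hypothetical op signatures: every argument is tagged with its kind.
# This table is an assumption for illustration, not Parrot's core.ops.
OP_SIGNATURES = {
    "set_addr": ("reg", "label"),
    "jsr": ("label",),
    "set": ("reg", "int"),
    "ret": (),
}


def fixup(program, labels):
    """Resolve label arguments, guided only by the declared argument kinds."""
    fixed = []
    for op, *args in program:
        kinds = OP_SIGNATURES[op]
        new_args = [labels[arg] if kind == "label" else arg
                    for kind, arg in zip(kinds, args)]
        fixed.append((op, *new_args))
    return fixed


program = [
    ("set_addr", "I0", "drop"),
    ("jsr", "drop"),
    ("set", "I1", 5),
    ("ret",),
]
fixed = fixup(program, {"drop": 42})
```

With the kinds declared per argument, the register form and the immediate form are treated the same way, and no "is this a set_addr" special case is needed.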

> I'm OK both with different bytecode
> segments sharing a constant table, as well as mandating that new
> segments may, if given the appropriate flags at creation time, can
> share the existing constant segment.

That's the way to go. The compile stuff was a joke in the first place to
get Jerome's bfc compiler running. The first implementation just used the
(one and only) constant table and generated a new bytecode segment. But
generally, an eval()ed piece of code could produce constants en masse,
never going away and finally exhausting memory.

So the plans (tm) are, to make such (compiled) code segments PMCs, which
can go out of scope like any other closure or such, and that get cleaned
up by DOD.

Coming back to your forth case: you didn't have constants in the
compiled code, so your hacks are working. But for general usage we need
some flags stating what the compiled code should share with the calling
bytecode. This concerns not only the constant table but all
segments, like debug (currently line numbers only).

The general scheme looks like a tree:

  DIRECTORY
    BYTECODE
    CONSTANTS
    DEBUG
    ...
    EVAL_1_DIR
      EVAL_1_BYTECODE
      ...

A packfile directory can hold arbitrary other segments, which can be
directories again ad infinitum. We need some syntax to assign constant
or other segments to the parent or a new dir segment. Bytecode always
has to be in a new segment, we can't expand mmap()ed code nor JIT code.
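
One possible reading of that scheme, sketched in Python. The Directory class and its find method are hypothetical, and falling back to the parent directory when a segment is missing is an assumption made for this sketch, not something decided above.

```python
class Directory:
    """Hypothetical packfile directory: named segments, possibly nested."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.segments = {}  # segment name -> payload or nested Directory

    def add(self, name, payload):
        self.segments[name] = payload
        return payload

    def add_dir(self, name):
        return self.add(name, Directory(name, parent=self))

    def find(self, name):
        # Look in this directory first, then fall back to the parents.
        directory = self
        while directory is not None:
            if name in directory.segments:
                return directory.segments[name]
            directory = directory.parent
        raise KeyError(name)


root = Directory("DIRECTORY")
root.add("BYTECODE", ["main code"])
root.add("CONSTANTS", ["hello"])

# Compiled code always gets its own bytecode segment in a new directory.
eval_dir = root.add_dir("EVAL_1_DIR")
eval_dir.add("EVAL_1_BYTECODE", ["eval code"])

shared = eval_dir.find("CONSTANTS")  # not found locally: the parent's table
```

Giving EVAL_1_DIR its own CONSTANTS segment would shadow the parent's, which is the "new dir segment" choice; leaving it out is the "share with the parent" choice.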

This also means that there will be some restructuring of
interpreter->code (and ->constants). The former is now the packfile and
will finally be a directory segment; the shortcuts like constants or jit_code
will probably die, as they make a mess out of code segment switching ...

leo


Re: PMC initialization

2003-10-25 Thread Leopold Toetsch
Dan Sugalski <[EMAIL PROTECTED]> wrote:
> The seconds assumes its
> parameters are in the registers, with standard calling conventions,
> and goes from there.

Seems too heavy to me. We already have init_pmc (taking one additional
initializer) and init_pmc_props, taking a NULL or real initializer PMC
plus one property hash. The property hash can take everything, as can
the first one. I currently don't see the need for more variants.

leo


Re: [RfC] key strings/numbers/PMCs

2003-10-25 Thread Leopold Toetsch
Dan Sugalski <[EMAIL PROTECTED]> wrote:
> At 2:19 PM +0200 10/25/03, Leopold Toetsch wrote:
>>Currently there is no simple way, to packout a Key that has number
>>or string key members. PackFile_Constant_pack() for PFC_KEY does a
>>linear lookup (find_in_const) to get at the index of the string or
>>number in the constant table. This is really ugly.

> I'd always assumed that key constants, when being reconstituted from
> the bytecode files, would just get strings built for them. Going with
> the constant table offset works, but it has the issue of not being
> able to pass keys across bytecode segments and needing entries made
> in the constant tables for dynamically created keys.

Constant keys referring to constant string (or number) items can't cross
bytecode borders. I'm speaking of

  set P0, P1[I0;"key";"bla"]

The key items for "key" and "bla" have to be in the same constant table.
They can't be shared, because to generate such a key, you first have to
generate "key" and "bla" constant string entries, which are referenced
in the composite key by *index* (and might already be string constants
with an assigned index).

But on unpack, the key components take real strings. That effectively
prohibits packing them again except with this ugly hack: 'where is the
constant index of a string looking like "bla"'.

> The lack of the string keyed variant is on purpose. Or, rather, the
> _int variant was a fast optimization put in to drop the overhead for
> array lookup. My assumption was, at the time, that hash lookups were
> expensive enough that there wasn't enough win in a shortcut to
> justify the extra space in all the vtables.

And the assumption now is ? :)

WRT vtable space: Please look at currently totally unused _same variants
- not speaking of some _keyed :)

leo


[BUG] IMCC looking in P3[0] for 1st arg

2003-10-25 Thread Steve Fink
I am getting a seg fault when doing a very simple subroutine call with
IMCC:

.sub _main
newsub $P4, .Sub, _two_of
$P6 = new PerlHash
.pcc_begin prototyped
.arg $P6
.arg 14
.pcc_call $P4
after:
.pcc_end
end
.end

.pcc_sub _two_of non_prototyped
.param PerlHash Sunknown_named3
.param int mode
.pcc_begin_return
.pcc_end_return
.end

The problem is that IMCC is checking to see whether the 1st argument
is of the correct type (PerlHash), but it looks for the argument in
P3[0], when in fact it isn't an overflow arg and so is in P5. P3, in
fact, is null and so parrot seg faults.

Oddly, if I take away the int parameter (and corresponding argument),
it does not crash. But this also seems to remove the typecheck
entirely.