Re: Bytecode PDD

Leopold Toetsch Fri, 29 Sep 2006 02:52:35 -0700

On Friday, 29 September 2006 01:39, Jonathan Worthington wrote:
> Hi,
>
> I've checked in the proposed bytecode PDD and also most of the changes
> that I discussed with Allison earlier today. Feedback on it would be
> greatly appreciated.


Great work, thanks.

> A couple of open questions on this are:
>
> 1) Is keeping the Parrot version number around sensible and if so, is
> having it as the version of Parrot that wrote the packfile useful?  I
> guess it's helpful if we need workarounds for bugs in previous versions
> of Parrot in later versions to know this. Other thoughts?

I think it's useful.

> 2) How should we handle changes to the core Parrot library (mostly PMCs,
> but also consider anything we promise is available)? Should this bump
> the packfile version number too? Or do we want some other mechanism to
> handle this?

This is still a can of worms. Not so much changes to PMC type numberings per 
se (which should invalidate PBCs) but the dynamic nature of these resources.

I'll try to dump my thoughts.

A PBC refers - via its contents - to several possibly dynamically extendable 
resources. A probably incomplete list is:

1) PMCs   [*1]
2) charsets
3) encodings
4) HLLs 
5) opcodes

(see also src/pmc/parrotinterpreter.pmc:547 ff)  [*2]

Whenever such items are referred to by a numeric index and that index is part 
of the PBC, we have a possible problem.

Let's look at opcodes. These are present in the PBC as an index (the opcode 
number). Say we have a packfile with some dynamic opcodes inside:

  opcodes
  [ 10, 20, 30, 1300, 1301, 0 ]

Let's say opcodes #1300 and #1301 are from some dynamic opcode lib. Now this 
PF gets loaded into an interpreter which already has dynamic opcode 
librar{y,ies} loaded. In the best case, it was the same opcode library and 
the opcode numbers just happen to match. But that's pure luck.
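To make the load-order dependence concrete, here is a minimal sketch in Python. It is not Parrot code: the `Interp` class, the library names and the assumption that core opcodes occupy numbers 0..1299 are all made up for illustration, and every element of the list is treated as an opcode number, as in the simplified example above.

```python
# Hypothetical model, not Parrot's real data structures.
CORE_OPS = 1300  # assumption: core opcodes occupy 0..1299


class Interp:
    """Toy interpreter that numbers dynamic opcodes in load order."""

    def __init__(self):
        self.op_names = [f"core_{i}" for i in range(CORE_OPS)]

    def load_oplib(self, name, ops):
        """Append the library's ops and return the first number assigned."""
        base = len(self.op_names)
        self.op_names.extend(f"{name}.{op}" for op in ops)
        return base


# The interpreter that wrote the packfile had only "myops" loaded.
compiler = Interp()
compiler.load_oplib("myops", ["foo", "bar"])
packfile_ops = [10, 20, 30, 1300, 1301, 0]

# The interpreter that loads it already has another oplib in place.
runner = Interp()
runner.load_oplib("otherops", ["baz", "qux"])
runner.load_oplib("myops", ["foo", "bar"])

meant = [compiler.op_names[i] for i in packfile_ops]
got = [runner.op_names[i] for i in packfile_ops]
print(meant[3], "->", got[3])
```

The same numbers silently name different operations in the second interpreter, and nothing in the bare index list lets the loader notice.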

The same argument holds for all the other resources listed above.

BTW encodings seem to be missing in the PDD - and we can't do:
   "Character set, copied from the string structure." 
because this is a pointer. We need an index into the available 
charsets/encodings.

So what I think we have to do is:

- store a metatable of such resources, this is basically for:
  2-4) a list of names / library PMCs, which describes how to load 
       the resource
       (or NULL, if this resource is a core resource)
  1,5) same + range of indices

- when a PBC is now loaded, we'd have to merge this information with the 
in-memory structures already in the interpreter. We can at least detect if 
there's a collision. Better still would of course be to relocate the indices 
and use this mapping during unpacking. Unfortunately we can't do the 
relocation of opcodes for mmap-ed bytecode in memory.
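The merge step could be sketched roughly as follows, again in Python and again with invented names: `build_relocation`, `relocate` and the `(library, base, count)` tuple layout are assumptions for illustration, not anything the PDD specifies. As before, the sketch pretends that every element of the stream is an opcode number; a real unpacker would have to skip operands.

```python
def build_relocation(pf_table, interp_bases):
    """Map packfile opcode numbers to the loading interpreter's numbers.

    pf_table: list of (library, base, count) entries stored in the
              packfile; library is None for core resources.
    interp_bases: dict of library name -> first opcode number that the
                  loading interpreter assigned to that library.
    """
    reloc = {}
    for lib, base, count in pf_table:
        if lib is None:
            continue  # core resource: indices are fixed
        if lib not in interp_bases:
            raise LookupError(f"oplib {lib!r} must be loaded first")
        new_base = interp_bases[lib]
        for offset in range(count):
            reloc[base + offset] = new_base + offset
    return reloc


def collisions(reloc):
    """Packfile indices that mean something else in this interpreter."""
    return sorted(old for old, new in reloc.items() if old != new)


def relocate(ops, reloc):
    """Return a rewritten copy; the original stream is left untouched."""
    return [reloc.get(op, op) for op in ops]


pf_table = [(None, 0, 1300), ("myops", 1300, 2)]
interp_bases = {"otherops": 1300, "myops": 1302}

reloc = build_relocation(pf_table, interp_bases)
clashing = collisions(reloc)
fixed = relocate([10, 20, 30, 1300, 1301, 0], reloc)
print(clashing, fixed)
```

`relocate` builds a new list rather than patching in place, which is exactly what a read-only mmap-ed segment rules out; in that case detecting the collision and refusing to load is the most the loader can do.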

[*1] theoretically PMCs shouldn't be a problem, as these are usually looked up 
dynamically, but it depends of course on the usage of dynamic oplibs :-(

  .loadlib "mypmc"
  ...
  new P0, .MyPMC   # new_p_ic  .MyPMC is referred to by index
  new P0, 'MyPMC'  # referenced by name

For the index case, we'd again have the described problem.
(The .Type syntax is always fine for core PMCs, which don't change for the 
validity range of the packfile).

[*2] This resides currently in the interpreter PMC, but should be moved into 
the future PackFile PMC.

> Again, comments and/or suggestions on anything else in the proposal are
> very welcome! :-)

I've some thoughts re PF PMCs too - later.

> Thanks,
>
> Jonathan

leo
