Re: Namespaces, part 2

Jeff Clites Tue, 05 Oct 2004 09:51:21 -0700

On Oct 4, 2004, at 8:25 AM, Dan Sugalski wrote:

Okay, since we've got the *basic* semantics down (unified namespace, namespace entries get a post-pended null character)


I'll ask again, what about subs? Do they get name-mangled too?

$Px = find_global [key; key; key], 'name'

As Leo pointed out in a thread of the same name last year, this is a new syntax--keyed access on nothing. I assume you mean for [key; key; key] to serve as a sort of literal syntax for an array of strings? If so, we should make the syntax for that sort of thing explicit. Or we can do what Leo suggested (in that thread, and again more recently), and write this as keyed access on a particular namespace (typically, the root namespace):

        $P0 = root_namespace
        $P1 = find_global $P0['Foo'; 'Bar'; 'Baz'], 'xyzzy'

The case where the namespace is specified by string I think should go away.

I assume by "specified by string", you mean a string such as "Foo::Bar"? If so, I agree, and I'll note that it would be possible to write per-language utility functions to parse these apart--compilers will need to be able to process these, in some form, and so might programmers in general, but it doesn't need to be an op.
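A minimal sketch of such a utility, written in Python purely for illustration. The function name and the rule that empty segments are rejected are my assumptions, not part of any proposal; a real Perl-side helper would also have to decide what a leading "::" means.

```python
def split_qualified_name(name, separator="::"):
    """Split a language-level name such as 'Foo::Bar::Baz::xyzzy'
    into (namespace_keys, final_name).

    Hypothetical helper: the separator is a parameter so each language
    can supply its own ('::' for Perl, '.' for Python, and so on).
    """
    parts = name.split(separator)
    # Assumption: empty segments (e.g. 'Foo::::Bar') are treated as errors.
    if any(part == "" for part in parts):
        raise ValueError("malformed qualified name: %r" % (name,))
    return parts[:-1], parts[-1]


keys, final = split_qualified_name("Foo::Bar::Baz::xyzzy")
```

Here keys would be the list to hand to a keyed lookup and final the name of the thing inside that namespace; none of this needs to be an op.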

That is, if I have the namespace Foo, Foo::Bar, and Foo::Bar::Baz, and then have a thing named xyzzy in Foo::Bar::Baz, I can get it by doing:
   $P1 = find_global ['Foo'; 'Bar'; 'Baz'], 'xyzzy'
or
   $P0 = find_global ['Foo']
   $P1 = find_global $P0, ['Bar'; 'Baz']
   $P2 = find_global $P1, 'xyzzy'

...

That is, if we don't specify the name of the thing in the namespace we get the namespace PMC itself, which we can then go look things up in. This is handy since it means code can cache the namespace PMC so if there are a dozen variables to go fetch out, we don't have to do the multi-level hash lookup each time. (That'd be nasty)


Several things here:

1) By your name mangling scheme, it seems we don't need a special syntax for looking up a namespace--it's just something else in the namespace above it, so looking up "Foo::Bar" would be:

        $P0 = root_namespace
        $P1 = find_global $P0['Foo'], 'Bar'

But if everything's lumped together and name-mangled in a non-segmented namespace, then you don't need find_global, you just need this (as Leo suggested):

        $P0 = root_namespace
        $P1 = $P0['Foo'; 'Bar']         # P1 holds Foo::Bar
        $P2 = $P1['Baz'; 'xyzzy']       # P2 holds xyzzy from Foo::Bar::Baz
        $P2 = $P0['Foo'; 'Bar'; 'Baz'; 'xyzzy']  # same thing

I was formerly a proponent of keeping the final "thing" in a separate parameter, but with your current mangling-plus-flat-namespace proposal, I don't see a reason for it.

2) What's putting in the trailing null bytes (HLL programmer, compiler, or Parrot's implementation of the lookup)? I'd assume it would be the HLL compiler, so shouldn't the above be:

        $P2 = $P0['Foo\0'; 'Bar\0'; 'Baz\0'; '$xyzzy']  # or at least, shouldn't the PASM variant look like this?

(In particular, I noticed you said, "a thing named xyzzy", but didn't include a sigil?)

3) Canonical decomposition: Consider the following:

        $P0 = root_namespace
        $P2 = $P0['Foo\0'; 'Bar\0'; 'Baz\0'; '$xyzzy']

v.

        $P0 = root_namespace
        $P0 = $P0['Foo\0']
        $P0 = $P0['Bar\0']
        $P0 = $P0['Baz\0']
        $P2 = $P0['$xyzzy']

(The example's the same even with your other syntax.)

These appear equivalent, _but_ with the former, the root namespace could take into consideration the whole list of keys, and decide what to return based on that information; with the latter, it only sees 'Foo\0'. So in the general case (in light of namespace tie-ing), they could do different things. The upshot of this is that we need to make explicit the algorithm followed by a multi-keyed lookup--basically, iterative v. recursive, if you think it through. Recursive is more flexible and powerful (because any namespace along the way can "short circuit" the rest of the lookup), but means that HLL compilers must emit the first option above (with an iterative approach, where Parrot controls the algorithm and namespaces have less control, a compiler would have a choice).
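To make the difference concrete, here is a rough model in Python. It is not Parrot code: the Namespace and TiedNamespace classes, the lookup method, and iterative_lookup are all hypothetical names invented for this sketch. It only shows how a tied root namespace that sees the whole key list can answer differently from a walk that feeds it one key at a time.

```python
class Namespace:
    """Plain namespace: a dict of entries, resolved recursively."""

    def __init__(self, entries=None):
        self.entries = dict(entries or {})

    def lookup(self, keys):
        # Recursive protocol: this namespace sees every remaining key
        # and hands the tail on to the next namespace down.
        head, rest = keys[0], keys[1:]
        value = self.entries[head]
        if not rest:
            return value
        return value.lookup(rest)


class TiedNamespace(Namespace):
    """Hypothetical tied namespace that short-circuits one full path."""

    def __init__(self, entries, shortcut_path, shortcut_value):
        super().__init__(entries)
        self.shortcut_path = list(shortcut_path)
        self.shortcut_value = shortcut_value

    def lookup(self, keys):
        # Only possible when the whole key list is visible at once.
        if list(keys) == self.shortcut_path:
            return self.shortcut_value
        return super().lookup(keys)


def iterative_lookup(namespace, keys):
    # Iterative protocol: the driver controls the walk, so each
    # namespace is only ever asked about a single key.
    current = namespace
    for key in keys:
        current = current.lookup([key])
    return current


path = ["Foo\0", "Bar\0", "Baz\0", "$xyzzy"]
baz = Namespace({"$xyzzy": "real value"})
bar = Namespace({"Baz\0": baz})
foo = Namespace({"Bar\0": bar})
root = TiedNamespace({"Foo\0": foo}, path, "value chosen by tied root")

recursive_result = root.lookup(path)             # root sees all four keys
iterative_result = iterative_lookup(root, path)  # root sees only 'Foo\0'
```

With the recursive protocol the tied root answers the lookup itself; with the iterative one it never gets the chance, and the lookup falls through to the entry stored in Baz.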

4) Similar to (3), you can't do much caching of a namespace PMC, in light of tying and such. Consider:

        $a = $Foo::Bar::bar;
        somesub();
        $b = $Foo::Bar::zoo;

A compiler can't safely optimize this to cache the lookup of the Foo::Bar namespace, because somesub() might perform tying of one of the relevant namespaces, or otherwise rearrange the hierarchy, and you'd get the wrong $zoo. Even without somesub(), in light of namespace tying it's possible that the lookup of $bar actually caused a namespace rearrangement, so you can't even cache across adjacent lookups (or a lookup and a store, such as $Foo::Bar::bar = $Foo::Bar::zoo).
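The hazard is easy to reproduce in a toy model. In this Python sketch nested dicts stand in for namespaces and the body of somesub() is invented; it is just one example of a call that rearranges the hierarchy behind the compiler's back.

```python
# Nested dicts stand in for the namespace hierarchy Foo::Bar.
root = {"Foo": {"Bar": {"bar": 1, "zoo": 2}}}


def somesub():
    # Hypothetical rearrangement: replace Foo::Bar with a new namespace.
    root["Foo"]["Bar"] = {"bar": 10, "zoo": 20}


# What an over-eager compiler might emit: look up Foo::Bar once, reuse it.
cached_namespace = root["Foo"]["Bar"]
a = cached_namespace["bar"]
somesub()
b_from_cache = cached_namespace["zoo"]   # reads the old, detached namespace

# What the source code actually asks for: a fresh lookup after the call.
b_from_fresh_lookup = root["Foo"]["Bar"]["zoo"]
```

The cached reference still points at the namespace that used to be Foo::Bar, so the two reads of zoo disagree.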

5) Python. Language crossing issues aside, there are issues with Python itself. Python does not know, at compile-time, what's a namespace and what's some other sort of object. Consider:

        x = a.b.c.d

Here a, a.b, and a.b.c might all be modules, or they might just be attributes of non-module objects.

That has some consequences: (a) The Python HLL compiler can't name-mangle namespace/module names with a null byte, since it doesn't know which names are namespaces, nor can the lookup process really do this behind the scenes, since it doesn't know either--namespaces are just objects with lots of attributes. (b) Lookup will have to be "one level at a time"--the compiler can't emit find_global ['a'; 'b'; 'c']..., since it can't know it's doing a namespace lookup (though the keyed-access approach might save us). (c) If we use keyed access, a Python compiler could potentially emit $P0['a'; 'b'; 'c'; 'd'], but you wouldn't get this from Python-to-Parrot bytecode translation, so in light of (3), with namespace tying you could get different behavior for the same Python code depending on whether you bytecode translated or compiled direct-to-Parrot, so you'd probably want to stick to one-level-at-a-time even for direct-to-Parrot compilers. (d) In light of this, Python and Perl might "respond" very differently to namespace tying, since the lookup algorithms might be different.
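A small Python illustration of why the compiler can't tell. The objects here are made up for the example: a is a module, a.b is a plain object, and a.b.c is a module again, yet the expression a.b.c.d resolves the same way at every step, one attribute at a time.

```python
import functools
import types


def resolve(root, dotted_path):
    """Resolve 'b.c.d' against root one attribute at a time."""
    return functools.reduce(getattr, dotted_path.split("."), root)


# Illustrative objects only: a mix of modules and non-module objects.
a = types.ModuleType("a")
a.b = types.SimpleNamespace()
a.b.c = types.ModuleType("c")
a.b.c.d = 42

x = resolve(a, "b.c.d")   # same result as the expression a.b.c.d

# Which links in the chain are modules is only known at run time.
is_module = [isinstance(obj, types.ModuleType) for obj in (a, a.b, a.b.c)]
```

Nothing in the source text of x = a.b.c.d distinguishes the module links from the others, which is why per-level mangling isn't available to a Python compiler.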

6) I don't think you've covered how Python code would access Perl variables in your proposed scheme.

Making one section of a namespace an alias for another part is just a matter of getting the PMC for the section of the namespace tree you want to alias and sticking it into the namespace it should hang off of. Right *now* I'm not inclined to have a store_global variant to do this, but if we want to completely hide all namespace operations behind ops I'm OK with that.

Now, with that out of the way, let's talk about overlaid namespaces.

I don't think we need special ops for namespace overlaying. If we have an API powerful enough to allow one to create namespaces with custom behavior, then "overlaying" is just creating a custom namespace which delegates lookups into other namespaces, and then just storing this into the appropriate slot. I think we'd get overlaying for free. For instance, this is trivial to do in Python today:

        >>> class wrapper:
        ...     def __init__(self, modulename):
        ...         self.wrapped = __import__(modulename)
        ...     def __getattr__(self, attrname):
        ...         print "Getting " + attrname
        ...         return getattr(self.wrapped, attrname)
        ...
        >>> os = wrapper("os")
        >>> os.O_APPEND
        Getting O_APPEND
        8
        >>> os.stat("/dev/null")
        Getting stat
        (8630, 27649924L, 26798244L, 1, 0, 0, 0L, 1095909600, 1096963882, 1096963882)

Here, I created something which looks like the os module, but actually logs all lookups before delegating to the "real" os module.

It would be similarly simple to create something which searches a list of namespaces:

        mynamespace = multiwrapper("os", "sys", "xml")

        mynamespace.foo()  # searches os, then sys, then xml
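For concreteness, one way multiwrapper might be written. This is only a sketch in present-day Python 3; the class body is my guess at the obvious implementation, and the attributes probed at the end are just convenient examples.

```python
import importlib
import os
import sys


class multiwrapper:
    """Searches a list of modules, in order, for each attribute lookup."""

    def __init__(self, *modulenames):
        self.wrapped = [importlib.import_module(name) for name in modulenames]

    def __getattr__(self, attrname):
        # __getattr__ runs only when normal lookup fails, so self.wrapped
        # itself is found the ordinary way and never reaches this method.
        for module in self.wrapped:
            try:
                return getattr(module, attrname)
            except AttributeError:
                continue
        raise AttributeError(attrname)


mynamespace = multiwrapper("os", "sys", "xml")
found_in_os = mynamespace.getcwd is os.getcwd      # first module wins
found_in_sys = mynamespace.maxsize == sys.maxsize  # falls through to sys
```

The overlay order is simply the order of the list, and changing it is an ordinary list operation on mynamespace.wrapped.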

If we can do this, there's no need for dedicated ops.

JEff
