Some namespace notes

Jeff Clites Tue, 13 Jan 2004 11:17:59 -0800

Here are some notes on namespaces, picking up a thread from about a month ago:

On Dec 11, 2003, at 8:57 AM, Dan Sugalski wrote:

That does, though, argue that we need to revisit the global access opcodes. If we're going hierarchic, and we want to separate out the name from the namespace, that would seem to argue that we'd want it to look like:

find_global P1, ['global', 'namespace', 'hierarchy'], "thingname"

That is, split the namespace path from the name of the thing, and make the namespace path a multidimensional key.

I definitely agree that we should have separate slots for namespace and name, as you have above. So I think the discussion boils down to whether a namespace specifier is logically a string or an array of strings.

Short version: I was originally going to argue for fully hierarchical namespaces, identified as above, but after turning this over in my head for a while, I came to the conclusion that namespaces are not conceptually hierarchical (especially as used in languages such as Perl5 and Java, at least), so I'm going to argue for a single string (rather than an array) as a namespace identifier.

Here's my framing of the general problem. I think there are 3 basic options:

1) No namespaces. That would mean we might have variables named "Foo::Bar::baz", but to parrot that would just be a variable with a funny name (it wouldn't imply any sort of nesting of variables or namespaces).

2) Two-level namespaces. That would mean that parrot has the concept of "look up variable 'baz' in namespace 'Foo::Bar'", but no concept of nested namespaces--"Foo::Bar" is just a funny namespace name (no nesting of namespaces implied).

3) Full hierarchical namespaces. So parrot knows how to "look up variable baz inside namespace Bar inside namespace Foo". Parrot would never need to see the syntax ("::" v. "." v. whatever) used by different languages to specify nested namespaces--their compilers would assemble these as arrays, as in Dan's example above.

(Also, (2) v. (1) is what Tim was indicating with:

        *{"Foo\0Bar\0Baz"}->{var};
or
        *{"Foo\0Bar\0Baz\0var"};

in his post in the previous thread, I believe.)
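As a rough illustration of how the three options differ, here is a minimal Python sketch. It is not Parrot code: the plain dictionaries and the function names (lookup_flat, lookup_two_level, lookup_nested) are assumptions made up for this example, standing in for whatever symbol storage parrot would really use.

```python
# Illustrative model only: plain dicts stand in for parrot's symbol storage.

# (1) No namespaces: one flat table; the full name is just a funny key.
flat = {"Foo::Bar::baz": 42}

# (2) Two-level: namespace name -> table of names; "Foo::Bar" is opaque.
two_level = {"Foo::Bar": {"baz": 42}}

# (3) Fully hierarchical: each namespace is a table that may hold others.
nested = {"Foo": {"Bar": {"baz": 42}}}


def lookup_flat(name):
    """Option (1): a single lookup on the whole string."""
    return flat[name]


def lookup_two_level(namespace, name):
    """Option (2): one lookup for the namespace, one for the name."""
    return two_level[namespace][name]


def lookup_nested(path, name):
    """Option (3): walk the namespace path one element at a time."""
    table = nested
    for part in path:
        table = table[part]
    return table[name]
```

In this model, option (2) always costs two lookups no matter how many "::" separators appear in the namespace name, while option (3) costs one lookup per path element.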

I think most would probably agree that (1) is out--so the question is (2) v. (3).

I think there are 2 considerations:

A) What does a hierarchy give us?

B) What kind of cross-language compatibility do we need?

As to (A), I don't think the hierarchy actually matters much. What I mean is that I don't think it's actually significant to say that the namespace A::B::C is "inside" of the namespace A::B. For instance, $A::B::C::var won't "fall back" to finding $A::B::var -- they're really just separate namespaces which would have worked just the same if they'd been called "ABC" and "AB" or "Foo" and "Bar". The hierarchy is only used to conceptually organize things (for humans), not really at runtime. Notably, this is the viewpoint taken by Java and I believe by Perl5. (For instance, Java has a "com.sun.media.sound" namespace, but not a "com" namespace.)
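The "no fall back" point can be shown with a small Python sketch. The namespaces table and this Python find_global function are hypothetical stand-ins, not parrot's actual behaviour; they only model namespace names as opaque strings.

```python
# Hypothetical two-level store: namespace names are opaque strings, so
# "A::B::C" has no special relationship to "A::B".
namespaces = {
    "A::B": {"var": "from A::B"},
    "A::B::C": {},
}


def find_global(namespace, name):
    """Look up name in exactly the named namespace; never search a 'parent'."""
    return namespaces[namespace][name]
```

Here find_global("A::B", "var") succeeds, while find_global("A::B::C", "var") raises KeyError rather than finding the variable in "A::B", just as it would if the two namespaces had been called "AB" and "ABC".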

So unless I'm missing some uses of a hierarchy, I think that (A) doesn't argue for (3) over (2), so it boils down to consideration (B).

For (B), what I mean is: Do we want the following to refer to the same package/namespace:

in Perl:
                use Foo::Bar::Baz;

in Java:
                import Foo.Bar.Baz;

If we do, then I say we should go with (3), and use the array-based method of specifying a namespace which Dan indicated above. Then, it's up to the individual compilers to pick apart these syntaxes into the same arrays: ['Foo', 'Bar', 'Baz'].
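What "picking apart" would amount to under option (3) can be sketched in Python. The two function names are invented for this example; a real compiler would do the split while parsing rather than on strings at runtime.

```python
def perl_namespace_path(name):
    """Split a Perl-style package name on its '::' separator."""
    return name.split("::")


def java_namespace_path(name):
    """Split a Java-style package name on its '.' separator."""
    return name.split(".")
```

Both perl_namespace_path("Foo::Bar::Baz") and java_namespace_path("Foo.Bar.Baz") produce the same list of three strings, so under this scheme the two source-level names would identify the same namespace.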

On the other hand, maybe we don't want this. Maybe we want these to refer to different packages/namespaces. In Perl, if I want to actually instantiate a java.lang.String, maybe it's clearer to just really treat the class name as "java.lang.String". I actually think it should be up to the individual language implementers to decide if they want to "normalize" during compilation to a common syntax for specifying package names, but I think it makes more sense for them to _not_ normalize, and in Perl just "use java.lang.String" to pull in that Java package.

So I'm arguing for (2), which says: Namespaces don't conceptually nest.

Now, that said, this really just argues that most languages actually use 2-level namespaces in their syntax--that "Foo::Bar::Baz" doesn't really indicate nesting. But, we can certainly _allow_ namespace nesting--it just wouldn't have a one-line syntax. What I'm thinking of here is having ops like this:

-----
# shortcut for lookup of "thingname" in global namespace

find_global P1, "thingname"
-----
# lookup "thingname" in namespace "MyPackage::Foo"; really, this is: find namespace "MyPackage::Foo" inside global namespace, and lookup "thingname" in that, so it's still a shortcut

find_global P1, "MyPackage::Foo", "thingname"
-----
# here's an alternate, more explicit way to do the same thing. This might be slower for a single lookup (the one-step method may be able to optimize), but faster if you need to do multiple lookups, since you have a direct reference to the namespace which you can re-use. It would be up to the HLL compiler to decide which to use when.

# find "MyPackage::Foo" namespace inside global namespace
find_namespace P2, "MyPackage::Foo"

# find "thingname" using explicit namespace reference
find_global P1, P2, "thingname"
-----
# Here's how you could actually use nested namespaces

# find namespace in global namespace
find_namespace P2, "MyPackage::Foo"

# find namespace using explicit namespace reference
find_namespace P3, P2, "Boo::Bar::Baz"

# find "thingname" inside explicit namespace reference--doesn't really matter that this namespace was looked up inside another
find_global P1, P3, "thingname"
-----

[So that's three find_global variants, and two find_namespace variants--of course, some could have different names if that's clearer.]

That is, you can nest namespaces if you want, but at the parrot level there's no one-line syntax for that, and Perl5 and Java wouldn't use this feature, and Perl6 probably wouldn't.
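To make the semantics of these op variants concrete, here is a minimal Python model. Everything in it is an assumption for illustration: the Namespace class, the GLOBAL object and the sample contents are made up, and the functions return their result instead of writing to a register such as P1.

```python
class Namespace:
    """Hypothetical stand-in for a parrot namespace: two separate tables."""

    def __init__(self):
        self.symbols = {}      # name -> value
        self.namespaces = {}   # namespace name (opaque string) -> Namespace


GLOBAL = Namespace()


def find_namespace(*args):
    """find_namespace(name) or find_namespace(parent, name)."""
    parent, name = (GLOBAL, args[0]) if len(args) == 1 else args
    return parent.namespaces[name]


def find_global(*args):
    """find_global(name), find_global(ns_name, name) or find_global(ns, name)."""
    if len(args) == 1:
        return GLOBAL.symbols[args[0]]
    where, name = args
    if isinstance(where, str):
        # One-step form: resolve the namespace name in the global namespace.
        where = find_namespace(where)
    return where.symbols[name]


# Sample contents, invented for the example.
GLOBAL.symbols["thingname"] = "global thing"
foo = GLOBAL.namespaces["MyPackage::Foo"] = Namespace()
foo.symbols["thingname"] = "thing in MyPackage::Foo"
inner = foo.namespaces["Boo::Bar::Baz"] = Namespace()
inner.symbols["thingname"] = "thing in nested namespace"
```

In this model the one-step call find_global("MyPackage::Foo", "thingname") and the two-step pair find_namespace("MyPackage::Foo") followed by find_global(ns, "thingname") return the same value. The nested namespace is reachable only by chaining find_namespace calls, which mirrors the absence of a one-line syntax for nesting.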

At the conceptual level, the reason that HLLs tend to use 2-level namespaces (rather than fully nested ones) is that really they are a means for humans to coordinate at-a-distance, either (a) so that 2 people can create classes or variables called "Foo" without conflict, because one will be "com.jeff.Foo" and one will be "com.john.Foo", or (b) to keep things such as CPAN organized conceptually. Once you have a namespace you "own" (e.g., com.YourCompany.YourDepartment), there's no need to nest, since you can manage name conflicts within your own namespace (almost by definition).

So, that's my take on namespace nesting and syntax. Option (2) maps better to how HLLs tend to think of and use namespaces, and can still accommodate fully nested namespaces, if any language really needs them.

JEff
