Re: Namespaces, part 1 (new bits)

Jeff Clites Sun, 03 Oct 2004 20:56:40 -0700

More detailed responses are below, but some general comments first:

I think that no matter what the approach, there's an unavoidable mismatch between Perl and Python when it comes to variable naming, so it's going to be a bit awkward to access Perl variables from within Python. I don't see any way around that. Either of the obvious approaches will have issues to deal with; these approaches are:

1) Treat Perl variables as having the sigil as part of the name.

-or-

2) Treat Perl variables as not including the sigil in the name, but have multiple categories.

For Python, the problem with (1) is that no Perl variables will have legal names from Python's point of view. With (2), the variable names are okay for Python, but Python doesn't have the concept of multiple categories of variables.

But that's not as bad as it sounds. With either approach, the integration won't be seamless--using a Perl-originating module (namespace) in Python will involve more work on the Python side than is required when using a Python-originating module (in Python). Here's how either approach could work. Consider the following Perl(5) package as an example:

# Perl5
package Gizmo;
$foo = 1;
@foo = ('a');
%foo = ('b' => 'c');
sub foo { return 1; }


Approach 1) Sigil is part of the name.

        import Perl.Gizmo

        x = Perl.Gizmo.foo # error--not defined
        x = Perl.Gizmo.__dict__['$foo'] # x now holds the value of Perl's $foo
        Perl.Gizmo.__dict__['$foo'] = 'hello' # Perl's $foo now holds a Python string
        # similarly for hashes, arrays, etc.; subs would probably need a sigil, maybe '&'

That would actually work. It's awkward, but it reflects the fact that Python's symbol tables (like Perl's) can deal with entries which you couldn't create with the "normal" Python syntax.
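As a minimal sketch of that point, the following runs in plain Python with no Perl or Parrot involved. The `gizmo` object is only a stand-in built with `types.ModuleType` (an assumption for illustration, not a real Perl.Gizmo binding); it shows that a namespace dictionary accepts sigiled names that attribute syntax cannot spell:

```python
import types

# Stand-in for a Perl-originated namespace; not a real Perl binding.
gizmo = types.ModuleType("Gizmo")

# Sigiled names can't be written as attributes, but the dict accepts them.
gizmo.__dict__['$foo'] = 1
gizmo.__dict__['@foo'] = ['a']
gizmo.__dict__['%foo'] = {'b': 'c'}

plain_foo_defined = hasattr(gizmo, 'foo')  # no unsigiled "foo" exists
x = gizmo.__dict__['$foo']                 # read the "scalar"
gizmo.__dict__['$foo'] = 'hello'           # rebind it to a Python string
y = getattr(gizmo, '$foo')                 # getattr() reaches the same entry
```

The same lookups would presumably work against whatever object the import mechanism hands back, provided it exposes a `__dict__`-like mapping.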

Approach 2) Sigil is not part of the name. Use categories.

        import Perl.Gizmo

        x = Perl.Gizmo.scalars.foo # x now holds the value of Perl's $foo
        Perl.Gizmo.scalars.foo = 'hello' # Perl's $foo now holds a Python string
        x = Perl.Gizmo.arrays.foo # x now holds the value of Perl's @foo

Either approach would work from a Python perspective. The syntax of the second, to my eyes, is a bit less awkward. (Note that even with approach 2, you'd need to resort to the explicit Perl.Gizmo.scalars.__dict__[""] style to access variables with non-ASCII names, but at least you wouldn't have to do that for *all* variables.)

Both approaches require some special Pythonish behavior of the Perl-originated namespace. Parrot's namespace-tie-ing mechanism could probably be leveraged to provide this behavior and avoid putting the burden on the Perl side (i.e., the extra work should be on the import side, since it's reaping the benefit of the language crossing). What I'm thinking here is that when Python tries to import a Parrot module, it actually gets a namespace object which is a wrapper (or adaptor) around the actual Perl namespace. (So in the example above, the Perl.Gizmo that Python gets is actually a special namespace which mediates interaction with the "real" Perl Gizmo namespace.) Here's what the adaptor would need to do in each case:

Case 1) The adaptor would need to expose a __dict__ attribute to provide a hash-like interface to the contents of the namespace. This is necessary because the real Perl namespace won't have the necessary __dict__ variable in place. This would be simple to implement--namespaces are already hash-like.

Case 2) The adaptor would satisfy a request for Perl.Gizmo.scalars.foo by asking the real Gizmo namespace for the entry "foo" in the "scalars" category. This would also be simple to implement if we have category-structured namespaces.
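A rough sketch of what the Case 2 adaptor might look like, written in plain Python. Everything here is hypothetical: the class names `NamespaceAdaptor` and `CategoryView` are made up for illustration, and nested dicts stand in for the "real" category-structured Gizmo namespace, which Parrot would supply in practice:

```python
class CategoryView:
    """Attribute-style access to one category of a namespace."""

    def __init__(self, entries):
        # Bypass our own __setattr__ while storing the backing dict.
        object.__setattr__(self, "_entries", entries)

    def __getattr__(self, name):
        try:
            return self._entries[name]
        except KeyError:
            raise AttributeError(name) from None

    def __setattr__(self, name, value):
        # Writes go straight through to the underlying category.
        self._entries[name] = value


class NamespaceAdaptor:
    """Hypothetical wrapper around a category-structured namespace."""

    def __init__(self, real_namespace):
        self._real = real_namespace

    def __getattr__(self, category):
        try:
            return CategoryView(self._real[category])
        except KeyError:
            raise AttributeError(category) from None


# Nested dicts stand in for the "real" Gizmo namespace.
real_gizmo = {
    "scalars": {"foo": 1},
    "arrays": {"foo": ["a"]},
    "hashes": {"foo": {"b": "c"}},
}
Gizmo = NamespaceAdaptor(real_gizmo)

x = Gizmo.scalars.foo        # asks for "foo" in the "scalars" category
Gizmo.scalars.foo = "hello"  # writes through to the real namespace
```

The adaptor holds no state of its own beyond the reference to the real namespace, so the Perl side would not need to know it exists.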

Either approach requires that the import mechanism have an awareness of the language of the imported namespace (i.e., Perl modules need to be wrapped but Python modules don't, for import from Python code), but that's supplied by the "Perl." prefix anyway.

Going from Python to Perl is easy--all Python variables look like Perl-scalars-holding-references, though we might want to special-case strings and numbers. See below for further notes on this.

Further comments, some of which will be redundant with the above:

On Sep 30, 2004, at 1:00 AM, Leopold Toetsch wrote:

Jeff Clites <[EMAIL PROTECTED]> wrote:
First off, Perl5 doesn't describe itself that way. The Camel states, "Note that we can use the same name for $days, @days, and %days without Perl getting confused."
While that's fine for Perl it doesn't help, if you want to access one
distinct "days" from Python:
  import days from Perl.foo   # which one?
So it's true that $foo and @foo are different items, but that can be
stated as, "the scalar 'foo' and the array 'foo' are different items,
with the same names".
That does only work, if the context provides enough information, which one of the "foo"s should be used. That information isn't present always.

What I mean is, if someone is importing a variable from a Perl module, they must have read some documentation about that Perl module, to figure out that they want to do that--that's how they know the name at all ("days" v. "automobiles"), and so that same documentation would tell them whether they need to import a scalar, or an array, or a hash (or a sub). So the programmer must know--there's no need to infer from context on the Python side. I've provided, above, some syntax options for this.

without the name decoration. For example, in Common Lisp, this:

(foo foo)
means, "call the function foo, and pass the variable foo as an
argument".
Ok, then a hypothetical CL translator has to take care of this. The "foo" variable is probably kind of a lexical, the function "foo" is accessible. If both "foo"s are global, the same problem as above exists.

But a CL compiler has no problem itself--it knows by context what is meant ("the thing on the far left means a function"), so it can emit the appropriate instructions--one to look up the function "foo", one to look up the variable "foo", and one to call the former with the latter as an argument. This should be approximately the same as calling "foo($foo)" in perl5. Different languages just use different mechanisms to make their grammars unambiguous.

And: we can't attach hints to the namespace lookup because you just don't know, if Python wants the scalar "foo" or the array "foo". There is exactly one "foo" object that Python can use, that's it.

That's not accurate, and it's not a hint, it's a demand--the programmer should know exactly which one he wants
The programmer might know it, but Python has no means to express the
difference.
a = lookupPerlScalar("foo");
There isn't a possibility for the Python translator to generate that
kind of code in the general case.

Oh no--I didn't mean for a Python compiler to emit that. That wasn't meant to be pseudo-PIR. I meant that as a Python function, and that as the code a Python programmer would type. (It was bad Python pseudo-code.) The right syntax for this (on the Python side) is as given above, using __dict__. (And Python has other options, such as its getattr() built-in function.)

I think we have issues going from Perl to Python as well, in other cases. Things such as arrays/hashes/strings should come across as references--so I'd say that in Python you access \@foo rather than @foo directly, because assignment in Python doesn't copy.
I don't think so. Python's assignment is pure name binding. In Python you access that thing that is in name slot "foo". That has nothing to do with references, which Python doesn't have anyway.
If you have in Python
  foo = bar
then that's in bytecode:
  LOAD_NAME "bar"
  STORE_NAME "foo"
Now both these name slots have the same item. But modifying e.g. "bar"
doesn't affect "foo", it's not a reference that is common to these name
slots.

I think we mean the same thing here. Basically, Python variables act like they hold C pointers. Your example above could be described as loading the reference held by "bar", and storing it to "foo". Python is like Java in this regard--there's not an explicit syntax for creating references, but all variables already act as though they contain references. (Well in the Java case, object-ish variables, that is.) And the "Learning Python" book also describes it this way: "Python assignment stores references to objects in names or data structure slots. It always creates *references* to objects, instead of copying objects. Because of that, Python variables are much more like pointers than data storage areas as in C." (emphasis theirs).

Here's a demo of how Perl and Python differ in this regard. Consider this, from an interactive Python session:

>>> a = ['foo', 'bar']
>>> b = a
>>> a
['foo', 'bar']
>>> b
['foo', 'bar']
>>> del a[0]
>>> b
['bar']

That's because "b = a" doesn't copy the array--you can think of it as copying a reference (a and b both hold references to the same underlying array). Note that modifying the object stored in a modifies the one stored in b (it's the same object). This is just like Java: it's clearest to say that all variables hold references, so that assignment copies the reference (not the underlying object).

But this Perl code is different:

@a = ('foo', 'bar');
@b = @a;

pop @a;

Now @a has one item, and @b still has two--assignment did a copy, because Perl variables act as though they hold the underlying data structure directly.

Contrast this:

@array = ('foo', 'bar');
$a = \@array;
$b = $a;

pop @$a;

Both $a and $b hold references to @array, which now contains only one item.

Now, if Python had a way to access Perl's "@a" as "a" and Perl's "@b" as "b", then by Python's semantics, you'd expect "b = a" not to do a copy, but by Perl's semantics, it would. But if all you can access on the Python side is really "\@a", then "b = a" in Python would act in a way consistent with both languages. That is, using my category syntax, consider this code:

        import Perl.Example

        x = Perl.Example.arrays.a
        Perl.Example.arrays.a = Perl.Example.arrays.b
        del Perl.Example.arrays.a[0]

What should this do? On the one hand, you might expect this to be equivalent to Perl's "@a = @b", so that "a" is now a copy of "b" (i.e., "a" should act as though it were emptied out, and then filled with the elements contained in "b"), and thus subsequent modifications to the object in "b", shouldn't affect the object in "a". But assignment in Python doesn't work this way, so by Python's semantics, both "a" and "b" should now refer to the same underlying object. There's a clash--Python's variables want to hold references, not actual objects. The only time there isn't a clash (it seems), is in the case of Perl scalars which happen to hold references. For all the other cases, it's problematic.
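The two candidate behaviors can be sketched with ordinary Python lists, with no Perl involved. Here rebinding a name stands in for the Python reading of the assignment, and slice assignment (`a[:] = b`) stands in for the Perl-style "empty it out and refill it" reading. The variable names are made up for illustration:

```python
# Python reading: assignment rebinds, so both names share one list.
a = ['x', 'y']
b = ['p', 'q', 'r']
a = b
del a[0]
shared_b = list(b)      # snapshot of b after the delete through "a"

# Perl-like reading (@a = @b): copy b's elements into the existing list.
a2 = ['x', 'y']
b2 = ['p', 'q', 'r']
a2[:] = b2
del a2[0]
separate_b = list(b2)   # snapshot of b2 after the delete through "a2"
```

In the first case the delete is visible through "b"; in the second it is not. An adaptor would have to pick one of these for attribute assignment, which is exactly the clash described above.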

One way to reconcile this, as I was getting at in my previous message, is to say that Perl.Example.arrays.a isn't actually holding "@a", but rather a reference to it, such as "\@a". The namespace adaptor we already need could probably make this happen. But there's a problem with this, too: You'd probably want to treat scalars homogeneously, so that you'd get "\$foo", etc. on the Python side. This does what you'd want for strings (Python holds a string reference), but if $foo holds a Perl reference, then on the Python side this will look like a reference-to-a-reference, and Python doesn't (I think) have a syntax for handling this. (And, I suspect we'd have further problems if we tried to treat Perl scalars differently depending on whether they held a reference or a string/number, etc.)

The real troubles arise from different issues. Given a PerlString "s"
handed over to Python. Now Python code looks like this:

  s += "x"

or

  print s % (2,3)

If "s" were a Python string, the first would concatenate, the second is
a sprintf-like interpolation. So what is the result, when "s" is a
PerlString?

Yes, that's a problem, for a couple of reasons. First, the compiler needs to emit the same code whether "s" is a number or a string, since you can't tell at compile-time, which seems to mean that PerlScalars would need to have Python-specific vtable entries (e.g., for a Python-plus, at least), which is gross, or else blow up, which is not very useful. The second problem is that Perl scalars are by design ambiguous between strings and numbers, so that "a += b" in Python is ambiguous. (Should it do numeric addition or string concatenation?) It would be easy enough to create utility functions to let you create a "real" Python number or string from a PerlScalar, but again that's awkward.
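To make the utility-function idea concrete, here is a sketch in plain Python. The `PerlScalar` class and the functions `to_python_number` and `to_python_string` are hypothetical names invented for this example, not Parrot's actual PMC or API, and the numeric conversion only roughly approximates Perl's rules (leading numeric prefix, else 0):

```python
import re


class PerlScalar:
    """Hypothetical stand-in for a Perl scalar; not Parrot's actual PMC."""

    def __init__(self, value):
        self.value = value


# Leading numeric prefix, e.g. "3abc" -> "3", " 1.5e2x" -> "1.5e2".
_NUMERIC_PREFIX = re.compile(r"\s*([+-]?(?:\d+\.?\d*|\.\d+)(?:[eE][+-]?\d+)?)")


def to_python_number(scalar):
    """Roughly mimic Perl numification of a scalar."""
    if isinstance(scalar.value, (int, float)):
        return scalar.value
    match = _NUMERIC_PREFIX.match(str(scalar.value))
    if match is None:
        return 0
    text = match.group(1)
    return float(text) if any(c in text for c in ".eE") else int(text)


def to_python_string(scalar):
    """Force the string interpretation of a scalar."""
    return str(scalar.value)


a = PerlScalar("10")
b = PerlScalar("5")

total = to_python_number(a) + to_python_number(b)    # numeric addition
joined = to_python_string(a) + to_python_string(b)   # string concatenation
```

The programmer has to choose the interpretation explicitly at each use, which resolves the ambiguity but is the awkwardness mentioned above.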

Jeff
