Re: [IronPython] Array Access Problem

Bob Ippolito Thu, 26 May 2005 01:56:07 -0700


On May 10, 2005, at 10:57 AM, Jim Hugunin wrote:

Bob Ippolito wrote:

(1) Don't have mutable value types, use a reference type that points
to a value type (some kind of proxy)


I don't think that this is possible to do in a consistent way and my
suspicion is that doing this half-way would be more confusing than not
doing it at all.  Let's walk through the original example:

apt = Array.CreateInstance(Point, 1)

This creates a true CLI array of Point structs

pt = Point(1,2)

Today this makes a new Point struct and returns the boxed version of
that struct.  We could instead return a new instance of an imaginary
new type, ValueProxy<Point>.  This new instance is a standard reference
type that holds a point as its data.  This proxy will need to forward
all field, property and method accesses to the contained Point struct.
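
To make the forwarding idea concrete, here is a rough sketch in plain
Python.  Both classes are hypothetical stand-ins: Point imitates the CLI
struct and ValueProxy is the imaginary type from the paragraph above,
not anything IronPython provides.

```python
class Point:
    """Plain-Python stand-in for a CLI Point struct (an assumption)."""
    def __init__(self, x, y):
        self.X = x
        self.Y = y


class ValueProxy:
    """Reference-type wrapper that forwards attribute access to a held value."""
    def __init__(self, value):
        # Bypass our own __setattr__ so the wrapped value is stored on the proxy.
        object.__setattr__(self, "_value", value)

    def __getattr__(self, name):
        # Only called when normal lookup fails, so "_value" is found first.
        return getattr(self._value, name)

    def __setattr__(self, name, value):
        setattr(self._value, name, value)


pt = ValueProxy(Point(1, 2))
alias = pt           # a second reference to the same proxy
alias.X = 10
print(pt.X, pt.Y)    # 10 2
```

Because the proxy is an ordinary reference type, every alias sees the
same contained value.  The trouble described next starts once that value
has to be copied somewhere else.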

apt[0] = pt

What do we do here?  We need to copy the data in pt into apt[0].  This
is what it means to have an array of structs.  No matter what we do
with proxies or wrappers there's no way out of this copy.  We could add
some kind of pointer to the ValueProxy<Point> keeping track of the fact
that there's a copy of this variable now held in apt[0].  This would
need to be an arbitrarily large list of pointers.  This list would also
be easy to break with CLI code that directly modified apt or other
containers holding on to the value types.

pt.X = 0

The only way this can modify apt[0] is if we keep the full list of
references in ValueProxy.  See above for why keeping that full list
still wouldn't always work.
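
The copy semantics discussed so far can be modelled in plain Python.
This is only a sketch: StructArray is a made-up class that copies values
on the way in and on the way out, which is an assumption about how an
array of structs looks from the Python side, not IronPython's actual
code.

```python
import copy


class Point:
    """Plain-Python stand-in for a CLI Point struct (an assumption)."""
    def __init__(self, x, y):
        self.X = x
        self.Y = y


class StructArray:
    """Hypothetical model of an array of structs: copy in, copy out."""
    def __init__(self, length):
        self._items = [Point(0, 0) for _ in range(length)]

    def __setitem__(self, index, value):
        self._items[index] = copy.copy(value)   # store a copy, not the reference

    def __getitem__(self, index):
        return copy.copy(self._items[index])    # hand back a copy


apt = StructArray(1)
pt = Point(1, 2)
apt[0] = pt

pt.X = 0           # changes only the local value
print(apt[0].X)    # 1

apt[0].X = 0       # mutates a temporary copy that is then discarded
print(apt[0].X)    # 1
```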

apt[0].X = 0

This example would work using the ValueProxy that pointed to apt[0];
however, when apt[0] is assigned to a variable the situation becomes as
bad as it is for pt.

for pt in apt:
  pt.X = 0

The for loop uses an Enumerator to loop through the points in apt.
Without constructing a custom enumerator for arrays there's no way to
get anything but copy semantics here.  While we could build a custom
enumerator for arrays this wouldn't solve the general case of value
types being returned from methods.

When I played with this example in C#, I discovered something
interesting:

Point[] pa = new Point[3];
foreach (Point p in pa) {
    p.X = 10;
}

The code above generates an error from the C# compiler:
"Cannot modify members of 'p' because it is a 'foreach iteration
variable'"

The C# compiler is treating these iteration variables as semi-immutable
in order to minimize the confusion that can come from the copy
semantics of value types.  This seems like a promising idea...

Actually the idea I had was different -- leaving boxed type handling
as-is, but the __getitem__ of the Point[] instance would return
"ValueProxy" instances.. which would give you similar semantics to C#
-- as long as you don't keep it around for a long time.  Of course, you
could deviate from standard Python a little bit and have an optional
extension to the __getitem__ protocol that would recognize that the
__getitem__ is really just to find a "pointer" so that it can set an
attribute somewhere.  __getitemforsetattr__ or something...
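
Roughly what I mean, sketched in plain Python.  ElementProxy and
ProxyingArray are made-up names, and a Python list stands in for the
real array storage, so treat this as an illustration of the semantics
rather than a proposal for the implementation.

```python
class Point:
    """Plain-Python stand-in for a CLI Point struct (an assumption)."""
    def __init__(self, x, y):
        self.X = x
        self.Y = y


class ElementProxy:
    """Short-lived "pointer" to one slot of the backing storage (hypothetical)."""
    def __init__(self, storage, index):
        object.__setattr__(self, "_storage", storage)
        object.__setattr__(self, "_index", index)

    def __getattr__(self, name):
        return getattr(self._storage[self._index], name)

    def __setattr__(self, name, value):
        # Write through to the slot instead of to a copy.
        setattr(self._storage[self._index], name, value)


class ProxyingArray:
    """Hypothetical array whose __getitem__ returns proxies, not copies."""
    def __init__(self, length):
        self._storage = [Point(0, 0) for _ in range(length)]

    def __setitem__(self, index, value):
        self._storage[index] = Point(value.X, value.Y)   # assignment still copies

    def __getitem__(self, index):
        if not 0 <= index < len(self._storage):
            raise IndexError(index)
        return ElementProxy(self._storage, index)


apt = ProxyingArray(2)
apt[0] = Point(1, 2)
apt[0].X = 0             # now sticks
for p in apt:            # iteration falls back to __getitem__
    p.Y = 7
print([(p.X, p.Y) for p in apt])   # [(0, 7), (0, 7)]
```

The "don't keep it around" caveat shows up here too: a proxy follows
its slot, so if the slot is later reassigned, an old proxy silently
refers to the new value.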

I only really had that idea because it would fix the reported bug.
You're probably right that a half-way implementation would be
confusing.. however, I think it might be less confusing than the
current state.

(2) Make value types immutable (or at least the ones you grab from
collections)

All of the problems with value types stem from their mutability.
Nobody ever complains that int, double, char, etc. are value types
because those are all immutable.  For immutable objects there's no
difference between pass by reference and pass by value.

The CLR team's API Design Guidelines say this:
- Do not create mutable value types.
http://blogs.msdn.com/kcwalina/archive/2004/09/28/235232.aspx
(or see here - http://peter.golde.org/2003/10/13.html#a16)

In some ways, this would be just reflecting in IronPython this good
design sense.

One advantage of immutability is that it would make failures like the
following much more obvious:

apt[0].X = 0

If value types were immutable this would throw.  The exception message
might give people enough information to get started tracking down the
issue and modifying their code to work correctly.
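
As a sketch of what "this would throw" could look like, here is a
read-only wrapper in plain Python.  FrozenValue and the wording of the
message are assumptions made for illustration; nothing here is taken
from IronPython itself.

```python
class Point:
    """Plain-Python stand-in for a CLI Point struct (an assumption)."""
    def __init__(self, x, y):
        self.X = x
        self.Y = y


class FrozenValue:
    """Hypothetical read-only view of a value type."""
    def __init__(self, value):
        object.__setattr__(self, "_value", value)

    def __getattr__(self, name):
        return getattr(self._value, name)

    def __setattr__(self, name, value):
        raise AttributeError(
            "cannot set %r on a value type: the assignment would only "
            "change a temporary copy" % name
        )


first = FrozenValue(Point(1, 2))
try:
    first.X = 0
except AttributeError as exc:
    message = str(exc)
    print(message)
```

Reads still work as before; only the assignment fails, and it fails
loudly instead of silently updating a copy.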

What are the problems with this approach?

1. C#/VB examples won't port very naturally to IronPython and the docs
will need a section explaining the various workarounds to the fact that
IronPython doesn't support this idiom.  This isn't ideal, but I could
easily live with this doc burden.

2. There's no way that I know of to make a value type 100% immutable
without controlling its implementation.  IronPython could block setting
of fields and properties on value types, but there's no way to reliably
detect and block all sets that came through methods.  Just getting the
properties and fields would probably cover 95% of the cases where
people try to mutate a value type, but it seems pretty awkward to me to
say that value types in IronPython are sort-of immutable unless there
are mutating methods.  The fact that this is what the C# compiler does
for iteration variables is encouraging at least in that it's a
precedent.

3. There might be things that are impossible to express with this
restriction.  I don't think that's true, particularly with the use of
named parameters to initialize fields and properties in the value
type's constructor.  However, one of the principles of IronPython is
that it should be able to use any CLS library and it's possible there's
some weird library design with value types that wouldn't work if they
were considered virtually immutable by IronPython.
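
The gap described in point 2 can be shown with a small plain-Python
model.  FrozenValue is a hypothetical wrapper that blocks attribute
sets, and the stand-in Point gets an Offset method in the spirit of
System.Drawing.Point.Offset; both are assumptions for illustration.

```python
class Point:
    """Plain-Python stand-in for a CLI Point struct (an assumption)."""
    def __init__(self, x, y):
        self.X = x
        self.Y = y

    def Offset(self, dx, dy):
        # A mutating method: changes state without any attribute set
        # being visible from outside.
        self.X += dx
        self.Y += dy


class FrozenValue:
    """Hypothetical wrapper that blocks field and property sets only."""
    def __init__(self, value):
        object.__setattr__(self, "_value", value)

    def __getattr__(self, name):
        return getattr(self._value, name)

    def __setattr__(self, name, value):
        raise AttributeError("cannot set %r on a value type" % name)


p = FrozenValue(Point(1, 2))
blocked = False
try:
    p.X = 0
except AttributeError:
    blocked = True

p.Offset(10, 10)            # not a set, so the wrapper never sees it
print(blocked, p.X, p.Y)    # True 11 12
```

The wrapper has no way to tell a mutating method from a harmless one,
which is why the result is only "sort-of" immutable.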

If we went down the immutable value type route, it would be interesting
to look at different kinds of sugar that could be provided to make the
impact on most programs less than it currently is.
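
One possible shape for such sugar, sketched with standard-library
Python only.  A frozen dataclass stands in for an immutable value type
and dataclasses.replace plays the role of "construct a new value with
named parameters"; whether IronPython would offer anything like this is
purely an assumption.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Point:
    """Immutable stand-in for a value type; fields are set at construction."""
    X: int = 0
    Y: int = 0


apt = [Point(), Point(X=3, Y=4)]

# Instead of apt[0].X = 5, build a modified value and store it back.
apt[0] = replace(apt[0], X=5)
print(apt)   # [Point(X=5, Y=0), Point(X=3, Y=4)]
```

The write-back is explicit, so the copy is visible in the code rather
than hidden behind an attribute assignment.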

In PyObjC we have similar problems to this.. the mutable value type
problem exists, but isn't a problem in practice because people Just
Don't Do That.  What *is* a problem is that Foundation has a mutable
string type.

Now this sounds like a small problem at first, but since Foundation
NSDictionary is key-copying, mutable strings are hashable and are
allowed to pass for a regular string anywhere.  Also, since unicode
objects are immutable in Python and their hash can not change, weird
things can happen.

In practice, this is also not a problem (anymore).  From Python, the
NSMutableString is bridged to a subclass of unicode.  So, it has a copy
of the contents at the time of its creation, and all of the Python
methods will behave as documented since they are using Python's
implementation.  However, it also has all of the methods of
NSMutableString and they also act correctly.  In order to get an
updated Python representation, you simply call some
Objective-C-implemented method that will return the object again and
you'll get a new proxy (normally proxies are guaranteed unique so "is"
works, but this is not true for most classes that we conveniently
bridge to immutable Python built-in types).  Fortunately, the NSObject
protocol has a "self" instance method that will return that instance ..


>>> from Foundation import *
>>> s = NSMutableString.string()
>>> s
u''
>>> hash(s)
0
>>> s.description()
u''
>>> s.appendString_('foo')
>>> hash(s)
0
>>> s
u''
>>> s.description()
u'foo'
>>> s.self()
u'foo'

It looks confusing in a contrived example like this, but in practice
you're generally either using one set of methods or the other.. so I've
never been confused by it and we haven't had any complaints.
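
For anyone without a Mac handy, the same snapshot behaviour can be
imitated in plain Python.  This is not PyObjC's code: MutableBuffer,
BridgedString and refreshed() are invented names (refreshed() plays the
part of the "self" call in the session above) and str stands in for
unicode.

```python
class MutableBuffer:
    """Hypothetical stand-in for the mutable string behind the bridge."""
    def __init__(self):
        self._parts = []

    def append(self, text):
        self._parts.append(text)

    def current(self):
        return "".join(self._parts)


class BridgedString(str):
    """str subclass that snapshots the buffer when the proxy is created."""
    def __new__(cls, buffer):
        obj = super().__new__(cls, buffer.current())   # immutable snapshot
        obj._buffer = buffer
        return obj

    def appendString_(self, text):
        self._buffer.append(text)        # mutates the backing object only

    def description(self):
        return self._buffer.current()    # always reads the live contents

    def refreshed(self):
        return BridgedString(self._buffer)   # new proxy, new snapshot


s = BridgedString(MutableBuffer())
h = hash(s)
s.appendString_("foo")
print(repr(str(s)), hash(s) == h)   # '' True
print(s.description())              # foo
print(repr(str(s.refreshed())))     # 'foo'
```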

You could provide some similar workaround, with a function or method
that mutates a field (because unlike in the PyObjC case, you're not
guaranteed mutating methods).
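
Something along these lines, sketched in plain Python.  set_field is a
hypothetical helper and StructArray is a made-up model of a container
with copy-in/copy-out semantics; neither exists in IronPython as far as
I know.

```python
import copy


class Point:
    """Plain-Python stand-in for a CLI Point struct (an assumption)."""
    def __init__(self, x, y):
        self.X = x
        self.Y = y


class StructArray:
    """Hypothetical model of an array of structs: copy in, copy out."""
    def __init__(self, length):
        self._items = [Point(0, 0) for _ in range(length)]

    def __setitem__(self, index, value):
        self._items[index] = copy.copy(value)

    def __getitem__(self, index):
        return copy.copy(self._items[index])


def set_field(container, index, name, value):
    """Hypothetical helper: read-modify-write one field of a stored value."""
    item = container[index]       # copy out
    setattr(item, name, value)    # change the copy
    container[index] = item       # copy back in


apt = StructArray(1)
apt[0] = Point(1, 2)
set_field(apt, 0, "X", 0)
print(apt[0].X, apt[0].Y)   # 0 2
```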


-bob

_______________________________________________
users-ironpython.com mailing list
users-ironpython.com@lists.ironpython.com
http://lists.ironpython.com/listinfo.cgi/users-ironpython.com
