On Mon, 23 Mar 2009 22:24:30 -0400, Cristian Vlasceanu wrote:

I thought one of the benefits of having immutable strings is that
substrings were just pointers to slices of the original data in Java and
.NET. So every time I do a substring in Java and .NET, it creates a copy
of the data?  That seems very wasteful, especially when the data is
immutable...


Fair enough, I meant substrings in more of a C++ way. Perhaps discussing
slices / arrays by comparison with substrings / strings is a bad idea,
because as you point out strings are immutable whilst arrays are not.


OK, I didn't think of that implementation.

My main problem with slices is that when you a) append to them, or b) resize them (by assigning to the length property), and the new size goes past the
bounds of the "original" array, then the slice gets "divorced" from the
array, and from a light-weight "view" it gets promoted to a "first class"
array (at least this is the behavior in 2.025).

It's even more bizarre than this :) If you append to a slice that happens to point to the first bytes of the original array, it appends in place (no divorce!), possibly overwriting the (possibly immutable!) data still in the original array. This is the bug I'm trying to fix with my proposals.
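To make the two cases concrete, here is a toy model of the behavior described above. It is a sketch in Python, not D, and not how the D runtime is actually implemented: the `Slice` class, its fields, and the "slice starts at offset 0" test are assumptions taken only from the description in this thread.

```python
class Slice:
    """Hypothetical model of a slice: a (block, start, length) view on shared storage."""

    def __init__(self, block, start, length):
        self.block = block    # shared backing list, standing in for the original array
        self.start = start
        self.length = length

    def to_list(self):
        return self.block[self.start:self.start + self.length]

    def append(self, value):
        end = self.start + self.length
        if self.start == 0 and end < len(self.block):
            # Assumed rule: a slice at the start of the block grows in place,
            # overwriting whatever the original array still holds there.
            self.block[end] = value
        else:
            # Otherwise the data is copied and the slice is "divorced"
            # from the original array.
            self.block = self.to_list() + [value]
            self.start = 0
        self.length += 1


original = [1, 2, 3, 4]
head = Slice(original, 0, 2)   # models original[0 .. 2]
tail = Slice(original, 2, 2)   # models original[2 .. 4]

head.append(99)   # in place: clobbers original[2]
tail.append(5)    # past the bounds: tail gets its own copy

print(original)        # the element at index 2 has been overwritten
print(tail.to_list())  # tail no longer aliases original
```

In this model nothing in the type of `head` or `tail` tells you which of the two still aliases `original`; you have to trace the appends to find out.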

But I see your point. It is one aspect that I was aware of, but I didn't really feel it was a huge problem. After all, if the behavior is deterministic, then you can know whether your slice still points to the original array. But looking at it from your point of view, the scheme definitely has real issues. Most other parts of the D language allow you to simply look at the type of something and know what it means. With arrays, you have to examine the code that created or used the array to figure out whether it's an alias or unique data. That is a problem.

So maybe it is a worthwhile exercise to figure out if there is a way to embed the attributes of the array into the type. I'll have to think about how this could be implemented in a way that makes sense, is realistic, and does not hinder performance or syntax. There might still be a way to make this work.

That is an interesting idea.  But I have a couple of problems with it:

First, when I see ref int[] s, I think "reference to an array", not "this
array references data from another array".
Second, your proposed default (not using ref) is to copy data everywhere,
which is not good for performance.  Most of the time, arrays are passed
without needing to "own" the data, so making copies everywhere you forgot to put ref would be hard to deal with. It's also completely incompatible
with existing code, which expects reference semantics without using ref.


That's exactly right, but for D / .NET I do not expect existing code to
compile as-is. For example, I do not intend to port any of the Phobos /
Tango code. I envision all I/O and system stuff to go through
[mscorlib]System.

That is what I would expect also, but part of the benefit of targeting .NET from another language is being able to port existing code from that language to the .NET runtime. Any code that was ported might suffer. There are already instances of ref char[] or ref string in many applications/libs that would cause strange bugs when ported to .NET.

Oh, and BTW, if I couldn't use Tango, I'd most certainly not use D.NET ;) I sort of loathe the .NET runtime libs, except for certain parts.

-Steve
