On 06-06-2012 23:39, Steven Schveighoffer wrote:
An interesting situation: the current compiler will happily compile pure
functions that accept shared data.

I believed that when we relaxed the purity rules, shared data should be
taboo for pure functions, even weak-pure ones. Note that at least at the time, Don
agreed with me: http://forum.dlang.org/post/[email protected]

Now, technically, there's nothing really *horrible* about this, I mean
you can't really have truly shared data inside a strong-pure function.
Any data that's marked as 'shared' will not be shared because a
strong-pure function cannot receive any shared data.

So if you then were to call a weak-pure function that had shared
parameters from a strong-pure function, you simply would be wasting
cycles locking or using a memory-barrier on data that is not truly
shared. I don't really see a compelling reason to have weak-pure
functions accept shared data explicitly.

*Except* that template functions which use IFTI have good reason to be
able to be marked pure.

For example:

void inc(T)(ref T i) pure
{
    ++i;
}

Now, we have a template function that we know only will affect i, and
the compiler enforces that.

But what happens here?

shared int x;

void main()
{
    x.inc();
}

Here, T == shared int.

One solution (if shared isn't allowed on pure functions) is to not mark
inc pure and let purity be inferred. But then we lose the contract that
has the compiler help us enforce purity.

I'll also point out that inc isn't a valid function for data that is
actually shared: ++i is not atomic. So disallowing shared actually helps
us in this regard, by refusing to compile a function that would be
dangerous when used on shared data.

Man, shared is such a mess.

(I'm going to slightly hijack a branch of your thread because I think we need to address the below concerns before we can make this decision properly.)

We need to be crystal clear about what we're talking about here. People usually describe shared as being supposed to insert memory barriers; others say operations on shared data are atomic.

(And of course, neither is actually implemented in any compiler, and I doubt they ever will be.)

A memory barrier is what the x86 sfence, lfence, and mfence instructions represent. They simply make various useful guarantees about ordering of loads and stores. Nothing else.

Atomic operations are what the lock prefix is used for, for example the lock add operation, lock cmpxchg, etc. These operate atomically on the current value at the memory location in question, so every core observes the read-modify-write as a single indivisible step.

Memory barriers and atomic operations are not the same thing, and we should avoid conflating them. Yes, they can be used together to write low-level, lock-free data structures, but the use of one does not include the other automatically.

(At this point, I probably don't need to point out how x86-biased and unportable shared is.....)

So, my question to the community is: What should shared *really* mean?

I don't think that having shared imply memory barriers is going to be terribly useful to anyone. In fact, I don't know how the compiler would even determine where to efficiently insert memory barriers. And *actually*, I think memory barriers are really not what people mean at *all* when they refer to shared's effect on code generation. I think what people *really* want is atomic operations.

Steven, in your particular case, I don't agree entirely. The operation can be made atomic quite trivially by implementing inc() like so (for the shared int case):

void inc(ref shared int i) pure nothrow
{
    // just pretend the compiler emitted this
    asm
    {
        mov EDX, i;           // i is a ref parameter, so this loads its address
        lock;
        inc dword ptr [EDX];  // operand size must be explicit for a memory operand
    }
}

But I may be misunderstanding you. Of course, it gets a little more complex if you use the result of the ++ operation afterwards, but it's still not impossible to do atomically. What can *not* be done is the increment and the load of its result in one purely atomic instruction (and I suspect this is what you may have been referring to). It's worth pointing out that most atomic operations can be implemented with a spin lock (which is exactly what core.atomic does for most binary operations), so while it cannot be done in a single x86 instruction, it can be achieved through such a mechanism, and most real-world atomic APIs do this (see InterlockedIncrement on Windows, for example).

Further, if shared is going to be useful at all, stuff like this *has* to be atomic, IMO.

I'm still of the opinion that bringing atomicity and memory barriers into the type system is a horrible can of worms that we should never have opened, but now shared is there and we need to make up our minds already.


The compiler *currently*, however, will simply compile this just fine.

I'm strongly leaning towards this being a bug that needs to be fixed in
the compiler.

Some background on why this came up:
https://github.com/D-Programming-Language/druntime/pull/147

Opinions?

-Steve

--
Alex Rønne Petersen
[email protected]
http://lycus.org
