Vinzent Höfler wrote:

The question is: what makes one variable use read/write-through, while other variables can be read from the cache, with lazy write-back?
 Synchronisation. Memory barriers. That's what they are for.

And this doesn't happen out of thin air. How else?

Ok, maybe I misunderstood your question. Normally, these days every
memory access is cached (and that's of course independent of compiler
optimizations).

This should be "normal" (speedy) behaviour.

But there are usually possibilities to define certain areas as
write-through (like the video frame buffer). But that's more a chip-set
thing than it has anything to do with concurrent programming.

I also thought about cache or page attributes (CONST vs. DATA sections), but IMO these don't offer the fine granularity that would allow separating synchronized from unsynchronized variables (class/record members!) in code. Even the memory manager would then have to deal with two classes of synced/unsynced memory :-(

Apart from that you have to use the appropriate CPU instructions.

Seems to be the only solution.

Is this a compiler requirement, which has to enforce read/write-through for all volatile variables?
 No.  "volatile" (at least in C) does not mean that.

Can you provide a positive answer instead?

Ada2005 RM:

|C.6(16): For a volatile object all reads and updates of the object as
|         a whole are performed directly to memory.
|C.6(20): The external effect [...] is defined to include each read and
|         update of a volatile or atomic object. The implementation shall
|         not generate any memory reads or updates of atomic or volatile
|         objects other than those specified by the program.

That's Ada's definition of "volatile". C's definition is weaker, but
should have basically the same effect.

Is that positive enough for you? ;)

Much better ;-)

But what would this mean for FPC code in general (do we *need* such attributes?), and what would their speed impact be? This obviously depends on the effects of the specific synchronizing instructions inserted by the compiler.


But if so, which variables (class fields...) can ever be treated as non-volatile, when they can be used from threads other than the main thread?
 Without explicit synchronisation? Actually, none.

Do you understand the implication of your answer?

I hope so. :)

When it's up to every coder to insert explicit synchronization whenever required, how does one determine the places where explicit code is required?

By careful analysis. Although there may exist tools which detect potentially
un-synchronised accesses to shared variables, there will be no tool that
inserts synchronisation code automatically for you.

I wouldn't like such tools, except the compiler itself :-(

Consider the shareable doubly-linked list, where insertion requires code like this:
  list.Lock; //prevent concurrent access
  ... //determine the affected list elements (prev, next)
  new.prev := prev; //prev must be guaranteed to be valid
  new.next := next;
  prev.next := new;
  next.prev := new;
  list.Unlock;
What can we expect from the Lock method/instruction - what kind of synchronization (memory barrier) can, will, or should it provide?

My understanding is that a *full* cache synchronization would slow down not only the current core and its cache, but also all other caches?

If so, would it help to enclose above instructions in e.g.
  Synchronized begin
    update the links...
  end;
so that the compiler can make all memory references (at least reads) occur read/write-through inside such a code block? Possibly a global cache sync could be inserted on exit from such a block.

After these considerations I'd understand that using Interlocked instructions in the code would ensure such read/write-through, but merely as a side effect - they also lock the bus for every instruction, which is not required when concurrent access has already been excluded by other means.


Conclusion:

We need documentation of the FPC-specific means of cache synchronization, with their guaranteed effects on every target[1].

Furthermore we need concrete examples[2] of how (and to what extent) it's required to use these special instructions/procedures in cases like the above. When cache synchronization is a big issue, then the usage of related (thread-unaware) objects should be discussed as well, i.e. how to ensure that their use will cause no trouble, e.g. by invalidating the entire cache beforehand.

[1] When the effects of the "primitives" vary among targets, then either more specific documentation is required, or higher-level target-independent procedures with guaranteed behaviour. Possibly both, so that the experienced coder can use conditional code for the different handling on various targets.

[2] References to e.g. C code samples are IMO inappropriate, because the user cannot know what special handling such a different compiler will apply to its compiled code (see "volatile").

DoDi

_______________________________________________
fpc-devel maillist  -  fpc-devel@lists.freepascal.org
http://lists.freepascal.org/mailman/listinfo/fpc-devel
