On Wed, Oct 13, 2010 at 8:27 AM, Jonas Maebe <jonas.ma...@elis.ugent.be> wrote:
> 1) on entry of the "critical section" protected by this variable, you can
> have problems, because this sequence:
>
>   locked:=true;
>   local:=shared_global_var;
>
> may actually be executed in this order:
>
>   local:=shared_global_var;
>   locked:=true;
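The entry-side hazard quoted above can be sketched with C11 atomics (a sketch of the same idea, not FPC code; in Pascal you would reach for the RTL's InterLocked* routines instead, and all names below are my own). Giving the flag acquisition acquire ordering is what forbids the read of `shared_global_var` from being speculated above the store that takes the flag:

```c
#include <stdatomic.h>
#include <stdbool.h>

/* A plain Boolean "locked" flag gives no ordering guarantee: the CPU or
 * compiler may hoist the read of shared_global_var above the store that
 * sets the flag, which is exactly the reordering described above. */
atomic_bool locked;
int shared_global_var;

/* Spin until we own the flag; acquire ordering keeps the critical
 * section's loads and stores below the successful exchange. */
void enter_critical(void) {
    while (atomic_exchange_explicit(&locked, true, memory_order_acquire))
        ; /* flag was already held: retry */
}

void leave_critical(void) {
    atomic_store_explicit(&locked, false, memory_order_release);
}

/* Single-threaded demo of the protocol. */
int read_shared_demo(void) {
    shared_global_var = 7;
    enter_critical();
    int local = shared_global_var; /* cannot be hoisted above the acquire */
    leave_critical();
    return local;
}
```

With a plain (non-atomic) flag, nothing stops the two operations from swapping; with the acquire exchange, the read is pinned inside the section.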
Thanks btw... Yes, I didn't know that fact until you posted a link in another thread. I had a curious problem with pointers and a linked list that a manager thread managed but another server thread had read/write access to. Once in a while the order of operations would change and cause read access violations.

> So you can get speculative reads into the "critical section"
>
> 2) when exiting the "critical section", there are no problems, because none
> of the loads or stores before the one that sets the boolean "lock" variable
> to false, can be moved past that store.

Does "into" mean inside or outside the section? I was under the assumption that inside the section, operations were safe from speculative reads. But on multi-core systems, I'd bet that order can be executed differently.

> In summary, the fact that a particular program runs fine on your particular
> machine does not mean anything:
> a) your particular machine may not perform any kind of reordering that
> results in problems
> b) your particular program may not expose any kind of reordering that
> results in problems

After reading the Wikipedia article and the AMD engineers' blog postings with suggested code, and considering I'm exclusively using AMD CPUs, I would say this is true. Problems could certainly prove difficult to resolve in cases where worker objects wait on other workers (recursion and the like) acting as a logic gate -- potentially a serious issue.

> That does not mean that automatically the program "can be used without
> memory barriers". It is virtually impossible to prove correctness of
> multi-threaded code running on multi-cores through testing, and it is
> literally impossible to prove it for all possible machines by testing on a
> single machine (even if that machine has 4096 cores and runs 16000 threads),
> simply because other machines may use different memory consistency models.
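Point 2 above (no store from inside the section may be moved past the store that clears the flag) is what C11 calls release semantics. A minimal sketch, again in C rather than Pascal, with names of my own invention:

```c
#include <stdatomic.h>
#include <stdbool.h>

atomic_bool busy;
int result;

void publish(int value) {
    atomic_store_explicit(&busy, true, memory_order_relaxed);
    result = value; /* ordinary store inside the "section" */
    /* Release store: no load or store above this line may be moved past
     * it, so any thread that observes busy == false with an acquire load
     * is guaranteed to also see result == value. */
    atomic_store_explicit(&busy, false, memory_order_release);
}

int consume(void) {
    while (atomic_load_explicit(&busy, memory_order_acquire))
        ; /* wait for the writer to finish */
    return result;
}
```

The release/acquire pair is the whole guarantee: drop either side and the reader may see the flag cleared but a stale `result`.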
After reading up, I would say that memory barriers can be avoided only under certain circumstances, by engineering via thread isolation (see my commands in uThreads.pas) and limited access (indexed boolean arrays in uThreads.pas, with no ordering necessary due to polling); and a good understanding of these challenges is required when coding multi-threaded apps for multi-core systems. Sometimes memory barriers aren't even needed, or germane, to a particular aspect of an application's features or functionality. Knowing which technique fits which aspect is what makes for stable and fast applications.

Lastly, because the polling concept was already established, I would say that the order of execution, with regard to the architecture set forth in my test case, proves just that: polling for all-true or all-false does not require concern for pre-emptive positives or false positives. As designed, it proves true IFF all threads are complete. And these facts will remain true on all CPUs.

Thanks for the info, help, and discussion.
_______________________________________________
fpc-pascal maillist - fpc-pascal@lists.freepascal.org
http://lists.freepascal.org/mailman/listinfo/fpc-pascal
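For the record, the "indexed boolean array with polling" design discussed here can be sketched as follows (a C sketch of the general pattern, not the actual uThreads.pas code; all names are my own). The key property is the one claimed above: each worker writes only its own slot, so a stale read can only delay the all-done answer by one polling pass, never produce a premature positive:

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <pthread.h>

#define NTHREADS 4

/* One flag per worker; each worker writes only its own slot and the
 * manager only reads, so no ordering between slots is required. */
atomic_bool done[NTHREADS];

static void *worker(void *arg) {
    int idx = *(int *)arg;
    /* ... the real work would go here ... */
    atomic_store_explicit(&done[idx], true, memory_order_release);
    return NULL;
}

/* True only when every worker has finished (the IFF property): a stale
 * read can delay the answer but never yields a premature "all done". */
bool all_done(void) {
    for (int i = 0; i < NTHREADS; i++)
        if (!atomic_load_explicit(&done[i], memory_order_acquire))
            return false;
    return true;
}

int run_poll_demo(void) {
    pthread_t tids[NTHREADS];
    int ids[NTHREADS];
    for (int i = 0; i < NTHREADS; i++) {
        ids[i] = i;
        pthread_create(&tids[i], NULL, worker, &ids[i]);
    }
    while (!all_done())
        ; /* manager polls; real code would yield or sleep here */
    for (int i = 0; i < NTHREADS; i++)
        pthread_join(tids[i], NULL);
    return 1;
}
```

Note that the per-slot release/acquire pair is still doing quiet work here: it ensures that once the manager sees a worker's flag, it also sees everything that worker wrote before setting it.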