On Fri, Feb 24, 2012 at 2:10 PM, Torvald Riegel <trie...@redhat.com> wrote: > On Fri, 2012-02-24 at 09:58 +0100, Richard Guenther wrote: >> On Thu, Feb 23, 2012 at 10:11 PM, Aldy Hernandez <al...@redhat.com> wrote: >> > On 02/23/12 12:19, Aldy Hernandez wrote: >> > >> >> about hit me. Instead now I save all loads in a function and iterate >> >> through them in a brute force way. I'd like to rewrite this into a hash >> >> of some sort, but before I go any further I'm interested to know if the >> >> main idea is ok. >> > >> > >> > For the record, it may be ideal to reuse some of the iterations we already >> > do over the function's basic blocks, so we don't have to iterate yet again >> > over the IL stream. Though it may complicate the pass unnecessarily. >> >> It's definitely ugly that you need to do this. > > Indeed. But that's simply the price we have to pay for making > publication with transactions easier for the programmer yet still at > acceptable performance. > > Also note that what we primarily have to care about for this PR is to > not hoist loads _within_ transactions if they would violate publication > safety. I didn't have time to look at Aldy's patch yet, but a first > safe and conservative way would be to treat transactions as full > transformation barriers, and prevent publication-safety-violating > transformations _within_ transactions. Which I would prefer until we're > confident that we understood all of it. > > For hoisting out of or across transactions, we have to reason about more > than just publication safety. > >> And yes, you have to look at >> very many passes I guess, PRE comes to my mind at least. >> >> I still do not understand the situation very well, at least why for >> >> transaction { for () if (b) load; } load; >> >> it should be safe to hoist load out of the transaction > > This one is funny. *Iff* this is an atomic txn, we can assume that the > transaction does not wait for a concurrent event. If it is a relaxed > txn, then we must not have a loop which terminates depending on an > atomic load (which we don't in that example); otherwise, we cannot hoist > load to before the txn (same reason as why we must not hoist to before > an atomic load with memory_order_acquire). > Now, if these things hold, then the load will always be executed after > the txn. Thus, we can assume that it will happen anyway and nothing can > stop it (irrespective of which position the txn gets in the > Transactional Synchronization Order (see the C++ TM spec)). It is a > nonatomic load, and we can rely on the program being data-race-free, so > we cannot have other threads storing to the same location, so we can > hoist it across because in that part of the program, the location is > guaranteed to be thread-local data (or immutable). > > As I said, this isn't related to just pub safety anymore. And this is > tricky enough that I'd rather not try to optimize this currently but > instead wait until we have more confidence in our understanding of the > matter. > >> while for >> >> load; transaction { for () if (b) load; } >> >> it is not. > > If the same assumptions as above hold, I think you can hoist it out, > because again you can assume that it targets thread-local/immutable > data: the nontransactional (+nonatomic!) load can happen at any time, > essentially, irrespective of b's value or how/when other threads modify > it. Thus, it cannot have been changed between the two loads in a > data-race-free program. > >> Neither do I understand why it's not ok for >> >> transaction { for () if (b) load; } >> >> to hoist the load out of the transaction. > > You would violate publication safety. > > Also, you don't have any reason to believe that load targets > thread-local/immutable data, so you must not make it nontransactional > (otherwise, you could introduce a race condition). > >> >> I assume all the issue is about hoisting things over the trasaction start. >> So - why does the transaction start not simply clobber all global memory? >> That would avoid the hoisting. I assume that transforming the above to >> >> transaction { tem = load; for () if (b) = tem; } >> >> is ok? > > No, it is not. Actually, this is precisely the transformation that we > need to prevent from happening. > As Aldy said, please see the explanations in the PR, or have a look at > the C++ TM specification and the C++11 memory model alternatively. We > could also discuss this on IRC, if this would be easier.
Ok. I see. So, I think what would be best is to have a way to check whether a store/load is part of a transaction - do we have a way to do that right now? (For example a flag on a gimple stmt?) Then we can simply avoid the LIM transform by making transaction load/store stmts behave the same as potentially trapping stmts (thus, only optimize if the memory is accessed unconditional somewhere else). That would work for PRE as well. [easiest would be to make *_could_trap_p return true for memory ops inside a transaction] Does that sound like it would make TM happy (well, apart from missed optimizations)? Thus, is it enough to avoid transforms that would be invalid for loads/stores that might trap? Thanks, Richard. > Torvald >