On Mon, 21 May 2012, Aldy Hernandez wrote:

> On 05/16/12 07:53, Richard Guenther wrote:
> > On Mon, 7 May 2012, Aldy Hernandez wrote:
>
> [Sorry for the delay; I was on vacation.]
>
> I am forgoing the load avoidance code altogether to simplify things. Thanks.
>
> > +  /* Emit the load code into the latch, so that we are sure it will
> > +     be processed after all dependencies. */
> > +  latch_edge = loop_latch_edge (loop);
> >
> > inserting code into the latch block is bad - the vectorizer will
> > be confused by non-empty latches.
>
> The code as is on trunk inserts the load on the latch.

Does it? Ok ...

> That's why I also
> inserted the supporting flag checking code there. Do you want me to put the
> supporting code somewhere else?

Technically there isn't a good other place that does not add a
partial redundancy.

> > Your ChangeLog mentions changes that are no longer necessary
> > (the target hook).
>
> Whoops. Fixed.
>
> >
> > I didn't quite get the store order issue - we _do_ preserve store
> > ordering right now, do we? So how come introducing the flags
> > change that?
>
> The current scheme on trunk works because it inserts one load with
> gsi_insert_on_edge() and subsequent ones get appended correctly, whereas my
> patch has to split the edge (which happens at the top of the block), so
> subsequent code inserts happen in reverse order (at the top). If I remember
> correctly, that is... I can look again and report if it's still unclear.

Hmm, the old code (well, and the new one ...) walks stores to move
by walking a bitmap. I suppose we rely on computing uids in the
original program order here then.

(flag_tm && loop_preheader_edge (loop)->src->flags & BB_IN_TRANSACTION)

can you encapsulate this into a predicate? Like block_in_transaction ()
that also checks flag_tm?

+  /* ?? FIXME TESTING TESTING ?? */
+  multi_threaded_model_p=true;
+  /* ?? FIXME TESTING TESTING ?? */

that of course needs fixing ;) (and re-testing)

> New patch attached. Tested on x86-64 Linux. No regressions.

Ok with the new predicate in basic-block.h and re-testing without the
fixme above.

Thanks,
Richard.
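
For illustration only, the predicate requested above might look roughly
like the sketch below. It is not the committed code: struct
basic_block_def, the bit value of BB_IN_TRANSACTION and the flag_tm
variable are mock stand-ins so that the fragment compiles outside of GCC,
and the exact form that lands in basic-block.h may differ.

```c
#include <stdbool.h>

/* Mock stand-ins for GCC internals (assumptions for this sketch only).
   The real definitions live in GCC's own headers; the bit value chosen
   for BB_IN_TRANSACTION here is arbitrary.  */
#define BB_IN_TRANSACTION (1 << 0)

struct basic_block_def
{
  int flags;
};
typedef struct basic_block_def *basic_block;

/* Nonzero when transactional memory support is enabled (mocked).  */
static int flag_tm;

/* Return true if BB is inside a transaction and TM support is enabled.
   Folding the flag_tm check in here keeps callers from repeating it.  */
static inline bool
block_in_transaction (basic_block bb)
{
  return flag_tm && (bb->flags & BB_IN_TRANSACTION) != 0;
}
```

With such a helper the condition quoted above would presumably shrink to
block_in_transaction (loop_preheader_edge (loop)->src).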