Hi,

Very interesting observation. There are several possible explanations here. First, you may want to look at branch mispredictions: pipeline flushes due to memory order violations may be shadowed by flushes due to branch mispredictions, so eliminating the former barely changes the total amount of squashing. Second, you may want to increase your LSQ size to allow more in-flight memory instructions; with only 32 LQ / 32 SQ entries, the LSQ can fill up and cap any benefit from blind speculation. Finally, are memory order violations the bottleneck at all? For example, you can look at the potential ILP of your apps and their cache miss rates.
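On the LSQ point: if you're using the stock DerivO3CPU, the load/store queue sizes are plain parameters on the CPU object. A rough sketch of what I mean (parameter names taken from recent gem5 versions and may differ in yours):

    # In your gem5 config script: enlarge the LSQ so blind speculation
    # can actually keep more memory instructions in flight.
    from m5.objects import DerivO3CPU

    cpu = DerivO3CPU()
    cpu.LQEntries = 64   # default is 32 load-queue entries
    cpu.SQEntries = 64   # default is 32 store-queue entries
    # With a 192-entry ROB, a 32/32 LSQ can fill up well before the ROB
    # does on memory-heavy code, throttling any speculation benefit.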
On Mon, Dec 9, 2019 at 1:08 PM Kamran Hasan <[email protected]> wrote:
> Hello,
>
> I've been wondering how memory order violations between load/store pairs
> affect IPC, and found that reducing violations by >90% results in an IPC
> increase of less than 5% across many different benchmarks. The results
> seem pretty counterintuitive to me, as I thought the overhead of flushing
> the pipeline and restarting execution would be steep.
>
> Here was my experimental setup: I have two versions of gem5. The first
> one is straight from the repo with no changes and uses store sets for its
> memory dependence prediction. The second version is blind speculation: it
> has only one change from the default version, which is that it always
> returns "no violation" when asked for a memory dependence prediction. I
> did this by always returning 0 in the checkInst function in store_set.cc.
> Both codebases were compiled into X86/gem5.fast, and both ran the same 11
> programs from MiBench. I used the default O3 parameters, so the
> architecture is an 8-wide superscalar machine with a 192-entry ROB, 256
> registers, and 32 LQ / 32 SQ entries. Programs ran for 10M instructions
> after fast-forwarding 50M instructions.
>
> I then looked at the number of memory violations and the IPC for all the
> runs, and only crc, sha, and dijkstra got meaningful IPC gains, while the
> other 8 programs did not see a benefit even though the number of memory
> violations was dramatically reduced. Results are here:
> https://pastebin.com/raw/HhUKMha5
>
> So my question is: does gem5 not penalize a memory order violation as
> heavily as a branch misprediction? Or is the overhead of recovering from
> a memory violation truly not that big in practice? I would appreciate any
> insight that could help me reconcile what I'm seeing in the experimental
> data with my intuition.
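Also, to see where the squashes actually come from, it's worth diffing the counters in m5out/stats.txt between your two builds. A quick sketch (the exact stat names vary across gem5 versions, so check what your stats.txt calls them; the m5out paths below are placeholders for your own run directories):

    # compare_stats.py: print IPC, memory order violations, and branch
    # mispredictions side by side for the two builds.
    def read_stats(path):
        stats = {}
        with open(path) as f:
            for line in f:
                parts = line.split()
                if len(parts) >= 2:
                    try:
                        stats[parts[0]] = float(parts[1])
                    except ValueError:
                        pass  # skip headers and non-numeric entries
        return stats

    base = read_stats('m5out-storeset/stats.txt')  # baseline (store sets)
    blind = read_stats('m5out-blind/stats.txt')    # blind speculation

    for key in ('system.cpu.ipc',
                'system.cpu.iew.memOrderViolationEvents',
                'system.cpu.branchPred.condIncorrect'):
        print('%-45s base=%-12g blind=%g'
              % (key, base.get(key, 0), blind.get(key, 0)))

If condIncorrect dwarfs memOrderViolationEvents even in the baseline runs, then the flushes you eliminated were only a small slice of the total squash work, which would line up with the flat IPC you're seeing.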
