-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/2787/#review7077
-----------------------------------------------------------


Hi guys,

I'm not sure of the status of, or plans for, this patch, but I wanted to test 
it out and provide some feedback. I've merged and tested it with the current 
gem5 head (11061). First, a number of things are broken with this patch, and 
if we're still interested in checking it in, it needs plenty of work. It took 
a fair amount of debugging before I was able to run with it.

Second, after merging, I microbenchmarked a few common memory access patterns. 
The O3 CPU with Ruby certainly performs better than it did in older versions 
of gem5 (I was testing changeset 10238). It appears that, prior to this patch, 
the O3 CPU had already been modified to fix the memory access squashing caused 
by Ruby sequencer blocking (I'm not sure which changes fixed that), so the 
execution time of the microbenchmarks is now comparable between Ruby and 
classic even without this patch.

Further, I found that this patch actually introduces many confusing issues and 
can reduce performance by up to 60%. It was very difficult to track down why 
performance suffered: by coalescing requests in the sequencer, the patch 
changes the number of cache accesses, so the first problem was figuring out 
what an 'appropriate' change in the number of cache accesses might be. After 
reaching a reasonable conclusion on that, I found that the sequencer's 
max_outstanding_requests must be configured carefully for the coalescing to 
behave well. Specifically, if max_outstanding_requests is less than the LSQ 
depth, the sequencer won't block the LSQ from issuing accesses to the same 
line, but it will block once its request table is full. This reduces the MLP 
exposed to the caches compared to the previous behavior, where the LSQ blocked 
on outstanding lines and was forced to expose accesses to separate lines. 
Setting max_outstanding_requests greater than the LSQ depth fixes this, but 
the performance bug shows that coalescing in the sequencer adds non-trivial 
configuration just to get reasonable MLP.
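
To make that configuration interaction concrete, here is a minimal sketch of 
the workaround, assuming a standard O3 + Ruby config script where system.cpu 
and system.ruby._cpu_ports already exist; the names LQEntries, SQEntries, and 
max_outstanding_requests are from the revisions I tested, so treat the exact 
attributes and values as assumptions rather than a definitive recipe:

    # Size the sequencer's request table relative to the O3 LSQ so that the
    # LSQ, not the sequencer, is the structure that fills up first. The 32+32
    # LSQ entries below are the stock O3 defaults in the trees I tested and
    # are only an assumption here.
    lsq_depth = 32 + 32  # load-queue entries + store-queue entries

    for seq in system.ruby._cpu_ports:
        # Leave headroom above the LSQ depth; otherwise the sequencer blocks
        # while it coalesces requests to the same line and hides independent
        # lines from the caches, which reduces MLP.
        seq.max_outstanding_requests = 2 * lsq_depth

With a setting like this the coalescing no longer throttles MLP, but it should 
not take this kind of per-configuration tuning to get sensible behavior out of 
the memory system.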

Overall, I feel that this patch needs to be dropped: it does not appear to be 
necessary for performance, and it actually introduces performance and 
debugging problems due to its effects on cache accesses.

- Joel Hestness


On July 22, 2015, 6:15 p.m., Tony Gutierrez wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> http://reviews.gem5.org/r/2787/
> -----------------------------------------------------------
> 
> (Updated July 22, 2015, 6:15 p.m.)
> 
> 
> Review request for Default.
> 
> 
> Repository: gem5
> 
> 
> Description
> -------
> 
> Changeset 10887:1e05089bc991
> ---------------------------
> ruby: Fixed pipeline squashes caused by aliased requests
> 
> This patch was created by Bihn Pham during his internship at AMD.
> 
> This patch fixes a very significant performance bug when using the O3 CPU 
> model
> and Ruby. The issue was that Ruby returned false when it received a request
> to an address that already had an outstanding request, or when the memory
> was blocked. As a result, O3 unnecessarily squashed the pipeline and
> re-executed instructions. This fix merges the readRequestTable and
> writeRequestTable in the Sequencer into a single request table that keeps
> track of all requests and allows multiple outstanding requests to the same
> address. This prevents O3 from squashing the pipeline.
> 
> 
> Diffs
> -----
> 
>   src/mem/ruby/system/Sequencer.hh ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18 
>   src/mem/ruby/system/Sequencer.cc ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18 
>   src/mem/ruby/system/Sequencer.py ebb3d0737aa72ec4fa24b6af9cf9a6b2a1109d18 
> 
> Diff: http://reviews.gem5.org/r/2787/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Tony Gutierrez
> 
>
