Hi Jason,

Thanks for the information. I'm using Ruby regularly for my own experiments so 
I'd be happy to be part of the conversation regarding its refactoring and 
improvement.

For now, I've submitted a small patch for review which allows the Ruby 
sequencer to coalesce loads with loads and stores with stores, roughly 
mimicking what MSHRs and a merge buffer ought to do. This isn't perfect, but 
it does help reduce the core bandwidth starvation. When running STREAM, I see 
~9000 MB/s instead of ~3000 MB/s, which is a large improvement. 
https://gem5-review.googlesource.com/c/public/gem5/+/21161
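
To give a rough idea of what coalescing loads with loads and stores with stores means, here is a minimal standalone sketch. It is not the code from the patch and does not use gem5's types: CoalescingTable, PendingReq, ReqType and the 64-byte line size in the usage note are all hypothetical names and values chosen for illustration. The first request to a cache line is the only one sent on to the cache; later requests of the same kind to that line are queued behind it and completed together when the line's response arrives.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-ins for illustration; none of these names come from gem5.
using Addr = std::uint64_t;
enum class ReqType { Load, Store };

struct PendingReq {
    int id;        // caller-chosen identifier for the request
    ReqType type;  // loads only merge with loads, stores with stores
};

class CoalescingTable {
  public:
    explicit CoalescingTable(unsigned lineBytes) : lineBytes_(lineBytes) {
        assert(lineBytes > 0);
    }

    // Records the request. Returns true if it is the first request of its
    // kind to this line (so it must be issued to the cache), or false if it
    // was merged behind an already outstanding request.
    bool insert(Addr addr, PendingReq req) {
        auto &waiters = tableFor(req.type)[lineAddr(addr)];
        waiters.push_back(req);
        return waiters.size() == 1;
    }

    // Called when the response for a line arrives. Returns every request
    // that was waiting on that line and removes the entry.
    std::vector<PendingReq> complete(Addr addr, ReqType type) {
        auto &table = tableFor(type);
        auto it = table.find(lineAddr(addr));
        if (it == table.end())
            return {};
        std::vector<PendingReq> done = std::move(it->second);
        table.erase(it);
        return done;
    }

  private:
    using Table = std::unordered_map<Addr, std::vector<PendingReq>>;

    Table &tableFor(ReqType type) {
        return type == ReqType::Load ? loads_ : stores_;
    }

    // Align an address down to the start of its cache line.
    Addr lineAddr(Addr addr) const { return addr - (addr % lineBytes_); }

    unsigned lineBytes_;
    Table loads_;
    Table stores_;
};
```

With a 64-byte line, sixteen contiguous 8-byte loads starting at 0x1000 touch two lines, so only two of the sixteen insert() calls return true. A later complete(0x1000, ReqType::Load) hands back the eight loads that were waiting on the first line.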

Best
--
Timothy Hayes
Senior Research Engineer
Arm Research
Phone: +44-1223405170
[email protected]

________________________________
From: gem5-dev <[email protected]> on behalf of Jason Lowe-Power 
<[email protected]>
Sent: 24 September 2019 16:11
To: gem5 Developer List <[email protected]>
Subject: Re: [gem5-dev] Ruby Sequencer starving O3 core?

Hi Timothy,

The short answer is that this is a quasi-known issue. The interface between
the core and Ruby needs to be improved. (It's on the roadmap! Though, no
one is actively working on it.)

I could be wrong myself, but I believe you're correct that Ruby cannot
handle multiple outstanding loads to the same cache block. I believe that in
previous incarnations of the simulator the coalescing into cache blocks
happened in the LSQ. However, the classic caches assume this happens in the
cache, creating a mismatch between Ruby and the classic caches.
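
As a rough illustration of why this matters for a streaming access pattern, the standalone sketch below models a table that allows one outstanding read per cache line, similar in spirit to the m_readRequestTable check quoted further down. It is not gem5 code: countAliases and AliasCount are hypothetical names, and the model assumes that no response returns while the loads are being issued, so it gives an upper bound on aliasing rather than a simulation of real timing.

```cpp
#include <cassert>
#include <cstdint>
#include <set>

using Addr = std::uint64_t;

// Hypothetical result type: how many loads were accepted vs. rejected.
struct AliasCount {
    unsigned accepted = 0;  // first load to a line, would be enqueued
    unsigned aliased = 0;   // line already outstanding, would be rejected
};

// Issue numLoads contiguous loads of elemBytes each, starting at base,
// against a table that permits a single outstanding read per cache line.
// Assumption: nothing completes in the meantime, so entries are never freed.
AliasCount countAliases(Addr base, unsigned numLoads,
                        unsigned elemBytes, unsigned lineBytes) {
    assert(lineBytes > 0);
    std::set<Addr> outstandingLines;
    AliasCount count;
    for (unsigned i = 0; i < numLoads; ++i) {
        Addr addr = base + static_cast<Addr>(i) * elemBytes;
        Addr line = addr - (addr % lineBytes);
        // insert().second is false when the line is already in the set.
        if (outstandingLines.insert(line).second)
            ++count.accepted;
        else
            ++count.aliased;
    }
    return count;
}
```

Under these assumptions, 64 contiguous 8-byte loads over 64-byte lines give 8 accepted and 56 aliased requests: seven out of every eight loads hit a line that already has a read in flight.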

I'm not sure what the best fix for this is. Unless it's a small change, we
should probably discuss the design with Brad and Tony before putting
significant effort into coding.

Cheers,
Jason

On Sun, Sep 22, 2019 at 3:20 PM Timothy Hayes <[email protected]> wrote:

> I'm experimenting with various O3 configurations combined with Ruby's
> MESI_Three_Level memory subsystem. I notice that it's very challenging to
> provide the core with more memory bandwidth. For typical/realistic
> O3/Ruby/memory parameters, a single core struggles to achieve 3000 MB/s in
> STREAM. If I max out all the parameters of the O3 core, Ruby, the NoC and
> provide a lot of memory bandwidth, STREAM just about reaches 6000 MB/s. I
> believe this should be much higher. I've found one possible explanation for
> this behaviour.
>
> The Ruby Sequencer receives memory requests from the core via the function
> Sequencer::insertRequest(PacketPtr pkt, RubyRequestType request_type). This
> function determines whether there are outstanding requests to the same cache
> line and--if there are--returns without enqueuing the memory request. This
> also happens with load requests when there is already an outstanding load
> request to the same cache line.
>
> RequestTable::value_type default_entry(line_addr,
>                                        (SequencerRequest*) NULL);
> pair<RequestTable::iterator, bool> r =
>     m_readRequestTable.insert(default_entry);
>
> if (r.second) {
>     /* snip */
> } else {
>     // There is an outstanding read request for the cache line
>     m_load_waiting_on_load++;
>     return RequestStatus_Aliased;
> }
>
> This eventually returns to the LSQ which interprets the Aliased
> RequestStatus as the cache controller being blocked.
>
> bool LSQUnit<Impl>::trySendPacket(bool isLoad, PacketPtr data_pkt)
> {
>     if (!lsq->cacheBlocked() &&
>         lsq->cachePortAvailable(isLoad)) {
>         if (!dcachePort->sendTimingReq(data_pkt)) {
>             ret = false;
>             cache_got_blocked = true;
>         }
>     }
>     if (cache_got_blocked) {
>         lsq->cacheBlocked(true);
>         ++lsqCacheBlocked;
>     }
> }
>
> If the code is generating many load requests to contiguous memory, e.g. in
> STREAM, won't the cache get blocked extremely frequently? Would this
> explain why it's so difficult to get the core to consume more bandwidth?
>
> I'm happy to go ahead and fix/improve this, but I wanted to check first
> that I'm not missing something--can Ruby handle multiple outstanding loads
> to the same cache line without blocking the cache?
>
>
> --
>
> Timothy Hayes
>
> Senior Research Engineer
>
> Arm Research
>
> Phone: +44-1223405170
>
> [email protected]
>
> IMPORTANT NOTICE: The contents of this email and any attachments are
> confidential and may also be privileged. If you are not the intended
> recipient, please notify the sender immediately and do not disclose the
> contents to any other person, use it for any purpose, or store or copy the
> information in any medium. Thank you.
> _______________________________________________
> gem5-dev mailing list
> [email protected]
> http://m5sim.org/mailman/listinfo/gem5-dev