Re: [gem5-dev] Review Request 1941: ruby: Fix Topology throttle connections

Joel Hestness Thu, 18 Jul 2013 14:13:40 -0700

Hi Nilay,

On Sun, Jul 14, 2013 at 9:36 PM, Nilay Vaish <[email protected]> wrote:

> On Wed, 3 Jul 2013, Joel Hestness wrote:
>
>  On July 2, 2013, 11:17 p.m., Nilay Vaish wrote:
>>>
>>>> Joel, I am of the view that we should eliminate ruby's internal
>>>> numbering instead. The cntrl_id is something that appears in
>>>> the config.ini file and hence is visible to the user. But ruby's
>>>> internal numbering of controllers never gets exposed.
>>>>
>>>
>> I'm a fan of Ruby's unique numbering for a few reasons:
>>
>> 1) Because Ruby controllers are instantiated and initialized in 2
>> separate phases, it is guaranteed that the Ruby unique IDs are set up before
>> any initialization, since they only depend on the number of controllers,
>> not their orders, types or otherwise. This is really simple compared to
>> instantiation ordering or another complicated scheme in the protocol config
>> files.
>>
>> 2) All of the code for setting up the machine type base number for the
>> Ruby unique IDs is rolled into SLICC generated code, so the user need not
>> have any exposure to it. It would be nice to just rely on this; basically
>> every time that I've set up or modified a protocol configuration file (4 or
>> 5 experiences), getting the cntrl_ids correct has been a pain and error
>> prone. Without the change I'm proposing here, if the cntrl_ids are not
>> sequential from 0-(# cntrls - 1) (which is easy to mess up), you get a
>> MessageBuffer connection fatal with any interconnect using Topology.cc.
>> This fatal would never be tripped if using Ruby unique IDs, based on
>> simple-to-setup controller version numbers.
>>
>> 3) The cntrl_ids are unnecessary as far as I can tell, and they don't
>> inform the user about anything that the Ruby unique IDs or version number
>> cannot. Probably the most (only?) important place where we must be able to
>> identify a particular controller is in a protocol trace. These traces
>> already use the version number, which is printed in the config.ini and is
>> user controlled. Effectively, cntrl_ids are a duplicate of Ruby unique IDs
>> with a user defined machine type base number. I'm not aware of a need for
>> this flexibility.
>>
>> Anyway, this is a bit of an aside from this review request: Do you want
>> to make a decision on using cntrl_ids vs. Ruby unique IDs before this patch
>> goes through? It does fix a known bug, and I feel it does so in an
>> appropriate direction.
>>
>
> Well, the bug can be fixed by using the other id as well. I am fine with
> dropping the controller ids. But in that case, I would prefer that we add
> the ids that ruby provides to the config.ini file.
>

It looks like the Ruby unique IDs are only valid after all Ruby controllers
have been instantiated (see the generated code calling
<MachineType>::getNumControllers() in build/*/mem/protocol/MachineType.cc),
which happens after the SimObject parameters are written out to the
config.ini file (see src/python/m5/simulate.py:88-114). As such, printing
the Ruby unique IDs to config.ini would require moving the config.ini output
to after the recursive createCCObject calls in simulate.py. This reordering
looks non-trivial, especially since we'd need to update a Python SimObject
parameter for the Ruby unique ID in each controller.
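
To illustrate the ordering problem, here is a rough Python sketch (not the
SLICC-generated code) of a unique ID computed as a per-type base number plus
the version. The machine type names, their order, and the function names are
all hypothetical. The point is only that the base of a type depends on how
many controllers of the preceding types exist, so an ID computed before all
controllers are counted can come out different:

```python
# Hypothetical machine types, in a fixed enumeration order (assumption).
MACHINE_TYPES = ["L1Cache", "Directory", "DMA"]


def base_numbers(num_controllers):
    """Return the first unique ID assigned to each machine type.

    The base of a type is the total number of controllers of all types
    that precede it in MACHINE_TYPES.
    """
    bases = {}
    running = 0
    for mtype in MACHINE_TYPES:
        bases[mtype] = running
        running += num_controllers.get(mtype, 0)
    return bases


def unique_id(mtype, version, num_controllers):
    """Unique ID = base number of the machine type + version number."""
    return base_numbers(num_controllers)[mtype] + version


# All controllers counted: 4 L1 caches, 2 directories, 1 DMA controller.
full_counts = {"L1Cache": 4, "Directory": 2, "DMA": 1}
# Counts seen "too early", when only 2 L1 caches have been created.
partial_counts = {"L1Cache": 2}

id_after_instantiation = unique_id("Directory", 1, full_counts)
id_too_early = unique_id("Directory", 1, partial_counts)
```

With these made-up counts, the same directory gets a different ID depending
on when the counts are read, which is why the value cannot simply be written
out at the point where config.ini is produced today.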

I also spent some more time digging into the use of cntrl_id, and the only
remaining use is in the CacheRecorder for cache warmup on checkpoint
restore. It is used to identify the sequencer that a controller is attached
to during warmup, and the way it is organized now looks problematic to me.
There are cases where the number of sequencers/controllers may change from
a checkpoint run to the restored run (e.g. using accelerators configured
with different cache port configurations). If the restored run has fewer
controllers than the checkpoint run, the warmup can fail when it issues a
warmup request to a cntrl_id that doesn't exist in the restored machine
(see m_seq_map use in src/mem/ruby/recorder/CacheRecorder.cc:91). I think
a better way to handle this is to use MachineType and version ID instead of
cntrl_id in the cache records, to ensure that warmup packets are sent not
only to the right ID, but also to the right MachineType. After all, it
doesn't make sense to warm up, say, a CPU cache with records that were
attributed to a DMA device in the checkpointed system because the cntrl_ids
had shifted (this can result in hard-to-track/verify correctness and
performance bugs in warmup).
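
A minimal sketch of what I mean, in Python rather than the actual
CacheRecorder C++ code. The record layout, the field names, and the choice
to skip unmatched records (rather than fail) are assumptions for
illustration only:

```python
def build_seq_map(controllers):
    """Map (machine_type, version) -> sequencer for the restored machine."""
    return {(c["type"], c["version"]): c["sequencer"] for c in controllers}


def route_warmup(records, seq_map):
    """Split cache records into routable requests and unmatched records.

    A record is routed only if a controller with the same machine type
    AND the same version exists in the restored machine.
    """
    routed, skipped = [], []
    for rec in records:
        key = (rec["type"], rec["version"])
        if key in seq_map:
            routed.append((seq_map[key], rec["addr"]))
        else:
            skipped.append(rec)
    return routed, skipped


# Restored machine: one L1 cache and one DMA controller (hypothetical).
restored = [
    {"type": "L1Cache", "version": 0, "sequencer": "seq0"},
    {"type": "DMA", "version": 0, "sequencer": "dma_seq0"},
]
# Checkpointed machine had two L1 caches and one DMA controller.
records = [
    {"type": "L1Cache", "version": 0, "addr": 0x100},
    {"type": "L1Cache", "version": 1, "addr": 0x200},
    {"type": "DMA", "version": 0, "addr": 0x300},
]

routed, skipped = route_warmup(records, build_seq_map(restored))
```

Here the record for the missing second L1 cache is set aside, and the DMA
record still reaches the DMA sequencer instead of landing on whichever
controller happens to hold the shifted index.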

Is there a particular reason you'd prefer access to these IDs?

  Thanks,
  Joel

-- 
  Joel Hestness
  PhD Student, Computer Architecture
  Dept. of Computer Science, University of Wisconsin - Madison
  http://pages.cs.wisc.edu/~hestness/
