[gem5-users] CHI and GEM5 v22.0.0.2

2022-09-22 Thread Javed Osmany
Hello

I have downloaded gem5 v22.0.0.2 and would like to know how many of the
reported CHI issues have been fixed in this release.
Also, is there a way to determine whether a particular reported bug has been
fixed and included in the latest release?

Thanks in advance

JO


[gem5-users] CHI - assertion error when modelling "mostly inclusive" for private L2$

2022-04-21 Thread Javed Osmany via gem5-users
Hello

I am simulating a multicore Ruby system using CHI, with the Parsec/Splash2
benchmarks and gem5 21.2.1.0.
It consists of three clusters:

1)  Little cluster: 4 CPUs, each with private L1$ and L2$

2)  Middle cluster: 3 CPUs, each with private L1$ and L2$

3)  Big cluster: 1 CPU with private L1$ and L2$

By default, the L2$ and L3$ (residing in the HNF) have their clusivity set to 
strict_inclusive and mostly_inclusive respectively (CHI_config.py):

class CHI_L2Controller(CHI_Cache_Controller):
    '''
    Default parameters for a L2 Cache controller
    '''

    def __init__(self, ruby_system, cache, l2_clusivity, prefetcher):
        super(CHI_L2Controller, self).__init__(ruby_system)
        self.sequencer = NULL
        self.cache = cache
        self.use_prefetcher = False
        self.allow_SD = True
        self.is_HN = False
        self.enable_DMT = False
        self.enable_DCT = False
        self.send_evictions = False
        # Strict inclusive MOESI
        self.alloc_on_seq_acc = False
        self.alloc_on_seq_line_write = False
        self.alloc_on_readshared = True
        self.alloc_on_readunique = True
        self.alloc_on_readonce = True
        self.alloc_on_writeback = True
        self.dealloc_on_unique = False
        self.dealloc_on_shared = False
        self.dealloc_backinv_unique = True
        self.dealloc_backinv_shared = True

class CHI_HNFController(CHI_Cache_Controller):
    '''
    Default parameters for a coherent home node (HNF) cache controller
    '''

    # def __init__(self, ruby_system, cache, prefetcher, addr_ranges):
    def __init__(self, ruby_system, cache, prefetcher, addr_ranges,
                 hnf_enable_dmt, hnf_enable_dct,
                 num_tbe, num_repl_tbe, num_snp_tbe, unified_repl_tbe,
                 l3_clusivity):
        super(CHI_HNFController, self).__init__(ruby_system)
        self.sequencer = NULL
        self.cache = cache
        self.use_prefetcher = False
        self.addr_ranges = addr_ranges
        self.allow_SD = True
        self.is_HN = True
        # self.enable_DMT = True
        # self.enable_DCT = True
        self.enable_DMT = hnf_enable_dmt
        self.enable_DCT = hnf_enable_dct
        self.send_evictions = False
        # MOESI / Mostly inclusive for shared / Exclusive for unique
        self.alloc_on_seq_acc = False
        self.alloc_on_seq_line_write = False
        self.alloc_on_readshared = True
        self.alloc_on_readunique = False
        self.alloc_on_readonce = True
        self.alloc_on_writeback = True
        self.dealloc_on_unique = True
        self.dealloc_on_shared = False
        self.dealloc_backinv_unique = False
        self.dealloc_backinv_shared = False

The simulations complete okay with the default clusivity for the L2$ and L3$.
However, if I change the L2$ clusivity to "mostly_inclusive", some of the
benchmarks fail with an assertion error.

To make the L2$ mostly inclusive, I copied the default mostly_inclusive
settings from the L3$ controller:

class CHI_L2Controller(CHI_Cache_Controller):
    '''
    Default parameters for a L2 Cache controller
    '''

    def __init__(self, ruby_system, cache, l2_clusivity, prefetcher):
        super(CHI_L2Controller, self).__init__(ruby_system)
        self.sequencer = NULL
        self.cache = cache
        self.use_prefetcher = False
        self.allow_SD = True
        self.is_HN = False
        self.enable_DMT = False
        self.enable_DCT = False
        self.send_evictions = False
        # Strict inclusive MOESI
        if l2_clusivity == "sincl":
            self.alloc_on_seq_acc = False
            self.alloc_on_seq_line_write = False
            self.alloc_on_readshared = True
            self.alloc_on_readunique = True
            self.alloc_on_readonce = True
            self.alloc_on_writeback = True
            self.dealloc_on_unique = False
            self.dealloc_on_shared = False
            self.dealloc_backinv_unique = True
            self.dealloc_backinv_shared = True
        elif l2_clusivity == "mincl":
            # Mostly inclusive MOESI
            self.alloc_on_seq_acc = False
            self.alloc_on_seq_line_write = False
            self.alloc_on_readshared = True
            self.alloc_on_readunique = False
            self.alloc_on_readonce = True
            self.alloc_on_writeback = True
            self.dealloc_on_unique = True
            self.dealloc_on_shared = False
            self.dealloc_backinv_unique = False
            self.dealloc_backinv_shared = False
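
For completeness, the new l2_clusivity argument reaches the controller roughly
as sketched below (a hypothetical call site, just to illustrate how my
modified CHI_config.py forwards the command-line choice):

# Hypothetical sketch of the call site: the clusivity string chosen on the
# command line is forwarded to each private L2 controller.
l2_cntrl = CHI_L2Controller(ruby_system, l2_cache,
                            l2_clusivity="mincl", prefetcher=NULL)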

The assertion error is:

log_parsec_volrend_134_8rnf_1snf_4hnf_3_clust_all_priv_l2.txt:
build/ARM/mem/ruby/protocol/Cache_Controller.cc:5477: panic: Runtime Error
at CHI-cache-actions.sm:1947: assert failure.

QS1: Even though the L2$ is private, I am assuming that the L2$ clusivity can
be set to mostly_inclusive. Is that assumption correct?
QS2: If the answer to QS1 is yes, then it would seem that the
"mostly_inclusive" settings for

[gem5-users] CHI - measure of the snoop traffic

2022-04-14 Thread Javed Osmany via gem5-users
Hello

I am modelling a Ruby-based CHI multicore, 3-cluster system with two
different configs.
In one config all the cluster CPUs have a private L2$; in the other, the CPUs
of two of the clusters share an L2$.

I wanted to compare the snoop-out traffic at the L2$ controllers and at the
HNF cache controller (a shared L3$ is modelled in the HNF) for the two
configs.

In stats.txt, I have been looking at the cpux.l2.snpOut.m_buf_msgs metric.
Each L2CacheController and HNFCacheController processes incoming snpIn
requests and, if the requested cache line resides in another CPU's cache,
sends a snpOut request into the CHI network.

Now, in MessageBuffer.cc we have:

MessageBuffer::MessageBuffer(const Params &p)
    : SimObject(p), m_stall_map_size(0),
      m_max_size(p.buffer_size), m_time_last_time_size_checked(0),
      m_time_last_time_enqueue(0), m_time_last_time_pop(0),
      m_last_arrival_time(0), m_strict_fifo(p.ordered),
      m_randomization(p.randomization),
      m_allow_zero_latency(p.allow_zero_latency),
      ADD_STAT(m_not_avail_count, "Number of times this buffer did not have "
                                  "N slots available"),
      ADD_STAT(m_buf_msgs, "Average number of messages in buffer"),
      // ... (remaining initializers elided)


So, as I understand it, each snpOut request that a cache controller generates
is first buffered in the corresponding snpOut MessageBuffer, then arbitrates
for access to the network; once arbitration is won, the request is sent out
onto the CHI interconnect.

A couple of questions:

1)  Will snpOut.m_buf_msgs give an accurate count of the number of snpOut
requests that the cache controller has generated?

2)  Is it possible for a request to be buffered in the snpOut message buffer
and, if there is no contention for the network, be sent out to the network on
the next clock cycle? If yes, will this be visible in the snpOut.m_buf_msgs
metric?

3)  For snpOut.m_buf_msgs, how is the "Average number of messages in buffer"
actually computed?

Tks in advance

JO


[gem5-users] CHI - data/tag latency modelling for HNF/L3$

2022-04-07 Thread Javed Osmany via gem5-users
Hello

I am trying to model a multicore SoC using Ruby and CHI, and I want to model
the data/tag access latency of the L3$, which resides in the HNF.

Looking in CHI.py and CHI_config.py, I could not see any mechanism for this.

Could someone please let me know if this is possible and, if so, whether I
can make it a command-line option?
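
For reference, the kind of thing I am hoping to set is sketched below.
RubyCache (src/mem/ruby/structures/RubyCache.py) does expose tagAccessLatency
and dataAccessLatency parameters; whether the CHI controllers honour them for
the HNF is exactly what I am unsure about, and the HNFCache name and values
here are purely illustrative:

# Illustrative sketch only: set tag/data array latencies on the cache object
# handed to the HNF controller. HNFCache is a hypothetical subclass name.
from m5.objects import RubyCache

class HNFCache(RubyCache):
    size = "16MiB"
    assoc = 16
    tagAccessLatency = 8    # cycles per tag array access
    dataAccessLatency = 16  # cycles per data array access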

Thanks in advance
JO


___
gem5-users mailing list -- gem5-users@gem5.org
To unsubscribe send an email to gem5-users-le...@gem5.org
%(web_page_url)slistinfo%(cgiext)s/%(_internal_name)s

[gem5-users] CHI

2022-03-02 Thread Javed Osmany via gem5-users
Hello

I am using the latest version of gem5 (21.2.1.0).

Previously, when using gem5 21.0.0.0, I added some command-line options to
the define_options(parser) function in CHI.py, as follows:

def define_options(parser):
    parser.add_option("--chi-config", action="store", type="string",
                      default=None,
                      help="NoC config. parameters and bindings. "
                           "Required for CustomMesh topology")
    ## Add command line options specifically for the [Big, Middle, Little]
    ## Cluster.
    parser.add_option("--verbose", action="store", type="string",
                      default="false",
                      help="Disable/Enable verbose printing for debugging")
    parser.add_option("--num-clusters", action="store", type="string",
                      default=0,
                      help="Number of Clusters in the system")

I was then able to specify the options when running, e.g.:

./build/ARM/gem5.opt \
    --outdir=m5out_parsec_blackscoles_1_clust_little_4_cpu_all_shared_l2 \
    configs/example/se_kirin_custom.py --ruby --topology=Pt2Pt \
    --cpu-type=DerivO3CPU --num-cpus=4 --num-dirs=1 --num-l3caches=1 \
    --verbose=true --num-clusters=0

This worked fine: the new command-line options were recognised.

Now, with gem5 21.2.1.0, I have added the same options to the
define_options(parser) function in CHI.py, as follows:

def define_options(parser):
    parser.add_argument("--chi-config", action="store", type=str,
                        default=None,
                        help="NoC config. parameters and bindings. "
                             "Required for CustomMesh topology")
    ## Add command line options specifically for the [Big, Middle, Little]
    ## Cluster.
    parser.add_option("--verbose", action="store", type="string",
                      default="false",
                      help="Disable/Enable verbose printing for debugging")
    parser.add_option("--num-clusters", action="store", type="string",
                      default=0,
                      help="Number of Clusters in the system")
    :
    :


But the following command no longer works:

./build/ARM/gem5.opt \
    --outdir=m5out_parsec_blackscoles_1_clust_little_4_cpu_all_shared_l2 \
    configs/example/se_kirin_custom.py --ruby --topology=Pt2Pt \
    --cpu-type=DerivO3CPU --num-cpus=4 --num-dirs=1 --num-l3caches=1 \
    --verbose=true --num-clusters=0


The error message is:

command line: ./build/ARM/gem5.opt 
--outdir=m5out_parsec_blackscoles_1_clust_little_4_cpu_all_shared_l2 
configs/example/se_kirin_custom.py --ruby --topology=Pt2Pt 
--cpu-type=DerivO3CPU --num-cpus=4 --num-dirs=1 --num-l3caches=1 --verbose=true 
--num-clusters=0

Usage: se_kirin_custom.py [options]

se_kirin_custom.py: error: no such option: --verbose

Any pointers as to why the command-line options I had previously added are no
longer recognised?
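
One thing I notice in my snippet above is that only --chi-config was converted
to parser.add_argument, while --verbose and --num-clusters still use the
optparse-style parser.add_option. Since the gem5 21.2 config scripts appear to
use argparse, presumably those calls need converting as well; a sketch:

# Sketch: argparse equivalents of the two remaining optparse-style calls,
# assuming define_options() now receives an argparse.ArgumentParser.
parser.add_argument("--verbose", action="store", type=str,
                    default="false",
                    help="Disable/Enable verbose printing for debugging")
parser.add_argument("--num-clusters", action="store", type=int,
                    default=0,
                    help="Number of Clusters in the system")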

Tks in advance

JO


[gem5-users] CHI prefetcher on V21

2021-11-09 Thread Liyichao via gem5-users
Hi All:

Does the latest gem5 version now support a prefetcher on Ruby CHI?

On v21.0.1.0, I believe the prefetcher support on Ruby CHI was not working
well; is that right?


[gem5-users] CHI, Ruby - changing cacheline size

2021-08-26 Thread Javed Osmany via gem5-users
Hello

I am using the CHI protocol with the Ruby memory system. I am trying to run
the Parsec and Splash2 benchmarks while varying the cache line size via the
cacheline_size command-line option.

It works for cacheline_size = 64, 128 and 256, but not for 32.

I am using gem5-21.0. The command I am using is:

./build/ARM/gem5.opt --outdir=m5out_parsec_blackscoles_struct1_line_32 \
    configs/example/se_kirin_custom.py --ruby --topology=Pt2Pt \
    --cpu-type=DerivO3CPU --num-cpus=8 --num-dirs=1 --num-l3caches=1 \
    --num-cpu-bigclust=1 --num-cpu-middleclust=1 --num-cpu-littleclust=2 \
    --num-clusters=3 --cpu-type-bigclust=derivo3 \
    --cpu-type-middleclust=derivo3 --cpu-type-littleclust=derivo3 \
    --bigclust-l2cache=private --middleclust-l2cache=private \
    --littleclust-l2cache=shared --num-bigclust-subclust=1 \
    --num-middleclust-subclust=2 --num-littleclust-subclust=2 \
    --num-cpu-bigclust-subclust2=1 --num-cpu-middleclust-subclust2=3 \
    --num-cpu-littleclust-subclust2=2 --big-cpu-clock=3GHz \
    --middle-cpu-clock=2.6GHz --little-cpu-clock=2GHz --cacheline_size=32 \
    --verbose=true --cmd=tests/parsec/blackscoles/parsec.blackscholes.hooks \
    -o '4 tests/parsec/blackscoles/in_4K.txt tests/parsec/blackscoles/prices.txt'


The error message I get is:

Global frequency set at 1 ticks per second
warn: DRAM device capacity (8192 Mbytes) does not match the address range 
assigned (512 Mbytes)
fatal: fetch buffer size (64 bytes) is greater than the cache block size (32 
bytes)

QS1: Is a cache line size of 32 bytes supported in Ruby?
QS2: If yes, do I need to change the fetch buffer size to 32 bytes as well,
and if so, where do I make that change?
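
From the fatal message, the 64 bytes looks like the O3 CPU's fetch buffer. I
believe this corresponds to the fetchBufferSize parameter of the O3 CPU
model, so presumably something like the sketch below is needed (the
system.cpu path is whatever your own script uses):

# Sketch: shrink the O3 fetch buffer to match a 32-byte cache line.
# fetchBufferSize defaults to 64 bytes in the O3 CPU model; here it is set
# on each CPU after instantiation.
for cpu in system.cpu:
    cpu.fetchBufferSize = 32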


Thanks in advance
JO

[gem5-users] CHI and Ruby Cache block size

2021-08-16 Thread Javed Osmany via gem5-users
Hello

I am using the CHI protocol with Ruby.

The CHI L1 and L2 caches are derived from the RubyCache class.

My question: within Ruby, is it possible to have different cache line sizes
for the L1 and L2 caches?

I had a look at src/mem/ruby/structures/RubyCache.py and there is only a
single block_size parameter, which seems to imply that the same block size is
used for both the L1 and L2 caches.

Thanks in advance

Best regards
JO

[gem5-users] CHI - Cluster CPUs having a private L2 cache

2021-07-09 Thread Javed Osmany via gem5-users
Hello

I am using CHI and I want to model a scenario where the CPUs are grouped into
clusters and each CPU in a cluster has private L1 and L2 caches.

I have modified CHI.py and CHI_config.py.

In CHI_config.py, I have taken a copy of the CHI_RNF class and renamed the
copy CHI_RNF_CLUST_PRIV_L2. Within this new class I have modified the code to
give each CPU in the cluster a private L2 cache.
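
For reference, my starting point was the stock helper which, as I understand
CHI_config.py, already creates one private L2 controller per CPU of an RNF;
roughly (cache_type/pf_type stand in for the usual arguments):

# As I understand the stock code, addPrivL2Cache() adds a private L2
# controller for every CPU in the RNF; my new class keeps the same shape.
for rnf in ruby_system.rnf:
    rnf.addPrivL2Cache(cache_type, pf_type)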

I have tested this by generating one cluster (ruby_system.littleCluster) with
four CPUs.

I am attaching a WinZip rar file containing the modified versions of CHI.py
and CHI_config.py, together with the log files from generating one cluster
and zero clusters.

I have added print statements for debugging. In particular, in the modified
CHI.py I added a print statement between lines 574-576, which gives the
following output:

CHI.py -- Cntrl is .cpu0.l1i
CHI.py -- Cntrl is .cpu0.l1d
CHI.py -- Cntrl is .cpu1.l1i
CHI.py -- Cntrl is .cpu1.l1d
CHI.py -- Cntrl is .cpu2.l1i
CHI.py -- Cntrl is .cpu2.l1d
CHI.py -- Cntrl is .cpu3.l1i
CHI.py -- Cntrl is .cpu3.l1d
CHI.py -- Cntrl is .cpu0.l2
CHI.py -- Cntrl is .cpu1.l2
CHI.py -- Cntrl is .cpu2.l2
CHI.py -- Cntrl is .cpu3.l2

Looking at this debug output, it is not clear whether four private L2$'s have
been generated or just one L2$ shared between all four CPUs. I am comparing it
against the output when zero clusters are specified (the log file for zero
clusters is also attached).

Could someone with more expertise in the gem5 CHI implementation than myself
please let me know whether the generated L2$'s for one cluster are private or
shared?

Thanks in advance.

JO




[gem5-users] CHI and caches

2021-06-18 Thread Javed Osmany via gem5-users
Hello

I have been studying the CHI documentation and the configs/ruby/CHI.py file.

Both the code and the documentation mention that the script will:

1)  Map each CPU in the system to an RNF with private, split L1 caches

2)  Add a private L2 cache to each RNF

So what happens if the CPU model already has L1/L2 caches defined (i.e. if
the cpu type is O3_ARM_v7a_3)? Are the existing caches stripped out and
CHI-compliant caches added?

Thanks in advance
JO
