[gem5-users] CHI and GEM5 v22.0.0.2
Hello, I have downloaded gem5 v22.0.0.2 and wanted to know how many of the reported CHI issues have been fixed in this release. Also, is there a way to determine whether a particular reported bug has been fixed and included in the latest release?

Thanks in advance, JO

___
gem5-users mailing list -- gem5-users@gem5.org
To unsubscribe send an email to gem5-users-le...@gem5.org
[gem5-users] CHI - assertion error when modelling "mostly inclusive" for private L2$
Hello, I am simulating a multicore Ruby system using CHI, with the Parsec/Splash2 benchmarks and gem5-21.2.1.0. It consists of three clusters:

1) Little cluster of 4 CPUs, each CPU with private L1$ and L2$
2) Middle cluster of 3 CPUs, each CPU with private L1$ and L2$
3) Big cluster of 1 CPU with private L1$ and L2$

By default, the L2$ and the L3$ (residing in the HNF) have their clusivity set to strict_inclusive and mostly_inclusive respectively (CHI_config.py):

    class CHI_L2Controller(CHI_Cache_Controller):
        '''
        Default parameters for a L2 Cache controller
        '''
        def __init__(self, ruby_system, cache, l2_clusivity, prefetcher):
            super(CHI_L2Controller, self).__init__(ruby_system)
            self.sequencer = NULL
            self.cache = cache
            self.use_prefetcher = False
            self.allow_SD = True
            self.is_HN = False
            self.enable_DMT = False
            self.enable_DCT = False
            self.send_evictions = False
            # Strict inclusive MOESI
            self.alloc_on_seq_acc = False
            self.alloc_on_seq_line_write = False
            self.alloc_on_readshared = True
            self.alloc_on_readunique = True
            self.alloc_on_readonce = True
            self.alloc_on_writeback = True
            self.dealloc_on_unique = False
            self.dealloc_on_shared = False
            self.dealloc_backinv_unique = True
            self.dealloc_backinv_shared = True

    class CHI_HNFController(CHI_Cache_Controller):
        '''
        Default parameters for a coherent home node (HNF) cache controller
        '''
        #def __init__(self, ruby_system, cache, prefetcher, addr_ranges):
        def __init__(self, ruby_system, cache, prefetcher, addr_ranges,
                     hnf_enable_dmt, hnf_enable_dct,
                     num_tbe, num_repl_tbe, num_snp_tbe, unified_repl_tbe,
                     l3_clusivity):
            super(CHI_HNFController, self).__init__(ruby_system)
            self.sequencer = NULL
            self.cache = cache
            self.use_prefetcher = False
            self.addr_ranges = addr_ranges
            self.allow_SD = True
            self.is_HN = True
            #self.enable_DMT = True
            #self.enable_DCT = True
            self.enable_DMT = hnf_enable_dmt
            self.enable_DCT = hnf_enable_dct
            self.send_evictions = False
            # MOESI / Mostly inclusive for shared / Exclusive for unique
            self.alloc_on_seq_acc = False
            self.alloc_on_seq_line_write = False
            self.alloc_on_readshared = True
            self.alloc_on_readunique = False
            self.alloc_on_readonce = True
            self.alloc_on_writeback = True
            self.dealloc_on_unique = True
            self.dealloc_on_shared = False
            self.dealloc_backinv_unique = False
            self.dealloc_backinv_shared = False

The simulations complete okay with the default clusivity of the L2$ and L3$. However, if I change the L2$ clusivity to "mostly_inclusive", some of the benchmarks fail with an assertion error. I took the default mostly_inclusive settings of the L3$ to make the L2$ clusivity mostly_inclusive:

    class CHI_L2Controller(CHI_Cache_Controller):
        '''
        Default parameters for a L2 Cache controller
        '''
        def __init__(self, ruby_system, cache, l2_clusivity, prefetcher):
            super(CHI_L2Controller, self).__init__(ruby_system)
            self.sequencer = NULL
            self.cache = cache
            self.use_prefetcher = False
            self.allow_SD = True
            self.is_HN = False
            self.enable_DMT = False
            self.enable_DCT = False
            self.send_evictions = False
            if (l2_clusivity == "sincl"):
                # Strict inclusive MOESI
                self.alloc_on_seq_acc = False
                self.alloc_on_seq_line_write = False
                self.alloc_on_readshared = True
                self.alloc_on_readunique = True
                self.alloc_on_readonce = True
                self.alloc_on_writeback = True
                self.dealloc_on_unique = False
                self.dealloc_on_shared = False
                self.dealloc_backinv_unique = True
                self.dealloc_backinv_shared = True
            elif (l2_clusivity == "mincl"):
                # Mostly inclusive MOESI
                self.alloc_on_seq_acc = False
                self.alloc_on_seq_line_write = False
                self.alloc_on_readshared = True
                self.alloc_on_readunique = False
                self.alloc_on_readonce = True
                self.alloc_on_writeback = True
                self.dealloc_on_unique = True
                self.dealloc_on_shared = False
                self.dealloc_backinv_unique = False
                self.dealloc_backinv_shared = False

The assertion error is:

log_parsec_volrend_134_8rnf_1snf_4hnf_3_clust_all_priv_l2.txt:build/ARM/mem/ruby/protocol/Cache_Controller.cc:5477: panic: Runtime Error at CHI-cache-actions.sm:1947: assert failure.

QS 1: Even though the L2$ is private, I am assuming that the L2$ clusivity can be set to mostly_inclusive. Is that assumption correct?
QS 2: If the answer to QS 1 is yes, then it would seem that the "mostly_inclusive" settings for
[gem5-users] CHI - measure of the snoop traffic
Hello, I am modelling a Ruby-based CHI multicore, 3-cluster system with two different configs. In one config, all the cluster CPUs have a private L2$; in the other config, the CPUs in two of the clusters share an L2$. I want to check the snoop-out traffic at the L2$ controller and the HNF cache controller (a shared L3$ is modelled in the HNF) for the two configs. In stats.txt, I have been looking at the cpux.l2.snpOut.m_buf_msgs metric. Each of the L2 and HNF cache controllers processes an incoming snpIn request and, if the requested cache line resides in another CPU's cache, sends out a snpOut request to the CHI network. Now in MessageBuffer.cc we have:

    MessageBuffer::MessageBuffer(const Params &p)
        : SimObject(p), m_stall_map_size(0),
          m_max_size(p.buffer_size), m_time_last_time_size_checked(0),
          m_time_last_time_enqueue(0), m_time_last_time_pop(0),
          m_last_arrival_time(0), m_strict_fifo(p.ordered),
          m_randomization(p.randomization),
          m_allow_zero_latency(p.allow_zero_latency),
          ADD_STAT(m_not_avail_count, "Number of times this buffer did not have "
                                      "N slots available"),
          ADD_STAT(m_buf_msgs, "Average number of messages in buffer"),

As I understand it, each snpOut request the cache controller generates is first buffered in the corresponding snpOut MessageBuffer, then arbitrates for access to the network, and once arbitration is won the request is sent out to the CHI interconnect. A couple of questions:

1) Will snpOut.m_buf_msgs give an accurate count of the number of snpOut requests the cache controller has generated?
2) Is it possible for a request to be buffered in the snpOut message buffer and, if there is no conflict on the network, be sent out to the network on the next clock cycle? If yes, will this be visible in the snpOut.m_buf_msgs metric?
3) For snpOut.m_buf_msgs, how is the "Average number of messages in buffer" actually being computed?
Tks in advance, JO
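[Editorial note on question 3: I have not traced the exact code path behind m_buf_msgs, but buffer-occupancy averages of this kind are usually time-weighted: the occupancy between two enqueue/dequeue events is weighted by how long it lasted. A standalone sketch of that computation (all names hypothetical, not gem5 code):]

```python
# Sketch: time-weighted average occupancy of a message buffer.
# Each event is (tick, delta): +1 for an enqueue, -1 for a dequeue.
# The occupancy held between two events is weighted by its duration.
def avg_buf_msgs(events, end_tick):
    occupancy = 0
    weighted_sum = 0
    last_tick = 0
    for tick, delta in sorted(events):
        weighted_sum += occupancy * (tick - last_tick)
        occupancy += delta
        last_tick = tick
    weighted_sum += occupancy * (end_tick - last_tick)
    return weighted_sum / end_tick

# One message resident from tick 2 to tick 6, over 10 ticks total:
print(avg_buf_msgs([(2, +1), (6, -1)], 10))  # -> 0.4
```

Under this reading, the metric reflects how full the buffer was over time, not a count of messages sent, which would also answer question 1 in the negative.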
[gem5-users] CHI - data/tag latency modelling for HNF/L3$
Hello, I am trying to model a multicore SoC system using Ruby and CHI, and I want to model the data/tag access latency of the L3$, which resides in the HNF. Looking in CHI.py and CHI_config.py, I could not see any mechanism to model this. Could someone please let me know if this is possible and, if so, whether I can make it a command-line option?

Thanks in advance, JO
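[Editorial note: the RubyCache SimObject (src/mem/ruby/structures/RubyCache.py) does expose tag and data access latency parameters in recent gem5 trees, so one plausible approach is to set them on the cache instance created for the HNF in CHI_config.py. Treat the parameter names and values below as something to verify against your tree; hnf_cache stands in for the L3 RubyCache instance:]

```python
# Sketch (verify dataAccessLatency/tagAccessLatency exist in your
# RubyCache.py); hnf_cache is the L3 RubyCache built in CHI_config.py:
hnf_cache.dataAccessLatency = 20  # cycles (assumed value)
hnf_cache.tagAccessLatency = 4    # cycles (assumed value)
```

Making these command-line options would then just be a matter of registering arguments in define_options() and forwarding the parsed values to this point.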
[gem5-users] CHI
Hello, I am using the latest version of gem5 (21.2.1.0). Previously, with gem5 version 21.0.0.0, I added some command-line options in the function define_options(parser) (in CHI.py) as such:

    def define_options(parser):
        parser.add_option("--chi-config", action="store", type="string",
                          default=None,
                          help="NoC config. parameters and bindings. "
                               "Required for CustomMesh topology")
        ## Add command line options specifically for the [Big, Middle, Little]
        ## Cluster.
        parser.add_option("--verbose", action="store", type="string",
                          default="false",
                          help="Disable/Enable verbose printing for debugging")
        parser.add_option("--num-clusters", action="store", type="string",
                          default=0,
                          help="Number of Clusters in the system")

I was then able to specify the options when running as:

./build/ARM/gem5.opt --outdir=m5out_parsec_blackscoles_1_clust_little_4_cpu_all_shared_l2 configs/example/se_kirin_custom.py --ruby --topology=Pt2Pt --cpu-type=DerivO3CPU --num-cpus=4 --num-dirs=1 --num-l3caches=1 --verbose=true --num-clusters=0

This worked okay; the new command-line options I added were recognised. Now, with gem5 21.2.1.0, I have added the same options to the define_options(parser) function (in CHI.py) as such:

    def define_options(parser):
        parser.add_argument("--chi-config", action="store", type=str,
                            default=None,
                            help="NoC config. parameters and bindings. "
                                 "Required for CustomMesh topology")
        ## Add command line options specifically for the [Big, Middle, Little]
        ## Cluster.
        parser.add_option("--verbose", action="store", type="string",
                          default="false",
                          help="Disable/Enable verbose printing for debugging")
        parser.add_option("--num-clusters", action="store", type="string",
                          default=0,
                          help="Number of Clusters in the system")
        :
        :

But the following command does not work anymore:

./build/ARM/gem5.opt --outdir=m5out_parsec_blackscoles_1_clust_little_4_cpu_all_shared_l2 configs/example/se_kirin_custom.py --ruby --topology=Pt2Pt --cpu-type=DerivO3CPU --num-cpus=4 --num-dirs=1 --num-l3caches=1 --verbose=true --num-clusters=0

The error message is:

    command line: ./build/ARM/gem5.opt --outdir=m5out_parsec_blackscoles_1_clust_little_4_cpu_all_shared_l2 configs/example/se_kirin_custom.py --ruby --topology=Pt2Pt --cpu-type=DerivO3CPU --num-cpus=4 --num-dirs=1 --num-l3caches=1 --verbose=true --num-clusters=0
    Usage: se_kirin_custom.py [options]
    se_kirin_custom.py: error: no such option: --verbose

Any pointers as to why the command-line options I had previously specified no longer work?

Tks in advance, JO
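[Editorial note: gem5's config scripts moved from optparse to argparse during the 21.x series, which is one likely reason the remaining add_option() calls above are no longer picked up; only the option converted to add_argument() was. A minimal standalone sketch of the argparse equivalents of the two unconverted options:]

```python
import argparse

# Sketch: argparse equivalents of the optparse-style add_option() calls.
# optparse's type="string" becomes type=str; an integer option can use
# type=int directly instead of a string default.
parser = argparse.ArgumentParser()
parser.add_argument("--verbose", action="store", type=str, default="false",
                    help="Disable/Enable verbose printing for debugging")
parser.add_argument("--num-clusters", action="store", type=int, default=0,
                    help="Number of Clusters in the system")

# Parsing an explicit argument list, as the gem5 command line would supply:
args = parser.parse_args(["--verbose=true", "--num-clusters=3"])
print(args.verbose, args.num_clusters)  # -> true 3
```

Note that argparse stores `--num-clusters` under the attribute name `num_clusters` (dashes become underscores).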
[gem5-users] CHI prefetcher on V21
Hi All: Does the latest gem5 version now support prefetchers in Ruby CHI? On v21.0.1.0, I think the prefetcher was not well supported in Ruby CHI, was it?
[gem5-users] CHI, Ruby - changing cacheline size
Hello, I am using the CHI protocol and the Ruby memory system. I am trying to run the Parsec and Splash2 benchmarks while varying the cache line size via the command-line option cacheline_size. It works for cacheline_size = 64, 128 and 256, but not for 32. I am using gem5-21.0. The command I am using is:

./build/ARM/gem5.opt --outdir=m5out_parsec_blackscoles_struct1_line_32 configs/example/se_kirin_custom.py --ruby --topology=Pt2Pt --cpu-type=DerivO3CPU --num-cpus=8 --num-dirs=1 --num-l3caches=1 --num-cpu-bigclust=1 --num-cpu-middleclust=1 --num-cpu-littleclust=2 --num-clusters=3 --cpu-type-bigclust=derivo3 --cpu-type-middleclust=derivo3 --cpu-type-littleclust=derivo3 --bigclust-l2cache=private --middleclust-l2cache=private --littleclust-l2cache=shared --num-bigclust-subclust=1 --num-middleclust-subclust=2 --num-littleclust-subclust=2 --num-cpu-bigclust-subclust2=1 --num-cpu-middleclust-subclust2=3 --num-cpu-littleclust-subclust2=2 --big-cpu-clock=3GHz --middle-cpu-clock=2.6GHz --little-cpu-clock=2GHz --cacheline_size=32 --verbose=true --cmd=tests/parsec/blackscoles/parsec.blackscholes.hooks -o '4 tests/parsec/blackscoles/in_4K.txt tests/parsec/blackscoles/prices.txt'

The error message I get is:

    Global frequency set at 1 ticks per second
    warn: DRAM device capacity (8192 Mbytes) does not match the address range assigned (512 Mbytes)
    fatal: fetch buffer size (64 bytes) is greater than the cache block size (32 bytes)

QS: Is a cache line size of 32 bytes supported in Ruby?
QS: If yes, do I need to change the fetch buffer size to 32 bytes as well, and if so, where do I need to modify it?

Thanks in advance, JO
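[Editorial note on the second question: the fatal comes from the O3 fetch stage, which requires its fetch buffer to be no larger than a cache line. The O3 CPU model exposes a fetchBufferSize parameter (see src/cpu/o3/O3CPU.py in 21.x trees), so one sketch, assuming the CPUs are reachable as system.cpu in the config script:]

```python
# Sketch: shrink the O3 fetch buffer to match --cacheline_size=32.
# fetchBufferSize is a DerivO3CPU parameter; 'system.cpu' is an
# assumption about how the CPUs are named in se_kirin_custom.py.
for cpu in system.cpu:
    cpu.fetchBufferSize = 32  # bytes; must not exceed the cache line size
```

The fetch buffer width also interacts with fetch bandwidth, so a smaller buffer may itself change performance independently of the line-size effect being studied.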
[gem5-users] CHI and Ruby Cache block size
Hello, I am using the CHI protocol with Ruby. The CHI L1 and L2 caches are derived from the RubyCache class model. My question: within Ruby, is it possible to have different cache line sizes for the L1 and L2 caches? I had a look at src/mem/ruby/structures/RubyCache.py and there is only block_size specified, which seems to imply that the same block size is used for both the L1 and L2 caches.

Thanks in advance, Best regards, JO
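[Editorial note: in my reading of the Ruby sources, block_size defaults to 0, which Ruby treats as "use the global RubySystem block size", so a single line size is assumed system-wide; per-level line sizes are not an expected configuration. A paraphrased sketch of the relevant parameter, to be verified against your tree:]

```python
# Sketch, paraphrased from src/mem/ruby/structures/RubyCache.py
# (parameter wording approximate): a per-cache override exists, but the
# coherence protocol state machines still assume one system block size.
block_size = Param.MemorySize("0B",
    "block size in bytes; 0 means use the RubySystem default")
```

Setting different values per level would likely break the generated CHI state machines, which index and transfer data in units of the single protocol block size.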
[gem5-users] CHI - Cluster CPUs having a private L2 cache
Hello, I am using CHI and I want to model a scenario where the CPUs are in a cluster and each cluster CPU has private L1 and L2 caches. I have modified CHI.py and CHI_config.py. In CHI_config.py, I have taken a copy of the CHI_RNF() class and renamed the copy CHI_RNF_CLUST_PRIV_L2(). Within this new class I have modified the code to realise a private L2 cache for each CPU in the cluster. I have tested with the generation of one cluster (ruby_system.littleCluster) with four CPUs. I am attaching a WinZip rar file which includes the modified versions of CHI.py and CHI_config.py, along with the log files from generating one cluster and zero clusters; I have added print statements for debugging. In particular, for the modified CHI.py, with a print statement added between lines 574-576, I get the following output:

    CHI.py -- Cntrl is .cpu0.l1i
    CHI.py -- Cntrl is .cpu0.l1d
    CHI.py -- Cntrl is .cpu1.l1i
    CHI.py -- Cntrl is .cpu1.l1d
    CHI.py -- Cntrl is .cpu2.l1i
    CHI.py -- Cntrl is .cpu2.l1d
    CHI.py -- Cntrl is .cpu3.l1i
    CHI.py -- Cntrl is .cpu3.l1d
    CHI.py -- Cntrl is .cpu0.l2
    CHI.py -- Cntrl is .cpu1.l2
    CHI.py -- Cntrl is .cpu2.l2
    CHI.py -- Cntrl is .cpu3.l2

When I look at this debug output, it is not clear whether four private L2$'s have been generated, or just one L2$ shared between all four CPUs. I am comparing the above with the output when zero clusters are specified (the log file for zero clusters is also attached). Could someone with more expertise in the CHI implementation in gem5 than myself please let me know whether the generated L2$'s for one cluster are private or shared?

Thanks in advance, JO

Attachment: chi_one_cluster.rar
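[Editorial note: the controller names in the debug output are consistent with private L2s: each cpuN owns its own l2 child, whereas a shared L2 would typically appear once under the cluster rather than under every CPU. The private-L2 pattern in stock CHI_config.py looks roughly like the following paraphrased sketch (names approximate, verify against your tree):]

```python
# Sketch, paraphrased from CHI_RNF's private-L2 setup in CHI_config.py:
# the loop body runs once per CPU, so each CPU gets its own cache
# instance and its own controller, attached as cpu.l2.
def addPrivL2Cache(self, cache_type):
    for cpu in self._cpus:
        l2_cache = cache_type()  # a fresh RubyCache per CPU -> private
        cpu.l2 = CHI_L2Controller(self._ruby_system, l2_cache,
                                  prefetcher=NULL)
```

If a single cache/controller pair were created before the loop and assigned to every CPU, the L2 would instead be shared; the one-instance-per-iteration structure is what makes it private.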
[gem5-users] CHI and caches
Hello, I have been studying the CHI documentation and the configs/ruby/CHI.py file. Both the code and the documentation mention that the configuration will:

1) Map each CPU in the system to an RNF with private and split L1 caches
2) Add a private L2 cache to each RNF

So what happens if the CPU model has already implemented L1/L2 caches (i.e., if the CPU type is O3_ARM_v7a_3)? Are the existing caches stripped out and CHI-compliant caches added?

Thanks in advance, JO