[gem5-dev] Change in gem5/gem5[develop]: arch-riscv: Fix disassembling of float register instructions
Ian Jiang has submitted this change. ( https://gem5-review.googlesource.com/c/public/gem5/+/32054 )

Change subject: arch-riscv: Fix disassembling of float register instructions

arch-riscv: Fix disassembling of float register instructions

When disassembling float register instructions, gem5 always prints two
source registers, rs1 and rs2. However, this is not correct for Mul-Add
instructions, which have three (rs1, rs2, and rs3), or for Move and
Convert instructions, which have only rs1.

For example: (gem5 output vs expected)
- fmadd.d fa0,fa0,fa4 vs fmadd.d fa0,fa0,fa4,fa5
- fcvt.d.l fa4,a6,zero vs fcvt.d.l fa4,a6

This patch fixes the problem.

Change-Id: I02d840eab602ac4a9782911b3cdff2935dfe5e68
Signed-off-by: Ian Jiang
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/32054
Reviewed-by: Jason Lowe-Power
Maintainer: Jason Lowe-Power
Tested-by: kokoro
---
M src/arch/riscv/insts/standard.cc
1 file changed, 5 insertions(+), 2 deletions(-)

Approvals:
  Jason Lowe-Power: Looks good to me, approved; Looks good to me, approved
  kokoro: Regressions pass

diff --git a/src/arch/riscv/insts/standard.cc b/src/arch/riscv/insts/standard.cc
index bb621ae..e6c2b67 100644
--- a/src/arch/riscv/insts/standard.cc
+++ b/src/arch/riscv/insts/standard.cc
@@ -47,8 +47,11 @@
 {
     stringstream ss;
     ss << mnemonic << ' ' << registerName(_destRegIdx[0]) << ", " <<
-        registerName(_srcRegIdx[0]) << ", " <<
-        registerName(_srcRegIdx[1]);
+        registerName(_srcRegIdx[0]);
+    if (_srcRegIdx[1].index() != 0)
+        ss << ", " << registerName(_srcRegIdx[1]);
+    if (_srcRegIdx[2].index() != 0)
+        ss << ", " << registerName(_srcRegIdx[2]);
     return ss.str();
 }

--
To view, visit https://gem5-review.googlesource.com/c/public/gem5/+/32054
To unsubscribe, or for help writing mail filters, visit https://gem5-review.googlesource.com/settings

Gerrit-Project: public/gem5
Gerrit-Branch: develop
Gerrit-Change-Id: I02d840eab602ac4a9782911b3cdff2935dfe5e68
Gerrit-Change-Number: 32054
Gerrit-PatchSet: 2
Gerrit-Owner: Ian Jiang
Gerrit-Reviewer: Gabe Black
Gerrit-Reviewer: Ian Jiang
Gerrit-Reviewer: Jason Lowe-Power
Gerrit-Reviewer: kokoro
Gerrit-MessageType: merged
___
gem5-dev mailing list -- gem5-dev@gem5.org
To unsubscribe send an email to gem5-dev-le...@gem5.org
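The operand selection in the diff above can be modeled outside gem5 roughly as follows. This is a Python sketch, not gem5 code: `disassemble_fp`, its arguments, and the (index, name) pairs are hypothetical stand-ins for `_srcRegIdx[i].index()` and `registerName(...)`, and an index of 0 is assumed to mean "operand not used", as the patch's checks imply.

```python
def disassemble_fp(mnemonic, dest, srcs):
    """Format a float register instruction the way the patched code does.

    `srcs` holds hypothetical (index, name) pairs for rs1, rs2 and rs3.
    rs1 is always printed; rs2 and rs3 are printed only when their
    register index is non-zero.
    """
    operands = [dest, srcs[0][1]]
    for index, name in srcs[1:3]:
        if index != 0:
            operands.append(name)
    return mnemonic + " " + ", ".join(operands)


# Three source operands (Mul-Add) and a single source operand (Convert).
print(disassemble_fp("fmadd.d", "fa0", [(10, "fa0"), (14, "fa4"), (15, "fa5")]))
print(disassemble_fp("fcvt.d.l", "fa4", [(16, "a6"), (0, "zero"), (0, "zero")]))
```

One property visible in the sketch is that a second or third operand whose register index really is 0 would also be dropped. Whether that matters in gem5 depends on how unused source slots are encoded, which the message does not say.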
[gem5-dev] Change in gem5/gem5[develop]: dev-arm: Initialize cd_addr in src/dev/arm/smmu_v3_transl.cc
Hoa Nguyen has submitted this change. ( https://gem5-review.googlesource.com/c/public/gem5/+/32015 )

Change subject: dev-arm: Initialize cd_addr in src/dev/arm/smmu_v3_transl.cc

dev-arm: Initialize cd_addr in src/dev/arm/smmu_v3_transl.cc

In src/dev/arm/smmu_v3_transl.cc#L1401, cd_addr might not be
initialized when all if statements fail.

Change-Id: Idf53c07a9b5d52eea488e631f7334d4b566e645a
Signed-off-by: Hoa Nguyen
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/32015
Reviewed-by: Giacomo Travaglini
Maintainer: Giacomo Travaglini
Tested-by: kokoro
---
M src/dev/arm/smmu_v3_transl.cc
1 file changed, 1 insertion(+), 1 deletion(-)

Approvals:
  Giacomo Travaglini: Looks good to me, approved; Looks good to me, approved
  kokoro: Regressions pass

diff --git a/src/dev/arm/smmu_v3_transl.cc b/src/dev/arm/smmu_v3_transl.cc
index 209b04f..c7b20f9 100644
--- a/src/dev/arm/smmu_v3_transl.cc
+++ b/src/dev/arm/smmu_v3_transl.cc
@@ -1398,7 +1398,7 @@
         const StreamTableEntry &ste,
         uint32_t sid, uint32_t ssid)
 {
-    Addr cd_addr;
+    Addr cd_addr = 0;

     if (ste.dw0.s1cdmax == 0) {
         cd_addr = ste.dw0.s1ctxptr << ST_CD_ADDR_SHIFT;

--
To view, visit https://gem5-review.googlesource.com/c/public/gem5/+/32015

Gerrit-Project: public/gem5
Gerrit-Branch: develop
Gerrit-Change-Id: Idf53c07a9b5d52eea488e631f7334d4b566e645a
Gerrit-Change-Number: 32015
Gerrit-PatchSet: 2
Gerrit-Owner: Hoa Nguyen
Gerrit-Reviewer: Andreas Sandberg
Gerrit-Reviewer: Bobby R. Bruce
Gerrit-Reviewer: Giacomo Travaglini
Gerrit-Reviewer: Hoa Nguyen
Gerrit-Reviewer: Jason Lowe-Power
Gerrit-Reviewer: kokoro
Gerrit-MessageType: merged
[gem5-dev] Change in gem5/gem5[develop]: util,scons: improve compareVersions function
Hoa Nguyen has submitted this change. ( https://gem5-review.googlesource.com/c/public/gem5/+/32014 )

Change subject: util,scons: improve compareVersions function

util,scons: improve compareVersions function

The current compareVersions() fails in this case:
    compareVersions("10", "10.0") returns -1
while it should return 0. Among other things, this causes a systemc
compilation issue.

The problem is caused by the comparison algorithm. The algorithm turns
the versions into two lists and compares the corresponding elements of
the two lists up to the last element of the shorter list. If all of
those elements are equal, the longer list is taken to be the more
recent version. Hence, the algorithm determines "10.0" to be more
recent than "10".

This commit addresses the issue by making the version lists the same
length, padding the shorter list with 0.

JIRA: https://gem5.atlassian.net/browse/GEM5-715

Change-Id: I859679185ac67e1b4d327d8803699cc5e399fa8c
Signed-off-by: Hoa Nguyen
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/32014
Reviewed-by: Gabe Black
Maintainer: Gabe Black
Tested-by: kokoro
---
M src/python/m5/util/__init__.py
1 file changed, 5 insertions(+), 4 deletions(-)

Approvals:
  Gabe Black: Looks good to me, approved; Looks good to me, approved
  kokoro: Regressions pass

diff --git a/src/python/m5/util/__init__.py b/src/python/m5/util/__init__.py
index c59f40a..d26bf4e 100644
--- a/src/python/m5/util/__init__.py
+++ b/src/python/m5/util/__init__.py
@@ -44,6 +44,7 @@
 import sys

 from six import string_types
+from six.moves import zip_longest

 from . import convert
 from . import jobfile
@@ -132,13 +133,13 @@

     v1 = make_version_list(v1)
     v2 = make_version_list(v2)
+
     # Compare corresponding elements of lists
-    for n1,n2 in zip(v1, v2):
+    # The shorter list is filled with 0 till the lists have the same length
+    for n1,n2 in zip_longest(v1, v2, fillvalue=0):
         if n1 < n2: return -1
         if n1 > n2: return 1
-    # all corresponding values are equal... see if one has extra values
-    if len(v1) < len(v2): return -1
-    if len(v1) > len(v2): return 1
+
     return 0

 def crossproduct(items):

--
To view, visit https://gem5-review.googlesource.com/c/public/gem5/+/32014

Gerrit-Project: public/gem5
Gerrit-Branch: develop
Gerrit-Change-Id: I859679185ac67e1b4d327d8803699cc5e399fa8c
Gerrit-Change-Number: 32014
Gerrit-PatchSet: 5
Gerrit-Owner: Hoa Nguyen
Gerrit-Reviewer: Bobby R. Bruce
Gerrit-Reviewer: Gabe Black
Gerrit-Reviewer: Hoa Nguyen
Gerrit-Reviewer: Jason Lowe-Power
Gerrit-Reviewer: kokoro
Gerrit-MessageType: merged
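The padding approach described in this change can be tried standalone. The following is a minimal sketch, not the gem5 source: it uses `itertools.zip_longest` from the Python 3 standard library instead of `six.moves`, and `make_version_list` here is a simplified, assumed helper that only handles dotted numeric strings and sequences of numbers.

```python
from itertools import zip_longest


def make_version_list(v):
    """Turn "10.0" or (10, 0) into a list of ints (simplified helper)."""
    if isinstance(v, str):
        return [int(part) for part in v.split(".")]
    return [int(part) for part in v]


def compare_versions(v1, v2):
    """Return -1, 0 or 1 if v1 is older than, equal to, or newer than v2."""
    v1 = make_version_list(v1)
    v2 = make_version_list(v2)
    # Pad the shorter list with zeros so "10" and "10.0" compare as equal.
    for n1, n2 in zip_longest(v1, v2, fillvalue=0):
        if n1 < n2:
            return -1
        if n1 > n2:
            return 1
    return 0


print(compare_versions("10", "10.0"))    # 0
print(compare_versions("10", "10.0.1"))  # -1
print(compare_versions("4.8.5", "4.8"))  # 1
```

With zero padding, a trailing ".0" no longer makes a version look newer, while a non-zero extra component such as ".1" still does.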
[gem5-dev] AMD GCN3 - HSA Memory Mapping
Hi All,

I have two queries related to apu_se.py.

(1) In both AMD staging and public/develop, apu_se.py has two unused variables:

hsapp_gpu_map_vaddr = 0x2
hsapp_gpu_map_size = 0x1000

Are they unnecessary, or should they actually be used somewhere?

(2) The following is passed as pioAddr to the HSAPacketProcessor:

hsapp_gpu_map_paddr = int(Addr(options.mem_size))

And then the following assignment is done:

# Map workload to this address space
host_cpu.workload[0].map(0x1000, 0x2, 4096)

Should the physical address passed to the workload map be the same as the pioAddr of the HSAPacketProcessor, i.e. greater than the physical memory size, and should the virtual address of the workload map remain 0x1000?

As a whole, are the above-mentioned variables related? If yes, then how? Are some of them accidentally hardcoded when they should actually be variables?

Please advise.

Thank you,
Sampad
[gem5-dev] Change in gem5/gem5[develop]: arch-arm: Implementing SecureEL2 feature for Armv8
Jordi Vaquero has submitted this change. ( https://gem5-review.googlesource.com/c/public/gem5/+/31394 )

Change subject: arch-arm: Implementing SecureEL2 feature for Armv8

arch-arm: Implementing SecureEL2 feature for Armv8

This patch adds the Secure EL2 feature. This allows stage1 EL2/EL2&0 and
stage2 secure translation.

The changes are organized as follows:

+ insts/static_inst.cc: Modify checks for illegalInstruction on eret
+ isa.cc/hh: Enabling control bits
+ isa/insts/misc.hh/64.hh: Smc fault trigger.
+ miscregs.cc/hh: Declaration and initialization of new registers
+ self_debug.cc/hh: Add secureEL2 types for breakpoints
+ stage2_lookup.cc/hh: Allow stage2 in secure state.
+ tlb.cc/table_walker.cc: Allow secure state for stage2 and stage 1
  EL2&0 translation regime
+ utility.cc/hh: New function InSecure and refactor of other helpers
  to enable secure state

JIRA: https://gem5.atlassian.net/browse/GEM5-686

Change-Id: Ie59438b1828508e944334420da1d8f4745649056
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/31394
Reviewed-by: Giacomo Travaglini
Maintainer: Giacomo Travaglini
Tested-by: kokoro
---
M src/arch/arm/ArmSystem.py
M src/arch/arm/fastmodel/CortexA76/thread_context.cc
M src/arch/arm/faults.cc
M src/arch/arm/insts/static_inst.cc
M src/arch/arm/insts/static_inst.hh
M src/arch/arm/interrupts.cc
M src/arch/arm/interrupts.hh
M src/arch/arm/isa.cc
M src/arch/arm/isa.hh
M src/arch/arm/isa/insts/branch.isa
M src/arch/arm/isa/insts/fp.isa
M src/arch/arm/isa/insts/misc.isa
M src/arch/arm/isa/insts/misc64.isa
M src/arch/arm/miscregs.cc
M src/arch/arm/miscregs.hh
M src/arch/arm/self_debug.cc
M src/arch/arm/self_debug.hh
M src/arch/arm/semihosting.cc
M src/arch/arm/stage2_lookup.cc
M src/arch/arm/stage2_lookup.hh
M src/arch/arm/system.cc
M src/arch/arm/system.hh
M src/arch/arm/table_walker.cc
M src/arch/arm/tlb.cc
M src/arch/arm/tracers/tarmac_record.cc
M src/arch/arm/utility.cc
M src/arch/arm/utility.hh
27 files changed, 211 insertions(+), 139 deletions(-)

Approvals:
  Giacomo Travaglini: Looks good to me, approved; Looks good to me, approved
  kokoro: Regressions pass

diff --git a/src/arch/arm/ArmSystem.py b/src/arch/arm/ArmSystem.py
index c4cc51f..333ae5f 100644
--- a/src/arch/arm/ArmSystem.py
+++ b/src/arch/arm/ArmSystem.py
@@ -75,6 +75,8 @@
         "True if LSE is implemented (ARMv8.1)")
     have_pan = Param.Bool(True,
         "True if Priviledge Access Never is implemented (ARMv8.1)")
+    have_secel2 = Param.Bool(True,
+        "True if Secure EL2 is implemented (ARMv8)")
     semihosting = Param.ArmSemihosting(NULL,
         "Enable support for the Arm semihosting by settings this parameter")

diff --git a/src/arch/arm/fastmodel/CortexA76/thread_context.cc b/src/arch/arm/fastmodel/CortexA76/thread_context.cc
index 4016d2b..4e2bfd2 100644
--- a/src/arch/arm/fastmodel/CortexA76/thread_context.cc
+++ b/src/arch/arm/fastmodel/CortexA76/thread_context.cc
@@ -59,7 +59,7 @@
         break;
     }

-    Iris::CanonicalMsn out_msn = inSecureState(this) ?
+    Iris::CanonicalMsn out_msn = isSecure(this) ?
         Iris::PhysicalMemorySecureMsn : Iris::PhysicalMemoryNonSecureMsn;

     // Figure out what memory spaces match the canonical numbers we need.
diff --git a/src/arch/arm/faults.cc b/src/arch/arm/faults.cc
index 40cf634..300c82c 100644
--- a/src/arch/arm/faults.cc
+++ b/src/arch/arm/faults.cc
@@ -977,10 +977,12 @@
     } else {
         bool lower_32 = false;
         if (toEL == EL3) {
-            if (!inSecureState(tc) && ArmSystem::haveEL(tc, EL2))
+            if (EL2Enabled(tc))
                 lower_32 = ELIs32(tc, EL2);
             else
                 lower_32 = ELIs32(tc, EL1);
+        } else if (ELIsInHost(tc, fromEL) && fromEL == EL0 && toEL == EL2) {
+            lower_32 = ELIs32(tc, EL0);
         } else {
             lower_32 = ELIs32(tc, static_cast<ExceptionLevel>(toEL - 1));
         }
@@ -1310,7 +1312,7 @@
     HDCR hdcr = tc->readMiscRegNoEffect(MISCREG_HDCR);

     toHyp = fromEL == EL2;
-    toHyp |= ArmSystem::haveEL(tc, EL2) && !inSecureState(tc) &&
+    toHyp |= ArmSystem::haveEL(tc, EL2) && !isSecure(tc) &&
         currEL(tc) <= EL1 && (hcr.tge || stage2 ||
         (source == DebugEvent && hdcr.tde));
     return toHyp;
diff --git a/src/arch/arm/insts/static_inst.cc b/src/arch/arm/insts/static_inst.cc
index 0cbd776..12586c7 100644
--- a/src/arch/arm/insts/static_inst.cc
+++ b/src/arch/arm/insts/static_inst.cc
@@ -634,9 +634,8 @@
     const auto tc = xc->tcBase();
     const HCR hcr = tc->readMiscReg(MISCREG_HCR_EL2);
     const HDCR mdcr = tc->readMiscRegNoEffect(MISCREG_MDCR_EL2);
-    if ((ArmSystem::haveEL(tc, EL2) && !inSecureState(tc) &&
-         !ELIs32(tc, EL2) && (hcr.tge == 1 || mdcr.tde == 1)) ||
-         !ELIs32(tc, EL1)) {
+    if ((EL2Enabled(tc) && !ELIs32(tc, EL2) &&
+
[gem5-dev] Change in gem5/gem5[develop]: arch-riscv: Fix disassembling of float register instructions
Ian Jiang has uploaded this change for review. ( https://gem5-review.googlesource.com/c/public/gem5/+/32054 )

Change subject: arch-riscv: Fix disassembling of float register instructions

arch-riscv: Fix disassembling of float register instructions

When disassembling float register instructions, gem5 always prints two
source registers, rs1 and rs2. However, this is not correct for Mul-Add
instructions, which have three (rs1, rs2, and rs3), or for Move and
Convert instructions, which have only rs1.

For example: (gem5 output vs expected)
- fmadd.d fa0,fa0,fa4 vs fmadd.d fa0,fa0,fa4,fa5
- fcvt.d.l fa4,a6,zero vs fcvt.d.l fa4,a6

This patch fixes the problem.

Change-Id: I02d840eab602ac4a9782911b3cdff2935dfe5e68
Signed-off-by: Ian Jiang
---
M src/arch/riscv/insts/standard.cc
1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/src/arch/riscv/insts/standard.cc b/src/arch/riscv/insts/standard.cc
index bb621ae..e6c2b67 100644
--- a/src/arch/riscv/insts/standard.cc
+++ b/src/arch/riscv/insts/standard.cc
@@ -47,8 +47,11 @@
 {
     stringstream ss;
     ss << mnemonic << ' ' << registerName(_destRegIdx[0]) << ", " <<
-        registerName(_srcRegIdx[0]) << ", " <<
-        registerName(_srcRegIdx[1]);
+        registerName(_srcRegIdx[0]);
+    if (_srcRegIdx[1].index() != 0)
+        ss << ", " << registerName(_srcRegIdx[1]);
+    if (_srcRegIdx[2].index() != 0)
+        ss << ", " << registerName(_srcRegIdx[2]);
     return ss.str();
 }

--
To view, visit https://gem5-review.googlesource.com/c/public/gem5/+/32054

Gerrit-Project: public/gem5
Gerrit-Branch: develop
Gerrit-Change-Id: I02d840eab602ac4a9782911b3cdff2935dfe5e68
Gerrit-Change-Number: 32054
Gerrit-PatchSet: 1
Gerrit-Owner: Ian Jiang
Gerrit-MessageType: newchange
[gem5-dev] Change in gem5/gem5[develop]: dev-arm: relax GenericTimer check for CPU count
Ciro Santilli has submitted this change. ( https://gem5-review.googlesource.com/c/public/gem5/+/31894 )

Change subject: dev-arm: relax GenericTimer check for CPU count

dev-arm: relax GenericTimer check for CPU count

At Iff9ad68d64e67b3df51682b7e4e272e5f355bcd6 a check was added to
prevent segfaults when unserializing the GenericTimer in case the new
number of thread contexts was smaller than the old one pre-checkpoint.

However, GenericTimer objects are only created dynamically as needed
after timer miscreg accesses. Therefore, if we take the checkpoint
before touching those registers, e.g. from a simple baremetal example,
then the checkpoint saves zero timers, and upon restore the assert
would fail because we have one thread context and not zero:

fatal: The simulated system has been initialized with 1 CPUs, but the
Generic Timer checkpoint expects 0 CPUs. Consider restoring the
checkpoint specifying 0 CPUs.

This commit solves that by requiring only that the new thread context
count be at least as large as the number of timers saved in the
checkpoint, rather than exactly equal to it.

Change-Id: I8bcb05a6faecd4b4845f7fd4d71df95041bf6c99
JIRA: https://gem5.atlassian.net/browse/GEM5-703
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/31894
Reviewed-by: Giacomo Travaglini
Maintainer: Giacomo Travaglini
Tested-by: kokoro
---
M src/dev/arm/generic_timer.cc
1 file changed, 5 insertions(+), 1 deletion(-)

Approvals:
  Giacomo Travaglini: Looks good to me, approved; Looks good to me, approved
  kokoro: Regressions pass

diff --git a/src/dev/arm/generic_timer.cc b/src/dev/arm/generic_timer.cc
index bf6cd4e..7bb2def 100644
--- a/src/dev/arm/generic_timer.cc
+++ b/src/dev/arm/generic_timer.cc
@@ -426,7 +426,11 @@
         cpu_count = OLD_CPU_MAX;
     }

-    if (cpu_count != system.threads.size()) {
+    // We cannot assert for equality here because CPU timers are dynamically
+    // created on the first miscreg access. Therefore, if we take the checkpoint
+    // before any timer registers have been accessed, the number of counters
+    // is actually smaller than the total number of CPUs.
+    if (cpu_count > system.threads.size()) {
         fatal("The simulated system has been initialized with %d CPUs, "
               "but the Generic Timer checkpoint expects %d CPUs. Consider "
               "restoring the checkpoint specifying %d CPUs.",

--
To view, visit https://gem5-review.googlesource.com/c/public/gem5/+/31894

Gerrit-Project: public/gem5
Gerrit-Branch: develop
Gerrit-Change-Id: I8bcb05a6faecd4b4845f7fd4d71df95041bf6c99
Gerrit-Change-Number: 31894
Gerrit-PatchSet: 4
Gerrit-Owner: Ciro Santilli
Gerrit-Reviewer: Andreas Sandberg
Gerrit-Reviewer: Ciro Santilli
Gerrit-Reviewer: Giacomo Travaglini
Gerrit-Reviewer: Richard Cooper
Gerrit-Reviewer: kokoro
Gerrit-MessageType: merged
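The effect of relaxing the comparison can be illustrated with a small Python model of the check. This is only a sketch of the condition in the diff: `check_timer_count` and its parameter names are hypothetical, and a Python exception stands in for gem5's `fatal()`.

```python
def check_timer_count(checkpoint_timers, thread_contexts):
    """Model of the relaxed check: fail only if the checkpoint holds more
    timers than the restored system has thread contexts."""
    if checkpoint_timers > thread_contexts:
        raise RuntimeError(
            "The simulated system has been initialized with %d CPUs, "
            "but the Generic Timer checkpoint expects %d CPUs."
            % (thread_contexts, checkpoint_timers))
    return True


# A checkpoint taken before any timer register was touched saves zero
# timers; restoring it on a one-CPU system is now accepted.
print(check_timer_count(0, 1))

# A checkpoint with more timers than CPUs is still rejected.
try:
    check_timer_count(2, 1)
except RuntimeError as err:
    print("rejected:", err)
```

Under the old equality test, the first call would have failed, which is the baremetal case described in the commit message.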
[gem5-dev] Change in gem5/gem5[develop]: gpu-compute: Number of TLBs equal to number of CUs
GAURAV JAIN has uploaded this change for review. ( https://gem5-review.googlesource.com/c/public/gem5/+/32035 )

Change subject: gpu-compute: Number of TLBs equal to number of CUs

gpu-compute: Number of TLBs equal to number of CUs

The n_cu variable in GPUTLBConfig.py did not take the number of CUs
into consideration and instead calculated the number of TLBs using
cu_per_sa, sa_per_complex, and num_gpu_complexes. Thus, changing the
number of CUs (n_cus) without changing any of the other flags resulted
in a segmentation fault, since the required TLBs were not being
instantiated.

Change-Id: I569a4e6dc7db9b7a81aeede5ac68aacc0f400a5e
---
M configs/common/GPUTLBConfig.py
1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/configs/common/GPUTLBConfig.py b/configs/common/GPUTLBConfig.py
index 8e2b1e4..c06bda1 100644
--- a/configs/common/GPUTLBConfig.py
+++ b/configs/common/GPUTLBConfig.py
@@ -74,8 +74,7 @@
         coalescer_name.append(eval(Coalescer_constructor(my_level)))

 def config_tlb_hierarchy(options, system, shader_idx):
-    n_cu = options.cu_per_sa * options.sa_per_complex * \
-           options.num_gpu_complexes
+    n_cu = options.num_compute_units

     if options.TLB_config == "perLane":
         num_TLBs = 64 * n_cu

--
To view, visit https://gem5-review.googlesource.com/c/public/gem5/+/32035

Gerrit-Project: public/gem5
Gerrit-Branch: develop
Gerrit-Change-Id: I569a4e6dc7db9b7a81aeede5ac68aacc0f400a5e
Gerrit-Change-Number: 32035
Gerrit-PatchSet: 1
Gerrit-Owner: GAURAV JAIN
Gerrit-MessageType: newchange
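The mismatch this change describes can be reproduced with plain numbers. The sketch below is not the gem5 config script: the option values are made up for illustration, `n_cu_old`, `n_cu_new` and `num_tlbs` are hypothetical helper names, and only the "perLane" case visible in the diff is modeled.

```python
from types import SimpleNamespace

# Hypothetical option values: the topology flags are left alone while only
# the number of compute units is raised.
options = SimpleNamespace(cu_per_sa=4, sa_per_complex=1, num_gpu_complexes=1,
                          num_compute_units=8, TLB_config="perLane")


def n_cu_old(opts):
    """CU count as derived before the change, from the topology flags."""
    return opts.cu_per_sa * opts.sa_per_complex * opts.num_gpu_complexes


def n_cu_new(opts):
    """CU count as derived after the change, from the CU count option."""
    return opts.num_compute_units


def num_tlbs(opts, n_cu):
    # Only the "perLane" case appears in the diff; other configs are omitted.
    if opts.TLB_config == "perLane":
        return 64 * n_cu
    raise ValueError("TLB_config not covered by this sketch")


print(num_tlbs(options, n_cu_old(options)))  # 256
print(num_tlbs(options, n_cu_new(options)))  # 512
```

With these made-up values, the old derivation sizes the TLB hierarchy for 4 CUs even though 8 were requested, which is the kind of shortfall the commit message blames for the segmentation fault.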
[gem5-dev] Change in gem5/gem5[develop]: gpu-compute: Adding separate class for dynamic-reg-alloc...
GAURAV JAIN has uploaded this change for review. ( https://gem5-review.googlesource.com/c/public/gem5/+/32034 )

Change subject: gpu-compute: Adding separate class for dynamic-reg-allocation

gpu-compute: Adding separate class for dynamic-reg-allocation

SimplePoolManager doesn't allow mapping of two WGs simultaneously on
the same Compute Unit (provided the previous WG has been mapped to all
the SIMDs) even if there is sufficient VRF and SRF space available.
DynPoolManager takes care of that by dynamically allocating and
deallocating register file space to wavefronts.

Change-Id: I2255c68d4b421615d7b231edc05d3ebb27cbd66c
---
M configs/example/apu_se.py
M src/gpu-compute/GPU.py
M src/gpu-compute/SConscript
M src/gpu-compute/compute_unit.cc
M src/gpu-compute/compute_unit.hh
A src/gpu-compute/dyn_pool_manager.cc
A src/gpu-compute/dyn_pool_manager.hh
M src/gpu-compute/pool_manager.hh
M src/gpu-compute/shader.cc
M src/gpu-compute/static_register_manager_policy.cc
10 files changed, 283 insertions(+), 16 deletions(-)

diff --git a/configs/example/apu_se.py b/configs/example/apu_se.py
index 82e4022..2993143 100644
--- a/configs/example/apu_se.py
+++ b/configs/example/apu_se.py
@@ -180,6 +180,8 @@
                   ' m5_switchcpu pseudo-ops will toggle back and forth')
 parser.add_option("--num-hw-queues", type="int", default=10,
                   help="number of hw queues in packet processor")
+parser.add_option("--reg-alloc-policy",type="string", default="simple",
+                  help="register allocation policy (simple/dynamic)")

 Ruby.define_options(parser)

@@ -300,18 +302,28 @@
         for k in xrange(shader.n_wf):
             wavefronts.append(Wavefront(simdId = j, wf_slot_id = k,
                                         wf_size = options.wf_size))
-        vrf_pool_mgrs.append(SimplePoolManager(pool_size = \
+
+        if options.reg_alloc_policy == "simple":
+            vrf_pool_mgrs.append(SimplePoolManager(pool_size = \
                                                options.vreg_file_size,
                                                min_alloc = \
                                                options.vreg_min_alloc))
+            srf_pool_mgrs.append(SimplePoolManager(pool_size = \
+                                               options.sreg_file_size,
+                                               min_alloc = \
+                                               options.vreg_min_alloc))
+        elif options.reg_alloc_policy == "dynamic":
+            vrf_pool_mgrs.append(DynPoolManager(pool_size = \
+                                               options.vreg_file_size,
+                                               min_alloc = \
+                                               options.vreg_min_alloc))
+            srf_pool_mgrs.append(DynPoolManager(pool_size = \
+                                               options.sreg_file_size,
+                                               min_alloc = \
+                                               options.vreg_min_alloc))

         vrfs.append(VectorRegisterFile(simd_id=j, wf_size=options.wf_size,
                                        num_regs=options.vreg_file_size))
-
-        srf_pool_mgrs.append(SimplePoolManager(pool_size = \
-                                               options.sreg_file_size,
-                                               min_alloc = \
-                                               options.vreg_min_alloc))
         srfs.append(ScalarRegisterFile(simd_id=j, wf_size=options.wf_size,
                                        num_regs=options.sreg_file_size))

diff --git a/src/gpu-compute/GPU.py b/src/gpu-compute/GPU.py
index 7408bf9..5d2e6c5 100644
--- a/src/gpu-compute/GPU.py
+++ b/src/gpu-compute/GPU.py
@@ -30,6 +30,7 @@
 # POSSIBILITY OF SUCH DAMAGE.
 #
 # Authors: Steve Reinhardt
+#          Gaurav Jain

 from m5.defines import buildEnv
 from m5.params import *
@@ -67,6 +68,12 @@
     cxx_class = 'SimplePoolManager'
     cxx_header = "gpu-compute/simple_pool_manager.hh"

+## This is for allowing multiple workgroups on one CU
+class DynPoolManager(PoolManager):
+    type = 'DynPoolManager'
+    cxx_class = 'DynPoolManager'
+    cxx_header = "gpu-compute/dyn_pool_manager.hh"
+
 class RegisterFile(SimObject):
     type = 'RegisterFile'
     cxx_class = 'RegisterFile'
diff --git a/src/gpu-compute/SConscript b/src/gpu-compute/SConscript
index 0f1afbc..f242818 100644
--- a/src/gpu-compute/SConscript
+++ b/src/gpu-compute/SConscript
@@ -65,6 +65,7 @@
 Source('scheduler.cc')
 Source('scoreboard_check_stage.cc')
 Source('shader.cc')
+Source('dyn_pool_manager.cc')
 Source('simple_pool_manager.cc')
 Source('static_register_manager_policy.cc')
 Source('tlb_coalescer.cc')
@@ -83,6 +84,7 @@
 DebugFlag('GPUPort')
 DebugFlag('GPUPrefetch')
 DebugFlag('GPUReg')
+DebugFlag('GPURegAlloc')
 DebugFlag('GPURename')