[gem5-dev] [L] Change in gem5/gem5[develop]: arch-riscv: add RV32 ADFIMU_Zfh instruction tests

2022-12-30 Thread Roger Chang (Gerrit) via gem5-dev
Roger Chang has submitted this change. (  
https://gem5-review.googlesource.com/c/public/gem5/+/65533?usp=email )


(35 is the latest approved patch-set. No files were changed between the
latest approved patch-set and the submitted one.)

Change subject: arch-riscv: add RV32 ADFIMU_Zfh instruction tests
..

arch-riscv: add RV32 ADFIMU_Zfh instruction tests

1. Add RV32 binary files to asmtests
2. Add support for RISC-V CPUs with 32-bit registers to simple_binary_run.py

Change-Id: I5cc4c2eeb7654a4acc2d167eb76d8b6522e65dd9
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/65533
Reviewed-by: Yu-hsin Wang 
Reviewed-by: Bobby Bruce 
Tested-by: kokoro 
Maintainer: Bobby Bruce 
---
M tests/gem5/asmtest/tests.py
M tests/gem5/configs/simple_binary_run.py
2 files changed, 228 insertions(+), 160 deletions(-)

Approvals:
  Yu-hsin Wang: Looks good to me, but someone else must approve
  kokoro: Regressions pass
  Bobby Bruce: Looks good to me, approved; Looks good to me, approved




diff --git a/tests/gem5/asmtest/tests.py b/tests/gem5/asmtest/tests.py
index b2a5992..0ddffb2 100644
--- a/tests/gem5/asmtest/tests.py
+++ b/tests/gem5/asmtest/tests.py
@@ -34,156 +34,159 @@
 # The following lists the RISCV binaries. Those commented out presently result
 # in a test failure. This is outlined in the following Jira issue:
 # https://gem5.atlassian.net/browse/GEM5-496
-binaries = (
-"rv64samt-ps-sysclone_d",
-"rv64samt-ps-sysfutex1_d",
+binary_configs = (
+("rv{}samt-ps-sysclone_d", (64,)),
+("rv{}samt-ps-sysfutex1_d", (64,)),
 #'rv64samt-ps-sysfutex2_d',
-"rv64samt-ps-sysfutex3_d",
+("rv{}samt-ps-sysfutex3_d", (64,)),
 #'rv64samt-ps-sysfutex_d',
-"rv64ua-ps-amoadd_d",
-"rv64ua-ps-amoadd_w",
-"rv64ua-ps-amoand_d",
-"rv64ua-ps-amoand_w",
-"rv64ua-ps-amomax_d",
-"rv64ua-ps-amomax_w",
-"rv64ua-ps-amomaxu_d",
-"rv64ua-ps-amomaxu_w",
-"rv64ua-ps-amomin_d",
-"rv64ua-ps-amomin_w",
-"rv64ua-ps-amominu_d",
-"rv64ua-ps-amominu_w",
-"rv64ua-ps-amoor_d",
-"rv64ua-ps-amoor_w",
-"rv64ua-ps-amoswap_d",
-"rv64ua-ps-amoswap_w",
-"rv64ua-ps-amoxor_d",
-"rv64ua-ps-amoxor_w",
-"rv64ua-ps-lrsc",
-"rv64uamt-ps-amoadd_d",
-"rv64uamt-ps-amoand_d",
-"rv64uamt-ps-amomax_d",
-"rv64uamt-ps-amomaxu_d",
-"rv64uamt-ps-amomin_d",
-"rv64uamt-ps-amominu_d",
-"rv64uamt-ps-amoor_d",
-"rv64uamt-ps-amoswap_d",
-"rv64uamt-ps-amoxor_d",
-"rv64uamt-ps-lrsc_d",
-"rv64ud-ps-fadd",
-"rv64ud-ps-fclass",
-"rv64ud-ps-fcmp",
-"rv64ud-ps-fcvt",
-"rv64ud-ps-fcvt_w",
-"rv64ud-ps-fdiv",
-"rv64ud-ps-fmadd",
-"rv64ud-ps-fmin",
-"rv64ud-ps-ldst",
-"rv64ud-ps-move",
-"rv64ud-ps-recoding",
-"rv64ud-ps-structural",
-"rv64uf-ps-fadd",
-"rv64uf-ps-fclass",
-"rv64uf-ps-fcmp",
-"rv64uf-ps-fcvt",
-"rv64uf-ps-fcvt_w",
-"rv64uf-ps-fdiv",
-"rv64uf-ps-fmadd",
-"rv64uf-ps-fmin",
-"rv64uf-ps-ldst",
-"rv64uf-ps-move",
-"rv64uf-ps-recoding",
-"rv64ui-ps-add",
-"rv64ui-ps-addi",
-"rv64ui-ps-addiw",
-"rv64ui-ps-addw",
-"rv64ui-ps-and",
-"rv64ui-ps-andi",
-"rv64ui-ps-auipc",
-"rv64ui-ps-beq",
-"rv64ui-ps-bge",
-"rv64ui-ps-bgeu",
-"rv64ui-ps-blt",
-"rv64ui-ps-bltu",
-"rv64ui-ps-bne",
-"rv64ui-ps-fence_i",
-"rv64ui-ps-jal",
-"rv64ui-ps-jalr",
-"rv64ui-ps-lb",
-"rv64ui-ps-lbu",
-"rv64ui-ps-ld",
-"rv64ui-ps-lh",
-"rv64ui-ps-lhu",
-"rv64ui-ps-lui",
-"rv64ui-ps-lw",
-"rv64ui-ps-lwu",
-"rv64ui-ps-or",
-"rv64ui-ps-ori",
-"rv64ui-ps-sb",
-"rv64ui-ps-sd",
-"rv64ui-ps-sh",
-"rv64ui-ps-simple",
-"rv64ui-ps-sll",
-"rv64ui-ps-slli",
-"rv64ui-ps-slliw",
-"rv64ui-ps-sllw",
-"rv64ui-ps-slt",
-"rv64ui-ps-slti",
-"rv64ui-ps-sltiu",
-"rv64ui-ps-sltu",
-"rv64ui-ps-sra",
-"rv64ui-ps-srai",
-"rv64ui-ps-sraiw",
-"rv64ui-ps-sraw",
-"rv64ui-ps-srl",
-"rv64ui-ps-srli",
-"rv64ui-ps-srliw",
-"rv64ui-ps-srlw",
-"rv64ui-ps-sub",
-"rv64ui-ps-subw",
-"rv64ui-ps-sw",
-"rv64ui-ps-xor",
-"rv64ui-ps-xori",
-"rv64um-ps-div",
-"rv64um-ps-divu",
-"rv64um-ps-divuw",
-"rv64um-ps-divw",
-"rv64um-ps-mul",
-"rv64um-ps-mulh",
-"rv64um-ps-mulhsu",
-"rv64um-ps-mulhu",
-"rv64um-ps-mulw",
-"rv64um-ps-rem",
-"rv64um-ps-remu",
-"rv64um-ps-remuw",
-"rv64um-ps-remw",
-"rv64uzfh-ps-fadd",
-"rv64uzfh-ps-fclass",
-"rv64uzfh-ps-fcmp",
-"rv64uzfh-ps-fcvt",
-"rv64uzfh-ps-fcvt_w",
-"rv64uzfh-ps-fdiv",
-"rv64uzfh-ps-fmadd",
-"rv64uzfh-ps-fmin",
-"rv64uzfh-ps-ldst",
-"rv64uzfh-ps-move",
-"rv64uzfh-ps-recoding",
+("rv{}ua-ps-amoadd_d", (64,)),
+("rv{}ua-ps-amoadd_w", (32, 64)),
+("rv{}ua-ps-amoand_d", (64,)),
+
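The new binary_configs entries pair a name template with the register widths a test is available for. The part of tests.py that consumes them is cut off in this excerpt, so the following is only a minimal sketch of how such pairs could be expanded into concrete binary names; the helper name expand_binaries is hypothetical and not taken from the change.

```python
# Two entries copied from the diff above: (name template, supported widths).
binary_configs = (
    ("rv{}samt-ps-sysclone_d", (64,)),
    ("rv{}ua-ps-amoadd_w", (32, 64)),
)


def expand_binaries(configs):
    """Hypothetical helper: one concrete name per (template, width) pair."""
    return [
        template.format(width)
        for template, widths in configs
        for width in widths
    ]


binaries = expand_binaries(binary_configs)
print(binaries)
```

With this layout a test that exists for both RV32 and RV64 is listed once, and the width tuple decides which binaries are actually run.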

[gem5-dev] [M] Change in gem5/gem5[develop]: arch-vega: Implement ds_write2st64_b64

2022-12-30 Thread Matthew Poremba (Gerrit) via gem5-dev
Matthew Poremba has uploaded this change for review. (  
https://gem5-review.googlesource.com/c/public/gem5/+/67078?usp=email )



Change subject: arch-vega: Implement ds_write2st64_b64
..

arch-vega: Implement ds_write2st64_b64

Write two qwords at offsets multiplied by 8 * 64 bytes.
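As a rough illustration of the offset scaling described above, the sketch below computes the two byte addresses one lane would write to. It assumes the scaled offsets are simply added to the lane's base address; the function and constant names are illustrative and do not appear in the change.

```python
QWORD_BYTES = 8   # size of one qword
ST64_STRIDE = 64  # the "st64" variants scale offsets by 64 elements


def write2st64_b64_addresses(base_addr, offset0, offset1):
    """Return the two byte addresses written for one lane (assumed model)."""
    scale = QWORD_BYTES * ST64_STRIDE  # 8 * 64 = 512 bytes per offset unit
    return base_addr + offset0 * scale, base_addr + offset1 * scale


addr0, addr1 = write2st64_b64_addresses(0x100, 1, 2)
print(hex(addr0), hex(addr1))
```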

Change-Id: I0d0e05f3e848c2fd02d32095e32b7f023bd8803b
---
M src/arch/amdgpu/vega/insts/instructions.cc
M src/arch/amdgpu/vega/insts/instructions.hh
2 files changed, 58 insertions(+), 1 deletion(-)



diff --git a/src/arch/amdgpu/vega/insts/instructions.cc b/src/arch/amdgpu/vega/insts/instructions.cc
index 7594f9c..3ef11c4 100644
--- a/src/arch/amdgpu/vega/insts/instructions.cc
+++ b/src/arch/amdgpu/vega/insts/instructions.cc
@@ -36589,8 +36589,52 @@
 void
 Inst_DS__DS_WRITE2ST64_B64::execute(GPUDynInstPtr gpuDynInst)
 {
-panicUnimplemented();
+Wavefront *wf = gpuDynInst->wavefront();
+
+if (gpuDynInst->exec_mask.none()) {
+wf->decLGKMInstsIssued();
+return;
+}
+
+gpuDynInst->execUnitId = wf->execUnitId;
+gpuDynInst->latency.init(gpuDynInst->computeUnit());
+gpuDynInst->latency.set(
+gpuDynInst->computeUnit()->cyclesToTicks(Cycles(24)));
+ConstVecOperandU32 addr(gpuDynInst, extData.ADDR);
+ConstVecOperandU64 data0(gpuDynInst, extData.DATA0);
+ConstVecOperandU64 data1(gpuDynInst, extData.DATA1);
+
+addr.read();
+data0.read();
+data1.read();
+
+calcAddr(gpuDynInst, addr);
+
+for (int lane = 0; lane < NumVecElemPerVecReg; ++lane) {
+if (gpuDynInst->exec_mask[lane]) {
+(reinterpret_cast<VecElemU64*>(
+gpuDynInst->d_data))[lane * 2] = data0[lane];
+(reinterpret_cast<VecElemU64*>(
+gpuDynInst->d_data))[lane * 2 + 1] = data1[lane];
+}
+}
+
+gpuDynInst->computeUnit()->localMemoryPipe.issueRequest(gpuDynInst);
 } // execute
+
+void
+Inst_DS__DS_WRITE2ST64_B64::initiateAcc(GPUDynInstPtr gpuDynInst)
+{
+Addr offset0 = instData.OFFSET0 * 8 * 64;
+Addr offset1 = instData.OFFSET1 * 8 * 64;
+
+initDualMemWrite(gpuDynInst, offset0, offset1);
+}
+
+void
+Inst_DS__DS_WRITE2ST64_B64::completeAcc(GPUDynInstPtr gpuDynInst)
+{
+}
 // --- Inst_DS__DS_CMPST_B64 class methods ---

 Inst_DS__DS_CMPST_B64::Inst_DS__DS_CMPST_B64(InFmt_DS *iFmt)
diff --git a/src/arch/amdgpu/vega/insts/instructions.hh b/src/arch/amdgpu/vega/insts/instructions.hh
index 9f017f9..2896732 100644
--- a/src/arch/amdgpu/vega/insts/instructions.hh
+++ b/src/arch/amdgpu/vega/insts/instructions.hh
@@ -33572,6 +33572,8 @@
 } // getOperandSize

 void execute(GPUDynInstPtr) override;
+void initiateAcc(GPUDynInstPtr) override;
+void completeAcc(GPUDynInstPtr) override;
 }; // Inst_DS__DS_WRITE2ST64_B64

 class Inst_DS__DS_CMPST_B64 : public Inst_DS

--
To view, visit  
https://gem5-review.googlesource.com/c/public/gem5/+/67078?usp=email
To unsubscribe, or for help writing mail filters, visit  
https://gem5-review.googlesource.com/settings


Gerrit-Project: public/gem5
Gerrit-Branch: develop
Gerrit-Change-Id: I0d0e05f3e848c2fd02d32095e32b7f023bd8803b
Gerrit-Change-Number: 67078
Gerrit-PatchSet: 1
Gerrit-Owner: Matthew Poremba 
Gerrit-MessageType: newchange
___
gem5-dev mailing list -- gem5-dev@gem5.org
To unsubscribe send an email to gem5-dev-le...@gem5.org


[gem5-dev] [M] Change in gem5/gem5[develop]: arch-vega: Read one dword for SGPR base global insts

2022-12-30 Thread Matthew Poremba (Gerrit) via gem5-dev
Matthew Poremba has uploaded this change for review. (  
https://gem5-review.googlesource.com/c/public/gem5/+/67077?usp=email )



Change subject: arch-vega: Read one dword for SGPR base global insts
..

arch-vega: Read one dword for SGPR base global insts

Global instructions in Vega can either use a VGPR base address plus
instruction offset or SGPR base address plus VGPR offset plus
instruction offset. Currently the VGPR address/offset is always read as
two dwords. This causes problems if the VGPR number is the last VGPR
allocated to a wavefront since the second dword would be beyond the
allocation and trip an assert.

This changeset sets the operand size of the VGPR operand to one dword
when SGPR base is used and two dwords otherwise so initDynOperandInfo
does not assert. It also moves the read of the VGPR into the calcAddr
method so that the correct ConstVecOperandU## is used to prevent another
assertion failure when reading from the register file. These two changes
are made to all flat instructions, as global instructions are a
subsegment of flat instructions.
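The two addressing modes described above can be modelled roughly as follows. This is a sketch under stated assumptions, not the code from the change: the function names are made up, registers are modelled as lists of 32-bit dwords (low dword first), and the instruction offset is treated as a plain non-negative integer.

```python
def vgpr_operand_dwords(uses_sgpr_base):
    """Dwords to read for the VGPR operand: 1 with an SGPR base, else 2."""
    return 1 if uses_sgpr_base else 2


def global_address(vgpr_dwords, inst_offset, sgpr_base=None):
    """Assumed address model for one lane of a global instruction."""
    if sgpr_base is not None:
        # SGPR base address + 32-bit VGPR offset + instruction offset.
        return sgpr_base + vgpr_dwords[0] + inst_offset
    # 64-bit VGPR base address (two dwords) + instruction offset.
    vgpr_addr = (vgpr_dwords[1] << 32) | vgpr_dwords[0]
    return vgpr_addr + inst_offset


print(hex(global_address([0x10], 4, sgpr_base=0x1000)))
print(hex(global_address([0x10, 0x1], 4)))
```

In the SGPR-base case only the first dword is ever touched, which is why reading a second dword past the end of the wavefront's VGPR allocation is unnecessary.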

Change-Id: I79030771aa6deec05ffa5853ca2d8b68943ee0a0
---
M src/arch/amdgpu/vega/insts/instructions.cc
M src/arch/amdgpu/vega/insts/instructions.hh
M src/arch/amdgpu/vega/insts/op_encodings.hh
3 files changed, 97 insertions(+), 107 deletions(-)



diff --git a/src/arch/amdgpu/vega/insts/instructions.cc b/src/arch/amdgpu/vega/insts/instructions.cc
index f0fb1aa..7594f9c 100644
--- a/src/arch/amdgpu/vega/insts/instructions.cc
+++ b/src/arch/amdgpu/vega/insts/instructions.cc
@@ -43825,11 +43825,7 @@
 gpuDynInst->latency.init(gpuDynInst->computeUnit());
 gpuDynInst->latency.set(gpuDynInst->computeUnit()->clockPeriod());

-ConstVecOperandU64 addr(gpuDynInst, extData.ADDR);
-
-addr.read();
-
-calcAddr(gpuDynInst, addr, extData.SADDR, instData.OFFSET);
+calcAddr(gpuDynInst, extData.ADDR, extData.SADDR, instData.OFFSET);

 issueRequestHelper(gpuDynInst);
 } // execute
@@ -43913,11 +43909,7 @@
 gpuDynInst->latency.init(gpuDynInst->computeUnit());
 gpuDynInst->latency.set(gpuDynInst->computeUnit()->clockPeriod());

-ConstVecOperandU64 addr(gpuDynInst, extData.ADDR);
-
-addr.read();
-
-calcAddr(gpuDynInst, addr, extData.SADDR, instData.OFFSET);
+calcAddr(gpuDynInst, extData.ADDR, extData.SADDR, instData.OFFSET);

 issueRequestHelper(gpuDynInst);
 } // execute
@@ -44002,11 +43994,7 @@
 gpuDynInst->latency.init(gpuDynInst->computeUnit());
 gpuDynInst->latency.set(gpuDynInst->computeUnit()->clockPeriod());

-ConstVecOperandU64 addr(gpuDynInst, extData.ADDR);
-
-addr.read();
-
-calcAddr(gpuDynInst, addr, extData.SADDR, instData.OFFSET);
+calcAddr(gpuDynInst, extData.ADDR, extData.SADDR, instData.OFFSET);

 issueRequestHelper(gpuDynInst);
 } // execute
@@ -44061,11 +44049,7 @@
 gpuDynInst->latency.init(gpuDynInst->computeUnit());
 gpuDynInst->latency.set(gpuDynInst->computeUnit()->clockPeriod());

-ConstVecOperandU64 addr(gpuDynInst, extData.ADDR);
-
-addr.read();
-
-calcAddr(gpuDynInst, addr, extData.SADDR, instData.OFFSET);
+calcAddr(gpuDynInst, extData.ADDR, extData.SADDR, instData.OFFSET);

 issueRequestHelper(gpuDynInst);
 } // execute
@@ -44120,11 +44104,7 @@
 gpuDynInst->latency.init(gpuDynInst->computeUnit());
 gpuDynInst->latency.set(gpuDynInst->computeUnit()->clockPeriod());

-ConstVecOperandU64 addr(gpuDynInst, extData.ADDR);
-
-addr.read();
-
-calcAddr(gpuDynInst, addr, extData.SADDR, instData.OFFSET);
+calcAddr(gpuDynInst, extData.ADDR, extData.SADDR, instData.OFFSET);

 issueRequestHelper(gpuDynInst);
 } // execute
@@ -44188,11 +44168,7 @@
 gpuDynInst->latency.init(gpuDynInst->computeUnit());
 gpuDynInst->latency.set(gpuDynInst->computeUnit()->clockPeriod());

-ConstVecOperandU64 addr(gpuDynInst, extData.ADDR);
-
-addr.read();
-
-calcAddr(gpuDynInst, addr, extData.SADDR, instData.OFFSET);
+calcAddr(gpuDynInst, extData.ADDR, extData.SADDR, instData.OFFSET);

 issueRequestHelper(gpuDynInst);
 } // execute
@@ -44260,13 +44236,11 @@
 gpuDynInst->latency.init(gpuDynInst->computeUnit());
 gpuDynInst->latency.set(gpuDynInst->computeUnit()->clockPeriod());

-ConstVecOperandU64 addr(gpuDynInst, extData.ADDR);
 ConstVecOperandU8 data(gpuDynInst, extData.DATA);

-addr.read();
 data.read();

-calcAddr(gpuDynInst, addr, extData.SADDR, instData.OFFSET);
+calcAddr(gpuDynInst, extData.ADDR, extData.SADDR, instData.OFFSET);

 for (int lane = 0; lane < NumVecElemPerVecReg; ++lane) {
 if (gpuDynInst->exec_mask[lane]) {
@@ -44319,13 +44293,11 @@
 

[gem5-dev] [S] Change in gem5/gem5[develop]: arch-vega: Add missing operand size for ds_write2st64_b64

2022-12-30 Thread Matthew Poremba (Gerrit) via gem5-dev
Matthew Poremba has uploaded this change for review. (  
https://gem5-review.googlesource.com/c/public/gem5/+/67071?usp=email )



Change subject: arch-vega: Add missing operand size for ds_write2st64_b64
..

arch-vega: Add missing operand size for ds_write2st64_b64

This instruction takes three operands (an address and two data values),
but operand sizes were defined for only two of them, so the third operand
fell through to the default case and triggered the error there.
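The corrected operand-size lookup can be pictured as a small table keyed by operand index. The sketch below mirrors the sizes visible in the diff (4 bytes for the address, 8 bytes for each data operand); the Python function name and the use of IndexError are illustrative choices, not part of gem5.

```python
def ds_write2st64_b64_operand_size(op_idx):
    """Return the operand size in bytes for a given operand index."""
    sizes = {
        0: 4,  # vgpr_a: 32-bit LDS address
        1: 8,  # vgpr_d0: first 64-bit data operand
        2: 8,  # vgpr_d1: second 64-bit data operand
    }
    if op_idx not in sizes:
        raise IndexError(f"op idx {op_idx} out of bounds")
    return sizes[op_idx]


print([ds_write2st64_b64_operand_size(i) for i in range(3)])
```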

Change-Id: I3f505b6432aee5f3f265acac46b83c0c7daff3e7
---
M src/arch/amdgpu/vega/insts/instructions.hh
1 file changed, 16 insertions(+), 1 deletion(-)



diff --git a/src/arch/amdgpu/vega/insts/instructions.hh b/src/arch/amdgpu/vega/insts/instructions.hh
index 0671df8..1c42248 100644
--- a/src/arch/amdgpu/vega/insts/instructions.hh
+++ b/src/arch/amdgpu/vega/insts/instructions.hh
@@ -33553,7 +33553,9 @@
 switch (opIdx) {
   case 0: //vgpr_a
 return 4;
-  case 1: //vgpr_d1
+  case 1: //vgpr_d0
+return 8;
+  case 2: //vgpr_d1
 return 8;
   default:
 fatal("op idx %i out of bounds\n", opIdx);

Gerrit-Project: public/gem5
Gerrit-Branch: develop
Gerrit-Change-Id: I3f505b6432aee5f3f265acac46b83c0c7daff3e7
Gerrit-Change-Number: 67071
Gerrit-PatchSet: 1
Gerrit-Owner: Matthew Poremba 
Gerrit-MessageType: newchange


[gem5-dev] [M] Change in gem5/gem5[develop]: base: Specialize bitwise atomics so FP types can be used

2022-12-30 Thread Matthew Poremba (Gerrit) via gem5-dev
Matthew Poremba has uploaded this change for review. (  
https://gem5-review.googlesource.com/c/public/gem5/+/67073?usp=email )



Change subject: base: Specialize bitwise atomics so FP types can be used
..

base: Specialize bitwise atomics so FP types can be used

The current atomic memory operations are templated so any type can be
used. However, bitwise operations cannot be performed on floating point
types. The GPU model contains some instructions which do atomics on
floating point types, so they need to be supported. To allow this,
template specialization is added to atomic AND, OR, and XOR so that they
do nothing if the type is floating point and operate as normal for
integral types.
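The intended behaviour can be modelled in a few lines. The sketch below is only a behavioural model in Python, not the C++ template mechanism used in the change: a runtime type check stands in for the compile-time selection, and the function names are hypothetical.

```python
import numbers


def atomic_and(mem_value, operand):
    """Bitwise AND for integral values; floating point is left untouched."""
    if isinstance(mem_value, numbers.Integral):
        return mem_value & operand
    return mem_value  # non-integral type: the operation is a no-op


def atomic_xor(mem_value, operand):
    """Bitwise XOR for integral values; floating point is left untouched."""
    if isinstance(mem_value, numbers.Integral):
        return mem_value ^ operand
    return mem_value


print(atomic_and(0b1100, 0b1010), atomic_and(1.5, 3.0))
```

The point of the no-op branch is that the functor can still be instantiated for floating point element types even though the bitwise operators themselves are not defined for them.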

Change-Id: I60f935756355462e99c59a9da032c5bf5afa246c
---
M src/base/amo.hh
1 file changed, 47 insertions(+), 3 deletions(-)



diff --git a/src/base/amo.hh b/src/base/amo.hh
index 81bf069..c990d15 100644
--- a/src/base/amo.hh
+++ b/src/base/amo.hh
@@ -129,30 +129,57 @@
 template<typename T>
 class AtomicOpAnd : public TypedAtomicOpFunctor<T>
 {
+// Bitwise operations are only legal on integral types
+template<typename B>
+typename std::enable_if<std::is_integral<B>::value, void>::type
+executeImpl(B *b) { *b &= a; }
+
+template<typename B>
+typename std::enable_if<!std::is_integral<B>::value, void>::type
+executeImpl(B *b) { }
+
   public:
 T a;
 AtomicOpAnd(T _a) : a(_a) { }
-void execute(T *b) { *b &= a; }
+void execute(T *b) { executeImpl(b); }
 AtomicOpFunctor* clone () { return new AtomicOpAnd(a); }
 };

 template<typename T>
 class AtomicOpOr : public TypedAtomicOpFunctor<T>
 {
+// Bitwise operations are only legal on integral types
+template<typename B>
+typename std::enable_if<std::is_integral<B>::value, void>::type
+executeImpl(B *b) { *b |= a; }
+
+template<typename B>
+typename std::enable_if<!std::is_integral<B>::value, void>::type
+executeImpl(B *b) { }
+
   public:
 T a;
 AtomicOpOr(T _a) : a(_a) { }
-void execute(T *b) { *b |= a; }
+void execute(T *b) { executeImpl(b); }
 AtomicOpFunctor* clone () { return new AtomicOpOr(a); }
 };

 template<typename T>
 class AtomicOpXor : public TypedAtomicOpFunctor<T>
 {
+// Bitwise operations are only legal on integral types
+template<typename B>
+typename std::enable_if<std::is_integral<B>::value, void>::type
+executeImpl(B *b) { *b ^= a; }
+
+template<typename B>
+typename std::enable_if<!std::is_integral<B>::value, void>::type
+executeImpl(B *b) { }
+
   public:
 T a;
 AtomicOpXor(T _a) : a(_a) {}
-void execute(T *b) { *b ^= a; }
+void execute(T *b) { executeImpl(b); }
 AtomicOpFunctor* clone () { return new AtomicOpXor(a); }
 };


Gerrit-Project: public/gem5
Gerrit-Branch: develop
Gerrit-Change-Id: I60f935756355462e99c59a9da032c5bf5afa246c
Gerrit-Change-Number: 67073
Gerrit-PatchSet: 1
Gerrit-Owner: Matthew Poremba 
Gerrit-MessageType: newchange


[gem5-dev] [M] Change in gem5/gem5[develop]: arch-vega: Implement ds_add_u64

2022-12-30 Thread Matthew Poremba (Gerrit) via gem5-dev
Matthew Poremba has uploaded this change for review. (  
https://gem5-review.googlesource.com/c/public/gem5/+/67075?usp=email )



Change subject: arch-vega: Implement ds_add_u64
..

arch-vega: Implement ds_add_u64

This instruction atomically adds unsigned 64-bit data from a VGPR to a
value in LDS, without returning the previous value.
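A small model may help picture the operation for a single lane. The offset combination follows the initiateAcc hunk of this change, where OFFSET1 supplies the upper bits and OFFSET0 the lower eight; everything else (the dict standing in for LDS, the function names) is an assumption made for illustration.

```python
U64_MASK = (1 << 64) - 1


def ds_offset(offset0, offset1):
    """Combine the two 8-bit offset fields into one 16-bit offset."""
    return (offset1 << 8) | offset0


def ds_add_u64(lds, addr, data):
    """MEM[ADDR] += DATA with 64-bit wraparound; nothing is returned."""
    lds[addr] = (lds.get(addr, 0) + data) & U64_MASK


lds = {0x1234: U64_MASK}
ds_add_u64(lds, ds_offset(0x34, 0x12), 1)
print(lds)
```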

Change-Id: I6a7d6713b256607c4e69ddbdef5c83172493c077
---
M src/arch/amdgpu/vega/insts/instructions.cc
M src/arch/amdgpu/vega/insts/instructions.hh
2 files changed, 60 insertions(+), 3 deletions(-)



diff --git a/src/arch/amdgpu/vega/insts/instructions.cc b/src/arch/amdgpu/vega/insts/instructions.cc
index a0308c8..511a767 100644
--- a/src/arch/amdgpu/vega/insts/instructions.cc
+++ b/src/arch/amdgpu/vega/insts/instructions.cc
@@ -36082,6 +36082,10 @@
 Inst_DS__DS_ADD_U64::Inst_DS__DS_ADD_U64(InFmt_DS *iFmt)
 : Inst_DS(iFmt, "ds_add_u64")
 {
+setFlag(MemoryRef);
+setFlag(GroupSegment);
+setFlag(AtomicAdd);
+setFlag(AtomicNoReturn);
 } // Inst_DS__DS_ADD_U64

 Inst_DS__DS_ADD_U64::~Inst_DS__DS_ADD_U64()
@@ -36090,14 +36094,53 @@

 // --- description from .arch file ---
 // 64b:
-// tmp = MEM[ADDR];
 // MEM[ADDR] += DATA[0:1];
-// RETURN_DATA[0:1] = tmp.
 void
 Inst_DS__DS_ADD_U64::execute(GPUDynInstPtr gpuDynInst)
 {
-panicUnimplemented();
+Wavefront *wf = gpuDynInst->wavefront();
+
+if (gpuDynInst->exec_mask.none()) {
+wf->decLGKMInstsIssued();
+return;
+}
+
+gpuDynInst->execUnitId = wf->execUnitId;
+gpuDynInst->latency.init(gpuDynInst->computeUnit());
+gpuDynInst->latency.set(
+gpuDynInst->computeUnit()->cyclesToTicks(Cycles(24)));
+ConstVecOperandU32 addr(gpuDynInst, extData.ADDR);
+ConstVecOperandU64 data(gpuDynInst, extData.DATA0);
+
+addr.read();
+data.read();
+
+calcAddr(gpuDynInst, addr);
+
+for (int lane = 0; lane < NumVecElemPerVecReg; ++lane) {
+if (gpuDynInst->exec_mask[lane]) {
+(reinterpret_cast<VecElemU64*>(gpuDynInst->a_data))[lane]
+= data[lane];
+}
+}
+
+gpuDynInst->computeUnit()->localMemoryPipe.issueRequest(gpuDynInst);
 } // execute
+
+void
+Inst_DS__DS_ADD_U64::initiateAcc(GPUDynInstPtr gpuDynInst)
+{
+Addr offset0 = instData.OFFSET0;
+Addr offset1 = instData.OFFSET1;
+Addr offset = (offset1 << 8) | offset0;
+
+initAtomicAccess(gpuDynInst, offset);
+} // initiateAcc
+
+void
+Inst_DS__DS_ADD_U64::completeAcc(GPUDynInstPtr gpuDynInst)
+{
+} // completeAcc
 // --- Inst_DS__DS_SUB_U64 class methods ---

 Inst_DS__DS_SUB_U64::Inst_DS__DS_SUB_U64(InFmt_DS *iFmt)
diff --git a/src/arch/amdgpu/vega/insts/instructions.hh b/src/arch/amdgpu/vega/insts/instructions.hh
index 05a0002..f8fc98b 100644
--- a/src/arch/amdgpu/vega/insts/instructions.hh
+++ b/src/arch/amdgpu/vega/insts/instructions.hh
@@ -33079,6 +33079,8 @@
 }
 } // getOperandSize

+void initiateAcc(GPUDynInstPtr gpuDynInst) override;
+void completeAcc(GPUDynInstPtr gpuDynInst) override;
 void execute(GPUDynInstPtr) override;
 }; // Inst_DS__DS_ADD_U64


Gerrit-Project: public/gem5
Gerrit-Branch: develop
Gerrit-Change-Id: I6a7d6713b256607c4e69ddbdef5c83172493c077
Gerrit-Change-Number: 67075
Gerrit-PatchSet: 1
Gerrit-Owner: Matthew Poremba 
Gerrit-MessageType: newchange


[gem5-dev] [M] Change in gem5/gem5[develop]: arch-vega: Implement ds_add_f32 atomic

2022-12-30 Thread Matthew Poremba (Gerrit) via gem5-dev
Matthew Poremba has uploaded this change for review. (  
https://gem5-review.googlesource.com/c/public/gem5/+/67074?usp=email )



Change subject: arch-vega: Implement ds_add_f32 atomic
..

arch-vega: Implement ds_add_f32 atomic

This instruction atomically adds a 32-bit float from a VGPR to a value
in LDS, without returning the previous value.
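To see what single-precision accumulation implies, the sketch below rounds every intermediate to 32-bit float using Python's struct module. It is a model of the arithmetic only, under the assumption that LDS can be represented as a dict; the helper names are illustrative and the handling of overflow to infinity is a modelling choice, not something stated in the change.

```python
import math
import struct


def to_f32(x):
    """Round a Python float to the nearest 32-bit float value."""
    try:
        return struct.unpack("<f", struct.pack("<f", x))[0]
    except OverflowError:
        # Finite value too large for binary32: model it as signed infinity.
        return math.copysign(math.inf, x)


def ds_add_f32(lds, addr, data):
    """MEM[ADDR] += DATA in single precision; nothing is returned."""
    lds[addr] = to_f32(to_f32(lds.get(addr, 0.0)) + to_f32(data))


lds = {0: 1.0}
ds_add_f32(lds, 0, 0.5)
print(lds[0])
```

Infinities and NaN pass through the same path, since Python floats follow IEEE 754 for addition.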

Change-Id: Id4f23a1ab587a23edfd1d88ede1cbcc5bdedc0cb
---
M src/arch/amdgpu/vega/insts/instructions.cc
M src/arch/amdgpu/vega/insts/instructions.hh
2 files changed, 60 insertions(+), 3 deletions(-)



diff --git a/src/arch/amdgpu/vega/insts/instructions.cc b/src/arch/amdgpu/vega/insts/instructions.cc
index 5332687..a0308c8 100644
--- a/src/arch/amdgpu/vega/insts/instructions.cc
+++ b/src/arch/amdgpu/vega/insts/instructions.cc
@@ -34749,6 +34749,10 @@
 : Inst_DS(iFmt, "ds_add_f32")
 {
 setFlag(F32);
+setFlag(MemoryRef);
+setFlag(GroupSegment);
+setFlag(AtomicAdd);
+setFlag(AtomicNoReturn);
 } // Inst_DS__DS_ADD_F32

 Inst_DS__DS_ADD_F32::~Inst_DS__DS_ADD_F32()
@@ -34757,15 +34761,54 @@

 // --- description from .arch file ---
 // 32b:
-// tmp = MEM[ADDR];
 // MEM[ADDR] += DATA;
-// RETURN_DATA = tmp.
 // Floating point add that handles NaN/INF/denormal values.
 void
 Inst_DS__DS_ADD_F32::execute(GPUDynInstPtr gpuDynInst)
 {
-panicUnimplemented();
+Wavefront *wf = gpuDynInst->wavefront();
+
+if (gpuDynInst->exec_mask.none()) {
+wf->decLGKMInstsIssued();
+return;
+}
+
+gpuDynInst->execUnitId = wf->execUnitId;
+gpuDynInst->latency.init(gpuDynInst->computeUnit());
+gpuDynInst->latency.set(
+gpuDynInst->computeUnit()->cyclesToTicks(Cycles(24)));
+ConstVecOperandU32 addr(gpuDynInst, extData.ADDR);
+ConstVecOperandF32 data(gpuDynInst, extData.DATA0);
+
+addr.read();
+data.read();
+
+calcAddr(gpuDynInst, addr);
+
+for (int lane = 0; lane < NumVecElemPerVecReg; ++lane) {
+if (gpuDynInst->exec_mask[lane]) {
+(reinterpret_cast<VecElemF32*>(gpuDynInst->a_data))[lane]
+= data[lane];
+}
+}
+
+gpuDynInst->computeUnit()->localMemoryPipe.issueRequest(gpuDynInst);
 } // execute
+
+void
+Inst_DS__DS_ADD_F32::initiateAcc(GPUDynInstPtr gpuDynInst)
+{
+Addr offset0 = instData.OFFSET0;
+Addr offset1 = instData.OFFSET1;
+Addr offset = (offset1 << 8) | offset0;
+
+initAtomicAccess(gpuDynInst, offset);
+} // initiateAcc
+
+void
+Inst_DS__DS_ADD_F32::completeAcc(GPUDynInstPtr gpuDynInst)
+{
+} // completeAcc
 // --- Inst_DS__DS_WRITE_B8 class methods ---

 Inst_DS__DS_WRITE_B8::Inst_DS__DS_WRITE_B8(InFmt_DS *iFmt)
diff --git a/src/arch/amdgpu/vega/insts/instructions.hh b/src/arch/amdgpu/vega/insts/instructions.hh
index 33be33e..05a0002 100644
--- a/src/arch/amdgpu/vega/insts/instructions.hh
+++ b/src/arch/amdgpu/vega/insts/instructions.hh
@@ -31895,6 +31895,8 @@
 }
 } // getOperandSize

+void initiateAcc(GPUDynInstPtr gpuDynInst) override;
+void completeAcc(GPUDynInstPtr gpuDynInst) override;
 void execute(GPUDynInstPtr) override;
 }; // Inst_DS__DS_ADD_F32


Gerrit-Project: public/gem5
Gerrit-Branch: develop
Gerrit-Change-Id: Id4f23a1ab587a23edfd1d88ede1cbcc5bdedc0cb
Gerrit-Change-Number: 67074
Gerrit-PatchSet: 1
Gerrit-Owner: Matthew Poremba 
Gerrit-MessageType: newchange


[gem5-dev] [M] Change in gem5/gem5[develop]: arch-vega: Implement ds_add_u32 atomic

2022-12-30 Thread Matthew Poremba (Gerrit) via gem5-dev
Matthew Poremba has uploaded this change for review. (  
https://gem5-review.googlesource.com/c/public/gem5/+/67072?usp=email )



Change subject: arch-vega: Implement ds_add_u32 atomic
..

arch-vega: Implement ds_add_u32 atomic

This instruction atomically adds unsigned 32-bit data from a VGPR to a
value in LDS, without returning the previous value.
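The per-lane structure of the operation, including the execution mask check seen in the execute hunk, can be sketched as below. The dict standing in for LDS, the list-based mask, and the function name are all assumptions made for the example; lanes are processed one after another here, which is a simplification of how the hardware is modelled.

```python
U32_MASK = 0xFFFFFFFF


def ds_add_u32_wavefront(lds, addrs, data, exec_mask):
    """Apply MEM[ADDR] += DATA for every active lane, with 32-bit wraparound."""
    for lane, active in enumerate(exec_mask):
        if not active:
            continue  # inactive lanes neither read nor write LDS
        addr = addrs[lane]
        lds[addr] = (lds.get(addr, 0) + data[lane]) & U32_MASK


lds = {0: 1, 4: U32_MASK}
ds_add_u32_wavefront(
    lds,
    addrs=[0, 4, 0, 8],
    data=[10, 1, 5, 7],
    exec_mask=[True, True, False, False],
)
print(lds)
```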

Change-Id: I87579a94f6200a9a066f8f7390e57fb5fb6eff8e
---
M src/arch/amdgpu/vega/insts/instructions.cc
M src/arch/amdgpu/vega/insts/instructions.hh
2 files changed, 60 insertions(+), 3 deletions(-)



diff --git a/src/arch/amdgpu/vega/insts/instructions.cc b/src/arch/amdgpu/vega/insts/instructions.cc
index 3570e32..5332687 100644
--- a/src/arch/amdgpu/vega/insts/instructions.cc
+++ b/src/arch/amdgpu/vega/insts/instructions.cc
@@ -34065,6 +34065,10 @@
 Inst_DS__DS_ADD_U32::Inst_DS__DS_ADD_U32(InFmt_DS *iFmt)
 : Inst_DS(iFmt, "ds_add_u32")
 {
+setFlag(MemoryRef);
+setFlag(GroupSegment);
+setFlag(AtomicAdd);
+setFlag(AtomicNoReturn);
 } // Inst_DS__DS_ADD_U32

 Inst_DS__DS_ADD_U32::~Inst_DS__DS_ADD_U32()
@@ -34073,14 +34077,53 @@

 // --- description from .arch file ---
 // 32b:
-// tmp = MEM[ADDR];
 // MEM[ADDR] += DATA;
-// RETURN_DATA = tmp.
 void
 Inst_DS__DS_ADD_U32::execute(GPUDynInstPtr gpuDynInst)
 {
-panicUnimplemented();
+Wavefront *wf = gpuDynInst->wavefront();
+
+if (gpuDynInst->exec_mask.none()) {
+wf->decLGKMInstsIssued();
+return;
+}
+
+gpuDynInst->execUnitId = wf->execUnitId;
+gpuDynInst->latency.init(gpuDynInst->computeUnit());
+gpuDynInst->latency.set(
+gpuDynInst->computeUnit()->cyclesToTicks(Cycles(24)));
+ConstVecOperandU32 addr(gpuDynInst, extData.ADDR);
+ConstVecOperandU32 data(gpuDynInst, extData.DATA0);
+
+addr.read();
+data.read();
+
+calcAddr(gpuDynInst, addr);
+
+for (int lane = 0; lane < NumVecElemPerVecReg; ++lane) {
+if (gpuDynInst->exec_mask[lane]) {
+(reinterpret_cast<VecElemU32*>(gpuDynInst->a_data))[lane]
+= data[lane];
+}
+}
+
+gpuDynInst->computeUnit()->localMemoryPipe.issueRequest(gpuDynInst);
 } // execute
+
+void
+Inst_DS__DS_ADD_U32::initiateAcc(GPUDynInstPtr gpuDynInst)
+{
+Addr offset0 = instData.OFFSET0;
+Addr offset1 = instData.OFFSET1;
+Addr offset = (offset1 << 8) | offset0;
+
+initAtomicAccess(gpuDynInst, offset);
+} // initiateAcc
+
+void
+Inst_DS__DS_ADD_U32::completeAcc(GPUDynInstPtr gpuDynInst)
+{
+} // completeAcc
 // --- Inst_DS__DS_SUB_U32 class methods ---

 Inst_DS__DS_SUB_U32::Inst_DS__DS_SUB_U32(InFmt_DS *iFmt)
diff --git a/src/arch/amdgpu/vega/insts/instructions.hh b/src/arch/amdgpu/vega/insts/instructions.hh
index 1c42248..33be33e 100644
--- a/src/arch/amdgpu/vega/insts/instructions.hh
+++ b/src/arch/amdgpu/vega/insts/instructions.hh
@@ -31211,6 +31211,8 @@
 }
 } // getOperandSize

+void initiateAcc(GPUDynInstPtr gpuDynInst) override;
+void completeAcc(GPUDynInstPtr gpuDynInst) override;
 void execute(GPUDynInstPtr) override;
 }; // Inst_DS__DS_ADD_U32


Gerrit-Project: public/gem5
Gerrit-Branch: develop
Gerrit-Change-Id: I87579a94f6200a9a066f8f7390e57fb5fb6eff8e
Gerrit-Change-Number: 67072
Gerrit-PatchSet: 1
Gerrit-Owner: Matthew Poremba 
Gerrit-MessageType: newchange


[gem5-dev] [M] Change in gem5/gem5[develop]: arch-vega: Implement ds_read_i8

2022-12-30 Thread Matthew Poremba (Gerrit) via gem5-dev
Matthew Poremba has uploaded this change for review. (  
https://gem5-review.googlesource.com/c/public/gem5/+/67076?usp=email )



Change subject: arch-vega: Implement ds_read_i8
..

arch-vega: Implement ds_read_i8

Read one byte from LDS and sign-extend it.
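Sign extension of a loaded byte into a 32-bit destination register can be illustrated with a few lines of integer arithmetic. This is a standalone sketch of the arithmetic only; the function name is hypothetical and is not gem5's sext helper.

```python
def sext8_to_u32(byte):
    """Sign-extend the low 8 bits of `byte` and return the 32-bit pattern."""
    byte &= 0xFF
    # Bit 7 is the sign bit: values 0x80..0xFF represent -128..-1.
    signed = byte - 0x100 if byte & 0x80 else byte
    return signed & 0xFFFFFFFF


for value in (0x7F, 0x80, 0xFF):
    print(hex(value), "->", hex(sext8_to_u32(value)))
```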

Change-Id: I9cb9b4033c6f834241cba944bc7e6a7ebc5401be
---
M src/arch/amdgpu/vega/insts/instructions.cc
M src/arch/amdgpu/vega/insts/instructions.hh
2 files changed, 56 insertions(+), 1 deletion(-)



diff --git a/src/arch/amdgpu/vega/insts/instructions.cc b/src/arch/amdgpu/vega/insts/instructions.cc
index 511a767..f0fb1aa 100644
--- a/src/arch/amdgpu/vega/insts/instructions.cc
+++ b/src/arch/amdgpu/vega/insts/instructions.cc
@@ -35630,8 +35630,50 @@
 void
 Inst_DS__DS_READ_I8::execute(GPUDynInstPtr gpuDynInst)
 {
-panicUnimplemented();
+Wavefront *wf = gpuDynInst->wavefront();
+
+if (gpuDynInst->exec_mask.none()) {
+wf->decLGKMInstsIssued();
+return;
+}
+
+gpuDynInst->execUnitId = wf->execUnitId;
+gpuDynInst->latency.init(gpuDynInst->computeUnit());
+gpuDynInst->latency.set(
+gpuDynInst->computeUnit()->cyclesToTicks(Cycles(24)));
+ConstVecOperandU32 addr(gpuDynInst, extData.ADDR);
+
+addr.read();
+
+calcAddr(gpuDynInst, addr);
+
+gpuDynInst->computeUnit()->localMemoryPipe.issueRequest(gpuDynInst);
 } // execute
+
+void
+Inst_DS__DS_READ_I8::initiateAcc(GPUDynInstPtr gpuDynInst)
+{
+Addr offset0 = instData.OFFSET0;
+Addr offset1 = instData.OFFSET1;
+Addr offset = (offset1 << 8) | offset0;
+
+initMemRead(gpuDynInst, offset);
+} // initiateAcc
+
+void
+Inst_DS__DS_READ_I8::completeAcc(GPUDynInstPtr gpuDynInst)
+{
+VecOperandU32 vdst(gpuDynInst, extData.VDST);
+
+for (int lane = 0; lane < NumVecElemPerVecReg; ++lane) {
+if (gpuDynInst->exec_mask[lane]) {
+vdst[lane] = (VecElemU32)sext<8>((reinterpret_cast(
+gpuDynInst->d_data))[lane]);
+}
+}
+
+vdst.write();
+} // completeAcc
 // --- Inst_DS__DS_READ_U8 class methods ---

 Inst_DS__DS_READ_U8::Inst_DS__DS_READ_U8(InFmt_DS *iFmt)
diff --git a/src/arch/amdgpu/vega/insts/instructions.hh b/src/arch/amdgpu/vega/insts/instructions.hh
index f8fc98b..b2cf2b9 100644
--- a/src/arch/amdgpu/vega/insts/instructions.hh
+++ b/src/arch/amdgpu/vega/insts/instructions.hh
@@ -32848,6 +32848,8 @@
 } // getOperandSize

 void execute(GPUDynInstPtr) override;
+void initiateAcc(GPUDynInstPtr) override;
+void completeAcc(GPUDynInstPtr) override;
 }; // Inst_DS__DS_READ_I8

 class Inst_DS__DS_READ_U8 : public Inst_DS

Gerrit-Project: public/gem5
Gerrit-Branch: develop
Gerrit-Change-Id: I9cb9b4033c6f834241cba944bc7e6a7ebc5401be
Gerrit-Change-Number: 67076
Gerrit-PatchSet: 1
Gerrit-Owner: Matthew Poremba 
Gerrit-MessageType: newchange


[gem5-dev] [M] Change in gem5/gem5[develop]: misc: Update version info for develop branch

2022-12-30 Thread Bobby Bruce (Gerrit) via gem5-dev
Bobby Bruce has uploaded this change for review. (  
https://gem5-review.googlesource.com/c/public/gem5/+/67053?usp=email )



Change subject: misc: Update version info for develop branch
..

misc: Update version info for develop branch

Change-Id: Icd409acda0e88852938b2af9f170e2a410e91f8c
---
M ext/sst/README.md
M ext/testlib/configuration.py
M src/Doxyfile
M src/base/version.cc
M src/python/gem5/resources/downloader.py
M tests/compiler-tests.sh
M tests/jenkins/presubmit.sh
M tests/nightly.sh
M tests/weekly.sh
M util/dockerfiles/docker-compose.yaml
M util/dockerfiles/gcn-gpu/Dockerfile
11 files changed, 46 insertions(+), 37 deletions(-)



diff --git a/ext/sst/README.md b/ext/sst/README.md
index 49f5634..1f37cb4 100644
--- a/ext/sst/README.md
+++ b/ext/sst/README.md
@@ -62,7 +62,7 @@
 Downloading the built bootloader containing a Linux Kernel and a workload,

 ```sh
-wget http://dist.gem5.org/dist/v22-1/misc/riscv/bbl-busybox-boot-exit
+wget http://dist.gem5.org/dist/develop/misc/riscv/bbl-busybox-boot-exit
 ```

 Running the simulation
@@ -87,7 +87,7 @@
 directory):

 ```sh
-wget http://dist.gem5.org/dist/v22-1/arm/aarch-sst-20211207.tar.bz2
+wget http://dist.gem5.org/dist/develop/arm/aarch-sst-20211207.tar.bz2
 tar -xf aarch-sst-20211207.tar.bz2

 # copying bootloaders
diff --git a/ext/testlib/configuration.py b/ext/testlib/configuration.py
index 97c6376..fd47e3b 100644
--- a/ext/testlib/configuration.py
+++ b/ext/testlib/configuration.py
@@ -213,7 +213,7 @@
   os.pardir,
   os.pardir))
 defaults.result_path = os.path.join(os.getcwd(), 'testing-results')
-defaults.resource_url = 'http://dist.gem5.org/dist/v22-1'
+defaults.resource_url = 'http://dist.gem5.org/dist/develop'
 defaults.resource_path = os.path.abspath(os.path.join(defaults.base_dir,
 'tests',
 'gem5',
diff --git a/src/Doxyfile b/src/Doxyfile
index 4d14b7c..24d70bb 100644
--- a/src/Doxyfile
+++ b/src/Doxyfile
@@ -31,7 +31,7 @@
 # This could be handy for archiving the generated documentation or
 # if some version control system is used.

-PROJECT_NUMBER = v22.1.0.0
+PROJECT_NUMBER = [DEVELOP-FOR-23.0]

 # The OUTPUT_DIRECTORY tag is used to specify the (relative or absolute)
 # base path where the generated documentation will be put.
diff --git a/src/base/version.cc b/src/base/version.cc
index 050aea0..8131a31 100644
--- a/src/base/version.cc
+++ b/src/base/version.cc
@@ -32,6 +32,6 @@
 /**
  * @ingroup api_base_utils
  */
-const char *gem5Version = "22.1.0.0";
+const char *gem5Version = "[DEVELOP-FOR-23.0]";

 } // namespace gem5
diff --git a/src/python/gem5/resources/downloader.py b/src/python/gem5/resources/downloader.py
index f619b97..1fda8d8 100644
--- a/src/python/gem5/resources/downloader.py
+++ b/src/python/gem5/resources/downloader.py
@@ -55,7 +55,7 @@
 """
 Specifies the version of resources.json to obtain.
 """
-return "22.1"
+return "develop"


 def _get_resources_json_uri() -> str:
diff --git a/tests/compiler-tests.sh b/tests/compiler-tests.sh
index 044ceb2..f5d4bb1 100755
--- a/tests/compiler-tests.sh
+++ b/tests/compiler-tests.sh
@@ -114,7 +114,7 @@
 # targets for this test
 build_indices=(${build_permutation[@]:0:$builds_count})

-repo_name="${base_url}/${compiler}:v22-1"
+repo_name="${base_url}/${compiler}:latest"

 # Grab compiler image
 docker pull $repo_name >/dev/null
diff --git a/tests/jenkins/presubmit.sh b/tests/jenkins/presubmit.sh
index 36da3fa..91eb95f 100755
--- a/tests/jenkins/presubmit.sh
+++ b/tests/jenkins/presubmit.sh
@@ -37,8 +37,8 @@

 set -e

-DOCKER_IMAGE_ALL_DEP=gcr.io/gem5-test/ubuntu-22.04_all-dependencies:v22-1
-DOCKER_IMAGE_CLANG_COMPILE=gcr.io/gem5-test/clang-version-14:v22-1
+DOCKER_IMAGE_ALL_DEP=gcr.io/gem5-test/ubuntu-22.04_all-dependencies:latest
+DOCKER_IMAGE_CLANG_COMPILE=gcr.io/gem5-test/clang-version-14:latest
 PRESUBMIT_STAGE2=tests/jenkins/presubmit-stage2.sh
 GEM5ART_TESTS=tests/jenkins/gem5art-tests.sh

diff --git a/tests/nightly.sh b/tests/nightly.sh
index bf05154..1360c44 100755
--- a/tests/nightly.sh
+++ b/tests/nightly.sh
@@ -37,7 +37,7 @@

 # The docker tag to use (varies between develop, and versions on the staging
 # branch)
-tag="v22-1"
+tag="latest"

 # The first argument is the number of threads to be used for compilation. If no
 # argument is given we default to one.
diff --git a/tests/weekly.sh b/tests/weekly.sh
index 9b400b9..c7f834b 100755
--- a/tests/weekly.sh
+++ b/tests/weekly.sh
@@ -37,7 +37,7 @@

 # The docker tag to use (varies between develop, and versions on the staging
 # branch)
-tag="v22-1"
+tag="latest"

 # We assume the first two arguments are the number of threads followed by the
 # GPU ISA to test. These default 

[gem5-dev] [M] Change in gem5/gem5[develop]: misc: Merge branch stable into develop branch

2022-12-30 Thread Bobby Bruce (Gerrit) via gem5-dev
Bobby Bruce has uploaded this change for review. (  
https://gem5-review.googlesource.com/c/public/gem5/+/67051?usp=email )



Change subject: misc: Merge branch stable into develop branch
..

misc: Merge branch stable into develop branch

This ensures both branches are in-sync and have not diverged.

Change-Id: Ib487d8596037017b9ec03d7e8a76229373c153db
---
M src/dev/amdgpu/pm4_packet_processor.cc
M src/dev/amdgpu/sdma_engine.cc
M src/dev/amdgpu/sdma_engine.hh
4 files changed, 60 insertions(+), 34 deletions(-)



diff --git a/src/dev/amdgpu/pm4_packet_processor.cc b/src/dev/amdgpu/pm4_packet_processor.cc
index 3c832c5..152fd4d 100644
--- a/src/dev/amdgpu/pm4_packet_processor.cc
+++ b/src/dev/amdgpu/pm4_packet_processor.cc
@@ -458,13 +458,7 @@
 SDMAEngine *sdma_eng = gpuDevice->getSDMAById(pkt->engineSel - 2);

 // Register RLC queue with SDMA
-<<<<<<< HEAD   (fcde59 util: ext/systemc is importing env Environment instead of ma)
 sdma_eng->registerRLCQueue(pkt->doorbellOffset << 2, addr, mqd);
-=======
-sdma_eng->registerRLCQueue(pkt->doorbellOffset << 2,
-   mqd->rb_base << 8, rlc_size,
-   rptr_wb_addr);
->>>>>>> BRANCH (5fa484 misc: Merge the v22.1 release staging into stable)

 // Register doorbell with GPU device
 gpuDevice->setSDMAEngine(pkt->doorbellOffset << 2, sdma_eng);
diff --git a/src/dev/amdgpu/sdma_engine.cc b/src/dev/amdgpu/sdma_engine.cc
index 0a167bf..4c03bf5 100644
--- a/src/dev/amdgpu/sdma_engine.cc
+++ b/src/dev/amdgpu/sdma_engine.cc
@@ -165,12 +165,7 @@
 }

 void
-<<<<<<< HEAD   (fcde59 util: ext/systemc is importing env Environment instead of ma)
 SDMAEngine::registerRLCQueue(Addr doorbell, Addr mqdAddr, SDMAQueueDesc *mqd)
-=======
-SDMAEngine::registerRLCQueue(Addr doorbell, Addr rb_base, uint32_t size,
- Addr rptr_wb_addr)
->>>>>>> BRANCH (5fa484 misc: Merge the v22.1 release staging into stable)
 {
 uint32_t rlc_size = 4UL << bits(mqd->sdmax_rlcx_rb_cntl, 6, 1);
 Addr rptr_wb_addr = mqd->sdmax_rlcx_rb_rptr_addr_hi;
@@ -185,43 +180,25 @@
 rlc0.base(mqd->rb_base << 8);
 rlc0.size(rlc_size);
 rlc0.rptr(0);
-<<<<<<< HEAD   (fcde59 util: ext/systemc is importing env Environment instead of ma)
 rlc0.incRptr(mqd->rptr);
 rlc0.setWptr(mqd->wptr);
-=======
-rlc0.wptr(0);
->>>>>>> BRANCH (5fa484 misc: Merge the v22.1 release staging into stable)
 rlc0.rptrWbAddr(rptr_wb_addr);
 rlc0.processing(false);
-<<<<<<< HEAD   (fcde59 util: ext/systemc is importing env Environment instead of ma)
 rlc0.setMQD(mqd);
 rlc0.setMQDAddr(mqdAddr);
-=======
-rlc0.size(size);
->>>>>>> BRANCH (5fa484 misc: Merge the v22.1 release staging into stable)
 } else if (!rlc1.valid()) {
 DPRINTF(SDMAEngine, "Doorbell %lx mapped to RLC1\n", doorbell);
 rlcInfo[1] = doorbell;
 rlc1.valid(true);
-<<<<<<< HEAD   (fcde59 util: ext/systemc is importing env Environment instead of ma)
 rlc1.base(mqd->rb_base << 8);
 rlc1.size(rlc_size);
 rlc1.rptr(0);
 rlc1.incRptr(mqd->rptr);
 rlc1.setWptr(mqd->wptr);
-=======
-rlc1.base(rb_base);
-rlc1.rptr(0);
-rlc1.wptr(0);
->>>>>>> BRANCH (5fa484 misc: Merge the v22.1 release staging into stable)
 rlc1.rptrWbAddr(rptr_wb_addr);
 rlc1.processing(false);
-<<<<<<< HEAD   (fcde59 util: ext/systemc is importing env Environment instead of ma)
 rlc1.setMQD(mqd);
 rlc1.setMQDAddr(mqdAddr);
-=======
-rlc1.size(size);
->>>>>>> BRANCH (5fa484 misc: Merge the v22.1 release staging into stable)
 } else {
 panic("No free RLCs. Check they are properly unmapped.");
 }
diff --git a/src/dev/amdgpu/sdma_engine.hh b/src/dev/amdgpu/sdma_engine.hh
index 6a12f97..27c1691 100644
--- a/src/dev/amdgpu/sdma_engine.hh
+++ b/src/dev/amdgpu/sdma_engine.hh
@@ -287,12 +287,7 @@
 /**
  * Methods for RLC queues
  */
-<<<<<<< HEAD   (fcde59 util: ext/systemc is importing env Environment instead of ma)
 void registerRLCQueue(Addr doorbell, Addr mqdAddr, SDMAQueueDesc *mqd);
-=======
-void registerRLCQueue(Addr doorbell, Addr rb_base, uint32_t size,
-  Addr rptr_wb_addr);
->>>>>>> BRANCH (5fa484 misc: Merge the v22.1 release staging into stable)
 void unregisterRLCQueue(Addr doorbell);
 void deallocateRLCQueues();


--
To view, visit  
https://gem5-review.googlesource.com/c/public/gem5/+/67051?usp=email


Gerrit-Project: public/gem5
Gerrit-Branch: develop
Gerrit-Change-Id: Ib487d8596037017b9ec03d7e8a76229373c153db
Gerrit-Change-Number: 67051
Gerrit-PatchSet: 1
Gerrit-Owner: Bobby Bruce 
Gerrit-MessageType: newchange

[gem5-dev] [S] Change in gem5/gem5[develop]: scons: Re-add -Werror for gem5 develop branch

2022-12-30 Thread Bobby Bruce (Gerrit) via gem5-dev
Bobby Bruce has uploaded this change for review. (  
https://gem5-review.googlesource.com/c/public/gem5/+/67052?usp=email )



Change subject: scons: Re-add -Werror for gem5 develop branch
..

scons: Re-add -Werror for gem5 develop branch

This is removed from the stable branch to avoid build errors but should be
included on the develop branch to aid developers.

This reverts commit 7dd61c865975862b099e1af5e867083ac9307d9b.

Change-Id: I1fe249ce87aa8d70c1f092fc7db1554e6aee7355
---
M SConstruct
1 file changed, 22 insertions(+), 0 deletions(-)



diff --git a/SConstruct b/SConstruct
index e8107ea..bd26e45 100755
--- a/SConstruct
+++ b/SConstruct
@@ -420,6 +420,14 @@
 conf.CheckLinkFlag('-Wl,--threads')
 conf.CheckLinkFlag(
 '-Wl,--thread-count=%d' % GetOption('num_jobs'))
+
+# Treat warnings as errors but white list some warnings that we
+# want to allow (e.g., deprecation warnings).
+env.Append(CCFLAGS=['-Werror',
+ '-Wno-error=deprecated-declarations',
+ '-Wno-error=deprecated',
+])
+
 else:
 error('\n'.join((
   "Don't know what compiler options to use for your compiler.",

--
To view, visit  
https://gem5-review.googlesource.com/c/public/gem5/+/67052?usp=email


Gerrit-Project: public/gem5
Gerrit-Branch: develop
Gerrit-Change-Id: I1fe249ce87aa8d70c1f092fc7db1554e6aee7355
Gerrit-Change-Number: 67052
Gerrit-PatchSet: 1
Gerrit-Owner: Bobby Bruce 
Gerrit-MessageType: newchange


[gem5-dev] [M] Change in gem5/gem5[release-staging-v22-1]: misc: Update RELEASE-NOTES.md for v22.1.0.0

2022-12-30 Thread Bobby Bruce (Gerrit) via gem5-dev
Bobby Bruce has submitted this change. (  
https://gem5-review.googlesource.com/c/public/gem5/+/66391?usp=email )


Change subject: misc: Update RELEASE-NOTES.md for v22.1.0.0
..

misc: Update RELEASE-NOTES.md for v22.1.0.0

Change-Id: I28753f24742ca156e19ac2af4fb302f9de20e852
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/66391
Reviewed-by: Bobby Bruce 
Reviewed-by: Jason Lowe-Power 
Reviewed-by: Matthew Poremba 
Maintainer: Bobby Bruce 
Reviewed-by: Matt Sinclair 
Tested-by: kokoro 
Reviewed-by: Daniel Carvalho 
---
M RELEASE-NOTES.md
1 file changed, 135 insertions(+), 0 deletions(-)

Approvals:
  Bobby Bruce: Looks good to me, approved; Looks good to me, approved
  Jason Lowe-Power: Looks good to me, approved
  Daniel Carvalho: Looks good to me, but someone else must approve
  Matt Sinclair: Looks good to me, but someone else must approve
  Matthew Poremba: Looks good to me, approved
  kokoro: Regressions pass




diff --git a/RELEASE-NOTES.md b/RELEASE-NOTES.md
index 881285f..c10cc40 100644
--- a/RELEASE-NOTES.md
+++ b/RELEASE-NOTES.md
@@ -1,3 +1,121 @@
+# Version 22.1.0.0
+
+This release has 500 contributions from 48 unique contributors and marks our second major release of 2022.
+This release incorporates several new features, improvements, and bug fixes for the computer architecture research community.
+
+See below for more details!
+
+## New features and improvements
+
+- The gem5 binary can now be compiled to include multiple ISA targets.
+A compilation of gem5 which includes all gem5 ISAs can be created using: `scons build/ALL/gem5.opt`.
+This will use the Ruby `MESI_Two_Level` cache coherence protocol by default; to use other protocols: `scons build/ALL/gem5.opt PROTOCOL=<protocol>`.
+The classic cache system may continue to be used regardless of which Ruby cache coherence protocol is compiled.
+- The `m5` Python module now includes functions to set exit events at particular simulation ticks:
+    - *setMaxTick(tick)* : Used to specify the maximum simulation tick.
+    - *getMaxTick()* : Used to obtain the maximum simulation tick value.
+    - *getTicksUntilMax()* : Used to get the number of ticks remaining until the maximum tick is reached.
+    - *scheduleTickExitFromCurrent(tick)* : Used to schedule an exit event a specified number of ticks in the future.
+    - *scheduleTickExitAbsolute(tick)* : Used to schedule an exit event at a specified tick.
+- We now include the `RiscvMatched` board as part of the gem5 stdlib.
+This board is modeled after the [HiFive Unmatched board](https://www.sifive.com/boards/hifive-unmatched) and may be used to emulate its behavior.
+See "configs/example/gem5_library/riscv-matched-fs.py" and "configs/example/gem5_library/riscv-matched-hello.py" for examples using this board.
+- An API for [SimPoints](https://doi.org/10.1145/885651.781076) has been added.
+SimPoints can substantially improve gem5 simulation time by only simulating representative parts of a simulation then extrapolating statistical data accordingly.
+Examples of using SimPoints with gem5 can be found in "configs/example/gem5_library/checkpoints/simpoints-se-checkpoint.py" and "configs/example/gem5_library/checkpoints/simpoints-se-restore.py".
+- "Workloads" have been introduced to gem5.
+Workloads have been incorporated into the gem5 Standard library.
+They can be used to specify the software to be run on a simulated system, and come complete with input parameters and any other dependencies necessary to run a simulation on the target hardware.
+At the level of the gem5 configuration script a user may specify a workload via a board's `set_workload` function.
+For example, `set_workload(Workload("x86-ubuntu-18.04-boot"))` sets the board to use the "x86-ubuntu-18.04-boot" workload.
+This workload specifies a boot consisting of the Linux 5.4.49 kernel then booting an Ubuntu 18.04 disk image, to exit upon booting.
+Workloads are agnostic to underlying gem5 design and, via the gem5-resources infrastructure, will automatically retrieve all necessary kernels, disk-images, etc., necessary to execute.
+Examples of using gem5 Workloads can be found in "configs/example/gem5_library/x86-ubuntu-ruby.py" and "configs/example/gem5_library/riscv-ubuntu-run.py".
+- To aid gem5 developers, we have incorporated [pre-commit](https://pre-commit.com) checks into gem5.
+These checks automatically enforce the gem5 style guide on Python files and a subset of other requirements (such as line length) on altered code prior to a `git commit`.
+Users may install pre-commit by running `./util/pre-commit-install.sh`.
+Passing these checks is a requirement to submit code to gem5 so installation is strongly advised.
+- A multiprocessing module has been added.
+This allows for multiple simulations to be run from a single gem5  
execution via a single gem5 configuration 

[gem5-dev] issue about the Xbar : waitingForPeer and waitingForLayer

2022-12-30 Thread Felix Guo via gem5-dev
Hi sir,
   I am learning about the gem5 CoherentXBar, and two variables in Xbar.hh
confuse me: waitingForPeer and waitingForLayer.

   From the code flow, I think waitingForPeer is used to point to a
master/memSide port whose peer is busy and cannot accept the request.
It is assigned in the Layer::failedTiming function, which is called in
CoherentXBar::recvTimingReq after memSidePort->tryTiming or
memSidePort->sendTimingReq fails, and the failure happens because the peer is
not ready to accept the request. Am I right here?

   Ports are pushed into waitingForLayer in the BaseXBar::Layer::tryTiming
function when the ReqLayer is busy, which means that the cpuSidePort of the
xbar is not ready to accept the request, and the cpuSidePort that is busy
itself will be pushed into waitingForLayer.

   But waitingForPeer is moved into waitingForLayer in the
BaseXBar::Layer::recvRetry() function, which is called in the
CoherentXBar::recvReqRetry(PortID mem_side_port_id) function, and
recvReqRetry is always called because the peer is ready and has called
sendReqRetry().

From my point of view, when the peer is ready to accept a request and
recvReqRetry of the master port is called, the master port should call
sendTimingReq to resend the request.
But because waitingForPeer is moved into waitingForLayer, the master port
pointed to by waitingForPeer will try to send the request through
Layer::sendRetry, which is called in BaseXBar::Layer::retryWaiting().
Layer::sendRetry is actually implemented in the subclasses ReqLayer,
RespLayer and SnoopRespLayer, where sendRetryReq/sendRetryResp/
sendRetrySnoopResp is called, NOT sendTimingReq.
This does not match the memory protocol when the slave is busy (see "gem5:
Creating SimObjects in the memory system", in case the picture below is not
shown correctly).

I think the reason is that the xbar does not distinguish the two retry
scenarios: when xbar::cpuSidePort is busy, and when the peer of
xbar::memSidePort is busy.
In both scenarios the xbar puts the port into waitingForLayer, as said above,
but the flow of the two scenarios is different.
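
To make the two wait lists easier to discuss, here is a small Python toy model of one layer. It is only a sketch, not gem5's C++ code: the class and method names are illustrative, occupancy timing is left out, and it assumes (from the `failedTiming(src_port, ...)` call described above) that waitingForPeer records the source port the failed packet came from. Under that assumption, the retry sent by the layer goes back to the upstream sender, and it is that sender which would then resend with sendTimingReq.

```python
from collections import deque

IDLE, BUSY, RETRY = "IDLE", "BUSY", "RETRY"


class Layer:
    """Toy model of one crossbar layer (an assumption-laden sketch)."""

    def __init__(self, send_retry):
        self.state = IDLE
        self.waiting_for_peer = None      # source port whose packet the peer refused
        self.waiting_for_layer = deque()  # source ports queued for the layer
        self.send_retry = send_retry      # callback: ask a source port to resend

    def try_timing(self, src_port):
        # Refuse while the layer is occupied or an earlier packet is
        # still waiting for the downstream peer.
        if self.state == BUSY or self.waiting_for_peer is not None:
            self.waiting_for_layer.append(src_port)
            return False
        self.state = BUSY
        return True

    def failed_timing(self, src_port):
        # The downstream peer refused the packet; remember who sent it.
        assert self.state == BUSY and self.waiting_for_peer is None
        self.waiting_for_peer = src_port

    def release(self):
        # The layer's occupancy period is over.
        assert self.state == BUSY
        self.state = IDLE
        if self.waiting_for_layer and self.waiting_for_peer is None:
            self.retry_waiting()

    def recv_retry(self):
        # The downstream peer is ready: the failed sender goes first.
        assert self.waiting_for_peer is not None
        self.waiting_for_layer.appendleft(self.waiting_for_peer)
        self.waiting_for_peer = None
        if self.state == IDLE:
            self.retry_waiting()

    def retry_waiting(self):
        assert self.state == IDLE and self.waiting_for_layer
        self.state = RETRY
        port = self.waiting_for_layer.popleft()
        self.send_retry(port)
        if self.state == RETRY:
            # The port did not resend immediately; hold the layer busy.
            self.state = BUSY


def demo():
    log = []
    layer = Layer(send_retry=log.append)
    assert layer.try_timing("cpu0")      # layer granted to cpu0
    layer.failed_timing("cpu0")          # downstream peer refused the packet
    layer.release()                      # nothing can be retried yet
    assert not layer.try_timing("cpu1")  # cpu1 queues behind cpu0
    layer.recv_retry()                   # peer signals it is ready
    return log, list(layer.waiting_for_layer)
```

In this model `demo()` sends the first retry to "cpu0", the port whose packet was refused, while "cpu1" stays queued in `waiting_for_layer`. Whether that matches the real xbar behavior depends on the assumption stated above about which port waitingForPeer refers to.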
[Attached image image003.jpg not preserved in the archive]

Felix guo
Shannxi Xi’an
