[gem5-dev] Change in gem5/gem5[develop]: arch-x86: Make JRCX instruction do 64-bit jump
Kyle Roarty has uploaded this change for review. ( https://gem5-review.googlesource.com/c/public/gem5/+/40195 ) Change subject: arch-x86: Make JRCX instruction do 64-bit jump .. arch-x86: Make JRCX instruction do 64-bit jump Per the AMD64 Architecture Programming Manual: The size of the count register (CX, ECX, or RCX) depends on the address-size attribute of the JrCXZ instruction. Therefore, JRCXZ can only be executed in 64-bit mode and In 64-bit mode, the operand size defaults to 64 bits. The processor sign-extends the 8-bit displacement value to 64 bits before adding it to the RIP. Change-Id: Id55147d0602ff41ad6aaef483bef722ff56cae62 --- M src/arch/x86/isa/insts/general_purpose/control_transfer/conditional_jump.py 1 file changed, 2 insertions(+), 0 deletions(-) diff --git a/src/arch/x86/isa/insts/general_purpose/control_transfer/conditional_jump.py b/src/arch/x86/isa/insts/general_purpose/control_transfer/conditional_jump.py index 390a08b..420d55b 100644 --- a/src/arch/x86/isa/insts/general_purpose/control_transfer/conditional_jump.py +++ b/src/arch/x86/isa/insts/general_purpose/control_transfer/conditional_jump.py @@ -212,6 +212,8 @@ def macroop JRCX_I { +# Make the default data size of jumps 64 bits in 64 bit mode +.adjust_env oszIn64Override .control_direct rdip t1 -- To view, visit https://gem5-review.googlesource.com/c/public/gem5/+/40195 To unsubscribe, or for help writing mail filters, visit https://gem5-review.googlesource.com/settings Gerrit-Project: public/gem5 Gerrit-Branch: develop Gerrit-Change-Id: Id55147d0602ff41ad6aaef483bef722ff56cae62 Gerrit-Change-Number: 40195 Gerrit-PatchSet: 1 Gerrit-Owner: Kyle Roarty Gerrit-MessageType: newchange ___ gem5-dev mailing list -- gem5-dev@gem5.org To unsubscribe send an email to gem5-dev-le...@gem5.org %(web_page_url)slistinfo%(cgiext)s/%(_internal_name)s
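The displacement arithmetic quoted from the AMD64 manual above can be sketched in a few lines of C++. This is only an illustration of why the operand size matters, not gem5 code: the function names are hypothetical, and the 32-bit variant simply models what truncating the result to a 32-bit operand size would do.

```cpp
#include <cstdint>

// Hypothetical helper (not part of gem5): target of a short jump when the
// operand size is 64 bits. The 8-bit displacement is sign-extended to
// 64 bits and added to the RIP of the next instruction.
constexpr std::uint64_t
shortJumpTarget64(std::uint64_t next_rip, std::uint8_t disp8)
{
    // Reinterpret the byte as signed, then widen; the widening conversion
    // performs the sign extension.
    const auto disp =
        static_cast<std::int64_t>(static_cast<std::int8_t>(disp8));
    return next_rip + static_cast<std::uint64_t>(disp);
}

// Hypothetical model of the same jump with a 32-bit operand size: the
// result is truncated to 32 bits, losing any address bits above 4 GiB.
constexpr std::uint64_t
shortJumpTarget32(std::uint64_t next_rip, std::uint8_t disp8)
{
    return shortJumpTarget64(next_rip, disp8) & 0xFFFFFFFFULL;
}
```

With a next-instruction RIP of 0x100000010 and a displacement byte of 0xFE (that is, -2), the 64-bit version yields 0x10000000E, while the truncated version drops the upper bits of the address.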
[gem5-dev] Change in gem5/gem5[develop]: dev-hsa: enable interruptible hsa signal support
Kyle Roarty has submitted this change. ( https://gem5-review.googlesource.com/c/public/gem5/+/38335 )
Change subject: dev-hsa: enable interruptible hsa signal support
..
dev-hsa: enable interruptible hsa signal support

Event creation and management support from emulated drivers is required to support interruptible signals in HSA, and this support was not available. This changeset adds the event creation and management support in the emulated driver.

With this patch, each interruptible signal created by the HSA runtime is associated with a signal event. The HSA runtime can then put a thread waiting on a signal condition to sleep, asking the driver to monitor the event associated with that signal. If the signal is modified by the GPU, the dispatcher notifies the driver about the signal value change. If the modifier is a CPU thread, the thread will have to make HSA API calls to modify the signal, and these API calls will notify the driver about the signal value change. Once the driver is notified about a change in the signal value, the driver checks to see if any thread is sleeping on that signal and wakes up the sleeping thread associated with that event. The driver has also implemented the time_out wakeup that can wake up the thread after a certain time period has expired. This is also true for barrier packets.

Each signal has an event address in a kernel managed and allocated event page that can be used as a mailbox pointer to notify an event. However, this feature, used by non-CPU agents to communicate with the driver, is not implemented by this changeset because the non-CPU HSA agents in our model can directly communicate with the driver in our implementation. Having said that, adding that feature should be trivial because the event address and event pages are correctly set up by this changeset, and just adding the event page's virtual address to our PIO doorbell interface in the page tables and registering that PIO address to the driver should be sufficient.
Managing mailbox pointer for an event is based on event ID and using this event ID as an index into event page, this changeset already provides a unique mailbox pointer for each event. Change-Id: Ic62794076ddd47526b1f952fdb4c1bad632bdd2e Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/38335 Reviewed-by: Jason Lowe-Power Reviewed-by: Matt Sinclair Maintainer: Matt Sinclair Tested-by: kokoro --- M configs/example/apu_se.py M src/dev/hsa/hsa_device.hh M src/dev/hsa/hsa_driver.cc M src/dev/hsa/hsa_driver.hh M src/dev/hsa/hsa_packet_processor.cc M src/dev/hsa/hsa_packet_processor.hh A src/dev/hsa/hsa_signal.hh M src/dev/hsa/hw_scheduler.cc A src/dev/hsa/kfd_event_defines.h M src/gpu-compute/dispatcher.cc M src/gpu-compute/gpu_command_processor.cc M src/gpu-compute/gpu_command_processor.hh M src/gpu-compute/gpu_compute_driver.cc M src/gpu-compute/gpu_compute_driver.hh M src/sim/emul_driver.hh 15 files changed, 545 insertions(+), 70 deletions(-) Approvals: Jason Lowe-Power: Looks good to me, approved Matt Sinclair: Looks good to me, approved; Looks good to me, approved kokoro: Regressions pass diff --git a/configs/example/apu_se.py b/configs/example/apu_se.py index 7edc733..feed8a7 100644 --- a/configs/example/apu_se.py +++ b/configs/example/apu_se.py @@ -470,7 +470,7 @@ "/usr/lib/x86_64-linux-gnu" ]), 'HOME=%s' % os.getenv('HOME','/'), - "HSA_ENABLE_INTERRUPT=0"] + "HSA_ENABLE_INTERRUPT=1"] process = Process(executable = executable, cmd = [options.cmd] + options.options.split(), drivers = [gpu_driver], env = env) diff --git a/src/dev/hsa/hsa_device.hh b/src/dev/hsa/hsa_device.hh index 68cbd82..6f981d6 100644 --- a/src/dev/hsa/hsa_device.hh +++ b/src/dev/hsa/hsa_device.hh @@ -43,10 +43,13 @@ #include "dev/hsa/hsa_packet_processor.hh" #include "params/HSADevice.hh" +class HSADriver; + class HSADevice : public DmaDevice { public: typedef HSADeviceParams Params; +typedef std::function HsaSignalCallbackFunction; HSADevice(const Params &p) : DmaDevice(p), 
hsaPP(p.hsapp) { @@ -92,7 +95,21 @@ { fatal("%s does not accept vendor specific packets\n", name()); } - +virtual void +attachDriver(HSADriver *driver) +{ +fatal("%s does not need HSA driver\n", name()); +} +virtual void +updateHsaSignal(Addr signal_handle, uint64_t signal_value) +{ +fatal("%s does not have HSA signal update functionality.\n", name()); +} +virtual uint64_t +functionalReadHsaSignal(Addr signal_handle) +{ +fatal("%s does not have HSA signal read functionality.\n", name()); +} void dmaReadVirt(Addr host_addr, unsigned size, DmaCallback *cb, void *data, Tick delay = 0); void dmaWriteVirt(Addr host_addr, unsigned size, DmaCallback *cb, diff --git a/src/dev/hsa/hsa_driver.cc
[gem5-dev] Change in gem5/gem5[develop]: dev-hsa: Add missing include to hsa_driver.hh
Kyle Roarty has uploaded this change for review. ( https://gem5-review.googlesource.com/c/public/gem5/+/40216 ) Change subject: dev-hsa: Add missing include to hsa_driver.hh .. dev-hsa: Add missing include to hsa_driver.hh Due to using ThreadContext::Suspended in hsa_driver.hh as of 965ad12b9a4ae4035b0f63e7ab083ac87258a071, we now need to include cpu/thread_context.hh. This change fixes that. Change-Id: I2c6882f2a29ca1638dd34cda42874b95cafbe548 --- M src/dev/hsa/hsa_driver.hh 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/dev/hsa/hsa_driver.hh b/src/dev/hsa/hsa_driver.hh index fc8131e..616ec94 100644 --- a/src/dev/hsa/hsa_driver.hh +++ b/src/dev/hsa/hsa_driver.hh @@ -54,12 +54,12 @@ #include #include "base/types.hh" +#include "cpu/thread_context.hh" #include "sim/emul_driver.hh" struct HSADriverParams; class HSADevice; class PortProxy; -class ThreadContext; class HSADriver : public EmulatedDriver { -- To view, visit https://gem5-review.googlesource.com/c/public/gem5/+/40216 To unsubscribe, or for help writing mail filters, visit https://gem5-review.googlesource.com/settings Gerrit-Project: public/gem5 Gerrit-Branch: develop Gerrit-Change-Id: I2c6882f2a29ca1638dd34cda42874b95cafbe548 Gerrit-Change-Number: 40216 Gerrit-PatchSet: 1 Gerrit-Owner: Kyle Roarty Gerrit-MessageType: newchange ___ gem5-dev mailing list -- gem5-dev@gem5.org To unsubscribe send an email to gem5-dev-le...@gem5.org %(web_page_url)slistinfo%(cgiext)s/%(_internal_name)s
[gem5-dev] Change in gem5/gem5[develop]: dev-hsa: Add missing include to hsa_driver.hh
Kyle Roarty has submitted this change. ( https://gem5-review.googlesource.com/c/public/gem5/+/40216 ) Change subject: dev-hsa: Add missing include to hsa_driver.hh .. dev-hsa: Add missing include to hsa_driver.hh Due to using ThreadContext::Suspended in hsa_driver.hh as of 965ad12b9a4ae4035b0f63e7ab083ac87258a071, we now need to include cpu/thread_context.hh. This change fixes that. Change-Id: I2c6882f2a29ca1638dd34cda42874b95cafbe548 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/40216 Reviewed-by: Matt Sinclair Reviewed-by: Gabe Black Maintainer: Matt Sinclair Tested-by: kokoro --- M src/dev/hsa/hsa_driver.hh 1 file changed, 1 insertion(+), 1 deletion(-) Approvals: Matt Sinclair: Looks good to me, but someone else must approve; Looks good to me, approved Gabe Black: Looks good to me, approved kokoro: Regressions pass diff --git a/src/dev/hsa/hsa_driver.hh b/src/dev/hsa/hsa_driver.hh index fc8131e..616ec94 100644 --- a/src/dev/hsa/hsa_driver.hh +++ b/src/dev/hsa/hsa_driver.hh @@ -54,12 +54,12 @@ #include #include "base/types.hh" +#include "cpu/thread_context.hh" #include "sim/emul_driver.hh" struct HSADriverParams; class HSADevice; class PortProxy; -class ThreadContext; class HSADriver : public EmulatedDriver { -- To view, visit https://gem5-review.googlesource.com/c/public/gem5/+/40216 To unsubscribe, or for help writing mail filters, visit https://gem5-review.googlesource.com/settings Gerrit-Project: public/gem5 Gerrit-Branch: develop Gerrit-Change-Id: I2c6882f2a29ca1638dd34cda42874b95cafbe548 Gerrit-Change-Number: 40216 Gerrit-PatchSet: 2 Gerrit-Owner: Kyle Roarty Gerrit-Reviewer: Alexandru Duțu Gerrit-Reviewer: Gabe Black Gerrit-Reviewer: Kyle Roarty Gerrit-Reviewer: Matt Sinclair Gerrit-Reviewer: Matthew Poremba Gerrit-Reviewer: kokoro Gerrit-MessageType: merged ___ gem5-dev mailing list -- gem5-dev@gem5.org To unsubscribe send an email to gem5-dev-le...@gem5.org %(web_page_url)slistinfo%(cgiext)s/%(_internal_name)s
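The reason a forward declaration stops being enough can be shown with a small stand-alone sketch. The class names below are hypothetical stand-ins rather than the gem5 types: a forward-declared class can only be used through pointers and references, whereas naming a nested enumerator such as a Suspended status requires the complete class definition, which in a real code base means including the header.

```cpp
// Forward declaration only: sufficient for pointers and references.
class OpaqueContext;   // hypothetical, never defined in this sketch

// Complete definition: required as soon as a nested name is used.
// This is a stand-in for what including the full header would provide.
class SimThreadContext
{
  public:
    enum Status { Active, Suspended, Halted };
};

// Compiles with just the forward declaration, because only a pointer
// to the incomplete type is involved.
constexpr bool
isAttached(const OpaqueContext *ctx)
{
    return ctx != nullptr;
}

// Needs the complete type: SimThreadContext::Suspended is a nested name.
// Writing OpaqueContext::Suspended here would be a compile error.
constexpr bool
isSuspended(SimThreadContext::Status status)
{
    return status == SimThreadContext::Suspended;
}
```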
[gem5-dev] Change in gem5/gem5[develop]: arch-gcn3: Fix sign extension for branches with multiplied offset
Kyle Roarty has uploaded this change for review. ( https://gem5-review.googlesource.com/c/public/gem5/+/41053 ) Change subject: arch-gcn3: Fix sign extension for branches with multiplied offset .. arch-gcn3: Fix sign extension for branches with multiplied offset Certain branch instructions specify that the result of (simm16 * 4) gets sign-extended before being added to the PC. Previously, that result was being sign extended as if it was still a 16-bit number. This patch fixes that by having the result be sign extended as an 18-bit number. Change-Id: Id4d430f8daa71ca7910b570e7e39790626f1decf --- M src/arch/gcn3/insts/instructions.cc 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/src/arch/gcn3/insts/instructions.cc b/src/arch/gcn3/insts/instructions.cc index 03b11ab..29de1a8 100644 --- a/src/arch/gcn3/insts/instructions.cc +++ b/src/arch/gcn3/insts/instructions.cc @@ -3900,7 +3900,7 @@ Addr pc = wf->pc(); ScalarRegI16 simm16 = instData.SIMM16; -pc = pc + ((ScalarRegI64)sext<16>(simm16 * 4LL)) + 4LL; +pc = pc + ((ScalarRegI64)sext<18>(simm16 * 4LL)) + 4LL; wf->pc(pc); } @@ -3946,7 +3946,7 @@ scc.read(); if (!scc.rawData()) { -pc = pc + ((ScalarRegI64)sext<16>(simm16 * 4LL)) + 4LL; +pc = pc + ((ScalarRegI64)sext<18>(simm16 * 4LL)) + 4LL; } wf->pc(pc); @@ -3975,7 +3975,7 @@ scc.read(); if (scc.rawData()) { -pc = pc + ((ScalarRegI64)sext<16>(simm16 * 4LL)) + 4LL; +pc = pc + ((ScalarRegI64)sext<18>(simm16 * 4LL)) + 4LL; } wf->pc(pc); @@ -4005,7 +4005,7 @@ vcc.read(); if (!vcc.rawData()) { -pc = pc + ((ScalarRegI64)sext<16>(simm16 * 4LL)) + 4LL; +pc = pc + ((ScalarRegI64)sext<18>(simm16 * 4LL)) + 4LL; } wf->pc(pc); @@ -4035,7 +4035,7 @@ if (vcc.rawData()) { Addr pc = wf->pc(); ScalarRegI16 simm16 = instData.SIMM16; -pc = pc + ((ScalarRegI64)sext<16>(simm16 * 4LL)) + 4LL; +pc = pc + ((ScalarRegI64)sext<18>(simm16 * 4LL)) + 4LL; wf->pc(pc); } } @@ -4060,7 +4060,7 @@ if (wf->execMask().none()) { Addr pc = wf->pc(); ScalarRegI16 simm16 = instData.SIMM16; -pc = pc 
+ ((ScalarRegI64)sext<16>(simm16 * 4LL)) + 4LL; +pc = pc + ((ScalarRegI64)sext<18>(simm16 * 4LL)) + 4LL; wf->pc(pc); } } @@ -4085,7 +4085,7 @@ if (wf->execMask().any()) { Addr pc = wf->pc(); ScalarRegI16 simm16 = instData.SIMM16; -pc = pc + ((ScalarRegI64)sext<16>(simm16 * 4LL)) + 4LL; +pc = pc + ((ScalarRegI64)sext<18>(simm16 * 4LL)) + 4LL; wf->pc(pc); } } -- To view, visit https://gem5-review.googlesource.com/c/public/gem5/+/41053 To unsubscribe, or for help writing mail filters, visit https://gem5-review.googlesource.com/settings Gerrit-Project: public/gem5 Gerrit-Branch: develop Gerrit-Change-Id: Id4d430f8daa71ca7910b570e7e39790626f1decf Gerrit-Change-Number: 41053 Gerrit-PatchSet: 1 Gerrit-Owner: Kyle Roarty Gerrit-MessageType: newchange ___ gem5-dev mailing list -- gem5-dev@gem5.org To unsubscribe send an email to gem5-dev-le...@gem5.org %(web_page_url)slistinfo%(cgiext)s/%(_internal_name)s
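The effect of the width change can be checked with a small stand-alone sketch. The signExtend template below is a hypothetical stand-in for a sign-extension helper and is not claimed to match gem5's sext implementation; branchOffset is likewise an illustrative name. Multiplying a 16-bit immediate by 4 produces a value that needs 18 bits, so extending from bit 15 misreads any product whose bit 15 is set.

```cpp
#include <cstdint>

// Hypothetical helper: interpret the low N bits of val as an N-bit
// two's-complement number and widen it to 64 bits.
template <int N>
constexpr std::int64_t
signExtend(std::uint64_t val)
{
    static_assert(N > 0 && N < 64, "N must be between 1 and 63");
    const std::uint64_t mask = (std::uint64_t{1} << N) - 1;
    const std::uint64_t sign = std::uint64_t{1} << (N - 1);
    const std::uint64_t low = val & mask;
    // Flipping the sign bit and subtracting it maps the upper half of
    // the N-bit range onto the negative numbers.
    return static_cast<std::int64_t>(low ^ sign) -
           static_cast<std::int64_t>(sign);
}

// Branch offset in bytes for a 16-bit immediate, sign-extended as an
// N-bit number after the multiplication by 4.
template <int N>
constexpr std::int64_t
branchOffset(std::int16_t simm16)
{
    return signExtend<N>(static_cast<std::uint64_t>(simm16 * 4LL));
}
```

For simm16 = 0x2000 the product is 0x8000: the 16-bit extension reads that as -32768, whereas the 18-bit extension gives the intended +32768. Small negative immediates such as -1 happen to give -4 either way, which may be why the problem only shows up for long-range branches.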
[gem5-dev] Change in gem5/gem5[develop]: arch-gcn3: Fix sign extension for branches with multiplied offset
Kyle Roarty has submitted this change. ( https://gem5-review.googlesource.com/c/public/gem5/+/41053 ) Change subject: arch-gcn3: Fix sign extension for branches with multiplied offset .. arch-gcn3: Fix sign extension for branches with multiplied offset Certain branch instructions specify that the result of (simm16 * 4) gets sign-extended before being added to the PC. Previously, that result was being sign extended as if it was still a 16-bit number. This patch fixes that by having the result be sign extended as an 18-bit number. Change-Id: Id4d430f8daa71ca7910b570e7e39790626f1decf Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/41053 Reviewed-by: Matt Sinclair Reviewed-by: Matthew Poremba Maintainer: Matt Sinclair Tested-by: kokoro --- M src/arch/gcn3/insts/instructions.cc 1 file changed, 7 insertions(+), 7 deletions(-) Approvals: Matthew Poremba: Looks good to me, approved Matt Sinclair: Looks good to me, but someone else must approve; Looks good to me, approved kokoro: Regressions pass diff --git a/src/arch/gcn3/insts/instructions.cc b/src/arch/gcn3/insts/instructions.cc index 03b11ab..29de1a8 100644 --- a/src/arch/gcn3/insts/instructions.cc +++ b/src/arch/gcn3/insts/instructions.cc @@ -3900,7 +3900,7 @@ Addr pc = wf->pc(); ScalarRegI16 simm16 = instData.SIMM16; -pc = pc + ((ScalarRegI64)sext<16>(simm16 * 4LL)) + 4LL; +pc = pc + ((ScalarRegI64)sext<18>(simm16 * 4LL)) + 4LL; wf->pc(pc); } @@ -3946,7 +3946,7 @@ scc.read(); if (!scc.rawData()) { -pc = pc + ((ScalarRegI64)sext<16>(simm16 * 4LL)) + 4LL; +pc = pc + ((ScalarRegI64)sext<18>(simm16 * 4LL)) + 4LL; } wf->pc(pc); @@ -3975,7 +3975,7 @@ scc.read(); if (scc.rawData()) { -pc = pc + ((ScalarRegI64)sext<16>(simm16 * 4LL)) + 4LL; +pc = pc + ((ScalarRegI64)sext<18>(simm16 * 4LL)) + 4LL; } wf->pc(pc); @@ -4005,7 +4005,7 @@ vcc.read(); if (!vcc.rawData()) { -pc = pc + ((ScalarRegI64)sext<16>(simm16 * 4LL)) + 4LL; +pc = pc + ((ScalarRegI64)sext<18>(simm16 * 4LL)) + 4LL; } wf->pc(pc); @@ -4035,7 
+4035,7 @@ if (vcc.rawData()) { Addr pc = wf->pc(); ScalarRegI16 simm16 = instData.SIMM16; -pc = pc + ((ScalarRegI64)sext<16>(simm16 * 4LL)) + 4LL; +pc = pc + ((ScalarRegI64)sext<18>(simm16 * 4LL)) + 4LL; wf->pc(pc); } } @@ -4060,7 +4060,7 @@ if (wf->execMask().none()) { Addr pc = wf->pc(); ScalarRegI16 simm16 = instData.SIMM16; -pc = pc + ((ScalarRegI64)sext<16>(simm16 * 4LL)) + 4LL; +pc = pc + ((ScalarRegI64)sext<18>(simm16 * 4LL)) + 4LL; wf->pc(pc); } } @@ -4085,7 +4085,7 @@ if (wf->execMask().any()) { Addr pc = wf->pc(); ScalarRegI16 simm16 = instData.SIMM16; -pc = pc + ((ScalarRegI64)sext<16>(simm16 * 4LL)) + 4LL; +pc = pc + ((ScalarRegI64)sext<18>(simm16 * 4LL)) + 4LL; wf->pc(pc); } } -- To view, visit https://gem5-review.googlesource.com/c/public/gem5/+/41053 To unsubscribe, or for help writing mail filters, visit https://gem5-review.googlesource.com/settings Gerrit-Project: public/gem5 Gerrit-Branch: develop Gerrit-Change-Id: Id4d430f8daa71ca7910b570e7e39790626f1decf Gerrit-Change-Number: 41053 Gerrit-PatchSet: 2 Gerrit-Owner: Kyle Roarty Gerrit-Reviewer: Alexandru Duțu Gerrit-Reviewer: Kyle Roarty Gerrit-Reviewer: Matt Sinclair Gerrit-Reviewer: Matthew Poremba Gerrit-Reviewer: kokoro Gerrit-MessageType: merged ___ gem5-dev mailing list -- gem5-dev@gem5.org To unsubscribe send an email to gem5-dev-le...@gem5.org %(web_page_url)slistinfo%(cgiext)s/%(_internal_name)s
[gem5-dev] Change in gem5/gem5[develop]: gpu-compute: Fix accidental execution when stopped at barrier
Kyle Roarty has uploaded this change for review. ( https://gem5-review.googlesource.com/c/public/gem5/+/41573 )
Change subject: gpu-compute: Fix accidental execution when stopped at barrier
..
gpu-compute: Fix accidental execution when stopped at barrier

Because the compute unit pipeline is executed in reverse order, there exists a scenario where a compute unit will execute an extra instruction when it's supposed to be stopped at a barrier. It occurs as follows:

* The ScheduleStage sets a barrier instruction ready to execute.
* The ScoreboardCheckStage adds another instruction to the readyList. This is where the barrier is checked, but because the barrier isn't executing yet, the instruction can be passed along to the ScheduleStage.
* The barrier executes, and stalls.
* The ScheduleStage sees that there's a new instruction and schedules it to be executed.
* Only now will the ScoreboardCheckStage realize a barrier is active and stall accordingly.
* The subsequent instruction executes.

This patch checks for barrier status in the ScheduleStage to prevent an instruction from being scheduled when there is a barrier active.

Change-Id: Ib683e2c68f361d7ee60a3beaf53b4b6c888c9f8d --- M src/gpu-compute/schedule_stage.cc 1 file changed, 15 insertions(+), 0 deletions(-) diff --git a/src/gpu-compute/schedule_stage.cc b/src/gpu-compute/schedule_stage.cc index 8a2ea18..5c51e76 100644 --- a/src/gpu-compute/schedule_stage.cc +++ b/src/gpu-compute/schedule_stage.cc @@ -106,6 +106,21 @@ wIt++; } } +/** + * Remove any wave that's at a barrier. Due to backwards execution + * of the pipeline, the ScoreboardCheckStage can mark an instruction + * as ready immediately before a barrier executes, which would then + * be executed when the barrier is active without this check.
+ **/ +for (auto wIt = fromScoreboardCheck.readyWFs(j).begin(); + wIt != fromScoreboardCheck.readyWFs(j).end();) { +if ((*wIt)->getStatus() == Wavefront::S_BARRIER) { +*wIt = nullptr; +wIt = fromScoreboardCheck.readyWFs(j).erase(wIt); +} else { +wIt++; +} +} } // Attempt to add another wave for each EXE type to schList queues -- To view, visit https://gem5-review.googlesource.com/c/public/gem5/+/41573 To unsubscribe, or for help writing mail filters, visit https://gem5-review.googlesource.com/settings Gerrit-Project: public/gem5 Gerrit-Branch: develop Gerrit-Change-Id: Ib683e2c68f361d7ee60a3beaf53b4b6c888c9f8d Gerrit-Change-Number: 41573 Gerrit-PatchSet: 1 Gerrit-Owner: Kyle Roarty Gerrit-MessageType: newchange ___ gem5-dev mailing list -- gem5-dev@gem5.org To unsubscribe send an email to gem5-dev-le...@gem5.org %(web_page_url)slistinfo%(cgiext)s/%(_internal_name)s
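The filtering step in this patch follows the usual erase-while-iterating idiom. A minimal stand-alone sketch is given below; the Wave struct, the WaveStatus enum, and the function name are hypothetical stand-ins, and the real code operates on gem5's Wavefront objects and ready lists rather than on these types.

```cpp
#include <cassert>
#include <cstddef>
#include <list>

// Hypothetical stand-ins for the wavefront state used in this sketch.
enum class WaveStatus { Running, AtBarrier };

struct Wave
{
    int id;
    WaveStatus status;
};

// Remove every wave that is currently at a barrier from the ready list
// and return how many entries were dropped.
inline std::size_t
dropWavesAtBarrier(std::list<Wave *> &ready)
{
    std::size_t removed = 0;
    for (auto it = ready.begin(); it != ready.end();) {
        if ((*it)->status == WaveStatus::AtBarrier) {
            // erase() returns the iterator following the removed
            // element, so the loop must not increment again here.
            it = ready.erase(it);
            ++removed;
        } else {
            ++it;
        }
    }
    return removed;
}

// Small usage example: only the wave stopped at the barrier is removed.
inline void
exampleUsage()
{
    Wave a{0, WaveStatus::Running};
    Wave b{1, WaveStatus::AtBarrier};
    std::list<Wave *> ready{&a, &b};
    const std::size_t removed = dropWavesAtBarrier(ready);
    assert(removed == 1);
    assert(ready.front()->id == 0);
}
```

Incrementing the iterator only in the else branch is what keeps the loop valid after an erase.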
[gem5-dev] Change in gem5/gem5[develop]: gpu-compute: Explicitly set driver to NULL in constructor
Kyle Roarty has uploaded this change for review. ( https://gem5-review.googlesource.com/c/public/gem5/+/41973 ) Change subject: gpu-compute: Explicitly set driver to NULL in constructor .. gpu-compute: Explicitly set driver to NULL in constructor We have a fail_if in attachDriver to prevent driver from being overwritten. However, the fail_if only checks for if the driver is not NULL. Previously in some cases, driver was set to garbage, which made the fail_if trip the first time we were assigning the driver. This patch explicitly sets driver to NULL in the constructor, thus ensuring that it will be NULL the first time we call attachDriver Change-Id: I325f6033e785025a912e3af3888c66cee0332f40 --- M src/gpu-compute/gpu_command_processor.cc 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/gpu-compute/gpu_command_processor.cc b/src/gpu-compute/gpu_command_processor.cc index da21076..4901a93 100644 --- a/src/gpu-compute/gpu_command_processor.cc +++ b/src/gpu-compute/gpu_command_processor.cc @@ -42,7 +42,7 @@ #include "sim/syscall_emul_buf.hh" GPUCommandProcessor::GPUCommandProcessor(const Params &p) -: HSADevice(p), dispatcher(*p.dispatcher) +: HSADevice(p), dispatcher(*p.dispatcher), driver(NULL) { dispatcher.setCommandProcessor(this); } -- To view, visit https://gem5-review.googlesource.com/c/public/gem5/+/41973 To unsubscribe, or for help writing mail filters, visit https://gem5-review.googlesource.com/settings Gerrit-Project: public/gem5 Gerrit-Branch: develop Gerrit-Change-Id: I325f6033e785025a912e3af3888c66cee0332f40 Gerrit-Change-Number: 41973 Gerrit-PatchSet: 1 Gerrit-Owner: Kyle Roarty Gerrit-MessageType: newchange ___ gem5-dev mailing list -- gem5-dev@gem5.org To unsubscribe send an email to gem5-dev-le...@gem5.org %(web_page_url)slistinfo%(cgiext)s/%(_internal_name)s
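The failure mode described above comes from a pointer member that is never initialised, so the "already attached" check can see garbage on the first call. A minimal sketch under stated assumptions follows: the class and function names are hypothetical, and attachDriver here returns false where the gem5 code is described as tripping a fail_if.

```cpp
// Hypothetical stand-in for a driver type.
struct DriverSketch {};

class CommandProcessorSketch
{
  public:
    // Initialise the pointer explicitly so the first attach always
    // sees a null driver, whatever memory the object lives in.
    constexpr CommandProcessorSketch() : driver(nullptr) {}

    // Refuse to overwrite an existing driver. Returning false stands in
    // for the fatal check described in the commit message.
    constexpr bool
    attachDriver(DriverSketch *d)
    {
        if (driver != nullptr)
            return false;
        driver = d;
        return true;
    }

  private:
    DriverSketch *driver;
};

// First attach succeeds, second attach is rejected.
constexpr bool
secondAttachIsRejected()
{
    DriverSketch first{};
    DriverSketch second{};
    CommandProcessorSketch cp{};
    return cp.attachDriver(&first) && !cp.attachDriver(&second);
}
```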
[gem5-dev] Change in gem5/gem5[develop]: dev-hsa: Fix size of HSA Queue
Kyle Roarty has uploaded this change for review. ( https://gem5-review.googlesource.com/c/public/gem5/+/42423 )
Change subject: dev-hsa: Fix size of HSA Queue
..
dev-hsa: Fix size of HSA Queue

In the HSAQueueDescriptor ptr function, we mod the index by numElts, but numElts was previously just set to size, which was the raw size of the queue. This led to indexing past the end of the queue. We fix this by dividing the size by the AQL packet size to get the actual number of elements the queue can hold.

Change-Id: Ie5e699379f303255305c279e58a34dc783df86a0 --- M src/dev/hsa/hsa_packet_processor.hh 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/dev/hsa/hsa_packet_processor.hh b/src/dev/hsa/hsa_packet_processor.hh index e79ffb1..8ef5ccd 100644 --- a/src/dev/hsa/hsa_packet_processor.hh +++ b/src/dev/hsa/hsa_packet_processor.hh @@ -84,7 +84,7 @@ uint64_t hri_ptr, uint32_t size) : basePointer(base_ptr), doorbellPointer(db_ptr), writeIndex(0), readIndex(0), -numElts(size), hostReadIndexPtr(hri_ptr), +numElts(size / AQL_PACKET_SIZE), hostReadIndexPtr(hri_ptr), stalledOnDmaBufAvailability(false), dmaInProgress(false) { } -- To view, visit https://gem5-review.googlesource.com/c/public/gem5/+/42423 To unsubscribe, or for help writing mail filters, visit https://gem5-review.googlesource.com/settings Gerrit-Project: public/gem5 Gerrit-Branch: develop Gerrit-Change-Id: Ie5e699379f303255305c279e58a34dc783df86a0 Gerrit-Change-Number: 42423 Gerrit-PatchSet: 1 Gerrit-Owner: Kyle Roarty Gerrit-MessageType: newchange
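The indexing problem can be reproduced with a short sketch. Everything here is an assumption made for illustration: the packet size constant is set to 64 bytes (the AQL packet size in the HSA specification), and the function names and the exact address formula are hypothetical rather than copied from HSAQueueDescriptor.

```cpp
#include <cstdint>

// Assumed for this sketch: one AQL packet occupies 64 bytes.
constexpr std::uint32_t kAqlPacketSize = 64;

// Number of packet slots in a queue whose size is given in bytes.
constexpr std::uint32_t
numSlots(std::uint32_t queue_bytes)
{
    return queue_bytes / kAqlPacketSize;
}

// Address of the slot for a monotonically increasing index, wrapping
// around after the last slot.
constexpr std::uint64_t
slotAddr(std::uint64_t base, std::uint32_t queue_bytes, std::uint64_t index)
{
    return base + (index % numSlots(queue_bytes)) * kAqlPacketSize;
}

// The behaviour before the fix, for comparison: wrapping on the raw
// byte size lets the address run past the end of the queue.
constexpr std::uint64_t
slotAddrUnfixed(std::uint64_t base, std::uint32_t queue_bytes,
                std::uint64_t index)
{
    return base + (index % queue_bytes) * kAqlPacketSize;
}
```

For a 4096-byte queue at base 0x1000 there are 64 slots, so index 64 should wrap to the base address. Wrapping on the byte size instead gives base + 4096, which is the first byte after the queue.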
[gem5-dev] Change in gem5/gem5[develop]: arch-x86: Add insts used in newer libstdc++ rehashing
Kyle Roarty has uploaded this change for review. ( https://gem5-review.googlesource.com/c/public/gem5/+/42443 ) Change subject: arch-x86: Add insts used in newer libstdc++ rehashing .. arch-x86: Add insts used in newer libstdc++ rehashing For newer versions of libstdc++ (Like the one in the ubuntu-20.04_all-dependencies docker image), the variables used when rehashing, e.g., std::unordered_maps have been extended. This resulted in the rehashing function using different, unimplemented, instructions. Because these instructions are unimplemented, it resulted in a std::bad_alloc exception when inserting into an unordered_map This patchset implements the following instructions: FCOMI FSUBRP FISTP Change-Id: I85c57acace1f7a547b0a97ec3a0f0500909c5d2a --- M src/arch/x86/isa/decoder/x87.isa M src/arch/x86/isa/insts/x87/arithmetic/subtraction.py M src/arch/x86/isa/insts/x87/compare_and_test/floating_point_ordered_compare.py M src/arch/x86/isa/insts/x87/data_transfer_and_conversion/convert_and_load_or_store_integer.py M src/arch/x86/isa/microops/ldstop.isa 5 files changed, 65 insertions(+), 8 deletions(-) diff --git a/src/arch/x86/isa/decoder/x87.isa b/src/arch/x86/isa/decoder/x87.isa index 258fcb5..c28bc2f 100644 --- a/src/arch/x86/isa/decoder/x87.isa +++ b/src/arch/x86/isa/decoder/x87.isa @@ -185,7 +185,7 @@ } 0x3: decode MODRM_MOD { 0x3: fcmovnu(); -default: fistp(); +default: Inst::FISTP(Md); // 32-bit int } 0x4: decode MODRM_MOD { 0x3: decode MODRM_RM { @@ -203,7 +203,7 @@ default: Inst::FLD80(M); } 0x6: decode MODRM_MOD { -0x3: fcomi(); +0x3: Inst::FCOMI(Rq); default: Inst::UD2(); } 0x7: decode MODRM_MOD { @@ -307,7 +307,10 @@ default: ficomp(); } 0x4: decode MODRM_MOD { -0x3: fsubrp(); +0x3: decode MODRM_RM { +0x1: Inst::FSUBRP(); +default: Inst::FSUBRP(Eq); +} default: fisub(); } 0x5: decode MODRM_MOD { @@ -344,7 +347,7 @@ } 0x3: decode MODRM_MOD { 0x3: Inst::UD2(); -default: fistp(); +default: Inst::FISTP(Mw); // 16-bit int } 0x4: decode MODRM_MOD { 0x3: decode 
MODRM_RM { @@ -365,7 +368,7 @@ } 0x7: decode MODRM_MOD { 0x3: Inst::UD2(); -default: fistp(); +default: Inst::FISTP(Mq); } } } diff --git a/src/arch/x86/isa/insts/x87/arithmetic/subtraction.py b/src/arch/x86/isa/insts/x87/arithmetic/subtraction.py index 02c41f6..97cdb45 100644 --- a/src/arch/x86/isa/insts/x87/arithmetic/subtraction.py +++ b/src/arch/x86/isa/insts/x87/arithmetic/subtraction.py @@ -91,8 +91,27 @@ fault "std::make_shared()" }; +def macroop FSUBRP +{ +subfp st(1), st(0), st(1), spm=1 +}; + +def macroop FSUBRP_R +{ +subfp sti, st(0), sti, spm=1 +}; + +def macroop FSUBRP_M +{ +fault "std::make_shared()" +}; + +def macroop FSUBRP_P +{ +fault "std::make_shared()" +}; + # FISUB # FSUBR -# FSUBRP # FISUBR ''' diff --git a/src/arch/x86/isa/insts/x87/compare_and_test/floating_point_ordered_compare.py b/src/arch/x86/isa/insts/x87/compare_and_test/floating_point_ordered_compare.py index 5e03952..a3e71e9 100644 --- a/src/arch/x86/isa/insts/x87/compare_and_test/floating_point_ordered_compare.py +++ b/src/arch/x86/isa/insts/x87/compare_and_test/floating_point_ordered_compare.py @@ -37,6 +37,11 @@ # FCOM # FCOMP # FCOMPP -# FCOMI # FCOMIP + +#fcomi +def macroop FCOMI_R { +compfp st(0), sti +}; + ''' diff --git a/src/arch/x86/isa/insts/x87/data_transfer_and_conversion/convert_and_load_or_store_integer.py b/src/arch/x86/isa/insts/x87/data_transfer_and_conversion/convert_and_load_or_store_integer.py index 1dbe79f..515b98b 100644 --- a/src/arch/x86/isa/insts/x87/data_transfer_and_conversion/convert_and_load_or_store_integer.py +++ b/src/arch/x86/isa/insts/x87/data_transfer_and_conversion/convert_and_load_or_store_integer.py @@ -50,6 +50,19 @@ }; # FIST -# FISTP + +def macroop FISTP_M { +movfp ufp1, st(0) +stifp87 ufp1, seg, sib, disp +pop87 +}; + +def macroop FISTP_P { +movfp ufp1, st(0) +rdip t7 +stifp87 ufp1, seg, riprel, disp +pop87 +}; + # FISTTP ''' diff --git a/src/arch/x86/isa/microops/ldstop.isa b/src/arch/x86/isa/microops/ldstop.isa index 79aadfa..0186bc6 
100644 --- a/src/arch/x86/isa/microops/ldstop.isa +++ b/src/arch/x86/isa/microops/ldstop.isa @@ -649,6 +649,23 @@ } ''') +defineMicroStoreOp('Stifp87', code=''' +switch (d
[gem5-dev] Change in gem5/gem5[develop]: mem-ruby: Add missing transitions + wakes for Dma events
Kyle Roarty has uploaded this change for review. ( https://gem5-review.googlesource.com/c/public/gem5/+/42463 ) Change subject: mem-ruby: Add missing transitions + wakes for Dma events .. mem-ruby: Add missing transitions + wakes for Dma events This also changes one of the wakeUpDependents calls to a wakeUpAllDependentsAddr call to prevent a hang. Change-Id: Ia076414e5c6d9c8c0b2576d1f442195d75d275fc --- M src/mem/ruby/protocol/MOESI_AMD_Base-dir.sm 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/src/mem/ruby/protocol/MOESI_AMD_Base-dir.sm b/src/mem/ruby/protocol/MOESI_AMD_Base-dir.sm index 684d03e..4d24891 100644 --- a/src/mem/ruby/protocol/MOESI_AMD_Base-dir.sm +++ b/src/mem/ruby/protocol/MOESI_AMD_Base-dir.sm @@ -1119,7 +1119,7 @@ // The exit state is always going to be U, so wakeUpDependents logic should be covered in all the // transitions which are flowing into U. - transition({BL, BS_M, BM_M, B_M, BP, BDW_P, BS_PM, BM_PM, B_PM, BS_Pm, BM_Pm, B_Pm, B}, {DmaRead,DmaWrite}){ + transition({BL, BDR_M, BS_M, BM_M, B_M, BP, BDR_PM, BDW_P, BS_PM, BM_PM, B_PM, BDR_Pm, BS_Pm, BM_Pm, B_Pm, B}, {DmaRead,DmaWrite}){ sd_stallAndWaitRequest; } @@ -1280,6 +1280,7 @@ transition(BDR_M, MemData, U) { mt_writeMemDataToTBE; dd_sendResponseDmaData; +wa_wakeUpAllDependentsAddr; dt_deallocateTBE; pm_popMemQueue; } @@ -1373,7 +1374,7 @@ dd_sendResponseDmaData; // Check for pending requests from the core we put to sleep while waiting // for a response -wa_wakeUpDependents; +wa_wakeUpAllDependentsAddr; dt_deallocateTBE; pt_popTriggerQueue; } -- To view, visit https://gem5-review.googlesource.com/c/public/gem5/+/42463 To unsubscribe, or for help writing mail filters, visit https://gem5-review.googlesource.com/settings Gerrit-Project: public/gem5 Gerrit-Branch: develop Gerrit-Change-Id: Ia076414e5c6d9c8c0b2576d1f442195d75d275fc Gerrit-Change-Number: 42463 Gerrit-PatchSet: 1 Gerrit-Owner: Kyle Roarty Gerrit-MessageType: newchange ___ gem5-dev mailing list -- gem5-dev@gem5.org 
[gem5-dev] Change in gem5/gem5[develop]: arch-x86: Add insts used in newer libstdc++ rehashing
Kyle Roarty has submitted this change. ( https://gem5-review.googlesource.com/c/public/gem5/+/42443 )
Change subject: arch-x86: Add insts used in newer libstdc++ rehashing
..
arch-x86: Add insts used in newer libstdc++ rehashing

For newer versions of libstdc++ (like the one in the ubuntu-20.04_all-dependencies docker image), the variables used when rehashing containers such as std::unordered_map have been extended. This resulted in the rehashing function using different, unimplemented instructions. Because these instructions are unimplemented, inserting into an unordered_map resulted in a std::bad_alloc exception.

This patchset implements the following instructions:

FCOMI, a floating point comparison instruction, is implemented using the compfp microop. The implementation mirrors that of the FUCOMI instruction (another floating point comparison instruction).

FSUBRP, a reverse subtraction instruction, is implemented using the subfp microop like FSUBP, but with the operands flipped accordingly.

FISTP, an instruction to convert a float to int and then store, is implemented by using a conversion microop (cvtf_d2i) and then a store. The cvtf_d2i microop is rewritten to handle multiple data sizes, as is required by the FISTP instruction.
Change-Id: I85c57acace1f7a547b0a97ec3a0f0500909c5d2a Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/42443 Reviewed-by: Gabe Black Maintainer: Gabe Black Tested-by: kokoro --- M src/arch/x86/insts/microfpop.hh M src/arch/x86/isa/decoder/x87.isa M src/arch/x86/isa/insts/x87/arithmetic/subtraction.py M src/arch/x86/isa/insts/x87/compare_and_test/floating_point_ordered_compare.py M src/arch/x86/isa/insts/x87/data_transfer_and_conversion/convert_and_load_or_store_integer.py M src/arch/x86/isa/microops/fpop.isa 6 files changed, 37 insertions(+), 13 deletions(-) Approvals: Gabe Black: Looks good to me, approved; Looks good to me, approved kokoro: Regressions pass diff --git a/src/arch/x86/insts/microfpop.hh b/src/arch/x86/insts/microfpop.hh index e9d32da..245a899 100644 --- a/src/arch/x86/insts/microfpop.hh +++ b/src/arch/x86/insts/microfpop.hh @@ -54,6 +54,7 @@ const RegIndex dest; const uint8_t dataSize; const int8_t spm; +RegIndex foldOBit; // Constructor FpOp(ExtMachInst _machInst, @@ -66,7 +67,9 @@ __opClass), src1(_src1.index()), src2(_src2.index()), dest(_dest.index()), dataSize(_dataSize), spm(_spm) -{} +{ +foldOBit = (dataSize == 1 && !_machInst.rex.present) ? 
1 << 6 : 0; +} std::string generateDisassembly( Addr pc, const Loader::SymbolTable *symtab) const override; diff --git a/src/arch/x86/isa/decoder/x87.isa b/src/arch/x86/isa/decoder/x87.isa index 258fcb5..e7f1747 100644 --- a/src/arch/x86/isa/decoder/x87.isa +++ b/src/arch/x86/isa/decoder/x87.isa @@ -185,7 +185,7 @@ } 0x3: decode MODRM_MOD { 0x3: fcmovnu(); -default: fistp(); +default: Inst::FISTP(Md); } 0x4: decode MODRM_MOD { 0x3: decode MODRM_RM { @@ -203,7 +203,7 @@ default: Inst::FLD80(M); } 0x6: decode MODRM_MOD { -0x3: fcomi(); +0x3: Inst::FCOMI(Rq); default: Inst::UD2(); } 0x7: decode MODRM_MOD { @@ -307,7 +307,7 @@ default: ficomp(); } 0x4: decode MODRM_MOD { -0x3: fsubrp(); +0x3: Inst::FSUBRP(Rq); default: fisub(); } 0x5: decode MODRM_MOD { @@ -344,7 +344,7 @@ } 0x3: decode MODRM_MOD { 0x3: Inst::UD2(); -default: fistp(); +default: Inst::FISTP(Mw); } 0x4: decode MODRM_MOD { 0x3: decode MODRM_RM { @@ -365,7 +365,7 @@ } 0x7: decode MODRM_MOD { 0x3: Inst::UD2(); -default: fistp(); +default: Inst::FISTP(Mq); } } } diff --git a/src/arch/x86/isa/insts/x87/arithmetic/subtraction.py b/src/arch/x86/isa/insts/x87/arithmetic/subtraction.py index 02c41f6..dea1277 100644 --- a/src/arch/x86/isa/insts/x87/arithmetic/subtraction.py +++ b/src/arch/x86/isa/insts/x87/arithmetic/subtraction.py @@ -91,8 +91,12 @@ fault "std::make_shared()" }; +def macroop FSUBRP_R +{ +subfp sti, st(0), sti, spm=1 +}; + # FISUB # FSUBR -# FSUBRP # FISUBR ''' diff --git a/src/arch/x86/isa/insts/x87/compare_and_test/floating_point_ordered_compare.py b/src/arch/x86/isa/insts/x87/compare_and_test/floating_point_ordered_compare.py index 5e03952..cd348cd 100644 --- a/src/arch/x86/isa/insts/x87/compare_and_test/floating_point_ordered_compare.py +++ b/src/arch
[gem5-dev] Change in gem5/gem5[develop]: dev-hsa,gpu-compute: Fix override for updateHsaSignal
Kyle Roarty has uploaded this change for review. ( https://gem5-review.googlesource.com/c/public/gem5/+/44046 ) Change subject: dev-hsa,gpu-compute: Fix override for updateHsaSignal .. dev-hsa,gpu-compute: Fix override for updateHsaSignal Change 965ad12 removed a parameter from the updateHsaSignal function. Change 25e8a14 added the parameter back, but only for the derived class, breaking the override. This patch adds that parameter back to the base class, fixing the override. Change-Id: Id1e96e29ca4be7f3ce244bac83a112e3250812d1 --- M src/dev/hsa/hsa_device.hh M src/gpu-compute/gpu_command_processor.hh 2 files changed, 3 insertions(+), 2 deletions(-) diff --git a/src/dev/hsa/hsa_device.hh b/src/dev/hsa/hsa_device.hh index 157c459..d722a5d 100644 --- a/src/dev/hsa/hsa_device.hh +++ b/src/dev/hsa/hsa_device.hh @@ -101,7 +101,8 @@ fatal("%s does not need HSA driver\n", name()); } virtual void -updateHsaSignal(Addr signal_handle, uint64_t signal_value) +updateHsaSignal(Addr signal_handle, uint64_t signal_value, +HsaSignalCallbackFunction function = [ = ] (const uint64_t &) { }) { fatal("%s does not have HSA signal update functionality.\n", name()); } diff --git a/src/gpu-compute/gpu_command_processor.hh b/src/gpu-compute/gpu_command_processor.hh index c78ae0b..67cda7d 100644 --- a/src/gpu-compute/gpu_command_processor.hh +++ b/src/gpu-compute/gpu_command_processor.hh @@ -90,7 +90,7 @@ void updateHsaSignal(Addr signal_handle, uint64_t signal_value, HsaSignalCallbackFunction function = -[] (const uint64_t &) { }); +[] (const uint64_t &) { }) override; uint64_t functionalReadHsaSignal(Addr signal_handle) override; -- To view, visit https://gem5-review.googlesource.com/c/public/gem5/+/44046 To unsubscribe, or for help writing mail filters, visit https://gem5-review.googlesource.com/settings Gerrit-Project: public/gem5 Gerrit-Branch: develop Gerrit-Change-Id: Id1e96e29ca4be7f3ce244bac83a112e3250812d1 Gerrit-Change-Number: 44046 Gerrit-PatchSet: 1 Gerrit-Owner: Kyle Roarty 
Gerrit-MessageType: newchange
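The override breakage described in this change can be reproduced in isolation. The following is a minimal sketch, assuming hypothetical `BrokenBase`/`BrokenDerived` and `FixedBase`/`FixedDerived` classes rather than the real gem5 device classes. When only the derived class declares the extra defaulted parameter, its function hides the base function instead of overriding it, so a call through a base reference never reaches the derived code. Giving the base the same parameter restores virtual dispatch, and `override` lets the compiler check it.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <string>

// Hypothetical callback type, standing in for the one used in the patch.
using Callback = std::function<void(const uint64_t &)>;

// Broken pairing: the base takes two parameters, the derived class three.
struct BrokenBase
{
    virtual ~BrokenBase() = default;
    virtual std::string update(uint64_t, uint64_t) { return "base"; }
};

struct BrokenDerived : BrokenBase
{
    // Different signature, so this hides BrokenBase::update instead of
    // overriding it. Marking it `override` would fail to compile.
    virtual std::string
    update(uint64_t, uint64_t, Callback = [](const uint64_t &) {})
    {
        return "derived";
    }
};

// Fixed pairing: both classes declare the same three parameters.
struct FixedBase
{
    virtual ~FixedBase() = default;
    virtual std::string
    update(uint64_t, uint64_t, Callback = [](const uint64_t &) {})
    {
        return "base";
    }
};

struct FixedDerived : FixedBase
{
    std::string
    update(uint64_t, uint64_t, Callback = [](const uint64_t &) {}) override
    {
        return "derived";
    }
};

// Both helpers call through a reference to the base type.
std::string callThroughBase(BrokenBase &dev) { return dev.update(0x1000, 1); }
std::string callThroughBase(FixedBase &dev) { return dev.update(0x1000, 1); }

void checkDispatch()
{
    BrokenDerived broken;
    FixedDerived fixed;
    assert(callThroughBase(broken) == "base");    // derived code is skipped
    assert(callThroughBase(fixed) == "derived");  // virtual dispatch works
}
```

Calling `checkDispatch()` exercises both pairings. Note that default arguments are bound statically, so the base and derived defaults are best kept identical.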
[gem5-dev] Change in gem5/gem5[develop]: gpu-compute: Fix scalar register ready check
Kyle Roarty has uploaded this change for review. ( https://gem5-review.googlesource.com/c/public/gem5/+/44045 ) Change subject: gpu-compute: Fix scalar register ready check .. gpu-compute: Fix scalar register ready check Restores the curly braces that were accidentally removed from the busy-register check. Without them, the function returned false even when it shouldn't. Change-Id: I15fb4167468c8e3dd1107f1ca3dc98c48df4611b --- M src/gpu-compute/scalar_register_file.cc 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/src/gpu-compute/scalar_register_file.cc b/src/gpu-compute/scalar_register_file.cc index 5fa7a62..14ea3fe 100644 --- a/src/gpu-compute/scalar_register_file.cc +++ b/src/gpu-compute/scalar_register_file.cc @@ -52,11 +52,12 @@ { for (const auto& srcScalarOp : ii->srcScalarRegOperands()) { for (const auto& physIdx : srcScalarOp.physIndices()) { -if (regBusy(physIdx)) +if (regBusy(physIdx)) { DPRINTF(GPUSRF, "RAW stall: WV[%d]: %s: physReg[%d]\n", w->wfDynId, ii->disassemble(), physIdx); w->stats.numTimesBlockedDueRAWDependencies++; return false; +} } } -- To view, visit https://gem5-review.googlesource.com/c/public/gem5/+/44045 To unsubscribe, or for help writing mail filters, visit https://gem5-review.googlesource.com/settings Gerrit-Project: public/gem5 Gerrit-Branch: develop Gerrit-Change-Id: I15fb4167468c8e3dd1107f1ca3dc98c48df4611b Gerrit-Change-Number: 44045 Gerrit-PatchSet: 1 Gerrit-Owner: Kyle Roarty Gerrit-MessageType: newchange
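The effect of the missing braces can be shown with a small stand-alone sketch. `ToyRegFile`, `noteStall`, and the two `ready*` methods are hypothetical names for illustration only, not gem5 code. In the buggy shape the `if` guards only the first statement, so the stall counter and the `return false` run on the first source register regardless of its state.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a register file: busy[i] is true while
// register i still has a pending write.
struct ToyRegFile
{
    std::vector<bool> busy;
    int rawStalls = 0;

    void noteStall(int) {}  // placeholder for the debug print

    // Buggy shape: only noteStall() is guarded by the if.
    bool readyBuggy(const std::vector<int> &srcs)
    {
        for (int idx : srcs) {
            if (busy[idx])
                noteStall(idx);
            rawStalls++;   // runs unconditionally
            return false;  // runs unconditionally
        }
        return true;
    }

    // Fixed shape: the braces make all three statements conditional.
    bool readyFixed(const std::vector<int> &srcs)
    {
        for (int idx : srcs) {
            if (busy[idx]) {
                noteStall(idx);
                rawStalls++;
                return false;
            }
        }
        return true;
    }
};

void checkReady()
{
    ToyRegFile rf;
    rf.busy = {false, false};
    assert(!rf.readyBuggy({0, 1}));  // wrongly reports a stall
    assert(rf.readyFixed({0, 1}));   // no register is busy
    rf.busy[1] = true;
    assert(!rf.readyFixed({0, 1}));  // genuine stall is still detected
}
```

Running `checkReady()` shows the buggy version stalling with no busy registers, while the fixed version stalls only once a source register is marked busy.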
[gem5-dev] Change in gem5/gem5[develop]: arch-gcn3: Read registers in execute instead of initiateAcc
Kyle Roarty has uploaded this change for review. ( https://gem5-review.googlesource.com/c/public/gem5/+/45345 ) Change subject: arch-gcn3: Read registers in execute instead of initiateAcc .. arch-gcn3: Read registers in execute instead of initiateAcc Certain memory writes were reading their registers in initiateAcc, which led to scenarios where a subsequent instruction would execute, clobbering the value in that register before the memory write's initiateAcc method was called, causing the memory write to read wrong data. This patch moves all register reads to execute, preventing the above scenario from happening. Change-Id: Iee107c19e4b82c2e172bf2d6cc95b79983a43d83 --- M src/arch/amdgpu/gcn3/insts/instructions.cc 1 file changed, 116 insertions(+), 125 deletions(-) diff --git a/src/arch/amdgpu/gcn3/insts/instructions.cc b/src/arch/amdgpu/gcn3/insts/instructions.cc index 8cadff7..4ae4c29 100644 --- a/src/arch/amdgpu/gcn3/insts/instructions.cc +++ b/src/arch/amdgpu/gcn3/insts/instructions.cc @@ -5065,8 +5065,13 @@ gpuDynInst->latency.set(gpuDynInst->computeUnit()->clockPeriod()); ScalarRegU32 offset(0); ConstScalarOperandU64 addr(gpuDynInst, instData.SBASE << 1); +ConstScalarOperandU32 sdata(gpuDynInst, instData.SDATA); addr.read(); +sdata.read(); + +std::memcpy((void*)gpuDynInst->scalar_data, sdata.rawDataPtr(), +sizeof(ScalarRegU32)); if (instData.IMM) { offset = extData.OFFSET; @@ -5090,10 +5095,6 @@ void Inst_SMEM__S_STORE_DWORD::initiateAcc(GPUDynInstPtr gpuDynInst) { -ConstScalarOperandU32 sdata(gpuDynInst, instData.SDATA); -sdata.read(); -std::memcpy((void*)gpuDynInst->scalar_data, sdata.rawDataPtr(), -sizeof(ScalarRegU32)); initMemWrite<1>(gpuDynInst); } // initiateAcc @@ -5124,8 +5125,13 @@ gpuDynInst->latency.set(gpuDynInst->computeUnit()->clockPeriod()); ScalarRegU32 offset(0); ConstScalarOperandU64 addr(gpuDynInst, instData.SBASE << 1); +ConstScalarOperandU64 sdata(gpuDynInst, instData.SDATA); addr.read(); +sdata.read(); + 
+std::memcpy((void*)gpuDynInst->scalar_data, sdata.rawDataPtr(), +sizeof(ScalarRegU64)); if (instData.IMM) { offset = extData.OFFSET; @@ -5149,10 +5155,6 @@ void Inst_SMEM__S_STORE_DWORDX2::initiateAcc(GPUDynInstPtr gpuDynInst) { -ConstScalarOperandU64 sdata(gpuDynInst, instData.SDATA); -sdata.read(); -std::memcpy((void*)gpuDynInst->scalar_data, sdata.rawDataPtr(), -sizeof(ScalarRegU64)); initMemWrite<2>(gpuDynInst); } // initiateAcc @@ -5183,8 +5185,13 @@ gpuDynInst->latency.set(gpuDynInst->computeUnit()->clockPeriod()); ScalarRegU32 offset(0); ConstScalarOperandU64 addr(gpuDynInst, instData.SBASE << 1); +ConstScalarOperandU128 sdata(gpuDynInst, instData.SDATA); addr.read(); +sdata.read(); + +std::memcpy((void*)gpuDynInst->scalar_data, sdata.rawDataPtr(), +4 * sizeof(ScalarRegU32)); if (instData.IMM) { offset = extData.OFFSET; @@ -5208,10 +5215,6 @@ void Inst_SMEM__S_STORE_DWORDX4::initiateAcc(GPUDynInstPtr gpuDynInst) { -ConstScalarOperandU128 sdata(gpuDynInst, instData.SDATA); -sdata.read(); -std::memcpy((void*)gpuDynInst->scalar_data, sdata.rawDataPtr(), -4 * sizeof(ScalarRegU32)); initMemWrite<4>(gpuDynInst); } // initiateAcc @@ -35743,9 +35746,18 @@ ConstVecOperandU32 addr1(gpuDynInst, extData.VADDR + 1); ConstScalarOperandU128 rsrcDesc(gpuDynInst, extData.SRSRC * 4); ConstScalarOperandU32 offset(gpuDynInst, extData.SOFFSET); +ConstVecOperandI8 data(gpuDynInst, extData.VDATA); rsrcDesc.read(); offset.read(); +data.read(); + +for (int lane = 0; lane < NumVecElemPerVecReg; ++lane) { +if (gpuDynInst->exec_mask[lane]) { +(reinterpret_cast(gpuDynInst->d_data))[lane] += data[lane]; +} +} int inst_offset = instData.OFFSET; @@ -35790,16 +35802,6 @@ void Inst_MUBUF__BUFFER_STORE_BYTE::initiateAcc(GPUDynInstPtr gpuDynInst) { -ConstVecOperandI8 data(gpuDynInst, extData.VDATA); -data.read(); - -for (int lane = 0; lane < NumVecElemPerVecReg; ++lane) { -if (gpuDynInst->exec_mask[lane]) { -(reinterpret_cast(gpuDynInst->d_data))[lane] -= data[lane]; -} -} - 
initMemWrite(gpuDynInst); } // initiateAcc @@ -35839,9 +35841,18 @@ ConstVecOperandU32 addr1(gpuDynInst, extData.VADDR + 1); ConstScalarOperandU128 rsrcDesc(gpuDynInst, extData.SRSRC * 4);
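The hazard this change addresses can be sketched without any gem5 types. In the toy model below, `ToyStore`, `executeSnapshot`, `initiateAccLate`, and `initiateAccStaged` are hypothetical names: the first method copies the source register into a staging field at execute time (the role `scalar_data` and `d_data` play in the diff), while the second reads the register file only when the access is initiated.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical store instruction that writes one 32-bit register to memory.
struct ToyStore
{
    int srcReg = 0;
    uint32_t staged = 0;  // per-instruction staging buffer

    // New behaviour: snapshot the register value during execute.
    void executeSnapshot(const uint32_t *regs)
    {
        std::memcpy(&staged, &regs[srcReg], sizeof(staged));
    }

    // Old behaviour: read the register only when the access starts.
    uint32_t initiateAccLate(const uint32_t *regs) const
    {
        return regs[srcReg];
    }

    // New behaviour: the access uses the value captured at execute.
    uint32_t initiateAccStaged() const { return staged; }
};

void checkSnapshot()
{
    uint32_t regs[4] = {0, 42, 0, 0};
    ToyStore st;
    st.srcReg = 1;

    st.executeSnapshot(regs);  // the store executes and captures 42
    regs[1] = 7;               // a later instruction clobbers the register

    assert(st.initiateAccLate(regs) == 7);   // late read sees wrong data
    assert(st.initiateAccStaged() == 42);    // snapshot keeps the value
}
```

The design choice is simply to copy operands at the point where program order still guarantees their contents, rather than at a later pipeline step.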
[gem5-dev] Change in gem5/gem5[develop]: arch-gcn3,gpu-compute: Set gpuDynInst exec_mask before use
Kyle Roarty has uploaded this change for review. ( https://gem5-review.googlesource.com/c/public/gem5/+/45346 ) Change subject: arch-gcn3,gpu-compute: Set gpuDynInst exec_mask before use .. arch-gcn3,gpu-compute: Set gpuDynInst exec_mask before use vector_register_file uses the exec_mask of a memory instruction to determine whether it should mark a register as in use. Previously, the exec_mask of a memory instruction was only set when that instruction executed, which occurs after the code in vector_register_file runs. This led to the code reading potentially garbage data, so a register could be marked as used when it shouldn't be. This fix sets the exec_mask of memory instructions in schedule_stage, which works because the wavefront execMask() is only updated when an instruction executes, and we know the previous instruction will have executed by the time schedule_stage executes, due to the order in which the pipeline stages are executed. This also undoes part of a patch from last year (62ec973), which treated the symptom of accidental register allocation without preventing the registers from being allocated in the first place. 
This patch also removes now redundant code that sets the exec_mask in instructions.cc for memory instructions Change-Id: Idabd3502764fb06133ac2458606c1aaf6f04 --- M src/arch/amdgpu/gcn3/insts/instructions.cc M src/gpu-compute/schedule_stage.cc 2 files changed, 29 insertions(+), 155 deletions(-) diff --git a/src/arch/amdgpu/gcn3/insts/instructions.cc b/src/arch/amdgpu/gcn3/insts/instructions.cc index 4ae4c29..a5f28e3 100644 --- a/src/arch/amdgpu/gcn3/insts/instructions.cc +++ b/src/arch/amdgpu/gcn3/insts/instructions.cc @@ -31240,7 +31240,6 @@ { Wavefront *wf = gpuDynInst->wavefront(); gpuDynInst->execUnitId = wf->execUnitId; -gpuDynInst->exec_mask = wf->execMask(); gpuDynInst->latency.init(gpuDynInst->computeUnit()); gpuDynInst->latency.set( gpuDynInst->computeUnit()->cyclesToTicks(Cycles(24))); @@ -31301,7 +31300,6 @@ { Wavefront *wf = gpuDynInst->wavefront(); gpuDynInst->execUnitId = wf->execUnitId; -gpuDynInst->exec_mask = wf->execMask(); gpuDynInst->latency.init(gpuDynInst->computeUnit()); gpuDynInst->latency.set( gpuDynInst->computeUnit()->cyclesToTicks(Cycles(24))); @@ -31365,7 +31363,6 @@ { Wavefront *wf = gpuDynInst->wavefront(); gpuDynInst->execUnitId = wf->execUnitId; -gpuDynInst->exec_mask = wf->execMask(); gpuDynInst->latency.init(gpuDynInst->computeUnit()); gpuDynInst->latency.set( gpuDynInst->computeUnit()->cyclesToTicks(Cycles(24))); @@ -31545,7 +31542,6 @@ { Wavefront *wf = gpuDynInst->wavefront(); gpuDynInst->execUnitId = wf->execUnitId; -gpuDynInst->exec_mask = wf->execMask(); gpuDynInst->latency.init(gpuDynInst->computeUnit()); gpuDynInst->latency.set( gpuDynInst->computeUnit()->cyclesToTicks(Cycles(24))); @@ -31605,7 +31601,6 @@ { Wavefront *wf = gpuDynInst->wavefront(); gpuDynInst->execUnitId = wf->execUnitId; -gpuDynInst->exec_mask = wf->execMask(); gpuDynInst->latency.init(gpuDynInst->computeUnit()); gpuDynInst->latency.set( gpuDynInst->computeUnit()->cyclesToTicks(Cycles(24))); @@ -32070,7 +32065,6 @@ { Wavefront *wf = 
gpuDynInst->wavefront(); gpuDynInst->execUnitId = wf->execUnitId; -gpuDynInst->exec_mask = wf->execMask(); gpuDynInst->latency.init(gpuDynInst->computeUnit()); gpuDynInst->latency.set( gpuDynInst->computeUnit()->cyclesToTicks(Cycles(24))); @@ -32132,7 +32126,6 @@ { Wavefront *wf = gpuDynInst->wavefront(); gpuDynInst->execUnitId = wf->execUnitId; -gpuDynInst->exec_mask = wf->execMask(); gpuDynInst->latency.init(gpuDynInst->computeUnit()); gpuDynInst->latency.set( gpuDynInst->computeUnit()->cyclesToTicks(Cycles(24))); @@ -32197,7 +32190,6 @@ { Wavefront *wf = gpuDynInst->wavefront(); gpuDynInst->execUnitId = wf->execUnitId; -gpuDynInst->exec_mask = wf->execMask(); gpuDynInst->latency.init(gpuDynInst->computeUnit()); gpuDynInst->latency.set( gpuDynInst->computeUnit()->cyclesToTicks(Cycles(24))); @@ -32281,7 +32273,6 @@ { Wavefront *wf = gpuDynInst->wavefront(); gpuDynInst->execUnitId = wf->execUnitId; -gpuDynInst->exec_mask = wf->execMask(); gpuDynInst->latency.init(gpuDynInst->computeUnit()); gpuDynInst->latency.set( gpuDynInst->computeUnit()->cyclesToTicks(Cycles(24))); @@ -32362,7 +32353,6 @@ { Wavefront *wf = gpuDynInst->wavefront();
[gem5-dev] Change in gem5/gem5[develop]: arch-gcn3,arch-vega,gpu-compute: Move request counters
Kyle Roarty has uploaded this change for review. ( https://gem5-review.googlesource.com/c/public/gem5/+/45347 ) Change subject: arch-gcn3,arch-vega,gpu-compute: Move request counters .. arch-gcn3,arch-vega,gpu-compute: Move request counters When the Vega ISA got committed, it lacked the request counter tracking for memory requests that existed in the GCN3 code. Instead of copying over the same lines from the GCN3 code to the Vega code, this commit makes the various memory pipelines handle updating the request counter information instead, as every memory instruction calls a memory pipeline. This commit also adds an issueRequest in scalar_memory_pipeline, as previously, the gpuDynInsts were explicitly placed in the queue of issuedRequests. Change-Id: I5140d3b2f12be582f2ae9ff7c433167aeec5b68e --- M src/arch/amdgpu/gcn3/insts/instructions.cc M src/arch/amdgpu/vega/insts/instructions.cc M src/gpu-compute/global_memory_pipeline.cc M src/gpu-compute/local_memory_pipeline.cc M src/gpu-compute/scalar_memory_pipeline.cc M src/gpu-compute/scalar_memory_pipeline.hh 6 files changed, 82 insertions(+), 408 deletions(-) diff --git a/src/arch/amdgpu/gcn3/insts/instructions.cc b/src/arch/amdgpu/gcn3/insts/instructions.cc index a5f28e3..a51354e 100644 --- a/src/arch/amdgpu/gcn3/insts/instructions.cc +++ b/src/arch/amdgpu/gcn3/insts/instructions.cc @@ -4494,12 +4494,7 @@ calcAddr(gpuDynInst, addr, offset); gpuDynInst->computeUnit()->scalarMemoryPipe -.getGMReqFIFO().push(gpuDynInst); - -wf->scalarRdGmReqsInPipe--; -wf->scalarOutstandingReqsRdGm++; -gpuDynInst->wavefront()->outstandingReqs++; -gpuDynInst->wavefront()->validateRequestCounters(); +.issueRequest(gpuDynInst); } void @@ -4553,12 +4548,7 @@ calcAddr(gpuDynInst, addr, offset); gpuDynInst->computeUnit()->scalarMemoryPipe. 
-getGMReqFIFO().push(gpuDynInst); - -wf->scalarRdGmReqsInPipe--; -wf->scalarOutstandingReqsRdGm++; -gpuDynInst->wavefront()->outstandingReqs++; -gpuDynInst->wavefront()->validateRequestCounters(); +issueRequest(gpuDynInst); } void @@ -4610,12 +4600,7 @@ calcAddr(gpuDynInst, addr, offset); gpuDynInst->computeUnit()->scalarMemoryPipe. -getGMReqFIFO().push(gpuDynInst); - -wf->scalarRdGmReqsInPipe--; -wf->scalarOutstandingReqsRdGm++; -gpuDynInst->wavefront()->outstandingReqs++; -gpuDynInst->wavefront()->validateRequestCounters(); +issueRequest(gpuDynInst); } void @@ -4667,12 +4652,7 @@ calcAddr(gpuDynInst, addr, offset); gpuDynInst->computeUnit()->scalarMemoryPipe. -getGMReqFIFO().push(gpuDynInst); - -wf->scalarRdGmReqsInPipe--; -wf->scalarOutstandingReqsRdGm++; -gpuDynInst->wavefront()->outstandingReqs++; -gpuDynInst->wavefront()->validateRequestCounters(); +issueRequest(gpuDynInst); } void @@ -4724,12 +4704,7 @@ calcAddr(gpuDynInst, addr, offset); gpuDynInst->computeUnit()->scalarMemoryPipe. 
-getGMReqFIFO().push(gpuDynInst); - -wf->scalarRdGmReqsInPipe--; -wf->scalarOutstandingReqsRdGm++; -gpuDynInst->wavefront()->outstandingReqs++; -gpuDynInst->wavefront()->validateRequestCounters(); +issueRequest(gpuDynInst); } void @@ -4782,12 +4757,7 @@ calcAddr(gpuDynInst, rsrcDesc, offset); gpuDynInst->computeUnit()->scalarMemoryPipe -.getGMReqFIFO().push(gpuDynInst); - -wf->scalarRdGmReqsInPipe--; -wf->scalarOutstandingReqsRdGm++; -gpuDynInst->wavefront()->outstandingReqs++; -gpuDynInst->wavefront()->validateRequestCounters(); +.issueRequest(gpuDynInst); } // execute void @@ -4841,12 +4811,7 @@ calcAddr(gpuDynInst, rsrcDesc, offset); gpuDynInst->computeUnit()->scalarMemoryPipe -.getGMReqFIFO().push(gpuDynInst); - -wf->scalarRdGmReqsInPipe--; -wf->scalarOutstandingReqsRdGm++; -gpuDynInst->wavefront()->outstandingReqs++; -gpuDynInst->wavefront()->validateRequestCounters(); +.issueRequest(gpuDynInst); } // execute void @@ -4900,12 +4865,7 @@ calcAddr(gpuDynInst, rsrcDesc, offset); gpuDynInst->computeUnit()->scalarMemoryPipe -.getGMReqFIFO().push(gpuDynInst); - -wf->scalarRdGmReqsInPipe--; -wf->scalarOutstandingReqsRdGm++; -gpuDynInst->wavefront()->outstandingReqs++; -gpuDynInst->wavefront()->validateRequestCounters(); +.issueRequest(gpuDynInst); } // execute void @@ -4959,12 +4919,7 @@ calcAddr(gpuDynInst, rsrcDesc, offset); gpuDynInst->computeUnit()->scalarMemoryPipe -.getGMRe
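The refactoring in this change moves the counter bookkeeping out of each instruction and into the pipeline. A minimal sketch of that idea follows, assuming hypothetical `ToyWavefront`, `ToyInst`, and `ToyScalarPipe` types. The counter names are taken from the diff, but only the scalar-read case is modeled here, and the body of `issueRequest` is an assumption about what such a method might do, not the gem5 implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <queue>

// Hypothetical per-wavefront counters, named after those in the diff.
struct ToyWavefront
{
    int scalarRdGmReqsInPipe = 0;
    int scalarOutstandingReqsRdGm = 0;
    int outstandingReqs = 0;
};

struct ToyInst
{
    ToyWavefront *wf = nullptr;
};

class ToyScalarPipe
{
    std::queue<ToyInst *> issued;

  public:
    // One entry point that both queues the request and updates the
    // counters, so individual instructions no longer repeat this code.
    void issueRequest(ToyInst *inst)
    {
        assert(inst && inst->wf);
        ToyWavefront *wf = inst->wf;
        wf->scalarRdGmReqsInPipe--;
        wf->scalarOutstandingReqsRdGm++;
        wf->outstandingReqs++;
        issued.push(inst);
    }

    std::size_t pending() const { return issued.size(); }
};

void checkIssue()
{
    ToyWavefront wf;
    wf.scalarRdGmReqsInPipe = 1;  // set when the instruction was scheduled
    ToyInst inst;
    inst.wf = &wf;

    ToyScalarPipe pipe;
    pipe.issueRequest(&inst);

    assert(wf.scalarRdGmReqsInPipe == 0);
    assert(wf.scalarOutstandingReqsRdGm == 1);
    assert(wf.outstandingReqs == 1);
    assert(pipe.pending() == 1);
}
```

With the bookkeeping in one place, any ISA whose instructions go through the pipeline picks up the counter updates without copying them into every `execute` method.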
[gem5-dev] Change in gem5/gem5[develop]: configs: Add mem_banks to Carrizo topology
Kyle Roarty has uploaded this change for review. ( https://gem5-review.googlesource.com/c/public/gem5/+/46240 ) Change subject: configs: Add mem_banks to Carrizo topology .. configs: Add mem_banks to Carrizo topology ROCm 4 iterates through the mem_banks to find an appropriate place to allocate memory. Previously, Carrizo didn't have any mem_banks, which resulted in the ROCm 4 runtime erroring out, as it didn't know where to allocate memory. The implementation is fairly similar to the implementation used for the Fiji or Vega configs Change-Id: I5bb4e89657d44c6cb690fd224ee1bf1d4d6cf2a5 --- M configs/example/hsaTopology.py 1 file changed, 15 insertions(+), 2 deletions(-) diff --git a/configs/example/hsaTopology.py b/configs/example/hsaTopology.py index 51585de..78fe1f7 100644 --- a/configs/example/hsaTopology.py +++ b/configs/example/hsaTopology.py @@ -36,7 +36,7 @@ from os.path import join as joinpath from os.path import isdir from shutil import rmtree, copyfile -from m5.util.convert import toFrequency +from m5.util.convert import toFrequency, toMemorySize def file_append(path, contents): with open(joinpath(*path), 'a') as f: @@ -422,12 +422,14 @@ # must have marketing name file_append((node_dir, 'name'), 'Carrizo\n') +mem_banks_cnt = 1 + # populate global node properties # NOTE: SIMD count triggers a valid GPU agent creation node_prop = 'cpu_cores_count %s\n' % options.num_cpus + \ 'simd_count %s\n' \ % (options.num_compute_units * options.simds_per_cu)+ \ -'mem_banks_count 0\n' + \ +'mem_banks_count %s\n' % mem_banks_cnt + \ 'caches_count 0\n' + \ 'io_links_count 0\n'+ \ 'cpu_core_id_base 16\n' + \ @@ -453,3 +455,14 @@ % int(toFrequency(options.CPUClock) / 1e6) file_append((node_dir, 'properties'), node_prop) + +for i in range(mem_banks_cnt): +mem_dir = joinpath(node_dir, f'mem_banks/{i}') +remake_dir(mem_dir) + +mem_prop = f'heap_type 0\n' + \ + f'size_in_bytes {toMemorySize(options.mem_size)}'+ \ + f'flags 0\n' + \ + f'width 64\n'+ \ + f'mem_clk_max 1600\n' 
+file_append((mem_dir, 'properties'), mem_prop) -- To view, visit https://gem5-review.googlesource.com/c/public/gem5/+/46240 To unsubscribe, or for help writing mail filters, visit https://gem5-review.googlesource.com/settings Gerrit-Project: public/gem5 Gerrit-Branch: develop Gerrit-Change-Id: I5bb4e89657d44c6cb690fd224ee1bf1d4d6cf2a5 Gerrit-Change-Number: 46240 Gerrit-PatchSet: 1 Gerrit-Owner: Kyle Roarty Gerrit-MessageType: newchange
[gem5-dev] Change in gem5/gem5[develop]: configs,gpu-compute: Add render driver needed for ROCm 4
Kyle Roarty has uploaded this change for review. ( https://gem5-review.googlesource.com/c/public/gem5/+/46244 ) Change subject: configs,gpu-compute: Add render driver needed for ROCm 4 .. configs,gpu-compute: Add render driver needed for ROCm 4 ROCm 4 utilizes the render driver located at /dev/dri/renderDXXX. This patch implements a very simple driver that just returns a file descriptor when opened, as testing has shown that's all that's needed Change-Id: I65602346cbf17b2dc80e114046ebf5c9830a1507 --- M configs/example/apu_se.py M configs/example/hsaTopology.py M src/gpu-compute/GPU.py M src/gpu-compute/SConscript A src/gpu-compute/gpu_render_driver.cc A src/gpu-compute/gpu_render_driver.hh 6 files changed, 49 insertions(+), 1 deletion(-) diff --git a/configs/example/apu_se.py b/configs/example/apu_se.py index f779df3..b9e1e7c 100644 --- a/configs/example/apu_se.py +++ b/configs/example/apu_se.py @@ -436,6 +436,8 @@ gfxVersion = args.gfx_version, dGPUPoolID = 1, m_type = args.m_type) +render_driver = GPURenderDriver(filename = 'dri/renderD128') + # Creating the GPU kernel launching components: that is the HSA # packet processor (HSAPP), GPU command processor (CP), and the # dispatcher. 
@@ -498,7 +500,8 @@ "HSA_ENABLE_SDMA=0"] process = Process(executable = executable, cmd = [args.cmd] - + args.options.split(), drivers = [gpu_driver], env = env) + + args.options.split(), + drivers = [gpu_driver, render_driver], env = env) for cpu in cpu_list: cpu.createThreads() diff --git a/configs/example/hsaTopology.py b/configs/example/hsaTopology.py index 78fe1f7..b77d1c1 100644 --- a/configs/example/hsaTopology.py +++ b/configs/example/hsaTopology.py @@ -373,6 +373,7 @@ 'vendor_id 4098\n' + \ 'device_id 29440\n' + \ 'location_id 512\n' + \ +'drm_render_minor 128\n'+ \ 'max_engine_clk_fcompute %s\n' \ % int(toFrequency(options.gpu_clock) / 1e6) + \ 'local_mem_size 4294967296\n' + \ @@ -446,6 +447,7 @@ 'vendor_id 4098\n' + \ 'device_id 39028\n' + \ 'location_id 8\n' + \ +'drm_render_minor 128\n'+ \ 'max_engine_clk_fcompute %s\n' \ % int(toFrequency(options.gpu_clock) / 1e6) + \ 'local_mem_size 0\n'+ \ diff --git a/src/gpu-compute/GPU.py b/src/gpu-compute/GPU.py index 579c84b..ace83a5 100644 --- a/src/gpu-compute/GPU.py +++ b/src/gpu-compute/GPU.py @@ -254,6 +254,10 @@ # default value: 5/C_RO_S (only allow caching in GL2 for read. Shared) m_type = Param.Int("Default MTYPE for cache. 
Valid values between 0-7"); +class GPURenderDriver(EmulatedDriver): +type = 'GPURenderDriver' +cxx_header = 'gpu-compute/gpu_render_driver.hh' + class GPUDispatcher(SimObject): type = 'GPUDispatcher' cxx_header = 'gpu-compute/dispatcher.hh' diff --git a/src/gpu-compute/SConscript b/src/gpu-compute/SConscript index adb9b0e..ae0bfab 100644 --- a/src/gpu-compute/SConscript +++ b/src/gpu-compute/SConscript @@ -52,6 +52,7 @@ Source('gpu_compute_driver.cc') Source('gpu_dyn_inst.cc') Source('gpu_exec_context.cc') +Source('gpu_render_driver.cc') Source('gpu_static_inst.cc') Source('gpu_tlb.cc') Source('lds_state.cc') diff --git a/src/gpu-compute/gpu_render_driver.cc b/src/gpu-compute/gpu_render_driver.cc new file mode 100644 index 000..9d9cbd2 --- /dev/null +++ b/src/gpu-compute/gpu_render_driver.cc @@ -0,0 +1,17 @@ +#include "gpu-compute/gpu_render_driver.hh" + +#include "params/GPURenderDriver.hh" +#include "sim/fd_entry.hh" + +GPURenderDriver::GPURenderDriver(const GPURenderDriverParams &p) +: EmulatedDriver(p) +{ +} + +int GPURenderDriver::open(ThreadContext *tc, int mode, int flags) +{ +auto process = tc->getProcessPtr(); +auto device_fd_entry = std::make_shared(this, filename); +int tgt_fd = process->fds->allocFD(device_fd_entry); +return tgt_fd; +} diff --git a/src/gpu-compute/gpu_render_driver.hh b/src/gpu-compute/gpu_render_driver.hh new file mode 100644 index 000..46d1b8d --- /dev/null +++ b/src/gpu-compute/gpu_render_driver.hh @@ -0,0 +1,21 @@ +#ifndef __GPU_COMPUTE_GPU_RENDER_DRIVER_HH__ +#define __GPU_COMPUTE_GPU_RENDER
[gem5-dev] Change in gem5/gem5[develop]: util: Update GCN Dockerfile for ROCm 4
Kyle Roarty has uploaded this change for review. ( https://gem5-review.googlesource.com/c/public/gem5/+/46239 ) Change subject: util: Update GCN Dockerfile for ROCm 4 .. util: Update GCN Dockerfile for ROCm 4 This now installs ROCm 4 from source instead of ROCm 1.6. Change-Id: I380ca06e93d48475e93d18f69eb97756186772ab --- M util/dockerfiles/gcn-gpu/Dockerfile 1 file changed, 152 insertions(+), 144 deletions(-) diff --git a/util/dockerfiles/gcn-gpu/Dockerfile b/util/dockerfiles/gcn-gpu/Dockerfile index e5683ab..491c960 100644 --- a/util/dockerfiles/gcn-gpu/Dockerfile +++ b/util/dockerfiles/gcn-gpu/Dockerfile @@ -1,166 +1,174 @@ -FROM ubuntu:16.04 +# Copyright (c) 2020 The Regents of the University of California +# All Rights Reserved. +# +# Redistribution and use in source and binary forms, with or without +# modification, are permitted provided that the following conditions are +# met: redistributions of source code must retain the above copyright +# notice, this list of conditions and the following disclaimer; +# redistributions in binary form must reproduce the above copyright +# notice, this list of conditions and the following disclaimer in the +# documentation and/or other materials provided with the distribution; +# neither the name of the copyright holders nor the names of its +# contributors may be used to endorse or promote products derived from +# this software without specific prior written permission. +# +# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +# A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT +# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY +# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. +FROM ubuntu:20.04 +ENV DEBIAN_FRONTEND=noninteractive +RUN apt -y update +RUN apt -y upgrade +RUN apt -y install build-essential git m4 scons zlib1g zlib1g-dev \ +libprotobuf-dev protobuf-compiler libprotoc-dev libgoogle-perftools-dev \ +python3-dev python3-six python-is-python3 doxygen libboost-all-dev \ +libhdf5-serial-dev python3-pydot libpng-dev libelf-dev pkg-config -# Needed for add-apt-repository -RUN apt-get update && apt-get install -y --no-install-recommends \ -software-properties-common +# Requirements for ROCm +RUN apt -y install cmake mesa-common-dev libgflags-dev libgoogle-glog-dev -# Ubuntu 16.04 does not have a python package new enough for gem5, use a PPA -RUN add-apt-repository ppa:deadsnakes/ppa && apt-get update +RUN git clone https://github.com/RadeonOpenCompute/ROCT-Thunk-Interface.git &&\ +git -C /ROCT-Thunk-Interface/ checkout roc-4.0.x && \ +mkdir -p /ROCT-Thunk-Interface/build -# Should be minimal needed packages -RUN apt-get update && apt-get install -y --no-install-recommends \ -findutils \ -file \ -libunwind8 \ -libunwind-dev \ -pkg-config \ -build-essential \ -gcc-multilib \ -g++-multilib \ -git \ -ca-certificates \ -m4 \ -zlib1g \ -zlib1g-dev \ -libprotobuf-dev \ -protobuf-compiler \ -libprotoc-dev \ -libgoogle-perftools-dev \ -python-yaml \ -python3.9 \ -python3.9-dev \ -python3.9-distutils \ -wget \ -libpci3 \ -libelf1 \ -libelf-dev \ -cmake \ -openssl \ -libssl-dev \ -libboost-filesystem-dev \ 
-libboost-system-dev \ -libboost-dev \ -libpng12-dev \ -gdb +WORKDIR /ROCT-Thunk-Interface/build +RUN cmake -DCMAKE_BUILD_TYPE=Debug -DCMAKE_INSTALL_PREFIX=/opt/rocm .. && \ +make -j$(nproc) && make install +WORKDIR / -# Use python 3.9 by default -RUN update-alternatives --install /usr/bin/python3 python3 /usr/bin/python3.9 1 +# There was no rocm-4.0.x tag at the time, there was even a github post +# stating to use rocm-3.10.x +RUN git clone https://github.com/RadeonOpenCompute/ROCR-Runtime.git && \ +git -C /ROCR-Runtime/ checkout rocm-3.10.x && \ +mkdir -p /ROCR-Runtime/src/build -# Setuptools is needed for cmake for ROCm build. Install using pip. -# Instructions to install PIP from https://pypi.org/project/pip/ -RUN wget https://bootstrap.pypa.io/get-pip.py -qO get-pip.py -RUN python3 get-pip.py -RUN pip install -U setuptools scons==3.1.2 six +WORKDIR /ROCR-Runtime/src/build +# need MEMFD_CREATE=OFF as MEMFD_CREATE syscall isn't implemented +RUN cmake -DIMAGE_SUPPORT=OFF -DHAVE_MEMFD_CREATE=OFF -DCMAKE_BUILD_TYPE=Debug\ +-DCMAK
[gem5-dev] Change in gem5/gem5[develop]: arch-x86: Ignore syscalls called in ROCm 4
Kyle Roarty has uploaded this change for review. ( https://gem5-review.googlesource.com/c/public/gem5/+/46241 )

Change subject: arch-x86: Ignore syscalls called in ROCm 4
......................................................................

arch-x86: Ignore syscalls called in ROCm 4

This patch ignores syscalls called by the ROCm 4 stack. Based on testing
so far, these syscalls don't affect the correctness of programs that use
ROCm 4.

sched_yield gets changed to ignoreWarnOnceFunc, as it gets called
significantly more in ROCm 4.

Change-Id: I566b1d71d989c54bfc559d5b83790dff73a38b28
---
M src/arch/x86/linux/syscall_tbl64.cc
1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/src/arch/x86/linux/syscall_tbl64.cc b/src/arch/x86/linux/syscall_tbl64.cc
index 8630265..be82437 100644
--- a/src/arch/x86/linux/syscall_tbl64.cc
+++ b/src/arch/x86/linux/syscall_tbl64.cc
@@ -60,7 +60,7 @@
     { 21, "access", ignoreFunc },
     { 22, "pipe", pipeFunc },
     { 23, "select", selectFunc },
-    { 24, "sched_yield", ignoreFunc },
+    { 24, "sched_yield", ignoreWarnOnceFunc },
     { 25, "mremap", mremapFunc },
     { 26, "msync" },
     { 27, "mincore" },
@@ -111,7 +111,7 @@
     { 72, "fcntl", fcntlFunc },
     { 73, "flock" },
     { 74, "fsync" },
-    { 75, "fdatasync" },
+    { 75, "fdatasync", ignoreFunc },
     { 76, "truncate", truncateFunc },
     { 77, "ftruncate", ftruncateFunc },
 #if defined(SYS_getdents)
@@ -171,7 +171,7 @@
     { 128, "rt_sigtimedwait" },
     { 129, "rt_sigqueueinfo" },
     { 130, "rt_sigsuspend" },
-    { 131, "sigaltstack" },
+    { 131, "sigaltstack", ignoreFunc },
     { 132, "utime" },
     { 133, "mknod", mknodFunc },
     { 134, "uselib" },
@@ -197,7 +197,7 @@
     { 154, "modify_ldt" },
     { 155, "pivot_root" },
     { 156, "_sysctl" },
-    { 157, "prctl" },
+    { 157, "prctl", ignoreFunc },
     { 158, "arch_prctl", archPrctlFunc },
     { 159, "adjtimex" },
     { 160, "setrlimit", ignoreFunc },

--
Gerrit-Project: public/gem5
Gerrit-Branch: develop
Gerrit-Change-Id: I566b1d71d989c54bfc559d5b83790dff73a38b28
Gerrit-Change-Number: 46241
Gerrit-PatchSet: 1
Gerrit-Owner: Kyle Roarty
Gerrit-MessageType: newchange
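The difference between the two "ignore" handlers in this change can be illustrated with a small model. This is only a sketch in Python, not gem5 code: the class and method names are hypothetical, and it assumes (as the handler names and the commit message suggest) that ignoreFunc warns on every call while ignoreWarnOnceFunc warns once per syscall, with both reporting success to the guest.

```python
class SyscallEmulator:
    """Toy model of two 'ignore' policies for syscalls a guest may call.

    Hypothetical names; this only mirrors the behaviour the commit
    message describes, not gem5's actual implementation.
    """

    def __init__(self):
        self.warnings = []    # every warning message emitted so far
        self._warned = set()  # syscalls already reported by the warn-once policy

    def ignore(self, name):
        """Warn on every call, then report success (0) to the guest."""
        self.warnings.append(f"ignoring syscall {name}(...)")
        return 0

    def ignore_warn_once(self, name):
        """Warn only the first time a syscall is seen, then report success."""
        if name not in self._warned:
            self._warned.add(name)
            self.warnings.append(
                f"ignoring syscall {name}(...) (further warnings suppressed)")
        return 0


emu = SyscallEmulator()
for _ in range(1000):
    emu.ignore_warn_once("sched_yield")  # hot path: one warning in total
for _ in range(3):
    emu.ignore("prctl")                  # rare path: one warning per call
print(len(emu.warnings))  # prints 4
```

With a syscall that is called very frequently, the warn-once policy keeps the log readable, which is presumably why sched_yield is the only entry switched to it here.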
[gem5-dev] Change in gem5/gem5[develop]: gpu-compute: Initialize GPUDriver member variables before use
Kyle Roarty has uploaded this change for review. ( https://gem5-review.googlesource.com/c/public/gem5/+/46248 )

Change subject: gpu-compute: Initialize GPUDriver member variables before use
......................................................................

gpu-compute: Initialize GPUDriver member variables before use

A few member variables weren't initialized, but we were assuming that
they were 0 when first read. This explicitly sets those variables to 0.

Change-Id: I2c840d361ed3a7d306e22dc7561a3870f1ef94a1
---
M src/gpu-compute/gpu_compute_driver.cc
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/gpu-compute/gpu_compute_driver.cc b/src/gpu-compute/gpu_compute_driver.cc
index 12e537c..02f1de5 100644
--- a/src/gpu-compute/gpu_compute_driver.cc
+++ b/src/gpu-compute/gpu_compute_driver.cc
@@ -53,7 +53,8 @@
 
 GPUComputeDriver::GPUComputeDriver(const Params &p)
     : EmulatedDriver(p), device(p.device), queueId(0),
-      isdGPU(p.isdGPU), gfxVersion(p.gfxVersion), dGPUPoolID(p.dGPUPoolID)
+      isdGPU(p.isdGPU), gfxVersion(p.gfxVersion), dGPUPoolID(p.dGPUPoolID),
+      eventPage(0), eventSlotIndex(0)
 {
     device->attachDriver(this);
     DPRINTF(GPUDriver, "Constructing KFD: device\n");

--
Gerrit-Project: public/gem5
Gerrit-Branch: develop
Gerrit-Change-Id: I2c840d361ed3a7d306e22dc7561a3870f1ef94a1
Gerrit-Change-Number: 46248
Gerrit-PatchSet: 1
Gerrit-Owner: Kyle Roarty
Gerrit-MessageType: newchange
[gem5-dev] Change in gem5/gem5[develop]: gpu-compute: Change certain IOCTL errors to warnings
Kyle Roarty has uploaded this change for review. ( https://gem5-review.googlesource.com/c/public/gem5/+/46247 )

Change subject: gpu-compute: Change certain IOCTL errors to warnings
......................................................................

gpu-compute: Change certain IOCTL errors to warnings

Certain IOCTL errors were triggering after the change to ROCm 4.
However, they could be downgraded to warnings without causing any
errors in the program.

Change-Id: Ie0052267f3ccfbdbadb90249b6f19e6a1205f57e
---
M src/gpu-compute/gpu_compute_driver.cc
1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/gpu-compute/gpu_compute_driver.cc b/src/gpu-compute/gpu_compute_driver.cc
index 7f8cc16..12e537c 100644
--- a/src/gpu-compute/gpu_compute_driver.cc
+++ b/src/gpu-compute/gpu_compute_driver.cc
@@ -417,7 +417,7 @@
     TypedBufferArg args(ioc_buf);
     args.copyIn(virt_proxy);
     if (args->event_type != KFD_IOC_EVENT_SIGNAL) {
-        fatal("Signal events are only supported currently\n");
+        warn("Signal events are only supported currently\n");
     } else if (eventSlotIndex == SLOTS_PER_PAGE) {
         fatal("Signal event wasn't created; signal limit reached\n");
     }
@@ -508,8 +508,8 @@
             "\tamdkfd wait for event %d\n", EventData->event_id);
     panic_if(ETable.count(EventData->event_id) == 0,
              "Event ID invalid, cannot set this event\n");
-    panic_if(ETable[EventData->event_id].threadWaiting,
-             "Multiple threads waiting on the same event\n");
+    if (ETable[EventData->event_id].threadWaiting)
+        warn("Multiple threads waiting on the same event\n");
     if (ETable[EventData->event_id].setEvent) {
         // If event is already set, the event has already happened.
         // Just unset the event and dont put this thread to sleep.

--
Gerrit-Project: public/gem5
Gerrit-Branch: develop
Gerrit-Change-Id: Ie0052267f3ccfbdbadb90249b6f19e6a1205f57e
Gerrit-Change-Number: 46247
Gerrit-PatchSet: 1
Gerrit-Owner: Kyle Roarty
Gerrit-MessageType: newchange
[gem5-dev] Change in gem5/gem5[develop]: arch-x86: build with getdents64 if system supports it
Kyle Roarty has uploaded this change for review. ( https://gem5-review.googlesource.com/c/public/gem5/+/46242 )

Change subject: arch-x86: build with getdents64 if system supports it
......................................................................

arch-x86: build with getdents64 if system supports it

This patch makes it so the getdents64 syscall is built in gem5 if the
underlying host implements the syscall, similar to how the getdents
syscall is implemented.

The implementation for getdents64 already existed.

Change-Id: I73b22c8df8df994f3f720e848a7d4f8cd31d318e
---
M src/arch/x86/linux/syscall_tbl32.cc
M src/arch/x86/linux/syscall_tbl64.cc
2 files changed, 8 insertions(+), 0 deletions(-)

diff --git a/src/arch/x86/linux/syscall_tbl32.cc b/src/arch/x86/linux/syscall_tbl32.cc
index 50d0969..db70151 100644
--- a/src/arch/x86/linux/syscall_tbl32.cc
+++ b/src/arch/x86/linux/syscall_tbl32.cc
@@ -261,7 +261,11 @@
     { 218, "mincore" },
     { 219, "madvise", ignoreFunc },
     { 220, "madvise1" },
+#if defined(SYS_getdents64)
+    { 221, "getdents64", getdents64Func },
+#else
     { 221, "getdents64" },
+#endif
     { 222, "fcntl64" },
     { 223, "unused" },
     { 224, "gettid", gettidFunc },
diff --git a/src/arch/x86/linux/syscall_tbl64.cc b/src/arch/x86/linux/syscall_tbl64.cc
index be82437..94837cd 100644
--- a/src/arch/x86/linux/syscall_tbl64.cc
+++ b/src/arch/x86/linux/syscall_tbl64.cc
@@ -257,7 +257,11 @@
     { 214, "epoll_ctl_old" },
     { 215, "epoll_wait_old" },
     { 216, "remap_file_pages" },
+#if defined(SYS_getdents64)
+    { 217, "getdents64", getdents64Func },
+#else
     { 217, "getdents64" },
+#endif
     { 218, "set_tid_address", setTidAddressFunc },
     { 219, "restart_syscall" },
     { 220, "semtimedop" },

--
Gerrit-Project: public/gem5
Gerrit-Branch: develop
Gerrit-Change-Id: I73b22c8df8df994f3f720e848a7d4f8cd31d318e
Gerrit-Change-Number: 46242
Gerrit-PatchSet: 1
Gerrit-Owner: Kyle Roarty
Gerrit-MessageType: newchange
[gem5-dev] Change in gem5/gem5[develop]: arch-x86,sim: (WIP) Workaround for sched_getaffinity
Kyle Roarty has uploaded this change for review. ( https://gem5-review.googlesource.com/c/public/gem5/+/46243 )

Change subject: arch-x86,sim: (WIP) Workaround for sched_getaffinity
......................................................................

arch-x86,sim: (WIP) Workaround for sched_getaffinity

sched_getaffinity is different from other syscalls in that the raw
syscall returns the size of the cpumask being used to represent the CPU
bit mask. Because of this, when a library (libnuma in this case)
directly called sched_getaffinity and got a return value of 0, it
errored out, thinking that there were no CPUs available.

Currently the implementation just returns 1, and it's being used as a
proof-of-concept for ROCm 4 support, as ROCm 4 uses libnuma.

Change-Id: Id95c919986cc98a411877056256604f57a29f0f9
---
M src/arch/x86/linux/syscall_tbl64.cc
M src/sim/syscall_emul.cc
M src/sim/syscall_emul.hh
3 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/src/arch/x86/linux/syscall_tbl64.cc b/src/arch/x86/linux/syscall_tbl64.cc
index 94837cd..bb24f3d 100644
--- a/src/arch/x86/linux/syscall_tbl64.cc
+++ b/src/arch/x86/linux/syscall_tbl64.cc
@@ -244,7 +244,7 @@
     { 201, "time", timeFunc },
     { 202, "futex", futexFunc },
     { 203, "sched_setaffinity", ignoreFunc },
-    { 204, "sched_getaffinity", ignoreFunc },
+    { 204, "sched_getaffinity", schedGetaffinityFunc },
     { 205, "set_thread_area" },
     { 206, "io_setup" },
     { 207, "io_destroy" },
diff --git a/src/sim/syscall_emul.cc b/src/sim/syscall_emul.cc
index bb8b42a..17a947a 100644
--- a/src/sim/syscall_emul.cc
+++ b/src/sim/syscall_emul.cc
@@ -1650,3 +1650,10 @@
 
     return 0;
 }
+
+SyscallReturn
+schedGetaffinityFunc(SyscallDesc *desc, ThreadContext *tc,
+                     pid_t pid, size_t cpusetsize, Addr mask)
+{
+    return 1;
+}
diff --git a/src/sim/syscall_emul.hh b/src/sim/syscall_emul.hh
index 54e92b2..83e11b2 100644
--- a/src/sim/syscall_emul.hh
+++ b/src/sim/syscall_emul.hh
@@ -367,6 +367,10 @@
 SyscallReturn getsocknameFunc(SyscallDesc *desc, ThreadContext *tc,
                               int tgt_fd, VPtr<> addrPtr, VPtr<> lenPtr);
 
+// Target sched_getaffinity() handler
+SyscallReturn schedGetaffinityFunc(SyscallDesc *desc, ThreadContext *tc,
+                                   pid_t pid, size_t cpusetsize, Addr mask);
+
 /// Futex system call
 /// Implemented by Daniel Sanchez
 /// Used by printf's in multi-threaded apps

--
Gerrit-Project: public/gem5
Gerrit-Branch: develop
Gerrit-Change-Id: Id95c919986cc98a411877056256604f57a29f0f9
Gerrit-Change-Number: 46243
Gerrit-PatchSet: 1
Gerrit-Owner: Kyle Roarty
Gerrit-MessageType: newchange
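To see why a raw return value of 0 reads as "no CPUs", it may help to look at the result from the caller's side. The following is a hedged Python sketch of a hypothetical caller, not libnuma's actual code: `usable_cpus` is an invented name, and the sketch assumes the raw return value is the number of mask bytes written and that the mask is laid out little-endian (byte i, bit j stands for CPU 8*i + j), as on x86-64.

```python
def usable_cpus(ret, mask):
    """Interpret a raw sched_getaffinity result (hypothetical caller).

    ret  -- raw syscall return value: number of mask bytes written,
            or a negative errno value on failure
    mask -- the buffer that was passed to the syscall
    """
    if ret < 0:
        raise OSError(-ret, "sched_getaffinity failed")
    valid = mask[:ret]  # only the first `ret` bytes carry information
    return [
        byte_idx * 8 + bit
        for byte_idx, byte in enumerate(valid)
        for bit in range(8)
        if (byte >> bit) & 1
    ]


# A handler that always returns 0 makes the whole mask look empty,
# regardless of what the buffer contains.
print(usable_cpus(0, b"\xff" * 8))                 # prints []
# An 8-byte mask with CPUs 0, 1 and 3 set.
print(usable_cpus(8, bytes([0b1011]) + bytes(7)))  # prints [0, 1, 3]
```

Under the same assumptions, a handler that returns a nonzero size but never writes the mask would still yield an empty CPU list if the caller's buffer started out zeroed, which fits the commit message's description of this patch as a proof of concept.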
[gem5-dev] Change in gem5/gem5[develop]: gpu-compute: Ignore GPU kernel names
Kyle Roarty has uploaded this change for review. ( https://gem5-review.googlesource.com/c/public/gem5/+/46245 )

Change subject: gpu-compute: Ignore GPU kernel names
......................................................................

gpu-compute: Ignore GPU kernel names

ROCm 4 seems to have updated the akc, and the only real issue that has
occurred is that we're no longer able to read kernel names in the same
way as we were in ROCm 1.6.

This patch removes the prior method of reading kernel names and gives
all kernels a temporary name.

Change-Id: I0040e0cf4cd35d6f56ded6a8acfb10c600bcc77a
---
M src/gpu-compute/gpu_command_processor.cc
1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/src/gpu-compute/gpu_command_processor.cc b/src/gpu-compute/gpu_command_processor.cc
index 9bdd0b9..78b3235 100644
--- a/src/gpu-compute/gpu_command_processor.cc
+++ b/src/gpu-compute/gpu_command_processor.cc
@@ -171,7 +171,6 @@
     DPRINTF(GPUCommandProc, "Machine code starts at addr: %#x\n",
             machine_code_addr);
 
-    Addr kern_name_addr(0);
     std::string kernel_name;
 
     /**
@@ -184,10 +183,7 @@
      * host memory. I have no idea what BLIT stands for.
      * */
     if (akc.runtime_loader_kernel_symbol) {
-        virt_proxy.readBlob(akc.runtime_loader_kernel_symbol + 0x10,
-            (uint8_t*)&kern_name_addr, 0x8);
-
-        virt_proxy.readString(kernel_name, kern_name_addr);
+        kernel_name = "Some kernel";
     } else {
         kernel_name = "Blit kernel";
     }

--
Gerrit-Project: public/gem5
Gerrit-Branch: develop
Gerrit-Change-Id: I0040e0cf4cd35d6f56ded6a8acfb10c600bcc77a
Gerrit-Change-Number: 46245
Gerrit-PatchSet: 1
Gerrit-Owner: Kyle Roarty
Gerrit-MessageType: newchange
[gem5-dev] Change in gem5/gem5[develop]: dev-hsa,gpu-compute: IOCTL updates for ROCm 4
Kyle Roarty has uploaded this change for review. ( https://gem5-review.googlesource.com/c/public/gem5/+/46246 )

Change subject: dev-hsa,gpu-compute: IOCTL updates for ROCm 4
......................................................................

dev-hsa,gpu-compute: IOCTL updates for ROCm 4

This change copies over the up-to-date kfd_ioctl.h file from the Linux
kernel, and updates the gpu_compute_driver to reflect the changes found
in the new version of the kfd_ioctl.h file.

Change-Id: I51e8e7158762f4b7e06c0f84507e5889a17939a2
---
M src/dev/hsa/kfd_ioctl.h
M src/gpu-compute/gpu_compute_driver.cc
2 files changed, 371 insertions(+), 335 deletions(-)

diff --git a/src/dev/hsa/kfd_ioctl.h b/src/dev/hsa/kfd_ioctl.h
index 504621c..5ba0a0c 100644
--- a/src/dev/hsa/kfd_ioctl.h
+++ b/src/dev/hsa/kfd_ioctl.h
@@ -23,13 +23,16 @@
 #ifndef KFD_IOCTL_H_INCLUDED
 #define KFD_IOCTL_H_INCLUDED
 
+#include
 #include
 #include
-#include
-
+/*
+ * - 1.1 - initial version
+ * - 1.3 - Add SMI events support
+ */
 #define KFD_IOCTL_MAJOR_VERSION 1
-#define KFD_IOCTL_MINOR_VERSION 2
+#define KFD_IOCTL_MINOR_VERSION 3
 
 struct kfd_ioctl_get_version_args
 {
@@ -41,6 +44,7 @@
 #define KFD_IOC_QUEUE_TYPE_COMPUTE 0
 #define KFD_IOC_QUEUE_TYPE_SDMA 1
 #define KFD_IOC_QUEUE_TYPE_COMPUTE_AQL 2
+#define KFD_IOC_QUEUE_TYPE_SDMA_XGMI 3
 
 #define KFD_MAX_QUEUE_PERCENTAGE 100
 #define KFD_MAX_QUEUE_PRIORITY 15
@@ -69,7 +73,7 @@
 struct kfd_ioctl_destroy_queue_args
 {
     uint32_t queue_id; /* to KFD */
- uint32_t pad;
+    uint32_t pad;
 };
 
 struct kfd_ioctl_update_queue_args
@@ -78,15 +82,24 @@
 
     uint32_t queue_id; /* to KFD */
     uint32_t ring_size; /* to KFD */
- uint32_t queue_percentage; /* to KFD */
- uint32_t queue_priority; /* to KFD */
+    uint32_t queue_percentage; /* to KFD */
+    uint32_t queue_priority; /* to KFD */
 };
 
 struct kfd_ioctl_set_cu_mask_args
 {
- uint32_t queue_id; /* to KFD */
- uint32_t num_cu_mask; /* to KFD */
- uint64_t cu_mask_ptr; /* to KFD */
+    uint32_t queue_id; /* to KFD */
+    uint32_t num_cu_mask; /* to KFD */
+    uint64_t cu_mask_ptr; /* to KFD */
+};
+
+struct kfd_ioctl_get_queue_wave_state_args
+{
+    uint64_t ctl_stack_address; /* to KFD */
+    uint32_t ctl_stack_used_size; /* from KFD */
+    uint32_t save_area_used_size; /* from KFD */
+    uint32_t queue_id; /* to KFD */
+    uint32_t pad;
 };
 
 /* For kfd_ioctl_set_memory_policy_args.default_policy and alternate_policy */
@@ -104,14 +117,6 @@
     uint32_t pad;
 };
 
-struct kfd_ioctl_set_trap_handler_args
-{
- uint64_t tba_addr;
- uint64_t tma_addr;
- uint32_t gpu_id; /* to KFD */
- uint32_t pad;
-};
-
 /*
  * All counters are monotonic. They are used for profiling of compute jobs.
  * The profiling is done by userspace.
@@ -122,32 +127,32 @@
 struct kfd_ioctl_get_clock_counters_args
 {
     uint64_t gpu_clock_counter; /* from KFD */
- uint64_t cpu_clock_counter; /* from KFD */
- uint64_t system_clock_counter; /* from KFD */
- uint64_t system_clock_freq; /* from KFD */
+    uint64_t cpu_clock_counter; /* from KFD */
+    uint64_t system_clock_counter; /* from KFD */
+    uint64_t system_clock_freq; /* from KFD */
     uint32_t gpu_id; /* to KFD */
     uint32_t pad;
 };
 
-#define NUM_OF_SUPPORTED_GPUS 7
-
 struct kfd_process_device_apertures
 {
     uint64_t lds_base; /* from KFD */
- uint64_t lds_limit; /* from KFD */
- uint64_t scratch_base; /* from KFD */
- uint64_t scratch_limit; /* from KFD */
- uint64_t gpuvm_base; /* from KFD */
- uint64_t gpuvm_limit; /* from KFD */
- uint32_t gpu_id; /* from KFD */
- uint32_t pad;
+    uint64_t lds_limit; /* from KFD */
+    uint64_t scratch_base; /* from KFD */
+    uint64_t scratch_limit; /* from KFD */
+    uint64_t gpuvm_base; /* from KFD */
+    uint64_t gpuvm_limit; /* from KFD */
+    uint32_t gpu_id; /* from KFD */
+    uint32_t pad;
 };
 
-/* This IOCTL and the limited NUM_OF_SUPPORTED_GPUS is deprecated. Use
- * kfd_ioctl_get_process_apertures_new instead, which supports
- * arbitrary numbers of GPUs.
+/*
+ * AMDKFD_IOC_GET_PROCESS_APERTURES is deprecated. Use
+ * AMDKFD_IOC_GET_PROCESS_APERTURES_NEW instead, which supports an
+ * unlimited number of GPUs.
  */
+#define NUM_OF_SUPPORTED_GPUS 7
 struct kfd_ioctl_get_process_apertures_args
 {
     struct kfd_process_device_apertures
@@ -165,11 +170,11 @@
  */
     uint64_t kfd_process_device_apertures_
[gem5-dev] Change in gem5/gem5[develop]: util: Update GCN Dockerfile for ROCm 4
Kyle Roarty has submitted this change. ( https://gem5-review.googlesource.com/c/public/gem5/+/46239 )

Change subject: util: Update GCN Dockerfile for ROCm 4
......................................................................

util: Update GCN Dockerfile for ROCm 4

This now installs ROCm 4 from source instead of ROCm 1.6.

Change-Id: I380ca06e93d48475e93d18f69eb97756186772ab
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/46239
Reviewed-by: Matthew Poremba
Reviewed-by: Matt Sinclair
Maintainer: Matt Sinclair
Tested-by: kokoro
---
M util/dockerfiles/gcn-gpu/Dockerfile
1 file changed, 112 insertions(+), 158 deletions(-)

Approvals:
  Matthew Poremba: Looks good to me, approved
  Matt Sinclair: Looks good to me, approved; Looks good to me, approved
  kokoro: Regressions pass

diff --git a/util/dockerfiles/gcn-gpu/Dockerfile b/util/dockerfiles/gcn-gpu/Dockerfile
index 2f5d1b4..360ab1f 100644
--- a/util/dockerfiles/gcn-gpu/Dockerfile
+++ b/util/dockerfiles/gcn-gpu/Dockerfile
@@ -1,166 +1,120 @@
-FROM ubuntu:16.04
+# Copyright (c) 2021 Kyle Roarty
+# All Rights Reserved.
+#
+# Redistribution and use in source and binary forms, with or without
+# modification, are permitted provided that the following conditions are
+# met: redistributions of source code must retain the above copyright
+# notice, this list of conditions and the following disclaimer;
+# redistributions in binary form must reproduce the above copyright
+# notice, this list of conditions and the following disclaimer in the
+# documentation and/or other materials provided with the distribution;
+# neither the name of the copyright holders nor the names of its
+# contributors may be used to endorse or promote products derived from
+# this software without specific prior written permission.
+#
+# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+FROM ubuntu:20.04
+ENV DEBIAN_FRONTEND=noninteractive
+RUN apt -y update
+RUN apt -y upgrade
+RUN apt -y install build-essential git m4 scons zlib1g zlib1g-dev \
+    libprotobuf-dev protobuf-compiler libprotoc-dev libgoogle-perftools-dev \
+    python3-dev python3-six python-is-python3 doxygen libboost-all-dev \
+    libhdf5-serial-dev python3-pydot libpng-dev libelf-dev pkg-config
 
-# Needed for add-apt-repository
-RUN apt-get update && apt-get install -y --no-install-recommends \
-    software-properties-common
+# Requirements for ROCm
+RUN apt -y install cmake mesa-common-dev libgflags-dev libgoogle-glog-dev
 
-# Ubuntu 16.04 does not have a python package new enough for gem5, use a PPA
-RUN add-apt-repository ppa:deadsnakes/ppa && apt-get update
+# Needed to get ROCm repo, build packages
+RUN apt -y install wget gnupg2 rpm
 
-# Should be minimal needed packages
-RUN apt-get update && apt-get install -y --no-install-recommends \
-    findutils \
-    file \
-    libunwind8 \
-    libunwind-dev \
-    pkg-config \
-    build-essential \
-    gcc-multilib \
-    g++-multilib \
-    git \
-    ca-certificates \
-    m4 \
-    zlib1g \
-    zlib1g-dev \
-    libprotobuf-dev \
-    protobuf-compiler \
-    libprotoc-dev \
-    libgoogle-perftools-dev \
-    python-yaml \
-    python3.9 \
-    python3.9-dev \
-    python3.9-distutils \
-    wget \
-    libpci3 \
-    libelf1 \
-    libelf-dev \
-    cmake \
-    openssl \
-    libssl-dev \
-    libboost-filesystem-dev \
-    libboost-system-dev \
-    libboost-dev \
-    libpng12-dev \
-    gdb
+RUN wget -q -O - https://repo.radeon.com/rocm/rocm.gpg.key | apt-key add -
 
-# Use python 3.9 by default
-RUN update-alternatives --install /usr/bin/python3 python3 /usr/bin/python3.9 1
+# ROCm webpage says to use debian main, but the individual versions
+# only have xenial
+RUN echo 'deb [arch=amd64] https://repo.radeon.com/rocm/apt/4.0.1/ xenial main' | tee /etc/apt/sources.list.d/rocm.list
 
-# Setuptools is needed for cmake for ROCm build. Install using pip.
-# Instructions to install PIP from https://pypi.org/project/pip/
-RUN wget https://bootstrap.pypa.io/get-pip.py -qO get-pip.py
-RUN python3 get-pip.py
-RUN pip install -U setuptools scons==3.1.2 six
+RUN apt-get update && apt -y install hsakmt-roct hsakmt-roct-dev
+RUN ln -s /opt/rocm-4.0.1 /opt/rocm
 
-ARG gem5_dist=http://dist.gem5.org/dist/v21-0
+
[gem5-dev] Change in gem5/gem5[develop]: gpu-compute: Initialize GPUDriver member variables before use
Kyle Roarty has submitted this change. ( https://gem5-review.googlesource.com/c/public/gem5/+/46248 )

Change subject: gpu-compute: Initialize GPUDriver member variables before use
......................................................................

gpu-compute: Initialize GPUDriver member variables before use

A few member variables weren't initialized, but we were assuming that
they were 0 when first read. This explicitly sets those variables to 0.

Change-Id: I2c840d361ed3a7d306e22dc7561a3870f1ef94a1
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/46248
Tested-by: kokoro
Reviewed-by: Matt Sinclair
Maintainer: Matt Sinclair
---
M src/gpu-compute/gpu_compute_driver.cc
1 file changed, 2 insertions(+), 1 deletion(-)

Approvals:
  Matt Sinclair: Looks good to me, approved; Looks good to me, approved
  kokoro: Regressions pass

diff --git a/src/gpu-compute/gpu_compute_driver.cc b/src/gpu-compute/gpu_compute_driver.cc
index 12e537c..02f1de5 100644
--- a/src/gpu-compute/gpu_compute_driver.cc
+++ b/src/gpu-compute/gpu_compute_driver.cc
@@ -53,7 +53,8 @@
 
 GPUComputeDriver::GPUComputeDriver(const Params &p)
     : EmulatedDriver(p), device(p.device), queueId(0),
-      isdGPU(p.isdGPU), gfxVersion(p.gfxVersion), dGPUPoolID(p.dGPUPoolID)
+      isdGPU(p.isdGPU), gfxVersion(p.gfxVersion), dGPUPoolID(p.dGPUPoolID),
+      eventPage(0), eventSlotIndex(0)
 {
     device->attachDriver(this);
     DPRINTF(GPUDriver, "Constructing KFD: device\n");

1 is the latest approved patch-set.
No files were changed between the latest approved patch-set and the submitted one.

--
Gerrit-Project: public/gem5
Gerrit-Branch: develop
Gerrit-Change-Id: I2c840d361ed3a7d306e22dc7561a3870f1ef94a1
Gerrit-Change-Number: 46248
Gerrit-PatchSet: 8
Gerrit-Owner: Kyle Roarty
Gerrit-Reviewer: Alex Dutu
Gerrit-Reviewer: Kyle Roarty
Gerrit-Reviewer: Matt Sinclair
Gerrit-Reviewer: Matthew Poremba
Gerrit-Reviewer: kokoro
Gerrit-MessageType: merged
[gem5-dev] Change in gem5/gem5[develop]: arch-x86: Ignore certain syscalls called in ROCm 4
Kyle Roarty has submitted this change. ( https://gem5-review.googlesource.com/c/public/gem5/+/46241 )

Change subject: arch-x86: Ignore certain syscalls called in ROCm 4
......................................................................

arch-x86: Ignore certain syscalls called in ROCm 4

fdatasync, sigaltstack, and prctl are called by the ROCm 4 stack, but
were unimplemented. Based on testing, we can change these to ignoreFunc
without affecting program correctness.

sched_yield gets changed to ignoreWarnOnceFunc, as it gets called
significantly more in ROCm 4.

Change-Id: I566b1d71d989c54bfc559d5b83790dff73a38b28
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/46241
Tested-by: kokoro
Maintainer: Matt Sinclair
Reviewed-by: Matt Sinclair
Reviewed-by: Matthew Poremba
---
M src/arch/x86/linux/syscall_tbl64.cc
1 file changed, 4 insertions(+), 4 deletions(-)

Approvals:
  Matthew Poremba: Looks good to me, approved
  Matt Sinclair: Looks good to me, approved; Looks good to me, approved
  kokoro: Regressions pass

diff --git a/src/arch/x86/linux/syscall_tbl64.cc b/src/arch/x86/linux/syscall_tbl64.cc
index 8630265..be82437 100644
--- a/src/arch/x86/linux/syscall_tbl64.cc
+++ b/src/arch/x86/linux/syscall_tbl64.cc
@@ -60,7 +60,7 @@
     { 21, "access", ignoreFunc },
     { 22, "pipe", pipeFunc },
     { 23, "select", selectFunc },
-    { 24, "sched_yield", ignoreFunc },
+    { 24, "sched_yield", ignoreWarnOnceFunc },
     { 25, "mremap", mremapFunc },
     { 26, "msync" },
     { 27, "mincore" },
@@ -111,7 +111,7 @@
     { 72, "fcntl", fcntlFunc },
     { 73, "flock" },
     { 74, "fsync" },
-    { 75, "fdatasync" },
+    { 75, "fdatasync", ignoreFunc },
     { 76, "truncate", truncateFunc },
     { 77, "ftruncate", ftruncateFunc },
 #if defined(SYS_getdents)
@@ -171,7 +171,7 @@
     { 128, "rt_sigtimedwait" },
     { 129, "rt_sigqueueinfo" },
     { 130, "rt_sigsuspend" },
-    { 131, "sigaltstack" },
+    { 131, "sigaltstack", ignoreFunc },
     { 132, "utime" },
     { 133, "mknod", mknodFunc },
     { 134, "uselib" },
@@ -197,7 +197,7 @@
     { 154, "modify_ldt" },
     { 155, "pivot_root" },
     { 156, "_sysctl" },
-    { 157, "prctl" },
+    { 157, "prctl", ignoreFunc },
     { 158, "arch_prctl", archPrctlFunc },
     { 159, "adjtimex" },
     { 160, "setrlimit", ignoreFunc },

3 is the latest approved patch-set.
No files were changed between the latest approved patch-set and the submitted one.

--
Gerrit-Project: public/gem5
Gerrit-Branch: develop
Gerrit-Change-Id: I566b1d71d989c54bfc559d5b83790dff73a38b28
Gerrit-Change-Number: 46241
Gerrit-PatchSet: 5
Gerrit-Owner: Kyle Roarty
Gerrit-Reviewer: Alex Dutu
Gerrit-Reviewer: Gabe Black
Gerrit-Reviewer: Jason Lowe-Power
Gerrit-Reviewer: Kyle Roarty
Gerrit-Reviewer: Matt Sinclair
Gerrit-Reviewer: Matthew Poremba
Gerrit-Reviewer: kokoro
Gerrit-MessageType: merged
[gem5-dev] Change in gem5/gem5[develop]: configs: Add mem_banks to Carrizo topology
Kyle Roarty has submitted this change. ( https://gem5-review.googlesource.com/c/public/gem5/+/46240 )

Change subject: configs: Add mem_banks to Carrizo topology
......................................................................

configs: Add mem_banks to Carrizo topology

ROCm 4 iterates through the mem_banks to find an appropriate place to
allocate memory. Previously, Carrizo didn't have any mem_banks, which
resulted in the ROCm 4 runtime erroring out, as it didn't know where to
allocate memory.

The implementation is fairly similar to the implementation used for the
Fiji or Vega configs.

Change-Id: I5bb4e89657d44c6cb690fd224ee1bf1d4d6cf2a5
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/46240
Tested-by: kokoro
Reviewed-by: Matthew Poremba
Reviewed-by: Matt Sinclair
Reviewed-by: Bobby R. Bruce
Maintainer: Matt Sinclair
---
M configs/example/hsaTopology.py
1 file changed, 15 insertions(+), 2 deletions(-)

Approvals:
  Matthew Poremba: Looks good to me, approved
  Matt Sinclair: Looks good to me, approved; Looks good to me, approved
  Bobby R. Bruce: Looks good to me, but someone else must approve
  kokoro: Regressions pass

diff --git a/configs/example/hsaTopology.py b/configs/example/hsaTopology.py
index 51585de..78fe1f7 100644
--- a/configs/example/hsaTopology.py
+++ b/configs/example/hsaTopology.py
@@ -36,7 +36,7 @@
 from os.path import join as joinpath
 from os.path import isdir
 from shutil import rmtree, copyfile
-from m5.util.convert import toFrequency
+from m5.util.convert import toFrequency, toMemorySize
 
 def file_append(path, contents):
     with open(joinpath(*path), 'a') as f:
@@ -422,12 +422,14 @@
     # must have marketing name
     file_append((node_dir, 'name'), 'Carrizo\n')
 
+    mem_banks_cnt = 1
+
     # populate global node properties
     # NOTE: SIMD count triggers a valid GPU agent creation
     node_prop = 'cpu_cores_count %s\n' % options.num_cpus + \
                 'simd_count %s\n' \
                 % (options.num_compute_units * options.simds_per_cu) + \
-                'mem_banks_count 0\n' + \
+                'mem_banks_count %s\n' % mem_banks_cnt + \
                 'caches_count 0\n' + \
                 'io_links_count 0\n' + \
                 'cpu_core_id_base 16\n' + \
@@ -453,3 +455,14 @@
                 % int(toFrequency(options.CPUClock) / 1e6)
 
     file_append((node_dir, 'properties'), node_prop)
+
+    for i in range(mem_banks_cnt):
+        mem_dir = joinpath(node_dir, f'mem_banks/{i}')
+        remake_dir(mem_dir)
+
+        mem_prop = f'heap_type 0\n' + \
+                   f'size_in_bytes {toMemorySize(options.mem_size)}' + \
+                   f'flags 0\n' + \
+                   f'width 64\n' + \
+                   f'mem_clk_max 1600\n'
+        file_append((mem_dir, 'properties'), mem_prop)

2 is the latest approved patch-set.
No files were changed between the latest approved patch-set and the submitted one.

--
Gerrit-Project: public/gem5
Gerrit-Branch: develop
Gerrit-Change-Id: I5bb4e89657d44c6cb690fd224ee1bf1d4d6cf2a5
Gerrit-Change-Number: 46240
Gerrit-PatchSet: 4
Gerrit-Owner: Kyle Roarty
Gerrit-Reviewer: Alex Dutu
Gerrit-Reviewer: Bobby R. Bruce
Gerrit-Reviewer: Jason Lowe-Power
Gerrit-Reviewer: Kyle Roarty
Gerrit-Reviewer: Matt Sinclair
Gerrit-Reviewer: Matthew Poremba
Gerrit-Reviewer: kokoro
Gerrit-MessageType: merged
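The per-bank properties file written by this change is a list of "key value" lines. The sketch below shows one way to build that text in Python; it is an illustration, not the committed code. The helper name `mem_bank_properties` is hypothetical, and it takes the size as a plain integer rather than calling gem5's `toMemorySize`. Note that in the diff as archived, the `size_in_bytes` line carries no trailing `\n`, so `flags 0` would end up on the same line; the sketch assumes one pair per line is what is intended.

```python
def mem_bank_properties(size_in_bytes):
    """Build the text of a mem_banks/<i>/properties file (illustrative).

    Joining a list guarantees exactly one 'key value' pair per line,
    which avoids forgetting a newline on any single entry.
    """
    lines = [
        "heap_type 0",
        f"size_in_bytes {size_in_bytes}",
        "flags 0",
        "width 64",
        "mem_clk_max 1600",
    ]
    return "\n".join(lines) + "\n"


props = mem_bank_properties(512 * 1024 * 1024)
print(props.splitlines()[1])  # prints size_in_bytes 536870912
```

Building the text from a list rather than from concatenated string literals is a design choice made here for the sketch only; the values themselves (heap type 0, width 64, clock 1600) are copied from the diff.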
[gem5-dev] Change in gem5/gem5[develop]: arch-x86,sim: Implement sched_getaffinity
Kyle Roarty has submitted this change. ( https://gem5-review.googlesource.com/c/public/gem5/+/46243 )

Change subject: arch-x86,sim: Implement sched_getaffinity
......................................................................

arch-x86,sim: Implement sched_getaffinity

sched_getaffinity is different from other syscalls in that the raw
syscall returns the size of the cpumask being used to represent the CPU
bit mask. Because of this, when a library (libnuma in this case)
directly called sched_getaffinity and got a return value of 0, it
errored out, thinking that there were no CPUs available.

This implementation assumes that all CPUs are available, so it sets all
simulated CPUs in the bitmask.

Change-Id: Id95c919986cc98a411877056256604f57a29f0f9
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/46243
Tested-by: kokoro
Reviewed-by: Matt Sinclair
Reviewed-by: Jason Lowe-Power
Maintainer: Jason Lowe-Power
---
M src/arch/x86/linux/syscall_tbl64.cc
M src/sim/syscall_emul.hh
2 files changed, 24 insertions(+), 1 deletion(-)

Approvals:
  Jason Lowe-Power: Looks good to me, approved; Looks good to me, approved
  Matt Sinclair: Looks good to me, but someone else must approve
  kokoro: Regressions pass

diff --git a/src/arch/x86/linux/syscall_tbl64.cc b/src/arch/x86/linux/syscall_tbl64.cc
index 94837cd..7231595 100644
--- a/src/arch/x86/linux/syscall_tbl64.cc
+++ b/src/arch/x86/linux/syscall_tbl64.cc
@@ -244,7 +244,7 @@
     { 201, "time", timeFunc },
     { 202, "futex", futexFunc },
     { 203, "sched_setaffinity", ignoreFunc },
-    { 204, "sched_getaffinity", ignoreFunc },
+    { 204, "sched_getaffinity", schedGetaffinityFunc },
     { 205, "set_thread_area" },
     { 206, "io_setup" },
     { 207, "io_destroy" },
diff --git a/src/sim/syscall_emul.hh b/src/sim/syscall_emul.hh
index cd2d8d1..3c1ad04 100644
--- a/src/sim/syscall_emul.hh
+++ b/src/sim/syscall_emul.hh
@@ -57,6 +57,7 @@
 /// application on the host machine.
 
 #if defined(__linux__)
+#include
 #include
 #include
 
@@ -2603,4 +2604,26 @@
 #endif
 }
 
+/// Target sched_getaffinity
+template
+SyscallReturn
+schedGetaffinityFunc(SyscallDesc *desc, ThreadContext *tc,
+                     pid_t pid, size_t cpusetsize, VPtr<> cpu_set_mask)
+{
+#if defined(__linux__)
+    if (cpusetsize < CPU_ALLOC_SIZE(tc->getSystemPtr()->threads.size()))
+        return -EINVAL;
+
+    BufferArg maskBuf(cpu_set_mask, cpusetsize);
+    maskBuf.copyIn(tc->getVirtProxy());
+    for (int i = 0; i < tc->getSystemPtr()->threads.size(); i++) {
+        CPU_SET(i, (cpu_set_t *)maskBuf.bufferPtr());
+    }
+    maskBuf.copyOut(tc->getVirtProxy());
+    return CPU_ALLOC_SIZE(tc->getSystemPtr()->threads.size());
+#else
+    warnUnsupportedOS("sched_getaffinity");
+    return -1;
+#endif
+}
 #endif // __SIM_SYSCALL_EMUL_HH__

3 is the latest approved patch-set.
No files were changed between the latest approved patch-set and the submitted one.

--
Gerrit-Project: public/gem5
Gerrit-Branch: develop
Gerrit-Change-Id: Id95c919986cc98a411877056256604f57a29f0f9
Gerrit-Change-Number: 46243
Gerrit-PatchSet: 5
Gerrit-Owner: Kyle Roarty
Gerrit-Reviewer: Gabe Black
Gerrit-Reviewer: Jason Lowe-Power
Gerrit-Reviewer: Jason Lowe-Power
Gerrit-Reviewer: Kyle Roarty
Gerrit-Reviewer: Matt Sinclair
Gerrit-Reviewer: kokoro
Gerrit-MessageType: merged
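The logic of the merged handler (reject a buffer that is too small, set one bit per simulated CPU, return the mask size) can be modelled in a few lines of Python. This is a sketch under stated assumptions rather than a port of the gem5 code: the function names are hypothetical, `cpu_alloc_size` assumes the 64-bit word rounding that glibc's `CPU_ALLOC_SIZE` performs on x86-64, and the mask is treated as a little-endian byte array.

```python
import errno

EINVAL = errno.EINVAL


def cpu_alloc_size(num_cpus, word_bits=64):
    """Bytes needed for a CPU mask, rounded up to whole mask words.

    Assumes 64-bit mask words, as CPU_ALLOC_SIZE uses on x86-64.
    """
    words = (num_cpus + word_bits - 1) // word_bits
    return words * (word_bits // 8)


def sched_getaffinity_model(num_threads, cpusetsize, mask):
    """Model of the handler: mark every simulated CPU as available.

    num_threads -- number of simulated hardware threads
    cpusetsize  -- size in bytes of the guest's mask buffer
    mask        -- bytearray standing in for the guest buffer
    Returns the mask size in bytes, or -EINVAL if the buffer is too small.
    """
    needed = cpu_alloc_size(num_threads)
    if cpusetsize < needed:
        return -EINVAL
    for cpu in range(num_threads):
        mask[cpu // 8] |= 1 << (cpu % 8)  # little-endian bit layout assumed
    return needed


mask = bytearray(8)
ret = sched_getaffinity_model(4, len(mask), mask)
print(ret, hex(mask[0]))  # prints 8 0xf
```

The return value is the size of the mask rather than 0, which is the point the commit message makes: a caller that uses the raw syscall treats that number as the count of valid mask bytes.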
[gem5-dev] Change in gem5/gem5[develop]: arch-x86: build with getdents64 if system supports it
Kyle Roarty has submitted this change. ( https://gem5-review.googlesource.com/c/public/gem5/+/46242 ) Change subject: arch-x86: build with getdents64 if system supports it .. arch-x86: build with getdents64 if system supports it This patch makes it so the getdents64 syscall is built in gem5 if the underlying host implements the syscall, similar to how the getdents syscall is implemented. The implementation for getdents64 already existed Change-Id: I73b22c8df8df994f3f720e848a7d4f8cd31d318e Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/46242 Tested-by: kokoro Reviewed-by: Matt Sinclair Reviewed-by: Matthew Poremba Reviewed-by: Alex Dutu Maintainer: Matt Sinclair --- M src/arch/x86/linux/syscall_tbl32.cc M src/arch/x86/linux/syscall_tbl64.cc 2 files changed, 8 insertions(+), 0 deletions(-) Approvals: Alex Dutu: Looks good to me, approved Matthew Poremba: Looks good to me, approved Matt Sinclair: Looks good to me, but someone else must approve; Looks good to me, approved kokoro: Regressions pass diff --git a/src/arch/x86/linux/syscall_tbl32.cc b/src/arch/x86/linux/syscall_tbl32.cc index 50d0969..db70151 100644 --- a/src/arch/x86/linux/syscall_tbl32.cc +++ b/src/arch/x86/linux/syscall_tbl32.cc @@ -261,7 +261,11 @@ { 218, "mincore" }, { 219, "madvise", ignoreFunc }, { 220, "madvise1" }, +#if defined(SYS_getdents64) +{ 221, "getdents64", getdents64Func }, +#else { 221, "getdents64" }, +#endif { 222, "fcntl64" }, { 223, "unused" }, { 224, "gettid", gettidFunc }, diff --git a/src/arch/x86/linux/syscall_tbl64.cc b/src/arch/x86/linux/syscall_tbl64.cc index be82437..94837cd 100644 --- a/src/arch/x86/linux/syscall_tbl64.cc +++ b/src/arch/x86/linux/syscall_tbl64.cc @@ -257,7 +257,11 @@ { 214, "epoll_ctl_old" }, { 215, "epoll_wait_old" }, { 216, "remap_file_pages" }, +#if defined(SYS_getdents64) +{ 217, "getdents64", getdents64Func }, +#else { 217, "getdents64" }, +#endif { 218, "set_tid_address", setTidAddressFunc }, { 219, "restart_syscall" }, { 220, 
"semtimedop" }, 1 is the latest approved patch-set. No files were changed between the latest approved patch-set and the submitted one. -- To view, visit https://gem5-review.googlesource.com/c/public/gem5/+/46242 To unsubscribe, or for help writing mail filters, visit https://gem5-review.googlesource.com/settings Gerrit-Project: public/gem5 Gerrit-Branch: develop Gerrit-Change-Id: I73b22c8df8df994f3f720e848a7d4f8cd31d318e Gerrit-Change-Number: 46242 Gerrit-PatchSet: 5 Gerrit-Owner: Kyle Roarty Gerrit-Reviewer: Alex Dutu Gerrit-Reviewer: Gabe Black Gerrit-Reviewer: Jason Lowe-Power Gerrit-Reviewer: Kyle Roarty Gerrit-Reviewer: Matt Sinclair Gerrit-Reviewer: Matthew Poremba Gerrit-Reviewer: kokoro Gerrit-MessageType: merged ___ gem5-dev mailing list -- gem5-dev@gem5.org To unsubscribe send an email to gem5-dev-le...@gem5.org %(web_page_url)slistinfo%(cgiext)s/%(_internal_name)s
[gem5-dev] Change in gem5/gem5[develop]: gpu-compute: Ignore GPU kernel names
Kyle Roarty has submitted this change. ( https://gem5-review.googlesource.com/c/public/gem5/+/46245 ) Change subject: gpu-compute: Ignore GPU kernel names .. gpu-compute: Ignore GPU kernel names ROCm 4 seems to have updated the akc, and the only real issue that has occurred is that we're no longer able to read kernel names in the same way as we were in ROCm 1.6. This patch removes the prior method of reading kernel names and gives all kernels a temporary name. Change-Id: I0040e0cf4cd35d6f56ded6a8acfb10c600bcc77a Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/46245 Tested-by: kokoro Reviewed-by: Matt Sinclair Reviewed-by: Matthew Poremba Maintainer: Matt Sinclair --- M src/gpu-compute/gpu_command_processor.cc 1 file changed, 1 insertion(+), 5 deletions(-) Approvals: Matthew Poremba: Looks good to me, approved Matt Sinclair: Looks good to me, approved; Looks good to me, approved kokoro: Regressions pass diff --git a/src/gpu-compute/gpu_command_processor.cc b/src/gpu-compute/gpu_command_processor.cc index 9bdd0b9..78b3235 100644 --- a/src/gpu-compute/gpu_command_processor.cc +++ b/src/gpu-compute/gpu_command_processor.cc @@ -171,7 +171,6 @@ DPRINTF(GPUCommandProc, "Machine code starts at addr: %#x\n", machine_code_addr); -Addr kern_name_addr(0); std::string kernel_name; /** @@ -184,10 +183,7 @@ * host memory. I have no idea what BLIT stands for. * */ if (akc.runtime_loader_kernel_symbol) { -virt_proxy.readBlob(akc.runtime_loader_kernel_symbol + 0x10, -(uint8_t*)&kern_name_addr, 0x8); - -virt_proxy.readString(kernel_name, kern_name_addr); +kernel_name = "Some kernel"; } else { kernel_name = "Blit kernel"; } 3 is the latest approved patch-set. No files were changed between the latest approved patch-set and the submitted one. 
-- To view, visit https://gem5-review.googlesource.com/c/public/gem5/+/46245 To unsubscribe, or for help writing mail filters, visit https://gem5-review.googlesource.com/settings Gerrit-Project: public/gem5 Gerrit-Branch: develop Gerrit-Change-Id: I0040e0cf4cd35d6f56ded6a8acfb10c600bcc77a Gerrit-Change-Number: 46245 Gerrit-PatchSet: 5 Gerrit-Owner: Kyle Roarty Gerrit-Reviewer: Alex Dutu Gerrit-Reviewer: Kyle Roarty Gerrit-Reviewer: Matt Sinclair Gerrit-Reviewer: Matthew Poremba Gerrit-Reviewer: kokoro Gerrit-MessageType: merged
[gem5-dev] Change in gem5/gem5[develop]: dev-hsa,gpu-compute: IOCTL updates for ROCm 4
Kyle Roarty has submitted this change. ( https://gem5-review.googlesource.com/c/public/gem5/+/46246 ) Change subject: dev-hsa,gpu-compute: IOCTL updates for ROCm 4 .. dev-hsa,gpu-compute: IOCTL updates for ROCm 4 This change copies over the up-to-date kfd_ioctl.h file from the linux kernel, and updates the gpu_compute_driver to reflect the changes found in the new version of the kfd_ioctl.h file Change-Id: I51e8e7158762f4b7e06c0f84507e5889a17939a2 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/46246 Reviewed-by: Matt Sinclair Maintainer: Matt Sinclair Tested-by: kokoro --- M src/dev/hsa/kfd_ioctl.h M src/gpu-compute/gpu_compute_driver.cc 2 files changed, 310 insertions(+), 275 deletions(-) Approvals: Matt Sinclair: Looks good to me, approved; Looks good to me, approved kokoro: Regressions pass diff --git a/src/dev/hsa/kfd_ioctl.h b/src/dev/hsa/kfd_ioctl.h index 504621c..7099851 100644 --- a/src/dev/hsa/kfd_ioctl.h +++ b/src/dev/hsa/kfd_ioctl.h @@ -23,13 +23,16 @@ #ifndef KFD_IOCTL_H_INCLUDED #define KFD_IOCTL_H_INCLUDED +#include #include #include -#include - +/* + * - 1.1 - initial version + * - 1.3 - Add SMI events support + */ #define KFD_IOCTL_MAJOR_VERSION 1 -#define KFD_IOCTL_MINOR_VERSION 2 +#define KFD_IOCTL_MINOR_VERSION 3 struct kfd_ioctl_get_version_args { @@ -41,6 +44,7 @@ #define KFD_IOC_QUEUE_TYPE_COMPUTE 0 #define KFD_IOC_QUEUE_TYPE_SDMA1 #define KFD_IOC_QUEUE_TYPE_COMPUTE_AQL 2 +#define KFD_IOC_QUEUE_TYPE_SDMA_XGMI3 #define KFD_MAX_QUEUE_PERCENTAGE 100 #define KFD_MAX_QUEUE_PRIORITY 15 @@ -89,6 +93,15 @@ uint64_t cu_mask_ptr; /* to KFD */ }; +struct kfd_ioctl_get_queue_wave_state_args +{ +uint64_t ctl_stack_address; /* to KFD */ +uint32_t ctl_stack_used_size; /* from KFD */ +uint32_t save_area_used_size; /* from KFD */ +uint32_t queue_id; /* to KFD */ +uint32_t pad; +}; + /* For kfd_ioctl_set_memory_policy_args.default_policy and alternate_policy */ #define KFD_IOC_CACHE_POLICY_COHERENT 0 #define KFD_IOC_CACHE_POLICY_NONCOHERENT 1 
@@ -104,14 +117,6 @@ uint32_t pad; }; -struct kfd_ioctl_set_trap_handler_args -{ - uint64_t tba_addr; - uint64_t tma_addr; - uint32_t gpu_id;/* to KFD */ - uint32_t pad; -}; - /* * All counters are monotonic. They are used for profiling of compute jobs. * The profiling is done by userspace. @@ -130,8 +135,6 @@ uint32_t pad; }; -#define NUM_OF_SUPPORTED_GPUS 7 - struct kfd_process_device_apertures { uint64_t lds_base; /* from KFD */ @@ -144,10 +147,12 @@ uint32_t pad; }; -/* This IOCTL and the limited NUM_OF_SUPPORTED_GPUS is deprecated. Use - * kfd_ioctl_get_process_apertures_new instead, which supports - * arbitrary numbers of GPUs. +/* + * AMDKFD_IOC_GET_PROCESS_APERTURES is deprecated. Use + * AMDKFD_IOC_GET_PROCESS_APERTURES_NEW instead, which supports an + * unlimited number of GPUs. */ +#define NUM_OF_SUPPORTED_GPUS 7 struct kfd_ioctl_get_process_apertures_args { struct kfd_process_device_apertures @@ -217,14 +222,21 @@ #define KFD_IOC_WAIT_RESULT_TIMEOUT1 #define KFD_IOC_WAIT_RESULT_FAIL 2 -/* - * The added 512 is because, currently, 8*(4096/256) signal events are - * reserved for debugger events, and we want to provide at least 4K signal - * events for EOP usage. - * We add 512 to make the allocated size (KFD_SIGNAL_EVENT_LIMIT * 8) be - * page aligned. - */ -#define KFD_SIGNAL_EVENT_LIMIT (4096 + 512) +#define KFD_SIGNAL_EVENT_LIMIT 4096 + +/* For kfd_event_data.hw_exception_data.reset_type. */ +#define KFD_HW_EXCEPTION_WHOLE_GPU_RESET0 +#define KFD_HW_EXCEPTION_PER_ENGINE_RESET 1 + +/* For kfd_event_data.hw_exception_data.reset_cause. 
*/ +#define KFD_HW_EXCEPTION_GPU_HANG 0 +#define KFD_HW_EXCEPTION_ECC1 + +/* For kfd_hsa_memory_exception_data.ErrorType */ +#define KFD_MEM_ERR_NO_RAS 0 +#define KFD_MEM_ERR_SRAM_ECC1 +#define KFD_MEM_ERR_POISON_CONSUMED 2 +#define KFD_MEM_ERR_GPU_HANG3 struct kfd_ioctl_create_event_args { @@ -267,22 +279,38 @@ /* memory exception data */ struct kfd_hsa_memory_exception_data { - struct kfd_memory_exception_failure failure; - uint64_t va; - uint32_t gpu_id; - uint32_t pad; +struct kfd_memory_exception_failure failure; +uint64_t va; +uint32_t gpu_id; +uint32_t ErrorType; /* 0 = no RAS error, + * 1 = ECC_SRAM, + * 2 = Link_SYNFLOOD (poison), + * 3 = GPU hang(not attributable to a specific cause), + * other values reserved + */ +}; + +/* hw exception data */ +struct kfd_hsa_hw_e
[gem5-dev] Change in gem5/gem5[develop]: configs,gpu-compute: Add render driver needed for ROCm 4
Kyle Roarty has submitted this change. ( https://gem5-review.googlesource.com/c/public/gem5/+/46244 ) Change subject: configs,gpu-compute: Add render driver needed for ROCm 4 .. configs,gpu-compute: Add render driver needed for ROCm 4 ROCm 4 utilizes the render driver located at /dev/dri/renderDXXX. This patch implements a very simple driver that just returns a file descriptor when opened, as testing has shown that's all that's needed Change-Id: I65602346cbf17b2dc80e114046ebf5c9830a1507 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/46244 Tested-by: kokoro Reviewed-by: Jason Lowe-Power Reviewed-by: Matt Sinclair Maintainer: Jason Lowe-Power Maintainer: Matt Sinclair --- M configs/example/apu_se.py M configs/example/hsaTopology.py M src/gpu-compute/GPU.py M src/gpu-compute/SConscript A src/gpu-compute/gpu_render_driver.cc A src/gpu-compute/gpu_render_driver.hh 6 files changed, 124 insertions(+), 2 deletions(-) Approvals: Jason Lowe-Power: Looks good to me, but someone else must approve; Looks good to me, approved Matt Sinclair: Looks good to me, approved; Looks good to me, approved kokoro: Regressions pass diff --git a/configs/example/apu_se.py b/configs/example/apu_se.py index f779df3..98a1e19 100644 --- a/configs/example/apu_se.py +++ b/configs/example/apu_se.py @@ -436,6 +436,9 @@ gfxVersion = args.gfx_version, dGPUPoolID = 1, m_type = args.m_type) +renderDriNum = 128 +render_driver = GPURenderDriver(filename = f'dri/renderD{renderDriNum}') + # Creating the GPU kernel launching components: that is the HSA # packet processor (HSAPP), GPU command processor (CP), and the # dispatcher. 
@@ -498,7 +501,8 @@ "HSA_ENABLE_SDMA=0"] process = Process(executable = executable, cmd = [args.cmd] - + args.options.split(), drivers = [gpu_driver], env = env) + + args.options.split(), + drivers = [gpu_driver, render_driver], env = env) for cpu in cpu_list: cpu.createThreads() diff --git a/configs/example/hsaTopology.py b/configs/example/hsaTopology.py index 78fe1f7..78193e0 100644 --- a/configs/example/hsaTopology.py +++ b/configs/example/hsaTopology.py @@ -156,6 +156,9 @@ file_append((node_dir, 'gpu_id'), 22124) file_append((node_dir, 'name'), 'Vega\n') +# Should be the same as the render driver filename (dri/renderD) +drm_num = 128 + # 96 in real Vega # Random comment for comparison purposes caches = 0 @@ -200,7 +203,7 @@ 'vendor_id 4098\n' + \ 'device_id 26720\n' + \ 'location_id 1024\n'+ \ -'drm_render_minor 128\n'+ \ +'drm_render_minor %s\n' % drm_num + \ 'hive_id 0\n' + \ 'num_sdma_engines 2\n' + \ 'num_sdma_xgmi_engines 0\n' + \ @@ -329,6 +332,9 @@ file_append((node_dir, 'gpu_id'), 50156) file_append((node_dir, 'name'), 'Fiji\n') +# Should be the same as the render driver filename (dri/renderD) +drm_num = 128 + # Real Fiji shows 96, but building that topology is complex and doesn't # appear to be required for anything. 
caches = 0 @@ -373,6 +379,7 @@ 'vendor_id 4098\n' + \ 'device_id 29440\n' + \ 'location_id 512\n' + \ +'drm_render_minor %s\n' % drm_num + \ 'max_engine_clk_fcompute %s\n' \ % int(toFrequency(options.gpu_clock) / 1e6) + \ 'local_mem_size 4294967296\n' + \ @@ -424,6 +431,9 @@ mem_banks_cnt = 1 +# Should be the same as the render driver filename (dri/renderD) +drm_num = 128 + # populate global node properties # NOTE: SIMD count triggers a valid GPU agent creation node_prop = 'cpu_cores_count %s\n' % options.num_cpus + \ @@ -446,6 +456,7 @@ 'vendor_id 4098\n' + \ 'device_id 39028\n' + \ 'location_id 8\n' + \ +'drm_render_minor %s\n' % drm_num + \ 'max_engine_clk_fcompute %s\n' \ % int(toFrequency(options.gpu_clock) / 1e6) + \ 'local_mem_size 0\n'
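The comments in the diff above note that `drm_num` in hsaTopology.py should match the number in the render driver filename set in apu_se.py. One way to keep the two from drifting apart is to derive both strings from a single constant. The sketch below is illustrative only; `RENDER_MINOR`, `render_driver_filename`, and `drm_render_minor_line` are hypothetical names that do not appear in the patch.

```python
RENDER_MINOR = 128  # the value used for both renderDriNum and drm_num in the patch


def render_driver_filename(minor=RENDER_MINOR):
    """Device path (relative to /dev) that the render driver would be registered under."""
    return f"dri/renderD{minor}"


def drm_render_minor_line(minor=RENDER_MINOR):
    """Topology property line that advertises the same minor number."""
    return "drm_render_minor %s\n" % minor
```

With this arrangement, changing the minor number in one place updates both the filename and the topology property.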
[gem5-dev] Change in gem5/gem5[develop]: gpu-compute: Change certain IOCTL errors to warnings
Kyle Roarty has submitted this change. ( https://gem5-review.googlesource.com/c/public/gem5/+/46247 ) Change subject: gpu-compute: Change certain IOCTL errors to warnings .. gpu-compute: Change certain IOCTL errors to warnings Certain IOCTL errors were triggering after the change to ROCm 4; however, they can be downgraded to warnings without causing any errors in the program. Change-Id: Ie0052267f3ccfbdbadb90249b6f19e6a1205f57e Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/46247 Tested-by: kokoro Reviewed-by: Matthew Poremba Reviewed-by: Matt Sinclair Maintainer: Matt Sinclair --- M src/gpu-compute/gpu_compute_driver.cc 1 file changed, 3 insertions(+), 3 deletions(-) Approvals: Matthew Poremba: Looks good to me, approved Matt Sinclair: Looks good to me, but someone else must approve; Looks good to me, approved kokoro: Regressions pass diff --git a/src/gpu-compute/gpu_compute_driver.cc b/src/gpu-compute/gpu_compute_driver.cc index 7f8cc16..12e537c 100644 --- a/src/gpu-compute/gpu_compute_driver.cc +++ b/src/gpu-compute/gpu_compute_driver.cc @@ -417,7 +417,7 @@ TypedBufferArg args(ioc_buf); args.copyIn(virt_proxy); if (args->event_type != KFD_IOC_EVENT_SIGNAL) { -fatal("Signal events are only supported currently\n"); +warn("Signal events are only supported currently\n"); } else if (eventSlotIndex == SLOTS_PER_PAGE) { fatal("Signal event wasn't created; signal limit reached\n"); } @@ -508,8 +508,8 @@ "\tamdkfd wait for event %d\n", EventData->event_id); panic_if(ETable.count(EventData->event_id) == 0, "Event ID invalid, cannot set this event\n"); -panic_if(ETable[EventData->event_id].threadWaiting, - "Multiple threads waiting on the same event\n"); +if (ETable[EventData->event_id].threadWaiting) + warn("Multiple threads waiting on the same event\n"); if (ETable[EventData->event_id].setEvent) { // If event is already set, the event has already happened. // Just unset the event and dont put this thread to sleep. 
3 is the latest approved patch-set. No files were changed between the latest approved patch-set and the submitted one. -- To view, visit https://gem5-review.googlesource.com/c/public/gem5/+/46247 To unsubscribe, or for help writing mail filters, visit https://gem5-review.googlesource.com/settings Gerrit-Project: public/gem5 Gerrit-Branch: develop Gerrit-Change-Id: Ie0052267f3ccfbdbadb90249b6f19e6a1205f57e Gerrit-Change-Number: 46247 Gerrit-PatchSet: 8 Gerrit-Owner: Kyle Roarty Gerrit-Reviewer: Alex Dutu Gerrit-Reviewer: Kyle Roarty Gerrit-Reviewer: Matt Sinclair Gerrit-Reviewer: Matthew Poremba Gerrit-Reviewer: kokoro Gerrit-MessageType: merged
[gem5-dev] Change in gem5/gem5[develop]: arch-vega: Add missing return to flat_load_dwordx4
Kyle Roarty has uploaded this change for review. ( https://gem5-review.googlesource.com/c/public/gem5/+/47520 ) Change subject: arch-vega: Add missing return to flat_load_dwordx4 .. arch-vega: Add missing return to flat_load_dwordx4 Change-Id: Ibf56c25a3d22d3c12ae2c1bb11f00f4a44b5919a --- M src/arch/amdgpu/vega/insts/instructions.cc 1 file changed, 1 insertion(+), 0 deletions(-) diff --git a/src/arch/amdgpu/vega/insts/instructions.cc b/src/arch/amdgpu/vega/insts/instructions.cc index 74be2cf..793bdce 100644 --- a/src/arch/amdgpu/vega/insts/instructions.cc +++ b/src/arch/amdgpu/vega/insts/instructions.cc @@ -42981,6 +42981,7 @@ if (gpuDynInst->exec_mask.none()) { wf->decVMemInstsIssued(); wf->decLGKMInstsIssued(); +return; } gpuDynInst->execUnitId = wf->execUnitId; -- To view, visit https://gem5-review.googlesource.com/c/public/gem5/+/47520 To unsubscribe, or for help writing mail filters, visit https://gem5-review.googlesource.com/settings Gerrit-Project: public/gem5 Gerrit-Branch: develop Gerrit-Change-Id: Ibf56c25a3d22d3c12ae2c1bb11f00f4a44b5919a Gerrit-Change-Number: 47520 Gerrit-PatchSet: 1 Gerrit-Owner: Kyle Roarty Gerrit-MessageType: newchange
[gem5-dev] Change in gem5/gem5[develop]: arch-vega: Fix s_endpgm instruction
Kyle Roarty has uploaded this change for review. ( https://gem5-review.googlesource.com/c/public/gem5/+/47519 ) Change subject: arch-vega: Fix s_endpgm instruction .. arch-vega: Fix s_endpgm instruction Copy over changes that had been made to s_endpgm in GCN3 but weren't added to the Vega implementation. Change-Id: I1063f83b1ce8f7c5e451c8c227265715c8f725b9 --- M src/arch/amdgpu/vega/insts/instructions.cc 1 file changed, 11 insertions(+), 2 deletions(-) diff --git a/src/arch/amdgpu/vega/insts/instructions.cc b/src/arch/amdgpu/vega/insts/instructions.cc index b0a6cb0..74be2cf 100644 --- a/src/arch/amdgpu/vega/insts/instructions.cc +++ b/src/arch/amdgpu/vega/insts/instructions.cc @@ -4134,7 +4134,12 @@ ComputeUnit *cu = gpuDynInst->computeUnit(); // delete extra instructions fetched for completed work-items -wf->instructionBuffer.clear(); +wf->instructionBuffer.erase(wf->instructionBuffer.begin() + 1, +wf->instructionBuffer.end()); + +if (wf->pendingFetch) { +wf->dropFetch = true; +} wf->computeUnit->fetchStage.fetchUnit(wf->simdId) .flushBuf(wf->wfSlotId); @@ -4212,8 +4217,11 @@ bool kernelEnd = wf->computeUnit->shader->dispatcher().isReachingKernelEnd(wf); +bool relNeeded = +wf->computeUnit->shader->impl_kern_end_rel; + //if it is not a kernel end, then retire the workgroup directly -if (!kernelEnd) { +if (!kernelEnd || !relNeeded) { wf->computeUnit->shader->dispatcher().notifyWgCompl(wf); wf->setStatus(Wavefront::S_STOPPED); wf->computeUnit->completedWGs++; @@ -4229,6 +4237,7 @@ * the complex */ setFlag(MemSync); +setFlag(GlobalSegment); // Notify Memory System of Kernel Completion // Kernel End = isKernel + isMemSync wf->setStatus(Wavefront::S_RETURNING); -- To view, visit https://gem5-review.googlesource.com/c/public/gem5/+/47519 To unsubscribe, or for help writing mail filters, visit https://gem5-review.googlesource.com/settings Gerrit-Project: public/gem5 Gerrit-Branch: develop Gerrit-Change-Id: I1063f83b1ce8f7c5e451c8c227265715c8f725b9 Gerrit-Change-Number: 
47519 Gerrit-PatchSet: 1 Gerrit-Owner: Kyle Roarty Gerrit-MessageType: newchange
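The first hunk of the s_endpgm diff replaces `instructionBuffer.clear()` with an `erase` from `begin() + 1` to `end()`, so the oldest buffered entry (presumably the instruction currently executing) survives while later prefetched entries are dropped. A rough Python analogue, using a hypothetical `drop_prefetched` helper and a `deque` in place of the C++ container, might look like this:

```python
from collections import deque


def drop_prefetched(instruction_buffer):
    """Keep only the oldest entry, mirroring erase(begin() + 1, end())."""
    while len(instruction_buffer) > 1:
        # Discard from the young end until a single entry remains.
        instruction_buffer.pop()
    return instruction_buffer


buf = deque(["s_endpgm", "prefetched_a", "prefetched_b"])
drop_prefetched(buf)
print(list(buf))  # ['s_endpgm']
```

By contrast, a plain `clear()` would also have removed the front entry.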
[gem5-dev] Change in gem5/gem5[develop]: arch-vega: Add decoding for implemented insts
Kyle Roarty has uploaded this change for review. ( https://gem5-review.googlesource.com/c/public/gem5/+/47521 ) Change subject: arch-vega: Add decoding for implemented insts .. arch-vega: Add decoding for implemented insts Certain instructions were implemented in instructions.cc, but weren't actually being decoded by the decoder, causing the decoder to return nullptr for valid instructions. This patch fixes the decoder to return the proper instruction class for implemented instructions Change-Id: I8d8525a1c435147017cb38d9df8e1675986ef04b --- M src/arch/amdgpu/vega/decoder.cc 1 file changed, 9 insertions(+), 9 deletions(-) diff --git a/src/arch/amdgpu/vega/decoder.cc b/src/arch/amdgpu/vega/decoder.cc index 363f7e1..480d326 100644 --- a/src/arch/amdgpu/vega/decoder.cc +++ b/src/arch/amdgpu/vega/decoder.cc @@ -4155,19 +4155,19 @@ GPUStaticInst* Decoder::decode_OP_VOP2__V_ADD_U32(MachInst iFmt) { -return nullptr; +return new Inst_VOP2__V_ADD_U32(&iFmt->iFmt_VOP2); } GPUStaticInst* Decoder::decode_OP_VOP2__V_SUB_U32(MachInst iFmt) { -return nullptr; +return new Inst_VOP2__V_SUB_U32(&iFmt->iFmt_VOP2); } GPUStaticInst* Decoder::decode_OP_VOP2__V_SUBREV_U32(MachInst iFmt) { -return nullptr; +return new Inst_VOP2__V_SUBREV_U32(&iFmt->iFmt_VOP2); } GPUStaticInst* @@ -4443,7 +4443,7 @@ GPUStaticInst* Decoder::decode_OP_SOP2__S_MUL_HI_I32(MachInst iFmt) { -return nullptr; +return new Inst_SOP2__S_MUL_I32(&iFmt->iFmt_SOP2); } GPUStaticInst* @@ -6939,31 +6939,31 @@ GPUStaticInst* Decoder::decode_OPU_VOP3__V_MAD_F16(MachInst iFmt) { -return nullptr; +return new Inst_VOP3__V_MAD_F16(&iFmt->iFmt_VOP3A); } GPUStaticInst* Decoder::decode_OPU_VOP3__V_MAD_U16(MachInst iFmt) { -return nullptr; +return new Inst_VOP3__V_MAD_U16(&iFmt->iFmt_VOP3A); } GPUStaticInst* Decoder::decode_OPU_VOP3__V_MAD_I16(MachInst iFmt) { -return nullptr; +return new Inst_VOP3__V_MAD_I16(&iFmt->iFmt_VOP3A); } GPUStaticInst* Decoder::decode_OPU_VOP3__V_FMA_F16(MachInst iFmt) { -return nullptr; +return new 
Inst_VOP3__V_FMA_F16(&iFmt->iFmt_VOP3A); } GPUStaticInst* Decoder::decode_OPU_VOP3__V_DIV_FIXUP_F16(MachInst iFmt) { -return nullptr; +return new Inst_VOP3__V_DIV_FIXUP_F16(&iFmt->iFmt_VOP3A); } GPUStaticInst* -- To view, visit https://gem5-review.googlesource.com/c/public/gem5/+/47521 To unsubscribe, or for help writing mail filters, visit https://gem5-review.googlesource.com/settings Gerrit-Project: public/gem5 Gerrit-Branch: develop Gerrit-Change-Id: I8d8525a1c435147017cb38d9df8e1675986ef04b Gerrit-Change-Number: 47521 Gerrit-PatchSet: 1 Gerrit-Owner: Kyle Roarty Gerrit-MessageType: newchange ___ gem5-dev mailing list -- gem5-dev@gem5.org To unsubscribe send an email to gem5-dev-le...@gem5.org %(web_page_url)slistinfo%(cgiext)s/%(_internal_name)s
[gem5-dev] Change in gem5/gem5[develop]: configs: Set valid heap_type values
Kyle Roarty has uploaded this change for review. ( https://gem5-review.googlesource.com/c/public/gem5/+/47528 ) Change subject: configs: Set valid heap_type values .. configs: Set valid heap_type values The variables that were used to set heap_type don't exist. Explicitly set them to the proper values. Change-Id: I8df7fca7442f6640be1154ef147c4e302ea491bb --- M configs/example/hsaTopology.py 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/configs/example/hsaTopology.py b/configs/example/hsaTopology.py index 28060cc..da3bc57 100644 --- a/configs/example/hsaTopology.py +++ b/configs/example/hsaTopology.py @@ -140,7 +140,7 @@ # CPU memory reporting mem_dir = joinpath(node_dir, 'mem_banks/0') remake_dir(mem_dir) -mem_prop = 'heap_type %s\n' % HsaHeaptype.HSA_HEAPTYPE_SYSTEM.value + \ +mem_prop = 'heap_type 0\n' + \ 'size_in_bytes 33704329216\n'+ \ 'flags 0\n' + \ 'width 72\n' + \ @@ -221,7 +221,7 @@ # TODO: Extract size, clk, and width from sim paramters mem_dir = joinpath(node_dir, 'mem_banks/0') remake_dir(mem_dir) -mem_prop = 'heap_type %s\n' % heap_type.value + \ +mem_prop = 'heap_type 1\n' + \ 'size_in_bytes 17163091968\n'+ \ 'flags 0\n' + \ 'width 2048\n' + \ -- To view, visit https://gem5-review.googlesource.com/c/public/gem5/+/47528 To unsubscribe, or for help writing mail filters, visit https://gem5-review.googlesource.com/settings Gerrit-Project: public/gem5 Gerrit-Branch: develop Gerrit-Change-Id: I8df7fca7442f6640be1154ef147c4e302ea491bb Gerrit-Change-Number: 47528 Gerrit-PatchSet: 1 Gerrit-Owner: Kyle Roarty Gerrit-MessageType: newchange ___ gem5-dev mailing list -- gem5-dev@gem5.org To unsubscribe send an email to gem5-dev-le...@gem5.org %(web_page_url)slistinfo%(cgiext)s/%(_internal_name)s
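The patch above hard-codes `heap_type 0` and `heap_type 1`. If named constants were preferred over bare numbers, a small enum would be one option. This is only a sketch: the class and member names below are hypothetical (the second member name in particular is an assumption), and only the numeric values 0 and 1 come from the diff.

```python
from enum import IntEnum


class HsaHeapType(IntEnum):
    # Hypothetical names; the values are the ones the patch writes out.
    SYSTEM = 0               # used for the CPU memory bank in the diff
    FRAME_BUFFER_PUBLIC = 1  # assumed name for the GPU memory bank value


def heap_type_line(heap_type):
    """Format the property line the way the topology file expects it."""
    return "heap_type %s\n" % heap_type.value
```

Using `.value` keeps the output numeric regardless of how a given Python version formats enum members with `%s`.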
[gem5-dev] Change in gem5/gem5[develop]: configs,gpu-compute: Set proper dGPUPoolID defaults
Kyle Roarty has uploaded this change for review. ( https://gem5-review.googlesource.com/c/public/gem5/+/47527 ) Change subject: configs,gpu-compute: Set proper dGPUPoolID defaults .. configs,gpu-compute: Set proper dGPUPoolID defaults In GPU.py, dGPUPoolID is defined as an int, but was defaulted to False. Explicitly set it to 0, instead. In apu_se.py, dGPUPoolID was being set to 1, but that was resulting in crashes. Setting it to 0 avoids those crashes Change-Id: I0f1161588279a335bbd0d8ae7acda97fc23201b5 --- M configs/example/apu_se.py M src/gpu-compute/GPU.py 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/configs/example/apu_se.py b/configs/example/apu_se.py index 98a1e19..8d49865 100644 --- a/configs/example/apu_se.py +++ b/configs/example/apu_se.py @@ -434,7 +434,7 @@ # HSA kernel mode driver gpu_driver = GPUComputeDriver(filename = "kfd", isdGPU = args.dgpu, gfxVersion = args.gfx_version, - dGPUPoolID = 1, m_type = args.m_type) + dGPUPoolID = 0, m_type = args.m_type) renderDriNum = 128 render_driver = GPURenderDriver(filename = f'dri/renderD{renderDriNum}') diff --git a/src/gpu-compute/GPU.py b/src/gpu-compute/GPU.py index ace83a5..107899e 100644 --- a/src/gpu-compute/GPU.py +++ b/src/gpu-compute/GPU.py @@ -243,7 +243,7 @@ device = Param.GPUCommandProcessor('GPU controlled by this driver') isdGPU = Param.Bool(False, 'Driver is for a dGPU') gfxVersion = Param.GfxVersion('gfx801', 'ISA of gpu to model') -dGPUPoolID = Param.Int(False, 'Pool ID for dGPU.') +dGPUPoolID = Param.Int(0, 'Pool ID for dGPU.') # Default Mtype for caches #-- 1 1 1 C_RW_S (Cached-ReadWrite-Shared) #-- 1 1 0 C_RW_US (Cached-ReadWrite-Unshared) -- To view, visit https://gem5-review.googlesource.com/c/public/gem5/+/47527 To unsubscribe, or for help writing mail filters, visit https://gem5-review.googlesource.com/settings Gerrit-Project: public/gem5 Gerrit-Branch: develop Gerrit-Change-Id: I0f1161588279a335bbd0d8ae7acda97fc23201b5 Gerrit-Change-Number: 47527 Gerrit-PatchSet: 1 
Gerrit-Owner: Kyle Roarty Gerrit-MessageType: newchange
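A default of `False` for an integer parameter can go unnoticed in Python because `bool` is a subclass of `int`. The snippet below demonstrates only that language behavior; `coerce_int_default` is a hypothetical stand-in and makes no claim about how gem5's `Param.Int` converts its default internally.

```python
def coerce_int_default(value):
    """Hypothetical stand-in for an integer parameter converting its default."""
    return int(value)


print(isinstance(False, int))     # True: bool is a subclass of int
print(coerce_int_default(False))  # 0
print(coerce_int_default(0))      # 0
```

Writing the default as `0` states the intent directly while producing the same integer.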
[gem5-dev] Change in gem5/gem5[develop]: configs,gpu-compute: Add support for gfx902/Raven
Kyle Roarty has uploaded this change for review. ( https://gem5-review.googlesource.com/c/public/gem5/+/47530 ) Change subject: configs,gpu-compute: Add support for gfx902/Raven .. configs,gpu-compute: Add support for gfx902/Raven This patch adds support for a gfx902 Vega APU, ripping the appropriate values for device_id from the ROCm Thunk (src/topology.c) Note: gfx902 isn't officially supported by ROCm. This means that it may not work for all programs. In particular, rocBLAS is incompatible with gfx902, so anything that uses rocBLAS won't be able to run with gfx902 Change-Id: I48893e7cc9c7e52275fdfd22314f371a9db8e90a --- M configs/example/apu_se.py M configs/example/hsaTopology.py M src/gpu-compute/GPU.py M src/gpu-compute/gpu_compute_driver.cc M src/gpu-compute/gpu_compute_driver.hh 5 files changed, 19 insertions(+), 7 deletions(-) diff --git a/configs/example/apu_se.py b/configs/example/apu_se.py index 8d49865..1e78f27 100644 --- a/configs/example/apu_se.py +++ b/configs/example/apu_se.py @@ -189,7 +189,7 @@ "between 0-7") parser.add_argument("--gfx-version", type=str, default='gfx801', -help="Gfx version for gpu: gfx801, gfx803, gfx900") +help="Gfx version for gpu: gfx801, gfx803, gfx900, gfx902") Ruby.define_options(parser) @@ -682,7 +682,7 @@ elif args.gfx_version == 'gfx900': hsaTopology.createVegaTopology(args) else: -assert (args.gfx_version in ['gfx801']),\ +assert (args.gfx_version in ['gfx801', 'gfx902']),\ "Incorrect gfx version for APU" hsaTopology.createCarrizoTopology(args) diff --git a/configs/example/hsaTopology.py b/configs/example/hsaTopology.py index da3bc57..7996a83 100644 --- a/configs/example/hsaTopology.py +++ b/configs/example/hsaTopology.py @@ -427,13 +427,21 @@ file_append((node_dir, 'gpu_id'), 2765) # must have marketing name -file_append((node_dir, 'name'), 'Carrizo\n') +if options.gfx_version == 'gfx801': +file_append((node_dir, 'name'), 'Carrizo\n') +elif options.gfx_version == 'gfx902': +file_append((node_dir, 'name'), 'Raven\n') 
mem_banks_cnt = 1 # Should be the same as the render driver filename (dri/renderD) drm_num = 128 +if options.gfx_version == 'gfx801': +device_id = 39028 +elif options.gfx_version == 'gfx902': +device_id = 5597 + # populate global node properties # NOTE: SIMD count triggers a valid GPU agent creation node_prop = 'cpu_cores_count %s\n' % options.num_cpus + \ @@ -454,7 +462,7 @@ 'simd_per_cu %s\n' % options.simds_per_cu + \ 'max_slots_scratch_cu 32\n' + \ 'vendor_id 4098\n' + \ -'device_id 39028\n' + \ +'device_id %s\n' % device_id+ \ 'location_id 8\n' + \ 'drm_render_minor %s\n' % drm_num + \ 'max_engine_clk_fcompute %s\n' \ diff --git a/src/gpu-compute/GPU.py b/src/gpu-compute/GPU.py index 107899e..e07f180 100644 --- a/src/gpu-compute/GPU.py +++ b/src/gpu-compute/GPU.py @@ -52,6 +52,7 @@ 'gfx801', 'gfx803', 'gfx900', +'gfx902', ] class PoolManager(SimObject): diff --git a/src/gpu-compute/gpu_compute_driver.cc b/src/gpu-compute/gpu_compute_driver.cc index 7edbbdb..f39576e 100644 --- a/src/gpu-compute/gpu_compute_driver.cc +++ b/src/gpu-compute/gpu_compute_driver.cc @@ -322,7 +322,7 @@ args->process_apertures[i].lds_limit = ldsApeLimit(args->process_apertures[i].lds_base); -if (isdGPU) { +if (isdGPU || gfxVersion == GfxVersion::gfx902) { args->process_apertures[i].gpuvm_base = 0x100ull; args->process_apertures[i].gpuvm_limit = 0x8000ULL - 1; @@ -355,6 +355,7 @@ } else { switch (gfxVersion) { case GfxVersion::gfx801: + case GfxVersion::gfx902: args->process_apertures[i].gpu_id = 2765; break; default: @@ -593,7 +594,7 @@ scratchApeLimit(ape_args->scratch_base); ape_args->lds_base = ldsApeBase(i + 1); ape_args->lds_limit = ldsApeLimit(ape_args->lds_base); -if (isdGPU) { +if (isdGPU || gfxVersion == GfxVersion::gfx902) { ape_args->gpuvm_base = 0x100ull; ape_args->gpuvm_limit = 0x8000ULL - 1
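The hsaTopology.py hunk above picks the marketing name and `device_id` with two separate `if`/`elif` chains. A lookup table is one alternative way to express the same mapping. The sketch below uses only the values visible in the diff; `APU_NODE_PROPS` and `apu_node_props` are hypothetical names.

```python
# Values taken from the diff: gfx801 -> Carrizo/39028, gfx902 -> Raven/5597.
APU_NODE_PROPS = {
    "gfx801": {"name": "Carrizo", "device_id": 39028},
    "gfx902": {"name": "Raven", "device_id": 5597},
}


def apu_node_props(gfx_version):
    """Return the name and device_id for an APU gfx version, or fail loudly."""
    try:
        return APU_NODE_PROPS[gfx_version]
    except KeyError:
        raise ValueError(
            f"Incorrect gfx version for APU: {gfx_version}") from None
```

With the `if`/`elif` form, an unexpected version would leave `device_id` unassigned until it is first used; the table makes that case fail at the lookup instead. In the patch itself, the assertion in apu_se.py already restricts the versions that reach this code.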
[gem5-dev] Change in gem5/gem5[develop]: gpu-compute: Update GET_PROCESS_APERTURES IOCTLs
Kyle Roarty has uploaded this change for review. ( https://gem5-review.googlesource.com/c/public/gem5/+/47529 ) Change subject: gpu-compute: Update GET_PROCESS_APERTURES IOCTLs .. gpu-compute: Update GET_PROCESS_APERTURES IOCTLs The apertures for non-gfx801 GPUs are set differently. If the apertures aren't set properly, ROCm will error out. This updates the GPUVM apertures, as it is the one that ROCm explicitly checks (WIP) Change-Id: I1fa6f60bc20c7b6eb3896057841d96846460a9f8 --- M src/gpu-compute/gpu_compute_driver.cc 1 file changed, 15 insertions(+), 14 deletions(-) diff --git a/src/gpu-compute/gpu_compute_driver.cc b/src/gpu-compute/gpu_compute_driver.cc index 02f1de5..7edbbdb 100644 --- a/src/gpu-compute/gpu_compute_driver.cc +++ b/src/gpu-compute/gpu_compute_driver.cc @@ -322,9 +322,15 @@ args->process_apertures[i].lds_limit = ldsApeLimit(args->process_apertures[i].lds_base); +if (isdGPU) { +args->process_apertures[i].gpuvm_base = 0x100ull; +args->process_apertures[i].gpuvm_limit = +0x8000ULL - 1; +} else { args->process_apertures[i].gpuvm_base = gpuVmApeBase(i + 1); args->process_apertures[i].gpuvm_limit = gpuVmApeLimit(args->process_apertures[i].gpuvm_base); +} // NOTE: Must match ID populated by hsaTopology.py // @@ -393,14 +399,6 @@ 47) != 0x1); assert(bits(args->process_apertures[i].lds_limit, 63, 47) != 0); -assert(bits(args->process_apertures[i].gpuvm_base, 63, - 47) != 0x1); -assert(bits(args->process_apertures[i].gpuvm_base, 63, - 47) != 0); -assert(bits(args->process_apertures[i].gpuvm_limit, 63, - 47) != 0x1); -assert(bits(args->process_apertures[i].gpuvm_limit, 63, - 47) != 0); } args.copyOut(virt_proxy); @@ -595,8 +593,15 @@ scratchApeLimit(ape_args->scratch_base); ape_args->lds_base = ldsApeBase(i + 1); ape_args->lds_limit = ldsApeLimit(ape_args->lds_base); -ape_args->gpuvm_base = gpuVmApeBase(i + 1); -ape_args->gpuvm_limit = gpuVmApeLimit(ape_args->gpuvm_base); +if (isdGPU) { +ape_args->gpuvm_base = 0x100ull; +ape_args->gpuvm_limit = 0x8000ULL - 
1; +} else { +ape_args->gpuvm_base = gpuVmApeBase(i + 1); +ape_args->gpuvm_limit = +gpuVmApeLimit(ape_args->gpuvm_base); +} + // NOTE: Must match ID populated by hsaTopology.py if (isdGPU) { @@ -628,10 +633,6 @@ assert(bits(ape_args->lds_base, 63, 47) != 0); assert(bits(ape_args->lds_limit, 63, 47) != 0x1); assert(bits(ape_args->lds_limit, 63, 47) != 0); -assert(bits(ape_args->gpuvm_base, 63, 47) != 0x1); -assert(bits(ape_args->gpuvm_base, 63, 47) != 0); -assert(bits(ape_args->gpuvm_limit, 63, 47) != 0x1); -assert(bits(ape_args->gpuvm_limit, 63, 47) != 0); ape_args.copyOut(virt_proxy); } -- To view, visit https://gem5-review.googlesource.com/c/public/gem5/+/47529 To unsubscribe, or for help writing mail filters, visit https://gem5-review.googlesource.com/settings Gerrit-Project: public/gem5 Gerrit-Branch: develop Gerrit-Change-Id: I1fa6f60bc20c7b6eb3896057841d96846460a9f8 Gerrit-Change-Number: 47529 Gerrit-PatchSet: 1 Gerrit-Owner: Kyle Roarty Gerrit-MessageType: newchange ___ gem5-dev mailing list -- gem5-dev@gem5.org To unsubscribe send an email to gem5-dev-le...@gem5.org %(web_page_url)slistinfo%(cgiext)s/%(_internal_name)s
[gem5-dev] Change in gem5/gem5[develop]: configs: Add shared_cpu_list to cache directories
Kyle Roarty has uploaded this change for review. ( https://gem5-review.googlesource.com/c/public/gem5/+/47524 )

Change subject: configs: Add shared_cpu_list to cache directories
..

configs: Add shared_cpu_list to cache directories

The ROCm thunk uses this file instead of the shared_cpu_map file.

Change-Id: I985512245c9f51106b8347412ed643f78b567b24
---
M configs/common/FileSystemConfig.py
1 file changed, 2 insertions(+), 0 deletions(-)

diff --git a/configs/common/FileSystemConfig.py b/configs/common/FileSystemConfig.py
index 0d9f221..66a6315 100644
--- a/configs/common/FileSystemConfig.py
+++ b/configs/common/FileSystemConfig.py
@@ -217,6 +217,8 @@
 file_append((indexdir, 'number_of_sets'), num_sets)
 file_append((indexdir, 'physical_line_partition'), '1')
 file_append((indexdir, 'shared_cpu_map'), hex_mask(cpus))
+file_append((indexdir, 'shared_cpu_list'),
+','.join(str(cpu) for cpu in cpus))
 def _redirect_paths(options):
 # Redirect filesystem syscalls from src to the first matching dests

--
To view, visit https://gem5-review.googlesource.com/c/public/gem5/+/47524
To unsubscribe, or for help writing mail filters, visit https://gem5-review.googlesource.com/settings

Gerrit-Project: public/gem5
Gerrit-Branch: develop
Gerrit-Change-Id: I985512245c9f51106b8347412ed643f78b567b24
Gerrit-Change-Number: 47524
Gerrit-PatchSet: 1
Gerrit-Owner: Kyle Roarty
Gerrit-MessageType: newchange
___
gem5-dev mailing list -- gem5-dev@gem5.org
To unsubscribe send an email to gem5-dev-le...@gem5.org
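As a rough illustration of the two encodings this change writes side by side, the sketch below builds both strings for one set of CPUs: `shared_cpu_map` as a hex bitmask and `shared_cpu_list` as a comma-separated list of ids. The `hex_mask` helper here is a stand-in written for this example (an assumption, not necessarily identical to gem5's own helper), and the sample `cpus` list is hypothetical.

```python
from functools import reduce
import operator


def hex_mask(cpus):
    """Stand-in helper (assumption): OR one bit per CPU id into a hex mask."""
    mask = reduce(operator.or_, (1 << cpu for cpu in cpus), 0)
    return "%08x" % mask


def cpu_list(cpus):
    """Same expression as in the patch: comma-separated CPU ids."""
    return ','.join(str(cpu) for cpu in cpus)


cpus = [0, 1, 2, 3]  # hypothetical set of CPUs sharing one cache
shared_cpu_map = hex_mask(cpus)    # bitmask form
shared_cpu_list = cpu_list(cpus)   # list form
```

Both strings describe the same CPU set; a reader that only parses the list form has nothing to fall back on if only the map file exists, which is presumably why the patch writes both.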
[gem5-dev] Change in gem5/gem5[develop]: configs: Don't report CPU cores on Fiji properties
Kyle Roarty has uploaded this change for review. ( https://gem5-review.googlesource.com/c/public/gem5/+/47525 )

Change subject: configs: Don't report CPU cores on Fiji properties
..

configs: Don't report CPU cores on Fiji properties

ROCm determines if a device is a dGPU in two ways. The first is by looking at the device ID. The second is through a flag that gets set only if the reported cpu_cores_count is 0. If these don't agree, ROCm breaks when doing memory operations. Previously, cpu_cores_count was non-zero on the Fiji config. This patch sets it to 0 to appease ROCm

Change-Id: I0fd0ce724f491ed6a4598188b3799468668585f4
---
M configs/example/hsaTopology.py
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/configs/example/hsaTopology.py b/configs/example/hsaTopology.py
index 78193e0..28060cc 100644
--- a/configs/example/hsaTopology.py
+++ b/configs/example/hsaTopology.py
@@ -359,7 +359,7 @@
 file_append((io_dir, 'properties'), io_prop)
 # Populate GPU node properties
-node_prop = 'cpu_cores_count %s\n' % options.num_cpus + \
+node_prop = 'cpu_cores_count 0\n' + \
 'simd_count %s\n' \
 % (options.num_compute_units * options.simds_per_cu)+ \
 'mem_banks_count 1\n' + \

--
To view, visit https://gem5-review.googlesource.com/c/public/gem5/+/47525
To unsubscribe, or for help writing mail filters, visit https://gem5-review.googlesource.com/settings

Gerrit-Project: public/gem5
Gerrit-Branch: develop
Gerrit-Change-Id: I0fd0ce724f491ed6a4598188b3799468668585f4
Gerrit-Change-Number: 47525
Gerrit-PatchSet: 1
Gerrit-Owner: Kyle Roarty
Gerrit-MessageType: newchange
___
gem5-dev mailing list -- gem5-dev@gem5.org
To unsubscribe send an email to gem5-dev-le...@gem5.org
[gem5-dev] Change in gem5/gem5[develop]: gpu-compute: Add mmap functionality to GPURenderDriver
Kyle Roarty has uploaded this change for review. ( https://gem5-review.googlesource.com/c/public/gem5/+/47523 ) Change subject: gpu-compute: Add mmap functionality to GPURenderDriver .. gpu-compute: Add mmap functionality to GPURenderDriver dGPUs mmap the GPURenderDriver, however it doesn't appear that they do anything with it. This patch implements the mmap function by just returning the address provided, while not doing anything else Change-Id: Ia010a2aebcf7e2c75e22d93dfb440937d1bef3b1 --- M src/gpu-compute/gpu_render_driver.cc M src/gpu-compute/gpu_render_driver.hh 2 files changed, 13 insertions(+), 1 deletion(-) diff --git a/src/gpu-compute/gpu_render_driver.cc b/src/gpu-compute/gpu_render_driver.cc index 1af83cb..41730d2 100644 --- a/src/gpu-compute/gpu_render_driver.cc +++ b/src/gpu-compute/gpu_render_driver.cc @@ -38,7 +38,7 @@ /* ROCm 4 utilizes the render driver located at /dev/dri/renderDXXX. This * patch implements a very simple driver that just returns a file - * descriptor when opened, as testing has shown that's all that's needed + * descriptor when opened. */ int GPURenderDriver::open(ThreadContext *tc, int mode, int flags) @@ -48,3 +48,12 @@ int tgt_fd = process->fds->allocFD(device_fd_entry); return tgt_fd; } + +/* DGPUs try to mmap the driver file. 
It doesn't appear they do anything + * with it, so we just return the address that's provided + */ +Addr GPURenderDriver::mmap(ThreadContext *tc, Addr start, uint64_t length, + int prot, int tgt_flags, int tgt_fd, off_t offset) +{ +return start; +} diff --git a/src/gpu-compute/gpu_render_driver.hh b/src/gpu-compute/gpu_render_driver.hh index d070668..a992976 100644 --- a/src/gpu-compute/gpu_render_driver.hh +++ b/src/gpu-compute/gpu_render_driver.hh @@ -47,6 +47,9 @@ { return -EBADF; } + +Addr mmap(ThreadContext *tc, Addr start, uint64_t length, + int prot, int tgt_flags, int tgt_fd, off_t offset) override; }; #endif -- To view, visit https://gem5-review.googlesource.com/c/public/gem5/+/47523 To unsubscribe, or for help writing mail filters, visit https://gem5-review.googlesource.com/settings Gerrit-Project: public/gem5 Gerrit-Branch: develop Gerrit-Change-Id: Ia010a2aebcf7e2c75e22d93dfb440937d1bef3b1 Gerrit-Change-Number: 47523 Gerrit-PatchSet: 1 Gerrit-Owner: Kyle Roarty Gerrit-MessageType: newchange ___ gem5-dev mailing list -- gem5-dev@gem5.org To unsubscribe send an email to gem5-dev-le...@gem5.org %(web_page_url)slistinfo%(cgiext)s/%(_internal_name)s
[gem5-dev] Change in gem5/gem5[develop]: arch-x86: Ignore mbind syscall
Kyle Roarty has uploaded this change for review. ( https://gem5-review.googlesource.com/c/public/gem5/+/47526 )

Change subject: arch-x86: Ignore mbind syscall
..

arch-x86: Ignore mbind syscall

mbind gets called when running with a dGPU in ROCm 4, but we are able to ignore it without breaking anything

Change-Id: I7c1ba47656122a5eb856981dca2a05359098e3b2
---
M src/arch/x86/linux/syscall_tbl64.cc
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/arch/x86/linux/syscall_tbl64.cc b/src/arch/x86/linux/syscall_tbl64.cc
index 7231595..6f2fad5 100644
--- a/src/arch/x86/linux/syscall_tbl64.cc
+++ b/src/arch/x86/linux/syscall_tbl64.cc
@@ -281,7 +281,7 @@
 { 234, "tgkill", tgkillFunc },
 { 235, "utimes" },
 { 236, "vserver" },
-{ 237, "mbind" },
+{ 237, "mbind", ignoreFunc },
 { 238, "set_mempolicy" },
 { 239, "get_mempolicy", ignoreFunc },
 { 240, "mq_open" },

--
To view, visit https://gem5-review.googlesource.com/c/public/gem5/+/47526
To unsubscribe, or for help writing mail filters, visit https://gem5-review.googlesource.com/settings

Gerrit-Project: public/gem5
Gerrit-Branch: develop
Gerrit-Change-Id: I7c1ba47656122a5eb856981dca2a05359098e3b2
Gerrit-Change-Number: 47526
Gerrit-PatchSet: 1
Gerrit-Owner: Kyle Roarty
Gerrit-MessageType: newchange
___
gem5-dev mailing list -- gem5-dev@gem5.org
To unsubscribe send an email to gem5-dev-le...@gem5.org
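To make the effect of adding a handler to a table entry concrete, here is a toy dispatch table in Python. It is only a sketch: `ignore_func`, `unimplemented` and `dispatch` are hypothetical names for this example, and the assumption that an ignored syscall simply reports success (returns 0) is mine, not something the change above states.

```python
def ignore_func(name, *args):
    """Assumed behaviour: pretend the syscall succeeded and do nothing."""
    return 0


def unimplemented(name, *args):
    """Entries without a handler abort the (toy) simulation."""
    raise NotImplementedError("syscall %s is not implemented" % name)


# Toy table keyed by syscall number, mirroring a few entries of the diff.
SYSCALLS = {
    236: ("vserver", unimplemented),
    237: ("mbind", ignore_func),          # the entry this change touches
    239: ("get_mempolicy", ignore_func),
}


def dispatch(num, *args):
    """Look up the handler for a syscall number and call it."""
    name, handler = SYSCALLS[num]
    return handler(name, *args)
```

With this shape, a program that calls `mbind` keeps running, while one that calls an entry with no handler still stops loudly.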
[gem5-dev] Change in gem5/gem5[develop]: arch-vega: Add fatal when decoding missing insts
Kyle Roarty has uploaded this change for review. ( https://gem5-review.googlesource.com/c/public/gem5/+/47522 ) Change subject: arch-vega: Add fatal when decoding missing insts .. arch-vega: Add fatal when decoding missing insts Certain instructions don't have implementations in instructions.cc, and get decoded as a nullptr. This adds a fatal when decoding a missing instruction, as we aren't able to properly run a program if all its instructions aren't implemented, and it allows us to figure out which instruction is missing, because fatals print the line they were called from. Change-Id: I7e3690f079b790dceee102063773d5fbbc8619f1 --- M src/arch/amdgpu/vega/decoder.cc 1 file changed, 229 insertions(+), 0 deletions(-) diff --git a/src/arch/amdgpu/vega/decoder.cc b/src/arch/amdgpu/vega/decoder.cc index 480d326..94035f6 100644 --- a/src/arch/amdgpu/vega/decoder.cc +++ b/src/arch/amdgpu/vega/decoder.cc @@ -4437,6 +4437,7 @@ GPUStaticInst* Decoder::decode_OP_SOP2__S_MUL_HI_U32(MachInst iFmt) { +fatal("Trying to decode instruction without a class\n"); return nullptr; } @@ -4449,42 +4450,49 @@ GPUStaticInst* Decoder::decode_OP_SOP2__S_LSHL1_ADD_U32(MachInst iFmt) { +fatal("Trying to decode instruction without a class\n"); return nullptr; } GPUStaticInst* Decoder::decode_OP_SOP2__S_LSHL2_ADD_U32(MachInst iFmt) { +fatal("Trying to decode instruction without a class\n"); return nullptr; } GPUStaticInst* Decoder::decode_OP_SOP2__S_LSHL3_ADD_U32(MachInst iFmt) { +fatal("Trying to decode instruction without a class\n"); return nullptr; } GPUStaticInst* Decoder::decode_OP_SOP2__S_LSHL4_ADD_U32(MachInst iFmt) { +fatal("Trying to decode instruction without a class\n"); return nullptr; } GPUStaticInst* Decoder::decode_OP_SOP2__S_PACK_LL_B32_B16(MachInst iFmt) { +fatal("Trying to decode instruction without a class\n"); return nullptr; } GPUStaticInst* Decoder::decode_OP_SOP2__S_PACK_LH_B32_B16(MachInst iFmt) { +fatal("Trying to decode instruction without a class\n"); return nullptr; } 
GPUStaticInst* Decoder::decode_OP_SOP2__S_HH_B32_B16(MachInst iFmt) { +fatal("Trying to decode instruction without a class\n"); return nullptr; } @@ -4611,6 +4619,7 @@ GPUStaticInst* Decoder::decode_OP_SOPK__S_CALL_B64(MachInst iFmt) { +fatal("Trying to decode instruction without a class\n"); return nullptr; } @@ -6831,108 +6840,126 @@ GPUStaticInst* Decoder::decode_OPU_VOP3__V_MAD_U32_U16(MachInst iFmt) { +fatal("Trying to decode instruction without a class\n"); return nullptr; } GPUStaticInst* Decoder::decode_OPU_VOP3__V_MAD_I32_I16(MachInst iFmt) { +fatal("Trying to decode instruction without a class\n"); return nullptr; } GPUStaticInst* Decoder::decode_OPU_VOP3__V_XAD_U32(MachInst iFmt) { +fatal("Trying to decode instruction without a class\n"); return nullptr; } GPUStaticInst* Decoder::decode_OPU_VOP3__V_MIN3_F16(MachInst iFmt) { +fatal("Trying to decode instruction without a class\n"); return nullptr; } GPUStaticInst* Decoder::decode_OPU_VOP3__V_MIN3_I16(MachInst iFmt) { +fatal("Trying to decode instruction without a class\n"); return nullptr; } GPUStaticInst* Decoder::decode_OPU_VOP3__V_MIN3_U16(MachInst iFmt) { +fatal("Trying to decode instruction without a class\n"); return nullptr; } GPUStaticInst* Decoder::decode_OPU_VOP3__V_MAX3_F16(MachInst iFmt) { +fatal("Trying to decode instruction without a class\n"); return nullptr; } GPUStaticInst* Decoder::decode_OPU_VOP3__V_MAX3_I16(MachInst iFmt) { +fatal("Trying to decode instruction without a class\n"); return nullptr; } GPUStaticInst* Decoder::decode_OPU_VOP3__V_MAX3_U16(MachInst iFmt) { +fatal("Trying to decode instruction without a class\n"); return nullptr; } GPUStaticInst* Decoder::decode_OPU_VOP3__V_MED3_F16(MachInst iFmt) { +fatal("Trying to decode instruction without a class\n"); return nullptr; } GPUStaticInst* Decoder::decode_OPU_VOP3__V_MED3_I16(MachInst iFmt) { +fatal("Trying to decode instruction without a class\n"); return nullptr; } GPUStaticInst* Decoder::decode_OPU_VOP3__V_MED3_U16(MachInst 
iFmt) { +fatal("Trying to decode instruction without a class\n"); return nullptr; } GPUStaticInst* Decoder::decode
[gem5-dev] Change in gem5/gem5[develop]: gpu-compute: Check for WAX dependences
Kyle Roarty has uploaded this change for review. ( https://gem5-review.googlesource.com/c/public/gem5/+/47539 ) Change subject: gpu-compute: Check for WAX dependences .. gpu-compute: Check for WAX dependences This adds checking if the destination registers are free or busy in the operandsReady() function for both scalar and vector registers. This allows us to catch WAX dependences between instructions. Change-Id: I0fb0b29e9608fca0d90c059422d4d9500d5b2a7d --- M src/gpu-compute/scalar_register_file.cc M src/gpu-compute/vector_register_file.cc 2 files changed, 22 insertions(+), 0 deletions(-) diff --git a/src/gpu-compute/scalar_register_file.cc b/src/gpu-compute/scalar_register_file.cc index 52e0a2f..3a00093 100644 --- a/src/gpu-compute/scalar_register_file.cc +++ b/src/gpu-compute/scalar_register_file.cc @@ -64,6 +64,17 @@ } } +for (const auto& dstScalarOp : ii->dstScalarRegOperands()) { +for (const auto& physIdx : dstScalarOp.physIndices()) { +if (regBusy(physIdx)) { +DPRINTF(GPUSRF, "WAX stall: WV[%d]: %s: physReg[%d]\n", +w->wfDynId, ii->disassemble(), physIdx); +w->stats.numTimesBlockedDueWAXDependencies++; +return false; +} +} +} + return true; } diff --git a/src/gpu-compute/vector_register_file.cc b/src/gpu-compute/vector_register_file.cc index dc5434d..2355643 100644 --- a/src/gpu-compute/vector_register_file.cc +++ b/src/gpu-compute/vector_register_file.cc @@ -71,6 +71,17 @@ } } +for (const auto& dstVecOp : ii->dstVecRegOperands()) { +for (const auto& physIdx : dstVecOp.physIndices()) { +if (regBusy(physIdx)) { +DPRINTF(GPUVRF, "WAX stall: WV[%d]: %s: physReg[%d]\n", +w->wfDynId, ii->disassemble(), physIdx); +w->stats.numTimesBlockedDueWAXDependencies++; +return false; +} +} +} + return true; } -- To view, visit https://gem5-review.googlesource.com/c/public/gem5/+/47539 To unsubscribe, or for help writing mail filters, visit https://gem5-review.googlesource.com/settings Gerrit-Project: public/gem5 Gerrit-Branch: develop Gerrit-Change-Id: 
I0fb0b29e9608fca0d90c059422d4d9500d5b2a7d Gerrit-Change-Number: 47539 Gerrit-PatchSet: 1 Gerrit-Owner: Kyle Roarty Gerrit-MessageType: newchange ___ gem5-dev mailing list -- gem5-dev@gem5.org To unsubscribe send an email to gem5-dev-le...@gem5.org %(web_page_url)slistinfo%(cgiext)s/%(_internal_name)s
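The destination check added above can be pictured with a small scoreboard model. This is a sketch only: `operands_ready` here is a free function over plain sets rather than the register-file method, and the source-operand check that precedes the new loop is my assumption about the surrounding code (the diff shows only its closing braces).

```python
def operands_ready(src_regs, dst_regs, busy):
    """Return (ready, reason) for one instruction.

    src_regs, dst_regs: physical register indices the instruction reads/writes.
    busy: set of physical registers with an outstanding write.
    """
    # Assumed pre-existing check: a busy source register must be waited on.
    for reg in src_regs:
        if reg in busy:
            return False, "RAW"
    # New check from the patch: a busy destination register also stalls,
    # so a later write cannot overtake an earlier access to the register.
    for reg in dst_regs:
        if reg in busy:
            return False, "WAX"
    return True, None


busy = {5}  # hypothetical: register 5 is still being written
stalled = operands_ready([1], [5], busy)
ready = operands_ready([1], [2], busy)
```

In the patch the stall is also counted in `numTimesBlockedDueWAXDependencies`; the toy version returns the reason instead so it stays self-contained.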
[gem5-dev] Change in gem5/gem5[release-staging-v21-1]: arch-gcn3: Free dest registers in non-memory Load DS insts
Kyle Roarty has uploaded this change for review. ( https://gem5-review.googlesource.com/c/public/gem5/+/48019 ) Change subject: arch-gcn3: Free dest registers in non-memory Load DS insts .. arch-gcn3: Free dest registers in non-memory Load DS insts Certain DS insts are classified as Loads, but don't actually go through the memory pipeline. However, any instruction classified as a load marks its destination registers as free in the memory pipeline. Because these instructions didn't use the memory pipeline, they never freed their destination registers, which led to a deadlock. This patch explicitly calls the function used to free the destination registers in the execute() method of those Load instructions that don't use the memory pipeline. Change-Id: Ic2ac2e232c8fbad63d0c62c1862f2bdaeaba4edf --- M src/arch/amdgpu/gcn3/insts/instructions.cc 1 file changed, 27 insertions(+), 0 deletions(-) diff --git a/src/arch/amdgpu/gcn3/insts/instructions.cc b/src/arch/amdgpu/gcn3/insts/instructions.cc index a421454..21ab58d 100644 --- a/src/arch/amdgpu/gcn3/insts/instructions.cc +++ b/src/arch/amdgpu/gcn3/insts/instructions.cc @@ -32397,6 +32397,15 @@ } vdst.write(); + +/** + * This is needed because we treat this instruction as a load + * but it's not an actual memory request. + * Without this, the destination register never gets marked as + * free, leading to a possible deadlock + */ +wf->computeUnit->vrf[wf->simdId]-> +scheduleWriteOperandsFromLoad(wf, gpuDynInst); } // execute // --- Inst_DS__DS_PERMUTE_B32 class methods --- @@ -32468,6 +32477,15 @@ wf->decLGKMInstsIssued(); wf->rdLmReqsInPipe--; wf->validateRequestCounters(); + +/** + * This is needed because we treat this instruction as a load + * but it's not an actual memory request. 
+ * Without this, the destination register never gets marked as + * free, leading to a possible deadlock + */ +wf->computeUnit->vrf[wf->simdId]-> +scheduleWriteOperandsFromLoad(wf, gpuDynInst); } // execute // --- Inst_DS__DS_BPERMUTE_B32 class methods --- @@ -32539,6 +32557,15 @@ wf->decLGKMInstsIssued(); wf->rdLmReqsInPipe--; wf->validateRequestCounters(); + +/** + * This is needed because we treat this instruction as a load + * but it's not an actual memory request. + * Without this, the destination register never gets marked as + * free, leading to a possible deadlock + */ +wf->computeUnit->vrf[wf->simdId]-> +scheduleWriteOperandsFromLoad(wf, gpuDynInst); } // execute // --- Inst_DS__DS_ADD_U64 class methods --- -- To view, visit https://gem5-review.googlesource.com/c/public/gem5/+/48019 To unsubscribe, or for help writing mail filters, visit https://gem5-review.googlesource.com/settings Gerrit-Project: public/gem5 Gerrit-Branch: release-staging-v21-1 Gerrit-Change-Id: Ic2ac2e232c8fbad63d0c62c1862f2bdaeaba4edf Gerrit-Change-Number: 48019 Gerrit-PatchSet: 1 Gerrit-Owner: Kyle Roarty Gerrit-MessageType: newchange ___ gem5-dev mailing list -- gem5-dev@gem5.org To unsubscribe send an email to gem5-dev-le...@gem5.org %(web_page_url)slistinfo%(cgiext)s/%(_internal_name)s
[gem5-dev] Change in gem5/gem5[release-staging-v21-1]: gpu-compute: Fix off-by-one when creating an AddrRange
Kyle Roarty has uploaded this change for review. ( https://gem5-review.googlesource.com/c/public/gem5/+/48020 )

Change subject: gpu-compute: Fix off-by-one when creating an AddrRange
..

gpu-compute: Fix off-by-one when creating an AddrRange

The end value of an AddrRange is already not included in the range, so subtracting one from the end creates an off-by-one error. This patch removes the extra -1 that was used when determining the end of an AddrRange in allocateGpuVma

Change-Id: I75659e9a7fabd991bb37be9aa40f8e409eb21154
---
M src/gpu-compute/gpu_compute_driver.cc
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gpu-compute/gpu_compute_driver.cc b/src/gpu-compute/gpu_compute_driver.cc
index 92ac641..f794b43 100644
--- a/src/gpu-compute/gpu_compute_driver.cc
+++ b/src/gpu-compute/gpu_compute_driver.cc
@@ -985,7 +985,7 @@
 GPUComputeDriver::allocateGpuVma(Request::CacheCoherenceFlags mtype,
 Addr start, Addr length)
 {
-AddrRange range = AddrRange(start, start + length - 1);
+AddrRange range = AddrRange(start, start + length);
 DPRINTF(GPUDriver, "Registering [%p - %p] with MTYPE %d\n",
 range.start(), range.end(), mtype);
 fatal_if(gpuVmas.insert(range, mtype) == gpuVmas.end(),

--
To view, visit https://gem5-review.googlesource.com/c/public/gem5/+/48020
To unsubscribe, or for help writing mail filters, visit https://gem5-review.googlesource.com/settings

Gerrit-Project: public/gem5
Gerrit-Branch: release-staging-v21-1
Gerrit-Change-Id: I75659e9a7fabd991bb37be9aa40f8e409eb21154
Gerrit-Change-Number: 48020
Gerrit-PatchSet: 1
Gerrit-Owner: Kyle Roarty
Gerrit-MessageType: newchange
___
gem5-dev mailing list -- gem5-dev@gem5.org
To unsubscribe send an email to gem5-dev-le...@gem5.org
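The off-by-one can be checked with a tiny half-open range model. This is a sketch, not gem5's `AddrRange`: `HalfOpenRange` is a hypothetical class that only assumes what the commit message states, namely that the end address is excluded from the range. The `start` and `length` values are made up.

```python
class HalfOpenRange:
    """Toy range [start, end): start is included, end is excluded."""

    def __init__(self, start, end):
        self.start = start
        self.end = end

    def size(self):
        return self.end - self.start

    def contains(self, addr):
        return self.start <= addr < self.end


start, length = 0x1000, 0x100  # hypothetical allocation

# After the fix: end = start + length, so all `length` bytes are covered.
fixed = HalfOpenRange(start, start + length)

# Before the fix: the extra -1 drops the last byte of the allocation.
buggy = HalfOpenRange(start, start + length - 1)

last_byte = start + length - 1
```

Under the exclusive-end convention the last valid byte, `last_byte`, belongs to `fixed` but not to `buggy`, which is the error the patch removes.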
[gem5-dev] Change in gem5/gem5[release-staging-v21-1]: gpu-compute: Fix off-by-one when creating an AddrRange
Kyle Roarty has submitted this change. ( https://gem5-review.googlesource.com/c/public/gem5/+/48020 ) Change subject: gpu-compute: Fix off-by-one when creating an AddrRange .. gpu-compute: Fix off-by-one when creating an AddrRange The end value of an AddrRange is already not included in the range, so subtracting one from the end creates an off-by-one error. This patch removes the extra -1 that was used when determining the end of an AddrRange in allocateGpuVma Change-Id: I75659e9a7fabd991bb37be9aa40f8e409eb21154 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/48020 Reviewed-by: Matt Sinclair Reviewed-by: Bobby R. Bruce Maintainer: Matt Sinclair Tested-by: kokoro --- M src/gpu-compute/gpu_compute_driver.cc 1 file changed, 1 insertion(+), 1 deletion(-) Approvals: Matt Sinclair: Looks good to me, but someone else must approve; Looks good to me, approved Bobby R. Bruce: Looks good to me, approved kokoro: Regressions pass diff --git a/src/gpu-compute/gpu_compute_driver.cc b/src/gpu-compute/gpu_compute_driver.cc index 92ac641..f794b43 100644 --- a/src/gpu-compute/gpu_compute_driver.cc +++ b/src/gpu-compute/gpu_compute_driver.cc @@ -985,7 +985,7 @@ GPUComputeDriver::allocateGpuVma(Request::CacheCoherenceFlags mtype, Addr start, Addr length) { -AddrRange range = AddrRange(start, start + length - 1); +AddrRange range = AddrRange(start, start + length); DPRINTF(GPUDriver, "Registering [%p - %p] with MTYPE %d\n", range.start(), range.end(), mtype); fatal_if(gpuVmas.insert(range, mtype) == gpuVmas.end(), -- To view, visit https://gem5-review.googlesource.com/c/public/gem5/+/48020 To unsubscribe, or for help writing mail filters, visit https://gem5-review.googlesource.com/settings Gerrit-Project: public/gem5 Gerrit-Branch: release-staging-v21-1 Gerrit-Change-Id: I75659e9a7fabd991bb37be9aa40f8e409eb21154 Gerrit-Change-Number: 48020 Gerrit-PatchSet: 2 Gerrit-Owner: Kyle Roarty Gerrit-Reviewer: Alex Dutu Gerrit-Reviewer: Bobby R. 
Bruce Gerrit-Reviewer: Kyle Roarty Gerrit-Reviewer: Matt Sinclair Gerrit-Reviewer: Matthew Poremba Gerrit-Reviewer: kokoro Gerrit-MessageType: merged ___ gem5-dev mailing list -- gem5-dev@gem5.org To unsubscribe send an email to gem5-dev-le...@gem5.org %(web_page_url)slistinfo%(cgiext)s/%(_internal_name)s
[gem5-dev] Change in gem5/gem5[release-staging-v21-1]: arch-gcn3: Free dest registers in non-memory Load DS insts
Kyle Roarty has submitted this change. ( https://gem5-review.googlesource.com/c/public/gem5/+/48019 ) Change subject: arch-gcn3: Free dest registers in non-memory Load DS insts .. arch-gcn3: Free dest registers in non-memory Load DS insts Certain DS insts are classified as Loads, but don't actually go through the memory pipeline. However, any instruction classified as a load marks its destination registers as free in the memory pipeline. Because these instructions didn't use the memory pipeline, they never freed their destination registers, which led to a deadlock. This patch explicitly calls the function used to free the destination registers in the execute() method of those Load instructions that don't use the memory pipeline. Change-Id: Ic2ac2e232c8fbad63d0c62c1862f2bdaeaba4edf Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/48019 Reviewed-by: Matt Sinclair Reviewed-by: Bobby R. Bruce Maintainer: Matt Sinclair Tested-by: kokoro --- M src/arch/amdgpu/gcn3/insts/instructions.cc 1 file changed, 27 insertions(+), 0 deletions(-) Approvals: Matt Sinclair: Looks good to me, but someone else must approve; Looks good to me, approved Bobby R. Bruce: Looks good to me, approved kokoro: Regressions pass diff --git a/src/arch/amdgpu/gcn3/insts/instructions.cc b/src/arch/amdgpu/gcn3/insts/instructions.cc index a421454..21ab58d 100644 --- a/src/arch/amdgpu/gcn3/insts/instructions.cc +++ b/src/arch/amdgpu/gcn3/insts/instructions.cc @@ -32397,6 +32397,15 @@ } vdst.write(); + +/** + * This is needed because we treat this instruction as a load + * but it's not an actual memory request. 
+ * Without this, the destination register never gets marked as + * free, leading to a possible deadlock + */ +wf->computeUnit->vrf[wf->simdId]-> +scheduleWriteOperandsFromLoad(wf, gpuDynInst); } // execute // --- Inst_DS__DS_PERMUTE_B32 class methods --- @@ -32468,6 +32477,15 @@ wf->decLGKMInstsIssued(); wf->rdLmReqsInPipe--; wf->validateRequestCounters(); + +/** + * This is needed because we treat this instruction as a load + * but it's not an actual memory request. + * Without this, the destination register never gets marked as + * free, leading to a possible deadlock + */ +wf->computeUnit->vrf[wf->simdId]-> +scheduleWriteOperandsFromLoad(wf, gpuDynInst); } // execute // --- Inst_DS__DS_BPERMUTE_B32 class methods --- @@ -32539,6 +32557,15 @@ wf->decLGKMInstsIssued(); wf->rdLmReqsInPipe--; wf->validateRequestCounters(); + +/** + * This is needed because we treat this instruction as a load + * but it's not an actual memory request. + * Without this, the destination register never gets marked as + * free, leading to a possible deadlock + */ +wf->computeUnit->vrf[wf->simdId]-> +scheduleWriteOperandsFromLoad(wf, gpuDynInst); } // execute // --- Inst_DS__DS_ADD_U64 class methods --- -- To view, visit https://gem5-review.googlesource.com/c/public/gem5/+/48019 To unsubscribe, or for help writing mail filters, visit https://gem5-review.googlesource.com/settings Gerrit-Project: public/gem5 Gerrit-Branch: release-staging-v21-1 Gerrit-Change-Id: Ic2ac2e232c8fbad63d0c62c1862f2bdaeaba4edf Gerrit-Change-Number: 48019 Gerrit-PatchSet: 2 Gerrit-Owner: Kyle Roarty Gerrit-Reviewer: Alex Dutu Gerrit-Reviewer: Bobby R. Bruce Gerrit-Reviewer: Kyle Roarty Gerrit-Reviewer: Matt Sinclair Gerrit-Reviewer: Matthew Poremba Gerrit-Reviewer: kokoro Gerrit-MessageType: merged ___ gem5-dev mailing list -- gem5-dev@gem5.org To unsubscribe send an email to gem5-dev-le...@gem5.org %(web_page_url)slistinfo%(cgiext)s/%(_internal_name)s
[gem5-dev] Change in gem5/gem5[release-staging-v21-1]: sim-se: Properly handle a clone with the VFORK flag
Kyle Roarty has uploaded this change for review. ( https://gem5-review.googlesource.com/c/public/gem5/+/48346 ) Change subject: sim-se: Properly handle a clone with the VFORK flag .. sim-se: Properly handle a clone with the VFORK flag When clone is called with the VFORK flag, the calling process is suspended until the child process either exits, or calls execve. This patch adds in a new variable to Process, which is used to store the context of the calling process if this process is created through a clone with VFORK set. This patch also adds the required support in clone to suspend the calling thread, and in exitImpl and execveFunc to wake up the calling thread when the child thread calls either of those functions Change-Id: I85af67544ea1d5df7102dcff1331b5a6f6f4fa7c --- M src/sim/process.cc M src/sim/process.hh M src/sim/syscall_emul.cc M src/sim/syscall_emul.hh 4 files changed, 34 insertions(+), 0 deletions(-) diff --git a/src/sim/process.cc b/src/sim/process.cc index 207c275..272fc9f 100644 --- a/src/sim/process.cc +++ b/src/sim/process.cc @@ -175,6 +175,9 @@ #ifndef CLONE_THREAD #define CLONE_THREAD 0 #endif +#ifndef CLONE_VFORK +#define CLONE_VFORK 0 +#endif if (CLONE_VM & flags) { /** * Share the process memory address space between the new process @@ -249,6 +252,10 @@ np->exitGroup = exitGroup; } +if (CLONE_VFORK & flags) { +np->vforkContexts.push_back(otc->contextId()); +} + np->argv.insert(np->argv.end(), argv.begin(), argv.end()); np->envp.insert(np->envp.end(), envp.begin(), envp.end()); } diff --git a/src/sim/process.hh b/src/sim/process.hh index 632ba90..34768a0 100644 --- a/src/sim/process.hh +++ b/src/sim/process.hh @@ -284,6 +284,9 @@ // Process was forked with SIGCHLD set. 
bool *sigchld; +// Contexts to wake up when this thread exits or calls execve +std::vector vforkContexts; + // Track how many system calls are executed statistics::Scalar numSyscalls; }; diff --git a/src/sim/syscall_emul.cc b/src/sim/syscall_emul.cc index 147cb39..713bec4 100644 --- a/src/sim/syscall_emul.cc +++ b/src/sim/syscall_emul.cc @@ -193,6 +193,16 @@ } } +/** + * If we were a thread created by a clone with vfork set, wake up + * the thread that created us + */ +if (!p->vforkContexts.empty()) { +ThreadContext *vtc = sys->threads[p->vforkContexts.front()]; +assert(vtc->status() == ThreadContext::Suspended); +vtc->activate(); +} + tc->halt(); /** diff --git a/src/sim/syscall_emul.hh b/src/sim/syscall_emul.hh index 09be700..8695638 100644 --- a/src/sim/syscall_emul.hh +++ b/src/sim/syscall_emul.hh @@ -1521,6 +1521,10 @@ ctc->pcState(cpc); ctc->activate(); +if (flags & OS::TGT_CLONE_VFORK) { +tc->suspend(); +} + return cp->pid(); } @@ -1998,6 +2002,16 @@ }; /** + * If we were a thread created by a clone with vfork set, wake up + * the thread that created us + */ +if (!p->vforkContexts.empty()) { +ThreadContext *vtc = p->system->threads[p->vforkContexts.front()]; +assert(vtc->status() == ThreadContext::Suspended); +vtc->activate(); +} + +/** * Note that ProcessParams is generated by swig and there are no other * examples of how to create anything but this default constructor. 
The * fields are manually initialized instead of passing parameters to the -- To view, visit https://gem5-review.googlesource.com/c/public/gem5/+/48346 To unsubscribe, or for help writing mail filters, visit https://gem5-review.googlesource.com/settings Gerrit-Project: public/gem5 Gerrit-Branch: release-staging-v21-1 Gerrit-Change-Id: I85af67544ea1d5df7102dcff1331b5a6f6f4fa7c Gerrit-Change-Number: 48346 Gerrit-PatchSet: 1 Gerrit-Owner: Kyle Roarty Gerrit-MessageType: newchange ___ gem5-dev mailing list -- gem5-dev@gem5.org To unsubscribe send an email to gem5-dev-le...@gem5.org %(web_page_url)slistinfo%(cgiext)s/%(_internal_name)s
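The suspend and wake-up protocol described in this change can be modelled in a few lines. The sketch below is a toy, not the simulator's code: `Thread`, `Proc`, `clone` and `wake_vfork_parent` are hypothetical names, thread status is a plain string, and only the flag value `0x4000` is taken from Linux's definition of `CLONE_VFORK`.

```python
CLONE_VFORK = 0x4000  # Linux flag value


class Thread:
    """Toy thread context with just a status field."""

    def __init__(self, ctx_id):
        self.ctx_id = ctx_id
        self.status = "Active"

    def suspend(self):
        self.status = "Suspended"

    def activate(self):
        self.status = "Active"


class Proc:
    """Toy process: remembers which contexts to wake on exit or execve."""

    def __init__(self):
        self.vfork_contexts = []


def clone(parent, child_proc, flags):
    """On a vfork-style clone, record the parent and suspend it."""
    if flags & CLONE_VFORK:
        child_proc.vfork_contexts.append(parent.ctx_id)
        parent.suspend()


def wake_vfork_parent(threads, proc):
    """Called from the toy exit and execve paths of the child."""
    if proc.vfork_contexts:
        parent = threads[proc.vfork_contexts[0]]
        assert parent.status == "Suspended"
        parent.activate()


threads = {0: Thread(0)}  # hypothetical system with one parent thread
child = Proc()
clone(threads[0], child, CLONE_VFORK)
status_while_child_runs = threads[0].status
wake_vfork_parent(threads, child)
status_after_child_exit = threads[0].status
```

A clone without the flag leaves `vfork_contexts` empty, so the wake-up step is a no-op for ordinary children.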
[gem5-dev] Change in gem5/gem5[release-staging-v21-1]: sim-se: Fix execve syscall
Kyle Roarty has uploaded this change for review. ( https://gem5-review.googlesource.com/c/public/gem5/+/48345 ) Change subject: sim-se: Fix execve syscall .. sim-se: Fix execve syscall There were three things preventing execve from working. First, the entrypoint for the new program wasn't correct. This was fixed by calling Process::init, which adds a bias to the entrypoint. Second, the uname string wasn't being copied over. This meant that when the new executable tried to run, it would think the kernel was too old to run on, and would error out. This was fixed by copying over the uname string (the `release` string in Process) when creating the new process. This patch also ensures we copy over the uname string in the clone implementation, as otherwise a cloned thread that called execve would crash. Finally, we choose to not delete the new ProcessParams or the old Process. This is done both because it matches what is done in cloneFunc and because deleting the old process results in a segfault later on. Change-Id: I4ca201da689e9e37671b4cb477dc76fa12eecf69 --- M src/sim/syscall_emul.hh 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/src/sim/syscall_emul.hh b/src/sim/syscall_emul.hh index aa02fd6..09be700 100644 --- a/src/sim/syscall_emul.hh +++ b/src/sim/syscall_emul.hh @@ -1452,6 +1452,7 @@ pp->euid = p->euid(); pp->gid = p->gid(); pp->egid = p->egid(); +pp->release = p->release; /* Find the first free PID that's less than the maximum */ std::set const& pids = p->system->PIDs; @@ -2017,6 +2018,7 @@ pp->errout.assign("cerr"); pp->cwd.assign(p->tgtCwd); pp->system = p->system; +pp->release = p->release; /** * Prevent process object creation with identical PIDs (which will trip * a fatal check in Process constructor). 
The execve call is supposed to @@ -2027,7 +2029,9 @@ */ p->system->PIDs.erase(p->pid()); Process *new_p = pp->create(); -delete pp; +// TODO: there is no way to know when the Process SimObject is done with +// the params pointer. Both the params pointer (pp) and the process +// pointer (p) are normally managed in python and are never cleaned up. /** * Work through the file descriptor array and close any files marked @@ -2042,10 +2046,10 @@ *new_p->sigchld = true; -delete p; tc->clearArchRegs(); tc->setProcessPtr(new_p); new_p->assignThreadContext(tc->contextId()); +new_p->init(); new_p->initState(); tc->activate(); TheISA::PCState pcState = tc->pcState(); -- To view, visit https://gem5-review.googlesource.com/c/public/gem5/+/48345 To unsubscribe, or for help writing mail filters, visit https://gem5-review.googlesource.com/settings Gerrit-Project: public/gem5 Gerrit-Branch: release-staging-v21-1 Gerrit-Change-Id: I4ca201da689e9e37671b4cb477dc76fa12eecf69 Gerrit-Change-Number: 48345 Gerrit-PatchSet: 1 Gerrit-Owner: Kyle Roarty Gerrit-MessageType: newchange ___ gem5-dev mailing list -- gem5-dev@gem5.org To unsubscribe send an email to gem5-dev-le...@gem5.org %(web_page_url)slistinfo%(cgiext)s/%(_internal_name)s
[gem5-dev] Change in gem5/gem5[release-staging-v21-1]: arch-gcn3: Implement LDS accesses in Flat instructions
Kyle Roarty has uploaded this change for review. ( https://gem5-review.googlesource.com/c/public/gem5/+/48343 ) Change subject: arch-gcn3: Implement LDS accesses in Flat instructions .. arch-gcn3: Implement LDS accesses in Flat instructions Add support for LDS accesses by allowing Flat instructions to dispatch into the local memory pipeline if the requested address is in the group aperture. This requires implementing LDS accesses in the Flat initMemRead/Write functions, in a similar fashion to the DS functions of the same name. Because we now can potentially dispatch to the local memory pipeline, this change also adds a check to regain any tokens we requested as a flat instruction. Change-Id: Id26191f7ee43291a5e5ca5f39af06af981ec23ab --- M src/arch/amdgpu/gcn3/insts/instructions.cc M src/arch/amdgpu/gcn3/insts/op_encodings.hh M src/gpu-compute/gpu_dyn_inst.cc M src/gpu-compute/local_memory_pipeline.cc 4 files changed, 156 insertions(+), 6 deletions(-) diff --git a/src/arch/amdgpu/gcn3/insts/instructions.cc b/src/arch/amdgpu/gcn3/insts/instructions.cc index 79af7ac..95af790 100644 --- a/src/arch/amdgpu/gcn3/insts/instructions.cc +++ b/src/arch/amdgpu/gcn3/insts/instructions.cc @@ -39384,6 +39384,9 @@ if (gpuDynInst->executedAs() == enums::SC_GLOBAL) { gpuDynInst->computeUnit()->globalMemoryPipe .issueRequest(gpuDynInst); +} else if (gpuDynInst->executedAs() == enums::SC_GROUP) { +gpuDynInst->computeUnit()->localMemoryPipe +.issueRequest(gpuDynInst); } else { fatal("Non global flat instructions not implemented yet.\n"); } @@ -39448,6 +39451,9 @@ if (gpuDynInst->executedAs() == enums::SC_GLOBAL) { gpuDynInst->computeUnit()->globalMemoryPipe .issueRequest(gpuDynInst); +} else if (gpuDynInst->executedAs() == enums::SC_GROUP) { +gpuDynInst->computeUnit()->localMemoryPipe +.issueRequest(gpuDynInst); } else { fatal("Non global flat instructions not implemented yet.\n"); } @@ -39511,6 +39517,9 @@ if (gpuDynInst->executedAs() == enums::SC_GLOBAL) { 
gpuDynInst->computeUnit()->globalMemoryPipe .issueRequest(gpuDynInst); +} else if (gpuDynInst->executedAs() == enums::SC_GROUP) { +gpuDynInst->computeUnit()->localMemoryPipe +.issueRequest(gpuDynInst); } else { fatal("Non global flat instructions not implemented yet.\n"); } @@ -39603,6 +39612,9 @@ if (gpuDynInst->executedAs() == enums::SC_GLOBAL) { gpuDynInst->computeUnit()->globalMemoryPipe .issueRequest(gpuDynInst); +} else if (gpuDynInst->executedAs() == enums::SC_GROUP) { +gpuDynInst->computeUnit()->localMemoryPipe +.issueRequest(gpuDynInst); } else { fatal("Non global flat instructions not implemented yet.\n"); } @@ -39667,6 +39679,9 @@ if (gpuDynInst->executedAs() == enums::SC_GLOBAL) { gpuDynInst->computeUnit()->globalMemoryPipe .issueRequest(gpuDynInst); +} else if (gpuDynInst->executedAs() == enums::SC_GROUP) { +gpuDynInst->computeUnit()->localMemoryPipe +.issueRequest(gpuDynInst); } else { fatal("Non global flat instructions not implemented yet.\n"); } @@ -39731,6 +39746,9 @@ if (gpuDynInst->executedAs() == enums::SC_GLOBAL) { gpuDynInst->computeUnit()->globalMemoryPipe .issueRequest(gpuDynInst); +} else if (gpuDynInst->executedAs() == enums::SC_GROUP) { +gpuDynInst->computeUnit()->localMemoryPipe +.issueRequest(gpuDynInst); } else { fatal("Non global flat instructions not implemented yet.\n"); } @@ -39804,6 +39822,9 @@ if (gpuDynInst->executedAs() == enums::SC_GLOBAL) { gpuDynInst->computeUnit()->globalMemoryPipe .issueRequest(gpuDynInst); +} else if (gpuDynInst->executedAs() == enums::SC_GROUP) { +gpuDynInst->computeUnit()->localMemoryPipe +.issueRequest(gpuDynInst); } else { fatal("Non global flat instructions not implemented yet.\n"); } @@ -39889,6 +39910,9 @@ if (gpuDynInst->executedAs() == enums::SC_GLOBAL) { gpuDynInst->computeUnit()->globalMemoryPipe .issueRequest(gpuDynInst); +} else if (gpuDynInst->executedAs() == enums::SC_GROUP) { +gpuDynInst->computeUnit()->localMemoryPipe +.issueRequest(gpuDynInst); } else { fatal("Non global flat 
instructions not implemented yet.\n"); } @@ -39952,6 +39976,9 @@ if (gpuDynInst->executedAs() ==
[gem5-dev] Change in gem5/gem5[release-staging-v21-1]: arch-gcn3: Validate if scalar sources are scalar gprs
Kyle Roarty has uploaded this change for review. ( https://gem5-review.googlesource.com/c/public/gem5/+/48344 )

Change subject: arch-gcn3: Validate if scalar sources are scalar gprs
..

arch-gcn3: Validate if scalar sources are scalar gprs

Scalar sources can either be a general-purpose register or a constant register that holds a single value. If we don't check whether the register is a general-purpose register, it's possible that we get a constant register, which then causes all of the register mapping code to break, as the constant registers aren't supposed to be mapped like the general-purpose registers are.

This fix adds an isScalarReg check to the instruction encodings that were missing it.

Change-Id: I3d7d5393aa324737301c3269cc227b60e8a159e4
---
M src/arch/amdgpu/gcn3/insts/op_encodings.cc
1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/src/arch/amdgpu/gcn3/insts/op_encodings.cc b/src/arch/amdgpu/gcn3/insts/op_encodings.cc
index cbbb767..cf20a2e 100644
--- a/src/arch/amdgpu/gcn3/insts/op_encodings.cc
+++ b/src/arch/amdgpu/gcn3/insts/op_encodings.cc
@@ -1277,12 +1277,12 @@
 reg = extData.SRSRC;
 srcOps.emplace_back(reg, getOperandSize(opNum), true,
- true, false, false);
+ isScalarReg(reg), false, false);
 opNum++;
 reg = extData.SOFFSET;
 srcOps.emplace_back(reg, getOperandSize(opNum), true,
- true, false, false);
+ isScalarReg(reg), false, false);
 opNum++;
 }
@@ -1368,12 +1368,12 @@
 reg = extData.SRSRC;
 srcOps.emplace_back(reg, getOperandSize(opNum), true,
- true, false, false);
+ isScalarReg(reg), false, false);
 opNum++;
 reg = extData.SOFFSET;
 srcOps.emplace_back(reg, getOperandSize(opNum), true,
- true, false, false);
+ isScalarReg(reg), false, false);
 opNum++;
 // extData.VDATA moves in the reg list depending on the instruction
@@ -1441,13 +1441,13 @@
 reg = extData.SRSRC;
 srcOps.emplace_back(reg, getOperandSize(opNum), true,
- true, false, false);
+ isScalarReg(reg), false, false);
 opNum++;
 if (getNumOperands() == 4) {
 reg = extData.SSAMP;
 srcOps.emplace_back(reg, getOperandSize(opNum), true,
- true, false, false);
+ isScalarReg(reg), false, false);
 opNum++;
 }

--
To view, visit https://gem5-review.googlesource.com/c/public/gem5/+/48344
To unsubscribe, or for help writing mail filters, visit https://gem5-review.googlesource.com/settings

Gerrit-Project: public/gem5
Gerrit-Branch: release-staging-v21-1
Gerrit-Change-Id: I3d7d5393aa324737301c3269cc227b60e8a159e4
Gerrit-Change-Number: 48344
Gerrit-PatchSet: 1
Gerrit-Owner: Kyle Roarty
Gerrit-MessageType: newchange
[gem5-dev] Change in gem5/gem5[release-staging-v21-1]: arch-gcn3: Implement large ds_read/write instructions
Kyle Roarty has uploaded this change for review. ( https://gem5-review.googlesource.com/c/public/gem5/+/48342 ) Change subject: arch-gcn3: Implement large ds_read/write instructions .. arch-gcn3: Implement large ds_read/write instructions This implements the 96 and 128b ds_read/write instructions in a similar fashion to the 3 and 4 dword flat_load/store instructions. These instructions are treated as reads/writes of 3 or 4 dwords, instead of as a single 96b/128b memory transaction, due to the limitations of the VecOperand class used in the amdgpu code. In order to handle treating the memory transaction as multiple dwords, the patch also adds in new initMemRead/initMemWrite functions for ds instructions. These are similar to the functions used in flat instructions for the same purpose. Change-Id: I0f2ba3cb7cf040abb876e6eae55a6d38149ee960 --- M src/arch/amdgpu/gcn3/insts/instructions.cc M src/arch/amdgpu/gcn3/insts/instructions.hh M src/arch/amdgpu/gcn3/insts/op_encodings.hh 3 files changed, 232 insertions(+), 4 deletions(-) diff --git a/src/arch/amdgpu/gcn3/insts/instructions.cc b/src/arch/amdgpu/gcn3/insts/instructions.cc index 21ab58d..79af7ac 100644 --- a/src/arch/amdgpu/gcn3/insts/instructions.cc +++ b/src/arch/amdgpu/gcn3/insts/instructions.cc @@ -34335,9 +34335,52 @@ void Inst_DS__DS_WRITE_B96::execute(GPUDynInstPtr gpuDynInst) { -panicUnimplemented(); +Wavefront *wf = gpuDynInst->wavefront(); +gpuDynInst->execUnitId = wf->execUnitId; +gpuDynInst->latency.init(gpuDynInst->computeUnit()); +gpuDynInst->latency.set( +gpuDynInst->computeUnit()->cyclesToTicks(Cycles(24))); +ConstVecOperandU32 addr(gpuDynInst, extData.ADDR); +ConstVecOperandU32 data0(gpuDynInst, extData.DATA0); +ConstVecOperandU32 data1(gpuDynInst, extData.DATA0 + 1); +ConstVecOperandU32 data2(gpuDynInst, extData.DATA0 + 2); + +addr.read(); +data0.read(); +data1.read(); +data2.read(); + +calcAddr(gpuDynInst, addr); + +for (int lane = 0; lane < NumVecElemPerVecReg; ++lane) { +if 
(gpuDynInst->exec_mask[lane]) { +(reinterpret_cast( +gpuDynInst->d_data))[lane * 4] = data0[lane]; +(reinterpret_cast( +gpuDynInst->d_data))[lane * 4 + 1] = data1[lane]; +(reinterpret_cast( +gpuDynInst->d_data))[lane * 4 + 2] = data2[lane]; +} +} + + gpuDynInst->computeUnit()->localMemoryPipe.issueRequest(gpuDynInst); } +void +Inst_DS__DS_WRITE_B96::initiateAcc(GPUDynInstPtr gpuDynInst) +{ +Addr offset0 = instData.OFFSET0; +Addr offset1 = instData.OFFSET1; +Addr offset = (offset1 << 8) | offset0; + +initMemWrite<3>(gpuDynInst, offset); +} // initiateAcc + +void +Inst_DS__DS_WRITE_B96::completeAcc(GPUDynInstPtr gpuDynInst) +{ +} // completeAcc + Inst_DS__DS_WRITE_B128::Inst_DS__DS_WRITE_B128(InFmt_DS *iFmt) : Inst_DS(iFmt, "ds_write_b128") { @@ -34354,9 +34397,56 @@ void Inst_DS__DS_WRITE_B128::execute(GPUDynInstPtr gpuDynInst) { -panicUnimplemented(); +Wavefront *wf = gpuDynInst->wavefront(); +gpuDynInst->execUnitId = wf->execUnitId; +gpuDynInst->latency.init(gpuDynInst->computeUnit()); +gpuDynInst->latency.set( +gpuDynInst->computeUnit()->cyclesToTicks(Cycles(24))); +ConstVecOperandU32 addr(gpuDynInst, extData.ADDR); +ConstVecOperandU32 data0(gpuDynInst, extData.DATA0); +ConstVecOperandU32 data1(gpuDynInst, extData.DATA0 + 1); +ConstVecOperandU32 data2(gpuDynInst, extData.DATA0 + 2); +ConstVecOperandU32 data3(gpuDynInst, extData.DATA0 + 3); + +addr.read(); +data0.read(); +data1.read(); +data2.read(); +data3.read(); + +calcAddr(gpuDynInst, addr); + +for (int lane = 0; lane < NumVecElemPerVecReg; ++lane) { +if (gpuDynInst->exec_mask[lane]) { +(reinterpret_cast( +gpuDynInst->d_data))[lane * 4] = data0[lane]; +(reinterpret_cast( +gpuDynInst->d_data))[lane * 4 + 1] = data1[lane]; +(reinterpret_cast( +gpuDynInst->d_data))[lane * 4 + 2] = data2[lane]; +(reinterpret_cast( +gpuDynInst->d_data))[lane * 4 + 3] = data3[lane]; +} +} + + gpuDynInst->computeUnit()->localMemoryPipe.issueRequest(gpuDynInst); } +void +Inst_DS__DS_WRITE_B128::initiateAcc(GPUDynInstPtr gpuDynInst) +{ 
+Addr offset0 = instData.OFFSET0; +Addr offset1 = instData.OFFSET1; +Addr offset = (offset1 << 8) | offset0; + +initMemWrite<4>(gpuDynInst, offs
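To make the data layout in the ds_write_b96/b128 patch above easier to follow, here is a small Python model (not gem5 code). It mirrors two things visible in the diff: each active lane's dwords are stored at a stride of 4 dwords (`lane * 4 + i`), and the instruction offset is built as `(offset1 << 8) | offset0`. The function names, the flat list standing in for `d_data`, and the 64-lane width are assumptions made for illustration.

```python
NUM_LANES = 64  # assumed wavefront width; the patch uses NumVecElemPerVecReg


def combine_offsets(offset0, offset1):
    """Mirror `(offset1 << 8) | offset0` from the initiateAcc bodies."""
    return (offset1 << 8) | offset0


def pack_lanes(exec_mask, regs):
    """Pack len(regs) dwords per active lane at a stride of 4 dwords.

    exec_mask: list of booleans, one per lane.
    regs: list of per-lane value lists (3 for b96, 4 for b128).
    """
    d_data = [0] * (NUM_LANES * 4)
    for lane in range(NUM_LANES):
        if exec_mask[lane]:
            for i, reg in enumerate(regs):
                d_data[lane * 4 + i] = reg[lane]
    return d_data
```

Note that the 96-bit case still uses a stride of 4 dwords per lane, so the fourth slot of each lane is simply left unused in this model.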
[gem5-dev] Change in gem5/gem5[release-staging-v21-1]: mem-ruby: Account for misaligned accesses in GPUCoalescer
Kyle Roarty has uploaded this change for review. ( https://gem5-review.googlesource.com/c/public/gem5/+/48341 ) Change subject: mem-ruby: Account for misaligned accesses in GPUCoalescer .. mem-ruby: Account for misaligned accesses in GPUCoalescer Previously, we assumed that the maximum number of requests that would be issued by an instruction was equal to the number of threads that were active for that instruction. However, if a thread has an access that crosses a cache line, that thread has a misaligned access, and needs to request both cache lines. This patch takes that into account by checking the status vector for each thread in that instruction to determine the number of requests. Change-Id: I1994962c46d504b48654dbd22bcd786c9f382fd9 --- M src/mem/ruby/system/GPUCoalescer.cc 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/src/mem/ruby/system/GPUCoalescer.cc b/src/mem/ruby/system/GPUCoalescer.cc index c00e7c0..2390ba6 100644 --- a/src/mem/ruby/system/GPUCoalescer.cc +++ b/src/mem/ruby/system/GPUCoalescer.cc @@ -645,7 +645,10 @@ // of the exec_mask. int num_packets = 1; if (!m_usingRubyTester) { -num_packets = getDynInst(pkt)->exec_mask.count(); +num_packets = 0; +for (int i = 0; i < TheGpuISA::NumVecElemPerVecReg; i++) { +num_packets += getDynInst(pkt)->getLaneStatus(i); +} } // the pkt is temporarily stored in the uncoalesced table until -- To view, visit https://gem5-review.googlesource.com/c/public/gem5/+/48341 To unsubscribe, or for help writing mail filters, visit https://gem5-review.googlesource.com/settings Gerrit-Project: public/gem5 Gerrit-Branch: release-staging-v21-1 Gerrit-Change-Id: I1994962c46d504b48654dbd22bcd786c9f382fd9 Gerrit-Change-Number: 48341 Gerrit-PatchSet: 1 Gerrit-Owner: Kyle Roarty Gerrit-MessageType: newchange ___ gem5-dev mailing list -- gem5-dev@gem5.org To unsubscribe send an email to gem5-dev-le...@gem5.org %(web_page_url)slistinfo%(cgiext)s/%(_internal_name)s
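The counting change in the GPUCoalescer patch above can be illustrated with a short Python model (not gem5 code). It compares the old count (one request per active lane) with the new one (sum of per-lane request counts). How the per-lane status is derived here, from an address, an access size and a 64-byte cache line, is an assumption for the sake of the example; the patch itself only reads `getLaneStatus(i)`.

```python
LINE_SIZE = 64  # assumed cache-line size in bytes


def lines_touched(addr, size, line_size=LINE_SIZE):
    """Number of cache lines covered by an access of `size` bytes at `addr`."""
    first = addr // line_size
    last = (addr + size - 1) // line_size
    return last - first + 1


def num_packets_old(accesses):
    """Old behaviour: one request per active lane (exec_mask.count())."""
    return sum(1 for access in accesses if access is not None)


def num_packets_new(accesses):
    """New behaviour: sum the per-lane request counts."""
    return sum(lines_touched(*access) for access in accesses
               if access is not None)


# One aligned lane, one lane straddling a line boundary, one inactive lane.
example = [(0, 4), (60, 8), None]
```

With `example`, the old count is 2 while the new count is 3, because the second lane needs both cache lines.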
[gem5-dev] Change in gem5/gem5[release-staging-v21-1]: gpu-compute: Use sorted map for coalescerFIFO
Kyle Roarty has uploaded this change for review. ( https://gem5-review.googlesource.com/c/public/gem5/+/48340 ) Change subject: gpu-compute: Use sorted map for coalescerFIFO .. gpu-compute: Use sorted map for coalescerFIFO coalescerFIFO, being a FIFO, should have a consistent ordering. unordered_map is not ordered, which led to a scenario where the first thing placed in the FIFO never got processed. This patch changes the unordered_map to a regular map, which is ordered. Change-Id: I9c7ab32c038d5e60f6b55236266a27b0cae8bfb0 --- M src/gpu-compute/tlb_coalescer.hh 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/gpu-compute/tlb_coalescer.hh b/src/gpu-compute/tlb_coalescer.hh index b97801b..fce8740 100644 --- a/src/gpu-compute/tlb_coalescer.hh +++ b/src/gpu-compute/tlb_coalescer.hh @@ -100,7 +100,7 @@ * option is to change it to curTick(), so we coalesce based * on the receive time. */ -typedef std::unordered_map> +typedef std::map> CoalescingFIFO; CoalescingFIFO coalescerFIFO; -- To view, visit https://gem5-review.googlesource.com/c/public/gem5/+/48340 To unsubscribe, or for help writing mail filters, visit https://gem5-review.googlesource.com/settings Gerrit-Project: public/gem5 Gerrit-Branch: release-staging-v21-1 Gerrit-Change-Id: I9c7ab32c038d5e60f6b55236266a27b0cae8bfb0 Gerrit-Change-Number: 48340 Gerrit-PatchSet: 1 Gerrit-Owner: Kyle Roarty Gerrit-MessageType: newchange ___ gem5-dev mailing list -- gem5-dev@gem5.org To unsubscribe send an email to gem5-dev-le...@gem5.org %(web_page_url)slistinfo%(cgiext)s/%(_internal_name)s
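As a loose illustration of why the coalescerFIFO patch above wants a key-ordered container, the following Python sketch (not gem5 code) always visits the bucket with the smallest key first, which is what iterating a `std::map` gives. Python dicts keep insertion order rather than key order, so the sort is explicit here. Treating the key as a tick value and the payload as a list of request names are assumptions for this example.

```python
def front_tick(fifo):
    """Key of the oldest bucket, i.e. what std::map::begin() would point at."""
    return min(fifo)


def drain_in_key_order(fifo):
    """Visit buckets in ascending key order, as iterating a std::map would."""
    return [(tick, fifo[tick]) for tick in sorted(fifo)]


# Buckets inserted out of order on purpose.
example_fifo = {300: ["c"], 100: ["a"], 200: ["b"]}
```

With a hash-ordered container there is no such guarantee, so code that treats the first iterated element as the oldest entry can skip it indefinitely, which matches the symptom described in the commit message.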
[gem5-dev] Change in gem5/gem5[develop]: util-docker: Fix building gcn-gpu image
Kyle Roarty has uploaded this change for review. ( https://gem5-review.googlesource.com/c/public/gem5/+/50847 ) Change subject: util-docker: Fix building gcn-gpu image .. util-docker: Fix building gcn-gpu image In the gcn-gpu image, rocBLAS wasn't able to be installed. This was due to us not installing rocm-cmake, as rocBLAS is dependent on it and will download the most recent version of rocm-cmake if it isn't installed. The most recent version of rocm-cmake wasn't compatible with the version of ROCm we're using. This patch installs rocm-cmake before building and installing rocBLAS instead of after. Change-Id: Iaaa34d5e0d6594fddd0d1a7d147f43405163ca89 --- M util/dockerfiles/gcn-gpu/Dockerfile 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/util/dockerfiles/gcn-gpu/Dockerfile b/util/dockerfiles/gcn-gpu/Dockerfile index 360ab1f..dee02b0 100644 --- a/util/dockerfiles/gcn-gpu/Dockerfile +++ b/util/dockerfiles/gcn-gpu/Dockerfile @@ -98,6 +98,10 @@ RUN ln -s /HIP/build/rocclr/CMakeFiles/Export/_opt/rocm/hip/lib/cmake/hip/* /opt/rocm/hip/lib/cmake/hip/ WORKDIR / +# rocBLAS downloads the most recent rocm-cmake if it isn't installed before +# building +RUN apt install rocm-cmake + RUN git clone -b rocm-4.0.0 \ https://github.com/ROCmSoftwarePlatform/rocBLAS.git && mkdir rocBLAS/build @@ -109,7 +113,7 @@ WORKDIR / # MIOpen dependencies + MIOpen -RUN apt install rocm-cmake rocm-clang-ocl miopen-hip +RUN apt install rocm-clang-ocl miopen-hip # Clone MIOpen repo so that we have the kernel sources available RUN git clone -b rocm-4.0.1 https://github.com/ROCmSoftwarePlatform/MIOpen.git -- To view, visit https://gem5-review.googlesource.com/c/public/gem5/+/50847 To unsubscribe, or for help writing mail filters, visit https://gem5-review.googlesource.com/settings Gerrit-Project: public/gem5 Gerrit-Branch: develop Gerrit-Change-Id: Iaaa34d5e0d6594fddd0d1a7d147f43405163ca89 Gerrit-Change-Number: 50847 Gerrit-PatchSet: 1 Gerrit-Owner: Kyle Roarty Gerrit-MessageType: 
newchange
[gem5-dev] Change in gem5/gem5[develop]: util-docker: Fix building gcn-gpu image
Kyle Roarty has submitted this change. ( https://gem5-review.googlesource.com/c/public/gem5/+/50847 ) Change subject: util-docker: Fix building gcn-gpu image .. util-docker: Fix building gcn-gpu image In the gcn-gpu image, rocBLAS wasn't able to be installed. This was due to us not installing rocm-cmake, as rocBLAS is dependent on it and will download the most recent version of rocm-cmake if it isn't installed. The most recent version of rocm-cmake wasn't compatible with the version of ROCm we're using. This patch installs rocm-cmake before building and installing rocBLAS instead of after. Change-Id: Iaaa34d5e0d6594fddd0d1a7d147f43405163ca89 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/50847 Reviewed-by: Matt Sinclair Reviewed-by: Bobby R. Bruce Maintainer: Matt Sinclair Maintainer: Bobby R. Bruce Tested-by: kokoro --- M util/dockerfiles/gcn-gpu/Dockerfile 1 file changed, 5 insertions(+), 1 deletion(-) Approvals: Matt Sinclair: Looks good to me, approved; Looks good to me, approved Bobby R. 
Bruce: Looks good to me, approved; Looks good to me, approved kokoro: Regressions pass diff --git a/util/dockerfiles/gcn-gpu/Dockerfile b/util/dockerfiles/gcn-gpu/Dockerfile index 360ab1f..dee02b0 100644 --- a/util/dockerfiles/gcn-gpu/Dockerfile +++ b/util/dockerfiles/gcn-gpu/Dockerfile @@ -98,6 +98,10 @@ RUN ln -s /HIP/build/rocclr/CMakeFiles/Export/_opt/rocm/hip/lib/cmake/hip/* /opt/rocm/hip/lib/cmake/hip/ WORKDIR / +# rocBLAS downloads the most recent rocm-cmake if it isn't installed before +# building +RUN apt install rocm-cmake + RUN git clone -b rocm-4.0.0 \ https://github.com/ROCmSoftwarePlatform/rocBLAS.git && mkdir rocBLAS/build @@ -109,7 +113,7 @@ WORKDIR / # MIOpen dependencies + MIOpen -RUN apt install rocm-cmake rocm-clang-ocl miopen-hip +RUN apt install rocm-clang-ocl miopen-hip # Clone MIOpen repo so that we have the kernel sources available RUN git clone -b rocm-4.0.1 https://github.com/ROCmSoftwarePlatform/MIOpen.git -- To view, visit https://gem5-review.googlesource.com/c/public/gem5/+/50847 To unsubscribe, or for help writing mail filters, visit https://gem5-review.googlesource.com/settings Gerrit-Project: public/gem5 Gerrit-Branch: develop Gerrit-Change-Id: Iaaa34d5e0d6594fddd0d1a7d147f43405163ca89 Gerrit-Change-Number: 50847 Gerrit-PatchSet: 2 Gerrit-Owner: Kyle Roarty Gerrit-Reviewer: Bobby R. Bruce Gerrit-Reviewer: Kyle Roarty Gerrit-Reviewer: Matt Sinclair Gerrit-Reviewer: Matthew Poremba Gerrit-Reviewer: kokoro Gerrit-MessageType: merged ___ gem5-dev mailing list -- gem5-dev@gem5.org To unsubscribe send an email to gem5-dev-le...@gem5.org %(web_page_url)slistinfo%(cgiext)s/%(_internal_name)s
[gem5-dev] Change in gem5/gem5[develop]: arch-gcn3: Fix MUBUF out-of-bounds case 1
Kyle Roarty has uploaded this change for review. ( https://gem5-review.googlesource.com/c/public/gem5/+/51127 )

Change subject: arch-gcn3: Fix MUBUF out-of-bounds case 1
..

arch-gcn3: Fix MUBUF out-of-bounds case 1

This patch updates the out-of-bounds check to compare against the correct buffer_offset, which is computed differently depending on whether const_swizzle_enable is true or false.

Change-Id: I5c687c09ee7f8e446618084b8545b74a84211d4d
---
M src/arch/amdgpu/gcn3/insts/op_encodings.hh
1 file changed, 36 insertions(+), 20 deletions(-)

diff --git a/src/arch/amdgpu/gcn3/insts/op_encodings.hh b/src/arch/amdgpu/gcn3/insts/op_encodings.hh
index 24edfa7..be96924 100644
--- a/src/arch/amdgpu/gcn3/insts/op_encodings.hh
+++ b/src/arch/amdgpu/gcn3/insts/op_encodings.hh
@@ -634,6 +634,7 @@
 Addr stride = 0;
 Addr buf_idx = 0;
 Addr buf_off = 0;
+Addr buffer_offset = 0;
 BufferRsrcDescriptor rsrc_desc;
 std::memcpy((void*)&rsrc_desc, s_rsrc_desc.rawDataPtr(),
@@ -656,6 +657,26 @@
 buf_off = v_off[lane] + inst_offset;
+if (rsrc_desc.swizzleEn) {
+Addr idx_stride = 8 << rsrc_desc.idxStride;
+Addr elem_size = 2 << rsrc_desc.elemSize;
+Addr idx_msb = buf_idx / idx_stride;
+Addr idx_lsb = buf_idx % idx_stride;
+Addr off_msb = buf_off / elem_size;
+Addr off_lsb = buf_off % elem_size;
+DPRINTF(GCN3, "mubuf swizzled lane %d: "
+"idx_stride = %llx, elem_size = %llx, "
+"idx_msb = %llx, idx_lsb = %llx, "
+"off_msb = %llx, off_lsb = %llx\n",
+lane, idx_stride, elem_size, idx_msb, idx_lsb,
+off_msb, off_lsb);
+
+buffer_offset =(idx_msb * stride + off_msb * elem_size)
+* idx_stride + idx_lsb * elem_size + off_lsb;
+} else {
+buffer_offset = buf_off + stride * buf_idx;
+}
+
 /**
 * Range check behavior causes out of range accesses to
@@ -665,7 +686,7 @@
 * basis.
 */
 if (rsrc_desc.stride == 0 || !rsrc_desc.swizzleEn) {
-if (buf_off + stride * buf_idx >=
+if (buffer_offset >=
 rsrc_desc.numRecords - s_offset.rawData()) {
 DPRINTF(GCN3, "mubuf out-of-bounds condition 1: "
 "lane = %d, buffer_offset = %llx, "
@@ -692,25 +713,7 @@
 }
 }
-if (rsrc_desc.swizzleEn) {
-Addr idx_stride = 8 << rsrc_desc.idxStride;
-Addr elem_size = 2 << rsrc_desc.elemSize;
-Addr idx_msb = buf_idx / idx_stride;
-Addr idx_lsb = buf_idx % idx_stride;
-Addr off_msb = buf_off / elem_size;
-Addr off_lsb = buf_off % elem_size;
-DPRINTF(GCN3, "mubuf swizzled lane %d: "
-"idx_stride = %llx, elem_size = %llx, "
-"idx_msb = %llx, idx_lsb = %llx, "
-"off_msb = %llx, off_lsb = %llx\n",
-lane, idx_stride, elem_size, idx_msb, idx_lsb,
-off_msb, off_lsb);
-
-vaddr += ((idx_msb * stride + off_msb * elem_size)
-* idx_stride + idx_lsb * elem_size + off_lsb);
-} else {
-vaddr += buf_off + stride * buf_idx;
-}
+vaddr += buffer_offset;
 DPRINTF(GCN3, "Calculating mubuf address for lane %d: "
 "vaddr = %llx, base_addr = %llx, "

--
To view, visit https://gem5-review.googlesource.com/c/public/gem5/+/51127
To unsubscribe, or for help writing mail filters, visit https://gem5-review.googlesource.com/settings

Gerrit-Project: public/gem5
Gerrit-Branch: develop
Gerrit-Change-Id: I5c687c09ee7f8e446618084b8545b74a84211d4d
Gerrit-Change-Number: 51127
Gerrit-PatchSet: 1
Gerrit-Owner: Kyle Roarty
Gerrit-MessageType: newchange
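The offset arithmetic in the MUBUF patch above can be checked by hand with a small Python model (not gem5 code). It reproduces the two formulas from the diff, swizzled and linear, and the "case 1" range check that now uses the shared `buffer_offset`. The function names and the plain-integer parameters standing in for the `rsrc_desc` fields are assumptions for illustration, and C++ unsigned wrap-around is not modelled.

```python
def mubuf_buffer_offset(buf_idx, buf_off, stride, swizzle_en,
                        idx_stride_field=0, elem_size_field=0):
    """Compute buffer_offset the way the patched code does."""
    if swizzle_en:
        idx_stride = 8 << idx_stride_field
        elem_size = 2 << elem_size_field
        idx_msb, idx_lsb = divmod(buf_idx, idx_stride)
        off_msb, off_lsb = divmod(buf_off, elem_size)
        return ((idx_msb * stride + off_msb * elem_size) * idx_stride
                + idx_lsb * elem_size + off_lsb)
    return buf_off + stride * buf_idx


def out_of_bounds_case1(buffer_offset, num_records, s_offset,
                        stride, swizzle_en):
    """Range check 1: only applies when stride is 0 or swizzling is off."""
    if stride == 0 or not swizzle_en:
        return buffer_offset >= num_records - s_offset
    return False
```

For example, a linear access with `buf_idx = 3`, `buf_off = 4`, `stride = 16` gives an offset of 52. A swizzled access with `buf_idx = 9`, `buf_off = 5`, `stride = 16`, an index stride of 8 and an element size of 4 gives (1 * 16 + 1 * 4) * 8 + 1 * 4 + 1 = 165.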
[gem5-dev] Change in gem5/gem5[develop]: arch-gcn3,arch-vega: Don't write exec in v_cmp_f_i32
Kyle Roarty has uploaded this change for review. ( https://gem5-review.googlesource.com/c/public/gem5/+/52445 ) Change subject: arch-gcn3,arch-vega: Don't write exec in v_cmp_f_i32 .. arch-gcn3,arch-vega: Don't write exec in v_cmp_f_i32 Per the GCN3 and VEGA ISAs, v_cmpx_* writes exec, while v_cmp_* doesn't. This removes the erroneous exec write in the VOP3 implementation of v_cmp_f_i32. Change-Id: I048e35917163c45b879f38d31a88f3f3d56c0baf --- M src/arch/amdgpu/vega/insts/instructions.cc M src/arch/amdgpu/gcn3/insts/instructions.cc 2 files changed, 14 insertions(+), 2 deletions(-) diff --git a/src/arch/amdgpu/gcn3/insts/instructions.cc b/src/arch/amdgpu/gcn3/insts/instructions.cc index 65d008b..bb15957 100644 --- a/src/arch/amdgpu/gcn3/insts/instructions.cc +++ b/src/arch/amdgpu/gcn3/insts/instructions.cc @@ -20601,7 +20601,6 @@ } } -wf->execMask() = sdst.rawData(); sdst.write(); } diff --git a/src/arch/amdgpu/vega/insts/instructions.cc b/src/arch/amdgpu/vega/insts/instructions.cc index 757bfa8..1e07f0b 100644 --- a/src/arch/amdgpu/vega/insts/instructions.cc +++ b/src/arch/amdgpu/vega/insts/instructions.cc @@ -22454,7 +22454,6 @@ } } -wf->execMask() = sdst.rawData(); sdst.write(); } // execute // --- Inst_VOP3__V_CMP_LT_I32 class methods --- -- To view, visit https://gem5-review.googlesource.com/c/public/gem5/+/52445 To unsubscribe, or for help writing mail filters, visit https://gem5-review.googlesource.com/settings Gerrit-Project: public/gem5 Gerrit-Branch: develop Gerrit-Change-Id: I048e35917163c45b879f38d31a88f3f3d56c0baf Gerrit-Change-Number: 52445 Gerrit-PatchSet: 1 Gerrit-Owner: Kyle Roarty Gerrit-MessageType: newchange ___ gem5-dev mailing list -- gem5-dev@gem5.org To unsubscribe send an email to gem5-dev-le...@gem5.org %(web_page_url)slistinfo%(cgiext)s/%(_internal_name)s
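The distinction the v_cmp_f_i32 patch above relies on, that v_cmpx_* writes exec while v_cmp_* does not, can be sketched in Python (not gem5 code). Masks are modelled as plain integers with one bit per lane. The function names and the choice to leave result bits of inactive lanes at zero are assumptions made for this example.

```python
def v_cmp(pred, src0, src1, exec_mask):
    """Compare per lane; return (sdst, exec). EXEC is left untouched."""
    sdst = 0
    for lane, (a, b) in enumerate(zip(src0, src1)):
        if (exec_mask >> lane) & 1 and pred(a, b):
            sdst |= 1 << lane
    return sdst, exec_mask


def v_cmpx(pred, src0, src1, exec_mask):
    """Same comparison, but EXEC is overwritten with the result."""
    sdst, _ = v_cmp(pred, src0, src1, exec_mask)
    return sdst, sdst


def always_false(a, b):
    """The 'F' comparison: never true."""
    return False
```

With `always_false`, `v_cmp` leaves the exec mask as it was, whereas the `v_cmpx` form clears it. Writing exec from the non-x form of an always-false compare would therefore disable every lane, which is the behaviour the patch removes.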
[gem5-dev] Change in gem5/gem5[develop]: arch-gcn3,arch-vega: Don't write exec in v_cmp_f_i32
Kyle Roarty has submitted this change. ( https://gem5-review.googlesource.com/c/public/gem5/+/52445 ) Change subject: arch-gcn3,arch-vega: Don't write exec in v_cmp_f_i32 .. arch-gcn3,arch-vega: Don't write exec in v_cmp_f_i32 Per the GCN3 and VEGA ISAs, v_cmpx_* writes exec, while v_cmp_* doesn't. This removes the erroneous exec write in the VOP3 implementation of v_cmp_f_i32. Change-Id: I048e35917163c45b879f38d31a88f3f3d56c0baf Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/52445 Maintainer: Matt Sinclair Reviewed-by: Matt Sinclair Reviewed-by: Matthew Poremba Tested-by: kokoro --- M src/arch/amdgpu/vega/insts/instructions.cc M src/arch/amdgpu/gcn3/insts/instructions.cc 2 files changed, 19 insertions(+), 2 deletions(-) Approvals: Matthew Poremba: Looks good to me, approved Matt Sinclair: Looks good to me, approved; Looks good to me, approved kokoro: Regressions pass diff --git a/src/arch/amdgpu/gcn3/insts/instructions.cc b/src/arch/amdgpu/gcn3/insts/instructions.cc index 65d008b..bb15957 100644 --- a/src/arch/amdgpu/gcn3/insts/instructions.cc +++ b/src/arch/amdgpu/gcn3/insts/instructions.cc @@ -20601,7 +20601,6 @@ } } -wf->execMask() = sdst.rawData(); sdst.write(); } diff --git a/src/arch/amdgpu/vega/insts/instructions.cc b/src/arch/amdgpu/vega/insts/instructions.cc index 757bfa8..1e07f0b 100644 --- a/src/arch/amdgpu/vega/insts/instructions.cc +++ b/src/arch/amdgpu/vega/insts/instructions.cc @@ -22454,7 +22454,6 @@ } } -wf->execMask() = sdst.rawData(); sdst.write(); } // execute // --- Inst_VOP3__V_CMP_LT_I32 class methods --- -- To view, visit https://gem5-review.googlesource.com/c/public/gem5/+/52445 To unsubscribe, or for help writing mail filters, visit https://gem5-review.googlesource.com/settings Gerrit-Project: public/gem5 Gerrit-Branch: develop Gerrit-Change-Id: I048e35917163c45b879f38d31a88f3f3d56c0baf Gerrit-Change-Number: 52445 Gerrit-PatchSet: 2 Gerrit-Owner: Kyle Roarty Gerrit-Reviewer: Kyle Roarty Gerrit-Reviewer: Matt Sinclair 
Gerrit-Reviewer: Matthew Poremba Gerrit-Reviewer: kokoro Gerrit-MessageType: merged
[gem5-dev] Change in gem5/gem5[develop]: configs: set hsaTopology properties from options
Kyle Roarty has uploaded this change for review. ( https://gem5-review.googlesource.com/c/public/gem5/+/31995 ) Change subject: configs: set hsaTopology properties from options .. configs: set hsaTopology properties from options Change-Id: I17bb812491708f4221c39b738c906f1ad944614d --- M configs/example/hsaTopology.py 1 file changed, 25 insertions(+), 24 deletions(-) diff --git a/configs/example/hsaTopology.py b/configs/example/hsaTopology.py index df24223..7cb18ad 100644 --- a/configs/example/hsaTopology.py +++ b/configs/example/hsaTopology.py @@ -36,6 +36,7 @@ from os.path import join as joinpath from os.path import isdir from shutil import rmtree, copyfile +from m5.util.convert import toFrequency def file_append(path, contents): with open(joinpath(*path), 'a') as f: @@ -77,29 +78,29 @@ # populate global node properties # NOTE: SIMD count triggers a valid GPU agent creation # TODO: Really need to parse these from options -node_prop = 'cpu_cores_count %s\n' % options.num_cpus + \ -'simd_count 32\n' + \ -'mem_banks_count 0\n' + \ -'caches_count 0\n' + \ -'io_links_count 0\n'+ \ -'cpu_core_id_base 16\n' + \ -'simd_id_base 2147483648\n' + \ -'max_waves_per_simd 40\n' + \ -'lds_size_in_kb 64\n' + \ -'gds_size_in_kb 0\n'+ \ -'wave_front_size 64\n' + \ -'array_count 1\n' + \ -'simd_arrays_per_engine 1\n'+ \ -'cu_per_simd_array 10\n'+ \ -'simd_per_cu 4\n' + \ -'max_slots_scratch_cu 32\n' + \ -'vendor_id 4098\n' + \ -'device_id 39028\n' + \ -'location_id 8\n' + \ -'max_engine_clk_fcompute 800\n' + \ -'local_mem_size 0\n'+ \ -'fw_version 699\n' + \ -'capability 4738\n' + \ -'max_engine_clk_ccompute 2100\n' +node_prop = 'cpu_cores_count %s\n' % options.num_cpus + \ +'simd_count %s\n' % (options.num_compute_units * options.simds_per_cu) + \ +'mem_banks_count 0\n' + \ +'caches_count 0\n' + \ +'io_links_count 0\n'+ \ +'cpu_core_id_base 16\n' + \ +'simd_id_base 2147483648\n' + \ +'max_waves_per_simd %s\n' % options.wfs_per_simd+ \ +'lds_size_in_kb 64\n' + \ +'gds_size_in_kb 
0\n'+ \ +'wave_front_size %s\n' % options.wf_size+ \ +'array_count 1\n' + \ +'simd_arrays_per_engine %s\n' % options.sa_per_complex + \ +'cu_per_simd_array %s\n' % options.cu_per_sa+ \ +'simd_per_cu %s\n' % options.simds_per_cu + \ +'max_slots_scratch_cu 32\n' + \ +'vendor_id 4098\n' + \ +'device_id 39028\n' + \ +'location_id 8\n' + \ +'max_engine_clk_fcompute %s\n' % int(toFrequency(options.gpu_clock) / 1e6) + \ +'local_mem_size 0\n'+ \ +'fw_version 699\n' + \ +'capability 4738\n' + \ +'max_engine_clk_ccompute %s\n' % int(toFrequency(options.CPUClock) / 1e6) file_append((node_dir, 'properties'),
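A minimal, self-contained sketch of the idea in the hsaTopology patch above: the option-derived fields are computed from an options object instead of being hard-coded. This is not the gem5 script. `to_hz` is a hypothetical stand-in for `m5.util.convert.toFrequency`, `node_properties` and `example_options` are names invented here, and only the fields that the diff derives from options are emitted.

```python
import re
from types import SimpleNamespace

_SCALE = {"": 1.0, "k": 1e3, "M": 1e6, "G": 1e9}


def to_hz(text):
    """Hypothetical stand-in for m5.util.convert.toFrequency."""
    match = re.fullmatch(r"([0-9]*\.?[0-9]+)\s*([kMG]?)Hz", text)
    if match is None:
        raise ValueError("unrecognised frequency: %r" % text)
    return float(match.group(1)) * _SCALE[match.group(2)]


def node_properties(options):
    """Build the option-derived 'name value' lines of the node properties."""
    fields = [
        ("cpu_cores_count", options.num_cpus),
        ("simd_count", options.num_compute_units * options.simds_per_cu),
        ("max_waves_per_simd", options.wfs_per_simd),
        ("wave_front_size", options.wf_size),
        ("simd_arrays_per_engine", options.sa_per_complex),
        ("cu_per_simd_array", options.cu_per_sa),
        ("simd_per_cu", options.simds_per_cu),
        # Clocks are reported in MHz, as in the patch.
        ("max_engine_clk_fcompute", int(to_hz(options.gpu_clock) / 1e6)),
        ("max_engine_clk_ccompute", int(to_hz(options.CPUClock) / 1e6)),
    ]
    return "".join("%s %s\n" % field for field in fields)


# Example values chosen for illustration only.
example_options = SimpleNamespace(
    num_cpus=1, num_compute_units=8, simds_per_cu=4, wfs_per_simd=10,
    wf_size=64, sa_per_complex=1, cu_per_sa=8,
    gpu_clock="800MHz", CPUClock="2GHz")
```

Calling `node_properties(example_options)` yields lines such as `simd_count 32` and `max_engine_clk_fcompute 800`.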
[gem5-dev] Change in gem5/gem5[develop]: mem-ruby: resolve race between data and DMA in MOESI_AMD_Base-dir
Kyle Roarty has uploaded this change for review. ( https://gem5-review.googlesource.com/c/public/gem5/+/31996 )

Change subject: mem-ruby: resolve race between data and DMA in MOESI_AMD_Base-dir
..

mem-ruby: resolve race between data and DMA in MOESI_AMD_Base-dir

There seems to be a race condition while running several benchmarks, where the DMA engine and the CorePair simultaneously send requests for the same block. This patch fixes two scenarios:

(a) If the request from the DMA engine arrives before the one from the CorePair, the directory controller records the CorePair's request as a pending request. However, once the DMA request is serviced, the directory doesn't check for pending requests. The CorePair, consequently, never sees a response to its request, and this results in a deadlock.

Added a call to wakeUpDependents in the transition from BDR_Pm to U.
Added a call to wakeUpDependents in the transition from BDW_P to U.

(b) If the request from the CorePair is being serviced by the directory and the DMA engine requests the same block, this causes an invalid transition because the current coherence protocol doesn't handle this scenario.

Added a transition state where the requests from DMA are added to the stall buffer.
Updated B to U CoreUnblock transition to check all buffers, as the DMA requests were being placed later in the stall buffer than was being checked Change-Id: I5a76efef97723bc53cf239ea7e112f84fc874ef8 --- M src/mem/ruby/protocol/MOESI_AMD_Base-dir.sm M src/mem/ruby/slicc_interface/AbstractController.cc 2 files changed, 22 insertions(+), 3 deletions(-) diff --git a/src/mem/ruby/protocol/MOESI_AMD_Base-dir.sm b/src/mem/ruby/protocol/MOESI_AMD_Base-dir.sm index c8dafd5..f1bc637 100644 --- a/src/mem/ruby/protocol/MOESI_AMD_Base-dir.sm +++ b/src/mem/ruby/protocol/MOESI_AMD_Base-dir.sm @@ -180,6 +180,7 @@ void set_tbe(TBE a); void unset_tbe(); void wakeUpAllBuffers(); + void wakeUpAllBuffers(Addr a); void wakeUpBuffers(Addr a); Cycles curCycle(); @@ -1069,6 +1070,10 @@ stall_and_wait(requestNetwork_in, address); } + action(sd_stallAndWaitRequest, "sd", desc="Stall and wait on the address") { +stall_and_wait(dmaRequestQueue_in, address); + } + action(wa_wakeUpDependents, "wa", desc="Wake up any requests waiting for this address") { wakeUpBuffers(address); } @@ -1077,6 +1082,10 @@ wakeUpAllBuffers(); } + action(wa_wakeUpAllDependentsAddr, "waaa", desc="Wake up any requests waiting for this address") { +wakeUpAllBuffers(address); + } + action(z_stall, "z", desc="...") { } @@ -1090,6 +1099,11 @@ st_stallAndWaitRequest; } + // The exit state is always going to be U, so wakeUpDependents logic should be covered in all the + // transitions which are flowing into U. 
+ transition({BL, BS_M, BM_M, B_M, BP, BDW_P, BS_PM, BM_PM, B_PM, BS_Pm, BM_Pm, B_Pm, B}, {DmaRead,DmaWrite}){ +sd_stallAndWaitRequest; + } // transitions from U transition(U, DmaRead, BDR_PM) {L3TagArrayRead} { @@ -1193,7 +1207,7 @@ } transition({B}, CoreUnblock, U) { -wa_wakeUpDependents; +wa_wakeUpAllDependentsAddr; pu_popUnblockQueue; } @@ -1323,12 +1337,18 @@ } transition(BDW_P, ProbeAcksComplete, U) { +// Check for pending requests from the core we put to sleep while waiting +// for a response +wa_wakeUpAllDependentsAddr; dt_deallocateTBE; pt_popTriggerQueue; } transition(BDR_Pm, ProbeAcksComplete, U) { dd_sendResponseDmaData; +// Check for pending requests from the core we put to sleep while waiting +// for a response +wa_wakeUpDependents; dt_deallocateTBE; pt_popTriggerQueue; } diff --git a/src/mem/ruby/slicc_interface/AbstractController.cc b/src/mem/ruby/slicc_interface/AbstractController.cc index 9da8727..d2b3370 100644 --- a/src/mem/ruby/slicc_interface/AbstractController.cc +++ b/src/mem/ruby/slicc_interface/AbstractController.cc @@ -149,8 +149,7 @@ { if (m_waiting_buffers.count(addr) > 0) { // -// Wake up all possible lower rank (i.e. lower priority) buffers that could -// be waiting on this message. +// Wake up all possible buffers that could be waiting on this message. // for (int in_port_rank = m_in_ports - 1; in_port_rank >= 0; -- To view, visit https://gem5-review.googlesource.com/c/public/gem5/+/31996 To unsubscribe, or for help writing mail filters, visit https://gem5-review.googlesource.com/settings Gerrit-Project: public/gem5 Gerrit-Branch: develop Gerrit-Change-Id: I5a76efef97723bc53cf239ea7e112f84fc874ef8 Gerrit-Change-Number: 31996 Gerrit-PatchSet: 1 Gerrit-Owner: Kyle Roarty Gerrit-MessageType: newchange ___ gem5-dev mailing list -- gem5-dev@gem5.org To unsubscribe send an email to gem5-dev-le...@gem5.org %(web_page_url)slistinfo%(cgiext)s/%(
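The difference between the two wake-up actions used in this change can be illustrated with a toy model. This is only a sketch under assumptions: it supposes, as the commit message suggests, that the address-specific wake-up examines only input ports ranked below the one currently being serviced, while the "all buffers" variant scans every port. The class, the port ranks, and the message names are invented for illustration and are not gem5 code.

```python
from collections import defaultdict


class ToyController:
    """Simplified model of per-port stall buffers keyed by address."""

    def __init__(self, num_in_ports):
        self.num_in_ports = num_in_ports
        # waiting[port_rank][addr] -> list of stalled messages
        self.waiting = [defaultdict(list) for _ in range(num_in_ports)]
        self.woken = []

    def stall_and_wait(self, port_rank, addr, msg):
        # Park a message until something wakes up this address.
        self.waiting[port_rank][addr].append(msg)

    def wake_up_buffers(self, cur_port_rank, addr):
        # Assumed behaviour: only ports ranked below the current one.
        for rank in range(cur_port_rank - 1, -1, -1):
            self.woken.extend(self.waiting[rank].pop(addr, []))

    def wake_up_all_buffers(self, addr):
        # Scan every port, highest rank first.
        for rank in range(self.num_in_ports - 1, -1, -1):
            self.woken.extend(self.waiting[rank].pop(addr, []))


# Made-up ranks: the DMA queue sits at rank 2, the unblock queue at rank 1.
ctrl = ToyController(num_in_ports=3)
ctrl.stall_and_wait(port_rank=2, addr=0x40, msg="DmaRead")

ctrl.wake_up_buffers(cur_port_rank=1, addr=0x40)
woken_by_partial_scan = list(ctrl.woken)

ctrl.wake_up_all_buffers(addr=0x40)
woken_by_full_scan = list(ctrl.woken)
```

In this model the partial scan never reaches the stalled DMA request, whereas the full scan does, which mirrors the motivation given for switching the B to U CoreUnblock transition to the all-buffers action.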
[gem5-dev] Change in gem5/gem5[develop]: arch-gcn3: Free registers when execMask = 0
Kyle Roarty has uploaded this change for review. ( https://gem5-review.googlesource.com/c/public/gem5/+/32234 ) Change subject: arch-gcn3: Free registers when execMask = 0 .. arch-gcn3: Free registers when execMask = 0 Flat instructions free some of their registers through a call to scheduleWriteOperandsFromLoad(), which is executed in GlobalMemPipeline::exec. When execMask is 0, the instruction returns without issuing a memory request. This patch adds in a call to scheduleWriteOperandsFromLoad() when execMask is 0 for Flat instructions that are either Loads or AtomicReturns, as those are the instructions that call scheduleWriteOperandsFromLoad() in the memory pipeline. This patch also adds in a missing return statement when execMask is 0 in one of the Flat instructions. Change-Id: I09296adb7401e7515d3cedceb780a5df4598b109 --- M src/arch/gcn3/insts/instructions.cc 1 file changed, 74 insertions(+), 0 deletions(-) diff --git a/src/arch/gcn3/insts/instructions.cc b/src/arch/gcn3/insts/instructions.cc index 6e81e2c..cc8a9fb 100644 --- a/src/arch/gcn3/insts/instructions.cc +++ b/src/arch/gcn3/insts/instructions.cc @@ -39406,6 +39406,9 @@ wf->decLGKMInstsIssued(); wf->rdGmReqsInPipe--; wf->rdLmReqsInPipe--; +gpuDynInst->exec_mask = wf->execMask(); +wf->computeUnit->vfg[wf->simdId]-> +scheduleWriteOperandsFromLoad(wf, gpuDynInst); return; } @@ -39504,6 +39507,9 @@ wf->decLGKMInstsIssued(); wf->rdGmReqsInPipe--; wf->rdLmReqsInPipe--; +gpuDynInst->exec_mask = wf->execMask(); +wf->computeUnit->vfg[wf->simdId]-> +scheduleWriteOperandsFromLoad(wf, gpuDynInst); return; } @@ -39602,6 +39608,9 @@ wf->decLGKMInstsIssued(); wf->rdGmReqsInPipe--; wf->rdLmReqsInPipe--; +gpuDynInst->exec_mask = wf->execMask(); +wf->computeUnit->vfg[wf->simdId]-> +scheduleWriteOperandsFromLoad(wf, gpuDynInst); return; } @@ -39672,6 +39681,9 @@ wf->decLGKMInstsIssued(); wf->rdGmReqsInPipe--; wf->rdLmReqsInPipe--; +gpuDynInst->exec_mask = wf->execMask(); +wf->computeUnit->vfg[wf->simdId]-> 
+scheduleWriteOperandsFromLoad(wf, gpuDynInst); return; } @@ -39742,6 +39754,9 @@ wf->decLGKMInstsIssued(); wf->rdGmReqsInPipe--; wf->rdLmReqsInPipe--; +gpuDynInst->exec_mask = wf->execMask(); +wf->computeUnit->vfg[wf->simdId]-> +scheduleWriteOperandsFromLoad(wf, gpuDynInst); return; } @@ -39821,6 +39836,10 @@ wf->decLGKMInstsIssued(); wf->rdGmReqsInPipe--; wf->rdLmReqsInPipe--; +gpuDynInst->exec_mask = wf->execMask(); +wf->computeUnit->vfg[wf->simdId]-> +scheduleWriteOperandsFromLoad(wf, gpuDynInst); +return; } gpuDynInst->execUnitId = wf->execUnitId; @@ -40355,6 +40374,11 @@ wf->decLGKMInstsIssued(); wf->wrGmReqsInPipe--; wf->rdGmReqsInPipe--; +if (instData.GLC) { +gpuDynInst->exec_mask = wf->execMask(); +wf->computeUnit->vfg[wf->simdId]-> +scheduleWriteOperandsFromLoad(wf, gpuDynInst); +} return; } @@ -40457,6 +40481,11 @@ wf->decLGKMInstsIssued(); wf->wrGmReqsInPipe--; wf->rdGmReqsInPipe--; +if (instData.GLC) { +gpuDynInst->exec_mask = wf->execMask(); +wf->computeUnit->vfg[wf->simdId]-> +scheduleWriteOperandsFromLoad(wf, gpuDynInst); +} return; } @@ -40560,6 +40589,11 @@ wf->decLGKMInstsIssued(); wf->wrGmReqsInPipe--; wf->rdGmReqsInPipe--; +if (instData.GLC) { +gpuDynInst->exec_mask = wf->execMask(); +wf->computeUnit->vfg[wf->simdId]-> +scheduleWriteOperandsFromLoad(wf, gpuDynInst); +} return; } @@ -40650,6 +40684,11 @@ wf->decLGKMInstsIssued(); wf->wrGmReqsInPipe--; wf->rdGmReqsInPipe--; +if (instData.GLC) { +gpuDynInst->exec_mask = wf->execMask(); +wf->computeUnit->vfg[wf->simdId]-> +scheduleWriteOperandsFromLoad(wf, gpuDynInst); +} return; } @@ -40914,6 +40953,11 @@ wf->decLGKMInstsIssued(); wf->wrGmReqsInPipe--; wf->rdGmReqsInPipe--; +if (instData.GLC) { +
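The register leak described in this commit message can be pictured with a small toy model. It is a sketch under assumptions, not the gem5 implementation: `ToyRegisterFile` and `execute_flat_load` are invented names, destination registers are assumed to be reserved before execution, and the stand-in for `scheduleWriteOperandsFromLoad()` simply frees them immediately.

```python
class ToyRegisterFile:
    """Tracks which destination registers are still reserved."""

    def __init__(self):
        self.busy = set()

    def reserve(self, regs):
        self.busy.update(regs)

    def schedule_write_operands_from_load(self, regs):
        # Stand-in for the real call: here it just frees the registers.
        self.busy.difference_update(regs)


def execute_flat_load(vrf, dst_regs, exec_mask, release_on_empty_mask):
    vrf.reserve(dst_regs)
    if exec_mask == 0:
        # No lanes are active, so no memory request is issued.
        if release_on_empty_mask:
            vrf.schedule_write_operands_from_load(dst_regs)
        return "skipped"
    # Normally the memory pipeline releases the registers on completion.
    vrf.schedule_write_operands_from_load(dst_regs)
    return "issued"


leaky = ToyRegisterFile()
execute_flat_load(leaky, {4}, exec_mask=0, release_on_empty_mask=False)

patched = ToyRegisterFile()
execute_flat_load(patched, {4}, exec_mask=0, release_on_empty_mask=True)
```

Without the explicit release on the empty-mask path, the toy register stays reserved indefinitely; with it, the register is freed even though nothing was sent to memory.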
[gem5-dev] Change in gem5/gem5[develop]: util: Install python six module
Kyle Roarty has uploaded this change for review. ( https://gem5-review.googlesource.com/c/public/gem5/+/32235 ) Change subject: util: Install python six module .. util: Install python six module six is used in develop, but wasn't used in the GCN staging branch. Change-Id: Ic1ca42df871d1e683c288282497267d00421609f --- M util/dockerfiles/gcn-gpu/Dockerfile 1 file changed, 3 insertions(+), 0 deletions(-) diff --git a/util/dockerfiles/gcn-gpu/Dockerfile b/util/dockerfiles/gcn-gpu/Dockerfile index 475918f..1033160 100644 --- a/util/dockerfiles/gcn-gpu/Dockerfile +++ b/util/dockerfiles/gcn-gpu/Dockerfile @@ -23,6 +23,7 @@ python-dev \ python \ python-yaml \ +python-pip \ wget \ libpci3 \ libelf1 \ @@ -34,6 +35,8 @@ libboost-system-dev \ libboost-dev +RUN pip install six + ARG gem5_dist=http://dist.gem5.org/dist/develop # Install ROCm 1.6 binaries -- To view, visit https://gem5-review.googlesource.com/c/public/gem5/+/32235 To unsubscribe, or for help writing mail filters, visit https://gem5-review.googlesource.com/settings Gerrit-Project: public/gem5 Gerrit-Branch: develop Gerrit-Change-Id: Ic1ca42df871d1e683c288282497267d00421609f Gerrit-Change-Number: 32235 Gerrit-PatchSet: 1 Gerrit-Owner: Kyle Roarty Gerrit-MessageType: newchange
[gem5-dev] Change in gem5/gem5[develop]: arch-gcn3: make read2st64_b32 write proper registers
Kyle Roarty has uploaded this change for review. ( https://gem5-review.googlesource.com/c/public/gem5/+/32236 ) Change subject: arch-gcn3: make read2st64_b32 write proper registers .. arch-gcn3: make read2st64_b32 write proper registers Per the GCN3 ISA, read2st64_b32 writes to consecutive registers Change-Id: Ibc1672584a72cf7de12e06068a03fe304b34dce2 --- M src/arch/gcn3/insts/instructions.cc 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/arch/gcn3/insts/instructions.cc b/src/arch/gcn3/insts/instructions.cc index 6e81e2c..955d801 100644 --- a/src/arch/gcn3/insts/instructions.cc +++ b/src/arch/gcn3/insts/instructions.cc @@ -32206,7 +32206,7 @@ Inst_DS__DS_READ2ST64_B32::completeAcc(GPUDynInstPtr gpuDynInst) { VecOperandU32 vdst0(gpuDynInst, extData.VDST); -VecOperandU32 vdst1(gpuDynInst, extData.VDST + 2); +VecOperandU32 vdst1(gpuDynInst, extData.VDST + 1); for (int lane = 0; lane < NumVecElemPerVecReg; ++lane) { if (gpuDynInst->exec_mask[lane]) { -- To view, visit https://gem5-review.googlesource.com/c/public/gem5/+/32236 To unsubscribe, or for help writing mail filters, visit https://gem5-review.googlesource.com/settings Gerrit-Project: public/gem5 Gerrit-Branch: develop Gerrit-Change-Id: Ibc1672584a72cf7de12e06068a03fe304b34dce2 Gerrit-Change-Number: 32236 Gerrit-PatchSet: 1 Gerrit-Owner: Kyle Roarty Gerrit-MessageType: newchange
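The register numbering behind this one-line fix can be shown with a short sketch. It assumes only what the commit message states, namely that the two 32-bit results land in consecutive destination registers; the helper names and the dictionary used as a register file are invented for illustration.

```python
def read2_b32_destinations(vdst):
    """Registers written by a two-dword read: VDST and the one after it."""
    return (vdst, vdst + 1)


def complete_read2(vgprs, vdst, dword0, dword1):
    # Write each returned dword into its destination register.
    dst0, dst1 = read2_b32_destinations(vdst)
    vgprs[dst0] = dword0
    vgprs[dst1] = dword1
    return vgprs


# Made-up register index and data values.
vgprs = complete_read2({}, vdst=10, dword0=0xAAAA, dword1=0xBBBB)
```

Using `vdst + 2` for the second destination, as the code did before the patch, would have left register 11 untouched and overwritten register 12 instead.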
[gem5-dev] Change in gem5/gem5[develop]: configs: Remove remnants of /dev/shm mapping from apu_se
Kyle Roarty has uploaded this change for review. ( https://gem5-review.googlesource.com/c/public/gem5/+/32354 ) Change subject: configs: Remove remnants of /dev/shm mapping from apu_se .. configs: Remove remnants of /dev/shm mapping from apu_se Change-Id: Iec2598c715223d079bc5dfd2ea52859945706cfc --- M configs/example/apu_se.py 1 file changed, 0 insertions(+), 5 deletions(-) diff --git a/configs/example/apu_se.py b/configs/example/apu_se.py index 82e4022..3c532c4 100644 --- a/configs/example/apu_se.py +++ b/configs/example/apu_se.py @@ -623,9 +623,6 @@ dests = ["%s/fs/sys" % m5.options.outdir]), RedirectPath(src = "/tmp", dests = ["%s/fs/tmp" % m5.options.outdir]), - RedirectPath(src = "/dev/shm", - dests = ["/dev/shm/%s/gem5_%s" % - (getpass.getuser(), os.getpid())])] system.redirect_paths = redirect_paths @@ -681,6 +678,4 @@ print("Ticks:", m5.curTick()) print('Exiting because ', exit_event.getCause()) -FileSystemConfig.cleanup_filesystem(options) - sys.exit(exit_event.getCode()) -- To view, visit https://gem5-review.googlesource.com/c/public/gem5/+/32354 To unsubscribe, or for help writing mail filters, visit https://gem5-review.googlesource.com/settings Gerrit-Project: public/gem5 Gerrit-Branch: develop Gerrit-Change-Id: Iec2598c715223d079bc5dfd2ea52859945706cfc Gerrit-Change-Number: 32354 Gerrit-PatchSet: 1 Gerrit-Owner: Kyle Roarty Gerrit-MessageType: newchange
[gem5-dev] Change in gem5/gem5[develop]: util: Add dependencies to gcn Dockerfile required to build
Kyle Roarty has uploaded this change for review. ( https://gem5-review.googlesource.com/c/public/gem5/+/32474 ) Change subject: util: Add dependencies to gcn Dockerfile required to build .. util: Add dependencies to gcn Dockerfile required to build src/base/fiber.cc requires valgrind; src/base/pngwriter.cc requires libpng-dev. Change-Id: I7f009cd8f5cacd64150c06b716b1ce3008832910 --- M util/dockerfiles/gcn-gpu/Dockerfile 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/util/dockerfiles/gcn-gpu/Dockerfile b/util/dockerfiles/gcn-gpu/Dockerfile index 1033160..5001b15 100644 --- a/util/dockerfiles/gcn-gpu/Dockerfile +++ b/util/dockerfiles/gcn-gpu/Dockerfile @@ -23,7 +23,7 @@ python-dev \ python \ python-yaml \ -python-pip \ +python-six \ wget \ libpci3 \ libelf1 \ @@ -33,9 +33,9 @@ libssl-dev \ libboost-filesystem-dev \ libboost-system-dev \ -libboost-dev - -RUN pip install six +libboost-dev \ +libpng12-dev \ +valgrind ARG gem5_dist=http://dist.gem5.org/dist/develop -- To view, visit https://gem5-review.googlesource.com/c/public/gem5/+/32474 To unsubscribe, or for help writing mail filters, visit https://gem5-review.googlesource.com/settings Gerrit-Project: public/gem5 Gerrit-Branch: develop Gerrit-Change-Id: I7f009cd8f5cacd64150c06b716b1ce3008832910 Gerrit-Change-Number: 32474 Gerrit-PatchSet: 1 Gerrit-Owner: Kyle Roarty Gerrit-MessageType: newchange
[gem5-dev] Change in gem5/gem5[develop]: util: Install python six module in gcn dockerfile
Kyle Roarty has submitted this change. ( https://gem5-review.googlesource.com/c/public/gem5/+/32235 ) Change subject: util: Install python six module in gcn dockerfile .. util: Install python six module in gcn dockerfile six is used in develop, but wasn't used in the GCN staging branch. Change-Id: Ic1ca42df871d1e683c288282497267d00421609f Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/32235 Reviewed-by: Matt Sinclair Reviewed-by: Jason Lowe-Power Maintainer: Jason Lowe-Power Tested-by: kokoro --- M util/dockerfiles/gcn-gpu/Dockerfile 1 file changed, 1 insertion(+), 0 deletions(-) Approvals: Jason Lowe-Power: Looks good to me, approved; Looks good to me, approved Matt Sinclair: Looks good to me, approved kokoro: Regressions pass diff --git a/util/dockerfiles/gcn-gpu/Dockerfile b/util/dockerfiles/gcn-gpu/Dockerfile index 475918f..7787339 100644 --- a/util/dockerfiles/gcn-gpu/Dockerfile +++ b/util/dockerfiles/gcn-gpu/Dockerfile @@ -23,6 +23,7 @@ python-dev \ python \ python-yaml \ +python-six \ wget \ libpci3 \ libelf1 \ -- To view, visit https://gem5-review.googlesource.com/c/public/gem5/+/32235 To unsubscribe, or for help writing mail filters, visit https://gem5-review.googlesource.com/settings Gerrit-Project: public/gem5 Gerrit-Branch: develop Gerrit-Change-Id: Ic1ca42df871d1e683c288282497267d00421609f Gerrit-Change-Number: 32235 Gerrit-PatchSet: 4 Gerrit-Owner: Kyle Roarty Gerrit-Reviewer: Bobby R. Bruce Gerrit-Reviewer: Jason Lowe-Power Gerrit-Reviewer: Kyle Roarty Gerrit-Reviewer: Matt Sinclair Gerrit-Reviewer: kokoro Gerrit-MessageType: merged
[gem5-dev] Change in gem5/gem5[develop]: arch-gcn3: Update LmReqsInPipe in atomic flats when execMask=0
Kyle Roarty has uploaded this change for review. ( https://gem5-review.googlesource.com/c/public/gem5/+/32614 ) Change subject: arch-gcn3: Update LmReqsInPipe in atomic flats when execMask=0 .. arch-gcn3: Update LmReqsInPipe in atomic flats when execMask=0 In flat instructions, wrLmReqsInPipe/rdLmReqsInPipe are decremented in the calcAddr() function. However, the calcAddr() function is only called when execMask != 0. This patch adds in statements to decrement wrLmReqsInPipe and rdLmReqsInPipe in all implemented atomic flats when execMask is 0. This fixes a scenario where vector local memory and flat instructions are unable to execute due to LocalMemPipeline::isLMReqFIFOWrRdy always returning false in ScheduleStage::dispatchReady after too many atomic flats execute with execMask = 0 Change-Id: I081cfd3faf74bbfcf0728445e7160fa2a76a6a7e --- M src/arch/gcn3/insts/instructions.cc 1 file changed, 22 insertions(+), 0 deletions(-) diff --git a/src/arch/gcn3/insts/instructions.cc b/src/arch/gcn3/insts/instructions.cc index 002621e..fdea636 100644 --- a/src/arch/gcn3/insts/instructions.cc +++ b/src/arch/gcn3/insts/instructions.cc @@ -40374,6 +40374,8 @@ wf->decLGKMInstsIssued(); wf->wrGmReqsInPipe--; wf->rdGmReqsInPipe--; +wf->wrLmReqsInPipe--; +wf->rdLmReqsInPipe--; if (instData.GLC) { gpuDynInst->exec_mask = wf->execMask(); wf->computeUnit->vrf[wf->simdId]-> @@ -40481,6 +40483,8 @@ wf->decLGKMInstsIssued(); wf->wrGmReqsInPipe--; wf->rdGmReqsInPipe--; +wf->wrLmReqsInPipe--; +wf->rdLmReqsInPipe--; if (instData.GLC) { gpuDynInst->exec_mask = wf->execMask(); wf->computeUnit->vrf[wf->simdId]-> @@ -40589,6 +40593,8 @@ wf->decLGKMInstsIssued(); wf->wrGmReqsInPipe--; wf->rdGmReqsInPipe--; +wf->wrLmReqsInPipe--; +wf->rdLmReqsInPipe--; if (instData.GLC) { gpuDynInst->exec_mask = wf->execMask(); wf->computeUnit->vrf[wf->simdId]-> @@ -40684,6 +40690,8 @@ wf->decLGKMInstsIssued(); wf->wrGmReqsInPipe--; wf->rdGmReqsInPipe--; +wf->wrLmReqsInPipe--; +wf->rdLmReqsInPipe--; if 
(instData.GLC) { gpuDynInst->exec_mask = wf->execMask(); wf->computeUnit->vrf[wf->simdId]-> @@ -40953,6 +40961,8 @@ wf->decLGKMInstsIssued(); wf->wrGmReqsInPipe--; wf->rdGmReqsInPipe--; +wf->wrLmReqsInPipe--; +wf->rdLmReqsInPipe--; if (instData.GLC) { gpuDynInst->exec_mask = wf->execMask(); wf->computeUnit->vrf[wf->simdId]-> @@ -41048,6 +41058,8 @@ wf->decLGKMInstsIssued(); wf->wrGmReqsInPipe--; wf->rdGmReqsInPipe--; +wf->wrLmReqsInPipe--; +wf->rdLmReqsInPipe--; if (instData.GLC) { gpuDynInst->exec_mask = wf->execMask(); wf->computeUnit->vrf[wf->simdId]-> @@ -41172,6 +41184,8 @@ wf->decLGKMInstsIssued(); wf->wrGmReqsInPipe--; wf->rdGmReqsInPipe--; +wf->wrLmReqsInPipe--; +wf->rdLmReqsInPipe--; if (instData.GLC) { gpuDynInst->exec_mask = wf->execMask(); wf->computeUnit->vrf[wf->simdId]-> @@ -41281,6 +41295,8 @@ wf->decLGKMInstsIssued(); wf->wrGmReqsInPipe--; wf->rdGmReqsInPipe--; +wf->wrLmReqsInPipe--; +wf->rdLmReqsInPipe--; if (instData.GLC) { gpuDynInst->exec_mask = wf->execMask(); wf->computeUnit->vrf[wf->simdId]-> @@ -41378,6 +41394,8 @@ wf->decLGKMInstsIssued(); wf->wrGmReqsInPipe--; wf->rdGmReqsInPipe--; +wf->wrLmReqsInPipe--; +wf->rdLmReqsInPipe--; if (instData.GLC) { gpuDynInst->exec_mask = wf->execMask(); wf->computeUnit->vrf[wf->simdId]-> @@ -41657,6 +41675,8 @@ wf->decLGKMInstsIssued(); wf->wrGmReqsInPipe--; wf->rdGmReqsInPipe--; +wf->wrLmReqsInPipe--; +wf->rdLmReqsInPipe--; if (instData.GLC) { gpuDynInst->exec_mask = wf->execMask(); wf->computeUnit->vrf[wf->simdId]-> @@ -41755,6 +41775,8 @@ wf->decLGKMInstsIssued(); wf->wrGmReqsInPipe--; wf->rdGmReqsInPipe--; +wf->wrLmReqsInPipe--; +wf->rdLmReqsInPipe--; if (instData.GLC) { gpuDynInst->exec_mask = wf->execMask(); wf->computeUni
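The counter leak fixed by this change can be sketched with a toy model. Everything here is an assumption made for illustration: the FIFO depth, the readiness check, and the idea that the counters are bumped before execution are simplifications, and none of the names below are gem5 identifiers.

```python
FIFO_DEPTH = 4  # made-up capacity for the local-memory request FIFO


class ToyWavefront:
    def __init__(self):
        self.rd_lm_reqs_in_pipe = 0
        self.wr_lm_reqs_in_pipe = 0


def lm_req_fifo_ready(wf):
    # Simplified readiness test: pending requests must fit in the FIFO.
    return wf.rd_lm_reqs_in_pipe + wf.wr_lm_reqs_in_pipe < FIFO_DEPTH


def run_atomic_flat(wf, exec_mask, decrement_on_empty_mask):
    # The instruction is counted against local memory before it executes.
    wf.rd_lm_reqs_in_pipe += 1
    wf.wr_lm_reqs_in_pipe += 1
    if exec_mask == 0:
        # Early-out path: the address calculation step is never reached.
        if decrement_on_empty_mask:
            wf.rd_lm_reqs_in_pipe -= 1
            wf.wr_lm_reqs_in_pipe -= 1
        return
    # Stand-in for the bookkeeping done during address calculation.
    wf.rd_lm_reqs_in_pipe -= 1
    wf.wr_lm_reqs_in_pipe -= 1


leaky = ToyWavefront()
patched = ToyWavefront()
for _ in range(3):
    run_atomic_flat(leaky, exec_mask=0, decrement_on_empty_mask=False)
    run_atomic_flat(patched, exec_mask=0, decrement_on_empty_mask=True)
```

After a few masked-off atomics the leaky wavefront's counters exceed the toy FIFO depth and the readiness check stays false, which is the kind of stall the commit message describes.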
[gem5-dev] Change in gem5/gem5[develop]: arch-gcn3: make read2st64_b32 write proper registers
Kyle Roarty has submitted this change. ( https://gem5-review.googlesource.com/c/public/gem5/+/32236 ) Change subject: arch-gcn3: make read2st64_b32 write proper registers .. arch-gcn3: make read2st64_b32 write proper registers Per the GCN3 ISA, read2st64_b32 writes to consecutive registers Change-Id: Ibc1672584a72cf7de12e06068a03fe304b34dce2 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/32236 Reviewed-by: Matt Sinclair Reviewed-by: Alexandru Duțu Reviewed-by: Bradford Beckmann Reviewed-by: Anthony Gutierrez Maintainer: Anthony Gutierrez Tested-by: kokoro --- M src/arch/gcn3/insts/instructions.cc 1 file changed, 1 insertion(+), 1 deletion(-) Approvals: Bradford Beckmann: Looks good to me, approved Alexandru Duțu: Looks good to me, approved Anthony Gutierrez: Looks good to me, approved; Looks good to me, approved Matt Sinclair: Looks good to me, but someone else must approve kokoro: Regressions pass diff --git a/src/arch/gcn3/insts/instructions.cc b/src/arch/gcn3/insts/instructions.cc index 6e81e2c..955d801 100644 --- a/src/arch/gcn3/insts/instructions.cc +++ b/src/arch/gcn3/insts/instructions.cc @@ -32206,7 +32206,7 @@ Inst_DS__DS_READ2ST64_B32::completeAcc(GPUDynInstPtr gpuDynInst) { VecOperandU32 vdst0(gpuDynInst, extData.VDST); -VecOperandU32 vdst1(gpuDynInst, extData.VDST + 2); +VecOperandU32 vdst1(gpuDynInst, extData.VDST + 1); for (int lane = 0; lane < NumVecElemPerVecReg; ++lane) { if (gpuDynInst->exec_mask[lane]) { -- To view, visit https://gem5-review.googlesource.com/c/public/gem5/+/32236 To unsubscribe, or for help writing mail filters, visit https://gem5-review.googlesource.com/settings Gerrit-Project: public/gem5 Gerrit-Branch: develop Gerrit-Change-Id: Ibc1672584a72cf7de12e06068a03fe304b34dce2 Gerrit-Change-Number: 32236 Gerrit-PatchSet: 2 Gerrit-Owner: Kyle Roarty Gerrit-Reviewer: Alexandru Duțu Gerrit-Reviewer: Anthony Gutierrez Gerrit-Reviewer: Bradford Beckmann Gerrit-Reviewer: Kyle Roarty Gerrit-Reviewer: Matt Sinclair 
Gerrit-Reviewer: kokoro Gerrit-MessageType: merged
[gem5-dev] Change in gem5/gem5[develop]: mem-ruby: fix races between data and DMA in MOESI_AMD_Base-dir
Kyle Roarty has submitted this change. ( https://gem5-review.googlesource.com/c/public/gem5/+/31996 ) Change subject: mem-ruby: fix races between data and DMA in MOESI_AMD_Base-dir .. mem-ruby: fix races between data and DMA in MOESI_AMD_Base-dir There are race conditions while running several benchmarks, where the DMA engine and the CorePair simultaneously send requests for the same block. This patch fixes two scenarios: (a) If the request from the DMA engine arrives before the one from the CorePair, the directory controller records the CorePair request as pending. However, once the DMA request is serviced, the directory doesn't check for pending requests. Consequently, the CorePair never sees a response to its request, which results in a deadlock. Added a call to wakeUpDependents in the transition from BDR_Pm to U. Added a call to wakeUpDependents in the transition from BDW_P to U. (b) If a request from the CorePair is being serviced by the directory and the DMA engine requests the same block, this causes an invalid transition because the current coherence protocol doesn't handle this scenario. Added a transition in which the requests from DMA are added to the stall buffer. 
Updated B to U CoreUnblock transition to check all buffers, as the DMA requests were being placed later in the stall buffer than was being checked Change-Id: I5a76efef97723bc53cf239ea7e112f84fc874ef8 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/31996 Reviewed-by: Matt Sinclair Reviewed-by: Bradford Beckmann Maintainer: Bradford Beckmann Tested-by: kokoro --- M src/mem/ruby/protocol/MOESI_AMD_Base-dir.sm M src/mem/ruby/slicc_interface/AbstractController.cc 2 files changed, 22 insertions(+), 3 deletions(-) Approvals: Bradford Beckmann: Looks good to me, approved; Looks good to me, approved Matt Sinclair: Looks good to me, but someone else must approve kokoro: Regressions pass diff --git a/src/mem/ruby/protocol/MOESI_AMD_Base-dir.sm b/src/mem/ruby/protocol/MOESI_AMD_Base-dir.sm index c8dafd5..f1bc637 100644 --- a/src/mem/ruby/protocol/MOESI_AMD_Base-dir.sm +++ b/src/mem/ruby/protocol/MOESI_AMD_Base-dir.sm @@ -180,6 +180,7 @@ void set_tbe(TBE a); void unset_tbe(); void wakeUpAllBuffers(); + void wakeUpAllBuffers(Addr a); void wakeUpBuffers(Addr a); Cycles curCycle(); @@ -1069,6 +1070,10 @@ stall_and_wait(requestNetwork_in, address); } + action(sd_stallAndWaitRequest, "sd", desc="Stall and wait on the address") { +stall_and_wait(dmaRequestQueue_in, address); + } + action(wa_wakeUpDependents, "wa", desc="Wake up any requests waiting for this address") { wakeUpBuffers(address); } @@ -1077,6 +1082,10 @@ wakeUpAllBuffers(); } + action(wa_wakeUpAllDependentsAddr, "waaa", desc="Wake up any requests waiting for this address") { +wakeUpAllBuffers(address); + } + action(z_stall, "z", desc="...") { } @@ -1090,6 +1099,11 @@ st_stallAndWaitRequest; } + // The exit state is always going to be U, so wakeUpDependents logic should be covered in all the + // transitions which are flowing into U. 
+ transition({BL, BS_M, BM_M, B_M, BP, BDW_P, BS_PM, BM_PM, B_PM, BS_Pm, BM_Pm, B_Pm, B}, {DmaRead,DmaWrite}){ +sd_stallAndWaitRequest; + } // transitions from U transition(U, DmaRead, BDR_PM) {L3TagArrayRead} { @@ -1193,7 +1207,7 @@ } transition({B}, CoreUnblock, U) { -wa_wakeUpDependents; +wa_wakeUpAllDependentsAddr; pu_popUnblockQueue; } @@ -1323,12 +1337,18 @@ } transition(BDW_P, ProbeAcksComplete, U) { +// Check for pending requests from the core we put to sleep while waiting +// for a response +wa_wakeUpAllDependentsAddr; dt_deallocateTBE; pt_popTriggerQueue; } transition(BDR_Pm, ProbeAcksComplete, U) { dd_sendResponseDmaData; +// Check for pending requests from the core we put to sleep while waiting +// for a response +wa_wakeUpDependents; dt_deallocateTBE; pt_popTriggerQueue; } diff --git a/src/mem/ruby/slicc_interface/AbstractController.cc b/src/mem/ruby/slicc_interface/AbstractController.cc index 9da8727..d2b3370 100644 --- a/src/mem/ruby/slicc_interface/AbstractController.cc +++ b/src/mem/ruby/slicc_interface/AbstractController.cc @@ -149,8 +149,7 @@ { if (m_waiting_buffers.count(addr) > 0) { // -// Wake up all possible lower rank (i.e. lower priority) buffers that could -// be waiting on this message. +// Wake up all possible buffers that could be waiting on this message. // for (int in_port_rank = m_in_ports - 1; in_port_rank >= 0; -- To view, visit https://gem5-review.googlesource.com/c/public/gem5/+/31996 To unsubscribe, or for help writing mail filters, visit https://gem5-review.googlesource.com/settings Gerrit-Project: public/gem5 Gerrit-Branch: develop Gerrit-Cha
[gem5-dev] Change in gem5/gem5[develop]: configs: Remove unneeded variable assignments in apu_se
Kyle Roarty has uploaded this change for review. ( https://gem5-review.googlesource.com/c/public/gem5/+/32654 ) Change subject: configs: Remove unneeded variable assignments in apu_se .. configs: Remove unneeded variable assignments in apu_se This patch removes a line assigning a variable to itself and an assignment to a variable (chroot) that is never used. The latter assignment also caused an error, "'NoneType' object has no attribute 'startswith'" Change-Id: Ib93c25fee4a0f7c1440de8067b086d8b96614796 --- M configs/example/apu_se.py 1 file changed, 0 insertions(+), 3 deletions(-) diff --git a/configs/example/apu_se.py b/configs/example/apu_se.py index fcec7c1..25c148e 100644 --- a/configs/example/apu_se.py +++ b/configs/example/apu_se.py @@ -334,8 +334,6 @@ shader.CUs = compute_units ## Creating the CPU system -options.num_cpus = options.num_cpus - # The shader core will be whatever is after the CPU cores are accounted for shader_idx = options.num_cpus @@ -616,7 +614,6 @@ ## Start simulation -chroot = os.path.expanduser(options.chroot) redirect_paths = [RedirectPath(src = "/proc", dests = ["%s/fs/proc" % m5.options.outdir]), RedirectPath(src = "/sys", -- To view, visit https://gem5-review.googlesource.com/c/public/gem5/+/32654 To unsubscribe, or for help writing mail filters, visit https://gem5-review.googlesource.com/settings Gerrit-Project: public/gem5 Gerrit-Branch: develop Gerrit-Change-Id: Ib93c25fee4a0f7c1440de8067b086d8b96614796 Gerrit-Change-Number: 32654 Gerrit-PatchSet: 1 Gerrit-Owner: Kyle Roarty Gerrit-MessageType: newchange
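The error quoted in this commit message is consistent with `os.path.expanduser` being handed `None`, presumably because the `chroot` option was left unset. The patch simply deletes the unused assignment; for cases where an optional path really is needed, one possible pattern is sketched below. `expand_optional_path` is a hypothetical helper, not part of gem5.

```python
import os


def expand_optional_path(path):
    """Expand a leading '~' in path, passing None through unchanged."""
    if path is None:
        # Nothing was supplied, so there is nothing to expand.
        return None
    return os.path.expanduser(path)


unset_option = expand_optional_path(None)
absolute_option = expand_optional_path("/tmp/chroot")
```

Paths that do not start with `~` are returned unchanged by `os.path.expanduser`, so absolute paths pass through as they are.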
[gem5-dev] Change in gem5/gem5[develop]: configs: Use proper keywordargs for RedirectPath in apu_se
Kyle Roarty has uploaded this change for review. ( https://gem5-review.googlesource.com/c/public/gem5/+/32655 ) Change subject: configs: Use proper keywordargs for RedirectPath in apu_se .. configs: Use proper keywordargs for RedirectPath in apu_se RedirectPath uses app_path and host_paths instead of src and dests. This patch fixes that in apu_se. The patch also changes the formatting for those lines, as simply replacing dests with host_paths put the lines over the 80 char limit. Change-Id: If7e4c41f2f52bc3d5aa26465c786294f9b68f8d3 --- M configs/example/apu_se.py 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/configs/example/apu_se.py b/configs/example/apu_se.py index 25c148e..d8246ec 100644 --- a/configs/example/apu_se.py +++ b/configs/example/apu_se.py @@ -614,12 +614,12 @@ ## Start simulation -redirect_paths = [RedirectPath(src = "/proc", - dests = ["%s/fs/proc" % m5.options.outdir]), - RedirectPath(src = "/sys", - dests = ["%s/fs/sys" % m5.options.outdir]), - RedirectPath(src = "/tmp", - dests = ["%s/fs/tmp" % m5.options.outdir])] +redirect_paths = [RedirectPath(app_path = "/proc", host_paths = \ +["%s/fs/proc" % m5.options.outdir]), + RedirectPath(app_path = "/sys", host_paths = \ +["%s/fs/sys" % m5.options.outdir]), + RedirectPath(app_path = "/tmp", host_paths = \ +["%s/fs/tmp" % m5.options.outdir])] system.redirect_paths = redirect_paths -- To view, visit https://gem5-review.googlesource.com/c/public/gem5/+/32655 To unsubscribe, or for help writing mail filters, visit https://gem5-review.googlesource.com/settings Gerrit-Project: public/gem5 Gerrit-Branch: develop Gerrit-Change-Id: If7e4c41f2f52bc3d5aa26465c786294f9b68f8d3 Gerrit-Change-Number: 32655 Gerrit-PatchSet: 1 Gerrit-Owner: Kyle Roarty Gerrit-MessageType: newchange
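The keyword rename can be demonstrated without gem5 by using a stand-in class. This is only a sketch: `RedirectPathStandIn` is a hypothetical dataclass that mirrors the two parameter names from the commit message, `make_redirect_paths` is an invented helper, and the way a real gem5 SimObject reacts to unknown keywords may differ from the plain-Python `TypeError` shown here.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class RedirectPathStandIn:
    # Hypothetical stand-in; only the parameter names come from the patch.
    app_path: str
    host_paths: List[str] = field(default_factory=list)


def make_redirect_paths(outdir):
    # Build one mapping per redirected directory, avoiding long lines.
    return [
        RedirectPathStandIn(app_path="/" + name,
                            host_paths=["%s/fs/%s" % (outdir, name)])
        for name in ("proc", "sys", "tmp")
    ]


paths = make_redirect_paths("m5out")

# The old keyword names are rejected by the stand-in.
try:
    RedirectPathStandIn(src="/proc", dests=["m5out/fs/proc"])
    old_keywords_accepted = True
except TypeError:
    old_keywords_accepted = False
```

Building the list in a loop is one way to stay under the 80-character limit mentioned in the commit message without backslash continuations.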
[gem5-dev] Change in gem5/gem5[develop]: arch-gcn3: Free registers when execMask = 0
Kyle Roarty has submitted this change. ( https://gem5-review.googlesource.com/c/public/gem5/+/32234 ) Change subject: arch-gcn3: Free registers when execMask = 0 .. arch-gcn3: Free registers when execMask = 0 Flat instructions free some of their registers through their memory requests, in particular a call to scheduleWriteOperandsFromLoad(), which gets called from GlobalMemPipeline::exec. When execMask is 0, the instruction doesn't issue a memory request. This patch adds in a call to scheduleWriteOperandsFromLoad() when execMask is 0 for Flat Load and AtomicReturn instructions, as those are the instructions that call scheduleWriteOperandsFromLoad() in the memory pipeline. This patch also adds in a missing return statement when execMask is 0 in one of the Flat instructions. Change-Id: I09296adb7401e7515d3cedceb780a5df4598b109 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/32234 Reviewed-by: Matt Sinclair Reviewed-by: Anthony Gutierrez Maintainer: Anthony Gutierrez Tested-by: kokoro --- M src/arch/gcn3/insts/instructions.cc 1 file changed, 74 insertions(+), 0 deletions(-) Approvals: Anthony Gutierrez: Looks good to me, approved; Looks good to me, approved Matt Sinclair: Looks good to me, approved kokoro: Regressions pass diff --git a/src/arch/gcn3/insts/instructions.cc b/src/arch/gcn3/insts/instructions.cc index 955d801..fd89ae2 100644 --- a/src/arch/gcn3/insts/instructions.cc +++ b/src/arch/gcn3/insts/instructions.cc @@ -39406,6 +39406,9 @@ wf->decLGKMInstsIssued(); wf->rdGmReqsInPipe--; wf->rdLmReqsInPipe--; +gpuDynInst->exec_mask = wf->execMask(); +wf->computeUnit->vrf[wf->simdId]-> +scheduleWriteOperandsFromLoad(wf, gpuDynInst); return; } @@ -39504,6 +39507,9 @@ wf->decLGKMInstsIssued(); wf->rdGmReqsInPipe--; wf->rdLmReqsInPipe--; +gpuDynInst->exec_mask = wf->execMask(); +wf->computeUnit->vrf[wf->simdId]-> +scheduleWriteOperandsFromLoad(wf, gpuDynInst); return; } @@ -39602,6 +39608,9 @@ wf->decLGKMInstsIssued(); wf->rdGmReqsInPipe--; 
wf->rdLmReqsInPipe--; +gpuDynInst->exec_mask = wf->execMask(); +wf->computeUnit->vrf[wf->simdId]-> +scheduleWriteOperandsFromLoad(wf, gpuDynInst); return; } @@ -39672,6 +39681,9 @@ wf->decLGKMInstsIssued(); wf->rdGmReqsInPipe--; wf->rdLmReqsInPipe--; +gpuDynInst->exec_mask = wf->execMask(); +wf->computeUnit->vrf[wf->simdId]-> +scheduleWriteOperandsFromLoad(wf, gpuDynInst); return; } @@ -39742,6 +39754,9 @@ wf->decLGKMInstsIssued(); wf->rdGmReqsInPipe--; wf->rdLmReqsInPipe--; +gpuDynInst->exec_mask = wf->execMask(); +wf->computeUnit->vrf[wf->simdId]-> +scheduleWriteOperandsFromLoad(wf, gpuDynInst); return; } @@ -39821,6 +39836,10 @@ wf->decLGKMInstsIssued(); wf->rdGmReqsInPipe--; wf->rdLmReqsInPipe--; +gpuDynInst->exec_mask = wf->execMask(); +wf->computeUnit->vrf[wf->simdId]-> +scheduleWriteOperandsFromLoad(wf, gpuDynInst); +return; } gpuDynInst->execUnitId = wf->execUnitId; @@ -40355,6 +40374,11 @@ wf->decLGKMInstsIssued(); wf->wrGmReqsInPipe--; wf->rdGmReqsInPipe--; +if (instData.GLC) { +gpuDynInst->exec_mask = wf->execMask(); +wf->computeUnit->vrf[wf->simdId]-> +scheduleWriteOperandsFromLoad(wf, gpuDynInst); +} return; } @@ -40457,6 +40481,11 @@ wf->decLGKMInstsIssued(); wf->wrGmReqsInPipe--; wf->rdGmReqsInPipe--; +if (instData.GLC) { +gpuDynInst->exec_mask = wf->execMask(); +wf->computeUnit->vrf[wf->simdId]-> +scheduleWriteOperandsFromLoad(wf, gpuDynInst); +} return; } @@ -40560,6 +40589,11 @@ wf->decLGKMInstsIssued(); wf->wrGmReqsInPipe--; wf->rdGmReqsInPipe--; +if (instData.GLC) { +gpuDynInst->exec_mask = wf->execMask(); +wf->computeUnit->vrf[wf->simdId]-> +scheduleWriteOperandsFromLoad(wf, gpuDynInst); +} return; } @@ -40650,6 +40684,11 @@ wf->decLGKMInstsIssued(); wf->wrGmReqsInPipe--; wf->rdGmReqsInPipe--; +if (instData.GLC) { +gpuDynInst->exec_mask = wf->execMask(); +
[gem5-dev] Change in gem5/gem5[develop]: util: Add build dependency to gcn Dockerfile
Kyle Roarty has submitted this change. ( https://gem5-review.googlesource.com/c/public/gem5/+/32474 ) Change subject: util: Add build dependency to gcn Dockerfile .. util: Add build dependency to gcn Dockerfile src/base/pngwriter.cc requires libpng-dev Change-Id: I7f009cd8f5cacd64150c06b716b1ce3008832910 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/32474 Maintainer: Bobby R. Bruce Tested-by: kokoro Reviewed-by: Matt Sinclair --- M util/dockerfiles/gcn-gpu/Dockerfile 1 file changed, 2 insertions(+), 1 deletion(-) Approvals: Matt Sinclair: Looks good to me, approved Bobby R. Bruce: Looks good to me, approved kokoro: Regressions pass diff --git a/util/dockerfiles/gcn-gpu/Dockerfile b/util/dockerfiles/gcn-gpu/Dockerfile index 7787339..2e1f591 100644 --- a/util/dockerfiles/gcn-gpu/Dockerfile +++ b/util/dockerfiles/gcn-gpu/Dockerfile @@ -33,7 +33,8 @@ libssl-dev \ libboost-filesystem-dev \ libboost-system-dev \ -libboost-dev +libboost-dev \ +libpng12-dev ARG gem5_dist=http://dist.gem5.org/dist/develop -- To view, visit https://gem5-review.googlesource.com/c/public/gem5/+/32474 To unsubscribe, or for help writing mail filters, visit https://gem5-review.googlesource.com/settings Gerrit-Project: public/gem5 Gerrit-Branch: develop Gerrit-Change-Id: I7f009cd8f5cacd64150c06b716b1ce3008832910 Gerrit-Change-Number: 32474 Gerrit-PatchSet: 4 Gerrit-Owner: Kyle Roarty Gerrit-Reviewer: Bobby R. Bruce Gerrit-Reviewer: Jason Lowe-Power Gerrit-Reviewer: Kyle Roarty Gerrit-Reviewer: Matt Sinclair Gerrit-Reviewer: kokoro Gerrit-MessageType: merged
[gem5-dev] Change in gem5/gem5[develop]: configs: Add missing --buffers-size option to GPU_VIPER.py
Kyle Roarty has uploaded this change for review. ( https://gem5-review.googlesource.com/c/public/gem5/+/32676 ) Change subject: configs: Add missing --buffers-size option to GPU_VIPER.py .. configs: Add missing --buffers-size option to GPU_VIPER.py Default value of 128 and help message taken from the implementation on the GCN staging branch Change-Id: I58b6b57be07498cdf6e39c0bb85982674ec4caa6 --- M configs/ruby/GPU_VIPER.py 1 file changed, 2 insertions(+), 0 deletions(-) diff --git a/configs/ruby/GPU_VIPER.py b/configs/ruby/GPU_VIPER.py index 50ccd2b..b87928d 100644 --- a/configs/ruby/GPU_VIPER.py +++ b/configs/ruby/GPU_VIPER.py @@ -399,6 +399,8 @@ parser.add_option("--noL1", action = "store_true", default = False, help = "bypassL1") +parser.add_option("--buffers-size", type = 'int', default = 128, + help="Size of MessageBuffers at the controller") def create_system(options, full_system, system, dma_devices, bootmem, ruby_system): -- To view, visit https://gem5-review.googlesource.com/c/public/gem5/+/32676 To unsubscribe, or for help writing mail filters, visit https://gem5-review.googlesource.com/settings Gerrit-Project: public/gem5 Gerrit-Branch: develop Gerrit-Change-Id: I58b6b57be07498cdf6e39c0bb85982674ec4caa6 Gerrit-Change-Number: 32676 Gerrit-PatchSet: 1 Gerrit-Owner: Kyle Roarty Gerrit-MessageType: newchange
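For readers unfamiliar with how such an option surfaces in a config script, here is a minimal standalone sketch. It assumes the parser is a plain `optparse.OptionParser` (the `parser.add_option` calls in the diff suggest this, but it is not confirmed here), and `build_parser` is a hypothetical helper that is not part of gem5.

```python
from optparse import OptionParser


def build_parser():
    # Hypothetical stand-in for the parser that the Ruby config scripts
    # receive; only the option added by this change is registered.
    parser = OptionParser()
    parser.add_option("--buffers-size", type="int", default=128,
                      help="Size of MessageBuffers at the controller")
    return parser


parser = build_parser()

# With no flag given, the default taken from the commit message applies.
default_opts, _ = parser.parse_args([])

# optparse maps "--buffers-size" to the attribute name "buffers_size".
custom_opts, _ = parser.parse_args(["--buffers-size", "256"])

print(default_opts.buffers_size, custom_opts.buffers_size)
```

The value is read back as `options.buffers_size`, since optparse replaces the dashes in the long option name with underscores.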
[gem5-dev] Change in gem5/gem5[develop]: configs: Add import for FileSystemConfig in GPU_VIPER.py
Kyle Roarty has uploaded this change for review. ( https://gem5-review.googlesource.com/c/public/gem5/+/32675 ) Change subject: configs: Add import for FileSystemConfig in GPU_VIPER.py .. configs: Add import for FileSystemConfig in GPU_VIPER.py Change-Id: I539a4060d705f6e1b9a12aca7836eca271f61557 --- M configs/ruby/GPU_VIPER.py 1 file changed, 1 insertion(+), 0 deletions(-) diff --git a/configs/ruby/GPU_VIPER.py b/configs/ruby/GPU_VIPER.py index 967b4d3..50ccd2b 100644 --- a/configs/ruby/GPU_VIPER.py +++ b/configs/ruby/GPU_VIPER.py @@ -37,6 +37,7 @@ from m5.util import addToPath from .Ruby import create_topology from .Ruby import send_evicts +from common import FileSystemConfig addToPath('../') -- To view, visit https://gem5-review.googlesource.com/c/public/gem5/+/32675 To unsubscribe, or for help writing mail filters, visit https://gem5-review.googlesource.com/settings Gerrit-Project: public/gem5 Gerrit-Branch: develop Gerrit-Change-Id: I539a4060d705f6e1b9a12aca7836eca271f61557 Gerrit-Change-Number: 32675 Gerrit-PatchSet: 1 Gerrit-Owner: Kyle Roarty Gerrit-MessageType: newchange
[gem5-dev] Change in gem5/gem5[develop]: configs: Replace DirMem w/RubyDirectoryMemory, set addr_ranges
Kyle Roarty has uploaded this change for review. ( https://gem5-review.googlesource.com/c/public/gem5/+/32674 ) Change subject: configs: Replace DirMem w/RubyDirectoryMemory, set addr_ranges .. configs: Replace DirMem w/RubyDirectoryMemory, set addr_ranges This was originally from the GCN staging branch, which only had GPU_VIPER.py, but the other GPU_VIPER configs had DirMem as well, so I applied this change to all of them. The patch replaces the Directory in DirCntrl from DirMem to RubyDirectoryMemory. This fixes errors that DirMem caused relating to setting class variables. It also generates and sets addr_ranges in DirCntrl as RubyDirectoryMemory uses the parent object's addr_ranges in its code The style checker complained about a line length in GPU_VIPER_Region, so the patch also fixes that Change-Id: Icec96777a51d8a826b576fc752fae0f7f15427bc --- M configs/ruby/GPU_VIPER.py M configs/ruby/GPU_VIPER_Baseline.py M configs/ruby/GPU_VIPER_Region.py 3 files changed, 69 insertions(+), 47 deletions(-) diff --git a/configs/ruby/GPU_VIPER.py b/configs/ruby/GPU_VIPER.py index 92dcf5e..967b4d3 100644 --- a/configs/ruby/GPU_VIPER.py +++ b/configs/ruby/GPU_VIPER.py @@ -322,24 +322,14 @@ self.probeToL3 = probe_to_l3 self.respToL3 = resp_to_l3 -class DirMem(RubyDirectoryMemory, CntrlBase): -def create(self, options, ruby_system, system): -self.version = self.versionCount() - -phys_mem_size = AddrRange(options.mem_size).size() -mem_module_size = phys_mem_size / options.num_dirs -dir_size = MemorySize('0B') -dir_size.value = mem_module_size -self.size = dir_size - class DirCntrl(Directory_Controller, CntrlBase): -def create(self, options, ruby_system, system): +def create(self, options, dir_ranges, ruby_system, system): self.version = self.versionCount() self.response_latency = 30 -self.directory = DirMem() -self.directory.create(options, ruby_system, system) +self.addr_ranges = dir_ranges +self.directory = RubyDirectoryMemory() self.L3CacheMemory = L3Cache() 
self.L3CacheMemory.create(options, ruby_system, system) @@ -441,6 +431,17 @@ # Clusters crossbar_bw = None mainCluster = None + +if options.numa_high_bit: +numa_bit = options.numa_high_bit +else: +# if the numa_bit is not specified, set the directory bits as the +# lowest bits above the block offset bits, and the numa_bit as the +# highest of those directory bits +dir_bits = int(math.log(options.num_dirs, 2)) +block_size_bits = int(math.log(options.cacheline_size, 2)) +numa_bit = block_size_bits + dir_bits - 1 + if hasattr(options, 'bw_scalor') and options.bw_scalor > 0: #Assuming a 2GHz clock crossbar_bw = 16 * options.num_compute_units * options.bw_scalor @@ -448,9 +449,16 @@ else: mainCluster = Cluster(intBW=8) # 16 GB/s for i in range(options.num_dirs): +dir_ranges = [] +for r in system.mem_ranges: +addr_range = m5.objects.AddrRange(r.start, size = r.size(), + intlvHighBit = numa_bit, + intlvBits = dir_bits, + intlvMatch = i) +dir_ranges.append(addr_range) dir_cntrl = DirCntrl(noTCCdir = True, TCC_select_num_bits = TCC_bits) -dir_cntrl.create(options, ruby_system, system) +dir_cntrl.create(options, dir_ranges, ruby_system, system) dir_cntrl.number_of_TBEs = options.num_tbes dir_cntrl.useL3OnWT = options.use_L3_on_WT # the number_of_TBEs is inclusive of TBEs below diff --git a/configs/ruby/GPU_VIPER_Baseline.py b/configs/ruby/GPU_VIPER_Baseline.py index 5388a4e..5a3 100644 --- a/configs/ruby/GPU_VIPER_Baseline.py +++ b/configs/ruby/GPU_VIPER_Baseline.py @@ -301,22 +301,12 @@ self.probeToL3 = probe_to_l3 self.respToL3 = resp_to_l3 -class DirMem(RubyDirectoryMemory, CntrlBase): -def create(self, options, ruby_system, system): -self.version = self.versionCount() - -phys_mem_size = AddrRange(options.mem_size).size() -mem_module_size = phys_mem_size / options.num_dirs -dir_size = MemorySize('0B') -dir_size.value = mem_module_size -self.size = dir_size - class DirCntrl(Directory_Controller, CntrlBase): -def create(self, options, ruby_system, system): +def create(self, 
options, dir_ranges, ruby_system, system): self.version = self.versionCount() self.response_latency = 30 -self.directory = DirMem() -self.directory.create(options, ruby_system, system) +self.addr_ranges = dir_ranges +self.directory = RubyDirectoryMemory() self.L3CacheMemory = L3Cache()
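The default `numa_bit` computation in the patch can be checked with plain integers. The sketch below is only an illustration. It assumes `num_dirs` and `cacheline_size` are powers of two, and it assumes that an interleaved range with `intlvHighBit = numa_bit` and `intlvBits = dir_bits` selects the directory from address bits `numa_bit` down to `numa_bit - dir_bits + 1`. The helper names `interleave_bits` and `directory_for` are hypothetical and do not exist in gem5.

```python
def interleave_bits(num_dirs, cacheline_size):
    """Return (dir_bits, block_size_bits, numa_bit) as in the patch's else-branch."""
    # bit_length() of (n - 1) equals log2(n) for powers of two, without floats.
    dir_bits = (num_dirs - 1).bit_length()
    block_size_bits = (cacheline_size - 1).bit_length()
    numa_bit = block_size_bits + dir_bits - 1
    return dir_bits, block_size_bits, numa_bit


def directory_for(addr, num_dirs, cacheline_size):
    """Index of the directory whose interleaved range would match addr."""
    dir_bits, _, numa_bit = interleave_bits(num_dirs, cacheline_size)
    if dir_bits == 0:
        return 0  # a single directory owns every address
    low_bit = numa_bit - dir_bits + 1  # lowest of the directory-select bits
    return (addr >> low_bit) & ((1 << dir_bits) - 1)


# Four directories and 64-byte lines: bits 7..6 pick the directory.
print(interleave_bits(4, 64))
print([directory_for(a, 4, 64) for a in (0x0, 0x40, 0x80, 0xC0, 0x100)])
```

Under these assumptions, consecutive cache lines rotate through the directories, which is what placing the directory bits directly above the block offset bits is meant to achieve.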
[gem5-dev] Change in gem5/gem5[develop]: configs,gpu-compute,mem-ruby: connect gmTokenPorts in apu_se
Kyle Roarty has uploaded this change for review. ( https://gem5-review.googlesource.com/c/public/gem5/+/32677 ) Change subject: configs,gpu-compute,mem-ruby: connect gmTokenPorts in apu_se .. configs,gpu-compute,mem-ruby: connect gmTokenPorts in apu_se This patch adds gmTokenPorts to the ComputeUnit and RubyGPUCoalescer python classes so the gmTokenPorts can be connected in apu_se. Change-Id: Icf3cb05c757754d6935b46f14e4b1b1d5072c4ca --- M configs/example/apu_se.py M src/gpu-compute/GPU.py M src/mem/ruby/system/GPUCoalescer.py 3 files changed, 5 insertions(+), 0 deletions(-) diff --git a/configs/example/apu_se.py b/configs/example/apu_se.py index 82e4022..ef2236c 100644 --- a/configs/example/apu_se.py +++ b/configs/example/apu_se.py @@ -571,6 +571,8 @@ for j in range(wavefront_size): system.cpu[shader_idx].CUs[i].memory_port[j] = \ system.ruby._cpu_ports[gpu_port_idx].slave[j] +system.cpu[shader_idx].CUs[i].gmTokenPort = \ +system.ruby._cpu_ports[gpu_port_idx].gmTokenPort gpu_port_idx += 1 for i in range(n_cu): diff --git a/src/gpu-compute/GPU.py b/src/gpu-compute/GPU.py index 7408bf9..aec4f48 100644 --- a/src/gpu-compute/GPU.py +++ b/src/gpu-compute/GPU.py @@ -165,6 +165,7 @@ sqc_tlb_port = MasterPort("Port to the TLB for the SQC (I-cache)") scalar_port = MasterPort("Port to the scalar data cache") scalar_tlb_port = MasterPort("Port to the TLB for the scalar data cache") +gmTokenPort = MasterPort("Port to the GPU coalesecer for sharing tokens") perLaneTLB = Param.Bool(False, "enable per-lane TLB") prefetch_depth = Param.Int(0, "Number of prefetches triggered at a time"\ "(0 turns off prefetching)") diff --git a/src/mem/ruby/system/GPUCoalescer.py b/src/mem/ruby/system/GPUCoalescer.py index 3345f7f..9d4a76b 100644 --- a/src/mem/ruby/system/GPUCoalescer.py +++ b/src/mem/ruby/system/GPUCoalescer.py @@ -52,3 +52,5 @@ "max outstanding cycles for a request before " \ "deadlock/livelock declared") garnet_standalone = Param.Bool(False, "") + + gmTokenPort = 
SlavePort("Port to the CU for sharing tokens") -- To view, visit https://gem5-review.googlesource.com/c/public/gem5/+/32677 To unsubscribe, or for help writing mail filters, visit https://gem5-review.googlesource.com/settings Gerrit-Project: public/gem5 Gerrit-Branch: develop Gerrit-Change-Id: Icf3cb05c757754d6935b46f14e4b1b1d5072c4ca Gerrit-Change-Number: 32677 Gerrit-PatchSet: 1 Gerrit-Owner: Kyle Roarty Gerrit-MessageType: newchange
[gem5-dev] Change in gem5/gem5[develop]: misc: Fix db_offset calculation
Kyle Roarty has uploaded this change for review. ( https://gem5-review.googlesource.com/c/public/gem5/+/32678 ) Change subject: misc: Fix db_offset calculation .. misc: Fix db_offset calculation db_offset used to be calculated through pointer arithmetic. Pointer arithmetic increments the address by the size of the data type the pointer is pointing at. In the previous db_offset calculation, that was a uint32_t, which means the input was multiplied by 4. This patch multiplies the input value by 4 before assigning it to db_offset. Change-Id: I9042560303ae6b8b1054b98e9a16a9da27843bb2 --- M src/dev/hsa/hw_scheduler.cc 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/src/dev/hsa/hw_scheduler.cc b/src/dev/hsa/hw_scheduler.cc index 8523be9..4037ffb 100644 --- a/src/dev/hsa/hw_scheduler.cc +++ b/src/dev/hsa/hw_scheduler.cc @@ -95,7 +95,7 @@ // #define VOID_PTR_ADD32(ptr,n) // (void*)((uint32_t*)(ptr) + n)/*ptr + offset*/ // (Addr)VOID_PTR_ADD32(0, queue_id) -Addr db_offset = queue_id; +Addr db_offset = 4*queue_id; if (dbMap.find(db_offset) != dbMap.end()) { panic("Creating an already existing queue (queueID %d)", queue_id); } @@ -346,7 +346,7 @@ // #define VOID_PTR_ADD32(ptr,n) // (void*)((uint32_t*)(ptr) + n)/*ptr + offset*/ // (Addr)VOID_PTR_ADD32(0, queue_id) -Addr db_offset = queue_id; +Addr db_offset = 4*queue_id; auto dbmap_iter = dbMap.find(db_offset); if (dbmap_iter == dbMap.end()) { panic("Destroying a non-existing queue (db_offset %x)", -- To view, visit https://gem5-review.googlesource.com/c/public/gem5/+/32678 To unsubscribe, or for help writing mail filters, visit https://gem5-review.googlesource.com/settings Gerrit-Project: public/gem5 Gerrit-Branch: develop Gerrit-Change-Id: I9042560303ae6b8b1054b98e9a16a9da27843bb2 Gerrit-Change-Number: 32678 Gerrit-PatchSet: 1 Gerrit-Owner: Kyle Roarty Gerrit-MessageType: newchange
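The factor of 4 in the db_offset change follows from the element size used by the old `uint32_t *` arithmetic. As a small worked illustration (not gem5 code; `db_offset_for` is a hypothetical name), the same scaling can be written with `ctypes`:

```python
import ctypes


def db_offset_for(queue_id):
    # (uint32_t *)0 + queue_id advances by queue_id elements, and each
    # uint32_t element occupies sizeof(uint32_t) == 4 bytes.
    return queue_id * ctypes.sizeof(ctypes.c_uint32)


for queue_id in (0, 1, 3):
    print(queue_id, db_offset_for(queue_id))
```

So queue 3 maps to byte offset 12, which is what `4*queue_id` produces in the patched code.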
[gem5-dev] Change in gem5/gem5[develop]: configs: Remove remnants of /dev/shm mapping from apu_se
Kyle Roarty has submitted this change. ( https://gem5-review.googlesource.com/c/public/gem5/+/32354 ) Change subject: configs: Remove remnants of /dev/shm mapping from apu_se .. configs: Remove remnants of /dev/shm mapping from apu_se This patch removes a redirect for /dev/shm. It also removes a function call that cleaned up the /dev/shm redirect Change-Id: Iec2598c715223d079bc5dfd2ea52859945706cfc Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/32354 Reviewed-by: Matt Sinclair Reviewed-by: Jason Lowe-Power Maintainer: Jason Lowe-Power Tested-by: kokoro --- M configs/example/apu_se.py 1 file changed, 1 insertion(+), 6 deletions(-) Approvals: Jason Lowe-Power: Looks good to me, approved; Looks good to me, approved Matt Sinclair: Looks good to me, approved kokoro: Regressions pass diff --git a/configs/example/apu_se.py b/configs/example/apu_se.py index 82e4022..fcec7c1 100644 --- a/configs/example/apu_se.py +++ b/configs/example/apu_se.py @@ -622,10 +622,7 @@ RedirectPath(src = "/sys", dests = ["%s/fs/sys" % m5.options.outdir]), RedirectPath(src = "/tmp", - dests = ["%s/fs/tmp" % m5.options.outdir]), - RedirectPath(src = "/dev/shm", - dests = ["/dev/shm/%s/gem5_%s" % - (getpass.getuser(), os.getpid())])] + dests = ["%s/fs/tmp" % m5.options.outdir])] system.redirect_paths = redirect_paths @@ -681,6 +678,4 @@ print("Ticks:", m5.curTick()) print('Exiting because ', exit_event.getCause()) -FileSystemConfig.cleanup_filesystem(options) - sys.exit(exit_event.getCode()) -- To view, visit https://gem5-review.googlesource.com/c/public/gem5/+/32354 To unsubscribe, or for help writing mail filters, visit https://gem5-review.googlesource.com/settings Gerrit-Project: public/gem5 Gerrit-Branch: develop Gerrit-Change-Id: Iec2598c715223d079bc5dfd2ea52859945706cfc Gerrit-Change-Number: 32354 Gerrit-PatchSet: 4 Gerrit-Owner: Kyle Roarty Gerrit-Reviewer: Anthony Gutierrez Gerrit-Reviewer: Brandon Potter Gerrit-Reviewer: Jason Lowe-Power Gerrit-Reviewer: Kyle Roarty 
Gerrit-Reviewer: Matt Sinclair Gerrit-Reviewer: Matthew Poremba Gerrit-Reviewer: kokoro Gerrit-CC: Bradford Beckmann Gerrit-MessageType: merged
[gem5-dev] Change in gem5/gem5[develop]: configs: Remove unneeded variable assignments in apu_se
Kyle Roarty has submitted this change. ( https://gem5-review.googlesource.com/c/public/gem5/+/32654 ) Change subject: configs: Remove unneeded variable assignments in apu_se .. configs: Remove unneeded variable assignments in apu_se This patch removes (1) a line assigning a variable to itself and (2) an assignment to a variable (chroot) that is never used. The chroot assignment also caused an error, "'NoneType' object has no attribute 'startswith'" Change-Id: Ib93c25fee4a0f7c1440de8067b086d8b96614796 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/32654 Reviewed-by: Matt Sinclair Reviewed-by: Jason Lowe-Power Maintainer: Jason Lowe-Power Tested-by: kokoro --- M configs/example/apu_se.py 1 file changed, 0 insertions(+), 3 deletions(-) Approvals: Jason Lowe-Power: Looks good to me, approved; Looks good to me, approved Matt Sinclair: Looks good to me, but someone else must approve kokoro: Regressions pass diff --git a/configs/example/apu_se.py b/configs/example/apu_se.py index fcec7c1..25c148e 100644 --- a/configs/example/apu_se.py +++ b/configs/example/apu_se.py @@ -334,8 +334,6 @@ shader.CUs = compute_units ## Creating the CPU system -options.num_cpus = options.num_cpus - # The shader core will be whatever is after the CPU cores are accounted for shader_idx = options.num_cpus @@ -616,7 +614,6 @@ ## Start simulation -chroot = os.path.expanduser(options.chroot) redirect_paths = [RedirectPath(src = "/proc", dests = ["%s/fs/proc" % m5.options.outdir]), RedirectPath(src = "/sys", -- To view, visit https://gem5-review.googlesource.com/c/public/gem5/+/32654 To unsubscribe, or for help writing mail filters, visit https://gem5-review.googlesource.com/settings Gerrit-Project: public/gem5 Gerrit-Branch: develop Gerrit-Change-Id: Ib93c25fee4a0f7c1440de8067b086d8b96614796 Gerrit-Change-Number: 32654 Gerrit-PatchSet: 2 Gerrit-Owner: Kyle Roarty Gerrit-Reviewer: Anthony Gutierrez Gerrit-Reviewer: Jason Lowe-Power Gerrit-Reviewer: Kyle Roarty Gerrit-Reviewer: Matt Sinclair 
Gerrit-Reviewer: kokoro Gerrit-CC: Bradford Beckmann Gerrit-MessageType: merged
[gem5-dev] Change in gem5/gem5[develop]: configs: Use proper keywordargs for RedirectPath in apu_se
Kyle Roarty has submitted this change. ( https://gem5-review.googlesource.com/c/public/gem5/+/32655 ) Change subject: configs: Use proper keywordargs for RedirectPath in apu_se .. configs: Use proper keywordargs for RedirectPath in apu_se RedirectPath uses app_path and host_paths instead of src and dests. This patch fixes that in apu_se. The patch also changes the formatting for those lines, as simply replacing dests with host_paths put the lines over the 80 char limit. Change-Id: If7e4c41f2f52bc3d5aa26465c786294f9b68f8d3 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/32655 Reviewed-by: Jason Lowe-Power Reviewed-by: Matt Sinclair Maintainer: Jason Lowe-Power Tested-by: kokoro --- M configs/example/apu_se.py 1 file changed, 9 insertions(+), 6 deletions(-) Approvals: Jason Lowe-Power: Looks good to me, approved; Looks good to me, approved Matt Sinclair: Looks good to me, approved kokoro: Regressions pass diff --git a/configs/example/apu_se.py b/configs/example/apu_se.py index 25c148e..b629058 100644 --- a/configs/example/apu_se.py +++ b/configs/example/apu_se.py @@ -614,12 +614,15 @@ ## Start simulation -redirect_paths = [RedirectPath(src = "/proc", - dests = ["%s/fs/proc" % m5.options.outdir]), - RedirectPath(src = "/sys", - dests = ["%s/fs/sys" % m5.options.outdir]), - RedirectPath(src = "/tmp", - dests = ["%s/fs/tmp" % m5.options.outdir])] +redirect_paths = [RedirectPath(app_path = "/proc", + host_paths = +["%s/fs/proc" % m5.options.outdir]), + RedirectPath(app_path = "/sys", + host_paths = +["%s/fs/sys" % m5.options.outdir]), + RedirectPath(app_path = "/tmp", + host_paths = +["%s/fs/tmp" % m5.options.outdir])] system.redirect_paths = redirect_paths -- To view, visit https://gem5-review.googlesource.com/c/public/gem5/+/32655 To unsubscribe, or for help writing mail filters, visit https://gem5-review.googlesource.com/settings Gerrit-Project: public/gem5 Gerrit-Branch: develop Gerrit-Change-Id: If7e4c41f2f52bc3d5aa26465c786294f9b68f8d3 
Gerrit-Change-Number: 32655 Gerrit-PatchSet: 4 Gerrit-Owner: Kyle Roarty Gerrit-Reviewer: Jason Lowe-Power Gerrit-Reviewer: Kyle Roarty Gerrit-Reviewer: Matt Sinclair Gerrit-Reviewer: kokoro Gerrit-CC: Anthony Gutierrez Gerrit-CC: Bradford Beckmann Gerrit-MessageType: merged
[gem5-dev] Change in gem5/gem5[develop]: configs,gpu-compute,mem-ruby: connect gmTokenPorts in apu_se
Kyle Roarty has submitted this change. ( https://gem5-review.googlesource.com/c/public/gem5/+/32677 ) Change subject: configs,gpu-compute,mem-ruby: connect gmTokenPorts in apu_se .. configs,gpu-compute,mem-ruby: connect gmTokenPorts in apu_se This patch adds gmTokenPorts to the ComputeUnit and RubyGPUCoalescer python classes so the gmTokenPorts can be connected in apu_se. Change-Id: Icf3cb05c757754d6935b46f14e4b1b1d5072c4ca Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/32677 Reviewed-by: Matt Sinclair Reviewed-by: Anthony Gutierrez Maintainer: Anthony Gutierrez Tested-by: kokoro --- M configs/example/apu_se.py M src/gpu-compute/GPU.py M src/mem/ruby/system/GPUCoalescer.py 3 files changed, 5 insertions(+), 0 deletions(-) Approvals: Anthony Gutierrez: Looks good to me, approved; Looks good to me, approved Matt Sinclair: Looks good to me, but someone else must approve kokoro: Regressions pass diff --git a/configs/example/apu_se.py b/configs/example/apu_se.py index b629058..59dd4c5 100644 --- a/configs/example/apu_se.py +++ b/configs/example/apu_se.py @@ -569,6 +569,8 @@ for j in range(wavefront_size): system.cpu[shader_idx].CUs[i].memory_port[j] = \ system.ruby._cpu_ports[gpu_port_idx].slave[j] +system.cpu[shader_idx].CUs[i].gmTokenPort = \ +system.ruby._cpu_ports[gpu_port_idx].gmTokenPort gpu_port_idx += 1 for i in range(n_cu): diff --git a/src/gpu-compute/GPU.py b/src/gpu-compute/GPU.py index 7408bf9..aec4f48 100644 --- a/src/gpu-compute/GPU.py +++ b/src/gpu-compute/GPU.py @@ -165,6 +165,7 @@ sqc_tlb_port = MasterPort("Port to the TLB for the SQC (I-cache)") scalar_port = MasterPort("Port to the scalar data cache") scalar_tlb_port = MasterPort("Port to the TLB for the scalar data cache") +gmTokenPort = MasterPort("Port to the GPU coalesecer for sharing tokens") perLaneTLB = Param.Bool(False, "enable per-lane TLB") prefetch_depth = Param.Int(0, "Number of prefetches triggered at a time"\ "(0 turns off prefetching)") diff --git 
a/src/mem/ruby/system/GPUCoalescer.py b/src/mem/ruby/system/GPUCoalescer.py index 3345f7f..9d4a76b 100644 --- a/src/mem/ruby/system/GPUCoalescer.py +++ b/src/mem/ruby/system/GPUCoalescer.py @@ -52,3 +52,5 @@ "max outstanding cycles for a request before " \ "deadlock/livelock declared") garnet_standalone = Param.Bool(False, "") + + gmTokenPort = SlavePort("Port to the CU for sharing tokens") -- To view, visit https://gem5-review.googlesource.com/c/public/gem5/+/32677 To unsubscribe, or for help writing mail filters, visit https://gem5-review.googlesource.com/settings Gerrit-Project: public/gem5 Gerrit-Branch: develop Gerrit-Change-Id: Icf3cb05c757754d6935b46f14e4b1b1d5072c4ca Gerrit-Change-Number: 32677 Gerrit-PatchSet: 2 Gerrit-Owner: Kyle Roarty Gerrit-Reviewer: Anthony Gutierrez Gerrit-Reviewer: Bradford Beckmann Gerrit-Reviewer: Jason Lowe-Power Gerrit-Reviewer: Kyle Roarty Gerrit-Reviewer: Matt Sinclair Gerrit-Reviewer: Matthew Poremba Gerrit-Reviewer: kokoro Gerrit-MessageType: merged
[gem5-dev] Change in gem5/gem5[develop]: arch-gcn3: Update LmReqsInPipe in atomic flats when execMask=0
Kyle Roarty has submitted this change. ( https://gem5-review.googlesource.com/c/public/gem5/+/32614 ) Change subject: arch-gcn3: Update LmReqsInPipe in atomic flats when execMask=0 .. arch-gcn3: Update LmReqsInPipe in atomic flats when execMask=0 In flat instructions, wrLmReqsInPipe/rdLmReqsInPipe are decremented in the calcAddr() function. However, the calcAddr() function is only called when execMask != 0. This patch adds in statements to decrement wrLmReqsInPipe and rdLmReqsInPipe in all implemented atomic flats when execMask is 0. This fixes a scenario where vector local memory and flat instructions are unable to execute due to LocalMemPipeline::isLMReqFIFOWrRdy always returning false in ScheduleStage::dispatchReady after too many atomic flats execute with execMask = 0 Change-Id: I081cfd3faf74bbfcf0728445e7160fa2a76a6a7e Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/32614 Reviewed-by: Matt Sinclair Reviewed-by: Alexandru Duțu Maintainer: Anthony Gutierrez Tested-by: kokoro --- M src/arch/gcn3/insts/instructions.cc 1 file changed, 22 insertions(+), 0 deletions(-) Approvals: Alexandru Duțu: Looks good to me, approved Matt Sinclair: Looks good to me, but someone else must approve Anthony Gutierrez: Looks good to me, approved kokoro: Regressions pass diff --git a/src/arch/gcn3/insts/instructions.cc b/src/arch/gcn3/insts/instructions.cc index fd89ae2..296dbad 100644 --- a/src/arch/gcn3/insts/instructions.cc +++ b/src/arch/gcn3/insts/instructions.cc @@ -40374,6 +40374,8 @@ wf->decLGKMInstsIssued(); wf->wrGmReqsInPipe--; wf->rdGmReqsInPipe--; +wf->wrLmReqsInPipe--; +wf->rdLmReqsInPipe--; if (instData.GLC) { gpuDynInst->exec_mask = wf->execMask(); wf->computeUnit->vrf[wf->simdId]-> @@ -40481,6 +40483,8 @@ wf->decLGKMInstsIssued(); wf->wrGmReqsInPipe--; wf->rdGmReqsInPipe--; +wf->wrLmReqsInPipe--; +wf->rdLmReqsInPipe--; if (instData.GLC) { gpuDynInst->exec_mask = wf->execMask(); wf->computeUnit->vrf[wf->simdId]-> @@ -40589,6 +40593,8 @@ 
wf->decLGKMInstsIssued(); wf->wrGmReqsInPipe--; wf->rdGmReqsInPipe--; +wf->wrLmReqsInPipe--; +wf->rdLmReqsInPipe--; if (instData.GLC) { gpuDynInst->exec_mask = wf->execMask(); wf->computeUnit->vrf[wf->simdId]-> @@ -40684,6 +40690,8 @@ wf->decLGKMInstsIssued(); wf->wrGmReqsInPipe--; wf->rdGmReqsInPipe--; +wf->wrLmReqsInPipe--; +wf->rdLmReqsInPipe--; if (instData.GLC) { gpuDynInst->exec_mask = wf->execMask(); wf->computeUnit->vrf[wf->simdId]-> @@ -40953,6 +40961,8 @@ wf->decLGKMInstsIssued(); wf->wrGmReqsInPipe--; wf->rdGmReqsInPipe--; +wf->wrLmReqsInPipe--; +wf->rdLmReqsInPipe--; if (instData.GLC) { gpuDynInst->exec_mask = wf->execMask(); wf->computeUnit->vrf[wf->simdId]-> @@ -41048,6 +41058,8 @@ wf->decLGKMInstsIssued(); wf->wrGmReqsInPipe--; wf->rdGmReqsInPipe--; +wf->wrLmReqsInPipe--; +wf->rdLmReqsInPipe--; if (instData.GLC) { gpuDynInst->exec_mask = wf->execMask(); wf->computeUnit->vrf[wf->simdId]-> @@ -41172,6 +41184,8 @@ wf->decLGKMInstsIssued(); wf->wrGmReqsInPipe--; wf->rdGmReqsInPipe--; +wf->wrLmReqsInPipe--; +wf->rdLmReqsInPipe--; if (instData.GLC) { gpuDynInst->exec_mask = wf->execMask(); wf->computeUnit->vrf[wf->simdId]-> @@ -41281,6 +41295,8 @@ wf->decLGKMInstsIssued(); wf->wrGmReqsInPipe--; wf->rdGmReqsInPipe--; +wf->wrLmReqsInPipe--; +wf->rdLmReqsInPipe--; if (instData.GLC) { gpuDynInst->exec_mask = wf->execMask(); wf->computeUnit->vrf[wf->simdId]-> @@ -41378,6 +41394,8 @@ wf->decLGKMInstsIssued(); wf->wrGmReqsInPipe--; wf->rdGmReqsInPipe--; +wf->wrLmReqsInPipe--; +wf->rdLmReqsInPipe--; if (instData.GLC) { gpuDynInst->exec_mask = wf->execMask(); wf->computeUnit->vrf[wf->simdId]-> @@ -41657,6 +41675,8 @@ wf->decLGKMInstsIssued(); wf->wrGmReqsInPipe--; wf->rdGmReqsInPipe--; +wf->wrLmReqsInPipe--; +wf->rdLmReqsInPipe--; if (instData.GLC) { gpuDynInst->exec_mask = wf->execMask();
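The stall described in the LmReqsInPipe commit message can be pictured with a toy model. Everything in this sketch is a simplification made for illustration: `ToyWavefront`, `execute_atomic_flat`, `lm_fifo_ready`, and the capacity of 8 are hypothetical. It also assumes that the local-memory counters are incremented when a flat instruction is scheduled, and that the readiness check compares the pending count against a fixed queue size.

```python
class ToyWavefront:
    """Hypothetical stand-in for a wavefront's local-memory bookkeeping."""

    def __init__(self):
        self.wr_lm_reqs_in_pipe = 0
        self.rd_lm_reqs_in_pipe = 0


def execute_atomic_flat(wf, exec_mask, apply_fix):
    # Assumption: scheduling a flat instruction reserves a slot in both counters.
    wf.wr_lm_reqs_in_pipe += 1
    wf.rd_lm_reqs_in_pipe += 1
    if exec_mask != 0:
        # Models calcAddr() running and releasing the reservation.
        wf.wr_lm_reqs_in_pipe -= 1
        wf.rd_lm_reqs_in_pipe -= 1
    elif apply_fix:
        # The patch adds these decrements on the execMask == 0 early-return path.
        wf.wr_lm_reqs_in_pipe -= 1
        wf.rd_lm_reqs_in_pipe -= 1


def lm_fifo_ready(wf, capacity=8):
    # Simplified stand-in for LocalMemPipeline::isLMReqFIFOWrRdy.
    return wf.wr_lm_reqs_in_pipe < capacity


leaky, fixed = ToyWavefront(), ToyWavefront()
for _ in range(10):
    execute_atomic_flat(leaky, exec_mask=0, apply_fix=False)
    execute_atomic_flat(fixed, exec_mask=0, apply_fix=True)

print(leaky.wr_lm_reqs_in_pipe, lm_fifo_ready(leaky))
print(fixed.wr_lm_reqs_in_pipe, lm_fifo_ready(fixed))
```

In this model the unpatched path leaks one reservation per masked-off atomic flat until the readiness check never passes again, while the patched path returns the counters to zero.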
[gem5-dev] Change in gem5/gem5[develop]: misc: Use VPtr in hsa_driver.cc
Kyle Roarty has uploaded this change for review. ( https://gem5-review.googlesource.com/c/public/gem5/+/33655 ) Change subject: misc: Use VPtr in hsa_driver.cc .. misc: Use VPtr in hsa_driver.cc This change updates HSADriver::allocateQueue to take in a ThreadContext pointer as opposed to a PortProxy ref. This allows the TypedBufferArg to be replaced with VPtr. This also fixes building GCN3_X86 Change-Id: I1fea26b10c7344daf54a0cb05337e961f834a5fd --- M src/dev/hsa/hsa_driver.cc M src/dev/hsa/hsa_driver.hh M src/gpu-compute/gpu_compute_driver.cc 3 files changed, 4 insertions(+), 6 deletions(-) diff --git a/src/dev/hsa/hsa_driver.cc b/src/dev/hsa/hsa_driver.cc index a1215c4..3b27149 100644 --- a/src/dev/hsa/hsa_driver.cc +++ b/src/dev/hsa/hsa_driver.cc @@ -101,10 +101,9 @@ * be mapped into that page. */ void -HSADriver::allocateQueue(PortProxy &mem_proxy, Addr ioc_buf) +HSADriver::allocateQueue(ThreadContext *tc, Addr ioc_buf) { -TypedBufferArg args(ioc_buf); -args.copyIn(mem_proxy); +VPtr args(ioc_buf, tc); if (queueId >= 0x1000) { fatal("%s: Exceeded maximum number of HSA queues allowed\n", name()); @@ -115,5 +114,4 @@ hsa_pp.setDeviceQueueDesc(args->read_pointer_address, args->ring_base_address, args->queue_id, args->ring_size); -args.copyOut(mem_proxy); } diff --git a/src/dev/hsa/hsa_driver.hh b/src/dev/hsa/hsa_driver.hh index abf79ab..19982f7 100644 --- a/src/dev/hsa/hsa_driver.hh +++ b/src/dev/hsa/hsa_driver.hh @@ -74,7 +74,7 @@ HSADevice *device; uint32_t queueId; -void allocateQueue(PortProxy &mem_proxy, Addr ioc_buf); +void allocateQueue(ThreadContext *tc, Addr ioc_buf); }; #endif // __DEV_HSA_HSA_DRIVER_HH__ diff --git a/src/gpu-compute/gpu_compute_driver.cc b/src/gpu-compute/gpu_compute_driver.cc index 6bdb314..b4d65ce6 100644 --- a/src/gpu-compute/gpu_compute_driver.cc +++ b/src/gpu-compute/gpu_compute_driver.cc @@ -71,7 +71,7 @@ { DPRINTF(GPUDriver, "ioctl: AMDKFD_IOC_CREATE_QUEUE\n"); -allocateQueue(virt_proxy, ioc_buf); +allocateQueue(tc, ioc_buf); 
DPRINTF(GPUDriver, "Creating queue %d\n", queueId); } -- To view, visit https://gem5-review.googlesource.com/c/public/gem5/+/33655 To unsubscribe, or for help writing mail filters, visit https://gem5-review.googlesource.com/settings Gerrit-Project: public/gem5 Gerrit-Branch: develop Gerrit-Change-Id: I1fea26b10c7344daf54a0cb05337e961f834a5fd Gerrit-Change-Number: 33655 Gerrit-PatchSet: 1 Gerrit-Owner: Kyle Roarty Gerrit-MessageType: newchange
[gem5-dev] Change in gem5/gem5[develop]: gpu-compute,mem-ruby: WriteCompletePkts fixes
Kyle Roarty has uploaded this change for review. ( https://gem5-review.googlesource.com/c/public/gem5/+/33656 ) Change subject: gpu-compute,mem-ruby: WriteCompletePkts fixes .. gpu-compute,mem-ruby: WriteCompletePkts fixes There seems to be a flow of packets as follows: WriteResp -> WriteReq -> WriteCompleteResp. All of these packets share a senderState. The senderState was being deleted when the WriteResp packet was handled. This patch fixes that, avoiding a segfault. Additionally, the WriteCompleteResp packet was attempting to access physical memory in hitCallback while not having any data, which caused a crash. This can be resolved either by not allowing WriteCompleteResp packets to access memory, or by copying the data from the WriteReq packet. This patch denies WriteCompleteResp packets memory access in hitCallback. In VIPERCoalescer::writeCompleteCallback, there was a packet map, but no packets were ever being removed. This patch removes packets that match the address that was passed in to the function. There was an error that occurred if this change wasn't implemented; I forget what it was at this point. Now for the worst part of the change: ComputeUnit::DataPort::recvTimingResp, when the packet is a WriteCompleteResp packet, will call globalMemoryPipe.handleResponse. That call happens when gpuDynInst->allLanesZero(). There are two issues with this. 1: The WriteResp packets (as mentioned above) share the same gpuDynInst as the WriteCompleteResp packets. The WriteResp packets also decrement the status vector in ComputeUnit::DataPort::processMemRespEvent. This leads to issue 2. 2: There are multiple WriteCompleteResp packets for one gpuDynInst. Because the status vector is already decremented by the WriteResp packets, the check for gpuDynInst->allLanesZero() returns true for all those packets. This causes globalMemoryPipe.handleResponse to be called multiple times, which causes a crash on an assert. I simply replaced the assert with a return statement.
WIP Change-Id: I9a064a0def2bf6c513f5295596c56b1b652b0ca4 --- M src/gpu-compute/compute_unit.cc M src/gpu-compute/global_memory_pipeline.cc M src/mem/ruby/system/RubyPort.cc M src/mem/ruby/system/VIPERCoalescer.cc 4 files changed, 30 insertions(+), 15 deletions(-) diff --git a/src/gpu-compute/compute_unit.cc b/src/gpu-compute/compute_unit.cc index 920257d..2df3399 100644 --- a/src/gpu-compute/compute_unit.cc +++ b/src/gpu-compute/compute_unit.cc @@ -1376,7 +1376,10 @@ } } -delete pkt->senderState; +if (pkt->cmd != MemCmd::WriteResp) { +delete pkt->senderState; +} + delete pkt; } diff --git a/src/gpu-compute/global_memory_pipeline.cc b/src/gpu-compute/global_memory_pipeline.cc index 9fc515a..87bbe17 100644 --- a/src/gpu-compute/global_memory_pipeline.cc +++ b/src/gpu-compute/global_memory_pipeline.cc @@ -278,7 +278,9 @@ // if we are getting a response for this mem request, // then it ought to already be in the ordered response // buffer -assert(mem_req != gmOrderedRespBuffer.end()); +//assert(mem_req != gmOrderedRespBuffer.end()); +if (mem_req == gmOrderedRespBuffer.end()) +return; mem_req->second.second = true; } diff --git a/src/mem/ruby/system/RubyPort.cc b/src/mem/ruby/system/RubyPort.cc index 4510e3a..574fab0 100644 --- a/src/mem/ruby/system/RubyPort.cc +++ b/src/mem/ruby/system/RubyPort.cc @@ -539,7 +539,8 @@ } // Flush, acquire, release requests don't access physical memory -if (pkt->isFlush() || pkt->cmd == MemCmd::MemSyncReq) { +if (pkt->isFlush() || pkt->cmd == MemCmd::MemSyncReq +|| pkt->cmd == MemCmd::WriteCompleteResp) { accessPhysMem = false; } diff --git a/src/mem/ruby/system/VIPERCoalescer.cc b/src/mem/ruby/system/VIPERCoalescer.cc index eafce6d..f6e7a4d 100644 --- a/src/mem/ruby/system/VIPERCoalescer.cc +++ b/src/mem/ruby/system/VIPERCoalescer.cc @@ -243,19 +243,28 @@ assert(m_writeCompletePktMap.count(key) == 1 && !m_writeCompletePktMap[key].empty()); -for (auto writeCompletePkt : m_writeCompletePktMap[key]) { -if 
(makeLineAddress(writeCompletePkt->getAddr()) == addr) { -RubyPort::SenderState *ss = -safe_cast -(writeCompletePkt->senderState); -MemSlavePort *port = ss->port; -assert(port != NULL); +m_writeCompletePktMap[key].erase( +std::remove_if( +m_writeCompletePktMap[key].begin(), +m_writeCompletePktMap[key].end(), +[addr](PacketPtr writeCompletePkt) -> bool { +if (makeLineAddress(writeCompletePkt->getAddr()) == addr) { +RubyPort::SenderState *ss = +safe_cast +(writeCompletePkt->senderState); +MemSlavePort *port = ss->port; +assert(port != NULL); -wr
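The VIPERCoalescer hunk in this change swaps a plain loop for the erase/remove_if idiom so that completed packets actually leave the map entry. A minimal, self-contained sketch of that idiom follows. `Pkt`, `kLineBytes`, `lineAddress` and `completeLine` are hypothetical stand-ins, not gem5 types or functions, and the 64-byte line size is an assumption made only for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a packet; only the address matters here.
struct Pkt {
    std::uint64_t addr;
};

// Assumed cache-line size, used only for this illustration.
constexpr std::uint64_t kLineBytes = 64;

// Round an address down to the start of its cache line.
std::uint64_t lineAddress(std::uint64_t addr)
{
    return addr & ~(kLineBytes - 1);
}

// Remove every pending packet that falls on the given line and
// return how many were removed.
std::size_t completeLine(std::vector<Pkt> &pending, std::uint64_t line)
{
    const std::size_t old_size = pending.size();
    pending.erase(
        std::remove_if(pending.begin(), pending.end(),
                       [line](const Pkt &p) {
                           return lineAddress(p.addr) == line;
                       }),
        pending.end());
    return old_size - pending.size();
}
```

Unlike the patch, this sketch keeps the predicate free of side effects and only reports a count; the patch performs the per-packet completion work inside the lambda.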
[gem5-dev] Change in gem5/gem5[develop]: misc: Fix db_offset calculation
Kyle Roarty has submitted this change. ( https://gem5-review.googlesource.com/c/public/gem5/+/32678 ) Change subject: misc: Fix db_offset calculation .. misc: Fix db_offset calculation db_offset used to be calculated through pointer arithmetic. Pointer arithmetic increments the address by the size of the data type the pointer is pointing at. In the previous db_offset calculation, that was a uint32_t, which means the input was multiplied by 4, which is sizeof(uint32_t) This patch multiplies the input value by sizeof(uint32_t) before assigning it to db_offset. Change-Id: I9042560303ae6b8b1054b98e9a16a9da27843bb2 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/32678 Reviewed-by: Matt Sinclair Reviewed-by: Alexandru Duțu Maintainer: Matt Sinclair Tested-by: kokoro --- M src/dev/hsa/hw_scheduler.cc 1 file changed, 2 insertions(+), 2 deletions(-) Approvals: Alexandru Duțu: Looks good to me, approved Matt Sinclair: Looks good to me, but someone else must approve; Looks good to me, approved kokoro: Regressions pass diff --git a/src/dev/hsa/hw_scheduler.cc b/src/dev/hsa/hw_scheduler.cc index 8523be9..f25839d 100644 --- a/src/dev/hsa/hw_scheduler.cc +++ b/src/dev/hsa/hw_scheduler.cc @@ -95,7 +95,7 @@ // #define VOID_PTR_ADD32(ptr,n) // (void*)((uint32_t*)(ptr) + n)/*ptr + offset*/ // (Addr)VOID_PTR_ADD32(0, queue_id) -Addr db_offset = queue_id; +Addr db_offset = sizeof(uint32_t)*queue_id; if (dbMap.find(db_offset) != dbMap.end()) { panic("Creating an already existing queue (queueID %d)", queue_id); } @@ -346,7 +346,7 @@ // #define VOID_PTR_ADD32(ptr,n) // (void*)((uint32_t*)(ptr) + n)/*ptr + offset*/ // (Addr)VOID_PTR_ADD32(0, queue_id) -Addr db_offset = queue_id; +Addr db_offset = sizeof(uint32_t)*queue_id; auto dbmap_iter = dbMap.find(db_offset); if (dbmap_iter == dbMap.end()) { panic("Destroying a non-existing queue (db_offset %x)", -- To view, visit https://gem5-review.googlesource.com/c/public/gem5/+/32678 To unsubscribe, or for help writing mail 
filters, visit https://gem5-review.googlesource.com/settings Gerrit-Project: public/gem5 Gerrit-Branch: develop Gerrit-Change-Id: I9042560303ae6b8b1054b98e9a16a9da27843bb2 Gerrit-Change-Number: 32678 Gerrit-PatchSet: 3 Gerrit-Owner: Kyle Roarty Gerrit-Reviewer: Alexandru Duțu Gerrit-Reviewer: Anthony Gutierrez Gerrit-Reviewer: Jason Lowe-Power Gerrit-Reviewer: Kyle Roarty Gerrit-Reviewer: Matt Sinclair Gerrit-Reviewer: kokoro Gerrit-CC: Bobby R. Bruce Gerrit-CC: Bradford Beckmann Gerrit-MessageType: merged
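The db_offset commit message rests on one fact about C++ pointer arithmetic: advancing a uint32_t pointer by n moves the address by n * sizeof(uint32_t) bytes. A small sketch of that equivalence is below. The `Addr` alias, the `doorbells` array and both function names are assumptions for illustration and are not taken from hw_scheduler.cc; the sketch indexes a real array instead of offsetting a null pointer as the commented-out macro did.

```cpp
#include <cstddef>
#include <cstdint>

// Stand-in for gem5's Addr type (assumption for this sketch).
using Addr = std::uint64_t;

// Byte offset computed the "pointer arithmetic" way: step a uint32_t
// pointer by queue_id elements and measure the distance in bytes.
// queue_id must not exceed the array length (0x1000) for the pointer
// arithmetic to be valid.
Addr doorbellOffsetViaPointer(std::uint32_t queue_id)
{
    static std::uint32_t doorbells[0x1000];
    auto *base = reinterpret_cast<unsigned char *>(doorbells);
    auto *slot = reinterpret_cast<unsigned char *>(doorbells + queue_id);
    return static_cast<Addr>(slot - base);
}

// The same offset computed explicitly, in the style of the patched line.
Addr doorbellOffsetExplicit(std::uint32_t queue_id)
{
    return sizeof(std::uint32_t) * static_cast<Addr>(queue_id);
}
```

Both functions agree for any valid index, which is why the unscaled `db_offset = queue_id` was off by a factor of four.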
[gem5-dev] Change in gem5/gem5[develop]: configs: set hsaTopology properties from options
Kyle Roarty has submitted this change. ( https://gem5-review.googlesource.com/c/public/gem5/+/31995 ) Change subject: configs: set hsaTopology properties from options .. configs: set hsaTopology properties from options This change sets the properties in hsaTopology to the proper values specified by the user through command-line arguments. This ensures that if the properties file is read by a program, it will return the correct values for the simulated hardware. This change also adds in a command-line argument for the lds size, as it was the only other property used in hsaTopology that didn't have a command-line argument. The default value (65536) is taken from src/gpu-compute/LdsState.py Change-Id: I17bb812491708f4221c39b738c906f1ad944614d Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/31995 Reviewed-by: Matt Sinclair Reviewed-by: Alexandru Duțu Reviewed-by: Anthony Gutierrez Maintainer: Matt Sinclair Maintainer: Anthony Gutierrez Tested-by: kokoro --- M configs/example/apu_se.py M configs/example/hsaTopology.py 2 files changed, 32 insertions(+), 26 deletions(-) Approvals: Alexandru Duțu: Looks good to me, approved Anthony Gutierrez: Looks good to me, approved; Looks good to me, approved Matt Sinclair: Looks good to me, but someone else must approve; Looks good to me, approved kokoro: Regressions pass diff --git a/configs/example/apu_se.py b/configs/example/apu_se.py index 59dd4c5..03418c3 100644 --- a/configs/example/apu_se.py +++ b/configs/example/apu_se.py @@ -174,6 +174,8 @@ help="number of physical banks per LDS module") parser.add_option("--ldsBankConflictPenalty", type="int", default=1, help="number of cycles per LDS bank conflict") +parser.add_options("--lds-size", type="int", default=65536, + help="Size of the LDS in bytes") parser.add_option('--fast-forward-pseudo-op', action='store_true', help = 'fast forward using kvm until the m5_switchcpu' ' pseudo-op is encountered, then switch cpus. 
subsequent' @@ -290,7 +292,8 @@ localDataStore = \ LdsState(banks = options.numLdsBanks, bankConflictPenalty = \ - options.ldsBankConflictPenalty))) + options.ldsBankConflictPenalty, + size = options.lds_size))) wavefronts = [] vrfs = [] vrf_pool_mgrs = [] diff --git a/configs/example/hsaTopology.py b/configs/example/hsaTopology.py index df24223..707a83d 100644 --- a/configs/example/hsaTopology.py +++ b/configs/example/hsaTopology.py @@ -36,6 +36,7 @@ from os.path import join as joinpath from os.path import isdir from shutil import rmtree, copyfile +from m5.util.convert import toFrequency def file_append(path, contents): with open(joinpath(*path), 'a') as f: @@ -76,30 +77,32 @@ # populate global node properties # NOTE: SIMD count triggers a valid GPU agent creation -# TODO: Really need to parse these from options -node_prop = 'cpu_cores_count %s\n' % options.num_cpus + \ -'simd_count 32\n' + \ -'mem_banks_count 0\n' + \ -'caches_count 0\n' + \ -'io_links_count 0\n'+ \ -'cpu_core_id_base 16\n' + \ -'simd_id_base 2147483648\n' + \ -'max_waves_per_simd 40\n' + \ -'lds_size_in_kb 64\n' + \ -'gds_size_in_kb 0\n'+ \ -'wave_front_size 64\n' + \ -'array_count 1\n' + \ -'simd_arrays_per_engine 1\n'+ \ -'cu_per_simd_array 10\n'+ \ -'simd_per_cu 4\n' + \ -'max_slots_scratch_cu 32\n' + \ -'vendor_id 4098\n' + \ -'device_id 39028\n' + \ -'location_id 8\n' + \ -'max_engine_clk_fcompute 800\n' + \ -'local_mem_size 0\n'+ \ -'fw_version 699\n' + \ -'capability 4738\n' + \ -'max_engine_clk_ccompute 2100\n' +node_prop = 'cpu_cores_count %s\n' % options.num_cpus + \ +'simd_count %s\n' \ +
[gem5-dev] Change in gem5/gem5[develop]: configs: Add parameter for GPU scalar cache mandatory queue size
Kyle Roarty has submitted this change. ( https://gem5-review.googlesource.com/c/public/gem5/+/32676 ) Change subject: configs: Add parameter for GPU scalar cache mandatory queue size .. configs: Add parameter for GPU scalar cache mandatory queue size There was a missing option (--buffers-size) used to set the mandatory queue size for the scalar controllers. This patch renames the option to be more clear, and adds it to the argument parser. Default of 128 taken from the implementation on the GCN staging branch Change-Id: I58b6b57be07498cdf6e39c0bb85982674ec4caa6 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/32676 Reviewed-by: Matt Sinclair Maintainer: Anthony Gutierrez Tested-by: kokoro --- M configs/ruby/GPU_VIPER.py 1 file changed, 4 insertions(+), 1 deletion(-) Approvals: Matt Sinclair: Looks good to me, approved Anthony Gutierrez: Looks good to me, approved kokoro: Regressions pass diff --git a/configs/ruby/GPU_VIPER.py b/configs/ruby/GPU_VIPER.py index 50ccd2b..6a6dec5 100644 --- a/configs/ruby/GPU_VIPER.py +++ b/configs/ruby/GPU_VIPER.py @@ -399,6 +399,9 @@ parser.add_option("--noL1", action = "store_true", default = False, help = "bypassL1") +parser.add_option("--scalar-buffer-size", type = 'int', default = 128, + help="Size of the mandatory queue in the GPU scalar " + "cache controller") def create_system(options, full_system, system, dma_devices, bootmem, ruby_system): @@ -676,7 +679,7 @@ scalar_cntrl.responseToSQC.slave = ruby_system.network.master scalar_cntrl.mandatoryQueue = \ -MessageBuffer(buffer_size=options.buffers_size) +MessageBuffer(buffer_size=options.scalar_buffer_size) gpuCluster.add(scalar_cntrl) -- To view, visit https://gem5-review.googlesource.com/c/public/gem5/+/32676 To unsubscribe, or for help writing mail filters, visit https://gem5-review.googlesource.com/settings Gerrit-Project: public/gem5 Gerrit-Branch: develop Gerrit-Change-Id: I58b6b57be07498cdf6e39c0bb85982674ec4caa6 Gerrit-Change-Number: 32676 
Gerrit-PatchSet: 6 Gerrit-Owner: Kyle Roarty Gerrit-Reviewer: Anthony Gutierrez Gerrit-Reviewer: Bradford Beckmann Gerrit-Reviewer: Jason Lowe-Power Gerrit-Reviewer: Kyle Roarty Gerrit-Reviewer: Matt Sinclair Gerrit-Reviewer: kokoro Gerrit-MessageType: merged
[gem5-dev] Change in gem5/gem5[develop]: util: Install scons 3.1 from pip in gcn-gpu dockerfile
Kyle Roarty has uploaded this change for review. ( https://gem5-review.googlesource.com/c/public/gem5/+/34075 ) Change subject: util: Install scons 3.1 from pip in gcn-gpu dockerfile .. util: Install scons 3.1 from pip in gcn-gpu dockerfile A previous commit updated the minimum required version of scons to 3.0 The gcn Dockerfile previously installed scons from apt, which installed scons 2.4, as the Dockerfile is based on Ubuntu 16 This patch installs scons through pip, which installs scons 3.1 Change-Id: I4f731b301f97e25c730df26afde20ae1cdfaa1b3 --- M util/dockerfiles/gcn-gpu/Dockerfile 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/util/dockerfiles/gcn-gpu/Dockerfile b/util/dockerfiles/gcn-gpu/Dockerfile index 4c17b42..065dad6 100644 --- a/util/dockerfiles/gcn-gpu/Dockerfile +++ b/util/dockerfiles/gcn-gpu/Dockerfile @@ -13,7 +13,6 @@ git \ ca-certificates \ m4 \ -scons \ zlib1g \ zlib1g-dev \ libprotobuf-dev \ @@ -24,6 +23,7 @@ python \ python-yaml \ python-six \ +python-pip \ wget \ libpci3 \ libelf1 \ @@ -37,6 +37,9 @@ libpng12-dev \ libelf-dev +RUN python -m pip install -U pip && \ +python -m pip install -U setuptools scons + ARG gem5_dist=http://dist.gem5.org/dist/develop # Install ROCm 1.6 binaries -- To view, visit https://gem5-review.googlesource.com/c/public/gem5/+/34075 To unsubscribe, or for help writing mail filters, visit https://gem5-review.googlesource.com/settings Gerrit-Project: public/gem5 Gerrit-Branch: develop Gerrit-Change-Id: I4f731b301f97e25c730df26afde20ae1cdfaa1b3 Gerrit-Change-Number: 34075 Gerrit-PatchSet: 1 Gerrit-Owner: Kyle Roarty Gerrit-MessageType: newchange
[gem5-dev] Change in gem5/gem5[develop]: util: Install scons 3.1 from pip in gcn-gpu dockerfile
Kyle Roarty has submitted this change. ( https://gem5-review.googlesource.com/c/public/gem5/+/34075 ) Change subject: util: Install scons 3.1 from pip in gcn-gpu dockerfile .. util: Install scons 3.1 from pip in gcn-gpu dockerfile A previous commit updated the minimum required version of scons to 3.0 The gcn Dockerfile previously installed scons from apt, which installed scons 2.4, as the Dockerfile is based on Ubuntu 16 This patch installs scons through pip, which installs scons 3.1 Change-Id: I4f731b301f97e25c730df26afde20ae1cdfaa1b3 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/34075 Reviewed-by: Matt Sinclair Reviewed-by: Daniel Gerzhoy Reviewed-by: Jason Lowe-Power Reviewed-by: Matthew Poremba Maintainer: Matt Sinclair Maintainer: Jason Lowe-Power Tested-by: kokoro --- M util/dockerfiles/gcn-gpu/Dockerfile 1 file changed, 4 insertions(+), 1 deletion(-) Approvals: Jason Lowe-Power: Looks good to me, approved; Looks good to me, approved Matthew Poremba: Looks good to me, approved Matt Sinclair: Looks good to me, but someone else must approve; Looks good to me, approved Daniel Gerzhoy: Looks good to me, approved kokoro: Regressions pass diff --git a/util/dockerfiles/gcn-gpu/Dockerfile b/util/dockerfiles/gcn-gpu/Dockerfile index 4c17b42..065dad6 100644 --- a/util/dockerfiles/gcn-gpu/Dockerfile +++ b/util/dockerfiles/gcn-gpu/Dockerfile @@ -13,7 +13,6 @@ git \ ca-certificates \ m4 \ -scons \ zlib1g \ zlib1g-dev \ libprotobuf-dev \ @@ -24,6 +23,7 @@ python \ python-yaml \ python-six \ +python-pip \ wget \ libpci3 \ libelf1 \ @@ -37,6 +37,9 @@ libpng12-dev \ libelf-dev +RUN python -m pip install -U pip && \ +python -m pip install -U setuptools scons + ARG gem5_dist=http://dist.gem5.org/dist/develop # Install ROCm 1.6 binaries -- To view, visit https://gem5-review.googlesource.com/c/public/gem5/+/34075 To unsubscribe, or for help writing mail filters, visit https://gem5-review.googlesource.com/settings Gerrit-Project: public/gem5 Gerrit-Branch: 
develop Gerrit-Change-Id: I4f731b301f97e25c730df26afde20ae1cdfaa1b3 Gerrit-Change-Number: 34075 Gerrit-PatchSet: 2 Gerrit-Owner: Kyle Roarty Gerrit-Reviewer: Bobby R. Bruce Gerrit-Reviewer: Daniel Gerzhoy Gerrit-Reviewer: Jason Lowe-Power Gerrit-Reviewer: Jason Lowe-Power Gerrit-Reviewer: Kyle Roarty Gerrit-Reviewer: Matt Sinclair Gerrit-Reviewer: Matthew Poremba Gerrit-Reviewer: kokoro Gerrit-MessageType: merged
[gem5-dev] Change in gem5/gem5[develop]: gpu-compute: Fix deadlock in fetch_unit after branch instruction
Kyle Roarty has uploaded this change for review. ( https://gem5-review.googlesource.com/c/public/gem5/+/34555 ) Change subject: gpu-compute: Fix deadlock in fetch_unit after branch instruction .. gpu-compute: Fix deadlock in fetch_unit after branch instruction The following deadlock was occurring in fetch_unit with timingSim: 1. exec() is called, a wave is ready to fetch, so it sets pendingFetch 2. A packet is sent to ITLB to fetch for that wave 3. The wave executes a branch, causing the fetch buffer to be cleared 4. The packet is handled, and fetch() is called. However, because the fetch buffer was cleared, it returns doing nothing. 5. exec() gets called again, but the wave will never be scheduled to fetch, as pendingFetch is still set to true. This patch clears pendingFetch (and dropFetch) before returning when the instruction buffer has been cleared in fetch(). dropFetch also needed to be cleared; otherwise gem5 would crash. Change-Id: Iccbac7defc4849c19e8b17aa2492da641defb772 --- M src/gpu-compute/fetch_unit.cc 1 file changed, 2 insertions(+), 0 deletions(-) diff --git a/src/gpu-compute/fetch_unit.cc b/src/gpu-compute/fetch_unit.cc index 4e4259e..098b783 100644 --- a/src/gpu-compute/fetch_unit.cc +++ b/src/gpu-compute/fetch_unit.cc @@ -240,6 +240,8 @@ * pending, in the same cycle another instruction is trying to fetch. 
*/ if (!fetchBuf.at(wavefront->wfSlotId).isReserved(pkt->req->getVaddr())) { +wavefront->dropFetch = false; +wavefront->pendingFetch = false; return; } -- To view, visit https://gem5-review.googlesource.com/c/public/gem5/+/34555 To unsubscribe, or for help writing mail filters, visit https://gem5-review.googlesource.com/settings Gerrit-Project: public/gem5 Gerrit-Branch: develop Gerrit-Change-Id: Iccbac7defc4849c19e8b17aa2492da641defb772 Gerrit-Change-Number: 34555 Gerrit-PatchSet: 1 Gerrit-Owner: Kyle Roarty Gerrit-MessageType: newchange
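The five-step deadlock described in the fetch_unit change can be replayed with a toy state machine. The sketch below is a deliberately simplified model, not gem5 code: `Wave`, `tryScheduleFetch`, `takeBranch` and `handleFetchResponse` are hypothetical names, and `bufferReserved` stands in for the fetch buffer's isReserved() check.

```cpp
// Toy model of one wavefront's fetch state (all names are assumptions).
struct Wave {
    bool pendingFetch = false;    // a fetch request is outstanding
    bool dropFetch = false;       // the outstanding fetch should be discarded
    bool bufferReserved = false;  // stand-in for isReserved(vaddr)
};

// Models exec(): a fetch is only scheduled when none is pending.
// Returns true if a new fetch was issued.
bool tryScheduleFetch(Wave &w)
{
    if (w.pendingFetch)
        return false;
    w.pendingFetch = true;
    w.bufferReserved = true;
    return true;
}

// Models a taken branch: the fetch buffer is cleared while the
// request is still in flight.
void takeBranch(Wave &w)
{
    w.bufferReserved = false;
}

// Models fetch(): handles the returning packet. With clearFlagsOnDrop
// set, the early-return path resets the flags as the patch does.
void handleFetchResponse(Wave &w, bool clearFlagsOnDrop)
{
    if (!w.bufferReserved) {
        if (clearFlagsOnDrop) {
            w.dropFetch = false;
            w.pendingFetch = false;
        }
        return;
    }
    w.pendingFetch = false;
    w.bufferReserved = false;
}
```

Calling tryScheduleFetch, takeBranch and then handleFetchResponse with `clearFlagsOnDrop` set to false leaves `pendingFetch` stuck at true, so the next tryScheduleFetch fails; with it set to true the wave can fetch again.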
[gem5-dev] Change in gem5/gem5[develop]: gpu-compute: Fix DPRINTFs causing segfault in wavefront.cc
Kyle Roarty has uploaded this change for review. ( https://gem5-review.googlesource.com/c/public/gem5/+/34677 ) Change subject: gpu-compute: Fix DPRINTFs causing segfault in wavefront.cc .. gpu-compute: Fix DPRINTFs causing segfault in wavefront.cc In two DPRINTFs, the value of kernarg_addr was being cast to a pointer and dereferenced, instead of the variable's own storage being read. Fixed by taking the address of the variable before the cast. Change-Id: Id5d1863942848dd7a9e5e17e8180c33adbc72f15 --- M src/gpu-compute/wavefront.cc 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/src/gpu-compute/wavefront.cc b/src/gpu-compute/wavefront.cc index 0e737db..c659873 100644 --- a/src/gpu-compute/wavefront.cc +++ b/src/gpu-compute/wavefront.cc @@ -311,7 +311,7 @@ "Setting KernargSegPtr: s[%d] = %x\n", computeUnit->cu_id, simdId, wfSlotId, wfDynId, physSgprIdx, - ((uint32_t*)kernarg_addr)[0]); + ((uint32_t*)&kernarg_addr)[0]); physSgprIdx = computeUnit->registerManager->mapSgpr(this, regInitIdx); @@ -321,7 +321,7 @@ "Setting KernargSegPtr: s[%d] = %x\n", computeUnit->cu_id, simdId, wfSlotId, wfDynId, physSgprIdx, - ((uint32_t*)kernarg_addr)[1]); + ((uint32_t*)&kernarg_addr)[1]); ++regInitIdx; break; -- To view, visit https://gem5-review.googlesource.com/c/public/gem5/+/34677 To unsubscribe, or for help writing mail filters, visit https://gem5-review.googlesource.com/settings Gerrit-Project: public/gem5 Gerrit-Branch: develop Gerrit-Change-Id: Id5d1863942848dd7a9e5e17e8180c33adbc72f15 Gerrit-Change-Number: 34677 Gerrit-PatchSet: 1 Gerrit-Owner: Kyle Roarty Gerrit-MessageType: newchange
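The wavefront.cc fix prints the two 32-bit halves of a 64-bit address by reading the variable's own bytes. As a hedged aside, the same halves can be obtained without any pointer cast. The helper names below are hypothetical and do not appear in gem5; `wordInMemoryOrder` mirrors what indexing through `&kernarg_addr` reads, whose order depends on host endianness.

```cpp
#include <cstdint>
#include <cstring>

// Lower 32 bits of a 64-bit value, independent of host byte order.
std::uint32_t lowWord(std::uint64_t value)
{
    return static_cast<std::uint32_t>(value & 0xffffffffu);
}

// Upper 32 bits of a 64-bit value, independent of host byte order.
std::uint32_t highWord(std::uint64_t value)
{
    return static_cast<std::uint32_t>(value >> 32);
}

// The idx-th 32-bit word of the value as laid out in memory. On a
// little-endian host idx 0 is the low word; on a big-endian host it
// is the high word. idx must be 0 or 1.
std::uint32_t wordInMemoryOrder(std::uint64_t value, unsigned idx)
{
    std::uint32_t words[2];
    std::memcpy(words, &value, sizeof(words));
    return words[idx];
}
```

The shift-based helpers avoid both the original bug (treating the value as an address) and any dependence on byte order, at the cost of no longer mirroring the in-memory layout.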