[gem5-users] Re: gem5 : X86 + GCN3 (gfx801) + test_fwd_lrn

2022-04-06 Thread David Fong via gem5-users

[gem5-users] Re: gem5 : X86 + GCN3 (gfx801) + test_fwd_lrn

2022-04-06 Thread Bharadwaj, Srikant via gem5-users

[gem5-users] Re: gem5 : X86 + GCN3 (gfx801) + test_fwd_lrn

2022-04-06 Thread David Fong via gem5-users

[gem5-users] Re: gem5 : X86 + GCN3 (gfx801) + test_fwd_lrn

2022-04-01 Thread David Fong via gem5-users

[gem5-users] Re: gem5 : X86 + GCN3 (gfx801) + test_fwd_lrn

2022-03-30 Thread Bharadwaj, Srikant via gem5-users
Hi Matt P, Any feedback for my question below regarding stats (stats.txt) to chec

[gem5-users] Re: gem5 : X86 + GCN3 (gfx801) + test_fwd_lrn

2022-03-30 Thread David Fong via gem5-users
Hi Matt P, Any feedback for my question below regarding stats (stats.txt) to check for overall improvements due to reduced latency? Thanks

[gem5-users] Re: gem5 : X86 + GCN3 (gfx801) + test_fwd_lrn

2022-03-23 Thread David Fong via gem5-users
Hi Matt P, When I tried --reg-alloc-policy=dynamic, a few runs did not improve and in fact got worse. For now, I will not use this option. Maybe the driver is not optimizing

[gem5-users] Re: gem5 : X86 + GCN3 (gfx801) + test_fwd_lrn

2022-03-21 Thread David Fong via gem5-users
Hi Matt P, When I tried --reg-alloc-policy=dynamic, a few runs did not improve and in fact got worse. For now, I will not use this option. Maybe the driver is not optimizing for this release. I did update my runs to use --gpu-to-dir-latency 100 (instead of 120) --TCC_latency 12 (instead of

[gem5-users] Re: gem5 : X86 + GCN3 (gfx801) + test_fwd_lrn

2022-03-17 Thread Poremba, Matthew via gem5-users
These would be valid for both as they both use the same cache protocol files. I'm not very familiar with how dGPU is hacked up in SE mode to look like a dGPU... -Matt

[gem5-users] Re: gem5 : X86 + GCN3 (gfx801) + test_fwd_lrn

2022-03-17 Thread David Fong via gem5-users
Hi Matt P, Thanks for the tip on latency parameters. Are these parameters valid ONLY for dGPU with VRAM, or do they apply to both dGPU and APU? David

[gem5-users] Re: gem5 : X86 + GCN3 (gfx801) + test_fwd_lrn

2022-03-17 Thread Poremba, Matthew via gem5-users
Hi David, I don't think these are the parameters you want to be changing if you are trying to change the VRAM memory latency, which it seems like you are, based on the GDDR5 comment. Those parameters are for the latency between CUs seeing a memory request and the

[gem5-users] Re: gem5 : X86 + GCN3 (gfx801) + test_fwd_lrn

2022-03-16 Thread Matt Sinclair via gem5-users
Matt P or Srikant: can you please help David with the latency question? You know the answers better than I do here. Matt

[gem5-users] Re: gem5 : X86 + GCN3 (gfx801) + test_fwd_lrn

2022-03-16 Thread David Fong via gem5-users
Hi Matt S, Thanks again for your quick reply with useful information. I will rerun with --reg-alloc-policy=dynamic in my mini regression to see if it makes a difference. As for LRN, I won't make modifications to lrn_config.dnnmark unless it's required to run additional DNN tests. The 4 tests :

[gem5-users] Re: gem5 : X86 + GCN3 (gfx801) + test_fwd_lrn

2022-03-15 Thread Matt Sinclair via gem5-users
Hi David, The dynamic register allocation policy allows the GPU to schedule as many wavefronts as there is register space on a CU. By default, the original register allocator released with this GPU model ("simple") only allowed 1 wavefront per CU at a time because the publicly available

[gem5-users] Re: gem5 : X86 + GCN3 (gfx801) + test_fwd_lrn

2022-03-15 Thread David Fong via gem5-users
Hi Matt S., Thanks for the detailed reply. I looked at the link you sent me for the weekly run. I see an additional parameter which I didn't use: --reg-alloc-policy=dynamic. What does this do? I was able to run the two other tests you use in your weekly runs: test_fwd_pool, test_bwd_bn for

[gem5-users] Re: gem5 : X86 + GCN3 (gfx801) + test_fwd_lrn

2022-03-14 Thread Matt Sinclair via gem5-users
Hi David, I have not seen this mmap error before, and my initial guess was the mmap error is happening because you are trying to allocate more memory than we created when mmap'ing the inputs for the applications (we do this to speed up SE mode, because otherwise initializing arrays can take