Hi Matt,

Thanks for the clarification.
My issue is that I need more than one CPU. In this scenario, what effect will this extra CPU have on the coherence traffic, i.e., does it become part of a Core Pair and take part in coherence exchanges? When I am placing CPUs in a Garnet topology, how do I ignore this particular CPU?

Regards,
Sampad

On Sat, Aug 8, 2020 at 1:08 AM Matt Sinclair via gem5-users <[email protected]> wrote:

> Hi Sampad,
>
> To literally answer the clone error part: this happens when your
> application needs multiple thread contexts to run. The failure happens
> when -n 1 is used because the simulator doesn't have enough thread contexts
> to fulfill what the application needs.
>
> Of course, the next logical question is why ROCm needs 2 thread contexts.
> I haven't looked at this specific behavior in several years, but when I dug
> into this in ~2018, I remember this happening because the ROCm stack was
> spawning a thread to check on some details about the system (e.g., it was
> checking if the HCC version was at least version X, because starting with
> that version, the HCC behavior was different). If you are interested in
> finding the exact call that does this, you can build a debug version of the
> ROCm stack and step through the ROCm stack with gdb while the simulator is
> running. Eventually you'll get to the instruction in the ROCm stack that
> is doing checks like the one I described above, and you could potentially
> remove that call and return true/false instead as appropriate for the check
> it's doing. This is what I did previously, although I don't think that
> ROCm patch has been merged into develop or the AMD staging branch yet
> (although like some of the other ROCm patches, it would actually need to be
> placed elsewhere like gem5-resources, not directly in the gem5 repo, since
> it doesn't affect gem5 code).
>
> Alternatively, you can just run with -n 2, as you've found already. It
> should have very minimal impact on running the application.
>
> Matt
>
> On Fri, Aug 7, 2020 at 11:46 PM Sampad Mohapatra via gem5-users <[email protected]> wrote:
>
>> Hi All,
>>
>> Why does the GCN3 model require at least 2 CPUs?
>> Every time I use a single CPU, gem5 crashes with the following error:
>> *fatal: clone: no spare thread context in system*
>>
>> In contrast, I was able to run the HSAIL model with a single CPU.
>>
>> Thank You,
>> Sampad Mohapatra
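
For anyone following the thread, the clone failure described in the quoted reply can be pictured with a small toy model. This is only a sketch and not gem5 code: `ToySystem`, `run_app`, and the rule of one thread context per CPU are names and assumptions invented for illustration, with `num_cpus` standing in for the `-n` option and a single helper thread standing in for the check that the ROCm stack reportedly spawns.

```python
class NoSpareThreadContext(RuntimeError):
    """Raised when a clone request finds every thread context in use."""


class ToySystem:
    """Hypothetical stand-in for a simulated system; not a gem5 class."""

    def __init__(self, num_cpus):
        # Assumption of this toy model: one thread context per CPU.
        self.contexts = [None] * num_cpus

    def _allocate(self, name):
        # Hand out the first unused thread context, if there is one.
        for index, owner in enumerate(self.contexts):
            if owner is None:
                self.contexts[index] = name
                return index
        raise NoSpareThreadContext("clone: no spare thread context in system")

    def start_process(self, name):
        # The application's main thread occupies one context.
        return self._allocate(name)

    def clone(self, name):
        # A spawned thread needs a spare context of its own.
        return self._allocate(name)


def run_app(num_cpus, helper_threads=1):
    """Start a main thread, then clone the given number of helper threads."""
    system = ToySystem(num_cpus)
    system.start_process("main")
    for i in range(helper_threads):
        system.clone(f"helper-{i}")
    return system.contexts


if __name__ == "__main__":
    print(run_app(2))  # both contexts get an owner
    try:
        run_app(1)     # the helper thread has nowhere to run
    except NoSpareThreadContext as err:
        print("fatal:", err)
```

In this model the second CPU exists only so that the helper thread has a context to run on. Whether that CPU also generates coherence traffic in the real simulator is exactly the open question above, and the sketch does not answer it.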
_______________________________________________
gem5-users mailing list -- [email protected]
To unsubscribe send an email to [email protected]
