On Mon, Mar 1, 2021 at 6:48 AM Giacomo Travaglini <[email protected]> wrote:
>
> > -----Original Message-----
> > From: Gabe Black <[email protected]>
> > Sent: 27 February 2021 05:47
> > To: Giacomo Travaglini <[email protected]>
> > Cc: gem5 Developer List <[email protected]>
> > Subject: Re: [gem5-dev] vector register indexing modes and renaming?
> >
> > Another question/clarification:
> >
> > Does any data actually get shared between the two rename modes? I think
> > you said there is not, but now I can't find that.
>
> Data *do* get shared, even if in gem5 we have separate physical registers.
> In fact, when the rename mode changes [1], (meta)data is copied from one
> register file to the other.
> For example, say we have an AArch64 kernel running at EL1 and my AArch32
> (basically armv7) floating point application running at EL0.
>
> My application will be using vector elements; however, every time there is
> an exception to AArch64, the CPU will switch rename mode and data will be
> copied / the mapping will be adjusted. Any FP & SIMD operation at this
> point will use vector registers.
> When the kernel finishes its stuff and goes back to AArch32, vector
> elements will be repopulated.

Ok, I thought that was what you said, and I couldn't think of another reason
to go through all the trouble of copying things around.

> > Would it work just as well to have
> > two register files which operate entirely independently?
>
> As I mentioned before, they operate independently, but they sync up when
> we pass from one mode to the other. Another way to look at it is that they
> are mutually exclusive.

Would it make sense to trigger the syncing between them explicitly from ARM
code, rather than forcing the O3 to notice and do the copying? Then the
copying, etc, wouldn't have to be generic, since it would be triggered by an
ARM architectural mechanism.

> > From what I can tell
> > the "V" registers of Neon in aarch64 overlap with the SVE registers, and
> > the "Q" registers of armv7 Neon overlap with the "S", "D", "Q" registers
> > of the same, but I think "V" and "Q" are independent? Maybe reused but
> > not guaranteed to alias?
>
> I would say the rule of thumb for understanding the AArch64-AArch32
> mapping (and the underlying cause of using different renaming modes) is to
> bear in mind that AArch64, unlike AArch32, uses an unpacked approach for
> FP & SIMD registers.
> Prior to Armv8, smaller FP registers were packed into bigger registers
> [2]. Having, for example, 32 double precision registers (D0-D31) meant
> having a maximum of 16 quad word registers (Q0-Q15).
> This setup has been abandoned in Armv8 [3]. As an example, in AArch64 S1
> and D1 are no longer packed into Q0. They are in fact the 32/64 least
> significant bits of Q1.
> This means the newly added registers (V16-V31) are not accessible in
> AArch32.
>
> So to answer your question regarding V and Q: up to Q/V15 they alias
> perfectly; V16-V31 are simply not defined/accessible in AArch32, so they
> are not aliased.
>
> All AArch32 SIMD data is accessible from AArch64. It just won't stick to
> the same naming. AArch32 D1 and AArch64 D1 hold different data.
> If I really wanted to access AArch32 D1 from AArch64, I would have to read
> the 64 MSBs of V0. This is a software and not a hardware problem (I just
> posted this example to stress the difference between aliasing and
> reachability).

Gotcha, makes sense.

> Richard kindly pointed me to the following SVE tutorial:
>
> https://gitlab.com/arm-hpc/training/arm-sve-tools
>
> But I believe it is worth noting we are actually interested in testing
> armv7 (AArch32) SIMD as well, so that probably won't be enough.
> I will dig more, and I will keep you posted.

Ok great, I'll take a look. Having *something* to test with will be a big leg
up, even if it isn't complete.
It would also be nice, although more complex, to be able to test the rename
mode switching mechanism somehow.

Gabe
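
For reference, the aliasing-versus-reachability distinction discussed above
can be sketched in a few lines of Python. This is not gem5 code; the function
names aarch32_location and aarch64_location are made up for illustration.
The sketch assumes only the rules stated in the thread: AArch32 packs S and D
registers into Q0-Q15, AArch64 names the low bits of V0-V31, and Qn and Vn
refer to the same 128 bits for n up to 15.

```python
from typing import Tuple

# Width in bits of each register view used in this thread.
WIDTH = {"S": 32, "D": 64, "Q": 128}

# Number of architecturally visible registers per view (illustrative tables).
AARCH32_COUNT = {"S": 32, "D": 32, "Q": 16}
AARCH64_COUNT = {"S": 32, "D": 32, "Q": 32}


def _parse(name: str, counts: dict) -> Tuple[str, int]:
    """Split a name such as 'D1' into ('D', 1) and range-check the index."""
    kind, index = name[0].upper(), int(name[1:])
    if kind not in counts or not 0 <= index < counts[kind]:
        raise ValueError(f"{name} is not accessible in this execution state")
    return kind, index


def aarch32_location(name: str) -> Tuple[int, int, int]:
    """Return (V register index, lowest bit, width) for an AArch32 name.

    Packed layout: several small registers share one 128-bit Q register.
    """
    kind, index = _parse(name, AARCH32_COUNT)
    width = WIDTH[kind]
    per_q = 128 // width  # 4 S or 2 D registers per Q register
    return index // per_q, (index % per_q) * width, width


def aarch64_location(name: str) -> Tuple[int, int, int]:
    """Return (V register index, lowest bit, width) for an AArch64 name.

    Unpacked layout: Sn/Dn/Qn are always the low bits of Vn.
    """
    kind, index = _parse(name, AARCH64_COUNT)
    return index, 0, WIDTH[kind]


if __name__ == "__main__":
    # Same name, different storage: AArch32 D1 is the upper half of V0,
    # while AArch64 D1 is the lower half of V1.
    print("AArch32 D1 ->", aarch32_location("D1"))
    print("AArch64 D1 ->", aarch64_location("D1"))
```

A rename mode switching test could use a table like this to decide which bits
of which V register to compare after switching from AArch32 to AArch64 and
back, although how gem5 lays out its physical registers is a separate
question that this sketch does not address.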
