> -----Original Message-----
> From: Gabe Black <[email protected]>
> Sent: 27 February 2021 05:47
> To: Giacomo Travaglini <[email protected]>
> Cc: gem5 Developer List <[email protected]>
> Subject: Re: [gem5-dev] vector register indexing modes and renaming?
>
> Another question/clarification:
>
> Does any data actually get shared between the two rename modes? I think you
> said there is not, but now I can't find that.

Data *do* get shared, even though in gem5 we have separate physical registers.
In fact, when the rename mode changes [1], (meta)data is copied from one
register file to the other.

For example, say we have an AArch64 kernel running at EL1 and my AArch32
(basically armv7) floating point application running at EL0. My application
will be using vector elements; however, every time there is an exception to
AArch64, the CPU will switch rename mode and data will be copied / the mapping
will be adjusted. Any FP & SIMD operation at this point will use vector
registers. When the kernel finishes its work and goes back to AArch32, the
vector elements will be repopulated.

> Would it work just as well to have
> two register files which operate entirely independently?

As I mentioned before, they operate independently, but they sync up when we
pass from one mode to the other. Another way to look at it is that they are
mutually exclusive.

> From what I can tell
> the "V" registers of Neon in aarch64 overlap with the SVE registers, and the
> "Q" registers of armv7 Neon overlap with the "S", "D", "Q" registers of the
> same, but I think "V" and "Q" are independent? Maybe reused but not
> guaranteed to alias?
>

I would say the rule of thumb for understanding the AArch64-AArch32 mapping
(and it's the underlying cause of using different renaming modes) is to bear
in mind that AArch64, unlike AArch32, uses an unpacked approach for FP & SIMD
registers.

Prior to Armv8, smaller FP registers were packed into bigger registers [2].
Having, for example, 32 double precision registers (D0-D31) meant having a
maximum of 16 quad word registers (Q0-Q15). This setup has been abandoned in
the AArch64 state of Armv8 [3]. As an example, S1 and D1 are no longer packed
in Q0; they are in fact the 32/64 least significant bits of Q1. This means the
newly added registers (V16-V31) are not accessible in AArch32. So, to answer
your question regarding V and Q.
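
Before the short answer, a rough sketch of the two layouts may help. This is
only the architectural register layout as I understand it, not gem5 code: the
helper names `aarch32_to_v` and `aarch64_to_v` are made up for illustration,
and each returns the V register index, the lowest bit and the width of the
requested view.

```python
import re

# Bit widths of the S/D/Q register views.
WIDTH = {"S": 32, "D": 64, "Q": 128}
# Registers visible per view in AArch32 (assuming a 32 x D register bank).
AARCH32_COUNT = {"S": 32, "D": 32, "Q": 16}
# AArch64 exposes 32 registers for every view.
AARCH64_COUNT = 32


def _parse(name):
    """Split a name such as 'D1' into ('D', 1)."""
    m = re.fullmatch(r"([SDQ])(\d+)", name.upper())
    if m is None:
        raise ValueError(f"not an S/D/Q register name: {name!r}")
    return m.group(1), int(m.group(2))


def aarch32_to_v(name):
    """Packed layout: 4 S, 2 D or 1 Q register share one 128-bit register."""
    kind, idx = _parse(name)
    if idx >= AARCH32_COUNT[kind]:
        raise ValueError(f"{name} does not exist in AArch32")
    width = WIDTH[kind]
    per_q = 128 // width
    return idx // per_q, (idx % per_q) * width, width


def aarch64_to_v(name):
    """Unpacked layout: Sn/Dn/Qn are the least significant bits of Vn."""
    kind, idx = _parse(name)
    if idx >= AARCH64_COUNT:
        raise ValueError(f"{name} does not exist in AArch64")
    return idx, 0, WIDTH[kind]


if __name__ == "__main__":
    for reg in ("S1", "D1", "Q1", "D31"):
        print(reg, "AArch32:", aarch32_to_v(reg), "AArch64:", aarch64_to_v(reg))
```

With this sketch, `aarch32_to_v("D1")` lands in the upper half of V0, while
`aarch64_to_v("D1")` lands in the lower half of V1.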

Until Q/V15, they alias perfectly; V16-V31 are simply not defined/accessible
in AArch32, so they are not aliased.

All AArch32 SIMD data is accessible from AArch64. It just won't stick to the
same naming: AArch32 D1 and AArch64 D1 hold different data. If I really wanted
to access AArch32 D1 from AArch64, I would have to read the 64 MSBs of V0.
This is a software and not a hardware problem (I just posted this example to
stress the difference between aliasing and reachability).

> BTW, test cases would be very helpful if possible. I've made good progress
> cleaning away debris and am getting to the point where I'll want to make
> changes which I'm a lot less comfortable making blind.
>
> Gabe
>
> On Thu, Feb 25, 2021 at 10:40 PM Gabe Black <[email protected]
> <mailto:[email protected]> > wrote:
>
>
> I will ask within Arm if there's something we can provide to you.
> In the meantime I gave a quick look at NEON enabled libraries [1]; the
> Ne10 library provides a set of functions optimized for NEON and a set
> of examples making use of it [2] (e.g FIR filter, GEMM etc etc).
>
> You could probably cross-compile those examples and use them in SE mode
> (recommending to use the O3 model)
>
>
> Ok, thanks, I'll take a look. This might even be something we
> want in the testing infrastructure? I might look into that when I have a
> chance.
>
>
> I took a look at this, and unfortunately I don't think I can use it. The
> example only builds for armv7 and not aarch64, and when I tried to build it
> for armv7 I get a bunch of compiler errors. Do you have any other
> suggestions?

Richard kindly pointed me to the following SVE tutorial:

https://gitlab.com/arm-hpc/training/arm-sve-tools

But I believe it is worth noting that we are actually interested in testing
armv7 (AArch32) SIMD as well, so that probably won't be enough.
I will dig more, and I will keep you posted.

>
> Gabe

Kind Regards

Giacomo

[1]: https://github.com/gem5/gem5/blob/stable/src/cpu/o3/rename_map.cc#L173
[2]: https://developer.arm.com/documentation/den0024/a/ARMv8-Registers/NEON-and-floating-point-registers/NEON-in-AArch32-execution-state-
[3]: https://developer.arm.com/documentation/den0024/a/ARMv8-Registers/NEON-and-floating-point-registers/Floating-point-register-organization-in-AArch64
