> -----Original Message-----
> From: Gabe Black <[email protected]>
> Sent: 27 February 2021 05:47
> To: Giacomo Travaglini <[email protected]>
> Cc: gem5 Developer List <[email protected]>
> Subject: Re: [gem5-dev] vector register indexing modes and renaming?
>
> Another question/clarification:
>
> Does any data actually get shared between the two rename modes? I think you
> said there is not, but now I can't find that.

Data *do* get shared, even if in gem5 we have separate physical registers.
In fact, when rename mode changes [1], (meta)data is copied from one register 
file to the other.
For example, say we have an AArch64 kernel running at EL1 and my AArch32 
(basically armv7) floating point application running at EL0.

My application will be using vector elements; however, every time there is an
exception to AArch64, the CPU will switch rename mode and data will be copied /
the mapping will be adjusted. Any FP & SIMD operation at this point will use
vector registers.
When the kernel finishes its work and returns to AArch32, vector elements
will be repopulated.

> Would it work just as well to have
> two register files which operate entirely independently?

As I mentioned before, they operate independently, but they sync up when we
pass from one mode to the other. Another way to look at it is that they are
mutually exclusive: only one of them is in use at any given time.
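
To illustrate "independent but synced on a mode switch", here is a minimal toy
model in Python. It is not gem5's implementation: the class name, the mode
names ("full"/"elem") and the choice of four 32-bit lanes per 128-bit register
are all assumptions made for the sketch.

```python
class ToyVectorState:
    """Toy model: two mutually exclusive views of the same vector data."""

    LANES = 4          # assumed: four 32-bit elements per 128-bit register
    LANE_BITS = 32

    def __init__(self, num_regs=32):
        self.mode = "full"                      # view used by AArch64 code
        self.full = [0] * num_regs              # one 128-bit integer per register
        self.elem = [[0] * self.LANES for _ in range(num_regs)]

    def switch_mode(self, new_mode):
        """Copy the live data into the other view, then make it the active one."""
        if new_mode not in ("full", "elem"):
            raise ValueError(f"unknown rename mode: {new_mode}")
        if new_mode == self.mode:
            return
        mask = (1 << self.LANE_BITS) - 1
        if new_mode == "elem":
            # Split each full register into its elements (lane 0 = LSBs).
            self.elem = [
                [(reg >> (self.LANE_BITS * i)) & mask for i in range(self.LANES)]
                for reg in self.full
            ]
        else:
            # Reassemble each full register from its elements.
            self.full = [
                sum(e << (self.LANE_BITS * i) for i, e in enumerate(lanes))
                for lanes in self.elem
            ]
        self.mode = new_mode
```

Only the view named by `mode` is meaningful at any given time; the other one is
stale until the next call to `switch_mode` refreshes it.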

> From what I can tell
> the "V" registers of Neon in aarch64 overlap with the SVE registers, and the
> "Q" registers of armv7 Neon overlap with the "S", "D", "Q" registers of the
> same,
> but I think "V" and "Q" are independent? Maybe reused but not guaranteed to
> alias?
>

I would say the rule of thumb for understanding the AArch64-AArch32 mapping
(and the underlying reason for using different renaming modes) is to bear in
mind that AArch64, unlike AArch32, uses an unpacked layout for FP & SIMD
registers.
Prior to Armv8, smaller FP registers were packed into bigger registers [2]. 
Having for example 32 double precision registers (D0-D31) meant having a 
maximum of 16 quad word registers (Q0-Q15).
This setup has been abandoned in the AArch64 state of Armv8 [3]. As an example,
S1 and D1 are no longer packed into Q0; they are the 32/64 least significant
bits of Q1.
It also means the newly added registers (V16-V31) are not accessible in AArch32.

So, to answer your question regarding V and Q: up to Q15/V15 they alias
perfectly; V16-V31 are simply not defined/accessible in AArch32, so they are
not aliased.

All AArch32 SIMD data is accessible from AArch64. It just won't stick to the
same naming. AArch32 D1 and AArch64 D1 hold different data.
If I really wanted to access AArch32 D1 from AArch64, I would have to read the
64 MSBs of V0. This is a software and not a hardware problem (I just posted
this example to stress the difference between aliasing and reachability).
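
The naming difference can be made concrete with a minimal Python sketch. The
helper names below are hypothetical (they are not gem5 or Arm APIs), and each
128-bit V register is modelled as a plain Python integer. It encodes only the
packed AArch32 layout described above: four S or two D registers per Q, with
Qn aliasing Vn for n in 0-15.

```python
# Width in bits and number of architectural names for each AArch32 view.
WIDTH = {"S": 32, "D": 64, "Q": 128}
COUNT = {"S": 32, "D": 32, "Q": 16}


def aarch32_location(kind, n):
    """Return (V index, LSB offset, width) for AArch32 register <kind><n>."""
    if kind not in WIDTH:
        raise ValueError(f"unknown register kind: {kind}")
    if not 0 <= n < COUNT[kind]:
        raise ValueError(f"{kind}{n} does not exist in AArch32")
    width = WIDTH[kind]
    per_v = 128 // width            # how many such registers pack into one V
    return n // per_v, (n % per_v) * width, width


def read_aarch32(v_regs, kind, n):
    """Read AArch32 <kind><n> out of a list of 128-bit V register values."""
    v, lsb, width = aarch32_location(kind, n)
    return (v_regs[v] >> lsb) & ((1 << width) - 1)


def read_aarch64_d(v_regs, n):
    """AArch64 Dn is simply the 64 least significant bits of Vn."""
    return v_regs[n] & ((1 << 64) - 1)


v_regs = [0] * 32
v_regs[0] = (0xAAAA << 64) | 0xBBBB   # V0: upper half 0xAAAA, lower half 0xBBBB
v_regs[1] = 0xCCCC                    # V1: lower half 0xCCCC

d1_aarch32 = read_aarch32(v_regs, "D", 1)   # upper 64 bits of V0
d1_aarch64 = read_aarch64_d(v_regs, 1)      # lower 64 bits of V1
```

With these values `d1_aarch32` is 0xAAAA while `d1_aarch64` is 0xCCCC: the same
name, D1, reaches different storage depending on the execution state.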

> BTW, test cases would be very helpful if possible. I've made good progress
> cleaning away debris and am getting to the point where I'll want to make
> changes which I'm a lot less comfortable making blind.
>
> Gabe
>
> On Thu, Feb 25, 2021 at 10:40 PM Gabe Black <[email protected]
> <mailto:[email protected]> > wrote:
>
>
> I will ask within Arm if there's something we can provide to you.
> In the meantime I gave a quick look at NEON enabled libraries [1]; the Ne10
> library provides a set of functions optimized for NEON and a set of examples
> making use of it [2] (e.g. FIR filter, GEMM, etc.).
>
> You could probably cross-compile those examples and use them in SE mode
> (recommending to use the O3 model)
>
>
>
>
> Ok, thanks, I'll take a look. This might even be something we want in the
> testing infrastructure? I might look into that when I have a chance.
>
>
> I took a look at this, and unfortunately I don't think I can use it. The
> example only builds for armv7 and not aarch64, and when I tried to build it
> for armv7 I get a bunch of compiler errors. Do you have any other suggestions?

Richard kindly pointed me to the following SVE tutorial:

https://gitlab.com/arm-hpc/training/arm-sve-tools

But I believe it is worth noting that we are also interested in testing armv7
(AArch32) SIMD, so that probably won't be enough.
I will dig more, and I will keep you posted.

>
> Gabe

Kind Regards

Giacomo

[1]: https://github.com/gem5/gem5/blob/stable/src/cpu/o3/rename_map.cc#L173
[2]: https://developer.arm.com/documentation/den0024/a/ARMv8-Registers/NEON-and-floating-point-registers/NEON-in-AArch32-execution-state-
[3]: https://developer.arm.com/documentation/den0024/a/ARMv8-Registers/NEON-and-floating-point-registers/Floating-point-register-organization-in-AArch64

