Not compiling with -msve-vector-bits did the trick. It runs perfectly whether 
I set cpu[0].isa[0].sve_vl_se to 4 or keep it at 1.
Thank you for the suggestions!
One last thing: starter_se.py does not seem to support 
--cpu-type=ArmO3CPU (or am I missing something)?

________________________________
From: Giacomo Travaglini <giacomo.travagl...@arm.com>
Sent: 11 January 2024 12:16
To: The gem5 Users mailing list <gem5-users@gem5.org>
Cc: Jason Lowe-Power <jlowepo...@ucdavis.edu>; Nazmus Sakib <nsak...@nmsu.edu>
Subject: Re: ARM SVE ISA


Hi Nazmus,

From what you posted, I can see that you are compiling the test case with a 
512-bit vector width. I believe you should set the gem5 VL to match, by writing 
the following in the gem5 config:

cpu.isa[0].sve_vl_se = 4

According to [1].

This should fix your problem. Another solution, I believe, would be to compile 
without specifying the VL. That should then produce VL-agnostic code, I presume.
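
A minimal sketch of the arithmetic behind the value 4 above. It assumes, as the parameter description in ArmISA.py suggests, that sve_vl_se is expressed in 128-bit quadwords; the helper name sve_vl_quadwords is hypothetical and not part of gem5.

```python
# Assumption: gem5's sve_vl_se counts 128-bit quadwords.
SVE_QUADWORD_BITS = 128
SVE_MAX_BITS = 2048  # architectural maximum SVE vector length


def sve_vl_quadwords(vector_bits: int) -> int:
    """Convert a -msve-vector-bits value into a quadword count.

    Hypothetical helper for illustration; it only does the unit conversion.
    """
    if vector_bits < SVE_QUADWORD_BITS or vector_bits > SVE_MAX_BITS:
        raise ValueError("SVE vector length must be between 128 and 2048 bits")
    if vector_bits % SVE_QUADWORD_BITS != 0:
        raise ValueError("SVE vector length must be a multiple of 128 bits")
    return vector_bits // SVE_QUADWORD_BITS


# A binary built with -msve-vector-bits=512 expects a VL of 4 quadwords.
print(sve_vl_quadwords(512))  # 4
```

Under that assumption, a binary built for a fixed 512-bit VL needs the value 4, while VL-agnostic code should work with any legal setting.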



Anyway, I also recommend using configs/example/arm/starter_se.py, as se.py is 
deprecated.

Kind Regards

Giacomo

[1]: https://github.com/gem5/gem5/blob/stable/src/arch/arm/ArmISA.py#L179



From: Nazmus Sakib via gem5-users <gem5-users@gem5.org>
Date: Thursday, 11 January 2024 at 17:54
To: gem5-users@gem5.org <gem5-users@gem5.org>
Cc: Jason Lowe-Power <jlowepo...@ucdavis.edu>, Nazmus Sakib <nsak...@nmsu.edu>
Subject: [gem5-users] ARM SVE ISA

Hello.
I am trying to run a simple program with SVE instructions on gem5. However, the 
output with the ExecAll debug flag suggests there is an issue with the decoder.
Here is the test code:

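#include <stdio.h>

#define STREAM_ARRAY_SIZE 16

/* The declarations of A and B were not shown in the original post;
   global int arrays are assumed here. */
int A[STREAM_ARRAY_SIZE], B[STREAM_ARRAY_SIZE];

int add(int * restrict p, int * restrict q);

int main(void)
{
    for (int j = 0; j < STREAM_ARRAY_SIZE; j++) {
        A[j] = 3;
        B[j] = 2;
    }
    int x = add(A, B);
    printf("return %d \n", A[3]);  // should print 6, does not in gem5
    return 0;
}

/* Sets p[i] = q[i] + 4 for every element and returns p[3]. */
int add(int * restrict p, int * restrict q)
{
    for (int i = 0; i < STREAM_ARRAY_SIZE; i += 1) {
        *(p + i) = *(q + i) + 4;
    }
    // should print 6 and 2, does not in gem5
    printf("dummy %d %d \n", *(p + 3), *(q + 3));
    return *(p + 3);
}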
I compiled it with the GCC cross compiler for Arm using the following command:

aarch64-linux-gnu-gcc-11 -O3 -static -mcpu=a64fx+sve2 -msve-vector-bits=512 -o test test.c

Without -mcpu=a64fx+sve2, SVE instructions are not generated.
Here is the gem5 command I used:
./build/ARM/gem5.opt ./configs/deprecated/example/se.py --cpu-type=ArmO3CPU --caches --cacheline_size=64 --mem-size=8GB --arm-iset=aarch64 -c ./test
I have also used "./configs/example/arm/starter_se.py", but the results are the 
same.
When I use --debug-flags=ExecAll, I see the following issues:
1) 12589500: system.cpu: A0 T0 : 0x400524 @main+4    :   ptrue   p0, VL64       
  : SimdPredAlu
:  D=[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]  FetchSeq=14292  CPSeq=4962  flags=()

The D=[] should not be all zeros.

2)

12591000: system.cpu: A0 T0 : 0x400550 @main+48    :   st1   {z1}, p0/z, , 
[x19] : MemWrite :
 A=0x491040  FetchSeq=14305  CPSeq=4975  flags=(IsInteger|IsVector|IsStore)

12591000: system.cpu: A0 T0 : 0x400554 @main+52    :   st1   {z0}, p0/z, , 
[x19, #1, mul vl] : MemWrite : A=0x491050  FetchSeq=14306  CPSeq=4976  
flags=(IsInteger|IsVector|IsStore)

The second A should be 0x491080, not 0x491050.

I have run the same program on the RIKEN simulator, which was built on top of 
gem5 for the Fujitsu A64FX.
Here are the same instructions as seen in RIKEN.
1) 15322000: system.cpu A0 T0 : @main+4    :   ptrue   p0, VL64         : 
SimdPredAlu :  
D=0b[0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_1111_1111_1111_1111_1111_1111_1111_1111_1111_1111_1111_1111_1111_1111_1111_1111]
  FetchSeq=18146  CPSeq=5254  flags=()
As you can see, my data arrays are 64 bytes and appropriate bits in predicate 
registers are set to 1.
2)
15323000: system.cpu A0 T0 : @main+48    :   st1   {z1}, p0/z, , [x19] : 
SveMemWrite :
 A=0x491040  FetchSeq=18159  CPSeq=5267  
flags=(IsInteger|IsVector|IsMemRef|IsStore)

15323000: system.cpu A0 T0 : @main+52    :   st1   {z0}, p0/z, , [x19, #1, mul 
vl] : SveMemWrite :

  A=0x491080  FetchSeq=18160  CPSeq=5268

The second address is calculated as 0x491080, which is the correct result for 
x19, #1, mul vl, as vl=64.
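
Both symptoms are what a 128-bit simulated vector length would produce. The sketch below works through the arithmetic under two assumptions that the traces do not state outright: that gem5 was running with sve_vl_se = 1 (128 bits), and that the ptrue instruction uses byte elements (the trace omits the element suffix). The function names are hypothetical.

```python
def sve_mul_vl_address(base: int, imm: int, vl_bits: int) -> int:
    """Effective address of [base, #imm, mul vl]: base + imm * VL in bytes."""
    return base + imm * (vl_bits // 8)


def ptrue_vl_pattern_active(pattern_count: int, vl_bits: int,
                            element_bytes: int = 1) -> int:
    """Active element count for a fixed pattern such as VL64.

    A fixed-count pattern activates exactly pattern_count elements if the
    vector holds at least that many, and no elements otherwise.
    """
    elements = (vl_bits // 8) // element_bytes
    return pattern_count if elements >= pattern_count else 0


X19 = 0x491040  # base address taken from the traces above

# Assumed 128-bit VL (sve_vl_se = 1): matches the stock gem5 trace.
print(hex(sve_mul_vl_address(X19, 1, 128)))  # 0x491050
print(ptrue_vl_pattern_active(64, 128))      # 0  (all-zero predicate)

# 512-bit VL: matches the RIKEN trace.
print(hex(sve_mul_vl_address(X19, 1, 512)))  # 0x491080
print(ptrue_vl_pattern_active(64, 512))      # 64
```

If these assumptions hold, the decoder is behaving consistently and the mismatch lies between the compiled-in vector length and the simulated one.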

I tried to compare the files in src/arch/arm/ISA from RIKEN with those in 
current gem5. Since RIKEN is based on an old gem5, there are obvious syntax 
differences. Other than that, I have found two things:
1) In ArmISA.py in RIKEN, there is this:

     id_aa64pfr0_el1 = Param.UInt64(0x0000000100000022, "AArch64 Processor Feature Register 0")

I did not find anything similar in gem5. I did find id_aa64pfr0_el1 in 
ar/arm/reg/misch.hh, but its value wasn't set anywhere.

2) In ArmISA.py in current gem5, there is a "FEAT_SVE" extension in class 
ArmDefaultSERelease. However, this is for Armv8.2, and I don't know how to 
specify this architecture on the command line.

What I am trying to find out is which of three things is going on: whether I am 
missing runtime flags that would enable proper SVE execution in gem5; whether 
the problem comes from compile-time flags, since I am setting -mcpu to a64fx 
(setting -march to armv8.2-a+sve or similar does not produce SVE instructions; 
it has to be -mcpu=a64fx+sve); or whether it is a possible bug in the new gem5 
itself. Any suggestions would be appreciated.
Thank you.
