Re: [Qemu-devel] Using new TCG Vector infrastructure in PowerPC

2018-03-16 Thread Nikunj A Dadhania
Richard Henderson  writes:

> On 03/16/2018 12:08 PM, Nikunj A Dadhania wrote:
>> @@ -1078,8 +1079,8 @@ struct CPUPPCState {
>>  /* Altivec registers */
>>  ppc_avr_t avr[32];
>>  uint32_t vscr;
>> -/* VSX registers */
>> -uint64_t vsr[32];
>> +/* 32 (128bit)- VSX registers */
>> +ppc_avr_t vsr[32];
>
> Another thing that needs to happen is to make ppc_avr_t to be 16-byte aligned
> (this is documented in tcg-gvec-op.h, I believe).
>
> This is easily accomplished by adding QEMU_ALIGNED(16) to the first union
> member.  And then you'd like to put vsr adjacent to avr so that you're not
> adding another alignment hole.

Sure, will do that.

Regards
Nikunj




Re: [Qemu-devel] Using new TCG Vector infrastructure in PowerPC

2018-03-16 Thread Richard Henderson
On 03/16/2018 12:08 PM, Nikunj A Dadhania wrote:
> @@ -1078,8 +1079,8 @@ struct CPUPPCState {
>  /* Altivec registers */
>  ppc_avr_t avr[32];
>  uint32_t vscr;
> -/* VSX registers */
> -uint64_t vsr[32];
> +/* 32 (128bit)- VSX registers */
> +ppc_avr_t vsr[32];

Another thing that needs to happen is to make ppc_avr_t to be 16-byte aligned
(this is documented in tcg-gvec-op.h, I believe).

This is easily accomplished by adding QEMU_ALIGNED(16) to the first union
member.  And then you'd like to put vsr adjacent to avr so that you're not
adding another alignment hole.


r~



Re: [Qemu-devel] Using new TCG Vector infrastructure in PowerPC

2018-03-15 Thread Nikunj A Dadhania
Richard Henderson  writes:

> On 03/07/2018 06:03 PM, Nikunj A Dadhania wrote:
>> Hi Richard,
>> 
>> I was working to get TCG vector support for PowerPC[1]. Started with
>> converting logical operations like vector AND/OR/XOR and compare
>> instructions. Found some inconsistency during my testing on x86 laptop
>> emulating PowerPC:
>
> Great.
>
> Well, the problem is that you cannot use TCG generic vectors and TCG global
> variables to access the same memory.

Interesting, wasn't aware of this.

> Thus your first step must be to remove all references to cpu_avrh and 
> cpu_avrl.
>  These can be replaced by translator helpers that perform an explicit
> tcg_gen_ld_i64 or tcg_gen_st_i64 to the proper memory locations.
>
> Only after that's done can you begin converting other references to use the
> host vectors.  Otherwise, the tcg optimizer will do Bad and Unpredictable
> Things, which may well have produced the incorrect results that you saw.
>
> I'll note that it's probably worth rearranging all of {fpr,vsr,avr} to the 
> more
> logical configuration presented by Power7 (?) such that it's one array of 64 x
> 128-bit registers.

I have following for taking care of making VSRs contiguous 128bits. This
has touched lot of code even out of tcg directory. So I currently have
32 AVRs (128bits) and 32 VSRs (128 bits).

@@ -1026,8 +1027,8 @@ struct CPUPPCState {
 
 /* Floating point execution context */
 float_status fp_status;
-/* floating point registers */
-float64 fpr[32];
+/* floating point registers multiplexed with vsr */
+
 /* floating point status and control register */
 target_ulong fpscr;
 
@@ -1078,8 +1079,8 @@ struct CPUPPCState {
 /* Altivec registers */
 ppc_avr_t avr[32];
 uint32_t vscr;
-/* VSX registers */
-uint64_t vsr[32];
+/* 32 (128bit)- VSX registers */
+ppc_avr_t vsr[32];
 /* SPE registers */
 uint64_t spe_acc;
 uint32_t spe_fscr;

Regards,
Nikunj




Re: [Qemu-devel] Using new TCG Vector infrastructure in PowerPC

2018-03-15 Thread Richard Henderson
On 03/07/2018 06:03 PM, Nikunj A Dadhania wrote:
> Hi Richard,
> 
> I was working to get TCG vector support for PowerPC[1]. Started with
> converting logical operations like vector AND/OR/XOR and compare
> instructions. Found some inconsistency during my testing on x86 laptop
> emulating PowerPC:

Great.

Well, the problem is that you cannot use TCG generic vectors and TCG global
variables to access the same memory.

Thus your first step must be to remove all references to cpu_avrh and cpu_avrl.
 These can be replaced by translator helpers that perform an explicit
tcg_gen_ld_i64 or tcg_gen_st_i64 to the proper memory locations.

Only after that's done can you begin converting other references to use the
host vectors.  Otherwise, the tcg optimizer will do Bad and Unpredictable
Things, which may well have produced the incorrect results that you saw.

I'll note that it's probably worth rearranging all of {fpr,vsr,avr} to the more
logical configuration presented by Power7 (?) such that it's one array of 64 x
128-bit registers.


r~



[Qemu-devel] Using new TCG Vector infrastructure in PowerPC

2018-03-07 Thread Nikunj A Dadhania
Hi Richard,

I was working to get TCG vector support for PowerPC[1]. Started with
converting logical operations like vector AND/OR/XOR and compare
instructions. Found some inconsistency during my testing on x86 laptop
emulating PowerPC:

zero =   
max =   

1) tcg_gen_andc_vec - vandc in PPC

New API result:
andc(zero, max) -  (zero & ~max ) =  
andc(max, zero) -  (max & ~zero ) =  
andc(max, max)  -  (max & ~max ) =  -->WRONG
andc(zero, zero)-  (zero & ~zero ) =   

Expected result:
andc(zero, max)  (zero & ~max ) =   
andc(max, zero)  (max & ~zero ) =   
andc(max, max)   (max & ~max ) =   
andc(zero, zero) (zero & ~zero ) =   

2) tcg_gen_or_vec - vor in PPC

New API result:
(zero | max ) =     -> WRONG
(max | max ) =   
(zero | zero ) =  

Expected result:
(zero | max ) =   
(max | max ) =   
(zero | zero ) =   

3) tcg_gen_cmp_vec(TCG_COND_EQ) - vcmpequ* in PPC

New API result(all incorrect):

vcmpequb (zero == zero ) =   
vcmpequh (zero == zero ) =   
vcmpequw (zero == zero ) =   
vcmpequd (zero == zero ) =  

Expected result:
vcmpequb (zero == zero ) =   
vcmpequh (zero == zero ) =   
vcmpequw (zero == zero ) =   
vcmpequd (zero == zero ) =  

Do you see something that I am missing here ?

Regards,
Nikunj

1. PowerPC TCG vector infrastructure implementation
https://github.com/nikunjad/qemu/tree/ppc_vec_0