Re: [Qemu-devel] Using new TCG Vector infrastructure in PowerPC
Richard Henderson writes:

> On 03/16/2018 12:08 PM, Nikunj A Dadhania wrote:
>> @@ -1078,8 +1079,8 @@ struct CPUPPCState {
>>      /* Altivec registers */
>>      ppc_avr_t avr[32];
>>      uint32_t vscr;
>> -    /* VSX registers */
>> -    uint64_t vsr[32];
>> +    /* 32 (128bit)- VSX registers */
>> +    ppc_avr_t vsr[32];
>
> Another thing that needs to happen is to make ppc_avr_t to be 16-byte aligned
> (this is documented in tcg-gvec-op.h, I believe).
>
> This is easily accomplished by adding QEMU_ALIGNED(16) to the first union
> member. And then you'd like to put vsr adjacent to avr so that you're not
> adding another alignment hole.

Sure, will do that.

Regards
Nikunj
Re: [Qemu-devel] Using new TCG Vector infrastructure in PowerPC
On 03/16/2018 12:08 PM, Nikunj A Dadhania wrote:
> @@ -1078,8 +1079,8 @@ struct CPUPPCState {
>      /* Altivec registers */
>      ppc_avr_t avr[32];
>      uint32_t vscr;
> -    /* VSX registers */
> -    uint64_t vsr[32];
> +    /* 32 (128bit)- VSX registers */
> +    ppc_avr_t vsr[32];

Another thing that needs to happen is to make ppc_avr_t to be 16-byte aligned
(this is documented in tcg-gvec-op.h, I believe).

This is easily accomplished by adding QEMU_ALIGNED(16) to the first union
member. And then you'd like to put vsr adjacent to avr so that you're not
adding another alignment hole.

r~
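The alignment advice above can be checked with a small standalone sketch. Everything named demo_* is hypothetical: DEMO_ALIGNED stands in for QEMU_ALIGNED (assumed here to be a GCC/Clang aligned attribute), demo_avr_t is a cut-down stand-in for ppc_avr_t, and demo_state is not the real CPUPPCState.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for QEMU_ALIGNED; assumes a GCC/Clang-style attribute. */
#define DEMO_ALIGNED(X) __attribute__((aligned(X)))

/* Hypothetical, simplified stand-in for ppc_avr_t. Aligning the first
 * member raises the alignment of the whole union to 16 bytes. */
typedef union {
    uint8_t  u8[16] DEMO_ALIGNED(16);
    uint16_t u16[8];
    uint32_t u32[4];
    uint64_t u64[2];
} demo_avr_t;

/* Hypothetical slice of a CPU state struct. Keeping vsr directly after
 * avr means only one alignment hole (before avr) instead of two. */
struct demo_state {
    uint32_t   before;   /* forces padding ahead of the first vector array */
    demo_avr_t avr[32];
    demo_avr_t vsr[32];
    uint32_t   vscr;
};

/* Compile-time check that the union really is 16-byte aligned. */
static_assert(alignof(demo_avr_t) == 16, "vector union must be 16-byte aligned");

/* Alignment of the vector union, in bytes. */
size_t demo_avr_alignment(void)
{
    return alignof(demo_avr_t);
}

/* Padding bytes between the end of avr[] and the start of vsr[]. */
size_t demo_gap_between_avr_and_vsr(void)
{
    size_t avr_end = offsetof(struct demo_state, avr) + 32 * sizeof(demo_avr_t);
    return offsetof(struct demo_state, vsr) - avr_end;
}
```

If vscr sat between the two arrays, as in the quoted diff, the compiler would have to pad after it to bring vsr back to a 16-byte boundary.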
Re: [Qemu-devel] Using new TCG Vector infrastructure in PowerPC
Richard Henderson writes:

> On 03/07/2018 06:03 PM, Nikunj A Dadhania wrote:
>> Hi Richard,
>>
>> I was working to get TCG vector support for PowerPC[1]. Started with
>> converting logical operations like vector AND/OR/XOR and compare
>> instructions. Found some inconsistency during my testing on x86 laptop
>> emulating PowerPC:
>
> Great.
>
> Well, the problem is that you cannot use TCG generic vectors and TCG global
> variables to access the same memory.

Interesting, I wasn't aware of this.

> Thus your first step must be to remove all references to cpu_avrh and
> cpu_avrl.
> These can be replaced by translator helpers that perform an explicit
> tcg_gen_ld_i64 or tcg_gen_st_i64 to the proper memory locations.
>
> Only after that's done can you begin converting other references to use the
> host vectors. Otherwise, the tcg optimizer will do Bad and Unpredictable
> Things, which may well have produced the incorrect results that you saw.
>
> I'll note that it's probably worth rearranging all of {fpr,vsr,avr} to the
> more
> logical configuration presented by Power7 (?) such that it's one array of 64 x
> 128-bit registers.

I have the following change to make the VSRs contiguous 128-bit registers.
It has touched a lot of code, even outside the tcg directory. So I currently
have 32 AVRs (128 bits) and 32 VSRs (128 bits).

@@ -1026,8 +1027,8 @@ struct CPUPPCState {
     /* Floating point execution context */
     float_status fp_status;
-    /* floating point registers */
-    float64 fpr[32];
+    /* floating point registers multiplexed with vsr */
+
     /* floating point status and control register */
     target_ulong fpscr;
@@ -1078,8 +1079,8 @@ struct CPUPPCState {
     /* Altivec registers */
     ppc_avr_t avr[32];
     uint32_t vscr;
-    /* VSX registers */
-    uint64_t vsr[32];
+    /* 32 (128bit)- VSX registers */
+    ppc_avr_t vsr[32];
     /* SPE registers */
     uint64_t spe_acc;
     uint32_t spe_fscr;

Regards,
Nikunj
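For comparison, the single-array layout suggested in the quoted text (one array of 64 x 128-bit registers) could be sketched as below. All names are hypothetical and this is not the layout in the patch above. It relies on the architectural mapping in which FPR n is doubleword 0 of VSR n and Altivec register n is VSR 32+n. Which u64 index holds doubleword 0 in host memory is an assumption of this sketch (index 0); a real implementation would have to account for host byte order.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 128-bit register type. */
typedef union {
    uint8_t  u8[16];
    uint64_t u64[2];
} demo_vsr_t;

/* Hypothetical unified register file: VSR 0..63. */
typedef struct {
    demo_vsr_t vsr[64];
} demo_regs;

/* Byte offset of VSR n within demo_regs. */
size_t demo_vsr_offset(int n)
{
    return offsetof(demo_regs, vsr) + (size_t)n * sizeof(demo_vsr_t);
}

/* FPR n shares storage with doubleword 0 of VSR n.
 * Assumption: doubleword 0 is stored at u64[0]. */
size_t demo_fpr_offset(int n)
{
    return demo_vsr_offset(n) + offsetof(demo_vsr_t, u64[0]);
}

/* Altivec register n is VSR 32+n. */
size_t demo_avr_offset(int n)
{
    return demo_vsr_offset(32 + n);
}
```

With offsets computed this way, fpr, vsr and avr accessors all resolve to the same backing array, so no separate fpr[] or avr[] member is needed.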
Re: [Qemu-devel] Using new TCG Vector infrastructure in PowerPC
On 03/07/2018 06:03 PM, Nikunj A Dadhania wrote:
> Hi Richard,
>
> I was working to get TCG vector support for PowerPC[1]. Started with
> converting logical operations like vector AND/OR/XOR and compare
> instructions. Found some inconsistency during my testing on x86 laptop
> emulating PowerPC:

Great.

Well, the problem is that you cannot use TCG generic vectors and TCG global
variables to access the same memory.

Thus your first step must be to remove all references to cpu_avrh and
cpu_avrl. These can be replaced by translator helpers that perform an explicit
tcg_gen_ld_i64 or tcg_gen_st_i64 to the proper memory locations.

Only after that's done can you begin converting other references to use the
host vectors. Otherwise, the tcg optimizer will do Bad and Unpredictable
Things, which may well have produced the incorrect results that you saw.

I'll note that it's probably worth rearranging all of {fpr,vsr,avr} to the more
logical configuration presented by Power7 (?) such that it's one array of 64 x
128-bit registers.

r~
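The hazard described above can be illustrated with a toy model in plain C. It does not use the TCG API and is a deliberate simplification of what the optimizer does: demo_global models a TCG global whose value may be kept in a host register, demo_vec_set_all_ones models a generic vector operation that writes the backing memory directly by offset, and demo_explicit_load models the suggested replacement, an explicit load on every access. All names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for one 128-bit register in the CPU state. */
typedef struct {
    uint64_t avr_hi;
    uint64_t avr_lo;
} demo_env;

/* Models a TCG global: once loaded, the value is assumed to stay valid
 * because nothing else is expected to touch the backing memory. */
typedef struct {
    uint64_t cached;
    int      valid;
} demo_global;

/* Read through the cached view; only the first read touches memory. */
uint64_t demo_global_read(demo_global *g, const demo_env *env)
{
    if (!g->valid) {
        g->cached = env->avr_hi;
        g->valid = 1;
    }
    return g->cached;
}

/* Models a generic vector op: writes the register memory directly and
 * knows nothing about any cached copy. */
void demo_vec_set_all_ones(demo_env *env)
{
    memset(env, 0xff, sizeof(*env));
}

/* Models the suggested fix: load from memory on every access. */
uint64_t demo_explicit_load(const demo_env *env)
{
    return env->avr_hi;
}

/* Returns 1 when mixing the two access styles yields a stale value. */
int demo_mixed_access_is_stale(void)
{
    demo_env env = { 0, 0 };
    demo_global g = { 0, 0 };

    (void)demo_global_read(&g, &env);   /* cached view now holds 0 */
    demo_vec_set_all_ones(&env);        /* memory changes underneath it */

    return demo_global_read(&g, &env) != demo_explicit_load(&env);
}
```

In this model the cached read still returns 0 after the vector write, while the explicit load sees the new value, which is the kind of mismatch that removing the globals avoids.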
[Qemu-devel] Using new TCG Vector infrastructure in PowerPC
Hi Richard,

I was working to get TCG vector support for PowerPC[1]. Started with
converting logical operations like vector AND/OR/XOR and compare
instructions. Found some inconsistency during my testing on x86 laptop
emulating PowerPC:

zero =
max =

1) tcg_gen_andc_vec - vandc in PPC

New API result:
andc(zero, max)  - (zero & ~max ) =
andc(max, zero)  - (max & ~zero ) =
andc(max, max)   - (max & ~max ) = -->WRONG
andc(zero, zero) - (zero & ~zero ) =

Expected result:
andc(zero, max)  (zero & ~max ) =
andc(max, zero)  (max & ~zero ) =
andc(max, max)   (max & ~max ) =
andc(zero, zero) (zero & ~zero ) =

2) tcg_gen_or_vec - vor in PPC

New API result:
(zero | max ) = -> WRONG
(max | max ) =
(zero | zero ) =

Expected result:
(zero | max ) =
(max | max ) =
(zero | zero ) =

3) tcg_gen_cmp_vec(TCG_COND_EQ) - vcmpequ* in PPC

New API result (all incorrect):
vcmpequb (zero == zero ) =
vcmpequh (zero == zero ) =
vcmpequw (zero == zero ) =
vcmpequd (zero == zero ) =

Expected result:
vcmpequb (zero == zero ) =
vcmpequh (zero == zero ) =
vcmpequw (zero == zero ) =
vcmpequd (zero == zero ) =

Do you see something that I am missing here ?

Regards,
Nikunj

1. PowerPC TCG vector infrastructure implementation
   https://github.com/nikunjad/qemu/tree/ppc_vec_0
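A plain C reference model of the three operations can serve as an oracle when comparing results like those above. This is only a sketch: the names are hypothetical, and it assumes "zero" is a 128-bit vector of all-zero bits and "max" is all-one bits, since the actual test values are not shown in this message. Only the byte-sized compare is modelled.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical 128-bit vector type for the reference model. */
typedef union {
    uint8_t  u8[16];
    uint64_t u64[2];
} demo_vec128;

/* Build a vector with every byte set to the same value. */
demo_vec128 demo_splat(uint8_t byte)
{
    demo_vec128 v;
    memset(v.u8, byte, sizeof(v.u8));
    return v;
}

/* andc: a & ~b, as in vandc. */
demo_vec128 demo_andc(demo_vec128 a, demo_vec128 b)
{
    demo_vec128 r;
    for (int i = 0; i < 2; i++) {
        r.u64[i] = a.u64[i] & ~b.u64[i];
    }
    return r;
}

/* or: a | b, as in vor. */
demo_vec128 demo_or(demo_vec128 a, demo_vec128 b)
{
    demo_vec128 r;
    for (int i = 0; i < 2; i++) {
        r.u64[i] = a.u64[i] | b.u64[i];
    }
    return r;
}

/* Byte-wise equality, as in vcmpequb: each result byte is all ones
 * where the inputs match and all zeros where they differ. */
demo_vec128 demo_cmpequb(demo_vec128 a, demo_vec128 b)
{
    demo_vec128 r;
    for (int i = 0; i < 16; i++) {
        r.u8[i] = (a.u8[i] == b.u8[i]) ? 0xff : 0x00;
    }
    return r;
}

/* Returns 1 when two vectors are bit-for-bit identical. */
int demo_equal(demo_vec128 a, demo_vec128 b)
{
    return memcmp(a.u8, b.u8, sizeof(a.u8)) == 0;
}
```

Under these assumptions, andc(max, max) yields all zeros, (zero | max) yields all ones, and comparing zero with zero yields all ones in every element.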