Re: [Qemu-devel] [PATCH v5 00/10] tcg mips64 and mips r6 improvements
On 12/05/2016 01:41 AM, Jin Guojie wrote: > -- Original -- > From: "Richard Henderson";; > Send time: Thursday, Dec 1, 2016 11:52 PM > To: "Jin Guojie"; "qemu-devel"; > Cc: "Aurelien Jarno"; "James > Hogan"; > Subject: Re: [PATCH v5 00/10] tcg mips64 and mips r6 improvements > > Thanks a lot for Richard's careful review. > I have some different opinions for discussion: > >> @@ -1233,7 +1235,8 @@ static void tcg_out_tlb_load(TCGContext *s, TCGReg >> base, >> -mask = (target_ulong)TARGET_PAGE_MASK | ((1 << a_bits) - 1); >> +mask = TARGET_PAGE_MASK | ((1 << a_bits) - 1); >> You need the target_ulong cast here for mips64. >> See patch ebb90a005da67147245cd38fb04a965a87a961b7. > > Since mask is defined as a C type "target_ulong" at the beginning of the > function, > I guess an implicit type cast should be already done by the compiler. > So your change is functionally the same with v5 patch. > To test my idea, I wrote a small case and compiled it on an x86_64 host: > > main() > { > int a_bits = 2; > int page_mask = 0xf000;/* X86 4KB page*/ > unsigned int mask = page_mask | ((1 << a_bits) - 1); > unsigned long m = mask; > printf("mask=%lx\n", m); > } > > $ gcc a.c > $ file a.out > a.out: Mach-O 64-bit executable x86_64 > $ ./a.out > mask=f003 You misunderstand. For a 64-bit guest, the result we're looking for is 0xf003. (1) the type of a_bits is unsigned int, not int, (2) which means the expression "page_mask | ((1 << a_bits) - 1)" becomes unsigned int, (3) which means that the assignment to mask gets truncated. > The output is exactly an unsigned 32-bit quantity. > I didn't see a wrong result generated. > Could you teach me where is my mistake? For a 64-bit guest we need a 64-bit quantity. For a 32-bit guest... we need to match the load that we issue from cmp_off. If use use LWU, then we need an unsigned 32-bit quantity; if we use LW, then we need a signed 32-bit quantity. r~
Re: [Qemu-devel] [PATCH v5 00/10] tcg mips64 and mips r6 improvements
-- Original -- From: "Richard Henderson";; Send time: Thursday, Dec 1, 2016 11:52 PM To: "Jin Guojie"; "qemu-devel"; Cc: "Aurelien Jarno"; "James Hogan"; Subject: Re: [PATCH v5 00/10] tcg mips64 and mips r6 improvements Thanks a lot for Richard's careful review. I have some different opinions for discussion: > @@ -1233,7 +1235,8 @@ static void tcg_out_tlb_load(TCGContext *s, TCGReg > base, > -mask = (target_ulong)TARGET_PAGE_MASK | ((1 << a_bits) - 1); > +mask = TARGET_PAGE_MASK | ((1 << a_bits) - 1); > You need the target_ulong cast here for mips64. > See patch ebb90a005da67147245cd38fb04a965a87a961b7. Since mask is defined as a C type "target_ulong" at the beginning of the function, I guess an implicit type cast should be already done by the compiler. So your change is functionally the same with v5 patch. To test my idea, I wrote a small case and compiled it on an x86_64 host: main() { int a_bits = 2; int page_mask = 0xf000;/* X86 4KB page*/ unsigned int mask = page_mask | ((1 << a_bits) - 1); unsigned long m = mask; printf("mask=%lx\n", m); } $ gcc a.c $ file a.out a.out: Mach-O 64-bit executable x86_64 $ ./a.out mask=f003 The output is exactly an unsigned 32-bit quantity. I didn't see a wrong result generated. Could you teach me where is my mistake? > if (TCG_TARGET_REG_BITS > TARGET_LONG_BITS) { > -tcg_out_ext32u(s, base, addrl); > -addrl = base; > +tcg_out_ext32u(s, TCG_TMP0, TCG_TMP0); > } > We should be zero-extending the guest address, not the address that we just > loaded from CMP_OFF. This is because we need to add an unsigned 32-bit > quantity to the full 64-bit host pointer that we load from ADD_OFF. > Did you notice a compare problem between TMP1 and TMP0 below? If so, I > believe > that a partial solution is the TARGET_PAGE_MASK change above. I guess we also > need to do > tcg_out_ldst(s, (TARGET_LONG_BITS == 64 ? OPC_LD >TCG_TARGET_REG_BITS == 64 : OPC_LWU : OPC_LW), >TCG_TMP0, TCG_REG_A0, cmp_off); > in the else section of the tlb comparator load above, since mask will be > 32-bit > unsigned for TARGET_LONG_BITS == 32, and we need an unsigned quantity to > compare that against. In v5 patch, TMP0 is first loaded via LD, and then it is zero-extended. This is just the same behavior as LWU. After OPC_AND, TMP1 also contains unsigned 32-bit quantity. So in theory, there should not exist compare problem between TMP1 and TMP0. I agree with your opinion that addrl should be zero-extended, but It's really strange that the last ALIAS_PADD doesn't generate a bad host address in v5 patch. If v5 patch really lacks the zero-extension operation of addrl, the subsequent load or store will access wrong memory space. However, all the tests so far did not crash at this point. To further verify my idea, I will do more debug work and test the tlb miss rate to see if your analysis should be implemented. At the meantime, I am waiting for Aurelien's test results. I think it's better to collect all people's feedback before I submit the v6 patch. Jin Guojie
Re: [Qemu-devel] [PATCH v5 00/10] tcg mips64 and mips r6 improvements
On 2016-12-01 21:51, Jin Guojie wrote: > Changes in v5: > * Update against master(v2.8.0-rc2) > * Fix a bug: 64-bit big-endian guests hang on mips64 little-endian > hosts, and vice versa. This bug was first introduced in v2 patch, > due to obvious misuse of ret/arg registers in tcg_out_bswap64(). > > tcg_out_opc_reg(s, OPC_DSBH, ret, 0, arg); > - tcg_out_opc_reg(s, OPC_DSHD, ret, 0, arg); > + tcg_out_opc_reg(s, OPC_DSHD, ret, 0, ret); > > * Fix a style problem: checkpatch.pl forbids 'extern' to be used in .c. > > ERROR: externs should be avoided in .c files > #28: FILE: tcg/mips/tcg-target.inc.c:39: > +extern int link_error(void); > > Simply comment the type identifier to pass the check. > > * Tested successfully on following machines: > > | HOST| qemu-system | Debian ISO | > |-| > | mips 32 le |i386 |i386 | > | mips 32 le |x86_64 |i386 | > | mips 32 le |x86_64 |amd64| > | mips 64 le |i386 |i386 | > | mips 64 le |x86_64 |i386 | > | mips 64 le |x86_64 |amd64| > | mips 64 le | mips 64 be | mips 64 be | > |-| > | mips 32 be |i386 | i386| > | mips 32 be |x86_64 | i386| > | mips 32 be |x86_64 | amd64 | > | mips 64 be |i386 | i386| > | mips 64 be |x86_64 | i386| > | mips 64 be |x86_64 | amd64 | > | mips n32 be |386 | i386| > | mips n32 be |x86_64 | i386| > | mips n32 be |x86_64 | amd64 | > > (No plan to test MIPS R6 in this patch.) > > Summary of changes from v4: > > | tcg-mips: Support 64-bit opcodes | Fix tcg_out_bswap64() | > | tcg-mips: Adjust qemu_ld/st for mips64 | Fix a style problem | Thanks for the new version. I just gave it a try on a mips64 le guest, and I confirm this fixes booting mips64 be and ppc64 guests. I'll try to do more tests on mips be / mips le / mips64 le over the week-end. Aurelien -- Aurelien Jarno GPG: 4096R/1DDD8C9B aurel...@aurel32.net http://www.aurel32.net
Re: [Qemu-devel] [PATCH v5 00/10] tcg mips64 and mips r6 improvements
On Fri, Dec 02, 2016 at 11:16:41AM +, James Hogan wrote: > Hi, > > On Thu, Dec 01, 2016 at 09:51:59PM +0800, Jin Guojie wrote: > > * Tested successfully on following machines: > > > > | HOST| qemu-system | Debian ISO | > > |-| > > | mips 32 le |i386 |i386 | > > | mips 32 le |x86_64 |i386 | > > | mips 32 le |x86_64 |amd64| > > | mips 64 le |i386 |i386 | > > | mips 64 le |x86_64 |i386 | > > | mips 64 le |x86_64 |amd64| > > | mips 64 le | mips 64 be | mips 64 be | > > |-| > > | mips 32 be |i386 | i386| > > | mips 32 be |x86_64 | i386| > > | mips 32 be |x86_64 | amd64 | > > | mips 64 be |i386 | i386| > > | mips 64 be |x86_64 | i386| > > | mips 64 be |x86_64 | amd64 | > > | mips n32 be |386 | i386| > > | mips n32 be |x86_64 | i386| > > | mips n32 be |x86_64 | amd64 | > > > > (No plan to test MIPS R6 in this patch.) > > FYI I tried booting one of Aurelien's x86_64 debian QEMU images on a > mipsel64r6 (P6600) using this, but it seems to get stuck somewhere in > GRUB before booting guest kernel. I did see it saying loading kernel and > ramdisk, but it seemed to return to the bios & grub and just get stuck. > > I'm now attempting to reproduce in QEMU I6400 host, which is faster than > FPGA, but having the usual faff trying to get nested video output via > SDL/fbcon working. Inside QEMU mipsel64r6 (emulating I6400) it eventually reaches guest linux, but udev initiated modprobes time out. So its clearly got quite far without any breakage becoming apparent (all the way to userland), and I suspect on P6600 (FPGA) its just way too slow. Cheers James signature.asc Description: Digital signature
Re: [Qemu-devel] [PATCH v5 00/10] tcg mips64 and mips r6 improvements
Hi, On Thu, Dec 01, 2016 at 09:51:59PM +0800, Jin Guojie wrote: > * Tested successfully on following machines: > > | HOST| qemu-system | Debian ISO | > |-| > | mips 32 le |i386 |i386 | > | mips 32 le |x86_64 |i386 | > | mips 32 le |x86_64 |amd64| > | mips 64 le |i386 |i386 | > | mips 64 le |x86_64 |i386 | > | mips 64 le |x86_64 |amd64| > | mips 64 le | mips 64 be | mips 64 be | > |-| > | mips 32 be |i386 | i386| > | mips 32 be |x86_64 | i386| > | mips 32 be |x86_64 | amd64 | > | mips 64 be |i386 | i386| > | mips 64 be |x86_64 | i386| > | mips 64 be |x86_64 | amd64 | > | mips n32 be |386 | i386| > | mips n32 be |x86_64 | i386| > | mips n32 be |x86_64 | amd64 | > > (No plan to test MIPS R6 in this patch.) FYI I tried booting one of Aurelien's x86_64 debian QEMU images on a mipsel64r6 (P6600) using this, but it seems to get stuck somewhere in GRUB before booting guest kernel. I did see it saying loading kernel and ramdisk, but it seemed to return to the bios & grub and just get stuck. I'm now attempting to reproduce in QEMU I6400 host, which is faster than FPGA, but having the usual faff trying to get nested video output via SDL/fbcon working. Cheers James signature.asc Description: Digital signature
[Qemu-devel] [PATCH v5 00/10] tcg mips64 and mips r6 improvements
I tested this patch set on MIPS EB. 1) MIPS r2: (HOST) (qemu-system) (Debian ISO) 64eb -> i386 -> i386 64eb -> x86_64 -> i386 64eb -> x86_64 -> amd64 n32eb-> i386 -> i386 n32eb-> x86_64 -> i386 n32eb-> x86_64 -> amd64 32eb -> i386 -> i386 32eb -> x86_64 -> i386 32eb -> x86_64 -> amd64 2) On r2 CPU, with disable use_movnz_instructions use_mips32_instructions use_mips32r2_instructions See the attached file. (HOST) (qemu-system) (Debian ISO) 64eb -> i386 -> i386 64eb -> x86_64 -> i386 64eb -> x86_64 -> amd64 n32eb-> i386 -> i386 n32eb-> x86_64 -> i386 n32eb-> x86_64 -> amd64 32eb -> i386 -> i386 32eb -> x86_64 -> i386 32eb -> x86_64 -> amd64 If needed, I can also help to test other cases. -- YunQiang Su diff --git a/tcg/mips/tcg-target.h b/tcg/mips/tcg-target.h index d352c97..d467dde 100644 --- a/tcg/mips/tcg-target.h +++ b/tcg/mips/tcg-target.h @@ -90,21 +90,24 @@ typedef enum { #if (defined(__mips_isa_rev) && (__mips_isa_rev >= 1)) || \ defined(_MIPS_ARCH_LOONGSON2E) || defined(_MIPS_ARCH_LOONGSON2F) || \ defined(_MIPS_ARCH_MIPS4) -#define use_movnz_instructions 1 +//#define use_movnz_instructions 1 +extern bool use_movnz_instructions; #else extern bool use_movnz_instructions; #endif /* MIPS32 instruction set detection */ #if defined(__mips_isa_rev) && (__mips_isa_rev >= 1) -#define use_mips32_instructions 1 +//#define use_mips32_instructions 1 +extern bool use_mips32_instructions; #else extern bool use_mips32_instructions; #endif /* MIPS32R2 instruction set detection */ #if defined(__mips_isa_rev) && (__mips_isa_rev >= 2) -#define use_mips32r2_instructions 1 +//#define use_mips32r2_instructions 1 +extern bool use_mips32r2_instructions; #else extern bool use_mips32r2_instructions; #endif diff --git a/tcg/mips/tcg-target.inc.c b/tcg/mips/tcg-target.inc.c index e6479e4..ab86b3b 100644 --- a/tcg/mips/tcg-target.inc.c +++ b/tcg/mips/tcg-target.inc.c @@ -2309,7 +2309,8 @@ static void tcg_target_detect_isa(void) "movz $zero, $zero, $zero\n" ".set pop\n" : : : ); -use_movnz_instructions = !got_sigill; +//use_movnz_instructions = !got_sigill; +use_movnz_instructions = 0; #endif /* Probe for MIPS32 instructions. As no subsetting is allowed @@ -2322,7 +2323,8 @@ static void tcg_target_detect_isa(void) "mul $zero, $zero\n" ".set pop\n" : : : ); -use_mips32_instructions = !got_sigill; +//use_mips32_instructions = !got_sigill; +use_mips32_instructions = 0; #endif /* Probe for MIPS32r2 instructions if MIPS32 instructions are @@ -2336,7 +2338,8 @@ static void tcg_target_detect_isa(void) "seb $zero, $zero\n" ".set pop\n" : : : ); -use_mips32r2_instructions = !got_sigill; +//use_mips32r2_instructions = !got_sigill; +use_mips32r2_instructions = 0; } #endif
Re: [Qemu-devel] [PATCH v5 00/10] tcg mips64 and mips r6 improvements
On 12/01/2016 05:51 AM, Jin Guojie wrote: > Changes in v5: > * Update against master(v2.8.0-rc2) > * Fix a bug: 64-bit big-endian guests hang on mips64 little-endian > hosts, and vice versa. This bug was first introduced in v2 patch, > due to obvious misuse of ret/arg registers in tcg_out_bswap64(). > > tcg_out_opc_reg(s, OPC_DSBH, ret, 0, arg); > - tcg_out_opc_reg(s, OPC_DSHD, ret, 0, arg); > + tcg_out_opc_reg(s, OPC_DSHD, ret, 0, ret); Oops. Thanks. > * Fix a style problem: checkpatch.pl forbids 'extern' to be used in .c. > > ERROR: externs should be avoided in .c files > #28: FILE: tcg/mips/tcg-target.inc.c:39: > +extern int link_error(void); Hmph. This is checkpatch not being helpful, but your change is fine. I've looked at a diff between your current patch and the last update I had to my branch. This probably includes code that post-dates the v2 that you started with. There are two things I'd like to point out: @@ -1233,7 +1235,8 @@ static void tcg_out_tlb_load(TCGContext *s, TCGReg base, TCGReg addrl, if (a_bits < s_bits) { a_bits = s_bits; } -mask = (target_ulong)TARGET_PAGE_MASK | ((1 << a_bits) - 1); + +mask = TARGET_PAGE_MASK | ((1 << a_bits) - 1); You need the target_ulong cast here for mips64. See patch ebb90a005da67147245cd38fb04a965a87a961b7. if (TCG_TARGET_REG_BITS > TARGET_LONG_BITS) { -tcg_out_ext32u(s, base, addrl); -addrl = base; +tcg_out_ext32u(s, TCG_TMP0, TCG_TMP0); } Why did you make this change? It is definitely wrong. We should be zero-extending the guest address, not the address that we just loaded from CMP_OFF. This is because we need to add an unsigned 32-bit quantity to the full 64-bit host pointer that we load from ADD_OFF. Did you notice a compare problem between TMP1 and TMP0 below? If so, I believe that a partial solution is the TARGET_PAGE_MASK change above. I guess we also need to do tcg_out_ldst(s, (TARGET_LONG_BITS == 64 ? OPC_LD TCG_TARGET_REG_BITS == 64 : OPC_LWU : OPC_LW), TCG_TMP0, TCG_REG_A0, cmp_off); in the else section of the tlb comparator load above, since mask will be 32-bit unsigned for TARGET_LONG_BITS == 32, and we need an unsigned quantity to compare that against. r~ PS: Ignoring N32 for now. But as a note to ourselves for future: What is the proper combination of zero/sign-extend vs ADDU/DADDU for 32/64-bit guests and an N32 host? What we have now is technically illegal, with zero-extend + ADDU. But I can imagine several valid scenarios.
[Qemu-devel] [PATCH v5 00/10] tcg mips64 and mips r6 improvements
Changes in v5: * Update against master(v2.8.0-rc2) * Fix a bug: 64-bit big-endian guests hang on mips64 little-endian hosts, and vice versa. This bug was first introduced in v2 patch, due to obvious misuse of ret/arg registers in tcg_out_bswap64(). tcg_out_opc_reg(s, OPC_DSBH, ret, 0, arg); - tcg_out_opc_reg(s, OPC_DSHD, ret, 0, arg); + tcg_out_opc_reg(s, OPC_DSHD, ret, 0, ret); * Fix a style problem: checkpatch.pl forbids 'extern' to be used in .c. ERROR: externs should be avoided in .c files #28: FILE: tcg/mips/tcg-target.inc.c:39: +extern int link_error(void); Simply comment the type identifier to pass the check. * Tested successfully on following machines: | HOST| qemu-system | Debian ISO | |-| | mips 32 le |i386 |i386 | | mips 32 le |x86_64 |i386 | | mips 32 le |x86_64 |amd64| | mips 64 le |i386 |i386 | | mips 64 le |x86_64 |i386 | | mips 64 le |x86_64 |amd64| | mips 64 le | mips 64 be | mips 64 be | |-| | mips 32 be |i386 | i386| | mips 32 be |x86_64 | i386| | mips 32 be |x86_64 | amd64 | | mips 64 be |i386 | i386| | mips 64 be |x86_64 | i386| | mips 64 be |x86_64 | amd64 | | mips n32 be |386 | i386| | mips n32 be |x86_64 | i386| | mips n32 be |x86_64 | amd64 | (No plan to test MIPS R6 in this patch.) Summary of changes from v4: | tcg-mips: Support 64-bit opcodes | Fix tcg_out_bswap64() | | tcg-mips: Adjust qemu_ld/st for mips64 | Fix a style problem | Changes in v4: * tcg_out_qemu_ld_slow_path: always sign-extend 32-bit loads. Provide a better solution than patch11 in v3. Fix the blocking bug when emulating i386 kernel on mips64el. * Redefine LO_OFF/HI_OFF as v2. On mips64 host, they are defined as link_error, to ensure that all paths that lead to the use of the symbol are eliminated by the compiler. Changes in v3: * Update against master(v2.8.0-rc1) * Tested on Loongson as mips32r2(el) and mips64r2(el) hosts. Loongson only implements little-endian mips32/mips64 ISA. * Fully work for 32-bit and 64-bit guests. Fix two bugs???segmentation fault on mips64el with 32-bit guests, blocking when emulating i386 kernel on mips64el. * Fix some minor style problems. * PATCH v2 12~16 are not examined due to the lack of R6 machine. Jin Guojie (10): tcg-mips: Move bswap code to a subroutine tcg-mips: Add mips64 opcodes tcg-mips: Support 64-bit opcodes tcg-mips: Add bswap32u and bswap64 tcg-mips: Adjust move functions for mips64 tcg-mips: Adjust load/store functions for mips64 tcg-mips: Adjust prologue for mips64 tcg-mips: Add tcg unwind info tcg-mips: Adjust calling conventions for mips64 tcg-mips: Adjust qemu_ld/st for mips64 Cc: Aurelien Jarno Cc: James Hogan Tested-by: YunQiang Su Signed-off-by: Richard Henderson Signed-off-by: Jin Guojie tcg/mips/tcg-target.h | 60 ++- tcg/mips/tcg-target.inc.c | 1170 +++-- 2 files changed, 977 insertions(+), 253 deletions(-) -- 2.1.0