Re: [Qemu-devel] [PATCH 00/57] target-i386 eflags cleanup and bmi/adx extensions
Thanks, applied.

On Thu, Jan 24, 2013 at 4:02 AM, Richard Henderson r...@twiddle.net wrote:
> This is a re-working of Paolo's eflags cleanup from October, which I
> consider a pre-requisite to implementing the ADX extension.
> [...]
Re: [Qemu-devel] [PATCH 00/57] target-i386 eflags cleanup and bmi/adx extensions
> Checkpatch doesn't work well with the pattern
>
> #ifdef SOMETHING
>     if (foo) {
>         bar();
>     } else
> #endif
>     {
>         baz1();
>         baz2();
>     }
>
> Which is exactly the case for all three errors reported in this series.
> I know of no other good way to arrange this pattern.

    if (0) {
#ifdef SOMETHING
    } else if (foo) {
        bar();
#endif
    } else {
        baz1();
        baz2();
    }

Jay.
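The two arrangements above can be compared side by side. What follows is a minimal sketch, not code from the series: `enum arm`, `dangling_else` and `if_zero_chain` are hypothetical names, and each arm records a tag instead of calling bar(), baz1() and baz2(). It is meant to show that both forms select the same arm whether or not SOMETHING is defined, while only the second keeps braces on every arm.

```c
/* Hypothetical tags standing in for "bar() ran" and "baz1()/baz2() ran". */
enum arm { ARM_BAR, ARM_BAZ };

/* Original arrangement: the "else" sits in front of #endif, so the braced
 * block below is either the else-arm or, without SOMETHING, a plain block. */
enum arm dangling_else(int foo)
{
    enum arm taken = ARM_BAZ;

    (void)foo;                  /* foo is unused when SOMETHING is undefined */
#ifdef SOMETHING
    if (foo) {
        taken = ARM_BAR;
    } else
#endif
    {
        taken = ARM_BAZ;
    }
    return taken;
}

/* Alternative arrangement: "if (0)" only anchors the chain, so every arm
 * that survives preprocessing is a fully braced else-branch. */
enum arm if_zero_chain(int foo)
{
    enum arm taken = ARM_BAZ;

    (void)foo;
    if (0) {
#ifdef SOMETHING
    } else if (foo) {
        taken = ARM_BAR;
#endif
    } else {
        taken = ARM_BAZ;
    }
    return taken;
}
```

Building once with -DSOMETHING and once without exercises both configurations; in either build the two functions should agree for every value of foo.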
Re: [Qemu-devel] [PATCH 00/57] target-i386 eflags cleanup and bmi/adx extensions
On Thu, Jan 24, 2013 at 9:37 PM, Richard Henderson r...@twiddle.net wrote:
> On 01/24/2013 08:57 AM, Laurent Desnogues wrote:
>> On Thu, Jan 24, 2013 at 5:52 PM, Richard Henderson r...@twiddle.net wrote:
>>> On 2013-01-24 08:46, Laurent Desnogues wrote:
>>>> I gave a quick try at your branch.  My host is an x86_64 CPU and I
>>>> ran an i386 nbench in user mode.  It works but some parts of the
>>>> benchmark are noticeably slower (10%).  Is that expected?
>>>
>>> Nope.  Everything in there should be about speeding up...
>>>
>>> I'll have a look at it and see if there's something obvious.
>>
>> Let me know if you need more information or the binary (I compiled
>> it some time ago with the oldest compiler I could find, gcc 2.96).
>
> Would you look and see how much variability you're getting?
>
> I had a quick look with a (newly built) nbench binary and don't see any
> speed regressions outside the error bars.  Built with gcc 4.7.2, 4 trials
> each:
>
>             Master                  Eflags3
>             Avg         Stddev      Avg         Stddev        Change   Error
> Num S       585.92      18.25       573.79      4.39          -2.07%   3.12%
> String S    51.14       1.10        51.52       0.13           0.73%   2.15%
> Bitfield    1.64E+008   4.04E+006   1.62E+008   8.63E+005     -1.32%   2.46%
> FP Emu      85.65       1.81        114.74      1.18          33.97%   2.12%
> Fourier     1365.03     28.79       2813.78     11.72        106.13%   2.11%
> Assign      14.86       0.24        14.89       0.21           0.22%   1.62%
> Idea        723.70      43.31       884.20      4.55          22.18%   5.98%
> Huff        495.27      8.72        702.89      3.53          41.92%   1.76%
> N Net       0.29        0.01        0.73        0.00         149.99%   1.78%
> LU Decomp   9.26        0.16        21.91       0.22         136.61%   1.70%
>
> I haven't looked to see where the massive fp improvements come from, but
> my first guess is not storing cc_op so often.
>
> Although perhaps it would keep us on the same page if we were talking
> about the exact same binary...

Here are my results (5 runs):

            master         stddev      eflags3        stddev      eflags3/master
NUMERIC SO  488.59         3.82        507.37         3.87        1.04
STRING SOR  42.10          0.27        43.51          0.13        1.03
BITFIELD    105344000.00   885426.45   91472800.00    426835.68   0.87
FP EMULATI  22.70          0.54        23.07          0.38        1.02
FOURIER     2669.56        16.12       2576.34        29.46       0.97
ASSIGNMENT  8.40           0.11        7.68           0.02        0.91
IDEA        1535.88        7.73        1620.72        3.80        1.06
HUFFMAN     212.21         2.08        214.34         1.28        1.01
NEURAL NET  0.75           0.00        0.76           0.00        1.01
LU DECOMPO  24.25          0.06        24.75          0.08        1.02

Host is an i7-920 running gcc 4.6.1 x86_64.  nbench was compiled with
gcc 2.96 targeting 32-bit.  The binary I have is old and has been
stripped and I alas have no access to gcc 2.96 anymore.

Note that using a more recent compiler doesn't show that much difference.

Laurent
Re: [Qemu-devel] [PATCH 00/57] target-i386 eflags cleanup and bmi/adx extensions
Hi,

Thank you for submitting your patch series.  checkpatch.pl has detected
that one or more of the patches in this series violate the QEMU coding
style.

If you believe this message was sent in error, please ignore it or
respond here with an explanation.

Otherwise, please correct the coding style issues and resubmit a new
version of the patch.

For more information about QEMU coding style, see:

http://git.qemu.org/?p=qemu.git;a=blob_plain;f=CODING_STYLE;hb=HEAD

Here is the output from checkpatch.pl:

Subject: target-i386: Implement tzcnt and fix lzcnt
Subject: target-i386: Simplify bsf/bsr flags computation
Subject: target-i386: Use clz/ctz for bsf/bsr helpers
Subject: target-i386: Implement ADX extension
WARNING: braces {} are necessary for all arms of this statement
#221: FILE: target-i386/translate.c:4247:
+if (ot == OT_LONG) {
[...]
+} else
[...]

total: 0 errors, 1 warnings, 213 lines checked

Your patch has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
Subject: target-i386: Implement RORX
Subject: target-i386: Implement SHLX, SARX, SHRX
Subject: target-i386: Implement PDEP, PEXT
Subject: target-i386: Implement MULX
WARNING: braces {} are necessary for all arms of this statement
#55: FILE: target-i386/translate.c:4111:
+if (s->dflag == 2) {
[...]
+} else
[...]

total: 0 errors, 1 warnings, 62 lines checked

Your patch has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
Subject: target-i386: Implement BZHI
Subject: target-i386: Implement BLSR, BLSMSK, BLSI
Subject: target-i386: Implement BEXTR
Subject: target-i386: Implement ANDN
Subject: target-i386: Implement MOVBE
Subject: target-i386: Decode the VEX prefixes
Subject: target-i386: Tidy prefix parsing
Subject: target-i386: Make helper_cc_compute_all const
Subject: target-i386: Don't reference ENV through most of cc helpers
Subject: target-i386: Use CC_SRC2 for ADC and SBB
Subject: target-i386: optimize flags checking after sub using CC_SRC2
Subject: target-i386: Update cc_op before TCG branches
Subject: target-i386: introduce gen_jcc1_noeob
Subject: target-i386: use gen_op for cmps/scas
Subject: target-i386: kill cpu_T3
Subject: target-i386: expand cmov via movcond
Subject: target-i386: introduce gen_cmovcc1
WARNING: braces {} are necessary for all arms of this statement
#34: FILE: target-i386/translate.c:2428:
+if (ot == OT_LONG) {
[...]
+} else
[...]

total: 0 errors, 1 warnings, 82 lines checked

Your patch has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
Subject: target-i386: cleanup temporary macros for CCPrepare
Subject: target-i386: inline gen_prepare_cc_slow
Subject: target-i386: use CCPrepare to generate conditional jumps
Subject: target-i386: introduce gen_prepare_cc
Subject: target-i386: introduce CCPrepare
Subject: target-i386: optimize setcc instructions
Subject: target-i386: optimize setle
Subject: target-i386: optimize setbe
Subject: target-i386: change gen_setcc_slow_T0 to gen_setcc_slow
Subject: target-i386: convert gen_compute_eflags_c to TCG
Subject: target-i386: use inverted setcond when computing NS or NZ
Subject: target-i386: do not call helper to compute ZF/SF
Subject: target-i386: Move CC discards to set_cc_op
Subject: target-i386: no need to flush out cc_op before gen_eob
Subject: target-i386: do not compute eflags multiple times consecutively
Subject: target-i386: add helper functions to get other flags
Subject: target-i386: Use gen_update_cc_op everywhere
Subject: target-i386: Don't clobber s->cc_op in gen_update_cc_op
Subject: target-i386: Introduce set_cc_op
Subject: target-i386: Name the cc_op enumeration
Subject: target-i386: factor gen_op_set_cc_op/tcg_gen_discard_tl around computing flags
Subject: target-i386: use gen_jcc1 to compile loopz
Subject: target-i386: clean up sahf
Subject: target-i386: compute eflags outside rcl/rcr helper
Subject: target-i386: move eflags computation closer to gen_op_set_cc_op
Subject: target-i386: move carry computation for inc/dec closer to gen_op_set_cc_op
Subject: target-i386: drop cc_op argument of gen_jcc1
Subject: target-i386: factor setting of s->cc_op handling for string functions
Subject: target-i386: introduce gen_ext_tl
Subject: target-i386: use OT_* consistently
Subject: test-i386: make it compile with a recent gcc
Subject: test-i386: QEMU_PACKED is not defined here

Regards,

Anthony Liguori
Re: [Qemu-devel] [PATCH 00/57] target-i386 eflags cleanup and bmi/adx extensions
On 2013-01-25 06:18, Anthony Liguori wrote:
> Hi,
>
> Thank you for submitting your patch series.  checkpatch.pl has detected
> that one or more of the patches in this series violate the QEMU coding
> style.
>
> If you believe this message was sent in error, please ignore it or
> respond here with an explanation.
...
> Subject: target-i386: Implement ADX extension
> WARNING: braces {} are necessary for all arms of this statement
> #221: FILE: target-i386/translate.c:4247:
> +if (ot == OT_LONG) {
> [...]
> +} else
> [...]

Checkpatch doesn't work well with the pattern

#ifdef SOMETHING
    if (foo) {
        bar();
    } else
#endif
    {
        baz1();
        baz2();
    }

Which is exactly the case for all three errors reported in this series.
I know of no other good way to arrange this pattern.

r~
Re: [Qemu-devel] [PATCH 00/57] target-i386 eflags cleanup and bmi/adx extensions
On 01/25/2013 11:10 AM, Richard Henderson wrote:
> Checkpatch doesn't work well with the pattern
>
> #ifdef SOMETHING
>     if (foo) {
>         bar();
>     } else
> #endif
>     {
>         baz1();
>         baz2();
>     }
>
> Which is exactly the case for all three errors reported in this series.
> I know of no other good way to arrange this pattern.

#ifdef SOMETHING
# define SOMETHING_WITNESS 1
#else
# define SOMETHING_WITNESS 0
#endif

    if (foo && SOMETHING_WITNESS) {
        bar();
    } else {
        baz1();
        baz2();
    }

That is, hoist your #ifdeffery earlier into the file, and then you can
avoid #ifdefs inside the function body, and thus avoid the checkpatch
complaints; plus you get the benefit of testing that the code for
SOMETHING compiles cleanly even when SOMETHING is not defined.

--
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org
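The witness-macro suggestion can be written out as a complete translation unit. This is only a sketch under stated assumptions: `dispatch` is a hypothetical wrapper, and bar() and baz() are hypothetical stand-ins that return a tag so the chosen arm is observable. It assumes bar() is declared in both configurations, which is exactly the precondition questioned later in the thread.

```c
/* Hoisted once near the top of the file: turn "is SOMETHING defined?"
 * into an ordinary constant that can appear in a C expression. */
#ifdef SOMETHING
# define SOMETHING_WITNESS 1
#else
# define SOMETHING_WITNESS 0
#endif

/* Hypothetical stand-ins; both must exist in both configurations. */
int bar(void) { return 1; }
int baz(void) { return 2; }

/* No #ifdef inside the function body: every arm is braced, and the bar()
 * call is still parsed and type-checked when SOMETHING is undefined. */
int dispatch(int foo)
{
    if (foo && SOMETHING_WITNESS) {
        return bar();
    } else {
        return baz();
    }
}
```

When SOMETHING_WITNESS is 0 the condition is a compile-time constant false, so an optimizing compiler can be expected to drop the bar() arm, although the call must still compile and, in an unoptimized build, may still need to link.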
Re: [Qemu-devel] [PATCH 00/57] target-i386 eflags cleanup and bmi/adx extensions
On 2013-01-25 10:16, Eric Blake wrote:
>> Which is exactly the case for all three errors reported in this series.
>> I know of no other good way to arrange this pattern.
>
> #ifdef SOMETHING
> # define SOMETHING_WITNESS 1
> #else
> # define SOMETHING_WITNESS 0
> #endif
>
>     if (foo && SOMETHING_WITNESS) {
>         bar();
>     } else {
>         baz1();
>         baz2();
>     }
>
> That is, hoist your #ifdeffery earlier into the file, and then you can
> avoid #ifdefs inside the function body, and thus avoid the checkpatch
> complaints; plus you get the benefit of testing that the code for
> SOMETHING compiles cleanly even when SOMETHING is not defined.

Well, in this case bar is not present when SOMETHING is undefined, which
means that it definitely won't compile.

Fixing that is a significant amount of work inside tcg/tcg-op.h, for which
this patch series should not be held up.

r~
Re: [Qemu-devel] [PATCH 00/57] target-i386 eflags cleanup and bmi/adx extensions
On 01/25/2013 11:40 AM, Richard Henderson wrote:
> On 2013-01-25 10:16, Eric Blake wrote:
>>> Which is exactly the case for all three errors reported in this series.
>>> I know of no other good way to arrange this pattern.
>>
>> #ifdef SOMETHING
>> # define SOMETHING_WITNESS 1
>> #else
>> # define SOMETHING_WITNESS 0
>> #endif
>>
>>     if (foo && SOMETHING_WITNESS) {
>>         bar();
>>     } else {
>>         baz1();
>>         baz2();
>>     }
>>
>> That is, hoist your #ifdeffery earlier into the file, and then you can
>> avoid #ifdefs inside the function body, and thus avoid the checkpatch
>> complaints; plus you get the benefit of testing that the code for
>> SOMETHING compiles cleanly even when SOMETHING is not defined.
>
> Well, in this case bar is not present when SOMETHING is undefined, which
> means that it definitely won't compile.

Even that can be fixed:

#ifdef SOMETHING
# define SOMETHING_WITNESS 1
#else
# define SOMETHING_WITNESS 0
# define bar() ((void) 0)
#endif

> Fixing that is a significant amount of work inside tcg/tcg-op.h, for which
> this patch series should not be held up.

True, which is why I will leave it up to the maintainers whether to take
your patch in spite of the checkpatch complaints.  I was merely pointing
out that a cleanup is possible, not that it is mandatory for acceptance.

--
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org
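The stub variant can be sketched the same way. In this illustration the counters `calls_to_bar` and `calls_to_baz` and the wrapper `dispatch` are hypothetical, added only so the effect is observable; the real bar() in the series would be a TCG op generator, and nothing here reflects the actual contents of tcg/tcg-op.h.

```c
/* Hypothetical counters so a caller can see which arm ran. */
int calls_to_bar;
int calls_to_baz;

#ifdef SOMETHING
# define SOMETHING_WITNESS 1
/* Real implementation, only available in this configuration. */
void bar(void) { calls_to_bar++; }
#else
# define SOMETHING_WITNESS 0
/* Stub: lets "bar();" compile to nothing when SOMETHING is undefined. */
# define bar() ((void) 0)
#endif

void baz1(void) { calls_to_baz++; }
void baz2(void) { calls_to_baz++; }

void dispatch(int foo)
{
    if (foo && SOMETHING_WITNESS) {
        bar();
    } else {
        baz1();
        baz2();
    }
}
```

A function-like macro stub of this shape only covers calls written with empty parentheses and used as statements. Helpers that take arguments or produce values would each need a matching stub, which is presumably the kind of effort the objection about tcg/tcg-op.h refers to.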
Re: [Qemu-devel] [PATCH 00/57] target-i386 eflags cleanup and bmi/adx extensions
On 24.01.2013 05:02, Richard Henderson wrote:
>  target-i386/cpu.c                   |   18 +-
>  target-i386/cpu.h                   |   24 +-

You forgot to CC me: Please point me to where in those 57 patches you are
touching the core CPU code.

Given the size of the series I assume this is 1.5 material?

Thanks,
Andreas

--
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg
Re: [Qemu-devel] [PATCH 00/57] target-i386 eflags cleanup and bmi/adx extensions
On 2013-01-24 01:35, Andreas Färber wrote:
> On 24.01.2013 05:02, Richard Henderson wrote:
>>  target-i386/cpu.c                   |   18 +-
>>  target-i386/cpu.h                   |   24 +-
>
> You forgot to CC me: Please point me to where in those 57 patches you are
> touching the core CPU code.

cpu.c:
  0045: Implement MOVBE
  0046: Implement ANDN
  0054: Implement ADX

These modify the TCG_*_FEATURES defines as I add new extensions to the
translator.

cpu.h:
  0009: compute eflags outside rcl
  0013: Name the cc_op enum
  0039: optimize flags checking after sub
  0040: Use CC_SRC2 for ADC
  0048: Implement BLSR
  0054: Implement ADX

These primarily have to do with the CC_OP enumeration, and the contents of
CPUX86State.

> Given the size of the series I assume this is 1.5 material?

I'd imagine.

r~
Re: [Qemu-devel] [PATCH 00/57] target-i386 eflags cleanup and bmi/adx extensions
On Thu, Jan 24, 2013 at 5:02 AM, Richard Henderson r...@twiddle.net wrote:
> This is a re-working of Paolo's eflags cleanup from October, which I
> consider a pre-requisite to implementing the ADX extension.
>
> I've rearranged most of the patches in trivial ways, and some quite
> significantly.
>
> I've tested the result by running the FC17 installer in both i386 and
> x86_64 mode, and the bmi/adx extensions with small user-land test cases.
>
> The patch series is at
>
>   git://github.com/rth7680/qemu.git eflags3

I gave a quick try at your branch.  My host is an x86_64 CPU and I ran an
i386 nbench in user mode.  It works but some parts of the benchmark are
noticeably slower (10%).  Is that expected?

Thanks,

Laurent

> Please review.
> [...]
Re: [Qemu-devel] [PATCH 00/57] target-i386 eflags cleanup and bmi/adx extensions
On 2013-01-24 08:46, Laurent Desnogues wrote:
> I gave a quick try at your branch.  My host is an x86_64 CPU and I ran an
> i386 nbench in user mode.  It works but some parts of the benchmark are
> noticeably slower (10%).  Is that expected?

Nope.  Everything in there should be about speeding up...

I'll have a look at it and see if there's something obvious.

r~
Re: [Qemu-devel] [PATCH 00/57] target-i386 eflags cleanup and bmi/adx extensions
On 24.01.2013 17:17, Richard Henderson wrote:
> On 2013-01-24 01:35, Andreas Färber wrote:
>> On 24.01.2013 05:02, Richard Henderson wrote:
>>>  target-i386/cpu.c                   |   18 +-
>>>  target-i386/cpu.h                   |   24 +-
>>
>> You forgot to CC me: Please point me to where in those 57 patches you are
>> touching the core CPU code.
>
> cpu.c:
>   0045: Implement MOVBE
>   0046: Implement ANDN
>   0054: Implement ADX
>
> These modify the TCG_*_FEATURES defines as I add new extensions to the
> translator.
>
> cpu.h:
>   0009: compute eflags outside rcl
>   0013: Name the cc_op enum
>   0039: optimize flags checking after sub
>   0040: Use CC_SRC2 for ADC
>   0048: Implement BLSR
>   0054: Implement ADX
>
> These primarily have to do with the CC_OP enumeration, and the contents of
> CPUX86State.

So both cpu.c patches will need a slight rebase after 1.4, the CC_*
changes won't conflict and the fields added to CPUX86State look okay
too, used for TCG.  No objections or need to coordinate from my side.

My upcoming X86CPU subclasses mini-series may conflict at the same
location again but should be trivially resolvable in either order.

Cheers,
Andreas

--
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg
Re: [Qemu-devel] [PATCH 00/57] target-i386 eflags cleanup and bmi/adx extensions
On 01/24/2013 08:57 AM, Laurent Desnogues wrote:
> On Thu, Jan 24, 2013 at 5:52 PM, Richard Henderson r...@twiddle.net wrote:
>> On 2013-01-24 08:46, Laurent Desnogues wrote:
>>> I gave a quick try at your branch.  My host is an x86_64 CPU and I
>>> ran an i386 nbench in user mode.  It works but some parts of the
>>> benchmark are noticeably slower (10%).  Is that expected?
>>
>> Nope.  Everything in there should be about speeding up...
>>
>> I'll have a look at it and see if there's something obvious.
>
> Let me know if you need more information or the binary (I compiled
> it some time ago with the oldest compiler I could find, gcc 2.96).

Would you look and see how much variability you're getting?

I had a quick look with a (newly built) nbench binary and don't see any
speed regressions outside the error bars.  Built with gcc 4.7.2, 4 trials
each:

            Master                  Eflags3
            Avg         Stddev      Avg         Stddev        Change   Error
Num S       585.92      18.25       573.79      4.39          -2.07%   3.12%
String S    51.14       1.10        51.52       0.13           0.73%   2.15%
Bitfield    1.64E+008   4.04E+006   1.62E+008   8.63E+005     -1.32%   2.46%
FP Emu      85.65       1.81        114.74      1.18          33.97%   2.12%
Fourier     1365.03     28.79       2813.78     11.72        106.13%   2.11%
Assign      14.86       0.24        14.89       0.21           0.22%   1.62%
Idea        723.70      43.31       884.20      4.55          22.18%   5.98%
Huff        495.27      8.72        702.89      3.53          41.92%   1.76%
N Net       0.29        0.01        0.73        0.00         149.99%   1.78%
LU Decomp   9.26        0.16        21.91       0.22         136.61%   1.70%

I haven't looked to see where the massive fp improvements come from, but
my first guess is not storing cc_op so often.

Although perhaps it would keep us on the same page if we were talking
about the exact same binary...

r~
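For readers checking the table by hand, the Change column is consistent with the relative difference between the two averages, and the Error column looks close to Master's standard deviation divided by its average. The second point is an inference from the printed numbers, not something stated in the thread. The sketch below uses hypothetical helper names and assumes that higher nbench scores are better.

```c
/* Relative change of the candidate average against the baseline, in
 * percent.  Positive means the candidate scored higher. */
double change_pct(double baseline_avg, double candidate_avg)
{
    return (candidate_avg / baseline_avg - 1.0) * 100.0;
}

/* Relative spread of one set of trials, in percent (stddev / avg).
 * Assumption: this is how the Error column was derived. */
double rel_error_pct(double avg, double stddev)
{
    return stddev / avg * 100.0;
}

/* A drop only counts as a regression when it is larger than the error
 * bar; both arguments are percentages. */
int is_regression(double change, double error)
{
    return change < 0.0 && -change > error;
}
```

For the first row, `change_pct(585.92, 573.79)` comes out at roughly -2.07, which is inside the 3.12% error bar, matching the statement that no regression falls outside the error bars.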
[Qemu-devel] [PATCH 00/57] target-i386 eflags cleanup and bmi/adx extensions
This is a re-working of Paolo's eflags cleanup from October, which I
consider a pre-requisite to implementing the ADX extension.

I've rearranged most of the patches in trivial ways, and some quite
significantly.

I've tested the result by running the FC17 installer in both i386 and
x86_64 mode, and the bmi/adx extensions with small user-land test cases.

The patch series is at

  git://github.com/rth7680/qemu.git eflags3

Please review.


r~


Paolo Bonzini (19):
  test-i386: QEMU_PACKED is not defined here
  test-i386: make it compile with a recent gcc
  target-i386: use OT_* consistently
  target-i386: introduce gen_ext_tl
  target-i386: factor setting of s->cc_op handling for string functions
  target-i386: drop cc_op argument of gen_jcc1
  target-i386: move carry computation for inc/dec closer to gen_op_set_cc_op
  target-i386: move eflags computation closer to gen_op_set_cc_op
  target-i386: compute eflags outside rcl/rcr helper
  target-i386: clean up sahf
  target-i386: use gen_jcc1 to compile loopz
  target-i386: factor gen_op_set_cc_op/tcg_gen_discard_tl around computing flags
  target-i386: add helper functions to get other flags
  target-i386: change gen_setcc_slow_T0 to gen_setcc_slow
  target-i386: optimize setcc instructions
  target-i386: use CCPrepare to generate conditional jumps
  target-i386: cleanup temporary macros for CCPrepare
  target-i386: introduce gen_cmovcc1
  target-i386: kill cpu_T3

Richard Henderson (38):
  target-i386: Name the cc_op enumeration
  target-i386: Introduce set_cc_op
  target-i386: Don't clobber s->cc_op in gen_update_cc_op
  target-i386: Use gen_update_cc_op everywhere
  target-i386: do not compute eflags multiple times consecutively
  target-i386: no need to flush out cc_op before gen_eob
  target-i386: Move CC discards to set_cc_op
  target-i386: do not call helper to compute ZF/SF
  target-i386: use inverted setcond when computing NS or NZ
  target-i386: convert gen_compute_eflags_c to TCG
  target-i386: optimize setbe
  target-i386: optimize setle
  target-i386: introduce CCPrepare
  target-i386: introduce gen_prepare_cc
  target-i386: inline gen_prepare_cc_slow
  target-i386: expand cmov via movcond
  target-i386: use gen_op for cmps/scas
  target-i386: introduce gen_jcc1_noeob
  target-i386: Update cc_op before TCG branches
  target-i386: optimize flags checking after sub using CC_SRC2
  target-i386: Use CC_SRC2 for ADC and SBB
  target-i386: Don't reference ENV through most of cc helpers
  target-i386: Make helper_cc_compute_all const
  target-i386: Tidy prefix parsing
  target-i386: Decode the VEX prefixes
  target-i386: Implement MOVBE
  target-i386: Implement ANDN
  target-i386: Implement BEXTR
  target-i386: Implement BLSR, BLSMSK, BLSI
  target-i386: Implement BZHI
  target-i386: Implement MULX
  target-i386: Implement PDEP, PEXT
  target-i386: Implement SHLX, SARX, SHRX
  target-i386: Implement RORX
  target-i386: Implement ADX extension
  target-i386: Use clz/ctz for bsf/bsr helpers
  target-i386: Simplify bsf/bsr flags computation
  target-i386: Implement tzcnt and fix lzcnt

 target-i386/cc_helper.c             |  243 ++--
 target-i386/cc_helper_template.h    |  268 ++---
 target-i386/cpu.c                   |   18 +-
 target-i386/cpu.h                   |   24 +-
 target-i386/helper.c                |   11 +-
 target-i386/helper.h                |   12 +-
 target-i386/int_helper.c            |   69 +-
 target-i386/shift_helper_template.h |   12 +-
 target-i386/translate.c             | 2195 +--
 tests/tcg/test-i386.c               |   10 +-
 10 files changed, 1581 insertions(+), 1281 deletions(-)

--
1.7.11.7