Re: [Qemu-devel] [PATCH 00/57] target-i386 eflags cleanup and bmi/adx extensions

2013-02-23 Thread Blue Swirl
Thanks, applied.

On Thu, Jan 24, 2013 at 4:02 AM, Richard Henderson r...@twiddle.net wrote:
 This is a re-working of Paolo's eflags cleanup from October, which
 I consider a pre-requisite to implementing the ADX extension.  I've
 rearranged most of the patches in trivial ways, and some quite
 significantly.

 I've tested the result by running the FC17 installer in both i386
 and x86_64 mode, and the bmi/adx extensions with small user-land
 test cases.

 The patch series is at

   git://github.com/rth7680/qemu.git eflags3

 Please review.


 r~


 Paolo Bonzini (19):
   test-i386: QEMU_PACKED is not defined here
   test-i386: make it compile with a recent gcc
   target-i386: use OT_* consistently
   target-i386: introduce gen_ext_tl
   target-i386: factor setting of s->cc_op handling for string functions
   target-i386: drop cc_op argument of gen_jcc1
   target-i386: move carry computation for inc/dec closer to
 gen_op_set_cc_op
   target-i386: move eflags computation closer to gen_op_set_cc_op
   target-i386: compute eflags outside rcl/rcr helper
   target-i386: clean up sahf
   target-i386: use gen_jcc1 to compile loopz
   target-i386: factor gen_op_set_cc_op/tcg_gen_discard_tl around
 computing flags
   target-i386: add helper functions to get other flags
   target-i386: change gen_setcc_slow_T0 to gen_setcc_slow
   target-i386: optimize setcc instructions
   target-i386: use CCPrepare to generate conditional jumps
   target-i386: cleanup temporary macros for CCPrepare
   target-i386: introduce gen_cmovcc1
   target-i386: kill cpu_T3

 Richard Henderson (38):
   target-i386: Name the cc_op enumeration
   target-i386: Introduce set_cc_op
   target-i386: Don't clobber s->cc_op in gen_update_cc_op
   target-i386: Use gen_update_cc_op everywhere
   target-i386: do not compute eflags multiple times consecutively
   target-i386: no need to flush out cc_op before gen_eob
   target-i386: Move CC discards to set_cc_op
   target-i386: do not call helper to compute ZF/SF
   target-i386: use inverted setcond when computing NS or NZ
   target-i386: convert gen_compute_eflags_c to TCG
   target-i386: optimize setbe
   target-i386: optimize setle
   target-i386: introduce CCPrepare
   target-i386: introduce gen_prepare_cc
   target-i386: inline gen_prepare_cc_slow
   target-i386: expand cmov via movcond
   target-i386: use gen_op for cmps/scas
   target-i386: introduce gen_jcc1_noeob
   target-i386: Update cc_op before TCG branches
   target-i386: optimize flags checking after sub using CC_SRC2
   target-i386: Use CC_SRC2 for ADC and SBB
   target-i386: Don't reference ENV through most of cc helpers
   target-i386: Make helper_cc_compute_all const
   target-i386: Tidy prefix parsing
   target-i386: Decode the VEX prefixes
   target-i386: Implement MOVBE
   target-i386: Implement ANDN
   target-i386: Implement BEXTR
   target-i386: Implement BLSR, BLSMSK, BLSI
   target-i386: Implement BZHI
   target-i386: Implement MULX
   target-i386: Implement PDEP, PEXT
   target-i386: Implement SHLX, SARX, SHRX
   target-i386: Implement RORX
   target-i386: Implement ADX extension
   target-i386: Use clz/ctz for bsf/bsr helpers
   target-i386: Simplify bsf/bsr flags computation
   target-i386: Implement tzcnt and fix lzcnt

  target-i386/cc_helper.c             |  243 ++--
  target-i386/cc_helper_template.h    |  268 ++---
  target-i386/cpu.c                   |   18 +-
  target-i386/cpu.h                   |   24 +-
  target-i386/helper.c                |   11 +-
  target-i386/helper.h                |   12 +-
  target-i386/int_helper.c            |   69 +-
  target-i386/shift_helper_template.h |   12 +-
  target-i386/translate.c             | 2195 +--
  tests/tcg/test-i386.c               |   10 +-
  10 files changed, 1581 insertions(+), 1281 deletions(-)

 --
 1.7.11.7




Re: [Qemu-devel] [PATCH 00/57] target-i386 eflags cleanup and bmi/adx extensions

2013-01-28 Thread Jay Foad
 Checkpatch doesn't work well with the pattern

 #ifdef SOMETHING
     if (foo) {
         bar();
     } else
 #endif
     {
         baz1();
         baz2();
     }

 Which is exactly the case for all three errors reported in this series.
 I know of no other good way to arrange this pattern.

    if (0) {
#ifdef SOMETHING
    } else if (foo) {
        bar();
#endif
    } else {
        baz1();
        baz2();
    }
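
A minimal, compilable sketch of the same arrangement, for anyone who wants to try it. The names foo, bar, baz1 and baz2 are the placeholders from this thread, and the call counters are hypothetical additions used only to observe which arm runs. The leading "if (0)" arm is dead code whose only job is to give the conditional "else if" something to attach to, so every arm keeps its braces.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the thread's placeholders; not QEMU code. */
int calls_bar, calls_baz;

#ifdef SOMETHING
static void bar(void) { calls_bar++; }
#endif
static void baz1(void) { calls_baz++; }
static void baz2(void) { calls_baz++; }

static void dispatch(bool foo)
{
    (void)foo; /* otherwise unused when SOMETHING is not defined */

    if (0) {
        /* Never taken; exists so the #ifdef'd arm can be an "else if". */
#ifdef SOMETHING
    } else if (foo) {
        bar();
#endif
    } else {
        baz1();
        baz2();
    }
}
```

Built without -DSOMETHING, dispatch() always takes the baz arm; built with it, dispatch(true) calls bar() instead.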

Jay.



Re: [Qemu-devel] [PATCH 00/57] target-i386 eflags cleanup and bmi/adx extensions

2013-01-26 Thread Laurent Desnogues
On Thu, Jan 24, 2013 at 9:37 PM, Richard Henderson r...@twiddle.net wrote:
 On 01/24/2013 08:57 AM, Laurent Desnogues wrote:

 On Thu, Jan 24, 2013 at 5:52 PM, Richard Henderson r...@twiddle.net
 wrote:

 On 2013-01-24 08:46, Laurent Desnogues wrote:


 I gave your branch a quick try. My host is an x86_64 CPU and I
 ran an i386 nbench in user mode.  It works but some parts of the
 benchmark are noticeably slower (10%).  Is that expected?


 Nope.  Everything in there should be about speeding up...

 I'll have a look at it and see if there's something obvious.


 Let me know if you need more information or the binary (I compiled
 it some time ago with the oldest compiler I could find, gcc 2.96).


 Would you look and see how much variability you're getting?  I had a quick
 look with a (newly built) nbench binary and don't see any speed regressions
 outside the error bars.

 Built with gcc 4.7.2, 4 trials each:

            Master                Eflags3
            Avg        Stddev     Avg        Stddev     Change   Error
 Num S      585.92     18.25      573.79     4.39       -2.07%   3.12%
 String S   51.14      1.10       51.52      0.13       0.73%    2.15%
 Bitfield   1.64E+008  4.04E+006  1.62E+008  8.63E+005  -1.32%   2.46%
 FP Emu     85.65      1.81       114.74     1.18       33.97%   2.12%
 Fourier    1365.03    28.79      2813.78    11.72      106.13%  2.11%
 Assign     14.86      0.24       14.89      0.21       0.22%    1.62%
 Idea       723.70     43.31      884.20     4.55       22.18%   5.98%
 Huff       495.27     8.72       702.89     3.53       41.92%   1.76%
 N Net      0.29       0.01       0.73       0.00       149.99%  1.78%
 LU Decomp  9.26       0.16       21.91      0.22       136.61%  1.70%


 I haven't looked to see where the massive fp improvements come from, but my
 first guess is not storing cc_op so often.  Although perhaps it would keep
 us on the same page if we were talking about the exact same binary...

Here are my results (5 runs):

            master        stddev     eflags3      stddev     eflags3/master
NUMERIC SO  488.59        3.82       507.37       3.87       1.04
STRING SOR  42.10         0.27       43.51        0.13       1.03
BITFIELD    105344000.00  885426.45  91472800.00  426835.68  0.87
FP EMULATI  22.70         0.54       23.07        0.38       1.02
FOURIER     2669.56       16.12      2576.34      29.46      0.97
ASSIGNMENT  8.40          0.11       7.68         0.02       0.91
IDEA        1535.88       7.73       1620.72      3.80       1.06
HUFFMAN     212.21        2.08       214.34       1.28       1.01
NEURAL NET  0.75          0.00       0.76         0.00       1.01
LU DECOMPO  24.25         0.06       24.75        0.08       1.02

Host is an i7-920 running gcc 4.6.1 x86_64.
nbench was compiled with gcc 2.96 targeting 32-bit.  The binary
I have is old and has been stripped and I alas have no access to
gcc 2.96 anymore.  Note that using a more recent compiler doesn't
show that much difference.


Laurent



Re: [Qemu-devel] [PATCH 00/57] target-i386 eflags cleanup and bmi/adx extensions

2013-01-25 Thread Anthony Liguori
Hi,

Thank you for submitting your patch series.  checkpatch.pl has
detected that one or more of the patches in this series violate
the QEMU coding style.

If you believe this message was sent in error, please ignore it
or respond here with an explanation.

Otherwise, please correct the coding style issues and resubmit a
new version of the patch.

For more information about QEMU coding style, see:

http://git.qemu.org/?p=qemu.git;a=blob_plain;f=CODING_STYLE;hb=HEAD

Here is the output from checkpatch.pl:

Subject: target-i386: Implement tzcnt and fix lzcnt
Subject: target-i386: Simplify bsf/bsr flags computation
Subject: target-i386: Use clz/ctz for bsf/bsr helpers
Subject: target-i386: Implement ADX extension
WARNING: braces {} are necessary for all arms of this statement
#221: FILE: target-i386/translate.c:4247:
+if (ot == OT_LONG) {
[...]
+} else
[...]

total: 0 errors, 1 warnings, 213 lines checked

Your patch has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

Subject: target-i386: Implement RORX
Subject: target-i386: Implement SHLX, SARX, SHRX
Subject: target-i386: Implement PDEP, PEXT
Subject: target-i386: Implement MULX
WARNING: braces {} are necessary for all arms of this statement
#55: FILE: target-i386/translate.c:4111:
+if (s->dflag == 2) {
[...]
+} else
[...]

total: 0 errors, 1 warnings, 62 lines checked

Your patch has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

Subject: target-i386: Implement BZHI
Subject: target-i386: Implement BLSR, BLSMSK, BLSI
Subject: target-i386: Implement BEXTR
Subject: target-i386: Implement ANDN
Subject: target-i386: Implement MOVBE
Subject: target-i386: Decode the VEX prefixes
Subject: target-i386: Tidy prefix parsing
Subject: target-i386: Make helper_cc_compute_all const
Subject: target-i386: Don't reference ENV through most of cc helpers
Subject: target-i386: Use CC_SRC2 for ADC and SBB
Subject: target-i386: optimize flags checking after sub using CC_SRC2
Subject: target-i386: Update cc_op before TCG branches
Subject: target-i386: introduce gen_jcc1_noeob
Subject: target-i386: use gen_op for cmps/scas
Subject: target-i386: kill cpu_T3
Subject: target-i386: expand cmov via movcond
Subject: target-i386: introduce gen_cmovcc1
WARNING: braces {} are necessary for all arms of this statement
#34: FILE: target-i386/translate.c:2428:
+if (ot == OT_LONG) {
[...]
+} else
[...]

total: 0 errors, 1 warnings, 82 lines checked

Your patch has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

Subject: target-i386: cleanup temporary macros for CCPrepare
Subject: target-i386: inline gen_prepare_cc_slow
Subject: target-i386: use CCPrepare to generate conditional jumps
Subject: target-i386: introduce gen_prepare_cc
Subject: target-i386: introduce CCPrepare
Subject: target-i386: optimize setcc instructions
Subject: target-i386: optimize setle
Subject: target-i386: optimize setbe
Subject: target-i386: change gen_setcc_slow_T0 to gen_setcc_slow
Subject: target-i386: convert gen_compute_eflags_c to TCG
Subject: target-i386: use inverted setcond when computing NS or NZ
Subject: target-i386: do not call helper to compute ZF/SF
Subject: target-i386: Move CC discards to set_cc_op
Subject: target-i386: no need to flush out cc_op before gen_eob
Subject: target-i386: do not compute eflags multiple times consecutively
Subject: target-i386: add helper functions to get other flags
Subject: target-i386: Use gen_update_cc_op everywhere
Subject: target-i386: Don't clobber s->cc_op in gen_update_cc_op
Subject: target-i386: Introduce set_cc_op
Subject: target-i386: Name the cc_op enumeration
Subject: target-i386: factor gen_op_set_cc_op/tcg_gen_discard_tl around computing flags
Subject: target-i386: use gen_jcc1 to compile loopz
Subject: target-i386: clean up sahf
Subject: target-i386: compute eflags outside rcl/rcr helper
Subject: target-i386: move eflags computation closer to gen_op_set_cc_op
Subject: target-i386: move carry computation for inc/dec closer to gen_op_set_cc_op
Subject: target-i386: drop cc_op argument of gen_jcc1
Subject: target-i386: factor setting of s->cc_op handling for string functions
Subject: target-i386: introduce gen_ext_tl
Subject: target-i386: use OT_* consistently
Subject: test-i386: make it compile with a recent gcc
Subject: test-i386: QEMU_PACKED is not defined here


Regards,

Anthony Liguori




Re: [Qemu-devel] [PATCH 00/57] target-i386 eflags cleanup and bmi/adx extensions

2013-01-25 Thread Richard Henderson

On 2013-01-25 06:18, Anthony Liguori wrote:

Hi,

Thank you for submitting your patch series.  checkpatch.pl has
detected that one or more of the patches in this series violate
the QEMU coding style.

If you believe this message was sent in error, please ignore it
or respond here with an explanation.

...

Subject: target-i386: Implement ADX extension
WARNING: braces {} are necessary for all arms of this statement
#221: FILE: target-i386/translate.c:4247:
+if (ot == OT_LONG) {
[...]
+} else
[...]


Checkpatch doesn't work well with the pattern

#ifdef SOMETHING
    if (foo) {
        bar();
    } else
#endif
    {
        baz1();
        baz2();
    }

Which is exactly the case for all three errors reported in this series.
I know of no other good way to arrange this pattern.


r~



Re: [Qemu-devel] [PATCH 00/57] target-i386 eflags cleanup and bmi/adx extensions

2013-01-25 Thread Eric Blake
On 01/25/2013 11:10 AM, Richard Henderson wrote:

 Checkpatch doesn't work well with the pattern
 
 #ifdef SOMETHING
     if (foo) {
         bar();
     } else
 #endif
     {
         baz1();
         baz2();
     }
 
 Which is exactly the case for all three errors reported in this series.
 I know of no other good way to arrange this pattern.

#ifdef SOMETHING
# define SOMETHING_WITNESS 1
#else
# define SOMETHING_WITNESS 0
#endif

if (foo && SOMETHING_WITNESS) {
    bar();
} else {
    baz1();
    baz2();
}

That is, hoist your #ifdeffery earlier into the file, and then you can
avoid #ifdefs inside the function body, and thus avoid the checkpatch
complaints; plus you get the benefit of testing that the code for
SOMETHING compiles cleanly even when SOMETHING is not defined.
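
As a rough, self-contained sketch of that idea: the version below assumes bar() is declared in every configuration, which is the precondition for the approach. The counters are hypothetical additions for observing which arm ran; foo, bar, baz1 and baz2 are just the placeholders used above.

```c
#include <stdbool.h>

#ifdef SOMETHING
# define SOMETHING_WITNESS 1
#else
# define SOMETHING_WITNESS 0
#endif

/* Hypothetical stand-ins; bar() is assumed to exist in all builds. */
int bar_calls, baz_calls;
static void bar(void)  { bar_calls++; }
static void baz1(void) { baz_calls++; }
static void baz2(void) { baz_calls++; }

static void dispatch(bool foo)
{
    /* With SOMETHING undefined the condition is constant false, so the
     * compiler can drop the first arm, but it still parses and
     * type-checks the call to bar(). */
    if (foo && SOMETHING_WITNESS) {
        bar();
    } else {
        baz1();
        baz2();
    }
}
```

Because bar() is still referenced in the dead arm, a build without SOMETHING catches signature changes to bar() that an #ifdef'd block would hide.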

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org





Re: [Qemu-devel] [PATCH 00/57] target-i386 eflags cleanup and bmi/adx extensions

2013-01-25 Thread Richard Henderson

On 2013-01-25 10:16, Eric Blake wrote:

Which is exactly the case for all three errors reported in this series.
I know of no other good way to arrange this pattern.

#ifdef SOMETHING
# define SOMETHING_WITNESS 1
#else
# define SOMETHING_WITNESS 0
#endif

if (foo && SOMETHING_WITNESS) {
    bar();
} else {
    baz1();
    baz2();
}

That is, hoist your #ifdeffery earlier into the file, and then you can
avoid #ifdefs inside the function body, and thus avoid the checkpatch
complaints; plus you get the benefit of testing that the code for
SOMETHING compiles cleanly even when SOMETHING is not defined.


Well, in this case bar is not present when SOMETHING is undefined, which 
means that it definitely won't compile.


Fixing that is a significant amount of work inside tcg/tcg-op.h, and
this patch series should not be held up on it.



r~



Re: [Qemu-devel] [PATCH 00/57] target-i386 eflags cleanup and bmi/adx extensions

2013-01-25 Thread Eric Blake
On 01/25/2013 11:40 AM, Richard Henderson wrote:
 On 2013-01-25 10:16, Eric Blake wrote:
 Which is exactly the case for all three errors reported in this series.
 I know of no other good way to arrange this pattern.
 #ifdef SOMETHING
 # define SOMETHING_WITNESS 1
 #else
 # define SOMETHING_WITNESS 0
 #endif

 if (foo && SOMETHING_WITNESS) {
     bar();
 } else {
     baz1();
     baz2();
 }

 That is, hoist your #ifdeffery earlier into the file, and then you can
 avoid #ifdefs inside the function body, and thus avoid the checkpatch
 complaints; plus you get the benefit of testing that the code for
 SOMETHING compiles cleanly even when SOMETHING is not defined.
 
 Well, in this case bar is not present when SOMETHING is undefined, which
 means that it definitely won't compile.

Even that can be fixed:

#ifdef SOMETHING
# define SOMETHING_WITNESS 1
#else
# define SOMETHING_WITNESS 0
# define bar() ((void) 0)
#endif
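
Spelled out as a complete translation unit, that might look like the sketch below. The real bar() body and the counters are hypothetical; the point is only that the stub macro turns the call into a no-op expression when SOMETHING is undefined, so the braced if/else compiles either way.

```c
#include <stdbool.h>

int stub_bar_calls, stub_baz_calls;  /* hypothetical, for observation only */

#ifdef SOMETHING
# define SOMETHING_WITNESS 1
static void bar(void) { stub_bar_calls++; }  /* stand-in for the real bar() */
#else
# define SOMETHING_WITNESS 0
# define bar() ((void) 0)                    /* stub: expands to a no-op */
#endif

static void baz1(void) { stub_baz_calls++; }
static void baz2(void) { stub_baz_calls++; }

static void dispatch(bool foo)
{
    if (foo && SOMETHING_WITNESS) {
        bar();   /* becomes ((void) 0); when SOMETHING is undefined */
    } else {
        baz1();
        baz2();
    }
}
```

The trade-off is that the stub hides bar()'s argument list from the compiler, so it no longer checks the call in the build where SOMETHING is undefined.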

 
 Fixing that is a significant amount of work inside tcg/tcg-op.h, and
 this patch series should not be held up on it.

True, which is why I will leave it up to the maintainers whether to take
your patch in spite of the checkpatch complaints.  I was merely pointing
out that a cleanup is possible, not that it is mandatory for acceptance.

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org





Re: [Qemu-devel] [PATCH 00/57] target-i386 eflags cleanup and bmi/adx extensions

2013-01-24 Thread Andreas Färber
Am 24.01.2013 05:02, schrieb Richard Henderson:
  target-i386/cpu.c   |   18 +-
  target-i386/cpu.h   |   24 +-

You forgot to CC me: Please point me to where in those 57 patches you
are touching the core CPU code.

Given the size of the series I assume this is 1.5 material?

Thanks,
Andreas

-- 
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg



Re: [Qemu-devel] [PATCH 00/57] target-i386 eflags cleanup and bmi/adx extensions

2013-01-24 Thread Richard Henderson

On 2013-01-24 01:35, Andreas Färber wrote:

Am 24.01.2013 05:02, schrieb Richard Henderson:

  target-i386/cpu.c   |   18 +-
  target-i386/cpu.h   |   24 +-


You forgot to CC me: Please point me to where in those 57 patches you
are touching the core CPU code.


cpu.c:  0045: Implement MOVBE
        0046: Implement ANDN
        0054: Implement ADX

These modify the TCG_*_FEATURES defines as I add new extensions
to the translator.

cpu.h:  0009: compute eflags outside rcl
        0013: Name the cc_op enum
        0039: optimize flags checking after sub
        0040: Use CC_SRC2 for ADC
        0048: Implement BLSR
        0054: Implement ADX

These primarily have to do with the CC_OP enumeration, and the
contents of CPUX86State.


Given the size of the series I assume this is 1.5 material?


I'd imagine.


r~




Re: [Qemu-devel] [PATCH 00/57] target-i386 eflags cleanup and bmi/adx extensions

2013-01-24 Thread Laurent Desnogues
On Thu, Jan 24, 2013 at 5:02 AM, Richard Henderson r...@twiddle.net wrote:
 This is a re-working of Paolo's eflags cleanup from October, which
 I consider a pre-requisite to implementing the ADX extension.  I've
 rearranged most of the patches in trivial ways, and some quite
 significantly.

 I've tested the result by running the FC17 installer in both i386
 and x86_64 mode, and the bmi/adx extensions with small user-land
 test cases.

 The patch series is at

   git://github.com/rth7680/qemu.git eflags3

I gave your branch a quick try. My host is an x86_64 CPU and I
ran an i386 nbench in user mode.  It works but some parts of the
benchmark are noticeably slower (10%).  Is that expected?

Thanks,

Laurent

 Please review.


 r~


 Paolo Bonzini (19):
   test-i386: QEMU_PACKED is not defined here
   test-i386: make it compile with a recent gcc
   target-i386: use OT_* consistently
   target-i386: introduce gen_ext_tl
   target-i386: factor setting of s->cc_op handling for string functions
   target-i386: drop cc_op argument of gen_jcc1
   target-i386: move carry computation for inc/dec closer to
 gen_op_set_cc_op
   target-i386: move eflags computation closer to gen_op_set_cc_op
   target-i386: compute eflags outside rcl/rcr helper
   target-i386: clean up sahf
   target-i386: use gen_jcc1 to compile loopz
   target-i386: factor gen_op_set_cc_op/tcg_gen_discard_tl around
 computing flags
   target-i386: add helper functions to get other flags
   target-i386: change gen_setcc_slow_T0 to gen_setcc_slow
   target-i386: optimize setcc instructions
   target-i386: use CCPrepare to generate conditional jumps
   target-i386: cleanup temporary macros for CCPrepare
   target-i386: introduce gen_cmovcc1
   target-i386: kill cpu_T3

 Richard Henderson (38):
   target-i386: Name the cc_op enumeration
   target-i386: Introduce set_cc_op
   target-i386: Don't clobber s->cc_op in gen_update_cc_op
   target-i386: Use gen_update_cc_op everywhere
   target-i386: do not compute eflags multiple times consecutively
   target-i386: no need to flush out cc_op before gen_eob
   target-i386: Move CC discards to set_cc_op
   target-i386: do not call helper to compute ZF/SF
   target-i386: use inverted setcond when computing NS or NZ
   target-i386: convert gen_compute_eflags_c to TCG
   target-i386: optimize setbe
   target-i386: optimize setle
   target-i386: introduce CCPrepare
   target-i386: introduce gen_prepare_cc
   target-i386: inline gen_prepare_cc_slow
   target-i386: expand cmov via movcond
   target-i386: use gen_op for cmps/scas
   target-i386: introduce gen_jcc1_noeob
   target-i386: Update cc_op before TCG branches
   target-i386: optimize flags checking after sub using CC_SRC2
   target-i386: Use CC_SRC2 for ADC and SBB
   target-i386: Don't reference ENV through most of cc helpers
   target-i386: Make helper_cc_compute_all const
   target-i386: Tidy prefix parsing
   target-i386: Decode the VEX prefixes
   target-i386: Implement MOVBE
   target-i386: Implement ANDN
   target-i386: Implement BEXTR
   target-i386: Implement BLSR, BLSMSK, BLSI
   target-i386: Implement BZHI
   target-i386: Implement MULX
   target-i386: Implement PDEP, PEXT
   target-i386: Implement SHLX, SARX, SHRX
   target-i386: Implement RORX
   target-i386: Implement ADX extension
   target-i386: Use clz/ctz for bsf/bsr helpers
   target-i386: Simplify bsf/bsr flags computation
   target-i386: Implement tzcnt and fix lzcnt

  target-i386/cc_helper.c             |  243 ++--
  target-i386/cc_helper_template.h    |  268 ++---
  target-i386/cpu.c                   |   18 +-
  target-i386/cpu.h                   |   24 +-
  target-i386/helper.c                |   11 +-
  target-i386/helper.h                |   12 +-
  target-i386/int_helper.c            |   69 +-
  target-i386/shift_helper_template.h |   12 +-
  target-i386/translate.c             | 2195 +--
  tests/tcg/test-i386.c               |   10 +-
  10 files changed, 1581 insertions(+), 1281 deletions(-)

 --
 1.7.11.7





Re: [Qemu-devel] [PATCH 00/57] target-i386 eflags cleanup and bmi/adx extensions

2013-01-24 Thread Richard Henderson

On 2013-01-24 08:46, Laurent Desnogues wrote:

I gave your branch a quick try. My host is an x86_64 CPU and I
ran an i386 nbench in user mode.  It works but some parts of the
benchmark are noticeably slower (10%).  Is that expected?


Nope.  Everything in there should be about speeding up...

I'll have a look at it and see if there's something obvious.


r~



Re: [Qemu-devel] [PATCH 00/57] target-i386 eflags cleanup and bmi/adx extensions

2013-01-24 Thread Andreas Färber
Am 24.01.2013 17:17, schrieb Richard Henderson:
 On 2013-01-24 01:35, Andreas Färber wrote:
 Am 24.01.2013 05:02, schrieb Richard Henderson:
   target-i386/cpu.c   |   18 +-
   target-i386/cpu.h   |   24 +-

 You forgot to CC me: Please point me to where in those 57 patches you
 are touching the core CPU code.
 
 cpu.c:  0045: Implement MOVBE
         0046: Implement ANDN
         0054: Implement ADX
 
 These modify the TCG_*_FEATURES defines as I add new extensions
 to the translator.
 
 cpu.h:  0009: compute eflags outside rcl
         0013: Name the cc_op enum
         0039: optimize flags checking after sub
         0040: Use CC_SRC2 for ADC
         0048: Implement BLSR
         0054: Implement ADX
 
 These primarily have to do with the CC_OP enumeration, and the
 contents of CPUX86State.

So both cpu.c patches will need a slight rebase after 1.4, the CC_*
changes won't conflict and the fields added to CPUX86State look okay
too, used for TCG. No objections or need to coordinate from my side.

My upcoming X86CPU subclasses mini-series may conflict at the same
location again but should be trivially resolvable in either order.

Cheers,
Andreas

-- 
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg



Re: [Qemu-devel] [PATCH 00/57] target-i386 eflags cleanup and bmi/adx extensions

2013-01-24 Thread Richard Henderson

On 01/24/2013 08:57 AM, Laurent Desnogues wrote:

On Thu, Jan 24, 2013 at 5:52 PM, Richard Henderson r...@twiddle.net wrote:

On 2013-01-24 08:46, Laurent Desnogues wrote:


I gave your branch a quick try. My host is an x86_64 CPU and I
ran an i386 nbench in user mode.  It works but some parts of the
benchmark are noticeably slower (10%).  Is that expected?


Nope.  Everything in there should be about speeding up...

I'll have a look at it and see if there's something obvious.


Let me know if you need more information or the binary (I compiled
it some time ago with the oldest compiler I could find, gcc 2.96).


Would you look and see how much variability you're getting?  I had a 
quick look with a (newly built) nbench binary and don't see any speed 
regressions outside the error bars.


Built with gcc 4.7.2, 4 trials each:

           Master                Eflags3
           Avg        Stddev     Avg        Stddev     Change   Error
Num S      585.92     18.25      573.79     4.39       -2.07%   3.12%
String S   51.14      1.10       51.52      0.13       0.73%    2.15%
Bitfield   1.64E+008  4.04E+006  1.62E+008  8.63E+005  -1.32%   2.46%
FP Emu     85.65      1.81       114.74     1.18       33.97%   2.12%
Fourier    1365.03    28.79      2813.78    11.72      106.13%  2.11%
Assign     14.86      0.24       14.89      0.21       0.22%    1.62%
Idea       723.70     43.31      884.20     4.55       22.18%   5.98%
Huff       495.27     8.72       702.89     3.53       41.92%   1.76%
N Net      0.29       0.01       0.73       0.00       149.99%  1.78%
LU Decomp  9.26       0.16       21.91      0.22       136.61%  1.70%
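
For anyone recomputing the last two columns: the Change values are the relative difference of the two averages. The Error values are not defined in this message; they appear to match the master standard deviation divided by the master average, which is an inference from the numbers and not something stated here. A small sketch under that assumption, with hypothetical function names:

```c
/* Percentage change of the branch average relative to master. */
static double pct_change(double master_avg, double branch_avg)
{
    return (branch_avg - master_avg) / master_avg * 100.0;
}

/* Relative standard deviation in percent. ASSUMPTION: this is how the
 * "Error" column was derived; the message does not say so. */
static double rel_stddev_pct(double avg, double stddev)
{
    return stddev / avg * 100.0;
}
```

For the Fourier row, pct_change(1365.03, 2813.78) comes out at roughly 106.13 and rel_stddev_pct(1365.03, 28.79) at roughly 2.11.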


I haven't looked to see where the massive fp improvements come from, but 
my first guess is not storing cc_op so often.  Although perhaps it would 
keep us on the same page if we were talking about the exact same binary...
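
For readers not familiar with why cc_op stores matter: the translator evaluates EFLAGS lazily, recording which operation last set the flags along with its operands, and only computing an individual flag when something reads it. The sketch below is a heavily simplified illustration of that idea with hypothetical names (lazy_flags, emu_add32 and so on); it is not the code in target-i386/translate.c and covers only ZF and CF for 32-bit add and sub.

```c
#include <stdbool.h>
#include <stdint.h>

enum lazy_op { LAZY_OP_ADD32, LAZY_OP_SUB32 };

struct lazy_flags {
    enum lazy_op op;  /* which operation last set the flags */
    uint32_t dst;     /* its result */
    uint32_t src;     /* its second operand */
};

/* The arithmetic only records what is needed to recover flags later. */
static uint32_t emu_add32(struct lazy_flags *cc, uint32_t a, uint32_t b)
{
    cc->op = LAZY_OP_ADD32;
    cc->src = b;
    cc->dst = a + b;
    return cc->dst;
}

static uint32_t emu_sub32(struct lazy_flags *cc, uint32_t a, uint32_t b)
{
    cc->op = LAZY_OP_SUB32;
    cc->src = b;
    cc->dst = a - b;
    return cc->dst;
}

/* Flags are computed on demand from the recorded state. */
static bool lazy_zf(const struct lazy_flags *cc)
{
    return cc->dst == 0;
}

static bool lazy_cf(const struct lazy_flags *cc)
{
    switch (cc->op) {
    case LAZY_OP_ADD32:
        return cc->dst < cc->src;                        /* sum wrapped */
    case LAZY_OP_SUB32:
        return (uint32_t)(cc->dst + cc->src) < cc->src;  /* a < b: borrow */
    }
    return false;
}
```

In a scheme like this, every instruction that sets flags has to write the op field somewhere, so skipping redundant writes of it is one plausible source of a speedup.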



r~



[Qemu-devel] [PATCH 00/57] target-i386 eflags cleanup and bmi/adx extensions

2013-01-23 Thread Richard Henderson
This is a re-working of Paolo's eflags cleanup from October, which
I consider a pre-requisite to implementing the ADX extension.  I've
rearranged most of the patches in trivial ways, and some quite
significantly.

I've tested the result by running the FC17 installer in both i386
and x86_64 mode, and the bmi/adx extensions with small user-land
test cases.
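
As an illustration of what such a user-land check can compare against, here are portable C reference versions of four of the BMI1 operations (32-bit forms, results only, flags ignored). This is a sketch with hypothetical function names, not the test cases used for this series.

```c
#include <stdint.h>

/* ANDN: bitwise AND of the second operand with the inverted first. */
static uint32_t ref_andn(uint32_t a, uint32_t b) { return ~a & b; }

/* BLSI: isolate the lowest set bit. */
static uint32_t ref_blsi(uint32_t x) { return x & (0u - x); }

/* BLSMSK: mask up to and including the lowest set bit. */
static uint32_t ref_blsmsk(uint32_t x) { return x ^ (x - 1u); }

/* BLSR: clear the lowest set bit. */
static uint32_t ref_blsr(uint32_t x) { return x & (x - 1u); }
```

A test program would run the real instruction (for example via inline assembly) under the emulator and compare its result with these helpers.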

The patch series is at

  git://github.com/rth7680/qemu.git eflags3

Please review.


r~


Paolo Bonzini (19):
  test-i386: QEMU_PACKED is not defined here
  test-i386: make it compile with a recent gcc
  target-i386: use OT_* consistently
  target-i386: introduce gen_ext_tl
  target-i386: factor setting of s->cc_op handling for string functions
  target-i386: drop cc_op argument of gen_jcc1
  target-i386: move carry computation for inc/dec closer to
gen_op_set_cc_op
  target-i386: move eflags computation closer to gen_op_set_cc_op
  target-i386: compute eflags outside rcl/rcr helper
  target-i386: clean up sahf
  target-i386: use gen_jcc1 to compile loopz
  target-i386: factor gen_op_set_cc_op/tcg_gen_discard_tl around
computing flags
  target-i386: add helper functions to get other flags
  target-i386: change gen_setcc_slow_T0 to gen_setcc_slow
  target-i386: optimize setcc instructions
  target-i386: use CCPrepare to generate conditional jumps
  target-i386: cleanup temporary macros for CCPrepare
  target-i386: introduce gen_cmovcc1
  target-i386: kill cpu_T3

Richard Henderson (38):
  target-i386: Name the cc_op enumeration
  target-i386: Introduce set_cc_op
  target-i386: Don't clobber s->cc_op in gen_update_cc_op
  target-i386: Use gen_update_cc_op everywhere
  target-i386: do not compute eflags multiple times consecutively
  target-i386: no need to flush out cc_op before gen_eob
  target-i386: Move CC discards to set_cc_op
  target-i386: do not call helper to compute ZF/SF
  target-i386: use inverted setcond when computing NS or NZ
  target-i386: convert gen_compute_eflags_c to TCG
  target-i386: optimize setbe
  target-i386: optimize setle
  target-i386: introduce CCPrepare
  target-i386: introduce gen_prepare_cc
  target-i386: inline gen_prepare_cc_slow
  target-i386: expand cmov via movcond
  target-i386: use gen_op for cmps/scas
  target-i386: introduce gen_jcc1_noeob
  target-i386: Update cc_op before TCG branches
  target-i386: optimize flags checking after sub using CC_SRC2
  target-i386: Use CC_SRC2 for ADC and SBB
  target-i386: Don't reference ENV through most of cc helpers
  target-i386: Make helper_cc_compute_all const
  target-i386: Tidy prefix parsing
  target-i386: Decode the VEX prefixes
  target-i386: Implement MOVBE
  target-i386: Implement ANDN
  target-i386: Implement BEXTR
  target-i386: Implement BLSR, BLSMSK, BLSI
  target-i386: Implement BZHI
  target-i386: Implement MULX
  target-i386: Implement PDEP, PEXT
  target-i386: Implement SHLX, SARX, SHRX
  target-i386: Implement RORX
  target-i386: Implement ADX extension
  target-i386: Use clz/ctz for bsf/bsr helpers
  target-i386: Simplify bsf/bsr flags computation
  target-i386: Implement tzcnt and fix lzcnt

 target-i386/cc_helper.c             |  243 ++--
 target-i386/cc_helper_template.h    |  268 ++---
 target-i386/cpu.c                   |   18 +-
 target-i386/cpu.h                   |   24 +-
 target-i386/helper.c                |   11 +-
 target-i386/helper.h                |   12 +-
 target-i386/int_helper.c            |   69 +-
 target-i386/shift_helper_template.h |   12 +-
 target-i386/translate.c             | 2195 +--
 tests/tcg/test-i386.c               |   10 +-
 10 files changed, 1581 insertions(+), 1281 deletions(-)

-- 
1.7.11.7