Re: [Qemu-devel] TCG x86-64 'bt' insn
Hi Peter, sure, that's fine too :). I just meant "great that it will be picked up in a future version" :) Thanks! -Clemens On Sat, Apr 19, 2014 at 3:50 PM, Peter Maydell wrote: > On 19 April 2014 23:41, Clemens Kolbitsch wrote: > > Thanks guys, awesome feedback and glad to see it was picked up in QEMU > 2.0 > > :) > > This didn't go into 2.0 -- it arrived somewhat late for that. It'll get > into 2.1 > (and perhaps 2.0.1, if anybody cares enough to cc stable on it). > > thanks > -- PMM >
Re: [Qemu-devel] TCG x86-64 'bt' insn
On 19 April 2014 23:41, Clemens Kolbitsch wrote: > Thanks guys, awesome feedback and glad to see it was picked up in QEMU 2.0 > :) This didn't go into 2.0 -- it arrived somewhat late for that. It'll get into 2.1 (and perhaps 2.0.1, if anybody cares enough to cc stable on it). thanks -- PMM
Re: [Qemu-devel] TCG x86-64 'bt' insn
Thanks guys, awesome feedback and glad to see it was picked up in QEMU 2.0 :) On Sat, Apr 19, 2014 at 3:21 PM, Peter Maydell wrote: > On 19 April 2014 23:04, Paolo Bonzini wrote: > > The new code should apply to btc/btr/bts too. > > ...see also RTH's patch: > > https://lists.gnu.org/archive/html/qemu-devel/2014-04/msg01455.html > > thanks > -- PMM > -- Clemens Kolbitsch Security Researcher kolbit...@lastline.com Mobile +1 (206) 356-7745 Land +1 (805) 456-7076 Lastline, Inc. 6950 Hollister Avenue, Suite 101 Goleta, CA 93117 www.lastline.com
Re: [Qemu-devel] TCG x86-64 'bt' insn
On 19 April 2014 23:04, Paolo Bonzini wrote: > The new code should apply to btc/btr/bts too. ...see also RTH's patch: https://lists.gnu.org/archive/html/qemu-devel/2014-04/msg01455.html thanks -- PMM
Re: [Qemu-devel] TCG x86-64 'bt' insn
Il 09/04/2014 13:55, Clemens Kolbitsch ha scritto: Hi, quick follow-up. *As always* you find a problem right after asking for help :). The updated patch does not cause BSOD on Windows guests, but neither does it fix the actual problem (of the program seg-faulting) I would really appreciate any feedback on the proposed patch below - the difference to the previous patch is that we clear out undefined flags and only keep the Z-flag (and update the C-flag) The new code should apply to btc/btr/bts too. Basically the "switch" statement keeps just the shr, and everything else moves below. Or even simpler, you could exploit the undefinedness of non-CF, non-ZF flags and keep using CC_OP_SARB. Because CC_OP_SARB uses "cc_dst == 0" to compute ZF, if ZF=1 you want to set it to 0, and if ZF=0 you want to set it to non-zero. So just compute !ZF into CC_DST, and place the tested bit (CF) into CC_SRC. Basically: gen_setcc1(s, (JCC_Z << 1) | 1, cpu_cc_dst); switch(op) { ... } tcg_gen_andi_tl(cpu_cc_src, cpu_tmp4, CC_C); set_cc_op(s, CC_OP_SARB + ot); Last another general question: Does TCG make any assumptions that undefined flags are set to 0? I see that most flag-computations set undefined flags to 0 - is this just a convention or really a requirement? Just a convention. Paolo Thanks guys! -Clemens On Wed, Apr 9, 2014 at 10:33 AM, Clemens Kolbitsch mailto:kolbit...@lastline.com>> wrote: Hi guys, I have to revive a rather old thread [1,2]. A quick summary of the issue: TCG emulates the BT (bit-test) instruction using a SAR to re-compute eflags. While SAR affects all flags, BT only affects the C-flag and leaves the Z-flag unaffected/unchanged [3]. According to the x86 manual, the current behavior is correct (although not perfect performance-wise), but NOT correct according to newer Intel CPU instruction sets (such as x86-64). What has caused some confusion in the past it seems is that AMD's manual mentions that all flags other than the C-flag are "undefined" (just like in Intel's old x86 manual). A patch has been suggested (although not applied) before [2] and I was trying to update and try the patch to the QEMU 2.0 RC2, but it seems not to be working (causes a BSOD on Windows guests). BTW: I have a program that seg-faults on Windows clients (32 as well as 64 bit guest OSes) in vanilla QEMU 2.0 RC2 (just like in previous versions), so this is not just of theoretical importance :) Can somebody say if there is something that I'm doing obviously wrong? *NON-FUNCTIONAL PATCH!* --- ../orig/qemu-2.0.0-rc2/target-i386/translate.c 2014-04-08 12:38:58.0 -0700 +++ target-i386/translate.c 2014-04-09 10:08:43.252084947 -0700 @@ -6710,8 +6710,14 @@ tcg_gen_andi_tl(cpu_T[1], cpu_T[1], (1 << (3 + ot)) - 1); switch(op) { case 0: -tcg_gen_shr_tl(cpu_cc_src, cpu_T[0], cpu_T[1]); -tcg_gen_movi_tl(cpu_cc_dst, 0); +/* whatever the last CC-op is - recompute now so we can OR-in + * updated results */ +gen_update_cc_op(s); +gen_compute_eflags(s); +tcg_gen_shr_tl(cpu_tmp4, cpu_T[0], cpu_T[1]); +tcg_gen_andi_tl(cpu_tmp4, cpu_tmp4, CC_C); +tcg_gen_or_tl(cpu_cc_src, cpu_cc_src, cpu_tmp4); +set_cc_op(s, CC_OP_EFLAGS); break; case 1: tcg_gen_shr_tl(cpu_tmp4, cpu_T[0], cpu_T[1]); @@ -6734,8 +6740,8 @@ tcg_gen_xor_tl(cpu_T[0], cpu_T[0], cpu_tmp0); break; } -set_cc_op(s, CC_OP_SARB + ot); if (op != 0) { +set_cc_op(s, CC_OP_SARB + ot); if (mod != 3) { gen_op_st_v(s, ot, cpu_T[0], cpu_A0); } else { Thanks, I'd really appreciate any input you can provide. -Clemens [1] http://lists.gnu.org/archive/html/qemu-devel/2010-11/msg00442.html [2] http://lists.gnu.org/archive/html/qemu-devel/2010-11/msg00764.html [3] page 148 http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-instruction-set-reference-manual-325383.pdf
Re: [Qemu-devel] TCG x86-64 'bt' insn
Hi, quick follow-up. *As always* you find a problem right after asking for help :). The updated patch does not cause BSOD on Windows guests, but neither does it fix the actual problem (of the program seg-faulting) I would really appreciate any feedback on the proposed patch below - the difference to the previous patch is that we clear out undefined flags and only keep the Z-flag (and update the C-flag) --- ../orig/qemu-2.0.0-rc2/target-i386/translate.c 2014-04-08 12:38:58.0 -0700 +++ target-i386/translate.c 2014-04-09 10:48:25.264200230 -0700 @@ -6710,8 +6710,15 @@ tcg_gen_andi_tl(cpu_T[1], cpu_T[1], (1 << (3 + ot)) - 1); switch(op) { case 0: -tcg_gen_shr_tl(cpu_cc_src, cpu_T[0], cpu_T[1]); -tcg_gen_movi_tl(cpu_cc_dst, 0); +/* whatever the last CC-op is - recompute now so we can OR-in + * updated results */ +gen_update_cc_op(s); // ? needed ? +gen_compute_eflags(s); +tcg_gen_shr_tl(cpu_tmp4, cpu_T[0], cpu_T[1]); +tcg_gen_andi_tl(cpu_tmp4, cpu_tmp4, CC_C); +tcg_gen_andi_tl(cpu_cc_src, cpu_cc_src, CC_Z); +tcg_gen_or_tl(cpu_cc_src, cpu_cc_src, cpu_tmp4); +set_cc_op(s, CC_OP_EFLAGS); break; case 1: tcg_gen_shr_tl(cpu_tmp4, cpu_T[0], cpu_T[1]); @@ -6734,8 +6741,8 @@ tcg_gen_xor_tl(cpu_T[0], cpu_T[0], cpu_tmp0); break; } -set_cc_op(s, CC_OP_SARB + ot); if (op != 0) { +set_cc_op(s, CC_OP_SARB + ot); if (mod != 3) { gen_op_st_v(s, ot, cpu_T[0], cpu_A0); } else { Last another general question: Does TCG make any assumptions that undefined flags are set to 0? I see that most flag-computations set undefined flags to 0 - is this just a convention or really a requirement? Thanks guys! -Clemens On Wed, Apr 9, 2014 at 10:33 AM, Clemens Kolbitsch wrote: > Hi guys, > > I have to revive a rather old thread [1,2]. A quick summary of the issue: > > TCG emulates the BT (bit-test) instruction using a SAR to re-compute > eflags. While SAR affects all flags, BT only affects the C-flag and leaves > the Z-flag unaffected/unchanged [3]. > > According to the x86 manual, the current behavior is correct (although not > perfect performance-wise), but NOT correct according to newer Intel CPU > instruction sets (such as x86-64). What has caused some confusion in the > past it seems is that AMD's manual mentions that all flags other than the > C-flag are "undefined" (just like in Intel's old x86 manual). > > A patch has been suggested (although not applied) before [2] and I was > trying to update and try the patch to the QEMU 2.0 RC2, but it seems not to > be working (causes a BSOD on Windows guests). > > BTW: I have a program that seg-faults on Windows clients (32 as well as 64 > bit guest OSes) in vanilla QEMU 2.0 RC2 (just like in previous versions), > so this is not just of theoretical importance :) > > Can somebody say if there is something that I'm doing obviously wrong? > > *NON-FUNCTIONAL PATCH!* > > --- ../orig/qemu-2.0.0-rc2/target-i386/translate.c 2014-04-08 > 12:38:58.0 -0700 > +++ target-i386/translate.c 2014-04-09 10:08:43.252084947 -0700 > @@ -6710,8 +6710,14 @@ > tcg_gen_andi_tl(cpu_T[1], cpu_T[1], (1 << (3 + ot)) - 1); > switch(op) { > case 0: > -tcg_gen_shr_tl(cpu_cc_src, cpu_T[0], cpu_T[1]); > -tcg_gen_movi_tl(cpu_cc_dst, 0); > +/* whatever the last CC-op is - recompute now so we can OR-in > + * updated results */ > +gen_update_cc_op(s); > +gen_compute_eflags(s); > +tcg_gen_shr_tl(cpu_tmp4, cpu_T[0], cpu_T[1]); > +tcg_gen_andi_tl(cpu_tmp4, cpu_tmp4, CC_C); > +tcg_gen_or_tl(cpu_cc_src, cpu_cc_src, cpu_tmp4); > +set_cc_op(s, CC_OP_EFLAGS); > break; > case 1: > tcg_gen_shr_tl(cpu_tmp4, cpu_T[0], cpu_T[1]); > @@ -6734,8 +6740,8 @@ > tcg_gen_xor_tl(cpu_T[0], cpu_T[0], cpu_tmp0); > break; > } > -set_cc_op(s, CC_OP_SARB + ot); > if (op != 0) { > +set_cc_op(s, CC_OP_SARB + ot); > if (mod != 3) { > gen_op_st_v(s, ot, cpu_T[0], cpu_A0); > } else { > > Thanks, I'd really appreciate any input you can provide. > -Clemens > > > [1] http://lists.gnu.org/archive/html/qemu-devel/2010-11/msg00442.html > [2] http://lists.gnu.org/archive/html/qemu-devel/2010-11/msg00764.html > [3] page 148 > http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-instruction-set-reference-manual-325383.pdf >
[Qemu-devel] TCG x86-64 'bt' insn
Hi guys, I have to revive a rather old thread [1,2]. A quick summary of the issue: TCG emulates the BT (bit-test) instruction using a SAR to re-compute eflags. While SAR affects all flags, BT only affects the C-flag and leaves the Z-flag unaffected/unchanged [3]. According to the x86 manual, the current behavior is correct (although not perfect performance-wise), but NOT correct according to newer Intel CPU instruction sets (such as x86-64). What has caused some confusion in the past it seems is that AMD's manual mentions that all flags other than the C-flag are "undefined" (just like in Intel's old x86 manual). A patch has been suggested (although not applied) before [2] and I was trying to update and try the patch to the QEMU 2.0 RC2, but it seems not to be working (causes a BSOD on Windows guests). BTW: I have a program that seg-faults on Windows clients (32 as well as 64 bit guest OSes) in vanilla QEMU 2.0 RC2 (just like in previous versions), so this is not just of theoretical importance :) Can somebody say if there is something that I'm doing obviously wrong? *NON-FUNCTIONAL PATCH!* --- ../orig/qemu-2.0.0-rc2/target-i386/translate.c 2014-04-08 12:38:58.0 -0700 +++ target-i386/translate.c 2014-04-09 10:08:43.252084947 -0700 @@ -6710,8 +6710,14 @@ tcg_gen_andi_tl(cpu_T[1], cpu_T[1], (1 << (3 + ot)) - 1); switch(op) { case 0: -tcg_gen_shr_tl(cpu_cc_src, cpu_T[0], cpu_T[1]); -tcg_gen_movi_tl(cpu_cc_dst, 0); +/* whatever the last CC-op is - recompute now so we can OR-in + * updated results */ +gen_update_cc_op(s); +gen_compute_eflags(s); +tcg_gen_shr_tl(cpu_tmp4, cpu_T[0], cpu_T[1]); +tcg_gen_andi_tl(cpu_tmp4, cpu_tmp4, CC_C); +tcg_gen_or_tl(cpu_cc_src, cpu_cc_src, cpu_tmp4); +set_cc_op(s, CC_OP_EFLAGS); break; case 1: tcg_gen_shr_tl(cpu_tmp4, cpu_T[0], cpu_T[1]); @@ -6734,8 +6740,8 @@ tcg_gen_xor_tl(cpu_T[0], cpu_T[0], cpu_tmp0); break; } -set_cc_op(s, CC_OP_SARB + ot); if (op != 0) { +set_cc_op(s, CC_OP_SARB + ot); if (mod != 3) { gen_op_st_v(s, ot, cpu_T[0], cpu_A0); } else { Thanks, I'd really appreciate any input you can provide. -Clemens [1] http://lists.gnu.org/archive/html/qemu-devel/2010-11/msg00442.html [2] http://lists.gnu.org/archive/html/qemu-devel/2010-11/msg00764.html [3] page 148 http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-instruction-set-reference-manual-325383.pdf -- Clemens Kolbitsch Security Researcher kolbit...@lastline.com Mobile +1 (206) 356-7745 Land +1 (805) 456-7076 Lastline, Inc. 6950 Hollister Avenue, Suite 101 Goleta, CA 93117 www.lastline.com