On Mon, Nov 18, 2013 at 11:15 AM, Claudio Fontana <claudio.font...@linaro.org> wrote: > Hello, > > On 09/27/2013 08:25 PM, Richard Henderson wrote: >> On 09/26/2013 05:48 PM, Alexander Graf wrote: >>> This patch adds emulation support for the orr instruction. >>> >>> Signed-off-by: Alexander Graf <ag...@suse.de> >>> --- >>> target-arm/helper-a64.c | 28 +++++++++++ >>> target-arm/helper-a64.h | 1 + >>> target-arm/translate-a64.c | 120 >>> +++++++++++++++++++++++++++++++++++++++++++++ >>> 3 files changed, 149 insertions(+) >>> >>> diff --git a/target-arm/helper-a64.c b/target-arm/helper-a64.c >>> index 8105fb5..da72b7f 100644 >>> --- a/target-arm/helper-a64.c >>> +++ b/target-arm/helper-a64.c >>> @@ -24,3 +24,31 @@ >>> #include "sysemu/sysemu.h" >>> #include "qemu/bitops.h" >>> >>> +uint32_t HELPER(pstate_add)(uint32_t pstate, uint64_t a1, uint64_t a2, >>> + uint64_t ar) >>> +{ >>> + int64_t s1 = a1; >>> + int64_t s2 = a2; >>> + int64_t sr = ar; >>> + >>> + pstate &= ~(PSTATE_N | PSTATE_Z | PSTATE_C | PSTATE_V); >>> + >>> + if (sr < 0) { >>> + pstate |= PSTATE_N; >>> + } >>> + >>> + if (!ar) { >>> + pstate |= PSTATE_Z; >>> + } >>> + >>> + if (ar && (ar < a1)) { >>> + pstate |= PSTATE_C; >>> + } >>> + >>> + if ((s1 > 0 && s2 > 0 && sr < 0) || >>> + (s1 < 0 && s2 < 0 && sr > 0)) { >>> + pstate |= PSTATE_V; >>> + } >>> + >>> + return pstate; >>> +} >> >> Why are you not using the same split apart bits as A32? >> >>> + /* XXX carry_out */ >>> + switch (shift_type) { >> >> What carry out? I see no such in the ShiftReg description. >> >>> + case 3: >>> + tcg_gen_rotr_i64(r, cpu_reg(reg), tcg_shift); >>> + break; >> >> Incorrect rotate for 32bit? >> >>> +static void handle_orr(DisasContext *s, uint32_t insn) >>> +{ >>> + int is_32bit = !get_bits(insn, 31, 1); >>> + int dest = get_reg(insn); >>> + int source = get_bits(insn, 5, 5); >>> + int rm = get_bits(insn, 16, 5); >>> + int shift_amount = get_sbits(insn, 10, 6); >>> + int is_n = get_bits(insn, 21, 1); >>> + int shift_type = get_bits(insn, 22, 2); >>> + int opc = get_bits(insn, 29, 2); >>> + bool setflags = (opc == 0x3); >>> + TCGv_i64 tcg_op2; >>> + TCGv_i64 tcg_dest; >>> + >>> + if (is_32bit && (shift_amount < 0)) { >>> + /* reserved value */ >>> + unallocated_encoding(s); >>> + } >> >> Why are you extracting shift_amount signed? >> >>> + >>> + /* MOV is dest = xzr & (source & ~0) */ >> >> Comment is wrong. >> >>> + if (!shift_amount && source == 0x1f) { > > Besides the comment, is this correct? > I am trying to rework this patch, but this part seems incorrect to me. > > We land here for the AND as well, and if source(rn) is xzr, > then I would expect the result to be zero for AND regardless of anything else, > and not a MOV. > > Can we really do this optimization in general here for AND, OR, EOR?
That part is definitely wrong: there's a missing check that opc = 1 (ORR/ORN for MOV/MVN). The comment also is very wrong :-) Also note that SP can't be accessed by the shifted reg logical ops as Richard wrote. Laurent > Thanks for any clarification, > > Claudio > >>> + if (is_32bit) { >>> + tcg_gen_ext32u_i64(cpu_reg_sp(dest), cpu_reg(rm)); >>> + } else { >>> + tcg_gen_mov_i64(cpu_reg_sp(dest), cpu_reg(rm)); >>> + } >>> + if (is_n) { >>> + tcg_gen_not_i64(cpu_reg_sp(dest), cpu_reg_sp(dest)); >>> + } >>> + if (is_32bit) { >>> + tcg_gen_ext32u_i64(cpu_reg_sp(dest), cpu_reg_sp(dest)); >>> + } >> >> These are incorrect -- no sp in the logical ops, but xzr instead. >> >> And surely we can emit fewer opcodes for the simple cases here. >> Since these are the canonical aliases for mov/mvn, it'll pay off. >> >> TCGv src = cpu_reg(rm); >> TCGv dst = cpu_reg(rd); >> >> if (is_n) { >> tcg_gen_not_i64(dst, src); >> src = dst; >> } >> if (is_32bit) { >> tcg_gen_ext32u_i64(dst, src); >> } else { >> tcg_gen_mov_i64(dst, src); >> } >> >> Note that tcg_gen_mov_i64 does the src == dst check, so a simple >> 64-bit mvn will only emit the not. >> >> >>> + tcg_dest = cpu_reg(dest); >>> + switch (opc) { >>> + case 0x0: >>> + case 0x3: >>> + tcg_gen_and_i64(tcg_dest, cpu_reg(source), tcg_op2); >>> + break; >>> + case 0x1: >>> + tcg_gen_or_i64(tcg_dest, cpu_reg(source), tcg_op2); >>> + break; >>> + case 0x2: >>> + tcg_gen_xor_i64(tcg_dest, cpu_reg(source), tcg_op2); >>> + break; >>> + } >>> + >>> + if (is_32bit) { >>> + tcg_gen_ext32u_i64(tcg_dest, tcg_dest); >>> + } >>> + >>> + if (setflags) { >>> + gen_helper_pstate_add(pstate, pstate, tcg_dest, cpu_reg(31), >>> tcg_dest); >>> + } >> >> Incorrect flags generated. They're different between add/sub and logical. >> In particular, C and V are always zero. >> >>> + handle_orr(s, insn); >> >> And please use a more proper name than ORR for something that handles all >> of the logical insns. >> >> >> r~ >> >