I got these patches from the ghost of architectures past trying to get perl tests passing on my alpha.
I finally got a chance to get back to this and test building a release
on both alpha and amd64, and will commit them in the morning unless
someone else gets to it first.

There are two separate fixes here: the first addresses a compiler
optimization bug and the other handles processors without specific
instructions.

These not only help perl on my AlphaStation but also get these two
regress tests to pass on it:
    regress/lib/libm/nextafter
    regress/lib/libm/rint

(I think I properly fixed the paths to be relative to src/ but it's
possible I broke something, so if they don't apply that's probably my
fault.)

Date: Sun, 17 Jan 2016 21:50:30 +0000 (UTC)
From: Miod Vallat <m...@online.fr>
To: tech@openbsd.org
Subject: Re: Perl 5.22.1 testing request + issue on alpha
Organization: Prumpleffer Gmbh
User-Agent: slrn/1.0.2 (OpenBSD)

> I have run into a strange issue on alpha that I'm still tracking down.
> I fear this has interrupted me too long to get 5.22 in for OpenBSD 5.9,
> but maybe we can get ahead of the curve and be ready after unlock.
>
> Previously, NaN + 1 looked like this:
> $ perl -we 'print "NaN" + 1'
> -nan
>
> Due to improvements in the Inf/NaN code, 5.22 should get:
> $ perl -we 'print "NaN" + 1'
> NaN
>
> But for some reason on alpha NaN isn't special and we instead get:
> $ ./perl -we 'print "NaN" + 1'
> 1

You might want to try this compiler diff on alpha.

When compiling with optimization enabled and ieee-style floating point,
the compiler tries to insert asynchronous fpu trap synchronization
barriers as late as possible. Unfortunately, the logic does not take
into account the store of a floating-point result into memory as
something requiring a barrier, which leads to incorrect behaviour on
alpha processors without the ``precise arithmetic trap'' extension.

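To make the missing rule concrete, here is a minimal sketch of the kind
of register-set test involved. It is not GCC's code: struct regsets and
may_stay_in_shadow() are hypothetical names, and only the used/defd
bitmask idea is taken from the diff that follows. Each bit stands for
one integer or floating-point register.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-insn summary: one bit per register. */
struct regsets {
	uint32_t used_i, used_fp;	/* registers read by the insn(s) */
	uint32_t defd_i, defd_fp;	/* registers written by the insn(s) */
};

/*
 * Return 1 if `insn' may stay inside the trap shadow summarized by
 * `shadow', 0 if a trap barrier has to be placed in front of it.
 */
static int
may_stay_in_shadow(const struct regsets *insn, const struct regsets *shadow)
{
	assert(insn != NULL && shadow != NULL);

	/* No register may be written twice within the shadow. */
	if ((insn->defd_i & shadow->defd_i) ||
	    (insn->defd_fp & shadow->defd_fp))
		return 0;

	/*
	 * The result of a potentially trapping insn may not be read
	 * within the shadow (this covers a store of that result).
	 */
	if ((insn->used_i & shadow->defd_i) ||
	    (insn->used_fp & shadow->defd_fp))
		return 0;

	return 1;
}
```

With a shadow that has written $f1, a store that reads $f1 is rejected
by the second test, while an instruction that only touches $f2 and $f3
is allowed to stay in the shadow.
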
Index: alpha.c
===================================================================
RCS file: /OpenBSD/src/gnu/gcc/gcc/config/alpha/alpha.c,v
retrieving revision 1.4
diff -u -p -r1.4 alpha.c
--- gnu/gcc/gcc/config/alpha/alpha.c 20 Dec 2012 13:58:06 -0000 1.4
+++ gnu/gcc/gcc/config/alpha/alpha.c 17 Jan 2016 19:42:44 -0000
@@ -8721,11 +8721,15 @@ summarize_insn (rtx x, struct shadow_sum
        result of an instruction that might generate an UNPREDICTABLE
        result.
 
-   (c) Within the trap shadow, no register may be used more than once
+   (c) Within the trap shadow, the destination register of the potentially
+       trapping instruction may not be used as an input, for its value would be
+       UNPREDICTABLE.
+
+   (d) Within the trap shadow, no register may be used more than once
        as a destination register.  (This is to make life easier for the
        trap-handler.)
 
-   (d) The trap shadow may not include any branch instructions.  */
+   (e) The trap shadow may not include any branch instructions.  */
 
 static void
 alpha_handle_trap_shadows (void)
@@ -8797,7 +8801,7 @@ alpha_handle_trap_shadows (void)
                   if ((sum.defd.i & shadow.defd.i)
                       || (sum.defd.fp & shadow.defd.fp))
                     {
-                      /* (c) would be violated */
+                      /* (d) would be violated */
                       goto close_shadow;
                     }
 
@@ -8820,11 +8824,19 @@ alpha_handle_trap_shadows (void)
 
                       goto close_shadow;
                     }
+
+                  if ((sum.used.i & shadow.defd.i)
+                      || (sum.used.fp & shadow.defd.fp))
+                    {
+                      /* (c) would be violated */
+                      goto close_shadow;
+                    }
                   break;
 
                 case JUMP_INSN:
                 case CALL_INSN:
                 case CODE_LABEL:
+                  /* (e) would be violated */
                   goto close_shadow;
 
                 default:

Date: Wed, 20 Jan 2016 20:20:51 +0000
From: Miod Vallat <m...@online.fr>
To: Andrew Fresh <and...@afresh1.com>
Cc: Theo de Raadt <dera...@openbsd.org>, David Gwynne <d...@openbsd.org>
Subject: Re: alpha
User-Agent: Mutt/1.5.24 (2015-08-30)

> > However! I have just noticed regress/lib/libm/rint will fail with a
> > SIGILL. Apparently not all IEEE-mode instructions are implemented on
> > this 21064, but this is one of the earliest alpha systems.
> > Could you check if this test passes (or fails, but without SIGILL) on
> > your alphastation?
>
> This fails on my alphastation with SIGILL.
>
> kern.version=OpenBSD 5.9-beta (GENERIC) #281: Sun Dec 27 13:54:59 MST 2015
>     dera...@alpha.openbsd.org:/usr/src/sys/arch/alpha/compile/GENERIC
>
> $ make regress
> cc -O2 -pipe -c rint.c
> cc -o rint rint.o -lm
> ./rint
> *** Signal SIGILL in . (<bsd.regress.mk>:48 'run-regress-rint')
> FAILED
> *** Error 1 in target 'regress' (ignored)

The following diff will fix it. Unfortunately it changes <machine/cpu.h>
to publish a formerly internal function prototype (guarded by _KERNEL),
so it might be too late in the release cycle for such a change?

Index: alpha/fp_complete.c
===================================================================
RCS file: /OpenBSD/src/sys/arch/alpha/alpha/fp_complete.c,v
retrieving revision 1.11
diff -u -p -r1.11 fp_complete.c
--- sys/arch/alpha/alpha/fp_complete.c 18 Nov 2014 20:51:00 -0000 1.11
+++ sys/arch/alpha/alpha/fp_complete.c 20 Jan 2016 20:18:11 -0000
@@ -73,13 +73,17 @@
 
 #define IS_SUBNORMAL(v) ((v)->exp == 0 && (v)->frac != 0)
 
-#define PREFILTER_SUBNORMAL(p,v) if ((p)->p_md.md_flags & IEEE_MAP_DMZ \
-                                     && IS_SUBNORMAL(v)) \
-                                         (v)->frac = 0; else
-
-#define POSTFILTER_SUBNORMAL(p,v) if ((p)->p_md.md_flags & IEEE_MAP_UMZ \
-                                      && IS_SUBNORMAL(v)) \
-                                          (v)->frac = 0; else
+#define PREFILTER_SUBNORMAL(p,v) \
+do { \
+        if ((p)->p_md.md_flags & IEEE_MAP_DMZ && IS_SUBNORMAL(v)) \
+                (v)->frac = 0; \
+} while (0)
+
+#define POSTFILTER_SUBNORMAL(p,v) \
+do { \
+        if ((p)->p_md.md_flags & IEEE_MAP_UMZ && IS_SUBNORMAL(v)) \
+                (v)->frac = 0; \
+} while (0)
 
 /* Alpha returns 2.0 for true, all zeroes for false. */
 
@@ -493,7 +497,7 @@ float64_unk(float64 a, float64 b)
  */
 
 static void
-alpha_fp_interpret(alpha_instruction *pc, struct proc *p, u_int64_t bits)
+alpha_fp_interpret(struct proc *p, u_int64_t bits)
 {
         s_float sfa, sfb, sfc;
         t_float tfa, tfb, tfc;
@@ -560,16 +564,15 @@ alpha_fp_interpret(alpha_instruction *pc
         }
 }
 
-static int
-alpha_fp_complete_at(alpha_instruction *trigger_pc, struct proc *p,
-    u_int64_t *ucode)
+int
+alpha_fp_complete_at(u_long trigger_pc, struct proc *p, u_int64_t *ucode)
 {
         int needsig;
         alpha_instruction inst;
         u_int64_t rm, fpcr, orig_fpcr;
         u_int64_t orig_flags, new_flags, changed_flags, md_flags;
 
-        if (__predict_false(copyin(trigger_pc, &inst, sizeof inst))) {
+        if (__predict_false(copyin((void *)trigger_pc, &inst, sizeof inst))) {
                 this_cannot_happen(6, -1);
                 return SIGSEGV;
         }
@@ -589,7 +592,7 @@ alpha_fp_complete_at(alpha_instruction *
         }
 
         orig_flags = FP_C_TO_OPENBSD_FLAG(p->p_md.md_flags);
-        alpha_fp_interpret(trigger_pc, p, inst.bits);
+        alpha_fp_interpret(p, inst.bits);
 
         md_flags = p->p_md.md_flags;
 
@@ -614,12 +617,12 @@ alpha_fp_complete(u_long a0, u_long a1,
         u_int64_t op_class;
         alpha_instruction inst;
         /* "trigger_pc" is Compaq's term for the earliest faulting op */
-        alpha_instruction *trigger_pc, *usertrap_pc;
+        u_long trigger_pc, usertrap_pc;
         alpha_instruction *pc, *win_begin, tsw[TSWINSIZE];
 
         sig = SIGFPE;
         pc = (alpha_instruction *)p->p_md.md_tf->tf_regs[FRAME_PC];
-        trigger_pc = pc - 1;    /* for ALPHA_AMASK_PAT case */
+        trigger_pc = (u_long)pc - 4;    /* for ALPHA_AMASK_PAT case */
         if (cpu_amask & ALPHA_AMASK_PAT) {
                 if (a0 & 1 || alpha_fp_sync_complete) {
                         sig = alpha_fp_complete_at(trigger_pc, p, ucode);
@@ -639,12 +642,6 @@ alpha_fp_complete(u_long a0, u_long a1,
  * interpret this one instruction in SW. If a SIGFPE is not required, back up
  * the PC until just after this instruction and restart. This will execute all
  * trap shadow instructions between the trigger pc and the trap pc twice.
- *
- * If a SIGFPE is generated from the OSF1 emulation, back up one more
- * instruction to the trigger pc itself. Native binaries don't because it
- * is non-portable and completely defeats the intended purpose of IEEE
- * traps -- for example, to count the number of exponent wraps for a later
- * correction.
  */
         trigger_pc = 0;
         win_begin = pc;
@@ -665,10 +662,10 @@ alpha_fp_complete(u_long a0, u_long a1,
                 op_class = 1UL << inst.generic_format.opcode;
                 if (op_class & FPUREG_CLASS) {
                         a1 &= ~(1UL << (inst.operate_generic_format.rc + 32));
-                        trigger_pc = pc;
+                        trigger_pc = (u_long)pc;
                 } else if (op_class & CPUREG_CLASS) {
                         a1 &= ~(1UL << inst.operate_generic_format.rc);
-                        trigger_pc = pc;
+                        trigger_pc = (u_long)pc;
                 } else if (op_class & TRAPSHADOWBOUNDARY) {
                         if (op_class & CHECKFUNCTIONCODE) {
                                 if (inst.mem_format.displacement == op_trapb ||
@@ -691,8 +688,8 @@ alpha_fp_complete(u_long a0, u_long a1,
         }
 done:
         if (sig) {
-                usertrap_pc = trigger_pc + 1;
-                p->p_md.md_tf->tf_regs[FRAME_PC] = (unsigned long)usertrap_pc;
+                usertrap_pc = trigger_pc + 4;
+                p->p_md.md_tf->tf_regs[FRAME_PC] = usertrap_pc;
                 return sig;
         }
         return 0;
Index: alpha/trap.c
===================================================================
RCS file: /OpenBSD/src/sys/arch/alpha/alpha/trap.c,v
retrieving revision 1.80
diff -u -p -r1.80 trap.c
--- sys/arch/alpha/alpha/trap.c 23 Jun 2015 12:29:46 -0000 1.80
+++ sys/arch/alpha/alpha/trap.c 20 Jan 2016 20:18:11 -0000
@@ -1231,6 +1231,24 @@ handle_opdec(p, ucodep)
                         }
                         goto sigill;
 
+#ifndef NO_IEEE
+                /* case op_fix_float: */
+                /* case op_vax_float: */
+                case op_ieee_float:
+                /* case op_any_float: */
+                        /*
+                         * EV4 processors do not implement dynamic rounding
+                         * instructions at all.
+                         */
+                        if (cpu_implver <= ALPHA_IMPLVER_EV4) {
+                                sig = alpha_fp_complete_at(inst_pc, p, ucodep);
+                                if (sig)
+                                        return sig;
+                                break;
+                        }
+                        goto sigill;
+#endif
+
                 default:
                         goto sigill;
                 }
Index: include/cpu.h
===================================================================
RCS file: /OpenBSD/src/sys/arch/alpha/include/cpu.h,v
retrieving revision 1.55
diff -u -p -r1.55 cpu.h
--- sys/arch/alpha/include/cpu.h 2 Jul 2015 01:33:59 -0000 1.55
+++ sys/arch/alpha/include/cpu.h 20 Jan 2016 20:18:11 -0000
@@ -393,6 +393,7 @@ u_int64_t alpha_read_fp_c(struct proc *)
 void alpha_write_fp_c(struct proc *, u_int64_t);
 
 int alpha_fp_complete(u_long, u_long, struct proc *, u_int64_t *);
+int alpha_fp_complete_at(u_long, struct proc *, u_int64_t *);
 #endif
 
 void alpha_enable_fp(struct proc *, int);
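
A note for anyone reading the trigger_pc changes in fp_complete.c: the
switch from "pc - 1" on an alpha_instruction pointer to "(u_long)pc - 4"
should not change the address computed, since Alpha instructions are 4
bytes wide. The sketch below only illustrates that equivalence in
userland; insn_t, prev_insn_ptr() and prev_insn_addr() are hypothetical
names and not part of the kernel.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for alpha_instruction: one 32-bit word. */
typedef struct {
	uint32_t bits;
} insn_t;

static_assert(sizeof(insn_t) == 4, "an instruction is 4 bytes wide");

/* Old style: step back one instruction with typed pointer arithmetic. */
static uintptr_t
prev_insn_ptr(const insn_t *pc)
{
	return (uintptr_t)(pc - 1);
}

/* New style: step back 4 bytes on a plain integer address. */
static uintptr_t
prev_insn_addr(uintptr_t pc)
{
	return pc - sizeof(insn_t);
}
```

Both helpers return the same address for the same pc; the integer form
just makes the 4-byte step explicit and lets the address be passed
around without a pointer type.
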