alpha fixes for older chips

Andrew Fresh Tue, 29 Mar 2016 19:06:07 -0700

I got these patches from the ghost of architectures past while trying to
get the perl tests passing on my alpha.


I finally got a chance to get back to this and test-build a release on
both alpha and amd64. I will commit the patches in the morning unless
someone else gets to it first.


There are two separate fixes here: the first fixes a compiler optimization
bug, and the second handles processors that lack specific instructions.

These not only help perl on my AlphaStation but also get these two
regress tests to pass on it:

regress/lib/libm/nextafter
regress/lib/libm/rint


(I think I properly fixed the paths to be relative to src/ but it's
possible I broke something, so if they don't apply that's probably my
fault.)

Date: Sun, 17 Jan 2016 21:50:30 +0000 (UTC)
From: Miod Vallat <m...@online.fr>
To: tech@openbsd.org
Subject: Re: Perl 5.22.1 testing request + issue on alpha
Organization: Prumpleffer Gmbh
User-Agent: slrn/1.0.2 (OpenBSD)

> I have run into a strange issue on alpha that I'm still tracking down.
> I fear this has interrupted me too long to get 5.22 in for OpenBSD 5.9,
> but maybe we can get ahead of the curve and be ready after unlock.
>
> Previously, NaN + 1 looked like this:
> $ perl -we 'print "NaN" + 1'
> -nan
>
> Due to improvements in the Inf/NaN code, 5.22 should get:
> $ perl -we 'print "NaN" + 1' 
> NaN
>
> But for some reason on alpha NaN isn't special and we instead get:
> $ ./perl -we 'print "NaN" + 1'
> 1

You might want to try this compiler diff on alpha.





When compiling with optimization enabled and ieee-style floating point, the
compiler tries to insert asynchronous fpu trap synchronization barriers as
late as possible.

Unfortunately, the logic does not take into account the store of a
floating-point result into memory as something requiring a barrier, which
leads to incorrect behaviour on alpha processors without the ``precise
arithmetic trap'' extension.

Index: alpha.c
===================================================================
RCS file: /OpenBSD/src/gnu/gcc/gcc/config/alpha/alpha.c,v
retrieving revision 1.4
diff -u -p -r1.4 alpha.c
--- gnu/gcc/gcc/config/alpha/alpha.c    20 Dec 2012 13:58:06 -0000      1.4
+++ gnu/gcc/gcc/config/alpha/alpha.c    17 Jan 2016 19:42:44 -0000
@@ -8721,11 +8721,15 @@ summarize_insn (rtx x, struct shadow_sum
    result of an instruction that might generate an UNPREDICTABLE
    result.
 
-   (c) Within the trap shadow, no register may be used more than once
+   (c) Within the trap shadow, the destination register of the potentially
+   trapping instruction may not be used as an input, for its value would be
+   UNPREDICTABLE.
+
+   (d) Within the trap shadow, no register may be used more than once
    as a destination register.  (This is to make life easier for the
    trap-handler.)
 
-   (d) The trap shadow may not include any branch instructions.  */
+   (e) The trap shadow may not include any branch instructions.  */
 
 static void
 alpha_handle_trap_shadows (void)
@@ -8797,7 +8801,7 @@ alpha_handle_trap_shadows (void)
                      if ((sum.defd.i & shadow.defd.i)
                          || (sum.defd.fp & shadow.defd.fp))
                        {
-                         /* (c) would be violated */
+                         /* (d) would be violated */
                          goto close_shadow;
                        }
 
@@ -8820,11 +8824,19 @@ alpha_handle_trap_shadows (void)
 
                          goto close_shadow;
                        }
+
+                     if ((sum.used.i & shadow.defd.i)
+                         || (sum.used.fp & shadow.defd.fp))
+                       {
+                         /* (c) would be violated */
+                         goto close_shadow;
+                       }
                      break;
 
                    case JUMP_INSN:
                    case CALL_INSN:
                    case CODE_LABEL:
+                     /* (e) would be violated */
                      goto close_shadow;
 
                    default:


Date: Wed, 20 Jan 2016 20:20:51 +0000
From: Miod Vallat <m...@online.fr>
To: Andrew Fresh <and...@afresh1.com>
Cc: Theo de Raadt <dera...@openbsd.org>, David Gwynne <d...@openbsd.org>
Subject: Re: alpha
User-Agent: Mutt/1.5.24 (2015-08-30)

> > However! I have just noticed regress/lib/libm/rint will fail with a
> > SIGILL. Apparently not all IEEE-mode instructions are implemented on
> > this 21064, but this is one of the earliest alpha systems. Could you
> > check if this test passes (or fails, but without SIGILL) on your
> > alphastation?
> 
> This fails on my alphastation with SIGILL.
> 
> kern.version=OpenBSD 5.9-beta (GENERIC) #281: Sun Dec 27 13:54:59 MST 2015
>     dera...@alpha.openbsd.org:/usr/src/sys/arch/alpha/compile/GENERIC
> 
> $ make regress
> cc -O2 -pipe    -c rint.c
> cc   -o rint rint.o -lm
> ./rint
> *** Signal SIGILL in . (<bsd.regress.mk>:48 'run-regress-rint')
> FAILED
> *** Error 1 in target 'regress' (ignored)

The following diff will fix it. Unfortunately it changes <machine/cpu.h>
to publish a formerly internal function prototype (guarded by _KERNEL),
so it might be too late in the release cycle for such a change?

Index: alpha/fp_complete.c
===================================================================
RCS file: /OpenBSD/src/sys/arch/alpha/alpha/fp_complete.c,v
retrieving revision 1.11
diff -u -p -r1.11 fp_complete.c
--- sys/arch/alpha/alpha/fp_complete.c  18 Nov 2014 20:51:00 -0000      1.11
+++ sys/arch/alpha/alpha/fp_complete.c  20 Jan 2016 20:18:11 -0000
@@ -73,13 +73,17 @@
 
 #define IS_SUBNORMAL(v)        ((v)->exp == 0 && (v)->frac != 0)
 
-#define        PREFILTER_SUBNORMAL(p,v) if ((p)->p_md.md_flags & IEEE_MAP_DMZ  \
-                                    && IS_SUBNORMAL(v))                \
-                                        (v)->frac = 0; else
-
-#define        POSTFILTER_SUBNORMAL(p,v) if ((p)->p_md.md_flags & IEEE_MAP_UMZ \
-                                     && IS_SUBNORMAL(v))               \
-                                         (v)->frac = 0; else
+#define        PREFILTER_SUBNORMAL(p,v) \
+do { \
+       if ((p)->p_md.md_flags & IEEE_MAP_DMZ && IS_SUBNORMAL(v)) \
+               (v)->frac = 0; \
+} while (0)
+
+#define        POSTFILTER_SUBNORMAL(p,v) \
+do { \
+       if ((p)->p_md.md_flags & IEEE_MAP_UMZ && IS_SUBNORMAL(v)) \
+               (v)->frac = 0; \
+} while (0)
 
        /* Alpha returns 2.0 for true, all zeroes for false. */
 
@@ -493,7 +497,7 @@ float64_unk(float64 a, float64 b)
  */
 
 static void
-alpha_fp_interpret(alpha_instruction *pc, struct proc *p, u_int64_t bits)
+alpha_fp_interpret(struct proc *p, u_int64_t bits)
 {
        s_float sfa, sfb, sfc;
        t_float tfa, tfb, tfc;
@@ -560,16 +564,15 @@ alpha_fp_interpret(alpha_instruction *pc
        }
 }
 
-static int
-alpha_fp_complete_at(alpha_instruction *trigger_pc, struct proc *p,
-    u_int64_t *ucode)
+int
+alpha_fp_complete_at(u_long trigger_pc, struct proc *p, u_int64_t *ucode)
 {
        int needsig;
        alpha_instruction inst;
        u_int64_t rm, fpcr, orig_fpcr;
        u_int64_t orig_flags, new_flags, changed_flags, md_flags;
 
-       if (__predict_false(copyin(trigger_pc, &inst, sizeof inst))) {
+       if (__predict_false(copyin((void *)trigger_pc, &inst, sizeof inst))) {
                this_cannot_happen(6, -1);
                return SIGSEGV;
        }
@@ -589,7 +592,7 @@ alpha_fp_complete_at(alpha_instruction *
        }
        orig_flags = FP_C_TO_OPENBSD_FLAG(p->p_md.md_flags);
 
-       alpha_fp_interpret(trigger_pc, p, inst.bits);
+       alpha_fp_interpret(p, inst.bits);
 
        md_flags = p->p_md.md_flags;
 
@@ -614,12 +617,12 @@ alpha_fp_complete(u_long a0, u_long a1, 
        u_int64_t op_class;
        alpha_instruction inst;
        /* "trigger_pc" is Compaq's term for the earliest faulting op */
-       alpha_instruction *trigger_pc, *usertrap_pc;
+       u_long trigger_pc, usertrap_pc;
        alpha_instruction *pc, *win_begin, tsw[TSWINSIZE];
 
        sig = SIGFPE;
        pc = (alpha_instruction *)p->p_md.md_tf->tf_regs[FRAME_PC];
-       trigger_pc = pc - 1;    /* for ALPHA_AMASK_PAT case */
+       trigger_pc = (u_long)pc - 4;    /* for ALPHA_AMASK_PAT case */
        if (cpu_amask & ALPHA_AMASK_PAT) {
                if (a0 & 1 || alpha_fp_sync_complete) {
                        sig = alpha_fp_complete_at(trigger_pc, p, ucode);
@@ -639,12 +642,6 @@ alpha_fp_complete(u_long a0, u_long a1, 
  * interpret this one instruction in SW. If a SIGFPE is not required, back up
  * the PC until just after this instruction and restart. This will execute all
  * trap shadow instructions between the trigger pc and the trap pc twice.
- * 
- * If a SIGFPE is generated from the OSF1 emulation,  back up one more
- * instruction to the trigger pc itself. Native binaries don't because it
- * is non-portable and completely defeats the intended purpose of IEEE
- * traps -- for example, to count the number of exponent wraps for a later
- * correction.
  */
        trigger_pc = 0;
        win_begin = pc;
@@ -665,10 +662,10 @@ alpha_fp_complete(u_long a0, u_long a1, 
                op_class = 1UL << inst.generic_format.opcode;
                if (op_class & FPUREG_CLASS) {
                        a1 &= ~(1UL << (inst.operate_generic_format.rc + 32));
-                       trigger_pc = pc;
+                       trigger_pc = (u_long)pc;
                } else if (op_class & CPUREG_CLASS) {
                        a1 &= ~(1UL << inst.operate_generic_format.rc);
-                       trigger_pc = pc;
+                       trigger_pc = (u_long)pc;
                } else if (op_class & TRAPSHADOWBOUNDARY) {
                        if (op_class & CHECKFUNCTIONCODE) {
                                if (inst.mem_format.displacement == op_trapb ||
@@ -691,8 +688,8 @@ alpha_fp_complete(u_long a0, u_long a1, 
        }
 done:
        if (sig) {
-               usertrap_pc = trigger_pc + 1;
-               p->p_md.md_tf->tf_regs[FRAME_PC] = (unsigned long)usertrap_pc;
+               usertrap_pc = trigger_pc + 4;
+               p->p_md.md_tf->tf_regs[FRAME_PC] = usertrap_pc;
                return sig;
        }
        return 0;
Index: alpha/trap.c
===================================================================
RCS file: /OpenBSD/src/sys/arch/alpha/alpha/trap.c,v
retrieving revision 1.80
diff -u -p -r1.80 trap.c
--- sys/arch/alpha/alpha/trap.c 23 Jun 2015 12:29:46 -0000      1.80
+++ sys/arch/alpha/alpha/trap.c 20 Jan 2016 20:18:11 -0000
@@ -1231,6 +1231,24 @@ handle_opdec(p, ucodep)
                }
                goto sigill;
 
+#ifndef NO_IEEE
+       /* case op_fix_float: */
+       /* case op_vax_float: */
+       case op_ieee_float:
+       /* case op_any_float: */
+               /*
+                * EV4 processors do not implement dynamic rounding
+                * instructions at all.
+                */
+               if (cpu_implver <= ALPHA_IMPLVER_EV4) {
+                       sig = alpha_fp_complete_at(inst_pc, p, ucodep);
+                       if (sig)
+                               return sig;
+                       break;
+               }
+               goto sigill;
+#endif
+
        default:
                goto sigill;
        }
Index: include/cpu.h
===================================================================
RCS file: /OpenBSD/src/sys/arch/alpha/include/cpu.h,v
retrieving revision 1.55
diff -u -p -r1.55 cpu.h
--- sys/arch/alpha/include/cpu.h        2 Jul 2015 01:33:59 -0000       1.55
+++ sys/arch/alpha/include/cpu.h        20 Jan 2016 20:18:11 -0000
@@ -393,6 +393,7 @@ u_int64_t alpha_read_fp_c(struct proc *)
 void alpha_write_fp_c(struct proc *, u_int64_t);
 
 int alpha_fp_complete(u_long, u_long, struct proc *, u_int64_t *);
+int alpha_fp_complete_at(u_long, struct proc *, u_int64_t *);
 #endif
 
 void alpha_enable_fp(struct proc *, int);
