[Milkymist-devel] 10 % 2 = ?

Werner Almesberger Sun, 15 Jan 2012 17:31:07 -0800

First let's welcome a newcomer: tools/asm/fpvm accepts PFPU assembler
in more or less FPVM style, with symbolic names added as a bonus. It
calls tools/asm/pfpu repeatedly to execute the code one instruction
at a time on the M1.


This is slow but accurate. The one-instruction-at-a-time limit comes
from the command line size limit in RTEMS. One could work around
that, e.g., by making the pfpu command read from a file, but let's
save the over-engineering for later.

Second, I put all this to use to find out more about the modulo bug
that's been bothering me. Turns out that the result of 10 % 2 on M1
is 2, not 0.

I've translated the modulo algorithm into "fpvm" assembler:
https://github.com/milkymist/milkymist/blob/master/tools/asm/mod.fpvm

Run with:

cd milkymist/tools/asm
./fpvm -d -x mod.fpvm

This needs M1_* set up such that "pfpu" can telnet and run the pfpu
commands.

The complete transcript of the session is below. There are three
types of lines:

- # fmul opb, idiv -> bidiv
  #     bidiv = 8 (0x41000000)

  Tracing of operation and result at the level of "fpvm".

- ## r2=2 r3=0x40800000 fmul r2,r3 -> r2
  ## 0x41000000 8 0x40800000 4

  Debug output for tracing at the level of "pfpu". Note that the
  result line starts with the new value of R2 (internal representation,
  then float), followed by R3 where a second operand is used.

- result = 2 (0x40000000)

  This means that, at the end, the variable "result" has the final
  value 2, with the internal representation 0x40000000. The
  internal representation tells us that this is an exact two, not
  something rounded.
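
To check such values by hand, the hex patterns can be decoded on the
host. The following is only a sketch: it assumes the internal
representation is IEEE 754 single precision, which the values in the
transcript (0x3f800000 for 1, 0x40000000 for 2) are consistent with.
The helper names are made up for this example.

```python
import struct

def bits_to_float(bits):
    """Interpret a 32-bit integer as an IEEE 754 single-precision float."""
    return struct.unpack(">f", struct.pack(">I", bits))[0]

def float_to_bits(value):
    """Return the 32-bit pattern of a value rounded to single precision."""
    return struct.unpack(">I", struct.pack(">f", value))[0]

# Bit patterns copied from the session transcript.
for name, bits in (("result", 0x40000000),
                   ("idiv", 0x40800000),
                   ("div", 0x409ffffb)):
    print(f"{name} = {bits_to_float(bits):.9g} (0x{bits:08x})")
```

Printing with nine significant digits is enough to tell an exact value
such as 0x40000000 apart from one that only displays as a round number,
such as 0x409ffffb.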

Third, the whole thing is still a bit fragile. If you get a "syntax
error", the M1 probably crashed and "pfpu" got confused about the
(lack of) output.

It also seems necessary to first run some rendering job after
booting before running pfpu commands. I could have sworn I once
saw it work without rendering first, but maybe I'm just imagining
things.

Fourth, how to fix this? The problem is

div = 5 (0x409ffffb)

which really is 4.9999976, so f2i drops the fraction and it becomes

idiv = 4 (0x40800000)
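
The effect can be reproduced on the host with the bit pattern from the
transcript. This is a sketch only: int() stands in for the truncation
observed in f2i, and round() is a hypothetical alternative shown for
comparison, not something the PFPU is known to offer.

```python
import struct

# 0x409ffffb is the bit pattern of "div" taken from the transcript.
div = struct.unpack(">f", struct.pack(">I", 0x409ffffb))[0]

opa, opb = 10.0, 2.0

truncated = float(int(div))   # drops the fraction, like the observed f2i
nearest = float(round(div))   # hypothetical: round to the nearest integer

print(opa - opb * truncated)  # remainder with the truncated quotient
print(opa - opb * nearest)    # remainder with the rounded quotient
```

Rounding to nearest is not a general fix by itself: a quotient such as
5.5 (11 / 2) would round up and make the remainder negative. The
comparison only shows how sensitive the result is to a quotient that
lands just below an integer.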

- Werner

The full mod.fpvm session:

# opa = 10
# opb = 2
# onehalf = 0.5
# twohalf = 1.5
# quake opb -> y
## r2=2 quake r2 -> r2
## 0x3f3759df 0.716215
#     y = 0.716215 (0x3f3759df)
# fmul y,y -> yy
## r2=0x3f3759df r3=0x3f3759df fmul r2,r3 -> r2
## 0x3f03519c 0.512964 0x3f3759df 0.716215
#     yy = 0.512964 (0x3f03519c)
# fmul onehalf, opb -> hx
## r2=0.5 r3=2 fmul r2,r3 -> r2
## 0x3f800000 1 0x40000000 2
#     hx = 1 (0x3f800000)
# fmul hx, yy -> hxyy
## r2=0x3f800000 r3=0x3f03519c fmul r2,r3 -> r2
## 0x3f03519c 0.512964 0x3f03519c 0.512964
#     hxyy = 0.512964 (0x3f03519c)
# fsub twohalf, hxyy -> sub
## r2=1.5 r3=0x3f03519c fsub r2,r3 -> r2
## 0x3f7cae64 0.987036 0x3f03519c 0.512964
#     sub = 0.987036 (0x3f7cae64)
# fmul sub, y -> y2
## r2=0x3f7cae64 r3=0x3f3759df fmul r2,r3 -> r2
## 0x3f34f95e 0.70693 0x3f3759df 0.716215
#     y2 = 0.70693 (0x3f34f95e)
# fmul y2,y2 -> yy
## r2=0x3f34f95e r3=0x3f34f95e fmul r2,r3 -> r2
## 0x3effdf3e 0.49975 0x3f34f95e 0.70693
#     yy = 0.49975 (0x3effdf3e)
# fmul onehalf, opb -> hx
## r2=0.5 r3=2 fmul r2,r3 -> r2
## 0x3f800000 1 0x40000000 2
#     hx = 1 (0x3f800000)
# fmul hx, yy -> hxyy
## r2=0x3f800000 r3=0x3effdf3e fmul r2,r3 -> r2
## 0x3effdf3e 0.49975 0x3effdf3e 0.49975
#     hxyy = 0.49975 (0x3effdf3e)
# fsub twohalf, hxyy -> sub
## r2=1.5 r3=0x3effdf3e fsub r2,r3 -> r2
## 0x3f800830 1.00025 0x3effdf3e 0.49975
#     sub = 1.00025 (0x3f800830)
# fmul sub, y2 -> invsqrt
## r2=0x3f800830 r3=0x3f34f95e fmul r2,r3 -> r2
## 0x3f3504f1 0.707107 0x3f34f95e 0.70693
#     invsqrt = 0.707107 (0x3f3504f1)
# fmul invsqrt, invsqrt ->invsqrt2
## r2=0x3f3504f1 r3=0x3f3504f1 fmul r2,r3 -> r2
## 0x3efffff9 0.5 0x3f3504f1 0.707107
#     invsqrt2 = 0.5 (0x3efffff9)
# fmul invsqrt2, opa -> div
## r2=0x3efffff9 r3=10 fmul r2,r3 -> r2
## 0x409ffffb 5 0x41200000 10
#     div = 5 (0x409ffffb)
# f2i div -> i
## r2=0x409ffffb f2i r2 -> r2
## 0x00000004 5.60519e-45
#     i = 5.60519e-45 (0x00000004)
# i2f i -> idiv
## r2=0x00000004 i2f r2 -> r2
## 0x40800000 4
#     idiv = 4 (0x40800000)
# fmul opb, idiv -> bidiv
## r2=2 r3=0x40800000 fmul r2,r3 -> r2
## 0x41000000 8 0x40800000 4
#     bidiv = 8 (0x41000000)
# fsub opa, bidiv -> result
## r2=10 r3=0x41000000 fsub r2,r3 -> r2
## 0x40000000 2 0x41000000 8
#     result = 2 (0x40000000)
bidiv = 8 (0x41000000)
div = 5 (0x409ffffb)
hx = 1 (0x3f800000)
hxyy = 0.49975 (0x3effdf3e)
i = 5.60519e-45 (0x00000004)
idiv = 4 (0x40800000)
invsqrt = 0.707107 (0x3f3504f1)
invsqrt2 = 0.5 (0x3efffff9)
onehalf = 0.5 (0.5)
opa = 10 (10)
opb = 2 (2)
result = 2 (0x40000000)
sub = 1.00025 (0x3f800830)
twohalf = 1.5 (1.5)
y = 0.716215 (0x3f3759df)
y2 = 0.70693 (0x3f34f95e)
yy = 0.49975 (0x3effdf3e)
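
For experimenting without an M1, the sequence of operations in the
session above can be approximated on the host. This is a sketch under
several assumptions: values are IEEE 754 single precision with
round-to-nearest after every operation, "quake" is the integer step
0x5f3759df - (bits >> 1) (which matches the one value seen in the
transcript), and f2i truncates. The PFPU may round differently, so
intermediate bit patterns need not match the transcript exactly. All
function names are made up for this example.

```python
import struct

def f32(x):
    """Round a Python float to the nearest single-precision value."""
    return struct.unpack(">f", struct.pack(">f", x))[0]

def bits(x):
    """Return the 32-bit pattern of a single-precision value."""
    return struct.unpack(">I", struct.pack(">f", x))[0]

def quake(x):
    """Initial inverse square root guess (assumed form of the instruction)."""
    guess = 0x5f3759df - (bits(x) >> 1)
    return struct.unpack(">f", struct.pack(">I", guess))[0]

def fmul(a, b):
    return f32(a * b)

def fsub(a, b):
    return f32(a - b)

def newton(y, x):
    """One refinement step: y * (1.5 - 0.5 * x * y * y)."""
    yy = fmul(y, y)
    hx = fmul(0.5, x)
    hxyy = fmul(hx, yy)
    return fmul(fsub(1.5, hxyy), y)

def mod(opa, opb):
    """Return (div, result) following the steps of the session."""
    y = quake(opb)
    invsqrt = newton(newton(y, opb), opb)
    invsqrt2 = fmul(invsqrt, invsqrt)  # approximates 1 / opb
    div = fmul(invsqrt2, opa)          # approximates opa / opb
    idiv = float(int(div))             # f2i then i2f, assumed to truncate
    bidiv = fmul(opb, idiv)
    return div, fsub(opa, bidiv)

div, result = mod(10.0, 2.0)
print(f"div = {div:.9g} (0x{bits(div):08x})")
print(f"result = {result:g}")
```

Because the quotient is built from an approximate reciprocal, it can
come out slightly below the exact value, and truncation then loses a
whole unit whenever the exact quotient is an integer.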

(end)
_______________________________________________
http://lists.milkymist.org/listinfo.cgi/devel-milkymist.org
IRC: #milkymist@Freenode
