Hi Folks,
an update to the GLYPH ISA, with a specification of the vector registers:
latest: https://metaparadigm.com/~mclark/glyph-20260122.pdf
current: https://metaparadigm.com/~mclark/glyph.pdf
source: https://github.com/michaeljclark/glyph
source: https://github.com/michaeljclark/x86
GLYPH has been designed from the perspective of an emulator developer.
it has many features that make it ideal for emulation. it has twice as
many registers as mainstream CPU architectures, which means privileged
state can be mapped to registers. it also has a software-defined MMU.
GLYPH design
GLYPH was designed after reflection on the combinatorial complexity of
X86 instruction decode and the complexity of constant synthesis in
RISC-V. the GLYPH ISA adopts an even simpler length decoding scheme than
present-day RISC-V, at the expense of losing a bit of coding space in
the 16-bit packet. the goal is that after quantitative evaluation, we
can win back some density with more compact constant synthesis due to
the "constant stream," which is analogous to push constants in Vulkan.
GLYPH has 16/32/64-bit instruction packets. at present, only the 16-bit
packet has had its encodings placed, and the placement is based on
analysis of
common X86 prologue/epilogue codegen sequences seen in QEMU for X86-64,
with special PC-relative loads and stores designed to fish pointers
from .data sections and operations for register save/restore, pointer
arithmetic, and loops. 16-bit instructions are presently limited to
64-bit arithmetic, except that constants may be sign-extended 32-bit
values, e.g.
MOVABS reg, imm64
MOV reg, imm32
ADD reg, imm32
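as a minimal sketch of what an emulator would do with the sign-extended
32-bit constants above, assuming two's-complement 64-bit registers that
wrap on overflow. the helper names (sext32, mov_imm32, add_imm32) are
hypothetical and are not part of the GLYPH specification:

```python
MASK64 = (1 << 64) - 1  # 64-bit register width


def sext32(imm32: int) -> int:
    """Sign-extend a 32-bit immediate to a 64-bit register value."""
    imm32 &= 0xFFFFFFFF
    # flipping and then subtracting bit 31 propagates it into bits 32..63
    return ((imm32 ^ 0x80000000) - 0x80000000) & MASK64


def mov_imm32(imm32: int) -> int:
    """Model of 'MOV reg, imm32': the register receives the extended value."""
    return sext32(imm32)


def add_imm32(reg: int, imm32: int) -> int:
    """Model of 'ADD reg, imm32': 64-bit add, assumed to wrap on overflow."""
    return (reg + sext32(imm32)) & MASK64
```

with this model, adding the immediate 0xFFFFFFF8 behaves as adding -8,
which is the kind of stack pointer adjustment seen in prologue and
epilogue sequences.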
GLYPH evolution
it is intended that the 64-bit instruction packet will contain Intel
EVEX-like instructions for a 512-bit vector register file, plus the
addition of an extended 4096-bit vector length (i32x128 and f32x128).
there are no special mask registers because the scalar port is 128-bit.
I intend to add 4-lane swizzle to common integer and floating-point
vector operations: VFADD, VFSUB, VFMUL, VFDIV, VFMIN, VFMAX:
float swizzle bits = { x, y, z, w, 0, 1, -1, NaN }
int swizzle bits = { x, y, z, w, 0, 1, -1, 2 }
this is so that we can reduce lane shuffle ops for common arithmetic.
here is a 3D cross product, and it maps to current SPIR-V shader caps:
VFMUL r4 r1.yzx_ r2.zxy_ ; (ay*bz, az*bx, ax*by)
VFMUL r5 r1.zxy_ r2.yzx_ ; (az*by, ax*bz, ay*bx)
VFSUB r3 r4 r5 ; cross product in r3
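the three-instruction sequence above can be checked with a small
software model. this is only a sketch: the function names are
hypothetical, registers are modelled as 4-element lists, and the "_"
selector is assumed to mean an unused lane, which is filled with 0.0
here. the constant selectors follow the float swizzle list:

```python
import math

LANE = {"x": 0, "y": 1, "z": 2, "w": 3}                       # lane selectors
CONST = {"0": 0.0, "1": 1.0, "-1": -1.0, "NaN": math.nan}    # constant selectors


def swizzle(reg, sel):
    """Apply a 4-lane swizzle; sel is a sequence of selector tokens."""
    out = []
    for s in sel:
        if s in LANE:
            out.append(reg[LANE[s]])
        elif s in CONST:
            out.append(CONST[s])
        elif s == "_":
            out.append(0.0)  # assumption: unused lane, value does not matter
        else:
            raise ValueError(f"unknown selector {s!r}")
    return out


def vfmul(a, sel_a, b, sel_b):
    """Model of VFMUL with a swizzle on each source operand."""
    return [p * q for p, q in zip(swizzle(a, sel_a), swizzle(b, sel_b))]


def vfsub(a, b):
    """Model of VFSUB without swizzles."""
    return [p - q for p, q in zip(a, b)]


def cross(r1, r2):
    r4 = vfmul(r1, "yzx_", r2, "zxy_")  # (ay*bz, az*bx, ax*by)
    r5 = vfmul(r1, "zxy_", r2, "yzx_")  # (az*by, ax*bz, ay*bx)
    return vfsub(r4, r5)                # cross product in lanes x, y, z
```

without operand swizzles, each VFMUL would need its sources rearranged
by separate shuffle instructions first, which is the overhead the
swizzle bits are meant to remove.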
GLYPH future
one of the defining characteristics of the work is that we have tried,
where possible, to use ARPA naming. that distinguishes this work from
many other instruction set architectures, not in any essential
functional way, but primarily in ease of comprehension. for example,
instead of using topological and logical APIC IDs for IPI targets, we
simply use "thread address" and "thread domain".
Regards,
Michael.