One great big block comment isn't the best way to document
the syntax of a language.
Signed-off-by: Richard Henderson
---
MAINTAINERS | 1 +
docs/decodetree.rst | 156 ++
scripts/decodetree.py | 134 +---
3 files changed, 158 insertions(+), 133 deletions(-)
create mode 100644 docs/decodetree.rst
diff --git a/MAINTAINERS b/MAINTAINERS
index ad007748b9..fc7cddb873 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -118,6 +118,7 @@ F: exec.c
F: accel/tcg/
F: accel/stubs/tcg-stub.c
F: scripts/decodetree.py
+F: docs/decodetree.rst
F: include/exec/cpu*.h
F: include/exec/exec-all.h
F: include/exec/helper*.h
diff --git a/docs/decodetree.rst b/docs/decodetree.rst
new file mode 100644
index 00..d9be30b2db
--- /dev/null
+++ b/docs/decodetree.rst
@@ -0,0 +1,156 @@
+
+Decodetree Specification
+
+
+A *decodetree* is built from instruction *patterns*. A pattern may
+represent a single architectural instruction or a group of same, depending
+on what is convenient for further processing.
+
+Each pattern has both *fixedbits* and *fixedmask*, the combination of which
+describes the condition under which the pattern is matched::
+
+ (insn & fixedmask) == fixedbits
+
+Each pattern may have *fields*, which are extracted from the insn and
+passed along to the translator. Examples of such are registers,
+immediates, and sub-opcodes.
+
+In support of patterns, one may declare *fields*, *argument sets*, and
+*formats*, each of which may be re-used to simplify further definitions.
+
+Fields
+==
+
+Syntax::
+
+ field_def := '%' identifier ( unnamed_field )+ ( !function=identifier )?
+ unnamed_field := number ':' ( 's' ) number
+
+For *unnamed_field*, the first number is the least-significant bit position
+of the field and the second number is the length of the field. If the 's' is
+present, the field is considered signed. If multiple ``unnamed_fields`` are
+present, they are concatenated. In this way one can define disjoint fields.
+
+If ``!function`` is specified, the concatenated result is passed through the
+named function, taking and returning an integral value.
+
+FIXME: the fields of the structure into which this result will be stored
+is restricted to ``int``. Which means that we cannot expand 64-bit items.
+
+Field examples:
+
++---+-+
+| Input | Generated code |
++===+=+
+| %disp 0:s16 | sextract(i, 0, 16) |
++---+-+
+| %imm9 16:6 10:3 | extract(i, 16, 6) << 3 | extract(i, 10, 3) |
++---+-+
+| %disp12 0:s1 1:1 2:10 | sextract(i, 0, 1) << 11 | |
+| |extract(i, 1, 1) << 10 | |
+| |extract(i, 2, 10)|
++---+-+
+| %shimm8 5:s8 13:1 | expand_shimm8(sextract(i, 5, 8) << 1 | |
+| !function=expand_shimm8 | extract(i, 13, 1))|
++---+-+
+
+Argument Sets
+=
+
+Syntax::
+
+ args_def:= '&' identifier ( args_elt )+ ( !extern )?
+ args_elt:= identifier
+
+Each *args_elt* defines an argument within the argument set.
+Each argument set will be rendered as a C structure "arg_$name"
+with each of the fields being one of the member arguments.
+
+If ``!extern`` is specified, the backing structure is assumed
+to have been already declared, typically via a second decoder.
+
+Argument set examples::
+
+ ra rb rc
+reg base offset
+
+
+Formats
+===
+
+Syntax::
+
+ fmt_def := '@' identifier ( fmt_elt )+
+ fmt_elt := fixedbit_elt | field_elt | field_ref | args_ref
+ fixedbit_elt := [01.-]+
+ field_elt:= identifier ':' 's'? number
+ field_ref:= '%' identifier | identifier '=' '%' identifier
+ args_ref := '&' identifier
+
+Defining a format is a handy way to avoid replicating groups of fields
+across many instruction patterns.
+
+A *fixedbit_elt* describes a contiguous sequence of bits that must
+be 1, 0, or don't care. The difference between '.' and '-'
+is that '.' means that the bit will be covered with a field or a
+final 0 or 1 from the pattern, and '-' means that the bit is really
+ignored by the cpu and will not be specified.
+
+A *field_elt* describes a simple field only given a width; the position of
+the field is implied by its position with respect to other *fixedbit_elt*
+and *field_elt*.
+
+If any *fixedbit_elt* or *field_elt* appear, then all bits must be