I've had a closer look at the basic instructions listed at
http://docs.google.com/View?id=dfsp4qpd_41dtrrskfb#Operation_to_support_in_shader_8117638312455733
and have some comments and questions.

1.  The document lists both signed and unsigned additive operations.
These are equivalent, except possibly for the flags.  I suggest we not
differentiate signed and unsigned additive instructions, and instead
adopt the conventional semantics for the carry and overflow flags.
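
As a rough illustration of the conventional flag semantics, here is a
minimal C sketch assuming 32-bit registers.  The names add32 and
add_result are hypothetical and not taken from the document.

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical result record; the field names are illustrative only. */
typedef struct {
    uint32_t value;    /* low 32 bits, identical for signed and unsigned */
    bool     carry;    /* unsigned result did not fit in 32 bits */
    bool     overflow; /* signed (two's complement) result did not fit */
} add_result;

/* One add serves both interpretations; only the flags differ. */
add_result add32(uint32_t a, uint32_t b)
{
    add_result r;
    r.value = a + b;        /* wraps modulo 2^32 */
    r.carry = r.value < a;  /* carry out of bit 31 */
    /* Signed overflow: operands share a sign that the result lacks. */
    r.overflow = ((~(a ^ b) & (a ^ r.value)) >> 31) != 0;
    return r;
}
```

Software that treats the operands as unsigned would test carry, and
software that treats them as signed would test overflow.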

2.  Do we need to support extraction of the upper 32 bits of a
multiplication?  If not, then mult and umult are also equivalent except
possibly for the flags.  I think we can reuse the carry flag for
overflow of an unsigned multiplication in analogy to the additive
instructions.
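
To make the equivalence concrete, here is a small C sketch assuming
32-bit operands and that only the low 32 bits of the product are kept.
The function names are hypothetical.

```c
#include <stdint.h>
#include <stdbool.h>

/* Low 32 bits of a product; the same bits for signed and unsigned inputs. */
uint32_t mul32_low(uint32_t a, uint32_t b)
{
    return (uint32_t)((uint64_t)a * b);
}

/* Carry: the unsigned 64-bit product has nonzero upper 32 bits. */
bool mul32_carry(uint32_t a, uint32_t b)
{
    return (((uint64_t)a * b) >> 32) != 0;
}

/* Overflow: the signed 64-bit product is outside the int32_t range.
 * The casts to int32_t assume the usual two's complement conversion. */
bool mul32_overflow(uint32_t a, uint32_t b)
{
    int64_t p = (int64_t)(int32_t)a * (int64_t)(int32_t)b;
    return p < INT32_MIN || p > INT32_MAX;
}
```

For example, 0xFFFFFFFF times 3 gives the low word 0xFFFFFFFD either
way: as unsigned it sets carry, while as signed (-1 times 3) it does
not overflow.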

3.  Finally on this chain of thought, f2u and f2i will be equivalent
except for overflow detection.  Again we can use the carry flag for the
unsigned case and overflow for the signed case.
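
A possible single conversion with both flags might look like the C
sketch below.  It assumes truncation toward zero, sets both flags and
returns 0 for NaN, and returns 0 when neither range fits; all of these
are my assumptions, as are the names f2x and f2x_result.

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical result record for a combined f2u/f2i conversion. */
typedef struct {
    uint32_t value;    /* converted bits */
    bool     carry;    /* input is outside the unsigned 32-bit range */
    bool     overflow; /* input is outside the signed 32-bit range */
} f2x_result;

f2x_result f2x(float f)
{
    f2x_result r = {0u, true, true};
    if (f != f)
        return r;  /* NaN: both flags set, value 0 (assumption) */

    /* Truncation toward zero maps (-1, 0) to 0, which is a valid unsigned. */
    r.carry    = !(f > -1.0f && f < 4294967296.0f);
    r.overflow = !(f >= -2147483648.0f && f < 2147483648.0f);

    if (!r.overflow)
        r.value = (uint32_t)(int32_t)f;  /* two's complement bits */
    else if (!r.carry)
        r.value = (uint32_t)f;           /* fits only as unsigned */
    return r;
}
```

For inputs from 0 up to but not including 2^31 neither flag is set and
the bits are the same under both interpretations.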

4.  I like the idea of adding minimum and maximum functions in the
instruction set; if we need them, that is.  They should not require much
logic.  But in this case, note that there is a difference between signed
and unsigned.  Do we want both?  On the other hand it only takes 2 to 3
cycles to compute any of these if my idea of the instruction set is
correct.
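
The signed/unsigned difference is easy to see in C.  The sketch below
uses hypothetical function names and simply shows that the same bit
patterns order differently under the two interpretations.

```c
#include <stdint.h>

/* Signed minimum and maximum: compare as two's complement values. */
int32_t min_s32(int32_t a, int32_t b) { return a < b ? a : b; }
int32_t max_s32(int32_t a, int32_t b) { return a > b ? a : b; }

/* Unsigned minimum and maximum: compare as plain 32-bit magnitudes. */
uint32_t min_u32(uint32_t a, uint32_t b) { return a < b ? a : b; }
uint32_t max_u32(uint32_t a, uint32_t b) { return a > b ? a : b; }
```

With the bit patterns 0xFFFFFFFF and 1, the signed minimum is
0xFFFFFFFF (that is, -1) while the unsigned minimum is 1.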

5.  To complete the set of shift instructions we need the arithmetic and
logical shift distinction (aka signed and unsigned shifts).
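
For reference, the two right shifts differ only in what fills the
vacated upper bits.  This is a C sketch with hypothetical names; taking
the shift count modulo 32 is my assumption, not something the document
specifies.

```c
#include <stdint.h>

/* Logical (unsigned) right shift: vacated bits are filled with zeros. */
uint32_t shr_logical(uint32_t x, unsigned n)
{
    n &= 31u;  /* assumption: shift count is taken modulo 32 */
    return x >> n;
}

/* Arithmetic (signed) right shift: vacated bits copy the sign bit. */
uint32_t shr_arithmetic(uint32_t x, unsigned n)
{
    n &= 31u;  /* assumption: shift count is taken modulo 32 */
    uint32_t shifted = x >> n;
    /* Mask covering the top n bits, applied only for negative inputs. */
    uint32_t fill = (x & 0x80000000u) ? (uint32_t)~(0xFFFFFFFFu >> n) : 0u;
    return shifted | fill;
}
```

Left shifts are the same for both interpretations, so only the right
shift needs two variants.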

6.  There may be more flags than we need.  For integer division by
zero we can use the overflow flag, since apart from the signed case of
the most negative value divided by -1, that's the only way a division
can be out of range.  For floating point division by zero, infinity
seems like a natural choice, but I recall there was some discussion on
the list some time ago about special requirements for the Inf and NaN
semantics for rendering, so I'm not sure whether we need to
differentiate Inf and zero division for float.
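
For the integer case, a signed divide that reports every out-of-range
result through one overflow flag could be sketched in C as follows.
The names are hypothetical, and leaving the quotient at 0 when the flag
is set is my assumption.

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical result record for a signed 32-bit divide. */
typedef struct {
    int32_t quotient;
    bool    overflow;  /* result is not representable in 32 bits */
} div_result;

div_result div_s32(int32_t a, int32_t b)
{
    div_result r = {0, false};
    /* Division by zero and INT32_MIN / -1 are the out-of-range cases. */
    if (b == 0 || (a == INT32_MIN && b == -1)) {
        r.overflow = true;  /* quotient left at 0 (assumption) */
    } else {
        r.quotient = a / b; /* C division truncates toward zero */
    }
    return r;
}
```

An unsigned divide would need only the test for a zero divisor.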

7.  Assuming we've agreed to not run threads in dependent groups, do we
care about the loop unit?  Maybe a decrement-and-jump-if-nonzero would
do?
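
A decrement-and-jump-if-nonzero behaves like the C sketch below.  The
names djnz and run_loop are hypothetical, and a 32-bit counter register
is assumed.

```c
#include <stdint.h>

/* Hypothetical DJNZ: decrement the counter and report whether the
 * branch back to the top of the loop is taken. */
int djnz(uint32_t *counter)
{
    *counter -= 1u;
    return *counter != 0u;
}

/* Count how many times a loop body runs for a given initial counter. */
uint32_t run_loop(uint32_t count)
{
    uint32_t iterations = 0u;
    do {
        iterations++;  /* the loop body would go here */
    } while (djnz(&count));
    return iterations;
}
```

In this form the body always runs at least once, and an initial count
of 0 wraps around to 2^32 iterations, so software would have to guard
against a zero trip count itself.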

8.  We are going with conventional flag-based branch instructions,
right?  How many bits of address do we need in the branch instructions?
Do we need computed branches?