I've had a closer look at the basic instructions listed at http://docs.google.com/View?id=dfsp4qpd_41dtrrskfb#Operation_to_support_in_shader_8117638312455733 and I have some comments and questions.

1. The document lists both signed and unsigned additive operations. These are equivalent, except possibly for the flags. I suggest we do not differentiate signed and unsigned additive instructions, and instead adopt the conventional semantics for the carry and overflow flags.

2. Do we need to support extraction of the upper 32 bits of a multiplication? If not, then mult and umult are also equivalent except possibly for the flags. I think we can reuse the carry flag for overflow of an unsigned multiplication, in analogy to the additive instructions.

3. Finally, on this chain of thought, f2u and f2i will be equivalent except for overflow detection. Again we can use the carry flag for the unsigned case and the overflow flag for the signed case.

4. I like the idea of adding minimum and maximum functions to the instruction set; if we need them, that is. They should not require much logic. But in this case, note that there is a difference between signed and unsigned. Do we want both? On the other hand, it only takes 2 to 3 cycles to compute any of these if my idea of the instruction set is correct.

5. To complete the set of shift instructions we need the distinction between arithmetic and logical shifts (aka signed and unsigned shifts).

6. There may be more flags than we need. For integer division by zero we can use the overflow flag, since that's the only way a division can be out of range. For floating-point division by zero, infinity seems like a natural choice, but I recall there was some discussion on the list some time ago about special requirements for the Inf and NaN semantics for rendering, so I'm not sure whether we need to differentiate Inf and division by zero for float.

7. Assuming we've agreed not to run threads in dependent groups, do we care about the loop unit? Maybe a decrement-and-jump-if-nonzero would do?

8. We are going with conventional flag-based branch instructions, right? How many bits of address do we need in the branch instructions? Do we need computed branches?
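
Points 1, 2 and 5 can be checked with a small model. The following is only a sketch in Python, assuming 32-bit registers and two's-complement integers; the function names (add32, mul32_low, shr_logical, shr_arith) are made up for illustration and are not taken from the instruction list.

```python
MASK = 0xFFFFFFFF      # assumed register width: 32 bits
SIGN = 0x80000000      # sign bit of a 32-bit word


def to_signed(x):
    """Reinterpret a 32-bit pattern as a two's-complement integer."""
    x &= MASK
    return x - (1 << 32) if x & SIGN else x


def add32(a, b):
    """One adder for signed and unsigned: returns (result, carry, overflow)."""
    a &= MASK
    b &= MASK
    full = a + b
    result = full & MASK
    carry = full > MASK  # unsigned sum does not fit in 32 bits
    # Signed overflow: both operands have the same sign, the result differs.
    overflow = bool(~(a ^ b) & (a ^ result) & SIGN)
    return result, carry, overflow


def mul32_low(a, b):
    """Low 32 bits of a product; identical for signed and unsigned inputs."""
    return ((a & MASK) * (b & MASK)) & MASK


def shr_logical(x, n):
    """Unsigned (logical) right shift: zeros enter from the left."""
    return (x & MASK) >> n


def shr_arith(x, n):
    """Signed (arithmetic) right shift: the sign bit is replicated."""
    return (to_signed(x) >> n) & MASK
```

The same bit pattern comes out of add32 whether the operands are read as signed or unsigned; only the flag to inspect differs (carry for unsigned, overflow for signed). The same holds for the low word of a multiplication, while the two right shifts really do produce different results.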
