Op 11-01-15 om 01:34 schreef Ilia Mirkin:
And you're allowing saturate/neg emission on the short form.
Yes
Is this already in envytools?
Tesla floating point instructions are poorly documented in the RST
documents; fmad is no exception. I'll make sure to check envydis.
Also, what's the shortForm thing?
Documented in envytools; see
http://envytools.readthedocs.org/en/latest/hw/graph/tesla/cuda/isa.html#instruction-format
. In short, opcodes are either 4 bytes (short) or 8 bytes (long).
This change is
probably fine, but the changelog needs work.
If you insist I could elaborate a little further. However, documenting
what a short opcode is seems a bit superfluous.
On Sat, Jan 10, 2015 at 7:22 PM, Roy Spliet <[email protected]> wrote:
MAD IMM has a very specific SDST == SSRC2 requirement, so don't emit
Signed-off-by: Roy Spliet <[email protected]>
---
.../drivers/nouveau/codegen/nv50_ir_emit_nv50.cpp | 18 ++++++++++++------
.../drivers/nouveau/codegen/nv50_ir_target_nv50.cpp | 2 +-
2 files changed, 13 insertions(+), 7 deletions(-)
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nv50.cpp
b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nv50.cpp
index 2077388..b1e7409 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nv50.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nv50.cpp
@@ -939,9 +939,20 @@ CodeEmitterNV50::emitFMAD(const Instruction *i)
code[0] = 0xe0000000;
+ if (i->src(1).getFile() == FILE_IMMEDIATE) {
+ code[1] = 0;
+ emitForm_IMM(i);
+ code[0] |= neg_mul << 15;
+ code[0] |= neg_add << 22;
+ if (i->saturate)
+ code[0] |= 1 << 8;
+ } else
if (i->encSize == 4) {
emitForm_MUL(i);
- assert(!neg_mul && !neg_add);
+ code[0] |= neg_mul << 15;
+ code[0] |= neg_add << 22;
+ if (i->saturate)
+ code[0] |= 1 << 8;
} else {
code[1] = neg_mul << 26;
code[1] |= neg_add << 27;
@@ -1931,11 +1942,6 @@ CodeEmitterNV50::getMinEncodingSize(const Instruction
*i) const
// check constraints on short MAD
if (info.srcNr >= 2 && i->srcExists(2)) {
- if (i->saturate || i->src(2).mod)
- return 8;
- if ((i->src(0).mod ^ i->src(1).mod) ||
- (i->src(0).mod | i->src(1).mod).abs())
- return 8;
if (!i->defExists(0) ||
i->def(0).rep()->reg.data.id != i->src(2).rep()->reg.data.id)
return 8;
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nv50.cpp
b/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nv50.cpp
index 48f996b..f4dedd7 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nv50.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nv50.cpp
@@ -118,7 +118,7 @@ void TargetNV50::initOpInfo()
static const uint32_t shortForm[(OP_LAST + 31) / 32] =
{
// MOV,ADD,SUB,MUL,SAD,L/PINTERP,RCP,TEX,TXF
- 0x00010e40, 0x00000040, 0x00000498, 0x00000000
+ 0x00014e40, 0x00000040, 0x00000498, 0x00000000
};
static const operation noDestList[] =
{
--
2.1.0
_______________________________________________
Nouveau mailing list
[email protected]
http://lists.freedesktop.org/mailman/listinfo/nouveau
_______________________________________________
Nouveau mailing list
[email protected]
http://lists.freedesktop.org/mailman/listinfo/nouveau
_______________________________________________
Nouveau mailing list
[email protected]
http://lists.freedesktop.org/mailman/listinfo/nouveau