A few things that stand out:

0: ld u32 %r219 c0[0x0000000000000000+0x0] (0)

wtf is that 0x0000000000000 thing doing there? Was it a %rX which got
constant-folded into 0? That indirectness should then have been
removed... that said, the final encoding looks fine.

I believe that Kepler has this launch descriptor thing too... is that
being set correctly? Please generate an mmt trace, and we can see if
anything stands out compared to a blob trace that also does compute.

Cheers,

  -ilia

On Tue, Dec 15, 2015 at 9:15 AM, Hans de Goede <[email protected]> wrote:
> Hi all,
>
> As part of my compute work I'm trying to get some TGSI compute
> code to work. The code from mesa/src/gallium/tests/trivial.c
> works.
>
> So now I'm trying to get a "native" tgsi kernel to run via
> clover, I'm using Francisco's nbody.c example for this:
>
> https://fedorapeople.org/~jwrdegoede/nbody.c
>
> Which does not work, at first I thought there was an issue
> with the setup of the input / output buffers, but that seems to
> work fine, and moreover I finally got the smart idea to look
> in dmesg, which says:
>
> [ 9920.802435] nouveau 0000:01:00.0: gr: TRAP ch 6 [007f7fa000 nbody[31881]]
> [ 9920.802449] nouveau 0000:01:00.0: gr: GPC0/TPC0/MP trap: global 00000000
> [] warp 10009 [INVALID_OPCODE]
> [ 9920.802456] nouveau 0000:01:00.0: gr: GPC0/TPC1/MP trap: global 00000004
> [MULTIPLE_WARP_ERRORS] warp 20009 [INVALID_OPCODE]
>
> and repeats that for every "step" in the nbody simulation, this is on a
> gk107 card.
>
> So that seems to be the real problem, since the
> error says "INVALID_OPCODE", I've put the tgsi code from nbody.c
> through "nouveau_compiler -a e4" and then run "nvdisasm -b SM30"
> on it, but the output looks ok. There is an 8 byte sequence which does
> not get decoded every 64 bytes but AFAIK that is the scheduling info,
> so that should be fine.
>
> One thing which does stand out is that this:
>
>   0: ld u32 %r219 c0[0x0000000000000000+0x0] (0)
>   1: ld u32 %r222 c0[0x4] (0)
>   2: ld u64 { %r225 %r228 } c0[0x8] (0)
>   3: ld u32 %r234 c0[0x10] (0)
>
> Gets translated into (nvdisasm output):
>
>   /*0008*/ LDC R4, c[0x0][0x0];
>            /* 0x1400000003f11c86 */
>   /*0010*/ MOV R2, c[0x0][0x4];
>            /* 0x2800400010009de4 */
>   /*0018*/ LDC.64 R0, c[0x0][0x8];
>            /* 0x1400000023f01ca6 */
>   /*0020*/ MOV R3, c[0x0][0x10];
>            /* 0x280040004000dde4 */
>
> Where I would expect four LDC instructions, could that be the problem?
>
> If that is not the problem, then hints how to debug this further would be
> greatly appreciated.
>
> Regards,
>
> Hans
> _______________________________________________
> Nouveau mailing list
> [email protected]
> http://lists.freedesktop.org/mailman/listinfo/nouveau
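
For checking the "8 byte sequence every 64 bytes" by hand, here is a minimal sketch in Python that splits a raw SM30 code blob into the undecoded scheduling words and the real instruction words. It assumes the scheduling word is the first 8 bytes of each 64-byte group, which matches the first instruction sitting at offset 0x0008 in the nvdisasm output above, and that the words are little-endian. The function name and the zero placeholder used for the scheduling word are made up for illustration; they are not taken from nouveau or nvdisasm.

```python
import struct

GROUP_BYTES = 64  # per the mail: one undecoded 8-byte word every 64 bytes
WORD_BYTES = 8    # each instruction (and each scheduling word) is 8 bytes


def split_sm30_code(code: bytes):
    """Split a code blob into (sched_words, instructions).

    Both lists hold (byte_offset, 64-bit word) pairs. Assumption: the
    scheduling word sits at the start of every 64-byte group.
    """
    if len(code) % WORD_BYTES:
        raise ValueError("code length must be a multiple of 8 bytes")
    sched, insns = [], []
    for off in range(0, len(code), WORD_BYTES):
        (word,) = struct.unpack_from("<Q", code, off)
        if off % GROUP_BYTES == 0:
            sched.append((off, word))
        else:
            insns.append((off, word))
    return sched, insns


# The four encodings quoted in the mail, preceded by a placeholder
# scheduling word (0 is a stand-in, not the real scheduling value).
words = [0x0,
         0x1400000003f11c86, 0x2800400010009de4,
         0x1400000023f01ca6, 0x280040004000dde4]
blob = b"".join(struct.pack("<Q", w) for w in words)

sched, insns = split_sm30_code(blob)
for off, word in insns:
    print(f"/*{off:04x}*/ {word:#018x}")
```

Diffing the instruction list from nouveau_compiler against the same split applied to a blob-generated kernel would at least show whether the non-scheduling words differ in anything other than register numbers.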
