On Fri, Dec 18, 2015 at 7:55 AM, Hans de Goede <[email protected]> wrote:
> Hi,
>
> On 16-12-15 18:24, Ilia Mirkin wrote:
>>
>> I believe that your problem is this:
>>
>>         /*01a0*/                   LD R8, [R8];
>> /* 0x8000000000821c85 */
>>
>> That needs to be LD.E (and your ST's need to be ST.E). You're using a
>> 32-bit gmem address, but you need to be using a 64-bit one. I believe
>> the 32-bit ones work on fermi, but afaik not on Kepler.
>
>
> I do not think that is the problem, src/gallium/tests/trivial/compute
> test_input_global() has:
>
> COMP
> DCL SV[0], THREAD_ID
> DCL TEMP[0], LOCAL
> DCL TEMP[1], LOCAL
> IMM[0] UINT32 {8, 0, 0, 0}
>   0: BGNSUB :0
>   1:   UMUL TEMP[0], SV[0], IMM[0]
>   2:   LOAD TEMP[1].xy, RES[32764], TEMP[0]
>   3:   LOAD TEMP[0].x, RES[32767], TEMP[1].yyyy
>   4:   UADD TEMP[1].x, TEMP[0], -TEMP[1]
>   5:   STORE RES[32767].x, TEMP[1].yyyy, TEMP[1]
>   6:   RET
>   7: ENDSUB
>
> Which translates to:
>
> SUB:0 ()
>   BB:0 (7 instructions) - df = { }
>    -> BB:1 (cross)
>     0: rdsv u32 $r0 sv[TID:0] (8)
>     1: shl u32 $r2 $r0 0x00000003 (8)
>     2: ld u64 $r0d c0[$r2+0x0] (8)
>     3: ld u32 $r2 g[$r1+0x0] (8)
>     4: add u32 $r0 $r2 neg $r0 (8)
>     5: st u32 # g[$r1+0x0] $r0 (8)
>     6: ret (8)
>   BB:1 (0 instructions) - idom = BB:0, df = { }
>
> MAIN:-1 ()
>   BB:0 (0 instructions) - df = { }
>
> Which is also using 32 bits loads from global memory
> and that works fine on my GK107 [GeForce GT 740].
>
> I think that for now I'll just focus on translating
> the tests from src/gallium/tests/trivial/compute.c to
> opencl and getting the entire opencl -> llvm -> tgsi ->
> nouveau_compiler -> hardware chain to work that way.
>
> Still would be good to get nbody.c to work though.

Hmmmm odd. Not sure how 32-bit addresses work there. (Or on Fermi
tbh.) Probably assumes that the upper 8 bits of the 40-bit VA are 0?

Anyways, another thing I remember is that I couldn't get barriers to
work at all with tess (with iirc, invalid opcode errors). My solution
to the problem was to just discard them, since that's what the blob
seemed to do, and I assumed they knew what they were doing. Perhaps I
was just emitting it wrong. I'd take a careful look at how the blob
emits that BAR.SYNC primitive.

Cheers,

  -ilia
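
For illustration only, the guess about 32-bit addresses can be written
down as a small C sketch. It assumes, as speculated above and not
confirmed anywhere in this thread, that a 32-bit global address is
zero-extended, so bits 39:32 of the 40-bit VA are 0. The helper names
are hypothetical and do not come from nouveau or Mesa.

```c
#include <stdint.h>

/* Hypothetical helper, not nouveau code: zero-extend a 32-bit global
 * address into a 40-bit virtual address. Under the assumption stated
 * above, VA bits 39:32 are simply 0. */
static uint64_t va40_from_addr32(uint32_t addr32)
{
    return (uint64_t)addr32;
}

/* Hypothetical helper: returns 1 if a 40-bit VA could be reached
 * through a 32-bit address under that assumption, i.e. if its upper
 * 8 bits (bits 39:32) are all zero, and 0 otherwise. */
static int va40_fits_addr32(uint64_t va40)
{
    return (va40 >> 32) == 0;
}
```

If that assumption holds, a plain LD/ST would only work for buffers
that happen to sit in the low 4 GiB of the VA space, while LD.E/ST.E
with a 64-bit address would work everywhere.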
