BTW, you may be interested in https://github.com/imirkin/mesa/commits/atomic3 which has working ARB_shader_atomic_counters and ARB_shader_storage_buffer_object support (while ripping out things like TGSI_FILE_RESOURCE). Still working on proper memory qualifier support, and obviously need to do some cleanup before upstreaming. Should be getting into a pushable state probably early January.
Cheers,

  -ilia

On Wed, Dec 16, 2015 at 12:24 PM, Ilia Mirkin <[email protected]> wrote:
> I believe that your problem is this:
>
>         /*01a0*/                   LD R8, [R8];
>                                    /* 0x8000000000821c85 */
>
> That needs to be LD.E (and your ST's need to be ST.E). You're using a
> 32-bit gmem address, but you need to be using a 64-bit one. I believe
> the 32-bit ones work on fermi, but afaik not on Kepler.
>
> Cheers,
>
>   -ilia
>
>
>
> On Wed, Dec 16, 2015 at 12:06 PM, Hans de Goede <[email protected]> wrote:
>> Hi,
>>
>> On 15-12-15 20:04, Ilia Mirkin wrote:
>>>
>>> Also, where's the exit op? Perhaps what's happening is that you don't
>>> have an exit and it just goes off executing into the ether?
>>
>>
>> Sorry I only included a small bit of the program in my original mail
>> because I found the use of "MOV" instructions to load constants
>> suspicious, is that normal ?
>>
>> I've put a log with NV50_PROG_DEBUG=1 output here:
>>
>> https://fedorapeople.org/~jwrdegoede/nbody.log
>>
>> nvdisasm -b SM30 for the generated binary code is here:
>>
>> https://fedorapeople.org/~jwrdegoede/nbody.disasm
>>
>> There are already .tgsi, .hex and .bin files there if
>> you find those easier to use than the
>> NV50_PROG_DEBUG=1 output.
>>
>>
>>>
>>> On Tue, Dec 15, 2015 at 12:00 PM, Ilia Mirkin <[email protected]> wrote:
>>>>
>>>> A few things that stand out:
>>>>
>>>>    0: ld u32 %r219 c0[0x0000000000000000+0x0] (0)
>>>>
>>>> wtf is that 0x0000000000000 thing doing there? Was it a %rX which got
>>>> constant-folded into 0? That indirectness should have then been
>>>> removed... that said, the final encoding looks fine.
>>
>>
>> I don't know, maybe there is a hint in the log file?
>>
>> Regards,
>>
>> Hans
>>
>>
>>
>>>>
>>>> I believe that kepler has this launch descriptor thing too... is that
>>>> being set correctly? Please generate a mmt trace, and we can see if
>>>> anything stands out compared to a blob trace that also does compute.
>>>>
>>>> Cheers,
>>>>
>>>>   -ilia
>>>>
>>>> On Tue, Dec 15, 2015 at 9:15 AM, Hans de Goede <[email protected]> wrote:
>>>>>
>>>>> Hi all,
>>>>>
>>>>> As part of my compute work I'm trying to get some TGSI compute
>>>>> code to work. The code from mesa/src/gallium/tests/trivial.c
>>>>> works.
>>>>>
>>>>> So now I'm trying to get a "native" tgsi kernel to run via
>>>>> clover, I'm using Francisco's nbody.c example for this:
>>>>>
>>>>> https://fedorapeople.org/~jwrdegoede/nbody.c
>>>>>
>>>>> Which does not work, at first I thought there was an issue
>>>>> with the setup of the input / output buffers, but that seems to
>>>>> work fine, and moreover I finally got the smart idea to look
>>>>> in dmesg, which says:
>>>>>
>>>>> [ 9920.802435] nouveau 0000:01:00.0: gr: TRAP ch 6 [007f7fa000 nbody[31881]]
>>>>> [ 9920.802449] nouveau 0000:01:00.0: gr: GPC0/TPC0/MP trap: global 00000000 [] warp 10009 [INVALID_OPCODE]
>>>>> [ 9920.802456] nouveau 0000:01:00.0: gr: GPC0/TPC1/MP trap: global 00000004 [MULTIPLE_WARP_ERRORS] warp 20009 [INVALID_OPCODE]
>>>>>
>>>>> and repeats that for every "step" in the nbody simulation, this is on a
>>>>> gk107 card.
>>>>>
>>>>> So that seems to be the real problem, since the
>>>>> error says "INVALID_OPCODE", I've put the tgsi code from nbody.c
>>>>> through "nouveau_compiler -a e4" and then run "nvdisasm -b SM30"
>>>>> on it, but the output looks ok. There is an 8-byte sequence which does
>>>>> not get decoded every 64 bytes but AFAIK that is the scheduling info,
>>>>> so that should be fine.
>>>>>
>>>>> One thing which does stand out is that this:
>>>>>
>>>>>    0: ld u32 %r219 c0[0x0000000000000000+0x0] (0)
>>>>>    1: ld u32 %r222 c0[0x4] (0)
>>>>>    2: ld u64 { %r225 %r228 } c0[0x8] (0)
>>>>>    3: ld u32 %r234 c0[0x10] (0)
>>>>>
>>>>> Gets translated into (nvdisasm output) :
>>>>>
>>>>>         /*0008*/                   LDC R4, c[0x0][0x0];
>>>>>                                    /* 0x1400000003f11c86 */
>>>>>         /*0010*/                   MOV R2, c[0x0][0x4];
>>>>>                                    /* 0x2800400010009de4 */
>>>>>         /*0018*/                   LDC.64 R0, c[0x0][0x8];
>>>>>                                    /* 0x1400000023f01ca6 */
>>>>>         /*0020*/                   MOV R3, c[0x0][0x10];
>>>>>                                    /* 0x280040004000dde4 */
>>>>>
>>>>> Where I would expect four LDC instructions, could that be the problem ?
>>>>>
>>>>> If that is not the problem, then hints how to debug this further would
>>>>> be greatly appreciated.
>>>>>
>>>>> Regards,
>>>>>
>>>>> Hans
>>>>> _______________________________________________
>>>>> Nouveau mailing list
>>>>> [email protected]
>>>>> http://lists.freedesktop.org/mailman/listinfo/nouveau
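
The diagnosis in this thread (a plain LD or ST where Kepler wants LD.E or ST.E) can be checked mechanically on nvdisasm output. What follows is only a minimal sketch, not part of Mesa or of any NVIDIA tool: the function name, the regular expression, and every sample line except the /*01a0*/ one are assumptions made for illustration, and the line pattern is inferred solely from the disassembly quoted above.

```python
import re

# Assumed shape of an nvdisasm instruction line, e.g.
#   /*01a0*/   LD R8, [R8];
# An optional predicate such as @P0 or @!P0 may precede the mnemonic.
# Encoding comments like "/* 0x8000000000821c85 */" do not match because
# of the space after "/*".
INSN_RE = re.compile(
    r"/\*([0-9a-fA-F]{4,})\*/\s+(?:@!?\w+\s+)?([A-Z][A-Z0-9_.]*)\b"
)


def find_32bit_global_accesses(disasm_text):
    """Return (offset, mnemonic) for each LD/ST that lacks the .E suffix."""
    hits = []
    for line in disasm_text.splitlines():
        match = INSN_RE.search(line)
        if match is None:
            continue
        offset, mnemonic = match.group(1), match.group(2)
        base, *suffixes = mnemonic.split(".")
        # LDC, LDL, LDS and friends have a different base name and are skipped.
        if base in ("LD", "ST") and "E" not in suffixes:
            hits.append((int(offset, 16), mnemonic))
    return hits


if __name__ == "__main__":
    # Only the 01a0 line comes from the thread; the others are made up.
    sample = """
        /*0198*/                   LDC R4, c[0x0][0x0];
        /*01a0*/                   LD R8, [R8];
                                   /* 0x8000000000821c85 */
        /*01a8*/                   LD.E R9, [R2];
    """
    for offset, mnemonic in find_32bit_global_accesses(sample):
        print("0x%04x: %s uses a 32-bit global address" % (offset, mnemonic))
```

Running it over the text of nbody.disasm would list every global load or store still using the 32-bit addressing form, which is easier than reading the whole listing by eye.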

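
The remark about an undecoded 8-byte sequence every 64 bytes can also be illustrated. The sketch below assumes, based on that remark and on the first decoded instruction sitting at offset 0x0008, that each 64-byte group of SM30 code starts with one 8-byte scheduling word followed by seven 8-byte instructions, and that words are little-endian. The function name and constants are hypothetical, and the layout itself should be treated as an assumption rather than a documented format.

```python
import struct

WORD_BYTES = 8    # each instruction or scheduling word is 64 bits
GROUP_BYTES = 64  # assumed group size: 1 scheduling word + 7 instructions


def split_sched_words(code):
    """Split raw shader code into (scheduling words, instruction words).

    Both lists hold (byte_offset, 64-bit value) tuples. Trailing bytes that
    do not fill a whole 8-byte word are ignored.
    """
    usable = len(code) - len(code) % WORD_BYTES
    words = struct.unpack("<%dQ" % (usable // WORD_BYTES), code[:usable])
    sched, insns = [], []
    for index, word in enumerate(words):
        offset = index * WORD_BYTES
        # Assumption: the word that opens each 64-byte group is scheduling info.
        if offset % GROUP_BYTES == 0:
            sched.append((offset, word))
        else:
            insns.append((offset, word))
    return sched, insns


if __name__ == "__main__":
    # Placeholder scheduling word followed by the LDC encoding quoted above.
    blob = struct.pack("<2Q", 0x0, 0x1400000003F11C86)
    sched, insns = split_sched_words(blob)
    for offset, word in insns:
        print("/*%04x*/ 0x%016x" % (offset, word))
```

Under that assumption, the offsets in the instruction list line up with the /*0008*/, /*0010*/, ... labels printed by nvdisasm, and the words in the scheduling list are the ones the disassembler leaves undecoded.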