Hi Ilia.

I'll take a look and see what I can find out.

Thanks,
- Andy

On Wed, Apr 23, 2014 at 05:03:17PM -0700, Ilia Mirkin wrote:
> On Wed, Apr 23, 2014 at 6:22 PM, Ilia Mirkin <[email protected]> wrote:
> > Hello,
> >
> > I've been trying to add ARB_sample_shading support to nouveau, and am
> > being defeated by the gl_SampleMask tests. Everything else works fine.
> > (And naturally the tests pass with the proprietary driver.) I'm trying
> > to do this for both GT21x, as well as GF100+.
> >
> > In the GT21x case, it seems like the low bit of method 0x1928 needs to
> > be set (as well as the second-to-lowest bit), for GF100+, the low bit
> > of the last dword of the shader header needs to be set.
> >
> > But exactly which register is the output supposed to go into? It looks
> > like with the proprietary driver, r0..r3 get the first color output,
> > and r4 gets the sample mask. However the way that things are set up
> > with nouveau, r4..r7 get the first color output (and that part works
> > fine). But where should the sample mask go at the end of the fragment
> > program? r0? r8? (I've tried all of those with minimal effect.)
> > Perhaps there's more configuration that I'm missing regarding the
> > sample mask? Also, how does this interact with the frag depth (which
> > also gets implicitly assigned based on color outputs)?
>
> As a clarification to the r0..r3 vs r4..r7 for first color output,
> I've changed things around to ensure that the first color output ends
> up in r0..r3 in the nouveau shader too.
> The shader generated by nouveau is:
>
> HDR[00] = 0x00021462
> HDR[04] = 0x00000000
> HDR[08] = 0x00000000
> HDR[0c] = 0x00000000
> HDR[10] = 0x00000000
> HDR[14] = 0xf0000000
> HDR[18] = 0x00000000
> HDR[1c] = 0x00000000
> HDR[20] = 0x00000000
> HDR[24] = 0x00000000
> HDR[28] = 0x00000000
> HDR[2c] = 0x00000000
> HDR[30] = 0x00000000
> HDR[34] = 0x00000000
> HDR[38] = 0x00000000
> HDR[3c] = 0x00000000
> HDR[40] = 0x00000000
> HDR[44] = 0x00000000
> HDR[48] = 0x0000000f
> HDR[4c] = 0x00000001
> shader binary code (0x80 bytes):
> 42e04237 22804280 fff01c00 c07e0070 fff05c00 c07e0074 10009de4 28004000
> 00105c00 30044000 01201c84 14060000 04001c02 10408102 05205c84 14060000
> 720042e7 22e20042 04105c02 10040404 04011c83 68000000 0000dde2 18fe0000
> 00001de2 18000000 0c005de4 28000000 00009de2 18000000 00001de7 80000000
>
> which, with "nvdisasm -b SM30 -raw" decodes to
>
> /*0008*/ IPA.PASS R0, a[0x70], RZ;
> /*0010*/ IPA.PASS R1, a[0x74], RZ;
> /*0018*/ MOV R2, c[0x0][0x4];
> /*0020*/ FFMA R1, R1, c[0x0][0x0], R2;
> /*0028*/ F2I.S32.F32.TRUNC R0, R0;
> /*0030*/ IMUL32I.U32.U32 R0, R0, 0x10204081;
> /*0038*/ F2I.S32.F32.TRUNC R1, R1;
> /*0048*/ IMUL32I.U32.U32 R1, R1, 0x1010101;
> /*0050*/ LOP.XOR R4, R0, R1;
> /*0058*/ MOV32I R3, 0x3f800000;
> /*0060*/ MOV32I R0, 0x0;
> /*0068*/ MOV R1, R3;
> /*0070*/ MOV32I R2, 0x0;
> /*0078*/ EXIT ;
>
> While the proprietary-driver-generated shader is: [the output is of
> quad-word-writes, so the right-most dword is the first of 4...
> so you have to read it right-to-left]
>
> --816-- w 27:0x0430, 0x00000000,0x00000000,0x00000000,0x00001462
> --816-- w 27:0x0440, 0x00000000,0x00000000,0xb0000000,0x00000000
> --816-- w 27:0x0450, 0x00000000,0x00000000,0x00000000,0x00000000
> --816-- w 27:0x0460, 0x00000000,0x00000000,0x00000000,0x00000000
> --816-- w 27:0x0470, 0x00000001,0x0000000f,0x00000000,0x00000000
> --816-- w 27:0x0480, 0xc07e0074,0xfff05c00,0x22324232,0xa0423047
> --816-- w 27:0x0490, 0xc07e0070,0xfff01c00,0x2800403c,0x10009de4
> --816-- w 27:0x04a0, 0x3004803c,0x30105c40,0x2800403c,0x0000dde4
> --816-- w 27:0x04b0, 0x14860000,0x05201c84,0x3006803c,0x20009c40
> --816-- w 27:0x04c0, 0x14860000,0x09205c84,0x22004280,0x42304247
> --816-- w 27:0x04d0, 0x28000000,0xfc001de4,0x10040404,0x04009ca2
> --816-- w 27:0x04e0, 0x18fe0000,0x00005de2,0x10408102,0x0410dca2
> --816-- w 27:0x04f0, 0x28000000,0xfc009de4,0x68000000,0x08311c83
> --816-- w 27:0x0500, 0x28000000,0x0400dde4,0x20000000,0x0002e047
> --816-- w 27:0x0510, 0x4003ffff,0xe0001de7,0x80000000,0x00001de7
> --816-- w 27:0x0520, 0x40000000,0x00001de4,0x40000000,0x00001de4
> --816-- w 27:0x0530, 0x40000000,0x00001de4,0x40000000,0x00001de4
>
> Which decodes to:
>
> /*0008*/ IPA.PASS R1, a[0x74], RZ;
> /*0010*/ MOV R2, c[0x0][0xf04];
> /*0018*/ IPA.PASS R0, a[0x70], RZ;
> /*0020*/ MOV R3, c[0x0][0xf00];
> /*0028*/ FFMA.FTZ R1, R1, R2, c[0x0][0xf0c];
> /*0030*/ FFMA.FTZ R2, R0, R3, c[0x0][0xf08];
> /*0038*/ F2I.FTZ.S32.F32.TRUNC R0, R1;
> /*0048*/ F2I.FTZ.S32.F32.TRUNC R1, R2;
> /*0050*/ IMUL32I R2, R0, 0x1010101;
> /*0058*/ MOV R0, RZ;
> /*0060*/ IMUL32I R3, R1, 0x10204081;
> /*0068*/ MOV32I R1, 0x3f800000;
> /*0070*/ LOP.XOR R4, R3, R2;
> /*0078*/ MOV R2, RZ;
> /*0088*/ MOV R3, R1;
> /*0090*/ EXIT ;
>
> (Not sure why the nouveau shader only has 1 FMA, but that's the input
> shader we get from Gallium.
> I highly doubt this is the source of the
> error, since it has nothing to do with sample masks, but my question
> about sample mask output still stands even if it is :) )
>
> Oh, and for completeness, the input GLSL shader is:
>
> "#version 130\n"
> "#extension GL_ARB_sample_shading : enable\n"
> "out vec4 out_color;\n"
> "void main()\n"
> "{\n"
> /* For 128x128 image size, below formula produces a bit
>  * pattern where no two bits of gl_SampleMask[0] are
>  * correlated.
>  */
> " gl_SampleMask[0] = (int(gl_FragCoord.x) * 0x10204081) ^\n"
> " (int(gl_FragCoord.y) * 0x01010101);\n"
> " out_color = vec4(0.0, 1.0, 0.0, 1.0);\n"
> "}\n";
>
> >
> > Any insight into this would be hugely helpful. In case you feel like
> > taking a look at the actual code, these are my commits:
> > https://github.com/imirkin/mesa/commits/sample_shading . Note that
> > some bits of the sample mask were already there for nvc0 (like setting
> > the shader header bit), thus don't appear in my change.
> >
> > Thanks,
> >
> > -ilia
> _______________________________________________
> Nouveau mailing list
> [email protected]
> http://lists.freedesktop.org/mailman/listinfo/nouveau
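
The mask that the quoted GLSL shader writes can be reproduced on the CPU, which may be handy when checking what actually lands in the sample mask output register. This is only a minimal Python sketch. It assumes the integer multiply wraps at 32 bits and that int() truncates gl_FragCoord (pixel centers sit at .5, so the truncated value is the pixel's integer coordinate). The function name and the num_samples parameter are made up for illustration.

```python
MASK32 = 0xFFFFFFFF  # assumption: GLSL int arithmetic keeps the low 32 bits


def sample_mask(frag_x, frag_y, num_samples=32):
    """Low num_samples bits of the value the test shader writes to gl_SampleMask[0]."""
    x = int(frag_x)  # truncation, like int(gl_FragCoord.x)
    y = int(frag_y)
    value = ((x * 0x10204081) ^ (y * 0x01010101)) & MASK32
    # Only one bit per sample is meaningful to the hardware.
    return value & ((1 << num_samples) - 1)


if __name__ == "__main__":
    # Print the 4-sample mask for a few pixel centers.
    for px, py in [(0.5, 0.5), (1.5, 0.5), (1.5, 1.5)]:
        print(px, py, hex(sample_mask(px, py, num_samples=4)))
```

Comparing these values against a readback of the rendered samples is one way to tell "mask not applied at all" apart from "mask read from the wrong register".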

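
The two shader headers quoted in the message can be compared mechanically once the proprietary-driver dump is put back into linear order (each dump line lists its four dwords right-to-left). A rough Python sketch follows, using only the header dwords quoted in the message. The helper names are invented here, and the script makes no claim about what the differing bits mean.

```python
# Header dwords printed by nouveau, HDR[00] .. HDR[4c], copied from the message.
NOUVEAU_HDR = [
    0x00021462, 0, 0, 0,
    0, 0xF0000000, 0, 0,
    0, 0, 0, 0,
    0, 0, 0, 0,
    0, 0, 0x0000000F, 0x00000001,
]

# Proprietary-driver dump lines for offsets 0x0430..0x0470, as quoted.
BLOB_DUMP = """\
0x00000000,0x00000000,0x00000000,0x00001462
0x00000000,0x00000000,0xb0000000,0x00000000
0x00000000,0x00000000,0x00000000,0x00000000
0x00000000,0x00000000,0x00000000,0x00000000
0x00000001,0x0000000f,0x00000000,0x00000000
"""


def linearize(dump):
    """Turn quad-word dump lines into a flat dword list in memory order."""
    dwords = []
    for line in dump.splitlines():
        quad = [int(tok, 16) for tok in line.split(",")]
        dwords.extend(reversed(quad))  # right-most dword comes first
    return dwords


def diff_headers(a, b):
    """Return (byte_offset, a_value, b_value) for every dword that differs."""
    return [(i * 4, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]


if __name__ == "__main__":
    for offset, nouveau, blob in diff_headers(NOUVEAU_HDR, linearize(BLOB_DUMP)):
        print(f"HDR[{offset:02x}]: nouveau=0x{nouveau:08x} proprietary=0x{blob:08x}")
```

With the values as quoted, this reports differences at HDR[00] (0x00021462 vs 0x00001462) and HDR[14] (0xf0000000 vs 0xb0000000); every other header dword matches.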