Re: [Mesa-dev] Mesa (master): st/glsl_to_tgsi: simpler fixup of empty writemasks

2016-10-13 Thread Michel Dänzer
On 13/10/16 04:26 PM, Nicolai Hähnle wrote:
> Hi Michel,
> 
> On 13.10.2016 08:42, Michel Dänzer wrote:
>> On 13/10/16 01:50 AM, Nicolai Hähnle wrote:
>>> Module: Mesa
>>> Branch: master
>>> Commit: f5f3cadca3809952288e3726ed5fde22090dc61d
>>> URL:   
>>> http://cgit.freedesktop.org/mesa/mesa/commit/?id=f5f3cadca3809952288e3726ed5fde22090dc61d
>>>
>>>
>>> Author: Nicolai Hähnle 
>>> Date:   Fri Oct  7 12:49:36 2016 +0200
>>>
>>> st/glsl_to_tgsi: simpler fixup of empty writemasks
>>
>> This change broke the piglit tests
>> spec@glsl-110@execution@variable-indexing@vs-temp-array-mat2-index(-col)-wr
>>
>> on my Kaveri. Output with R600_DEBUG=ps,vs attached as
>> vs-temp-array-mat2-index-wr.txt .
>>
>>
>> P.S. The newly enabled tests
>> spec@arb_enhanced_layouts@execution@component-layout@vs-tcs-load-output(-indirect)
>>
>> also fail, output attached as vs-tcs-load-output.stderr .
> 
> Thanks, it's on my radar. As far as I can tell, Mesa is doing the right
> thing here and LLVM is the problem, and working around it in Mesa may
> not be possible or reliable. But it's on my list of things to fix properly.

Excellent, thanks.


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] Mesa (master): st/glsl_to_tgsi: simpler fixup of empty writemasks

2016-10-13 Thread Nicolai Hähnle

Hi Michel,

On 13.10.2016 08:42, Michel Dänzer wrote:
> On 13/10/16 01:50 AM, Nicolai Hähnle wrote:
>> Module: Mesa
>> Branch: master
>> Commit: f5f3cadca3809952288e3726ed5fde22090dc61d
>> URL:
>> http://cgit.freedesktop.org/mesa/mesa/commit/?id=f5f3cadca3809952288e3726ed5fde22090dc61d
>>
>> Author: Nicolai Hähnle
>> Date:   Fri Oct  7 12:49:36 2016 +0200
>>
>> st/glsl_to_tgsi: simpler fixup of empty writemasks
>
> This change broke the piglit tests
> spec@glsl-110@execution@variable-indexing@vs-temp-array-mat2-index(-col)-wr
> on my Kaveri. Output with R600_DEBUG=ps,vs attached as
> vs-temp-array-mat2-index-wr.txt .
>
> P.S. The newly enabled tests
> spec@arb_enhanced_layouts@execution@component-layout@vs-tcs-load-output(-indirect)
> also fail, output attached as vs-tcs-load-output.stderr .


Thanks, it's on my radar. As far as I can tell, Mesa is doing the right 
thing here and LLVM is the problem, and working around it in Mesa may 
not be possible or reliable. But it's on my list of things to fix properly.


Nicolai


Re: [Mesa-dev] Mesa (master): st/glsl_to_tgsi: simpler fixup of empty writemasks

2016-10-13 Thread Michel Dänzer

Hi Nicolai,


On 13/10/16 01:50 AM, Nicolai Hähnle wrote:
> Module: Mesa
> Branch: master
> Commit: f5f3cadca3809952288e3726ed5fde22090dc61d
> URL:
> http://cgit.freedesktop.org/mesa/mesa/commit/?id=f5f3cadca3809952288e3726ed5fde22090dc61d
> 
> Author: Nicolai Hähnle 
> Date:   Fri Oct  7 12:49:36 2016 +0200
> 
> st/glsl_to_tgsi: simpler fixup of empty writemasks

This change broke the piglit tests
spec@glsl-110@execution@variable-indexing@vs-temp-array-mat2-index(-col)-wr
on my Kaveri. Output with R600_DEBUG=ps,vs attached as
vs-temp-array-mat2-index-wr.txt .


P.S. The newly enabled tests
spec@arb_enhanced_layouts@execution@component-layout@vs-tcs-load-output(-indirect)
also fail, output attached as vs-tcs-load-output.stderr .

-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer
VERT
PROPERTY NEXT_SHADER FRAG
DCL IN[0]
DCL OUT[0], POSITION
DCL OUT[1], COLOR
DCL CONST[0..9]
DCL TEMP[0], LOCAL
DCL TEMP[1..2], ARRAY(1), LOCAL
DCL TEMP[3..8], ARRAY(2), LOCAL
DCL TEMP[9..10], ARRAY(3), LOCAL
DCL TEMP[11..12], ARRAY(4), LOCAL
DCL TEMP[13..14], LOCAL
DCL ADDR[0]
IMM[0] FLT32 {0., 0., 1., 0.}
IMM[1] INT32 {2, 0, 0, 0}
  0: MUL TEMP[0], CONST[6], IN[0].
  1: MAD TEMP[0], CONST[7], IN[0]., TEMP[0]
  2: MAD TEMP[0], CONST[8], IN[0]., TEMP[0]
  3: MAD TEMP[0], CONST[9], IN[0]., TEMP[0]
  4: MOV TEMP[1], IMM[0].
  5: MOV TEMP[2], IMM[0].
  6: MOV TEMP[3].xy, TEMP[1].xyxx
  7: MOV TEMP[4].xy, TEMP[2].xyxx
  8: MOV TEMP[9], IMM[0].
  9: MOV TEMP[10], IMM[0].
 10: MOV TEMP[5].xy, TEMP[9].xyxx
 11: MOV TEMP[6].xy, TEMP[10].xyxx
 12: MOV TEMP[11], IMM[0].
 13: MOV TEMP[12], IMM[0].
 14: MOV TEMP[7].xy, TEMP[11].xyxx
 15: MOV TEMP[8].xy, TEMP[12].xyxx
 16: UMUL TEMP[13].x, CONST[4]., IMM[1].
 17: UARL ADDR[0].x, TEMP[13].
 18: MOV TEMP[ADDR[0].x+3](2).xy, CONST[0].xyxx
 19: UARL ADDR[0].x, TEMP[13].
 20: MOV TEMP[ADDR[0].x+4](2).xy, CONST[1].xyxx
 21: UMUL TEMP[13].x, CONST[4]., IMM[1].
 22: UARL ADDR[0].x, TEMP[13].
 23: MOV TEMP[ADDR[0].x+4](2).xy, CONST[5].xyxx
 24: UMUL TEMP[13].x, CONST[4]., IMM[1].
 25: UMUL TEMP[14].x, CONST[4]., IMM[1].
 26: UARL ADDR[0].x, TEMP[14].
 27: MUL TEMP[14].xy, TEMP[ADDR[0].x+3](2).xyyy, CONST[2].
 28: UARL ADDR[0].x, TEMP[13].
 29: MAD TEMP[13].xy, TEMP[ADDR[0].x+4](2).xyyy, CONST[2]., TEMP[14].xyyy
 30: ADD TEMP[13].xy, TEMP[13].xyyy, -CONST[3].xyyy
 31: DP2 TEMP[13].x, TEMP[13].xyyy, TEMP[13].xyyy
 32: FSLT TEMP[13].x, TEMP[13]., IMM[0].
 33: UIF TEMP[13]. :0
 34:   MOV TEMP[13], IMM[0].xzxz
 35: ELSE :0
 36:   MOV TEMP[13], IMM[0].zxxz
 37: ENDIF
 38: MOV OUT[0], TEMP[0]
 39: MOV OUT[1], TEMP[13]
 40: END
radeonsi: Compiling shader 1
TGSI shader LLVM IR:

; ModuleID = 'tgsi'
source_filename = "tgsi"
target triple = "amdgcn--"

define amdgpu_vs <{ float, float, float }> @main([17 x <16 x i8>] addrspace(2)* byval dereferenceable(18446744073709551615), [16 x <16 x i8>] addrspace(2)* byval dereferenceable(18446744073709551615), [32 x <8 x i32>] addrspace(2)* byval dereferenceable(18446744073709551615), [16 x <8 x i32>] addrspace(2)* byval dereferenceable(18446744073709551615), [16 x <4 x i32>] addrspace(2)* byval dereferenceable(18446744073709551615), [16 x <16 x i8>] addrspace(2)* byval dereferenceable(18446744073709551615), i32 inreg, i32 inreg, i32 inreg, i32 inreg, i32, i32, i32, i32, i32) {
main_body:
  %15 = getelementptr [16 x <16 x i8>], [16 x <16 x i8>] addrspace(2)* %5, i64 0, i64 0, !amdgpu.uniform !0
  %16 = load <16 x i8>, <16 x i8> addrspace(2)* %15, align 16, !invariant.load !0
  %17 = call <4 x float> @llvm.SI.vs.load.input(<16 x i8> %16, i32 0, i32 %14)
  %18 = extractelement <4 x float> %17, i32 0
  %19 = extractelement <4 x float> %17, i32 1
  %20 = extractelement <4 x float> %17, i32 2
  %21 = extractelement <4 x float> %17, i32 3
  %22 = getelementptr [16 x <16 x i8>], [16 x <16 x i8>] addrspace(2)* %1, i64 0, i64 0, !amdgpu.uniform !0
  %23 = load <16 x i8>, <16 x i8> addrspace(2)* %22, align 16, !invariant.load !0
  %24 = call float @llvm.SI.load.const(<16 x i8> %23, i32 96)
  %25 = fmul float %24, %18
  %26 = call float @llvm.SI.load.const(<16 x i8> %23, i32 100)
  %27 = fmul float %26, %18
  %28 = call float @llvm.SI.load.const(<16 x i8> %23, i32 104)
  %29 = fmul float %28, %18
  %30 = getelementptr [16 x <16 x i8>], [16 x <16 x i8>] addrspace(2)* %1, i64 0, i64 0, !amdgpu.uniform !0
  %31 = load <16 x i8>, <16 x i8> addrspace(2)* %30, align 16, !invariant.load !0
  %32 = call float @llvm.SI.load.const(<16 x i8> %31, i32 108)
  %33 = fmul float %32, %18
  %34 = call float @llvm.SI.load.const(<16 x i8> %31, i32 112)
  %35 = fmul float %34, %19
  %36 = fadd float %35, %25
  %37 = getelementptr [16 x <16 x i8>], [16 x <16 x i8>] addrspace(2)* %1, i64 0, i64 0, !amdgpu.uniform !0