[Mesa-dev] [PATCH v2 3/7] nv50/ir: optimize neg(and(set, 1)) to set

2016-01-27 Thread Karol Herbst
From: Karol Herbst <g...@karolherbst.de> helps shaders in saints row IV, bioshock infinite and shadow warrior total instructions in shared programs : 1921966 -> 1910935 (-0.57%) total gprs used in shared programs: 251863 -> 251728 (-0.05%) total local used in shared programs :
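
The identity behind this peephole can be checked in isolation. The sketch below assumes that an integer-typed set produces all ones (-1) for true and 0 for false; that convention is an assumption made here, not something the truncated snippet confirms, and the helper names are made up for illustration.

```cpp
#include <cstdint>

// Hypothetical stand-in for an integer SET: assumed to yield
// all ones (-1) when the comparison holds and 0 otherwise.
constexpr int32_t set_lt(int32_t a, int32_t b)
{
    return a < b ? -1 : 0;
}

// The unoptimized sequence from the patch title: neg(and(set, 1)).
constexpr int32_t neg_and_one(int32_t set_result)
{
    return -(set_result & 1);
}

// For both possible SET results the sequence reproduces its input,
// so under the stated assumption the and/neg pair is redundant.
static_assert(neg_and_one(set_lt(1, 2)) == set_lt(1, 2), "true case");
static_assert(neg_and_one(set_lt(2, 1)) == set_lt(2, 1), "false case");
```

The rewrite is only sound when the input is known to be 0 or -1, which is why the title restricts it to set results.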

[Mesa-dev] [PATCH v2 6/7] nv50/ir: run DCE backwards

2016-01-27 Thread Karol Herbst
reduces calls up to 50% Signed-off-by: Karol Herbst <nouv...@karolherbst.de> Reviewed-by: Ilia Mirkin <imir...@alum.mit.edu> --- src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/src/gallium/drivers/nou
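
Why sweep direction matters for dead code elimination can be shown with a toy model. This is a sketch under stated assumptions, not the nv50/ir pass: the `Insn` layout, the value ids and the helper names are all hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Toy instruction: defines one value and reads some others.
struct Insn {
    int def;                // id of the value this instruction defines
    std::vector<int> srcs;  // ids of the values it reads
    bool live_out;          // result is needed outside the block
    bool removed;           // already deleted by an earlier step
};

// Count how many remaining instructions read value `id`.
static int uses(const std::vector<Insn> &prog, int id)
{
    int n = 0;
    for (const Insn &i : prog) {
        if (i.removed)
            continue;
        for (int s : i.srcs)
            if (s == id)
                ++n;
    }
    return n;
}

// Sweep until nothing changes; return the number of sweeps needed.
static int run_dce(std::vector<Insn> &prog, bool backwards)
{
    int sweeps = 0;
    bool changed = true;
    while (changed) {
        changed = false;
        ++sweeps;
        const std::size_t n = prog.size();
        for (std::size_t k = 0; k < n; ++k) {
            Insn &i = prog[backwards ? n - 1 - k : k];
            if (!i.removed && !i.live_out && uses(prog, i.def) == 0) {
                i.removed = true;
                changed = true;
            }
        }
    }
    return sweeps;
}

// A chain v0 -> v1 -> v2 in which nothing is live out.
static std::vector<Insn> dead_chain()
{
    return { {0, {}, false, false},
             {1, {0}, false, false},
             {2, {1}, false, false} };
}
```

On this chain a forward sweep can only delete the last instruction each time round, because every earlier one still has a user, while a backward sweep deletes the user first and clears the whole chain in one go.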

[Mesa-dev] [PATCH v2 1/7] nv50/ir: enable PostRaConstantFolding for [c0, f0)

2016-01-27 Thread Karol Herbst
From: Karol Herbst <g...@karolherbst.de> helps shaders in multiple games total instructions in shared programs : 1925865 -> 1922112 (-0.19%) total gprs used in shared programs: 251863 -> 251863 (0.00%) total local used in shared programs : 5673 -> 5673 (0.00%) total bytes

[Mesa-dev] [PATCH v2 2/7] nv50/ir: swap sources in PostRaConstantFolding when src0 is imm helps some shaders in multiple games

2016-01-27 Thread Karol Herbst
lgpr inst bytes helped 0 0 62 62 hurt 0 0 0 0 v2: make the diff more clear and use swapSources Signed-off-by: Karol Herbst <nouv...@karolherbst.de> --- src/gallium/drive

[Mesa-dev] [PATCH v2 4/7] nv50/ir: optimize shl(shr(a, c), c) to and(a, ~((1 << c) - 1))

2016-01-27 Thread Karol Herbst
From: Karol Herbst <g...@karolherbst.de> helps shaders in multiple games total instructions in shared programs : 1910935 -> 1901781 (-0.48%) total gprs used in shared programs: 251728 -> 251728 (0.00%) total local used in shared programs : 5673 -> 5673 (0.00%) total bytes
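
The rewrite in the title can be sanity-checked with plain integer arithmetic. A minimal sketch, assuming 32-bit unsigned operands and a shift amount in the range 0 to 31; the function names are illustrative only.

```cpp
#include <cstdint>

// Unoptimized form: shift right, then left by the same amount.
constexpr uint32_t shl_shr(uint32_t a, unsigned c)
{
    return (a >> c) << c;
}

// Replacement form from the patch title: and(a, ~((1 << c) - 1)).
constexpr uint32_t and_mask(uint32_t a, unsigned c)
{
    return a & ~((uint32_t{1} << c) - 1);
}

// Both forms clear the low c bits and leave the rest untouched.
static_assert(shl_shr(0xdeadbeefu, 8) == and_mask(0xdeadbeefu, 8), "c = 8");
static_assert(and_mask(0xdeadbeefu, 8) == 0xdeadbe00u, "low byte cleared");
static_assert(shl_shr(0xffffffffu, 0) == and_mask(0xffffffffu, 0), "c = 0");
```

With an immediate c the mask is a compile-time constant, so two shifts collapse into a single and.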

[Mesa-dev] [PATCH v2 7/7] nv50/ir: optimize mad/fma with third argument 0 to mul

2016-01-27 Thread Karol Herbst
From: Karol Herbst <g...@karolherbst.de> total instructions in shared programs : 1895008 -> 1894759 (-0.01%) total gprs used in shared programs: 251728 -> 251715 (-0.01%) total local used in shared programs : 5673 -> 5673 (0.00%) total bytes used in shared programs : 17377

[Mesa-dev] [PATCH v2 0/7] nv50/ir: various compiler optimizations

2016-01-27 Thread Karol Herbst
w IV shadow warrior talos principle unigine heaven/valley wasteland 2 witcher 2 Karol Herbst (7): nv50/ir: enable PostRaConstantFolding for [c0,f0) nv50/ir: swap sources in PostRaConstantFolding when src0 is imm nv50/ir: optimize neg(and(set, 1)) to set nv50/ir: optimize shl(

[Mesa-dev] [PATCH 1/2] nv50/ir: optimize neg(and(set, 1)) to set

2016-02-01 Thread Karol Herbst
: simplified the code Signed-off-by: Karol Herbst <nouv...@karolherbst.de> --- .../drivers/nouveau/codegen/nv50_ir_peephole.cpp | 32 ++ 1 file changed, 32 insertions(+) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp b/src/gallium/drive

[Mesa-dev] [PATCH 0/2] nouveau compiler improvements

2016-02-01 Thread Karol Herbst
the first patch improves some shaders in some games and the second one fixes an issue if the optimization passes are rerun Karol Herbst (2): nv50/ir: optimize neg(and(set, 1)) to set nv50/ir: we can't do the add to mad conversion when the mul saturates .../drivers/nouveau/codegen

[Mesa-dev] [PATCH 2/2] nv50/ir: we can't do the add to mad conversion when the mul saturates

2016-02-01 Thread Karol Herbst
--- src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp | 3 +++ 1 file changed, 3 insertions(+) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp index 1d04a6d..7e58328 100644 ---

[Mesa-dev] [PATCH 2/6] nv50/ir: swap sources in PostRaConstantFolding when src0 is imm

2016-01-25 Thread Karol Herbst
From: Karol Herbst <g...@karolherbst.de> helps some shaders in multiple games total instructions in shared programs : 1922267 -> 1922121 (-0.01%) total gprs used in shared programs: 251878 -> 251878 (0.00%) total local used in shared programs : 5673 -> 5673 (0.00%) t

[Mesa-dev] [PATCH 5/6] nv50/ir: add PostRADCE Pass

2016-01-25 Thread Karol Herbst
From: Karol Herbst <g...@karolherbst.de> helps shaders in some games total instructions in shared programs : 1901958 -> 1895185 (-0.36%) total gprs used in shared programs: 251739 -> 251739 (0.00%) total local used in shared programs : 5673 -> 5673 (0.00%) total bytes

[Mesa-dev] [PATCH 1/6] nv50/ir: enable PostRaConstantFolding for [c0, f0)

2016-01-25 Thread Karol Herbst
From: Karol Herbst <g...@karolherbst.de> helps shaders in multiple games total instructions in shared programs : 1926020 -> 1922267 (-0.19%) total gprs used in shared programs: 251878 -> 251878 (0.00%) total local used in shared programs : 5673 -> 5673 (0.00%) total bytes

[Mesa-dev] [PATCH 4/6] nv50/ir: optimize shl(shr(a, c), c) to and(a, ~((1 << c) - 1))

2016-01-25 Thread Karol Herbst
From: Karol Herbst <g...@karolherbst.de> helps shaders in multiple games total instructions in shared programs : 192 -> 1901958 (-0.48%) total gprs used in shared programs: 251739 -> 251739 (0.00%) total local used in shared programs : 5673 -> 5673 (0.00%) total bytes

[Mesa-dev] [PATCH 3/6] nv50/ir: optimize neg(add(bool, 1)) to bool for OP_SET and OP_SLCT

2016-01-25 Thread Karol Herbst
From: Karol Herbst <g...@karolherbst.de> helps shaders in saints row IV, bioshock infinite and shadow warrior total instructions in shared programs : 1922121 -> 192 (-0.57%) total gprs used in shared programs: 251878 -> 251739 (-0.06%) total local used in shared programs :

[Mesa-dev] [PATCH 0/6] nv50/ir: various compiler optimizations

2016-01-25 Thread Karol Herbst
w IV shadow warrior talos principle unigine heaven/valley wasteland 2 witcher 2 Karol Herbst (6): nv50/ir: enable PostRaConstantFolding for [c0,f0) nv50/ir: swap sources in PostRaConstantFolding when src0 is imm nv50/ir: optimize neg(add(bool, 1)) to bool for OP_SET and OP_SLCT nv50/ir: opti

[Mesa-dev] [PATCH 6/6] nv50/ir: run DCE backwards

2016-01-25 Thread Karol Herbst
reduces Pass rerun by around 40% Signed-off-by: Karol Herbst <nouv...@karolherbst.de> --- src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp b/src/g

[Mesa-dev] [PATCH] nv50/ir: fix minor memory corruption when RA does another round

2016-01-26 Thread Karol Herbst
From: Karol Herbst <g...@karolherbst.de> sometimes an application might crash with a message like this: ERROR: no viable spill candidates left this is due to a memory corruption which only manifests when there is another RA round this fixes this Signed-off-by: Karol Herbst

Re: [Mesa-dev] [PATCH 2/6] nv50/ir: swap sources in PostRaConstantFolding when src0 is imm

2016-01-26 Thread Karol Herbst
> Ilia Mirkin <imir...@alum.mit.edu> wrote on 26 January 2016 at 04:53: > > On Mon, Jan 25, 2016 at 9:57 AM, Karol Herbst <nouv...@karolherbst.de> wrote: > > From: Karol Herbst <g...@karolherbst.de> > > > > helps some shaders in mul

[Mesa-dev] [PATCH 2/2] nv50/ir: optimize sub(a, 0) to a

2016-02-17 Thread Karol Herbst
3 3 hurt 0 0 0 0 Signed-off-by: Karol Herbst <nouv...@karolherbst.de> --- src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp | 3 +++ 1 file changed, 3 insertions(+) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp

[Mesa-dev] [PATCH 0/2] more nouveau optimisations

2016-02-17 Thread Karol Herbst
Karol Herbst (2): nv50/ir: add PostRADCE Pass nv50/ir: optimize sub(a, 0) to a src/gallium/drivers/nouveau/codegen/nv50_ir.h | 2 +- .../drivers/nouveau/codegen/nv50_ir_peephole.cpp | 77 ++ 2 files changed, 52 insertions(+), 27 deletions(-) -- 2.7.1

[Mesa-dev] [PATCH 1/2] nv50/ir: add PostRADCE Pass

2016-02-17 Thread Karol Herbst
ograms : 5569 -> 5569 (0.00%) total bytes used in shared programs : 16513528 -> 16451848 (-0.37%) v2: remove the DCE stuff from NV50PostRaConstantFolding altogether only run this Pass with NV50_PROG_OPTIMIZE >= 1 Signed-off-by: Karol Herbst <nouv...@karolherbst.de> --- src/ga

[Mesa-dev] ARB_shading_language_include

2016-03-12 Thread Karol Herbst
Hi all, the game "Divinity: Original Sin - Enhanced Edition" uses ARB_shading_language_include whenever it detects a non catalyst driver on Linux. Apitraces from the game running on catalyst show that the shaders are simply included within the game engine and replay fine with all mesa drivers as

[Mesa-dev] [PATCH 4/4] nv50: add PostRADualIssue Pass

2016-08-13 Thread Karol Herbst
=640: inst_executed: 1.03G inst_issued1: 614M -> 500M inst_issued2: 213M -> 271M score: 1021 -> 1056 Signed-off-by: Karol Herbst <karolher...@gmail.com> --- .../drivers/nouveau/codegen/nv50_ir_peephole.cpp | 59 ++ 1 file changed, 59 insertions(+) diff --gi

[Mesa-dev] [PATCH 0/4] nvc0: improve dual issueing

2016-08-13 Thread Karol Herbst
the compiler pass aren't as big as with it. Karol Herbst (4): nv50: add target->hasDualIssueing() nvc0/ir: don't dual issue instructions which depend on each other nvc0/ir: dual issue two min/max instructions nv50: add PostRADualIssue Pass src/gallium/drivers/nouveau/codegen/nv50_ir.

[Mesa-dev] [PATCH 3/4] nvc0/ir: dual issue two min/max instructions

2016-08-13 Thread Karol Herbst
> 1030 with dual_issue pass: inst_executed: 1.03G inst_issued1: 535M -> 500M inst_issued2: 254M -> 271M score: 1052 -> 1056 Signed-off-by: Karol Herbst <karolher...@gmail.com> --- .../drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp| 14 -- 1 file changed, 12 in

[Mesa-dev] [PATCH 2/4] nvc0/ir: don't dual issue instructions which depend on each other

2016-08-13 Thread Karol Herbst
no changes without a dual_issue pass changes with for ./GpuTest /test=pixmark_piano /benchmark /no_scorebox /msaa=0 /benchmark_duration_ms=6 /width=1024 /height=640: inst_executed: 1.03G inst_issued1: 538M -> 535M inst_issued2: 251M -> 254M score: 1038 -> 1052 Signed-off-by: Kar
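
The rule in the subject line amounts to a dependency test between the two candidate instructions. The sketch below is a simplified model with hypothetical types and names; it does not reproduce the nvc0 target code and covers only register read-after-write and write-after-write hazards.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical instruction: one destination register, several sources.
struct Insn {
    int def;
    std::vector<int> srcs;
};

// True if `second` reads or overwrites the register written by `first`.
inline bool depends_on(const Insn &first, const Insn &second)
{
    const bool reads_result =
        std::find(second.srcs.begin(), second.srcs.end(), first.def) !=
        second.srcs.end();
    const bool overwrites_result = second.def == first.def;
    return reads_result || overwrites_result;
}

// Only independent neighbours are candidates for pairing in this model.
inline bool may_dual_issue(const Insn &a, const Insn &b)
{
    return !depends_on(a, b);
}
```

A real target would add further constraints, such as which opcode classes can share an issue slot; those are left out here.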

[Mesa-dev] [PATCH 1/4] nv50: add target->hasDualIssueing()

2016-08-13 Thread Karol Herbst
Signed-off-by: Karol Herbst <karolher...@gmail.com> --- src/gallium/drivers/nouveau/codegen/nv50_ir_target.h| 1 + src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp | 7 ++- src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.h | 1 + 3 files changed, 8 inse

Re: [Mesa-dev] [PATCH 2/4] nvc0/ir: don't dual issue instructions which depend on each other

2016-08-13 Thread karol herbst
2016-08-13 18:17 GMT+02:00 Ilia Mirkin <imir...@alum.mit.edu>: > On Sat, Aug 13, 2016 at 6:02 AM, Karol Herbst <karolher...@gmail.com> wrote: >> no changes without a dual_issue pass >> >> changes with for ./GpuTest /test=pixmark_piano /benchmark /no_scorebox >

Re: [Mesa-dev] [PATCH 2/4] nvc0/ir: don't dual issue instructions which depend on each other

2016-08-13 Thread karol herbst
2016-08-13 19:27 GMT+02:00 Ilia Mirkin <imir...@alum.mit.edu>: > On Sat, Aug 13, 2016 at 1:24 PM, karol herbst <karolher...@gmail.com> wrote: >> 2016-08-13 18:17 GMT+02:00 Ilia Mirkin <imir...@alum.mit.edu>: >>> On Sat, Aug 13, 2016 at 6:02 AM, Karol

Re: [Mesa-dev] [PATCH 3/4] nvc0/ir: dual issue two min/max instructions

2016-08-13 Thread karol herbst
2016-08-13 17:43 GMT+02:00 Tobias Klausmann <tobias.johannes.klausm...@mni.thm.de>: > Hi Karol, > > one question inline. > > > On 13.08.2016 12:02, Karol Herbst wrote: >> >> min/max pairs can be dual issued on Kepler1 >> >> changes for ./GpuTest /te

Re: [Mesa-dev] [PATCH 4/4] nv50: add PostRADualIssue Pass

2016-08-13 Thread karol herbst
2016-08-13 21:33 GMT+02:00 Ilia Mirkin : > On Sat, Aug 13, 2016 at 3:26 PM, Connor Abbott wrote: >> So, I don't know much about how nv50 ir works, but to me this just >> seems like a pretty slow implementation of a very limited instruction >> scheduler.

[Mesa-dev] [PATCH] nvc0/ir: joins get a sched of 0x2f

2016-09-09 Thread Karol Herbst
slightly improves performance for GpuTest /test=pixmark_piano /benchmark /no_scorebox /msaa=0 /benchmark_duration_ms=6 /width=1024 /height=640 score: 1031 -> 1033 observed from the binary generated by nvidia Signed-off-by: Karol Herbst <karolher...@gmail.com> --- src/galliu

Re: [Mesa-dev] [PATCH] HUD: Add support for block I/O, network I/O and lmsensor stats

2016-09-12 Thread Karol Herbst
Hey, nice work regarding the lmsensor bits. But I think it makes sense to also wire the power things in, cause we actually expose them within nouveau. Others might want or actually do the same as well. Many thanks 2016-09-12 20:33 GMT+02:00 Steven Toth : > Three new

Re: [Mesa-dev] [PATCH] HUD: Add support for block I/O, network I/O and lmsensor stats

2016-09-12 Thread Karol Herbst
2016-09-12 23:20 GMT+02:00 Steven Toth : >> nice work regarding the lmsensor bits. But I think it makes sense to >> also wire the power things in, cause we actually expose them within >> nouveau. Others might want or actually do the same as well. > > Karol, thank you for your

Re: [Mesa-dev] [PATCH] HUD: Add support for block I/O, network I/O and lmsensor stats

2016-09-13 Thread Karol Herbst
well it won't for your GPU, it is currently Fermi (GF100+) only. I guess I will add support for it later then 2016-09-13 13:14 GMT+02:00 Steven Toth : >>> Ahh, my nouveau card must be too old then. I only get temperature from >>> it. I have a 6yo(?) 8800 GTS. That being

Re: [Mesa-dev] [PATCH] HUD: Add support for block I/O, network I/O and lmsensor stats

2016-09-12 Thread Karol Herbst
2016-09-13 0:15 GMT+02:00 Steven Toth : >> I think you expose Temperature, Voltage and Current. But Nouveau exposes >> Temperature, Voltage, Fan and Power through hwmon. >> >> Read the "power" section here for more info: >>

[Mesa-dev] [PATCH] nv50/ir: optimize sub(a, 0) to a

2016-10-05 Thread Karol Herbst
25837792 -> 25837192 (-0.00%) localgpr inst bytes helped 0 0 33 33 hurt 0 0 0 0 Signed-off-by: Karol Herbst <karolher...@gmail.com> --- src/gallium/drive

Re: [Mesa-dev] [PATCH 1/6] nv50/ir: add LIMM form of mad to gk110

2016-10-08 Thread Karol Herbst
t way you won't break things and mupuf will appreciate. :) > I think you read the patches in the wrong order. The two first patches are the changes in the emitter. > On 10/08/2016 05:43 PM, Karol Herbst wrote: >> >> Signed-off-by: Karol Herbst <karolher...@gmail.c

[Mesa-dev] [PATCH 4/6] nv50/ir: rework postraconstantfolding pass

2016-10-08 Thread Karol Herbst
we might want to add more folding passes here, so make it a bit more generic Signed-off-by: Karol Herbst <karolher...@gmail.com> --- .../drivers/nouveau/codegen/nv50_ir_peephole.cpp | 124 ++--- 1 file changed, 62 insertions(+), 62 deletions(-) diff --git a/src/gallium/d

[Mesa-dev] [PATCH 5/6] nv50/ra: always prefer def == src2 for mad/sad

2016-10-08 Thread Karol Herbst
just little random noise in shader-db will help in the next patch Signed-off-by: Karol Herbst <karolher...@gmail.com> --- src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_

[Mesa-dev] [PATCH 6/6] nv50/ir: implement mad post ra folding for nvc0+

2016-10-08 Thread Karol Herbst
0 0 0 Signed-off-by: Karol Herbst <karolher...@gmail.com> --- .../drivers/nouveau/codegen/nv50_ir_peephole.cpp | 65 -- 1 file changed, 60 insertions(+), 5 deletions(-) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp b/src/ga

[Mesa-dev] [PATCH 0/6] nv50/ir: PostRaConstantFolding improvements

2016-10-08 Thread Karol Herbst
This series reworks the structure of the pass to make it easier to add more optimisations to it. Also implements folding for mad on gf100+ ISAs to reduce instruction count by ~0.37% I can only test it on a gk106 for now. Karol Herbst (6): nv50/ir: add LIMM form of mad to gk110 nv50/ir: add

[Mesa-dev] [PATCH 1/6] nv50/ir: add LIMM form of mad to gk110

2016-10-08 Thread Karol Herbst
Signed-off-by: Karol Herbst <karolher...@gmail.com> --- .../drivers/nouveau/codegen/nv50_ir_emit_gk110.cpp | 49 ++ 1 file changed, 32 insertions(+), 17 deletions(-) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gk110.cpp b/src/gallium/drivers/nouveau/c

[Mesa-dev] [PATCH 3/6] nv50/ir: replace post_ra_dead by Instruction::isDead

2016-10-08 Thread Karol Herbst
Signed-off-by: Karol Herbst <karolher...@gmail.com> --- src/gallium/drivers/nouveau/codegen/nv50_ir.h| 2 +- .../drivers/nouveau/codegen/nv50_ir_peephole.cpp | 20 +++- 2 files changed, 8 insertions(+), 14 deletions(-) diff --git a/src/gallium/drivers/nouveau/c

[Mesa-dev] [PATCH 2/6] nv50/ir: add LIMM form of mad to gm107

2016-10-08 Thread Karol Herbst
Signed-off-by: Karol Herbst <karolher...@gmail.com> --- .../drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp | 32 -- 1 file changed, 23 insertions(+), 9 deletions(-) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp b/src/gallium/drivers/nouveau/c

Re: [Mesa-dev] [PATCH] gf100/ir: limms on gm107 are 19 bit

2016-10-08 Thread Karol Herbst
odeEmitterGM107::emitIMMD and it indeed does some magic there. > > On Sat, Oct 8, 2016 at 3:23 PM, Karol Herbst <karolher...@gmail.com> wrote: >> the emit code uses 19 everywhere, so we should let >> CodeEmitterGM107::longIMMD and TargetNVC0::insnCanLoad check against >>

Re: [Mesa-dev] [PATCH] nv50/ir: optimize ADD(SHL(a, b), c) to SHLADD(a, b, c)

2016-10-08 Thread Karol Herbst
looks great, a few comments below 2016-10-08 21:55 GMT+02:00 Samuel Pitoiset : > total instructions in shared programs :2286901 -> 2284473 (-0.11%) > total gprs used in shared programs:335256 -> 335273 (0.01%) > total local used in shared programs :31968 -> 31968

[Mesa-dev] [PATCH] gf100/ir: limms on gm107 are 19 bit

2016-10-08 Thread Karol Herbst
the emit code uses 19 everywhere, so we should let CodeEmitterGM107::longIMMD and TargetNVC0::insnCanLoad check against this too Signed-off-by: Karol Herbst <karolher...@gmail.com> --- src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp | 6 +++--- src/gallium/drivers/nouveau/c

Re: [Mesa-dev] [PATCH 2/6] nv50/ir: add LIMM form of mad to gm107

2016-10-09 Thread Karol Herbst
2016-10-08 18:12 GMT+02:00 Samuel Pitoiset <samuel.pitoi...@gmail.com>: > Usually we prefix with gm107/ir, gk110/ir, etc... > > More comments below. > > On 10/08/2016 05:43 PM, Karol Herbst wrote: >> >> Signed-off-by: Karol Herbst <karolher...@gmail.com>

Re: [Mesa-dev] [PATCH 3/6] nv50/ir: replace post_ra_dead by Instruction::isDead

2016-10-09 Thread Karol Herbst
2016-10-08 18:39 GMT+02:00 Samuel Pitoiset <samuel.pitoi...@gmail.com>: > > > On 10/08/2016 05:43 PM, Karol Herbst wrote: >> >> Signed-off-by: Karol Herbst <karolher...@gmail.com> >> --- >> src/gallium/drivers/nouveau/codegen/nv50_ir.h

[Mesa-dev] [PATCH v2 1/6] gk110/ir: add LIMM form of mad

2016-10-09 Thread Karol Herbst
v2: renamed commit reordered modifiers add assert(dst == src2) Signed-off-by: Karol Herbst <karolher...@gmail.com> --- .../drivers/nouveau/codegen/nv50_ir_emit_gk110.cpp | 50 ++ 1 file changed, 33 insertions(+), 17 deletions(-) diff --git a/src/gallium/d

[Mesa-dev] [PATCH v2 3/6] nv50/ir: replace post_ra_dead by Instruction::isDead

2016-10-09 Thread Karol Herbst
Signed-off-by: Karol Herbst <karolher...@gmail.com> --- src/gallium/drivers/nouveau/codegen/nv50_ir.h| 2 +- .../drivers/nouveau/codegen/nv50_ir_peephole.cpp | 20 +++- 2 files changed, 8 insertions(+), 14 deletions(-) diff --git a/src/gallium/drivers/nouveau/c

[Mesa-dev] [PATCH v2 0/6] nv50/ir: PostRaConstantFolding improvements

2016-10-09 Thread Karol Herbst
t bytes helped 0 2640934093 hurt 0 20 61 61 Karol Herbst (6): gk110/ir: add LIMM form of mad gm107/ir: add LIMM form of mad nv50/ir: replace post_ra_dead by Instruction::isDead nv50/ir:

[Mesa-dev] [PATCH v2 4/6] nv50/ir: restructure postraconstantfolding pass

2016-10-09 Thread Karol Herbst
we might want to add more folding passes here, so make it a bit more generic v2: leave the comment and reword commit message Signed-off-by: Karol Herbst <karolher...@gmail.com> --- .../drivers/nouveau/codegen/nv50_ir_peephole.cpp | 120 +++-- 1 file changed, 62 inse

[Mesa-dev] [PATCH v2 6/6] nv50/ra: always prefer def == src2 for mad/sad

2016-10-09 Thread Karol Herbst
> 25743616 (-0.12%) localgpr inst bytes helped 0 2617361736 hurt 0 20 78 78 v2: reorder to show the benefit of this patch Signed-off-by: Karol Herbst <karolher...@gmail.com>

[Mesa-dev] [PATCH v2 5/6] nv50/ir: implement mad post ra folding for nvc0+

2016-10-09 Thread Karol Herbst
0 0 0 v2: removed TODO reordered to show changes without RA modification removed stale debugging print() call Signed-off-by: Karol Herbst <karolher...@gmail.com> --- .../drivers/nouveau/codegen/nv50_ir_peephole.cpp | 64 +++--- 1 file changed

[Mesa-dev] [PATCH v2 2/6] gm107/ir: add LIMM form of mad

2016-10-09 Thread Karol Herbst
v2: renamed commit reordered modifiers add assert(dst == src2) Signed-off-by: Karol Herbst <karolher...@gmail.com> --- .../drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp | 35 -- 1 file changed, 26 insertions(+), 9 deletions(-) diff --git a/src/gallium/d

[Mesa-dev] [PATCH] nv50/ir: start LocalCSE with getFirst to merge PHI instructions

2016-10-06 Thread Karol Herbst
lgpr inst bytes helped 0 25 100 100 hurt 0 0 0 0 Signed-off-by: Karol Herbst <karolher...@gmail.com> --- src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp | 2 +- 1 file changed,

[Mesa-dev] [PATCH] nv50/ra: let simplify return an error and handle that

2016-10-03 Thread Karol Herbst
fixes a crash in the case simplify reports an error Signed-off-by: Karol Herbst <karolher...@gmail.com> --- src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp | 12 +++- 1 file changed, 7 insertions(+), 5 deletions(-) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.

Re: [Mesa-dev] Mesa 13.0.0 release plan (Was Re: Mesa 12.1.0 release plan (Was Re: Next Mesa release, anyone?))

2016-09-30 Thread Karol Herbst
2016-09-30 16:57 GMT+02:00 Ian Romanick : > On 09/30/2016 06:23 AM, Brian Paul wrote: >> On 09/30/2016 04:59 AM, Emil Velikov wrote: >>> On 30 September 2016 at 03:31, Timothy Arceri >>> wrote: On Thu, 2016-09-29 at 19:17 -0700, Jason

Re: [Mesa-dev] [PATCH] nv50/ir: optimize ADD(SHL(a, b), c) to SHLADD(a, b, c)

2016-10-09 Thread Karol Herbst
2016-10-09 13:58 GMT+02:00 Samuel Pitoiset <samuel.pitoi...@gmail.com>: > > > On 10/08/2016 10:04 PM, Karol Herbst wrote: >> >> looks great, a few comments below > > > Thanks! > >> >> 2016-10-08 21:55 GMT+02:00 Samuel Pitoiset <samuel.pit

Re: [Mesa-dev] [PATCH v2 2/6] gm107/ir: add LIMM form of mad

2016-10-26 Thread Karol Herbst
2016-10-26 19:20 GMT+02:00 Samuel Pitoiset <samuel.pitoi...@gmail.com>: > > > On 10/09/2016 11:04 AM, Karol Herbst wrote: >> >> v2: renamed commit >> reordered modifiers >> add assert(dst == src2) >> >> Signed-off-by: Karol Herbst <

[Mesa-dev] [PATCH v4 1/4] nv50/ir: restructure and rename postraconstantfolding pass

2016-11-06 Thread Karol Herbst
we might want to add more folding passes here, so make it a bit more generic v2: leave the comment and reword commit message v4: rename it to PostRaLoadPropagation Signed-off-by: Karol Herbst <karolher...@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoi...@gmail.com> ---

[Mesa-dev] [PATCH v4 0/4] nv50/ir: PostRaConstantFolding improvements

2016-11-06 Thread Karol Herbst
This series reworks the structure of the pass to make it easier to add more optimisations to it. I have to rework the RA commit a bit and the post_ra_dead patch should be submitted on its own. v2: swapped the last two commits v3: reworked order v4: dropped last two patches Karol Herbst (4

[Mesa-dev] [PATCH v4 4/4] gm107/ir: add LIMM form of mad

2016-11-06 Thread Karol Herbst
v2: renamed commit reordered modifiers add assert(dst == src2) v3: reordered modifiers again Signed-off-by: Karol Herbst <karolher...@gmail.com> --- .../drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp | 35 -- .../drivers/nouveau/codegen/nv50_ir_peephole.cpp

[Mesa-dev] [PATCH v4 3/4] gk110/ir: add LIMM form of mad

2016-11-06 Thread Karol Herbst
v2: renamed commit reordered modifiers add assert(dst == src2) v3: removed wrong neg mod emission Signed-off-by: Karol Herbst <karolher...@gmail.com> --- .../drivers/nouveau/codegen/nv50_ir_emit_gk110.cpp | 50 ++ .../drivers/nouveau/codegen/nv50_ir_peepho

[Mesa-dev] [PATCH v4 2/4] nv50/ir: implement mad post ra folding for nvc0+

2016-11-06 Thread Karol Herbst
0 0 0 0 v2: removed TODO reordered to show changes without RA modification removed stale debugging print() call v3: remove predicate checks enable only for gf100 ISA Signed-off-by: Karol Herbst <karolher...@gmail.com> --- .../drivers/nouveau/codegen/nv50_i

Re: [Mesa-dev] [PATCH v4 4/4] gm107/ir: add LIMM form of mad

2016-11-06 Thread Karol Herbst
Subject: [PATCH v5] gm107/ir: add LIMM form of mad v2: renamed commit reordered modifiers add assert(dst == src2) v3: reordered modifiers again v5: no rounding bit for limms Signed-off-by: Karol Herbst <karolher...@gmail.com> --- .../drivers/nouveau/codegen/nv50_ir_emit_gm107.cp

Re: [Mesa-dev] [PATCH v4 0/4] nv50/ir: PostRaConstantFolding improvements

2016-11-06 Thread Karol Herbst
no regressions in piglit on my nve6 2016-11-06 15:05 GMT+01:00 Karol Herbst <karolher...@gmail.com>: > This series reworks the structure of the pass to make it easier to add > more optimisations to it. > > I have to rework the RA commit a bit and the post_ra_dead patch sho

Re: [Mesa-dev] [PATCH] glcpp: initializes version to -1

2016-11-08 Thread Karol Herbst
2016-11-08 13:35 GMT+01:00 Juan A. Suarez Romero <jasua...@igalia.com>: > On Sat, 2016-11-05 at 10:48 +0100, Karol Herbst wrote: >> "#version 0512": 0:1(10): error: GLSL 3.30 is not supported. >> Supported >> versions are: 1.10, 1.20, 1.30, 1.00 ES, and 3.00

Re: [Mesa-dev] [PATCH] glcpp: initializes version to -1

2016-11-05 Thread Karol Herbst
2016-11-05 2:50 GMT+01:00 Ian Romanick : > (Sorry about the top post. Sent from my phone.) > > That expression will allow versions like 0130 as valid. If you just want to > allow 0, you need a more complex regular expression. I feel like that's > just a bandage... what

Re: [Mesa-dev] [PATCH] glcpp: initializes version to -1

2016-11-07 Thread Karol Herbst
2016-11-07 10:05 GMT+01:00 Juan A. Suarez Romero <jasua...@igalia.com>: > On Sat, 2016-11-05 at 10:48 +0100, Karol Herbst wrote: >> 2016-11-05 2:50 GMT+01:00 Ian Romanick <i...@freedesktop.org>: >> > (Sorry about the top post. Sent from my phone.) >> > >>

Re: [Mesa-dev] [PATCH] nv50/ir: start LocalCSE with getFirst to merge PHI instructions

2016-10-25 Thread Karol Herbst
01%) >>> total local used in shared programs : 9505 -> 9505 (0.00%) >>> total bytes used in shared programs : 25837192 -> 25833736 (-0.01%) >>> >>> local gpr inst bytes >>> helped 0 25

Re: [Mesa-dev] [PATCH 5/5] nv50/ir: detect when a SLCT is equivalent to a SET

2016-10-21 Thread Karol Herbst
On 21 October 2016 8:30:33 a.m. GMT+02:00, Ilia Mirkin wrote: >Signed-off-by: Ilia Mirkin >--- >.../drivers/nouveau/codegen/nv50_ir_peephole.cpp | 23 >++ > 1 file changed, 19 insertions(+), 4 deletions(-) > >diff --git

Re: [Mesa-dev] [PATCH v3 2/6] nv50/ir: implement mad post ra folding for nvc0+

2016-10-30 Thread Karol Herbst
2016-10-30 23:45 GMT+01:00 Matt Turner <matts...@gmail.com>: > On Sun, Oct 30, 2016 at 2:20 PM, Karol Herbst <karolher...@gmail.com> wrote: >> Signed-off-by: Karol Herbst <karolher...@gmail.com> >> >> fixup >> >> Signed-off-by: Karol Herbst <

[Mesa-dev] [PATCH v3 6/6] nv50/ir: replace post_ra_dead by Instruction::isDead

2016-10-30 Thread Karol Herbst
Signed-off-by: Karol Herbst <karolher...@gmail.com> --- src/gallium/drivers/nouveau/codegen/nv50_ir.h | 2 +- .../drivers/nouveau/codegen/nv50_ir_peephole.cpp | 24 -- 2 files changed, 9 insertions(+), 17 deletions(-) diff --git a/src/gallium/drivers/nouveau/c

[Mesa-dev] [PATCH v3 2/6] nv50/ir: implement mad post ra folding for nvc0+

2016-10-30 Thread Karol Herbst
0 0 0 0 v2: removed TODO reordered to show changes without RA modification removed stale debugging print() call v3: remove predicate checks enable only for gf100 ISA Signed-off-by: Karol Herbst <karolher...@gmail.com> fixup Signed-off-by: Karol Herbst <

[Mesa-dev] [PATCH v3 1/6] nv50/ir: restructure postraconstantfolding pass

2016-10-30 Thread Karol Herbst
we might want to add more folding passes here, so make it a bit more generic v2: leave the comment and reword commit message Signed-off-by: Karol Herbst <karolher...@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoi...@gmail.com> --- .../drivers/nouveau/codegen/nv50_ir_peephole

[Mesa-dev] [PATCH v3 5/6] gm107/ir: add LIMM form of mad

2016-10-30 Thread Karol Herbst
v2: renamed commit reordered modifiers add assert(dst == src2) v3: reordered modifiers again Signed-off-by: Karol Herbst <karolher...@gmail.com> --- .../drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp | 35 -- .../drivers/nouveau/codegen/nv50_ir_peephole.cpp

[Mesa-dev] [PATCH v3 0/6] nv50/ir: PostRaConstantFolding improvements

2016-10-30 Thread Karol Herbst
4591 hurt 0 23 64 64 Karol Herbst (6): nv50/ir: restructure postraconstantfolding pass nv50/ir: implement mad post ra folding for nvc0+ nv50/ra: always prefer def == src2 for mad/sad gk110/ir: add LIMM form of mad gm107/ir: add LIMM form of m

[Mesa-dev] [PATCH v3 3/6] nv50/ra: always prefer def == src2 for mad/sad

2016-10-30 Thread Karol Herbst
localgpr inst bytes helped 0 2619371937 hurt 0 23 81 81 v2: reorder to show the benefit of this patch Signed-off-by: Karol Herbst <karolher...@gmail.com> --- src/gallium/drivers/nouveau/code

[Mesa-dev] [PATCH v3 4/6] gk110/ir: add LIMM form of mad

2016-10-30 Thread Karol Herbst
v2: renamed commit reordered modifiers add assert(dst == src2) v3: removed wrong neg mod emission Signed-off-by: Karol Herbst <karolher...@gmail.com> --- .../drivers/nouveau/codegen/nv50_ir_emit_gk110.cpp | 50 ++ .../drivers/nouveau/codegen/nv50_ir_peepho

Re: [Mesa-dev] [PATCH] glcpp: initializes version to -1

2016-11-04 Thread Karol Herbst
for reference the bug I've created for this: https://bugs.freedesktop.org/show_bug.cgi?id=97420 and thanks for fixing this 2016-11-04 13:22 GMT+01:00 Juan A. Suarez Romero : > Shader can define #version as an integer, including 0. > > Initializes version to -1 to know later
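
The fix described in the quoted patch relies on a sentinel value: if 0 is a legal version number, it cannot also mean "no version seen". A minimal sketch of that idea follows; the struct and function names are hypothetical and are not glcpp's actual data structures.

```cpp
// Hypothetical parser state, for illustration only.
struct ParserState {
    int version = -1;  // -1 means "no #version directive seen yet"
};

// Record the value given by a "#version N" directive.
inline void on_version_directive(ParserState &state, int value)
{
    state.version = value;
}

// With -1 as the sentinel, "#version 0" still counts as an explicit directive.
inline bool version_was_set(const ParserState &state)
{
    return state.version != -1;
}
```

Had the field started at 0, a shader declaring `#version 0` would be indistinguishable from one with no directive at all.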

Re: [Mesa-dev] [PATCH] nv50/ir: optimize ADD(SHL(a, b), c) to SHLADD(a, b, c)

2016-10-09 Thread Karol Herbst
2016-10-09 21:34 GMT+02:00 Ilia Mirkin <imir...@alum.mit.edu>: > On Sun, Oct 9, 2016 at 3:28 PM, Karol Herbst <karolher...@gmail.com> wrote: >> 2016-10-09 13:58 GMT+02:00 Samuel Pitoiset <samuel.pitoi...@gmail.com>: >>> >>> >>> On 10/08/20

Re: [Mesa-dev] [PATCH] drirc: Allow extension midshader for Divinity: Original Sin (EE)

2017-01-07 Thread Karol Herbst
that game still depends on ARB_shading_language_include and it checks for that extension by checking if the function pointers are there. One hacky solution is this: diff --git a/src/glx/glxcmds.c b/src/glx/glxcmds.c index 63f4921..e1ab885 100644 --- a/src/glx/glxcmds.c +++ b/src/glx/glxcmds.c @@

[Mesa-dev] [PATCH 2/2] nvc0/ir: also do ConstantFolding for FMA

2017-03-21 Thread Karol Herbst
ytes used in shared programs : 36123344 -> 36115776 (-0.02%) localgpr inst bytes helped 2 48 243 243 hurt 2 3 32 32 Signed-off-by: Karol Herbst <karolher...@gmail.com> --- s

[Mesa-dev] [PATCH 1/2] nvc0/ir: disable support for LIMMs on MAD/FMA

2017-03-21 Thread Karol Herbst
ams: 481563 -> 481511 (-0.01%) total local used in shared programs : 27469 -> 27469 (0.00%) total bytes used in shared programs : 36139384 -> 36123344 (-0.04%) Signed-off-by: Karol Herbst <karolher...@gmail.com> --- src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp | 4 ++

[Mesa-dev] [PATCH v5 4/5] gm107/ir: add LIMM form of mad

2017-03-26 Thread Karol Herbst
v2: renamed commit reordered modifiers add assert(dst == src2) v3: reordered modifiers again v5: no rounding bit for limms Signed-off-by: Karol Herbst <karolher...@gmail.com> --- .../drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp | 34 -- .../drivers/nouveau/c

[Mesa-dev] [PATCH v5 2/5] nv50/ir: implement mad post ra folding for nvc0+

2017-03-26 Thread Karol Herbst
0 0 0 v2: removed TODO, reordered to show changes without RA modification, removed stale debugging print() call v3: remove predicate checks, enable only for gf100 ISA Signed-off-by: Karol Herbst <karolher...@gmail.com> --- .../drivers/nouveau/codegen/nv50_i

[Mesa-dev] [PATCH v5 0/5] nvc0/ir: add support for MAD/FMA PostRALoadPropagation

2017-03-26 Thread Karol Herbst
was "nv50/ir: PostRaConstantFolding improvements" before. nothing really changed from the last version, just minor things. Karol Herbst (5): nv50/ir: restructure and rename postraconstantfolding pass nv50/ir: implement mad post ra folding for nvc0+ gk110/ir: add LIMM form of mad

[Mesa-dev] [PATCH v5 1/5] nv50/ir: restructure and rename postraconstantfolding pass

2017-03-26 Thread Karol Herbst
we might want to add more folding passes here, so make it a bit more generic v2: leave the comment and reword commit message v4: rename it to PostRaLoadPropagation Signed-off-by: Karol Herbst <karolher...@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoi...@gmail.com> ---

[Mesa-dev] [PATCH v5 5/5] nv50/ir: also do PostRaLoadPropagation for FMA

2017-03-26 Thread Karol Herbst
in shared programs : 36061888 -> 36056504 (-0.01%)

            local        gpr       inst      bytes
helped          0          0        228        228
  hurt          0          0          0          0

Signed-off-by: Karol Herbst <karolher...@gmail.com> --- src/galli

[Mesa-dev] [PATCH v5 3/5] gk110/ir: add LIMM form of mad

2017-03-26 Thread Karol Herbst
v2: renamed commit, reordered modifiers, add assert(dst == src2) v3: removed wrong neg mod emission Signed-off-by: Karol Herbst <karolher...@gmail.com> --- .../drivers/nouveau/codegen/nv50_ir_emit_gk110.cpp | 50 ++ .../drivers/nouveau/codegen/nv50_ir_peepho

[Mesa-dev] [PATCH] nvc0/target: treat FMA like MAD

2017-03-18 Thread Karol Herbst
35749888 -> 35214176 (-1.50%) localgpr inst bytes helped 17182940914091 hurt 4 44 3 3 Signed-off-by: Karol Herbst <karolher...@gmail.com> --- src/gallium/drive

[Mesa-dev] [PATCH v2 2/3] nv50/ir: handle logops with NOT in AlgebraicOpt

2017-04-03 Thread Karol Herbst
Signed-off-by: Karol Herbst <karolher...@gmail.com> --- src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp | 6 ++ 1 file changed, 6 insertions(+) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp

[Mesa-dev] [PATCH v2 0/3] nv50/ir: Prepare for running Opts inside a loop

2017-04-03 Thread Karol Herbst
We are slowly getting to the point where we miss enough optimization opportunities that arise as a result of our own passes. To address this, we need to fix AlgebraicOpt so that it can handle mods on sources without creating new issues. The last patch enables looping opts. v2: update commit author Karol Herbst
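
The idea of looping optimizations can be illustrated with a small, self-contained sketch. It is not the code from the series: the toy `Program` type, the `mergeAdjacentOnce` pass and the `maxRounds` cap are all hypothetical, chosen only to show how one pass can expose new work for the next run and why a round limit is a sensible guard.

```cpp
#include <cstddef>
#include <vector>

// Toy "program": a list of integers standing in for instructions.
using Program = std::vector<int>;

// Hypothetical pass: merges the first pair of equal neighbours into one
// doubled value. Returns true if it changed the program.
bool mergeAdjacentOnce(Program &prog)
{
   for (std::size_t i = 0; i + 1 < prog.size(); ++i) {
      if (prog[i] == prog[i + 1]) {
         prog[i] *= 2;
         prog.erase(prog.begin() + i + 1);
         return true;
      }
   }
   return false;
}

// Re-run the pass until it reports no progress or maxRounds is reached.
// Returns the number of rounds executed.
int runToFixedPoint(Program &prog, int maxRounds)
{
   int rounds = 0;
   bool progress = true;
   while (progress && rounds < maxRounds) {
      progress = mergeAdjacentOnce(prog);
      ++rounds;
   }
   return rounds;
}
```

With the toy input {1, 1, 2, 4}, a single run of the pass leaves {2, 2, 4}, which still contains a mergeable pair; looping continues until the program is {8}. The cap bounds compile time if a pass keeps reporting progress.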

[Mesa-dev] [PATCH v2 3/3] nv50/ir: run some passes multiple times

2017-04-03 Thread Karol Herbst
Signed-off-by: Karol Herbst <karolher...@gmail.com> --- .../drivers/nouveau/codegen/nv50_ir_peephole.cpp| 17 +++-- 1 file changed, 11 insertions(+), 6 deletions(-) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp b/src/gallium/drivers/nouveau/codegen/nv

[Mesa-dev] [PATCH v2 1/3] nv50/ir: fix AlgebraicOpt for slcts with mods

2017-04-03 Thread Karol Herbst
Signed-off-by: Karol Herbst <karolher...@gmail.com> --- src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp b/src/gallium/drivers/nouveau/c

[Mesa-dev] [PATCH 3/3] nv50/ir: run some passes multiple times

2017-04-03 Thread Karol Herbst
From: Karol Herbst <nouv...@karolherbst.de> With the shader cache, compilation time matters less. As a side effect we can write more optimizations to produce better optimized code. total instructions in shared programs : 3931743 -> 3917512 (-0.36%) total gprs used in shared programs
