Re: [Mesa-dev] [PATCH 1/2] R600: handle loops to self in the structurizer v2

2013-01-22 Thread Aaron Watry
On 22.01.2013 13:08, Michel Dänzer wrote: On Mon, 2013-01-21 at 22:28 +0100, Christian König wrote: v2: don't mess up other loops Signed-off-by: Christian König deathsimple at vodafone.de Series is Tested-by: Michel Dänzer michel.daenzer at amd.com No piglit regressions with

Re: [Mesa-dev] [PATCH 3/3] R600: Add support for SET*_DX10 instructions

2013-01-31 Thread Aaron Watry
From: Tom Stellard thomas.stellard at amd.com These instructions compare two floating point values and return an integer true (-1) or false (0) value. When compiling code generated by the Mesa GLSL frontend, the SET*_DX10 instructions save us four instructions for most branch decisions that use
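
The return convention described in this message (integer -1 for true, 0 for false) can be sketched in plain C. This is only an illustration, not code from the patch: the names `setgt_dx10` and `select_bits` are hypothetical, and the bitwise select is just one assumed way such a mask might be consumed.

```c
#include <stdint.h>

/* Hypothetical helper (not from the patch): models a DX10-style
 * "set if greater than" compare. It returns an integer mask,
 * -1 (all bits set) for true and 0 for false. */
static int32_t setgt_dx10(float a, float b)
{
    return (a > b) ? -1 : 0;
}

/* Hypothetical consumer: an all-ones or all-zeros mask can drive
 * a bitwise select without any further conversion. */
static uint32_t select_bits(int32_t mask, uint32_t if_true, uint32_t if_false)
{
    uint32_t m = (uint32_t)mask;
    return (if_true & m) | (if_false & ~m);
}
```

For example, `select_bits(setgt_dx10(x, y), a, b)` picks `a` when `x > y` and `b` otherwise.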

Re: [Mesa-dev] [PATCH (9.1)] Revert r600g: re-enable handling of DISCARD_RANGE, improving performance

2013-02-20 Thread Aaron Watry
I've managed to capture a trace that loads TF2 to the menu and reproduces some of the flickering. I haven't managed to capture any gameplay yet due to an error in CD Key authentication due to how I'm launching the game. URL: http://www.watrys.net/tf2_menu.trace.xz --Aaron On Wed, Feb 20, 2013

Re: [Mesa-dev] [PATCH] clover: Fix build with LLVM 3.3

2013-02-21 Thread Aaron Watry
Hi Tom, Mesa+Clover does indeed build against master llvm/clang, but I'm having trouble building against it when I try to do a clean build of Piglit. Error received: [ 18%] Built target piglitutil_cl Linking C executable ../../../../../bin/cl-custom-run-simple-kernel

Re: [Mesa-dev] [PATCH] clover: Fix build with LLVM 3.3

2013-02-21 Thread Aaron Watry
On Thu, Feb 21, 2013 at 8:33 AM, Tom Stellard t...@stellard.net wrote: On Thu, Feb 21, 2013 at 08:25:20AM -0600, Aaron Watry wrote: Hi Tom, Mesa+Clover does indeed build against master llvm/clang, but I'm having trouble building against it when I try to do a clean build of Piglit

Re: [Mesa-dev] [PATCH] clover: Fix build with LLVM 3.3

2013-02-21 Thread Aaron Watry
On Thu, Feb 21, 2013 at 10:06 AM, Tom Stellard t...@stellard.net wrote: On Thu, Feb 21, 2013 at 10:02:34AM -0600, Aaron Watry wrote: On Thu, Feb 21, 2013 at 8:33 AM, Tom Stellard t...@stellard.net wrote: On Thu, Feb 21, 2013 at 08:25:20AM -0600, Aaron Watry wrote: Hi Tom, Mesa

Re: [Mesa-dev] [PATCH] clover: Fix build with LLVM 3.3

2013-02-22 Thread Aaron Watry
On Fri, Feb 22, 2013 at 12:21 PM, Tom Stellard t...@stellard.net wrote: On Thu, Feb 21, 2013 at 08:25:20AM -0600, Aaron Watry wrote: Hi Tom, Mesa+Clover does indeed build against master llvm/clang, but I'm having trouble building against it when I try to do a clean build of Piglit

Re: [Mesa-dev] [PATCH 0/9] remove mfeatures.h file

2013-02-26 Thread Aaron Watry
Same error here. Configuration: ./autogen.sh --enable-texture-float --enable-opencl --with-gallium-drivers=r600 --with-dri-drivers=radeon --prefix=/usr/local --Aaron On Tue, Feb 26, 2013 at 11:09 AM, Jordan Justen jljus...@gmail.com wrote: On Sat, Feb 23, 2013 at 7:29 AM, Brian Paul

[Mesa-dev] [PATCH] libclc: Fix libclc build for LLVM 3.3

2013-03-08 Thread Aaron Watry
LLVM moved a bunch of IR-related headers for version 3.3. This fixes the libclc build to follow suit. --- utils/prepare-builtins.cpp | 12 1 file changed, 12 insertions(+) diff --git a/utils/prepare-builtins.cpp b/utils/prepare-builtins.cpp index ae7731b..0141484 100644 ---

Re: [Mesa-dev] [PATCH] [libclc] configure: Enable building separate libraries for target variants

2013-03-13 Thread Aaron Watry
The python changes in this file look good to me. I haven't done a line-by-line review of the SI changes. I tested this patch and v2 of the related mesa series on r600g (radeon 6850) with a recent LLVM and fresh mesa master as of this evening. No real change in the piglit CL test success/failure

[Mesa-dev] [PATCH] libclc: Add max() builtin function

2013-03-14 Thread Aaron Watry
Adds this function for both int and floating data types. --- generic/include/clc/clc.h |2 ++ generic/include/clc/integer/max.h |2 ++ generic/include/clc/integer/max.inc |1 + generic/include/clc/math/max.h |2 ++ generic/include/clc/math/max.inc|1 +

[Mesa-dev] [PATCH 1/3] libclc: Fix abs_diff builtin integer function

2013-03-14 Thread Aaron Watry
--- generic/lib/SOURCES |1 + generic/lib/integer/abs_diff.inc |2 +- 2 files changed, 2 insertions(+), 1 deletion(-) diff --git a/generic/lib/SOURCES b/generic/lib/SOURCES index b593941..a97213b 100644 --- a/generic/lib/SOURCES +++ b/generic/lib/SOURCES @@ -4,6 +4,7 @@

[Mesa-dev] libclc: Improve libclc handling of built-in functions

2013-03-14 Thread Aaron Watry
This series depends on the one-off patch I just sent to add max(). 1) Fix the broken abs_diff integer built-in. 2) Add clamp for both integer and floating types in a new shared/ dir in order to reduce code duplication and improve maintainability. 3) Move the max() function into the shared/

[Mesa-dev] [PATCH 2/3] libclc: Add clamp() builtin for integer/floating point

2013-03-14 Thread Aaron Watry
Created under a new shared/ directory for functions which are available for both integer and floating point types. --- generic/include/clc/clc.h|3 +++ generic/include/clc/shared/clamp.h |5 + generic/include/clc/shared/clamp.inc |1 + generic/lib/SOURCES
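
For reference, the OpenCL specification defines clamp(x, minval, maxval) as min(max(x, minval), maxval). A minimal scalar sketch of that definition in plain C follows. The function names are hypothetical, and whether the patch itself is written this way is not visible in the snippet above.

```c
/* Hypothetical scalar helpers illustrating the spec definition
 * clamp(x, lo, hi) = min(max(x, lo), hi). Not code from the patch. */
static int clamp_int(int x, int lo, int hi)
{
    int t = (x > lo) ? x : lo;   /* max(x, lo) */
    return (t < hi) ? t : hi;    /* min(t, hi) */
}

static float clamp_float(float x, float lo, float hi)
{
    float t = (x > lo) ? x : lo; /* max(x, lo) */
    return (t < hi) ? t : hi;    /* min(t, hi) */
}
```

The same two-step shape applies to every integer and floating-point type, which is presumably why a shared location reduces duplication.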

[Mesa-dev] [PATCH 3/3] libclc: Move max builtin to shared/

2013-03-14 Thread Aaron Watry
Max(x,y) is available for all integer/floating types. --- generic/include/clc/clc.h |3 +-- generic/include/clc/integer/max.h |2 -- generic/include/clc/integer/max.inc |1 - generic/include/clc/math/max.h |2 -- generic/include/clc/math/max.inc|1 -

Re: [Mesa-dev] libclc: Improve libclc handling of built-in functions

2013-03-14 Thread Aaron Watry
PM, Aaron Watry awa...@gmail.com wrote: This series depends on the one-off patch I just sent to add max(). 1) Fix the broken abs_diff integer built-in. 2) Add clamp for both integer and floating types in a new shared/ dir in order to reduce code duplication and improve maintainability. 3

[Mesa-dev] [PATCH] libclc: implement rotate builtin

2013-03-23 Thread Aaron Watry
This implementation does a lot of bit shifting and masking. Suffice it to say, this is somewhat suboptimal... but it does look to produce correct results (after the piglit tests were corrected for sign extension issues). Someone who knows LLVM better than I could re-write this more efficiently. ---
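
As a rough illustration of the shift-and-mask approach mentioned above, here is a 32-bit left rotate in plain C (OpenCL's rotate() shifts left and feeds the bits shifted off the left side back in from the right). The name `rotl32` is hypothetical and this is not the code from the patch.

```c
#include <stdint.h>

/* Hypothetical sketch of a 32-bit left rotate built from shifts and
 * masks. The rotate count is reduced modulo 32 so that no shift is
 * ever performed by the full width of the type. */
static uint32_t rotl32(uint32_t x, uint32_t n)
{
    n &= 31u;                                 /* count modulo 32 */
    return (x << n) | (x >> ((32u - n) & 31u));
}
```

Masking the right-shift amount as well keeps the n == 0 case well defined, since both shifts then become shifts by zero.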

[Mesa-dev] (no subject)

2013-04-13 Thread Aaron Watry
Implements the min() OpenCL built-in in 2 stages. 1) Implement min() where the two argument types match 2) Make changes to support min(vec,scalar) ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org

[Mesa-dev] [PATCH 1/2] libclc: implement initial version of min()

2013-04-13 Thread Aaron Watry
This doesn't handle the integer cases for min(vector, scalar). --- generic/include/clc/clc.h |1 + generic/include/clc/shared/min.h |5 + generic/include/clc/shared/min.inc |1 + generic/lib/SOURCES|1 + generic/lib/shared/min.cl | 11

[Mesa-dev] [PATCH 2/2] libclc: Implement the min(vec, scalar) version of the min builtin.

2013-04-13 Thread Aaron Watry
Checks if the current GENTYPE is scalar, and if not, then defines a separate implementation of the function which casts the second arg to vector before proceeding. --- generic/include/clc/integer/gentype.inc | 23 +++ generic/include/clc/math/gentype.inc|8
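
The idea described above (convert the scalar second argument to a vector, then reuse the vector/vector version) can be sketched in plain C with a small struct standing in for an OpenCL int4. All names here (`int4_vec`, `splat4`, `min4`, `min4_scalar`) are hypothetical; the actual patch works through the libclc GENTYPE macros rather than hand-written types.

```c
/* Hypothetical stand-in for an OpenCL int4. */
typedef struct { int s[4]; } int4_vec;

/* Broadcast a scalar to every component (what a cast from scalar
 * to vector does in OpenCL C). */
static int4_vec splat4(int v)
{
    int4_vec r = { { v, v, v, v } };
    return r;
}

/* Component-wise min of two vectors. */
static int4_vec min4(int4_vec a, int4_vec b)
{
    int4_vec r;
    for (int i = 0; i < 4; ++i)
        r.s[i] = (b.s[i] < a.s[i]) ? b.s[i] : a.s[i];
    return r;
}

/* min(vector, scalar): widen the scalar, then defer to min4. */
static int4_vec min4_scalar(int4_vec a, int b)
{
    return min4(a, splat4(b));
}
```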

[Mesa-dev] [PATCH] libclc: Add clamp(vec, scalar, scalar) and max(vec, scalar)

2013-04-13 Thread Aaron Watry
For any GENTYPE that isn't scalar, we need to implement a mixed vector/scalar version of clamp/max. This depends on the min() patches I sent to the list a few minutes ago. --- generic/include/clc/shared/clamp.inc |4 generic/include/clc/shared/max.inc |4

[Mesa-dev] [PATCH] libclc: Rename [add|sub]_sat.ll to [add|sub]_sat_if.ll

2013-04-15 Thread Aaron Watry
configure.py allows overloading *.cl with *.ll, but will only ever build the first file listed in SOURCES of ${file}.cl and ${file}.ll add_sat, sub_sat, (and the soon to be submitted clz) all define interfaces in ${function_name}.ll which are implemented in ${function_name}_impl.ll. Renaming the

Re: [Mesa-dev] [PATCH 1/6] configure.ac: Remove unused HAVE_PIPE_LOADER_XLIB macro.

2013-04-26 Thread Aaron Watry
For the series: Tested-by: Aaron Watry awa...@gmail.com Config: ./configure --with-dri-drivers=radeon --with-gallium-drivers=r600 --enable-texture-float --enable-opencl --enable-gles1 --enable-gles2 --enable-xvmc --enable-vdpau --enable-r600-llvm-compiler --with-egl-platforms=x11,drm --enable-glx

Re: [Mesa-dev] R600 Patchset: Optimizations for bfgminer

2013-04-29 Thread Aaron Watry
Hi Tom, I'm not too qualified to review the llvm code changes, but the changes looked sane. I did want to point out a few piglit changes/regressions as a result of this set of patches. For my HD6850, running latest llvm from git: gegl-rgb-gamma-u8-to-ragabaf: pass -> fail v3i32-stack: pass -> fail

Re: [Mesa-dev] RFC: tgsi opcodes for 32x32 muls with 64bit results

2013-05-03 Thread Aaron Watry
Not sure if this helps much, but... With gentype being one of: char, uchar, short, ushort, int, uint, long, ulong, and the widths being scalar, 2, 3, 4, 8, or 16 components wide. From the OpenCL 1.1 spec: gentype mad_hi(gentype a, gentype b): Computes x * y and returns the high half of the
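
The operation quoted above (compute x * y and return the high half of the product) is what the OpenCL specification documents for mul_hi. For 32-bit unsigned operands it can be sketched in plain C by widening to 64 bits. The name `mul_hi_u32` is hypothetical, and this says nothing about how a TGSI opcode would expose the result.

```c
#include <stdint.h>

/* Hypothetical sketch: high 32 bits of the full 64-bit product of
 * two unsigned 32-bit values. */
static uint32_t mul_hi_u32(uint32_t x, uint32_t y)
{
    uint64_t wide = (uint64_t)x * (uint64_t)y; /* full 64-bit product */
    return (uint32_t)(wide >> 32);             /* keep the upper half */
}
```

A signed variant would widen to `int64_t` instead; it is left out here because right-shifting a negative value is implementation-defined in C.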

Re: [Mesa-dev] [PATCH] r600g: don't emit surface_sync after FLUSH_AND_INV_EVENT

2013-05-03 Thread Aaron Watry
I know it's been pushed already, but this also fixes some lockups that I was seeing on Barts (HD6850) when running piglit's OpenCL tests. Thanks for fixing this. --Aaron On Fri, May 3, 2013 at 9:47 AM, Marek Olšák mar...@gmail.com wrote: Reviewed-by: Marek Olšák mar...@gmail.com Marek On

Re: [Mesa-dev] R600 Patchset: Emit true ISA

2013-05-04 Thread Aaron Watry
This series, and the associated mesa changes are all: Tested-By: Aaron Watry awa...@gmail.com --Aaron On Fri, May 3, 2013 at 5:53 PM, Tom Stellard t...@stellard.net wrote: Hi, The attached patches modify the CodeEmitter to emit true ISA. Previously, we were prefixing all instructions

[Mesa-dev] R600: Expand vselect and SRA for v2i32 and v4i32

2013-05-06 Thread Aaron Watry
These two patches fix a number of piglit OpenCL test failures on my HD6850 (Barts). There are no piglit CL test regressions and the llvm make check runs without any unexpected failures.

[Mesa-dev] [PATCH 1/2] R600: Expand vselect for v4i32 and v2i32

2013-05-06 Thread Aaron Watry
Signed-off-by: Aaron Watry awa...@gmail.com --- lib/Target/R600/R600ISelLowering.cpp |3 +++ 1 file changed, 3 insertions(+) diff --git a/lib/Target/R600/R600ISelLowering.cpp b/lib/Target/R600/R600ISelLowering.cpp index c6e2136..6dec4d1 100644 --- a/lib/Target/R600/R600ISelLowering.cpp

[Mesa-dev] [PATCH 2/2] R600: Expand SRA for v4i32/v2i32

2013-05-06 Thread Aaron Watry
Signed-off-by: Aaron Watry awa...@gmail.com --- lib/Target/R600/R600ISelLowering.cpp |2 ++ 1 file changed, 2 insertions(+) diff --git a/lib/Target/R600/R600ISelLowering.cpp b/lib/Target/R600/R600ISelLowering.cpp index 6dec4d1..ac56ed8 100644 --- a/lib/Target/R600/R600ISelLowering.cpp +++ b

[Mesa-dev] R600: Expand vselect and SRA for v2i32 and v4i32 (v2)

2013-05-08 Thread Aaron Watry
These two patches fix a number of piglit OpenCL test failures on my HD6850 (Barts). There are no piglit CL test regressions and the llvm make check runs without any unexpected failures. v2: Add tests for v4i32 data type.

[Mesa-dev] [PATCH 1/2] R600: Expand vselect for v4i32 and v2i32

2013-05-08 Thread Aaron Watry
Signed-off-by: Aaron Watry awa...@gmail.com v2: Add vselect v4i32 test --- lib/Target/R600/R600ISelLowering.cpp |3 +++ test/CodeGen/R600/vselect.ll | 17 + 2 files changed, 20 insertions(+) create mode 100644 test/CodeGen/R600/vselect.ll diff --git a/lib/Target

[Mesa-dev] [PATCH 2/2] R600: Expand SRA for v4i32/v2i32

2013-05-08 Thread Aaron Watry
Signed-off-by: Aaron Watry awa...@gmail.com v2: Add v4i32 test --- lib/Target/R600/R600ISelLowering.cpp |2 ++ test/CodeGen/R600/sra.ll | 13 + 2 files changed, 15 insertions(+) create mode 100644 test/CodeGen/R600/sra.ll diff --git a/lib/Target/R600

[Mesa-dev] [PATCH] R600: Expand MUL for v4i32/v2i32

2013-05-08 Thread Aaron Watry
Fixes piglit test for OpenCL builtin mul24, and allows mad24 to run. Signed-off-by: Aaron Watry awa...@gmail.com --- lib/Target/R600/R600ISelLowering.cpp |2 ++ test/CodeGen/R600/mul.ll | 16 2 files changed, 18 insertions(+) create mode 100644 test/CodeGen

[Mesa-dev] [PATCH] R600: Expand SUB for v2i32/v4i32

2013-05-08 Thread Aaron Watry
Signed-off-by: Aaron Watry awa...@gmail.com --- lib/Target/R600/R600ISelLowering.cpp |2 ++ test/CodeGen/R600/sub.ll | 15 +++ 2 files changed, 17 insertions(+) create mode 100644 test/CodeGen/R600/sub.ll diff --git a/lib/Target/R600/R600ISelLowering.cpp b/lib

Re: [Mesa-dev] [PATCH] scons: Use LLVM shared library if found.

2013-05-17 Thread Aaron Watry
On Fri, May 17, 2013 at 2:31 PM, Jose Fonseca jfons...@vmware.com wrote: - Original Message - On Fri, May 17, 2013 at 7:44 AM, Jose Fonseca jfons...@vmware.com wrote: Vinson, Why is this necessary? (I'd prefer that LLVM is statically linked by default. ) Jose The SCons

Re: [Mesa-dev] [PATCH libclc] Add bitselect builtin

2013-05-23 Thread Aaron Watry
Reviewed-by: Aaron Watry awa...@gmail.com Please also send the attached test patch (or an expanded version of it) to the piglit list. On Thu, May 23, 2013 at 12:48 PM, Tom Stellard t...@stellard.net wrote: From: Tom Stellard thomas.stell...@amd.com --- generic/include/clc/clc.h

[Mesa-dev] libclc: vload/vstore initial implementation

2013-05-23 Thread Aaron Watry
I've implemented the OpenCL vload/vstore builtin functions in two parts. 1) Pure CL C implementation. No Assembly 2) Add assembly optimizations for 32-bit int/uint loads/stores of 4+ component vectors Note: The vstore implementation assumes that the hardware back end supports byte-addressable

[Mesa-dev] [PATCH 1/4] libclc: Initial vload implementation

2013-05-23 Thread Aaron Watry
Should work for all targets and data types. Completely unoptimized. --- generic/include/clc/clc.h | 1 + generic/include/clc/shared/vload.h | 37 ++ generic/lib/SOURCES| 1 + generic/lib/shared/vload.cl| 47
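
For context, the OpenCL specification describes vloadn(offset, p) as reading n consecutive elements starting at p + offset * n. A completely unoptimized element-by-element version for a 4-wide int vector might look like the plain C sketch below. The struct and function names are hypothetical and this is not the code from vload.cl.

```c
#include <stddef.h>

/* Hypothetical stand-in for an OpenCL int4. */
typedef struct { int s[4]; } int4_load;

/* Unoptimized sketch of vload4 for int: copy four consecutive
 * elements starting at p + offset * 4, one element at a time. */
static int4_load vload4_int(size_t offset, const int *p)
{
    int4_load v;
    for (size_t i = 0; i < 4; ++i)
        v.s[i] = p[offset * 4 + i];
    return v;
}
```

Note that the offset counts whole vectors, not elements, so `vload4_int(1, buf)` starts at `buf[4]`.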

[Mesa-dev] [PATCH 2/4] libclc: Initial vstore implementation

2013-05-23 Thread Aaron Watry
Assumes that the target supports byte-addressable stores. Completely unoptimized. --- generic/include/clc/clc.h | 1 + generic/include/clc/shared/vstore.h | 36 generic/lib/SOURCES | 1 + generic/lib/shared/vstore.cl| 56

[Mesa-dev] [PATCH 3/4] libclc: Add assembly versions of vload for global int4/8/16

2013-05-23 Thread Aaron Watry
The assembly should be generic, but at least currently R600 only supports 32-bit loads of int1/4, and I believe that only global is well-supported. R600 lowers the 8/16 component vectors to multiple 4-component loads. The unoptimized C versions of the other stuff are left in place. ---

[Mesa-dev] [PATCH 4/4] libclc: Add assembly versions of vstore for global [u]int4/8/16

2013-05-23 Thread Aaron Watry
The assembly should be generic, but at least currently R600 only supports 32-bit stores of [u]int1/4, and I believe that only global is well-supported. R600 lowers the 8/16 component stores to multiple 4-component stores. The unoptimized C versions of the other stuff are left in place. ---

Re: [Mesa-dev] [PATCH] clover: Don't segfault when compiling a program with no kernel

2013-06-06 Thread Aaron Watry
Looks good to me. Is there a piglit test for this? --Aaron On Wed, Jun 5, 2013 at 7:12 PM, Tom Stellard t...@stellard.net wrote: From: Tom Stellard thomas.stell...@amd.com --- src/gallium/state_trackers/clover/llvm/invocation.cpp | 7 +++ 1 file changed, 7 insertions(+) diff --git

Re: [Mesa-dev] [PATCH libclc] Implement barrier() builtin

2013-06-13 Thread Aaron Watry
is there and functioning. For the libclc change: Reviewed-by: Aaron Watry awa...@gmail.com On Wed, Jun 12, 2013 at 7:31 PM, Tom Stellard t...@stellard.net wrote: From: Tom Stellard thomas.stell...@amd.com --- r600/lib/SOURCES | 2 ++ r600/lib/synchronization/barrier.cl

Re: [Mesa-dev] [PATCH 1/2] r600g/compute: Move compute_shader_create() function into evergreen_compute.c

2013-06-13 Thread Aaron Watry
On Wed, Jun 12, 2013 at 7:34 PM, Tom Stellard t...@stellard.net wrote: From: Tom Stellard thomas.stell...@amd.com --- src/gallium/drivers/r600/evergreen_compute.c | 23 +++- src/gallium/drivers/r600/r600_shader.c | 32 2 files changed, 22

Re: [Mesa-dev] [PATCH 2/2] r600g/compute: Accept LDS size from the LLVM backend

2013-06-13 Thread Aaron Watry
-by: Aaron Watry awa...@gmail.com On Wed, Jun 12, 2013 at 7:34 PM, Tom Stellard t...@stellard.net wrote: From: Tom Stellard thomas.stell...@amd.com And allocate the correct amount before dispatching the kernel. --- src/gallium/drivers/r600/evergreen_compute.c | 53

[Mesa-dev] [PATCH] R600: Add SI load support for v[24]i32 and store for v2i32

2013-06-14 Thread Aaron Watry
Also add a separate vector lit test file, since r600 doesn't seem to handle v2i32 load/store yet, but we can test both for SI. Signed-off-by: Aaron Watry awa...@gmail.com --- lib/Target/R600/SIInstructions.td | 5 + test/CodeGen/R600/load.vec.ll | 19 +++ 2 files changed

[Mesa-dev] R600: Various fixes for R600 and SI

2013-06-17 Thread Aaron Watry
First patch fixes load/store for v2i32 on R600. Without this, the other two will cause make check failures. I've verified the changes using a Radeon 5400 (Cedar). Note that the previous custom lowering of v2i32 store was causing silent data corruption. The other two patches expand add/sub on SI

[Mesa-dev] [PATCH 1/3] R600: Expand v2i32 load/store instead of custom lowering

2013-06-17 Thread Aaron Watry
The custom lowering causes llc to crash with a segfault. Ideally, the custom lowering can be fixed, but this allows programs which load/store v2i32 to work without crashing. Signed-off-by: Aaron Watry awa...@gmail.com --- lib/Target/R600/R600ISelLowering.cpp | 4 ++--

[Mesa-dev] [PATCH 2/3] R600/SI: Expand add for v2i32 and v4i32

2013-06-17 Thread Aaron Watry
Also add SI tests to existing file and a v2i32 test for both R600 and SI. Signed-off-by: Aaron Watry awa...@gmail.com --- lib/Target/R600/SIISelLowering.cpp | 2 ++ test/CodeGen/R600/add.ll | 37 +++-- 2 files changed, 33 insertions(+), 6 deletions

Re: [Mesa-dev] [PATCH] R600/SI: Add support for v4i32 and v4f32 kernel args

2013-06-18 Thread Aaron Watry
Tested on Pitcairn by: Aaron Watry awa...@gmail.com Follow-up question: Would it be as easy as it looks to add v2i32 right away? On Tue, Jun 18, 2013 at 6:21 PM, Tom Stellard t...@stellard.net wrote: From: Tom Stellard thomas.stell...@amd.com --- lib/Target/R600/AMDGPUCallingConv.td| 9

[Mesa-dev] R600: Expand integer operations for SI and consolidate code with EG

2013-06-20 Thread Aaron Watry
) calls that appear in both R600ISelLowering.cpp and SIISelLowering.cpp are all moved to AMDGPUISelLowering.cpp. If we decide to implement these ops through native instructions for either target in the future, we can override that in the individual targets. Signed-off-by: Aaron Watry awa...@gmail.com

[Mesa-dev] [PATCH 01/12] R600/SI: Expand and of v2i32/v4i32 for SI

2013-06-20 Thread Aaron Watry
Also add lit test for both cases on SI, and v2i32 for evergreen. Signed-off-by: Aaron Watry awa...@gmail.com --- lib/Target/R600/SIISelLowering.cpp | 3 +++ test/CodeGen/R600/and.ll | 37 +++-- 2 files changed, 34 insertions(+), 6 deletions(-) diff

[Mesa-dev] [PATCH 02/12] R600/SI: Expand mul of v2i32/v4i32 for SI

2013-06-20 Thread Aaron Watry
Also add lit test for both cases on SI, and v2i32 for evergreen. Signed-off-by: Aaron Watry awa...@gmail.com --- lib/Target/R600/SIISelLowering.cpp | 3 +++ test/CodeGen/R600/mul.ll | 38 -- 2 files changed, 35 insertions(+), 6 deletions(-) diff

[Mesa-dev] [PATCH 03/12] R600/SI: Expand or of v2i32/v4i32 for SI

2013-06-20 Thread Aaron Watry
Also add lit test for both cases on SI, and v2i32 for evergreen. Signed-off-by: Aaron Watry awa...@gmail.com --- lib/Target/R600/SIISelLowering.cpp | 3 +++ test/CodeGen/R600/or.ll| 41 +++--- 2 files changed, 37 insertions(+), 7 deletions(-) diff

[Mesa-dev] [PATCH 04/12] R600/SI: Expand shl of v2i32/v4i32 for SI

2013-06-20 Thread Aaron Watry
Also add lit test for both cases on SI, and v2i32 for evergreen. Signed-off-by: Aaron Watry awa...@gmail.com --- lib/Target/R600/SIISelLowering.cpp | 3 +++ test/CodeGen/R600/shl.ll | 47 ++ 2 files changed, 40 insertions(+), 10 deletions(-) diff

[Mesa-dev] [PATCH 05/12] R600/SI: Expand srl of v2i32/v4i32 for SI

2013-06-20 Thread Aaron Watry
Also add lit test for both cases on SI, and v2i32 for evergreen. Signed-off-by: Aaron Watry awa...@gmail.com --- lib/Target/R600/SIISelLowering.cpp | 2 ++ test/CodeGen/R600/srl.ll | 42 +++--- 2 files changed, 37 insertions(+), 7 deletions(-) diff

[Mesa-dev] [PATCH 06/12] R600/SI: Expand ashr of v2i32/v4i32 for SI

2013-06-20 Thread Aaron Watry
Also add lit test for both cases on SI, and v2i32 for evergreen. Signed-off-by: Aaron Watry awa...@gmail.com --- lib/Target/R600/SIISelLowering.cpp | 2 ++ test/CodeGen/R600/sra.ll | 41 +++--- 2 files changed, 36 insertions(+), 7 deletions(-) diff

[Mesa-dev] [PATCH 07/12] R600/SI: Expand udiv v[24]i32 for SI and v2i32 for EG

2013-06-20 Thread Aaron Watry
Also add lit test for both cases on SI, and v2i32 for evergreen. Note: I followed the guidance of the v4i32 EG check... UDIV produces really complex code, so let's just check that the instruction was lowered successfully. Signed-off-by: Aaron Watry awa...@gmail.com --- lib/Target/R600

[Mesa-dev] [PATCH 08/12] R600/SI: Expand urem of v2i32/v4i32 for SI

2013-06-20 Thread Aaron Watry
Also add lit test for both cases on SI, and v2i32 for evergreen. Note: I followed the guidance of the v4i32 EG check... UREM produces really complex code, so let's just check that the instruction was lowered successfully. Signed-off-by: Aaron Watry awa...@gmail.com --- lib/Target/R600

[Mesa-dev] [PATCH 09/12] R600: Add v2i32 test for setcc on evergreen

2013-06-20 Thread Aaron Watry
No test/expansion for SI has been added yet. Attempts to expand this operation for SI resulted in a stacktrace in (IIRC) LegalizeIntegerTypes which was complaining about vector comparisons being required to return a vector type. Signed-off-by: Aaron Watry awa...@gmail.com --- test/CodeGen/R600

[Mesa-dev] [PATCH 10/12] R600/SI: Expand xor v2i32/v4i32

2013-06-20 Thread Aaron Watry
Add test cases for both vector sizes on SI and also add v2i32 test for EG. Signed-off-by: Aaron Watry awa...@gmail.com --- lib/Target/R600/SIISelLowering.cpp | 3 +++ test/CodeGen/R600/xor.ll | 40 +++--- 2 files changed, 36 insertions(+), 7 deletions

[Mesa-dev] [PATCH 11/12] R600: Add v2i32 test for vselect

2013-06-20 Thread Aaron Watry
() == N->getOperand(0).getValueType().isVector() && "Vector compare must return a vector result!"' failed. Signed-off-by: Aaron Watry awa...@gmail.com --- test/CodeGen/R600/vselect.ll | 26 -- 1 file changed, 20 insertions(+), 6 deletions(-) diff --git a/test/CodeGen/R600

[Mesa-dev] [PATCH 12/12] R600: Consolidate expansion of v2i32/v4i32 ops for EG/SI

2013-06-20 Thread Aaron Watry
By default, we expand these operations for both EG and SI. Move the duplicated code into a common space for now. If the targets ever actually implement these operations as instructions, we can override that in the relevant target. Signed-off-by: Aaron Watry awa...@gmail.com --- lib/Target/R600

[Mesa-dev] [PATCH] R600: Improve vector constant loading for EG/SI

2013-06-21 Thread Aaron Watry
Add some constant load v2i32/v4i32 tests for both EG and SI. Tested on: Pitcairn (7850) and Cedar (54xx) Signed-off-by: Aaron Watry awa...@gmail.com --- lib/Target/R600/R600Instructions.td | 3 +++ lib/Target/R600/SIInstructions.td | 10 ++ test/CodeGen/R600/load.vec.ll | 27

Re: [Mesa-dev] [PATCH 04/12] R600/SI: Expand shl of v2i32/v4i32 for SI

2013-06-21 Thread Aaron Watry
I moved it to the top of the file, if that's ok... although I guess I could leave it at the bottom if you want.. --Aaron On Fri, Jun 21, 2013 at 9:05 PM, Tom Stellard t...@stellard.net wrote: On Thu, Jun 20, 2013 at 06:43:42PM -0500, Aaron Watry wrote: Also add lit test for both cases on SI

Re: [Mesa-dev] [PATCH 1/2] R600: Add support for i32 loads from the constant address space on Cayman

2013-06-24 Thread Aaron Watry
Tested-By: Aaron Watry awa...@gmail.com Tested on an A6-3500 (SUMO) On Tue, Jun 18, 2013 at 11:54 AM, Tom Stellard t...@stellard.net wrote: From: Tom Stellard thomas.stell...@amd.com --- lib/Target/R600/R600Instructions.td | 9 + test/CodeGen/R600/load.ll | 1 + 2 files

[Mesa-dev] [PATCH] clover: Fix build with LLVM 3.4

2013-06-28 Thread Aaron Watry
PathV1.h has been removed. In theory this can go back before llvm 3.4, but I haven't done the research to find out how far back. Signed-off-by: Aaron Watry awa...@gmail.com --- src/gallium/state_trackers/clover/llvm/invocation.cpp | 12 1 file changed, 12 insertions(+) diff --git

Re: [Mesa-dev] [PATCH] clover: Fix build with LLVM 3.4

2013-06-28 Thread Aaron Watry
Disregard this patch... Looks like Tom already pushed a fix last night. --Aaron On Fri, Jun 28, 2013 at 9:41 AM, Aaron Watry awa...@gmail.com wrote: PathV1.h has been removed. In theory this can go back before llvm 3.4, but I haven't done the research to find out how far back. Signed-off

Re: [Mesa-dev] [PATCH] R600: Expand VSELECT for all types

2013-07-16 Thread Aaron Watry
Looks good to me. I've tested on Cedar (HD5400) with no OpenCL regressions, but cannot test on SI because SETCC still causes issues (see https://bugs.freedesktop.org/show_bug.cgi?id=66175). Once SETCC is fixed for SI, we should probably add SI-CHECK lines to vselect.ll --Aaron On Tue, Jul 16,

Re: [Mesa-dev] [PATCH] R600: Expand VSELECT for all types

2013-07-17 Thread Aaron Watry
, Tom Stellard t...@stellard.net wrote: Hi, The attached three patches along with this one should fix VSELECT on SI as well. -Tom On Tue, Jul 16, 2013 at 05:12:40PM -0500, Aaron Watry wrote: Looks good to me. I've tested on Cedar (HD5400) with no OpenCL regressions, but cannot test on SI

Re: [Mesa-dev] [PATCH 7/7] clover: Sign-extend and zero-extend kernel arguments when required v2

2013-07-17 Thread Aaron Watry
On Tue, Jul 9, 2013 at 11:21 PM, Tom Stellard t...@stellard.net wrote: From: Tom Stellard thomas.stell...@amd.com v2: - Extend to target size rather than aligned size - Support for big-endian --- src/gallium/state_trackers/clover/core/kernel.cpp | 58 --

Re: [Mesa-dev] Patches: R600: Improve load / store support for 8-bit and 16-bit types

2013-08-12 Thread Aaron Watry
It'll take me a while to attempt to parse everything that's going on in these patches (and your resource descriptor types series that this depends on), but I have sent it all through a piglit run on Evergreen (Cedar). Everything was latest Mesa/LLVM/libclc upstream code as of today. Baseline:

Re: [Mesa-dev] Patches: R600: Improve load / store support for 8-bit and 16-bit types

2013-08-13 Thread Aaron Watry
. And as you said, the descriptors series fixed compute hangs for the 7850 on quite a few kernels which did comparison operations (max/clamp kernels mostly, maybe some min). You can definitely get a tested-by for both the descriptors series and this: Tested-by: Aaron Watry awa...@gmail.com Quite a few

Re: [Mesa-dev] [PATCH] R600: Fix segfault in R600TextureIntrinsicReplacer

2013-08-21 Thread Aaron Watry
' on module 'radeon'. 1.Running pass 'AMDGPU DAG-DAG Pattern Instruction Selection' on function '@vp8_loop_filter_all_edges_kernel' Aborted (core dumped) For that you get a: Tested-By: Aaron Watry awa...@gmail.com On Wed, Aug 21, 2013 at 1:33 PM, Tom Stellard t...@stellard.net wrote: From: Tom

Re: [Mesa-dev] [PATCH] r600g/compute: Fix bug in compute memory pool

2013-08-28 Thread Aaron Watry
The changes look good to me... That seems to be a much more sane way to add the item to the beginning of the linked list. I've tested this on CEDAR (Radeon 5400) without any OpenCL regressions, and the only piglit change was that the new piglit test created for this bug now passes. --Aaron On

Re: [Mesa-dev] [PATCH 1/4] r600g: use u_upload_mgr for allocating staging transfer buffers

2012-12-11 Thread Aaron Watry
6850. http://openbenchmarking.org/result/1212102-SU-SUBALLOCT33 For my part: Tested-by: Aaron Watry awa...@gmail.com --Aaron Watry u_upload_mgr suballocates memory from a large buffer and maps the allocated range (unsynchronized), which is perfect for short-lived staging buffers. This reduces

Re: [Mesa-dev] [PATCH] r600g: tgsi to llvm emits stream output intrinsics.

2012-12-14 Thread Aaron Watry
diff --git a/src/gallium/drivers/r600/r600_llvm.c b/src/gallium/drivers/r600/r600_llvm.c index 8f1ed26..14c0205 100644 --- a/src/gallium/drivers/r600/r600_llvm.c +++ b/src/gallium/drivers/r600/r600_llvm.c @@ -229,11 +229,32 @@ static void llvm_emit_epilogue(struct lp_build_tgsi_context *

Re: [Mesa-dev] [PATCH 1/2] drivers/radeon: Don't link against libgallium.la

2013-01-11 Thread Aaron Watry
--enable-texture-float --enable-opencl --enable-r600-llvm-compiler --with-egl-platforms=x11,drm --enable-glx-tls --Aaron Watry From: Tom Stellard thomas.stellard at amd.com http://lists.freedesktop.org/mailman/listinfo/mesa-dev This fixes multiple symbol errors in pipe-loader --- src/gallium

Re: [Mesa-dev] [PATCH] Revert targets/opencl: Link against libgallium.la instead of libgallium.a

2013-01-14 Thread Aaron Watry
of libgallium.a, but until we can figure why linking against libgallium.la causes runtime failures in clover we will continue to link against libgallium.a Tested-by: Aaron Watry awa...@gmail.com Piglit runs CL tests again, but I still get a bunch of run-time warnings along the lines of: premain

Re: [Mesa-dev] AMD RS780 please help

2012-09-25 Thread Aaron Watry
: 1.00 Buffer[20]: 0.765430, Expected: -0.375000 Buffer[21]: 0.765430, Expected: -0.375000 Buffer[22]: 0.765430, Expected: 0.00 Buffer[23]: 0.765430, Expected: 1.00 PIGLIT: {'result': 'fail' } Regards, Aaron Watry 2012/9/24 Marek Olšák mar...@gmail.com: Hi, would somebody

Re: [Mesa-dev] GSoC : Video decoding state tracker for Gallium3d

2011-04-04 Thread Aaron Watry
Hi Emeric, It doesn't affect your proposal too much, but I'd recommend changing the order of your August tasks a bit. I would suggest trying to work on the loop filter before the motion compensation. A few of the 720p and 1080p videos that I profiled during my thesis work suggested that the loop

[Mesa-dev] [PATCH] RFC clover: calculate maximum workgroup size based on device

2013-10-23 Thread Aaron Watry
The maximum workgroup size for a given kernel is based on the capabilities of the device that it's being run on. Previously, we were just returning the maximum value of a size_t which is obviously wrong. This patch uses the device's capabilities, but doesn't take into account any resource usage

Re: [Mesa-dev] [PATCH] R600: Expand vector FSQRT ops

2013-10-25 Thread Aaron Watry
Reviewed-by: Aaron Watry awa...@gmail.com I have tested this on a Radeon 5400 (Cedar), and I just sent a few generated tests to the piglit list. --Aaron On Wed, Oct 23, 2013 at 6:28 PM, Tom Stellard t...@stellard.net wrote: From: Tom Stellard thomas.stell...@amd.com --- lib/Target/R600

Re: [Mesa-dev] [PATCH] radeon/llvm: Specify the DataLayout when running optimizations

2013-10-28 Thread Aaron Watry
I ran this through a piglit CL test run on my 7850, no test fixes or regressions. --Aaron On Tue, Oct 22, 2013 at 11:28 AM, Tom Stellard t...@stellard.net wrote: From: Tom Stellard thomas.stell...@amd.com Without DataLayout, a lot of optimization passes aren't run and the ones that are don't

Re: [Mesa-dev] [PATCH] radeon/llvm: Specify the DataLayout when running optimizations

2013-10-28 Thread Aaron Watry
I just ran a quick.tests run on evergreen without any regressions. Patch looks good to me, and doesn't seem to cause any regressions on the hardware I have available to test with. --Aaron On Tue, Oct 22, 2013 at 11:28 AM, Tom Stellard t...@stellard.net wrote: From: Tom Stellard

Re: [Mesa-dev] [PATCH] clover: Calculate the optimal work group size when local_size is NULL

2013-10-29 Thread Aaron Watry
On Tue, Oct 29, 2013 at 7:06 PM, Niels Ole Salscheider niels_...@salscheider-online.de wrote: Hi Tom, this has been on my todo list for quite a while. Your patch looks good to me, but in my experience a block with approximately the same size for each dimension gives slightly better

Re: [Mesa-dev] [PATCH] clover: Don't install headers when using the icd

2013-10-30 Thread Aaron Watry
Reviewed and Tested-by: Aaron Watry awa...@gmail.com On Tue, Oct 29, 2013 at 11:48 AM, Tom Stellard t...@stellard.net wrote: From: Tom Stellard thomas.stell...@amd.com The ICD loader should be responsible for installing headers. --- src/gallium/state_trackers/clover/Makefile.am | 21

[Mesa-dev] [PATCH] clover: fix build with LLVM 3.4

2013-11-01 Thread Aaron Watry
dso_list was added as an argument for createInternalizePass in 3.4, and then it was removed again in the same llvm version. --- src/gallium/state_trackers/clover/llvm/invocation.cpp | 5 - 1 file changed, 5 deletions(-) diff --git a/src/gallium/state_trackers/clover/llvm/invocation.cpp

Re: [Mesa-dev] [PATCH 1/5] mesa: remove Alpha CPU checks

2013-11-05 Thread Aaron Watry
On Mon, Nov 4, 2013 at 7:04 PM, Matt Turner matts...@gmail.com wrote: On Mon, Nov 4, 2013 at 4:48 PM, Brian Paul bri...@vmware.com wrote: --- src/mesa/main/compiler.h |7 +-- 1 file changed, 1 insertion(+), 6 deletions(-) diff --git a/src/mesa/main/compiler.h

Re: [Mesa-dev] [PATCH] vl: use a separate context for shader based decode

2013-11-06 Thread Aaron Watry
On Wed, Nov 6, 2013 at 8:13 AM, Christian König deathsim...@vodafone.de wrote: From: Christian König christian.koe...@amd.com This makes VDPAU thread safe again. Signed-off-by: Christian König christian.koe...@amd.com --- src/gallium/auxiliary/vl/vl_mpeg12_decoder.c | 180

[Mesa-dev] [PATCH 6/6] gallium/pipe_loader: un-reference udev resources when we're done with them.

2013-11-06 Thread Aaron Watry
--- src/gallium/auxiliary/pipe-loader/pipe_loader_drm.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/src/gallium/auxiliary/pipe-loader/pipe_loader_drm.c b/src/gallium/auxiliary/pipe-loader/pipe_loader_drm.c index 339d7bf..927fb24 100644 ---

[Mesa-dev] [PATCH 4/6] radeonsi/compute: Free program and program.kernels on shutdown

2013-11-06 Thread Aaron Watry
--- src/gallium/drivers/radeonsi/radeonsi_compute.c | 16 +++- 1 file changed, 15 insertions(+), 1 deletion(-) diff --git a/src/gallium/drivers/radeonsi/radeonsi_compute.c b/src/gallium/drivers/radeonsi/radeonsi_compute.c index 265dbd7..28a3f17 100644 ---

Re: [Mesa-dev] [PATCH 0/6] radeon: Plug some memory leaks

2013-11-06 Thread Aaron Watry
uses tabs for spaces. Do you want a v2, or are you happy with the patches assuming that I fix the indentation? --Aaron -Tom On Wed, Nov 06, 2013 at 10:36:49AM -0600, Aaron Watry wrote: I decided to have some fun and hooked valgrind up to my 7850 while running a few OpenCL tests in piglit

[Mesa-dev] [PATCH 0/6 v2] radeon: Plug some memory leaks

2013-11-06 Thread Aaron Watry
Turns out that I don't have commit access to Mesa, just piglit. Feel free to push if they look good. I decided to have some fun and hooked valgrind up to my SI while running a few OpenCL tests in piglit. This is the first batch of fixes. Aaron Watry (6): radeon/llvm: fix spelling error

[Mesa-dev] [PATCH 4/6] radeonsi/compute: Free program and program.kernels on shutdown

2013-11-06 Thread Aaron Watry
v2: Fix indentation Reviewed-by: Tom Stellard thomas.stell...@amd.com --- src/gallium/drivers/radeonsi/radeonsi_compute.c | 16 +++- 1 file changed, 15 insertions(+), 1 deletion(-) diff --git a/src/gallium/drivers/radeonsi/radeonsi_compute.c

[Mesa-dev] [PATCH 6/6] gallium/pipe_loader: un-reference udev resources when we're done with them.

2013-11-06 Thread Aaron Watry
Reviewed-by: Tom Stellard thomas.stell...@amd.com --- src/gallium/auxiliary/pipe-loader/pipe_loader_drm.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/src/gallium/auxiliary/pipe-loader/pipe_loader_drm.c b/src/gallium/auxiliary/pipe-loader/pipe_loader_drm.c index 339d7bf..927fb24 100644

[Mesa-dev] [PATCH 2/6] radeon/llvm: Free libelf resources

2013-11-06 Thread Aaron Watry
v2: Fix indentation Reviewed-by: Tom Stellard thomas.stell...@amd.com --- src/gallium/drivers/radeon/radeon_llvm_emit.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/src/gallium/drivers/radeon/radeon_llvm_emit.c b/src/gallium/drivers/radeon/radeon_llvm_emit.c index 8bf278b..d2e5642

[Mesa-dev] [PATCH 1/6] radeon/llvm: fix spelling error

2013-11-06 Thread Aaron Watry
Reviewed-by: Tom Stellard thomas.stell...@amd.com --- src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c b/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c index

[Mesa-dev] [PATCH 2/3] r600/llvm: Free binary.code/binary.config in r600_llvm_compile

2013-11-07 Thread Aaron Watry
radeon_llvm_compile allocates memory for binary.code, binary.config, or neither depending on what's being done. We need to make sure to free that memory after it's no longer needed. --- src/gallium/drivers/r600/r600_llvm.c | 7 +++ 1 file changed, 7 insertions(+) diff --git
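The cleanup described in the message above can be sketched roughly as follows. This is an illustration only, not the code from the patch: the struct `example_binary` and the helper `example_binary_release` are hypothetical names, and the sketch assumes both buffers are heap-allocated with malloc and left as NULL when they were not allocated.

```c
#include <stdlib.h>

/* Hypothetical stand-in for the driver's binary struct; only the two
 * buffers named in the commit message are modeled here. */
struct example_binary {
    unsigned char *code;     /* NULL if no code was produced */
    unsigned code_size;
    unsigned char *config;   /* NULL if no config was produced */
    unsigned config_size;
};

/* Release whichever buffers the compile step allocated. free(NULL) is a
 * no-op, so the "neither was allocated" case needs no special handling.
 * The pointers are reset so that a second call is harmless. */
static void example_binary_release(struct example_binary *b)
{
    free(b->code);
    free(b->config);
    b->code = NULL;
    b->config = NULL;
    b->code_size = 0;
    b->config_size = 0;
}
```

Resetting the pointers after freeing is a design choice made for this sketch; it guards against a double free if the release path is reached twice.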
