On 22.01.2013 13:08, Michel Dänzer wrote:
On Mon, 2013-01-21 at 22:28 +0100, Christian König wrote:
v2: don't mess up other loops
Signed-off-by: Christian König deathsimple at vodafone.de
Series is
Tested-by: Michel Dänzer michel.daenzer at amd.com
No piglit regressions with
From: Tom Stellard thomas.stellard at amd.com
These instructions compare two floating point values and return an
integer true (-1) or false (0) value.
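
To make the semantics stated above concrete, here is a minimal C model of a compare that yields an integer mask rather than a float. The function names are hypothetical and only mirror the behaviour described in the message; they are not taken from the R600 backend.

```c
#include <stdint.h>

/* Hypothetical model of a SET*_DX10-style equality compare: the result
 * is an integer true (-1, all bits set) or false (0). */
static int32_t sete_dx10_model(float a, float b)
{
    return (a == b) ? -1 : 0;
}

/* Same idea for a greater-than compare (name is also an assumption). */
static int32_t setgt_dx10_model(float a, float b)
{
    return (a > b) ? -1 : 0;
}
```

Because the result is already an integer mask, a branch can test it directly without a separate float-to-int conversion step.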
When compiling code generated by the Mesa GLSL frontend, the SET*_DX10
instructions save us four instructions for most branch decisions that
use
I've managed to capture a trace that loads TF2 to the menu and reproduces
some of the flickering. I haven't managed to capture any gameplay yet due
to an error in CD Key authentication due to how I'm launching the game.
URL:
http://www.watrys.net/tf2_menu.trace.xz
--Aaron
On Wed, Feb 20, 2013
Hi Tom,
Mesa+Clover does indeed build against master llvm/clang, but I'm having
trouble building against it when I try to do a clean build of Piglit.
Error received:
[ 18%] Built target piglitutil_cl
Linking C executable ../../../../../bin/cl-custom-run-simple-kernel
On Thu, Feb 21, 2013 at 8:33 AM, Tom Stellard t...@stellard.net wrote:
On Thu, Feb 21, 2013 at 08:25:20AM -0600, Aaron Watry wrote:
Hi Tom,
Mesa+Clover does indeed build against master llvm/clang, but I'm having
trouble building against it when I try to do a clean build of Piglit
On Thu, Feb 21, 2013 at 10:06 AM, Tom Stellard t...@stellard.net wrote:
On Thu, Feb 21, 2013 at 10:02:34AM -0600, Aaron Watry wrote:
On Thu, Feb 21, 2013 at 8:33 AM, Tom Stellard t...@stellard.net wrote:
On Thu, Feb 21, 2013 at 08:25:20AM -0600, Aaron Watry wrote:
Hi Tom,
Mesa
On Fri, Feb 22, 2013 at 12:21 PM, Tom Stellard t...@stellard.net wrote:
On Thu, Feb 21, 2013 at 08:25:20AM -0600, Aaron Watry wrote:
Hi Tom,
Mesa+Clover does indeed build against master llvm/clang, but I'm having
trouble building against it when I try to do a clean build of Piglit
Same error here.
Configuration: ./autogen.sh --enable-texture-float --enable-opencl
--with-gallium-drivers=r600 --with-dri-drivers=radeon --prefix=/usr/local
--Aaron
On Tue, Feb 26, 2013 at 11:09 AM, Jordan Justen jljus...@gmail.com wrote:
On Sat, Feb 23, 2013 at 7:29 AM, Brian Paul
LLVM moved a bunch of IR-related headers for version 3.3.
This fixes the libclc build to follow suit.
---
utils/prepare-builtins.cpp | 12
1 file changed, 12 insertions(+)
diff --git a/utils/prepare-builtins.cpp b/utils/prepare-builtins.cpp
index ae7731b..0141484 100644
---
The python changes in this file look good to me. I haven't done a
line-by-line review of the SI changes.
I tested this patch and v2 of the related mesa series on r600g (radeon
6850) with a recent LLVM and fresh mesa master as of this evening. No real
change in the piglit CL test success/failure
Adds this function for both int and floating data types.
---
generic/include/clc/clc.h |2 ++
generic/include/clc/integer/max.h |2 ++
generic/include/clc/integer/max.inc |1 +
generic/include/clc/math/max.h |2 ++
generic/include/clc/math/max.inc|1 +
---
generic/lib/SOURCES |1 +
generic/lib/integer/abs_diff.inc |2 +-
2 files changed, 2 insertions(+), 1 deletion(-)
diff --git a/generic/lib/SOURCES b/generic/lib/SOURCES
index b593941..a97213b 100644
--- a/generic/lib/SOURCES
+++ b/generic/lib/SOURCES
@@ -4,6 +4,7 @@
This series depends on the one-off patch I just sent to add max().
1) Fix the broken abs_diff integer built-in.
2) Add clamp for both integer and floating types in a new shared/ dir in order
to reduce code duplication and improve maintainability.
3) Move the max() function into the shared/
Created under a new shared/ directory for functions which are available for
both integer and floating point types.
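
For readers unfamiliar with the built-in, OpenCL defines clamp(x, minval, maxval) as min(max(x, minval), maxval). A plain C sketch of that definition for one integer type follows; the `*_i32` names are hypothetical and this is not the libclc source.

```c
/* Hypothetical scalar helpers standing in for the OpenCL built-ins. */
static int max_i32(int a, int b) { return a > b ? a : b; }
static int min_i32(int a, int b) { return a < b ? a : b; }

/* clamp(x, lo, hi) == min(max(x, lo), hi). OpenCL leaves the result
 * undefined when lo > hi, so this sketch assumes lo <= hi. */
static int clamp_i32(int x, int lo, int hi)
{
    return min_i32(max_i32(x, lo), hi);
}
```

The same expression applies unchanged to floating-point types, which is presumably why one shared definition can serve both families.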
---
generic/include/clc/clc.h|3 +++
generic/include/clc/shared/clamp.h |5 +
generic/include/clc/shared/clamp.inc |1 +
generic/lib/SOURCES
Max(x,y) is available for all integer/floating types.
---
generic/include/clc/clc.h |3 +--
generic/include/clc/integer/max.h |2 --
generic/include/clc/integer/max.inc |1 -
generic/include/clc/math/max.h |2 --
generic/include/clc/math/max.inc|1 -
PM, Aaron Watry awa...@gmail.com wrote:
This series depends on the one-off patch I just sent to add max().
1) Fix the broken abs_diff integer built-in.
2) Add clamp for both integer and floating types in a new shared/ dir in
order
to reduce code duplication and improve maintainability.
3
This implementation does a lot of bit shifting and masking. Suffice to say,
this is somewhat suboptimal... but it does look to produce correct results
(after the piglit tests were corrected for sign extension issues).
Someone who knows LLVM better than I could re-write this more efficiently.
---
Implements the min() OpenCL built-in in 2 stages.
1) Implement min() where the two argument types match
2) Make changes to support min(vec,scalar)
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
This doesn't handle the integer cases for min(vector, scalar).
---
generic/include/clc/clc.h |1 +
generic/include/clc/shared/min.h |5 +
generic/include/clc/shared/min.inc |1 +
generic/lib/SOURCES|1 +
generic/lib/shared/min.cl | 11
Checks if the current GENTYPE is scalar, and if not, then defines a separate
implementation of the function which casts the second arg to vector before
proceeding.
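
The idea described above, widening the scalar second argument and then reusing the vector/vector path, can be modelled in plain C. The struct and function names here are hypothetical stand-ins for float4 and the generated overloads, not the actual gentype.inc machinery.

```c
/* Hypothetical 4-wide float vector standing in for OpenCL's float4. */
typedef struct { float s[4]; } float4_model;

/* Component-wise min where both arguments are vectors. */
static float4_model min_vv(float4_model a, float4_model b)
{
    float4_model r;
    for (int i = 0; i < 4; i++)
        r.s[i] = a.s[i] < b.s[i] ? a.s[i] : b.s[i];
    return r;
}

/* Mixed vector/scalar form: replicate the scalar into every lane,
 * then call the vector/vector implementation. */
static float4_model min_vs(float4_model a, float b)
{
    float4_model bv = { { b, b, b, b } };
    return min_vv(a, bv);
}
```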
---
generic/include/clc/integer/gentype.inc | 23 +++
generic/include/clc/math/gentype.inc|8
For any GENTYPE that isn't scalar, we need to implement a mixed
vector/scalar version of clamp/max.
This depends on the min() patches I sent to the list a few minutes ago.
---
generic/include/clc/shared/clamp.inc |4
generic/include/clc/shared/max.inc |4
configure.py allows overloading *.cl with *.ll, but will only ever build
the first file listed in SOURCES of ${file}.cl and ${file}.ll
add_sat, sub_sat, (and the soon to be submitted clz) all define interfaces in
${function_name}.ll which are implemented in ${function_name}_impl.ll.
Renaming the
For the series:
Tested-by: Aaron Watry awa...@gmail.com
Config:
./configure --with-dri-drivers=radeon --with-gallium-drivers=r600
--enable-texture-float --enable-opencl --enable-gles1 --enable-gles2
--enable-xvmc --enable-vdpau --enable-r600-llvm-compiler
--with-egl-platforms=x11,drm --enable-glx
Hi Tom,
I'm not too qualified to review the llvm code changes, but the changes
looked sane. I did want to point out a few piglit changes/regressions as a
result of this set of patches.
For my HD6850, running latest llvm from git:
gegl-rgb-gamma-u8-to-ragabaf: pass -> fail
v3i32-stack: pass -> fail
Not sure if this helps much, but...
With gentype being one of:
char, uchar, short, ushort, int, uint, long, ulong, and the widths
being scalar, 2, 3, 4, 8, or 16 components wide.
From the OpenCL 1.1 spec:
gentype mad_hi(gentype a, gentype b):
Computes x * y and returns the high half of the
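
As an illustration of the "high half of the product" step only, here is a C sketch for one unsigned 32-bit type. The function name is hypothetical, and the sketch relies on a 64-bit intermediate, which a real target implementation may not have available.

```c
#include <stdint.h>

/* Hypothetical model: multiply two 32-bit values in 64 bits and keep
 * the upper 32 bits of the 64-bit product. */
static uint32_t mul_hi_u32_model(uint32_t x, uint32_t y)
{
    uint64_t product = (uint64_t)x * (uint64_t)y;
    return (uint32_t)(product >> 32);
}
```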
I know it's been pushed already, but this also fixes some lockups that
I was seeing on Barts (HD6850) when running piglit's OpenCL tests.
Thanks for fixing this.
--Aaron
On Fri, May 3, 2013 at 9:47 AM, Marek Olšák mar...@gmail.com wrote:
Reviewed-by: Marek Olšák mar...@gmail.com
Marek
On
This series, and the associated mesa changes are all:
Tested-By: Aaron Watry awa...@gmail.com
--Aaron
On Fri, May 3, 2013 at 5:53 PM, Tom Stellard t...@stellard.net wrote:
Hi,
The attached patches modify the CodeEmitter to emit true ISA.
Previously, we were prefixing all instructions
These two patches fix a number of piglit OpenCL test failures on my
HD6850 (Barts).
There are no piglit CL test regressions and the llvm make check runs
without any unexpected failures.
Signed-off-by: Aaron Watry awa...@gmail.com
---
lib/Target/R600/R600ISelLowering.cpp |3 +++
1 file changed, 3 insertions(+)
diff --git a/lib/Target/R600/R600ISelLowering.cpp
b/lib/Target/R600/R600ISelLowering.cpp
index c6e2136..6dec4d1 100644
--- a/lib/Target/R600/R600ISelLowering.cpp
Signed-off-by: Aaron Watry awa...@gmail.com
---
lib/Target/R600/R600ISelLowering.cpp |2 ++
1 file changed, 2 insertions(+)
diff --git a/lib/Target/R600/R600ISelLowering.cpp
b/lib/Target/R600/R600ISelLowering.cpp
index 6dec4d1..ac56ed8 100644
--- a/lib/Target/R600/R600ISelLowering.cpp
+++ b
These two patches fix a number of piglit OpenCL test failures on my
HD6850 (Barts).
There are no piglit CL test regressions and the llvm make check runs
without any unexpected failures.
v2: Add tests for v4i32 data type.
Signed-off-by: Aaron Watry awa...@gmail.com
v2: Add vselect v4i32 test
---
lib/Target/R600/R600ISelLowering.cpp |3 +++
test/CodeGen/R600/vselect.ll | 17 +
2 files changed, 20 insertions(+)
create mode 100644 test/CodeGen/R600/vselect.ll
diff --git a/lib/Target
Signed-off-by: Aaron Watry awa...@gmail.com
v2: Add v4i32 test
---
lib/Target/R600/R600ISelLowering.cpp |2 ++
test/CodeGen/R600/sra.ll | 13 +
2 files changed, 15 insertions(+)
create mode 100644 test/CodeGen/R600/sra.ll
diff --git a/lib/Target/R600
Fixes piglit test for OpenCL builtin mul24, and allows mad24 to run.
Signed-off-by: Aaron Watry awa...@gmail.com
---
lib/Target/R600/R600ISelLowering.cpp |2 ++
test/CodeGen/R600/mul.ll | 16
2 files changed, 18 insertions(+)
create mode 100644 test/CodeGen
Signed-off-by: Aaron Watry awa...@gmail.com
---
lib/Target/R600/R600ISelLowering.cpp |2 ++
test/CodeGen/R600/sub.ll | 15 +++
2 files changed, 17 insertions(+)
create mode 100644 test/CodeGen/R600/sub.ll
diff --git a/lib/Target/R600/R600ISelLowering.cpp
b/lib
On Fri, May 17, 2013 at 2:31 PM, Jose Fonseca jfons...@vmware.com wrote:
- Original Message -
On Fri, May 17, 2013 at 7:44 AM, Jose Fonseca jfons...@vmware.com wrote:
Vinson,
Why is this necessary?
(I'd prefer that LLVM is statically linked by default. )
Jose
The SCons
Reviewed-by: Aaron Watry awa...@gmail.com
Please also send the attached test patch (or an expanded version of
it) to the piglit list.
On Thu, May 23, 2013 at 12:48 PM, Tom Stellard t...@stellard.net wrote:
From: Tom Stellard thomas.stell...@amd.com
---
generic/include/clc/clc.h
I've implemented the OpenCL vload/vstore builtin functions in two parts.
1) Pure CL C implementation. No Assembly
2) Add assembly optimizations for 32-bit int/uint loads/stores of 4+ component
vectors
Note: The vstore implementation assumes that the hardware back end supports
byte-addressable
Should work for all targets and data types. Completely unoptimized.
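
For reference, OpenCL's vloadn(offset, p) reads n consecutive elements starting at p + offset * n. An unoptimized, component-at-a-time version can be modelled in C as below; the type and function names are hypothetical and this is not the libclc vload.cl source.

```c
#include <stddef.h>

/* Hypothetical stand-in for OpenCL's int4. */
typedef struct { int s[4]; } int4_model;

/* Model of vload4(offset, p): reads 4 consecutive elements starting at
 * p + offset * 4, one component at a time (no wide loads). */
static int4_model vload4_model(size_t offset, const int *p)
{
    int4_model v;
    for (int i = 0; i < 4; i++)
        v.s[i] = p[offset * 4 + i];
    return v;
}
```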
---
generic/include/clc/clc.h | 1 +
generic/include/clc/shared/vload.h | 37 ++
generic/lib/SOURCES| 1 +
generic/lib/shared/vload.cl| 47
Assumes that the target supports byte-addressable stores.
Completely unoptimized.
---
generic/include/clc/clc.h | 1 +
generic/include/clc/shared/vstore.h | 36
generic/lib/SOURCES | 1 +
generic/lib/shared/vstore.cl| 56
The assembly should be generic, but at least currently R600 only supports
32-bit loads of int1/4, and I believe that only global is well-supported.
R600 lowers the 8/16 component vectors to multiple 4-component loads.
The unoptimized C versions of the other stuff are left in place.
---
The assembly should be generic, but at least currently R600 only supports
32-bit stores of [u]int1/4, and I believe that only global is well-supported.
R600 lowers the 8/16 component stores to multiple 4-component stores.
The unoptimized C versions of the other stuff are left in place.
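
The splitting of a wide store into 4-component stores can be pictured with a small C model. All names are hypothetical; the real lowering happens inside the LLVM backend, not in C.

```c
#include <stddef.h>

typedef struct { int s[4]; } int4_model;  /* hypothetical int4 */
typedef struct { int s[8]; } int8_model;  /* hypothetical int8 */

/* One 4-component store: writes p[offset*4] .. p[offset*4 + 3]. */
static void vstore4_model(int4_model v, size_t offset, int *p)
{
    for (int i = 0; i < 4; i++)
        p[offset * 4 + i] = v.s[i];
}

/* An 8-component store expressed as two 4-component stores covering
 * p[offset*8] .. p[offset*8 + 7]. */
static void vstore8_model(int8_model v, size_t offset, int *p)
{
    int4_model lo = { { v.s[0], v.s[1], v.s[2], v.s[3] } };
    int4_model hi = { { v.s[4], v.s[5], v.s[6], v.s[7] } };
    vstore4_model(lo, offset * 2, p);
    vstore4_model(hi, offset * 2 + 1, p);
}
```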
---
Looks good to me. Is there a piglit test for this?
--Aaron
On Wed, Jun 5, 2013 at 7:12 PM, Tom Stellard t...@stellard.net wrote:
From: Tom Stellard thomas.stell...@amd.com
---
src/gallium/state_trackers/clover/llvm/invocation.cpp | 7 +++
1 file changed, 7 insertions(+)
diff --git
is there and functioning.
For the libclc change:
Reviewed-by: Aaron Watry awa...@gmail.com
On Wed, Jun 12, 2013 at 7:31 PM, Tom Stellard t...@stellard.net wrote:
From: Tom Stellard thomas.stell...@amd.com
---
r600/lib/SOURCES | 2 ++
r600/lib/synchronization/barrier.cl
On Wed, Jun 12, 2013 at 7:34 PM, Tom Stellard t...@stellard.net wrote:
From: Tom Stellard thomas.stell...@amd.com
---
src/gallium/drivers/r600/evergreen_compute.c | 23 +++-
src/gallium/drivers/r600/r600_shader.c | 32
2 files changed, 22
-by: Aaron Watry awa...@gmail.com
On Wed, Jun 12, 2013 at 7:34 PM, Tom Stellard t...@stellard.net wrote:
From: Tom Stellard thomas.stell...@amd.com
And allocate the correct amount before dispatching the kernel.
---
src/gallium/drivers/r600/evergreen_compute.c | 53
Also add a separate vector lit test file, since r600 doesn't seem to handle
v2i32 load/store yet, but we can test both for SI.
Signed-off-by: Aaron Watry awa...@gmail.com
---
lib/Target/R600/SIInstructions.td | 5 +
test/CodeGen/R600/load.vec.ll | 19 +++
2 files changed
First patch fixes load/store for v2i32 on R600. Without this, the
other two will cause make check failures. I've verified the changes
using a Radeon 5400 (Cedar). Note that the previous custom
lowering of v2i32 store was causing silent data corruption.
The other two patches expand add/sub on SI
The custom lowering causes llc to crash with a segfault.
Ideally, the custom lowering can be fixed, but this allows
programs which load/store v2i32 to work without crashing.
Signed-off-by: Aaron Watry awa...@gmail.com
---
lib/Target/R600/R600ISelLowering.cpp | 4 ++--
Also add SI tests to existing file and a v2i32 test for both
R600 and SI.
Signed-off-by: Aaron Watry awa...@gmail.com
---
lib/Target/R600/SIISelLowering.cpp | 2 ++
test/CodeGen/R600/add.ll | 37 +++--
2 files changed, 33 insertions(+), 6 deletions
Tested on Pitcairn by: Aaron Watry awa...@gmail.com
Follow-up question: Would it be as easy as it looks to add v2i32 right away?
On Tue, Jun 18, 2013 at 6:21 PM, Tom Stellard t...@stellard.net wrote:
From: Tom Stellard thomas.stell...@amd.com
---
lib/Target/R600/AMDGPUCallingConv.td| 9
) calls that appear in
both R600ISelLowering.cpp and SIISelLowering.cpp are all moved to
AMDGPUISelLowering.cpp. If we decide to implement these ops through native
instructions for either target in the future, we can override that in the
individual targets.
Signed-off-by: Aaron Watry awa...@gmail.com
Also add lit test for both cases on SI, and v2i32 for evergreen.
Signed-off-by: Aaron Watry awa...@gmail.com
---
lib/Target/R600/SIISelLowering.cpp | 3 +++
test/CodeGen/R600/and.ll | 37 +++--
2 files changed, 34 insertions(+), 6 deletions(-)
diff
Also add lit test for both cases on SI, and v2i32 for evergreen.
Signed-off-by: Aaron Watry awa...@gmail.com
---
lib/Target/R600/SIISelLowering.cpp | 3 +++
test/CodeGen/R600/mul.ll | 38 --
2 files changed, 35 insertions(+), 6 deletions(-)
diff
Also add lit test for both cases on SI, and v2i32 for evergreen.
Signed-off-by: Aaron Watry awa...@gmail.com
---
lib/Target/R600/SIISelLowering.cpp | 3 +++
test/CodeGen/R600/or.ll| 41 +++---
2 files changed, 37 insertions(+), 7 deletions(-)
diff
Also add lit test for both cases on SI, and v2i32 for evergreen.
Signed-off-by: Aaron Watry awa...@gmail.com
---
lib/Target/R600/SIISelLowering.cpp | 3 +++
test/CodeGen/R600/shl.ll | 47 ++
2 files changed, 40 insertions(+), 10 deletions(-)
diff
Also add lit test for both cases on SI, and v2i32 for evergreen.
Signed-off-by: Aaron Watry awa...@gmail.com
---
lib/Target/R600/SIISelLowering.cpp | 2 ++
test/CodeGen/R600/srl.ll | 42 +++---
2 files changed, 37 insertions(+), 7 deletions(-)
diff
Also add lit test for both cases on SI, and v2i32 for evergreen.
Signed-off-by: Aaron Watry awa...@gmail.com
---
lib/Target/R600/SIISelLowering.cpp | 2 ++
test/CodeGen/R600/sra.ll | 41 +++---
2 files changed, 36 insertions(+), 7 deletions(-)
diff
Also add lit test for both cases on SI, and v2i32 for evergreen.
Note: I followed the guidance of the v4i32 EG check... UDIV produces really
complex code, so let's just check that the instruction was lowered
successfully.
Signed-off-by: Aaron Watry awa...@gmail.com
---
lib/Target/R600
Also add lit test for both cases on SI, and v2i32 for evergreen.
Note: I followed the guidance of the v4i32 EG check... UREM produces really
complex code, so let's just check that the instruction was lowered
successfully.
Signed-off-by: Aaron Watry awa...@gmail.com
---
lib/Target/R600
No test/expansion for SI has been added yet. Attempts to expand this
operation for SI resulted in a stacktrace in (IIRC) LegalizeIntegerTypes
which was complaining about vector comparisons being required to return
a vector type.
Signed-off-by: Aaron Watry awa...@gmail.com
---
test/CodeGen/R600
Add test cases for both vector sizes on SI and also add v2i32 test for EG.
Signed-off-by: Aaron Watry awa...@gmail.com
---
lib/Target/R600/SIISelLowering.cpp | 3 +++
test/CodeGen/R600/xor.ll | 40 +++---
2 files changed, 36 insertions(+), 7 deletions
() == N-getOperand(0).getValueType().isVector()
Vector compare must return a vector result!' failed.
Signed-off-by: Aaron Watry awa...@gmail.com
---
test/CodeGen/R600/vselect.ll | 26 --
1 file changed, 20 insertions(+), 6 deletions(-)
diff --git a/test/CodeGen/R600
By default, we expand these operations for both EG and SI. Move the
duplicated code into a common space for now. If the targets ever actually
implement these operations as instructions, we can override that in the relevant
target.
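
Conceptually, "expanding" a vector operation means performing the scalar operation once per component when no native vector instruction is used. A plain C picture of that for a two-component integer add, with hypothetical names and no LLVM API involved:

```c
/* Hypothetical stand-in for the v2i32 vector type. */
typedef struct { int x, y; } v2i32_model;

/* The scalar operation that is assumed to exist on the target. */
static int add_i32(int a, int b) { return a + b; }

/* "Expanded" vector add: one scalar add per component. */
static v2i32_model add_v2i32_expanded(v2i32_model a, v2i32_model b)
{
    v2i32_model r = { add_i32(a.x, b.x), add_i32(a.y, b.y) };
    return r;
}
```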
Signed-off-by: Aaron Watry awa...@gmail.com
---
lib/Target/R600
Add some constant load v2i32/v4i32 tests for both EG and SI.
Tested on: Pitcairn (7850) and Cedar (54xx)
Signed-off-by: Aaron Watry awa...@gmail.com
---
lib/Target/R600/R600Instructions.td | 3 +++
lib/Target/R600/SIInstructions.td | 10 ++
test/CodeGen/R600/load.vec.ll | 27
I moved it to the top of the file, if that's ok... although I guess I
could leave it at the bottom if you want..
--Aaron
On Fri, Jun 21, 2013 at 9:05 PM, Tom Stellard t...@stellard.net wrote:
On Thu, Jun 20, 2013 at 06:43:42PM -0500, Aaron Watry wrote:
Also add lit test for both cases on SI
Tested-By: Aaron Watry awa...@gmail.com
Tested on an A6-3500 (SUMO)
On Tue, Jun 18, 2013 at 11:54 AM, Tom Stellard t...@stellard.net wrote:
From: Tom Stellard thomas.stell...@amd.com
---
lib/Target/R600/R600Instructions.td | 9 +
test/CodeGen/R600/load.ll | 1 +
2 files
PathV1.h has been removed. In theory this can go back before llvm 3.4, but I
haven't done the research to find out how far back.
Signed-off-by: Aaron Watry awa...@gmail.com
---
src/gallium/state_trackers/clover/llvm/invocation.cpp | 12
1 file changed, 12 insertions(+)
diff --git
Disregard this patch... Looks like Tom already pushed a fix last night.
--Aaron
On Fri, Jun 28, 2013 at 9:41 AM, Aaron Watry awa...@gmail.com wrote:
PathV1.h has been removed. In theory this can go back before llvm 3.4, but I
haven't done the research to find out how far back.
Signed-off
Looks good to me.
I've tested on Cedar (HD5400) with no OpenCL regressions, but cannot
test on SI because SETCC still causes issues (see
https://bugs.freedesktop.org/show_bug.cgi?id=66175). Once SETCC is
fixed for SI, we should probably add SI-CHECK lines to vselect.ll
--Aaron
On Tue, Jul 16,
, Tom Stellard t...@stellard.net wrote:
Hi,
The attached three patches along with this one should fix VSELECT on SI
as well.
-Tom
On Tue, Jul 16, 2013 at 05:12:40PM -0500, Aaron Watry wrote:
Looks good to me.
I've tested on Cedar (HD5400) with no OpenCL regressions, but cannot
test on SI
On Tue, Jul 9, 2013 at 11:21 PM, Tom Stellard t...@stellard.net wrote:
From: Tom Stellard thomas.stell...@amd.com
v2:
- Extend to target size rather than aligned size
- Support for big-endian
---
src/gallium/state_trackers/clover/core/kernel.cpp | 58
--
It'll take me a while to attempt to parse everything that's going on
in these patches (and your resource descriptor types series that this
depends on), but I have sent it all through a piglit run on Evergreen
(Cedar). Everything was latest Mesa/LLVM/libclc upstream code as of
today.
Baseline:
.
And as you said, the descriptors series fixed compute hangs for the
7850 on quite a few kernels which did comparison operations (max/clamp
kernels mostly, maybe some min).
You can definitely get a tested-by for both the descriptors series and this:
Tested-by: Aaron Watry awa...@gmail.com
Quite a few
' on module 'radeon'.
1.Running pass 'AMDGPU DAG-DAG Pattern Instruction Selection' on
function '@vp8_loop_filter_all_edges_kernel'
Aborted (core dumped)
For that you get a:
Tested-By: Aaron Watry awa...@gmail.com
On Wed, Aug 21, 2013 at 1:33 PM, Tom Stellard t...@stellard.net wrote:
From: Tom
The changes look good to me... That seems to be a much more sane way
to add the item to the beginning of the linked list.
I've tested this on CEDAR (Radeon 5400) without any OpenCL
regressions, and the only piglit change was that the new piglit test
created for this bug now passes.
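
For context, the usual way to put an item at the front of a singly linked list looks like the generic C sketch below. The struct and function names are hypothetical; this is not the code from the patch under review.

```c
#include <stddef.h>

/* Hypothetical list node. */
struct node_model {
    int value;
    struct node_model *next;
};

/* Links n in front of the current head and returns the new head. */
static struct node_model *list_push_front(struct node_model *head,
                                          struct node_model *n)
{
    n->next = head;
    return n;
}
```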
--Aaron
On
6850.
http://openbenchmarking.org/result/1212102-SU-SUBALLOCT33
For my part:
Tested-by: Aaron Watry awa...@gmail.com
--Aaron Watry
u_upload_mgr suballocates memory from a large buffer and maps the allocated
range (unsynchronized), which is perfect for short-lived staging buffers.
This reduces
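
The suballocation idea mentioned above can be sketched as a simple bump allocator over one large buffer. This is a generic illustration under that assumption; the struct and function names are hypothetical and do not reflect the real u_upload_mgr interface.

```c
#include <stddef.h>

/* Hypothetical bump suballocator over one large buffer. */
struct upload_model {
    unsigned char *base;  /* start of the large buffer */
    size_t size;          /* total size in bytes */
    size_t offset;        /* first unused byte */
};

/* Hands out `bytes` of space at an offset aligned to `align` (which
 * must be a power of two), or NULL when the buffer is exhausted. */
static void *upload_alloc(struct upload_model *u, size_t bytes, size_t align)
{
    size_t start = (u->offset + align - 1) & ~(align - 1);
    if (start > u->size || bytes > u->size - start)
        return NULL;
    u->offset = start + bytes;
    return u->base + start;
}
```

Each request only advances an offset, so short-lived staging data avoids a separate buffer allocation per upload.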
diff --git a/src/gallium/drivers/r600/r600_llvm.c
b/src/gallium/drivers/r600/r600_llvm.c
index 8f1ed26..14c0205 100644
--- a/src/gallium/drivers/r600/r600_llvm.c
+++ b/src/gallium/drivers/r600/r600_llvm.c
@@ -229,11 +229,32 @@ static void llvm_emit_epilogue(struct
lp_build_tgsi_context *
--enable-texture-float --enable-opencl
--enable-r600-llvm-compiler --with-egl-platforms=x11,drm --enable-glx-tls
--Aaron Watry
From: Tom Stellard thomas.stellard at amd.com
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
This fixes multiple symbol errors in pipe-loader
---
src/gallium
of
libgallium.a, but until we can figure why linking against libgallium.la
causes runtime failures in clover we will continue to link against
libgallium.a
Tested-by: Aaron Watry awa...@gmail.com
Piglit runs CL tests again, but I still get a bunch of run-time warnings
along the lines of:
premain
: 1.00
Buffer[20]: 0.765430, Expected: -0.375000
Buffer[21]: 0.765430, Expected: -0.375000
Buffer[22]: 0.765430, Expected: 0.00
Buffer[23]: 0.765430, Expected: 1.00
PIGLIT: {'result': 'fail' }
Regards,
Aaron Watry
2012/9/24 Marek Olšák mar...@gmail.com:
Hi,
would somebody
Hi Emeric,
It doesn't affect your proposal too much, but I'd recommend changing
the order of your August tasks a bit. I would suggest trying to work
on the loop filter before the motion compensation. A few of the 720p
and 1080p videos that I profiled during my thesis work suggested that
the loop
The maximum workgroup size for a given kernel is based on the
capabilities of the device that it's being run on. Previously,
we were just returning the maximum value of a size_t which is
obviously wrong.
This patch uses the device's capabilities, but doesn't take into
account any resource usage
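
The change in behaviour described above can be pictured with a small C model. The struct, its field, and both function names are assumptions made for illustration; they are not clover's actual classes or queries.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical device description; the field name is an assumption. */
struct device_model {
    size_t max_threads_per_block;
};

/* Old behaviour as described: the largest value a size_t can hold. */
static size_t max_block_size_old(void)
{
    return SIZE_MAX;
}

/* New behaviour as described: report what the device supports, without
 * accounting for the kernel's own resource usage. */
static size_t max_block_size_new(const struct device_model *dev)
{
    return dev->max_threads_per_block;
}
```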
Reviewed-by: Aaron Watry awa...@gmail.com
I have tested this on a Radeon 5400 (Cedar), and I just sent a few
generated tests to the piglit list.
--Aaron
On Wed, Oct 23, 2013 at 6:28 PM, Tom Stellard t...@stellard.net wrote:
From: Tom Stellard thomas.stell...@amd.com
---
lib/Target/R600
I ran this through a piglit CL test run on my 7850, no test fixes or
regressions.
--Aaron
On Tue, Oct 22, 2013 at 11:28 AM, Tom Stellard t...@stellard.net wrote:
From: Tom Stellard thomas.stell...@amd.com
Without DataLayout, a lot of optimization passes aren't run and the ones
that are don't
I just ran a quick.tests run on evergreen without any regressions.
Patch looks good to me, and doesn't seem to cause any regressions on
the hardware I have available to test with.
--Aaron
On Tue, Oct 22, 2013 at 11:28 AM, Tom Stellard t...@stellard.net wrote:
From: Tom Stellard
On Tue, Oct 29, 2013 at 7:06 PM, Niels Ole Salscheider
niels_...@salscheider-online.de wrote:
Hi Tom,
this has been on my todo list for quite a while.
Your patch looks good to me, but in my experience a block with approximately
the same size for each dimension gives slightly better
Reviewed and Tested-by: Aaron Watry awa...@gmail.com
On Tue, Oct 29, 2013 at 11:48 AM, Tom Stellard t...@stellard.net wrote:
From: Tom Stellard thomas.stell...@amd.com
The ICD loader should be responsible for installing headers.
---
src/gallium/state_trackers/clover/Makefile.am | 21
dso_list was added as an argument for createInternalizePass in 3.4, and then
it was removed again in the same llvm version.
---
src/gallium/state_trackers/clover/llvm/invocation.cpp | 5 -
1 file changed, 5 deletions(-)
diff --git a/src/gallium/state_trackers/clover/llvm/invocation.cpp
On Mon, Nov 4, 2013 at 7:04 PM, Matt Turner matts...@gmail.com wrote:
On Mon, Nov 4, 2013 at 4:48 PM, Brian Paul bri...@vmware.com wrote:
---
src/mesa/main/compiler.h |7 +--
1 file changed, 1 insertion(+), 6 deletions(-)
diff --git a/src/mesa/main/compiler.h
On Wed, Nov 6, 2013 at 8:13 AM, Christian König deathsim...@vodafone.de wrote:
From: Christian König christian.koe...@amd.com
This makes VDPAU thread safe again.
Signed-off-by: Christian König christian.koe...@amd.com
---
src/gallium/auxiliary/vl/vl_mpeg12_decoder.c | 180
---
src/gallium/auxiliary/pipe-loader/pipe_loader_drm.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/src/gallium/auxiliary/pipe-loader/pipe_loader_drm.c
b/src/gallium/auxiliary/pipe-loader/pipe_loader_drm.c
index 339d7bf..927fb24 100644
---
---
src/gallium/drivers/radeonsi/radeonsi_compute.c | 16 +++-
1 file changed, 15 insertions(+), 1 deletion(-)
diff --git a/src/gallium/drivers/radeonsi/radeonsi_compute.c
b/src/gallium/drivers/radeonsi/radeonsi_compute.c
index 265dbd7..28a3f17 100644
---
uses tabs for spaces.
Do you want a v2, or are you happy with the patches assuming that I
fix the indentation?
--Aaron
-Tom
On Wed, Nov 06, 2013 at 10:36:49AM -0600, Aaron Watry wrote:
I decided to have some fun and hooked valgrind up to my 7850 while running
a few OpenCL tests in piglit
Turns out that I don't have commit access to Mesa, just piglit. Feel
free to push if they look good.
I decided to have some fun and hooked valgrind up to my SI while running
a few OpenCL tests in piglit. This is the first batch of fixes.
Aaron Watry (6):
radeon/llvm: fix spelling error
v2: Fix indentation
Reviewed-by: Tom Stellard thomas.stell...@amd.com
---
src/gallium/drivers/radeonsi/radeonsi_compute.c | 16 +++-
1 file changed, 15 insertions(+), 1 deletion(-)
diff --git a/src/gallium/drivers/radeonsi/radeonsi_compute.c
Reviewed-by: Tom Stellard thomas.stell...@amd.com
---
src/gallium/auxiliary/pipe-loader/pipe_loader_drm.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/src/gallium/auxiliary/pipe-loader/pipe_loader_drm.c
b/src/gallium/auxiliary/pipe-loader/pipe_loader_drm.c
index 339d7bf..927fb24 100644
v2: Fix indentation
Reviewed-by: Tom Stellard thomas.stell...@amd.com
---
src/gallium/drivers/radeon/radeon_llvm_emit.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/src/gallium/drivers/radeon/radeon_llvm_emit.c
b/src/gallium/drivers/radeon/radeon_llvm_emit.c
index 8bf278b..d2e5642
Reviewed-by: Tom Stellard thomas.stell...@amd.com
---
src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c
b/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c
index
radeon_llvm_compile allocates memory for binary.code, binary.config, or neither
depending on
what's being done.
We need to make sure to free that memory after it's no longer needed.
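
A cleanup of this kind can be sketched in plain C as follows. The struct and function names are hypothetical, and standard free() stands in for whatever deallocation helper the driver really uses.

```c
#include <stdlib.h>

/* Hypothetical model of a compiled binary with two optional buffers. */
struct binary_model {
    unsigned char *code;    /* may be NULL if never allocated */
    unsigned char *config;  /* may be NULL if never allocated */
};

/* Releases whichever buffers were allocated. free(NULL) is a no-op,
 * so the "neither" case needs no special handling. */
static void binary_model_release(struct binary_model *b)
{
    free(b->code);
    free(b->config);
    b->code = NULL;
    b->config = NULL;
}
```

Resetting the pointers afterwards keeps a second release call harmless.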
---
src/gallium/drivers/r600/r600_llvm.c | 7 +++
1 file changed, 7 insertions(+)
diff --git
1 - 100 of 377 matches