This adds seamless sampling for cubemap boundaries if requested.
The corner case averaging is messy but seems like it should be spec
compliant.
The face direction stuff is also a bit messy, I've no idea if that could
or should be simpler, or even if all my directions are fully correct!
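The corner handling mentioned above can be sketched as follows. This is only an illustration, not the code from the patch: it assumes that at a cube corner just three texels exist (one per adjacent face), and that the sampler averages those three. `struct rgba` and `average_corner_texels` are hypothetical names.

```c
#include <stddef.h>

/* Hypothetical RGBA texel type, for illustration only. */
struct rgba {
   float c[4];
};

/* At a cube corner only three faces meet, so a 2x2 bilinear footprint
 * has just three real texels. One approach is to average those three. */
static struct rgba
average_corner_texels(const struct rgba texels[3])
{
   struct rgba out;
   for (size_t ch = 0; ch < 4; ch++) {
      out.c[ch] = (texels[0].c[ch] + texels[1].c[ch] + texels[2].c[ch]) / 3.0f;
   }
   return out;
}
```

The result is the same regardless of which of the three faces the lookup started on, which is what makes the corner seamless.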
https://bugs.freedesktop.org/show_bug.cgi?id=58137
Priority: medium
Bug ID: 58137
Assignee: mesa-dev@lists.freedesktop.org
Summary: [r300g, r600g] corruption on 0 A.D. game with postproc
effects enabled
Severity: normal
On 2012-12-11 00:47, Ian Romanick wrote:
[ 760.187261] [drm:radeon_cs_ib_chunk] *ERROR* Invalid command
stream ! [ 760.192898] radeon :01:00.0:
evergreen_cs_track_validate_stencil:602 stencil read bo base
4148500480 not aligned with
We already have the Mesa version in the version string, isn't that enough
to detect Mesa?
---
src/mesa/state_tracker/st_cb_strings.c |5 +
1 file changed, 1 insertion(+), 4 deletions(-)
diff --git a/src/mesa/state_tracker/st_cb_strings.c
b/src/mesa/state_tracker/st_cb_strings.c
index
This will break apps that expect the current strings to tweak their behavior. Changing
it now will cause pain to app developers and ourselves, and I honestly don't
see the good of it.
For good or bad we have these strings. So I'd prefer we focused on making our
drivers rock solid so that app developers
On 11 December 2012 13:57, Marek Olšák mar...@gmail.com wrote:
We already have the Mesa version in the version string, isn't that enough
to detect Mesa?
In theory, although the vendor string would IMO be the expected place for that.
On 11.12.2012 10:52, Dave Airlie wrote:
This adds seamless sampling for cubemap boundaries if requested.
The corner case averaging is messy but seems like it should be spec
compliant.
The face direction stuff is also a bit messy, I've no idea if that could
or should be simpler, or even
Just a few minor things below.
On 12/11/2012 02:52 AM, Dave Airlie wrote:
This adds seamless sampling for cubemap boundaries if requested.
The corner case averaging is messy but seems like it should be spec
compliant.
The face direction stuff is also a bit messy, I've no idea if that could
From: Tom Stellard thomas.stell...@amd.com
Every call to _cl_program::build() was erasing the binaries and logs for
every device associated with the program. This is incorrect because
it is possible to build a program for only a subset of devices and so
any device not being built should not have
From: Tom Stellard thomas.stell...@amd.com
---
src/gallium/state_trackers/clover/api/program.cpp | 7 +++-
.../state_trackers/clover/core/compiler.hpp| 12 +-
src/gallium/state_trackers/clover/core/program.cpp | 12 --
src/gallium/state_trackers/clover/core/program.hpp | 3 +-
On 12/10/2012 03:28 PM, Matt Turner wrote:
The ES 3 spec says that the minimum allowable value is 2^24-1, but the
GL 4.3 and ARB_ES3_compatibility specs require 2^32-1, so return 2^32-1.
Fixes es3conform's element_index_uint_constants test.
---
src/mesa/main/context.c |3 +++
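The two limits quoted above work out as follows. This is a sketch under assumptions, not the patch itself: the macro and function names are made up for illustration, and only the two numeric values come from the message.

```c
#include <stdint.h>

/* Smallest value the ES 3 spec lets an implementation report. */
#define ES3_MIN_MAX_ELEMENT_INDEX ((UINT64_C(1) << 24) - 1)  /* 16777215 */
/* Value required by GL 4.3 and ARB_ES3_compatibility. */
#define GL43_MAX_ELEMENT_INDEX    ((UINT64_C(1) << 32) - 1)  /* 4294967295 */

/* Hypothetical query helper: report the larger value, which satisfies
 * both the ES 3 minimum and the GL 4.3 requirement. */
static uint64_t
max_element_index(void)
{
   return GL43_MAX_ELEMENT_INDEX;
}

/* Check a reported value against the ES 3 minimum. */
static int
satisfies_es3_minimum(uint64_t value)
{
   return value >= ES3_MIN_MAX_ELEMENT_INDEX;
}
```

Since 2^32-1 is at least 2^24-1, returning the larger value keeps both specs happy.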
The Align parameter is a power of two, so 16 results in 64K
alignment. In addition, even 16-byte alignment doesn't
make any sense, so just remove it.
Signed-off-by: Christian König deathsim...@vodafone.de
---
lib/Target/AMDGPU/AMDILISelLowering.cpp |1 -
1 file changed, 1 deletion(-)
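The arithmetic behind the 64K figure, as a small sketch. It assumes, as the message says, that the parameter is an exponent rather than a byte count; `alignment_in_bytes` is a hypothetical helper, not an LLVM function.

```c
#include <stdint.h>

/* Assumption: the alignment argument is a log2 exponent, so the
 * resulting byte alignment is 1 << align_log2. */
static uint64_t
alignment_in_bytes(unsigned align_log2)
{
   return UINT64_C(1) << align_log2;
}
```

Passing 16 gives 65536 bytes (64K); a 16-byte alignment would have needed an argument of 4.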
Signed-off-by: Christian König deathsim...@vodafone.de
---
lib/Target/AMDGPU/AMDGPUMCInstLower.cpp| 10 --
lib/Target/AMDGPU/AMDGPUMCInstLower.h |5 -
.../AMDGPU/MCTargetDesc/AMDGPUAsmBackend.cpp | 10 +-
They seem to work fine.
Signed-off-by: Christian König deathsim...@vodafone.de
---
lib/Target/AMDGPU/SIInstructions.td |8
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/lib/Target/AMDGPU/SIInstructions.td
b/lib/Target/AMDGPU/SIInstructions.td
index ea8de91..008652f
Branch if we have enough instructions so that it makes sense.
Also remove branches if they don't make sense.
Signed-off-by: Christian König deathsim...@vodafone.de
---
lib/Target/AMDGPU/SILowerControlFlow.cpp | 49 ++
1 file changed, 49 insertions(+)
diff --git
Unlike SGPRs, VGPRs don't need to be aligned.
Signed-off-by: Christian König deathsim...@vodafone.de
---
lib/Target/AMDGPU/SIRegisterInfo.td | 12 ++--
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/lib/Target/AMDGPU/SIRegisterInfo.td
Could you commit and push it to master?
On Wed, 5 December 2012, 09:31:48, you wrote:
On Tue, Dec 4, 2012 at 12:50 PM, Tobias Droste tdro...@gmx.de wrote:
Anyone interested? ;-)
I would just push it, but I don't have the rights to do so.
Looks reasonable to me.
Reviewed-by: Alex
Tom Stellard t...@stellard.net writes:
From: Tom Stellard thomas.stell...@amd.com
Every call to _cl_program::build() was erasing the binaries and logs for
every device associated with the program. This is incorrect because
it is possible to build a program for only a subset of devices and
Sometimes I've got a patch for a performance optimization that's not
showing a statistically significant performance difference on reported
FPS, but still seems like a good idea because it ought to reduce time
spent in the shader. If I can see the total number of cycles spent in
the shader stage
Kenneth Graunke kenn...@whitecape.org writes:
On 12/07/2012 02:58 PM, Eric Anholt wrote:
This came from an idea by Ben Segovia. 16-wide pixel shaders are very
important for latency hiding on i965, so we want to try really hard to
get them. If scheduling an instruction makes some set of
On 12/10/2012 03:51 PM, Eric Anholt wrote:
Marek Olšák mar...@gmail.com writes:
There are 2 ways. I prefer the former:
GALLIUM_MSAA=n
__GL_FSAA_MODE=n
Tested with ETQW, which doesn't support MSAA on Linux. This is
the only way to get MSAA there.
This sounds like something that would
On 12/10/2012 02:28 PM, Matt Turner wrote:
The ES 3 spec says that the minimum allowable value is 2^24-1, but the
GL 4.3 and ARB_ES3_compatibility specs require 2^32-1, so return 2^32-1.
Fixes es3conform's element_index_uint_constants test.
---
src/mesa/main/context.c |3 +++
On Tue, Dec 11, 2012 at 11:00 AM, Ian Romanick i...@freedesktop.org wrote:
On 12/10/2012 02:28 PM, Matt Turner wrote:
The ES 3 spec says that the minimum allowable value is 2^24-1, but the
GL 4.3 and ARB_ES3_compatibility specs require 2^32-1, so return 2^32-1.
Fixes es3conform's
On 12/10/2012 04:06 PM, Jordan Justen wrote:
On Mon, Dec 10, 2012 at 2:28 PM, Matt Turner matts...@gmail.com wrote:
@@ -966,6 +973,15 @@ find_value(const char *func, GLenum pname, void **p, union
value *v)
int api;
api = ctx->API;
+ /* We index into the table_set[] list of per-API
On 12/10/2012 02:28 PM, Matt Turner wrote:
Fixes the transform_feedback2_init_defaults test from es3conform.
The ES 3 spec lists these as TRANSFORM_FEEDBACK_PAUSED and
TRANSFORM_FEEDBACK_ACTIVE.
---
src/mesa/main/get.c |8 +++-
src/mesa/main/get_hash_params.py | 10
On 12/10/2012 02:28 PM, Matt Turner wrote:
From GL/GLES/GL_CORE and GLES2 -> GL/GL_CORE/GLES2.
Yes, we really were exposing ES2_compatibility queries on ES 1.
---
src/mesa/main/get_hash_params.py | 16 ++--
1 files changed, 6 insertions(+), 10 deletions(-)
diff --git
On Tue, Dec 11, 2012 at 11:08 AM, Ian Romanick i...@freedesktop.org wrote:
On 12/10/2012 02:28 PM, Matt Turner wrote:
Fixes the transform_feedback2_init_defaults test from es3conform.
The ES 3 spec lists these as TRANSFORM_FEEDBACK_PAUSED and
TRANSFORM_FEEDBACK_ACTIVE.
---
On 12/10/2012 02:28 PM, Matt Turner wrote:
Other than the comments on 6, 11, and 12, the series is
Reviewed-by: Ian Romanick ian.d.roman...@intel.com
---
src/mesa/main/get.c | 16
src/mesa/main/get_hash_generator.py |8 +++-
On Tue, Dec 11, 2012 at 11:12 AM, Ian Romanick i...@freedesktop.org wrote:
On 12/10/2012 02:28 PM, Matt Turner wrote:
From GL/GLES/GL_CORE and GLES2 -> GL/GL_CORE/GLES2.
Yes, we really were exposing ES2_compatibility queries on ES 1.
---
src/mesa/main/get_hash_params.py | 16
On 12/11/2012 11:14 AM, Matt Turner wrote:
On Tue, Dec 11, 2012 at 11:08 AM, Ian Romanick i...@freedesktop.org wrote:
On 12/10/2012 02:28 PM, Matt Turner wrote:
Fixes the transform_feedback2_init_defaults test from es3conform.
The ES 3 spec lists these as TRANSFORM_FEEDBACK_PAUSED and
I'm not familiar enough with the existing code to feel comfortable
reviewing it, but I've run it through a full piglit test run (using
tests/all.tests w/ GL/GLX enabled) without noticing any issues.
Also, Reaction Quake 3 performance went up by ~25% as a result of this
series on my Radeon
On Tue, Dec 11, 2012 at 8:24 PM, Aaron Watry awa...@gmail.com wrote:
I'm not familiar enough with the existing code to feel comfortable reviewing
it, but I've run it through a full piglit test run (using tests/all.tests w/
GL/GLX enabled) without noticing any issues.
Also, Reaction Quake 3
On Mon, Dec 10, 2012 at 3:47 PM, Marek Olšák mar...@gmail.com wrote:
u_upload_mgr suballocates memory from a large buffer and maps the allocated
range (unsynchronized), which is perfect for short-lived staging buffers.
This reduces the number of relocations sent to the kernel.
Series looks
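The suballocation idea described in the quoted message can be sketched roughly like this. It is a hypothetical bump allocator for illustration only; the real u_upload_mgr interface is not shown in the message and is not reproduced here.

```c
#include <stddef.h>

/* Hypothetical bump suballocator over one large backing buffer. */
struct suballocator {
   size_t size;     /* total size of the backing buffer in bytes */
   size_t offset;   /* first unused byte */
};

/* Reserve `size` bytes aligned to `alignment` (must be a power of two).
 * Returns 1 and stores the start offset on success, or 0 if the
 * request does not fit (a real manager would switch to a new buffer). */
static int
suballoc(struct suballocator *s, size_t size, size_t alignment,
         size_t *out_offset)
{
   size_t start = (s->offset + alignment - 1) & ~(alignment - 1);

   if (start > s->size || size > s->size - start)
      return 0;

   *out_offset = start;
   s->offset = start + size;
   return 1;
}
```

Because every staging range lives inside the same backing buffer, many small uploads can refer to one buffer object, which is presumably where the reduction in relocations comes from.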
This is redundant since we're calling draw_bind_fragment_shader()
which already does a flush.
v2: the redundant flush in llvmpipe_set_constant_buffer() has
already been removed by commit 3427466e6dbbb8db7c1ecda6b3859ca1cc5827a3
---
src/gallium/drivers/llvmpipe/lp_state_fs.c |2 --
1 files
On 12/11/2012 11:04 AM, Ian Romanick wrote:
On 12/10/2012 04:06 PM, Jordan Justen wrote:
On Mon, Dec 10, 2012 at 2:28 PM, Matt Turner matts...@gmail.com wrote:
@@ -966,6 +973,15 @@ find_value(const char *func, GLenum pname, void
**p, union value *v)
int api;
api = ctx->API;
+ /* We
---
src/mesa/drivers/dri/i965/brw_fs.cpp | 21 +
1 file changed, 21 insertions(+)
diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index f428a83..c520364 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++
---
src/mesa/drivers/dri/i965/brw_fs.cpp |2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index c520364..62800b1 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++
The compute-to-mrf code is really twitchy, and it's hard to construct
GLSL testcases for it. This unit test is also really hard to work with
(for example, if your instruction is removed by dead code elimination,
you end up inspecting something irrelevant), but I did use it for
debugging some of
The way our visitor works, scalar expression/swizzle results that get
stored in channels other than .x will have an intermediate MOV from
their result in the .x channel to the real .y (or whatever) channel, and
similarly for vec2/vec3 results.
By knowing how to adjust DP4-type instructions for
No statistically significant performance difference on glbenchmark 2.7
(n=60). It reduces cycles spent in the vertex shader by 3.3% +/- 0.8%
(n=5), but that's only about .3% of all cycles spent according to the
fixed shader_time.
---
src/mesa/drivers/dri/i965/brw_vec4.cpp | 129
This patch series adds varying packing to Mesa, so that we can handle
varyings composed of things other than vec4's without using up extra
varying components.
For the initial implementation I've chosen a strategy that operates
exclusively at the GLSL IR level, so that it doesn't require the
This patch modifies the clip distance lowering pass so that the new
symbol it generates (glClipDistanceMESA) is added to the shader's
symbol table.
This will allow a later patch to modify the linker so that it finds
transform feedback varyings using the symbol table rather than having
to iterate
Previously, link_invalidate_variable_locations() was only called
during assign_attribute_or_color_locations() and
assign_varying_locations(). This meant that in the corner case when
there was only a vertex shader, and varyings were being captured by
transform feedback,
Previously, the linker used a value of -1 in ir_variable::location to
denote a generic input or output of the shader that had not yet been
matched up to a variable in another pipeline stage.
This patch introduces a new ir_variable field,
is_unmatched_generic_inout, for that purpose.
In future
Currently, the location of each varying is recorded in ir_variable as
a multiple of the size of a vec4. In order to pack varyings, we need
to be able to record, e.g. that a vec2 is stored in the second half of
a varying slot rather than the first half.
This patch introduces a field
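To illustrate the kind of bookkeeping described above, here is a sketch that records a slot plus a component offset within that slot. The struct and its field names are hypothetical; they are not the field the patch adds to ir_variable.

```c
/* Hypothetical record: a vec4 slot index plus the first component
 * (0..3) that the varying occupies within that slot. */
struct varying_location {
   unsigned slot;
   unsigned first_component;
};

/* Flatten to a component index, counting 4 components per slot. */
static unsigned
component_index(struct varying_location loc)
{
   return loc.slot * 4 + loc.first_component;
}

/* Inverse of component_index(). */
static struct varying_location
from_component_index(unsigned index)
{
   struct varying_location loc = { index / 4, index % 4 };
   return loc;
}
```

A vec2 stored in the second half of slot 3 would be `{ 3, 2 }` in this scheme, whereas a location that is only a multiple of vec4 cannot tell it apart from one stored in the first half.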
This patch subdivides the loop that assigns varying locations into two
phases: one phase to match up varyings between shader stages (and
assign them varying locations), and a second phase to record the
varying assignments for use by transform feedback.
This paves the way for varying packing,
This patch further subdivides the loop that assigns varying locations
into two phases: one phase to match up the varyings between shader
stages, and one phase to assign them varying locations.
In between the two phases the matched varyings are stored in a new
data structure called
This patch paves the way for varying packing by adding a sorting step
before varying assignment, which sorts the varyings into an order that
increases the likelihood of being able to find an efficient packing.
First, varyings are sorted into packing classes by considering
attributes that can't be
This lowering pass generates GLSL code that manually packs varyings
into vec4 slots, for the benefit of back-ends that don't support
packed varyings natively.
No functional change--the lowering pass is not yet used.
---
src/glsl/Makefile.sources | 1 +
src/glsl/ir_optimization.h
This patch implements varying packing within varyings that are
composed of multiple vectors of size less than 4 (e.g. arrays of
vec2's, or matrices with height less than 4).
Previously, such varyings used up a full 4-wide varying slot for each
constituent vector, meaning that some of the
This patch implements varying packing between varyings.
Previously, each varying occupied components 0 through N-1 of its
assigned varying slot, so there was no way to pack two varyings into
the same slot. For example, if the varyings were a float, a vec2, a
vec3, and another vec2, they would be
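A rough count for the example above (float, vec2, vec3, vec2), as a sketch. The helper names are made up, and the packed figure is only a lower bound that assumes any varyings may share a slot; the actual pass may be restricted by packing classes.

```c
#include <stddef.h>

/* Without packing, each varying takes a whole vec4 slot. */
static unsigned
slots_unpacked(size_t num_varyings)
{
   return (unsigned) num_varyings;
}

/* Lower bound with packing: total components rounded up to whole
 * vec4 slots. */
static unsigned
slots_packed_lower_bound(const unsigned *components, size_t num_varyings)
{
   unsigned total = 0;
   for (size_t i = 0; i < num_varyings; i++)
      total += components[i];
   return (total + 3) / 4;
}
```

For component counts 1, 2, 3 and 2 that is four slots unpacked versus two packed, e.g. float plus vec3 in one slot and the two vec2's in the other.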
https://bugs.freedesktop.org/show_bug.cgi?id=42516
Brian Paul bri...@vmware.com changed:
What|Removed |Added
Status|NEW |RESOLVED
Previously, if the client program didn't specify a stride when setting
up a vertex attribute, we used _mesa_sizeof_type() to compute the size
of the type, and multiplied it by the number of components.
This didn't work for the 2_10_10_10 formats, since _mesa_sizeof_type()
returns -1 for those
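The failure mode described above can be sketched as follows. The enum and functions are hypothetical stand-ins, not Mesa's: the only facts taken from the message are that the per-type size query returns -1 for the 2_10_10_10 formats and that the default stride was computed as size times component count. A stride of 0 is assumed to mean "not specified".

```c
/* Hypothetical type tags standing in for GL type enums. */
enum attrib_type {
   ATTRIB_FLOAT,
   ATTRIB_UNSIGNED_BYTE,
   ATTRIB_PACKED_2_10_10_10
};

/* Size in bytes of one component, or -1 for packed types that have
 * no per-component size. */
static int
sizeof_component(enum attrib_type type)
{
   switch (type) {
   case ATTRIB_FLOAT:         return 4;
   case ATTRIB_UNSIGNED_BYTE: return 1;
   default:                   return -1;
   }
}

/* Bytes occupied by one whole attribute element. */
static int
element_size(enum attrib_type type, int num_components)
{
   if (type == ATTRIB_PACKED_2_10_10_10)
      return 4;   /* 2 + 10 + 10 + 10 bits = one 32-bit word */
   return sizeof_component(type) * num_components;
}

/* Use the client's stride if given, else the tightly packed size. */
static int
effective_stride(int stride, enum attrib_type type, int num_components)
{
   return stride != 0 ? stride : element_size(type, num_components);
}
```

With the naive product, a four-component packed attribute would get a stride of -4; special-casing the packed type gives 4 bytes.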
For the initial implementation I've chosen a strategy that operates
exclusively at the GLSL IR level, so that it doesn't require the
cooperation of the driver back-ends.
Wouldn't this negatively affect performance of some GPUs?
Not sure if relevant for Mesa, but e.g. on PowerVR SGX it's