---
src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 50 +++---
1 file changed, 4 insertions(+), 46 deletions(-)
diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
index 82543d4..c3050f9 100644
---
---
src/mesa/drivers/dri/i965/brw_fs.cpp | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 9baf41c..7c86225 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++
Intended as a (partial) inverse of type_sz(). Will be useful in the
next commit and some other SIMD32 generator changes I have queued up.
---
src/mesa/drivers/dri/i965/brw_reg.h | 20
1 file changed, 20 insertions(+)
diff --git a/src/mesa/drivers/dri/i965/brw_reg.h
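A minimal sketch of what a partial inverse of type_sz() could look like, assuming it maps a byte size and a signedness flag back to an integer register type. The enum, the sketch_ names and the signature are illustrative assumptions, not the contents of brw_reg.h.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical register types, for illustration only. */
enum sketch_reg_type {
   SKETCH_TYPE_B, SKETCH_TYPE_UB,
   SKETCH_TYPE_W, SKETCH_TYPE_UW,
   SKETCH_TYPE_D, SKETCH_TYPE_UD,
   SKETCH_TYPE_Q, SKETCH_TYPE_UQ,
   SKETCH_TYPE_F
};

/* Size in bytes of a value of the given type. */
static unsigned
sketch_type_sz(enum sketch_reg_type t)
{
   switch (t) {
   case SKETCH_TYPE_B: case SKETCH_TYPE_UB: return 1;
   case SKETCH_TYPE_W: case SKETCH_TYPE_UW: return 2;
   case SKETCH_TYPE_D: case SKETCH_TYPE_UD: case SKETCH_TYPE_F: return 4;
   case SKETCH_TYPE_Q: case SKETCH_TYPE_UQ: return 8;
   }
   assert(!"unknown type");
   return 0;
}

/* Integer type of the given size.  Only a partial inverse: several
 * types share a size, so the original type cannot always be recovered. */
static enum sketch_reg_type
sketch_int_type(unsigned sz, bool is_signed)
{
   switch (sz) {
   case 1: return is_signed ? SKETCH_TYPE_B : SKETCH_TYPE_UB;
   case 2: return is_signed ? SKETCH_TYPE_W : SKETCH_TYPE_UW;
   case 4: return is_signed ? SKETCH_TYPE_D : SKETCH_TYPE_UD;
   case 8: return is_signed ? SKETCH_TYPE_Q : SKETCH_TYPE_UQ;
   default:
      assert(!"unsupported size");
      return SKETCH_TYPE_UD;
   }
}
```

The round trip size -> type -> size always holds in this sketch, while type -> size -> type does not (a 4-byte float comes back as a 4-byte integer), which is one way to read "partial".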
---
src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 46 ++
1 file changed, 10 insertions(+), 36 deletions(-)
diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
index 78ebf00..922ef1a5 100644
---
Varying pull constant loads inherit the same limitation of pre-ILK
hardware that requires expanding SIMD8 texel fetch instructions to
SIMD16, so we can deal with pull constant loads in the same way it's done
for texturing during SIMD lowering.
---
src/mesa/drivers/dri/i965/brw_fs.cpp | 27
The case where the source type of the instruction is smaller than the
immediate type could be handled by calculating the portion of the
immediate read by the instruction (assuming that the source channels
are aligned with the destination channels of the copy) and then
representing the same value
...on hardware lacking compressed Align16 support. Will allow
simplifying the generator code and fixing it for SIMD32 codegen.
---
src/mesa/drivers/dri/i965/brw_fs.cpp | 29 +
1 file changed, 29 insertions(+)
diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp
---
src/mesa/drivers/dri/i965/brw_fs.cpp | 12 +++-
1 file changed, 3 insertions(+), 9 deletions(-)
diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index bec96b1..9baf41c 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++
The benefit is we will be able to use the SIMD lowering pass to unroll
math instructions of unsupported width and then remove some cruft from
the generator.
---
src/mesa/drivers/dri/i965/brw_fs.cpp | 55 ++
src/mesa/drivers/dri/i965/brw_fs_builder.h | 47
---
src/mesa/drivers/dri/i965/brw_fs.cpp | 10 --
1 file changed, 8 insertions(+), 2 deletions(-)
diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index a677ea6..a15e15e 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++
Only per-channel LOAD_PAYLOAD instructions can be lowered, which
should cover everything that comes in from the front-end.
LOAD_PAYLOAD instructions used to construct actual message payloads
cannot be easily lowered because they contain headers and vectors of
variable type that aren't necessarily
Currently the generator code for most opcodes honours the default
access mode (which should typically be Align1 in the scalar back-end),
but generate_code() doesn't set it explicitly which means that the
access mode from a previous instruction could leak into the following
ones if you did
This was causing the scheduler to be rather optimistic about the
latency of pull constant opcodes on Gen7+. This might seem to
increase the cycle count estimate calculated by the scheduler itself
for some shaders, even though the actual cycle count should actually
be decreased.
---
Seems like this texturing opcode was missing its logical counterpart
which would prevent it from taking advantage of the SIMD lowering
infrastructure. Define it and plumb it through the back-end. At some
point we'll likely want to emit a single SAMPLEINFO message shared
among all channels
These can be easily represented in the IR as a MOV instruction with
strided source so they seem rather redundant.
---
src/mesa/drivers/dri/i965/brw_defines.h| 12
src/mesa/drivers/dri/i965/brw_fs_cse.cpp | 2 --
src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 22
This will be useful in the SIMD lowering pass.
---
src/mesa/drivers/dri/i965/brw_ir_fs.h | 28 +++-
1 file changed, 27 insertions(+), 1 deletion(-)
diff --git a/src/mesa/drivers/dri/i965/brw_ir_fs.h
b/src/mesa/drivers/dri/i965/brw_ir_fs.h
index e8f1d53..79c7eed 100644
If the source value is going to be the same for all SIMD-lowered chunks
of the instruction there should be no need to unzip the value into
multiple temporary registers one for each lowered chunk. As a side
effect this fixes SIMD lowering of instructions with a vector
immediate source. In the long
---
src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 36 ++
1 file changed, 2 insertions(+), 34 deletions(-)
diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
index 922ef1a5..82543d4 100644
---
For consistency with the Gen7 variant. I'm not doing the same to the
uniform pull constant message at this point because the non-GEN7 one
is still overloaded to be either an expression-like logical
instruction or a Gen4-specific physical send message.
---
src/mesa/drivers/dri/i965/brw_defines.h
---
src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 51 ++
1 file changed, 4 insertions(+), 47 deletions(-)
diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
index 652696d..78ebf00 100644
---
It's just a byte MOV with strided source.
---
src/mesa/drivers/dri/i965/brw_defines.h| 1 -
src/mesa/drivers/dri/i965/brw_fs.cpp | 4 +--
src/mesa/drivers/dri/i965/brw_fs.h | 2 --
src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 45 --
This change addresses a number of hardware restrictions on the source
and destination regions and other execution controls of regular
FPU-like instructions that in some cases can be avoided by reducing
the execution size of the instruction. Some of these restrictions
(e.g. the one about 3src
Which is 16 or 8 in most cases. This will make sure that 32-wide
virtual instructions get chopped up into chunks of their maximum
execution size.
---
src/mesa/drivers/dri/i965/brw_fs.cpp | 45
1 file changed, 40 insertions(+), 5 deletions(-)
diff --git
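A rough sketch of the chopping described above, assuming the execution size is a multiple of the maximum width whenever it exceeds it. The helper names are hypothetical and are not taken from the SIMD lowering pass.

```c
#include <assert.h>

/* Number of chunks a virtual instruction of width exec_size is split
 * into when the hardware can execute at most max_width channels. */
static unsigned
sketch_num_chunks(unsigned exec_size, unsigned max_width)
{
   assert(max_width > 0);
   if (exec_size <= max_width)
      return 1;   /* already narrow enough, nothing to split */
   assert(exec_size % max_width == 0);
   return exec_size / max_width;
}

/* First channel handled by chunk i of a split instruction. */
static unsigned
sketch_chunk_first_channel(unsigned i, unsigned max_width)
{
   return i * max_width;
}
```

With these assumptions a 32-wide instruction limited to 16 channels becomes two chunks starting at channels 0 and 16, and one limited to 8 channels becomes four chunks.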
---
src/mesa/drivers/dri/i965/brw_fs.cpp | 18 +-
1 file changed, 17 insertions(+), 1 deletion(-)
diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index cf2f6ac..a677ea6 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++
We shouldn't encounter these right now but if we did it wouldn't be
possible for the SIMD lowering pass to split it into multiple
instructions because of its side effects on control flow, so just
assert in order to kill the program.
---
src/mesa/drivers/dri/i965/brw_fs.cpp | 4
1 file
The purpose of this series is to improve the back-end infrastructure
so that lowering of most IR instructions that are too wide to execute
natively (which is far more common than usual in SIMD32 dispatch mode)
happens semi-automatically at the IR level.
Patches 1-6 address some issues in a few
Most of this wouldn't have worked for SIMD32 and had various
dispatch_width and compression control bugs. It's mostly dead now
with SIMD lowering of math instructions turned on in the compiler.
---
src/mesa/drivers/dri/i965/brw_fs.h | 10 ---
This will allow the SIMD lowering pass to split 32-wide varying pull
constant loads (not natively supported by the hardware) into 16-wide
instructions.
---
src/mesa/drivers/dri/i965/brw_defines.h| 1 +
src/mesa/drivers/dri/i965/brw_fs.cpp | 50 +++---
If the LOAD_PAYLOAD instruction only has header sources it's possible
for the number of registers written to be less than or equal to the
SIMD component size, in which case it would take the single-MOV path
at the bottom which would cause the channel enable masks to be applied
incorrectly to the
This teaches the SIMD lowering pass about the hardware limits on the
execution size of math instructions, which will allow simplifying the
generator code and at the same time get rid of a number of bugs in the
manual SIMD unrolling done currently that prevent SIMD32 codegen from
working.
---
---
src/mesa/drivers/dri/i965/brw_ir_fs.h | 28 +---
1 file changed, 17 insertions(+), 11 deletions(-)
diff --git a/src/mesa/drivers/dri/i965/brw_ir_fs.h
b/src/mesa/drivers/dri/i965/brw_ir_fs.h
index 3d47b0c..e8f1d53 100644
--- a/src/mesa/drivers/dri/i965/brw_ir_fs.h
+++
---
src/mesa/drivers/dri/i965/brw_fs.cpp | 12 +++-
1 file changed, 11 insertions(+), 1 deletion(-)
diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index e98c41d..3646c27 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++
Technically, this was introduced with GL 4.4. However, I believe it
was intended to be retroactive. As far as I know, AMD has never
supported primitive restart with patches, while NVidia and Intel do.
This necessitated a query which would allow applications
to figure out whether
Some hardware supports primitive restart on patch primitives, and other
hardware does not. Modern GL and ES include a query for this feature;
adding a capability bit will allow us to answer it.
As far as I know, AMD hardware does not support this feature, while
NVIDIA and Intel hardware does.
On Saturday, May 21, 2016 12:06:58 PM PDT Timothy Arceri wrote:
> We would have segfaulted in the above code if prog could be NULL.
> ---
> src/mesa/drivers/dri/i965/brw_gs.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_gs.c
Is the else correct? What if you have, e.g., a cube view of a 2d array,
starting at layer 2? Don't you want to add the min layer in no matter what?
On May 20, 2016 9:36 PM, "Kenneth Graunke" wrote:
Fixes Piglit's arb_copy_image-texview test with the Meta path disabled
(so
We would have segfaulted in the above code if prog could be NULL.
---
src/mesa/drivers/dri/i965/brw_gs.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/src/mesa/drivers/dri/i965/brw_gs.c
b/src/mesa/drivers/dri/i965/brw_gs.c
index 4dddb86..8f5dcf3 100644
---
We're dropping Meta in favor of BLORP everywhere we can.
This also fixes bugs when copying cubemaps to 2D, which is currently
broken in the meta pass. BLORP just works.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94198
Signed-off-by: Kenneth Graunke
---
The Meta path handles this, but the CPU/BLT fallbacks did not.
Signed-off-by: Kenneth Graunke
---
src/mesa/drivers/dri/i965/intel_copy_image.c | 15 +++
1 file changed, 15 insertions(+)
diff --git a/src/mesa/drivers/dri/i965/intel_copy_image.c
The BLT can't handle S8 because it's W-tiled (at least without
additional funny business, and I'm not sure we care). Disallow
it so it falls back to the CPU path, which works.
Signed-off-by: Kenneth Graunke
---
src/mesa/drivers/dri/i965/intel_copy_image.c | 3 +++
1 file
For now, only enable it on platforms that actually support ETC2.
At this point, Broadwell is only failing 5 (out of 8358) dEQP tests:
dEQP-GLES31.functional.copy_image.non_compressed.viewclass_32_bits.
srgb8_alpha8_r11f_g11f_b10f.renderbuffer_to_texture3d
This simplifies things a little - now we only have one (tex or rb?)
if-ladder for src, and a second for dst, rather than four.
Signed-off-by: Kenneth Graunke
---
src/mesa/drivers/dri/i965/intel_copy_image.c | 33 +++-
1 file changed, 13
Fixes Piglit's arb_copy_image-texview test with the Meta path disabled
(so we hit the blitter/CPU fallback paths).
Signed-off-by: Kenneth Graunke
---
src/mesa/drivers/dri/i965/intel_copy_image.c | 4
1 file changed, 4 insertions(+)
diff --git
Currently, it only contains the BLT/CPU fallbacks, so the name is a bit
too generic. But eventually this will use BLORP as well, at which point
the name will make more sense.
The next patch will introduce a second call.
Signed-off-by: Kenneth Graunke
---
On Fri, May 20, 2016 at 4:53 PM, Jason Ekstrand wrote:
> For a long time, several of the 3-channel vertex formats didn't exist so we
> faked them with 4-channel versions. Starting with Sandy Bridge, we can use
> R16G16B16_FLOAT and 8 and 16-bit integer formats become
Patches 1-3, 4 (with comments addressed), 6-10 are:
Reviewed-by: Timothy Arceri
I've made a comment on Patch 5; it's also a shame we need to loop over
the entire NumProgramResourceList. If you are not concerned about
performance at this stage then you can add my r-b
On Fri, 2016-05-20 at 00:26 -0700, Ian Romanick wrote:
> From: Ian Romanick
>
> Fixes the following dEQP tests on SKL:
>
> dEQP-
> GLES31.functional.separate_shader.validation.varying.mismatch_qualifi
> er_vertex_smooth_fragment_flat
> dEQP-
>
Bay Trail and Haswell added a bunch of new vertex formats. There was also
the addition of 64-bit passthrough formats for BDW+.
---
src/mesa/drivers/dri/i965/brw_surface_formats.c | 32 -
1 file changed, 16 insertions(+), 16 deletions(-)
diff --git
---
src/intel/isl/isl.h | 7 +++
src/intel/isl/isl_format_layout_gen.bash | 3 ++-
2 files changed, 9 insertions(+), 1 deletion(-)
diff --git a/src/intel/isl/isl.h b/src/intel/isl/isl.h
index 71f2971..f55fb51 100644
--- a/src/intel/isl/isl.h
+++ b/src/intel/isl/isl.h
@@
---
src/intel/isl/isl.h | 1 +
src/intel/isl/isl_format_layout.csv | 1 +
2 files changed, 2 insertions(+)
diff --git a/src/intel/isl/isl.h b/src/intel/isl/isl.h
index f55fb51..f8a2f37 100644
--- a/src/intel/isl/isl.h
+++ b/src/intel/isl/isl.h
@@ -128,6 +128,7 @@ enum isl_format
This is just a copy-and-paste from brw_surface_formats.c. For the
supports_vertex_fetch function, we do a bit more work so that it properly
handles Bay Trail.
---
src/intel/isl/isl.h| 13 ++
src/intel/isl/isl_format.c | 386 +
2 files changed,
From: Nanley Chery
This format does not support alpha blending, according to the SNB PRM.
Signed-off-by: Nanley Chery
---
src/mesa/drivers/dri/i965/brw_surface_formats.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git
This little series effectively moves the surface format table from
brw_surface_formats.c into ISL. Previously, it got built into
libi965_compiler.la because we needed to share it between drivers and
didn't have a better place to put it. Now it can live in ISL where it
belongs.
When we pull it
With this, we can delete the surface format table in brw_surface_formats.c
because all of the information we need is now in ISL.
---
src/mesa/drivers/dri/i965/Makefile.sources | 3 +-
src/mesa/drivers/dri/i965/brw_context.h | 2 -
src/mesa/drivers/dri/i965/brw_state.h |
---
src/intel/vulkan/anv_formats.c | 41 +++--
1 file changed, 19 insertions(+), 22 deletions(-)
diff --git a/src/intel/vulkan/anv_formats.c b/src/intel/vulkan/anv_formats.c
index a920ab4..1bbb03a 100644
--- a/src/intel/vulkan/anv_formats.c
+++
On Fri, 2016-05-20 at 00:25 -0700, Ian Romanick wrote:
> From: Ian Romanick
>
> There's going to be a second validate_io function, and checking the
> same
> thing twice is silly.
I think we should just do this check
in _mesa_validate_program_pipeline() before we
call
On Fri, May 20, 2016 at 04:29:07PM -0700, Mark Janes wrote:
> Tom Stellard writes:
>
> > On Wed, Apr 27, 2016 at 10:33:14PM +, Youry Metlitsky wrote:
> >> ---
> >> include/GL/mesa_glinterop.h | 15 ++-
> >> 1 file changed, 14 insertions(+), 1 deletion(-)
> >>
On Fri, May 20, 2016 at 7:57 PM, Jason Ekstrand wrote:
>
>
> On Fri, May 20, 2016 at 4:41 PM, Ilia Mirkin wrote:
>>
>> On Fri, May 20, 2016 at 7:53 PM, Jason Ekstrand
>> wrote:
>> > Cc: "11.1 11.2"
On Fri, May 20, 2016 at 4:41 PM, Ilia Mirkin wrote:
> On Fri, May 20, 2016 at 7:53 PM, Jason Ekstrand
> wrote:
> > Cc: "11.1 11.2"
> > ---
> > src/mesa/drivers/dri/i965/brw_context.h | 1 +
> >
On Fri, May 20, 2016 at 7:53 PM, Jason Ekstrand wrote:
> Cc: "11.1 11.2"
> ---
> src/mesa/drivers/dri/i965/brw_context.h | 1 +
> src/mesa/drivers/dri/i965/brw_draw.c| 4 +++-
> src/mesa/drivers/dri/i965/brw_draw_upload.c | 2
Previously, we were using the size of the whole BO which may be
substantially larger than the actual index buffer size.
---
src/mesa/drivers/dri/i965/brw_context.h | 1 +
src/mesa/drivers/dri/i965/brw_draw_upload.c | 8 ++--
src/mesa/drivers/dri/i965/gen8_draw_upload.c | 2 +-
3 files
---
docs/GL3.txt | 4 ++--
src/mesa/drivers/dri/i965/intel_extensions.c | 5 +
2 files changed, 7 insertions(+), 2 deletions(-)
diff --git a/docs/GL3.txt b/docs/GL3.txt
index 7e86f5e..c8952a1 100644
--- a/docs/GL3.txt
+++ b/docs/GL3.txt
@@ -177,7 +177,7 @@ GL
Reviewed-by: Ian Romanick
---
src/compiler/glsl/linker.cpp | 7 +--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/src/compiler/glsl/linker.cpp b/src/compiler/glsl/linker.cpp
index de56945..c7a7c63 100644
--- a/src/compiler/glsl/linker.cpp
+++
---
src/mesa/main/extensions_table.h | 1 +
src/mesa/main/mtypes.h | 1 +
2 files changed, 2 insertions(+)
diff --git a/src/mesa/main/extensions_table.h b/src/mesa/main/extensions_table.h
index 471b19f..8bf7163 100644
--- a/src/mesa/main/extensions_table.h
+++
---
src/compiler/nir/nir_lower_samplers.c | 9 ++---
1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/src/compiler/nir/nir_lower_samplers.c
b/src/compiler/nir/nir_lower_samplers.c
index 0de9eb8..4a43269 100644
--- a/src/compiler/nir/nir_lower_samplers.c
+++
Right now, we're setting the range to [0, 0] which is obviously bogus.
Instead, we should set it to be invalid like we do for DrawIndirect.
Cc: "11.1 11.2"
Reviewed-by: Marek Olšák
Reviewed-by: Iago Toral Quiroga
---
Previously, we only handled the "I don't know what's going on" case for
things with InstanceDivisor == 0. However, in the DrawIndirect case we can
get num_instances == 0 and we don't know what's going on with the instanced
ones either. This commit makes the worst-case bound the default and then
The old code always divided rounded down and then subtracted 1. What we
wanted was to divide rounded up and then subtract 1 which is equivalent to
subtracting 1 and then dividing rounded down.
Cc: "11.1 11.2"
---
src/mesa/drivers/dri/i965/brw_draw_upload.c |
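The rounding identity in the message above can be checked with a small sketch. The function names are made up for illustration, and the identity is only claimed for n >= 1 and d >= 1 with unsigned arithmetic.

```c
#include <assert.h>

/* Divide rounding up; requires d > 0 and n + d - 1 not overflowing. */
static unsigned
sketch_div_round_up(unsigned n, unsigned d)
{
   assert(d > 0);
   return (n + d - 1) / d;
}

/* What the message describes as the old behaviour:
 * divide rounding down, then subtract 1. */
static unsigned
sketch_old_bound(unsigned n, unsigned d)
{
   assert(d > 0 && n >= d);
   return n / d - 1;
}

/* What was wanted: divide rounding up, then subtract 1. */
static unsigned
sketch_wanted_bound(unsigned n, unsigned d)
{
   assert(d > 0 && n >= 1);
   return sketch_div_round_up(n, d) - 1;
}

/* The equivalent form: subtract 1, then divide rounding down. */
static unsigned
sketch_equivalent_bound(unsigned n, unsigned d)
{
   assert(d > 0 && n >= 1);
   return (n - 1) / d;
}
```

For n = 5 and d = 2 the old form yields 1 while both the wanted and the equivalent forms yield 2; when n is an exact multiple of d all three agree.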
The previous code got the BO the first time we encountered it. However,
this can potentially lead to problems if the BO is used for multiple arrays
with the same buffer object because the range we declare as busy may not be
quite right. By delaying the call to intel_bufferobj_buffer, we can
The vbo layer passes an index_bounds_valid flag that we should be using
instead. This also fixes a bug when min_index == -1 and basevertex != 0
where we were actually comparing min_index + basevertex == -1 which was
false and we were getting the wrong buffer-sizing path.
Cc: "11.1 11.2"
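A small illustration of the comparison bug described above, assuming -1 was used as an "unknown bounds" sentinel and that the check ran after basevertex had been added. Both helpers are hypothetical and only mirror the wording of the message.

```c
#include <stdbool.h>

/* Sentinel-style check as described: it only fires when the sum is -1,
 * so a non-zero basevertex hides the sentinel. */
static bool
sketch_bounds_unknown_by_sentinel(int min_index, int basevertex)
{
   return min_index + basevertex == -1;
}

/* Flag-style check: relies on an explicit validity flag instead of
 * encoding "unknown" in the index value itself. */
static bool
sketch_bounds_unknown_by_flag(bool index_bounds_valid)
{
   return !index_bounds_valid;
}
```

With min_index == -1 and basevertex == 4 the sentinel check compares 3 against -1 and reports the bounds as known, whereas the flag check is unaffected by basevertex.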
This prevents array overflow when the block is actually an array of UBOs or
SSBOs. On some hardware such as i965, such overflows can cause GPU hangs.
Reviewed-by: Ian Romanick
---
src/compiler/glsl/ir_optimization.h | 2 +-
src/compiler/glsl/linker.cpp
For a long time, several of the 3-channel vertex formats didn't exist so we
faked them with 4-channel versions. Starting with Sandy Bridge, we can use
R16G16B16_FLOAT and 8 and 16-bit integer formats become available on
Haswell and Bay Trail.
---
src/mesa/drivers/dri/i965/brw_draw_upload.c | 48
Previously, we were using the size of the BO which may be substantially
larger than the actual vertex buffer size.
---
src/mesa/drivers/dri/i965/brw_context.h | 1 +
src/mesa/drivers/dri/i965/brw_draw_upload.c | 16 +++-
src/mesa/drivers/dri/i965/gen8_draw_upload.c | 2 +-
3
Cc: "11.1 11.2"
---
src/mesa/drivers/dri/i965/brw_context.h | 1 +
src/mesa/drivers/dri/i965/brw_draw.c| 4 +++-
src/mesa/drivers/dri/i965/brw_draw_upload.c | 2 +-
3 files changed, 5 insertions(+), 2 deletions(-)
diff --git
Right now, we're just setting the range to [0, MAX_UINT32] which, while
correct, isn't helpful. With DrawIndirect, you can't really know what the
actual range is so we may as well flag it as being an invalid range. This
is what we do for draws with index buffer which is similar (the indices
Tom Stellard writes:
> On Wed, Apr 27, 2016 at 10:33:14PM +, Youry Metlitsky wrote:
>> ---
>> include/GL/mesa_glinterop.h | 15 ++-
>> 1 file changed, 14 insertions(+), 1 deletion(-)
>>
>
> Hi,
>
> This patch breaks the build for me:
>
> glxcmds.c:2699:1:
On 11 May 2016 at 07:51, Rhys Kidd wrote:
> On 10 May 2016 at 16:04, Elie TOURNIER wrote:
>
>> ---
>> doxygen/doxy.bat | 7 +++
>> 1 file changed, 7 insertions(+)
>>
>> diff --git a/doxygen/doxy.bat b/doxygen/doxy.bat
>> index e566ca3..408964e
On Fri, May 20, 2016 at 1:49 PM, Dave Airlie wrote:
> From: Dave Airlie
>
> For cull distance GLSL will let unsized unused arrays get
> into the backend, we should nuke those straight away, to
> save caring about them later.
>
> This fixes:
>
On Wed, Apr 27, 2016 at 10:33:14PM +, Youry Metlitsky wrote:
> ---
> include/GL/mesa_glinterop.h | 15 ++-
> 1 file changed, 14 insertions(+), 1 deletion(-)
>
Hi,
This patch breaks the build for me:
glxcmds.c:2699:1: error: no previous prototype for
On Fri, May 20, 2016 at 2:36 PM, Jason Ekstrand
wrote:
> On Fri, May 20, 2016 at 1:49 PM, Dave Airlie wrote:
>
>> From: Dave Airlie
>>
>> For cull distance GLSL will let unsized unused arrays get
>> into the backend, we should nuke
Sorry for the delayed response. I was on a vacation for some time.
I wouldn't change it if the expected behavior in cts tests (listed in commit
49c7105) matches the language in OpenGL 4.3+ and with mesa behavior.
You might want to find out what Nvidia's proprietary driver does. You
can get the
Series Reviewed-by: Jordan Justen
On 2016-05-09 15:18:21, Kenneth Graunke wrote:
> This way, the driver's EndTransformFeedback() hook can tell whether the
> transform feedback operation was paused. It's also convenient to have
> Paused remain false until the driver's
On Fri, May 20, 2016 at 1:49 PM, Dave Airlie wrote:
> From: Dave Airlie
>
> For cull distance GLSL will let unsized unused arrays get
> into the backend, we should nuke those straight away, to
> save caring about them later.
>
> This fixes:
>
On Sun, May 8, 2016 at 7:26 PM, Ilia Mirkin wrote:
> Signed-off-by: Ilia Mirkin
> ---
>
> This is pretty academic since no hw supports these formats, but since core
> support for these has landed, might as well extend the view logic.
>
>
From: Dave Airlie
For cull distance GLSL will let unsized unused arrays get
into the backend, we should nuke those straight away, to
save caring about them later.
This fixes:
arb_separate_shader_objects/linker/large-number-of-unused-varyings
as a side effect (even without
Unfortunately, you're right. I'll need to come up with a more complete fix.
Don't submit that knob change.
On 5/20/16, 2:08 PM, "mesa-dev on behalf of Rowley, Timothy O"
wrote:
>Bruce, is cut-aware needed
This is in contrast to emitting it directly in vkCmdPipelineBarrier. This
has a couple of advantages. First, it means that no matter how many
vkCmdPipelineBarrier calls the application strings together it gets one or
two PIPE_CONTROLs. Second, it allows us to better track when we need to do
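The deferral idea can be sketched as accumulating flush bits and emitting them lazily. The bit names, the struct and the counter standing in for an emitted PIPE_CONTROL are all assumptions made for illustration, not the driver's actual state; the sketch also emits a single command where the message allows for one or two.

```c
#include <stdint.h>

/* Hypothetical flush/invalidate bits. */
enum {
   SKETCH_PIPE_CS_STALL       = 1u << 0,
   SKETCH_PIPE_RT_FLUSH       = 1u << 1,
   SKETCH_PIPE_TEX_INVALIDATE = 1u << 2
};

/* Hypothetical per-command-buffer state. */
struct sketch_cmd_state {
   uint32_t pending_pipe_bits;      /* work requested but not yet emitted */
   unsigned pipe_controls_emitted;  /* stand-in for real command emission */
};

/* Called for each barrier: only records what is needed. */
static void
sketch_record_barrier(struct sketch_cmd_state *s, uint32_t bits)
{
   s->pending_pipe_bits |= bits;
}

/* Called before the next draw or dispatch: emits at most once. */
static void
sketch_apply_pending_flushes(struct sketch_cmd_state *s)
{
   if (s->pending_pipe_bits == 0)
      return;
   s->pipe_controls_emitted++;
   s->pending_pipe_bits = 0;
}
```

Three barriers recorded back to back followed by one apply step produce a single emission in this sketch, and a second apply step with nothing pending emits nothing.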
---
src/intel/vulkan/anv_meta_clear.c | 5 +
src/intel/vulkan/anv_private.h | 1 +
src/intel/vulkan/genX_cmd_buffer.c | 1 +
3 files changed, 3 insertions(+), 4 deletions(-)
diff --git a/src/intel/vulkan/anv_meta_clear.c
b/src/intel/vulkan/anv_meta_clear.c
index eb4e569..87f3733 100644
Instead of blasting it out as part of the pipeline, we put it in the
command buffer and only blast it out when it's really needed. Since the
PUSH_CONSTANT_ALLOC commands aren't pipelined, they immediately cause a
stall which we would like to avoid.
---
src/intel/vulkan/anv_cmd_buffer.c | 1
This has been declared as a uint since the SNB but it's really a boolean.
---
src/intel/genxml/gen6.xml | 2 +-
src/intel/genxml/gen7.xml | 2 +-
src/intel/genxml/gen75.xml | 2 +-
src/intel/genxml/gen8.xml | 2 +-
src/intel/genxml/gen9.xml | 2 +-
5 files changed, 5 insertions(+), 5
Also, we don't actually need it for clipping because meta always colors
inside the lines and, for all other operations, the user is required to set
a scissor. Since DRAWING_RECTANGLE stalls the GPU, we want to emit it as
little as possible.
---
src/intel/vulkan/genX_cmd_buffer.c | 13
Seems reasonable.
Reviewed-by: Jason Ekstrand
On Fri, May 20, 2016 at 9:55 AM, Nanley Chery wrote:
> From: Nanley Chery
>
> In agreement with the SNB PRM, alpha blending is a property that render
> targets may or may not
Bruce, is cut-aware needed for primitive restart?
> On May 20, 2016, at 11:58 AM, Rowley, Timothy O
> wrote:
>
> Highlights this round are a frontend performance boost and
> removal of dead code.
>
> Unfortunately the instanceID/vertexID patch combines some style
>
Iago Toral Quiroga writes:
> The previous implementation relied on the std140 alignment rules to
> avoid handling misalignment in the case where we are loading more than
> 2 double components from a vector, which requires to emit a second load
> message.
>
> This alternative
transfer_inline_write cannot be NULL and the virgl renderer doesn't support
inline writes for textures, so add the default version.
This fixes a crash in st_TexSubImage since commit fb9fe352ea41 ("st/mesa:
use transfer_inline_write for memcpy TexSubImage path").
Cc: Marek Olšák
On Fri, May 20, 2016 at 12:55 AM, Iago Toral Quiroga wrote:
> These are not supported by the fp64 spec though, so they are not really
> necessary. There are, however, other cases of opcodes that are not
> supported by fp64 and where we have provided a double-precision
>
On Fri, May 20, 2016 at 12:55 AM, Iago Toral Quiroga wrote:
> I think these are not strictly necessary since the floats in them
> should be automatically promoted to doubles when operated with
> double sources, but it makes things more explicit at least.
> ---
I would be
Reviewed-by: Matt Turner
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
---
.../drivers/swr/rasterizer/core/backend.cpp| 22 --
src/gallium/drivers/swr/rasterizer/core/backend.h | 13 -
2 files changed, 20 insertions(+), 15 deletions(-)
diff --git a/src/gallium/drivers/swr/rasterizer/core/backend.cpp