For example, where n=3 and first_component=1, this will give us
0xE (WRITEMASK_YZW).
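
A minimal sketch of the arithmetic behind that example, assuming the helper sets `n` consecutive writemask bits starting at `first_component`. The function name below is hypothetical; the actual macro added to brw_reg.h is not shown in this snippet.

```c
#include <assert.h>

/* Hypothetical helper: set n consecutive writemask bits
 * (X = bit 0, Y = bit 1, Z = bit 2, W = bit 3),
 * starting at first_component. */
unsigned
writemask_for_size_and_comp(unsigned n, unsigned first_component)
{
   assert(n >= 1 && first_component + n <= 4);
   return ((1u << n) - 1) << first_component;
}
```

With n=3 and first_component=1 this evaluates to 0x7 << 1 = 0xE, the YZW mask quoted above.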
Reviewed-by: Edward O'Callaghan
---
src/mesa/drivers/dri/i965/brw_reg.h | 6 ++
1 file changed, 6 insertions(+)
diff --git a/src/mesa/drivers/dri/i965/brw_reg.h
On Monday, July 18, 2016 10:58:31 PM PDT Ben Widawsky wrote:
> On Mon, Jul 18, 2016 at 07:08:46PM -0700, Kenneth Graunke wrote:
> > Fixes a 10-20% performance regression in OglCSDof caused by commit
> > 5a8c89038abab0184ea72664ab390ec6ca58b4d6, which made images (in the
> > image load/store sense)
V5:
- rebase on Ken's interpolation clean-ups [1]
V4:
- add vec4 backend support and enable for Gen6+
V3:
- Rewrite patch 9 (add support for packing arrays) to not add
hacks to the type_size() functions.
- Add packing support for the load_output intrinsics (patch 12)
- Add
Rather than trying to work out the total number of components
used at a location we simply treat all outputs as vec4s.
---
src/mesa/drivers/dri/i965/brw_fs.h | 1 -
src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 22 ++
src/mesa/drivers/dri/i965/brw_fs_visitor.cpp |
Reviewed-by: Edward O'Callaghan
---
src/mesa/drivers/dri/i965/brw_vec4_gs_nir.cpp | 2 ++
1 file changed, 2 insertions(+)
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_gs_nir.cpp
b/src/mesa/drivers/dri/i965/brw_vec4_gs_nir.cpp
index 9ebfb27..16d2410 100644
---
Reviewed-by: Edward O'Callaghan
---
docs/GL3.txt | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/docs/GL3.txt b/docs/GL3.txt
index 1335397..ebaf4bf 100644
--- a/docs/GL3.txt
+++ b/docs/GL3.txt
@@ -193,11 +193,11 @@ GL 4.4, GLSL 4.40:
---
src/mesa/drivers/dri/i965/brw_vec4_tes.cpp | 14 ++
1 file changed, 10 insertions(+), 4 deletions(-)
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_tes.cpp
b/src/mesa/drivers/dri/i965/brw_vec4_tes.cpp
index 6639c86..8266a9d 100644
---
Reviewed-by: Edward O'Callaghan
---
src/mesa/drivers/dri/i965/brw_vec4_tcs.cpp | 8 ++--
src/mesa/drivers/dri/i965/brw_vec4_tcs.h | 1 +
2 files changed, 7 insertions(+), 2 deletions(-)
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_tcs.cpp
Reviewed-by: Edward O'Callaghan
---
src/mesa/drivers/dri/i965/brw_vec4_tcs.cpp | 17 +
src/mesa/drivers/dri/i965/brw_vec4_tcs.h | 1 +
2 files changed, 14 insertions(+), 4 deletions(-)
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_tcs.cpp
Reviewed-by: Edward O'Callaghan
---
src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 2 ++
1 file changed, 2 insertions(+)
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
index f3b4528..33ad852 100644
---
We will use this for output varyings. To make component
packing simpler we will just treat all varyings as vec4s.
---
src/mesa/drivers/dri/i965/brw_fs.cpp | 13 +
src/mesa/drivers/dri/i965/brw_shader.h | 1 +
2 files changed, 14 insertions(+)
diff --git
Am 19.07.2016 um 00:43 schrieb Boyuan Zhang:
Add necessary functions/changes for VAAPI encoding to buffer and picture. These
changes will allow the driver to handle all VAAPI encode related operations. This
patch doesn't change the VAAPI decode behaviour.
Signed-off-by: Boyuan Zhang
Hi,
sorry for being late but this patch doesn't mention that all those
symbols should be exported in libGL.so too [1].
If you look at the history of static_data.py it was mentioned that
this list of functions should never grow [2].
Thanks,
Andreas
[1]
Reviewed-by: Edward O'Callaghan
---
src/mesa/drivers/dri/i965/intel_extensions.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/src/mesa/drivers/dri/i965/intel_extensions.c
b/src/mesa/drivers/dri/i965/intel_extensions.c
index c557137..ec89094 100644
---
---
src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 38 +++-
1 file changed, 33 insertions(+), 5 deletions(-)
diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
index 395594f..e75e7f7 100644
---
This will be used to swizzle components to the beginning or end
of the vector based on the component layout qualifier and whether
we are doing a load or store.
Reviewed-by: Edward O'Callaghan
---
src/mesa/drivers/dri/i965/brw_reg.h | 3 +++
1 file changed, 3
Reviewed-by: Edward O'Callaghan
---
src/mesa/drivers/dri/i965/brw_vec4_tcs.cpp | 7 +++
1 file changed, 7 insertions(+)
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_tcs.cpp
b/src/mesa/drivers/dri/i965/brw_vec4_tcs.cpp
index 8bd150a..4bc3be7 100644
---
This makes sure we give the correct driver location
for doubles when using component packing.
---
src/compiler/nir/nir_lower_io.c | 16
1 file changed, 16 insertions(+)
diff --git a/src/compiler/nir/nir_lower_io.c b/src/compiler/nir/nir_lower_io.c
index e480264..7a72e69 100644
Here we create a new output_generic_reg array with the ability to
store the dst_reg for each component of user defined varyings.
This is needed as the previous code only stored the dst_reg based
on the varying location which meant packed varyings would overwrite
each other.
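
A rough sketch of the idea, assuming one stored register per (location, component) pair. All names and the array bound here are hypothetical stand-ins, not the patch's actual declarations.

```c
#include <assert.h>

#define MAX_GENERIC_VARYINGS 32      /* hypothetical bound for this sketch */

struct dst_reg_sketch { int nr; };   /* stand-in for the backend's dst_reg */

struct outputs_sketch {
   /* One entry per (location, component) instead of one per location,
    * so two varyings packed into the same location keep separate regs. */
   struct dst_reg_sketch generic[MAX_GENERIC_VARYINGS][4];
};

void
store_output_reg(struct outputs_sketch *o, unsigned location,
                 unsigned component, struct dst_reg_sketch reg)
{
   assert(location < MAX_GENERIC_VARYINGS && component < 4);
   o->generic[location][component] = reg;
}
```

Storing a register for component 0 and another for component 2 of the same location leaves both retrievable, which is the overwrite problem the message describes.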
---
Am 19.07.2016 um 00:43 schrieb Boyuan Zhang:
Add an environment variable to disable interlace mode. At the VAAPI decoding stage,
the driver cannot distinguish between the pure decoding case and the transcoding case.
Since interlaced encoding is not supported, we have to disable interlace for the
transcoding case.
Am 19.07.2016 um 00:43 schrieb Boyuan Zhang:
VAAPI passes PIPE_VIDEO_ENTRYPOINT_ENCODE as entry point for encoding case. We
will save this encode entry point in config. config_id was used as profile
previously. Now, config has both profile and entrypoint field, and config_id is
used to get
Am 19.07.2016 um 00:43 schrieb Boyuan Zhang:
Add entrypoint to distinguish H.264 decode and encode. For example, in patch 5/11 when calling
"VaCreateContext", "pps" and "sps" shouldn't be allocated for H.264 encoding.
So we need to use the entry_point to determine this is H.264 decode or
This came in handy when debugging the payload setup for Tess Eval,
since it prints correct subnr for attributes that can be loaded
in the second half of a register.
---
src/mesa/drivers/dri/i965/brw_vec4.cpp | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git
---
src/mesa/drivers/dri/i965/brw_vec4_tcs.cpp | 29 +++--
1 file changed, 27 insertions(+), 2 deletions(-)
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_tcs.cpp
b/src/mesa/drivers/dri/i965/brw_vec4_tcs.cpp
index 70f81a0..cdfcefa 100644
---
We can implement them directly. Also, document other possible improvements
for future reference.
---
src/mesa/drivers/dri/i965/brw_vec4.cpp | 46 +-
1 file changed, 45 insertions(+), 1 deletion(-)
diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp
From: Samuel Iglesias Gonsálvez
Signed-off-by: Samuel Iglesias Gonsálvez
---
src/mesa/drivers/dri/i965/brw_vec4.cpp | 13 -
1 file changed, 12 insertions(+), 1 deletion(-)
diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp
---
src/mesa/drivers/dri/i965/brw_vec4.cpp | 20 +---
1 file changed, 17 insertions(+), 3 deletions(-)
diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp
b/src/mesa/drivers/dri/i965/brw_vec4.cpp
index d7fbb5d..5c7a07a 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
+++
---
src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 19 +--
1 file changed, 17 insertions(+), 2 deletions(-)
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
index 441a450..40ba648 100644
---
The tessellation evaluation stage generates source regions with a vstride=0
for these so they hit the gen7 hardware decompression bug. Split them to
prevent this.
---
src/mesa/drivers/dri/i965/brw_vec4.cpp | 13 -
1 file changed, 12 insertions(+), 1 deletion(-)
diff --git
Mostly the same stuff as usual: we need to shuffle the data before we
write and we need to emit two 32-bit write messages (with appropriate
32-bit writemask channels set) for a full dvec4 scratch write.
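
One way to picture the "appropriate 32-bit writemask channels", as a sketch only: it assumes each double occupies two adjacent 32-bit channels after the shuffle, so doubles X/Y go to the first message and Z/W to the second. The function names are hypothetical.

```c
#include <assert.h>

enum { MASK_X = 1, MASK_Y = 2, MASK_Z = 4, MASK_W = 8 };

/* Expand a 2-bit mask over two doubles into a 4-bit mask over the four
 * 32-bit channels they occupy: first double -> XY, second double -> ZW. */
unsigned
expand_pair_mask(unsigned two_bits)
{
   unsigned mask = 0;
   if (two_bits & 1)
      mask |= MASK_X | MASK_Y;
   if (two_bits & 2)
      mask |= MASK_Z | MASK_W;
   return mask;
}

/* Split a dvec4 writemask into the masks for the two 32-bit messages. */
void
split_dvec4_writemask(unsigned mask64, unsigned *first, unsigned *second)
{
   assert(mask64 <= 0xF);
   *first = expand_pair_mask(mask64 & 0x3);          /* doubles X, Y */
   *second = expand_pair_mask((mask64 >> 2) & 0x3);  /* doubles Z, W */
}
```

Under that assumption a 64-bit writemask of XZ (0x5) becomes XY in both messages, and a writemask of W alone (0x8) leaves the first message empty and sets ZW in the second.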
---
src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 64 ++
1 file
---
src/mesa/drivers/dri/i965/brw_vec4.cpp | 60 +-
1 file changed, 30 insertions(+), 30 deletions(-)
diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp
b/src/mesa/drivers/dri/i965/brw_vec4.cpp
index 99b30ce..d7fbb5d 100644
---
FIXME: We need to fix the case where not all the attributes fit
in the push constant buffer
---
src/mesa/drivers/dri/i965/brw_vec4_tes.cpp | 63 +++---
1 file changed, 48 insertions(+), 15 deletions(-)
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_tes.cpp
64-bit scratch reads/writes require shuffling data around so we need
to have access to the full 64-bit data. We will do the right thing
for these when we emit the messages.
---
src/mesa/drivers/dri/i965/brw_vec4.cpp | 9 +
1 file changed, 9 insertions(+)
diff --git
This way callers don't need to know about 64-bit particularities and
we reuse some code.
---
src/mesa/drivers/dri/i965/brw_vec4.cpp | 22 ++-
src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 81 ++
2 files changed, 50 insertions(+), 53 deletions(-)
diff --git
Use a width of 2 with 64-bit attributes. Also, if we have a dvec
split across two registers such that components XY are stored in
the second half of a register and components ZW are stored in the
first half of the next register, fix up the regioning parameters
for channels ZW.
---
These can happen, for example, in tessellation evaluation when it maps
incoming attributes to FIXED_GRF registers. In this case, just as with
VGRFs, we need to make sure we have vstride=0 for these to work.
---
src/mesa/drivers/dri/i965/brw_vec4.cpp | 3 +--
1 file changed, 1 insertion(+), 2
From: Samuel Iglesias Gonsálvez
Signed-off-by: Samuel Iglesias Gonsálvez
---
src/mesa/drivers/dri/i965/brw_vec4_gs_nir.cpp | 43 +--
1 file changed, 28 insertions(+), 15 deletions(-)
diff --git
We need to shuffle the data before it is written to the URB. Also,
dvec3/4 need two vec4 slots.
---
src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 19 +++
1 file changed, 15 insertions(+), 4 deletions(-)
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
There is a single bit for this, so it is a binary 0 or 1 meaning
offset 0B or 16B respectively.
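
As a tiny illustration of that decoding (the snippet does not name the field, and the function name is hypothetical):

```c
#include <assert.h>

/* The field is one bit wide: 0 selects byte offset 0,
 * 1 selects byte offset 16. */
unsigned
offset_bit_to_bytes(unsigned bit)
{
   assert(bit <= 1);
   return bit * 16;
}
```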
---
src/mesa/drivers/dri/i965/brw_disasm.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/src/mesa/drivers/dri/i965/brw_disasm.c
b/src/mesa/drivers/dri/i965/brw_disasm.c
index
---
src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
index 265bb17..ae8704a 100644
---
---
src/mesa/drivers/dri/i965/brw_vec4_cse.cpp | 30 --
1 file changed, 24 insertions(+), 6 deletions(-)
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_cse.cpp
b/src/mesa/drivers/dri/i965/brw_vec4_cse.cpp
index 0c1f0c3..d1bd9fa 100644
---
Signed-off-by: Józef Kucia
---
src/gallium/drivers/r600/r600_pipe.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/src/gallium/drivers/r600/r600_pipe.c
b/src/gallium/drivers/r600/r600_pipe.c
index 6bd027b..a3b6189 100644
---
On 18 July 2016 at 21:54, Matt Turner wrote:
> On Mon, Jul 11, 2016 at 10:49 AM, Matt Turner wrote:
>> According to https://llvm.org/bugs/show_bug.cgi?id=19778#c3 this code
>> was violating the spec, resulting in it failing to compile.
>>
>> Cc:
Dropping this patch. It seems I overlooked:
https://lists.freedesktop.org/archives/mesa-dev/2016-June/119616.html
On Mon, 2016-07-18 at 16:39 +0300, Andres Gomez wrote:
> subroutine variables are to be used just in the way functions are
> called. Although the spec doesn't say it explicitly, this
On Mon, 2016-07-18 at 22:16 -0700, Jason Ekstrand wrote:
> The intel_get_image_dims helper function handles some image dimension
> sanitization for us for things such as 1-D array textures. We should
> probably be using it here.
>
> Signed-off-by: Jason Ekstrand
> Cc:
AFAICS, this code is only used when USE_X86_ASM/USE_X86_64_ASM is defined. These are
never defined on Windows (we never use the assembly files on Windows,
regardless of which compiler is used), therefore there should be no impact to MSVC
or Windows builds.
Acked-by: Jose Fonseca
Hi,
Just dropped:
https://lists.freedesktop.org/archives/mesa-dev/2016-July/123485.html
I didn't realize there was already this thread open.
On Tue, 2016-06-07 at 09:59 -0700, Ian Romanick wrote:
> On 06/06/2016 10:20 PM, Dave Airlie wrote:
> > From: Dave Airlie
> >
> >
On Mon, Jul 18, 2016 at 9:11 AM, Marek Olšák wrote:
> From: Marek Olšák
>
> The goal is to do this in st_validate_state:
>    while (dirty)
>       atoms[u_bit_scan()]->update(st);
>
> That implies that atoms can't specify which flags they consume.
>
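
The quoted loop can be sketched in a self-contained form as below. This is not the state tracker's code: `bit_scan` is a local stand-in for Mesa's `u_bit_scan()`, and the context struct, the three atoms and their names are hypothetical.

```c
#include <assert.h>

/* Stand-in for u_bit_scan(): return the index of the lowest set bit
 * in *mask and clear that bit. */
int
bit_scan(unsigned *mask)
{
   assert(*mask != 0);
   int i = 0;
   while (!((*mask >> i) & 1u))
      i++;
   *mask &= ~(1u << i);
   return i;
}

struct st_sketch { int log[8]; int count; };   /* stand-in for st_context */

struct atom { void (*update)(struct st_sketch *st); };

/* Each hypothetical atom just records that it ran. */
void update_framebuffer(struct st_sketch *st) { st->log[st->count++] = 0; }
void update_blend(struct st_sketch *st)       { st->log[st->count++] = 1; }
void update_viewport(struct st_sketch *st)    { st->log[st->count++] = 2; }

const struct atom atom_framebuffer = { update_framebuffer };
const struct atom atom_blend = { update_blend };
const struct atom atom_viewport = { update_viewport };

/* Bit i of the dirty mask corresponds to atoms[i]. */
const struct atom *atoms[] = { &atom_framebuffer, &atom_blend, &atom_viewport };

void
validate_state(struct st_sketch *st, unsigned dirty)
{
   assert(dirty < (1u << 3));   /* only three atoms in this sketch */
   while (dirty)
      atoms[bit_scan(&dirty)]->update(st);
}
```

With a dirty mask of 0x5 only the first and third atoms run, in bit order; nothing in the loop looks at per-atom flags, which is the implication the message draws.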
On Tue, 2016-06-07 at 15:20 +1000, Dave Airlie wrote:
> From: Dave Airlie
>
> This fixes:
> GL45-CTS.shader_subroutine.subroutines_cannot_be_assigned_float_int_values_or_be_compared
>
> though I'm not 100% sure why this is illegal from the spec,
> but it makes us pass the
On Tue, Jul 19, 2016 at 6:54 AM, Emil Velikov wrote:
> On 19 July 2016 at 04:21, Tomasz Figa wrote:
>> On Tue, Jul 19, 2016 at 2:35 AM, Emil Velikov
>> wrote:
>>> On 18 July 2016 at 16:38, Tomasz Figa
On 18.07.2016 14:14, Marek Olšák wrote:
From: Marek Olšák
ported from Vulkan
---
src/gallium/drivers/radeonsi/si_compute.c | 8 ++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/src/gallium/drivers/radeonsi/si_compute.c
Hello
I would like to test http://patchwork.freedesktop.org/series/9987/ and
http://patchwork.freedesktop.org/series/9988/ but the mbox patches
aren't compatible with mesa-git.
Would it be possible to update 9987 and 9988 to match mesa-git?
Do 9987 and 9988 assume additional public patches that
On 18.07.2016 14:25, Marek Olšák wrote:
From: Marek Olšák
to reduce the call indirections with u_resource_vtbl.
The worst call tree you could get was:
- u_transfer_inline_write_vtbl
- u_default_transfer_inline_write
- u_transfer_map_vtbl
-
Hi
You're best replying directly to the posts on the mailing list for these.
Most folk won't know their patch series by its patchwork ID.
I think Marek posted a branch with his patches applied; it might be easier
to test that. I'm sure he'll rebase his patches after review.
Cheers
Mike
On
Patches 1, 3 & 4 are
Reviewed-by: Nicolai Hähnle
On 18.07.2016 14:14, Marek Olšák wrote:
From: Marek Olšák
This effectively removes s_waitcnt instructions after FP16 exports.
Before:
v_cvt_pkrtz_f16_f32_e32 v0, v0, v1 ; 5E000300
Patches 2 & 3:
Reviewed-by: Nicolai Hähnle
On 18.07.2016 14:25, Marek Olšák wrote:
From: Marek Olšák
There is less noise in CPU profile data now.
---
src/gallium/drivers/r600/r600_pipe.c| 2 +-
Series is
Reviewed-by: Nicolai Hähnle
On 18.07.2016 14:35, Marek Olšák wrote:
Hi,
These are small optimizations for reducing pb_cache overhead with Bioshock
Infinite.
Please review.
Marek
Signed-off-by: Samuel Pitoiset
---
.../drivers/nouveau/codegen/nv50_ir_peephole.cpp | 55 ++
1 file changed, 55 insertions(+)
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
Signed-off-by: Samuel Pitoiset
---
src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp | 8
1 file changed, 8 insertions(+)
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
Signed-off-by: Samuel Pitoiset
---
.../drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp | 34 ++
1 file changed, 34 insertions(+)
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
---
src/mesa/drivers/dri/i965/brw_vec4_tcs.cpp | 27 ---
1 file changed, 24 insertions(+), 3 deletions(-)
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_tcs.cpp
b/src/mesa/drivers/dri/i965/brw_vec4_tcs.cpp
index f61c612..70f81a0 100644
---
---
src/mesa/drivers/dri/i965/intel_extensions.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/src/mesa/drivers/dri/i965/intel_extensions.c
b/src/mesa/drivers/dri/i965/intel_extensions.c
index c557137..6ba44b8 100644
--- a/src/mesa/drivers/dri/i965/intel_extensions.c
+++
Now that we are letting some instructions through without being
fully scalarized we have to make sure that we do scalarize any
that have XY / ZW writemasks, since these don't have native support.
---
src/mesa/drivers/dri/i965/brw_vec4.cpp | 10 +-
1 file changed, 9 insertions(+), 1
In gen < 8 instructions that write more than one register need to read
more than one register too. Make sure we don't break that restriction
by copy propagating from a uniform.
---
src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp | 7 +++
1 file changed, 7 insertions(+)
diff --git
ARB_gpu_shader_fp64 was the last piece missing. Notice that some
hardware and kernel combinations do not support pipelined register
writes, which are required for some OpenGL 4.0 features, in which
case the driver won't expose 4.0.
---
src/mesa/drivers/dri/i965/intel_extensions.c | 2 ++
From: Connor Abbott
---
src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 5 -
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
index 6662a1e..1f8fa80 100644
---
It seems [0] old versions of Mako are no longer supported. Emil mentioned it
might need v0.8.0 [1] for isl_format_layout [2], although I didn't get
a confirmation that it's really the minimum.
Let's raise it to that to avoid getting other bugs.
We might lower it a bit again later if it turns out
The BDW PRM says that it is not supported, but it seems that gen7 is also
affected, since doing DepCtrl on double-float instructions leads to
GPU hangs in some cases, which is probably not surprising knowing that
this is not supported in new hardware iterations. The SKL PRMs do not
mention this
In this case we need to shuffle the 64-bit data before we write it
to memory, source from reg_offset + 1 to write components Z and W
and consider that each DF channel is twice as big.
---
src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 40 --
1 file changed, 32
---
src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 15 +--
1 file changed, 13 insertions(+), 2 deletions(-)
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
index 454ad03..6e09778 100644
---
Otherwise we end up producing code that violates the register region
restriction that says that when execsize == width and hstride != 0
the vstride can't be 0.
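
The restriction as stated can be written as a simple predicate. This is a sketch with a hypothetical function name; the arguments are plain element counts, not the hardware encodings of those fields.

```c
#include <assert.h>
#include <stdbool.h>

/* True if a source region breaks the rule quoted above:
 * when execsize == width and hstride != 0, vstride must not be 0. */
bool
region_violates_vstride_rule(unsigned exec_size, unsigned width,
                             unsigned hstride, unsigned vstride)
{
   assert(width > 0);
   return exec_size == width && hstride != 0 && vstride == 0;
}
```

A region with execsize 4, width 4, hstride 1 and vstride 0 is flagged; changing any one of the three conditions makes it pass.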
---
src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp | 11 +++
1 file changed, 11 insertions(+)
diff --git
From: Samuel Iglesias Gonsálvez
Sometimes we emit code that has subnr > 0 to select the second half
of a DF register (components Z or W). For example, the 64-bit
shuffling code does this. For that code to work properly we need to
make sure that we use a vstride=0 on
From: Samuel Iglesias Gonsálvez
This means we would copy propagate partial reads or writes and that can affect
the result.
Signed-off-by: Samuel Iglesias Gonsálvez
---
src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp | 3 +++
1 file changed,
---
src/mesa/drivers/dri/i965/brw_vec4.cpp | 9 ++---
1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp
b/src/mesa/drivers/dri/i965/brw_vec4.cpp
index e204d81..b4a22d1 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
+++
A vec4 is 16 bytes and a dvec4 is 32 bytes so for doubles we have
to multiply the reladdr by 2. The reg_offset part is in units of 16
bytes and is used to select the low/high 16-byte chunk of a full
dvec4, so we don't want to multiply that part of the address.
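
A small sketch of that address arithmetic, assuming the result is expressed in 16-byte (vec4) units and that reladdr counts array elements. The function name is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Returns the index, in 16-byte units, of the chunk being addressed.
 * Only the reladdr part is scaled for doubles, because a dvec4 element
 * spans two 16-byte chunks; reg_offset is already in 16-byte units. */
unsigned
vec4_slot_index(unsigned reladdr, unsigned reg_offset, bool is_double)
{
   /* Sketch-only assumption: for doubles reg_offset just picks the
    * low (0) or high (1) half of one dvec4. */
   assert(!is_double || reg_offset <= 1);
   unsigned scale = is_double ? 2 : 1;
   return reladdr * scale + reg_offset;
}
```

For example, the high half of dvec4 element 3 lands at 3 * 2 + 1 = 7, whereas doubling the whole sum would give 8 and skip a chunk.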
---
SIMD4x2 64bit data is stored in register space like this:
r0.0:DF x0 y0 z0 w0
r0.1:DF x1 y1 z1 w1
When we need to write data such as this to memory using 32-bit write
messages we need to shuffle it in this fashion:
r0.0:DF x0 y0 x1 y1
r0.1:DF z0 w0 z1 w1
and emit two 32-bit write messages,
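
The shuffle described above can be modelled on plain arrays. This is only a model of the data movement, not the instructions the backend emits, and the function name is hypothetical.

```c
#include <assert.h>

/* in[v][c] is double component c (0 = x .. 3 = w) of vertex v, i.e.
 *   in[0] = x0 y0 z0 w0
 *   in[1] = x1 y1 z1 w1
 * After the shuffle:
 *   out[0] = x0 y0 x1 y1
 *   out[1] = z0 w0 z1 w1 */
void
shuffle_64bit_for_32bit_write(double in[2][4], double out[2][4])
{
   assert(in != out);   /* the sketch does not support in-place use */
   for (int v = 0; v < 2; v++) {
      out[0][2 * v + 0] = in[v][0];
      out[0][2 * v + 1] = in[v][1];
      out[1][2 * v + 0] = in[v][2];
      out[1][2 * v + 1] = in[v][3];
   }
}
```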
From: Connor Abbott
v2: Also check if the instruction source target is 64-bit. (Samuel)
Signed-off-by: Samuel Iglesias Gonsálvez
---
src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp | 7 +++
1 file changed, 7 insertions(+)
diff
We need to emit two 32-bit load messages to load a full dvec4. If only
1 or 2 double components are needed dead-code-elimination will remove
the second one.
We also need to shuffle the result of the 32-bit messages to form
valid 64-bit SIMD4x2 data.
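
A sketch of the shuffle on the load side, modelled on plain arrays. It assumes the first load returns the XY doubles of both vertices and the second returns the ZW doubles; that layout is an assumption of this sketch, not something this message states, and the function name is hypothetical.

```c
#include <assert.h>

/* first  = x0 y0 x1 y1   (result of the first 32-bit load message)
 * second = z0 w0 z1 w1   (result of the second 32-bit load message)
 * out[v] = xv yv zv wv   (64-bit SIMD4x2 layout, one row per vertex) */
void
unshuffle_32bit_loads(double first[4], double second[4], double out[2][4])
{
   assert(first != second);
   for (int v = 0; v < 2; v++) {
      out[v][0] = first[2 * v + 0];
      out[v][1] = first[2 * v + 1];
      out[v][2] = second[2 * v + 0];
      out[v][3] = second[2 * v + 1];
   }
}
```

If a shader only reads components X and Y, nothing in `second` is consumed, which fits the note about dead-code elimination removing the second load.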
---
src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
Same requirements as for UBO loads.
---
src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 32 --
1 file changed, 26 insertions(+), 6 deletions(-)
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
index 172bf48..5bc1fd5
---
src/mesa/drivers/dri/i965/brw_vec4.cpp | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp
b/src/mesa/drivers/dri/i965/brw_vec4.cpp
index c55d594..8316691 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
+++
We need to consider the fact that dvec3/4 require two vec4 slots.
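
In other words, something along these lines (a sketch with a hypothetical function name, counting slots for a single vector rather than an array or matrix):

```c
#include <assert.h>
#include <stdbool.h>

/* A vec4 slot holds 16 bytes: four 32-bit components or two doubles.
 * dvec3 and dvec4 therefore spill into a second slot. */
unsigned
vec4_slots_for_vector(unsigned components, bool is_double)
{
   assert(components >= 1 && components <= 4);
   return (is_double && components > 2) ? 2 : 1;
}
```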
---
src/mesa/drivers/dri/i965/brw_vec4.cpp | 11 +--
1 file changed, 9 insertions(+), 2 deletions(-)
diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp
b/src/mesa/drivers/dri/i965/brw_vec4.cpp
index 1b190ab..95b408e
---
src/mesa/drivers/dri/i965/brw_vec4.cpp | 3 +++
1 file changed, 3 insertions(+)
diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp
b/src/mesa/drivers/dri/i965/brw_vec4.cpp
index 9400baa..a366548 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
+++
https://bugs.freedesktop.org/show_bug.cgi?id=96993
Bug ID: 96993
Summary: new gallium swr driver can not be built on Windows
Product: Mesa
Version: unspecified
Hardware: Other
OS: Windows (All)
Status: NEW
On Tue, 2016-07-19 at 13:03 +0200, Alejandro Piñeiro wrote:
> Is this the correct version of the patch? It uses nir_lower_io with 4
> parameters, while nir_lower_io on master uses 3 (and afaik, it has
> been
> using 3 for a while).
>
> FWIW, this patch doesn't apply cleanly with current master
---
src/gallium/drivers/freedreno/a2xx/disasm-a2xx.c |3 +++
1 file changed, 3 insertions(+)
diff --git a/src/gallium/drivers/freedreno/a2xx/disasm-a2xx.c
b/src/gallium/drivers/freedreno/a2xx/disasm-a2xx.c
index f00d5d4..54b3514 100644
--- a/src/gallium/drivers/freedreno/a2xx/disasm-a2xx.c
Because the meaning of the swizzles and writemasks involved is different,
so replacing the source would lead to different semantics.
---
src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp | 7 +++
1 file changed, 7 insertions(+)
diff --git
RepCtrl=1 does not work with 64-bit operands so we need to use RepCtrl=0.
In that situation, the regioning generated for the sources seems to be
equivalent to <4,4,1>:DF, so it will only work for components XY, which
means that we have to move any other swizzle to a temporary so that we can
The help string wasn't updated in cbc37f7.
Fixes: cbc37f7 ("anv: install the intel_icd.json to ${datarootdir} by
default")
Signed-off-by: Andreas Boll
---
configure.ac | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/configure.ac b/configure.ac
We only set this to true when fixing up 64bit regions and for one
specific purpose only, so check that nothing else sets this to true.
This helped me find a bug where the field was incorrectly initialized
to true in some cases.
---
src/mesa/drivers/dri/i965/brw_vec4.cpp | 3 +++
1 file changed, 3
Specifically, at least for now, we don't want to deal with the fact that
channel sizes for fp64 instructions are twice the size, so prevent
coalescing from instructions with a different type size.
Also, we should check that if we are coalescing a register from another
MOV we should be reading the
The general idea is that with 32-bit swizzles we cannot address DF
components Z/W directly, so instead we select the region that starts
at the middle of the SIMD register and use X/Y swizzles.
The above, however, has the caveat that we can't do that without
violating register region restrictions
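
The general idea in the first paragraph can be sketched as a mapping from a DF component to a register half plus a 32-bit-addressable swizzle. The 32-byte register size, the struct and the function name are assumptions made for illustration; the caveat about region restrictions is not modelled.

```c
#include <assert.h>

struct df_access_sketch {
   unsigned subnr_bytes;    /* 0 = start of register, 16 = middle */
   unsigned swizzle_comp;   /* 0 = X, 1 = Y within the selected half */
};

/* comp: 0 = X, 1 = Y, 2 = Z, 3 = W of a DF vector. */
struct df_access_sketch
df_component_access(unsigned comp)
{
   assert(comp < 4);
   struct df_access_sketch a;
   a.subnr_bytes = comp >= 2 ? 16 : 0;   /* Z/W live in the second half */
   a.swizzle_comp = comp % 2;            /* addressed there as X or Y */
   return a;
}
```

So component Z is read as X at byte offset 16, and W as Y at byte offset 16.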
Also, we use reg_offset=1 with DF uniforms when we try to access
components Z/W, so print reg_offset for them too.
---
src/mesa/drivers/dri/i965/brw_vec4.cpp | 14 +++---
1 file changed, 7 insertions(+), 7 deletions(-)
diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp
We make scalar sources in 3src instructions use subnr instead of
swizzles because they don't really use swizzles.
With doubles it is more complicated because we use vstride=0 in
more scenarios in which they don't produce scalar regions. Also
RepCtrl is not allowed with 64-bit operands, so we
---
src/mesa/drivers/dri/i965/brw_disasm.c | 8 +++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/src/mesa/drivers/dri/i965/brw_disasm.c
b/src/mesa/drivers/dri/i965/brw_disasm.c
index c8bdeab..d5e9916 100644
--- a/src/mesa/drivers/dri/i965/brw_disasm.c
+++
---
src/mesa/drivers/dri/i965/brw_vec4.cpp | 30 +++---
1 file changed, 15 insertions(+), 15 deletions(-)
diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp
b/src/mesa/drivers/dri/i965/brw_vec4.cpp
index 95b408e..68efea6 100644
---
There is a hardware bug affecting compressed double-precision bcsel
instructions in align16 mode by which they won't read predication mask
properly, leading to incorrect behavior at least in non-uniform control
flow scenarios. The bug does not affect other predicated instructions
and it does not
---
src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 6 --
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
index c9b8edf..d7c6bf4 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
+++
On Tue, Jul 19, 2016 at 12:45:54PM +0200, Andreas Boll wrote:
> The help string wasn't updated in cbc37f7.
>
> Fixes: cbc37f7 ("anv: install the intel_icd.json to ${datarootdir} by
> default")
>
> Signed-off-by: Andreas Boll
Good catch!
Reviewed-by: Eric Engestrom
On 19 July 2016 at 09:55, Andreas Boll wrote:
> Hi,
>
> sorry for being late but this patch doesn't mention that all those
> symbols should be exported in libGL.so too [1].
> If you look at the history of static_data.py it was mentioned that
> this list of functions