I tested this on my BXT by running the glsl-fs-color-matrix test (which
ends up using MAD in SIMD16) and it worked fine.
Tested-by: Neil Roberts
Reviewed-by: Neil Roberts
- Neil
Ben Widawsky writes:
> This is easily accomplished by moving simd16 3src to GEN9_FEATURES.
>
> Signed-of
It seems a bit weird to modify the the is_power_of_two functions in
format_parser.py, was that intentional? It doesn't look like those
functions are actually used anywhere so maybe we could just remove them,
although it would probably make sense to do that in a separate patch.
Regards,
- Neil
Anu
If a non-const sample number is given to interpolateAtSample it will
now generate an indirect send message with the sample ID similar to
how non-const sampler array indexing works. Previously non-const
values were ignored and instead it ended up using a constant 0 value.
Note that unlike sampler a
The edge flag data on Gen6+ is passed through the fixed function hardware as
an extra attribute. According to the PRM it must be the last valid
VERTEX_ELEMENT structure. However if the vertex ID is also used then another
extra element is added to source the VID. This made it so the vertex ID is in
When the edge flag element is enabled then the elements are slightly
reordered so that the edge flag is always the last one. This was
confusing the code to upload the 3DSTATE_VF_INSTANCING state because
that is uploaded with a separate loop which has an instruction for
each element. The indices use
When gl_VertexID or gl_InstanceID is used a 3DSTATE_VF_SGVS
instruction is sent to create a sort of element to store the generated
values. The last instruction in this chunk of code looks like it was
trying to set the instancing state for the element using the
3DSTATE_VF_INSTANCING instruction. How
This series fixes these two related bugs:
https://bugs.freedesktop.org/show_bug.cgi?id=84677
https://bugs.freedesktop.org/show_bug.cgi?id=91292
The first bug has had a fix for a while but until now I wasn't able to
figure out what to do on BDW. I had previously made a fix for the
second bug but a
Chris Forbes writes:
> Nitpicks aside, I don't think this is a great idea now that you've got
> the SKL PI working.
Can you explain why you don't think this is a good idea? Is it because
it is an optimisation for something that is not known to be a big use
case so carrying around the extra code
When the edge flag element is enabled then the elements are slightly
reordered so that the edge flag is always the last one. This was
confusing the code to upload the 3DSTATE_VF_INSTANCING state because
that is uploaded with a separate loop which has an instruction for
each element. The indices use
For centroid interpolation we can just directly use the values set up
in the shader payload instead of querying the pixel interpolator. To
do this we need to modify brw_compute_barycentric_interp_modes to
detect when interpolateAtCentroid is called.
v2: Rebase on top of changes to set the pulls ba
There is a piece of code that is trying to match expressions of the
form (mul (floor (add (abs x) 0.5) (sign x))). However the check for
the add expression wasn't checking whether it had the expected
operation. It looks like this was just an oversight because it doesn't
match the pattern for the re
In this bit of code point_five can be NULL if the expression is not a
constant. This fixes it to match the pattern of the rest of the chunk
of code so that it checks for NULLs.
Cc: Matt Turner
Cc: "10.6"
---
src/glsl/opt_algebraic.cpp | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff
On Gen9+ there is a new bit in 3DSTATE_PS_EXTRA that must be set if
the shader sends a message to the pixel interpolator. This fixes the
interpolateAt* tests on SKL, apart from interpolateatsample-nonconst
but that is not implemented anywhere so it's not a regression.
---
src/mesa/drivers/dri/i965
There was a comment saying that in SIMD16 mode the pixel interpolator
returns coords interleaved 8 channels at a time and that this requires
extra work to support. However, this interleaved format is exactly
what the PLN instruction requires so I don't think anything needs to
be done to support it
Davin McCall writes:
> I actually had thought about this, but technically, you can only use
> unions for type aliasing if you perform all accesses (that are not to
> the 'active' member) through the union. All the list processing code
> that iterates through all the nodes including the tail senti
Hi,
If we wanted to avoid growing the size of exec_list to four pointers
instead of three, maybe we could store it in a union like below:
struct exec_list {
union {
struct {
struct exec_node head_sentinel;
struct exec_node *dummy_pointer_a;
};
struct {
Ben Widawsky writes:
> I am not the right person to judge the complexity tradeoff, but it
> seems like a worthwhile patch to me. I spent a few minutes thinking
> about how it could hurt performance and was unable to come up with
> anything.
Thanks. I was thinking more that the complexity means m
For centroid interpolation we can just directly use the values set up
in the shader payload instead of querying the pixel interpolator. To
do this we need to modify brw_compute_barycentric_interp_modes to
detect when interpolateAtCentroid is called.
This fixes the interpolateAtCentroid tests on SK
Since commit 104c8fc2c2aa5621261f8 the GLSL IR will be freed if NIR is
being used. This was causing it to segfault if INTEL_DEBUG=wm is set.
This patch just makes it avoid dumping the GLSL IR in that case.
---
src/mesa/drivers/dri/i965/brw_program.c | 11 +++
1 file changed, 7 insertions(+
Ben Widawsky writes:
> I think this is beginning to infringe upon the definition of align_w.
> The total width is a function of it's miptree properties and not the
> compressed block properties, right?
>
> In other words, if there is a case where align_w != bw, I think
> total_width should be ali
Hi,
If we are going to have to prod all of the code using this list
implementation to solve this problem anyway maybe it would make more
sense to switch to a kernel-style list instead. There is already an
implementation of this in src/util/list.h. I think this style of list
only accesses the point
brw_miptree_layout_2d tries to ensure that mt->total_width is a
multiple of the compressed block size, presumably because it wouldn't
be possible to make an image that has a fraction of a block. However
it was doing this by aligning mt->total_width to align_w. Previously
align_w has been used as a
Nanley Chery writes:
> Although most of the patch is incorrect, the following change is still
> necessary isn't it?
> if (mt->compressed) {
>mip1_width = ALIGN(minify(mt->physical_width0, 1), mt->align_w) +
> - ALIGN(minify(mt->physical_width0, 2), bw);
> +
> On Fri, Jun 12, 2015 at 7:34 AM, Neil Roberts wrote:
>> The code was buggy anyway because it was checking whether the buffer
>> was busy before calling Driver->AllocTextureImageBuffer. That function
>> actually always frees the buffer and recreates a new one so it was
g
tests by hand after a reboot to see if they fail consistently.
Regards,
- Neil
Nanley Chery writes:
> Hey Neil,
>
> While this patch does fix FXT1, it also regresses 21 other Piglit tests on
> SKL.
>
> - Nanley
>
> On Thu, Jun 11, 2015 at 8:59 AM, Neil Roberts wrote:
&g
Neil Roberts writes:
> It seems a bit weird to use create_pbo=true at all for
> glTexImage{1,2,3}D because in that case we are completely replacing
> the texture. If the texture's buffer is busy instead of allocating a
> PBO we might as well just directly allocate some ne
Previously when glTexImage* is called it would attempt to create a
temporary PBO if the texture is busy in order to avoid blocking when
mapping the texture. This doesn't make much sense for glTexImage
because in that case we are completely replacing the texture anyway so
instead of allocating a PBO
Ah, good catch. Looks good to me.
Reviewed-by: Neil Roberts
It seems a bit weird to use create_pbo=true at all for
glTexImage{1,2,3}D because in that case we are completely replacing the
texture. If the texture's buffer is busy instead of allocating a PBO we
might as well just directly all
This function was trying to align the width and height to a multiple
of the block size for compressed textures. It was using align_w/h as a
shortcut to get the block size as up until Gen9 this always happens to
match. However in Gen9+ the alignment values are expressed as
multiples of the block siz
Hi Nanley,
Could you explain the reasoning behind this patch? I can't find any
mention of needing to align to the square of the block size in the docs.
I think how it works is that on Skylake you can pick any alignment value
you want out of 4, 8 or 16 but for compressed textures that is counted
i
Kenneth Graunke writes:
> _mesa_meta_fb_tex_blit_begin(ctx, &blit);
> + ctx->Extensions.ARB_texture_stencil8 = true;
Maybe you could put assert(ctx->Extensions.ARB_texture_stencil8==false)
just before setting it to true so that we'll definitely remember to
remove it if we eventually enable
Previously when setting up the sample instruction for an indirect
sampler the vec4 backend was directly passing the pseudo opcode's
src0. However vec4_visitor::visit(ir_texture *) doesn't set the
texture operation's src0 -- it's left as BAD_FILE, which when
translated into a brw_reg gives the null
Matt Turner writes:
>> I'll have another look at moving it into brw_send_indirect_message.
>
> Thanks. I'm not really sure what the right solution is, so if you
> decide this patch is good as is, that's fine with me.
Here's what the patches would look like if we made
brw_send_indirect_message lo
A freshly constructed instruction defaults to having a base_mrf of 0
which means that if nothing disables it it will default to using
send-from-MRF. Previously this didn't matter because the constant load
instructions on Gen7 were ignoring the base_mrf anyway. However in the
next patch the brw_send
Jason Ekstrand writes:
> The only place when the fact that the MRFs are virtual matters is in
> register allocation. Implied MRF writes are taken into account in
> setup_mrf_hack_interference. We figure out what MRFs are used and
> then mark them as conflicting with *all* of the VGRFs. We also
Both patches look good to me and I can confirm they make the Piglit
tests pass on Skylake.
Reviewed-by: Neil Roberts
My original assumption of the problem was that the implied writes from
the SCRATCH_WRITE instruction aren't taken into account when calculating
the liveliness of the regi
Matt Turner writes:
> I don't know why I was confused by this patch -- after arriving at the
> same conclusion independently I see that all of the analysis I needed
> was right there.
Yes sorry, I probably didn't explain it very well. Your explanation is a
lot clearer.
> To sum up, vec4_visitor
Looks good to me. Thanks for fixing this. I guess I still have more to
learn about the ISA.
However, should we not also fix the vec4 version? With that,
Reviewed-by: Neil Roberts
If we wanted to play safe and avoid the MUL, we could change it to this
and still avoid having a temporary
querying the block height anyway. The later patch about
combining the two functions would need to be changed too.
Regards,
- Neil
Neil Roberts writes:
> Looks good to me.
>
> Reviewed-by: Neil Roberts
>
> - Neil
>
> Anuj Phogat writes:
>
>> Adding Neil to Cc who
Looks good to me.
Reviewed-by: Neil Roberts
- Neil
Anuj Phogat writes:
> Adding Neil to Cc who committed 4ab8d59.
>
> Reviewed-by: Anuj Phogat
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman
Many thanks for all the reviews and testing. I've pushed the two
patches.
The remaining sampler_array_indexing tests that fail on SKL (the gs
ones) are because of a separate problem described in this patch:
http://patchwork.freedesktop.org/patch/50676/
I'm not really sure whether that's the clea
When calculating the binding table index for non-constant sampler
array indexing it needs to add the base binding table index which is a
constant within the generated code. Often this base is zero so we can
avoid a redundant instruction in that case.
It looks like nothing in shader-db is doing non
Previously when generating the send instruction for a sample
instruction with an indirect sampler it would use the destination
register as a temporary store. This breaks when used in combination
with the opt_sampler_eot optimisation because that forces the
destination to be null. This patch fixes t
Previously when setting up the sample instruction for an indirect
sampler the vec4 backend was directly passing the pseudo opcode's
src0. However this isn't actually set to a valid register because
instead the MRF registers are used as the source so it would end up
passing null as src0.
This patch
atches out there to handle this. Please ignore if
> this has already been sent by someone. (Direct me to it and I will
> review it).
>
> Cc: Matt Turner
> Cc: Neil Roberts
> Cc: Mark Janes
> Signed-off-by: Ben Widawsky
> ---
> src/mesa/drivers/dri/i965/brw_eu_compact.c |
When using SIMD4x2 on Skylake, the sampler instructions need a message
header to select the correct mode. This was added for most sample
instructions in 0ac4c2727 but the TXF_MCS instruction is emitted
separately and it was missed.
This fixes a bunch of Piglit tests which test texelFetch in a geom
Jason Ekstrand writes:
> I think *most* of that code *should* already be there. In theory,
> it's all keyed off of the block size provided by formats.csv.
> However, given some of the rendering errors we're currently seeing, it
> looks like it may need a little patching here and there. :-)
inte
This is required for the I915_PARAM_REVISION macro. Previously this
define was directly copied into the Mesa source.
---
configure.ac | 2 +-
src/mesa/drivers/dri/i965/intel_screen.c | 5 -
2 files changed, 1 insertion(+), 6 deletions(-)
diff --git a/configure.ac b
Ian Romanick writes:
>> For what it's worth, I'm strongly in favour of using these
>> kernel-style lists instead of exec_list. The kernel ones seem much
>> less confusing.
>
> Huh? They're practically identical. The only difference is the
> kernel-style lists have a single sentinel node, and that
Kenneth Graunke writes:
> It might be nice to create a brw_load_register_mem64 function, for
> symmetry with brw_store_register_mem64 - we might want to reuse it
> elsewhere someday.
Ok, that sounds sensible.
> One interesting quirk: the two halves of your register write may land
> in two separ
In order to detect whether the predicate source registers can be used
in a later patch we will need to know the version number for the
command parser. This patch just adds a member to intel_screen and does
an ioctl to get the version.
Reviewed-by: Kenneth Graunke
---
src/mesa/drivers/dri/i965/in
EMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ *
+ * Aut
I thought it might be a good idea to try posting these patches again
since it's been 6 months since they were originally posted. The
patches are a lot more useful now since the command parser in the
kernel is working correctly for Haswell. This means the functionality
is no longer restricted to onl
The opt_sampler_eot optimisation seems to break when the last
instruction is SHADER_OPCODE_TG4. A bunch of Piglit tests end up doing
this so it causes a lot of regressions. I can't find any documentation
or known workarounds to indicate that this is expected behaviour, but
considering that this is
Commit 94ee908448 added a header size parameter to the function to
create the LOAD_PAYLOAD instruction. However this broke
opt_sampler_eot which manually constructs the instruction and so
wasn't setting the header_size. This ends up making the parameters for
the send message all have the wrong loca
opt_sampler_eot enables a direct write to framebuffer from a sample.
In order to do this the sample message needs to have a message header
so if there wasn't one already then the function adds one. In addition
the function sets the destination register to null because it's no
longer used. However i
Hi,
This optimisation doesn't seem to work with textureGather so a bunch of
Piglit tests are failing for me. I'm not sure why it didn't get picked
up by your Jenkins run.
I can't find anything in the bspec nor a known workaround to suggest
that this shouldn't work so I'm not really sure what to d
Jason Ekstrand writes:
> +static inline void list_validate(struct list_head *list)
> +{
> + assert(list->next->prev == list && list->prev->next == list);
> + for (struct list_head *node = list->next; node != list; node = node->next)
> + assert(node->next->prev == node && node->prev->next
Jason Ekstrand writes:
> +static inline bool list_empty(struct list_head *list)
> +{
> + return list->next == list;
> +}
It would be good if list.h also included stdbool.h in order to get the
declaration of bool. However, will that cause problems on MSVC? Is the
Gallium code compiled on MSVC i
Jason Ekstrand writes:
> +#define list_for_each_entry(type, pos, head, member)\
> + for (type *pos = container_of((head)->next, pos, member);\
> + &pos->member != (head); \
> + pos = container_of(pos->member.next, p
Sorry for the really long delay in replying! This patch is still needed
in order to fix a number of Piglit tests so it would be good to get it
landed.
Ben Widawsky writes:
> Sorry for the delay, but I put this off initially because I wasn't
> sure which part of the docs this was addressing. I se
If a send message is emitted with a message length that is less than
required for the message then the remaining parameters default to
zero. We can take advantage of this to save a register when a shader
passes constant zeroes as the final coordinates to the sample
function.
I think this might be
The opt_sampler_eot optimisation of fs_visitor effectively assumes
that it is running on a fragment shader because it casts the program
key to a brw_wm_prog_key. However on Skylake fs_visitor can also be
used for vertex shaders. It looks like this usually works anyway
because the optimisation is sk
Kenneth Graunke writes:
> I like this idea!
>
> We definitely need to skip this optimization on Gen4, since the Gen4/G45
> sampler infers the texturing opcode based on the message length. But
> for Gen5+, it should be no problem.
Ah ok, yes, I will add this.
> Matt mentioned that we have to em
Matt Turner writes:
>> + foreach_block_and_inst(block, fs_inst, inst, cfg) {
>> + if ((inst->opcode == SHADER_OPCODE_TEX ||
>> + inst->opcode == SHADER_OPCODE_TXF) &&
>> + !inst->shadow_compare) {
>> + fs_inst *load_payload = (fs_inst *) inst->prev;
>> +
>> +
If a send message is emitted with a message length that is less than
required for the message then the remaining parameters default to
zero. We can take advantage of this to save a register when a shader
passes constant zeroes as the final coordinates to the sample
function.
I think this might be
On Gen9+ there needs to be a header when sampling using SIMD4x2. The
header is set up by copying from the g0 register. Commit 07c571a39f
tried to fix this mov instruction to always use an exec size of 8
because previously it was incorrectly using 4. It did this by casting
the type of the destinatio
Ilia Mirkin writes:
> That seems awkward... did you mean 1U? FWIW mesa's not at all careful
> about that...
Or maybe even UINT64_C(1). 1l would still be 32-bit on 32-bit
architectures. I guess this is more of a problem for subsequent flags
that go over 32-bit.
- Neil
___
Anuj Phogat writes:
> This will allow Skylake to use _mesa_meta_pbo_GetTexSubImage() for reading
> YF/YS
> tiled surfaces.
>
> V2: Make changes suggested by Neil.
>
> Signed-off-by: Anuj Phogat
> Cc: Neil Roberts
> ---
> src/mesa/drivers/common/meta.h
There were three places in the visitor that had a similar chunk of
code to emit the VS_OPCODE_PULL_CONSTANT_LOAD opcode using a register
for the offset. This patch combines the chunks into a helper function
to reduce the code duplication. It will also be useful in the next
patch to expand what happ
Commit 5a06ee738 added a step to the generator to set up the message
header when generating the VS_OPCODE_PULL_CONSTANT_LOAD_GEN7
instruction. That pseudo opcode is implemented in terms of multiple
actual opcodes, one of which writes to one of the source registers in
order to set up the message hea
Ben Widawsky writes:
>> + void generate_set_simd4x2_header_gen9(vec4_instruction *inst,
>> + struct brw_reg dst);
>
> super bikesheddy, but this function name sounds pretty bad, though I
> understand it's a side-effect of the name of the opcode (which see
Commit 5a06ee738 added a step to the generator to set up the message
header when generating the VS_OPCODE_PULL_CONSTANT_LOAD_GEN7
instruction. That pseudo opcode is implemented in terms of multiple
actual opcodes, one of which writes to one of the source registers in
order to set up the message hea
Ben Widawsky writes:
> Does this happen to allow concurrent piglit to not die in a fire?
Hrm, I just tried running piglit without -1 on my wip/skylake branch
(without this patch) and it seems to work fine! I don't know what has
fixed it but these are the three patches on that branch:
d2e8cd7 i9
Ben Widawsky writes:
> Can you do me a favor since I am lazy? Can you send the generated asm
> diff with this patch? I am admittedly rusty on the matter, but I
> thought a mov is just a mov, and only the regioning is effected by the
> retype. If it indeed does what you say, it really should fix
>
It seems a bit strange that this has stopped working for you. If you
specify a mode in the .dir-locals.el file then it's supposed to set the
variable for any files with that mode or any modes inherited from that
mode. The C and C++ modes both inherit from prog-mode, as well as a
bunch of other ones
Commit cf67ca9ffa9 made the layouting code pick a special layout for
1D images on Skylake. This should not be used for depth and stencil
buffers because these need to be treated as 2D tiled images. However
the patch was missing a check for images with a base format of
GL_STENCIL_INDEX. In practice
Sorry for the delay in replying to this.
Ben Widawsky writes:
>> > +static inline uint32_t
>> > +intel_miptree_blit_height(struct intel_mipmap_tree *mt)
>> > +{
>> > + switch (mt->target) {
>> > + case GL_TEXTURE_CUBE_MAP:
>> > + case GL_TEXTURE_1D_ARRAY:
>> > + case GL_TEXTURE_2D_ARRAY:
Ben Widawsky writes:
> diff --git a/src/mesa/drivers/dri/i965/intel_blit.h
> b/src/mesa/drivers/dri/i965/intel_blit.h
> index 52dd67c..aff2d58 100644
> --- a/src/mesa/drivers/dri/i965/intel_blit.h
> +++ b/src/mesa/drivers/dri/i965/intel_blit.h
> @@ -78,14 +78,34 @@ void intel_emit_linear_blit(st
Ben Widawsky writes:
> diff --git a/src/mesa/drivers/dri/i965/intel_blit.h
> b/src/mesa/drivers/dri/i965/intel_blit.h
> index f563939..531d329 100644
> --- a/src/mesa/drivers/dri/i965/intel_blit.h
> +++ b/src/mesa/drivers/dri/i965/intel_blit.h
> @@ -30,6 +30,9 @@
>
> #include "brw_context.h"
Looks good to me.
Reviewed-by: Neil Roberts
- Neil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Ben Widawsky writes:
> This patch actually addresses two potential issues:
> 1. As far as I can tell, drm_intel_gem_bo_alloc_tiled() which is invariably
> called, can potentially change the tiling type. Therefore, we shouldn't be
> checking the requested tiling type, but rather the granted tilin
From: Kenneth Graunke
Several steppings of Skylake fail when using SIMD16 with 3-source
instructions (such as MAD).
This implements WaDisableSIMD16On3SrcInstr and fixes ~190 Piglit
tests.
Based on a patch by Neil Roberts.
---
src/mesa/drivers/dri/i965/brw_shader.cpp | 6 ++
1 file changed
The places that were checking whether 3-source instructions are
supported have now been combined into a small helper function. This
will be used in the next patch to add an additonal restriction.
Based on a patch by Kenneth Graunke.
---
Matt's review for v1 of this patch was conditional based on
On Skylake the qpitch value is uploaded as part of the surface state
so we don't need to add the extra rows that are done for other
generations. However for 3D textures it needs to be aligned to the
tile height and for depth/stencil textures it needs to be a multiple
of 8. Unlike previous generatio
Commit 0ac4c272755c7 made it add a header for the send message when
using SIMD4x2 on Skylake because without this it will end up using
SIMD8D. However the patch missed the case when a sampler is being used
to implement constant loads from a buffer surface in a SIMD4x2 vertex
shader.
This fixes 29
The render surface state command for Skylake doesn't have the surface
array spacing bit so it's not possible to select this layout. I think
it was only used in order to make it pick a tightly-packed qpitch
value that doesn't include space for the mipmaps. However this won't
be necessary after the n
Ben Widawsky writes:
> This patch will use a new calculation to determine if a surface can be blitted
> from or to. Previously, the "total_height" member was used. Total_height in
> the
> case of 2d, 3d, and cube map arrays is the height of each slice/layer/face.
> Since the GL map APIS only eve
Ben Widawsky writes:
> On Fri, Feb 20, 2015 at 10:31:08PM +0000, Neil Roberts wrote:
>> The render surface state command for Skylake doesn't have the surface
>> array spacing bit so I don't think it's possible to select this
>> layout. This avoids a kerne
In Skylake the order of the arguments for sample messages with the LD
type are u, v, lod, r whereas previously they were u, lod, v, r.
This fixes 144 Piglit tests including ones that directly use
texelFetch and also some using the meta stencil blit path which
appears to use texelFetch in its shade
15, Emil Velikov wrote:
>> On 4 March 2015 at 17:22, Neil Roberts wrote:
>>> This reverts commit 546aba143d13ba3f993ead4cc30b2404abfc0202.
>>>
>>> I think the changes to the calls to glBlitFramebuffer from this patch
>>> are no different to what it was doing
In Skylake the order of the arguments for sample messages with the LD
type are u, v, lod, r whereas previously they were u, lod, v, r. This
fixes 82 Piglit tests using texelFetch.
---
I have a feeling this probably isn't the right way to do this patch so
maybe someone who knows the compiler better
Yes, I like this approach much better. I ran it through Piglit and I can
confirm it fixes the same tests as my patch.
Reviewed-by: Neil Roberts
There's no need to reset the author to me. Thanks for looking at this.
Regards,
- Neil
Kenneth Graunke writes:
> Several steppings of Skyl
This reverts commit 546aba143d13ba3f993ead4cc30b2404abfc0202.
I think the changes to the calls to glBlitFramebuffer from this patch
are no different to what it was doing previously because it used to
set height to 1 before doing the blits. However it was introducing
some problems with the blit for
Now that a layered source PBO is interpreted as a single tall 2D image
it's quite easy to accept the image height packing option by just
creating an image that is tall enough to include the image padding.
I'm not sure whether the image height property should affect 1D_ARRAY
textures. My intuition
The yoffset needs to be interpreted as a slice offset for 1D array
textures. This patch implements that by moving the yoffset into
zoffset similar to how it moves the height into depth.
---
src/mesa/drivers/common/meta_tex_subimage.c | 8
1 file changed, 8 insertions(+)
diff --git a/src/
Steppings C0 and D0 of Skylake fail when using SIMD16 with 3-source
instructions (such as MAD). This patch just makes it disable SIMD16 in
those cases.
This implements WaDisableSIMD16On3SrcInstr and fixes ~190 Piglit
tests.
v2: Also apply on stepping D0
---
Damien Lespiau pointed out that the wo
Ilia Mirkin writes:
> On Wed, Mar 4, 2015 at 9:33 AM, Neil Roberts wrote:
>> Stepping C0 of Skylake fails when using SIMD16 with 3-source
>> instructions (such as MAD). This patch just makes it disable SIMD16 in
>> that case.
>>
>> This implements WaDisabl
Stepping C0 of Skylake fails when using SIMD16 with 3-source
instructions (such as MAD). This patch just makes it disable SIMD16 in
that case.
This implements WaDisableSIMD16On3SrcInstr and fixes ~190 Piglit
tests.
---
src/mesa/drivers/dri/i965/brw_fs.cpp | 21 +
src/mesa/driv
brwContextInit now queries the GPU revision number via a new parameter
for DRM_I915_GETPARAM. This new parameter requires a kernel patch and
a patch to libdrm. If the kernel doesn't support it then it will
continue but set the revision number to -1. The intention is to use
this to implement workaro
201 - 300 of 506 matches
Mail list logo