Re: [Mesa-dev] [PATCH] i965/bxt: Support 3src simd16 instructions

2015-07-29 Thread Neil Roberts
I tested this on my BXT by running the glsl-fs-color-matrix test (which ends up using MAD in SIMD16) and it worked fine. Tested-by: Neil Roberts Reviewed-by: Neil Roberts - Neil Ben Widawsky writes: > This is easily accomplished by moving simd16 3src to GEN9_FEATURES. > > Signed-of

Re: [Mesa-dev] [PATCH] Delete duplicate function is_power_of_two() and use _mesa_is_pow_two()

2015-07-29 Thread Neil Roberts
It seems a bit weird to modify the the is_power_of_two functions in format_parser.py, was that intentional? It doesn't look like those functions are actually used anywhere so maybe we could just remove them, although it would probably make sense to do that in a separate patch. Regards, - Neil Anu

[Mesa-dev] [PATCH] i965/fs: Handle non-const sample number in interpolateAtSample

2015-07-21 Thread Neil Roberts
If a non-const sample number is given to interpolateAtSample it will now generate an indirect send message with the sample ID similar to how non-const sampler array indexing works. Previously non-const values were ignored and instead it ended up using a constant 0 value. Note that unlike sampler a

[Mesa-dev] [PATCH 2/3 v2] i965: Swap the order of the vertex ID and edge flag attributes

2015-07-13 Thread Neil Roberts
The edge flag data on Gen6+ is passed through the fixed function hardware as an extra attribute. According to the PRM it must be the last valid VERTEX_ELEMENT structure. However if the vertex ID is also used then another extra element is added to source the VID. This made it so the vertex ID is in

[Mesa-dev] [PATCH 3/3 v2] i965/bdw: Fix 3DSTATE_VF_INSTANCING when the edge flag is used

2015-07-13 Thread Neil Roberts
When the edge flag element is enabled then the elements are slightly reordered so that the edge flag is always the last one. This was confusing the code to upload the 3DSTATE_VF_INSTANCING state because that is uploaded with a separate loop which has an instruction for each element. The indices use

[Mesa-dev] [PATCH 1/3] i965/bdw: Fix setting the instancing state for the SGVS element

2015-07-13 Thread Neil Roberts
When gl_VertexID or gl_InstanceID is used a 3DSTATE_VF_SGVS instruction is sent to create a sort of element to store the generated values. The last instruction in this chunk of code looks like it was trying to set the instancing state for the element using the 3DSTATE_VF_INSTANCING instruction. How

[Mesa-dev] [PATCH 0/3] i965: Fix problems with gl_Vertex/InstanceID and glPolygonMode

2015-07-13 Thread Neil Roberts
This series fixes these two related bugs: https://bugs.freedesktop.org/show_bug.cgi?id=84677 https://bugs.freedesktop.org/show_bug.cgi?id=91292 The first bug has had a fix for a while but until now I wasn't able to figure out what to do on BDW. I had previously made a fix for the second bug but a

Re: [Mesa-dev] [PATCH v2] i965/fs: Don't use the pixel interpolater for centroid interpolation

2015-07-13 Thread Neil Roberts
Chris Forbes writes: > Nitpicks aside, I don't think this is a great idea now that you've got > the SKL PI working. Can you explain why you don't think this is a good idea? Is it because it is an optimisation for something that is not known to be a big use case so carrying around the extra code

[Mesa-dev] [PATCH] i965/bdw: Fix 3DSTATE_VF_INSTANCING when the edge flag is used

2015-07-10 Thread Neil Roberts
When the edge flag element is enabled then the elements are slightly reordered so that the edge flag is always the last one. This was confusing the code to upload the 3DSTATE_VF_INSTANCING state because that is uploaded with a separate loop which has an instruction for each element. The indices use

[Mesa-dev] [PATCH v2] i965/fs: Don't use the pixel interpolater for centroid interpolation

2015-07-09 Thread Neil Roberts
For centroid interpolation we can just directly use the values set up in the shader payload instead of querying the pixel interpolator. To do this we need to modify brw_compute_barycentric_interp_modes to detect when interpolateAtCentroid is called. v2: Rebase on top of changes to set the pulls ba

[Mesa-dev] [PATCH 1/2] glsl: Add missing check for whether an expression is an add operation

2015-07-04 Thread Neil Roberts
There is a piece of code that is trying to match expressions of the form (mul (floor (add (abs x) 0.5) (sign x))). However the check for the add expression wasn't checking whether it had the expected operation. It looks like this was just an oversight because it doesn't match the pattern for the re

[Mesa-dev] [PATCH 2/2] glsl: Make sure not to dereference NULL

2015-07-04 Thread Neil Roberts
In this bit of code point_five can be NULL if the expression is not a constant. This fixes it to match the pattern of the rest of the chunk of code so that it checks for NULLs. Cc: Matt Turner Cc: "10.6" --- src/glsl/opt_algebraic.cpp | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff

[Mesa-dev] [PATCH] i965/skl: Set the pulls bary bit in 3DSTATE_PS_EXTRA

2015-07-03 Thread Neil Roberts
On Gen9+ there is a new bit in 3DSTATE_PS_EXTRA that must be set if the shader sends a message to the pixel interpolator. This fixes the interpolateAt* tests on SKL, apart from interpolateatsample-nonconst but that is not implemented anywhere so it's not a regression. --- src/mesa/drivers/dri/i965

[Mesa-dev] [PATCH] i965/fs: Don't disable SIMD16 when using the pixel interpolator

2015-07-02 Thread Neil Roberts
There was a comment saying that in SIMD16 mode the pixel interpolator returns coords interleaved 8 channels at a time and that this requires extra work to support. However, this interleaved format is exactly what the PLN instruction requires so I don't think anything needs to be done to support it

Re: [Mesa-dev] [PATCH v3] glsl: fix some strict aliasing issues in exec_list

2015-07-02 Thread Neil Roberts
Davin McCall writes: > I actually had thought about this, but technically, you can only use > unions for type aliasing if you perform all accesses (that are not to > the 'active' member) through the union. All the list processing code > that iterates through all the nodes including the tail senti

Re: [Mesa-dev] [PATCH v3] glsl: fix some strict aliasing issues in exec_list

2015-07-01 Thread Neil Roberts
Hi, If we wanted to avoid growing the size of exec_list to four pointers instead of three, maybe we could store it in a union like below: struct exec_list { union { struct { struct exec_node head_sentinel; struct exec_node *dummy_pointer_a; }; struct {

Re: [Mesa-dev] [PATCH] i965/fs: Don't use the pixel interpolater for centroid interpolation

2015-07-01 Thread Neil Roberts
Ben Widawsky writes: > I am not the right person to judge the complexity tradeoff, but it > seems like a worthwhile patch to me. I spent a few minutes thinking > about how it could hurt performance and was unable to come up with > anything. Thanks. I was thinking more that the complexity means m

[Mesa-dev] [PATCH] i965/fs: Don't use the pixel interpolater for centroid interpolation

2015-06-30 Thread Neil Roberts
For centroid interpolation we can just directly use the values set up in the shader payload instead of querying the pixel interpolator. To do this we need to modify brw_compute_barycentric_interp_modes to detect when interpolateAtCentroid is called. This fixes the interpolateAtCentroid tests on SK

[Mesa-dev] [PATCH] i965: Don't try to print the GLSL IR if it has been freed

2015-06-26 Thread Neil Roberts
Since commit 104c8fc2c2aa5621261f8 the GLSL IR will be freed if NIR is being used. This was causing it to segfault if INTEL_DEBUG=wm is set. This patch just makes it avoid dumping the GLSL IR in that case. --- src/mesa/drivers/dri/i965/brw_program.c | 11 +++ 1 file changed, 7 insertions(+

Re: [Mesa-dev] [PATCH] i965/skl: Fix aligning mt->total_width to the block size

2015-06-24 Thread Neil Roberts
Ben Widawsky writes: > I think this is beginning to infringe upon the definition of align_w. > The total width is a function of it's miptree properties and not the > compressed block properties, right? > > In other words, if there is a case where align_w != bw, I think > total_width should be ali

Re: [Mesa-dev] [PATCH] Fix strict-aliasing violations in GLSL shader list implementation

2015-06-24 Thread Neil Roberts
Hi, If we are going to have to prod all of the code using this list implementation to solve this problem anyway maybe it would make more sense to switch to a kernel-style list instead. There is already an implementation of this in src/util/list.h. I think this style of list only accesses the point

[Mesa-dev] [PATCH] i965/skl: Fix aligning mt->total_width to the block size

2015-06-16 Thread Neil Roberts
brw_miptree_layout_2d tries to ensure that mt->total_width is a multiple of the compressed block size, presumably because it wouldn't be possible to make an image that has a fraction of a block. However it was doing this by aligning mt->total_width to align_w. Previously align_w has been used as a

Re: [Mesa-dev] [PATCH] i965: correct alignment units for 2D compressed textures on Skylake

2015-06-15 Thread Neil Roberts
Nanley Chery writes: > Although most of the patch is incorrect, the following change is still > necessary isn't it? > if (mt->compressed) { >mip1_width = ALIGN(minify(mt->physical_width0, 1), mt->align_w) + > - ALIGN(minify(mt->physical_width0, 2), bw); > +

Re: [Mesa-dev] [PATCH] i965: Don't create a temp PBO when uploading data from glTexImage*

2015-06-12 Thread Neil Roberts
> On Fri, Jun 12, 2015 at 7:34 AM, Neil Roberts wrote: >> The code was buggy anyway because it was checking whether the buffer >> was busy before calling Driver->AllocTextureImageBuffer. That function >> actually always frees the buffer and recreates a new one so it was

Re: [Mesa-dev] [PATCH] i965: Fix aligning to the block size in intel_miptree_copy_slice

2015-06-12 Thread Neil Roberts
g tests by hand after a reboot to see if they fail consistently. Regards, - Neil Nanley Chery writes: > Hey Neil, > > While this patch does fix FXT1, it also regresses 21 other Piglit tests on > SKL. > > - Nanley > > On Thu, Jun 11, 2015 at 8:59 AM, Neil Roberts wrote: &g

Re: [Mesa-dev] [PATCH] meta: Abort texture upload if pixels == null and no pixel unpack buffer set

2015-06-12 Thread Neil Roberts
Neil Roberts writes: > It seems a bit weird to use create_pbo=true at all for > glTexImage{1,2,3}D because in that case we are completely replacing > the texture. If the texture's buffer is busy instead of allocating a > PBO we might as well just directly allocate some ne

[Mesa-dev] [PATCH] i965: Don't create a temp PBO when uploading data from glTexImage*

2015-06-12 Thread Neil Roberts
Previously when glTexImage* is called it would attempt to create a temporary PBO if the texture is busy in order to avoid blocking when mapping the texture. This doesn't make much sense for glTexImage because in that case we are completely replacing the texture anyway so instead of allocating a PBO

Re: [Mesa-dev] [PATCH] meta: Abort texture upload if pixels == null and no pixel unpack buffer set

2015-06-12 Thread Neil Roberts
Ah, good catch. Looks good to me. Reviewed-by: Neil Roberts It seems a bit weird to use create_pbo=true at all for glTexImage{1,2,3}D because in that case we are completely replacing the texture. If the texture's buffer is busy instead of allocating a PBO we might as well just directly all

[Mesa-dev] [PATCH] i965: Fix aligning to the block size in intel_miptree_copy_slice

2015-06-11 Thread Neil Roberts
This function was trying to align the width and height to a multiple of the block size for compressed textures. It was using align_w/h as a shortcut to get the block size as up until Gen9 this always happens to match. However in Gen9+ the alignment values are expressed as multiples of the block siz

Re: [Mesa-dev] [PATCH] i965: correct alignment units for 2D compressed textures on Skylake

2015-06-11 Thread Neil Roberts
Hi Nanley, Could you explain the reasoning behind this patch? I can't find any mention of needing to align to the square of the block size in the docs. I think how it works is that on Skylake you can pick any alignment value you want out of 4, 8 or 16 but for compressed textures that is counted i

Re: [Mesa-dev] [PATCH] i965: Momentarily pretend to support ARB_texture_stencil8 for blits.

2015-06-10 Thread Neil Roberts
Kenneth Graunke writes: > _mesa_meta_fb_tex_blit_begin(ctx, &blit); > + ctx->Extensions.ARB_texture_stencil8 = true; Maybe you could put assert(ctx->Extensions.ARB_texture_stencil8==false) just before setting it to true so that we'll definitely remember to remove it if we eventually enable

[Mesa-dev] [PATCH 2/2] i965/vec4: Fix the source register for indexed samplers

2015-06-10 Thread Neil Roberts
Previously when setting up the sample instruction for an indirect sampler the vec4 backend was directly passing the pseudo opcode's src0. However vec4_visitor::visit(ir_texture *) doesn't set the texture operation's src0 -- it's left as BAD_FILE, which when translated into a brw_reg gives the null

[Mesa-dev] [PATCH 0/2] Fix sampler array indexing in vec4vs

2015-06-10 Thread Neil Roberts
Matt Turner writes: >> I'll have another look at moving it into brw_send_indirect_message. > > Thanks. I'm not really sure what the right solution is, so if you > decide this patch is good as is, that's fine with me. Here's what the patches would look like if we made brw_send_indirect_message lo

[Mesa-dev] [PATCH 1/2] i965: Explicitly set base_mrf to -1 for pull constant loads on Gen7

2015-06-10 Thread Neil Roberts
A freshly constructed instruction defaults to having a base_mrf of 0 which means that if nothing disables it it will default to using send-from-MRF. Previously this didn't matter because the constant load instructions on Gen7 were ignoring the base_mrf anyway. However in the next patch the brw_send

Re: [Mesa-dev] [PATCH 1/2] i965/fs: Don't let the EOT send message interfere with the MRF hack

2015-06-09 Thread Neil Roberts
Jason Ekstrand writes: > The only place when the fact that the MRFs are virtual matters is in > register allocation. Implied MRF writes are taken into account in > setup_mrf_hack_interference. We figure out what MRFs are used and > then mark them as conflicting with *all* of the VGRFs. We also

Re: [Mesa-dev] [PATCH 1/2] i965/fs: Don't let the EOT send message interfere with the MRF hack

2015-06-09 Thread Neil Roberts
Both patches look good to me and I can confirm they make the Piglit tests pass on Skylake. Reviewed-by: Neil Roberts My original assumption of the problem was that the implied writes from the SCRATCH_WRITE instruction aren't taken into account when calculating the liveliness of the regi

Re: [Mesa-dev] [PATCH] i965/vec4: Fix the source register for indexed samplers

2015-06-04 Thread Neil Roberts
Matt Turner writes: > I don't know why I was confused by this patch -- after arriving at the > same conclusion independently I see that all of the analysis I needed > was right there. Yes sorry, I probably didn't explain it very well. Your explanation is a lot clearer. > To sum up, vec4_visitor

Re: [Mesa-dev] [PATCH] i965/fs: Use UW-typed immediate in multiply inst.

2015-06-03 Thread Neil Roberts
Looks good to me. Thanks for fixing this. I guess I still have more to learn about the ISA. However, should we not also fix the vec4 version? With that, Reviewed-by: Neil Roberts If we wanted to play safe and avoid the MUL, we could change it to this and still avoid having a temporary

Re: [Mesa-dev] [RFC v2 12/15] i965: correct VALIGN for 2d textures on Skylake

2015-06-02 Thread Neil Roberts
querying the block height anyway. The later patch about combining the two functions would need to be changed too. Regards, - Neil Neil Roberts writes: > Looks good to me. > > Reviewed-by: Neil Roberts > > - Neil > > Anuj Phogat writes: > >> Adding Neil to Cc who

Re: [Mesa-dev] [RFC v2 12/15] i965: correct VALIGN for 2d textures on Skylake

2015-06-01 Thread Neil Roberts
Looks good to me. Reviewed-by: Neil Roberts - Neil Anuj Phogat writes: > Adding Neil to Cc who committed 4ab8d59. > > Reviewed-by: Anuj Phogat ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman

Re: [Mesa-dev] [PATCH 1/2] i965: Don't use a temporary when generating an indirect sample

2015-06-01 Thread Neil Roberts
Many thanks for all the reviews and testing. I've pushed the two patches. The remaining sampler_array_indexing tests that fail on SKL (the gs ones) are because of a separate problem described in this patch: http://patchwork.freedesktop.org/patch/50676/ I'm not really sure whether that's the clea

[Mesa-dev] [PATCH 2/2] i965: Don't add base_binding_table_index if it's zero

2015-05-29 Thread Neil Roberts
When calculating the binding table index for non-constant sampler array indexing it needs to add the base binding table index which is a constant within the generated code. Often this base is zero so we can avoid a redundant instruction in that case. It looks like nothing in shader-db is doing non

[Mesa-dev] [PATCH 1/2] i965: Don't use a temporary when generating an indirect sample

2015-05-29 Thread Neil Roberts
Previously when generating the send instruction for a sample instruction with an indirect sampler it would use the destination register as a temporary store. This breaks when used in combination with the opt_sampler_eot optimisation because that forces the destination to be null. This patch fixes t

[Mesa-dev] [PATCH] i965/vec4: Fix the source register for indexed samplers

2015-05-28 Thread Neil Roberts
Previously when setting up the sample instruction for an indirect sampler the vec4 backend was directly passing the pseudo opcode's src0. However this isn't actually set to a valid register because instead the MRF registers are used as the source so it would end up passing null as src0. This patch

Re: [Mesa-dev] [PATCH] i965: Disable compaction for EOT send messages

2015-05-28 Thread Neil Roberts
atches out there to handle this. Please ignore if > this has already been sent by someone. (Direct me to it and I will > review it). > > Cc: Matt Turner > Cc: Neil Roberts > Cc: Mark Janes > Signed-off-by: Ben Widawsky > --- > src/mesa/drivers/dri/i965/brw_eu_compact.c |

[Mesa-dev] [PATCH] i965/skl: Add a message header for the TXF_MCS instruction in vec4vs

2015-05-21 Thread Neil Roberts
When using SIMD4x2 on Skylake, the sampler instructions need a message header to select the correct mode. This was added for most sample instructions in 0ac4c2727 but the TXF_MCS instruction is emitted separately and it was missed. This fixes a bunch of Piglit tests which test texelFetch in a geom

Re: [Mesa-dev] [RFC 00/10] Enable support for 2D ASTC HDR and LDR formats

2015-05-20 Thread Neil Roberts
Jason Ekstrand writes: > I think *most* of that code *should* already be there. In theory, > it's all keyed off of the block size provided by formats.csv. > However, given some of the rendering errors we're currently seeing, it > looks like it may need a little patching here and there. :-) inte

[Mesa-dev] [PATCH] configure: Bump libdrm requirement for Intel to 2.4.61

2015-05-12 Thread Neil Roberts
This is required for the I915_PARAM_REVISION macro. Previously this define was directly copied into the Mesa source. --- configure.ac | 2 +- src/mesa/drivers/dri/i965/intel_screen.c | 5 - 2 files changed, 1 insertion(+), 6 deletions(-) diff --git a/configure.ac b

Re: [Mesa-dev] [PATCH 09/13] util/list: Add list_empty and list_length functions

2015-05-11 Thread Neil Roberts
Ian Romanick writes: >> For what it's worth, I'm strongly in favour of using these >> kernel-style lists instead of exec_list. The kernel ones seem much >> less confusing. > > Huh? They're practically identical. The only difference is the > kernel-style lists have a single sentinel node, and that

Re: [Mesa-dev] [PATCH 2/2 v3] i965: Use predicate enable bit for conditional rendering w/o stalling

2015-05-11 Thread Neil Roberts
Kenneth Graunke writes: > It might be nice to create a brw_load_register_mem64 function, for > symmetry with brw_store_register_mem64 - we might want to reuse it > elsewhere someday. Ok, that sounds sensible. > One interesting quirk: the two halves of your register write may land > in two separ

[Mesa-dev] [PATCH 1/2] i965: Store the command parser version number in intel_screen

2015-05-08 Thread Neil Roberts
In order to detect whether the predicate source registers can be used in a later patch we will need to know the version number for the command parser. This patch just adds a member to intel_screen and does an ioctl to get the version. Reviewed-by: Kenneth Graunke --- src/mesa/drivers/dri/i965/in

[Mesa-dev] [PATCH 2/2 v3] i965: Use predicate enable bit for conditional rendering w/o stalling

2015-05-08 Thread Neil Roberts
EMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS + * IN THE SOFTWARE. + * + * Aut

[Mesa-dev] [PATCH 0/2] i965: Do conditional rendering in hardware

2015-05-08 Thread Neil Roberts
I thought it might be a good idea to try posting these patches again since it's been 6 months since they were originally posted. The patches are a lot more useful now since the command parser in the kernel is working correctly for Haswell. This means the functionality is no longer restricted to onl

[Mesa-dev] [PATCH] i965/fs: Disable opt_sampler_eot for textureGather

2015-05-08 Thread Neil Roberts
The opt_sampler_eot optimisation seems to break when the last instruction is SHADER_OPCODE_TG4. A bunch of Piglit tests end up doing this so it causes a lot of regressions. I can't find any documentation or known workarounds to indicate that this is expected behaviour, but considering that this is

[Mesa-dev] [PATCH] i965/fs: Set the header_size on LOAD_PAYLOAD in opt_sampler_eot

2015-05-07 Thread Neil Roberts
Commit 94ee908448 added a header size parameter to the function to create the LOAD_PAYLOAD instruction. However this broke opt_sampler_eot which manually constructs the instruction and so wasn't setting the header_size. This ends up making the parameters for the send message all have the wrong loca

[Mesa-dev] [PATCH] i965/skl: In opt_sampler_eot always set destination register to null

2015-05-07 Thread Neil Roberts
opt_sampler_eot enables a direct write to framebuffer from a sample. In order to do this the sample message needs to have a message header so if there wasn't one already then the function adds one. In addition the function sets the destination register to null because it's no longer used. However i

Re: [Mesa-dev] [PATCH 3/3 v5] i965/fs: Combine tex/fb_write operations (opt)

2015-05-06 Thread Neil Roberts
Hi, This optimisation doesn't seem to work with textureGather so a bunch of Piglit tests are failing for me. I'm not sure why it didn't get picked up by your Jenkins run. I can't find anything in the bspec nor a known workaround to suggest that this shouldn't work so I'm not really sure what to d

Re: [Mesa-dev] [PATCH 10/13] util/list: Add a list validation function

2015-05-05 Thread Neil Roberts
Jason Ekstrand writes: > +static inline void list_validate(struct list_head *list) > +{ > + assert(list->next->prev == list && list->prev->next == list); > + for (struct list_head *node = list->next; node != list; node = node->next) > + assert(node->next->prev == node && node->prev->next

Re: [Mesa-dev] [PATCH 09/13] util/list: Add list_empty and list_length functions

2015-05-05 Thread Neil Roberts
Jason Ekstrand writes: > +static inline bool list_empty(struct list_head *list) > +{ > + return list->next == list; > +} It would be good if list.h also included stdbool.h in order to get the declaration of bool. However, will that cause problems on MSVC? Is the Gallium code compiled on MSVC i

Re: [Mesa-dev] [PATCH 08/13] util/list: Add C99-based iterator macros

2015-05-05 Thread Neil Roberts
Jason Ekstrand writes: > +#define list_for_each_entry(type, pos, head, member)\ > + for (type *pos = container_of((head)->next, pos, member);\ > + &pos->member != (head); \ > + pos = container_of(pos->member.next, p

Re: [Mesa-dev] [PATCH 5/6] i965/skl: Align compressed textures to four times the block size

2015-05-01 Thread Neil Roberts
Sorry for the really long delay in replying! This patch is still needed in order to fix a number of Piglit tests so it would be good to get it landed. Ben Widawsky writes: > Sorry for the delay, but I put this off initially because I wasn't > sure which part of the docs this was addressing. I se

[Mesa-dev] [PATCH v2] i965/fs: Strip trailing constant zeroes in sample messages

2015-04-30 Thread Neil Roberts
If a send message is emitted with a message length that is less than required for the message then the remaining parameters default to zero. We can take advantage of this to save a register when a shader passes constant zeroes as the final coordinates to the sample function. I think this might be

[Mesa-dev] [PATCH] i965/skl: Don't try to apply the opt_sampler_eot extension for vs

2015-04-28 Thread Neil Roberts
The opt_sampler_eot optimisation of fs_visitor effectively assumes that it is running on a fragment shader because it casts the program key to a brw_wm_prog_key. However on Skylake fs_visitor can also be used for vertex shaders. It looks like this usually works anyway because the optimisation is sk

Re: [Mesa-dev] [PATCH] i965/fs: Strip trailing contant zeroes in sample messages

2015-04-24 Thread Neil Roberts
Kenneth Graunke writes: > I like this idea! > > We definitely need to skip this optimization on Gen4, since the Gen4/G45 > sampler infers the texturing opcode based on the message length. But > for Gen5+, it should be no problem. Ah ok, yes, I will add this. > Matt mentioned that we have to em

Re: [Mesa-dev] [PATCH] i965/fs: Strip trailing contant zeroes in sample messages

2015-04-24 Thread Neil Roberts
Matt Turner writes: >> + foreach_block_and_inst(block, fs_inst, inst, cfg) { >> + if ((inst->opcode == SHADER_OPCODE_TEX || >> + inst->opcode == SHADER_OPCODE_TXF) && >> + !inst->shadow_compare) { >> + fs_inst *load_payload = (fs_inst *) inst->prev; >> + >> +

[Mesa-dev] [PATCH] i965/fs: Strip trailing contant zeroes in sample messages

2015-04-24 Thread Neil Roberts
If a send message is emitted with a message length that is less than required for the message then the remaining parameters default to zero. We can take advantage of this to save a register when a shader passes constant zeroes as the final coordinates to the sample function. I think this might be

[Mesa-dev] [PATCH] i965/skl: Force the exec size to 8 when initing header for SIMD4x2

2015-04-23 Thread Neil Roberts
On Gen9+ there needs to be a header when sampling using SIMD4x2. The header is set up by copying from the g0 register. Commit 07c571a39f tried to fix this mov instruction to always use an exec size of 8 because previously it was incorrectly using 4. It did this by casting the type of the destinatio

Re: [Mesa-dev] [PATCH] i965: Add an INTEL_DEBUG=spill option to test spilling

2015-04-23 Thread Neil Roberts
Ilia Mirkin writes: > That seems awkward... did you mean 1U? FWIW mesa's not at all careful > about that... Or maybe even UINT64_C(1). 1l would still be 32-bit on 32-bit architectures. I guess this is more of a problem for subsequent flags that go over 32-bit. - Neil ___

Re: [Mesa-dev] [PATCH V2 01/22] meta: Enable _mesa_meta_pbo_GetTexSubImage() to read in to non-pbo buffers

2015-04-21 Thread Neil Roberts
Anuj Phogat writes: > This will allow Skylake to use _mesa_meta_pbo_GetTexSubImage() for reading > YF/YS > tiled surfaces. > > V2: Make changes suggested by Neil. > > Signed-off-by: Anuj Phogat > Cc: Neil Roberts > --- > src/mesa/drivers/common/meta.h

[Mesa-dev] [PATCH 1/2] i965/vec4: Add a helper function to emit VS_OPCODE_PULL_CONSTANT_LOAD

2015-04-15 Thread Neil Roberts
There were three places in the visitor that had a similar chunk of code to emit the VS_OPCODE_PULL_CONSTANT_LOAD opcode using a register for the offset. This patch combines the chunks into a helper function to reduce the code duplication. It will also be useful in the next patch to expand what happ

[Mesa-dev] [PATCH 2/2 v2] i965/skl: Add the header for constant loads outside of the generator

2015-04-15 Thread Neil Roberts
Commit 5a06ee738 added a step to the generator to set up the message header when generating the VS_OPCODE_PULL_CONSTANT_LOAD_GEN7 instruction. That pseudo opcode is implemented in terms of multiple actual opcodes, one of which writes to one of the source registers in order to set up the message hea

Re: [Mesa-dev] [PATCH] i965/skl: Add the header for constant loads outside of the generator

2015-04-15 Thread Neil Roberts
Ben Widawsky writes: >> + void generate_set_simd4x2_header_gen9(vec4_instruction *inst, >> + struct brw_reg dst); > > super bikesheddy, but this function name sounds pretty bad, though I > understand it's a side-effect of the name of the opcode (which see

[Mesa-dev] [PATCH] i965/skl: Add the header for constant loads outside of the generator

2015-04-14 Thread Neil Roberts
Commit 5a06ee738 added a step to the generator to set up the message header when generating the VS_OPCODE_PULL_CONSTANT_LOAD_GEN7 instruction. That pseudo opcode is implemented in terms of multiple actual opcodes, one of which writes to one of the source registers in order to set up the message hea

Re: [Mesa-dev] Concurrent piglit run on Skylake (was: [PATCH] i965/skl: Use an exec size of 8 to initialise the message header)

2015-04-13 Thread Neil Roberts
Ben Widawsky writes: > Does this happen to allow concurrent piglit to not die in a fire? Hrm, I just tried running piglit without -1 on my wip/skylake branch (without this patch) and it seems to work fine! I don't know what has fixed it but these are the three patches on that branch: d2e8cd7 i9

Re: [Mesa-dev] [PATCH] i965/skl: Use an exec size of 8 to initialise the message header

2015-04-13 Thread Neil Roberts
Ben Widawsky writes: > Can you do me a favor since I am lazy? Can you send the generated asm > diff with this patch? I am admittedly rusty on the matter, but I > thought a mov is just a mov, and only the regioning is effected by the > retype. If it indeed does what you say, it really should fix >

Re: [Mesa-dev] [PATCH] Fix automatic indentation mode for recent emacs, use fewer columns in .git

2015-04-08 Thread Neil Roberts
It seems a bit strange that this has stopped working for you. If you specify a mode in the .dir-locals.el file then it's supposed to set the variable for any files with that mode or any modes inherited from that mode. The C and C++ modes both inherit from prog-mode, as well as a bunch of other ones

[Mesa-dev] [PATCH] i965/skl: Avoid using the 1D stencil layout for stencil-only images

2015-03-31 Thread Neil Roberts
Commit cf67ca9ffa9 made the layouting code pick a special layout for 1D images on Skylake. This should not be used for depth and stencil buffers because these need to be treated as 2D tiled images. However the patch was missing a check for images with a base format of GL_STENCIL_INDEX. In practice

Re: [Mesa-dev] [PATCH 6/6] i965: Allow Y-tiled allocations for large surfaces

2015-03-23 Thread Neil Roberts
Sorry for the delay in replying to this. Ben Widawsky writes: >> > +static inline uint32_t >> > +intel_miptree_blit_height(struct intel_mipmap_tree *mt) >> > +{ >> > + switch (mt->target) { >> > + case GL_TEXTURE_CUBE_MAP: >> > + case GL_TEXTURE_1D_ARRAY: >> > + case GL_TEXTURE_2D_ARRAY:

Re: [Mesa-dev] [PATCH 5/6] i965: Attempt to blit for larger textures

2015-03-23 Thread Neil Roberts
Ben Widawsky writes: > diff --git a/src/mesa/drivers/dri/i965/intel_blit.h > b/src/mesa/drivers/dri/i965/intel_blit.h > index 52dd67c..aff2d58 100644 > --- a/src/mesa/drivers/dri/i965/intel_blit.h > +++ b/src/mesa/drivers/dri/i965/intel_blit.h > @@ -78,14 +78,34 @@ void intel_emit_linear_blit(st

Re: [Mesa-dev] [PATCH 3/6] i965: Create and use #defines for blitter constraints

2015-03-23 Thread Neil Roberts
Ben Widawsky writes: > diff --git a/src/mesa/drivers/dri/i965/intel_blit.h > b/src/mesa/drivers/dri/i965/intel_blit.h > index f563939..531d329 100644 > --- a/src/mesa/drivers/dri/i965/intel_blit.h > +++ b/src/mesa/drivers/dri/i965/intel_blit.h > @@ -30,6 +30,9 @@ > > #include "brw_context.h"

Re: [Mesa-dev] [PATCH 2/6] i965: Fix comments about blit constraints

2015-03-23 Thread Neil Roberts
Looks good to me. Reviewed-by: Neil Roberts - Neil ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 1/6] i965: Kill y_or_x variable in miptree tiling selection

2015-03-23 Thread Neil Roberts
Ben Widawsky writes: > This patch actually addresses two potential issues: > 1. As far as I can tell, drm_intel_gem_bo_alloc_tiled() which is invariably > called, can potentially change the tiling type. Therefore, we shouldn't be > checking the requested tiling type, but rather the granted tilin

[Mesa-dev] [PATCH v2] i965/skl: Break down SIMD16 3-source instructions when required.

2015-03-19 Thread Neil Roberts
From: Kenneth Graunke Several steppings of Skylake fail when using SIMD16 with 3-source instructions (such as MAD). This implements WaDisableSIMD16On3SrcInstr and fixes ~190 Piglit tests. Based on a patch by Neil Roberts. --- src/mesa/drivers/dri/i965/brw_shader.cpp | 6 ++ 1 file changed

[Mesa-dev] [PATCH v2] i965: Refactor SIMD16-to-2xSIMD8 checks.

2015-03-19 Thread Neil Roberts
The places that were checking whether 3-source instructions are supported have now been combined into a small helper function. This will be used in the next patch to add an additonal restriction. Based on a patch by Kenneth Graunke. --- Matt's review for v1 of this patch was conditional based on

[Mesa-dev] [PATCH v3] i965/skl: Fix the qpitch value

2015-03-13 Thread Neil Roberts
On Skylake the qpitch value is uploaded as part of the surface state so we don't need to add the extra rows that are done for other generations. However for 3D textures it needs to be aligned to the tile height and for depth/stencil textures it needs to be a multiple of 8. Unlike previous generatio

[Mesa-dev] [PATCH] i965/skl: Send a message header when doing constant loads SIMD4x2

2015-03-13 Thread Neil Roberts
Commit 0ac4c272755c7 made it add a header for the send message when using SIMD4x2 on Skylake because without this it will end up using SIMD8D. However the patch missed the case when a sampler is being used to implement constant loads from a buffer surface in a SIMD4x2 vertex shader. This fixes 29

[Mesa-dev] [PATCH v2] i965/skl: Don't use ALL_SLICES_AT_EACH_LOD

2015-03-13 Thread Neil Roberts
The render surface state command for Skylake doesn't have the surface array spacing bit so it's not possible to select this layout. I think it was only used in order to make it pick a tightly-packed qpitch value that doesn't include space for the mipmaps. However this won't be necessary after the n

Re: [Mesa-dev] [PATCH 6/6] i965: Allow Y-tiled allocations for large surfaces

2015-03-10 Thread Neil Roberts
Ben Widawsky writes: > This patch will use a new calculation to determine if a surface can be blitted > from or to. Previously, the "total_height" member was used. Total_height in > the > case of 2d, 3d, and cube map arrays is the height of each slice/layer/face. > Since the GL map APIS only eve

Re: [Mesa-dev] [PATCH 6/6] i965/skl: Don't use ALL_SLICES_AT_EACH_LOD

2015-03-10 Thread Neil Roberts
Ben Widawsky writes: > On Fri, Feb 20, 2015 at 10:31:08PM +0000, Neil Roberts wrote: >> The render surface state command for Skylake doesn't have the surface >> array spacing bit so I don't think it's possible to select this >> layout. This avoids a kerne

[Mesa-dev] [PATCH v2] i965/skl: Fix the order of the arguments for the LD sampler message

2015-03-09 Thread Neil Roberts
In Skylake the order of the arguments for sample messages with the LD type are u, v, lod, r whereas previously they were u, lod, v, r. This fixes 144 Piglit tests including ones that directly use texelFetch and also some using the meta stencil blit path which appears to use texelFetch in its shade

Re: [Mesa-dev] [PATCH 1/3] Revert "common: Fix PBOs for 1D_ARRAY."

2015-03-09 Thread Neil Roberts
15, Emil Velikov wrote: >> On 4 March 2015 at 17:22, Neil Roberts wrote: >>> This reverts commit 546aba143d13ba3f993ead4cc30b2404abfc0202. >>> >>> I think the changes to the calls to glBlitFramebuffer from this patch >>> are no different to what it was doing

[Mesa-dev] [PATCH] i965/skl: Fix the order of the arguments for the LD sampler message

2015-03-06 Thread Neil Roberts
In Skylake the order of the arguments for sample messages with the LD type are u, v, lod, r whereas previously they were u, lod, v, r. This fixes 82 Piglit tests using texelFetch. --- I have a feeling this probably isn't the right way to do this patch so maybe someone who knows the compiler better

Re: [Mesa-dev] [PATCH 2/2] i965/skl: Break down SIMD16 3-source instructions when required.

2015-03-05 Thread Neil Roberts
Yes, I like this approach much better. I ran it through Piglit and I can confirm it fixes the same tests as my patch. Reviewed-by: Neil Roberts There's no need to reset the author to me. Thanks for looking at this. Regards, - Neil Kenneth Graunke writes: > Several steppings of Skyl

[Mesa-dev] [PATCH 1/3] Revert "common: Fix PBOs for 1D_ARRAY."

2015-03-04 Thread Neil Roberts
This reverts commit 546aba143d13ba3f993ead4cc30b2404abfc0202. I think the changes to the calls to glBlitFramebuffer from this patch are no different to what it was doing previously because it used to set height to 1 before doing the blits. However it was introducing some problems with the blit for

[Mesa-dev] [PATCH 2/3] meta: Allow GL_UN/PACK_IMAGE_HEIGHT in _mesa_meta_pbo_Get/TexSubImage

2015-03-04 Thread Neil Roberts
Now that a layered source PBO is interpreted as a single tall 2D image it's quite easy to accept the image height packing option by just creating an image that is tall enough to include the image padding. I'm not sure whether the image height property should affect 1D_ARRAY textures. My intuition

[Mesa-dev] [PATCH 3/3] meta: Fix the y offset for 1D_ARRAY in _mesa_meta_pbo_TexSubImage

2015-03-04 Thread Neil Roberts
The yoffset needs to be interpreted as a slice offset for 1D array textures. This patch implements that by moving the yoffset into zoffset similar to how it moves the height into depth. --- src/mesa/drivers/common/meta_tex_subimage.c | 8 1 file changed, 8 insertions(+) diff --git a/src/

[Mesa-dev] [PATCH v2] i965/skl: Disable SIMD16 when 3-source instructions are used

2015-03-04 Thread Neil Roberts
Steppings C0 and D0 of Skylake fail when using SIMD16 with 3-source instructions (such as MAD). This patch just makes it disable SIMD16 in those cases. This implements WaDisableSIMD16On3SrcInstr and fixes ~190 Piglit tests. v2: Also apply on stepping D0 --- Damien Lespiau pointed out that the wo

Re: [Mesa-dev] [PATCH 2/2] i965/skl: Disable SIMD16 when 3-source instructions are used

2015-03-04 Thread Neil Roberts
Ilia Mirkin writes: > On Wed, Mar 4, 2015 at 9:33 AM, Neil Roberts wrote: >> Stepping C0 of Skylake fails when using SIMD16 with 3-source >> instructions (such as MAD). This patch just makes it disable SIMD16 in >> that case. >> >> This implements WaDisabl

[Mesa-dev] [PATCH 2/2] i965/skl: Disable SIMD16 when 3-source instructions are used

2015-03-04 Thread Neil Roberts
Stepping C0 of Skylake fails when using SIMD16 with 3-source instructions (such as MAD). This patch just makes it disable SIMD16 in that case. This implements WaDisableSIMD16On3SrcInstr and fixes ~190 Piglit tests. --- src/mesa/drivers/dri/i965/brw_fs.cpp | 21 + src/mesa/driv

[Mesa-dev] [PATCH 1/2] i965: Store the GPU revision number in brw_context

2015-03-04 Thread Neil Roberts
brwContextInit now queries the GPU revision number via a new parameter for DRM_I915_GETPARAM. This new parameter requires a kernel patch and a patch to libdrm. If the kernel doesn't support it then it will continue but set the revision number to -1. The intention is to use this to implement workaro

<    1   2   3   4   5   6   >