[Mesa-dev] [Bug 100876] Variable GALLIUM_HUD_DUMP_DIR is not working with Wine LFS

2017-04-28 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=100876

--- Comment #2 from Balázs Vinarz  ---
Yes, the files exist, but they contain no data.

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [Mesa-stable] [PATCH 1/2] disk_cache: reduce default cache size to 5% of filesystem

2017-04-28 Thread Michel Dänzer
On 28/04/17 09:11 PM, Marek Olšák wrote:
> On Thu, Apr 27, 2017 at 8:47 AM, Michel Dänzer  wrote:
>> On 27/04/17 10:15 AM, Timothy Arceri wrote:
>>> Modern disks are extremely large and are only going to get bigger.
>>> Usage has shown frequent Mesa upgrades can result in the cache
>>> growing very fast i.e. wasting a lot of disk space unnecessarily.
>>>
>>> 5% seems like a more reasonable default.
>>>
>>> Cc: "17.1" 
>>> ---
>>>  src/util/disk_cache.c | 4 ++--
>>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/src/util/disk_cache.c b/src/util/disk_cache.c
>>> index d9de8ef..9fd7b96 100644
>>> --- a/src/util/disk_cache.c
>>> +++ b/src/util/disk_cache.c
>>> @@ -324,24 +324,24 @@ disk_cache_create(const char *gpu_name, const char *timestamp)
>>>   case '\0':
>>>   case 'G':
>>>   case 'g':
>>>   default:
>>>  max_size *= 1024*1024*1024;
>>>  break;
>>>   }
>>>}
>>> }
>>>
>>> -   /* Default to 1GB or 10% of filesystem for maximum cache size. */
>>> +   /* Default to 1GB or 5% of filesystem for maximum cache size. */
>>> if (max_size == 0) {
>>>statvfs(path, &vfs);
>>> -  max_size = MAX2(1024*1024*1024, vfs.f_blocks * vfs.f_bsize / 10);
>>> +  max_size = MAX2(1024*1024*1024, vfs.f_blocks * vfs.f_bsize / 20);
>>> }
>>
>> 5% can still be quite a lot (what if every library on the system tried
>> using that much for itself?). How about 1%?
> 
> The argument is flawed. My ccache uses 12% (26.8 GB) of my disk, and
> I'm not saying "what if every small app used that much...". There is a
> very good reason for that size with my use case.

ccache defaults to a maximum cache size of 5G. You had to explicitly
allow it to use more; you can do the same with the Mesa shader cache.


> It certainly makes sense to use 5% of the filesystem for Mesa.

I don't agree for the default, especially as long as the Mesa shader
cache only actually makes use of about 10-20% of the disk space
allocated for it, due to having a huge number of tiny files. (ccache in
contrast actually makes good use of the disk space allocated for it,
because its cache entries are normally larger than a single disk block)
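
To put the percentages under discussion into numbers, here is a minimal sketch of the default-size rule from the patch. It assumes a hypothetical helper, default_cache_max_size(), which is not a Mesa function and takes the filesystem size as plain arguments instead of calling statvfs():

```c
#include <stdint.h>

/* Hypothetical helper, not part of Mesa: mirrors the rule in the patch,
 * max(1 GiB, filesystem size / divisor). A divisor of 10 is the old 10%
 * default, 20 the proposed 5%, and 100 the suggested 1%.
 */
static uint64_t
default_cache_max_size(uint64_t f_blocks, uint64_t f_bsize, uint64_t divisor)
{
   const uint64_t one_gib = UINT64_C(1024) * 1024 * 1024;
   const uint64_t share = f_blocks * f_bsize / divisor;

   return share > one_gib ? share : one_gib;
}
```

For a 500 GiB filesystem with 4 KiB blocks (131072000 blocks), this gives 50 GiB at 10%, 25 GiB at 5% and 5 GiB at 1%. Any filesystem smaller than 20 GiB ends up at the 1 GiB floor with the 5% rule.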


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer


[Mesa-dev] [PATCH] fix minor error in YUV2RGB matrix used in shader

2017-04-28 Thread Johnson Lin
The matrix used for YCbCr to RGB conversion is listed on Wikipedia:
https://en.wikipedia.org/wiki/YCbCr
There is a minor error in the matrix constants: 0.0625 = 16/256 should be
16.0/255, and 0.5 = 128.0/256 should be 128.0/255.
Note that converting a 0-255 byte value to a 0-1.0 float means dividing by 255,
not 256. That is how we get 255 = 1.0f.
With the corrected constants, the CSC result is bit-exact with the Wikipedia
conversion result and the FFmpeg result.
Otherwise, in some situations, there will be a one-bit difference.
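
As a rough illustration of the point above, the sketch below normalizes an 8-bit value by 255 and subtracts the luma offset. The helper names are made up for this example and are not taken from nir_lower_tex.c:

```c
#include <stdint.h>

/* Hypothetical helpers for illustration; not code from nir_lower_tex.c. */

/* An 8-bit UNORM channel is normalized by dividing by 255, so 255 -> 1.0f. */
static float
unorm8_to_float(uint8_t v)
{
   return (float)v / 255.0f;
}

/* Luma after removing the black-level offset, before the 1.164 scale. */
static float
luma_minus_offset(uint8_t y, float offset)
{
   return unorm8_to_float(y) - offset;
}
```

With an offset of 16.0f/255.0f, black (Y = 16) lands on zero. With 0.0625f it is left with a residual of roughly 0.000245, which is small but presumably enough to push a result across a rounding boundary once it is scaled and converted back to 8 bits.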

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100854
---
 src/compiler/nir/nir_lower_tex.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/compiler/nir/nir_lower_tex.c b/src/compiler/nir/nir_lower_tex.c
index 352d1499bc8d..f20425e84aab 100644
--- a/src/compiler/nir/nir_lower_tex.c
+++ b/src/compiler/nir/nir_lower_tex.c
@@ -244,9 +244,9 @@ convert_yuv_to_rgb(nir_builder *b, nir_tex_instr *tex,
nir_ssa_def *yuv =
   nir_vec4(b,
nir_fmul(b, nir_imm_float(b, 1.16438356f),
-nir_fadd(b, y, nir_imm_float(b, -0.0625f))),
-   nir_channel(b, nir_fadd(b, u, nir_imm_float(b, -0.5f)), 0),
-   nir_channel(b, nir_fadd(b, v, nir_imm_float(b, -0.5f)), 0),
+nir_fadd(b, y, nir_imm_float(b, -16.0f/255))),
+   nir_channel(b, nir_fadd(b, u, nir_imm_float(b, -128.0f/255)), 0),
+   nir_channel(b, nir_fadd(b, v, nir_imm_float(b, -128.0f/255)), 0),
nir_imm_float(b, 0.0));
 
nir_ssa_def *red = nir_fdot4(b, yuv, nir_build_imm(b, 4, 32, m[0]));
-- 
1.9.1



Re: [Mesa-dev] [PATCH] fix minor error in YUV2RGB matrix used in shader

2017-04-28 Thread Lin, Johnson
Yes, we can. Will do it.

-Original Message-
From: Eric Anholt [mailto:e...@anholt.net] 
Sent: Saturday, April 29, 2017 6:02 AM
To: Lin, Johnson ; mesa-dev@lists.freedesktop.org
Cc: Lin, Johnson 
Subject: Re: [Mesa-dev] [PATCH] fix minor error in YUV2RGB matrix used in shader

Johnson Lin  writes:

> The matrix used for YCbCr to RGB is listed in Wiki 
> https://en.wikipedia.org/wiki/YCbCr;
> There is minor error in the matrix constant: 0.0625=16/256 should be 
> 16.0/255,  and 0.5=128.0/256 should be 128.0/255.
> Note that conversion from a 0-255 byte number to 0-1.0 float is to 
> divide by 255  instead of 256. That's we get 255=1.0f.
> By the constant change we can see the CSC result is bit aligned with 
> Wiki conversion result and FFMPeg result.
> Otherwise in some situation, there will be one bit difference
>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100854
> ---
>  src/compiler/nir/nir_lower_tex.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/src/compiler/nir/nir_lower_tex.c 
> b/src/compiler/nir/nir_lower_tex.c
> index 352d1499bc8d..385739a56a71 100644
> --- a/src/compiler/nir/nir_lower_tex.c
> +++ b/src/compiler/nir/nir_lower_tex.c
> @@ -244,9 +244,9 @@ convert_yuv_to_rgb(nir_builder *b, nir_tex_instr *tex,
> nir_ssa_def *yuv =
>nir_vec4(b,
> nir_fmul(b, nir_imm_float(b, 1.16438356f),
> -nir_fadd(b, y, nir_imm_float(b, -0.0625f))),
> -   nir_channel(b, nir_fadd(b, u, nir_imm_float(b, -0.5f)), 0),
> -   nir_channel(b, nir_fadd(b, v, nir_imm_float(b, -0.5f)), 0),
> +nir_fadd(b, y, nir_imm_float(b, -0.0627451f))),
> +   nir_channel(b, nir_fadd(b, u, nir_imm_float(b, 
> -0.50196078431f)), 0),
> +   nir_channel(b, nir_fadd(b, v, nir_imm_float(b, 
> + -0.50196078431f)), 0),
> nir_imm_float(b, 0.0));

Could we use 16.0/255.0 and 128.0/255.0, instead of magic-looking numbers?  
With that, it will be:

Reviewed-by: Eric Anholt 


[Mesa-dev] [Bug 100613] Regression in Mesa 17 on s390x (zSystems)

2017-04-28 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=100613

--- Comment #16 from Roland Scheidegger  ---
(In reply to Bruce Cherniak from comment #15)
> This isn't really an OpenSWR (Drivers/Gallium/swr) specific problem.  Is
> there another component that it should be moved to?

We don't really have an appropriate component...
I think such issues often end up with either "Mesa core" or "Other", the former
isn't quite right and the latter of course not all that helpful...



Re: [Mesa-dev] [PATCH v2] swr: move msaa resolve to generalized StoreTile

2017-04-28 Thread Cherniak, Bruce

> On Apr 28, 2017, at 3:20 PM, Ilia Mirkin  wrote:
> 
> On Fri, Apr 28, 2017 at 3:58 PM, Cherniak, Bruce
>  wrote:
>> 
>>> On Apr 27, 2017, at 7:50 PM, Ilia Mirkin  wrote:
>>> 
>>> On Thu, Apr 27, 2017 at 8:45 PM, Cherniak, Bruce
>>>  wrote:
 
>>>>> On Apr 27, 2017, at 7:38 PM, Ilia Mirkin  wrote:
>>>>>
>>>>> Erm, so ... what happens if I render to FB1, then render to FB2, then
>>>>> render to FB1 again (and I have blending enabled)? Doesn't the resolve
>>>>> lose the per-sample information? Or does the resolve merely precompute
>>>>> the resolved version on the off chance that it's needed, without
>>>>> losing the source data?
>>>>
>>>> The resolve occurs into a secondary, driver private, surface.  All per-sample
>>>> information is maintained in the original surfaces.
>>>>
>>>> Yes, the resolve is currently done “on the off chance that it’s needed”.
>>>> There is likely an optimization to be had there, but it should be functionally
>>>> correct.
>>> 
>>> Got it. May I ask why this isn't done on-demand instead? Is it a pain
>>> to plug into swr's execution engine? I'm just concerned that
>>> StoreTile() may get called a lot, more than even there are draws, as
>>> tiles are swapped in and out of "hotness", and I wouldn't be surprised
>>> if resolves were needed only a fraction of the time.
>>> 
>>> Cheers,
>>> 
>>> -ilia
>> 
>> 
>> Good observation.  I haven’t yet seen this to be the case in the scientific
>> visualization applications I’ve been running. But, I can envision where that
>> becomes a performance concern.
>> 
>> Do you mean a blit based “state_tracker initiated” on-demand resolve (via
>> pipe_blit)?  If so, here are my thoughts:
> 
> Yes. The resolve is always initiated via a blit() call anyways (with a
> dst surface with nr_samples == 0).
> 
>> 1) The software winsys and state trackers don't support multisample surfaces
>>   for software renderers, nor will/should they (except for swr).  So, I
>>   thought keeping most of the changes local to our driver would be most
>>   desirable and safest, as far as swrast and llvmpipe are concerned.  Not
>>   sure about wgl yet, but I don't see it.
>> 
>> 2) A blit based resolve causes a pipeline reconfiguration (save/restore 
>> around
>>   the blit) that is inherently less efficient than simply
>>   storing-out/resolving HotTiles.
>> 
>> 3) A blit based resolve needs to sample from the multisample surface using a
>>   texture sampler with 2DMS/3DMS support.  We’re currently using llvmpipe's
>>   sampler which doesn't need this support.  I’m looking into extending it, as
>>   I know we need the functionality for compliance; it’s just not there yet.
>> 
>> I may be off-base on any of these thoughts.  If so, please correct me.
>> 
>> We’ll probably move to a “driver internal” on-demand resolve, implemented
>> similar to StoreTiles.  It's a simple matter to only resolve for the times we
>> know it's needed and the multisample surface is in HotTiles.  But, I need to
>> work out the LoadTiles case for surfaces that aren’t currently in HotTiles.
>> Tricky, since we're checking the resolve status of the secondary (resolved)
>> surface and the HotTile state of the multisample surface.
>> 
>> Thanks for the feedback.  Getting this completely correct and optimized is
>> going to be iterative.  This current patch, while maybe not optimal, helps
>> with functionality.  So, I think it's a step in the right direction.
> 
> I hope you realize I wasn't looking to derail your attempts at
> progress, more like providing some things to think about on your march
> towards perfection :) MS textures/fbo's are definitely a thing,
> probably more so than MS winsys surfaces these days. At least for
> games, maybe not visualization software, with which I have next to no
> experience. Try it with e.g. Unigine Heaven or Valley (with MSAA
> enabled). I'm fairly sure that at least Heaven uses MSAA textures.

We always value and appreciate your input.  And, while we’re primarily
focused on sci vis software, we’d like to be as compliant as possible, which
means running a wide variety of applications… even *gasp* games.

I’ll definitely give Unigine a try.  Not expecting great performance, but
we can at least strive for correct functionality.

> I believe most hardware uses MSAA compression, based on the
> observation that it's pretty common for all samples in a pixel to have
> the same color, or bg color + fg color + coverage mask. TBH I'm not
> sure how it all works. Something for the future when you get all the
> basics right.

“march towards perfections” :)

> Some hardware has built-in resolve functionality (e.g. Adreno, maybe
> other tilers as well) for moving a MS FBO out of a "hot tile", while
> most hardware requires the pipeline reconfiguration + blit. Perhaps
> it'd make sense to add a special FE command for computing the resolved
> version 

[Mesa-dev] [Bug 100613] Regression in Mesa 17 on s390x (zSystems)

2017-04-28 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=100613

--- Comment #15 from Bruce Cherniak  ---
This isn't really an OpenSWR (Drivers/Gallium/swr) specific problem.  Is there
another component that it should be moved to?



Re: [Mesa-dev] [PATCH] i965: Solve Android native fence fd double close issue

2017-04-28 Thread Xu, Randy
> -Original Message-
> From: Chad Versace [mailto:chadvers...@chromium.org]
> Sent: Saturday, April 29, 2017 12:19 AM
> To: Emil Velikov 
> Cc: Xu, Randy ; mesa-dev@lists.freedesktop.org
> Subject: Re: [Mesa-dev] [PATCH] i965: Solve Android native fence fd double
> close issue
> 
> On Thu 27 Apr 2017, Emil Velikov wrote:
> > On 27 April 2017 at 12:14, Xu, Randy  wrote:
> > > Hi, Chad
> > >
> > > Please review this patch, we need it to solve some instability
> > > issues
> 
> Randy and Tapani, could you provide a few dEQP test names that this patch
> fixes? I'd like to mention at least one EGL and one Vulkan test in the commit
> message.

It's not a dEQP issue, but an instability. The same fd double close will cause
GLES or Vulkan apps to crash on the Android platform.
It may take 20 minutes to reproduce.

> 
> > The patch is correct, although the commit message can be improved upon.
> > Read through the following example and consider the alternative
> > solution mentioned within.
> 
> Yes, this patch is correct. It makes brw_dri_create_fence_fd() behave like all
> the other drivers' create_fence_fd funcs, which call dup().
> Since this is an easy one-liner that can backport to stable, let's take it.
> 
> However, I believe the fully correct solution is Emil's plan B:
> __DRI2fenceExtensionRec::create_fence_fd should transfer fd ownership to
> the driver, and therefore no dup is needed. But that's a slightly more 
> invasive
> change that's not as easily backported to stable.
> 
> Reviewed-by: Chad Versace 
> Cc: mesa-sta...@lists.freedesktop.org
> 
> Emil, how about one of us appends your extended commit message to
> Randy's, and then pushes?

Thanks, I prefer to merge this simple solution first. 

> 
> > Then either polish and resend, or send patch that implements plan B.
> > If you opt for B you want to drop the dup/close from the existing
> > users - freedreno and etnaviv.
> >
> > "
> > The semantics of __DRI2fenceExtensionRec::create_fence_fd are unclear
> > if the DRI driver takes ownership of the fd or not.
> > Since the i965 driver supports both "in" and "out" fd it assumes "yes,
> > driver takes ownership", which results in a double close.
> > First time in our destroy_fence() callback and then in the loader.
> >
> > Other DRI modules rely on the loader issuing close().
> >
> > Thus we have two solutions:
> >  - dup() the file descriptor
> >  - close() only if we have an out fence.
> >
> > This patch implements the former, simpler solution.
> >
> > Fixes: 6403e376511 ("i965/sync: Implement fences based on Linux
> > sync_file")
> > Reviewed-by: Emil Velikov  "
> >
> > In either case you want to augment create_fence_fd and destroy_fence
> > (in dri_interface.h) to explicitly define the behaviour.
> > Please keep that a separate patch part of this series.
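
The "dup() the file descriptor" option described in the quoted commit message can be sketched roughly as follows. The struct and function names are hypothetical stand-ins rather than the actual i965 code; only pipe(), dup() and close() are real POSIX calls:

```c
#include <unistd.h>

/* Hypothetical stand-in for a driver's fence object; the real i965 code
 * differs.
 */
struct sketch_fence {
   int fd;
};

/* "dup() the file descriptor": the driver keeps its own copy, so the
 * caller (the loader) remains free to close the fd it passed in.
 */
static int
sketch_fence_init(struct sketch_fence *fence, int caller_fd)
{
   fence->fd = dup(caller_fd);
   return fence->fd < 0 ? -1 : 0;
}

/* The driver closes only the descriptor it owns. */
static int
sketch_fence_destroy(struct sketch_fence *fence)
{
   int ret = close(fence->fd);
   fence->fd = -1;
   return ret;
}

/* Returns 0 if the driver and the caller can each close their own fd. */
static int
sketch_roundtrip(void)
{
   int pipe_fds[2];
   struct sketch_fence fence;

   if (pipe(pipe_fds) != 0)
      return -1;

   int ret = sketch_fence_init(&fence, pipe_fds[0]);
   if (ret == 0)
      ret = sketch_fence_destroy(&fence);   /* closes the dup'ed copy */
   if (close(pipe_fds[0]) != 0)             /* caller closes the original */
      ret = -1;
   close(pipe_fds[1]);
   return ret;
}
```

Without the dup(), both close() calls would target the same descriptor number, and the second one could hit an unrelated fd that another thread had opened in the meantime. That would be consistent with a crash that takes a while to reproduce.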


[Mesa-dev] [PATCH] i965: Drop "Destination Element Offset" from Ironlake SGVs.

2017-04-28 Thread Kenneth Graunke
The Ironlake documentation is terrible, so it's unclear whether or not
this field exists there.  It definitely doesn't exist on Sandybridge
and later.  It definitely does exist on G45.

We haven't been setting it for our normal vertex attributes - just
the SGVs (VertexID, InstanceID, BaseVertex, BaseInstance, DrawID).
We should be consistent.  My guess is that it isn't necessary and
doesn't exist - this patch drops it from the SGVs elements, making
them follow the behavior of most attributes.
---
 src/mesa/drivers/dri/i965/brw_draw_upload.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_draw_upload.c b/src/mesa/drivers/dri/i965/brw_draw_upload.c
index 7846293cb1b..002e863a649 100644
--- a/src/mesa/drivers/dri/i965/brw_draw_upload.c
+++ b/src/mesa/drivers/dri/i965/brw_draw_upload.c
@@ -1096,7 +1096,8 @@ brw_emit_vertices(struct brw_context *brw)
  dw0 |= BRW_VE0_VALID |
 brw->vb.nr_buffers << BRW_VE0_INDEX_SHIFT |
 ISL_FORMAT_R32G32_UINT << BRW_VE0_FORMAT_SHIFT;
-dw1 |= (i * 4) << BRW_VE1_DST_OFFSET_SHIFT;
+ if (brw->gen == 4)
+dw1 |= (i * 4) << BRW_VE1_DST_OFFSET_SHIFT;
   }
 
   /* Note that for gl_VertexID, gl_InstanceID, and gl_PrimitiveID values,
@@ -1124,7 +1125,8 @@ brw_emit_vertices(struct brw_context *brw)
 ((brw->vb.nr_buffers + 1) << BRW_VE0_INDEX_SHIFT) |
 (ISL_FORMAT_R32_UINT << BRW_VE0_FORMAT_SHIFT);
 
-dw1 |= (i * 4) << BRW_VE1_DST_OFFSET_SHIFT;
+ if (brw->gen == 4)
+dw1 |= (i * 4) << BRW_VE1_DST_OFFSET_SHIFT;
   }
 
   OUT_BATCH(dw0);
-- 
2.12.2



Re: [Mesa-dev] [Mesa-stable] [PATCH 3/3] i965/vec4: don't modify regioning parameters to the sources of DF align1 instructions

2017-04-28 Thread Francisco Jerez
Samuel Iglesias Gonsálvez  writes:

> The regioning parameters are now properly set by convert_to_hw_regs()
> and we don't need to fix them in the generator.
>

It would be worth stressing here that the "fix" previously done in the
generator was strictly speaking wrong for any non-identity regions.
With that clarified, the patch is:

Reviewed-by: Francisco Jerez 

> Signed-off-by: Samuel Iglesias Gonsálvez 
> Cc: "17.1" 
> ---
>  src/intel/compiler/brw_vec4_generator.cpp | 9 +--------
>  1 file changed, 1 insertion(+), 8 deletions(-)
>
> diff --git a/src/intel/compiler/brw_vec4_generator.cpp 
> b/src/intel/compiler/brw_vec4_generator.cpp
> index e786ac6a0ca..753b00c4ed1 100644
> --- a/src/intel/compiler/brw_vec4_generator.cpp
> +++ b/src/intel/compiler/brw_vec4_generator.cpp
> @@ -1980,8 +1980,6 @@ generate_code(struct brw_codegen *p,
>   else
>  spread_dst = stride(dst, 8, 4, 2);
>  
> - src[0].vstride = BRW_VERTICAL_STRIDE_4;
> - src[0].width = BRW_WIDTH_4;
>   brw_MOV(p, spread_dst, src[0]);
>  
>   brw_set_default_access_mode(p, BRW_ALIGN_16);
> @@ -2016,9 +2014,7 @@ generate_code(struct brw_codegen *p,
>   src[0] = retype(src[0], BRW_REGISTER_TYPE_UD);
>   if (inst->opcode == VEC4_OPCODE_PICK_HIGH_32BIT)
>  src[0] = suboffset(src[0], 1);
> - src[0].vstride = BRW_VERTICAL_STRIDE_8;
> - src[0].width = BRW_WIDTH_4;
> - src[0].hstride = BRW_HORIZONTAL_STRIDE_2;
> + src[0] = spread(src[0], 2);
>   brw_MOV(p, dst, src[0]);
>  
>   brw_set_default_access_mode(p, BRW_ALIGN_16);
> @@ -2041,9 +2037,6 @@ generate_code(struct brw_codegen *p,
>   dst.hstride = BRW_HORIZONTAL_STRIDE_2;
>  
>   src[0] = retype(src[0], BRW_REGISTER_TYPE_UD);
> - src[0].vstride = BRW_VERTICAL_STRIDE_4;
> - src[0].width = BRW_WIDTH_4;
> - src[0].hstride = BRW_HORIZONTAL_STRIDE_1;
>   brw_MOV(p, dst, src[0]);
>  
>   brw_set_default_access_mode(p, BRW_ALIGN_16);
> -- 
> 2.11.0
>
> ___
> mesa-stable mailing list
> mesa-sta...@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-stable




Re: [Mesa-dev] [Mesa-stable] [PATCH 2/3] i965/vec4: fix register width for DF VGRF and UNIFORM

2017-04-28 Thread Francisco Jerez
Samuel Iglesias Gonsálvez  writes:

> On gen7, the swizzles used in DF align16 instructions work on 32-bit
> elements, so we can address only 2 consecutive DFs. As we assume that
> in the rest of the code and prepare the instructions for it (scalarize_df()),
> we need to set the width to two again.
>
> However, for DF align1 instructions, a width of 2 is wrong as we would not
> read the data we want. For example, a uniform would have a region of
> <0, 2, 1>, so it would repeat the first 2 DFs when we wanted to access
> the first 4.
>
> This patch sets the default width to 4 and then modifies the width of
> align16 instructions' DF sources when we translate the logical swizzle
> to the physical one.
>
> Signed-off-by: Samuel Iglesias Gonsálvez 
> Cc: "17.1" 
> ---
>  src/intel/compiler/brw_vec4.cpp | 13 ++++++++-----
>  1 file changed, 8 insertions(+), 5 deletions(-)
>
> diff --git a/src/intel/compiler/brw_vec4.cpp b/src/intel/compiler/brw_vec4.cpp
> index 95f96ea69c0..8b755e1b75e 100644
> --- a/src/intel/compiler/brw_vec4.cpp
> +++ b/src/intel/compiler/brw_vec4.cpp
> @@ -2003,9 +2003,7 @@ vec4_visitor::convert_to_hw_regs()
>   struct brw_reg reg;
>   switch (src.file) {
>   case VGRF: {
> -const unsigned type_size = type_sz(src.type);
> -const unsigned width = REG_SIZE / 2 / MAX2(4, type_size);
> -reg = byte_offset(brw_vecn_grf(width, src.nr, 0), src.offset);
> +reg = byte_offset(brw_vecn_grf(4, src.nr, 0), src.offset);
>  reg.type = src.type;
>  reg.abs = src.abs;
>  reg.negate = src.negate;
> @@ -2013,12 +2011,11 @@ vec4_visitor::convert_to_hw_regs()
>   }
>  
>   case UNIFORM: {
> -const unsigned width = REG_SIZE / 2 / MAX2(4, type_sz(src.type));
>  reg = stride(byte_offset(brw_vec4_grf(
>  
> prog_data->base.dispatch_grf_start_reg +
>  src.nr / 2, src.nr % 2 * 4),
>   src.offset),
> - 0, width, 1);
> + 0, 4, 1);
>  reg.type = src.type;
>  reg.abs = src.abs;
>  reg.negate = src.negate;
> @@ -2576,6 +2573,12 @@ vec4_visitor::apply_logical_swizzle(struct brw_reg 
> *hw_reg,
> assert(brw_is_single_value_swizzle(reg.swizzle) ||
>is_supported_64bit_region(inst, arg));
>  
> +   /* Apply the region <2, 2, 1> for GRF or <0, 2, 1> for uniforms, as align16
> +* HW can only do 32-bit swizzle channels.
> +*/
> +   if (reg.file == UNIFORM || reg.file == VGRF)
> +  hw_reg->width = BRW_WIDTH_2;

Any reason this is conditional on the register file?  Originally we were
only setting the width to 2 for the UNIFORM and VGRF files, but that was
probably an oversight...
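
For readers less familiar with regioning, the difference between <0, 2, 1> and <0, 4, 1> described in the commit message can be illustrated with a simplified addressing model. This is a sketch with strides given in elements rather than in their hardware encoding, and region_element() is a made-up name, not a function from the i965 compiler:

```c
/* Simplified model of an Intel <vstride, width, hstride> source region
 * (strides given in elements, not in their hardware encoding). Returns the
 * element offset, relative to the region origin, read by channel `ch`.
 * This is an illustration, not code from the i965 compiler.
 */
static unsigned
region_element(unsigned vstride, unsigned width, unsigned hstride,
               unsigned ch)
{
   return (ch / width) * vstride + (ch % width) * hstride;
}
```

For an instruction with an execution size of 4, <0, 2, 1> reads elements 0, 1, 0, 1 (the first two DFs repeated), while <0, 4, 1> reads elements 0, 1, 2, 3, and <2, 2, 1> also reads 0, 1, 2, 3.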

> +
> if (is_supported_64bit_region(inst, arg) &&
> !is_gen7_supported_64bit_swizzle(inst, arg)) {
>/* Supported 64-bit swizzles are those such that their first two
> -- 
> 2.11.0
>
> ___
> mesa-stable mailing list
> mesa-sta...@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-stable




Re: [Mesa-dev] [Mesa-stable] [PATCH 1/3] i965/vec4: fix vertical stride to avoid breaking region parameter rule

2017-04-28 Thread Francisco Jerez
Samuel Iglesias Gonsálvez  writes:

> From IVB PRM, vol4, part3, "General Restrictions on Regioning
> Parameters":
>
>   "If ExecSize = Width and HorzStride ≠ 0, VertStride must
>be set to Width * HorzStride."
>
> In the next patch, we are going to modify the region parameters for
> uniforms and VGRFs. Uniforms that are the source of
> DF align1 instructions will have <0, 4, 1> regioning and
> the execsize for those instructions will be 4, so they will break
> the regioning rule. The same applies to VGRF sources where
> we use the vstride == 0 exploit.
>
> As we know we are not going to cross the GRF boundary with that
> execsize and parameters (not even with the exploit), we just fix
> the vstride here.
>
> Signed-off-by: Samuel Iglesias Gonsálvez 
> Cc: "17.1" 
> ---
>  src/intel/compiler/brw_reg.h    | 15 +++++++++++++++
>  src/intel/compiler/brw_vec4.cpp | 19 +++++++++++++++++++
>  2 files changed, 34 insertions(+)
>
> diff --git a/src/intel/compiler/brw_reg.h b/src/intel/compiler/brw_reg.h
> index 17a51fbd655..24e09a84fce 100644
> --- a/src/intel/compiler/brw_reg.h
> +++ b/src/intel/compiler/brw_reg.h
> @@ -914,6 +914,21 @@ static inline unsigned cvt(unsigned val)
> return 0;
>  }
>  
> +static inline unsigned inv_cvt(unsigned val)
> +{
> +   switch (val) {
> +   case 0: return 0;
> +   case 1: return 1;
> +   case 2: return 2;
> +   case 3: return 4;
> +   case 4: return 8;
> +   case 5: return 16;
> +   case 6: return 32;
> +   }
> +   return 0;
> +}
> +
> +

This helper function would be unnecessary if you rearrange things
slightly as suggested below.

>  static inline struct brw_reg
>  stride(struct brw_reg reg, unsigned vstride, unsigned width, unsigned hstride)
>  {
> diff --git a/src/intel/compiler/brw_vec4.cpp b/src/intel/compiler/brw_vec4.cpp
> index f9b805ea5a9..95f96ea69c0 100644
> --- a/src/intel/compiler/brw_vec4.cpp
> +++ b/src/intel/compiler/brw_vec4.cpp
> @@ -38,6 +38,8 @@ using namespace brw;
>  
>  namespace brw {
>  
> +static bool is_align1_df(vec4_instruction *inst);
> +

Maybe just move the definition up here so the forward declaration
becomes unnecessary?

>  void
>  src_reg::init()
>  {
> @@ -2049,6 +2051,23 @@ vec4_visitor::convert_to_hw_regs()
>  
>   apply_logical_swizzle(&reg, inst, i);
>   src = reg;
> +
> + /* From IVB PRM, vol4, part3, "General Restrictions on Regioning
> +  * Parameters":
> +  *
> +  *   "If ExecSize = Width and HorzStride ≠ 0, VertStride must be set
> +  *to Width * HorzStride."
> +  *
> +  * We can break this rule with DF sources on DF align1
> +  * instructions, because the exec_size would be 4 and width is 4.
> +  * As we know we are not accessing to next GRF, it is safe to
> +  * set vstride to the formula given by the rule itself.
> +  */
> + if (is_align1_df(inst) && inst->exec_size == inv_cvt(src.width + 1)) {

'cvt(inst->exec_size) - 1 == src.width'

> +const unsigned width = inv_cvt(src.width + 1);
> +const unsigned hstride = inv_cvt(src.hstride);

You can drop these two lines.

> +src.vstride = cvt(width * hstride);

src.vstride = src.hstride + src.width;

> + }

With these comments taken into account, the patch is:

Reviewed-by: Francisco Jerez 
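
The simplification suggested above (src.vstride = src.hstride + src.width) works because the encoded fields are essentially logarithmic, so a product of decoded values becomes a sum of encoded ones. A small sketch to check this follows. sketch_cvt() is a local copy written as the inverse of the inv_cvt() table in the patch and is assumed to match cvt() in brw_reg.h, which the diff shows only in part:

```c
/* Local copy of the encoding for illustration: the inverse of the inv_cvt()
 * table in the patch (0->0, 1->1, 2->2, 4->3, 8->4, 16->5, 32->6). It is
 * assumed here to match cvt() in brw_reg.h.
 */
static unsigned
sketch_cvt(unsigned val)
{
   switch (val) {
   case 0: return 0;
   case 1: return 1;
   case 2: return 2;
   case 4: return 3;
   case 8: return 4;
   case 16: return 5;
   case 32: return 6;
   }
   return 0;
}

/* Returns 1 if, for a region with the given width and non-zero hstride
 * (both in elements), the encoded vstride required by the PRM rule,
 * cvt(width * hstride), equals the encoded hstride plus the encoded width.
 * The width field is stored as cvt(width) - 1, as implied by the
 * inv_cvt(src.width + 1) expression in the patch.
 */
static int
encoded_vstride_is_sum(unsigned width, unsigned hstride)
{
   const unsigned enc_width = sketch_cvt(width) - 1;
   const unsigned enc_hstride = sketch_cvt(hstride);

   return sketch_cvt(width * hstride) == enc_hstride + enc_width;
}
```

For example, width 4 and hstride 1 encode as 2 and 1, and the required vstride of 4 encodes as 3. The identity only holds while width * hstride stays within the encodable range (up to 32) and hstride is non-zero, which matches the precondition of the PRM rule.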

>}
>  
>if (inst->is_3src(devinfo)) {
> -- 
> 2.11.0
>
> ___
> mesa-stable mailing list
> mesa-sta...@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-stable




Re: [Mesa-dev] [PATCH v2 1/2] i965/vec4: fix swizzle and writemask when loading an uniform with constant offset

2017-04-28 Thread Francisco Jerez
Samuel Iglesias Gonsálvez  writes:

> It was setting the XYWZ swizzle and writemask for all uniforms, no matter if
> they were a vector or a scalar, which can lead to problems when loading them
> to the push constant buffer.
>
> Moreover, the 'shift' calculation was designed to calculate the offset in
> DWORDS, but it doesn't take DFs into account, so the calculated swizzle
> for the latter was wrong.
>
> The indirect case is not changed because MOV INDIRECT will write
> to all components. Added an assert to verify that these uniforms
> are aligned.
>
> v2:
> - Fix 'shift' calculation (Curro)
> - Set both swizzle and writemask.
> - Add assert(shift == 0) for the indirect case.
>
> Signed-off-by: Samuel Iglesias Gonsálvez 
> Cc: "17.1" 

Reviewed-by: Francisco Jerez 
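
As a worked example of the 'shift' fix described in the commit message, the sketch below computes the component index of a uniform inside its 16-byte vec4 slot. The helper name is made up for illustration and does not appear in brw_vec4_nir.cpp:

```c
/* Illustrative helper, not from brw_vec4_nir.cpp: component index of a
 * uniform inside its 16-byte vec4 slot, given its byte offset and the size
 * of one component (4 for float/int, 8 for double).
 */
static unsigned
uniform_component_shift(unsigned byte_offset, unsigned type_size)
{
   return (byte_offset % 16) / type_size;
}
```

A double at byte offset 8 is the second DF in its slot, so the shift should be 1. Dividing by a hard-coded 4, as the old code did, would give 2 instead.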

> ---
>  src/intel/compiler/brw_vec4_nir.cpp | 15 +++++++++++----
>  1 file changed, 11 insertions(+), 4 deletions(-)
>
> diff --git a/src/intel/compiler/brw_vec4_nir.cpp 
> b/src/intel/compiler/brw_vec4_nir.cpp
> index a82d52088a8..80115aca0f9 100644
> --- a/src/intel/compiler/brw_vec4_nir.cpp
> +++ b/src/intel/compiler/brw_vec4_nir.cpp
> @@ -852,7 +852,8 @@ vec4_visitor::nir_emit_intrinsic(nir_intrinsic_instr *instr)
> * The swizzle also works in the indirect case as the generator adds
> * the swizzle to the offset for us.
> */
> -  unsigned shift = (nir_intrinsic_base(instr) % 16) / 4;
> +  const int type_size = type_sz(src.type);
> +  unsigned shift = (nir_intrinsic_base(instr) % 16) / type_size;
>assert(shift + instr->num_components <= 4);
>  
>nir_const_value *const_offset = nir_src_as_const_value(instr->src[0]);
> @@ -860,14 +861,20 @@ vec4_visitor::nir_emit_intrinsic(nir_intrinsic_instr 
> *instr)
>   /* Offsets are in bytes but they should always be multiples of 4 */
>   assert(const_offset->u32[0] % 4 == 0);
>  
> - unsigned offset = const_offset->u32[0] + shift * 4;
> + src.swizzle = brw_swizzle_for_size(instr->num_components);
> + dest.writemask = brw_writemask_for_size(instr->num_components);
> + unsigned offset = const_offset->u32[0] + shift * type_size;
>   src.offset = ROUND_DOWN_TO(offset, 16);
> - shift = (offset % 16) / 4;
> + shift = (offset % 16) / type_size;
> + assert(shift + instr->num_components <= 4);
>   src.swizzle += BRW_SWIZZLE4(shift, shift, shift, shift);
>  
>   emit(MOV(dest, src));
>} else {
> - src.swizzle += BRW_SWIZZLE4(shift, shift, shift, shift);
> + /* Uniform arrays are vec4 aligned, because of std140 alignment
> +  * rules.
> +  */
> + assert(shift == 0);
>  
>   src_reg indirect = get_nir_src(instr->src[0], BRW_REGISTER_TYPE_UD, 
> 1);
>  
> -- 
> 2.11.0




Re: [Mesa-dev] [PATCH] fix minor error in YUV2RGB matrix used in shader

2017-04-28 Thread Kenneth Graunke
On Friday, April 28, 2017 3:02:01 PM PDT Eric Anholt wrote:
> Johnson Lin  writes:
> 
> > The matrix used for YCbCr to RGB is listed in Wiki 
> > https://en.wikipedia.org/wiki/YCbCr;
> > There is minor error in the matrix constant: 0.0625=16/256 should be 
> > 16.0/255,
> >  and 0.5=128.0/256 should be 128.0/255.
> > Note that conversion from a 0-255 byte number to 0-1.0 float is to divide 
> > by 255
> >  instead of 256. That's we get 255=1.0f.
> > By the constant change we can see the CSC result is bit aligned with
> > Wiki conversion result and FFMPeg result.
> > Otherwise in some situation, there will be one bit difference
> >
> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100854
> > ---
> >  src/compiler/nir/nir_lower_tex.c | 6 +++---
> >  1 file changed, 3 insertions(+), 3 deletions(-)
> >
> > diff --git a/src/compiler/nir/nir_lower_tex.c 
> > b/src/compiler/nir/nir_lower_tex.c
> > index 352d1499bc8d..385739a56a71 100644
> > --- a/src/compiler/nir/nir_lower_tex.c
> > +++ b/src/compiler/nir/nir_lower_tex.c
> > @@ -244,9 +244,9 @@ convert_yuv_to_rgb(nir_builder *b, nir_tex_instr *tex,
> > nir_ssa_def *yuv =
> >nir_vec4(b,
> > nir_fmul(b, nir_imm_float(b, 1.16438356f),
> > -nir_fadd(b, y, nir_imm_float(b, -0.0625f))),
> > -   nir_channel(b, nir_fadd(b, u, nir_imm_float(b, -0.5f)), 0),
> > -   nir_channel(b, nir_fadd(b, v, nir_imm_float(b, -0.5f)), 0),
> > +nir_fadd(b, y, nir_imm_float(b, -0.0627451f))),
> > +   nir_channel(b, nir_fadd(b, u, nir_imm_float(b, 
> > -0.50196078431f)), 0),
> > +   nir_channel(b, nir_fadd(b, v, nir_imm_float(b, 
> > -0.50196078431f)), 0),
> > nir_imm_float(b, 0.0));
> 
> Could we use 16.0/255.0 and 128.0/255.0, instead of magic-looking
> numbers?  With that, it will be:
> 
> Reviewed-by: Eric Anholt 
> 

Also, please start the commit title with "nir: "




[Mesa-dev] [PATCH 0/4] Call for testing: Gallium set_index_buffer removal etc.

2017-04-28 Thread Marek Olšák
Hi,

This series shrinks various gallium structures and removes
set_index_buffer in order to decrease CPU overhead.


PART 1: Performance results

All testing below was done with radeonsi, and I used the drawoverhead
microbenchmark from mesa/demos ported to piglit and using GL 3.0
Compat and GL 3.2 Core (same GL states in both contexts).

1) Performance difference for the removal of set_index_buffer only:

  Compat: DrawElements: 5.1 -> 5.3 million draws/second
  Core:   DrawElements: 5.1 -> 5.5 million draws/second

The result is better for the core profile where u_vbuf is disabled.


2) Performance difference with all 4 patches (Core profile only)

   DrawArrays: 8.3 -> 8.5 million draws/second
   DrawElements: 5.2 -> 5.8 million draws/second


3) Performance difference with threaded Gallium (Core profile only):

   DrawElements: 5.9 -> 7.1 million draws/second

Threaded Gallium is still a work in progress and might require
a non-trivial amount of driver work.
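
To put the numbers above in relative terms, here is a small C sketch
that turns a before/after throughput pair into a percentage gain. The
function name speedup_percent is an assumption for this illustration;
the inputs are the draws/second figures quoted in this mail.

```c
#include <assert.h>

/* Relative throughput gain in percent when going from `before` to `after`. */
static double speedup_percent(double before, double after)
{
   assert(before > 0.0);
   return (after - before) / before * 100.0;
}
```

Applied to the figures above, this gives roughly 3.9% (Compat) and 7.8%
(Core) for the set_index_buffer removal alone, about 2.4% (DrawArrays)
and 11.5% (DrawElements) for all 4 patches, and about 20.3% for
DrawElements with threaded Gallium.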


PART 2: Call for testing

These drivers have been tested:
- ddebug
- llvmpipe
- r300 (also with SWTCL)
- r600
- radeonsi
- softpipe
- trace

These drivers need testing:
- etnaviv
- freedreno
- nv30
- nv50
- nvc0
- svga
- swr
- vc4
- virgl

The following state trackers might need testing:
- nine

You can get the patches by fetching:
  git://people.freedesktop.org/~mareko/mesa gallium-cleanup

I'd like to ask you to test the drivers that I couldn't test.
Please let me know when you're done testing and if things are good.
After that, I'll push everything assuming the code review goes well.
You can also ignore this if you don't mind fixing your driver in
the master branch later.

Thanks,

Marek


[Mesa-dev] [PATCH 4/4] st/mesa: don't call util_draw_init_info in st_draw_vbo

2017-04-28 Thread Marek Olšák
From: Marek Olšák 

---
 src/mesa/state_tracker/st_draw.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/src/mesa/state_tracker/st_draw.c b/src/mesa/state_tracker/st_draw.c
index 3fee0cd..8b657be 100644
--- a/src/mesa/state_tracker/st_draw.c
+++ b/src/mesa/state_tracker/st_draw.c
@@ -159,66 +159,72 @@ st_draw_vbo(struct gl_context *ctx,
/* Validate state. */
if ((st->dirty | ctx->NewDriverState) & ST_PIPELINE_RENDER_STATE_MASK ||
st->gfx_shaders_may_be_dirty) {
   st_validate_state(st, ST_PIPELINE_RENDER);
}
 
if (st->vertex_array_out_of_memory) {
   return;
}
 
-   util_draw_init_info(&info);
+   /* Initialize pipe_draw_info. */
+   info.primitive_restart = false;
+   info.vertices_per_patch = ctx->TessCtrlProgram.patch_vertices;
+   info.indirect = NULL;
+   info.count_from_stream_output = NULL;
 
if (ib) {
   struct gl_buffer_object *bufobj = ib->obj;
 
   /* Get index bounds for user buffers. */
   if (!index_bounds_valid)
  if (!all_varyings_in_vbos(arrays))
 vbo_get_minmax_indices(ctx, prims, ib, &min_index, &max_index,
nr_prims);
 
   info.index_size = ib->index_size;
   info.min_index = min_index;
   info.max_index = max_index;
 
   if (_mesa_is_bufferobj(bufobj)) {
  /* indices are in a real VBO */
+ info.has_user_indices = false;
  info.index.resource = st_buffer_object(bufobj)->buffer;
  start = pointer_to_offset(ib->ptr) / info.index_size;
   } else {
  /* indices are in user space memory */
  info.has_user_indices = true;
  info.index.user = ib->ptr;
   }
 
   setup_primitive_restart(ctx, &info);
}
else {
+  info.index_size = 0;
+
   /* Transform feedback drawing is always non-indexed. */
   /* Set info.count_from_stream_output. */
   if (tfb_vertcount) {
  if (!st_transform_feedback_draw_init(tfb_vertcount, stream, &info))
 return;
   }
}
 
assert(!indirect);
 
/* do actual drawing */
for (i = 0; i < nr_prims; i++) {
   info.mode = translate_prim(ctx, prims[i].mode);
   info.start = start + prims[i].start;
   info.count = prims[i].count;
   info.start_instance = prims[i].base_instance;
   info.instance_count = prims[i].num_instances;
-  info.vertices_per_patch = ctx->TessCtrlProgram.patch_vertices;
   info.index_bias = prims[i].basevertex;
   info.drawid = prims[i].draw_id;
   if (!ib) {
  info.min_index = info.start;
  info.max_index = info.start + info.count - 1;
   }
 
   if (ST_DEBUG & DEBUG_DRAW) {
  debug_printf("st/draw: mode %s  start %u  count %u  index_size %d\n",
   u_prim_name(info.mode),
-- 
2.7.4
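
The pattern in this patch (set the invariant fields once, explicitly,
and only touch per-primitive fields inside the loop) can be sketched in
plain C as below. This is a minimal illustration under assumptions:
draw_info_sketch and prim_sketch are hypothetical, trimmed-down
stand-ins, not the real pipe_draw_info or _mesa_prim, and the "draw" is
replaced by summing vertex counts.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, trimmed-down stand-in for pipe_draw_info. */
struct draw_info_sketch {
   unsigned start, count;          /* vary per primitive */
   unsigned index_size;            /* 0 means non-indexed */
   unsigned vertices_per_patch;    /* constant for the whole call */
   bool primitive_restart;
   bool has_user_indices;
   const void *indirect;
   const void *count_from_stream_output;
};

/* Hypothetical stand-in for one primitive of the draw. */
struct prim_sketch {
   unsigned start, count;
};

/* Write every field that does not change between primitives exactly once.
 * Without a memset, any field that is read later must be set here. */
static void init_invariant_fields(struct draw_info_sketch *info,
                                  unsigned patch_vertices,
                                  unsigned index_size)
{
   info->primitive_restart = false;
   info->vertices_per_patch = patch_vertices;
   info->indirect = NULL;
   info->count_from_stream_output = NULL;
   info->index_size = index_size;
   info->has_user_indices = false;
}

/* Per-primitive loop: only the fields that actually change are written.
 * Returns the total vertex count as a stand-in for issuing the draws. */
static unsigned draw_all(struct draw_info_sketch *info,
                         const struct prim_sketch *prims, unsigned nr_prims)
{
   unsigned total = 0;

   for (unsigned i = 0; i < nr_prims; i++) {
      info->start = prims[i].start;
      info->count = prims[i].count;
      total += info->count;   /* the real code would call draw_vbo here */
   }
   return total;
}
```

The trade-off is that dropping the blanket initialization makes every
code path responsible for its own fields, which is why the diff above
adds explicit assignments such as info.has_user_indices = false and
info.index_size = 0.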



[Mesa-dev] [PATCH 2/4] gallium: separate indirect stuff from pipe_draw_info - 80 -> 56 bytes

2017-04-28 Thread Marek Olšák
From: Marek Olšák 

For faster initialization of non-indirect draws.
---
 src/gallium/auxiliary/util/u_draw.c  |  4 +-
 src/gallium/auxiliary/util/u_dump_state.c| 15 ---
 src/gallium/auxiliary/util/u_vbuf.c  |  8 ++--
 src/gallium/docs/source/screen.rst   |  2 +-
 src/gallium/drivers/ddebug/dd_draw.c | 42 ++---
 src/gallium/drivers/ddebug/dd_pipe.h |  7 ++-
 src/gallium/drivers/nouveau/nvc0/nvc0_vbo.c  | 16 +++
 src/gallium/drivers/r600/r600_state_common.c | 12 ++---
 src/gallium/drivers/radeonsi/si_state_draw.c | 59 +---
 src/gallium/drivers/trace/tr_dump_state.c| 12 -
 src/gallium/include/pipe/p_state.h   | 67 +++-
 src/gallium/state_trackers/nine/device9.c|  1 -
 src/gallium/state_trackers/nine/nine_state.c |  1 -
 src/mesa/state_tracker/st_draw.c | 19 
 14 files changed, 152 insertions(+), 113 deletions(-)

diff --git a/src/gallium/auxiliary/util/u_draw.c 
b/src/gallium/auxiliary/util/u_draw.c
index ca78648..e7abbfc 100644
--- a/src/gallium/auxiliary/util/u_draw.c
+++ b/src/gallium/auxiliary/util/u_draw.c
@@ -138,22 +138,22 @@ util_draw_indirect(struct pipe_context *pipe,
uint32_t *params;
const unsigned num_params = info_in->indexed ? 5 : 4;
 
assert(info_in->indirect);
assert(!info_in->count_from_stream_output);
 
memcpy(&info, info_in, sizeof(info));
 
params = (uint32_t *)
   pipe_buffer_map_range(pipe,
-info_in->indirect,
-info_in->indirect_offset,
+info_in->indirect->buffer,
+info_in->indirect->offset,
 num_params * sizeof(uint32_t),
 PIPE_TRANSFER_READ,
 &transfer);
if (!transfer) {
   debug_printf("%s: failed to map indirect buffer\n", __FUNCTION__);
   return;
}
 
info.count = params[0];
info.instance_count = params[1];
diff --git a/src/gallium/auxiliary/util/u_dump_state.c 
b/src/gallium/auxiliary/util/u_dump_state.c
index 0af81f7..9c32557 100644
--- a/src/gallium/auxiliary/util/u_dump_state.c
+++ b/src/gallium/auxiliary/util/u_dump_state.c
@@ -932,25 +932,30 @@ util_dump_draw_info(FILE *stream, const struct 
pipe_draw_info *state)
 
util_dump_member(stream, int,  state, index_bias);
util_dump_member(stream, uint, state, min_index);
util_dump_member(stream, uint, state, max_index);
 
util_dump_member(stream, bool, state, primitive_restart);
util_dump_member(stream, uint, state, restart_index);
 
util_dump_member(stream, ptr, state, count_from_stream_output);
 
-   util_dump_member(stream, ptr, state, indirect);
-   util_dump_member(stream, uint, state, indirect_offset);
-   util_dump_member(stream, uint, state, indirect_stride);
-   util_dump_member(stream, uint, state, indirect_count);
-   util_dump_member(stream, uint, state, indirect_params_offset);
+   if (!state->indirect) {
+  util_dump_member(stream, ptr, state, indirect);
+   } else {
+  util_dump_member(stream, uint, state, indirect->offset);
+  util_dump_member(stream, uint, state, indirect->stride);
+  util_dump_member(stream, uint, state, indirect->draw_count);
+  util_dump_member(stream, uint, state, 
indirect->indirect_draw_count_offset);
+  util_dump_member(stream, ptr, state, indirect->buffer);
+  util_dump_member(stream, ptr, state, indirect->indirect_draw_count);
+   }
 
util_dump_struct_end(stream);
 }
 
 void util_dump_box(FILE *stream, const struct pipe_box *box)
 {
if (!box) {
   util_dump_null(stream);
   return;
}
diff --git a/src/gallium/auxiliary/util/u_vbuf.c 
b/src/gallium/auxiliary/util/u_vbuf.c
index 62b88ac..9d6d529 100644
--- a/src/gallium/auxiliary/util/u_vbuf.c
+++ b/src/gallium/auxiliary/util/u_vbuf.c
@@ -1161,29 +1161,29 @@ void u_vbuf_draw_vbo(struct u_vbuf *mgr, const struct 
pipe_draw_info *info)
}
 
new_info = *info;
 
/* Fallback. We need to know all the parameters. */
if (new_info.indirect) {
   struct pipe_transfer *transfer = NULL;
   int *data;
 
   if (new_info.indexed) {
- data = pipe_buffer_map_range(pipe, new_info.indirect,
-  new_info.indirect_offset, 20,
+ data = pipe_buffer_map_range(pipe, new_info.indirect->buffer,
+  new_info.indirect->offset, 20,
   PIPE_TRANSFER_READ, &transfer);
  new_info.index_bias = data[3];
  new_info.start_instance = data[4];
   }
   else {
- data = pipe_buffer_map_range(pipe, new_info.indirect,
-  new_info.indirect_offset, 16,
+ data = pipe_buffer_map_range(pipe, new_info.indirect->buffer,
+  new_info.indirect->offset, 16,
 

Re: [Mesa-dev] [PATCH v02 13/37] i965: Split out enum from brw_eu_defines.h

2017-04-28 Thread Kenneth Graunke
On Monday, April 24, 2017 3:19:08 PM PDT Rafael Antognolli wrote:
> We need to use some enums inside genX_state_upload.c, but including the
> whole header will cause several conflicts between things defined in this
> header and the genxml auto-generated headers.
> 
> So create a separate header that is included both by brw_eu_defines.h
> and genX_state_upload.c.
> 
> Signed-off-by: Rafael Antognolli 
> ---
>  src/intel/Makefile.sources  |  1 +-
>  src/intel/compiler/brw_defines_common.h | 46 ++-
>  src/intel/compiler/brw_eu_defines.h | 22 +
>  3 files changed, 48 insertions(+), 21 deletions(-)
>  create mode 100644 src/intel/compiler/brw_defines_common.h

I don't really like this, but without digging into the issue, I'm not
sure what else to suggest.

Acked-by: Kenneth Graunke 

Maybe we could just remove the conflicting pieces?




Re: [Mesa-dev] [PATCH 12/16] travis: split the make target to three separate ones

2017-04-28 Thread Andres Gomez
On Fri, 2017-04-28 at 19:25 +0100, Emil Velikov wrote:
> From: Emil Velikov 
> 
> Split the target to allow faster builds for each run.
> 
> The overall build time will be more, yet Travis runs multiple builds in
> parallel so we're limited by the slowest one.
> 
> Things are split roughly as:
>  - DRI loaders, classic DRI drivers, classic OSMesa, make check
>  - All Gallium drivers (minus the SWR) alongside st/dri (mesa)
>  - The Vulkan drivers - ANV and RADV, make check (anv)
> 
> v2:
>  - rework RUN_CHECK to MAKE_CHECK_COMMAND
>  - explicitly disable DRI loaders
>  - generate linux/memfd.h locally and enable ANV
>  - add libedit-dev
> 
> v3: Use printf to create the header (Andres).
> 
> Signed-off-by: Emil Velikov 
> ---
>  .travis.yml | 93 
> ++---
>  1 file changed, 77 insertions(+), 16 deletions(-)
> 
> diff --git a/.travis.yml b/.travis.yml
> index 6548e85b767..5298fa11b67 100644
> --- a/.travis.yml
> +++ b/.travis.yml
> @@ -26,28 +26,21 @@ env:
>  matrix:
>include:
>  - env:
> -- LABEL="make"
> +- LABEL="make loaders/classic DRI"
>  - BUILD=make
>  - MAKEFLAGS=-j2
> -- LLVM_VERSION=3.9
> -- LLVM_CONFIG="llvm-config-${LLVM_VERSION}"
> +- MAKE_CHECK_COMMAND="make check"
> +# XXX: Add wayland platform
> +- DRI_LOADERS="--enable-glx --enable-gbm --enable-egl 
> --with-platforms=x11,drm,surfaceless --enable-osmesa"
>  - DRI_DRIVERS="i915,i965,radeon,r200,swrast,nouveau"
> -- 
> GALLIUM_DRIVERS="i915,nouveau,r300,r600,radeonsi,freedreno,svga,swrast,vc4,virgl,etnaviv,imx"
> -- VULKAN_DRIVERS="radeon"
> +- GALLIUM_DRIVERS=""
> +- VULKAN_DRIVERS=""
>addons:
>  apt:
> -  sources:
> -- llvm-toolchain-trusty-3.9
>packages:
> -# LLVM packaging is broken and misses these dependencies
> -- libedit-dev
> -# From sources above
> -- llvm-3.9-dev
> -# Common
>  - x11proto-xf86vidmode-dev
>  - libexpat1-dev
>  - libx11-xcb-dev
> -- libelf-dev
>  - env:
>  # NOTE: Building SWR is 2x (yes two) times slower than all the other
>  # gallium drivers combined.
> @@ -55,10 +48,12 @@ matrix:
>  - LABEL="make Gallium Drivers SWR"
>  - BUILD=make
>  - MAKEFLAGS=-j2
> +- MAKE_CHECK_COMMAND="true"
>  - LLVM_VERSION=3.9
>  - LLVM_CONFIG="llvm-config-${LLVM_VERSION}"
>  - OVERRIDE_CC="gcc-5"
>  - OVERRIDE_CXX="g++-5"
> +- DRI_LOADERS="--disable-glx --disable-gbm --disable-egl"
>  - DRI_DRIVERS=""
>  - GALLIUM_DRIVERS="swr"
>  - VULKAN_DRIVERS=""
> @@ -79,6 +74,57 @@ matrix:
>  - libx11-xcb-dev
>  - libelf-dev
>  - env:
> +- LABEL="make Gallium Drivers Other"
> +- BUILD=make
> +- MAKEFLAGS=-j2
> +- MAKE_CHECK_COMMAND="true"
> +- LLVM_VERSION=3.9
> +- LLVM_CONFIG="llvm-config-${LLVM_VERSION}"
> +- DRI_LOADERS="--disable-glx --disable-gbm --disable-egl"
> +- DRI_DRIVERS=""
> +- 
> GALLIUM_DRIVERS="i915,nouveau,r300,r600,radeonsi,freedreno,svga,swrast,vc4,virgl,etnaviv,imx"
> +- VULKAN_DRIVERS=""
> +  addons:
> +apt:
> +  sources:
> +- llvm-toolchain-trusty-3.9
> +  packages:

libedit-dev is missing here. It is only added later, in patch 14 of this
series. The patches should be reordered.

> +# From sources above
> +- llvm-3.9-dev
> +# Common
> +- x11proto-xf86vidmode-dev
> +- libexpat1-dev
> +- libx11-xcb-dev
> +- libelf-dev
> +- env:
> +- LABEL="make Vulkan"
> +- BUILD=make
> +- MAKEFLAGS=-j2
> +- MAKE_CHECK_COMMAND="make -C src/gtest check && make -C src/intel 
> check"
> +- LLVM_VERSION=3.9
> +- LLVM_CONFIG="llvm-config-${LLVM_VERSION}"
> +# XXX: we want to test the WSI, but those are enabled via the EGL 
> toggles
> +# XXX: Add wayland platform
> +# XXX: Platform X11 dependencies are checked when --enable-glx is set
> +- DRI_LOADERS="--enable-glx --disable-gbm --enable-egl 
> --with-platforms=x11"
> +- DRI_DRIVERS=""
> +- GALLIUM_DRIVERS=""
> +- VULKAN_DRIVERS="intel,radeon"
> +  addons:
> +apt:
> +  sources:
> +- llvm-toolchain-trusty-3.9
> +  packages:
> +# LLVM packaging is broken and misses these dependencies
> +- libedit-dev
> +# From sources above
> +- llvm-3.9-dev
> +# Common
> +- x11proto-xf86vidmode-dev
> +- libexpat1-dev
> +- libx11-xcb-dev
> +- libelf-dev
> 

Re: [Mesa-dev] [PATCH v3 06/37] genxml: Add alias for MOCS.

2017-04-28 Thread Kenneth Graunke
On Tuesday, April 25, 2017 1:59:15 PM PDT Rafael Antognolli wrote:
> Use an alias, so we can set the same value as the #define's.
> 
> v3:
>- Call it "SO Buffer MOCS" to follow the most common naming scheme.
>- Add alias for gen7 and gen75 too (Ken).
> 
> Signed-off-by: Rafael Antognolli 
> ---
>  src/intel/genxml/gen7.xml  | 1 +
>  src/intel/genxml/gen75.xml | 1 +
>  src/intel/genxml/gen8.xml  | 1 +
>  src/intel/genxml/gen9.xml  | 1 +
>  4 files changed, 4 insertions(+)
> 
> diff --git a/src/intel/genxml/gen7.xml b/src/intel/genxml/gen7.xml
> index 440258a..b63add0 100644
> --- a/src/intel/genxml/gen7.xml
> +++ b/src/intel/genxml/gen7.xml
> @@ -1642,6 +1642,7 @@
>  
>  
>   type="MEMORY_OBJECT_CONTROL_STATE"/>
> +
>  
>  
>  
> diff --git a/src/intel/genxml/gen75.xml b/src/intel/genxml/gen75.xml
> index 9f0486c..e63979c 100644
> --- a/src/intel/genxml/gen75.xml
> +++ b/src/intel/genxml/gen75.xml
> @@ -1957,6 +1957,7 @@
>  
>  
>   type="MEMORY_OBJECT_CONTROL_STATE"/>
> +
>  
>  
>  
> diff --git a/src/intel/genxml/gen8.xml b/src/intel/genxml/gen8.xml
> index 408d241..3b44406 100644
> --- a/src/intel/genxml/gen8.xml
> +++ b/src/intel/genxml/gen8.xml
> @@ -2064,6 +2064,7 @@
>  
>  
>   type="MEMORY_OBJECT_CONTROL_STATE"/>
> +
>   type="bool"/>
>   end="52" type="bool"/>
>  
> diff --git a/src/intel/genxml/gen9.xml b/src/intel/genxml/gen9.xml
> index 59daa31..d78a321 100644
> --- a/src/intel/genxml/gen9.xml
> +++ b/src/intel/genxml/gen9.xml
> @@ -2246,6 +2246,7 @@
>  
>  
>   type="MEMORY_OBJECT_CONTROL_STATE"/>
> +
>   type="bool"/>
>   end="52" type="bool"/>
>  
> 

Can we just get rid of "SO Buffer Object Control State" then?
I don't think anything uses it, and your new field is easier to use.

--Ken




Re: [Mesa-dev] [PATCH v02 02/37] genxml: Fix gen4-5 xml to make it compile correctly.

2017-04-28 Thread Kenneth Graunke
On Monday, April 24, 2017 3:18:57 PM PDT Rafael Antognolli wrote:
> Set the type of some fields, instead of prefix. Also fix the
> SAMPLER_BORDER_COLOR_STATE fields of gen5.xml.
> 
> Signed-off-by: Rafael Antognolli 

We need to squash this with the previous patch or else it breaks the
build.  I'm not planning on reviewing this.

Patches 1-2 squashed are:

Acked-by: Kenneth Graunke 




Re: [Mesa-dev] [PATCH 13/16] travis: model scons check target like the make one

2017-04-28 Thread Andres Gomez
This is:

Reviewed-by: Andres Gomez 

On Fri, 2017-04-28 at 19:25 +0100, Emil Velikov wrote:
> From: Emil Velikov 
> 
> Should make things a bit more consistent across the board.
> 
> Cc: Eric Engestrom 
> CC: Andres Gomez 
> Signed-off-by: Emil Velikov 
> ---
>  .travis.yml | 8 +++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/.travis.yml b/.travis.yml
> index 5298fa11b67..ec76cf7c9cb 100644
> --- a/.travis.yml
> +++ b/.travis.yml
> @@ -130,6 +130,8 @@ matrix:
>  - SCONSFLAGS="-j4"
>  # Explicitly disable.
>  - SCONS_TARGET="llvm=0"
> +# Keep it symmetrical to the make build.
> +- SCONS_CHECK_COMMAND="scons llvm=0 check"
>addons:
>  apt:
>packages:
> @@ -144,6 +146,8 @@ matrix:
>  - BUILD=scons
>  - SCONSFLAGS="-j4"
>  - SCONS_TARGET="llvm=1"
> +# Keep it symmetrical to the make build.
> +- SCONS_CHECK_COMMAND="scons llvm=1 check"
>  - LLVM_VERSION=3.3
>  - LLVM_CONFIG="llvm-config-${LLVM_VERSION}"
>addons:
> @@ -165,6 +169,8 @@ matrix:
>  - SCONS_TARGET="swr=1"
>  - LLVM_VERSION=3.9
>  - LLVM_CONFIG="llvm-config-${LLVM_VERSION}"
> +# Keep it symmetrical to the make build. There's no actual SWR, yet.
> +- SCONS_CHECK_COMMAND="true"
>  - OVERRIDE_CC="gcc-5"
>  - OVERRIDE_CXX="g++-5"
>addons:
> @@ -278,5 +284,5 @@ script:
>- if test "x$BUILD" = xscons; then
>test -n "$OVERRIDE_CC" && export CC="$OVERRIDE_CC";
>test -n "$OVERRIDE_CXX" && export CXX="$OVERRIDE_CXX";
> -  scons $SCONS_TARGET && scons $SCONS_TARGET check;
> +  scons $SCONS_TARGET && eval $SCONS_CHECK_COMMAND;
>  fi
-- 
Br,

Andres


Re: [Mesa-dev] [PATCH 04/16] travis: enable apt cache

2017-04-28 Thread Andres Gomez
This is:

Reviewed-by: Andres Gomez 

On Fri, 2017-04-28 at 19:25 +0100, Emil Velikov wrote:
> From: Emil Velikov 
> 
> Provides a small, but consistent improvement.
> Example numbers of the jobs added later in the series.
> 
> "make loaders/classic DRI" - 1s
> "scons SWR" - 6s
> 
> Signed-off-by: Emil Velikov 
> ---
>  .travis.yml | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/.travis.yml b/.travis.yml
> index e317a027233..061aed1bc7c 100644
> --- a/.travis.yml
> +++ b/.travis.yml
> @@ -4,6 +4,7 @@ sudo: false
>  dist: trusty
>  
>  cache:
> +  apt: true
>directories:
>  - $HOME/.ccache
>  
-- 
Br,

Andres


Re: [Mesa-dev] [PATCH v02 37/37] i965: Port gen4+ state emitting code to genxml.

2017-04-28 Thread Kenneth Graunke
On Monday, April 24, 2017 3:19:32 PM PDT Rafael Antognolli wrote:
> On this patch, we port:
>- brw_polygon_stipple
>- brw_polygon_stipple_offset
>- brw_line_stipple
>- brw_drawing_rect
> 
> v2:
>- Also emit states for gen4-5 with this code.
> 
> Signed-off-by: Rafael Antognolli 
> ---
>  src/mesa/drivers/dri/i965/Makefile.sources  |   1 +-
>  src/mesa/drivers/dri/i965/brw_misc_state.c  | 147 +-
>  src/mesa/drivers/dri/i965/brw_state.h   |   5 +-
>  src/mesa/drivers/dri/i965/gen6_viewport_state.c |  60 +-
>  src/mesa/drivers/dri/i965/genX_state_upload.c   | 193 +++--
>  5 files changed, 176 insertions(+), 230 deletions(-)
>  delete mode 100644 src/mesa/drivers/dri/i965/gen6_viewport_state.c
> 
> diff --git a/src/mesa/drivers/dri/i965/Makefile.sources 
> b/src/mesa/drivers/dri/i965/Makefile.sources
> index 098ceba..f89d5c2 100644
> --- a/src/mesa/drivers/dri/i965/Makefile.sources
> +++ b/src/mesa/drivers/dri/i965/Makefile.sources
> @@ -86,7 +86,6 @@ i965_FILES = \
>   gen6_scissor_state.c \
>   gen6_sol.c \
>   gen6_urb.c \
> - gen6_viewport_state.c \
>   gen7_cs_state.c \
>   gen7_l3_state.c \
>   gen7_misc_state.c \
> diff --git a/src/mesa/drivers/dri/i965/brw_misc_state.c 
> b/src/mesa/drivers/dri/i965/brw_misc_state.c
> index 83c1810..afa7e08 100644
> --- a/src/mesa/drivers/dri/i965/brw_misc_state.c
> +++ b/src/mesa/drivers/dri/i965/brw_misc_state.c
> @@ -44,32 +44,6 @@
>  #include "main/fbobject.h"
>  #include "main/glformats.h"
>  
> -/* Constant single cliprect for framebuffer object or DRI2 drawing */
> -static void
> -upload_drawing_rect(struct brw_context *brw)
> -{
> -   struct gl_context *ctx = &brw->ctx;
> -   const struct gl_framebuffer *fb = ctx->DrawBuffer;
> -   const unsigned int fb_width = _mesa_geometric_width(fb);
> -   const unsigned int fb_height = _mesa_geometric_height(fb);
> -
> -   BEGIN_BATCH(4);
> -   OUT_BATCH(_3DSTATE_DRAWING_RECTANGLE << 16 | (4 - 2));
> -   OUT_BATCH(0); /* xmin, ymin */
> -   OUT_BATCH(((fb_width - 1) & 0xffff) | ((fb_height - 1) << 16));
> -   OUT_BATCH(0);
> -   ADVANCE_BATCH();
> -}
> -
> -const struct brw_tracked_state brw_drawing_rect = {
> -   .dirty = {
> -  .mesa = _NEW_BUFFERS,
> -  .brw = BRW_NEW_BLORP |
> - BRW_NEW_CONTEXT,
> -   },
> -   .emit = upload_drawing_rect
> -};
> -
>  /**
>   * Upload pointers to the per-stage state.
>   *
> @@ -696,127 +670,6 @@ const struct brw_tracked_state brw_depthbuffer = {
> .emit = brw_emit_depthbuffer,
>  };
>  
> -/**
> - * Polygon stipple packet
> - */
> -static void
> -upload_polygon_stipple(struct brw_context *brw)
> -{
> -   struct gl_context *ctx = &brw->ctx;
> -   GLuint i;
> -
> -   /* _NEW_POLYGON */
> -   if (!ctx->Polygon.StippleFlag)
> -  return;
> -
> -   BEGIN_BATCH(33);
> -   OUT_BATCH(_3DSTATE_POLY_STIPPLE_PATTERN << 16 | (33 - 2));
> -
> -   /* Polygon stipple is provided in OpenGL order, i.e. bottom
> -* row first.  If we're rendering to a window (i.e. the
> -* default frame buffer object, 0), then we need to invert
> -* it to match our pixel layout.  But if we're rendering
> -* to a FBO (i.e. any named frame buffer object), we *don't*
> -* need to invert - we already match the layout.
> -*/
> -   if (_mesa_is_winsys_fbo(ctx->DrawBuffer)) {
> -  for (i = 0; i < 32; i++)
> -   OUT_BATCH(ctx->PolygonStipple[31 - i]); /* invert */
> -   } else {
> -  for (i = 0; i < 32; i++)
> -  OUT_BATCH(ctx->PolygonStipple[i]);
> -   }
> -   ADVANCE_BATCH();
> -}
> -
> -const struct brw_tracked_state brw_polygon_stipple = {
> -   .dirty = {
> -  .mesa = _NEW_POLYGON |
> -  _NEW_POLYGONSTIPPLE,
> -  .brw = BRW_NEW_CONTEXT,
> -   },
> -   .emit = upload_polygon_stipple
> -};
> -
> -/**
> - * Polygon stipple offset packet
> - */
> -static void
> -upload_polygon_stipple_offset(struct brw_context *brw)
> -{
> -   struct gl_context *ctx = &brw->ctx;
> -
> -   /* _NEW_POLYGON */
> -   if (!ctx->Polygon.StippleFlag)
> -  return;
> -
> -   BEGIN_BATCH(2);
> -   OUT_BATCH(_3DSTATE_POLY_STIPPLE_OFFSET << 16 | (2-2));
> -
> -   /* _NEW_BUFFERS
> -*
> -* If we're drawing to a system window we have to invert the Y axis
> -* in order to match the OpenGL pixel coordinate system, and our
> -* offset must be matched to the window position.  If we're drawing
> -* to a user-created FBO then our native pixel coordinate system
> -* works just fine, and there's no window system to worry about.
> -*/
> -   if (_mesa_is_winsys_fbo(ctx->DrawBuffer))
> -  OUT_BATCH((32 - (_mesa_geometric_height(ctx->DrawBuffer) & 31)) & 31);
> -   else
> -  OUT_BATCH(0);
> -   ADVANCE_BATCH();
> -}
> -
> -const struct brw_tracked_state brw_polygon_stipple_offset = {
> -   .dirty = {
> -  .mesa = _NEW_BUFFERS |
> -  _NEW_POLYGON,
> -  .brw = BRW_NEW_CONTEXT,
> -   },
> -   .emit = 

Re: [Mesa-dev] [PATCH 02/21] anv/cmd_buffer: Use the device allocator for QueueSubmit

2017-04-28 Thread Nanley Chery
On Fri, Apr 14, 2017 at 10:37:49AM -0700, Jason Ekstrand wrote:
> The command is really operating on a Queue not a command buffer and the
> nearest object to that with an allocator is VkDevice.
> 
> Cc: "17.0" 

Should this have been Cc'ed to mesa-stable instead of mesa-dev?


[Mesa-dev] [ANNOUNCE] mesa 17.0.5

2017-04-28 Thread Andres Gomez
Mesa 17.0.5 is now available.

In this release we have:

Nouveau has a fix for a problem with instruction emission in GF100's
ISA encoding.

Intel drivers have several fixes.  The Vulkan one has some corrections
for flushing the cache (VF and texture) while setting up a null surface
state, which is cheaper than calculating when one is needed, and
corrections for the handling of the VK_ATTACHMENT_UNUSED. It also
includes a patch to disable CCS on BDW input attachments since it is
not supported by the sampler.  i965 includes a change in spilling's
cost heuristic that improves performance in Unigine Heaven, and another
fix in the register coalesce optimization.

In Mesa core we bring corrections in the API validation, including a
set of patches for glMultiDrawArrays handling and a fix for the
validation of the sampler types.  Additionally, NIR includes a
correction when using ARB_shader_clock.

On the integration side, there are several patches to fix configure and
build errors.

radv comes with a fix to properly report the timestampPeriod.

Finally, Gallivm brings a fix to avoid a hang when apps call exit and
we also include a fix in the state tracker to invalidate the readpix
cache.



Andres Gomez (17):
  cherry-ignore: Add the pci_id into the shader cache UUID
  cherry-ignore: fix crash if ctx torn down with no rendering
  cherry-ignore: Fix typos.
  cherry-ignore: Revert "etnaviv: Cannot render to rb-swapped formats"
  cherry-ignore: Revert "i965/fs: Don't emit SEL instructions for 
type-converting MOVs."
  cherry-ignore: fix typo in a2b10g10r10 fast clear calculation
  cherry-ignore: remove unused anv_dispatch_table dtable
  cherry-ignore: remove unused radv_dispatch_table dtable
  cherry-ignore: make radv_resolve_entrypoint static
  cherry-ignore: vulkan: add support for libmesa_vulkan_util
  cherry-ignore: r600: fix libmesa_amd_common dependency
  cherry-ignore: remove dead brw_new_shader() declaration
  cherry-ignore: remove i965_symbols_test reference from .gitignore
  cherry-ignore: automake: ensure that the destination directory is created
  cherry-ignore: provide required gem stubs for the tests
  Update version to 17.0.5
  docs: add release notes for 17.0.5

Boyan Ding (2):
  nvc0/ir: Properly handle a "split form" of predicate destination
  nir: Destination component count of shader_clock intrinsic is 2

Emil Velikov (5):
  docs: add sha256 checksums for 17.0.4
  winsys/sw/dri: don't use GNU void pointer arithmetic
  st/clover: add space between < and ::
  configure.ac: check require_basic_egl only if egl enabled
  st/mesa: automake: honour the vdpau header install location

Francisco Jerez (2):
  intel/fs: Use regs_written() in spilling cost heuristic for improved 
accuracy.
  intel/fs: Take into account amount of data read in spilling cost 
heuristic.

Grazvydas Ignotas (1):
  radv: report timestampPeriod correctly

Jason Ekstrand (5):
  anv/blorp: Flush the texture cache in UpdateBuffer
  anv/cmd_buffer: Flush the VF cache at the top of all primaries
  anv/cmd_buffer: Always set up a null surface state
  anv/cmd_buffer: Use the null surface state for ATTACHMENT_UNUSED
  anv/blorp: Properly handle VK_ATTACHMENT_UNUSED

Kenneth Graunke (1):
  i965/vec4: Avoid reswizzling MACH instructions in opt_register_coalesce().

Marek Olšák (1):
  st/mesa: invalidate the readpix cache in st_indirect_draw_vbo

Nanley Chery (1):
  anv/cmd_buffer: Disable CCS on BDW input attachments

Nicolai Hähnle (4):
  mesa: fix remaining xfb prims check for GLES with multiple instances
  mesa: extract need_xfb_remaining_prims_check
  mesa: move glMultiDrawArrays to vbo and fix error handling
  vbo: fix gl_DrawID handling in glMultiDrawArrays

Rob Clark (1):
  util/queue: don't hang at exit

Timothy Arceri (1):
  mesa: validate sampler type across the whole program

git tag: mesa-17.0.5

https://mesa.freedesktop.org/archive/mesa-17.0.5.tar.gz
MD5:  597ba33ab5a27d62a6ce9cb5c5418932  mesa-17.0.5.tar.gz
SHA1: 5f5d03151f54b6d7640a6403c2cd7adbf46224f0  mesa-17.0.5.tar.gz
SHA256: 7510eee0d0077860b250d30d73305048c2df4ba09ea8fc04e4f3eec7beece301  
mesa-17.0.5.tar.gz
SHA512: 
addd1902f4e51ab4596a6739c327b399eeca7b91eb4c945c68befa5126b05117ad22a3ef14f3af48105efc6cdfa6d894dfd35c3696658c7d340a1d3d6db605f1
  mesa-17.0.5.tar.gz
PGP:  https://mesa.freedesktop.org/archive/mesa-17.0.5.tar.gz.sig

https://mesa.freedesktop.org/archive/mesa-17.0.5.tar.xz
MD5:  5587b6b693260e3a3125f60fed6a625d  mesa-17.0.5.tar.xz
SHA1: 030fe080bb53377e44ece3f99b969af08e4a4ff8  mesa-17.0.5.tar.xz
SHA256: 668efa445d2f57a26e5c096b1965a685733a3b57d9c736f9d6460263847f9bfe  
mesa-17.0.5.tar.xz
SHA512: 
942fa62c9098bcd030856cd622696eae418f292addb912e1d558cf27d396f25c3f2000dae97a12d1ff233f1ea157497259442082005035bb27b9bafb2cfc33c3
  mesa-17.0.5.tar.xz
PGP:  

Re: [Mesa-dev] [PATCH v02 34/37] i965: Port gen4+ emit vertices code to genxml.

2017-04-28 Thread Kenneth Graunke
On Monday, April 24, 2017 3:19:29 PM PDT Rafael Antognolli wrote:
> Some code that was placed in brw_draw_upload.c and exported to be used
> by gen8+ was also moved to genX_state_upload, and the respective symbols
> are not exported anymore.
> 
> v2:
>- Remove code from brw_draw_upload too
>- Emit vertices for gen4-5 too.
>- Use helper to setup brw_address (Kristian)
>- Use macros for MOCS values.
>- Do not use #ifndef NDEBUG on code that is actually used (Ken)
> 
> Signed-off-by: Rafael Antognolli 

There's a lot of code here that isn't generation specific, and isn't
taking advantage of conditional-compilation.  Can we leave most of it
in brw_draw_upload.c?  Basically, just have the vertices atom and the
VERTEX_BUFFER_STATE function in genX_state_upload.c.

The VERTEX_BUFFER_STATE code looks good.

> +static void
> +genX(emit_vertices)(struct brw_context *brw)
> +{
> +   uint32_t *dw;
> +
> +   brw_prepare_vertices(brw);
> +   brw_prepare_shader_draw_parameters(brw);
> +
> +#if GEN_GEN < 8
> +   brw_emit_query_begin(brw);
> +#endif

This function is a no-op (early returns) on Gen6+ (brw->hw_ctx != NULL).
We should either make it GEN_GEN < 6 or just call it everywhere.
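
The first option (guarding the call with GEN_GEN < 6) could look like
the minimal sketch below. GEN_GEN is expected to be set at compile time
for each hardware generation; the default of 7 and all *_sketch names
are assumptions made here so the fragment compiles standalone, and the
counter only exists to make the effect observable.

```c
#include <assert.h>

#ifndef GEN_GEN
#define GEN_GEN 7   /* assumed default, only for this standalone sketch */
#endif

/* Counts how often the (hypothetical) query-begin path runs. */
int query_begin_calls_sketch;

void emit_query_begin_sketch(void)
{
   query_begin_calls_sketch++;
}

void emit_vertices_sketch(void)
{
#if GEN_GEN < 6
   /* Compiled in only for generations where the call does real work. */
   emit_query_begin_sketch();
#endif
   /* ... the rest of the vertex emission would follow here ... */
}
```

On builds where GEN_GEN is 6 or higher, the call disappears at compile
time, so there is no run-time check to early-return from.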

> +
> +   const struct brw_vs_prog_data *vs_prog_data =
> +  brw_vs_prog_data(brw->vs.base.prog_data);
> +
> +#if GEN_GEN >= 8
> +   struct gl_context *ctx = &brw->ctx;
> +   bool uses_edge_flag = (ctx->Polygon.FrontMode != GL_FILL ||
> +  ctx->Polygon.BackMode != GL_FILL);
> +
> +   if (vs_prog_data->uses_vertexid || vs_prog_data->uses_instanceid) {
> +  unsigned vue = brw->vb.nr_enabled;
> +
> +  /* The element for the edge flags must always be last, so we have to
> +   * insert the SGVS before it in that case.
> +   */
> +  if (uses_edge_flag) {
> + assert(vue > 0);
> + vue--;
> +  }
> +
> +  WARN_ONCE(vue >= 33,
> +"Trying to insert VID/IID past 33rd vertex element, "
> +"need to reorder the vertex attributes.");
> +
> +  brw_batch_emit(brw, GENX(3DSTATE_VF_SGVS), vfs) {
> + if (vs_prog_data->uses_vertexid) {
> +vfs.VertexIDEnable = true;
> +vfs.VertexIDComponentNumber = 2;
> +vfs.VertexIDElementOffset = vue;
> + }
> +
> + if (vs_prog_data->uses_instanceid) {
> +vfs.InstanceIDEnable = true;
> +vfs.InstanceIDComponentNumber = 3;
> +vfs.InstanceIDElementOffset = vue;
> + }
> +  }
> +
> +  brw_batch_emit(brw, GENX(3DSTATE_VF_INSTANCING), vfi) {
> + vfi.InstancingEnable = true;
> + vfi.VertexElementIndex = vue;
> +  }
> +   } else {
> +  brw_batch_emit(brw, GENX(3DSTATE_VF_SGVS), vfs);
> +   }
> +
> +   /* Normally we don't need an element for the SGVS attribute because the
> +* 3DSTATE_VF_SGVS instruction lets you store the generated attribute in an
> +* element that is past the list in 3DSTATE_VERTEX_ELEMENTS. However if
> +* we're using draw parameters then we need an element for the those
> +* values.  Additionally if there is an edge flag element then the SGVS
> +* can't be inserted past that so we need a dummy element to ensure that
> +* the edge flag is the last one.
> +*/
> +   const bool needs_sgvs_element = (vs_prog_data->uses_basevertex ||
> +vs_prog_data->uses_baseinstance ||
> +((vs_prog_data->uses_instanceid ||
> +  vs_prog_data->uses_vertexid)
> + && uses_edge_flag));
> +#else
> +   const bool needs_sgvs_element = (vs_prog_data->uses_basevertex ||
> +vs_prog_data->uses_baseinstance ||
> +vs_prog_data->uses_instanceid ||
> +vs_prog_data->uses_vertexid);
> +#endif
> +   unsigned nr_elements =
> +  brw->vb.nr_enabled + needs_sgvs_element + vs_prog_data->uses_drawid;
> +
> +#if GEN_GEN < 8
> +   /* If any of the formats of vb.enabled needs more that one upload, we need
> +* to add it to nr_elements */

  */ goes on its own line.

> +   for (unsigned i = 0; i < brw->vb.nr_enabled; i++) {
> +  struct brw_vertex_element *input = brw->vb.enabled[i];
> +  uint32_t format = brw_get_vertex_surface_type(brw, input->glarray);
> +
> +  if (genX(uploads_needed(format)) > 1)
> + nr_elements++;
> +   }
> +#endif
> +
> +   /* If the VS doesn't read any inputs (calculating vertex position from
> +* a state variable for some reason, for example), emit a single pad
> +* VERTEX_ELEMENT struct and bail.
> +*
> +* The stale VB state stays in place, but they don't do anything unless
> +* a VE loads from them.
> +*/
> +   if (nr_elements == 0) {
> +  dw = brw_batch_emitn(brw, GENX(3DSTATE_VERTEX_ELEMENTS), 3 + 
> 

Re: [Mesa-dev] [PATCH] fix minor error in YUV2RGB matrix used in shader

2017-04-28 Thread Eric Anholt
Johnson Lin  writes:

> The matrix used for YCbCr to RGB conversion is listed on Wikipedia:
> https://en.wikipedia.org/wiki/YCbCr
> There is a minor error in the matrix constants: 0.0625=16/256 should be
> 16.0/255, and 0.5=128.0/256 should be 128.0/255.
> Note that converting a 0-255 byte value to a 0-1.0 float means dividing by
> 255, not 256. That is how we get 255=1.0f.
> With the constants changed, the CSC result is bit-exact with the
> Wikipedia conversion result and the FFmpeg result.
> Otherwise, in some situations, there will be a one-bit difference.
>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100854
> ---
>  src/compiler/nir/nir_lower_tex.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/src/compiler/nir/nir_lower_tex.c 
> b/src/compiler/nir/nir_lower_tex.c
> index 352d1499bc8d..385739a56a71 100644
> --- a/src/compiler/nir/nir_lower_tex.c
> +++ b/src/compiler/nir/nir_lower_tex.c
> @@ -244,9 +244,9 @@ convert_yuv_to_rgb(nir_builder *b, nir_tex_instr *tex,
> nir_ssa_def *yuv =
>nir_vec4(b,
> nir_fmul(b, nir_imm_float(b, 1.16438356f),
> -nir_fadd(b, y, nir_imm_float(b, -0.0625f))),
> -   nir_channel(b, nir_fadd(b, u, nir_imm_float(b, -0.5f)), 0),
> -   nir_channel(b, nir_fadd(b, v, nir_imm_float(b, -0.5f)), 0),
> +nir_fadd(b, y, nir_imm_float(b, -0.0627451f))),
> +   nir_channel(b, nir_fadd(b, u, nir_imm_float(b, -0.50196078431f)), 0),
> +   nir_channel(b, nir_fadd(b, v, nir_imm_float(b, -0.50196078431f)), 0),
> nir_imm_float(b, 0.0));

Could we use 16.0/255.0 and 128.0/255.0, instead of magic-looking
numbers?  With that, it will be:

Reviewed-by: Eric Anholt 
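
As a rough illustration of the suggestion, the offsets can be spelled as divisions so that their origin is visible. This is only a sketch: the macro and function names are invented here, and the real change would pass the expressions to nir_imm_float() instead.

```c
/* Offsets written as divisions instead of pre-computed decimals. */
#define Y_OFFSET_OLD  (16.0f / 256.0f)    /* 0.0625, the value being replaced */
#define Y_OFFSET_NEW  (16.0f / 255.0f)    /* roughly 0.0627451 */
#define C_OFFSET_NEW  (128.0f / 255.0f)   /* roughly 0.5019608 */

/* Normalize a byte the way an 8-bit UNORM fetch does: 255 maps to 1.0. */
static float unorm8_to_float(unsigned char v)
{
   return (float)v / 255.0f;
}

/* The scaled luma term from the quoted patch, with a selectable offset. */
static float luma_term(unsigned char y, float offset)
{
   return 1.16438356f * (unorm8_to_float(y) - offset);
}
```

With the 255-based offset, a luma byte of 16 maps to zero, whereas the 256-based offset leaves a small positive residue; that residue is the kind of error that can show up as a one-bit difference after conversion.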


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 10/12] Android: default to building all drivers

2017-04-28 Thread Mauro Rossi
2017-04-28 21:38 GMT+02:00 Rob Herring :

> On Fri, Apr 28, 2017 at 10:58 AM, Mauro Rossi 
> wrote:
> > 2017-04-28 14:23 GMT+02:00 Rob Herring :
> >> On Thu, Apr 27, 2017 at 9:50 PM, Chih-Wei Huang <
> cwhu...@android-x86.org> wrote:
> >>> A typo in the subject?
> >>> (s/building/build/)
> >>
> >> It's a bit misleading as originally I wrote it such that a blank
> >> BOARD_GPU_DRIVERS would enable all drivers, then changed it to "all".
> >> So it's not really a default anymore.
> >>
> >>> 2017-04-28 3:43 GMT+08:00 Rob Herring :
> >>>> If BOARD_GPU_DRIVERS is empty, build all the drivers. This doesn't
> >>>> enable building mesa as that is controlled by including libGLES_mesa in
> >>>> the product.
> >>>>
> >>>> Signed-off-by: Rob Herring
> >>>> ---
> >>>>  Android.mk | 8
> >>>>  1 file changed, 8 insertions(+)
> >>>>
> >>>> diff --git a/Android.mk b/Android.mk
> >>>> index 9f481ee7e109..76858c1616bc 100644
> >>>> --- a/Android.mk
> >>>> +++ b/Android.mk
> >>>> @@ -1,3 +1,4 @@
> >>>> +
> >>>>  # Mesa 3-D graphics library
> >>>>  #
> >>>>  # Copyright (C) 2010-2011 Chia-I Wu
> >>>> @@ -53,8 +54,15 @@ gallium_drivers := \
> >>>>     vc4.HAVE_GALLIUM_VC4 \
> >>>>     virgl.HAVE_GALLIUM_VIRGL
> >>>>
> >>>> +$(warning $(BOARD_GPU_DRIVERS))
> >>>> +
> >>>> +ifeq ($(BOARD_GPU_DRIVERS),all)
> >>>> +MESA_BUILD_CLASSIC := $(filter HAVE_%, $(subst ., , $(classic_drivers)))
> >>>> +MESA_BUILD_GALLIUM := $(filter HAVE_%, $(subst ., , $(gallium_drivers)))
> >>>> +else
> >>>>  MESA_BUILD_CLASSIC := $(strip $(foreach d, $(BOARD_GPU_DRIVERS), $(patsubst $(d).%,%, $(filter $(d).%, $(classic_drivers)))))
> >>>>  MESA_BUILD_GALLIUM := $(strip $(foreach d, $(BOARD_GPU_DRIVERS), $(patsubst $(d).%,%, $(filter $(d).%, $(gallium_drivers)))))
> >>>> +endif
> >>>>  $(foreach d, $(MESA_BUILD_CLASSIC) $(MESA_BUILD_GALLIUM), $(eval $(d) := true))
> >>>>
> >>>>  # host and target must be the same arch to generate matypes.h
> >>>> --
> >>>
> >>> Aren't some drivers for arm or x86 only?
> >>
> >> In practice, yes. But they should build on all architectures so folks
> >> can easily build test. If the arm only ones required an arm compiler
> >> for example, then none of the x86 folks would build them and check
> >> that their changes don't break any drivers.
> >>
> >>> Is it really possible to build all drivers?
> >>
> >> Yes. That is what my CI job does.
> >>
> >> Rob
> >
> > Is it possible to leave a $(warning   ) in case someone specifies a
> > new/wrong driver in BOARD_GPU_DRIVERS?
>
> Okay. It wasn't immediately
>
> > In case I wanted to use mesa 17.2 to build swrast only, considering
> > that radeonsi/radeon require LLVM 3.8.0, how can I do that if mesa
> > builds everything?
>
> BOARD_GPU_DRIVERS=swrast
>
> > Why should we completely loose the possibility to control the list of
> > drivers in android, which is basically a one driver case (swrast) for
> > some android-x86 boards or when we could live with intel drivers like
> > in an Atom tablet?
>
> You haven't. The existing way to list drivers is unchanged externally.
>
> Rob
>

Thanks, and sorry, I had completely misunderstood the commit message.
Mauro


[Mesa-dev] [Bug 100876] Variable GALLIUM_HUD_DUMP_DIR is not working with Wine LFS

2017-04-28 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=100876

--- Comment #1 from Edmondo Tommasina  ---
There is a patch in git master that may help:

https://cgit.freedesktop.org/mesa/mesa/commit/?id=5589fd89e1337a03c947840b344f515cb1d3a96d

Just to be sure: are you seeing empty fps and cpu files in /media/ramdisk/, or
do the files not get created at all?



Re: [Mesa-dev] [RFC 1/2] glx|egl: allow to test if glthread is safe enough on X11 platform

2017-04-28 Thread Matt Turner
On Fri, Apr 28, 2017 at 2:11 PM, Gregory Hainaut
 wrote:
> I extended the struct __DRIbackgroundCallableExtensionRec because
> its other function pointer is already related to glthread.
>
> The DRI2/DRI3 GLX code paths check that the display can be locked
> (basically, that XInitThreads was called).
>
> The EGL code path is trickier, as we don't want to pull in X11 headers.
> Instead the code assumes that it is safe if X11 isn't used or there is no
> display (i.e. 100% XCB).
>
> The new function will be used in the next commit.
>
> Signed-off-by: Gregory Hainaut 
> ---
>  include/GL/internal/dri_interface.h |  9 +
>  src/egl/drivers/dri2/egl_dri2.c | 30 ++
>  src/glx/dri2_glx.c  |  9 +
>  src/glx/dri3_glx.c  |  8 
>  4 files changed, 56 insertions(+)
>
> diff --git a/include/GL/internal/dri_interface.h 
> b/include/GL/internal/dri_interface.h
> index 86efd1bdc9..28a52ccdb9 100644
> --- a/include/GL/internal/dri_interface.h
> +++ b/include/GL/internal/dri_interface.h
> @@ -1713,13 +1713,22 @@ struct __DRIbackgroundCallableExtensionRec {
>  * non-background thread (i.e. a thread that has already been bound to a
>  * context using __DRIcoreExtensionRec::bindContext()); when this happens,
>  * the \c loaderPrivate pointer must be equal to the pointer that was
>  * passed to the driver when the currently bound context was created.
>  *
>  * This call should execute quickly enough that the driver can call it 
> with
>  * impunity whenever a background thread starts performing drawing
>  * operations (e.g. it should just set a thread-local variable).
>  */
> void (*setBackgroundContext)(void *loaderPrivate);
> +   /**
> +* Indicate that it is multithread safe to use glthread. Typically
> +* XInitThread was called in GLX setup.
> +*
> +* \param loaderPrivate is the value that was passed to to the driver when
> +* the context was created.  This can be used by the loader to identify
> +* which context any callbacks are associated with.
> +*/
> +   GLboolean (*isGlThreadSafe)(void *loaderPrivate);
>  };
>
>  #endif
> diff --git a/src/egl/drivers/dri2/egl_dri2.c b/src/egl/drivers/dri2/egl_dri2.c
> index 2cab7d00c1..df2db97bcf 100644
> --- a/src/egl/drivers/dri2/egl_dri2.c
> +++ b/src/egl/drivers/dri2/egl_dri2.c
> @@ -85,24 +85,54 @@
>
>  static void
>  dri_set_background_context(void *loaderPrivate)
>  {
> _EGLContext *ctx = _eglGetCurrentContext();
> _EGLThreadInfo *t = _eglGetCurrentThread();
>
> _eglBindContextToThread(ctx, t);
>  }
>
> +static GLboolean
> +dri_is_glthread_safe(void *loaderPrivate)
> +{
> +#ifdef HAVE_X11_PLATFORM
> +   struct dri2_egl_surface *dri2_surf = loaderPrivate;
> +   _EGLDisplay *display =  dri2_surf->base.Resource.Display;
> +   Display *dpy;
> +
> +   // Only the libX11 isn't safe
> +   if (display->Platform != _EGL_PLATFORM_X11)
> +  return true;
> +
> +   // Will use pure XCB so no libX11 here either
> +   if (display->PlatformDisplay == NULL)
> +  return true;
> +
> +   // In an ideal world we would check the X11 lock pointer
> +   // (display->PlatformDisplay->lock_fns). Unfortunately it
> +   // requires to know the full type. And we don't want to bring X11
> +   // headers here.
> +   //
> +   // So let's assume an unsafe behavior. Modern EGL code shouldn't use
> +   // libX11 anyway.

Don't use C++ comments.
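
For reference, the quoted checks could look roughly like this with C-style block comments. This is a sketch only: the enum, struct and function names are simplified stand-ins rather than the EGL driver's real types, and the final `return false` is an assumption, since the quoted hunk is cut before the function ends.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the EGL display type and its platform enum. */
enum platform_sketch { PLATFORM_X11_SKETCH, PLATFORM_OTHER_SKETCH };

struct display_sketch {
   enum platform_sketch platform;
   void *platform_display;   /* native display handle, NULL when pure XCB is used */
};

static bool is_glthread_safe_sketch(const struct display_sketch *display)
{
   /* Only libX11 is unsafe. */
   if (display->platform != PLATFORM_X11_SKETCH)
      return true;

   /* Pure XCB will be used, so libX11 is not involved here either. */
   if (display->platform_display == NULL)
      return true;

   /* The Xlib lock state cannot be inspected without the X11 headers,
    * so assume the unsafe case (assumed ending, not taken from the patch).
    */
   return false;
}
```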


Re: [Mesa-dev] [PATCH 2/2] nir: add pass to lower atomic counters to SSBO

2017-04-28 Thread Jason Ekstrand
Acked-by: Jason Ekstrand 

On Mon, Apr 24, 2017 at 8:28 AM, Rob Clark  wrote:

> This is equivalent to what mesa/st does in glsl_to_tgsi.  For most hw
> there isn't a particularly good reason to treat these differently.
>
> Signed-off-by: Rob Clark 
> ---
> v2: do the interface_type thing properly
>
>  src/compiler/Makefile.sources|   1 +
>  src/compiler/nir/nir.h   |   1 +
>  src/compiler/nir/nir_lower_atomics_to_ssbo.c | 222
> +++
>  3 files changed, 224 insertions(+)
>  create mode 100644 src/compiler/nir/nir_lower_atomics_to_ssbo.c
>
> diff --git a/src/compiler/Makefile.sources b/src/compiler/Makefile.sources
> index 2455d4e..b2a3a42 100644
> --- a/src/compiler/Makefile.sources
> +++ b/src/compiler/Makefile.sources
> @@ -208,6 +208,7 @@ NIR_FILES = \
> nir/nir_lower_64bit_packing.c \
> nir/nir_lower_alu_to_scalar.c \
> nir/nir_lower_atomics.c \
> +   nir/nir_lower_atomics_to_ssbo.c \
> nir/nir_lower_bitmap.c \
> nir/nir_lower_clamp_color_outputs.c \
> nir/nir_lower_clip.c \
> diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
> index ce5b434..be35930 100644
> --- a/src/compiler/nir/nir.h
> +++ b/src/compiler/nir/nir.h
> @@ -2546,6 +2546,7 @@ void nir_lower_bitmap(nir_shader *shader, const
> nir_lower_bitmap_options *option
>
>  bool nir_lower_atomics(nir_shader *shader,
> const struct gl_shader_program *shader_program);
> +bool nir_lower_atomics_to_ssbo(nir_shader *shader, unsigned ssbo_offset);
>  bool nir_lower_to_source_mods(nir_shader *shader);
>
>  bool nir_lower_gs_intrinsics(nir_shader *shader);
> diff --git a/src/compiler/nir/nir_lower_atomics_to_ssbo.c
> b/src/compiler/nir/nir_lower_atomics_to_ssbo.c
> new file mode 100644
> index 000..2c04485
> --- /dev/null
> +++ b/src/compiler/nir/nir_lower_atomics_to_ssbo.c
> @@ -0,0 +1,222 @@
> +/*
> + * Copyright © 2017 Red Hat
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the
> "Software"),
> + * to deal in the Software without restriction, including without
> limitation
> + * the rights to use, copy, modify, merge, publish, distribute,
> sublicense,
> + * and/or sell copies of the Software, and to permit persons to whom the
> + * Software is furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice (including the
> next
> + * paragraph) shall be included in all copies or substantial portions of
> the
> + * Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
> EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
> MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT
> SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR
> OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
> DEALINGS
> + * IN THE SOFTWARE.
> + *
> + * Authors:
> + *Rob Clark 
> + */
> +
> +#include "nir.h"
> +#include "nir_builder.h"
> +
> +/*
> + * Remap atomic counters to SSBOs.  Atomic counters get remapped to
> + * SSBO binding points [0..ssbo_offset) and the original SSBOs are
> + * remapped to [ssbo_offset..n) (mostly to align with what mesa/st
> + * does.
> + */
> +
> +static bool
> +lower_instr(nir_intrinsic_instr *instr, unsigned ssbo_offset,
> nir_builder *b)
> +{
> +   nir_intrinsic_op op;
> +   switch (instr->intrinsic) {
> +   case nir_intrinsic_ssbo_atomic_add:
> +   case nir_intrinsic_ssbo_atomic_imin:
> +   case nir_intrinsic_ssbo_atomic_umin:
> +   case nir_intrinsic_ssbo_atomic_imax:
> +   case nir_intrinsic_ssbo_atomic_umax:
> +   case nir_intrinsic_ssbo_atomic_and:
> +   case nir_intrinsic_ssbo_atomic_or:
> +   case nir_intrinsic_ssbo_atomic_xor:
> +   case nir_intrinsic_ssbo_atomic_exchange:
> +   case nir_intrinsic_ssbo_atomic_comp_swap:
> +   case nir_intrinsic_store_ssbo:
> +   case nir_intrinsic_load_ssbo:
> +  /* keep same opcode, remap buffer_index */
> +  op = instr->intrinsic;
> +  break;
> +   case nir_intrinsic_atomic_counter_inc:
> +   case nir_intrinsic_atomic_counter_add:
> +   case nir_intrinsic_atomic_counter_dec:
> +  /* inc and dec get remapped to add: */
> +  op = nir_intrinsic_ssbo_atomic_add;
> +  break;
> +   case nir_intrinsic_atomic_counter_read:
> +  op = nir_intrinsic_load_ssbo;
> +  break;
> +   case nir_intrinsic_atomic_counter_min:
> +  op = nir_intrinsic_ssbo_atomic_umin;
> +  break;
> +   case nir_intrinsic_atomic_counter_max:
> +  op = nir_intrinsic_ssbo_atomic_umax;
> +  break;
> +   case nir_intrinsic_atomic_counter_and:
> +  op = 
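
The quoted patch is cut off at this point. As a rough illustration of the remapping its header comment describes, the index shift and the inc/dec-to-add mapping might be sketched as below. Every name here is hypothetical, and the operand values for inc and dec are an assumption drawn from the "inc and dec get remapped to add" comment, not from the missing part of the patch.

```c
#include <stdint.h>

/* Hypothetical miniature of the three counter intrinsics that map to add. */
enum counter_op_sketch { COUNTER_INC, COUNTER_DEC, COUNTER_ADD };

/* Existing SSBOs move up by ssbo_offset so that the atomic counter
 * buffers can occupy binding points [0..ssbo_offset).
 */
static unsigned remap_ssbo_index(unsigned original_index, unsigned ssbo_offset)
{
   return original_index + ssbo_offset;
}

/* inc and dec become an atomic add with a constant operand (assumed to be
 * +1 and -1); a plain add keeps the operand it was given.
 */
static int32_t add_operand(enum counter_op_sketch op, int32_t operand)
{
   switch (op) {
   case COUNTER_INC: return 1;
   case COUNTER_DEC: return -1;
   case COUNTER_ADD: return operand;
   }
   return 0;
}
```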

[Mesa-dev] [PATCH 14/14] radeonsi/gfx9: allow the scratch buffer in HS and GS

2017-04-28 Thread Marek Olšák
From: Marek Olšák 

It works now.
---
 src/gallium/drivers/radeonsi/si_state_shaders.c | 10 --
 1 file changed, 10 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_state_shaders.c 
b/src/gallium/drivers/radeonsi/si_state_shaders.c
index a3ff511..a5260f5 100644
--- a/src/gallium/drivers/radeonsi/si_state_shaders.c
+++ b/src/gallium/drivers/radeonsi/si_state_shaders.c
@@ -491,25 +491,20 @@ static void si_shader_hs(struct si_screen *sscreen, 
struct si_shader *shader)
if (sscreen->b.chip_class >= GFX9) {
si_pm4_set_reg(pm4, R_00B410_SPI_SHADER_PGM_LO_LS, va >> 8);
si_pm4_set_reg(pm4, R_00B414_SPI_SHADER_PGM_HI_LS, va >> 40);
 
/* We need at least 2 components for LS.
 * VGPR0-3: (VertexID, RelAutoindex, InstanceID / StepRate0, 
InstanceID).
 * StepRate0 is set to 1. so that VGPR3 doesn't have to be 
loaded.
 */
ls_vgpr_comp_cnt = shader->info.uses_instanceid ? 2 : 1;
 
-   if (shader->config.scratch_bytes_per_wave) {
-   fprintf(stderr, "HS: scratch buffer unsupported");
-   abort();
-   }
-
shader->config.rsrc2 =
S_00B42C_USER_SGPR(GFX9_TCS_NUM_USER_SGPR) |
S_00B42C_USER_SGPR_MSB(GFX9_TCS_NUM_USER_SGPR >> 5) |

S_00B42C_SCRATCH_EN(shader->config.scratch_bytes_per_wave > 0);
} else {
si_pm4_set_reg(pm4, R_00B420_SPI_SHADER_PGM_LO_HS, va >> 8);
si_pm4_set_reg(pm4, R_00B424_SPI_SHADER_PGM_HI_HS, va >> 40);
 
shader->config.rsrc2 =
S_00B42C_USER_SGPR(GFX6_TCS_NUM_USER_SGPR) |
@@ -809,25 +804,20 @@ static void si_shader_gs(struct si_screen *sscreen, 
struct si_shader *shader)
si_pm4_set_reg(pm4, R_028A94_VGT_GS_MAX_PRIMS_PER_SUBGROUP,
   
S_028A94_MAX_PRIMS_PER_SUBGROUP(gs_info.max_prims_per_subgroup));
si_pm4_set_reg(pm4, R_028AAC_VGT_ESGS_RING_ITEMSIZE,
   shader->key.part.gs.es->esgs_itemsize / 4);
 
if (es_type == PIPE_SHADER_TESS_EVAL)
si_set_tesseval_regs(sscreen, shader->key.part.gs.es, 
pm4);
 
polaris_set_vgt_vertex_reuse(sscreen, shader->key.part.gs.es,
 NULL, pm4);
-
-   if (shader->config.scratch_bytes_per_wave) {
-   fprintf(stderr, "GS: scratch buffer unsupported");
-   abort();
-   }
} else {
si_pm4_set_reg(pm4, R_00B220_SPI_SHADER_PGM_LO_GS, va >> 8);
si_pm4_set_reg(pm4, R_00B224_SPI_SHADER_PGM_HI_GS, va >> 40);
 
si_pm4_set_reg(pm4, R_00B228_SPI_SHADER_PGM_RSRC1_GS,
   S_00B228_VGPRS((shader->config.num_vgprs - 1) / 
4) |
   S_00B228_SGPRS((shader->config.num_sgprs - 1) / 
8) |
   S_00B228_DX10_CLAMP(1) |
   S_00B228_FLOAT_MODE(shader->config.float_mode));
si_pm4_set_reg(pm4, R_00B22C_SPI_SHADER_PGM_RSRC2_GS,
-- 
2.7.4



[Mesa-dev] [PATCH 13/14] radeonsi: prevent race conditions when doing scratch patching

2017-04-28 Thread Marek Olšák
From: Marek Olšák 

---
 src/gallium/drivers/radeonsi/si_state_shaders.c | 32 +++--
 1 file changed, 30 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_state_shaders.c 
b/src/gallium/drivers/radeonsi/si_state_shaders.c
index 16fb522..a3ff511 100644
--- a/src/gallium/drivers/radeonsi/si_state_shaders.c
+++ b/src/gallium/drivers/radeonsi/si_state_shaders.c
@@ -2570,60 +2570,88 @@ static bool si_update_gs_ring_buffers(struct si_context 
*sctx)
}
if (sctx->gsvs_ring) {
si_set_ring_buffer(&sctx->b.b, SI_RING_GSVS,
   sctx->gsvs_ring, 0, sctx->gsvs_ring->width0,
   false, false, 0, 0, 0);
}
 
return true;
 }
 
+static void si_shader_lock(struct si_shader *shader)
+{
+   mtx_lock(&shader->selector->mutex);
+   if (shader->previous_stage_sel) {
+   assert(shader->previous_stage_sel != shader->selector);
+   mtx_lock(&shader->previous_stage_sel->mutex);
+   }
+}
+
+static void si_shader_unlock(struct si_shader *shader)
+{
+   if (shader->previous_stage_sel)
+   mtx_unlock(&shader->previous_stage_sel->mutex);
+   mtx_unlock(&shader->selector->mutex);
+}
+
 /**
  * @returns 1 if \p sel has been updated to use a new scratch buffer
  *  0 if not
  *  < 0 if there was a failure
  */
 static int si_update_scratch_buffer(struct si_context *sctx,
struct si_shader *shader)
 {
uint64_t scratch_va = sctx->scratch_buffer->gpu_address;
int r;
 
if (!shader)
return 0;
 
/* This shader doesn't need a scratch buffer */
if (shader->config.scratch_bytes_per_wave == 0)
return 0;
 
+   /* Prevent race conditions when updating:
+* - si_shader::scratch_bo
+* - si_shader::binary::code
+* - si_shader::previous_stage::binary::code.
+*/
+   si_shader_lock(shader);
+
/* This shader is already configured to use the current
 * scratch buffer. */
-   if (shader->scratch_bo == sctx->scratch_buffer)
+   if (shader->scratch_bo == sctx->scratch_buffer) {
+   si_shader_unlock(shader);
return 0;
+   }
 
assert(sctx->scratch_buffer);
 
if (shader->previous_stage)
si_shader_apply_scratch_relocs(shader->previous_stage, 
scratch_va);
 
si_shader_apply_scratch_relocs(shader, scratch_va);
 
/* Replace the shader bo with a new bo that has the relocs applied. */
r = si_shader_binary_upload(sctx->screen, shader);
-   if (r)
+   if (r) {
+   si_shader_unlock(shader);
return r;
+   }
 
/* Update the shader state to use the new shader bo. */
si_shader_init_pm4_state(sctx->screen, shader);
 
r600_resource_reference(&shader->scratch_bo, sctx->scratch_buffer);
 
+   si_shader_unlock(shader);
return 1;
 }
 
 static unsigned si_get_current_scratch_buffer_size(struct si_context *sctx)
 {
return sctx->scratch_buffer ? sctx->scratch_buffer->b.b.width0 : 0;
 }
 
 static unsigned si_get_scratch_buffer_bytes_per_wave(struct si_shader *shader)
 {
-- 
2.7.4
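
The locking pattern in this patch can be sketched in isolation as follows. This is a minimal illustration under stated assumptions: it uses POSIX mutexes rather than the mtx_* wrappers, the structs are reduced stand-ins for the radeonsi ones, and the assignment stands in for the relocation and upload work.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical, reduced versions of the selector and shader structs. */
struct selector_sketch {
   pthread_mutex_t mutex;
};

struct shader_sketch {
   struct selector_sketch *selector;
   struct selector_sketch *previous_stage_sel;   /* may be NULL */
   const void *scratch_bo;
};

/* Take the shader's own selector lock first, then the previous stage's. */
static void shader_lock(struct shader_sketch *s)
{
   pthread_mutex_lock(&s->selector->mutex);
   if (s->previous_stage_sel)
      pthread_mutex_lock(&s->previous_stage_sel->mutex);
}

/* Release in the reverse order. */
static void shader_unlock(struct shader_sketch *s)
{
   if (s->previous_stage_sel)
      pthread_mutex_unlock(&s->previous_stage_sel->mutex);
   pthread_mutex_unlock(&s->selector->mutex);
}

/* Returns 1 if the shader was switched to new_bo, 0 if it already used it. */
static int update_scratch_sketch(struct shader_sketch *s, const void *new_bo)
{
   int updated = 0;

   shader_lock(s);
   if (s->scratch_bo != new_bo) {
      s->scratch_bo = new_bo;   /* stands in for relocs + binary upload */
      updated = 1;
   }
   shader_unlock(s);
   return updated;
}
```

The sketch uses a single exit so that the unlock appears once; the patch instead unlocks before each early return, which has the same effect as long as every return path is covered.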



[Mesa-dev] [PATCH 07/14] radeonsi: don't use util_memcpy_cpu_to_le32 for shader uploads

2017-04-28 Thread Marek Olšák
From: Marek Olšák 

at least I think this is correct.
---
 src/gallium/drivers/radeonsi/si_shader.c | 15 ---
 1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c 
b/src/gallium/drivers/radeonsi/si_shader.c
index 06ad370..8bdde1a 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -6545,41 +6545,42 @@ int si_shader_binary_upload(struct si_screen *sscreen, 
struct si_shader *shader)
PIPE_USAGE_IMMUTABLE,
align(bo_size, SI_CPDMA_ALIGNMENT));
if (!shader->bo)
return -ENOMEM;
 
/* Upload. */
ptr = sscreen->b.ws->buffer_map(shader->bo->buf, NULL,
PIPE_TRANSFER_READ_WRITE |
PIPE_TRANSFER_UNSYNCHRONIZED);
 
+   /* Don't use util_memcpy_cpu_to_le32. LLVM binaries are
+* endian-independent. */
if (prolog) {
-   util_memcpy_cpu_to_le32(ptr, prolog->code, prolog->code_size);
+   memcpy(ptr, prolog->code, prolog->code_size);
ptr += prolog->code_size;
}
if (previous_stage) {
-   util_memcpy_cpu_to_le32(ptr, previous_stage->code,
-   previous_stage->code_size);
+   memcpy(ptr, previous_stage->code, previous_stage->code_size);
ptr += previous_stage->code_size;
}
if (prolog2) {
-   util_memcpy_cpu_to_le32(ptr, prolog2->code, prolog2->code_size);
+   memcpy(ptr, prolog2->code, prolog2->code_size);
ptr += prolog2->code_size;
}
 
-   util_memcpy_cpu_to_le32(ptr, mainb->code, mainb->code_size);
+   memcpy(ptr, mainb->code, mainb->code_size);
ptr += mainb->code_size;
 
if (epilog)
-   util_memcpy_cpu_to_le32(ptr, epilog->code, epilog->code_size);
+   memcpy(ptr, epilog->code, epilog->code_size);
else if (mainb->rodata_size > 0)
-   util_memcpy_cpu_to_le32(ptr, mainb->rodata, mainb->rodata_size);
+   memcpy(ptr, mainb->rodata, mainb->rodata_size);
 
sscreen->b.ws->buffer_unmap(shader->bo->buf);
return 0;
 }
 
 static void si_shader_dump_disassembly(const struct ac_shader_binary *binary,
   struct pipe_debug_callback *debug,
   const char *name, FILE *file)
 {
char *line, *p;
-- 
2.7.4
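
To illustrate why the choice of copy routine matters, here is a small sketch. The word-wise helper is a hypothetical stand-in written for this example, not Mesa's util_memcpy_cpu_to_le32; it only shows the general idea of a "CPU to little-endian 32-bit" copy next to a plain byte copy.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical word-wise copy: reads host-order 32-bit words and writes
 * each one out in little-endian byte order.
 */
static void copy_cpu_to_le32_sketch(uint8_t *dst, const uint32_t *src,
                                    size_t bytes)
{
   for (size_t i = 0; i < bytes / 4; i++) {
      dst[4 * i + 0] = (uint8_t)(src[i] >> 0);
      dst[4 * i + 1] = (uint8_t)(src[i] >> 8);
      dst[4 * i + 2] = (uint8_t)(src[i] >> 16);
      dst[4 * i + 3] = (uint8_t)(src[i] >> 24);
   }
}

/* A byte stream that is already in its final order needs only a plain copy. */
static void copy_bytes_sketch(uint8_t *dst, const uint8_t *src, size_t bytes)
{
   memcpy(dst, src, bytes);
}
```

On a little-endian host the two produce identical output. On a big-endian host the word-wise version would reverse the bytes inside each word of a buffer that is really a byte stream, which is presumably the problem the patch avoids, given its comment that the LLVM binaries are endian-independent.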



[Mesa-dev] [PATCH 08/14] radeonsi: inline si_llvm_shader_type into si_llvm_create_func

2017-04-28 Thread Marek Olšák
From: Marek Olšák 

---
 src/gallium/drivers/radeonsi/si_shader.c   |  1 -
 src/gallium/drivers/radeonsi/si_shader_internal.h  |  1 -
 .../drivers/radeonsi/si_shader_tgsi_setup.c| 53 +-
 3 files changed, 22 insertions(+), 33 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c 
b/src/gallium/drivers/radeonsi/si_shader.c
index 8bdde1a..fed8639 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -5676,21 +5676,20 @@ static const struct lp_build_tgsi_action interp_action 
= {
 static void si_create_function(struct si_shader_context *ctx,
   const char *name,
   LLVMTypeRef *returns, unsigned num_returns,
   LLVMTypeRef *params, unsigned num_params,
   int last_sgpr, unsigned max_workgroup_size)
 {
int i;
 
si_llvm_create_func(ctx, name, returns, num_returns,
params, num_params);
-   si_llvm_shader_type(ctx->main_fn, ctx->type);
ctx->return_value = LLVMGetUndef(ctx->return_type);
 
for (i = 0; i <= last_sgpr; ++i) {
LLVMValueRef P = LLVMGetParam(ctx->main_fn, i);
 
/* The combination of:
 * - ByVal
 * - dereferenceable
 * - invariant.load
 * allows the optimization passes to move loads and reduces
diff --git a/src/gallium/drivers/radeonsi/si_shader_internal.h 
b/src/gallium/drivers/radeonsi/si_shader_internal.h
index b54db20..35315ca 100644
--- a/src/gallium/drivers/radeonsi/si_shader_internal.h
+++ b/src/gallium/drivers/radeonsi/si_shader_internal.h
@@ -233,21 +233,20 @@ struct si_shader_context {
LLVMValueRef shared_memory;
 };
 
 static inline struct si_shader_context *
 si_shader_context(struct lp_build_tgsi_context *bld_base)
 {
return (struct si_shader_context*)bld_base;
 }
 
 void si_llvm_add_attribute(LLVMValueRef F, const char *name, int value);
-void si_llvm_shader_type(LLVMValueRef F, unsigned type);
 
 LLVMTargetRef si_llvm_get_amdgpu_target(const char *triple);
 
 unsigned si_llvm_compile(LLVMModuleRef M, struct ac_shader_binary *binary,
 LLVMTargetMachineRef tm,
 struct pipe_debug_callback *debug);
 
 LLVMTypeRef tgsi2llvmtype(struct lp_build_tgsi_context *bld_base,
  enum tgsi_opcode_type type);
 
diff --git a/src/gallium/drivers/radeonsi/si_shader_tgsi_setup.c 
b/src/gallium/drivers/radeonsi/si_shader_tgsi_setup.c
index 2b0d600..de671ef 100644
--- a/src/gallium/drivers/radeonsi/si_shader_tgsi_setup.c
+++ b/src/gallium/drivers/radeonsi/si_shader_tgsi_setup.c
@@ -58,51 +58,20 @@ enum si_llvm_calling_convention {
 };
 
 void si_llvm_add_attribute(LLVMValueRef F, const char *name, int value)
 {
char str[16];
 
snprintf(str, sizeof(str), "%i", value);
LLVMAddTargetDependentFunctionAttr(F, name, str);
 }
 
-/**
- * Set the shader type we want to compile
- *
- * @param type shader type to set
- */
-void si_llvm_shader_type(LLVMValueRef F, unsigned type)
-{
-   enum si_llvm_calling_convention calling_conv;
-
-   switch (type) {
-   case PIPE_SHADER_VERTEX:
-   case PIPE_SHADER_TESS_CTRL:
-   case PIPE_SHADER_TESS_EVAL:
-   calling_conv = RADEON_LLVM_AMDGPU_VS;
-   break;
-   case PIPE_SHADER_GEOMETRY:
-   calling_conv = RADEON_LLVM_AMDGPU_GS;
-   break;
-   case PIPE_SHADER_FRAGMENT:
-   calling_conv = RADEON_LLVM_AMDGPU_PS;
-   break;
-   case PIPE_SHADER_COMPUTE:
-   calling_conv = RADEON_LLVM_AMDGPU_CS;
-   break;
-   default:
-   unreachable("Unhandle shader type");
-   }
-
-   LLVMSetFunctionCallConv(F, calling_conv);
-}
-
 static void init_amdgpu_target()
 {
gallivm_init_llvm_targets();
LLVMInitializeAMDGPUTargetInfo();
LLVMInitializeAMDGPUTarget();
LLVMInitializeAMDGPUTargetMC();
LLVMInitializeAMDGPUAsmPrinter();
 
/* For inline assembly. */
LLVMInitializeAMDGPUAsmParser();
@@ -1385,35 +1354,57 @@ void si_llvm_context_set_tgsi(struct si_shader_context 
*ctx,
ctx->bld_base.emit_fetch_funcs[TGSI_FILE_SYSTEM_VALUE] = 
fetch_system_value;
 }
 
 void si_llvm_create_func(struct si_shader_context *ctx,
 const char *name,
 LLVMTypeRef *return_types, unsigned num_return_elems,
 LLVMTypeRef *ParamTypes, unsigned ParamCount)
 {
LLVMTypeRef main_fn_type, ret_type;
LLVMBasicBlockRef main_fn_body;
+   enum si_llvm_calling_convention call_conv;
 
if (num_return_elems)
ret_type = LLVMStructTypeInContext(ctx->gallivm.context,
   

[Mesa-dev] [PATCH 04/14] radeonsi: don't call eliminate_const_vs_outputs in shaders without VS exports

2017-04-28 Thread Marek Olšák
From: Marek Olšák 

---
 src/gallium/drivers/radeonsi/si_shader.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c 
b/src/gallium/drivers/radeonsi/si_shader.c
index 427afd5..5ee8c6f 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -7207,24 +7207,24 @@ static void si_init_shader_ctx(struct si_shader_context 
*ctx,
bld_base->op_actions[TGSI_OPCODE_EMIT].emit = si_llvm_emit_vertex;
bld_base->op_actions[TGSI_OPCODE_ENDPRIM].emit = si_llvm_emit_primitive;
bld_base->op_actions[TGSI_OPCODE_BARRIER].emit = si_llvm_emit_barrier;
 }
 
 static void si_eliminate_const_vs_outputs(struct si_shader_context *ctx)
 {
struct si_shader *shader = ctx->shader;
struct tgsi_shader_info *info = &shader->selector->info;
 
-   if (ctx->type == PIPE_SHADER_FRAGMENT ||
-   ctx->type == PIPE_SHADER_COMPUTE ||
-   shader->key.as_es ||
-   shader->key.as_ls)
+   if ((ctx->type != PIPE_SHADER_VERTEX &&
+ctx->type != PIPE_SHADER_TESS_EVAL) ||
+   shader->key.as_ls ||
+   shader->key.as_es)
return;
 
ac_eliminate_const_vs_outputs(&ctx->ac,
  ctx->main_fn,
  shader->info.vs_output_param_offset,
  info->num_outputs,
  &shader->info.nr_param_exports);
 }
 
 static void si_count_scratch_private_memory(struct si_shader_context *ctx)
-- 
2.7.4
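
The effect of the changed condition can be checked with a small sketch. The enum and function names are invented for illustration; only the two boolean conditions are taken from the removed and added lines of the diff.

```c
#include <stdbool.h>

/* Hypothetical mirror of the shader stages referenced in the patch. */
enum stage_sketch {
   STAGE_VERTEX,
   STAGE_TESS_CTRL,
   STAGE_TESS_EVAL,
   STAGE_GEOMETRY,
   STAGE_FRAGMENT,
   STAGE_COMPUTE
};

/* Old check: skip only fragment, compute, ES and LS shaders. */
static bool old_runs_elimination(enum stage_sketch t, bool as_es, bool as_ls)
{
   return !(t == STAGE_FRAGMENT || t == STAGE_COMPUTE || as_es || as_ls);
}

/* New check: run only for VS and TES that are not compiled as LS or ES. */
static bool new_runs_elimination(enum stage_sketch t, bool as_es, bool as_ls)
{
   return !((t != STAGE_VERTEX && t != STAGE_TESS_EVAL) || as_ls || as_es);
}
```

The two predicates agree for vertex, tessellation evaluation, fragment and compute shaders. They differ for tessellation control and geometry shaders, which the old condition let through and the new one skips.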



[Mesa-dev] [PATCH 09/14] radeonsi: remove unused parameters from si_shader_apply_scratch_relocs

2017-04-28 Thread Marek Olšák
From: Marek Olšák 

---
 src/gallium/drivers/radeonsi/si_compute.c   | 2 +-
 src/gallium/drivers/radeonsi/si_shader.c| 6 ++
 src/gallium/drivers/radeonsi/si_shader.h| 6 ++
 src/gallium/drivers/radeonsi/si_state_shaders.c | 2 +-
 4 files changed, 6 insertions(+), 10 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_compute.c 
b/src/gallium/drivers/radeonsi/si_compute.c
index 63f1ac9..382e5a1 100644
--- a/src/gallium/drivers/radeonsi/si_compute.c
+++ b/src/gallium/drivers/radeonsi/si_compute.c
@@ -335,21 +335,21 @@ static bool si_setup_compute_scratch_buffer(struct 
si_context *sctx,
   PIPE_USAGE_DEFAULT,
   scratch_needed, 256);
 
if (!sctx->compute_scratch_buffer)
return false;
}
 
if (sctx->compute_scratch_buffer != shader->scratch_bo && 
scratch_needed) {
uint64_t scratch_va = sctx->compute_scratch_buffer->gpu_address;
 
-   si_shader_apply_scratch_relocs(sctx, shader, config, 
scratch_va);
+   si_shader_apply_scratch_relocs(shader, scratch_va);
 
if (si_shader_binary_upload(sctx->screen, shader))
return false;
 
r600_resource_reference(&shader->scratch_bo,
sctx->compute_scratch_buffer);
}
 
return true;
 }
diff --git a/src/gallium/drivers/radeonsi/si_shader.c 
b/src/gallium/drivers/radeonsi/si_shader.c
index fed8639..f1eee32 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -6462,24 +6462,22 @@ void si_shader_binary_read_config(struct 
ac_shader_binary *binary,
}
}
break;
}
}
 
if (!conf->spi_ps_input_addr)
conf->spi_ps_input_addr = conf->spi_ps_input_ena;
 }
 
-void si_shader_apply_scratch_relocs(struct si_context *sctx,
-   struct si_shader *shader,
-   struct si_shader_config *config,
-   uint64_t scratch_va)
+void si_shader_apply_scratch_relocs(struct si_shader *shader,
+   uint64_t scratch_va)
 {
unsigned i;
uint32_t scratch_rsrc_dword0 = scratch_va;
uint32_t scratch_rsrc_dword1 =
S_008F04_BASE_ADDRESS_HI(scratch_va >> 32);
 
/* Enable scratch coalescing. */
scratch_rsrc_dword1 |= S_008F04_SWIZZLE_ENABLE(1);
 
for (i = 0 ; i < shader->binary.reloc_count; i++) {
diff --git a/src/gallium/drivers/radeonsi/si_shader.h 
b/src/gallium/drivers/radeonsi/si_shader.h
index 4a2c042..61269bd 100644
--- a/src/gallium/drivers/radeonsi/si_shader.h
+++ b/src/gallium/drivers/radeonsi/si_shader.h
@@ -598,24 +598,22 @@ int si_shader_create(struct si_screen *sscreen, 
LLVMTargetMachineRef tm,
 struct pipe_debug_callback *debug);
 void si_shader_destroy(struct si_shader *shader);
 unsigned si_shader_io_get_unique_index(unsigned semantic_name, unsigned index);
 unsigned si_shader_io_get_unique_index2(unsigned name, unsigned index);
 int si_shader_binary_upload(struct si_screen *sscreen, struct si_shader 
*shader);
 void si_shader_dump(struct si_screen *sscreen, struct si_shader *shader,
struct pipe_debug_callback *debug, unsigned processor,
FILE *f, bool check_debug_option);
 void si_multiwave_lds_size_workaround(struct si_screen *sscreen,
  unsigned *lds_size);
-void si_shader_apply_scratch_relocs(struct si_context *sctx,
-   struct si_shader *shader,
-   struct si_shader_config *config,
-   uint64_t scratch_va);
+void si_shader_apply_scratch_relocs(struct si_shader *shader,
+   uint64_t scratch_va);
 void si_shader_binary_read_config(struct ac_shader_binary *binary,
  struct si_shader_config *conf,
  unsigned symbol_offset);
 unsigned si_get_spi_shader_z_format(bool writes_z, bool writes_stencil,
bool writes_samplemask);
 const char *si_get_shader_name(struct si_shader *shader, unsigned processor);
 
 /* Inline helpers. */
 
 /* Return the pointer to the main shader part's pointer. */
diff --git a/src/gallium/drivers/radeonsi/si_state_shaders.c 
b/src/gallium/drivers/radeonsi/si_state_shaders.c
index e479422..d5de749 100644
--- a/src/gallium/drivers/radeonsi/si_state_shaders.c
+++ b/src/gallium/drivers/radeonsi/si_state_shaders.c
@@ -2595,21 +2595,21 @@ static int si_update_scratch_buffer(struct si_context 
*sctx,
if (shader->config.scratch_bytes_per_wave == 0)
return 0;
 
/* This shader is already configured to use the 

[Mesa-dev] [PATCH 12/14] radeonsi: separate scratch state patching code into its own function

2017-04-28 Thread Marek Olšák
From: Marek Olšák 

Picked from a different branch. When we stop using the scratch patching,
this function will not be called.
---
 src/gallium/drivers/radeonsi/si_state_shaders.c | 101 +---
 1 file changed, 55 insertions(+), 46 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_state_shaders.c 
b/src/gallium/drivers/radeonsi/si_state_shaders.c
index ac53eff..16fb522 100644
--- a/src/gallium/drivers/radeonsi/si_state_shaders.c
+++ b/src/gallium/drivers/radeonsi/si_state_shaders.c
@@ -2635,95 +2635,104 @@ static unsigned 
si_get_max_scratch_bytes_per_wave(struct si_context *sctx)
unsigned bytes = 0;
 
bytes = MAX2(bytes, 
si_get_scratch_buffer_bytes_per_wave(sctx->ps_shader.current));
bytes = MAX2(bytes, 
si_get_scratch_buffer_bytes_per_wave(sctx->gs_shader.current));
bytes = MAX2(bytes, 
si_get_scratch_buffer_bytes_per_wave(sctx->vs_shader.current));
bytes = MAX2(bytes, 
si_get_scratch_buffer_bytes_per_wave(sctx->tcs_shader.current));
bytes = MAX2(bytes, 
si_get_scratch_buffer_bytes_per_wave(sctx->tes_shader.current));
return bytes;
 }
 
+static bool si_update_scratch_relocs(struct si_context *sctx)
+{
+   int r;
+
+   /* Update the shaders, so that they are using the latest scratch.
+* The scratch buffer may have been changed since these shaders were
+* last used, so we still need to try to update them, even if they
+* require scratch buffers smaller than the current size.
+*/
+   r = si_update_scratch_buffer(sctx, sctx->ps_shader.current);
+   if (r < 0)
+   return false;
+   if (r == 1)
+   si_pm4_bind_state(sctx, ps, sctx->ps_shader.current->pm4);
+
+   r = si_update_scratch_buffer(sctx, sctx->gs_shader.current);
+   if (r < 0)
+   return false;
+   if (r == 1)
+   si_pm4_bind_state(sctx, gs, sctx->gs_shader.current->pm4);
+
+   r = si_update_scratch_buffer(sctx, sctx->tcs_shader.current);
+   if (r < 0)
+   return false;
+   if (r == 1)
+   si_pm4_bind_state(sctx, hs, sctx->tcs_shader.current->pm4);
+
+   /* VS can be bound as LS, ES, or VS. */
+   r = si_update_scratch_buffer(sctx, sctx->vs_shader.current);
+   if (r < 0)
+   return false;
+   if (r == 1) {
+   if (sctx->tes_shader.current)
+   si_pm4_bind_state(sctx, ls, 
sctx->vs_shader.current->pm4);
+   else if (sctx->gs_shader.current)
+   si_pm4_bind_state(sctx, es, 
sctx->vs_shader.current->pm4);
+   else
+   si_pm4_bind_state(sctx, vs, 
sctx->vs_shader.current->pm4);
+   }
+
+   /* TES can be bound as ES or VS. */
+   r = si_update_scratch_buffer(sctx, sctx->tes_shader.current);
+   if (r < 0)
+   return false;
+   if (r == 1) {
+   if (sctx->gs_shader.current)
+   si_pm4_bind_state(sctx, es, 
sctx->tes_shader.current->pm4);
+   else
+   si_pm4_bind_state(sctx, vs, 
sctx->tes_shader.current->pm4);
+   }
+
+   return true;
+}
+
 static bool si_update_spi_tmpring_size(struct si_context *sctx)
 {
unsigned current_scratch_buffer_size =
si_get_current_scratch_buffer_size(sctx);
unsigned scratch_bytes_per_wave =
si_get_max_scratch_bytes_per_wave(sctx);
unsigned scratch_needed_size = scratch_bytes_per_wave *
sctx->scratch_waves;
unsigned spi_tmpring_size;
-   int r;
 
if (scratch_needed_size > 0) {
if (scratch_needed_size > current_scratch_buffer_size) {
/* Create a bigger scratch buffer */
r600_resource_reference(&sctx->scratch_buffer, NULL);
 
sctx->scratch_buffer = (struct r600_resource*)
r600_aligned_buffer_create(&sctx->screen->b.b,
   
R600_RESOURCE_FLAG_UNMAPPABLE,
   PIPE_USAGE_DEFAULT,
   scratch_needed_size, 
256);
if (!sctx->scratch_buffer)
return false;
 
si_mark_atom_dirty(sctx, &sctx->scratch_state);
r600_context_add_resource_size(&sctx->b.b,
   
&sctx->scratch_buffer->b.b);
}
 
-   /* Update the shaders, so they are using the latest scratch.  
The
-* scratch buffer may have been changed since these shaders were
-* last used, so we still need to try to update them, even if
-* they require scratch buffers smaller than the current size.
-*/
-   r = 

[Mesa-dev] [PATCH 11/14] radeonsi/gfx9: also apply scratch relocations to the 1st shader of merged shaders

2017-04-28 Thread Marek Olšák
From: Marek Olšák 

---
 src/gallium/drivers/radeonsi/si_state_shaders.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/gallium/drivers/radeonsi/si_state_shaders.c 
b/src/gallium/drivers/radeonsi/si_state_shaders.c
index d5de749..ac53eff 100644
--- a/src/gallium/drivers/radeonsi/si_state_shaders.c
+++ b/src/gallium/drivers/radeonsi/si_state_shaders.c
@@ -2595,20 +2595,23 @@ static int si_update_scratch_buffer(struct si_context 
*sctx,
if (shader->config.scratch_bytes_per_wave == 0)
return 0;
 
/* This shader is already configured to use the current
 * scratch buffer. */
if (shader->scratch_bo == sctx->scratch_buffer)
return 0;
 
assert(sctx->scratch_buffer);
 
+   if (shader->previous_stage)
+   si_shader_apply_scratch_relocs(shader->previous_stage, 
scratch_va);
+
si_shader_apply_scratch_relocs(shader, scratch_va);
 
/* Replace the shader bo with a new bo that has the relocs applied. */
r = si_shader_binary_upload(sctx->screen, shader);
if (r)
return r;
 
/* Update the shader state to use the new shader bo. */
si_shader_init_pm4_state(sctx->screen, shader);
 
-- 
2.7.4



[Mesa-dev] [PATCH 06/14] radeonsi: make si_compile_llvm static

2017-04-28 Thread Marek Olšák
From: Marek Olšák 

---
 src/gallium/drivers/radeonsi/si_shader.c | 16 ++++++++--------
 src/gallium/drivers/radeonsi/si_shader.h |  8 --------
 2 files changed, 8 insertions(+), 16 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c 
b/src/gallium/drivers/radeonsi/si_shader.c
index 0fa91de..06ad370 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -6784,28 +6784,28 @@ void si_shader_dump(struct si_screen *sscreen, struct 
si_shader *shader,
if (shader->epilog)
si_shader_dump_disassembly(&shader->epilog->binary,
   debug, "epilog", file);
fprintf(file, "\n");
}
 
si_shader_dump_stats(sscreen, shader, debug, processor, file,
 check_debug_option);
 }
 
-int si_compile_llvm(struct si_screen *sscreen,
-   struct ac_shader_binary *binary,
-   struct si_shader_config *conf,
-   LLVMTargetMachineRef tm,
-   LLVMModuleRef mod,
-   struct pipe_debug_callback *debug,
-   unsigned processor,
-   const char *name)
+static int si_compile_llvm(struct si_screen *sscreen,
+  struct ac_shader_binary *binary,
+  struct si_shader_config *conf,
+  LLVMTargetMachineRef tm,
+  LLVMModuleRef mod,
+  struct pipe_debug_callback *debug,
+  unsigned processor,
+  const char *name)
 {
int r = 0;
unsigned count = p_atomic_inc_return(&sscreen->b.num_compilations);
 
if (r600_can_dump_shader(&sscreen->b, processor)) {
fprintf(stderr, "radeonsi: Compiling shader %d\n", count);
 
if (!(sscreen->b.debug_flags & (DBG_NO_IR | DBG_PREOPT_IR))) {
fprintf(stderr, "%s LLVM IR:\n\n", name);
ac_dump_module(mod);
diff --git a/src/gallium/drivers/radeonsi/si_shader.h 
b/src/gallium/drivers/radeonsi/si_shader.h
index 0988d91..4a2c042 100644
--- a/src/gallium/drivers/radeonsi/si_shader.h
+++ b/src/gallium/drivers/radeonsi/si_shader.h
@@ -589,28 +589,20 @@ si_generate_gs_copy_shader(struct si_screen *sscreen,
   struct si_shader_selector *gs_selector,
   struct pipe_debug_callback *debug);
 int si_compile_tgsi_shader(struct si_screen *sscreen,
   LLVMTargetMachineRef tm,
   struct si_shader *shader,
   bool is_monolithic,
   struct pipe_debug_callback *debug);
 int si_shader_create(struct si_screen *sscreen, LLVMTargetMachineRef tm,
 struct si_shader *shader,
 struct pipe_debug_callback *debug);
-int si_compile_llvm(struct si_screen *sscreen,
-   struct ac_shader_binary *binary,
-   struct si_shader_config *conf,
-   LLVMTargetMachineRef tm,
-   LLVMModuleRef mod,
-   struct pipe_debug_callback *debug,
-   unsigned processor,
-   const char *name);
 void si_shader_destroy(struct si_shader *shader);
 unsigned si_shader_io_get_unique_index(unsigned semantic_name, unsigned index);
 unsigned si_shader_io_get_unique_index2(unsigned name, unsigned index);
 int si_shader_binary_upload(struct si_screen *sscreen, struct si_shader 
*shader);
 void si_shader_dump(struct si_screen *sscreen, struct si_shader *shader,
struct pipe_debug_callback *debug, unsigned processor,
FILE *f, bool check_debug_option);
 void si_multiwave_lds_size_workaround(struct si_screen *sscreen,
  unsigned *lds_size);
 void si_shader_apply_scratch_relocs(struct si_context *sctx,
-- 
2.7.4



[Mesa-dev] [PATCH 03/14] radeonsi: drop support for LLVM 3.8

2017-04-28 Thread Marek Olšák
From: Marek Olšák 

LLVM 3.8:
- had broken indirect resource indexing
- didn't have scratch coalescing
- was the last user of problematic v16i8
- only supported OpenGL 4.1

This leaves us with LLVM 3.9 and LLVM 4.0 support for Mesa 17.2.
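
The version checks removed by this patch compare HAVE_LLVM against hex constants such as 0x0309 and 0x0400. A minimal sketch of that encoding, assuming the major version sits in the high byte and the minor version in the low byte; llvm_version_code and radeonsi_llvm_supported are hypothetical helpers for illustration, not Mesa functions:

```c
/* Hypothetical helper: pack an LLVM major/minor pair the way the
 * HAVE_LLVM comparisons in the diff read (0x0309 means 3.9). */
static unsigned llvm_version_code(unsigned major, unsigned minor)
{
    return (major << 8) | minor;
}

/* Hypothetical check mirroring the new radeonsi minimum of LLVM 3.9. */
static int radeonsi_llvm_supported(unsigned version_code)
{
    return version_code >= llvm_version_code(3, 9);
}
```

With this encoding, 3.8 (0x0308) falls below the new minimum, while 3.9 and 4.0 pass.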
---
 configure.ac   |   4 +-
 src/amd/common/ac_llvm_build.c | 179 ++---
 src/amd/common/ac_llvm_util.c  |   7 -
 src/gallium/drivers/radeon/r600_pipe_common.c  |  10 +-
 src/gallium/drivers/radeonsi/si_pipe.c |  21 +--
 src/gallium/drivers/radeonsi/si_shader.c   |  42 ++---
 src/gallium/drivers/radeonsi/si_shader_tgsi_alu.c  |   6 +-
 .../drivers/radeonsi/si_shader_tgsi_setup.c|  30 +---
 8 files changed, 81 insertions(+), 218 deletions(-)

diff --git a/configure.ac b/configure.ac
index ba04279..a614458 100644
--- a/configure.ac
+++ b/configure.ac
@@ -95,22 +95,22 @@ XCBGLX_REQUIRED=1.8.1
 XDAMAGE_REQUIRED=1.1
 XSHMFENCE_REQUIRED=1.1
 XVMC_REQUIRED=1.0.6
 PYTHON_MAKO_REQUIRED=0.8.0
 LIBSENSORS_REQUIRED=4.0.0
 ZLIB_REQUIRED=1.2.8
 
 dnl LLVM versions
 LLVM_REQUIRED_GALLIUM=3.3.0
 LLVM_REQUIRED_OPENCL=3.6.0
-LLVM_REQUIRED_R600=3.8.0
-LLVM_REQUIRED_RADEONSI=3.8.0
+LLVM_REQUIRED_R600=3.9.0
+LLVM_REQUIRED_RADEONSI=3.9.0
 LLVM_REQUIRED_RADV=3.9.0
 LLVM_REQUIRED_SWR=3.9.0
 
 dnl Check for progs
 AC_PROG_CPP
 AC_PROG_CC
 AC_PROG_CXX
 AM_PROG_CC_C_O
 AM_PROG_AS
 AX_CHECK_GNU_MAKE
diff --git a/src/amd/common/ac_llvm_build.c b/src/amd/common/ac_llvm_build.c
index 171016b..ba92e7e 100644
--- a/src/amd/common/ac_llvm_build.c
+++ b/src/amd/common/ac_llvm_build.c
@@ -226,56 +226,30 @@ struct cube_selection_coords {
LLVMValueRef stc[2];
LLVMValueRef ma;
LLVMValueRef id;
 };
 
 static void
 build_cube_intrinsic(struct ac_llvm_context *ctx,
 LLVMValueRef in[3],
 struct cube_selection_coords *out)
 {
-   LLVMBuilderRef builder = ctx->builder;
-
-   if (HAVE_LLVM >= 0x0309) {
-   LLVMTypeRef f32 = ctx->f32;
-
-   out->stc[1] = ac_build_intrinsic(ctx, "llvm.amdgcn.cubetc",
-   f32, in, 3, AC_FUNC_ATTR_READNONE);
-   out->stc[0] = ac_build_intrinsic(ctx, "llvm.amdgcn.cubesc",
-   f32, in, 3, AC_FUNC_ATTR_READNONE);
-   out->ma = ac_build_intrinsic(ctx, "llvm.amdgcn.cubema",
-   f32, in, 3, AC_FUNC_ATTR_READNONE);
-   out->id = ac_build_intrinsic(ctx, "llvm.amdgcn.cubeid",
-   f32, in, 3, AC_FUNC_ATTR_READNONE);
-   } else {
-   LLVMValueRef c[4] = {
-   in[0],
-   in[1],
-   in[2],
-   LLVMGetUndef(LLVMTypeOf(in[0]))
-   };
-   LLVMValueRef vec = ac_build_gather_values(ctx, c, 4);
-
-   LLVMValueRef tmp =
-   ac_build_intrinsic(ctx, "llvm.AMDGPU.cube",
-  LLVMTypeOf(vec), &vec, 1,
-  AC_FUNC_ATTR_READNONE);
-
-   out->stc[1] = LLVMBuildExtractElement(builder, tmp,
-   LLVMConstInt(ctx->i32, 0, 0), "");
-   out->stc[0] = LLVMBuildExtractElement(builder, tmp,
-   LLVMConstInt(ctx->i32, 1, 0), "");
-   out->ma = LLVMBuildExtractElement(builder, tmp,
-   LLVMConstInt(ctx->i32, 2, 0), "");
-   out->id = LLVMBuildExtractElement(builder, tmp,
-   LLVMConstInt(ctx->i32, 3, 0), "");
-   }
+   LLVMTypeRef f32 = ctx->f32;
+
+   out->stc[1] = ac_build_intrinsic(ctx, "llvm.amdgcn.cubetc",
+f32, in, 3, AC_FUNC_ATTR_READNONE);
+   out->stc[0] = ac_build_intrinsic(ctx, "llvm.amdgcn.cubesc",
+f32, in, 3, AC_FUNC_ATTR_READNONE);
+   out->ma = ac_build_intrinsic(ctx, "llvm.amdgcn.cubema",
+f32, in, 3, AC_FUNC_ATTR_READNONE);
+   out->id = ac_build_intrinsic(ctx, "llvm.amdgcn.cubeid",
+f32, in, 3, AC_FUNC_ATTR_READNONE);
 }
 
 /**
  * Build a manual selection sequence for cube face sc/tc coordinates and
  * major axis vector (multiplied by 2 for consistency) for the given
  * vec3 \p coords, for the face implied by \p selcoords.
  *
  * For the major axis, we always adjust the sign to be in the direction of
  * selcoords.ma; i.e., a positive out_ma means that coords is pointed towards
  * the selcoords major axis.
@@ -551,21 +525,21 @@ ac_build_buffer_store_dword(struct ac_llvm_context *ctx,
unsigned num_channels,
LLVMValueRef voffset,
LLVMValueRef soffset,
   

[Mesa-dev] [PATCH 02/14] radeonsi: stop using v16i8

2017-04-28 Thread Marek Olšák
From: Marek Olšák 

---
 src/amd/common/ac_llvm_build.c  |  2 +-
 src/gallium/drivers/radeonsi/si_shader.c| 18 --
 src/gallium/drivers/radeonsi/si_shader_internal.h   |  1 -
 src/gallium/drivers/radeonsi/si_shader_tgsi_setup.c |  1 -
 4 files changed, 9 insertions(+), 13 deletions(-)

diff --git a/src/amd/common/ac_llvm_build.c b/src/amd/common/ac_llvm_build.c
index 209dfdd..171016b 100644
--- a/src/amd/common/ac_llvm_build.c
+++ b/src/amd/common/ac_llvm_build.c
@@ -752,21 +752,21 @@ LLVMValueRef ac_build_buffer_load_format(struct 
ac_llvm_context *ctx,
  ctx->v4f32, args, ARRAY_SIZE(args),
  /* READNONE means writes can't
   * affect it, while READONLY means
   * that writes can affect it. */
  readonly_memory && HAVE_LLVM >= 
0x0400 ?
  AC_FUNC_ATTR_READNONE :
  AC_FUNC_ATTR_READONLY);
}
 
LLVMValueRef args[] = {
-   rsrc,
+   LLVMBuildBitCast(ctx->builder, rsrc, ctx->v16i8, ""),
voffset,
vindex,
};
return ac_build_intrinsic(ctx, "llvm.SI.vs.load.input",
  ctx->v4f32, args, 3,
  AC_FUNC_ATTR_READNONE |
  AC_FUNC_ATTR_LEGACY);
 }
 
 /**
diff --git a/src/gallium/drivers/radeonsi/si_shader.c 
b/src/gallium/drivers/radeonsi/si_shader.c
index 77dd6b1..3ac1ef4 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -1378,21 +1378,21 @@ static LLVMValueRef get_sample_id(struct 
si_shader_context *ctx)
 /**
  * Load a dword from a constant buffer.
  */
 static LLVMValueRef buffer_load_const(struct si_shader_context *ctx,
  LLVMValueRef resource,
  LLVMValueRef offset)
 {
LLVMBuilderRef builder = ctx->gallivm.builder;
LLVMValueRef args[2] = {resource, offset};
 
-   return lp_build_intrinsic(builder, "llvm.SI.load.const", ctx->f32, 
args, 2,
+   return lp_build_intrinsic(builder, "llvm.SI.load.const.v4i32", 
ctx->f32, args, 2,
  LP_FUNC_ATTR_READNONE |
  LP_FUNC_ATTR_LEGACY);
 }
 
 static LLVMValueRef load_sample_position(struct si_shader_context *ctx, 
LLVMValueRef sample_id)
 {
struct lp_build_context *uint_bld = &ctx->bld_base.uint_bld;
struct gallivm_state *gallivm = &ctx->gallivm;
LLVMBuilderRef builder = gallivm->builder;
LLVMValueRef desc = LLVMGetParam(ctx->main_fn, ctx->param_rw_buffers);
@@ -4666,22 +4666,21 @@ static void tex_fetch_args(
unsigned chan;
unsigned num_deriv_channels = 0;
bool has_offset = inst->Texture.NumOffsets > 0;
LLVMValueRef res_ptr, samp_ptr, fmask_ptr = NULL;
unsigned dmask = 0xf;
 
tex_fetch_ptrs(bld_base, emit_data, &res_ptr, &samp_ptr, &fmask_ptr);
 
if (target == TGSI_TEXTURE_BUFFER) {
emit_data->dst_type = ctx->v4f32;
-   emit_data->args[0] = LLVMBuildBitCast(gallivm->builder, res_ptr,
- ctx->v16i8, "");
+   emit_data->args[0] = res_ptr;
emit_data->args[1] = ctx->i32_0;
emit_data->args[2] = lp_build_emit_fetch(bld_base, 
emit_data->inst, 0, TGSI_CHAN_X);
emit_data->arg_count = 3;
return;
}
 
/* Fetch and project texture coordinates */
coords[3] = lp_build_emit_fetch(bld_base, emit_data->inst, 0, 
TGSI_CHAN_W);
for (chan = 0; chan < 3; chan++ ) {
coords[chan] = lp_build_emit_fetch(bld_base,
@@ -5835,48 +5834,48 @@ static unsigned si_get_max_workgroup_size(struct 
si_shader *shader)
max_work_group_size = SI_MAX_VARIABLE_THREADS_PER_BLOCK;
}
return max_work_group_size;
 }
 
 static void declare_per_stage_desc_pointers(struct si_shader_context *ctx,
LLVMTypeRef *params,
unsigned *num_params,
bool assign_params)
 {
-   params[(*num_params)++] = const_array(ctx->v16i8, SI_NUM_CONST_BUFFERS);
+   params[(*num_params)++] = const_array(ctx->v4i32, SI_NUM_CONST_BUFFERS);
params[(*num_params)++] = const_array(ctx->v8i32, SI_NUM_SAMPLERS);
params[(*num_params)++] = const_array(ctx->v8i32, SI_NUM_IMAGES);
params[(*num_params)++] = const_array(ctx->v4i32, 
SI_NUM_SHADER_BUFFERS);
 
if (assign_params) {
ctx->param_const_buffers  = *num_params - 4;

[Mesa-dev] [PATCH 10/14] radeonsi/gfx9: set correct LLVM calling conventions for merged shaders

2017-04-28 Thread Marek Olšák
From: Marek Olšák 

for scratch support
---
 src/gallium/drivers/radeonsi/si_shader.c            |  1 +
 src/gallium/drivers/radeonsi/si_shader_tgsi_setup.c | 19 +++++++++++++++++--
 2 files changed, 18 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c 
b/src/gallium/drivers/radeonsi/si_shader.c
index f1eee32..b13f1b2 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -6412,20 +6412,21 @@ void si_shader_binary_read_config(struct 
ac_shader_binary *binary,
 * extracting fields to be emitted later.
 */
 
for (i = 0; i < binary->config_size_per_symbol; i+= 8) {
unsigned reg = util_le32_to_cpu(*(uint32_t*)(config + i));
unsigned value = util_le32_to_cpu(*(uint32_t*)(config + i + 4));
switch (reg) {
case R_00B028_SPI_SHADER_PGM_RSRC1_PS:
case R_00B128_SPI_SHADER_PGM_RSRC1_VS:
case R_00B228_SPI_SHADER_PGM_RSRC1_GS:
+   case R_00B428_SPI_SHADER_PGM_RSRC1_HS:
case R_00B848_COMPUTE_PGM_RSRC1:
conf->num_sgprs = MAX2(conf->num_sgprs, 
(G_00B028_SGPRS(value) + 1) * 8);
conf->num_vgprs = MAX2(conf->num_vgprs, 
(G_00B028_VGPRS(value) + 1) * 4);
conf->float_mode =  G_00B028_FLOAT_MODE(value);
conf->rsrc1 = value;
break;
case R_00B02C_SPI_SHADER_PGM_RSRC2_PS:
conf->lds_size = MAX2(conf->lds_size, 
G_00B02C_EXTRA_LDS_SIZE(value));
break;
case R_00B84C_COMPUTE_PGM_RSRC2:
diff --git a/src/gallium/drivers/radeonsi/si_shader_tgsi_setup.c 
b/src/gallium/drivers/radeonsi/si_shader_tgsi_setup.c
index de671ef..f717299 100644
--- a/src/gallium/drivers/radeonsi/si_shader_tgsi_setup.c
+++ b/src/gallium/drivers/radeonsi/si_shader_tgsi_setup.c
@@ -48,20 +48,21 @@ struct si_llvm_flow {
/* Loop exit or next part of if/else/endif. */
LLVMBasicBlockRef next_block;
LLVMBasicBlockRef loop_entry_block;
 };
 
 enum si_llvm_calling_convention {
RADEON_LLVM_AMDGPU_VS = 87,
RADEON_LLVM_AMDGPU_GS = 88,
RADEON_LLVM_AMDGPU_PS = 89,
RADEON_LLVM_AMDGPU_CS = 90,
+   RADEON_LLVM_AMDGPU_HS = 93,
 };
 
 void si_llvm_add_attribute(LLVMValueRef F, const char *name, int value)
 {
char str[16];
 
snprintf(str, sizeof(str), "%i", value);
LLVMAddTargetDependentFunctionAttr(F, name, str);
 }
 
@@ -1355,42 +1356,56 @@ void si_llvm_context_set_tgsi(struct si_shader_context 
*ctx,
 }
 
 void si_llvm_create_func(struct si_shader_context *ctx,
 const char *name,
 LLVMTypeRef *return_types, unsigned num_return_elems,
 LLVMTypeRef *ParamTypes, unsigned ParamCount)
 {
LLVMTypeRef main_fn_type, ret_type;
LLVMBasicBlockRef main_fn_body;
enum si_llvm_calling_convention call_conv;
+   unsigned real_shader_type;
 
if (num_return_elems)
ret_type = LLVMStructTypeInContext(ctx->gallivm.context,
   return_types,
   num_return_elems, true);
else
ret_type = LLVMVoidTypeInContext(ctx->gallivm.context);
 
/* Setup the function */
ctx->return_type = ret_type;
main_fn_type = LLVMFunctionType(ret_type, ParamTypes, ParamCount, 0);
ctx->main_fn = LLVMAddFunction(ctx->gallivm.module, name, main_fn_type);
main_fn_body = LLVMAppendBasicBlockInContext(ctx->gallivm.context,
ctx->main_fn, "main_body");
LLVMPositionBuilderAtEnd(ctx->gallivm.builder, main_fn_body);
 
-   switch (ctx->type) {
+   real_shader_type = ctx->type;
+
+   /* LS is merged into HS (TCS), and ES is merged into GS. */
+   if (ctx->screen->b.chip_class >= GFX9) {
+   if (ctx->shader->key.as_ls)
+   real_shader_type = PIPE_SHADER_TESS_CTRL;
+   else if (ctx->shader->key.as_es)
+   real_shader_type = PIPE_SHADER_GEOMETRY;
+   }
+
+   switch (real_shader_type) {
case PIPE_SHADER_VERTEX:
-   case PIPE_SHADER_TESS_CTRL:
case PIPE_SHADER_TESS_EVAL:
call_conv = RADEON_LLVM_AMDGPU_VS;
break;
+   case PIPE_SHADER_TESS_CTRL:
+   call_conv = HAVE_LLVM >= 0x0500 ? RADEON_LLVM_AMDGPU_HS :
+ RADEON_LLVM_AMDGPU_VS;
+   break;
case PIPE_SHADER_GEOMETRY:
call_conv = RADEON_LLVM_AMDGPU_GS;
break;
case PIPE_SHADER_FRAGMENT:
call_conv = RADEON_LLVM_AMDGPU_PS;
break;
case PIPE_SHADER_COMPUTE:
  

[Mesa-dev] [PATCH 01/14] radeonsi/gfx9: make some PA & DB registers match the closed Vulkan driver

2017-04-28 Thread Marek Olšák
From: Marek Olšák 

Cc: 17.1 
---
 src/amd/common/gfx9d.h                  |  4 ++++
 src/gallium/drivers/radeonsi/si_state.c | 21 ++++++++++++++++++---
 2 files changed, 22 insertions(+), 3 deletions(-)

diff --git a/src/amd/common/gfx9d.h b/src/amd/common/gfx9d.h
index e295a1d..787d0a9 100644
--- a/src/amd/common/gfx9d.h
+++ b/src/amd/common/gfx9d.h
@@ -4067,20 +4067,24 @@
 #define   C_028054_BASE_HI
0xFF00
 #define R_028058_DB_STENCIL_WRITE_BASE           0x028058
 #define R_02805C_DB_STENCIL_WRITE_BASE_HI        0x02805C
 #define   S_02805C_BASE_HI(x)                    (((unsigned)(x) & 0xFF) << 0)
 #define   G_02805C_BASE_HI(x)                    (((x) >> 0) & 0xFF)
 #define   C_02805C_BASE_HI                       0xFFFFFF00
 #define R_028060_DB_DFSM_CONTROL                 0x028060
 #define   S_028060_PUNCHOUT_MODE(x)              (((unsigned)(x) & 0x03) << 0)
 #define   G_028060_PUNCHOUT_MODE(x)              (((x) >> 0) & 0x03)
 #define   C_028060_PUNCHOUT_MODE                 0xFFFFFFFC
+#define V_028060_AUTO  0
+#define V_028060_FORCE_ON  1
+#define V_028060_FORCE_OFF 2
+#define V_028060_RESERVED  3
 #define   S_028060_POPS_DRAIN_PS_ON_OVERLAP(x)   (((unsigned)(x) & 0x1) << 2)
 #define   G_028060_POPS_DRAIN_PS_ON_OVERLAP(x)   (((x) >> 2) & 0x1)
 #define   C_028060_POPS_DRAIN_PS_ON_OVERLAP      0xFFFFFFFB
 #define   S_028060_DISALLOW_OVERFLOW(x)          (((unsigned)(x) & 0x1) << 3)
 #define   G_028060_DISALLOW_OVERFLOW(x)          (((x) >> 3) & 0x1)
 #define   C_028060_DISALLOW_OVERFLOW             0xFFFFFFF7
 #define R_028064_DB_RENDER_FILTER   
0x028064
 #define   S_028064_PS_INVOKE_MASK(x)  
(((unsigned)(x) & 0x) << 0)
 #define   G_028064_PS_INVOKE_MASK(x)  (((x) >> 
0) & 0x)
 #define   C_028064_PS_INVOKE_MASK 
0x
diff --git a/src/gallium/drivers/radeonsi/si_state.c 
b/src/gallium/drivers/radeonsi/si_state.c
index 938e7fb..17ac23e 100644
--- a/src/gallium/drivers/radeonsi/si_state.c
+++ b/src/gallium/drivers/radeonsi/si_state.c
@@ -4557,27 +4557,42 @@ static void si_init_config(struct si_context *sctx)
if (sctx->screen->b.has_rbplus)
si_pm4_set_reg(pm4, R_028C40_PA_SC_SHADER_CONTROL, 0);
 
si_pm4_set_reg(pm4, R_028080_TA_BC_BASE_ADDR, border_color_va >> 8);
if (sctx->b.chip_class >= CIK)
si_pm4_set_reg(pm4, R_028084_TA_BC_BASE_ADDR_HI, 
border_color_va >> 40);
si_pm4_add_bo(pm4, sctx->border_color_buffer, RADEON_USAGE_READ,
  RADEON_PRIO_BORDER_COLORS);
 
if (sctx->b.chip_class >= GFX9) {
-   si_pm4_set_reg(pm4, R_028060_DB_DFSM_CONTROL, 0);
+   unsigned num_se = sscreen->b.info.max_se;
+   unsigned pc_lines = 0;
+
+   switch (sctx->b.family) {
+   case CHIP_VEGA10:
+   pc_lines = 4096;
+   break;
+   default:
+   assert(0);
+   }
+
+   si_pm4_set_reg(pm4, R_028060_DB_DFSM_CONTROL,
+  S_028060_PUNCHOUT_MODE(V_028060_FORCE_OFF));
si_pm4_set_reg(pm4, R_028064_DB_RENDER_FILTER, 0);
/* TODO: We can use this to disable RBs for rendering to GART: 
*/
si_pm4_set_reg(pm4, R_02835C_PA_SC_TILE_STEERING_OVERRIDE, 0);
si_pm4_set_reg(pm4, R_02883C_PA_SU_OVER_RASTERIZATION_CNTL, 0);
/* TODO: Enable the binner: */
si_pm4_set_reg(pm4, R_028C44_PA_SC_BINNER_CNTL_0,
-  
S_028C44_BINNING_MODE(V_028C44_DISABLE_BINNING_USE_LEGACY_SC));
-   si_pm4_set_reg(pm4, R_028C48_PA_SC_BINNER_CNTL_1, 0);
+  
S_028C44_BINNING_MODE(V_028C44_DISABLE_BINNING_USE_LEGACY_SC) |
+  S_028C44_DISABLE_START_OF_PRIM(1));
+   si_pm4_set_reg(pm4, R_028C48_PA_SC_BINNER_CNTL_1,
+  S_028C48_MAX_ALLOC_COUNT(MIN2(128, pc_lines / (4 
* num_se))) |
+  S_028C48_MAX_PRIM_PER_BATCH(1023));
si_pm4_set_reg(pm4, 
R_028C4C_PA_SC_CONSERVATIVE_RASTERIZATION_CNTL,
   S_028C4C_NULL_SQUAD_AA_MASK_ENABLE(1));

[Mesa-dev] [PATCH 05/14] radeonsi: fold surrounding code into si_llvm_finalize_module

2017-04-28 Thread Marek Olšák
From: Marek Olšák 

and rename to si_llvm_optimize_module.
---
 src/gallium/drivers/radeonsi/si_shader.c| 20 
 src/gallium/drivers/radeonsi/si_shader_internal.h   |  3 +--
 src/gallium/drivers/radeonsi/si_shader_tgsi_setup.c | 10 +++---
 3 files changed, 12 insertions(+), 21 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c 
b/src/gallium/drivers/radeonsi/si_shader.c
index 5ee8c6f..0fa91de 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -6985,27 +6985,22 @@ si_generate_gs_copy_shader(struct si_screen *sscreen,
if (stream == 0)
si_llvm_export_vs(bld_base, outputs, 
gsinfo->num_outputs);
 
LLVMBuildBr(builder, end_bb);
}
 
LLVMPositionBuilderAtEnd(builder, end_bb);
 
LLVMBuildRetVoid(gallivm->builder);
 
-   /* Dump LLVM IR before any optimization passes */
-   if (sscreen->b.debug_flags & DBG_PREOPT_IR &&
-   r600_can_dump_shader(&sscreen->b, PIPE_SHADER_GEOMETRY))
-   ac_dump_module(ctx.gallivm.module);
-
-   si_llvm_finalize_module(&ctx,
-   r600_extra_shader_checks(&sscreen->b, PIPE_SHADER_GEOMETRY));
+   ctx.type = PIPE_SHADER_GEOMETRY; /* override for shader dumping */
+   si_llvm_optimize_module(&ctx);
 
r = si_compile_llvm(sscreen, >binary,
>config, ctx.tm,
ctx.gallivm.module,
debug, PIPE_SHADER_GEOMETRY,
"GS Copy Shader");
if (!r) {
if (r600_can_dump_shader(&sscreen->b, PIPE_SHADER_GEOMETRY))
fprintf(stderr, "GS Copy Shader:\n");
si_shader_dump(sscreen, ctx.shader, debug,
@@ -8122,27 +8117,21 @@ int si_compile_tgsi_shader(struct si_screen *sscreen,
}
 
si_get_ps_epilog_key(shader, _key);
si_build_ps_epilog_function(, _key);
parts[need_prolog ? 2 : 1] = ctx.main_fn;
 
si_build_wrapper_function(&ctx, parts, need_prolog ? 3 : 2,
  need_prolog ? 1 : 0, 0);
}
 
-   /* Dump LLVM IR before any optimization passes */
-   if (sscreen->b.debug_flags & DBG_PREOPT_IR &&
-   r600_can_dump_shader(&sscreen->b, ctx.type))
-   LLVMDumpModule(ctx.gallivm.module);
-
-   si_llvm_finalize_module(&ctx,
-   r600_extra_shader_checks(&sscreen->b, ctx.type));
+   si_llvm_optimize_module(&ctx);
 
/* Post-optimization transformations and analysis. */
si_eliminate_const_vs_outputs(&ctx);
 
if ((debug && debug->debug_message) ||
r600_can_dump_shader(&sscreen->b, ctx.type))
si_count_scratch_private_memory(&ctx);
 
/* Compile to bytecode. */
r = si_compile_llvm(sscreen, >binary, >config, tm,
@@ -8297,22 +8286,21 @@ si_get_shader_part(struct si_screen *sscreen,
else
shader.key.part.ps.epilog = key->ps_epilog.states;
break;
default:
unreachable("bad shader part");
}
 
build(&ctx, key);
 
/* Compile. */
-   si_llvm_finalize_module(&ctx,
-   r600_extra_shader_checks(&sscreen->b, PIPE_SHADER_FRAGMENT));
+   si_llvm_optimize_module(&ctx);
 
if (si_compile_llvm(sscreen, >binary, >config, tm,
gallivm->module, debug, ctx.type, name)) {
FREE(result);
result = NULL;
goto out;
}
 
result->next = *list;
*list = result;
diff --git a/src/gallium/drivers/radeonsi/si_shader_internal.h 
b/src/gallium/drivers/radeonsi/si_shader_internal.h
index 03bf83d..b54db20 100644
--- a/src/gallium/drivers/radeonsi/si_shader_internal.h
+++ b/src/gallium/drivers/radeonsi/si_shader_internal.h
@@ -264,22 +264,21 @@ void si_llvm_context_init(struct si_shader_context *ctx,
 void si_llvm_context_set_tgsi(struct si_shader_context *ctx,
  struct si_shader *shader);
 
 void si_llvm_create_func(struct si_shader_context *ctx,
 const char *name,
 LLVMTypeRef *return_types, unsigned num_return_elems,
 LLVMTypeRef *ParamTypes, unsigned ParamCount);
 
 void si_llvm_dispose(struct si_shader_context *ctx);
 
-void si_llvm_finalize_module(struct si_shader_context *ctx,
-bool run_verifier);
+void si_llvm_optimize_module(struct si_shader_context *ctx);
 
 LLVMValueRef si_llvm_emit_fetch_64bit(struct lp_build_tgsi_context *bld_base,
  enum tgsi_opcode_type type,
  LLVMValueRef ptr,
  LLVMValueRef ptr2);
 
 LLVMValueRef si_llvm_emit_fetch(struct lp_build_tgsi_context *bld_base,
const struct 

[Mesa-dev] [RFC 1/2] glx|egl: allow to test if glthread is safe enough on X11 platform

2017-04-28 Thread Gregory Hainaut
I extended struct __DRIbackgroundCallableExtensionRec because its other
function pointer is already related to glthread.

The DRI2/DRI3 GLX code paths check that the display can be locked (basically,
that XInitThreads was called).

The EGL code path is trickier, as we don't want to pull in the X11 headers.
Instead, the code assumes that it is safe if X11 isn't used or there is no
display (i.e. 100% XCB).

The new function will be used in the next commit
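
The EGL decision described above can be sketched in isolation. This is only an illustration of the three cases, not the code from the patch: the enum, the function name, and the plain pointer standing in for the native display are all hypothetical stand-ins for the real EGL types.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the EGL platform enum; only the X11
 * distinction matters for this sketch. */
enum sketch_platform { SKETCH_PLATFORM_X11, SKETCH_PLATFORM_OTHER };

/* Returns true when glthread is assumed safe under the rules in the
 * commit message: any non-X11 platform, or X11 without an Xlib display. */
static bool sketch_egl_glthread_safe(enum sketch_platform platform,
                                     const void *native_display)
{
    if (platform != SKETCH_PLATFORM_X11)
        return true;    /* libX11 is not involved */
    if (native_display == NULL)
        return true;    /* no Xlib Display, so pure XCB */
    return false;       /* Xlib Display whose locking state is unknown */
}
```

The last case is deliberately conservative: without the X11 headers the lock state cannot be inspected, so the sketch reports "unsafe", as the commit message describes.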

Signed-off-by: Gregory Hainaut 
---
 include/GL/internal/dri_interface.h |  9 +++++++++
 src/egl/drivers/dri2/egl_dri2.c     | 30 ++++++++++++++++++++++++++++++
 src/glx/dri2_glx.c                  |  9 +++++++++
 src/glx/dri3_glx.c                  |  8 ++++++++
 4 files changed, 56 insertions(+)

diff --git a/include/GL/internal/dri_interface.h 
b/include/GL/internal/dri_interface.h
index 86efd1bdc9..28a52ccdb9 100644
--- a/include/GL/internal/dri_interface.h
+++ b/include/GL/internal/dri_interface.h
@@ -1713,13 +1713,22 @@ struct __DRIbackgroundCallableExtensionRec {
 * non-background thread (i.e. a thread that has already been bound to a
 * context using __DRIcoreExtensionRec::bindContext()); when this happens,
 * the \c loaderPrivate pointer must be equal to the pointer that was
 * passed to the driver when the currently bound context was created.
 *
 * This call should execute quickly enough that the driver can call it with
 * impunity whenever a background thread starts performing drawing
 * operations (e.g. it should just set a thread-local variable).
 */
void (*setBackgroundContext)(void *loaderPrivate);
+   /**
+* Indicate that it is multithread safe to use glthread. Typically
+* XInitThreads was called in GLX setup.
+*
+* \param loaderPrivate is the value that was passed to the driver when
+* the context was created.  This can be used by the loader to identify
+* which context any callbacks are associated with.
+*/
+   GLboolean (*isGlThreadSafe)(void *loaderPrivate);
 };
 
 #endif
diff --git a/src/egl/drivers/dri2/egl_dri2.c b/src/egl/drivers/dri2/egl_dri2.c
index 2cab7d00c1..df2db97bcf 100644
--- a/src/egl/drivers/dri2/egl_dri2.c
+++ b/src/egl/drivers/dri2/egl_dri2.c
@@ -85,24 +85,54 @@
 
 static void
 dri_set_background_context(void *loaderPrivate)
 {
_EGLContext *ctx = _eglGetCurrentContext();
_EGLThreadInfo *t = _eglGetCurrentThread();
 
_eglBindContextToThread(ctx, t);
 }
 
+static GLboolean
+dri_is_glthread_safe(void *loaderPrivate)
+{
+#ifdef HAVE_X11_PLATFORM
+   struct dri2_egl_surface *dri2_surf = loaderPrivate;
+   _EGLDisplay *display =  dri2_surf->base.Resource.Display;
+   Display *dpy;
+
+   // Only the libX11 isn't safe
+   if (display->Platform != _EGL_PLATFORM_X11)
+  return true;
+
+   // Will use pure XCB so no libX11 here either
+   if (display->PlatformDisplay == NULL)
+  return true;
+
+   // In an ideal world we would check the X11 lock pointer
+   // (display->PlatformDisplay->lock_fns). Unfortunately, that
+   // requires knowing the full type, and we don't want to bring the
+   // X11 headers in here.
+   //
+   // So let's assume unsafe behavior. Modern EGL code shouldn't use
+   // libX11 anyway.
+   return false;
+#else
+   return true;
+#endif
+}
+
 const __DRIbackgroundCallableExtension background_callable_extension = {
.base = { __DRI_BACKGROUND_CALLABLE, 1 },
 
.setBackgroundContext = dri_set_background_context,
+   .isGlThreadSafe   = dri_is_glthread_safe,
 };
 
 const __DRIuseInvalidateExtension use_invalidate = {
.base = { __DRI_USE_INVALIDATE, 1 }
 };
 
 EGLint dri2_to_egl_attribute_map[] = {
0,
EGL_BUFFER_SIZE,/* __DRI_ATTRIB_BUFFER_SIZE */
EGL_LEVEL,/* __DRI_ATTRIB_LEVEL */
diff --git a/src/glx/dri2_glx.c b/src/glx/dri2_glx.c
index 145f44d6e8..8f4d2f027f 100644
--- a/src/glx/dri2_glx.c
+++ b/src/glx/dri2_glx.c
@@ -946,20 +946,28 @@ dri2GetSwapInterval(__GLXDRIdrawable *pdraw)
   return priv->swap_interval;
 }
 
 static void
 driSetBackgroundContext(void *loaderPrivate)
 {
struct dri2_context *pcp = (struct dri2_context *) loaderPrivate;
__glXSetCurrentContext(&pcp->base);
 }
 
+static GLboolean
+driIsGlThreadSafe(void *loaderPrivate)
+{
+   struct dri2_context *pcp = (struct dri2_context *) loaderPrivate;
+   return pcp->base.psc->dpy->lock_fns != NULL;
+}
+
+
 static const __DRIdri2LoaderExtension dri2LoaderExtension = {
.base = { __DRI_DRI2_LOADER, 3 },
 
.getBuffers  = dri2GetBuffers,
.flushFrontBuffer= dri2FlushFrontBuffer,
.getBuffersWithFormat= dri2GetBuffersWithFormat,
 };
 
 static const __DRIdri2LoaderExtension dri2LoaderExtension_old = {
.base = { __DRI_DRI2_LOADER, 3 },
@@ -970,20 +978,21 @@ static const __DRIdri2LoaderExtension 
dri2LoaderExtension_old = {
 };
 
 static const __DRIuseInvalidateExtension dri2UseInvalidate = {
.base = { __DRI_USE_INVALIDATE, 1 }
 };
 
 static const 

[Mesa-dev] [RFC 2/2] glthread/gallium: require safe_glthread to start glthread

2017-04-28 Thread Gregory Hainaut
Otherwise print a warning

Signed-off-by: Gregory Hainaut 
---
 src/gallium/state_trackers/dri/dri_context.c | 10 --
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/src/gallium/state_trackers/dri/dri_context.c 
b/src/gallium/state_trackers/dri/dri_context.c
index 92d79849c4..35b0c454be 100644
--- a/src/gallium/state_trackers/dri/dri_context.c
+++ b/src/gallium/state_trackers/dri/dri_context.c
@@ -153,22 +153,28 @@ dri_create_context(gl_api api, const struct gl_config * 
visual,
 
if (ctx->st->cso_context) {
   ctx->pp = pp_init(ctx->st->pipe, screen->pp_enabled, 
ctx->st->cso_context);
   ctx->hud = hud_create(ctx->st->pipe, ctx->st->cso_context);
}
 
/* Do this last. */
if (ctx->st->start_thread &&
/* the driver loader must implement this */
screen->sPriv->dri2.backgroundCallable &&
-   driQueryOptionb(>optionCache, "mesa_glthread"))
-  ctx->st->start_thread(ctx->st);
+   driQueryOptionb(>optionCache, "mesa_glthread")) {
+
+  if 
(ctx->sPriv->dri2.backgroundCallable->isGlThreadSafe(cPriv->loaderPrivate))
+ ctx->st->start_thread(ctx->st);
+  else
+ fprintf(stderr, "MESA warning: glthread can't be enabled because "
+   "the application didn't call XInitThreads\n");
+   }
 
*error = __DRI_CTX_ERROR_SUCCESS;
return GL_TRUE;
 
  fail:
if (ctx && ctx->st)
   ctx->st->destroy(ctx->st);
 
free(ctx);
return GL_FALSE;
-- 
2.11.0



[Mesa-dev] [RFC 0/2] Disable glthread if libX11 isn't thread-safe

2017-04-28 Thread Gregory Hainaut
Hello,

Following the discussion from
https://lists.freedesktop.org/archives/mesa-dev/2017-April/153137.html

A check was added to ensure that the X11 display can be locked. It should be
enough to ensure thread safety between X11 and glthread.

I also added the check on DRI3, as I'm not 100% sure that it is really thread safe.

The EGL case is trickier, so the pair (X11/libX11) is marked as unsafe.
I think that is fine because modern EGL applications should rely on XCB (on the
X11 platform).
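
For readers skimming the series, the decision it makes can be sketched in isolation. This is only an illustration under stated assumptions, not the Mesa code: `fake_display`, `loader_callbacks`, `glx_is_thread_safe` and `should_start_glthread` are hypothetical names, and `lock_fns` merely stands in for the libX11 field that is non-NULL once XInitThreads() has been called.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an Xlib Display: lock_fns is non-NULL only
 * after the application has called XInitThreads(). */
struct fake_display {
   void *lock_fns;
};

/* Hypothetical loader-side callback table, modelled on the extension
 * discussed in this series. */
struct loader_callbacks {
   bool (*is_glthread_safe)(void *loader_private);
};

/* GLX-like loader: safe only if the display can be locked. */
static bool glx_is_thread_safe(void *loader_private)
{
   const struct fake_display *dpy = loader_private;

   assert(dpy != NULL); /* the loader always passes its own display */
   return dpy->lock_fns != NULL;
}

/* Driver side: start the background thread only when the option is
 * enabled, the loader provides the callback, and the callback agrees. */
static bool should_start_glthread(const struct loader_callbacks *cb,
                                  void *loader_private,
                                  bool option_enabled)
{
   if (!option_enabled || cb == NULL || cb->is_glthread_safe == NULL)
      return false;
   return cb->is_glthread_safe(loader_private);
}
```

With this shape, a loader that cannot inspect the display (the EGL/libX11 case) can simply return false from its callback, and the driver falls back to running without the background thread.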

Best regards,
Gregory Hainaut (2):
  glx|egl: allow to test if glthread is safe enough on X11 platform
  glthread/gallium: require safe_glthread to start glthread

 include/GL/internal/dri_interface.h  |  9 +
 src/egl/drivers/dri2/egl_dri2.c  | 30 
 src/gallium/state_trackers/dri/dri_context.c | 10 --
 src/glx/dri2_glx.c   |  9 +
 src/glx/dri3_glx.c   |  8 
 5 files changed, 64 insertions(+), 2 deletions(-)

-- 
2.11.0



[Mesa-dev] [PATCH] radv: Add NIR loop unrolling.

2017-04-28 Thread Bas Nieuwenhuizen
Not much effect on dota2/talos, but positive on deferred.

Signed-off-by: Bas Nieuwenhuizen 
---
 src/amd/vulkan/radv_pipeline.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/src/amd/vulkan/radv_pipeline.c b/src/amd/vulkan/radv_pipeline.c
index d6989137a55..7340675915f 100644
--- a/src/amd/vulkan/radv_pipeline.c
+++ b/src/amd/vulkan/radv_pipeline.c
@@ -61,6 +61,7 @@ static const struct nir_shader_compiler_options nir_options = 
{
.lower_unpack_unorm_4x8 = true,
.lower_extract_byte = true,
.lower_extract_word = true,
+   .max_unroll_iterations = 32
 };
 
 VkResult radv_CreateShaderModule(
@@ -152,6 +153,12 @@ radv_optimize_nir(struct nir_shader *shader)
 NIR_PASS(progress, shader, nir_copy_prop);
 NIR_PASS(progress, shader, nir_opt_remove_phis);
 NIR_PASS(progress, shader, nir_opt_dce);
+if (nir_opt_trivial_continues(shader)) {
+progress = true;
+NIR_PASS(progress, shader, nir_copy_prop);
+NIR_PASS(progress, shader, nir_opt_dce);
+}
+NIR_PASS(progress, shader, nir_opt_if);
 NIR_PASS(progress, shader, nir_opt_dead_cf);
 NIR_PASS(progress, shader, nir_opt_cse);
 NIR_PASS(progress, shader, nir_opt_peephole_select, 8);
@@ -159,6 +166,9 @@ radv_optimize_nir(struct nir_shader *shader)
 NIR_PASS(progress, shader, nir_opt_constant_folding);
 NIR_PASS(progress, shader, nir_opt_undef);
 NIR_PASS(progress, shader, nir_opt_conditional_discard);
+if (shader->options->max_unroll_iterations) {
+NIR_PASS(progress, shader, nir_opt_loop_unroll, 0);
+}
 } while (progress);
 }
 
-- 
2.12.2



[Mesa-dev] [Bug 100741] Chromium - Memory leak

2017-04-28 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=100741

Chad Versace  changed:

   What|Removed |Added

 CC||chadvers...@chromium.org

-- 
You are receiving this mail because:
You are the assignee for the bug.


[Mesa-dev] [Bug 100876] Variable GALLIUM_HUD_DUMP_DIR is not working with Wine LFS

2017-04-28 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=100876

Bug ID: 100876
   Summary: Variable GALLIUM_HUD_DUMP_DIR is not working with Wine
LFS
   Product: Mesa
   Version: 17.0
  Hardware: x86-64 (AMD64)
OS: Linux (All)
Status: NEW
  Severity: normal
  Priority: medium
 Component: Other
  Assignee: mesa-dev@lists.freedesktop.org
  Reporter: vinib...@freemail.hu
QA Contact: mesa-dev@lists.freedesktop.org

Hello Devs,

I wanted to create a scripted/automated benchmark, but it seems that the
data shown on screen (which is generated correctly) is not saved into the dump
files, although the files themselves are created.
I used the following command:
GALLIUM_HUD_DUMP_DIR="/media/ramdisk/" GALLIUM_HUD="fps,cpu"
/usr/share/playonlinux/playonlinux --run "LFS"
The ownership is fine: glxgears was able to generate data, though in my
experience that depends on the exit code.
I'm using rolling Arch with:
linux 4.10.11-1
mesa 17.0.4-2
wine-gaming-nine 2.3-1
xorg-server 1.19.3-2

It works if I disable Gallium Nine in the settings.

Thank you

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


Re: [Mesa-dev] [PATCH 2/2] radeonsi: don't load unused compute shader input SGPRs and VGPRs

2017-04-28 Thread Marek Olšák
On Tue, Apr 25, 2017 at 8:24 AM, Nicolai Hähnle  wrote:
> On 24.04.2017 18:22, Marek Olšák wrote:
>>
>> From: Marek Olšák 
>>
>> Basically, don't load GRID_SIZE or BLOCK_SIZE if they are unused,
>> determine
>> whether to load BLOCK_ID for each component separately, and set the number
>> of THREAD_ID VGPRs to load. Now we should get the maximum CS launch wave
>> rate in most cases.
>> ---
>>  src/gallium/drivers/radeonsi/si_compute.c | 71
>> ++-
>>  src/gallium/drivers/radeonsi/si_shader.c  | 37 
>>  src/gallium/drivers/radeonsi/si_shader.h  | 11 
>>  src/gallium/drivers/radeonsi/si_shader_internal.h |  5 ++
>>  4 files changed, 76 insertions(+), 48 deletions(-)
>>
>> diff --git a/src/gallium/drivers/radeonsi/si_compute.c
>> b/src/gallium/drivers/radeonsi/si_compute.c
>> index 2b2efae..b3399d1 100644
>> --- a/src/gallium/drivers/radeonsi/si_compute.c
>> +++ b/src/gallium/drivers/radeonsi/si_compute.c
>> @@ -41,20 +41,22 @@ struct si_compute {
>>
>> unsigned ir_type;
>> unsigned local_size;
>> unsigned private_size;
>> unsigned input_size;
>> struct si_shader shader;
>>
>> struct pipe_resource *global_buffers[MAX_GLOBAL_BUFFERS];
>> unsigned use_code_object_v2 : 1;
>> unsigned variable_group_size : 1;
>> +   unsigned uses_grid_size:1;
>> +   unsigned uses_block_size:1;
>>  };
>>
>>  struct dispatch_packet {
>> uint16_t header;
>> uint16_t setup;
>> uint16_t workgroup_size_x;
>> uint16_t workgroup_size_y;
>> uint16_t workgroup_size_z;
>> uint16_t reserved0;
>> uint32_t grid_size_x;
>> @@ -114,37 +116,45 @@ static void si_create_compute_state_async(void *job,
>> int thread_index)
>> memset(&sel, 0, sizeof(sel));
>>
>> sel.screen = program->screen;
>> tgsi_scan_shader(program->tokens, &sel.info);
>> sel.tokens = program->tokens;
>> sel.type = PIPE_SHADER_COMPUTE;
>> sel.local_size = program->local_size;
>>
>> program->shader.selector = &sel;
>> program->shader.is_monolithic = true;
>> +   program->uses_grid_size = sel.info.uses_grid_size;
>> +   program->uses_block_size = sel.info.uses_block_size;
>>
>> if (si_shader_create(program->screen, tm, >shader,
>> debug)) {
>> program->shader.compilation_failed = true;
>> } else {
>> bool scratch_enabled =
>> shader->config.scratch_bytes_per_wave > 0;
>> +   unsigned user_sgprs = SI_NUM_RESOURCE_SGPRS +
>> + (sel.info.uses_grid_size ? 3 : 0) +
>> + (sel.info.uses_block_size ? 3 : 0);
>>
>> shader->config.rsrc1 =
>> S_00B848_VGPRS((shader->config.num_vgprs - 1) / 4)
>> |
>> S_00B848_SGPRS((shader->config.num_sgprs - 1) / 8)
>> |
>> S_00B848_DX10_CLAMP(1) |
>> S_00B848_FLOAT_MODE(shader->config.float_mode);
>>
>> shader->config.rsrc2 =
>> -   S_00B84C_USER_SGPR(SI_CS_NUM_USER_SGPR) |
>> +   S_00B84C_USER_SGPR(user_sgprs) |
>> S_00B84C_SCRATCH_EN(scratch_enabled) |
>> -   S_00B84C_TGID_X_EN(1) | S_00B84C_TGID_Y_EN(1) |
>> -   S_00B84C_TGID_Z_EN(1) | S_00B84C_TIDIG_COMP_CNT(2)
>> |
>> +   S_00B84C_TGID_X_EN(sel.info.uses_block_id[0]) |
>> +   S_00B84C_TGID_Y_EN(sel.info.uses_block_id[1]) |
>> +   S_00B84C_TGID_Z_EN(sel.info.uses_block_id[2]) |
>> +   S_00B84C_TIDIG_COMP_CNT(sel.info.uses_thread_id[2]
>> ? 2 :
>> +   sel.info.uses_thread_id[1]
>> ? 1 : 0) |
>> S_00B84C_LDS_SIZE(shader->config.lds_size);
>>
>> program->variable_group_size =
>>
>> sel.info.properties[TGSI_PROPERTY_CS_FIXED_BLOCK_WIDTH] == 0;
>> }
>>
>> FREE(program->tokens);
>> program->shader.selector = NULL;
>>  }
>>
>> @@ -644,50 +654,57 @@ static bool si_upload_compute_input(struct
>> si_context *sctx,
>> }
>>
>> r600_resource_reference(_buffer, NULL);
>>
>> return true;
>>  }
>>
>>  static void si_setup_tgsi_grid(struct si_context *sctx,
>>  const struct pipe_grid_info *info)
>>  {
>> +   struct si_compute *program = sctx->cs_shader_state.program;
>> struct radeon_winsys_cs *cs = sctx->b.gfx.cs;
>> unsigned grid_size_reg = R_00B900_COMPUTE_USER_DATA_0 +
>> - 4 * SI_SGPR_GRID_SIZE;
>> +4 * SI_NUM_RESOURCE_SGPRS;
>> +   unsigned block_size_reg = grid_size_reg +
>> + /* 12 bytes = 3 dwords. */
>> + 

Re: [Mesa-dev] [PATCH v2] swr: move msaa resolve to generalized StoreTile

2017-04-28 Thread Ilia Mirkin
On Fri, Apr 28, 2017 at 3:58 PM, Cherniak, Bruce
 wrote:
>
>> On Apr 27, 2017, at 7:50 PM, Ilia Mirkin  wrote:
>>
>> On Thu, Apr 27, 2017 at 8:45 PM, Cherniak, Bruce
>>  wrote:
>>>
>>>> On Apr 27, 2017, at 7:38 PM, Ilia Mirkin  wrote:
>>>>
>>>> Erm, so ... what happens if I render to FB1, then render to FB2, then
>>>> render to FB1 again (and I have blending enabled)? Doesn't the resolve
>>>> lose the per-sample information? Or does the resolve merely precompute
>>>> the resolved version on the off chance that it's needed, without
>>>> losing the source data?
>>>
>>> The resolve occurs into a secondary, driver private, surface.  All
>>> per-sample information is maintained in the original surfaces.
>>>
>>> Yes, the resolve is currently done "on the off chance that it’s needed".
>>> There is likely an optimization to be had there, but it should be
>>> functionally correct.
>>
>> Got it. May I ask why this isn't done on-demand instead? Is it a pain
>> to plug into swr's execution engine? I'm just concerned that
>> StoreTile() may get called a lot, more than even there are draws, as
>> tiles are swapped in and out of "hotness", and I wouldn't be surprised
>> if resolves were needed only a fraction of the time.
>>
>> Cheers,
>>
>>  -ilia
>
>
> Good observation.  I haven’t yet seen this to be the case in the scientific
> visualization applications I’ve been running. But, I can envision where that
> becomes a performance concern.
>
> Do you mean a blit based “state_tracker initiated” on-demand resolve (via
> pipe_blit)?  If so, here are my thoughts:

Yes. The resolve is always initiated via a blit() call anyways (with a
dst surface with nr_samples == 0).

> 1) The software winsys and state trackers don't support multisample surfaces
>for software renderers, nor will/should they (except for swr).  So, I
>thought keeping most of the changes local to our driver would be most
>desirable and safest, as far as swrast and llvmpipe are concerned.  Not
>sure about wgl yet, but I don't see it.
>
> 2) A blit based resolve causes a pipeline reconfiguration (save/restore around
>the blit) that is inherently less efficient than simply
>storing-out/resolving HotTiles.
>
> 3) A blit based resolve needs to sample from the multisample surface using a
>texture sampler with 2DMS/3DMS support.  We’re currently using llvmpipe's
>sampler which doesn't need this support.  I’m looking into extending it, as
>I know we need the functionality for compliance; it’s just not there yet.
>
> I may be off-base on any of these thoughts.  If so, please correct me.
>
> We’ll probably move to a “driver internal” on-demand resolve, implemented
> similar to StoreTiles.  It's a simple matter to only resolve for the times we
> know it's needed and the multisample surface is in HotTiles.  But, I need to
> work out the LoadTiles case for surfaces that aren’t currently in HotTiles.
> Tricky, since we're checking the resolve status of the secondary (resolved)
> surface and the HotTile state of the multisample surface.
>
> Thanks for the feedback.  Getting this completely correct and optimized is
> going to be iterative.  This current patch, while maybe not optimal, helps
> with functionality.  So, I think it's a step in the right direction.

I hope you realize I wasn't looking to derail your attempts at
progress, more like providing some things to think about on your march
towards perfection :) MS textures/fbo's are definitely a thing,
probably more so than MS winsys surfaces these days. At least for
games, maybe not visualization software, with which I have next to no
experience. Try it with e.g. Unigine Heaven or Valley (with MSAA
enabled). I'm fairly sure that at least Heaven uses MSAA textures.

I believe most hardware uses MSAA compression, based on the
observation that it's pretty common for all samples in a pixel to have
the same color, or bg color + fg color + coverage mask. TBH I'm not
sure how it all works. Something for the future when you get all the
basics right.

Some hardware has built-in resolve functionality (e.g. Adreno, maybe
other tilers as well) for moving a MS FBO out of a "hot tile", while
most hardware requires the pipeline reconfiguration + blit. Perhaps
it'd make sense to add a special FE command for computing the resolved
version of all the tiles, and have that state get dirtied when you
render. There are also extensions like
GL_EXT_multisampled_render_to_texture which support the
"insta-resolve" use-case more directly. However they're not
implemented in mesa AFAIK.

Cheers,

  -ilia


Re: [Mesa-dev] [PATCH v2] swr: move msaa resolve to generalized StoreTile

2017-04-28 Thread Cherniak, Bruce

> On Apr 27, 2017, at 7:50 PM, Ilia Mirkin  wrote:
> 
> On Thu, Apr 27, 2017 at 8:45 PM, Cherniak, Bruce
>  wrote:
>> 
>>> On Apr 27, 2017, at 7:38 PM, Ilia Mirkin  wrote:
>>> 
>>> Erm, so ... what happens if I render to FB1, then render to FB2, then
>>> render to FB1 again (and I have blending enabled)? Doesn't the resolve
>>> lose the per-sample information? Or does the resolve merely precompute
>>> the resolved version on the off chance that it's needed, without
>>> losing the source data?
>> 
>> The resolve occurs into a secondary, driver private, surface.  All per-sample
>> information is maintained in the original surfaces.
>> 
>> Yes, the resolve is currently done "on the off chance that it’s needed".
>> There is likely an optimization to be had there, but it should be
>> functionally correct.
> 
> Got it. May I ask why this isn't done on-demand instead? Is it a pain
> to plug into swr's execution engine? I'm just concerned that
> StoreTile() may get called a lot, more than even there are draws, as
> tiles are swapped in and out of "hotness", and I wouldn't be surprised
> if resolves were needed only a fraction of the time.
> 
> Cheers,
> 
>  -ilia


Good observation.  I haven’t yet seen this to be the case in the scientific
visualization applications I’ve been running. But, I can envision where that
becomes a performance concern.

Do you mean a blit based “state_tracker initiated” on-demand resolve (via
pipe_blit)?  If so, here are my thoughts:

1) The software winsys and state trackers don't support multisample surfaces
   for software renderers, nor will/should they (except for swr).  So, I
   thought keeping most of the changes local to our driver would be most
   desirable and safest, as far as swrast and llvmpipe are concerned.  Not
   sure about wgl yet, but I don't see it.

2) A blit based resolve causes a pipeline reconfiguration (save/restore around
   the blit) that is inherently less efficient than simply
   storing-out/resolving HotTiles.

3) A blit based resolve needs to sample from the multisample surface using a
   texture sampler with 2DMS/3DMS support.  We’re currently using llvmpipe's
   sampler which doesn't need this support.  I’m looking into extending it, as
   I know we need the functionality for compliance; it’s just not there yet.

I may be off-base on any of these thoughts.  If so, please correct me.

We’ll probably move to a “driver internal” on-demand resolve, implemented
similar to StoreTiles.  It's a simple matter to only resolve for the times we
know it's needed and the multisample surface is in HotTiles.  But, I need to
work out the LoadTiles case for surfaces that aren’t currently in HotTiles.
Tricky, since we're checking the resolve status of the secondary (resolved)
surface and the HotTile state of the multisample surface.
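
To make the "driver internal on-demand resolve" idea concrete, here is a minimal sketch, not swr code. It assumes a single-channel surface, a fixed sample count, and a plain box filter (the thread does not say which filter swr uses); `ms_surface`, `write_sample` and `resolve_if_needed` are hypothetical names. The point is only that the per-sample data is never touched by the resolve, and that a dirty flag lets the resolve be skipped when nothing has been rendered since the last one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NUM_SAMPLES 4   /* assumed sample count for this sketch */

/* Hypothetical single-channel multisample surface with a private
 * resolved copy, as described in the thread. */
struct ms_surface {
   size_t num_pixels;
   float *samples;      /* num_pixels * NUM_SAMPLES values; resolve only reads them */
   float *resolved;     /* num_pixels values, driver-private */
   bool resolve_dirty;  /* set by rendering, cleared by the resolve */
};

/* Rendering writes one sample and marks the resolved copy stale. */
static void write_sample(struct ms_surface *s, size_t pixel,
                         unsigned sample, float value)
{
   assert(pixel < s->num_pixels && sample < NUM_SAMPLES);
   s->samples[pixel * NUM_SAMPLES + sample] = value;
   s->resolve_dirty = true;
}

/* Box-filter resolve, run only when the resolved copy is stale.
 * Returns true if any work was done. */
static bool resolve_if_needed(struct ms_surface *s)
{
   if (!s->resolve_dirty)
      return false;

   for (size_t p = 0; p < s->num_pixels; p++) {
      float sum = 0.0f;
      for (unsigned i = 0; i < NUM_SAMPLES; i++)
         sum += s->samples[p * NUM_SAMPLES + i];
      s->resolved[p] = sum / NUM_SAMPLES;
   }
   s->resolve_dirty = false;
   return true;
}
```

A consumer such as a blit or present would call `resolve_if_needed()` right before reading `resolved`, so back-to-back reads without intervening draws cost nothing extra.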

Thanks for the feedback.  Getting this completely correct and optimized is
going to be iterative.  This current patch, while maybe not optimal, helps
with functionality.  So, I think it's a step in the right direction.

Thanks,

Bruce



Re: [Mesa-dev] [PATCH v2 30/31] glsl: disable tree grafting optimization for bindless images

2017-04-28 Thread Nicolai Hähnle

On 28.04.2017 03:15, Timothy Arceri wrote:
> On 26/04/17 18:27, Samuel Pitoiset wrote:
>> On 04/26/2017 10:05 AM, Nicolai Hähnle wrote:
>>> On 24.04.2017 12:36, Samuel Pitoiset wrote:
>>>> Because the variable declaration holds more information than
>>>> the dereference. Note that an image is considered bindless either
>>>> if it has been declared in the default uniform block with the
>>>> bindless_sampler layout qualifier, or when its storage is not
>>>> uniform because this is not allowed without ARB_bindless_texture.
>>>
>>> It seems unfortunate that we have to do this. Can you explain what
>>> goes wrong without this change?
>>
>> We lost the variable and all information contained in ir_variable for
>> images.
>
> Can you give an example of a shader where this becomes an issue? And why?


Samuel provided me with one, but the basic problem is actually quite 
simple. Consider this (using GLSL rather than the IR, but I hope you get 
the drift):


  coherent image2D img = (some expression);
  out = imageLoad(img, ...);

Tree grafting will convert that to

  out = imageLoad((some expression), ...);

But now the coherent qualifier is gone.

In a way, GLSL is not very well specified here: the qualifiers should 
really be part of the type, at least in the same way that C/C++ have the 
cv-qualifiers, but they aren't.


There is another problem, which is the "mere" implementation problem 
that st_glsl_to_tgsi is only set up to handle sampler/image parameters 
to intrinsics that are direct dereferences. visit_image_intrinsics has a 
cast from ir_rvalue to ir_dereference, which is simply incorrect when 
that parameter is an expression.


The easy answer is to just not do tree grafting for samplers and images.

The cleaner answer is to disable tree grafting only when any of the 
data.image_* qualifiers are set on the variable to be grafted, and to 
fix st_glsl_to_tgsi so that it also handles expressions as sampler/image 
parameters to intrinsics.
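
As a rough illustration of that "cleaner answer", the grafting check could look like the sketch below. It is written in plain C rather than Mesa's C++ IR, `var_decl` is a hypothetical, heavily simplified stand-in for a variable declaration, and the flag names only mirror the data.image_* qualifiers mentioned above; it is not the actual compiler code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified model of a variable declaration. */
struct var_decl {
   bool is_image;
   bool image_coherent;
   bool image_volatile;
   bool image_restrict;
   bool image_read_only;
   bool image_write_only;
};

/* True if the declaration carries any qualifier that exists only on the
 * variable and would be dropped if the variable were grafted away. */
static bool has_image_qualifier(const struct var_decl *v)
{
   return v->image_coherent || v->image_volatile || v->image_restrict ||
          v->image_read_only || v->image_write_only;
}

/* Grafting replaces the use of a temporary with its initializer
 * expression, so anything stored only on the declaration is lost.
 * Refuse to graft images whose declaration carries such qualifiers. */
static bool can_graft(const struct var_decl *v)
{
   assert(v != NULL);
   if (v->is_image && has_image_qualifier(v))
      return false;
   return true;
}
```

In the GLSL example above, `img` is declared coherent, so `can_graft()` would return false and the imageLoad would keep referring to the qualified variable.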


Cheers,
Nicolai

>>>
>>> Thanks,
>>> Nicolai
>>>
>>>> Signed-off-by: Samuel Pitoiset 
>>>> ---
>>>>  src/compiler/glsl/opt_tree_grafting.cpp | 9 +
>>>>  1 file changed, 9 insertions(+)
>>>>
>>>> diff --git a/src/compiler/glsl/opt_tree_grafting.cpp
>>>> b/src/compiler/glsl/opt_tree_grafting.cpp
>>>> index 28b6e1856e..d4a1ec5675 100644
>>>> --- a/src/compiler/glsl/opt_tree_grafting.cpp
>>>> +++ b/src/compiler/glsl/opt_tree_grafting.cpp
>>>> @@ -371,6 +371,15 @@ tree_grafting_basic_block(ir_instruction
>>>> *bb_first,
>>>>    if (lhs_var->data.precise)
>>>>   continue;
>>>>
>>>> +  if (lhs_var->type->is_image() &&
>>>> +  (lhs_var->data.bindless || lhs_var->data.mode !=
>>>> ir_var_uniform)) {
>>>> + /* Disable tree grafting optimization for bindless image
>>>> types because
>>>> +  * the variable declaration holds more information than the
>>>> +  * dereference.
>>>> +  */
>>>> + continue;
>>>> +  }
>>>> +
>>>>    ir_variable_refcount_entry *entry =
>>>> info->refs->get_variable_entry(lhs_var);
>>>>
>>>>    if (!entry->declaration ||





--
Lerne, wie die Welt wirklich ist,
Aber vergiss niemals, wie sie sein sollte.


Re: [Mesa-dev] [PATCH 36/61] radeonsi/gfx9: set registers and shader key for merged ES-GS

2017-04-28 Thread Marek Olšák
On Fri, Apr 28, 2017 at 9:31 PM, Nicolai Hähnle  wrote:
> Fair enough on those magic numbers. It would be nice to understand them
> better though.
>
>
> On 28.04.2017 17:56, Marek Olšák wrote:
> [snip]

>>>> @@ -1721,20 +1893,26 @@ static void *si_create_shader_selector(struct
>>>> pipe_context *ctx,
>>>> break;
>>>> case TGSI_SEMANTIC_CLIPVERTEX: /* ignore these
>>>> */
>>>> case TGSI_SEMANTIC_EDGEFLAG:
>>>> break;
>>>> default:
>>>> sel->outputs_written2 |=
>>>> 1u <<
>>>> si_shader_io_get_unique_index2(name, index);
>>>> }
>>>> }
>>>> sel->esgs_itemsize =
>>>> util_last_bit64(sel->outputs_written)
>>>> * 16;
>>>> +
>>>> +   /* For the ESGS ring in LDS, add 1 dword to reduce LDS
>>>> bank
>>>> +* conflicts, i.e. each vertex will start at a different
>>>> bank.
>>>> +*/
>>>> +   if (sctx->b.chip_class >= GFX9)
>>>> +   sel->esgs_itemsize += 4;
>>>
>>>
>>>
>>> Could this not be achieved by some form of rounding instead?
>>
>>
>> What do you mean?
>
>
> Actually, I think I was mistaken. There are 4 banks, and they're
> interleaved, right? So the idea is to have esgs_itemsize not be a multiple
> of 16 bytes, but a multiple of 16 bytes + 4 bytes. It makes sense to me now.

As far as I know, LDS has 32 banks, but some small chips (Kabini,
Stoney) have only 16 banks. I don't know how they are interleaved.
This tweak was suggested by a hw doc.
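
The arithmetic behind the tweak can be checked with a small sketch. It rests on assumptions that are not confirmed in this thread: 32 banks, each one dword wide, with consecutive dwords mapping to consecutive banks. `start_bank` and `distinct_start_banks` are hypothetical helper names. Under that model, an item size of 16*n bytes is 4*n dwords, which shares a factor with 32, while 4*n + 1 dwords is odd and therefore visits every bank before repeating.

```c
#include <assert.h>

#define NUM_BANKS 32u   /* assumption: 32 banks, one dword wide each */

/* Bank hit by the first dword of a vertex, given the per-vertex item
 * size in bytes (assumed linear dword-to-bank mapping). */
static unsigned start_bank(unsigned vertex, unsigned itemsize_bytes)
{
   assert(itemsize_bytes % 4 == 0);
   return (vertex * (itemsize_bytes / 4)) % NUM_BANKS;
}

/* Number of distinct starting banks used by the first NUM_BANKS vertices. */
static unsigned distinct_start_banks(unsigned itemsize_bytes)
{
   unsigned char seen[NUM_BANKS] = {0};
   unsigned count = 0;

   for (unsigned v = 0; v < NUM_BANKS; v++) {
      unsigned b = start_bank(v, itemsize_bytes);
      if (!seen[b]) {
         seen[b] = 1;
         count++;
      }
   }
   return count;
}
```

For example, with four written outputs (64 bytes per vertex) the vertices start in only two different banks under this model, whereas 68 bytes per vertex spreads the 32 vertices over all 32 banks.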

Marek


Re: [Mesa-dev] [PATCH 10/12] Android: default to building all drivers

2017-04-28 Thread Rob Herring
On Fri, Apr 28, 2017 at 10:58 AM, Mauro Rossi  wrote:
> 2017-04-28 14:23 GMT+02:00 Rob Herring :
>> On Thu, Apr 27, 2017 at 9:50 PM, Chih-Wei Huang  
>> wrote:
>>> A typo in the subject?
>>> (s/building/build/)
>>
>> It's a bit misleading as originally I wrote it such that a blank
>> BOARD_GPU_DRIVERS would enable all drivers, then changed it to "all".
>> So it's not really a default anymore.
>>
>>> 2017-04-28 3:43 GMT+08:00 Rob Herring :
>>>> If BOARD_GPU_DRIVERS is empty, build all the drivers. This doesn't
>>>> enable building mesa as that is controlled by including libGLES_mesa in
>>>> the product.
>>>>
>>>> Signed-off-by: Rob Herring 
>>>> ---
>>>>  Android.mk | 8 
>>>>  1 file changed, 8 insertions(+)
>>>>
>>>> diff --git a/Android.mk b/Android.mk
>>>> index 9f481ee7e109..76858c1616bc 100644
>>>> --- a/Android.mk
>>>> +++ b/Android.mk
>>>> @@ -1,3 +1,4 @@
>>>> +
>>>>  # Mesa 3-D graphics library
>>>>  #
>>>>  # Copyright (C) 2010-2011 Chia-I Wu 
>>>> @@ -53,8 +54,15 @@ gallium_drivers := \
>>>> vc4.HAVE_GALLIUM_VC4 \
>>>> virgl.HAVE_GALLIUM_VIRGL
>>>>
>>>> +$(warning $(BOARD_GPU_DRIVERS))
>>>> +
>>>> +ifeq ($(BOARD_GPU_DRIVERS),all)
>>>> +MESA_BUILD_CLASSIC := $(filter HAVE_%, $(subst ., , $(classic_drivers)))
>>>> +MESA_BUILD_GALLIUM := $(filter HAVE_%, $(subst ., , $(gallium_drivers)))
>>>> +else
>>>>  MESA_BUILD_CLASSIC := $(strip $(foreach d, $(BOARD_GPU_DRIVERS), 
>>>> $(patsubst $(d).%,%, $(filter $(d).%, $(classic_drivers)
>>>>  MESA_BUILD_GALLIUM := $(strip $(foreach d, $(BOARD_GPU_DRIVERS), 
>>>> $(patsubst $(d).%,%, $(filter $(d).%, $(gallium_drivers)
>>>> +endif
>>>>  $(foreach d, $(MESA_BUILD_CLASSIC) $(MESA_BUILD_GALLIUM), $(eval $(d) := 
>>>> true))
>>>>
>>>>  # host and target must be the same arch to generate matypes.h
>>>> --
>>>
>>> Aren't some drivers for arm or x86 only?
>>
>> In practice, yes. But they should build on all architectures so folks
>> can easily build test. If the arm only ones required an arm compiler
>> for example, then none of the x86 folks would build them and check
>> that their changes don't break any drivers.
>>
>>> Is it really possible to build all drivers?
>>
>> Yes. That is what my CI job does.
>>
>> Rob
>
> Is it possible to leave a $(warning   ) in case someone specifies a
> new/wrong driver in BOARD_GPU_DRIVERS?

Okay. It wasn't immediately

> In case I wanted to use mesa 17.2 to build swrast only, considering
> that radeonsi/radeon require LLVM 3.8.0, how can I do that if mesa
> builds everything?

BOARD_GPU_DRIVERS=swrast

> Why should we completely lose the possibility to control the list of
> drivers in android, which is basically a one driver case (swrast) for
> some android-x86 boards or when we could live with intel drivers like
> in an Atom tablet?

You haven't. The existing way to list drivers is unchanged externally.

Rob


Re: [Mesa-dev] [PATCH 2/2] st/mesa: use min_index and max_index directly from vbo

2017-04-28 Thread Nicolai Hähnle

Thanks for this. Series is

Reviewed-by: Nicolai Hähnle 


On 26.04.2017 11:35, Marek Olšák wrote:

From: Marek Olšák 

also remove the incorrect comment about primitive restart.
---
 src/mesa/state_tracker/st_draw.c | 9 ++---
 1 file changed, 2 insertions(+), 7 deletions(-)

diff --git a/src/mesa/state_tracker/st_draw.c b/src/mesa/state_tracker/st_draw.c
index d710284..e510d43 100644
--- a/src/mesa/state_tracker/st_draw.c
+++ b/src/mesa/state_tracker/st_draw.c
@@ -200,28 +200,23 @@ st_draw_vbo(struct gl_context *ctx,
if (ib) {
   /* Get index bounds for user buffers. */
   if (!index_bounds_valid)
  if (!all_varyings_in_vbos(arrays))
 vbo_get_minmax_indices(ctx, prims, ib, &min_index, &max_index,
nr_prims);

   setup_index_buffer(st, ib);

   info.indexed = TRUE;
-  if (min_index != ~0U && max_index != ~0U) {
- info.min_index = min_index;
- info.max_index = max_index;
-  }
+  info.min_index = min_index;
+  info.max_index = max_index;

-  /* The VBO module handles restart for the non-indexed GLDrawArrays
-   * so we only set these fields for indexed drawing:
-   */
   setup_primitive_restart(ctx, &info, ib->index_size);
}
else {
   /* Transform feedback drawing is always non-indexed. */
   /* Set info.count_from_stream_output. */
   if (tfb_vertcount) {
  if (!st_transform_feedback_draw_init(tfb_vertcount, stream, &info))
 return;
   }
}




--
Lerne, wie die Welt wirklich ist,
Aber vergiss niemals, wie sie sein sollte.


Re: [Mesa-dev] [PATCH 43/61] radeonsi/gfx9: add GS prolog support for merged ES-GS

2017-04-28 Thread Marek Olšák
On Fri, Apr 28, 2017 at 9:33 PM, Nicolai Hähnle  wrote:
> On 28.04.2017 17:59, Marek Olšák wrote:
>>
>> On Fri, Apr 28, 2017 at 1:25 PM, Nicolai Hähnle 
>> wrote:
>>>
>>> On 24.04.2017 10:45, Marek Olšák wrote:


 From: Marek Olšák 

 ---
  src/gallium/drivers/radeonsi/si_shader.c | 87
 +---
  1 file changed, 70 insertions(+), 17 deletions(-)

 diff --git a/src/gallium/drivers/radeonsi/si_shader.c
 b/src/gallium/drivers/radeonsi/si_shader.c
 index a4c2ac0..392f85d 100644
 --- a/src/gallium/drivers/radeonsi/si_shader.c
 +++ b/src/gallium/drivers/radeonsi/si_shader.c
 @@ -7368,20 +7368,28 @@ static void
 si_count_scratch_private_memory(struct
 si_shader_context *ctx)
 LLVMTypeRef type =
 LLVMGetElementType(LLVMTypeOf(inst));
 /* No idea why LLVM aligns allocas to 4
 elements.
 */
 unsigned alignment = LLVMGetAlignment(inst);
 unsigned dw_size =
 align(llvm_get_type_size(type)
 / 4, alignment);
 ctx->shader->config.private_mem_vgprs +=
 dw_size;
 }
 bb = LLVMGetNextBasicBlock(bb);
 }
  }

 +static void si_init_exec_full_mask(struct si_shader_context *ctx)
 +{
 +   LLVMValueRef full_mask = LLVMConstInt(ctx->i64, ~0ull, 0);
 +   lp_build_intrinsic(ctx->gallivm.builder,
 +  "llvm.amdgcn.init.exec", ctx->voidt,
 +  &full_mask, 1, LP_FUNC_ATTR_CONVERGENT);
 +}
 +
  static void si_init_exec_from_input(struct si_shader_context *ctx,
 unsigned param, unsigned bitoffset)
  {
 LLVMValueRef args[] = {
 LLVMGetParam(ctx->main_fn, param),
 LLVMConstInt(ctx->i32, bitoffset, 0),
 };
 lp_build_intrinsic(ctx->gallivm.builder,
"llvm.amdgcn.init.exec.from.input",
ctx->voidt, args, 2,
 LP_FUNC_ATTR_CONVERGENT);
 @@ -7681,79 +7689,128 @@ static void si_get_ps_epilog_key(struct
 si_shader
 *shader,
 key->ps_epilog.states = shader->key.part.ps.epilog;
  }

  /**
   * Build the GS prolog function. Rotate the input vertices for triangle
 strips
   * with adjacency.
   */
  static void si_build_gs_prolog_function(struct si_shader_context *ctx,
 union si_shader_part_key *key)
  {
 -   const unsigned num_sgprs = GFX6_GS_NUM_USER_SGPR + 2;
 -   const unsigned num_vgprs = 8;
 +   unsigned num_sgprs, num_vgprs;
 struct gallivm_state *gallivm = &ctx->gallivm;
 LLVMBuilderRef builder = gallivm->builder;
 -   LLVMTypeRef params[32];
 -   LLVMTypeRef returns[32];
 +   LLVMTypeRef params[48]; /* 40 SGPRs (maximum) + some VGPRs */
 +   LLVMTypeRef returns[48];
 LLVMValueRef func, ret;

 +   if (ctx->screen->b.chip_class >= GFX9) {
 +   num_sgprs = 8 + GFX9_GS_NUM_USER_SGPR;
 +   num_vgprs = 5; /* ES inputs are not needed by GS */
 +   } else {
 +   num_sgprs = GFX6_GS_NUM_USER_SGPR + 2;
 +   num_vgprs = 8;
 +   }
 +
 for (unsigned i = 0; i < num_sgprs; ++i) {
 params[i] = ctx->i32;
 returns[i] = ctx->i32;
 }

 for (unsigned i = 0; i < num_vgprs; ++i) {
 params[num_sgprs + i] = ctx->i32;
 returns[num_sgprs + i] = ctx->f32;
 }

 /* Create the function. */
 si_create_function(ctx, "gs_prolog", returns, num_sgprs +
 num_vgprs,
params, num_sgprs + num_vgprs, num_sgprs -
 1);
 func = ctx->main_fn;

 +   /* Set the full EXEC mask for the prolog, because we are only
 fiddling
 +* with registers here. The main shader part will set the
 correct
 EXEC
 +* mask.
 +*/
 +   if (ctx->screen->b.chip_class >= GFX9)
 +   si_init_exec_full_mask(ctx);
 +
 /* Copy inputs to outputs. This should be no-op, as the
 registers
 match,
  * but it will prevent the compiler from overwriting them
 unintentionally.
  */
 ret = ctx->return_value;
 for (unsigned i = 0; i < num_sgprs; i++) {
 LLVMValueRef p = LLVMGetParam(func, i);
 ret = LLVMBuildInsertValue(builder, ret, p, i, "");
 }
 for (unsigned i = 0; i < num_vgprs; 

Re: [Mesa-dev] [PATCH 56/61] radeonsi: get InstanceID from VGPR1 (or VGPR2 for tess) instead of VGPR3

2017-04-28 Thread Nicolai Hähnle

On 28.04.2017 18:08, Marek Olšák wrote:

On Fri, Apr 28, 2017 at 1:54 PM, Nicolai Hähnle  wrote:

On 24.04.2017 10:45, Marek Olšák wrote:


From: Marek Olšák 

VGPR1 = InstanceID / StepRate0; // StepRate0 can be set to 1
---
 src/gallium/drivers/radeonsi/si_shader.c| 20 ++--
 src/gallium/drivers/radeonsi/si_shader.h|  1 +
 src/gallium/drivers/radeonsi/si_state.c |  1 +
 src/gallium/drivers/radeonsi/si_state_shaders.c | 24
+---
 4 files changed, 33 insertions(+), 13 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c
b/src/gallium/drivers/radeonsi/si_shader.c
index edb50a3..ce509af 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -5838,23 +5838,28 @@ static void declare_vs_specific_input_sgprs(struct
si_shader_context *ctx,
params[ctx->param_vs_state_bits = (*num_params)++] = ctx->i32;
 }

 static void declare_vs_input_vgprs(struct si_shader_context *ctx,
   LLVMTypeRef *params, unsigned
*num_params,
   unsigned *num_prolog_vgprs)
 {
struct si_shader *shader = ctx->shader;

params[ctx->param_vertex_id = (*num_params)++] = ctx->i32;
-   params[ctx->param_rel_auto_id = (*num_params)++] = ctx->i32;
-   params[ctx->param_vs_prim_id = (*num_params)++] = ctx->i32;
-   params[ctx->param_instance_id = (*num_params)++] = ctx->i32;
+   if (shader->key.as_ls) {
+   params[ctx->param_rel_auto_id = (*num_params)++] =
ctx->i32;
+   params[ctx->param_instance_id = (*num_params)++] =
ctx->i32;
+   } else {
+   params[ctx->param_instance_id = (*num_params)++] =
ctx->i32;
+   params[ctx->param_vs_prim_id = (*num_params)++] =
ctx->i32;
+   }
+   params[(*num_params)++] = ctx->i32; /* unused */

if (!shader->is_gs_copy_shader) {
/* Vertex load indices. */
ctx->param_vertex_index0 = (*num_params);
for (unsigned i = 0; i <
shader->selector->info.num_inputs; i++)
params[(*num_params)++] = ctx->i32;
*num_prolog_vgprs += shader->selector->info.num_inputs;
}
 }

@@ -7497,25 +7502,28 @@ static bool si_compile_tgsi_main(struct
si_shader_context *ctx,
 static void si_get_vs_prolog_key(const struct tgsi_shader_info *info,
 unsigned num_input_sgprs,
 const struct si_vs_prolog_bits
*prolog_key,
 struct si_shader *shader_out,
 union si_shader_part_key *key)
 {
memset(key, 0, sizeof(*key));
key->vs_prolog.states = *prolog_key;
key->vs_prolog.num_input_sgprs = num_input_sgprs;
key->vs_prolog.last_input = MAX2(1, info->num_inputs) - 1;
+   key->vs_prolog.as_ls = shader_out->key.as_ls;

-   if (shader_out->selector->type == PIPE_SHADER_TESS_CTRL)
+   if (shader_out->selector->type == PIPE_SHADER_TESS_CTRL) {
+   key->vs_prolog.as_ls = 1;
key->vs_prolog.num_merged_next_stage_vgprs = 2;
-   else if (shader_out->selector->type == PIPE_SHADER_GEOMETRY)
+   } else if (shader_out->selector->type == PIPE_SHADER_GEOMETRY) {
key->vs_prolog.num_merged_next_stage_vgprs = 5;
+   }

/* Set the instanceID flag. */
for (unsigned i = 0; i < info->num_inputs; i++)
if (key->vs_prolog.states.instance_divisors[i])
shader_out->info.uses_instanceid = true;
 }

 /**
  * Compute the VS epilog key, which contains all the information needed
to
  * build the VS epilog function, and set the PrimitiveID output offset.
@@ -8508,21 +8516,21 @@ static void si_build_vs_prolog_function(struct
si_shader_context *ctx,
LLVMValueRef ret, func;
int last_sgpr, num_params, num_returns, i;
unsigned first_vs_vgpr = key->vs_prolog.num_input_sgprs +

key->vs_prolog.num_merged_next_stage_vgprs;
unsigned num_input_vgprs =
key->vs_prolog.num_merged_next_stage_vgprs + 4;
unsigned num_all_input_regs = key->vs_prolog.num_input_sgprs +
  num_input_vgprs;
unsigned user_sgpr_base =
key->vs_prolog.num_merged_next_stage_vgprs ? 8 : 0;

ctx->param_vertex_id = first_vs_vgpr;
-   ctx->param_instance_id = first_vs_vgpr + 3;
+   ctx->param_instance_id = first_vs_vgpr + (key->vs_prolog.as_ls ? 2
: 1);

/* 4 preloaded VGPRs + vertex load indices as prolog outputs */
params = alloca(num_all_input_regs * sizeof(LLVMTypeRef));
returns = alloca((num_all_input_regs + key->vs_prolog.last_input +
1) *
 sizeof(LLVMTypeRef));
num_params = 0;
num_returns = 0;

/* Declare input and output SGPRs. */
num_params = 0;
diff --git 
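
For readers following the register shuffle in the quoted patch, here is a minimal sketch of the VGPR order it declares. The struct and helper names are hypothetical and not part of Mesa; only the offsets are taken from the diff above (VertexID first, then RelAutoID and InstanceID for LS, or InstanceID and PrimitiveID otherwise, with the fourth VGPR unused).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical illustration, not Mesa code: offsets of the preloaded VS input
 * VGPRs relative to the first one, in the order the quoted patch declares
 * them. -1 means "not present in this configuration". */
struct vs_vgpr_layout {
   int vertex_id;
   int rel_auto_id;
   int instance_id;
   int vs_prim_id;
};

static struct vs_vgpr_layout vs_input_vgpr_layout(bool as_ls)
{
   struct vs_vgpr_layout l = {
      .vertex_id = 0, .rel_auto_id = -1, .instance_id = -1, .vs_prim_id = -1
   };

   if (as_ls) {
      /* VS running as LS (tessellation enabled) */
      l.rel_auto_id = 1;
      l.instance_id = 2;
   } else {
      l.instance_id = 1;
      l.vs_prim_id = 2;
   }
   /* Offset 3 is declared but unused in both cases. */
   return l;
}

/* Mirrors "first_vs_vgpr + (as_ls ? 2 : 1)" from the prolog hunk. */
static unsigned instance_id_param(unsigned first_vs_vgpr, bool as_ls)
{
   struct vs_vgpr_layout l = vs_input_vgpr_layout(as_ls);

   assert(l.instance_id >= 0);
   return first_vs_vgpr + (unsigned)l.instance_id;
}
```

The point of keeping both helpers is that the main shader part and the prolog have to agree on the same offsets; deriving one from the other makes a mismatch harder to introduce.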

Re: [Mesa-dev] [PATCH 43/61] radeonsi/gfx9: add GS prolog support for merged ES-GS

2017-04-28 Thread Nicolai Hähnle

On 28.04.2017 17:59, Marek Olšák wrote:

On Fri, Apr 28, 2017 at 1:25 PM, Nicolai Hähnle  wrote:

On 24.04.2017 10:45, Marek Olšák wrote:


From: Marek Olšák 

---
 src/gallium/drivers/radeonsi/si_shader.c | 87
+---
 1 file changed, 70 insertions(+), 17 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c
b/src/gallium/drivers/radeonsi/si_shader.c
index a4c2ac0..392f85d 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -7368,20 +7368,28 @@ static void si_count_scratch_private_memory(struct
si_shader_context *ctx)
LLVMTypeRef type =
LLVMGetElementType(LLVMTypeOf(inst));
/* No idea why LLVM aligns allocas to 4 elements.
*/
unsigned alignment = LLVMGetAlignment(inst);
unsigned dw_size = align(llvm_get_type_size(type)
/ 4, alignment);
ctx->shader->config.private_mem_vgprs += dw_size;
}
bb = LLVMGetNextBasicBlock(bb);
}
 }

+static void si_init_exec_full_mask(struct si_shader_context *ctx)
+{
+   LLVMValueRef full_mask = LLVMConstInt(ctx->i64, ~0ull, 0);
+   lp_build_intrinsic(ctx->gallivm.builder,
+  "llvm.amdgcn.init.exec", ctx->voidt,
+  &full_mask, 1, LP_FUNC_ATTR_CONVERGENT);
+}
+
 static void si_init_exec_from_input(struct si_shader_context *ctx,
unsigned param, unsigned bitoffset)
 {
LLVMValueRef args[] = {
LLVMGetParam(ctx->main_fn, param),
LLVMConstInt(ctx->i32, bitoffset, 0),
};
lp_build_intrinsic(ctx->gallivm.builder,
   "llvm.amdgcn.init.exec.from.input",
   ctx->voidt, args, 2, LP_FUNC_ATTR_CONVERGENT);
@@ -7681,79 +7689,128 @@ static void si_get_ps_epilog_key(struct si_shader
*shader,
key->ps_epilog.states = shader->key.part.ps.epilog;
 }

 /**
  * Build the GS prolog function. Rotate the input vertices for triangle
strips
  * with adjacency.
  */
 static void si_build_gs_prolog_function(struct si_shader_context *ctx,
union si_shader_part_key *key)
 {
-   const unsigned num_sgprs = GFX6_GS_NUM_USER_SGPR + 2;
-   const unsigned num_vgprs = 8;
+   unsigned num_sgprs, num_vgprs;
struct gallivm_state *gallivm = &ctx->gallivm;
LLVMBuilderRef builder = gallivm->builder;
-   LLVMTypeRef params[32];
-   LLVMTypeRef returns[32];
+   LLVMTypeRef params[48]; /* 40 SGPRs (maximum) + some VGPRs */
+   LLVMTypeRef returns[48];
LLVMValueRef func, ret;

+   if (ctx->screen->b.chip_class >= GFX9) {
+   num_sgprs = 8 + GFX9_GS_NUM_USER_SGPR;
+   num_vgprs = 5; /* ES inputs are not needed by GS */
+   } else {
+   num_sgprs = GFX6_GS_NUM_USER_SGPR + 2;
+   num_vgprs = 8;
+   }
+
for (unsigned i = 0; i < num_sgprs; ++i) {
params[i] = ctx->i32;
returns[i] = ctx->i32;
}

for (unsigned i = 0; i < num_vgprs; ++i) {
params[num_sgprs + i] = ctx->i32;
returns[num_sgprs + i] = ctx->f32;
}

/* Create the function. */
si_create_function(ctx, "gs_prolog", returns, num_sgprs +
num_vgprs,
   params, num_sgprs + num_vgprs, num_sgprs - 1);
func = ctx->main_fn;

+   /* Set the full EXEC mask for the prolog, because we are only
fiddling
+* with registers here. The main shader part will set the correct
EXEC
+* mask.
+*/
+   if (ctx->screen->b.chip_class >= GFX9)
+   si_init_exec_full_mask(ctx);
+
/* Copy inputs to outputs. This should be no-op, as the registers
match,
 * but it will prevent the compiler from overwriting them
unintentionally.
 */
ret = ctx->return_value;
for (unsigned i = 0; i < num_sgprs; i++) {
LLVMValueRef p = LLVMGetParam(func, i);
ret = LLVMBuildInsertValue(builder, ret, p, i, "");
}
for (unsigned i = 0; i < num_vgprs; i++) {
LLVMValueRef p = LLVMGetParam(func, num_sgprs + i);
p = LLVMBuildBitCast(builder, p, ctx->f32, "");
ret = LLVMBuildInsertValue(builder, ret, p, num_sgprs + i,
"");
}

if (key->gs_prolog.states.tri_strip_adj_fix) {
/* Remap the input vertices for every other primitive. */
-   const unsigned vtx_params[6] = {
+   const unsigned gfx6_vtx_params[6] = {
num_sgprs,
num_sgprs + 1,
num_sgprs + 3,
num_sgprs + 4,
num_sgprs + 5,
num_sgprs + 6
  

Re: [Mesa-dev] [PATCH 36/61] radeonsi/gfx9: set registers and shader key for merged ES-GS

2017-04-28 Thread Nicolai Hähnle
Fair enough on those magic numbers. It would be nice to understand them 
better though.



On 28.04.2017 17:56, Marek Olšák wrote:
[snip]

@@ -1721,20 +1893,26 @@ static void *si_create_shader_selector(struct
pipe_context *ctx,
break;
case TGSI_SEMANTIC_CLIPVERTEX: /* ignore these */
case TGSI_SEMANTIC_EDGEFLAG:
break;
default:
sel->outputs_written2 |=
1u <<
si_shader_io_get_unique_index2(name, index);
}
}
sel->esgs_itemsize = util_last_bit64(sel->outputs_written)
* 16;
+
+   /* For the ESGS ring in LDS, add 1 dword to reduce LDS
bank
+* conflicts, i.e. each vertex will start at a different
bank.
+*/
+   if (sctx->b.chip_class >= GFX9)
+   sel->esgs_itemsize += 4;



Could this not be achieved by some form of rounding instead?


What do you mean?


Actually, I think I was mistaken. There are 4 banks, and they're 
interleaved, right? So the idea is to have esgs_itemsize not be a 
multiple of 16 bytes, but a multiple of 16 bytes + 4 bytes. It makes 
sense to me now.
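
To make the padding argument concrete, here is a rough sketch that counts how many distinct banks the first dword of each vertex lands in. It assumes dword-wide, interleaved banks; the bank count is left as a parameter because the real number is only guessed at in this thread, and the function name is made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_BANKS 64

/* Hypothetical helper: count the distinct banks hit by the first dword of
 * each vertex, assuming dword-wide interleaved banks (bank = dword % banks). */
static unsigned distinct_start_banks(unsigned itemsize_bytes,
                                     unsigned num_banks,
                                     unsigned num_vertices)
{
   bool seen[MAX_BANKS] = { false };
   unsigned count = 0;

   assert(num_banks > 0 && num_banks <= MAX_BANKS);
   assert(itemsize_bytes % 4 == 0);

   for (unsigned v = 0; v < num_vertices; v++) {
      unsigned start_dword = v * (itemsize_bytes / 4);
      unsigned bank = start_dword % num_banks;

      if (!seen[bank]) {
         seen[bank] = true;
         count++;
      }
   }
   return count;
}
```

With 4 banks, a 16-byte item size puts every vertex start in the same bank, while 20 bytes (16 + 4) cycles through all four. The same effect holds for any power-of-two bank count, since the padded stride in dwords is then odd.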


This patch is also

Reviewed-by: Nicolai Hähnle 


--
Learn how the world really is,
but never forget how it ought to be.


Re: [Mesa-dev] [PATCH 1/4] r600g: avoid redundant DB register updates

2017-04-28 Thread Constantine Kharlamov
On 28.04.2017 21:19, Marc Dietrich wrote:
> On Friday, 28 April 2017, 16:53:55 CEST, Dieter Nützel wrote:
>> I'm running this, too.
>> But alone. 4/4 didn't apply any longer ;-)
>>
>> NO glitches on NI/Turks XT (6670).
>>
>> I had tested 'Heaven' and 'Valley' even with the former patch version.
>> The 'Heaven' GPU hang (wireframe/tessellation) is OLD; it has been there
>> for ages.
>> So:
>>
>> Tested-by: Dieter Nützel 
> 
> Dieter, your card is HD5000/Evergreen, while mine (RS880) is similar to 
> HD4200/r600. I'm OK if the patch only gets applied to the Evergreen+ 
> generations if the glitches for r600 can't get fixed. 
> 
> Marc

Sorry, I didn't have time to look more at it, but just for the record: I'll 
either find the problem with the r600 generation, or I won't try to get it 
merged. If I don't know why it doesn't work for the r600 series, I can't be 
sure that this patch won't break in a rare corner case on later generations.


Re: [Mesa-dev] [PATCH] glsl: set vector_elements to 1 for samplers

2017-04-28 Thread Mark Janes
Samuel Pitoiset  writes:

> On 04/28/2017 06:58 PM, Mark Janes wrote:
>> With this commit, a wide range of intel hardware began hanging during
>> the GLES CTS, with dmesg errors like:
>> 
>> [25488.739167] traps: glcts[15106] general protection ip:7fdac6484ba5 
>> sp:7ffdcda85a20 error:0
>> 
>> Machines that did complete the cts, reported hundreds of errors like:
>> 
>> *** Error in `/tmp/build_root/m64/bin/es/cts/glcts': malloc(): memory
>>  corruption: 0x562c7503b270 ***
>
> That's unfortunate. I don't have any Intel hw.

I can set up a CI build that will test your patches on Intel hardware.
Many non-Intel developers find that the exhaustive testing in our CI
alerts them to bugs in their patches.  It takes about 30 minutes to run.

Please give me a branch that you want to use for testing.  When you
force-push to that branch, a build will trigger.

> Kenneth is going to revert the patch for now.
>
> Would be very nice if someone with the hardware could have a look.
>
> Which tests are failing? Can you list some?

I put a list of failing piglit tests in
https://bugs.freedesktop.org/show_bug.cgi?id=100871

>> 
>> 
>> 
>> Samuel Pitoiset  writes:
>> 
>>> I don't see any reasons why vector_elements is 1 for images and
>>> 0 for samplers. This increases consistency and allows to clean
>>> up some code a bit.
>>>
>>> This will also help for ARB_bindless_texture.
>>>
>>> No piglit regressions with RadeonSI.
>>>
>>> Signed-off-by: Samuel Pitoiset 
>>> ---
>>>   src/compiler/glsl_types.cpp |  7 +--
>>>   src/mesa/main/uniform_query.cpp | 15 +--
>>>   2 files changed, 6 insertions(+), 16 deletions(-)
>>>
>>> diff --git a/src/compiler/glsl_types.cpp b/src/compiler/glsl_types.cpp
>>> index 0480bef80e..bf078ad614 100644
>>> --- a/src/compiler/glsl_types.cpp
>>> +++ b/src/compiler/glsl_types.cpp
>>> @@ -95,12 +95,7 @@ glsl_type::glsl_type(GLenum gl_type, glsl_base_type 
>>> base_type,
>>>   
>>>  memset(& fields, 0, sizeof(fields));
>>>   
>>> -   if (is_sampler()) {
>>> -  /* Samplers take no storage whatsoever. */
>>> -  matrix_columns = vector_elements = 0;
>>> -   } else {
>>> -  matrix_columns = vector_elements = 1;
>>> -   }
>>> +   matrix_columns = vector_elements = 1;
>>>   }
>>>   
>>>   glsl_type::glsl_type(const glsl_struct_field *fields, unsigned num_fields,
>>> diff --git a/src/mesa/main/uniform_query.cpp 
>>> b/src/mesa/main/uniform_query.cpp
>>> index e400d0eb00..114f6fb5be 100644
>>> --- a/src/mesa/main/uniform_query.cpp
>>> +++ b/src/mesa/main/uniform_query.cpp
>>> @@ -321,8 +321,7 @@ _mesa_get_uniform(struct gl_context *ctx, GLuint 
>>> program, GLint location,
>>>  }
>>>   
>>>  {
>>> -  unsigned elements = (uni->type->is_sampler())
>>> -? 1 : uni->type->components();
>>> +  unsigned elements = uni->type->components();
>>> const int dmul = uni->type->is_64bit() ? 2 : 1;
>>> const int rmul = glsl_base_type_is_64bit(returnType) ? 2 : 1;
>>>   
>>> @@ -648,10 +647,8 @@ _mesa_propagate_uniforms_to_driver_storage(struct 
>>> gl_uniform_storage *uni,
>>>   {
>>>  unsigned i;
>>>   
>>> -   /* vector_elements and matrix_columns can be 0 for samplers.
>>> -*/
>>> -   const unsigned components = MAX2(1, uni->type->vector_elements);
>>> -   const unsigned vectors = MAX2(1, uni->type->matrix_columns);
>>> +   const unsigned components = uni->type->vector_elements;
>>> +   const unsigned vectors = uni->type->matrix_columns;
>>>  const int dmul = uni->type->is_64bit() ? 2 : 1;
>>>   
>>>  /* Store the data in the driver's requested type in the driver's 
>>> storage
>>> @@ -803,8 +800,7 @@ validate_uniform(GLint location, GLsizei count, const 
>>> GLvoid *values,
>>>  }
>>>   
>>>  /* Verify that the types are compatible. */
>>> -   const unsigned components = uni->type->is_sampler()
>>> -  ? 1 : uni->type->vector_elements;
>>> +   const unsigned components = uni->type->vector_elements;
>>>   
>>>  if (components != src_components) {
>>> /* glUniformN() must match float/vecN type */
>>> @@ -925,8 +921,7 @@ _mesa_uniform(GLint location, GLsizei count, const 
>>> GLvoid *values,
>>>return;
>>>  }
>>>   
>>> -   const unsigned components = uni->type->is_sampler()
>>> -  ? 1 : uni->type->vector_elements;
>>> +   const unsigned components = uni->type->vector_elements;
>>>   
>>>  /* Page 82 (page 96 of the PDF) of the OpenGL 2.1 spec says:
>>>   *
>>> -- 
>>> 2.12.2
>>>
>>> ___
>>> mesa-dev mailing list
>>> mesa-dev@lists.freedesktop.org
>>> https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] egl: polish dri2_to_egl_attribute_map[]

2017-04-28 Thread Matt Turner
Nice change.

Reviewed-by: Matt Turner 


Re: [Mesa-dev] [PATCH] st/mesa: add more fallback gallium formats for GL integer formats

2017-04-28 Thread Marek Olšák
Reviewed-by: Marek Olšák 

Marek

On Fri, Apr 28, 2017 at 5:11 AM, Brian Paul  wrote:
> The VMware driver has a limited set of integer texture formats.  We
> often have to fall back to 4-component formats when 1- or 2-component
> formats are missing.
>
> This fixes about 8 integer texture Piglit tests with the VMware driver
> on Linux.  We've had this code in-house for a long time but I guess it
> was never up-streamed to Mesa master.
>
> This shouldn't regress any other drivers since we're either choosing
> an earlier format in the list, or failing anyway.
> ---
>  src/mesa/state_tracker/st_format.c | 50 
> +++---
>  1 file changed, 25 insertions(+), 25 deletions(-)
>
> diff --git a/src/mesa/state_tracker/st_format.c 
> b/src/mesa/state_tracker/st_format.c
> index 7901d50..012f1a4 100644
> --- a/src/mesa/state_tracker/st_format.c
> +++ b/src/mesa/state_tracker/st_format.c
> @@ -1186,7 +1186,7 @@ static const struct format_mapping format_map[] = {
> },
> {
>{ 1, GL_LUMINANCE, GL_LUMINANCE4, GL_LUMINANCE8, 0 },
> -  { PIPE_FORMAT_L8_UNORM, DEFAULT_RGB_FORMATS }
> +  { PIPE_FORMAT_L8_UNORM, PIPE_FORMAT_L8A8_UNORM, DEFAULT_RGB_FORMATS }
> },
>
> /* basic Luminance/Alpha formats */
> @@ -1682,101 +1682,101 @@ static const struct format_mapping format_map[] = {
> {
>{ GL_ALPHA_INTEGER_EXT,
>  GL_ALPHA8I_EXT, 0 },
> -  { PIPE_FORMAT_A8_SINT, 0 }
> +  { PIPE_FORMAT_A8_SINT, PIPE_FORMAT_R8G8B8A8_SINT, 0 }
> },
> {
>{ GL_ALPHA16I_EXT, 0 },
> -  { PIPE_FORMAT_A16_SINT, 0 }
> +  { PIPE_FORMAT_A16_SINT, PIPE_FORMAT_R16G16B16A16_SINT, 0 }
> },
> {
>{ GL_ALPHA32I_EXT, 0 },
> -  { PIPE_FORMAT_A32_SINT, 0 }
> +  { PIPE_FORMAT_A32_SINT, PIPE_FORMAT_R32G32B32A32_SINT, 0 }
> },
> {
>{ GL_ALPHA8UI_EXT, 0 },
> -  { PIPE_FORMAT_A8_UINT, 0 }
> +  { PIPE_FORMAT_A8_UINT, PIPE_FORMAT_R8G8B8A8_UINT, 0 }
> },
> {
>{ GL_ALPHA16UI_EXT, 0 },
> -  { PIPE_FORMAT_A16_UINT, 0 }
> +  { PIPE_FORMAT_A16_UINT, PIPE_FORMAT_R16G16B16A16_UINT, 0 }
> },
> {
>{ GL_ALPHA32UI_EXT, 0 },
> -  { PIPE_FORMAT_A32_UINT, 0 }
> +  { PIPE_FORMAT_A32_UINT, PIPE_FORMAT_R32G32B32A32_UINT, 0 }
> },
> {
>{ GL_INTENSITY8I_EXT, 0 },
> -  { PIPE_FORMAT_I8_SINT, 0 }
> +  { PIPE_FORMAT_I8_SINT, PIPE_FORMAT_R8G8B8A8_SINT, 0 }
> },
> {
>{ GL_INTENSITY16I_EXT, 0 },
> -  { PIPE_FORMAT_I16_SINT, 0 }
> +  { PIPE_FORMAT_I16_SINT, PIPE_FORMAT_R16G16B16A16_SINT, 0 }
> },
> {
>{ GL_INTENSITY32I_EXT, 0 },
> -  { PIPE_FORMAT_I32_SINT, 0 }
> +  { PIPE_FORMAT_I32_SINT, PIPE_FORMAT_R32G32B32A32_SINT, 0 }
> },
> {
>{ GL_INTENSITY8UI_EXT, 0 },
> -  { PIPE_FORMAT_I8_UINT, 0 }
> +  { PIPE_FORMAT_I8_UINT, PIPE_FORMAT_R8G8B8A8_UINT, 0 }
> },
> {
>{ GL_INTENSITY16UI_EXT, 0 },
> -  { PIPE_FORMAT_I16_UINT, 0 }
> +  { PIPE_FORMAT_I16_UINT, PIPE_FORMAT_R16G16B16A16_UINT, 0 }
> },
> {
>{ GL_INTENSITY32UI_EXT, 0 },
> -  { PIPE_FORMAT_I32_UINT, 0 }
> +  { PIPE_FORMAT_I32_UINT, PIPE_FORMAT_R32G32B32A32_UINT, 0 }
> },
> {
>{ GL_LUMINANCE8I_EXT, 0 },
> -  { PIPE_FORMAT_L8_SINT, 0 }
> +  { PIPE_FORMAT_L8_SINT, PIPE_FORMAT_R8G8B8A8_SINT, 0 }
> },
> {
>{ GL_LUMINANCE16I_EXT, 0 },
> -  { PIPE_FORMAT_L16_SINT, 0 }
> +  { PIPE_FORMAT_L16_SINT, PIPE_FORMAT_R16G16B16A16_SINT, 0 }
> },
> {
>{ GL_LUMINANCE32I_EXT, 0 },
> -  { PIPE_FORMAT_L32_SINT, 0 }
> +  { PIPE_FORMAT_L32_SINT, PIPE_FORMAT_R32G32B32A32_SINT, 0 }
> },
> {
>{ GL_LUMINANCE_INTEGER_EXT,
>  GL_LUMINANCE8UI_EXT, 0 },
> -  { PIPE_FORMAT_L8_UINT, 0 }
> +  { PIPE_FORMAT_L8_UINT, PIPE_FORMAT_R8G8B8A8_UINT, 0 }
> },
> {
>{ GL_LUMINANCE16UI_EXT, 0 },
> -  { PIPE_FORMAT_L16_UINT, 0 }
> +  { PIPE_FORMAT_L16_UINT, PIPE_FORMAT_R16G16B16A16_UINT, 0 }
> },
> {
>{ GL_LUMINANCE32UI_EXT, 0 },
> -  { PIPE_FORMAT_L32_UINT, 0 }
> +  { PIPE_FORMAT_L32_UINT, PIPE_FORMAT_R32G32B32A32_UINT, 0 }
> },
> {
>{ GL_LUMINANCE_ALPHA_INTEGER_EXT,
>  GL_LUMINANCE_ALPHA8I_EXT, 0 },
> -  { PIPE_FORMAT_L8A8_SINT, 0 }
> +  { PIPE_FORMAT_L8A8_SINT, PIPE_FORMAT_R8G8B8A8_SINT, 0 }
> },
> {
>{ GL_LUMINANCE_ALPHA16I_EXT, 0 },
> -  { PIPE_FORMAT_L16A16_SINT, 0 }
> +  { PIPE_FORMAT_L16A16_SINT, PIPE_FORMAT_R16G16B16A16_SINT, 0 }
> },
> {
>{ GL_LUMINANCE_ALPHA32I_EXT, 0 },
> -  { PIPE_FORMAT_L32A32_SINT, 0 }
> +  { PIPE_FORMAT_L32A32_SINT, PIPE_FORMAT_R32G32B32A32_SINT, 0 }
> },
> {
>{ GL_LUMINANCE_ALPHA8UI_EXT, 0 },
> -  { PIPE_FORMAT_L8A8_UINT, 0 }
> +  { PIPE_FORMAT_L8A8_UINT, PIPE_FORMAT_R8G8B8A8_UINT, 0 }
> },
> {
>{ 
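
As a rough illustration of the fallback scheme in the quoted patch: each entry lists candidate formats in order of preference, and the first one the driver supports wins. The sketch below is not the state tracker's code; the enum values, the predicate type and all function names are placeholders, and only the zero-terminated list convention is taken from the diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Placeholder format IDs; 0 terminates a candidate list, like the
 * zero-terminated lists in the quoted format_map[] entries. */
enum fmt { FMT_NONE = 0, FMT_A8_SINT, FMT_R8G8B8A8_SINT };

/* Stand-in for a driver query such as "is this format supported?". */
typedef bool (*fmt_supported_fn)(enum fmt f);

/* Return the first supported candidate, or FMT_NONE if none is supported. */
static enum fmt choose_first_supported(const enum fmt *candidates,
                                       fmt_supported_fn is_supported)
{
   assert(candidates != NULL && is_supported != NULL);

   for (size_t i = 0; candidates[i] != FMT_NONE; i++) {
      if (is_supported(candidates[i]))
         return candidates[i];
   }
   return FMT_NONE;
}

/* Example candidate list shaped like the patched GL_ALPHA8I_EXT entry. */
static const enum fmt alpha8i_candidates[] = {
   FMT_A8_SINT, FMT_R8G8B8A8_SINT, FMT_NONE
};

/* Made-up driver predicates for the example. */
static bool supports_only_rgba(enum fmt f) { return f == FMT_R8G8B8A8_SINT; }
static bool supports_everything(enum fmt f) { return f != FMT_NONE; }
static bool supports_nothing(enum fmt f) { (void)f; return false; }
```

This also shows why appending a fallback should not change behaviour for drivers that already support the first entry: the loop returns before it ever looks at the new candidate.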

Re: [Mesa-dev] [PATCH] mesa: optimize color_buffer_writes_enabled()

2017-04-28 Thread Marek Olšák
Reviewed-by: Marek Olšák 

Marek

On Fri, Apr 28, 2017 at 5:09 AM, Brian Paul  wrote:
> Return as soon as we find an existing color channel that's enabled for
> writing.  Typically, this allows us to return true on the first loop
> iteration instead of doing four iterations.
>
> No piglit regressions.
> ---
>  src/mesa/main/clear.c | 9 +
>  1 file changed, 5 insertions(+), 4 deletions(-)
>
> diff --git a/src/mesa/main/clear.c b/src/mesa/main/clear.c
> index a1bb36e..884cf98 100644
> --- a/src/mesa/main/clear.c
> +++ b/src/mesa/main/clear.c
> @@ -115,16 +115,17 @@ color_buffer_writes_enabled(const struct gl_context 
> *ctx, unsigned idx)
>  {
> struct gl_renderbuffer *rb = ctx->DrawBuffer->_ColorDrawBuffers[idx];
> GLuint c;
> -   GLubyte colorMask = 0;
>
> if (rb) {
>for (c = 0; c < 4; c++) {
> - if (_mesa_format_has_color_component(rb->Format, c))
> -colorMask |= ctx->Color.ColorMask[idx][c];
> + if (ctx->Color.ColorMask[idx][c] &&
> + _mesa_format_has_color_component(rb->Format, c)) {
> +return true;
> + }
>}
> }
>
> -   return colorMask != 0;
> +   return false;
>  }
>
>
> --
> 1.9.1
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
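
For reference, the early-return pattern from the quoted patch can be sketched outside of Mesa like this. The function name and the plain arrays are stand-ins for ctx->Color.ColorMask[idx] and _mesa_format_has_color_component(); they are assumptions made to keep the example self-contained.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for color_buffer_writes_enabled(): returns true as
 * soon as one channel is both write-enabled and present in the format. */
static bool any_channel_written(const uint8_t color_mask[4],
                                const bool format_has_channel[4])
{
   assert(color_mask != NULL && format_has_channel != NULL);

   for (unsigned c = 0; c < 4; c++) {
      /* Test the cheap mask first; only then consult the format. */
      if (color_mask[c] && format_has_channel[c])
         return true;
   }
   return false;
}

/* Sample inputs: write masks and per-format channel availability (RGBA). */
static const uint8_t mask_all[4] = { 0xff, 0xff, 0xff, 0xff };
static const uint8_t mask_alpha_only[4] = { 0, 0, 0, 0xff };
static const bool format_rgb[4] = { true, true, true, false };
static const bool format_rgba[4] = { true, true, true, true };
```

In the common case where all channels are enabled and the format has a red channel, the loop exits on its first iteration, which is the saving the commit message describes.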


Re: [Mesa-dev] [PATCH 04/15] travis: enable apt cache

2017-04-28 Thread Andres Gomez
On Fri, 2017-04-28 at 19:27 +0100, Emil Velikov wrote:
> On 28 April 2017 at 19:15, Emil Velikov  wrote:
> > On 28 April 2017 at 11:50, Andres Gomez  wrote:
> > > Do we want to do this?
> > > 
> > > According to Travis' own docs, there is little to no gain:
> > > https://docs.travis-ci.com/user/caching/#Things-not-to-cache
> > > 
> > 
> > The packages we use should not be slow to download, although I've not
> > checked explicitly. It seems counterintuitive to explicitly add apt
> > caching as an option yet advise against it :-\
> > 
> > Let me give it a spin and see what happens.
> > 
> 
> Using apt cache is a clear win - not a huge one, but still.
> 
> "make loaders/classic DRI" - 1s
> "scons SWR" - 6s

Let's keep the patch, then.

This is:

Reviewed-by: Andres Gomez 

-- 
Br,

Andres


[Mesa-dev] [PATCH 16/16] travis: bump MAKEFLAGS to -j4

2017-04-28 Thread Emil Velikov
From: Emil Velikov 

The instance should have 2 cores, yet bumping the jobs to 4 should give
us a minor speed improvement.

Signed-off-by: Emil Velikov 
Reviewed-by: Andres Gomez 
---
 .travis.yml | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/.travis.yml b/.travis.yml
index db3cb9517fe..5e060d0335c 100644
--- a/.travis.yml
+++ b/.travis.yml
@@ -31,7 +31,7 @@ matrix:
 - env:
 - LABEL="make loaders/classic DRI"
 - BUILD=make
-- MAKEFLAGS=-j2
+- MAKEFLAGS="-j4"
 - MAKE_CHECK_COMMAND="make check"
 - DRI_LOADERS="--enable-glx --enable-gbm --enable-egl 
--with-platforms=x11,drm,surfaceless,wayland --enable-osmesa"
 - DRI_DRIVERS="i915,i965,radeon,r200,swrast,nouveau"
@@ -51,7 +51,7 @@ matrix:
 # Start this early so that it doesn't hunder the run time.
 - LABEL="make Gallium Drivers SWR"
 - BUILD=make
-- MAKEFLAGS=-j2
+- MAKEFLAGS="-j4"
 - MAKE_CHECK_COMMAND="true"
 - LLVM_VERSION=3.9
 - LLVM_CONFIG="llvm-config-${LLVM_VERSION}"
@@ -82,7 +82,7 @@ matrix:
 - env:
 - LABEL="make Gallium Drivers Other"
 - BUILD=make
-- MAKEFLAGS=-j2
+- MAKEFLAGS="-j4"
 - MAKE_CHECK_COMMAND="true"
 - LLVM_VERSION=3.9
 - LLVM_CONFIG="llvm-config-${LLVM_VERSION}"
@@ -110,7 +110,7 @@ matrix:
 # NOTE: Analogous to SWR above, building Clover is quite slow.
 - LABEL="make Gallium ST Clover"
 - BUILD=make
-- MAKEFLAGS=-j2
+- MAKEFLAGS="-j4"
 - MAKE_CHECK_COMMAND="true"
 - LLVM_VERSION=3.6
 - LLVM_CONFIG="llvm-config-${LLVM_VERSION}"
@@ -145,7 +145,7 @@ matrix:
 - env:
 - LABEL="make Gallium ST Other"
 - BUILD=make
-- MAKEFLAGS=-j2
+- MAKEFLAGS="-j4"
 - MAKE_CHECK_COMMAND="true"
 - DRI_LOADERS="--disable-glx --disable-gbm --disable-egl"
 - DRI_DRIVERS=""
@@ -175,7 +175,7 @@ matrix:
 - env:
 - LABEL="make Vulkan"
 - BUILD=make
-- MAKEFLAGS=-j2
+- MAKEFLAGS="-j4"
 - MAKE_CHECK_COMMAND="make -C src/gtest check && make -C src/intel 
check"
 - LLVM_VERSION=3.9
 - LLVM_CONFIG="llvm-config-${LLVM_VERSION}"
-- 
2.12.2



[Mesa-dev] [PATCH 12/16] travis: split the make target to three separate ones

2017-04-28 Thread Emil Velikov
From: Emil Velikov 

Split the target to allow faster builds for each run.

The overall build time will be more, yet Travis runs multiple builds in
parallel so we're limited by the slowest one.

Things are split roughly as:
 - DRI loaders, classic DRI drivers, classic OSMesa, make check
 - All Gallium drivers (minus the SWR) alongside st/dri (mesa)
 - The Vulkan drivers - ANV and RADV, make check (anv)

v2:
 - rework RUN_CHECK to MAKE_CHECK_COMMAND
 - explicitly disable DRI loaders
 - generate linux/memfd.h locally and enable ANV
 - add libedit-dev

v3: Use printf to create the header (Andres).

Signed-off-by: Emil Velikov 
---
 .travis.yml | 93 ++---
 1 file changed, 77 insertions(+), 16 deletions(-)

diff --git a/.travis.yml b/.travis.yml
index 6548e85b767..5298fa11b67 100644
--- a/.travis.yml
+++ b/.travis.yml
@@ -26,28 +26,21 @@ env:
 matrix:
   include:
 - env:
-- LABEL="make"
+- LABEL="make loaders/classic DRI"
 - BUILD=make
 - MAKEFLAGS=-j2
-- LLVM_VERSION=3.9
-- LLVM_CONFIG="llvm-config-${LLVM_VERSION}"
+- MAKE_CHECK_COMMAND="make check"
+# XXX: Add wayland platform
+- DRI_LOADERS="--enable-glx --enable-gbm --enable-egl 
--with-platforms=x11,drm,surfaceless --enable-osmesa"
 - DRI_DRIVERS="i915,i965,radeon,r200,swrast,nouveau"
-- 
GALLIUM_DRIVERS="i915,nouveau,r300,r600,radeonsi,freedreno,svga,swrast,vc4,virgl,etnaviv,imx"
-- VULKAN_DRIVERS="radeon"
+- GALLIUM_DRIVERS=""
+- VULKAN_DRIVERS=""
   addons:
 apt:
-  sources:
-- llvm-toolchain-trusty-3.9
   packages:
-# LLVM packaging is broken and misses these dependencies
-- libedit-dev
-# From sources above
-- llvm-3.9-dev
-# Common
 - x11proto-xf86vidmode-dev
 - libexpat1-dev
 - libx11-xcb-dev
-- libelf-dev
 - env:
 # NOTE: Building SWR is 2x (yes two) times slower than all the other
 # gallium drivers combined.
@@ -55,10 +48,12 @@ matrix:
 - LABEL="make Gallium Drivers SWR"
 - BUILD=make
 - MAKEFLAGS=-j2
+- MAKE_CHECK_COMMAND="true"
 - LLVM_VERSION=3.9
 - LLVM_CONFIG="llvm-config-${LLVM_VERSION}"
 - OVERRIDE_CC="gcc-5"
 - OVERRIDE_CXX="g++-5"
+- DRI_LOADERS="--disable-glx --disable-gbm --disable-egl"
 - DRI_DRIVERS=""
 - GALLIUM_DRIVERS="swr"
 - VULKAN_DRIVERS=""
@@ -79,6 +74,57 @@ matrix:
 - libx11-xcb-dev
 - libelf-dev
 - env:
+- LABEL="make Gallium Drivers Other"
+- BUILD=make
+- MAKEFLAGS=-j2
+- MAKE_CHECK_COMMAND="true"
+- LLVM_VERSION=3.9
+- LLVM_CONFIG="llvm-config-${LLVM_VERSION}"
+- DRI_LOADERS="--disable-glx --disable-gbm --disable-egl"
+- DRI_DRIVERS=""
+- 
GALLIUM_DRIVERS="i915,nouveau,r300,r600,radeonsi,freedreno,svga,swrast,vc4,virgl,etnaviv,imx"
+- VULKAN_DRIVERS=""
+  addons:
+apt:
+  sources:
+- llvm-toolchain-trusty-3.9
+  packages:
+# From sources above
+- llvm-3.9-dev
+# Common
+- x11proto-xf86vidmode-dev
+- libexpat1-dev
+- libx11-xcb-dev
+- libelf-dev
+- env:
+- LABEL="make Vulkan"
+- BUILD=make
+- MAKEFLAGS=-j2
+- MAKE_CHECK_COMMAND="make -C src/gtest check && make -C src/intel 
check"
+- LLVM_VERSION=3.9
+- LLVM_CONFIG="llvm-config-${LLVM_VERSION}"
+# XXX: we want to test the WSI, but those are enabled via the EGL 
toggles
+# XXX: Add wayland platform
+# XXX: Platform X11 dependencies are checked when --enable-glx is set
+- DRI_LOADERS="--enable-glx --disable-gbm --enable-egl 
--with-platforms=x11"
+- DRI_DRIVERS=""
+- GALLIUM_DRIVERS=""
+- VULKAN_DRIVERS="intel,radeon"
+  addons:
+apt:
+  sources:
+- llvm-toolchain-trusty-3.9
+  packages:
+# LLVM packaging is broken and misses these dependencies
+- libedit-dev
+# From sources above
+- llvm-3.9-dev
+# Common
+- x11proto-xf86vidmode-dev
+- libexpat1-dev
+- libx11-xcb-dev
+- libelf-dev
+- env:
 - LABEL="scons"
 - BUILD=scons
 - SCONSFLAGS="-j4"
@@ -200,18 +246,33 @@ install:
   (cd $LIBTXC_DXTN_VERSION && ./configure --prefix=$HOME/prefix && make 
install);
 fi
 
+  # Generate the header since one is missing on the Travis instance
+  - mkdir -p linux
+  - echo "#ifndef _LINUX_MEMFD_H" > linux/memfd.h
+  - echo "#define _LINUX_MEMFD_H"

[Mesa-dev] [PATCH 07/16] travis: rework "if test" blocks in the script section

2017-04-28 Thread Emil Velikov
From: Emil Velikov 

Split the "if test" blocks so that we get more sensible output in case
of a failure.

Signed-off-by: Emil Velikov 
Reviewed-by: Andres Gomez 
---
 .travis.yml | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/.travis.yml b/.travis.yml
index 8921429c7e9..a4fe00d8023 100644
--- a/.travis.yml
+++ b/.travis.yml
@@ -114,6 +114,8 @@ script:
 --disable-llvm-shared-libs
 ;
   make && make check;
-elif test x$BUILD = xscons; then
+fi
+
+  - if test "x$BUILD" = xscons; then
   scons llvm=1 && scons llvm=1 check;
 fi
-- 
2.12.2



[Mesa-dev] [PATCH 09/16] travis: add separate "scons" and "scons llvm" targets

2017-04-28 Thread Emil Velikov
From: Emil Velikov 

The former does not require any LLVM, while the latter uses LLVM 3.3.

This way we'll quickly catch any LLVM 3.3+ functionality that gets
introduced where it shouldn't.

Add the full list of addons for each build permutation.

v2: Keep libedit-dev, rework check target.
v3: Comment the current check target, add -j4 SCONSFLAGS
v4:
 - Remove llvm-toolchain-trusty-3.3 source (Andres)
 - Keep check target as-is (Andres)

Signed-off-by: Emil Velikov 
Reviewed-by: Andres Gomez 
---
 .travis.yml | 61 +
 1 file changed, 45 insertions(+), 16 deletions(-)

diff --git a/.travis.yml b/.travis.yml
index 0d87c663bb2..38f55713511 100644
--- a/.travis.yml
+++ b/.travis.yml
@@ -7,20 +7,6 @@ cache:
   apt: true
   ccache: true
 
-addons:
-  apt:
-sources:
-  - llvm-toolchain-trusty-3.9
-packages:
-  - x11proto-xf86vidmode-dev
-  - libexpat1-dev
-  - libx11-xcb-dev
-  # LLVM packaging is broken and misses these dependencies
-  - libedit-dev
-  - llvm-3.9-dev
-  - libelf-dev
-  - scons
-
 env:
   global:
 - XORG_RELEASES=http://xorg.freedesktop.org/releases/individual
@@ -34,8 +20,6 @@ env:
 - LIBXCB_VERSION=libxcb-1.11
 - LIBXSHMFENCE_VERSION=libxshmfence-1.2
 - LIBTXC_DXTN_VERSION=libtxc_dxtn-1.0.1
-- LLVM_VERSION=3.9
-- LLVM_CONFIG="llvm-config-${LLVM_VERSION}"
 - PKG_CONFIG_PATH=$HOME/prefix/lib/pkgconfig
 - LD_LIBRARY_PATH="$HOME/prefix/lib:$LD_LIBRARY_PATH"
 
@@ -45,11 +29,56 @@ matrix:
 - LABEL="make"
 - BUILD=make
 - MAKEFLAGS=-j2
+- LLVM_VERSION=3.9
+- LLVM_CONFIG="llvm-config-${LLVM_VERSION}"
+  addons:
+apt:
+  sources:
+- llvm-toolchain-trusty-3.9
+  packages:
+# LLVM packaging is broken and misses these dependencies
+- libedit-dev
+# From sources above
+- llvm-3.9-dev
+# Common
+- x11proto-xf86vidmode-dev
+- libexpat1-dev
+- libx11-xcb-dev
+- libelf-dev
+- env:
+- LABEL="scons"
+- BUILD=scons
+- SCONSFLAGS="-j4"
+# Explicitly disable.
+- SCONS_TARGET="llvm=0"
+  addons:
+apt:
+  packages:
+- scons
+# Common
+- x11proto-xf86vidmode-dev
+- libexpat1-dev
+- libx11-xcb-dev
+- libelf-dev
 - env:
 - LABEL="scons LLVM"
 - BUILD=scons
 - SCONSFLAGS="-j4"
 - SCONS_TARGET="llvm=1"
+- LLVM_VERSION=3.3
+- LLVM_CONFIG="llvm-config-${LLVM_VERSION}"
+  addons:
+apt:
+  packages:
+- scons
+# LLVM packaging is broken and misses these dependencies
+- libedit-dev
+- llvm-3.3-dev
+# Common
+- x11proto-xf86vidmode-dev
+- libexpat1-dev
+- libx11-xcb-dev
+- libelf-dev
 
 install:
   - pip install --user mako
-- 
2.12.2



[Mesa-dev] [PATCH 14/16] travis: add Gallium state-tracker targets

2017-04-28 Thread Emil Velikov
From: Emil Velikov 

Split into OpenCL and others, since the former is quite time consuming.

v2:
 - explicitly enable/disable components
 - build libvdpau 1.1 requirement
 - enable st/vdpau
 - build libva 1.6.2 (API 0.38) requirement

v3: Drop ubuntu-toolchain-r-test from sources (Andres)

Signed-off-by: Emil Velikov 
Reviewed-by: Andres Gomez 
---
 .travis.yml | 103 ++--
 1 file changed, 93 insertions(+), 10 deletions(-)

diff --git a/.travis.yml b/.travis.yml
index ec76cf7c9cb..563d9e25379 100644
--- a/.travis.yml
+++ b/.travis.yml
@@ -20,6 +20,8 @@ env:
 - LIBXCB_VERSION=libxcb-1.11
 - LIBXSHMFENCE_VERSION=libxshmfence-1.2
 - LIBTXC_DXTN_VERSION=libtxc_dxtn-1.0.1
+- LIBVDPAU_VERSION=libvdpau-1.1
+- LIBVA_VERSION=libva-1.6.2
 - PKG_CONFIG_PATH=$HOME/prefix/lib/pkgconfig
 - LD_LIBRARY_PATH="$HOME/prefix/lib:$LD_LIBRARY_PATH"
 
@@ -33,6 +35,7 @@ matrix:
 # XXX: Add wayland platform
 - DRI_LOADERS="--enable-glx --enable-gbm --enable-egl --with-platforms=x11,drm,surfaceless --enable-osmesa"
 - DRI_DRIVERS="i915,i965,radeon,r200,swrast,nouveau"
+- GALLIUM_ST="--enable-dri --disable-opencl --disable-xa --disable-nine --disable-xvmc --disable-vdpau --disable-va --disable-omx --disable-gallium-osmesa"
 - GALLIUM_DRIVERS=""
 - VULKAN_DRIVERS=""
   addons:
@@ -55,6 +58,7 @@ matrix:
 - OVERRIDE_CXX="g++-5"
 - DRI_LOADERS="--disable-glx --disable-gbm --disable-egl"
 - DRI_DRIVERS=""
+- GALLIUM_ST="--enable-dri --disable-opencl --disable-xa --disable-nine --disable-xvmc --disable-vdpau --disable-va --disable-omx --disable-gallium-osmesa"
 - GALLIUM_DRIVERS="swr"
 - VULKAN_DRIVERS=""
   addons:
@@ -82,6 +86,7 @@ matrix:
 - LLVM_CONFIG="llvm-config-${LLVM_VERSION}"
 - DRI_LOADERS="--disable-glx --disable-gbm --disable-egl"
 - DRI_DRIVERS=""
+- GALLIUM_ST="--enable-dri --disable-opencl --disable-xa --disable-nine --disable-xvmc --disable-vdpau --disable-va --disable-omx --disable-gallium-osmesa"
 - GALLIUM_DRIVERS="i915,nouveau,r300,r600,radeonsi,freedreno,svga,swrast,vc4,virgl,etnaviv,imx"
 - VULKAN_DRIVERS=""
   addons:
@@ -89,6 +94,8 @@ matrix:
   sources:
 - llvm-toolchain-trusty-3.9
   packages:
+# LLVM packaging is broken and misses these dependencies
+- libedit-dev
 # From sources above
 - llvm-3.9-dev
 # Common
@@ -97,6 +104,70 @@ matrix:
 - libx11-xcb-dev
 - libelf-dev
 - env:
+# NOTE: Analogous to SWR above, building Clover is quite slow.
+- LABEL="make Gallium ST Clover"
+- BUILD=make
+- MAKEFLAGS=-j2
+- MAKE_CHECK_COMMAND="true"
+- LLVM_VERSION=3.6
+- LLVM_CONFIG="llvm-config-${LLVM_VERSION}"
+- OVERRIDE_CC=gcc-4.7
+- OVERRIDE_CXX=g++-4.7
+- DRI_LOADERS="--disable-glx --disable-gbm --disable-egl"
+- DRI_DRIVERS=""
+- GALLIUM_ST="--disable-dri --enable-opencl --enable-opencl-icd --enable-llvm --disable-xa --disable-nine --disable-xvmc --disable-vdpau --disable-va --disable-omx --disable-gallium-osmesa"
+# i915 most likely doesn't work with OpenCL.
+# Regardless - we're doing a quick build test here.
+- GALLIUM_DRIVERS="i915"
+- VULKAN_DRIVERS=""
+  addons:
+apt:
+  sources:
+- llvm-toolchain-trusty-3.6
+  packages:
+- libclc-dev
+# LLVM packaging is broken and misses these dependencies
+- libedit-dev
+- g++-4.7
+# From sources above
+- llvm-3.6-dev
+- clang-3.6
+- libclang-3.6-dev
+# Common
+- x11proto-xf86vidmode-dev
+- libexpat1-dev
+- libx11-xcb-dev
+- libelf-dev
+- env:
+- LABEL="make Gallium ST Other"
+- BUILD=make
+- MAKEFLAGS=-j2
+- MAKE_CHECK_COMMAND="true"
+- DRI_LOADERS="--disable-glx --disable-gbm --disable-egl"
+- DRI_DRIVERS=""
+- GALLIUM_ST="--enable-dri --disable-opencl --enable-xa --enable-nine --enable-xvmc --enable-vdpau --enable-va --enable-omx --enable-gallium-osmesa"
+# We need swrast for osmesa and nine.
+# i915 most likely doesn't work with most ST.
+# Regardless - we're doing a quick build test here.
+- GALLIUM_DRIVERS="i915,swrast"
+- VULKAN_DRIVERS=""
+  addons:
+apt:
+  packages:
+# Nine requires gcc 4.6... which is the one we have right ?
+- libxvmc-dev
+# Build locally, for now.
+#- libvdpau-dev
+#- libva-dev
+- libomxil-bellagio-dev
+ 

[Mesa-dev] [PATCH 05/16] travis: automatically manage ccache caching

2017-04-28 Thread Emil Velikov
From: Emil Velikov 

According to the manual

"If you are using ccache, use:

  language: c # or other C/C++ variants

  cache: ccache

to cache $HOME/.ccache and automatically add /usr/lib/ccache to your
$PATH."

Signed-off-by: Emil Velikov 
Reviewed-by: Andres Gomez 
---
 .travis.yml | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/.travis.yml b/.travis.yml
index 061aed1bc7c..f34b762a4e5 100644
--- a/.travis.yml
+++ b/.travis.yml
@@ -5,8 +5,7 @@ dist: trusty
 
 cache:
   apt: true
-  directories:
-- $HOME/.ccache
+  ccache: true
 
 addons:
   apt:
@@ -47,7 +46,6 @@ env:
 - BUILD=scons
 
 install:
-  - export PATH="/usr/lib/ccache:$PATH"
   - pip install --user mako
 
   # Since libdrm gets updated in configure.ac regularly, try to pick up the
-- 
2.12.2



[Mesa-dev] [PATCH 06/16] travis: remove unused -dev packages

2017-04-28 Thread Emil Velikov
From: Emil Velikov 

We effectively override libdrm-dev and libxcb-dri2-0-dev since we build
and install both packages locally.

Signed-off-by: Emil Velikov 
Reviewed-by: Andres Gomez 
---
 .travis.yml | 2 --
 1 file changed, 2 deletions(-)

diff --git a/.travis.yml b/.travis.yml
index f34b762a4e5..8921429c7e9 100644
--- a/.travis.yml
+++ b/.travis.yml
@@ -12,10 +12,8 @@ addons:
 sources:
   - llvm-toolchain-trusty-3.9
 packages:
-  - libdrm-dev
   - x11proto-xf86vidmode-dev
   - libexpat1-dev
-  - libxcb-dri2-0-dev
   - libx11-xcb-dev
   # LLVM packaging is broken and misses these dependencies
   - libedit-dev
-- 
2.12.2



[Mesa-dev] [PATCH 15/16] travis: enable wayland support

2017-04-28 Thread Emil Velikov
From: Emil Velikov 

Signed-off-by: Emil Velikov 
Reviewed-by: Andres Gomez 
---
 .travis.yml | 20 
 1 file changed, 16 insertions(+), 4 deletions(-)

diff --git a/.travis.yml b/.travis.yml
index 563d9e25379..db3cb9517fe 100644
--- a/.travis.yml
+++ b/.travis.yml
@@ -22,6 +22,7 @@ env:
 - LIBTXC_DXTN_VERSION=libtxc_dxtn-1.0.1
 - LIBVDPAU_VERSION=libvdpau-1.1
 - LIBVA_VERSION=libva-1.6.2
+- LIBWAYLAND_VERSION=wayland-1.11.1
 - PKG_CONFIG_PATH=$HOME/prefix/lib/pkgconfig
 - LD_LIBRARY_PATH="$HOME/prefix/lib:$LD_LIBRARY_PATH"
 
@@ -32,8 +33,7 @@ matrix:
 - BUILD=make
 - MAKEFLAGS=-j2
 - MAKE_CHECK_COMMAND="make check"
-# XXX: Add wayland platform
-- DRI_LOADERS="--enable-glx --enable-gbm --enable-egl --with-platforms=x11,drm,surfaceless --enable-osmesa"
+- DRI_LOADERS="--enable-glx --enable-gbm --enable-egl --with-platforms=x11,drm,surfaceless,wayland --enable-osmesa"
 - DRI_DRIVERS="i915,i965,radeon,r200,swrast,nouveau"
 - GALLIUM_ST="--enable-dri --disable-opencl --disable-xa --disable-nine --disable-xvmc --disable-vdpau --disable-va --disable-omx --disable-gallium-osmesa"
 - GALLIUM_DRIVERS=""
@@ -41,6 +41,7 @@ matrix:
   addons:
 apt:
   packages:
+- xz-utils
 - x11proto-xf86vidmode-dev
 - libexpat1-dev
 - libx11-xcb-dev
@@ -73,6 +74,7 @@ matrix:
 - g++-5
 - llvm-3.9-dev
 # Common
+- xz-utils
 - x11proto-xf86vidmode-dev
 - libexpat1-dev
 - libx11-xcb-dev
@@ -99,6 +101,7 @@ matrix:
 # From sources above
 - llvm-3.9-dev
 # Common
+- xz-utils
 - x11proto-xf86vidmode-dev
 - libexpat1-dev
 - libx11-xcb-dev
@@ -134,6 +137,7 @@ matrix:
 - clang-3.6
 - libclang-3.6-dev
 # Common
+- xz-utils
 - x11proto-xf86vidmode-dev
 - libexpat1-dev
 - libx11-xcb-dev
@@ -163,6 +167,7 @@ matrix:
 # LLVM packaging is broken and misses these dependencies
 - libedit-dev
 # Common
+- xz-utils
 - x11proto-xf86vidmode-dev
 - libexpat1-dev
 - libx11-xcb-dev
@@ -175,9 +180,8 @@ matrix:
 - LLVM_VERSION=3.9
 - LLVM_CONFIG="llvm-config-${LLVM_VERSION}"
 # XXX: we want to test the WSI, but those are enabled via the EGL toggles
-# XXX: Add wayland platform
 # XXX: Platform X11 dependencies are checked when --enable-glx is set
-- DRI_LOADERS="--enable-glx --disable-gbm --enable-egl --with-platforms=x11"
+- DRI_LOADERS="--enable-glx --disable-gbm --enable-egl --with-platforms=x11,wayland"
 - DRI_DRIVERS=""
 # XXX: enable DRI for EGL above
 - GALLIUM_ST="--enable-dri --disable-opencl --disable-xa --disable-nine --disable-xvmc --disable-vdpau --disable-va --disable-omx --disable-gallium-osmesa"
@@ -193,6 +197,7 @@ matrix:
 # From sources above
 - llvm-3.9-dev
 # Common
+- xz-utils
 - x11proto-xf86vidmode-dev
 - libexpat1-dev
 - libx11-xcb-dev
@@ -210,6 +215,7 @@ matrix:
   packages:
 - scons
 # Common
+- xz-utils
 - x11proto-xf86vidmode-dev
 - libexpat1-dev
 - libx11-xcb-dev
@@ -231,6 +237,7 @@ matrix:
 - libedit-dev
 - llvm-3.3-dev
 # Common
+- xz-utils
 - x11proto-xf86vidmode-dev
 - libexpat1-dev
 - libx11-xcb-dev
@@ -259,6 +266,7 @@ matrix:
 - g++-5
 - llvm-3.9-dev
 # Common
+- xz-utils
 - x11proto-xf86vidmode-dev
 - libexpat1-dev
 - libx11-xcb-dev
@@ -333,6 +341,10 @@ install:
   - tar -jxvf $LIBVA_VERSION.tar.bz2
   - (cd $LIBVA_VERSION && ./configure --prefix=$HOME/prefix --disable-wayland --disable-dummy-driver && make install)
 
+  - wget http://wayland.freedesktop.org/releases/$LIBWAYLAND_VERSION.tar.xz
+  - tar -axvf $LIBWAYLAND_VERSION.tar.xz
+  - (cd $LIBWAYLAND_VERSION && ./configure --prefix=$HOME/prefix --enable-libraries --without-host-scanner --disable-documentation --disable-dtd-validation && make install)
+
   # Generate the header since one is missing on the Travis instance
   - mkdir -p linux
   - printf "%s\n" \
-- 
2.12.2



[Mesa-dev] [PATCH 04/16] travis: enable apt cache

2017-04-28 Thread Emil Velikov
From: Emil Velikov 

Provides a small but consistent improvement.
Example numbers for jobs added later in the series:

"make loaders/classic DRI" - 1s
"scons SWR" - 6s

Signed-off-by: Emil Velikov 
---
 .travis.yml | 1 +
 1 file changed, 1 insertion(+)

diff --git a/.travis.yml b/.travis.yml
index e317a027233..061aed1bc7c 100644
--- a/.travis.yml
+++ b/.travis.yml
@@ -4,6 +4,7 @@ sudo: false
 dist: trusty
 
 cache:
+  apt: true
   directories:
 - $HOME/.ccache
 
-- 
2.12.2



[Mesa-dev] [PATCH 02/16] travis: replace Trusty-based LLVM toolchain apt-get with apt addon

2017-04-28 Thread Emil Velikov
From: Andres Gomez 

Trusty's LLVM toolchain repository was whitelisted some time ago. See:
https://github.com/travis-ci/apt-source-whitelist/commit/479067c5e74cb0c1e2419209179b1afe2edce274

Signed-off-by: Andres Gomez 
[Emil Velikov]
 - set sudo to false
 - reference the Trusty change (Rhys)
 - keep libedit-dev
Signed-off-by: Emil Velikov 
---
 .travis.yml | 15 ---
 1 file changed, 4 insertions(+), 11 deletions(-)

diff --git a/.travis.yml b/.travis.yml
index 49158a09fce..efb8f286c87 100644
--- a/.travis.yml
+++ b/.travis.yml
@@ -1,6 +1,6 @@
 language: c
 
-sudo: required
+sudo: false
 dist: trusty
 
 cache:
@@ -9,6 +9,8 @@ cache:
 
 addons:
   apt:
+sources:
+  - llvm-toolchain-trusty-3.9
 packages:
   - libdrm-dev
   - x11proto-xf86vidmode-dev
@@ -17,6 +19,7 @@ addons:
   - libx11-xcb-dev
   # LLVM packaging is broken and misses these dependencies
   - libedit-dev
+  - llvm-3.9-dev
   - libelf-dev
   - scons
 
@@ -33,7 +36,6 @@ env:
 - LIBXCB_VERSION=libxcb-1.11
 - LIBXSHMFENCE_VERSION=libxshmfence-1.2
 - LLVM_VERSION=3.9
-- LLVM_PACKAGE="llvm-${LLVM_VERSION} llvm-${LLVM_VERSION}-dev"
 - LLVM_CONFIG="llvm-config-${LLVM_VERSION}"
 - PKG_CONFIG_PATH=$HOME/prefix/lib/pkgconfig
 - LD_LIBRARY_PATH="$HOME/prefix/lib:$LD_LIBRARY_PATH"
@@ -91,15 +93,6 @@ install:
   - tar -jxvf $LIBXSHMFENCE_VERSION.tar.bz2
   - (cd $LIBXSHMFENCE_VERSION && ./configure --prefix=$HOME/prefix && make install)
 
-  # Install LLVM directly via apt-get (not Travis-CI's apt addon)
-  # See https://github.com/travis-ci/apt-source-whitelist/pull/205#issuecomment-216054237
-
-  - wget -nv -O - http://llvm.org/apt/llvm-snapshot.gpg.key | sudo apt-key add -
-  - sudo apt-add-repository -y 'deb http://llvm.org/apt/trusty llvm-toolchain-trusty-3.9 main'
-  - sudo apt-add-repository -y 'deb http://llvm.org/apt/trusty llvm-toolchain-trusty main'
-  - sudo apt-get update -qq
-  - sudo apt-get install -qq -y $LLVM_PACKAGE
-
 script:
   - if test "x$BUILD" = xmake; then
   ./autogen.sh --enable-debug
-- 
2.12.2



[Mesa-dev] [PATCH 13/16] travis: model scons check target like the make one

2017-04-28 Thread Emil Velikov
From: Emil Velikov 

Should make things a bit more consistent across the board.

Cc: Eric Engestrom 
CC: Andres Gomez 
Signed-off-by: Emil Velikov 
---
 .travis.yml | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/.travis.yml b/.travis.yml
index 5298fa11b67..ec76cf7c9cb 100644
--- a/.travis.yml
+++ b/.travis.yml
@@ -130,6 +130,8 @@ matrix:
 - SCONSFLAGS="-j4"
 # Explicitly disable.
 - SCONS_TARGET="llvm=0"
+# Keep it symmetrical to the make build.
+- SCONS_CHECK_COMMAND="scons llvm=0 check"
   addons:
 apt:
   packages:
@@ -144,6 +146,8 @@ matrix:
 - BUILD=scons
 - SCONSFLAGS="-j4"
 - SCONS_TARGET="llvm=1"
+# Keep it symmetrical to the make build.
+- SCONS_CHECK_COMMAND="scons llvm=1 check"
 - LLVM_VERSION=3.3
 - LLVM_CONFIG="llvm-config-${LLVM_VERSION}"
   addons:
@@ -165,6 +169,8 @@ matrix:
 - SCONS_TARGET="swr=1"
 - LLVM_VERSION=3.9
 - LLVM_CONFIG="llvm-config-${LLVM_VERSION}"
+# Keep it symmetrical to the make build. There's no actual SWR, yet.
+- SCONS_CHECK_COMMAND="true"
 - OVERRIDE_CC="gcc-5"
 - OVERRIDE_CXX="g++-5"
   addons:
@@ -278,5 +284,5 @@ script:
   - if test "x$BUILD" = xscons; then
   test -n "$OVERRIDE_CC" && export CC="$OVERRIDE_CC";
   test -n "$OVERRIDE_CXX" && export CXX="$OVERRIDE_CXX";
-  scons $SCONS_TARGET && scons $SCONS_TARGET check;
+  scons $SCONS_TARGET && eval $SCONS_CHECK_COMMAND;
 fi
-- 
2.12.2


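A rough sketch of the "build, then run a per-job check command" idea from the SCONS_CHECK_COMMAND patch, assuming a POSIX shell. The helper build_and_check and the echo commands are hypothetical; the actual script runs scons with $SCONS_TARGET and then evaluates $SCONS_CHECK_COMMAND. A job without tests can set its check command to "true", which succeeds without doing anything:

```shell
#!/bin/sh
# Hypothetical helper: run a build command, then the job's check command.
# The check only runs when the build succeeded, as with `a && b` in the script.
build_and_check() {
    build_cmd="$1"
    check_cmd="$2"
    # eval re-parses the string, so a multi-word command such as
    # "scons llvm=1 check" is split into a command plus arguments.
    eval "$build_cmd" && eval "$check_cmd"
}

build_and_check "echo built" "echo checked"
build_and_check "echo built" "true"
```

Keeping the check command in a variable lets each matrix entry decide what "check" means without adding more branches to the script section.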

[Mesa-dev] [PATCH 10/16] travis: add "scons swr" to the build matrix

2017-04-28 Thread Emil Velikov
From: Emil Velikov 

Requires GCC 5.0 (due to the C++14 requirement) and LLVM 3.9.

v2: Enable the target, add libedit-dev, rework check target.
v3: Comment the current check target, add -j4 SCONSFLAGS, quote OVERRIDE
variables.
v4: Keep check target as-is (Andres)

Cc: Tim Rowley 
Cc: George Kyriazis 
Reviewed-by: George Kyriazis 
Signed-off-by: Emil Velikov 
Reviewed-by: Andres Gomez 
---
 .travis.yml | 28 
 1 file changed, 28 insertions(+)

diff --git a/.travis.yml b/.travis.yml
index 38f55713511..be394f31279 100644
--- a/.travis.yml
+++ b/.travis.yml
@@ -79,6 +79,32 @@ matrix:
 - libexpat1-dev
 - libx11-xcb-dev
 - libelf-dev
+- env:
+- LABEL="scons SWR"
+- BUILD=scons
+- SCONSFLAGS="-j4"
+- SCONS_TARGET="swr=1"
+- LLVM_VERSION=3.9
+- LLVM_CONFIG="llvm-config-${LLVM_VERSION}"
+- OVERRIDE_CC="gcc-5"
+- OVERRIDE_CXX="g++-5"
+  addons:
+apt:
+  sources:
+- ubuntu-toolchain-r-test
+- llvm-toolchain-trusty-3.9
+  packages:
+- scons
+# LLVM packaging is broken and misses these dependencies
+- libedit-dev
+# From sources above
+- g++-5
+- llvm-3.9-dev
+# Common
+- x11proto-xf86vidmode-dev
+- libexpat1-dev
+- libx11-xcb-dev
+- libelf-dev
 
 install:
   - pip install --user mako
@@ -154,5 +180,7 @@ script:
 fi
 
   - if test "x$BUILD" = xscons; then
+  test -n "$OVERRIDE_CC" && export CC="$OVERRIDE_CC";
+  test -n "$OVERRIDE_CXX" && export CXX="$OVERRIDE_CXX";
   scons $SCONS_TARGET && scons $SCONS_TARGET check;
 fi
-- 
2.12.2


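The OVERRIDE_CC / OVERRIDE_CXX lines added to the script section can be read as "only replace the compiler when the job asks for it". A small sketch under that reading, assuming a POSIX shell; the function name apply_compiler_overrides and the starting values cc and c++ are illustrative only:

```shell
#!/bin/sh
# Hypothetical helper mirroring the two `test -n ... && export` lines.
# CC/CXX are only touched when the matching OVERRIDE_* variable is non-empty.
apply_compiler_overrides() {
    test -n "$OVERRIDE_CC" && export CC="$OVERRIDE_CC"
    test -n "$OVERRIDE_CXX" && export CXX="$OVERRIDE_CXX"
    return 0   # an empty override is not an error
}

CC=cc
CXX=c++
OVERRIDE_CC="gcc-5"
OVERRIDE_CXX=""
apply_compiler_overrides
echo "CC=$CC CXX=$CXX"   # prints "CC=gcc-5 CXX=c++"
```

Jobs that do not define the override variables keep whatever compiler the environment already provides.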

[Mesa-dev] [PATCH 11/16] travis: add "make swr" to the build matrix

2017-04-28 Thread Emil Velikov
From: Emil Velikov 

v2: Quote OVERRIDE variables.
v3: Add misplaced libedit-dev hunk (Andres).

Signed-off-by: Emil Velikov 
Reviewed-by: Andres Gomez 
---
 .travis.yml | 41 ++---
 1 file changed, 38 insertions(+), 3 deletions(-)

diff --git a/.travis.yml b/.travis.yml
index be394f31279..6548e85b767 100644
--- a/.travis.yml
+++ b/.travis.yml
@@ -31,6 +31,9 @@ matrix:
 - MAKEFLAGS=-j2
 - LLVM_VERSION=3.9
 - LLVM_CONFIG="llvm-config-${LLVM_VERSION}"
+- DRI_DRIVERS="i915,i965,radeon,r200,swrast,nouveau"
+- GALLIUM_DRIVERS="i915,nouveau,r300,r600,radeonsi,freedreno,svga,swrast,vc4,virgl,etnaviv,imx"
+- VULKAN_DRIVERS="radeon"
   addons:
 apt:
   sources:
@@ -46,6 +49,36 @@ matrix:
 - libx11-xcb-dev
 - libelf-dev
 - env:
+# NOTE: Building SWR is 2x (yes two) times slower than all the other
+# gallium drivers combined.
+# Start this early so that it doesn't hinder the run time.
+- LABEL="make Gallium Drivers SWR"
+- BUILD=make
+- MAKEFLAGS=-j2
+- LLVM_VERSION=3.9
+- LLVM_CONFIG="llvm-config-${LLVM_VERSION}"
+- OVERRIDE_CC="gcc-5"
+- OVERRIDE_CXX="g++-5"
+- DRI_DRIVERS=""
+- GALLIUM_DRIVERS="swr"
+- VULKAN_DRIVERS=""
+  addons:
+apt:
+  sources:
+- ubuntu-toolchain-r-test
+- llvm-toolchain-trusty-3.9
+  packages:
+# LLVM packaging is broken and misses these dependencies
+- libedit-dev
+# From sources above
+- g++-5
+- llvm-3.9-dev
+# Common
+- x11proto-xf86vidmode-dev
+- libexpat1-dev
+- libx11-xcb-dev
+- libelf-dev
+- env:
 - LABEL="scons"
 - BUILD=scons
 - SCONSFLAGS="-j4"
@@ -169,11 +202,13 @@ install:
 
 script:
   - if test "x$BUILD" = xmake; then
+  test -n "$OVERRIDE_CC" && export CC="$OVERRIDE_CC";
+  test -n "$OVERRIDE_CXX" && export CXX="$OVERRIDE_CXX";
   ./autogen.sh --enable-debug
 --with-platforms=x11,drm
---with-dri-drivers=i915,i965,radeon,r200,swrast,nouveau
---with-gallium-drivers=i915,nouveau,r300,r600,radeonsi,freedreno,svga,swrast,vc4,virgl,etnaviv,imx
---with-vulkan-drivers=radeon
+--with-dri-drivers=$DRI_DRIVERS
+--with-gallium-drivers=$GALLIUM_DRIVERS
+--with-vulkan-drivers=$VULKAN_DRIVERS
 --disable-llvm-shared-libs
 ;
   make && make check;
-- 
2.12.2



[Mesa-dev] [PATCH 01/16] travis: explicitly LD_LIBRARY_PATH the local libraries

2017-04-28 Thread Emil Velikov
From: Emil Velikov 

Some of the libraries may be dlopened, which may not always work due to
the non-standard prefix that we're using.

Signed-off-by: Emil Velikov 
Reviewed-by: Andres Gomez 
---
 .travis.yml | 1 +
 1 file changed, 1 insertion(+)

diff --git a/.travis.yml b/.travis.yml
index 6d6e44cc419..49158a09fce 100644
--- a/.travis.yml
+++ b/.travis.yml
@@ -36,6 +36,7 @@ env:
 - LLVM_PACKAGE="llvm-${LLVM_VERSION} llvm-${LLVM_VERSION}-dev"
 - LLVM_CONFIG="llvm-config-${LLVM_VERSION}"
 - PKG_CONFIG_PATH=$HOME/prefix/lib/pkgconfig
+- LD_LIBRARY_PATH="$HOME/prefix/lib:$LD_LIBRARY_PATH"
 - MAKEFLAGS=-j2
   matrix:
 - BUILD=make
-- 
2.12.2


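For illustration, prepending the local prefix to the loader search path can be sketched as below, assuming a POSIX shell. The helper prepend_path is hypothetical and is a variation on the patch rather than what it does: it avoids leaving a trailing ":" when the variable starts out empty, since the dynamic loader may interpret an empty entry as the current directory.

```shell
#!/bin/sh
# Hypothetical helper: prepend a directory to a colon-separated search path
# without producing an empty trailing entry when the path was empty.
prepend_path() {
    dir="$1"
    current="$2"
    if test -n "$current"; then
        echo "$dir:$current"
    else
        echo "$dir"
    fi
}

PREFIX_LIB="$HOME/prefix/lib"
LD_LIBRARY_PATH="$(prepend_path "$PREFIX_LIB" "$LD_LIBRARY_PATH")"
export LD_LIBRARY_PATH
echo "$LD_LIBRARY_PATH"
```

Libraries installed under $HOME/prefix/lib are then found both at link time (via PKG_CONFIG_PATH) and when they are dlopened at run time.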

[Mesa-dev] [PATCH 03/16] travis: add the possibility of using the txc-dxtn library

2017-04-28 Thread Emil Velikov
From: Andres Gomez 

The txc-dxtn library implements the patented S3 Texture Compression
algorithm.

By default it won't be used but we add the possibility of setting the
USE_TXC_DXTN variable to yes in the travis web UI so it will be
installed and used for the scons tests.

Cc: Eric Anholt 
Cc: Rhys Kidd 
Signed-off-by: Andres Gomez 
[Emil Velikov: keep the LIB prefix, drop the LD_LIBRARY_PATH, fold URL]
Signed-off-by: Emil Velikov 
---
 .travis.yml | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/.travis.yml b/.travis.yml
index efb8f286c87..e317a027233 100644
--- a/.travis.yml
+++ b/.travis.yml
@@ -35,6 +35,7 @@ env:
 - XCBPROTO_VERSION=xcb-proto-1.11
 - LIBXCB_VERSION=libxcb-1.11
 - LIBXSHMFENCE_VERSION=libxshmfence-1.2
+- LIBTXC_DXTN_VERSION=libtxc_dxtn-1.0.1
 - LLVM_VERSION=3.9
 - LLVM_CONFIG="llvm-config-${LLVM_VERSION}"
 - PKG_CONFIG_PATH=$HOME/prefix/lib/pkgconfig
@@ -93,6 +94,19 @@ install:
   - tar -jxvf $LIBXSHMFENCE_VERSION.tar.bz2
   - (cd $LIBXSHMFENCE_VERSION && ./configure --prefix=$HOME/prefix && make install)
 
+  # libtxc-dxtn uses the patented S3 Texture Compression
+  # algorithm. Therefore, we don't want to use this library but it is
+  # still possible through setting the USE_TXC_DXTN variable to yes in
+  # the travis web UI.
+  #
+  # According to Wikipedia, the patent expires on October 2, 2017:
+  # https://en.wikipedia.org/wiki/S3_Texture_Compression#Patent
+  - if test "x$USE_TXC_DXTN" = xyes; then
+  wget https://people.freedesktop.org/~cbrill/libtxc_dxtn/$LIBTXC_DXTN_VERSION.tar.bz2;
+  tar -jxvf $LIBTXC_DXTN_VERSION.tar.bz2;
+  (cd $LIBTXC_DXTN_VERSION && ./configure --prefix=$HOME/prefix && make install);
+fi
+
 script:
   - if test "x$BUILD" = xmake; then
   ./autogen.sh --enable-debug
-- 
2.12.2


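The opt-in behaviour described for USE_TXC_DXTN can be sketched as follows, assuming a POSIX shell. The function maybe_install_txc_dxtn and its echo messages are hypothetical placeholders for the wget, tar, and configure steps; only the exact value "yes" enables the optional dependency:

```shell
#!/bin/sh
# Hypothetical helper: the optional library is built only when the variable
# is exactly "yes"; unset, empty, or any other value means "skip".
maybe_install_txc_dxtn() {
    if test "x$USE_TXC_DXTN" = xyes; then
        echo "would download and build $LIBTXC_DXTN_VERSION"
    else
        echo "skipping libtxc_dxtn"
    fi
}

LIBTXC_DXTN_VERSION=libtxc_dxtn-1.0.1

USE_TXC_DXTN=yes
maybe_install_txc_dxtn

USE_TXC_DXTN=
maybe_install_txc_dxtn
```

Defaulting to "skip" keeps the patented code path out of ordinary builds while still letting a maintainer enable it from the Travis settings.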

[Mesa-dev] [PATCH 08/16] travis: split out matrix from env

2017-04-28 Thread Emil Velikov
From: Emil Velikov 

With the next commits we'll add a couple more options.

v2: Rework check target.
v3: Comment the current check target, add -j4 SCONSFLAGS
v4: Keep check target as-is, will rework with later patch.

Signed-off-by: Emil Velikov 
Reviewed-by: Andres Gomez 
---
 .travis.yml | 18 +-
 1 file changed, 13 insertions(+), 5 deletions(-)

diff --git a/.travis.yml b/.travis.yml
index a4fe00d8023..0d87c663bb2 100644
--- a/.travis.yml
+++ b/.travis.yml
@@ -38,10 +38,18 @@ env:
 - LLVM_CONFIG="llvm-config-${LLVM_VERSION}"
 - PKG_CONFIG_PATH=$HOME/prefix/lib/pkgconfig
 - LD_LIBRARY_PATH="$HOME/prefix/lib:$LD_LIBRARY_PATH"
-- MAKEFLAGS=-j2
-  matrix:
-- BUILD=make
-- BUILD=scons
+
+matrix:
+  include:
+- env:
+- LABEL="make"
+- BUILD=make
+- MAKEFLAGS=-j2
+- env:
+- LABEL="scons LLVM"
+- BUILD=scons
+- SCONSFLAGS="-j4"
+- SCONS_TARGET="llvm=1"
 
 install:
   - pip install --user mako
@@ -117,5 +125,5 @@ script:
 fi
 
   - if test "x$BUILD" = xscons; then
-  scons llvm=1 && scons llvm=1 check;
+  scons $SCONS_TARGET && scons $SCONS_TARGET check;
 fi
-- 
2.12.2



Re: [Mesa-dev] [PATCH 04/15] travis: enable apt cache

2017-04-28 Thread Emil Velikov
On 28 April 2017 at 19:15, Emil Velikov  wrote:
> On 28 April 2017 at 11:50, Andres Gomez  wrote:
>> Do we want to do this?
>>
>> According to Travis' own doc, there is little to no gain:
>> https://docs.travis-ci.com/user/caching/#Things-not-to-cache
>>
> The packages we use should not be slow to download, although I've not
> checked explicitly. It seems counterintuitive to explicitly add apt
> caching as an option yet advise against it :-\
>
> Let me give it a spin and see what happens.
>
Using apt cache is a clear win - not a huge one, but still.

"make loaders/classic DRI" - 1s
"scons SWR" - 6s

-Emil


Re: [Mesa-dev] [PATCH kmscube 1/2] common: use %llx to print modifier

2017-04-28 Thread Ben Widawsky

Fix kmscube -A on i915 :P

On 17-04-28 14:17:34, Rob Clark wrote:

I guess this applies on top of one of Ben's in-flight patches?
Perhaps it can be squashed into that?  (Otherwise remind me about this
when the modifiers patchset is merged)

BR,
-R

On Fri, Apr 28, 2017 at 12:18 PM, Lucas Stach  wrote:

Use long long format when printing the format modifier, as a simple
long is only 4 bytes on 32bit systems.

Signed-off-by: Lucas Stach 
---
 drm-common.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drm-common.c b/drm-common.c
index 2f2c918596a4..fb4ec7f4389e 100644
--- a/drm-common.c
+++ b/drm-common.c
@@ -73,7 +73,7 @@ struct drm_fb * drm_fb_get_from_bo(struct gbm_bo *bo)

if (modifiers[0]) {
flags = DRM_MODE_FB_MODIFIERS;
-   printf("Using modifier %lx\n", modifiers[0]);
+   printf("Using modifier %llx\n", modifiers[0]);
}

ret = drmModeAddFB2WithModifiers(drm_fd, width, height,
--
2.11.0



Re: [Mesa-dev] [PATCH 1/4] r600g: avoid redundant DB register updates

2017-04-28 Thread Marc Dietrich
On Friday, 28 April 2017, 16:53:55 CEST, Dieter Nützel wrote:
> I'm running this, too.
> But alone. 4/4 didn't apply any longer ;-)
> 
> NO glitches on NI/Turks XT (6670).
> 
> I had tested 'Heaven' and 'Valley' even with the former patch version.
> The 'Heaven' GPU hang (wireframe/tessellation) is OLD, as it stays there
> for ages.
> So:
> 
> Tested-by: Dieter Nützel 

Dieter, your card is HD5000/Evergreen, while mine (RS880) is similar to 
HD4200/r600. I'm ok if the patch only gets applied to the evergreen+ 
generations if the glitches for r600 can't get fixed. 

Marc

> 
> Dieter
> 
> On 28.04.2017 09:57, Marc Dietrich wrote:
> > Hi Constantine,
> > 
> > 
> > On Thursday, 27 April 2017, 21:04:37 CEST, Constantine Kharlamov wrote:
> >> Please, could you try this patch. The change is: I'm setting
> >> dirty_zsbuf in
> >> r600_bind_blend_state_internal() as well. It was the difference
> >> between
> >> radeonsi and r600 for CB updates, and my guess is, it might be
> >> relevant to
> >> DB ones as well.
> > 
> > ok, crash is gone and I get 2-3 fps more :-)
> > 
> > But some rendering glitches and:
> > 
> > radeon :01:05.0: r600_cs_track_validate_db:696 htile surface too
> > small
> > 20480 for 262144 (256 256)
> > radeon :01:05.0: r600_packet3_check:1724 invalid cmd stream
> > [drm:radeon_cs_ioctl [radeon]] *ERROR* Invalid command stream !
> > radeon :01:05.0: r600_cs_track_validate_db:696 htile surface too
> > small
> > 20480 for 262144 (256 256)
> > radeon :01:05.0: r600_packet3_check:1724 invalid cmd stream
> > [drm:radeon_cs_ioctl [radeon]] *ERROR* Invalid command stream !
> > radeon :01:05.0: r600_cs_track_validate_db:696 htile surface too
> > small
> > 20480 for 262144 (256 256)
> > radeon :01:05.0: r600_packet3_check:1724 invalid cmd stream
> > [drm:radeon_cs_ioctl [radeon]] *ERROR* Invalid command stream !
> > radeon :01:05.0: r600_cs_track_validate_db:696 htile surface too
> > small
> > 20480 for 262144 (256 256)
> > radeon :01:05.0: r600_packet3_check:1724 invalid cmd stream
> > [drm:radeon_cs_ioctl [radeon]] *ERROR* Invalid command stream !
> > radeon :01:05.0: r600_cs_track_validate_db:696 htile surface too
> > small
> > 20480 for 262144 (256 256)
> > radeon :01:05.0: r600_packet3_check:1724 invalid cmd stream
> > [drm:radeon_cs_ioctl [radeon]] *ERROR* Invalid command stream !
> > radeon :01:05.0: r600_cs_track_validate_db:696 htile surface too
> > small
> > 20480 for 262144 (256 256)
> > radeon :01:05.0: r600_packet3_check:1724 invalid cmd stream
> > [drm:radeon_cs_ioctl [radeon]] *ERROR* Invalid command stream !
> > radeon :01:05.0: r600_cs_track_validate_db:696 htile surface too
> > small
> > 20480 for 262144 (256 256)
> > radeon :01:05.0: r600_packet3_check:1724 invalid cmd stream
> > [drm:radeon_cs_ioctl [radeon]] *ERROR* Invalid command stream !
> > radeon :01:05.0: r600_cs_track_validate_db:696 htile surface too
> > small
> > 20480 for 262144 (256 256)
> > radeon :01:05.0: r600_packet3_check:1724 invalid cmd stream
> > [drm:radeon_cs_ioctl [radeon]] *ERROR* Invalid command stream !
> > radeon :01:05.0: r600_cs_track_validate_db:696 htile surface too
> > small
> > 20480 for 262144 (256 256)
> > radeon :01:05.0: r600_packet3_check:1724 invalid cmd stream
> > [drm:radeon_cs_ioctl [radeon]] *ERROR* Invalid command stream !
> > radeon :01:05.0: r600_cs_track_validate_db:696 htile surface too
> > small
> > 20480 for 262144 (256 256)
> > radeon :01:05.0: r600_packet3_check:1724 invalid cmd stream
> > [drm:radeon_cs_ioctl [radeon]] *ERROR* Invalid command stream !
> > radeon :01:05.0: r600_cs_track_validate_db:696 htile surface too
> > small
> > 20480 for 262144 (256 256)
> > radeon :01:05.0: r600_packet3_check:1724 invalid cmd stream
> > [drm:radeon_cs_ioctl [radeon]] *ERROR* Invalid command stream !
> > radeon :01:05.0: r600_cs_track_validate_db:696 htile surface too
> > small
> > 20480 for 262144 (256 256)
> > radeon :01:05.0: r600_packet3_check:1724 invalid cmd stream
> > [drm:radeon_cs_ioctl [radeon]] *ERROR* Invalid command stream !
> > 
> > 
> > Marc
> > 
> >> ---
> >> 
> >>  src/gallium/drivers/r600/evergreen_state.c   | 76
> >> 
> >> +++- src/gallium/drivers/r600/r600_blit.c
> >> 
> >>  1 +
> >>  src/gallium/drivers/r600/r600_hw_context.c   |  1 +
> >>  src/gallium/drivers/r600/r600_pipe.h |  1 +
> >>  src/gallium/drivers/r600/r600_state.c| 52 ++-
> >>  src/gallium/drivers/r600/r600_state_common.c |  2 +
> >>  6 files changed, 73 insertions(+), 60 deletions(-)
> >> 
> >> diff --git a/src/gallium/drivers/r600/evergreen_state.c
> >> b/src/gallium/drivers/r600/evergreen_state.c index
> >> 19ad504097..7d84e92250
> >> 100644
> >> --- a/src/gallium/drivers/r600/evergreen_state.c
> >> +++ b/src/gallium/drivers/r600/evergreen_state.c
> >> @@ -1426,6 +1426,7 @@ static void
> >> 

Re: [Mesa-dev] [PATCH kmscube 1/2] common: use %llx to print modifier

2017-04-28 Thread Rob Clark
I guess this applies on top of one of Ben's in-flight patches?
Perhaps it can be squashed into that?  (Otherwise remind me about this
when the modifiers patchset is merged)

BR,
-R

On Fri, Apr 28, 2017 at 12:18 PM, Lucas Stach  wrote:
> Use long long format when printing the format modifier, as a simple
> long is only 4 bytes on 32bit systems.
>
> Signed-off-by: Lucas Stach 
> ---
>  drm-common.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drm-common.c b/drm-common.c
> index 2f2c918596a4..fb4ec7f4389e 100644
> --- a/drm-common.c
> +++ b/drm-common.c
> @@ -73,7 +73,7 @@ struct drm_fb * drm_fb_get_from_bo(struct gbm_bo *bo)
>
> if (modifiers[0]) {
> flags = DRM_MODE_FB_MODIFIERS;
> -   printf("Using modifier %lx\n", modifiers[0]);
> +   printf("Using modifier %llx\n", modifiers[0]);
> }
>
> ret = drmModeAddFB2WithModifiers(drm_fd, width, height,
> --
> 2.11.0
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 04/15] travis: enable apt cache

2017-04-28 Thread Emil Velikov
On 28 April 2017 at 11:50, Andres Gomez  wrote:
> Do we want to do this?
>
> According to Travis' own docs, there is little to no gain:
> https://docs.travis-ci.com/user/caching/#Things-not-to-cache
>
The packages we use should not be slow to download, although I've not
checked explicitly. It seems counterintuitive to explicitly add apt
caching as an option yet advise against it :-\

Let me give it a spin and see what happens.

BTW, since there's a handful of suggestions I've split out the
SCONS_CHECK_COMMAND into a separate patch that follows after its make
brethren.

-Emil


Re: [Mesa-dev] [PATCH] glsl: set vector_elements to 1 for samplers

2017-04-28 Thread Samuel Pitoiset



On 04/28/2017 06:58 PM, Mark Janes wrote:
> With this commit, a wide range of Intel hardware began hanging during
> the GLES CTS, with dmesg errors like:
>
> [25488.739167] traps: glcts[15106] general protection ip:7fdac6484ba5
> sp:7ffdcda85a20 error:0
>
> Machines that did complete the CTS reported hundreds of errors like:
>
> *** Error in `/tmp/build_root/m64/bin/es/cts/glcts': malloc(): memory
> corruption: 0x562c7503b270 ***


That's unfortunate. I don't have any Intel hw.

Kenneth is going to revert the patch for now.

Would be very nice if someone with the hardware could have a look.

Which tests are failing? Can you list some?





> Samuel Pitoiset  writes:
>
>> I don't see any reason why vector_elements is 1 for images and
>> 0 for samplers. This increases consistency and allows cleaning
>> up some code a bit.
>>
>> This will also help for ARB_bindless_texture.
>>
>> No piglit regressions with RadeonSI.
>>
>> Signed-off-by: Samuel Pitoiset
>> ---
>>   src/compiler/glsl_types.cpp |  7 +--
>>   src/mesa/main/uniform_query.cpp | 15 +--
>>   2 files changed, 6 insertions(+), 16 deletions(-)
>>
>> diff --git a/src/compiler/glsl_types.cpp b/src/compiler/glsl_types.cpp
>> index 0480bef80e..bf078ad614 100644
>> --- a/src/compiler/glsl_types.cpp
>> +++ b/src/compiler/glsl_types.cpp
>> @@ -95,12 +95,7 @@ glsl_type::glsl_type(GLenum gl_type, glsl_base_type base_type,
>>
>>  memset(& fields, 0, sizeof(fields));
>>
>> -   if (is_sampler()) {
>> -  /* Samplers take no storage whatsoever. */
>> -  matrix_columns = vector_elements = 0;
>> -   } else {
>> -  matrix_columns = vector_elements = 1;
>> -   }
>> +   matrix_columns = vector_elements = 1;
>>   }
>>
>>   glsl_type::glsl_type(const glsl_struct_field *fields, unsigned num_fields,
>> diff --git a/src/mesa/main/uniform_query.cpp b/src/mesa/main/uniform_query.cpp
>> index e400d0eb00..114f6fb5be 100644
>> --- a/src/mesa/main/uniform_query.cpp
>> +++ b/src/mesa/main/uniform_query.cpp
>> @@ -321,8 +321,7 @@ _mesa_get_uniform(struct gl_context *ctx, GLuint program, GLint location,
>>  }
>>
>>  {
>> -  unsigned elements = (uni->type->is_sampler())
>> -? 1 : uni->type->components();
>> +  unsigned elements = uni->type->components();
>> const int dmul = uni->type->is_64bit() ? 2 : 1;
>> const int rmul = glsl_base_type_is_64bit(returnType) ? 2 : 1;
>>
>> @@ -648,10 +647,8 @@ _mesa_propagate_uniforms_to_driver_storage(struct gl_uniform_storage *uni,
>>   {
>>  unsigned i;
>>
>> -   /* vector_elements and matrix_columns can be 0 for samplers.
>> -*/
>> -   const unsigned components = MAX2(1, uni->type->vector_elements);
>> -   const unsigned vectors = MAX2(1, uni->type->matrix_columns);
>> +   const unsigned components = uni->type->vector_elements;
>> +   const unsigned vectors = uni->type->matrix_columns;
>>  const int dmul = uni->type->is_64bit() ? 2 : 1;
>>
>>  /* Store the data in the driver's requested type in the driver's storage
>> @@ -803,8 +800,7 @@ validate_uniform(GLint location, GLsizei count, const GLvoid *values,
>>  }
>>
>>  /* Verify that the types are compatible. */
>> -   const unsigned components = uni->type->is_sampler()
>> -  ? 1 : uni->type->vector_elements;
>> +   const unsigned components = uni->type->vector_elements;
>>
>>  if (components != src_components) {
>> /* glUniformN() must match float/vecN type */
>> @@ -925,8 +921,7 @@ _mesa_uniform(GLint location, GLsizei count, const GLvoid *values,
>>    return;
>>  }
>>
>> -   const unsigned components = uni->type->is_sampler()
>> -  ? 1 : uni->type->vector_elements;
>> +   const unsigned components = uni->type->vector_elements;
>>
>>  /* Page 82 (page 96 of the PDF) of the OpenGL 2.1 spec says:
>>   *
>> --
>> 2.12.2



Re: [Mesa-dev] [PATCH 1/2] vbo: set min_index = 0 so gallium can use the value directly

2017-04-28 Thread Marek Olšák
Ping

On Wed, Apr 26, 2017 at 11:35 AM, Marek Olšák  wrote:
> From: Marek Olšák 
>
> We could also remove index_bounds_valid and use max_index != ~0 instead.
> Opinions on that are welcome.
> ---
>  src/mesa/vbo/vbo_context.c| 2 +-
>  src/mesa/vbo/vbo_exec_array.c | 6 +++---
>  2 files changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/src/mesa/vbo/vbo_context.c b/src/mesa/vbo/vbo_context.c
> index 7022fe9..384e405 100644
> --- a/src/mesa/vbo/vbo_context.c
> +++ b/src/mesa/vbo/vbo_context.c
> @@ -161,21 +161,21 @@ vbo_draw_indirect_prims(struct gl_context *ctx,
> prim[draw_count - 1].end = 1;
> for (i = 0; i < draw_count; ++i, indirect_offset += stride) {
>prim[i].mode = mode;
>prim[i].indexed = !!ib;
>prim[i].indirect_offset = indirect_offset;
>prim[i].is_indirect = 1;
>prim[i].draw_id = i;
> }
>
> vbo->draw_prims(ctx, prim, draw_count,
> -   ib, false, ~0, ~0,
> +   ib, false, 0, ~0,
> NULL, 0,
> ctx->DrawIndirectBuffer);
>
> free(prim);
>  }
>
>
>  GLboolean _vbo_CreateContext( struct gl_context *ctx )
>  {
> struct vbo_context *vbo = CALLOC_STRUCT(vbo_context);
> diff --git a/src/mesa/vbo/vbo_exec_array.c b/src/mesa/vbo/vbo_exec_array.c
> index aecfad0..15382ea 100644
> --- a/src/mesa/vbo/vbo_exec_array.c
> +++ b/src/mesa/vbo/vbo_exec_array.c
> @@ -1315,21 +1315,21 @@ vbo_validated_multidrawelements(struct gl_context 
> *ctx, GLenum mode,
>   prim[i].base_instance = 0;
>   prim[i].draw_id = i;
>   prim[i].is_indirect = 0;
>   if (basevertex != NULL)
>  prim[i].basevertex = basevertex[i];
>   else
>  prim[i].basevertex = 0;
>}
>
>vbo->draw_prims(ctx, prim, primcount, ,
> -  false, ~0, ~0, NULL, 0, NULL);
> +  false, 0, ~0, NULL, 0, NULL);
> }
> else {
>/* render one prim at a time */
>for (i = 0; i < primcount; i++) {
>   if (count[i] == 0)
>  continue;
>   ib.count = count[i];
>   ib.index_size = vbo_sizeof_ib_type(type);
>   ib.obj = ctx->Array.VAO->IndexBufferObj;
>   ib.ptr = indices[i];
> @@ -1344,21 +1344,21 @@ vbo_validated_multidrawelements(struct gl_context 
> *ctx, GLenum mode,
>   prim[0].indexed = 1;
>   prim[0].num_instances = 1;
>   prim[0].base_instance = 0;
>   prim[0].draw_id = i;
>   prim[0].is_indirect = 0;
>   if (basevertex != NULL)
>  prim[0].basevertex = basevertex[i];
>   else
>  prim[0].basevertex = 0;
>
> - vbo->draw_prims(ctx, prim, 1, , false, ~0, ~0, NULL, 0, NULL);
> + vbo->draw_prims(ctx, prim, 1, , false, 0, ~0, NULL, 0, NULL);
>}
> }
>
> free(prim);
>
> if (MESA_DEBUG_FLAGS & DEBUG_ALWAYS_FLUSH) {
>_mesa_flush(ctx);
> }
>  }
>
> @@ -1436,21 +1436,21 @@ vbo_draw_transform_feedback(struct gl_context *ctx, 
> GLenum mode,
> prim[0].end = 1;
> prim[0].mode = mode;
> prim[0].num_instances = numInstances;
> prim[0].base_instance = 0;
> prim[0].is_indirect = 0;
>
> /* Maybe we should do some primitive splitting for primitive restart
>  * (like in DrawArrays), but we have no way to know how many vertices
>  * will be rendered. */
>
> -   vbo->draw_prims(ctx, prim, 1, NULL, GL_FALSE, ~0, ~0, obj, stream, NULL);
> +   vbo->draw_prims(ctx, prim, 1, NULL, GL_FALSE, 0, ~0, obj, stream, NULL);
>
> if (MESA_DEBUG_FLAGS & DEBUG_ALWAYS_FLUSH) {
>_mesa_flush(ctx);
> }
>  }
>
>
>  /**
>   * Like DrawArrays, but take the count from a transform feedback object.
>   * \param mode  GL_POINTS, GL_LINES, GL_TRIANGLE_STRIP, etc.
> --
> 2.7.4


Re: [Mesa-dev] [PATCH 12/15] travis: split the make target to three separate ones

2017-04-28 Thread Andres Gomez
On Fri, 2017-04-28 at 17:43 +0100, Emil Velikov wrote:
> On 28 April 2017 at 16:00, Andres Gomez  wrote:
> > On Thu, 2017-04-27 at 19:38 +0100, Emil Velikov wrote:
> > > From: Emil Velikov 
> > > 
> > > Split the target to allow faster builds for each run.
> > > 
> > > The overall build time will be more, yet Travis runs multiple builds in
> > > parallel so we're limited by the slowest one.
> > > 
> > > Things are split roughly as:
> > >  - DRI loaders, classic DRI drivers, classic OSMesa, make check
> > >  - All Gallium drivers (minus the SWR) alongside st/dri (mesa)
> > >  - The Vulkan drivers - ANV and RADV, make check (anv)
> > 
> > I think it would be better to split ANV and RADV on different builds
> > too.
> > 
> 
> Any particular reason why?

Just that they are both under heavy development (not that the rest of
Mesa isn't, to be honest), and with so many changes coming in I would
rather have separate builds so that, when one fails, I know straight
away whether it is radv or anv.

A matter of taste, in any case ...

-- 
Br,

Andres


Re: [Mesa-dev] [PATCH] glsl: set vector_elements to 1 for samplers

2017-04-28 Thread Mark Janes
With this commit, a wide range of Intel hardware began hanging during
the GLES CTS, with dmesg errors like:

[25488.739167] traps: glcts[15106] general protection ip:7fdac6484ba5 
sp:7ffdcda85a20 error:0

Machines that did complete the CTS reported hundreds of errors like:

*** Error in `/tmp/build_root/m64/bin/es/cts/glcts': malloc(): memory
corruption: 0x562c7503b270 ***



Samuel Pitoiset  writes:

> I don't see any reason why vector_elements is 1 for images and
> 0 for samplers. This increases consistency and allows cleaning
> up some code a bit.
>
> This will also help for ARB_bindless_texture.
>
> No piglit regressions with RadeonSI.
>
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/compiler/glsl_types.cpp |  7 +--
>  src/mesa/main/uniform_query.cpp | 15 +--
>  2 files changed, 6 insertions(+), 16 deletions(-)
>
> diff --git a/src/compiler/glsl_types.cpp b/src/compiler/glsl_types.cpp
> index 0480bef80e..bf078ad614 100644
> --- a/src/compiler/glsl_types.cpp
> +++ b/src/compiler/glsl_types.cpp
> @@ -95,12 +95,7 @@ glsl_type::glsl_type(GLenum gl_type, glsl_base_type 
> base_type,
>  
> memset(& fields, 0, sizeof(fields));
>  
> -   if (is_sampler()) {
> -  /* Samplers take no storage whatsoever. */
> -  matrix_columns = vector_elements = 0;
> -   } else {
> -  matrix_columns = vector_elements = 1;
> -   }
> +   matrix_columns = vector_elements = 1;
>  }
>  
>  glsl_type::glsl_type(const glsl_struct_field *fields, unsigned num_fields,
> diff --git a/src/mesa/main/uniform_query.cpp b/src/mesa/main/uniform_query.cpp
> index e400d0eb00..114f6fb5be 100644
> --- a/src/mesa/main/uniform_query.cpp
> +++ b/src/mesa/main/uniform_query.cpp
> @@ -321,8 +321,7 @@ _mesa_get_uniform(struct gl_context *ctx, GLuint program, 
> GLint location,
> }
>  
> {
> -  unsigned elements = (uni->type->is_sampler())
> -  ? 1 : uni->type->components();
> +  unsigned elements = uni->type->components();
>const int dmul = uni->type->is_64bit() ? 2 : 1;
>const int rmul = glsl_base_type_is_64bit(returnType) ? 2 : 1;
>  
> @@ -648,10 +647,8 @@ _mesa_propagate_uniforms_to_driver_storage(struct 
> gl_uniform_storage *uni,
>  {
> unsigned i;
>  
> -   /* vector_elements and matrix_columns can be 0 for samplers.
> -*/
> -   const unsigned components = MAX2(1, uni->type->vector_elements);
> -   const unsigned vectors = MAX2(1, uni->type->matrix_columns);
> +   const unsigned components = uni->type->vector_elements;
> +   const unsigned vectors = uni->type->matrix_columns;
> const int dmul = uni->type->is_64bit() ? 2 : 1;
>  
> /* Store the data in the driver's requested type in the driver's storage
> @@ -803,8 +800,7 @@ validate_uniform(GLint location, GLsizei count, const 
> GLvoid *values,
> }
>  
> /* Verify that the types are compatible. */
> -   const unsigned components = uni->type->is_sampler()
> -  ? 1 : uni->type->vector_elements;
> +   const unsigned components = uni->type->vector_elements;
>  
> if (components != src_components) {
>/* glUniformN() must match float/vecN type */
> @@ -925,8 +921,7 @@ _mesa_uniform(GLint location, GLsizei count, const GLvoid 
> *values,
>   return;
> }
>  
> -   const unsigned components = uni->type->is_sampler()
> -  ? 1 : uni->type->vector_elements;
> +   const unsigned components = uni->type->vector_elements;
>  
> /* Page 82 (page 96 of the PDF) of the OpenGL 2.1 spec says:
>  *
> -- 
> 2.12.2


Re: [Mesa-dev] LLVM build issue on r299777

2017-04-28 Thread Nicolai Hähnle

On 28.04.2017 13:08, Eric Engestrom wrote:
> Hi,
>
> I'm currently running llvm r299777 but I have no idea when this was
> changed.

Yeah, you'll just have to upgrade LLVM (or downgrade to LLVM 4.0). This
sort of thing unfortunately happens occasionally when running LLVM trunk
and Mesa master. I currently have r301197 here, and I'm not aware of any
brokenness.

Cheers,
Nicolai

> make[4]: Entering directory '/var/tmp/mesa-git/src/mesa/src/amd'
>   CXX  common/common_libamd_common_la-ac_llvm_helper.lo
> common/ac_llvm_helper.cpp: In function ‘void
> ac_add_attr_dereferenceable(LLVMValueRef, uint64_t)’:
> common/ac_llvm_helper.cpp:53:83: error: no matching function for call to
> ‘llvm::Argument::addAttr(llvm::Attribute)’
> A->addAttr(llvm::Attribute::getWithDereferenceableBytes(A->getContext(),
> bytes));
>    ^
> In file included from /usr/include/llvm/IR/Function.h:25:0,
>  from /usr/include/llvm/IR/Module.h:21,
>  from /usr/include/llvm/ExecutionEngine/ExecutionEngine.h:22,
>  from common/ac_llvm_helper.cpp:35:
> /usr/include/llvm/IR/Argument.h:111:8: note: candidate: void
> llvm::Argument::addAttr(llvm::AttributeList)
>    void addAttr(AttributeList AS);
> ^~~
> /usr/include/llvm/IR/Argument.h:111:8: note:   no known conversion for argument
> 1 from ‘llvm::Attribute’ to ‘llvm::AttributeList’
> /usr/include/llvm/IR/Argument.h:113:8: note: candidate: void
> llvm::Argument::addAttr(llvm::Attribute::AttrKind)
>    void addAttr(Attribute::AttrKind Kind) {
> ^~~
> /usr/include/llvm/IR/Argument.h:113:8: note:   no known conversion for argument
> 1 from ‘llvm::Attribute’ to ‘llvm::Attribute::AttrKind’
> make[4]: *** [Makefile:1065: common/common_libamd_common_la-ac_llvm_helper.lo]
> Error 1
>
> Cc'ing Marek and Nicolai as I've seen you fixing LLVM version issues in
> the past :)
>
> Cheers,
>   Eric





Re: [Mesa-dev] [PATCH 12/15] travis: split the make target to three separate ones

2017-04-28 Thread Emil Velikov
On 28 April 2017 at 16:00, Andres Gomez  wrote:
> On Thu, 2017-04-27 at 19:38 +0100, Emil Velikov wrote:
>> From: Emil Velikov 
>>
>> Split the target to allow faster builds for each run.
>>
>> The overall build time will be more, yet Travis runs multiple builds in
>> parallel so we're limited by the slowest one.
>>
>> Things are split roughly as:
>>  - DRI loaders, classic DRI drivers, classic OSMesa, make check
>>  - All Gallium drivers (minus the SWR) alongside st/dri (mesa)
>>  - The Vulkan drivers - ANV and RADV, make check (anv)
>
> I think it would be better to split ANV and RADV on different builds
> too.
>
Any particular reason why?

As-is, the jobs are fairly well balanced, be that with a cold [1] or
warm ccache [2] [3].
In the latter case, a big chunk of the time is dominated by the
build/install of dependencies (even w/o wayland/va/vdpau).

[1] https://travis-ci.org/evelikov/Mesa/builds/226462487
[2] https://travis-ci.org/evelikov/Mesa/builds/226874115
[3] https://travis-ci.org/evelikov/Mesa/builds/226879305

>> @@ -207,18 +255,33 @@ install:
>>(cd $LIBTXC_DXTN_VERSION && ./configure --prefix=$HOME/prefix && make 
>> install);
>>  fi
>>
>> +  # Generate the header since one is missing on the Travis instance
>> +  - mkdir -p linux
>> +  - echo "#ifndef _LINUX_MEMFD_H" > linux/memfd.h
>> +  - echo "#define _LINUX_MEMFD_H" >> linux/memfd.h
>> +  - echo ""   >> linux/memfd.h
>> +  - echo "#define __NR_memfd_create 319"  >> linux/memfd.h
>> +  - echo "#define SYS_memfd_create __NR_memfd_create" >> linux/memfd.h
>> +  - echo ""   >> linux/memfd.h
>> +  - echo "#define MFD_CLOEXEC 0x0001U">> linux/memfd.h
>> +  - echo "#define MFD_ALLOW_SEALING   0x0002U">> linux/memfd.h
>> +  - echo ""   >> linux/memfd.h
>> +  - echo "#endif /* _LINUX_MEMFD_H */">> linux/memfd.h
>
> This is a bit ugly in the Travis log output. What about replacing it
> with something like this:
>
>   - printf "%s\n" \
>"#ifndef _LINUX_MEMFD_H" \
>"#define _LINUX_MEMFD_H" \
>"" \
>"#define __NR_memfd_create 319" \
>"#define SYS_memfd_create __NR_memfd_create" \
>"" \
>"#define MFD_CLOEXEC 0x0001U" \
>"#define MFD_ALLOW_SEALING   0x0002U" \
>"" \
>"#endif /* _LINUX_MEMFD_H */" > linux/memfd.h
>
Looks nice and works like a charm. Much appreciated.

-Emil


[Mesa-dev] [PATCH kmscube 1/2] common: use %llx to print modifier

2017-04-28 Thread Lucas Stach
Use long long format when printing the format modifier, as a simple
long is only 4 bytes on 32bit systems.

Signed-off-by: Lucas Stach 
---
 drm-common.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drm-common.c b/drm-common.c
index 2f2c918596a4..fb4ec7f4389e 100644
--- a/drm-common.c
+++ b/drm-common.c
@@ -73,7 +73,7 @@ struct drm_fb * drm_fb_get_from_bo(struct gbm_bo *bo)
 
if (modifiers[0]) {
flags = DRM_MODE_FB_MODIFIERS;
-   printf("Using modifier %lx\n", modifiers[0]);
+   printf("Using modifier %llx\n", modifiers[0]);
}
 
ret = drmModeAddFB2WithModifiers(drm_fd, width, height,
-- 
2.11.0
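
Note on the format specifier: DRM format modifiers are 64-bit values, so
%lx truncates or misreads them where long is 32 bits. The patch switches
to %llx. A minimal sketch of a portable alternative, assuming the value
is held in a uint64_t, is the PRIx64 macro from <inttypes.h>. The helper
name format_modifier is hypothetical and not part of kmscube.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper, not part of kmscube: formats a 64-bit format
 * modifier as hex. PRIx64 expands to the correct length modifier for
 * uint64_t on both 32-bit and 64-bit targets. Returns what snprintf
 * returns, i.e. the length the full string needs. */
static int format_modifier(char *buf, size_t len, uint64_t modifier)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "Using modifier %" PRIx64, modifier);
}
```

Compared with %llx this avoids a format warning on platforms where
uint64_t is unsigned long rather than unsigned long long.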



[Mesa-dev] [PATCH kmscube 2/2] drm-atomic: init out_fence_fd to -1

2017-04-28 Thread Lucas Stach
The current initial value of 0 is a valid fd, so this will trip up
the GPU submit on the first render, when used as an IN fence for rendering.

Reported-by: Philipp Zabel 
Signed-off-by: Lucas Stach 
---
 drm-atomic.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drm-atomic.c b/drm-atomic.c
index c06e52fb25ba..0f3c4f285fb7 100644
--- a/drm-atomic.c
+++ b/drm-atomic.c
@@ -33,8 +33,9 @@
 
 #define VOID2U64(x) ((uint64_t)(unsigned long)(x))
 
-static struct drm drm;
-
+static struct drm drm = {
+   .kms_out_fence_fd = -1,
+};
 
 static int add_connector_property(drmModeAtomicReq *req, uint32_t obj_id,
const char *name, uint64_t value)
-- 
2.11.0



Re: [Mesa-dev] [PATCH] i965: Solve Android native fence fd double close issue

2017-04-28 Thread Chad Versace
On Thu 27 Apr 2017, Emil Velikov wrote:
> On 27 April 2017 at 12:14, Xu, Randy  wrote:
> > Hi, Chad
> >
> > Please review this patch, we need it to solve some instability issues

Randy and Tapani, could you provide a few dEQP test names that this
patch fixes? I'd like to mention at least one EGL and one Vulkan test in
the commit message.

> The patch is correct, although the commit message can be improved upon.
> Read through the following example and consider the alternative
> solution mentioned within.

Yes, this patch is correct. It makes brw_dri_create_fence_fd() behave
like all the other drivers' create_fence_fd funcs, which call dup().
Since this is an easy one-liner that can backport to stable, let's take
it.

However, I believe the fully correct solution is Emil's plan B:
__DRI2fenceExtensionRec::create_fence_fd should transfer fd ownership to
the driver, and therefore no dup is needed. But that's a slightly more
invasive change that's not as easily backported to stable.

Reviewed-by: Chad Versace 
Cc: mesa-sta...@lists.freedesktop.org

Emil, how about one of us appends your extended commit message to
Randy's, and then pushes?

> Then either polish and resend, or send patch that implements plan B.
> If you opt for B you want to drop the dup/close from the existing
> users - freedreno and etnaviv.
> 
> "
> The semantics of __DRI2fenceExtensionRec::create_fence_fd are unclear
> about whether the DRI driver takes ownership of the fd or not.
> Since the i965 driver supports both "in" and "out" fds, it assumes "yes,
> the driver takes ownership", which results in a double close:
> first in our destroy_fence() callback and then in the loader.
> 
> Other DRI modules rely on the loader issuing close().
> 
> Thus we have two solutions:
>  - dup() the file descriptor
>  - close() only if we have an out fence.
> 
> This patch implements the former, simpler solution.
> 
> Fixes: 6403e376511 ("i965/sync: Implement fences based on Linux sync_file")
> Reviewed-by: Emil Velikov 
> "
> 
> In either case you want to augment create_fence_fd and destroy_fence
> (in dri_interface.h) to explicitly define the behaviour.
> Please keep that a separate patch part of this series.
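
A minimal sketch of the dup() approach discussed in this thread, assuming
POSIX and using hypothetical names (struct fence, fence_init_from_fd,
fence_destroy) rather than the real i965 code. The driver keeps its own
duplicate, so the caller and the driver each close their own descriptor
and nothing is closed twice.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical fence object; the real i965 struct differs. */
struct fence {
    int fd;
};

/* The driver stores a duplicate, so the caller still owns 'fd'.
 * Returns 0 on success, -1 if dup() failed. */
static int fence_init_from_fd(struct fence *f, int fd)
{
    assert(f != NULL);
    f->fd = dup(fd);
    return f->fd >= 0 ? 0 : -1;
}

/* Closes only the driver's duplicate. */
static void fence_destroy(struct fence *f)
{
    assert(f != NULL);
    if (f->fd >= 0) {
        close(f->fd);
        f->fd = -1;
    }
}

/* Returns 1 if 'fd' refers to an open descriptor, 0 otherwise. */
static int fd_is_open(int fd)
{
    return fcntl(fd, F_GETFD) != -1;
}
```

Under "plan B" (ownership transfer) the dup() would go away and the
caller would have to stop closing the fd, which is why the thread asks
for the contract to be spelled out in dri_interface.h.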


Re: [Mesa-dev] [PATCH 56/61] radeonsi: get InstanceID from VGPR1 (or VGPR2 for tess) instead of VGPR3

2017-04-28 Thread Marek Olšák
On Fri, Apr 28, 2017 at 1:54 PM, Nicolai Hähnle  wrote:
> On 24.04.2017 10:45, Marek Olšák wrote:
>>
>> From: Marek Olšák 
>>
>> VGPR1 = InstanceID / StepRate0; // StepRate0 can be set to 1
>> ---
>>  src/gallium/drivers/radeonsi/si_shader.c| 20 ++--
>>  src/gallium/drivers/radeonsi/si_shader.h|  1 +
>>  src/gallium/drivers/radeonsi/si_state.c |  1 +
>>  src/gallium/drivers/radeonsi/si_state_shaders.c | 24
>> +---
>>  4 files changed, 33 insertions(+), 13 deletions(-)
>>
>> diff --git a/src/gallium/drivers/radeonsi/si_shader.c
>> b/src/gallium/drivers/radeonsi/si_shader.c
>> index edb50a3..ce509af 100644
>> --- a/src/gallium/drivers/radeonsi/si_shader.c
>> +++ b/src/gallium/drivers/radeonsi/si_shader.c
>> @@ -5838,23 +5838,28 @@ static void declare_vs_specific_input_sgprs(struct
>> si_shader_context *ctx,
>> params[ctx->param_vs_state_bits = (*num_params)++] = ctx->i32;
>>  }
>>
>>  static void declare_vs_input_vgprs(struct si_shader_context *ctx,
>>LLVMTypeRef *params, unsigned
>> *num_params,
>>unsigned *num_prolog_vgprs)
>>  {
>> struct si_shader *shader = ctx->shader;
>>
>> params[ctx->param_vertex_id = (*num_params)++] = ctx->i32;
>> -   params[ctx->param_rel_auto_id = (*num_params)++] = ctx->i32;
>> -   params[ctx->param_vs_prim_id = (*num_params)++] = ctx->i32;
>> -   params[ctx->param_instance_id = (*num_params)++] = ctx->i32;
>> +   if (shader->key.as_ls) {
>> +   params[ctx->param_rel_auto_id = (*num_params)++] =
>> ctx->i32;
>> +   params[ctx->param_instance_id = (*num_params)++] =
>> ctx->i32;
>> +   } else {
>> +   params[ctx->param_instance_id = (*num_params)++] =
>> ctx->i32;
>> +   params[ctx->param_vs_prim_id = (*num_params)++] =
>> ctx->i32;
>> +   }
>> +   params[(*num_params)++] = ctx->i32; /* unused */
>>
>> if (!shader->is_gs_copy_shader) {
>> /* Vertex load indices. */
>> ctx->param_vertex_index0 = (*num_params);
>> for (unsigned i = 0; i <
>> shader->selector->info.num_inputs; i++)
>> params[(*num_params)++] = ctx->i32;
>> *num_prolog_vgprs += shader->selector->info.num_inputs;
>> }
>>  }
>>
>> @@ -7497,25 +7502,28 @@ static bool si_compile_tgsi_main(struct
>> si_shader_context *ctx,
>>  static void si_get_vs_prolog_key(const struct tgsi_shader_info *info,
>>  unsigned num_input_sgprs,
>>  const struct si_vs_prolog_bits
>> *prolog_key,
>>  struct si_shader *shader_out,
>>  union si_shader_part_key *key)
>>  {
>> memset(key, 0, sizeof(*key));
>> key->vs_prolog.states = *prolog_key;
>> key->vs_prolog.num_input_sgprs = num_input_sgprs;
>> key->vs_prolog.last_input = MAX2(1, info->num_inputs) - 1;
>> +   key->vs_prolog.as_ls = shader_out->key.as_ls;
>>
>> -   if (shader_out->selector->type == PIPE_SHADER_TESS_CTRL)
>> +   if (shader_out->selector->type == PIPE_SHADER_TESS_CTRL) {
>> +   key->vs_prolog.as_ls = 1;
>> key->vs_prolog.num_merged_next_stage_vgprs = 2;
>> -   else if (shader_out->selector->type == PIPE_SHADER_GEOMETRY)
>> +   } else if (shader_out->selector->type == PIPE_SHADER_GEOMETRY) {
>> key->vs_prolog.num_merged_next_stage_vgprs = 5;
>> +   }
>>
>> /* Set the instanceID flag. */
>> for (unsigned i = 0; i < info->num_inputs; i++)
>> if (key->vs_prolog.states.instance_divisors[i])
>> shader_out->info.uses_instanceid = true;
>>  }
>>
>>  /**
>>   * Compute the VS epilog key, which contains all the information needed
>> to
>>   * build the VS epilog function, and set the PrimitiveID output offset.
>> @@ -8508,21 +8516,21 @@ static void si_build_vs_prolog_function(struct
>> si_shader_context *ctx,
>> LLVMValueRef ret, func;
>> int last_sgpr, num_params, num_returns, i;
>> unsigned first_vs_vgpr = key->vs_prolog.num_input_sgprs +
>>
>> key->vs_prolog.num_merged_next_stage_vgprs;
>> unsigned num_input_vgprs =
>> key->vs_prolog.num_merged_next_stage_vgprs + 4;
>> unsigned num_all_input_regs = key->vs_prolog.num_input_sgprs +
>>   num_input_vgprs;
>> unsigned user_sgpr_base =
>> key->vs_prolog.num_merged_next_stage_vgprs ? 8 : 0;
>>
>> ctx->param_vertex_id = first_vs_vgpr;
>> -   ctx->param_instance_id = first_vs_vgpr + 3;
>> +   ctx->param_instance_id = first_vs_vgpr + (key->vs_prolog.as_ls ? 2
>> : 1);
>>
>> /* 4 preloaded VGPRs + vertex load indices as prolog outputs */
>> params = alloca(num_all_input_regs * 
