Re: [Mesa-dev] [PATCH] tgsi/scan: use wrap-around shift behavior explicitly for file_mask

2018-03-01 Thread Jose Fonseca

On 02/03/18 02:23, Brian Paul wrote:

On 03/01/2018 07:01 PM, srol...@vmware.com wrote:

From: Roland Scheidegger 

The comment said it will only represent the lowest 32 regs. This was
not entirely true in practice, since at least on x86 you'll get
masked shifts (unless the compiler could recognize it already and toss
it out). It turns out this actually works out alright (presumably
no one uses it for temp regs) when increasing max sampler views, so
make that behavior explicit.
Albeit it feels a bit hacky (but in any case, explicit behavior there
is better than undefined behavior).


This effectively treats the file_mask as a simplistic Bloom filter.  I
suppose that as long as all users of this for sampler views are aware of
this (and don't assume that a shader with a single SRV on slot 32
also uses slot 0) it might work.
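To make the aliasing concrete, here is a minimal sketch of the wrap-around mask. The type and helper names (reg_mask, mask_add, mask_maybe_used) are hypothetical and only stand in for one file_mask entry; they are not part of Mesa.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for one entry of file_mask. */
typedef uint32_t reg_mask;

/* Record a declared register; indices >= 32 wrap onto bits 0..31. */
static void mask_add(reg_mask *mask, unsigned reg)
{
   *mask |= UINT32_C(1) << (reg & 31);
}

/*
 * May return true for a register that was never declared (false
 * positive), but never returns false for one that was declared.
 */
static bool mask_maybe_used(reg_mask mask, unsigned reg)
{
   return (mask & (UINT32_C(1) << (reg & 31))) != 0;
}
```

With only slot 32 declared, a query for slot 0 also answers "maybe used", so callers can treat a set bit as a hint but not as proof that the slot is in use.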



---
  src/gallium/auxiliary/tgsi/tgsi_scan.c | 7 +--
  src/gallium/drivers/llvmpipe/lp_state_fs.c | 7 ++-
  src/gallium/drivers/swr/swr_shader.cpp | 2 +-
  3 files changed, 12 insertions(+), 4 deletions(-)

diff --git a/src/gallium/auxiliary/tgsi/tgsi_scan.c 
b/src/gallium/auxiliary/tgsi/tgsi_scan.c

index c35eff2..0d229c9 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_scan.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_scan.c
@@ -585,8 +585,11 @@ scan_declaration(struct tgsi_shader_info *info,
    int buffer;
    unsigned index, target, type;
-  /* only first 32 regs will appear in this bitfield */
-  info->file_mask[file] |= (1 << reg);
+  /*
+   * only first 32 regs will appear in this bitfield, if larger
+   * bits will wrap around.
+   */
+  info->file_mask[file] |= (1 << (reg & 31));


Or, reg % 32 and let the compiler optimize it.

Either way, Reviewed-by: Brian Paul 


I also admit that somehow reg & 31 made me pause, but reg % 32 seems 
obviously right.
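For unsigned operands the two spellings give the same bit index, and compilers typically reduce a remainder by a power of two to the same mask. A small sketch, assuming reg is unsigned as it appears to be in the patch; the helper names are made up for illustration:

```c
/* Bit index via masking: keeps the low five bits of reg. */
static unsigned bit_index_and(unsigned reg)
{
   return reg & 31u;
}

/* Bit index via remainder: same result for unsigned operands. */
static unsigned bit_index_mod(unsigned reg)
{
   return reg % 32u;
}
```

The equivalence does not hold for a signed reg with a negative value: in C99 and later, -1 % 32 is -1 while -1 & 31 is 31, so the masked form is the one that always yields a valid shift count.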



Reviewed-by: Jose Fonseca 

Jose
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 11/12] vbo: Remove vbo_save_vertex_list::buffer_offset.

2018-03-01 Thread Mathias Fröhlich
Hi Brian,

On Thursday, 1 March 2018 18:15:55 CET Brian Paul wrote:
> That sounds great.  At VMware we come across quite a few legacy GL apps 
> that make heavy use of display lists, and often times, the applications 
> are pretty inefficient with display list use.
I know. I work for a CAE company doing fluid simulation myself ...

> My previous optimization applied to multiple glBegin/End primitives 
> within a single display list (avoid re-emitting vertex array offsets for 
> each draw call).  If we can do that with glBegin/End prims in separate 
> display lists, that could be a really nice improvement.

If I understand the dlist compiler correctly, your current optimization
already has the potential to share across dlists, as long as the amount of
vertex data stays below the chunk of vbo that gets allocated at one time,
which may span multiple display lists.
The issue is that it is less likely to hit the optimization relative to the
beginning of the backing vbo of the current chunk. The previous set of
vertices is much more likely to stem from a similar draw, and its base offset
will fit much more often IMO. Therefore optimizing for the previous vao's
offset is much more likely to pay off. This is better inside a dlist but
also across dlists.

Did you see the single patch I sent out, mostly for you, yesterday (European)
morning?

For one of the openscenegraph workloads mentioned, that reduces the number of
allocated vaos from 536 to 138 with a dlist count of 311. Note that the maximum
expected vao count is 311*2 = 622.

best

Mathias




Re: [Mesa-dev] [PATCH] util: use clock_gettime() on PIPE_OS_BSD

2018-03-01 Thread Jonathan Gray
On Wed, Feb 28, 2018 at 08:22:57AM -0700, Brian Paul wrote:
> On 02/28/2018 03:19 AM, Jonathan Gray wrote:
> > OpenBSD, FreeBSD, NetBSD and DragonFlyBSD all have clock_gettime()
> > so use it when PIPE_OS_BSD is defined.
> > 
> > Signed-off-by: Jonathan Gray 
> > ---
> >   src/util/os_time.c | 2 +-
> >   1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/src/util/os_time.c b/src/util/os_time.c
> > index 72dc7e49c0..ac488b2287 100644
> > --- a/src/util/os_time.c
> > +++ b/src/util/os_time.c
> > @@ -55,7 +55,7 @@
> >   int64_t
> >   os_time_get_nano(void)
> >   {
> > -#if defined(PIPE_OS_LINUX)
> > +#if defined(PIPE_OS_LINUX) || defined(PIPE_OS_BSD)
> >  struct timespec tv;
> >  clock_gettime(CLOCK_MONOTONIC, &tv);
> > 
> 
> LGTM.
> Reviewed-by: Brian Paul 

Thanks, there is also u_thread_get_time_nano().  I tried having autoconf
do AC_CHECK_FUNC for pthread_getcpuclockid() and setting a define
if it was found but couldn't seem to get the pthread linkage in the test
to work properly.  So for now in my local tree I have the diff below.

pthread_getcpuclockid() is available on
OpenBSD https://man.openbsd.org/pthread_getcpuclockid
FreeBSD https://www.freebsd.org/cgi/man.cgi?query=pthread_getcpuclockid
NetBSD 
http://netbsd.gw.com/cgi-bin/man-cgi?pthread_getcpuclockid++NetBSD-current
DragonFly 
https://leaf.dragonflybsd.org/cgi/web-man?command=pthread_getcpuclockid=ANY
Cygwin https://cygwin.com/cygwin-api/compatibility.html#std-susv4

but apparently not on macOS and Solaris?
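For reference, the call sequence that u_thread_get_time_nano() relies on can be exercised standalone. This is a minimal sketch, assuming a POSIX system that provides pthread_getcpuclockid(). The name thread_cpu_time_ns and the -1 error return are choices made for this example, not Mesa's: the Mesa helper takes a thrd_t and returns 0 in its fallback path.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif

#include <pthread.h>
#include <stdint.h>
#include <time.h>

/*
 * Return the CPU time consumed by `thread` in nanoseconds,
 * or -1 if either call fails.
 */
static int64_t thread_cpu_time_ns(pthread_t thread)
{
   clockid_t cid;
   struct timespec ts;

   /* Look up the per-thread CPU-time clock. */
   if (pthread_getcpuclockid(thread, &cid) != 0)
      return -1;

   /* Read that clock like any other POSIX clock. */
   if (clock_gettime(cid, &ts) != 0)
      return -1;

   return (int64_t)ts.tv_sec * INT64_C(1000000000) + ts.tv_nsec;
}
```

Calling thread_cpu_time_ns(pthread_self()) before and after a block of work gives the CPU time that block consumed on the calling thread.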

diff --git a/src/util/u_thread.h b/src/util/u_thread.h
index 8c6e0bdc59..4f559e5c8f 100644
--- a/src/util/u_thread.h
+++ b/src/util/u_thread.h
@@ -71,21 +71,21 @@ static inline void u_thread_setname( const char *name )
 }
 
 /*
  * Thread statistics.
  */
 
 /* Return the time of a thread's CPU time clock. */
 static inline int64_t
 u_thread_get_time_nano(thrd_t thread)
 {
-#if defined(__linux__) && defined(HAVE_PTHREAD)
+#if defined(HAVE_PTHREAD)
struct timespec ts;
clockid_t cid;
 
pthread_getcpuclockid(thread, &cid);
clock_gettime(cid, &ts);
return (int64_t)ts.tv_sec * 1000000000 + ts.tv_nsec;
 #else
return 0;
 #endif
 }


Re: [Mesa-dev] [PATCH 02/13] nir: add load_param

2018-03-01 Thread Jason Ekstrand
On Wed, Feb 28, 2018 at 1:25 PM, Rob Clark  wrote:

> On Wed, Feb 28, 2018 at 4:16 PM, Eric Anholt  wrote:
> > Rob Clark  writes:
> >
> >> From: Karol Herbst 
> >>
> >> OpenCL kernels have parameters (see pipe_grid_info::input), and so we
> >> need a way to access them.
> >>
> >> Signed-off-by: Rob Clark 
> >>
> >> ---
> >>  src/compiler/nir/nir_intrinsics.h |  2 ++
> >>  src/compiler/nir/nir_lower_io.c   | 13 ++---
> >>  2 files changed, 12 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/src/compiler/nir/nir_intrinsics.h b/src/compiler/nir/nir_
> intrinsics.h
> >> index ede29277876..0915c5e809f 100644
> >> --- a/src/compiler/nir/nir_intrinsics.h
> >> +++ b/src/compiler/nir/nir_intrinsics.h
> >> @@ -435,6 +435,8 @@ LOAD(ubo, 2, 0, xx, xx, xx,
> NIR_INTRINSIC_CAN_ELIMINATE | NIR_INTRINSIC_CAN_REOR
> >>  LOAD(input, 1, 2, BASE, COMPONENT, xx, NIR_INTRINSIC_CAN_ELIMINATE |
> NIR_INTRINSIC_CAN_REORDER)
> >>  /* src[] = { vertex, offset }. const_index[] = { base, component } */
> >>  LOAD(per_vertex_input, 2, 2, BASE, COMPONENT, xx,
> NIR_INTRINSIC_CAN_ELIMINATE | NIR_INTRINSIC_CAN_REORDER)
> >> +/* src[] = { }. const_index[] = { base } */
> >> +LOAD(param, 0, 1, BASE, xx, xx, NIR_INTRINSIC_CAN_ELIMINATE |
> NIR_INTRINSIC_CAN_REORDER)
> >
> > I know this is a new request compared to the existing pattern, but could
> > you put a comment describing what load_param does in the code as well as
> > the commit message?  Especially what the meaning of the base is.  We've
> > been bad at this in NIR, and it makes learning how to write a new NIR
> > backend more challenging than it should be.
> >
> > Also, what makes these params different from UBOs or default uniforms?
>
> yeah, I guess makes sense to describe better in code.. but for now
> base is just the parameter number (ie. first param base=0, second
> param base=1, etc)
>
> For ir3, I end up uploading these as uniforms.. I'm not sure how nv
> does kernel params offhand.  Maybe if they just end up uniforms for
> everyone we can change clover to use ctx->set_constant_buffer() and
> use a pass to lower load_param to load_uniform.  At least that would
> solve one awkward thing about the pipe driver and clover agreeing on
> the layout of pipe_grid_info::input.
>

Disclaimer: I won't claim to be an OpenCL expert.  I had a chat with Curro
this morning and I think I have a decent handle on how it works now but I
may be missing something.

That said, I suspect that load/store_param is the wrong approach.  My
understanding is that each type of parameter: pointer, image, constant,
etc. has a corresponding binding type in GL that's a fairly close match:
SSBO, image, UBO, etc.  If this is true, then I think the better thing to
do would be to translate parameters in the main function into vtn_variables
of the appropriate type.  OpenCL can use a very simple binding model where
you index by the order they show up in the parameter list.  For globals,
I'm not sure how OpenCL gives you access to them but I would guess some
fairly straightforward binding concept can be used there as well.
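The binding model suggested here could be sketched roughly as follows. Everything in it is hypothetical: the enums, struct and function names are invented for illustration and do not correspond to existing NIR, vtn or clover types. It only shows the idea of mapping each parameter kind to a GL-style binding kind and using the position in the parameter list as the binding index.

```c
#include <stddef.h>

/* Hypothetical kernel parameter kinds. */
enum arg_kind { ARG_POINTER, ARG_IMAGE, ARG_CONSTANT };

/* Hypothetical GL-style binding kinds they would map to. */
enum binding_kind { BIND_SSBO, BIND_IMAGE, BIND_UBO };

struct binding {
   enum binding_kind kind;
   unsigned index;
};

/* Map one parameter kind to its closest binding kind. */
static enum binding_kind binding_for(enum arg_kind kind)
{
   switch (kind) {
   case ARG_POINTER:
      return BIND_SSBO;
   case ARG_IMAGE:
      return BIND_IMAGE;
   case ARG_CONSTANT:
      return BIND_UBO;
   }
   return BIND_UBO; /* not reached for valid input */
}

/* The binding index is simply the position in the parameter list. */
static void assign_bindings(const enum arg_kind *args, size_t n,
                            struct binding *out)
{
   for (size_t i = 0; i < n; i++) {
      out[i].kind = binding_for(args[i]);
      out[i].index = (unsigned)i;
   }
}
```

How globals would fit into such a scheme is left open, as in the message above.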

--Jason


Re: [Mesa-dev] [PATCH] Update the documentation for meson

2018-03-01 Thread Matt Turner
On Thu, Mar 1, 2018 at 11:34 AM, Dylan Baker  wrote:
> Meson is pretty well tested and works in most configurations now, so we
> can remove the warning about it being unsuited for actual use.
>
> It's also worth documenting that meson 0.42.0 or greater is required.
>
> Signed-off-by: Dylan Baker 
> ---
>  docs/meson.html | 34 +-
>  1 file changed, 21 insertions(+), 13 deletions(-)
>
> diff --git a/docs/meson.html b/docs/meson.html
> index 77f89b0c6c7..782cc198649 100644
> --- a/docs/meson.html
> +++ b/docs/meson.html
> @@ -18,11 +18,20 @@
>
>  1. Basic Usage
>
> -The Meson build system for Mesa is still under active development,
> -and should not be used in production environments.
> +The Meson build system is generally considered stable and ready
> +for production
>
> -The meson build is currently only tested on linux, and is known to not work
> -on macOS, Windows, and haiku. This will be fixed.
> +The meson build is currently known to work on Linux, macOS, Cygwin, Haiku,
> +FreeBSD, DragonflyBSD, and NetBSD, it is believed to work on OpenBSD.
> +
> +Mesa requires Meson >= 0.42.0 to build in general.
> +
> +Additionaly, to build the Clover OpenCL state tracker or the OpenSWR driver
> +meson 0.44.0 or greater is required.
> +
> +Some older versions of meson do not check that they are too old and will error
> +out in odd ways.
> +
>
>  
>  The meson program is used to configure the source directory and generates
> @@ -122,12 +131,11 @@ llvm-config, so using an LLVM from a non-standard path is as easy as
>  PKG_CONFIG_PATH
>  The
>  pkg-config utility is a hard requirement for configuring and
> -building Mesa on Linux and *BSD. It is used to search for external libraries
> -on the system. This environment variable is used to control the search
> -path for pkg-config. For instance, setting
> -PKG_CONFIG_PATH=/usr/X11R6/lib/pkgconfig will search for
> -package metadata in /usr/X11R6 before the standard
> -directories.
> +building Mesa on Unix-like systems. It is used to search for external libraries
> +on the system. This environment variable is used to control the search path for
> +pkg-config. For instance, setting
> +PKG_CONFIG_PATH=/usr/X11R6/lib/pkgconfig will search for package
> +metadata in /usr/X11R6 before the standard directories.
>  
>  
>
> @@ -151,9 +159,9 @@ may interfer with debbugging as some code and validation will be optimized
>  away.
>  
>
> - For those wishing to pass their own -O option, use the "plain" buildtype,
> -which cuases meson to inject no additional compiler arguments, only those in

s/cuases/causes/


Re: [Mesa-dev] [PATCH] Update the documentation for meson

2018-03-01 Thread Brian Paul

On 03/01/2018 12:34 PM, Dylan Baker wrote:

Meson is pretty well tested and works in most configurations now, so we
can remove the warning about it being unsuited for actual use.

It's also worth documenting that meson 0.42.0 or greater is required.

Signed-off-by: Dylan Baker 
---
  docs/meson.html | 34 +-
  1 file changed, 21 insertions(+), 13 deletions(-)

diff --git a/docs/meson.html b/docs/meson.html
index 77f89b0c6c7..782cc198649 100644
--- a/docs/meson.html
+++ b/docs/meson.html
@@ -18,11 +18,20 @@
  
  1. Basic Usage
  
-The Meson build system for Mesa is still under active development,
-and should not be used in production environments.
+The Meson build system is generally considered stable and ready
+for production
  
-The meson build is currently only tested on linux, and is known to not work
-on macOS, Windows, and haiku. This will be fixed.
+The meson build is currently known to work on Linux, macOS, Cygwin, Haiku,
+FreeBSD, DragonflyBSD, and NetBSD, it is believed to work on OpenBSD.
+
+Mesa requires Meson >= 0.42.0 to build in general.
+
+Additionaly, to build the Clover OpenCL state tracker or the OpenSWR driver
+meson 0.44.0 or greater is required.
+
+Some older versions of meson do not check that they are too old and will error
+out in odd ways.
+
  
  

  The meson program is used to configure the source directory and generates
@@ -122,12 +131,11 @@ llvm-config, so using an LLVM from a non-standard path is as easy as
  PKG_CONFIG_PATH
  The
  pkg-config utility is a hard requirement for configuring and
-building Mesa on Linux and *BSD. It is used to search for external libraries
-on the system. This environment variable is used to control the search
-path for pkg-config. For instance, setting
-PKG_CONFIG_PATH=/usr/X11R6/lib/pkgconfig will search for
-package metadata in /usr/X11R6 before the standard
-directories.
+building Mesa on Unix-like systems. It is used to search for external libraries
+on the system. This environment variable is used to control the search path for
+pkg-config. For instance, setting
+PKG_CONFIG_PATH=/usr/X11R6/lib/pkgconfig will search for package
+metadata in /usr/X11R6 before the standard directories.
  
  
  
@@ -151,9 +159,9 @@ may interfer with debbugging as some code and validation will be optimized
  away.
  
  
- For those wishing to pass their own -O option, use the "plain" buildtype,
-which cuases meson to inject no additional compiler arguments, only those in
-the C/CXXFLAGS and those that mesa itself defines.
+ For those wishing to pass their own optimization flags, use the "plain"
+buildtype, which causes meson to inject no additional compiler arguments, only
+those in the C/CXXFLAGS and those that mesa itself defines.
  
  
  



Reviewed-by: Brian Paul 



Re: [Mesa-dev] [PATCH] tgsi/scan: use wrap-around shift behavior explicitly for file_mask

2018-03-01 Thread Brian Paul

On 03/01/2018 07:01 PM, srol...@vmware.com wrote:

From: Roland Scheidegger 

The comment said it will only represent the lowest 32 regs. This was
not entirely true in practice, since at least on x86 you'll get
masked shifts (unless the compiler could recognize it already and toss
it out). It turns out this actually works out alright (presumably
no one uses it for temp regs) when increasing max sampler views, so
make that behavior explicit.
Albeit it feels a bit hacky (but in any case, explicit behavior there
is better than undefined behavior).
---
  src/gallium/auxiliary/tgsi/tgsi_scan.c | 7 +--
  src/gallium/drivers/llvmpipe/lp_state_fs.c | 7 ++-
  src/gallium/drivers/swr/swr_shader.cpp | 2 +-
  3 files changed, 12 insertions(+), 4 deletions(-)

diff --git a/src/gallium/auxiliary/tgsi/tgsi_scan.c 
b/src/gallium/auxiliary/tgsi/tgsi_scan.c
index c35eff2..0d229c9 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_scan.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_scan.c
@@ -585,8 +585,11 @@ scan_declaration(struct tgsi_shader_info *info,
int buffer;
unsigned index, target, type;
  
-  /* only first 32 regs will appear in this bitfield */
-  info->file_mask[file] |= (1 << reg);
+  /*
+   * only first 32 regs will appear in this bitfield, if larger
+   * bits will wrap around.
+   */
+  info->file_mask[file] |= (1 << (reg & 31));


Or, reg % 32 and let the compiler optimize it.

Either way, Reviewed-by: Brian Paul 


info->file_count[file]++;
info->file_max[file] = MAX2(info->file_max[file], (int)reg);
  
diff --git a/src/gallium/drivers/llvmpipe/lp_state_fs.c b/src/gallium/drivers/llvmpipe/lp_state_fs.c

index 603fd84..48c004c 100644
--- a/src/gallium/drivers/llvmpipe/lp_state_fs.c
+++ b/src/gallium/drivers/llvmpipe/lp_state_fs.c
@@ -3323,7 +3323,12 @@ make_variant_key(struct llvmpipe_context *lp,
 if (shader->info.base.file_max[TGSI_FILE_SAMPLER_VIEW] != -1) {
key->nr_sampler_views = 
shader->info.base.file_max[TGSI_FILE_SAMPLER_VIEW] + 1;
for(i = 0; i < key->nr_sampler_views; ++i) {
- if(shader->info.base.file_mask[TGSI_FILE_SAMPLER_VIEW] & (1 << i)) {
+ /*
+  * Note sview may exceed what's representable by file_mask.
+  * This will still work, the only downside is that not actually
+  * used views may be included in the shader key.
+  */
+ if(shader->info.base.file_mask[TGSI_FILE_SAMPLER_VIEW] & (1 << (i & 31))) {
  lp_sampler_static_texture_state(&key->state[i].texture_state,
  lp->sampler_views[PIPE_SHADER_FRAGMENT][i]);
   }
diff --git a/src/gallium/drivers/swr/swr_shader.cpp 
b/src/gallium/drivers/swr/swr_shader.cpp
index e5fb679..fa1c0b8 100644
--- a/src/gallium/drivers/swr/swr_shader.cpp
+++ b/src/gallium/drivers/swr/swr_shader.cpp
@@ -98,7 +98,7 @@ swr_generate_sampler_key(const struct lp_tgsi_info &info,
key.nr_sampler_views =
   info.base.file_max[TGSI_FILE_SAMPLER_VIEW] + 1;
for (unsigned i = 0; i < key.nr_sampler_views; i++) {
- if (info.base.file_mask[TGSI_FILE_SAMPLER_VIEW] & (1 << i)) {
+ if (info.base.file_mask[TGSI_FILE_SAMPLER_VIEW] & (1 << (i & 31))) {
  const struct pipe_sampler_view *view =
 ctx->sampler_views[shader_type][i];
  lp_sampler_static_texture_state(





[Mesa-dev] [PATCH 4/5 v6] clover/llvm: Add get_[cl|language]_version, validation and some helpers

2018-03-01 Thread Aaron Watry
Used to calculate the default CLC language version based on the -cl-std option
in the build args and the device capabilities.

According to section 5.8.4.5 of the 2.0 spec, the CL C version is chosen by:
 1) If you have -cl-std=CL1.1+ use the version specified
 2) If not, use the highest 1.x version that the device supports

Curiously, there is no valid value for -cl-std=CL1.0

Validates requested cl-std against device_clc_version

Signed-off-by: Aaron Watry 
Cc: Pierre Moreau 

v6: (Pierre) Add more const and fix some whitespace

v5: (Aaron) Use a collection of cl versions instead of switch cases
Consolidates the string, numeric version, and clc langstandard::kind

v4: (Pierre) Split get_language_version addition and use into separate patches
Squash patches that add the helpers and validate the language standard

v3: Change device_version to device_clc_version

v2: (Pierre) Move create_compiler_instance changes to correct patch
to prevent temporary build breakage.
Convert version_str into unsigned and use it to find language version
Add build_error for unknown language version string
Whitespace fixes
---
 .../state_trackers/clover/llvm/invocation.cpp  | 63 ++
 1 file changed, 63 insertions(+)

diff --git a/src/gallium/state_trackers/clover/llvm/invocation.cpp 
b/src/gallium/state_trackers/clover/llvm/invocation.cpp
index 0bc06e..0f854b9049 100644
--- a/src/gallium/state_trackers/clover/llvm/invocation.cpp
+++ b/src/gallium/state_trackers/clover/llvm/invocation.cpp
@@ -63,6 +63,23 @@ using ::llvm::Module;
 using ::llvm::raw_string_ostream;
 
 namespace {
+
+   struct cl_version {
+  std::string version_str; // CL Version
+  unsigned version_number; // Numeric CL Version
+  clang::LangStandard::Kind clc_lang_standard; // lang standard for version
+   };
+
+   static const unsigned ANY_VERSION = 999;
+   const cl_version cl_versions[] = {
+  { "1.0", 100, clang::LangStandard::lang_opencl10},
+  { "1.1", 110, clang::LangStandard::lang_opencl11},
+  { "1.2", 120, clang::LangStandard::lang_opencl12},
+  { "2.0", 200, clang::LangStandard::lang_opencl20},
+  { "2.1", 210, clang::LangStandard::lang_unspecified}, //2.1 doesn't exist
+  { "2.2", 220, clang::LangStandard::lang_unspecified}, //2.2 doesn't exist
+   };
+
void
init_targets() {
   static bool targets_initialized = false;
@@ -93,6 +110,52 @@ namespace {
   return ctx;
}
 
+   const struct cl_version
+   get_cl_version(const std::string &version_str,
+  unsigned max = ANY_VERSION) {
+  for (const struct cl_version version : cl_versions) {
+ if (version.version_number == max || version.version_str == version_str) {
+return version;
+ }
+  }
+  throw build_error("Unknown/Unsupported language version");
+   }
+
+   clang::LangStandard::Kind
+   get_lang_standard_from_version_str(const std::string &version_str,
+  bool is_build_opt = false) {
+   /**
+   * Per CL 2.0 spec, section 5.8.4.5:
+   * If it's an option, use the value directly.
+   * If it's a device version, clamp to max 1.x version, a.k.a. 1.2
+   */
+  const struct cl_version version = get_cl_version(version_str,
+  is_build_opt ? ANY_VERSION : 120);
+  return version.clc_lang_standard;
+   }
+
+   clang::LangStandard::Kind
+   get_language_version(const std::vector<std::string> &opts,
+const std::string &device_version) {
+
+  const std::string search = "-cl-std=CL";
+
+  for (auto opt: opts) {
+ auto pos = opt.find(search);
+ if (pos == 0){
+const auto ver = opt.substr(pos + search.size());
+const auto device_ver = get_cl_version(device_version);
+const auto requested = get_cl_version(ver);
+if (requested.version_number > device_ver.version_number) {
+   throw build_error();
+}
+return get_lang_standard_from_version_str(ver, true);
+ }
+  }
+
+  return get_lang_standard_from_version_str(device_version);
+   }
+
std::unique_ptr
create_compiler_instance(const device ,
 const std::vector ,
-- 
2.14.1
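The selection rule from the commit message (use the requested -cl-std version if present and supported by the device, otherwise the highest 1.x version the device supports) can be sketched independently of clang. The following is an illustration in plain C, not the patch's code: the names clc_version, clc_version_number and select_clc_version are hypothetical, and 0 is used as the error value where the patch throws build_error. Like the table in the patch, it does not reject -cl-std=CL1.0.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical version table: string form and numeric form. */
struct clc_version {
   const char *str;
   unsigned number;
};

static const struct clc_version clc_versions[] = {
   { "1.0", 100 }, { "1.1", 110 }, { "1.2", 120 }, { "2.0", 200 },
};

/* Return the numeric version for a version string, or 0 if unknown. */
static unsigned clc_version_number(const char *str)
{
   const size_t n = sizeof(clc_versions) / sizeof(clc_versions[0]);

   for (size_t i = 0; i < n; i++) {
      if (strcmp(clc_versions[i].str, str) == 0)
         return clc_versions[i].number;
   }
   return 0;
}

/*
 * Pick the CL C version to compile with, or return 0 on error
 * (unknown version, or a request above what the device supports).
 */
static unsigned select_clc_version(const char *const *opts, size_t n_opts,
                                   const char *device_version)
{
   static const char prefix[] = "-cl-std=CL";
   const size_t prefix_len = sizeof(prefix) - 1;
   const unsigned device = clc_version_number(device_version);

   if (device == 0)
      return 0;

   for (size_t i = 0; i < n_opts; i++) {
      if (strncmp(opts[i], prefix, prefix_len) == 0) {
         const unsigned requested = clc_version_number(opts[i] + prefix_len);

         if (requested == 0 || requested > device)
            return 0;
         return requested;
      }
   }

   /* No explicit option: clamp to the highest 1.x version, i.e. 1.2. */
   return device > 120 ? 120 : device;
}
```

For a device reporting 2.0 and no -cl-std option this yields 120, which matches the clamp to 1.2 described in the patch's comment.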



[Mesa-dev] [PATCH] tgsi/scan: use wrap-around shift behavior explicitly for file_mask

2018-03-01 Thread sroland
From: Roland Scheidegger 

The comment said it will only represent the lowest 32 regs. This was
not entirely true in practice, since at least on x86 you'll get
masked shifts (unless the compiler could recognize it already and toss
it out). It turns out this actually works out alright (presumably
no one uses it for temp regs) when increasing max sampler views, so
make that behavior explicit.
Albeit it feels a bit hacky (but in any case, explicit behavior there
is better than undefined behavior).
---
 src/gallium/auxiliary/tgsi/tgsi_scan.c | 7 +--
 src/gallium/drivers/llvmpipe/lp_state_fs.c | 7 ++-
 src/gallium/drivers/swr/swr_shader.cpp | 2 +-
 3 files changed, 12 insertions(+), 4 deletions(-)

diff --git a/src/gallium/auxiliary/tgsi/tgsi_scan.c 
b/src/gallium/auxiliary/tgsi/tgsi_scan.c
index c35eff2..0d229c9 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_scan.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_scan.c
@@ -585,8 +585,11 @@ scan_declaration(struct tgsi_shader_info *info,
   int buffer;
   unsigned index, target, type;
 
-  /* only first 32 regs will appear in this bitfield */
-  info->file_mask[file] |= (1 << reg);
+  /*
+   * only first 32 regs will appear in this bitfield, if larger
+   * bits will wrap around.
+   */
+  info->file_mask[file] |= (1 << (reg & 31));
   info->file_count[file]++;
   info->file_max[file] = MAX2(info->file_max[file], (int)reg);
 
diff --git a/src/gallium/drivers/llvmpipe/lp_state_fs.c 
b/src/gallium/drivers/llvmpipe/lp_state_fs.c
index 603fd84..48c004c 100644
--- a/src/gallium/drivers/llvmpipe/lp_state_fs.c
+++ b/src/gallium/drivers/llvmpipe/lp_state_fs.c
@@ -3323,7 +3323,12 @@ make_variant_key(struct llvmpipe_context *lp,
if (shader->info.base.file_max[TGSI_FILE_SAMPLER_VIEW] != -1) {
   key->nr_sampler_views = 
shader->info.base.file_max[TGSI_FILE_SAMPLER_VIEW] + 1;
   for(i = 0; i < key->nr_sampler_views; ++i) {
- if(shader->info.base.file_mask[TGSI_FILE_SAMPLER_VIEW] & (1 << i)) {
+ /*
+  * Note sview may exceed what's representable by file_mask.
+  * This will still work, the only downside is that not actually
+  * used views may be included in the shader key.
+  */
+ if(shader->info.base.file_mask[TGSI_FILE_SAMPLER_VIEW] & (1 << (i & 31))) {
 lp_sampler_static_texture_state(&key->state[i].texture_state,
 lp->sampler_views[PIPE_SHADER_FRAGMENT][i]);
  }
diff --git a/src/gallium/drivers/swr/swr_shader.cpp 
b/src/gallium/drivers/swr/swr_shader.cpp
index e5fb679..fa1c0b8 100644
--- a/src/gallium/drivers/swr/swr_shader.cpp
+++ b/src/gallium/drivers/swr/swr_shader.cpp
@@ -98,7 +98,7 @@ swr_generate_sampler_key(const struct lp_tgsi_info &info,
   key.nr_sampler_views =
  info.base.file_max[TGSI_FILE_SAMPLER_VIEW] + 1;
   for (unsigned i = 0; i < key.nr_sampler_views; i++) {
- if (info.base.file_mask[TGSI_FILE_SAMPLER_VIEW] & (1 << i)) {
+ if (info.base.file_mask[TGSI_FILE_SAMPLER_VIEW] & (1 << (i & 31))) {
 const struct pipe_sampler_view *view =
ctx->sampler_views[shader_type][i];
 lp_sampler_static_texture_state(
-- 
2.7.4



Re: [Mesa-dev] [PATCH 4/5] clover/llvm: Add get_[cl|language]_version, validation and some helpers

2018-03-01 Thread Aaron Watry
On Thu, Mar 1, 2018 at 3:43 PM, Pierre Moreau  wrote:
> On 2018-03-01 — 13:39, Aaron Watry wrote:
>> Used to calculate the default CLC language version based on the --cl-std in 
>> build args
>> and the device capabilities.
>>
>> According to section 5.8.4.5 of the 2.0 spec, the CL C version is chosen by:
>>  1) If you have -cl-std=CL1.1+ use the version specified
>>  2) If not, use the highest 1.x version that the device supports
>>
>> Curiously, there is no valid value for -cl-std=CL1.0
>>
>> Validates requested cl-std against device_clc_version
>>
>> Signed-off-by: Aaron Watry 
>> Cc: Pierre Moreau 
>>
>> v5: (Aaron) Use a collection of cl versions instead of switch cases
>> Consolidates the string, numeric version, and clc langstandard::kind
>>
>> v4: (Pierre) Split get_language_version addition and use into separate 
>> patches
>> Squash patches that add the helpers and validate the language standard
>>
>> v3: Change device_version to device_clc_version
>>
>> v2: (Pierre) Move create_compiler_instance changes to correct patch
>> to prevent temporary build breakage.
>> Convert version_str into unsigned and use it to find language version
>> Add build_error for unknown language version string
>> Whitespace fixes
>> ---
>>  .../state_trackers/clover/llvm/invocation.cpp  | 63 
>> ++
>>  1 file changed, 63 insertions(+)
>>
>> diff --git a/src/gallium/state_trackers/clover/llvm/invocation.cpp 
>> b/src/gallium/state_trackers/clover/llvm/invocation.cpp
>> index 1924c0317f..8d76f203de 100644
>> --- a/src/gallium/state_trackers/clover/llvm/invocation.cpp
>> +++ b/src/gallium/state_trackers/clover/llvm/invocation.cpp
>> @@ -63,6 +63,23 @@ using ::llvm::Module;
>>  using ::llvm::raw_string_ostream;
>>
>>  namespace {
>> +
>> +   struct cl_version {
>
> I would rename everything that uses *cl_version* to *clc_version*, as they are
> all about the OpenCL C language version, rather than the OpenCL API version.
>
Yup, I use this set of declarations again in the last patch of the
series for determining CL version
when setting __OPENCL_VERSION__, hence the name I chose.

I'll address the other comments you had in this file which were not
related to the cl/clc confusion for now.

>> +  std::string version_str; //CL Version
>
> Minor change, but a space could be added between the comment token and the
> comment itself (same for the other comments further down).

I added a space here and on the next two comments.

In order to stay <= 80 chars, I changed the last one to:
// Lang standard for version

It was at 82 characters before anyway.

> I would go “OpenCL C” instead of just “CL” in the comment.
>
>> +  unsigned version_number; //Numeric CL Version
>> +  clang::LangStandard::Kind clc_lang_standard; //CLC standard of this 
>> version
>
> Similarly here.
>
>> +   };
>> +
>> +   static const unsigned ANY_VERSION = 999;
>> +   cl_version const cl_versions[] = {
>
> Please place “const” before the type, for consistency.

Done.

>
>> +  { "1.0", 100, clang::LangStandard::lang_opencl10},
>> +  { "1.1", 110, clang::LangStandard::lang_opencl11},
>> +  { "1.2", 120, clang::LangStandard::lang_opencl12},
>> +  { "2.0", 200, clang::LangStandard::lang_opencl20},
>> +  { "2.1", 210, clang::LangStandard::lang_unspecified}, //2.1 doesn't exist
>> +  { "2.2", 220, clang::LangStandard::lang_unspecified}, //2.2 doesn't exist
>
> You should remove 2.1 and 2.2, as those versions of OpenCL C do not exist, and
> “CL2.1” or “CL2.2” are not valid values to “-cl-std”.
>
>> +   };
>> +
>> void
>> init_targets() {
>>static bool targets_initialized = false;
>> @@ -93,6 +110,52 @@ namespace {
>>return ctx;
>> }
>>
>> +   struct cl_version
>> +   get_cl_version(const std::string &version_str,
>> +  unsigned max = ANY_VERSION) {
>> +  for (struct cl_version version : cl_versions) {
>
> You could take a constant reference here.

Done.

I also made the function return type const.


>
>> + if (version.version_number == max || version.version_str == version_str) {
>> +return version;
>> + }
>> +  }
>> +  throw build_error("Unknown/Unsupported language version");
>> +   }
>> +
>> +   clang::LangStandard::Kind
>> +   get_lang_standard_from_version_str(const std::string &version_str,
>> +  bool is_build_opt = false) {
>> +   /**
>> +   * Per CL 2.0 spec, section 5.8.4.5:
>> +   * If it's an option, use the value directly.
>> +   * If it's a device version, clamp to max 1.x version, a.k.a. 1.2
>> +   */
>> +  struct cl_version version = get_cl_version(version_str,

I made this const as well.

>> +  is_build_opt ? ANY_VERSION : 120);
>> +  return version.clc_lang_standard;
>> +   }
>> +
>> +   clang::LangStandard::Kind
>> +   

Re: [Mesa-dev] [PATCH 2/2] clover: Include generic type in several kernel/device obj() calls

2018-03-01 Thread Aaron Watry
On Thu, Mar 1, 2018, 5:19 PM Francisco Jerez  wrote:

> Aaron Watry  writes:
>
> > Fixes auto-completion for some device and kernel methods in my IDE.
> >
> > No functional change intended.
> >
>
> NAK to this one.  object.hpp goes through quite some effort to infer the
> type automatically in a way that's guaranteed correct.  I don't think we
> want to increase the syntactic burden of our codebase designing it
> around the deficient autocompletion support of some IDE.
>

Fair enough.  NetBeans C/C++ support has been kinda "meh" for a while;
it's just what I'm used to from my day job (it's much better at Java).

Consider the patch dropped.

--Aaron

>
> > Signed-off-by: Aaron Watry 
> > ---
> >  src/gallium/state_trackers/clover/api/device.cpp |  2 +-
> >  src/gallium/state_trackers/clover/api/kernel.cpp | 22
> +++---
> >  2 files changed, 12 insertions(+), 12 deletions(-)
> >
> > diff --git a/src/gallium/state_trackers/clover/api/device.cpp
> b/src/gallium/state_trackers/clover/api/device.cpp
> > index 3572bb0c92..2aaa2c59cb 100644
> > --- a/src/gallium/state_trackers/clover/api/device.cpp
> > +++ b/src/gallium/state_trackers/clover/api/device.cpp
> > @@ -98,7 +98,7 @@ CLOVER_API cl_int
> >  clGetDeviceInfo(cl_device_id d_dev, cl_device_info param,
> >  size_t size, void *r_buf, size_t *r_size) try {
> > property_buffer buf { r_buf, size, r_size };
> > -   auto  = obj(d_dev);
> > +   auto  = obj(d_dev);
> >
> > switch (param) {
> > case CL_DEVICE_TYPE:
> > diff --git a/src/gallium/state_trackers/clover/api/kernel.cpp
> b/src/gallium/state_trackers/clover/api/kernel.cpp
> > index b665773d9e..705828a688 100644
> > --- a/src/gallium/state_trackers/clover/api/kernel.cpp
> > +++ b/src/gallium/state_trackers/clover/api/kernel.cpp
> > @@ -28,7 +28,7 @@ using namespace clover;
> >
> >  CLOVER_API cl_kernel
> >  clCreateKernel(cl_program d_prog, const char *name, cl_int *r_errcode)
> try {
> > -   auto  = obj(d_prog);
> > +   auto  = obj(d_prog);
> >
> > if (!name)
> >throw error(CL_INVALID_VALUE);
> > @@ -50,7 +50,7 @@ clCreateKernel(cl_program d_prog, const char *name,
> cl_int *r_errcode) try {
> >  CLOVER_API cl_int
> >  clCreateKernelsInProgram(cl_program d_prog, cl_uint count,
> >   cl_kernel *rd_kerns, cl_uint *r_count) try {
> > -   auto  = obj(d_prog);
> > +   auto  = obj(d_prog);
> > auto  = prog.symbols();
> >
> > if (rd_kerns && count < syms.size())
> > @@ -76,7 +76,7 @@ clCreateKernelsInProgram(cl_program d_prog, cl_uint
> count,
> >
> >  CLOVER_API cl_int
> >  clRetainKernel(cl_kernel d_kern) try {
> > -   obj(d_kern).retain();
> > +   obj(d_kern).retain();
> > return CL_SUCCESS;
> >
> >  } catch (error ) {
> > @@ -85,7 +85,7 @@ clRetainKernel(cl_kernel d_kern) try {
> >
> >  CLOVER_API cl_int
> >  clReleaseKernel(cl_kernel d_kern) try {
> > -   if (obj(d_kern).release())
> > +   if (obj(d_kern).release())
> >delete pobj(d_kern);
> >
> > return CL_SUCCESS;
> > @@ -97,7 +97,7 @@ clReleaseKernel(cl_kernel d_kern) try {
> >  CLOVER_API cl_int
> >  clSetKernelArg(cl_kernel d_kern, cl_uint idx, size_t size,
> > const void *value) try {
> > -   obj(d_kern).args().at(idx).set(size, value);
> > +   obj(d_kern).args().at(idx).set(size, value);
> > return CL_SUCCESS;
> >
> >  } catch (std::out_of_range ) {
> > @@ -111,7 +111,7 @@ CLOVER_API cl_int
> >  clGetKernelInfo(cl_kernel d_kern, cl_kernel_info param,
> >  size_t size, void *r_buf, size_t *r_size) try {
> > property_buffer buf { r_buf, size, r_size };
> > -   auto  = obj(d_kern);
> > +   auto  = obj(d_kern);
> >
> > switch (param) {
> > case CL_KERNEL_FUNCTION_NAME:
> > @@ -149,7 +149,7 @@ clGetKernelWorkGroupInfo(cl_kernel d_kern,
> cl_device_id d_dev,
> >   cl_kernel_work_group_info param,
> >   size_t size, void *r_buf, size_t *r_size) try {
> > property_buffer buf { r_buf, size, r_size };
> > -   auto  = obj(d_kern);
> > +   auto  = obj(d_kern);
> > auto  = (d_dev ? *pobj(d_dev) :
> unique(kern.program().devices()));
> >
> > if (!count(dev, kern.program().devices()))
> > @@ -279,8 +279,8 @@ clEnqueueNDRangeKernel(cl_command_queue d_q,
> cl_kernel d_kern,
> > const size_t *d_grid_size, const size_t
> *d_block_size,
> > cl_uint num_deps, const cl_event *d_deps,
> > cl_event *rd_ev) try {
> > -   auto  = obj(d_q);
> > -   auto  = obj(d_kern);
> > +   auto  = obj(d_q);
> > +   auto  = obj(d_kern);
> > auto deps = objs(d_deps, num_deps);
> > auto grid_size = validate_grid_size(q, dims, d_grid_size);
> > auto grid_offset = validate_grid_offset(q, dims, d_grid_offset);
> > @@ -306,8 +306,8 @@ CLOVER_API cl_int
> >  clEnqueueTask(cl_command_queue d_q, cl_kernel d_kern,
> >cl_uint num_deps, 

[Mesa-dev] [Bug 105291] r600 [CEDAR]: GPU stalls when running shadertoy "ladybug"

2018-03-01 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=105291

--- Comment #2 from Roland Scheidegger  ---
You could figure out whether it simply takes too long by increasing the
timeout (although that needs a recompile of the kernel module).
If so, I'm not really sure what to do. (Though I think the shader actually
does spilling; it might be possible to eliminate some of it to speed things
up, but I'm not sure that it is.)

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


Re: [Mesa-dev] [PATCH 1/2] i965: Generalize intel_upload.c to support multiple uploaders.

2018-03-01 Thread Chris Wilson
Quoting Kenneth Graunke (2018-03-01 23:39:54)
> I'd like to reuse the upload logic for a new program cache, but the
> buffers will need to have a different lifetime than the default
> uploader, and also some address space restrictions.
> 
> This makes it a bit more like u_upload_mgr.

To be clear, this patch is just prep work that should involve no functional
changes? To my tired eyes, it looks like the same code with a different
lick of paint, but it would be nice to know if I missed something.
-Chris
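
For readers following the thread, the idea of a self-contained uploader can be sketched on the CPU alone. This is a minimal illustration, not the i965 code: `sketch_uploader`, `sketch_upload_init`, `sketch_upload_space` and `align_up` are hypothetical names, and a `malloc`'d block stands in for the buffer object and its mapping. Only the fields `map`, `next_offset` and `default_size` mirror the struct in the quoted patch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical CPU-only stand-in for an uploader: each instance owns its
 * own storage, so several uploaders can coexist with different lifetimes. */
struct sketch_uploader {
   uint8_t *map;          /* current backing storage, NULL if none */
   uint32_t capacity;     /* size of the current storage in bytes */
   uint32_t next_offset;  /* first unused byte in the storage */
   uint32_t default_size; /* minimum size of a freshly allocated block */
};

static void
sketch_upload_init(struct sketch_uploader *up, uint32_t default_size)
{
   up->map = NULL;
   up->capacity = 0;
   up->next_offset = 0;
   up->default_size = default_size;
}

/* Round v up to a multiple of a; works for non-power-of-two alignments. */
static uint32_t
align_up(uint32_t v, uint32_t a)
{
   assert(a != 0);
   return (v + a - 1) / a * a;
}

/* Reserve `size` bytes at the given alignment. Returns a CPU pointer and
 * stores the offset within the block in *out_offset, or returns NULL if
 * the allocation fails. */
static void *
sketch_upload_space(struct sketch_uploader *up, uint32_t size,
                    uint32_t alignment, uint32_t *out_offset)
{
   uint32_t offset = align_up(up->next_offset, alignment);

   if (up->map && offset + size > up->capacity) {
      /* Sketch only: pointers handed out earlier become invalid here.
       * A driver would presumably keep the old buffer alive through
       * reference counting instead of freeing it. */
      free(up->map);
      up->map = NULL;
   }

   if (!up->map) {
      uint32_t cap = size > up->default_size ? size : up->default_size;
      up->map = malloc(cap);
      if (!up->map)
         return NULL;
      up->capacity = cap;
      offset = 0;
   }

   up->next_offset = offset + size;
   *out_offset = offset;
   return up->map + offset;
}
```

Because all state lives in the struct rather than in the context, a second uploader with a different default size can be created by calling `sketch_upload_init` on another instance.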


[Mesa-dev] [Bug 105291] r600 [CEDAR]: GPU stalls when running shadertoy "ladybug"

2018-03-01 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=105291

--- Comment #1 from russianneuroman...@ya.ru ---
Created attachment 137739
  --> https://bugs.freedesktop.org/attachment.cgi?id=137739&action=edit
dmesg of Radeon HD 6620G hang

Same issue here when running on a SUMO iGPU with Linux 4.15.5 and Mesa 18.1 git.
Not reproducible on a TURKS dGPU. The complete dmesg is attached. If you need
me to run with any additional debug options, please let me know.

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.


[Mesa-dev] [PATCH 1/2] i965: Generalize intel_upload.c to support multiple uploaders.

2018-03-01 Thread Kenneth Graunke
I'd like to reuse the upload logic for a new program cache, but the
buffers will need to have a different lifetime than the default
uploader, and also some address space restrictions.

This makes it a bit more like u_upload_mgr.
---
 src/mesa/drivers/dri/i965/brw_context.c  |  2 +
 src/mesa/drivers/dri/i965/brw_context.h  | 14 ++--
 src/mesa/drivers/dri/i965/brw_curbe.c|  4 +-
 src/mesa/drivers/dri/i965/brw_draw_upload.c  | 30 
 src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 12 +--
 src/mesa/drivers/dri/i965/gen6_constant_state.c  | 10 +--
 src/mesa/drivers/dri/i965/intel_batchbuffer.c|  2 +-
 src/mesa/drivers/dri/i965/intel_buffer_objects.h | 21 ++---
 src/mesa/drivers/dri/i965/intel_upload.c | 97 
 9 files changed, 101 insertions(+), 91 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.c 
b/src/mesa/drivers/dri/i965/brw_context.c
index ea1c78d1fe6..2deab2e088d 100644
--- a/src/mesa/drivers/dri/i965/brw_context.c
+++ b/src/mesa/drivers/dri/i965/brw_context.c
@@ -1009,6 +1009,8 @@ brwCreateContext(gl_api api,
   return false;
}
 
+   brw_upload_init(>upload, brw->bufmgr, 65536);
+
brw_init_state(brw);
 
intelInitExtensions(ctx);
diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
b/src/mesa/drivers/dri/i965/brw_context.h
index 050b656e3da..d6e3c7807f7 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -718,6 +718,14 @@ struct brw_perf_query_info
uint32_t n_b_counter_regs;
 };
 
+struct brw_uploader {
+   struct brw_bufmgr *bufmgr;
+   struct brw_bo *bo;
+   void *map;
+   uint32_t next_offset;
+   unsigned default_size;
+};
+
 /**
  * brw_context is derived from gl_context.
  */
@@ -786,11 +794,7 @@ struct brw_context
 
struct intel_batchbuffer batch;
 
-   struct {
-  struct brw_bo *bo;
-  void *map;
-  uint32_t next_offset;
-   } upload;
+   struct brw_uploader upload;
 
/**
 * Set if rendering has occurred to the drawable's front buffer.
diff --git a/src/mesa/drivers/dri/i965/brw_curbe.c 
b/src/mesa/drivers/dri/i965/brw_curbe.c
index c747110e310..e4a2bd9c891 100644
--- a/src/mesa/drivers/dri/i965/brw_curbe.c
+++ b/src/mesa/drivers/dri/i965/brw_curbe.c
@@ -214,8 +214,8 @@ brw_upload_constant_buffer(struct brw_context *brw)
   goto emit;
}
 
-   buf = intel_upload_space(brw, bufsz, 64,
->curbe.curbe_bo, >curbe.curbe_offset);
+   buf = brw_upload_space(>upload, bufsz, 64,
+  >curbe.curbe_bo, >curbe.curbe_offset);
 
STATIC_ASSERT(sizeof(gl_constant_value) == sizeof(float));
 
diff --git a/src/mesa/drivers/dri/i965/brw_draw_upload.c 
b/src/mesa/drivers/dri/i965/brw_draw_upload.c
index 9b81999ea05..c058064403e 100644
--- a/src/mesa/drivers/dri/i965/brw_draw_upload.c
+++ b/src/mesa/drivers/dri/i965/brw_draw_upload.c
@@ -412,10 +412,10 @@ copy_array_to_vbo_array(struct brw_context *brw,
 * to replicate it out.
 */
if (src_stride == 0) {
-  intel_upload_data(brw, element->glarray->Ptr,
-element->glarray->_ElementSize,
-element->glarray->_ElementSize,
-   >bo, >offset);
+  brw_upload_data(>upload, element->glarray->Ptr,
+  element->glarray->_ElementSize,
+  element->glarray->_ElementSize,
+  >bo, >offset);
 
   buffer->stride = 0;
   buffer->size = element->glarray->_ElementSize;
@@ -425,8 +425,8 @@ copy_array_to_vbo_array(struct brw_context *brw,
const unsigned char *src = element->glarray->Ptr + min * src_stride;
int count = max - min + 1;
GLuint size = count * dst_stride;
-   uint8_t *dst = intel_upload_space(brw, size, dst_stride,
- >bo, >offset);
+   uint8_t *dst = brw_upload_space(>upload, size, dst_stride,
+   >bo, >offset);
 
/* The GL 4.5 spec says:
 *  "If any enabled array’s buffer binding is zero when DrawArrays or
@@ -699,15 +699,17 @@ brw_prepare_shader_draw_parameters(struct brw_context 
*brw)
/* For non-indirect draws, upload gl_BaseVertex. */
if ((vs_prog_data->uses_basevertex || vs_prog_data->uses_baseinstance) &&
brw->draw.draw_params_bo == NULL) {
-  intel_upload_data(brw, >draw.params, sizeof(brw->draw.params), 4,
-   >draw.draw_params_bo,
->draw.draw_params_offset);
+  brw_upload_data(>upload,
+  >draw.params, sizeof(brw->draw.params), 4,
+  >draw.draw_params_bo,
+  >draw.draw_params_offset);
}
 
if (vs_prog_data->uses_drawid) {
-  intel_upload_data(brw, >draw.gl_drawid, 
sizeof(brw->draw.gl_drawid), 4,
->draw.draw_id_bo,
->draw.draw_id_offset);
+  brw_upload_data(>upload,
+  

[Mesa-dev] [PATCH 2/2] i965: Mark upload buffers with MAP_ASYNC and MAP_PERSISTENT.

2018-03-01 Thread Kenneth Graunke
This should have no practical impact.  For the default uploader, we
don't really care, but for others, we may want to append more data
as the GPU is reading existing data, which means we need async and
persistent flags.
---
 src/mesa/drivers/dri/i965/intel_upload.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/intel_upload.c 
b/src/mesa/drivers/dri/i965/intel_upload.c
index 53dff556873..e4297bf22b4 100644
--- a/src/mesa/drivers/dri/i965/intel_upload.c
+++ b/src/mesa/drivers/dri/i965/intel_upload.c
@@ -87,7 +87,9 @@ brw_upload_space(struct brw_uploader *upload,
if (!upload->bo) {
   upload->bo = brw_bo_alloc(upload->bufmgr, "streamed data",
 MAX2(upload->default_size, size), 4096);
-  upload->map = brw_bo_map(NULL, upload->bo, MAP_READ | MAP_WRITE);
+  upload->map = brw_bo_map(NULL, upload->bo,
+   MAP_READ | MAP_WRITE |
+   MAP_PERSISTENT | MAP_ASYNC);
}
 
upload->next_offset = offset + size;
-- 
2.16.1



Re: [Mesa-dev] [PATCH 2/2] clover: Include generic type in several kernel/device obj() calls

2018-03-01 Thread Francisco Jerez
Aaron Watry  writes:

> Fixes auto-completion for some device and kernel methods in my IDE.
>
> No functional change intended.
>

NAK to this one.  object.hpp goes through quite some effort to infer the
type automatically in a way that's guaranteed correct.  I don't think we
want to increase the syntactic burden of our codebase designing it
around the deficient autocompletion support of some IDE.

> Signed-off-by: Aaron Watry 
> ---
>  src/gallium/state_trackers/clover/api/device.cpp |  2 +-
>  src/gallium/state_trackers/clover/api/kernel.cpp | 22 +++---
>  2 files changed, 12 insertions(+), 12 deletions(-)
>
> diff --git a/src/gallium/state_trackers/clover/api/device.cpp 
> b/src/gallium/state_trackers/clover/api/device.cpp
> index 3572bb0c92..2aaa2c59cb 100644
> --- a/src/gallium/state_trackers/clover/api/device.cpp
> +++ b/src/gallium/state_trackers/clover/api/device.cpp
> @@ -98,7 +98,7 @@ CLOVER_API cl_int
>  clGetDeviceInfo(cl_device_id d_dev, cl_device_info param,
>  size_t size, void *r_buf, size_t *r_size) try {
> property_buffer buf { r_buf, size, r_size };
> -   auto  = obj(d_dev);
> +   auto  = obj(d_dev);
>  
> switch (param) {
> case CL_DEVICE_TYPE:
> diff --git a/src/gallium/state_trackers/clover/api/kernel.cpp 
> b/src/gallium/state_trackers/clover/api/kernel.cpp
> index b665773d9e..705828a688 100644
> --- a/src/gallium/state_trackers/clover/api/kernel.cpp
> +++ b/src/gallium/state_trackers/clover/api/kernel.cpp
> @@ -28,7 +28,7 @@ using namespace clover;
>  
>  CLOVER_API cl_kernel
>  clCreateKernel(cl_program d_prog, const char *name, cl_int *r_errcode) try {
> -   auto  = obj(d_prog);
> +   auto  = obj(d_prog);
>  
> if (!name)
>throw error(CL_INVALID_VALUE);
> @@ -50,7 +50,7 @@ clCreateKernel(cl_program d_prog, const char *name, cl_int 
> *r_errcode) try {
>  CLOVER_API cl_int
>  clCreateKernelsInProgram(cl_program d_prog, cl_uint count,
>   cl_kernel *rd_kerns, cl_uint *r_count) try {
> -   auto  = obj(d_prog);
> +   auto  = obj(d_prog);
> auto  = prog.symbols();
>  
> if (rd_kerns && count < syms.size())
> @@ -76,7 +76,7 @@ clCreateKernelsInProgram(cl_program d_prog, cl_uint count,
>  
>  CLOVER_API cl_int
>  clRetainKernel(cl_kernel d_kern) try {
> -   obj(d_kern).retain();
> +   obj(d_kern).retain();
> return CL_SUCCESS;
>  
>  } catch (error ) {
> @@ -85,7 +85,7 @@ clRetainKernel(cl_kernel d_kern) try {
>  
>  CLOVER_API cl_int
>  clReleaseKernel(cl_kernel d_kern) try {
> -   if (obj(d_kern).release())
> +   if (obj(d_kern).release())
>delete pobj(d_kern);
>  
> return CL_SUCCESS;
> @@ -97,7 +97,7 @@ clReleaseKernel(cl_kernel d_kern) try {
>  CLOVER_API cl_int
>  clSetKernelArg(cl_kernel d_kern, cl_uint idx, size_t size,
> const void *value) try {
> -   obj(d_kern).args().at(idx).set(size, value);
> +   obj(d_kern).args().at(idx).set(size, value);
> return CL_SUCCESS;
>  
>  } catch (std::out_of_range ) {
> @@ -111,7 +111,7 @@ CLOVER_API cl_int
>  clGetKernelInfo(cl_kernel d_kern, cl_kernel_info param,
>  size_t size, void *r_buf, size_t *r_size) try {
> property_buffer buf { r_buf, size, r_size };
> -   auto  = obj(d_kern);
> +   auto  = obj(d_kern);
>  
> switch (param) {
> case CL_KERNEL_FUNCTION_NAME:
> @@ -149,7 +149,7 @@ clGetKernelWorkGroupInfo(cl_kernel d_kern, cl_device_id 
> d_dev,
>   cl_kernel_work_group_info param,
>   size_t size, void *r_buf, size_t *r_size) try {
> property_buffer buf { r_buf, size, r_size };
> -   auto  = obj(d_kern);
> +   auto  = obj(d_kern);
> auto  = (d_dev ? *pobj(d_dev) : unique(kern.program().devices()));
>  
> if (!count(dev, kern.program().devices()))
> @@ -279,8 +279,8 @@ clEnqueueNDRangeKernel(cl_command_queue d_q, cl_kernel 
> d_kern,
> const size_t *d_grid_size, const size_t *d_block_size,
> cl_uint num_deps, const cl_event *d_deps,
> cl_event *rd_ev) try {
> -   auto  = obj(d_q);
> -   auto  = obj(d_kern);
> +   auto  = obj(d_q);
> +   auto  = obj(d_kern);
> auto deps = objs(d_deps, num_deps);
> auto grid_size = validate_grid_size(q, dims, d_grid_size);
> auto grid_offset = validate_grid_offset(q, dims, d_grid_offset);
> @@ -306,8 +306,8 @@ CLOVER_API cl_int
>  clEnqueueTask(cl_command_queue d_q, cl_kernel d_kern,
>cl_uint num_deps, const cl_event *d_deps,
>cl_event *rd_ev) try {
> -   auto  = obj(d_q);
> -   auto  = obj(d_kern);
> +   auto  = obj(d_q);
> +   auto  = obj(d_kern);
> auto deps = objs(d_deps, num_deps);
>  
> validate_common(q, kern, deps);
> -- 
> 2.14.1
>



[Mesa-dev] [Bug 105320] Wrong results produced by vkCmdCopyBuffer() from storage texel buffer

2018-03-01 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=105320

            Bug ID: 105320
           Summary: Wrong results produced by vkCmdCopyBuffer() from
                    storage texel buffer
           Product: Mesa
           Version: git
          Hardware: Other
                OS: All
            Status: NEW
          Severity: normal
          Priority: medium
         Component: Drivers/Vulkan/radeon
          Assignee: mesa-dev@lists.freedesktop.org
          Reporter: joseph.ku...@gmail.com
        QA Contact: mesa-dev@lists.freedesktop.org

Created attachment 137738
  --> https://bugs.freedesktop.org/attachment.cgi?id=137738&action=edit
Sample program

The attached program produces wrong results:

Got value 3735928559, expected 8.
Got value 8, expected 7.

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.


Re: [Mesa-dev] [Mesa-stable] [PATCH] intel/fs: Set up sampler message headers in the visitor on gen7+

2018-03-01 Thread Jason Ekstrand
On Thu, Mar 1, 2018 at 12:38 PM, Francisco Jerez 
wrote:

> Jason Ekstrand  writes:
>
> > This gives the scheduler visibility into the headers which should
> > improve scheduling.  More importantly, however, it lets the scheduler
> > know that the header gets written.  As-is, the scheduler thinks that a
> > texture instruction only reads its payload and is unaware that it may
> > write to the first register so it may reorder it with respect to a read
> > from that register.  This is causing issues in a couple of Dota 2 vertex
> > shaders.
> >
>
> Yikes...  Corrupting your GRF since 2012...  Render target writes
> probably need a similar treatment.
>

Yeah... It all needs a similar treatment. :-)


> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104923
> > Cc: mesa-sta...@lists.freedesktop.org
>
> Reviewed-by: Francisco Jerez 
>

Thanks!


> > ---
> >  src/intel/compiler/brw_fs.cpp   | 40
> +
> >  src/intel/compiler/brw_fs_generator.cpp | 21 +++--
> >  2 files changed, 39 insertions(+), 22 deletions(-)
> >
> > diff --git a/src/intel/compiler/brw_fs.cpp b/src/intel/compiler/brw_fs.
> cpp
> > index 113f62c..ab8cc89 100644
> > --- a/src/intel/compiler/brw_fs.cpp
> > +++ b/src/intel/compiler/brw_fs.cpp
> > @@ -4192,17 +4192,15 @@ lower_sampler_logical_send_gen7(const
> fs_builder , fs_inst *inst, opcode op,
> > op == SHADER_OPCODE_SAMPLEINFO ||
> > is_high_sampler(devinfo, sampler)) {
> >/* For general texture offsets (no txf workaround), we need a
> header to
> > -   * put them in.  Note that we're only reserving space for it in
> the
> > -   * message payload as it will be initialized implicitly by the
> > -   * generator.
> > +   * put them in.
> > *
> > * TG4 needs to place its channel select in the header, for
> interaction
> > * with ARB_texture_swizzle.  The sampler index is only 4-bits,
> so for
> > * larger sampler numbers we need to offset the Sampler State
> Pointer in
> > * the header.
> > */
> > +  fs_reg header = retype(sources[0], BRW_REGISTER_TYPE_UD);
> >header_size = 1;
> > -  sources[0] = fs_reg();
> >length++;
> >
> >/* If we're requesting fewer than four channels worth of response,
> > @@ -4214,6 +4212,40 @@ lower_sampler_logical_send_gen7(const fs_builder
> , fs_inst *inst, opcode op,
> >   unsigned mask = ~((1 << (regs_written(inst) / reg_width)) - 1)
> & 0xf;
> >   inst->offset |= mask << 12;
> >}
> > +
> > +  /* Build the actual header */
> > +  const fs_builder ubld = bld.exec_all().group(8, 0);
> > +  const fs_builder ubld1 = ubld.group(1, 0);
> > +  ubld.MOV(header, retype(brw_vec8_grf(0, 0),
> BRW_REGISTER_TYPE_UD));
> > +  if (inst->offset) {
> > + ubld1.MOV(component(header, 2), brw_imm_ud(inst->offset));
> > +  } else if (bld.shader->stage != MESA_SHADER_VERTEX &&
> > + bld.shader->stage != MESA_SHADER_FRAGMENT) {
> > + /* The vertex and fragment stages have g0.2 set to 0, so
> > +  * header0.2 is 0 when g0 is copied. Other stages may not, so
> we
> > +  * must set it to 0 to avoid setting undesirable bits in the
> > +  * message.
> > +  */
> > + ubld1.MOV(component(header, 2), brw_imm_ud(0));
> > +  }
> > +
> > +  if (is_high_sampler(devinfo, sampler)) {
> > + if (sampler.file == BRW_IMMEDIATE_VALUE) {
> > +assert(sampler.ud >= 16);
> > +const int sampler_state_size = 16; /* 16 bytes */
> > +
> > +ubld1.ADD(component(header, 3),
> > +  retype(brw_vec1_grf(0, 3), BRW_REGISTER_TYPE_UD),
> > +  brw_imm_ud(16 * (sampler.ud / 16) *
> sampler_state_size));
> > + } else {
> > +fs_reg tmp = ubld1.vgrf(BRW_REGISTER_TYPE_UD);
> > +ubld1.AND(tmp, sampler, brw_imm_ud(0x0f0));
> > +ubld1.SHL(tmp, tmp, brw_imm_ud(4));
> > +ubld1.ADD(component(header, 3),
> > +  retype(brw_vec1_grf(0, 3), BRW_REGISTER_TYPE_UD),
> > +  tmp);
> > + }
> > +  }
> > }
> >
> > if (shadow_c.file != BAD_FILE) {
> > diff --git a/src/intel/compiler/brw_fs_generator.cpp
> b/src/intel/compiler/brw_fs_generator.cpp
> > index b59c09f..a5a821a 100644
> > --- a/src/intel/compiler/brw_fs_generator.cpp
> > +++ b/src/intel/compiler/brw_fs_generator.cpp
> > @@ -1001,19 +1001,13 @@ fs_generator::generate_tex(fs_inst *inst,
> struct brw_reg dst, struct brw_reg src
> >  * we need to set it up explicitly and load the offset bitfield.
> >  * Otherwise, we can use an implied move from g0 to the first
> message reg.
> >  */
> > -   if (inst->header_size != 0) {
> > +   if (inst->header_size != 0 && devinfo->gen < 7) {
> >if (devinfo->gen < 6 && 
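
The two sampler-state-pointer paths in the quoted hunk (immediate versus non-immediate sampler index) compute the same byte offset by different routes. The sketch below checks that arithmetic in plain C. The function names are hypothetical, and the only inputs taken from the patch are the 16-byte sampler state size, the `0x0f0` mask and the shift by 4; the equivalence is only claimed here for sampler indices below 256.

```c
#include <assert.h>
#include <stdint.h>

/* Size of one sampler state entry in bytes, as stated in the quoted patch. */
#define SAMPLER_STATE_SIZE 16u

/* Immediate path: byte offset of the 16-entry group containing `sampler`,
 * written the way the patch writes it for a compile-time constant. */
static uint32_t
sampler_group_offset_imm(uint32_t sampler)
{
   assert(sampler >= 16);
   return 16u * (sampler / 16u) * SAMPLER_STATE_SIZE;
}

/* Non-immediate path: the same value built from an AND and a shift, the
 * two ALU operations the patch emits when the index is only known at
 * run time. The mask keeps bits 4..7, i.e. 16 * (sampler / 16) for
 * sampler < 256; the shift by 4 multiplies by the 16-byte state size. */
static uint32_t
sampler_group_offset_dyn(uint32_t sampler)
{
   uint32_t tmp = sampler & 0x0f0u;
   return tmp << 4;
}
```

For example, sampler 35 lies in the group starting at index 32, and both functions return 32 * 16 = 512 bytes.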

Re: [Mesa-dev] [PATCH 05/29] intel/isl: Add a helper for inverting swizzles

2018-03-01 Thread Jason Ekstrand
On Thu, Mar 1, 2018 at 6:49 AM, Pohjolainen, Topi <
topi.pohjolai...@gmail.com> wrote:

> On Mon, Feb 26, 2018 at 08:42:42AM -0800, Jason Ekstrand wrote:
> > On Mon, Feb 26, 2018 at 6:19 AM, Pohjolainen, Topi <
> > topi.pohjolai...@gmail.com> wrote:
> >
> > > On Fri, Jan 26, 2018 at 05:59:34PM -0800, Jason Ekstrand wrote:
> > > > ---
> > > >  src/intel/isl/isl.c | 30 ++
> > > >  src/intel/isl/isl.h |  2 ++
> > > >  2 files changed, 32 insertions(+)
> > > >
> > > > diff --git a/src/intel/isl/isl.c b/src/intel/isl/isl.c
> > > > index a2d3ae6..420d387 100644
> > > > --- a/src/intel/isl/isl.c
> > > > +++ b/src/intel/isl/isl.c
> > > > @@ -2379,3 +2379,33 @@ isl_swizzle_compose(struct isl_swizzle first,
> > > struct isl_swizzle second)
> > > >.a = swizzle_select(first.a, second),
> > > > };
> > > >  }
> > > > +
> > > > +/**
> > > > + * Returns a swizzle that is the pseudo-inverse of this swizzle.
> > > > + */
> > > > +struct isl_swizzle
> > > > +isl_swizzle_invert(struct isl_swizzle swizzle)
> > > > +{
> > > > +   /* Default to zero for channels which do not show up in the
> swizzle
> > > */
> > > > +   enum isl_channel_select chans[4] = {
> > > > +  ISL_CHANNEL_SELECT_ZERO,
> > > > +  ISL_CHANNEL_SELECT_ZERO,
> > > > +  ISL_CHANNEL_SELECT_ZERO,
> > > > +  ISL_CHANNEL_SELECT_ZERO,
> > > > +   };
> > > > +
> > > > +   /* We go in ABGR order so that, if there are any duplicates, the
> > > first one
> > > > +* is taken if you look at it in RGBA order.  This is what
> Haswell
> > > hardware
> > > > +* does for render target swizzles.
> > > > +*/
> > > > +   if ((unsigned)(swizzle.a - ISL_CHANNEL_SELECT_RED) < 4)
> > > > +  chans[swizzle.a - ISL_CHANNEL_SELECT_RED] =
> > > ISL_CHANNEL_SELECT_ALPHA;
> > > > +   if ((unsigned)(swizzle.b - ISL_CHANNEL_SELECT_RED) < 4)
> > > > +  chans[swizzle.b - ISL_CHANNEL_SELECT_RED] =
> > > ISL_CHANNEL_SELECT_BLUE;
> > > > +   if ((unsigned)(swizzle.g - ISL_CHANNEL_SELECT_RED) < 4)
> > > > +  chans[swizzle.g - ISL_CHANNEL_SELECT_RED] =
> > > ISL_CHANNEL_SELECT_GREEN;
> > > > +   if ((unsigned)(swizzle.r - ISL_CHANNEL_SELECT_RED) < 4)
> > > > +  chans[swizzle.r - ISL_CHANNEL_SELECT_RED] =
> > > ISL_CHANNEL_SELECT_RED;
> > > > +
> > > > +   return (struct isl_swizzle) { chans[0], chans[1], chans[2],
> chans[3]
> > > };
> > >
> > > If given
> > >
> > > swizzle == { ISL_CHANNEL_SELECT_RED,
> > >  ISL_CHANNEL_SELECT_GREEN,
> > >  ISL_CHANNEL_SELECT_BLUE,
> > >  ISL_CHANNEL_SELECT_ALPHA },
> > >
> > > then
> > > chans[ISL_CHANNEL_SELECT_ALPHA - ISL_CHANNEL_SELECT_RED] ==
> chans[3] ==
> > > ISL_CHANNEL_SELECT_ALPHA
> > >
> > > and so on, and the function returns the same swizzle as given?
> >
> >
> > Yes, that is how the subtraction works.
>
> I was expecting it to "invert" that, i.e., to return ABGR. But okay, if
> given the identity swizzle it returns the identity.
>
> To understand how it works, I read further into the series to find an
> example; there seems to be one in patch 12 and another in patch 16.
> In the case of patch 16 and destination format B4G4R4A4, the swizzle looks
> to be BGRA (looking at anv_formats.c::main_formats).
>
> In that case we get:
>
>chans[ALPHA - RED] = chans[3] = ALPHA
>chans[RED   - RED] = chans[0] = BLUE
>chans[GREEN - RED] = chans[1] = GREEN
>chans[BLUE  - RED] = chans[2] = RED
>
> and as a swizzle BLUE, GREEN, RED, ALPHA. This is again the same as given.
> What am I not understanding?
>

I think the confusion is what "invert" means.  It doesn't mean we reverse
the channels or anything like that.  It's an inverse in the sense that when
you compose a swizzle with its inverse, you get the identity back out.
The inverse of BGRA is BGRA because if you apply the BGRA swizzle twice,
you get RGBA again.  If you start with ARGB, you get

chans[BLUE - RED] = chans[2] = ALPHA
chans[GREEN - RED] = chans[1] = BLUE
chans[RED - RED] = chans[0] = GREEN
chans[ALPHA - RED] = chans[3] = RED

This gives an inverse swizzle of GBAR which is certainly a weird swizzle.
However, if you apply ARGB and then GBAR, you get back to identity since
one is a left rotate and one is a right rotate.  Does that make a bit more
sense?
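
The ARGB/GBAR example above can be checked with a small standalone program. This is a sketch under stated assumptions rather than the ISL code: `enum chan`, `struct swizzle`, `swizzle_select`, `swizzle_compose` and `swizzle_invert` are hypothetical stand-ins, and the enum values are chosen only so that the four color channels are contiguous, which the range check from the quoted patch relies on.

```c
#include <assert.h>

/* Hypothetical stand-ins for the ISL channel selects. */
enum chan { CH_ZERO = 0, CH_ONE = 1, CH_RED = 4, CH_GREEN, CH_BLUE, CH_ALPHA };

static_assert(CH_ALPHA - CH_RED == 3, "color channels must be contiguous");

struct swizzle { enum chan r, g, b, a; };

/* Resolve channel select c through swizzle s: a color channel is replaced
 * by whatever s holds in that slot, ZERO and ONE pass through unchanged. */
static enum chan
swizzle_select(enum chan c, struct swizzle s)
{
   switch (c) {
   case CH_RED:   return s.r;
   case CH_GREEN: return s.g;
   case CH_BLUE:  return s.b;
   case CH_ALPHA: return s.a;
   default:       return c;
   }
}

/* Compose two swizzles by routing each channel of `first` through `second`. */
static struct swizzle
swizzle_compose(struct swizzle first, struct swizzle second)
{
   struct swizzle out = {
      swizzle_select(first.r, second),
      swizzle_select(first.g, second),
      swizzle_select(first.b, second),
      swizzle_select(first.a, second),
   };
   return out;
}

/* Pseudo-inverse, following the logic of the quoted patch: channels that do
 * not appear in the input default to ZERO, and walking A, B, G, R in that
 * order means the first duplicate in RGBA order wins. */
static struct swizzle
swizzle_invert(struct swizzle s)
{
   enum chan chans[4] = { CH_ZERO, CH_ZERO, CH_ZERO, CH_ZERO };

   if ((unsigned)(s.a - CH_RED) < 4)
      chans[s.a - CH_RED] = CH_ALPHA;
   if ((unsigned)(s.b - CH_RED) < 4)
      chans[s.b - CH_RED] = CH_BLUE;
   if ((unsigned)(s.g - CH_RED) < 4)
      chans[s.g - CH_RED] = CH_GREEN;
   if ((unsigned)(s.r - CH_RED) < 4)
      chans[s.r - CH_RED] = CH_RED;

   struct swizzle out = { chans[0], chans[1], chans[2], chans[3] };
   return out;
}
```

With these definitions, inverting ARGB yields GBAR, composing the two in either order yields RGBA, and BGRA comes out as its own inverse, matching the reasoning in the message.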


Re: [Mesa-dev] [PATCH 1/2] clover: Allow overriding platform/device version numbers

2018-03-01 Thread Francisco Jerez
Aaron Watry  writes:

> Useful for testing API, builtin library, and device completeness of
> not-yet-supported versions.
>
> Signed-off-by: Aaron Watry 
> Cc: Pierre Moreau 
> Cc: Jan Vesely 
> Cc: Francisco Jerez 
> ---
>  src/gallium/state_trackers/clover/api/platform.cpp | 7 ++-
>  src/gallium/state_trackers/clover/core/device.cpp  | 5 +++--
>  2 files changed, 9 insertions(+), 3 deletions(-)
>
> diff --git a/src/gallium/state_trackers/clover/api/platform.cpp 
> b/src/gallium/state_trackers/clover/api/platform.cpp
> index ed86163311..8cb2718973 100644
> --- a/src/gallium/state_trackers/clover/api/platform.cpp
> +++ b/src/gallium/state_trackers/clover/api/platform.cpp
> @@ -23,6 +23,7 @@
>  #include "api/util.hpp"
>  #include "core/platform.hpp"
>  #include "git_sha1.h"
> +#include "util/u_debug.h"
>  
>  using namespace clover;
>  
> @@ -51,6 +52,7 @@ clover::GetPlatformInfo(cl_platform_id d_platform, 
> cl_platform_info param,
> property_buffer buf { r_buf, size, r_size };
>  
> obj(d_platform);
> +   std::string version_string;
>  
> switch (param) {
> case CL_PLATFORM_PROFILE:
> @@ -58,7 +60,10 @@ clover::GetPlatformInfo(cl_platform_id d_platform, 
> cl_platform_info param,
>break;
>  
> case CL_PLATFORM_VERSION:
> -  buf.as_string() = "OpenCL 1.1 Mesa " PACKAGE_VERSION
> +  version_string = std::string(
> +debug_get_option("CLOVER_PLATFORM_VERSION_OVERRIDE", "1.1"));

Can you make the version_string declaration local to this case block and
mark as const?  With that fixed:

Reviewed-by: Francisco Jerez 

> +
> +  buf.as_string() = "OpenCL " + version_string + " Mesa " PACKAGE_VERSION
>  #ifdef MESA_GIT_SHA1
>  " (" MESA_GIT_SHA1 ")"
>  #endif
> diff --git a/src/gallium/state_trackers/clover/core/device.cpp 
> b/src/gallium/state_trackers/clover/core/device.cpp
> index 71cf4bf60a..245d728886 100644
> --- a/src/gallium/state_trackers/clover/core/device.cpp
> +++ b/src/gallium/state_trackers/clover/core/device.cpp
> @@ -25,6 +25,7 @@
>  #include "core/platform.hpp"
>  #include "pipe/p_screen.h"
>  #include "pipe/p_state.h"
> +#include "util/u_debug.h"
>  
>  using namespace clover;
>  
> @@ -268,10 +269,10 @@ device::endianness() const {
>  
>  std::string
>  device::device_version() const {
> -return "1.1";
> +   return std::string(debug_get_option("CLOVER_DEVICE_VERSION_OVERRIDE", 
> "1.1"));
>  }
>  
>  std::string
>  device::device_clc_version() const {
> -return "1.1";
> +   return std::string(debug_get_option("CLOVER_DEVICE_CLC_VERSION_OVERRIDE", 
> "1.1"));
>  }
> -- 
> 2.14.1
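
The override pattern in the quoted patch, together with the review request to keep the version string local and const, can be illustrated in plain C. This is a sketch only: `option_or_default` is a hypothetical stand-in for Mesa's `debug_get_option()` built on `getenv`, and `format_platform_version` is an invented helper, not a clover function.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for debug_get_option(): return the value of the
 * environment variable `name` if it is set, otherwise `dfault`. */
static const char *
option_or_default(const char *name, const char *dfault)
{
   const char *value = getenv(name);
   return value ? value : dfault;
}

/* Write "OpenCL <version> Mesa <package_version>" into buf and return the
 * result of snprintf (the length the full string needs). */
static int
format_platform_version(char *buf, size_t len, const char *package_version)
{
   assert(buf != NULL && len > 0);

   /* Declared where it is used and marked const, in the spirit of the
    * review comment about version_string. */
   const char *const version =
      option_or_default("CLOVER_PLATFORM_VERSION_OVERRIDE", "1.1");

   return snprintf(buf, len, "OpenCL %s Mesa %s", version, package_version);
}
```

With the environment variable unset, the helper reports version 1.1; setting `CLOVER_PLATFORM_VERSION_OVERRIDE` before the call changes only the version field of the string.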




[Mesa-dev] [Bug 105161] KHR_blend_equation_advanced doesn't work in GLSL 1.10-1.40 shaders

2018-03-01 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=105161

Kenneth Graunke  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |REOPENED
         Resolution|NOTABUG                     |---
            Summary|Validation of               |KHR_blend_equation_advanced
                   |KHR_blend_equation_advanced |doesn't work in GLSL
                   |stricter than NVidia        |1.10-1.40 shaders

-- 
You are receiving this mail because:
You are the QA Contact for the bug.


Re: [Mesa-dev] [PATCH 3/5] clover/llvm: Pass device down to compile

2018-03-01 Thread Francisco Jerez
Aaron Watry  writes:

> We'll need to be able to detect device version to define the appropriate
> __OPENCL_VERSION__ header.
>
> v2: Rebase after removing the previous patch (Pierre)
>   - Removed "clover: Add device_clc_version to llvm::create_compiler_instance"
>
> Signed-off-by: Aaron Watry 
> Cc: Pierre Moreau 

Patches 1-3 are:

Reviewed-by: Francisco Jerez 

> ---
>  src/gallium/state_trackers/clover/llvm/invocation.cpp | 7 +++
>  1 file changed, 3 insertions(+), 4 deletions(-)
>
> diff --git a/src/gallium/state_trackers/clover/llvm/invocation.cpp 
> b/src/gallium/state_trackers/clover/llvm/invocation.cpp
> index 42aabfb9a3..1924c0317f 100644
> --- a/src/gallium/state_trackers/clover/llvm/invocation.cpp
> +++ b/src/gallium/state_trackers/clover/llvm/invocation.cpp
> @@ -146,7 +146,7 @@ namespace {
> std::unique_ptr
> compile(LLVMContext , clang::CompilerInstance ,
> const std::string , const std::string ,
> -   const header_map , const std::string ,
> +   const header_map , const device ,
> const std::string , std::string _log) {
>c.getFrontendOpts().ProgramAction = clang::frontend::EmitLLVMOnly;
>c.getHeaderSearchOpts().UseBuiltinIncludes = true;
> @@ -190,7 +190,7 @@ namespace {
>// barrier() (e.g. Moving barrier() inside a conditional that is
>// no executed by all threads) during its optimizaton passes.
>compat::add_link_bitcode_file(c.getCodeGenOpts(),
> -LIBCLC_LIBEXECDIR + target + ".bc");
> +LIBCLC_LIBEXECDIR + dev.ir_target() + 
> ".bc");
>  
>// Compile the code
>clang::EmitLLVMOnlyAction act();
> @@ -212,8 +212,7 @@ clover::llvm::compile_program(const std::string ,
>  
> auto ctx = create_context(r_log);
> auto c = create_compiler_instance(dev, tokenize(opts + " input.cl"), 
> r_log);
> -   auto mod = compile(*ctx, *c, "input.cl", source, headers, dev.ir_target(),
> -  opts, r_log);
> +   auto mod = compile(*ctx, *c, "input.cl", source, headers, dev, opts, 
> r_log);
>  
> if (has_flag(debug::llvm))
>debug::log(".ll", print_module_bitcode(*mod));
> -- 
> 2.14.1
>




Re: [Mesa-dev] 2018 X.Org Board of Directors Elections Nomination period is NOW

2018-03-01 Thread Rob Clark
All,

We have extended the deadline for nominations until 9 Mar 2018.  We
currently have four nominees for four seats, but we would like to have
at least another candidate or two, so please consider stepping up and
nominating yourself or a friend!

BR,
-R

On Fri, Feb 9, 2018 at 9:01 AM, Rob Clark  wrote:
> We are seeking nominations for candidates for election to the X.Org
> Foundation Board of Directors. All X.Org Foundation members are
> eligible for election to the board.
>
> Nominations for the 2018 election are now open and will remain open
> until 23:59 UTC on 23 Feb 2018.
>
> The Board consists of directors elected from the membership. Each
> year, an election is held to bring the total number of directors to
> eight. The four members receiving the highest vote totals will serve
> as directors for two year terms.
>
> The directors who received two year terms starting in 2017 were Rob
> Clark, Martin Peres, Taylor Campbell and Daniel Vetter. They will
> continue to serve until their term ends in 2019. Current directors
> whose term expires in 2018 are Alex Deucher, Egbert Eich, Keith
> Packard and Bryce Harrington.
>
> A director is expected to participate in the fortnightly IRC meeting
> to discuss current business and to attend the annual meeting of the
> X.Org Foundation, which will be held at a location determined in
> advance by the Board of Directors.
>
> A member may nominate themselves or any other member they feel is
> qualified. Nominations should be sent to the Election Committee at
> elections at x.org.
>
> Nominees shall be required to be current members of the X.Org
> Foundation, and submit a personal statement of up to 200 words that
> will be provided to prospective voters. The collected statements,
> along with the statement of contribution to the X.Org Foundation in
> the members account page on http://members.x.org, will be made
> available to all voters to help them make their voting decisions.
>
> Nominations, membership applications or renewals and completed
> personal statements must be received no later than 23:59 UTC on 23 Feb
> 2018.
>
> The slate of candidates will be published 1 Mar 2018 and candidate Q&A
> will begin then. The deadline for Xorg membership applications and
> renewals is 1 Mar 2018.
>
> Cheers, Rob Clark, on behalf of the X.Org BoD
> https://www.x.org/wiki/BoardOfDirectors/Elections/2018/


[Mesa-dev] [Bug 105255] Waiting for fences without waitAll is not implemented

2018-03-01 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=105255

Józef Kucia  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #3 from Józef Kucia  ---
Thanks for fixing it quickly.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


Re: [Mesa-dev] [PATCH] st/glsl_to_nir: simplify st_nir_assign_var_locations() and fix for fs outputs

2018-03-01 Thread Timothy Arceri

Just FYI this has been reviewed over IRC and pushed.


Re: [Mesa-dev] [PATCH] ac: fix nir_intrinsic_shared_atomic_comp_swap handling

2018-03-01 Thread Timothy Arceri

Just FYI this has been reviewed over IRC and pushed.


Re: [Mesa-dev] [PATCH 1/2] clover: Allow overriding platform/device version numbers

2018-03-01 Thread Pierre Moreau
Reviewed-by: Pierre Moreau 

On 2018-03-01 — 13:44, Aaron Watry wrote:
> Useful for testing API, builtin library, and device completeness of
> not-yet-supported versions.
> 
> Signed-off-by: Aaron Watry 
> Cc: Pierre Moreau 
> Cc: Jan Vesely 
> Cc: Francisco Jerez 
> ---
>  src/gallium/state_trackers/clover/api/platform.cpp | 7 ++-
>  src/gallium/state_trackers/clover/core/device.cpp  | 5 +++--
>  2 files changed, 9 insertions(+), 3 deletions(-)
> 
> diff --git a/src/gallium/state_trackers/clover/api/platform.cpp 
> b/src/gallium/state_trackers/clover/api/platform.cpp
> index ed86163311..8cb2718973 100644
> --- a/src/gallium/state_trackers/clover/api/platform.cpp
> +++ b/src/gallium/state_trackers/clover/api/platform.cpp
> @@ -23,6 +23,7 @@
>  #include "api/util.hpp"
>  #include "core/platform.hpp"
>  #include "git_sha1.h"
> +#include "util/u_debug.h"
>  
>  using namespace clover;
>  
> @@ -51,6 +52,7 @@ clover::GetPlatformInfo(cl_platform_id d_platform, 
> cl_platform_info param,
> property_buffer buf { r_buf, size, r_size };
>  
> obj(d_platform);
> +   std::string version_string;
>  
> switch (param) {
> case CL_PLATFORM_PROFILE:
> @@ -58,7 +60,10 @@ clover::GetPlatformInfo(cl_platform_id d_platform, 
> cl_platform_info param,
>break;
>  
> case CL_PLATFORM_VERSION:
> -  buf.as_string() = "OpenCL 1.1 Mesa " PACKAGE_VERSION
> +  version_string = std::string(
> +debug_get_option("CLOVER_PLATFORM_VERSION_OVERRIDE", "1.1"));
> +
> +  buf.as_string() = "OpenCL " + version_string + " Mesa " PACKAGE_VERSION
>  #ifdef MESA_GIT_SHA1
>  " (" MESA_GIT_SHA1 ")"
>  #endif
> diff --git a/src/gallium/state_trackers/clover/core/device.cpp 
> b/src/gallium/state_trackers/clover/core/device.cpp
> index 71cf4bf60a..245d728886 100644
> --- a/src/gallium/state_trackers/clover/core/device.cpp
> +++ b/src/gallium/state_trackers/clover/core/device.cpp
> @@ -25,6 +25,7 @@
>  #include "core/platform.hpp"
>  #include "pipe/p_screen.h"
>  #include "pipe/p_state.h"
> +#include "util/u_debug.h"
>  
>  using namespace clover;
>  
> @@ -268,10 +269,10 @@ device::endianness() const {
>  
>  std::string
>  device::device_version() const {
> -return "1.1";
> +   return std::string(debug_get_option("CLOVER_DEVICE_VERSION_OVERRIDE", 
> "1.1"));
>  }
>  
>  std::string
>  device::device_clc_version() const {
> -return "1.1";
> +   return std::string(debug_get_option("CLOVER_DEVICE_CLC_VERSION_OVERRIDE", 
> "1.1"));
>  }
> -- 
> 2.14.1
> 
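For readers following the override mechanism: the patch boils down to reading an environment variable and falling back to "1.1" when it is unset. A minimal standalone sketch of that pattern follows, assuming std::getenv as a stand-in for Mesa's debug_get_option(); get_option_or() and platform_version_string() are hypothetical names used only for illustration, not clover functions.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Hypothetical stand-in for debug_get_option(): return the value of the
// environment variable `name`, or `fallback` when the variable is not set.
inline std::string
get_option_or(const char *name, const char *fallback) {
   assert(name && fallback);
   const char *value = std::getenv(name);
   return std::string(value ? value : fallback);
}

// Builds a platform version string in the same shape as the patch:
// "OpenCL <version> Mesa <package version>".
inline std::string
platform_version_string(const std::string &package_version) {
   return "OpenCL " +
          get_option_or("CLOVER_PLATFORM_VERSION_OVERRIDE", "1.1") +
          " Mesa " + package_version;
}
```

In this sketch, running with CLOVER_PLATFORM_VERSION_OVERRIDE=1.2 in the environment makes platform_version_string("X") return "OpenCL 1.2 Mesa X"; with the variable unset it returns "OpenCL 1.1 Mesa X".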




Re: [Mesa-dev] [PATCH 4/5] clover/llvm: Add get_[cl|language]_version, validation and some helpers

2018-03-01 Thread Pierre Moreau
On 2018-03-01 — 22:43, Pierre Moreau wrote:
> On 2018-03-01 — 13:39, Aaron Watry wrote:
> > Used to calculate the default CLC language version based on the --cl-std in 
> > build args
> > and the device capabilities.
> > 
> > According to section 5.8.4.5 of the 2.0 spec, the CL C version is chosen by:
> >  1) If you have -cl-std=CL1.1+ use the version specified
> >  2) If not, use the highest 1.x version that the device supports
> > 
> > Curiously, there is no valid value for -cl-std=CL1.0
> > 
> > Validates requested cl-std against device_clc_version
> > 
> > Signed-off-by: Aaron Watry 
> > Cc: Pierre Moreau 
> > 
> > v5: (Aaron) Use a collection of cl versions instead of switch cases
> > Consolidates the string, numeric version, and clc langstandard::kind
> > 
> > v4: (Pierre) Split get_language_version addition and use into separate 
> > patches
> > Squash patches that add the helpers and validate the language standard
> > 
> > v3: Change device_version to device_clc_version
> > 
> > v2: (Pierre) Move create_compiler_instance changes to correct patch
> > to prevent temporary build breakage.
> > Convert version_str into unsigned and use it to find language version
> > Add build_error for unknown language version string
> > Whitespace fixes
> > ---
> >  .../state_trackers/clover/llvm/invocation.cpp  | 63 
> > ++
> >  1 file changed, 63 insertions(+)
> > 
> > diff --git a/src/gallium/state_trackers/clover/llvm/invocation.cpp 
> > b/src/gallium/state_trackers/clover/llvm/invocation.cpp
> > index 1924c0317f..8d76f203de 100644
> > --- a/src/gallium/state_trackers/clover/llvm/invocation.cpp
> > +++ b/src/gallium/state_trackers/clover/llvm/invocation.cpp
> > @@ -63,6 +63,23 @@ using ::llvm::Module;
> >  using ::llvm::raw_string_ostream;
> >  
> >  namespace {
> > +
> > +   struct cl_version {
> 
> I would rename everything that uses *cl_version* to *clc_version*, as they are
> all about the OpenCL C language version, rather than the OpenCL API version.

Hm, I just saw that you use it for converting the device OpenCL API version as
well. I need to have another look at this patch.

> 
> > +  std::string version_str; //CL Version
> 
> Minor change, but a space could be added between the comment token and the
> comment itself (same for the other comments further down).
> I would go “OpenCL C” instead of just “CL” in the comment.
> 
> > +  unsigned version_number; //Numeric CL Version
> > +  clang::LangStandard::Kind clc_lang_standard; //CLC standard of this 
> > version
> 
> Similarly here.
> 
> > +   };
> > +
> > +   static const unsigned ANY_VERSION = 999;
> > +   cl_version const cl_versions[] = {
> 
> Please place “const” before the type, for consistency.
> 
> > +  { "1.0", 100, clang::LangStandard::lang_opencl10},
> > +  { "1.1", 110, clang::LangStandard::lang_opencl11},
> > +  { "1.2", 120, clang::LangStandard::lang_opencl12},
> > +  { "2.0", 200, clang::LangStandard::lang_opencl20},
> > +  { "2.1", 210, clang::LangStandard::lang_unspecified}, //2.1 doesn't 
> > exist
> > +  { "2.2", 220, clang::LangStandard::lang_unspecified}, //2.2 doesn't 
> > exist
> 
> You should remove 2.1 and 2.2, as those versions of OpenCL C do not exist, and
> “CL2.1” or “CL2.2” are not valid values to “-cl-std”.
> 
> > +   };
> > +
> > void
> > init_targets() {
> >static bool targets_initialized = false;
> > @@ -93,6 +110,52 @@ namespace {
> >return ctx;
> > }
> >  
> > +   struct cl_version
> > +   get_cl_version(const std::string _str,
> > +  unsigned max = ANY_VERSION) {
> > +  for (struct cl_version version : cl_versions) {
> 
> You could take a constant reference here.
> 
> > + if (version.version_number == max || version.version_str == 
> > version_str) {
> > +return version;
> > + }
> > +  }
> > +  throw build_error("Unknown/Unsupported language version");
> > +   }
> > +
> > +   clang::LangStandard::Kind
> > +   get_lang_standard_from_version_str(const std::string _str,
> > +  bool is_build_opt = false) {
> > +   /**
> > +   * Per CL 2.0 spec, section 5.8.4.5:
> > +   * If it's an option, use the value directly.
> > +   * If it's a device version, clamp to max 1.x version, a.k.a. 1.2
> > +   */
> > +  struct cl_version version = get_cl_version(version_str,
> > +  is_build_opt ? ANY_VERSION : 120);
> > +  return version.clc_lang_standard;
> > +   }
> > +
> > +   clang::LangStandard::Kind
> > +   get_language_version(const std::vector ,
> > +const std::string _version) {
> > +
> > +  const std::string search = "-cl-std=CL";
> > +
> > +  for(auto opt: opts){
> 
> Missing spaces after “for” and before ‘{’.
> 
> > + auto pos = opt.find(search);
> > + if (pos == 0){
> > +   

Re: [Mesa-dev] [PATCH 4/5] clover/llvm: Add get_[cl|language]_version, validation and some helpers

2018-03-01 Thread Pierre Moreau
On 2018-03-01 — 13:39, Aaron Watry wrote:
> Used to calculate the default CLC language version based on the --cl-std in 
> build args
> and the device capabilities.
> 
> According to section 5.8.4.5 of the 2.0 spec, the CL C version is chosen by:
>  1) If you have -cl-std=CL1.1+ use the version specified
>  2) If not, use the highest 1.x version that the device supports
> 
> Curiously, there is no valid value for -cl-std=CL1.0
> 
> Validates requested cl-std against device_clc_version
> 
> Signed-off-by: Aaron Watry 
> Cc: Pierre Moreau 
> 
> v5: (Aaron) Use a collection of cl versions instead of switch cases
> Consolidates the string, numeric version, and clc langstandard::kind
> 
> v4: (Pierre) Split get_language_version addition and use into separate patches
> Squash patches that add the helpers and validate the language standard
> 
> v3: Change device_version to device_clc_version
> 
> v2: (Pierre) Move create_compiler_instance changes to correct patch
> to prevent temporary build breakage.
> Convert version_str into unsigned and use it to find language version
> Add build_error for unknown language version string
> Whitespace fixes
> ---
>  .../state_trackers/clover/llvm/invocation.cpp  | 63 
> ++
>  1 file changed, 63 insertions(+)
> 
> diff --git a/src/gallium/state_trackers/clover/llvm/invocation.cpp 
> b/src/gallium/state_trackers/clover/llvm/invocation.cpp
> index 1924c0317f..8d76f203de 100644
> --- a/src/gallium/state_trackers/clover/llvm/invocation.cpp
> +++ b/src/gallium/state_trackers/clover/llvm/invocation.cpp
> @@ -63,6 +63,23 @@ using ::llvm::Module;
>  using ::llvm::raw_string_ostream;
>  
>  namespace {
> +
> +   struct cl_version {

I would rename everything that uses *cl_version* to *clc_version*, as they are
all about the OpenCL C language version, rather than the OpenCL API version.

> +  std::string version_str; //CL Version

Minor change, but a space could be added between the comment token and the
comment itself (same for the other comments further down).
I would go “OpenCL C” instead of just “CL” in the comment.

> +  unsigned version_number; //Numeric CL Version
> +  clang::LangStandard::Kind clc_lang_standard; //CLC standard of this 
> version

Similarly here.

> +   };
> +
> +   static const unsigned ANY_VERSION = 999;
> +   cl_version const cl_versions[] = {

Please place “const” before the type, for consistency.

> +  { "1.0", 100, clang::LangStandard::lang_opencl10},
> +  { "1.1", 110, clang::LangStandard::lang_opencl11},
> +  { "1.2", 120, clang::LangStandard::lang_opencl12},
> +  { "2.0", 200, clang::LangStandard::lang_opencl20},
> +  { "2.1", 210, clang::LangStandard::lang_unspecified}, //2.1 doesn't 
> exist
> +  { "2.2", 220, clang::LangStandard::lang_unspecified}, //2.2 doesn't 
> exist

You should remove 2.1 and 2.2, as those versions of OpenCL C do not exist, and
“CL2.1” or “CL2.2” are not valid values to “-cl-std”.
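
To make the review suggestions on this table concrete, here is a rough sketch with the struct renamed to clc_version, const placed first, the 2.1/2.2 entries dropped, and the lookup iterating by constant reference. It assumes a hypothetical lang_kind enum in place of clang::LangStandard::Kind and std::runtime_error in place of build_error so that it compiles on its own; it is not the code from the patch.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for clang::LangStandard::Kind.
enum class lang_kind { opencl10, opencl11, opencl12, opencl20 };

struct clc_version {
   std::string version_str;      // OpenCL C version string
   unsigned version_number;      // Numeric OpenCL C version
   lang_kind clc_lang_standard;  // Language standard for this version
};

static const unsigned ANY_VERSION = 999;

// Only OpenCL C versions that exist are listed; 2.1 and 2.2 are left out.
static const clc_version clc_versions[] = {
   { "1.0", 100, lang_kind::opencl10 },
   { "1.1", 110, lang_kind::opencl11 },
   { "1.2", 120, lang_kind::opencl12 },
   { "2.0", 200, lang_kind::opencl20 },
};

// Same matching rule as the quoted get_cl_version(): return the first
// entry whose number equals `max` or whose string equals `version_str`.
inline const clc_version &
get_clc_version(const std::string &version_str,
                unsigned max = ANY_VERSION) {
   for (const clc_version &version : clc_versions) {
      if (version.version_number == max ||
          version.version_str == version_str)
         return version;
   }
   throw std::runtime_error("Unknown/Unsupported language version");
}
```

In this sketch get_clc_version("2.0", 120) returns the 1.2 entry, which is how the quoted get_lang_standard_from_version_str() clamps a device version to the highest 1.x standard, while get_clc_version("2.1") throws.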

> +   };
> +
> void
> init_targets() {
>static bool targets_initialized = false;
> @@ -93,6 +110,52 @@ namespace {
>return ctx;
> }
>  
> +   struct cl_version
> +   get_cl_version(const std::string _str,
> +  unsigned max = ANY_VERSION) {
> +  for (struct cl_version version : cl_versions) {

You could take a constant reference here.

> + if (version.version_number == max || version.version_str == 
> version_str) {
> +return version;
> + }
> +  }
> +  throw build_error("Unknown/Unsupported language version");
> +   }
> +
> +   clang::LangStandard::Kind
> +   get_lang_standard_from_version_str(const std::string _str,
> +  bool is_build_opt = false) {
> +   /**
> +   * Per CL 2.0 spec, section 5.8.4.5:
> +   * If it's an option, use the value directly.
> +   * If it's a device version, clamp to max 1.x version, a.k.a. 1.2
> +   */
> +  struct cl_version version = get_cl_version(version_str,
> +  is_build_opt ? ANY_VERSION : 120);
> +  return version.clc_lang_standard;
> +   }
> +
> +   clang::LangStandard::Kind
> +   get_language_version(const std::vector ,
> +const std::string _version) {
> +
> +  const std::string search = "-cl-std=CL";
> +
> +  for(auto opt: opts){

Missing spaces after “for” and before ‘{’.

> + auto pos = opt.find(search);
> + if (pos == 0){
> +auto ver = opt.substr(pos+search.size());

You could have spaces around the ‘+’.
And variables here could be marked as constant.
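
The option handling under review can be sketched in isolation as follows. This is only an illustration under stated assumptions: versions are reduced to plain numbers (110, 120, 200), version_number() and select_clc_version() are hypothetical helpers, std::invalid_argument stands in for build_error, and the fallback used when no -cl-std option is present follows the rule from the commit message (highest 1.x version the device supports), because the rest of the function is cut off in this message.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: turn "1.2" into 120. Only the "<digit>.<digit>"
// form is accepted; anything else is rejected.
inline unsigned
version_number(const std::string &str) {
   if (str.size() != 3 || str[1] != '.' ||
       str[0] < '0' || str[0] > '9' || str[2] < '0' || str[2] > '9')
      throw std::invalid_argument("Unknown/Unsupported language version");
   const unsigned major = static_cast<unsigned>(str[0] - '0');
   const unsigned minor = static_cast<unsigned>(str[2] - '0');
   return major * 100 + minor * 10;
}

// Returns the OpenCL C version number to compile for. An explicit
// -cl-std=CLx.y option wins but must not exceed the device version;
// without one, the device version is clamped to 1.2 (assumed fallback).
inline unsigned
select_clc_version(const std::vector<std::string> &opts,
                   const std::string &device_clc_version) {
   const std::string search = "-cl-std=CL";
   const unsigned device_ver = version_number(device_clc_version);

   for (const std::string &opt : opts) {
      if (opt.find(search) == 0) {
         const unsigned requested =
            version_number(opt.substr(search.size()));
         if (requested > device_ver)
            throw std::invalid_argument("requested version exceeds device");
         return requested;
      }
   }
   return device_ver < 120 ? device_ver : 120;
}
```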

> +auto device_ver = get_cl_version(device_version);
> +auto requested = get_cl_version(ver);
> +if (requested.version_number > device_ver.version_number) {
> +   throw build_error();
> +}
> +return 

[Mesa-dev] [PATCH 3/3] radv: report the scratch private memory size with shader stats

2018-03-01 Thread Samuel Pitoiset
Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_shader.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/src/amd/vulkan/radv_shader.c b/src/amd/vulkan/radv_shader.c
index 171802eede..d216408074 100644
--- a/src/amd/vulkan/radv_shader.c
+++ b/src/amd/vulkan/radv_shader.c
@@ -666,13 +666,15 @@ generate_shader_stats(struct radv_device *device,
   "VGPRS: %d\n"
   "Spilled SGPRs: %d\n"
   "Spilled VGPRs: %d\n"
+  "PrivMem VGPRS: %d\n"
   "Code Size: %d bytes\n"
   "LDS: %d blocks\n"
   "Scratch: %d bytes per wave\n"
   "Max Waves: %d\n"
   "\n\n\n",
   conf->num_sgprs, conf->num_vgprs,
-  conf->spilled_sgprs, conf->spilled_vgprs, variant->code_size,
+  conf->spilled_sgprs, conf->spilled_vgprs,
+  variant->info.private_mem_vgprs, variant->code_size,
   conf->lds_size, conf->scratch_bytes_per_wave,
   max_simd_waves);
 }
-- 
2.16.2



[Mesa-dev] [PATCH 1/3] ac: add ac_count_scratch_private_memory()

2018-03-01 Thread Samuel Pitoiset
Imported from RadeonSI.

Signed-off-by: Samuel Pitoiset 
---
 src/amd/common/ac_llvm_util.c| 31 +++
 src/amd/common/ac_llvm_util.h|  3 +++
 src/gallium/drivers/radeonsi/si_shader.c | 32 
 3 files changed, 38 insertions(+), 28 deletions(-)

diff --git a/src/amd/common/ac_llvm_util.c b/src/amd/common/ac_llvm_util.c
index b88c4e4979..3530bf088b 100644
--- a/src/amd/common/ac_llvm_util.c
+++ b/src/amd/common/ac_llvm_util.c
@@ -24,10 +24,12 @@
  */
 /* based on pieces from si_pipe.c and radeon_llvm_emit.c */
 #include "ac_llvm_util.h"
+#include "ac_llvm_build.h"
 #include "util/bitscan.h"
 #include 
 #include 
 #include "c11/threads.h"
+#include "util/u_math.h"
 
 #include 
 #include 
@@ -207,3 +209,32 @@ ac_llvm_add_target_dep_function_attr(LLVMValueRef F,
snprintf(str, sizeof(str), "%i", value);
LLVMAddTargetDependentFunctionAttr(F, name, str);
 }
+
+unsigned
+ac_count_scratch_private_memory(LLVMValueRef function)
+{
+   unsigned private_mem_vgprs = 0;
+
+   /* Process all LLVM instructions. */
+   LLVMBasicBlockRef bb = LLVMGetFirstBasicBlock(function);
+   while (bb) {
+   LLVMValueRef next = LLVMGetFirstInstruction(bb);
+
+   while (next) {
+   LLVMValueRef inst = next;
+   next = LLVMGetNextInstruction(next);
+
+   if (LLVMGetInstructionOpcode(inst) != LLVMAlloca)
+   continue;
+
+   LLVMTypeRef type = LLVMGetElementType(LLVMTypeOf(inst));
+   /* No idea why LLVM aligns allocas to 4 elements. */
+   unsigned alignment = LLVMGetAlignment(inst);
+   unsigned dw_size = align(ac_get_type_size(type) / 4, alignment);
+   private_mem_vgprs += dw_size;
+   }
+   bb = LLVMGetNextBasicBlock(bb);
+   }
+
+   return private_mem_vgprs;
+}
diff --git a/src/amd/common/ac_llvm_util.h b/src/amd/common/ac_llvm_util.h
index 3cf385a33e..5329bb1b70 100644
--- a/src/amd/common/ac_llvm_util.h
+++ b/src/amd/common/ac_llvm_util.h
@@ -105,6 +105,9 @@ ac_get_store_intr_attribs(bool writeonly_memory)
  AC_FUNC_ATTR_WRITEONLY;
 }
 
+unsigned
+ac_count_scratch_private_memory(LLVMValueRef function);
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/src/gallium/drivers/radeonsi/si_shader.c 
b/src/gallium/drivers/radeonsi/si_shader.c
index 2a50b266f6..74bd435124 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -5957,32 +5957,6 @@ static void si_optimize_vs_outputs(struct 
si_shader_context *ctx)
   >info.nr_param_exports);
 }
 
-static void si_count_scratch_private_memory(struct si_shader_context *ctx)
-{
-   ctx->shader->config.private_mem_vgprs = 0;
-
-   /* Process all LLVM instructions. */
-   LLVMBasicBlockRef bb = LLVMGetFirstBasicBlock(ctx->main_fn);
-   while (bb) {
-   LLVMValueRef next = LLVMGetFirstInstruction(bb);
-
-   while (next) {
-   LLVMValueRef inst = next;
-   next = LLVMGetNextInstruction(next);
-
-   if (LLVMGetInstructionOpcode(inst) != LLVMAlloca)
-   continue;
-
-   LLVMTypeRef type = LLVMGetElementType(LLVMTypeOf(inst));
-   /* No idea why LLVM aligns allocas to 4 elements. */
-   unsigned alignment = LLVMGetAlignment(inst);
-   unsigned dw_size = align(ac_get_type_size(type) / 4, alignment);
-   ctx->shader->config.private_mem_vgprs += dw_size;
-   }
-   bb = LLVMGetNextBasicBlock(bb);
-   }
-}
-
 static void si_init_exec_from_input(struct si_shader_context *ctx,
unsigned param, unsigned bitoffset)
 {
@@ -6929,8 +6903,10 @@ int si_compile_tgsi_shader(struct si_screen *sscreen,
si_optimize_vs_outputs();
 
if ((debug && debug->debug_message) ||
-   si_can_dump_shader(sscreen, ctx.type))
-   si_count_scratch_private_memory();
+   si_can_dump_shader(sscreen, ctx.type)) {
+   ctx.shader->config.private_mem_vgprs =
+   ac_count_scratch_private_memory(ctx.main_fn);
+   }
 
/* Compile to bytecode. */
r = si_compile_llvm(sscreen, >binary, >config, tm,
-- 
2.16.2



[Mesa-dev] [PATCH 2/3] ac/nir: count the scratch private memory size

2018-03-01 Thread Samuel Pitoiset
Signed-off-by: Samuel Pitoiset 
---
 src/amd/common/ac_nir_to_llvm.c | 10 --
 src/amd/common/ac_nir_to_llvm.h |  1 +
 2 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c
index 23344909af..6446dd682f 100644
--- a/src/amd/common/ac_nir_to_llvm.c
+++ b/src/amd/common/ac_nir_to_llvm.c
@@ -6779,7 +6779,8 @@ LLVMModuleRef ac_translate_nir_to_llvm(LLVMTargetMachineRef tm,
struct nir_shader *const *shaders,
int shader_count,
struct ac_shader_variant_info *shader_info,
-   const struct ac_nir_compiler_options *options)
+   const struct ac_nir_compiler_options *options,
+  bool dump_shader)
 {
struct radv_shader_context ctx = {0};
unsigned i;
@@ -6946,6 +6947,11 @@ LLVMModuleRef 
ac_translate_nir_to_llvm(LLVMTargetMachineRef tm,
if (shader_count == 1)
ac_nir_eliminate_const_vs_outputs();
 
+   if (dump_shader) {
+   ctx.shader_info->private_mem_vgprs =
+   ac_count_scratch_private_memory(ctx.main_function);
+   }
+
return ctx.ac.module;
 }
 
@@ -7139,7 +7145,7 @@ void ac_compile_nir_shader(LLVMTargetMachineRef tm,
 {
 
LLVMModuleRef llvm_module = ac_translate_nir_to_llvm(tm, nir, nir_count, shader_info,
-options);
+options, dump_shader);
 
ac_compile_llvm_module(tm, llvm_module, binary, config, shader_info, 
nir[0]->info.stage, dump_shader, options->supports_spill);
for (int i = 0; i < nir_count; ++i)
diff --git a/src/amd/common/ac_nir_to_llvm.h b/src/amd/common/ac_nir_to_llvm.h
index 766acec6ed..e873a1c9d8 100644
--- a/src/amd/common/ac_nir_to_llvm.h
+++ b/src/amd/common/ac_nir_to_llvm.h
@@ -163,6 +163,7 @@ struct ac_shader_variant_info {
unsigned num_user_sgprs;
unsigned num_input_sgprs;
unsigned num_input_vgprs;
+   unsigned private_mem_vgprs;
bool need_indirect_descriptor_sets;
struct {
struct {
-- 
2.16.2



Re: [Mesa-dev] [Mesa-stable] [PATCH] intel/fs: Set up sampler message headers in the visitor on gen7+

2018-03-01 Thread Francisco Jerez
Jason Ekstrand  writes:

> This gives the scheduler visibility into the headers which should
> improve scheduling.  More importantly, however, it lets the scheduler
> know that the header gets written.  As-is, the scheduler thinks that a
> texture instruction only reads its payload and is unaware that it may
> write to the first register so it may reorder it with respect to a read
> from that register.  This is causing issues in a couple of Dota 2 vertex
> shaders.
>

Yikes...  Corrupting your GRF since 2012...  Render target writes
probably need a similar treatment.
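
To illustrate the hazard described above, here is a toy dependency check; it is not the actual brw scheduler. inst_model, intersects() and can_reorder() are hypothetical names, and registers are plain integers. If the texture send's header write is missing from its write set, the check wrongly allows swapping the send with a read of that register.

```cpp
#include <cassert>
#include <set>

// Hypothetical, simplified instruction model: the sets of registers an
// instruction reads and writes, as seen by the scheduler.
struct inst_model {
   std::set<unsigned> reads;
   std::set<unsigned> writes;
};

// True if the two register sets share at least one register.
inline bool
intersects(const std::set<unsigned> &a, const std::set<unsigned> &b) {
   for (unsigned reg : a) {
      if (b.count(reg))
         return true;
   }
   return false;
}

// Two instructions may be swapped only if no read-after-write,
// write-after-read or write-after-write dependency links them.
inline bool
can_reorder(const inst_model &first, const inst_model &second) {
   return !intersects(first.writes, second.reads) &&   // RAW
          !intersects(first.reads, second.writes) &&   // WAR
          !intersects(first.writes, second.writes);    // WAW
}
```

For a send that reads {10, 11} and is modelled as writing only {20}, can_reorder() against an instruction reading register 10 returns true; once register 10 is added to the send's write set it returns false.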

> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104923
> Cc: mesa-sta...@lists.freedesktop.org

Reviewed-by: Francisco Jerez 

> ---
>  src/intel/compiler/brw_fs.cpp   | 40 
> +
>  src/intel/compiler/brw_fs_generator.cpp | 21 +++--
>  2 files changed, 39 insertions(+), 22 deletions(-)
>
> diff --git a/src/intel/compiler/brw_fs.cpp b/src/intel/compiler/brw_fs.cpp
> index 113f62c..ab8cc89 100644
> --- a/src/intel/compiler/brw_fs.cpp
> +++ b/src/intel/compiler/brw_fs.cpp
> @@ -4192,17 +4192,15 @@ lower_sampler_logical_send_gen7(const fs_builder 
> , fs_inst *inst, opcode op,
> op == SHADER_OPCODE_SAMPLEINFO ||
> is_high_sampler(devinfo, sampler)) {
>/* For general texture offsets (no txf workaround), we need a header to
> -   * put them in.  Note that we're only reserving space for it in the
> -   * message payload as it will be initialized implicitly by the
> -   * generator.
> +   * put them in.
> *
> * TG4 needs to place its channel select in the header, for interaction
> * with ARB_texture_swizzle.  The sampler index is only 4-bits, so for
> * larger sampler numbers we need to offset the Sampler State Pointer 
> in
> * the header.
> */
> +  fs_reg header = retype(sources[0], BRW_REGISTER_TYPE_UD);
>header_size = 1;
> -  sources[0] = fs_reg();
>length++;
>  
>/* If we're requesting fewer than four channels worth of response,
> @@ -4214,6 +4212,40 @@ lower_sampler_logical_send_gen7(const fs_builder , 
> fs_inst *inst, opcode op,
>   unsigned mask = ~((1 << (regs_written(inst) / reg_width)) - 1) & 
> 0xf;
>   inst->offset |= mask << 12;
>}
> +
> +  /* Build the actual header */
> +  const fs_builder ubld = bld.exec_all().group(8, 0);
> +  const fs_builder ubld1 = ubld.group(1, 0);
> +  ubld.MOV(header, retype(brw_vec8_grf(0, 0), BRW_REGISTER_TYPE_UD));
> +  if (inst->offset) {
> + ubld1.MOV(component(header, 2), brw_imm_ud(inst->offset));
> +  } else if (bld.shader->stage != MESA_SHADER_VERTEX &&
> + bld.shader->stage != MESA_SHADER_FRAGMENT) {
> + /* The vertex and fragment stages have g0.2 set to 0, so
> +  * header0.2 is 0 when g0 is copied. Other stages may not, so we
> +  * must set it to 0 to avoid setting undesirable bits in the
> +  * message.
> +  */
> + ubld1.MOV(component(header, 2), brw_imm_ud(0));
> +  }
> +
> +  if (is_high_sampler(devinfo, sampler)) {
> + if (sampler.file == BRW_IMMEDIATE_VALUE) {
> +assert(sampler.ud >= 16);
> +const int sampler_state_size = 16; /* 16 bytes */
> +
> +ubld1.ADD(component(header, 3),
> +  retype(brw_vec1_grf(0, 3), BRW_REGISTER_TYPE_UD),
> +  brw_imm_ud(16 * (sampler.ud / 16) * 
> sampler_state_size));
> + } else {
> +fs_reg tmp = ubld1.vgrf(BRW_REGISTER_TYPE_UD);
> +ubld1.AND(tmp, sampler, brw_imm_ud(0x0f0));
> +ubld1.SHL(tmp, tmp, brw_imm_ud(4));
> +ubld1.ADD(component(header, 3),
> +  retype(brw_vec1_grf(0, 3), BRW_REGISTER_TYPE_UD),
> +  tmp);
> + }
> +  }
> }
>  
> if (shadow_c.file != BAD_FILE) {
> diff --git a/src/intel/compiler/brw_fs_generator.cpp 
> b/src/intel/compiler/brw_fs_generator.cpp
> index b59c09f..a5a821a 100644
> --- a/src/intel/compiler/brw_fs_generator.cpp
> +++ b/src/intel/compiler/brw_fs_generator.cpp
> @@ -1001,19 +1001,13 @@ fs_generator::generate_tex(fs_inst *inst, struct 
> brw_reg dst, struct brw_reg src
>  * we need to set it up explicitly and load the offset bitfield.
>  * Otherwise, we can use an implied move from g0 to the first message reg.
>  */
> -   if (inst->header_size != 0) {
> +   if (inst->header_size != 0 && devinfo->gen < 7) {
>if (devinfo->gen < 6 && !inst->offset) {
>   /* Set up an implied move from g0 to the MRF. */
>   src = retype(brw_vec8_grf(0, 0), BRW_REGISTER_TYPE_UW);
>} else {
> - struct brw_reg header_reg;
> -
> - if (devinfo->gen >= 7) {
> -header_reg = src;
> - } else {
> -assert(inst->base_mrf != -1);

Re: [Mesa-dev] [PATCH 1/5] clover/llvm: Use device in llvm compilation instead of copying fields

2018-03-01 Thread Pierre Moreau
I am wondering whether you should squash the first three patches together: they
are all about passing the device around rather than its attributes, and patches
2 and 3 just pick up where the previous patch left off and pass the device one
level deeper in the call chain.
Whether or not the patches are squashed, and with Jan’s comment about the unused
variable in patch 2 addressed, patches 1, 2 and 3 are
Reviewed-by: Pierre Moreau 

Thank you for resending an updated version of this series.
Pierre
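
As a rough illustration of why passing the device is preferable to passing its fields one by one, consider the sketch below. The device class here is a hypothetical, stripped-down stand-in for clover's (only the two accessors mentioned in this series are modelled, and their return types are an assumption), and describe_build() is invented for the example.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical, minimal stand-in for clover's device class.
class device {
public:
   device(std::string target, std::string clc_version)
      : target_(std::move(target)), clc_version_(std::move(clc_version)) {}

   std::string ir_target() const { return target_; }
   std::string device_clc_version() const { return clc_version_; }

private:
   std::string target_;
   std::string clc_version_;
};

// Taking the device itself keeps this signature unchanged when a later
// patch needs another device property; a per-field signature would grow
// by one parameter each time.
inline std::string
describe_build(const device &dev, const std::string &opts) {
   return dev.ir_target() + " (CLC " + dev.device_clc_version() + ") " + opts;
}
```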

On 2018-03-01 — 13:39, Aaron Watry wrote:
> Copying the individual fields from the device when compiling/linking
> will lead to an unnecessarily large number of fields getting passed
> around.
> 
> v3: Rebase on current master
> v2: Use device in function args before making additional changes in following 
> patches
> 
> Signed-off-by: Aaron Watry 
> Cc: Jan Vesely 
> Cc: Pierre Moreau 
> ---
>  src/gallium/state_trackers/clover/core/program.cpp   |  7 +++
>  .../state_trackers/clover/llvm/invocation.cpp| 20 
> ++--
>  .../state_trackers/clover/llvm/invocation.hpp|  5 ++---
>  3 files changed, 15 insertions(+), 17 deletions(-)
> 
> diff --git a/src/gallium/state_trackers/clover/core/program.cpp 
> b/src/gallium/state_trackers/clover/core/program.cpp
> index ae4b50a879..4e74fccd97 100644
> --- a/src/gallium/state_trackers/clover/core/program.cpp
> +++ b/src/gallium/state_trackers/clover/core/program.cpp
> @@ -53,8 +53,8 @@ program::compile(const ref_vector , const 
> std::string ,
>   try {
>  const module m = (dev.ir_format() == PIPE_SHADER_IR_TGSI ?
>tgsi::compile_program(_source, log) :
> -  llvm::compile_program(_source, headers,
> -dev.ir_target(), opts, 
> log));
> +  llvm::compile_program(_source, headers, dev,
> +opts, log));
>  _builds[] = { m, opts, log };
>   } catch (...) {
>  _builds[] = { module(), opts, log };
> @@ -78,8 +78,7 @@ program::link(const ref_vector , const 
> std::string ,
>try {
>   const module m = (dev.ir_format() == PIPE_SHADER_IR_TGSI ?
> tgsi::link_program(ms) :
> -   llvm::link_program(ms, dev.ir_format(),
> -  dev.ir_target(), opts, log));
> +   llvm::link_program(ms, dev, opts, log));
>   _builds[] = { m, opts, log };
>} catch (...) {
>   _builds[] = { module(), opts, log };
> diff --git a/src/gallium/state_trackers/clover/llvm/invocation.cpp 
> b/src/gallium/state_trackers/clover/llvm/invocation.cpp
> index e4ca5fa444..c8c0311a3a 100644
> --- a/src/gallium/state_trackers/clover/llvm/invocation.cpp
> +++ b/src/gallium/state_trackers/clover/llvm/invocation.cpp
> @@ -201,17 +201,17 @@ namespace {
>  module
>  clover::llvm::compile_program(const std::string ,
>const header_map ,
> -  const std::string ,
> +  const device ,
>const std::string ,
>std::string _log) {
> if (has_flag(debug::clc))
>debug::log(".cl", "// Options: " + opts + '\n' + source);
>  
> auto ctx = create_context(r_log);
> -   auto c = create_compiler_instance(target, tokenize(opts + " input.cl"),
> - r_log);
> -   auto mod = compile(*ctx, *c, "input.cl", source, headers, target, opts,
> -  r_log);
> +   auto c = create_compiler_instance(dev.ir_target(),
> + tokenize(opts + " input.cl"), r_log);
> +   auto mod = compile(*ctx, *c, "input.cl", source, headers, dev.ir_target(),
> +  opts, r_log);
>  
> if (has_flag(debug::llvm))
>debug::log(".ll", print_module_bitcode(*mod));
> @@ -269,14 +269,14 @@ namespace {
>  
>  module
>  clover::llvm::link_program(const std::vector ,
> -   enum pipe_shader_ir ir, const std::string ,
> +   const device ,
> const std::string , std::string _log) {
> std::vector options = tokenize(opts + " input.cl");
> const bool create_library = count("-create-library", options);
> erase_if(equals("-create-library"), options);
>  
> auto ctx = create_context(r_log);
> -   auto c = create_compiler_instance(target, options, r_log);
> +   auto c = create_compiler_instance(dev.ir_target(), options, r_log);
> auto mod = link(*ctx, *c, modules, r_log);
>  
> optimize(*mod, c->getCodeGenOpts().OptimizationLevel, !create_library);
> @@ -291,11 +291,11 @@ 

Re: [Mesa-dev] [PATCH 2/5] clover: Pass device to llvm::create_compiler_instance

2018-03-01 Thread Jan Vesely
On Thu, 2018-03-01 at 13:39 -0600, Aaron Watry wrote:
> We'll be using dev.device_clc_version to select the default language version
> soon along with the existing ir_target field.
> 
> Signed-off-by: Aaron Watry 
> Cc: Pierre Moreau 
> Cc: Jan Vesely 
> 
> v4: Pass the device down instead of device_clc_version as a separate field
> v3: Revise to acknowledge that we now have the device in compile/link_program
> instead of the string values.
> v2: (Pierre) Move changes to create_compiler_instance invocation to correct
> patch to prevent temporary build breakage.
> (Jan) Use device_clc_version instead of device_version for compile/link
> ---
>  src/gallium/state_trackers/clover/llvm/invocation.cpp | 10 ++
>  1 file changed, 6 insertions(+), 4 deletions(-)
> 
> diff --git a/src/gallium/state_trackers/clover/llvm/invocation.cpp 
> b/src/gallium/state_trackers/clover/llvm/invocation.cpp
> index c8c0311a3a..42aabfb9a3 100644
> --- a/src/gallium/state_trackers/clover/llvm/invocation.cpp
> +++ b/src/gallium/state_trackers/clover/llvm/invocation.cpp
> @@ -94,7 +94,7 @@ namespace {
> }
>  
> std::unique_ptr
> -   create_compiler_instance(const target ,
> +   create_compiler_instance(const device ,
>  const std::vector ,
>  std::string _log) {
>std::unique_ptr c { new 
> clang::CompilerInstance };
> @@ -108,6 +108,9 @@ namespace {
>const std::vector copts =
>   map(std::mem_fn(::string::c_str), opts);
>  
> +  const target  = dev.ir_target();
> +  const std::string _clc_version = dev.device_clc_version();

This variable does not seem to be used until patch 5. Am I missing
something? Better not introduce it early.
Other than that:
Reviewed-by: Jan Vesely 

Jan

> +
>if (!clang::CompilerInvocation::CreateFromArgs(
>   c->getInvocation(), copts.data(), copts.data() + copts.size(), 
> diag))
>   throw invalid_build_options_error();
> @@ -208,8 +211,7 @@ clover::llvm::compile_program(const std::string ,
>debug::log(".cl", "// Options: " + opts + '\n' + source);
>  
> auto ctx = create_context(r_log);
> -   auto c = create_compiler_instance(dev.ir_target(),
> - tokenize(opts + " input.cl"), r_log);
> +   auto c = create_compiler_instance(dev, tokenize(opts + " input.cl"), 
> r_log);
> auto mod = compile(*ctx, *c, "input.cl", source, headers, dev.ir_target(),
>opts, r_log);
>  
> @@ -276,7 +278,7 @@ clover::llvm::link_program(const std::vector 
> ,
> erase_if(equals("-create-library"), options);
>  
> auto ctx = create_context(r_log);
> -   auto c = create_compiler_instance(dev.ir_target(), options, r_log);
> +   auto c = create_compiler_instance(dev, options, r_log);
> auto mod = link(*ctx, *c, modules, r_log);
>  
> optimize(*mod, c->getCodeGenOpts().OptimizationLevel, !create_library);


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/5] clover/llvm: Use device in llvm compilation instead of copying fields

2018-03-01 Thread Jan Vesely
On Thu, 2018-03-01 at 13:39 -0600, Aaron Watry wrote:
> Copying the individual fields from the device when compiling/linking
> will lead to an unnecessarily large number of fields getting passed
> around.
> 
> v3: Rebase on current master
> v2: Use device in function args before making additional changes in following 
> patches
> 
> Signed-off-by: Aaron Watry 
> Cc: Jan Vesely 
> Cc: Pierre Moreau 

A few places could use a temporary variable; I don't know what Francisco prefers.
This version LGTM.
Reviewed-by: Jan Vesely 

Jan

> ---
>  src/gallium/state_trackers/clover/core/program.cpp   |  7 +++
>  .../state_trackers/clover/llvm/invocation.cpp| 20 
> ++--
>  .../state_trackers/clover/llvm/invocation.hpp|  5 ++---
>  3 files changed, 15 insertions(+), 17 deletions(-)
> 
> diff --git a/src/gallium/state_trackers/clover/core/program.cpp 
> b/src/gallium/state_trackers/clover/core/program.cpp
> index ae4b50a879..4e74fccd97 100644
> --- a/src/gallium/state_trackers/clover/core/program.cpp
> +++ b/src/gallium/state_trackers/clover/core/program.cpp
> @@ -53,8 +53,8 @@ program::compile(const ref_vector , const 
> std::string ,
>   try {
>  const module m = (dev.ir_format() == PIPE_SHADER_IR_TGSI ?
>tgsi::compile_program(_source, log) :
> -  llvm::compile_program(_source, headers,
> -dev.ir_target(), opts, 
> log));
> +  llvm::compile_program(_source, headers, dev,
> +opts, log));
>  _builds[] = { m, opts, log };
>   } catch (...) {
>  _builds[] = { module(), opts, log };
> @@ -78,8 +78,7 @@ program::link(const ref_vector , const 
> std::string ,
>try {
>   const module m = (dev.ir_format() == PIPE_SHADER_IR_TGSI ?
> tgsi::link_program(ms) :
> -   llvm::link_program(ms, dev.ir_format(),
> -  dev.ir_target(), opts, log));
> +   llvm::link_program(ms, dev, opts, log));
>   _builds[] = { m, opts, log };
>} catch (...) {
>   _builds[] = { module(), opts, log };
> diff --git a/src/gallium/state_trackers/clover/llvm/invocation.cpp 
> b/src/gallium/state_trackers/clover/llvm/invocation.cpp
> index e4ca5fa444..c8c0311a3a 100644
> --- a/src/gallium/state_trackers/clover/llvm/invocation.cpp
> +++ b/src/gallium/state_trackers/clover/llvm/invocation.cpp
> @@ -201,17 +201,17 @@ namespace {
>  module
>  clover::llvm::compile_program(const std::string ,
>const header_map ,
> -  const std::string ,
> +  const device ,
>const std::string ,
>std::string _log) {
> if (has_flag(debug::clc))
>debug::log(".cl", "// Options: " + opts + '\n' + source);
>  
> auto ctx = create_context(r_log);
> -   auto c = create_compiler_instance(target, tokenize(opts + " input.cl"),
> - r_log);
> -   auto mod = compile(*ctx, *c, "input.cl", source, headers, target, opts,
> -  r_log);
> +   auto c = create_compiler_instance(dev.ir_target(),
> + tokenize(opts + " input.cl"), r_log);
> +   auto mod = compile(*ctx, *c, "input.cl", source, headers, dev.ir_target(),
> +  opts, r_log);
>  
> if (has_flag(debug::llvm))
>debug::log(".ll", print_module_bitcode(*mod));
> @@ -269,14 +269,14 @@ namespace {
>  
>  module
>  clover::llvm::link_program(const std::vector ,
> -   enum pipe_shader_ir ir, const std::string ,
> +   const device ,
> const std::string , std::string _log) {
> std::vector options = tokenize(opts + " input.cl");
> const bool create_library = count("-create-library", options);
> erase_if(equals("-create-library"), options);
>  
> auto ctx = create_context(r_log);
> -   auto c = create_compiler_instance(target, options, r_log);
> +   auto c = create_compiler_instance(dev.ir_target(), options, r_log);
> auto mod = link(*ctx, *c, modules, r_log);
>  
> optimize(*mod, c->getCodeGenOpts().OptimizationLevel, !create_library);
> @@ -291,11 +291,11 @@ clover::llvm::link_program(const std::vector 
> ,
> if (create_library) {
>return build_module_library(*mod, module::section::text_library);
>  
> -   } else if (ir == PIPE_SHADER_IR_NATIVE) {
> +   } else if (dev.ir_format() == PIPE_SHADER_IR_NATIVE) {
>if (has_flag(debug::native))
> - debug::log(id +  ".asm", print_module_native(*mod, target));
> + debug::log(id +  ".asm", 

Re: [Mesa-dev] [PATCH 3/6] intel/ir: Allow arbitrary scratch flag registers for SHADER_OPCODE_FIND_LIVE_CHANNEL.

2018-03-01 Thread Francisco Jerez
Kenneth Graunke  writes:

> On Tuesday, February 27, 2018 1:38:25 PM PST Francisco Jerez wrote:
>> This shouldn't cause any functional change at this point; it changes
>> SHADER_OPCODE_FIND_LIVE_CHANNEL to use the flag register specified at
>> the IR level instead of the hard-coded f1.0, now that it can be
>> represented in backend_instruction::flag_subreg.  This will be
>> necessary for scheduling to behave correctly once more things start
>> making use of f1.0.
>> ---
>>  src/intel/compiler/brw_eu_emit.c| 5 +++--
>>  src/intel/compiler/brw_fs.cpp   | 3 ++-
>>  src/intel/compiler/brw_fs_builder.h | 2 +-
>>  3 files changed, 6 insertions(+), 4 deletions(-)
>> 
>> diff --git a/src/intel/compiler/brw_fs.cpp b/src/intel/compiler/brw_fs.cpp
>> index 6c86b1592fd..0b87d8ab14e 100644
>> --- a/src/intel/compiler/brw_fs.cpp
>> +++ b/src/intel/compiler/brw_fs.cpp
>> @@ -931,7 +931,8 @@ fs_inst::flags_written() const
>> if ((conditional_mod && (opcode != BRW_OPCODE_SEL &&
>>  opcode != BRW_OPCODE_IF &&
>>  opcode != BRW_OPCODE_WHILE)) ||
>> -   opcode == FS_OPCODE_MOV_DISPATCH_TO_FLAGS) {
>> +   opcode == FS_OPCODE_MOV_DISPATCH_TO_FLAGS ||
>> +   opcode == SHADER_OPCODE_FIND_LIVE_CHANNEL) {
>
> Looks like an unrelated fix?  It's probably fine here though.
>

The purpose of this change is to make the flag write visible to the
scheduler. With this patch, the flag subregister that
SHADER_OPCODE_FIND_LIVE_CHANNEL uses internally as a scratch predicate is
specified as backend_instruction::flag_subreg. However, because
SHADER_OPCODE_FIND_LIVE_CHANNEL should never have a predicate or a
conditional mod, the instruction wouldn't be considered to write the flag
register without this change.

>>return flag_mask(this);
>> } else {
>>return flag_mask(dst, size_written);
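
The reasoning above can be illustrated with a toy model. This is only a
sketch: the opcode enum, the inst struct and the one-bit-per-subregister
mask are simplified, hypothetical stand-ins for the real fs_inst code, and
the fallback case is reduced to "no flag write".

```cpp
// Toy model only: the opcode names and the struct below are simplified
// stand-ins, not the real fs_inst / backend_instruction definitions.
enum class opcode {
   add, sel, if_, while_, mov_dispatch_to_flags, find_live_channel
};

struct inst {
   opcode op;
   bool conditional_mod;  // true if the instruction has a conditional mod
   unsigned flag_subreg;  // index of the flag subregister it refers to
};

// Returns a bit mask with one bit per flag subregister the instruction is
// considered to write (an assumption of this sketch; the real mask also
// depends on the execution width).
unsigned
flags_written(const inst &i)
{
   const bool cmod_writes_flag = i.conditional_mod &&
                                 i.op != opcode::sel &&
                                 i.op != opcode::if_ &&
                                 i.op != opcode::while_;

   // Without the explicit find_live_channel case, an instruction with no
   // conditional mod would fall through and report no flag write at all.
   if (cmod_writes_flag ||
       i.op == opcode::mov_dispatch_to_flags ||
       i.op == opcode::find_live_channel)
      return 1u << i.flag_subreg;

   // Simplification: the real code also handles a flag register destination.
   return 0;
}
```

In this model a find_live_channel instruction without a conditional mod
still reports a write to its scratch flag subregister, which is the
property a scheduler needs in order to keep it apart from other users of
the same flag.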




[Mesa-dev] [PATCH 1/2] clover: Allow overriding platform/device version numbers

2018-03-01 Thread Aaron Watry
Useful for testing API, builtin library, and device completeness of
not-yet-supported versions.
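
As a rough illustration of the override mechanism, here is a minimal
standalone sketch. It assumes the option lookup behaves like an environment
variable read with a fallback default. get_option() and
platform_version_string() are hypothetical helpers written for this
example, not Mesa functions.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical stand-in for an option lookup: returns the value of the
// environment variable `name`, or `dfault` when it is not set.
// `dfault` must not be null.
std::string
get_option(const char *name, const char *dfault)
{
   const char *value = std::getenv(name);
   return value ? value : dfault;
}

// Builds a platform version string the same way the patch does, with the
// package version passed in instead of coming from a build-time macro.
std::string
platform_version_string(const std::string &package_version)
{
   return "OpenCL " +
          get_option("CLOVER_PLATFORM_VERSION_OVERRIDE", "1.1") +
          " Mesa " + package_version;
}
```

With the variable unset the string reports 1.1. Exporting
CLOVER_PLATFORM_VERSION_OVERRIDE=1.2 before starting the application would
make it report 1.2 instead.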

Signed-off-by: Aaron Watry 
Cc: Pierre Moreau 
Cc: Jan Vesely 
Cc: Francisco Jerez 
---
 src/gallium/state_trackers/clover/api/platform.cpp | 7 ++-
 src/gallium/state_trackers/clover/core/device.cpp  | 5 +++--
 2 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/src/gallium/state_trackers/clover/api/platform.cpp 
b/src/gallium/state_trackers/clover/api/platform.cpp
index ed86163311..8cb2718973 100644
--- a/src/gallium/state_trackers/clover/api/platform.cpp
+++ b/src/gallium/state_trackers/clover/api/platform.cpp
@@ -23,6 +23,7 @@
 #include "api/util.hpp"
 #include "core/platform.hpp"
 #include "git_sha1.h"
+#include "util/u_debug.h"
 
 using namespace clover;
 
@@ -51,6 +52,7 @@ clover::GetPlatformInfo(cl_platform_id d_platform, 
cl_platform_info param,
property_buffer buf { r_buf, size, r_size };
 
obj(d_platform);
+   std::string version_string;
 
switch (param) {
case CL_PLATFORM_PROFILE:
@@ -58,7 +60,10 @@ clover::GetPlatformInfo(cl_platform_id d_platform, 
cl_platform_info param,
   break;
 
case CL_PLATFORM_VERSION:
-  buf.as_string() = "OpenCL 1.1 Mesa " PACKAGE_VERSION
+  version_string = std::string(
+debug_get_option("CLOVER_PLATFORM_VERSION_OVERRIDE", "1.1"));
+
+  buf.as_string() = "OpenCL " + version_string + " Mesa " PACKAGE_VERSION
 #ifdef MESA_GIT_SHA1
 " (" MESA_GIT_SHA1 ")"
 #endif
diff --git a/src/gallium/state_trackers/clover/core/device.cpp 
b/src/gallium/state_trackers/clover/core/device.cpp
index 71cf4bf60a..245d728886 100644
--- a/src/gallium/state_trackers/clover/core/device.cpp
+++ b/src/gallium/state_trackers/clover/core/device.cpp
@@ -25,6 +25,7 @@
 #include "core/platform.hpp"
 #include "pipe/p_screen.h"
 #include "pipe/p_state.h"
+#include "util/u_debug.h"
 
 using namespace clover;
 
@@ -268,10 +269,10 @@ device::endianness() const {
 
 std::string
 device::device_version() const {
-return "1.1";
+   return std::string(debug_get_option("CLOVER_DEVICE_VERSION_OVERRIDE", "1.1"));
 }
 
 std::string
 device::device_clc_version() const {
-return "1.1";
+   return std::string(debug_get_option("CLOVER_DEVICE_CLC_VERSION_OVERRIDE", "1.1"));
 }
-- 
2.14.1



[Mesa-dev] [PATCH 2/2] clover: Include generic type in several kernel/device obj() calls

2018-03-01 Thread Aaron Watry
Fixes auto-completion for some device and kernel methods in my IDE.

No functional change intended.

Signed-off-by: Aaron Watry 
---
 src/gallium/state_trackers/clover/api/device.cpp |  2 +-
 src/gallium/state_trackers/clover/api/kernel.cpp | 22 +++---
 2 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/src/gallium/state_trackers/clover/api/device.cpp 
b/src/gallium/state_trackers/clover/api/device.cpp
index 3572bb0c92..2aaa2c59cb 100644
--- a/src/gallium/state_trackers/clover/api/device.cpp
+++ b/src/gallium/state_trackers/clover/api/device.cpp
@@ -98,7 +98,7 @@ CLOVER_API cl_int
 clGetDeviceInfo(cl_device_id d_dev, cl_device_info param,
 size_t size, void *r_buf, size_t *r_size) try {
property_buffer buf { r_buf, size, r_size };
-   auto  = obj(d_dev);
+   auto  = obj(d_dev);
 
switch (param) {
case CL_DEVICE_TYPE:
diff --git a/src/gallium/state_trackers/clover/api/kernel.cpp 
b/src/gallium/state_trackers/clover/api/kernel.cpp
index b665773d9e..705828a688 100644
--- a/src/gallium/state_trackers/clover/api/kernel.cpp
+++ b/src/gallium/state_trackers/clover/api/kernel.cpp
@@ -28,7 +28,7 @@ using namespace clover;
 
 CLOVER_API cl_kernel
 clCreateKernel(cl_program d_prog, const char *name, cl_int *r_errcode) try {
-   auto  = obj(d_prog);
+   auto  = obj(d_prog);
 
if (!name)
   throw error(CL_INVALID_VALUE);
@@ -50,7 +50,7 @@ clCreateKernel(cl_program d_prog, const char *name, cl_int 
*r_errcode) try {
 CLOVER_API cl_int
 clCreateKernelsInProgram(cl_program d_prog, cl_uint count,
  cl_kernel *rd_kerns, cl_uint *r_count) try {
-   auto  = obj(d_prog);
+   auto  = obj(d_prog);
auto  = prog.symbols();
 
if (rd_kerns && count < syms.size())
@@ -76,7 +76,7 @@ clCreateKernelsInProgram(cl_program d_prog, cl_uint count,
 
 CLOVER_API cl_int
 clRetainKernel(cl_kernel d_kern) try {
-   obj(d_kern).retain();
+   obj(d_kern).retain();
return CL_SUCCESS;
 
 } catch (error ) {
@@ -85,7 +85,7 @@ clRetainKernel(cl_kernel d_kern) try {
 
 CLOVER_API cl_int
 clReleaseKernel(cl_kernel d_kern) try {
-   if (obj(d_kern).release())
+   if (obj(d_kern).release())
   delete pobj(d_kern);
 
return CL_SUCCESS;
@@ -97,7 +97,7 @@ clReleaseKernel(cl_kernel d_kern) try {
 CLOVER_API cl_int
 clSetKernelArg(cl_kernel d_kern, cl_uint idx, size_t size,
const void *value) try {
-   obj(d_kern).args().at(idx).set(size, value);
+   obj(d_kern).args().at(idx).set(size, value);
return CL_SUCCESS;
 
 } catch (std::out_of_range ) {
@@ -111,7 +111,7 @@ CLOVER_API cl_int
 clGetKernelInfo(cl_kernel d_kern, cl_kernel_info param,
 size_t size, void *r_buf, size_t *r_size) try {
property_buffer buf { r_buf, size, r_size };
-   auto  = obj(d_kern);
+   auto  = obj(d_kern);
 
switch (param) {
case CL_KERNEL_FUNCTION_NAME:
@@ -149,7 +149,7 @@ clGetKernelWorkGroupInfo(cl_kernel d_kern, cl_device_id 
d_dev,
  cl_kernel_work_group_info param,
  size_t size, void *r_buf, size_t *r_size) try {
property_buffer buf { r_buf, size, r_size };
-   auto  = obj(d_kern);
+   auto  = obj(d_kern);
auto  = (d_dev ? *pobj(d_dev) : unique(kern.program().devices()));
 
if (!count(dev, kern.program().devices()))
@@ -279,8 +279,8 @@ clEnqueueNDRangeKernel(cl_command_queue d_q, cl_kernel 
d_kern,
const size_t *d_grid_size, const size_t *d_block_size,
cl_uint num_deps, const cl_event *d_deps,
cl_event *rd_ev) try {
-   auto  = obj(d_q);
-   auto  = obj(d_kern);
+   auto  = obj(d_q);
+   auto  = obj(d_kern);
auto deps = objs(d_deps, num_deps);
auto grid_size = validate_grid_size(q, dims, d_grid_size);
auto grid_offset = validate_grid_offset(q, dims, d_grid_offset);
@@ -306,8 +306,8 @@ CLOVER_API cl_int
 clEnqueueTask(cl_command_queue d_q, cl_kernel d_kern,
   cl_uint num_deps, const cl_event *d_deps,
   cl_event *rd_ev) try {
-   auto  = obj(d_q);
-   auto  = obj(d_kern);
+   auto  = obj(d_q);
+   auto  = obj(d_kern);
auto deps = objs(d_deps, num_deps);
 
validate_common(q, kern, deps);
-- 
2.14.1



Re: [Mesa-dev] [PATCH 4/6] intel/eu: Plumb header present bit to codegen helpers for HDC messages.

2018-03-01 Thread Jordan Justen
1-4 Reviewed-by: Jordan Justen 

On 2018-02-27 13:38:26, Francisco Jerez wrote:
> This makes sure that the header-present bit of the message descriptor
> is in sync with the IR instruction fields, which gives the optimizer
> more control to avoid the overhead of setting up a message header when
> it's possible to do so.
> ---
>  src/intel/compiler/brw_eu.h   | 18 --
>  src/intel/compiler/brw_eu_emit.c  | 30 ++
>  src/intel/compiler/brw_fs_generator.cpp   | 20 ++--
>  src/intel/compiler/brw_vec4_generator.cpp | 11 ++-
>  4 files changed, 50 insertions(+), 29 deletions(-)
> 
> diff --git a/src/intel/compiler/brw_eu.h b/src/intel/compiler/brw_eu.h
> index 2d0f56f7938..a5f28d8fc65 100644
> --- a/src/intel/compiler/brw_eu.h
> +++ b/src/intel/compiler/brw_eu.h
> @@ -444,7 +444,8 @@ brw_untyped_atomic(struct brw_codegen *p,
> struct brw_reg surface,
> unsigned atomic_op,
> unsigned msg_length,
> -   bool response_expected);
> +   bool response_expected,
> +   bool header_present);
>  
>  void
>  brw_untyped_surface_read(struct brw_codegen *p,
> @@ -459,7 +460,8 @@ brw_untyped_surface_write(struct brw_codegen *p,
>struct brw_reg payload,
>struct brw_reg surface,
>unsigned msg_length,
> -  unsigned num_channels);
> +  unsigned num_channels,
> +  bool header_present);
>  
>  void
>  brw_typed_atomic(struct brw_codegen *p,
> @@ -468,7 +470,8 @@ brw_typed_atomic(struct brw_codegen *p,
>   struct brw_reg surface,
>   unsigned atomic_op,
>   unsigned msg_length,
> - bool response_expected);
> + bool response_expected,
> + bool header_present);
>  
>  void
>  brw_typed_surface_read(struct brw_codegen *p,
> @@ -476,14 +479,16 @@ brw_typed_surface_read(struct brw_codegen *p,
> struct brw_reg payload,
> struct brw_reg surface,
> unsigned msg_length,
> -   unsigned num_channels);
> +   unsigned num_channels,
> +   bool header_present);
>  
>  void
>  brw_typed_surface_write(struct brw_codegen *p,
>  struct brw_reg payload,
>  struct brw_reg surface,
>  unsigned msg_length,
> -unsigned num_channels);
> +unsigned num_channels,
> +bool header_present);
>  
>  void
>  brw_byte_scattered_read(struct brw_codegen *p,
> @@ -498,7 +503,8 @@ brw_byte_scattered_write(struct brw_codegen *p,
>   struct brw_reg payload,
>   struct brw_reg surface,
>   unsigned msg_length,
> - unsigned bit_size);
> + unsigned bit_size,
> + bool header_present);
>  
>  void
>  brw_memory_fence(struct brw_codegen *p,
> diff --git a/src/intel/compiler/brw_eu_emit.c 
> b/src/intel/compiler/brw_eu_emit.c
> index 9fc6d12f288..9529a30d27e 100644
> --- a/src/intel/compiler/brw_eu_emit.c
> +++ b/src/intel/compiler/brw_eu_emit.c
> @@ -2877,7 +2877,8 @@ brw_untyped_atomic(struct brw_codegen *p,
> struct brw_reg surface,
> unsigned atomic_op,
> unsigned msg_length,
> -   bool response_expected)
> +   bool response_expected,
> +   bool header_present)
>  {
> const struct gen_device_info *devinfo = p->devinfo;
> const unsigned sfid = (devinfo->gen >= 8 || devinfo->is_haswell ?
> @@ -2895,7 +2896,7 @@ brw_untyped_atomic(struct brw_codegen *p,
>p, sfid, brw_writemask(dst, mask), payload, surface, msg_length,
>brw_surface_payload_size(p, response_expected,
> devinfo->gen >= 8 || devinfo->is_haswell, 
> true),
> -  align1);
> +  header_present);
>  
> brw_set_dp_untyped_atomic_message(
>p, insn, atomic_op, response_expected);
> @@ -2978,7 +2979,8 @@ brw_untyped_surface_write(struct brw_codegen *p,
>struct brw_reg payload,
>struct brw_reg surface,
>unsigned msg_length,
> -  unsigned num_channels)
> +  unsigned num_channels,
> +  bool header_present)
>  {
> const struct gen_device_info *devinfo = p->devinfo;
> const unsigned sfid = (devinfo->gen >= 8 || devinfo->is_haswell ?
> @@ -2990,7 +2992,7 @@ brw_untyped_surface_write(struct brw_codegen *p,
> 

[Mesa-dev] [PATCH 4/5] clover/llvm: Add get_[cl|language]_version, validation and some helpers

2018-03-01 Thread Aaron Watry
Used to calculate the default CLC language version based on the -cl-std
option in the build args and the device capabilities.

According to section 5.8.4.5 of the 2.0 spec, the CL C version is chosen by:
 1) If you have -cl-std=CL1.1+ use the version specified
 2) If not, use the highest 1.x version that the device supports

Curiously, there is no valid value for -cl-std=CL1.0

Validates requested cl-std against device_clc_version
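
The selection rules above can be sketched without any clang dependency.
This is only an illustration under stated assumptions: versions are plain
numbers (110 for 1.1 and so on), errors are std::invalid_argument rather
than clover's build_error, CL1.0 is rejected explicitly here, and the names
find_version() and select_clc_version() are made up for this example.

```cpp
#include <algorithm>
#include <stdexcept>
#include <string>
#include <vector>

struct cl_version {
   std::string str;   // e.g. "1.2"
   unsigned number;   // e.g. 120
};

const cl_version known_versions[] = {
   { "1.0", 100 }, { "1.1", 110 }, { "1.2", 120 }, { "2.0", 200 },
};

const cl_version &
find_version(const std::string &str)
{
   for (const auto &v : known_versions) {
      if (v.str == str)
         return v;
   }
   throw std::invalid_argument("unknown CL version: " + str);
}

// Returns the numeric CL C version to compile with.
unsigned
select_clc_version(const std::vector<std::string> &opts,
                   const std::string &device_clc_version)
{
   const std::string prefix = "-cl-std=CL";
   const cl_version &device = find_version(device_clc_version);

   for (const auto &opt : opts) {
      if (opt.compare(0, prefix.size(), prefix) == 0) {
         const cl_version &requested =
            find_version(opt.substr(prefix.size()));
         // Assumption of this sketch: reject CL1.0 here, since there is
         // no valid -cl-std=CL1.0.
         if (requested.number < 110)
            throw std::invalid_argument("-cl-std=CL1.0 is not valid");
         // Validate the request against what the device supports.
         if (requested.number > device.number)
            throw std::invalid_argument("requested version too new");
         return requested.number;   // rule 1: use the version specified
      }
   }

   // Rule 2: highest 1.x version the device supports, i.e. clamp to 1.2.
   return std::min(device.number, 120u);
}
```

For example, a device reporting 2.0 with no -cl-std option would compile as
1.2 under this sketch, while -cl-std=CL1.2 on a 1.1 device is rejected.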

Signed-off-by: Aaron Watry 
Cc: Pierre Moreau 

v5: (Aaron) Use a collection of cl versions instead of switch cases
Consolidates the string, numeric version, and clc langstandard::kind

v4: (Pierre) Split get_language_version addition and use into separate patches
Squash patches that add the helpers and validate the language standard

v3: Change device_version to device_clc_version

v2: (Pierre) Move create_compiler_instance changes to correct patch
to prevent temporary build breakage.
Convert version_str into unsigned and use it to find language version
Add build_error for unknown language version string
Whitespace fixes
---
 .../state_trackers/clover/llvm/invocation.cpp  | 63 ++
 1 file changed, 63 insertions(+)

diff --git a/src/gallium/state_trackers/clover/llvm/invocation.cpp 
b/src/gallium/state_trackers/clover/llvm/invocation.cpp
index 1924c0317f..8d76f203de 100644
--- a/src/gallium/state_trackers/clover/llvm/invocation.cpp
+++ b/src/gallium/state_trackers/clover/llvm/invocation.cpp
@@ -63,6 +63,23 @@ using ::llvm::Module;
 using ::llvm::raw_string_ostream;
 
 namespace {
+
+   struct cl_version {
+  std::string version_str; //CL Version
+  unsigned version_number; //Numeric CL Version
+  clang::LangStandard::Kind clc_lang_standard; //CLC standard of this version
+   };
+
+   static const unsigned ANY_VERSION = 999;
+   cl_version const cl_versions[] = {
+  { "1.0", 100, clang::LangStandard::lang_opencl10},
+  { "1.1", 110, clang::LangStandard::lang_opencl11},
+  { "1.2", 120, clang::LangStandard::lang_opencl12},
+  { "2.0", 200, clang::LangStandard::lang_opencl20},
+  { "2.1", 210, clang::LangStandard::lang_unspecified}, //2.1 doesn't exist
+  { "2.2", 220, clang::LangStandard::lang_unspecified}, //2.2 doesn't exist
+   };
+
void
init_targets() {
   static bool targets_initialized = false;
@@ -93,6 +110,52 @@ namespace {
   return ctx;
}
 
+   struct cl_version
+   get_cl_version(const std::string _str,
+  unsigned max = ANY_VERSION) {
+  for (struct cl_version version : cl_versions) {
+ if (version.version_number == max || version.version_str == version_str) {
+return version;
+ }
+  }
+  throw build_error("Unknown/Unsupported language version");
+   }
+
+   clang::LangStandard::Kind
+   get_lang_standard_from_version_str(const std::string _str,
+  bool is_build_opt = false) {
+   /**
+   * Per CL 2.0 spec, section 5.8.4.5:
+   * If it's an option, use the value directly.
+   * If it's a device version, clamp to max 1.x version, a.k.a. 1.2
+   */
+  struct cl_version version = get_cl_version(version_str,
+  is_build_opt ? ANY_VERSION : 120);
+  return version.clc_lang_standard;
+   }
+
+   clang::LangStandard::Kind
+   get_language_version(const std::vector ,
+const std::string _version) {
+
+  const std::string search = "-cl-std=CL";
+
+  for(auto opt: opts){
+ auto pos = opt.find(search);
+ if (pos == 0){
+auto ver = opt.substr(pos+search.size());
+auto device_ver = get_cl_version(device_version);
+auto requested = get_cl_version(ver);
+if (requested.version_number > device_ver.version_number) {
+   throw build_error();
+}
+return get_lang_standard_from_version_str(ver, true);
+ }
+  }
+
+  return get_lang_standard_from_version_str(device_version);
+   }
+
std::unique_ptr
create_compiler_instance(const device ,
 const std::vector ,
-- 
2.14.1



[Mesa-dev] [PATCH 0/5 v3] A few clover fixes for both CTS and eventual 1.2 support

2018-03-01 Thread Aaron Watry
The first two patches of the previous series [1] landed upstream a while back.

When I bump my platform/device versions to 1.2, the clang instance has
been confirmed to enable 1.2 language features (like the static keyword
required in test/cl/program/execute/static.cl, which goes skip->pass), while
building a program with -cl-std=CL1.1 still leaves the 1.2 features disabled.

I've also updated my clover platform version to 2.2 and device version to
2.0 and verified that I see expected behavior when specifying a -cl-std of:
  1.0 - Invalid build options error (-cl-std=CL1.0 isn't valid).
  1.1 - Build proceeds with version 1.1
  1.2 - Build proceeds with version 1.2
  2.0 - Build proceeds with version 2.0
  2.1 - Invalid build options error
  2.2 - Invalid build options error

Changes since version 2:
  - Squashed several patches together as Pierre suggested and tried to avoid
introducing and then changing API in later patches
  - Enable the -cl-std and __OPENCL_VERSION__ changes at the same time
  - Removed the switch statements for version detection and created an array
of struct cl_version with several attributes that can be matched on instead.
Seems a lot cleaner that way (to me).

Major changes since v1:
  Addressed Pierre's build-breakage comments
  Added a check for cl-std > device_clc_version
  Added a patch to pass the device object down into invocation.cpp
instead of adding a bunch of device-based arguments.
  Use device_clc_version for cl version detection instead of device_version
  Added device_clc_version in device.cpp/hpp

Anyway, happy reviewing.

Cc: Jan Vesely 
Cc: Pierre Moreau 
Cc: Francisco Jerez 


1 - https://lists.freedesktop.org/archives/mesa-dev/2017-July/164699.html



[Mesa-dev] [PATCH 2/5] clover: Pass device to llvm::create_compiler_instance

2018-03-01 Thread Aaron Watry
We'll be using dev.device_clc_version to select the default language version
soon along with the existing ir_target field.

Signed-off-by: Aaron Watry 
Cc: Pierre Moreau 
Cc: Jan Vesely 

v4: Pass the device down instead of device_clc_version as a separate field
v3: Revise to acknowledge that we now have the device in compile/link_program
instead of the string values.
v2: (Pierre) Move changes to create_compiler_instance invocation to correct
patch to prevent temporary build breakage.
(Jan) Use device_clc_version instead of device_version for compile/link
---
 src/gallium/state_trackers/clover/llvm/invocation.cpp | 10 ++
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/src/gallium/state_trackers/clover/llvm/invocation.cpp 
b/src/gallium/state_trackers/clover/llvm/invocation.cpp
index c8c0311a3a..42aabfb9a3 100644
--- a/src/gallium/state_trackers/clover/llvm/invocation.cpp
+++ b/src/gallium/state_trackers/clover/llvm/invocation.cpp
@@ -94,7 +94,7 @@ namespace {
}
 
std::unique_ptr
-   create_compiler_instance(const target ,
+   create_compiler_instance(const device ,
 const std::vector ,
 std::string _log) {
   std::unique_ptr c { new clang::CompilerInstance 
};
@@ -108,6 +108,9 @@ namespace {
   const std::vector copts =
  map(std::mem_fn(::string::c_str), opts);
 
+  const target  = dev.ir_target();
+  const std::string _clc_version = dev.device_clc_version();
+
   if (!clang::CompilerInvocation::CreateFromArgs(
  c->getInvocation(), copts.data(), copts.data() + copts.size(), 
diag))
  throw invalid_build_options_error();
@@ -208,8 +211,7 @@ clover::llvm::compile_program(const std::string ,
   debug::log(".cl", "// Options: " + opts + '\n' + source);
 
auto ctx = create_context(r_log);
-   auto c = create_compiler_instance(dev.ir_target(),
- tokenize(opts + " input.cl"), r_log);
+   auto c = create_compiler_instance(dev, tokenize(opts + " input.cl"), r_log);
auto mod = compile(*ctx, *c, "input.cl", source, headers, dev.ir_target(),
   opts, r_log);
 
@@ -276,7 +278,7 @@ clover::llvm::link_program(const std::vector 
,
erase_if(equals("-create-library"), options);
 
auto ctx = create_context(r_log);
-   auto c = create_compiler_instance(dev.ir_target(), options, r_log);
+   auto c = create_compiler_instance(dev, options, r_log);
auto mod = link(*ctx, *c, modules, r_log);
 
optimize(*mod, c->getCodeGenOpts().OptimizationLevel, !create_library);
-- 
2.14.1



[Mesa-dev] [PATCH 1/5] clover/llvm: Use device in llvm compilation instead of copying fields

2018-03-01 Thread Aaron Watry
Copying the individual fields from the device when compiling/linking
will lead to an unnecessarily large number of fields getting passed
around.
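
To illustrate the motivation, here is a small sketch contrasting the two
calling styles. The device class below is a hypothetical, stripped-down
stand-in with just enough accessors for the example, and describe_build()
is not a clover function.

```cpp
#include <string>
#include <utility>

// Hypothetical, minimal stand-in for a device object.
class device {
public:
   device(std::string target, std::string clc_version)
      : target_(std::move(target)), clc_version_(std::move(clc_version)) {}

   const std::string &ir_target() const { return target_; }
   const std::string &device_clc_version() const { return clc_version_; }

private:
   std::string target_;
   std::string clc_version_;
};

// Field-copying style: every device property needed further down becomes
// one more parameter on every function in the call chain.
std::string
describe_build(const std::string &target, const std::string &clc_version)
{
   return target + " (CLC " + clc_version + ")";
}

// Device-passing style: the signature stays the same when a later change
// needs another device property.
std::string
describe_build(const device &dev)
{
   return describe_build(dev.ir_target(), dev.device_clc_version());
}
```

Both overloads produce the same result. The difference is only in how many
signatures have to change when a new device property is needed deeper in
the call chain.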

v3: Rebase on current master
v2: Use device in function args before making additional changes in following 
patches

Signed-off-by: Aaron Watry 
Cc: Jan Vesely 
Cc: Pierre Moreau 
---
 src/gallium/state_trackers/clover/core/program.cpp   |  7 +++
 .../state_trackers/clover/llvm/invocation.cpp| 20 ++--
 .../state_trackers/clover/llvm/invocation.hpp|  5 ++---
 3 files changed, 15 insertions(+), 17 deletions(-)

diff --git a/src/gallium/state_trackers/clover/core/program.cpp 
b/src/gallium/state_trackers/clover/core/program.cpp
index ae4b50a879..4e74fccd97 100644
--- a/src/gallium/state_trackers/clover/core/program.cpp
+++ b/src/gallium/state_trackers/clover/core/program.cpp
@@ -53,8 +53,8 @@ program::compile(const ref_vector , const 
std::string ,
  try {
 const module m = (dev.ir_format() == PIPE_SHADER_IR_TGSI ?
   tgsi::compile_program(_source, log) :
-  llvm::compile_program(_source, headers,
-dev.ir_target(), opts, 
log));
+  llvm::compile_program(_source, headers, dev,
+opts, log));
 _builds[] = { m, opts, log };
  } catch (...) {
 _builds[] = { module(), opts, log };
@@ -78,8 +78,7 @@ program::link(const ref_vector , const 
std::string ,
   try {
  const module m = (dev.ir_format() == PIPE_SHADER_IR_TGSI ?
tgsi::link_program(ms) :
-   llvm::link_program(ms, dev.ir_format(),
-  dev.ir_target(), opts, log));
+   llvm::link_program(ms, dev, opts, log));
  _builds[] = { m, opts, log };
   } catch (...) {
  _builds[] = { module(), opts, log };
diff --git a/src/gallium/state_trackers/clover/llvm/invocation.cpp 
b/src/gallium/state_trackers/clover/llvm/invocation.cpp
index e4ca5fa444..c8c0311a3a 100644
--- a/src/gallium/state_trackers/clover/llvm/invocation.cpp
+++ b/src/gallium/state_trackers/clover/llvm/invocation.cpp
@@ -201,17 +201,17 @@ namespace {
 module
 clover::llvm::compile_program(const std::string ,
   const header_map ,
-  const std::string ,
+  const device ,
   const std::string ,
   std::string _log) {
if (has_flag(debug::clc))
   debug::log(".cl", "// Options: " + opts + '\n' + source);
 
auto ctx = create_context(r_log);
-   auto c = create_compiler_instance(target, tokenize(opts + " input.cl"),
- r_log);
-   auto mod = compile(*ctx, *c, "input.cl", source, headers, target, opts,
-  r_log);
+   auto c = create_compiler_instance(dev.ir_target(),
+ tokenize(opts + " input.cl"), r_log);
+   auto mod = compile(*ctx, *c, "input.cl", source, headers, dev.ir_target(),
+  opts, r_log);
 
if (has_flag(debug::llvm))
   debug::log(".ll", print_module_bitcode(*mod));
@@ -269,14 +269,14 @@ namespace {
 
 module
 clover::llvm::link_program(const std::vector ,
-   enum pipe_shader_ir ir, const std::string ,
+   const device ,
const std::string , std::string _log) {
std::vector options = tokenize(opts + " input.cl");
const bool create_library = count("-create-library", options);
erase_if(equals("-create-library"), options);
 
auto ctx = create_context(r_log);
-   auto c = create_compiler_instance(target, options, r_log);
+   auto c = create_compiler_instance(dev.ir_target(), options, r_log);
auto mod = link(*ctx, *c, modules, r_log);
 
optimize(*mod, c->getCodeGenOpts().OptimizationLevel, !create_library);
@@ -291,11 +291,11 @@ clover::llvm::link_program(const std::vector 
,
if (create_library) {
   return build_module_library(*mod, module::section::text_library);
 
-   } else if (ir == PIPE_SHADER_IR_NATIVE) {
+   } else if (dev.ir_format() == PIPE_SHADER_IR_NATIVE) {
   if (has_flag(debug::native))
- debug::log(id +  ".asm", print_module_native(*mod, target));
+ debug::log(id +  ".asm", print_module_native(*mod, dev.ir_target()));
 
-  return build_module_native(*mod, target, *c, r_log);
+  return build_module_native(*mod, dev.ir_target(), *c, r_log);
 
} else {
   unreachable("Unsupported IR.");
diff --git a/src/gallium/state_trackers/clover/llvm/invocation.hpp 
b/src/gallium/state_trackers/clover/llvm/invocation.hpp
index 5b3530c382..ff9caa457c 100644
--- 

[Mesa-dev] [PATCH 3/5] clover/llvm: Pass device down to compile

2018-03-01 Thread Aaron Watry
We'll need to be able to detect device version to define the appropriate
__OPENCL_VERSION__ header.

v2: Rebase after removing the previous patch (Pierre)
  - Removed "clover: Add device_clc_version to llvm::create_compiler_instance"

Signed-off-by: Aaron Watry 
Cc: Pierre Moreau 
---
 src/gallium/state_trackers/clover/llvm/invocation.cpp | 7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/src/gallium/state_trackers/clover/llvm/invocation.cpp 
b/src/gallium/state_trackers/clover/llvm/invocation.cpp
index 42aabfb9a3..1924c0317f 100644
--- a/src/gallium/state_trackers/clover/llvm/invocation.cpp
+++ b/src/gallium/state_trackers/clover/llvm/invocation.cpp
@@ -146,7 +146,7 @@ namespace {
std::unique_ptr
compile(LLVMContext , clang::CompilerInstance ,
const std::string , const std::string ,
-   const header_map , const std::string ,
+   const header_map , const device ,
const std::string , std::string _log) {
   c.getFrontendOpts().ProgramAction = clang::frontend::EmitLLVMOnly;
   c.getHeaderSearchOpts().UseBuiltinIncludes = true;
@@ -190,7 +190,7 @@ namespace {
   // barrier() (e.g. Moving barrier() inside a conditional that is
   // no executed by all threads) during its optimizaton passes.
   compat::add_link_bitcode_file(c.getCodeGenOpts(),
-LIBCLC_LIBEXECDIR + target + ".bc");
+LIBCLC_LIBEXECDIR + dev.ir_target() + 
".bc");
 
   // Compile the code
   clang::EmitLLVMOnlyAction act();
@@ -212,8 +212,7 @@ clover::llvm::compile_program(const std::string ,
 
auto ctx = create_context(r_log);
auto c = create_compiler_instance(dev, tokenize(opts + " input.cl"), r_log);
-   auto mod = compile(*ctx, *c, "input.cl", source, headers, dev.ir_target(),
-  opts, r_log);
+   auto mod = compile(*ctx, *c, "input.cl", source, headers, dev, opts, r_log);
 
if (has_flag(debug::llvm))
   debug::log(".ll", print_module_bitcode(*mod));
-- 
2.14.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 5/5] clover: Dynamically calculate __OPENCL_VERSION__ and CLC language version

2018-03-01 Thread Aaron Watry
Use get_language_version to calculate default cl standard based on
device capabilities and -cl-std specified in build options.

v4: Squash the __OPENCL_VERSION__ and CLC language version patches
v3: (Jan) Allow device_version up to 2.2 while device_clc_version
only goes to 2.0
Use get_cl_version to calculate version instead
v2: Split out from the previous patch (Pierre)

Signed-off-by: Aaron Watry 
CC: Pierre Moreau 
CC: Jan Vesely 
---
 src/gallium/state_trackers/clover/llvm/invocation.cpp | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/src/gallium/state_trackers/clover/llvm/invocation.cpp 
b/src/gallium/state_trackers/clover/llvm/invocation.cpp
index 8d76f203de..f146695585 100644
--- a/src/gallium/state_trackers/clover/llvm/invocation.cpp
+++ b/src/gallium/state_trackers/clover/llvm/invocation.cpp
@@ -194,7 +194,7 @@ namespace {
   compat::set_lang_defaults(c->getInvocation(), c->getLangOpts(),
 compat::ik_opencl, 
::llvm::Triple(target.triple),
 c->getPreprocessorOpts(),
-clang::LangStandard::lang_opencl11);
+get_language_version(opts, 
device_clc_version));
 
   c->createDiagnostics(new clang::TextDiagnosticPrinter(
   *new raw_string_ostream(r_log),
@@ -225,7 +225,9 @@ namespace {
   c.getPreprocessorOpts().Includes.push_back("clc/clc.h");
 
   // Add definition for the OpenCL version
-  c.getPreprocessorOpts().addMacroDef("__OPENCL_VERSION__=110");
+  c.getPreprocessorOpts().addMacroDef("__OPENCL_VERSION__=" +
+  std::to_string(get_cl_version(
+  dev.device_version()).version_number));
 
   // clc.h requires that this macro be defined:
   c.getPreprocessorOpts().addMacroDef("cl_clang_storage_class_specifiers");
-- 
2.14.1



[Mesa-dev] [PATCH] Update the documentation for meson

2018-03-01 Thread Dylan Baker
Meson is pretty well tested and works in most configurations now, so we
can remove the warning about it being unsuited for actual use.

It's also worth documenting that meson 0.42.0 or greater is required.

Signed-off-by: Dylan Baker 
---
 docs/meson.html | 34 +-
 1 file changed, 21 insertions(+), 13 deletions(-)

diff --git a/docs/meson.html b/docs/meson.html
index 77f89b0c6c7..782cc198649 100644
--- a/docs/meson.html
+++ b/docs/meson.html
@@ -18,11 +18,20 @@
 
 1. Basic Usage
 
-The Meson build system for Mesa is still under active development,
-and should not be used in production environments.
+The Meson build system is generally considered stable and ready
+for production.
 
-The meson build is currently only tested on linux, and is known to not work
-on macOS, Windows, and haiku. This will be fixed.
+The meson build is currently known to work on Linux, macOS, Cygwin, Haiku,
+FreeBSD, DragonflyBSD, and NetBSD, and it is believed to work on OpenBSD.
+
+Mesa requires Meson >= 0.42.0 to build in general.
+
+Additionally, to build the Clover OpenCL state tracker or the OpenSWR driver
+meson 0.44.0 or greater is required.
+
+Some older versions of meson do not check that they are too old and will error
+out in odd ways.
+
 
 
 The meson program is used to configure the source directory and generates
@@ -122,12 +131,11 @@ llvm-config, so using an LLVM from a non-standard path is 
as easy as
 PKG_CONFIG_PATH
 The
 pkg-config utility is a hard requirement for configuring and
-building Mesa on Linux and *BSD. It is used to search for external libraries
-on the system. This environment variable is used to control the search
-path for pkg-config. For instance, setting
-PKG_CONFIG_PATH=/usr/X11R6/lib/pkgconfig will search for
-package metadata in /usr/X11R6 before the standard
-directories.
+building Mesa on Unix-like systems. It is used to search for external libraries
+on the system. This environment variable is used to control the search path for
+pkg-config. For instance, setting
+PKG_CONFIG_PATH=/usr/X11R6/lib/pkgconfig will search for package
+metadata in /usr/X11R6 before the standard directories.
 
 
 
@@ -151,9 +159,9 @@ may interfer with debbugging as some code and validation 
will be optimized
 away.
 
 
- For those wishing to pass their own -O option, use the "plain" buildtype,
-which cuases meson to inject no additional compiler arguments, only those in
-the C/CXXFLAGS and those that mesa itself defines.
+ For those wishing to pass their own optimization flags, use the "plain"
+buildtype, which causes meson to inject no additional compiler arguments, only
+those in the C/CXXFLAGS and those that mesa itself defines.
 
 
 
-- 
2.16.2



Re: [Mesa-dev] [PATCH 5/6] intel/fs: Handle surface opcode sample masks via predication.

2018-03-01 Thread Kenneth Graunke
On Tuesday, February 27, 2018 1:38:27 PM PST Francisco Jerez wrote:
> The main motivation is to enable HDC surface opcodes on ICL which no
> longer allows the sample mask to be provided in a message header, but
> this is enabled all the way back to IVB when possible because it
> decreases the instruction count of some shaders using HDC messages
> significantly, e.g. one of the SynMark2 CSDof compute shaders
> decreases instruction count by about 40% due to the removal of header
> setup boilerplate which in turn makes a number of send message
> payloads more easily CSE-able.  Shader-db results on SKL:
> 
>  total instructions in shared programs: 15325319 -> 15314384 (-0.07%)
>  instructions in affected programs: 311532 -> 300597 (-3.51%)
>  helped: 491
>  HURT: 1
> 
> Shader-db results on BDW where the optimization needs to be disabled
> in some cases due to hardware restrictions:
> 
>  total instructions in shared programs: 15604794 -> 15598028 (-0.04%)
>  instructions in affected programs: 220863 -> 214097 (-3.06%)
>  helped: 351
>  HURT: 0
> 
> The FPS of SynMark2 CSDof improves by 5.09% ±0.36% (n=10) on my SKL
> laptop with this change.
> ---
>  src/intel/compiler/brw_fs.cpp | 42 +-
>  1 file changed, 41 insertions(+), 1 deletion(-)
> 
> diff --git a/src/intel/compiler/brw_fs.cpp b/src/intel/compiler/brw_fs.cpp
> index 0b87d8ab14e..639432b4f49 100644
> --- a/src/intel/compiler/brw_fs.cpp
> +++ b/src/intel/compiler/brw_fs.cpp
> @@ -4432,6 +4432,8 @@ static void
>  lower_surface_logical_send(const fs_builder , fs_inst *inst, opcode op,
> const fs_reg _mask)
>  {
> +   const gen_device_info *devinfo = bld.shader->devinfo;
> +
> /* Get the logical send arguments. */
> const fs_reg  = inst->src[0];
> const fs_reg  = inst->src[1];
> @@ -4442,7 +,20 @@ lower_surface_logical_send(const fs_builder , 
> fs_inst *inst, opcode op,
> /* Calculate the total number of components of the payload. */
> const unsigned addr_sz = inst->components_read(0);
> const unsigned src_sz = inst->components_read(1);
> -   const unsigned header_sz = (sample_mask.file == BAD_FILE ? 0 : 1);
> +   /* From the BDW PRM Volume 7, page 147:
> +*
> +*  "For the Data Cache Data Port*, the header must be present for the
> +*   following message types: [...] Typed read/write/atomics"
> +*
> +* Earlier generations have a similar wording.  Because of this 
> restriction
> +* we don't attempt to implement sample masks via predication for such
> +* messages prior to Gen9, since we have to provide a header anyway.  On
> +* Gen11+ the header has been removed so we can only use predication.
> +*/
> +   const unsigned header_sz = devinfo->gen < 9 &&
> +  (op == SHADER_OPCODE_TYPED_SURFACE_READ ||
> +   op == SHADER_OPCODE_TYPED_SURFACE_WRITE ||
> +   op == SHADER_OPCODE_TYPED_ATOMIC) ? 1 : 0;
> const unsigned sz = header_sz + addr_sz + src_sz;
>  
> /* Allocate space for the payload. */
> @@ -4462,6 +4477,31 @@ lower_surface_logical_send(const fs_builder , 
> fs_inst *inst, opcode op,
>  
> bld.LOAD_PAYLOAD(payload, components, sz, header_sz);
>  
> +   /* Predicate the instruction on the sample mask if no header is
> +* provided.
> +*/
> +   if (!header_sz && sample_mask.file != BAD_FILE &&
> +   sample_mask.file != IMM) {
> +  const fs_builder ubld = bld.group(1, 0).exec_all();
> +  if (inst->predicate) {
> + assert(inst->predicate == BRW_PREDICATE_NORMAL);
> + assert(!inst->predicate_inverse);
> + assert(inst->flag_subreg < 2);
> + /* Combine the sample mask with the existing predicate by using a
> +  * vertical predication mode.
> +   */
> + inst->predicate = BRW_PREDICATE_ALIGN1_ALLV;
> + ubld.MOV(retype(brw_flag_subreg(inst->flag_subreg + 2),
> + sample_mask.type),
> +  sample_mask);

I was surprised to see flag_subreg remain unchanged here, but then I
re-read how allv works, and it does f0.0 & f1.0, or f0.1 & f1.1.  So,
we can leave it as 0 or 1 and it'll implicitly use 2 or 3 as well.

Series is:
Reviewed-by: Kenneth Graunke 
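
To make the ALLV point above concrete, here is a minimal sketch of the combination as described in this reply. It is a simplified model, not Mesa code: the struct flag_regs_sketch, the function allv_channel_mask and the 16-bit subregister layout (index 0 = f0.0, 1 = f0.1, 2 = f1.0, 3 = f1.1) are assumptions made for illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified, hypothetical model of the four flag subregisters:
 * index 0 = f0.0, 1 = f0.1, 2 = f1.0, 3 = f1.1.  This is not Mesa code.
 */
struct flag_regs_sketch {
   uint16_t subreg[4];
};

/* Channel enable mask under an ALLV-style predicate: a channel runs only
 * if its bit is set in both f0.x and the matching f1.x subregister.
 */
static uint16_t
allv_channel_mask(const struct flag_regs_sketch *f, unsigned flag_subreg)
{
   /* Mirrors the patch's assert(inst->flag_subreg < 2). */
   assert(flag_subreg < 2);
   return f->subreg[flag_subreg] & f->subreg[flag_subreg + 2];
}
```

In this model the existing predicate stays in subregister 0 or 1 and the sample mask is written to the subregister two slots higher, which is why flag_subreg itself does not need to change.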

> +  } else {
> + inst->flag_subreg = 2;
> + inst->predicate = BRW_PREDICATE_NORMAL;
> + ubld.MOV(retype(brw_flag_subreg(inst->flag_subreg), 
> sample_mask.type),
> +  sample_mask);
> +  }
> +   }
> +
> /* Update the original instruction. */
> inst->opcode = op;
> inst->mlen = header_sz + (addr_sz + src_sz) * inst->exec_size / 8;
> 





Re: [Mesa-dev] [PATCH] radeon/vcn: use enc profile instead of pic profile

2018-03-01 Thread Boyuan Zhang

Agree, I added the missing profile and entry_point to st/omx.
Please see the attached patch below.

On the radeon driver side, do you think we should still check the profile
in the encoder instead, since the profile shouldn't be changed during
encoding? Or can we just leave it with the picture profile with this fix?

From: Boyuan Zhang 

Profile and entry point were missing in the picture structure.
Therefore, add them back.

Signed-off-by: Boyuan Zhang 
---
 src/gallium/state_trackers/omx_bellagio/vid_enc.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/gallium/state_trackers/omx_bellagio/vid_enc.c 
b/src/gallium/state_trackers/omx_bellagio/vid_enc.c

index 1a4fb62..162ec1f 100644
--- a/src/gallium/state_trackers/omx_bellagio/vid_enc.c
+++ b/src/gallium/state_trackers/omx_bellagio/vid_enc.c
@@ -1098,6 +1098,8 @@ static void enc_HandleTask(omx_base_PortType 
*port, struct encode_task *task,


    picture.picture_type = picture_type;
    picture.pic_order_cnt = task->pic_order_cnt;
+   picture.base.profile = 
enc_TranslateOMXProfileToPipe(priv->profile_level.eProfile);

+   picture.base.entry_point = PIPE_VIDEO_ENTRYPOINT_ENCODE;
    if (priv->restricted_b_frames && picture_type == 
PIPE_H264_ENC_PICTURE_TYPE_B)

   picture.not_referenced = true;
    enc_ControlPicture(port, );
--
2.7.4



Re: [Mesa-dev] [PATCH 3/6] intel/ir: Allow arbitrary scratch flag registers for SHADER_OPCODE_FIND_LIVE_CHANNEL.

2018-03-01 Thread Kenneth Graunke
On Tuesday, February 27, 2018 1:38:25 PM PST Francisco Jerez wrote:
> This shouldn't cause any functional change at this point, it changes
> SHADER_OPCODE_FIND_LIVE_CHANNEL to use the flag register specified at
> the IR level instead of the hard-coded f1.0, now that it can be
> represented in backend_instruction::flag_subreg.  This will be
> necessary for scheduling to behave correctly once more things start
> making use of f1.0.
> ---
>  src/intel/compiler/brw_eu_emit.c| 5 +++--
>  src/intel/compiler/brw_fs.cpp   | 3 ++-
>  src/intel/compiler/brw_fs_builder.h | 2 +-
>  3 files changed, 6 insertions(+), 4 deletions(-)
> 
> diff --git a/src/intel/compiler/brw_fs.cpp b/src/intel/compiler/brw_fs.cpp
> index 6c86b1592fd..0b87d8ab14e 100644
> --- a/src/intel/compiler/brw_fs.cpp
> +++ b/src/intel/compiler/brw_fs.cpp
> @@ -931,7 +931,8 @@ fs_inst::flags_written() const
> if ((conditional_mod && (opcode != BRW_OPCODE_SEL &&
>  opcode != BRW_OPCODE_IF &&
>  opcode != BRW_OPCODE_WHILE)) ||
> -   opcode == FS_OPCODE_MOV_DISPATCH_TO_FLAGS) {
> +   opcode == FS_OPCODE_MOV_DISPATCH_TO_FLAGS ||
> +   opcode == SHADER_OPCODE_FIND_LIVE_CHANNEL) {

Looks like an unrelated fix?  It's probably fine here though.

>return flag_mask(this);
> } else {
>return flag_mask(dst, size_written);




Re: [Mesa-dev] [PATCH] gallium/util: use sockets on PIPE_OS_UNIX in u_network

2018-03-01 Thread Emil Velikov
On 28 February 2018 at 10:21, Jonathan Gray  wrote:
> Instead of listing all the UNIX PIPE_OS platforms just use
> PIPE_OS_UNIX.  Makes BSD sockets available on PIPE_OS_BSD.
>
> Signed-off-by: Jonathan Gray 
> ---
>  src/gallium/auxiliary/util/u_network.c | 9 +++--
>  src/gallium/auxiliary/util/u_network.h | 5 +
>  2 files changed, 4 insertions(+), 10 deletions(-)
>
Merged this and the PIPE_OS_BSD patch with Brian's R-B.

Jonathan, you can get commit access to push the reviewed patches.
See [1] for instructions and [2] for an example.

Thanks
Emil
[1] https://www.freedesktop.org/wiki/AccountRequests/
[2] https://bugs.freedesktop.org/show_bug.cgi?id=105296


Re: [Mesa-dev] [PATCH v2] travis: make Meson find the proper llvm-config

2018-03-01 Thread Emil Velikov
On 1 March 2018 at 18:15, Dylan Baker  wrote:
> Quoting Emil Velikov (2018-03-01 09:27:44)
>> On 1 March 2018 at 17:16, Dylan Baker  wrote:
>> > Quoting Emil Velikov (2018-02-28 16:01:56)
>> >> On 28 February 2018 at 21:18, Andres Gomez  wrote:
>> >> > Travis CI has moved to LLVM 5.0, and meson is detecting automatically
>> >> > the available version in /usr/local/bin based on the PATH env variable
>> >> > order preference.
>> >> >
>> >> > As for 0.44.x, Meson cannot receive the path to the llvm-config binary
>> >> > as a configuration parameter. See
>> >> > https://github.com/mesonbuild/meson/issues/2887 and
>> >> > https://github.com/dcbaker/meson/commit/7c8b6ee3fa42f43c9ac7dcacc61a77eca3f1bcef
>> >> >
>> >> > We want to use the custom (APT) installed version. Therefore, let's
>> >> > make Meson find our wanted version sooner than the one at
>> >> > /usr/local/bin
>> >> >
>> >> > Once this is corrected, we would still need a patch similar to:
>> >> > https://lists.freedesktop.org/archives/mesa-dev/2017-December/180217.html
>> >> >
>> >> > v2: Create the link only to the specificly wanted LLVM version (Gert).
>> >> >
>> >> > Cc: Eric Engestrom 
>> >> > Cc: Dylan Baker 
>> >> > Cc: Emil Velikov 
>> >> > Cc: Juan A. Suarez Romero 
>> >> > Cc: Gert Wollny 
>> >> > Cc: Jon Turney 
>> >> > Signed-off-by: Andres Gomez 
>> >> > Reviewed-and-Tested-by: Eric Engestrom 
>> >> > Reviewed-by: Dylan Baker 
>> >> > Reviewed-by: Juan A. Suarez 
>> >> > ---
>> >> >  .travis.yml | 30 ++
>> >> >  1 file changed, 26 insertions(+), 4 deletions(-)
>> >> >
>> >> > diff --git a/.travis.yml b/.travis.yml
>> >> > index 0ec08e5bff7..823111ca539 100644
>> >> > --- a/.travis.yml
>> >> > +++ b/.travis.yml
>> >> > @@ -34,6 +34,8 @@ matrix:
>> >> >  - LABEL="meson Vulkan"
>> >> >  - BUILD=meson
>> >> >  - MESON_OPTIONS="-Ddri-drivers= -Dgallium-drivers="
>> >> > +- LLVM_VERSION=4.0
>> >> > +- LLVM_CONFIG="llvm-config-${LLVM_VERSION}"
>> >> >addons:
>> >> >  apt:
>> >> >sources:
>> >> > @@ -573,8 +575,28 @@ script:
>> >> >scons $SCONS_TARGET && eval $SCONS_CHECK_COMMAND;
>> >> >  fi
>> >> >
>> >> > -  - if test "x$BUILD" = xmeson; then
>> >> > -  export CFLAGS="$CFLAGS -isystem`pwd`";
>> >> > -  meson _build $MESON_OPTIONS;
>> >> > -  ninja -C _build;
>> >> > +  - |
>> >> > +if test "x$BUILD" = xmeson; then
>> >> > +
>> >> > +  # Travis CI has moved to LLVM 5.0, and meson is detecting
>> >> > +  # automatically the available version in /usr/local/bin based on
>> >> > +  # the PATH env variable order preference.
>> >> > +  #
>> >> > +  # As for 0.44.x, Meson cannot receive the path to the
>> >> > +  # llvm-config binary as a configuration parameter. See
>> >> > +  # https://github.com/mesonbuild/meson/issues/2887 and
>> >> > +  # 
>> >> > https://github.com/dcbaker/meson/commit/7c8b6ee3fa42f43c9ac7dcacc61a77eca3f1bcef
>> >> > +  #
>> >> > +  # We want to use the custom (APT) installed version. Therefore,
>> >> > +  # let's make Meson find our wanted version sooner than the one
>> >> > +  # at /usr/local/bin
>> >> > +  #
>> >> > +  # Once this is corrected, we would still need a patch similar
>> >> > +  # to:
>> >> > +  # 
>> >> > https://lists.freedesktop.org/archives/mesa-dev/2017-December/180217.html
>> >> > +  test -f /usr/bin/$LLVM_CONFIG && ln -s /usr/bin/$LLVM_CONFIG 
>> >> > $HOME/prefix/bin/llvm-config
>> >> > +
>> >> Patch looks good,
>> >> Reviewed-by: Emil Velikov 
>> >>
>> >> Aside:
>> >> I'm not quite sure we need Eric's llvm-version toggle.
>> >>
>> >> Haven't looked exactly what meson does now, but it seems like it
>> >> probes for specifics binaries/locations.
>> >> So having something like /opt/bin/llvm-config-host-4.0 won't cut it -
>> >> a sort of pattern fairly common when using OE/Yocto.
>> >>
>> >> Let's keep that for another time,
>> >> Emil
>> >
>> > Since I wrote that logic in meson,
>> >
>> > It tries (in order), llvm-config, llvm-config-7svn, ... Until it
>> > finds one that satisfies the version requirements passed. Which means if 
>> > there
>> > is no `llvm-config` it will pick llvm-config-5.0 before llvm-config-4.0. 
>> > I've
>> > proposed adding a meson level option to force a specific llvm-config to be 
>> > used
>> > but it's been ignored thus far.
>> >
>> I would skim through OE/Yocto/others and point it out to the meson people.
>> Here's one example from LibreELEC [1] which seems fairly common.
>>
>> -Emil
>>
>> [1] 
>> 

Re: [Mesa-dev] [PATCH v2] travis: make Meson find the proper llvm-config

2018-03-01 Thread Dylan Baker
Quoting Emil Velikov (2018-03-01 09:27:44)
> On 1 March 2018 at 17:16, Dylan Baker  wrote:
> > Quoting Emil Velikov (2018-02-28 16:01:56)
> >> On 28 February 2018 at 21:18, Andres Gomez  wrote:
> >> > Travis CI has moved to LLVM 5.0, and meson is detecting automatically
> >> > the available version in /usr/local/bin based on the PATH env variable
> >> > order preference.
> >> >
> >> > As for 0.44.x, Meson cannot receive the path to the llvm-config binary
> >> > as a configuration parameter. See
> >> > https://github.com/mesonbuild/meson/issues/2887 and
> >> > https://github.com/dcbaker/meson/commit/7c8b6ee3fa42f43c9ac7dcacc61a77eca3f1bcef
> >> >
> >> > We want to use the custom (APT) installed version. Therefore, let's
> >> > make Meson find our wanted version sooner than the one at
> >> > /usr/local/bin
> >> >
> >> > Once this is corrected, we would still need a patch similar to:
> >> > https://lists.freedesktop.org/archives/mesa-dev/2017-December/180217.html
> >> >
> >> > v2: Create the link only to the specificly wanted LLVM version (Gert).
> >> >
> >> > Cc: Eric Engestrom 
> >> > Cc: Dylan Baker 
> >> > Cc: Emil Velikov 
> >> > Cc: Juan A. Suarez Romero 
> >> > Cc: Gert Wollny 
> >> > Cc: Jon Turney 
> >> > Signed-off-by: Andres Gomez 
> >> > Reviewed-and-Tested-by: Eric Engestrom 
> >> > Reviewed-by: Dylan Baker 
> >> > Reviewed-by: Juan A. Suarez 
> >> > ---
> >> >  .travis.yml | 30 ++
> >> >  1 file changed, 26 insertions(+), 4 deletions(-)
> >> >
> >> > diff --git a/.travis.yml b/.travis.yml
> >> > index 0ec08e5bff7..823111ca539 100644
> >> > --- a/.travis.yml
> >> > +++ b/.travis.yml
> >> > @@ -34,6 +34,8 @@ matrix:
> >> >  - LABEL="meson Vulkan"
> >> >  - BUILD=meson
> >> >  - MESON_OPTIONS="-Ddri-drivers= -Dgallium-drivers="
> >> > +- LLVM_VERSION=4.0
> >> > +- LLVM_CONFIG="llvm-config-${LLVM_VERSION}"
> >> >addons:
> >> >  apt:
> >> >sources:
> >> > @@ -573,8 +575,28 @@ script:
> >> >scons $SCONS_TARGET && eval $SCONS_CHECK_COMMAND;
> >> >  fi
> >> >
> >> > -  - if test "x$BUILD" = xmeson; then
> >> > -  export CFLAGS="$CFLAGS -isystem`pwd`";
> >> > -  meson _build $MESON_OPTIONS;
> >> > -  ninja -C _build;
> >> > +  - |
> >> > +if test "x$BUILD" = xmeson; then
> >> > +
> >> > +  # Travis CI has moved to LLVM 5.0, and meson is detecting
> >> > +  # automatically the available version in /usr/local/bin based on
> >> > +  # the PATH env variable order preference.
> >> > +  #
> >> > +  # As for 0.44.x, Meson cannot receive the path to the
> >> > +  # llvm-config binary as a configuration parameter. See
> >> > +  # https://github.com/mesonbuild/meson/issues/2887 and
> >> > +  # 
> >> > https://github.com/dcbaker/meson/commit/7c8b6ee3fa42f43c9ac7dcacc61a77eca3f1bcef
> >> > +  #
> >> > +  # We want to use the custom (APT) installed version. Therefore,
> >> > +  # let's make Meson find our wanted version sooner than the one
> >> > +  # at /usr/local/bin
> >> > +  #
> >> > +  # Once this is corrected, we would still need a patch similar
> >> > +  # to:
> >> > +  # 
> >> > https://lists.freedesktop.org/archives/mesa-dev/2017-December/180217.html
> >> > +  test -f /usr/bin/$LLVM_CONFIG && ln -s /usr/bin/$LLVM_CONFIG 
> >> > $HOME/prefix/bin/llvm-config
> >> > +
> >> Patch looks good,
> >> Reviewed-by: Emil Velikov 
> >>
> >> Aside:
> >> I'm not quite sure we need Eric's llvm-version toggle.
> >>
> >> Haven't looked exactly what meson does now, but it seems like it
> >> probes for specifics binaries/locations.
> >> So having something like /opt/bin/llvm-config-host-4.0 won't cut it -
> >> a sort of pattern fairly common when using OE/Yocto.
> >>
> >> Let's keep that for another time,
> >> Emil
> >
> > Since I wrote that logic in meson,
> >
> > It tries (in order), llvm-config, llvm-config-7svn, ... Until it
> > finds one that satisfies the version requirements passed. Which means if 
> > there
> > is no `llvm-config` it will pick llvm-config-5.0 before llvm-config-4.0. 
> > I've
> > proposed adding a meson level option to force a specific llvm-config to be 
> > used
> > but it's been ignored thus far.
> >
> I would skim through OE/Yocto/others and point it out to the meson people.
> Here's one example from LibreELEC [1] which seems fairly common.
> 
> -Emil
> 
> [1] 
> https://github.com/LibreELEC/LibreELEC.tv/blob/master/packages/lang/llvm/package.mk

That's a little different because they're potentially doing a cross compile.
Meson already allows you to specify an llvm-config in a cross file, just not for
native 

Re: [Mesa-dev] [PATCH] svga: fix blending regression

2018-03-01 Thread Brian Paul

Piglit fbo-drawbuffers2-blend

-Brian

On 03/01/2018 10:25 AM, Ilia Mirkin wrote:

Ok. Is there a test that failed? I'll probably have to fix up nv50 for it.

On Mar 1, 2018 11:43 AM, "Brian Paul" wrote:


On 02/28/2018 08:36 AM, Ilia Mirkin wrote:

Can st/mesa instead be fixed to maintain the original API? Other
drivers look in rt[0] in the non-independent_blend_enable case. For
example, freedreno and nouveau.


If independent blend is not in use, then rt[0] will have all the
blending info, as before.

The case that broke for us was when PIPE_CAP_INDEP_BLEND_FUNC==0 and
independent blend was enabled for rt[i] where i > 0.  Before, the
blend src/dst terms were in rt[0] but now they're in rt[i].

I'd rather not change the semantics again.

-Brian
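
A minimal sketch of the lookup this implies for a driver without independent blend functions, under stated assumptions: rt_blend_sketch and first_enabled_target are illustrative names rather than gallium API, and only the blend_enable field is modelled. The idea is to take the src/dst terms from the first target that has blending enabled instead of always reading rt[0].

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-render-target blend state; only the
 * field needed for the scan is modelled here.
 */
struct rt_blend_sketch {
   bool blend_enable;
};

/* Return the index of the first target with blending enabled, or -1 if
 * blending is disabled for every target.
 */
static int
first_enabled_target(const struct rt_blend_sketch *rt, int count)
{
   assert(rt != NULL);
   for (int i = 0; i < count; i++) {
      if (rt[i].blend_enable)
         return i;
   }
   return -1;
}
```

A caller would skip the blend-term translation entirely when the result is -1, and otherwise apply the terms of the returned target to every render target.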


On Wed, Feb 28, 2018 at 10:29 AM, Brian Paul wrote:

The earlier Mesa commit 3d06c8afb5 ("st/mesa: don't translate blend
state when it's disabled for a colorbuffer") subtly changed the
details of gallium's per-RT blend state.

In particular, when pipe_rt_blend_state[i].blend_enabled is true,
we have to get the src/dst blend terms from pipe_rt_blend_state[i],
not [0] as before.

We now have to scan the blend targets to find the first one that's
enabled (if any).  We have to use the index of that target for getting
the src/dst blend terms.  And note that we have to set identical blend
terms for all targets.

This fixes the Piglit fbo-drawbuffers2-blend test.  VMware bug 2063493.
---
 src/gallium/drivers/svga/svga_pipe_blend.c | 35 --
 1 file changed, 24 insertions(+), 11 deletions(-)

diff --git a/src/gallium/drivers/svga/svga_pipe_blend.c 
b/src/gallium/drivers/svga/svga_pipe_blend.c
index 04855fa..6bb9d94 100644
--- a/src/gallium/drivers/svga/svga_pipe_blend.c
+++ b/src/gallium/drivers/svga/svga_pipe_blend.c
@@ -148,6 +148,17 @@ svga_create_blend_state(struct pipe_context *pipe,
    if (!blend)
       return NULL;
 
+   /* Find index of first target with blending enabled.  -1 means blending
+    * is not enabled at all.
+    */
+   int first_enabled = -1;
+   for (i = 0; i < PIPE_MAX_COLOR_BUFS; i++) {
+      if (templ->rt[i].blend_enable) {
+         first_enabled = i;
+         break;
+      }
+   }
+
    /* Fill in the per-rendertarget blend state.  We currently only
     * support independent blend enable and colormask per render target.
     */
@@ -260,24 +271,26 @@ svga_create_blend_state(struct pipe_context *pipe,
          }
       }
       else {
-         /* Note: the vgpu10 device does not yet support independent
-          * blend terms per render target.  Target[0] always specifies the
-          * blending terms.
+         /* Note: the vgpu10 device does not yet support independent blend
+          * terms per render target.  When blending is enabled, the blend
+          * terms must match for all targets.
           */
-         if (templ->independent_blend_enable || templ->rt[0].blend_enable) {
-            /* always use the 0th target's blending terms for now */
+         if (first_enabled >= 0) {
+            /* use first enabled target's blending terms */
+            const struct pipe_rt_blend_state *rt = &templ->rt[first_enabled];
+
             blend->rt[i].srcblend =
-               svga_translate_blend_factor(svga, templ->rt[0].rgb_src_factor);
+               svga_translate_blend_factor(svga, rt->rgb_src_factor);
             blend->rt[i].dstblend =
-               svga_translate_blend_factor(svga, templ->rt[0].rgb_dst_factor);
+               svga_translate_blend_factor(svga, rt->rgb_dst_factor);
             blend->rt[i].blendeq =
-               svga_translate_blend_func(templ->rt[0].rgb_func);
+               svga_translate_blend_func(rt->rgb_func);
               

Re: [Mesa-dev] [PATCH v2] travis: make Meson find the proper llvm-config

2018-03-01 Thread Emil Velikov
On 1 March 2018 at 17:16, Dylan Baker  wrote:
> Quoting Emil Velikov (2018-02-28 16:01:56)
>> On 28 February 2018 at 21:18, Andres Gomez  wrote:
>> > Travis CI has moved to LLVM 5.0, and meson is detecting automatically
>> > the available version in /usr/local/bin based on the PATH env variable
>> > order preference.
>> >
>> > As for 0.44.x, Meson cannot receive the path to the llvm-config binary
>> > as a configuration parameter. See
>> > https://github.com/mesonbuild/meson/issues/2887 and
>> > https://github.com/dcbaker/meson/commit/7c8b6ee3fa42f43c9ac7dcacc61a77eca3f1bcef
>> >
>> > We want to use the custom (APT) installed version. Therefore, let's
>> > make Meson find our wanted version sooner than the one at
>> > /usr/local/bin
>> >
>> > Once this is corrected, we would still need a patch similar to:
>> > https://lists.freedesktop.org/archives/mesa-dev/2017-December/180217.html
>> >
>> > v2: Create the link only to the specificly wanted LLVM version (Gert).
>> >
>> > Cc: Eric Engestrom 
>> > Cc: Dylan Baker 
>> > Cc: Emil Velikov 
>> > Cc: Juan A. Suarez Romero 
>> > Cc: Gert Wollny 
>> > Cc: Jon Turney 
>> > Signed-off-by: Andres Gomez 
>> > Reviewed-and-Tested-by: Eric Engestrom 
>> > Reviewed-by: Dylan Baker 
>> > Reviewed-by: Juan A. Suarez 
>> > ---
>> >  .travis.yml | 30 ++
>> >  1 file changed, 26 insertions(+), 4 deletions(-)
>> >
>> > diff --git a/.travis.yml b/.travis.yml
>> > index 0ec08e5bff7..823111ca539 100644
>> > --- a/.travis.yml
>> > +++ b/.travis.yml
>> > @@ -34,6 +34,8 @@ matrix:
>> >  - LABEL="meson Vulkan"
>> >  - BUILD=meson
>> >  - MESON_OPTIONS="-Ddri-drivers= -Dgallium-drivers="
>> > +- LLVM_VERSION=4.0
>> > +- LLVM_CONFIG="llvm-config-${LLVM_VERSION}"
>> >addons:
>> >  apt:
>> >sources:
>> > @@ -573,8 +575,28 @@ script:
>> >scons $SCONS_TARGET && eval $SCONS_CHECK_COMMAND;
>> >  fi
>> >
>> > -  - if test "x$BUILD" = xmeson; then
>> > -  export CFLAGS="$CFLAGS -isystem`pwd`";
>> > -  meson _build $MESON_OPTIONS;
>> > -  ninja -C _build;
>> > +  - |
>> > +if test "x$BUILD" = xmeson; then
>> > +
>> > +  # Travis CI has moved to LLVM 5.0, and meson is detecting
>> > +  # automatically the available version in /usr/local/bin based on
>> > +  # the PATH env variable order preference.
>> > +  #
>> > +  # As for 0.44.x, Meson cannot receive the path to the
>> > +  # llvm-config binary as a configuration parameter. See
>> > +  # https://github.com/mesonbuild/meson/issues/2887 and
>> > +  # 
>> > https://github.com/dcbaker/meson/commit/7c8b6ee3fa42f43c9ac7dcacc61a77eca3f1bcef
>> > +  #
>> > +  # We want to use the custom (APT) installed version. Therefore,
>> > +  # let's make Meson find our wanted version sooner than the one
>> > +  # at /usr/local/bin
>> > +  #
>> > +  # Once this is corrected, we would still need a patch similar
>> > +  # to:
>> > +  # 
>> > https://lists.freedesktop.org/archives/mesa-dev/2017-December/180217.html
>> > +  test -f /usr/bin/$LLVM_CONFIG && ln -s /usr/bin/$LLVM_CONFIG 
>> > $HOME/prefix/bin/llvm-config
>> > +
>> Patch looks good,
>> Reviewed-by: Emil Velikov 
>>
>> Aside:
>> I'm not quite sure we need Eric's llvm-version toggle.
>>
>> Haven't looked exactly what meson does now, but it seems like it
>> probes for specifics binaries/locations.
>> So having something like /opt/bin/llvm-config-host-4.0 won't cut it -
>> a sort of pattern fairly common when using OE/Yocto.
>>
>> Let's keep that for another time,
>> Emil
>
> Since I wrote that logic in meson:
>
> It tries, in order, llvm-config, llvm-config-7svn, ... until it finds one
> that satisfies the version requirements passed. This means that if there is
> no `llvm-config`, it will pick llvm-config-5.0 before llvm-config-4.0. I've
> proposed adding a meson-level option to force a specific llvm-config to be
> used, but it's been ignored thus far.
>
I would skim through OE/Yocto/others and point it out to the meson people.
Here's one example from LibreELEC [1] which seems fairly common.

-Emil

[1] 
https://github.com/LibreELEC/LibreELEC.tv/blob/master/packages/lang/llvm/package.mk
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] svga: fix blending regression

2018-03-01 Thread Ilia Mirkin
Ok. Is there a test that failed? I'll probably have to fix up nv50 for it.

On Mar 1, 2018 11:43 AM, "Brian Paul"  wrote:

> On 02/28/2018 08:36 AM, Ilia Mirkin wrote:
>
>> Can st/mesa instead be fixed to maintain the original API? Other
>> drivers look in rt[0] in the non-independent_blend_enable case. For
>> example, freedreno and nouveau.
>>
>
> If independent blend is not in use, then rt[0] will have all the blending
> info, as before.
>
> The case that broke for us was when PIPE_CAP_INDEP_BLEND_FUNC==0 and
> independent blend was enabled for rt[i] where i > 0.  Before, the blend
> src/dst terms were in rt[0] but now they're in rt[i].
>
> I'd rather not change the semantics again.
>
> -Brian
>
>
>> On Wed, Feb 28, 2018 at 10:29 AM, Brian Paul  wrote:
>>
>>> The earlier Mesa commit 3d06c8afb5 ("st/mesa: don't translate blend
>>> state when it's disabled for a colorbuffer") subtly changed the
>>> details of gallium's per-RT blend state.
>>>
>>> In particular, when pipe_rt_blend_state[i].blend_enabled is true,
>>> we have to get the src/dst blend terms from pipe_rt_blend_state[i],
>>> not [0] as before.
>>>
>>> We now have to scan the blend targets to find the first one that's
>>> enabled (if any).  We have to use the index of that target for getting
>>> the src/dst blend terms.  And note that we have to set identical blend
>>> terms for all targets.
>>>
>>> This fixes the Piglit fbo-drawbuffers2-blend test.  VMware bug 2063493.
>>> ---
>>>   src/gallium/drivers/svga/svga_pipe_blend.c | 35
>>> --
>>>   1 file changed, 24 insertions(+), 11 deletions(-)
>>>
>>> diff --git a/src/gallium/drivers/svga/svga_pipe_blend.c
>>> b/src/gallium/drivers/svga/svga_pipe_blend.c
>>> index 04855fa..6bb9d94 100644
>>> --- a/src/gallium/drivers/svga/svga_pipe_blend.c
>>> +++ b/src/gallium/drivers/svga/svga_pipe_blend.c
>>> @@ -148,6 +148,17 @@ svga_create_blend_state(struct pipe_context *pipe,
>>>  if (!blend)
>>> return NULL;
>>>
>>> +   /* Find index of first target with blending enabled.  -1 means
>>> blending
>>> +* is not enabled at all.
>>> +*/
>>> +   int first_enabled = -1;
>>> +   for (i = 0; i < PIPE_MAX_COLOR_BUFS; i++) {
>>> +  if (templ->rt[i].blend_enable) {
>>> + first_enabled = i;
>>> + break;
>>> +  }
>>> +   }
>>> +
>>>  /* Fill in the per-rendertarget blend state.  We currently only
>>>   * support independent blend enable and colormask per render target.
>>>   */
>>> @@ -260,24 +271,26 @@ svga_create_blend_state(struct pipe_context *pipe,
>>>}
>>> }
>>> else {
>>> - /* Note: the vgpu10 device does not yet support independent
>>> -  * blend terms per render target.  Target[0] always specifies
>>> the
>>> -  * blending terms.
>>> + /* Note: the vgpu10 device does not yet support independent
>>> blend
>>> +  * terms per render target.  When blending is enabled, the
>>> blend
>>> +  * terms must match for all targets.
>>> */
>>> - if (templ->independent_blend_enable ||
>>> templ->rt[0].blend_enable) {
>>> -/* always use the 0th target's blending terms for now */
>>> + if (first_enabled >= 0) {
>>> +/* use first enabled target's blending terms */
>>> +const struct pipe_rt_blend_state *rt =
>>> &templ->rt[first_enabled];
>>> +
>>>   blend->rt[i].srcblend =
>>> -   svga_translate_blend_factor(svga,
>>> templ->rt[0].rgb_src_factor);
>>> +   svga_translate_blend_factor(svga, rt->rgb_src_factor);
>>>   blend->rt[i].dstblend =
>>> -   svga_translate_blend_factor(svga,
>>> templ->rt[0].rgb_dst_factor);
>>> +   svga_translate_blend_factor(svga, rt->rgb_dst_factor);
>>>   blend->rt[i].blendeq =
>>> -   svga_translate_blend_func(templ->rt[0].rgb_func);
>>> +   svga_translate_blend_func(rt->rgb_func);
>>>   blend->rt[i].srcblend_alpha =
>>> -   svga_translate_blend_factor(svga,
>>> templ->rt[0].alpha_src_factor);
>>> +   svga_translate_blend_factor(svga, rt->alpha_src_factor);
>>>   blend->rt[i].dstblend_alpha =
>>> -   svga_translate_blend_factor(svga,
>>> templ->rt[0].alpha_dst_factor);
>>> +   svga_translate_blend_factor(svga, rt->alpha_dst_factor);
>>>   blend->rt[i].blendeq_alpha =
>>> -   svga_translate_blend_func(templ->rt[0].alpha_func);
>>> +   svga_translate_blend_func(rt->alpha_func);
>>>
>>>   if (blend->rt[i].srcblend_alpha != blend->rt[i].srcblend ||
>>>   blend->rt[i].dstblend_alpha != blend->rt[i].dstblend ||
>>> --
>>> 2.7.4
>>>

Re: [Mesa-dev] [PATCH v2] travis: make Meson find the proper llvm-config

2018-03-01 Thread Dylan Baker
Quoting Emil Velikov (2018-02-28 16:01:56)
> On 28 February 2018 at 21:18, Andres Gomez  wrote:
> > Travis CI has moved to LLVM 5.0, and meson is detecting automatically
> > the available version in /usr/local/bin based on the PATH env variable
> > order preference.
> >
> > As of 0.44.x, Meson cannot receive the path to the llvm-config binary
> > as a configuration parameter. See
> > https://github.com/mesonbuild/meson/issues/2887 and
> > https://github.com/dcbaker/meson/commit/7c8b6ee3fa42f43c9ac7dcacc61a77eca3f1bcef
> >
> > We want to use the custom (APT) installed version. Therefore, let's
> > make Meson find our wanted version sooner than the one at
> > /usr/local/bin
> >
> > Once this is corrected, we would still need a patch similar to:
> > https://lists.freedesktop.org/archives/mesa-dev/2017-December/180217.html
> >
> > v2: Create the link only to the specifically wanted LLVM version (Gert).
> >
> > Cc: Eric Engestrom 
> > Cc: Dylan Baker 
> > Cc: Emil Velikov 
> > Cc: Juan A. Suarez Romero 
> > Cc: Gert Wollny 
> > Cc: Jon Turney 
> > Signed-off-by: Andres Gomez 
> > Reviewed-and-Tested-by: Eric Engestrom 
> > Reviewed-by: Dylan Baker 
> > Reviewed-by: Juan A. Suarez 
> > ---
> >  .travis.yml | 30 ++
> >  1 file changed, 26 insertions(+), 4 deletions(-)
> >
> > diff --git a/.travis.yml b/.travis.yml
> > index 0ec08e5bff7..823111ca539 100644
> > --- a/.travis.yml
> > +++ b/.travis.yml
> > @@ -34,6 +34,8 @@ matrix:
> >  - LABEL="meson Vulkan"
> >  - BUILD=meson
> >  - MESON_OPTIONS="-Ddri-drivers= -Dgallium-drivers="
> > +- LLVM_VERSION=4.0
> > +- LLVM_CONFIG="llvm-config-${LLVM_VERSION}"
> >addons:
> >  apt:
> >sources:
> > @@ -573,8 +575,28 @@ script:
> >scons $SCONS_TARGET && eval $SCONS_CHECK_COMMAND;
> >  fi
> >
> > -  - if test "x$BUILD" = xmeson; then
> > -  export CFLAGS="$CFLAGS -isystem`pwd`";
> > -  meson _build $MESON_OPTIONS;
> > -  ninja -C _build;
> > +  - |
> > +if test "x$BUILD" = xmeson; then
> > +
> > +  # Travis CI has moved to LLVM 5.0, and meson is detecting
> > +  # automatically the available version in /usr/local/bin based on
> > +  # the PATH env variable order preference.
> > +  #
> > +  # As for 0.44.x, Meson cannot receive the path to the
> > +  # llvm-config binary as a configuration parameter. See
> > +  # https://github.com/mesonbuild/meson/issues/2887 and
> > +  # 
> > https://github.com/dcbaker/meson/commit/7c8b6ee3fa42f43c9ac7dcacc61a77eca3f1bcef
> > +  #
> > +  # We want to use the custom (APT) installed version. Therefore,
> > +  # let's make Meson find our wanted version sooner than the one
> > +  # at /usr/local/bin
> > +  #
> > +  # Once this is corrected, we would still need a patch similar
> > +  # to:
> > +  # 
> > https://lists.freedesktop.org/archives/mesa-dev/2017-December/180217.html
> > +  test -f /usr/bin/$LLVM_CONFIG && ln -s /usr/bin/$LLVM_CONFIG 
> > $HOME/prefix/bin/llvm-config
> > +
> Patch looks good,
> Reviewed-by: Emil Velikov 
> 
> Aside:
> I'm not quite sure we need Eric's llvm-version toggle.
> 
> Haven't looked exactly what meson does now, but it seems like it
> probes for specific binaries/locations.
> So having something like /opt/bin/llvm-config-host-4.0 won't cut it -
> a sort of pattern fairly common when using OE/Yocto.
> 
> Let's keep that for another time,
> Emil

Since I wrote that logic in meson:

It tries, in order, llvm-config, llvm-config-7svn, ... until it finds one
that satisfies the version requirements passed. This means that if there is
no `llvm-config`, it will pick llvm-config-5.0 before llvm-config-4.0. I've
proposed adding a meson-level option to force a specific llvm-config to be
used, but it's been ignored thus far.
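
To illustrate, here is a rough C sketch of that search order. It is only a
model, not meson's actual code (which is Python): the candidate table, the
`installed` flag and the function name are made up for the example, and it
assumes the version requirement is a simple minimum.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* One llvm-config candidate. The fields are illustrative only. */
struct candidate {
   const char *name;
   int major, minor;   /* version the binary would report */
   bool installed;     /* whether it can be found in PATH */
};

/* Walk the candidates in order and return the first installed one whose
 * version is at least min_major.min_minor, or NULL if none qualifies. */
static const struct candidate *
pick_llvm_config(const struct candidate *c, size_t n,
                 int min_major, int min_minor)
{
   assert(c != NULL || n == 0);

   for (size_t i = 0; i < n; i++) {
      if (!c[i].installed)
         continue;
      if (c[i].major > min_major ||
          (c[i].major == min_major && c[i].minor >= min_minor))
         return &c[i];
   }
   return NULL;
}
```

With no plain `llvm-config` installed and a requirement of at least 4.0, a
table that lists llvm-config-5.0 ahead of llvm-config-4.0 yields the 5.0
entry, which is the situation the Travis symlink works around.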

Dylan




Re: [Mesa-dev] [PATCH 11/12] vbo: Remove vbo_save_vertex_list::buffer_offset.

2018-03-01 Thread Brian Paul

On 02/28/2018 12:10 AM, Mathias Fröhlich wrote:

Hi Brian,

On Wednesday, 28 February 2018 00:56:36 CET Brian Paul wrote:

On 02/26/2018 11:12 PM, mathias.froehl...@gmx.net wrote:

From: Mathias Fröhlich 

The buffer_offset is used in aligned_vertex_buffer_offset.
But now that most of these decisions are done in compile_vertex_list
we can work on local variables instead of struct members in the
display list code. Clean that up and remove buffer_offset.


I presume the optimization I implemented here still works after
this change.


I looked at what you did there last time,
and I have carefully tried to keep that behavior.

Well, the major purpose of the bigger series is that the direct OpenGL API
user, as well as internal users like the dlist and immediate mode code, can
build up VAOs that already contain just a single buffer object binding and so
on. It also gives the mesa layer a chance to see that there is no change in
the vertex arrays.

So, when I mention in the cover letter that more optimization should be
possible, I am referring to at least one completely unfinished change that I
tried while playing around to check for more optimizations. That is, the
display list compiler now keeps the VAOs from the previous list. One thing
that we can do now is to apply your optimization against the offset of the
previous display list's VAOs. The idea is that a lot of calling code compiles
the display lists in the same order in which they are later executed. So
checking the remainder of the division against the previous VAO's offset,
instead of the buffer object's start offset, helps much more often. Then, if
as a first-order optimization we can keep the dlist compiler's VAOs for as
long as possible, we in turn do not flag DriverFlags.NewArray, and the driver
does not even need to look at the arrays to detect changes.
I'll try to split out that easy change from the hacks for review within the
next week ...
But apart from that, the dlist compiler can now be hacked to keep the same VAO
used in the previous list, by offsetting the primitives, padding vertices, or
whatnot, so that the same pair of VAOs is shared across more successive dlist
nodes.
You can be pretty creative here ...


That sounds great.  At VMware we come across quite a few legacy GL apps 
that make heavy use of display lists, and often times, the applications 
are pretty inefficient with display list use.


My previous optimization applied to multiple glBegin/End primitives 
within a single display list (avoid re-emitting vertex array offsets for 
each draw call).  If we can do that with glBegin/End prims in separate 
display lists, that could be a really nice improvement.
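
For readers following along, a minimal sketch of the offset trick under
discussion. This is not the actual vbo code: the struct and function names
are hypothetical, and it assumes a single interleaved buffer with a fixed
vertex stride. When the byte offset of a primitive's vertices is a whole
number of vertices away from some base offset, the array binding can stay at
the base and the difference is folded into the draw's start vertex.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result type: what gets bound and where the draw starts. */
struct draw_start {
   uint32_t buffer_offset;  /* byte offset programmed into the vertex array */
   uint32_t start_vertex;   /* first vertex index passed to the draw call */
};

/* If (byte_offset - base_offset) is a multiple of the stride, keep the
 * array bound at base_offset and express the rest as a start vertex.
 * Otherwise fall back to rebinding at byte_offset with start vertex 0. */
static struct draw_start
fold_offset_into_start(uint32_t byte_offset, uint32_t base_offset,
                       uint32_t stride)
{
   struct draw_start d;
   uint32_t delta;

   assert(stride > 0);
   assert(byte_offset >= base_offset);

   delta = byte_offset - base_offset;
   if (delta % stride == 0) {
      d.buffer_offset = base_offset;
      d.start_vertex = delta / stride;
   } else {
      d.buffer_offset = byte_offset;
      d.start_vertex = 0;
   }
   return d;
}
```

Passing 0 as the base models the check against the buffer object's start;
passing the previous list's VAO offset models the extension Mathias
describes, which succeeds whenever consecutive lists use the same stride.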





BTW: I am only mentioning the legacy draw entry points here. But note that the
legacy entry points now basically use the same basic entry point that a
modern OpenGL application uses. This means that optimizing the modern main
draw entry point no longer partly collides with the existing dlist
optimizations.

The next changes will try to incrementally address the path from the VAO down
into the drivers.


If so, and with the minor comments on patch 4, the series LGTM.

Reviewed-by: Brian Paul 

Nice work!


Thanks for the review!!
I will apply the requested changes!
And rerun the tests wrt the inserted assertions.


Thanks.

-Brian




Re: [Mesa-dev] [PATCH] nir/search: Include 8 and 16-bit support in construct_value

2018-03-01 Thread Jason Ekstrand
Reviewed-by: Jason Ekstrand 

I think they make sense as two separate patches.  I've updated my patch
title to say that it's only for match_value and I'll push both once Jenkins
is done with them.

On Thu, Mar 1, 2018 at 9:06 AM, Jose Maria Casanova Crespo <
jmcasan...@igalia.com> wrote:

> ---
>  src/compiler/nir/nir_search.c | 15 +++
>  1 file changed, 15 insertions(+)
>
> diff --git a/src/compiler/nir/nir_search.c b/src/compiler/nir/nir_search.c
> index c7c52ae320d..28b36b2b863 100644
> --- a/src/compiler/nir/nir_search.c
> +++ b/src/compiler/nir/nir_search.c
> @@ -525,6 +525,9 @@ construct_value(const nir_search_value *value,
>case nir_type_float:
>   load->def.name = ralloc_asprintf(load, "%f", c->data.d);
>   switch (bitsize->dest_size) {
> + case 16:
> +load->value.u16[0] = _mesa_float_to_half(c->data.d);
> +break;
>   case 32:
>  load->value.f32[0] = c->data.d;
>  break;
> @@ -539,6 +542,12 @@ construct_value(const nir_search_value *value,
>case nir_type_int:
>   load->def.name = ralloc_asprintf(load, "%" PRIi64, c->data.i);
>   switch (bitsize->dest_size) {
> + case 8:
> +load->value.i8[0] = c->data.i;
> +break;
> + case 16:
> +load->value.i16[0] = c->data.i;
> +break;
>   case 32:
>  load->value.i32[0] = c->data.i;
>  break;
> @@ -553,6 +562,12 @@ construct_value(const nir_search_value *value,
>case nir_type_uint:
>   load->def.name = ralloc_asprintf(load, "%" PRIu64, c->data.u);
>   switch (bitsize->dest_size) {
> + case 8:
> +load->value.u8[0] = c->data.u;
> +break;
> + case 16:
> +load->value.u16[0] = c->data.u;
> +break;
>   case 32:
>  load->value.u32[0] = c->data.u;
>  break;
> --
> 2.14.3
>
>


Re: [Mesa-dev] [PATCH v3 5/6] tegra: Initial support

2018-03-01 Thread Dylan Baker
Quoting Thierry Reding (2018-03-01 05:54:53)
> Tegra K1 and later use a GPU that can be driven by the Nouveau driver.
> But the GPU is a pure render node and has no display engine, hence the
> scanout needs to happen on the Tegra display hardware. The GPU and the
> display engine each have a separate DRM device node exposed by the
> kernel.
> 
> To make the setup appear as a single device, this driver instantiates
> a Nouveau screen with each instance of a Tegra screen and forwards GPU
> requests to the Nouveau screen. For purposes of scanout it will import
> buffers created on the GPU into the display driver. Handles that
> userspace requests are those of the display driver so that they can be
> used to create framebuffers.
> 
> This has been tested with some GBM test programs, as well as kmscube and
> weston. All of those run without modifications, but I'm sure there is a
> lot that can be improved.
> 
> Some fixes contributed by Hector Martin .
> 

Also please make sure that any new meson.build files are in the EXTRA_DIST array
in the companion Makefile.am so that they'll be in the dist tarball.

Dylan




Re: [Mesa-dev] [PATCH v3 5/6] tegra: Initial support

2018-03-01 Thread Dylan Baker
Quoting Thierry Reding (2018-03-01 05:54:53)
> --- /dev/null
> +++ b/src/gallium/winsys/tegra/drm/meson.build
> @@ -0,0 +1,33 @@
> +# Copyright © 2018 NVIDIA CORPORATION
> +
> +# Permission is hereby granted, free of charge, to any person obtaining a 
> copy
> +# of this software and associated documentation files (the "Software"), to 
> deal
> +# in the Software without restriction, including without limitation the 
> rights
> +# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
> +# copies of the Software, and to permit persons to whom the Software is
> +# furnished to do so, subject to the following conditions:
> +
> +# The above copyright notice and this permission notice shall be included in
> +# all copies or substantial portions of the Software.
> +
> +# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> +# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> +# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
> +# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> +# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING 
> FROM,
> +# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN 
> THE
> +# SOFTWARE.
> +
> +libtegradrm = static_library(
> +  'tegradrm',
> +  'tegra_drm_winsys.c',
> +  include_directories : [
> +inc_include, inc_src, inc_gallium, inc_gallium_aux, inc_gallium_drivers,
> +inc_gallium_winsys
> +  ],
> +)
> +
> +driver_tegra = declare_dependency(
> +  compile_args : '-DGALLIUM_TEGRA',
> +  link_with : libtegradrm,
> +)

We don't need this driver_tegra, since the one in drivers/tegra is actually
complete, right?

Dylan




[Mesa-dev] [PATCH] nir/search: Include 8 and 16-bit support in construct_value

2018-03-01 Thread Jose Maria Casanova Crespo
---
 src/compiler/nir/nir_search.c | 15 +++
 1 file changed, 15 insertions(+)

diff --git a/src/compiler/nir/nir_search.c b/src/compiler/nir/nir_search.c
index c7c52ae320d..28b36b2b863 100644
--- a/src/compiler/nir/nir_search.c
+++ b/src/compiler/nir/nir_search.c
@@ -525,6 +525,9 @@ construct_value(const nir_search_value *value,
   case nir_type_float:
  load->def.name = ralloc_asprintf(load, "%f", c->data.d);
  switch (bitsize->dest_size) {
+ case 16:
+load->value.u16[0] = _mesa_float_to_half(c->data.d);
+break;
  case 32:
 load->value.f32[0] = c->data.d;
 break;
@@ -539,6 +542,12 @@ construct_value(const nir_search_value *value,
   case nir_type_int:
  load->def.name = ralloc_asprintf(load, "%" PRIi64, c->data.i);
  switch (bitsize->dest_size) {
+ case 8:
+load->value.i8[0] = c->data.i;
+break;
+ case 16:
+load->value.i16[0] = c->data.i;
+break;
  case 32:
 load->value.i32[0] = c->data.i;
 break;
@@ -553,6 +562,12 @@ construct_value(const nir_search_value *value,
   case nir_type_uint:
  load->def.name = ralloc_asprintf(load, "%" PRIu64, c->data.u);
  switch (bitsize->dest_size) {
+ case 8:
+load->value.u8[0] = c->data.u;
+break;
+ case 16:
+load->value.u16[0] = c->data.u;
+break;
  case 32:
 load->value.u32[0] = c->data.u;
 break;
-- 
2.14.3



Re: [Mesa-dev] [PATCH] nir/search: Support 8 and 16-bit constants

2018-03-01 Thread Chema Casanova
I've been checking the whole of nir_search.c, and 16-bit support is still
pending in the construct_value function. I'm sending a patch, so feel
free to squash it into yours if it makes sense.
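
As a rough illustration of what is missing there, assuming a simplified
stand-in for the constant storage (the union and function below are
hypothetical, not NIR's real types): the search constant is held as a 64-bit
value and has to be written into the field that matches the destination bit
size, so each size needs its own case.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-in for a per-bit-size constant store. */
union const_value {
   uint8_t  u8[4];
   uint16_t u16[4];
   uint32_t u32[4];
   uint64_t u64[4];
};

/* Write the 64-bit constant c into component 0 at the given bit size.
 * Narrower sizes keep only the low bits of c. */
static void
store_uint_const(union const_value *v, unsigned bit_size, uint64_t c)
{
   assert(v);

   switch (bit_size) {
   case 8:
      v->u8[0] = (uint8_t)c;
      break;
   case 16:
      v->u16[0] = (uint16_t)c;
      break;
   case 32:
      v->u32[0] = (uint32_t)c;
      break;
   case 64:
      v->u64[0] = c;
      break;
   default:
      assert(!"unsupported bit size");
   }
}
```

Without the 8 and 16 cases, a narrow destination would fall through to the
default branch, which is the gap the follow-up patch closes.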

In any case, this is:

Reviewed-by: Jose Maria Casanova Crespo 

On 28/02/18 at 22:18, Jason Ekstrand wrote:
> ---
>  src/compiler/nir/nir_search.c | 20 
>  1 file changed, 20 insertions(+)
> 
> diff --git a/src/compiler/nir/nir_search.c b/src/compiler/nir/nir_search.c
> index dec56fe..c7c52ae 100644
> --- a/src/compiler/nir/nir_search.c
> +++ b/src/compiler/nir/nir_search.c
> @@ -27,6 +27,7 @@
>  
>  #include 
>  #include "nir_search.h"
> +#include "util/half_float.h"
>  
>  struct match_state {
> bool inexact_match;
> @@ -194,6 +195,9 @@ match_value(const nir_search_value *value, nir_alu_instr 
> *instr, unsigned src,
>   for (unsigned i = 0; i < num_components; ++i) {
>  double val;
>  switch (load->def.bit_size) {
> +case 16:
> +   val = _mesa_half_to_float(load->value.u16[new_swizzle[i]]);
> +   break;
>  case 32:
> val = load->value.f32[new_swizzle[i]];
> break;
> @@ -213,6 +217,22 @@ match_value(const nir_search_value *value, nir_alu_instr 
> *instr, unsigned src,
>case nir_type_uint:
>case nir_type_bool32:
>   switch (load->def.bit_size) {
> + case 8:
> +for (unsigned i = 0; i < num_components; ++i) {
> +   if (load->value.u8[new_swizzle[i]] !=
> +   (uint8_t)const_val->data.u)
> +  return false;
> +}
> +return true;
> +
> + case 16:
> +for (unsigned i = 0; i < num_components; ++i) {
> +   if (load->value.u16[new_swizzle[i]] !=
> +   (uint16_t)const_val->data.u)
> +  return false;
> +}
> +return true;
> +
>   case 32:
>  for (unsigned i = 0; i < num_components; ++i) {
> if (load->value.u32[new_swizzle[i]] !=
> 


Re: [Mesa-dev] [PATCH v3 4/6] nouveau: Add framebuffer modifier support

2018-03-01 Thread Dylan Baker
Quoting Thierry Reding (2018-03-01 05:54:52)
> From: Thierry Reding 
> 
> This adds support for framebuffer modifiers to Nouveau. This will be
> used by the Tegra driver to share metadata about the format of buffers
> (such as the tiling mode or compression).
> 
> Changes in v2:
> - remove unused parameters to nouveau_buffer_create()
> - move format modifier query code to nvc0 backend
> - restrict format modifiers to 2D textures
> - implement ->query_dmabuf_modifiers()
> 
> Acked-by: Emil Velikov 
> Tested-by: Andre Heider 
> Signed-off-by: Thierry Reding 
> ---
>  src/gallium/drivers/nouveau/Android.mk   |  3 +
>  src/gallium/drivers/nouveau/Makefile.am  |  1 +
>  src/gallium/drivers/nouveau/nouveau_screen.c |  4 ++
>  src/gallium/drivers/nouveau/nv30/nv30_resource.c |  2 +
>  src/gallium/drivers/nouveau/nvc0/nvc0_miptree.c  | 81 
> +++-
>  src/gallium/drivers/nouveau/nvc0/nvc0_resource.c | 59 -
>  src/gallium/drivers/nouveau/nvc0/nvc0_resource.h |  3 +-
>  7 files changed, 149 insertions(+), 4 deletions(-)
> 
> diff --git a/src/gallium/drivers/nouveau/Android.mk 
> b/src/gallium/drivers/nouveau/Android.mk
> index 2de22e73ec18..a446774a86e8 100644
> --- a/src/gallium/drivers/nouveau/Android.mk
> +++ b/src/gallium/drivers/nouveau/Android.mk
> @@ -36,6 +36,9 @@ LOCAL_SRC_FILES := \
> $(NVC0_CODEGEN_SOURCES) \
> $(NVC0_C_SOURCES)
>  
> +LOCAL_C_INCLUDES := \
> +   $(MESA_TOP)/include/drm-uapi
> +
>  LOCAL_SHARED_LIBRARIES := libdrm_nouveau
>  LOCAL_MODULE := libmesa_pipe_nouveau
>  
> diff --git a/src/gallium/drivers/nouveau/Makefile.am 
> b/src/gallium/drivers/nouveau/Makefile.am
> index 91547178e397..f6126b544811 100644
> --- a/src/gallium/drivers/nouveau/Makefile.am
> +++ b/src/gallium/drivers/nouveau/Makefile.am
> @@ -24,6 +24,7 @@ include Makefile.sources
>  include $(top_srcdir)/src/gallium/Automake.inc
>  
>  AM_CPPFLAGS = \
> +   -I$(top_srcdir)/include/drm-uapi \

This needs to be added for the meson build as well, right? Should just need to
add "inc_drm_uapi" to the relevant "include_directories" lines.

> $(GALLIUM_DRIVER_CFLAGS) \
> $(LIBDRM_CFLAGS) \
> $(NOUVEAU_CFLAGS)
> diff --git a/src/gallium/drivers/nouveau/nouveau_screen.c 
> b/src/gallium/drivers/nouveau/nouveau_screen.c
> index c144b39b2dd2..b84ef13ebe7f 100644
> --- a/src/gallium/drivers/nouveau/nouveau_screen.c
> +++ b/src/gallium/drivers/nouveau/nouveau_screen.c
> @@ -1,3 +1,5 @@
> +#include 
> +
>  #include "pipe/p_defines.h"
>  #include "pipe/p_screen.h"
>  #include "pipe/p_state.h"
> @@ -23,6 +25,8 @@
>  #include "nouveau_mm.h"
>  #include "nouveau_buffer.h"
>  
> +#include "nvc0/nvc0_resource.h"
> +
>  /* XXX this should go away */
>  #include "state_tracker/drm_driver.h"
>  
> diff --git a/src/gallium/drivers/nouveau/nv30/nv30_resource.c 
> b/src/gallium/drivers/nouveau/nv30/nv30_resource.c
> index ff34f6e5f9fa..386bd3459bd3 100644
> --- a/src/gallium/drivers/nouveau/nv30/nv30_resource.c
> +++ b/src/gallium/drivers/nouveau/nv30/nv30_resource.c
> @@ -23,6 +23,8 @@
>   *
>   */
>  
> +#include 
> +
>  #include "util/u_format.h"
>  #include "util/u_inlines.h"
>  
> diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_miptree.c 
> b/src/gallium/drivers/nouveau/nvc0/nvc0_miptree.c
> index 27674f72a7c0..7983c4030876 100644
> --- a/src/gallium/drivers/nouveau/nvc0/nvc0_miptree.c
> +++ b/src/gallium/drivers/nouveau/nvc0/nvc0_miptree.c
> @@ -20,8 +20,11 @@
>   * OTHER DEALINGS IN THE SOFTWARE.
>   */
>  
> +#include 
> +
>  #include "pipe/p_state.h"
>  #include "pipe/p_defines.h"
> +#include "state_tracker/drm_driver.h"
>  #include "util/u_inlines.h"
>  #include "util/u_format.h"
>  
> @@ -233,9 +236,79 @@ nvc0_miptree_init_layout_tiled(struct nv50_miptree *mt)
> }
>  }
>  
> +static uint64_t nvc0_miptree_get_modifier(struct nv50_miptree *mt)
> +{
> +   union nouveau_bo_config *config = &mt->base.bo->config;
> +   uint64_t modifier;
> +
> +   if (mt->layout_3d)
> +  return DRM_FORMAT_MOD_INVALID;
> +
> +   switch (config->nvc0.memtype) {
> +   case 0x00:
> +  modifier = DRM_FORMAT_MOD_LINEAR;
> +  break;
> +
> +   case 0xfe:
> +  switch (NVC0_TILE_MODE_Y(config->nvc0.tile_mode)) {
> +  case 0:
> + modifier = DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK_ONE_GOB;
> + break;
> +
> +  case 1:
> + modifier = DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK_TWO_GOB;
> + break;
> +
> +  case 2:
> + modifier = DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK_FOUR_GOB;
> + break;
> +
> +  case 3:
> + modifier = DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK_EIGHT_GOB;
> + break;
> +
> +  case 4:
> + modifier = DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK_SIXTEEN_GOB;
> + break;
> +
> +  case 5:
> + modifier = DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK_THIRTYTWO_GOB;
> + break;
> +
> +  default:

Re: [Mesa-dev] [PATCH] r600/egd_tables.py: make the script python 2+3 compatible

2018-03-01 Thread Dylan Baker
Quoting sndir...@suse.de (2018-03-01 08:11:54)
> From: Stefan Dirsch 
> 
> Patch by "Tomas Chvatal"  with modifications
> by "Michal Srb"  to not break python 2.
> 
> https://bugzilla.suse.com/show_bug.cgi?id=1082303
> 
> v2:
> - no longer try to encode a unicode
> - make use of 'from __future__ import print_function', so semantics
>   of print statements in python2 are closer to print functions in python3
> 
> https://lists.freedesktop.org/archives/mesa-dev/2018-February/187056.html
> 
> Signed-off-by: Stefan Dirsch 
> Reviewed-by: Tomas Chvatal 
> Reviewed-by: Dylan Baker 

A process comment for you: Reviewed-by in mesa works like it does in the
kernel. It carries a very specific formal meaning, and you don't add
Reviewed-by (or Acked-by, Tested-by, etc.) unless someone explicitly says
that they give it.

Also, please use the natural name@domain form for email addresses. Scripts
that scrape addresses out of emails and git commits are not fooled by
"name at domain" anyway.

> ---
>  src/gallium/drivers/r600/egd_tables.py | 53 
> +-
>  1 file changed, 27 insertions(+), 26 deletions(-)
> 
> diff --git a/src/gallium/drivers/r600/egd_tables.py 
> b/src/gallium/drivers/r600/egd_tables.py
> index d7b78c7fb1..4796456330 100644
> --- a/src/gallium/drivers/r600/egd_tables.py
> +++ b/src/gallium/drivers/r600/egd_tables.py
> @@ -1,3 +1,4 @@
> +from __future__ import print_function
>  
>  CopyRight = '''
>  /*
> @@ -60,7 +61,7 @@ class StringTable:
>  """
>  fragments = [
>  '"%s\\0" /* %s */' % (
> -te[0].encode('string_escape'),
> +te[0],
>  ', '.join(str(idx) for idx in te[2])
>  )
>  for te in self.table
> @@ -217,10 +218,10 @@ def write_tables(regs, packets):
>  strings = StringTable()
>  strings_offsets = IntTable("int")
>  
> -print '/* This file is autogenerated by egd_tables.py from evergreend.h. 
> Do not edit directly. */'
> -print
> -print CopyRight.strip()
> -print '''
> +print('/* This file is autogenerated by egd_tables.py from evergreend.h. 
> Do not edit directly. */')
> +print('')
> +print(CopyRight.strip())
> +print('''
>  #ifndef EG_TABLES_H
>  #define EG_TABLES_H
>  
> @@ -242,20 +243,20 @@ struct eg_packet3 {
>  unsigned name_offset;
>  unsigned op;
>  };
> -'''
> +''')
>  
> -print 'static const struct eg_packet3 packet3_table[] = {'
> +print('static const struct eg_packet3 packet3_table[] = {')
>  for pkt in packets:
> -print '\t{%s, %s},' % (strings.add(pkt[5:]), pkt)
> -print '};'
> -print
> +print('\t{%s, %s},' % (strings.add(pkt[5:]), pkt))
> +print('};')
> +print('')
>  
> -print 'static const struct eg_field egd_fields_table[] = {'
> +print('static const struct eg_field egd_fields_table[] = {')
>  
>  fields_idx = 0
>  for reg in regs:
>  if len(reg.fields) and reg.own_fields:
> -print '\t/* %s */' % (fields_idx)
> +print('\t/* %s */' % (fields_idx))
>  
>  reg.fields_idx = fields_idx
>  
> @@ -266,34 +267,34 @@ struct eg_packet3 {
>  while value[1] >= len(values_offsets):
>  values_offsets.append(-1)
>  values_offsets[value[1]] = 
> strings.add(strip_prefix(value[0]))
> -print '\t{%s, %s(~0u), %s, %s},' % (
> +print('\t{%s, %s(~0u), %s, %s},' % (
>  strings.add(field.name), field.s_name,
> -len(values_offsets), 
> strings_offsets.add(values_offsets))
> +len(values_offsets), 
> strings_offsets.add(values_offsets)))
>  else:
> -print '\t{%s, %s(~0u)},' % (strings.add(field.name), 
> field.s_name)
> +print('\t{%s, %s(~0u)},' % (strings.add(field.name), 
> field.s_name))
>  fields_idx += 1
>  
> -print '};'
> -print
> +print('};')
> +print('')
>  
> -print 'static const struct eg_reg egd_reg_table[] = {'
> +print('static const struct eg_reg egd_reg_table[] = {')
>  for reg in regs:
>  if len(reg.fields):
> -print '\t{%s, %s, %s, %s},' % (strings.add(reg.name), reg.r_name,
> -len(reg.fields), reg.fields_idx if reg.own_fields else 
> reg.fields_owner.fields_idx)
> +print('\t{%s, %s, %s, %s},' % (strings.add(reg.name), reg.r_name,
> +len(reg.fields), reg.fields_idx if reg.own_fields else 
> reg.fields_owner.fields_idx))
>  else:
> -print '\t{%s, %s},' % (strings.add(reg.name), reg.r_name)
> -print '};'
> -print
> +print('\t{%s, %s},' % (strings.add(reg.name), reg.r_name))
> +print('};')
> +print('')
>  
>  strings.emit(sys.stdout, "egd_strings")
>  
> -print
> +print('')
>  
>  

Re: [Mesa-dev] [PATCH v2] disk cache: Link with -latomic if necessary

2018-03-01 Thread Dylan Baker
Quoting Thierry Reding (2018-03-01 05:28:07)
> From: Thierry Reding 
> 
> The disk cache implementation uses 64-bit atomic operations. For some
> architectures, such as 32-bit ARM, GCC will not be able to translate
> these operations into atomic, lock-free instructions and will instead
> rely on the external atomics library to provide these operations.
> 
> Check at configuration time whether or not linking against libatomic
> is necessary and if so, create a dependency that can be used while
> linking the mesautil library.
> 
> This is the meson equivalent of 2ef7f23820a6 ("configure: check if
> -latomic is needed for __atomic_*").
> 
> For some background information on this, see:
> 
> https://gcc.gnu.org/wiki/Atomic/GCCMM
> 
> Changes in v2:
> - clarify meaning of lock-free in commit message
> - fix build if -latomic is not necessary
> 
> Acked-by: Matt Turner 
> Signed-off-by: Thierry Reding 
> ---
>  meson.build  | 17 +
>  src/util/meson.build |  2 +-
>  2 files changed, 18 insertions(+), 1 deletion(-)
> 
> diff --git a/meson.build b/meson.build
> index e9928a379313..bb6a835084fe 100644
> --- a/meson.build
> +++ b/meson.build
> @@ -790,9 +790,26 @@ else
>  endif
>  
>  # Check for GCC style atomics
> +dep_atomic = declare_dependency()
> +
>  if cc.compiles('int main() { int n; return __atomic_load_n(&n, __ATOMIC_ACQUIRE); }',
> name : 'GCC atomic builtins')
>pre_args += '-DUSE_GCC_ATOMIC_BUILTINS'
> +
> +  # Not all atomic calls can be turned into lock-free instructions, in which
> +  # GCC will make calls into the libatomic library. Check whether we need to
> +  # link with -latomic.
> +  #
> +  # This can happen for 64-bit atomic operations on 32-bit architectures such
> +  # as ARM.
> +  if not cc.links('''#include <stdint.h>
> + int main() {
> +   uint64_t n;
> +   return (int)__atomic_load_n(&n, __ATOMIC_ACQUIRE);
> + }''',
> +  name : 'GCC atomic builtins required -latomic')
> +dep_atomic = cc.find_library('atomic')
> +  endif
>  endif
>  if not cc.links('''#include <stdint.h>
> uint64_t v;
> diff --git a/src/util/meson.build b/src/util/meson.build
> index b23dba3a9851..eece1cefef6a 100644
> --- a/src/util/meson.build
> +++ b/src/util/meson.build
> @@ -102,7 +102,7 @@ libmesa_util = static_library(
>'mesa_util',
>[files_mesa_util, format_srgb],
>include_directories : inc_common,
> -  dependencies : [dep_zlib, dep_clock, dep_thread],
> +  dependencies : [dep_zlib, dep_clock, dep_thread, dep_atomic],
>c_args : [c_msvc_compat_args, c_vis_args],
>build_by_default : false
>  )
> -- 
> 2.16.2
> 

Reviewed-by: Dylan Baker 
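
As a rough illustration of the kind of code that triggers the check in the quoted patch, here is a minimal sketch of 64-bit operations using the GCC __atomic builtins. The counter and the function names are hypothetical and not taken from the disk cache code. On targets where 8-byte atomics cannot be inlined, GCC may emit calls into libatomic for these, which is when -latomic would be needed at link time.

```c
#include <stdint.h>

/* Hypothetical 64-bit counter; a stand-in for any 64-bit shared state. */
static uint64_t counter;

/* Atomically add one and return the new value. */
uint64_t counter_bump(void)
{
    return __atomic_add_fetch(&counter, (uint64_t)1, __ATOMIC_SEQ_CST);
}

/* Atomically read the current value. */
uint64_t counter_read(void)
{
    return __atomic_load_n(&counter, __ATOMIC_ACQUIRE);
}

/* Nonzero if 8-byte atomics are always lock-free on this target, in which
 * case the compiler does not have to fall back to library calls. */
int counter_is_lock_free(void)
{
    return __atomic_always_lock_free(sizeof(uint64_t), 0);
}
```

If counter_is_lock_free() returns 0 for a given target, that is a hint (not a guarantee) that the link test in the patch would pick up libatomic there.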




Re: [Mesa-dev] [PATCH] svga: fix blending regression

2018-03-01 Thread Brian Paul

On 02/28/2018 08:36 AM, Ilia Mirkin wrote:

Can st/mesa instead be fixed to maintain the original API? Other
drivers look in rt[0] in the non-independent_blend_enable case. For
example, freedreno and nouveau.


If independent blend is not in use, then rt[0] will have all the 
blending info, as before.


The case that broke for us was when PIPE_CAP_INDEP_BLEND_FUNC==0 and 
independent blend was enabled for rt[i] where i > 0.  Before, the blend 
src/dst terms were in rt[0] but now they're in rt[i].


I'd rather not change the semantics again.

-Brian
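
The lookup described above can be sketched in isolation as follows. This is a simplified model, not the svga code: struct rt_blend and MAX_COLOR_BUFS are hypothetical stand-ins for pipe_rt_blend_state and PIPE_MAX_COLOR_BUFS, and only two of the blend terms are modelled.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_COLOR_BUFS 8   /* hypothetical stand-in for PIPE_MAX_COLOR_BUFS */

/* Hypothetical, reduced per-render-target blend state. */
struct rt_blend {
    bool blend_enable;
    int  rgb_src_factor;
    int  rgb_dst_factor;
};

/* Return the index of the first target with blending enabled, or -1 if
 * blending is not enabled on any target. */
int first_enabled_target(const struct rt_blend rt[MAX_COLOR_BUFS])
{
    for (int i = 0; i < MAX_COLOR_BUFS; i++) {
        if (rt[i].blend_enable)
            return i;
    }
    return -1;
}

/* Return the state whose blend terms should be applied to every target,
 * or NULL when blending is disabled everywhere. */
const struct rt_blend *blend_terms_source(const struct rt_blend rt[MAX_COLOR_BUFS])
{
    int first = first_enabled_target(rt);
    return first >= 0 ? &rt[first] : NULL;
}
```

With blending enabled only on target 2, blend_terms_source() returns that target's state rather than rt[0], which matches the case that regressed.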



On Wed, Feb 28, 2018 at 10:29 AM, Brian Paul  wrote:

The earlier Mesa commit 3d06c8afb5 ("st/mesa: don't translate blend
state when it's disabled for a colorbuffer") subtly changed the
details of gallium's per-RT blend state.

In particular, when pipe_rt_blend_state[i].blend_enabled is true,
we have to get the src/dst blend terms from pipe_rt_blend_state[i],
not [0] as before.

We now have to scan the blend targets to find the first one that's
enabled (if any).  We have to use the index of that target for getting
the src/dst blend terms.  And note that we have to set identical blend
terms for all targets.

This fixes the Piglit fbo-drawbuffers2-blend test.  VMware bug 2063493.
---
  src/gallium/drivers/svga/svga_pipe_blend.c | 35 --
  1 file changed, 24 insertions(+), 11 deletions(-)

diff --git a/src/gallium/drivers/svga/svga_pipe_blend.c 
b/src/gallium/drivers/svga/svga_pipe_blend.c
index 04855fa..6bb9d94 100644
--- a/src/gallium/drivers/svga/svga_pipe_blend.c
+++ b/src/gallium/drivers/svga/svga_pipe_blend.c
@@ -148,6 +148,17 @@ svga_create_blend_state(struct pipe_context *pipe,
 if (!blend)
return NULL;

+   /* Find index of first target with blending enabled.  -1 means blending
+* is not enabled at all.
+*/
+   int first_enabled = -1;
+   for (i = 0; i < PIPE_MAX_COLOR_BUFS; i++) {
+  if (templ->rt[i].blend_enable) {
+ first_enabled = i;
+ break;
+  }
+   }
+
 /* Fill in the per-rendertarget blend state.  We currently only
  * support independent blend enable and colormask per render target.
  */
@@ -260,24 +271,26 @@ svga_create_blend_state(struct pipe_context *pipe,
   }
}
else {
- /* Note: the vgpu10 device does not yet support independent
-  * blend terms per render target.  Target[0] always specifies the
-  * blending terms.
+ /* Note: the vgpu10 device does not yet support independent blend
+  * terms per render target.  When blending is enabled, the blend
+  * terms must match for all targets.
*/
- if (templ->independent_blend_enable || templ->rt[0].blend_enable) {
-/* always use the 0th target's blending terms for now */
+ if (first_enabled >= 0) {
+/* use first enabled target's blending terms */
+const struct pipe_rt_blend_state *rt = &templ->rt[first_enabled];
+
  blend->rt[i].srcblend =
-   svga_translate_blend_factor(svga, templ->rt[0].rgb_src_factor);
+   svga_translate_blend_factor(svga, rt->rgb_src_factor);
  blend->rt[i].dstblend =
-   svga_translate_blend_factor(svga, templ->rt[0].rgb_dst_factor);
+   svga_translate_blend_factor(svga, rt->rgb_dst_factor);
  blend->rt[i].blendeq =
-   svga_translate_blend_func(templ->rt[0].rgb_func);
+   svga_translate_blend_func(rt->rgb_func);
  blend->rt[i].srcblend_alpha =
-   svga_translate_blend_factor(svga, 
templ->rt[0].alpha_src_factor);
+   svga_translate_blend_factor(svga, rt->alpha_src_factor);
  blend->rt[i].dstblend_alpha =
-   svga_translate_blend_factor(svga, 
templ->rt[0].alpha_dst_factor);
+   svga_translate_blend_factor(svga, rt->alpha_dst_factor);
  blend->rt[i].blendeq_alpha =
-   svga_translate_blend_func(templ->rt[0].alpha_func);
+   svga_translate_blend_func(rt->alpha_func);

  if (blend->rt[i].srcblend_alpha != blend->rt[i].srcblend ||
  blend->rt[i].dstblend_alpha != blend->rt[i].dstblend ||
--
2.7.4



[Mesa-dev] [PATCH] r600/egd_tables.py: make the script python 2+3 compatible

2018-03-01 Thread sndirsch
From: Stefan Dirsch 

Patch by "Tomas Chvatal"  with modifications
by "Michal Srb"  to not break python 2.

https://bugzilla.suse.com/show_bug.cgi?id=1082303

v2:
- no longer try to encode a unicode
- make use of 'from __future__ import print_function', so semantics
  of print statements in python2 are closer to print functions in python3

https://lists.freedesktop.org/archives/mesa-dev/2018-February/187056.html

Signed-off-by: Stefan Dirsch 
Reviewed-by: Tomas Chvatal 
Reviewed-by: Dylan Baker 
---
 src/gallium/drivers/r600/egd_tables.py | 53 +-
 1 file changed, 27 insertions(+), 26 deletions(-)

diff --git a/src/gallium/drivers/r600/egd_tables.py 
b/src/gallium/drivers/r600/egd_tables.py
index d7b78c7fb1..4796456330 100644
--- a/src/gallium/drivers/r600/egd_tables.py
+++ b/src/gallium/drivers/r600/egd_tables.py
@@ -1,3 +1,4 @@
+from __future__ import print_function
 
 CopyRight = '''
 /*
@@ -60,7 +61,7 @@ class StringTable:
 """
 fragments = [
 '"%s\\0" /* %s */' % (
-te[0].encode('string_escape'),
+te[0],
 ', '.join(str(idx) for idx in te[2])
 )
 for te in self.table
@@ -217,10 +218,10 @@ def write_tables(regs, packets):
 strings = StringTable()
 strings_offsets = IntTable("int")
 
-print '/* This file is autogenerated by egd_tables.py from evergreend.h. 
Do not edit directly. */'
-print
-print CopyRight.strip()
-print '''
+print('/* This file is autogenerated by egd_tables.py from evergreend.h. 
Do not edit directly. */')
+print('')
+print(CopyRight.strip())
+print('''
 #ifndef EG_TABLES_H
 #define EG_TABLES_H
 
@@ -242,20 +243,20 @@ struct eg_packet3 {
 unsigned name_offset;
 unsigned op;
 };
-'''
+''')
 
-print 'static const struct eg_packet3 packet3_table[] = {'
+print('static const struct eg_packet3 packet3_table[] = {')
 for pkt in packets:
-print '\t{%s, %s},' % (strings.add(pkt[5:]), pkt)
-print '};'
-print
+print('\t{%s, %s},' % (strings.add(pkt[5:]), pkt))
+print('};')
+print('')
 
-print 'static const struct eg_field egd_fields_table[] = {'
+print('static const struct eg_field egd_fields_table[] = {')
 
 fields_idx = 0
 for reg in regs:
 if len(reg.fields) and reg.own_fields:
-print '\t/* %s */' % (fields_idx)
+print('\t/* %s */' % (fields_idx))
 
 reg.fields_idx = fields_idx
 
@@ -266,34 +267,34 @@ struct eg_packet3 {
 while value[1] >= len(values_offsets):
 values_offsets.append(-1)
 values_offsets[value[1]] = 
strings.add(strip_prefix(value[0]))
-print '\t{%s, %s(~0u), %s, %s},' % (
+print('\t{%s, %s(~0u), %s, %s},' % (
 strings.add(field.name), field.s_name,
-len(values_offsets), 
strings_offsets.add(values_offsets))
+len(values_offsets), 
strings_offsets.add(values_offsets)))
 else:
-print '\t{%s, %s(~0u)},' % (strings.add(field.name), 
field.s_name)
+print('\t{%s, %s(~0u)},' % (strings.add(field.name), 
field.s_name))
 fields_idx += 1
 
-print '};'
-print
+print('};')
+print('')
 
-print 'static const struct eg_reg egd_reg_table[] = {'
+print('static const struct eg_reg egd_reg_table[] = {')
 for reg in regs:
 if len(reg.fields):
-print '\t{%s, %s, %s, %s},' % (strings.add(reg.name), reg.r_name,
-len(reg.fields), reg.fields_idx if reg.own_fields else 
reg.fields_owner.fields_idx)
+print('\t{%s, %s, %s, %s},' % (strings.add(reg.name), reg.r_name,
+len(reg.fields), reg.fields_idx if reg.own_fields else 
reg.fields_owner.fields_idx))
 else:
-print '\t{%s, %s},' % (strings.add(reg.name), reg.r_name)
-print '};'
-print
+print('\t{%s, %s},' % (strings.add(reg.name), reg.r_name))
+print('};')
+print('')
 
 strings.emit(sys.stdout, "egd_strings")
 
-print
+print('')
 
 strings_offsets.emit(sys.stdout, "egd_strings_offsets")
 
-print
-print '#endif'
+print('')
+print('#endif')
 
 
 def main():
-- 
2.13.6



[Mesa-dev] [Bug 105296] Account request for Chema Casanova

2018-03-01 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=105296

Brian Paul  changed:

        What|Removed                        |Added
------------+-------------------------------+------------------------------------
   Component|Other                          |New Accounts
    Assignee|mesa-dev@lists.freedesktop.org |sitewranglers@lists.freedesktop.org
  QA Contact|mesa-dev@lists.freedesktop.org |
     Product|Mesa                           |freedesktop.org

--- Comment #3 from Brian Paul  ---
Reassigning to fd.o admins.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


Re: [Mesa-dev] [PATCH v2] clover/llvm: Drop support for LLVM < 3.9.

2018-03-01 Thread Jan Vesely
On Tue, 2018-02-20 at 18:17 +, Emil Velikov wrote:
> On 28 October 2017 at 00:32, Francisco Jerez  wrote:
> > Vedran Miletić  writes:
> > 
> > > v2: remove/inline compat stuff
> > > 
> > > Reviewed-by: Francisco Jerez 
> > > ---
> > >  .../state_trackers/clover/llvm/codegen/native.cpp  |   8 +-
> > >  src/gallium/state_trackers/clover/llvm/compat.hpp  | 109 
> > > +
> > >  .../state_trackers/clover/llvm/invocation.cpp  |  22 +++--
> > >  .../state_trackers/clover/llvm/metadata.hpp|  30 +-
> > >  4 files changed, 20 insertions(+), 149 deletions(-)
> > > 
> > > diff --git a/src/gallium/state_trackers/clover/llvm/codegen/native.cpp 
> > > b/src/gallium/state_trackers/clover/llvm/codegen/native.cpp
> > > index 12c83a92b6..38028597a3 100644
> > > --- a/src/gallium/state_trackers/clover/llvm/codegen/native.cpp
> > > +++ b/src/gallium/state_trackers/clover/llvm/codegen/native.cpp
> > > @@ -114,7 +114,7 @@ namespace {
> > > 
> > >std::unique_ptr tm {
> > >   t->createTargetMachine(target.triple, target.cpu, "", {},
> > > -compat::default_reloc_model,
> > > +::llvm::None,
> > >  compat::default_code_model,
> > >  ::llvm::CodeGenOpt::Default) };
> > >if (!tm)
> > > @@ -124,11 +124,11 @@ namespace {
> > >::llvm::SmallVector data;
> > > 
> > >{
> > > - compat::pass_manager pm;
> > > + ::llvm::legacy::PassManager pm;
> > >   ::llvm::raw_svector_ostream os { data };
> > > - compat::raw_ostream_to_emit_file fos(os);
> > > + ::llvm::raw_svector_ostream &fos(os);
> > > 
> > 
> > This is just a reference to the other local variable above.  Mind
> > cleaning it up?
> > 
> > > - mod.setDataLayout(compat::get_data_layout(*tm));
> > > + mod.setDataLayout(tm->createDataLayout());
> > >   tm->Options.MCOptions.AsmVerbose =
> > >  (ft == TargetMachine::CGFT_AssemblyFile);
> > > 
> > > diff --git a/src/gallium/state_trackers/clover/llvm/compat.hpp 
> > > b/src/gallium/state_trackers/clover/llvm/compat.hpp
> > > index f8b56516d5..ce3a29f7d5 100644
> > > --- a/src/gallium/state_trackers/clover/llvm/compat.hpp
> > > +++ b/src/gallium/state_trackers/clover/llvm/compat.hpp
> > > @@ -36,6 +36,8 @@
> > > 
> > >  #include "util/algorithm.hpp"
> > > 
> > > +#include 
> > > +#include 
> > >  #include 
> > >  #include 
> > >  #include 
> > > @@ -46,16 +48,6 @@
> > >  #include 
> > >  #endif
> > > 
> > > -#if HAVE_LLVM >= 0x0307
> > > -#include 
> > > -#include 
> > > -#else
> > > -#include 
> > > -#include 
> > > -#include 
> > > -#include 
> > > -#endif
> > > -
> > >  #include 
> > >  #include 
> > >  #include 
> > > @@ -63,12 +55,6 @@
> > >  namespace clover {
> > > namespace llvm {
> > >namespace compat {
> > > -#if HAVE_LLVM >= 0x0307
> > > - typedef ::llvm::TargetLibraryInfoImpl target_library_info;
> > > -#else
> > > - typedef ::llvm::TargetLibraryInfo target_library_info;
> > > -#endif
> > > -
> > >  #if HAVE_LLVM >= 0x0500
> > >   const auto lang_as_offset = 0;
> > >   const clang::InputKind ik_opencl = clang::InputKind::OpenCL;
> > > @@ -77,19 +63,6 @@ namespace clover {
> > >   const clang::InputKind ik_opencl = clang::IK_OpenCL;
> > >  #endif
> > > 
> > > - inline void
> > > - set_lang_defaults(clang::CompilerInvocation &inv,
> > > -   clang::LangOptions &lopts, clang::InputKind 
> > > ik,
> > > -   const ::llvm::Triple &t,
> > > -   clang::PreprocessorOptions &ppopts,
> > > -   clang::LangStandard::Kind std) {
> > > -#if HAVE_LLVM >= 0x0309
> > > -inv.setLangDefaults(lopts, ik, t, ppopts, std);
> > > -#else
> > > -inv.setLangDefaults(lopts, ik, std);
> > > -#endif
> > > - }
> > > -
> > >   inline void
> > >   add_link_bitcode_file(clang::CodeGenOptions &opts,
> > > const std::string &path) {
> > > @@ -100,78 +73,8 @@ namespace clover {
> > >  F.PropagateAttrs = true;
> > >  F.LinkFlags = ::llvm::Linker::Flags::None;
> > >  opts.LinkBitcodeFiles.emplace_back(F);
> > > -#elif HAVE_LLVM >= 0x0308
> > > -
> > > opts.LinkBitcodeFiles.emplace_back(::llvm::Linker::Flags::None, path);
> > > -#else
> > > -opts.LinkBitcodeFile = path;
> > > -#endif
> > > - }
> > > -
> > > -#if HAVE_LLVM >= 0x0307
> > > - typedef ::llvm::legacy::PassManager pass_manager;
> > > -#else
> > > - typedef ::llvm::PassManager pass_manager;
> > > -#endif
> > > -
> > > - inline void
> > > - add_data_layout_pass(pass_manager &pm) {
> > > -#if HAVE_LLVM < 0x0307
> > > -pm.add(new 

Re: [Mesa-dev] [PATCH 05/29] intel/isl: Add a helper for inverting swizzles

2018-03-01 Thread Pohjolainen, Topi
On Mon, Feb 26, 2018 at 08:42:42AM -0800, Jason Ekstrand wrote:
> On Mon, Feb 26, 2018 at 6:19 AM, Pohjolainen, Topi <
> topi.pohjolai...@gmail.com> wrote:
> 
> > On Fri, Jan 26, 2018 at 05:59:34PM -0800, Jason Ekstrand wrote:
> > > ---
> > >  src/intel/isl/isl.c | 30 ++
> > >  src/intel/isl/isl.h |  2 ++
> > >  2 files changed, 32 insertions(+)
> > >
> > > diff --git a/src/intel/isl/isl.c b/src/intel/isl/isl.c
> > > index a2d3ae6..420d387 100644
> > > --- a/src/intel/isl/isl.c
> > > +++ b/src/intel/isl/isl.c
> > > @@ -2379,3 +2379,33 @@ isl_swizzle_compose(struct isl_swizzle first,
> > struct isl_swizzle second)
> > >.a = swizzle_select(first.a, second),
> > > };
> > >  }
> > > +
> > > +/**
> > > + * Returns a swizzle that is the pseudo-inverse of this swizzle.
> > > + */
> > > +struct isl_swizzle
> > > +isl_swizzle_invert(struct isl_swizzle swizzle)
> > > +{
> > > +   /* Default to zero for channels which do not show up in the swizzle
> > */
> > > +   enum isl_channel_select chans[4] = {
> > > +  ISL_CHANNEL_SELECT_ZERO,
> > > +  ISL_CHANNEL_SELECT_ZERO,
> > > +  ISL_CHANNEL_SELECT_ZERO,
> > > +  ISL_CHANNEL_SELECT_ZERO,
> > > +   };
> > > +
> > > +   /* We go in ABGR order so that, if there are any duplicates, the
> > first one
> > > +* is taken if you look at it in RGBA order.  This is what Haswell
> > hardware
> > > +* does for render target swizzles.
> > > +*/
> > > +   if ((unsigned)(swizzle.a - ISL_CHANNEL_SELECT_RED) < 4)
> > > +  chans[swizzle.a - ISL_CHANNEL_SELECT_RED] =
> > ISL_CHANNEL_SELECT_ALPHA;
> > > +   if ((unsigned)(swizzle.b - ISL_CHANNEL_SELECT_RED) < 4)
> > > +  chans[swizzle.b - ISL_CHANNEL_SELECT_RED] =
> > ISL_CHANNEL_SELECT_BLUE;
> > > +   if ((unsigned)(swizzle.g - ISL_CHANNEL_SELECT_RED) < 4)
> > > +  chans[swizzle.g - ISL_CHANNEL_SELECT_RED] =
> > ISL_CHANNEL_SELECT_GREEN;
> > > +   if ((unsigned)(swizzle.r - ISL_CHANNEL_SELECT_RED) < 4)
> > > +  chans[swizzle.r - ISL_CHANNEL_SELECT_RED] =
> > ISL_CHANNEL_SELECT_RED;
> > > +
> > > +   return (struct isl_swizzle) { chans[0], chans[1], chans[2], chans[3]
> > };
> >
> > If given
> >
> > swizzle == { ISL_CHANNEL_SELECT_RED,
> >  ISL_CHANNEL_SELECT_GREEN,
> >  ISL_CHANNEL_SELECT_BLUE,
> >  ISL_CHANNEL_SELECT_ALPHA },
> >
> > then
> > chans[ISL_CHANNEL_SELECT_ALPHA - ISL_CHANNEL_SELECT_RED] == chans[3] ==
> > ISL_CHANNEL_SELECT_ALPHA
> >
> > and so on, and the function returns the same swizzle as given?
> 
> 
> Yes, that is how the subtraction works.

I was expecting it to "invert" that, i.e., to return ABGR. But okay, if given
the identity swizzle it returns the identity.

In order to understand how it works, I read further into the series to
find an example - there seems to be one in patch 12 and another in patch 16.
In the case of patch 16 and destination format B4G4R4A4, the swizzle looks to
be BGRA (looking at anv_formats.c::main_formats).

In that case we get:

   chans[ALPHA - RED] = chans[3] = ALPHA
   chans[RED   - RED] = chans[0] = BLUE
   chans[GREEN - RED] = chans[1] = GREEN
   chans[BLUE  - RED] = chans[2] = RED

and as a swizzle BLUE, GREEN, RED, ALPHA. This is again the same as given.
What am I not understanding?
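
For reference, the trace above can be reproduced with a small standalone model of the quoted function. The enum and struct names below are hypothetical stand-ins for the isl types, with the channel values chosen so that RED..ALPHA are four consecutive values as the subtraction in the patch assumes. The sketch suggests that a swizzle which only swaps two channels (such as BGRA) is its own inverse, while a rotation such as GBRA is not.

```c
/* Hypothetical stand-ins for isl_channel_select and isl_swizzle. */
enum chan {
    CH_ZERO = 0, CH_ONE = 1,
    CH_RED = 4, CH_GREEN = 5, CH_BLUE = 6, CH_ALPHA = 7
};

struct swz { enum chan r, g, b, a; };

/* Same logic as the quoted isl_swizzle_invert(): walk A, B, G, R so that
 * for duplicated sources the first channel in RGBA order wins. */
struct swz swz_invert(struct swz s)
{
    /* Channels that never appear in the swizzle default to zero. */
    enum chan chans[4] = { CH_ZERO, CH_ZERO, CH_ZERO, CH_ZERO };

    if ((unsigned)(s.a - CH_RED) < 4)
        chans[s.a - CH_RED] = CH_ALPHA;
    if ((unsigned)(s.b - CH_RED) < 4)
        chans[s.b - CH_RED] = CH_BLUE;
    if ((unsigned)(s.g - CH_RED) < 4)
        chans[s.g - CH_RED] = CH_GREEN;
    if ((unsigned)(s.r - CH_RED) < 4)
        chans[s.r - CH_RED] = CH_RED;

    return (struct swz){ chans[0], chans[1], chans[2], chans[3] };
}

/* Field-wise comparison of two swizzles. */
int swz_equal(struct swz x, struct swz y)
{
    return x.r == y.r && x.g == y.g && x.b == y.b && x.a == y.a;
}
```

Under this model, inverting BGRA gives BGRA again, whereas inverting GBRA gives BRGA, so the function does more than return its input; the two examples in the series just happen to be self-inverse.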


Re: [Mesa-dev] [PATCH v3 4/6] nouveau: Add framebuffer modifier support

2018-03-01 Thread Ilia Mirkin
On Thu, Mar 1, 2018 at 8:54 AM, Thierry Reding  wrote:
> From: Thierry Reding 
>
> This adds support for framebuffer modifiers to Nouveau. This will be
> used by the Tegra driver to share metadata about the format of buffers
> (such as the tiling mode or compression).
>
> Changes in v2:
> - remove unused parameters to nouveau_buffer_create()
> - move format modifier query code to nvc0 backend
> - restrict format modifiers to 2D textures
> - implement ->query_dmabuf_modifiers()
>
> Acked-by: Emil Velikov 
> Tested-by: Andre Heider 
> Signed-off-by: Thierry Reding 
> ---
>  src/gallium/drivers/nouveau/Android.mk   |  3 +
>  src/gallium/drivers/nouveau/Makefile.am  |  1 +
>  src/gallium/drivers/nouveau/nouveau_screen.c |  4 ++
>  src/gallium/drivers/nouveau/nv30/nv30_resource.c |  2 +
>  src/gallium/drivers/nouveau/nvc0/nvc0_miptree.c  | 81 
> +++-
>  src/gallium/drivers/nouveau/nvc0/nvc0_resource.c | 59 -
>  src/gallium/drivers/nouveau/nvc0/nvc0_resource.h |  3 +-
>  7 files changed, 149 insertions(+), 4 deletions(-)
>
> diff --git a/src/gallium/drivers/nouveau/Android.mk 
> b/src/gallium/drivers/nouveau/Android.mk
> index 2de22e73ec18..a446774a86e8 100644
> --- a/src/gallium/drivers/nouveau/Android.mk
> +++ b/src/gallium/drivers/nouveau/Android.mk
> @@ -36,6 +36,9 @@ LOCAL_SRC_FILES := \
> $(NVC0_CODEGEN_SOURCES) \
> $(NVC0_C_SOURCES)
>
> +LOCAL_C_INCLUDES := \
> +   $(MESA_TOP)/include/drm-uapi
> +
>  LOCAL_SHARED_LIBRARIES := libdrm_nouveau
>  LOCAL_MODULE := libmesa_pipe_nouveau
>
> diff --git a/src/gallium/drivers/nouveau/Makefile.am 
> b/src/gallium/drivers/nouveau/Makefile.am
> index 91547178e397..f6126b544811 100644
> --- a/src/gallium/drivers/nouveau/Makefile.am
> +++ b/src/gallium/drivers/nouveau/Makefile.am
> @@ -24,6 +24,7 @@ include Makefile.sources
>  include $(top_srcdir)/src/gallium/Automake.inc
>
>  AM_CPPFLAGS = \
> +   -I$(top_srcdir)/include/drm-uapi \
> $(GALLIUM_DRIVER_CFLAGS) \
> $(LIBDRM_CFLAGS) \
> $(NOUVEAU_CFLAGS)

Someone is likely to complain about forgetting about the N+1 build
system, meson.

> diff --git a/src/gallium/drivers/nouveau/nouveau_screen.c 
> b/src/gallium/drivers/nouveau/nouveau_screen.c
> index c144b39b2dd2..b84ef13ebe7f 100644
> --- a/src/gallium/drivers/nouveau/nouveau_screen.c
> +++ b/src/gallium/drivers/nouveau/nouveau_screen.c
> @@ -1,3 +1,5 @@
> +#include 
> +
>  #include "pipe/p_defines.h"
>  #include "pipe/p_screen.h"
>  #include "pipe/p_state.h"
> @@ -23,6 +25,8 @@
>  #include "nouveau_mm.h"
>  #include "nouveau_buffer.h"
>
> +#include "nvc0/nvc0_resource.h"

Can't have that... why do you need it here?

> +
>  /* XXX this should go away */
>  #include "state_tracker/drm_driver.h"
>
> diff --git a/src/gallium/drivers/nouveau/nv30/nv30_resource.c 
> b/src/gallium/drivers/nouveau/nv30/nv30_resource.c
> index ff34f6e5f9fa..386bd3459bd3 100644
> --- a/src/gallium/drivers/nouveau/nv30/nv30_resource.c
> +++ b/src/gallium/drivers/nouveau/nv30/nv30_resource.c
> @@ -23,6 +23,8 @@
>   *
>   */
>
> +#include 
> +
>  #include "util/u_format.h"
>  #include "util/u_inlines.h"
>
> diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_miptree.c 
> b/src/gallium/drivers/nouveau/nvc0/nvc0_miptree.c
> index 27674f72a7c0..7983c4030876 100644
> --- a/src/gallium/drivers/nouveau/nvc0/nvc0_miptree.c
> +++ b/src/gallium/drivers/nouveau/nvc0/nvc0_miptree.c
> @@ -20,8 +20,11 @@
>   * OTHER DEALINGS IN THE SOFTWARE.
>   */
>
> +#include 
> +
>  #include "pipe/p_state.h"
>  #include "pipe/p_defines.h"
> +#include "state_tracker/drm_driver.h"
>  #include "util/u_inlines.h"
>  #include "util/u_format.h"
>
> @@ -233,9 +236,79 @@ nvc0_miptree_init_layout_tiled(struct nv50_miptree *mt)
> }
>  }
>
> +static uint64_t nvc0_miptree_get_modifier(struct nv50_miptree *mt)
> +{
> +   union nouveau_bo_config *config = &mt->base.bo->config;
> +   uint64_t modifier;
> +
> +   if (mt->layout_3d)
> +  return DRM_FORMAT_MOD_INVALID;
> +
> +   switch (config->nvc0.memtype) {
> +   case 0x00:
> +  modifier = DRM_FORMAT_MOD_LINEAR;
> +  break;
> +
> +   case 0xfe:
> +  switch (NVC0_TILE_MODE_Y(config->nvc0.tile_mode)) {
> +  case 0:
> + modifier = DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK_ONE_GOB;
> + break;
> +
> +  case 1:
> + modifier = DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK_TWO_GOB;
> + break;
> +
> +  case 2:
> + modifier = DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK_FOUR_GOB;
> + break;
> +
> +  case 3:
> + modifier = DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK_EIGHT_GOB;
> + break;
> +
> +  case 4:
> + modifier = DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK_SIXTEEN_GOB;
> + break;
> +
> +  case 5:
> + modifier = DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK_THIRTYTWO_GOB;
> + break;
> +
> +  default:
> + 

Re: [Mesa-dev] [PATCH] loader: Add support for platform and host1x busses

2018-03-01 Thread Eric Engestrom


On March 1, 2018 1:31:53 PM UTC, Thierry Reding  
wrote:
> From: Thierry Reding 
> 
> ARM SoCs usually have their DRM/KMS devices on the platform bus, so
> add
> support for this bus in order to allow use of the DRI_PRIME
> environment
> variable with those devices.
> 
> While at it, also support the host1x bus, which is effectively the
> same
> but uses an additional layer in the bus hierarchy.
> 
> Note that it isn't enough to support the bus that has the rendering
> GPU
> because the loader code will also try to construct an ID path tag for
> a
> scanout-only device if it is the default that is being opened.
> 
> The ID path tag for a device can be obtained by running udevadm info
> on
> the device node:
> 
>   $ udevadm info /dev/dri/card0
> 
> and looking up the ID_PATH_TAG entry in the output.
> 
> Signed-off-by: Thierry Reding 
> ---
>  src/loader/loader.c | 27 +++
>  1 file changed, 27 insertions(+)
> 
> diff --git a/src/loader/loader.c b/src/loader/loader.c
> index 92b4c5204b19..ca578b8cd232 100644
> --- a/src/loader/loader.c
> +++ b/src/loader/loader.c
> @@ -120,6 +120,33 @@ static char
> *drm_construct_id_path_tag(drmDevicePtr device)
> device->businfo.pci->func) < 0) {
>   return NULL;
>}
> +   } else if (device->bustype == DRM_BUS_PLATFORM ||
> +  device->bustype == DRM_BUS_HOST1X) {
> +  char *fullname, *name, *address;
> +
> +  if (device->bustype == DRM_BUS_PLATFORM)
> + fullname = device->businfo.platform->fullname;
> +  else
> + fullname = device->businfo.host1x->fullname;
> +
> +  name = strrchr(fullname, '/');
> +  if (!name)
> + name = strdup(fullname);
> +  else
> + name = strdup(++name);

Looks like UB to me; how about this instead?

  name = strdup(name + 1);

With that:
Reviewed-by: Eric Engestrom 

> +
> +  address = strchr(name, '@');
> +  if (address) {
> + *address++ = '\0';
> +
> + if (asprintf(&tag, "platform-%s_%s", address, name) < 0)
> +tag = NULL;
> +  } else {
> + if (asprintf(&tag, "platform-%s", name) < 0)
> +tag = NULL;
> +  }
> +
> +  free(name);
> }
> return tag;
>  }
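
The tag construction in the quoted hunk can be sketched as a standalone helper. This is an illustration only: build_platform_tag() is a hypothetical name, it writes into a caller-supplied buffer with snprintf() instead of using asprintf() and strdup(), and the device path used in the note below is made up. It also avoids modifying the pointer in place, which sidesteps the ++name question.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Build "platform-<address>_<name>" (or "platform-<name>" when the node has
 * no unit address) from a full device-tree style path.
 * Returns 0 on success, -1 if the result does not fit in 'out'. */
int build_platform_tag(const char *fullname, char *out, size_t outlen)
{
    /* Take the last path component; use the whole string if there is no '/'. */
    const char *slash = strrchr(fullname, '/');
    const char *name = slash ? slash + 1 : fullname;

    /* Split "<name>@<address>" at the '@', if present. */
    const char *at = strchr(name, '@');
    int n;

    if (at)
        n = snprintf(out, outlen, "platform-%s_%.*s",
                     at + 1, (int)(at - name), name);
    else
        n = snprintf(out, outlen, "platform-%s", name);

    return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

For a hypothetical node "/soc/gpu@57000000" this yields "platform-57000000_gpu", which is the same shape the patch produces for ID_PATH_TAG matching.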


[Mesa-dev] [PATCH v3 6/6] autotools: Add tegra to AM_DISTCHECK_CONFIGURE_FLAGS

2018-03-01 Thread Thierry Reding
From: Thierry Reding 

This allows the driver to be built on a make distcheck and makes sure
that it properly builds when a distribution tarball is made.

Suggested-by: Emil Velikov 
Signed-off-by: Thierry Reding 
---
 Makefile.am | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Makefile.am b/Makefile.am
index 5c3a6717d34e..de6921bf1fcb 100644
--- a/Makefile.am
+++ b/Makefile.am
@@ -45,7 +45,7 @@ AM_DISTCHECK_CONFIGURE_FLAGS = \
--enable-libunwind \
--with-platforms=x11,wayland,drm,surfaceless \
--with-dri-drivers=i915,i965,nouveau,radeon,r200,swrast \
-   
--with-gallium-drivers=i915,nouveau,r300,pl111,r600,radeonsi,freedreno,svga,swrast,vc4,virgl,swr,etnaviv,imx
 \
+   
--with-gallium-drivers=i915,nouveau,r300,pl111,r600,radeonsi,freedreno,svga,swrast,vc4,tegra,virgl,swr,etnaviv,imx
 \
--with-vulkan-drivers=intel,radeon
 
 ACLOCAL_AMFLAGS = -I m4
-- 
2.16.2



[Mesa-dev] [PATCH v3 4/6] nouveau: Add framebuffer modifier support

2018-03-01 Thread Thierry Reding
From: Thierry Reding 

This adds support for framebuffer modifiers to Nouveau. This will be
used by the Tegra driver to share metadata about the format of buffers
(such as the tiling mode or compression).

Changes in v2:
- remove unused parameters to nouveau_buffer_create()
- move format modifier query code to nvc0 backend
- restrict format modifiers to 2D textures
- implement ->query_dmabuf_modifiers()

Acked-by: Emil Velikov 
Tested-by: Andre Heider 
Signed-off-by: Thierry Reding 
---
 src/gallium/drivers/nouveau/Android.mk   |  3 +
 src/gallium/drivers/nouveau/Makefile.am  |  1 +
 src/gallium/drivers/nouveau/nouveau_screen.c |  4 ++
 src/gallium/drivers/nouveau/nv30/nv30_resource.c |  2 +
 src/gallium/drivers/nouveau/nvc0/nvc0_miptree.c  | 81 +++-
 src/gallium/drivers/nouveau/nvc0/nvc0_resource.c | 59 -
 src/gallium/drivers/nouveau/nvc0/nvc0_resource.h |  3 +-
 7 files changed, 149 insertions(+), 4 deletions(-)

diff --git a/src/gallium/drivers/nouveau/Android.mk 
b/src/gallium/drivers/nouveau/Android.mk
index 2de22e73ec18..a446774a86e8 100644
--- a/src/gallium/drivers/nouveau/Android.mk
+++ b/src/gallium/drivers/nouveau/Android.mk
@@ -36,6 +36,9 @@ LOCAL_SRC_FILES := \
$(NVC0_CODEGEN_SOURCES) \
$(NVC0_C_SOURCES)
 
+LOCAL_C_INCLUDES := \
+   $(MESA_TOP)/include/drm-uapi
+
 LOCAL_SHARED_LIBRARIES := libdrm_nouveau
 LOCAL_MODULE := libmesa_pipe_nouveau
 
diff --git a/src/gallium/drivers/nouveau/Makefile.am 
b/src/gallium/drivers/nouveau/Makefile.am
index 91547178e397..f6126b544811 100644
--- a/src/gallium/drivers/nouveau/Makefile.am
+++ b/src/gallium/drivers/nouveau/Makefile.am
@@ -24,6 +24,7 @@ include Makefile.sources
 include $(top_srcdir)/src/gallium/Automake.inc
 
 AM_CPPFLAGS = \
+   -I$(top_srcdir)/include/drm-uapi \
$(GALLIUM_DRIVER_CFLAGS) \
$(LIBDRM_CFLAGS) \
$(NOUVEAU_CFLAGS)
diff --git a/src/gallium/drivers/nouveau/nouveau_screen.c 
b/src/gallium/drivers/nouveau/nouveau_screen.c
index c144b39b2dd2..b84ef13ebe7f 100644
--- a/src/gallium/drivers/nouveau/nouveau_screen.c
+++ b/src/gallium/drivers/nouveau/nouveau_screen.c
@@ -1,3 +1,5 @@
+#include 
+
 #include "pipe/p_defines.h"
 #include "pipe/p_screen.h"
 #include "pipe/p_state.h"
@@ -23,6 +25,8 @@
 #include "nouveau_mm.h"
 #include "nouveau_buffer.h"
 
+#include "nvc0/nvc0_resource.h"
+
 /* XXX this should go away */
 #include "state_tracker/drm_driver.h"
 
diff --git a/src/gallium/drivers/nouveau/nv30/nv30_resource.c 
b/src/gallium/drivers/nouveau/nv30/nv30_resource.c
index ff34f6e5f9fa..386bd3459bd3 100644
--- a/src/gallium/drivers/nouveau/nv30/nv30_resource.c
+++ b/src/gallium/drivers/nouveau/nv30/nv30_resource.c
@@ -23,6 +23,8 @@
  *
  */
 
+#include 
+
 #include "util/u_format.h"
 #include "util/u_inlines.h"
 
diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_miptree.c 
b/src/gallium/drivers/nouveau/nvc0/nvc0_miptree.c
index 27674f72a7c0..7983c4030876 100644
--- a/src/gallium/drivers/nouveau/nvc0/nvc0_miptree.c
+++ b/src/gallium/drivers/nouveau/nvc0/nvc0_miptree.c
@@ -20,8 +20,11 @@
  * OTHER DEALINGS IN THE SOFTWARE.
  */
 
+#include 
+
 #include "pipe/p_state.h"
 #include "pipe/p_defines.h"
+#include "state_tracker/drm_driver.h"
 #include "util/u_inlines.h"
 #include "util/u_format.h"
 
@@ -233,9 +236,79 @@ nvc0_miptree_init_layout_tiled(struct nv50_miptree *mt)
}
 }
 
+static uint64_t nvc0_miptree_get_modifier(struct nv50_miptree *mt)
+{
+   union nouveau_bo_config *config = &mt->base.bo->config;
+   uint64_t modifier;
+
+   if (mt->layout_3d)
+  return DRM_FORMAT_MOD_INVALID;
+
+   switch (config->nvc0.memtype) {
+   case 0x00:
+  modifier = DRM_FORMAT_MOD_LINEAR;
+  break;
+
+   case 0xfe:
+  switch (NVC0_TILE_MODE_Y(config->nvc0.tile_mode)) {
+  case 0:
+ modifier = DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK_ONE_GOB;
+ break;
+
+  case 1:
+ modifier = DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK_TWO_GOB;
+ break;
+
+  case 2:
+ modifier = DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK_FOUR_GOB;
+ break;
+
+  case 3:
+ modifier = DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK_EIGHT_GOB;
+ break;
+
+  case 4:
+ modifier = DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK_SIXTEEN_GOB;
+ break;
+
+  case 5:
+ modifier = DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK_THIRTYTWO_GOB;
+ break;
+
+  default:
+ modifier = DRM_FORMAT_MOD_INVALID;
+ break;
+  }
+  break;
+
+   default:
+  modifier = DRM_FORMAT_MOD_INVALID;
+  break;
+   }
+
+   return modifier;
+}
+
+static boolean
+nvc0_miptree_get_handle(struct pipe_screen *pscreen,
+struct pipe_resource *pt,
+struct winsys_handle *whandle)
+{
+   struct nv50_miptree *mt = nv50_miptree(pt);
+   boolean ret;
+
+   ret = nv50_miptree_get_handle(pscreen, pt, 

[Mesa-dev] [PATCH v3 3/6] nouveau/nvc0: Extract common tile mode macro

2018-03-01 Thread Thierry Reding
From: Thierry Reding 

Add a new macro that can be used to extract the tiling mode from a
tile_mode value. This will be used to determine the number of GOBs
used in block linear mode.

Acked-by: Emil Velikov 
Tested-by: Andre Heider 
Signed-off-by: Thierry Reding 
---
 src/gallium/drivers/nouveau/nvc0/nvc0_resource.h | 15 +--
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_resource.h 
b/src/gallium/drivers/nouveau/nvc0/nvc0_resource.h
index 0d5f026d6e1c..c68a50948360 100644
--- a/src/gallium/drivers/nouveau/nvc0/nvc0_resource.h
+++ b/src/gallium/drivers/nouveau/nvc0/nvc0_resource.h
@@ -6,14 +6,17 @@
 
 #define NVC0_RESOURCE_FLAG_VIDEO (NOUVEAU_RESOURCE_FLAG_DRV_PRIV << 0)
 
+#define NVC0_TILE_MODE_X(m) (((m) >> 0) & 0xf)
+#define NVC0_TILE_MODE_Y(m) (((m) >> 4) & 0xf)
+#define NVC0_TILE_MODE_Z(m) (((m) >> 8) & 0xf)
 
-#define NVC0_TILE_SHIFT_X(m) ((((m) >> 0) & 0xf) + 6)
-#define NVC0_TILE_SHIFT_Y(m) ((((m) >> 4) & 0xf) + 3)
-#define NVC0_TILE_SHIFT_Z(m) ((((m) >> 8) & 0xf) + 0)
+#define NVC0_TILE_SHIFT_X(m) (NVC0_TILE_MODE_X(m) + 6)
+#define NVC0_TILE_SHIFT_Y(m) (NVC0_TILE_MODE_Y(m) + 3)
+#define NVC0_TILE_SHIFT_Z(m) (NVC0_TILE_MODE_Z(m) + 0)
 
-#define NVC0_TILE_SIZE_X(m) (64 << (((m) >> 0) & 0xf))
-#define NVC0_TILE_SIZE_Y(m) ( 8 << (((m) >> 4) & 0xf))
-#define NVC0_TILE_SIZE_Z(m) ( 1 << (((m) >> 8) & 0xf))
+#define NVC0_TILE_SIZE_X(m) (64 << NVC0_TILE_MODE_X(m))
+#define NVC0_TILE_SIZE_Y(m) ( 8 << NVC0_TILE_MODE_Y(m))
+#define NVC0_TILE_SIZE_Z(m) ( 1 << NVC0_TILE_MODE_Z(m))
 
 /* it's ok to mask only in the end because max value is 3 * 5 */
 
-- 
2.16.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
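
As a rough illustration of how the new mode macros might be used, the sketch
below derives the number of vertically stacked GOBs from a tile_mode value.
The macros are the ones from the patch; the helper name and the assumption
that the Y field holds the log2 of the GOB count are mine and are not part of
the series.

```c
#include <assert.h>

/* Macros as introduced by the patch. */
#define NVC0_TILE_MODE_X(m) (((m) >> 0) & 0xf)
#define NVC0_TILE_MODE_Y(m) (((m) >> 4) & 0xf)
#define NVC0_TILE_MODE_Z(m) (((m) >> 8) & 0xf)

#define NVC0_TILE_SIZE_X(m) (64 << NVC0_TILE_MODE_X(m))
#define NVC0_TILE_SIZE_Y(m) ( 8 << NVC0_TILE_MODE_Y(m))
#define NVC0_TILE_SIZE_Z(m) ( 1 << NVC0_TILE_MODE_Z(m))

/* Hypothetical helper (not in the patch): GOBs stacked vertically per
 * block, assuming the Y field is the log2 of that count. */
static unsigned
example_gobs_per_block_y(unsigned tile_mode)
{
   unsigned log2_gobs = NVC0_TILE_MODE_Y(tile_mode);

   assert(log2_gobs <= 5); /* block linear allows 1 to 32 GOBs */
   return 1u << log2_gobs;
}
```

With a tile_mode of 0x040 (X = 0, Y = 4, Z = 0) the helper reports 16 GOBs,
and NVC0_TILE_SIZE_Y() gives a block height of 128 rows.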


[Mesa-dev] [PATCH v3 5/6] tegra: Initial support

2018-03-01 Thread Thierry Reding
Tegra K1 and later use a GPU that can be driven by the Nouveau driver.
But the GPU is a pure render node and has no display engine, hence the
scanout needs to happen on the Tegra display hardware. The GPU and the
display engine each have a separate DRM device node exposed by the
kernel.

To make the setup appear as a single device, this driver instantiates
a Nouveau screen with each instance of a Tegra screen and forwards GPU
requests to the Nouveau screen. For purposes of scanout it will import
buffers created on the GPU into the display driver. Handles that
userspace requests are those of the display driver so that they can be
used to create framebuffers.

This has been tested with some GBM test programs, as well as kmscube and
weston. All of those run without modifications, but I'm sure there is a
lot that can be improved.

Some fixes contributed by Hector Martin.

Changes in v2:
- duplicate file descriptor in winsys to avoid potential issues
- require nouveau when building the tegra driver
- check for nouveau driver name on render node
- remove unneeded dependency on libdrm_tegra
- remove zombie references to libudev
- add missing headers to C_SOURCES variable
- drop unneeded tegra/ prefix for includes
- open device files with O_CLOEXEC
- update copyrights

Changes in v3:
- properly unwrap resources in ->resource_copy_region()
- support vertex buffers passed by user pointer
- allocate custom stream and const uploader
- silence error message on pre-Tegra124
- support X without explicit PRIME

Reviewed-by: Emil Velikov 
Acked-by: Emil Velikov 
Tested-by: Andre Heider 
Signed-off-by: Thierry Reding 
---
 configure.ac   |   12 +-
 include/drm-uapi/tegra_drm.h   |  225 
 meson.build|7 +-
 src/gallium/Makefile.am|5 +
 .../auxiliary/pipe-loader/pipe_loader_drm.c|7 +-
 src/gallium/auxiliary/target-helpers/drm_helper.h  |   23 +
 .../auxiliary/target-helpers/drm_helper_public.h   |3 +
 src/gallium/drivers/tegra/Automake.inc |   11 +
 src/gallium/drivers/tegra/Makefile.am  |   11 +
 src/gallium/drivers/tegra/Makefile.sources |6 +
 src/gallium/drivers/tegra/meson.build  |   41 +
 src/gallium/drivers/tegra/tegra_context.c  | 1325 
 src/gallium/drivers/tegra/tegra_context.h  |   81 ++
 src/gallium/drivers/tegra/tegra_resource.h |   76 ++
 src/gallium/drivers/tegra/tegra_screen.c   |  688 ++
 src/gallium/drivers/tegra/tegra_screen.h   |   45 +
 src/gallium/meson.build|6 +
 src/gallium/targets/dri/Makefile.am|2 +
 src/gallium/targets/dri/meson.build|4 +-
 src/gallium/targets/dri/target.c   |4 +
 src/gallium/targets/vdpau/Makefile.am  |2 +
 src/gallium/winsys/tegra/drm/Makefile.am   |   10 +
 src/gallium/winsys/tegra/drm/Makefile.sources  |2 +
 src/gallium/winsys/tegra/drm/meson.build   |   33 +
 src/gallium/winsys/tegra/drm/tegra_drm_public.h|   31 +
 src/gallium/winsys/tegra/drm/tegra_drm_winsys.c|   49 +
 26 files changed, 2705 insertions(+), 4 deletions(-)
 create mode 100644 include/drm-uapi/tegra_drm.h
 create mode 100644 src/gallium/drivers/tegra/Automake.inc
 create mode 100644 src/gallium/drivers/tegra/Makefile.am
 create mode 100644 src/gallium/drivers/tegra/Makefile.sources
 create mode 100644 src/gallium/drivers/tegra/meson.build
 create mode 100644 src/gallium/drivers/tegra/tegra_context.c
 create mode 100644 src/gallium/drivers/tegra/tegra_context.h
 create mode 100644 src/gallium/drivers/tegra/tegra_resource.h
 create mode 100644 src/gallium/drivers/tegra/tegra_screen.c
 create mode 100644 src/gallium/drivers/tegra/tegra_screen.h
 create mode 100644 src/gallium/winsys/tegra/drm/Makefile.am
 create mode 100644 src/gallium/winsys/tegra/drm/Makefile.sources
 create mode 100644 src/gallium/winsys/tegra/drm/meson.build
 create mode 100644 src/gallium/winsys/tegra/drm/tegra_drm_public.h
 create mode 100644 src/gallium/winsys/tegra/drm/tegra_drm_winsys.c

diff --git a/configure.ac b/configure.ac
index d8093597dd04..27528181b73e 100644
--- a/configure.ac
+++ b/configure.ac
@@ -1351,7 +1351,7 @@ GALLIUM_DRIVERS_DEFAULT="r300,r600,svga,swrast"
 AC_ARG_WITH([gallium-drivers],
 [AS_HELP_STRING([--with-gallium-drivers@<:@=DIRS...@:>@],
 [comma delimited Gallium drivers list, e.g.
-
"i915,nouveau,r300,r600,radeonsi,freedreno,pl111,svga,swrast,swr,vc4,vc5,virgl,etnaviv,imx"
+
"i915,nouveau,r300,r600,radeonsi,freedreno,pl111,svga,swrast,swr,tegra,vc4,vc5,virgl,etnaviv,imx"
 @<:@default=r300,r600,svga,swrast@:>@])],
 [with_gallium_drivers="$withval"],
 

[Mesa-dev] [PATCH v3 0/6] NVIDIA Tegra support

2018-03-01 Thread Thierry Reding
From: Thierry Reding 

This series of patches implements initial support for Tegra. The first
two patches import DRM UAPI from v4.16-rc1 that provides framebuffer
modifiers that can be used to specify buffers shared between Nouveau
and the Tegra DRM driver.

Patches 3 and 4 add support for framebuffer modifiers to Nouveau and
patch 5 builds on top of those to provide initial Tegra support in Mesa.
The current patches allow running common use-cases such as Wayland,
kmscube, etc.

Patch 6 adds the Tegra driver to the list of gallium drivers built
during a `make distcheck'.

Some people have been using earlier versions of these patches to run a
completely open-source graphics stack on various Tegra210 devices. I've
Cc'ed some of them so that they can provide feedback.

This series is also available in a git repository here:

https://cgit.freedesktop.org/~tagr/mesa #tegra-v3

Thierry

Thierry Reding (6):
  drm/fourcc: Fix fourcc_mod_code() definition
  drm/tegra: Sanitize format modifiers
  nouveau/nvc0: Extract common tile mode macro
  nouveau: Add framebuffer modifier support
  tegra: Initial support
  autotools: Add tegra to AM_DISTCHECK_CONFIGURE_FLAGS

 Makefile.am|2 +-
 configure.ac   |   12 +-
 include/drm-uapi/drm_fourcc.h  |   38 +-
 include/drm-uapi/tegra_drm.h   |  225 
 meson.build|7 +-
 src/gallium/Makefile.am|5 +
 .../auxiliary/pipe-loader/pipe_loader_drm.c|7 +-
 src/gallium/auxiliary/target-helpers/drm_helper.h  |   23 +
 .../auxiliary/target-helpers/drm_helper_public.h   |3 +
 src/gallium/drivers/nouveau/Android.mk |3 +
 src/gallium/drivers/nouveau/Makefile.am|1 +
 src/gallium/drivers/nouveau/nouveau_screen.c   |4 +
 src/gallium/drivers/nouveau/nv30/nv30_resource.c   |2 +
 src/gallium/drivers/nouveau/nvc0/nvc0_miptree.c|   81 +-
 src/gallium/drivers/nouveau/nvc0/nvc0_resource.c   |   59 +-
 src/gallium/drivers/nouveau/nvc0/nvc0_resource.h   |   18 +-
 src/gallium/drivers/tegra/Automake.inc |   11 +
 src/gallium/drivers/tegra/Makefile.am  |   11 +
 src/gallium/drivers/tegra/Makefile.sources |6 +
 src/gallium/drivers/tegra/meson.build  |   41 +
 src/gallium/drivers/tegra/tegra_context.c  | 1325 
 src/gallium/drivers/tegra/tegra_context.h  |   81 ++
 src/gallium/drivers/tegra/tegra_resource.h |   76 ++
 src/gallium/drivers/tegra/tegra_screen.c   |  688 ++
 src/gallium/drivers/tegra/tegra_screen.h   |   45 +
 src/gallium/meson.build|6 +
 src/gallium/targets/dri/Makefile.am|2 +
 src/gallium/targets/dri/meson.build|4 +-
 src/gallium/targets/dri/target.c   |4 +
 src/gallium/targets/vdpau/Makefile.am  |2 +
 src/gallium/winsys/tegra/drm/Makefile.am   |   10 +
 src/gallium/winsys/tegra/drm/Makefile.sources  |2 +
 src/gallium/winsys/tegra/drm/meson.build   |   33 +
 src/gallium/winsys/tegra/drm/tegra_drm_public.h|   31 +
 src/gallium/winsys/tegra/drm/tegra_drm_winsys.c|   49 +
 35 files changed, 2884 insertions(+), 33 deletions(-)
 create mode 100644 include/drm-uapi/tegra_drm.h
 create mode 100644 src/gallium/drivers/tegra/Automake.inc
 create mode 100644 src/gallium/drivers/tegra/Makefile.am
 create mode 100644 src/gallium/drivers/tegra/Makefile.sources
 create mode 100644 src/gallium/drivers/tegra/meson.build
 create mode 100644 src/gallium/drivers/tegra/tegra_context.c
 create mode 100644 src/gallium/drivers/tegra/tegra_context.h
 create mode 100644 src/gallium/drivers/tegra/tegra_resource.h
 create mode 100644 src/gallium/drivers/tegra/tegra_screen.c
 create mode 100644 src/gallium/drivers/tegra/tegra_screen.h
 create mode 100644 src/gallium/winsys/tegra/drm/Makefile.am
 create mode 100644 src/gallium/winsys/tegra/drm/Makefile.sources
 create mode 100644 src/gallium/winsys/tegra/drm/meson.build
 create mode 100644 src/gallium/winsys/tegra/drm/tegra_drm_public.h
 create mode 100644 src/gallium/winsys/tegra/drm/tegra_drm_winsys.c

-- 
2.16.2



[Mesa-dev] [PATCH v3 2/6] drm/tegra: Sanitize format modifiers

2018-03-01 Thread Thierry Reding
From: Thierry Reding 

The existing format modifier definitions were merged prematurely, and
recent work has unveiled that the definitions are suboptimal in several
ways:

  - The format specifiers, except for one, are not Tegra specific, but
the names don't reflect that.
  - The number space is split into two, reserving 32 bits for some
"parameter" which most of the modifiers are not going to have.
  - Symbolic names for the modifiers are not using the standard
DRM_FORMAT_MOD_* prefix, which makes them awkward to use.
  - The vendor prefix NV is somewhat ambiguous.

Fortunately, nobody's started using these modifiers, so we can still fix
the above issues. Do so by using the standard prefix. Also, remove TEGRA
from the name of those modifiers that exist on NVIDIA GPUs as well. In
case of the block linear modifiers, make the "parameter" smaller (4
bits, though only 6 values are valid) and don't let that leak into any
of the other modifiers.

Finally, also use the more canonical NVIDIA instead of the ambiguous NV
prefix.

This is based on commit 268892cb63a822315921a8dab48ac3e4abf7dd03 from
Linux v4.16-rc1.

Acked-by: Emil Velikov 
Tested-by: Andre Heider 
Signed-off-by: Thierry Reding 
---
 include/drm-uapi/drm_fourcc.h | 36 +++-
 1 file changed, 19 insertions(+), 17 deletions(-)

diff --git a/include/drm-uapi/drm_fourcc.h b/include/drm-uapi/drm_fourcc.h
index a76ed8f9e383..e04613d30a13 100644
--- a/include/drm-uapi/drm_fourcc.h
+++ b/include/drm-uapi/drm_fourcc.h
@@ -178,7 +178,7 @@ extern "C" {
 #define DRM_FORMAT_MOD_VENDOR_NONE0
 #define DRM_FORMAT_MOD_VENDOR_INTEL   0x01
 #define DRM_FORMAT_MOD_VENDOR_AMD 0x02
-#define DRM_FORMAT_MOD_VENDOR_NV  0x03
+#define DRM_FORMAT_MOD_VENDOR_NVIDIA  0x03
 #define DRM_FORMAT_MOD_VENDOR_SAMSUNG 0x04
 #define DRM_FORMAT_MOD_VENDOR_QCOM0x05
 #define DRM_FORMAT_MOD_VENDOR_VIVANTE 0x06
@@ -338,29 +338,17 @@ extern "C" {
  */
 #define DRM_FORMAT_MOD_VIVANTE_SPLIT_SUPER_TILED fourcc_mod_code(VIVANTE, 4)
 
-/* NVIDIA Tegra frame buffer modifiers */
-
-/*
- * Some modifiers take parameters, for example the number of vertical GOBs in
- * a block. Reserve the lower 32 bits for parameters
- */
-#define __fourcc_mod_tegra_mode_shift 32
-#define fourcc_mod_tegra_code(val, params) \
-   fourcc_mod_code(NV, ((((__u64)val) << __fourcc_mod_tegra_mode_shift) | params))
-#define fourcc_mod_tegra_mod(m) \
-   (m & ~((1ULL << __fourcc_mod_tegra_mode_shift) - 1))
-#define fourcc_mod_tegra_param(m) \
-   (m & ((1ULL << __fourcc_mod_tegra_mode_shift) - 1))
+/* NVIDIA frame buffer modifiers */
 
 /*
  * Tegra Tiled Layout, used by Tegra 2, 3 and 4.
  *
  * Pixels are arranged in simple tiles of 16 x 16 bytes.
  */
-#define NV_FORMAT_MOD_TEGRA_TILED fourcc_mod_tegra_code(1, 0)
+#define DRM_FORMAT_MOD_NVIDIA_TEGRA_TILED fourcc_mod_code(NVIDIA, 1)
 
 /*
- * Tegra 16Bx2 Block Linear layout, used by TK1/TX1
+ * 16Bx2 Block Linear layout, used by desktop GPUs, and Tegra K1 and later
  *
  * Pixels are arranged in 64x8 Groups Of Bytes (GOBs). GOBs are then stacked
  * vertically by a power of 2 (1 to 32 GOBs) to form a block.
@@ -380,7 +368,21 @@ extern "C" {
  * Chapter 20 "Pixel Memory Formats" of the Tegra X1 TRM describes this format
  * in full detail.
  */
-#define NV_FORMAT_MOD_TEGRA_16BX2_BLOCK(v) fourcc_mod_tegra_code(2, v)
+#define DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK(v) \
+   fourcc_mod_code(NVIDIA, 0x10 | ((v) & 0xf))
+
+#define DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK_ONE_GOB \
+   fourcc_mod_code(NVIDIA, 0x10)
+#define DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK_TWO_GOB \
+   fourcc_mod_code(NVIDIA, 0x11)
+#define DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK_FOUR_GOB \
+   fourcc_mod_code(NVIDIA, 0x12)
+#define DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK_EIGHT_GOB \
+   fourcc_mod_code(NVIDIA, 0x13)
+#define DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK_SIXTEEN_GOB \
+   fourcc_mod_code(NVIDIA, 0x14)
+#define DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK_THIRTYTWO_GOB \
+   fourcc_mod_code(NVIDIA, 0x15)
 
 /*
  * Broadcom VC4 "T" format
-- 
2.16.2
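
To make the new encoding concrete, here is a minimal sketch of how a block
linear modifier could be built and decoded. The macro definitions mirror the
patched header (with uint64_t standing in for __u64); the two example_*
helpers are hypothetical and only illustrate the 4-bit parameter, of which
values 0 to 5 are valid.

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors the patched drm_fourcc.h; uint64_t stands in for __u64. */
#define DRM_FORMAT_MOD_VENDOR_NVIDIA 0x03
#define fourcc_mod_code(vendor, val) \
   ((((uint64_t)DRM_FORMAT_MOD_VENDOR_## vendor) << 56) | \
    ((val) & 0x00ffffffffffffffULL))
#define DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK(v) \
   fourcc_mod_code(NVIDIA, 0x10 | ((v) & 0xf))

/* Hypothetical helper: modifier for a block of (1 << log2_gobs) GOBs. */
static uint64_t
example_block_modifier(unsigned log2_gobs)
{
   assert(log2_gobs <= 5); /* only 6 parameter values are valid */
   return DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK(log2_gobs);
}

/* Hypothetical helper: block height in GOBs, or 0 for anything that is
 * not a valid 16Bx2 block linear modifier. */
static unsigned
example_block_height_in_gobs(uint64_t modifier)
{
   unsigned param = modifier & 0xf;

   if ((modifier & ~0xfULL) != DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK(0) ||
       param > 5)
      return 0;

   return 1u << param;
}
```

For example, example_block_modifier(3) yields the same value as
DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK_EIGHT_GOB, and decoding it gives 8.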



[Mesa-dev] [PATCH v3 1/6] drm/fourcc: Fix fourcc_mod_code() definition

2018-03-01 Thread Thierry Reding
From: Thierry Reding 

Avoid compiler warnings when the val parameter is an expression.

This is based on commit 5843f4e02fbe86a59981e35adc6cabebee46fdc0 from
Linux v4.16-rc1.

Acked-by: Emil Velikov 
Tested-by: Andre Heider 
Signed-off-by: Thierry Reding 
---
 include/drm-uapi/drm_fourcc.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/drm-uapi/drm_fourcc.h b/include/drm-uapi/drm_fourcc.h
index 3ad838d3f93f..a76ed8f9e383 100644
--- a/include/drm-uapi/drm_fourcc.h
+++ b/include/drm-uapi/drm_fourcc.h
@@ -188,7 +188,7 @@ extern "C" {
 #define DRM_FORMAT_RESERVED  ((1ULL << 56) - 1)
 
 #define fourcc_mod_code(vendor, val) \
-   ((((__u64)DRM_FORMAT_MOD_VENDOR_## vendor) << 56) | (val & 0x00ffffffffffffffULL))
+   ((((__u64)DRM_FORMAT_MOD_VENDOR_## vendor) << 56) | ((val) & 0x00ffffffffffffffULL))
 
 /*
  * Format Modifier tokens:
-- 
2.16.2



Re: [Mesa-dev] [PATCH] r600/egd_tables.py: make the script python 2+3 compatible

2018-03-01 Thread Stefan Dirsch
On Wed, Feb 28, 2018 at 02:16:08PM +0100, Stefan Dirsch wrote:
> On Wed, Feb 28, 2018 at 12:12:25PM +0100, Stefan Dirsch wrote:
> > Patch by "Tomas Chvatal"  with modifications
> > by "Michal Srb"  to not break python 2.
> > 
> > https://bugzilla.suse.com/show_bug.cgi?id=1082303
> > 
> > v2:
> > - open parse file in binary mode (default changed from binary to unicode
> >   text mode with python3)
> > - make use of 'from __future__ import print_function', so semantics
> >   of print statements in python2 are closer to print functions in python3
> > 
> > https://lists.freedesktop.org/archives/mesa-dev/2018-February/187056.html
> 
> Ok, guys. This patch is complete crap. Tested with
> 
>   python2/python3 ./egd_tables.py
> 
> (meanwhile figured out, that the parse file itself needs to be added as second
> argument)
> 
> This happens if you try to upstream and revise a patch, that you didn't write
> yourself (in a language you're not familiar with). Sigh.
> 
> So please forget about this (not so nice) try.

Meanwhile I figured out that even the patch I initially sent was already
broken. It generated lines like this:

   "b'NOP'\0" /* 0 */
   "b'DEALLOC_STATE'\0" /* 4 */

instead of (python2)

   "NOP\0" /* 0 */
   "DEALLOC_STATE\0" /* 4 */

OMG.

Stefan

Public Key available
--
Stefan Dirsch (Res. & Dev.)   SUSE LINUX GmbH
Tel: 0911-740 53 0Maxfeldstraße 5
FAX: 0911-740 53 479  D-90409 Nürnberg
http://www.suse.deGermany 
---
SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham
Norton, HRB 21284 (AG Nürnberg)
---


[Mesa-dev] [PATCH] loader: Add support for platform and host1x busses

2018-03-01 Thread Thierry Reding
From: Thierry Reding 

ARM SoCs usually have their DRM/KMS devices on the platform bus, so add
support for this bus in order to allow use of the DRI_PRIME environment
variable with those devices.

While at it, also support the host1x bus, which is effectively the same
but uses an additional layer in the bus hierarchy.

Note that it isn't enough to support the bus that has the rendering GPU
because the loader code will also try to construct an ID path tag for a
scanout-only device if it is the default that is being opened.

The ID path tag for a device can be obtained by running udevadm info on
the device node:

$ udevadm info /dev/dri/card0

and looking up the ID_PATH_TAG entry in the output.

Signed-off-by: Thierry Reding 
---
 src/loader/loader.c | 27 +++
 1 file changed, 27 insertions(+)

diff --git a/src/loader/loader.c b/src/loader/loader.c
index 92b4c5204b19..ca578b8cd232 100644
--- a/src/loader/loader.c
+++ b/src/loader/loader.c
@@ -120,6 +120,33 @@ static char *drm_construct_id_path_tag(drmDevicePtr device)
device->businfo.pci->func) < 0) {
  return NULL;
   }
+   } else if (device->bustype == DRM_BUS_PLATFORM ||
+  device->bustype == DRM_BUS_HOST1X) {
+  char *fullname, *name, *address;
+
+  if (device->bustype == DRM_BUS_PLATFORM)
+ fullname = device->businfo.platform->fullname;
+  else
+ fullname = device->businfo.host1x->fullname;
+
+  name = strrchr(fullname, '/');
+  if (!name)
+ name = strdup(fullname);
+  else
+ name = strdup(++name);
+
+  address = strchr(name, '@');
+  if (address) {
+ *address++ = '\0';
+
+ if (asprintf(&tag, "platform-%s_%s", address, name) < 0)
+tag = NULL;
+  } else {
+ if (asprintf(&tag, "platform-%s", name) < 0)
+tag = NULL;
+  }
+
+  free(name);
}
return tag;
 }
-- 
2.16.2
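
The tag construction described above can be sketched as a standalone
function. This is an illustration only: it writes into a caller-supplied
buffer with snprintf() instead of using asprintf() as the patch does, the
function name is made up, and the device-tree path used in the example is a
hypothetical input.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical sketch of the tag layout used by the patch:
 * "/soc/gpu@57000000" becomes "platform-57000000_gpu".
 * Returns 0 on success and -1 if the result did not fit into buf. */
static int
example_platform_id_path_tag(const char *fullname, char *buf, size_t size)
{
   const char *name, *address;
   int len;

   assert(fullname && buf && size > 0);

   /* Keep only the last path component. */
   name = strrchr(fullname, '/');
   name = name ? name + 1 : fullname;

   address = strchr(name, '@');
   if (address) {
      /* "%.*s" prints only the part of name in front of the '@'. */
      len = snprintf(buf, size, "platform-%s_%.*s",
                     address + 1, (int)(address - name), name);
   } else {
      len = snprintf(buf, size, "platform-%s", name);
   }

   return (len < 0 || (size_t)len >= size) ? -1 : 0;
}
```

A name without a unit address, such as "display-subsystem", simply becomes
"platform-display-subsystem".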



[Mesa-dev] [PATCH v2] disk cache: Link with -latomic if necessary

2018-03-01 Thread Thierry Reding
From: Thierry Reding 

The disk cache implementation uses 64-bit atomic operations. For some
architectures, such as 32-bit ARM, GCC will not be able to translate
these operations into atomic, lock-free instructions and will instead
rely on the external atomics library to provide these operations.

Check at configuration time whether or not linking against libatomic
is necessary and if so, create a dependency that can be used while
linking the mesautil library.

This is the meson equivalent of 2ef7f23820a6 ("configure: check if
-latomic is needed for __atomic_*").

For some background information on this, see:

https://gcc.gnu.org/wiki/Atomic/GCCMM

Changes in v2:
- clarify meaning of lock-free in commit message
- fix build if -latomic is not necessary

Acked-by: Matt Turner 
Signed-off-by: Thierry Reding 
---
 meson.build  | 17 +
 src/util/meson.build |  2 +-
 2 files changed, 18 insertions(+), 1 deletion(-)

diff --git a/meson.build b/meson.build
index e9928a379313..bb6a835084fe 100644
--- a/meson.build
+++ b/meson.build
@@ -790,9 +790,26 @@ else
 endif
 
 # Check for GCC style atomics
+dep_atomic = declare_dependency()
+
 if cc.compiles('int main() { int n; return __atomic_load_n(&n, __ATOMIC_ACQUIRE); }',
name : 'GCC atomic builtins')
   pre_args += '-DUSE_GCC_ATOMIC_BUILTINS'
+
+  # Not all atomic calls can be turned into lock-free instructions, in which
+  # case GCC will make calls into the libatomic library. Check whether we need
+  # to link with -latomic.
+  #
+  # This can happen for 64-bit atomic operations on 32-bit architectures such
+  # as ARM.
+  if not cc.links('''#include <stdint.h>
+ int main() {
+   uint64_t n;
+   return (int)__atomic_load_n(&n, __ATOMIC_ACQUIRE);
+ }''',
+  name : 'GCC atomic builtins required -latomic')
+dep_atomic = cc.find_library('atomic')
+  endif
 endif
 if not cc.links('''#include <stdint.h>
uint64_t v;
diff --git a/src/util/meson.build b/src/util/meson.build
index b23dba3a9851..eece1cefef6a 100644
--- a/src/util/meson.build
+++ b/src/util/meson.build
@@ -102,7 +102,7 @@ libmesa_util = static_library(
   'mesa_util',
   [files_mesa_util, format_srgb],
   include_directories : inc_common,
-  dependencies : [dep_zlib, dep_clock, dep_thread],
+  dependencies : [dep_zlib, dep_clock, dep_thread, dep_atomic],
   c_args : [c_msvc_compat_args, c_vis_args],
   build_by_default : false
 )
-- 
2.16.2
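
For readers unfamiliar with the issue, the following sketch shows the kind of
code that can end up needing -latomic. The counter and function names are
hypothetical and are not taken from the disk cache; only the GCC __atomic
builtins are real. On a target without native 64-bit atomics, GCC is expected
to turn the add into a library call that libatomic provides.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical counter, standing in for the 64-bit values that the disk
 * cache updates atomically. */
static uint64_t example_counter;

/* Atomically add delta and return the new value. Where the target has
 * no lock-free 64-bit atomics, this becomes a call into libatomic. */
static uint64_t
example_add(uint64_t delta)
{
   return __atomic_add_fetch(&example_counter, delta, __ATOMIC_SEQ_CST);
}

/* Non-zero if 64-bit atomics are always lock-free on this target,
 * in which case no external library call should be needed. */
static int
example_is_lock_free(void)
{
   return __atomic_always_lock_free(sizeof(uint64_t), 0);
}
```

The configure-time check in the patch answers the same question by trying to
link a small program, which also covers toolchains where the builtin alone
does not tell the whole story.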



Re: [Mesa-dev] [PATCH] st/mesa: ensure that images don't try to reference non-existent levels

2018-03-01 Thread Ilia Mirkin
Yes, as I mentioned this makes some tests assert.

They were passing before, but it was through luck since the actual images
were never accessed.

On Mar 1, 2018 04:04, "Timothy Arceri"  wrote:

> This causes the CTS tests to assert on radeonsi where they previously
> passed. Is that expected?
>
> On 27/02/18 16:19, Ilia Mirkin wrote:
>
>> Ideally the st_finalize_texture call would take care of that, but it
>> doesn't seem to with KHR-GL45.shader_image_size.advanced-nonMS-*. This
>> assertion makes sure that no such values are passed to the driver.
>>
>> Signed-off-by: Ilia Mirkin 
>> ---
>>
>> This will trigger asserts in CTS, but I think that's better than feeding
>> broken values to driver backends.
>>
>>   src/mesa/state_tracker/st_atom_image.c | 1 +
>>   1 file changed, 1 insertion(+)
>>
>> diff --git a/src/mesa/state_tracker/st_atom_image.c
>> b/src/mesa/state_tracker/st_atom_image.c
>> index 1c4980173f4..421c926cf04 100644
>> --- a/src/mesa/state_tracker/st_atom_image.c
>> +++ b/src/mesa/state_tracker/st_atom_image.c
>> @@ -97,6 +97,7 @@ st_convert_image(const struct st_context *st, const
>> struct gl_image_unit *u,
>>   img->resource = stObj->pt;
>> img->u.tex.level = u->Level + stObj->base.MinLevel;
>> +  assert(img->u.tex.level <= img->resource->last_level);
>> if (stObj->pt->target == PIPE_TEXTURE_3D) {
>>if (u->Layered) {
>>   img->u.tex.first_layer = 0;
>>
>>


[Mesa-dev] [Bug 105105] Suffixless KHR_robustness functions aren't exposed in ES 3.2

2018-03-01 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=105105

--- Comment #4 from Tapani Pälli  ---
(In reply to Kenneth Graunke from comment #0)
> The readnpixels test does:
> 
>PFNGLREADNPIXELS pReadnPixels =
> (PFNGLREADNPIXELS)context->getRenderContext().
> getProcAddress("glReadnPixels");
> 
> and then calls that function pointer...but ends up in generic_nop().

It does not actually call it; instead it calls the API via "gl.readnPixels",
where gl contains the API functions that were already resolved earlier. The
pReadnPixels pointer would actually work if it were used; I changed the test
to use it and it passes.

I wanted to write this down here so that I won't forget it. I will need to
figure out how the context functions in 'gl.*' get resolved.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.


[Mesa-dev] [PATCH 4/4] ac/nir: only enable used channels when exporting parameters

2018-03-01 Thread Samuel Pitoiset
This allows us to generate, for example,
"exp param0 v0, off, off, off" if only the first channel is needed.

Not sure if this improves performance but it's worth trying.

Signed-off-by: Samuel Pitoiset 
---
 src/amd/common/ac_nir_to_llvm.c | 24 
 1 file changed, 20 insertions(+), 4 deletions(-)

diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c
index 94648232c8..9fa6773633 100644
--- a/src/amd/common/ac_nir_to_llvm.c
+++ b/src/amd/common/ac_nir_to_llvm.c
@@ -5990,11 +5990,11 @@ si_llvm_init_export_args(struct radv_shader_context 
*ctx,
 
 static void
 radv_export_param(struct radv_shader_context *ctx, unsigned index,
- LLVMValueRef *values)
+ LLVMValueRef *values, unsigned enabled_channels)
 {
struct ac_export_args args;
 
-   si_llvm_init_export_args(ctx, values, 0xf,
+   si_llvm_init_export_args(ctx, values, enabled_channels,
 V_008DFC_SQ_EXP_PARAM + index, &args);
ac_build_export(&ctx->ac, &args);
 }
@@ -6154,7 +6154,23 @@ handle_vs_outputs_post(struct radv_shader_context *ctx,
for (unsigned j = 0; j < 4; j++)
values[j] = ac_to_float(&ctx->ac, radv_load_output(ctx, i, j));
 
-   radv_export_param(ctx, param_count, values);
+   unsigned output_usage_mask;
+
+   if (ctx->stage == MESA_SHADER_VERTEX &&
+   !ctx->is_gs_copy_shader) {
+   output_usage_mask =
+   ctx->shader_info->info.vs.output_usage_mask[i];
+   } else if (ctx->stage == MESA_SHADER_TESS_EVAL) {
+   output_usage_mask =
+   ctx->shader_info->info.tes.output_usage_mask[i];
+   } else {
+   /* Enable all channels for the GS copy shader because
+* we don't know the output usage mask currently.
+*/
+   output_usage_mask = 0xf;
+   }
+
+   radv_export_param(ctx, param_count, values, output_usage_mask);
 
outinfo->vs_output_param_offset[i] = param_count++;
}
@@ -6168,7 +6184,7 @@ handle_vs_outputs_post(struct radv_shader_context *ctx,
for (unsigned j = 1; j < 4; j++)
values[j] = ctx->ac.f32_0;
 
-   radv_export_param(ctx, param_count, values);
+   radv_export_param(ctx, param_count, values, 0xf);
 
outinfo->vs_output_param_offset[VARYING_SLOT_PRIMITIVE_ID] = 
param_count++;
outinfo->export_prim_id = true;
-- 
2.16.2
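
To illustrate the effect of the mask on the generated export, here is a small
sketch that renders the four operand slots as text. It is not how the
compiler emits the instruction; the function name and the v0..v3 register
numbering are hypothetical and only meant to show which slots turn into
"off".

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical formatter: writes "exp paramN <op>, <op>, <op>, <op>"
 * into buf, using "off" for every channel that is not enabled. */
static void
example_format_export(char *buf, size_t size, unsigned param,
                      unsigned enabled_channels)
{
   size_t len;

   assert(buf && size > 0);
   snprintf(buf, size, "exp param%u", param);

   for (unsigned chan = 0; chan < 4; chan++) {
      const char *sep = chan ? ", " : " ";

      /* snprintf() always terminates, so len stays below size. */
      len = strlen(buf);
      if (enabled_channels & (1u << chan))
         snprintf(buf + len, size - len, "%sv%u", sep, chan);
      else
         snprintf(buf + len, size - len, "%soff", sep);
   }
}
```

With param 0 and a mask of 0x1 this produces "exp param0 v0, off, off, off",
matching the example in the commit message; a mask of 0xf keeps all four
operands.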



[Mesa-dev] [PATCH 3/4] ac: update enabled channels mask when optimizing PARAM exports

2018-03-01 Thread Samuel Pitoiset
When the mask is not 0xf we need to update the number of
enabled channels, otherwise the hardware won't emit the
components that are combined.

Signed-off-by: Samuel Pitoiset 
---
 src/amd/common/ac_llvm_build.c | 18 --
 1 file changed, 16 insertions(+), 2 deletions(-)

diff --git a/src/amd/common/ac_llvm_build.c b/src/amd/common/ac_llvm_build.c
index 15144addb9..8e21de1302 100644
--- a/src/amd/common/ac_llvm_build.c
+++ b/src/amd/common/ac_llvm_build.c
@@ -1709,6 +1709,7 @@ void ac_get_image_intr_name(const char *base_name,
 }
 
 #define AC_EXP_TARGET (HAVE_LLVM >= 0x0500 ? 0 : 3)
+#define AC_EXP_ENABLED_CHANNELS (HAVE_LLVM >= 0x0500 ? 1 : 0)
 #define AC_EXP_OUT0 (HAVE_LLVM >= 0x0500 ? 2 : 5)
 
 enum ac_ir_type {
@@ -1781,7 +1782,8 @@ static bool ac_eliminate_const_output(uint8_t 
*vs_output_param_offset,
return true;
 }
 
-static bool ac_eliminate_duplicated_output(uint8_t *vs_output_param_offset,
+static bool ac_eliminate_duplicated_output(struct ac_llvm_context *ctx,
+  uint8_t *vs_output_param_offset,
   uint32_t num_outputs,
   struct ac_vs_exports *processed,
   struct ac_vs_exp_inst *exp)
@@ -1833,6 +1835,10 @@ static bool ac_eliminate_duplicated_output(uint8_t 
*vs_output_param_offset,
 */
struct ac_vs_exp_inst *match = &processed->exp[p];
 
+   /* Get current enabled channels mask. */
+   LLVMValueRef arg = LLVMGetOperand(match->inst, AC_EXP_ENABLED_CHANNELS);
+   unsigned enabled_channels = LLVMConstIntGetZExtValue(arg);
+
while (copy_back_channels) {
unsigned chan = u_bit_scan(&copy_back_channels);
 
@@ -1840,6 +1846,13 @@ static bool ac_eliminate_duplicated_output(uint8_t 
*vs_output_param_offset,
LLVMSetOperand(match->inst, AC_EXP_OUT0 + chan,
   exp->chan[chan].value);
match->chan[chan] = exp->chan[chan];
+
+   /* Update number of enabled channels because the original mask
+* is not always 0xf.
+*/
+   enabled_channels |= (1 << chan);
+   LLVMSetOperand(match->inst, AC_EXP_ENABLED_CHANNELS,
+  LLVMConstInt(ctx->i32, enabled_channels, 0));
}
 
/* The PARAM export is duplicated. Kill it. */
@@ -1927,7 +1940,8 @@ void ac_optimize_vs_outputs(struct ac_llvm_context *ctx,
/* Eliminate constant and duplicated PARAM exports. */
if (ac_eliminate_const_output(vs_output_param_offset,
  num_outputs, ) ||
-   
ac_eliminate_duplicated_output(vs_output_param_offset,
+   ac_eliminate_duplicated_output(ctx,
+  
vs_output_param_offset,
   num_outputs, 
,
   )) {
removed_any = true;
-- 
2.16.2
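
The mask update in this patch boils down to OR-ing every copied-back channel
into the mask of the export that survives. A minimal sketch of that loop,
without any LLVM calls, might look as follows. example_bit_scan() is a
hypothetical stand-in for Mesa's u_bit_scan(), and the other function name is
made up as well.

```c
#include <assert.h>

/* Hypothetical stand-in for u_bit_scan(): return the index of the
 * lowest set bit in *mask and clear that bit. */
static unsigned
example_bit_scan(unsigned *mask)
{
   unsigned i = 0;

   assert(*mask != 0);
   while (!(*mask & (1u << i)))
      i++;
   *mask &= ~(1u << i);
   return i;
}

/* Merge the channels copied from a duplicated export into the
 * enabled-channels mask of the export that is kept. */
static unsigned
example_merge_channels(unsigned enabled_channels,
                       unsigned copy_back_channels)
{
   while (copy_back_channels) {
      unsigned chan = example_bit_scan(&copy_back_channels);

      enabled_channels |= 1u << chan;
   }

   return enabled_channels;
}
```

If the kept export only had channel 0 enabled (0x1) and channels 1 and 2 are
copied back, the resulting mask is 0x7. Without the update it would stay at
0x1 and the copied components would not be emitted.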



[Mesa-dev] [PATCH 2/4] ac/nir: pass the number of enabled channels to si_llvm_init_export_args()

2018-03-01 Thread Samuel Pitoiset
Currently, it's always 0xf but an upcoming patch will reduce the
number of channels for parameters export.

Signed-off-by: Samuel Pitoiset 
---
 src/amd/common/ac_nir_to_llvm.c | 21 +
 1 file changed, 13 insertions(+), 8 deletions(-)

diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c
index afe17a8f11..94648232c8 100644
--- a/src/amd/common/ac_nir_to_llvm.c
+++ b/src/amd/common/ac_nir_to_llvm.c
@@ -5862,11 +5862,12 @@ setup_shared(struct ac_nir_context *ctx,
 static void
 si_llvm_init_export_args(struct radv_shader_context *ctx,
 LLVMValueRef *values,
+unsigned enabled_channels,
 unsigned target,
 struct ac_export_args *args)
 {
-   /* Default is 0xf. Adjusted below depending on the format. */
-   args->enabled_channels = 0xf;
+   /* Specify the channels that are enabled. */
+   args->enabled_channels = enabled_channels;
 
/* Specify whether the EXEC mask represents the valid mask */
args->valid_mask = 0;
@@ -5979,8 +5980,12 @@ si_llvm_init_export_args(struct radv_shader_context *ctx,
 
memcpy(&args->out[0], values, sizeof(values[0]) * 4);
 
-   for (unsigned i = 0; i < 4; ++i)
+   for (unsigned i = 0; i < 4; ++i) {
+   if (!(args->enabled_channels & (1 << i)))
+   continue;
+
args->out[i] = ac_to_float(&ctx->ac, args->out[i]);
+   }
 }
 
 static void
@@ -5989,7 +5994,7 @@ radv_export_param(struct radv_shader_context *ctx, 
unsigned index,
 {
struct ac_export_args args;
 
-   si_llvm_init_export_args(ctx, values,
+   si_llvm_init_export_args(ctx, values, 0xf,
 V_008DFC_SQ_EXP_PARAM + index, &args);
ac_build_export(&ctx->ac, &args);
 }
@@ -6046,13 +6051,13 @@ handle_vs_outputs_post(struct radv_shader_context *ctx,
 
if (ctx->num_output_clips + ctx->num_output_culls > 4) {
target = V_008DFC_SQ_EXP_POS + 3;
-   si_llvm_init_export_args(ctx, [4], target, );
+   si_llvm_init_export_args(ctx, [4], 0xf, target, 
);
memcpy(_args[target - V_008DFC_SQ_EXP_POS],
   , sizeof(args));
}
 
target = V_008DFC_SQ_EXP_POS + 2;
-   si_llvm_init_export_args(ctx, [0], target, );
+   si_llvm_init_export_args(ctx, [0], 0xf, target, );
memcpy(_args[target - V_008DFC_SQ_EXP_POS],
   , sizeof(args));
 
@@ -6063,7 +6068,7 @@ handle_vs_outputs_post(struct radv_shader_context *ctx,
for (unsigned j = 0; j < 4; j++)
pos_values[j] = radv_load_output(ctx, VARYING_SLOT_POS, 
j);
}
-   si_llvm_init_export_args(ctx, pos_values, V_008DFC_SQ_EXP_POS, 
_args[0]);
+   si_llvm_init_export_args(ctx, pos_values, 0xf, V_008DFC_SQ_EXP_POS, 
_args[0]);
 
if (ctx->output_mask & (1ull << VARYING_SLOT_PSIZ)) {
outinfo->writes_pointsize = true;
@@ -6531,7 +6536,7 @@ si_export_mrt_color(struct radv_shader_context *ctx,
struct ac_export_args *args)
 {
/* Export */
-   si_llvm_init_export_args(ctx, color,
+   si_llvm_init_export_args(ctx, color, 0xf,
 V_008DFC_SQ_EXP_MRT + index, args);
 
if (is_last) {
-- 
2.16.2



[Mesa-dev] [PATCH 1/4] ac/shader: scan output usage mask for VS and TES

2018-03-01 Thread Samuel Pitoiset
Signed-off-by: Samuel Pitoiset 
---
 src/amd/common/ac_shader_info.c | 18 ++
 src/amd/common/ac_shader_info.h |  4 
 2 files changed, 22 insertions(+)

diff --git a/src/amd/common/ac_shader_info.c b/src/amd/common/ac_shader_info.c
index 57d7edec76..98de963147 100644
--- a/src/amd/common/ac_shader_info.c
+++ b/src/amd/common/ac_shader_info.c
@@ -146,6 +146,24 @@ gather_intrinsic_info(const nir_shader *nir, const 
nir_intrinsic_instr *instr,
}
}
break;
+   case nir_intrinsic_store_var: {
+   nir_deref_var *dvar = instr->variables[0];
+   nir_variable *var = dvar->var;
+
+   if (var->data.mode == nir_var_shader_out) {
+   unsigned idx = var->data.location;
+   unsigned comp = var->data.location_frac;
+
+   if (nir->info.stage == MESA_SHADER_VERTEX) {
+   info->vs.output_usage_mask[idx] |=
+   instr->const_index[0] << comp;
+   } else if (nir->info.stage == MESA_SHADER_TESS_EVAL) {
+   info->tes.output_usage_mask[idx] |=
+   instr->const_index[0] << comp;
+   }
+   }
+   break;
+   }
default:
break;
}
diff --git a/src/amd/common/ac_shader_info.h b/src/amd/common/ac_shader_info.h
index 60ddfd2d71..12a1dcf915 100644
--- a/src/amd/common/ac_shader_info.h
+++ b/src/amd/common/ac_shader_info.h
@@ -37,10 +37,14 @@ struct ac_shader_info {
bool uses_prim_id;
struct {
uint8_t input_usage_mask[VERT_ATTRIB_MAX];
+   uint8_t output_usage_mask[VARYING_SLOT_VAR31 + 1];
bool has_vertex_buffers; /* needs vertex buffers and base/start 
*/
bool needs_draw_id;
bool needs_instance_id;
} vs;
+   struct {
+   uint8_t output_usage_mask[VARYING_SLOT_VAR31 + 1];
+   } tes;
struct {
bool force_persample;
bool needs_sample_positions;
-- 
2.16.2
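
The mask accumulation in gather_intrinsic_info above can be sketched in isolation. This is a minimal illustration, assuming that instr->const_index[0] is the store's component write mask and that location_frac is the starting component; the struct, the slot count and the function name are hypothetical and not part of Mesa.

```c
#include <stdint.h>

#define TOY_NUM_SLOTS 64 /* hypothetical slot count, not Mesa's value */

/* Hypothetical stand-in for the per-stage info in ac_shader_info. */
struct toy_usage_info {
    uint8_t output_usage_mask[TOY_NUM_SLOTS];
};

/* Record that a store wrote the components in `writemask`, shifted by the
 * variable's starting component `comp` (location_frac in the patch). */
static void toy_record_store(struct toy_usage_info *info, unsigned slot,
                             unsigned comp, unsigned writemask)
{
    info->output_usage_mask[slot] |= (uint8_t)(writemask << comp);
}
```

Two stores to the same slot, one writing .xy at component 0 and one writing a single component at component 2, leave the slot's mask at 0x7 rather than the 0xf that the export code would otherwise assume.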

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 72600] ES3 context returned when ES2 is requested

2018-03-01 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=72600

Tapani Pälli  changed:

   What|Removed |Added

 Resolution|--- |WORKSFORME
 Status|NEW |RESOLVED

--- Comment #13 from Tapani Pälli  ---
I'm resolving this old bug as WORKSFORME. Mesa sets EGL_RENDERABLE_TYPE which
can be queried via eglGetConfigAttrib and for ES3 compatible config it has
EGL_OPENGL_ES3_BIT set. Please reopen if there is a bug.



Re: [Mesa-dev] [PATCH 1/2] meson: fix LLVM version detection when <= 3.4

2018-03-01 Thread Andres Gomez
On Thu, 2018-03-01 at 09:25 +, Eric Engestrom wrote:

[...]

> 
> Oh, my apologies, I didn't think about that!
> Can you add that paragraph in the commit message so it's clearer?
> (I know there was already a mention of that, but I had not understood it the 
> first time around)
> 
> > 
> > You can see an example of this error at:
> > https://travis-ci.org/Igalia/release-mesa/builds/347267445
> > 
> > 
> > I'll send a new version following your snippet. Thanks! ☺
> 
> You can add my r-b on that patch :)

Modified locally and landed.

Thanks a lot for the review! ☺

-- 
Br,

Andres


Re: [Mesa-dev] [PATCH] radv: do not set pending_reset_query in BeginCommandBuffer()

2018-03-01 Thread Alex Smith
Reviewed-by: Alex Smith 

On 1 March 2018 at 09:53, Samuel Pitoiset  wrote:

> This is just useless for two reasons:
> 1) flush_bits is not set accordingly, so nothing will be flushed
>in BeginQuery().
> 2) we always flush caches in EndCommandBuffer(), so if a reset
>is done in a previous command buffer we are safe.
>
> Cc: Alex Smith 
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/amd/vulkan/radv_cmd_buffer.c | 7 ---
>  1 file changed, 7 deletions(-)
>
> diff --git a/src/amd/vulkan/radv_cmd_buffer.c b/src/amd/vulkan/radv_cmd_
> buffer.c
> index cfdc531acd..2b41baea3d 100644
> --- a/src/amd/vulkan/radv_cmd_buffer.c
> +++ b/src/amd/vulkan/radv_cmd_buffer.c
> @@ -1930,13 +1930,6 @@ VkResult radv_BeginCommandBuffer(
>
> cmd_buffer->status = RADV_CMD_BUFFER_STATUS_RECORDING;
>
> -   /* Force cache flushes before starting a new query in case the
> -* corresponding pool has been resetted from a different command
> -* buffer. This is because we have to flush caches between reset
> and
> -* begin if the compute shader path has been used.
> -*/
> -   cmd_buffer->pending_reset_query = true;
> -
> return result;
>  }
>
> --
> 2.16.2
>
>


[Mesa-dev] [PATCH] radv: do not set pending_reset_query in BeginCommandBuffer()

2018-03-01 Thread Samuel Pitoiset
This is just useless for two reasons:
1) flush_bits is not set accordingly, so nothing will be flushed
   in BeginQuery().
2) we always flush caches in EndCommandBuffer(), so if a reset
   is done in a previous command buffer we are safe.

Cc: Alex Smith 
Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_cmd_buffer.c | 7 ---
 1 file changed, 7 deletions(-)

diff --git a/src/amd/vulkan/radv_cmd_buffer.c b/src/amd/vulkan/radv_cmd_buffer.c
index cfdc531acd..2b41baea3d 100644
--- a/src/amd/vulkan/radv_cmd_buffer.c
+++ b/src/amd/vulkan/radv_cmd_buffer.c
@@ -1930,13 +1930,6 @@ VkResult radv_BeginCommandBuffer(
 
cmd_buffer->status = RADV_CMD_BUFFER_STATUS_RECORDING;
 
-   /* Force cache flushes before starting a new query in case the
-* corresponding pool has been resetted from a different command
-* buffer. This is because we have to flush caches between reset and
-* begin if the compute shader path has been used.
-*/
-   cmd_buffer->pending_reset_query = true;
-
return result;
 }
 
-- 
2.16.2
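
The first reason given in the commit message can be sketched with a toy model. It assumes, as the thread states, that the cache flush helper does nothing when no flush bits are set; every type and function name below is a hypothetical stand-in, not the radv code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for radv_cmd_buffer; only the fields mentioned
 * in the discussion are modeled. */
struct toy_cmd_buffer {
    uint32_t flush_bits;       /* caches that still need flushing */
    bool pending_reset_query;  /* a pool reset may require a flush */
    unsigned flushes_emitted;  /* counts emitted flushes, for illustration */
};

/* Emits a flush only when some flush bit is set, then clears the bits. */
static void toy_emit_cache_flush(struct toy_cmd_buffer *cmd)
{
    if (!cmd->flush_bits)
        return;
    cmd->flushes_emitted++;
    cmd->flush_bits = 0;
}

/* Models the reset: fill_flush_bits is what the buffer fill returned,
 * assumed nonzero only on the compute shader path. */
static void toy_reset_query_pool(struct toy_cmd_buffer *cmd,
                                 uint32_t fill_flush_bits)
{
    if (fill_flush_bits) {
        cmd->pending_reset_query = true;
        cmd->flush_bits |= fill_flush_bits;
    }
}

/* Models the start of a query: flush once if a reset is pending. */
static void toy_begin_query(struct toy_cmd_buffer *cmd)
{
    if (cmd->pending_reset_query) {
        toy_emit_cache_flush(cmd);
        cmd->pending_reset_query = false;
    }
}
```

In this model, setting pending_reset_query on a fresh command buffer and then beginning a query emits nothing, because flush_bits is still zero. Only a reset that returned flush bits leads to a flush, which is the behaviour the patch keeps.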



Re: [Mesa-dev] [PATCH v3] radv: make sure to emit cache flushes before starting a query

2018-03-01 Thread Samuel Pitoiset



On 03/01/2018 10:21 AM, Alex Smith wrote:

Hi Samuel,

On 28 February 2018 at 20:47, Samuel Pitoiset wrote:


If the query pool has been previously resetted using the compute
shader path.

v3: set pending_reset_query only for the compute shader path
v2: handle multiple commands buffers with same pool

Fixes: a41e2e9cf5 ("radv: allow to use a compute shader for
resetting the query pool")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105292

Cc: "18.0"
Signed-off-by: Samuel Pitoiset
---
  src/amd/vulkan/radv_cmd_buffer.c |  7 +++
  src/amd/vulkan/radv_private.h    |  5 +
  src/amd/vulkan/radv_query.c      | 28 +---
  3 files changed, 33 insertions(+), 7 deletions(-)

diff --git a/src/amd/vulkan/radv_cmd_buffer.c
b/src/amd/vulkan/radv_cmd_buffer.c
index 2b41baea3d..cfdc531acd 100644
--- a/src/amd/vulkan/radv_cmd_buffer.c
+++ b/src/amd/vulkan/radv_cmd_buffer.c
@@ -1930,6 +1930,13 @@ VkResult radv_BeginCommandBuffer(

         cmd_buffer->status = RADV_CMD_BUFFER_STATUS_RECORDING;

+       /* Force cache flushes before starting a new query in case the
+        * corresponding pool has been resetted from a different command
+        * buffer. This is because we have to flush caches between
reset and
+        * begin if the compute shader path has been used.
+        */
+       cmd_buffer->pending_reset_query = true;


Since this just ends up calling si_emit_cache_flush, doesn't flush_bits 
need to be set accordingly for it to actually do anything? If the reset 
is done in a previous command buffer, I think the flush would already 
have been done in EndCommandBuffer on that?


You are right, this is just useless because we always flush caches in 
EndCommandBuffer(). Thanks for pointing this out. Will remove that hunk.




Thanks,
Alex

+
         return result;
  }

diff --git a/src/amd/vulkan/radv_private.h
b/src/amd/vulkan/radv_private.h
index c72df5a737..b76d2eb5cb 100644
--- a/src/amd/vulkan/radv_private.h
+++ b/src/amd/vulkan/radv_private.h
@@ -1003,6 +1003,11 @@ struct radv_cmd_buffer {
         uint32_t gfx9_fence_offset;
         struct radeon_winsys_bo *gfx9_fence_bo;
         uint32_t gfx9_fence_idx;
+
+       /**
+        * Whether a query pool has been resetted and we have to
flush caches.
+        */
+       bool pending_reset_query;
  };

  struct radv_image;
diff --git a/src/amd/vulkan/radv_query.c b/src/amd/vulkan/radv_query.c
index ace745e4e6..b1393a2ec7 100644
--- a/src/amd/vulkan/radv_query.c
+++ b/src/amd/vulkan/radv_query.c
@@ -1058,17 +1058,23 @@ void radv_CmdResetQueryPool(
  {
         RADV_FROM_HANDLE(radv_cmd_buffer, cmd_buffer, commandBuffer);
         RADV_FROM_HANDLE(radv_query_pool, pool, queryPool);
-       struct radv_cmd_state *state = &cmd_buffer->state;
+       uint32_t flush_bits = 0;

-       state->flush_bits |= radv_fill_buffer(cmd_buffer, pool->bo,
-                                             firstQuery * pool->stride,
-                                             queryCount *
pool->stride, 0);
+       flush_bits |= radv_fill_buffer(cmd_buffer, pool->bo,
+                                      firstQuery * pool->stride,
+                                      queryCount * pool->stride, 0);

         if (pool->type == VK_QUERY_TYPE_TIMESTAMP ||
             pool->type == VK_QUERY_TYPE_PIPELINE_STATISTICS) {
-               state->flush_bits |= radv_fill_buffer(cmd_buffer,
pool->bo,
-   
  pool->availability_offset + firstQuery * 4,

-                                                     queryCount *
4, 0);
+               flush_bits |= radv_fill_buffer(cmd_buffer, pool->bo,
+ 
pool->availability_offset + firstQuery * 4,

+                                              queryCount * 4, 0);
+       }
+
+       if (flush_bits) {
+               /* Only need to flush caches for the compute shader
path. */
+               cmd_buffer->pending_reset_query = true;
+               cmd_buffer->state.flush_bits |= flush_bits;
         }
  }

@@ -1086,6 +1092,14 @@ void radv_CmdBeginQuery(

         radv_cs_add_buffer(cmd_buffer->device->ws, cs, pool->bo, 8);

+       if (cmd_buffer->pending_reset_query) {
+               /* Make sure to flush caches if the query pool has been
+                * previously resetted using the compute 

Re: [Mesa-dev] [PATCH 1/2] meson: fix LLVM version detection when <= 3.4

2018-03-01 Thread Eric Engestrom


On February 28, 2018 8:30:14 PM UTC, Andres Gomez  wrote:
> On Wed, 2018-02-28 at 17:12 +, Eric Engestrom wrote:
> > On Wednesday, 2018-02-28 17:08:41 +, Eric Engestrom wrote:
> > > On Wednesday, 2018-02-28 17:02:50 +, Eric Engestrom wrote:
> > > > On Wednesday, 2018-02-28 17:52:05 +0200, Andres Gomez wrote:
> > > > > 3 digits versions in LLVM only started from 3.4.1 on. Hence,
> if you
> > > > > have installed 3.4 or below, meson will fail even when we may
> not make
> > > > > use of LLVM.
> > > > > 
> > > > > Cc: Dylan Baker 
> > > > > Cc: Eric Engestrom 
> > > > > Signed-off-by: Andres Gomez 
> > > > > ---
> > > > >  meson.build | 13 -
> > > > >  1 file changed, 12 insertions(+), 1 deletion(-)
> > > > > 
> > > > > diff --git a/meson.build b/meson.build
> > > > > index 308f64cf811..b8c0b04893c 100644
> > > > > --- a/meson.build
> > > > > +++ b/meson.build
> > > > > @@ -1037,7 +1037,18 @@ if with_llvm
> > > > ># that for our version checks.
> > > > ># svn suffixes are stripped by meson as of 0.43, and git
> suffixes are
> > > > ># strippped as of 0.44, but we support older meson
> versions.
> > > > > -  _llvm_patch = _llvm_version[2]
> > > > > +
> > > > > +  # 3 digits versions in LLVM only started from 3.4.1 on
> > > > > +  if dep_llvm.version() <= '3.3'
> > > > 
> > > > The correct 'meson way' to compare version strings is
> > > >   if dep_llvm.version().version_compare('<= 3.3')
> > > > 
> > > > > +_llvm_patch = _llvm_version[1]
> > > > > +  elif dep_llvm.version() >= '3.5'
> > > > > +_llvm_patch = _llvm_version[2]
> > > > > +  elif dep_llvm.version().startswith('3.4.1') or
> dep_llvm.version().startswith('3.4.2')
> > > > > +_llvm_patch = _llvm_version[2]
> > > > > +  else
> > > > > +_llvm_patch = _llvm_version[1]
> > > > > +  endif
> > > > 
> > > > This whole logic seems overly complicated, and I don't think
> duplicating
> > > > the minor version as the patch version is the right thing
> either.
> 
> Ouch! Right, minor version should be 0 in those cases.
> 
> > > > How about this instead?
> > > > 
> > > >   if dep_llvm.version().version_compare('>= 3.4.1')
> > > > _llvm_patch = _llvm_version[2]
> > > >   else
> > > > _llvm_patch = '0'
> > > > endif
> > > 
> > > Actually, do we still support llvm < 3.4? Didn't we just bump the
> > > minimum to 4.0?
> > > I think we did, in which case this patch is not necessary at all
> :)
> > 
> > Correction: the minimum is 3.9, which is still >= 3.4, so NAK on
> this
> > patch, it would be dead code anyway :)
> 
> The purpose of this patch is not to provide support for older versions
> of LLVM but to keep meson from failing when an older version is present
> in the system.
> 
> In other words, you can perfectly build with an old LLVM (< 3.4.1) in
> the system while not needing LLVM at all (auto). When passing through
> this detection code, meson will fail when accessing "_llvm_version[2]"
> due to:
> 
> "Index 2 out of bounds of array of size 2."

Oh, my apologies, I didn't think about that!
Can you add that paragraph in the commit message so it's clearer?
(I know there was already a mention of that, but I had not understood it the 
first time around)

> 
> You can see an example of this error at:
> https://travis-ci.org/Igalia/release-mesa/builds/347267445
> 
> 
> I'll send a new version following your snippet. Thanks! ☺

You can add my r-b on that patch :)

> 
> -- 
> Br,
> 
> Andres
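
The failure discussed in this thread comes from indexing a third version component that does not exist. The guard agreed on above (use the patch component when present, otherwise 0) is sketched here in C rather than meson, purely as an illustration; parse_version is a hypothetical helper and not part of any build system.

```c
#include <stdio.h>

/* Parse "major.minor[.patch]" into out[0..2].
 * Returns how many components were present (0 to 3); components that are
 * missing keep the default value 0 instead of causing an error. */
static int parse_version(const char *s, int out[3])
{
    int n;

    out[0] = out[1] = out[2] = 0;
    n = sscanf(s, "%d.%d.%d", &out[0], &out[1], &out[2]);
    return n < 0 ? 0 : n;
}
```

With this approach "3.4" yields two components and a patch level of 0, while "3.4.1" yields three components and a patch level of 1, so the two-component case never reads past the end of the parsed data.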


Re: [Mesa-dev] [PATCH v3] radv: make sure to emit cache flushes before starting a query

2018-03-01 Thread Alex Smith
Hi Samuel,

On 28 February 2018 at 20:47, Samuel Pitoiset 
wrote:

> If the query pool has been previously resetted using the compute
> shader path.
>
> v3: set pending_reset_query only for the compute shader path
> v2: handle multiple commands buffers with same pool
>
> Fixes: a41e2e9cf5 ("radv: allow to use a compute shader for resetting the
> query pool")
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105292
> Cc: "18.0" 
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/amd/vulkan/radv_cmd_buffer.c |  7 +++
>  src/amd/vulkan/radv_private.h|  5 +
>  src/amd/vulkan/radv_query.c  | 28 +---
>  3 files changed, 33 insertions(+), 7 deletions(-)
>
> diff --git a/src/amd/vulkan/radv_cmd_buffer.c b/src/amd/vulkan/radv_cmd_
> buffer.c
> index 2b41baea3d..cfdc531acd 100644
> --- a/src/amd/vulkan/radv_cmd_buffer.c
> +++ b/src/amd/vulkan/radv_cmd_buffer.c
> @@ -1930,6 +1930,13 @@ VkResult radv_BeginCommandBuffer(
>
> cmd_buffer->status = RADV_CMD_BUFFER_STATUS_RECORDING;
>
> +   /* Force cache flushes before starting a new query in case the
> +* corresponding pool has been resetted from a different command
> +* buffer. This is because we have to flush caches between reset
> and
> +* begin if the compute shader path has been used.
> +*/
> +   cmd_buffer->pending_reset_query = true;
>

Since this just ends up calling si_emit_cache_flush, doesn't flush_bits
need to be set accordingly for it to actually do anything? If the reset is
done in a previous command buffer, I think the flush would already have
been done in EndCommandBuffer on that?

Thanks,
Alex

+
> return result;
>  }
>
> diff --git a/src/amd/vulkan/radv_private.h b/src/amd/vulkan/radv_private.h
> index c72df5a737..b76d2eb5cb 100644
> --- a/src/amd/vulkan/radv_private.h
> +++ b/src/amd/vulkan/radv_private.h
> @@ -1003,6 +1003,11 @@ struct radv_cmd_buffer {
> uint32_t gfx9_fence_offset;
> struct radeon_winsys_bo *gfx9_fence_bo;
> uint32_t gfx9_fence_idx;
> +
> +   /**
> +* Whether a query pool has been resetted and we have to flush
> caches.
> +*/
> +   bool pending_reset_query;
>  };
>
>  struct radv_image;
> diff --git a/src/amd/vulkan/radv_query.c b/src/amd/vulkan/radv_query.c
> index ace745e4e6..b1393a2ec7 100644
> --- a/src/amd/vulkan/radv_query.c
> +++ b/src/amd/vulkan/radv_query.c
> @@ -1058,17 +1058,23 @@ void radv_CmdResetQueryPool(
>  {
> RADV_FROM_HANDLE(radv_cmd_buffer, cmd_buffer, commandBuffer);
> RADV_FROM_HANDLE(radv_query_pool, pool, queryPool);
> -   struct radv_cmd_state *state = &cmd_buffer->state;
> +   uint32_t flush_bits = 0;
>
> -   state->flush_bits |= radv_fill_buffer(cmd_buffer, pool->bo,
> - firstQuery * pool->stride,
> - queryCount * pool->stride,
> 0);
> +   flush_bits |= radv_fill_buffer(cmd_buffer, pool->bo,
> +  firstQuery * pool->stride,
> +  queryCount * pool->stride, 0);
>
> if (pool->type == VK_QUERY_TYPE_TIMESTAMP ||
> pool->type == VK_QUERY_TYPE_PIPELINE_STATISTICS) {
> -   state->flush_bits |= radv_fill_buffer(cmd_buffer, pool->bo,
> -
>  pool->availability_offset + firstQuery * 4,
> - queryCount * 4, 0);
> +   flush_bits |= radv_fill_buffer(cmd_buffer, pool->bo,
> +  pool->availability_offset +
> firstQuery * 4,
> +  queryCount * 4, 0);
> +   }
> +
> +   if (flush_bits) {
> +   /* Only need to flush caches for the compute shader path.
> */
> +   cmd_buffer->pending_reset_query = true;
> +   cmd_buffer->state.flush_bits |= flush_bits;
> }
>  }
>
> @@ -1086,6 +1092,14 @@ void radv_CmdBeginQuery(
>
> radv_cs_add_buffer(cmd_buffer->device->ws, cs, pool->bo, 8);
>
> +   if (cmd_buffer->pending_reset_query) {
> +   /* Make sure to flush caches if the query pool has been
> +* previously resetted using the compute shader path.
> +*/
> +   si_emit_cache_flush(cmd_buffer);
> +   cmd_buffer->pending_reset_query = false;
> +   }
> +
> switch (pool->type) {
> case VK_QUERY_TYPE_OCCLUSION:
> radeon_check_space(cmd_buffer->device->ws, cs, 7);
> --
> 2.16.2
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>

[Mesa-dev] [PATCH] ac: fix nir_intrinsic_shared_atomic_comp_swap handling

2018-03-01 Thread Timothy Arceri
Following on from 49879f377870 this makes sure we use the correct
src index.

Fixes cts test:
KHR-GL46.compute_shader.atomic-case3
---
 src/amd/common/ac_nir_to_llvm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c
index 2bd1257a52..40201f5d10 100644
--- a/src/amd/common/ac_nir_to_llvm.c
+++ b/src/amd/common/ac_nir_to_llvm.c
@@ -3998,7 +3998,7 @@ static LLVMValueRef visit_var_atomic(struct 
ac_nir_context *ctx,
 
if (instr->intrinsic == nir_intrinsic_var_atomic_comp_swap ||
instr->intrinsic == nir_intrinsic_shared_atomic_comp_swap) {
-   LLVMValueRef src1 = get_src(ctx, instr->src[1]);
+   LLVMValueRef src1 = get_src(ctx, instr->src[src_idx + 1]);
result = LLVMBuildAtomicCmpXchg(ctx->ac.builder,
ptr, src, src1,

LLVMAtomicOrderingSequentiallyConsistent,
-- 
2.14.3



Re: [Mesa-dev] [PATCH v2 2/9] intel/blorp: Add indirect clear color support to mcs_partial_resolve

2018-03-01 Thread Pohjolainen, Topi
On Mon, Feb 26, 2018 at 07:15:13AM -0800, Jason Ekstrand wrote:
> On February 26, 2018 05:33:15 "Pohjolainen, Topi"
>  wrote:
> 
> >On Fri, Feb 23, 2018 at 10:22:57PM -0800, Jason Ekstrand wrote:
> >>This is a bit complicated because we have to get the indirect clear
> >>color in there somehow.  In order to not do any more work in the shader
> >>than needed, we set it up as its own vertex binding which points
> >>directly at the clear color address specified by the client.
> >>---
> >> src/intel/blorp/blorp_clear.c | 25 +-
> >> src/intel/blorp/blorp_genX_exec.h | 54 
> >> ---
> >> src/intel/blorp/blorp_priv.h  |  1 +
> >> 3 files changed, 70 insertions(+), 10 deletions(-)
> >>
> >>diff --git a/src/intel/blorp/blorp_clear.c b/src/intel/blorp/blorp_clear.c
> >>index dde116f..832e8ee 100644
> >>--- a/src/intel/blorp/blorp_clear.c
> >>+++ b/src/intel/blorp/blorp_clear.c
> >>@@ -833,9 +833,18 @@ blorp_ccs_resolve(struct blorp_batch *batch,
> >>batch->blorp->exec(batch, );
> >> }
> >>
> >>+static nir_ssa_def *
> >>+blorp_nir_bit(nir_builder *b, nir_ssa_def *src, unsigned bit)
> >>+{
> >>+   return nir_iand(b, nir_ushr(b, src, nir_imm_int(b, bit)),
> >>+  nir_imm_int(b, 1));
> >>+}
> >>+
> >> struct blorp_mcs_partial_resolve_key
> >> {
> >>enum blorp_shader_type shader_type;
> >>+   bool indirect_clear_color;
> >>+   bool int_format;
> >>uint32_t num_samples;
> >> };
> >>
> >>@@ -845,6 +854,8 @@
> >>blorp_params_get_mcs_partial_resolve_kernel(struct blorp_context
> >>*blorp,
> >> {
> >>const struct blorp_mcs_partial_resolve_key blorp_key = {
> >>   .shader_type = BLORP_SHADER_TYPE_MCS_PARTIAL_RESOLVE,
> >>+  .indirect_clear_color = params->dst.clear_color_addr.buffer != NULL,
> >>+  .int_format = isl_format_has_int_channel(params->dst.view.format),
> >>   .num_samples = params->num_samples,
> >>};
> >>
> >>@@ -879,7 +890,18 @@
> >>blorp_params_get_mcs_partial_resolve_kernel(struct blorp_context
> >>*blorp,
> >>discard->src[0] = nir_src_for_ssa(nir_inot(&b, is_clear));
> >>nir_builder_instr_insert(&b, &discard->instr);
> >>
> >>-   nir_copy_var(&b, frag_color, v_color);
> >>+   nir_ssa_def *clear_color = nir_load_var(&b, v_color);
> >>+   if (blorp_key.indirect_clear_color && blorp->isl_dev->info->gen <= 8) {
> >>+  /* Gen7-8 clear colors are stored as single 0/1 bits */
> >>+  clear_color = nir_vec4(&b, blorp_nir_bit(&b, clear_color, 31),
> >>+ blorp_nir_bit(&b, clear_color, 30),
> >>+ blorp_nir_bit(&b, clear_color, 29),
> >>+ blorp_nir_bit(&b, clear_color, 28));
> >
> >I was expecting to see right hand side in the form:
> >blorp_nir_bit(&b, nir_channel(&b, clear_color, 0), 31) and so on. So omitting
> >the channel selection defaults to first or am I missing something else?
> 
> More or less, yes.  In this case, I think it's actually doing a
> vectorised thing in blorp_nir_bit and then the vec4 at the end is
> only taking the first channel of each argument.  The compiler then
> scalarizes the whole thing and throws away the unneeded components.
> In any case, it's perfectly safe.
> 
> >In addition I was surprised to see that we can treat it as integer
> >in expressions
> >even though it is always defined as float - it gets its type after v_color,
> >right?
> 
> NIR doesn't really have types in the classic sense. The only
> information an SSA value has in NIR is its number of components and
> bits per component.  Whether those bits are interpreted as an int
> or a float is up to the individual instruction.
> 
> >And that in turn is unconditionally declared as:
> >
> >   nir_variable *v_color =
> >  BLORP_CREATE_NIR_INPUT(b.shader, clear_color, glsl_vec4_type());
> 
> We could have just as easily declared it as a uvec4. The important
> part is that it's declared to be flat.

I think I see how it works now, thanks for the explanation! Patch is:

Reviewed-by: Topi Pohjolainen 
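
The two points from this exchange, extracting single bits from the packed clear color and the fact that the same 32 bits can be read as an integer or a float, can be sketched in scalar C. This is only an illustration under stated assumptions: the function names are hypothetical, the bit layout (bits 31 to 28 for the four channels) is taken from the patch comment, and bits_as_float assumes a 32-bit IEEE 754 float.

```c
#include <stdint.h>
#include <string.h>

/* Scalar analogue of blorp_nir_bit: shift right, then mask to one bit. */
static uint32_t extract_bit(uint32_t src, unsigned bit)
{
    return (src >> bit) & 1u;
}

/* Unpack bits 31..28 into four channels and convert each integer 0/1 to a
 * float, which is the role nir_i2f32 plays for non-integer formats. */
static void unpack_clear_color(uint32_t packed, float rgba[4])
{
    for (unsigned i = 0; i < 4; i++)
        rgba[i] = (float)extract_bit(packed, 31 - i);
}

/* Reinterpret the same 32 bits as a float without converting the value. */
static float bits_as_float(uint32_t bits)
{
    float f;

    memcpy(&f, &bits, sizeof(f));
    return f;
}
```

The integer 1 converted with a cast gives 1.0f, while the bit pattern 1 reinterpreted as a float is a tiny denormal. That difference is why an explicit integer-to-float conversion is needed once the bits have been extracted with integer operations.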

> 
> >>+
> >>+  if (!blorp_key.int_format)
> >>+ clear_color = nir_i2f32(&b, clear_color);
> >>+   }
> >>+   nir_store_var(&b, frag_color, clear_color, 0xf);
> >>
> >>struct brw_wm_prog_key wm_key;
> >>brw_blorp_init_wm_prog_key(&wm_key);
> >>@@ -925,6 +947,7 @@ blorp_mcs_partial_resolve(struct blorp_batch *batch,
> >>
> >>params.num_samples = params.dst.surf.samples;
> >>params.num_layers = num_layers;
> >>+   params.dst_clear_color_as_input = surf->clear_color_addr.buffer != NULL;
> >>
> >>memcpy(_inputs.clear_color,
> >>   surf->clear_color.f32, sizeof(float) * 4);
> >>diff --git a/src/intel/blorp/blorp_genX_exec.h
> >>b/src/intel/blorp/blorp_genX_exec.h
> >>index cea514e..cc408ca 100644
> >>--- a/src/intel/blorp/blorp_genX_exec.h
> >>+++ b/src/intel/blorp/blorp_genX_exec.h
> >>@@ -297,7 +297,7 @@ static void
> >> blorp_emit_vertex_buffers(struct 

Re: [Mesa-dev] [PATCH 05/13] nir: expose 'C' wrappers for std430 size/alignment

2018-03-01 Thread Karol Herbst
On Wed, Feb 28, 2018 at 10:59 PM, Rob Clark  wrote:
> hmm, I haven't tried passing a struct (rather than a pointer to a
> struct) as a parameter, but if there were 8/16 bit fields in the
> struct it would calculate the size incorrectly.
>
> (otoh the more important question is whether this agrees with how
> clover lays out the input buffer, as far as where the n+1'th parameter
> starts, which I'm not sure about)
>

you can also pack structs in OpenCL C. I just used those functions in
my backend code, because it worked for the more simple examples.

> BR,
> -R
>
> On Wed, Feb 28, 2018 at 4:47 PM, Jason Ekstrand  wrote:
>> I thought OpenCL used a different set of alignment rules for structs,
>> unions, etc.  In particular, I thought it was very close to standard C.  If
>> that's true, then std430 is not what you want.
>>
>> On Wed, Feb 28, 2018 at 1:44 PM, Karol Herbst  wrote:
>>>
>>> it isn't yet. But you would use it in your driver when calculating
>>> your memory offsets for kernel arguments. In OpenCL things are aligned
>>> in memory by the size of the type and we would use those functions to
>>> calculate those.
>>>
>>> On Wed, Feb 28, 2018 at 10:39 PM, Jason Ekstrand 
>>> wrote:
>>> > Looking through commit titles, I don't see any obvious place where this
>>> > would get used.
>>> >
>>> > On Wed, Feb 28, 2018 at 11:51 AM, Rob Clark  wrote:
>>> >>
>>> >> Signed-off-by: Rob Clark 
>>> >> ---
>>> >>  src/compiler/nir_types.cpp | 12 
>>> >>  src/compiler/nir_types.h   |  4 
>>> >>  2 files changed, 16 insertions(+)
>>> >>
>>> >> diff --git a/src/compiler/nir_types.cpp b/src/compiler/nir_types.cpp
>>> >> index cbdd452dc81..0085a19248a 100644
>>> >> --- a/src/compiler/nir_types.cpp
>>> >> +++ b/src/compiler/nir_types.cpp
>>> >> @@ -117,6 +117,18 @@ glsl_get_aoa_size(const struct glsl_type *type)
>>> >> return type->arrays_of_arrays_size();
>>> >>  }
>>> >>
>>> >> +unsigned
>>> >> +glsl_std430_size(const struct glsl_type *type, bool row_major)
>>> >> +{
>>> >> +   return type->std430_size(row_major);
>>> >> +}
>>> >> +
>>> >> +unsigned
>>> >> +glsl_std430_base_alignment(const struct glsl_type *type, bool
>>> >> row_major)
>>> >> +{
>>> >> +   return type->std430_base_alignment(row_major);
>>> >> +}
>>> >> +
>>> >>  unsigned
>>> >>  glsl_count_attribute_slots(const struct glsl_type *type,
>>> >> bool is_vertex_input)
>>> >> diff --git a/src/compiler/nir_types.h b/src/compiler/nir_types.h
>>> >> index e2dfd1ef5b7..5b5e09d137f 100644
>>> >> --- a/src/compiler/nir_types.h
>>> >> +++ b/src/compiler/nir_types.h
>>> >> @@ -71,6 +71,10 @@ unsigned glsl_get_length(const struct glsl_type
>>> >> *type);
>>> >>
>>> >>  unsigned glsl_get_aoa_size(const struct glsl_type *type);
>>> >>
>>> >> +unsigned glsl_std430_size(const struct glsl_type *type, bool
>>> >> row_major);
>>> >> +
>>> >> +unsigned glsl_std430_base_alignment(const struct glsl_type *type, bool
>>> >> row_major);
>>> >> +
>>> >>  unsigned glsl_count_attribute_slots(const struct glsl_type *type,
>>> >>  bool is_vertex_input);
>>> >>
>>> >> --
>>> >> 2.14.3
>>> >>
>>> >> ___
>>> >> mesa-dev mailing list
>>> >> mesa-dev@lists.freedesktop.org
>>> >> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>>> >
>>> >
>>
>>
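
The rule described in this thread, that OpenCL kernel arguments are aligned in memory by the size of their type, can be sketched as follows. This is a hedged illustration only: it assumes every argument size is a power of two, it does not cover structs (which, as noted in the thread, may follow rules closer to standard C), and it is not claimed to match how clover lays out its input buffer. The function names are hypothetical.

```c
#include <stddef.h>

/* Round offset up to a multiple of align; align must be a power of two. */
static size_t align_up(size_t offset, size_t align)
{
    return (offset + align - 1) & ~(align - 1);
}

/* Assign a byte offset to each of n arguments, aligning every argument to
 * its own size. Returns the total number of bytes used. */
static size_t layout_args(const size_t *sizes, size_t n, size_t *offsets)
{
    size_t cursor = 0;

    for (size_t i = 0; i < n; i++) {
        offsets[i] = align_up(cursor, sizes[i]);
        cursor = offsets[i] + sizes[i];
    }
    return cursor;
}
```

For argument sizes of 1, 4, 8 and 2 bytes this places the arguments at offsets 0, 4, 8 and 16, for 18 bytes in total.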

