[Mesa-dev] [PATCH 1/1] radv: consider MESA_VK_VERSION_OVERRIDE when setting the api version

2019-04-24 Thread Eleni Maria Stea
Before setting the physical device API version, we should check if the
MESA_VK_VERSION_OVERRIDE environment variable is set and take it into
account.
---
 src/amd/vulkan/radv_extensions.py | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/src/amd/vulkan/radv_extensions.py b/src/amd/vulkan/radv_extensions.py
index 9743ce1a774..8f29f4ca40f 100644
--- a/src/amd/vulkan/radv_extensions.py
+++ b/src/amd/vulkan/radv_extensions.py
@@ -333,9 +333,13 @@ VkResult radv_EnumerateInstanceVersion(
 uint32_t
 radv_physical_device_api_version(struct radv_physical_device *dev)
 {
+uint32_t override = vk_get_version_override();
+uint32_t version = VK_MAKE_VERSION(1, 0, 68);
+
 if (!ANDROID && dev->rad_info.has_syncobj_wait_for_submit)
-return ${MAX_API_VERSION.c_vk_version()};
-return VK_MAKE_VERSION(1, 0, 68);
+version = ${MAX_API_VERSION.c_vk_version()};
+
+return override ? MIN2(override, version) : version;
 }
 """)
 
-- 
2.20.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
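
As a rough illustration of the clamping in the radv patch above, the sketch below reproduces the selection logic as a standalone function. The names `pick_api_version` and `MAKE_VERSION` are hypothetical stand-ins (the driver itself uses `vk_get_version_override()`, `VK_MAKE_VERSION` and `MIN2`), and the override is passed in as a parameter instead of being read from the environment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Same bit layout as VK_MAKE_VERSION: major << 22 | minor << 12 | patch. */
#define MAKE_VERSION(major, minor, patch) \
   ((((uint32_t)(major)) << 22) | (((uint32_t)(minor)) << 12) | ((uint32_t)(patch)))

/* Hypothetical helper mirroring the patched function. An override of 0
 * means "not set"; otherwise the result never exceeds what the device
 * would report on its own.
 */
static uint32_t
pick_api_version(uint32_t override, uint32_t max_version, bool can_use_max)
{
   uint32_t version = MAKE_VERSION(1, 0, 68);

   /* Assumption of this sketch: the maximum is never below the fallback. */
   assert(max_version >= version);

   if (can_use_max)
      version = max_version;

   if (override == 0)
      return version;

   return override < version ? override : version;
}
```

Under these assumptions an override of 1.2.0 on a device capped at 1.1.90 still reports 1.1.90, while an override of 1.0.0 is honored.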

Re: [Mesa-dev] [PATCH v5 08/11] anv: Added support for dynamic sample locations

2019-03-15 Thread Eleni Maria Stea
On Thu, 14 Mar 2019 20:00:45 -0500
Jason Ekstrand  wrote:
> >
> >  extern const struct anv_dynamic_state default_dynamic_state;
> > diff --git a/src/intel/vulkan/genX_cmd_buffer.c
> > b/src/intel/vulkan/genX_cmd_buffer.c
> > index 7687507e6b7..5d2b17cf8ae 100644
> > --- a/src/intel/vulkan/genX_cmd_buffer.c
> > +++ b/src/intel/vulkan/genX_cmd_buffer.c
> > @@ -2796,6 +2796,17 @@ genX(cmd_buffer_flush_state)(struct
> > anv_cmd_buffer *cmd_buffer)
> >ANV_CMD_DIRTY_RENDER_TARGETS))
> >gen7_cmd_buffer_emit_scissor(cmd_buffer);
> >
> > +   if (cmd_buffer->state.gfx.dynamic.sample_locations.valid) {
> > +  uint32_t samples = cmd_buffer->state.gfx.dynamic.sample_locations.samples;
> > +  const VkSampleLocationEXT *locations =
> > + cmd_buffer->state.gfx.dynamic.sample_locations.locations;
> > +  genX(emit_multisample)(&cmd_buffer->batch, samples, locations);
> > +#if GEN_GEN >= 8
> > +  genX(emit_sample_pattern)(&cmd_buffer->batch, samples, locations);
> > +#endif
> > +  cmd_buffer->state.gfx.dynamic.sample_locations.valid = false;
> >  
> 
> I'm not sure this is actually going to be correct.  With dynamic
> state, you're required to set it before you use it.  With pipeline
> state, it gets set every time the pipeline is bound.  Effectively,
> the pipeline state is a big bag of dynamic state.  With both of
> these, however, there are no defaults and you're required to bind a
> pipeline containing the state or explicitly set it on the command
> buffer before it gets used. VK_EXT_sample_locations is different
> though because it does have a default.  So the question I'm coming
> around to is: When does the default get applied?  The only sensible
> thing I can think of is at the top of the command buffer or maybe the
> top of the subpass.  If this is the case, then we need to emit sample
> positions at the start of every subpass.  Does the spec talk about
> this at all?
> 
> --Jason
> 
> >  

Hi Jason,

If I understand correctly (sorry if I misunderstood), you want to make
sure that in every case we have some locations set, either the default
or custom ones, and that we emit the default locations when a pipeline
(or subpass, but we don't have locations per subpass, only per pipeline)
is bound after a pipeline for which custom locations were set dynamically?

I didn't find any reference to this problem in the spec. The solution I
came up with was to use the bool custom_locations to decide in
emit_ms_state whether to emit the custom locations or the default ones,
and to always emit some locations when the extension is enabled.

So:

When emit_ms_state is called for each new pipeline, we check 1- whether
custom_locations is true and 2- whether the flag for the dynamic
locations (VK_DYNAMIC_STATE_SAMPLE_LOCATIONS_EXT) is not set. If both
apply, we emit the custom locations. In every other case (when the
locations are not enabled, or when the dynamic state flag is set) we
emit the default locations.

This way, in the non-dynamic case (no VK_DYNAMIC_STATE... set):
If a pipeline has the locations enabled, we emit the custom.
If a pipeline doesn't have locations enabled, we emit the default.
So, we always have some locations set for the pipeline.

Similarly, in the dynamic case:
When the user sets locations with vkCmdSetSampleLocationsEXT, we use
these locations.
As we have locations per pipeline, not per subpass
(variableSampleLocations = false), the next pipeline that is bound will
have either custom locations or the default ones (both emitted by
emit_ms_state), unless the DYNAMIC_STATE flag is set and the user calls
vkCmdSetSampleLocationsEXT again to override the default, in which case
we set the user's locations.
So, again, we always have some locations set (and these will be the
default when the user doesn't chain the locations info struct, or
disables the locations, or sets the DYNAMIC flag but doesn't override
the locations with vkCmdSetSampleLocationsEXT).

So, I think we are fine.
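
To make the cases above easier to check, here is a minimal sketch of the decision as I described it, not the driver code itself. The enum and both function names are hypothetical; the second function follows the wording above, where the user's locations only apply when the DYNAMIC flag is set and vkCmdSetSampleLocationsEXT has been called.

```c
#include <assert.h>
#include <stdbool.h>

enum locations_source {
   LOCATIONS_DEFAULT,   /* standard locations */
   LOCATIONS_PIPELINE,  /* custom locations chained to the pipeline */
   LOCATIONS_DYNAMIC,   /* locations set with vkCmdSetSampleLocationsEXT */
};

/* What emit_ms_state would emit when the pipeline is created. */
static enum locations_source
pipeline_locations(bool custom_locations, bool dynamic_flag_set)
{
   if (custom_locations && !dynamic_flag_set)
      return LOCATIONS_PIPELINE;

   return LOCATIONS_DEFAULT;
}

/* What is in effect at draw time, after an optional dynamic override. */
static enum locations_source
locations_at_draw(bool custom_locations, bool dynamic_flag_set,
                  bool user_set_locations)
{
   if (dynamic_flag_set && user_set_locations)
      return LOCATIONS_DYNAMIC;

   return pipeline_locations(custom_locations, dynamic_flag_set);
}
```

Every combination of the three inputs maps to exactly one source, which is the "we always have some locations set" property.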

If we didn't always emit the default locations, pipelines created after
setting locations in either of the two possible ways would have garbage
locations set.

I verified that this happens as follows: if you don't make use of the
bool custom_locations and run all the multisample.* Vulkan CTS tests at
once, you will notice that tests that don't set the sample locations in
the pipeline, and that run after tests that do set them, fail. I spotted
the following failures:

dEQP-VK.glsl.builtin_var.fragcoord_msaa.*
dEQP-VK.pipeline.multisample.sample_mask_with_depth_test.samples_.*
dEQP-VK.pipeline.multisample.sample_mask_with_depth_test.samples_.*_post_depth_coverage
dEQP-VK.pipeline.multisample_interpolation.sample_interpolate_at_distinct_values.128_128_1.samples_.*
dEQP-VK.pipeline.multisample_interpolation.sample_interpolate_at_distinct_values.137_191_1.samples_.*
dEQP-VK.pipeline.multisample_interpolation.sample_qualifier_distinct_values.128_128_1.samples_.*

[Mesa-dev] [PATCH v4 3/9] anv: Implemented the vkGetPhysicalDeviceMultisamplePropertiesEXT

2019-03-14 Thread Eleni Maria Stea
Implemented the vkGetPhysicalDeviceMultisamplePropertiesEXT according to
the Vulkan Specification section [36.2. Additional Multisampling
Capabilities].

v2: 1- Moved the vkGetPhysicalDeviceMultisamplePropertiesEXT from the
   anv_sample_locations.c to the anv_device.c (Jason Ekstrand)
2- Simplified the code that sets the grid size (Jason Ekstrand)
3- Instead of filling the whole struct, we only fill the parts we
   should override (sType, grid size) and we call
   anv_debug_ignored_stype on any pNext elements (Jason Ekstrand)
---
 src/intel/vulkan/anv_device.c | 25 +
 1 file changed, 25 insertions(+)

diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
index 52ea058bdd5..0bfff7e0b30 100644
--- a/src/intel/vulkan/anv_device.c
+++ b/src/intel/vulkan/anv_device.c
@@ -3557,6 +3557,31 @@ VkResult anv_GetCalibratedTimestampsEXT(
return VK_SUCCESS;
 }
 
+void
+anv_GetPhysicalDeviceMultisamplePropertiesEXT(VkPhysicalDevice physicalDevice,
+  VkSampleCountFlagBits samples,
+  VkMultisamplePropertiesEXT
+  *pMultisampleProperties)
+{
+   ANV_FROM_HANDLE(anv_physical_device, physical_device, physicalDevice);
+
+   VkExtent2D grid_size;
+   if (samples & isl_device_get_sample_counts(&physical_device->isl_dev)) {
+  grid_size.width = 1;
+  grid_size.height = 1;
+   } else {
+  grid_size.width = 0;
+  grid_size.height = 0;
+   }
+
+   pMultisampleProperties->sType =
+  VK_STRUCTURE_TYPE_MULTISAMPLE_PROPERTIES_EXT;
+   pMultisampleProperties->maxSampleLocationGridSize = grid_size;
+
+   vk_foreach_struct(ext, pMultisampleProperties->pNext)
+  anv_debug_ignored_stype(ext->sType);
+}
+
 /* vk_icd.h does not declare this function, so we declare it here to
  * suppress Wmissing-prototypes.
  */
-- 
2.20.1
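
The grid-size rule in this patch can be sketched on its own as follows. This is only an illustration: `struct extent2d` and `max_sample_location_grid_size` are hypothetical names, and the supported-count mask is passed in directly instead of being queried from isl.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for VkExtent2D. */
struct extent2d {
   uint32_t width;
   uint32_t height;
};

/* Sample counts follow the VkSampleCountFlagBits convention, where the
 * value of each bit equals the sample count (4 samples == 0x4).
 * Supported counts report a 1x1 grid, unsupported ones report 0x0.
 */
static struct extent2d
max_sample_location_grid_size(uint32_t samples, uint32_t supported_counts)
{
   struct extent2d grid = { 0, 0 };

   if (samples & supported_counts) {
      grid.width = 1;
      grid.height = 1;
   }

   return grid;
}
```

With a mask of 1 | 2 | 4 | 8 | 16, a query for 4 samples returns 1x1 and a query for 32 samples returns 0x0.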


[Mesa-dev] [PATCH v4 7/9] anv: Optimized the emission of the default locations on Gen8+

2019-03-14 Thread Eleni Maria Stea
We only emit sample locations when the extension is enabled by the user.
In all other cases the default locations are emitted once when the device
is initialized to increase performance.
---
 src/intel/vulkan/anv_genX.h|  3 ++-
 src/intel/vulkan/genX_cmd_buffer.c |  2 +-
 src/intel/vulkan/genX_pipeline.c   | 13 -
 src/intel/vulkan/genX_state.c  |  8 +---
 4 files changed, 16 insertions(+), 10 deletions(-)

diff --git a/src/intel/vulkan/anv_genX.h b/src/intel/vulkan/anv_genX.h
index 82fe5cc93bf..f28ee0b1a76 100644
--- a/src/intel/vulkan/anv_genX.h
+++ b/src/intel/vulkan/anv_genX.h
@@ -93,4 +93,5 @@ void genX(emit_ms_state)(struct anv_batch *batch,
  const VkSampleLocationEXT *sl,
  uint32_t num_samples,
  uint32_t log2_samples,
- bool custom_sample_locations);
+ bool custom_sample_locations,
+ bool sample_locations_ext_enabled);
diff --git a/src/intel/vulkan/genX_cmd_buffer.c 
b/src/intel/vulkan/genX_cmd_buffer.c
index 57dd94bfbd7..63913dd0668 100644
--- a/src/intel/vulkan/genX_cmd_buffer.c
+++ b/src/intel/vulkan/genX_cmd_buffer.c
@@ -2651,7 +2651,7 @@ cmd_buffer_emit_sample_locations(struct anv_cmd_buffer *cmd_buffer)
 
genX(emit_ms_state)(&cmd_buffer->batch,
dyn_state->sample_locations.positions,
-   samples, log2_samples, true);
+   samples, log2_samples, true, true);
 }
 
 void
diff --git a/src/intel/vulkan/genX_pipeline.c b/src/intel/vulkan/genX_pipeline.c
index 21b21a719da..1245090386c 100644
--- a/src/intel/vulkan/genX_pipeline.c
+++ b/src/intel/vulkan/genX_pipeline.c
@@ -572,10 +572,12 @@ emit_sample_mask(struct anv_pipeline *pipeline,
 }
 
 static void
-emit_ms_state(struct anv_pipeline *pipeline,
+emit_ms_state(struct anv_device *device,
+  struct anv_pipeline *pipeline,
   const VkPipelineMultisampleStateCreateInfo *info,
   const VkPipelineDynamicStateCreateInfo *dinfo)
 {
+   bool sample_loc_enabled = device->enabled_extensions.EXT_sample_locations;
VkSampleLocationsInfoEXT *sl;
bool custom_locations = false;
uint32_t samples = 1;
@@ -586,7 +588,7 @@ emit_ms_state(struct anv_pipeline *pipeline,
if (info) {
   samples = info->rasterizationSamples;
 
-  if (info->pNext) {
+  if (sample_loc_enabled && info->pNext) {
  VkPipelineSampleLocationsStateCreateInfoEXT *slinfo =
 (VkPipelineSampleLocationsStateCreateInfoEXT *)info->pNext;
 
@@ -613,8 +615,8 @@ emit_ms_state(struct anv_pipeline *pipeline,
   log2_samples = __builtin_ffs(samples) - 1;
}
 
-   genX(emit_ms_state)(&pipeline->batch, sl->pSampleLocations, samples, log2_samples,
-   custom_locations);
+   genX(emit_ms_state)(&pipeline->batch, sl->pSampleLocations, samples,
+   log2_samples, custom_locations, sample_loc_enabled);
 }
 
 static const uint32_t vk_to_gen_logic_op[] = {
@@ -1944,7 +1946,8 @@ genX(graphics_pipeline_create)(
assert(pCreateInfo->pRasterizationState);
emit_rs_state(pipeline, pCreateInfo->pRasterizationState,
  pCreateInfo->pMultisampleState, pass, subpass);
-   emit_ms_state(pipeline, pCreateInfo->pMultisampleState, pCreateInfo->pDynamicState);
+   emit_ms_state(device, pipeline, pCreateInfo->pMultisampleState,
+ pCreateInfo->pDynamicState);
emit_ds_state(pipeline, pCreateInfo->pDepthStencilState, pass, subpass);
emit_cb_state(pipeline, pCreateInfo->pColorBlendState,
pCreateInfo->pMultisampleState);
diff --git a/src/intel/vulkan/genX_state.c b/src/intel/vulkan/genX_state.c
index 9b05506f3af..6e13001b74f 100644
--- a/src/intel/vulkan/genX_state.c
+++ b/src/intel/vulkan/genX_state.c
@@ -568,12 +568,14 @@ genX(emit_ms_state)(struct anv_batch *batch,
 const VkSampleLocationEXT *sl,
 uint32_t num_samples,
 uint32_t log2_samples,
-bool custom_sample_locations)
+bool custom_sample_locations,
+bool sample_locations_ext_enabled)
 {
emit_multisample(batch, sl, num_samples, log2_samples,
 custom_sample_locations);
 #if GEN_GEN >= 8
-   emit_sample_locations(batch, sl, num_samples,
- custom_sample_locations);
+   if (sample_locations_ext_enabled)
+  emit_sample_locations(batch, sl, num_samples,
+custom_sample_locations);
 #endif
 }
-- 
2.20.1


[Mesa-dev] [PATCH v4 8/9] anv: Removed unused header file

2019-03-14 Thread Eleni Maria Stea
In src/intel/vulkan/genX_blorp_exec.c we included the file
common/gen_sample_positions.h but did not use it. Removed.

Reviewed-by: Sagar Ghuge 
Reviewed-by: Jason Ekstrand 
---
 src/intel/vulkan/genX_blorp_exec.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/src/intel/vulkan/genX_blorp_exec.c b/src/intel/vulkan/genX_blorp_exec.c
index e9c85d56d5f..0eeefaaa9d6 100644
--- a/src/intel/vulkan/genX_blorp_exec.c
+++ b/src/intel/vulkan/genX_blorp_exec.c
@@ -31,7 +31,6 @@
 #undef __gen_combine_address
 
 #include "common/gen_l3_config.h"
-#include "common/gen_sample_positions.h"
 #include "blorp/blorp_genX_exec.h"
 
 static void *
-- 
2.20.1


[Mesa-dev] [PATCH v4 6/9] anv: Added support for dynamic and non-dynamic sample locations on Gen7

2019-03-14 Thread Eleni Maria Stea
Allow setting dynamic and non-dynamic sample locations on Gen7.

v2: Similarly to the previous patches, removed structs and functions
that were used to sort and store the sorted sample positions (Jason
Ekstrand)
---
 src/intel/vulkan/anv_genX.h| 13 ++---
 src/intel/vulkan/genX_cmd_buffer.c |  9 ++--
 src/intel/vulkan/genX_pipeline.c   | 13 +
 src/intel/vulkan/genX_state.c  | 86 +-
 4 files changed, 70 insertions(+), 51 deletions(-)

diff --git a/src/intel/vulkan/anv_genX.h b/src/intel/vulkan/anv_genX.h
index 5c618ab..82fe5cc93bf 100644
--- a/src/intel/vulkan/anv_genX.h
+++ b/src/intel/vulkan/anv_genX.h
@@ -89,11 +89,8 @@ void genX(cmd_buffer_mi_memset)(struct anv_cmd_buffer 
*cmd_buffer,
 void genX(blorp_exec)(struct blorp_batch *batch,
   const struct blorp_params *params);
 
-void genX(emit_multisample)(struct anv_batch *batch,
-uint32_t samples,
-uint32_t log2_samples);
-
-void genX(emit_sample_locations)(struct anv_batch *batch,
- const VkSampleLocationEXT *sl,
- uint32_t num_samples,
- bool custom_locations);
+void genX(emit_ms_state)(struct anv_batch *batch,
+ const VkSampleLocationEXT *sl,
+ uint32_t num_samples,
+ uint32_t log2_samples,
+ bool custom_sample_locations);
diff --git a/src/intel/vulkan/genX_cmd_buffer.c 
b/src/intel/vulkan/genX_cmd_buffer.c
index 5d7c9b51a84..57dd94bfbd7 100644
--- a/src/intel/vulkan/genX_cmd_buffer.c
+++ b/src/intel/vulkan/genX_cmd_buffer.c
@@ -2642,7 +2642,6 @@ cmd_buffer_flush_push_constants(struct anv_cmd_buffer 
*cmd_buffer,
 static void
 cmd_buffer_emit_sample_locations(struct anv_cmd_buffer *cmd_buffer)
 {
-#if GEN_GEN >= 8
struct anv_dynamic_state *dyn_state = &cmd_buffer->state.gfx.dynamic;
uint32_t samples = dyn_state->sample_locations.num_samples;
uint32_t log2_samples;
@@ -2650,11 +2649,9 @@ cmd_buffer_emit_sample_locations(struct anv_cmd_buffer 
*cmd_buffer)
assert(samples > 0);
log2_samples = __builtin_ffs(samples) - 1;
 
-   genX(emit_multisample)(&cmd_buffer->batch, samples, log2_samples);
-   genX(emit_sample_locations)(&cmd_buffer->batch,
-   dyn_state->sample_locations.positions,
-   samples, true);
-#endif
+   genX(emit_ms_state)(&cmd_buffer->batch,
+   dyn_state->sample_locations.positions,
+   samples, log2_samples, true);
 }
 
 void
diff --git a/src/intel/vulkan/genX_pipeline.c b/src/intel/vulkan/genX_pipeline.c
index ada022620d1..21b21a719da 100644
--- a/src/intel/vulkan/genX_pipeline.c
+++ b/src/intel/vulkan/genX_pipeline.c
@@ -576,11 +576,8 @@ emit_ms_state(struct anv_pipeline *pipeline,
   const VkPipelineMultisampleStateCreateInfo *info,
   const VkPipelineDynamicStateCreateInfo *dinfo)
 {
-#if GEN_GEN >= 8
VkSampleLocationsInfoEXT *sl;
bool custom_locations = false;
-#endif
-
uint32_t samples = 1;
uint32_t log2_samples = 0;
 
@@ -589,7 +586,6 @@ emit_ms_state(struct anv_pipeline *pipeline,
if (info) {
   samples = info->rasterizationSamples;
 
-#if GEN_GEN >= 8
   if (info->pNext) {
  VkPipelineSampleLocationsStateCreateInfoEXT *slinfo =
 (VkPipelineSampleLocationsStateCreateInfoEXT *)info->pNext;
@@ -613,17 +609,12 @@ emit_ms_state(struct anv_pipeline *pipeline,
 }
  }
   }
-#endif
 
   log2_samples = __builtin_ffs(samples) - 1;
}
 
-   genX(emit_multisample(&pipeline->batch, samples, log2_samples));
-
-#if GEN_GEN >= 8
-   genX(emit_sample_locations)(&pipeline->batch, sl->pSampleLocations,
-   samples, custom_locations);
-#endif
+   genX(emit_ms_state)(&pipeline->batch, sl->pSampleLocations, samples, log2_samples,
+   custom_locations);
 }
 
 static const uint32_t vk_to_gen_logic_op[] = {
diff --git a/src/intel/vulkan/genX_state.c b/src/intel/vulkan/genX_state.c
index 4fdb74111a5..9b05506f3af 100644
--- a/src/intel/vulkan/genX_state.c
+++ b/src/intel/vulkan/genX_state.c
@@ -436,10 +436,12 @@ VkResult genX(CreateSampler)(
return VK_SUCCESS;
 }
 
-void
-genX(emit_multisample)(struct anv_batch *batch,
-   uint32_t samples,
-   uint32_t log2_samples)
+static void
+emit_multisample(struct anv_batch *batch,
+ const VkSampleLocationEXT *sl,
+ uint32_t samples,
+ uint32_t log2_samples,
+ bool custom_locations)
 {
anv_batch_emit(batch, GENX(3DSTATE_MULTISAMPLE), ms) {
   ms.NumberofMultisamples = log2_samples;
@@ -452,31 +454,51 @@ genX(emit_multisample)(struct anv_batch *batch,
*/
   ms.PixelPositionOffsetEnable  = false;
 #else
-  switch (samples) {
-  case 1:
- 
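
The hunks above compute `log2_samples` as `__builtin_ffs(samples) - 1`. A minimal sketch of why that works for sample counts follows; `log2_sample_count` is a hypothetical helper name, and `__builtin_ffs` is a GCC/Clang builtin, so this assumes one of those compilers.

```c
#include <assert.h>
#include <stdint.h>

/* For a power of two, the 1-based index of the lowest set bit minus one
 * is exactly log2 of the value. Sample counts (1, 2, 4, 8, 16) are
 * always powers of two, which the assert makes explicit.
 */
static uint32_t
log2_sample_count(uint32_t samples)
{
   assert(samples > 0 && (samples & (samples - 1)) == 0);

   return (uint32_t)__builtin_ffs((int)samples) - 1;
}
```

For a value that is not a power of two the expression would return the index of the lowest set bit instead of a logarithm, which is why the sketch asserts on its input.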

[Mesa-dev] [PATCH v4 9/9] anv: Enabled the VK_EXT_sample_locations extension

2019-03-14 Thread Eleni Maria Stea
Enabled the VK_EXT_sample_locations for Intel Gen >= 7.

v2: Replaced device.info->gen >= 7 with True, as Anv doesn't support
anything below Gen7. (Lionel Landwerlin)

Reviewed-by: Sagar Ghuge 
---
 src/intel/vulkan/anv_extensions.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/intel/vulkan/anv_extensions.py 
b/src/intel/vulkan/anv_extensions.py
index 9e4e03e46df..5a30c733c5c 100644
--- a/src/intel/vulkan/anv_extensions.py
+++ b/src/intel/vulkan/anv_extensions.py
@@ -129,7 +129,7 @@ EXTENSIONS = [
 Extension('VK_EXT_inline_uniform_block',  1, True),
 Extension('VK_EXT_pci_bus_info',  2, True),
 Extension('VK_EXT_post_depth_coverage',   1, 'device->info.gen >= 9'),
-Extension('VK_EXT_sample_locations',  1, False),
+Extension('VK_EXT_sample_locations',  1, True),
 Extension('VK_EXT_sampler_filter_minmax', 1, 'device->info.gen >= 9'),
 Extension('VK_EXT_scalar_block_layout',   1, True),
 Extension('VK_EXT_shader_stencil_export', 1, 'device->info.gen >= 9'),
-- 
2.20.1


[Mesa-dev] [PATCH v4 4/9] anv: Added support for non-dynamic sample locations on Gen8+

2019-03-14 Thread Eleni Maria Stea
Allowing the user to set custom sample locations non-dynamically, by
filling the extension structs and chaining them to the pipeline structs
according to the Vulkan specification section [26.5. Custom Sample
Locations] for the following structures:

'VkPipelineSampleLocationsStateCreateInfoEXT'
'VkSampleLocationsInfoEXT'
'VkSampleLocationEXT'

Once custom locations are used, the default locations are lost and need
to be re-emitted again in the next pipeline creation. For that, we emit
the 3DSTATE_SAMPLE_PATTERN at every pipeline creation.

v2: In v1, we used the custom anv_sample struct to store the location
and the distance from the pixel center because we would then use
this distance to sort the locations and send them in monotonically
increasing order to the GPU. That was because the Skylake PRM Vol.
2a "3DSTATE_SAMPLE_PATTERN" says that the samples must have
monotonically increasing distance from the pixel center to get the
correct centroid computation in the device. However, the Vulkan
spec seems to require that the samples occur in the order provided
through the API and this requirement is only for the standard
locations. As long as this only affects centroid calculations as
the docs say, we should be ok because OpenGL and Vulkan only
require that the centroid be some lit sample and that it's the same
for all samples in a pixel; they have no requirement that it be the
one closest to center. (Jason Ekstrand)
For that we made the following changes:
1- We removed the custom structs and functions from anv_private.h
   and anv_sample_locations.h and anv_sample_locations.c (the last
   two files were removed). (Jason Ekstrand)
2- We modified the macros used to take also the array as parameter
   and we renamed them to start by GEN_. (Jason Ekstrand)
3- We don't sort the samples anymore. (Jason Ekstrand)
---
 src/intel/common/gen_sample_positions.h | 57 ++
 src/intel/vulkan/anv_genX.h |  5 ++
 src/intel/vulkan/anv_private.h  |  1 +
 src/intel/vulkan/genX_pipeline.c| 79 +
 src/intel/vulkan/genX_state.c   | 72 ++
 5 files changed, 201 insertions(+), 13 deletions(-)

diff --git a/src/intel/common/gen_sample_positions.h 
b/src/intel/common/gen_sample_positions.h
index da48dcb5ed0..850661931cf 100644
--- a/src/intel/common/gen_sample_positions.h
+++ b/src/intel/common/gen_sample_positions.h
@@ -160,4 +160,61 @@ prefix##14YOffset  = 0.9375; \
 prefix##15XOffset  = 0.0625; \
 prefix##15YOffset  = 0.;
 
+/* Examples:
+ * in case of GEN_GEN < 8:
+ * GEN_SAMPLE_POS_ELEM(ms.Sample, info->pSampleLocations, 0); expands to:
+ *ms.Sample0XOffset = info->pSampleLocations[0].x;
+ *ms.Sample0YOffset = info->pSampleLocations[0].y;
+ *
+ * in case of GEN_GEN >= 8:
+ * GEN_SAMPLE_POS_ELEM(sp._16xSample, info->pSampleLocations, 0); expands to:
+ *sp._16xSample0XOffset = info->pSampleLocations[0].x;
+ *sp._16xSample0YOffset = info->pSampleLocations[0].y;
+ */
+
+#define GEN_SAMPLE_POS_ELEM(prefix, arr, sample_idx) \
+prefix##sample_idx##XOffset = arr[sample_idx].x; \
+prefix##sample_idx##YOffset = arr[sample_idx].y;
+
+#define GEN_SAMPLE_POS_1X_ARRAY(prefix, arr)\
+GEN_SAMPLE_POS_ELEM(prefix, arr, 0);
+
+#define GEN_SAMPLE_POS_2X_ARRAY(prefix, arr) \
+GEN_SAMPLE_POS_ELEM(prefix, arr, 0); \
+GEN_SAMPLE_POS_ELEM(prefix, arr, 1);
+
+#define GEN_SAMPLE_POS_4X_ARRAY(prefix, arr) \
+GEN_SAMPLE_POS_ELEM(prefix, arr, 0); \
+GEN_SAMPLE_POS_ELEM(prefix, arr, 1); \
+GEN_SAMPLE_POS_ELEM(prefix, arr, 2); \
+GEN_SAMPLE_POS_ELEM(prefix, arr, 3);
+
+#define GEN_SAMPLE_POS_8X_ARRAY(prefix, arr) \
+GEN_SAMPLE_POS_ELEM(prefix, arr, 0); \
+GEN_SAMPLE_POS_ELEM(prefix, arr, 1); \
+GEN_SAMPLE_POS_ELEM(prefix, arr, 2); \
+GEN_SAMPLE_POS_ELEM(prefix, arr, 3); \
+GEN_SAMPLE_POS_ELEM(prefix, arr, 4); \
+GEN_SAMPLE_POS_ELEM(prefix, arr, 5); \
+GEN_SAMPLE_POS_ELEM(prefix, arr, 6); \
+GEN_SAMPLE_POS_ELEM(prefix, arr, 7);
+
+#define GEN_SAMPLE_POS_16X_ARRAY(prefix, arr) \
+GEN_SAMPLE_POS_ELEM(prefix, arr, 0); \
+GEN_SAMPLE_POS_ELEM(prefix, arr, 1); \
+GEN_SAMPLE_POS_ELEM(prefix, arr, 2); \
+GEN_SAMPLE_POS_ELEM(prefix, arr, 3); \
+GEN_SAMPLE_POS_ELEM(prefix, arr, 4); \
+GEN_SAMPLE_POS_ELEM(prefix, arr, 5); \
+GEN_SAMPLE_POS_ELEM(prefix, arr, 6); \
+GEN_SAMPLE_POS_ELEM(prefix, arr, 7); \
+GEN_SAMPLE_POS_ELEM(prefix, arr, 8); \
+GEN_SAMPLE_POS_ELEM(prefix, arr, 9); \
+GEN_SAMPLE_POS_ELEM(prefix, arr, 10); \
+GEN_SAMPLE_POS_ELEM(prefix, arr, 11); \
+GEN_SAMPLE_POS_ELEM(prefix, arr, 12); \
+GEN_SAMPLE_POS_ELEM(prefix, arr, 13); \
+GEN_SAMPLE_POS_ELEM(prefix, arr, 14); \
+GEN_SAMPLE_POS_ELEM(prefix, arr, 15);
+
 #endif /* GEN_SAMPLE_POSITIONS_H */
diff --git a/src/intel/vulkan/anv_genX.h b/src/intel/vulkan/anv_genX.h
index 8fd32cabf1e..fb7419b6347 100644
--- a/src/intel/vulkan/anv_genX.h
+++ b/src/intel/vulkan/anv_genX.h
@@ -88,3 +88,8 @@ void 
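
The token pasting done by GEN_SAMPLE_POS_ELEM can be tried outside the driver with a mock struct. In the sketch below the two macros are copied from the patch, while `struct sample_location`, `struct mock_multisample` and `fill_2x` are hypothetical stand-ins for VkSampleLocationEXT, the genxml-generated struct and the caller.

```c
#include <assert.h>

/* Stand-in for VkSampleLocationEXT (two floats, x and y). */
struct sample_location {
   float x;
   float y;
};

/* Mock of a generated struct with numbered per-sample fields. */
struct mock_multisample {
   float Sample0XOffset, Sample0YOffset;
   float Sample1XOffset, Sample1YOffset;
};

#define GEN_SAMPLE_POS_ELEM(prefix, arr, sample_idx) \
prefix##sample_idx##XOffset = arr[sample_idx].x; \
prefix##sample_idx##YOffset = arr[sample_idx].y;

#define GEN_SAMPLE_POS_2X_ARRAY(prefix, arr) \
GEN_SAMPLE_POS_ELEM(prefix, arr, 0); \
GEN_SAMPLE_POS_ELEM(prefix, arr, 1);

/* Copies two locations into the numbered fields, in API order. */
static struct mock_multisample
fill_2x(const struct sample_location *locations)
{
   struct mock_multisample ms = { 0 };

   /* Expands to ms.Sample0XOffset = locations[0].x; and so on. */
   GEN_SAMPLE_POS_2X_ARRAY(ms.Sample, locations);

   return ms;
}
```

The sample index is pasted into the field name and also used as the array index, so the locations land in the fields in the order the application provided them.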

[Mesa-dev] [PATCH v4 5/9] anv: Added support for dynamic sample locations on Gen8+

2019-03-14 Thread Eleni Maria Stea
Added support for setting the locations when the pipeline has been
created with the dynamic state bit enabled according to the Vulkan
Specification section [26.5. Custom Sample Locations] for the function:

'vkCmdSetSampleLocationsEXT'

The reason that we preferred to store the boolean valid inside the
dynamic state struct for locations instead of using a dirty bit
(ANV_CMD_DIRTY_SAMPLE_LOCATIONS for example) is that other functions
can modify the value of the dirty bits causing unexpected behavior.

v2: Removed all the anv* structs used with sample locations to store
the locations in order for the dynamic case (see also the patch for
the non-dynamic case). (Jason Ekstrand)
---
 src/intel/vulkan/anv_cmd_buffer.c  | 19 ++
 src/intel/vulkan/anv_genX.h|  4 +++
 src/intel/vulkan/anv_private.h |  6 +
 src/intel/vulkan/genX_cmd_buffer.c | 24 ++
 src/intel/vulkan/genX_pipeline.c   | 40 +-
 src/intel/vulkan/genX_state.c  | 36 +++
 6 files changed, 90 insertions(+), 39 deletions(-)

diff --git a/src/intel/vulkan/anv_cmd_buffer.c 
b/src/intel/vulkan/anv_cmd_buffer.c
index 1b34644a434..866cd03b05e 100644
--- a/src/intel/vulkan/anv_cmd_buffer.c
+++ b/src/intel/vulkan/anv_cmd_buffer.c
@@ -558,6 +558,25 @@ void anv_CmdSetStencilReference(
cmd_buffer->state.gfx.dirty |= ANV_CMD_DIRTY_DYNAMIC_STENCIL_REFERENCE;
 }
 
+void
+anv_CmdSetSampleLocationsEXT(VkCommandBuffer commandBuffer,
+ const VkSampleLocationsInfoEXT *pSampleLocationsInfo)
+{
+   ANV_FROM_HANDLE(anv_cmd_buffer, cmd_buffer, commandBuffer);
+
+   assert(pSampleLocationsInfo);
+   struct anv_dynamic_state *dyn_state = &cmd_buffer->state.gfx.dynamic;
+   uint32_t num_samples = pSampleLocationsInfo->sampleLocationsPerPixel;
+
+   dyn_state->sample_locations.num_samples = num_samples;
+
+   memcpy(dyn_state->sample_locations.positions,
+  pSampleLocationsInfo->pSampleLocations,
+  num_samples * sizeof *pSampleLocationsInfo->pSampleLocations);
+
+   dyn_state->sample_locations.valid = true;
+}
+
 static void
 anv_cmd_buffer_bind_descriptor_set(struct anv_cmd_buffer *cmd_buffer,
VkPipelineBindPoint bind_point,
diff --git a/src/intel/vulkan/anv_genX.h b/src/intel/vulkan/anv_genX.h
index fb7419b6347..5c618ab 100644
--- a/src/intel/vulkan/anv_genX.h
+++ b/src/intel/vulkan/anv_genX.h
@@ -89,6 +89,10 @@ void genX(cmd_buffer_mi_memset)(struct anv_cmd_buffer 
*cmd_buffer,
 void genX(blorp_exec)(struct blorp_batch *batch,
   const struct blorp_params *params);
 
+void genX(emit_multisample)(struct anv_batch *batch,
+uint32_t samples,
+uint32_t log2_samples);
+
 void genX(emit_sample_locations)(struct anv_batch *batch,
  const VkSampleLocationEXT *sl,
  uint32_t num_samples,
diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
index a39195733cd..1e1d2feaa50 100644
--- a/src/intel/vulkan/anv_private.h
+++ b/src/intel/vulkan/anv_private.h
@@ -2124,6 +2124,12 @@ struct anv_dynamic_state {
   uint32_t  front;
   uint32_t  back;
} stencil_reference;
+
+   struct {
+  VkSampleLocationEXT   positions[MAX_SAMPLE_LOCATIONS];
+  uint32_t  num_samples;
+  bool  valid;
+   } sample_locations;
 };
 
 extern const struct anv_dynamic_state default_dynamic_state;
diff --git a/src/intel/vulkan/genX_cmd_buffer.c 
b/src/intel/vulkan/genX_cmd_buffer.c
index 7687507e6b7..5d7c9b51a84 100644
--- a/src/intel/vulkan/genX_cmd_buffer.c
+++ b/src/intel/vulkan/genX_cmd_buffer.c
@@ -30,6 +30,7 @@
 #include "util/fast_idiv_by_const.h"
 
 #include "common/gen_l3_config.h"
+#include "common/gen_sample_positions.h"
 #include "genxml/gen_macros.h"
 #include "genxml/genX_pack.h"
 
@@ -2638,6 +2639,24 @@ cmd_buffer_flush_push_constants(struct anv_cmd_buffer 
*cmd_buffer,
cmd_buffer->state.push_constants_dirty &= ~flushed;
 }
 
+static void
+cmd_buffer_emit_sample_locations(struct anv_cmd_buffer *cmd_buffer)
+{
+#if GEN_GEN >= 8
+   struct anv_dynamic_state *dyn_state = &cmd_buffer->state.gfx.dynamic;
+   uint32_t samples = dyn_state->sample_locations.num_samples;
+   uint32_t log2_samples;
+
+   assert(samples > 0);
+   log2_samples = __builtin_ffs(samples) - 1;
+
+   genX(emit_multisample)(&cmd_buffer->batch, samples, log2_samples);
+   genX(emit_sample_locations)(&cmd_buffer->batch,
+   dyn_state->sample_locations.positions,
+   samples, true);
+#endif
+}
+
 void
 genX(cmd_buffer_flush_state)(struct anv_cmd_buffer *cmd_buffer)
 {
@@ -2796,6 +2815,11 @@ genX(cmd_buffer_flush_state)(struct anv_cmd_buffer 
*cmd_buffer)
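
The store-then-flush pattern behind the `valid` flag can be sketched without any driver types. Everything below is hypothetical naming (`MAX_LOCATIONS`, `struct location`, `set_sample_locations`, `flush_sample_locations`); the flush step only reports whether it would have emitted state, and clearing the flag after emission is an assumption of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define MAX_LOCATIONS 16 /* assumed upper bound for this sketch */

struct location {
   float x;
   float y;
};

struct dynamic_sample_locations {
   struct location positions[MAX_LOCATIONS];
   uint32_t num_samples;
   bool valid;
};

/* Record the user's locations and mark them as pending. */
static void
set_sample_locations(struct dynamic_sample_locations *state,
                     const struct location *locations,
                     uint32_t num_samples)
{
   assert(state && locations);
   assert(num_samples > 0 && num_samples <= MAX_LOCATIONS);

   memcpy(state->positions, locations, num_samples * sizeof *locations);
   state->num_samples = num_samples;
   state->valid = true;
}

/* Returns true if pending locations were consumed. A real driver would
 * emit hardware state where the comment is.
 */
static bool
flush_sample_locations(struct dynamic_sample_locations *state)
{
   if (!state->valid)
      return false;

   /* emit state->positions[0 .. num_samples - 1] here */
   state->valid = false;
   return true;
}
```

Because the flag lives next to the data it guards, nothing that clears or sets unrelated dirty bits can change whether the locations are emitted.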
  

[Mesa-dev] [PATCH v4 0/9] Implementation of the VK_EXT_sample_locations

2019-03-14 Thread Eleni Maria Stea
Implemented the requirements from the VK_EXT_sample_locations extension
specification to allow setting custom sample locations on Intel Gen >= 7.

Some decisions explained:

The grid size was set to 1x1 because the hardware only supports a single
set of sample locations for the whole framebuffer.

The user can set custom sample locations either per pipeline, by filling
the extension provided structs, or dynamically the way it is described
in sections 26.5, 36.1, 36.2 of the Vulkan specification.

Sections 6.7.3 and 7.4 describe how to use sample locations with images
when a layout transition is about to take place. These sections were
ignored as currently we aren't using sample locations with images in the
driver.

Variable sample locations aren't required and have not been implemented.

(v2): Initially, we were sorting the samples because according to the
  Skylake PRM (vol 2a SAMPLE_PATTERN) the samples should be sent in
  a monotonically increasing distance from the center to get the
  correct centroid computation in the device. However the Vulkan
  spec seems to require that the samples occur in the order provided
  through the API. As long as this requirement only affects centroid
  calculations we should be ok without the ordering because OpenGL
  and Vulkan only require the centroid to be some lit sample and
  that it's the same for all samples in a pixel. They have no
  requirement that it be the one closest to the center. (Jason
  Ekstrand)

We have 754 vk-gl-cts tests for this extension:
690 of the tests pass on Gen >= 9 (where we can support 16 samples).
The remaining 64 tests aren't supported because they test the variable
sample locations.

Eleni Maria Stea (9):
  anv: Added the VK_EXT_sample_locations extension to the anv_extensions
list
  anv: Set the values for the
VkPhysicalDeviceSampleLocationsPropertiesEXT
  anv: Implemented the vkGetPhysicalDeviceMultisamplePropertiesEXT
  anv: Added support for non-dynamic sample locations on Gen8+
  anv: Added support for dynamic sample locations on Gen8+
  anv: Added support for dynamic and non-dynamic sample locations on
Gen7
  anv: Optimized the emission of the default locations on Gen8+
  anv: Removed unused header file
  anv: Enabled the VK_EXT_sample_locations extension

 src/intel/Makefile.sources  |   1 +
 src/intel/common/gen_sample_positions.h |  53 ++
 src/intel/vulkan/anv_cmd_buffer.c   |  19 
 src/intel/vulkan/anv_device.c   |  21 
 src/intel/vulkan/anv_extensions.py  |   1 +
 src/intel/vulkan/anv_genX.h |   7 ++
 src/intel/vulkan/anv_private.h  |  18 
 src/intel/vulkan/anv_sample_locations.c |  96 ++
 src/intel/vulkan/anv_sample_locations.h |  29 ++
 src/intel/vulkan/genX_blorp_exec.c  |   1 -
 src/intel/vulkan/genX_cmd_buffer.c  |  24 +
 src/intel/vulkan/genX_pipeline.c|  92 +
 src/intel/vulkan/genX_state.c   | 128 
 src/intel/vulkan/meson.build|   1 +
 14 files changed, 450 insertions(+), 41 deletions(-)
 create mode 100644 src/intel/vulkan/anv_sample_locations.c
 create mode 100644 src/intel/vulkan/anv_sample_locations.h

-- 
2.20.1


[Mesa-dev] [PATCH v4 2/9] anv: Set the values for the VkPhysicalDeviceSampleLocationsPropertiesEXT

2019-03-14 Thread Eleni Maria Stea
The VkPhysicalDeviceSampleLocationsPropertiesEXT struct is filled with
implementation-dependent values, according to the table from the
Vulkan Specification section [36.1. Limit Requirements]:

pname                               | max           | min
pname:sampleLocationSampleCounts    | -             | ename:VK_SAMPLE_COUNT_4_BIT
pname:maxSampleLocationGridSize     | -             | (1, 1)
pname:sampleLocationCoordinateRange | (0.0, 0.9375) | (0.0, 0.9375)
pname:sampleLocationSubPixelBits    | -             | 4
pname:variableSampleLocations       | false         | implementation dependent

The hardware only supports setting the same sample location for all the
pixels, so we only support 1x1 grids.

Also, variableSampleLocations is set to false because we don't support the
feature.
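
The coordinate range follows from the 4 sub-pixel bits: positions are
representable in steps of 1/16, so the largest value below 1.0 is
15/16 = 0.9375. A minimal sketch of that arithmetic, assuming two
hypothetical helpers (quantize_sample_coord and dequantize_sample_coord are
not part of this patch) and assuming truncation rather than rounding:

```c
#include <stdint.h>

#define SUBPIXEL_BITS 4   /* matches sampleLocationSubPixelBits above */

/* Hypothetical helper, not part of the patch: clamp a coordinate to the
 * advertised range and return it as a fixed-point value in 1/16 steps.
 * Truncation is an assumption made for this sketch.
 */
static inline uint32_t
quantize_sample_coord(float v)
{
   const uint32_t steps = 1u << SUBPIXEL_BITS;      /* 16 */
   const float max = (float)(steps - 1) / steps;    /* 15/16 = 0.9375 */

   if (v < 0.0f)
      v = 0.0f;
   if (v > max)
      v = max;

   return (uint32_t)(v * steps);                    /* 0..15 */
}

/* Convert the fixed-point value back to a float coordinate. */
static inline float
dequantize_sample_coord(uint32_t q)
{
   return (float)q / (1u << SUBPIXEL_BITS);
}
```

With these helpers, any input of 1.0 or above clamps to the fixed-point
value 15, which converts back to 0.9375.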

v2: 1- Replaced false with VK_FALSE for consistency. (Sagar Ghuge)
2- Used isl_device_get_sample_counts to get the supported sample counts
per platform and avoid extra checks. (Sagar Ghuge)

v3: 1- Replaced VK_FALSE with false, as Jason has sent a patch to replace
VK_FALSE with false in other places. (Jason Ekstrand)
2- Removed unnecessary defines and set the grid size to 1. (Jason Ekstrand)

Reviewed-by: Sagar Ghuge 
---
 src/intel/vulkan/anv_device.c | 20 
 1 file changed, 20 insertions(+)

diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
index 83fa3936c19..52ea058bdd5 100644
--- a/src/intel/vulkan/anv_device.c
+++ b/src/intel/vulkan/anv_device.c
@@ -1401,6 +1401,26 @@ void anv_GetPhysicalDeviceProperties2(
  break;
   }
 
+  case VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_SAMPLE_LOCATIONS_PROPERTIES_EXT: {
+ VkPhysicalDeviceSampleLocationsPropertiesEXT *props =
+(VkPhysicalDeviceSampleLocationsPropertiesEXT *)ext;
+
+ props->sampleLocationSampleCounts =
+isl_device_get_sample_counts(&pdevice->isl_dev);
+
+ /* See also anv_GetPhysicalDeviceMultisamplePropertiesEXT */
+ props->maxSampleLocationGridSize.width = 1;
+ props->maxSampleLocationGridSize.height = 1;
+
+ props->sampleLocationCoordinateRange[0] = 0;
+ props->sampleLocationCoordinateRange[1] = 0.9375;
+ props->sampleLocationSubPixelBits = 4;
+
+ props->variableSampleLocations = false;
+
+ break;
+  }
+
   default:
  anv_debug_ignored_stype(ext->sType);
  break;
-- 
2.20.1

[Mesa-dev] [PATCH v4 1/9] anv: Added the VK_EXT_sample_locations extension to the anv_extensions list

2019-03-14 Thread Eleni Maria Stea
Added the VK_EXT_sample_locations to the anv_extensions.py list to
generate the related entrypoints.

Reviewed-by: Sagar Ghuge 
---
 src/intel/vulkan/anv_extensions.py | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/intel/vulkan/anv_extensions.py b/src/intel/vulkan/anv_extensions.py
index 6fff293dee4..9e4e03e46df 100644
--- a/src/intel/vulkan/anv_extensions.py
+++ b/src/intel/vulkan/anv_extensions.py
@@ -129,6 +129,7 @@ EXTENSIONS = [
 Extension('VK_EXT_inline_uniform_block',  1, True),
 Extension('VK_EXT_pci_bus_info',  2, True),
 Extension('VK_EXT_post_depth_coverage',   1, 'device->info.gen >= 9'),
+Extension('VK_EXT_sample_locations',  1, False),
 Extension('VK_EXT_sampler_filter_minmax', 1, 'device->info.gen >= 9'),
 Extension('VK_EXT_scalar_block_layout',   1, True),
 Extension('VK_EXT_shader_stencil_export', 1, 'device->info.gen >= 9'),
-- 
2.20.1

Re: [Mesa-dev] [PATCH 4/9] anv: Added support for non-dynamic sample locations on Gen8+

2019-03-13 Thread Eleni Maria Stea
On Wed, 13 Mar 2019 08:16:10 -0500
Jason Ekstrand  wrote:

> On Mon, Mar 11, 2019 at 10:05 AM Eleni Maria Stea 
> wrote:
> 
> > Allowing the user to set custom sample locations non-dynamically, by
> > filling the extension structs and chaining them to the pipeline
> > structs according to the Vulkan specification section [26.5. Custom
> > Sample Locations]

[...]

> > +void
> > +anv_calc_sample_locations(struct anv_sample *samples,
> > +  uint32_t num_samples,
> > +  const VkSampleLocationsInfoEXT *info)
> > +{
> > +   int i;
> > +
> > +   for(i = 0; i < num_samples; i++) {
> > +  float dx, dy;
> > +
> > +  /* this is because the grid is 1x1, in case that
> > +   * we support different grid sizes in the future
> > +   * this must be changed.
> > +   */
> > +  samples[i].offs_x = info->pSampleLocations[i].x;
> > +  samples[i].offs_y = info->pSampleLocations[i].y;
> > +
> > +  /* distance from the center */
> > +  dx = samples[i].offs_x - 0.5;
> > +  dy = samples[i].offs_y - 0.5;
> > +
> > +  samples[i].radius = dx * dx + dy * dy;
> > +   }
> > +
> > +   qsort(samples, num_samples, sizeof *samples, compare_samples);
> >  
> 
> Are we allowed to re-order the samples like this?  The spec says:
> 
> The sample location for sample i at the pixel grid location (x,y) is
> taken from pSampleLocations[(x + y * sampleLocationGridSize.width) *
> sampleLocationsPerPixel + i]
> 
> Which leads me to think that they expect the ordering of samples to be
> respected.  Yes, I know the HW docs say we're supposed to order them
> from nearest to furthest.  However, AFAIK, that's only so we get nice
> centroids and I don't know that it's actually required.
> 
> --Jason

I wasn't sure about this, to be honest. I could remove the qsort and
explain in a comment why we decided to ignore the PRM, in case
someone decides to put it back in the future.
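
For reference, the indexing rule from the spec text quoted above can be
written out as a small sketch. The helper name sample_location_index is
hypothetical and not part of the series; it only restates the formula:

```c
#include <stdint.h>

/* Hypothetical helper: index into pSampleLocations for sample i of the
 * pixel at grid position (x, y), following the spec text quoted above.
 */
static inline uint32_t
sample_location_index(uint32_t x, uint32_t y, uint32_t i,
                      uint32_t grid_width, uint32_t samples_per_pixel)
{
   return (x + y * grid_width) * samples_per_pixel + i;
}
```

With the 1x1 grid used in this series, x and y are always 0, so the index
reduces to i. Sorting the array would therefore change which location is
assigned to which sample index.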

Thanks a lot for reviewing the series, BTW. I am working on the
changes for all patches.

Eleni

[Mesa-dev] [PATCH v3 9/9] anv: Enabled the VK_EXT_sample_locations extension

2019-03-13 Thread Eleni Maria Stea
Enabled the VK_EXT_sample_locations for Intel Gen >= 7.

v2: Replaced device->info.gen >= 7 with True, as Anv doesn't support
anything below Gen7. (Lionel Landwerlin)

Reviewed-by: Sagar Ghuge 
---
 src/intel/vulkan/anv_extensions.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/intel/vulkan/anv_extensions.py b/src/intel/vulkan/anv_extensions.py
index 9e4e03e46df..5a30c733c5c 100644
--- a/src/intel/vulkan/anv_extensions.py
+++ b/src/intel/vulkan/anv_extensions.py
@@ -129,7 +129,7 @@ EXTENSIONS = [
 Extension('VK_EXT_inline_uniform_block',  1, True),
 Extension('VK_EXT_pci_bus_info',  2, True),
 Extension('VK_EXT_post_depth_coverage',   1, 'device->info.gen >= 9'),
-Extension('VK_EXT_sample_locations',  1, False),
+Extension('VK_EXT_sample_locations',  1, True),
 Extension('VK_EXT_sampler_filter_minmax', 1, 'device->info.gen >= 9'),
 Extension('VK_EXT_scalar_block_layout',   1, True),
 Extension('VK_EXT_shader_stencil_export', 1, 'device->info.gen >= 9'),
-- 
2.20.1

[Mesa-dev] [PATCH v3 8/9] anv: Removed unused header file

2019-03-13 Thread Eleni Maria Stea
src/intel/vulkan/genX_blorp_exec.c included common/gen_sample_positions.h
but did not use it. Removed.

Reviewed-by: Sagar Ghuge 
---
 src/intel/vulkan/genX_blorp_exec.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/src/intel/vulkan/genX_blorp_exec.c b/src/intel/vulkan/genX_blorp_exec.c
index e9c85d56d5f..0eeefaaa9d6 100644
--- a/src/intel/vulkan/genX_blorp_exec.c
+++ b/src/intel/vulkan/genX_blorp_exec.c
@@ -31,7 +31,6 @@
 #undef __gen_combine_address
 
 #include "common/gen_l3_config.h"
-#include "common/gen_sample_positions.h"
 #include "blorp/blorp_genX_exec.h"
 
 static void *
-- 
2.20.1

[Mesa-dev] [PATCH v3 7/9] anv: Optimized the emission of the default locations on Gen8+

2019-03-13 Thread Eleni Maria Stea
We only emit sample locations when the extension is enabled by the user.
In all other cases the default locations are emitted once when the device
is initialized to increase performance.
---
 src/intel/vulkan/anv_genX.h|  3 ++-
 src/intel/vulkan/genX_cmd_buffer.c |  2 +-
 src/intel/vulkan/genX_pipeline.c   | 11 +++
 src/intel/vulkan/genX_state.c  |  8 +---
 4 files changed, 15 insertions(+), 9 deletions(-)

diff --git a/src/intel/vulkan/anv_genX.h b/src/intel/vulkan/anv_genX.h
index e82d83465ef..7f33a2b0a68 100644
--- a/src/intel/vulkan/anv_genX.h
+++ b/src/intel/vulkan/anv_genX.h
@@ -93,4 +93,5 @@ void genX(emit_ms_state)(struct anv_batch *batch,
  struct anv_sample *anv_samples,
  uint32_t num_samples,
  uint32_t log2_samples,
- bool custom_sample_locations);
+ bool custom_sample_locations,
+ bool sample_locations_ext_enabled);
diff --git a/src/intel/vulkan/genX_cmd_buffer.c b/src/intel/vulkan/genX_cmd_buffer.c
index 4752c66f350..ae7c5a80a3c 100644
--- a/src/intel/vulkan/genX_cmd_buffer.c
+++ b/src/intel/vulkan/genX_cmd_buffer.c
@@ -2654,7 +2654,7 @@ cmd_buffer_emit_sample_locations(struct anv_cmd_buffer *cmd_buffer)
anv_samples = cmd_buffer->state.gfx.dynamic.sample_locations.anv_samples;
 
genX(emit_ms_state)(&cmd_buffer->batch, anv_samples, samples,
-   log2_samples, true);
+   log2_samples, true, true);
 }
 
 void
diff --git a/src/intel/vulkan/genX_pipeline.c b/src/intel/vulkan/genX_pipeline.c
index 8afc08f0320..12adfa65da8 100644
--- a/src/intel/vulkan/genX_pipeline.c
+++ b/src/intel/vulkan/genX_pipeline.c
@@ -573,10 +573,12 @@ emit_sample_mask(struct anv_pipeline *pipeline,
 }
 
 static void
-emit_ms_state(struct anv_pipeline *pipeline,
+emit_ms_state(struct anv_device *device,
+  struct anv_pipeline *pipeline,
   const VkPipelineMultisampleStateCreateInfo *info,
   const VkPipelineDynamicStateCreateInfo *dinfo)
 {
+   bool sample_loc_enabled = device->enabled_extensions.EXT_sample_locations;
struct anv_sample anv_samples[MAX_SAMPLE_LOCATIONS];
VkSampleLocationsInfoEXT *sl;
bool custom_locations = false;
@@ -588,7 +590,7 @@ emit_ms_state(struct anv_pipeline *pipeline,
if (info) {
   samples = info->rasterizationSamples;
 
-  if (info->pNext) {
+  if (sample_loc_enabled && info->pNext) {
  VkPipelineSampleLocationsStateCreateInfoEXT *slinfo =
 (VkPipelineSampleLocationsStateCreateInfoEXT *)info->pNext;
 
@@ -617,7 +619,7 @@ emit_ms_state(struct anv_pipeline *pipeline,
}
 
genX(emit_ms_state)(&pipeline->batch, anv_samples, samples, log2_samples,
-   custom_locations);
+   custom_locations, sample_loc_enabled);
 }
 
 static const uint32_t vk_to_gen_logic_op[] = {
@@ -1947,7 +1949,8 @@ genX(graphics_pipeline_create)(
assert(pCreateInfo->pRasterizationState);
emit_rs_state(pipeline, pCreateInfo->pRasterizationState,
  pCreateInfo->pMultisampleState, pass, subpass);
-   emit_ms_state(pipeline, pCreateInfo->pMultisampleState, pCreateInfo->pDynamicState);
+   emit_ms_state(device, pipeline, pCreateInfo->pMultisampleState,
+ pCreateInfo->pDynamicState);
emit_ds_state(pipeline, pCreateInfo->pDepthStencilState, pass, subpass);
emit_cb_state(pipeline, pCreateInfo->pColorBlendState,
pCreateInfo->pMultisampleState);
diff --git a/src/intel/vulkan/genX_state.c b/src/intel/vulkan/genX_state.c
index 804cfab3a56..bc6b5870d8d 100644
--- a/src/intel/vulkan/genX_state.c
+++ b/src/intel/vulkan/genX_state.c
@@ -552,12 +552,14 @@ genX(emit_ms_state)(struct anv_batch *batch,
   struct anv_sample *anv_samples,
   uint32_t num_samples,
   uint32_t log2_samples,
-  bool custom_sample_locations)
+  bool custom_sample_locations,
+  bool sample_locations_ext_enabled)
 {
emit_multisample(batch, anv_samples, num_samples, log2_samples,
 custom_sample_locations);
 #if GEN_GEN >= 8
-   emit_sample_locations(batch, anv_samples, num_samples,
- custom_sample_locations);
+   if (sample_locations_ext_enabled)
+  emit_sample_locations(batch, anv_samples, num_samples,
+custom_sample_locations);
 #endif
 }
-- 
2.20.1

[Mesa-dev] [PATCH v3 3/9] anv: Implemented the vkGetPhysicalDeviceMultisamplePropertiesEXT

2019-03-13 Thread Eleni Maria Stea
Implemented the vkGetPhysicalDeviceMultisamplePropertiesEXT according to
the Vulkan Specification section [36.2. Additional Multisampling
Capabilities].
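
The rule implemented by the query can be summarized in a standalone sketch.
The names struct grid_size and max_sample_location_grid are hypothetical
stand-ins (for VkExtent2D and the entrypoint), and the sample count and
hardware generation are passed as plain integers for illustration:

```c
#include <stdint.h>

/* Stand-in for VkExtent2D, used only in this sketch. */
struct grid_size {
   uint32_t width;
   uint32_t height;
};

/* Hypothetical standalone version of the rule: 2, 4 and 8 samples always
 * report a 1x1 grid, 16 samples report it only on Gen9+, and everything
 * else reports 0x0 (not supported).
 */
static inline struct grid_size
max_sample_location_grid(uint32_t samples, int gen)
{
   const struct grid_size supported = { 1, 1 };
   const struct grid_size unsupported = { 0, 0 };

   switch (samples) {
   case 2:
   case 4:
   case 8:
      return supported;
   case 16:
      return gen >= 9 ? supported : unsupported;
   default:
      return unsupported;
   }
}
```

In the patch itself the 16-sample case reaches the unsupported result by
falling through to the default label when the generation check fails.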
---
 src/intel/Makefile.sources  |  1 +
 src/intel/vulkan/anv_sample_locations.c | 60 +
 src/intel/vulkan/meson.build|  1 +
 3 files changed, 62 insertions(+)
 create mode 100644 src/intel/vulkan/anv_sample_locations.c

diff --git a/src/intel/Makefile.sources b/src/intel/Makefile.sources
index a5c8828a6b6..a0873c7ccc2 100644
--- a/src/intel/Makefile.sources
+++ b/src/intel/Makefile.sources
@@ -251,6 +251,7 @@ VULKAN_FILES := \
vulkan/anv_pipeline_cache.c \
vulkan/anv_private.h \
vulkan/anv_queue.c \
+   vulkan/anv_sample_locations.c \
vulkan/anv_util.c \
vulkan/anv_wsi.c \
vulkan/vk_format_info.h
diff --git a/src/intel/vulkan/anv_sample_locations.c b/src/intel/vulkan/anv_sample_locations.c
new file mode 100644
index 000..1ebf280e05b
--- /dev/null
+++ b/src/intel/vulkan/anv_sample_locations.c
@@ -0,0 +1,60 @@
+/*
+ * Copyright © 2019 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ */
+
+#include "anv_private.h"
+
+void
+anv_GetPhysicalDeviceMultisamplePropertiesEXT(VkPhysicalDevice physicalDevice,
+  VkSampleCountFlagBits samples,
+  VkMultisamplePropertiesEXT
+  *pMultisampleProperties)
+{
+   ANV_FROM_HANDLE(anv_physical_device, physical_device, physicalDevice);
+   const struct gen_device_info *devinfo = &physical_device->info;
+
+   VkExtent2D grid_size;
+   switch (samples) {
+   case VK_SAMPLE_COUNT_2_BIT:
+   case VK_SAMPLE_COUNT_4_BIT:
+   case VK_SAMPLE_COUNT_8_BIT:
+  grid_size.width = SAMPLE_LOC_GRID_W;
+  grid_size.height = SAMPLE_LOC_GRID_H;
+  break;
+
+   case VK_SAMPLE_COUNT_16_BIT:
+  if (devinfo->gen >= 9) {
+ grid_size.width = SAMPLE_LOC_GRID_W;
+ grid_size.height = SAMPLE_LOC_GRID_H;
+ break;
+  }
+   default:
+  grid_size.width = grid_size.height = 0;
+  break;
+   };
+
+   *pMultisampleProperties = (VkMultisamplePropertiesEXT) {
+  .sType = VK_STRUCTURE_TYPE_MULTISAMPLE_PROPERTIES_EXT,
+  .pNext = NULL,
+  .maxSampleLocationGridSize = grid_size
+   };
+}
diff --git a/src/intel/vulkan/meson.build b/src/intel/vulkan/meson.build
index 7fa43a6ad79..3f78757c774 100644
--- a/src/intel/vulkan/meson.build
+++ b/src/intel/vulkan/meson.build
@@ -135,6 +135,7 @@ libanv_files = files(
   'anv_pipeline_cache.c',
   'anv_private.h',
   'anv_queue.c',
+  'anv_sample_locations.c',
   'anv_util.c',
   'anv_wsi.c',
   'vk_format_info.h',
-- 
2.20.1

[Mesa-dev] [PATCH v3 1/9] anv: Added the VK_EXT_sample_locations extension to the anv_extensions list

2019-03-13 Thread Eleni Maria Stea
Added the VK_EXT_sample_locations to the anv_extensions.py list to
generate the related entrypoints.

Reviewed-by: Sagar Ghuge 
---
 src/intel/vulkan/anv_extensions.py | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/intel/vulkan/anv_extensions.py b/src/intel/vulkan/anv_extensions.py
index 6fff293dee4..9e4e03e46df 100644
--- a/src/intel/vulkan/anv_extensions.py
+++ b/src/intel/vulkan/anv_extensions.py
@@ -129,6 +129,7 @@ EXTENSIONS = [
 Extension('VK_EXT_inline_uniform_block',  1, True),
 Extension('VK_EXT_pci_bus_info',  2, True),
 Extension('VK_EXT_post_depth_coverage',   1, 'device->info.gen >= 9'),
+Extension('VK_EXT_sample_locations',  1, False),
 Extension('VK_EXT_sampler_filter_minmax', 1, 'device->info.gen >= 9'),
 Extension('VK_EXT_scalar_block_layout',   1, True),
 Extension('VK_EXT_shader_stencil_export', 1, 'device->info.gen >= 9'),
-- 
2.20.1

[Mesa-dev] [PATCH v3 6/9] anv: Added support for dynamic and non-dynamic sample locations on Gen7

2019-03-13 Thread Eleni Maria Stea
Allow setting both dynamic and non-dynamic sample locations on Gen7.
---
 src/intel/vulkan/anv_genX.h| 13 ++---
 src/intel/vulkan/genX_cmd_buffer.c |  9 ++--
 src/intel/vulkan/genX_pipeline.c   | 13 +
 src/intel/vulkan/genX_state.c  | 86 +-
 4 files changed, 70 insertions(+), 51 deletions(-)

diff --git a/src/intel/vulkan/anv_genX.h b/src/intel/vulkan/anv_genX.h
index f84fe457152..e82d83465ef 100644
--- a/src/intel/vulkan/anv_genX.h
+++ b/src/intel/vulkan/anv_genX.h
@@ -89,11 +89,8 @@ void genX(cmd_buffer_mi_memset)(struct anv_cmd_buffer *cmd_buffer,
 void genX(blorp_exec)(struct blorp_batch *batch,
   const struct blorp_params *params);
 
-void genX(emit_multisample)(struct anv_batch *batch,
-uint32_t samples,
-uint32_t log2_samples);
-
-void genX(emit_sample_locations)(struct anv_batch *batch,
- const struct anv_sample *anv_samples,
- uint32_t num_samples,
- bool custom_locations);
+void genX(emit_ms_state)(struct anv_batch *batch,
+ struct anv_sample *anv_samples,
+ uint32_t num_samples,
+ uint32_t log2_samples,
+ bool custom_sample_locations);
diff --git a/src/intel/vulkan/genX_cmd_buffer.c b/src/intel/vulkan/genX_cmd_buffer.c
index 9229df84caa..4752c66f350 100644
--- a/src/intel/vulkan/genX_cmd_buffer.c
+++ b/src/intel/vulkan/genX_cmd_buffer.c
@@ -2643,8 +2643,7 @@ cmd_buffer_flush_push_constants(struct anv_cmd_buffer *cmd_buffer,
 static void
 cmd_buffer_emit_sample_locations(struct anv_cmd_buffer *cmd_buffer)
 {
-#if GEN_GEN >= 8
-   const struct anv_sample *anv_samples;
+   struct anv_sample *anv_samples;
uint32_t log2_samples;
uint32_t samples;
 
@@ -2654,10 +2653,8 @@ cmd_buffer_emit_sample_locations(struct anv_cmd_buffer *cmd_buffer)
log2_samples = __builtin_ffs(samples) - 1;
anv_samples = cmd_buffer->state.gfx.dynamic.sample_locations.anv_samples;
 
-   genX(emit_multisample)(&cmd_buffer->batch, samples, log2_samples);
-   genX(emit_sample_locations)(&cmd_buffer->batch, anv_samples, samples,
-  true);
-#endif
+   genX(emit_ms_state)(&cmd_buffer->batch, anv_samples, samples,
+   log2_samples, true);
 }
 
 void
diff --git a/src/intel/vulkan/genX_pipeline.c b/src/intel/vulkan/genX_pipeline.c
index fa42e622077..8afc08f0320 100644
--- a/src/intel/vulkan/genX_pipeline.c
+++ b/src/intel/vulkan/genX_pipeline.c
@@ -577,12 +577,9 @@ emit_ms_state(struct anv_pipeline *pipeline,
   const VkPipelineMultisampleStateCreateInfo *info,
   const VkPipelineDynamicStateCreateInfo *dinfo)
 {
-#if GEN_GEN >= 8
struct anv_sample anv_samples[MAX_SAMPLE_LOCATIONS];
VkSampleLocationsInfoEXT *sl;
bool custom_locations = false;
-#endif
-
uint32_t samples = 1;
uint32_t log2_samples = 0;
 
@@ -591,7 +588,6 @@ emit_ms_state(struct anv_pipeline *pipeline,
if (info) {
   samples = info->rasterizationSamples;
 
-#if GEN_GEN >= 8
   if (info->pNext) {
  VkPipelineSampleLocationsStateCreateInfoEXT *slinfo =
 (VkPipelineSampleLocationsStateCreateInfoEXT *)info->pNext;
@@ -616,17 +612,12 @@ emit_ms_state(struct anv_pipeline *pipeline,
 }
  }
   }
-#endif
 
   log2_samples = __builtin_ffs(samples) - 1;
}
 
-   genX(emit_multisample(&pipeline->batch, samples, log2_samples));
-
-#if GEN_GEN >= 8
-   genX(emit_sample_locations)(&pipeline->batch, anv_samples, samples,
-   custom_locations);
-#endif
+   genX(emit_ms_state)(&pipeline->batch, anv_samples, samples, log2_samples,
+   custom_locations);
 }
 
 static const uint32_t vk_to_gen_logic_op[] = {
diff --git a/src/intel/vulkan/genX_state.c b/src/intel/vulkan/genX_state.c
index 44cfc925ed5..804cfab3a56 100644
--- a/src/intel/vulkan/genX_state.c
+++ b/src/intel/vulkan/genX_state.c
@@ -437,10 +437,12 @@ VkResult genX(CreateSampler)(
return VK_SUCCESS;
 }
 
-void
-genX(emit_multisample)(struct anv_batch *batch,
-   uint32_t samples,
-   uint32_t log2_samples)
+static void
+emit_multisample(struct anv_batch *batch,
+ const struct anv_sample *anv_samples,
+ uint32_t samples,
+ uint32_t log2_samples,
+ bool custom_locations)
 {
anv_batch_emit(batch, GENX(3DSTATE_MULTISAMPLE), ms) {
   ms.NumberofMultisamples = log2_samples;
@@ -453,34 +455,52 @@ genX(emit_multisample)(struct anv_batch *batch,
*/
   ms.PixelPositionOffsetEnable  = false;
 #else
-  switch (samples) {
-  case 1:
- GEN_SAMPLE_POS_1X(ms.Sample);
- break;
-  case 2:
- GEN_SAMPLE_POS_2X(ms.Sample);
- break;
-  case 4:
- GEN_SAMPLE_POS_4X(ms.Sample);
- break;
- 

[Mesa-dev] [PATCH v3 5/9] anv: Added support for dynamic sample locations on Gen8+

2019-03-13 Thread Eleni Maria Stea
Added support for setting the locations when the pipeline has been
created with the dynamic state bit enabled according to the Vulkan
Specification section [26.5. Custom Sample Locations] for the function:

'vkCmdSetSampleLocationsEXT'

We store a boolean, valid, inside the dynamic state struct for sample
locations instead of using a dirty bit (ANV_CMD_DIRTY_SAMPLE_LOCATIONS
for example), because other functions can modify the dirty bits, which
would cause unexpected behavior.
---
 src/intel/vulkan/anv_cmd_buffer.c  | 19 
 src/intel/vulkan/anv_genX.h|  6 +++-
 src/intel/vulkan/anv_private.h |  6 
 src/intel/vulkan/genX_cmd_buffer.c | 27 ++
 src/intel/vulkan/genX_pipeline.c   | 46 --
 src/intel/vulkan/genX_state.c  | 41 +++---
 6 files changed, 99 insertions(+), 46 deletions(-)

diff --git a/src/intel/vulkan/anv_cmd_buffer.c b/src/intel/vulkan/anv_cmd_buffer.c
index 1b34644a434..101c1375430 100644
--- a/src/intel/vulkan/anv_cmd_buffer.c
+++ b/src/intel/vulkan/anv_cmd_buffer.c
@@ -28,6 +28,7 @@
 #include 
 
 #include "anv_private.h"
+#include "anv_sample_locations.h"
 
 #include "vk_format_info.h"
 #include "vk_util.h"
@@ -558,6 +559,24 @@ void anv_CmdSetStencilReference(
cmd_buffer->state.gfx.dirty |= ANV_CMD_DIRTY_DYNAMIC_STENCIL_REFERENCE;
 }
 
+void
+anv_CmdSetSampleLocationsEXT(VkCommandBuffer commandBuffer,
+ const VkSampleLocationsInfoEXT *pSampleLocationsInfo)
+{
+   ANV_FROM_HANDLE(anv_cmd_buffer, cmd_buffer, commandBuffer);
+   assert(pSampleLocationsInfo);
+
+   struct anv_dynamic_state *dyn_state = &cmd_buffer->state.gfx.dynamic;
+   dyn_state->sample_locations.num_samples =
+  pSampleLocationsInfo->sampleLocationsPerPixel;
+
+   anv_calc_sample_locations(dyn_state->sample_locations.anv_samples,
+ dyn_state->sample_locations.num_samples,
+ pSampleLocationsInfo);
+
+   cmd_buffer->state.gfx.dynamic.sample_locations.valid = true;
+}
+
 static void
 anv_cmd_buffer_bind_descriptor_set(struct anv_cmd_buffer *cmd_buffer,
VkPipelineBindPoint bind_point,
diff --git a/src/intel/vulkan/anv_genX.h b/src/intel/vulkan/anv_genX.h
index 52415c04a45..f84fe457152 100644
--- a/src/intel/vulkan/anv_genX.h
+++ b/src/intel/vulkan/anv_genX.h
@@ -89,7 +89,11 @@ void genX(cmd_buffer_mi_memset)(struct anv_cmd_buffer *cmd_buffer,
 void genX(blorp_exec)(struct blorp_batch *batch,
   const struct blorp_params *params);
 
+void genX(emit_multisample)(struct anv_batch *batch,
+uint32_t samples,
+uint32_t log2_samples);
+
 void genX(emit_sample_locations)(struct anv_batch *batch,
+ const struct anv_sample *anv_samples,
  uint32_t num_samples,
- const VkSampleLocationsInfoEXT *sl,
  bool custom_locations);
diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
index 981956e5706..a2e1756cd99 100644
--- a/src/intel/vulkan/anv_private.h
+++ b/src/intel/vulkan/anv_private.h
@@ -2135,6 +2135,12 @@ struct anv_dynamic_state {
   uint32_t  front;
   uint32_t  back;
} stencil_reference;
+
+   struct {
+  struct anv_sample anv_samples[MAX_SAMPLE_LOCATIONS];
+  uint32_t  num_samples;
+  bool  valid;
+   } sample_locations;
 };
 
 extern const struct anv_dynamic_state default_dynamic_state;
diff --git a/src/intel/vulkan/genX_cmd_buffer.c b/src/intel/vulkan/genX_cmd_buffer.c
index 7687507e6b7..9229df84caa 100644
--- a/src/intel/vulkan/genX_cmd_buffer.c
+++ b/src/intel/vulkan/genX_cmd_buffer.c
@@ -25,11 +25,13 @@
 #include 
 
 #include "anv_private.h"
+#include "anv_sample_locations.h"
 #include "vk_format_info.h"
 #include "vk_util.h"
 #include "util/fast_idiv_by_const.h"
 
 #include "common/gen_l3_config.h"
+#include "common/gen_sample_positions.h"
 #include "genxml/gen_macros.h"
 #include "genxml/genX_pack.h"
 
@@ -2638,6 +2640,26 @@ cmd_buffer_flush_push_constants(struct anv_cmd_buffer *cmd_buffer,
cmd_buffer->state.push_constants_dirty &= ~flushed;
 }
 
+static void
+cmd_buffer_emit_sample_locations(struct anv_cmd_buffer *cmd_buffer)
+{
+#if GEN_GEN >= 8
+   const struct anv_sample *anv_samples;
+   uint32_t log2_samples;
+   uint32_t samples;
+
+   samples = cmd_buffer->state.gfx.dynamic.sample_locations.num_samples;
+   assert(samples > 0);
+
+   log2_samples = __builtin_ffs(samples) - 1;
+   anv_samples = cmd_buffer->state.gfx.dynamic.sample_locations.anv_samples;
+
+   genX(emit_multisample)(&cmd_buffer->batch, samples, log2_samples);
+   genX(emit_sample_locations)(&cmd_buffer->batch, 

[Mesa-dev] [PATCH v3 2/9] anv: Set the values for the VkPhysicalDeviceSampleLocationsPropertiesEXT

2019-03-13 Thread Eleni Maria Stea
The VkPhysicalDeviceSampleLocationsPropertiesEXT struct is filled with
implementation-dependent values, following the table from the
Vulkan Specification section [36.1. Limit Requirements]:

pname                               | max           | min
pname:sampleLocationSampleCounts    | -             | ename:VK_SAMPLE_COUNT_4_BIT
pname:maxSampleLocationGridSize     | -             | (1, 1)
pname:sampleLocationCoordinateRange | (0.0, 0.9375) | (0.0, 0.9375)
pname:sampleLocationSubPixelBits    | -             | 4
pname:variableSampleLocations       | false         | implementation dependent

The hardware only supports setting the same sample location for all the
pixels, so we only support 1x1 grids.

Also, variableSampleLocations is set to false because we don't support the
feature.

v2: 1- Replaced false with VK_FALSE for consistency. (Sagar Ghuge)
2- Used isl_device_get_sample_counts to get the supported sample counts
per platform and avoid extra checks. (Sagar Ghuge)

Reviewed-by: Sagar Ghuge 
---
 src/intel/vulkan/anv_device.c  | 19 +++
 src/intel/vulkan/anv_private.h |  3 +++
 2 files changed, 22 insertions(+)

diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
index 729cceb3e32..bf6f03ebb1a 100644
--- a/src/intel/vulkan/anv_device.c
+++ b/src/intel/vulkan/anv_device.c
@@ -1401,6 +1401,25 @@ void anv_GetPhysicalDeviceProperties2(
  break;
   }
 
+  case VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_SAMPLE_LOCATIONS_PROPERTIES_EXT: {
+ VkPhysicalDeviceSampleLocationsPropertiesEXT *props =
+(VkPhysicalDeviceSampleLocationsPropertiesEXT *)ext;
+
+ props->sampleLocationSampleCounts =
+isl_device_get_sample_counts(&pdevice->isl_dev);
+
+ props->maxSampleLocationGridSize.width = SAMPLE_LOC_GRID_W;
+ props->maxSampleLocationGridSize.height = SAMPLE_LOC_GRID_H;
+
+ props->sampleLocationCoordinateRange[0] = 0;
+ props->sampleLocationCoordinateRange[1] = 0.9375;
+ props->sampleLocationSubPixelBits = 4;
+
+ props->variableSampleLocations = VK_FALSE;
+
+ break;
+  }
+
   default:
  anv_debug_ignored_stype(ext->sType);
  break;
diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
index eed282ff985..5905299e59d 100644
--- a/src/intel/vulkan/anv_private.h
+++ b/src/intel/vulkan/anv_private.h
@@ -195,6 +195,9 @@ struct gen_l3_config;
 
 #define anv_printflike(a, b) __attribute__((__format__(__printf__, a, b)))
 
+#define SAMPLE_LOC_GRID_W 1
+#define SAMPLE_LOC_GRID_H 1
+
 static inline uint32_t
 align_down_npot_u32(uint32_t v, uint32_t a)
 {
-- 
2.20.1

[Mesa-dev] [PATCH v3 4/9] anv: Added support for non-dynamic sample locations on Gen8+

2019-03-13 Thread Eleni Maria Stea
Allow the user to set custom sample locations non-dynamically, by
filling the extension structs and chaining them to the pipeline structs
according to the Vulkan specification section [26.5. Custom Sample Locations]
for the following structures:

'VkPipelineSampleLocationsStateCreateInfoEXT'
'VkSampleLocationsInfoEXT'
'VkSampleLocationEXT'

Once custom locations are used, the default locations are lost and need to be
re-emitted at the next pipeline creation. For that reason, we emit
3DSTATE_SAMPLE_PATTERN at every pipeline creation.
---
 src/intel/common/gen_sample_positions.h | 53 
 src/intel/vulkan/anv_genX.h |  5 ++
 src/intel/vulkan/anv_private.h  |  9 +++
 src/intel/vulkan/anv_sample_locations.c | 38 +++-
 src/intel/vulkan/anv_sample_locations.h | 29 +
 src/intel/vulkan/genX_pipeline.c| 80 +
 src/intel/vulkan/genX_state.c   | 59 ++
 7 files changed, 259 insertions(+), 14 deletions(-)
 create mode 100644 src/intel/vulkan/anv_sample_locations.h

diff --git a/src/intel/common/gen_sample_positions.h b/src/intel/common/gen_sample_positions.h
index da48dcb5ed0..e8af2a552dc 100644
--- a/src/intel/common/gen_sample_positions.h
+++ b/src/intel/common/gen_sample_positions.h
@@ -160,4 +160,57 @@ prefix##14YOffset  = 0.9375; \
 prefix##15XOffset  = 0.0625; \
 prefix##15YOffset  = 0.;
 
+/* Examples:
+ * in case of GEN_GEN < 8:
+ * SET_SAMPLE_POS(ms.Sample, 0); expands to:
+ *ms.Sample0XOffset = anv_samples[0].offs_x;
+ *ms.Sample0YOffset = anv_samples[0].offs_y;
+ *
+ * in case of GEN_GEN >= 8:
+ * SET_SAMPLE_POS(sp._16xSample, 0); expands to:
+ *sp._16xSample0XOffset = anv_samples[0].offs_x;
+ *sp._16xSample0YOffset = anv_samples[0].offs_y;
+ */
+#define SET_SAMPLE_POS(prefix, sample_idx) \
+prefix##sample_idx##XOffset = anv_samples[sample_idx].offs_x; \
+prefix##sample_idx##YOffset = anv_samples[sample_idx].offs_y;
+
+#define SET_SAMPLE_POS_2X(prefix) \
+SET_SAMPLE_POS(prefix, 0); \
+SET_SAMPLE_POS(prefix, 1);
+
+#define SET_SAMPLE_POS_4X(prefix) \
+SET_SAMPLE_POS(prefix, 0); \
+SET_SAMPLE_POS(prefix, 1); \
+SET_SAMPLE_POS(prefix, 2); \
+SET_SAMPLE_POS(prefix, 3);
+
+#define SET_SAMPLE_POS_8X(prefix) \
+SET_SAMPLE_POS(prefix, 0); \
+SET_SAMPLE_POS(prefix, 1); \
+SET_SAMPLE_POS(prefix, 2); \
+SET_SAMPLE_POS(prefix, 3); \
+SET_SAMPLE_POS(prefix, 4); \
+SET_SAMPLE_POS(prefix, 5); \
+SET_SAMPLE_POS(prefix, 6); \
+SET_SAMPLE_POS(prefix, 7);
+
+#define SET_SAMPLE_POS_16X(prefix) \
+SET_SAMPLE_POS(prefix, 0); \
+SET_SAMPLE_POS(prefix, 1); \
+SET_SAMPLE_POS(prefix, 2); \
+SET_SAMPLE_POS(prefix, 3); \
+SET_SAMPLE_POS(prefix, 4); \
+SET_SAMPLE_POS(prefix, 5); \
+SET_SAMPLE_POS(prefix, 6); \
+SET_SAMPLE_POS(prefix, 7); \
+SET_SAMPLE_POS(prefix, 8); \
+SET_SAMPLE_POS(prefix, 9); \
+SET_SAMPLE_POS(prefix, 10); \
+SET_SAMPLE_POS(prefix, 11); \
+SET_SAMPLE_POS(prefix, 12); \
+SET_SAMPLE_POS(prefix, 13); \
+SET_SAMPLE_POS(prefix, 14); \
+SET_SAMPLE_POS(prefix, 15);
+
 #endif /* GEN_SAMPLE_POSITIONS_H */
diff --git a/src/intel/vulkan/anv_genX.h b/src/intel/vulkan/anv_genX.h
index 8fd32cabf1e..52415c04a45 100644
--- a/src/intel/vulkan/anv_genX.h
+++ b/src/intel/vulkan/anv_genX.h
@@ -88,3 +88,8 @@ void genX(cmd_buffer_mi_memset)(struct anv_cmd_buffer *cmd_buffer,
 
 void genX(blorp_exec)(struct blorp_batch *batch,
   const struct blorp_params *params);
+
+void genX(emit_sample_locations)(struct anv_batch *batch,
+ uint32_t num_samples,
+ const VkSampleLocationsInfoEXT *sl,
+ bool custom_locations);
diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
index 5905299e59d..981956e5706 100644
--- a/src/intel/vulkan/anv_private.h
+++ b/src/intel/vulkan/anv_private.h
@@ -71,6 +71,7 @@ struct anv_buffer;
 struct anv_buffer_view;
 struct anv_image_view;
 struct anv_instance;
+struct anv_sample;
 
 struct gen_l3_config;
 
@@ -165,6 +166,7 @@ struct gen_l3_config;
 #define MAX_PUSH_DESCRIPTORS 32 /* Minimum requirement */
 #define MAX_INLINE_UNIFORM_BLOCK_SIZE 4096
 #define MAX_INLINE_UNIFORM_BLOCK_DESCRIPTORS 32
+#define MAX_SAMPLE_LOCATIONS 16
 
 /* The kernel relocation API has a limitation of a 32-bit delta value
  * applied to the address before it is written which, in spite of it being
@@ -2086,6 +2088,13 @@ struct anv_push_constants {
struct brw_image_param images[MAX_GEN8_IMAGES];
 };
 
+struct
+anv_sample {
+   float offs_x;
+   float offs_y;
+   float radius;
+};
+
 struct anv_dynamic_state {
struct {
   uint32_t  count;
diff --git a/src/intel/vulkan/anv_sample_locations.c b/src/intel/vulkan/anv_sample_locations.c
index 1ebf280e05b..c660cb5ae84 100644
--- a/src/intel/vulkan/anv_sample_locations.c
+++ b/src/intel/vulkan/anv_sample_locations.c
@@ -21,7 +21,7 @@
  * IN THE SOFTWARE.
  */
 
-#include "anv_private.h"

[Mesa-dev] [PATCH v3 0/9] Implementation of the VK_EXT_sample_locations

2019-03-13 Thread Eleni Maria Stea
Implemented the requirements from the VK_EXT_sample_locations extension
specification to allow setting custom sample locations on Intel Gen >= 7.

Some decisions explained:

The grid size was set to 1x1 because the hardware only supports a single
set of sample locations for the whole framebuffer.

The user can set custom sample locations in two ways only: per pipeline,
by filling in the structs provided by the extension, or dynamically, as
described in sections 26.5, 36.1 and 36.2 of the Vulkan specification.

Sections 6.7.3 and 7.4 describe how to use sample locations with images
when a layout transition is about to take place. These sections were
ignored as currently we aren't using sample locations with images in the
driver.

Variable sample locations aren't required and have not been implemented.

We have 754 vk-gl-cts tests for this extension:
690 of them pass on Gen >= 9 (where we can support 16 samples).
The remaining 64 tests aren't supported because they test variable
sample locations.

Eleni Maria Stea (9):
  anv: Added the VK_EXT_sample_locations extension to the anv_extensions
list
  anv: Set the values for the
VkPhysicalDeviceSampleLocationsPropertiesEXT
  anv: Implemented the vkGetPhysicalDeviceMultisamplePropertiesEXT
  anv: Added support for non-dynamic sample locations on Gen8+
  anv: Added support for dynamic sample locations on Gen8+
  anv: Added support for dynamic and non-dynamic sample locations on
Gen7
  anv: Optimized the emission of the default locations on Gen8+
  anv: Removed unused header file
  anv: Enabled the VK_EXT_sample_locations extension

 src/intel/Makefile.sources  |   1 +
 src/intel/common/gen_sample_positions.h |  53 ++
 src/intel/vulkan/anv_cmd_buffer.c   |  19 
 src/intel/vulkan/anv_device.c   |  21 
 src/intel/vulkan/anv_extensions.py  |   1 +
 src/intel/vulkan/anv_genX.h |   7 ++
 src/intel/vulkan/anv_private.h  |  18 
 src/intel/vulkan/anv_sample_locations.c |  96 ++
 src/intel/vulkan/anv_sample_locations.h |  29 ++
 src/intel/vulkan/genX_blorp_exec.c  |   1 -
 src/intel/vulkan/genX_cmd_buffer.c  |  24 +
 src/intel/vulkan/genX_pipeline.c|  92 +
 src/intel/vulkan/genX_state.c   | 128 
 src/intel/vulkan/meson.build|   1 +
 14 files changed, 450 insertions(+), 41 deletions(-)
 create mode 100644 src/intel/vulkan/anv_sample_locations.c
 create mode 100644 src/intel/vulkan/anv_sample_locations.h

-- 
2.20.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 2/9] anv: Set the values for the VkPhysicalDeviceSampleLocationsPropertiesEXT

2019-03-12 Thread Eleni Maria Stea
On Mon, 11 Mar 2019 11:39:58 -0700
Sagar Ghuge  wrote:

> On Mon, 2019-03-11 at 17:04 +0200, Eleni Maria Stea wrote:
> > The VkPhysicalDeviceSampleLocationsPropertiesEXT struct is filled
> > with implementation dependent values and according to the table
> > from the Vulkan Specification section [36.1. Limit Requirements]:
> > 
> > pname                               | max           | min
> > pname:sampleLocationSampleCounts    | -             | ename:VK_SAMPLE_COUNT_4_BIT
> > pname:maxSampleLocationGridSize     | -             | (1, 1)
> > pname:sampleLocationCoordinateRange | (0.0, 0.9375) | (0.0, 0.9375)
> > pname:sampleLocationSubPixelBits    | -             | 4
> > pname:variableSampleLocations       | false         | implementation dependent
> > 
> > The hardware only supports setting the same sample location for all
> > the pixels, so we only support 1x1 grids.
> > 
> > Also, variableSampleLocations is set to false because we don't
> > support the feature.
> > ---
> >  src/intel/vulkan/anv_device.c  | 21 +
> >  src/intel/vulkan/anv_private.h |  3 +++
> >  2 files changed, 24 insertions(+)
> > 
> > diff --git a/src/intel/vulkan/anv_device.c
> > b/src/intel/vulkan/anv_device.c
> > index 729cceb3e32..1e183b7f4ad 100644
> > --- a/src/intel/vulkan/anv_device.c
> > +++ b/src/intel/vulkan/anv_device.c
> > @@ -1401,6 +1401,27 @@ void anv_GetPhysicalDeviceProperties2(
> >   break;
> >}
> >  
> > +  case VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_SAMPLE_LOCATIONS_PROPERTIES_EXT: {
> > + VkPhysicalDeviceSampleLocationsPropertiesEXT *props =
> > +(VkPhysicalDeviceSampleLocationsPropertiesEXT *)ext;
> > + props->sampleLocationSampleCounts = ISL_SAMPLE_COUNT_2_BIT |
> > + ISL_SAMPLE_COUNT_4_BIT |
> > + ISL_SAMPLE_COUNT_8_BIT;
> > + if (pdevice->info.gen >= 9)
> > +props->sampleLocationSampleCounts |= ISL_SAMPLE_COUNT_16_BIT;
> 
> Hi Eleni,
> 
> Thanks for the series.
> 
> The "isl_device_get_sample_counts" method figures out the values
> according to the platform, so maybe we can make use of it and ignore
> ISL_SAMPLE_COUNT_1_BIT. That way we don't have to handle the
> per-platform values here.
> 
> I am not sure about this, so it might be a good idea to consult with
> Jason/Lionel once. :)

I think that not only are you right here, but on top of that we
shouldn't ignore ISL_SAMPLE_COUNT_1_BIT, as we can still write one
user-defined location when only 1 sample per pixel is used (at least
the MULTISAMPLE and SAMPLE_PATTERN commands allow us to do so). So, I've
made the change, thank you. :)
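
The sample-count mask discussed above can be sketched in plain C as
follows. This is only an illustration of the idea from this thread (1x is
included, 16x only on Gen >= 9). The function
sketch_sample_location_counts and the SKETCH_* constants are hypothetical
stand-ins, not driver code, and the real isl_device_get_sample_counts
helper may report different counts on some platforms.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins with the same numeric values as VK_SAMPLE_COUNT_*_BIT. */
enum {
   SKETCH_SAMPLE_COUNT_1_BIT  = 0x01,
   SKETCH_SAMPLE_COUNT_2_BIT  = 0x02,
   SKETCH_SAMPLE_COUNT_4_BIT  = 0x04,
   SKETCH_SAMPLE_COUNT_8_BIT  = 0x08,
   SKETCH_SAMPLE_COUNT_16_BIT = 0x10,
};

/* Hypothetical helper: build the mask of sample counts for which custom
 * sample locations would be advertised, given the hardware generation. */
static uint32_t
sketch_sample_location_counts(int gen)
{
   assert(gen >= 7); /* the thread states that anv supports Gen7+ only */

   uint32_t counts = SKETCH_SAMPLE_COUNT_1_BIT |
                     SKETCH_SAMPLE_COUNT_2_BIT |
                     SKETCH_SAMPLE_COUNT_4_BIT |
                     SKETCH_SAMPLE_COUNT_8_BIT;

   /* 16 samples per pixel are only available on Gen >= 9. */
   if (gen >= 9)
      counts |= SKETCH_SAMPLE_COUNT_16_BIT;

   return counts;
}
```

Keeping the 1x bit in the mask matches the point above that a single
user-defined location is still meaningful without multisampling.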

> 
> with or without the fix, this patch is:
> 
> Reviewed-by: Sagar Ghuge 
> 

Thanks for the review!
Eleni

[Mesa-dev] [PATCH v2 8/9] anv: Removed unused header file

2019-03-12 Thread Eleni Maria Stea
In src/intel/vulkan/genX_blorp_exec.c we included the file
common/gen_sample_positions.h but did not use it. Removed the include.
---
 src/intel/vulkan/genX_blorp_exec.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/src/intel/vulkan/genX_blorp_exec.c b/src/intel/vulkan/genX_blorp_exec.c
index e9c85d56d5f..0eeefaaa9d6 100644
--- a/src/intel/vulkan/genX_blorp_exec.c
+++ b/src/intel/vulkan/genX_blorp_exec.c
@@ -31,7 +31,6 @@
 #undef __gen_combine_address
 
 #include "common/gen_l3_config.h"
-#include "common/gen_sample_positions.h"
 #include "blorp/blorp_genX_exec.h"
 
 static void *
-- 
2.20.1


[Mesa-dev] [PATCH v2 7/9] anv: Optimized the emission of the default locations on Gen8+

2019-03-12 Thread Eleni Maria Stea
We only emit sample locations when the extension is enabled by the user.
In all other cases the default locations are emitted only once, when the
device is initialized, to improve performance.
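
A minimal sketch of that emission policy, for illustration only: the
default pattern is emitted once at device initialization, and the
per-pipeline path re-emits the pattern only when the extension is
enabled. All names here (sketch_batch, sketch_init_device_state,
sketch_emit_ms_state) are hypothetical, and the counters merely stand in
for hardware packets.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical batch that only counts the packets written to it. */
struct sketch_batch {
   uint32_t multisample_packets;
   uint32_t sample_pattern_packets;
};

/* Device initialization: the default sample pattern is emitted once. */
static void
sketch_init_device_state(struct sketch_batch *batch)
{
   batch->sample_pattern_packets++;
}

/* Pipeline creation or dynamic state flush: the multisample state is
 * always emitted, the sample pattern only when the extension is on. */
static void
sketch_emit_ms_state(struct sketch_batch *batch,
                     bool sample_locations_ext_enabled)
{
   batch->multisample_packets++;

   if (sample_locations_ext_enabled)
      batch->sample_pattern_packets++;
}
```

With the extension disabled, creating any number of pipelines leaves the
pattern counter at the single emission done at initialization.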
---
 src/intel/vulkan/anv_genX.h|  3 ++-
 src/intel/vulkan/genX_cmd_buffer.c |  2 +-
 src/intel/vulkan/genX_pipeline.c   | 11 +++
 src/intel/vulkan/genX_state.c  |  8 +---
 4 files changed, 15 insertions(+), 9 deletions(-)

diff --git a/src/intel/vulkan/anv_genX.h b/src/intel/vulkan/anv_genX.h
index e82d83465ef..7f33a2b0a68 100644
--- a/src/intel/vulkan/anv_genX.h
+++ b/src/intel/vulkan/anv_genX.h
@@ -93,4 +93,5 @@ void genX(emit_ms_state)(struct anv_batch *batch,
  struct anv_sample *anv_samples,
  uint32_t num_samples,
  uint32_t log2_samples,
- bool custom_sample_locations);
+ bool custom_sample_locations,
+ bool sample_locations_ext_enabled);
diff --git a/src/intel/vulkan/genX_cmd_buffer.c b/src/intel/vulkan/genX_cmd_buffer.c
index 4752c66f350..ae7c5a80a3c 100644
--- a/src/intel/vulkan/genX_cmd_buffer.c
+++ b/src/intel/vulkan/genX_cmd_buffer.c
@@ -2654,7 +2654,7 @@ cmd_buffer_emit_sample_locations(struct anv_cmd_buffer *cmd_buffer)
anv_samples = cmd_buffer->state.gfx.dynamic.sample_locations.anv_samples;
 
genX(emit_ms_state)(&cmd_buffer->batch, anv_samples, samples,
-   log2_samples, true);
+   log2_samples, true, true);
 }
 
 void
diff --git a/src/intel/vulkan/genX_pipeline.c b/src/intel/vulkan/genX_pipeline.c
index 8afc08f0320..12adfa65da8 100644
--- a/src/intel/vulkan/genX_pipeline.c
+++ b/src/intel/vulkan/genX_pipeline.c
@@ -573,10 +573,12 @@ emit_sample_mask(struct anv_pipeline *pipeline,
 }
 
 static void
-emit_ms_state(struct anv_pipeline *pipeline,
+emit_ms_state(struct anv_device *device,
+  struct anv_pipeline *pipeline,
   const VkPipelineMultisampleStateCreateInfo *info,
   const VkPipelineDynamicStateCreateInfo *dinfo)
 {
+   bool sample_loc_enabled = device->enabled_extensions.EXT_sample_locations;
struct anv_sample anv_samples[MAX_SAMPLE_LOCATIONS];
VkSampleLocationsInfoEXT *sl;
bool custom_locations = false;
@@ -588,7 +590,7 @@ emit_ms_state(struct anv_pipeline *pipeline,
if (info) {
   samples = info->rasterizationSamples;
 
-  if (info->pNext) {
+  if (sample_loc_enabled && info->pNext) {
  VkPipelineSampleLocationsStateCreateInfoEXT *slinfo =
 (VkPipelineSampleLocationsStateCreateInfoEXT *)info->pNext;
 
@@ -617,7 +619,7 @@ emit_ms_state(struct anv_pipeline *pipeline,
}
 
genX(emit_ms_state)(&pipeline->batch, anv_samples, samples, log2_samples,
-   custom_locations);
+   custom_locations, sample_loc_enabled);
 }
 
 static const uint32_t vk_to_gen_logic_op[] = {
@@ -1947,7 +1949,8 @@ genX(graphics_pipeline_create)(
assert(pCreateInfo->pRasterizationState);
emit_rs_state(pipeline, pCreateInfo->pRasterizationState,
  pCreateInfo->pMultisampleState, pass, subpass);
-   emit_ms_state(pipeline, pCreateInfo->pMultisampleState, pCreateInfo->pDynamicState);
+   emit_ms_state(device, pipeline, pCreateInfo->pMultisampleState,
+ pCreateInfo->pDynamicState);
emit_ds_state(pipeline, pCreateInfo->pDepthStencilState, pass, subpass);
emit_cb_state(pipeline, pCreateInfo->pColorBlendState,
pCreateInfo->pMultisampleState);
diff --git a/src/intel/vulkan/genX_state.c b/src/intel/vulkan/genX_state.c
index 804cfab3a56..bc6b5870d8d 100644
--- a/src/intel/vulkan/genX_state.c
+++ b/src/intel/vulkan/genX_state.c
@@ -552,12 +552,14 @@ genX(emit_ms_state)(struct anv_batch *batch,
   struct anv_sample *anv_samples,
   uint32_t num_samples,
   uint32_t log2_samples,
-  bool custom_sample_locations)
+  bool custom_sample_locations,
+  bool sample_locations_ext_enabled)
 {
emit_multisample(batch, anv_samples, num_samples, log2_samples,
 custom_sample_locations);
 #if GEN_GEN >= 8
-   emit_sample_locations(batch, anv_samples, num_samples,
- custom_sample_locations);
+   if (sample_locations_ext_enabled)
+  emit_sample_locations(batch, anv_samples, num_samples,
+custom_sample_locations);
 #endif
 }
-- 
2.20.1


[Mesa-dev] [PATCH v2 6/9] anv: Added support for dynamic and non-dynamic sample locations on Gen7

2019-03-12 Thread Eleni Maria Stea
Allow setting both dynamic and non-dynamic sample locations on Gen7.
---
 src/intel/vulkan/anv_genX.h| 13 ++---
 src/intel/vulkan/genX_cmd_buffer.c |  9 ++--
 src/intel/vulkan/genX_pipeline.c   | 13 +
 src/intel/vulkan/genX_state.c  | 86 +-
 4 files changed, 70 insertions(+), 51 deletions(-)

diff --git a/src/intel/vulkan/anv_genX.h b/src/intel/vulkan/anv_genX.h
index f84fe457152..e82d83465ef 100644
--- a/src/intel/vulkan/anv_genX.h
+++ b/src/intel/vulkan/anv_genX.h
@@ -89,11 +89,8 @@ void genX(cmd_buffer_mi_memset)(struct anv_cmd_buffer *cmd_buffer,
 void genX(blorp_exec)(struct blorp_batch *batch,
   const struct blorp_params *params);
 
-void genX(emit_multisample)(struct anv_batch *batch,
-uint32_t samples,
-uint32_t log2_samples);
-
-void genX(emit_sample_locations)(struct anv_batch *batch,
- const struct anv_sample *anv_samples,
- uint32_t num_samples,
- bool custom_locations);
+void genX(emit_ms_state)(struct anv_batch *batch,
+ struct anv_sample *anv_samples,
+ uint32_t num_samples,
+ uint32_t log2_samples,
+ bool custom_sample_locations);
diff --git a/src/intel/vulkan/genX_cmd_buffer.c b/src/intel/vulkan/genX_cmd_buffer.c
index 9229df84caa..4752c66f350 100644
--- a/src/intel/vulkan/genX_cmd_buffer.c
+++ b/src/intel/vulkan/genX_cmd_buffer.c
@@ -2643,8 +2643,7 @@ cmd_buffer_flush_push_constants(struct anv_cmd_buffer *cmd_buffer,
 static void
 cmd_buffer_emit_sample_locations(struct anv_cmd_buffer *cmd_buffer)
 {
-#if GEN_GEN >= 8
-   const struct anv_sample *anv_samples;
+   struct anv_sample *anv_samples;
uint32_t log2_samples;
uint32_t samples;
 
@@ -2654,10 +2653,8 @@ cmd_buffer_emit_sample_locations(struct anv_cmd_buffer *cmd_buffer)
log2_samples = __builtin_ffs(samples) - 1;
anv_samples = cmd_buffer->state.gfx.dynamic.sample_locations.anv_samples;
 
-   genX(emit_multisample)(&cmd_buffer->batch, samples, log2_samples);
-   genX(emit_sample_locations)(&cmd_buffer->batch, anv_samples, samples,
-  true);
-#endif
+   genX(emit_ms_state)(&cmd_buffer->batch, anv_samples, samples,
+   log2_samples, true);
 }
 
 void
diff --git a/src/intel/vulkan/genX_pipeline.c b/src/intel/vulkan/genX_pipeline.c
index fa42e622077..8afc08f0320 100644
--- a/src/intel/vulkan/genX_pipeline.c
+++ b/src/intel/vulkan/genX_pipeline.c
@@ -577,12 +577,9 @@ emit_ms_state(struct anv_pipeline *pipeline,
   const VkPipelineMultisampleStateCreateInfo *info,
   const VkPipelineDynamicStateCreateInfo *dinfo)
 {
-#if GEN_GEN >= 8
struct anv_sample anv_samples[MAX_SAMPLE_LOCATIONS];
VkSampleLocationsInfoEXT *sl;
bool custom_locations = false;
-#endif
-
uint32_t samples = 1;
uint32_t log2_samples = 0;
 
@@ -591,7 +588,6 @@ emit_ms_state(struct anv_pipeline *pipeline,
if (info) {
   samples = info->rasterizationSamples;
 
-#if GEN_GEN >= 8
   if (info->pNext) {
  VkPipelineSampleLocationsStateCreateInfoEXT *slinfo =
 (VkPipelineSampleLocationsStateCreateInfoEXT *)info->pNext;
@@ -616,17 +612,12 @@ emit_ms_state(struct anv_pipeline *pipeline,
 }
  }
   }
-#endif
 
   log2_samples = __builtin_ffs(samples) - 1;
}
 
-   genX(emit_multisample(&pipeline->batch, samples, log2_samples));
-
-#if GEN_GEN >= 8
-   genX(emit_sample_locations)(&pipeline->batch, anv_samples, samples,
-   custom_locations);
-#endif
+   genX(emit_ms_state)(&pipeline->batch, anv_samples, samples, log2_samples,
+   custom_locations);
 }
 
 static const uint32_t vk_to_gen_logic_op[] = {
diff --git a/src/intel/vulkan/genX_state.c b/src/intel/vulkan/genX_state.c
index 44cfc925ed5..804cfab3a56 100644
--- a/src/intel/vulkan/genX_state.c
+++ b/src/intel/vulkan/genX_state.c
@@ -437,10 +437,12 @@ VkResult genX(CreateSampler)(
return VK_SUCCESS;
 }
 
-void
-genX(emit_multisample)(struct anv_batch *batch,
-   uint32_t samples,
-   uint32_t log2_samples)
+static void
+emit_multisample(struct anv_batch *batch,
+ const struct anv_sample *anv_samples,
+ uint32_t samples,
+ uint32_t log2_samples,
+ bool custom_locations)
 {
anv_batch_emit(batch, GENX(3DSTATE_MULTISAMPLE), ms) {
   ms.NumberofMultisamples = log2_samples;
@@ -453,34 +455,52 @@ genX(emit_multisample)(struct anv_batch *batch,
*/
   ms.PixelPositionOffsetEnable  = false;
 #else
-  switch (samples) {
-  case 1:
- GEN_SAMPLE_POS_1X(ms.Sample);
- break;
-  case 2:
- GEN_SAMPLE_POS_2X(ms.Sample);
- break;
-  case 4:
- GEN_SAMPLE_POS_4X(ms.Sample);
- break;
- 

[Mesa-dev] [PATCH v2 5/9] anv: Added support for dynamic sample locations on Gen8+

2019-03-12 Thread Eleni Maria Stea
Added support for setting the locations when the pipeline has been
created with the dynamic state bit enabled, according to the Vulkan
Specification section [26.5. Custom Sample Locations], for the function:

'vkCmdSetSampleLocationsEXT'

We store a boolean 'valid' inside the dynamic state struct for the
locations instead of using a dirty bit (ANV_CMD_DIRTY_SAMPLE_LOCATIONS,
for example), because other functions can modify the dirty bits and
cause unexpected behavior.
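
The design choice above can be sketched as follows. This is a simplified
illustration, not the driver code: every sketch_* name is hypothetical,
the 'dirty' field stands in for the shared dirty bits, and 'emit_count'
stands in for the packets written at flush time. Whether 'valid' should
be reset after a flush is deliberately left out of the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_MAX_SAMPLE_LOCATIONS 16

struct sketch_sample {
   float offs_x;
   float offs_y;
};

struct sketch_cmd_state {
   uint32_t dirty; /* shared dirty bits, also cleared by unrelated code */
   struct {
      struct sketch_sample samples[SKETCH_MAX_SAMPLE_LOCATIONS];
      uint32_t num_samples;
      bool valid; /* set once the application has provided locations */
   } sample_locations;
   uint32_t emit_count; /* number of times the locations were emitted */
};

/* Counterpart of a "set sample locations" command: record the locations
 * and mark them valid. The shared dirty bits are not touched. */
static void
sketch_set_sample_locations(struct sketch_cmd_state *state,
                            const struct sketch_sample *locations,
                            uint32_t count)
{
   assert(count > 0 && count <= SKETCH_MAX_SAMPLE_LOCATIONS);

   for (uint32_t i = 0; i < count; i++)
      state->sample_locations.samples[i] = locations[i];

   state->sample_locations.num_samples = count;
   state->sample_locations.valid = true;
}

/* Counterpart of the state flush: the decision to emit depends only on
 * the 'valid' flag, so clearing 'dirty' elsewhere cannot hide it. */
static void
sketch_flush_state(struct sketch_cmd_state *state)
{
   if (state->sample_locations.valid)
      state->emit_count++;

   state->dirty = 0;
}
```

Even if some other path clears the dirty bits between the set call and
the flush, the locations are still emitted because the flag lives with
the data it describes.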
---
 src/intel/vulkan/anv_cmd_buffer.c  | 19 
 src/intel/vulkan/anv_genX.h|  6 +++-
 src/intel/vulkan/anv_private.h |  6 
 src/intel/vulkan/genX_cmd_buffer.c | 27 ++
 src/intel/vulkan/genX_pipeline.c   | 46 --
 src/intel/vulkan/genX_state.c  | 41 +++---
 6 files changed, 99 insertions(+), 46 deletions(-)

diff --git a/src/intel/vulkan/anv_cmd_buffer.c b/src/intel/vulkan/anv_cmd_buffer.c
index 1b34644a434..101c1375430 100644
--- a/src/intel/vulkan/anv_cmd_buffer.c
+++ b/src/intel/vulkan/anv_cmd_buffer.c
@@ -28,6 +28,7 @@
 #include 
 
 #include "anv_private.h"
+#include "anv_sample_locations.h"
 
 #include "vk_format_info.h"
 #include "vk_util.h"
@@ -558,6 +559,24 @@ void anv_CmdSetStencilReference(
cmd_buffer->state.gfx.dirty |= ANV_CMD_DIRTY_DYNAMIC_STENCIL_REFERENCE;
 }
 
+void
+anv_CmdSetSampleLocationsEXT(VkCommandBuffer commandBuffer,
+ const VkSampleLocationsInfoEXT *pSampleLocationsInfo)
+{
+   ANV_FROM_HANDLE(anv_cmd_buffer, cmd_buffer, commandBuffer);
+   assert(pSampleLocationsInfo);
+
+   struct anv_dynamic_state *dyn_state = &cmd_buffer->state.gfx.dynamic;
+   dyn_state->sample_locations.num_samples =
+  pSampleLocationsInfo->sampleLocationsPerPixel;
+
+   anv_calc_sample_locations(dyn_state->sample_locations.anv_samples,
+ dyn_state->sample_locations.num_samples,
+ pSampleLocationsInfo);
+
+   cmd_buffer->state.gfx.dynamic.sample_locations.valid = true;
+}
+
 static void
 anv_cmd_buffer_bind_descriptor_set(struct anv_cmd_buffer *cmd_buffer,
VkPipelineBindPoint bind_point,
diff --git a/src/intel/vulkan/anv_genX.h b/src/intel/vulkan/anv_genX.h
index 52415c04a45..f84fe457152 100644
--- a/src/intel/vulkan/anv_genX.h
+++ b/src/intel/vulkan/anv_genX.h
@@ -89,7 +89,11 @@ void genX(cmd_buffer_mi_memset)(struct anv_cmd_buffer *cmd_buffer,
 void genX(blorp_exec)(struct blorp_batch *batch,
   const struct blorp_params *params);
 
+void genX(emit_multisample)(struct anv_batch *batch,
+uint32_t samples,
+uint32_t log2_samples);
+
 void genX(emit_sample_locations)(struct anv_batch *batch,
+ const struct anv_sample *anv_samples,
  uint32_t num_samples,
- const VkSampleLocationsInfoEXT *sl,
  bool custom_locations);
diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
index 981956e5706..a2e1756cd99 100644
--- a/src/intel/vulkan/anv_private.h
+++ b/src/intel/vulkan/anv_private.h
@@ -2135,6 +2135,12 @@ struct anv_dynamic_state {
   uint32_t  front;
   uint32_t  back;
} stencil_reference;
+
+   struct {
+  struct anv_sample anv_samples[MAX_SAMPLE_LOCATIONS];
+  uint32_t  num_samples;
+  bool  valid;
+   } sample_locations;
 };
 
 extern const struct anv_dynamic_state default_dynamic_state;
diff --git a/src/intel/vulkan/genX_cmd_buffer.c b/src/intel/vulkan/genX_cmd_buffer.c
index 7687507e6b7..9229df84caa 100644
--- a/src/intel/vulkan/genX_cmd_buffer.c
+++ b/src/intel/vulkan/genX_cmd_buffer.c
@@ -25,11 +25,13 @@
 #include 
 
 #include "anv_private.h"
+#include "anv_sample_locations.h"
 #include "vk_format_info.h"
 #include "vk_util.h"
 #include "util/fast_idiv_by_const.h"
 
 #include "common/gen_l3_config.h"
+#include "common/gen_sample_positions.h"
 #include "genxml/gen_macros.h"
 #include "genxml/genX_pack.h"
 
@@ -2638,6 +2640,26 @@ cmd_buffer_flush_push_constants(struct anv_cmd_buffer *cmd_buffer,
cmd_buffer->state.push_constants_dirty &= ~flushed;
 }
 
+static void
+cmd_buffer_emit_sample_locations(struct anv_cmd_buffer *cmd_buffer)
+{
+#if GEN_GEN >= 8
+   const struct anv_sample *anv_samples;
+   uint32_t log2_samples;
+   uint32_t samples;
+
+   samples = cmd_buffer->state.gfx.dynamic.sample_locations.num_samples;
+   assert(samples > 0);
+
+   log2_samples = __builtin_ffs(samples) - 1;
+   anv_samples = cmd_buffer->state.gfx.dynamic.sample_locations.anv_samples;
+
+   genX(emit_multisample)(&cmd_buffer->batch, samples, log2_samples);
+   genX(emit_sample_locations)(&cmd_buffer->batch, 

[Mesa-dev] [PATCH v2 2/9] anv: Set the values for the VkPhysicalDeviceSampleLocationsPropertiesEXT

2019-03-12 Thread Eleni Maria Stea
The VkPhysicalDeviceSampleLocationsPropertiesEXT struct is filled with
implementation dependent values and according to the table from the
Vulkan Specification section [36.1. Limit Requirements]:

pname                               | max           | min
pname:sampleLocationSampleCounts    | -             | ename:VK_SAMPLE_COUNT_4_BIT
pname:maxSampleLocationGridSize     | -             | (1, 1)
pname:sampleLocationCoordinateRange | (0.0, 0.9375) | (0.0, 0.9375)
pname:sampleLocationSubPixelBits    | -             | 4
pname:variableSampleLocations       | false         | implementation dependent

The hardware only supports setting the same sample location for all the
pixels, so we only support 1x1 grids.

Also, variableSampleLocations is set to false because we don't support the
feature.
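
To illustrate how the coordinate range and the sub-pixel bits relate:
with 4 sub-pixel bits there are 16 steps per pixel, so the largest
representable offset is 15/16 = 0.9375. The sketch below quantizes a
coordinate to that grid. The function name is hypothetical, and clamping
followed by truncation is an assumption made for the example, not
something taken from the specification or the driver.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_SUBPIXEL_BITS 4
#define SKETCH_COORD_MIN     0.0f
#define SKETCH_COORD_MAX     0.9375f /* 15/16 */

/* Hypothetical helper: clamp a sample coordinate to the advertised range
 * and convert it to a 4-bit fixed point value in 1/16 pixel units. */
static uint32_t
sketch_quantize_sample_coord(float coord)
{
   if (coord < SKETCH_COORD_MIN)
      coord = SKETCH_COORD_MIN;
   if (coord > SKETCH_COORD_MAX)
      coord = SKETCH_COORD_MAX;

   /* 1 << 4 == 16 steps per pixel; the cast truncates toward zero. */
   uint32_t fixed = (uint32_t)(coord * (float)(1u << SKETCH_SUBPIXEL_BITS));

   assert(fixed < (1u << SKETCH_SUBPIXEL_BITS));
   return fixed;
}
```

For example, 0.5 maps to step 8 and the upper bound 0.9375 maps to step
15, which is why the range stops short of 1.0.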

v2: 1- Replaced false with VK_FALSE for consistency. (Sagar Ghuge)
2- Used isl_device_get_sample_counts to obtain the sample counts
supported per platform and avoid extra checks. (Sagar Ghuge)

Reviewed-by: Sagar Ghuge 
---
 src/intel/vulkan/anv_device.c  | 19 +++
 src/intel/vulkan/anv_private.h |  3 +++
 2 files changed, 22 insertions(+)

diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
index 729cceb3e32..bf6f03ebb1a 100644
--- a/src/intel/vulkan/anv_device.c
+++ b/src/intel/vulkan/anv_device.c
@@ -1401,6 +1401,25 @@ void anv_GetPhysicalDeviceProperties2(
  break;
   }
 
+  case VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_SAMPLE_LOCATIONS_PROPERTIES_EXT: {
+ VkPhysicalDeviceSampleLocationsPropertiesEXT *props =
+(VkPhysicalDeviceSampleLocationsPropertiesEXT *)ext;
+
+ props->sampleLocationSampleCounts =
+isl_device_get_sample_counts(&pdevice->isl_dev);
+
+ props->maxSampleLocationGridSize.width = SAMPLE_LOC_GRID_W;
+ props->maxSampleLocationGridSize.height = SAMPLE_LOC_GRID_H;
+
+ props->sampleLocationCoordinateRange[0] = 0;
+ props->sampleLocationCoordinateRange[1] = 0.9375;
+ props->sampleLocationSubPixelBits = 4;
+
+ props->variableSampleLocations = VK_FALSE;
+
+ break;
+  }
+
   default:
  anv_debug_ignored_stype(ext->sType);
  break;
diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
index eed282ff985..5905299e59d 100644
--- a/src/intel/vulkan/anv_private.h
+++ b/src/intel/vulkan/anv_private.h
@@ -195,6 +195,9 @@ struct gen_l3_config;
 
 #define anv_printflike(a, b) __attribute__((__format__(__printf__, a, b)))
 
+#define SAMPLE_LOC_GRID_W 1
+#define SAMPLE_LOC_GRID_H 1
+
 static inline uint32_t
 align_down_npot_u32(uint32_t v, uint32_t a)
 {
-- 
2.20.1


[Mesa-dev] [PATCH v2 3/9] anv: Implemented the vkGetPhysicalDeviceMultisamplePropertiesEXT

2019-03-12 Thread Eleni Maria Stea
Implemented the vkGetPhysicalDeviceMultisamplePropertiesEXT according to
the Vulkan Specification section [36.2. Additional Multisampling
Capabilities].
---
 src/intel/Makefile.sources  |  1 +
 src/intel/vulkan/anv_sample_locations.c | 60 +
 src/intel/vulkan/meson.build|  1 +
 3 files changed, 62 insertions(+)
 create mode 100644 src/intel/vulkan/anv_sample_locations.c

diff --git a/src/intel/Makefile.sources b/src/intel/Makefile.sources
index a5c8828a6b6..a0873c7ccc2 100644
--- a/src/intel/Makefile.sources
+++ b/src/intel/Makefile.sources
@@ -251,6 +251,7 @@ VULKAN_FILES := \
vulkan/anv_pipeline_cache.c \
vulkan/anv_private.h \
vulkan/anv_queue.c \
+   vulkan/anv_sample_locations.c \
vulkan/anv_util.c \
vulkan/anv_wsi.c \
vulkan/vk_format_info.h
diff --git a/src/intel/vulkan/anv_sample_locations.c b/src/intel/vulkan/anv_sample_locations.c
new file mode 100644
index 000..1ebf280e05b
--- /dev/null
+++ b/src/intel/vulkan/anv_sample_locations.c
@@ -0,0 +1,60 @@
+/*
+ * Copyright © 2019 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ */
+
+#include "anv_private.h"
+
+void
+anv_GetPhysicalDeviceMultisamplePropertiesEXT(VkPhysicalDevice physicalDevice,
+  VkSampleCountFlagBits samples,
+  VkMultisamplePropertiesEXT
+  *pMultisampleProperties)
+{
+   ANV_FROM_HANDLE(anv_physical_device, physical_device, physicalDevice);
+   const struct gen_device_info *devinfo = &physical_device->info;
+
+   VkExtent2D grid_size;
+   switch (samples) {
+   case VK_SAMPLE_COUNT_2_BIT:
+   case VK_SAMPLE_COUNT_4_BIT:
+   case VK_SAMPLE_COUNT_8_BIT:
+  grid_size.width = SAMPLE_LOC_GRID_W;
+  grid_size.height = SAMPLE_LOC_GRID_H;
+  break;
+
+   case VK_SAMPLE_COUNT_16_BIT:
+  if (devinfo->gen >= 9) {
+ grid_size.width = SAMPLE_LOC_GRID_W;
+ grid_size.height = SAMPLE_LOC_GRID_H;
+ break;
+  } /* otherwise fall through: 16x needs Gen >= 9 */
+   default:
+  grid_size.width = grid_size.height = 0;
+  break;
+   };
+
+   *pMultisampleProperties = (VkMultisamplePropertiesEXT) {
+  .sType = VK_STRUCTURE_TYPE_MULTISAMPLE_PROPERTIES_EXT,
+  .pNext = NULL,
+  .maxSampleLocationGridSize = grid_size
+   };
+}
diff --git a/src/intel/vulkan/meson.build b/src/intel/vulkan/meson.build
index 7fa43a6ad79..3f78757c774 100644
--- a/src/intel/vulkan/meson.build
+++ b/src/intel/vulkan/meson.build
@@ -135,6 +135,7 @@ libanv_files = files(
   'anv_pipeline_cache.c',
   'anv_private.h',
   'anv_queue.c',
+  'anv_sample_locations.c',
   'anv_util.c',
   'anv_wsi.c',
   'vk_format_info.h',
-- 
2.20.1


[Mesa-dev] [PATCH v2 9/9] anv: Enabled the VK_EXT_sample_locations extension

2019-03-12 Thread Eleni Maria Stea
Enabled the VK_EXT_sample_locations extension for Intel Gen >= 7.

v2: Replaced device->info.gen >= 7 with True, as Anv doesn't support
anything below Gen7. (Lionel Landwerlin)
---
 src/intel/vulkan/anv_extensions.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/intel/vulkan/anv_extensions.py b/src/intel/vulkan/anv_extensions.py
index 9e4e03e46df..5a30c733c5c 100644
--- a/src/intel/vulkan/anv_extensions.py
+++ b/src/intel/vulkan/anv_extensions.py
@@ -129,7 +129,7 @@ EXTENSIONS = [
 Extension('VK_EXT_inline_uniform_block',  1, True),
 Extension('VK_EXT_pci_bus_info',  2, True),
 Extension('VK_EXT_post_depth_coverage',   1, 'device->info.gen >= 9'),
-Extension('VK_EXT_sample_locations',  1, False),
+Extension('VK_EXT_sample_locations',  1, True),
 Extension('VK_EXT_sampler_filter_minmax', 1, 'device->info.gen >= 9'),
 Extension('VK_EXT_scalar_block_layout',   1, True),
 Extension('VK_EXT_shader_stencil_export', 1, 'device->info.gen >= 9'),
-- 
2.20.1


[Mesa-dev] [PATCH v2 0/9] Implementation of the VK_EXT_sample_locations

2019-03-12 Thread Eleni Maria Stea
Implemented the requirements from the VK_EXT_sample_locations extension
specification to allow setting custom sample locations on Intel Gen >= 7.

Some decisions explained:

The grid size was set to 1x1 because the hardware only supports a single
set of sample locations for the whole framebuffer.

The user can set custom sample locations in two ways only: per pipeline,
by filling in the structs provided by the extension, or dynamically, as
described in sections 26.5, 36.1 and 36.2 of the Vulkan specification.

Sections 6.7.3 and 7.4 describe how to use sample locations with images
when a layout transition is about to take place. These sections were
ignored as currently we aren't using sample locations with images in the
driver.

Variable sample locations aren't required and have not been implemented.

We have 754 vk-gl-cts tests for this extension:
690 of them pass on Gen >= 9 (where we can support 16 samples).
The remaining 64 tests aren't supported because they test variable
sample locations.

Eleni Maria Stea (9):
  anv: Added the VK_EXT_sample_locations extension to the anv_extensions
list
  anv: Set the values for the
VkPhysicalDeviceSampleLocationsPropertiesEXT
  anv: Implemented the vkGetPhysicalDeviceMultisamplePropertiesEXT
  anv: Added support for non-dynamic sample locations on Gen8+
  anv: Added support for dynamic sample locations on Gen8+
  anv: Added support for dynamic and non-dynamic sample locations on
Gen7
  anv: Optimized the emission of the default locations on Gen8+
  anv: Removed unused header file
  anv: Enabled the VK_EXT_sample_locations extension

 src/intel/Makefile.sources  |   1 +
 src/intel/common/gen_sample_positions.h |  53 ++
 src/intel/vulkan/anv_cmd_buffer.c   |  19 
 src/intel/vulkan/anv_device.c   |  21 
 src/intel/vulkan/anv_extensions.py  |   1 +
 src/intel/vulkan/anv_genX.h |   7 ++
 src/intel/vulkan/anv_private.h  |  18 
 src/intel/vulkan/anv_sample_locations.c |  96 ++
 src/intel/vulkan/anv_sample_locations.h |  29 ++
 src/intel/vulkan/genX_blorp_exec.c  |   1 -
 src/intel/vulkan/genX_cmd_buffer.c  |  24 +
 src/intel/vulkan/genX_pipeline.c|  92 +
 src/intel/vulkan/genX_state.c   | 128 
 src/intel/vulkan/meson.build|   1 +
 14 files changed, 450 insertions(+), 41 deletions(-)
 create mode 100644 src/intel/vulkan/anv_sample_locations.c
 create mode 100644 src/intel/vulkan/anv_sample_locations.h

-- 
2.20.1


[Mesa-dev] [PATCH v2 4/9] anv: Added support for non-dynamic sample locations on Gen8+

2019-03-12 Thread Eleni Maria Stea
Allow the user to set custom sample locations non-dynamically, by
filling the extension structs and chaining them to the pipeline structs,
according to the Vulkan specification section [26.5. Custom Sample Locations],
for the following structures:

'VkPipelineSampleLocationsStateCreateInfoEXT'
'VkSampleLocationsInfoEXT'
'VkSampleLocationEXT'

Once custom locations are used, the default locations are lost and need to be
re-emitted at the next pipeline creation. For that reason, we emit
3DSTATE_SAMPLE_PATTERN at every pipeline creation.
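
"Chaining" here refers to the pNext pointer that links extension structs
to the base create-info struct. The sketch below shows one way to look up
a struct in such a chain by its type tag instead of assuming it is the
first entry. It uses hypothetical stand-in types (sketch_header,
sketch_sample_locations_state) and made-up SKETCH_STYPE_* values rather
than the real Vulkan definitions, and it is not the lookup used by this
patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the sType/pNext pair that every extensible
 * Vulkan struct starts with. */
struct sketch_header {
   int sType;
   const struct sketch_header *pNext;
};

enum {
   SKETCH_STYPE_MULTISAMPLE_STATE      = 1,
   SKETCH_STYPE_OTHER                  = 2,
   SKETCH_STYPE_SAMPLE_LOCATIONS_STATE = 3,
};

/* Stand-in for VkPipelineSampleLocationsStateCreateInfoEXT. */
struct sketch_sample_locations_state {
   struct sketch_header hdr;
   int sample_locations_enable;
};

/* Walk the chain starting at 'first' and return the first entry whose
 * sType matches, or NULL when there is no such entry. */
static const struct sketch_header *
sketch_find_in_chain(const struct sketch_header *first, int sType)
{
   for (const struct sketch_header *s = first; s != NULL; s = s->pNext) {
      if (s->sType == sType)
         return s;
   }
   return NULL;
}

/* Convenience wrapper: 'hdr' is the first member of the struct, so a
 * pointer to the header can be converted back to the full struct. */
static const struct sketch_sample_locations_state *
sketch_get_sample_locations(const struct sketch_header *first)
{
   return (const struct sketch_sample_locations_state *)
      sketch_find_in_chain(first, SKETCH_STYPE_SAMPLE_LOCATIONS_STATE);
}
```

A lookup by type keeps working when an application chains other
extension structs ahead of the sample locations one, and it returns NULL
when no custom locations were provided.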
---
 src/intel/common/gen_sample_positions.h | 53 
 src/intel/vulkan/anv_genX.h |  5 ++
 src/intel/vulkan/anv_private.h  |  9 +++
 src/intel/vulkan/anv_sample_locations.c | 38 +++-
 src/intel/vulkan/anv_sample_locations.h | 29 +
 src/intel/vulkan/genX_pipeline.c| 80 +
 src/intel/vulkan/genX_state.c   | 59 ++
 7 files changed, 259 insertions(+), 14 deletions(-)
 create mode 100644 src/intel/vulkan/anv_sample_locations.h

diff --git a/src/intel/common/gen_sample_positions.h b/src/intel/common/gen_sample_positions.h
index da48dcb5ed0..e8af2a552dc 100644
--- a/src/intel/common/gen_sample_positions.h
+++ b/src/intel/common/gen_sample_positions.h
@@ -160,4 +160,57 @@ prefix##14YOffset  = 0.9375; \
 prefix##15XOffset  = 0.0625; \
 prefix##15YOffset  = 0.;
 
+/* Examples:
+ * in case of GEN_GEN < 8:
+ * SET_SAMPLE_POS(ms.Sample, 0); expands to:
+ *ms.Sample0XOffset = anv_samples[0].offs_x;
+ *ms.Sample0YOffset = anv_samples[0].offs_y;
+ *
+ * in case of GEN_GEN >= 8:
+ * SET_SAMPLE_POS(sp._16xSample, 0); expands to:
+ *sp._16xSample0XOffset = anv_samples[0].offs_x;
+ *sp._16xSample0YOffset = anv_samples[0].offs_y;
+ */
+#define SET_SAMPLE_POS(prefix, sample_idx) \
+prefix##sample_idx##XOffset = anv_samples[sample_idx].offs_x; \
+prefix##sample_idx##YOffset = anv_samples[sample_idx].offs_y;
+
+#define SET_SAMPLE_POS_2X(prefix) \
+SET_SAMPLE_POS(prefix, 0); \
+SET_SAMPLE_POS(prefix, 1);
+
+#define SET_SAMPLE_POS_4X(prefix) \
+SET_SAMPLE_POS(prefix, 0); \
+SET_SAMPLE_POS(prefix, 1); \
+SET_SAMPLE_POS(prefix, 2); \
+SET_SAMPLE_POS(prefix, 3);
+
+#define SET_SAMPLE_POS_8X(prefix) \
+SET_SAMPLE_POS(prefix, 0); \
+SET_SAMPLE_POS(prefix, 1); \
+SET_SAMPLE_POS(prefix, 2); \
+SET_SAMPLE_POS(prefix, 3); \
+SET_SAMPLE_POS(prefix, 4); \
+SET_SAMPLE_POS(prefix, 5); \
+SET_SAMPLE_POS(prefix, 6); \
+SET_SAMPLE_POS(prefix, 7);
+
+#define SET_SAMPLE_POS_16X(prefix) \
+SET_SAMPLE_POS(prefix, 0); \
+SET_SAMPLE_POS(prefix, 1); \
+SET_SAMPLE_POS(prefix, 2); \
+SET_SAMPLE_POS(prefix, 3); \
+SET_SAMPLE_POS(prefix, 4); \
+SET_SAMPLE_POS(prefix, 5); \
+SET_SAMPLE_POS(prefix, 6); \
+SET_SAMPLE_POS(prefix, 7); \
+SET_SAMPLE_POS(prefix, 8); \
+SET_SAMPLE_POS(prefix, 9); \
+SET_SAMPLE_POS(prefix, 10); \
+SET_SAMPLE_POS(prefix, 11); \
+SET_SAMPLE_POS(prefix, 12); \
+SET_SAMPLE_POS(prefix, 13); \
+SET_SAMPLE_POS(prefix, 14); \
+SET_SAMPLE_POS(prefix, 15);
+
 #endif /* GEN_SAMPLE_POSITIONS_H */
diff --git a/src/intel/vulkan/anv_genX.h b/src/intel/vulkan/anv_genX.h
index 8fd32cabf1e..52415c04a45 100644
--- a/src/intel/vulkan/anv_genX.h
+++ b/src/intel/vulkan/anv_genX.h
@@ -88,3 +88,8 @@ void genX(cmd_buffer_mi_memset)(struct anv_cmd_buffer 
*cmd_buffer,
 
 void genX(blorp_exec)(struct blorp_batch *batch,
   const struct blorp_params *params);
+
+void genX(emit_sample_locations)(struct anv_batch *batch,
+ uint32_t num_samples,
+ const VkSampleLocationsInfoEXT *sl,
+ bool custom_locations);
diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
index 5905299e59d..981956e5706 100644
--- a/src/intel/vulkan/anv_private.h
+++ b/src/intel/vulkan/anv_private.h
@@ -71,6 +71,7 @@ struct anv_buffer;
 struct anv_buffer_view;
 struct anv_image_view;
 struct anv_instance;
+struct anv_sample;
 
 struct gen_l3_config;
 
@@ -165,6 +166,7 @@ struct gen_l3_config;
 #define MAX_PUSH_DESCRIPTORS 32 /* Minimum requirement */
 #define MAX_INLINE_UNIFORM_BLOCK_SIZE 4096
 #define MAX_INLINE_UNIFORM_BLOCK_DESCRIPTORS 32
+#define MAX_SAMPLE_LOCATIONS 16
 
 /* The kernel relocation API has a limitation of a 32-bit delta value
  * applied to the address before it is written which, in spite of it being
@@ -2086,6 +2088,13 @@ struct anv_push_constants {
struct brw_image_param images[MAX_GEN8_IMAGES];
 };
 
+struct
+anv_sample {
+   float offs_x;
+   float offs_y;
+   float radius;
+};
+
 struct anv_dynamic_state {
struct {
   uint32_t  count;
diff --git a/src/intel/vulkan/anv_sample_locations.c 
b/src/intel/vulkan/anv_sample_locations.c
index 1ebf280e05b..c660cb5ae84 100644
--- a/src/intel/vulkan/anv_sample_locations.c
+++ b/src/intel/vulkan/anv_sample_locations.c
@@ -21,7 +21,7 @@
  * IN THE SOFTWARE.
  */
 
-#include "anv_private.h"

[Mesa-dev] [PATCH v2 1/9] anv: Added the VK_EXT_sample_locations extension to the anv_extensions list

2019-03-12 Thread Eleni Maria Stea
Added VK_EXT_sample_locations to the anv_extensions.py list so that the
related entrypoints are generated.
---
 src/intel/vulkan/anv_extensions.py | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/intel/vulkan/anv_extensions.py 
b/src/intel/vulkan/anv_extensions.py
index 6fff293dee4..9e4e03e46df 100644
--- a/src/intel/vulkan/anv_extensions.py
+++ b/src/intel/vulkan/anv_extensions.py
@@ -129,6 +129,7 @@ EXTENSIONS = [
 Extension('VK_EXT_inline_uniform_block',  1, True),
 Extension('VK_EXT_pci_bus_info',  2, True),
 Extension('VK_EXT_post_depth_coverage',   1, 'device->info.gen >= 9'),
+Extension('VK_EXT_sample_locations',  1, False),
 Extension('VK_EXT_sampler_filter_minmax', 1, 'device->info.gen >= 9'),
 Extension('VK_EXT_scalar_block_layout',   1, True),
 Extension('VK_EXT_shader_stencil_export', 1, 'device->info.gen >= 9'),
-- 
2.20.1

[Mesa-dev] [PATCH 9/9] anv: Enabled the VK_EXT_sample_locations extension

2019-03-11 Thread Eleni Maria Stea
Enabled the VK_EXT_sample_locations extension for Intel Gen >= 7.
---
 src/intel/vulkan/anv_extensions.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/intel/vulkan/anv_extensions.py 
b/src/intel/vulkan/anv_extensions.py
index 9e4e03e46df..99007544732 100644
--- a/src/intel/vulkan/anv_extensions.py
+++ b/src/intel/vulkan/anv_extensions.py
@@ -129,7 +129,7 @@ EXTENSIONS = [
 Extension('VK_EXT_inline_uniform_block',  1, True),
 Extension('VK_EXT_pci_bus_info',  2, True),
 Extension('VK_EXT_post_depth_coverage',   1, 'device->info.gen >= 9'),
-Extension('VK_EXT_sample_locations',  1, False),
+Extension('VK_EXT_sample_locations',  1, 'device->info.gen >= 7'),
 Extension('VK_EXT_sampler_filter_minmax', 1, 'device->info.gen >= 9'),
 Extension('VK_EXT_scalar_block_layout',   1, True),
 Extension('VK_EXT_shader_stencil_export', 1, 'device->info.gen >= 9'),
-- 
2.20.1

[Mesa-dev] [PATCH 8/9] anv: Removed unused header file

2019-03-11 Thread Eleni Maria Stea
src/intel/vulkan/genX_blorp_exec.c included the file
common/gen_sample_positions.h but did not use it. Removed the include.
---
 src/intel/vulkan/genX_blorp_exec.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/src/intel/vulkan/genX_blorp_exec.c 
b/src/intel/vulkan/genX_blorp_exec.c
index e9c85d56d5f..0eeefaaa9d6 100644
--- a/src/intel/vulkan/genX_blorp_exec.c
+++ b/src/intel/vulkan/genX_blorp_exec.c
@@ -31,7 +31,6 @@
 #undef __gen_combine_address
 
 #include "common/gen_l3_config.h"
-#include "common/gen_sample_positions.h"
 #include "blorp/blorp_genX_exec.h"
 
 static void *
-- 
2.20.1

[Mesa-dev] [PATCH 7/9] anv: Optimized the emission of the default locations on Gen8+

2019-03-11 Thread Eleni Maria Stea
Emit the sample locations at pipeline creation only when the user has
enabled the extension. In all other cases the default locations are emitted
once, when the device is initialized, which avoids redundant emission.
---
 src/intel/vulkan/anv_genX.h|  3 ++-
 src/intel/vulkan/genX_cmd_buffer.c |  2 +-
 src/intel/vulkan/genX_pipeline.c   | 11 +++
 src/intel/vulkan/genX_state.c  |  8 +---
 4 files changed, 15 insertions(+), 9 deletions(-)

diff --git a/src/intel/vulkan/anv_genX.h b/src/intel/vulkan/anv_genX.h
index e82d83465ef..7f33a2b0a68 100644
--- a/src/intel/vulkan/anv_genX.h
+++ b/src/intel/vulkan/anv_genX.h
@@ -93,4 +93,5 @@ void genX(emit_ms_state)(struct anv_batch *batch,
  struct anv_sample *anv_samples,
  uint32_t num_samples,
  uint32_t log2_samples,
- bool custom_sample_locations);
+ bool custom_sample_locations,
+ bool sample_locations_ext_enabled);
diff --git a/src/intel/vulkan/genX_cmd_buffer.c 
b/src/intel/vulkan/genX_cmd_buffer.c
index 4752c66f350..ae7c5a80a3c 100644
--- a/src/intel/vulkan/genX_cmd_buffer.c
+++ b/src/intel/vulkan/genX_cmd_buffer.c
@@ -2654,7 +2654,7 @@ cmd_buffer_emit_sample_locations(struct anv_cmd_buffer 
*cmd_buffer)
anv_samples = cmd_buffer->state.gfx.dynamic.sample_locations.anv_samples;
 
genX(emit_ms_state)(&cmd_buffer->batch, anv_samples, samples,
-   log2_samples, true);
+   log2_samples, true, true);
 }
 
 void
diff --git a/src/intel/vulkan/genX_pipeline.c b/src/intel/vulkan/genX_pipeline.c
index 8afc08f0320..12adfa65da8 100644
--- a/src/intel/vulkan/genX_pipeline.c
+++ b/src/intel/vulkan/genX_pipeline.c
@@ -573,10 +573,12 @@ emit_sample_mask(struct anv_pipeline *pipeline,
 }
 
 static void
-emit_ms_state(struct anv_pipeline *pipeline,
+emit_ms_state(struct anv_device *device,
+  struct anv_pipeline *pipeline,
   const VkPipelineMultisampleStateCreateInfo *info,
   const VkPipelineDynamicStateCreateInfo *dinfo)
 {
+   bool sample_loc_enabled = device->enabled_extensions.EXT_sample_locations;
struct anv_sample anv_samples[MAX_SAMPLE_LOCATIONS];
VkSampleLocationsInfoEXT *sl;
bool custom_locations = false;
@@ -588,7 +590,7 @@ emit_ms_state(struct anv_pipeline *pipeline,
if (info) {
   samples = info->rasterizationSamples;
 
-  if (info->pNext) {
+  if (sample_loc_enabled && info->pNext) {
  VkPipelineSampleLocationsStateCreateInfoEXT *slinfo =
 (VkPipelineSampleLocationsStateCreateInfoEXT *)info->pNext;
 
@@ -617,7 +619,7 @@ emit_ms_state(struct anv_pipeline *pipeline,
}
 
genX(emit_ms_state)(&pipeline->batch, anv_samples, samples, log2_samples,
-   custom_locations);
+   custom_locations, sample_loc_enabled);
 }
 
 static const uint32_t vk_to_gen_logic_op[] = {
@@ -1947,7 +1949,8 @@ genX(graphics_pipeline_create)(
assert(pCreateInfo->pRasterizationState);
emit_rs_state(pipeline, pCreateInfo->pRasterizationState,
  pCreateInfo->pMultisampleState, pass, subpass);
-   emit_ms_state(pipeline, pCreateInfo->pMultisampleState, pCreateInfo->pDynamicState);
+   emit_ms_state(device, pipeline, pCreateInfo->pMultisampleState,
+ pCreateInfo->pDynamicState);
emit_ds_state(pipeline, pCreateInfo->pDepthStencilState, pass, subpass);
emit_cb_state(pipeline, pCreateInfo->pColorBlendState,
pCreateInfo->pMultisampleState);
diff --git a/src/intel/vulkan/genX_state.c b/src/intel/vulkan/genX_state.c
index 804cfab3a56..bc6b5870d8d 100644
--- a/src/intel/vulkan/genX_state.c
+++ b/src/intel/vulkan/genX_state.c
@@ -552,12 +552,14 @@ genX(emit_ms_state)(struct anv_batch *batch,
   struct anv_sample *anv_samples,
   uint32_t num_samples,
   uint32_t log2_samples,
-  bool custom_sample_locations)
+  bool custom_sample_locations,
+  bool sample_locations_ext_enabled)
 {
emit_multisample(batch, anv_samples, num_samples, log2_samples,
 custom_sample_locations);
 #if GEN_GEN >= 8
-   emit_sample_locations(batch, anv_samples, num_samples,
- custom_sample_locations);
+   if (sample_locations_ext_enabled)
+  emit_sample_locations(batch, anv_samples, num_samples,
+custom_sample_locations);
 #endif
 }
-- 
2.20.1

[Mesa-dev] [PATCH 3/9] anv: Implemented the vkGetPhysicalDeviceMultisamplePropertiesEXT

2019-03-11 Thread Eleni Maria Stea
Implemented vkGetPhysicalDeviceMultisamplePropertiesEXT according to
the Vulkan Specification section [36.2. Additional Multisampling
Capabilities].
---
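Note, not part of the commit: a minimal sketch of the mapping this patch
implements, written without the Vulkan headers. 'fake_extent2d' and
'fake_max_grid_size' are hypothetical names; plain integers replace the
VkSampleCountFlagBits values, and the 1x1 grid mirrors SAMPLE_LOC_GRID_W and
SAMPLE_LOC_GRID_H from the series.

```c
#include <assert.h>

struct fake_extent2d {
   unsigned width;
   unsigned height;
};

/* Return the maximum sample location grid for a sample count.
 * A 0x0 grid means custom locations are not supported for that count. */
static struct fake_extent2d
fake_max_grid_size(unsigned samples, int gen)
{
   const struct fake_extent2d unsupported = { 0, 0 };
   const struct fake_extent2d one_by_one = { 1, 1 };

   switch (samples) {
   case 2:
   case 4:
   case 8:
      return one_by_one;
   case 16:
      /* 16x is only reported on gen >= 9 in the patch. */
      return gen >= 9 ? one_by_one : unsupported;
   default:
      return unsupported;
   }
}
```

Returning from each case avoids relying on the fall-through from the 16x
case into the default case that the patch uses.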
 src/intel/Makefile.sources  |  1 +
 src/intel/vulkan/anv_sample_locations.c | 60 +
 src/intel/vulkan/meson.build|  1 +
 3 files changed, 62 insertions(+)
 create mode 100644 src/intel/vulkan/anv_sample_locations.c

diff --git a/src/intel/Makefile.sources b/src/intel/Makefile.sources
index a5c8828a6b6..a0873c7ccc2 100644
--- a/src/intel/Makefile.sources
+++ b/src/intel/Makefile.sources
@@ -251,6 +251,7 @@ VULKAN_FILES := \
vulkan/anv_pipeline_cache.c \
vulkan/anv_private.h \
vulkan/anv_queue.c \
+   vulkan/anv_sample_locations.c \
vulkan/anv_util.c \
vulkan/anv_wsi.c \
vulkan/vk_format_info.h
diff --git a/src/intel/vulkan/anv_sample_locations.c 
b/src/intel/vulkan/anv_sample_locations.c
new file mode 100644
index 000..1ebf280e05b
--- /dev/null
+++ b/src/intel/vulkan/anv_sample_locations.c
@@ -0,0 +1,60 @@
+/*
+ * Copyright © 2019 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ */
+
+#include "anv_private.h"
+
+void
+anv_GetPhysicalDeviceMultisamplePropertiesEXT(VkPhysicalDevice physicalDevice,
+  VkSampleCountFlagBits samples,
+  VkMultisamplePropertiesEXT
+  *pMultisampleProperties)
+{
+   ANV_FROM_HANDLE(anv_physical_device, physical_device, physicalDevice);
+   const struct gen_device_info *devinfo = &physical_device->info;
+
+   VkExtent2D grid_size;
+   switch (samples) {
+   case VK_SAMPLE_COUNT_2_BIT:
+   case VK_SAMPLE_COUNT_4_BIT:
+   case VK_SAMPLE_COUNT_8_BIT:
+  grid_size.width = SAMPLE_LOC_GRID_W;
+  grid_size.height = SAMPLE_LOC_GRID_H;
+  break;
+
+   case VK_SAMPLE_COUNT_16_BIT:
+  if (devinfo->gen >= 9) {
+ grid_size.width = SAMPLE_LOC_GRID_W;
+ grid_size.height = SAMPLE_LOC_GRID_H;
+ break;
+  }
+   default:
+  grid_size.width = grid_size.height = 0;
+  break;
+   };
+
+   *pMultisampleProperties = (VkMultisamplePropertiesEXT) {
+  .sType = VK_STRUCTURE_TYPE_MULTISAMPLE_PROPERTIES_EXT,
+  .pNext = NULL,
+  .maxSampleLocationGridSize = grid_size
+   };
+}
diff --git a/src/intel/vulkan/meson.build b/src/intel/vulkan/meson.build
index 7fa43a6ad79..3f78757c774 100644
--- a/src/intel/vulkan/meson.build
+++ b/src/intel/vulkan/meson.build
@@ -135,6 +135,7 @@ libanv_files = files(
   'anv_pipeline_cache.c',
   'anv_private.h',
   'anv_queue.c',
+  'anv_sample_locations.c',
   'anv_util.c',
   'anv_wsi.c',
   'vk_format_info.h',
-- 
2.20.1

[Mesa-dev] [PATCH 6/9] anv: Added support for dynamic and non-dynamic sample locations on Gen7

2019-03-11 Thread Eleni Maria Stea
Allow setting both dynamic and non-dynamic sample locations on Gen7.
---
 src/intel/vulkan/anv_genX.h| 13 ++---
 src/intel/vulkan/genX_cmd_buffer.c |  9 ++--
 src/intel/vulkan/genX_pipeline.c   | 13 +
 src/intel/vulkan/genX_state.c  | 86 +-
 4 files changed, 70 insertions(+), 51 deletions(-)

diff --git a/src/intel/vulkan/anv_genX.h b/src/intel/vulkan/anv_genX.h
index f84fe457152..e82d83465ef 100644
--- a/src/intel/vulkan/anv_genX.h
+++ b/src/intel/vulkan/anv_genX.h
@@ -89,11 +89,8 @@ void genX(cmd_buffer_mi_memset)(struct anv_cmd_buffer 
*cmd_buffer,
 void genX(blorp_exec)(struct blorp_batch *batch,
   const struct blorp_params *params);
 
-void genX(emit_multisample)(struct anv_batch *batch,
-uint32_t samples,
-uint32_t log2_samples);
-
-void genX(emit_sample_locations)(struct anv_batch *batch,
- const struct anv_sample *anv_samples,
- uint32_t num_samples,
- bool custom_locations);
+void genX(emit_ms_state)(struct anv_batch *batch,
+ struct anv_sample *anv_samples,
+ uint32_t num_samples,
+ uint32_t log2_samples,
+ bool custom_sample_locations);
diff --git a/src/intel/vulkan/genX_cmd_buffer.c 
b/src/intel/vulkan/genX_cmd_buffer.c
index 9229df84caa..4752c66f350 100644
--- a/src/intel/vulkan/genX_cmd_buffer.c
+++ b/src/intel/vulkan/genX_cmd_buffer.c
@@ -2643,8 +2643,7 @@ cmd_buffer_flush_push_constants(struct anv_cmd_buffer 
*cmd_buffer,
 static void
 cmd_buffer_emit_sample_locations(struct anv_cmd_buffer *cmd_buffer)
 {
-#if GEN_GEN >= 8
-   const struct anv_sample *anv_samples;
+   struct anv_sample *anv_samples;
uint32_t log2_samples;
uint32_t samples;
 
@@ -2654,10 +2653,8 @@ cmd_buffer_emit_sample_locations(struct anv_cmd_buffer 
*cmd_buffer)
log2_samples = __builtin_ffs(samples) - 1;
anv_samples = cmd_buffer->state.gfx.dynamic.sample_locations.anv_samples;
 
-   genX(emit_multisample)(&cmd_buffer->batch, samples, log2_samples);
-   genX(emit_sample_locations)(&cmd_buffer->batch, anv_samples, samples,
-  true);
-#endif
+   genX(emit_ms_state)(&cmd_buffer->batch, anv_samples, samples,
+   log2_samples, true);
 }
 
 void
diff --git a/src/intel/vulkan/genX_pipeline.c b/src/intel/vulkan/genX_pipeline.c
index fa42e622077..8afc08f0320 100644
--- a/src/intel/vulkan/genX_pipeline.c
+++ b/src/intel/vulkan/genX_pipeline.c
@@ -577,12 +577,9 @@ emit_ms_state(struct anv_pipeline *pipeline,
   const VkPipelineMultisampleStateCreateInfo *info,
   const VkPipelineDynamicStateCreateInfo *dinfo)
 {
-#if GEN_GEN >= 8
struct anv_sample anv_samples[MAX_SAMPLE_LOCATIONS];
VkSampleLocationsInfoEXT *sl;
bool custom_locations = false;
-#endif
-
uint32_t samples = 1;
uint32_t log2_samples = 0;
 
@@ -591,7 +588,6 @@ emit_ms_state(struct anv_pipeline *pipeline,
if (info) {
   samples = info->rasterizationSamples;
 
-#if GEN_GEN >= 8
   if (info->pNext) {
  VkPipelineSampleLocationsStateCreateInfoEXT *slinfo =
 (VkPipelineSampleLocationsStateCreateInfoEXT *)info->pNext;
@@ -616,17 +612,12 @@ emit_ms_state(struct anv_pipeline *pipeline,
 }
  }
   }
-#endif
 
   log2_samples = __builtin_ffs(samples) - 1;
}
 
-   genX(emit_multisample(&pipeline->batch, samples, log2_samples));
-
-#if GEN_GEN >= 8
-   genX(emit_sample_locations)(&pipeline->batch, anv_samples, samples,
-   custom_locations);
-#endif
+   genX(emit_ms_state)(&pipeline->batch, anv_samples, samples, log2_samples,
+   custom_locations);
 }
 
 static const uint32_t vk_to_gen_logic_op[] = {
diff --git a/src/intel/vulkan/genX_state.c b/src/intel/vulkan/genX_state.c
index 44cfc925ed5..804cfab3a56 100644
--- a/src/intel/vulkan/genX_state.c
+++ b/src/intel/vulkan/genX_state.c
@@ -437,10 +437,12 @@ VkResult genX(CreateSampler)(
return VK_SUCCESS;
 }
 
-void
-genX(emit_multisample)(struct anv_batch *batch,
-   uint32_t samples,
-   uint32_t log2_samples)
+static void
+emit_multisample(struct anv_batch *batch,
+ const struct anv_sample *anv_samples,
+ uint32_t samples,
+ uint32_t log2_samples,
+ bool custom_locations)
 {
anv_batch_emit(batch, GENX(3DSTATE_MULTISAMPLE), ms) {
   ms.NumberofMultisamples = log2_samples;
@@ -453,34 +455,52 @@ genX(emit_multisample)(struct anv_batch *batch,
*/
   ms.PixelPositionOffsetEnable  = false;
 #else
-  switch (samples) {
-  case 1:
- GEN_SAMPLE_POS_1X(ms.Sample);
- break;
-  case 2:
- GEN_SAMPLE_POS_2X(ms.Sample);
- break;
-  case 4:
- GEN_SAMPLE_POS_4X(ms.Sample);
- break;
- 

[Mesa-dev] [PATCH 1/9] anv: Added the VK_EXT_sample_locations extension to the anv_extensions list

2019-03-11 Thread Eleni Maria Stea
Added VK_EXT_sample_locations to the anv_extensions.py list so that the
related entrypoints are generated.
---
 src/intel/vulkan/anv_extensions.py | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/intel/vulkan/anv_extensions.py 
b/src/intel/vulkan/anv_extensions.py
index 6fff293dee4..9e4e03e46df 100644
--- a/src/intel/vulkan/anv_extensions.py
+++ b/src/intel/vulkan/anv_extensions.py
@@ -129,6 +129,7 @@ EXTENSIONS = [
 Extension('VK_EXT_inline_uniform_block',  1, True),
 Extension('VK_EXT_pci_bus_info',  2, True),
 Extension('VK_EXT_post_depth_coverage',   1, 'device->info.gen >= 9'),
+Extension('VK_EXT_sample_locations',  1, False),
 Extension('VK_EXT_sampler_filter_minmax', 1, 'device->info.gen >= 9'),
 Extension('VK_EXT_scalar_block_layout',   1, True),
 Extension('VK_EXT_shader_stencil_export', 1, 'device->info.gen >= 9'),
-- 
2.20.1

[Mesa-dev] [PATCH 5/9] anv: Added support for dynamic sample locations on Gen8+

2019-03-11 Thread Eleni Maria Stea
Added support for setting the locations when the pipeline has been
created with the dynamic state bit enabled according to the Vulkan
Specification section [26.5. Custom Sample Locations] for the function:

'vkCmdSetSampleLocationsEXT'

We store a boolean 'valid' inside the dynamic state struct for the
locations instead of using a dirty bit (ANV_CMD_DIRTY_SAMPLE_LOCATIONS,
for example) because other functions can modify the dirty bits, which
would cause unexpected behavior.
---
 src/intel/vulkan/anv_cmd_buffer.c  | 19 
 src/intel/vulkan/anv_genX.h|  6 +++-
 src/intel/vulkan/anv_private.h |  6 
 src/intel/vulkan/genX_cmd_buffer.c | 27 ++
 src/intel/vulkan/genX_pipeline.c   | 46 --
 src/intel/vulkan/genX_state.c  | 41 +++---
 6 files changed, 99 insertions(+), 46 deletions(-)

diff --git a/src/intel/vulkan/anv_cmd_buffer.c 
b/src/intel/vulkan/anv_cmd_buffer.c
index 1b34644a434..101c1375430 100644
--- a/src/intel/vulkan/anv_cmd_buffer.c
+++ b/src/intel/vulkan/anv_cmd_buffer.c
@@ -28,6 +28,7 @@
 #include 
 
 #include "anv_private.h"
+#include "anv_sample_locations.h"
 
 #include "vk_format_info.h"
 #include "vk_util.h"
@@ -558,6 +559,24 @@ void anv_CmdSetStencilReference(
cmd_buffer->state.gfx.dirty |= ANV_CMD_DIRTY_DYNAMIC_STENCIL_REFERENCE;
 }
 
+void
+anv_CmdSetSampleLocationsEXT(VkCommandBuffer commandBuffer,
+ const VkSampleLocationsInfoEXT *pSampleLocationsInfo)
+{
+   ANV_FROM_HANDLE(anv_cmd_buffer, cmd_buffer, commandBuffer);
+   assert(pSampleLocationsInfo);
+
+   struct anv_dynamic_state *dyn_state = &cmd_buffer->state.gfx.dynamic;
+   dyn_state->sample_locations.num_samples =
+  pSampleLocationsInfo->sampleLocationsPerPixel;
+
+   anv_calc_sample_locations(dyn_state->sample_locations.anv_samples,
+ dyn_state->sample_locations.num_samples,
+ pSampleLocationsInfo);
+
+   cmd_buffer->state.gfx.dynamic.sample_locations.valid = true;
+}
+
 static void
 anv_cmd_buffer_bind_descriptor_set(struct anv_cmd_buffer *cmd_buffer,
VkPipelineBindPoint bind_point,
diff --git a/src/intel/vulkan/anv_genX.h b/src/intel/vulkan/anv_genX.h
index 52415c04a45..f84fe457152 100644
--- a/src/intel/vulkan/anv_genX.h
+++ b/src/intel/vulkan/anv_genX.h
@@ -89,7 +89,11 @@ void genX(cmd_buffer_mi_memset)(struct anv_cmd_buffer 
*cmd_buffer,
 void genX(blorp_exec)(struct blorp_batch *batch,
   const struct blorp_params *params);
 
+void genX(emit_multisample)(struct anv_batch *batch,
+uint32_t samples,
+uint32_t log2_samples);
+
 void genX(emit_sample_locations)(struct anv_batch *batch,
+ const struct anv_sample *anv_samples,
  uint32_t num_samples,
- const VkSampleLocationsInfoEXT *sl,
  bool custom_locations);
diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
index 981956e5706..a2e1756cd99 100644
--- a/src/intel/vulkan/anv_private.h
+++ b/src/intel/vulkan/anv_private.h
@@ -2135,6 +2135,12 @@ struct anv_dynamic_state {
   uint32_t  front;
   uint32_t  back;
} stencil_reference;
+
+   struct {
+  struct anv_sample anv_samples[MAX_SAMPLE_LOCATIONS];
+  uint32_t  num_samples;
+  bool  valid;
+   } sample_locations;
 };
 
 extern const struct anv_dynamic_state default_dynamic_state;
diff --git a/src/intel/vulkan/genX_cmd_buffer.c 
b/src/intel/vulkan/genX_cmd_buffer.c
index 7687507e6b7..9229df84caa 100644
--- a/src/intel/vulkan/genX_cmd_buffer.c
+++ b/src/intel/vulkan/genX_cmd_buffer.c
@@ -25,11 +25,13 @@
 #include 
 
 #include "anv_private.h"
+#include "anv_sample_locations.h"
 #include "vk_format_info.h"
 #include "vk_util.h"
 #include "util/fast_idiv_by_const.h"
 
 #include "common/gen_l3_config.h"
+#include "common/gen_sample_positions.h"
 #include "genxml/gen_macros.h"
 #include "genxml/genX_pack.h"
 
@@ -2638,6 +2640,26 @@ cmd_buffer_flush_push_constants(struct anv_cmd_buffer 
*cmd_buffer,
cmd_buffer->state.push_constants_dirty &= ~flushed;
 }
 
+static void
+cmd_buffer_emit_sample_locations(struct anv_cmd_buffer *cmd_buffer)
+{
+#if GEN_GEN >= 8
+   const struct anv_sample *anv_samples;
+   uint32_t log2_samples;
+   uint32_t samples;
+
+   samples = cmd_buffer->state.gfx.dynamic.sample_locations.num_samples;
+   assert(samples > 0);
+
+   log2_samples = __builtin_ffs(samples) - 1;
+   anv_samples = cmd_buffer->state.gfx.dynamic.sample_locations.anv_samples;
+
+   genX(emit_multisample)(&cmd_buffer->batch, samples, log2_samples);
+   genX(emit_sample_locations)(&cmd_buffer->batch, 

[Mesa-dev] [PATCH 4/9] anv: Added support for non-dynamic sample locations on Gen8+

2019-03-11 Thread Eleni Maria Stea
Allow the user to set custom sample locations non-dynamically by
filling the extension structs and chaining them to the pipeline structs,
as described in the Vulkan specification section [26.5. Custom Sample
Locations], using the following structures:

'VkPipelineSampleLocationsStateCreateInfoEXT'
'VkSampleLocationsInfoEXT'
'VkSampleLocationEXT'

Once custom locations are used, the default locations are lost and must be
re-emitted at the next pipeline creation. For that reason, we emit
3DSTATE_SAMPLE_PATTERN at every pipeline creation.
---
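Note, not part of the commit: a minimal sketch of the token pasting that
the SET_SAMPLE_POS macros in this patch rely on. 'fake_packet',
'fake_sample', 'FAKE_SET_POS' and 'fake_fill_2x' are hypothetical; the
field names only imitate the per-sample offset fields, and the macro takes
the sample array as an explicit argument instead of using a variable named
'anv_samples' from the enclosing scope.

```c
#include <assert.h>
#include <stddef.h>

struct fake_sample {
   float offs_x;
   float offs_y;
};

/* Hypothetical packet with one pair of offset fields per sample. */
struct fake_packet {
   float Sample0XOffset, Sample0YOffset;
   float Sample1XOffset, Sample1YOffset;
};

/* Paste prefix, index and suffix into a field name, for example
 * p->Sample + 0 + XOffset becomes p->Sample0XOffset. The do/while
 * wrapper makes the macro behave as a single statement. */
#define FAKE_SET_POS(prefix, idx, samples)             \
   do {                                                \
      prefix##idx##XOffset = (samples)[idx].offs_x;    \
      prefix##idx##YOffset = (samples)[idx].offs_y;    \
   } while (0)

static void
fake_fill_2x(struct fake_packet *p, const struct fake_sample *samples)
{
   assert(p != NULL && samples != NULL);
   FAKE_SET_POS(p->Sample, 0, samples);
   FAKE_SET_POS(p->Sample, 1, samples);
}
```

The index has to be a literal at the call site, because the preprocessor
pastes the token itself rather than a runtime value. That is why the patch
needs one macro per sample count.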
 src/intel/common/gen_sample_positions.h | 53 
 src/intel/vulkan/anv_genX.h |  5 ++
 src/intel/vulkan/anv_private.h  |  9 +++
 src/intel/vulkan/anv_sample_locations.c | 38 +++-
 src/intel/vulkan/anv_sample_locations.h | 29 +
 src/intel/vulkan/genX_pipeline.c| 80 +
 src/intel/vulkan/genX_state.c   | 59 ++
 7 files changed, 259 insertions(+), 14 deletions(-)
 create mode 100644 src/intel/vulkan/anv_sample_locations.h

diff --git a/src/intel/common/gen_sample_positions.h 
b/src/intel/common/gen_sample_positions.h
index da48dcb5ed0..e8af2a552dc 100644
--- a/src/intel/common/gen_sample_positions.h
+++ b/src/intel/common/gen_sample_positions.h
@@ -160,4 +160,57 @@ prefix##14YOffset  = 0.9375; \
 prefix##15XOffset  = 0.0625; \
 prefix##15YOffset  = 0.;
 
+/* Examples:
+ * in case of GEN_GEN < 8:
+ * SET_SAMPLE_POS(ms.Sample, 0); expands to:
+ *ms.Sample0XOffset = anv_samples[0].offs_x;
+ *ms.Sample0YOffset = anv_samples[0].offs_y;
+ *
+ * in case of GEN_GEN >= 8:
+ * SET_SAMPLE_POS(sp._16xSample, 0); expands to:
+ *sp._16xSample0XOffset = anv_samples[0].offs_x;
+ *sp._16xSample0YOffset = anv_samples[0].offs_y;
+ */
+#define SET_SAMPLE_POS(prefix, sample_idx) \
+prefix##sample_idx##XOffset = anv_samples[sample_idx].offs_x; \
+prefix##sample_idx##YOffset = anv_samples[sample_idx].offs_y;
+
+#define SET_SAMPLE_POS_2X(prefix) \
+SET_SAMPLE_POS(prefix, 0); \
+SET_SAMPLE_POS(prefix, 1);
+
+#define SET_SAMPLE_POS_4X(prefix) \
+SET_SAMPLE_POS(prefix, 0); \
+SET_SAMPLE_POS(prefix, 1); \
+SET_SAMPLE_POS(prefix, 2); \
+SET_SAMPLE_POS(prefix, 3);
+
+#define SET_SAMPLE_POS_8X(prefix) \
+SET_SAMPLE_POS(prefix, 0); \
+SET_SAMPLE_POS(prefix, 1); \
+SET_SAMPLE_POS(prefix, 2); \
+SET_SAMPLE_POS(prefix, 3); \
+SET_SAMPLE_POS(prefix, 4); \
+SET_SAMPLE_POS(prefix, 5); \
+SET_SAMPLE_POS(prefix, 6); \
+SET_SAMPLE_POS(prefix, 7);
+
+#define SET_SAMPLE_POS_16X(prefix) \
+SET_SAMPLE_POS(prefix, 0); \
+SET_SAMPLE_POS(prefix, 1); \
+SET_SAMPLE_POS(prefix, 2); \
+SET_SAMPLE_POS(prefix, 3); \
+SET_SAMPLE_POS(prefix, 4); \
+SET_SAMPLE_POS(prefix, 5); \
+SET_SAMPLE_POS(prefix, 6); \
+SET_SAMPLE_POS(prefix, 7); \
+SET_SAMPLE_POS(prefix, 8); \
+SET_SAMPLE_POS(prefix, 9); \
+SET_SAMPLE_POS(prefix, 10); \
+SET_SAMPLE_POS(prefix, 11); \
+SET_SAMPLE_POS(prefix, 12); \
+SET_SAMPLE_POS(prefix, 13); \
+SET_SAMPLE_POS(prefix, 14); \
+SET_SAMPLE_POS(prefix, 15);
+
 #endif /* GEN_SAMPLE_POSITIONS_H */
diff --git a/src/intel/vulkan/anv_genX.h b/src/intel/vulkan/anv_genX.h
index 8fd32cabf1e..52415c04a45 100644
--- a/src/intel/vulkan/anv_genX.h
+++ b/src/intel/vulkan/anv_genX.h
@@ -88,3 +88,8 @@ void genX(cmd_buffer_mi_memset)(struct anv_cmd_buffer 
*cmd_buffer,
 
 void genX(blorp_exec)(struct blorp_batch *batch,
   const struct blorp_params *params);
+
+void genX(emit_sample_locations)(struct anv_batch *batch,
+ uint32_t num_samples,
+ const VkSampleLocationsInfoEXT *sl,
+ bool custom_locations);
diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
index 5905299e59d..981956e5706 100644
--- a/src/intel/vulkan/anv_private.h
+++ b/src/intel/vulkan/anv_private.h
@@ -71,6 +71,7 @@ struct anv_buffer;
 struct anv_buffer_view;
 struct anv_image_view;
 struct anv_instance;
+struct anv_sample;
 
 struct gen_l3_config;
 
@@ -165,6 +166,7 @@ struct gen_l3_config;
 #define MAX_PUSH_DESCRIPTORS 32 /* Minimum requirement */
 #define MAX_INLINE_UNIFORM_BLOCK_SIZE 4096
 #define MAX_INLINE_UNIFORM_BLOCK_DESCRIPTORS 32
+#define MAX_SAMPLE_LOCATIONS 16
 
 /* The kernel relocation API has a limitation of a 32-bit delta value
  * applied to the address before it is written which, in spite of it being
@@ -2086,6 +2088,13 @@ struct anv_push_constants {
struct brw_image_param images[MAX_GEN8_IMAGES];
 };
 
+struct
+anv_sample {
+   float offs_x;
+   float offs_y;
+   float radius;
+};
+
 struct anv_dynamic_state {
struct {
   uint32_t  count;
diff --git a/src/intel/vulkan/anv_sample_locations.c 
b/src/intel/vulkan/anv_sample_locations.c
index 1ebf280e05b..c660cb5ae84 100644
--- a/src/intel/vulkan/anv_sample_locations.c
+++ b/src/intel/vulkan/anv_sample_locations.c
@@ -21,7 +21,7 @@
  * IN THE SOFTWARE.
  */
 
-#include "anv_private.h"

[Mesa-dev] [PATCH 2/9] anv: Set the values for the VkPhysicalDeviceSampleLocationsPropertiesEXT

2019-03-11 Thread Eleni Maria Stea
The VkPhysicalDeviceSampleLocationsPropertiesEXT struct is filled with
implementation-dependent values, following the table from the
Vulkan Specification section [36.1. Limit Requirements]:

pname                               | max           | min
pname:sampleLocationSampleCounts    | -             | ename:VK_SAMPLE_COUNT_4_BIT
pname:maxSampleLocationGridSize     | -             | (1, 1)
pname:sampleLocationCoordinateRange | (0.0, 0.9375) | (0.0, 0.9375)
pname:sampleLocationSubPixelBits    | -             | 4
pname:variableSampleLocations       | false         | implementation dependent

The hardware only supports setting the same sample location for all the
pixels, so we only support 1x1 grids.

Also, variableSampleLocations is set to false because we don't support the
feature.
---
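Note, not part of the commit: a minimal sketch relating the two limits
above. With 4 subpixel bits a coordinate has 16 steps of 1/16, so the
largest representable value is 15/16 = 0.9375, which matches the upper end
of the coordinate range. 'fake_snap_coord' is a hypothetical helper, and
truncating to the grid is an assumption; the patch does not say how the
hardware rounds.

```c
#include <assert.h>

#define FAKE_SUBPIXEL_BITS 4 /* matches sampleLocationSubPixelBits above */

/* Clamp a sample coordinate to [0, 15/16] and truncate it to a
 * multiple of 1/16. */
static float
fake_snap_coord(float v)
{
   const float steps = (float)(1u << FAKE_SUBPIXEL_BITS); /* 16 */
   const float max = (steps - 1.0f) / steps;              /* 0.9375 */

   if (v < 0.0f)
      v = 0.0f;
   if (v > max)
      v = max;

   return (float)(int)(v * steps) / steps;
}
```

For example, 0.3 falls between 4/16 and 5/16 and is truncated to 0.25,
while 1.0 is clamped to 0.9375.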
 src/intel/vulkan/anv_device.c  | 21 +
 src/intel/vulkan/anv_private.h |  3 +++
 2 files changed, 24 insertions(+)

diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
index 729cceb3e32..1e183b7f4ad 100644
--- a/src/intel/vulkan/anv_device.c
+++ b/src/intel/vulkan/anv_device.c
@@ -1401,6 +1401,27 @@ void anv_GetPhysicalDeviceProperties2(
  break;
   }
 
+  case VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_SAMPLE_LOCATIONS_PROPERTIES_EXT: {
+ VkPhysicalDeviceSampleLocationsPropertiesEXT *props =
+(VkPhysicalDeviceSampleLocationsPropertiesEXT *)ext;
+ props->sampleLocationSampleCounts = ISL_SAMPLE_COUNT_2_BIT |
+ ISL_SAMPLE_COUNT_4_BIT |
+ ISL_SAMPLE_COUNT_8_BIT;
+ if (pdevice->info.gen >= 9)
+props->sampleLocationSampleCounts |= ISL_SAMPLE_COUNT_16_BIT;
+
+ props->maxSampleLocationGridSize.width = SAMPLE_LOC_GRID_W;
+ props->maxSampleLocationGridSize.height = SAMPLE_LOC_GRID_H;
+
+ props->sampleLocationCoordinateRange[0] = 0;
+ props->sampleLocationCoordinateRange[1] = 0.9375;
+ props->sampleLocationSubPixelBits = 4;
+
+ props->variableSampleLocations = false;
+
+ break;
+  }
+
   default:
  anv_debug_ignored_stype(ext->sType);
  break;
diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
index eed282ff985..5905299e59d 100644
--- a/src/intel/vulkan/anv_private.h
+++ b/src/intel/vulkan/anv_private.h
@@ -195,6 +195,9 @@ struct gen_l3_config;
 
 #define anv_printflike(a, b) __attribute__((__format__(__printf__, a, b)))
 
+#define SAMPLE_LOC_GRID_W 1
+#define SAMPLE_LOC_GRID_H 1
+
 static inline uint32_t
 align_down_npot_u32(uint32_t v, uint32_t a)
 {
-- 
2.20.1

[Mesa-dev] [PATCH v4] i965: fixed clamping in set_scissor_bits when the y is flipped

2019-02-22 Thread Eleni Maria Stea
Calculating the scissor rectangle fields with the y flipped (0 on top)
can generate negative values that cause an assertion failure later on,
as the scissor fields are all unsigned. We must clamp the bbox values
again to make sure they don't exceed the fb_height. Also fixed a
calculation error.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108999
  https://bugs.freedesktop.org/show_bug.cgi?id=109594

v2:
   - I initially clamped the values inside the if (Y is flipped) case
   and made a mistake in the calculation: the clamp of bbox[2] should have
   been the check if (bbox[2] >= fb_height) bbox[2] = fb_height - 1, and I
   shouldn't have changed the ScissorRectangleYMax calculation. As the
   fixed code is equivalent to using CLAMP instead of MAX2 at the top of
   the function, where bbox[2] and bbox[3] are calculated, and the latter
   is clearer, I replaced it. (Nanley Chery)

v3:
   - Reversed the CLAMP change in bbox[3] as the API guarantees that the
   viewport height is positive. (Nanley Chery)

v4:
  - Added nomination for the mesa-stable branch and the link to the second
  bugzilla bug (Nanley Chery)

CC: 
Tested-by: Paul Chelombitko 
Reviewed-by: Nanley Chery 
---
 src/mesa/drivers/dri/i965/genX_state_upload.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/genX_state_upload.c 
b/src/mesa/drivers/dri/i965/genX_state_upload.c
index 027dad1e089..73c983ce742 100644
--- a/src/mesa/drivers/dri/i965/genX_state_upload.c
+++ b/src/mesa/drivers/dri/i965/genX_state_upload.c
@@ -2446,7 +2446,7 @@ set_scissor_bits(const struct gl_context *ctx, int i,
 
bbox[0] = MAX2(ctx->ViewportArray[i].X, 0);
bbox[1] = MIN2(bbox[0] + ctx->ViewportArray[i].Width, fb_width);
-   bbox[2] = MAX2(ctx->ViewportArray[i].Y, 0);
+   bbox[2] = CLAMP(ctx->ViewportArray[i].Y, 0, fb_height);
bbox[3] = MIN2(bbox[2] + ctx->ViewportArray[i].Height, fb_height);
_mesa_intersect_scissor_bounding_box(ctx, i, bbox);
 
-- 
2.20.1


[Mesa-dev] [PATCH v3] i965: fixed clamping in set_scissor_bits when the y is flipped

2019-02-21 Thread Eleni Maria Stea
Calculating the scissor rectangle fields with the y flipped (0 on top)
can generate negative values that will cause assertion failure later on
as the scissor fields are all unsigned. We must clamp the bbox values
again to make sure they don't exceed the fb_height. Also fixed a
calculation error.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108999

v2:
   - I initially clamped the values inside the if (Y is flipped) case
   and made a mistake in the calculation: the clamp of bbox[2] should have
   been the check if (bbox[2] >= fb_height) bbox[2] = fb_height - 1, and I
   shouldn't have changed the ScissorRectangleYMax calculation. As the
   fixed code is equivalent to using CLAMP instead of MAX2 at the top of
   the function, where bbox[2] and bbox[3] are calculated, and the latter
   is clearer, I replaced it. (Nanley Chery)

v3:
   - Reversed the CLAMP change in bbox[3] as the API guarantees that the
   viewport height is positive. (Nanley Chery)
---
 src/mesa/drivers/dri/i965/genX_state_upload.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/genX_state_upload.c 
b/src/mesa/drivers/dri/i965/genX_state_upload.c
index dcdfb3c9292..47f3741e673 100644
--- a/src/mesa/drivers/dri/i965/genX_state_upload.c
+++ b/src/mesa/drivers/dri/i965/genX_state_upload.c
@@ -2445,7 +2445,7 @@ set_scissor_bits(const struct gl_context *ctx, int i,
 
bbox[0] = MAX2(ctx->ViewportArray[i].X, 0);
bbox[1] = MIN2(bbox[0] + ctx->ViewportArray[i].Width, fb_width);
-   bbox[2] = MAX2(ctx->ViewportArray[i].Y, 0);
+   bbox[2] = CLAMP(ctx->ViewportArray[i].Y, 0, fb_height);
bbox[3] = MIN2(bbox[2] + ctx->ViewportArray[i].Height, fb_height);
_mesa_intersect_scissor_bounding_box(ctx, i, bbox);
 
-- 
2.20.1


[Mesa-dev] [PATCH v2] i965: fixed clamping in set_scissor_bits when the y is flipped

2019-02-20 Thread Eleni Maria Stea
Calculating the scissor rectangle fields with the y flipped (0 on top)
can generate negative values that will cause assertion failure later on
as the scissor fields are all unsigned. We must clamp the bbox values
again to make sure they don't exceed the fb_height. Also fixed a
calculation error.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108999

v2:
   - I initially clamped the values inside the if (Y is flipped) case
   and made a mistake in the calculation: the clamp of bbox[2] should have
   been the check if (bbox[2] >= fb_height) bbox[2] = fb_height - 1, and I
   shouldn't have changed the ScissorRectangleYMax calculation. As the
   fixed code is equivalent to using CLAMP instead of MAX2 at the top of
   the function, where bbox[2] and bbox[3] are calculated, and the latter
   is clearer, I replaced it. (Nanley Chery)
---
 src/mesa/drivers/dri/i965/genX_state_upload.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/genX_state_upload.c 
b/src/mesa/drivers/dri/i965/genX_state_upload.c
index dcdfb3c9292..dd695218fea 100644
--- a/src/mesa/drivers/dri/i965/genX_state_upload.c
+++ b/src/mesa/drivers/dri/i965/genX_state_upload.c
@@ -2445,8 +2445,8 @@ set_scissor_bits(const struct gl_context *ctx, int i,
 
bbox[0] = MAX2(ctx->ViewportArray[i].X, 0);
bbox[1] = MIN2(bbox[0] + ctx->ViewportArray[i].Width, fb_width);
-   bbox[2] = MAX2(ctx->ViewportArray[i].Y, 0);
-   bbox[3] = MIN2(bbox[2] + ctx->ViewportArray[i].Height, fb_height);
+   bbox[2] = CLAMP(ctx->ViewportArray[i].Y, 0, fb_height);
+   bbox[3] = CLAMP(bbox[2] + ctx->ViewportArray[i].Height, 0, fb_height);
_mesa_intersect_scissor_bounding_box(ctx, i, bbox);
 
if (bbox[0] == bbox[1] || bbox[2] == bbox[3]) {
-- 
2.20.1


Re: [Mesa-dev] [PATCH] i965: fixed clamping in set_scissor_bits when the y is flipped

2019-02-20 Thread Eleni Maria Stea
On Tue, 19 Feb 2019 16:27:56 -0800
Nanley Chery  wrote:

> On Mon, Dec 10, 2018 at 12:42:40PM +0200, Eleni Maria Stea wrote:
> > Calculating the scissor rectangle fields with the y flipped (0 on
> > top) can generate negative values that will cause assertion failure
> > later on as the scissor fields are all unsigned. We must clamp the
> > bbox values again to make sure they don't exceed the fb_height.
> > Also fixed a calculation error.
> > 
> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108999  
> 
> Good find. Could you send the test to the piglit list?
Sure, I will send it.


> 
> > ---
> >  src/mesa/drivers/dri/i965/genX_state_upload.c | 15 ++-
> >  1 file changed, 14 insertions(+), 1 deletion(-)
> > 
> > diff --git a/src/mesa/drivers/dri/i965/genX_state_upload.c
> > b/src/mesa/drivers/dri/i965/genX_state_upload.c index
> > 8e3fcbf12e..5d8fc8214e 100644 ---
> > a/src/mesa/drivers/dri/i965/genX_state_upload.c +++
> > b/src/mesa/drivers/dri/i965/genX_state_upload.c @@ -2424,8 +2424,21
> > @@ set_scissor_bits(const struct gl_context *ctx, int i, /* memory:
> > Y=0=top */ sc->ScissorRectangleXMin = bbox[0];
> >sc->ScissorRectangleXMax = bbox[1] - 1;
> > +
> > +  /* Clamping to fb_height is necessary because otherwise the
> > +   * subtractions below would produce a negative result, which would
> > +   * then be assigned to the unsigned YMin/YMax scissor fields,
> > +   * resulting in an assertion failure in GENX(SCISSOR_RECT_pack)
> > +   */
> > +
> > +  if (bbox[3] > fb_height)
> > + bbox[3] = fb_height;
> > +
> > +  if (bbox[2] > fb_height)
> > + bbox[2] = fb_height;
> > +  
> 
> We should be able to fix this bug in a simpler manner by changing the
> MAX2 calls at the top of this function to CLAMP calls.
> 
> >sc->ScissorRectangleYMin = fb_height - bbox[3];
> > -  sc->ScissorRectangleYMax = fb_height - bbox[2] - 1;
> > +  sc->ScissorRectangleYMax = fb_height - (bbox[2] - 1);  
> 
> I don't think we want to start adding 1 instead of subtracting 1. The
> subtraction is there to satisfy the requirement for the HW packet.
> 
> -Nanley

Right! This code would be correct if I had done:

  if (bbox[2] >= fb_height)
 bbox[2] = fb_height - 1;

and then had left:
  sc->ScissorRectangleYMax = fb_height - bbox[2] - 1;

as it was. :)

I think I like your solution better because with the CLAMP at the top
what we do here is more clear. I am going to send a new patch soon.
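
For readers of the archive, the clamping discussed above can be sketched
as below. This is only an illustration under simplifying assumptions, not
the driver code: the viewport values are plain ints, the intersection
with the GL scissor box is left out, and the names flipped_scissor_y,
struct scissor_y and the *_SKETCH macros are made up for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-ins for the MIN2/MAX2/CLAMP helpers used in the patch. */
#define MIN2_SKETCH(a, b) ((a) < (b) ? (a) : (b))
#define MAX2_SKETCH(a, b) ((a) > (b) ? (a) : (b))
#define CLAMP_SKETCH(x, lo, hi) MIN2_SKETCH(MAX2_SKETCH((x), (lo)), (hi))

/* Hypothetical stand-in for the Y fields of the scissor packet. */
struct scissor_y {
   uint32_t y_min;
   uint32_t y_max;
};

/* Compute the Y scissor range for a framebuffer whose Y axis is flipped
 * (0 on top). Assumes vp_h > 0 and fb_height > 0. Returns 0 when the
 * clamped range is empty and leaves *out untouched in that case.
 */
static int
flipped_scissor_y(int vp_y, int vp_h, int fb_height, struct scissor_y *out)
{
   /* Clamp both ends so that neither can exceed fb_height. */
   int y0 = CLAMP_SKETCH(vp_y, 0, fb_height);
   int y1 = MIN2_SKETCH(y0 + vp_h, fb_height);

   if (y0 == y1)
      return 0;

   /* With y0 < y1 <= fb_height, neither subtraction can go negative. */
   assert(fb_height - y1 >= 0);
   assert(fb_height - y0 - 1 >= 0);

   out->y_min = (uint32_t)(fb_height - y1);
   out->y_max = (uint32_t)(fb_height - y0 - 1);
   return 1;
}
```

With only MAX2 on the lower bound, a viewport Y above fb_height would make
fb_height - y0 - 1 negative before it is stored in the unsigned field;
with the clamp the range simply collapses to empty.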

Thank you!
Eleni

[Mesa-dev] [PATCH v6 2/5] i965: Faking the ETC2 compression on Gen < 8 GPUs using two miptrees.

2019-02-15 Thread Eleni Maria Stea
Gen < 8 GPUs cannot sample ETC2 formats. So far, the driver converted the
compressed EAC/ETC2 images to uncompressed RGBA images. When the
GetCompressed* functions were called, the pixels were returned in this
RGBA format and not in the compressed format that was expected.

To fix this problem, we use a secondary shadow miptree to store the
decompressed data for rendering and the main miptree to store the
compressed data so that the Get functions work. Each time the main miptree
is written with compressed data, we decompress it to RGB and update the
shadow. Then we use the shadow for rendering.

v2:
   - Fixes in the commit message (Nanley Chery)
   - Reversed the changes in brw_get_texture_swizzle and swapped the b, g
   values at the time that we decompress the data in the function:
   intel_miptree_update_etc_shadow of intel_mipmap_tree.c (Nanley Chery)
   - Simplified the format checks in the miptree_create function of the
   intel_mipmap_tree.c and reserved the call of the
   intel_lower_compressed_format for the case that we are faking the ETC
   support (Nanley Chery)
   - Removed the check for the auxiliary usage for the shadow miptree at
   creation (miptree_create of intel_mipmap_tree.c) as we won't use
   auxiliary buffers with these types of trees (Nanley Chery)
   - Set the etc_format of the non-ETC miptrees to MESA_FORMAT_NONE and
   removed the unnecessary checks (Nanley Chery)
   - Fixed an unrelated indentation change (Nanley Chery)
   - Modified the function intel_miptree_finish_write to set the
   mt->shadow_needs_update to true to catch all the cases when we need to
   update the miptree (Nanley Chery)
   - In order to update the shadow miptree during the unmap of the
   main and always map the main (Nanley Chery), the following change was
   necessary: split the previous update function, which updated all
   the mipmap levels, into two functions: one that updates one
   level and one that updates all of them. Used the first during unmap
   and the second before the rendering.
   - Removed the BRW_MAP_ETC_BIT flag and the mechanism to decide which
   miptree should be mapped each time and reversed all the changes in the
   higher level texture functions that upload data to textures as they
   aren't needed anymore.
   - Replaced the boolean needs_fake_etc with an inline function that
   checks when we need to fake the ETC compression (Nanley Chery)
   - Removed the initialization of the strides in the update function as
   the values will be overwritten by the intel_miptree_map call (Nanley
   Chery)
   - Used minify instead of division in the new update function
   intel_miptree_update_etc_shadow_levels in intel_mipmap_tree.c (Nanley
   Chery)
   - Removed the depth from the calculation of the number of slices in
   the new update function (intel_miptree_update_etc_shadow_levels of
   intel_mipmap_tree.c) as we don't need to support 3D ETC images.
   (Nanley Chery)

v3:
  - Renamed the rgba_fmt in function miptree_create
  (intel_mipmap_tree.c) to decomp_format as the format is not always in
  rgba order. (Nanley Chery)
  - Documented the new usage for the shadow miptree in the comment above
  the field in the intel_miptree struct in intel_mipmap_tree.h (Nanley
  Chery)
  - Removed the redundant flags from the mapping of the miptrees in
  intel_miptree_update_etc_shadow of intel_mipmap_tree.c (Nanley Chery)
  - Fixed the switch from surface's logical level to physical level in
  the intel_miptree_update_etc_shadow_levels of intel_mipmap_tree.c
  (Nanley Chery)
  - Excluded the Baytrail GPUs from the check for the ETC emulation as
  they support the ETC formats natively. (Nanley Chery)
  - Simplified the check if the format is BGRA in
  intel_miptree_update_etc_shadow of intel_mipmap_tree.c (Nanley Chery)

v4:
  - Removed the functions intel_miptree_(map|unmap)_etc and the check if
   we need to call them as with the new changes, they became unreachable.
   (Nanley Chery)
  - We'd rather calculate the level width and height using the shadow
  miptree instead of the main in intel_miptree_update_etc_shadow_levels of
  intel_mipmap_tree.c (Nanley Chery)
  - Fixed the format in the mt_surface_usage, set at the miptree creation,
   in miptree_create of intel_mipmap_tree.c (Nanley Chery)

v5:
  - Fixed the levels calculations in intel_mipmap_tree.c (Nanley Chery)
  - Update the flag shadow_needs_update outside the function
  intel_miptree_update_etc_shadow (Nanley Chery)
  - Fixed indentation error (Nanley Chery)

v6:
  - Fixed typo in commit message (Nanley Chery)
  - Simplified the assignment of the mt_fmt in the miptree_create of the
  intel_mipmap_tree.c (Nanley Chery)
  - Combined declarations and assignments where it was possible in the
  intel_miptree_update_etc_shadow and
  intel_miptree_update_etc_shadow_levels of the intel_mipmap_tree.c
  (Nanley Chery)
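
As a side note on the "minify instead of division" item in v2, the
per-level size calculation can be sketched as follows. This is an
illustration only: minify_dim mirrors the usual "shift right, but never
below 1" definition, and level_size is a hypothetical helper that is not
part of the patch.

```c
#include <assert.h>

/* Halve the dimension once per level, but never go below 1.
 * Assumes levels is smaller than the bit width of unsigned.
 */
static unsigned
minify_dim(unsigned value, unsigned levels)
{
   unsigned v = value >> levels;
   return v > 0 ? v : 1;
}

/* Hypothetical helper: size of mip level `level` in a tree whose first
 * stored level is `first_level` and whose base size is width0 x height0.
 */
static void
level_size(unsigned width0, unsigned height0,
           unsigned first_level, unsigned level,
           unsigned *w, unsigned *h)
{
   assert(level >= first_level);
   *w = minify_dim(width0, level - first_level);
   *h = minify_dim(height0, level - first_level);
}
```

A plain division such as height0 / (1 << level) yields 0 for the smallest
levels of a non-square texture, which is why a minify-style helper is the
safer choice there.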
---
 .../drivers/dri/i965/brw_wm_surface_state.c   |   5 +-
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 174 

[Mesa-dev] [PATCH v6 1/5] i965: Rename intel_mipmap_tree::r8stencil_* -> ::shadow_*

2019-02-15 Thread Eleni Maria Stea
From: Nanley Chery 

Use more generic field names. We'll reuse these fields for a workaround
with ASTC miptrees.

Reviewed-by: Eleni Maria Stea 
---
 src/mesa/drivers/dri/i965/brw_wm_surface_state.c |  8 
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c| 16 
 src/mesa/drivers/dri/i965/intel_mipmap_tree.h| 14 +++---
 3 files changed, 19 insertions(+), 19 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c 
b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
index ece3197a858..c55182d7ffb 100644
--- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
@@ -571,15 +571,15 @@ static void brw_update_texture_surface(struct gl_context 
*ctx,
 
   if (obj->StencilSampling && firstImage->_BaseFormat == GL_DEPTH_STENCIL) 
{
  if (devinfo->gen <= 7) {
-assert(mt->r8stencil_mt && 
!mt->stencil_mt->r8stencil_needs_update);
-mt = mt->r8stencil_mt;
+assert(mt->shadow_mt && !mt->stencil_mt->shadow_needs_update);
+mt = mt->shadow_mt;
  } else {
 mt = mt->stencil_mt;
  }
  format = ISL_FORMAT_R8_UINT;
   } else if (devinfo->gen <= 7 && mt->format == MESA_FORMAT_S_UINT8) {
- assert(mt->r8stencil_mt && !mt->r8stencil_needs_update);
- mt = mt->r8stencil_mt;
+ assert(mt->shadow_mt && !mt->shadow_needs_update);
+ mt = mt->shadow_mt;
  format = ISL_FORMAT_R8_UINT;
   }
 
diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index fe77d72fae4..e364fed2cc7 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -1214,7 +1214,7 @@ intel_miptree_release(struct intel_mipmap_tree **mt)
 
   brw_bo_unreference((*mt)->bo);
   intel_miptree_release(&(*mt)->stencil_mt);
-  intel_miptree_release(&(*mt)->r8stencil_mt);
+  intel_miptree_release(&(*mt)->shadow_mt);
   intel_miptree_aux_buffer_free((*mt)->aux_buf);
   free_aux_state_map((*mt)->aux_state);
 
@@ -2427,7 +2427,7 @@ intel_miptree_finish_write(struct brw_context *brw,
switch (mt->aux_usage) {
case ISL_AUX_USAGE_NONE:
   if (mt->format == MESA_FORMAT_S_UINT8 && devinfo->gen <= 7)
- mt->r8stencil_needs_update = true;
+ mt->shadow_needs_update = true;
   break;
 
case ISL_AUX_USAGE_MCS:
@@ -2933,9 +2933,9 @@ intel_update_r8stencil(struct brw_context *brw,
 
assert(src->surf.size_B > 0);
 
-   if (!mt->r8stencil_mt) {
+   if (!mt->shadow_mt) {
   assert(devinfo->gen > 6); /* Handle MIPTREE_LAYOUT_GEN6_HIZ_STENCIL */
-  mt->r8stencil_mt = make_surface(
+  mt->shadow_mt = make_surface(
 brw,
 src->target,
 MESA_FORMAT_R_UINT8,
@@ -2949,13 +2949,13 @@ intel_update_r8stencil(struct brw_context *brw,
 ISL_TILING_Y0_BIT,
 ISL_SURF_USAGE_TEXTURE_BIT,
 BO_ALLOC_BUSY, 0, NULL);
-  assert(mt->r8stencil_mt);
+  assert(mt->shadow_mt);
}
 
-   if (src->r8stencil_needs_update == false)
+   if (src->shadow_needs_update == false)
   return;
 
-   struct intel_mipmap_tree *dst = mt->r8stencil_mt;
+   struct intel_mipmap_tree *dst = mt->shadow_mt;
 
for (int level = src->first_level; level <= src->last_level; level++) {
   const unsigned depth = src->surf.dim == ISL_SURF_DIM_3D ?
@@ -2975,7 +2975,7 @@ intel_update_r8stencil(struct brw_context *brw,
}
 
brw_cache_flush_for_read(brw, dst->bo);
-   src->r8stencil_needs_update = false;
+   src->shadow_needs_update = false;
 }
 
 static void *
diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h 
b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
index 17668944adc..1a7507023a1 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
@@ -294,16 +294,16 @@ struct intel_mipmap_tree
struct intel_mipmap_tree *stencil_mt;
 
/**
-* \brief Stencil texturing miptree for sampling from a stencil texture
+* \brief Shadow miptree for sampling when the main isn't supported by HW.
 *
-* Some hardware doesn't support sampling from the stencil texture as
-* required by the GL_ARB_stencil_texturing extenion. To workaround this we
-* blit the texture into a new texture that can be sampled.
+* To workaround various sampler bugs and limitations, we blit the main
+* texture into a new texture that can be sampled.
 *
-* \see intel_update_r8stencil()
+* This miptree may be used for:
+* - Stencil texturin

[Mesa-dev] [PATCH v6 5/5] i965: Removed the field etc_format from the struct intel_mipmap_tree

2019-02-15 Thread Eleni Maria Stea
After the previous changes to emulate the ETC/EAC formats using the
secondary shadow miptree, the etc_format field of the intel_mipmap_tree
struct became redundant and the remaining check that used it has been
replaced. (Nanley Chery)
---
 src/mesa/drivers/dri/i965/brw_wm_surface_state.c |  2 +-
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c|  7 ---
 src/mesa/drivers/dri/i965/intel_mipmap_tree.h| 10 --
 3 files changed, 1 insertion(+), 18 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c 
b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
index 19a46fcf243..a0984791614 100644
--- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
@@ -520,7 +520,7 @@ static void brw_update_texture_surface(struct gl_context 
*ctx,
   * is safe because texture views aren't allowed on depth/stencil.
   */
  mesa_fmt = mt->format;
-  } else if (mt->etc_format != MESA_FORMAT_NONE) {
+  } else if (intel_miptree_has_etc_shadow(brw, mt)) {
  mesa_fmt = mt->shadow_mt->format;
   } else if (plane > 0) {
  mesa_fmt = mt->format;
diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index 7146fcb6582..426782c5883 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -706,7 +706,6 @@ miptree_create(struct brw_context *brw,
 
if (intel_miptree_needs_fake_etc(brw, mt)) {
   mesa_format decomp_format = intel_lower_compressed_format(brw, format);
-  mt->etc_format = format;
   mt->shadow_mt = make_surface(brw, target, decomp_format, first_level,
last_level, width0, height0, depth0,
num_samples, tiling_flags,
@@ -717,10 +716,6 @@ miptree_create(struct brw_context *brw,
  intel_miptree_release();
  return NULL;
   }
-
-  mt->shadow_mt->etc_format = MESA_FORMAT_NONE;
-   } else {
-  mt->etc_format = MESA_FORMAT_NONE;
}
 
if (needs_separate_stencil(brw, mt, format)) {
@@ -1302,8 +1297,6 @@ intel_miptree_match_image(struct intel_mipmap_tree *mt,
   mt_format = MESA_FORMAT_Z24_UNORM_S8_UINT;
if (mt->format == MESA_FORMAT_Z_FLOAT32 && mt->stencil_mt)
   mt_format = MESA_FORMAT_Z32_FLOAT_S8X24_UINT;
-   if (mt->etc_format != MESA_FORMAT_NONE)
-  mt_format = mt->etc_format;
 
if (_mesa_get_srgb_format_linear(image->TexFormat) !=
_mesa_get_srgb_format_linear(mt_format))
diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h 
b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
index 752aeaaf9b7..3e53a0049cc 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
@@ -215,21 +215,11 @@ struct intel_mipmap_tree
 * MESA_FORMAT_Z_FLOAT32, otherwise for MESA_FORMAT_Z24_UNORM_S8_UINT 
objects it will be
 * MESA_FORMAT_Z24_UNORM_X8_UINT.
 *
-* For ETC1/ETC2 textures, this is one of the uncompressed mesa texture
-* formats if the hardware lacks support for ETC1/ETC2. See @ref etc_format.
-*
 * @see RENDER_SURFACE_STATE.SurfaceFormat
 * @see 3DSTATE_DEPTH_BUFFER.SurfaceFormat
 */
mesa_format format;
 
-   /**
-* This variable stores the value of ETC compressed texture format
-*
-* @see RENDER_SURFACE_STATE.SurfaceFormat
-*/
-   mesa_format etc_format;
-
GLuint first_level;
GLuint last_level;
 
-- 
2.20.1


[Mesa-dev] [PATCH v6 0/5] improved the support for ETC2 formats on Gen 7

2019-02-15 Thread Eleni Maria Stea
Intel Gen7 GPUs don't support the ETC2 formats natively, so in order to
show the pixels properly we decompress them and create decompressed
miptrees. The problem is that the functions that map the miptrees for
reading (for example the GetCompressed* calls), which are supposed to
read compressed pixel values, would read decompressed values instead,
unless we prevented this with assertions that make the user programs
either crash or malfunction.

These patches attempt to solve this problem by using 2 miptrees: the
main one to store the ETC values and the generic shadow (mt->shadow_mt)
to store the decompressed values. Each time the main miptree is mapped
for writing we set a flag that the shadow needs an update, and we check
this flag before every draw call to update the shadow miptree. (We
perform the check right before drawing to avoid missing changes from
functions like CopyImageSubData in the next frame.) Then we map the
shadow for sampling. This way, we can render the images using the
decompressed pixels of the shadow, but we return the compressed ones
from the main when the texture is mapped for reading.
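
The flag-based scheme described above can be sketched in isolation like
this. It is a toy model under loud assumptions, not the i965 code: struct
fake_tree, tree_write, tree_read, tree_predraw and fake_decode are
hypothetical names, and the "decoder" just inverts bytes so that the two
copies are easy to tell apart.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

enum { TEX_BYTES = 16 };

/* Hypothetical, heavily simplified stand-in for a main/shadow pair. */
struct fake_tree {
   unsigned char main_data[TEX_BYTES];   /* "compressed", seen by readers */
   unsigned char shadow_data[TEX_BYTES]; /* "decompressed", used to sample */
   bool shadow_needs_update;
   unsigned decode_count;                /* how often the shadow was rebuilt */
};

/* Placeholder for a real ETC2 decoder: it only inverts every byte. */
static void
fake_decode(const unsigned char *src, unsigned char *dst, size_t n)
{
   for (size_t i = 0; i < n; i++)
      dst[i] = (unsigned char)~src[i];
}

/* Writing to the main copy only marks the shadow as stale. */
static void
tree_write(struct fake_tree *t, const unsigned char *data, size_t n)
{
   memcpy(t->main_data, data, n < TEX_BYTES ? n : TEX_BYTES);
   t->shadow_needs_update = true;
}

/* Readers always get the compressed payload of the main copy. */
static const unsigned char *
tree_read(const struct fake_tree *t)
{
   return t->main_data;
}

/* Right before drawing, rebuild the shadow once if it is stale. */
static const unsigned char *
tree_predraw(struct fake_tree *t)
{
   if (t->shadow_needs_update) {
      fake_decode(t->main_data, t->shadow_data, TEX_BYTES);
      t->shadow_needs_update = false;
      t->decode_count++;
   }
   return t->shadow_data;
}
```

Deferring the decode to the pre-draw step means that several writes in a
row cost a single decompression, and any path that dirties the main copy
is picked up before the next draw.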

Also, the OES_copy_image extension, which couldn't work on Gen 7 due to
the lack of ETC support, is now re-enabled.

Finally, the following glcts and piglit tests pass:

On HSW (previously failing):

KHR-GL46.direct_state_access.textures_compressed_subimage

On HSW and IVB (previously skipped):
dEQP-GLES31.functional.texture.border_clamp.formats.compressed_srgb8_alpha8_etc2_eac.*
   (6 tests)
dEQP-GLES31.functional.texture.border_clamp.formats.compressed_srgb8_etc2.*
   (6 tests)
dEQP-GLES31.functional.texture.border_clamp.formats.compressed_srgb8_punchthrough_alpha1_etc2.*
   (6 tests)

On HSW, IVB, SNB (previously skipped):
dEQP-GLES3.functional.texture.format.compressed.*
   (12 tests)
dEQP-GLES3.functional.texture.wrap.etc2_eac_srgb8_alpha8.*
   (36 tests)
dEQP-GLES3.functional.texture.wrap.etc2_srgb8.*
   (36 tests)
dEQP-GLES3.functional.texture.wrap.etc2_srgb8_punchthrough_alpha1.*
   (36 tests)

piglit.spec.!opengl es 3_0.oes_compressed_etc2_texture-miptree_gles3 
   (srgb8, srgb8-alpha, srgb8-punchthrough-alpha1)
piglit.spec.arb_es3_compatibility.oes_compressed_etc2_texture-miptree
   (srgb8 compat, srgb8 core, srgb8-alpha8 compat, srgb8-alpha8 core,
srgb8-punchthrough-alpha1 compat, srgb8-punchthrough-alpha1 core)
(9 tests)

Total tests passing: 148

Eleni Maria Stea (4):
  i965: Faking the ETC2 compression on Gen < 8 GPUs using two miptrees.
  i965: Fixed the CopyImageSubData for ETC2 on Gen < 8
  i965: Enabled the OES_copy_image extension on Gen 7 GPUs
  i965: Removed the field etc_format from the struct intel_mipmap_tree

Nanley Chery (1):
  i965: Rename intel_mipmap_tree::r8stencil_* -> ::shadow_*

 src/mesa/drivers/dri/i965/brw_draw.c  |   5 +
 .../drivers/dri/i965/brw_wm_surface_state.c   |  15 +-
 src/mesa/drivers/dri/i965/intel_extensions.c  |  16 +-
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 170 ++
 src/mesa/drivers/dri/i965/intel_mipmap_tree.h |  48 +++--
 5 files changed, 149 insertions(+), 105 deletions(-)

-- 
2.20.1


[Mesa-dev] [PATCH v6 3/5] i965: Fixed the CopyImageSubData for ETC2 on Gen < 8

2019-02-15 Thread Eleni Maria Stea
For CopyImageSubData to copy the data during the 1st draw call, we need
to update the shadow tree right before the rendering.

v2:
  - Added assertion that the miptree doesn't need update at the time we
  update the texture surface. (Nanley Chery)

v3:
  - As we now update the tree before the rendering we don't need to copy
  the data during the unmap anymore. Removed the unnecessary update from
  the intel_miptree_unmap in intel_mipmap_tree.c (Nanley Chery)

v4:
  - Fixed unrelated empty line removal (Nanley Chery)
  - As intel_miptree_update_etc_shadow of intel_mipmap_tree.c is now only
  called inside the function that follows it, we don't need to declare it
  at the top of the file anymore. (Nanley Chery)
---
 src/mesa/drivers/dri/i965/brw_draw.c|  5 +
 .../drivers/dri/i965/brw_wm_surface_state.c |  2 +-
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c   | 17 -
 3 files changed, 6 insertions(+), 18 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_draw.c 
b/src/mesa/drivers/dri/i965/brw_draw.c
index 40bcf82ae8d..d07349419cc 100644
--- a/src/mesa/drivers/dri/i965/brw_draw.c
+++ b/src/mesa/drivers/dri/i965/brw_draw.c
@@ -559,6 +559,11 @@ brw_predraw_resolve_inputs(struct brw_context *brw, bool 
rendering,
   tex_obj->mt->format == MESA_FORMAT_S_UINT8) {
  intel_update_r8stencil(brw, tex_obj->mt);
   }
+
+  if (intel_miptree_has_etc_shadow(brw, tex_obj->mt) &&
+  tex_obj->mt->shadow_needs_update) {
+ intel_miptree_update_etc_shadow_levels(brw, tex_obj->mt);
+  }
}
 
/* Resolve color for each active shader image. */
diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c 
b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
index c3d267721e1..19a46fcf243 100644
--- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
@@ -582,7 +582,7 @@ static void brw_update_texture_surface(struct gl_context 
*ctx,
  mt = mt->shadow_mt;
  format = ISL_FORMAT_R8_UINT;
   } else if (intel_miptree_needs_fake_etc(brw, mt)) {
- assert(mt->shadow_mt);
+ assert(mt->shadow_mt && !mt->shadow_needs_update);
  mt = mt->shadow_mt;
   }
 
diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index 976a004ade0..7146fcb6582 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -57,11 +57,6 @@ static void *intel_miptree_map_raw(struct brw_context *brw,
GLbitfield mode);
 
 static void intel_miptree_unmap_raw(struct intel_mipmap_tree *mt);
-static void intel_miptree_update_etc_shadow(struct brw_context *brw,
-struct intel_mipmap_tree *mt,
-unsigned int level,
-unsigned int slice,
-int level_w, int level_h);
 
 static bool
 intel_miptree_supports_mcs(struct brw_context *brw,
@@ -3779,7 +3774,6 @@ intel_miptree_unmap(struct brw_context *brw,
 unsigned int slice)
 {
struct intel_miptree_map *map = mt->level[level].slice[slice].map;
-   int level_w, level_h;
 
assert(mt->surf.samples == 1);
 
@@ -3789,21 +3783,10 @@ intel_miptree_unmap(struct brw_context *brw,
DBG("%s: mt %p (%s) level %d slice %d\n", __func__,
mt, _mesa_get_format_name(mt->format), level, slice);
 
-   level_w = minify(mt->surf.phys_level0_sa.width,
-level - mt->first_level);
-   level_h = minify(mt->surf.phys_level0_sa.height,
-level - mt->first_level);
-
if (map->unmap)
   map->unmap(brw, mt, map, level, slice);
 
intel_miptree_release_map(mt, level, slice);
-
-   if (intel_miptree_has_etc_shadow(brw, mt) && mt->shadow_needs_update) {
-  mt->shadow_needs_update = false;
-  intel_miptree_update_etc_shadow(brw, mt, level, slice, level_w,
-  level_h);
-   }
 }
 
 enum isl_surf_dim
-- 
2.20.1


[Mesa-dev] [PATCH v6 4/5] i965: Enabled the OES_copy_image extension on Gen 7 GPUs

2019-02-15 Thread Eleni Maria Stea
The OES_copy_image extension was disabled on Gen7 due to the lack of
support for ETC2 images. Re-enabled it. (Kenneth Graunke)

v2:
  - Removed the blank lines in the comments above OES_copy_image and
  OES_texture_view extensions in intel_extensions.c (Nanley Chery)
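
The resulting gating can be summarised with the sketch below. It only
restates the two conditions from the diff in a standalone form;
struct fake_devinfo, struct fake_extensions and pick_extensions are
hypothetical names for illustration and do not exist in the driver.

```c
#include <stdbool.h>

/* Hypothetical, reduced versions of the device info and extension table. */
struct fake_devinfo {
   int gen;
   bool is_baytrail;
};

struct fake_extensions {
   bool OES_copy_image;
   bool OES_texture_view;
};

/* Decide the two extensions the way the patch does after the change. */
static struct fake_extensions
pick_extensions(const struct fake_devinfo *devinfo)
{
   struct fake_extensions ext = { false, false };

   /* Still limited to platforms with native ETC2 support. */
   if (devinfo->gen >= 8 || devinfo->is_baytrail)
      ext.OES_texture_view = true;

   /* Emulated ETC2 is enough for OES_copy_image from Gen 7 on. */
   if (devinfo->gen >= 7)
      ext.OES_copy_image = true;

   return ext;
}
```

On a non-Baytrail Gen 7 part this yields OES_copy_image without
OES_texture_view, which is the only combination that differs from the
previous behaviour.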
---
 src/mesa/drivers/dri/i965/intel_extensions.c | 16 
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_extensions.c 
b/src/mesa/drivers/dri/i965/intel_extensions.c
index 3a95be58a63..d2a6aa185c2 100644
--- a/src/mesa/drivers/dri/i965/intel_extensions.c
+++ b/src/mesa/drivers/dri/i965/intel_extensions.c
@@ -287,14 +287,22 @@ intelInitExtensions(struct gl_context *ctx)
}
 
if (devinfo->gen >= 8 || devinfo->is_baytrail) {
-  /* For now, we only enable OES_copy_image on platforms that support
-   * ETC2 natively in hardware.  We would need more hacks to support it
-   * elsewhere. Same with OES_texture_view.
+  /* For now, we can't enable OES_texture_view on Gen 7 because of
+   * some piglit failures coming from
+   * piglit/tests/spec/arb_texture_view/rendering-formats.c that need
+   * investigation.
*/
-  ctx->Extensions.OES_copy_image = true;
   ctx->Extensions.OES_texture_view = true;
}
 
+   if (devinfo->gen >= 7) {
+  /* We can safely enable OES_copy_image on Gen 7, since we emulate
+   * the ETC2 support using the shadow_miptree to store the
+   * compressed data.
+   */
+  ctx->Extensions.OES_copy_image = true;
+   }
+
if (devinfo->gen >= 8) {
   ctx->Extensions.ARB_gpu_shader_int64 = true;
   /* requires ARB_gpu_shader_int64 */
-- 
2.20.1


[Mesa-dev] [PATCH v5 4/4] i965: Enabled the OES_copy_image extension on Gen 7 GPUs

2019-02-13 Thread Eleni Maria Stea
The OES_copy_image extension was disabled on Gen7 due to the lack of
support for ETC2 images. Re-enabled it. (Kenneth Graunke)

v2:
  - Removed the blank lines in the comments above OES_copy_image and
  OES_texture_view extensions in intel_extensions.c (Nanley Chery)
---
 src/mesa/drivers/dri/i965/intel_extensions.c | 16 
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_extensions.c 
b/src/mesa/drivers/dri/i965/intel_extensions.c
index 3a95be58a63..d2a6aa185c2 100644
--- a/src/mesa/drivers/dri/i965/intel_extensions.c
+++ b/src/mesa/drivers/dri/i965/intel_extensions.c
@@ -287,14 +287,22 @@ intelInitExtensions(struct gl_context *ctx)
}
 
if (devinfo->gen >= 8 || devinfo->is_baytrail) {
-  /* For now, we only enable OES_copy_image on platforms that support
-   * ETC2 natively in hardware.  We would need more hacks to support it
-   * elsewhere. Same with OES_texture_view.
+  /* For now, we can't enable OES_texture_view on Gen 7 because of
+   * some piglit failures coming from
+   * piglit/tests/spec/arb_texture_view/rendering-formats.c that need
+   * investigation.
*/
-  ctx->Extensions.OES_copy_image = true;
   ctx->Extensions.OES_texture_view = true;
}
 
+   if (devinfo->gen >= 7) {
+  /* We can safely enable OES_copy_image on Gen 7, since we emulate
+   * the ETC2 support using the shadow_miptree to store the
+   * compressed data.
+   */
+  ctx->Extensions.OES_copy_image = true;
+   }
+
if (devinfo->gen >= 8) {
   ctx->Extensions.ARB_gpu_shader_int64 = true;
   /* requires ARB_gpu_shader_int64 */
-- 
2.20.1

[Mesa-dev] [PATCH v5 3/4] i965: Fixed the CopyImageSubData for ETC2 on Gen < 8

2019-02-13 Thread Eleni Maria Stea
For CopyImageSubData to copy the data during the 1st draw call, we need
to update the shadow tree right before the rendering.
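
The idea can be sketched outside the driver as a dirty flag that is set on
write and consumed right before drawing. Everything named toy_* below is
hypothetical and exists only for this illustration; the real functions are
intel_miptree_finish_write, brw_predraw_resolve_inputs and
intel_miptree_update_etc_shadow_levels, and they do far more than this.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for intel_mipmap_tree with only the state that
 * matters for this sketch. */
struct toy_miptree {
   bool has_etc_shadow;
   bool shadow_needs_update;
   int shadow_updates;        /* counts refreshes, for illustration only */
};

/* A write to the main tree only marks the shadow as stale. */
static void
toy_finish_write(struct toy_miptree *mt)
{
   if (mt->has_etc_shadow)
      mt->shadow_needs_update = true;
}

/* Refreshes all levels of the shadow and clears the flag afterwards. */
static void
toy_update_shadow_levels(struct toy_miptree *mt)
{
   mt->shadow_updates++;
   mt->shadow_needs_update = false;
}

/* Runs before every draw: refresh the shadow only if it is stale. */
static void
toy_predraw_resolve(struct toy_miptree *mt)
{
   if (mt->has_etc_shadow && mt->shadow_needs_update)
      toy_update_shadow_levels(mt);
}

/* By the time the surface state is set up, the shadow must be current. */
static void
toy_update_texture_surface(const struct toy_miptree *mt)
{
   assert(!mt->shadow_needs_update);
}
```

Several writes followed by one draw cost a single refresh in this model,
and a draw with no intervening write costs none.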

v2:
  - Added assertion that the miptree doesn't need update at the time we
  update the texture surface. (Nanley Chery)

v3:
  - As we now update the tree before the rendering we don't need to copy
  the data during the unmap anymore. Removed the unnecessary update from
  the intel_miptree_unmap in intel_mipmap_tree.c (Nanley Chery)
---
 src/mesa/drivers/dri/i965/brw_draw.c |  5 +
 src/mesa/drivers/dri/i965/brw_wm_surface_state.c |  2 +-
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c| 13 -
 3 files changed, 6 insertions(+), 14 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_draw.c 
b/src/mesa/drivers/dri/i965/brw_draw.c
index 40bcf82ae8d..d07349419cc 100644
--- a/src/mesa/drivers/dri/i965/brw_draw.c
+++ b/src/mesa/drivers/dri/i965/brw_draw.c
@@ -559,6 +559,11 @@ brw_predraw_resolve_inputs(struct brw_context *brw, bool 
rendering,
   tex_obj->mt->format == MESA_FORMAT_S_UINT8) {
  intel_update_r8stencil(brw, tex_obj->mt);
   }
+
+  if (intel_miptree_has_etc_shadow(brw, tex_obj->mt) &&
+  tex_obj->mt->shadow_needs_update) {
+ intel_miptree_update_etc_shadow_levels(brw, tex_obj->mt);
+  }
}
 
/* Resolve color for each active shader image. */
diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c 
b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
index c3d267721e1..19a46fcf243 100644
--- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
@@ -582,7 +582,7 @@ static void brw_update_texture_surface(struct gl_context 
*ctx,
  mt = mt->shadow_mt;
  format = ISL_FORMAT_R8_UINT;
   } else if (intel_miptree_needs_fake_etc(brw, mt)) {
- assert(mt->shadow_mt);
+ assert(mt->shadow_mt && !mt->shadow_needs_update);
  mt = mt->shadow_mt;
   }
 
diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index 1643ce2eeb2..89b31c78bc4 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -3780,7 +3780,6 @@ intel_miptree_unmap(struct brw_context *brw,
 unsigned int slice)
 {
struct intel_miptree_map *map = mt->level[level].slice[slice].map;
-   int level_w, level_h;
 
assert(mt->surf.samples == 1);
 
@@ -3790,21 +3789,10 @@ intel_miptree_unmap(struct brw_context *brw,
DBG("%s: mt %p (%s) level %d slice %d\n", __func__,
mt, _mesa_get_format_name(mt->format), level, slice);
 
-   level_w = minify(mt->surf.phys_level0_sa.width,
-level - mt->first_level);
-   level_h = minify(mt->surf.phys_level0_sa.height,
-level - mt->first_level);
-
if (map->unmap)
   map->unmap(brw, mt, map, level, slice);
 
intel_miptree_release_map(mt, level, slice);
-
-   if (intel_miptree_has_etc_shadow(brw, mt) && mt->shadow_needs_update) {
-  mt->shadow_needs_update = false;
-  intel_miptree_update_etc_shadow(brw, mt, level, slice, level_w,
-  level_h);
-   }
 }
 
 enum isl_surf_dim
@@ -3984,6 +3972,5 @@ intel_miptree_update_etc_shadow_levels(struct brw_context 
*brw,
  level_h);
   }
}
-
mt->shadow_needs_update = false;
 }
-- 
2.20.1

[Mesa-dev] [PATCH v5 2/4] i965: Faking the ETC2 compression on Gen < 8 GPUs using two miptrees.

2019-02-13 Thread Eleni Maria Stea
Gen < 8 GPUs cannot sample ETC2 formats. So far, the driver converted the
compressed EAC/ETC2 images to non-compressed RGBA images. When the
GetCompressed* functions were called, the pixels were returned in this
RGBA format and not in the compressed format that was expected.

To fix this problem, we use a secondary shadow miptree to store the
decompressed data for rendering and the main miptree to store the
compressed data so that the Get functions work. Each time the main miptree
is written with compressed data, we decompress the data to RGB and update
the shadow. Then we use the shadow for rendering.
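
A toy model of the two-store layout, for illustration only. The "codec"
below is NOT ETC2: it is a made-up stand-in that expands one byte into one
opaque grey RGBA texel, and all toy_* names are hypothetical. It only
shows which store each kind of access reads from.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_BLOCKS 4            /* blocks per texture in this sketch */
#define TOY_BYTES_PER_TEXEL 4   /* decoded RGBA8 */

struct toy_texture {
   uint8_t main_data[TOY_BLOCKS];                          /* "compressed" */
   uint8_t shadow_data[TOY_BLOCKS * TOY_BYTES_PER_TEXEL];  /* decoded */
};

/* Made-up decoder: replicate the byte into R, G and B, opaque alpha. */
static void
toy_decode_block(uint8_t block, uint8_t *rgba)
{
   rgba[0] = block;
   rgba[1] = block;
   rgba[2] = block;
   rgba[3] = 0xff;
}

/* Writing compressed data keeps the original bytes in the main store and
 * refreshes the decoded copy in the shadow store. */
static void
toy_upload_compressed(struct toy_texture *tex, const uint8_t *blocks)
{
   assert(tex && blocks);

   memcpy(tex->main_data, blocks, TOY_BLOCKS);
   for (int i = 0; i < TOY_BLOCKS; i++)
      toy_decode_block(blocks[i], &tex->shadow_data[i * TOY_BYTES_PER_TEXEL]);
}

/* The Get path reads the main store, so it returns compressed bytes. */
static void
toy_get_compressed(const struct toy_texture *tex, uint8_t *out)
{
   memcpy(out, tex->main_data, TOY_BLOCKS);
}

/* The sampling path reads the shadow store. */
static const uint8_t *
toy_sampler_source(const struct toy_texture *tex)
{
   return tex->shadow_data;
}
```

The point of the split is that neither path has to convert on demand: the
Get path never sees decoded texels and the sampler never sees compressed
blocks.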

v2:
   - Fixes in the commit message (Nanley Chery)
   - Reverted the changes in brw_get_texture_swizzle and swapped the b, g
   values at the time that we decompress the data in the function:
   intel_miptree_update_etc_shadow of intel_mipmap_tree.c (Nanley Chery)
   - Simplified the format checks in the miptree_create function of the
   intel_mipmap_tree.c and reserved the call of the
   intel_lower_compressed_format for the case that we are faking the ETC
   support (Nanley Chery)
   - Removed the check for the auxiliary usage for the shadow miptree at
   creation (miptree_create of intel_mipmap_tree.c) as we won't use
   auxiliary buffers with these types of trees (Nanley Chery)
   - Set the etc_format of the non-ETC miptrees to MESA_FORMAT_NONE and
   removed the unnecessary checks (Nanley Chery)
   - Fixed an unrelated indentation change (Nanley Chery)
   - Modified the function intel_miptree_finish_write to set the
   mt->shadow_needs_update to true to catch all the cases when we need to
   update the miptree (Nanley Chery)
   - In order to update the shadow miptree during the unmap of the
   main and always map the main (Nanley Chery), the following change was
   necessary: split the previous update function, which updated all the
   mipmap levels, into two functions: one that updates one level and one
   that updates all of them. Used the first during unmap and the second
   before the rendering.
   - Removed the BRW_MAP_ETC_BIT flag and the mechanism to decide which
   miptree should be mapped each time, and reverted all the changes in the
   higher level texture functions that upload data to textures as they
   aren't needed anymore.
   - Replaced the boolean needs_fake_etc with an inline function that
   checks when we need to fake the ETC compression (Nanley Chery)
   - Removed the initialization of the strides in the update function as
   the values will be overwritten by the intel_miptree_map call (Nanley
   Chery)
   - Used minify instead of division in the new update function
   intel_miptree_update_etc_shadow_levels in intel_mipmap_tree.c (Nanley
   Chery)
   - Removed the depth from the calculation of the number of slices in
   the new update function (intel_miptree_update_etc_shadow_levels of
   intel_mipmap_tree.c) as we don't need to support 3D ETC images.
   (Nanley Chery)
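
On the minify item above: the changelog does not say why minify was
preferred over division. One plausible reason, shown in the sketch below,
is that a minify-style helper clamps the result to 1, while a plain
division by 2^level reaches 0 for the smallest mipmap levels. toy_minify
and toy_divide are hypothetical helpers written for this example, not the
driver's own functions.

```c
#include <assert.h>
#include <limits.h>

/* Halve `value` once per level, but never go below 1. */
static unsigned
toy_minify(unsigned value, unsigned levels)
{
   /* Shifting by the full width of the type would be undefined. */
   assert(levels < sizeof(unsigned) * CHAR_BIT);

   unsigned v = value >> levels;
   return v > 0 ? v : 1;
}

/* Plain division by 2^levels, shown only for comparison. */
static unsigned
toy_divide(unsigned value, unsigned levels)
{
   assert(levels < sizeof(unsigned) * CHAR_BIT);

   return value / (1u << levels);
}
```

For a 16-texel-wide base level both helpers agree at level 2, but for a
1-texel-wide base level the division collapses to 0 where the minify-style
helper stays at 1.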

v3:
  - Renamed the rgba_fmt in function miptree_create
  (intel_mipmap_tree.c) to decomp_format as the format is not always in
  rgba order. (Nanley Chery)
  - Documented the new usage for the shadow miptree in the comment above
  the field in the intel_miptree struct in intel_mipmap_tree.h (Nanley
  Chery)
  - Removed the redundant flags from the mapping of the miptrees in
  intel_miptree_update_etc_shadow of intel_mipmap_tree.c (Nanley Chery)
  - Fixed the switch from surface's logical level to physical level in
  the intel_miptree_update_etc_shadow_levels of intel_mipmap_tree.c
  (Nanley Chery)
  - Excluded the Baytrail GPUs from the check for the ETC emulation as
  they support the ETC formats natively. (Nanley Chery)
  - Simplified the check if the format is BGRA in
  intel_miptree_update_etc_shadow of intel_mipmap_tree.c (Nanley Chery)

v4:
  - Removed the functions intel_miptree_(map|unmap)_etc and the check if
  we need to call them, as they became unreachable with the new changes.
  (Nanley Chery)
  - We'd rather calculate the level width and height using the shadow
  miptree instead of the main in intel_miptree_update_etc_shadow_levels of
  intel_mipmap_tree.c (Nanley Chery)

v5:
  - Fixed the levels calculations in intel_mipmap_tree.c (Nanley Chery)
  - Update the flag shadow_needs_update outside the function
  intel_miptree_update_etc_shadow (Nanley Chery)
  - Fixed indentation error (Nanley Chery)
---
 .../drivers/dri/i965/brw_wm_surface_state.c   |   5 +-
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 176 +++---
 src/mesa/drivers/dri/i965/intel_mipmap_tree.h |  24 +++
 3 files changed, 138 insertions(+), 67 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c 
b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
index c55182d7ffb..c3d267721e1 100644
--- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
@@ -521,7 +521,7 @@ static void brw_update_texture_surface(struct gl_context 
*ctx,
   */
   

[Mesa-dev] [PATCH v5 1/4] i965: Rename intel_mipmap_tree::r8stencil_* -> ::shadow_*

2019-02-13 Thread Eleni Maria Stea
From: Nanley Chery 

Use more generic field names. We'll reuse these fields for a workaround
with ASTC miptrees.

Reviewed-by: Eleni Maria Stea 
---
 src/mesa/drivers/dri/i965/brw_wm_surface_state.c |  8 
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c| 16 
 src/mesa/drivers/dri/i965/intel_mipmap_tree.h| 14 +++---
 3 files changed, 19 insertions(+), 19 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c 
b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
index ece3197a858..c55182d7ffb 100644
--- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
@@ -571,15 +571,15 @@ static void brw_update_texture_surface(struct gl_context 
*ctx,
 
   if (obj->StencilSampling && firstImage->_BaseFormat == GL_DEPTH_STENCIL) 
{
  if (devinfo->gen <= 7) {
-assert(mt->r8stencil_mt && 
!mt->stencil_mt->r8stencil_needs_update);
-mt = mt->r8stencil_mt;
+assert(mt->shadow_mt && !mt->stencil_mt->shadow_needs_update);
+mt = mt->shadow_mt;
  } else {
 mt = mt->stencil_mt;
  }
  format = ISL_FORMAT_R8_UINT;
   } else if (devinfo->gen <= 7 && mt->format == MESA_FORMAT_S_UINT8) {
- assert(mt->r8stencil_mt && !mt->r8stencil_needs_update);
- mt = mt->r8stencil_mt;
+ assert(mt->shadow_mt && !mt->shadow_needs_update);
+ mt = mt->shadow_mt;
  format = ISL_FORMAT_R8_UINT;
   }
 
diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index b4e3524aa51..479188fd1c8 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -1214,7 +1214,7 @@ intel_miptree_release(struct intel_mipmap_tree **mt)
 
   brw_bo_unreference((*mt)->bo);
   intel_miptree_release(&(*mt)->stencil_mt);
-  intel_miptree_release(&(*mt)->r8stencil_mt);
+  intel_miptree_release(&(*mt)->shadow_mt);
   intel_miptree_aux_buffer_free((*mt)->aux_buf);
   free_aux_state_map((*mt)->aux_state);
 
@@ -2427,7 +2427,7 @@ intel_miptree_finish_write(struct brw_context *brw,
switch (mt->aux_usage) {
case ISL_AUX_USAGE_NONE:
   if (mt->format == MESA_FORMAT_S_UINT8 && devinfo->gen <= 7)
- mt->r8stencil_needs_update = true;
+ mt->shadow_needs_update = true;
   break;
 
case ISL_AUX_USAGE_MCS:
@@ -2933,9 +2933,9 @@ intel_update_r8stencil(struct brw_context *brw,
 
assert(src->surf.size_B > 0);
 
-   if (!mt->r8stencil_mt) {
+   if (!mt->shadow_mt) {
   assert(devinfo->gen > 6); /* Handle MIPTREE_LAYOUT_GEN6_HIZ_STENCIL */
-  mt->r8stencil_mt = make_surface(
+  mt->shadow_mt = make_surface(
 brw,
 src->target,
 MESA_FORMAT_R_UINT8,
@@ -2949,13 +2949,13 @@ intel_update_r8stencil(struct brw_context *brw,
 ISL_TILING_Y0_BIT,
 ISL_SURF_USAGE_TEXTURE_BIT,
 BO_ALLOC_BUSY, 0, NULL);
-  assert(mt->r8stencil_mt);
+  assert(mt->shadow_mt);
}
 
-   if (src->r8stencil_needs_update == false)
+   if (src->shadow_needs_update == false)
   return;
 
-   struct intel_mipmap_tree *dst = mt->r8stencil_mt;
+   struct intel_mipmap_tree *dst = mt->shadow_mt;
 
for (int level = src->first_level; level <= src->last_level; level++) {
   const unsigned depth = src->surf.dim == ISL_SURF_DIM_3D ?
@@ -2975,7 +2975,7 @@ intel_update_r8stencil(struct brw_context *brw,
}
 
brw_cache_flush_for_read(brw, dst->bo);
-   src->r8stencil_needs_update = false;
+   src->shadow_needs_update = false;
 }
 
 static void *
diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h 
b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
index 17668944adc..1a7507023a1 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
@@ -294,16 +294,16 @@ struct intel_mipmap_tree
struct intel_mipmap_tree *stencil_mt;
 
/**
-* \brief Stencil texturing miptree for sampling from a stencil texture
+* \brief Shadow miptree for sampling when the main isn't supported by HW.
 *
-* Some hardware doesn't support sampling from the stencil texture as
-* required by the GL_ARB_stencil_texturing extenion. To workaround this we
-* blit the texture into a new texture that can be sampled.
+* To workaround various sampler bugs and limitations, we blit the main
+* texture into a new texture that can be sampled.
 *
-* \see intel_update_r8stencil()
+* This miptree may be used for:
+* - Stencil texturin

[Mesa-dev] [PATCH v5 0/4] improved the support for ETC2 formats on Gen 7

2019-02-13 Thread Eleni Maria Stea
Intel Gen7 GPUs don't support the ETC2 formats natively, so in order to
show the pixels properly we convert them to RGBA and create RGBA miptrees.
The problem is that the GetCompressed* functions, which should return the
compressed pixel values, return the RGBA values instead.

These patches attempt to solve this problem by using 2 miptrees: the main
one to store the ETC values and the generic shadow (mt->shadow_mt) to store
the RGBA. Each time that the main miptree is unmapped we unpack the ETC to
RGBA and we update the shadow. Similarly, we update all the mipmap levels
of the image (if necessary) before the drawing, for the CopyImageSubData
to work.

Also, the OES_copy_image extension, which couldn't work on Gen 7 due to
the lack of ETC support, is now re-enabled.

Finally, the following glcts and piglit tests pass:

On HSW (previously failing):
----------------------------
KHR-GL46.direct_state_access.textures_compressed_subimage

On HSW and IVB (previously skipped):
------------------------------------
dEQP-GLES31.functional.texture.border_clamp.formats.compressed_srgb8_alpha8_etc2_eac.*
   (6 tests)
dEQP-GLES31.functional.texture.border_clamp.formats.compressed_srgb8_etc2.*
   (6 tests)
dEQP-GLES31.functional.texture.border_clamp.formats.compressed_srgb8_punchthrough_alpha1_etc2.*
   (6 tests)

On HSW, IVB, SNB (previously skipped):
--------------------------------------
dEQP-GLES3.functional.texture.format.compressed.*
   (12 tests)
dEQP-GLES3.functional.texture.wrap.etc2_eac_srgb8_alpha8.*
   (36 tests)
dEQP-GLES3.functional.texture.wrap.etc2_srgb8.*
   (36 tests)
dEQP-GLES3.functional.texture.wrap.etc2_srgb8_punchthrough_alpha1.*
   (36 tests)

piglit.spec.!opengl es 3_0.oes_compressed_etc2_texture-miptree_gles3
   (srgb8, srgb8-alpha, srgb8-punchthrough-alpha1)
piglit.spec.arb_es3_compatibility.oes_compressed_etc2_texture-miptree
   (srgb8 compat, srgb8 core, srgb8-alpha8 compat, srgb8-alpha8 core,
   srgb8-punchthrough-alpha1 compat, srgb8-punchthrough-alpha1 core)
   (9 tests)

Total tests passing: 148

Eleni Maria Stea (3):
  i965: Faking the ETC2 compression on Gen < 8 GPUs using two miptrees.
  i965: Fixed the CopyImageSubData for ETC2 on Gen < 8
  i965: Enabled the OES_copy_image extension on Gen 7 GPUs

Nanley Chery (1):
  i965: Rename intel_mipmap_tree::r8stencil_* -> ::shadow_*

 src/mesa/drivers/dri/i965/brw_draw.c  |   5 +
 .../drivers/dri/i965/brw_wm_surface_state.c   |  13 +-
 src/mesa/drivers/dri/i965/intel_extensions.c  |  16 +-
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 179 ++
 src/mesa/drivers/dri/i965/intel_mipmap_tree.h |  38 +++-
 5 files changed, 161 insertions(+), 90 deletions(-)

-- 
2.20.1

[Mesa-dev] [PATCH v4 0/4] improved the support for ETC2 formats on Gen 7

2019-02-10 Thread Eleni Maria Stea
Intel Gen7 GPUs don't support the ETC2 formats natively, so in order to
show the pixels properly we convert them to RGBA and create RGBA miptrees.
The problem is that the GetCompressed* functions, which should return the
compressed pixel values, return the RGBA values instead.

These patches attempt to solve this problem by using 2 miptrees: the main
one to store the ETC values and the generic shadow (mt->shadow_mt) to store
the RGBA. Each time that the main miptree is unmapped we unpack the ETC to
RGBA and we update the shadow. Similarly, we update all the mipmap levels
of the image (if necessary) before the drawing, for the CopyImageSubData
to work.

Also, the OES_copy_image extension, which couldn't work on Gen 7 due to
the lack of ETC support, is now re-enabled.

Finally, the following glcts and piglit tests pass:

On HSW (previously failing):
----------------------------
KHR-GL46.direct_state_access.textures_compressed_subimage

On HSW and IVB (previously skipped):
------------------------------------
dEQP-GLES31.functional.texture.border_clamp.formats.compressed_srgb8_alpha8_etc2_eac.*
   (6 tests)
dEQP-GLES31.functional.texture.border_clamp.formats.compressed_srgb8_etc2.*
   (6 tests)
dEQP-GLES31.functional.texture.border_clamp.formats.compressed_srgb8_punchthrough_alpha1_etc2.*
   (6 tests)

On HSW, IVB, SNB (previously skipped):
--------------------------------------
dEQP-GLES3.functional.texture.format.compressed.*
   (12 tests)
dEQP-GLES3.functional.texture.wrap.etc2_eac_srgb8_alpha8.*
   (36 tests)
dEQP-GLES3.functional.texture.wrap.etc2_srgb8.*
   (36 tests)
dEQP-GLES3.functional.texture.wrap.etc2_srgb8_punchthrough_alpha1.*
   (36 tests)

piglit.spec.!opengl es 3_0.oes_compressed_etc2_texture-miptree_gles3
   (srgb8, srgb8-alpha, srgb8-punchthrough-alpha1)
piglit.spec.arb_es3_compatibility.oes_compressed_etc2_texture-miptree
   (srgb8 compat, srgb8 core, srgb8-alpha8 compat, srgb8-alpha8 core,
   srgb8-punchthrough-alpha1 compat, srgb8-punchthrough-alpha1 core)
   (9 tests)

Total tests passing: 148

Eleni Maria Stea (3):
  i965: Faking the ETC2 compression on Gen < 8 GPUs using two miptrees.
  i965: Fixed the CopyImageSubData for ETC2 on Gen < 8
  i965: Enabled the OES_copy_image extension on Gen 7 GPUs

Nanley Chery (1):
  i965: Rename intel_mipmap_tree::r8stencil_* -> ::shadow_*

 src/mesa/drivers/dri/i965/brw_draw.c  |   5 +
 .../drivers/dri/i965/brw_wm_surface_state.c   |  13 +-
 src/mesa/drivers/dri/i965/intel_extensions.c  |  16 +-
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 188 +++---
 src/mesa/drivers/dri/i965/intel_mipmap_tree.h |  38 +++-
 5 files changed, 170 insertions(+), 90 deletions(-)

-- 
2.20.1


[Mesa-dev] [PATCH v4 1/4] i965: Rename intel_mipmap_tree::r8stencil_* -> ::shadow_*

2019-02-10 Thread Eleni Maria Stea
From: Nanley Chery 

Use more generic field names. We'll reuse these fields for a workaround
with ASTC miptrees.

Reviewed-by: Eleni Maria Stea 
---
 src/mesa/drivers/dri/i965/brw_wm_surface_state.c |  8 
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c| 16 
 src/mesa/drivers/dri/i965/intel_mipmap_tree.h| 14 +++---
 3 files changed, 19 insertions(+), 19 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c 
b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
index b067a174056..618e2ab35bc 100644
--- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
@@ -571,15 +571,15 @@ static void brw_update_texture_surface(struct gl_context 
*ctx,
 
   if (obj->StencilSampling && firstImage->_BaseFormat == GL_DEPTH_STENCIL) 
{
  if (devinfo->gen <= 7) {
-assert(mt->r8stencil_mt && 
!mt->stencil_mt->r8stencil_needs_update);
-mt = mt->r8stencil_mt;
+assert(mt->shadow_mt && !mt->stencil_mt->shadow_needs_update);
+mt = mt->shadow_mt;
  } else {
 mt = mt->stencil_mt;
  }
  format = ISL_FORMAT_R8_UINT;
   } else if (devinfo->gen <= 7 && mt->format == MESA_FORMAT_S_UINT8) {
- assert(mt->r8stencil_mt && !mt->r8stencil_needs_update);
- mt = mt->r8stencil_mt;
+ assert(mt->shadow_mt && !mt->shadow_needs_update);
+ mt = mt->shadow_mt;
  format = ISL_FORMAT_R8_UINT;
   }
 
diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index b4e3524aa51..479188fd1c8 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -1214,7 +1214,7 @@ intel_miptree_release(struct intel_mipmap_tree **mt)
 
   brw_bo_unreference((*mt)->bo);
   intel_miptree_release(&(*mt)->stencil_mt);
-  intel_miptree_release(&(*mt)->r8stencil_mt);
+  intel_miptree_release(&(*mt)->shadow_mt);
   intel_miptree_aux_buffer_free((*mt)->aux_buf);
   free_aux_state_map((*mt)->aux_state);
 
@@ -2427,7 +2427,7 @@ intel_miptree_finish_write(struct brw_context *brw,
switch (mt->aux_usage) {
case ISL_AUX_USAGE_NONE:
   if (mt->format == MESA_FORMAT_S_UINT8 && devinfo->gen <= 7)
- mt->r8stencil_needs_update = true;
+ mt->shadow_needs_update = true;
   break;
 
case ISL_AUX_USAGE_MCS:
@@ -2933,9 +2933,9 @@ intel_update_r8stencil(struct brw_context *brw,
 
assert(src->surf.size_B > 0);
 
-   if (!mt->r8stencil_mt) {
+   if (!mt->shadow_mt) {
   assert(devinfo->gen > 6); /* Handle MIPTREE_LAYOUT_GEN6_HIZ_STENCIL */
-  mt->r8stencil_mt = make_surface(
+  mt->shadow_mt = make_surface(
 brw,
 src->target,
 MESA_FORMAT_R_UINT8,
@@ -2949,13 +2949,13 @@ intel_update_r8stencil(struct brw_context *brw,
 ISL_TILING_Y0_BIT,
 ISL_SURF_USAGE_TEXTURE_BIT,
 BO_ALLOC_BUSY, 0, NULL);
-  assert(mt->r8stencil_mt);
+  assert(mt->shadow_mt);
}
 
-   if (src->r8stencil_needs_update == false)
+   if (src->shadow_needs_update == false)
   return;
 
-   struct intel_mipmap_tree *dst = mt->r8stencil_mt;
+   struct intel_mipmap_tree *dst = mt->shadow_mt;
 
for (int level = src->first_level; level <= src->last_level; level++) {
   const unsigned depth = src->surf.dim == ISL_SURF_DIM_3D ?
@@ -2975,7 +2975,7 @@ intel_update_r8stencil(struct brw_context *brw,
}
 
brw_cache_flush_for_read(brw, dst->bo);
-   src->r8stencil_needs_update = false;
+   src->shadow_needs_update = false;
 }
 
 static void *
diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h 
b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
index 17668944adc..1a7507023a1 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
@@ -294,16 +294,16 @@ struct intel_mipmap_tree
struct intel_mipmap_tree *stencil_mt;
 
/**
-* \brief Stencil texturing miptree for sampling from a stencil texture
+* \brief Shadow miptree for sampling when the main isn't supported by HW.
 *
-* Some hardware doesn't support sampling from the stencil texture as
-* required by the GL_ARB_stencil_texturing extenion. To workaround this we
-* blit the texture into a new texture that can be sampled.
+* To workaround various sampler bugs and limitations, we blit the main
+* texture into a new texture that can be sampled.
 *
-* \see intel_update_r8stencil()
+* This miptree may be used for:
+* - Stencil texturin

[Mesa-dev] [PATCH v4 4/4] i965: Enabled the OES_copy_image extension on Gen 7 GPUs

2019-02-10 Thread Eleni Maria Stea
The OES_copy_image extension was disabled on Gen7 due to the lack of support
for ETC2 images. Re-enable it. (Kenneth Graunke)

v2:
  - Removed the blank lines in the comments above OES_copy_image and
  OES_texture_view extensions in intel_extensions.c (Nanley Chery)
---
 src/mesa/drivers/dri/i965/intel_extensions.c | 16 
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_extensions.c 
b/src/mesa/drivers/dri/i965/intel_extensions.c
index 3a95be58a63..d2a6aa185c2 100644
--- a/src/mesa/drivers/dri/i965/intel_extensions.c
+++ b/src/mesa/drivers/dri/i965/intel_extensions.c
@@ -287,14 +287,22 @@ intelInitExtensions(struct gl_context *ctx)
}
 
if (devinfo->gen >= 8 || devinfo->is_baytrail) {
-  /* For now, we only enable OES_copy_image on platforms that support
-   * ETC2 natively in hardware.  We would need more hacks to support it
-   * elsewhere. Same with OES_texture_view.
+  /* For now, we can't enable OES_texture_view on Gen 7 because of
+   * some piglit failures coming from
+   * piglit/tests/spec/arb_texture_view/rendering-formats.c that need
+   * investigation.
*/
-  ctx->Extensions.OES_copy_image = true;
   ctx->Extensions.OES_texture_view = true;
}
 
+   if (devinfo->gen >= 7) {
+  /* We can safely enable OES_copy_image on Gen 7, since we emulate
+   * the ETC2 support using the shadow_miptree to store the
+   * decompressed data.
+   */
+  ctx->Extensions.OES_copy_image = true;
+   }
+
if (devinfo->gen >= 8) {
   ctx->Extensions.ARB_gpu_shader_int64 = true;
   /* requires ARB_gpu_shader_int64 */
-- 
2.20.1


[Mesa-dev] [PATCH v4 3/4] i965: Fixed the CopyImageSubData for ETC2 on Gen < 8

2019-02-10 Thread Eleni Maria Stea
For CopyImageSubData to copy the data during the 1st draw call, we need
to update the shadow tree right before the rendering.

v2:
  - Added assertion that the miptree doesn't need update at the time we
  update the texture surface. (Nanley Chery)

v3:
  - As we now update the tree before the rendering, we don't need to copy
  the data during the unmap anymore. Removed the unnecessary update from
  intel_miptree_unmap in intel_mipmap_tree.c and modified the
  intel_miptree_update_etc_shadow.* functions in the same file to
  properly update the mipmap levels, so that mipmap generation continues
  to work after the change. (Nanley Chery)
---
 src/mesa/drivers/dri/i965/brw_draw.c|  5 +
 .../drivers/dri/i965/brw_wm_surface_state.c |  2 +-
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c   | 17 ++---
 3 files changed, 8 insertions(+), 16 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_draw.c 
b/src/mesa/drivers/dri/i965/brw_draw.c
index ec4fe0b096f..d00e0a726b1 100644
--- a/src/mesa/drivers/dri/i965/brw_draw.c
+++ b/src/mesa/drivers/dri/i965/brw_draw.c
@@ -559,6 +559,11 @@ brw_predraw_resolve_inputs(struct brw_context *brw, bool 
rendering,
   tex_obj->mt->format == MESA_FORMAT_S_UINT8) {
  intel_update_r8stencil(brw, tex_obj->mt);
   }
+
+  if (intel_miptree_has_etc_shadow(brw, tex_obj->mt) &&
+  tex_obj->mt->shadow_needs_update) {
+ intel_miptree_update_etc_shadow_levels(brw, tex_obj->mt);
+  }
}
 
/* Resolve color for each active shader image. */
diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c 
b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
index c2cf34aee71..437c7c82555 100644
--- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
@@ -582,7 +582,7 @@ static void brw_update_texture_surface(struct gl_context 
*ctx,
  mt = mt->shadow_mt;
  format = ISL_FORMAT_R8_UINT;
   } else if (intel_miptree_needs_fake_etc(brw, mt)) {
- assert(mt->shadow_mt);
+ assert(mt->shadow_mt && !mt->shadow_needs_update);
  mt = mt->shadow_mt;
   }
 
diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index e50db649a23..86085db6a90 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -3780,7 +3780,6 @@ intel_miptree_unmap(struct brw_context *brw,
 unsigned int slice)
 {
struct intel_miptree_map *map = mt->level[level].slice[slice].map;
-   int level_w, level_h;
 
assert(mt->surf.samples == 1);
 
@@ -3790,20 +3789,10 @@ intel_miptree_unmap(struct brw_context *brw,
DBG("%s: mt %p (%s) level %d slice %d\n", __func__,
mt, _mesa_get_format_name(mt->format), level, slice);
 
-   level_w = minify(mt->surf.phys_level0_sa.width,
-level - mt->first_level);
-   level_h = minify(mt->surf.phys_level0_sa.height,
-level - mt->first_level);
-
if (map->unmap)
   map->unmap(brw, mt, map, level, slice);
 
intel_miptree_release_map(mt, level, slice);
-
-   if (intel_miptree_has_etc_shadow(brw, mt) && mt->shadow_needs_update) {
-  intel_miptree_update_etc_shadow(brw, mt, level, slice, level_w,
-  level_h);
-   }
 }
 
 enum isl_surf_dim
@@ -3936,7 +3925,6 @@ intel_miptree_update_etc_shadow(struct brw_context *brw,
if (!mt->shadow_needs_update)
   return;
 
-   mt->shadow_needs_update = false;
smt = mt->shadow_mt;
 
etc_mode = GL_MAP_READ_BIT;
@@ -3989,10 +3977,9 @@ intel_miptree_update_etc_shadow_levels(struct 
brw_context *brw,
   }
 
   level_w = minify(smt->surf.logical_level0_px.width,
-   level - smt->first_level);
+   level - smt->first_level + 1);
   level_h = minify(smt->surf.logical_level0_px.height,
-   level - smt->first_level);
+   level - smt->first_level + 1);
}
-
mt->shadow_needs_update = false;
 }
-- 
2.20.1


[Mesa-dev] [PATCH v4 2/4] i965: Faking the ETC2 compression on Gen < 8 GPUs using two miptrees.

2019-02-10 Thread Eleni Maria Stea
GPUs Gen < 8 cannot sample ETC2 formats. So far, they converted the
compressed EAC/ETC2 images to non-compressed RGBA images. When
GetCompressed* functions were called, the pixels were returned in this
RGBA format and not the compressed format that was expected.

To fix this problem, we use a secondary shadow miptree to store the
decompressed data for rendering, and the main miptree to store the
compressed data so that the Get functions work. Each time the main miptree
is written with compressed data, we decompress the data to RGB and update
the shadow. We then use the shadow for rendering.

v2:
   - Fixes in the commit message (Nanley Chery)
   - Reversed the changes in brw_get_texture_swizzle and swapped the b, g
   values at the time that we decompress the data in the function:
   intel_miptree_update_etc_shadow of intel_mipmap_tree.c (Nanley Chery)
   - Simplified the format checks in the miptree_create function of the
   intel_mipmap_tree.c and reserved the call of the
   intel_lower_compressed_format for the case that we are faking the ETC
   support (Nanley Chery)
   - Removed the check for the auxiliary usage for the shadow miptree at
   creation (miptree_create of intel_mipmap_tree.c) as we won't use
   auxiliary buffers with these types of trees (Nanley Chery)
   - Set the etc_format of the non-ETC miptrees to MESA_FORMAT_NONE and
   removed the unnecessary checks (Nanley Chery)
   - Fixed an unrelated indentation change (Nanley Chery)
   - Modified the function intel_miptree_finish_write to set the
   mt->shadow_needs_update to true to catch all the cases when we need to
   update the miptree (Nanley Chery)
   - In order to update the shadow miptree during the unmap of the
   main and always map the main (Nanley Chery), the previous update
   function, which updated all the mipmap levels, was split into two
   functions: one that updates a single level and one that updates all
   of them. The first is used during unmap and the second before
   rendering.
   - Removed the BRW_MAP_ETC_BIT flag and the mechanism to decide which
   miptree should be mapped each time and reversed all the changes in the
   higher level texture functions that upload data to textures as they
   aren't needed anymore.
   - Replaced the boolean needs_fake_etc with an inline function that
   checks when we need to fake the ETC compression (Nanley Chery)
   - Removed the initialization of the strides in the update function as
   the values will be overwritten by the intel_miptree_map call (Nanley
   Chery)
   - Used minify instead of division in the new update function
   intel_miptree_update_etc_shadow_levels in intel_mipmap_tree.c (Nanley
   Chery)
   - Removed the depth from the calculation of the number of slices in
   the new update function (intel_miptree_update_etc_shadow_levels of
   intel_mipmap_tree.c) as we don't need to support 3D ETC images.
   (Nanley Chery)

v3:
  - Renamed the rgba_fmt in function miptree_create
  (intel_mipmap_tree.c) to decomp_format as the format is not always in
  rgba order. (Nanley Chery)
  - Documented the new usage for the shadow miptree in the comment above
  the field in the intel_miptree struct in intel_mipmap_tree.h (Nanley
  Chery)
  - Removed the redundant flags from the mapping of the miptrees in
  intel_miptree_update_etc_shadow of intel_mipmap_tree.c (Nanley Chery)
  - Fixed the switch from surface's logical level to physical level in
  the intel_miptree_update_etc_shadow_levels of intel_mipmap_tree.c
  (Nanley Chery)
  - Excluded the Baytrail GPUs from the check for the ETC emulation as
  they support the ETC formats natively. (Nanley Chery)
  - Simplified the check if the format is BGRA in
  intel_miptree_update_etc_shadow of intel_mipmap_tree.c (Nanley Chery)

v4:
  - Removed the functions intel_miptree_(map|unmap)_etc and the check if
   we need to call them as with the new changes, they became unreachable.
   (Nanley Chery)
  - We'd rather calculate the level width and height using the shadow
  miptree instead of the main in intel_miptree_update_etc_shadow_levels of
  intel_mipmap_tree.c (Nanley Chery)
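
A minimal sketch of the level-size calculation referred to in the v4 note,
assuming a helper that halves a dimension once per level and clamps the
result at 1. The names toy_minify and toy_level_widths are hypothetical and
only model the loop shape; they are not the driver's code. The loop uses the
current size first and computes the next level's size at the end of each
iteration, which is why the offset passed to the helper is one more than the
current level index.

```c
#include <assert.h>

/* Hypothetical helper: shift the level-0 size down by `levels`, never
 * returning less than 1. */
static unsigned
toy_minify(unsigned value, unsigned levels)
{
   unsigned v = value >> levels;
   return v > 0 ? v : 1;
}

/* Fill widths[] for levels first_level..last_level. The size for the
 * current level is consumed first; the size for the NEXT level is computed
 * at the end of the iteration, hence the "+ 1". */
static void
toy_level_widths(unsigned level0_width, int first_level, int last_level,
                 unsigned *widths)
{
   unsigned level_w = level0_width;

   for (int level = first_level; level <= last_level; level++) {
      widths[level - first_level] = level_w;
      level_w = toy_minify(level0_width, level - first_level + 1);
   }
}
```

With a level-0 width of 8 and four levels, this yields 8, 4, 2, 1. Without
the "+ 1" the first two levels would both be given the level-0 width.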
---
 .../drivers/dri/i965/brw_wm_surface_state.c   |   5 +-
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 185 +++---
 src/mesa/drivers/dri/i965/intel_mipmap_tree.h |  24 +++
 3 files changed, 147 insertions(+), 67 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c 
b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
index 618e2ab35bc..c2cf34aee71 100644
--- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
@@ -521,7 +521,7 @@ static void brw_update_texture_surface(struct gl_context 
*ctx,
   */
  mesa_fmt = mt->format;
   } else if (mt->etc_format != MESA_FORMAT_NONE) {
- mesa_fmt = mt->format;
+ mesa_fmt = mt->shadow_mt->format;
   } else if (plane > 0) {
  mesa_fmt = mt->format;

Re: [Mesa-dev] [PATCH v3 2/5] i965: Faking the ETC2 compression on Gen < 8 GPUs using two miptrees.

2019-02-08 Thread Eleni Maria Stea
Hi Nanley,

On Thu, 7 Feb 2019 15:46:29 -0800
Nanley Chery  wrote:
 >  
> > @@ -3825,10 +3849,20 @@ intel_miptree_unmap(struct brw_context *brw,
> > DBG("%s: mt %p (%s) level %d slice %d\n", __func__,
> > mt, _mesa_get_format_name(mt->format), level, slice);
> >  
> > +   level_w = minify(mt->surf.phys_level0_sa.width,
> > +level - mt->first_level);
> > +   level_h = minify(mt->surf.phys_level0_sa.height,
> > +level - mt->first_level);
> > +
> > if (map->unmap)
> >map->unmap(brw, mt, map, level, slice);
> >  
> > intel_miptree_release_map(mt, level, slice);
> > +
> > +   if (intel_miptree_has_etc_shadow(brw, mt) &&
> > mt->shadow_needs_update) {
> > +  intel_miptree_update_etc_shadow(brw, mt, level, slice,
> > level_w,
> > +  level_h);
> > +   }  
> 
> With the next patch applied, the change in this function becomes
> unnecessary. Is there any reason you're leaving it around?

On second thought, I believe this change is actually necessary.
There is a problem if we remove it:

When we generate mipmaps we need to update the shadow for each level.
As the update is done per level during unmap, if we remove the call we
end up with the first level correctly updated but all the others empty.
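
To illustrate one way such a symptom can arise, here is a toy model (not the
i965 code; every name in it is hypothetical). It assumes a single tree-wide
"needs update" flag that the per-level update clears itself, so that a loop
over all levels only ever updates the first one. Whether this is the actual
cause here is an assumption.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NUM_LEVELS 4

/* Hypothetical stand-in for a miptree with a shadow copy. */
struct toy_miptree {
   bool shadow_needs_update;
   bool shadow_level_valid[TOY_NUM_LEVELS];
};

/* Per-level update. When clear_flag is true it resets the tree-wide flag
 * itself, which makes every later call return early. */
static void
toy_update_level(struct toy_miptree *mt, int level, bool clear_flag)
{
   if (!mt->shadow_needs_update)
      return;

   if (clear_flag)
      mt->shadow_needs_update = false;

   mt->shadow_level_valid[level] = true;
}

/* Update every level, clear the flag once at the end, and return how many
 * levels of the shadow ended up valid. */
static int
toy_update_all_levels(struct toy_miptree *mt, bool clear_flag_per_level)
{
   int valid = 0;

   for (int level = 0; level < TOY_NUM_LEVELS; level++)
      toy_update_level(mt, level, clear_flag_per_level);

   mt->shadow_needs_update = false;

   for (int level = 0; level < TOY_NUM_LEVELS; level++)
      valid += mt->shadow_level_valid[level] ? 1 : 0;

   return valid;
}
```

In this model, clearing the flag inside the per-level function leaves only
level 0 valid, while clearing it once after the loop updates all levels.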

An example:
git clone https://github.com/hikiko/test-compression.git
make
./test compressed/full.tex

This test loads dumped compressed mipmap levels from full.tex and
displays them. If you run it with the per-level update inside the unmap,
you will see all the mipmap levels. Without it, you will see only the
first, like here: https://imgur.com/a/VvS0CYC

Do you have any suggestion on how I could bypass this problem?

Thanks again,
Eleni






Re: [Mesa-dev] [PATCH v3 2/5] i965: Faking the ETC2 compression on Gen < 8 GPUs using two miptrees.

2019-02-07 Thread Eleni Maria Stea
Hello,

On Thu, 7 Feb 2019 15:46:29 -0800
Nanley Chery  wrote:


> > -  !(mode & BRW_MAP_DIRECT_BIT)) {
> > +  !(mode & BRW_MAP_DIRECT_BIT) &&
> > +  !(intel_miptree_needs_fake_etc(brw, mt))) {
> >intel_miptree_map_etc(brw, mt, map, level, slice);  
> 
> Out of curiosity, is there any reason you wait until patch 5 to delete
> this case?

No, I just removed these lines together with the unreachable map/unmap_etc
functions. I will move the change to this patch.

> 
> > } else if (mt->stencil_mt && !(mode & BRW_MAP_DIRECT_BIT)) {
> >intel_miptree_map_depthstencil(brw, mt, map, level, slice);
> > @@ -3816,6 +3839,7 @@ intel_miptree_unmap(struct brw_context *brw,
> >  unsigned int slice)
> >  {
> > struct intel_miptree_map *map =
> > mt->level[level].slice[slice].map;
> > +   int level_w, level_h;
> >  
> > assert(mt->surf.samples == 1);
> >  
> > @@ -3825,10 +3849,20 @@ intel_miptree_unmap(struct brw_context *brw,
> > DBG("%s: mt %p (%s) level %d slice %d\n", __func__,
> > mt, _mesa_get_format_name(mt->format), level, slice);
> >  
> > +   level_w = minify(mt->surf.phys_level0_sa.width,
> > +level - mt->first_level);
> > +   level_h = minify(mt->surf.phys_level0_sa.height,
> > +level - mt->first_level);
> > +
> > if (map->unmap)
> >map->unmap(brw, mt, map, level, slice);
> >  
> > intel_miptree_release_map(mt, level, slice);
> > +
> > +   if (intel_miptree_has_etc_shadow(brw, mt) &&
> > mt->shadow_needs_update) {
> > +  intel_miptree_update_etc_shadow(brw, mt, level, slice,
> > level_w,
> > +  level_h);
> > +   }  
> 
> With the next patch applied, the change in this function becomes
> unnecessary. Is there any reason you're leaving it around?

Right, if we force the update before the rendering, we don't need to
copy the data during the unmap. I will remove it; sorry, I missed
it in the previous email.
 
> >  }
> >  
> >  enum isl_surf_dim
> > @@ -3943,3 +3977,81 @@ intel_miptree_get_clear_color(const struct
> > gen_device_info *devinfo, return mt->fast_clear_color;
> > }
> >  }
> > +
> > +static void
> > +intel_miptree_update_etc_shadow(struct brw_context *brw,
> > +struct intel_mipmap_tree *mt,
> > +unsigned int level,
> > +unsigned int slice,
> > +int level_w,
> > +int level_h)
> > +{
> > +   struct intel_mipmap_tree *smt;
> > +   ptrdiff_t etc_stride, shadow_stride;
> > +   GLbitfield etc_mode, shadow_mode;
> > +   void *mptr, *sptr;
> > +
> > +   assert(intel_miptree_has_etc_shadow(brw, mt));
> > +   if (!mt->shadow_needs_update)
> > +  return;
> > +
> > +   mt->shadow_needs_update = false;
> > +   smt = mt->shadow_mt;
> > +
> > +   etc_mode = GL_MAP_READ_BIT;
> > +   shadow_mode = GL_MAP_WRITE_BIT | GL_MAP_INVALIDATE_RANGE_BIT;
> > +
> > +   intel_miptree_map(brw, mt, level, slice, 0, 0, level_w, level_h,
> > + etc_mode, &mptr, &etc_stride);
> > +   intel_miptree_map(brw, smt, level, slice, 0, 0, level_w,
> > level_h,
> > + shadow_mode, &sptr, &shadow_stride);
> > +
> > +   if (mt->format == MESA_FORMAT_ETC1_RGB8) {
> > +  _mesa_etc1_unpack_rgba(sptr, shadow_stride, mptr,
> > etc_stride,
> > + level_w, level_h);
> > +   } else {
> > +  /* destination and source images must have the same swizzle
> > */
> > +  bool is_bgra = (smt->format == MESA_FORMAT_B8G8R8A8_SRGB);
> > +  _mesa_unpack_etc2_format(sptr, shadow_stride, mptr,
> > etc_stride,
> > +   level_w, level_h, mt->format,
> > is_bgra);
> > +   }
> > +
> > +   intel_miptree_unmap(brw, mt, level, slice);
> > +   intel_miptree_unmap(brw, smt, level, slice);
> > +}
> > +
> > +void
> > +intel_miptree_update_etc_shadow_levels(struct brw_context *brw,
> > +   struct intel_mipmap_tree *mt)
> > +{
> > +   struct intel_mipmap_tree *smt;
> > +   int num_slices;
> > +   int level_w, level_h;
> > +
> > +   assert(mt);
> > +   assert(mt->surf.size_B > 0);
> > +
> > +   assert(intel_miptree_has_etc_shadow(brw, mt));
> > +
> > +   smt = mt->shadow_mt;
> > +
> > +   level_w = smt->surf.logical_level0_px.width;
> > +   level_h = smt->surf.logical_level0_px.height;
> > +
> > +   num_slices = smt->surf.logical_level0_px.array_len;
> > +
> > +   for (int level = smt->first_level; level <= smt->last_level;
> > level++)
> > +   {
> > +  for (unsigned int slice = 0; slice < num_slices; slice++) {
> > + intel_miptree_update_etc_shadow(brw, mt, level, slice,
> > level_w,
> > + level_h);
> > +  }
> > +
> > +  level_w = minify(mt->surf.logical_level0_px.width,
> > +   level - mt->first_level);
> > +  level_h 

Re: [Mesa-dev] [PATCH v2 5/5] i965: Enabled the OES_copy_image extension on Gen 7 GPUs

2019-02-07 Thread Eleni Maria Stea
On Thu, 7 Feb 2019 11:18:59 -0500
Ilia Mirkin  wrote:

> On Thu, Feb 7, 2019 at 2:49 AM Eleni Maria Stea 
> wrote:
> >
> > On Wed, 6 Feb 2019 12:12:27 -0800
> > Nanley Chery  wrote:
> >  
> > > > +   * For now, we can't enable OES_texture_view on Gen 7
> > > > because of
> > > > +   * some piglit failures coming from
> > > > +   * piglit/tests/spec/arb_texture_view/rendering-formats.c
> > > > that need
> > > > +   * investigation.
> > > > */  
> > >
> > > What kind of failures are you seeing? I'd imagine texture views to
> > > work with this version of your series.
> > >  
> >
> > Hi Nanley,
> >
> > If you run the piglit test: arb_texture_view-rendering-format, and
> > grep for failures on HSW:
> >
> > PIGLIT: {"subtest": {"clear GL_RGB10_A2 as GL_R32F" : "fail"}}
> > PIGLIT: {"subtest": {"clear GL_RGB10_A2 as GL_RG16F" : "fail"}}
> > PIGLIT: {"subtest": {"clear GL_RGB10_A2 as GL_RG16I" : "fail"}}
> > PIGLIT: {"subtest": {"clear GL_RGB10_A2 as GL_RG16_SNORM" : "fail"}}
> > PIGLIT: {"subtest": {"clear GL_RGB10_A2 as GL_RGBA8I" : "fail"}}
> > PIGLIT: {"subtest": {"clear GL_RGB10_A2 as GL_RGBA8_SNORM" :
> > "fail"}}
> >
> > I remember seeing similar errors on Ivy too. They must be
> > unrelated to the ETC support but as this test passes on BDW where
> > the extension is enabled, I didn't enable it on Gen 7 for the
> > moment. I think I had discussed these failures with Kenneth
> > before I disabled them, but I didn't investigate them further
> > after that.  
> 
> Do you also see the failures with desktop GL (and ARB_texture_view)?
> If not, that'd be very surprising.
> 
> Note that the piglit test arb_texture_view-rendering-formats is the
> desktop GL test. arb_texture_view-rendering-formats_gles3 is the ES
> version.
> 
>   -ilia
> 

Hi Ilia,

I just checked on HSW and IVY with my final patches (sent a few minutes
before your reply) and:

HSW:

extension disabled: the desktop test passes but we receive the following
error several times:

User Error: GL_INVALID_OPERATION in glTextureView(internalformat X not
compatible with origtexture Y) in each subtest.

extension enabled: I see the same error but now both the desktop and
gles versions pass (which wasn't the case when I checked last week with
my previous patches)

I could probably enable it now on gen >= 75, if you and Nanley (CC-ed)
are OK with this decision. What do you think?

on Ivy:
---
extension disabled: the desktop version of the test
fails with the failures below (and the gles is skipped) 

extension enabled: both the desktop and the gles versions
fail and the failures are the same (see below)

PIGLIT: {"subtest": {"clear GL_RGBA16_SNORM as GL_RG32F" : "fail"}}
PIGLIT: {"subtest": {"clear GL_RGBA16_SNORM as GL_RG32UI" : "fail"}}
PIGLIT: {"subtest": {"clear GL_RGBA16_SNORM as GL_RG32I" : "fail"}}
PIGLIT: {"subtest": {"clear GL_RGBA16_SNORM as GL_RGBA16F" : "fail"}}
PIGLIT: {"subtest": {"clear GL_RGBA16_SNORM as GL_RGBA16UI" : "fail"}}
PIGLIT: {"subtest": {"clear GL_RGBA16_SNORM as GL_RGBA16" : "fail"}}
PIGLIT: {"subtest": {"clear GL_RGB16_SNORM as GL_RGB16F" : "fail"}}
PIGLIT: {"subtest": {"clear GL_RGB16_SNORM as GL_RGB16" : "fail"}}
PIGLIT: {"subtest": {"clear GL_RG16_SNORM as GL_R32F" : "fail"}}
PIGLIT: {"subtest": {"clear GL_RG16_SNORM as GL_R32UI" : "fail"}}
PIGLIT: {"subtest": {"clear GL_RG16_SNORM as GL_R32I" : "fail"}}
PIGLIT: {"subtest": {"clear GL_RG16_SNORM as GL_RG16F" : "fail"}}
PIGLIT: {"subtest": {"clear GL_RG16_SNORM as GL_RG16UI" : "fail"}}
PIGLIT: {"subtest": {"clear GL_RG16_SNORM as GL_RG16" : "fail"}}
PIGLIT: {"subtest": {"clear GL_RG16_SNORM as GL_RGBA8UI" : "fail"}}
PIGLIT: {"subtest": {"clear GL_RG16_SNORM as GL_RGBA8I" : "fail"}}
PIGLIT: {"subtest": {"clear GL_RG16_SNORM as GL_RGBA8" : "fail"}}
PIGLIT: {"subtest": {"clear GL_RG16_SNORM as GL_RGBA8_SNORM" : "fail"}}
PIGLIT: {"subtest": {"clear GL_RG16_SNORM as GL_RGB10_A2UI" : "fail"}}
PIGLIT: {"s

[Mesa-dev] [PATCH v3 0/5] improved the support for ETC2 formats on Gen 7

2019-02-07 Thread Eleni Maria Stea
Intel Gen7 GPUs don't support the ETC2 formats natively and in order to
show the pixels properly we convert them to RGBA and create RGBA miptrees.
The problem with that is that the GetCompressed* functions that should
return the compressed pixel values return the RGBA instead.

These patches attempt to solve this problem by using two miptrees: the
main one to store the ETC values and the generic shadow (mt->shadow_mt)
to store the RGBA values. Each time the main miptree is unmapped, we
unpack the ETC data to RGBA and update the shadow. Similarly, we update
all the mipmap levels of the image (if necessary) before drawing, so that
CopyImageSubData works.
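
A minimal sketch of the two-store idea described above, under loud
assumptions: the toy_* names are hypothetical, and toy_decode is a
placeholder transformation, not an ETC2 decoder. It only shows the
bookkeeping: uploads go to the main store and mark the shadow stale, the
shadow is refreshed before drawing, reads of the compressed data return the
main store untouched, and sampling uses the shadow.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define TOY_BLOCKS 4

/* Hypothetical texture with a compressed main store and a decoded shadow. */
struct toy_texture {
   uint8_t main_blocks[TOY_BLOCKS];     /* stands in for the ETC data */
   uint32_t shadow_texels[TOY_BLOCKS];  /* stands in for the RGBA data */
   bool shadow_needs_update;
};

/* Placeholder "decoder": NOT ETC2, just a recognizable mapping. */
static uint32_t
toy_decode(uint8_t block)
{
   return 0xff000000u | block;
}

/* Writing compressed data only touches the main store and marks the
 * shadow as stale. */
static void
toy_upload(struct toy_texture *tex, const uint8_t *blocks)
{
   memcpy(tex->main_blocks, blocks, TOY_BLOCKS);
   tex->shadow_needs_update = true;
}

/* Called before drawing: refresh the shadow if it is stale. */
static void
toy_predraw(struct toy_texture *tex)
{
   if (!tex->shadow_needs_update)
      return;

   for (int i = 0; i < TOY_BLOCKS; i++)
      tex->shadow_texels[i] = toy_decode(tex->main_blocks[i]);

   tex->shadow_needs_update = false;
}

/* Reads of the compressed data see the main store unchanged. */
static const uint8_t *
toy_get_compressed(const struct toy_texture *tex)
{
   return tex->main_blocks;
}

/* Sampling reads the shadow, which must be up to date by now. */
static uint32_t
toy_sample(const struct toy_texture *tex, int i)
{
   assert(!tex->shadow_needs_update);
   return tex->shadow_texels[i];
}
```

The design choice modelled here is that the compressed data is never
overwritten by its decoded form, so both consumers can be served from the
same texture object.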

Also, the OES_copy_image extension, which couldn't work on Gen 7 due to the
lack of ETC support, is now re-enabled.

Finally, the following glcts and piglit tests pass:

On HSW (previously failing):

KHR-GL46.direct_state_access.textures_compressed_subimage

On HSW and IVB (previously disabled):
-
dEQP-GLES31.functional.texture.border_clamp.formats.compressed_srgb8_alpha8_etc2_eac.*
   (6 tests)
dEQP-GLES31.functional.texture.border_clamp.formats.compressed_srgb8_etc2.*
   (6 tests)
dEQP-GLES31.functional.texture.border_clamp.formats.compressed_srgb8_punchthrough_alpha1_etc2.*
   (6 tests)

On HSW, IVB, SNB (previously disabled):
---
dEQP-GLES3.functional.texture.format.compressed.*
   (12 tests)
dEQP-GLES3.functional.texture.wrap.etc2_eac_srgb8_alpha8.*
   (36 tests)
dEQP-GLES3.functional.texture.wrap.etc2_srgb8.*
   (36 tests)
dEQP-GLES3.functional.texture.wrap.etc2_srgb8_punchthrough_alpha1.*
   (36 tests)

piglit.spec.!opengl es 3_0.oes_compressed_etc2_texture-miptree_gles3 
   (srgb8, srgb8-alpha, srgb8-punchthrough-alpha1)
piglit.spec.arb_es3_compatibility.oes_compressed_etc2_texture-miptree
   (srgb8 compat, srgb8 core, srgb8-alpha8 compat, srgb8-alpha8 core,
srgb8-punchthrough-alpha1 compat, srgb8-punchthrough-alpha1 core)
(9 tests)

Total tests passing: 148
---

Eleni Maria Stea (4):
  i965: Faking the ETC2 compression on Gen < 8 GPUs using two miptrees.
  i965: Fixed the CopyImageSubData for ETC2 on Gen < 8
  i965: Enabled the OES_copy_image extension on Gen 7 GPUs
  i965: Removed unused intel_miptree_map_etc/intel_miptree_unmap_etc

Nanley Chery (1):
  i965: Rename intel_mipmap_tree::r8stencil_* -> ::shadow_*

 src/mesa/drivers/dri/i965/brw_draw.c  |   5 +
 .../drivers/dri/i965/brw_wm_surface_state.c   |  13 +-
 src/mesa/drivers/dri/i965/intel_extensions.c  |  16 +-
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 201 +++---
 src/mesa/drivers/dri/i965/intel_mipmap_tree.h |  38 +++-
 5 files changed, 183 insertions(+), 90 deletions(-)

-- 
2.20.1



[Mesa-dev] [PATCH v3 3/5] i965: Fixed the CopyImageSubData for ETC2 on Gen < 8

2019-02-07 Thread Eleni Maria Stea
For CopyImageSubData to copy the data during the 1st draw call, we need
to update the shadow tree right before the rendering.

v2:
  - Added assertion that the miptree doesn't need update at the time we
  update the texture surface. (Nanley Chery)
---
 src/mesa/drivers/dri/i965/brw_draw.c | 5 +
 src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 2 +-
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_draw.c 
b/src/mesa/drivers/dri/i965/brw_draw.c
index ec4fe0b096f..d00e0a726b1 100644
--- a/src/mesa/drivers/dri/i965/brw_draw.c
+++ b/src/mesa/drivers/dri/i965/brw_draw.c
@@ -559,6 +559,11 @@ brw_predraw_resolve_inputs(struct brw_context *brw, bool 
rendering,
   tex_obj->mt->format == MESA_FORMAT_S_UINT8) {
  intel_update_r8stencil(brw, tex_obj->mt);
   }
+
+  if (intel_miptree_has_etc_shadow(brw, tex_obj->mt) &&
+  tex_obj->mt->shadow_needs_update) {
+ intel_miptree_update_etc_shadow_levels(brw, tex_obj->mt);
+  }
}
 
/* Resolve color for each active shader image. */
diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c 
b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
index c2cf34aee71..437c7c82555 100644
--- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
@@ -582,7 +582,7 @@ static void brw_update_texture_surface(struct gl_context 
*ctx,
  mt = mt->shadow_mt;
  format = ISL_FORMAT_R8_UINT;
   } else if (intel_miptree_needs_fake_etc(brw, mt)) {
- assert(mt->shadow_mt);
+ assert(mt->shadow_mt && !mt->shadow_needs_update);
  mt = mt->shadow_mt;
   }
 
-- 
2.20.1



[Mesa-dev] [PATCH v3 5/5] i965: Removed unused intel_miptree_map_etc/intel_miptree_unmap_etc

2019-02-07 Thread Eleni Maria Stea
The functions intel_miptree_(map|unmap)_etc are no longer reached, as we
now use the shadow_mt of each compressed ETC miptree for the emulation.
This patch removes them.

v2:
  - In the previous patch series, we only removed the assertions that
  the tree was mapped for writing. We can now safely remove the whole
  functions as they won't be reached anymore. (Nanley Chery)
---
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 59 ---
 1 file changed, 59 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index c7367fc385f..a40f606f351 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -3476,61 +3476,6 @@ intel_miptree_map_s8(struct brw_context *brw,
map->unmap = intel_miptree_unmap_s8;
 }
 
-static void
-intel_miptree_unmap_etc(struct brw_context *brw,
-struct intel_mipmap_tree *mt,
-struct intel_miptree_map *map,
-unsigned int level,
-unsigned int slice)
-{
-   uint32_t image_x;
-   uint32_t image_y;
-   intel_miptree_get_image_offset(mt, level, slice, &image_x, &image_y);
-
-   image_x += map->x;
-   image_y += map->y;
-
-   uint8_t *dst = intel_miptree_map_raw(brw, mt, GL_MAP_WRITE_BIT)
-+ image_y * mt->surf.row_pitch_B
-+ image_x * mt->cpp;
-
-   if (mt->etc_format == MESA_FORMAT_ETC1_RGB8)
-  _mesa_etc1_unpack_rgba(dst, mt->surf.row_pitch_B,
- map->ptr, map->stride,
- map->w, map->h);
-   else
-  _mesa_unpack_etc2_format(dst, mt->surf.row_pitch_B,
-   map->ptr, map->stride,
-  map->w, map->h, mt->etc_format, true);
-
-   intel_miptree_unmap_raw(mt);
-   free(map->buffer);
-}
-
-static void
-intel_miptree_map_etc(struct brw_context *brw,
-  struct intel_mipmap_tree *mt,
-  struct intel_miptree_map *map,
-  unsigned int level,
-  unsigned int slice)
-{
-   assert(mt->etc_format != MESA_FORMAT_NONE);
-   if (mt->etc_format == MESA_FORMAT_ETC1_RGB8) {
-  assert(mt->format == MESA_FORMAT_R8G8B8X8_UNORM);
-   }
-
-   assert(map->mode & GL_MAP_WRITE_BIT);
-   assert(map->mode & GL_MAP_INVALIDATE_RANGE_BIT);
-
-   intel_miptree_access_raw(brw, mt, level, slice, true);
-
-   map->stride = _mesa_format_row_stride(mt->etc_format, map->w);
-   map->buffer = malloc(_mesa_format_image_size(mt->etc_format,
-map->w, map->h, 1));
-   map->ptr = map->buffer;
-   map->unmap = intel_miptree_unmap_etc;
-}
-
 /**
  * Mapping functions for packed depth/stencil miptrees backed by real separate
  * miptrees for depth and stencil.
@@ -3803,10 +3748,6 @@ intel_miptree_map(struct brw_context *brw,
 
if (mt->format == MESA_FORMAT_S_UINT8) {
   intel_miptree_map_s8(brw, mt, map, level, slice);
-   } else if (mt->etc_format != MESA_FORMAT_NONE &&
-  !(mode & BRW_MAP_DIRECT_BIT) &&
-  !(intel_miptree_needs_fake_etc(brw, mt))) {
-  intel_miptree_map_etc(brw, mt, map, level, slice);
} else if (mt->stencil_mt && !(mode & BRW_MAP_DIRECT_BIT)) {
   intel_miptree_map_depthstencil(brw, mt, map, level, slice);
} else if (use_intel_mipree_map_blit(brw, mt, map)) {
-- 
2.20.1



[Mesa-dev] [PATCH v3 4/5] i965: Enabled the OES_copy_image extension on Gen 7 GPUs

2019-02-07 Thread Eleni Maria Stea
The OES_copy_image extension was disabled on Gen7 due to the lack of support
for ETC2 images. Re-enable it. (Kenneth Graunke)

v2:
  - Removed the blank lines in the comments above OES_copy_image and
  OES_texture_view extensions in intel_extensions.c (Nanley Chery)
---
 src/mesa/drivers/dri/i965/intel_extensions.c | 16 
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_extensions.c 
b/src/mesa/drivers/dri/i965/intel_extensions.c
index 3a95be58a63..d2a6aa185c2 100644
--- a/src/mesa/drivers/dri/i965/intel_extensions.c
+++ b/src/mesa/drivers/dri/i965/intel_extensions.c
@@ -287,14 +287,22 @@ intelInitExtensions(struct gl_context *ctx)
}
 
if (devinfo->gen >= 8 || devinfo->is_baytrail) {
-  /* For now, we only enable OES_copy_image on platforms that support
-   * ETC2 natively in hardware.  We would need more hacks to support it
-   * elsewhere. Same with OES_texture_view.
+  /* For now, we can't enable OES_texture_view on Gen 7 because of
+   * some piglit failures coming from
+   * piglit/tests/spec/arb_texture_view/rendering-formats.c that need
+   * investigation.
*/
-  ctx->Extensions.OES_copy_image = true;
   ctx->Extensions.OES_texture_view = true;
}
 
+   if (devinfo->gen >= 7) {
+  /* We can safely enable OES_copy_image on Gen 7, since we emulate
+   * the ETC2 support using the shadow_miptree to store the
+   * compressed data.
+   */
+  ctx->Extensions.OES_copy_image = true;
+   }
+
if (devinfo->gen >= 8) {
   ctx->Extensions.ARB_gpu_shader_int64 = true;
   /* requires ARB_gpu_shader_int64 */
-- 
2.20.1



[Mesa-dev] [PATCH v3 2/5] i965: Faking the ETC2 compression on Gen < 8 GPUs using two miptrees.

2019-02-07 Thread Eleni Maria Stea
GPUs Gen < 8 cannot sample ETC2 formats. So far, they converted the
compressed EAC/ETC2 images to non-compressed RGBA images. When
GetCompressed* functions were called, the pixels were returned in this
RGBA format and not the compressed format that was expected.

To fix this problem, we use a secondary shadow miptree to store the
decompressed data for rendering, and the main miptree to store the
compressed data so that the Get functions work. Each time the main miptree
is written with compressed data, we decompress the data to RGB and update
the shadow. We then use the shadow for rendering.

v2:
   - Fixes in the commit message (Nanley Chery)
   - Reversed the changes in brw_get_texture_swizzle and swapped the b, g
   values at the time that we decompress the data in the function:
   intel_miptree_update_etc_shadow of intel_mipmap_tree.c (Nanley Chery)
   - Simplified the format checks in the miptree_create function of the
   intel_mipmap_tree.c and reserved the call of the
   intel_lower_compressed_format for the case that we are faking the ETC
   support (Nanley Chery)
   - Removed the check for the auxiliary usage for the shadow miptree at
   creation (miptree_create of intel_mipmap_tree.c) as we won't use
   auxiliary buffers with these types of trees (Nanley Chery)
   - Set the etc_format of the non-ETC miptrees to MESA_FORMAT_NONE and
   removed the unnecessary checks (Nanley Chery)
   - Fixed an unrelated indentation change (Nanley Chery)
   - Modified the function intel_miptree_finish_write to set the
   mt->shadow_needs_update to true to catch all the cases when we need to
   update the miptree (Nanley Chery)
   - In order to update the shadow miptree during the unmap of the
   main and always map the main (Nanley Chery), the previous update
   function, which updated all the mipmap levels, was split into two
   functions: one that updates a single level and one that updates all
   of them. The first is used during unmap and the second before
   rendering.
   - Removed the BRW_MAP_ETC_BIT flag and the mechanism to decide which
   miptree should be mapped each time and reversed all the changes in the
   higher level texture functions that upload data to textures as they
   aren't needed anymore.
   - Replaced the boolean needs_fake_etc with an inline function that
   checks when we need to fake the ETC compression (Nanley Chery)
   - Removed the initialization of the strides in the update function as
   the values will be overwritten by the intel_miptree_map call (Nanley
   Chery)
   - Used minify instead of division in the new update function
   intel_miptree_update_etc_shadow_levels in intel_mipmap_tree.c (Nanley
   Chery)
   - Removed the depth from the calculation of the number of slices in
   the new update function (intel_miptree_update_etc_shadow_levels of
   intel_mipmap_tree.c) as we don't need to support 3D ETC images.
   (Nanley Chery)

v3:
  - Renamed the rgba_fmt in function miptree_create
  (intel_mipmap_tree.c) to decomp_format as the format is not always in
  rgba order. (Nanley Chery)
  - Documented the new usage for the shadow miptree in the comment above
  the field in the intel_miptree struct in intel_mipmap_tree.h (Nanley
  Chery)
  - Removed the redundant flags from the mapping of the miptrees in
  intel_miptree_update_etc_shadow of intel_mipmap_tree.c (Nanley Chery)
  - Fixed the switch from surface's logical level to physical level in
  the intel_miptree_update_etc_shadow_levels of intel_mipmap_tree.c
  (Nanley Chery)
  - Excluded the Baytrail GPUs from the check for the ETC emulation as
  they support the ETC formats natively. (Nanley Chery)
  - Simplified the check if the format is BGRA in
  intel_miptree_update_etc_shadow of intel_mipmap_tree.c (Nanley Chery)
---
 .../drivers/dri/i965/brw_wm_surface_state.c   |   5 +-
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 130 --
 src/mesa/drivers/dri/i965/intel_mipmap_tree.h |  24 
 3 files changed, 149 insertions(+), 10 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c 
b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
index 618e2ab35bc..c2cf34aee71 100644
--- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
@@ -521,7 +521,7 @@ static void brw_update_texture_surface(struct gl_context 
*ctx,
   */
  mesa_fmt = mt->format;
   } else if (mt->etc_format != MESA_FORMAT_NONE) {
- mesa_fmt = mt->format;
+ mesa_fmt = mt->shadow_mt->format;
   } else if (plane > 0) {
  mesa_fmt = mt->format;
   } else {
@@ -581,6 +581,9 @@ static void brw_update_texture_surface(struct gl_context 
*ctx,
  assert(mt->shadow_mt && !mt->shadow_needs_update);
  mt = mt->shadow_mt;
  format = ISL_FORMAT_R8_UINT;
+  } else if (intel_miptree_needs_fake_etc(brw, mt)) {
+ assert(mt->shadow_mt);
+ mt = mt->shadow_mt;
   

[Mesa-dev] [PATCH v3 1/5] i965: Rename intel_mipmap_tree::r8stencil_* -> ::shadow_*

2019-02-07 Thread Eleni Maria Stea
From: Nanley Chery 

Use more generic field names. We'll reuse these fields for a workaround
with ASTC miptrees.

Reviewed-by: Eleni Maria Stea 
---
 src/mesa/drivers/dri/i965/brw_wm_surface_state.c |  8 
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c| 16 
 src/mesa/drivers/dri/i965/intel_mipmap_tree.h| 14 +++---
 3 files changed, 19 insertions(+), 19 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c 
b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
index b067a174056..618e2ab35bc 100644
--- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
@@ -571,15 +571,15 @@ static void brw_update_texture_surface(struct gl_context 
*ctx,
 
   if (obj->StencilSampling && firstImage->_BaseFormat == GL_DEPTH_STENCIL) 
{
  if (devinfo->gen <= 7) {
-assert(mt->r8stencil_mt && 
!mt->stencil_mt->r8stencil_needs_update);
-mt = mt->r8stencil_mt;
+assert(mt->shadow_mt && !mt->stencil_mt->shadow_needs_update);
+mt = mt->shadow_mt;
  } else {
 mt = mt->stencil_mt;
  }
  format = ISL_FORMAT_R8_UINT;
   } else if (devinfo->gen <= 7 && mt->format == MESA_FORMAT_S_UINT8) {
- assert(mt->r8stencil_mt && !mt->r8stencil_needs_update);
- mt = mt->r8stencil_mt;
+ assert(mt->shadow_mt && !mt->shadow_needs_update);
+ mt = mt->shadow_mt;
  format = ISL_FORMAT_R8_UINT;
   }
 
diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index b4e3524aa51..479188fd1c8 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -1214,7 +1214,7 @@ intel_miptree_release(struct intel_mipmap_tree **mt)
 
   brw_bo_unreference((*mt)->bo);
   intel_miptree_release(&(*mt)->stencil_mt);
-  intel_miptree_release(&(*mt)->r8stencil_mt);
+  intel_miptree_release(&(*mt)->shadow_mt);
   intel_miptree_aux_buffer_free((*mt)->aux_buf);
   free_aux_state_map((*mt)->aux_state);
 
@@ -2427,7 +2427,7 @@ intel_miptree_finish_write(struct brw_context *brw,
switch (mt->aux_usage) {
case ISL_AUX_USAGE_NONE:
   if (mt->format == MESA_FORMAT_S_UINT8 && devinfo->gen <= 7)
- mt->r8stencil_needs_update = true;
+ mt->shadow_needs_update = true;
   break;
 
case ISL_AUX_USAGE_MCS:
@@ -2933,9 +2933,9 @@ intel_update_r8stencil(struct brw_context *brw,
 
assert(src->surf.size_B > 0);
 
-   if (!mt->r8stencil_mt) {
+   if (!mt->shadow_mt) {
   assert(devinfo->gen > 6); /* Handle MIPTREE_LAYOUT_GEN6_HIZ_STENCIL */
-  mt->r8stencil_mt = make_surface(
+  mt->shadow_mt = make_surface(
 brw,
 src->target,
 MESA_FORMAT_R_UINT8,
@@ -2949,13 +2949,13 @@ intel_update_r8stencil(struct brw_context *brw,
 ISL_TILING_Y0_BIT,
 ISL_SURF_USAGE_TEXTURE_BIT,
 BO_ALLOC_BUSY, 0, NULL);
-  assert(mt->r8stencil_mt);
+  assert(mt->shadow_mt);
}
 
-   if (src->r8stencil_needs_update == false)
+   if (src->shadow_needs_update == false)
   return;
 
-   struct intel_mipmap_tree *dst = mt->r8stencil_mt;
+   struct intel_mipmap_tree *dst = mt->shadow_mt;
 
for (int level = src->first_level; level <= src->last_level; level++) {
   const unsigned depth = src->surf.dim == ISL_SURF_DIM_3D ?
@@ -2975,7 +2975,7 @@ intel_update_r8stencil(struct brw_context *brw,
}
 
brw_cache_flush_for_read(brw, dst->bo);
-   src->r8stencil_needs_update = false;
+   src->shadow_needs_update = false;
 }
 
 static void *
diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h 
b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
index 17668944adc..1a7507023a1 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
@@ -294,16 +294,16 @@ struct intel_mipmap_tree
struct intel_mipmap_tree *stencil_mt;
 
/**
-* \brief Stencil texturing miptree for sampling from a stencil texture
+* \brief Shadow miptree for sampling when the main isn't supported by HW.
 *
-* Some hardware doesn't support sampling from the stencil texture as
-* required by the GL_ARB_stencil_texturing extenion. To workaround this we
-* blit the texture into a new texture that can be sampled.
+* To workaround various sampler bugs and limitations, we blit the main
+* texture into a new texture that can be sampled.
 *
-* \see intel_update_r8stencil()
+* This miptree may be used for:
+* - Stencil texturin

Re: [Mesa-dev] [PATCH v2 5/5] i965: Enabled the OES_copy_image extension on Gen 7 GPUs

2019-02-06 Thread Eleni Maria Stea
On Wed, 6 Feb 2019 12:12:27 -0800
Nanley Chery  wrote:

> > +   * For now, we can't enable OES_texture_view on Gen 7
> > because of
> > +   * some piglit failures coming from
> > +   * piglit/tests/spec/arb_texture_view/rendering-formats.c
> > that need
> > +   * investigation.
> > */  
> 
> What kind of failures are you seeing? I'd imagine texture views to
> work with this version of your series.
> 

Hi Nanley,

If you run the piglit test: arb_texture_view-rendering-format, and grep
for failures on HSW:

PIGLIT: {"subtest": {"clear GL_RGB10_A2 as GL_R32F" : "fail"}}
PIGLIT: {"subtest": {"clear GL_RGB10_A2 as GL_RG16F" : "fail"}}
PIGLIT: {"subtest": {"clear GL_RGB10_A2 as GL_RG16I" : "fail"}}
PIGLIT: {"subtest": {"clear GL_RGB10_A2 as GL_RG16_SNORM" : "fail"}}
PIGLIT: {"subtest": {"clear GL_RGB10_A2 as GL_RGBA8I" : "fail"}}
PIGLIT: {"subtest": {"clear GL_RGB10_A2 as GL_RGBA8_SNORM" : "fail"}} 

I remember seeing similar errors on Ivy too. They are probably unrelated
to the ETC support, but since this test passes on BDW, where the
extension is enabled, I didn't enable it on Gen 7 for the moment. I think
I had discussed these failures with Kenneth before I disabled them, but
I didn't investigate them further after that.

Do you think I should re-enable it?

Thanks,
Eleni



Re: [Mesa-dev] [PATCH 4/8] i965: Update the shadow miptree from the main to fake the ETC2 compression

2019-02-03 Thread Eleni Maria Stea
On Fri, 18 Jan 2019 17:09:03 -0800
Nanley Chery  wrote:

> On Mon, Nov 19, 2018 at 10:54:08AM +0200, Eleni Maria Stea wrote:
[...]
> > +   int img_d = smt->surf.logical_level0_px.depth;  
> 
> I don't think 3D ETC textures are possible. From the GL4.6 spec:
> 
>   An INVALID_OPERATION error is generated by
> CompressedTexImage3D if internalformat is one of the EAC, ETC2, or
> RGTC formats and either border is non-zero, or target is not
> TEXTURE_2D_ARRAY.

Hi Nanley,

Thanks for pointing this out. I've made the change in my new series
of patches but after giving it a second thought, I believe that I'd
rather put back the depth in the calculation of num_slices:

As I understand the spec, 3D images should be supported if the border
is zero. Mesa already checks the border value in the function
compressed_texture_error_check (src/mesa/main/teximage.c), which has
the comment:

/* No compressed formats support borders at this time */

and so only ETC/EAC compressed formats without border will reach the
update function and we should support them.

Also, I see that we have some CTS tests that call
CompressedTexImage3D for ETC/EAC formats with a border value of 0, so I
suppose that 3D images of these formats are expected to work.

What do you think?

Thank you in advance,
Eleni


[Mesa-dev] [PATCH v2 3/5] i965: Faking the ETC2 compression on Gen < 8 GPUs using two miptrees.

2019-02-03 Thread Eleni Maria Stea
GPUs Gen < 8 cannot sample ETC2 formats. So far, they converted the
compressed EAC/ETC2 images to non-compressed RGBA images. When
GetCompressed* functions were called, the pixels were returned in this
RGBA format and not the compressed format that was expected.

Trying to fix this problem, we use a secondary shadow miptree to store the
decompressed data for the rendering and the main miptree to store the
compressed for the Get functions to work. Each time that the main miptree
is written with compressed data, we decompress them to RGB and update the
shadow. Then we use the shadow for rendering.

v2:
   - Fixes in the commit message (Nanley Chery)
   - Reversed the changes in brw_get_texture_swizzle and swapped the b, g
   values at the time that we decompress the data in the function:
   intel_miptree_update_etc_shadow of intel_mipmap_tree.c (Nanley Chery)
   - Simplified the format checks in the miptree_create function of the
   intel_mipmap_tree.c and reserved the call of the
   intel_lower_compressed_format for the case that we are faking the ETC
   support (Nanley Chery)
   - Removed the check for the auxiliary usage for the shadow miptree at
   creation (miptree_create of intel_mipmap_tree.c) as we won't use
   auxiliary buffers with these types of trees (Nanley Chery)
   - Set the etc_format of the non-ETC miptrees to MESA_FORMAT_NONE and
   removed the unnecessary checks (Nanley Chery)
   - Fixed an unrelated indentation change (Nanley Chery)
   - Modified the function intel_miptree_finish_write to set the
   mt->shadow_needs_update to true to catch all the cases when we need to
   update the miptree (Nanley Chery)
   - In order to update the shadow miptree during the unmap of the
   main and always map the main (Nanley Chery), the following change was
   necessary: split the previous update function, which updated all
   the mipmap levels, into two functions: one that updates one
   level and one that updates all of them. Used the first during unmap
   and the second before the rendering.
   - Removed the BRW_MAP_ETC_BIT flag and the mechanism to decide which
   miptree should be mapped each time and reversed all the changes in the
   higher level texture functions that upload data to textures as they
   aren't needed anymore.
   - Replaced the boolean needs_fake_etc with an inline function that
   checks when we need to fake the ETC compression (Nanley Chery)
   - Removed the initialization of the strides in the update function as
   the values will be overwritten by the intel_miptree_map call (Nanley
   Chery)
   - Used minify instead of division in the new update function
   intel_miptree_update_etc_shadow_levels in intel_mipmap_tree.c (Nanley
   Chery)
   - Removed the depth from the calculation of the number of slices in
   the new update function (intel_miptree_update_etc_shadow_levels of
   intel_mipmap_tree.c) as we don't need to support 3D ETC images.
   (Nanley Chery)
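
The "minify instead of division" item above can be illustrated with a
small sketch. This is not the code from the patch: sketch_minify is a
hypothetical stand-in written under the assumption that the helper halves
a dimension once per mipmap level and never returns less than 1, which is
what makes it safer than a plain division for small levels.

```c
#include <assert.h>

/*
 * Hypothetical stand-in for a "minify" helper: size of one dimension
 * at mipmap level `levels`, never smaller than 1 texel.
 */
static unsigned
sketch_minify(unsigned value, unsigned levels)
{
   unsigned shifted;

   /* Shifting a 32-bit value by 32 or more is undefined in C. */
   assert(levels < 32);

   shifted = value >> levels;
   return shifted > 0 ? shifted : 1;
}
```

For a 5-texel-wide level 0, sketch_minify(5, 1) gives 2 and
sketch_minify(5, 3) gives 1, whereas an unclamped shift or division would
give 0 for the last level.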
---
 .../drivers/dri/i965/brw_wm_surface_state.c   |   5 +-
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 133 --
 src/mesa/drivers/dri/i965/intel_mipmap_tree.h |  22 +++
 3 files changed, 150 insertions(+), 10 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c 
b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
index 618e2ab35bc..c2cf34aee71 100644
--- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
@@ -521,7 +521,7 @@ static void brw_update_texture_surface(struct gl_context 
*ctx,
   */
  mesa_fmt = mt->format;
   } else if (mt->etc_format != MESA_FORMAT_NONE) {
- mesa_fmt = mt->format;
+ mesa_fmt = mt->shadow_mt->format;
   } else if (plane > 0) {
  mesa_fmt = mt->format;
   } else {
@@ -581,6 +581,9 @@ static void brw_update_texture_surface(struct gl_context 
*ctx,
  assert(mt->shadow_mt && !mt->shadow_needs_update);
  mt = mt->shadow_mt;
  format = ISL_FORMAT_R8_UINT;
+  } else if (intel_miptree_needs_fake_etc(brw, mt)) {
+ assert(mt->shadow_mt);
+ mt = mt->shadow_mt;
   }
 
   const int surf_index = surf_offset - &brw->wm.base.surf_offset[0];
diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index 0a25dfd0161..3ff36b84a5a 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -57,6 +57,11 @@ static void *intel_miptree_map_raw(struct brw_context *brw,
GLbitfield mode);
 
 static void intel_miptree_unmap_raw(struct intel_mipmap_tree *mt);
+static void intel_miptree_update_etc_shadow(struct brw_context *brw,
+struct intel_mipmap_tree *mt,
+unsigned int level,
+unsigned int slice,
+int 

[Mesa-dev] [PATCH v2 2/5] i965: Removed assertions from intel_miptree_map_etc

2019-02-03 Thread Eleni Maria Stea
The assertions on GL_MAP_WRITE_BIT and GL_MAP_INVALIDATE_RANGE_BIT
in intel_miptree_map_etc fail when the ETC miptree is mapped for
reading. As the following patches fix the GetCompressed* functions and
allow reading from ETC miptrees, we have to remove them.

Fixes the crash of the test
KHR-GL45.direct_state_access.textures_compressed_subimage on Gen 7 GPUs.
---
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index 479188fd1c8..0a25dfd0161 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -3497,9 +3497,6 @@ intel_miptree_map_etc(struct brw_context *brw,
   assert(mt->format == MESA_FORMAT_R8G8B8X8_UNORM);
}
 
-   assert(map->mode & GL_MAP_WRITE_BIT);
-   assert(map->mode & GL_MAP_INVALIDATE_RANGE_BIT);
-
intel_miptree_access_raw(brw, mt, level, slice, true);
 
map->stride = _mesa_format_row_stride(mt->etc_format, map->w);
-- 
2.20.1



[Mesa-dev] [PATCH v2 5/5] i965: Enabled the OES_copy_image extension on Gen 7 GPUs

2019-02-03 Thread Eleni Maria Stea
The OES_copy_image extension was disabled on Gen7 due to the lack of
support for ETC2 images. Re-enabled it. (Kenneth Graunke)
---
 src/mesa/drivers/dri/i965/intel_extensions.c | 18 ++
 1 file changed, 14 insertions(+), 4 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_extensions.c 
b/src/mesa/drivers/dri/i965/intel_extensions.c
index 3a95be58a63..d2e232f3ff1 100644
--- a/src/mesa/drivers/dri/i965/intel_extensions.c
+++ b/src/mesa/drivers/dri/i965/intel_extensions.c
@@ -287,14 +287,24 @@ intelInitExtensions(struct gl_context *ctx)
}
 
if (devinfo->gen >= 8 || devinfo->is_baytrail) {
-  /* For now, we only enable OES_copy_image on platforms that support
-   * ETC2 natively in hardware.  We would need more hacks to support it
-   * elsewhere. Same with OES_texture_view.
+  /*
+   * For now, we can't enable OES_texture_view on Gen 7 because of
+   * some piglit failures coming from
+   * piglit/tests/spec/arb_texture_view/rendering-formats.c that need
+   * investigation.
*/
-  ctx->Extensions.OES_copy_image = true;
   ctx->Extensions.OES_texture_view = true;
}
 
+   if (devinfo->gen >= 7) {
+  /*
+   * We can safely enable OES_copy_image on Gen 7, since we emulate
+   * the ETC2 support using the shadow_miptree to store the
+   * compressed data.
+   */
+  ctx->Extensions.OES_copy_image = true;
+   }
+
if (devinfo->gen >= 8) {
   ctx->Extensions.ARB_gpu_shader_int64 = true;
   /* requires ARB_gpu_shader_int64 */
-- 
2.20.1



[Mesa-dev] [PATCH v2 4/5] i965: Fixed the CopyImageSubData for ETC2 on Gen < 8

2019-02-03 Thread Eleni Maria Stea
For CopyImageSubData to copy the data during the 1st draw call, we need
to update the shadow tree right before the rendering.
---
 src/mesa/drivers/dri/i965/brw_draw.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_draw.c 
b/src/mesa/drivers/dri/i965/brw_draw.c
index ec4fe0b096f..d00e0a726b1 100644
--- a/src/mesa/drivers/dri/i965/brw_draw.c
+++ b/src/mesa/drivers/dri/i965/brw_draw.c
@@ -559,6 +559,11 @@ brw_predraw_resolve_inputs(struct brw_context *brw, bool 
rendering,
   tex_obj->mt->format == MESA_FORMAT_S_UINT8) {
  intel_update_r8stencil(brw, tex_obj->mt);
   }
+
+  if (intel_miptree_has_etc_shadow(brw, tex_obj->mt) &&
+  tex_obj->mt->shadow_needs_update) {
+ intel_miptree_update_etc_shadow_levels(brw, tex_obj->mt);
+  }
}
 
/* Resolve color for each active shader image. */
-- 
2.20.1



[Mesa-dev] [PATCH v2 1/5] i965: Rename intel_mipmap_tree::r8stencil_* -> ::shadow_*

2019-02-03 Thread Eleni Maria Stea
From: Nanley Chery 

Use more generic field names. We'll reuse these fields for a workaround
with ASTC miptrees.

Reviewed-by: Eleni Maria Stea 
---
 src/mesa/drivers/dri/i965/brw_wm_surface_state.c |  8 
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c| 16 
 src/mesa/drivers/dri/i965/intel_mipmap_tree.h| 14 +++---
 3 files changed, 19 insertions(+), 19 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c 
b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
index b067a174056..618e2ab35bc 100644
--- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
@@ -571,15 +571,15 @@ static void brw_update_texture_surface(struct gl_context 
*ctx,
 
   if (obj->StencilSampling && firstImage->_BaseFormat == GL_DEPTH_STENCIL) 
{
  if (devinfo->gen <= 7) {
-assert(mt->r8stencil_mt && 
!mt->stencil_mt->r8stencil_needs_update);
-mt = mt->r8stencil_mt;
+assert(mt->shadow_mt && !mt->stencil_mt->shadow_needs_update);
+mt = mt->shadow_mt;
  } else {
 mt = mt->stencil_mt;
  }
  format = ISL_FORMAT_R8_UINT;
   } else if (devinfo->gen <= 7 && mt->format == MESA_FORMAT_S_UINT8) {
- assert(mt->r8stencil_mt && !mt->r8stencil_needs_update);
- mt = mt->r8stencil_mt;
+ assert(mt->shadow_mt && !mt->shadow_needs_update);
+ mt = mt->shadow_mt;
  format = ISL_FORMAT_R8_UINT;
   }
 
diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index b4e3524aa51..479188fd1c8 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -1214,7 +1214,7 @@ intel_miptree_release(struct intel_mipmap_tree **mt)
 
   brw_bo_unreference((*mt)->bo);
   intel_miptree_release(&(*mt)->stencil_mt);
-  intel_miptree_release(&(*mt)->r8stencil_mt);
+  intel_miptree_release(&(*mt)->shadow_mt);
   intel_miptree_aux_buffer_free((*mt)->aux_buf);
   free_aux_state_map((*mt)->aux_state);
 
@@ -2427,7 +2427,7 @@ intel_miptree_finish_write(struct brw_context *brw,
switch (mt->aux_usage) {
case ISL_AUX_USAGE_NONE:
   if (mt->format == MESA_FORMAT_S_UINT8 && devinfo->gen <= 7)
- mt->r8stencil_needs_update = true;
+ mt->shadow_needs_update = true;
   break;
 
case ISL_AUX_USAGE_MCS:
@@ -2933,9 +2933,9 @@ intel_update_r8stencil(struct brw_context *brw,
 
assert(src->surf.size_B > 0);
 
-   if (!mt->r8stencil_mt) {
+   if (!mt->shadow_mt) {
   assert(devinfo->gen > 6); /* Handle MIPTREE_LAYOUT_GEN6_HIZ_STENCIL */
-  mt->r8stencil_mt = make_surface(
+  mt->shadow_mt = make_surface(
 brw,
 src->target,
 MESA_FORMAT_R_UINT8,
@@ -2949,13 +2949,13 @@ intel_update_r8stencil(struct brw_context *brw,
 ISL_TILING_Y0_BIT,
 ISL_SURF_USAGE_TEXTURE_BIT,
 BO_ALLOC_BUSY, 0, NULL);
-  assert(mt->r8stencil_mt);
+  assert(mt->shadow_mt);
}
 
-   if (src->r8stencil_needs_update == false)
+   if (src->shadow_needs_update == false)
   return;
 
-   struct intel_mipmap_tree *dst = mt->r8stencil_mt;
+   struct intel_mipmap_tree *dst = mt->shadow_mt;
 
for (int level = src->first_level; level <= src->last_level; level++) {
   const unsigned depth = src->surf.dim == ISL_SURF_DIM_3D ?
@@ -2975,7 +2975,7 @@ intel_update_r8stencil(struct brw_context *brw,
}
 
brw_cache_flush_for_read(brw, dst->bo);
-   src->r8stencil_needs_update = false;
+   src->shadow_needs_update = false;
 }
 
 static void *
diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h 
b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
index 17668944adc..1a7507023a1 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
@@ -294,16 +294,16 @@ struct intel_mipmap_tree
struct intel_mipmap_tree *stencil_mt;
 
/**
-* \brief Stencil texturing miptree for sampling from a stencil texture
+* \brief Shadow miptree for sampling when the main isn't supported by HW.
 *
-* Some hardware doesn't support sampling from the stencil texture as
-* required by the GL_ARB_stencil_texturing extenion. To workaround this we
-* blit the texture into a new texture that can be sampled.
+* To workaround various sampler bugs and limitations, we blit the main
+* texture into a new texture that can be sampled.
 *
-* \see intel_update_r8stencil()
+* This miptree may be used for:
+* - Stencil texturin

[Mesa-dev] [PATCH v2 0/5] improved the support for ETC2 formats on Gen 7

2019-02-03 Thread Eleni Maria Stea
Intel Gen7 GPUs don't support the ETC2 formats natively and in order to
show the pixels properly we convert them to RGBA and create RGBA miptrees.
The problem with that is that the GetCompressed* functions that should
return the compressed pixel values return the RGBA instead.

These patches attempt to solve this problem by using two miptrees: the
main one stores the ETC values and the generic shadow (mt->shadow_mt)
stores the RGBA. Each time the main miptree is unmapped, we unpack the
ETC to RGBA and update the shadow. Similarly, we update all the mipmap
levels of the image (if necessary) before the drawing, for the
CopyImageSubData to work.

Also, the OES_copy_image extension that couldn't work on Gen 7 due to the
lack of the ETC support is now enabled back.
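
The two-miptree scheme described above can be sketched in isolation as
follows. Everything here is hypothetical: the sketch_ names, the tiny
fixed-size buffers and the placeholder "decoder" (which is not ETC2) only
illustrate the dirty-flag pattern, assuming that a write to the main tree
marks the shadow as stale and that the shadow is refreshed lazily before
drawing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_TEXELS 4

/* Hypothetical miniature of a miptree with a shadow copy. */
struct sketch_miptree {
   unsigned char main_data[SKETCH_TEXELS];   /* stands in for ETC2 blocks */
   unsigned char shadow_data[SKETCH_TEXELS]; /* stands in for RGBA texels */
   bool shadow_needs_update;
   unsigned update_count;                    /* only here for illustration */
};

/* Placeholder transform, NOT an ETC2 decoder. */
static unsigned char
sketch_decode(unsigned char c)
{
   return (unsigned char)(c ^ 0xff);
}

/* Writing to the main tree only marks the shadow as stale. */
static void
sketch_finish_write(struct sketch_miptree *mt, const unsigned char *src)
{
   assert(mt != NULL && src != NULL);
   for (size_t i = 0; i < SKETCH_TEXELS; i++)
      mt->main_data[i] = src[i];
   mt->shadow_needs_update = true;
}

/* Before drawing, refresh the shadow only if it is stale. */
static void
sketch_predraw(struct sketch_miptree *mt)
{
   assert(mt != NULL);
   if (!mt->shadow_needs_update)
      return;
   for (size_t i = 0; i < SKETCH_TEXELS; i++)
      mt->shadow_data[i] = sketch_decode(mt->main_data[i]);
   mt->shadow_needs_update = false;
   mt->update_count++;
}
```

The main data stays untouched, so a Get-style read can return the
compressed bytes, while repeated sketch_predraw calls after a single
write refresh the shadow only once.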

Eleni Maria Stea (4):
  i965: Removed assertions from intel_miptree_map_etc
  i965: Faking the ETC2 compression on Gen < 8 GPUs using two miptrees.
  i965: Fixed the CopyImageSubData for ETC2 on Gen < 8
  i965: Enabled the OES_copy_image extension on Gen 7 GPUs

Nanley Chery (1):
  i965: Rename intel_mipmap_tree::r8stencil_* -> ::shadow_*

 src/mesa/drivers/dri/i965/brw_draw.c  |   5 +
 .../drivers/dri/i965/brw_wm_surface_state.c   |  13 +-
 src/mesa/drivers/dri/i965/intel_extensions.c  |  18 ++-
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 152 +++---
 src/mesa/drivers/dri/i965/intel_mipmap_tree.h |  36 -
 5 files changed, 188 insertions(+), 36 deletions(-)

-- 
2.20.1



Re: [Mesa-dev] [PATCH 3/8] i965: Faking the ETC2 compression on Gen < 8 GPUs using two miptrees.

2019-01-26 Thread Eleni Maria Stea
Hi Nanley,

On Fri, 18 Jan 2019 15:32:02 -0800
Nanley Chery  wrote:


> > diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
> > b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c index
> > e214fae140..4d1eafac91 100644 ---
> > a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c +++
> > b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c @@ -329,6
[...]

> > @@ -474,6 +485,11 @@ static void brw_update_texture_surface(struct
> > gl_context *ctx, struct intel_texture_object *intel_obj =
> > intel_texture_object(obj); struct intel_mipmap_tree *mt =
> > intel_obj->mt; 
> > +  if (mt->needs_fake_etc) {
> > + assert(mt->shadow_mt);
> > + mt = mt->shadow_mt;
> > +  }
> > +
> >if (plane > 0) {
> >   if (mt->plane[plane - 1] == NULL)
> >  return;
> > @@ -512,7 +528,7 @@ static void brw_update_texture_surface(struct
> > gl_context *ctx,
> >* is safe because texture views aren't allowed on
> > depth/stencil. */
> >   mesa_fmt = mt->format;
> > -  } else if (mt->etc_format != MESA_FORMAT_NONE) {
> > +  } else if (intel_obj->mt->etc_format != MESA_FORMAT_NONE) {
> >   mesa_fmt = mt->format;  
> 
> For uniformity, lets access mt->shadow_mt->format here and move the
> mt->needs_fake_etc check from above to below this condition:
> 
>   } else if (devinfo->gen <= 7 && mt->format ==
> MESA_FORMAT_S_UINT8) {

I'd like to ask you one more question on this change: if I do the check
for the fake etc later, the following code will run for the main
miptree that contains the compressed data and has ETC2 format:

> >if (plane > 0) {
> >   if (mt->plane[plane - 1] == NULL)
> >  return;
> > @@ -512,7 +528,7 @@ static void brw_update_texture_surface(struct
> > gl_context *ctx,
> >* is safe because texture views aren't allowed on
> > depth/stencil. */
> >   mesa_fmt = mt->format;

Wouldn't this be a problem?

Thank you in advance,
Eleni



Re: [Mesa-dev] [PATCH 3/8] i965: Faking the ETC2 compression on Gen < 8 GPUs using two miptrees.

2019-01-22 Thread Eleni Maria Stea
On 1/22/19 9:25 PM, Nanley Chery wrote:
[...]
> 
> The performance difference should be negligible if the function is
> declared static inline in the intel_mipmap_tree.h header. The compiler
> should include the body of function (which should be small) and avoid
> the overhead of a function call.

[...]

> 
> Firstly, it's not information that's generally useful for most
> intel_mipmap_tree objects. Having too much of such state makes debugging
> and reading the struct definition more difficult.
> 
> Secondly, it adds to the amount of state-dependent variables I have to
> keep in mind when looking at the code. I have to start asking, when is
> needs_fake_etc initialized? Is needs_fake_etc ever modified later? I'm
> already familiar with the other variables needs_fake_etc can be computed
> by: the gen, the miptree format, and the shadow_mt. I hope that helps.
> 
> -Nanley
> 

Ok, I understand, I am going to change the code to use an inline
function then.

Thank you very much,
Eleni
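
A minimal sketch of the kind of inline predicate discussed in this
thread, assuming it only needs the GPU generation, whether the platform
is Baytrail (which the extension code treats as supporting ETC2
natively), and whether the format is ETC2. The name and the plain
int/bool parameters are hypothetical; the real function would take the
driver's context and miptree types.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical predicate: do we have to fake ETC2 support?
 * Computed from existing state instead of being cached in the miptree.
 */
static inline bool
sketch_needs_fake_etc(int gen, bool is_baytrail, bool format_is_etc2)
{
   assert(gen > 0);
   return gen < 8 && !is_baytrail && format_is_etc2;
}
```

Since the body is a single expression in a static inline function, a
compiler would normally fold it into the caller, so the cost compared
with reading a cached boolean should be negligible.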


Re: [Mesa-dev] [PATCH 3/8] i965: Faking the ETC2 compression on Gen < 8 GPUs using two miptrees.

2019-01-22 Thread Eleni Maria Stea
On 1/19/19 1:32 AM, Nanley Chery wrote:
>> diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c 
>> b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
>> index e214fae140..4d1eafac91 100644
>> --- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
>> +++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
>> @@ -329,6 +329,17 @@ brw_get_texture_swizzle(const struct gl_context *ctx,
>>  {
>> const struct gl_texture_image *img = t->Image[0][t->BaseLevel];
>>  
>> +   struct brw_context *brw = brw_context((struct gl_context *)ctx);
>> +   const struct gen_device_info *devinfo = &brw->screen->devinfo;
>> +   bool is_fake_etc = _mesa_is_format_etc2(img->TexFormat) &&
>> +  devinfo->gen < 8;
>> +
>> +   mesa_format format;
>> +   if (is_fake_etc)
>> +  format = intel_lower_compressed_format(brw, img->TexFormat);
>> +   else
>> +  format = img->TexFormat;
>> +
> 
> Why is modifying this function necessary?

Hi,

I'll try to explain this modification:

After the changes we made:
- the image TexFormat remains ETC2 to match the main miptree's format
- the main miptree stores the compressed data (ETC2) so that the
GetCompressed* functions work
- the shadow miptree stores the RGBA data and we map it for the drawing

This texture swizzle function is called before the drawing and it can't
access the miptrees. Instead it reads the format of the texture we are
supposed to have in the memory from the gl_texture_image struct directly
so in this case it reads the ETC2 format.

At this time, the texture that we have in the memory and is about to be
used in the drawing is RGBA (from the shadow miptree).

As a result, we end up calculating the swizzle of the ETC2 format used
in the original image (and the main miptree) for the RGBA texture that
we have in the memory, and the texture is not rendered properly.

The solution was to use the corresponding RGBA format when we fake the
ETC2, but as I couldn't read it from the shadow miptree inside this
function, I took it by calling intel_lower_compressed_format for the
original ETC2 format of the gl_texture_image.

I hope that this change is clearer now. I will add a comment
explaining this just in case.

Thank you!
Eleni
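
The format selection explained in this message can be sketched as below.
The enum values and helper names are hypothetical and only mirror the
shape of the quoted hunk: when ETC2 is being faked (Gen < 8), the swizzle
is computed for the lowered, uncompressed format instead of the image's
ETC2 format. The mapping in sketch_lower_compressed is an assumption for
illustration, not the driver's actual table.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical subset of formats, for illustration only. */
enum sketch_format {
   SKETCH_FORMAT_ETC2_RGB8,
   SKETCH_FORMAT_ETC2_RGBA8_EAC,
   SKETCH_FORMAT_RGBX8,
   SKETCH_FORMAT_RGBA8
};

static bool
sketch_is_etc2(enum sketch_format f)
{
   return f == SKETCH_FORMAT_ETC2_RGB8 ||
          f == SKETCH_FORMAT_ETC2_RGBA8_EAC;
}

/* Assumed mapping from a compressed format to its uncompressed stand-in. */
static enum sketch_format
sketch_lower_compressed(enum sketch_format f)
{
   assert(sketch_is_etc2(f));
   return f == SKETCH_FORMAT_ETC2_RGB8 ? SKETCH_FORMAT_RGBX8
                                       : SKETCH_FORMAT_RGBA8;
}

/* Format whose swizzle should be used for the texture being sampled. */
static enum sketch_format
sketch_format_for_swizzle(int gen, enum sketch_format tex_format)
{
   bool is_fake_etc = sketch_is_etc2(tex_format) && gen < 8;

   return is_fake_etc ? sketch_lower_compressed(tex_format) : tex_format;
}
```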




Re: [Mesa-dev] [PATCH 3/8] i965: Faking the ETC2 compression on Gen < 8 GPUs using two miptrees.

2019-01-22 Thread Eleni Maria Stea
On 1/22/19 12:46 PM, Eleni Maria Stea wrote:
>>> +   /**
>>> +* \brief Indicates that we fake the ETC2 compression support
>>> +*
>>> +* GPUs Gen < 8 don't support sampling and rendering of ETC2
>>> formats so
>>> +* we need to fake it. This variable is set to true when we
>>> fake it.
>>> +*/
>>> +   bool needs_fake_etc;
>>> +  
>>
>> Let's make a function to detect needs_fake_etc instead of adding to
>> the data structure. That'd be easier to follow.
>>
>> -Nanley
> 
> 
> Hi Nanley,
> 
> I'd like a small clarification here if you don't mind: I wasn't very
> sure about this last change you suggest.
> 
> The reasons I preferred to extend the data structure instead of adding
> a function were:
> 
> 1- that I need to check if we fake ETC in several different places in
> which I don't always have access to the information that helped me
> decide if we need to fake the ETC or not, so I found it much easier to
> keep this information in the miptree that can be accessed from
> everywhere. (That was the main reason).

Actually, now that I've thought about it more, I only need the GPU
version and whether the format is compressed, so I can probably get this
information in all places, but we would still need to make many
unnecessary calls... Couldn't we avoid them by just checking this once
at the beginning?

Thanks again,
Eleni

> The other reasons were that:
> 2- I thought that it would be faster to check the miptree than call a
> function.
> 3- I was hoping that from the name of the variable it won't be
> difficult to follow (but I could rename it to something better if you
> prefer it).
> 
> Could you explain me why you'd like me to replace it? Is there an
> advantage I hadn't thought of?
> 
> Thank you in advance,
> Eleni




Re: [Mesa-dev] [PATCH 3/8] i965: Faking the ETC2 compression on Gen < 8 GPUs using two miptrees.

2019-01-22 Thread Eleni Maria Stea
> > +   /**
> > +* \brief Indicates that we fake the ETC2 compression support
> > +*
> > +* GPUs Gen < 8 don't support sampling and rendering of ETC2
> > formats so
> > +* we need to fake it. This variable is set to true when we
> > fake it.
> > +*/
> > +   bool needs_fake_etc;
> > +  
> 
> Let's make a function to detect needs_fake_etc instead of adding to
> the data structure. That'd be easier to follow.
> 
> -Nanley


Hi Nanley,

I'd like a small clarification here if you don't mind: I wasn't very
sure about this last change you suggest.

The reasons I preferred to extend the data structure instead of adding
a function were:

1- that I need to check if we fake ETC in several different places in
which I don't always have access to the information that helped me
decide if we need to fake the ETC or not, so I found it much easier to
keep this information in the miptree that can be accessed from
everywhere. (That was the main reason).

The other reasons were that:
2- I thought that it would be faster to check the miptree than call a
function.
3- I was hoping that from the name of the variable it won't be
difficult to follow (but I could rename it to something better if you
prefer it).

Could you explain to me why you'd like me to replace it? Is there an
advantage I hadn't thought of?

Thank you in advance,
Eleni


Re: [Mesa-dev] [PATCH 2/8] i965: r8stencil_mt/needs_update renamed to shadow_mt/needs_update

2019-01-21 Thread Eleni Maria Stea
On 1/19/19 12:55 AM, Nanley Chery wrote:
> The series I pointed you to earlier has a patch like this, but it's more
> complete. It also modifies the comment above the data structure being
> modified. Do you want to review it?
> 
> https://patchwork.freedesktop.org/patch/253197/
> 
> I think what people usually do in this case is send out their series
> with the other person's patch included (and their rb tacked onto it).


Hi Nanley,

First of all, thank you for taking the time to look at the patches.

I will review your patch and replace mine with it in the fixed series
when I complete the other changes you suggested.

Regards,
Eleni


Re: [Mesa-dev] [PATCH 0/8] i965: improved the support for ETC2 formats on Gen 7

2019-01-14 Thread Eleni Maria Stea
On Mon, 19 Nov 2018 10:54:04 +0200
Eleni Maria Stea  wrote:

> Intel Gen7 GPUs don't have native support for ETC2 formats. We store
> the ETC2 images as RGBA in order to render them. This is a problem for
> GetCompressed* functions that should return compressed pixel values
> but return instead RGBA.
> 
[...]

Hi Nanley and Kenneth,

It's been a while since I sent these ETC2-related patches and I was
wondering if you could take a look when you have some time available.

I've also written a test to check the compressed cubemaps rendering (we
already had tests for the Get functions, and compressed mipmaps, so this
case was the only one missing). The patch is here (compressed-cubemap
test):

https://patchwork.freedesktop.org/series/54880/

While working on the test I found an issue with TexImage2D and some
other compressed formats (like BPTC), and I wrote another test
(included in the same patch) that points it out (see the cover letter).

Another problem I hit while working on the cubemap test is described
here (I found it by calling glViewport with invalid values
accidentally):

https://bugs.freedesktop.org/show_bug.cgi?id=108999

I've sent a small patch for it, but so far there has been no reply:

https://patchwork.freedesktop.org/patch/267292/

I'd really appreciate it if you could take some time to look at these 3
issues.

Thank you very much in advance,
Eleni



[Mesa-dev] [PATCH] i965: fixed clamping in set_scissor_bits when the y is flipped

2018-12-10 Thread Eleni Maria Stea
Calculating the scissor rectangle fields with the y flipped (0 on top)
can generate negative values that cause an assertion failure later on,
because the scissor fields are all unsigned. We must clamp the bbox
values again to make sure they don't exceed fb_height. Also fixed a
calculation error.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108999
---
 src/mesa/drivers/dri/i965/genX_state_upload.c | 15 ++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/genX_state_upload.c b/src/mesa/drivers/dri/i965/genX_state_upload.c
index 8e3fcbf12e..5d8fc8214e 100644
--- a/src/mesa/drivers/dri/i965/genX_state_upload.c
+++ b/src/mesa/drivers/dri/i965/genX_state_upload.c
@@ -2424,8 +2424,21 @@ set_scissor_bits(const struct gl_context *ctx, int i,
   /* memory: Y=0=top */
   sc->ScissorRectangleXMin = bbox[0];
   sc->ScissorRectangleXMax = bbox[1] - 1;
+
+  /* Clamping to fb_height is necessary because otherwise the
+   * subtractions below would produce a negative result, which would
+   * then be assigned to the unsigned YMin/YMax scissor fields,
+   * resulting in an assertion failure in GENX(SCISSOR_RECT_pack)
+   */
+
+  if (bbox[3] > fb_height)
+ bbox[3] = fb_height;
+
+  if (bbox[2] > fb_height)
+ bbox[2] = fb_height;
+
   sc->ScissorRectangleYMin = fb_height - bbox[3];
-  sc->ScissorRectangleYMax = fb_height - bbox[2] - 1;
+  sc->ScissorRectangleYMax = fb_height - (bbox[2] - 1);
}
 }
 
-- 
2.20.0.rc2
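
For readers of the archive, the failure mode described in the commit message can be sketched in isolation. This is only an illustration under stated assumptions: `struct scissor_y`, `clamp_int` and `flip_scissor_y` are hypothetical names, not i965 code, and the sketch uses the plain inclusive-bound form (`fb_height - y0 - 1`, saturated at 0) rather than the exact arithmetic of the patch. Without the clamp, a value such as `fb_height - y1` with `y1 > fb_height` is negative and wraps to a very large number once stored in an unsigned field.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the Y fields of a hardware scissor rectangle. */
struct scissor_y {
   uint32_t y_min;   /* first row, inclusive */
   uint32_t y_max;   /* last row, inclusive */
};

/* Clamp v to the inclusive range [lo, hi]. */
static int
clamp_int(int v, int lo, int hi)
{
   return v < lo ? lo : (v > hi ? hi : v);
}

/* Flip a GL-style row range [y0, y1) (Y=0 at the bottom) into memory
 * order (Y=0 at the top) for a framebuffer with fb_height rows.
 * Empty ranges would need separate handling in a real driver.
 */
static struct scissor_y
flip_scissor_y(int y0, int y1, int fb_height)
{
   assert(fb_height > 0);

   /* Clamp first, so that neither subtraction below can go negative. */
   y0 = clamp_int(y0, 0, fb_height);
   y1 = clamp_int(y1, y0, fb_height);

   const int last_row = fb_height - y0 - 1;

   struct scissor_y sc;
   sc.y_min = (uint32_t)(fb_height - y1);
   sc.y_max = (uint32_t)(last_row < 0 ? 0 : last_row);
   return sc;
}
```

With a 100-row framebuffer, `flip_scissor_y(10, 500, 100)` clamps the upper bound to 100 and yields rows 0 to 89 instead of a wrapped unsigned value.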



[Mesa-dev] [PATCH 7/8] i965: Added support for ETC2 texture arrays on Gen7

2018-11-19 Thread Eleni Maria Stea
Modified the calculation of the number of slices in the
intel_update_decompressed_shadow function to take the array length into
account to support arrays.
---
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index 4886bb2b96..0840b3b243 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -3965,6 +3965,8 @@ intel_update_decompressed_shadow(struct brw_context *brw,
int level_w = img_w;
int level_h = img_h;
 
+   int num_slices = img_d * smt->surf.logical_level0_px.array_len;
+
for (int level = smt->first_level; level <= smt->last_level; level++) {
   ptrdiff_t shadow_stride = _mesa_format_row_stride(smt->format,
 level_w);
@@ -3972,7 +3974,7 @@ intel_update_decompressed_shadow(struct brw_context *brw,
   ptrdiff_t main_stride = _mesa_format_row_stride(mt->format,
   level_w);
 
-  for (unsigned int slice = 0; slice < img_d; slice++) {
+  for (unsigned int slice = 0; slice < num_slices; slice++) {
  GLbitfield mmode = GL_MAP_READ_BIT | BRW_MAP_DIRECT_BIT |
 BRW_MAP_ETC_BIT;
  GLbitfield smode = GL_MAP_WRITE_BIT |
-- 
2.19.0
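
The slice count used by the patch above is the product of the level-0 depth and the array length. A minimal sketch of that arithmetic follows; `count_slices` is a hypothetical helper written for this note, not a function from the driver.

```c
#include <assert.h>

/* Number of 2D slices to walk per mip level: depth slices of a 3D
 * texture multiplied by the number of array layers. For a plain 2D
 * texture both factors are 1.
 */
static unsigned
count_slices(unsigned depth, unsigned array_len)
{
   assert(depth >= 1 && array_len >= 1);
   return depth * array_len;
}
```

For example, a 2D array texture with 6 layers gives `count_slices(1, 6) == 6`, whereas looping only up to the depth would visit a single slice.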



[Mesa-dev] [PATCH 8/8] i965: Enabled the OES_copy_image extension on Gen 7 GPUs

2018-11-19 Thread Eleni Maria Stea
The OES_copy_image extension was disabled on Gen7 due to the lack of
support for ETC2 images. Re-enabled it, since ETC2 support is now
emulated on Gen7.
---
 src/mesa/drivers/dri/i965/intel_extensions.c | 18 ++
 1 file changed, 14 insertions(+), 4 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_extensions.c b/src/mesa/drivers/dri/i965/intel_extensions.c
index d7e02efb54..c3b3c1bd12 100644
--- a/src/mesa/drivers/dri/i965/intel_extensions.c
+++ b/src/mesa/drivers/dri/i965/intel_extensions.c
@@ -286,14 +286,24 @@ intelInitExtensions(struct gl_context *ctx)
}
 
if (devinfo->gen >= 8 || devinfo->is_baytrail) {
-  /* For now, we only enable OES_copy_image on platforms that support
-   * ETC2 natively in hardware.  We would need more hacks to support it
-   * elsewhere. Same with OES_texture_view.
+  /*
+   * For now, we can't enable OES_texture_view on Gen 7 because of
+   * some piglit failures coming from
+   * piglit/tests/spec/arb_texture_view/rendering-formats.c that need
+   * investigation.
*/
-  ctx->Extensions.OES_copy_image = true;
   ctx->Extensions.OES_texture_view = true;
}
 
+   if (devinfo->gen >= 7) {
+  /*
+   * We can safely enable OES_copy_image on Gen 7, since we emulate
+   * the ETC2 support using the shadow_miptree to store the
+   * compressed data.
+   */
+  ctx->Extensions.OES_copy_image = true;
+   }
+
if (devinfo->gen >= 8) {
   ctx->Extensions.ARB_gpu_shader_int64 = devinfo->has_64bit_types;
   /* requires ARB_gpu_shader_int64 */
-- 
2.19.0
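
The gating that results from the hunk above can be summarized in a small standalone sketch. The names `struct ext_flags` and `pick_extensions` are hypothetical and exist only for this illustration; the conditions mirror the two `if` statements in the diff.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical subset of the extension flags touched by the patch. */
struct ext_flags {
   bool oes_copy_image;
   bool oes_texture_view;
};

static struct ext_flags
pick_extensions(int gen, bool is_baytrail)
{
   assert(gen > 0);

   struct ext_flags ext = { false, false };

   /* Texture views stay limited to hardware with native ETC2. */
   if (gen >= 8 || is_baytrail)
      ext.oes_texture_view = true;

   /* Copy image is allowed wherever ETC2 is native or emulated. */
   if (gen >= 7)
      ext.oes_copy_image = true;

   return ext;
}
```

Under these conditions a non-Baytrail Gen7 part gets OES_copy_image but not OES_texture_view.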



[Mesa-dev] [PATCH 0/8] i965: improved the support for ETC2 formats on Gen 7

2018-11-19 Thread Eleni Maria Stea
Intel Gen7 GPUs don't have native support for ETC2 formats. We store the
ETC2 images as RGBA in order to render them. This is a problem for
GetCompressed* functions that should return compressed pixel values but
instead return RGBA.

With these patches, we store the compressed image data in the main image
mipmap tree and we use a secondary mipmap tree to store the RGBA values
for the rendering. We perform a lazy update every time that the main
miptree changes.
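
The lazy update described above can be sketched with a dirty flag. This is a simplified illustration, not code from the series: `struct tree`, `tree_write`, `tree_update_shadow` and `tree_sample` are hypothetical names, the "decompression" is a plain copy, and `update_count` exists only to make the laziness observable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TREE_SIZE 4

struct tree {
   unsigned char main_data[TREE_SIZE];   /* stands in for compressed data */
   unsigned char shadow_data[TREE_SIZE]; /* stands in for the RGBA copy */
   bool shadow_needs_update;
   int update_count;                     /* instrumentation only */
};

/* Any write to the main data only marks the shadow as stale. */
static void
tree_write(struct tree *t, size_t i, unsigned char v)
{
   assert(i < TREE_SIZE);
   t->main_data[i] = v;
   t->shadow_needs_update = true;
}

/* Refresh the shadow only if the main data changed since the last call. */
static void
tree_update_shadow(struct tree *t)
{
   if (!t->shadow_needs_update)
      return;

   for (size_t i = 0; i < TREE_SIZE; i++)
      t->shadow_data[i] = t->main_data[i]; /* placeholder for decompression */

   t->shadow_needs_update = false;
   t->update_count++;
}

/* Readers that need the decompressed view go through the shadow. */
static unsigned char
tree_sample(struct tree *t, size_t i)
{
   assert(i < TREE_SIZE);
   tree_update_shadow(t);
   return t->shadow_data[i];
}
```

Several writes followed by several reads trigger a single refresh, which is the point of deferring the work until the shadow is actually needed.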

Fix: KHR-GL46.direct_state_access.textures_compressed_subimage

Eleni Maria Stea (8):
  i965: Removed assertions from intel_miptree_map_etc
  i965: r8stencil_mt/needs_update renamed to shadow_mt/needs_update
  i965: Faking the ETC2 compression on Gen < 8 GPUs using two miptrees.
  i965: Update the shadow miptree from the main to fake the ETC2
compression
  i965: Fixed the CopyImageSubData for ETC2 on Gen < 8
  i965: Added support for ETC2 mipmaps
  i965: Added support for ETC2 texture arrays on Gen7
  i965: Enabled the OES_copy_image extension on Gen 7 GPUs

 src/mesa/drivers/dri/i965/brw_draw.c  |   3 +
 .../drivers/dri/i965/brw_wm_surface_state.c   |  35 +++-
 src/mesa/drivers/dri/i965/intel_extensions.c  |  18 +-
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 168 +++---
 src/mesa/drivers/dri/i965/intel_mipmap_tree.h |  24 ++-
 src/mesa/drivers/dri/i965/intel_tex_image.c   |  45 -
 src/mesa/main/texstore.c  |  92 +-
 src/mesa/main/texstore.h  |   9 +
 8 files changed, 315 insertions(+), 79 deletions(-)

-- 
2.19.0



[Mesa-dev] [PATCH 4/8] i965: Update the shadow miptree from the main to fake the ETC2 compression

2018-11-19 Thread Eleni Maria Stea
On Gen < 8 GPUs, which don't support ETC2 sampling/rendering, we now fake
the support using 2 mipmap trees: one (the main) that stores the
compressed data so that the Get* functions work, and one (the shadow)
that stores the same data decompressed so that rendering/sampling works.

Added the intel_update_decompressed_shadow function to update the shadow
tree with the decompressed data whenever the main miptree holding the
compressed data changes.
---
 .../drivers/dri/i965/brw_wm_surface_state.c   |  1 +
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 70 ++-
 src/mesa/drivers/dri/i965/intel_mipmap_tree.h |  3 +
 3 files changed, 71 insertions(+), 3 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
index 4d1eafac91..2e6d85e1fe 100644
--- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
@@ -579,6 +579,7 @@ static void brw_update_texture_surface(struct gl_context *ctx,
 
   if (obj->StencilSampling && firstImage->_BaseFormat == GL_DEPTH_STENCIL) {
  if (devinfo->gen <= 7) {
+assert(!intel_obj->mt->needs_fake_etc);
 assert(mt->shadow_mt && !mt->stencil_mt->shadow_needs_update);
 mt = mt->shadow_mt;
  } else {
diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index b24332ff67..ef3e2c33d3 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -3740,12 +3740,15 @@ intel_miptree_map(struct brw_context *brw,
assert(mt->surf.samples == 1);
 
if (mt->needs_fake_etc) {
-  if (!(mode & BRW_MAP_ETC_BIT)) {
+  if (!(mode & BRW_MAP_ETC_BIT) && !(mode & GL_MAP_READ_BIT)) {
  assert(mt->shadow_mt);
 
- mt->is_shadow_mapped = true;
+ if (mt->shadow_needs_update) {
+intel_update_decompressed_shadow(brw, mt);
+mt->shadow_needs_update = false;
+ }
 
- mt->shadow_needs_update = false;
+ mt->is_shadow_mapped = true;
  mt = miptree->shadow_mt;
   } else {
  mt->is_shadow_mapped = false;
@@ -3762,6 +3765,8 @@ intel_miptree_map(struct brw_context *brw,
 
map = intel_miptree_attach_map(mt, level, slice, x, y, w, h, mode);
if (!map){
+  miptree->is_shadow_mapped = false;
+
   *out_ptr = NULL;
   *out_stride = 0;
   return;
@@ -3942,3 +3947,62 @@ intel_miptree_get_clear_color(const struct gen_device_info *devinfo,
   return mt->fast_clear_color;
}
 }
+
+void
+intel_update_decompressed_shadow(struct brw_context *brw,
+ struct intel_mipmap_tree *mt)
+{
+   struct intel_mipmap_tree *smt = mt->shadow_mt;
+
+   assert(smt);
+   assert(mt->needs_fake_etc);
+   assert(mt->surf.size_B > 0);
+
+   int img_w = smt->surf.logical_level0_px.width;
+   int img_h = smt->surf.logical_level0_px.height;
+   int img_d = smt->surf.logical_level0_px.depth;
+
+   ptrdiff_t shadow_stride = _mesa_format_row_stride(smt->format, img_w);
+
+   for (int level = smt->first_level; level <= smt->last_level; level++) {
+  struct compressed_pixelstore store;
+  _mesa_compute_compressed_pixelstore(mt->surf.dim,
+  mt->format,
+  img_w, img_h, img_d,
+  &brw->ctx.Unpack,
+  &store);
+  for (unsigned int slice = 0; slice < img_d; slice++) {
+ GLbitfield mmode = GL_MAP_READ_BIT | BRW_MAP_DIRECT_BIT |
+BRW_MAP_ETC_BIT;
+ GLbitfield smode = GL_MAP_WRITE_BIT |
+GL_MAP_INVALIDATE_RANGE_BIT |
+BRW_MAP_DIRECT_BIT;
+
+ uint32_t img_x, img_y;
+ intel_miptree_get_image_offset(smt, level, slice, &img_x, &img_y);
+
+ void *mptr = intel_miptree_map_raw(brw, mt, mmode) + mt->offset
++ img_y * store.TotalBytesPerRow
++ img_x * store.TotalBytesPerRow / img_w;
+
+ void *sptr;
+ intel_miptree_map(brw, smt, level, slice, img_x, img_y, img_w, img_h,
+   smode, &sptr, &shadow_stride);
+
+ if (mt->format == MESA_FORMAT_ETC1_RGB8) {
+_mesa_etc1_unpack_rgba(sptr, shadow_stride,
+   mptr, store.TotalBytesPerRow,
+   img_w, img_h);
+ } else {
+_mesa_unpack_etc2_format(sptr, shadow_stride,
+ mptr, store.TotalBytesPerRow,
+ img_w, img_h, mt->format, true);
+ }
+
+ intel_miptree_unmap_raw(mt);
+ intel_miptree_unmap(brw, smt, level, slice);
+  }
+   }
+
+   mt->shadow_needs_update = false;
+}
diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h 

[Mesa-dev] [PATCH 3/8] i965: Faking the ETC2 compression on Gen < 8 GPUs using two miptrees.

2018-11-19 Thread Eleni Maria Stea
Gen < 8 GPUs cannot render ETC2 formats. So far, the driver converted the
compressed EAC/ETC2 images to non-compressed RGB format images that the
hardware can render. When GetCompressed* functions were called, the
pixels were returned in the RGB format and not in the compressed format
as expected.

To fix this problem, we use the shadow miptree to store the decompressed
data for the rendering and the main miptree to store the compressed
data. We use BRW_MAP_ETC_BIT as a flag to indicate when we use the fake
compression, in order to map the main tree with the compressed data. The
functions that upload the compressed data, as well as the
mapping/unmapping functions, are now updated to use this flag.
---
 .../drivers/dri/i965/brw_wm_surface_state.c   | 26 +-
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 73 +--
 src/mesa/drivers/dri/i965/intel_mipmap_tree.h | 17 
 src/mesa/drivers/dri/i965/intel_tex_image.c   | 45 -
 src/mesa/main/texstore.c  | 92 +++
 src/mesa/main/texstore.h  |  9 ++
 6 files changed, 204 insertions(+), 58 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
index e214fae140..4d1eafac91 100644
--- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
@@ -329,6 +329,17 @@ brw_get_texture_swizzle(const struct gl_context *ctx,
 {
const struct gl_texture_image *img = t->Image[0][t->BaseLevel];
 
+   struct brw_context *brw = brw_context((struct gl_context *)ctx);
+   const struct gen_device_info *devinfo = &brw->screen->devinfo;
+   bool is_fake_etc = _mesa_is_format_etc2(img->TexFormat) &&
+  devinfo->gen < 8;
+
+   mesa_format format;
+   if (is_fake_etc)
+  format = intel_lower_compressed_format(brw, img->TexFormat);
+   else
+  format = img->TexFormat;
+
int swizzles[SWIZZLE_NIL + 1] = {
   SWIZZLE_X,
   SWIZZLE_Y,
@@ -381,7 +392,7 @@ brw_get_texture_swizzle(const struct gl_context *ctx,
   }
}
 
-   GLenum datatype = _mesa_get_format_datatype(img->TexFormat);
+   GLenum datatype = _mesa_get_format_datatype(format);
 
/* If the texture's format is alpha-only, force R, G, and B to
 * 0.0. Similarly, if the texture's format has no alpha channel,
@@ -422,9 +433,9 @@ brw_get_texture_swizzle(const struct gl_context *ctx,
case GL_RED:
case GL_RG:
case GL_RGB:
-  if (_mesa_get_format_bits(img->TexFormat, GL_ALPHA_BITS) > 0 ||
-  img->TexFormat == MESA_FORMAT_RGB_DXT1 ||
-  img->TexFormat == MESA_FORMAT_SRGB_DXT1)
+  if (_mesa_get_format_bits(format, GL_ALPHA_BITS) > 0 ||
+  format == MESA_FORMAT_RGB_DXT1 ||
+  format == MESA_FORMAT_SRGB_DXT1)
  swizzles[3] = SWIZZLE_ONE;
   break;
}
@@ -474,6 +485,11 @@ static void brw_update_texture_surface(struct gl_context *ctx,
   struct intel_texture_object *intel_obj = intel_texture_object(obj);
   struct intel_mipmap_tree *mt = intel_obj->mt;
 
+  if (mt->needs_fake_etc) {
+ assert(mt->shadow_mt);
+ mt = mt->shadow_mt;
+  }
+
   if (plane > 0) {
  if (mt->plane[plane - 1] == NULL)
 return;
@@ -512,7 +528,7 @@ static void brw_update_texture_surface(struct gl_context *ctx,
   * is safe because texture views aren't allowed on depth/stencil.
   */
  mesa_fmt = mt->format;
-  } else if (mt->etc_format != MESA_FORMAT_NONE) {
+  } else if (intel_obj->mt->etc_format != MESA_FORMAT_NONE) {
  mesa_fmt = mt->format;
   } else if (plane > 0) {
  mesa_fmt = mt->format;
diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index 0e67e4d8f3..b24332ff67 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -689,6 +689,8 @@ miptree_create(struct brw_context *brw,
if (devinfo->gen < 6 && _mesa_is_format_color_format(format))
   tiling_flags &= ~ISL_TILING_Y0_BIT;
 
+   bool fakes_etc_compression = devinfo->gen < 8 && _mesa_is_format_etc2(format);
+
mesa_format mt_fmt;
if (_mesa_is_format_color_format(format)) {
   mt_fmt = intel_lower_compressed_format(brw, format);
@@ -700,18 +702,41 @@ miptree_create(struct brw_context *brw,
intel_depth_format_for_depthstencil_format(format);
}
 
+   mesa_format fmt = fakes_etc_compression ? format : mt_fmt;
struct intel_mipmap_tree *mt =
-  make_surface(brw, target, mt_fmt, first_level, last_level,
+  make_surface(brw, target, fmt, first_level, last_level,
width0, height0, depth0, num_samples,
-   tiling_flags, mt_surf_usage(mt_fmt),
+   tiling_flags, mt_surf_usage(fmt),
alloc_flags, 0, NULL);
 
if (mt == NULL)
   return NULL;
 
+   mt->needs_fake_etc = 

[Mesa-dev] [PATCH 2/8] i965: r8stencil_mt/needs_update renamed to shadow_mt/needs_update

2018-11-19 Thread Eleni Maria Stea
Renamed r8stencil_mt and r8stencil_needs_update to shadow_mt and
shadow_needs_update respectively, to allow reusing shadow_mt as a
general-purpose secondary mipmap tree.
---
 src/mesa/drivers/dri/i965/brw_wm_surface_state.c |  8 
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c| 16 
 src/mesa/drivers/dri/i965/intel_mipmap_tree.h|  4 ++--
 3 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
index 8d21cf5fa7..e214fae140 100644
--- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
@@ -563,15 +563,15 @@ static void brw_update_texture_surface(struct gl_context *ctx,
 
   if (obj->StencilSampling && firstImage->_BaseFormat == GL_DEPTH_STENCIL) {
  if (devinfo->gen <= 7) {
-assert(mt->r8stencil_mt && !mt->stencil_mt->r8stencil_needs_update);
-mt = mt->r8stencil_mt;
+assert(mt->shadow_mt && !mt->stencil_mt->shadow_needs_update);
+mt = mt->shadow_mt;
  } else {
 mt = mt->stencil_mt;
  }
  format = ISL_FORMAT_R8_UINT;
   } else if (devinfo->gen <= 7 && mt->format == MESA_FORMAT_S_UINT8) {
- assert(mt->r8stencil_mt && !mt->r8stencil_needs_update);
- mt = mt->r8stencil_mt;
+ assert(mt->shadow_mt && !mt->shadow_needs_update);
+ mt = mt->shadow_mt;
  format = ISL_FORMAT_R8_UINT;
   }
 
diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index 5e11ec0c30..0e67e4d8f3 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -1216,7 +1216,7 @@ intel_miptree_release(struct intel_mipmap_tree **mt)
 
   brw_bo_unreference((*mt)->bo);
   intel_miptree_release(&(*mt)->stencil_mt);
-  intel_miptree_release(&(*mt)->r8stencil_mt);
+  intel_miptree_release(&(*mt)->shadow_mt);
   intel_miptree_aux_buffer_free((*mt)->aux_buf);
   free_aux_state_map((*mt)->aux_state);
 
@@ -2429,7 +2429,7 @@ intel_miptree_finish_write(struct brw_context *brw,
switch (mt->aux_usage) {
case ISL_AUX_USAGE_NONE:
   if (mt->format == MESA_FORMAT_S_UINT8 && devinfo->gen <= 7)
- mt->r8stencil_needs_update = true;
+ mt->shadow_needs_update = true;
   break;
 
case ISL_AUX_USAGE_MCS:
@@ -2935,9 +2935,9 @@ intel_update_r8stencil(struct brw_context *brw,
 
assert(src->surf.size_B > 0);
 
-   if (!mt->r8stencil_mt) {
+   if (!mt->shadow_mt) {
   assert(devinfo->gen > 6); /* Handle MIPTREE_LAYOUT_GEN6_HIZ_STENCIL */
-  mt->r8stencil_mt = make_surface(
+  mt->shadow_mt = make_surface(
 brw,
 src->target,
 MESA_FORMAT_R_UINT8,
@@ -2951,13 +2951,13 @@ intel_update_r8stencil(struct brw_context *brw,
 ISL_TILING_Y0_BIT,
 ISL_SURF_USAGE_TEXTURE_BIT,
 BO_ALLOC_BUSY, 0, NULL);
-  assert(mt->r8stencil_mt);
+  assert(mt->shadow_mt);
}
 
-   if (src->r8stencil_needs_update == false)
+   if (src->shadow_needs_update == false)
   return;
 
-   struct intel_mipmap_tree *dst = mt->r8stencil_mt;
+   struct intel_mipmap_tree *dst = mt->shadow_mt;
 
for (int level = src->first_level; level <= src->last_level; level++) {
   const unsigned depth = src->surf.dim == ISL_SURF_DIM_3D ?
@@ -2977,7 +2977,7 @@ intel_update_r8stencil(struct brw_context *brw,
}
 
brw_cache_flush_for_read(brw, dst->bo);
-   src->r8stencil_needs_update = false;
+   src->shadow_needs_update = false;
 }
 
 static void *
diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
index b0333655ad..b955a2bab1 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
@@ -302,8 +302,8 @@ struct intel_mipmap_tree
 *
 * \see intel_update_r8stencil()
 */
-   struct intel_mipmap_tree *r8stencil_mt;
-   bool r8stencil_needs_update;
+   struct intel_mipmap_tree *shadow_mt;
+   bool shadow_needs_update;
 
/**
 * \brief CCS, MCS, or HiZ auxiliary buffer.
-- 
2.19.0



[Mesa-dev] [PATCH 1/8] i965: Removed assertions from intel_miptree_map_etc

2018-11-19 Thread Eleni Maria Stea
The assertions on GL_MAP_WRITE_BIT and GL_MAP_INVALIDATE_RANGE_BIT in
intel_miptree_map_etc should be removed, since they fail when the ETC
miptree is mapped for reading.

Fixes: KHR-GL45.direct_state_access.textures_compressed_subimage crash
on Gen 7 GPUs.
---
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index 8e50aabb3b..5e11ec0c30 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -3444,9 +3444,6 @@ intel_miptree_map_etc(struct brw_context *brw,
   assert(mt->format == MESA_FORMAT_R8G8B8X8_UNORM);
}
 
-   assert(map->mode & GL_MAP_WRITE_BIT);
-   assert(map->mode & GL_MAP_INVALIDATE_RANGE_BIT);
-
intel_miptree_access_raw(brw, mt, level, slice, true);
 
map->stride = _mesa_format_row_stride(mt->etc_format, map->w);
-- 
2.19.0



[Mesa-dev] [PATCH 5/8] i965: Fixed the CopyImageSubData for ETC2 on Gen < 8

2018-11-19 Thread Eleni Maria Stea
CopyImageSubData couldn't work for the first draw call because
intel_update_decompressed_shadow was called during the rendering. Moved
the intel_update_decompressed_shadow call into
brw_predraw_resolve_inputs to fix this problem.
---
 src/mesa/drivers/dri/i965/brw_draw.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_draw.c b/src/mesa/drivers/dri/i965/brw_draw.c
index 8536c04010..b331561f36 100644
--- a/src/mesa/drivers/dri/i965/brw_draw.c
+++ b/src/mesa/drivers/dri/i965/brw_draw.c
@@ -559,6 +559,9 @@ brw_predraw_resolve_inputs(struct brw_context *brw, bool rendering,
   tex_obj->mt->format == MESA_FORMAT_S_UINT8) {
  intel_update_r8stencil(brw, tex_obj->mt);
   }
+
+  if (tex_obj->mt->needs_fake_etc && tex_obj->mt->shadow_needs_update)
+ intel_update_decompressed_shadow(brw, tex_obj->mt);
}
 
/* Resolve color for each active shader image. */
-- 
2.19.0



[Mesa-dev] [PATCH 6/8] i965: Added support for ETC2 mipmaps

2018-11-19 Thread Eleni Maria Stea
Extended intel_update_decompressed_shadow to update all the mipmap
tree levels so that we can display and run Get functions on mipmaps.
---
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 48 +++
 1 file changed, 29 insertions(+), 19 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index ef3e2c33d3..4886bb2b96 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -3962,15 +3962,16 @@ intel_update_decompressed_shadow(struct brw_context *brw,
int img_h = smt->surf.logical_level0_px.height;
int img_d = smt->surf.logical_level0_px.depth;
 
-   ptrdiff_t shadow_stride = _mesa_format_row_stride(smt->format, img_w);
+   int level_w = img_w;
+   int level_h = img_h;
 
for (int level = smt->first_level; level <= smt->last_level; level++) {
-  struct compressed_pixelstore store;
-  _mesa_compute_compressed_pixelstore(mt->surf.dim,
-  mt->format,
-  img_w, img_h, img_d,
-  &brw->ctx.Unpack,
-  &store);
+  ptrdiff_t shadow_stride = _mesa_format_row_stride(smt->format,
+level_w);
+
+  ptrdiff_t main_stride = _mesa_format_row_stride(mt->format,
+  level_w);
+
   for (unsigned int slice = 0; slice < img_d; slice++) {
  GLbitfield mmode = GL_MAP_READ_BIT | BRW_MAP_DIRECT_BIT |
 BRW_MAP_ETC_BIT;
@@ -3978,30 +3979,39 @@ intel_update_decompressed_shadow(struct brw_context *brw,
 GL_MAP_INVALIDATE_RANGE_BIT |
 BRW_MAP_DIRECT_BIT;
 
- uint32_t img_x, img_y;
- intel_miptree_get_image_offset(smt, level, slice, &img_x, &img_y);
+ uint32_t slevel_x, slevel_y;
+ intel_miptree_get_image_offset(smt, level, slice, &slevel_x,
+&slevel_y);
+
+ uint32_t mlevel_x, mlevel_y;
+ intel_miptree_get_image_offset(mt, level, slice, &mlevel_x,
+&mlevel_y);
+
+ void *mptr;
+ intel_miptree_map(brw, mt, level, slice, 0, 0,
+   level_w, level_h, mmode, &mptr, &main_stride);
 
- void *mptr = intel_miptree_map_raw(brw, mt, mmode) + mt->offset
-+ img_y * store.TotalBytesPerRow
-+ img_x * store.TotalBytesPerRow / img_w;
 
  void *sptr;
- intel_miptree_map(brw, smt, level, slice, img_x, img_y, img_w, img_h,
-   smode, &sptr, &shadow_stride);
+ intel_miptree_map(brw, smt, level, slice, 0, 0, level_w,
+   level_h, smode, &sptr, &shadow_stride);
 
  if (mt->format == MESA_FORMAT_ETC1_RGB8) {
 _mesa_etc1_unpack_rgba(sptr, shadow_stride,
-   mptr, store.TotalBytesPerRow,
-   img_w, img_h);
+   mptr, main_stride,
+   level_w, level_h);
  } else {
 _mesa_unpack_etc2_format(sptr, shadow_stride,
- mptr, store.TotalBytesPerRow,
- img_w, img_h, mt->format, true);
+ mptr, main_stride,
+ level_w, level_h, mt->format, true);
  }
 
- intel_miptree_unmap_raw(mt);
+ intel_miptree_unmap(brw, mt, level, slice);
  intel_miptree_unmap(brw, smt, level, slice);
   }
+
+  level_w /= 2;
+  level_h /= 2;
}
 
mt->shadow_needs_update = false;
-- 
2.19.0
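
The patch above halves `level_w` and `level_h` once per level. In OpenGL the size of mip level n along one axis is max(1, floor(size / 2^n)), so for non-square textures the shorter axis must stop at 1 rather than reach 0. The sketch below shows that clamped computation; `minify_dim` is a hypothetical helper written for this note, and whether the series needs such a clamp is not something the diff shown here settles.

```c
#include <assert.h>

/* Size of one axis at a given mip level, never smaller than 1. */
static unsigned
minify_dim(unsigned size, unsigned level)
{
   assert(size >= 1);
   assert(level < 32);   /* keep the shift well defined for 32-bit values */

   const unsigned v = size >> level;
   return v > 0 ? v : 1;
}
```

For a 64x4 texture, level 3 is 8x1 under this rule, whereas plain integer halving of the height gives 4, 2, 1, 0.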



Re: [Mesa-dev] [PATCH v5] i965: Fix ETC2/EAC GetCompressed* functions on Gen7 GPUs

2018-07-18 Thread Eleni Maria Stea
On 07/10/2018 03:10 AM, Nanley Chery wrote:
> On Thu, Jun 14, 2018 at 10:50:57PM +0300, Eleni Maria Stea wrote:
>> On 06/14/2018 10:27 PM, Nanley Chery wrote:
>>
>>> +Jason, Ken
>>>
>>> Hello,
>>>
>>> I recently did some miptree work relating to the r8stencil_mt and I
>>> think I now have a more informed opinion about how things should be
>>> structured. I'd like to propose an alternative solution.
>>>
>>> I had initially thought we should have a separate miptree to hold the
>>> compressed data, like this patch does, but now I think we should
>>> actually have the compressed data be the main miptree and to store the
>>> decompressed miptree as part of the main one. The reasoning is that we
>>> could reuse this structure to handle the r8stencil workaround and to
>>> eventually handle the ASTC_LDR surfaces that are modified on gen9.
>>>
>>> I'm proposing something like the following:
>>>
>>> 1. Rename r8stencil_mt ->shadow_mt and
>>>r8stencil_needs_update -> shadow_needs_update.
>>> 2. Make shadow_mt hold the decompressed ETC miptree
>>> 3. Update shadow_needs_update whenever the main mt is modified
>>> 4. Add an function to update the shadow_mt using the main mt as a source
>>> 5. Sample from the shadow_mt as appropriate
>>> 6. Make the main miptree hold the compressed data
>>>
>>> This method should also be able to handle the CopyImage functions. What
>>> do you all think?
>>>
>>> -Nanley
>>
>> Hi Nanley,
>>
>> Thank you for your reply. I wasn't aware that there are other cases we
>> might need to store a 2nd image. I agree that it's more reasonable to
>> use one generic purpose miptree that can be accessible from different
>> parts of the i965 code for such cases instead of storing miptrees in
>> different places for different hacks when a feature is not supported.
>>
>> I will search your patch to get a look and I will also get a look at the
>> mesa code to see how easy this fix would be (which parts of the code it
>> might affect) and if everyone agrees that this is a good idea I will
>> modify this patch according to your suggestions.
>>
>> BR :)
>> Eleni
> 
> Hi Eleni,
> 
> I gave this more thought and am now thinking that what you have here is
> fine. Having two different ways of working with a shadow miptree
> suggests a refactor later on, but IMO this is ultimately a step in the
> right direction. Sorry for the noise.
> 
> With code-sharing among shadow miptrees in mind, my two main
> suggestions are 1) to perform mapping operations only with the cmt (if
> it's present) and 2) to update the decompressed mt, on demand. Maybe
> with intel_miptree_copy_slice_sw?
> 
> Regards,
> Nanley
> 

Hi Nanley,

I talked to you on IRC, but I'm replying here as well:

Thank you for the suggestions. I had misunderstood something from our
IRC conversation that followed this e-mail, so the patch v6 has several
issues. I will send a new one soon, implementing the solution you
suggested earlier (suggestions 1-6) instead. Sorry for the noise with
the patch v6.

Thanks,
Eleni





[Mesa-dev] [PATCH v6] i965: Fix ETC2/EAC GetCompressed* functions on Gen7 GPUs

2018-07-18 Thread Eleni Maria Stea
Gen 7 GPUs store the compressed EAC/ETC2 images in other non-compressed
formats that can be rendered. When GetCompressed* functions are called,
the pixels are returned in the non-compressed format that is used for
the rendering.

With this patch we store both the compressed and non-compressed versions
of the image, so that both rendering commands and GetCompressed*
commands work.

Also, the assertions for GL_MAP_WRITE_BIT and GL_MAP_INVALIDATE_RANGE_BIT
in intel_miptree_map_etc function have been removed because when the
miptree is mapped for reading (for example from a GetCompressed*
function) the GL_MAP_WRITE_BIT won't be set (and shouldn't be set).

Fixes: the following test in CTS for gen7:
KHR-GL45.direct_state_access.textures_compressed_subimage test

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104272
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=81843

v2: fixes issues:
   a) initialized uninitialized variables (Juan A. Suarez, Andres Gomez)
   b) fixed race condition where mt and cmt were mapped at the same time
   c) fixed indentation issues (Andres Gomez)
v3: adds bugzilla bug with id: 104272
v4: adds bugzilla bug with id: 81843
v5: replaced the flags with a bitfield, refactoring (Kenneth Graunke)
v6: renamed the r8stencil_mt secondary miptree that is now part of the
intel_miptree_struct to shadow_mt and used it to store the compressed
miptree (Nanley Chery)
---
 .../drivers/dri/i965/brw_wm_surface_state.c   |  8 +-
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 27 +++---
 src/mesa/drivers/dri/i965/intel_mipmap_tree.h | 14 ++-
 src/mesa/drivers/dri/i965/intel_tex.c | 90 ++-
 src/mesa/drivers/dri/i965/intel_tex_image.c   | 46 +-
 src/mesa/main/texstore.c  | 62 -
 src/mesa/main/texstore.h  |  8 ++
 7 files changed, 209 insertions(+), 46 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
index 9397b637c7..2097fabaeb 100644
--- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
@@ -563,15 +563,15 @@ static void brw_update_texture_surface(struct gl_context *ctx,
 
   if (obj->StencilSampling && firstImage->_BaseFormat == GL_DEPTH_STENCIL) {
  if (devinfo->gen <= 7) {
-assert(mt->r8stencil_mt && !mt->stencil_mt->r8stencil_needs_update);
-mt = mt->r8stencil_mt;
+assert(mt->shadow_mt && !mt->stencil_mt->shadow_needs_update);
+mt = mt->shadow_mt;
  } else {
 mt = mt->stencil_mt;
  }
  format = ISL_FORMAT_R8_UINT;
   } else if (devinfo->gen <= 7 && mt->format == MESA_FORMAT_S_UINT8) {
- assert(mt->r8stencil_mt && !mt->r8stencil_needs_update);
- mt = mt->r8stencil_mt;
+ assert(mt->shadow_mt && !mt->shadow_needs_update);
+ mt = mt->shadow_mt;
  format = ISL_FORMAT_R8_UINT;
   }
 
diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index 7b1f0896ae..6d07fede52 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -719,8 +719,12 @@ miptree_create(struct brw_context *brw,
   }
}
 
-   mt->etc_format = (_mesa_is_format_color_format(format) && mt_fmt != format) ?
-format : MESA_FORMAT_NONE;
+   if (!(flags & MIPTREE_CREATE_ETC)) {
+  mt->etc_format = (_mesa_is_format_color_format(format) &&
+mt_fmt != format) ? format : MESA_FORMAT_NONE;
+   } else {
+  mt->etc_format = MESA_FORMAT_NONE;
+   }
 
if (!(flags & MIPTREE_CREATE_NO_AUX))
   intel_miptree_choose_aux_usage(brw, mt);
@@ -1214,7 +1218,7 @@ intel_miptree_release(struct intel_mipmap_tree **mt)
 
   brw_bo_unreference((*mt)->bo);
   intel_miptree_release(&(*mt)->stencil_mt);
-  intel_miptree_release(&(*mt)->r8stencil_mt);
+  intel_miptree_release(&(*mt)->shadow_mt);
   intel_miptree_aux_buffer_free((*mt)->aux_buf);
   free_aux_state_map((*mt)->aux_state);
 
@@ -2426,7 +2430,7 @@ intel_miptree_finish_write(struct brw_context *brw,
switch (mt->aux_usage) {
case ISL_AUX_USAGE_NONE:
   if (mt->format == MESA_FORMAT_S_UINT8 && devinfo->gen <= 7)
- mt->r8stencil_needs_update = true;
+ mt->shadow_needs_update = true;
   break;
 
case ISL_AUX_USAGE_MCS:
@@ -2919,9 +2923,9 @@ intel_update_r8stencil(struct brw_context *brw,
 
assert(src->surf.size > 0);
 
-   if (!mt->r8stencil_mt) {
+   if (!mt->shadow_mt) {
   assert(devinfo->gen > 6); /* Handle MIPTREE_LAYOUT_GEN6_HIZ_STENCIL */
-  mt->r8stencil_mt = make_surface(
+  mt->shadow_mt = make_surface(
 brw,
 src->target,
 MESA_FORMAT_R_UINT8,
@@ -2935,13 +2939,13 @@ intel_update_r8stencil(struct 
