[Mesa-dev] [PATCH 1/1] radv: consider MESA_VK_VERSION_OVERRIDE when setting the api version
Before setting the physical device API version, we should check if the
MESA_VK_VERSION_OVERRIDE environment variable is set and take it into
account.
---
 src/amd/vulkan/radv_extensions.py | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/src/amd/vulkan/radv_extensions.py b/src/amd/vulkan/radv_extensions.py
index 9743ce1a774..8f29f4ca40f 100644
--- a/src/amd/vulkan/radv_extensions.py
+++ b/src/amd/vulkan/radv_extensions.py
@@ -333,9 +333,13 @@ VkResult radv_EnumerateInstanceVersion(
 uint32_t radv_physical_device_api_version(struct radv_physical_device *dev)
 {
+	uint32_t override = vk_get_version_override();
+	uint32_t version = VK_MAKE_VERSION(1, 0, 68);
+
 	if (!ANDROID && dev->rad_info.has_syncobj_wait_for_submit)
-		return ${MAX_API_VERSION.c_vk_version()};
-	return VK_MAKE_VERSION(1, 0, 68);
+		version = ${MAX_API_VERSION.c_vk_version()};
+
+	return override ? MIN2(override, version) : version;
 }
 """)
-- 
2.20.1
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
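For readers skimming the thread, the clamping rule in the last line of the hunk can be tried in isolation. This is only a minimal sketch: SKETCH_MAKE_VERSION and sketch_clamp_api_version are hypothetical names, not radv code, and it assumes (as the ternary in the patch implies) that an override of 0 means "no override set".

```c
#include <assert.h>
#include <stdint.h>

/* Same bit layout as Vulkan's VK_MAKE_VERSION: major << 22, minor << 12, patch. */
#define SKETCH_MAKE_VERSION(major, minor, patch) \
   ((((uint32_t)(major)) << 22) | (((uint32_t)(minor)) << 12) | ((uint32_t)(patch)))

/* Hypothetical helper: returns the version to advertise.
 * override == 0 is assumed to mean that no override was requested. */
static uint32_t
sketch_clamp_api_version(uint32_t override, uint32_t supported)
{
   assert(supported != 0);

   if (override == 0)
      return supported;

   /* Never advertise more than the driver supports. */
   return override < supported ? override : supported;
}
```

With this rule an override can lower the advertised version but cannot raise it above what the device would report anyway.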
Re: [Mesa-dev] [PATCH v5 08/11] anv: Added support for dynamic sample locations
On Thu, 14 Mar 2019 20:00:45 -0500
Jason Ekstrand wrote:

> >  extern const struct anv_dynamic_state default_dynamic_state;
> > diff --git a/src/intel/vulkan/genX_cmd_buffer.c
> > b/src/intel/vulkan/genX_cmd_buffer.c
> > index 7687507e6b7..5d2b17cf8ae 100644
> > --- a/src/intel/vulkan/genX_cmd_buffer.c
> > +++ b/src/intel/vulkan/genX_cmd_buffer.c
> > @@ -2796,6 +2796,17 @@ genX(cmd_buffer_flush_state)(struct
> > anv_cmd_buffer *cmd_buffer)
> >                                    ANV_CMD_DIRTY_RENDER_TARGETS))
> >        gen7_cmd_buffer_emit_scissor(cmd_buffer);
> >
> > +   if (cmd_buffer->state.gfx.dynamic.sample_locations.valid) {
> > +      uint32_t samples =
> > cmd_buffer->state.gfx.dynamic.sample_locations.samples;
> > +      const VkSampleLocationEXT *locations =
> > +         cmd_buffer->state.gfx.dynamic.sample_locations.locations;
> > +      genX(emit_multisample)(&cmd_buffer->batch, samples, locations);
> > +#if GEN_GEN >= 8
> > +      genX(emit_sample_pattern)(&cmd_buffer->batch, samples, locations);
> > +#endif
> > +      cmd_buffer->state.gfx.dynamic.sample_locations.valid = false;
> >
>
> I'm not sure this is actually going to be correct. With dynamic
> state, you're required to set it before you use it. With pipeline
> state, it gets set every time the pipeline is bound. Effectively,
> the pipeline state is a big bag of dynamic state. With both of
> these, however, there are no defaults and you're required to bind a
> pipeline containing the state or explicitly set it on the command
> buffer before it gets used. VK_EXT_sample_locations is different
> though because it does have a default. So the question I'm coming
> around to is: When does the default get applied? The only sensible
> thing I can think of is at the top of the command buffer or maybe the
> top of the subpass. If this is the case, then we need to emit sample
> positions at the start of every subpass. Does the spec talk about
> this at all?
>
> --Jason

Hi Jason,

If I understand correctly (sorry if I misunderstood), you want to make
sure that in every case we have some locations set, either the default
or the custom ones, and that we emit the default locations when a
pipeline (or subpass, but we don't have locations per subpass, only per
pipeline) is bound after a pipeline for which custom locations were set
dynamically?

I didn't find any reference to this problem in the spec. The solution I
came up with was to use the bool custom_locations in emit_ms_state to
decide whether we emit the custom locations or the default ones, and to
always emit some locations when the extension is enabled.

So: when emit_ms_state is called for each new pipeline at
rasterization, we check whether
1- custom locations are enabled (custom_locations is true), and
2- the flag for the dynamic locations
   (VK_DYNAMIC_STATE_SAMPLE_LOCATIONS_EXT) is not set.
If both apply, we emit the custom locations. In every other case (when
locations are not enabled, or when the dynamic state flag is set) we
emit the default locations.

This way, in the non-dynamic case (no VK_DYNAMIC_STATE... set):
If a pipeline has the locations enabled, we emit the custom ones.
If a pipeline doesn't have locations enabled, we emit the default.
So we always have some locations set for the pipeline.

Similarly, in the dynamic case: when the user sets locations with
vkCmdSetSampleLocationsEXT, we use these locations. As we have locations
per pipeline, not per subpass (variable sample locations = false), the
next pipeline that is bound will have either custom locations or the
default (both emitted by emit_ms_state), unless the DYNAMIC_STATE flag
is set and the user calls vkCmdSetSampleLocationsEXT again to override
the default, in which case we set the user's locations.
So, again, we'll always have some locations set (and these will be the
default when the user doesn't chain the locations info struct, or
disables the locations, or sets the DYNAMIC flag but doesn't override
the locations with vkCmdSetSampleLocationsEXT). So I think we are fine.

If we didn't always emit the default, pipelines created after setting
locations in either of the two possible ways would have garbage
locations set. I verified that this happens: if you don't make use of
the bool custom_locations and run all the multisample.* vulkancts tests
at once, you will notice that tests that don't set the sample locations
in the pipeline, and that run after tests that do set them, fail. I had
spotted the following failures:

dEQP-VK.glsl.builtin_var.fragcoord_msaa.*
dEQP-VK.pipeline.multisample.sample_mask_with_depth_test.samples_.*
dEQP-VK.pipeline.multisample.sample_mask_with_depth_test.samples_.*_post_depth_coverage
dEQP-VK.pipeline.multisample_interpolation.sample_interpolate_at_distinct_values.128_128_1.samples_.*
dEQP-VK.pipeline.multisample_interpolation.sample_interpolate_at_distinct_values.137_191_1.samples_.*
dEQP-VK.pipeline.multisample_interpolation.sample_qualifier_distinct_values.128_128_1.samples_.*
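The per-pipeline rule described in this reply can be written down as a small decision function. This is a sketch of the reasoning only, not the anv code: the enum and function names are hypothetical, and the two booleans stand for "the pipeline chained enabled custom locations" and "VK_DYNAMIC_STATE_SAMPLE_LOCATIONS_EXT is set".

```c
#include <stdbool.h>

/* Hypothetical result type: which set of locations gets emitted at
 * pipeline creation. */
enum sketch_locations {
   SKETCH_LOCATIONS_DEFAULT,
   SKETCH_LOCATIONS_CUSTOM,
};

/* Custom locations are emitted only when the pipeline enables them AND
 * the dynamic state flag is not set; every other combination falls back
 * to the default, so some locations are always programmed. */
static enum sketch_locations
sketch_pick_pipeline_locations(bool locations_enabled, bool dynamic_state_set)
{
   if (locations_enabled && !dynamic_state_set)
      return SKETCH_LOCATIONS_CUSTOM;

   return SKETCH_LOCATIONS_DEFAULT;
}
```

In the dynamic case the default chosen here would later be overridden if the application calls vkCmdSetSampleLocationsEXT, which is outside what this function models.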
[Mesa-dev] [PATCH v4 3/9] anv: Implemented the vkGetPhysicalDeviceMultisamplePropertiesEXT
Implemented the vkGetPhysicalDeviceMultisamplePropertiesEXT according to
the Vulkan Specification section [36.2. Additional Multisampling
Capabilities].

v2: 1- Moved the vkGetPhysicalDeviceMultisamplePropertiesEXT from the
       anv_sample_locations.c to the anv_device.c (Jason Ekstrand)
    2- Simplified the code that sets the grid size (Jason Ekstrand)
    3- Instead of filling the whole struct, we only fill the parts we
       should override (sType, grid size) and we call
       anv_debug_ignored_stype to any pNext elements (Jason Ekstrand)
---
 src/intel/vulkan/anv_device.c | 25 +
 1 file changed, 25 insertions(+)

diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
index 52ea058bdd5..0bfff7e0b30 100644
--- a/src/intel/vulkan/anv_device.c
+++ b/src/intel/vulkan/anv_device.c
@@ -3557,6 +3557,31 @@ VkResult anv_GetCalibratedTimestampsEXT(
    return VK_SUCCESS;
 }
 
+void
+anv_GetPhysicalDeviceMultisamplePropertiesEXT(VkPhysicalDevice physicalDevice,
+                                              VkSampleCountFlagBits samples,
+                                              VkMultisamplePropertiesEXT
+                                              *pMultisampleProperties)
+{
+   ANV_FROM_HANDLE(anv_physical_device, physical_device, physicalDevice);
+
+   VkExtent2D grid_size;
+   if (samples & isl_device_get_sample_counts(&physical_device->isl_dev)) {
+      grid_size.width = 1;
+      grid_size.height = 1;
+   } else {
+      grid_size.width = 0;
+      grid_size.height = 0;
+   }
+
+   pMultisampleProperties->sType =
+      VK_STRUCTURE_TYPE_MULTISAMPLE_PROPERTIES_EXT;
+   pMultisampleProperties->maxSampleLocationGridSize = grid_size;
+
+   vk_foreach_struct(ext, pMultisampleProperties->pNext)
+      anv_debug_ignored_stype(ext->sType);
+}
+
 /* vk_icd.h does not declare this function, so we declare it here to
  * suppress Wmissing-prototypes.
  */
-- 
2.20.1
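The grid-size selection in this patch boils down to a bitmask test. A standalone sketch, with a hypothetical struct and function name and with the supported sample counts passed in as a plain mask instead of coming from isl_device_get_sample_counts:

```c
#include <stdint.h>

/* Stand-in for VkExtent2D so the sketch builds without Vulkan headers. */
struct sketch_extent2d {
   uint32_t width;
   uint32_t height;
};

/* Returns a 1x1 grid when the requested sample count bit is present in
 * the supported mask, and a 0x0 grid otherwise. */
static struct sketch_extent2d
sketch_max_grid_size(uint32_t samples, uint32_t supported_mask)
{
   struct sketch_extent2d grid = { 0, 0 };

   if (samples & supported_mask) {
      grid.width = 1;
      grid.height = 1;
   }

   return grid;
}
```

For example, with a supported mask of 1 | 2 | 4 | 8, a request for 4 samples yields a 1x1 grid and a request for 16 samples yields 0x0.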
[Mesa-dev] [PATCH v4 7/9] anv: Optimized the emission of the default locations on Gen8+
We only emit sample locations when the extension is enabled by the user.
In all other cases the default locations are emitted once when the device
is initialized to increase performance.
---
 src/intel/vulkan/anv_genX.h        |  3 ++-
 src/intel/vulkan/genX_cmd_buffer.c |  2 +-
 src/intel/vulkan/genX_pipeline.c   | 13 -
 src/intel/vulkan/genX_state.c      |  8 +---
 4 files changed, 16 insertions(+), 10 deletions(-)

diff --git a/src/intel/vulkan/anv_genX.h b/src/intel/vulkan/anv_genX.h
index 82fe5cc93bf..f28ee0b1a76 100644
--- a/src/intel/vulkan/anv_genX.h
+++ b/src/intel/vulkan/anv_genX.h
@@ -93,4 +93,5 @@ void genX(emit_ms_state)(struct anv_batch *batch,
                          const VkSampleLocationEXT *sl,
                          uint32_t num_samples,
                          uint32_t log2_samples,
-                         bool custom_sample_locations);
+                         bool custom_sample_locations,
+                         bool sample_locations_ext_enabled);
diff --git a/src/intel/vulkan/genX_cmd_buffer.c b/src/intel/vulkan/genX_cmd_buffer.c
index 57dd94bfbd7..63913dd0668 100644
--- a/src/intel/vulkan/genX_cmd_buffer.c
+++ b/src/intel/vulkan/genX_cmd_buffer.c
@@ -2651,7 +2651,7 @@ cmd_buffer_emit_sample_locations(struct anv_cmd_buffer *cmd_buffer)
 
    genX(emit_ms_state)(&cmd_buffer->batch,
                        dyn_state->sample_locations.positions,
-                       samples, log2_samples, true);
+                       samples, log2_samples, true, true);
 }
 
 void
diff --git a/src/intel/vulkan/genX_pipeline.c b/src/intel/vulkan/genX_pipeline.c
index 21b21a719da..1245090386c 100644
--- a/src/intel/vulkan/genX_pipeline.c
+++ b/src/intel/vulkan/genX_pipeline.c
@@ -572,10 +572,12 @@ emit_sample_mask(struct anv_pipeline *pipeline,
 }
 
 static void
-emit_ms_state(struct anv_pipeline *pipeline,
+emit_ms_state(struct anv_device *device,
+              struct anv_pipeline *pipeline,
               const VkPipelineMultisampleStateCreateInfo *info,
               const VkPipelineDynamicStateCreateInfo *dinfo)
 {
+   bool sample_loc_enabled = device->enabled_extensions.EXT_sample_locations;
    VkSampleLocationsInfoEXT *sl;
    bool custom_locations = false;
    uint32_t samples = 1;
@@ -586,7 +588,7 @@ emit_ms_state(struct anv_pipeline *pipeline,
    if (info) {
       samples = info->rasterizationSamples;
 
-      if (info->pNext) {
+      if (sample_loc_enabled && info->pNext) {
          VkPipelineSampleLocationsStateCreateInfoEXT *slinfo =
             (VkPipelineSampleLocationsStateCreateInfoEXT *)info->pNext;
 
@@ -613,8 +615,8 @@ emit_ms_state(struct anv_pipeline *pipeline,
       log2_samples = __builtin_ffs(samples) - 1;
    }
 
-   genX(emit_ms_state)(&pipeline->batch, sl->pSampleLocations, samples, log2_samples,
-                       custom_locations);
+   genX(emit_ms_state)(&pipeline->batch, sl->pSampleLocations, samples,
+                       log2_samples, custom_locations, sample_loc_enabled);
 }
 
 static const uint32_t vk_to_gen_logic_op[] = {
@@ -1944,7 +1946,8 @@ genX(graphics_pipeline_create)(
    assert(pCreateInfo->pRasterizationState);
    emit_rs_state(pipeline, pCreateInfo->pRasterizationState,
                  pCreateInfo->pMultisampleState, pass, subpass);
-   emit_ms_state(pipeline, pCreateInfo->pMultisampleState, pCreateInfo->pDynamicState);
+   emit_ms_state(device, pipeline, pCreateInfo->pMultisampleState,
+                 pCreateInfo->pDynamicState);
    emit_ds_state(pipeline, pCreateInfo->pDepthStencilState, pass, subpass);
    emit_cb_state(pipeline, pCreateInfo->pColorBlendState,
                  pCreateInfo->pMultisampleState);
diff --git a/src/intel/vulkan/genX_state.c b/src/intel/vulkan/genX_state.c
index 9b05506f3af..6e13001b74f 100644
--- a/src/intel/vulkan/genX_state.c
+++ b/src/intel/vulkan/genX_state.c
@@ -568,12 +568,14 @@ genX(emit_ms_state)(struct anv_batch *batch,
                     const VkSampleLocationEXT *sl,
                     uint32_t num_samples,
                     uint32_t log2_samples,
-                    bool custom_sample_locations)
+                    bool custom_sample_locations,
+                    bool sample_locations_ext_enabled)
 {
    emit_multisample(batch, sl, num_samples, log2_samples,
                     custom_sample_locations);
 #if GEN_GEN >= 8
-   emit_sample_locations(batch, sl, num_samples,
-                         custom_sample_locations);
+   if (sample_locations_ext_enabled)
+      emit_sample_locations(batch, sl, num_samples,
+                            custom_sample_locations);
 #endif
 }
-- 
2.20.1
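A side note on the `log2_samples = __builtin_ffs(samples) - 1;` line visible in the context of this diff: for a power-of-two sample count, the index of the lowest set bit is exactly log2 of the count. A minimal sketch, assuming a GCC or Clang compatible compiler for the builtin; sketch_log2_samples is a hypothetical name:

```c
#include <assert.h>
#include <stdint.h>

/* Returns log2(samples) for a power-of-two sample count.
 * __builtin_ffs returns one plus the index of the least significant set
 * bit, so subtracting 1 gives the exponent. */
static uint32_t
sketch_log2_samples(uint32_t samples)
{
   /* Sample counts are single bits: 1, 2, 4, 8, 16. */
   assert(samples > 0 && (samples & (samples - 1)) == 0);

   return (uint32_t)__builtin_ffs((int)samples) - 1;
}
```

For non-power-of-two inputs the builtin would still return the lowest set bit, which is not log2, hence the assertion in the sketch.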
[Mesa-dev] [PATCH v4 8/9] anv: Removed unused header file
In src/intel/vulkan/genX_blorp_exec.c we included the file
common/gen_sample_positions.h but did not use it. Removed.

Reviewed-by: Sagar Ghuge
Reviewed-by: Jason Ekstrand
---
 src/intel/vulkan/genX_blorp_exec.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/src/intel/vulkan/genX_blorp_exec.c b/src/intel/vulkan/genX_blorp_exec.c
index e9c85d56d5f..0eeefaaa9d6 100644
--- a/src/intel/vulkan/genX_blorp_exec.c
+++ b/src/intel/vulkan/genX_blorp_exec.c
@@ -31,7 +31,6 @@
 #undef __gen_combine_address
 
 #include "common/gen_l3_config.h"
-#include "common/gen_sample_positions.h"
 #include "blorp/blorp_genX_exec.h"
 
 static void *
-- 
2.20.1
[Mesa-dev] [PATCH v4 6/9] anv: Added support for dynamic and non-dynamic sample locations on Gen7
Allowing setting dynamic and non-dynamic sample locations on Gen7. v2: Similarly to the previous patches, removed structs and functions that were used to sort and store the sorted sample positions (Jason Ekstrand) --- src/intel/vulkan/anv_genX.h| 13 ++--- src/intel/vulkan/genX_cmd_buffer.c | 9 ++-- src/intel/vulkan/genX_pipeline.c | 13 + src/intel/vulkan/genX_state.c | 86 +- 4 files changed, 70 insertions(+), 51 deletions(-) diff --git a/src/intel/vulkan/anv_genX.h b/src/intel/vulkan/anv_genX.h index 5c618ab..82fe5cc93bf 100644 --- a/src/intel/vulkan/anv_genX.h +++ b/src/intel/vulkan/anv_genX.h @@ -89,11 +89,8 @@ void genX(cmd_buffer_mi_memset)(struct anv_cmd_buffer *cmd_buffer, void genX(blorp_exec)(struct blorp_batch *batch, const struct blorp_params *params); -void genX(emit_multisample)(struct anv_batch *batch, -uint32_t samples, -uint32_t log2_samples); - -void genX(emit_sample_locations)(struct anv_batch *batch, - const VkSampleLocationEXT *sl, - uint32_t num_samples, - bool custom_locations); +void genX(emit_ms_state)(struct anv_batch *batch, + const VkSampleLocationEXT *sl, + uint32_t num_samples, + uint32_t log2_samples, + bool custom_sample_locations); diff --git a/src/intel/vulkan/genX_cmd_buffer.c b/src/intel/vulkan/genX_cmd_buffer.c index 5d7c9b51a84..57dd94bfbd7 100644 --- a/src/intel/vulkan/genX_cmd_buffer.c +++ b/src/intel/vulkan/genX_cmd_buffer.c @@ -2642,7 +2642,6 @@ cmd_buffer_flush_push_constants(struct anv_cmd_buffer *cmd_buffer, static void cmd_buffer_emit_sample_locations(struct anv_cmd_buffer *cmd_buffer) { -#if GEN_GEN >= 8 struct anv_dynamic_state *dyn_state = _buffer->state.gfx.dynamic; uint32_t samples = dyn_state->sample_locations.num_samples; uint32_t log2_samples; @@ -2650,11 +2649,9 @@ cmd_buffer_emit_sample_locations(struct anv_cmd_buffer *cmd_buffer) assert(samples > 0); log2_samples = __builtin_ffs(samples) - 1; - genX(emit_multisample)(_buffer->batch, samples, log2_samples); - genX(emit_sample_locations)(_buffer->batch, - 
dyn_state->sample_locations.positions, - samples, true); -#endif + genX(emit_ms_state)(_buffer->batch, + dyn_state->sample_locations.positions, + samples, log2_samples, true); } void diff --git a/src/intel/vulkan/genX_pipeline.c b/src/intel/vulkan/genX_pipeline.c index ada022620d1..21b21a719da 100644 --- a/src/intel/vulkan/genX_pipeline.c +++ b/src/intel/vulkan/genX_pipeline.c @@ -576,11 +576,8 @@ emit_ms_state(struct anv_pipeline *pipeline, const VkPipelineMultisampleStateCreateInfo *info, const VkPipelineDynamicStateCreateInfo *dinfo) { -#if GEN_GEN >= 8 VkSampleLocationsInfoEXT *sl; bool custom_locations = false; -#endif - uint32_t samples = 1; uint32_t log2_samples = 0; @@ -589,7 +586,6 @@ emit_ms_state(struct anv_pipeline *pipeline, if (info) { samples = info->rasterizationSamples; -#if GEN_GEN >= 8 if (info->pNext) { VkPipelineSampleLocationsStateCreateInfoEXT *slinfo = (VkPipelineSampleLocationsStateCreateInfoEXT *)info->pNext; @@ -613,17 +609,12 @@ emit_ms_state(struct anv_pipeline *pipeline, } } } -#endif log2_samples = __builtin_ffs(samples) - 1; } - genX(emit_multisample(>batch, samples, log2_samples)); - -#if GEN_GEN >= 8 - genX(emit_sample_locations)(>batch, sl->pSampleLocations, - samples, custom_locations); -#endif + genX(emit_ms_state)(>batch, sl->pSampleLocations, samples, log2_samples, + custom_locations); } static const uint32_t vk_to_gen_logic_op[] = { diff --git a/src/intel/vulkan/genX_state.c b/src/intel/vulkan/genX_state.c index 4fdb74111a5..9b05506f3af 100644 --- a/src/intel/vulkan/genX_state.c +++ b/src/intel/vulkan/genX_state.c @@ -436,10 +436,12 @@ VkResult genX(CreateSampler)( return VK_SUCCESS; } -void -genX(emit_multisample)(struct anv_batch *batch, - uint32_t samples, - uint32_t log2_samples) +static void +emit_multisample(struct anv_batch *batch, + const VkSampleLocationEXT *sl, + uint32_t samples, + uint32_t log2_samples, + bool custom_locations) { anv_batch_emit(batch, GENX(3DSTATE_MULTISAMPLE), ms) { ms.NumberofMultisamples = 
log2_samples; @@ -452,31 +454,51 @@ genX(emit_multisample)(struct anv_batch *batch, */ ms.PixelPositionOffsetEnable = false; #else - switch (samples) { - case 1: -
[Mesa-dev] [PATCH v4 9/9] anv: Enabled the VK_EXT_sample_locations extension
Enabled the VK_EXT_sample_locations for Intel Gen >= 7.

v2: Replaced device.info->gen >= 7 with True, as Anv doesn't support
    anything below Gen7. (Lionel Landwerlin)

Reviewed-by: Sagar Ghuge
---
 src/intel/vulkan/anv_extensions.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/intel/vulkan/anv_extensions.py b/src/intel/vulkan/anv_extensions.py
index 9e4e03e46df..5a30c733c5c 100644
--- a/src/intel/vulkan/anv_extensions.py
+++ b/src/intel/vulkan/anv_extensions.py
@@ -129,7 +129,7 @@ EXTENSIONS = [
     Extension('VK_EXT_inline_uniform_block', 1, True),
     Extension('VK_EXT_pci_bus_info', 2, True),
     Extension('VK_EXT_post_depth_coverage', 1, 'device->info.gen >= 9'),
-    Extension('VK_EXT_sample_locations', 1, False),
+    Extension('VK_EXT_sample_locations', 1, True),
     Extension('VK_EXT_sampler_filter_minmax', 1, 'device->info.gen >= 9'),
     Extension('VK_EXT_scalar_block_layout', 1, True),
     Extension('VK_EXT_shader_stencil_export', 1, 'device->info.gen >= 9'),
-- 
2.20.1
[Mesa-dev] [PATCH v4 4/9] anv: Added support for non-dynamic sample locations on Gen8+
Allowing the user to set custom sample locations non-dynamically, by filling the extension structs and chaining them to the pipeline structs according to the Vulkan specification section [26.5. Custom Sample Locations] for the following structures: 'VkPipelineSampleLocationsStateCreateInfoEXT' 'VkSampleLocationsInfoEXT' 'VkSampleLocationEXT' Once custom locations are used, the default locations are lost and need to be re-emitted again in the next pipeline creation. For that, we emit the 3DSTATE_SAMPLE_PATTERN at every pipeline creation. v2: In v1, we used the custom anv_sample struct to store the location and the distance from the pixel center because we would then use this distance to sort the locations and send them in increasing monotonical order to the GPU. That was because the Skylake PRM Vol. 2a "3DSTATE_SAMPLE_PATTERN" says that the samples must have monotonically increasing distance from the pixel center to get the correct centroid computation in the device. However, the Vulkan spec seems to require that the samples occur in the order provided through the API and this requirement is only for the standard locations. As long as this only affects centroid calculations as the docs say, we should be ok because OpenGL and Vulkan only require that the centroid be some lit sample and that it's the same for all samples in a pixel; they have no requirement that it be the one closest to center. (Jason Ekstrand) For that we made the following changes: 1- We removed the custom structs and functions from anv_private.h and anv_sample_locations.h and anv_sample_locations.c (the last two files were removed). (Jason Ekstrand) 2- We modified the macros used to take also the array as parameter and we renamed them to start by GEN_. (Jason Ekstrand) 3- We don't sort the samples anymore. 
(Jason Ekstrand) --- src/intel/common/gen_sample_positions.h | 57 ++ src/intel/vulkan/anv_genX.h | 5 ++ src/intel/vulkan/anv_private.h | 1 + src/intel/vulkan/genX_pipeline.c| 79 + src/intel/vulkan/genX_state.c | 72 ++ 5 files changed, 201 insertions(+), 13 deletions(-) diff --git a/src/intel/common/gen_sample_positions.h b/src/intel/common/gen_sample_positions.h index da48dcb5ed0..850661931cf 100644 --- a/src/intel/common/gen_sample_positions.h +++ b/src/intel/common/gen_sample_positions.h @@ -160,4 +160,61 @@ prefix##14YOffset = 0.9375; \ prefix##15XOffset = 0.0625; \ prefix##15YOffset = 0.; +/* Examples: + * in case of GEN_GEN < 8: + * GEN_SAMPLE_POS_ELEM(ms.Sample, info->pSampleLocations, 0); expands to: + *ms.Sample0XOffset = info->pSampleLocations[0].pos.x; + *ms.Sample0YOffset = info->pSampleLocations[0].y; + * + * in case of GEN_GEN >= 8: + * GEN_SAMPLE_POS_ELEM(sp._16xSample, info->pSampleLocations, 0); expands to: + *sp._16xSample0XOffset = info->pSampleLocations[0].x; + *sp._16xSample0YOffset = info->pSampleLocations[0].y; + */ + +#define GEN_SAMPLE_POS_ELEM(prefix, arr, sample_idx) \ +prefix##sample_idx##XOffset = arr[sample_idx].x; \ +prefix##sample_idx##YOffset = arr[sample_idx].y; + +#define GEN_SAMPLE_POS_1X_ARRAY(prefix, arr)\ +GEN_SAMPLE_POS_ELEM(prefix, arr, 0); + +#define GEN_SAMPLE_POS_2X_ARRAY(prefix, arr) \ +GEN_SAMPLE_POS_ELEM(prefix, arr, 0); \ +GEN_SAMPLE_POS_ELEM(prefix, arr, 1); + +#define GEN_SAMPLE_POS_4X_ARRAY(prefix, arr) \ +GEN_SAMPLE_POS_ELEM(prefix, arr, 0); \ +GEN_SAMPLE_POS_ELEM(prefix, arr, 1); \ +GEN_SAMPLE_POS_ELEM(prefix, arr, 2); \ +GEN_SAMPLE_POS_ELEM(prefix, arr, 3); + +#define GEN_SAMPLE_POS_8X_ARRAY(prefix, arr) \ +GEN_SAMPLE_POS_ELEM(prefix, arr, 0); \ +GEN_SAMPLE_POS_ELEM(prefix, arr, 1); \ +GEN_SAMPLE_POS_ELEM(prefix, arr, 2); \ +GEN_SAMPLE_POS_ELEM(prefix, arr, 3); \ +GEN_SAMPLE_POS_ELEM(prefix, arr, 4); \ +GEN_SAMPLE_POS_ELEM(prefix, arr, 5); \ +GEN_SAMPLE_POS_ELEM(prefix, arr, 6); \ +GEN_SAMPLE_POS_ELEM(prefix, 
arr, 7); + +#define GEN_SAMPLE_POS_16X_ARRAY(prefix, arr) \ +GEN_SAMPLE_POS_ELEM(prefix, arr, 0); \ +GEN_SAMPLE_POS_ELEM(prefix, arr, 1); \ +GEN_SAMPLE_POS_ELEM(prefix, arr, 2); \ +GEN_SAMPLE_POS_ELEM(prefix, arr, 3); \ +GEN_SAMPLE_POS_ELEM(prefix, arr, 4); \ +GEN_SAMPLE_POS_ELEM(prefix, arr, 5); \ +GEN_SAMPLE_POS_ELEM(prefix, arr, 6); \ +GEN_SAMPLE_POS_ELEM(prefix, arr, 7); \ +GEN_SAMPLE_POS_ELEM(prefix, arr, 8); \ +GEN_SAMPLE_POS_ELEM(prefix, arr, 9); \ +GEN_SAMPLE_POS_ELEM(prefix, arr, 10); \ +GEN_SAMPLE_POS_ELEM(prefix, arr, 11); \ +GEN_SAMPLE_POS_ELEM(prefix, arr, 12); \ +GEN_SAMPLE_POS_ELEM(prefix, arr, 13); \ +GEN_SAMPLE_POS_ELEM(prefix, arr, 14); \ +GEN_SAMPLE_POS_ELEM(prefix, arr, 15); + #endif /* GEN_SAMPLE_POSITIONS_H */ diff --git a/src/intel/vulkan/anv_genX.h b/src/intel/vulkan/anv_genX.h index 8fd32cabf1e..fb7419b6347 100644 --- a/src/intel/vulkan/anv_genX.h +++ b/src/intel/vulkan/anv_genX.h @@ -88,3 +88,8 @@ void
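The GEN_SAMPLE_POS_ELEM macro added in this patch relies on preprocessor token pasting to build the field names. The sketch below reproduces the same trick against a made-up two-sample struct so the expansion can be checked by hand; struct sketch_ms, struct sketch_pos, SKETCH_SAMPLE_POS_ELEM and sketch_fill_2x are all hypothetical stand-ins for the genxml structs and the Mesa macro.

```c
/* Hypothetical stand-in for a packed multisample state struct. */
struct sketch_ms {
   float Sample0XOffset;
   float Sample0YOffset;
   float Sample1XOffset;
   float Sample1YOffset;
};

/* Hypothetical stand-in for VkSampleLocationEXT. */
struct sketch_pos {
   float x;
   float y;
};

/* Pastes the sample index between the prefix and the XOffset/YOffset
 * suffix: (ms.Sample, arr, 0) assigns ms.Sample0XOffset = arr[0].x and
 * ms.Sample0YOffset = arr[0].y. */
#define SKETCH_SAMPLE_POS_ELEM(prefix, arr, sample_idx)      \
   do {                                                      \
      prefix##sample_idx##XOffset = (arr)[sample_idx].x;     \
      prefix##sample_idx##YOffset = (arr)[sample_idx].y;     \
   } while (0)

/* Fills both samples of the hypothetical 2x state from an array. */
static struct sketch_ms
sketch_fill_2x(const struct sketch_pos *arr)
{
   struct sketch_ms ms = { 0 };

   SKETCH_SAMPLE_POS_ELEM(ms.Sample, arr, 0);
   SKETCH_SAMPLE_POS_ELEM(ms.Sample, arr, 1);

   return ms;
}
```

The do/while wrapper is a choice made for this sketch so each use behaves as a single statement; the macro in the patch expands to two plain assignments instead.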
[Mesa-dev] [PATCH v4 5/9] anv: Added support for dynamic sample locations on Gen8+
Added support for setting the locations when the pipeline has been created with the dynamic state bit enabled according to the Vulkan Specification section [26.5. Custom Sample Locations] for the function: 'vkCmdSetSampleLocationsEXT' The reason that we preferred to store the boolean valid inside the dynamic state struct for locations instead of using a dirty bit (ANV_CMD_DIRTY_SAMPLE_LOCATIONS for example) is that other functions can modify the value of the dirty bits causing unexpected behavior. v2: Removed all the anv* structs used with sample locations to store the locations in order for dynamic case. (see also the patch for the non-dynamic case. (Jason Ekstrand) --- src/intel/vulkan/anv_cmd_buffer.c | 19 ++ src/intel/vulkan/anv_genX.h| 4 +++ src/intel/vulkan/anv_private.h | 6 + src/intel/vulkan/genX_cmd_buffer.c | 24 ++ src/intel/vulkan/genX_pipeline.c | 40 +- src/intel/vulkan/genX_state.c | 36 +++ 6 files changed, 90 insertions(+), 39 deletions(-) diff --git a/src/intel/vulkan/anv_cmd_buffer.c b/src/intel/vulkan/anv_cmd_buffer.c index 1b34644a434..866cd03b05e 100644 --- a/src/intel/vulkan/anv_cmd_buffer.c +++ b/src/intel/vulkan/anv_cmd_buffer.c @@ -558,6 +558,25 @@ void anv_CmdSetStencilReference( cmd_buffer->state.gfx.dirty |= ANV_CMD_DIRTY_DYNAMIC_STENCIL_REFERENCE; } +void +anv_CmdSetSampleLocationsEXT(VkCommandBuffer commandBuffer, + const VkSampleLocationsInfoEXT *pSampleLocationsInfo) +{ + ANV_FROM_HANDLE(anv_cmd_buffer, cmd_buffer, commandBuffer); + + struct anv_dynamic_state *dyn_state = _buffer->state.gfx.dynamic; + uint32_t num_samples = pSampleLocationsInfo->sampleLocationsPerPixel; + + assert(pSampleLocationsInfo); + dyn_state->sample_locations.num_samples = num_samples; + + memcpy(dyn_state->sample_locations.positions, + pSampleLocationsInfo->pSampleLocations, + num_samples * sizeof *pSampleLocationsInfo->pSampleLocations); + + dyn_state->sample_locations.valid = true; +} + static void anv_cmd_buffer_bind_descriptor_set(struct anv_cmd_buffer 
*cmd_buffer, VkPipelineBindPoint bind_point, diff --git a/src/intel/vulkan/anv_genX.h b/src/intel/vulkan/anv_genX.h index fb7419b6347..5c618ab 100644 --- a/src/intel/vulkan/anv_genX.h +++ b/src/intel/vulkan/anv_genX.h @@ -89,6 +89,10 @@ void genX(cmd_buffer_mi_memset)(struct anv_cmd_buffer *cmd_buffer, void genX(blorp_exec)(struct blorp_batch *batch, const struct blorp_params *params); +void genX(emit_multisample)(struct anv_batch *batch, +uint32_t samples, +uint32_t log2_samples); + void genX(emit_sample_locations)(struct anv_batch *batch, const VkSampleLocationEXT *sl, uint32_t num_samples, diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h index a39195733cd..1e1d2feaa50 100644 --- a/src/intel/vulkan/anv_private.h +++ b/src/intel/vulkan/anv_private.h @@ -2124,6 +2124,12 @@ struct anv_dynamic_state { uint32_t front; uint32_t back; } stencil_reference; + + struct { + VkSampleLocationEXT positions[MAX_SAMPLE_LOCATIONS]; + uint32_t num_samples; + bool valid; + } sample_locations; }; extern const struct anv_dynamic_state default_dynamic_state; diff --git a/src/intel/vulkan/genX_cmd_buffer.c b/src/intel/vulkan/genX_cmd_buffer.c index 7687507e6b7..5d7c9b51a84 100644 --- a/src/intel/vulkan/genX_cmd_buffer.c +++ b/src/intel/vulkan/genX_cmd_buffer.c @@ -30,6 +30,7 @@ #include "util/fast_idiv_by_const.h" #include "common/gen_l3_config.h" +#include "common/gen_sample_positions.h" #include "genxml/gen_macros.h" #include "genxml/genX_pack.h" @@ -2638,6 +2639,24 @@ cmd_buffer_flush_push_constants(struct anv_cmd_buffer *cmd_buffer, cmd_buffer->state.push_constants_dirty &= ~flushed; } +static void +cmd_buffer_emit_sample_locations(struct anv_cmd_buffer *cmd_buffer) +{ +#if GEN_GEN >= 8 + struct anv_dynamic_state *dyn_state = _buffer->state.gfx.dynamic; + uint32_t samples = dyn_state->sample_locations.num_samples; + uint32_t log2_samples; + + assert(samples > 0); + log2_samples = __builtin_ffs(samples) - 1; + + genX(emit_multisample)(_buffer->batch, 
samples, log2_samples); + genX(emit_sample_locations)(_buffer->batch, + dyn_state->sample_locations.positions, + samples, true); +#endif +} + void genX(cmd_buffer_flush_state)(struct anv_cmd_buffer *cmd_buffer) { @@ -2796,6 +2815,11 @@ genX(cmd_buffer_flush_state)(struct anv_cmd_buffer *cmd_buffer)
[Mesa-dev] [PATCH v4 0/9] Implementation of the VK_EXT_sample_locations
Implemented the requirements from the VK_EXT_sample_locations extension
specification to allow setting custom sample locations on Intel Gen >= 7.

Some decisions explained:

The grid size was set to 1x1 because the hardware only supports a single
set of sample locations for the whole framebuffer.

The user can set custom sample locations either per pipeline, by filling
the extension provided structs, or dynamically the way it is described in
sections 26.5, 36.1, 36.2 of the Vulkan specification.

Sections 6.7.3 and 7.4 describe how to use sample locations with images
when a layout transition is about to take place. These sections were
ignored as currently we aren't using sample locations with images in the
driver.

Variable sample locations aren't required and have not been implemented.

(v2): Initially, we were sorting the samples because according to the
Skylake PRM (vol 2a SAMPLE_PATTERN) the samples should be sent in a
monotonically increasing distance from the center to get the correct
centroid computation in the device. However the Vulkan spec seems to
require that the samples occur in the order provided through the API.
As long as this requirement only affects centroid calculations we should
be ok without the ordering because OpenGL and Vulkan only require the
centroid to be some lit sample and that it's the same for all samples in
a pixel. They have no requirement that it be the one closest to the
center. (Jason Ekstrand)

We have 754 vk-gl-cts tests for this extension: 690 of the tests pass on
Gen >= 9 (where we can support 16 samples). The remaining 64 tests aren't
supported because they test the variable sample locations.

Eleni Maria Stea (9):
  anv: Added the VK_EXT_sample_locations extension to the anv_extensions list
  anv: Set the values for the VkPhysicalDeviceSampleLocationsPropertiesEXT
  anv: Implemented the vkGetPhysicalDeviceMultisamplePropertiesEXT
  anv: Added support for non-dynamic sample locations on Gen8+
  anv: Added support for dynamic sample locations on Gen8+
  anv: Added support for dynamic and non-dynamic sample locations on Gen7
  anv: Optimized the emission of the default locations on Gen8+
  anv: Removed unused header file
  anv: Enabled the VK_EXT_sample_locations extension

 src/intel/Makefile.sources              |   1 +
 src/intel/common/gen_sample_positions.h |  53 ++
 src/intel/vulkan/anv_cmd_buffer.c       |  19
 src/intel/vulkan/anv_device.c           |  21
 src/intel/vulkan/anv_extensions.py      |   1 +
 src/intel/vulkan/anv_genX.h             |   7 ++
 src/intel/vulkan/anv_private.h          |  18
 src/intel/vulkan/anv_sample_locations.c |  96 ++
 src/intel/vulkan/anv_sample_locations.h |  29 ++
 src/intel/vulkan/genX_blorp_exec.c      |   1 -
 src/intel/vulkan/genX_cmd_buffer.c      |  24 +
 src/intel/vulkan/genX_pipeline.c        |  92 +
 src/intel/vulkan/genX_state.c           | 128
 src/intel/vulkan/meson.build            |   1 +
 14 files changed, 450 insertions(+), 41 deletions(-)
 create mode 100644 src/intel/vulkan/anv_sample_locations.c
 create mode 100644 src/intel/vulkan/anv_sample_locations.h

-- 
2.20.1
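To make the ordering question from the (v2) note concrete, the PRM-style constraint can be expressed as a check on squared distance from the pixel center (0.5, 0.5). This is only an illustrative sketch with hypothetical names; it treats equal distances as acceptable, which is an assumption and not something the cover letter or the PRM wording quoted here settles.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for VkSampleLocationEXT. */
struct sketch_location {
   float x;
   float y;
};

/* Squared distance from the pixel center; enough for ordering, so no
 * square root is needed. */
static float
sketch_dist2_from_center(struct sketch_location loc)
{
   float dx = loc.x - 0.5f;
   float dy = loc.y - 0.5f;

   return dx * dx + dy * dy;
}

/* True when every sample is at least as far from the center as the one
 * before it, i.e. the array is already in nearest-to-furthest order. */
static bool
sketch_is_center_ordered(const struct sketch_location *locs, uint32_t count)
{
   for (uint32_t i = 1; i < count; i++) {
      if (sketch_dist2_from_center(locs[i]) <
          sketch_dist2_from_center(locs[i - 1]))
         return false;
   }

   return true;
}
```

An application is free to pass locations for which this check fails; per the reasoning above, the series keeps the API order in that case instead of sorting.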
[Mesa-dev] [PATCH v4 2/9] anv: Set the values for the VkPhysicalDeviceSampleLocationsPropertiesEXT
The VkPhysicalDeviceSampleLocationsPropertiesEXT struct is filled with
implementation-dependent values, following the table from the Vulkan
Specification section [36.1. Limit Requirements]:

pname | max | min
pname:sampleLocationSampleCounts |-|ename:VK_SAMPLE_COUNT_4_BIT
pname:maxSampleLocationGridSize|-|(1, 1)
pname:sampleLocationCoordinateRange|(0.0, 0.9375)|(0.0, 0.9375)
pname:sampleLocationSubPixelBits |-|4
pname:variableSampleLocations | false |implementation dependent

The hardware only supports setting the same sample location for all the
pixels, so we only support 1x1 grids.

Also, variableSampleLocations is set to false because we don't support the
feature.

v2: 1- Replaced false with VK_FALSE for consistency. (Sagar Ghuge)
    2- Used isl_device_get_sample_counts to take the number of samples
       per platform and avoid extra checks. (Sagar Ghuge)

v3: 1- Replaced VK_FALSE with false, as Jason has sent a patch to replace
       VK_FALSE with false in other places. (Jason Ekstrand)
    2- Removed unnecessary defines and set the grid size to 1.
       (Jason Ekstrand)

Reviewed-by: Sagar Ghuge
---
 src/intel/vulkan/anv_device.c | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
index 83fa3936c19..52ea058bdd5 100644
--- a/src/intel/vulkan/anv_device.c
+++ b/src/intel/vulkan/anv_device.c
@@ -1401,6 +1401,26 @@ void anv_GetPhysicalDeviceProperties2(
          break;
       }

+      case VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_SAMPLE_LOCATIONS_PROPERTIES_EXT: {
+         VkPhysicalDeviceSampleLocationsPropertiesEXT *props =
+            (VkPhysicalDeviceSampleLocationsPropertiesEXT *)ext;
+
+         props->sampleLocationSampleCounts =
+            isl_device_get_sample_counts(&pdevice->isl_dev);
+
+         /* See also anv_GetPhysicalDeviceMultisamplePropertiesEXT */
+         props->maxSampleLocationGridSize.width = 1;
+         props->maxSampleLocationGridSize.height = 1;
+
+         props->sampleLocationCoordinateRange[0] = 0;
+         props->sampleLocationCoordinateRange[1] = 0.9375;
+         props->sampleLocationSubPixelBits = 4;
+
+         props->variableSampleLocations = false;
+
+         break;
+      }
+
       default:
          anv_debug_ignored_stype(ext->sType);
          break;
--
2.20.1

_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
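For illustration only: the limits reported by this patch imply a 16-step grid inside each pixel. With 4 sub-pixel bits, coordinates are multiples of 1/16, and the largest representable value is 15/16 = 0.9375. The sketch below uses hypothetical helpers (quantize_sample_coord, dequantize_sample_coord) that are not part of the patch, and it assumes truncation toward zero; the hardware may round differently.

```c
#include <assert.h>
#include <stdint.h>

#define SUBPIXEL_BITS 4   /* sampleLocationSubPixelBits reported by the patch */

/* Hypothetical helper, not from the patch: clamp a coordinate to the
 * advertised range [0, 0.9375] and snap it down to the 1/16 grid.
 * Truncation is an assumption made for this sketch. */
static uint32_t
quantize_sample_coord(float v)
{
   const float max = 0.9375f;   /* sampleLocationCoordinateRange[1] */

   if (v < 0.0f)
      v = 0.0f;
   if (v > max)
      v = max;

   return (uint32_t)(v * (float)(1u << SUBPIXEL_BITS));
}

/* Inverse mapping: grid step back to a coordinate in [0, 0.9375]. */
static float
dequantize_sample_coord(uint32_t q)
{
   assert(q < (1u << SUBPIXEL_BITS));
   return (float)q / (float)(1u << SUBPIXEL_BITS);
}
```

An application would normally query sampleLocationCoordinateRange and sampleLocationSubPixelBits at run time instead of hard-coding them as done here.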
[Mesa-dev] [PATCH v4 1/9] anv: Added the VK_EXT_sample_locations extension to the anv_extensions list
Added the VK_EXT_sample_locations to the anv_extensions.py list to
generate the related entrypoints.

Reviewed-by: Sagar Ghuge
---
 src/intel/vulkan/anv_extensions.py | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/intel/vulkan/anv_extensions.py b/src/intel/vulkan/anv_extensions.py
index 6fff293dee4..9e4e03e46df 100644
--- a/src/intel/vulkan/anv_extensions.py
+++ b/src/intel/vulkan/anv_extensions.py
@@ -129,6 +129,7 @@ EXTENSIONS = [
     Extension('VK_EXT_inline_uniform_block', 1, True),
     Extension('VK_EXT_pci_bus_info', 2, True),
     Extension('VK_EXT_post_depth_coverage', 1, 'device->info.gen >= 9'),
+    Extension('VK_EXT_sample_locations', 1, False),
     Extension('VK_EXT_sampler_filter_minmax', 1, 'device->info.gen >= 9'),
     Extension('VK_EXT_scalar_block_layout', 1, True),
     Extension('VK_EXT_shader_stencil_export', 1, 'device->info.gen >= 9'),
--
2.20.1
Re: [Mesa-dev] [PATCH 4/9] anv: Added support for non-dynamic sample locations on Gen8+
On Wed, 13 Mar 2019 08:16:10 -0500 Jason Ekstrand wrote: > On Mon, Mar 11, 2019 at 10:05 AM Eleni Maria Stea > wrote: > > > Allowing the user to set custom sample locations non-dynamically, by > > filling the extension structs and chaining them to the pipeline > > structs according to the Vulkan specification section [26.5. Custom > > Sample Locations] [...] > > +void > > +anv_calc_sample_locations(struct anv_sample *samples, > > + uint32_t num_samples, > > + const VkSampleLocationsInfoEXT *info) > > +{ > > + int i; > > + > > + for(i = 0; i < num_samples; i++) { > > + float dx, dy; > > + > > + /* this is because the grid is 1x1, in case that > > + * we support different grid sizes in the future > > + * this must be changed. > > + */ > > + samples[i].offs_x = info->pSampleLocations[i].x; > > + samples[i].offs_y = info->pSampleLocations[i].y; > > + > > + /* distance from the center */ > > + dx = samples[i].offs_x - 0.5; > > + dy = samples[i].offs_y - 0.5; > > + > > + samples[i].radius = dx * dx + dy * dy; > > + } > > + > > + qsort(samples, num_samples, sizeof *samples, compare_samples); > > > > Are we allowed to re-order the samples like this? The spec says: > > The sample location for sample i at the pixel grid location (x,y) is > taken from pSampleLocations[(x + y * sampleLocationGridSize.width) * > sampleLocationsPerPixel + i] > > Which leads me to think that they expect the ordering of samples to be > respected. Yes, I know the HW docs say we're supposed to order them > from nearest to furthest. However, AFAIK, that's only so we get nice > centroids and I don't know that it's actually required. > > --Jason I wasn't sure about this to be honest. I could remove the qsort and explain why we decided to ignore the PRM in a comment for the case that someone decides to put this back in the future. Thanks a lot for reviewing the series, BTW. I am working on the changes for all patches. 
Eleni
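To make the reordering question above concrete, here is a small sketch of the sort step next to the indexing rule from the spec text Jason quotes. The struct mirrors anv_sample from the series; compare_samples is not shown in the quoted hunk, so the comparator here (compare_by_radius) and the helper fill_and_sort are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Mirrors struct anv_sample from the series. */
struct sample {
   float offs_x;
   float offs_y;
   float radius;   /* squared distance from the pixel center */
};

/* Assumed comparator: nearest to the pixel center first. */
static int
compare_by_radius(const void *a, const void *b)
{
   const struct sample *sa = a, *sb = b;
   return (sa->radius > sb->radius) - (sa->radius < sb->radius);
}

/* Index of sample i at grid cell (x, y), as in the quoted spec text. */
static uint32_t
sample_location_index(uint32_t x, uint32_t y, uint32_t grid_width,
                      uint32_t samples_per_pixel, uint32_t i)
{
   return (x + y * grid_width) * samples_per_pixel + i;
}

/* Same steps as the quoted anv_calc_sample_locations for a 1x1 grid:
 * copy the positions, compute the squared radius, then sort. */
static void
fill_and_sort(struct sample *s, const float (*xy)[2], uint32_t n)
{
   assert(s && xy);

   for (uint32_t i = 0; i < n; i++) {
      float dx = xy[i][0] - 0.5f;
      float dy = xy[i][1] - 0.5f;
      s[i] = (struct sample){ xy[i][0], xy[i][1], dx * dx + dy * dy };
   }

   qsort(s, n, sizeof *s, compare_by_radius);
}
```

With a 1x1 grid the spec index reduces to i, so once the array is sorted, the location the application supplied for sample 0 may end up in a different slot. That is the mismatch the review comment is worried about.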
[Mesa-dev] [PATCH v3 9/9] anv: Enabled the VK_EXT_sample_locations extension
Enabled the VK_EXT_sample_locations for Intel Gen >= 7.

v2: Replaced device.info->gen >= 7 with True, as Anv doesn't support
    anything below Gen7. (Lionel Landwerlin)

Reviewed-by: Sagar Ghuge
---
 src/intel/vulkan/anv_extensions.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/intel/vulkan/anv_extensions.py b/src/intel/vulkan/anv_extensions.py
index 9e4e03e46df..5a30c733c5c 100644
--- a/src/intel/vulkan/anv_extensions.py
+++ b/src/intel/vulkan/anv_extensions.py
@@ -129,7 +129,7 @@ EXTENSIONS = [
     Extension('VK_EXT_inline_uniform_block', 1, True),
     Extension('VK_EXT_pci_bus_info', 2, True),
     Extension('VK_EXT_post_depth_coverage', 1, 'device->info.gen >= 9'),
-    Extension('VK_EXT_sample_locations', 1, False),
+    Extension('VK_EXT_sample_locations', 1, True),
     Extension('VK_EXT_sampler_filter_minmax', 1, 'device->info.gen >= 9'),
     Extension('VK_EXT_scalar_block_layout', 1, True),
     Extension('VK_EXT_shader_stencil_export', 1, 'device->info.gen >= 9'),
--
2.20.1
[Mesa-dev] [PATCH v3 8/9] anv: Removed unused header file
In src/intel/vulkan/genX_blorp_exec.c we included the file
common/gen_sample_positions.h but did not use it. Removed.

Reviewed-by: Sagar Ghuge
---
 src/intel/vulkan/genX_blorp_exec.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/src/intel/vulkan/genX_blorp_exec.c b/src/intel/vulkan/genX_blorp_exec.c
index e9c85d56d5f..0eeefaaa9d6 100644
--- a/src/intel/vulkan/genX_blorp_exec.c
+++ b/src/intel/vulkan/genX_blorp_exec.c
@@ -31,7 +31,6 @@
 #undef __gen_combine_address

 #include "common/gen_l3_config.h"
-#include "common/gen_sample_positions.h"
 #include "blorp/blorp_genX_exec.h"

 static void *
--
2.20.1
[Mesa-dev] [PATCH v3 7/9] anv: Optimized the emission of the default locations on Gen8+
We only emit sample locations when the extension is enabled by the user. In all other cases the default locations are emitted once when the device is initialized to increase performance. --- src/intel/vulkan/anv_genX.h| 3 ++- src/intel/vulkan/genX_cmd_buffer.c | 2 +- src/intel/vulkan/genX_pipeline.c | 11 +++ src/intel/vulkan/genX_state.c | 8 +--- 4 files changed, 15 insertions(+), 9 deletions(-) diff --git a/src/intel/vulkan/anv_genX.h b/src/intel/vulkan/anv_genX.h index e82d83465ef..7f33a2b0a68 100644 --- a/src/intel/vulkan/anv_genX.h +++ b/src/intel/vulkan/anv_genX.h @@ -93,4 +93,5 @@ void genX(emit_ms_state)(struct anv_batch *batch, struct anv_sample *anv_samples, uint32_t num_samples, uint32_t log2_samples, - bool custom_sample_locations); + bool custom_sample_locations, + bool sample_locations_ext_enabled); diff --git a/src/intel/vulkan/genX_cmd_buffer.c b/src/intel/vulkan/genX_cmd_buffer.c index 4752c66f350..ae7c5a80a3c 100644 --- a/src/intel/vulkan/genX_cmd_buffer.c +++ b/src/intel/vulkan/genX_cmd_buffer.c @@ -2654,7 +2654,7 @@ cmd_buffer_emit_sample_locations(struct anv_cmd_buffer *cmd_buffer) anv_samples = cmd_buffer->state.gfx.dynamic.sample_locations.anv_samples; genX(emit_ms_state)(_buffer->batch, anv_samples, samples, - log2_samples, true); + log2_samples, true, true); } void diff --git a/src/intel/vulkan/genX_pipeline.c b/src/intel/vulkan/genX_pipeline.c index 8afc08f0320..12adfa65da8 100644 --- a/src/intel/vulkan/genX_pipeline.c +++ b/src/intel/vulkan/genX_pipeline.c @@ -573,10 +573,12 @@ emit_sample_mask(struct anv_pipeline *pipeline, } static void -emit_ms_state(struct anv_pipeline *pipeline, +emit_ms_state(struct anv_device *device, + struct anv_pipeline *pipeline, const VkPipelineMultisampleStateCreateInfo *info, const VkPipelineDynamicStateCreateInfo *dinfo) { + bool sample_loc_enabled = device->enabled_extensions.EXT_sample_locations; struct anv_sample anv_samples[MAX_SAMPLE_LOCATIONS]; VkSampleLocationsInfoEXT *sl; bool custom_locations = 
false; @@ -588,7 +590,7 @@ emit_ms_state(struct anv_pipeline *pipeline, if (info) { samples = info->rasterizationSamples; - if (info->pNext) { + if (sample_loc_enabled && info->pNext) { VkPipelineSampleLocationsStateCreateInfoEXT *slinfo = (VkPipelineSampleLocationsStateCreateInfoEXT *)info->pNext; @@ -617,7 +619,7 @@ emit_ms_state(struct anv_pipeline *pipeline, } genX(emit_ms_state)(>batch, anv_samples, samples, log2_samples, - custom_locations); + custom_locations, sample_loc_enabled); } static const uint32_t vk_to_gen_logic_op[] = { @@ -1947,7 +1949,8 @@ genX(graphics_pipeline_create)( assert(pCreateInfo->pRasterizationState); emit_rs_state(pipeline, pCreateInfo->pRasterizationState, pCreateInfo->pMultisampleState, pass, subpass); - emit_ms_state(pipeline, pCreateInfo->pMultisampleState, pCreateInfo->pDynamicState); + emit_ms_state(device, pipeline, pCreateInfo->pMultisampleState, + pCreateInfo->pDynamicState); emit_ds_state(pipeline, pCreateInfo->pDepthStencilState, pass, subpass); emit_cb_state(pipeline, pCreateInfo->pColorBlendState, pCreateInfo->pMultisampleState); diff --git a/src/intel/vulkan/genX_state.c b/src/intel/vulkan/genX_state.c index 804cfab3a56..bc6b5870d8d 100644 --- a/src/intel/vulkan/genX_state.c +++ b/src/intel/vulkan/genX_state.c @@ -552,12 +552,14 @@ genX(emit_ms_state)(struct anv_batch *batch, struct anv_sample *anv_samples, uint32_t num_samples, uint32_t log2_samples, - bool custom_sample_locations) + bool custom_sample_locations, + bool sample_locations_ext_enabled) { emit_multisample(batch, anv_samples, num_samples, log2_samples, custom_sample_locations); #if GEN_GEN >= 8 - emit_sample_locations(batch, anv_samples, num_samples, - custom_sample_locations); + if (sample_locations_ext_enabled) + emit_sample_locations(batch, anv_samples, num_samples, +custom_sample_locations); #endif } -- 2.20.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH v3 3/9] anv: Implemented the vkGetPhysicalDeviceMultisamplePropertiesEXT
Implemented the vkGetPhysicalDeviceMultisamplePropertiesEXT according to the Vulkan Specification section [36.2. Additional Multisampling Capabilities]. --- src/intel/Makefile.sources | 1 + src/intel/vulkan/anv_sample_locations.c | 60 + src/intel/vulkan/meson.build| 1 + 3 files changed, 62 insertions(+) create mode 100644 src/intel/vulkan/anv_sample_locations.c diff --git a/src/intel/Makefile.sources b/src/intel/Makefile.sources index a5c8828a6b6..a0873c7ccc2 100644 --- a/src/intel/Makefile.sources +++ b/src/intel/Makefile.sources @@ -251,6 +251,7 @@ VULKAN_FILES := \ vulkan/anv_pipeline_cache.c \ vulkan/anv_private.h \ vulkan/anv_queue.c \ + vulkan/anv_sample_locations.c \ vulkan/anv_util.c \ vulkan/anv_wsi.c \ vulkan/vk_format_info.h diff --git a/src/intel/vulkan/anv_sample_locations.c b/src/intel/vulkan/anv_sample_locations.c new file mode 100644 index 000..1ebf280e05b --- /dev/null +++ b/src/intel/vulkan/anv_sample_locations.c @@ -0,0 +1,60 @@ +/* + * Copyright © 2019 Intel Corporation + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the "Software"), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice (including the next + * paragraph) shall be included in all copies or substantial portions of the + * Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS + * IN THE SOFTWARE. + */ + +#include "anv_private.h" + +void +anv_GetPhysicalDeviceMultisamplePropertiesEXT(VkPhysicalDevice physicalDevice, + VkSampleCountFlagBits samples, + VkMultisamplePropertiesEXT + *pMultisampleProperties) +{ + ANV_FROM_HANDLE(anv_physical_device, physical_device, physicalDevice); + const struct gen_device_info *devinfo = _device->info; + + VkExtent2D grid_size; + switch (samples) { + case VK_SAMPLE_COUNT_2_BIT: + case VK_SAMPLE_COUNT_4_BIT: + case VK_SAMPLE_COUNT_8_BIT: + grid_size.width = SAMPLE_LOC_GRID_W; + grid_size.height = SAMPLE_LOC_GRID_H; + break; + + case VK_SAMPLE_COUNT_16_BIT: + if (devinfo->gen >= 9) { + grid_size.width = SAMPLE_LOC_GRID_W; + grid_size.height = SAMPLE_LOC_GRID_H; + break; + } + default: + grid_size.width = grid_size.height = 0; + break; + }; + + *pMultisampleProperties = (VkMultisamplePropertiesEXT) { + .sType = VK_STRUCTURE_TYPE_MULTISAMPLE_PROPERTIES_EXT, + .pNext = NULL, + .maxSampleLocationGridSize = grid_size + }; +} diff --git a/src/intel/vulkan/meson.build b/src/intel/vulkan/meson.build index 7fa43a6ad79..3f78757c774 100644 --- a/src/intel/vulkan/meson.build +++ b/src/intel/vulkan/meson.build @@ -135,6 +135,7 @@ libanv_files = files( 'anv_pipeline_cache.c', 'anv_private.h', 'anv_queue.c', + 'anv_sample_locations.c', 'anv_util.c', 'anv_wsi.c', 'vk_format_info.h', -- 2.20.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
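The switch in this patch relies on the 16x case falling through to default on hardware older than Gen9. A standalone model of that decision is sketched below, with the fallthrough made explicit. The names (extent2d, max_sample_location_grid) are hypothetical, sample counts are plain integers instead of VkSampleCountFlagBits, and gen stands in for devinfo->gen.

```c
#include <assert.h>
#include <stdint.h>

struct extent2d {
   uint32_t width;
   uint32_t height;
};

/* Standalone model of the grid-size switch; not the driver entrypoint. */
static struct extent2d
max_sample_location_grid(uint32_t samples, int gen)
{
   assert(gen >= 7);   /* the series targets Gen >= 7 only */

   const struct extent2d one = { 1, 1 };    /* SAMPLE_LOC_GRID_W x _H */
   const struct extent2d none = { 0, 0 };   /* sample count unsupported */

   switch (samples) {
   case 2:
   case 4:
   case 8:
      return one;
   case 16:
      /* 16x is only available on Gen9+, otherwise treat as unsupported. */
      return gen >= 9 ? one : none;
   default:
      return none;
   }
}
```

Note that a sample count of 1 lands in the default branch here, which matches what the switch in the patch does for VK_SAMPLE_COUNT_1_BIT.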
[Mesa-dev] [PATCH v3 1/9] anv: Added the VK_EXT_sample_locations extension to the anv_extensions list
Added the VK_EXT_sample_locations to the anv_extensions.py list to
generate the related entrypoints.

Reviewed-by: Sagar Ghuge
---
 src/intel/vulkan/anv_extensions.py | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/intel/vulkan/anv_extensions.py b/src/intel/vulkan/anv_extensions.py
index 6fff293dee4..9e4e03e46df 100644
--- a/src/intel/vulkan/anv_extensions.py
+++ b/src/intel/vulkan/anv_extensions.py
@@ -129,6 +129,7 @@ EXTENSIONS = [
     Extension('VK_EXT_inline_uniform_block', 1, True),
     Extension('VK_EXT_pci_bus_info', 2, True),
     Extension('VK_EXT_post_depth_coverage', 1, 'device->info.gen >= 9'),
+    Extension('VK_EXT_sample_locations', 1, False),
     Extension('VK_EXT_sampler_filter_minmax', 1, 'device->info.gen >= 9'),
     Extension('VK_EXT_scalar_block_layout', 1, True),
     Extension('VK_EXT_shader_stencil_export', 1, 'device->info.gen >= 9'),
--
2.20.1
[Mesa-dev] [PATCH v3 6/9] anv: Added support for dynamic and non-dynamic sample locations on Gen7
Allowing setting dynamic and non-dynamic sample locations on Gen7. --- src/intel/vulkan/anv_genX.h| 13 ++--- src/intel/vulkan/genX_cmd_buffer.c | 9 ++-- src/intel/vulkan/genX_pipeline.c | 13 + src/intel/vulkan/genX_state.c | 86 +- 4 files changed, 70 insertions(+), 51 deletions(-) diff --git a/src/intel/vulkan/anv_genX.h b/src/intel/vulkan/anv_genX.h index f84fe457152..e82d83465ef 100644 --- a/src/intel/vulkan/anv_genX.h +++ b/src/intel/vulkan/anv_genX.h @@ -89,11 +89,8 @@ void genX(cmd_buffer_mi_memset)(struct anv_cmd_buffer *cmd_buffer, void genX(blorp_exec)(struct blorp_batch *batch, const struct blorp_params *params); -void genX(emit_multisample)(struct anv_batch *batch, -uint32_t samples, -uint32_t log2_samples); - -void genX(emit_sample_locations)(struct anv_batch *batch, - const struct anv_sample *anv_samples, - uint32_t num_samples, - bool custom_locations); +void genX(emit_ms_state)(struct anv_batch *batch, + struct anv_sample *anv_samples, + uint32_t num_samples, + uint32_t log2_samples, + bool custom_sample_locations); diff --git a/src/intel/vulkan/genX_cmd_buffer.c b/src/intel/vulkan/genX_cmd_buffer.c index 9229df84caa..4752c66f350 100644 --- a/src/intel/vulkan/genX_cmd_buffer.c +++ b/src/intel/vulkan/genX_cmd_buffer.c @@ -2643,8 +2643,7 @@ cmd_buffer_flush_push_constants(struct anv_cmd_buffer *cmd_buffer, static void cmd_buffer_emit_sample_locations(struct anv_cmd_buffer *cmd_buffer) { -#if GEN_GEN >= 8 - const struct anv_sample *anv_samples; + struct anv_sample *anv_samples; uint32_t log2_samples; uint32_t samples; @@ -2654,10 +2653,8 @@ cmd_buffer_emit_sample_locations(struct anv_cmd_buffer *cmd_buffer) log2_samples = __builtin_ffs(samples) - 1; anv_samples = cmd_buffer->state.gfx.dynamic.sample_locations.anv_samples; - genX(emit_multisample)(_buffer->batch, samples, log2_samples); - genX(emit_sample_locations)(_buffer->batch, anv_samples, samples, - true); -#endif + genX(emit_ms_state)(_buffer->batch, anv_samples, samples, + log2_samples, true); } 
void diff --git a/src/intel/vulkan/genX_pipeline.c b/src/intel/vulkan/genX_pipeline.c index fa42e622077..8afc08f0320 100644 --- a/src/intel/vulkan/genX_pipeline.c +++ b/src/intel/vulkan/genX_pipeline.c @@ -577,12 +577,9 @@ emit_ms_state(struct anv_pipeline *pipeline, const VkPipelineMultisampleStateCreateInfo *info, const VkPipelineDynamicStateCreateInfo *dinfo) { -#if GEN_GEN >= 8 struct anv_sample anv_samples[MAX_SAMPLE_LOCATIONS]; VkSampleLocationsInfoEXT *sl; bool custom_locations = false; -#endif - uint32_t samples = 1; uint32_t log2_samples = 0; @@ -591,7 +588,6 @@ emit_ms_state(struct anv_pipeline *pipeline, if (info) { samples = info->rasterizationSamples; -#if GEN_GEN >= 8 if (info->pNext) { VkPipelineSampleLocationsStateCreateInfoEXT *slinfo = (VkPipelineSampleLocationsStateCreateInfoEXT *)info->pNext; @@ -616,17 +612,12 @@ emit_ms_state(struct anv_pipeline *pipeline, } } } -#endif log2_samples = __builtin_ffs(samples) - 1; } - genX(emit_multisample(>batch, samples, log2_samples)); - -#if GEN_GEN >= 8 - genX(emit_sample_locations)(>batch, anv_samples, samples, - custom_locations); -#endif + genX(emit_ms_state)(>batch, anv_samples, samples, log2_samples, + custom_locations); } static const uint32_t vk_to_gen_logic_op[] = { diff --git a/src/intel/vulkan/genX_state.c b/src/intel/vulkan/genX_state.c index 44cfc925ed5..804cfab3a56 100644 --- a/src/intel/vulkan/genX_state.c +++ b/src/intel/vulkan/genX_state.c @@ -437,10 +437,12 @@ VkResult genX(CreateSampler)( return VK_SUCCESS; } -void -genX(emit_multisample)(struct anv_batch *batch, - uint32_t samples, - uint32_t log2_samples) +static void +emit_multisample(struct anv_batch *batch, + const struct anv_sample *anv_samples, + uint32_t samples, + uint32_t log2_samples, + bool custom_locations) { anv_batch_emit(batch, GENX(3DSTATE_MULTISAMPLE), ms) { ms.NumberofMultisamples = log2_samples; @@ -453,34 +455,52 @@ genX(emit_multisample)(struct anv_batch *batch, */ ms.PixelPositionOffsetEnable = false; #else - switch 
(samples) { - case 1: - GEN_SAMPLE_POS_1X(ms.Sample); - break; - case 2: - GEN_SAMPLE_POS_2X(ms.Sample); - break; - case 4: - GEN_SAMPLE_POS_4X(ms.Sample); - break; -
[Mesa-dev] [PATCH v3 5/9] anv: Added support for dynamic sample locations on Gen8+
Added support for setting the locations when the pipeline has been created with the dynamic state bit enabled according to the Vulkan Specification section [26.5. Custom Sample Locations] for the function: 'vkCmdSetSampleLocationsEXT' The reason that we preferred to store the boolean valid inside the dynamic state struct for locations instead of using a dirty bit (ANV_CMD_DIRTY_SAMPLE_LOCATIONS for example) is that other functions can modify the value of the dirty bits causing unexpected behavior. --- src/intel/vulkan/anv_cmd_buffer.c | 19 src/intel/vulkan/anv_genX.h| 6 +++- src/intel/vulkan/anv_private.h | 6 src/intel/vulkan/genX_cmd_buffer.c | 27 ++ src/intel/vulkan/genX_pipeline.c | 46 -- src/intel/vulkan/genX_state.c | 41 +++--- 6 files changed, 99 insertions(+), 46 deletions(-) diff --git a/src/intel/vulkan/anv_cmd_buffer.c b/src/intel/vulkan/anv_cmd_buffer.c index 1b34644a434..101c1375430 100644 --- a/src/intel/vulkan/anv_cmd_buffer.c +++ b/src/intel/vulkan/anv_cmd_buffer.c @@ -28,6 +28,7 @@ #include #include "anv_private.h" +#include "anv_sample_locations.h" #include "vk_format_info.h" #include "vk_util.h" @@ -558,6 +559,24 @@ void anv_CmdSetStencilReference( cmd_buffer->state.gfx.dirty |= ANV_CMD_DIRTY_DYNAMIC_STENCIL_REFERENCE; } +void +anv_CmdSetSampleLocationsEXT(VkCommandBuffer commandBuffer, + const VkSampleLocationsInfoEXT *pSampleLocationsInfo) +{ + ANV_FROM_HANDLE(anv_cmd_buffer, cmd_buffer, commandBuffer); + assert(pSampleLocationsInfo); + + struct anv_dynamic_state *dyn_state = _buffer->state.gfx.dynamic; + dyn_state->sample_locations.num_samples = + pSampleLocationsInfo->sampleLocationsPerPixel; + + anv_calc_sample_locations(dyn_state->sample_locations.anv_samples, + dyn_state->sample_locations.num_samples, + pSampleLocationsInfo); + + cmd_buffer->state.gfx.dynamic.sample_locations.valid = true; +} + static void anv_cmd_buffer_bind_descriptor_set(struct anv_cmd_buffer *cmd_buffer, VkPipelineBindPoint bind_point, diff --git 
a/src/intel/vulkan/anv_genX.h b/src/intel/vulkan/anv_genX.h index 52415c04a45..f84fe457152 100644 --- a/src/intel/vulkan/anv_genX.h +++ b/src/intel/vulkan/anv_genX.h @@ -89,7 +89,11 @@ void genX(cmd_buffer_mi_memset)(struct anv_cmd_buffer *cmd_buffer, void genX(blorp_exec)(struct blorp_batch *batch, const struct blorp_params *params); +void genX(emit_multisample)(struct anv_batch *batch, +uint32_t samples, +uint32_t log2_samples); + void genX(emit_sample_locations)(struct anv_batch *batch, + const struct anv_sample *anv_samples, uint32_t num_samples, - const VkSampleLocationsInfoEXT *sl, bool custom_locations); diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h index 981956e5706..a2e1756cd99 100644 --- a/src/intel/vulkan/anv_private.h +++ b/src/intel/vulkan/anv_private.h @@ -2135,6 +2135,12 @@ struct anv_dynamic_state { uint32_t front; uint32_t back; } stencil_reference; + + struct { + struct anv_sample anv_samples[MAX_SAMPLE_LOCATIONS]; + uint32_t num_samples; + bool valid; + } sample_locations; }; extern const struct anv_dynamic_state default_dynamic_state; diff --git a/src/intel/vulkan/genX_cmd_buffer.c b/src/intel/vulkan/genX_cmd_buffer.c index 7687507e6b7..9229df84caa 100644 --- a/src/intel/vulkan/genX_cmd_buffer.c +++ b/src/intel/vulkan/genX_cmd_buffer.c @@ -25,11 +25,13 @@ #include #include "anv_private.h" +#include "anv_sample_locations.h" #include "vk_format_info.h" #include "vk_util.h" #include "util/fast_idiv_by_const.h" #include "common/gen_l3_config.h" +#include "common/gen_sample_positions.h" #include "genxml/gen_macros.h" #include "genxml/genX_pack.h" @@ -2638,6 +2640,26 @@ cmd_buffer_flush_push_constants(struct anv_cmd_buffer *cmd_buffer, cmd_buffer->state.push_constants_dirty &= ~flushed; } +static void +cmd_buffer_emit_sample_locations(struct anv_cmd_buffer *cmd_buffer) +{ +#if GEN_GEN >= 8 + const struct anv_sample *anv_samples; + uint32_t log2_samples; + uint32_t samples; + + samples = 
cmd_buffer->state.gfx.dynamic.sample_locations.num_samples; + assert(samples > 0); + + log2_samples = __builtin_ffs(samples) - 1; + anv_samples = cmd_buffer->state.gfx.dynamic.sample_locations.anv_samples; + + genX(emit_multisample)(_buffer->batch, samples, log2_samples); + genX(emit_sample_locations)(_buffer->batch,
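The flow described in this commit message can be modelled in a few lines: the set command records the state and raises a valid flag, and the flush path emits only when the flag is set. This is a sketch under assumptions, not the driver code. The struct and function names are hypothetical, and clearing the flag after emission is assumed here because the truncated hunk does not show it. The log2 computation uses the same __builtin_ffs idiom as the patch (a GCC/Clang builtin), which only gives log2 for power-of-two sample counts.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Reduced model of the sample_locations member added to anv_dynamic_state. */
struct dyn_sample_locations {
   uint32_t num_samples;
   bool valid;
};

/* Models vkCmdSetSampleLocationsEXT: record the state, mark it pending. */
static void
set_sample_locations(struct dyn_sample_locations *sl, uint32_t num_samples)
{
   /* Sample counts are powers of two, so ffs() - 1 below equals log2. */
   assert(num_samples > 0 && (num_samples & (num_samples - 1)) == 0);

   sl->num_samples = num_samples;
   sl->valid = true;
}

/* Models the flush path. Returns log2(samples) when state was emitted,
 * or -1 when nothing was pending. */
static int
flush_sample_locations(struct dyn_sample_locations *sl)
{
   if (!sl->valid)
      return -1;

   sl->valid = false;   /* assumed: consume the pending state once */
   return __builtin_ffs((int)sl->num_samples) - 1;
}
```

Because the flag lives next to the data it guards, no other command that touches the shared dirty mask can set or clear it by accident, which is the motivation given in the commit message.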
[Mesa-dev] [PATCH v3 2/9] anv: Set the values for the VkPhysicalDeviceSampleLocationsPropertiesEXT
The VkPhysicalDeviceSampleLocationsPropertiesEXT struct is filled with
implementation-dependent values, following the table from the Vulkan
Specification section [36.1. Limit Requirements]:

pname | max | min
pname:sampleLocationSampleCounts |-|ename:VK_SAMPLE_COUNT_4_BIT
pname:maxSampleLocationGridSize|-|(1, 1)
pname:sampleLocationCoordinateRange|(0.0, 0.9375)|(0.0, 0.9375)
pname:sampleLocationSubPixelBits |-|4
pname:variableSampleLocations | false |implementation dependent

The hardware only supports setting the same sample location for all the
pixels, so we only support 1x1 grids.

Also, variableSampleLocations is set to false because we don't support the
feature.

v2: 1- Replaced false with VK_FALSE for consistency. (Sagar Ghuge)
    2- Used isl_device_get_sample_counts to take the number of samples
       per platform and avoid extra checks. (Sagar Ghuge)

Reviewed-by: Sagar Ghuge
---
 src/intel/vulkan/anv_device.c  | 19 +++++++++++++++++++
 src/intel/vulkan/anv_private.h |  3 +++
 2 files changed, 22 insertions(+)

diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
index 729cceb3e32..bf6f03ebb1a 100644
--- a/src/intel/vulkan/anv_device.c
+++ b/src/intel/vulkan/anv_device.c
@@ -1401,6 +1401,25 @@ void anv_GetPhysicalDeviceProperties2(
          break;
       }

+      case VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_SAMPLE_LOCATIONS_PROPERTIES_EXT: {
+         VkPhysicalDeviceSampleLocationsPropertiesEXT *props =
+            (VkPhysicalDeviceSampleLocationsPropertiesEXT *)ext;
+
+         props->sampleLocationSampleCounts =
+            isl_device_get_sample_counts(&pdevice->isl_dev);
+
+         props->maxSampleLocationGridSize.width = SAMPLE_LOC_GRID_W;
+         props->maxSampleLocationGridSize.height = SAMPLE_LOC_GRID_H;
+
+         props->sampleLocationCoordinateRange[0] = 0;
+         props->sampleLocationCoordinateRange[1] = 0.9375;
+         props->sampleLocationSubPixelBits = 4;
+
+         props->variableSampleLocations = VK_FALSE;
+
+         break;
+      }
+
       default:
          anv_debug_ignored_stype(ext->sType);
          break;
diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
index eed282ff985..5905299e59d 100644
--- a/src/intel/vulkan/anv_private.h
+++ b/src/intel/vulkan/anv_private.h
@@ -195,6 +195,9 @@ struct gen_l3_config;

 #define anv_printflike(a, b) __attribute__((__format__(__printf__, a, b)))

+#define SAMPLE_LOC_GRID_W 1
+#define SAMPLE_LOC_GRID_H 1
+
 static inline uint32_t
 align_down_npot_u32(uint32_t v, uint32_t a)
 {
--
2.20.1
[Mesa-dev] [PATCH v3 4/9] anv: Added support for non-dynamic sample locations on Gen8+
Allowing the user to set custom sample locations non-dynamically, by filling the extension structs and chaining them to the pipeline structs according to the Vulkan specification section [26.5. Custom Sample Locations] for the following structures: 'VkPipelineSampleLocationsStateCreateInfoEXT' 'VkSampleLocationsInfoEXT' 'VkSampleLocationEXT' Once custom locations are used, the default locations are lost and need to be re-emitted again in the next pipeline creation. For that, we emit the 3DSTATE_SAMPLE_PATTERN at every pipeline creation. --- src/intel/common/gen_sample_positions.h | 53 src/intel/vulkan/anv_genX.h | 5 ++ src/intel/vulkan/anv_private.h | 9 +++ src/intel/vulkan/anv_sample_locations.c | 38 +++- src/intel/vulkan/anv_sample_locations.h | 29 + src/intel/vulkan/genX_pipeline.c| 80 + src/intel/vulkan/genX_state.c | 59 ++ 7 files changed, 259 insertions(+), 14 deletions(-) create mode 100644 src/intel/vulkan/anv_sample_locations.h diff --git a/src/intel/common/gen_sample_positions.h b/src/intel/common/gen_sample_positions.h index da48dcb5ed0..e8af2a552dc 100644 --- a/src/intel/common/gen_sample_positions.h +++ b/src/intel/common/gen_sample_positions.h @@ -160,4 +160,57 @@ prefix##14YOffset = 0.9375; \ prefix##15XOffset = 0.0625; \ prefix##15YOffset = 0.; +/* Examples: + * in case of GEN_GEN < 8: + * SET_SAMPLE_POS(ms.Sample, 0); expands to: + *ms.Sample0XOffset = anv_samples[0].offs_x; + *ms.Sample0YOffset = anv_samples[0].offs_y; + * + * in case of GEN_GEN >= 8: + * SET_SAMPLE_POS(sp._16xSample, 0); expands to: + *sp._16xSample0XOffset = anv_samples[0].offs_x; + *sp._16xSample0YOffset = anv_samples[0].offs_y; + */ +#define SET_SAMPLE_POS(prefix, sample_idx) \ +prefix##sample_idx##XOffset = anv_samples[sample_idx].offs_x; \ +prefix##sample_idx##YOffset = anv_samples[sample_idx].offs_y; + +#define SET_SAMPLE_POS_2X(prefix) \ +SET_SAMPLE_POS(prefix, 0); \ +SET_SAMPLE_POS(prefix, 1); + +#define SET_SAMPLE_POS_4X(prefix) \ +SET_SAMPLE_POS(prefix, 0); \ 
+SET_SAMPLE_POS(prefix, 1); \ +SET_SAMPLE_POS(prefix, 2); \ +SET_SAMPLE_POS(prefix, 3); + +#define SET_SAMPLE_POS_8X(prefix) \ +SET_SAMPLE_POS(prefix, 0); \ +SET_SAMPLE_POS(prefix, 1); \ +SET_SAMPLE_POS(prefix, 2); \ +SET_SAMPLE_POS(prefix, 3); \ +SET_SAMPLE_POS(prefix, 4); \ +SET_SAMPLE_POS(prefix, 5); \ +SET_SAMPLE_POS(prefix, 6); \ +SET_SAMPLE_POS(prefix, 7); + +#define SET_SAMPLE_POS_16X(prefix) \ +SET_SAMPLE_POS(prefix, 0); \ +SET_SAMPLE_POS(prefix, 1); \ +SET_SAMPLE_POS(prefix, 2); \ +SET_SAMPLE_POS(prefix, 3); \ +SET_SAMPLE_POS(prefix, 4); \ +SET_SAMPLE_POS(prefix, 5); \ +SET_SAMPLE_POS(prefix, 6); \ +SET_SAMPLE_POS(prefix, 7); \ +SET_SAMPLE_POS(prefix, 8); \ +SET_SAMPLE_POS(prefix, 9); \ +SET_SAMPLE_POS(prefix, 10); \ +SET_SAMPLE_POS(prefix, 11); \ +SET_SAMPLE_POS(prefix, 12); \ +SET_SAMPLE_POS(prefix, 13); \ +SET_SAMPLE_POS(prefix, 14); \ +SET_SAMPLE_POS(prefix, 15); + #endif /* GEN_SAMPLE_POSITIONS_H */ diff --git a/src/intel/vulkan/anv_genX.h b/src/intel/vulkan/anv_genX.h index 8fd32cabf1e..52415c04a45 100644 --- a/src/intel/vulkan/anv_genX.h +++ b/src/intel/vulkan/anv_genX.h @@ -88,3 +88,8 @@ void genX(cmd_buffer_mi_memset)(struct anv_cmd_buffer *cmd_buffer, void genX(blorp_exec)(struct blorp_batch *batch, const struct blorp_params *params); + +void genX(emit_sample_locations)(struct anv_batch *batch, + uint32_t num_samples, + const VkSampleLocationsInfoEXT *sl, + bool custom_locations); diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h index 5905299e59d..981956e5706 100644 --- a/src/intel/vulkan/anv_private.h +++ b/src/intel/vulkan/anv_private.h @@ -71,6 +71,7 @@ struct anv_buffer; struct anv_buffer_view; struct anv_image_view; struct anv_instance; +struct anv_sample; struct gen_l3_config; @@ -165,6 +166,7 @@ struct gen_l3_config; #define MAX_PUSH_DESCRIPTORS 32 /* Minimum requirement */ #define MAX_INLINE_UNIFORM_BLOCK_SIZE 4096 #define MAX_INLINE_UNIFORM_BLOCK_DESCRIPTORS 32 +#define MAX_SAMPLE_LOCATIONS 16 /* The kernel 
relocation API has a limitation of a 32-bit delta value * applied to the address before it is written which, in spite of it being @@ -2086,6 +2088,13 @@ struct anv_push_constants { struct brw_image_param images[MAX_GEN8_IMAGES]; }; +struct +anv_sample { + float offs_x; + float offs_y; + float radius; +}; + struct anv_dynamic_state { struct { uint32_t count; diff --git a/src/intel/vulkan/anv_sample_locations.c b/src/intel/vulkan/anv_sample_locations.c index 1ebf280e05b..c660cb5ae84 100644 --- a/src/intel/vulkan/anv_sample_locations.c +++ b/src/intel/vulkan/anv_sample_locations.c @@ -21,7 +21,7 @@ * IN THE SOFTWARE. */ -#include "anv_private.h"
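The SET_SAMPLE_POS macros in this patch rely on ## token pasting to build field names such as Sample0XOffset from a prefix and a sample index. The sketch below shows the same mechanism against a mock struct. mock_ms and pack_2x are hypothetical names for illustration; the real genxml structs have many more fields.

```c
#include <assert.h>

/* Reduced version of struct anv_sample from the series. */
struct sample {
   float offs_x;
   float offs_y;
};

/* Mock of a packed-state struct with per-sample offset fields. */
struct mock_ms {
   float Sample0XOffset, Sample0YOffset;
   float Sample1XOffset, Sample1YOffset;
};

/* As in the patch, the macro expects an array named anv_samples to be in
 * scope at the expansion site. */
#define SET_SAMPLE_POS(prefix, sample_idx) \
   prefix##sample_idx##XOffset = anv_samples[sample_idx].offs_x; \
   prefix##sample_idx##YOffset = anv_samples[sample_idx].offs_y;

#define SET_SAMPLE_POS_2X(prefix) \
   SET_SAMPLE_POS(prefix, 0); \
   SET_SAMPLE_POS(prefix, 1);

/* SET_SAMPLE_POS_2X(ms.Sample) expands to assignments to
 * ms.Sample0XOffset, ms.Sample0YOffset, ms.Sample1XOffset and
 * ms.Sample1YOffset. */
static struct mock_ms
pack_2x(const struct sample *anv_samples)
{
   assert(anv_samples);

   struct mock_ms ms = { 0 };
   SET_SAMPLE_POS_2X(ms.Sample);
   return ms;
}
```

The hidden dependency on a variable named anv_samples keeps the call sites short, at the cost of making the macro unusable where the array has a different name.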
[Mesa-dev] [PATCH v3 0/9] Implementation of the VK_EXT_sample_locations
Implemented the requirements from the VK_EXT_sample_locations extension
specification to allow setting custom sample locations on Intel Gen >= 7.

Some decisions explained:

The grid size was set to 1x1 because the hardware only supports a single
set of sample locations for the whole framebuffer.

The user can set custom sample locations either per pipeline, by filling
the structs provided by the extension, or dynamically, as described in
sections 26.5, 36.1 and 36.2 of the Vulkan specification.

Sections 6.7.3 and 7.4 describe how to use sample locations with images
when a layout transition is about to take place. These sections were
ignored, as we currently aren't using sample locations with images in the
driver.

Variable sample locations aren't required and have not been implemented.

We have 754 vk-gl-cts tests for this extension: 690 of them pass on
Gen >= 9 (where we can support 16 samples). The remaining 64 tests aren't
supported because they test the variable sample locations.

Eleni Maria Stea (9):
  anv: Added the VK_EXT_sample_locations extension to the anv_extensions
    list
  anv: Set the values for the
    VkPhysicalDeviceSampleLocationsPropertiesEXT
  anv: Implemented the vkGetPhysicalDeviceMultisamplePropertiesEXT
  anv: Added support for non-dynamic sample locations on Gen8+
  anv: Added support for dynamic sample locations on Gen8+
  anv: Added support for dynamic and non-dynamic sample locations on
    Gen7
  anv: Optimized the emission of the default locations on Gen8+
  anv: Removed unused header file
  anv: Enabled the VK_EXT_sample_locations extension

 src/intel/Makefile.sources              |   1 +
 src/intel/common/gen_sample_positions.h |  53
 src/intel/vulkan/anv_cmd_buffer.c       |  19
 src/intel/vulkan/anv_device.c           |  21
 src/intel/vulkan/anv_extensions.py      |   1 +
 src/intel/vulkan/anv_genX.h             |   7
 src/intel/vulkan/anv_private.h          |  18
 src/intel/vulkan/anv_sample_locations.c |  96
 src/intel/vulkan/anv_sample_locations.h |  29
 src/intel/vulkan/genX_blorp_exec.c      |   1 -
 src/intel/vulkan/genX_cmd_buffer.c      |  24
 src/intel/vulkan/genX_pipeline.c        |  92
 src/intel/vulkan/genX_state.c           | 128
 src/intel/vulkan/meson.build            |   1 +
 14 files changed, 450 insertions(+), 41 deletions(-)
 create mode 100644 src/intel/vulkan/anv_sample_locations.c
 create mode 100644 src/intel/vulkan/anv_sample_locations.h

--
2.20.1
Re: [Mesa-dev] [PATCH 2/9] anv: Set the values for the VkPhysicalDeviceSampleLocationsPropertiesEXT
On Mon, 11 Mar 2019 11:39:58 -0700
Sagar Ghuge wrote:

> On Mon, 2019-03-11 at 17:04 +0200, Eleni Maria Stea wrote:
> > The VkPhysicalDeviceSampleLocationPropertiesEXT struct is filled
> > with implementation dependent values and according to the table
> > from the Vulkan Specification section [36.1. Limit Requirements]:
> >
> > pname                               | max           | min
> > pname:sampleLocationSampleCounts    | -             | ename:VK_SAMPLE_COUNT_4_BIT
> > pname:maxSampleLocationGridSize     | -             | (1, 1)
> > pname:sampleLocationCoordinateRange | (0.0, 0.9375) | (0.0, 0.9375)
> > pname:sampleLocationSubPixelBits    | -             | 4
> > pname:variableSampleLocations       | false         | implementation dependent
> >
> > The hardware only supports setting the same sample location for all
> > the pixels, so we only support 1x1 grids.
> >
> > Also, variableSampleLocations is set to false because we don't
> > support the feature.
> > ---
> >  src/intel/vulkan/anv_device.c  | 21 +
> >  src/intel/vulkan/anv_private.h |  3 +++
> >  2 files changed, 24 insertions(+)
> >
> > diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
> > index 729cceb3e32..1e183b7f4ad 100644
> > --- a/src/intel/vulkan/anv_device.c
> > +++ b/src/intel/vulkan/anv_device.c
> > @@ -1401,6 +1401,27 @@ void anv_GetPhysicalDeviceProperties2(
> >           break;
> >        }
> >
> > +      case VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_SAMPLE_LOCATIONS_PROPERTIES_EXT: {
> > +         VkPhysicalDeviceSampleLocationsPropertiesEXT *props =
> > +            (VkPhysicalDeviceSampleLocationsPropertiesEXT *)ext;
> > +         props->sampleLocationSampleCounts = ISL_SAMPLE_COUNT_2_BIT |
> > +                                             ISL_SAMPLE_COUNT_4_BIT |
> > +                                             ISL_SAMPLE_COUNT_8_BIT;
> > +         if (pdevice->info.gen >= 9)
> > +            props->sampleLocationSampleCounts |= ISL_SAMPLE_COUNT_16_BIT;
>
> Hi Eleni,
>
> Thanks for the series.
>
> "isl_device_get_sample_counts" method figure out values according to
> platform so maybe we can make use of it and ignore
> ISL_SAMPLE_COUNT_1_BIT. So that we don't have to take care of values
> according to platform here.
>
> I am not sure about this, so it might be a good idea to consult with
> Jason/Lionel once. :)

I think that not only are you right here, but on top of that we
shouldn't ignore the ISL_SAMPLE_COUNT_1_BIT, as we can still write one
user-defined location when only 1 sample per pixel is used (at least the
MULTISAMPLE and SAMPLE_PATTERN commands allow us to do so).

So, I've made the change, thank you. :)

> with or without the fix, this patch is:
>
> Reviewed-by: Sagar Ghuge
>

Thanks for the review!

Eleni
[Mesa-dev] [PATCH v2 8/9] anv: Removed unused header file
In src/intel/vulkan/genX_blorp_exec.c we included the file
common/gen_sample_positions.h but did not use it. Removed.
---
 src/intel/vulkan/genX_blorp_exec.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/src/intel/vulkan/genX_blorp_exec.c b/src/intel/vulkan/genX_blorp_exec.c
index e9c85d56d5f..0eeefaaa9d6 100644
--- a/src/intel/vulkan/genX_blorp_exec.c
+++ b/src/intel/vulkan/genX_blorp_exec.c
@@ -31,7 +31,6 @@
 #undef __gen_combine_address

 #include "common/gen_l3_config.h"
-#include "common/gen_sample_positions.h"
 #include "blorp/blorp_genX_exec.h"

 static void *
-- 
2.20.1
[Mesa-dev] [PATCH v2 7/9] anv: Optimized the emission of the default locations on Gen8+
We only emit sample locations when the extension is enabled by the user.
In all other cases the default locations are emitted once when the device
is initialized to increase performance.
---
 src/intel/vulkan/anv_genX.h        |  3 ++-
 src/intel/vulkan/genX_cmd_buffer.c |  2 +-
 src/intel/vulkan/genX_pipeline.c   | 11 +++
 src/intel/vulkan/genX_state.c      |  8 +---
 4 files changed, 15 insertions(+), 9 deletions(-)

diff --git a/src/intel/vulkan/anv_genX.h b/src/intel/vulkan/anv_genX.h
index e82d83465ef..7f33a2b0a68 100644
--- a/src/intel/vulkan/anv_genX.h
+++ b/src/intel/vulkan/anv_genX.h
@@ -93,4 +93,5 @@ void genX(emit_ms_state)(struct anv_batch *batch,
                          struct anv_sample *anv_samples,
                          uint32_t num_samples,
                          uint32_t log2_samples,
-                         bool custom_sample_locations);
+                         bool custom_sample_locations,
+                         bool sample_locations_ext_enabled);
diff --git a/src/intel/vulkan/genX_cmd_buffer.c b/src/intel/vulkan/genX_cmd_buffer.c
index 4752c66f350..ae7c5a80a3c 100644
--- a/src/intel/vulkan/genX_cmd_buffer.c
+++ b/src/intel/vulkan/genX_cmd_buffer.c
@@ -2654,7 +2654,7 @@ cmd_buffer_emit_sample_locations(struct anv_cmd_buffer *cmd_buffer)
    anv_samples = cmd_buffer->state.gfx.dynamic.sample_locations.anv_samples;

    genX(emit_ms_state)(&cmd_buffer->batch, anv_samples, samples,
-                       log2_samples, true);
+                       log2_samples, true, true);
 }

 void
diff --git a/src/intel/vulkan/genX_pipeline.c b/src/intel/vulkan/genX_pipeline.c
index 8afc08f0320..12adfa65da8 100644
--- a/src/intel/vulkan/genX_pipeline.c
+++ b/src/intel/vulkan/genX_pipeline.c
@@ -573,10 +573,12 @@ emit_sample_mask(struct anv_pipeline *pipeline,
 }

 static void
-emit_ms_state(struct anv_pipeline *pipeline,
+emit_ms_state(struct anv_device *device,
+              struct anv_pipeline *pipeline,
               const VkPipelineMultisampleStateCreateInfo *info,
               const VkPipelineDynamicStateCreateInfo *dinfo)
 {
+   bool sample_loc_enabled = device->enabled_extensions.EXT_sample_locations;
    struct anv_sample anv_samples[MAX_SAMPLE_LOCATIONS];
    VkSampleLocationsInfoEXT *sl;
    bool custom_locations = false;
@@ -588,7 +590,7 @@ emit_ms_state(struct anv_pipeline *pipeline,
    if (info) {
       samples = info->rasterizationSamples;

-      if (info->pNext) {
+      if (sample_loc_enabled && info->pNext) {
          VkPipelineSampleLocationsStateCreateInfoEXT *slinfo =
             (VkPipelineSampleLocationsStateCreateInfoEXT *)info->pNext;

@@ -617,7 +619,7 @@ emit_ms_state(struct anv_pipeline *pipeline,
    }

    genX(emit_ms_state)(&pipeline->batch, anv_samples, samples, log2_samples,
-                       custom_locations);
+                       custom_locations, sample_loc_enabled);
 }

 static const uint32_t vk_to_gen_logic_op[] = {
@@ -1947,7 +1949,8 @@ genX(graphics_pipeline_create)(
    assert(pCreateInfo->pRasterizationState);
    emit_rs_state(pipeline, pCreateInfo->pRasterizationState,
                  pCreateInfo->pMultisampleState, pass, subpass);
-   emit_ms_state(pipeline, pCreateInfo->pMultisampleState, pCreateInfo->pDynamicState);
+   emit_ms_state(device, pipeline, pCreateInfo->pMultisampleState,
+                 pCreateInfo->pDynamicState);
    emit_ds_state(pipeline, pCreateInfo->pDepthStencilState, pass, subpass);
    emit_cb_state(pipeline, pCreateInfo->pColorBlendState,
                  pCreateInfo->pMultisampleState);
diff --git a/src/intel/vulkan/genX_state.c b/src/intel/vulkan/genX_state.c
index 804cfab3a56..bc6b5870d8d 100644
--- a/src/intel/vulkan/genX_state.c
+++ b/src/intel/vulkan/genX_state.c
@@ -552,12 +552,14 @@ genX(emit_ms_state)(struct anv_batch *batch,
                     struct anv_sample *anv_samples,
                     uint32_t num_samples,
                     uint32_t log2_samples,
-                    bool custom_sample_locations)
+                    bool custom_sample_locations,
+                    bool sample_locations_ext_enabled)
 {
    emit_multisample(batch, anv_samples, num_samples, log2_samples,
                     custom_sample_locations);
 #if GEN_GEN >= 8
-   emit_sample_locations(batch, anv_samples, num_samples,
-                         custom_sample_locations);
+   if (sample_locations_ext_enabled)
+      emit_sample_locations(batch, anv_samples, num_samples,
+                            custom_sample_locations);
 #endif
 }
-- 
2.20.1
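To make the effect of the new flag easier to see, here is a rough standalone model of the gating in genX(emit_ms_state). It is only a sketch: `sketch_batch` and `sketch_emit_ms_state` are hypothetical names, the batch merely counts packets, a runtime `gen` argument stands in for the compile-time GEN_GEN check, and it assumes emit_sample_locations is what produces 3DSTATE_SAMPLE_PATTERN on Gen8+.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for anv_batch: it only counts the packets that
 * would have been emitted. */
struct sketch_batch {
   unsigned multisample_packets;    /* 3DSTATE_MULTISAMPLE */
   unsigned sample_pattern_packets; /* 3DSTATE_SAMPLE_PATTERN (Gen8+) */
};

/* Simplified model of genX(emit_ms_state) after this patch. The multisample
 * state is always emitted; the sample pattern is emitted only on Gen8+ and
 * only when the application enabled VK_EXT_sample_locations. */
static void
sketch_emit_ms_state(struct sketch_batch *batch, int gen,
                     bool sample_locations_ext_enabled)
{
   assert(batch != NULL);

   batch->multisample_packets++;

   if (gen >= 8 && sample_locations_ext_enabled)
      batch->sample_pattern_packets++;
}
```

With the extension disabled, every pipeline creation then costs one packet less on Gen8+, and the pattern programmed at device initialization stays in effect.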
[Mesa-dev] [PATCH v2 6/9] anv: Added support for dynamic and non-dynamic sample locations on Gen7
Allow setting dynamic and non-dynamic sample locations on Gen7.
---
 src/intel/vulkan/anv_genX.h        | 13 ++---
 src/intel/vulkan/genX_cmd_buffer.c |  9 ++--
 src/intel/vulkan/genX_pipeline.c   | 13 +
 src/intel/vulkan/genX_state.c      | 86 +-
 4 files changed, 70 insertions(+), 51 deletions(-)

diff --git a/src/intel/vulkan/anv_genX.h b/src/intel/vulkan/anv_genX.h
index f84fe457152..e82d83465ef 100644
--- a/src/intel/vulkan/anv_genX.h
+++ b/src/intel/vulkan/anv_genX.h
@@ -89,11 +89,8 @@ void genX(cmd_buffer_mi_memset)(struct anv_cmd_buffer *cmd_buffer,
 void genX(blorp_exec)(struct blorp_batch *batch,
                       const struct blorp_params *params);

-void genX(emit_multisample)(struct anv_batch *batch,
-                            uint32_t samples,
-                            uint32_t log2_samples);
-
-void genX(emit_sample_locations)(struct anv_batch *batch,
-                                 const struct anv_sample *anv_samples,
-                                 uint32_t num_samples,
-                                 bool custom_locations);
+void genX(emit_ms_state)(struct anv_batch *batch,
+                         struct anv_sample *anv_samples,
+                         uint32_t num_samples,
+                         uint32_t log2_samples,
+                         bool custom_sample_locations);
diff --git a/src/intel/vulkan/genX_cmd_buffer.c b/src/intel/vulkan/genX_cmd_buffer.c
index 9229df84caa..4752c66f350 100644
--- a/src/intel/vulkan/genX_cmd_buffer.c
+++ b/src/intel/vulkan/genX_cmd_buffer.c
@@ -2643,8 +2643,7 @@ cmd_buffer_flush_push_constants(struct anv_cmd_buffer *cmd_buffer,
 static void
 cmd_buffer_emit_sample_locations(struct anv_cmd_buffer *cmd_buffer)
 {
-#if GEN_GEN >= 8
-   const struct anv_sample *anv_samples;
+   struct anv_sample *anv_samples;
    uint32_t log2_samples;
    uint32_t samples;

@@ -2654,10 +2653,8 @@ cmd_buffer_emit_sample_locations(struct anv_cmd_buffer *cmd_buffer)
    log2_samples = __builtin_ffs(samples) - 1;
    anv_samples = cmd_buffer->state.gfx.dynamic.sample_locations.anv_samples;

-   genX(emit_multisample)(&cmd_buffer->batch, samples, log2_samples);
-   genX(emit_sample_locations)(&cmd_buffer->batch, anv_samples, samples,
-                               true);
-#endif
+   genX(emit_ms_state)(&cmd_buffer->batch, anv_samples, samples,
+                       log2_samples, true);
 }

 void
diff --git a/src/intel/vulkan/genX_pipeline.c b/src/intel/vulkan/genX_pipeline.c
index fa42e622077..8afc08f0320 100644
--- a/src/intel/vulkan/genX_pipeline.c
+++ b/src/intel/vulkan/genX_pipeline.c
@@ -577,12 +577,9 @@ emit_ms_state(struct anv_pipeline *pipeline,
               const VkPipelineMultisampleStateCreateInfo *info,
               const VkPipelineDynamicStateCreateInfo *dinfo)
 {
-#if GEN_GEN >= 8
    struct anv_sample anv_samples[MAX_SAMPLE_LOCATIONS];
    VkSampleLocationsInfoEXT *sl;
    bool custom_locations = false;
-#endif
-
    uint32_t samples = 1;
    uint32_t log2_samples = 0;

@@ -591,7 +588,6 @@ emit_ms_state(struct anv_pipeline *pipeline,
    if (info) {
       samples = info->rasterizationSamples;

-#if GEN_GEN >= 8
       if (info->pNext) {
          VkPipelineSampleLocationsStateCreateInfoEXT *slinfo =
             (VkPipelineSampleLocationsStateCreateInfoEXT *)info->pNext;
@@ -616,17 +612,12 @@ emit_ms_state(struct anv_pipeline *pipeline,
             }
          }
       }
-#endif

       log2_samples = __builtin_ffs(samples) - 1;
    }

-   genX(emit_multisample(&pipeline->batch, samples, log2_samples));
-
-#if GEN_GEN >= 8
-   genX(emit_sample_locations)(&pipeline->batch, anv_samples, samples,
-                               custom_locations);
-#endif
+   genX(emit_ms_state)(&pipeline->batch, anv_samples, samples, log2_samples,
+                       custom_locations);
 }

 static const uint32_t vk_to_gen_logic_op[] = {
diff --git a/src/intel/vulkan/genX_state.c b/src/intel/vulkan/genX_state.c
index 44cfc925ed5..804cfab3a56 100644
--- a/src/intel/vulkan/genX_state.c
+++ b/src/intel/vulkan/genX_state.c
@@ -437,10 +437,12 @@ VkResult genX(CreateSampler)(
    return VK_SUCCESS;
 }

-void
-genX(emit_multisample)(struct anv_batch *batch,
-                       uint32_t samples,
-                       uint32_t log2_samples)
+static void
+emit_multisample(struct anv_batch *batch,
+                 const struct anv_sample *anv_samples,
+                 uint32_t samples,
+                 uint32_t log2_samples,
+                 bool custom_locations)
 {
    anv_batch_emit(batch, GENX(3DSTATE_MULTISAMPLE), ms) {
       ms.NumberofMultisamples = log2_samples;
@@ -453,34 +455,52 @@ genX(emit_multisample)(struct anv_batch *batch,
        */
       ms.PixelPositionOffsetEnable = false;
 #else
-      switch (samples) {
-      case 1:
-         GEN_SAMPLE_POS_1X(ms.Sample);
-         break;
-      case 2:
-         GEN_SAMPLE_POS_2X(ms.Sample);
-         break;
-      case 4:
-         GEN_SAMPLE_POS_4X(ms.Sample);
-         break;
-
[Mesa-dev] [PATCH v2 5/9] anv: Added support for dynamic sample locations on Gen8+
Added support for setting the locations when the pipeline has been created
with the dynamic state bit enabled, according to the Vulkan Specification
section [26.5. Custom Sample Locations] for the function:

'vkCmdSetSampleLocationsEXT'

The reason that we preferred to store the boolean valid inside the dynamic
state struct for locations instead of using a dirty bit
(ANV_CMD_DIRTY_SAMPLE_LOCATIONS for example) is that other functions can
modify the value of the dirty bits, causing unexpected behavior.
---
 src/intel/vulkan/anv_cmd_buffer.c  | 19
 src/intel/vulkan/anv_genX.h        |  6 +++-
 src/intel/vulkan/anv_private.h     |  6
 src/intel/vulkan/genX_cmd_buffer.c | 27 ++
 src/intel/vulkan/genX_pipeline.c   | 46 --
 src/intel/vulkan/genX_state.c      | 41 +++---
 6 files changed, 99 insertions(+), 46 deletions(-)

diff --git a/src/intel/vulkan/anv_cmd_buffer.c b/src/intel/vulkan/anv_cmd_buffer.c
index 1b34644a434..101c1375430 100644
--- a/src/intel/vulkan/anv_cmd_buffer.c
+++ b/src/intel/vulkan/anv_cmd_buffer.c
@@ -28,6 +28,7 @@
 #include

 #include "anv_private.h"
+#include "anv_sample_locations.h"

 #include "vk_format_info.h"
 #include "vk_util.h"
@@ -558,6 +559,24 @@ void anv_CmdSetStencilReference(
    cmd_buffer->state.gfx.dirty |= ANV_CMD_DIRTY_DYNAMIC_STENCIL_REFERENCE;
 }

+void
+anv_CmdSetSampleLocationsEXT(VkCommandBuffer commandBuffer,
+                             const VkSampleLocationsInfoEXT *pSampleLocationsInfo)
+{
+   ANV_FROM_HANDLE(anv_cmd_buffer, cmd_buffer, commandBuffer);
+   assert(pSampleLocationsInfo);
+
+   struct anv_dynamic_state *dyn_state = &cmd_buffer->state.gfx.dynamic;
+   dyn_state->sample_locations.num_samples =
+      pSampleLocationsInfo->sampleLocationsPerPixel;
+
+   anv_calc_sample_locations(dyn_state->sample_locations.anv_samples,
+                             dyn_state->sample_locations.num_samples,
+                             pSampleLocationsInfo);
+
+   cmd_buffer->state.gfx.dynamic.sample_locations.valid = true;
+}
+
 static void
 anv_cmd_buffer_bind_descriptor_set(struct anv_cmd_buffer *cmd_buffer,
                                    VkPipelineBindPoint bind_point,
diff --git a/src/intel/vulkan/anv_genX.h b/src/intel/vulkan/anv_genX.h
index 52415c04a45..f84fe457152 100644
--- a/src/intel/vulkan/anv_genX.h
+++ b/src/intel/vulkan/anv_genX.h
@@ -89,7 +89,11 @@ void genX(cmd_buffer_mi_memset)(struct anv_cmd_buffer *cmd_buffer,
 void genX(blorp_exec)(struct blorp_batch *batch,
                       const struct blorp_params *params);

+void genX(emit_multisample)(struct anv_batch *batch,
+                            uint32_t samples,
+                            uint32_t log2_samples);
+
 void genX(emit_sample_locations)(struct anv_batch *batch,
+                                 const struct anv_sample *anv_samples,
                                  uint32_t num_samples,
-                                 const VkSampleLocationsInfoEXT *sl,
                                  bool custom_locations);
diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
index 981956e5706..a2e1756cd99 100644
--- a/src/intel/vulkan/anv_private.h
+++ b/src/intel/vulkan/anv_private.h
@@ -2135,6 +2135,12 @@ struct anv_dynamic_state {
       uint32_t front;
       uint32_t back;
    } stencil_reference;
+
+   struct {
+      struct anv_sample anv_samples[MAX_SAMPLE_LOCATIONS];
+      uint32_t num_samples;
+      bool valid;
+   } sample_locations;
 };

 extern const struct anv_dynamic_state default_dynamic_state;
diff --git a/src/intel/vulkan/genX_cmd_buffer.c b/src/intel/vulkan/genX_cmd_buffer.c
index 7687507e6b7..9229df84caa 100644
--- a/src/intel/vulkan/genX_cmd_buffer.c
+++ b/src/intel/vulkan/genX_cmd_buffer.c
@@ -25,11 +25,13 @@
 #include

 #include "anv_private.h"
+#include "anv_sample_locations.h"
 #include "vk_format_info.h"
 #include "vk_util.h"
 #include "util/fast_idiv_by_const.h"

 #include "common/gen_l3_config.h"
+#include "common/gen_sample_positions.h"
 #include "genxml/gen_macros.h"
 #include "genxml/genX_pack.h"

@@ -2638,6 +2640,26 @@ cmd_buffer_flush_push_constants(struct anv_cmd_buffer *cmd_buffer,
    cmd_buffer->state.push_constants_dirty &= ~flushed;
 }

+static void
+cmd_buffer_emit_sample_locations(struct anv_cmd_buffer *cmd_buffer)
+{
+#if GEN_GEN >= 8
+   const struct anv_sample *anv_samples;
+   uint32_t log2_samples;
+   uint32_t samples;
+
+   samples = cmd_buffer->state.gfx.dynamic.sample_locations.num_samples;
+   assert(samples > 0);
+
+   log2_samples = __builtin_ffs(samples) - 1;
+   anv_samples = cmd_buffer->state.gfx.dynamic.sample_locations.anv_samples;
+
+   genX(emit_multisample)(&cmd_buffer->batch, samples, log2_samples);
+   genX(emit_sample_locations)(&cmd_buffer->batch,
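As a rough illustration of the two ideas in this patch (the `valid` flag kept next to the dynamic state, and the `__builtin_ffs(samples) - 1` conversion), here is a small standalone model. All `sketch_*` names are hypothetical and nothing here touches real anv structures. It also assumes the flag is consumed at flush time, which is not visible in the part of the diff shown here, and it relies on the GCC/Clang builtin the patch itself uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, reduced version of the sample_locations member added to
 * struct anv_dynamic_state (the per-sample offsets are left out). */
struct sketch_sample_locations {
   uint32_t num_samples;
   bool valid;
};

/* Maps a power-of-two sample count (1, 2, 4, 8, 16) to its log2, the way
 * the patch does. __builtin_ffs returns the 1-based index of the lowest
 * set bit, so subtracting 1 gives log2 for powers of two. */
static uint32_t
sketch_log2_samples(uint32_t samples)
{
   assert(samples > 0 && (samples & (samples - 1)) == 0);
   return (uint32_t)__builtin_ffs((int)samples) - 1;
}

/* Model of vkCmdSetSampleLocationsEXT: record the state and mark it valid. */
static void
sketch_set_sample_locations(struct sketch_sample_locations *sl,
                            uint32_t num_samples)
{
   sl->num_samples = num_samples;
   sl->valid = true;
}

/* Model of the flush: returns true if state would be emitted, and clears
 * the flag so that a second flush without a new "set" emits nothing.
 * (Assumption: the flag is cleared at flush time.) */
static bool
sketch_flush(struct sketch_sample_locations *sl, uint32_t *log2_out)
{
   if (!sl->valid)
      return false;

   *log2_out = sketch_log2_samples(sl->num_samples);
   sl->valid = false;
   return true;
}
```

Because the flag lives in its own struct member, nothing that rewrites the command buffer's dirty mask can clear or set it by accident, which is the behavior the commit message is after.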
[Mesa-dev] [PATCH v2 2/9] anv: Set the values for the VkPhysicalDeviceSampleLocationsPropertiesEXT
The VkPhysicalDeviceSampleLocationsPropertiesEXT struct is filled with
implementation-dependent values, according to the table from the Vulkan
Specification section [36.1. Limit Requirements]:

pname                               | max           | min
pname:sampleLocationSampleCounts    | -             | ename:VK_SAMPLE_COUNT_4_BIT
pname:maxSampleLocationGridSize     | -             | (1, 1)
pname:sampleLocationCoordinateRange | (0.0, 0.9375) | (0.0, 0.9375)
pname:sampleLocationSubPixelBits    | -             | 4
pname:variableSampleLocations       | false         | implementation dependent

The hardware only supports setting the same sample location for all the
pixels, so we only support 1x1 grids.

Also, variableSampleLocations is set to false because we don't support the
feature.

v2: 1- Replaced false with VK_FALSE for consistency. (Sagar Ghuge)
    2- Used isl_device_get_sample_counts to get the supported sample
       counts per platform and avoid extra checks. (Sagar Ghuge)

Reviewed-by: Sagar Ghuge
---
 src/intel/vulkan/anv_device.c  | 19 +++
 src/intel/vulkan/anv_private.h |  3 +++
 2 files changed, 22 insertions(+)

diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
index 729cceb3e32..bf6f03ebb1a 100644
--- a/src/intel/vulkan/anv_device.c
+++ b/src/intel/vulkan/anv_device.c
@@ -1401,6 +1401,25 @@ void anv_GetPhysicalDeviceProperties2(
          break;
       }

+      case VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_SAMPLE_LOCATIONS_PROPERTIES_EXT: {
+         VkPhysicalDeviceSampleLocationsPropertiesEXT *props =
+            (VkPhysicalDeviceSampleLocationsPropertiesEXT *)ext;
+
+         props->sampleLocationSampleCounts =
+            isl_device_get_sample_counts(&pdevice->isl_dev);
+
+         props->maxSampleLocationGridSize.width = SAMPLE_LOC_GRID_W;
+         props->maxSampleLocationGridSize.height = SAMPLE_LOC_GRID_H;
+
+         props->sampleLocationCoordinateRange[0] = 0;
+         props->sampleLocationCoordinateRange[1] = 0.9375;
+         props->sampleLocationSubPixelBits = 4;
+
+         props->variableSampleLocations = VK_FALSE;
+
+         break;
+      }
+
       default:
          anv_debug_ignored_stype(ext->sType);
          break;
diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
index eed282ff985..5905299e59d 100644
--- a/src/intel/vulkan/anv_private.h
+++ b/src/intel/vulkan/anv_private.h
@@ -195,6 +195,9 @@ struct gen_l3_config;
 #define anv_printflike(a, b) __attribute__((__format__(__printf__, a, b)))

+#define SAMPLE_LOC_GRID_W 1
+#define SAMPLE_LOC_GRID_H 1
+
 static inline uint32_t
 align_down_npot_u32(uint32_t v, uint32_t a)
 {
-- 
2.20.1
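The two numeric limits above are related: with sampleLocationSubPixelBits = 4 there are 2^4 = 16 positions per pixel axis, so the largest representable coordinate is 15/16 = 0.9375. The sketch below shows one way an application-side value could be mapped onto that grid. `sketch_quantize_coord` is a hypothetical helper, not driver code, and rounding down is an assumption made for the example; the patch only reports the limits.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_SUBPIXEL_BITS 4 /* sampleLocationSubPixelBits */

/* Clamps a sample coordinate to [0, 0.9375] and snaps it down to the
 * nearest multiple of 1/16. Rounding down is an illustrative choice. */
static float
sketch_quantize_coord(float v)
{
   const float steps = (float)(1u << SKETCH_SUBPIXEL_BITS); /* 16 */
   const float max = (steps - 1.0f) / steps;                /* 0.9375 */

   if (v < 0.0f)
      v = 0.0f;
   if (v > max)
      v = max;

   return (float)(uint32_t)(v * steps) / steps;
}
```

For instance, 0.53 lands on 8/16 = 0.5, and anything at or above 1.0 is clamped to 0.9375.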
[Mesa-dev] [PATCH v2 3/9] anv: Implemented the vkGetPhysicalDeviceMultisamplePropertiesEXT
Implemented vkGetPhysicalDeviceMultisamplePropertiesEXT according to the
Vulkan Specification section [36.2. Additional Multisampling Capabilities].
---
 src/intel/Makefile.sources              |  1 +
 src/intel/vulkan/anv_sample_locations.c | 60 +
 src/intel/vulkan/meson.build            |  1 +
 3 files changed, 62 insertions(+)
 create mode 100644 src/intel/vulkan/anv_sample_locations.c

diff --git a/src/intel/Makefile.sources b/src/intel/Makefile.sources
index a5c8828a6b6..a0873c7ccc2 100644
--- a/src/intel/Makefile.sources
+++ b/src/intel/Makefile.sources
@@ -251,6 +251,7 @@ VULKAN_FILES := \
 	vulkan/anv_pipeline_cache.c \
 	vulkan/anv_private.h \
 	vulkan/anv_queue.c \
+	vulkan/anv_sample_locations.c \
 	vulkan/anv_util.c \
 	vulkan/anv_wsi.c \
 	vulkan/vk_format_info.h
diff --git a/src/intel/vulkan/anv_sample_locations.c b/src/intel/vulkan/anv_sample_locations.c
new file mode 100644
index 00000000000..1ebf280e05b
--- /dev/null
+++ b/src/intel/vulkan/anv_sample_locations.c
@@ -0,0 +1,60 @@
+/*
+ * Copyright © 2019 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ */
+
+#include "anv_private.h"
+
+void
+anv_GetPhysicalDeviceMultisamplePropertiesEXT(VkPhysicalDevice physicalDevice,
+                                              VkSampleCountFlagBits samples,
+                                              VkMultisamplePropertiesEXT
+                                              *pMultisampleProperties)
+{
+   ANV_FROM_HANDLE(anv_physical_device, physical_device, physicalDevice);
+   const struct gen_device_info *devinfo = &physical_device->info;
+
+   VkExtent2D grid_size;
+   switch (samples) {
+   case VK_SAMPLE_COUNT_2_BIT:
+   case VK_SAMPLE_COUNT_4_BIT:
+   case VK_SAMPLE_COUNT_8_BIT:
+      grid_size.width = SAMPLE_LOC_GRID_W;
+      grid_size.height = SAMPLE_LOC_GRID_H;
+      break;
+
+   case VK_SAMPLE_COUNT_16_BIT:
+      if (devinfo->gen >= 9) {
+         grid_size.width = SAMPLE_LOC_GRID_W;
+         grid_size.height = SAMPLE_LOC_GRID_H;
+         break;
+      }
+   default:
+      grid_size.width = grid_size.height = 0;
+      break;
+   };
+
+   *pMultisampleProperties = (VkMultisamplePropertiesEXT) {
+      .sType = VK_STRUCTURE_TYPE_MULTISAMPLE_PROPERTIES_EXT,
+      .pNext = NULL,
+      .maxSampleLocationGridSize = grid_size
+   };
+}
diff --git a/src/intel/vulkan/meson.build b/src/intel/vulkan/meson.build
index 7fa43a6ad79..3f78757c774 100644
--- a/src/intel/vulkan/meson.build
+++ b/src/intel/vulkan/meson.build
@@ -135,6 +135,7 @@ libanv_files = files(
   'anv_pipeline_cache.c',
   'anv_private.h',
   'anv_queue.c',
+  'anv_sample_locations.c',
   'anv_util.c',
   'anv_wsi.c',
   'vk_format_info.h',
-- 
2.20.1
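The switch in this patch relies on the 16-sample case falling through to `default` on Gen < 9. A standalone model of the same decision, written without the implicit fall-through, might look like the sketch below. `sketch_extent` and `sketch_max_grid_size` are hypothetical names, plain integers replace the Vulkan enums, and the 0x0 result for a single sample simply mirrors the patch as posted.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for VkExtent2D. */
struct sketch_extent {
   uint32_t width;
   uint32_t height;
};

/* Returns the maximum sample location grid size for a sample count on a
 * given hardware generation: 1x1 where custom locations are supported,
 * 0x0 otherwise. */
static struct sketch_extent
sketch_max_grid_size(uint32_t samples, int gen)
{
   const struct sketch_extent supported = { 1, 1 };
   const struct sketch_extent unsupported = { 0, 0 };

   switch (samples) {
   case 2:
   case 4:
   case 8:
      return supported;
   case 16:
      /* 16x MSAA only exists on Gen9+. */
      return gen >= 9 ? supported : unsupported;
   default:
      return unsupported;
   }
}
```

Returning from each case makes the Gen < 9 path for 16 samples explicit, so a compiler's implicit fall-through warning has nothing to flag.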
[Mesa-dev] [PATCH v2 9/9] anv: Enabled the VK_EXT_sample_locations extension
Enabled the VK_EXT_sample_locations for Intel Gen >= 7.

v2: Replaced device->info.gen >= 7 with True, as Anv doesn't support
    anything below Gen7. (Lionel Landwerlin)
---
 src/intel/vulkan/anv_extensions.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/intel/vulkan/anv_extensions.py b/src/intel/vulkan/anv_extensions.py
index 9e4e03e46df..5a30c733c5c 100644
--- a/src/intel/vulkan/anv_extensions.py
+++ b/src/intel/vulkan/anv_extensions.py
@@ -129,7 +129,7 @@ EXTENSIONS = [
     Extension('VK_EXT_inline_uniform_block', 1, True),
     Extension('VK_EXT_pci_bus_info', 2, True),
     Extension('VK_EXT_post_depth_coverage', 1, 'device->info.gen >= 9'),
-    Extension('VK_EXT_sample_locations', 1, False),
+    Extension('VK_EXT_sample_locations', 1, True),
     Extension('VK_EXT_sampler_filter_minmax', 1, 'device->info.gen >= 9'),
     Extension('VK_EXT_scalar_block_layout', 1, True),
     Extension('VK_EXT_shader_stencil_export', 1, 'device->info.gen >= 9'),
-- 
2.20.1
[Mesa-dev] [PATCH v2 0/9] Implementation of the VK_EXT_sample_locations
Implemented the requirements from the VK_EXT_sample_locations extension
specification to allow setting custom sample locations on Intel Gen >= 7.

Some decisions explained:

The grid size was set to 1x1 because the hardware only supports a single
set of sample locations for the whole framebuffer.

The user can only set custom sample locations per pipeline, by filling the
extension-provided structs, or dynamically, in the way described in
sections 26.5, 36.1 and 36.2 of the Vulkan specification.

Sections 6.7.3 and 7.4 describe how to use sample locations with images
when a layout transition is about to take place. These sections were
ignored, as we currently aren't using sample locations with images in the
driver.

Variable sample locations aren't required and have not been implemented.

We have 754 vk-gl-cts tests for this extension: 690 of them pass on
Gen >= 9 (where we can support 16 samples). The remaining 64 tests aren't
supported because they test variable sample locations.

Eleni Maria Stea (9):
  anv: Added the VK_EXT_sample_locations extension to the anv_extensions list
  anv: Set the values for the VkPhysicalDeviceSampleLocationsPropertiesEXT
  anv: Implemented the vkGetPhysicalDeviceMultisamplePropertiesEXT
  anv: Added support for non-dynamic sample locations on Gen8+
  anv: Added support for dynamic sample locations on Gen8+
  anv: Added support for dynamic and non-dynamic sample locations on Gen7
  anv: Optimized the emission of the default locations on Gen8+
  anv: Removed unused header file
  anv: Enabled the VK_EXT_sample_locations extension

 src/intel/Makefile.sources              |   1 +
 src/intel/common/gen_sample_positions.h |  53 ++
 src/intel/vulkan/anv_cmd_buffer.c       |  19
 src/intel/vulkan/anv_device.c           |  21
 src/intel/vulkan/anv_extensions.py      |   1 +
 src/intel/vulkan/anv_genX.h             |   7 ++
 src/intel/vulkan/anv_private.h          |  18
 src/intel/vulkan/anv_sample_locations.c |  96 ++
 src/intel/vulkan/anv_sample_locations.h |  29 ++
 src/intel/vulkan/genX_blorp_exec.c      |   1 -
 src/intel/vulkan/genX_cmd_buffer.c      |  24 +
 src/intel/vulkan/genX_pipeline.c        |  92 +
 src/intel/vulkan/genX_state.c           | 128
 src/intel/vulkan/meson.build            |   1 +
 14 files changed, 450 insertions(+), 41 deletions(-)
 create mode 100644 src/intel/vulkan/anv_sample_locations.c
 create mode 100644 src/intel/vulkan/anv_sample_locations.h

-- 
2.20.1
[Mesa-dev] [PATCH v2 4/9] anv: Added support for non-dynamic sample locations on Gen8+
Allow the user to set custom sample locations non-dynamically, by filling
the extension structs and chaining them to the pipeline structs according
to the Vulkan specification section [26.5. Custom Sample Locations] for the
following structures:

'VkPipelineSampleLocationsStateCreateInfoEXT'
'VkSampleLocationsInfoEXT'
'VkSampleLocationEXT'

Once custom locations are used, the default locations are lost and need to
be re-emitted at the next pipeline creation. For that, we emit the
3DSTATE_SAMPLE_PATTERN at every pipeline creation.
---
 src/intel/common/gen_sample_positions.h | 53
 src/intel/vulkan/anv_genX.h             |  5 ++
 src/intel/vulkan/anv_private.h          |  9 +++
 src/intel/vulkan/anv_sample_locations.c | 38 +++-
 src/intel/vulkan/anv_sample_locations.h | 29 +
 src/intel/vulkan/genX_pipeline.c        | 80 +
 src/intel/vulkan/genX_state.c           | 59 ++
 7 files changed, 259 insertions(+), 14 deletions(-)
 create mode 100644 src/intel/vulkan/anv_sample_locations.h

diff --git a/src/intel/common/gen_sample_positions.h b/src/intel/common/gen_sample_positions.h
index da48dcb5ed0..e8af2a552dc 100644
--- a/src/intel/common/gen_sample_positions.h
+++ b/src/intel/common/gen_sample_positions.h
@@ -160,4 +160,57 @@
 prefix##14YOffset = 0.9375; \
 prefix##15XOffset = 0.0625; \
 prefix##15YOffset = 0.0000;
+/* Examples:
+ * in case of GEN_GEN < 8:
+ * SET_SAMPLE_POS(ms.Sample, 0); expands to:
+ *    ms.Sample0XOffset = anv_samples[0].offs_x;
+ *    ms.Sample0YOffset = anv_samples[0].offs_y;
+ *
+ * in case of GEN_GEN >= 8:
+ * SET_SAMPLE_POS(sp._16xSample, 0); expands to:
+ *    sp._16xSample0XOffset = anv_samples[0].offs_x;
+ *    sp._16xSample0YOffset = anv_samples[0].offs_y;
+ */
+#define SET_SAMPLE_POS(prefix, sample_idx) \
+prefix##sample_idx##XOffset = anv_samples[sample_idx].offs_x; \
+prefix##sample_idx##YOffset = anv_samples[sample_idx].offs_y;
+
+#define SET_SAMPLE_POS_2X(prefix) \
+SET_SAMPLE_POS(prefix, 0); \
+SET_SAMPLE_POS(prefix, 1);
+
+#define SET_SAMPLE_POS_4X(prefix) \
+SET_SAMPLE_POS(prefix, 0); \
+SET_SAMPLE_POS(prefix, 1); \
+SET_SAMPLE_POS(prefix, 2); \
+SET_SAMPLE_POS(prefix, 3);
+
+#define SET_SAMPLE_POS_8X(prefix) \
+SET_SAMPLE_POS(prefix, 0); \
+SET_SAMPLE_POS(prefix, 1); \
+SET_SAMPLE_POS(prefix, 2); \
+SET_SAMPLE_POS(prefix, 3); \
+SET_SAMPLE_POS(prefix, 4); \
+SET_SAMPLE_POS(prefix, 5); \
+SET_SAMPLE_POS(prefix, 6); \
+SET_SAMPLE_POS(prefix, 7);
+
+#define SET_SAMPLE_POS_16X(prefix) \
+SET_SAMPLE_POS(prefix, 0); \
+SET_SAMPLE_POS(prefix, 1); \
+SET_SAMPLE_POS(prefix, 2); \
+SET_SAMPLE_POS(prefix, 3); \
+SET_SAMPLE_POS(prefix, 4); \
+SET_SAMPLE_POS(prefix, 5); \
+SET_SAMPLE_POS(prefix, 6); \
+SET_SAMPLE_POS(prefix, 7); \
+SET_SAMPLE_POS(prefix, 8); \
+SET_SAMPLE_POS(prefix, 9); \
+SET_SAMPLE_POS(prefix, 10); \
+SET_SAMPLE_POS(prefix, 11); \
+SET_SAMPLE_POS(prefix, 12); \
+SET_SAMPLE_POS(prefix, 13); \
+SET_SAMPLE_POS(prefix, 14); \
+SET_SAMPLE_POS(prefix, 15);
+
 #endif /* GEN_SAMPLE_POSITIONS_H */
diff --git a/src/intel/vulkan/anv_genX.h b/src/intel/vulkan/anv_genX.h
index 8fd32cabf1e..52415c04a45 100644
--- a/src/intel/vulkan/anv_genX.h
+++ b/src/intel/vulkan/anv_genX.h
@@ -88,3 +88,8 @@ void genX(cmd_buffer_mi_memset)(struct anv_cmd_buffer *cmd_buffer,

 void genX(blorp_exec)(struct blorp_batch *batch,
                       const struct blorp_params *params);
+
+void genX(emit_sample_locations)(struct anv_batch *batch,
+                                 uint32_t num_samples,
+                                 const VkSampleLocationsInfoEXT *sl,
+                                 bool custom_locations);
diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
index 5905299e59d..981956e5706 100644
--- a/src/intel/vulkan/anv_private.h
+++ b/src/intel/vulkan/anv_private.h
@@ -71,6 +71,7 @@ struct anv_buffer;
 struct anv_buffer_view;
 struct anv_image_view;
 struct anv_instance;
+struct anv_sample;

 struct gen_l3_config;

@@ -165,6 +166,7 @@ struct gen_l3_config;
 #define MAX_PUSH_DESCRIPTORS 32 /* Minimum requirement */
 #define MAX_INLINE_UNIFORM_BLOCK_SIZE 4096
 #define MAX_INLINE_UNIFORM_BLOCK_DESCRIPTORS 32
+#define MAX_SAMPLE_LOCATIONS 16

 /* The kernel relocation API has a limitation of a 32-bit delta value
  * applied to the address before it is written which, in spite of it being
@@ -2086,6 +2088,13 @@ struct anv_push_constants {
    struct brw_image_param images[MAX_GEN8_IMAGES];
 };

+struct
+anv_sample {
+   float offs_x;
+   float offs_y;
+   float radius;
+};
+
 struct anv_dynamic_state {
    struct {
       uint32_t count;
diff --git a/src/intel/vulkan/anv_sample_locations.c b/src/intel/vulkan/anv_sample_locations.c
index 1ebf280e05b..c660cb5ae84 100644
--- a/src/intel/vulkan/anv_sample_locations.c
+++ b/src/intel/vulkan/anv_sample_locations.c
@@ -21,7 +21,7 @@
  * IN THE SOFTWARE.
  */

-#include "anv_private.h"
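Since the SET_SAMPLE_POS macros depend on token pasting into field names that genxml generates, a small standalone demonstration may help. The sketch below copies the SET_SAMPLE_POS and SET_SAMPLE_POS_2X definitions from the patch; `sketch_sample`, `sketch_multisample` and `sketch_fill_2x` are hypothetical stand-ins (the real packet structs have many more fields), and the macro still expects an array named `anv_samples` to be in scope, exactly as in the patch.

```c
#include <assert.h>

/* Hypothetical reduced version of struct anv_sample. */
struct sketch_sample {
   float offs_x;
   float offs_y;
};

/* Hypothetical stand-in for a genxml packet struct with two samples. */
struct sketch_multisample {
   float Sample0XOffset, Sample0YOffset;
   float Sample1XOffset, Sample1YOffset;
};

/* Same definitions as in the patch: "prefix" ends in a partial field name
 * (e.g. ms.Sample) and ## pastes the index and the XOffset/YOffset suffix
 * onto it. */
#define SET_SAMPLE_POS(prefix, sample_idx) \
   prefix##sample_idx##XOffset = anv_samples[sample_idx].offs_x; \
   prefix##sample_idx##YOffset = anv_samples[sample_idx].offs_y;

#define SET_SAMPLE_POS_2X(prefix) \
   SET_SAMPLE_POS(prefix, 0); \
   SET_SAMPLE_POS(prefix, 1);

/* Fills a 2x packet from an array of user-provided locations. */
static void
sketch_fill_2x(struct sketch_multisample *out,
               const struct sketch_sample *anv_samples)
{
   struct sketch_multisample ms = { 0 };

   /* Expands to ms.Sample0XOffset = anv_samples[0].offs_x; and so on. */
   SET_SAMPLE_POS_2X(ms.Sample);

   *out = ms;
}
```

The hidden dependency on a variable called `anv_samples` is the main thing to keep in mind when calling these macros from a new function.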
[Mesa-dev] [PATCH v2 1/9] anv: Added the VK_EXT_sample_locations extension to the anv_extensions list
Added the VK_EXT_sample_locations to the anv_extensions.py list to
generate the related entrypoints.
---
 src/intel/vulkan/anv_extensions.py | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/intel/vulkan/anv_extensions.py b/src/intel/vulkan/anv_extensions.py
index 6fff293dee4..9e4e03e46df 100644
--- a/src/intel/vulkan/anv_extensions.py
+++ b/src/intel/vulkan/anv_extensions.py
@@ -129,6 +129,7 @@ EXTENSIONS = [
     Extension('VK_EXT_inline_uniform_block', 1, True),
     Extension('VK_EXT_pci_bus_info', 2, True),
     Extension('VK_EXT_post_depth_coverage', 1, 'device->info.gen >= 9'),
+    Extension('VK_EXT_sample_locations', 1, False),
     Extension('VK_EXT_sampler_filter_minmax', 1, 'device->info.gen >= 9'),
     Extension('VK_EXT_scalar_block_layout', 1, True),
     Extension('VK_EXT_shader_stencil_export', 1, 'device->info.gen >= 9'),
-- 
2.20.1
[Mesa-dev] [PATCH 9/9] anv: Enabled the VK_EXT_sample_locations extension
Enabled the VK_EXT_sample_locations for Intel Gen >= 7.
---
 src/intel/vulkan/anv_extensions.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/intel/vulkan/anv_extensions.py b/src/intel/vulkan/anv_extensions.py
index 9e4e03e46df..99007544732 100644
--- a/src/intel/vulkan/anv_extensions.py
+++ b/src/intel/vulkan/anv_extensions.py
@@ -129,7 +129,7 @@ EXTENSIONS = [
     Extension('VK_EXT_inline_uniform_block', 1, True),
     Extension('VK_EXT_pci_bus_info', 2, True),
     Extension('VK_EXT_post_depth_coverage', 1, 'device->info.gen >= 9'),
-    Extension('VK_EXT_sample_locations', 1, False),
+    Extension('VK_EXT_sample_locations', 1, 'device->info.gen >= 7'),
     Extension('VK_EXT_sampler_filter_minmax', 1, 'device->info.gen >= 9'),
     Extension('VK_EXT_scalar_block_layout', 1, True),
     Extension('VK_EXT_shader_stencil_export', 1, 'device->info.gen >= 9'),
--
2.20.1
[Mesa-dev] [PATCH 8/9] anv: Removed unused header file
src/intel/vulkan/genX_blorp_exec.c included the file
common/gen_sample_positions.h but did not use it. Removed the include.
---
 src/intel/vulkan/genX_blorp_exec.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/src/intel/vulkan/genX_blorp_exec.c b/src/intel/vulkan/genX_blorp_exec.c
index e9c85d56d5f..0eeefaaa9d6 100644
--- a/src/intel/vulkan/genX_blorp_exec.c
+++ b/src/intel/vulkan/genX_blorp_exec.c
@@ -31,7 +31,6 @@
 #undef __gen_combine_address

 #include "common/gen_l3_config.h"
-#include "common/gen_sample_positions.h"
 #include "blorp/blorp_genX_exec.h"

 static void *
--
2.20.1
[Mesa-dev] [PATCH 7/9] anv: Optimized the emission of the default locations on Gen8+
We only emit sample locations when the extension is enabled by the
user. In all other cases the default locations are emitted once when
the device is initialized to increase performance.
---
 src/intel/vulkan/anv_genX.h        | 3 ++-
 src/intel/vulkan/genX_cmd_buffer.c | 2 +-
 src/intel/vulkan/genX_pipeline.c   | 11 +++
 src/intel/vulkan/genX_state.c      | 8 +---
 4 files changed, 15 insertions(+), 9 deletions(-)

diff --git a/src/intel/vulkan/anv_genX.h b/src/intel/vulkan/anv_genX.h
index e82d83465ef..7f33a2b0a68 100644
--- a/src/intel/vulkan/anv_genX.h
+++ b/src/intel/vulkan/anv_genX.h
@@ -93,4 +93,5 @@ void genX(emit_ms_state)(struct anv_batch *batch,
                          struct anv_sample *anv_samples,
                          uint32_t num_samples,
                          uint32_t log2_samples,
-                         bool custom_sample_locations);
+                         bool custom_sample_locations,
+                         bool sample_locations_ext_enabled);
diff --git a/src/intel/vulkan/genX_cmd_buffer.c b/src/intel/vulkan/genX_cmd_buffer.c
index 4752c66f350..ae7c5a80a3c 100644
--- a/src/intel/vulkan/genX_cmd_buffer.c
+++ b/src/intel/vulkan/genX_cmd_buffer.c
@@ -2654,7 +2654,7 @@ cmd_buffer_emit_sample_locations(struct anv_cmd_buffer *cmd_buffer)
    anv_samples = cmd_buffer->state.gfx.dynamic.sample_locations.anv_samples;

    genX(emit_ms_state)(&cmd_buffer->batch, anv_samples, samples,
-                       log2_samples, true);
+                       log2_samples, true, true);
 }

 void
diff --git a/src/intel/vulkan/genX_pipeline.c b/src/intel/vulkan/genX_pipeline.c
index 8afc08f0320..12adfa65da8 100644
--- a/src/intel/vulkan/genX_pipeline.c
+++ b/src/intel/vulkan/genX_pipeline.c
@@ -573,10 +573,12 @@ emit_sample_mask(struct anv_pipeline *pipeline,
 }

 static void
-emit_ms_state(struct anv_pipeline *pipeline,
+emit_ms_state(struct anv_device *device,
+              struct anv_pipeline *pipeline,
               const VkPipelineMultisampleStateCreateInfo *info,
               const VkPipelineDynamicStateCreateInfo *dinfo)
 {
+   bool sample_loc_enabled = device->enabled_extensions.EXT_sample_locations;
    struct anv_sample anv_samples[MAX_SAMPLE_LOCATIONS];
    VkSampleLocationsInfoEXT *sl;
    bool custom_locations = false;
@@ -588,7 +590,7 @@ emit_ms_state(struct anv_pipeline *pipeline,
    if (info) {
       samples = info->rasterizationSamples;

-      if (info->pNext) {
+      if (sample_loc_enabled && info->pNext) {
          VkPipelineSampleLocationsStateCreateInfoEXT *slinfo =
             (VkPipelineSampleLocationsStateCreateInfoEXT *)info->pNext;

@@ -617,7 +619,7 @@ emit_ms_state(struct anv_pipeline *pipeline,
    }

    genX(emit_ms_state)(&pipeline->batch, anv_samples, samples, log2_samples,
-                       custom_locations);
+                       custom_locations, sample_loc_enabled);
 }

 static const uint32_t vk_to_gen_logic_op[] = {
@@ -1947,7 +1949,8 @@ genX(graphics_pipeline_create)(
    assert(pCreateInfo->pRasterizationState);
    emit_rs_state(pipeline, pCreateInfo->pRasterizationState,
                  pCreateInfo->pMultisampleState, pass, subpass);
-   emit_ms_state(pipeline, pCreateInfo->pMultisampleState, pCreateInfo->pDynamicState);
+   emit_ms_state(device, pipeline, pCreateInfo->pMultisampleState,
+                 pCreateInfo->pDynamicState);
    emit_ds_state(pipeline, pCreateInfo->pDepthStencilState, pass, subpass);
    emit_cb_state(pipeline, pCreateInfo->pColorBlendState,
                  pCreateInfo->pMultisampleState);
diff --git a/src/intel/vulkan/genX_state.c b/src/intel/vulkan/genX_state.c
index 804cfab3a56..bc6b5870d8d 100644
--- a/src/intel/vulkan/genX_state.c
+++ b/src/intel/vulkan/genX_state.c
@@ -552,12 +552,14 @@ genX(emit_ms_state)(struct anv_batch *batch,
                     struct anv_sample *anv_samples,
                     uint32_t num_samples,
                     uint32_t log2_samples,
-                    bool custom_sample_locations)
+                    bool custom_sample_locations,
+                    bool sample_locations_ext_enabled)
 {
    emit_multisample(batch, anv_samples, num_samples, log2_samples,
                     custom_sample_locations);
 #if GEN_GEN >= 8
-   emit_sample_locations(batch, anv_samples, num_samples,
-                         custom_sample_locations);
+   if (sample_locations_ext_enabled)
+      emit_sample_locations(batch, anv_samples, num_samples,
+                            custom_sample_locations);
 #endif
 }
--
2.20.1
[Mesa-dev] [PATCH 3/9] anv: Implemented the vkGetPhysicalDeviceMultisamplePropertiesEXT
Implemented the vkGetPhysicalDeviceMultisamplePropertiesEXT according to
the Vulkan Specification section [36.2. Additional Multisampling
Capabilities].
---
 src/intel/Makefile.sources              | 1 +
 src/intel/vulkan/anv_sample_locations.c | 60 +
 src/intel/vulkan/meson.build            | 1 +
 3 files changed, 62 insertions(+)
 create mode 100644 src/intel/vulkan/anv_sample_locations.c

diff --git a/src/intel/Makefile.sources b/src/intel/Makefile.sources
index a5c8828a6b6..a0873c7ccc2 100644
--- a/src/intel/Makefile.sources
+++ b/src/intel/Makefile.sources
@@ -251,6 +251,7 @@ VULKAN_FILES := \
         vulkan/anv_pipeline_cache.c \
         vulkan/anv_private.h \
         vulkan/anv_queue.c \
+        vulkan/anv_sample_locations.c \
         vulkan/anv_util.c \
         vulkan/anv_wsi.c \
         vulkan/vk_format_info.h
diff --git a/src/intel/vulkan/anv_sample_locations.c b/src/intel/vulkan/anv_sample_locations.c
new file mode 100644
index 00000000000..1ebf280e05b
--- /dev/null
+++ b/src/intel/vulkan/anv_sample_locations.c
@@ -0,0 +1,60 @@
+/*
+ * Copyright © 2019 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ */
+
+#include "anv_private.h"
+
+void
+anv_GetPhysicalDeviceMultisamplePropertiesEXT(VkPhysicalDevice physicalDevice,
+                                              VkSampleCountFlagBits samples,
+                                              VkMultisamplePropertiesEXT
+                                              *pMultisampleProperties)
+{
+   ANV_FROM_HANDLE(anv_physical_device, physical_device, physicalDevice);
+   const struct gen_device_info *devinfo = &physical_device->info;
+
+   VkExtent2D grid_size;
+   switch (samples) {
+   case VK_SAMPLE_COUNT_2_BIT:
+   case VK_SAMPLE_COUNT_4_BIT:
+   case VK_SAMPLE_COUNT_8_BIT:
+      grid_size.width = SAMPLE_LOC_GRID_W;
+      grid_size.height = SAMPLE_LOC_GRID_H;
+      break;
+
+   case VK_SAMPLE_COUNT_16_BIT:
+      if (devinfo->gen >= 9) {
+         grid_size.width = SAMPLE_LOC_GRID_W;
+         grid_size.height = SAMPLE_LOC_GRID_H;
+         break;
+      }
+   default:
+      grid_size.width = grid_size.height = 0;
+      break;
+   };
+
+   *pMultisampleProperties = (VkMultisamplePropertiesEXT) {
+      .sType = VK_STRUCTURE_TYPE_MULTISAMPLE_PROPERTIES_EXT,
+      .pNext = NULL,
+      .maxSampleLocationGridSize = grid_size
+   };
+}
diff --git a/src/intel/vulkan/meson.build b/src/intel/vulkan/meson.build
index 7fa43a6ad79..3f78757c774 100644
--- a/src/intel/vulkan/meson.build
+++ b/src/intel/vulkan/meson.build
@@ -135,6 +135,7 @@ libanv_files = files(
   'anv_pipeline_cache.c',
   'anv_private.h',
   'anv_queue.c',
+  'anv_sample_locations.c',
   'anv_util.c',
   'anv_wsi.c',
   'vk_format_info.h',
--
2.20.1
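The grid-size selection in patch 3/9 relies on the 16x case falling through to the default when the hardware is older than Gen9. The following is a minimal standalone sketch of the same decision, written without the fall-through. The enum, the extent2d struct and max_grid_size() are hypothetical stand-ins so that the snippet builds without the Vulkan headers; only SAMPLE_LOC_GRID_W/H and the Gen9 condition are taken from the patches.

```c
/* Hypothetical stand-ins for the VK_SAMPLE_COUNT_*_BIT values. */
enum sample_count {
   COUNT_1 = 1, COUNT_2 = 2, COUNT_4 = 4, COUNT_8 = 8, COUNT_16 = 16
};

/* Same values as the defines added to anv_private.h in patch 2/9. */
#define SAMPLE_LOC_GRID_W 1
#define SAMPLE_LOC_GRID_H 1

/* Hypothetical replacement for VkExtent2D. */
struct extent2d {
   unsigned width;
   unsigned height;
};

/* Returns the maximum sample location grid for a sample count on a given
 * hardware generation. A 0x0 grid means custom locations are unsupported
 * for that count.
 */
static struct extent2d
max_grid_size(enum sample_count samples, int gen)
{
   struct extent2d grid = { 0, 0 };

   switch (samples) {
   case COUNT_2:
   case COUNT_4:
   case COUNT_8:
      grid.width = SAMPLE_LOC_GRID_W;
      grid.height = SAMPLE_LOC_GRID_H;
      break;
   case COUNT_16:
      /* 16x is only accepted from Gen9 on; older hardware keeps 0x0. */
      if (gen >= 9) {
         grid.width = SAMPLE_LOC_GRID_W;
         grid.height = SAMPLE_LOC_GRID_H;
      }
      break;
   default:
      break;
   }

   return grid;
}
```

Calling max_grid_size(COUNT_16, 8) yields a 0x0 grid, while the same count on Gen9 yields 1x1, which matches what the switch in the patch computes.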
[Mesa-dev] [PATCH 6/9] anv: Added support for dynamic and non-dynamic sample locations on Gen7
Allow setting dynamic and non-dynamic sample locations on Gen7.
---
 src/intel/vulkan/anv_genX.h        | 13 ++---
 src/intel/vulkan/genX_cmd_buffer.c | 9 ++--
 src/intel/vulkan/genX_pipeline.c   | 13 +
 src/intel/vulkan/genX_state.c      | 86 +-
 4 files changed, 70 insertions(+), 51 deletions(-)

diff --git a/src/intel/vulkan/anv_genX.h b/src/intel/vulkan/anv_genX.h
index f84fe457152..e82d83465ef 100644
--- a/src/intel/vulkan/anv_genX.h
+++ b/src/intel/vulkan/anv_genX.h
@@ -89,11 +89,8 @@ void genX(cmd_buffer_mi_memset)(struct anv_cmd_buffer *cmd_buffer,
 void genX(blorp_exec)(struct blorp_batch *batch,
                       const struct blorp_params *params);

-void genX(emit_multisample)(struct anv_batch *batch,
-                            uint32_t samples,
-                            uint32_t log2_samples);
-
-void genX(emit_sample_locations)(struct anv_batch *batch,
-                                 const struct anv_sample *anv_samples,
-                                 uint32_t num_samples,
-                                 bool custom_locations);
+void genX(emit_ms_state)(struct anv_batch *batch,
+                         struct anv_sample *anv_samples,
+                         uint32_t num_samples,
+                         uint32_t log2_samples,
+                         bool custom_sample_locations);
diff --git a/src/intel/vulkan/genX_cmd_buffer.c b/src/intel/vulkan/genX_cmd_buffer.c
index 9229df84caa..4752c66f350 100644
--- a/src/intel/vulkan/genX_cmd_buffer.c
+++ b/src/intel/vulkan/genX_cmd_buffer.c
@@ -2643,8 +2643,7 @@ cmd_buffer_flush_push_constants(struct anv_cmd_buffer *cmd_buffer,
 static void
 cmd_buffer_emit_sample_locations(struct anv_cmd_buffer *cmd_buffer)
 {
-#if GEN_GEN >= 8
-   const struct anv_sample *anv_samples;
+   struct anv_sample *anv_samples;
    uint32_t log2_samples;
    uint32_t samples;

@@ -2654,10 +2653,8 @@ cmd_buffer_emit_sample_locations(struct anv_cmd_buffer *cmd_buffer)
    log2_samples = __builtin_ffs(samples) - 1;
    anv_samples = cmd_buffer->state.gfx.dynamic.sample_locations.anv_samples;

-   genX(emit_multisample)(&cmd_buffer->batch, samples, log2_samples);
-   genX(emit_sample_locations)(&cmd_buffer->batch, anv_samples, samples,
-                               true);
-#endif
+   genX(emit_ms_state)(&cmd_buffer->batch, anv_samples, samples,
+                       log2_samples, true);
 }

 void
diff --git a/src/intel/vulkan/genX_pipeline.c b/src/intel/vulkan/genX_pipeline.c
index fa42e622077..8afc08f0320 100644
--- a/src/intel/vulkan/genX_pipeline.c
+++ b/src/intel/vulkan/genX_pipeline.c
@@ -577,12 +577,9 @@ emit_ms_state(struct anv_pipeline *pipeline,
               const VkPipelineMultisampleStateCreateInfo *info,
               const VkPipelineDynamicStateCreateInfo *dinfo)
 {
-#if GEN_GEN >= 8
    struct anv_sample anv_samples[MAX_SAMPLE_LOCATIONS];
    VkSampleLocationsInfoEXT *sl;
    bool custom_locations = false;
-#endif
-
    uint32_t samples = 1;
    uint32_t log2_samples = 0;

@@ -591,7 +588,6 @@ emit_ms_state(struct anv_pipeline *pipeline,
    if (info) {
       samples = info->rasterizationSamples;

-#if GEN_GEN >= 8
       if (info->pNext) {
          VkPipelineSampleLocationsStateCreateInfoEXT *slinfo =
             (VkPipelineSampleLocationsStateCreateInfoEXT *)info->pNext;
@@ -616,17 +612,12 @@ emit_ms_state(struct anv_pipeline *pipeline,
             }
          }
       }
-#endif

       log2_samples = __builtin_ffs(samples) - 1;
    }

-   genX(emit_multisample(&pipeline->batch, samples, log2_samples));
-
-#if GEN_GEN >= 8
-   genX(emit_sample_locations)(&pipeline->batch, anv_samples, samples,
-                               custom_locations);
-#endif
+   genX(emit_ms_state)(&pipeline->batch, anv_samples, samples, log2_samples,
+                       custom_locations);
 }

 static const uint32_t vk_to_gen_logic_op[] = {
diff --git a/src/intel/vulkan/genX_state.c b/src/intel/vulkan/genX_state.c
index 44cfc925ed5..804cfab3a56 100644
--- a/src/intel/vulkan/genX_state.c
+++ b/src/intel/vulkan/genX_state.c
@@ -437,10 +437,12 @@ VkResult genX(CreateSampler)(
    return VK_SUCCESS;
 }

-void
-genX(emit_multisample)(struct anv_batch *batch,
-                       uint32_t samples,
-                       uint32_t log2_samples)
+static void
+emit_multisample(struct anv_batch *batch,
+                 const struct anv_sample *anv_samples,
+                 uint32_t samples,
+                 uint32_t log2_samples,
+                 bool custom_locations)
 {
    anv_batch_emit(batch, GENX(3DSTATE_MULTISAMPLE), ms) {
       ms.NumberofMultisamples = log2_samples;
@@ -453,34 +455,52 @@ genX(emit_multisample)(struct anv_batch *batch,
        */
       ms.PixelPositionOffsetEnable = false;
 #else
-      switch (samples) {
-      case 1:
-         GEN_SAMPLE_POS_1X(ms.Sample);
-         break;
-      case 2:
-         GEN_SAMPLE_POS_2X(ms.Sample);
-         break;
-      case 4:
-         GEN_SAMPLE_POS_4X(ms.Sample);
-         break;
-
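Both the pipeline and the command buffer paths in this series derive log2_samples with __builtin_ffs(samples) - 1 before writing ms.NumberofMultisamples. A minimal sketch of that conversion follows, assuming a GCC or Clang compiler (the builtin is a compiler extension). The helper name log2_sample_count() and the power-of-two assertion are additions for illustration and are not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Converts a sample count (1, 2, 4, 8, 16) to its base-2 logarithm.
 * __builtin_ffs() returns the 1-based index of the lowest set bit, so
 * for a power of two the result minus one is exactly log2.
 */
static uint32_t
log2_sample_count(uint32_t samples)
{
   /* Sample counts are expected to be non-zero powers of two. */
   assert(samples > 0 && (samples & (samples - 1)) == 0);

   return (uint32_t)__builtin_ffs((int)samples) - 1;
}
```

For the counts used here the mapping is 1 -> 0, 2 -> 1, 4 -> 2, 8 -> 3 and 16 -> 4.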
[Mesa-dev] [PATCH 1/9] anv: Added the VK_EXT_sample_locations extension to the anv_extensions list
Added the VK_EXT_sample_locations to the anv_extensions.py list to
generate the related entrypoints.
---
 src/intel/vulkan/anv_extensions.py | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/intel/vulkan/anv_extensions.py b/src/intel/vulkan/anv_extensions.py
index 6fff293dee4..9e4e03e46df 100644
--- a/src/intel/vulkan/anv_extensions.py
+++ b/src/intel/vulkan/anv_extensions.py
@@ -129,6 +129,7 @@ EXTENSIONS = [
     Extension('VK_EXT_inline_uniform_block', 1, True),
     Extension('VK_EXT_pci_bus_info', 2, True),
     Extension('VK_EXT_post_depth_coverage', 1, 'device->info.gen >= 9'),
+    Extension('VK_EXT_sample_locations', 1, False),
     Extension('VK_EXT_sampler_filter_minmax', 1, 'device->info.gen >= 9'),
     Extension('VK_EXT_scalar_block_layout', 1, True),
     Extension('VK_EXT_shader_stencil_export', 1, 'device->info.gen >= 9'),
--
2.20.1
[Mesa-dev] [PATCH 5/9] anv: Added support for dynamic sample locations on Gen8+
Added support for setting the locations when the pipeline has been
created with the dynamic state bit enabled according to the Vulkan
Specification section [26.5. Custom Sample Locations] for the function:

'vkCmdSetSampleLocationsEXT'

The reason that we preferred to store the boolean valid inside the
dynamic state struct for locations instead of using a dirty bit
(ANV_CMD_DIRTY_SAMPLE_LOCATIONS for example) is that other functions
can modify the value of the dirty bits causing unexpected behavior.
---
 src/intel/vulkan/anv_cmd_buffer.c  | 19
 src/intel/vulkan/anv_genX.h        | 6 +++-
 src/intel/vulkan/anv_private.h     | 6
 src/intel/vulkan/genX_cmd_buffer.c | 27 ++
 src/intel/vulkan/genX_pipeline.c   | 46 --
 src/intel/vulkan/genX_state.c      | 41 +++---
 6 files changed, 99 insertions(+), 46 deletions(-)

diff --git a/src/intel/vulkan/anv_cmd_buffer.c b/src/intel/vulkan/anv_cmd_buffer.c
index 1b34644a434..101c1375430 100644
--- a/src/intel/vulkan/anv_cmd_buffer.c
+++ b/src/intel/vulkan/anv_cmd_buffer.c
@@ -28,6 +28,7 @@
 #include

 #include "anv_private.h"
+#include "anv_sample_locations.h"
 #include "vk_format_info.h"
 #include "vk_util.h"

@@ -558,6 +559,24 @@ void anv_CmdSetStencilReference(
    cmd_buffer->state.gfx.dirty |= ANV_CMD_DIRTY_DYNAMIC_STENCIL_REFERENCE;
 }

+void
+anv_CmdSetSampleLocationsEXT(VkCommandBuffer commandBuffer,
+                             const VkSampleLocationsInfoEXT *pSampleLocationsInfo)
+{
+   ANV_FROM_HANDLE(anv_cmd_buffer, cmd_buffer, commandBuffer);
+   assert(pSampleLocationsInfo);
+
+   struct anv_dynamic_state *dyn_state = &cmd_buffer->state.gfx.dynamic;
+   dyn_state->sample_locations.num_samples =
+      pSampleLocationsInfo->sampleLocationsPerPixel;
+
+   anv_calc_sample_locations(dyn_state->sample_locations.anv_samples,
+                             dyn_state->sample_locations.num_samples,
+                             pSampleLocationsInfo);
+
+   cmd_buffer->state.gfx.dynamic.sample_locations.valid = true;
+}
+
 static void
 anv_cmd_buffer_bind_descriptor_set(struct anv_cmd_buffer *cmd_buffer,
                                    VkPipelineBindPoint bind_point,
diff --git a/src/intel/vulkan/anv_genX.h b/src/intel/vulkan/anv_genX.h
index 52415c04a45..f84fe457152 100644
--- a/src/intel/vulkan/anv_genX.h
+++ b/src/intel/vulkan/anv_genX.h
@@ -89,7 +89,11 @@ void genX(cmd_buffer_mi_memset)(struct anv_cmd_buffer *cmd_buffer,
 void genX(blorp_exec)(struct blorp_batch *batch,
                       const struct blorp_params *params);

+void genX(emit_multisample)(struct anv_batch *batch,
+                            uint32_t samples,
+                            uint32_t log2_samples);
+
 void genX(emit_sample_locations)(struct anv_batch *batch,
+                                 const struct anv_sample *anv_samples,
                                  uint32_t num_samples,
-                                 const VkSampleLocationsInfoEXT *sl,
                                  bool custom_locations);
diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
index 981956e5706..a2e1756cd99 100644
--- a/src/intel/vulkan/anv_private.h
+++ b/src/intel/vulkan/anv_private.h
@@ -2135,6 +2135,12 @@ struct anv_dynamic_state {
       uint32_t front;
       uint32_t back;
    } stencil_reference;
+
+   struct {
+      struct anv_sample anv_samples[MAX_SAMPLE_LOCATIONS];
+      uint32_t num_samples;
+      bool valid;
+   } sample_locations;
 };

 extern const struct anv_dynamic_state default_dynamic_state;
diff --git a/src/intel/vulkan/genX_cmd_buffer.c b/src/intel/vulkan/genX_cmd_buffer.c
index 7687507e6b7..9229df84caa 100644
--- a/src/intel/vulkan/genX_cmd_buffer.c
+++ b/src/intel/vulkan/genX_cmd_buffer.c
@@ -25,11 +25,13 @@
 #include

 #include "anv_private.h"
+#include "anv_sample_locations.h"
 #include "vk_format_info.h"
 #include "vk_util.h"
 #include "util/fast_idiv_by_const.h"

 #include "common/gen_l3_config.h"
+#include "common/gen_sample_positions.h"
 #include "genxml/gen_macros.h"
 #include "genxml/genX_pack.h"

@@ -2638,6 +2640,26 @@ cmd_buffer_flush_push_constants(struct anv_cmd_buffer *cmd_buffer,
    cmd_buffer->state.push_constants_dirty &= ~flushed;
 }

+static void
+cmd_buffer_emit_sample_locations(struct anv_cmd_buffer *cmd_buffer)
+{
+#if GEN_GEN >= 8
+   const struct anv_sample *anv_samples;
+   uint32_t log2_samples;
+   uint32_t samples;
+
+   samples = cmd_buffer->state.gfx.dynamic.sample_locations.num_samples;
+   assert(samples > 0);
+
+   log2_samples = __builtin_ffs(samples) - 1;
+   anv_samples = cmd_buffer->state.gfx.dynamic.sample_locations.anv_samples;
+
+   genX(emit_multisample)(&cmd_buffer->batch, samples, log2_samples);
+   genX(emit_sample_locations)(&cmd_buffer->batch,
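The commit message of patch 5/9 explains why a private valid flag is stored next to the sample locations instead of a shared dirty bit. A reduced sketch of that pattern could look as follows. The struct and both function names are hypothetical, and clearing the flag after the emission is an assumption modelled on the flush code quoted from a later revision of this series, not something this truncated diff shows.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, reduced version of the sample_locations dynamic state. */
struct sample_locations_state {
   uint32_t num_samples;
   bool valid;
};

/* Models the vkCmdSetSampleLocationsEXT entrypoint: record the new state
 * and mark it as pending.
 */
static void
set_sample_locations(struct sample_locations_state *s, uint32_t num_samples)
{
   s->num_samples = num_samples;
   s->valid = true;
}

/* Models the flush at draw time. Returns true when the packets would be
 * emitted. Only this function clears the flag, so no unrelated code path
 * can reset it by accident.
 */
static bool
flush_sample_locations(struct sample_locations_state *s)
{
   if (!s->valid)
      return false;

   /* The 3DSTATE_MULTISAMPLE / sample pattern emission would go here. */
   s->valid = false;
   return true;
}
```

With this shape, a second flush without an intervening set call emits nothing, which is the behavior a shared dirty mask cannot guarantee if other functions rewrite it.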
[Mesa-dev] [PATCH 4/9] anv: Added support for non-dynamic sample locations on Gen8+
Allowing the user to set custom sample locations non-dynamically, by filling the extension structs and chaining them to the pipeline structs according to the Vulkan specification section [26.5. Custom Sample Locations] for the following structures: 'VkPipelineSampleLocationsStateCreateInfoEXT' 'VkSampleLocationsInfoEXT' 'VkSampleLocationEXT' Once custom locations are used, the default locations are lost and need to be re-emitted again in the next pipeline creation. For that, we emit the 3DSTATE_SAMPLE_PATTERN at every pipeline creation. --- src/intel/common/gen_sample_positions.h | 53 src/intel/vulkan/anv_genX.h | 5 ++ src/intel/vulkan/anv_private.h | 9 +++ src/intel/vulkan/anv_sample_locations.c | 38 +++- src/intel/vulkan/anv_sample_locations.h | 29 + src/intel/vulkan/genX_pipeline.c| 80 + src/intel/vulkan/genX_state.c | 59 ++ 7 files changed, 259 insertions(+), 14 deletions(-) create mode 100644 src/intel/vulkan/anv_sample_locations.h diff --git a/src/intel/common/gen_sample_positions.h b/src/intel/common/gen_sample_positions.h index da48dcb5ed0..e8af2a552dc 100644 --- a/src/intel/common/gen_sample_positions.h +++ b/src/intel/common/gen_sample_positions.h @@ -160,4 +160,57 @@ prefix##14YOffset = 0.9375; \ prefix##15XOffset = 0.0625; \ prefix##15YOffset = 0.; +/* Examples: + * in case of GEN_GEN < 8: + * SET_SAMPLE_POS(ms.Sample, 0); expands to: + *ms.Sample0XOffset = anv_samples[0].offs_x; + *ms.Sample0YOffset = anv_samples[0].offs_y; + * + * in case of GEN_GEN >= 8: + * SET_SAMPLE_POS(sp._16xSample, 0); expands to: + *sp._16xSample0XOffset = anv_samples[0].offs_x; + *sp._16xSample0YOffset = anv_samples[0].offs_y; + */ +#define SET_SAMPLE_POS(prefix, sample_idx) \ +prefix##sample_idx##XOffset = anv_samples[sample_idx].offs_x; \ +prefix##sample_idx##YOffset = anv_samples[sample_idx].offs_y; + +#define SET_SAMPLE_POS_2X(prefix) \ +SET_SAMPLE_POS(prefix, 0); \ +SET_SAMPLE_POS(prefix, 1); + +#define SET_SAMPLE_POS_4X(prefix) \ +SET_SAMPLE_POS(prefix, 0); \ 
+SET_SAMPLE_POS(prefix, 1); \ +SET_SAMPLE_POS(prefix, 2); \ +SET_SAMPLE_POS(prefix, 3); + +#define SET_SAMPLE_POS_8X(prefix) \ +SET_SAMPLE_POS(prefix, 0); \ +SET_SAMPLE_POS(prefix, 1); \ +SET_SAMPLE_POS(prefix, 2); \ +SET_SAMPLE_POS(prefix, 3); \ +SET_SAMPLE_POS(prefix, 4); \ +SET_SAMPLE_POS(prefix, 5); \ +SET_SAMPLE_POS(prefix, 6); \ +SET_SAMPLE_POS(prefix, 7); + +#define SET_SAMPLE_POS_16X(prefix) \ +SET_SAMPLE_POS(prefix, 0); \ +SET_SAMPLE_POS(prefix, 1); \ +SET_SAMPLE_POS(prefix, 2); \ +SET_SAMPLE_POS(prefix, 3); \ +SET_SAMPLE_POS(prefix, 4); \ +SET_SAMPLE_POS(prefix, 5); \ +SET_SAMPLE_POS(prefix, 6); \ +SET_SAMPLE_POS(prefix, 7); \ +SET_SAMPLE_POS(prefix, 8); \ +SET_SAMPLE_POS(prefix, 9); \ +SET_SAMPLE_POS(prefix, 10); \ +SET_SAMPLE_POS(prefix, 11); \ +SET_SAMPLE_POS(prefix, 12); \ +SET_SAMPLE_POS(prefix, 13); \ +SET_SAMPLE_POS(prefix, 14); \ +SET_SAMPLE_POS(prefix, 15); + #endif /* GEN_SAMPLE_POSITIONS_H */ diff --git a/src/intel/vulkan/anv_genX.h b/src/intel/vulkan/anv_genX.h index 8fd32cabf1e..52415c04a45 100644 --- a/src/intel/vulkan/anv_genX.h +++ b/src/intel/vulkan/anv_genX.h @@ -88,3 +88,8 @@ void genX(cmd_buffer_mi_memset)(struct anv_cmd_buffer *cmd_buffer, void genX(blorp_exec)(struct blorp_batch *batch, const struct blorp_params *params); + +void genX(emit_sample_locations)(struct anv_batch *batch, + uint32_t num_samples, + const VkSampleLocationsInfoEXT *sl, + bool custom_locations); diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h index 5905299e59d..981956e5706 100644 --- a/src/intel/vulkan/anv_private.h +++ b/src/intel/vulkan/anv_private.h @@ -71,6 +71,7 @@ struct anv_buffer; struct anv_buffer_view; struct anv_image_view; struct anv_instance; +struct anv_sample; struct gen_l3_config; @@ -165,6 +166,7 @@ struct gen_l3_config; #define MAX_PUSH_DESCRIPTORS 32 /* Minimum requirement */ #define MAX_INLINE_UNIFORM_BLOCK_SIZE 4096 #define MAX_INLINE_UNIFORM_BLOCK_DESCRIPTORS 32 +#define MAX_SAMPLE_LOCATIONS 16 /* The kernel 
relocation API has a limitation of a 32-bit delta value * applied to the address before it is written which, in spite of it being @@ -2086,6 +2088,13 @@ struct anv_push_constants { struct brw_image_param images[MAX_GEN8_IMAGES]; }; +struct +anv_sample { + float offs_x; + float offs_y; + float radius; +}; + struct anv_dynamic_state { struct { uint32_t count; diff --git a/src/intel/vulkan/anv_sample_locations.c b/src/intel/vulkan/anv_sample_locations.c index 1ebf280e05b..c660cb5ae84 100644 --- a/src/intel/vulkan/anv_sample_locations.c +++ b/src/intel/vulkan/anv_sample_locations.c @@ -21,7 +21,7 @@ * IN THE SOFTWARE. */ -#include "anv_private.h"
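The SET_SAMPLE_POS macros in patch 4/9 build the packet field names by token pasting. The sketch below shows that expansion in isolation. The macro bodies are copied from the patch, while struct fake_multisample and pack_2x() are hypothetical stand-ins for the generated genxml packet struct and its caller, so the field layout here is only an assumption.

```c
/* Same layout as the struct anv_sample added to anv_private.h. */
struct anv_sample {
   float offs_x;
   float offs_y;
   float radius;
};

/* Hypothetical stand-in for a genxml packet with two sample slots. */
struct fake_multisample {
   float Sample0XOffset, Sample0YOffset;
   float Sample1XOffset, Sample1YOffset;
};

/* prefix##sample_idx##XOffset pastes e.g. Sample, 0 and XOffset into the
 * single identifier Sample0XOffset. The macro expects an array named
 * anv_samples to be in scope.
 */
#define SET_SAMPLE_POS(prefix, sample_idx) \
   prefix##sample_idx##XOffset = anv_samples[sample_idx].offs_x; \
   prefix##sample_idx##YOffset = anv_samples[sample_idx].offs_y;

#define SET_SAMPLE_POS_2X(prefix) \
   SET_SAMPLE_POS(prefix, 0); \
   SET_SAMPLE_POS(prefix, 1);

/* Fills both sample slots from the caller-provided locations. */
static struct fake_multisample
pack_2x(const struct anv_sample *anv_samples)
{
   struct fake_multisample ms = { 0 };

   SET_SAMPLE_POS_2X(ms.Sample);

   return ms;
}
```

Because the macro refers to anv_samples by name, every function that uses it has to call its array exactly that, which is why the emit functions in the series take a parameter with this name.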
[Mesa-dev] [PATCH 2/9] anv: Set the values for the VkPhysicalDeviceSampleLocationsPropertiesEXT
The VkPhysicalDeviceSampleLocationsPropertiesEXT struct is filled with
implementation-dependent values, following the table from the Vulkan
Specification section [36.1. Limit Requirements]:

pname                               | max           | min
pname:sampleLocationSampleCounts    | -             | ename:VK_SAMPLE_COUNT_4_BIT
pname:maxSampleLocationGridSize     | -             | (1, 1)
pname:sampleLocationCoordinateRange | (0.0, 0.9375) | (0.0, 0.9375)
pname:sampleLocationSubPixelBits    | -             | 4
pname:variableSampleLocations       | false         | implementation dependent

The hardware only supports setting the same sample location for all the
pixels, so we only support 1x1 grids.

Also, variableSampleLocations is set to false because we don't support
the feature.
---
 src/intel/vulkan/anv_device.c  | 21 +
 src/intel/vulkan/anv_private.h | 3 +++
 2 files changed, 24 insertions(+)

diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
index 729cceb3e32..1e183b7f4ad 100644
--- a/src/intel/vulkan/anv_device.c
+++ b/src/intel/vulkan/anv_device.c
@@ -1401,6 +1401,27 @@ void anv_GetPhysicalDeviceProperties2(
          break;
       }

+      case VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_SAMPLE_LOCATIONS_PROPERTIES_EXT: {
+         VkPhysicalDeviceSampleLocationsPropertiesEXT *props =
+            (VkPhysicalDeviceSampleLocationsPropertiesEXT *)ext;
+         props->sampleLocationSampleCounts = ISL_SAMPLE_COUNT_2_BIT |
+                                             ISL_SAMPLE_COUNT_4_BIT |
+                                             ISL_SAMPLE_COUNT_8_BIT;
+         if (pdevice->info.gen >= 9)
+            props->sampleLocationSampleCounts |= ISL_SAMPLE_COUNT_16_BIT;
+
+         props->maxSampleLocationGridSize.width = SAMPLE_LOC_GRID_W;
+         props->maxSampleLocationGridSize.height = SAMPLE_LOC_GRID_H;
+
+         props->sampleLocationCoordinateRange[0] = 0;
+         props->sampleLocationCoordinateRange[1] = 0.9375;
+         props->sampleLocationSubPixelBits = 4;
+
+         props->variableSampleLocations = false;
+
+         break;
+      }
+
       default:
          anv_debug_ignored_stype(ext->sType);
          break;
diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
index eed282ff985..5905299e59d 100644
--- a/src/intel/vulkan/anv_private.h
+++ b/src/intel/vulkan/anv_private.h
@@ -195,6 +195,9 @@ struct gen_l3_config;

 #define anv_printflike(a, b) __attribute__((__format__(__printf__, a, b)))

+#define SAMPLE_LOC_GRID_W 1
+#define SAMPLE_LOC_GRID_H 1
+
 static inline uint32_t
 align_down_npot_u32(uint32_t v, uint32_t a)
 {
--
2.20.1
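The coordinate range upper bound of 0.9375 and the 4 subpixel bits reported in patch 2/9 are related: with 4 fractional bits the largest value below 1.0 is 15/16. The sketch below works that arithmetic out. Both helper names are hypothetical, and snapping by truncation is an assumption for illustration only, since the series does not show how anv_calc_sample_locations() treats the input.

```c
/* Largest coordinate below 1.0 that is representable with the given
 * number of fractional (subpixel) bits: 1 - 2^-bits.
 */
static float
max_sample_coord(unsigned subpixel_bits)
{
   return 1.0f - 1.0f / (float)(1u << subpixel_bits);
}

/* Clamps a coordinate to [0, max_sample_coord] and snaps it down to the
 * subpixel grid. Truncation is an assumption, not the driver's rule.
 */
static float
snap_sample_coord(float v, unsigned subpixel_bits)
{
   const float scale = (float)(1u << subpixel_bits);
   const float max = max_sample_coord(subpixel_bits);

   if (v < 0.0f)
      v = 0.0f;
   if (v > max)
      v = max;

   return (float)(unsigned)(v * scale) / scale;
}
```

With 4 bits, max_sample_coord() gives exactly 0.9375, and a requested location of 0.3 lands on 0.25 under this truncating assumption.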
[Mesa-dev] [PATCH v4] i965: fixed clamping in set_scissor_bits when the y is flipped
Calculating the scissor rectangle fields with the y flipped (0 on top)
can generate negative values that cause an assertion failure later on,
as the scissor fields are all unsigned. We must clamp the bbox values
again to make sure they don't exceed fb_height. Also fixed a
calculation error.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108999
          https://bugs.freedesktop.org/show_bug.cgi?id=109594

v2:
- I initially clamped the values inside the if (Y is flipped) case
  and made a mistake in the calculation: the clamp of bbox[2] should
  be a check if (bbox[2] >= fbheight) bbox[2] = fbheight - 1 instead,
  and I shouldn't have changed the ScissorRectangleYMax calculation.
  As the fixed code is equivalent to using CLAMP instead of MAX2 at the
  top of the function when bbox[2] and bbox[3] are calculated, and the
  second is clearer, I replaced it. (Nanley Chery)

v3:
- Reversed the CLAMP change in bbox[3] as the API guarantees that the
  viewport height is positive. (Nanley Chery)

v4:
- Added nomination for the mesa-stable branch and the link to the
  second bugzilla bug (Nanley Chery)

CC:
Tested-by: Paul Chelombitko
Reviewed-by: Nanley Chery
---
 src/mesa/drivers/dri/i965/genX_state_upload.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/genX_state_upload.c b/src/mesa/drivers/dri/i965/genX_state_upload.c
index 027dad1e089..73c983ce742 100644
--- a/src/mesa/drivers/dri/i965/genX_state_upload.c
+++ b/src/mesa/drivers/dri/i965/genX_state_upload.c
@@ -2446,7 +2446,7 @@ set_scissor_bits(const struct gl_context *ctx, int i,

    bbox[0] = MAX2(ctx->ViewportArray[i].X, 0);
    bbox[1] = MIN2(bbox[0] + ctx->ViewportArray[i].Width, fb_width);
-   bbox[2] = MAX2(ctx->ViewportArray[i].Y, 0);
+   bbox[2] = CLAMP(ctx->ViewportArray[i].Y, 0, fb_height);
    bbox[3] = MIN2(bbox[2] + ctx->ViewportArray[i].Height, fb_height);
    _mesa_intersect_scissor_bounding_box(ctx, i, bbox);

--
2.20.1
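To make the effect of replacing MAX2 with CLAMP concrete, here is a small standalone sketch of the flipped-Y row computation. The macros are defined locally, the scissor_rows struct and scissor_rows_flipped() are hypothetical, and the call to _mesa_intersect_scissor_bounding_box() is left out, so this only models a viewport with no additional scissor box.

```c
#define MAX2(a, b) ((a) > (b) ? (a) : (b))
#define MIN2(a, b) ((a) < (b) ? (a) : (b))
#define CLAMP(x, lo, hi) ((x) < (lo) ? (lo) : ((x) > (hi) ? (hi) : (x)))

/* Hypothetical result type: either the rectangle is empty, or ymin/ymax
 * hold the values that would go into the unsigned scissor fields.
 */
struct scissor_rows {
   int empty;
   int ymin;
   int ymax;
};

/* Computes the flipped-Y (memory: Y=0=top) scissor rows for a viewport.
 * use_clamp selects the patched CLAMP behavior over the old MAX2 one.
 */
static struct scissor_rows
scissor_rows_flipped(int vp_y, int vp_height, int fb_height, int use_clamp)
{
   struct scissor_rows r = { 0, 0, 0 };
   int y0 = use_clamp ? CLAMP(vp_y, 0, fb_height) : MAX2(vp_y, 0);
   int y1 = MIN2(y0 + vp_height, fb_height);

   /* Mirrors the bbox[2] == bbox[3] check that follows in the driver. */
   if (y0 == y1) {
      r.empty = 1;
      return r;
   }

   r.ymin = fb_height - y1;
   r.ymax = fb_height - y0 - 1;
   return r;
}
```

For a viewport at Y = 150 with height 50 on a framebuffer of height 100, the MAX2 variant gives y0 = 150 and y1 = 100, so ymax becomes 100 - 150 - 1 = -51, which cannot be stored in an unsigned field. With CLAMP both values become 100 and the rectangle is handled as empty before any subtraction happens.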
[Mesa-dev] [PATCH v3] i965: fixed clamping in set_scissor_bits when the y is flipped
Calculating the scissor rectangle fields with the y flipped (0 on top)
can generate negative values that cause an assertion failure later on,
as the scissor fields are all unsigned. We must clamp the bbox values
again to make sure they don't exceed fb_height. Also fixed a
calculation error.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108999

v2:
- I initially clamped the values inside the if (Y is flipped) case
  and made a mistake in the calculation: the clamp of bbox[2] should
  be a check if (bbox[2] >= fbheight) bbox[2] = fbheight - 1 instead,
  and I shouldn't have changed the ScissorRectangleYMax calculation.
  As the fixed code is equivalent to using CLAMP instead of MAX2 at the
  top of the function when bbox[2] and bbox[3] are calculated, and the
  second is clearer, I replaced it. (Nanley Chery)

v3:
- Reversed the CLAMP change in bbox[3] as the API guarantees that the
  viewport height is positive. (Nanley Chery)
---
 src/mesa/drivers/dri/i965/genX_state_upload.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/genX_state_upload.c b/src/mesa/drivers/dri/i965/genX_state_upload.c
index dcdfb3c9292..47f3741e673 100644
--- a/src/mesa/drivers/dri/i965/genX_state_upload.c
+++ b/src/mesa/drivers/dri/i965/genX_state_upload.c
@@ -2445,7 +2445,7 @@ set_scissor_bits(const struct gl_context *ctx, int i,

    bbox[0] = MAX2(ctx->ViewportArray[i].X, 0);
    bbox[1] = MIN2(bbox[0] + ctx->ViewportArray[i].Width, fb_width);
-   bbox[2] = MAX2(ctx->ViewportArray[i].Y, 0);
+   bbox[2] = CLAMP(ctx->ViewportArray[i].Y, 0, fb_height);
    bbox[3] = MIN2(bbox[2] + ctx->ViewportArray[i].Height, fb_height);
    _mesa_intersect_scissor_bounding_box(ctx, i, bbox);

--
2.20.1
[Mesa-dev] [PATCH v2] i965: fixed clamping in set_scissor_bits when the y is flipped
Calculating the scissor rectangle fields with the y flipped (0 on top)
can generate negative values that cause an assertion failure later on,
as the scissor fields are all unsigned. We must clamp the bbox values
again to make sure they don't exceed fb_height. Also fixed a
calculation error.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108999

v2:
- I initially clamped the values inside the if (Y is flipped) case
  and made a mistake in the calculation: the clamp of bbox[2] should
  be a check if (bbox[2] >= fbheight) bbox[2] = fbheight - 1 instead,
  and I shouldn't have changed the ScissorRectangleYMax calculation.
  As the fixed code is equivalent to using CLAMP instead of MAX2 at the
  top of the function when bbox[2] and bbox[3] are calculated, and the
  second is clearer, I replaced it. (Nanley Chery)
---
 src/mesa/drivers/dri/i965/genX_state_upload.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/genX_state_upload.c b/src/mesa/drivers/dri/i965/genX_state_upload.c
index dcdfb3c9292..dd695218fea 100644
--- a/src/mesa/drivers/dri/i965/genX_state_upload.c
+++ b/src/mesa/drivers/dri/i965/genX_state_upload.c
@@ -2445,8 +2445,8 @@ set_scissor_bits(const struct gl_context *ctx, int i,

    bbox[0] = MAX2(ctx->ViewportArray[i].X, 0);
    bbox[1] = MIN2(bbox[0] + ctx->ViewportArray[i].Width, fb_width);
-   bbox[2] = MAX2(ctx->ViewportArray[i].Y, 0);
-   bbox[3] = MIN2(bbox[2] + ctx->ViewportArray[i].Height, fb_height);
+   bbox[2] = CLAMP(ctx->ViewportArray[i].Y, 0, fb_height);
+   bbox[3] = CLAMP(bbox[2] + ctx->ViewportArray[i].Height, 0, fb_height);
    _mesa_intersect_scissor_bounding_box(ctx, i, bbox);

    if (bbox[0] == bbox[1] || bbox[2] == bbox[3]) {
--
2.20.1
Re: [Mesa-dev] [PATCH] i965: fixed clamping in set_scissor_bits when the y is flipped
On Tue, 19 Feb 2019 16:27:56 -0800
Nanley Chery wrote:

> On Mon, Dec 10, 2018 at 12:42:40PM +0200, Eleni Maria Stea wrote:
> > Calculating the scissor rectangle fields with the y flipped (0 on
> > top) can generate negative values that will cause assertion failure
> > later on as the scissor fields are all unsigned. We must clamp the
> > bbox values again to make sure they don't exceed the fb_height.
> > Also fixed a calculation error.
> >
> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108999
>
> Good find. Could you send the test to the piglit list?

Sure, I will send it.

>
> > ---
> >  src/mesa/drivers/dri/i965/genX_state_upload.c | 15 ++-
> >  1 file changed, 14 insertions(+), 1 deletion(-)
> >
> > diff --git a/src/mesa/drivers/dri/i965/genX_state_upload.c b/src/mesa/drivers/dri/i965/genX_state_upload.c
> > index 8e3fcbf12e..5d8fc8214e 100644
> > --- a/src/mesa/drivers/dri/i965/genX_state_upload.c
> > +++ b/src/mesa/drivers/dri/i965/genX_state_upload.c
> > @@ -2424,8 +2424,21 @@ set_scissor_bits(const struct gl_context *ctx, int i,
> >        /* memory: Y=0=top */
> >        sc->ScissorRectangleXMin = bbox[0];
> >        sc->ScissorRectangleXMax = bbox[1] - 1;
> > +
> > +      /* Clamping to fb_height is necessary because otherwise the
> > +       * subtractions below would produce a negative result, which would
> > +       * then be assigned to the unsigned YMin/YMax scissor fields,
> > +       * resulting in an assertion failure in GENX(SCISSOR_RECT_pack)
> > +       */
> > +
> > +      if (bbox[3] > fb_height)
> > +         bbox[3] = fb_height;
> > +
> > +      if (bbox[2] > fb_height)
> > +         bbox[2] = fb_height;
> > +
>
> We should be able to fix this bug in a simpler manner by changing the
> MAX2 calls at the top of this function to CLAMP calls.
>
> >        sc->ScissorRectangleYMin = fb_height - bbox[3];
> > -      sc->ScissorRectangleYMax = fb_height - bbox[2] - 1;
> > +      sc->ScissorRectangleYMax = fb_height - (bbox[2] - 1);
>
> I don't think we want to start adding 1 instead of subtracting 1. The
> subtraction is there to satisfy the requirement for the HW packet.
>
> -Nanley

Right! This code would be correct if I had done:

   if (bbox[2] >= fb_height)
      bbox[2] = fb_height - 1;

and then had left:

   sc->ScissorRectangleYMax = fb_height - bbox[2] - 1;

as it was. :)

I think I like your solution better because with the CLAMP at the top
what we do here is clearer. I am going to send a new patch soon.

Thank you!
Eleni
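For readers following the thread, here is a minimal standalone sketch of the arithmetic discussed above. It is not the driver code: the struct, the function names and the local MAX2/MIN2/CLAMP macros are stand-ins written for this illustration (assumed to behave like Mesa's macros), and the empty-rectangle case is reduced to a flag.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-ins for Mesa's MAX2/MIN2/CLAMP macros (assumed semantics). */
#define MAX2(a, b) ((a) > (b) ? (a) : (b))
#define MIN2(a, b) ((a) < (b) ? (a) : (b))
#define CLAMP(x, lo, hi) ((x) < (lo) ? (lo) : ((x) > (hi) ? (hi) : (x)))

/* Hypothetical, simplified view of the Y scissor fields (both unsigned). */
struct scissor_y {
   uint32_t y_min;
   uint32_t y_max;
   int empty;        /* 1 when the clamped box has zero height */
};

/* What the YMax computation yields when the viewport Y is only clamped
 * from below: a viewport starting past the framebuffer goes negative.
 */
static int
y_max_with_max2_only(int vp_y, int fb_height)
{
   int y0 = MAX2(vp_y, 0);
   return fb_height - y0 - 1;
}

/* Flipped-Y (memory: Y=0=top) computation with the lower edge clamped to
 * [0, fb_height], as in the CLAMP-based patch.
 */
static struct scissor_y
scissor_y_flipped(int vp_y, int vp_height, int fb_height)
{
   struct scissor_y sc = {0, 0, 0};

   assert(vp_height >= 0 && fb_height > 0);

   int y0 = CLAMP(vp_y, 0, fb_height);
   int y1 = MIN2(y0 + vp_height, fb_height);

   if (y0 == y1) {
      sc.empty = 1;
      return sc;
   }

   /* Here y0 < y1 <= fb_height, so neither subtraction can go negative. */
   sc.y_min = (uint32_t)(fb_height - y1);
   sc.y_max = (uint32_t)(fb_height - y0 - 1);
   return sc;
}
```

With a 600-pixel-high framebuffer and a viewport starting at Y = 700, the MAX2-only variant produces a negative YMax, while the clamped variant collapses to an empty box that can be handled separately.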
[Mesa-dev] [PATCH v6 2/5] i965: Faking the ETC2 compression on Gen < 8 GPUs using two miptrees.
Gen < 8 GPUs cannot sample ETC2 formats. So far, the driver converted
the compressed EAC/ETC2 images to non-compressed RGBA images. When
GetCompressed* functions were called, the pixels were returned in this
RGBA format and not in the compressed format that was expected.

To fix this problem, we use a secondary shadow miptree to store the
decompressed data for rendering and the main miptree to store the
compressed data, so that the Get functions work. Each time the main
miptree is written with compressed data, we decompress the data to RGB
and update the shadow. Then we use the shadow for rendering.

v2:
- Fixes in the commit message (Nanley Chery)
- Reverted the changes in brw_get_texture_swizzle and swapped the b, g
  values at the time that we decompress the data in the function
  intel_miptree_update_etc_shadow of intel_mipmap_tree.c (Nanley Chery)
- Simplified the format checks in the miptree_create function of
  intel_mipmap_tree.c and reserved the call of
  intel_lower_compressed_format for the case that we are faking the
  ETC support (Nanley Chery)
- Removed the check for the auxiliary usage for the shadow miptree at
  creation (miptree_create of intel_mipmap_tree.c) as we won't use
  auxiliary buffers with these types of trees (Nanley Chery)
- Set the etc_format of the non-ETC miptrees to MESA_FORMAT_NONE and
  removed the unnecessary checks (Nanley Chery)
- Fixed an unrelated indentation change (Nanley Chery)
- Modified the function intel_miptree_finish_write to set
  mt->shadow_needs_update to true to catch all the cases when we need
  to update the miptree (Nanley Chery)
- In order to update the shadow miptree during the unmap of the main
  and always map the main (Nanley Chery), the following change was
  necessary: split the previous update function, which updated all the
  mipmap levels, into two functions: one that updates one level and
  one that updates all of them. Used the first during unmap and the
  second before the rendering.
- Removed the BRW_MAP_ETC_BIT flag and the mechanism to decide which
  miptree should be mapped each time, and reverted all the changes in
  the higher level texture functions that upload data to textures as
  they aren't needed anymore.
- Replaced the boolean needs_fake_etc with an inline function that
  checks when we need to fake the ETC compression (Nanley Chery)
- Removed the initialization of the strides in the update function as
  the values will be overwritten by the intel_miptree_map call
  (Nanley Chery)
- Used minify instead of division in the new update function
  intel_miptree_update_etc_shadow_levels in intel_mipmap_tree.c
  (Nanley Chery)
- Removed the depth from the calculation of the number of slices in
  the new update function (intel_miptree_update_etc_shadow_levels of
  intel_mipmap_tree.c) as we don't need to support 3D ETC images.
  (Nanley Chery)

v3:
- Renamed the rgba_fmt in function miptree_create
  (intel_mipmap_tree.c) to decomp_format as the format is not always
  in rgba order. (Nanley Chery)
- Documented the new usage for the shadow miptree in the comment above
  the field in the intel_miptree struct in intel_mipmap_tree.h
  (Nanley Chery)
- Removed the redundant flags from the mapping of the miptrees in
  intel_miptree_update_etc_shadow of intel_mipmap_tree.c
  (Nanley Chery)
- Fixed the switch from surface's logical level to physical level in
  the intel_miptree_update_etc_shadow_levels of intel_mipmap_tree.c
  (Nanley Chery)
- Excluded the Baytrail GPUs from the check for the ETC emulation as
  they support the ETC formats natively. (Nanley Chery)
- Simplified the check if the format is BGRA in
  intel_miptree_update_etc_shadow of intel_mipmap_tree.c
  (Nanley Chery)

v4:
- Removed the functions intel_miptree_(map|unmap)_etc and the check if
  we need to call them as, with the new changes, they became
  unreachable. (Nanley Chery)
- We'd rather calculate the level width and height using the shadow
  miptree instead of the main in
  intel_miptree_update_etc_shadow_levels of intel_mipmap_tree.c
  (Nanley Chery)
- Fixed the format in the mt_surface_usage, set at the miptree
  creation, in miptree_create of intel_mipmap_tree.c (Nanley Chery)

v5:
- Fixed the levels calculations in intel_mipmap_tree.c (Nanley Chery)
- Update the flag shadow_needs_update outside the function
  intel_miptree_update_etc_shadow (Nanley Chery)
- Fixed indentation error (Nanley Chery)

v6:
- Fixed typo in commit message (Nanley Chery)
- Simplified the assignment of the mt_fmt in the miptree_create of
  intel_mipmap_tree.c (Nanley Chery)
- Combined declarations and assignments where it was possible in
  intel_miptree_update_etc_shadow and
  intel_miptree_update_etc_shadow_levels of intel_mipmap_tree.c
  (Nanley Chery)
---
 .../drivers/dri/i965/brw_wm_surface_state.c | 5 +-
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 174
[Mesa-dev] [PATCH v6 1/5] i965: Rename intel_mipmap_tree::r8stencil_* -> ::shadow_*
From: Nanley Chery Use more generic field names. We'll reuse these fields for a workaround with ASTC miptrees. Reviewed-by: Eleni Maria Stea --- src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 8 src/mesa/drivers/dri/i965/intel_mipmap_tree.c| 16 src/mesa/drivers/dri/i965/intel_mipmap_tree.h| 14 +++--- 3 files changed, 19 insertions(+), 19 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c index ece3197a858..c55182d7ffb 100644 --- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c +++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c @@ -571,15 +571,15 @@ static void brw_update_texture_surface(struct gl_context *ctx, if (obj->StencilSampling && firstImage->_BaseFormat == GL_DEPTH_STENCIL) { if (devinfo->gen <= 7) { -assert(mt->r8stencil_mt && !mt->stencil_mt->r8stencil_needs_update); -mt = mt->r8stencil_mt; +assert(mt->shadow_mt && !mt->stencil_mt->shadow_needs_update); +mt = mt->shadow_mt; } else { mt = mt->stencil_mt; } format = ISL_FORMAT_R8_UINT; } else if (devinfo->gen <= 7 && mt->format == MESA_FORMAT_S_UINT8) { - assert(mt->r8stencil_mt && !mt->r8stencil_needs_update); - mt = mt->r8stencil_mt; + assert(mt->shadow_mt && !mt->shadow_needs_update); + mt = mt->shadow_mt; format = ISL_FORMAT_R8_UINT; } diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c index fe77d72fae4..e364fed2cc7 100644 --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c @@ -1214,7 +1214,7 @@ intel_miptree_release(struct intel_mipmap_tree **mt) brw_bo_unreference((*mt)->bo); intel_miptree_release(&(*mt)->stencil_mt); - intel_miptree_release(&(*mt)->r8stencil_mt); + intel_miptree_release(&(*mt)->shadow_mt); intel_miptree_aux_buffer_free((*mt)->aux_buf); free_aux_state_map((*mt)->aux_state); @@ -2427,7 +2427,7 @@ intel_miptree_finish_write(struct brw_context *brw, switch (mt->aux_usage) { case ISL_AUX_USAGE_NONE: 
if (mt->format == MESA_FORMAT_S_UINT8 && devinfo->gen <= 7) - mt->r8stencil_needs_update = true; + mt->shadow_needs_update = true; break; case ISL_AUX_USAGE_MCS: @@ -2933,9 +2933,9 @@ intel_update_r8stencil(struct brw_context *brw, assert(src->surf.size_B > 0); - if (!mt->r8stencil_mt) { + if (!mt->shadow_mt) { assert(devinfo->gen > 6); /* Handle MIPTREE_LAYOUT_GEN6_HIZ_STENCIL */ - mt->r8stencil_mt = make_surface( + mt->shadow_mt = make_surface( brw, src->target, MESA_FORMAT_R_UINT8, @@ -2949,13 +2949,13 @@ intel_update_r8stencil(struct brw_context *brw, ISL_TILING_Y0_BIT, ISL_SURF_USAGE_TEXTURE_BIT, BO_ALLOC_BUSY, 0, NULL); - assert(mt->r8stencil_mt); + assert(mt->shadow_mt); } - if (src->r8stencil_needs_update == false) + if (src->shadow_needs_update == false) return; - struct intel_mipmap_tree *dst = mt->r8stencil_mt; + struct intel_mipmap_tree *dst = mt->shadow_mt; for (int level = src->first_level; level <= src->last_level; level++) { const unsigned depth = src->surf.dim == ISL_SURF_DIM_3D ? @@ -2975,7 +2975,7 @@ intel_update_r8stencil(struct brw_context *brw, } brw_cache_flush_for_read(brw, dst->bo); - src->r8stencil_needs_update = false; + src->shadow_needs_update = false; } static void * diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h index 17668944adc..1a7507023a1 100644 --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h @@ -294,16 +294,16 @@ struct intel_mipmap_tree struct intel_mipmap_tree *stencil_mt; /** -* \brief Stencil texturing miptree for sampling from a stencil texture +* \brief Shadow miptree for sampling when the main isn't supported by HW. * -* Some hardware doesn't support sampling from the stencil texture as -* required by the GL_ARB_stencil_texturing extenion. To workaround this we -* blit the texture into a new texture that can be sampled. 
+* To workaround various sampler bugs and limitations, we blit the main +* texture into a new texture that can be sampled. * -* \see intel_update_r8stencil() +* This miptree may be used for: +* - Stencil texturin
[Mesa-dev] [PATCH v6 5/5] i965: Removed the field etc_format from the struct intel_mipmap_tree
After the previous changes to emulate the ETC/EAC formats using the
secondary shadow miptree, the etc_format field of the intel_mipmap_tree
struct became redundant, and the remaining check that used it has been
replaced. (Nanley Chery)
---
 src/mesa/drivers/dri/i965/brw_wm_surface_state.c |  2 +-
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c    |  7 ---
 src/mesa/drivers/dri/i965/intel_mipmap_tree.h    | 10 --
 3 files changed, 1 insertion(+), 18 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
index 19a46fcf243..a0984791614 100644
--- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
@@ -520,7 +520,7 @@ static void brw_update_texture_surface(struct gl_context *ctx,
        * is safe because texture views aren't allowed on depth/stencil.
        */
       mesa_fmt = mt->format;
-   } else if (mt->etc_format != MESA_FORMAT_NONE) {
+   } else if (intel_miptree_has_etc_shadow(brw, mt)) {
       mesa_fmt = mt->shadow_mt->format;
    } else if (plane > 0) {
       mesa_fmt = mt->format;
diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index 7146fcb6582..426782c5883 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -706,7 +706,6 @@ miptree_create(struct brw_context *brw,

    if (intel_miptree_needs_fake_etc(brw, mt)) {
       mesa_format decomp_format = intel_lower_compressed_format(brw, format);
-      mt->etc_format = format;
       mt->shadow_mt = make_surface(brw, target, decomp_format, first_level,
                                    last_level, width0, height0, depth0,
                                    num_samples, tiling_flags,
@@ -717,10 +716,6 @@ miptree_create(struct brw_context *brw,
          intel_miptree_release(&mt);
          return NULL;
       }
-
-      mt->shadow_mt->etc_format = MESA_FORMAT_NONE;
-   } else {
-      mt->etc_format = MESA_FORMAT_NONE;
    }

    if (needs_separate_stencil(brw, mt, format)) {
@@ -1302,8 +1297,6 @@ intel_miptree_match_image(struct intel_mipmap_tree *mt,
       mt_format = MESA_FORMAT_Z24_UNORM_S8_UINT;
    if (mt->format == MESA_FORMAT_Z_FLOAT32 && mt->stencil_mt)
       mt_format = MESA_FORMAT_Z32_FLOAT_S8X24_UINT;
-   if (mt->etc_format != MESA_FORMAT_NONE)
-      mt_format = mt->etc_format;

    if (_mesa_get_srgb_format_linear(image->TexFormat) !=
        _mesa_get_srgb_format_linear(mt_format))
diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
index 752aeaaf9b7..3e53a0049cc 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
@@ -215,21 +215,11 @@ struct intel_mipmap_tree
     * MESA_FORMAT_Z_FLOAT32, otherwise for MESA_FORMAT_Z24_UNORM_S8_UINT objects it will be
     * MESA_FORMAT_Z24_UNORM_X8_UINT.
     *
-    * For ETC1/ETC2 textures, this is one of the uncompressed mesa texture
-    * formats if the hardware lacks support for ETC1/ETC2. See @ref etc_format.
-    *
     * @see RENDER_SURFACE_STATE.SurfaceFormat
     * @see 3DSTATE_DEPTH_BUFFER.SurfaceFormat
     */
    mesa_format format;

-   /**
-    * This variable stores the value of ETC compressed texture format
-    *
-    * @see RENDER_SURFACE_STATE.SurfaceFormat
-    */
-   mesa_format etc_format;
-
    GLuint first_level;
    GLuint last_level;

--
2.20.1
[Mesa-dev] [PATCH v6 0/5] improved the support for ETC2 formats on Gen 7
Intel Gen7 GPUs don't support the ETC2 formats natively, so in order to
display the pixels properly we decompress them and create decompressed
miptrees. The problem with this is that the functions that map the
miptrees for reading (for example the GetCompressed* calls), which are
supposed to read compressed pixel values, would read decompressed
values instead, unless we prevented this with assertions that make user
programs either crash or malfunction.

These patches attempt to solve this problem by using two miptrees: the
main one to store the ETC values and the generic shadow
(mt->shadow_mt) to store the decompressed values. Each time the main
miptree is mapped for writing, we set a flag indicating that the shadow
needs an update, and we check this flag before every draw call to
update the shadow miptree. (We perform the check right before drawing
to avoid missing changes from functions like CopyImageSubData in the
next frame.) Then we use the shadow for sampling.

This way, we can render the images using the decompressed pixels of the
shadow, but we return the compressed ones from the main miptree when
the texture is mapped for reading.

Also, the OES_copy_image extension, which couldn't work on Gen 7 due to
the lack of ETC support, is now re-enabled.
Finally, the following glcts and piglit tests pass:

On HSW (previously failing):

KHR-GL46.direct_state_access.textures_compressed_subimage

On HSW and IVB (previously skipped):

dEQP-GLES31.functional.texture.border_clamp.formats.compressed_srgb8_alpha8_etc2_eac.* (6 tests)
dEQP-GLES31.functional.texture.border_clamp.formats.compressed_srgb8_etc2.* (6 tests)
dEQP-GLES31.functional.texture.border_clamp.formats.compressed_srgb8_punchthrough_alpha1_etc2.* (6 tests)

On HSW, IVB, SNB (previously skipped):

dEQP-GLES3.functional.texture.format.compressed.* (12 tests)
dEQP-GLES3.functional.texture.wrap.etc2_eac_srgb8_alpha8.* (36 tests)
dEQP-GLES3.functional.texture.wrap.etc2_srgb8.* (36 tests)
dEQP-GLES3.functional.texture.wrap.etc2_srgb8_punchthrough_alpha1.* (36 tests)
piglit.spec.!opengl es 3_0.oes_compressed_etc2_texture-miptree_gles3 (srgb8, srgb8-alpha, srgb8-punchthrough-alpha1)
piglit.spec.arb_es3_compatibility.oes_compressed_etc2_texture-miptree (srgb8 compat, srgb8 core, srgb8-alpha8 compat, srgb8-alpha8 core, srgb8-punchthrough-alpha1 compat, srgb8-punchthrough-alpha1 core) (9 tests)

Total tests passing: 148

Eleni Maria Stea (4):
  i965: Faking the ETC2 compression on Gen < 8 GPUs using two miptrees.
  i965: Fixed the CopyImageSubData for ETC2 on Gen < 8
  i965: Enabled the OES_copy_image extension on Gen 7 GPUs
  i965: Removed the field etc_format from the struct intel_mipmap_tree

Nanley Chery (1):
  i965: Rename intel_mipmap_tree::r8stencil_* -> ::shadow_*

 src/mesa/drivers/dri/i965/brw_draw.c          |   5 +
 .../drivers/dri/i965/brw_wm_surface_state.c   |  15 +-
 src/mesa/drivers/dri/i965/intel_extensions.c  |  16 +-
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 170 ++
 src/mesa/drivers/dri/i965/intel_mipmap_tree.h |  48 +++--
 5 files changed, 149 insertions(+), 105 deletions(-)

--
2.20.1
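The two-miptree scheme described in this cover letter boils down to a dirty flag that is set on writes and consumed right before drawing. The following is only a toy model of that idea, not the i965 implementation: every name (toy_miptree, toy_write, toy_predraw, ...) is hypothetical, the "miptree" is a four-byte array, and toy_decompress merely stands in for real ETC2 decoding.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define TEXELS 4

/* Hypothetical toy miptree: a single level and slice. */
struct toy_miptree {
   unsigned char compressed[TEXELS];  /* main data, returned by Get* calls */
   unsigned char shadow[TEXELS];      /* decompressed copy, used for sampling */
   bool shadow_needs_update;
};

/* Stand-in for ETC2 decoding; the real decode is far more involved. */
static unsigned char
toy_decompress(unsigned char c)
{
   return (unsigned char)(c * 2);
}

/* Any write to the main data only marks the shadow as stale. */
static void
toy_write(struct toy_miptree *mt, const unsigned char *data)
{
   memcpy(mt->compressed, data, TEXELS);
   mt->shadow_needs_update = true;
}

/* Called right before a draw: refresh the shadow only if it is stale. */
static void
toy_predraw(struct toy_miptree *mt)
{
   if (!mt->shadow_needs_update)
      return;

   for (int i = 0; i < TEXELS; i++)
      mt->shadow[i] = toy_decompress(mt->compressed[i]);

   mt->shadow_needs_update = false;
}

/* Sampling always reads the shadow, which must be up to date by now. */
static const unsigned char *
toy_sample_source(const struct toy_miptree *mt)
{
   assert(!mt->shadow_needs_update);
   return mt->shadow;
}

/* Reads of the compressed data always come from the main storage. */
static const unsigned char *
toy_read_compressed(const struct toy_miptree *mt)
{
   return mt->compressed;
}
```

Deferring the refresh to the pre-draw step means that several writes between two draws cost a single decompression pass, and a write that arrives through any path that sets the flag is picked up before the next draw.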
[Mesa-dev] [PATCH v6 3/5] i965: Fixed the CopyImageSubData for ETC2 on Gen < 8
For CopyImageSubData to copy the data during the 1st draw call, we need
to update the shadow tree right before the rendering.

v2:
- Added assertion that the miptree doesn't need update at the time we
  update the texture surface. (Nanley Chery)

v3:
- As we now update the tree before the rendering we don't need to copy
  the data during the unmap anymore. Removed the unnecessary update
  from the intel_miptree_unmap in intel_mipmap_tree.c (Nanley Chery)

v4:
- Fixed unrelated empty line removal (Nanley Chery)
- As intel_miptree_update_etc_shadow of intel_mipmap_tree.c is now
  only called by the function that follows it, we don't need to
  declare it at the top of the file anymore. (Nanley Chery)
---
 src/mesa/drivers/dri/i965/brw_draw.c            |  5 +
 .../drivers/dri/i965/brw_wm_surface_state.c     |  2 +-
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c   | 17 -
 3 files changed, 6 insertions(+), 18 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_draw.c b/src/mesa/drivers/dri/i965/brw_draw.c
index 40bcf82ae8d..d07349419cc 100644
--- a/src/mesa/drivers/dri/i965/brw_draw.c
+++ b/src/mesa/drivers/dri/i965/brw_draw.c
@@ -559,6 +559,11 @@ brw_predraw_resolve_inputs(struct brw_context *brw, bool rendering,
           tex_obj->mt->format == MESA_FORMAT_S_UINT8) {
          intel_update_r8stencil(brw, tex_obj->mt);
       }
+
+      if (intel_miptree_has_etc_shadow(brw, tex_obj->mt) &&
+          tex_obj->mt->shadow_needs_update) {
+         intel_miptree_update_etc_shadow_levels(brw, tex_obj->mt);
+      }
    }

    /* Resolve color for each active shader image. */
diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
index c3d267721e1..19a46fcf243 100644
--- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
@@ -582,7 +582,7 @@ static void brw_update_texture_surface(struct gl_context *ctx,
          mt = mt->shadow_mt;
          format = ISL_FORMAT_R8_UINT;
       } else if (intel_miptree_needs_fake_etc(brw, mt)) {
-         assert(mt->shadow_mt);
+         assert(mt->shadow_mt && !mt->shadow_needs_update);
          mt = mt->shadow_mt;
       }

diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index 976a004ade0..7146fcb6582 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -57,11 +57,6 @@ static void *intel_miptree_map_raw(struct brw_context *brw,
                                    GLbitfield mode);

 static void intel_miptree_unmap_raw(struct intel_mipmap_tree *mt);
-static void intel_miptree_update_etc_shadow(struct brw_context *brw,
-                                            struct intel_mipmap_tree *mt,
-                                            unsigned int level,
-                                            unsigned int slice,
-                                            int level_w, int level_h);

 static bool
 intel_miptree_supports_mcs(struct brw_context *brw,
@@ -3779,7 +3774,6 @@ intel_miptree_unmap(struct brw_context *brw,
                     unsigned int slice)
 {
    struct intel_miptree_map *map = mt->level[level].slice[slice].map;
-   int level_w, level_h;

    assert(mt->surf.samples == 1);

@@ -3789,21 +3783,10 @@ intel_miptree_unmap(struct brw_context *brw,
    DBG("%s: mt %p (%s) level %d slice %d\n", __func__,
        mt, _mesa_get_format_name(mt->format), level, slice);

-   level_w = minify(mt->surf.phys_level0_sa.width,
-                    level - mt->first_level);
-   level_h = minify(mt->surf.phys_level0_sa.height,
-                    level - mt->first_level);
-
    if (map->unmap)
       map->unmap(brw, mt, map, level, slice);

    intel_miptree_release_map(mt, level, slice);
-
-   if (intel_miptree_has_etc_shadow(brw, mt) && mt->shadow_needs_update) {
-      mt->shadow_needs_update = false;
-      intel_miptree_update_etc_shadow(brw, mt, level, slice, level_w,
-                                      level_h);
-   }
 }

 enum isl_surf_dim
--
2.20.1
[Mesa-dev] [PATCH v6 4/5] i965: Enabled the OES_copy_image extension on Gen 7 GPUs
The OES_copy_image extension was disabled on Gen7 due to the lack of
support for ETC2 images. Re-enabled it. (Kenneth Graunke)

v2:
- Removed the blank lines in the comments above OES_copy_image and
  OES_texture_view extensions in intel_extensions.c (Nanley Chery)
---
 src/mesa/drivers/dri/i965/intel_extensions.c | 16
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_extensions.c b/src/mesa/drivers/dri/i965/intel_extensions.c
index 3a95be58a63..d2a6aa185c2 100644
--- a/src/mesa/drivers/dri/i965/intel_extensions.c
+++ b/src/mesa/drivers/dri/i965/intel_extensions.c
@@ -287,14 +287,22 @@ intelInitExtensions(struct gl_context *ctx)
    }

    if (devinfo->gen >= 8 || devinfo->is_baytrail) {
-      /* For now, we only enable OES_copy_image on platforms that support
-       * ETC2 natively in hardware. We would need more hacks to support it
-       * elsewhere. Same with OES_texture_view.
+      /* For now, we can't enable OES_texture_view on Gen 7 because of
+       * some piglit failures coming from
+       * piglit/tests/spec/arb_texture_view/rendering-formats.c that need
+       * investigation.
        */
-      ctx->Extensions.OES_copy_image = true;
       ctx->Extensions.OES_texture_view = true;
    }

+   if (devinfo->gen >= 7) {
+      /* We can safely enable OES_copy_image on Gen 7, since we emulate
+       * the ETC2 support using the shadow_miptree to store the
+       * compressed data.
+       */
+      ctx->Extensions.OES_copy_image = true;
+   }
+
    if (devinfo->gen >= 8) {
       ctx->Extensions.ARB_gpu_shader_int64 = true;
       /* requires ARB_gpu_shader_int64 */
--
2.20.1
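As a reading aid, the gating that results from this patch can be written as a small pure function. This is a sketch only: toy_devinfo, toy_extensions and toy_init_extensions are hypothetical names that mirror just the two conditions touched by the diff, not the real gl_extensions or device-info structures.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, reduced stand-ins for the device info and extension table. */
struct toy_devinfo {
   int gen;
   bool is_baytrail;
};

struct toy_extensions {
   bool OES_copy_image;
   bool OES_texture_view;
};

/* Mirrors the two conditions after the patch: OES_texture_view still needs
 * native ETC2 (Gen >= 8 or Baytrail), OES_copy_image only needs Gen >= 7.
 */
static struct toy_extensions
toy_init_extensions(struct toy_devinfo devinfo)
{
   struct toy_extensions ext = { false, false };

   assert(devinfo.gen > 0);

   if (devinfo.gen >= 8 || devinfo.is_baytrail)
      ext.OES_texture_view = true;

   if (devinfo.gen >= 7)
      ext.OES_copy_image = true;

   return ext;
}
```

Under these assumptions, a non-Baytrail Gen 7 part gets OES_copy_image but not OES_texture_view, while Gen 6 gets neither.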
[Mesa-dev] [PATCH v5 4/4] i965: Enabled the OES_copy_image extension on Gen 7 GPUs
The OES_copy_image extension was disabled on Gen7 due to the lack of
support for ETC2 images. Re-enabled it. (Kenneth Graunke)

v2:
- Removed the blank lines in the comments above OES_copy_image and
  OES_texture_view extensions in intel_extensions.c (Nanley Chery)
---
 src/mesa/drivers/dri/i965/intel_extensions.c | 16
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_extensions.c b/src/mesa/drivers/dri/i965/intel_extensions.c
index 3a95be58a63..d2a6aa185c2 100644
--- a/src/mesa/drivers/dri/i965/intel_extensions.c
+++ b/src/mesa/drivers/dri/i965/intel_extensions.c
@@ -287,14 +287,22 @@ intelInitExtensions(struct gl_context *ctx)
    }

    if (devinfo->gen >= 8 || devinfo->is_baytrail) {
-      /* For now, we only enable OES_copy_image on platforms that support
-       * ETC2 natively in hardware. We would need more hacks to support it
-       * elsewhere. Same with OES_texture_view.
+      /* For now, we can't enable OES_texture_view on Gen 7 because of
+       * some piglit failures coming from
+       * piglit/tests/spec/arb_texture_view/rendering-formats.c that need
+       * investigation.
        */
-      ctx->Extensions.OES_copy_image = true;
       ctx->Extensions.OES_texture_view = true;
    }

+   if (devinfo->gen >= 7) {
+      /* We can safely enable OES_copy_image on Gen 7, since we emulate
+       * the ETC2 support using the shadow_miptree to store the
+       * compressed data.
+       */
+      ctx->Extensions.OES_copy_image = true;
+   }
+
    if (devinfo->gen >= 8) {
       ctx->Extensions.ARB_gpu_shader_int64 = true;
       /* requires ARB_gpu_shader_int64 */
--
2.20.1
[Mesa-dev] [PATCH v5 3/4] i965: Fixed the CopyImageSubData for ETC2 on Gen < 8
For CopyImageSubData to copy the data during the 1st draw call, we need to update the shadow tree right before the rendering. v2: - Added assertion that the miptree doesn't need update at the time we update the texture surface. (Nanley Chery) v3: - As we now update the tree before the rendering we don't need to copy the data during the unmap anymore. Removed the unnecessary update from the intel_miptree_unmap in intel_mipmap_tree.c (Nanley Chery) --- src/mesa/drivers/dri/i965/brw_draw.c | 5 + src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 2 +- src/mesa/drivers/dri/i965/intel_mipmap_tree.c| 13 - 3 files changed, 6 insertions(+), 14 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_draw.c b/src/mesa/drivers/dri/i965/brw_draw.c index 40bcf82ae8d..d07349419cc 100644 --- a/src/mesa/drivers/dri/i965/brw_draw.c +++ b/src/mesa/drivers/dri/i965/brw_draw.c @@ -559,6 +559,11 @@ brw_predraw_resolve_inputs(struct brw_context *brw, bool rendering, tex_obj->mt->format == MESA_FORMAT_S_UINT8) { intel_update_r8stencil(brw, tex_obj->mt); } + + if (intel_miptree_has_etc_shadow(brw, tex_obj->mt) && + tex_obj->mt->shadow_needs_update) { + intel_miptree_update_etc_shadow_levels(brw, tex_obj->mt); + } } /* Resolve color for each active shader image. 
*/ diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c index c3d267721e1..19a46fcf243 100644 --- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c +++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c @@ -582,7 +582,7 @@ static void brw_update_texture_surface(struct gl_context *ctx, mt = mt->shadow_mt; format = ISL_FORMAT_R8_UINT; } else if (intel_miptree_needs_fake_etc(brw, mt)) { - assert(mt->shadow_mt); + assert(mt->shadow_mt && !mt->shadow_needs_update); mt = mt->shadow_mt; } diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c index 1643ce2eeb2..89b31c78bc4 100644 --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c @@ -3780,7 +3780,6 @@ intel_miptree_unmap(struct brw_context *brw, unsigned int slice) { struct intel_miptree_map *map = mt->level[level].slice[slice].map; - int level_w, level_h; assert(mt->surf.samples == 1); @@ -3790,21 +3789,10 @@ intel_miptree_unmap(struct brw_context *brw, DBG("%s: mt %p (%s) level %d slice %d\n", __func__, mt, _mesa_get_format_name(mt->format), level, slice); - level_w = minify(mt->surf.phys_level0_sa.width, -level - mt->first_level); - level_h = minify(mt->surf.phys_level0_sa.height, -level - mt->first_level); - if (map->unmap) map->unmap(brw, mt, map, level, slice); intel_miptree_release_map(mt, level, slice); - - if (intel_miptree_has_etc_shadow(brw, mt) && mt->shadow_needs_update) { - mt->shadow_needs_update = false; - intel_miptree_update_etc_shadow(brw, mt, level, slice, level_w, - level_h); - } } enum isl_surf_dim @@ -3984,6 +3972,5 @@ intel_miptree_update_etc_shadow_levels(struct brw_context *brw, level_h); } } - mt->shadow_needs_update = false; } -- 2.20.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH v5 2/4] i965: Faking the ETC2 compression on Gen < 8 GPUs using two miptrees.
Gen < 8 GPUs cannot sample ETC2 formats. So far, the driver converted
the compressed EAC/ETC2 images to non-compressed RGBA images. When
GetCompressed* functions were called, the pixels were returned in this
RGBA format and not in the compressed format that was expected.

To fix this problem, we use a secondary shadow miptree to store the
decompressed data for rendering and the main miptree to store the
compressed data, so that the Get functions work. Each time the main
miptree is written with compressed data, we decompress the data to RGB
and update the shadow. Then we use the shadow for rendering.

v2:
- Fixes in the commit message (Nanley Chery)
- Reverted the changes in brw_get_texture_swizzle and swapped the b, g
  values at the time that we decompress the data in the function
  intel_miptree_update_etc_shadow of intel_mipmap_tree.c (Nanley Chery)
- Simplified the format checks in the miptree_create function of
  intel_mipmap_tree.c and reserved the call of
  intel_lower_compressed_format for the case that we are faking the
  ETC support (Nanley Chery)
- Removed the check for the auxiliary usage for the shadow miptree at
  creation (miptree_create of intel_mipmap_tree.c) as we won't use
  auxiliary buffers with these types of trees (Nanley Chery)
- Set the etc_format of the non-ETC miptrees to MESA_FORMAT_NONE and
  removed the unnecessary checks (Nanley Chery)
- Fixed an unrelated indentation change (Nanley Chery)
- Modified the function intel_miptree_finish_write to set
  mt->shadow_needs_update to true to catch all the cases when we need
  to update the miptree (Nanley Chery)
- In order to update the shadow miptree during the unmap of the main
  and always map the main (Nanley Chery), the following change was
  necessary: split the previous update function, which updated all the
  mipmap levels, into two functions: one that updates one level and
  one that updates all of them. Used the first during unmap and the
  second before the rendering.
- Removed the BRW_MAP_ETC_BIT flag and the mechanism to decide which
  miptree should be mapped each time, and reverted all the changes in
  the higher level texture functions that upload data to textures as
  they aren't needed anymore.
- Replaced the boolean needs_fake_etc with an inline function that
  checks when we need to fake the ETC compression (Nanley Chery)
- Removed the initialization of the strides in the update function as
  the values will be overwritten by the intel_miptree_map call
  (Nanley Chery)
- Used minify instead of division in the new update function
  intel_miptree_update_etc_shadow_levels in intel_mipmap_tree.c
  (Nanley Chery)
- Removed the depth from the calculation of the number of slices in
  the new update function (intel_miptree_update_etc_shadow_levels of
  intel_mipmap_tree.c) as we don't need to support 3D ETC images.
  (Nanley Chery)

v3:
- Renamed the rgba_fmt in function miptree_create
  (intel_mipmap_tree.c) to decomp_format as the format is not always
  in rgba order. (Nanley Chery)
- Documented the new usage for the shadow miptree in the comment above
  the field in the intel_miptree struct in intel_mipmap_tree.h
  (Nanley Chery)
- Removed the redundant flags from the mapping of the miptrees in
  intel_miptree_update_etc_shadow of intel_mipmap_tree.c
  (Nanley Chery)
- Fixed the switch from surface's logical level to physical level in
  the intel_miptree_update_etc_shadow_levels of intel_mipmap_tree.c
  (Nanley Chery)
- Excluded the Baytrail GPUs from the check for the ETC emulation as
  they support the ETC formats natively. (Nanley Chery)
- Simplified the check if the format is BGRA in
  intel_miptree_update_etc_shadow of intel_mipmap_tree.c
  (Nanley Chery)

v4:
- Removed the functions intel_miptree_(map|unmap)_etc and the check if
  we need to call them as, with the new changes, they became
  unreachable. (Nanley Chery)
- We'd rather calculate the level width and height using the shadow
  miptree instead of the main in
  intel_miptree_update_etc_shadow_levels of intel_mipmap_tree.c
  (Nanley Chery)

v5:
- Fixed the levels calculations in intel_mipmap_tree.c (Nanley Chery)
- Update the flag shadow_needs_update outside the function
  intel_miptree_update_etc_shadow (Nanley Chery)
- Fixed indentation error (Nanley Chery)
---
 .../drivers/dri/i965/brw_wm_surface_state.c   |   5 +-
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 176 +++---
 src/mesa/drivers/dri/i965/intel_mipmap_tree.h |  24 +++
 3 files changed, 138 insertions(+), 67 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
index c55182d7ffb..c3d267721e1 100644
--- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
@@ -521,7 +521,7 @@ static void brw_update_texture_surface(struct gl_context *ctx,
        */
[Mesa-dev] [PATCH v5 1/4] i965: Rename intel_mipmap_tree::r8stencil_* -> ::shadow_*
From: Nanley Chery Use more generic field names. We'll reuse these fields for a workaround with ASTC miptrees. Reviewed-by: Eleni Maria Stea --- src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 8 src/mesa/drivers/dri/i965/intel_mipmap_tree.c| 16 src/mesa/drivers/dri/i965/intel_mipmap_tree.h| 14 +++--- 3 files changed, 19 insertions(+), 19 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c index ece3197a858..c55182d7ffb 100644 --- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c +++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c @@ -571,15 +571,15 @@ static void brw_update_texture_surface(struct gl_context *ctx, if (obj->StencilSampling && firstImage->_BaseFormat == GL_DEPTH_STENCIL) { if (devinfo->gen <= 7) { -assert(mt->r8stencil_mt && !mt->stencil_mt->r8stencil_needs_update); -mt = mt->r8stencil_mt; +assert(mt->shadow_mt && !mt->stencil_mt->shadow_needs_update); +mt = mt->shadow_mt; } else { mt = mt->stencil_mt; } format = ISL_FORMAT_R8_UINT; } else if (devinfo->gen <= 7 && mt->format == MESA_FORMAT_S_UINT8) { - assert(mt->r8stencil_mt && !mt->r8stencil_needs_update); - mt = mt->r8stencil_mt; + assert(mt->shadow_mt && !mt->shadow_needs_update); + mt = mt->shadow_mt; format = ISL_FORMAT_R8_UINT; } diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c index b4e3524aa51..479188fd1c8 100644 --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c @@ -1214,7 +1214,7 @@ intel_miptree_release(struct intel_mipmap_tree **mt) brw_bo_unreference((*mt)->bo); intel_miptree_release(&(*mt)->stencil_mt); - intel_miptree_release(&(*mt)->r8stencil_mt); + intel_miptree_release(&(*mt)->shadow_mt); intel_miptree_aux_buffer_free((*mt)->aux_buf); free_aux_state_map((*mt)->aux_state); @@ -2427,7 +2427,7 @@ intel_miptree_finish_write(struct brw_context *brw, switch (mt->aux_usage) { case ISL_AUX_USAGE_NONE: 
if (mt->format == MESA_FORMAT_S_UINT8 && devinfo->gen <= 7) - mt->r8stencil_needs_update = true; + mt->shadow_needs_update = true; break; case ISL_AUX_USAGE_MCS: @@ -2933,9 +2933,9 @@ intel_update_r8stencil(struct brw_context *brw, assert(src->surf.size_B > 0); - if (!mt->r8stencil_mt) { + if (!mt->shadow_mt) { assert(devinfo->gen > 6); /* Handle MIPTREE_LAYOUT_GEN6_HIZ_STENCIL */ - mt->r8stencil_mt = make_surface( + mt->shadow_mt = make_surface( brw, src->target, MESA_FORMAT_R_UINT8, @@ -2949,13 +2949,13 @@ intel_update_r8stencil(struct brw_context *brw, ISL_TILING_Y0_BIT, ISL_SURF_USAGE_TEXTURE_BIT, BO_ALLOC_BUSY, 0, NULL); - assert(mt->r8stencil_mt); + assert(mt->shadow_mt); } - if (src->r8stencil_needs_update == false) + if (src->shadow_needs_update == false) return; - struct intel_mipmap_tree *dst = mt->r8stencil_mt; + struct intel_mipmap_tree *dst = mt->shadow_mt; for (int level = src->first_level; level <= src->last_level; level++) { const unsigned depth = src->surf.dim == ISL_SURF_DIM_3D ? @@ -2975,7 +2975,7 @@ intel_update_r8stencil(struct brw_context *brw, } brw_cache_flush_for_read(brw, dst->bo); - src->r8stencil_needs_update = false; + src->shadow_needs_update = false; } static void * diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h index 17668944adc..1a7507023a1 100644 --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h @@ -294,16 +294,16 @@ struct intel_mipmap_tree struct intel_mipmap_tree *stencil_mt; /** -* \brief Stencil texturing miptree for sampling from a stencil texture +* \brief Shadow miptree for sampling when the main isn't supported by HW. * -* Some hardware doesn't support sampling from the stencil texture as -* required by the GL_ARB_stencil_texturing extenion. To workaround this we -* blit the texture into a new texture that can be sampled. 
+* To workaround various sampler bugs and limitations, we blit the main +* texture into a new texture that can be sampled. * -* \see intel_update_r8stencil() +* This miptree may be used for: +* - Stencil texturin
[Mesa-dev] [PATCH v5 0/4] improved the support for ETC2 formats on Gen 7
Intel Gen7 GPUs don't support the ETC2 formats natively, so in order to display the pixels properly we convert them to RGBA and create RGBA miptrees. The problem is that the GetCompressed* functions, which should return the compressed pixel values, return the RGBA values instead.

These patches attempt to solve this problem by using two miptrees: the main one stores the ETC values and the generic shadow (mt->shadow_mt) stores the RGBA values. Each time the main miptree is unmapped, we unpack the ETC data to RGBA and update the shadow. Similarly, we update all the mipmap levels of the image (if necessary) before drawing, for CopyImageSubData to work.

Also, the OES_copy_image extension, which couldn't work on Gen 7 due to the lack of ETC support, is now re-enabled.

Finally, the following glcts and piglit tests pass:

On HSW (previously failing):
KHR-GL46.direct_state_access.textures_compressed_subimage

On HSW and IVB (previously skipped):
dEQP-GLES31.functional.texture.border_clamp.formats.compressed_srgb8_alpha8_etc2_eac.* (6 tests)
dEQP-GLES31.functional.texture.border_clamp.formats.compressed_srgb8_etc2.* (6 tests)
dEQP-GLES31.functional.texture.border_clamp.formats.compressed_srgb8_punchthrough_alpha1_etc2.* (6 tests)

On HSW, IVB, SNB (previously skipped):
dEQP-GLES3.functional.texture.format.compressed.* (12 tests)
dEQP-GLES3.functional.texture.wrap.etc2_eac_srgb8_alpha8.* (36 tests)
dEQP-GLES3.functional.texture.wrap.etc2_srgb8.* (36 tests)
dEQP-GLES3.functional.texture.wrap.etc2_srgb8_punchthrough_alpha1.* (36 tests)
piglit.spec.!opengl es 3_0.oes_compressed_etc2_texture-miptree_gles3 (srgb8, srgb8-alpha, srgb8-punchthrough-alpha1)
piglit.spec.arb_es3_compatibility.oes_compressed_etc2_texture-miptree (srgb8 compat, srgb8 core, srgb8-alpha8 compat, srgb8-alpha8 core, srgb8-punchthrough-alpha1 compat, srgb8-punchthrough-alpha1 core) (9 tests)

Total tests passing: 148

Eleni Maria Stea (3):
  i965: Faking the ETC2 compression on Gen < 8 GPUs using two miptrees.
  i965: Fixed the CopyImageSubData for ETC2 on Gen < 8
  i965: Enabled the OES_copy_image extension on Gen 7 GPUs

Nanley Chery (1):
  i965: Rename intel_mipmap_tree::r8stencil_* -> ::shadow_*

 src/mesa/drivers/dri/i965/brw_draw.c | 5 +
 .../drivers/dri/i965/brw_wm_surface_state.c | 13 +-
 src/mesa/drivers/dri/i965/intel_extensions.c | 16 +-
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 179 ++
 src/mesa/drivers/dri/i965/intel_mipmap_tree.h | 38 +++-
 5 files changed, 161 insertions(+), 90 deletions(-)

--
2.20.1
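The two-miptree scheme described in this cover letter can be sketched in plain C as follows. This is only an illustration under stated assumptions: the struct, the function names, and the XOR "decompression" are hypothetical stand-ins and are not taken from the i965 driver. The sketch only shows the shape of the idea: writes go to the compressed main copy and mark the shadow stale, compressed reads come from the main copy, and the shadow is refreshed lazily before drawing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TEX_BYTES 8

/* Hypothetical stand-in for a texture that keeps a compressed main copy
 * and a decompressed shadow copy. These names are not driver names. */
struct fake_texture {
   unsigned char main_data[TEX_BYTES];   /* what GetCompressed* would return */
   unsigned char shadow_data[TEX_BYTES]; /* what the sampler would read */
   bool shadow_needs_update;             /* set on write, cleared on refresh */
};

/* Placeholder for a real decoder (assumption: a real driver would unpack
 * ETC2 blocks here; the XOR only makes the two copies differ). */
static unsigned char
fake_decompress_byte(unsigned char c)
{
   return (unsigned char)(c ^ 0xff);
}

/* Writes land in the main copy and only mark the shadow as stale. */
static void
fake_texture_write(struct fake_texture *tex, const unsigned char *src)
{
   memcpy(tex->main_data, src, TEX_BYTES);
   tex->shadow_needs_update = true;
}

/* Compressed reads come straight from the main copy. */
static const unsigned char *
fake_texture_get_compressed(const struct fake_texture *tex)
{
   return tex->main_data;
}

/* Called before drawing: refresh the shadow only if it is stale, then
 * return the data that would be sampled. */
static const unsigned char *
fake_texture_predraw(struct fake_texture *tex)
{
   if (tex->shadow_needs_update) {
      for (size_t i = 0; i < TEX_BYTES; i++)
         tex->shadow_data[i] = fake_decompress_byte(tex->main_data[i]);
      tex->shadow_needs_update = false;
   }
   return tex->shadow_data;
}
```

A caller would write once with fake_texture_write(), read the compressed bytes back unchanged with fake_texture_get_compressed(), and call fake_texture_predraw() before each draw; the refresh cost is paid only when the main copy has changed since the last draw.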
[Mesa-dev] [PATCH v4 0/4] improved the support for ETC2 formats on Gen 7
Intel Gen7 GPUs don't support the ETC2 formats natively, so in order to display the pixels properly we convert them to RGBA and create RGBA miptrees. The problem is that the GetCompressed* functions, which should return the compressed pixel values, return the RGBA values instead.

These patches attempt to solve this problem by using two miptrees: the main one stores the ETC values and the generic shadow (mt->shadow_mt) stores the RGBA values. Each time the main miptree is unmapped, we unpack the ETC data to RGBA and update the shadow. Similarly, we update all the mipmap levels of the image (if necessary) before drawing, for CopyImageSubData to work.

Also, the OES_copy_image extension, which couldn't work on Gen 7 due to the lack of ETC support, is now re-enabled.

Finally, the following glcts and piglit tests pass:

On HSW (previously failing):
KHR-GL46.direct_state_access.textures_compressed_subimage

On HSW and IVB (previously skipped):
dEQP-GLES31.functional.texture.border_clamp.formats.compressed_srgb8_alpha8_etc2_eac.* (6 tests)
dEQP-GLES31.functional.texture.border_clamp.formats.compressed_srgb8_etc2.* (6 tests)
dEQP-GLES31.functional.texture.border_clamp.formats.compressed_srgb8_punchthrough_alpha1_etc2.* (6 tests)

On HSW, IVB, SNB (previously skipped):
dEQP-GLES3.functional.texture.format.compressed.* (12 tests)
dEQP-GLES3.functional.texture.wrap.etc2_eac_srgb8_alpha8.* (36 tests)
dEQP-GLES3.functional.texture.wrap.etc2_srgb8.* (36 tests)
dEQP-GLES3.functional.texture.wrap.etc2_srgb8_punchthrough_alpha1.* (36 tests)
piglit.spec.!opengl es 3_0.oes_compressed_etc2_texture-miptree_gles3 (srgb8, srgb8-alpha, srgb8-punchthrough-alpha1)
piglit.spec.arb_es3_compatibility.oes_compressed_etc2_texture-miptree (srgb8 compat, srgb8 core, srgb8-alpha8 compat, srgb8-alpha8 core, srgb8-punchthrough-alpha1 compat, srgb8-punchthrough-alpha1 core) (9 tests)

Total tests passing: 148

Eleni Maria Stea (3):
  i965: Faking the ETC2 compression on Gen < 8 GPUs using two miptrees.
  i965: Fixed the CopyImageSubData for ETC2 on Gen < 8
  i965: Enabled the OES_copy_image extension on Gen 7 GPUs

Nanley Chery (1):
  i965: Rename intel_mipmap_tree::r8stencil_* -> ::shadow_*

 src/mesa/drivers/dri/i965/brw_draw.c | 5 +
 .../drivers/dri/i965/brw_wm_surface_state.c | 13 +-
 src/mesa/drivers/dri/i965/intel_extensions.c | 16 +-
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 188 +++---
 src/mesa/drivers/dri/i965/intel_mipmap_tree.h | 38 +++-
 5 files changed, 170 insertions(+), 90 deletions(-)

--
2.20.1
[Mesa-dev] [PATCH v4 1/4] i965: Rename intel_mipmap_tree::r8stencil_* -> ::shadow_*
From: Nanley Chery Use more generic field names. We'll reuse these fields for a workaround with ASTC miptrees. Reviewed-by: Eleni Maria Stea --- src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 8 src/mesa/drivers/dri/i965/intel_mipmap_tree.c| 16 src/mesa/drivers/dri/i965/intel_mipmap_tree.h| 14 +++--- 3 files changed, 19 insertions(+), 19 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c index b067a174056..618e2ab35bc 100644 --- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c +++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c @@ -571,15 +571,15 @@ static void brw_update_texture_surface(struct gl_context *ctx, if (obj->StencilSampling && firstImage->_BaseFormat == GL_DEPTH_STENCIL) { if (devinfo->gen <= 7) { -assert(mt->r8stencil_mt && !mt->stencil_mt->r8stencil_needs_update); -mt = mt->r8stencil_mt; +assert(mt->shadow_mt && !mt->stencil_mt->shadow_needs_update); +mt = mt->shadow_mt; } else { mt = mt->stencil_mt; } format = ISL_FORMAT_R8_UINT; } else if (devinfo->gen <= 7 && mt->format == MESA_FORMAT_S_UINT8) { - assert(mt->r8stencil_mt && !mt->r8stencil_needs_update); - mt = mt->r8stencil_mt; + assert(mt->shadow_mt && !mt->shadow_needs_update); + mt = mt->shadow_mt; format = ISL_FORMAT_R8_UINT; } diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c index b4e3524aa51..479188fd1c8 100644 --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c @@ -1214,7 +1214,7 @@ intel_miptree_release(struct intel_mipmap_tree **mt) brw_bo_unreference((*mt)->bo); intel_miptree_release(&(*mt)->stencil_mt); - intel_miptree_release(&(*mt)->r8stencil_mt); + intel_miptree_release(&(*mt)->shadow_mt); intel_miptree_aux_buffer_free((*mt)->aux_buf); free_aux_state_map((*mt)->aux_state); @@ -2427,7 +2427,7 @@ intel_miptree_finish_write(struct brw_context *brw, switch (mt->aux_usage) { case ISL_AUX_USAGE_NONE: 
if (mt->format == MESA_FORMAT_S_UINT8 && devinfo->gen <= 7) - mt->r8stencil_needs_update = true; + mt->shadow_needs_update = true; break; case ISL_AUX_USAGE_MCS: @@ -2933,9 +2933,9 @@ intel_update_r8stencil(struct brw_context *brw, assert(src->surf.size_B > 0); - if (!mt->r8stencil_mt) { + if (!mt->shadow_mt) { assert(devinfo->gen > 6); /* Handle MIPTREE_LAYOUT_GEN6_HIZ_STENCIL */ - mt->r8stencil_mt = make_surface( + mt->shadow_mt = make_surface( brw, src->target, MESA_FORMAT_R_UINT8, @@ -2949,13 +2949,13 @@ intel_update_r8stencil(struct brw_context *brw, ISL_TILING_Y0_BIT, ISL_SURF_USAGE_TEXTURE_BIT, BO_ALLOC_BUSY, 0, NULL); - assert(mt->r8stencil_mt); + assert(mt->shadow_mt); } - if (src->r8stencil_needs_update == false) + if (src->shadow_needs_update == false) return; - struct intel_mipmap_tree *dst = mt->r8stencil_mt; + struct intel_mipmap_tree *dst = mt->shadow_mt; for (int level = src->first_level; level <= src->last_level; level++) { const unsigned depth = src->surf.dim == ISL_SURF_DIM_3D ? @@ -2975,7 +2975,7 @@ intel_update_r8stencil(struct brw_context *brw, } brw_cache_flush_for_read(brw, dst->bo); - src->r8stencil_needs_update = false; + src->shadow_needs_update = false; } static void * diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h index 17668944adc..1a7507023a1 100644 --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h @@ -294,16 +294,16 @@ struct intel_mipmap_tree struct intel_mipmap_tree *stencil_mt; /** -* \brief Stencil texturing miptree for sampling from a stencil texture +* \brief Shadow miptree for sampling when the main isn't supported by HW. * -* Some hardware doesn't support sampling from the stencil texture as -* required by the GL_ARB_stencil_texturing extenion. To workaround this we -* blit the texture into a new texture that can be sampled. 
+* To workaround various sampler bugs and limitations, we blit the main +* texture into a new texture that can be sampled. * -* \see intel_update_r8stencil() +* This miptree may be used for: +* - Stencil texturin
[Mesa-dev] [PATCH v4 4/4] i965: Enabled the OES_copy_image extension on Gen 7 GPUs
OES_copy_image extension was disabled on Gen7 due to the lack of support for ETC2 images. Enabled it back. (Kenneth Graunke) v2: - Removed the blank lines in the comments above OES_copy_image and OES_texture_view extensions in intel_extensions.c (Nanley Chery) --- src/mesa/drivers/dri/i965/intel_extensions.c | 16 1 file changed, 12 insertions(+), 4 deletions(-) diff --git a/src/mesa/drivers/dri/i965/intel_extensions.c b/src/mesa/drivers/dri/i965/intel_extensions.c index 3a95be58a63..d2a6aa185c2 100644 --- a/src/mesa/drivers/dri/i965/intel_extensions.c +++ b/src/mesa/drivers/dri/i965/intel_extensions.c @@ -287,14 +287,22 @@ intelInitExtensions(struct gl_context *ctx) } if (devinfo->gen >= 8 || devinfo->is_baytrail) { - /* For now, we only enable OES_copy_image on platforms that support - * ETC2 natively in hardware. We would need more hacks to support it - * elsewhere. Same with OES_texture_view. + /* For now, we can't enable OES_texture_view on Gen 7 because of + * some piglit failures coming from + * piglit/tests/spec/arb_texture_view/rendering-formats.c that need + * investigation. */ - ctx->Extensions.OES_copy_image = true; ctx->Extensions.OES_texture_view = true; } + if (devinfo->gen >= 7) { + /* We can safely enable OES_copy_image on Gen 7, since we emulate + * the ETC2 support using the shadow_miptree to store the + * compressed data. + */ + ctx->Extensions.OES_copy_image = true; + } + if (devinfo->gen >= 8) { ctx->Extensions.ARB_gpu_shader_int64 = true; /* requires ARB_gpu_shader_int64 */ -- 2.20.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH v4 3/4] i965: Fixed the CopyImageSubData for ETC2 on Gen < 8
For CopyImageSubData to copy the data during the 1st draw call, we need to update the shadow tree right before the rendering. v2: - Added assertion that the miptree doesn't need update at the time we update the texture surface. (Nanley Chery) v3: - As we now update the tree before the rendering we don't need to copy the data during the unmap anymore. Removed the unnecessary update from the intel_miptree_unmap in intel_mipmap_tree.c and modified the intel_miptree_update_etc_shadow.* functions in the same file to update properly the mipmap levels for the mipmaps generation to continue to work after the change. (Nanley Chery) --- src/mesa/drivers/dri/i965/brw_draw.c| 5 + .../drivers/dri/i965/brw_wm_surface_state.c | 2 +- src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 17 ++--- 3 files changed, 8 insertions(+), 16 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_draw.c b/src/mesa/drivers/dri/i965/brw_draw.c index ec4fe0b096f..d00e0a726b1 100644 --- a/src/mesa/drivers/dri/i965/brw_draw.c +++ b/src/mesa/drivers/dri/i965/brw_draw.c @@ -559,6 +559,11 @@ brw_predraw_resolve_inputs(struct brw_context *brw, bool rendering, tex_obj->mt->format == MESA_FORMAT_S_UINT8) { intel_update_r8stencil(brw, tex_obj->mt); } + + if (intel_miptree_has_etc_shadow(brw, tex_obj->mt) && + tex_obj->mt->shadow_needs_update) { + intel_miptree_update_etc_shadow_levels(brw, tex_obj->mt); + } } /* Resolve color for each active shader image. 
*/ diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c index c2cf34aee71..437c7c82555 100644 --- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c +++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c @@ -582,7 +582,7 @@ static void brw_update_texture_surface(struct gl_context *ctx, mt = mt->shadow_mt; format = ISL_FORMAT_R8_UINT; } else if (intel_miptree_needs_fake_etc(brw, mt)) { - assert(mt->shadow_mt); + assert(mt->shadow_mt && !mt->shadow_needs_update); mt = mt->shadow_mt; } diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c index e50db649a23..86085db6a90 100644 --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c @@ -3780,7 +3780,6 @@ intel_miptree_unmap(struct brw_context *brw, unsigned int slice) { struct intel_miptree_map *map = mt->level[level].slice[slice].map; - int level_w, level_h; assert(mt->surf.samples == 1); @@ -3790,20 +3789,10 @@ intel_miptree_unmap(struct brw_context *brw, DBG("%s: mt %p (%s) level %d slice %d\n", __func__, mt, _mesa_get_format_name(mt->format), level, slice); - level_w = minify(mt->surf.phys_level0_sa.width, -level - mt->first_level); - level_h = minify(mt->surf.phys_level0_sa.height, -level - mt->first_level); - if (map->unmap) map->unmap(brw, mt, map, level, slice); intel_miptree_release_map(mt, level, slice); - - if (intel_miptree_has_etc_shadow(brw, mt) && mt->shadow_needs_update) { - intel_miptree_update_etc_shadow(brw, mt, level, slice, level_w, - level_h); - } } enum isl_surf_dim @@ -3936,7 +3925,6 @@ intel_miptree_update_etc_shadow(struct brw_context *brw, if (!mt->shadow_needs_update) return; - mt->shadow_needs_update = false; smt = mt->shadow_mt; etc_mode = GL_MAP_READ_BIT; @@ -3989,10 +3977,9 @@ intel_miptree_update_etc_shadow_levels(struct brw_context *brw, } level_w = minify(smt->surf.logical_level0_px.width, - level - smt->first_level); + level - 
smt->first_level + 1); level_h = minify(smt->surf.logical_level0_px.height, - level - smt->first_level); + level - smt->first_level + 1); } - mt->shadow_needs_update = false; } -- 2.20.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
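The "+ 1" added to the minify calls in this patch is easier to see in isolation. In the loop, the size of the current level is consumed first and the size for the next iteration is computed at the end of the body, so the shift has to be one level ahead. The following is a minimal sketch, assuming a local sketch_minify helper that halves a dimension once per level and clamps at 1; the helper, the struct, and the function names are illustrative and are not the driver's own code.

```c
#include <assert.h>

/* Local helper for illustration: halve `value` once per level, never
 * dropping below 1. Assumes `levels` is smaller than the bit width. */
static unsigned
sketch_minify(unsigned value, unsigned levels)
{
   unsigned v = value >> levels;
   return v > 0 ? v : 1;
}

struct level_size {
   unsigned w, h;
};

/* Record the size used for each level from first_level to last_level of a
 * w0 x h0 base image. Returns the number of entries written to `out`. */
static unsigned
sketch_level_sizes(unsigned w0, unsigned h0,
                   unsigned first_level, unsigned last_level,
                   struct level_size *out)
{
   unsigned n = 0;
   unsigned w = w0, h = h0; /* size of the first level */

   for (unsigned level = first_level; level <= last_level; level++) {
      out[n].w = w;
      out[n].h = h;
      n++;

      /* These values are consumed by the *next* iteration, hence the
       * "+ 1": without it every level after the first would be processed
       * with the size of the level before it. */
      w = sketch_minify(w0, level - first_level + 1);
      h = sketch_minify(h0, level - first_level + 1);
   }
   return n;
}
```

For a 16x4 base with levels 0 to 4, the recorded sizes are 16x4, 8x2, 4x1, 2x1 and 1x1.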
[Mesa-dev] [PATCH v4 2/4] i965: Faking the ETC2 compression on Gen < 8 GPUs using two miptrees.
GPUs with Gen < 8 cannot sample ETC2 formats. So far, the driver converted the compressed EAC/ETC2 images to non-compressed RGBA images. When the GetCompressed* functions were called, the pixels were returned in this RGBA format and not in the compressed format that was expected.

To fix this problem, we use a secondary shadow miptree to store the decompressed data for rendering and the main miptree to store the compressed data for the Get functions to work. Each time the main miptree is written with compressed data, we decompress it to RGB and update the shadow. Then we use the shadow for rendering.

v2:
- Fixes in the commit message (Nanley Chery)
- Reverted the changes in brw_get_texture_swizzle and swapped the b, g values at the time that we decompress the data in the function intel_miptree_update_etc_shadow of intel_mipmap_tree.c (Nanley Chery)
- Simplified the format checks in the miptree_create function of intel_mipmap_tree.c and reserved the call of intel_lower_compressed_format for the case that we are faking the ETC support (Nanley Chery)
- Removed the check for the auxiliary usage for the shadow miptree at creation (miptree_create of intel_mipmap_tree.c) as we won't use auxiliary buffers with these types of trees (Nanley Chery)
- Set the etc_format of the non-ETC miptrees to MESA_FORMAT_NONE and removed the unnecessary checks (Nanley Chery)
- Fixed an unrelated indentation change (Nanley Chery)
- Modified the function intel_miptree_finish_write to set mt->shadow_needs_update to true to catch all the cases when we need to update the miptree (Nanley Chery)
- In order to update the shadow miptree during the unmap of the main one, and to always map the main one (Nanley Chery), the following change was necessary: split the previous update function, which updated all the mipmap levels, into two functions: one that updates one level and one that updates all of them. The first is used during unmap and the second before rendering.
- Removed the BRW_MAP_ETC_BIT flag and the mechanism to decide which miptree should be mapped each time, and reverted all the changes in the higher level texture functions that upload data to textures as they aren't needed anymore.
- Replaced the boolean needs_fake_etc with an inline function that checks when we need to fake the ETC compression (Nanley Chery)
- Removed the initialization of the strides in the update function as the values will be overwritten by the intel_miptree_map call (Nanley Chery)
- Used minify instead of division in the new update function intel_miptree_update_etc_shadow_levels in intel_mipmap_tree.c (Nanley Chery)
- Removed the depth from the calculation of the number of slices in the new update function (intel_miptree_update_etc_shadow_levels of intel_mipmap_tree.c) as we don't need to support 3D ETC images. (Nanley Chery)

v3:
- Renamed the rgba_fmt in function miptree_create (intel_mipmap_tree.c) to decomp_format as the format is not always in rgba order. (Nanley Chery)
- Documented the new usage for the shadow miptree in the comment above the field in the intel_miptree struct in intel_mipmap_tree.h (Nanley Chery)
- Removed the redundant flags from the mapping of the miptrees in intel_miptree_update_etc_shadow of intel_mipmap_tree.c (Nanley Chery)
- Fixed the switch from the surface's logical level to the physical level in intel_miptree_update_etc_shadow_levels of intel_mipmap_tree.c (Nanley Chery)
- Excluded the Baytrail GPUs from the check for the ETC emulation as they support the ETC formats natively. (Nanley Chery)
- Simplified the check if the format is BGRA in intel_miptree_update_etc_shadow of intel_mipmap_tree.c (Nanley Chery)

v4:
- Removed the functions intel_miptree_(map|unmap)_etc and the check if we need to call them, as with the new changes they became unreachable. (Nanley Chery)
- We'd rather calculate the level width and height using the shadow miptree instead of the main one in intel_miptree_update_etc_shadow_levels of intel_mipmap_tree.c (Nanley Chery)
---
 .../drivers/dri/i965/brw_wm_surface_state.c | 5 +-
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 185 +++---
 src/mesa/drivers/dri/i965/intel_mipmap_tree.h | 24 +++
 3 files changed, 147 insertions(+), 67 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
index 618e2ab35bc..c2cf34aee71 100644
--- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
@@ -521,7 +521,7 @@ static void brw_update_texture_surface(struct gl_context *ctx,
           */
          mesa_fmt = mt->format;
       } else if (mt->etc_format != MESA_FORMAT_NONE) {
-         mesa_fmt = mt->format;
+         mesa_fmt = mt->shadow_mt->format;
       } else if (plane > 0) {
          mesa_fmt = mt->format;
Re: [Mesa-dev] [PATCH v3 2/5] i965: Faking the ETC2 compression on Gen < 8 GPUs using two miptrees.
Hi Nanley, On Thu, 7 Feb 2019 15:46:29 -0800 Nanley Chery wrote: > > > @@ -3825,10 +3849,20 @@ intel_miptree_unmap(struct brw_context *brw, > > DBG("%s: mt %p (%s) level %d slice %d\n", __func__, > > mt, _mesa_get_format_name(mt->format), level, slice); > > > > + level_w = minify(mt->surf.phys_level0_sa.width, > > +level - mt->first_level); > > + level_h = minify(mt->surf.phys_level0_sa.height, > > +level - mt->first_level); > > + > > if (map->unmap) > >map->unmap(brw, mt, map, level, slice); > > > > intel_miptree_release_map(mt, level, slice); > > + > > + if (intel_miptree_has_etc_shadow(brw, mt) && > > mt->shadow_needs_update) { > > + intel_miptree_update_etc_shadow(brw, mt, level, slice, > > level_w, > > + level_h); > > + } > > With the next patch applied, the change in this function becomes > unnecessary. Is there any reason you're leaving it around? After a second thought, I believe that this change wasn't unnecessary. There is a problem if we remove it: When we generate mipmaps we need to update the shadow for each level. As the update is done per level during unmap, if we remove the call we end-up with the first level correctly updated but all the others empty. An example: git clone https://github.com/hikiko/test-compression.git make ./test compressed/full.tex This test loads dumped compressed mipmap levels from the full.tex and displays them, if you run it with the per level update inside the unmap you will see all the mipmap levels. Without, you will see only the first, like here: https://imgur.com/a/VvS0CYC Do you have any suggestion on how I could bypass this problem? Thanks again, Eleni ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
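One way the "first level updated, all others empty" symptom can arise is when a per-level update function clears a tree-wide dirty flag itself, so that a loop over all levels returns early after the first one. The sketch below illustrates only that mechanism. It is a guess based on the v3 code quoted in this thread, not a confirmed diagnosis, and every name in it is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_LEVELS 4

/* Hypothetical model: one dirty flag for the whole tree and one "was
 * refreshed" marker per mipmap level. */
struct sketch_tree {
   bool shadow_needs_update;
   bool level_updated[SKETCH_LEVELS];
};

/* Variant A: the per-level update clears the tree-wide flag itself. */
static void
sketch_update_level_clearing(struct sketch_tree *t, int level)
{
   if (!t->shadow_needs_update)
      return;
   t->shadow_needs_update = false;
   t->level_updated[level] = true;
}

/* Variant B: the per-level update leaves the flag to its caller. */
static void
sketch_update_level(struct sketch_tree *t, int level)
{
   if (!t->shadow_needs_update)
      return;
   t->level_updated[level] = true;
}

/* With variant A, only the first level is refreshed: the flag is already
 * false when the loop reaches the second level. */
static void
sketch_update_all_clearing(struct sketch_tree *t)
{
   for (int level = 0; level < SKETCH_LEVELS; level++)
      sketch_update_level_clearing(t, level);
}

/* With variant B, every level is refreshed and the flag is cleared once,
 * after the loop. */
static void
sketch_update_all(struct sketch_tree *t)
{
   for (int level = 0; level < SKETCH_LEVELS; level++)
      sketch_update_level(t, level);
   t->shadow_needs_update = false;
}
```

In this model, clearing the flag in the function that walks all the levels, rather than in the per-level helper, is what keeps the later levels from being skipped.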
Re: [Mesa-dev] [PATCH v3 2/5] i965: Faking the ETC2 compression on Gen < 8 GPUs using two miptrees.
Hello, On Thu, 7 Feb 2019 15:46:29 -0800 Nanley Chery wrote: > > - !(mode & BRW_MAP_DIRECT_BIT)) { > > + !(mode & BRW_MAP_DIRECT_BIT) && > > + !(intel_miptree_needs_fake_etc(brw, mt))) { > >intel_miptree_map_etc(brw, mt, map, level, slice); > > Out of curiosity, is there any reason you wait until patch 5 to delete > this case? No, I just removed this lines together with the unreached map/unmap_etc functions. I will move the change to this patch. > > > } else if (mt->stencil_mt && !(mode & BRW_MAP_DIRECT_BIT)) { > >intel_miptree_map_depthstencil(brw, mt, map, level, slice); > > @@ -3816,6 +3839,7 @@ intel_miptree_unmap(struct brw_context *brw, > > unsigned int slice) > > { > > struct intel_miptree_map *map = > > mt->level[level].slice[slice].map; > > + int level_w, level_h; > > > > assert(mt->surf.samples == 1); > > > > @@ -3825,10 +3849,20 @@ intel_miptree_unmap(struct brw_context *brw, > > DBG("%s: mt %p (%s) level %d slice %d\n", __func__, > > mt, _mesa_get_format_name(mt->format), level, slice); > > > > + level_w = minify(mt->surf.phys_level0_sa.width, > > +level - mt->first_level); > > + level_h = minify(mt->surf.phys_level0_sa.height, > > +level - mt->first_level); > > + > > if (map->unmap) > >map->unmap(brw, mt, map, level, slice); > > > > intel_miptree_release_map(mt, level, slice); > > + > > + if (intel_miptree_has_etc_shadow(brw, mt) && > > mt->shadow_needs_update) { > > + intel_miptree_update_etc_shadow(brw, mt, level, slice, > > level_w, > > + level_h); > > + } > > With the next patch applied, the change in this function becomes > unnecessary. Is there any reason you're leaving it around? Right, if we force the update before the rendering, we don't need to copy the data during the unmap. I will remove it, sorry I dismissed it in the previous email. 
> > } > > > > enum isl_surf_dim > > @@ -3943,3 +3977,81 @@ intel_miptree_get_clear_color(const struct > > gen_device_info *devinfo, return mt->fast_clear_color; > > } > > } > > + > > +static void > > +intel_miptree_update_etc_shadow(struct brw_context *brw, > > +struct intel_mipmap_tree *mt, > > +unsigned int level, > > +unsigned int slice, > > +int level_w, > > +int level_h) > > +{ > > + struct intel_mipmap_tree *smt; > > + ptrdiff_t etc_stride, shadow_stride; > > + GLbitfield etc_mode, shadow_mode; > > + void *mptr, *sptr; > > + > > + assert(intel_miptree_has_etc_shadow(brw, mt)); > > + if (!mt->shadow_needs_update) > > + return; > > + > > + mt->shadow_needs_update = false; > > + smt = mt->shadow_mt; > > + > > + etc_mode = GL_MAP_READ_BIT; > > + shadow_mode = GL_MAP_WRITE_BIT | GL_MAP_INVALIDATE_RANGE_BIT; > > + > > + intel_miptree_map(brw, mt, level, slice, 0, 0, level_w, level_h, > > + etc_mode, , _stride); > > + intel_miptree_map(brw, smt, level, slice, 0, 0, level_w, > > level_h, > > + shadow_mode, , _stride); > > + > > + if (mt->format == MESA_FORMAT_ETC1_RGB8) { > > + _mesa_etc1_unpack_rgba(sptr, shadow_stride, mptr, > > etc_stride, > > + level_w, level_h); > > + } else { > > + /* destination and source images must have the same swizzle > > */ > > + bool is_bgra = (smt->format == MESA_FORMAT_B8G8R8A8_SRGB); > > + _mesa_unpack_etc2_format(sptr, shadow_stride, mptr, > > etc_stride, > > + level_w, level_h, mt->format, > > is_bgra); > > + } > > + > > + intel_miptree_unmap(brw, mt, level, slice); > > + intel_miptree_unmap(brw, smt, level, slice); > > +} > > + > > +void > > +intel_miptree_update_etc_shadow_levels(struct brw_context *brw, > > + struct intel_mipmap_tree > > *mt) +{ > > + struct intel_mipmap_tree *smt; > > + int num_slices; > > + int level_w, level_h; > > + > > + assert(mt); > > + assert(mt->surf.size_B > 0); > > + > > + assert(intel_miptree_has_etc_shadow(brw, mt)); > > + > > + smt = mt->shadow_mt; > > + > > + level_w = 
smt->surf.logical_level0_px.width; > > + level_h = smt->surf.logical_level0_px.height; > > + > > + num_slices = smt->surf.logical_level0_px.array_len; > > + > > + for (int level = smt->first_level; level <= smt->last_level; > > level++) > > + { > > + for (unsigned int slice = 0; slice < num_slices; slice++) { > > + intel_miptree_update_etc_shadow(brw, mt, level, slice, > > level_w, > > + level_h); > > + } > > + > > + level_w = minify(mt->surf.logical_level0_px.width, > > + level - mt->first_level); > > + level_h
Re: [Mesa-dev] [PATCH v2 5/5] i965: Enabled the OES_copy_image extension on Gen 7 GPUs
On Thu, 7 Feb 2019 11:18:59 -0500 Ilia Mirkin wrote: > On Thu, Feb 7, 2019 at 2:49 AM Eleni Maria Stea > wrote: > > > > On Wed, 6 Feb 2019 12:12:27 -0800 > > Nanley Chery wrote: > > > > > > + * For now, we can't enable OES_texture_view on Gen 7 > > > > because of > > > > + * some piglit failures coming from > > > > + * piglit/tests/spec/arb_texture_view/rendering-formats.c > > > > that need > > > > + * investigation. > > > > */ > > > > > > What kind of failures are you seeing? I'd imagine texture views to > > > work with this version of your series. > > > > > > > Hi Nanley, > > > > If you run the piglit test: arb_texture_view-rendering-format, and > > grep for failures on HSW: > > > > PIGLIT: {"subtest": {"clear GL_RGB10_A2 as GL_R32F" : "fail"}} > > PIGLIT: {"subtest": {"clear GL_RGB10_A2 as GL_RG16F" : "fail"}} > > PIGLIT: {"subtest": {"clear GL_RGB10_A2 as GL_RG16I" : "fail"}} > > PIGLIT: {"subtest": {"clear GL_RGB10_A2 as GL_RG16_SNORM" : "fail"}} > > PIGLIT: {"subtest": {"clear GL_RGB10_A2 as GL_RGBA8I" : "fail"}} > > PIGLIT: {"subtest": {"clear GL_RGB10_A2 as GL_RGBA8_SNORM" : > > "fail"}} > > > > I remember seeing similar errors on Ivy too. They must be > > irrelevant to the ETC support but as this test passes on BDW where > > the extension is enabled, I didn't enable it on Gen 7 for the > > moment. I think I had discussed about these failures with Kenneth > > before I disabled them, but I didn't investigated them further > > after that. > > Do you also see the failures with desktop GL (and ARB_texture_view)? > If not, that'd be very surprising. > > Note that the piglit test arb_texture_view-rendering-formats is the > desktop GL test. arb_texture_view-rendering-formats_gles3 is the ES > version. 
> > -ilia > Hi Ilia, I just checked on HSW and IVY with my final patches (sent a few minutes before your reply) and: HSW: extension disabled: the desktop test passes but we receive the following error several times: User Error: GL_INVALID_OPERATION in glTextureView(internalformat X not compatible with origtexture Y) in each subtest. extension enabled: I see the same error but now both the desktop and gles versions pass (which wasn't the case when I checked last week with my previous patches) I could probably enable it now on gen >= 75, if you and Nanley (CC-ed) are OK with this decision. What do you think? on Ivy: --- extension disabled: the desktop version of the test fails with the failures below (and the gles is skipped) extension enabled: both the desktop and the gles versions fail and the failures are the same (see below) PIGLIT: {"subtest": {"clear GL_RGBA16_SNORM as GL_RG32F" : "fail"}} PIGLIT: {"subtest": {"clear GL_RGBA16_SNORM as GL_RG32UI" : "fail"}} PIGLIT: {"subtest": {"clear GL_RGBA16_SNORM as GL_RG32I" : "fail"}} PIGLIT: {"subtest": {"clear GL_RGBA16_SNORM as GL_RGBA16F" : "fail"}} PIGLIT: {"subtest": {"clear GL_RGBA16_SNORM as GL_RGBA16UI" : "fail"}} PIGLIT: {"subtest": {"clear GL_RGBA16_SNORM as GL_RGBA16" : "fail"}} PIGLIT: {"subtest": {"clear GL_RGB16_SNORM as GL_RGB16F" : "fail"}} PIGLIT: {"subtest": {"clear GL_RGB16_SNORM as GL_RGB16" : "fail"}} PIGLIT: {"subtest": {"clear GL_RG16_SNORM as GL_R32F" : "fail"}} PIGLIT: {"subtest": {"clear GL_RG16_SNORM as GL_R32UI" : "fail"}} PIGLIT: {"subtest": {"clear GL_RG16_SNORM as GL_R32I" : "fail"}} PIGLIT: {"subtest": {"clear GL_RG16_SNORM as GL_RG16F" : "fail"}} PIGLIT: {"subtest": {"clear GL_RG16_SNORM as GL_RG16UI" : "fail"}} PIGLIT: {"subtest": {"clear GL_RG16_SNORM as GL_RG16" : "fail"}} PIGLIT: {"subtest": {"clear GL_RG16_SNORM as GL_RGBA8UI" : "fail"}} PIGLIT: {"subtest": {"clear GL_RG16_SNORM as GL_RGBA8I" : "fail"}} PIGLIT: {"subtest": {"clear GL_RG16_SNORM as GL_RGBA8" : "fail"}} PIGLIT: 
{"subtest": {"clear GL_RG16_SNORM as GL_RGBA8_SNORM" : "fail"}} PIGLIT: {"subtest": {"clear GL_RG16_SNORM as GL_RGB10_A2UI" : "fail"}} PIGLIT: {"s
[Mesa-dev] [PATCH v3 0/5] improved the support for ETC2 formats on Gen 7
Intel Gen 7 GPUs don't support the ETC2 formats natively, so in order to display the pixels properly we convert them to RGBA and create RGBA miptrees. The problem is that the GetCompressed* functions, which should return the compressed pixel values, return the RGBA values instead. These patches attempt to solve this problem by using two miptrees: the main one to store the ETC values and the generic shadow (mt->shadow_mt) to store the RGBA values. Each time the main miptree is unmapped, we unpack the ETC data to RGBA and update the shadow. Similarly, we update all the mipmap levels of the image (if necessary) before drawing, for CopyImageSubData to work. Also, the OES_copy_image extension, which couldn't work on Gen 7 due to the lack of ETC support, is now re-enabled. Finally, the following glcts and piglit tests pass: On HSW (previously failing): KHR-GL46.direct_state_access.textures_compressed_subimage On HSW and IVB (previously disabled): dEQP-GLES31.functional.texture.border_clamp.formats.compressed_srgb8_alpha8_etc2_eac.* (6 tests) dEQP-GLES31.functional.texture.border_clamp.formats.compressed_srgb8_etc2.* (6 tests) dEQP-GLES31.functional.texture.border_clamp.formats.compressed_srgb8_punchthrough_alpha1_etc2.* (6 tests) On HSW, IVB, SNB (previously disabled): dEQP-GLES3.functional.texture.format.compressed.* (12 tests) dEQP-GLES3.functional.texture.wrap.etc2_eac_srgb8_alpha8.* (36 tests) dEQP-GLES3.functional.texture.wrap.etc2_srgb8.* (36 tests) dEQP-GLES3.functional.texture.wrap.etc2_srgb8_punchthrough_alpha1.* (36 tests) piglit.spec.!opengl es 3_0.oes_compressed_etc2_texture-miptree_gles3 (srgb8, srgb8-alpha, srgb8-punchthrough-alpha1) piglit.spec.arb_es3_compatibility.oes_compressed_etc2_texture-miptree (srgb8 compat, srgb8 core, srgb8-alpha8 compat, srgb8-alpha8 core, srgb8-punchthrough-alpha1 compat, srgb8-punchthrough-alpha1 core) (9 tests) Total tests passing: 148 --- Eleni Maria Stea (4): i965: Faking the ETC2 compression on Gen < 8 
GPUs using two miptrees. i965: Fixed the CopyImageSubData for ETC2 on Gen < 8 i965: Enabled the OES_copy_image extension on Gen 7 GPUs i965: Removed unused intel_miptree_map_etc/intel_miptree_unmap_etc Nanley Chery (1): i965: Rename intel_mipmap_tree::r8stencil_* -> ::shadow_* src/mesa/drivers/dri/i965/brw_draw.c | 5 + .../drivers/dri/i965/brw_wm_surface_state.c | 13 +- src/mesa/drivers/dri/i965/intel_extensions.c | 16 +- src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 201 +++--- src/mesa/drivers/dri/i965/intel_mipmap_tree.h | 38 +++- 5 files changed, 183 insertions(+), 90 deletions(-) -- 2.20.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
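The two-miptree idea from the cover letter above can be illustrated with a small, standalone sketch. This is only a model under stated assumptions: struct sketch_tex and every sketch_* function are hypothetical names, not i965 code, and the "decoder" is a placeholder byte transform rather than a real ETC unpacker. It shows the main copy serving read-back, the shadow serving the draw path, and a dirty flag deciding when to re-decode.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a main/shadow miptree pair; not the i965 structs. */
struct sketch_tex {
   unsigned char compressed[8];   /* what GetCompressed* should return */
   unsigned char decoded[8];      /* what the sampler reads (the "shadow") */
   bool shadow_needs_update;      /* set on every write to the main copy */
   int decode_count;              /* how many times the shadow was rebuilt */
};

/* Placeholder decoder: real code would unpack ETC blocks to RGBA. */
static void sketch_decode(struct sketch_tex *t)
{
   for (size_t i = 0; i < sizeof(t->compressed); i++)
      t->decoded[i] = (unsigned char)(t->compressed[i] ^ 0xff);
   t->decode_count++;
}

/* Upload path: only the main (compressed) copy is written. */
static void sketch_upload(struct sketch_tex *t, const unsigned char *src)
{
   assert(t != NULL && src != NULL);
   memcpy(t->compressed, src, sizeof(t->compressed));
   t->shadow_needs_update = true;
}

/* Read-back path: returns the compressed bytes untouched. */
static const unsigned char *sketch_get_compressed(const struct sketch_tex *t)
{
   return t->compressed;
}

/* Draw path: refresh the shadow only if the main copy changed. */
static const unsigned char *sketch_sample_source(struct sketch_tex *t)
{
   if (t->shadow_needs_update) {
      sketch_decode(t);
      t->shadow_needs_update = false;
   }
   return t->decoded;
}
```

Calling sketch_sample_source twice after one upload decodes only once, which is the point of keeping the flag next to the shadow.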
[Mesa-dev] [PATCH v3 3/5] i965: Fixed the CopyImageSubData for ETC2 on Gen < 8
For CopyImageSubData to copy the data during the 1st draw call, we need to update the shadow tree right before the rendering. v2: - Added assertion that the miptree doesn't need update at the time we update the texture surface. (Nanley Chery) --- src/mesa/drivers/dri/i965/brw_draw.c | 5 + src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 2 +- 2 files changed, 6 insertions(+), 1 deletion(-) diff --git a/src/mesa/drivers/dri/i965/brw_draw.c b/src/mesa/drivers/dri/i965/brw_draw.c index ec4fe0b096f..d00e0a726b1 100644 --- a/src/mesa/drivers/dri/i965/brw_draw.c +++ b/src/mesa/drivers/dri/i965/brw_draw.c @@ -559,6 +559,11 @@ brw_predraw_resolve_inputs(struct brw_context *brw, bool rendering, tex_obj->mt->format == MESA_FORMAT_S_UINT8) { intel_update_r8stencil(brw, tex_obj->mt); } + + if (intel_miptree_has_etc_shadow(brw, tex_obj->mt) && + tex_obj->mt->shadow_needs_update) { + intel_miptree_update_etc_shadow_levels(brw, tex_obj->mt); + } } /* Resolve color for each active shader image. */ diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c index c2cf34aee71..437c7c82555 100644 --- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c +++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c @@ -582,7 +582,7 @@ static void brw_update_texture_surface(struct gl_context *ctx, mt = mt->shadow_mt; format = ISL_FORMAT_R8_UINT; } else if (intel_miptree_needs_fake_etc(brw, mt)) { - assert(mt->shadow_mt); + assert(mt->shadow_mt && !mt->shadow_needs_update); mt = mt->shadow_mt; } -- 2.20.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
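The all-levels update called before drawing has to walk every mipmap level with that level's own dimensions. A minimal sketch of one way to derive them follows; sketch_minify mirrors the usual "halve per level, clamp to 1" rule, and sketch_level_sizes with its struct are hypothetical helpers for illustration, not functions from the patch. The size is computed at the top of each iteration, so level N always gets the size for N.

```c
#include <assert.h>

/* Halve once per level, never dropping below 1. Levels assumed < 32. */
static unsigned sketch_minify(unsigned value, unsigned levels)
{
   unsigned v = value >> levels;
   return v > 0 ? v : 1;
}

struct sketch_level_dims {
   unsigned w, h;
};

/* Fill dims[] for levels first_level..last_level of a base_w x base_h
 * image and return how many entries were written. The caller provides
 * an array with at least (last_level - first_level + 1) entries.
 */
static unsigned sketch_level_sizes(unsigned base_w, unsigned base_h,
                                   unsigned first_level, unsigned last_level,
                                   struct sketch_level_dims *dims)
{
   assert(first_level <= last_level);
   unsigned n = 0;
   for (unsigned level = first_level; level <= last_level; level++) {
      /* Dimensions for *this* level, relative to the first one. */
      dims[n].w = sketch_minify(base_w, level - first_level);
      dims[n].h = sketch_minify(base_h, level - first_level);
      n++;
   }
   return n;
}
```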
[Mesa-dev] [PATCH v3 5/5] i965: Removed unused intel_miptree_map_etc/intel_miptree_unmap_etc
Functions intel_miptree_(map|unmap)_etc are not reached anymore, as we now use the shadow_mt of each compressed ETC miptree for the emulation. We removed the functions. v2: - In the previous patch series, we only removed the assertions that the tree was mapped for writing. We can now safely remove the whole functions as they won't be reached anymore. (Nanley Chery) --- src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 59 --- 1 file changed, 59 deletions(-) diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c index c7367fc385f..a40f606f351 100644 --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c @@ -3476,61 +3476,6 @@ intel_miptree_map_s8(struct brw_context *brw, map->unmap = intel_miptree_unmap_s8; } -static void -intel_miptree_unmap_etc(struct brw_context *brw, -struct intel_mipmap_tree *mt, -struct intel_miptree_map *map, -unsigned int level, -unsigned int slice) -{ - uint32_t image_x; - uint32_t image_y; - intel_miptree_get_image_offset(mt, level, slice, _x, _y); - - image_x += map->x; - image_y += map->y; - - uint8_t *dst = intel_miptree_map_raw(brw, mt, GL_MAP_WRITE_BIT) -+ image_y * mt->surf.row_pitch_B -+ image_x * mt->cpp; - - if (mt->etc_format == MESA_FORMAT_ETC1_RGB8) - _mesa_etc1_unpack_rgba(dst, mt->surf.row_pitch_B, - map->ptr, map->stride, - map->w, map->h); - else - _mesa_unpack_etc2_format(dst, mt->surf.row_pitch_B, - map->ptr, map->stride, - map->w, map->h, mt->etc_format, true); - - intel_miptree_unmap_raw(mt); - free(map->buffer); -} - -static void -intel_miptree_map_etc(struct brw_context *brw, - struct intel_mipmap_tree *mt, - struct intel_miptree_map *map, - unsigned int level, - unsigned int slice) -{ - assert(mt->etc_format != MESA_FORMAT_NONE); - if (mt->etc_format == MESA_FORMAT_ETC1_RGB8) { - assert(mt->format == MESA_FORMAT_R8G8B8X8_UNORM); - } - - assert(map->mode & GL_MAP_WRITE_BIT); - assert(map->mode & 
GL_MAP_INVALIDATE_RANGE_BIT); - - intel_miptree_access_raw(brw, mt, level, slice, true); - - map->stride = _mesa_format_row_stride(mt->etc_format, map->w); - map->buffer = malloc(_mesa_format_image_size(mt->etc_format, -map->w, map->h, 1)); - map->ptr = map->buffer; - map->unmap = intel_miptree_unmap_etc; -} - /** * Mapping functions for packed depth/stencil miptrees backed by real separate * miptrees for depth and stencil. @@ -3803,10 +3748,6 @@ intel_miptree_map(struct brw_context *brw, if (mt->format == MESA_FORMAT_S_UINT8) { intel_miptree_map_s8(brw, mt, map, level, slice); - } else if (mt->etc_format != MESA_FORMAT_NONE && - !(mode & BRW_MAP_DIRECT_BIT) && - !(intel_miptree_needs_fake_etc(brw, mt))) { - intel_miptree_map_etc(brw, mt, map, level, slice); } else if (mt->stencil_mt && !(mode & BRW_MAP_DIRECT_BIT)) { intel_miptree_map_depthstencil(brw, mt, map, level, slice); } else if (use_intel_mipree_map_blit(brw, mt, map)) { -- 2.20.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH v3 4/5] i965: Enabled the OES_copy_image extension on Gen 7 GPUs
OES_copy_image extension was disabled on Gen7 due to the lack of support for ETC2 images. Enabled it back. (Kenneth Graunke) v2: - Removed the blank lines in the comments above OES_copy_image and OES_texture_view extensions in intel_extensions.c (Nanley Chery) --- src/mesa/drivers/dri/i965/intel_extensions.c | 16 1 file changed, 12 insertions(+), 4 deletions(-) diff --git a/src/mesa/drivers/dri/i965/intel_extensions.c b/src/mesa/drivers/dri/i965/intel_extensions.c index 3a95be58a63..d2a6aa185c2 100644 --- a/src/mesa/drivers/dri/i965/intel_extensions.c +++ b/src/mesa/drivers/dri/i965/intel_extensions.c @@ -287,14 +287,22 @@ intelInitExtensions(struct gl_context *ctx) } if (devinfo->gen >= 8 || devinfo->is_baytrail) { - /* For now, we only enable OES_copy_image on platforms that support - * ETC2 natively in hardware. We would need more hacks to support it - * elsewhere. Same with OES_texture_view. + /* For now, we can't enable OES_texture_view on Gen 7 because of + * some piglit failures coming from + * piglit/tests/spec/arb_texture_view/rendering-formats.c that need + * investigation. */ - ctx->Extensions.OES_copy_image = true; ctx->Extensions.OES_texture_view = true; } + if (devinfo->gen >= 7) { + /* We can safely enable OES_copy_image on Gen 7, since we emulate + * the ETC2 support using the shadow_miptree to store the + * compressed data. + */ + ctx->Extensions.OES_copy_image = true; + } + if (devinfo->gen >= 8) { ctx->Extensions.ARB_gpu_shader_int64 = true; /* requires ARB_gpu_shader_int64 */ -- 2.20.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
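The gating in the hunk above can be restated as a tiny standalone function, which makes the per-generation result easy to check. The struct and function names here are hypothetical; only the two conditions are taken from the diff.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical container for the two extension flags touched by the patch. */
struct sketch_exts {
   bool oes_copy_image;
   bool oes_texture_view;
};

/* Mirrors the conditions in the hunk: texture views stay limited to
 * Gen 8+ and Baytrail, copy_image is enabled from Gen 7 onwards.
 */
static struct sketch_exts sketch_init_exts(int gen, bool is_baytrail)
{
   assert(gen > 0);
   struct sketch_exts e = { false, false };

   if (gen >= 8 || is_baytrail)
      e.oes_texture_view = true;

   if (gen >= 7)
      e.oes_copy_image = true;

   return e;
}
```

With these conditions, a non-Baytrail Gen 7 part ends up with OES_copy_image but without OES_texture_view.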
[Mesa-dev] [PATCH v3 2/5] i965: Faking the ETC2 compression on Gen < 8 GPUs using two miptrees.
Gen < 8 GPUs cannot sample ETC2 formats. So far, they converted the compressed EAC/ETC2 images to non-compressed RGBA images. When GetCompressed* functions were called, the pixels were returned in this RGBA format and not the compressed format that was expected. To fix this problem, we use a secondary shadow miptree to store the decompressed data for the rendering and the main miptree to store the compressed data for the Get functions to work. Each time that the main miptree is written with compressed data, we decompress them to RGB and update the shadow. Then we use the shadow for rendering. v2: - Fixes in the commit message (Nanley Chery) - Reversed the changes in brw_get_texture_swizzle and swapped the b, g values at the time that we decompress the data in the function: intel_miptree_update_etc_shadow of intel_mipmap_tree.c (Nanley Chery) - Simplified the format checks in the miptree_create function of the intel_mipmap_tree.c and reserved the call of the intel_lower_compressed_format for the case that we are faking the ETC support (Nanley Chery) - Removed the check for the auxiliary usage for the shadow miptree at creation (miptree_create of intel_mipmap_tree.c) as we won't use auxiliary buffers with these types of trees (Nanley Chery) - Set the etc_format of the non-ETC miptrees to MESA_FORMAT_NONE and removed the unnecessary checks (Nanley Chery) - Fixed an unrelated indentation change (Nanley Chery) - Modified the function intel_miptree_finish_write to set the mt->shadow_needs_update to true to catch all the cases when we need to update the miptree (Nanley Chery) - In order to update the shadow miptree during the unmap of the main and always map the main (Nanley Chery) the following change was necessary: Split the previous update function, which updated all the mipmap levels, into two functions: one that updates one level and one that updates all of them. Used the first during unmap and the second before the rendering. 
- Removed the BRW_MAP_ETC_BIT flag and the mechanism to decide which miptree should be mapped each time and reversed all the changes in the higher level texture functions that upload data to textures as they aren't needed anymore. - Replaced the boolean needs_fake_etc with an inline function that checks when we need to fake the ETC compression (Nanley Chery) - Removed the initialization of the strides in the update function as the values will be overwritten by the intel_miptree_map call (Nanley Chery) - Used minify instead of division in the new update function intel_miptree_update_etc_shadow_levels in intel_mipmap_tree.c (Nanley Chery) - Removed the depth from the calculation of the number of slices in the new update function (intel_miptree_update_etc_shadow_levels of intel_mipmap_tree.c) as we don't need to support 3D ETC images. (Nanley Chery) v3: - Renamed the rgba_fmt in function miptree_create (intel_mipmap_tree.c) to decomp_format as the format is not always in rgba order. (Nanley Chery) - Documented the new usage for the shadow miptree in the comment above the field in the intel_miptree struct in intel_mipmap_tree.h (Nanley Chery) - Removed the redundant flags from the mapping of the miptrees in intel_miptree_update_etc_shadow of intel_mipmap_tree.c (Nanley Chery) - Fixed the switch from surface's logical level to physical level in the intel_miptree_update_etc_shadow_levels of intel_mipmap_tree.c (Nanley Chery) - Excluded the Baytrail GPUs from the check for the ETC emulation as they support the ETC formats natively. 
(Nanley Chery) - Simplified the check if the format is BGRA in intel_miptree_update_etc_shadow of intel_mipmap_tree.c (Nanley Chery) --- .../drivers/dri/i965/brw_wm_surface_state.c | 5 +- src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 130 -- src/mesa/drivers/dri/i965/intel_mipmap_tree.h | 24 3 files changed, 149 insertions(+), 10 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c index 618e2ab35bc..c2cf34aee71 100644 --- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c +++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c @@ -521,7 +521,7 @@ static void brw_update_texture_surface(struct gl_context *ctx, */ mesa_fmt = mt->format; } else if (mt->etc_format != MESA_FORMAT_NONE) { - mesa_fmt = mt->format; + mesa_fmt = mt->shadow_mt->format; } else if (plane > 0) { mesa_fmt = mt->format; } else { @@ -581,6 +581,9 @@ static void brw_update_texture_surface(struct gl_context *ctx, assert(mt->shadow_mt && !mt->shadow_needs_update); mt = mt->shadow_mt; format = ISL_FORMAT_R8_UINT; + } else if (intel_miptree_needs_fake_etc(brw, mt)) { + assert(mt->shadow_mt); + mt = mt->shadow_mt;
[Mesa-dev] [PATCH v3 1/5] i965: Rename intel_mipmap_tree::r8stencil_* -> ::shadow_*
From: Nanley Chery Use more generic field names. We'll reuse these fields for a workaround with ASTC miptrees. Reviewed-by: Eleni Maria Stea --- src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 8 src/mesa/drivers/dri/i965/intel_mipmap_tree.c| 16 src/mesa/drivers/dri/i965/intel_mipmap_tree.h| 14 +++--- 3 files changed, 19 insertions(+), 19 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c index b067a174056..618e2ab35bc 100644 --- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c +++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c @@ -571,15 +571,15 @@ static void brw_update_texture_surface(struct gl_context *ctx, if (obj->StencilSampling && firstImage->_BaseFormat == GL_DEPTH_STENCIL) { if (devinfo->gen <= 7) { -assert(mt->r8stencil_mt && !mt->stencil_mt->r8stencil_needs_update); -mt = mt->r8stencil_mt; +assert(mt->shadow_mt && !mt->stencil_mt->shadow_needs_update); +mt = mt->shadow_mt; } else { mt = mt->stencil_mt; } format = ISL_FORMAT_R8_UINT; } else if (devinfo->gen <= 7 && mt->format == MESA_FORMAT_S_UINT8) { - assert(mt->r8stencil_mt && !mt->r8stencil_needs_update); - mt = mt->r8stencil_mt; + assert(mt->shadow_mt && !mt->shadow_needs_update); + mt = mt->shadow_mt; format = ISL_FORMAT_R8_UINT; } diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c index b4e3524aa51..479188fd1c8 100644 --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c @@ -1214,7 +1214,7 @@ intel_miptree_release(struct intel_mipmap_tree **mt) brw_bo_unreference((*mt)->bo); intel_miptree_release(&(*mt)->stencil_mt); - intel_miptree_release(&(*mt)->r8stencil_mt); + intel_miptree_release(&(*mt)->shadow_mt); intel_miptree_aux_buffer_free((*mt)->aux_buf); free_aux_state_map((*mt)->aux_state); @@ -2427,7 +2427,7 @@ intel_miptree_finish_write(struct brw_context *brw, switch (mt->aux_usage) { case ISL_AUX_USAGE_NONE: 
if (mt->format == MESA_FORMAT_S_UINT8 && devinfo->gen <= 7) - mt->r8stencil_needs_update = true; + mt->shadow_needs_update = true; break; case ISL_AUX_USAGE_MCS: @@ -2933,9 +2933,9 @@ intel_update_r8stencil(struct brw_context *brw, assert(src->surf.size_B > 0); - if (!mt->r8stencil_mt) { + if (!mt->shadow_mt) { assert(devinfo->gen > 6); /* Handle MIPTREE_LAYOUT_GEN6_HIZ_STENCIL */ - mt->r8stencil_mt = make_surface( + mt->shadow_mt = make_surface( brw, src->target, MESA_FORMAT_R_UINT8, @@ -2949,13 +2949,13 @@ intel_update_r8stencil(struct brw_context *brw, ISL_TILING_Y0_BIT, ISL_SURF_USAGE_TEXTURE_BIT, BO_ALLOC_BUSY, 0, NULL); - assert(mt->r8stencil_mt); + assert(mt->shadow_mt); } - if (src->r8stencil_needs_update == false) + if (src->shadow_needs_update == false) return; - struct intel_mipmap_tree *dst = mt->r8stencil_mt; + struct intel_mipmap_tree *dst = mt->shadow_mt; for (int level = src->first_level; level <= src->last_level; level++) { const unsigned depth = src->surf.dim == ISL_SURF_DIM_3D ? @@ -2975,7 +2975,7 @@ intel_update_r8stencil(struct brw_context *brw, } brw_cache_flush_for_read(brw, dst->bo); - src->r8stencil_needs_update = false; + src->shadow_needs_update = false; } static void * diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h index 17668944adc..1a7507023a1 100644 --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h @@ -294,16 +294,16 @@ struct intel_mipmap_tree struct intel_mipmap_tree *stencil_mt; /** -* \brief Stencil texturing miptree for sampling from a stencil texture +* \brief Shadow miptree for sampling when the main isn't supported by HW. * -* Some hardware doesn't support sampling from the stencil texture as -* required by the GL_ARB_stencil_texturing extenion. To workaround this we -* blit the texture into a new texture that can be sampled. 
+* To workaround various sampler bugs and limitations, we blit the main +* texture into a new texture that can be sampled. * -* \see intel_update_r8stencil() +* This miptree may be used for: +* - Stencil texturin
Re: [Mesa-dev] [PATCH v2 5/5] i965: Enabled the OES_copy_image extension on Gen 7 GPUs
On Wed, 6 Feb 2019 12:12:27 -0800 Nanley Chery wrote: > > + * For now, we can't enable OES_texture_view on Gen 7 > > because of > > + * some piglit failures coming from > > + * piglit/tests/spec/arb_texture_view/rendering-formats.c > > that need > > + * investigation. > > */ > > What kind of failures are you seeing? I'd imagine texture views to > work with this version of your series. > Hi Nanley, If you run the piglit test arb_texture_view-rendering-formats and grep for failures on HSW, you get: PIGLIT: {"subtest": {"clear GL_RGB10_A2 as GL_R32F" : "fail"}} PIGLIT: {"subtest": {"clear GL_RGB10_A2 as GL_RG16F" : "fail"}} PIGLIT: {"subtest": {"clear GL_RGB10_A2 as GL_RG16I" : "fail"}} PIGLIT: {"subtest": {"clear GL_RGB10_A2 as GL_RG16_SNORM" : "fail"}} PIGLIT: {"subtest": {"clear GL_RGB10_A2 as GL_RGBA8I" : "fail"}} PIGLIT: {"subtest": {"clear GL_RGB10_A2 as GL_RGBA8_SNORM" : "fail"}} I remember seeing similar errors on Ivy too. They must be unrelated to the ETC support, but as this test passes on BDW, where the extension is enabled, I didn't enable it on Gen 7 for the moment. I think I had discussed these failures with Kenneth before I disabled them, but I didn't investigate them further after that. Do you think I should re-enable it? Thanks, Eleni
Re: [Mesa-dev] [PATCH 4/8] i965: Update the shadow miptree from the main to fake the ETC2 compression
On Fri, 18 Jan 2019 17:09:03 -0800 Nanley Chery wrote: > On Mon, Nov 19, 2018 at 10:54:08AM +0200, Eleni Maria Stea wrote: [...] > > + int img_d = smt->surf.logical_level0_px.depth; > > I don't think 3D ETC textures are possible. From the GL4.6 spec: > > An INVALID_OPERATION error is generated by > CompressedTexImage3D if internalformat is one of the EAC, ETC2, or > RGTC formats and either border is non-zero, or target is not > TEXTURE_2D_ARRAY. Hi Nanley, Thanks for pointing this out. I've made the change in my new series of patches, but after giving it a second thought, I believe that I'd rather put the depth back in the calculation of num_slices. As I understand the spec, if the border is zero, 3D images should be supported. Mesa already checks the border value in the function compressed_texture_error_check of src/mesa/main/teximage.c, which has the comment /* No compressed formats support borders at this time */, so only ETC/EAC compressed formats without a border will reach the update function and we should support them. Also, I see that we have some CTS tests that call CompressedTexImage3D for ETC/EAC formats with a border value of 0, so I suppose that 3D images of these formats are expected. What do you think? Thank you in advance, Eleni
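Whichever reading of the spec turns out to be right, the num_slices question comes down to how many slices are walked per level. A minimal sketch, assuming the depth were put back into the calculation: array textures keep array_len slices at every level, while a 3D image's depth shrinks with the level. The function name and parameters are hypothetical and are not taken from the series.

```c
#include <assert.h>
#include <stdbool.h>

/* Number of slices to visit at a given mipmap level.
 * - 2D array: the layer count is the same at every level.
 * - 3D: depth halves per level and never drops below 1.
 * level is assumed to be < 32.
 */
static unsigned sketch_num_slices(bool is_3d, unsigned depth0,
                                  unsigned array_len, unsigned level)
{
   assert(depth0 >= 1 && array_len >= 1);

   if (!is_3d)
      return array_len;

   unsigned d = depth0 >> level;
   return d > 0 ? d : 1;
}
```

If 3D ETC images can never reach the update function, the is_3d branch is simply dead and only array_len matters.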
[Mesa-dev] [PATCH v2 3/5] i965: Faking the ETC2 compression on Gen < 8 GPUs using two miptrees.
Gen < 8 GPUs cannot sample ETC2 formats. So far, they converted the compressed EAC/ETC2 images to non-compressed RGBA images. When GetCompressed* functions were called, the pixels were returned in this RGBA format and not the compressed format that was expected. To fix this problem, we use a secondary shadow miptree to store the decompressed data for the rendering and the main miptree to store the compressed data for the Get functions to work. Each time that the main miptree is written with compressed data, we decompress them to RGB and update the shadow. Then we use the shadow for rendering. v2: - Fixes in the commit message (Nanley Chery) - Reversed the changes in brw_get_texture_swizzle and swapped the b, g values at the time that we decompress the data in the function: intel_miptree_update_etc_shadow of intel_mipmap_tree.c (Nanley Chery) - Simplified the format checks in the miptree_create function of the intel_mipmap_tree.c and reserved the call of the intel_lower_compressed_format for the case that we are faking the ETC support (Nanley Chery) - Removed the check for the auxiliary usage for the shadow miptree at creation (miptree_create of intel_mipmap_tree.c) as we won't use auxiliary buffers with these types of trees (Nanley Chery) - Set the etc_format of the non-ETC miptrees to MESA_FORMAT_NONE and removed the unnecessary checks (Nanley Chery) - Fixed an unrelated indentation change (Nanley Chery) - Modified the function intel_miptree_finish_write to set the mt->shadow_needs_update to true to catch all the cases when we need to update the miptree (Nanley Chery) - In order to update the shadow miptree during the unmap of the main and always map the main (Nanley Chery) the following change was necessary: Split the previous update function, which updated all the mipmap levels, into two functions: one that updates one level and one that updates all of them. Used the first during unmap and the second before the rendering. 
- Removed the BRW_MAP_ETC_BIT flag and the mechanism to decide which miptree should be mapped each time and reversed all the changes in the higher level texture functions that upload data to textures as they aren't needed anymore. - Replaced the boolean needs_fake_etc with an inline function that checks when we need to fake the ETC compression (Nanley Chery) - Removed the initialization of the strides in the update function as the values will be overwritten by the intel_miptree_map call (Nanley Chery) - Used minify instead of division in the new update function intel_miptree_update_etc_shadow_levels in intel_mipmap_tree.c (Nanley Chery) - Removed the depth from the calculation of the number of slices in the new update function (intel_miptree_update_etc_shadow_levels of intel_mipmap_tree.c) as we don't need to support 3D ETC images. (Nanley Chery) --- .../drivers/dri/i965/brw_wm_surface_state.c | 5 +- src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 133 -- src/mesa/drivers/dri/i965/intel_mipmap_tree.h | 22 +++ 3 files changed, 150 insertions(+), 10 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c index 618e2ab35bc..c2cf34aee71 100644 --- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c +++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c @@ -521,7 +521,7 @@ static void brw_update_texture_surface(struct gl_context *ctx, */ mesa_fmt = mt->format; } else if (mt->etc_format != MESA_FORMAT_NONE) { - mesa_fmt = mt->format; + mesa_fmt = mt->shadow_mt->format; } else if (plane > 0) { mesa_fmt = mt->format; } else { @@ -581,6 +581,9 @@ static void brw_update_texture_surface(struct gl_context *ctx, assert(mt->shadow_mt && !mt->shadow_needs_update); mt = mt->shadow_mt; format = ISL_FORMAT_R8_UINT; + } else if (intel_miptree_needs_fake_etc(brw, mt)) { + assert(mt->shadow_mt); + mt = mt->shadow_mt; } const int surf_index = surf_offset - >wm.base.surf_offset[0]; diff --git 
a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c index 0a25dfd0161..3ff36b84a5a 100644 --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c @@ -57,6 +57,11 @@ static void *intel_miptree_map_raw(struct brw_context *brw, GLbitfield mode); static void intel_miptree_unmap_raw(struct intel_mipmap_tree *mt); +static void intel_miptree_update_etc_shadow(struct brw_context *brw, +struct intel_mipmap_tree *mt, +unsigned int level, +unsigned int slice, +int
[Mesa-dev] [PATCH v2 2/5] i965: Removed assertions from intel_miptree_map_etc
The assertions that GL_MAP_WRITE_BIT and GL_MAP_INVALIDATE_RANGE_BIT are set in intel_miptree_map_etc will fail when the ETC miptree is mapped for reading. As we are about to fix the GetCompressed* functions in the following patches and allow reading from ETC miptrees, we have to remove them. Fixes the crash of the test KHR-GL45.direct_state_access.textures_compressed_subimage on Gen 7 GPUs. --- src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 3 --- 1 file changed, 3 deletions(-) diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c index 479188fd1c8..0a25dfd0161 100644 --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c @@ -3497,9 +3497,6 @@ intel_miptree_map_etc(struct brw_context *brw, assert(mt->format == MESA_FORMAT_R8G8B8X8_UNORM); } - assert(map->mode & GL_MAP_WRITE_BIT); - assert(map->mode & GL_MAP_INVALIDATE_RANGE_BIT); - intel_miptree_access_raw(brw, mt, level, slice, true); map->stride = _mesa_format_row_stride(mt->etc_format, map->w); -- 2.20.1
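To see why a read mapping trips the removed assertions, it helps to express them as a predicate over the access bits. This is a sketch with local stand-ins: the SKETCH_* constants are defined here for illustration (their values follow the GL map access bits), and sketch_old_asserts_pass is a hypothetical helper, not driver code.

```c
#include <assert.h>
#include <stdbool.h>

/* Local stand-ins for the GL map access bits used in the commit message. */
enum {
   SKETCH_MAP_READ_BIT             = 0x0001,
   SKETCH_MAP_WRITE_BIT            = 0x0002,
   SKETCH_MAP_INVALIDATE_RANGE_BIT = 0x0004,
   SKETCH_MAP_ALL_BITS             = 0x0007
};

/* Returns true when both removed assertions would have held for this
 * mapping mode, false when the map call would have aborted.
 */
static bool sketch_old_asserts_pass(unsigned mode)
{
   /* Only the three bits above are modelled in this sketch. */
   assert((mode & ~(unsigned)SKETCH_MAP_ALL_BITS) == 0);

   return (mode & SKETCH_MAP_WRITE_BIT) != 0 &&
          (mode & SKETCH_MAP_INVALIDATE_RANGE_BIT) != 0;
}
```

A read-only mapping, such as the one a GetCompressed* read-back would request, fails the predicate, which matches the crash described in the commit message.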
[Mesa-dev] [PATCH v2 5/5] i965: Enabled the OES_copy_image extension on Gen 7 GPUs
OES_copy_image extension was disabled on Gen7 due to the lack of support for ETC2 images. Enabled it back. (Kenneth Graunke) --- src/mesa/drivers/dri/i965/intel_extensions.c | 18 ++ 1 file changed, 14 insertions(+), 4 deletions(-) diff --git a/src/mesa/drivers/dri/i965/intel_extensions.c b/src/mesa/drivers/dri/i965/intel_extensions.c index 3a95be58a63..d2e232f3ff1 100644 --- a/src/mesa/drivers/dri/i965/intel_extensions.c +++ b/src/mesa/drivers/dri/i965/intel_extensions.c @@ -287,14 +287,24 @@ intelInitExtensions(struct gl_context *ctx) } if (devinfo->gen >= 8 || devinfo->is_baytrail) { - /* For now, we only enable OES_copy_image on platforms that support - * ETC2 natively in hardware. We would need more hacks to support it - * elsewhere. Same with OES_texture_view. + /* + * For now, we can't enable OES_texture_view on Gen 7 because of + * some piglit failures coming from + * piglit/tests/spec/arb_texture_view/rendering-formats.c that need + * investigation. */ - ctx->Extensions.OES_copy_image = true; ctx->Extensions.OES_texture_view = true; } + if (devinfo->gen >= 7) { + /* + * We can safely enable OES_copy_image on Gen 7, since we emulate + * the ETC2 support using the shadow_miptree to store the + * compressed data. + */ + ctx->Extensions.OES_copy_image = true; + } + if (devinfo->gen >= 8) { ctx->Extensions.ARB_gpu_shader_int64 = true; /* requires ARB_gpu_shader_int64 */ -- 2.20.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH v2 4/5] i965: Fixed the CopyImageSubData for ETC2 on Gen < 8
For CopyImageSubData to copy the data during the 1st draw call, we need
to update the shadow tree right before the rendering.
---
 src/mesa/drivers/dri/i965/brw_draw.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_draw.c b/src/mesa/drivers/dri/i965/brw_draw.c
index ec4fe0b096f..d00e0a726b1 100644
--- a/src/mesa/drivers/dri/i965/brw_draw.c
+++ b/src/mesa/drivers/dri/i965/brw_draw.c
@@ -559,6 +559,11 @@ brw_predraw_resolve_inputs(struct brw_context *brw, bool rendering,
           tex_obj->mt->format == MESA_FORMAT_S_UINT8) {
          intel_update_r8stencil(brw, tex_obj->mt);
       }
+
+      if (intel_miptree_has_etc_shadow(brw, tex_obj->mt) &&
+          tex_obj->mt->shadow_needs_update) {
+         intel_miptree_update_etc_shadow_levels(brw, tex_obj->mt);
+      }
    }
 
    /* Resolve color for each active shader image. */
-- 
2.20.1
[Mesa-dev] [PATCH v2 1/5] i965: Rename intel_mipmap_tree::r8stencil_* -> ::shadow_*
From: Nanley Chery Use more generic field names. We'll reuse these fields for a workaround with ASTC miptrees. Reviewed-by: Eleni Maria Stea --- src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 8 src/mesa/drivers/dri/i965/intel_mipmap_tree.c| 16 src/mesa/drivers/dri/i965/intel_mipmap_tree.h| 14 +++--- 3 files changed, 19 insertions(+), 19 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c index b067a174056..618e2ab35bc 100644 --- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c +++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c @@ -571,15 +571,15 @@ static void brw_update_texture_surface(struct gl_context *ctx, if (obj->StencilSampling && firstImage->_BaseFormat == GL_DEPTH_STENCIL) { if (devinfo->gen <= 7) { -assert(mt->r8stencil_mt && !mt->stencil_mt->r8stencil_needs_update); -mt = mt->r8stencil_mt; +assert(mt->shadow_mt && !mt->stencil_mt->shadow_needs_update); +mt = mt->shadow_mt; } else { mt = mt->stencil_mt; } format = ISL_FORMAT_R8_UINT; } else if (devinfo->gen <= 7 && mt->format == MESA_FORMAT_S_UINT8) { - assert(mt->r8stencil_mt && !mt->r8stencil_needs_update); - mt = mt->r8stencil_mt; + assert(mt->shadow_mt && !mt->shadow_needs_update); + mt = mt->shadow_mt; format = ISL_FORMAT_R8_UINT; } diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c index b4e3524aa51..479188fd1c8 100644 --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c @@ -1214,7 +1214,7 @@ intel_miptree_release(struct intel_mipmap_tree **mt) brw_bo_unreference((*mt)->bo); intel_miptree_release(&(*mt)->stencil_mt); - intel_miptree_release(&(*mt)->r8stencil_mt); + intel_miptree_release(&(*mt)->shadow_mt); intel_miptree_aux_buffer_free((*mt)->aux_buf); free_aux_state_map((*mt)->aux_state); @@ -2427,7 +2427,7 @@ intel_miptree_finish_write(struct brw_context *brw, switch (mt->aux_usage) { case ISL_AUX_USAGE_NONE: 
if (mt->format == MESA_FORMAT_S_UINT8 && devinfo->gen <= 7) - mt->r8stencil_needs_update = true; + mt->shadow_needs_update = true; break; case ISL_AUX_USAGE_MCS: @@ -2933,9 +2933,9 @@ intel_update_r8stencil(struct brw_context *brw, assert(src->surf.size_B > 0); - if (!mt->r8stencil_mt) { + if (!mt->shadow_mt) { assert(devinfo->gen > 6); /* Handle MIPTREE_LAYOUT_GEN6_HIZ_STENCIL */ - mt->r8stencil_mt = make_surface( + mt->shadow_mt = make_surface( brw, src->target, MESA_FORMAT_R_UINT8, @@ -2949,13 +2949,13 @@ intel_update_r8stencil(struct brw_context *brw, ISL_TILING_Y0_BIT, ISL_SURF_USAGE_TEXTURE_BIT, BO_ALLOC_BUSY, 0, NULL); - assert(mt->r8stencil_mt); + assert(mt->shadow_mt); } - if (src->r8stencil_needs_update == false) + if (src->shadow_needs_update == false) return; - struct intel_mipmap_tree *dst = mt->r8stencil_mt; + struct intel_mipmap_tree *dst = mt->shadow_mt; for (int level = src->first_level; level <= src->last_level; level++) { const unsigned depth = src->surf.dim == ISL_SURF_DIM_3D ? @@ -2975,7 +2975,7 @@ intel_update_r8stencil(struct brw_context *brw, } brw_cache_flush_for_read(brw, dst->bo); - src->r8stencil_needs_update = false; + src->shadow_needs_update = false; } static void * diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h index 17668944adc..1a7507023a1 100644 --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h @@ -294,16 +294,16 @@ struct intel_mipmap_tree struct intel_mipmap_tree *stencil_mt; /** -* \brief Stencil texturing miptree for sampling from a stencil texture +* \brief Shadow miptree for sampling when the main isn't supported by HW. * -* Some hardware doesn't support sampling from the stencil texture as -* required by the GL_ARB_stencil_texturing extenion. To workaround this we -* blit the texture into a new texture that can be sampled. 
+* To workaround various sampler bugs and limitations, we blit the main +* texture into a new texture that can be sampled. * -* \see intel_update_r8stencil() +* This miptree may be used for: +* - Stencil texturin
[Mesa-dev] [PATCH v2 0/5] improved the support for ETC2 formats on Gen 7
Intel Gen7 GPUs don't support the ETC2 formats natively, so in order to
display the pixels properly we convert them to RGBA and create RGBA
miptrees. The problem is that the GetCompressed* functions, which should
return the compressed pixel values, return RGBA instead.

These patches attempt to solve this problem by using 2 miptrees: the
main one stores the ETC values and the generic shadow (mt->shadow_mt)
stores the RGBA. Each time the main miptree is unmapped, we unpack the
ETC to RGBA and update the shadow. Similarly, we update all the mipmap
levels of the image (if necessary) before drawing, so that
CopyImageSubData works.

Also, the OES_copy_image extension, which couldn't work on Gen 7 due to
the lack of ETC support, is now re-enabled.

Eleni Maria Stea (4):
  i965: Removed assertions from intel_miptree_map_etc
  i965: Faking the ETC2 compression on Gen < 8 GPUs using two miptrees.
  i965: Fixed the CopyImageSubData for ETC2 on Gen < 8
  i965: Enabled the OES_copy_image extension on Gen 7 GPUs

Nanley Chery (1):
  i965: Rename intel_mipmap_tree::r8stencil_* -> ::shadow_*

 src/mesa/drivers/dri/i965/brw_draw.c          |   5 +
 .../drivers/dri/i965/brw_wm_surface_state.c   |  13 +-
 src/mesa/drivers/dri/i965/intel_extensions.c  |  18 ++-
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 152 +++---
 src/mesa/drivers/dri/i965/intel_mipmap_tree.h |  36 -
 5 files changed, 188 insertions(+), 36 deletions(-)

-- 
2.20.1
Re: [Mesa-dev] [PATCH 3/8] i965: Faking the ETC2 compression on Gen < 8 GPUs using two miptrees.
Hi Nanley, On Fri, 18 Jan 2019 15:32:02 -0800 Nanley Chery wrote: > > diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c > > b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c index > > e214fae140..4d1eafac91 100644 --- > > a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c +++ > > b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c @@ -329,6 [...] > > @@ -474,6 +485,11 @@ static void brw_update_texture_surface(struct > > gl_context *ctx, struct intel_texture_object *intel_obj = > > intel_texture_object(obj); struct intel_mipmap_tree *mt = > > intel_obj->mt; > > + if (mt->needs_fake_etc) { > > + assert(mt->shadow_mt); > > + mt = mt->shadow_mt; > > + } > > + > >if (plane > 0) { > > if (mt->plane[plane - 1] == NULL) > > return; > > @@ -512,7 +528,7 @@ static void brw_update_texture_surface(struct > > gl_context *ctx, > >* is safe because texture views aren't allowed on > > depth/stencil. */ > > mesa_fmt = mt->format; > > - } else if (mt->etc_format != MESA_FORMAT_NONE) { > > + } else if (intel_obj->mt->etc_format != MESA_FORMAT_NONE) { > > mesa_fmt = mt->format; > > For uniformity, lets access mt->shadow_mt->format here and move the > mt->needs_fake_etc check from above to below this condition: > > } else if (devinfo->gen <= 7 && mt->format == > MESA_FORMAT_S_UINT8) { I'd like to ask you one more question on this change: if I do the check for the fake etc later, the following code will run for the main miptree that contains the compressed data and has ETC2 format: > >if (plane > 0) { > > if (mt->plane[plane - 1] == NULL) > > return; > > @@ -512,7 +528,7 @@ static void brw_update_texture_surface(struct > > gl_context *ctx, > >* is safe because texture views aren't allowed on > > depth/stencil. */ > > mesa_fmt = mt->format; Wouldn't this be a problem? Thank you in advance, Eleni ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 3/8] i965: Faking the ETC2 compression on Gen < 8 GPUs using two miptrees.
On 1/22/19 9:25 PM, Nanley Chery wrote:
[...]
>
> The performance difference should be negligible if the function is
> declared static inline in the intel_mipmap_tree.h header. The compiler
> should include the body of function (which should be small) and avoid
> the overhead of a function call.
[...]
>
> Firstly, it's not information that's generally useful for most
> intel_mipmap_tree objects. Having too much of such state makes debugging
> and reading the struct definition more difficult.
>
> Secondly, it adds to the amount of state-dependent variables I have to
> keep in mind when looking at the code. I have to start asking, when is
> needs_fake_etc initialized? Is needs_fake_etc ever modified later? I'm
> already familiar with the other variables needs_fake_etc can be computed
> by: the gen, the miptree format, and the shadow_mt. I hope that helps.
>
> -Nanley
>

Ok, I understand, I am going to change the code to use an inline
function then.

Thank you very much,
Eleni
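For readers following the thread, the suggestion above (compute the
"fake ETC" condition from the gen, the miptree format, and the shadow_mt
instead of caching a bool) could look roughly like the sketch below. It
is only an illustration: the sketch_* types and names are hypothetical,
simplified stand-ins and not the driver's actual structures or the
helper that ended up in the series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the driver types (assumption). */
enum sketch_format {
   SKETCH_FORMAT_RGBA8,
   SKETCH_FORMAT_ETC2_RGB8,
};

struct sketch_miptree {
   enum sketch_format format;
   struct sketch_miptree *shadow_mt;
};

/*
 * Derives the answer from state that already exists, so there is no
 * extra struct field whose initialization and updates must be tracked.
 * Declared static inline so a header could carry it without call overhead.
 */
static inline bool
sketch_miptree_needs_fake_etc(int gen, const struct sketch_miptree *mt)
{
   assert(mt != NULL);
   return gen < 8 &&
          mt->format == SKETCH_FORMAT_ETC2_RGB8 &&
          mt->shadow_mt != NULL;
}
```

A caller would pass the device generation and the main miptree; a Gen 7
ETC2 miptree with a shadow attached yields true, while the same miptree
on Gen 8 yields false.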
Re: [Mesa-dev] [PATCH 3/8] i965: Faking the ETC2 compression on Gen < 8 GPUs using two miptrees.
On 1/19/19 1:32 AM, Nanley Chery wrote:
>> diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
>> b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
>> index e214fae140..4d1eafac91 100644
>> --- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
>> +++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
>> @@ -329,6 +329,17 @@ brw_get_texture_swizzle(const struct gl_context *ctx,
>>  {
>>     const struct gl_texture_image *img = t->Image[0][t->BaseLevel];
>>
>> +   struct brw_context *brw = brw_context((struct gl_context *)ctx);
>> +   const struct gen_device_info *devinfo = &brw->screen->devinfo;
>> +   bool is_fake_etc = _mesa_is_format_etc2(img->TexFormat) &&
>> +                      devinfo->gen < 8;
>> +
>> +   mesa_format format;
>> +   if (is_fake_etc)
>> +      format = intel_lower_compressed_format(brw, img->TexFormat);
>> +   else
>> +      format = img->TexFormat;
>> +
>
> Why is modifying this function necessary?

Hi,

I'll try to explain this modification. After the changes we made:

- the image TexFormat remains ETC2 to match the main miptree's format
- the main miptree stores the compressed data (ETC2) so that the
  GetCompressed* functions work
- the shadow miptree stores the RGBA data and we map it for the drawing

This texture swizzle function is called before the drawing and it can't
access the miptrees. Instead, it reads the format of the texture we are
supposed to have in memory directly from the gl_texture_image struct, so
in this case it reads the ETC2 format. At that point, the texture that
we have in memory and that is about to be used in the drawing is RGBA
(from the shadow miptree). We therefore end up calculating the swizzle
of the ETC2 format used in the original image (and the main miptree) for
the RGBA texture that we have in memory, and the texture is not rendered
properly.

The solution was to use the corresponding RGBA format when we fake the
ETC2, but as I couldn't read it from the shadow miptree inside this
function, I obtained it by calling intel_lower_compressed_format on the
original ETC2 format of the gl_texture_image.

I hope that this change is clearer now. I will add a comment explaining
it, just in case.

Thank you!
Eleni
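The format selection described in this reply can be illustrated with a
small stand-alone sketch. It assumes a toy format enum and a
hypothetical sketch_lower_compressed() that plays the role
intel_lower_compressed_format has in the quoted patch; none of the
sketch_* names exist in Mesa.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy format list standing in for mesa_format (assumption). */
enum sketch_format {
   SKETCH_FORMAT_R8,
   SKETCH_FORMAT_RGBA8,
   SKETCH_FORMAT_ETC2_RGBA8,
};

static bool
sketch_is_etc2(enum sketch_format f)
{
   return f == SKETCH_FORMAT_ETC2_RGBA8;
}

/* Hypothetical stand-in: maps a compressed format to what is stored
 * decompressed in memory; other formats pass through unchanged. */
static enum sketch_format
sketch_lower_compressed(enum sketch_format f)
{
   return sketch_is_etc2(f) ? SKETCH_FORMAT_RGBA8 : f;
}

/* Picks the format the swizzle should be computed for: the image still
 * reports ETC2, but on gen < 8 the sampled texture is really RGBA. */
static enum sketch_format
sketch_swizzle_format(int gen, enum sketch_format tex_format)
{
   assert(gen > 0);
   bool is_fake_etc = sketch_is_etc2(tex_format) && gen < 8;
   return is_fake_etc ? sketch_lower_compressed(tex_format) : tex_format;
}
```

The point of the sketch is only that the swizzle must follow the format
of the data that is actually sampled, not the format the image reports.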
Re: [Mesa-dev] [PATCH 3/8] i965: Faking the ETC2 compression on Gen < 8 GPUs using two miptrees.
On 1/22/19 12:46 PM, Eleni Maria Stea wrote: >>> + /** >>> +* \brief Indicates that we fake the ETC2 compression support >>> +* >>> +* GPUs Gen < 8 don't support sampling and rendering of ETC2 >>> formats so >>> +* we need to fake it. This variable is set to true when we >>> fake it. >>> +*/ >>> + bool needs_fake_etc; >>> + >> >> Let's make a function to detect needs_fake_etc instead of adding to >> the data structure. That'd be easier to follow. >> >> -Nanley > > > Hi Nanley, > > I'd like a small clarification here if you don't mind: I wasn't very > sure about this last change you suggest. > > The reasons I preferred to extend the data structure instead of adding > a function were: > > 1- that I need to check if we fake ETC in several different places in > which I don't always have access to the information that helped me > decide if we need to fake the ETC or not, so I found it much easier to > keep this information in the miptree that can be accessed from > everywhere. (That was the main reason). Actually, now I better thought of it, I only need the GPU version and if the format is compressed, so I can probably get this information in all places but we would still need to make many unnecessary calls... Couldn't we avoid them by just checking this once at the beginning? Thanks again, Eleni > The other reasons were that: > 2- I thought that it would be faster to check the miptree than call a > function. > 3- I was hoping that from the name of the variable it won't be > difficult to follow (but I could rename it to something better if you > prefer it). > > Could you explain me why you'd like me to replace it? Is there an > advantage I hadn't thought of? > > Thank you in advance, > Eleni > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 3/8] i965: Faking the ETC2 compression on Gen < 8 GPUs using two miptrees.
> > +   /**
> > +    * \brief Indicates that we fake the ETC2 compression support
> > +    *
> > +    * GPUs Gen < 8 don't support sampling and rendering of ETC2 formats so
> > +    * we need to fake it. This variable is set to true when we fake it.
> > +    */
> > +   bool needs_fake_etc;
> > +
>
> Let's make a function to detect needs_fake_etc instead of adding to
> the data structure. That'd be easier to follow.
>
> -Nanley

Hi Nanley,

I'd like a small clarification here, if you don't mind: I wasn't quite
sure about this last change you suggested.

The reasons I preferred to extend the data structure instead of adding
a function were:

1- I need to check whether we fake ETC in several different places,
where I don't always have access to the information that helped me
decide if we need to fake the ETC or not, so I found it much easier to
keep this information in the miptree, which can be accessed from
everywhere. (That was the main reason.)

The other reasons were that:
2- I thought that it would be faster to check the miptree than to call
a function.
3- I was hoping that the name of the variable would make it easy to
follow (but I could rename it to something better if you prefer).

Could you explain to me why you'd like me to replace it? Is there an
advantage I hadn't thought of?

Thank you in advance,
Eleni
Re: [Mesa-dev] [PATCH 2/8] i965: r8stencil_mt/needs_update renamed to shadow_mt/needs_update
On 1/19/19 12:55 AM, Nanley Chery wrote: > The series I pointed you to earlier has a patch like this, but it's more > complete. It also modifies the comment above the data structure being > modified. Do you want to review it? > > https://patchwork.freedesktop.org/patch/253197/ > > I think what people usually do in this case is send out their series > with the other person's patch included (and their rb tacked onto it). Hi Nanley, First of all, thank you for taking the time to look at the patches. I will review your patch and replace mine with it in the fixed series when I complete the other changes you suggested. Regards, Eleni ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 0/8] i965: improved the support for ETC2 formats on Gen 7
On Mon, 19 Nov 2018 10:54:04 +0200 Eleni Maria Stea wrote:
> Intel Gen7 GPUs don't have native support for ETC2 formats. We store
> the ETC2 images as RGBA in order to render them. This is a problem for
> GetCompressed* functions that should return compressed pixel values
> but return instead RGBA.
> [...]

Hi Nanley and Kenneth,

It's been a while since I sent these ETC2-related patches and I was
wondering if you could take a look when you have some time available.

I've also written a test to check the rendering of compressed cubemaps
(we already had tests for the Get functions and for compressed mipmaps,
so this case was the only one missing). The patch is here
(compressed-cubemap test):
https://patchwork.freedesktop.org/series/54880/

While working on the test I found an issue with TexImage2D and some
other compressed formats (like BPTC), and I wrote another test (included
in the same patch) that points it out (see the cover letter).

Another problem I hit while working on the cubemap test is described
here (I found it by accidentally calling glViewport with invalid
values):
https://bugs.freedesktop.org/show_bug.cgi?id=108999

I've sent a small patch for it, but so far there has been no reply:
https://patchwork.freedesktop.org/patch/267292/

I'd really appreciate it if you could take some time to look at these 3
issues.

Thank you very much in advance,
Eleni
[Mesa-dev] [PATCH] i965: fixed clamping in set_scissor_bits when the y is flipped
Calculating the scissor rectangle fields with the y flipped (0 on top)
can generate negative values that will cause assertion failure later on
as the scissor fields are all unsigned. We must clamp the bbox values
again to make sure they don't exceed the fb_height. Also fixed a
calculation error.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108999
---
 src/mesa/drivers/dri/i965/genX_state_upload.c | 15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/genX_state_upload.c b/src/mesa/drivers/dri/i965/genX_state_upload.c
index 8e3fcbf12e..5d8fc8214e 100644
--- a/src/mesa/drivers/dri/i965/genX_state_upload.c
+++ b/src/mesa/drivers/dri/i965/genX_state_upload.c
@@ -2424,8 +2424,21 @@ set_scissor_bits(const struct gl_context *ctx, int i,
       /* memory: Y=0=top */
       sc->ScissorRectangleXMin = bbox[0];
       sc->ScissorRectangleXMax = bbox[1] - 1;
+
+      /* Clamping to fb_height is necessary because otherwise the
+       * subtractions below would produce a negative result, which would
+       * then be assigned to the unsigned YMin/YMax scissor fields,
+       * resulting in an assertion failure in GENX(SCISSOR_RECT_pack)
+       */
+
+      if (bbox[3] > fb_height)
+         bbox[3] = fb_height;
+
+      if (bbox[2] > fb_height)
+         bbox[2] = fb_height;
+
       sc->ScissorRectangleYMin = fb_height - bbox[3];
-      sc->ScissorRectangleYMax = fb_height - bbox[2] - 1;
+      sc->ScissorRectangleYMax = fb_height - (bbox[2] - 1);
    }
 }
 
-- 
2.20.0.rc2
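The hazard this patch describes (a negative int landing in an unsigned
scissor field) can be shown in isolation. The sketch below covers only
the YMin computation and deliberately takes no position on the YMax
arithmetic changed by the patch; sketch_clamp and sketch_flipped_ymin
are hypothetical helpers, not functions from genX_state_upload.c.

```c
#include <assert.h>
#include <stdint.h>

/* Clamp v into the inclusive range [lo, hi]. */
static int
sketch_clamp(int v, int lo, int hi)
{
   return v < lo ? lo : (v > hi ? hi : v);
}

/*
 * bbox_ymax is the top edge in GL convention (Y=0 at the bottom).
 * When Y is flipped (Y=0 at the top), the minimum row becomes
 * fb_height - bbox_ymax. Without the clamp, a bbox_ymax larger than
 * fb_height makes that difference negative, and storing it in an
 * unsigned field wraps to a huge value.
 */
static uint32_t
sketch_flipped_ymin(int fb_height, int bbox_ymax)
{
   assert(fb_height >= 0);
   bbox_ymax = sketch_clamp(bbox_ymax, 0, fb_height);
   return (uint32_t)(fb_height - bbox_ymax);
}
```

With fb_height = 100, an in-range bbox_ymax of 40 maps to row 60, while
an out-of-range 150 is clamped first and maps to row 0 instead of
wrapping around.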
[Mesa-dev] [PATCH 7/8] i965: Added support for ETC2 texture arrays on Gen7
Modified the calculation of the number of slices in the intel_update_decompressed_shadow function to take the array length into account to support arrays. --- src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c index 4886bb2b96..0840b3b243 100644 --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c @@ -3965,6 +3965,8 @@ intel_update_decompressed_shadow(struct brw_context *brw, int level_w = img_w; int level_h = img_h; + int num_slices = img_d * smt->surf.logical_level0_px.array_len; + for (int level = smt->first_level; level <= smt->last_level; level++) { ptrdiff_t shadow_stride = _mesa_format_row_stride(smt->format, level_w); @@ -3972,7 +3974,7 @@ intel_update_decompressed_shadow(struct brw_context *brw, ptrdiff_t main_stride = _mesa_format_row_stride(mt->format, level_w); - for (unsigned int slice = 0; slice < img_d; slice++) { + for (unsigned int slice = 0; slice < num_slices; slice++) { GLbitfield mmode = GL_MAP_READ_BIT | BRW_MAP_DIRECT_BIT | BRW_MAP_ETC_BIT; GLbitfield smode = GL_MAP_WRITE_BIT | -- 2.19.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 8/8] i965: Enabled the OES_copy_image extension on Gen 7 GPUs
OES_copy_image extension was disabled on Gen7 due to the lack of support for ETC2 images. Enabled it back. --- src/mesa/drivers/dri/i965/intel_extensions.c | 18 ++ 1 file changed, 14 insertions(+), 4 deletions(-) diff --git a/src/mesa/drivers/dri/i965/intel_extensions.c b/src/mesa/drivers/dri/i965/intel_extensions.c index d7e02efb54..c3b3c1bd12 100644 --- a/src/mesa/drivers/dri/i965/intel_extensions.c +++ b/src/mesa/drivers/dri/i965/intel_extensions.c @@ -286,14 +286,24 @@ intelInitExtensions(struct gl_context *ctx) } if (devinfo->gen >= 8 || devinfo->is_baytrail) { - /* For now, we only enable OES_copy_image on platforms that support - * ETC2 natively in hardware. We would need more hacks to support it - * elsewhere. Same with OES_texture_view. + /* + * For now, we can't enable OES_texture_view on Gen 7 because of + * some piglit failures coming from + * piglit/tests/spec/arb_texture_view/rendering-formats.c that need + * investigation. */ - ctx->Extensions.OES_copy_image = true; ctx->Extensions.OES_texture_view = true; } + if (devinfo->gen >= 7) { + /* + * We can safely enable OES_copy_image on Gen 7, since we emulate + * the ETC2 support using the shadow_miptree to store the + * compressed data. + */ + ctx->Extensions.OES_copy_image = true; + } + if (devinfo->gen >= 8) { ctx->Extensions.ARB_gpu_shader_int64 = devinfo->has_64bit_types; /* requires ARB_gpu_shader_int64 */ -- 2.19.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 0/8] i965: improved the support for ETC2 formats on Gen 7
Intel Gen7 GPUs don't have native support for ETC2 formats. We store
the ETC2 images as RGBA in order to render them. This is a problem for
the GetCompressed* functions, which should return compressed pixel
values but return RGBA instead.

With these patches, we store the compressed image data in the main
image mipmap tree and we use a secondary mipmap tree to store the RGBA
values for the rendering. We perform a lazy update every time the main
miptree changes.

Fix: KHR-GL46.direct_state_access.textures_compressed_subimage

Eleni Maria Stea (8):
  i965: Removed assertions from intel_miptree_map_etc
  i965: r8stencil_mt/needs_update renamed to shadow_mt/needs_update
  i965: Faking the ETC2 compression on Gen < 8 GPUs using two miptrees.
  i965: Update the shadow miptree from the main to fake the ETC2 compression
  i965: Fixed the CopyImageSubData for ETC2 on Gen < 8
  i965: Added support for ETC2 mipmaps
  i965: Added support for ETC2 texture arrays on Gen7
  i965: Enabled the OES_copy_image extension on Gen 7 GPUs

 src/mesa/drivers/dri/i965/brw_draw.c          |   3 +
 .../drivers/dri/i965/brw_wm_surface_state.c   |  35 +++-
 src/mesa/drivers/dri/i965/intel_extensions.c  |  18 +-
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 168 +++---
 src/mesa/drivers/dri/i965/intel_mipmap_tree.h |  24 ++-
 src/mesa/drivers/dri/i965/intel_tex_image.c   |  45 -
 src/mesa/main/texstore.c                      |  92 +-
 src/mesa/main/texstore.h                      |   9 +
 8 files changed, 315 insertions(+), 79 deletions(-)

-- 
2.19.0
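As a rough illustration of the lazy update mentioned in this cover
letter, the sketch below marks the shadow as stale on every write to the
main data and refreshes it only once, right before it is needed. The
struct and the sketch_* functions are hypothetical, and the
"decompression" is a placeholder arithmetic step rather than a real
ETC2 unpack.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical miniature of a main miptree plus its shadow (assumption). */
struct sketch_miptree {
   int compressed;            /* stands in for the ETC2 data (main) */
   int decompressed;          /* stands in for the RGBA data (shadow) */
   bool shadow_needs_update;  /* set on every write to the main data */
   int decompress_count;      /* how many times the shadow was rebuilt */
};

/* Writing to the main data never touches the shadow; it only flags it. */
static void
sketch_write_main(struct sketch_miptree *mt, int data)
{
   assert(mt != NULL);
   mt->compressed = data;
   mt->shadow_needs_update = true;
}

/* Called before drawing: rebuild the shadow only if it is stale. */
static void
sketch_predraw(struct sketch_miptree *mt)
{
   assert(mt != NULL);
   if (!mt->shadow_needs_update)
      return;

   mt->decompressed = mt->compressed * 2; /* placeholder for the unpack */
   mt->decompress_count++;
   mt->shadow_needs_update = false;
}
```

Two writes followed by two draws trigger a single rebuild, which is the
benefit of deferring the update until the shadow is actually sampled.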
[Mesa-dev] [PATCH 4/8] i965: Update the shadow miptree from the main to fake the ETC2 compression
On GPUs gen < 8 that don't support ETC2 sampling/rendering we now fake the support using 2 mipmap trees: one (the main) that stores the compressed data for the Get* functions to work and one (the shadow) that stores the same data decompressed for the render/sampling to work. Added the intel_update_decompressed_shadow function to update the shadow tree with the decompressed data whenever the main miptree with the compressed is changing. --- .../drivers/dri/i965/brw_wm_surface_state.c | 1 + src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 70 ++- src/mesa/drivers/dri/i965/intel_mipmap_tree.h | 3 + 3 files changed, 71 insertions(+), 3 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c index 4d1eafac91..2e6d85e1fe 100644 --- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c +++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c @@ -579,6 +579,7 @@ static void brw_update_texture_surface(struct gl_context *ctx, if (obj->StencilSampling && firstImage->_BaseFormat == GL_DEPTH_STENCIL) { if (devinfo->gen <= 7) { +assert(!intel_obj->mt->needs_fake_etc); assert(mt->shadow_mt && !mt->stencil_mt->shadow_needs_update); mt = mt->shadow_mt; } else { diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c index b24332ff67..ef3e2c33d3 100644 --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c @@ -3740,12 +3740,15 @@ intel_miptree_map(struct brw_context *brw, assert(mt->surf.samples == 1); if (mt->needs_fake_etc) { - if (!(mode & BRW_MAP_ETC_BIT)) { + if (!(mode & BRW_MAP_ETC_BIT) && !(mode & GL_MAP_READ_BIT)) { assert(mt->shadow_mt); - mt->is_shadow_mapped = true; + if (mt->shadow_needs_update) { +intel_update_decompressed_shadow(brw, mt); +mt->shadow_needs_update = false; + } - mt->shadow_needs_update = false; + mt->is_shadow_mapped = true; mt = miptree->shadow_mt; } else { mt->is_shadow_mapped = false; @@ 
-3762,6 +3765,8 @@ intel_miptree_map(struct brw_context *brw, map = intel_miptree_attach_map(mt, level, slice, x, y, w, h, mode); if (!map){ + miptree->is_shadow_mapped = false; + *out_ptr = NULL; *out_stride = 0; return; @@ -3942,3 +3947,62 @@ intel_miptree_get_clear_color(const struct gen_device_info *devinfo, return mt->fast_clear_color; } } + +void +intel_update_decompressed_shadow(struct brw_context *brw, + struct intel_mipmap_tree *mt) +{ + struct intel_mipmap_tree *smt = mt->shadow_mt; + + assert(smt); + assert(mt->needs_fake_etc); + assert(mt->surf.size_B > 0); + + int img_w = smt->surf.logical_level0_px.width; + int img_h = smt->surf.logical_level0_px.height; + int img_d = smt->surf.logical_level0_px.depth; + + ptrdiff_t shadow_stride = _mesa_format_row_stride(smt->format, img_w); + + for (int level = smt->first_level; level <= smt->last_level; level++) { + struct compressed_pixelstore store; + _mesa_compute_compressed_pixelstore(mt->surf.dim, + mt->format, + img_w, img_h, img_d, + >ctx.Unpack, + ); + for (unsigned int slice = 0; slice < img_d; slice++) { + GLbitfield mmode = GL_MAP_READ_BIT | BRW_MAP_DIRECT_BIT | +BRW_MAP_ETC_BIT; + GLbitfield smode = GL_MAP_WRITE_BIT | +GL_MAP_INVALIDATE_RANGE_BIT | +BRW_MAP_DIRECT_BIT; + + uint32_t img_x, img_y; + intel_miptree_get_image_offset(smt, level, slice, _x, _y); + + void *mptr = intel_miptree_map_raw(brw, mt, mmode) + mt->offset ++ img_y * store.TotalBytesPerRow ++ img_x * store.TotalBytesPerRow / img_w; + + void *sptr; + intel_miptree_map(brw, smt, level, slice, img_x, img_y, img_w, img_h, + smode, , _stride); + + if (mt->format == MESA_FORMAT_ETC1_RGB8) { +_mesa_etc1_unpack_rgba(sptr, shadow_stride, + mptr, store.TotalBytesPerRow, + img_w, img_h); + } else { +_mesa_unpack_etc2_format(sptr, shadow_stride, + mptr, store.TotalBytesPerRow, + img_w, img_h, mt->format, true); + } + + intel_miptree_unmap_raw(mt); + intel_miptree_unmap(brw, smt, level, slice); + } + } + + mt->shadow_needs_update = false; +} diff 
--git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
[Mesa-dev] [PATCH 3/8] i965: Faking the ETC2 compression on Gen < 8 GPUs using two miptrees.
GPUs older than Gen 8 cannot render ETC2 formats. Until now, the driver
converted compressed EAC/ETC2 images to uncompressed RGB images that the
hardware can render. As a result, when the GetCompressed* functions were
called, the pixels were returned in the RGB format instead of the expected
compressed format.

To fix this, we use the shadow miptree to store the decompressed data for
rendering and the main miptree to store the compressed data. BRW_MAP_ETC_BIT
is used as a flag to indicate that compression is being faked, so that the
main tree is mapped with the compressed data. The functions that upload the
compressed data, as well as the mapping/unmapping functions, are updated to
use this flag.
---
 .../drivers/dri/i965/brw_wm_surface_state.c   | 26 +-
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 73 +--
 src/mesa/drivers/dri/i965/intel_mipmap_tree.h | 17
 src/mesa/drivers/dri/i965/intel_tex_image.c   | 45 -
 src/mesa/main/texstore.c                      | 92 +++
 src/mesa/main/texstore.h                      |  9 ++
 6 files changed, 204 insertions(+), 58 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
index e214fae140..4d1eafac91 100644
--- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
@@ -329,6 +329,17 @@ brw_get_texture_swizzle(const struct gl_context *ctx,
 {
    const struct gl_texture_image *img = t->Image[0][t->BaseLevel];

+   struct brw_context *brw = brw_context((struct gl_context *)ctx);
+   const struct gen_device_info *devinfo = &brw->screen->devinfo;
+   bool is_fake_etc = _mesa_is_format_etc2(img->TexFormat) &&
+                      devinfo->gen < 8;
+
+   mesa_format format;
+   if (is_fake_etc)
+      format = intel_lower_compressed_format(brw, img->TexFormat);
+   else
+      format = img->TexFormat;
+
    int swizzles[SWIZZLE_NIL + 1] = {
       SWIZZLE_X,
       SWIZZLE_Y,
@@ -381,7 +392,7 @@ brw_get_texture_swizzle(const struct gl_context *ctx,
       }
    }

-   GLenum datatype = _mesa_get_format_datatype(img->TexFormat);
+   GLenum datatype = _mesa_get_format_datatype(format);

    /* If the texture's format is alpha-only, force R, G, and B to
     * 0.0. Similarly, if the texture's format has no alpha channel,
@@ -422,9 +433,9 @@ brw_get_texture_swizzle(const struct gl_context *ctx,
    case GL_RED:
    case GL_RG:
    case GL_RGB:
-      if (_mesa_get_format_bits(img->TexFormat, GL_ALPHA_BITS) > 0 ||
-          img->TexFormat == MESA_FORMAT_RGB_DXT1 ||
-          img->TexFormat == MESA_FORMAT_SRGB_DXT1)
+      if (_mesa_get_format_bits(format, GL_ALPHA_BITS) > 0 ||
+          format == MESA_FORMAT_RGB_DXT1 ||
+          format == MESA_FORMAT_SRGB_DXT1)
          swizzles[3] = SWIZZLE_ONE;
       break;
    }
@@ -474,6 +485,11 @@ static void brw_update_texture_surface(struct gl_context *ctx,
    struct intel_texture_object *intel_obj = intel_texture_object(obj);
    struct intel_mipmap_tree *mt = intel_obj->mt;

+   if (mt->needs_fake_etc) {
+      assert(mt->shadow_mt);
+      mt = mt->shadow_mt;
+   }
+
    if (plane > 0) {
       if (mt->plane[plane - 1] == NULL)
          return;
@@ -512,7 +528,7 @@ static void brw_update_texture_surface(struct gl_context *ctx,
           * is safe because texture views aren't allowed on depth/stencil.
           */
          mesa_fmt = mt->format;
-      } else if (mt->etc_format != MESA_FORMAT_NONE) {
+      } else if (intel_obj->mt->etc_format != MESA_FORMAT_NONE) {
          mesa_fmt = mt->format;
       } else if (plane > 0) {
          mesa_fmt = mt->format;
diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index 0e67e4d8f3..b24332ff67 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -689,6 +689,8 @@ miptree_create(struct brw_context *brw,
    if (devinfo->gen < 6 && _mesa_is_format_color_format(format))
       tiling_flags &= ~ISL_TILING_Y0_BIT;

+   bool fakes_etc_compression = devinfo->gen < 8 && _mesa_is_format_etc2(format);
+
    mesa_format mt_fmt;
    if (_mesa_is_format_color_format(format)) {
       mt_fmt = intel_lower_compressed_format(brw, format);
@@ -700,18 +702,41 @@ miptree_create(struct brw_context *brw,
          intel_depth_format_for_depthstencil_format(format);
    }

+   mesa_format fmt = fakes_etc_compression ? format : mt_fmt;
    struct intel_mipmap_tree *mt =
-      make_surface(brw, target, mt_fmt, first_level, last_level,
+      make_surface(brw, target, fmt, first_level, last_level,
                    width0, height0, depth0, num_samples,
-                   tiling_flags, mt_surf_usage(mt_fmt),
+                   tiling_flags, mt_surf_usage(fmt),
                    alloc_flags, 0, NULL);

    if (mt == NULL)
       return NULL;

+   mt->needs_fake_etc =
[Mesa-dev] [PATCH 2/8] i965: r8stencil_mt/needs_update renamed to shadow_mt/needs_update
Renamed r8stencil_mt and r8stencil_needs_update to shadow_mt and
shadow_needs_update, respectively, so that shadow_mt can be reused as a
general-purpose secondary mipmap tree.
---
 src/mesa/drivers/dri/i965/brw_wm_surface_state.c |  8
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c    | 16
 src/mesa/drivers/dri/i965/intel_mipmap_tree.h    |  4 ++--
 3 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
index 8d21cf5fa7..e214fae140 100644
--- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
@@ -563,15 +563,15 @@ static void brw_update_texture_surface(struct gl_context *ctx,

       if (obj->StencilSampling && firstImage->_BaseFormat == GL_DEPTH_STENCIL) {
          if (devinfo->gen <= 7) {
-            assert(mt->r8stencil_mt && !mt->stencil_mt->r8stencil_needs_update);
-            mt = mt->r8stencil_mt;
+            assert(mt->shadow_mt && !mt->stencil_mt->shadow_needs_update);
+            mt = mt->shadow_mt;
          } else {
             mt = mt->stencil_mt;
          }
          format = ISL_FORMAT_R8_UINT;
       } else if (devinfo->gen <= 7 && mt->format == MESA_FORMAT_S_UINT8) {
-         assert(mt->r8stencil_mt && !mt->r8stencil_needs_update);
-         mt = mt->r8stencil_mt;
+         assert(mt->shadow_mt && !mt->shadow_needs_update);
+         mt = mt->shadow_mt;
          format = ISL_FORMAT_R8_UINT;
       }

diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index 5e11ec0c30..0e67e4d8f3 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -1216,7 +1216,7 @@ intel_miptree_release(struct intel_mipmap_tree **mt)

       brw_bo_unreference((*mt)->bo);
       intel_miptree_release(&(*mt)->stencil_mt);
-      intel_miptree_release(&(*mt)->r8stencil_mt);
+      intel_miptree_release(&(*mt)->shadow_mt);
       intel_miptree_aux_buffer_free((*mt)->aux_buf);
       free_aux_state_map((*mt)->aux_state);

@@ -2429,7 +2429,7 @@ intel_miptree_finish_write(struct brw_context *brw,
    switch (mt->aux_usage) {
    case ISL_AUX_USAGE_NONE:
       if (mt->format == MESA_FORMAT_S_UINT8 && devinfo->gen <= 7)
-         mt->r8stencil_needs_update = true;
+         mt->shadow_needs_update = true;
       break;

    case ISL_AUX_USAGE_MCS:
@@ -2935,9 +2935,9 @@ intel_update_r8stencil(struct brw_context *brw,

    assert(src->surf.size_B > 0);

-   if (!mt->r8stencil_mt) {
+   if (!mt->shadow_mt) {
       assert(devinfo->gen > 6); /* Handle MIPTREE_LAYOUT_GEN6_HIZ_STENCIL */
-      mt->r8stencil_mt = make_surface(
+      mt->shadow_mt = make_surface(
          brw,
          src->target,
          MESA_FORMAT_R_UINT8,
@@ -2951,13 +2951,13 @@ intel_update_r8stencil(struct brw_context *brw,
          ISL_TILING_Y0_BIT,
          ISL_SURF_USAGE_TEXTURE_BIT,
          BO_ALLOC_BUSY, 0, NULL);
-      assert(mt->r8stencil_mt);
+      assert(mt->shadow_mt);
    }

-   if (src->r8stencil_needs_update == false)
+   if (src->shadow_needs_update == false)
       return;

-   struct intel_mipmap_tree *dst = mt->r8stencil_mt;
+   struct intel_mipmap_tree *dst = mt->shadow_mt;

    for (int level = src->first_level; level <= src->last_level; level++) {
       const unsigned depth = src->surf.dim == ISL_SURF_DIM_3D ?
@@ -2977,7 +2977,7 @@ intel_update_r8stencil(struct brw_context *brw,
    }

    brw_cache_flush_for_read(brw, dst->bo);
-   src->r8stencil_needs_update = false;
+   src->shadow_needs_update = false;
 }

 static void *
diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
index b0333655ad..b955a2bab1 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
@@ -302,8 +302,8 @@ struct intel_mipmap_tree
     *
     * \see intel_update_r8stencil()
     */
-   struct intel_mipmap_tree *r8stencil_mt;
-   bool r8stencil_needs_update;
+   struct intel_mipmap_tree *shadow_mt;
+   bool shadow_needs_update;

    /**
     * \brief CCS, MCS, or HiZ auxiliary buffer.
--
2.19.0
[Mesa-dev] [PATCH 1/8] i965: Removed assertions from intel_miptree_map_etc
The assertions on GL_MAP_WRITE_BIT and GL_MAP_INVALIDATE_RANGE_BIT in
intel_miptree_map_etc should be removed, since they fail when the ETC
miptree is mapped for reading.

Fixes: KHR-GL45.direct_state_access.textures_compressed_subimage crash on
Gen 7 GPUs.
---
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index 8e50aabb3b..5e11ec0c30 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -3444,9 +3444,6 @@ intel_miptree_map_etc(struct brw_context *brw,
       assert(mt->format == MESA_FORMAT_R8G8B8X8_UNORM);
    }

-   assert(map->mode & GL_MAP_WRITE_BIT);
-   assert(map->mode & GL_MAP_INVALIDATE_RANGE_BIT);
-
    intel_miptree_access_raw(brw, mt, level, slice, true);

    map->stride = _mesa_format_row_stride(mt->etc_format, map->w);
--
2.19.0
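To make the failure mode concrete, here is a minimal standalone sketch; it is not code from the patch. The bit values are the standard OpenGL ones, and old_asserts_hold is a hypothetical helper that mirrors the two removed assertions.

```c
#include <assert.h>
#include <stdbool.h>

/* Standard OpenGL map access bits (values as in the GL headers). */
#define GL_MAP_READ_BIT             0x0001
#define GL_MAP_WRITE_BIT            0x0002
#define GL_MAP_INVALIDATE_RANGE_BIT 0x0004

/* Hypothetical helper: true only when both removed assertions would pass. */
static bool
old_asserts_hold(unsigned mode)
{
   return (mode & GL_MAP_WRITE_BIT) && (mode & GL_MAP_INVALIDATE_RANGE_BIT);
}
```

A read-only mapping, such as one requested on behalf of a GetCompressed* call, carries only GL_MAP_READ_BIT, so old_asserts_hold(GL_MAP_READ_BIT) is false and the old assertions would abort a debug build.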
[Mesa-dev] [PATCH 5/8] i965: Fixed the CopyImageSubData for ETC2 on Gen < 8
CopyImageSubData did not work for the first draw call because
intel_update_decompressed_shadow was called during rendering. Moved the
call to intel_update_decompressed_shadow into brw_predraw_resolve_inputs
to fix this problem.
---
 src/mesa/drivers/dri/i965/brw_draw.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_draw.c b/src/mesa/drivers/dri/i965/brw_draw.c
index 8536c04010..b331561f36 100644
--- a/src/mesa/drivers/dri/i965/brw_draw.c
+++ b/src/mesa/drivers/dri/i965/brw_draw.c
@@ -559,6 +559,9 @@ brw_predraw_resolve_inputs(struct brw_context *brw, bool rendering,
           tex_obj->mt->format == MESA_FORMAT_S_UINT8) {
          intel_update_r8stencil(brw, tex_obj->mt);
       }
+
+      if (tex_obj->mt->needs_fake_etc && tex_obj->mt->shadow_needs_update)
+         intel_update_decompressed_shadow(brw, tex_obj->mt);
    }

    /* Resolve color for each active shader image. */
--
2.19.0
[Mesa-dev] [PATCH 6/8] i965: Added support for ETC2 mipmaps
Extended intel_update_decompressed_shadow to update all the mipmap tree
levels, so that we can display mipmaps and run the Get functions on them.
---
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 48 +++
 1 file changed, 29 insertions(+), 19 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index ef3e2c33d3..4886bb2b96 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -3962,15 +3962,16 @@ intel_update_decompressed_shadow(struct brw_context *brw,
    int img_h = smt->surf.logical_level0_px.height;
    int img_d = smt->surf.logical_level0_px.depth;

-   ptrdiff_t shadow_stride = _mesa_format_row_stride(smt->format, img_w);
+   int level_w = img_w;
+   int level_h = img_h;

    for (int level = smt->first_level; level <= smt->last_level; level++) {
-      struct compressed_pixelstore store;
-      _mesa_compute_compressed_pixelstore(mt->surf.dim,
-                                          mt->format,
-                                          img_w, img_h, img_d,
-                                          &brw->ctx.Unpack,
-                                          &store);
+      ptrdiff_t shadow_stride = _mesa_format_row_stride(smt->format,
+                                                        level_w);
+
+      ptrdiff_t main_stride = _mesa_format_row_stride(mt->format,
+                                                      level_w);
+
       for (unsigned int slice = 0; slice < img_d; slice++) {
          GLbitfield mmode = GL_MAP_READ_BIT | BRW_MAP_DIRECT_BIT |
                             BRW_MAP_ETC_BIT;
@@ -3978,30 +3979,39 @@ intel_update_decompressed_shadow(struct brw_context *brw,
                             GL_MAP_INVALIDATE_RANGE_BIT |
                             BRW_MAP_DIRECT_BIT;

-         uint32_t img_x, img_y;
-         intel_miptree_get_image_offset(smt, level, slice, &img_x, &img_y);
+         uint32_t slevel_x, slevel_y;
+         intel_miptree_get_image_offset(smt, level, slice, &slevel_x,
+                                        &slevel_y);
+
+         uint32_t mlevel_x, mlevel_y;
+         intel_miptree_get_image_offset(mt, level, slice, &mlevel_x,
+                                        &mlevel_y);
+
+         void *mptr;
+         intel_miptree_map(brw, mt, level, slice, 0, 0,
+                           level_w, level_h, mmode, &mptr, &main_stride);

-         void *mptr = intel_miptree_map_raw(brw, mt, mmode) + mt->offset
-                    + img_y * store.TotalBytesPerRow
-                    + img_x * store.TotalBytesPerRow / img_w;
          void *sptr;
-         intel_miptree_map(brw, smt, level, slice, img_x, img_y, img_w, img_h,
-                           smode, &sptr, &shadow_stride);
+         intel_miptree_map(brw, smt, level, slice, 0, 0, level_w,
+                           level_h, smode, &sptr, &shadow_stride);

          if (mt->format == MESA_FORMAT_ETC1_RGB8) {
             _mesa_etc1_unpack_rgba(sptr, shadow_stride,
-                                   mptr, store.TotalBytesPerRow,
-                                   img_w, img_h);
+                                   mptr, main_stride,
+                                   level_w, level_h);
          } else {
             _mesa_unpack_etc2_format(sptr, shadow_stride,
-                                     mptr, store.TotalBytesPerRow,
-                                     img_w, img_h, mt->format, true);
+                                     mptr, main_stride,
+                                     level_w, level_h, mt->format, true);
          }

-         intel_miptree_unmap_raw(mt);
+         intel_miptree_unmap(brw, mt, level, slice);
          intel_miptree_unmap(brw, smt, level, slice);
       }
+
+      level_w /= 2;
+      level_h /= 2;
    }

    mt->shadow_needs_update = false;
--
2.19.0
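A note on the per-level arithmetic in this patch: level_w and level_h are halved with integer division after each level. The usual GL minification rule is max(1, size >> level), so a non-square texture such as 8x2 still has height 1 at level 2, while repeated halving gives 0 there. It may be worth checking whether the loop needs a clamp. The sketch below is not from the patch: minify_dim and compressed_row_stride are hypothetical helper names, and the block sizes (4x4 texel blocks, 8 bytes for ETC1/ETC2 RGB8 and 16 bytes for ETC2 RGBA8 EAC) are the standard ones for those formats.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: size of one dimension at a given mip level.
 * A level never shrinks below one texel. level is assumed to be
 * smaller than the bit width of unsigned. */
static unsigned
minify_dim(unsigned base, unsigned level)
{
   unsigned v = base >> level;
   return v > 0 ? v : 1;
}

/* Hypothetical helper: bytes in one row of 4x4 blocks.
 * block_bytes is 8 for ETC1/ETC2 RGB8 and 16 for ETC2 RGBA8 EAC. */
static size_t
compressed_row_stride(unsigned width, unsigned block_bytes)
{
   /* Partial blocks at the right edge still occupy a full block. */
   return (size_t)((width + 3) / 4) * block_bytes;
}
```

With these helpers, the per-level extent of an 8x2 texture at level 2 would be minify_dim(8, 2) by minify_dim(2, 2), and the compressed row stride for that level would still cover one full block.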
Re: [Mesa-dev] [PATCH v5] i965: Fix ETC2/EAC GetCompressed* functions on Gen7 GPUs
On 07/10/2018 03:10 AM, Nanley Chery wrote:
> On Thu, Jun 14, 2018 at 10:50:57PM +0300, Eleni Maria Stea wrote:
>> On 06/14/2018 10:27 PM, Nanley Chery wrote:
>>
>>> +Jason, Ken
>>>
>>> Hello,
>>>
>>> I recently did some miptree work relating to the r8stencil_mt and I
>>> think I now have a more informed opinion about how things should be
>>> structured. I'd like to propose an alternative solution.
>>>
>>> I had initially thought we should have a separate miptree to hold the
>>> compressed data, like this patch does, but now I think we should
>>> actually have the compressed data be the main miptree and to store the
>>> decompressed miptree as part of the main one. The reasoning is that we
>>> could reuse this structure to handle the r8stencil workaround and to
>>> eventually handle the ASTC_LDR surfaces that are modified on gen9.
>>>
>>> I'm proposing something like the following:
>>>
>>> 1. Rename r8stencil_mt -> shadow_mt and
>>>    r8stencil_needs_update -> shadow_needs_update.
>>> 2. Make shadow_mt hold the decompressed ETC miptree
>>> 3. Update shadow_needs_update whenever the main mt is modified
>>> 4. Add an function to update the shadow_mt using the main mt as a source
>>> 5. Sample from the shadow_mt as appropriate
>>> 6. Make the main miptree hold the compressed data
>>>
>>> This method should also be able to handle the CopyImage functions. What
>>> do you all think?
>>>
>>> -Nanley
>>
>> Hi Nanley,
>>
>> Thank you for your reply. I wasn't aware that there are other cases we
>> might need to store a 2nd image. I agree that it's more reasonable to
>> use one generic purpose miptree that can be accessible from different
>> parts of the i965 code for such cases instead of storing miptrees in
>> different places for different hacks when a feature is not supported.
>>
>> I will search your patch to get a look and I will also get a look at the
>> mesa code to see how easy this fix would be (which parts of the code it
>> might affect) and if everyone agrees that this is a good idea I will
>> modify this patch according to your suggestions.
>>
>> BR :)
>> Eleni
>
> Hi Eleni,
>
> I gave this more thought and am now thinking that what you have here is
> fine. Having two different ways of working with a shadow miptree
> suggests a refactor later on, but IMO this is ultimately a step in the
> right direction. Sorry for the noise.
>
> With code-sharing among shadow miptrees in mind, my two main
> suggestions are 1) to perform mapping operations only with the cmt (if
> it's present) and 2) to update the decompressed mt, on demand. Maybe
> with intel_miptree_copy_slice_sw?
>
> Regards,
> Nanley
>

Hi Nanley,

I talked to you on IRC but I reply here as well:

Thank you for the suggestions, I had misunderstood something from our IRC
conversation that followed this e-mail, so the patch v6 has several
issues. I will send a new one soon and I will implement the solution you
suggested earlier (suggestions 1-6) instead.

Sorry for the noise with the patch v6.

Thanks,
Eleni
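The six-step proposal quoted above can be sketched in isolation as follows. This is an illustration only, under stated assumptions: struct sketch_miptree and the sketch_* functions are hypothetical stand-ins and not i965 code, a single int stands in for the texel storage, and the copy in sketch_update_shadow stands in for the real decompression.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-in for intel_mipmap_tree. */
struct sketch_miptree {
   int data;                          /* stands in for the texel storage */
   struct sketch_miptree *shadow_mt;  /* step 2: decompressed copy */
   bool shadow_needs_update;          /* step 3: set on every write */
};

/* Step 3: any write to the main tree marks the shadow as stale. */
static void
sketch_write(struct sketch_miptree *mt, int value)
{
   mt->data = value;
   if (mt->shadow_mt)
      mt->shadow_needs_update = true;
}

/* Step 4: refresh the shadow from the main tree, only when stale. */
static void
sketch_update_shadow(struct sketch_miptree *mt)
{
   if (!mt->shadow_mt || !mt->shadow_needs_update)
      return;
   mt->shadow_mt->data = mt->data;  /* real code would decompress here */
   mt->shadow_needs_update = false;
}

/* Step 5: sampling reads from the shadow when one exists. */
static int
sketch_sample(struct sketch_miptree *mt)
{
   sketch_update_shadow(mt);
   return mt->shadow_mt ? mt->shadow_mt->data : mt->data;
}
```

The point of the flag is that writes stay cheap and the decompression cost is paid once, at the next read through the shadow, rather than on every upload.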
[Mesa-dev] [PATCH v6] i965: Fix ETC2/EAC GetCompressed* functions on Gen7 GPUs
Gen 7 GPUs store compressed EAC/ETC2 images in other, non-compressed
formats that they can render. When the GetCompressed* functions are
called, the pixels are returned in the non-compressed format that is used
for rendering.

With this patch we store both the compressed and non-compressed versions
of the image, so that both rendering commands and GetCompressed* commands
work.

Also, the assertions for GL_MAP_WRITE_BIT and GL_MAP_INVALIDATE_RANGE_BIT
in the intel_miptree_map_etc function have been removed, because when the
miptree is mapped for reading (for example from a GetCompressed* function)
GL_MAP_WRITE_BIT won't be set (and shouldn't be set).

Fixes: the following test in CTS for gen7:
KHR-GL45.direct_state_access.textures_compressed_subimage test

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104272
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=81843

v2: fixes issues:
    a) initialized uninitialized variables (Juan A. Suarez, Andres Gomez)
    b) fixed race condition where mt and cmt were mapped at the same time
    c) fixed indentation issues (Andres Gomez)
v3: adds bugzilla bug with id: 104272
v4: adds bugzilla bug with id: 81843
v5: replaced the flags with a bitfield, refactoring (Kenneth Graunke)
v6: renamed the r8stencil_mt secondary miptree that is now part of the
    intel_miptree_struct to shadow_mt and used it to store the compressed
    miptree (Nanley Chery)
---
 .../drivers/dri/i965/brw_wm_surface_state.c   |  8 +-
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 27 +++---
 src/mesa/drivers/dri/i965/intel_mipmap_tree.h | 14 ++-
 src/mesa/drivers/dri/i965/intel_tex.c         | 90 ++-
 src/mesa/drivers/dri/i965/intel_tex_image.c   | 46 +-
 src/mesa/main/texstore.c                      | 62 -
 src/mesa/main/texstore.h                      |  8 ++
 7 files changed, 209 insertions(+), 46 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
index 9397b637c7..2097fabaeb 100644
--- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
@@ -563,15 +563,15 @@ static void brw_update_texture_surface(struct gl_context *ctx,

       if (obj->StencilSampling && firstImage->_BaseFormat == GL_DEPTH_STENCIL) {
          if (devinfo->gen <= 7) {
-            assert(mt->r8stencil_mt && !mt->stencil_mt->r8stencil_needs_update);
-            mt = mt->r8stencil_mt;
+            assert(mt->shadow_mt && !mt->stencil_mt->shadow_needs_update);
+            mt = mt->shadow_mt;
          } else {
             mt = mt->stencil_mt;
          }
          format = ISL_FORMAT_R8_UINT;
       } else if (devinfo->gen <= 7 && mt->format == MESA_FORMAT_S_UINT8) {
-         assert(mt->r8stencil_mt && !mt->r8stencil_needs_update);
-         mt = mt->r8stencil_mt;
+         assert(mt->shadow_mt && !mt->shadow_needs_update);
+         mt = mt->shadow_mt;
          format = ISL_FORMAT_R8_UINT;
       }

diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index 7b1f0896ae..6d07fede52 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -719,8 +719,12 @@ miptree_create(struct brw_context *brw,
       }
    }

-   mt->etc_format = (_mesa_is_format_color_format(format) && mt_fmt != format) ?
-                    format : MESA_FORMAT_NONE;
+   if (!(flags & MIPTREE_CREATE_ETC)) {
+      mt->etc_format = (_mesa_is_format_color_format(format) &&
+                        mt_fmt != format) ? format : MESA_FORMAT_NONE;
+   } else {
+      mt->etc_format = MESA_FORMAT_NONE;
+   }

    if (!(flags & MIPTREE_CREATE_NO_AUX))
       intel_miptree_choose_aux_usage(brw, mt);
@@ -1214,7 +1218,7 @@ intel_miptree_release(struct intel_mipmap_tree **mt)

       brw_bo_unreference((*mt)->bo);
       intel_miptree_release(&(*mt)->stencil_mt);
-      intel_miptree_release(&(*mt)->r8stencil_mt);
+      intel_miptree_release(&(*mt)->shadow_mt);
       intel_miptree_aux_buffer_free((*mt)->aux_buf);
       free_aux_state_map((*mt)->aux_state);

@@ -2426,7 +2430,7 @@ intel_miptree_finish_write(struct brw_context *brw,
    switch (mt->aux_usage) {
    case ISL_AUX_USAGE_NONE:
       if (mt->format == MESA_FORMAT_S_UINT8 && devinfo->gen <= 7)
-         mt->r8stencil_needs_update = true;
+         mt->shadow_needs_update = true;
       break;

    case ISL_AUX_USAGE_MCS:
@@ -2919,9 +2923,9 @@ intel_update_r8stencil(struct brw_context *brw,

    assert(src->surf.size > 0);

-   if (!mt->r8stencil_mt) {
+   if (!mt->shadow_mt) {
       assert(devinfo->gen > 6); /* Handle MIPTREE_LAYOUT_GEN6_HIZ_STENCIL */
-      mt->r8stencil_mt = make_surface(
+      mt->shadow_mt = make_surface(
          brw,
          src->target,
          MESA_FORMAT_R_UINT8,
@@ -2935,13 +2939,13 @@ intel_update_r8stencil(struct