On Wed, 2018-06-27 at 09:13 -0700, Jason Ekstrand wrote: > On Wed, Jun 27, 2018 at 2:25 AM, Iago Toral <ito...@igalia.com> > wrote: > > On Tue, 2018-06-26 at 10:59 -0700, Jason Ekstrand wrote: > > > On Tue, Jun 26, 2018 at 4:08 AM, Iago Toral Quiroga <itoral@igali > > > a.com> wrote: > > > > Storage images require to patch push constant stateto work, > > > > which happens during > > > > > > > > binding table emision. In the scenario where our pipeline and > > > > descriptors are > > > > > > > > not dirty, we don't re-emit the binding table, however, if our > > > > push constant > > > > > > > > state is dirty, we will re-emit the push constant state, > > > > trashing storage > > > > > > > > image setup. > > > > > > > > > > > > > > > > While that scenario is probably not very likely to happen in > > > > practice, there > > > > > > > > are some CTS tests that trigger this by clearing storage images > > > > and buffers > > > > > > > > and dispatching a compute shader in a loop. The clearing of the > > > > images > > > > > > > > and buffers will trigger a blorp execution which will dirty our > > > > push constant > > > > > > > > state, however, because we don't alter the descriptors or the > > > > compute dispatch > > > > > > > > at all in the loop (we are basically execution the same program > > > > in a loop), > > > > > > > > our pipeline and descriptor state is not dirty. If the shader > > > > uses a storage > > > > > > > > image, then any iteration after the first will re-emit push > > > > constant state > > > > > > > > without re-emitting binding tables and the storage image will > > > > not be properly > > > > > > > > setup any more. > > > > > > I don't see why that is a problem. The only thing > > > flush_descriptor_sets does is fill out the binding and sampler > > > tables and fill in the push constant data for storage > > > images/buffers. The actual HW packets are filled out by > > > flush_push_constants and emit_descriptor_pointers. Yes, blorp > > > trashes our descriptor pointers but the descriptor sets should be > > > fine. For push constants, it does emit 3DSTATE_CONSTANT_* but it > > > doesn't actually modify anv_cmd_state::push_constants. > > > > > > Are secondary command buffers involved? I could see something > > > funny going on with those. > > > > No, no secondaries are involved. I did some more investigation and > > I think my explanation of the problem was not good, this is what is > > really happening: > > > > First, I found the problem in the compute pipeline and I only > > extended the fix to the graphics pipeline because it looked like > > the same rationale would apply, so I'll explain what happens in > > compute and then we can discuss whether the same problem applies to > > graphics. > > > > The test does something like this: > > > > for (...) { > > clear ssbos / storage images > > dispatch compute > > } > > > > The first iteration of this loop will find that the compute > > pipeline and descriptors are dirty and proceed to emit binding > > tables. We have storage images, so during that process the push > > constant buffer is amended to include storage images. Specifically, > > we call anv_cmd_buffer_ensure_push_constants_size() for the images > > field. This gives us a size of 624. > > > > We move on to the second iteration of the loop. When we clear > > images and ssbos via blorp, we again mark the push constant buffer > > as dirty. Now we execute the compute dispatch and the first thing > > we do there is anv_cmd_buffer_push_base_group_id() which calls > > anv_cmd_buffer_ensure_push_constants_size() for the base group id, > > which gives as a size of 144. This is smaller than what we computed > > in the previous iteration, because we haven't called the same > > function for the images field yet. Unfortunately, we will never > > call that again, because we only do that during binding table > > emission and we only do that if the compute pipeline is dirty (it > > is not) or our descriptors are dirty (they are not). So we don't > > re-emit binding table and we don't ensure push constant space for > > the image data, but because we come from a blorp execution our push > > constant dirty flag is true, so we re-emit push constant data, only > > that this time we won't emit the push constant data we need for the > > storage images, which leads to the problem. > > The intention has always been that > anv_cmd_buffer_ensure_push_constants_size would only ever grow the > push constants and never shrink them. The most obvious bug is in > anv_cmd_buffer_ensure_push_constants_size. > > > I thought that maybe making > > anv_cmd_buffer_ensure_push_constants_size() only update the size if > > we alloc or realloc would fix this, but that can cause GPU hangs in > > some cases when I run multiple tests in parallel, so I guess it > > isn't that simple. > > > > Ugh... that makes things more interesting. That does look like the > right fix and now I'm wondering why it leads to a hang. > > In the compute case, flush_compute_descriptor_sets emits > MEDIA_INTERFACE_DESCRIPTOR_LOAD. My feeling is that not emitting > that packet is the real bug. In GL, we just re-emit all 4 compute > packets all the time and don't try to track dirty bits. I had > patches to do that in Vulkan somewhere. I rebased them and pushed > them here: > > https://gitlab.freedesktop.org/jekstrand/mesa/tree/wip/anv-cs-packets > > Is that plus your patch to fix ensure_push_constants_size() enough to > fix the bug? If so, then I think what's going on is that we have to > re-emit all the compute packets after a switch from 3D to compute and > we're not doing that.
No, with this we still run into a GPU hang some times and a bunch of tests keep failing consistently. The annoying part is that all this only happens when we run the tests in group (i.e. deqp-vk -n dEQP- VK.path.to.tests.*). If I run any of the failing tests alone (with our without your patches, but with the patch for ensure_push_constants_size()), they always pass. I'll see if I can find out more about what is happening. Iago > --Jason > > > I hope I am making more sense now. > > > > Iago > > > --Jason > > > > > > > Fixes multiple failures in some new CTS tests. > > > > > > > > --- > > > > > > > > src/intel/vulkan/genX_cmd_buffer.c | 9 ++++++++- > > > > > > > > 1 file changed, 8 insertions(+), 1 deletion(-) > > > > > > > > > > > > > > > > diff --git a/src/intel/vulkan/genX_cmd_buffer.c > > > > b/src/intel/vulkan/genX_cmd_buffer.c > > > > > > > > index 97b321ccaeb..6e48aaedb9b 100644 > > > > > > > > --- a/src/intel/vulkan/genX_cmd_buffer.c > > > > > > > > +++ b/src/intel/vulkan/genX_cmd_buffer.c > > > > > > > > @@ -2554,7 +2554,8 @@ genX(cmd_buffer_flush_state)(struct > > > > anv_cmd_buffer *cmd_buffer) > > > > > > > > * 3DSTATE_BINDING_TABLE_POINTER_* for the push constants > > > > to take effect. > > > > > > > > */ > > > > > > > > uint32_t dirty = 0; > > > > > > > > - if (cmd_buffer->state.descriptors_dirty) > > > > > > > > + if (cmd_buffer->state.descriptors_dirty || > > > > > > > > + cmd_buffer->state.push_constants_dirty) > > > > > > > > dirty = flush_descriptor_sets(cmd_buffer); > > > > > > > > > > > > > > > > if (dirty || cmd_buffer->state.push_constants_dirty) { > > > > > > > > @@ -2988,7 +2989,13 @@ > > > > genX(cmd_buffer_flush_compute_state)(struct anv_cmd_buffer > > > > *cmd_buffer) > > > > > > > > anv_batch_emit_batch(&cmd_buffer->batch, &pipeline- > > > > >batch); > > > > > > > > } > > > > > > > > > > > > > > > > + /* Storage images require push constant data, which is > > > > setup during the > > > > > > > > + * binding table emision. If we have dirty push constants, > > > > we need to > > > > > > > > + * re-emit the binding table so we get the push constant > > > > storage image setup > > > > > > > > + * done, otherwise we trash it when we emit push constants > > > > below. > > > > > > > > + */ > > > > > > > > if ((cmd_buffer->state.descriptors_dirty & > > > > VK_SHADER_STAGE_COMPUTE_BIT) || > > > > > > > > + (cmd_buffer->state.push_constants_dirty & > > > > VK_SHADER_STAGE_COMPUTE_BIT) || > > > > > > > > cmd_buffer->state.compute.pipeline_dirty) { > > > > > > > > /* FIXME: figure out descriptors for gen7 */ > > > > > > > > result = flush_compute_descriptor_set(cmd_buffer); > > > > > > > > -- > > > > > > > > 2.14.1 > > > > > > > > > > > >
_______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev