On Tue, Dec 05, 2017 at 10:26:33AM +0000, Rogovin, Kevin wrote: > Hi, > > > >> Here are my comments of the patch posted: > >> > >> 1. it is essentially replication and moving around of the code of the > >> patch series posted earlier but missing various > >> important bits: preventing the sampler from using the auxiliary > >> buffer (this requires to modify surface state > >> sent in brw_wm_surfaces.c). It also does not cover blorp > >> sufficiently (blorp might read from an ASTC5x5 > >> and there are more paths in blorp than blorp_surf_for_miptree() that > >> sample from surfaces. > >> > > >Can you explain both more in detail? Resolves done in > >brw_predraw_resolve_inputs() mean that there is nothing interesting in the > >aux buffers and surface setup won't therefore enable auxiliary for texture > >surfaces. > > That there is nothing interesting is irrelevant to the sampler if the > SURFACE_STATE fed includes the auxiliary buffer, thus when one sets up the > SURFACE_STATE for sampler, the auxiliary buffer cannot be mentioned in the > GPU command; The sampler will always try to read the auxiliary buffer if it > is given the opportunity to do so. Indeed, it is quite feasible that less > bandwidth is consumed if the sampler is given the chance to read an auxiliary > buffer in place of the buffer; as such even if the surface is resolved one > may wish to feed the sampler the auxiliary buffer. Indeed, for HiZ, i965 > programs to use the HiZ auxiliary buffer even if the depth buffer is fully > resolved (see inte_mipmap_tree_sample_with_hiz() in intel_mipmap_tree.c).
If you take a look at brw_update_texture_surface(), just in the end before brw_emit_surface_state() the logic explictly consults for intel_miptree_texture_aux_usage(). This in turn tells if the auxiliary buffer is resolved and it doesn't need to be programmed. > > > In case of blorp, as far as I know all operations sampling something should > > go thru blorp_surf_for_miptree(). Can you point out cases that don't? > > Blorp is used in more than blorp_surf_for_miptree(), for example implementing > GetTexImage(). Indeed, it is possible for blorp to sample from an ASTC5x5 > (you can see this handled in the patch series I posted). I chose the hammer > that the default is to just assume blorp is going to access auxiliary buffers > unless a flag is set when the caller knows that blorp is going to sample from > an astc5x5 (against see my patch series). This path goes thru brw_blorp_download_miptree() which in turn takes either brw_blorp_copy_miptrees() or brw_blorp_blit_miptrees(). Both in turn consult blorp_surf_for_miptree(). > > >Right. In the case of sampling both aux and astc5x5 in the same draw cycle > >the only thing to do is to disable aux. With my question of direction I > >meant the texture > > cache flush between two cycles. Do we need to flush in both cases > > 1) ASTC5x5 in first cycle and AUX in the following > > 2) AUX in first cycle and ASTC5x5 in the following > > YES we need to flush in both cases. What is happening is that the sampler > hardware is bugged. Let us suppose it was bugged in only 1 direction, take 1. > Then if the sampler first samples from an ASTC5x5 then an AUX it would not > hang, but the other way it would. However, if there are multiple draws in > flight where one samples from an ASTC5x5 and the other does not, the command > buffer order gives ZERO guarantee that the sampler will sample in that order > because fragments get executed out-of-order even across draw calls even > within a subslice (this is why sendc is needed at end of shader in GEN). This would be a nice a piece of documentation to add into the code. Thanks for explaining. > > >> 4. With 3 in mind, using the bit-masks is not a good idea as we want to > >> then enforce at the code level > >> that only one of the two is possible without texture invalidates. > > Can you elaborate this a little more? It tells if aux is/was used and it > > tells if astc5x5 is/was used. That is all we need, right? > > WRONG. We must enforce that a given draw call can have neither or only one. > By having bitmasks it is possible to support a state having both. I don't see how the bit mask prevents one from forcing anything. I now do see a flaw in the RFC related to this. In brw_predraw_resolve_inputs() one would actually need to smash down the recorded AUX mask bit when it reacts to ASTC5x5 being present. > > At any rate, please review the patch series I have posted and I am happy to > take suggestions to improve that patch series that I have tested. Well, if nothing else, I would really like to see you used brw_predraw_resolve_inputs() for the resolves. > > -Kevin > _______________________________________________ mesa-dev mailing list [email protected] https://lists.freedesktop.org/mailman/listinfo/mesa-dev
