Re: [Mesa-dev] [PATCH v2 0/5] i965: ASTC5x5 workaround
Hi, For those interested, I made a version of the series that does not change intel_miptree_texture_aux_usage(), but instead restores the disable_aux argument to intel_miptree_prepare_texture() and uses that. What I do not like is that it forced one to have several little things tweaked here and there to make it well whereas in contrast hitting intel_miptree_texture_aux_usage() put it in one place exactly. The branch is: https://github.com/krogueintel/asem/tree/astc5x5-wa-v3 . I took a look at blorp's using of intel_miptree_prepare_texture(). It uses it in only one place: when blitting (no surprise). Having the approach of changing intel_miptree_prepare_texture() will have blorp potentially not use auxiliary buffers on blitting; functionality will NOT regress because blorp angelically calls intel_miptree_prepare_access(), but it might degrade performance. Specifically: blorp when reading form the source surface will not use the auxiliary surface if the last draw/compute had astc5x5 texture. At this point, the choice is essentially: - cleaner and simpler code by hitting of intel_miptree_prepare_texture() at cost of potential performance hit of where a blit follows a draw (or compute) that uses an ASTC5x5 texture OR - ickier code without that one cost in blorp. I honestly suspect that one will never find an application that has a performance difference between the two unless one deliberately makes such an application. -Kevin -Original Message- From: Palli, Tapani Sent: Wednesday, February 14, 2018 9:58 AM To: Rogovin, Kevin <kevin.rogo...@intel.com>; Jason Ekstrand <ja...@jlekstrand.net> Cc: ML mesa-dev <mesa-dev@lists.freedesktop.org> Subject: Re: [Mesa-dev] [PATCH v2 0/5] i965: ASTC5x5 workaround On 14.02.2018 09:54, Tapani Pälli wrote: > > > On 14.02.2018 09:38, Rogovin, Kevin wrote: >> Hi, >> >> The 3rd patch, "i965: use ASTC5x5 workaround in >> intel_miptree_texture_aux_usage has issues: >> 1. Definitely: brw_draw lacks the call to >> gen9_astc5x5_perform_wa() which generates the needed flush between >> batchbuffers > > Now it happens via intel_miptree_prepare_texture (in > intel_miptree_texture_aux_usage). No it does not, sorry for that :) Yeah I believe I need to restore that. > >> 2. Uneasy: I am nervous about hitting >> intel_miptree_texture_aux_usage() as it is used in lots of places >> directly and indirectly. The interaction with blorp resolve makes me >> very uneasy I would rather restore the bool argument disable_aux >> to intel_miptree_prepare_render() to keep the ASTC evil a little more >> localized (the function is used only in brw_draw.c for resolving inputs). > > IMO it feels more centralized .. but no strong opinions here. I > haven't spotted any regressions because of this. > > >> -Kevin >> >> >> -Original Message- >> From: Palli, Tapani >> Sent: Monday, February 12, 2018 10:14 AM >> To: Rogovin, Kevin <kevin.rogo...@intel.com>; Jason Ekstrand >> <ja...@jlekstrand.net> >> Cc: ML mesa-dev <mesa-dev@lists.freedesktop.org> >> Subject: Re: [Mesa-dev] [PATCH v2 0/5] i965: ASTC5x5 workaround >> >> >> >> On 02/12/2018 09:44 AM, Tapani Pälli wrote: >>> Hi; >>> >>> On 02/08/2018 09:50 AM, Rogovin, Kevin wrote: >>>> Hi, >>>> >>>> I gave it a whirl of setting the .mocs field set to 0 passed to >>>> isl_surf_fill_state() ALWAYS. Sadly CarChase GLES continued to hang >>>> (where as the GL did not because it does not use ASTC). This makes >>>> sense since MOCS (atleast last time I looked at it) only really >>>> controls cache usage for L3 and eLLC (please anyone correct me if I >>>> am wrong in this) whereas the issue is that the samplers mess up >>>> how they deal with its own (private) cache. >>>> >>>> It really is nasty that it appears (as of now) that this >>>> complicated work around is needed and needs to somehow be >>>> re-implemented in anv as well. >>> >>> It seems surrounding code has changed so that these patches need >>> some changes. Kevin, are you planning to rebase/refactor these changes? >> >> FYI I've rebased the patches and did additional porting (because of >> commit df13588d21) here: >> >> https://cgit.freedesktop.org/~tpalli/mesa/log/?h=astc5x5 >> >> Let me know if this looks OK for you. >> >>> >>>> -Kevin >>>> >>>> *From:*Jason Ekstrand [mailto:ja...@jlekstrand.net] >>>> *Sent:* Thursday, February 8, 2018 2:47 AM >>>> *To
Re: [Mesa-dev] [PATCH v2 0/5] i965: ASTC5x5 workaround
On 14.02.2018 09:54, Tapani Pälli wrote: On 14.02.2018 09:38, Rogovin, Kevin wrote: Hi, The 3rd patch, "i965: use ASTC5x5 workaround in intel_miptree_texture_aux_usage has issues: 1. Definitely: brw_draw lacks the call to gen9_astc5x5_perform_wa() which generates the needed flush between batchbuffers Now it happens via intel_miptree_prepare_texture (in intel_miptree_texture_aux_usage). No it does not, sorry for that :) Yeah I believe I need to restore that. 2. Uneasy: I am nervous about hitting intel_miptree_texture_aux_usage() as it is used in lots of places directly and indirectly. The interaction with blorp resolve makes me very uneasy I would rather restore the bool argument disable_aux to intel_miptree_prepare_render() to keep the ASTC evil a little more localized (the function is used only in brw_draw.c for resolving inputs). IMO it feels more centralized .. but no strong opinions here. I haven't spotted any regressions because of this. -Kevin -Original Message- From: Palli, Tapani Sent: Monday, February 12, 2018 10:14 AM To: Rogovin, Kevin <kevin.rogo...@intel.com>; Jason Ekstrand <ja...@jlekstrand.net> Cc: ML mesa-dev <mesa-dev@lists.freedesktop.org> Subject: Re: [Mesa-dev] [PATCH v2 0/5] i965: ASTC5x5 workaround On 02/12/2018 09:44 AM, Tapani Pälli wrote: Hi; On 02/08/2018 09:50 AM, Rogovin, Kevin wrote: Hi, I gave it a whirl of setting the .mocs field set to 0 passed to isl_surf_fill_state() ALWAYS. Sadly CarChase GLES continued to hang (where as the GL did not because it does not use ASTC). This makes sense since MOCS (atleast last time I looked at it) only really controls cache usage for L3 and eLLC (please anyone correct me if I am wrong in this) whereas the issue is that the samplers mess up how they deal with its own (private) cache. It really is nasty that it appears (as of now) that this complicated work around is needed and needs to somehow be re-implemented in anv as well. It seems surrounding code has changed so that these patches need some changes. Kevin, are you planning to rebase/refactor these changes? FYI I've rebased the patches and did additional porting (because of commit df13588d21) here: https://cgit.freedesktop.org/~tpalli/mesa/log/?h=astc5x5 Let me know if this looks OK for you. -Kevin *From:*Jason Ekstrand [mailto:ja...@jlekstrand.net] *Sent:* Thursday, February 8, 2018 2:47 AM *To:* Rogovin, Kevin <kevin.rogo...@intel.com> *Cc:* ML mesa-dev <mesa-dev@lists.freedesktop.org> *Subject:* Re: [Mesa-dev] [PATCH v2 0/5] i965: ASTC5x5 workaround Random thought: Nanley and I were talking about this just now and I was complaining about how much I hate the fact that this workaround exists because we can't implement it in Vulkan. Then I got an idea. What would happen if we just set MOCS to zero (uncached) for ASTC 5x5 textures? Does that make the hang go away? How bad is the car chase performance with that compared to this series? It's a bit of a big hammer but has the advantage of simplicity. If it causes performance to tank on anything then then the more complex solution is probably worth it but I thought it was worth a try. --Jason On Thu, Dec 14, 2017 at 9:39 AM, <kevin.rogo...@intel.com <mailto:kevin.rogo...@intel.com>> wrote: From: Kevin Rogovin <kevin.rogo...@intel.com <mailto:kevin.rogo...@intel.com>> This patch series implements a needed workaround for Gen9 for ASTC5x5 sampler reads. The crux of the work around is to make sure that the sampler does not read an ASTC5x5 texture and a surface with an auxilary buffer without having a texture cache invalidate and command streamer stall between such accesses. With this patch series applied to the (current) master branch of mesa, carchase works on my SKL GT4. v2: Rename workaround functions from brw_ to gen9_ (suggested/requested by Topi Pohjolainen). Place texture resolve to avoid using auxilary surface when ASTC5x5 is detected in brw_predraw_resolve_inputs() instead of another detected function; doing so allows one to avoid walking the textures again. (suggested/requested by Topi Pohjolainen). Emit command streamer stall in addition to texture invalidate. (original short-coming caught by Jason Ekstrand) Place workaround function in (new) dedicated file. Minor path re-ordering to accomodate changes. Kevin Rogovin (5): i965: define astx5x5 workaround infrastructure i965: set ASTC5x5 workaround texture type tracking on texture validate i965: use ASTC5x5 workaround in brw_draw i965: use ASTC5x5 workaround in brw_compute i965: ASTC5x5 workaround logic for blorp src/mesa/drivers/dri/i965/Makefile.sources | 1 + src/mesa/drivers/dri/i965/brw_compute.c
Re: [Mesa-dev] [PATCH v2 0/5] i965: ASTC5x5 workaround
On 14.02.2018 09:38, Rogovin, Kevin wrote: Hi, The 3rd patch, "i965: use ASTC5x5 workaround in intel_miptree_texture_aux_usage has issues: 1. Definitely: brw_draw lacks the call to gen9_astc5x5_perform_wa() which generates the needed flush between batchbuffers Now it happens via intel_miptree_prepare_texture (in intel_miptree_texture_aux_usage). 2. Uneasy: I am nervous about hitting intel_miptree_texture_aux_usage() as it is used in lots of places directly and indirectly. The interaction with blorp resolve makes me very uneasy I would rather restore the bool argument disable_aux to intel_miptree_prepare_render() to keep the ASTC evil a little more localized (the function is used only in brw_draw.c for resolving inputs). IMO it feels more centralized .. but no strong opinions here. I haven't spotted any regressions because of this. -Kevin -Original Message- From: Palli, Tapani Sent: Monday, February 12, 2018 10:14 AM To: Rogovin, Kevin <kevin.rogo...@intel.com>; Jason Ekstrand <ja...@jlekstrand.net> Cc: ML mesa-dev <mesa-dev@lists.freedesktop.org> Subject: Re: [Mesa-dev] [PATCH v2 0/5] i965: ASTC5x5 workaround On 02/12/2018 09:44 AM, Tapani Pälli wrote: Hi; On 02/08/2018 09:50 AM, Rogovin, Kevin wrote: Hi, I gave it a whirl of setting the .mocs field set to 0 passed to isl_surf_fill_state() ALWAYS. Sadly CarChase GLES continued to hang (where as the GL did not because it does not use ASTC). This makes sense since MOCS (atleast last time I looked at it) only really controls cache usage for L3 and eLLC (please anyone correct me if I am wrong in this) whereas the issue is that the samplers mess up how they deal with its own (private) cache. It really is nasty that it appears (as of now) that this complicated work around is needed and needs to somehow be re-implemented in anv as well. It seems surrounding code has changed so that these patches need some changes. Kevin, are you planning to rebase/refactor these changes? FYI I've rebased the patches and did additional porting (because of commit df13588d21) here: https://cgit.freedesktop.org/~tpalli/mesa/log/?h=astc5x5 Let me know if this looks OK for you. -Kevin *From:*Jason Ekstrand [mailto:ja...@jlekstrand.net] *Sent:* Thursday, February 8, 2018 2:47 AM *To:* Rogovin, Kevin <kevin.rogo...@intel.com> *Cc:* ML mesa-dev <mesa-dev@lists.freedesktop.org> *Subject:* Re: [Mesa-dev] [PATCH v2 0/5] i965: ASTC5x5 workaround Random thought: Nanley and I were talking about this just now and I was complaining about how much I hate the fact that this workaround exists because we can't implement it in Vulkan. Then I got an idea. What would happen if we just set MOCS to zero (uncached) for ASTC 5x5 textures? Does that make the hang go away? How bad is the car chase performance with that compared to this series? It's a bit of a big hammer but has the advantage of simplicity. If it causes performance to tank on anything then then the more complex solution is probably worth it but I thought it was worth a try. --Jason On Thu, Dec 14, 2017 at 9:39 AM, <kevin.rogo...@intel.com <mailto:kevin.rogo...@intel.com>> wrote: From: Kevin Rogovin <kevin.rogo...@intel.com <mailto:kevin.rogo...@intel.com>> This patch series implements a needed workaround for Gen9 for ASTC5x5 sampler reads. The crux of the work around is to make sure that the sampler does not read an ASTC5x5 texture and a surface with an auxilary buffer without having a texture cache invalidate and command streamer stall between such accesses. With this patch series applied to the (current) master branch of mesa, carchase works on my SKL GT4. v2: Rename workaround functions from brw_ to gen9_ (suggested/requested by Topi Pohjolainen). Place texture resolve to avoid using auxilary surface when ASTC5x5 is detected in brw_predraw_resolve_inputs() instead of another detected function; doing so allows one to avoid walking the textures again. (suggested/requested by Topi Pohjolainen). Emit command streamer stall in addition to texture invalidate. (original short-coming caught by Jason Ekstrand) Place workaround function in (new) dedicated file. Minor path re-ordering to accomodate changes. Kevin Rogovin (5): i965: define astx5x5 workaround infrastructure i965: set ASTC5x5 workaround texture type tracking on texture validate i965: use ASTC5x5 workaround in brw_draw i965: use ASTC5x5 workaround in brw_compute i965: ASTC5x5 workaround logic for blorp src/mesa/drivers/dri/i965/Makefile.sources | 1 + src/mesa/drivers/dri/i965/brw_compute.c | 6 src/mesa/drivers/dri/i965/brw_context.c | 6 src/mesa/drivers/dri/i965/brw_context.h
Re: [Mesa-dev] [PATCH v2 0/5] i965: ASTC5x5 workaround
Hi, The 3rd patch, "i965: use ASTC5x5 workaround in intel_miptree_texture_aux_usage has issues: 1. Definitely: brw_draw lacks the call to gen9_astc5x5_perform_wa() which generates the needed flush between batchbuffers 2. Uneasy: I am nervous about hitting intel_miptree_texture_aux_usage() as it is used in lots of places directly and indirectly. The interaction with blorp resolve makes me very uneasy I would rather restore the bool argument disable_aux to intel_miptree_prepare_render() to keep the ASTC evil a little more localized (the function is used only in brw_draw.c for resolving inputs). -Kevin -Original Message- From: Palli, Tapani Sent: Monday, February 12, 2018 10:14 AM To: Rogovin, Kevin <kevin.rogo...@intel.com>; Jason Ekstrand <ja...@jlekstrand.net> Cc: ML mesa-dev <mesa-dev@lists.freedesktop.org> Subject: Re: [Mesa-dev] [PATCH v2 0/5] i965: ASTC5x5 workaround On 02/12/2018 09:44 AM, Tapani Pälli wrote: > Hi; > > On 02/08/2018 09:50 AM, Rogovin, Kevin wrote: >> Hi, >> >> I gave it a whirl of setting the .mocs field set to 0 passed to >> isl_surf_fill_state() ALWAYS. Sadly CarChase GLES continued to hang >> (where as the GL did not because it does not use ASTC). This makes >> sense since MOCS (atleast last time I looked at it) only really >> controls cache usage for L3 and eLLC (please anyone correct me if I >> am wrong in this) whereas the issue is that the samplers mess up how >> they deal with its own (private) cache. >> >> It really is nasty that it appears (as of now) that this complicated >> work around is needed and needs to somehow be re-implemented in anv >> as well. > > It seems surrounding code has changed so that these patches need some > changes. Kevin, are you planning to rebase/refactor these changes? FYI I've rebased the patches and did additional porting (because of commit df13588d21) here: https://cgit.freedesktop.org/~tpalli/mesa/log/?h=astc5x5 Let me know if this looks OK for you. > >> -Kevin >> >> *From:*Jason Ekstrand [mailto:ja...@jlekstrand.net] >> *Sent:* Thursday, February 8, 2018 2:47 AM >> *To:* Rogovin, Kevin <kevin.rogo...@intel.com> >> *Cc:* ML mesa-dev <mesa-dev@lists.freedesktop.org> >> *Subject:* Re: [Mesa-dev] [PATCH v2 0/5] i965: ASTC5x5 workaround >> >> Random thought: >> >> Nanley and I were talking about this just now and I was complaining >> about how much I hate the fact that this workaround exists because we >> can't implement it in Vulkan. Then I got an idea. What would happen >> if we just set MOCS to zero (uncached) for ASTC 5x5 textures? Does >> that make the hang go away? How bad is the car chase performance >> with that compared to this series? It's a bit of a big hammer but >> has the advantage of simplicity. If it causes performance to tank on >> anything then then the more complex solution is probably worth it but >> I thought it was worth a try. >> >> --Jason >> >> On Thu, Dec 14, 2017 at 9:39 AM, <kevin.rogo...@intel.com >> <mailto:kevin.rogo...@intel.com>> wrote: >> >> From: Kevin Rogovin <kevin.rogo...@intel.com >> <mailto:kevin.rogo...@intel.com>> >> >> This patch series implements a needed workaround for Gen9 for >> ASTC5x5 >> sampler reads. The crux of the work around is to make sure that >> the >> sampler does not read an ASTC5x5 texture and a surface with an >> auxilary >> buffer without having a texture cache invalidate and command >> streamer >> stall between such accesses. >> >> With this patch series applied to the (current) master branch of >> mesa, >> carchase works on my SKL GT4. >> >> v2: >> Rename workaround functions from brw_ to gen9_ >> (suggested/requested by Topi Pohjolainen). >> >> Place texture resolve to avoid using auxilary surface >> when ASTC5x5 is detected in brw_predraw_resolve_inputs() >> instead of another detected function; doing so allows >> one to avoid walking the textures again. >> (suggested/requested by Topi Pohjolainen). >> >> Emit command streamer stall in addition to texture >> invalidate. >> (original short-coming caught by Jason Ekstrand) >> >> Place workaround function in (new) dedicated file. >> >> Minor path re-ordering to accomodate changes. >> >> Kevin Rogovin (5): >> i965: define astx5x5 workaround infrastructure >> i965: set ASTC5x5 workaround t
Re: [Mesa-dev] [PATCH v2 0/5] i965: ASTC5x5 workaround
On 02/12/2018 09:44 AM, Tapani Pälli wrote: Hi; On 02/08/2018 09:50 AM, Rogovin, Kevin wrote: Hi, I gave it a whirl of setting the .mocs field set to 0 passed to isl_surf_fill_state() ALWAYS. Sadly CarChase GLES continued to hang (where as the GL did not because it does not use ASTC). This makes sense since MOCS (atleast last time I looked at it) only really controls cache usage for L3 and eLLC (please anyone correct me if I am wrong in this) whereas the issue is that the samplers mess up how they deal with its own (private) cache. It really is nasty that it appears (as of now) that this complicated work around is needed and needs to somehow be re-implemented in anv as well. It seems surrounding code has changed so that these patches need some changes. Kevin, are you planning to rebase/refactor these changes? FYI I've rebased the patches and did additional porting (because of commit df13588d21) here: https://cgit.freedesktop.org/~tpalli/mesa/log/?h=astc5x5 Let me know if this looks OK for you. -Kevin *From:*Jason Ekstrand [mailto:ja...@jlekstrand.net] *Sent:* Thursday, February 8, 2018 2:47 AM *To:* Rogovin, Kevin <kevin.rogo...@intel.com> *Cc:* ML mesa-dev <mesa-dev@lists.freedesktop.org> *Subject:* Re: [Mesa-dev] [PATCH v2 0/5] i965: ASTC5x5 workaround Random thought: Nanley and I were talking about this just now and I was complaining about how much I hate the fact that this workaround exists because we can't implement it in Vulkan. Then I got an idea. What would happen if we just set MOCS to zero (uncached) for ASTC 5x5 textures? Does that make the hang go away? How bad is the car chase performance with that compared to this series? It's a bit of a big hammer but has the advantage of simplicity. If it causes performance to tank on anything then then the more complex solution is probably worth it but I thought it was worth a try. --Jason On Thu, Dec 14, 2017 at 9:39 AM, <kevin.rogo...@intel.com <mailto:kevin.rogo...@intel.com>> wrote: From: Kevin Rogovin <kevin.rogo...@intel.com <mailto:kevin.rogo...@intel.com>> This patch series implements a needed workaround for Gen9 for ASTC5x5 sampler reads. The crux of the work around is to make sure that the sampler does not read an ASTC5x5 texture and a surface with an auxilary buffer without having a texture cache invalidate and command streamer stall between such accesses. With this patch series applied to the (current) master branch of mesa, carchase works on my SKL GT4. v2: Rename workaround functions from brw_ to gen9_ (suggested/requested by Topi Pohjolainen). Place texture resolve to avoid using auxilary surface when ASTC5x5 is detected in brw_predraw_resolve_inputs() instead of another detected function; doing so allows one to avoid walking the textures again. (suggested/requested by Topi Pohjolainen). Emit command streamer stall in addition to texture invalidate. (original short-coming caught by Jason Ekstrand) Place workaround function in (new) dedicated file. Minor path re-ordering to accomodate changes. Kevin Rogovin (5): i965: define astx5x5 workaround infrastructure i965: set ASTC5x5 workaround texture type tracking on texture validate i965: use ASTC5x5 workaround in brw_draw i965: use ASTC5x5 workaround in brw_compute i965: ASTC5x5 workaround logic for blorp src/mesa/drivers/dri/i965/Makefile.sources | 1 + src/mesa/drivers/dri/i965/brw_compute.c | 6 src/mesa/drivers/dri/i965/brw_context.c | 6 src/mesa/drivers/dri/i965/brw_context.h | 24 src/mesa/drivers/dri/i965/brw_draw.c | 16 +-- src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 5 src/mesa/drivers/dri/i965/gen9_astc5x5_wa.c | 36 src/mesa/drivers/dri/i965/genX_blorp_exec.c | 5 src/mesa/drivers/dri/i965/intel_batchbuffer.c | 1 + src/mesa/drivers/dri/i965/intel_tex_image.c | 16 --- src/mesa/drivers/dri/i965/intel_tex_validate.c | 13 + src/mesa/drivers/dri/i965/meson.build | 1 + 12 files changed, 124 insertions(+), 6 deletions(-) create mode 100644 src/mesa/drivers/dri/i965/gen9_astc5x5_wa.c -- 2.7.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org <mailto:mesa-dev@lists.freedesktop.org> https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop
Re: [Mesa-dev] [PATCH v2 0/5] i965: ASTC5x5 workaround
Hi; On 02/08/2018 09:50 AM, Rogovin, Kevin wrote: Hi, I gave it a whirl of setting the .mocs field set to 0 passed to isl_surf_fill_state() ALWAYS. Sadly CarChase GLES continued to hang (where as the GL did not because it does not use ASTC). This makes sense since MOCS (atleast last time I looked at it) only really controls cache usage for L3 and eLLC (please anyone correct me if I am wrong in this) whereas the issue is that the samplers mess up how they deal with its own (private) cache. It really is nasty that it appears (as of now) that this complicated work around is needed and needs to somehow be re-implemented in anv as well. It seems surrounding code has changed so that these patches need some changes. Kevin, are you planning to rebase/refactor these changes? -Kevin *From:*Jason Ekstrand [mailto:ja...@jlekstrand.net] *Sent:* Thursday, February 8, 2018 2:47 AM *To:* Rogovin, Kevin <kevin.rogo...@intel.com> *Cc:* ML mesa-dev <mesa-dev@lists.freedesktop.org> *Subject:* Re: [Mesa-dev] [PATCH v2 0/5] i965: ASTC5x5 workaround Random thought: Nanley and I were talking about this just now and I was complaining about how much I hate the fact that this workaround exists because we can't implement it in Vulkan. Then I got an idea. What would happen if we just set MOCS to zero (uncached) for ASTC 5x5 textures? Does that make the hang go away? How bad is the car chase performance with that compared to this series? It's a bit of a big hammer but has the advantage of simplicity. If it causes performance to tank on anything then then the more complex solution is probably worth it but I thought it was worth a try. --Jason On Thu, Dec 14, 2017 at 9:39 AM, <kevin.rogo...@intel.com <mailto:kevin.rogo...@intel.com>> wrote: From: Kevin Rogovin <kevin.rogo...@intel.com <mailto:kevin.rogo...@intel.com>> This patch series implements a needed workaround for Gen9 for ASTC5x5 sampler reads. The crux of the work around is to make sure that the sampler does not read an ASTC5x5 texture and a surface with an auxilary buffer without having a texture cache invalidate and command streamer stall between such accesses. With this patch series applied to the (current) master branch of mesa, carchase works on my SKL GT4. v2: Rename workaround functions from brw_ to gen9_ (suggested/requested by Topi Pohjolainen). Place texture resolve to avoid using auxilary surface when ASTC5x5 is detected in brw_predraw_resolve_inputs() instead of another detected function; doing so allows one to avoid walking the textures again. (suggested/requested by Topi Pohjolainen). Emit command streamer stall in addition to texture invalidate. (original short-coming caught by Jason Ekstrand) Place workaround function in (new) dedicated file. Minor path re-ordering to accomodate changes. Kevin Rogovin (5): i965: define astx5x5 workaround infrastructure i965: set ASTC5x5 workaround texture type tracking on texture validate i965: use ASTC5x5 workaround in brw_draw i965: use ASTC5x5 workaround in brw_compute i965: ASTC5x5 workaround logic for blorp src/mesa/drivers/dri/i965/Makefile.sources | 1 + src/mesa/drivers/dri/i965/brw_compute.c | 6 src/mesa/drivers/dri/i965/brw_context.c | 6 src/mesa/drivers/dri/i965/brw_context.h | 24 src/mesa/drivers/dri/i965/brw_draw.c | 16 +-- src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 5 src/mesa/drivers/dri/i965/gen9_astc5x5_wa.c | 36 src/mesa/drivers/dri/i965/genX_blorp_exec.c | 5 src/mesa/drivers/dri/i965/intel_batchbuffer.c | 1 + src/mesa/drivers/dri/i965/intel_tex_image.c | 16 --- src/mesa/drivers/dri/i965/intel_tex_validate.c | 13 + src/mesa/drivers/dri/i965/meson.build | 1 + 12 files changed, 124 insertions(+), 6 deletions(-) create mode 100644 src/mesa/drivers/dri/i965/gen9_astc5x5_wa.c -- 2.7.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org <mailto:mesa-dev@lists.freedesktop.org> https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2 0/5] i965: ASTC5x5 workaround
Hi, I gave it a whirl of setting the .mocs field set to 0 passed to isl_surf_fill_state() ALWAYS. Sadly CarChase GLES continued to hang (where as the GL did not because it does not use ASTC). This makes sense since MOCS (atleast last time I looked at it) only really controls cache usage for L3 and eLLC (please anyone correct me if I am wrong in this) whereas the issue is that the samplers mess up how they deal with its own (private) cache. It really is nasty that it appears (as of now) that this complicated work around is needed and needs to somehow be re-implemented in anv as well. -Kevin From: Jason Ekstrand [mailto:ja...@jlekstrand.net] Sent: Thursday, February 8, 2018 2:47 AM To: Rogovin, Kevin <kevin.rogo...@intel.com> Cc: ML mesa-dev <mesa-dev@lists.freedesktop.org> Subject: Re: [Mesa-dev] [PATCH v2 0/5] i965: ASTC5x5 workaround Random thought: Nanley and I were talking about this just now and I was complaining about how much I hate the fact that this workaround exists because we can't implement it in Vulkan. Then I got an idea. What would happen if we just set MOCS to zero (uncached) for ASTC 5x5 textures? Does that make the hang go away? How bad is the car chase performance with that compared to this series? It's a bit of a big hammer but has the advantage of simplicity. If it causes performance to tank on anything then then the more complex solution is probably worth it but I thought it was worth a try. --Jason On Thu, Dec 14, 2017 at 9:39 AM, <kevin.rogo...@intel.com<mailto:kevin.rogo...@intel.com>> wrote: From: Kevin Rogovin <kevin.rogo...@intel.com<mailto:kevin.rogo...@intel.com>> This patch series implements a needed workaround for Gen9 for ASTC5x5 sampler reads. The crux of the work around is to make sure that the sampler does not read an ASTC5x5 texture and a surface with an auxilary buffer without having a texture cache invalidate and command streamer stall between such accesses. With this patch series applied to the (current) master branch of mesa, carchase works on my SKL GT4. v2: Rename workaround functions from brw_ to gen9_ (suggested/requested by Topi Pohjolainen). Place texture resolve to avoid using auxilary surface when ASTC5x5 is detected in brw_predraw_resolve_inputs() instead of another detected function; doing so allows one to avoid walking the textures again. (suggested/requested by Topi Pohjolainen). Emit command streamer stall in addition to texture invalidate. (original short-coming caught by Jason Ekstrand) Place workaround function in (new) dedicated file. Minor path re-ordering to accomodate changes. Kevin Rogovin (5): i965: define astx5x5 workaround infrastructure i965: set ASTC5x5 workaround texture type tracking on texture validate i965: use ASTC5x5 workaround in brw_draw i965: use ASTC5x5 workaround in brw_compute i965: ASTC5x5 workaround logic for blorp src/mesa/drivers/dri/i965/Makefile.sources | 1 + src/mesa/drivers/dri/i965/brw_compute.c | 6 src/mesa/drivers/dri/i965/brw_context.c | 6 src/mesa/drivers/dri/i965/brw_context.h | 24 src/mesa/drivers/dri/i965/brw_draw.c | 16 +-- src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 5 src/mesa/drivers/dri/i965/gen9_astc5x5_wa.c | 36 src/mesa/drivers/dri/i965/genX_blorp_exec.c | 5 src/mesa/drivers/dri/i965/intel_batchbuffer.c| 1 + src/mesa/drivers/dri/i965/intel_tex_image.c | 16 --- src/mesa/drivers/dri/i965/intel_tex_validate.c | 13 + src/mesa/drivers/dri/i965/meson.build| 1 + 12 files changed, 124 insertions(+), 6 deletions(-) create mode 100644 src/mesa/drivers/dri/i965/gen9_astc5x5_wa.c -- 2.7.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org<mailto:mesa-dev@lists.freedesktop.org> https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2 0/5] i965: ASTC5x5 workaround
Random thought: Nanley and I were talking about this just now and I was complaining about how much I hate the fact that this workaround exists because we can't implement it in Vulkan. Then I got an idea. What would happen if we just set MOCS to zero (uncached) for ASTC 5x5 textures? Does that make the hang go away? How bad is the car chase performance with that compared to this series? It's a bit of a big hammer but has the advantage of simplicity. If it causes performance to tank on anything then then the more complex solution is probably worth it but I thought it was worth a try. --Jason On Thu, Dec 14, 2017 at 9:39 AM,wrote: > From: Kevin Rogovin > > This patch series implements a needed workaround for Gen9 for ASTC5x5 > sampler reads. The crux of the work around is to make sure that the > sampler does not read an ASTC5x5 texture and a surface with an auxilary > buffer without having a texture cache invalidate and command streamer > stall between such accesses. > > With this patch series applied to the (current) master branch of mesa, > carchase works on my SKL GT4. > > v2: > Rename workaround functions from brw_ to gen9_ > (suggested/requested by Topi Pohjolainen). > > Place texture resolve to avoid using auxilary surface > when ASTC5x5 is detected in brw_predraw_resolve_inputs() > instead of another detected function; doing so allows > one to avoid walking the textures again. > (suggested/requested by Topi Pohjolainen). > > Emit command streamer stall in addition to texture > invalidate. > (original short-coming caught by Jason Ekstrand) > > Place workaround function in (new) dedicated file. > > Minor path re-ordering to accomodate changes. > > Kevin Rogovin (5): > i965: define astx5x5 workaround infrastructure > i965: set ASTC5x5 workaround texture type tracking on texture validate > i965: use ASTC5x5 workaround in brw_draw > i965: use ASTC5x5 workaround in brw_compute > i965: ASTC5x5 workaround logic for blorp > > src/mesa/drivers/dri/i965/Makefile.sources | 1 + > src/mesa/drivers/dri/i965/brw_compute.c | 6 > src/mesa/drivers/dri/i965/brw_context.c | 6 > src/mesa/drivers/dri/i965/brw_context.h | 24 > src/mesa/drivers/dri/i965/brw_draw.c | 16 +-- > src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 5 > src/mesa/drivers/dri/i965/gen9_astc5x5_wa.c | 36 > > src/mesa/drivers/dri/i965/genX_blorp_exec.c | 5 > src/mesa/drivers/dri/i965/intel_batchbuffer.c| 1 + > src/mesa/drivers/dri/i965/intel_tex_image.c | 16 --- > src/mesa/drivers/dri/i965/intel_tex_validate.c | 13 + > src/mesa/drivers/dri/i965/meson.build| 1 + > 12 files changed, 124 insertions(+), 6 deletions(-) > create mode 100644 src/mesa/drivers/dri/i965/gen9_astc5x5_wa.c > > -- > 2.7.4 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2 0/5] i965: ASTC5x5 workaround
On Mon, Dec 18, 2017 at 08:05:38AM +, Rogovin, Kevin wrote: > Hi, > > I gave it a try by modifying isl_genX(surf_fill_state_s) in > src/intel/isl/isl_surface_state.c where I set SamplerL2BypassModeDisable > ALWAYS as true for GEN9; sadly car chase continued to hang. > > -Kevin > Hi, Thanks for giving it a try. Regards, Nanley > -Original Message- > From: Nanley Chery [mailto:nanleych...@gmail.com] > Sent: Friday, December 15, 2017 8:34 PM > To: Rogovin, Kevin <kevin.rogo...@intel.com> > Cc: mesa-dev@lists.freedesktop.org > Subject: Re: [Mesa-dev] [PATCH v2 0/5] i965: ASTC5x5 workaround > > On Thu, Dec 14, 2017 at 07:39:46PM +0200, kevin.rogo...@intel.com wrote: > > From: Kevin Rogovin <kevin.rogo...@intel.com> > > > > This patch series implements a needed workaround for Gen9 for ASTC5x5 > > sampler reads. The crux of the work around is to make sure that the > > sampler does not read an ASTC5x5 texture and a surface with an > > auxilary buffer without having a texture cache invalidate and command > > streamer stall between such accesses. > > > > This workaround sounds like it deals with the same types of surfaces dealt > with in the RENDER_SURFACE_STATE field, Sampler L2 Out of Order Mode Disable > (or SamplerL2BypassModeDisable in our driver). > > Here's the programming note from the SKL PRM on this field: > * This bit must be set for the following surface types: > BC2_UNORM BC3_UNORM BC5_UNORM BC5_SNORM BC7_UNORM > * This bit must be set for surfaces which contain a HiZ auxilliary surface > if other surfaces using AUX_CCS_E or AUX_CCS_D auxiliary surface state > (lossless color compression) are being sampled at the same time. > > Have we tried setting this bit for ASTC_5x5 textures? > > -Nanley > > > With this patch series applied to the (current) master branch of mesa, > > carchase works on my SKL GT4. > > > > v2: > > Rename workaround functions from brw_ to gen9_ > > (suggested/requested by Topi Pohjolainen). > > > > Place texture resolve to avoid using auxilary surface > > when ASTC5x5 is detected in brw_predraw_resolve_inputs() > > instead of another detected function; doing so allows > > one to avoid walking the textures again. > > (suggested/requested by Topi Pohjolainen). > > > > Emit command streamer stall in addition to texture > > invalidate. > > (original short-coming caught by Jason Ekstrand) > > > > Place workaround function in (new) dedicated file. > > > > Minor path re-ordering to accomodate changes. > > > > Kevin Rogovin (5): > > i965: define astx5x5 workaround infrastructure > > i965: set ASTC5x5 workaround texture type tracking on texture validate > > i965: use ASTC5x5 workaround in brw_draw > > i965: use ASTC5x5 workaround in brw_compute > > i965: ASTC5x5 workaround logic for blorp > > > > src/mesa/drivers/dri/i965/Makefile.sources | 1 + > > src/mesa/drivers/dri/i965/brw_compute.c | 6 > > src/mesa/drivers/dri/i965/brw_context.c | 6 > > src/mesa/drivers/dri/i965/brw_context.h | 24 > > src/mesa/drivers/dri/i965/brw_draw.c | 16 +-- > > src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 5 > > src/mesa/drivers/dri/i965/gen9_astc5x5_wa.c | 36 > > > > src/mesa/drivers/dri/i965/genX_blorp_exec.c | 5 > > src/mesa/drivers/dri/i965/intel_batchbuffer.c| 1 + > > src/mesa/drivers/dri/i965/intel_tex_image.c | 16 --- > > src/mesa/drivers/dri/i965/intel_tex_validate.c | 13 + > > src/mesa/drivers/dri/i965/meson.build| 1 + > > 12 files changed, 124 insertions(+), 6 deletions(-) create mode > > 100644 src/mesa/drivers/dri/i965/gen9_astc5x5_wa.c > > > > -- > > 2.7.4 > > > > ___ > > mesa-dev mailing list > > mesa-dev@lists.freedesktop.org > > https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2 0/5] i965: ASTC5x5 workaround
Hi, I gave it a try by modifying isl_genX(surf_fill_state_s) in src/intel/isl/isl_surface_state.c where I set SamplerL2BypassModeDisable ALWAYS as true for GEN9; sadly car chase continued to hang. -Kevin -Original Message- From: Nanley Chery [mailto:nanleych...@gmail.com] Sent: Friday, December 15, 2017 8:34 PM To: Rogovin, Kevin <kevin.rogo...@intel.com> Cc: mesa-dev@lists.freedesktop.org Subject: Re: [Mesa-dev] [PATCH v2 0/5] i965: ASTC5x5 workaround On Thu, Dec 14, 2017 at 07:39:46PM +0200, kevin.rogo...@intel.com wrote: > From: Kevin Rogovin <kevin.rogo...@intel.com> > > This patch series implements a needed workaround for Gen9 for ASTC5x5 > sampler reads. The crux of the work around is to make sure that the > sampler does not read an ASTC5x5 texture and a surface with an > auxilary buffer without having a texture cache invalidate and command > streamer stall between such accesses. > This workaround sounds like it deals with the same types of surfaces dealt with in the RENDER_SURFACE_STATE field, Sampler L2 Out of Order Mode Disable (or SamplerL2BypassModeDisable in our driver). Here's the programming note from the SKL PRM on this field: * This bit must be set for the following surface types: BC2_UNORM BC3_UNORM BC5_UNORM BC5_SNORM BC7_UNORM * This bit must be set for surfaces which contain a HiZ auxilliary surface if other surfaces using AUX_CCS_E or AUX_CCS_D auxiliary surface state (lossless color compression) are being sampled at the same time. Have we tried setting this bit for ASTC_5x5 textures? -Nanley > With this patch series applied to the (current) master branch of mesa, > carchase works on my SKL GT4. > > v2: > Rename workaround functions from brw_ to gen9_ > (suggested/requested by Topi Pohjolainen). > > Place texture resolve to avoid using auxilary surface > when ASTC5x5 is detected in brw_predraw_resolve_inputs() > instead of another detected function; doing so allows > one to avoid walking the textures again. > (suggested/requested by Topi Pohjolainen). > > Emit command streamer stall in addition to texture > invalidate. > (original short-coming caught by Jason Ekstrand) > > Place workaround function in (new) dedicated file. > > Minor path re-ordering to accomodate changes. > > Kevin Rogovin (5): > i965: define astx5x5 workaround infrastructure > i965: set ASTC5x5 workaround texture type tracking on texture validate > i965: use ASTC5x5 workaround in brw_draw > i965: use ASTC5x5 workaround in brw_compute > i965: ASTC5x5 workaround logic for blorp > > src/mesa/drivers/dri/i965/Makefile.sources | 1 + > src/mesa/drivers/dri/i965/brw_compute.c | 6 > src/mesa/drivers/dri/i965/brw_context.c | 6 > src/mesa/drivers/dri/i965/brw_context.h | 24 > src/mesa/drivers/dri/i965/brw_draw.c | 16 +-- > src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 5 > src/mesa/drivers/dri/i965/gen9_astc5x5_wa.c | 36 > > src/mesa/drivers/dri/i965/genX_blorp_exec.c | 5 > src/mesa/drivers/dri/i965/intel_batchbuffer.c| 1 + > src/mesa/drivers/dri/i965/intel_tex_image.c | 16 --- > src/mesa/drivers/dri/i965/intel_tex_validate.c | 13 + > src/mesa/drivers/dri/i965/meson.build| 1 + > 12 files changed, 124 insertions(+), 6 deletions(-) create mode > 100644 src/mesa/drivers/dri/i965/gen9_astc5x5_wa.c > > -- > 2.7.4 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2 0/5] i965: ASTC5x5 workaround
On Thu, Dec 14, 2017 at 07:39:46PM +0200, kevin.rogo...@intel.com wrote: > From: Kevin Rogovin> > This patch series implements a needed workaround for Gen9 for ASTC5x5 > sampler reads. The crux of the work around is to make sure that the > sampler does not read an ASTC5x5 texture and a surface with an auxilary > buffer without having a texture cache invalidate and command streamer > stall between such accesses. > This workaround sounds like it deals with the same types of surfaces dealt with in the RENDER_SURFACE_STATE field, Sampler L2 Out of Order Mode Disable (or SamplerL2BypassModeDisable in our driver). Here's the programming note from the SKL PRM on this field: * This bit must be set for the following surface types: BC2_UNORM BC3_UNORM BC5_UNORM BC5_SNORM BC7_UNORM * This bit must be set for surfaces which contain a HiZ auxilliary surface if other surfaces using AUX_CCS_E or AUX_CCS_D auxiliary surface state (lossless color compression) are being sampled at the same time. Have we tried setting this bit for ASTC_5x5 textures? -Nanley > With this patch series applied to the (current) master branch of mesa, > carchase works on my SKL GT4. > > v2: > Rename workaround functions from brw_ to gen9_ > (suggested/requested by Topi Pohjolainen). > > Place texture resolve to avoid using auxilary surface > when ASTC5x5 is detected in brw_predraw_resolve_inputs() > instead of another detected function; doing so allows > one to avoid walking the textures again. > (suggested/requested by Topi Pohjolainen). > > Emit command streamer stall in addition to texture > invalidate. > (original short-coming caught by Jason Ekstrand) > > Place workaround function in (new) dedicated file. > > Minor path re-ordering to accomodate changes. > > Kevin Rogovin (5): > i965: define astx5x5 workaround infrastructure > i965: set ASTC5x5 workaround texture type tracking on texture validate > i965: use ASTC5x5 workaround in brw_draw > i965: use ASTC5x5 workaround in brw_compute > i965: ASTC5x5 workaround logic for blorp > > src/mesa/drivers/dri/i965/Makefile.sources | 1 + > src/mesa/drivers/dri/i965/brw_compute.c | 6 > src/mesa/drivers/dri/i965/brw_context.c | 6 > src/mesa/drivers/dri/i965/brw_context.h | 24 > src/mesa/drivers/dri/i965/brw_draw.c | 16 +-- > src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 5 > src/mesa/drivers/dri/i965/gen9_astc5x5_wa.c | 36 > > src/mesa/drivers/dri/i965/genX_blorp_exec.c | 5 > src/mesa/drivers/dri/i965/intel_batchbuffer.c| 1 + > src/mesa/drivers/dri/i965/intel_tex_image.c | 16 --- > src/mesa/drivers/dri/i965/intel_tex_validate.c | 13 + > src/mesa/drivers/dri/i965/meson.build| 1 + > 12 files changed, 124 insertions(+), 6 deletions(-) > create mode 100644 src/mesa/drivers/dri/i965/gen9_astc5x5_wa.c > > -- > 2.7.4 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev