Additional update from the most recent testing.
When relying solely on guc_lrc_desc_unpin getting a failure from
deregister_context as a means for identifying that we are in the
"deregister-context-vs-suspend-late" race, it is too late a location
to handle this safely. This is because one of the
Just a follow-up note-to-self:
On Tue, 2023-08-15 at 12:08 -0700, Teres Alexis, Alan Previn wrote:
> On Tue, 2023-08-15 at 09:56 -0400, Vivi, Rodrigo wrote:
> > On Mon, Aug 14, 2023 at 06:12:09PM -0700, Alan Previn wrote:
> > >
[snip]
in guc_submission_send_busy_loop, we are incrementing the fol
On Tue, 2023-08-15 at 09:56 -0400, Vivi, Rodrigo wrote:
> On Mon, Aug 14, 2023 at 06:12:09PM -0700, Alan Previn wrote:
> > If we are at the end of suspend or very early in resume
> > its possible an async fence signal could lead us to the
> > execution of the context destruction worker (after the
>
On Mon, Aug 14, 2023 at 06:12:09PM -0700, Alan Previn wrote:
> If we are at the end of suspend or very early in resume
> its possible an async fence signal could lead us to the
> execution of the context destruction worker (after the
> prior worker flush).
>
> Even if checking that the CT is enabl
If we are at the end of suspend or very early in resume,
it's possible an async fence signal could lead us to the
execution of the context destruction worker (after the
prior worker flush).
Even if we check that the CT is enabled before calling
destroyed_worker_func, guc_lrc_desc_unpin may still fai