Re: [Mesa-dev] [PATCH 1/6] st/mesa: fix sampler view context mismatch issue
Hi Brian, On Thursday, 14 March 2019 15:04:28 CET Brian Paul wrote: > > Ok, so basically you are saying that you improve the current situation with > > this > > current series, still leaving a race condition open. But a good final fix > > to the race > > condition is underway. > > Yeah, I can probably stage the multi-thread/race patch series before > this context-fix-up series. Ok, combining that all with the rest of the fix makes highly sense! Thanks for explaining! And sorry for my late replies. Here without family and sickness. best Mathias ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/6] st/mesa: fix sampler view context mismatch issue
On 08/03/2019 22:52, Brian Paul wrote: After a while of running google-chrome and viewing Bing maps, we'd see "context mismatch in svga_sampler_view_destroy()" messages and eventually crash because we leaked too many sampler views (the kernel module would have too many sampler views). When a texture object is being deleted, we call st_texture_release_all_sampler_views() to release all the sampler views. In the list, there may sampler views which were created from other contexts. Previously, we called pipe_sampler_view_release(pipe, view) which would use the given pipe context to destroy the view if the refcount hit zero. The svga error was triggered because we were calling pipe->sampler_view_destroy(pipe, view) and the pipe context didn't match the view's parent context. Instead, call pipe_sampler_reference(&view, NULL). That way, if the refcount hits zero, we'll use the view's parent context to destroy the view. That's what we want. The pipe_sampler_view_release() function was introduced years ago to avoid freeing a sampler view with a context that was already deleted. But since then we've improved sampler view and context tracking. When a context is destroyed, the state tracker walks over all texture objects and frees all sampler views which belong to that context. So, we should never end up deleting a sampler view after its context is deleted. After this, we can remove all calls to pipe_sampler_view_release() in the drivers. Finally, it appears that we need to implement a similar tear-down mechanism for shaders and programs since we may generate per-context shader variants. Testing done: google chrome, misc GL demos, games --- src/mesa/state_tracker/st_context.c | 3 +-- src/mesa/state_tracker/st_sampler_view.c | 8 2 files changed, 5 insertions(+), 6 deletions(-) diff --git a/src/mesa/state_tracker/st_context.c b/src/mesa/state_tracker/st_context.c index 2898279..a7464fd 100644 --- a/src/mesa/state_tracker/st_context.c +++ b/src/mesa/state_tracker/st_context.c @@ -278,8 +278,7 @@ st_destroy_context_priv(struct st_context *st, bool destroy_pipe) st_destroy_bound_image_handles(st); for (i = 0; i < ARRAY_SIZE(st->state.frag_sampler_views); i++) { - pipe_sampler_view_release(st->pipe, -&st->state.frag_sampler_views[i]); + pipe_sampler_view_reference(&st->state.frag_sampler_views[i], NULL); } /* free glReadPixels cache data */ diff --git a/src/mesa/state_tracker/st_sampler_view.c b/src/mesa/state_tracker/st_sampler_view.c index e4eaf39..650a2b0 100644 --- a/src/mesa/state_tracker/st_sampler_view.c +++ b/src/mesa/state_tracker/st_sampler_view.c @@ -74,7 +74,7 @@ st_texture_set_sampler_view(struct st_context *st, if (sv->view) { /* check if the context matches */ if (sv->view->context == st->pipe) { -pipe_sampler_view_release(st->pipe, &sv->view); +pipe_sampler_view_reference(&sv->view, NULL); goto found; } } else { @@ -94,13 +94,13 @@ st_texture_set_sampler_view(struct st_context *st, if (new_max < views->max || new_max > (UINT_MAX - sizeof(*views)) / sizeof(views->views[0])) { -pipe_sampler_view_release(st->pipe, &view); +pipe_sampler_view_reference(&view, NULL); goto out; } struct st_sampler_views *new_views = malloc(new_size); if (!new_views) { -pipe_sampler_view_release(st->pipe, &view); +pipe_sampler_view_reference(&view, NULL); goto out; } @@ -225,7 +225,7 @@ st_texture_release_all_sampler_views(struct st_context *st, simple_mtx_lock(&stObj->validate_mutex); struct st_sampler_views *views = stObj->sampler_views; for (i = 0; i < views->count; ++i) - pipe_sampler_view_release(st->pipe, &views->views[i].view); + pipe_sampler_view_reference(&views->views[i].view, NULL); simple_mtx_unlock(&stObj->validate_mutex); } With your upcoming change that prevents the sampler view from ever being released with the wrong context (which I suppose should go before this one?), this series is Reviewed-By: Jose Fonseca ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/6] st/mesa: fix sampler view context mismatch issue
On 03/14/2019 12:24 AM, Mathias Fröhlich wrote: Hi Brian, On Sunday, 10 March 2019 21:32:39 CET Brian Paul wrote: On Friday, 8 March 2019 23:52:09 CET Brian Paul wrote: After a while of running google-chrome and viewing Bing maps, we'd see "context mismatch in svga_sampler_view_destroy()" messages and eventually crash because we leaked too many sampler views (the kernel module would have too many sampler views). When a texture object is being deleted, we call st_texture_release_all_sampler_views() to release all the sampler views. In the list, there may sampler views which were created from other contexts. Previously, we called pipe_sampler_view_release(pipe, view) which would use the given pipe context to destroy the view if the refcount hit zero. The svga error was triggered because we were calling pipe->sampler_view_destroy(pipe, view) and the pipe context didn't match the view's parent context. Instead, call pipe_sampler_reference(&view, NULL). That way, if the refcount hits zero, we'll use the view's parent context to destroy the view. That's what we want. That sounds plausible so far. What I wonder, is calling into an 'arbitrary' different context and do work there thread safe for any gallium driver? Right. Before this patch there was the problem of trying to destroy a sampler view with a context different from the one which created it. That's fixed, but now there's the multi-threading issue as you say, and which I believe, is a less-common situation. Hmm, you are saying we tolerate a race condition then? No, I'm planning to fix that too, as described below. Especially since pipe_sampler_view_reference in its documentation says that the views provided need to originate from the same context and that this must be the 'current'. I have been taking a quick look into iris and radeonsi and both seem to be safe in terms of threads for dereferencing the views finally. But either the documentation of pipe_sampler_view_reference should be updated or I may miss an other piece of the puzzle to see that it is still save to do so. Last week I posted a patch series for the VMware driver which handled the thread issue with a list of "zombie" sampler views. Jose suggested moving that up into the state tracker, and that is a better long-term solution. I'm almost done with a patch series that does that. I'll likely post it tomorrow. This is taking longer than expected (long testing, plus a sick child and wife at home). Ok, so basically you are saying that you improve the current situation with this current series, still leaving a race condition open. But a good final fix to the race condition is underway. Yeah, I can probably stage the multi-thread/race patch series before this context-fix-up series. Anyhow, to me it looks like all the reference counting inlines in gallium seem to assume being truely thread safe and do not assume that the context that created the view needs to be current on dereferencing. Means: is this comment at the pipe_sampler_view_*reference that states that the owning context needs to be current really correct? Such a comment implies to me that the context possibly has a non thread safe list of views that may get corrupted if being called from an other current context may be current in an other thread. ... well or something similar to that. Do you remember where this comment comes from? Is there a driver that requires the owning context needs to be current property? I'm not sure I grok all that. But since pipe_sampler_views are per-context objects, I believe they should only be created/destroyed/used/referenced from the same context. -Brian ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/6] st/mesa: fix sampler view context mismatch issue
Hi Brian, On Sunday, 10 March 2019 21:32:39 CET Brian Paul wrote: > > On Friday, 8 March 2019 23:52:09 CET Brian Paul wrote: > > > After a while of running google-chrome and viewing Bing maps, we'd see > > > "context mismatch in svga_sampler_view_destroy()" messages and > > > eventually crash because we leaked too many sampler views (the kernel > > > module would have too many sampler views). > > > > > > When a texture object is being deleted, we call > > > st_texture_release_all_sampler_views() to release all the sampler > > > views. In the list, there may sampler views which were created from > > > other contexts. > > > > > > Previously, we called pipe_sampler_view_release(pipe, view) which would > > > use the given pipe context to destroy the view if the refcount hit > > > zero. The svga error was triggered because we were calling > > > pipe->sampler_view_destroy(pipe, view) and the pipe context didn't > > > match the view's parent context. > > > > > > Instead, call pipe_sampler_reference(&view, NULL). That way, if > > > the refcount hits zero, we'll use the view's parent context to > > > destroy the view. That's what we want. > > > > That sounds plausible so far. > > > > What I wonder, is calling into an 'arbitrary' different context and do > > work there > > thread safe for any gallium driver? > > > > Right. Before this patch there was the problem of trying to destroy a > sampler view with a context different from the one which created it. > That's fixed, but now there's the multi-threading issue as you say, and > which I believe, is a less-common situation. Hmm, you are saying we tolerate a race condition then? > > Especially since pipe_sampler_view_reference in its documentation says > > that the > > views provided need to originate from the same context and that this must > > be the 'current'. > > > > I have been taking a quick look into iris and radeonsi and both seem to be > > safe > > in terms of threads for dereferencing the views finally. > > But either the documentation of pipe_sampler_view_reference should be > > updated > > or I may miss an other piece of the puzzle to see that it is still save to > > do so. > > > > Last week I posted a patch series for the VMware driver which handled the > thread issue with a list of "zombie" sampler views. Jose suggested moving > that up into the state tracker, and that is a better long-term solution. > > I'm almost done with a patch series that does that. I'll likely post it > tomorrow. Ok, so basically you are saying that you improve the current situation with this current series, still leaving a race condition open. But a good final fix to the race condition is underway. Anyhow, to me it looks like all the reference counting inlines in gallium seem to assume being truely thread safe and do not assume that the context that created the view needs to be current on dereferencing. Means: is this comment at the pipe_sampler_view_*reference that states that the owning context needs to be current really correct? Such a comment implies to me that the context possibly has a non thread safe list of views that may get corrupted if being called from an other current context may be current in an other thread. ... well or something similar to that. Do you remember where this comment comes from? Is there a driver that requires the owning context needs to be current property? best Mathias ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/6] st/mesa: fix sampler view context mismatch issue
On Sun, Mar 10, 2019 at 1:20 AM Mathias Fröhlich wrote: > Brian, > > One question also for my own education inline below: > > On Friday, 8 March 2019 23:52:09 CET Brian Paul wrote: > > After a while of running google-chrome and viewing Bing maps, we'd see > > "context mismatch in svga_sampler_view_destroy()" messages and > > eventually crash because we leaked too many sampler views (the kernel > > module would have too many sampler views). > > > > When a texture object is being deleted, we call > > st_texture_release_all_sampler_views() to release all the sampler > > views. In the list, there may sampler views which were created from > > other contexts. > > > > Previously, we called pipe_sampler_view_release(pipe, view) which would > > use the given pipe context to destroy the view if the refcount hit > > zero. The svga error was triggered because we were calling > > pipe->sampler_view_destroy(pipe, view) and the pipe context didn't > > match the view's parent context. > > > > Instead, call pipe_sampler_reference(&view, NULL). That way, if > > the refcount hits zero, we'll use the view's parent context to > > destroy the view. That's what we want. > > That sounds plausible so far. > > What I wonder, is calling into an 'arbitrary' different context and do > work there > thread safe for any gallium driver? > Right. Before this patch there was the problem of trying to destroy a sampler view with a context different from the one which created it. That's fixed, but now there's the multi-threading issue as you say, and which I believe, is a less-common situation. > > Especially since pipe_sampler_view_reference in its documentation says > that the > views provided need to originate from the same context and that this must > be the 'current'. > > I have been taking a quick look into iris and radeonsi and both seem to be > safe > in terms of threads for dereferencing the views finally. > But either the documentation of pipe_sampler_view_reference should be > updated > or I may miss an other piece of the puzzle to see that it is still save to > do so. > Last week I posted a patch series for the VMware driver which handled the thread issue with a list of "zombie" sampler views. Jose suggested moving that up into the state tracker, and that is a better long-term solution. I'm almost done with a patch series that does that. I'll likely post it tomorrow. -Brian ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/6] st/mesa: fix sampler view context mismatch issue
Brian, One question also for my own education inline below: On Friday, 8 March 2019 23:52:09 CET Brian Paul wrote: > After a while of running google-chrome and viewing Bing maps, we'd see > "context mismatch in svga_sampler_view_destroy()" messages and > eventually crash because we leaked too many sampler views (the kernel > module would have too many sampler views). > > When a texture object is being deleted, we call > st_texture_release_all_sampler_views() to release all the sampler > views. In the list, there may sampler views which were created from > other contexts. > > Previously, we called pipe_sampler_view_release(pipe, view) which would > use the given pipe context to destroy the view if the refcount hit > zero. The svga error was triggered because we were calling > pipe->sampler_view_destroy(pipe, view) and the pipe context didn't > match the view's parent context. > > Instead, call pipe_sampler_reference(&view, NULL). That way, if > the refcount hits zero, we'll use the view's parent context to > destroy the view. That's what we want. That sounds plausible so far. What I wonder, is calling into an 'arbitrary' different context and do work there thread safe for any gallium driver? Especially since pipe_sampler_view_reference in its documentation says that the views provided need to originate from the same context and that this must be the 'current'. I have been taking a quick look into iris and radeonsi and both seem to be safe in terms of threads for dereferencing the views finally. But either the documentation of pipe_sampler_view_reference should be updated or I may miss an other piece of the puzzle to see that it is still save to do so. Can you tell me? thanks Mathias > > The pipe_sampler_view_release() function was introduced years ago to > avoid freeing a sampler view with a context that was already deleted. > > But since then we've improved sampler view and context tracking. > When a context is destroyed, the state tracker walks over all > texture objects and frees all sampler views which belong to that > context. So, we should never end up deleting a sampler view after > its context is deleted. > > After this, we can remove all calls to pipe_sampler_view_release() > in the drivers. > > Finally, it appears that we need to implement a similar tear-down > mechanism for shaders and programs since we may generate per-context > shader variants. > > Testing done: google chrome, misc GL demos, games > --- > src/mesa/state_tracker/st_context.c | 3 +-- > src/mesa/state_tracker/st_sampler_view.c | 8 > 2 files changed, 5 insertions(+), 6 deletions(-) > > diff --git a/src/mesa/state_tracker/st_context.c > b/src/mesa/state_tracker/st_context.c > index 2898279..a7464fd 100644 > --- a/src/mesa/state_tracker/st_context.c > +++ b/src/mesa/state_tracker/st_context.c > @@ -278,8 +278,7 @@ st_destroy_context_priv(struct st_context *st, bool > destroy_pipe) > st_destroy_bound_image_handles(st); > > for (i = 0; i < ARRAY_SIZE(st->state.frag_sampler_views); i++) { > - pipe_sampler_view_release(st->pipe, > -&st->state.frag_sampler_views[i]); > + pipe_sampler_view_reference(&st->state.frag_sampler_views[i], NULL); > } > > /* free glReadPixels cache data */ > diff --git a/src/mesa/state_tracker/st_sampler_view.c > b/src/mesa/state_tracker/st_sampler_view.c > index e4eaf39..650a2b0 100644 > --- a/src/mesa/state_tracker/st_sampler_view.c > +++ b/src/mesa/state_tracker/st_sampler_view.c > @@ -74,7 +74,7 @@ st_texture_set_sampler_view(struct st_context *st, >if (sv->view) { > /* check if the context matches */ > if (sv->view->context == st->pipe) { > -pipe_sampler_view_release(st->pipe, &sv->view); > +pipe_sampler_view_reference(&sv->view, NULL); > goto found; > } >} else { > @@ -94,13 +94,13 @@ st_texture_set_sampler_view(struct st_context *st, > > if (new_max < views->max || > new_max > (UINT_MAX - sizeof(*views)) / > sizeof(views->views[0])) { > -pipe_sampler_view_release(st->pipe, &view); > +pipe_sampler_view_reference(&view, NULL); > goto out; > } > > struct st_sampler_views *new_views = malloc(new_size); > if (!new_views) { > -pipe_sampler_view_release(st->pipe, &view); > +pipe_sampler_view_reference(&view, NULL); > goto out; > } > > @@ -225,7 +225,7 @@ st_texture_release_all_sampler_views(struct st_context > *st, > simple_mtx_lock(&stObj->validate_mutex); > struct st_sampler_views *views = stObj->sampler_views; > for (i = 0; i < views->count; ++i) > - pipe_sampler_view_release(st->pipe, &views->views[i].view); > + pipe_sampler_view_reference(&views->views[i].view, NULL); > simple_mtx_unlock(&stObj->validate_mutex); > } > > ___
[Mesa-dev] [PATCH 1/6] st/mesa: fix sampler view context mismatch issue
After a while of running google-chrome and viewing Bing maps, we'd see "context mismatch in svga_sampler_view_destroy()" messages and eventually crash because we leaked too many sampler views (the kernel module would have too many sampler views). When a texture object is being deleted, we call st_texture_release_all_sampler_views() to release all the sampler views. In the list, there may sampler views which were created from other contexts. Previously, we called pipe_sampler_view_release(pipe, view) which would use the given pipe context to destroy the view if the refcount hit zero. The svga error was triggered because we were calling pipe->sampler_view_destroy(pipe, view) and the pipe context didn't match the view's parent context. Instead, call pipe_sampler_reference(&view, NULL). That way, if the refcount hits zero, we'll use the view's parent context to destroy the view. That's what we want. The pipe_sampler_view_release() function was introduced years ago to avoid freeing a sampler view with a context that was already deleted. But since then we've improved sampler view and context tracking. When a context is destroyed, the state tracker walks over all texture objects and frees all sampler views which belong to that context. So, we should never end up deleting a sampler view after its context is deleted. After this, we can remove all calls to pipe_sampler_view_release() in the drivers. Finally, it appears that we need to implement a similar tear-down mechanism for shaders and programs since we may generate per-context shader variants. Testing done: google chrome, misc GL demos, games --- src/mesa/state_tracker/st_context.c | 3 +-- src/mesa/state_tracker/st_sampler_view.c | 8 2 files changed, 5 insertions(+), 6 deletions(-) diff --git a/src/mesa/state_tracker/st_context.c b/src/mesa/state_tracker/st_context.c index 2898279..a7464fd 100644 --- a/src/mesa/state_tracker/st_context.c +++ b/src/mesa/state_tracker/st_context.c @@ -278,8 +278,7 @@ st_destroy_context_priv(struct st_context *st, bool destroy_pipe) st_destroy_bound_image_handles(st); for (i = 0; i < ARRAY_SIZE(st->state.frag_sampler_views); i++) { - pipe_sampler_view_release(st->pipe, -&st->state.frag_sampler_views[i]); + pipe_sampler_view_reference(&st->state.frag_sampler_views[i], NULL); } /* free glReadPixels cache data */ diff --git a/src/mesa/state_tracker/st_sampler_view.c b/src/mesa/state_tracker/st_sampler_view.c index e4eaf39..650a2b0 100644 --- a/src/mesa/state_tracker/st_sampler_view.c +++ b/src/mesa/state_tracker/st_sampler_view.c @@ -74,7 +74,7 @@ st_texture_set_sampler_view(struct st_context *st, if (sv->view) { /* check if the context matches */ if (sv->view->context == st->pipe) { -pipe_sampler_view_release(st->pipe, &sv->view); +pipe_sampler_view_reference(&sv->view, NULL); goto found; } } else { @@ -94,13 +94,13 @@ st_texture_set_sampler_view(struct st_context *st, if (new_max < views->max || new_max > (UINT_MAX - sizeof(*views)) / sizeof(views->views[0])) { -pipe_sampler_view_release(st->pipe, &view); +pipe_sampler_view_reference(&view, NULL); goto out; } struct st_sampler_views *new_views = malloc(new_size); if (!new_views) { -pipe_sampler_view_release(st->pipe, &view); +pipe_sampler_view_reference(&view, NULL); goto out; } @@ -225,7 +225,7 @@ st_texture_release_all_sampler_views(struct st_context *st, simple_mtx_lock(&stObj->validate_mutex); struct st_sampler_views *views = stObj->sampler_views; for (i = 0; i < views->count; ++i) - pipe_sampler_view_release(st->pipe, &views->views[i].view); + pipe_sampler_view_reference(&views->views[i].view, NULL); simple_mtx_unlock(&stObj->validate_mutex); } -- 1.8.5.6 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev