Re: [PATCH weston 19/68] compositor-drm: Refcount drm_fb
On Wed, 1 Mar 2017 11:17:14 + Daniel Stonewrote: > Hi, > > On 1 March 2017 at 10:29, Pekka Paalanen wrote: > > On Tue, 28 Feb 2017 14:46:21 + Daniel Stone > > wrote: > >> On 28 February 2017 at 10:59, Pekka Paalanen wrote: > >> > The important thing to document about this change is that is exchanges > >> > a (almost never hit) double-unref into a (always hit) leak of the > >> > drm_fb. It would be even better to mention what patch shall fix the > >> > introduced leak. > >> > >> I ... don't think it does leak? At least not if it gets successfully > >> displayed. > > > > Hmm, yes, on a second look after sleeping several nights, I see calls > > to drm_fb_get_from_bo() are pretty much balanced with drm_fb_unref. > > > > But because this patch adds refcounting, and adds only calls to > > drm_fb_ref() without adding corresponding calls to unref, then I think > > the code just before this patch was somehow broken, because this patch > > as is definitely changes behaviour. > > Sure, I can move the drm_fb_ref() addition into a follow-up patch if > you'd prefer. Let me know. Hi, the thing I care highly about is that the commit message explains all the consequences of a patch. Or at least all the consequences the author and reviewers can think of, particularly the subtle ones. Which patch adds the drm_fb_ref() is secondary to that in my opinion. I believe that is important for the atomic series in particular, because the series is very long, and we'd prefer landing it piece by piece once reviews agree. That means upstream master will remain at intermediate steps for some time. Some more words in the commit message should be fine, I think, to show that the detail has been acknowledged. A counter example is the stray timer_arm() in the global output repaint patch. :-) In general, anything to explain odd things to avoid reviewers getting confused would be nice, IMHO. ;-) > >> Every drm_fb_get_from_bo() whilst the BO is live, sure. But these are > >> balanced by drm_fb_unref (formerly drm_output_release_fb) calls in, > >> say, vblank_handler. So I can't really see a leak, and even if there > >> were, planes are disabled by default. Maybe scanout, but that's > >> trivially broken right now anyway, so not the biggest of regressions > >> as far as I'm concerned, if it even is one - and I can't quite see it. > > > > Scanout is broken? Since when? > > > > I used it successfully when reproducing > > https://bugs.freedesktop.org/show_bug.cgi?id=98833 or is that exactly > > the breakage you refer to? > > ISTR the problem was that, if you repeatedly hit the repaint loop (due > to damage, even if everything was occluded), the buffer currently on > scanout could get released too early? It's hard to remember though; so > many bugs, and this series has been going such a long time. Which release do you mean? If we were to destroy the drm_fb too early, wouldn't that lead to CRTC being turned off, which would essentially kill the output completely, since we don't know to mode-set again? Or can page flip turn it on? If we were to send wl_buffer.release too early, then the client might drawn into it too early... which actually sounds like it might be relevant to fdo#98833. Anyway, that's not too important here. Thanks, pq pgplTKATacqct.pgp Description: OpenPGP digital signature ___ wayland-devel mailing list wayland-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/wayland-devel
Re: [PATCH weston 19/68] compositor-drm: Refcount drm_fb
Hi, On 1 March 2017 at 10:29, Pekka Paalanenwrote: > On Tue, 28 Feb 2017 14:46:21 + Daniel Stone wrote: >> On 28 February 2017 at 10:59, Pekka Paalanen wrote: >> > The important thing to document about this change is that is exchanges >> > a (almost never hit) double-unref into a (always hit) leak of the >> > drm_fb. It would be even better to mention what patch shall fix the >> > introduced leak. >> >> I ... don't think it does leak? At least not if it gets successfully >> displayed. > > Hmm, yes, on a second look after sleeping several nights, I see calls > to drm_fb_get_from_bo() are pretty much balanced with drm_fb_unref. > > But because this patch adds refcounting, and adds only calls to > drm_fb_ref() without adding corresponding calls to unref, then I think > the code just before this patch was somehow broken, because this patch > as is definitely changes behaviour. Sure, I can move the drm_fb_ref() addition into a follow-up patch if you'd prefer. Let me know. >> Every drm_fb_get_from_bo() whilst the BO is live, sure. But these are >> balanced by drm_fb_unref (formerly drm_output_release_fb) calls in, >> say, vblank_handler. So I can't really see a leak, and even if there >> were, planes are disabled by default. Maybe scanout, but that's >> trivially broken right now anyway, so not the biggest of regressions >> as far as I'm concerned, if it even is one - and I can't quite see it. > > Scanout is broken? Since when? > > I used it successfully when reproducing > https://bugs.freedesktop.org/show_bug.cgi?id=98833 or is that exactly > the breakage you refer to? ISTR the problem was that, if you repeatedly hit the repaint loop (due to damage, even if everything was occluded), the buffer currently on scanout could get released too early? It's hard to remember though; so many bugs, and this series has been going such a long time. >> Other libweston users can and do use multiple views ... > > Then they should have been broken to begin with, no? As you said, > planes are disabled by default. Indeed they have been! Cheers, Daniel ___ wayland-devel mailing list wayland-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/wayland-devel
Re: [PATCH weston 19/68] compositor-drm: Refcount drm_fb
On Tue, 28 Feb 2017 14:46:21 + Daniel Stonewrote: > Hi, > > On 28 February 2017 at 10:59, Pekka Paalanen wrote: > > On Mon, 27 Feb 2017 22:58:48 + Daniel Stone > > wrote: > >> It's indeed almost certainly a bugfix: if you have one client buffer > >> with multiple views, both of which could be promoted to scanout, > >> you'll get the same user data (drm_fb) handed back for the same BO, > >> and could attempt to destroy it twice. Previously you'd probably > >> mostly luck out due to the memory not yet being reused, but now you'd > >> be near-guaranteed to hit the assert in drm_fb_unref, which seemed > >> harsh. So I squashed it in here. > >> > >> drm_fb_ref doesn't exist before this patch, so the split would have to > >> be this in a later patch. Would that work for you? I don't mind either > >> way. > > > > I think I'd be fine if those points were explained in the commit > > message. To me, it's important to justify every hunk in a patch if it > > somehow stands out as not being part of the main topic. And this one > > caugh my eye as an unexplained change. > > Fair enough. I think the change from unref to actually performing an > unref makes it more or less balancing, but sure. > > > The important thing to document about this change is that is exchanges > > a (almost never hit) double-unref into a (always hit) leak of the > > drm_fb. It would be even better to mention what patch shall fix the > > introduced leak. > > I ... don't think it does leak? At least not if it gets successfully > displayed. Hmm, yes, on a second look after sleeping several nights, I see calls to drm_fb_get_from_bo() are pretty much balanced with drm_fb_unref. But because this patch adds refcounting, and adds only calls to drm_fb_ref() without adding corresponding calls to unref, then I think the code just before this patch was somehow broken, because this patch as is definitely changes behaviour. Was it so that before this patch adding the refcounting, the drm_fbs were destroyed and re-created repeatedly, and it no longer happens? I'm trying to find the explanation on the change of behaviour caused by this patch, because I cannot understand how adding a ref but not adding unref could not change the behaviour. > > To clarify, the leak I am imagining is: drm_fb gets created with > > refcount=1, then every drm_fb_get_from_bo() will increment the > > refcount, but according to the old not-refcounted architecture, there > > is only ever a single call to unref, which means it will never be > > released. And that would apply to *all* drm_fbs ever used. Am I wrong? > > > > Seems like a fairly big temporary regression compared to the > > double-free which can never happen in Weston since we don't use > > multiple views (yet). > > Every drm_fb_get_from_bo() whilst the BO is live, sure. But these are > balanced by drm_fb_unref (formerly drm_output_release_fb) calls in, > say, vblank_handler. So I can't really see a leak, and even if there > were, planes are disabled by default. Maybe scanout, but that's > trivially broken right now anyway, so not the biggest of regressions > as far as I'm concerned, if it even is one - and I can't quite see it. Scanout is broken? Since when? I used it successfully when reproducing https://bugs.freedesktop.org/show_bug.cgi?id=98833 or is that exactly the breakage you refer to? > Other libweston users can and do use multiple views ... Then they should have been broken to begin with, no? As you said, planes are disabled by default. Thanks, pq pgpidiirhxRro.pgp Description: OpenPGP digital signature ___ wayland-devel mailing list wayland-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/wayland-devel
Re: [PATCH weston 19/68] compositor-drm: Refcount drm_fb
Hi, On 28 February 2017 at 10:59, Pekka Paalanenwrote: > On Mon, 27 Feb 2017 22:58:48 + Daniel Stone wrote: >> It's indeed almost certainly a bugfix: if you have one client buffer >> with multiple views, both of which could be promoted to scanout, >> you'll get the same user data (drm_fb) handed back for the same BO, >> and could attempt to destroy it twice. Previously you'd probably >> mostly luck out due to the memory not yet being reused, but now you'd >> be near-guaranteed to hit the assert in drm_fb_unref, which seemed >> harsh. So I squashed it in here. >> >> drm_fb_ref doesn't exist before this patch, so the split would have to >> be this in a later patch. Would that work for you? I don't mind either >> way. > > I think I'd be fine if those points were explained in the commit > message. To me, it's important to justify every hunk in a patch if it > somehow stands out as not being part of the main topic. And this one > caugh my eye as an unexplained change. Fair enough. I think the change from unref to actually performing an unref makes it more or less balancing, but sure. > The important thing to document about this change is that is exchanges > a (almost never hit) double-unref into a (always hit) leak of the > drm_fb. It would be even better to mention what patch shall fix the > introduced leak. I ... don't think it does leak? At least not if it gets successfully displayed. > To clarify, the leak I am imagining is: drm_fb gets created with > refcount=1, then every drm_fb_get_from_bo() will increment the > refcount, but according to the old not-refcounted architecture, there > is only ever a single call to unref, which means it will never be > released. And that would apply to *all* drm_fbs ever used. Am I wrong? > > Seems like a fairly big temporary regression compared to the > double-free which can never happen in Weston since we don't use > multiple views (yet). Every drm_fb_get_from_bo() whilst the BO is live, sure. But these are balanced by drm_fb_unref (formerly drm_output_release_fb) calls in, say, vblank_handler. So I can't really see a leak, and even if there were, planes are disabled by default. Maybe scanout, but that's trivially broken right now anyway, so not the biggest of regressions as far as I'm concerned, if it even is one - and I can't quite see it. Other libweston users can and do use multiple views ... Cheers, Daniel ___ wayland-devel mailing list wayland-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/wayland-devel
Re: [PATCH weston 19/68] compositor-drm: Refcount drm_fb
On Mon, 27 Feb 2017 22:58:48 + Daniel Stonewrote: > Hi Pekka, > > On 21 February 2017 at 15:19, Pekka Paalanen wrote: > > On Fri, 9 Dec 2016 19:57:34 + Daniel Stone > > wrote: > >> @@ -312,6 +316,7 @@ drm_fb_create_dumb(struct drm_backend *b, int width, > >> int height, > >> if (!fb->format->depth || !fb->format->bpp) { > >> weston_log("format 0x%lx is not compatible with dumb > >> buffers\n", > >> (unsigned long) format); > >> + goto err_fb; > > > > this hunk belongs in an earlier patch. > > As you already pointed out whilst reviewing that patch. :) > > >> @@ -392,13 +404,14 @@ drm_fb_get_from_bo(struct gbm_bo *bo, struct > >> drm_backend *backend, > >> int ret; > >> > >> if (fb) > >> - return fb; > >> + return drm_fb_ref(fb); > > > > This path is now different from before, and requires adding an > > equivalent drm_fb_unref() call somewhere, but I don't see it. > > Previously unref would have released it immediately, now the first call > > will be a no-op. > > > > Or, if this is a bug fix, it requires an explanation in the commit > > message how in some cases drm_fb_unref() got called twice. > > > > Maybe this hunk belongs in a different patch instead? > > It's indeed almost certainly a bugfix: if you have one client buffer > with multiple views, both of which could be promoted to scanout, > you'll get the same user data (drm_fb) handed back for the same BO, > and could attempt to destroy it twice. Previously you'd probably > mostly luck out due to the memory not yet being reused, but now you'd > be near-guaranteed to hit the assert in drm_fb_unref, which seemed > harsh. So I squashed it in here. > > drm_fb_ref doesn't exist before this patch, so the split would have to > be this in a later patch. Would that work for you? I don't mind either > way. I think I'd be fine if those points were explained in the commit message. To me, it's important to justify every hunk in a patch if it somehow stands out as not being part of the main topic. And this one caugh my eye as an unexplained change. The important thing to document about this change is that is exchanges a (almost never hit) double-unref into a (always hit) leak of the drm_fb. It would be even better to mention what patch shall fix the introduced leak. To clarify, the leak I am imagining is: drm_fb gets created with refcount=1, then every drm_fb_get_from_bo() will increment the refcount, but according to the old not-refcounted architecture, there is only ever a single call to unref, which means it will never be released. And that would apply to *all* drm_fbs ever used. Am I wrong? Seems like a fairly big temporary regression compared to the double-free which can never happen in Weston since we don't use multiple views (yet). Thanks, pq > >> @@ -472,6 +485,10 @@ drm_fb_unref(struct drm_fb *fb) > >> if (!fb) > >> return; > >> > >> + assert(fb->refcnt > 0); > >> + if (--fb->refcnt > 0) > >> + return; > >> + > >> switch (fb->type) { > >> case BUFFER_PIXMAN_DUMB: > >> /* nothing: pixman buffers are destroyed manually */ > > > > It took a while to see the paths: > > > > drm_fb_unref -> gbm_bo_destroy -> drm_fb_destroy_gbm > > > > drm_fb_unref -> gbm_surface_release_buffer > > gbm_surface_destroy -> drm_fb_destroy_gbm > > > > drm_output_fini_pixman -> drm_fb_destroy_dumb > > > > The goal is to solidify drm_fb_unref() as the main entry point for > > destruction, but it's not quite there yet. Very good. > > Slowly but surely ... > > > If you move both of the hunks I pointed out into other patches, then > > this patch gets: > > Reviewed-by: Pekka Paalanen > > If that's not the right thing to do, I'll review the next revision. > > I'll leave you to review the next revision and/or just continue this thread > on. > > Cheers, > Daniel pgpoohd9iZruU.pgp Description: OpenPGP digital signature ___ wayland-devel mailing list wayland-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/wayland-devel
Re: [PATCH weston 19/68] compositor-drm: Refcount drm_fb
Hi Pekka, On 21 February 2017 at 15:19, Pekka Paalanenwrote: > On Fri, 9 Dec 2016 19:57:34 + Daniel Stone wrote: >> @@ -312,6 +316,7 @@ drm_fb_create_dumb(struct drm_backend *b, int width, int >> height, >> if (!fb->format->depth || !fb->format->bpp) { >> weston_log("format 0x%lx is not compatible with dumb >> buffers\n", >> (unsigned long) format); >> + goto err_fb; > > this hunk belongs in an earlier patch. As you already pointed out whilst reviewing that patch. :) >> @@ -392,13 +404,14 @@ drm_fb_get_from_bo(struct gbm_bo *bo, struct >> drm_backend *backend, >> int ret; >> >> if (fb) >> - return fb; >> + return drm_fb_ref(fb); > > This path is now different from before, and requires adding an > equivalent drm_fb_unref() call somewhere, but I don't see it. > Previously unref would have released it immediately, now the first call > will be a no-op. > > Or, if this is a bug fix, it requires an explanation in the commit > message how in some cases drm_fb_unref() got called twice. > > Maybe this hunk belongs in a different patch instead? It's indeed almost certainly a bugfix: if you have one client buffer with multiple views, both of which could be promoted to scanout, you'll get the same user data (drm_fb) handed back for the same BO, and could attempt to destroy it twice. Previously you'd probably mostly luck out due to the memory not yet being reused, but now you'd be near-guaranteed to hit the assert in drm_fb_unref, which seemed harsh. So I squashed it in here. drm_fb_ref doesn't exist before this patch, so the split would have to be this in a later patch. Would that work for you? I don't mind either way. >> @@ -472,6 +485,10 @@ drm_fb_unref(struct drm_fb *fb) >> if (!fb) >> return; >> >> + assert(fb->refcnt > 0); >> + if (--fb->refcnt > 0) >> + return; >> + >> switch (fb->type) { >> case BUFFER_PIXMAN_DUMB: >> /* nothing: pixman buffers are destroyed manually */ > > It took a while to see the paths: > > drm_fb_unref -> gbm_bo_destroy -> drm_fb_destroy_gbm > > drm_fb_unref -> gbm_surface_release_buffer > gbm_surface_destroy -> drm_fb_destroy_gbm > > drm_output_fini_pixman -> drm_fb_destroy_dumb > > The goal is to solidify drm_fb_unref() as the main entry point for > destruction, but it's not quite there yet. Very good. Slowly but surely ... > If you move both of the hunks I pointed out into other patches, then > this patch gets: > Reviewed-by: Pekka Paalanen > If that's not the right thing to do, I'll review the next revision. I'll leave you to review the next revision and/or just continue this thread on. Cheers, Daniel ___ wayland-devel mailing list wayland-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/wayland-devel
Re: [PATCH weston 19/68] compositor-drm: Refcount drm_fb
On Fri, 9 Dec 2016 19:57:34 + Daniel Stonewrote: > Sometimes we need to duplicate an existing drm_fb, e.g. when > pageflipping to the same buffer to kickstart the repaint loop. To handle > situations like these, and simplify resource management for dumb and > cursor buffers, refcount drm_fb. > > Differential Revision: https://phabricator.freedesktop.org/D1491 > > Signed-off-by: Daniel Stone > --- > libweston/compositor-drm.c | 19 ++- > 1 file changed, 18 insertions(+), 1 deletion(-) > > diff --git a/libweston/compositor-drm.c b/libweston/compositor-drm.c > index eb735b2..a684ac9 100644 > --- a/libweston/compositor-drm.c > +++ b/libweston/compositor-drm.c > @@ -140,6 +140,8 @@ enum drm_fb_type { > struct drm_fb { > enum drm_fb_type type; > > + int refcnt; > + > uint32_t fb_id, stride, handle, size; > const struct pixel_format_info *format; > int width, height; > @@ -302,6 +304,8 @@ drm_fb_create_dumb(struct drm_backend *b, int width, int > height, > if (!fb) > return NULL; > > + fb->refcnt = 1; > + > fb->format = pixel_format_get_info(format); > if (!fb->format) { > weston_log("failed to look up format 0x%lx\n", > @@ -312,6 +316,7 @@ drm_fb_create_dumb(struct drm_backend *b, int width, int > height, > if (!fb->format->depth || !fb->format->bpp) { > weston_log("format 0x%lx is not compatible with dumb buffers\n", > (unsigned long) format); > + goto err_fb; Hi, this hunk belongs in an earlier patch. > } > > memset(_arg, 0, sizeof create_arg); > @@ -384,6 +389,13 @@ err_fb: > } > > static struct drm_fb * > +drm_fb_ref(struct drm_fb *fb) > +{ > + fb->refcnt++; > + return fb; > +} > + > +static struct drm_fb * > drm_fb_get_from_bo(struct gbm_bo *bo, struct drm_backend *backend, > uint32_t format, enum drm_fb_type type) > { > @@ -392,13 +404,14 @@ drm_fb_get_from_bo(struct gbm_bo *bo, struct > drm_backend *backend, > int ret; > > if (fb) > - return fb; > + return drm_fb_ref(fb); This path is now different from before, and requires adding an equivalent drm_fb_unref() call somewhere, but I don't see it. Previously unref would have released it immediately, now the first call will be a no-op. Or, if this is a bug fix, it requires an explanation in the commit message how in some cases drm_fb_unref() got called twice. Maybe this hunk belongs in a different patch instead? > > fb = zalloc(sizeof *fb); > if (fb == NULL) > return NULL; > > fb->type = type; > + fb->refcnt = 1; > fb->bo = bo; > > fb->width = gbm_bo_get_width(bo); > @@ -472,6 +485,10 @@ drm_fb_unref(struct drm_fb *fb) > if (!fb) > return; > > + assert(fb->refcnt > 0); > + if (--fb->refcnt > 0) > + return; > + > switch (fb->type) { > case BUFFER_PIXMAN_DUMB: > /* nothing: pixman buffers are destroyed manually */ It took a while to see the paths: drm_fb_unref -> gbm_bo_destroy -> drm_fb_destroy_gbm drm_fb_unref -> gbm_surface_release_buffer gbm_surface_destroy -> drm_fb_destroy_gbm drm_output_fini_pixman -> drm_fb_destroy_dumb The goal is to solidify drm_fb_unref() as the main entry point for destruction, but it's not quite there yet. Very good. If you move both of the hunks I pointed out into other patches, then this patch gets: Reviewed-by: Pekka Paalanen If that's not the right thing to do, I'll review the next revision. Thanks, pq pgpXCTX6RiQSF.pgp Description: OpenPGP digital signature ___ wayland-devel mailing list wayland-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/wayland-devel
[PATCH weston 19/68] compositor-drm: Refcount drm_fb
Sometimes we need to duplicate an existing drm_fb, e.g. when pageflipping to the same buffer to kickstart the repaint loop. To handle situations like these, and simplify resource management for dumb and cursor buffers, refcount drm_fb. Differential Revision: https://phabricator.freedesktop.org/D1491 Signed-off-by: Daniel Stone--- libweston/compositor-drm.c | 19 ++- 1 file changed, 18 insertions(+), 1 deletion(-) diff --git a/libweston/compositor-drm.c b/libweston/compositor-drm.c index eb735b2..a684ac9 100644 --- a/libweston/compositor-drm.c +++ b/libweston/compositor-drm.c @@ -140,6 +140,8 @@ enum drm_fb_type { struct drm_fb { enum drm_fb_type type; + int refcnt; + uint32_t fb_id, stride, handle, size; const struct pixel_format_info *format; int width, height; @@ -302,6 +304,8 @@ drm_fb_create_dumb(struct drm_backend *b, int width, int height, if (!fb) return NULL; + fb->refcnt = 1; + fb->format = pixel_format_get_info(format); if (!fb->format) { weston_log("failed to look up format 0x%lx\n", @@ -312,6 +316,7 @@ drm_fb_create_dumb(struct drm_backend *b, int width, int height, if (!fb->format->depth || !fb->format->bpp) { weston_log("format 0x%lx is not compatible with dumb buffers\n", (unsigned long) format); + goto err_fb; } memset(_arg, 0, sizeof create_arg); @@ -384,6 +389,13 @@ err_fb: } static struct drm_fb * +drm_fb_ref(struct drm_fb *fb) +{ + fb->refcnt++; + return fb; +} + +static struct drm_fb * drm_fb_get_from_bo(struct gbm_bo *bo, struct drm_backend *backend, uint32_t format, enum drm_fb_type type) { @@ -392,13 +404,14 @@ drm_fb_get_from_bo(struct gbm_bo *bo, struct drm_backend *backend, int ret; if (fb) - return fb; + return drm_fb_ref(fb); fb = zalloc(sizeof *fb); if (fb == NULL) return NULL; fb->type = type; + fb->refcnt = 1; fb->bo = bo; fb->width = gbm_bo_get_width(bo); @@ -472,6 +485,10 @@ drm_fb_unref(struct drm_fb *fb) if (!fb) return; + assert(fb->refcnt > 0); + if (--fb->refcnt > 0) + return; + switch (fb->type) { case BUFFER_PIXMAN_DUMB: /* nothing: pixman buffers are destroyed manually */ -- 2.9.3 ___ wayland-devel mailing list wayland-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/wayland-devel