Re: [PATCH v7 4/4] drm/vc4: Allocate binner bo when starting to use the V3D

2019-05-02 Thread Eric Anholt
Paul Kocialkowski  writes:

> Hi,
>
> On Thu, 2019-04-25 at 10:42 -0700, Eric Anholt wrote:
>> Paul Kocialkowski  writes:
>> 
>> > The binner BO is not required until the V3D is in use, so avoid
>> > allocating it at probe and do it on the first non-dumb BO allocation.
>> > 
>> > Keep track of which clients are using the V3D and liberate the buffer
>> > when there is none left, using a kref. Protect the logic with a
>> > mutex to avoid race conditions.
>> > 
>> > The binner BO is created at the time of the first render ioctl and is
>> > destroyed when there is no client and no exec job using it left.
>> > 
>> > The Out-Of-Memory (OOM) interrupt also gets some tweaking, to avoid
>> > enabling it before having allocated a binner bo.
>> > 
>> > We also want to keep the BO alive during runtime suspend/resume to avoid
>> > failing to allocate it at resume. This happens when the CMA pool is
>> > full at that point and results in a hard crash.
>> > 
>> > Signed-off-by: Paul Kocialkowski 
>> > ---
>> >  drivers/gpu/drm/vc4/vc4_bo.c  | 33 +++-
>> >  drivers/gpu/drm/vc4/vc4_drv.c |  6 
>> >  drivers/gpu/drm/vc4/vc4_drv.h | 14 +
>> >  drivers/gpu/drm/vc4/vc4_gem.c | 13 
>> >  drivers/gpu/drm/vc4/vc4_irq.c | 21 +
>> >  drivers/gpu/drm/vc4/vc4_v3d.c | 58 +++
>> >  6 files changed, 125 insertions(+), 20 deletions(-)
>> > 
>> > diff --git a/drivers/gpu/drm/vc4/vc4_bo.c b/drivers/gpu/drm/vc4/vc4_bo.c
>> > index 88ebd681d7eb..2b3ec5926fe2 100644
>> > --- a/drivers/gpu/drm/vc4/vc4_bo.c
>> > +++ b/drivers/gpu/drm/vc4/vc4_bo.c
>> > @@ -799,13 +799,38 @@ vc4_prime_import_sg_table(struct drm_device *dev,
>> >return obj;
>> >  }
>> >  
>> > +static int vc4_grab_bin_bo(struct vc4_dev *vc4, struct vc4_file *vc4file)
>> > +{
>> > +  int ret;
>> > +
>> > +  if (!vc4->v3d)
>> > +  return -ENODEV;
>> > +
>> > +  if (vc4file->bin_bo_used)
>> > +  return 0;
>> > +
>> > +  ret = vc4_v3d_bin_bo_get(vc4);
>> > +  if (ret)
>> > +  return ret;
>> > +
>> > +  vc4file->bin_bo_used = true;
>> 
>> I think I found one last race.  Multiple threads could be in an ioctl
>> trying to grab the bin BO at the same time (while this is only during
>> app startup, since the fd only needs to get the ref once, it's
>> particularly plausible given that allocating the bin BO is slow).  I
>> think if you replace this line with:
>> 
>> 	mutex_lock(&vc4->bin_bo_lock);
>> 	if (vc4file->bin_bo_used) {
>> 		mutex_unlock(&vc4->bin_bo_lock);
>> 		vc4_v3d_bin_bo_put(vc4);
>> 	} else {
>> 		vc4file->bin_bo_used = true;
>> 		mutex_unlock(&vc4->bin_bo_lock);
>> 	}
>
> Huh, very good catch once again, thanks! It took me some time to grasp
> this one, but as far as I understand, the risk is that we could ref our
> bin bo twice (although it would only be allocated once) since
> bin_bo_used is not protected.
>
> I'd like to suggest another solution, which would avoid re-locking and
> doing an extra put if we got an extra ref: adding a "bool *used"
> argument to vc4_v3d_bin_bo_get, which only gets dereferenced with
> the bin_bo lock held. Then we can skip obtaining a new reference if
> (used && *used) in vc4_v3d_bin_bo_get.
>
> So we could pass a pointer to vc4file->bin_bo_used for vc4_grab_bin_bo
> and exec->bin_bo_used for the exec case (where there is no such issue
> since we'll only ever try to _get the bin bo once there anyway).
>
> What do you think?

I like it!


signature.asc
Description: PGP signature
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

Re: [PATCH v7 4/4] drm/vc4: Allocate binner bo when starting to use the V3D

2019-05-02 Thread Paul Kocialkowski
Hi,

On Thu, 2019-04-25 at 10:42 -0700, Eric Anholt wrote:
> Paul Kocialkowski  writes:
> 
> > The binner BO is not required until the V3D is in use, so avoid
> > allocating it at probe and do it on the first non-dumb BO allocation.
> > 
> > Keep track of which clients are using the V3D and liberate the buffer
> > when there is none left, using a kref. Protect the logic with a
> > mutex to avoid race conditions.
> > 
> > The binner BO is created at the time of the first render ioctl and is
> > destroyed when there is no client and no exec job using it left.
> > 
> > The Out-Of-Memory (OOM) interrupt also gets some tweaking, to avoid
> > enabling it before having allocated a binner bo.
> > 
> > We also want to keep the BO alive during runtime suspend/resume to avoid
> > failing to allocate it at resume. This happens when the CMA pool is
> > full at that point and results in a hard crash.
> > 
> > Signed-off-by: Paul Kocialkowski 
> > ---
> >  drivers/gpu/drm/vc4/vc4_bo.c  | 33 +++-
> >  drivers/gpu/drm/vc4/vc4_drv.c |  6 
> >  drivers/gpu/drm/vc4/vc4_drv.h | 14 +
> >  drivers/gpu/drm/vc4/vc4_gem.c | 13 
> >  drivers/gpu/drm/vc4/vc4_irq.c | 21 +
> >  drivers/gpu/drm/vc4/vc4_v3d.c | 58 +++
> >  6 files changed, 125 insertions(+), 20 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/vc4/vc4_bo.c b/drivers/gpu/drm/vc4/vc4_bo.c
> > index 88ebd681d7eb..2b3ec5926fe2 100644
> > --- a/drivers/gpu/drm/vc4/vc4_bo.c
> > +++ b/drivers/gpu/drm/vc4/vc4_bo.c
> > @@ -799,13 +799,38 @@ vc4_prime_import_sg_table(struct drm_device *dev,
> > return obj;
> >  }
> >  
> > +static int vc4_grab_bin_bo(struct vc4_dev *vc4, struct vc4_file *vc4file)
> > +{
> > +   int ret;
> > +
> > +   if (!vc4->v3d)
> > +   return -ENODEV;
> > +
> > +   if (vc4file->bin_bo_used)
> > +   return 0;
> > +
> > +   ret = vc4_v3d_bin_bo_get(vc4);
> > +   if (ret)
> > +   return ret;
> > +
> > +   vc4file->bin_bo_used = true;
> 
> I think I found one last race.  Multiple threads could be in an ioctl
> trying to grab the bin BO at the same time (while this is only during
> app startup, since the fd only needs to get the ref once, it's
> particularly plausible given that allocating the bin BO is slow).  I
> think if you replace this line with:
> 
> 	mutex_lock(&vc4->bin_bo_lock);
> 	if (vc4file->bin_bo_used) {
> 		mutex_unlock(&vc4->bin_bo_lock);
> 		vc4_v3d_bin_bo_put(vc4);
> 	} else {
> 		vc4file->bin_bo_used = true;
> 		mutex_unlock(&vc4->bin_bo_lock);
> 	}

Huh, very good catch once again, thanks! It took me some time to grasp
this one, but as far as I understand, the risk is that we could ref our
bin bo twice (although it would only be allocated once) since
bin_bo_used is not protected.

I'd like to suggest another solution, which would avoid re-locking and
doing an extra put if we got an extra ref: adding a "bool *used"
argument to vc4_v3d_bin_bo_get, which only gets dereferenced with
the bin_bo lock held. Then we can skip obtaining a new reference if
(used && *used) in vc4_v3d_bin_bo_get.

So we could pass a pointer to vc4file->bin_bo_used for vc4_grab_bin_bo
and exec->bin_bo_used for the exec case (where there is no such issue
since we'll only ever try to _get the bin bo once there anyway).
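
The "bool *used" idea can be sketched in userspace C. This is only a
model under stated assumptions, not driver code: struct model_dev,
model_bin_bo_get() and model_bin_bo_put() are hypothetical names, a
pthread mutex stands in for bin_bo_lock, a plain counter stands in for
the kref, and malloc() stands in for the binner BO allocation.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>

#define MODEL_BIN_BO_SIZE 4096	/* arbitrary size, for the model only */

/* Hypothetical stand-in for the relevant vc4_dev fields. */
struct model_dev {
	pthread_mutex_t bin_bo_lock;	/* models vc4->bin_bo_lock */
	unsigned int bin_bo_refs;	/* models the kref count */
	void *bin_bo;			/* models the binner BO */
};

/*
 * Take a reference on the bin BO, allocating it on first use.
 * If @used is non-NULL and already true, the caller already holds its
 * reference and nothing is done. @used is only read and written with
 * bin_bo_lock held, so two racing callers cannot both take a ref.
 */
static int model_bin_bo_get(struct model_dev *dev, bool *used)
{
	int ret = 0;

	pthread_mutex_lock(&dev->bin_bo_lock);

	if (used && *used)
		goto unlock;

	if (dev->bin_bo_refs == 0) {
		dev->bin_bo = malloc(MODEL_BIN_BO_SIZE);
		if (!dev->bin_bo) {
			ret = -ENOMEM;
			goto unlock;
		}
	}

	dev->bin_bo_refs++;
	if (used)
		*used = true;

unlock:
	pthread_mutex_unlock(&dev->bin_bo_lock);
	return ret;
}

/* Drop a reference, freeing the bin BO when the last one goes away. */
static void model_bin_bo_put(struct model_dev *dev)
{
	pthread_mutex_lock(&dev->bin_bo_lock);
	assert(dev->bin_bo_refs > 0);
	if (--dev->bin_bo_refs == 0) {
		free(dev->bin_bo);
		dev->bin_bo = NULL;
	}
	pthread_mutex_unlock(&dev->bin_bo_lock);
}
```

In this model, the per-file caller would pass a pointer to its own flag
(the analogue of vc4file->bin_bo_used) and the exec path a pointer to
the analogue of exec->bin_bo_used; a second call with the same flag is
a no-op instead of an extra ref followed by a put.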

What do you think?

Cheers,

Paul

> that will be the last change we need.  If you agree with this, feel free
> to squash it in and apply the series with:
> 
> Reviewed-by: Eric Anholt 
> 
> > +
> > +   return 0;
> > +}
> > +
> >  int vc4_create_bo_ioctl(struct drm_device *dev, void *data,
> > struct drm_file *file_priv)
> >  {
> > struct drm_vc4_create_bo *args = data;
> > +   struct vc4_file *vc4file = file_priv->driver_priv;
> > +   struct vc4_dev *vc4 = to_vc4_dev(dev);
> > struct vc4_bo *bo = NULL;
> > int ret;
> >  
> > +   ret = vc4_grab_bin_bo(vc4, vc4file);
> > +   if (ret)
> > +   return ret;
> > +
> > /*
> >  * We can't allocate from the BO cache, because the BOs don't
> >  * get zeroed, and that might leak data between users.
> > @@ -846,6 +871,8 @@ vc4_create_shader_bo_ioctl(struct drm_device *dev, void *data,
> >struct drm_file *file_priv)
> >  {
> > struct drm_vc4_create_shader_bo *args = data;
> > +   struct vc4_file *vc4file = file_priv->driver_priv;
> > +   struct vc4_dev *vc4 = to_vc4_dev(dev);
> > struct vc4_bo *bo = NULL;
> > int ret;
> >  
> > @@ -865,6 +892,10 @@ vc4_create_shader_bo_ioctl(struct drm_device *dev, void *data,
> > return -EINVAL;
> > }
> >  
> > +   ret = vc4_grab_bin_bo(vc4, vc4file);
> > +   if (ret)
> > +   return ret;
> > +
> > bo = vc4_bo_create(dev, args->size, true, VC4_BO_TYPE_V3D_SHADER);
> > if (IS_ERR(bo))
> > return PTR_ERR(bo);
> > @@ -894,7 +925,7 @@ vc4_create_shader_bo_ioctl(struct drm_device *dev, 

Re: [PATCH v7 4/4] drm/vc4: Allocate binner bo when starting to use the V3D

2019-04-25 Thread Eric Anholt
Paul Kocialkowski  writes:

> The binner BO is not required until the V3D is in use, so avoid
> allocating it at probe and do it on the first non-dumb BO allocation.
>
> Keep track of which clients are using the V3D and liberate the buffer
> when there is none left, using a kref. Protect the logic with a
> mutex to avoid race conditions.
>
> The binner BO is created at the time of the first render ioctl and is
> destroyed when there is no client and no exec job using it left.
>
> The Out-Of-Memory (OOM) interrupt also gets some tweaking, to avoid
> enabling it before having allocated a binner bo.
>
> We also want to keep the BO alive during runtime suspend/resume to avoid
> failing to allocate it at resume. This happens when the CMA pool is
> full at that point and results in a hard crash.
>
> Signed-off-by: Paul Kocialkowski 
> ---
>  drivers/gpu/drm/vc4/vc4_bo.c  | 33 +++-
>  drivers/gpu/drm/vc4/vc4_drv.c |  6 
>  drivers/gpu/drm/vc4/vc4_drv.h | 14 +
>  drivers/gpu/drm/vc4/vc4_gem.c | 13 
>  drivers/gpu/drm/vc4/vc4_irq.c | 21 +
>  drivers/gpu/drm/vc4/vc4_v3d.c | 58 +++
>  6 files changed, 125 insertions(+), 20 deletions(-)
>
> diff --git a/drivers/gpu/drm/vc4/vc4_bo.c b/drivers/gpu/drm/vc4/vc4_bo.c
> index 88ebd681d7eb..2b3ec5926fe2 100644
> --- a/drivers/gpu/drm/vc4/vc4_bo.c
> +++ b/drivers/gpu/drm/vc4/vc4_bo.c
> @@ -799,13 +799,38 @@ vc4_prime_import_sg_table(struct drm_device *dev,
>   return obj;
>  }
>  
> +static int vc4_grab_bin_bo(struct vc4_dev *vc4, struct vc4_file *vc4file)
> +{
> + int ret;
> +
> + if (!vc4->v3d)
> + return -ENODEV;
> +
> + if (vc4file->bin_bo_used)
> + return 0;
> +
> + ret = vc4_v3d_bin_bo_get(vc4);
> + if (ret)
> + return ret;
> +


> + vc4file->bin_bo_used = true;

I think I found one last race.  Multiple threads could be in an ioctl
trying to grab the bin BO at the same time (while this is only during
app startup, since the fd only needs to get the ref once, it's
particularly plausible given that allocating the bin BO is slow).  I
think if you replace this line with:

	mutex_lock(&vc4->bin_bo_lock);
	if (vc4file->bin_bo_used) {
		mutex_unlock(&vc4->bin_bo_lock);
		vc4_v3d_bin_bo_put(vc4);
	} else {
		vc4file->bin_bo_used = true;
		mutex_unlock(&vc4->bin_bo_lock);
	}

that will be the last change we need.  If you agree with this, feel free
to squash it in and apply the series with:

Reviewed-by: Eric Anholt 
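
To make the race and the suggested re-check concrete, here is a minimal
userspace sketch. It is a model under assumptions, not the driver code:
all model_* names are hypothetical, a pthread mutex stands in for
bin_bo_lock, a plain counter stands in for the kref, and the get/put
helpers are assumed to take the lock internally (which is why the
snippet above unlocks before calling put).

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-ins; not the driver's real structures. */
struct model_dev {
	pthread_mutex_t bin_bo_lock;	/* models vc4->bin_bo_lock */
	unsigned int bin_bo_refs;	/* models the kref count */
};

struct model_file {
	bool bin_bo_used;		/* models vc4file->bin_bo_used */
};

/* Models vc4_v3d_bin_bo_get(): locks internally and adds a ref. */
static int model_bin_bo_get(struct model_dev *dev)
{
	pthread_mutex_lock(&dev->bin_bo_lock);
	dev->bin_bo_refs++;
	pthread_mutex_unlock(&dev->bin_bo_lock);
	return 0;
}

/* Models vc4_v3d_bin_bo_put(): locks internally and drops a ref. */
static void model_bin_bo_put(struct model_dev *dev)
{
	pthread_mutex_lock(&dev->bin_bo_lock);
	assert(dev->bin_bo_refs > 0);
	dev->bin_bo_refs--;
	pthread_mutex_unlock(&dev->bin_bo_lock);
}

/*
 * The locked re-check: the first caller records its ref in the file,
 * any later caller that raced past the unlocked check drops its ref.
 */
static void model_claim_or_drop(struct model_dev *dev, struct model_file *file)
{
	pthread_mutex_lock(&dev->bin_bo_lock);
	if (file->bin_bo_used) {
		pthread_mutex_unlock(&dev->bin_bo_lock);
		model_bin_bo_put(dev);
	} else {
		file->bin_bo_used = true;
		pthread_mutex_unlock(&dev->bin_bo_lock);
	}
}

static int model_grab_bin_bo(struct model_dev *dev, struct model_file *file)
{
	int ret;

	/* Unlocked fast path: two threads can both see false here. */
	if (file->bin_bo_used)
		return 0;

	ret = model_bin_bo_get(dev);
	if (ret)
		return ret;

	model_claim_or_drop(dev, file);
	return 0;
}
```

Whichever thread reaches the locked section second finds the flag
already set and gives its extra reference back, so the file ends up
holding exactly one ref no matter how many ioctls raced.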

> +
> + return 0;
> +}
> +
>  int vc4_create_bo_ioctl(struct drm_device *dev, void *data,
>   struct drm_file *file_priv)
>  {
>   struct drm_vc4_create_bo *args = data;
> + struct vc4_file *vc4file = file_priv->driver_priv;
> + struct vc4_dev *vc4 = to_vc4_dev(dev);
>   struct vc4_bo *bo = NULL;
>   int ret;
>  
> + ret = vc4_grab_bin_bo(vc4, vc4file);
> + if (ret)
> + return ret;
> +
>   /*
>* We can't allocate from the BO cache, because the BOs don't
>* get zeroed, and that might leak data between users.
> @@ -846,6 +871,8 @@ vc4_create_shader_bo_ioctl(struct drm_device *dev, void *data,
>  struct drm_file *file_priv)
>  {
>   struct drm_vc4_create_shader_bo *args = data;
> + struct vc4_file *vc4file = file_priv->driver_priv;
> + struct vc4_dev *vc4 = to_vc4_dev(dev);
>   struct vc4_bo *bo = NULL;
>   int ret;
>  
> @@ -865,6 +892,10 @@ vc4_create_shader_bo_ioctl(struct drm_device *dev, void *data,
>   return -EINVAL;
>   }
>  
> + ret = vc4_grab_bin_bo(vc4, vc4file);
> + if (ret)
> + return ret;
> +
>   bo = vc4_bo_create(dev, args->size, true, VC4_BO_TYPE_V3D_SHADER);
>   if (IS_ERR(bo))
>   return PTR_ERR(bo);
> @@ -894,7 +925,7 @@ vc4_create_shader_bo_ioctl(struct drm_device *dev, void *data,
>*/
> 	ret = drm_gem_handle_create(file_priv, &bo->base.base, &args->handle);
>  
> - fail:
> +fail:
> 	drm_gem_object_put_unlocked(&bo->base.base);
>  
>   return ret;

Extraneous whitespace change?



[PATCH v7 4/4] drm/vc4: Allocate binner bo when starting to use the V3D

2019-04-25 Thread Paul Kocialkowski
The binner BO is not required until the V3D is in use, so avoid
allocating it at probe and do it on the first non-dumb BO allocation.

Keep track of which clients are using the V3D and liberate the buffer
when there is none left, using a kref. Protect the logic with a
mutex to avoid race conditions.

The binner BO is created at the time of the first render ioctl and is
destroyed when there is no client and no exec job using it left.

The Out-Of-Memory (OOM) interrupt also gets some tweaking, to avoid
enabling it before having allocated a binner bo.

We also want to keep the BO alive during runtime suspend/resume to avoid
failing to allocate it at resume. This happens when the CMA pool is
full at that point and results in a hard crash.

Signed-off-by: Paul Kocialkowski 
---
 drivers/gpu/drm/vc4/vc4_bo.c  | 33 +++-
 drivers/gpu/drm/vc4/vc4_drv.c |  6 
 drivers/gpu/drm/vc4/vc4_drv.h | 14 +
 drivers/gpu/drm/vc4/vc4_gem.c | 13 
 drivers/gpu/drm/vc4/vc4_irq.c | 21 +
 drivers/gpu/drm/vc4/vc4_v3d.c | 58 +++
 6 files changed, 125 insertions(+), 20 deletions(-)

diff --git a/drivers/gpu/drm/vc4/vc4_bo.c b/drivers/gpu/drm/vc4/vc4_bo.c
index 88ebd681d7eb..2b3ec5926fe2 100644
--- a/drivers/gpu/drm/vc4/vc4_bo.c
+++ b/drivers/gpu/drm/vc4/vc4_bo.c
@@ -799,13 +799,38 @@ vc4_prime_import_sg_table(struct drm_device *dev,
return obj;
 }
 
+static int vc4_grab_bin_bo(struct vc4_dev *vc4, struct vc4_file *vc4file)
+{
+	int ret;
+
+	if (!vc4->v3d)
+		return -ENODEV;
+
+	if (vc4file->bin_bo_used)
+		return 0;
+
+	ret = vc4_v3d_bin_bo_get(vc4);
+	if (ret)
+		return ret;
+
+	vc4file->bin_bo_used = true;
+
+	return 0;
+}
+
 int vc4_create_bo_ioctl(struct drm_device *dev, void *data,
struct drm_file *file_priv)
 {
struct drm_vc4_create_bo *args = data;
+   struct vc4_file *vc4file = file_priv->driver_priv;
+   struct vc4_dev *vc4 = to_vc4_dev(dev);
struct vc4_bo *bo = NULL;
int ret;
 
+   ret = vc4_grab_bin_bo(vc4, vc4file);
+   if (ret)
+   return ret;
+
/*
 * We can't allocate from the BO cache, because the BOs don't
 * get zeroed, and that might leak data between users.
@@ -846,6 +871,8 @@ vc4_create_shader_bo_ioctl(struct drm_device *dev, void *data,
   struct drm_file *file_priv)
 {
struct drm_vc4_create_shader_bo *args = data;
+   struct vc4_file *vc4file = file_priv->driver_priv;
+   struct vc4_dev *vc4 = to_vc4_dev(dev);
struct vc4_bo *bo = NULL;
int ret;
 
@@ -865,6 +892,10 @@ vc4_create_shader_bo_ioctl(struct drm_device *dev, void *data,
return -EINVAL;
}
 
+   ret = vc4_grab_bin_bo(vc4, vc4file);
+   if (ret)
+   return ret;
+
bo = vc4_bo_create(dev, args->size, true, VC4_BO_TYPE_V3D_SHADER);
if (IS_ERR(bo))
return PTR_ERR(bo);
@@ -894,7 +925,7 @@ vc4_create_shader_bo_ioctl(struct drm_device *dev, void *data,
 */
 	ret = drm_gem_handle_create(file_priv, &bo->base.base, &args->handle);
 
- fail:
+fail:
 	drm_gem_object_put_unlocked(&bo->base.base);
 
return ret;
diff --git a/drivers/gpu/drm/vc4/vc4_drv.c b/drivers/gpu/drm/vc4/vc4_drv.c
index 6d9be20a32be..0f99ad03614e 100644
--- a/drivers/gpu/drm/vc4/vc4_drv.c
+++ b/drivers/gpu/drm/vc4/vc4_drv.c
@@ -128,8 +128,12 @@ static int vc4_open(struct drm_device *dev, struct drm_file *file)
 
 static void vc4_close(struct drm_device *dev, struct drm_file *file)
 {
+   struct vc4_dev *vc4 = to_vc4_dev(dev);
struct vc4_file *vc4file = file->driver_priv;
 
+   if (vc4file->bin_bo_used)
+   vc4_v3d_bin_bo_put(vc4);
+
vc4_perfmon_close_file(vc4file);
kfree(vc4file);
 }
@@ -274,6 +278,8 @@ static int vc4_drm_bind(struct device *dev)
drm->dev_private = vc4;
 	INIT_LIST_HEAD(&vc4->debugfs_list);
 
+	mutex_init(&vc4->bin_bo_lock);
+
ret = vc4_bo_cache_init(drm);
if (ret)
goto dev_put;
diff --git a/drivers/gpu/drm/vc4/vc4_drv.h b/drivers/gpu/drm/vc4/vc4_drv.h
index 4f13f6262491..5bfca83deb8e 100644
--- a/drivers/gpu/drm/vc4/vc4_drv.h
+++ b/drivers/gpu/drm/vc4/vc4_drv.h
@@ -216,6 +216,11 @@ struct vc4_dev {
 * the minor is available (after drm_dev_register()).
 */
struct list_head debugfs_list;
+
+   /* Mutex for binner bo allocation. */
+   struct mutex bin_bo_lock;
+   /* Reference count for our binner bo. */
+   struct kref bin_bo_kref;
 };
 
 static inline struct vc4_dev *
@@ -584,6 +589,11 @@ struct vc4_exec_info {
 * NULL otherwise.
 */
struct vc4_perfmon *perfmon;
+
+   /* Whether the exec has taken a reference to the binner BO, which should
+* happen with a VC4_PACKET_TILE_BINNING_MODE_CONFIG packet.
+*/
+