Re: [Mesa-dev] [PATCH v3 4/4] i965: Avoid problems from referencing orphaned BOs after growing.

2018-01-10 Thread Iago Toral
On Wed, 2018-01-10 at 08:54 -0800, Kenneth Graunke wrote:
> On Wednesday, January 10, 2018 5:11:21 AM PST Iago Toral wrote:
> > On Mon, 2018-01-08 at 13:30 -0800, Kenneth Graunke wrote:
> > > Growing the batch/state buffer is a lot more dangerous than I
> > > thought.
> > > 
> > > A number of places emit multiple state buffer sections, and then
> > > write
> > > data to the returned pointer, or save a pointer to brw-
> > > > batch.state.bo
> > > 
> > > and then use it in relocations.  If each call can grow, this can
> > > result
> > > in stale map references or stale BO pointers.  Furthermore,
> > > fences
> > > refer
> > > to the old batch BO, and that reference needs to continue
> > > working.
> > > 
> > > To avoid these woes, we avoid ever swapping the brw->batch.*.bo
> > > pointer,
> > > instead exchanging the brw_bo structures in place.  That way,
> > > stale
> > > BO
> > > references are fine - the GEM handle changes, but the brw_bo
> > > pointer
> > > doesn't.  We also defer the memcpy until a quiescent point, so
> > > callers
> > > can write to the returned pointer - which may be in either BO -
> > > and
> > > we'll sort it out and combine the two properly in the end.
> > > 
> > > v2:
> > > - Handle stale pointers in the shadow copy case, where realloc
> > > may or
> > >   may not move our shadow copy to a new address.
> > > - Track the partial map explicitly, to avoid problems with buffer
> > > reuse
> > >   where multiple map modes exist (caught by Chris Wilson).
> > > 
> > > Fixes: 2dfc119f22f257082ab0 "i965: Grow the batch/state buffers
> > > if we
> > > need space and can't flush."
> > > ---
> > >  src/mesa/drivers/dri/i965/brw_context.h   |   3 +
> > >  src/mesa/drivers/dri/i965/intel_batchbuffer.c | 128
> > > +-
> > >  2 files changed, 107 insertions(+), 24 deletions(-)
> > > 
> > > diff --git a/src/mesa/drivers/dri/i965/brw_context.h
> > > b/src/mesa/drivers/dri/i965/brw_context.h
> > > index 4d29e2ef082..1167d04a32a 100644
> > > --- a/src/mesa/drivers/dri/i965/brw_context.h
> > > +++ b/src/mesa/drivers/dri/i965/brw_context.h
> > > @@ -473,6 +473,9 @@ struct brw_reloc_list {
> > >  struct brw_growing_bo {
> > > struct brw_bo *bo;
> > > uint32_t *map;
> > > +   struct brw_bo *partial_bo;
> > > +   uint32_t *partial_bo_map;
> > > +   unsigned partial_bytes;
> > >  };
> > >  
> > >  struct intel_batchbuffer {
> > > diff --git a/src/mesa/drivers/dri/i965/intel_batchbuffer.c
> > > b/src/mesa/drivers/dri/i965/intel_batchbuffer.c
> > > index b4fcd92b6bd..ca027ebff9f 100644
> > > --- a/src/mesa/drivers/dri/i965/intel_batchbuffer.c
> > > +++ b/src/mesa/drivers/dri/i965/intel_batchbuffer.c
> > > @@ -171,6 +171,9 @@ recreate_growing_buffer(struct brw_context
> > > *brw,
> > >  
> > > grow->bo = brw_bo_alloc(bufmgr, name, size, 4096);
> > > grow->bo->kflags = can_do_exec_capture(screen) ?
> > > EXEC_OBJECT_CAPTURE : 0;
> > > +   grow->partial_bo = NULL;
> > > +   grow->partial_bo_map = NULL;
> > > +   grow->partial_bytes = 0;
> > >  
> > > if (!batch->use_shadow_copy)
> > >    grow->map = brw_bo_map(brw, grow->bo, MAP_READ |
> > > MAP_WRITE);
> > > @@ -267,6 +270,32 @@ intel_batchbuffer_free(struct
> > > intel_batchbuffer
> > > *batch)
> > >    _mesa_hash_table_destroy(batch->state_batch_sizes, NULL);
> > >  }
> > >  
> > > +/**
> > > + * Finish copying the old batch/state buffer's contents to the
> > > new
> > > one
> > > + * after we tried to "grow" the buffer in an earlier operation.
> > > + */
> > > +static void
> > > +finish_growing_bos(struct brw_growing_bo *grow)
> > > +{
> > > +   struct brw_bo *old_bo = grow->partial_bo;
> > > +   if (!old_bo)
> > > +  return;
> > 
> > Maybe just have the early return check at the top instead?:
> > 
> > if (!grow->partial_bo)
> >    return;
> > 
> > > +   /* If using a shadow copy, realloc may have returned the same
> > > pointer
> > > +* when growing, so there's no need to memcpy.  Or it might
> > > have
> > > moved,
> > > +* at which point we need to copy.
> > > +*/
> > 
> > Is this really necessary? According to the man pages realloc is
> > supposed to always keep the contents:
> > 
> > "The contents will be unchanged in the range from the start of the
> > region up to the minimum  of the  old and  new  sizes"
> 
> I believe so - and I'd missed this in my earlier spin of the patch.
> (Re-reading, I didn't explain this well in the comments...)
> 
> The data may not have been written by the time realloc is called.
> 
> For example, consider genX(upload_sf_clip_viewport) for GEN <= 6.
> It does:
> 
>    uint32_t *sf_map =
>   brw_state_batch(brw, GENX(SF_VIEWPORT_length) * 4 *
> viewport_count,
>   32, _vp_offset);
>    uint32_t *clip_map =
>   brw_state_batch(brw, GENX(CLIP_VIEWPORT_length) * 4 *
> viewport_count,
>   32, _vp_offset);
> 
> then writes both.  Let's assume SF_VIEWPORT fits in the
> buffer.  sf_map
> will be a pointer to the 

Re: [Mesa-dev] [PATCH v3 4/4] i965: Avoid problems from referencing orphaned BOs after growing.

2018-01-10 Thread Kenneth Graunke
On Wednesday, January 10, 2018 5:11:21 AM PST Iago Toral wrote:
> On Mon, 2018-01-08 at 13:30 -0800, Kenneth Graunke wrote:
> > Growing the batch/state buffer is a lot more dangerous than I
> > thought.
> > 
> > A number of places emit multiple state buffer sections, and then
> > write
> > data to the returned pointer, or save a pointer to brw-
> > >batch.state.bo
> > and then use it in relocations.  If each call can grow, this can
> > result
> > in stale map references or stale BO pointers.  Furthermore, fences
> > refer
> > to the old batch BO, and that reference needs to continue working.
> > 
> > To avoid these woes, we avoid ever swapping the brw->batch.*.bo
> > pointer,
> > instead exchanging the brw_bo structures in place.  That way, stale
> > BO
> > references are fine - the GEM handle changes, but the brw_bo pointer
> > doesn't.  We also defer the memcpy until a quiescent point, so
> > callers
> > can write to the returned pointer - which may be in either BO - and
> > we'll sort it out and combine the two properly in the end.
> > 
> > v2:
> > - Handle stale pointers in the shadow copy case, where realloc may or
> >   may not move our shadow copy to a new address.
> > - Track the partial map explicitly, to avoid problems with buffer
> > reuse
> >   where multiple map modes exist (caught by Chris Wilson).
> > 
> > Fixes: 2dfc119f22f257082ab0 "i965: Grow the batch/state buffers if we
> > need space and can't flush."
> > ---
> >  src/mesa/drivers/dri/i965/brw_context.h   |   3 +
> >  src/mesa/drivers/dri/i965/intel_batchbuffer.c | 128
> > +-
> >  2 files changed, 107 insertions(+), 24 deletions(-)
> > 
> > diff --git a/src/mesa/drivers/dri/i965/brw_context.h
> > b/src/mesa/drivers/dri/i965/brw_context.h
> > index 4d29e2ef082..1167d04a32a 100644
> > --- a/src/mesa/drivers/dri/i965/brw_context.h
> > +++ b/src/mesa/drivers/dri/i965/brw_context.h
> > @@ -473,6 +473,9 @@ struct brw_reloc_list {
> >  struct brw_growing_bo {
> > struct brw_bo *bo;
> > uint32_t *map;
> > +   struct brw_bo *partial_bo;
> > +   uint32_t *partial_bo_map;
> > +   unsigned partial_bytes;
> >  };
> >  
> >  struct intel_batchbuffer {
> > diff --git a/src/mesa/drivers/dri/i965/intel_batchbuffer.c
> > b/src/mesa/drivers/dri/i965/intel_batchbuffer.c
> > index b4fcd92b6bd..ca027ebff9f 100644
> > --- a/src/mesa/drivers/dri/i965/intel_batchbuffer.c
> > +++ b/src/mesa/drivers/dri/i965/intel_batchbuffer.c
> > @@ -171,6 +171,9 @@ recreate_growing_buffer(struct brw_context *brw,
> >  
> > grow->bo = brw_bo_alloc(bufmgr, name, size, 4096);
> > grow->bo->kflags = can_do_exec_capture(screen) ?
> > EXEC_OBJECT_CAPTURE : 0;
> > +   grow->partial_bo = NULL;
> > +   grow->partial_bo_map = NULL;
> > +   grow->partial_bytes = 0;
> >  
> > if (!batch->use_shadow_copy)
> >grow->map = brw_bo_map(brw, grow->bo, MAP_READ | MAP_WRITE);
> > @@ -267,6 +270,32 @@ intel_batchbuffer_free(struct intel_batchbuffer
> > *batch)
> >_mesa_hash_table_destroy(batch->state_batch_sizes, NULL);
> >  }
> >  
> > +/**
> > + * Finish copying the old batch/state buffer's contents to the new
> > one
> > + * after we tried to "grow" the buffer in an earlier operation.
> > + */
> > +static void
> > +finish_growing_bos(struct brw_growing_bo *grow)
> > +{
> > +   struct brw_bo *old_bo = grow->partial_bo;
> > +   if (!old_bo)
> > +  return;
> 
> Maybe just have the early return check at the top instead?:
> 
> if (!grow->partial_bo)
>return;
> 
> > +   /* If using a shadow copy, realloc may have returned the same
> > pointer
> > +* when growing, so there's no need to memcpy.  Or it might have
> > moved,
> > +* at which point we need to copy.
> > +*/
> 
> Is this really necessary? According to the man pages realloc is
> supposed to always keep the contents:
> 
> "The contents will be unchanged in the range from the start of the
> region up to the minimum  of the  old and  new  sizes"

I believe so - and I'd missed this in my earlier spin of the patch.
(Re-reading, I didn't explain this well in the comments...)

The data may not have been written by the time realloc is called.

For example, consider genX(upload_sf_clip_viewport) for GEN <= 6.
It does:

   uint32_t *sf_map =
  brw_state_batch(brw, GENX(SF_VIEWPORT_length) * 4 * viewport_count,
  32, _vp_offset);
   uint32_t *clip_map =
  brw_state_batch(brw, GENX(CLIP_VIEWPORT_length) * 4 * viewport_count,
  32, _vp_offset);

then writes both.  Let's assume SF_VIEWPORT fits in the buffer.  sf_map
will be a pointer to the original malloc'd shadow buffer.  If the clip
viewports don't fit in the buffer, then brw_state_batch will realloc,
possibly moving the data to a new memory location.  sf_map becomes a
stale pointer to the old memory store, and will be written below,
causing a segfault.

But now that I've walked through this scenario, I realize my code
doesn't work in that case.  The idea here is to keep 

Re: [Mesa-dev] [PATCH v3 4/4] i965: Avoid problems from referencing orphaned BOs after growing.

2018-01-10 Thread Iago Toral
On Mon, 2018-01-08 at 13:30 -0800, Kenneth Graunke wrote:
> Growing the batch/state buffer is a lot more dangerous than I
> thought.
> 
> A number of places emit multiple state buffer sections, and then
> write
> data to the returned pointer, or save a pointer to brw-
> >batch.state.bo
> and then use it in relocations.  If each call can grow, this can
> result
> in stale map references or stale BO pointers.  Furthermore, fences
> refer
> to the old batch BO, and that reference needs to continue working.
> 
> To avoid these woes, we avoid ever swapping the brw->batch.*.bo
> pointer,
> instead exchanging the brw_bo structures in place.  That way, stale
> BO
> references are fine - the GEM handle changes, but the brw_bo pointer
> doesn't.  We also defer the memcpy until a quiescent point, so
> callers
> can write to the returned pointer - which may be in either BO - and
> we'll sort it out and combine the two properly in the end.
> 
> v2:
> - Handle stale pointers in the shadow copy case, where realloc may or
>   may not move our shadow copy to a new address.
> - Track the partial map explicitly, to avoid problems with buffer
> reuse
>   where multiple map modes exist (caught by Chris Wilson).
> 
> Fixes: 2dfc119f22f257082ab0 "i965: Grow the batch/state buffers if we
> need space and can't flush."
> ---
>  src/mesa/drivers/dri/i965/brw_context.h   |   3 +
>  src/mesa/drivers/dri/i965/intel_batchbuffer.c | 128
> +-
>  2 files changed, 107 insertions(+), 24 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_context.h
> b/src/mesa/drivers/dri/i965/brw_context.h
> index 4d29e2ef082..1167d04a32a 100644
> --- a/src/mesa/drivers/dri/i965/brw_context.h
> +++ b/src/mesa/drivers/dri/i965/brw_context.h
> @@ -473,6 +473,9 @@ struct brw_reloc_list {
>  struct brw_growing_bo {
> struct brw_bo *bo;
> uint32_t *map;
> +   struct brw_bo *partial_bo;
> +   uint32_t *partial_bo_map;
> +   unsigned partial_bytes;
>  };
>  
>  struct intel_batchbuffer {
> diff --git a/src/mesa/drivers/dri/i965/intel_batchbuffer.c
> b/src/mesa/drivers/dri/i965/intel_batchbuffer.c
> index b4fcd92b6bd..ca027ebff9f 100644
> --- a/src/mesa/drivers/dri/i965/intel_batchbuffer.c
> +++ b/src/mesa/drivers/dri/i965/intel_batchbuffer.c
> @@ -171,6 +171,9 @@ recreate_growing_buffer(struct brw_context *brw,
>  
> grow->bo = brw_bo_alloc(bufmgr, name, size, 4096);
> grow->bo->kflags = can_do_exec_capture(screen) ?
> EXEC_OBJECT_CAPTURE : 0;
> +   grow->partial_bo = NULL;
> +   grow->partial_bo_map = NULL;
> +   grow->partial_bytes = 0;
>  
> if (!batch->use_shadow_copy)
>    grow->map = brw_bo_map(brw, grow->bo, MAP_READ | MAP_WRITE);
> @@ -267,6 +270,32 @@ intel_batchbuffer_free(struct intel_batchbuffer
> *batch)
>    _mesa_hash_table_destroy(batch->state_batch_sizes, NULL);
>  }
>  
> +/**
> + * Finish copying the old batch/state buffer's contents to the new
> one
> + * after we tried to "grow" the buffer in an earlier operation.
> + */
> +static void
> +finish_growing_bos(struct brw_growing_bo *grow)
> +{
> +   struct brw_bo *old_bo = grow->partial_bo;
> +   if (!old_bo)
> +  return;

Maybe just have the early return check at the top instead?:

if (!grow->partial_bo)
   return;

> +   /* If using a shadow copy, realloc may have returned the same
> pointer
> +* when growing, so there's no need to memcpy.  Or it might have
> moved,
> +* at which point we need to copy.
> +*/

Is this really necessary? According to the man pages realloc is
supposed to always keep the contents:

"The contents will be unchanged in the range from the start of the
region up to the minimum  of the  old and  new  sizes"

> +   if (grow->map != grow->partial_bo_map) {
> +  memcpy(grow->map, grow->partial_bo_map, grow->partial_bytes);

Also, isn't this supposed to also copy the contents when we are not
using a shadow copy? This patch removed the memcpy after we map the new
BO in replace_bo_in_reloc_list below.

> +   }
> +
> +   grow->partial_bo = NULL;
> +   grow->partial_bo_map = NULL;
> +   grow->partial_bytes = 0;
> +
> +   brw_bo_unreference(old_bo);
> +}
> +
>  static void
>  replace_bo_in_reloc_list(struct brw_reloc_list *rlist,
>   uint32_t old_handle, uint32_t new_handle)
> @@ -296,21 +325,26 @@ grow_buffer(struct brw_context *brw,
> struct brw_bufmgr *bufmgr = brw->bufmgr;
> struct brw_bo *bo = grow->bo;
>  
> -   uint32_t *old_map = grow->map;
> -   struct brw_bo *old_bo = grow->bo;
> +   perf_debug("Growing %s - ran out of space\n", bo->name);
>  
> -   struct brw_bo *new_bo =
> -  brw_bo_alloc(bufmgr, old_bo->name, new_size, old_bo->align);
> -   uint32_t *new_map;
> +   if (grow->partial_bo) {
> +  /* We've already grown once, and now we need to do it again.
> +   * Finish our last grow operation so we can start a new one.
> +   * This should basically never happen.
> +   */
> +  perf_debug("Had to grow multiple times");
> +  

[Mesa-dev] [PATCH v3 4/4] i965: Avoid problems from referencing orphaned BOs after growing.

2018-01-08 Thread Kenneth Graunke
Growing the batch/state buffer is a lot more dangerous than I thought.

A number of places emit multiple state buffer sections, and then write
data to the returned pointer, or save a pointer to brw->batch.state.bo
and then use it in relocations.  If each call can grow, this can result
in stale map references or stale BO pointers.  Furthermore, fences refer
to the old batch BO, and that reference needs to continue working.

To avoid these woes, we avoid ever swapping the brw->batch.*.bo pointer,
instead exchanging the brw_bo structures in place.  That way, stale BO
references are fine - the GEM handle changes, but the brw_bo pointer
doesn't.  We also defer the memcpy until a quiescent point, so callers
can write to the returned pointer - which may be in either BO - and
we'll sort it out and combine the two properly in the end.

v2:
- Handle stale pointers in the shadow copy case, where realloc may or
  may not move our shadow copy to a new address.
- Track the partial map explicitly, to avoid problems with buffer reuse
  where multiple map modes exist (caught by Chris Wilson).

Fixes: 2dfc119f22f257082ab0 "i965: Grow the batch/state buffers if we need 
space and can't flush."
---
 src/mesa/drivers/dri/i965/brw_context.h   |   3 +
 src/mesa/drivers/dri/i965/intel_batchbuffer.c | 128 +-
 2 files changed, 107 insertions(+), 24 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
b/src/mesa/drivers/dri/i965/brw_context.h
index 4d29e2ef082..1167d04a32a 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -473,6 +473,9 @@ struct brw_reloc_list {
 struct brw_growing_bo {
struct brw_bo *bo;
uint32_t *map;
+   struct brw_bo *partial_bo;
+   uint32_t *partial_bo_map;
+   unsigned partial_bytes;
 };
 
 struct intel_batchbuffer {
diff --git a/src/mesa/drivers/dri/i965/intel_batchbuffer.c 
b/src/mesa/drivers/dri/i965/intel_batchbuffer.c
index b4fcd92b6bd..ca027ebff9f 100644
--- a/src/mesa/drivers/dri/i965/intel_batchbuffer.c
+++ b/src/mesa/drivers/dri/i965/intel_batchbuffer.c
@@ -171,6 +171,9 @@ recreate_growing_buffer(struct brw_context *brw,
 
grow->bo = brw_bo_alloc(bufmgr, name, size, 4096);
grow->bo->kflags = can_do_exec_capture(screen) ? EXEC_OBJECT_CAPTURE : 0;
+   grow->partial_bo = NULL;
+   grow->partial_bo_map = NULL;
+   grow->partial_bytes = 0;
 
if (!batch->use_shadow_copy)
   grow->map = brw_bo_map(brw, grow->bo, MAP_READ | MAP_WRITE);
@@ -267,6 +270,32 @@ intel_batchbuffer_free(struct intel_batchbuffer *batch)
   _mesa_hash_table_destroy(batch->state_batch_sizes, NULL);
 }
 
+/**
+ * Finish copying the old batch/state buffer's contents to the new one
+ * after we tried to "grow" the buffer in an earlier operation.
+ */
+static void
+finish_growing_bos(struct brw_growing_bo *grow)
+{
+   struct brw_bo *old_bo = grow->partial_bo;
+   if (!old_bo)
+  return;
+
+   /* If using a shadow copy, realloc may have returned the same pointer
+* when growing, so there's no need to memcpy.  Or it might have moved,
+* at which point we need to copy.
+*/
+   if (grow->map != grow->partial_bo_map) {
+  memcpy(grow->map, grow->partial_bo_map, grow->partial_bytes);
+   }
+
+   grow->partial_bo = NULL;
+   grow->partial_bo_map = NULL;
+   grow->partial_bytes = 0;
+
+   brw_bo_unreference(old_bo);
+}
+
 static void
 replace_bo_in_reloc_list(struct brw_reloc_list *rlist,
  uint32_t old_handle, uint32_t new_handle)
@@ -296,21 +325,26 @@ grow_buffer(struct brw_context *brw,
struct brw_bufmgr *bufmgr = brw->bufmgr;
struct brw_bo *bo = grow->bo;
 
-   uint32_t *old_map = grow->map;
-   struct brw_bo *old_bo = grow->bo;
+   perf_debug("Growing %s - ran out of space\n", bo->name);
 
-   struct brw_bo *new_bo =
-  brw_bo_alloc(bufmgr, old_bo->name, new_size, old_bo->align);
-   uint32_t *new_map;
+   if (grow->partial_bo) {
+  /* We've already grown once, and now we need to do it again.
+   * Finish our last grow operation so we can start a new one.
+   * This should basically never happen.
+   */
+  perf_debug("Had to grow multiple times");
+  finish_growing_bos(grow);
+   }
 
-   perf_debug("Growing %s - ran out of space\n", old_bo->name);
+   struct brw_bo *new_bo = brw_bo_alloc(bufmgr, bo->name, new_size, bo->align);
 
/* Copy existing data to the new larger buffer */
+   grow->partial_bo_map = grow->map;
+
if (batch->use_shadow_copy) {
-  new_map = realloc(old_map, new_size);
+  grow->map = realloc(grow->map, new_size);
} else {
-  new_map = brw_bo_map(brw, new_bo, MAP_READ | MAP_WRITE);
-  memcpy(new_map, old_map, existing_bytes);
+  grow->map = brw_bo_map(brw, new_bo, MAP_READ | MAP_WRITE);
}
 
/* Try to put the new BO at the same GTT offset as the old BO (which
@@ -322,21 +356,18 @@ grow_buffer(struct brw_context *brw,
 *
 * Also preserve kflags for EXEC_OBJECT_CAPTURE.
 */
-