Re: [Mesa-dev] [Mesa-stable] [PATCH] i965: Fix buffer overruns in MSAA MCS buffer clearing.

2014-04-15 Thread Courtney Goeltzenleuchter
What timing, we were just looking at this exact issue - intermittent
glxgears rendering issues when using multisample buffers.

What's the plan for DRM? Seems like it's broken if writing to the buffer
size indicated (bo->size) causes us to clobber other BOs.

Courtney


On Mon, Apr 14, 2014 at 8:54 PM, Kenneth Graunke kenn...@whitecape.org wrote:

 On 04/14/2014 05:33 PM, Eric Anholt wrote:
  This manifested as rendering failures or sometimes GPU hangs in
  compositors when they accidentally got MSAA visuals due to a bug in the X
  Server.  Today we decided that the problem in compositors was equivalent
  to a corruption bug we'd noticed recently in resizing MSAA-visual
  glxgears, and debugging got a lot easier.
 
  When we allocate our MCS MT, libdrm takes the size we request, aligns it
  to Y tile size (blowing it up from 300x300=900000 bytes to 384*320=122880
  bytes, 30 pages), then puts it into a power-of-two-sized BO (131072
 bytes,
  32 pages).  Because it's Y tiled, we attach a 384-byte-stride fence to
 it.
  When we memset by the BO size in Mesa, between bytes 122880 and 131072
 the
  data gets stored to the first 20 or so scanlines of each of the 3 tiled
  pages in that row, even though only 2 of those pages were allocated by
  libdrm.

 What?

 I get that drm_intel_bo_alloc/drm_intel_bo_alloc_tiled might return a
 drm_intel_bo where bo->size is larger than what you asked for, due to
 the BO cache.  But...what you're saying is, it doesn't actually allocate
 enough pages to back the whole bo->size it gives you?  So, if you write
 bytes 0..(bo->size - 1), you'll randomly clobber memory in a way that's
 really difficult to detect?

 I never realized that.  It seems pretty scary.

 There are other places where we memset an entire BO using bo->size.  For
 example, your INTEL_DEBUG=shader_time code does exactly that (though it
 isn't tiled).

 Could we change libdrm to set bo->size to the actual usable size of the
 buffer, rather than the bucket size?  That seems much safer, and would
 prevent mistakes like this in the future.  If libdrm needs to know the
 bucket size internally, we could store that as a field in the private
 drm_intel_gem_bo structure, and not expose it to consumers of the
 library.  I can't imagine users of libdrm want the current value.

  In the glxgears case, the missing 3rd page happened to
  consistently be the static VBO that got mapped right after the first MCS
  allocation, so corruption only appeared once window resize made us throw
  out the old MCS and then allocate the same BO to back the new MCS.
 
  Instead, just memset the amount of data we actually asked libdrm to
  allocate for, which will be smaller (more efficient) and not overrun.
  Thanks go to Kenneth for doing most of the hard debugging to eliminate a
  lot of the search space for the bug.
 
  Cc: 10.0 10.1 mesa-sta...@lists.freedesktop.org
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77207
  ---
 
  Ken: how I eventually figured this out was I thought "well, I'm
  clearing the whole surface at the start of rendering glxgears, so I
  don't *really* need to memset because I'll be initializing the whole
  buffer during fallback glClear() anyway, so let me just drop the whole
  memset step to definitely eliminate that as a potential problem."  Huh,
  the problem's gone.
 
  The worst is I remember briefly thinking about this code when it was
  landing, and thought "nope, seems safe."
 
   src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 2 +-
   1 file changed, 1 insertion(+), 1 deletion(-)
 
  diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
 b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
  index 5996a1b..59700ed 100644
  --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
  +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
  @@ -1219,7 +1219,7 @@ intel_miptree_alloc_mcs(struct brw_context *brw,
   * Note: the clear value for MCS buffers is all 1's, so we memset to
 0xff.
   */
  void *data = intel_miptree_map_raw(brw, mt->mcs_mt);
  -   memset(data, 0xff, mt->mcs_mt->region->bo->size);
  +   memset(data, 0xff, mt->mcs_mt->region->height *
 mt->mcs_mt->region->pitch);
  intel_miptree_unmap_raw(brw, mt->mcs_mt);
  mt->fast_clear_state = INTEL_FAST_CLEAR_STATE_CLEAR;

 This does seem to fix the KWin problem, as well as the glxgears problem.

 I agree this is the correct amount of data to memset, and even if we
 make the libdrm change I suggested, this seems worth doing.  bo->size
 may have been rounded up beyond what we need, and memsetting that extra
 space is wasteful (even if it did work).

 Reviewed-by: Kenneth Graunke kenn...@whitecape.org

 Thanks a ton for your help on this, Eric.  I was really stumped.


 ___
 mesa-stable mailing list
 mesa-sta...@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/mesa-stable




-- 
Courtney Goeltzenleuchter
LunarG
___
mesa-dev mailing list

Re: [Mesa-dev] [Mesa-stable] [PATCH] i965: Fix buffer overruns in MSAA MCS buffer clearing.

2014-04-15 Thread Mike Stroyan
I would go farther than requiring the returned bo->size to be covered by
backing pages.
There really should be backing pages for every page mapped by
drm_intel_gem_bo_map.
No mapped addresses should be affecting memory outside of an object's
backing pages.

If tiling can result in access to unallocated memory it might even lead to
corruption of data in pages
used by different processes.  That would break GL_ARB_robustness
requirements.


On Tue, Apr 15, 2014 at 10:43 AM, Courtney Goeltzenleuchter 
court...@lunarg.com wrote:

 What timing, we were just looking at this exact issue - intermittent
 glxgears rendering issues when using multisample buffers.

 What's the plan for DRM? Seems like it's broken if writing to the buffer
 size indicated (bo->size) causes us to clobber other BOs.

 Courtney


 On Mon, Apr 14, 2014 at 8:54 PM, Kenneth Graunke kenn...@whitecape.org wrote:

 On 04/14/2014 05:33 PM, Eric Anholt wrote:
  This manifested as rendering failures or sometimes GPU hangs in
  compositors when they accidentally got MSAA visuals due to a bug in the
 X
  Server.  Today we decided that the problem in compositors was equivalent
  to a corruption bug we'd noticed recently in resizing MSAA-visual
  glxgears, and debugging got a lot easier.
 
  When we allocate our MCS MT, libdrm takes the size we request, aligns it
  to Y tile size (blowing it up from 300x300=900000 bytes to
 384*320=122880
  bytes, 30 pages), then puts it into a power-of-two-sized BO (131072
 bytes,
  32 pages).  Because it's Y tiled, we attach a 384-byte-stride fence to
 it.
  When we memset by the BO size in Mesa, between bytes 122880 and 131072
 the
  data gets stored to the first 20 or so scanlines of each of the 3 tiled
  pages in that row, even though only 2 of those pages were allocated by
  libdrm.

 What?

 I get that drm_intel_bo_alloc/drm_intel_bo_alloc_tiled might return a
 drm_intel_bo where bo->size is larger than what you asked for, due to
 the BO cache.  But...what you're saying is, it doesn't actually allocate
 enough pages to back the whole bo->size it gives you?  So, if you write
 bytes 0..(bo->size - 1), you'll randomly clobber memory in a way that's
 really difficult to detect?

 I never realized that.  It seems pretty scary.

 There are other places where we memset an entire BO using bo->size.  For
 example, your INTEL_DEBUG=shader_time code does exactly that (though it
 isn't tiled).

 Could we change libdrm to set bo->size to the actual usable size of the
 buffer, rather than the bucket size?  That seems much safer, and would
 prevent mistakes like this in the future.  If libdrm needs to know the
 bucket size internally, we could store that as a field in the private
 drm_intel_gem_bo structure, and not expose it to consumers of the
 library.  I can't imagine users of libdrm want the current value.

  In the glxgears case, the missing 3rd page happened to
  consistently be the static VBO that got mapped right after the first MCS
  allocation, so corruption only appeared once window resize made us throw
  out the old MCS and then allocate the same BO to back the new MCS.
 
  Instead, just memset the amount of data we actually asked libdrm to
  allocate for, which will be smaller (more efficient) and not overrun.
  Thanks go to Kenneth for doing most of the hard debugging to eliminate a
  lot of the search space for the bug.
 
  Cc: 10.0 10.1 mesa-sta...@lists.freedesktop.org
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77207
  ---
 
  Ken: how I eventually figured this out was I thought "well, I'm
  clearing the whole surface at the start of rendering glxgears, so I
  don't *really* need to memset because I'll be initializing the whole
  buffer during fallback glClear() anyway, so let me just drop the whole
  memset step to definitely eliminate that as a potential problem."  Huh,
  the problem's gone.
 
  The worst is I remember briefly thinking about this code when it was
  landing, and thought "nope, seems safe."
 
   src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 2 +-
   1 file changed, 1 insertion(+), 1 deletion(-)
 
  diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
 b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
  index 5996a1b..59700ed 100644
  --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
  +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
  @@ -1219,7 +1219,7 @@ intel_miptree_alloc_mcs(struct brw_context *brw,
   * Note: the clear value for MCS buffers is all 1's, so we memset
 to 0xff.
   */
  void *data = intel_miptree_map_raw(brw, mt->mcs_mt);
  -   memset(data, 0xff, mt->mcs_mt->region->bo->size);
  +   memset(data, 0xff, mt->mcs_mt->region->height *
 mt->mcs_mt->region->pitch);
  intel_miptree_unmap_raw(brw, mt->mcs_mt);
  mt->fast_clear_state = INTEL_FAST_CLEAR_STATE_CLEAR;

 This does seem to fix the KWin problem, as well as the glxgears problem.

 I agree this is the correct amount of data to memset, and even if we
 make the libdrm change I suggested, this seems worth 

Re: [Mesa-dev] [Mesa-stable] [PATCH] i965: Fix buffer overruns in MSAA MCS buffer clearing.

2014-04-15 Thread Courtney Goeltzenleuchter
On Tue, Apr 15, 2014 at 1:18 PM, Eric Anholt e...@anholt.net wrote:

 Kenneth Graunke kenn...@whitecape.org writes:

  On 04/14/2014 05:33 PM, Eric Anholt wrote:
  This manifested as rendering failures or sometimes GPU hangs in
  compositors when they accidentally got MSAA visuals due to a bug in the
 X
  Server.  Today we decided that the problem in compositors was equivalent
  to a corruption bug we'd noticed recently in resizing MSAA-visual
  glxgears, and debugging got a lot easier.
 
  When we allocate our MCS MT, libdrm takes the size we request, aligns it
  to Y tile size (blowing it up from 300x300=900000 bytes to
 384*320=122880
  bytes, 30 pages), then puts it into a power-of-two-sized BO (131072
 bytes,
  32 pages).  Because it's Y tiled, we attach a 384-byte-stride fence to
 it.
  When we memset by the BO size in Mesa, between bytes 122880 and 131072
 the
  data gets stored to the first 20 or so scanlines of each of the 3 tiled
  pages in that row, even though only 2 of those pages were allocated by
  libdrm.
 
  What?
 
  I get that drm_intel_bo_alloc/drm_intel_bo_alloc_tiled might return a
  drm_intel_bo where bo->size is larger than what you asked for, due to
  the BO cache.  But...what you're saying is, it doesn't actually allocate
  enough pages to back the whole bo->size it gives you?  So, if you write
  bytes 0..(bo->size - 1), you'll randomly clobber memory in a way that's
  really difficult to detect?

 You have that many pages, really.  But you've attached a fence to it, so
 your allocated pages are structured as:

 +---+---+---+
 |   |   |   |
 +---+---+---+
 |   |   |   |
 +---+---+---+
 |   |   |   |
 +---+---+---+
 |   |   |
 +---+---+

 (except taller in this specific example).  If you hit the pixels in
 those quads, you'll be fine.

 
  There are other places where we memset an entire BO using bo->size.  For
  example, your INTEL_DEBUG=shader_time code does exactly that (though it
  isn't tiled).
 
  Could we change libdrm to set bo->size to the actual usable size of the
  buffer, rather than the bucket size?

 The pages containing pixels you asked for go to 122880, and the BO is
 131072, but the pixels you asked for have a maximum linear address of
 384*300=115200.  Which value are you thinking is the actual usable
 size?  We certainly shouldn't have been memsetting more pixels than
 115200.


Why not? I understand that it's not useful to touch pixels beyond 115200,
but from the data structure, we were given 131072 bytes to work with. Why
shouldn't memsetting the entire allocated space be a safe operation?






-- 
Courtney Goeltzenleuchter
LunarG


Re: [Mesa-dev] [Mesa-stable] [PATCH] i965: Fix buffer overruns in MSAA MCS buffer clearing.

2014-04-15 Thread Eric Anholt
Courtney Goeltzenleuchter court...@lunarg.com writes:

 On Tue, Apr 15, 2014 at 1:18 PM, Eric Anholt e...@anholt.net wrote:

 Kenneth Graunke kenn...@whitecape.org writes:

  On 04/14/2014 05:33 PM, Eric Anholt wrote:
  This manifested as rendering failures or sometimes GPU hangs in
  compositors when they accidentally got MSAA visuals due to a bug in the
 X
  Server.  Today we decided that the problem in compositors was equivalent
  to a corruption bug we'd noticed recently in resizing MSAA-visual
  glxgears, and debugging got a lot easier.
 
  When we allocate our MCS MT, libdrm takes the size we request, aligns it
  to Y tile size (blowing it up from 300x300=900000 bytes to
 384*320=122880
  bytes, 30 pages), then puts it into a power-of-two-sized BO (131072
 bytes,
  32 pages).  Because it's Y tiled, we attach a 384-byte-stride fence to
 it.
  When we memset by the BO size in Mesa, between bytes 122880 and 131072
 the
  data gets stored to the first 20 or so scanlines of each of the 3 tiled
  pages in that row, even though only 2 of those pages were allocated by
  libdrm.
 
  What?
 
  I get that drm_intel_bo_alloc/drm_intel_bo_alloc_tiled might return a
  drm_intel_bo where bo->size is larger than what you asked for, due to
  the BO cache.  But...what you're saying is, it doesn't actually allocate
  enough pages to back the whole bo->size it gives you?  So, if you write
  bytes 0..(bo->size - 1), you'll randomly clobber memory in a way that's
  really difficult to detect?

 You have that many pages, really.  But you've attached a fence to it, so
 your allocated pages are structured as:

 +---+---+---+
 |   |   |   |
 +---+---+---+
 |   |   |   |
 +---+---+---+
 |   |   |   |
 +---+---+---+
 |   |   |
 +---+---+

 (except taller in this specific example).  If you hit the pixels in
 those quads, you'll be fine.

 
  There are other places where we memset an entire BO using bo->size.  For
  example, your INTEL_DEBUG=shader_time code does exactly that (though it
  isn't tiled).
 
  Could we change libdrm to set bo->size to the actual usable size of the
  buffer, rather than the bucket size?

 The pages containing pixels you asked for go to 122880, and the BO is
 131072, but the pixels you asked for have a maximum linear address of
 384*300=115200.  Which value are you thinking is the actual usable
 size?  We certainly shouldn't have been memsetting more pixels than
 115200.


 Why not? I understand that it's not useful to touch pixels beyond 115200,
 but from the data structure, we were given 131072 bytes to work with. Why
 shouldn't memsetting the entire allocated space be a safe operation?

If you drm_intel_bo_map() and write 131072, you'd be fine.  If you
drm_intel_bo_map_gtt() on the tiled buffer and your fence readdresses
your writes beyond 131072, you're not fine.




Re: [Mesa-dev] [Mesa-stable] [PATCH] i965: Fix buffer overruns in MSAA MCS buffer clearing.

2014-04-15 Thread Courtney Goeltzenleuchter
On Tue, Apr 15, 2014 at 2:16 PM, Eric Anholt e...@anholt.net wrote:

 Courtney Goeltzenleuchter court...@lunarg.com writes:

  On Tue, Apr 15, 2014 at 1:18 PM, Eric Anholt e...@anholt.net wrote:
 
  Kenneth Graunke kenn...@whitecape.org writes:
 
   On 04/14/2014 05:33 PM, Eric Anholt wrote:
   This manifested as rendering failures or sometimes GPU hangs in
   compositors when they accidentally got MSAA visuals due to a bug in
 the
  X
   Server.  Today we decided that the problem in compositors was
 equivalent
   to a corruption bug we'd noticed recently in resizing MSAA-visual
   glxgears, and debugging got a lot easier.
  
   When we allocate our MCS MT, libdrm takes the size we request,
 aligns it
   to Y tile size (blowing it up from 300x300=900000 bytes to
  384*320=122880
   bytes, 30 pages), then puts it into a power-of-two-sized BO (131072
  bytes,
   32 pages).  Because it's Y tiled, we attach a 384-byte-stride fence
 to
  it.
   When we memset by the BO size in Mesa, between bytes 122880 and
 131072
  the
   data gets stored to the first 20 or so scanlines of each of the 3
 tiled
   pages in that row, even though only 2 of those pages were allocated
 by
   libdrm.
  
   What?
  
   I get that drm_intel_bo_alloc/drm_intel_bo_alloc_tiled might return a
   drm_intel_bo where bo->size is larger than what you asked for, due to
   the BO cache.  But...what you're saying is, it doesn't actually
  allocate
   enough pages to back the whole bo->size it gives you?  So, if you
  write
   bytes 0..(bo->size - 1), you'll randomly clobber memory in a way
 that's
   really difficult to detect?
 
  You have that many pages, really.  But you've attached a fence to it, so
  your allocated pages are structured as:
 
  +---+---+---+
  |   |   |   |
  +---+---+---+
  |   |   |   |
  +---+---+---+
  |   |   |   |
  +---+---+---+
  |   |   |
  +---+---+
 
  (except taller in this specific example).  If you hit the pixels in
  those quads, you'll be fine.
 
  
   There are other places where we memset an entire BO using bo->size.
  For
   example, your INTEL_DEBUG=shader_time code does exactly that (though
  it
   isn't tiled).
  
   Could we change libdrm to set bo->size to the actual usable size of
 the
   buffer, rather than the bucket size?
 
  The pages containing pixels you asked for go to 122880, and the BO is
  131072, but the pixels you asked for have a maximum linear address of
  384*300=115200.  Which value are you thinking is the actual usable
  size?  We certainly shouldn't have been memsetting more pixels than
  115200.
 
 
  Why not? I understand that it's not useful to touch pixels beyond 115200,
  but from the data structure, we were given 131072 bytes to work with. Why
  shouldn't memsetting the entire allocated space be a safe operation?

 If you drm_intel_bo_map() and write 131072, you'd be fine.  If you
 drm_intel_bo_map_gtt() on the tiled buffer and your fence readdresses
 your writes beyond 131072, you're not fine.


I'm curious, what would it have cost to reserve the pages necessary to
cover both cases?

The issue caused by this particular overwrite was hard to pin down. In our
test case the failure was intermittent. We could see that memory was
getting corrupted, but nothing else had run between the last time we
checked (when things were good) and the point where they went bad
(building the dlist in glxgears).

-- 
Courtney Goeltzenleuchter
LunarG


Re: [Mesa-dev] [Mesa-stable] [PATCH] i965: Fix buffer overruns in MSAA MCS buffer clearing.

2014-04-15 Thread Kenneth Graunke
On 04/15/2014 10:08 AM, Mike Stroyan wrote:
 I would go farther than requiring the returned bo->size to be covered by
 backing pages.
 There really should be backing pages for every page mapped by
 drm_intel_gem_bo_map.
 No mapped addresses should be affecting memory outside of an object's
 backing pages.
 
 If tiling can result in access to unallocated memory it might even lead
 to corruption of data in pages
 used by different processes.  That would break GL_ARB_robustness
 requirements.

It shouldn't, now that proper PPGTT support has landed in the kernel -
only data belonging to a single process is mapped into its address
space.  So, a process can accidentally read or trash other portions of
its own memory, but never any other memory.  This seems acceptable, as
it's how things work with normal CPU programs.

That said, with older kernels, where we had only Global GTT or the
aliasing PPGTT (using the PPGTT mechanism but storing the exact same
mappings as the GGTT), you could definitely see/stomp other people's
memory.  Which is awful, and one reason we think PPGTT support is so
important.

--Ken





Re: [Mesa-dev] [Mesa-stable] [PATCH] i965: Fix buffer overruns in MSAA MCS buffer clearing.

2014-04-15 Thread Ian Romanick
On 04/15/2014 03:50 PM, Kenneth Graunke wrote:
 On 04/15/2014 10:08 AM, Mike Stroyan wrote:
 I would go farther than requiring the returned bo->size to be covered by
 backing pages.
 There really should be backing pages for every page mapped by
 drm_intel_gem_bo_map.
 No mapped addresses should be affecting memory outside of an object's
 backing pages.

 If tiling can result in access to unallocated memory it might even lead
 to corruption of data in pages
 used by different processes.  That would break GL_ARB_robustness
 requirements.
 
 It shouldn't, now that proper PPGTT support has landed in the kernel -
 only data belonging to a single process is mapped into its address
 space.  So, a process can accidentally read or trash other portions of
 its own memory, but never any other memory.  This seems acceptable, as
 it's how things work with normal CPU programs.

It is fine... for now.  GL_ARB_robust_buffer_access makes stronger
guarantees about out-of-buffer accesses.  Reads outside a buffer (e.g.,
accessing too large an array element for an array contained in a UBO)
can only return data from elsewhere in the buffer or zero.  WebGL
implementers really want this.

I don't think that's relevant here because the fences only apply to CPU
access, yeah?

 That said, with older kernels, where we had only Global GTT or the
 aliasing PPGTT (using the PPGTT mechanism but storing the exact same
 mappings at the GGTT), you could definitely see/stomp other people's
 memory.  Which is awful, and one reason we think PPGTT support is so
 important.
 
 --Ken
 






Re: [Mesa-dev] [Mesa-stable] [PATCH] i965: Fix buffer overruns in MSAA MCS buffer clearing.

2014-04-15 Thread Chia-I Wu
On Wed, Apr 16, 2014 at 3:18 AM, Eric Anholt e...@anholt.net wrote:
 Kenneth Graunke kenn...@whitecape.org writes:

 On 04/14/2014 05:33 PM, Eric Anholt wrote:
 This manifested as rendering failures or sometimes GPU hangs in
 compositors when they accidentally got MSAA visuals due to a bug in the X
 Server.  Today we decided that the problem in compositors was equivalent
 to a corruption bug we'd noticed recently in resizing MSAA-visual
 glxgears, and debugging got a lot easier.

 When we allocate our MCS MT, libdrm takes the size we request, aligns it
 to Y tile size (blowing it up from 300x300=900000 bytes to 384*320=122880
 bytes, 30 pages), then puts it into a power-of-two-sized BO (131072 bytes,
 32 pages).  Because it's Y tiled, we attach a 384-byte-stride fence to it.
 When we memset by the BO size in Mesa, between bytes 122880 and 131072 the
 data gets stored to the first 20 or so scanlines of each of the 3 tiled
 pages in that row, even though only 2 of those pages were allocated by
 libdrm.

 What?

 I get that drm_intel_bo_alloc/drm_intel_bo_alloc_tiled might return a
 drm_intel_bo where bo->size is larger than what you asked for, due to
 the BO cache.  But...what you're saying is, it doesn't actually allocate
 enough pages to back the whole bo->size it gives you?  So, if you write
 bytes 0..(bo->size - 1), you'll randomly clobber memory in a way that's
 really difficult to detect?

 You have that many pages, really.  But you've attached a fence to it, so
 your allocated pages are structured as:

 +---+---+---+
 |   |   |   |
 +---+---+---+
 |   |   |   |
 +---+---+---+
 |   |   |   |
 +---+---+---+
 |   |   |
 +---+---+

 (except taller in this specific example).  If you hit the pixels in
 those quads, you'll be fine.


 There are other places where we memset an entire BO using bo->size.  For
 example, your INTEL_DEBUG=shader_time code does exactly that (though it
 isn't tiled).

 Could we change libdrm to set bo->size to the actual usable size of the
 buffer, rather than the bucket size?

 The pages containing pixels you asked for go to 122880, and the BO is
 131072, but the pixels you asked for have a maximum linear address of
 384*300=115200.  Which value are you thinking is the actual usable
 size?  We certainly shouldn't have been memsetting more pixels than
 115200.
384*320 is 122880.  It feels like bo->size could be 122880, and
131072 could be stored elsewhere in bo_gem.

With that change assumed, do you think it makes sense to add

if (tiling_mode != I915_TILING_NONE && bo->size % stride)
   fprintf(stderr, "bo size is not a multiple of stride\n");

to drm_intel_gem_bo_set_tiling_internal?  That is, emit a warning when

  drm_intel_gem_bo_map_gtt(bo);
  memset(bo->virtual, 0, bo->size);

is known to explode.







-- 
o...@lunarg.com


Re: [Mesa-dev] [Mesa-stable] [PATCH] i965: Fix buffer overruns in MSAA MCS buffer clearing.

2014-04-15 Thread Kenneth Graunke
On 04/15/2014 05:16 PM, Ian Romanick wrote:
 On 04/15/2014 03:50 PM, Kenneth Graunke wrote:
 On 04/15/2014 10:08 AM, Mike Stroyan wrote:
 I would go farther than requiring the returned bo->size to be covered by
 backing pages.
 There really should be backing pages for every page mapped by
 drm_intel_gem_bo_map.
 No mapped addresses should be affecting memory outside of an object's
 backing pages.

 If tiling can result in access to unallocated memory it might even lead
 to corruption of data in pages
 used by different processes.  That would break GL_ARB_robustness
 requirements.

 It shouldn't, now that proper PPGTT support has landed in the kernel -
 only data belonging to a single process is mapped into its address
 space.  So, a process can accidentally read or trash other portions of
 its own memory, but never any other memory.  This seems acceptable, as
 it's how things work with normal CPU programs.
 
 It is fine... for now.  GL_ARB_robust_buffer_access makes stronger
 guarantees about out-of-buffer accesses.  Reads outside a buffer (e.g.,
 accessing too large an array element for an array contained in a UBO)
 can only return data from elsewhere in the buffer or zero.  WebGL
 implementers really want this.
 
 I don't think that's relevant here because the fences only apply to CPU
 access, yeah?

Right.  This shouldn't affect that.






Re: [Mesa-dev] [Mesa-stable] [PATCH] i965: Fix buffer overruns in MSAA MCS buffer clearing.

2014-04-14 Thread Ian Romanick
On 04/14/2014 05:33 PM, Eric Anholt wrote:
 This manifested as rendering failures or sometimes GPU hangs in
 compositors when they accidentally got MSAA visuals due to a bug in the X
 Server.  Today we decided that the problem in compositors was equivalent
 to a corruption bug we'd noticed recently in resizing MSAA-visual
 glxgears, and debugging got a lot easier.
 
 When we allocate our MCS MT, libdrm takes the size we request, aligns it
 to Y tile size (blowing it up from 300x300=900000 bytes to 384*320=122880
  ^
Infinitesimal nit...   one too many zeros here.

 bytes, 30 pages), then puts it into a power-of-two-sized BO (131072 bytes,
 32 pages).  Because it's Y tiled, we attach a 384-byte-stride fence to it.
 When we memset by the BO size in Mesa, between bytes 122880 and 131072 the
 data gets stored to the first 20 or so scanlines of each of the 3 tiled
 pages in that row, even though only 2 of those pages were allocated by
 libdrm.  In the glxgears case, the missing 3rd page happened to
 consistently be the static VBO that got mapped right after the first MCS
 allocation, so corruption only appeared once window resize made us throw
 out the old MCS and then allocate the same BO to back the new MCS.

Oh man...  I didn't realize that bo->size wasn't necessarily the amount
of backing storage.  It makes sense... do we track the backed size
anywhere?  It seems possible to have similar problems elsewhere...

 Instead, just memset the amount of data we actually asked libdrm to
 allocate for, which will be smaller (more efficient) and not overrun.
 Thanks go to Kenneth for doing most of the hard debugging to eliminate a
 lot of the search space for the bug.
 
 Cc: 10.0 10.1 mesa-sta...@lists.freedesktop.org
 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77207
 ---
 
 Ken: how I eventually figured this out was I thought "well, I'm
 clearing the whole surface at the start of rendering glxgears, so I
 don't *really* need to memset because I'll be initializing the whole
 buffer during fallback glClear() anyway, so let me just drop the whole
 memset step to definitely eliminate that as a potential problem."  Huh,
 the problem's gone.
 
 The worst is I remember briefly thinking about this code when it was
 landing, and thought "nope, seems safe."
 
  src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)
 
 diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
 b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
 index 5996a1b..59700ed 100644
 --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
 +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
 @@ -1219,7 +1219,7 @@ intel_miptree_alloc_mcs(struct brw_context *brw,
  * Note: the clear value for MCS buffers is all 1's, so we memset to 0xff.
  */
 void *data = intel_miptree_map_raw(brw, mt->mcs_mt);
 -   memset(data, 0xff, mt->mcs_mt->region->bo->size);
 +   memset(data, 0xff, mt->mcs_mt->region->height *
 mt->mcs_mt->region->pitch);
 intel_miptree_unmap_raw(brw, mt->mcs_mt);
 mt->fast_clear_state = INTEL_FAST_CLEAR_STATE_CLEAR;
  
 
