CC'ing Piglit.

On 09/14/2013 08:54 AM, Paul Berry wrote:
I'm investigating a failure in spec/OES_fixed_point/attribute-arrays,
specifically the command line
"/home/pberry/.platform/piglit-mesa/piglit/build/bin/oes_fixed_point-attribute-arrays
-auto -fbo".  It's segfaulting during piglit/waffle initialization due to
Mesa accessing freed memory.  This only started happening for me recently;
however, I suspect that's because the access to freed memory makes it a
heisenbug, and the root cause has probably been around for a long time.

What's interesting about this test is that it's a GLES1 test being run in
-fbo mode, which means that piglit first starts initializing things
assuming it's going to run the test with FBOs, then at some point it
figures out that it can't (because FBOs are unsupported), so it tears down
its configuration and starts a new configuration to test using a window.

While establishing the new configuration, waffle calls eglMakeCurrent().
Deep in the bowels of Mesa's implementation of this function, it decides
that it needs to flush the context that was previously current.  But that
context refers to data structures that were freed when piglit tore down its
old configuration (specifically, it refers to brw->bufmgr, which was freed
in response to a call to eglTerminate()).

I've been studying the egl calls made by piglit/waffle during this test and
I believe they look like this (I may be missing a few but I think I found
most of them):

Initial setup:
- eglGetDisplay()
- eglInitialize() (causes intel_init_bufmgr() to be called, which creates
bufmgr 1)
- eglQueryString()
- eglChooseConfig()
- eglBindAPI()
- eglCreateContext() (causes brwCreateContext() to be called, which creates
context 1)
- eglGetConfigAttrib()
- eglCreateWindowSurface()
- eglMakeCurrent()
Initial teardown:
- eglDestroySurface()
- eglDestroyContext() (interestingly, does not cause intelDestroyContext to
be called, perhaps because the context is still current?)
- eglTerminate() (causes intelDestroyScreen() to be called, which frees
bufmgr 1)
Second setup:
- eglGetDisplay()
- eglInitialize() (causes intel_init_bufmgr() to be called, which creates
bufmgr 2)
- eglQueryString()
- eglChooseConfig()
- eglBindAPI()
- eglCreateContext() (causes brwCreateContext() to be called, which creates
context 2)
- eglGetConfigAttrib()
- eglCreateWindowSurface()
- eglMakeCurrent() (at this point Mesa tries to flush context 1, which
segfaults because the flush accesses bufmgr 1, which has already been
freed)

So, my questions are:
- Does it look like piglit/waffle is making an allowed sequence of EGL
calls?  (In other words, is the bug in Mesa or piglit/waffle?)
- If the bug is in Mesa, what should be happening instead?  I assume that
at some point Mesa should have made the current context non-current (and
destroyed it, perhaps), but I'm not sure when that should have happened,
nor what code should have been responsible for doing it.

Thanks in advance, Chad.  I hope you're enjoying your business travel!

The sequence of EGL calls is legal. The bug is in Mesa. After discovering
the bug many months ago, I posted a test to the Piglit list, but it was
ignored and then forgotten. (Gerrit please!) I'll repost the test in the
next day or so after rebasing it.

The comments [1] in my test explain why the sequence of EGL calls is legal.

[1] http://cgit.freedesktop.org/~chadversary/piglit/tree/tests/egl/spec/egl-1.4/egl-terminate-then-unbind-context.c?h=egl-terminate-then-unbind#n26

If I correctly understand the EGL spec quote above, the call to
eglMakeCurrent that currently segfaults should instead flush the
queued-to-be-destroyed context and then promptly destroy it.
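
To make that expected behavior concrete, here is a toy model in plain C. It
is only a sketch under stated assumptions, not Mesa's code and not a proposed
patch: every toy_* name is hypothetical, and the reference count on the
bufmgr is just one possible way to keep it alive until the old context has
been flushed. It makes no real EGL calls.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not Mesa or EGL types. */
struct toy_bufmgr {
    int refcount;   /* the display holds one reference while initialized */
    bool freed;
};

struct toy_context {
    struct toy_bufmgr *bufmgr;
    bool destroy_pending;   /* destroy requested while still current */
    bool destroyed;
    int flush_count;
};

static struct toy_context *toy_current = NULL;

static void toy_bufmgr_unref(struct toy_bufmgr *b)
{
    if (--b->refcount == 0)
        b->freed = true;
}

/* Models eglCreateContext(): the context pins the bufmgr (assumption). */
static void toy_context_init(struct toy_context *ctx, struct toy_bufmgr *b)
{
    ctx->bufmgr = b;
    ctx->destroy_pending = false;
    ctx->destroyed = false;
    ctx->flush_count = 0;
    b->refcount++;
}

static void toy_context_release(struct toy_context *ctx)
{
    ctx->destroyed = true;
    toy_bufmgr_unref(ctx->bufmgr);
    ctx->bufmgr = NULL;
}

/* Models eglDestroyContext(): deferred while the context is current. */
static void toy_destroy_context(struct toy_context *ctx)
{
    if (ctx == toy_current)
        ctx->destroy_pending = true;
    else
        toy_context_release(ctx);
}

/* Models eglTerminate(): only the display's own reference is dropped. */
static void toy_terminate(struct toy_bufmgr *b)
{
    toy_bufmgr_unref(b);
}

/* Models eglMakeCurrent(): flush the old context, then finish any
 * destruction that was queued while it was current. */
static void toy_make_current(struct toy_context *next)
{
    struct toy_context *prev = toy_current;

    if (prev == next)
        return;
    if (prev != NULL) {
        assert(!prev->bufmgr->freed);   /* the flush needs a live bufmgr */
        prev->flush_count++;
    }
    toy_current = next;
    if (prev != NULL && prev->destroy_pending)
        toy_context_release(prev);
}
```

Replaying Paul's sequence against this model (init context 1 on bufmgr 1,
make it current, destroy it, terminate, then init context 2 on bufmgr 2 and
make it current) leaves bufmgr 1 alive across the terminate step. The second
toy_make_current() flushes context 1 once and only then releases it, along
with bufmgr 1.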
