I was never able to exactly reproduce the problem, but I think I came
close.  My attempt is in a temporary branch called "yamwb" (Yet Another
MainWin Bug) on GitHub:

  git clone https://github.com/VirtualGL/virtualgl.git
  cd virtualgl
  git checkout yamwb
  git checkout HEAD~1

The penultimate commit in that branch
(https://github.com/VirtualGL/virtualgl/commit/ea45a4121aeb26e70ff3da3c0dcf4af1f156e031)
creates a shared library with _init() and _fini() functions, calls
XGetSelectionOwner() in the body of _fini(), and links the shared lib
with GLXspheres.  When running this modified version of GLXspheres in
VGL, it doesn't actually lock up, but it does demonstrate a failure in
CriticalSection::lock() that occurs when one of VirtualGL's interposed
functions is called from another shared lib's global destructor.
At that point in the execution, we can't rely on any mutexes except the
global one, and that could very well be what's causing MainWin to lock
up.

The issue is that, by the time the shared lib's destructor
function is called, the GlobalCleanup destructor in VGL's faker has
already been called (and if the faker had a global destructor function,
it would have already been called as well).  At that point, it is
difficult or impossible for VGL to operate with any semblance of
normalcy, particularly given that mutexes don't work properly.  So how
is VGL supposed to sanely handle an application calling an interposed
X11 or OpenGL function after the interposer itself has been essentially
shut down?

If we're lucky and this is just confined to the XCB interposer, meaning
that fixing it is a simple matter of disabling said interposer, then do

  git checkout yamwb

to see my proposed solution
(https://github.com/VirtualGL/virtualgl/commit/5328fe5c0d725b4b04c926aaf53eb657548a9028).

Symptomatically, what happens is as follows:

(1) GLXspheres returns from main().
(2) GlobalCleanup::~GlobalCleanup() is called in the faker.
(3) _fini() is called in the shared library.
(4) _fini() calls XGetSelectionOwner() [not interposed].
(5) XGetSelectionOwner() calls xcb_poll_for_event() [interposed].
(6) The interposed xcb_poll_for_event() function attempts to access
fconfig to read the status of fconfig.fakeXCB.
(7) fconfig_instance() attempts to lock the mutex guarding its singleton
instance.
(8) The CriticalSection lock fails and attempts to throw an error.

NOTE: Step (8) is where MainWin locks up, but my modified version of
GLXspheres doesn't.  Rather, the error is caught by the catch() handler
in xcb_poll_for_event(), safeExit() is called, and the application exits
without returning from XGetSelectionOwner() or _fini().  But that's
still incorrect behavior, because _fini() never returns.
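
For anyone who wants to reason about the ordering without building the
branch, steps (2) through (8) can be modeled in a few lines of
self-contained C++.  This is only a sketch, not the faker's code:
ModelCriticalSection, modelConfigFakeXCB(), modelPollForEvent(), and
runShutdownModel() are hypothetical stand-ins for CriticalSection,
fconfig_instance(), the interposed xcb_poll_for_event(), and the process
shutdown sequence, and the "destroyed" flag is an assumption that
substitutes for whatever actually makes the real lock fail.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

namespace shutdown_model {

// Hypothetical stand-in for the faker's CriticalSection.  Once marked
// destroyed, lock() throws, mimicking the failure in step (8).
class ModelCriticalSection
{
  public:
    void lock()
    {
      if(destroyed) throw std::runtime_error("lock failed");
    }
    void unlock() {}
    bool destroyed = false;
};

static ModelCriticalSection configMutex;
static std::vector<std::string> trace;

// Stand-in for fconfig_instance(): takes the mutex before handing out
// the singleton's fakeXCB setting [step (7)].
static bool &modelConfigFakeXCB()
{
  static bool fakeXCB = true;
  configMutex.lock();
  configMutex.unlock();
  return fakeXCB;
}

// Stand-in for the interposed xcb_poll_for_event() [steps (5)-(6)].
// Returns false at the point where the real faker would call safeExit().
static bool modelPollForEvent()
{
  try
  {
    trace.push_back("poll: read fakeXCB");
    if(modelConfigFakeXCB()) trace.push_back("poll: interposed");
    return true;
  }
  catch(const std::exception &e)
  {
    trace.push_back(std::string("poll: ") + e.what());
    return false;
  }
}

// Replays steps (2) through (8) in order and returns the recorded trace.
inline std::vector<std::string> runShutdownModel()
{
  trace.clear();
  configMutex.destroyed = false;
  trace.push_back("GlobalCleanup destructor");  // step (2)
  configMutex.destroyed = true;                 // mutexes no longer usable
  trace.push_back("_fini");                     // step (3)
  bool finiCompleted = modelPollForEvent();     // steps (4)-(8)
  assert(!finiCompleted);  // the model's _fini() never finishes normally
  return trace;
}

}  // namespace shutdown_model
```

Calling shutdown_model::runShutdownModel() yields four trace entries, the
last of which is "poll: lock failed".  That corresponds to the catch()
handler firing in place of a normal return to _fini().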

What the proposed solution does:

-- It sets the faker level to 1 within the body of
GlobalCleanup::~GlobalCleanup(), effectively disabling any further XCB
interposition.
-- It re-arranges the if() statements within faker-xcb.cpp so that the
faker level is checked prior to attempting to access the FakerConfig
singleton.

That eliminates the problem with my test application, but you'll have to
tell me whether it fixes the problem with MainWin or not.
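
To illustrate why the order of the checks matters, here is a minimal
sketch of the two orderings.  It is not the code from the commit above:
the fix_model namespace, the fakerLevel and mutexUsable variables, and
the shouldInterposeOld()/shouldInterposeNew() helpers are hypothetical
names made up for this example, and the throwing configInstance() is an
assumption that stands in for the failing CriticalSection lock.

```cpp
#include <stdexcept>

namespace fix_model {

// Hypothetical model state.  A nonzero faker level means "do not
// interpose; pass the call straight through."
static int fakerLevel = 0;
static bool mutexUsable = true;

struct Config { bool fakeXCB = true; };

// Stand-in for the FakerConfig singleton accessor.  It throws once the
// mutex guarding the singleton can no longer be locked.
static Config &configInstance()
{
  if(!mutexUsable) throw std::runtime_error("lock failed");
  static Config instance;
  return instance;
}

// Stand-in for GlobalCleanup::~GlobalCleanup() with the proposed change:
// raise the faker level as part of shutting down.
static void globalCleanup()
{
  fakerLevel = 1;
  mutexUsable = false;
}

// Old ordering: the singleton is touched first, so this throws after
// globalCleanup() even though the faker level says "pass through."
static bool shouldInterposeOld()
{
  return configInstance().fakeXCB && fakerLevel == 0;
}

// New ordering: the faker level is checked first, and short-circuit
// evaluation keeps the singleton from being touched after shutdown.
static bool shouldInterposeNew()
{
  return fakerLevel == 0 && configInstance().fakeXCB;
}

}  // namespace fix_model
```

Before fix_model::globalCleanup() runs, both helpers return true.
Afterward, shouldInterposeNew() returns false without touching the
singleton, whereas shouldInterposeOld() throws.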

DRC


On 7/11/16 4:34 PM, Nathan Kidd wrote:
> On 11/07/16 03:54 PM, DRC wrote:
>> I need to be able to reproduce this before I can fix it, but my attempts
>> to reproduce it by adding a destructor to GLXspheres and calling
>> XGetSelectionOwner() within the destructor failed.  I also tried calling
>> xcb_poll_for_event() within the body of my destructor function, but it
>> just returned NULL without doing anything.
> 
> Does it make a difference if you try to do X things from a separate SO's
> _fini()?
> 
>> Please help me understand exactly what's going on here and how I can
>> reproduce the problem without using MainWin.
> 
> Ha, "reproduce the problem without using MainWin", the story of my life.
>  It's going to take some time before I'll get a chance to try.
