On Sun, 27 Dec 2015, Stuart Henderson wrote:
> Widening the audience to tech in case anyone with an idea missed it on 
> ports: it seems some people are having a lot more trouble than just 
> needing to restart build 2 or 3 times.
...
> To replicate
> 
> cd /usr/ports/x11/vlc
> make fake
> cd /usr/ports/pobj/vlc-2.2.1/vlc-2.2.1
> PATH="/usr/ports/pobj/vlc-2.2.1/fake-amd64/usr/local/bin:$PATH" 
> LD_LIBRARY_PATH="/usr/ports/pobj/vlc-2.2.1/fake-amd64/usr/local/lib:$LD_LIBRARY_PATH"
>  /usr/ports/pobj/vlc-2.2.1/fake-amd64/usr/local/lib/vlc/vlc-cache-gen -f 
> /usr/ports/pobj/vlc-2.2.1/fake-amd64/usr/local/lib/vlc/plugins
> 
> repeat the last step until it crashes, e.g.
> 
> /usr/ports/pobj/vlc-2.2.1/vlc-2.2.1/bin/.libs/vlc-cache-gen:/usr/local/lib/libebml.so.3.0:
>  undefined symbol '_ZNSs4_Rep10_M_destroyERKSaIcE'
> lazy binding failed!
> Segmentation fault (core dumped) 
>
> or
>
> Bus error (core dumped)

Okay.

The problem is that vlc-cache-gen dlopen()s a plugin that has a dependency 
(libgio) which is marked nodelete.  When the plugin is dlclose()d, that 
dependency is correctly kept around...but other parts of the load group 
*are* unmapped and their elf_object_t structures freed.

You Can't Do That: the objects that make up a load group must be deleted 
all at once or not at all, as they may have resolved relocations to each 
other and there are certainly pointers between their elf_object_t 
structures via the grpref_list, grpsym_list, and load_object members.

So, once a nodelete object is brought in, the entire load group needs to 
be locked in.  The diff below does this by changing the nodelete bits to 
add an "open" reference on the root of the load group that pulled in the 
nodelete object instead of a "child" reference on the nodelete itself.  
(The type of reference was always wrong and may have allowed nodelete 
objects to be deleted even in simpler cases.)
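
To make the counting concrete, here is a toy model of the patched check.
It is only a sketch, not ld.so code: struct toy_object, the TOY_* flags,
and the toy_* functions are all made up for illustration, and the rule
"a group may be unloaded only when its root's opencount is zero" is a
simplifying assumption about how the reference is consumed.

```c
#include <stddef.h>

#define TOY_NODELETE_FLAG 0x1   /* hypothetical stand-in for DF_1_NODELETE */
#define TOY_STAT_NODELETE 0x2   /* hypothetical stand-in for STAT_NODELETE */

/* Hypothetical, heavily reduced stand-in for elf_object_t. */
struct toy_object {
	const char *name;
	int obj_flags;
	int status;
	int opencount;                  /* "open" references on a group root */
	struct toy_object *load_object; /* root of this object's load group */
};

/* Mirrors the patched logic: pin the group root, once per group. */
static void
toy_add_object(struct toy_object *object)
{
	struct toy_object *root = object->load_object;

	if ((object->obj_flags & TOY_NODELETE_FLAG) &&
	    (root->status & TOY_STAT_NODELETE) == 0) {
		root->opencount++;
		root->status |= TOY_STAT_NODELETE;
	}
}

/* Assumption: the whole group goes away only when the root is unopened. */
static int
toy_group_unloadable(const struct toy_object *root)
{
	return root->opencount == 0;
}

/* A plugin is opened, pulls in a nodelete dependency, then is closed. */
static int
toy_demo(void)
{
	struct toy_object plugin = { "plugin", 0, 0, 0, NULL };
	struct toy_object dep = { "nodelete-dep", TOY_NODELETE_FLAG, 0, 0, NULL };

	plugin.load_object = &plugin;   /* the plugin is its own group root */
	dep.load_object = &plugin;      /* the dependency joins that group */

	plugin.opencount++;             /* models dlopen() of the plugin */
	toy_add_object(&plugin);
	toy_add_object(&dep);           /* pins the root */
	toy_add_object(&dep);           /* no-op: the root is already pinned */
	plugin.opencount--;             /* models dlclose() of the plugin */

	return toy_group_unloadable(&plugin);
}
```

In this model toy_demo() returns 0: the extra open reference on the root
survives the matching close, so the plugin and its nodelete dependency
stay mapped together instead of being torn apart.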


Ports question: does libgio still need to be marked nodelete, or was that 
just from when pthread_atfork() handlers weren't unregistered on 
dlclose()?


Ok?

Philip Guenther


Index: resolve.c
===================================================================
RCS file: /data/src/openbsd/src/libexec/ld.so/resolve.c,v
retrieving revision 1.69
diff -u -p -r1.69 resolve.c
--- resolve.c   2 Nov 2015 07:02:53 -0000       1.69
+++ resolve.c   16 Jan 2016 23:06:46 -0000
@@ -54,12 +54,15 @@ elf_object_t *_dl_loading_object;
 void
 _dl_add_object(elf_object_t *object)
 {
-       /* if a .so is marked nodelete, then add a reference */
+       /*
+        * If a .so is marked nodelete, then the entire load group that it's
+        * in needs to be kept around forever, so add a reference there.
+        */
        if (object->obj_flags & DF_1_NODELETE &&
-           (object->status & STAT_NODELETE) == 0) {
+           (object->load_object->status & STAT_NODELETE) == 0) {
                DL_DEB(("objname %s is nodelete\n", object->load_name));
-               object->refcount++;
-               object->status |= STAT_NODELETE;
+               object->load_object->opencount++;
+               object->load_object->status |= STAT_NODELETE;
        }
 
        /*
