On 2016/01/16 15:10, Philip Guenther wrote:
> On Sun, 27 Dec 2015, Stuart Henderson wrote:
> > Widening the audience to tech in case anyone with an idea missed it on 
> > ports: it seems some people are having a lot more trouble than just 
> > needing to restart build 2 or 3 times.
> ...
> > To replicate
> > 
> > cd /usr/ports/x11/vlc
> > make fake
> > cd /usr/ports/pobj/vlc-2.2.1/vlc-2.2.1
> > PATH="/usr/ports/pobj/vlc-2.2.1/fake-amd64/usr/local/bin:$PATH" 
> > LD_LIBRARY_PATH="/usr/ports/pobj/vlc-2.2.1/fake-amd64/usr/local/lib:$LD_LIBRARY_PATH"
> >  /usr/ports/pobj/vlc-2.2.1/fake-amd64/usr/local/lib/vlc/vlc-cache-gen -f 
> > /usr/ports/pobj/vlc-2.2.1/fake-amd64/usr/local/lib/vlc/plugins
> > 
> > repeat the last step until it crashes, e.g.
> > 
> > /usr/ports/pobj/vlc-2.2.1/vlc-2.2.1/bin/.libs/vlc-cache-gen:/usr/local/lib/libebml.so.3.0:
> >  undefined symbol '_ZNSs4_Rep10_M_destroyERKSaIcE'
> > lazy binding failed!
> > Segmentation fault (core dumped) 
> >
> > or
> >
> > Bus error (core dumped)
> 
> Okay.

Thanks for looking at this, which is obviously a tricky area.

> The problem is that vlc-cache-gen dlopen()s a plugin that has a dependency 
> (libgio) which is marked nodelete.  When the plugin is dlclose()d, that 
> dependency is correctly kept around...but other parts of the load group 
> *are* unmapped and their elf_object_t structures freed.
> 
> You Can't Do That: the objects that make up a load group must be deleted 
> all at once or not at all, as they may have resolved relocations to each 
> other and there are certainly pointers between their elf_object_t 
> structures via the grpref_list, grpsym_list, and load_object members.
> 
> So, once a nodelete object is brought in, the entire load group needs to 
> be locked in.  The diff below does this by changing the nodelete bits to 
> add an "open" reference on the root of the load group that pulled in the 
> nodelete object instead of a "child" reference on the nodelete itself.  
> (The type of reference was always wrong and may have permitted nodelete 
> modules being deleted even in simpler cases.)

This makes complete sense.
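
To make the failure mode concrete, here is a toy model in C. It is not
ld.so code: the struct names, fields, and the PIN_OBJECT/PIN_GROUP
policies are hypothetical stand-ins for "child reference on the nodelete
object" versus "open reference on the load group root", and the
reference counting is heavily simplified.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_GROUP_MAX 8

/* Hypothetical, simplified stand-in for one object in a load group. */
struct toy_obj {
	int nodelete;	/* stands in for DF_1_NODELETE */
	int refcount;	/* references from within the group */
	int mapped;	/* 1 while the object is still mapped */
};

/* Hypothetical stand-in for a load group and its root's open count. */
struct toy_group {
	int opencount;
	size_t nobjs;
	struct toy_obj objs[TOY_GROUP_MAX];
};

enum toy_policy { PIN_OBJECT, PIN_GROUP };

/* Model a dlopen() that brings in n objects as one load group. */
static void
toy_load(struct toy_group *g, const int *nodelete, size_t n,
    enum toy_policy p)
{
	size_t i;
	int pinned = 0;

	assert(n <= TOY_GROUP_MAX);
	g->opencount = 1;	/* the dlopen() itself */
	g->nobjs = n;
	for (i = 0; i < n; i++) {
		g->objs[i].nodelete = nodelete[i];
		g->objs[i].refcount = 1;
		g->objs[i].mapped = 1;
		if (!nodelete[i])
			continue;
		if (p == PIN_OBJECT) {
			/* old behaviour: extra reference on the object */
			g->objs[i].refcount++;
		} else if (!pinned) {
			/* new behaviour: one extra open on the group */
			g->opencount++;
			pinned = 1;
		}
	}
}

/* Model a dlclose(): drop one open, unload when none remain. */
static void
toy_close(struct toy_group *g)
{
	size_t i;

	if (--g->opencount > 0)
		return;
	for (i = 0; i < g->nobjs; i++)
		if (--g->objs[i].refcount == 0)
			g->objs[i].mapped = 0;
}

/* Number of objects in the group that are still mapped. */
static int
toy_mapped_count(const struct toy_group *g)
{
	size_t i;
	int n = 0;

	for (i = 0; i < g->nobjs; i++)
		n += g->objs[i].mapped;
	return n;
}
```

In this model, loading three objects with only the middle one flagged
and then closing once leaves one of three objects mapped under
PIN_OBJECT (the partially torn-down group described above), while
PIN_GROUP leaves all three mapped.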

> Ports question: does libgio still need to be marked nodelete, or was that 
> just from when pthread_atfork() handlers weren't unregistered on 
> dlclose()?

I'm unsure about libgio, but at least gobject (which is another
implicated library) apparently does: "Since the type system does not
support reloading its data and assumes that libgobject remains loaded
for the lifetime of the process, we should link libgobject with a flag
indicating that it can't be unloaded."

https://mail.gnome.org/archives/commits-list/2014-April/msg02316.html
https://bugzilla.gnome.org/show_bug.cgi?id=707298

> Ok?

Yes, OK.

There's a small side-effect: with LD_DEBUG, now only the first object
from the load group is reported as being 'nodelete'.

$ LD_DEBUG=1 /usr/local/lib/vlc/vlc-cache-gen -f . 2>&1 | grep nodel | uniq
objname /usr/lib/libpthread.so.20.1 is nodelete
objname /usr/local/lib/libgthread-2.0.so.4200.2 is nodelete

I don't know if that's considered important, but if it's desirable to
keep reporting every nodelete object, then this diff relative to yours
would do so:

$ LD_DEBUG=1 /usr/local/lib/vlc/vlc-cache-gen -f . 2>&1 | grep nodel | uniq
objname /usr/lib/libpthread.so.20.1 is nodelete
objname /usr/local/lib/libgthread-2.0.so.4200.2 is nodelete
objname /usr/local/lib/libgmodule-2.0.so.4200.2 is nodelete
objname /usr/local/lib/libgio-2.0.so.4200.2 is nodelete
objname /usr/local/lib/libgobject-2.0.so.4200.2 is nodelete
objname /usr/local/lib/libglib-2.0.so.4200.2 is nodelete

--- resolve.c,  Mon Jan 18 12:38:27 2016
+++ resolve.c   Mon Jan 18 12:47:03 2016
@@ -59,8 +59,12 @@ _dl_add_object(elf_object_t *object)
         * in needs to be kept around forever, so add a reference there.
         */
        if (object->obj_flags & DF_1_NODELETE &&
-           (object->load_object->status & STAT_NODELETE) == 0) {
+           (object->status & STAT_NODELETE) == 0) {
+               object->status |= STAT_NODELETE;
                DL_DEB(("objname %s is nodelete\n", object->load_name));
+       }
+       if (object->obj_flags & DF_1_NODELETE &&
+           (object->load_object->status & STAT_NODELETE) == 0) {
                object->load_object->opencount++;
                object->load_object->status |= STAT_NODELETE;
        }
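
As a rough standalone sketch of what the two checks do together, the
following models them outside ld.so. The struct toy_object type, the
toy_add_object() name, the printf() in place of DL_DEB(), and the
numeric flag values are all assumptions made for illustration; only the
shape of the two conditions is taken from the diff.

```c
#include <assert.h>
#include <stdio.h>

/* Illustrative values only; the real definitions live in the ld.so headers. */
#define DF_1_NODELETE	0x08
#define STAT_NODELETE	0x40

/* Hypothetical cut-down stand-in for elf_object_t. */
struct toy_object {
	const char *load_name;
	unsigned int obj_flags;
	unsigned int status;
	int opencount;
	struct toy_object *load_object;	/* root of the load group */
};

/*
 * Mirrors the two checks from the diff.
 * Returns 1 if the "is nodelete" line was printed for this object.
 */
static int
toy_add_object(struct toy_object *object)
{
	int reported = 0;

	assert(object->load_object != NULL);

	/* per-object marking, so each nodelete object is reported once */
	if (object->obj_flags & DF_1_NODELETE &&
	    (object->status & STAT_NODELETE) == 0) {
		object->status |= STAT_NODELETE;
		printf("objname %s is nodelete\n", object->load_name);
		reported = 1;
	}
	/* per-group pinning, so the root gains at most one open reference */
	if (object->obj_flags & DF_1_NODELETE &&
	    (object->load_object->status & STAT_NODELETE) == 0) {
		object->load_object->opencount++;
		object->load_object->status |= STAT_NODELETE;
	}
	return reported;
}
```

Adding two flagged objects that share one unflagged root prints one line
per object and raises the root's opencount exactly once. One thing the
sketch makes visible: if the flagged object is its own load_object, the
first check sets the status bit before the second check tests it, so the
open reference is not taken in this model. Whether that case matters in
ld.so itself is not something this toy establishes.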

> Philip Guenther
> 
> 
> Index: resolve.c
> ===================================================================
> RCS file: /data/src/openbsd/src/libexec/ld.so/resolve.c,v
> retrieving revision 1.69
> diff -u -p -r1.69 resolve.c
> --- resolve.c 2 Nov 2015 07:02:53 -0000       1.69
> +++ resolve.c 16 Jan 2016 23:06:46 -0000
> @@ -54,12 +54,15 @@ elf_object_t *_dl_loading_object;
>  void
>  _dl_add_object(elf_object_t *object)
>  {
> -     /* if a .so is marked nodelete, then add a reference */
> +     /*
> +      * If a .so is marked nodelete, then the entire load group that it's
> +      * in needs to be kept around forever, so add a reference there.
> +      */
>       if (object->obj_flags & DF_1_NODELETE &&
> -         (object->status & STAT_NODELETE) == 0) {
> +         (object->load_object->status & STAT_NODELETE) == 0) {
>               DL_DEB(("objname %s is nodelete\n", object->load_name));
> -             object->refcount++;
> -             object->status |= STAT_NODELETE;
> +             object->load_object->opencount++;
> +             object->load_object->status |= STAT_NODELETE;
>       }
>  
>       /*
