* Jakub Jelinek:

> On Wed, Nov 03, 2021 at 05:28:02PM +0100, Florian Weimer wrote:
>> This function is similar to __gnu_Unwind_Find_exidx as used on arm.
>> It can be used to speed up the libgcc unwinder.
>
> I'm a little bit worried that this trades the speed of exceptions for
> speed of dlopen/dlclose and extra memory use in each process.
> I admit I haven't been paying close attention to how many shared libraries
> apps typically link against and how many dlopen/dlclose calls they do
> in the last decade and a half, but I'd think more applications don't use
> exceptions compared to apps that do use them, and many of those that do
> use them don't use them for really exceptional cases, so speeding those
> up is a good thing.

dlopen has many sources of quadratic behavior already, and many involve
chasing pointers.  The new data structure is very compact, so the new
work during dlopen does not show up prominently in profiles.

> So, I'd wonder, could this overhead be added lazily, when _dl_find_eh_frame
> is called for the first time just take the rtld lock, prepare anything you
> populate right now already from the process start up and every
> dlopen/dlclose before the first _dl_find_eh_frame call and only since then
> keep it updated on dlopen/dlclose?

I think it's possible to do this lazily (except the memory allocation).
But I don't want to do this unless we have performance numbers that
suggest it is actually required.

> Thus, for the expected majority of apps that aren't using exceptions at all
> nothing would change for dlopen/dlclose overhead, while all but the first
> _dl_find_eh_frame would be faster and with no locking?

One thing I'd like to do is to use the data structure in
_dl_find_dso_for_object, and that is actually called during dlopen to
determine the caller DSO.  _dl_find_dso_for_object can show up in
profiles of workloads with many dlopen calls, particularly if the
caller is an object that was loaded late, because the current
implementation then takes longer to find it.  _dl_find_dso_for_object
is also used in dlsym,
although we skip it if the caller passes an explicit handle (but
RTLD_DEFAULT, RTLD_NEXT, etc. definitely need it).
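
To illustrate the kind of lookup I have in mind, here is a minimal
sketch of an address-to-object search over a sorted array of mappings.
It is not the patch's actual data structure: struct range_entry,
find_range and the object_id field are hypothetical names, and the
sketch assumes one contiguous, non-overlapping range per object.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified record: one entry per loaded object.  */
struct range_entry
{
  uintptr_t start;   /* First address of the mapping.  */
  uintptr_t end;     /* One past the last address of the mapping.  */
  int object_id;     /* Stand-in for a link map pointer.  */
};

/* Return the entry whose range contains ADDR, or NULL if there is
   none.  ENTRIES must be sorted by start address and the ranges must
   not overlap.  */
static const struct range_entry *
find_range (const struct range_entry *entries, size_t count,
            uintptr_t addr)
{
  assert (entries != NULL || count == 0);

  size_t lo = 0;
  size_t hi = count;
  while (lo < hi)
    {
      size_t mid = lo + (hi - lo) / 2;
      if (addr < entries[mid].start)
        hi = mid;                /* ADDR is below this mapping.  */
      else if (addr >= entries[mid].end)
        lo = mid + 1;            /* ADDR is above this mapping.  */
      else
        return &entries[mid];    /* start <= ADDR < end.  */
    }
  return NULL;
}
```

With a layout like this, the cost of a lookup grows with the logarithm
of the number of loaded objects and does not depend on how late the
calling object was loaded.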

We can also replace the soname and file identity lookup with a hash
table.  *That* will definitely recover any losses from
_dl_find_eh_frame_update.  In my profiles strcmp always shows up higher
than _dl_find_eh_frame_update.
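
As a rough sketch of the soname part only (the file identity lookup is
not covered), a small open-addressing table keyed by soname could look
like the following.  All names here (soname_table, soname_insert,
soname_lookup, TABLE_SIZE) are hypothetical, the capacity is fixed for
brevity, and FNV-1a is simply one convenient hash function, not
necessarily what glibc would use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TABLE_SIZE 64   /* Hypothetical fixed capacity, a power of two.  */

struct soname_slot
{
  const char *soname;   /* NULL marks an empty slot.  */
  int object_id;        /* Stand-in for a link map pointer.  */
};

struct soname_table
{
  struct soname_slot slots[TABLE_SIZE];
  size_t used;
};

/* 32-bit FNV-1a hash of a NUL-terminated string.  */
static uint32_t
hash_name (const char *s)
{
  uint32_t h = 2166136261u;
  for (; *s != '\0'; ++s)
    {
      h ^= (unsigned char) *s;
      h *= 16777619u;
    }
  return h;
}

/* Insert or update SONAME.  Return 0 on success, -1 if the table is
   full.  One slot always stays empty so that probing terminates.  */
static int
soname_insert (struct soname_table *t, const char *soname, int object_id)
{
  assert (soname != NULL);
  size_t i = hash_name (soname) & (TABLE_SIZE - 1);
  while (t->slots[i].soname != NULL)
    {
      if (strcmp (t->slots[i].soname, soname) == 0)
        {
          t->slots[i].object_id = object_id;
          return 0;
        }
      i = (i + 1) & (TABLE_SIZE - 1);   /* Linear probing.  */
    }
  if (t->used >= TABLE_SIZE - 1)
    return -1;
  t->slots[i].soname = soname;
  t->slots[i].object_id = object_id;
  ++t->used;
  return 0;
}

/* Return the object id registered for SONAME, or -1 if not found.  */
static int
soname_lookup (const struct soname_table *t, const char *soname)
{
  assert (soname != NULL);
  size_t i = hash_name (soname) & (TABLE_SIZE - 1);
  while (t->slots[i].soname != NULL)
    {
      if (strcmp (t->slots[i].soname, soname) == 0)
        return t->slots[i].object_id;
      i = (i + 1) & (TABLE_SIZE - 1);
    }
  return -1;
}
```

strcmp is still called, but only for the few slots on the probe
sequence instead of once per loaded object.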

Thanks,
Florian
