> Date: Sat, 20 Feb 2021 18:23:26 +0100
> From: Otto Moerbeek <[email protected]>
> Cc: [email protected], [email protected]
> Content-Type: text/plain; charset=us-ascii
> Content-Disposition: inline
>
> On Fri, Feb 19, 2021 at 05:29:31PM +0100, Mark Kettenis wrote:
>
> > > Date: Fri, 19 Feb 2021 16:43:10 +0100
> > > From: Otto Moerbeek <[email protected]>
> > >
> > > On Fri, Feb 19, 2021 at 01:06:43PM +0100, Otto Moerbeek wrote:
> > >
> > > > On Fri, Feb 19, 2021 at 12:45:58PM +0100, Mark Kettenis wrote:
> > > >
> > > > > > Date: Fri, 19 Feb 2021 10:57:30 +0100
> > > > > > From: Otto Moerbeek <[email protected]>
> > > > > >
> > > > > > Hi,
> > > > > >
> > > > > > working on PowerDNS Recursor, once in a while I'm seeing:
> > > > > >
> > > > > > #0 0x000009fd67ef09dc in
> > > > > > libunwind::UnwindInfoSectionsCache::CacheTree_RB_INSERT_COLOR
> > > > > > (this=<optimized out>,
> > > > > > head=0x9fd67efc8e8 <libunwind::uwis_cache+8>, elm=0x9fca04be900)
> > > > > > at
> > > > > > /usr/src/gnu/lib/libcxxabi/../../../gnu/llvm/libunwind/src/AddressSpace.hpp:243
> > > > > > 243 RB_GENERATE(CacheTree, CacheItem, entry, CacheCmp);
> > > > > > [Current thread is 1 (process 349420)]
> > > > > > (gdb) bt
> > > > > > #0 0x000009fd67ef09dc in
> > > > > > libunwind::UnwindInfoSectionsCache::CacheTree_RB_INSERT_COLOR
> > > > > > (this=<optimized out>,
> > > > > > head=0x9fd67efc8e8 <libunwind::uwis_cache+8>, elm=0x9fca04be900)
> > > > > > at
> > > > > > /usr/src/gnu/lib/libcxxabi/../../../gnu/llvm/libunwind/src/AddressSpace.hpp:243
> > > > > > #1 0x000009fd67eeddef in
> > > > > > libunwind::UnwindInfoSectionsCache::CacheTree_RB_INSERT
> > > > > > (this=<optimized out>,
> > > > > > head=<optimized out>, elm=<optimized out>)
> > > > > > at
> > > > > > /usr/src/gnu/lib/libcxxabi/../../../gnu/llvm/libunwind/src/AddressSpace.hpp:243
> > > > > > #2 libunwind::UnwindInfoSectionsCache::setUnwindInfoSectionsForPC
> > > > > > (this=<optimized out>, key=10983975073074,
> > > > > > uis=...) at
> > > > > > /usr/src/gnu/lib/libcxxabi/../../../gnu/llvm/libunwind/src/AddressSpace.hpp:237
> > > > > > #3 libunwind::UnwindCursor<libunwind::LocalAddressSpace,
> > > > > > libunwind::Registers_x86_64>::setInfoBasedOnIPRegister (
> > > > > > this=0x9fd2ca0aa68, isReturnAddress=<optimized out>)
> > > > > > at
> > > > > > /usr/src/gnu/lib/libcxxabi/../../../gnu/llvm/libunwind/src/UnwindCursor.hpp:1891
> > > > > > #4 0x000009fd67eedaa4 in
> > > > > > libunwind::UnwindCursor<libunwind::LocalAddressSpace,
> > > > > > libunwind::Registers_x86_64>::step (
> > > > > > this=0x9fd2ca0aa68) at
> > > > > > /usr/src/gnu/lib/libcxxabi/../../../gnu/llvm/libunwind/src/UnwindCursor.hpp:2031
> > > > > > #5 0x000009fd67ef15a4 in unwind_phase1 (uc=<optimized out>,
> > > > > > cursor=<optimized out>, exception_object=0x9fd37b24560)
> > > > > > at
> > > > > > /usr/src/gnu/lib/libcxxabi/../../../gnu/llvm/libunwind/src/UnwindLevel1.c:46
> > > > > > #6 _Unwind_RaiseException (exception_object=0x9fd37b24560)
> > > > > > at
> > > > > > /usr/src/gnu/lib/libcxxabi/../../../gnu/llvm/libunwind/src/UnwindLevel1.c:363
> > > > > > #7 0x000009fd67eeb533 in __cxa_throw (thrown_object=0x9fd37b24580,
> > > > > > tinfo=0x9fa6c615a00 <typeinfo for PDNSException>,
> > > > > > dest=<optimized out>)
> > > > > > at
> > > > > > /usr/src/gnu/lib/libcxxabi/../../../gnu/llvm/libcxxabi/src/cxa_exception.cpp:279
> > > > > > #8 0x000009fa6c295955 in ComboAddress::ComboAddress
> > > > > > (this=<optimized
> > > > > > out>, str=..., port=<optimized out>)
> > > > > > at ./iputils.hh:219
> > > > > > #9 0x000009fa6c489970 in startFrameStreamServers (config=...) at
> > > > > > pdns_recursor.cc:1248
> > > > > > #10 checkFrameStreamExport (luaconfsLocal=...) at
> > > > > > pdns_recursor.cc:1290
> > > > > > #11 0x000009fa6c48158f in recursorThread (n=<optimized out>,
> > > > > > ...
> > > > > >
> > > > > > This does not happen always, most of the time this exception is
> > > > > > handled correctly, afaik.
> > > > > >
> > > > > > The code that twrows an exception is:
> > > > > > try {
> > > > > > ComboAddress address(server);
> > > > > > ...
> > > > > > }
> > > > > > catch ...
> > > > > >
> > > > > > The ComboAddress constructor throws the exception (and is supposed
> > > > > > to
> > > > > > do that). It looks like libunwind gets confused somehow.
> > > > > >
> > > > > > Any clue?
> > > > >
> > > > > The cache that pirofti@ added a while ago isn't thread-safe. Or maybe
> > > > > this is a use-after free caused by dlcose(4). We should probably
> > > > > disable/remove it while he is working on a better solution.
> > > > > Unfortunately I don't think adding locking here is a good idea so this
> > > > > may need a more fundamental rethink.
> > > > >
> > > > > Upstream did add an optimization in this area a few months ago:
> > > > >
> > > > >
> > > > > https://github.com/llvm/llvm-project/commit/881aba7071c6e4cc2417e875ca5027ec7c0a92a3
> > > > >
> > > > > The version of libunwind we're using is older than that, so it may be
> > > > > worth picking that up and see if that improves the original problem.
> > > >
> > > > First I'm going to try to fix it my making the cache thread_local.
> > > >
> > > > I'm probably going to regret looking at this code,
> > > >
> > > > -Otto
> > >
> > > The diff below works for my test case on amd64.
> > >
> > > It also feels right from a theoretical point of view. As for practical
> > > matters, if thread local storage isn't working properly, a lot more
> > > things would break. So I am a bit more optimisic about the diff now
> > > than this morning.
> > >
> > > So we have three options, I think:
> > >
> > > - back the caching out,
> > > - investigate the commit you mention above. Sadly I cannot
> > > remember the original case that prompted for the caching code
> > > to be
> > > added.
> > > - continue on the thread_local path
> > >
> > > We *have* to pick a solution. Having broken exception handling for
> > > multi-threaded applications is no fun at all...
> >
> > Agreed.
> >
> > It seems that libc++abi.so already includes the emutls code, so this
> > may very well be worth a shot.
>
> Yes, also tested with a new regress test on arm64 now. With the diff
> the regress tests fails once in a while. For 100 runs the chances are
> pretty high, so I use that.
It fails *with* the diff?
> Full diff:
>
> Index: gnu/llvm/libunwind/src/UnwindCursor.hpp
> ===================================================================
> RCS file: /cvs/src/gnu/llvm/libunwind/src/UnwindCursor.hpp,v
> retrieving revision 1.2
> diff -u -p -r1.2 UnwindCursor.hpp
> --- gnu/llvm/libunwind/src/UnwindCursor.hpp 2 Jan 2021 01:10:02 -0000
> 1.2
> +++ gnu/llvm/libunwind/src/UnwindCursor.hpp 20 Feb 2021 17:21:17 -0000
> @@ -75,7 +75,7 @@ extern "C" _Unwind_Reason_Code __libunwi
>
> namespace libunwind {
>
> -static UnwindInfoSectionsCache uwis_cache;
> +static thread_local UnwindInfoSectionsCache uwis_cache;
>
> #if defined(_LIBUNWIND_SUPPORT_DWARF_UNWIND)
> /// Cache of recently found FDEs.
> Index: regress/misc/exceptions/threads/Makefile
> ===================================================================
> RCS file: regress/misc/exceptions/threads/Makefile
> diff -N regress/misc/exceptions/threads/Makefile
> --- /dev/null 1 Jan 1970 00:00:00 -0000
> +++ regress/misc/exceptions/threads/Makefile 20 Feb 2021 17:21:17 -0000
> @@ -0,0 +1,11 @@
> +# $OpenBSD: Makefile,v 1.1 2007/01/28 19:10:06 kettenis Exp $
> +
> +PROG= exceptions
> +SRCS= exceptions.cc
> +
> +REGRESS_TARGETS=runs
> +
> +runs: exceptions
> + for i in $$(jot 100); do exceptions; done
> +
> +.include <bsd.regress.mk>
> Index: regress/misc/exceptions/threads/exceptions.cc
> ===================================================================
> RCS file: regress/misc/exceptions/threads/exceptions.cc
> diff -N regress/misc/exceptions/threads/exceptions.cc
> --- /dev/null 1 Jan 1970 00:00:00 -0000
> +++ regress/misc/exceptions/threads/exceptions.cc 20 Feb 2021 17:21:17
> -0000
> @@ -0,0 +1,52 @@
> +/* $OpenBSD$ */
> +/*
> + * Written by Otto Moerbeek <[email protected]> 2021 Public Domain
> + */
> +
> +#include <string>
> +#include <iostream>
> +#include <err.h>
> +#include <pthread.h>
> +
> +void
> +a()
> +{
> + try {
> + throw std::string("foo");
> + }
> + catch (const std::string& ex) {
> + if (ex != "foo")
> + errx(1, "foo");
> + }
> +}
> +
> +void
> +b()
> +{
> + a();
> +}
> +
> +void *
> +c(void *)
> +{
> + b();
> + return nullptr;
> +}
> +
> +#define N 100
> +
> +int
> +main()
> +{
> + int i;
> + pthread_t p[N];
> +
> + for (i = 0; i < N; i++)
> + if (pthread_create(&p[i], nullptr, c, nullptr) != 0)
> + err(1, nullptr);
> + for (i = 0; i < N; i++)
> + if (pthread_join(p[i], nullptr) != 0)
> + err(1, nullptr);
> + std::cout << ".";
> + return 0;
> +}
>