On Fri, Feb 19, 2021 at 01:06:43PM +0100, Otto Moerbeek wrote:

> On Fri, Feb 19, 2021 at 12:45:58PM +0100, Mark Kettenis wrote:
> 
> > > Date: Fri, 19 Feb 2021 10:57:30 +0100
> > > From: Otto Moerbeek <[email protected]>
> > > 
> > > Hi,
> > > 
> > > working on PowerDNS Recursor, once in a while I'm seeing:
> > > 
> > > #0  0x000009fd67ef09dc in
> > > libunwind::UnwindInfoSectionsCache::CacheTree_RB_INSERT_COLOR
> > > (this=<optimized out>, 
> > >     head=0x9fd67efc8e8 <libunwind::uwis_cache+8>, elm=0x9fca04be900)
> > >     at 
> > > /usr/src/gnu/lib/libcxxabi/../../../gnu/llvm/libunwind/src/AddressSpace.hpp:243
> > > 243       RB_GENERATE(CacheTree, CacheItem, entry, CacheCmp);
> > > [Current thread is 1 (process 349420)]
> > > (gdb) bt
> > > #0  0x000009fd67ef09dc in
> > > libunwind::UnwindInfoSectionsCache::CacheTree_RB_INSERT_COLOR
> > > (this=<optimized out>, 
> > >     head=0x9fd67efc8e8 <libunwind::uwis_cache+8>, elm=0x9fca04be900)
> > >     at 
> > > /usr/src/gnu/lib/libcxxabi/../../../gnu/llvm/libunwind/src/AddressSpace.hpp:243
> > > #1  0x000009fd67eeddef in
> > > libunwind::UnwindInfoSectionsCache::CacheTree_RB_INSERT
> > > (this=<optimized out>, 
> > >     head=<optimized out>, elm=<optimized out>)
> > >     at 
> > > /usr/src/gnu/lib/libcxxabi/../../../gnu/llvm/libunwind/src/AddressSpace.hpp:243
> > > #2  libunwind::UnwindInfoSectionsCache::setUnwindInfoSectionsForPC
> > > (this=<optimized out>, key=10983975073074, 
> > >     uis=...) at 
> > > /usr/src/gnu/lib/libcxxabi/../../../gnu/llvm/libunwind/src/AddressSpace.hpp:237
> > > #3  libunwind::UnwindCursor<libunwind::LocalAddressSpace,
> > > libunwind::Registers_x86_64>::setInfoBasedOnIPRegister (
> > >     this=0x9fd2ca0aa68, isReturnAddress=<optimized out>)
> > >     at 
> > > /usr/src/gnu/lib/libcxxabi/../../../gnu/llvm/libunwind/src/UnwindCursor.hpp:1891
> > > #4  0x000009fd67eedaa4 in
> > > libunwind::UnwindCursor<libunwind::LocalAddressSpace,
> > > libunwind::Registers_x86_64>::step (
> > >     this=0x9fd2ca0aa68) at 
> > > /usr/src/gnu/lib/libcxxabi/../../../gnu/llvm/libunwind/src/UnwindCursor.hpp:2031
> > > #5  0x000009fd67ef15a4 in unwind_phase1 (uc=<optimized out>,
> > > cursor=<optimized out>, exception_object=0x9fd37b24560)
> > >     at 
> > > /usr/src/gnu/lib/libcxxabi/../../../gnu/llvm/libunwind/src/UnwindLevel1.c:46
> > > #6  _Unwind_RaiseException (exception_object=0x9fd37b24560)
> > >     at 
> > > /usr/src/gnu/lib/libcxxabi/../../../gnu/llvm/libunwind/src/UnwindLevel1.c:363
> > > #7  0x000009fd67eeb533 in __cxa_throw (thrown_object=0x9fd37b24580, 
> > >     tinfo=0x9fa6c615a00 <typeinfo for PDNSException>, dest=<optimized 
> > > out>)
> > >     at 
> > > /usr/src/gnu/lib/libcxxabi/../../../gnu/llvm/libcxxabi/src/cxa_exception.cpp:279
> > > #8  0x000009fa6c295955 in ComboAddress::ComboAddress (this=<optimized
> > > out>, str=..., port=<optimized out>)
> > >     at ./iputils.hh:219
> > > #9  0x000009fa6c489970 in startFrameStreamServers (config=...) at 
> > > pdns_recursor.cc:1248
> > > #10 checkFrameStreamExport (luaconfsLocal=...) at pdns_recursor.cc:1290
> > > #11 0x000009fa6c48158f in recursorThread (n=<optimized out>,
> > > ...
> > > 
> > > This does not happen always, most of the time this exception is
> > > handled correctly, afaik.
> > > 
> > > The code that twrows an exception is:
> > >       try {
> > >         ComboAddress address(server);
> > >         ...
> > >       }
> > >       catch ...
> > > 
> > > The ComboAddress constructor throws the exception (and is supposed to
> > > do that). It looks like libunwind gets confused somehow.
> > > 
> > > Any clue?
> > 
> > The cache that pirofti@ added a while ago isn't thread-safe.  Or maybe
> > this is a use-after free caused by dlcose(4).  We should probably
> > disable/remove it while he is working on a better solution.
> > Unfortunately I don't think adding locking here is a good idea so this
> > may need a more fundamental rethink.
> > 
> > Upstream did add an optimization in this area a few months ago:
> > 
> >   
> > https://github.com/llvm/llvm-project/commit/881aba7071c6e4cc2417e875ca5027ec7c0a92a3
> > 
> > The version of libunwind we're using is older than that, so it may be
> > worth picking that up and see if that improves the original problem.
> 
> First I'm going to try to fix it my making the cache thread_local.
> 
> I'm probably going to regret looking at this code,
> 
>       -Otto

The diff below works for my test case on amd64.

It also feels right from a theoretical point of view. As for practical
matters, if thread local storage isn't working properly, a lot more
things would break. So I am a bit more optimisic about the diff now
than this morning.

So we have three options, I think:

        - back the caching out, 
        - investigate the commit you mention above. Sadly I cannot
          remember the original case that prompted for the caching code to be
          added.
        - continue on the thread_local path

We *have* to pick a solution. Having broken exception handling for
multi-threaded applications is no fun at all...

        Otto

> 
> Index: UnwindCursor.hpp
> ===================================================================
> RCS file: /cvs/src/gnu/llvm/libunwind/src/UnwindCursor.hpp,v
> retrieving revision 1.2
> diff -u -p -r1.2 UnwindCursor.hpp
> --- UnwindCursor.hpp    2 Jan 2021 01:10:02 -0000       1.2
> +++ UnwindCursor.hpp    19 Feb 2021 12:05:26 -0000
> @@ -75,7 +75,7 @@ extern "C" _Unwind_Reason_Code __libunwi
>  
>  namespace libunwind {
>  
> -static UnwindInfoSectionsCache uwis_cache;
> +static thread_local UnwindInfoSectionsCache uwis_cache;
>  
>  #if defined(_LIBUNWIND_SUPPORT_DWARF_UNWIND)
>  /// Cache of recently found FDEs.
> 

Reply via email to