[ 
https://issues.apache.org/jira/browse/ARROW-15604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17489126#comment-17489126
 ] 

Weston Pace commented on ARROW-15604:
-------------------------------------

We could work around it with some kind of guarded singleton class:
 * The constructor allocates the instance on the heap.
 * The accessor grabs a mutex (or a spinlock if we need to be signal safe) and,
if the instance is null, returns an invalid status.
    * Note: this requires the accessor to return Result<T*> instead of T* as
it does today; I'm not sure whether that will be a problem for OT.

Then register an atexit handler that grabs the mutex/spinlock, deletes the
instance, and sets the pointer to null.
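
The steps above could be sketched roughly as follows. This is only an
illustration under assumptions: Storage, StorageResult and GuardedStorage are
hypothetical names, StorageResult stands in for Result<T*>, a static Init()
plays the role of the constructor, and a plain std::mutex is used instead of a
signal-safe spinlock.

```cpp
#include <cstdlib>
#include <mutex>
#include <string>

// Hypothetical stand-in for the OT object that must not be used after exit.
struct Storage {
  int value = 42;
};

// Hypothetical stand-in for Result<Storage*>: a pointer or an error message.
struct StorageResult {
  Storage* ptr;       // nullptr when the instance is unavailable
  std::string error;  // empty on success
  bool ok() const { return ptr != nullptr; }
};

class GuardedStorage {
 public:
  // Allocates the instance on the heap and registers the atexit handler once.
  static void Init() {
    std::lock_guard<std::mutex> lock(Mutex());
    if (instance_ == nullptr) {
      instance_ = new Storage();
    }
    if (!registered_) {
      std::atexit(&GuardedStorage::Shutdown);
      registered_ = true;
    }
  }

  // Accessor: takes the lock and reports an error if the instance is null.
  static StorageResult Get() {
    std::lock_guard<std::mutex> lock(Mutex());
    if (instance_ == nullptr) {
      return {nullptr, "instance is not initialized or was already destroyed"};
    }
    return {instance_, ""};
  }

  // atexit handler: takes the lock, deletes the instance, nulls the pointer.
  static void Shutdown() {
    std::lock_guard<std::mutex> lock(Mutex());
    delete instance_;
    instance_ = nullptr;
  }

 private:
  // The mutex is leaked on purpose so that it is never destroyed at exit.
  static std::mutex& Mutex() {
    static std::mutex* m = new std::mutex();
    return *m;
  }

  static Storage* instance_;
  static bool registered_;
};

Storage* GuardedStorage::instance_ = nullptr;
bool GuardedStorage::registered_ = false;
```

Note that the lock is released when Get() returns, so this only guards the
null check itself; a caller that still holds the pointer while the atexit
handler runs can race with the delete.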

Looking at the OT code more closely, though, I am a bit surprised we are
encountering this.  The {{END_SPAN_ON_FUTURE_COMPLETION}} macro uses {{Then}}
and creates a new future.  The new future should only be marked finished after
the OT work is done.
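
To illustrate the ordering I would expect, here is a toy sketch. ToyFuture is
a hypothetical, single-threaded stand-in and not the real Future class or the
macro itself; it only shows that a future returned by a Then-style call
finishes after the callback (here, "ending the span") has run.

```cpp
#include <functional>
#include <memory>
#include <string>
#include <vector>

// Toy, single-threaded future used only to illustrate continuation ordering.
class ToyFuture {
 public:
  ToyFuture() : state_(std::make_shared<State>()) {}

  bool is_finished() const { return state_->finished; }

  // Marks this future finished and runs the callbacks registered on it.
  void MarkFinished() {
    state_->finished = true;
    std::vector<std::function<void()>> callbacks;
    callbacks.swap(state_->callbacks);
    for (auto& cb : callbacks) cb();
  }

  // Returns a new future that is finished only after `on_done` has run.
  ToyFuture Then(std::function<void()> on_done) {
    ToyFuture next;
    auto continuation = [on_done, next]() mutable {
      on_done();            // e.g. end the tracing span
      next.MarkFinished();  // waiters on `next` are released only now
    };
    if (state_->finished) {
      continuation();
    } else {
      state_->callbacks.push_back(continuation);
    }
    return next;
  }

 private:
  struct State {
    bool finished = false;
    std::vector<std::function<void()>> callbacks;
  };
  std::shared_ptr<State> state_;
};
```

In this model, anything waiting on the future returned by Then cannot proceed
(for example, to process exit) until the span-ending callback has completed.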

If no one tackles this in the meantime, I will investigate further on Friday.

> [C++][CI] Sporadic ThreadSanitizer failure with OpenTracing
> -----------------------------------------------------------
>
>                 Key: ARROW-15604
>                 URL: https://issues.apache.org/jira/browse/ARROW-15604
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: C++, Continuous Integration
>            Reporter: Antoine Pitrou
>            Priority: Major
>
> The error is a heap-use-after-free and involves an OpenTracing structure that 
> was deleted by an atexit hook.
> https://github.com/ursacomputing/crossbow/runs/5097362072?check_suite_focus=true#step:5:4843
> Summary:
> {code}
>   Atomic write of size 4 at 0x7b08000136a8 by thread T2:
>   [...]
>     #10 opentelemetry::v1::context::RuntimeContext::GetRuntimeContextStorage() /build/cpp/opentelemetry_ep-install/include/opentelemetry/context/runtime_context.h:156:12 (libarrow.so.800+0x1e62ef7)
>     #11 opentelemetry::v1::context::RuntimeContext::Detach(opentelemetry::v1::context::Token&) /build/cpp/opentelemetry_ep-install/include/opentelemetry/context/runtime_context.h:97:54 (libarrow.so.800+0x1e70178)
>     #12 opentelemetry::v1::context::Token::~Token() /build/cpp/opentelemetry_ep-install/include/opentelemetry/context/runtime_context.h:168:3 (libarrow.so.800+0x1e7012f)
>   [...]
> {code}
> {code}
>   Previous write of size 8 at 0x7b08000136a8 by main thread:
>     #0 operator delete(void*) <null> (arrow-dataset-scanner-test+0x16a69e)
>   [...]
>     #7 opentelemetry::v1::nostd::shared_ptr<opentelemetry::v1::context::RuntimeContextStorage>::~shared_ptr() /build/cpp/opentelemetry_ep-install/include/opentelemetry/nostd/shared_ptr.h:98:30 (libarrow.so.800+0x1e62fb3)
>     #8 cxa_at_exit_wrapper(void*) <null> (arrow-dataset-scanner-test+0x11866f)
> {code}



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
