[ 
https://issues.apache.org/jira/browse/ARROW-2359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16417266#comment-16417266
 ] 

Anton Shmigirilov commented on ARROW-2359:
------------------------------------------

Yes, I have issue with this. It's reproduced in whole project that uses Arrow. 
Part of project reads RecordBatch using RecordBatchStreamReader, performs 
access to Schema (RecordBatch::schema()) and field's type(). Such reading 
executes in concurrent threads. Executable built with gcc's ThreadSanitizer and 
here is part of sanitizer's output:

Atomic write of size 4 at 0x7b100000c008 by thread T5:
 #0 __tsan_atomic32_fetch_add <null> (libtsan.so.0+0x000000064aa0)
 #1 __atomic_add /usr/include/c++/7/ext/atomicity.h:53 (exec+0x0000008a68c3)
 #2 __atomic_add_dispatch /usr/include/c++/7/ext/atomicity.h:96 
(exec+0x0000008a68c3)
 #3 std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_add_ref_copy() 
/usr/include/c++/7/bits/shared_ptr_base.h:138 (FDIOExec+0x0000008a68c3)
 #4 
std::__shared_count<(__gnu_cxx::_Lock_policy)2>::__shared_count(std::__shared_count<(__gnu_cxx::_Lock_policy)2>
 const&) /usr/include/c++/7/bits/shared_ptr_base.h:691 (exec+0x0000008a68c3)
 #5 std::__shared_ptr<arrow::DataType, 
(__gnu_cxx::_Lock_policy)2>::__shared_ptr(std::__shared_ptr<arrow::DataType, 
(__gnu_cxx::_Lock_policy)2> const&) 
/usr/include/c++/7/bits/shared_ptr_base.h:1121 (exec+0x0000008a68c3)
 #6 
std::shared_ptr<arrow::DataType>::shared_ptr(std::shared_ptr<arrow::DataType> 
const&) /usr/include/c++/7/bits/shared_ptr.h:119 (exec+0x0000008a68c3)
 #7 arrow::Field::type() const 
/usr/local/fdio-deps/lib/../include/arrow/type.h:244 (exec+0x0000008a68c3)
 #8 func_reader() (exec+0x0000008a68c3)

Previous write of size 8 at 0x7b100000c008 by thread T4:
 #0 operator new(unsigned long) <null> (libtsan.so.0+0x00000006f846)
 #1 __gnu_cxx::new_allocator<std::_Sp_counted_ptr_inplace<arrow::StringType, 
std::allocator<arrow::StringType>, (__gnu_cxx::_Lock_policy)2> 
>::allocate(unsigned long, void const*) 
/usr/include/c++/7/ext/new_allocator.h:111 (libarrow.so.0+0x000000167307)
 #2 func_reader() (exec+0x000000897a38)

It hard to reproduce it on simple synthetic test, but I will try it.

std::shared_ptr declared as thread safe in relation to control block, but as I 
understand it, safety is guaranteed only in case of modifying control block's 
internals which happened because copying shared_ptr itself (with sharing common 
control block). But in case of TYPE_FACTORY implementation, we have not only 
common/shared control block internals, but common (because static) shared_ptr 
internals itself. I guess this place is unsafe.

Another one opinion about this. Shared_ptr have been designed to maintain 
complex object's lifecycle in case of copying, moving and so on. In case of 
TYPE_FACTORY we haven't advantages of shared_ptr, we have single instance of 
object with lifetime corresponding whole process's lifetime. I guess is't not 
quite correct usage of the concept, in my opinion.

> Type objects produced by DataType factory are not thread safe
> -------------------------------------------------------------
>
>                 Key: ARROW-2359
>                 URL: https://issues.apache.org/jira/browse/ARROW-2359
>             Project: Apache Arrow
>          Issue Type: Task
>          Components: C++
>            Reporter: Anton Shmigirilov
>            Priority: Minor
>              Labels: pull-request-available
>
> TYPE_FACTORY() macro that produces type shortcuts (boolean(), int32(), utf8() 
> and so on) uses static shared_ptr inside. There are race conditions possible 
> against shared_ptr's reference counter.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to