BewareMyPower opened a new pull request #12427:
URL: https://github.com/apache/pulsar/pull/12427


   Fixes #11635
   
   ### Motivation
   
   The Python tests fail very often with a segmentation fault. The failure is easy to reproduce locally with the `apachepulsar/pulsar-build:ubuntu-16.04-pb3` image. After dumping the core file and running `gdb python core` to debug it, we can see the following backtrace:
   
   ```
   #0  0x00007f655e5e8d0d in std::__atomic_base<long>::load (__m=std::memory_order_seq_cst, this=0xb8) at /usr/include/c++/5/bits/atomic_base.h:396
   #1  std::__atomic_base<long>::operator long (this=0xb8) at /usr/include/c++/5/bits/atomic_base.h:259
   #2  0x00007f655e5e6348 in boost::asio::detail::task_io_service::run (this=0x0, ec=...) at /usr/include/boost/asio/detail/impl/task_io_service.ipp:136
   #3  0x00007f655e5e6cc4 in boost::asio::io_service::run (this=0x1f11990) at /usr/include/boost/asio/impl/io_service.ipp:59
   #4  0x00007f655e5e32d0 in pulsar::ExecutorService::startWorker (this=0x1f11130, io_service=std::shared_ptr (count 2, weak 0) 0x1e8ba00) at /pulsar/pulsar-client-cpp/lib/ExecutorService.cc:37
   ```
   
   The `io_service` object is not null, but its internal `impl_` field, whose type is a `task_io_service` pointer, is null (see `this=0x0` in frame `#2`).
   
   The cause is that `io_service::run` is called in a separate thread while the destructor of `ExecutorService` is called in another thread. Once the `ExecutorService` is destroyed, neither the internal thread field `worker_` nor the `io_service` field is valid to access, so `io_service::run` must not be called after that.
   
   ### Modifications
   
   Refactor the `ExecutorService`. Since it's not copyable, it's redundant to store `io_service` and `io_service::work` as smart pointers. In addition, the thread that runs `io_service` is never used after it is started, so it is simply detached when the `ExecutorService` is created.
   
   The key point is to check, inside the thread, whether `ExecutorService#close` has already been called by examining the atomic boolean `closed_`. In addition, a shared pointer to the `ExecutorService` is captured in the thread so that the object's lifetime is extended for as long as the thread runs.
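
The pattern described above can be illustrated with a minimal, standard-library-only sketch. This is not the code from this PR: `TaskLoop` is a hypothetical stand-in for `boost::asio::io_service`, and `create`, `post` and `runDemo` are names assumed for illustration. Only `closed_`, `close` and the detached worker that captures a shared pointer follow the description.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <deque>
#include <functional>
#include <future>
#include <memory>
#include <mutex>
#include <thread>
#include <utility>

// Hypothetical stand-in for boost::asio::io_service: a blocking FIFO task queue.
class TaskLoop {
   public:
    void post(std::function<void()> task) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            tasks_.push_back(std::move(task));
        }
        cond_.notify_one();
    }
    void stop() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopped_ = true;
        }
        cond_.notify_all();
    }
    // Runs queued tasks until stop() is called; pending tasks are then dropped.
    void run() {
        std::unique_lock<std::mutex> lock(mutex_);
        while (true) {
            cond_.wait(lock, [this] { return stopped_ || !tasks_.empty(); });
            if (stopped_) return;
            auto task = std::move(tasks_.front());
            tasks_.pop_front();
            lock.unlock();
            task();  // run the task without holding the lock
            lock.lock();
        }
    }

   private:
    std::mutex mutex_;
    std::condition_variable cond_;
    std::deque<std::function<void()>> tasks_;
    bool stopped_ = false;
};

class ExecutorService : public std::enable_shared_from_this<ExecutorService> {
   public:
    static std::shared_ptr<ExecutorService> create() {
        std::shared_ptr<ExecutorService> service(new ExecutorService());
        service->start();
        return service;
    }
    void post(std::function<void()> task) { loop_.post(std::move(task)); }
    void close() {
        bool expected = false;
        // Only the first close() call stops the loop.
        if (closed_.compare_exchange_strong(expected, true)) loop_.stop();
    }
    ~ExecutorService() { close(); }

   private:
    ExecutorService() = default;
    void start() {
        auto self = shared_from_this();
        // The captured shared pointer keeps *this alive until run() returns.
        std::thread([self] {
            if (!self->closed_) self->loop_.run();
        }).detach();
    }
    TaskLoop loop_;  // stored by value, not behind a smart pointer
    std::atomic<bool> closed_{false};
};

// Posts three tasks, closes the executor, and returns how many tasks ran.
int runDemo() {
    auto executor = ExecutorService::create();
    auto counter = std::make_shared<std::atomic<int>>(0);
    auto done = std::make_shared<std::promise<void>>();
    auto finished = done->get_future();
    for (int i = 0; i < 3; ++i) {
        executor->post([counter, done, i] {
            ++*counter;
            if (i == 2) done->set_value();  // last task signals completion
        });
    }
    finished.wait();
    executor->close();
    std::weak_ptr<ExecutorService> weak = executor;
    executor.reset();  // the worker's copy may still own the object here
    while (!weak.expired()) std::this_thread::sleep_for(std::chrono::milliseconds(1));
    return counter->load();
}
```

In this sketch the caller can drop its reference at any time: the object is destroyed only after the worker has left `run()` and released its captured pointer, which is the lifetime guarantee the crash in the backtrace was missing.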


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.