BewareMyPower opened a new issue #8914:
URL: https://github.com/apache/pulsar/issues/8914


   **Describe the bug**
   Sometimes the program crashes in 
`AckGroupingTrackerEnabled#scheduleTimer`. #8519 tried to solve the 
problem by extending the lifetime of `AckGroupingTrackerEnabled` so that the 
callback won't access an outdated `this`. However, the segmentation fault 
still happens.
   
   A typical stack trace is:
   
   ```
    #6 <signal handler called>
    #7 0x00007f5aad920b60 in ?? ()
    #8 0x00007f6e9ee7d1bb in boost::asio::detail::wait_handler<pulsar::AckGroupingTrackerEnabled::scheduleTimer()::{lambda(boost::system::error_code const&)#1}>::do_complete(void*, boost::asio::detail::scheduler_operation*, boost::system::error_code const&, unsigned long) ()
    from /opt/vertica/verticadb/v_verticadb_node0003_catalog/Libraries/0213a49612da8c2ad8b19aa3bd77ddec00a000000002090c/PulsarSourceLib_0213a49612da8c2ad8b19aa3bd77ddec00a000000002090c.so
    #9 0x00007f6e9edd78d3 in boost::asio::detail::scheduler::run(boost::system::error_code&) ()
    from /opt/vertica/verticadb/v_verticadb_node0003_catalog/Libraries/0213a49612da8c2ad8b19aa3bd77ddec00a000000002090c/PulsarSourceLib_0213a49612da8c2ad8b19aa3bd77ddec00a000000002090c.so
    #10 0x00007f6e9edd4aa6 in pulsar::ExecutorService::startWorker(std::shared_ptr<boost::asio::io_context>) ()
    from /opt/vertica/verticadb/v_verticadb_node0003_catalog/Libraries/0213a49612da8c2ad8b19aa3bd77ddec00a000000002090c/PulsarSourceLib_0213a49612da8c2ad8b19aa3bd77ddec00a000000002090c.so
    #11 0x00007f6e9edd9c82 in std::thread::_Impl<std::_Bind_simple<std::_Bind<std::_Mem_fn<void (pulsar::ExecutorService::)(std::shared_ptr<boost::asio::io_context>)> (pulsar::ExecutorService, std::shared_ptr<boost::asio::io_context>)> ()> >::_M_run() ()
    from /opt/vertica/verticadb/v_verticadb_node0003_catalog/Libraries/0213a49612da8c2ad8b19aa3bd77ddec00a000000002090c/PulsarSourceLib_0213a49612da8c2ad8b19aa3bd77ddec00a000000002090c.so
    #12 0x00007f6fcb5d2070 in ?? () from /lib64/libstdc++.so.6
    #13 0x00007f6fcb006dd5 in start_thread () from /lib64/libpthread.so.0
    #14 0x00007f6fca923ead in clone () from /lib64/libc.so.6
   ```
   
   
   **To Reproduce**
   It cannot be reproduced easily. In the environment where it occurs, a 
`Client` is long-lived, and many `Reader`s are periodically created and used 
to read some messages.
   
   **Expected behavior**
   The segmentation fault should not happen.
   
   **Additional context**
   A solution that may work is refactoring the timer design. Currently, the 
deadline timer is recreated each time in the callback, and there is no state 
check before rescheduling, unlike the one guarding 
`PartitionedConsumerImpl::partitionsUpdateTimer_`:
   
   ```c++
   void PartitionedConsumerImpl::runPartitionUpdateTask() {
       partitionsUpdateTimer_->expires_from_now(partitionsUpdateInterval_);
       partitionsUpdateTimer_->async_wait(
           std::bind(&PartitionedConsumerImpl::getPartitionMetadata, shared_from_this()));
   }
   
   void PartitionedConsumerImpl::getPartitionMetadata() {
       using namespace std::placeholders;
       lookupServicePtr_->getPartitionMetadataAsync(topicName_)
           .addListener(std::bind(&PartitionedConsumerImpl::handleGetPartitions, shared_from_this(), _1, _2));
   }
   
   void PartitionedConsumerImpl::handleGetPartitions(Result result,
                                                     const LookupDataResultPtr& lookupDataResult) {
       Lock stateLock(mutex_);
       if (state_ != Ready) {
           // NOTE: when consumer is not ready, the runPartitionUpdateTask won't be scheduled
           return;
       }
       /* do the real work... */
       runPartitionUpdateTask();
   }
   ```
   
   However, we still need a detailed explanation for the stack trace shown 
above.
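
To make the proposed refactoring more concrete, here is a minimal sketch of a timer callback that checks state before rescheduling. It is not the Pulsar client code: `FakeExecutor` is a hypothetical stand-in for the `boost::asio` `io_context` and deadline timer, and `Tracker`, `close()` and `flushCount()` are illustrative names only. The callback holds a `std::weak_ptr`, so it neither touches a destroyed object nor keeps the object alive, and it stops rescheduling once the object is closed.

```cpp
#include <cstddef>
#include <deque>
#include <functional>
#include <memory>
#include <mutex>
#include <utility>

// Hypothetical stand-in for an io_context plus timer: callbacks are queued
// and run later, one at a time, when runOne() is called.
class FakeExecutor {
   public:
    void post(std::function<void()> task) { tasks_.push_back(std::move(task)); }

    // Runs at most one pending task; returns false if the queue was empty.
    bool runOne() {
        if (tasks_.empty()) return false;
        auto task = std::move(tasks_.front());
        tasks_.pop_front();
        task();
        return true;
    }

    std::size_t pending() const { return tasks_.size(); }

   private:
    std::deque<std::function<void()>> tasks_;
};

// Illustrative tracker; must be owned by a std::shared_ptr.
class Tracker : public std::enable_shared_from_this<Tracker> {
   public:
    explicit Tracker(FakeExecutor& executor) : executor_(executor) {}

    void start() { scheduleTimer(); }

    void close() {
        std::lock_guard<std::mutex> lock(mutex_);
        closed_ = true;
    }

    int flushCount() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return flushCount_;
    }

   private:
    void scheduleTimer() {
        // Capture a weak_ptr so the pending callback never dereferences a
        // destroyed tracker and does not extend its lifetime.
        std::weak_ptr<Tracker> weakSelf = shared_from_this();
        executor_.post([weakSelf] {
            auto self = weakSelf.lock();
            if (!self) return;  // tracker was destroyed before the timer fired
            self->onTimer();
        });
    }

    void onTimer() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            if (closed_) return;  // state check: do not reschedule after close()
            ++flushCount_;        // placeholder for the real flush work
        }
        scheduleTimer();
    }

    FakeExecutor& executor_;
    mutable std::mutex mutex_;
    bool closed_ = false;
    int flushCount_ = 0;
};
```

With this shape, a callback that fires after `close()` or after the last `shared_ptr` is released simply returns. Whether the same structure would remove the crash in `AckGroupingTrackerEnabled` is still an open question, since the stack trace has not been explained yet.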

