BewareMyPower opened a new pull request #6827:
URL: https://github.com/apache/pulsar/pull/6827
Fixes #6822
### Motivation
When I reduced the send timeout to reproduce the `ResultTimeout` error, I found
by accident that the program may crash with a segmentation fault after being
closed. I then found that `ProducerImpl::handleSendTimeout()` doesn't check the
state or other fields of `ProducerImpl`, so it may call methods on a null
`sendTimer_`.
### Modifications
- Acquire `mutex_` and check that `state_` is `Ready` before handling the
send timer callback.
- Divide `failPendingMessages()` into two parts:
`getPendingCallbacksWhenFailedWithLock()` and `PendingCallbacks::complete()`.
  - The 1st part needs to acquire `mutex_` to access class members safely.
  In addition, where `mutex_` is already held when `failPendingMessages()` is
  called, we need a method that gets the necessary callbacks without acquiring
  the lock again.
  - The 2nd part doesn't and shouldn't acquire `mutex_`, because we don't
  know how long user-provided callbacks may take. If this part ran under the
  lock, `mutex_` could be held for a long time.
- Define a new struct that contains a
`BatchMessageContainer::MessageContainerListPtr` field, because nested classes
can't be forward declared.
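
The locking pattern described above can be illustrated with a minimal, standalone sketch. It is not the actual `ProducerImpl` code: `SketchProducer`, the simplified `State` and `Result` enums, the `SendCallback` alias, and the lock-free helper `getPendingCallbacksWhenFailed()` are hypothetical names chosen for illustration, and the real `PendingCallbacks` struct holds batch containers rather than a plain vector. The timer handler checks the state under `mutex_`, detaches the pending callbacks while still holding the lock, and only runs them after the lock is released.

```cpp
#include <cstddef>
#include <functional>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical, simplified stand-ins; these are NOT the real Pulsar types.
enum class State { Ready, Closed };
enum class Result { Ok, Timeout };
using SendCallback = std::function<void(Result)>;

// Callbacks that were detached from the producer while the lock was held.
struct PendingCallbacks {
    std::vector<SendCallback> callbacks;

    // Runs user code, so it must be called WITHOUT the producer mutex held.
    void complete(Result result) {
        for (auto& cb : callbacks) {
            if (cb) cb(result);
        }
        callbacks.clear();
    }
};

class SketchProducer {
   public:
    void send(SendCallback cb) {
        std::lock_guard<std::mutex> lock(mutex_);
        pending_.push_back(std::move(cb));
    }

    void close() {
        std::lock_guard<std::mutex> lock(mutex_);
        state_ = State::Closed;
    }

    // For callers that do not hold mutex_ yet.
    PendingCallbacks getPendingCallbacksWhenFailedWithLock() {
        std::lock_guard<std::mutex> lock(mutex_);
        return getPendingCallbacksWhenFailed();
    }

    // Timer callback. Returns the number of callbacks that were failed.
    std::size_t handleSendTimeout() {
        PendingCallbacks detached;
        {
            std::lock_guard<std::mutex> lock(mutex_);
            if (state_ != State::Ready) {
                return 0;  // producer is closed: do not touch other members
            }
            // mutex_ is already held here, so use the variant without a lock.
            detached = getPendingCallbacksWhenFailed();
        }  // mutex_ is released before any user callback runs
        const std::size_t count = detached.callbacks.size();
        detached.complete(Result::Timeout);
        return count;
    }

   private:
    // Precondition: the caller already holds mutex_.
    PendingCallbacks getPendingCallbacksWhenFailed() {
        PendingCallbacks result;
        result.callbacks.swap(pending_);  // pending_ is left empty
        return result;
    }

    std::mutex mutex_;
    State state_ = State::Ready;
    std::vector<SendCallback> pending_;
};
```

Because `complete()` runs after the lock is released, a callback in this sketch can call `send()` on the same producer without deadlocking on the non-recursive `std::mutex`, and a slow callback does not block other threads that need `mutex_`.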
### Verifying this change
- [ ] Make sure that the change passes the CI checks.
This change can be verified as follows:
- Try to reproduce the error as described in #6822;
- From the log you can see that no callbacks are called after `closeAsync()`
and that the program exits normally.