msharee9 commented on a change in pull request #743: Minificpp 1169 - Simplify
C2 metrics collection and reporting
URL: https://github.com/apache/nifi-minifi-cpp/pull/743#discussion_r408289302
##########
File path: libminifi/src/c2/C2Agent.cpp
##########
@@ -89,44 +93,68 @@ C2Agent::C2Agent(const
std::shared_ptr<core::controller::ControllerServiceProvid
}
request_mutex.unlock();
}
-
- if ( time_since > heart_beat_period_ ) {
- last_run_ = now;
- try {
- performHeartBeat();
- }
- catch(const std::exception &e) {
- logger_->log_error("Exception occurred while performing heartbeat.
error: %s", e.what());
- }
- catch(...) {
- logger_->log_error("Unknonwn exception occurred while performing
heartbeat.");
- }
+ try {
+ performHeartBeat();
+ }
+ catch(const std::exception &e) {
+ logger_->log_error("Exception occurred while performing heartbeat.
error: %s", e.what());
+ }
+ catch(...) {
+ logger_->log_error("Unknonwn exception occurred while performing
heartbeat.");
}
checkTriggers();
- std::this_thread::sleep_for(std::chrono::milliseconds(heart_beat_period_
> 500 ? 500 : heart_beat_period_));
- return
state::Update(state::UpdateStatus(state::UpdateState::READ_COMPLETE, false));
+ return
utils::TaskRescheduleInfo::RetryIn(std::chrono::milliseconds(heart_beat_period_));
};
-
functions_.push_back(c2_producer_);
c2_consumer_ = [&]() {
auto now = std::chrono::steady_clock::now();
if ( queue_mutex.try_lock_until(now + std::chrono::seconds(1)) ) {
- if (responses.size() > 0) {
- const C2Payload payload(std::move(responses.back()));
- responses.pop_back();
- extractPayload(std::move(payload));
+ if (responses.empty()) {
+ queue_mutex.unlock();
+ return utils::TaskRescheduleInfo::RetryImmediately();
Review comment:
> @msharee9 Tried lightweight and normal heartbeat and different flows, they
seem to work fine.
>
> However, even when running with no flow, but C2 enabled, minifi consumes
150% CPU both on macOS and Linux.
>
> This is very likely caused by the issue pointed out by Arpad, that is, we
are busy looping because of an incorrect usage of the Thread Pool.
>
> This callgrind profiling of minifi also seems to reinforce that suspicion:
> <img alt="Screenshot 2020-04-14 at 17 21 42" width="1679"
src="https://user-images.githubusercontent.com/9108564/79242487-b1a66680-7e74-11ea-9a6b-f07f61bccc54.png">
Changed it to Retry in 100 ms. I guess then the RetryImmediately() cannot be
used anywhere. We should either remove it altogether or have some delay before
we call continue at this line.
https://github.com/apache/nifi-minifi-cpp/blob/3116eb817a7182486bf3e50b87c3032ac2074031/libminifi/src/utils/ThreadPool.cpp#L53
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
With regards,
Apache Git Services