Hi Gary,

Ack (review only). Minor comments inline below, marked [HansN]. /Thanks HansN


On 08/01/2018 06:49 AM, Gary Lee wrote:
Sometimes the 'watch' command in the KV plugin will not return
a takeover request, if the KV store does not respond in time.
Then rded would try to read the takeover request in the main
thread after receiving a takeover request notification.

This can cause rded to not respond to AMF callbacks in a timely
fashion. We should try to read the takeover request in another
thread to avoid this.
---
  src/rde/rded/role.cc | 26 ++++++++++++++++++++++++--
  1 file changed, 24 insertions(+), 2 deletions(-)

diff --git a/src/rde/rded/role.cc b/src/rde/rded/role.cc
index 0567fdfcf..8e0411ee5 100644
--- a/src/rde/rded/role.cc
+++ b/src/rde/rded/role.cc
@@ -51,12 +51,34 @@ void Role::MonitorCallback(const std::string& key, const std::string& new_value,
    rde_msg* msg = static_cast<rde_msg*>(malloc(sizeof(rde_msg)));
    if (key == Consensus::kTakeoverRequestKeyname) {
[HansN] std::string request; is enough here, the default constructor already creates an empty string.
+    std::string request("");
+
+    if (new_value.empty() == true) {
+      // sometimes the KV store plugin doesn't return the new value,
+      // let's try to read it in this thread to avoid stalling
+      // the main thread
+      TRACE("Empty takeover request from callback. Try reading it");
+
+      SaAisErrorT rc = SA_AIS_ERR_TRY_AGAIN;
+      constexpr uint8_t max_retry = 5;
+      uint8_t retries = 0;
+      Consensus consensus_service;
+
+      while (retries < max_retry && rc != SA_AIS_OK) {
+        rc = consensus_service.ReadTakeoverRequest(request);
+        ++retries;
+      }
+    } else {
+      // use the value received in callback
+      request = new_value;
+    }
+
      // don't send this to the main thread straight away, as it will
      // need some time to process topology changes.
      msg->type = RDE_MSG_TAKEOVER_REQUEST_CALLBACK;
-    size_t len = new_value.length() + 1;
+    size_t len = request.length() + 1;
      msg->info.takeover_request = new char[len];
-    strncpy(msg->info.takeover_request, new_value.c_str(), len);
+    strncpy(msg->info.takeover_request, request.c_str(), len);
      LOG_NO("Sending takeover request '%s' to main thread",
            msg->info.takeover_request);
      std::this_thread::sleep_for(std::chrono::seconds(4));
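
[HansN] For reference, the fallback logic added above can be sketched in isolation roughly as follows. This is only an illustration, not the code in role.cc: Status, Reader and ResolveTakeoverRequest are hypothetical names standing in for SaAisErrorT, Consensus::ReadTakeoverRequest() and the inline block in MonitorCallback().

```cpp
#include <cstdint>
#include <functional>
#include <string>

// Hypothetical stand-in for SaAisErrorT; only the two outcomes used here.
enum class Status { kOk, kTryAgain };

// Hypothetical reader type, loosely mirroring
// Consensus::ReadTakeoverRequest(std::string&) from the patch.
using Reader = std::function<Status(std::string&)>;

// Returns the value delivered by the callback if it is non-empty.
// Otherwise calls `read` up to `max_retry` times, stopping at the first
// kOk, and returns whatever was read (possibly still empty).
std::string ResolveTakeoverRequest(const std::string& new_value,
                                   const Reader& read,
                                   std::uint8_t max_retry = 5) {
  if (!new_value.empty()) {
    return new_value;  // use the value received in the callback
  }

  std::string request;
  Status rc = Status::kTryAgain;
  for (std::uint8_t retries = 0; retries < max_retry && rc != Status::kOk;
       ++retries) {
    rc = read(request);
  }
  return request;
}
```

As in the patch, the caller still gets an empty string if every attempt fails, so the message sent to the main thread may carry an empty takeover request.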

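[HansN] On the std::string comment above: a small standalone check, with a hypothetical helper name, showing that the default-constructed string and the one built from "" compare equal, so the shorter form should behave the same.

```cpp
#include <string>

// Hypothetical helper for illustration only: both construction forms
// yield an empty string, and the two objects compare equal.
inline bool DefaultAndLiteralAgree() {
  std::string a;      // default-constructed
  std::string b("");  // constructed from an empty C string
  return a.empty() && b.empty() && a == b;
}
```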
