This is an automated email from the ASF dual-hosted git repository.

granthenke pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/kudu.git


The following commit(s) were added to refs/heads/master by this push:
     new 52c5935  [test] Fix ASAN failure when Hive Metastore connections are retried.
52c5935 is described below

commit 52c59352f8c06be359edb95d2a68ad06252e2031
Author: Grant Henke <[email protected]>
AuthorDate: Thu Feb 20 13:35:33 2020 -0600

    [test] Fix ASAN failure when Hive Metastore connections are retried.
    
    I saw an ASAN test failure that occurred when there was a failure
    to connect to the Hive Metastore. This may not fix the connection
    issue, but it fixes the unsafe ASAN failure and allows the test to
    continue.
    
    Below is a sample of the log:
    
    W0220 18:46:15.548344 18002 client.h:351] Failed to connect to Hive Metastore (127.0.0.1:45269): Network error: failed to open Hive Metastore connection: socket open() error: Connection refused
    I0220 18:46:16.549294 18002 client.cc:56] TSocket::open() error on socket (after THRIFT_POLL) <Host: 127.0.0.1 Port: 45269>Connection refused
    W0220 18:46:16.549479 18002 client.h:351] Failed to connect to Hive Metastore (127.0.0.1:45269): Network error: failed to open Hive Metastore connection: socket open() error: Connection refused
    /home/jenkins-slave/workspace/kudu-master/0/src/kudu/thrift/client.h:204:3: runtime error: left shift of 100 by 26 places cannot be represented in type 'int'
        #0 0x7f527299d77b in kudu::thrift::HaClient<kudu::hms::HmsClient>::Execute(std::function<kudu::Status (kudu::hms::HmsClient*)>)::'lambda'()::operator()() const /home/jenkins-slave/workspace/kudu-master/0/src/kudu/thrift/client.h:204:3
        #1 0x7f526e44ead7 in boost::function0<void>::operator()() const /home/jenkins-slave/workspace/kudu-master/0/thirdparty/installed/uninstrumented/include/boost/function/function_template.hpp:770:14
        #2 0x7f526b6f21f4 in kudu::ThreadPool::DispatchThread() /home/jenkins-slave/workspace/kudu-master/0/src/kudu/util/threadpool.cc:685:22
        #3 0x7f526b70c992 in boost::_bi::bind_t<void, boost::_mfi::mf0<void, kudu::ThreadPool>, boost::_bi::list1<boost::_bi::value<kudu::ThreadPool*> > >::operator()() /home/jenkins-slave/workspace/kudu-master/0/thirdparty/installed/uninstrumented/include/boost/bind/bind.hpp:1222:16
        #4 0x7f526e44ead7 in boost::function0<void>::operator()() const /home/jenkins-slave/workspace/kudu-master/0/thirdparty/installed/uninstrumented/include/boost/function/function_template.hpp:770:14
        #5 0x7f526b6d812a in kudu::Thread::SuperviseThread(void*) /home/jenkins-slave/workspace/kudu-master/0/src/kudu/util/thread.cc:675:3
        #6 0x7f5267917183 in start_thread /build/eglibc-SvCtMH/eglibc-2.19/nptl/pthread_create.c:312
        #7 0x7f526742dffc in clone sysdeps/unix/sysv/linux/x86_64/clone.S:111
    
    Change-Id: I1282ad36027b314d090e5a2dffdc3854002af761
    SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /home/jenkins-slave/workspace/kudu-master/0/src/kudu/thrift/client.h:204:3 in
    Reviewed-on: http://gerrit.cloudera.org:8080/15256
    Tested-by: Kudu Jenkins
    Reviewed-by: Alexey Serbin <[email protected]>
---
 src/kudu/thrift/client.h | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/src/kudu/thrift/client.h b/src/kudu/thrift/client.h
index 2e684e7..867cf2b 100644
--- a/src/kudu/thrift/client.h
+++ b/src/kudu/thrift/client.h
@@ -31,6 +31,7 @@
 #include "kudu/gutil/port.h"
 #include "kudu/gutil/ref_counted.h"
 #include "kudu/gutil/strings/substitute.h"
+#include "kudu/rpc/rpc.h"
 #include "kudu/thrift/ha_client_metrics.h"
 #include "kudu/util/async_util.h"
 #include "kudu/util/metrics.h"
@@ -48,6 +49,9 @@ class TProtocol;
 } // namespace apache
 
 namespace kudu {
+
+using rpc::ComputeExponentialBackoff;
+
 namespace thrift {
 
 // Options for a Thrift client connection.
@@ -251,14 +255,10 @@ Status HaClient<Service>::Execute(std::function<Status(Service*)> task) {
           if (PREDICT_TRUE(metrics_)) {
             metrics_->reconnections_failed->Increment();
           }
-          // Reconnect failed; retry with exponential backoff capped at 10s and
-          // fail the task. We don't bother with jitter here because only the
-          // leader master should be attempting this in any given period per
-          // cluster.
+          // Reconnect failed; retry with exponential backoff and fail the task.
           consecutive_reconnect_failures_++;
           reconnect_after_ = MonoTime::Now() +
-              std::min(MonoDelta::FromMilliseconds(100 << consecutive_reconnect_failures_),
-                       MonoDelta::FromSeconds(10));
+              ComputeExponentialBackoff(consecutive_reconnect_failures_);
           reconnect_failure_ = std::move(reconnect_status);
           return callback(reconnect_failure_);
         }
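
For readers who want to see why the removed expression tripped the sanitizer:
`100 << consecutive_reconnect_failures_` is evaluated in `int` before
`std::min` applies the 10 s cap, so the shift itself overflows once the
failure count grows large enough (the log above shows a shift by 26). The
following is a minimal sketch of an overflow-safe capped backoff. The name
`CappedBackoff` and its parameters are hypothetical and chosen for
illustration; this is not the implementation of
`rpc::ComputeExponentialBackoff`, which may use different constants and
jitter.

```cpp
#include <algorithm>
#include <chrono>
#include <cstdint>

// Hypothetical helper (not part of Kudu): returns
// min(base_ms * 2^failures, cap_ms) without overflowing.
// Assumes base_ms > 0 and cap_ms <= INT64_MAX / 2.
inline std::chrono::milliseconds CappedBackoff(int failures,
                                               int64_t base_ms = 100,
                                               int64_t cap_ms = 10000) {
  int64_t delay_ms = base_ms;
  // Stop doubling as soon as the cap is reached, so the loop runs at most
  // about log2(cap_ms / base_ms) + 1 times regardless of how large
  // `failures` gets, and delay_ms never exceeds 2 * cap_ms.
  for (int i = 0; i < failures && delay_ms < cap_ms; ++i) {
    delay_ms *= 2;
  }
  return std::chrono::milliseconds(std::min(delay_ms, cap_ms));
}
```

With the defaults, `CappedBackoff(1)` yields 200 ms and any count of 7 or
more yields the 10 s cap, which matches the intent of the removed
`std::min(..., MonoDelta::FromSeconds(10))` expression while keeping the
arithmetic in range.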
