stack created HBASE-18045:
-----------------------------
Summary: Add ' -o ConnectTimeout=10' to the ssh command we use in
ITBLL chaos monkeys
Key: HBASE-18045
URL: https://issues.apache.org/jira/browse/HBASE-18045
Project: HBase
Issue Type: Improvement
Components: integration tests
Reporter: stack
Priority: Trivial
Monkeys hang on me in long-running tests. I've not spent too much time on it
since it is rare enough, but I just went through a spate of them. When a
monkey's kill ssh hangs, all killing stops, which can give a false sense of
victory when you wake up in the morning and your job 'passed'. I also see
monkeys kill all servers in a cluster and fail to bring them back, which causes
the job to fail as no one is serving data. The latter may actually be another
issue, but for the former, I've had some success adding -o ConnectTimeout=10 as
an option on ssh. You can do it easily enough via config, but this issue is to
suggest that we add it in code.
Here is how you add it via config if interested:
<property>
  <name>hbase.it.clustermanager.ssh.opts</name>
  <value> -o ConnectTimeout=10 </value>
</property>
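
For the "add it in code" half, here is a rough sketch of what a default could look like. It is not the actual cluster manager code: the class name SshOptions and the method withDefaultConnectTimeout are made up for illustration, and the sketch works on plain strings rather than reading hbase.it.clustermanager.ssh.opts from a Configuration. The idea is to prepend -o ConnectTimeout=10 unless the configured options already set a ConnectTimeout, so an explicit setting still wins.

```java
import java.util.Locale;

// Hypothetical helper for illustration only; not taken from the HBase source.
final class SshOptions {
    // Default proposed in this issue: give up connecting after 10 seconds.
    static final String DEFAULT_CONNECT_TIMEOUT_OPT = "-o ConnectTimeout=10";

    private SshOptions() {}

    /**
     * Returns the ssh options to use. Adds the default ConnectTimeout unless
     * the configured options already mention one (ssh option keywords are
     * case-insensitive, so the check ignores case).
     */
    static String withDefaultConnectTimeout(String configuredOpts) {
        String opts = configuredOpts == null ? "" : configuredOpts.trim();
        if (opts.toLowerCase(Locale.ROOT).contains("connecttimeout")) {
            return opts; // the user picked a timeout; leave it alone
        }
        return opts.isEmpty()
            ? DEFAULT_CONNECT_TIMEOUT_OPT
            : DEFAULT_CONNECT_TIMEOUT_OPT + " " + opts;
    }

    public static void main(String[] args) {
        // Nothing configured: only the default is used.
        System.out.println(withDefaultConnectTimeout(null));
        // Other options configured: the default is prepended.
        System.out.println(withDefaultConnectTimeout(" -o StrictHostKeyChecking=no "));
        // Timeout already configured: returned unchanged.
        System.out.println(withDefaultConnectTimeout("-o ConnectTimeout=30"));
    }
}
```

Prepending rather than overriding means anyone who already sets the property as above keeps their own value.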
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)