Jordi Esteban created STORM-2551:
------------------------------------
Summary: Thrift client socket timeout
Key: STORM-2551
URL: https://issues.apache.org/jira/browse/STORM-2551
Project: Apache Storm
Issue Type: Bug
Reporter: Jordi Esteban
I am trying to deploy a highly available Nimbus using Docker. At the moment I
am only deploying two services (nimbus-1 and nimbus-2), so the Storm
configuration file includes the following parameter:
{{nimbus.seeds: [nimbus-1, nimbus-2]}}
The issue appears when the first of the services (nimbus-1) is down. For
example, deploying a topology from nimbus-2 can take around 15 minutes. I have
checked the code, and the cause is that it loops through all {{nimbus.seeds}}
hosts to find out which one is the leader. On each iteration it tries to create
a new NimbusClient (and therefore a new ThriftClient), but always passes null
as the timeout for the created socket, so it keeps trying to connect to the
host until a ConnectionTimeout is reached. Modifying the parameter
{{storm.thrift.socket.timeout.ms}} does not change the socket timeout.
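
To illustrate the effect of a bounded connect timeout on the seed loop, here is a minimal sketch using plain java.net sockets. It is not Storm's actual code: the class SeedProbe and its methods are hypothetical names, and it only finds the first reachable seed rather than the leader.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.util.List;
import java.util.Optional;
import java.util.function.Predicate;

// Hypothetical illustration; not part of Storm.
class SeedProbe {

    // Tries to open a TCP connection, giving up after connectTimeoutMs.
    // Note: a timeout of 0 means "wait indefinitely" for Socket.connect.
    static boolean canConnect(String host, int port, int connectTimeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), connectTimeoutMs);
            return true;
        } catch (IOException e) {
            // Covers refused connections, unresolved hosts and timeouts.
            return false;
        }
    }

    // Walks the seed list in order and returns the first host accepted by
    // the probe, so an unreachable nimbus-1 only costs one probe timeout.
    static Optional<String> firstReachable(List<String> seeds, Predicate<String> probe) {
        for (String seed : seeds) {
            if (probe.test(seed)) {
                return Optional.of(seed);
            }
        }
        return Optional.empty();
    }
}
```

With the seeds from this report it could be called as SeedProbe.firstReachable(List.of("nimbus-1", "nimbus-2"), h -> SeedProbe.canConnect(h, 6627, 3000)), assuming the default Nimbus Thrift port 6627 and an arbitrary 3 second timeout.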
I think that the ThriftClient should also use the Thrift socket timeout
parameter ({{storm.thrift.socket.timeout.ms}}), just as the ThriftServer (or
the transport plugin used in the communication) does; that was implemented in
[STORM-2254|https://issues.apache.org/jira/browse/STORM-2254].
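
As a rough sketch of the idea, the client side could read the timeout from the configuration map instead of passing null. The class ThriftTimeoutConfig, the method socketTimeoutMs and the fallback argument are hypothetical names for illustration, not existing Storm APIs.

```java
import java.util.Map;

// Hypothetical helper; not part of Storm.
class ThriftTimeoutConfig {

    static final String SOCKET_TIMEOUT_KEY = "storm.thrift.socket.timeout.ms";

    // Returns the configured socket timeout in milliseconds, or fallbackMs
    // when the key is absent or its value is not a number.
    static int socketTimeoutMs(Map<String, Object> conf, int fallbackMs) {
        Object value = conf.get(SOCKET_TIMEOUT_KEY);
        if (value instanceof Number) {
            return ((Number) value).intValue();
        }
        return fallbackMs;
    }
}
```

The returned value would then be passed as the socket timeout when the client is created, so that changing {{storm.thrift.socket.timeout.ms}} takes effect on the client as well.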
(This is my first issue + pull request, so sorry if something is wrong)