Jordi Esteban created STORM-2551:
------------------------------------

             Summary: Thrift client socket timeout
                 Key: STORM-2551
                 URL: https://issues.apache.org/jira/browse/STORM-2551
             Project: Apache Storm
          Issue Type: Bug
            Reporter: Jordi Esteban


I am trying to deploy a highly available Nimbus using Docker. At the moment I
am only deploying two services (nimbus-1 and nimbus-2), so the Storm
configuration file includes the following parameter:
{{nimbus.seeds: [nimbus-1, nimbus-2]}}

The problem appears when the first of the services (nimbus-1) is down. For
example, deploying a topology from nimbus-2 can take around 15 minutes. I have
checked the code, and the cause is that it loops through all {{nimbus.seeds}}
hosts to find out which one is the leader. On each iteration it creates a new
NimbusClient (and therefore a new ThriftClient), but always passes null as the
timeout for the created socket. As a result, it keeps trying to connect to the
unreachable host until a ConnectionTimeout is reached. Changing the parameter
{{storm.thrift.socket.timeout.ms}} has no effect on this socket timeout.
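
To illustrate the behavior I would expect, here is a minimal sketch using plain
{{java.net.Socket}}. It is not Storm's actual code: the class {{SeedProbe}} and
its methods are hypothetical names, and it only checks TCP reachability rather
than asking each host whether it is the leader. The point is that every
connection attempt in the loop is bounded by an explicit timeout.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.util.List;
import java.util.Optional;

// Hypothetical helper, not part of Storm: probes seed hosts with a bounded timeout.
final class SeedProbe {

    /** Returns true if a TCP connection to host:port succeeds within timeoutMs. */
    static boolean canConnect(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            // Bound the connect attempt instead of relying on the OS default.
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            // Bound later blocking reads on this socket as well.
            socket.setSoTimeout(timeoutMs);
            return true;
        } catch (IOException e) {
            // Covers refused connections, unresolved hosts, and timeouts.
            return false;
        }
    }

    /** Returns the first seed that accepts a connection, trying them in order. */
    static Optional<String> firstReachable(List<String> seeds, int port, int timeoutMs) {
        for (String seed : seeds) {
            if (canConnect(seed, port, timeoutMs)) {
                return Optional.of(seed);
            }
        }
        return Optional.empty();
    }
}
```

With something like {{SeedProbe.firstReachable(List.of("nimbus-1", "nimbus-2"), port, timeoutMs)}},
a dead nimbus-1 would cost at most {{timeoutMs}} before nimbus-2 is tried.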

I think the ThriftClient should also use the Thrift socket timeout parameter
({{storm.thrift.socket.timeout.ms}}), just as the ThriftServer (or the
transport plugin used in the communication) does. That server-side behavior
was implemented in
[STORM-2254|https://issues.apache.org/jira/browse/STORM-2254].
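
A rough sketch of the fallback I have in mind is below. The class
{{ThriftTimeouts}} and the method {{resolveTimeoutMs}} are hypothetical and do
not exist in Storm; I am also assuming the configuration is available as a
plain {{Map}}. The idea is simply that a null timeout falls back to the
configured value instead of being passed on as "no timeout".

```java
import java.util.Map;

// Hypothetical helper, not part of Storm: picks the socket timeout to apply.
final class ThriftTimeouts {

    static final String SOCKET_TIMEOUT_KEY = "storm.thrift.socket.timeout.ms";

    /**
     * Uses the explicit timeout when the caller gave one, otherwise the value
     * configured under storm.thrift.socket.timeout.ms, otherwise fallbackMs.
     */
    static int resolveTimeoutMs(Integer explicitTimeoutMs,
                                Map<String, Object> conf,
                                int fallbackMs) {
        if (explicitTimeoutMs != null) {
            return explicitTimeoutMs;
        }
        Object configured = conf.get(SOCKET_TIMEOUT_KEY);
        if (configured instanceof Number) {
            return ((Number) configured).intValue();
        }
        return fallbackMs;
    }
}
```

With this, a client created with a null timeout would still pick up whatever
{{storm.thrift.socket.timeout.ms}} is set to in the configuration.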

(This is my first issue and pull request, so apologies if anything is wrong.)



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
