Sudhir Prakash created AMBARI-5800:
--------------------------------------

             Summary: Race condition when starting all services causing Hive 
service check to fail
                 Key: AMBARI-5800
                 URL: https://issues.apache.org/jira/browse/AMBARI-5800
             Project: Ambari
          Issue Type: Bug
    Affects Versions: 1.6.0
         Environment: SLES11
ambari-server-1.6.0-39
hive-0.13.0.2.1.2.0-402
            Reporter: Sudhir Prakash
            Priority: Critical


# I performed an install on a 7-node cluster
# During the install, I noticed that the Hive service check failed with the 
error: {{Test connectivity to hive server Connection to byn001-1 on port 10000 
failed: [Errno 111] Connection refused}}
# I proceeded through the rest of the install wizard
# I ran Stop All
# I ran Start All and noticed the same error again

I retried Stop All/Start All, this time monitoring the Ambari start progress, 
the HiveServer2 logs, and netstat output for port 10000. I noticed that the 
service check runs immediately after the Hive start command is issued, and it 
fails. However, HiveServer2 takes about 55 seconds to actually start and bind 
to port 10000.

The startup sequence needs to be modified to wait for HiveServer2 to finish 
starting (that is, to be listening on port 10000) before running the service 
check.
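
As a rough illustration of the kind of wait being requested, here is a minimal 
sketch in Python that polls a TCP port until it accepts connections or a 
timeout expires. This is not Ambari code: the function name {{wait_for_port}}, 
the 120-second timeout, and the 2-second poll interval are all assumptions 
chosen for illustration, and the sketch targets Python 3.

```python
import socket
import time


def wait_for_port(host, port, timeout=120.0, interval=2.0):
    """Poll until a TCP connection to host:port succeeds or timeout expires.

    Hypothetical helper, not part of Ambari. Returns True once the port
    accepts a connection, False if the timeout is reached first.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # create_connection raises OSError (for example Errno 111,
            # Connection refused) while nothing is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            # Wait before the next attempt so the loop does not spin.
            time.sleep(interval)
```

On the cluster described here, a start sequence could call 
{{wait_for_port("byn001-1", 10000)}} after issuing the Hive start and run the 
service check only if it returns True. With the observed startup time of about 
55 seconds, a timeout of that order or larger would be needed.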

This issue is easily reproducible and has been seen by multiple people.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
