Dmitry Lysnichenko created AMBARI-6184:
------------------------------------------

             Summary: Incorrect value for started_count of Datanode component
                 Key: AMBARI-6184
                 URL: https://issues.apache.org/jira/browse/AMBARI-6184
             Project: Ambari
          Issue Type: Bug
          Components: agent
    Affects Versions: 1.6.1
            Reporter: Dmitry Lysnichenko
            Assignee: Dmitry Lysnichenko
             Fix For: 1.6.1


*STR:* 
# Install a 3-node cluster for the HDP 1.3 stack with HDFS, MapReduce, Nagios, 
Ganglia and ZooKeeper, with slave components installed on all 3 hosts.
# Enable security without setting up Kerberos.
# When the security wizard fails as expected, disable security.
# After security is successfully disabled, the following API call returns an 
incorrect value for started_count of Datanode. It reports 0, but Datanode is 
actually running on all hosts:
{code}
http://server:8080/api/v1/clusters/c1/components/?ServiceComponentInfo/category.in(SLAVE,CLIENT)&fields=ServiceComponentInfo/service_name,ServiceComponentInfo/installed_count,ServiceComponentInfo/started_count,ServiceComponentInfo/total_count&minimal_response=true
{code}
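
A minimal Python sketch of the same check, for reproducing the symptom from a script. The host, port and cluster name are the ones from the request above. The credentials passed to fetch() and the response shape (an "items" list whose entries carry a "ServiceComponentInfo" dict with "component_name") are assumptions, not taken from this report.

```python
import base64
import json
import urllib.request

# Host, port and cluster name as used in the request above.
BASE = "http://server:8080/api/v1/clusters/c1"

FIELDS = [
    "ServiceComponentInfo/service_name",
    "ServiceComponentInfo/installed_count",
    "ServiceComponentInfo/started_count",
    "ServiceComponentInfo/total_count",
]


def component_counts_url(base=BASE):
    """Build the component-count query shown in the report."""
    return (
        base
        + "/components/?ServiceComponentInfo/category.in(SLAVE,CLIENT)"
        + "&fields=" + ",".join(FIELDS)
        + "&minimal_response=true"
    )


def fetch(url, user, password):
    """GET the URL with HTTP basic auth (credentials are hypothetical)."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    req = urllib.request.Request(url, headers={"Authorization": "Basic " + token})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def components_with_none_started(payload):
    """Names of components that exist on hosts but report started_count == 0.

    Assumes the payload shape described in the lead-in.
    """
    names = []
    for item in payload.get("items", []):
        info = item.get("ServiceComponentInfo", {})
        if info.get("total_count", 0) > 0 and info.get("started_count", 0) == 0:
            names.append(info.get("component_name"))
    return names
```

Calling components_with_none_started(fetch(component_counts_url(), user, password)) on the affected cluster would be expected to list DATANODE, even though the process is running on every host.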

Reason:
With Kerberos misconfigured, the DN processes fail to start but leave behind a 
stale pid file owned by root. The next DN start command starts the DN process 
but cannot overwrite the pid file, so the server considers the DN stopped. If 
we start the DN once more, the command fails soon after start (because of the 
lock file in the data dir owned by the already running DN). The agent reports 
to the server that the DN is not running, so the server displays information 
that is correct from its point of view.
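
The mechanism described above can be illustrated with a generic pid-file liveness check. This is a sketch only, not the agent's actual status code: the function name is hypothetical, and it assumes a POSIX system where signal 0 probes for process existence.

```python
import os


def pid_file_reports_running(pid_file):
    """Return True if pid_file names a live process (hypothetical helper).

    A status check of this kind trusts the pid file alone. If the file still
    holds the pid of a dead process because a later start could not overwrite
    it, the check reports "stopped" even though a newer process is running.
    """
    try:
        with open(pid_file) as f:
            pid = int(f.read().strip())
    except (OSError, ValueError):
        # Missing, unreadable or malformed pid file: treat as not running.
        return False
    if pid <= 0:
        return False
    try:
        os.kill(pid, 0)  # signal 0 sends nothing, it only checks existence
    except ProcessLookupError:
        return False  # stale pid: no such process
    except PermissionError:
        return True  # process exists but belongs to another user
    return True
```

In the scenario from this report, the root-owned pid file would still contain the pid of the DN process that died during the failed Kerberos setup, so a check like this returns False while the real DN keeps running under a different pid.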



--
This message was sent by Atlassian JIRA
(v6.2#6252)
