Hi James,

Would you mind providing your Python script so we can take a look?

Thanks,
Nate

From: James Tanner <[email protected]>
Reply-To: "[email protected]" <[email protected]>
Date: Tuesday, November 24, 2015 at 3:51 PM
To: "[email protected]" <[email protected]>
Subject: Re: proper return for status() in a service script?

I killed the test service, restarted ambari-server, then tailed the logs to see 
if there were any clues ...

24 Nov 2015 15:47:35,132  INFO [qtp-ambari-agent-52] HeartBeatHandler:657 - 
State of service component TEST_SLAVE of service TEST of cluster TEST01 has 
changed from INSTALLED to STARTED at host node2.lab.net
24 Nov 2015 15:47:35,134  INFO [qtp-ambari-agent-52] HeartBeatHandler:657 - 
State of service component TEST_CLIENT of service TEST of cluster TEST01 has 
changed from INSTALLED to STARTED at host node2.lab.net
24 Nov 2015 15:47:37,775  INFO [Thread-23] AbstractPoolBackedDataSource:462 - 
Initializing c3p0 pool... com.mchange.v2.c3p0.ComboPooledDataSource [ 
acquireIncrement -> 3, acquireRetryAttempts -> 30, acquireRetryDelay -> 1000, 
autoCommitOnClose -> false, automaticTestTable -> null, 
breakAfterAcquireFailure -> false, checkoutTimeout -> 0, 
connectionCustomizerClassName -> null, connectionTesterClassName -> 
com.mchange.v2.c3p0.impl.DefaultConnectionTester, dataSourceName -> 
2rvxuc9dgfrzar1v9eztj|7320bccc, debugUnreturnedConnectionStackTraces -> false, 
description -> null, driverClass -> org.postgresql.Driver, factoryClassLocation 
-> null, forceIgnoreUnresolvedTransactions -> false, identityToken -> 
2rvxuc9dgfrzar1v9eztj|7320bccc, idleConnectionTestPeriod -> 50, initialPoolSize 
-> 3, jdbcUrl -> jdbc:postgresql://localhost/ambari, 
lastAcquisitionFailureDefaultUser -> null, maxAdministrativeTaskTime -> 0, 
maxConnectionAge -> 0, maxIdleTime -> 0, maxIdleTimeExcessConnections -> 0, 
maxPoolSize -> 5, maxStatements -> 0, maxStatementsPerConnection -> 120, 
minPoolSize -> 1, numHelperThreads -> 3, numThreadsAwaitingCheckoutDefaultUser 
-> 0, preferredTestQuery -> select 0, properties -> {user=******, 
password=******}, propertyCycle -> 0, testConnectionOnCheckin -> true, 
testConnectionOnCheckout -> false, unreturnedConnectionTimeout -> 0, 
usesTraditionalReflectiveProxies -> false ]
24 Nov 2015 15:47:37,988  INFO [Thread-23] JobStoreTX:861 - Freed 0 triggers 
from 'acquired' / 'blocked' state.
24 Nov 2015 15:47:38,014  INFO [Thread-23] JobStoreTX:871 - Recovering 0 jobs 
that were in-progress at the time of the last shut-down.
24 Nov 2015 15:47:38,014  INFO [Thread-23] JobStoreTX:884 - Recovery complete.
24 Nov 2015 15:47:38,014  INFO [Thread-23] JobStoreTX:891 - Removed 0 
'complete' triggers.
24 Nov 2015 15:47:38,015  INFO [Thread-23] JobStoreTX:896 - Removed 0 stale 
fired job entries.
24 Nov 2015 15:47:38,031  INFO [Thread-23] QuartzScheduler:575 - Scheduler 
ExecutionScheduler_$_NON_CLUSTERED started.
24 Nov 2015 15:47:38,723  INFO [qtp-ambari-agent-39] HeartBeatHandler:657 - 
State of service component TEST_CLIENT of service TEST of cluster TEST01 has 
changed from INSTALLED to STARTED at host node1.lab.net
24 Nov 2015 15:47:38,729  INFO [qtp-ambari-agent-39] HeartBeatHandler:657 - 
State of service component TEST_MASTER of service TEST of cluster CAS01 has 
changed from INSTALLED to STARTED at host node1.lab.net


Ambari flipped the state from "INSTALLED" to "STARTED", but I can tell from my 
service script's log output that no calls were ever made to it, and in 
particular no call to status(). What is Ambari actually doing when it decides 
to switch the state from INSTALLED to STARTED? It seems to be unrelated to the 
service script(s).

On Tue, Nov 24, 2015 at 3:08 PM, James Tanner <[email protected]> wrote:
What are the proper return values for "running" and "not running" in an Ambari 
service script?

If I reference the wiki, the status function should return nothing:

      
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=38571133#Overview%28Ambari1.5.0orlater%29-CreateandAddtheService.1

If I reference the GlusterFS YARN service script included in an HDP stack, 
there is no return value, but a ComponentIsNotRunning exception should be 
raised if the component is down.
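
For reference, the "return nothing when running, raise when down" convention can be sketched roughly as below. This is only an illustration under assumptions, not the actual Ambari or GlusterFS code: the exception class is a local stand-in (in a real service script I would expect it to come from resource_management.core.exceptions), and check_pid_file, TestSlave, and the PID file path are hypothetical names made up for the example.

```python
import errno
import os


class ComponentIsNotRunning(Exception):
    """Local stand-in so the sketch runs outside an Ambari agent.

    Assumption: a real service script would import the exception of the
    same name from resource_management.core.exceptions instead.
    """


def check_pid_file(pid_file):
    """Return None if the PID recorded in pid_file is alive, else raise."""
    try:
        with open(pid_file) as handle:
            pid = int(handle.read().strip())
    except (IOError, OSError, ValueError):
        # Missing, unreadable, or non-numeric PID file: treat as down.
        raise ComponentIsNotRunning("no usable PID file at %s" % pid_file)

    if pid <= 0:
        raise ComponentIsNotRunning("invalid PID %d in %s" % (pid, pid_file))

    try:
        os.kill(pid, 0)  # signal 0 probes for existence, sends nothing
    except OSError as err:
        if err.errno == errno.EPERM:
            return None  # process exists but is owned by another user
        raise ComponentIsNotRunning("process %d is not running" % pid)
    return None


class TestSlave(object):
    # Hypothetical path, chosen only for this example.
    PID_FILE = "/var/run/test/test_slave.pid"

    def status(self, env):
        # No return value when running; the exception signals "down".
        check_pid_file(self.PID_FILE)
```

The point of the sketch is only that status() communicates through the exception rather than through a return value, which matches both the wiki ("returns nothing") and the GlusterFS script.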


Regardless of what I return, it seems that the status in the internal Ambari 
database gets set to "running".

ambari=# select component_name,current_state from ambari.hostcomponentstate;
  component_name   | current_state
-------------------+---------------
 ZOOKEEPER_SERVER  | STARTED
 ZOOKEEPER_CLIENT  | INSTALLED
 ZOOKEEPER_CLIENT  | INSTALLED
 TEST_CLIENT       | STARTED
 TEST_CLIENT       | STARTED
 METRICS_MONITOR   | STARTED
 METRICS_COLLECTOR | STARTED
 ZOOKEEPER_SERVER  | STARTED
 TEST_SLAVE        | STARTED    # the service script raised the ComponentIsNotRunning exception for this when status() was called
 METRICS_MONITOR   | STARTED
 METRICS_COLLECTOR | STARTED
 TEST_MASTER       | STARTED    # the service script raised the ComponentIsNotRunning exception for this when status() was called
(12 rows)



I've also noticed via log statements that the service's status() function is 
called when ambari-server starts up or during a manual service state change, 
but Ambari never seems to poll status at any regular interval. Is that supposed 
to be the case? If so, how is the displayed service state ever accurate?
