[ 
https://issues.apache.org/jira/browse/AMBARI-18470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15529322#comment-15529322
 ] 

Hari Sekhon commented on AMBARI-18470:
--------------------------------------

I also hit this on Friday and left it over the weekend to see if it would clear 
given some time to run the service checks. On Monday it still hadn't cleared.

Triggering the service checks resolved it. I have written a CLI tool that 
does this via the Ambari API (it infers the cluster and all services, and can 
watch for completion). You can find it here:

https://github.com/harisekhon/pytools

./ambari_trigger_service_checks.py --all
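
For anyone who would rather script this by hand, here is a rough sketch of the
kind of request body involved. It is not taken from the tool above. It assumes
the body is POSTed to /api/v1/clusters/<cluster>/requests on the Ambari server
and that the command is named <SERVICE>_SERVICE_CHECK; both are assumptions to
verify against your Ambari version, and some services may use a different
command name. The function name build_service_check_request is hypothetical.

```python
import json


def build_service_check_request(service_name):
    """Build a JSON-serialisable body asking Ambari to run one service check.

    Assumption: the command follows the <SERVICE>_SERVICE_CHECK convention.
    Some services may deviate from it, so check before relying on this.
    """
    command = f"{service_name}_SERVICE_CHECK"
    return {
        "RequestInfo": {
            "context": f"{service_name} Service Check",
            "command": command,
        },
        "Requests/resource_filters": [{"service_name": service_name}],
    }


if __name__ == "__main__":
    # Print the body for one of the services named in the precheck failure.
    print(json.dumps(build_service_check_request("HIVE"), indent=2))
```

The sketch only builds the body; sending it (with authentication and the
X-Requested-By header that Ambari expects on write requests) is left out.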

> RU/EU cannot start because ServiceCheckValidityCheck incorrectly calculates 
> Service Checks that ran
> ---------------------------------------------------------------------------------------------------
>
>                 Key: AMBARI-18470
>                 URL: https://issues.apache.org/jira/browse/AMBARI-18470
>             Project: Ambari
>          Issue Type: Bug
>          Components: ambari-server
>    Affects Versions: 2.4.0
>            Reporter: Saumil Mayani
>            Assignee: Alejandro Fernandez
>             Fix For: 2.5.0
>
>         Attachments: AMBARI-18470.branch-2.5.patch, AMBARI-18470.trunk.patch
>
>
> RU/EU PreCheck for running ServiceChecks after config changes is incorrect.
> *Example:*
> {noformat}
> Last Service Check should be more recent than the last configuration change 
> for the given service 
> Reason: The following service configurations have been updated and their 
> Service Checks should be run again: HIVE, SPARK, RANGER, YARN 
> Failed on: HIVE,SPARK,RANGER,YARN
> {noformat}
> *Workaround:*
> Add stack.upgrade.bypass.prechecks=true to ambari.properties and restart 
> Ambari. 
> With the workaround the same problem is still reported, but the RU/EU is 
> allowed to proceed.
> *Logs:*
> {noformat}
> 21 Sep 2016 10:59:20,905 WARN [ambari-client-thread-172] Errors:173 - The 
> following warnings have been detected with resource and/or provider classes: 
> WARNING: A HTTP GET method, public javax.ws.rs.core.Response 
> org.apache.ambari.server.api.services.ComponentService.getComponents(java.lang.String,javax.ws.rs.core.HttpHeaders,javax.ws.rs.core.UriInfo),
>  should not consume any entity. 
> WARNING: A HTTP GET method, public javax.ws.rs.core.Response 
> org.apache.ambari.server.api.services.ComponentService.getComponent(java.lang.String,javax.ws.rs.core.HttpHeaders,javax.ws.rs.core.UriInfo,java.lang.String,java.lang.String),
>  should not consume any entity. 
> 21 Sep 2016 10:59:35,690 INFO [ambari-client-thread-168] 
> ServiceCheckValidityCheck:144 - Service HIVE latest config change is 
> 09-20-2016 02:04:55, latest service check executed at 09-20-2016 01:47:31 
> 21 Sep 2016 10:59:35,804 INFO [ambari-client-thread-168] 
> ServiceCheckValidityCheck:144 - Service SPARK latest config change is 
> 09-20-2016 10:58:58, latest service check executed at 04-15-2016 10:32:53 
> 21 Sep 2016 10:59:35,805 INFO [ambari-client-thread-168] 
> ServiceCheckValidityCheck:154 - Service RANGER service check has never been 
> executed 
> 21 Sep 2016 10:59:35,805 INFO [ambari-client-thread-168] 
> ServiceCheckValidityCheck:144 - Service YARN latest config change is 
> 09-20-2016 01:47:31, latest service check executed at 09-20-2016 01:44:43 
> 21 Sep 2016 10:59:35,805 INFO [ambari-client-thread-172] 
> ServiceCheckValidityCheck:144 - Service HIVE latest config change is 
> 09-20-2016 02:04:54, latest service check executed at 09-20-2016 01:44:43 
> 21 Sep 2016 10:59:35,808 INFO [ambari-client-thread-172] 
> ServiceCheckValidityCheck:144 - Service SPARK latest config change is 
> 09-20-2016 10:58:58, latest service check executed at 04-15-2016 10:32:53 
> 21 Sep 2016 10:59:35,809 INFO [ambari-client-thread-172] 
> ServiceCheckValidityCheck:154 - Service RANGER service check has never been 
> executed 
> 21 Sep 2016 10:59:35,810 INFO [ambari-client-thread-172] 
> ServiceCheckValidityCheck:144 - Service YARN latest config change is 
> 09-20-2016 02:04:54, latest service check executed at 09-20-2016 01:44:43 
> {noformat}
> *Root Cause:*
> When the database has more than 1000 Service Checks, the EU/RU PreCheck that 
> ensures a Service Check has run after any config change to a service is 
> incorrect, because it takes the first 1000 HostRoleCommand records as 
> opposed to the last page of 1000.
> This is because ServiceCheckValidityCheck.java doesn't impose an ordering 
> when creating a pagination request from TaskResourceProvider.java
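
To illustrate the root cause described above, here is a small self-contained
sketch. It is not Ambari code and not the attached patch: the record class,
the function names, and the integer timestamps are all hypothetical. It only
shows why taking the first page of 1000 records without an ordering can miss
the most recent service check once more than 1000 exist.

```python
from dataclasses import dataclass

PAGE_SIZE = 1000  # page size mentioned in the issue description


@dataclass
class ServiceCheckRecord:
    # Hypothetical stand-in for a HostRoleCommand row.
    task_id: int
    service: str
    start_time: int  # simplified timestamp; larger means more recent


def first_page_unordered(records, page_size=PAGE_SIZE):
    """Mimic the reported behaviour: take the first page in storage order."""
    return records[:page_size]


def last_page_ordered(records, page_size=PAGE_SIZE):
    """Sort by start time, then take the most recent page."""
    return sorted(records, key=lambda r: r.start_time)[-page_size:]


def latest_check(records, service):
    """Return the newest start_time for a service, or None if never run."""
    times = [r.start_time for r in records if r.service == service]
    return max(times) if times else None


# 1500 HIVE service checks stored oldest first, then a config change at t=1400.
records = [ServiceCheckRecord(i, "HIVE", i) for i in range(1, 1501)]
config_change_time = 1400

stale = latest_check(first_page_unordered(records), "HIVE")
fresh = latest_check(last_page_ordered(records), "HIVE")

# The unordered first page looks older than the config change,
# while the ordered last page shows a check that ran after it.
print(stale < config_change_time, fresh > config_change_time)
```

With the unordered first page the precheck sees a check older than the config
change and fails, even though a newer check exists further down the table.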



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
