-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/57324/
-----------------------------------------------------------

(Updated March 6, 2017, 12:26 p.m.)


Review request for Ambari, Alejandro Fernandez, Attila Magyar, Balázs Bence 
Sári, Eugene Chekanskiy, Jonathan Hurley, Laszlo Puskas, Sebastian Toader, and 
Sid Wagle.


Bugs: AMBARI-20309
    https://issues.apache.org/jira/browse/AMBARI-20309


Repository: ambari


Description
-------

The HBase Master CPU Utilization alert is in the UNKNOWN state due to a kinit error:

```
Execution of '/usr/bin/kinit -c /var/lib/ambari-agent/tmp/curl_krb_cache/metric_alert_ambari-qa_cc_56787c2122a8214ca9775f3433361f8b -kt HTTP/[email protected] /etc/security/keytabs/spnego.service.keytab > /dev/null' returned 1. kinit: Client not found in Kerberos database while getting initial credentials
```

This issue is also seen in /var/log/krb5kdc.log:

```
Mar 03 16:43:06 c6401.ambari.apache.org krb5kdc[4749](info): AS_REQ (4 etypes {18 17 16 23}) 192.168.64.101: CLIENT_NOT_FOUND: /etc/security/keytabs/[email protected] for krbtgt/[email protected], Client not found in Kerberos database
```
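Both messages show a keytab path in the position where kinit expects a principal, and a principal where it expects a keytab. As a minimal sketch of how that can happen (the helper `build_kinit_command` and all sample values are hypothetical and are not Ambari's actual code), assume the command is assembled as `kinit -c <cache> -kt <keytab> <principal>`:

```python
def build_kinit_command(ccache_path, keytab, principal, kinit="/usr/bin/kinit"):
    """Return a kinit argv of the form: kinit -c <cache> -kt <keytab> <principal>."""
    return [kinit, "-c", ccache_path, "-kt", keytab, principal]


# Hypothetical values, for illustration only.
keytab_path = "/etc/security/keytabs/spnego.service.keytab"
principal_name = "HTTP/host.example.com@EXAMPLE.COM"
ccache = "/tmp/example_cc"

# Expected usage: the keytab follows -kt, the principal comes last.
correct = build_kinit_command(ccache, keytab_path, principal_name)

# If the two values arrive swapped, the principal follows -kt and the
# keytab path is sent to the KDC as the client name.
swapped = build_kinit_command(ccache, keytab=principal_name, principal=keytab_path)

print(" ".join(correct))
print(" ".join(swapped))
```

With swapped inputs the last argument is the keytab path, which matches the client name the KDC reports as not found.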

# Cause
It appears that the HBASE alerts.json file 
(`common-services/HBASE/0.96.0.2.0/alerts.json`) has swapped values for the 
`kerberos_keytab` and `kerberos_principal` properties.

```
      {
        "name": "hbase_master_cpu",
        "label": "HBase Master CPU Utilization",
        "description": "This host-level alert is triggered if CPU utilization of the HBase Master exceeds certain warning and critical thresholds. It checks the HBase Master JMX Servlet for the SystemCPULoad property. The threshold values are in percent.",
        "interval": 5,
        "scope": "ANY",
        "enabled": true,
        "source": {
          "type": "METRIC",
          "uri": {
            "http": "{{hbase-site/hbase.master.info.port}}",
            "default_port": 60010,
            "connection_timeout": 5.0,
            "kerberos_keytab": "{{hbase-site/hbase.security.authentication.spnego.kerberos.principal}}",
            "kerberos_principal": "{{hbase-site/hbase.security.authentication.spnego.kerberos.keytab}}"
          },
          "reporting": {
            "ok": {
              "text": "{1} CPU, load {0:.1%}"
            },
            "warning": {
              "text": "{1} CPU, load {0:.1%}",
              "value": 200
            },
            "critical": {
              "text": "{1} CPU, load {0:.1%}",
              "value": 250
            },
            "units" : "%",
            "type": "PERCENT"
          },
          "jmx": {
            "property_list": [
              "java.lang:type=OperatingSystem/SystemCpuLoad",
              "java.lang:type=OperatingSystem/AvailableProcessors"
            ],
            "value": "{0} * 100"
          }
        }
      }
```

Notice:
```
"kerberos_keytab": "{{hbase-site/hbase.security.authentication.spnego.kerberos.principal}}",
"kerberos_principal": "{{hbase-site/hbase.security.authentication.spnego.kerberos.keytab}}"
```
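Other alert definitions could be checked for the same mistake. The following is a minimal sketch, not part of this patch: the function name `find_swapped_kerberos_props` is hypothetical, and the check is a heuristic that assumes the referenced property names end in `.principal` and `.keytab`, as they do in the definition above.

```python
import json


def find_swapped_kerberos_props(node, path="$"):
    """Yield JSON paths of objects whose Kerberos properties look swapped."""
    if isinstance(node, dict):
        keytab = node.get("kerberos_keytab")
        principal = node.get("kerberos_principal")
        if isinstance(keytab, str) and isinstance(principal, str):
            # Heuristic: strip the {{ }} wrapper and compare the property suffix.
            if (keytab.strip("{}").endswith(".principal")
                    and principal.strip("{}").endswith(".keytab")):
                yield path
        for key, value in node.items():
            yield from find_swapped_kerberos_props(value, f"{path}.{key}")
    elif isinstance(node, list):
        for index, item in enumerate(node):
            yield from find_swapped_kerberos_props(item, f"{path}[{index}]")


# Reduced sample mirroring only the "uri" portion of the definition above.
sample = json.loads("""
{
  "source": {
    "uri": {
      "kerberos_keytab": "{{hbase-site/hbase.security.authentication.spnego.kerberos.principal}}",
      "kerberos_principal": "{{hbase-site/hbase.security.authentication.spnego.kerberos.keytab}}"
    }
  }
}
""")

swapped_paths = list(find_swapped_kerberos_props(sample))
print(swapped_paths)  # ['$.source.uri']
```

To scan a real file, the parsed content of `alerts.json` can be passed in place of `sample`.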

# Solution
Correct the values of the `kerberos_keytab` and `kerberos_principal` properties in 
`common-services/HBASE/0.96.0.2.0/alerts.json` so that each references its matching `hbase-site` property:

```
"kerberos_principal": "{{hbase-site/hbase.security.authentication.spnego.kerberos.principal}}",
"kerberos_keytab": "{{hbase-site/hbase.security.authentication.spnego.kerberos.keytab}}"
```
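The file list for this change also includes `UpgradeCatalog250.java`, which suggests that definitions already stored by an existing cluster are corrected during upgrade. The Java change itself is not shown in this message, so the following is only a sketch of the idea under that assumption. The function name `fix_kerberos_props` is hypothetical, and it reuses the same suffix heuristic (`.principal` / `.keytab`) to decide whether a swap is needed, so that running it twice leaves the values unchanged.

```python
import json


def fix_kerberos_props(source_json):
    """Swap kerberos_keytab/kerberos_principal in an alert source if they look reversed."""
    source = json.loads(source_json)
    uri = source.get("uri")
    if isinstance(uri, dict):
        keytab = uri.get("kerberos_keytab")
        principal = uri.get("kerberos_principal")
        if isinstance(keytab, str) and isinstance(principal, str):
            looks_swapped = (keytab.strip("{}").endswith(".principal")
                             and principal.strip("{}").endswith(".keytab"))
            if looks_swapped:
                uri["kerberos_keytab"] = principal
                uri["kerberos_principal"] = keytab
    return json.dumps(source)


# Reduced sample containing only the affected properties.
broken = json.dumps({
    "type": "METRIC",
    "uri": {
        "kerberos_keytab": "{{hbase-site/hbase.security.authentication.spnego.kerberos.principal}}",
        "kerberos_principal": "{{hbase-site/hbase.security.authentication.spnego.kerberos.keytab}}",
    },
})

fixed = fix_kerberos_props(broken)
print(json.dumps(json.loads(fixed), indent=2))
```

Guarding the swap with a check, rather than swapping unconditionally, keeps the operation safe to apply to definitions that are already correct.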


Diffs
-----

  
ambari-server/src/main/java/org/apache/ambari/server/upgrade/UpgradeCatalog250.java
 2a684dc 
  ambari-server/src/main/resources/common-services/HBASE/0.96.0.2.0/alerts.json 
1b3ae25 
  
ambari-server/src/test/java/org/apache/ambari/server/upgrade/UpgradeCatalog250Test.java
 7ee66ef 


Diff: https://reviews.apache.org/r/57324/diff/2/


Testing (updated)
-------

Manually tested in a new Ambari 2.5.0 cluster and in an upgrade scenario from 
Ambari 2.4.2 to Ambari 2.5.0.

# Local test results:

```
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 26:07.123s
[INFO] Finished at: Mon Mar 06 12:25:32 EST 2017
[INFO] Final Memory: 70M/596M
[INFO] ------------------------------------------------------------------------
```


# Jenkins test results: PENDING


Thanks,

Robert Levas
