[ https://issues.apache.org/jira/browse/AMBARI-23872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Onischuk updated AMBARI-23872:
-------------------------------------
    Status: Patch Available  (was: Open)

> New Alert JSON Is Invalid When Sent To Agents
> ---------------------------------------------
>
>                 Key: AMBARI-23872
>                 URL: https://issues.apache.org/jira/browse/AMBARI-23872
>             Project: Ambari
>          Issue Type: Bug
>            Reporter: Andrew Onischuk
>            Assignee: Andrew Onischuk
>            Priority: Major
>             Fix For: 2.7.0
>
>         Attachments: AMBARI-23872.patch
>
>
> STR:
>   * Set up a simple cluster with HDFS
>   * Attempt to create a new Alert:
>
>     POST http://{{ambari-server}}:8080/api/v1/clusters/c1/alert_definitions
>     
>     {
>       "AlertDefinition": {
>         "component_name": "NAMENODE",
>         "description": "This service-level alert is triggered if the total number of volume failures across the cluster is greater than the configured critical threshold.",
>         "enabled": true,
>         "help_url": null,
>         "ignore_host": false,
>         "interval": 2,
>         "label": "NameNode Volume Failures",
>         "name": "namenode_volume_failures",
>         "scope": "ANY",
>         "service_name": "HDFS",
>         "source": {
>           "jmx": {
>             "property_list": [
>               "Hadoop:service=NameNode,name=FSNamesystemState/VolumeFailuresTotal"
>             ],
>             "value": "{0}"
>           },
>           "reporting": {
>             "ok": {
>               "text": "There are {0} volume failures"
>             },
>             "warning": {
>               "text": "There are {0} volume failures",
>               "value": 1
>             },
>             "critical": {
>               "text": "There are {0} volume failures",
>               "value": 1
>             },
>             "units": "Volume(s)"
>           },
>           "type": "METRIC",
>           "uri": {
>             "http": "{{hdfs-site/dfs.namenode.http-address}}",
>             "https": "{{hdfs-site/dfs.namenode.https-address}}",
>             "https_property": "{{hdfs-site/dfs.http.policy}}",
>             "https_property_value": "HTTPS_ONLY",
>             "kerberos_keytab": "{{hdfs-site/dfs.web.authentication.kerberos.keytab}}",
>             "kerberos_principal": "{{hdfs-site/dfs.web.authentication.kerberos.principal}}",
>             "default_port": 0,
>             "connection_timeout": 5,
>             "high_availability": {
>               "nameservice": "{{hdfs-site/dfs.internal.nameservices}}",
>               "alias_key": "{{hdfs-site/dfs.ha.namenodes.{{ha-nameservice}}}}",
>               "http_pattern": "{{hdfs-site/dfs.namenode.http-address.{{ha-nameservice}}.{{alias}}}}",
>               "https_pattern": "{{hdfs-site/dfs.namenode.https-address.{{ha-nameservice}}.{{alias}}}}"
>             }
>           }
>         }
>       }
>     }
>     
> This alert will not be scheduled on the agent correctly:
>     
>     
>     
>     ERROR 2018-05-16 20:11:55,186 AlertSchedulerHandler.py:307 - [AlertScheduler] Unable to load an invalid alert definition. It will be skipped.
>     Traceback (most recent call last):
>       File "/usr/lib/ambari-agent/lib/ambari_agent/AlertSchedulerHandler.py", line 287, in __json_to_callable
>         alert = MetricAlert(json_definition, source, self.config)
>       File "/usr/lib/ambari-agent/lib/ambari_agent/alerts/metric_alert.py", line 52, in __init__
>         self.metric_info = JmxMetric(alert_source_meta['jmx'])
>       File "/usr/lib/ambari-agent/lib/ambari_agent/alerts/metric_alert.py", line 288, in __init__
>         self.property_list = jmx_info['property_list']
>     KeyError: 'property_list'
>     
> Looking at `/var/lib/ambari-agent/cache/cluster_cache/alerts.json`, we can see
> that `property_list` was changed into `propertyList`.
>     
>     
>     
>             "name": "namenode_volume_failures",
>             "componentName": "NAMENODE",
>             "description": "This service-level alert is triggered if the total number of volume failures across the cluster is greater than the configured critical threshold.",
>             "interval": 2,
>             "clusterId": 2,
>             "label": "NameNode Volume Failures",
>             "ignore_host": false,
>             "source": {
>               "jmx": {
>                 "urlSuffix": "/jmx",
>                 "propertyList": [
>                   "Hadoop:service=NameNode,name=FSNamesystemState/VolumeFailuresTotal"
>                 ],
>     
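
The key mismatch described above can be reproduced in isolation. The following is a minimal sketch, not the agent's actual code: `read_property_list` only imitates the snake_case lookup shown in the traceback, and `find_missing_keys` is a hypothetical helper added for illustration. The JSON fragment is an abridged copy of the cached definition quoted above.

```python
import json

# Abridged fragment shaped like the cached alerts.json excerpt quoted above.
cached = json.loads('''
{
  "jmx": {
    "urlSuffix": "/jmx",
    "propertyList": [
      "Hadoop:service=NameNode,name=FSNamesystemState/VolumeFailuresTotal"
    ]
  }
}
''')


def read_property_list(jmx_info):
    """Imitates the lookup in the traceback: a plain snake_case key access."""
    return jmx_info['property_list']


def find_missing_keys(jmx_info, expected=('property_list',)):
    """Hypothetical helper: list the expected snake_case keys that are absent."""
    return [key for key in expected if key not in jmx_info]


try:
    read_property_list(cached['jmx'])
    outcome = 'loaded'
except KeyError as err:
    # The camelCase key "propertyList" does not satisfy the snake_case lookup.
    outcome = 'KeyError: %s' % err

print(outcome)
print(find_missing_keys(cached['jmx']))
```

Running a check like `find_missing_keys` against the cached file is one way to confirm which keys arrived in camelCase before the alert loader rejects the definition.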



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
