Andrew Onischuk created AMBARI-23872:
----------------------------------------

             Summary: New Alert JSON Is Invalid When Sent To Agents
                 Key: AMBARI-23872
                 URL: https://issues.apache.org/jira/browse/AMBARI-23872
             Project: Ambari
          Issue Type: Bug
            Reporter: Andrew Onischuk
            Assignee: Andrew Onischuk
             Fix For: 2.7.0
         Attachments: AMBARI-23872.patch

STR:

  * Set a simple cluster with HDFS
  * Attempt to create a new Alert:

    
    
    
    POST http://{{ambari-server}}:8080/api/v1/clusters/c1/alert_definitions
    
    {
      "AlertDefinition": {
        "component_name": "NAMENODE",
        "description": "This service-level alert is triggered if the total 
number of volume failures across the cluster is greater than the configured 
critical threshold.",
        "enabled": true,
        "help_url": null,
        "ignore_host": false,
        "interval": 2,
        "label": "NameNode Volume Failures",
        "name": "namenode_volume_failures",
        "scope": "ANY",
        "service_name": "HDFS",
        "source": {
          "jmx": {
            "property_list": [
              
"Hadoop:service=NameNode,name=FSNamesystemState/VolumeFailuresTotal"
            ],
            "value": "{0}"
          },
          "reporting": {
            "ok": {
              "text": "There are {0} volume failures"
            },
            "warning": {
              "text": "There are {0} volume failures",
              "value": 1
            },
            "critical": {
              "text": "There are {0} volume failures",
              "value": 1
            },
            "units": "Volume(s)"
          },
          "type": "METRIC",
          "uri": {
            "http": "{{hdfs-site/dfs.namenode.http-address}}",
            "https": "{{hdfs-site/dfs.namenode.https-address}}",
            "https_property": "{{hdfs-site/dfs.http.policy}}",
            "https_property_value": "HTTPS_ONLY",
            "kerberos_keytab": 
"{{hdfs-site/dfs.web.authentication.kerberos.keytab}}",
            "kerberos_principal": 
"{{hdfs-site/dfs.web.authentication.kerberos.principal}}",
            "default_port": 0,
            "connection_timeout": 5,
            "high_availability": {
              "nameservice": "{{hdfs-site/dfs.internal.nameservices}}",
              "alias_key": "{{hdfs-site/dfs.ha.namenodes.{{ha-nameservice}}}}",
              "http_pattern": 
"{{hdfs-site/dfs.namenode.http-address.{{ha-nameservice}}.{{alias}}}}",
              "https_pattern": 
"{{hdfs-site/dfs.namenode.https-address.{{ha-nameservice}}.{{alias}}}}"
            }
          }
        }
      }
    }
    

This alert will not be scheduled on the agent correctly:

    
    
    
    ERROR 2018-05-16 20:11:55,186 AlertSchedulerHandler.py:307 - 
[AlertScheduler] Unable to load an invalid alert definition. It will be skipped.
    Traceback (most recent call last):
      File "/usr/lib/ambari-agent/lib/ambari_agent/AlertSchedulerHandler.py", 
line 287, in __json_to_callable
        alert = MetricAlert(json_definition, source, self.config)
      File "/usr/lib/ambari-agent/lib/ambari_agent/alerts/metric_alert.py", 
line 52, in __init__
        self.metric_info = JmxMetric(alert_source_meta['jmx'])
      File "/usr/lib/ambari-agent/lib/ambari_agent/alerts/metric_alert.py", 
line 288, in __init__
        self.property_list = jmx_info['property_list']
    KeyError: 'property_list'
    

Looking at `/var/lib/ambari-agent/cache/cluster_cache/alerts.json`, we can see
that `property_list` was changed into `propertyList`.

    
    
    
            "name": "namenode_volume_failures",
            "componentName": "NAMENODE",
            "description": "This service-level alert is triggered if the total 
number of volume failures across the cluster is greater than the configured 
critical threshold.",
            "interval": 2,
            "clusterId": 2,
            "label": "NameNode Volume Failures",
            "ignore_host": false,
            "source": {
              "jmx": {
                "urlSuffix": "/jmx",
                "propertyList": [
                  
"Hadoop:service=NameNode,name=FSNamesystemState/VolumeFailuresTotal"
                ],
    





--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to