I have a custom Ambari service, with a metrics.json and widgets.json
defined.

The widgets display on the service dashboard summary page, but instead of
the graph or data, I see "n/a".

When I query the Ambari server through the REST API, I see the
service-specific metrics on the host_component, but not on the component.

In metrics.json, I've added some of the basic AMS host metrics, plus some
service-specific metrics. All metrics are defined in both "Component" and
"HostComponent". As an example:

   {
     "GPFS_MASTER": {
       "Component": [
         {
           "type": "ganglia",
           "metrics": {
             "default": {
               "metrics/cpu/cpu_idle":{
                 "metric":"cpu_idle",
                 "pointInTime":true,
                 "temporal":true,
                 "amsHostMetric":true
               },
               ...
               "metrics/gpfs/disk_used": {
                 "metric": "gpfs.disk_used",
                 "pointInTime": true,
                 "temporal": true
               },
               ...
             }
           }
         }
       ],
       "HostComponent": [
         {
           "type": "ganglia",
           "metrics": {
             "default": {
               "metrics/cpu/cpu_idle":{
                 "metric":"cpu_idle",
                 "pointInTime":true,
                 "temporal":true,
                 "amsHostMetric":true
               },
               ...
               "metrics/gpfs/disk_used": {
                 "metric": "gpfs.disk_used",
                 "pointInTime": true,
                 "temporal": true
               },
               ...
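
One way to double-check that both sections really define the same metric paths is a small script. The following is a minimal sketch, not part of the original setup: the function names and the trimmed-down sample definition are hypothetical, and it assumes the metrics.json layout shown above (component name, then "Component"/"HostComponent" lists, then "metrics" groups such as "default").

```python
import json


def metric_keys(section):
    """Collect every metric path defined under one section (a list of providers)."""
    keys = set()
    for provider in section:
        for group in provider.get("metrics", {}).values():
            keys.update(group.keys())
    return keys


def compare_sections(definition, component):
    """Return (only_in_component, only_in_host_component) for one component."""
    entry = definition[component]
    comp = metric_keys(entry.get("Component", []))
    host = metric_keys(entry.get("HostComponent", []))
    return sorted(comp - host), sorted(host - comp)


# Hypothetical, trimmed-down definition used only to exercise the functions.
# The HostComponent section deliberately omits one metric.
sample = json.loads("""
{
  "GPFS_MASTER": {
    "Component": [
      {"type": "ganglia", "metrics": {"default": {
        "metrics/cpu/cpu_idle": {"metric": "cpu_idle"},
        "metrics/gpfs/disk_used": {"metric": "gpfs.disk_used"}
      }}}
    ],
    "HostComponent": [
      {"type": "ganglia", "metrics": {"default": {
        "metrics/cpu/cpu_idle": {"metric": "cpu_idle"}
      }}}
    ]
  }
}
""")

only_component, only_host = compare_sections(sample, "GPFS_MASTER")
print("only in Component:", only_component)
print("only in HostComponent:", only_host)
```

To run it against a real file, load it with `json.load(open(path))` instead of the inline sample; two empty lists mean the sections agree.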


   I query the AMS Collector, and it seems that the metrics are there:

      [root@dn01-dat nathan]# curl -X GET -u admin:admin "http://dn01:6188/ws/v1/timeline/metrics?metricNames=gpfs.disk_used&hostname=dn01-dat.ibm.com"
      {"metrics":[{"timestamp":1447084964323,"metricname":"gpfs.disk_used","appid":"gpfs","hostname":"dn01-dat.ibm.com","starttime":1447084964,"metrics":{"1447084964":1437696.0}}]}
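
The collector reply carries more than the value: it also records the appid and hostname under which the metric was stored. The following is a minimal sketch for pulling those fields out of a reply; the function name is hypothetical, and the input is the reply shown above, pasted in as a string.

```python
import json


def summarize_collector_response(body):
    """Return (metricname, appid, hostname, latest_value) tuples from a collector reply."""
    rows = []
    for m in json.loads(body).get("metrics", []):
        points = m.get("metrics", {})
        # Data point keys are timestamps stored as strings; pick the numerically largest.
        latest = points[max(points, key=int)] if points else None
        rows.append((m.get("metricname"), m.get("appid"), m.get("hostname"), latest))
    return rows


# The reply from the curl call above, copied verbatim.
reply = (
    '{"metrics":[{"timestamp":1447084964323,"metricname":"gpfs.disk_used",'
    '"appid":"gpfs","hostname":"dn01-dat.ibm.com","starttime":1447084964,'
    '"metrics":{"1447084964":1437696.0}}]}'
)

for row in summarize_collector_response(reply):
    print(row)
```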


   I query Ambari, and whether I see the metric or not depends on how I do
   the query. If I query the GPFS_MASTER service component, I do NOT see
   the metric:

      [root@dn01-dat nathan]# curl -X GET -u admin:admin "http://dn01:8080/api/v1/clusters/nate/services/GPFS/components/GPFS_MASTER?fields=metrics/gpfs/disk_used"
      {
        "href" : "http://dn01:8080/api/v1/clusters/nate/services/GPFS/components/GPFS_MASTER?fields=metrics/gpfs/disk_used",
        "ServiceComponentInfo" : {
          "cluster_name" : "nate",
          "component_name" : "GPFS_MASTER",
          "service_name" : "GPFS"
        }
      }


   If I query the GPFS_MASTER host component on dn01, then I do see the
   metric:

      [root@dn01-dat nathan]# curl -X GET -u admin:admin "http://dn01:8080/api/v1/clusters/nate/hosts/dn01-dat.ibm.com/host_components/GPFS_MASTER?fields=metrics/gpfs/disk_used"
      {
        "href" : "http://dn01:8080/api/v1/clusters/nate/hosts/dn01-dat.ibm.com/host_components/GPFS_MASTER?fields=metrics/gpfs/disk_used",
        "HostRoles" : {
          "cluster_name" : "nate",
          "component_name" : "GPFS_MASTER",
          "host_name" : "dn01-dat.ibm.com"
        },
        "host" : {
          "href" : "http://dn01:8080/api/v1/clusters/nate/hosts/dn01-dat.ibm.com"
        },
        "metrics" : {
          "gpfs" : {
            "disk_used" : 1437696.0
          }
        }
      }
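
The difference between the two replies can be checked mechanically, which helps when comparing many metrics. The following is a minimal sketch: `metric_at` is a hypothetical helper, and the two dictionaries are trimmed copies of the replies shown above. It only reports whether a metric path is present; it does not explain why the component-level reply omits it.

```python
def metric_at(response, path):
    """Walk a slash-separated path such as 'metrics/gpfs/disk_used'; None if absent."""
    node = response
    for part in path.split("/"):
        if not isinstance(node, dict) or part not in node:
            return None
        node = node[part]
    return node


# Trimmed copies of the two Ambari replies shown above.
component_reply = {
    "ServiceComponentInfo": {
        "cluster_name": "nate",
        "component_name": "GPFS_MASTER",
        "service_name": "GPFS",
    }
}
host_component_reply = {
    "HostRoles": {
        "cluster_name": "nate",
        "component_name": "GPFS_MASTER",
        "host_name": "dn01-dat.ibm.com",
    },
    "metrics": {"gpfs": {"disk_used": 1437696.0}},
}

path = "metrics/gpfs/disk_used"
print("component:     ", metric_at(component_reply, path))
print("host_component:", metric_at(host_component_reply, path))
```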


   By comparison, if I query the "cpu_idle" metric, also defined in the
   GPFS metrics.json file, I see the metric in both queries:

      [root@dn01-dat nathan]# curl -X GET -u admin:admin "http://dn01:8080/api/v1/clusters/nate/services/GPFS/components/GPFS_MASTER?fields=metrics/cpu/cpu_idle"
      {
        "href" : "http://dn01:8080/api/v1/clusters/nate/services/GPFS/components/GPFS_MASTER?fields=metrics/cpu/cpu_idle",
        "ServiceComponentInfo" : {
          "cluster_name" : "nate",
          "component_name" : "GPFS_MASTER",
          "service_name" : "GPFS"
        },
        "metrics" : {
          "cpu" : {
            "cpu_idle" : 0.6248046875
          }
        }
      }
      [root@dn01-dat nathan]# curl -X GET -u admin:admin "http://dn01:8080/api/v1/clusters/nate/hosts/dn01-dat.ibm.com/host_components/GPFS_MASTER?fields=metrics/cpu/cpu_idle"
      {
        "href" : "http://dn01:8080/api/v1/clusters/nate/hosts/dn01-dat.ibm.com/host_components/GPFS_MASTER?fields=metrics/cpu/cpu_idle",
        "HostRoles" : {
          "cluster_name" : "nate",
          "component_name" : "GPFS_MASTER",
          "host_name" : "dn01-dat.ibm.com"
        },
        "host" : {
          "href" : "http://dn01:8080/api/v1/clusters/nate/hosts/dn01-dat.ibm.com"
        },
        "metrics" : {
          "cpu" : {
            "cpu_idle" : 0.624375
          }
        }
      }


   I feel like getting back "n/a" on the widgets is related to not seeing
   the metrics when I query the component rather than the host_component,
   but I'm not 100% sure about that either.

   My problems don't end there, either. When I create new widgets using
   the gpfs metrics, the behavior is wildly inconsistent. Sometimes I get
   the right metric data; other times, as I add and remove widgets, they
   go back to displaying "n/a" or show stale values for the metric.

   I must be missing something really simple, but I think I'm going to need
   help to figure out what that might be.

   Does anyone out there have any suggestions for how to investigate this
   further or what I might be missing with regard to defining or posting
   these metrics?

   Thanks,

   Nate Falk
   [email protected]
