[ 
https://issues.apache.org/jira/browse/AMBARI-5707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13992398#comment-13992398
 ] 

Siddharth Wagle edited comment on AMBARI-5707 at 5/8/14 4:31 AM:
-----------------------------------------------------------------

h2. Proposed Architecture
Please refer to the attached diagram (MetricsSystemArch.png).
*Legend*: Green: new components/services; Blue: integration points; -> arrows 
indicate the direction of data flow.

h2. Details

*Ambari Metrics Sink*: 
A replacement for the Ganglia sink that implements the Hadoop MetricsSink interface: 
http://hadoop.apache.org/docs/r1.1.1/api/org/apache/hadoop/metrics2/MetricsSink.html
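To make the sink's role concrete, here is a minimal sketch of the MetricsSink lifecycle (init / putMetrics / flush) in Python; the actual sink would be a Java class implementing the interface linked above, and the configuration key and class name below are hypothetical:

```python
# Python analogue of the Hadoop MetricsSink lifecycle; the real sink is Java.
# The "collector.uri" key and AmbariMetricsSink name are illustrative only.

class AmbariMetricsSink:
    def __init__(self):
        self.collector_uri = None
        self.buffer = []

    def init(self, conf):
        # Read the collector endpoint from the metrics2 configuration.
        self.collector_uri = conf.get("collector.uri", "http://localhost:8188")

    def put_metrics(self, record):
        # Buffer each metrics record instead of writing it to the local FS.
        self.buffer.append(record)

    def flush(self):
        # Hand the buffered records to the collector and clear the buffer.
        # In the real sink this would be a push over the wire protocol.
        sent, self.buffer = self.buffer, []
        return sent
```

The key difference from the Ganglia sink is that flush() pushes to the collector service rather than emitting Ganglia wire-format packets to gmond.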

*Ambari Metrics Collector Service*: A replacement for gmetad.
- Instead of writing metrics to the FS and then reading the data back in 
response to calls from GangliaPropertyProvider, we use a lightweight wire 
protocol to push metrics to a collector service, which writes them to a local 
DB (MySQL/Postgres/Oracle) as well as to a pluggable storage service layer.
- The collector service can perform the DB write in parallel with the push to 
a remote long-term storage and analysis solution such as OpenTSDB, using a 
named pipe in an asynchronous and process-space-independent manner.
- The remote storage service provider will be expected to supply a jar file 
implementing a shared Sink interface for pushing metrics in real time. The 
vision is to allow users to extend the Sink interface and hook in their own 
metrics storage.
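The collector's write path described above can be sketched as follows. This is an illustrative Python model (the real collector would be Java, and the push to remote sinks would be asynchronous via a named pipe rather than the synchronous call shown); all class and method names are hypothetical:

```python
# Sketch of the collector write path: persist each metric to a local DB
# while also handing it to pluggable remote sinks (e.g. an OpenTSDB
# forwarder). sqlite3 stands in for MySQL/Postgres/Oracle here.
import sqlite3

class MetricsCollector:
    def __init__(self, db_path=":memory:"):
        self.db = sqlite3.connect(db_path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS metrics "
            "(host TEXT, name TEXT, ts INTEGER, value REAL)")
        # Plug-in implementations of the shared Sink interface.
        self.remote_sinks = []

    def register_sink(self, sink):
        self.remote_sinks.append(sink)

    def write(self, host, name, ts, value):
        # Local DB write...
        self.db.execute("INSERT INTO metrics VALUES (?, ?, ?, ?)",
                        (host, name, ts, value))
        self.db.commit()
        # ...plus fan-out to remote storage; in the proposal this push is
        # asynchronous and process-space independent, synchronous here
        # only for brevity.
        for sink in self.remote_sinks:
            sink.put(host, name, ts, value)
```

A remote storage provider would implement put() in its jar; the collector never needs to know which backend is behind it.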

*Ambari Metrics Service*: 
- An API layer that provides access to the stored metric data and the 
capability to query it. It also provides pluggability in terms of where the 
fine-grained metrics data is written for long-term storage. 
- The Ambari admin can configure this to use their own metric storage and 
thereby configure the collectors.
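The two query shapes this API layer would need to serve (point-in-time and temporal, per the issue description) can be sketched like so; this is a hypothetical Python model over an in-memory list, not the Ambari REST API:

```python
# Illustrative query layer over stored samples. "store" is a list of
# (host, metric, ts, value) tuples, assumed ordered by timestamp.
# Names and shapes here are assumptions, not Ambari APIs.

class MetricsService:
    def __init__(self, store):
        self.store = store

    def point_in_time(self, host, metric, ts):
        # Latest sample at or before ts, or None if nothing matches.
        values = [v for h, m, t, v in self.store
                  if h == host and m == metric and t <= ts]
        return values[-1] if values else None

    def temporal(self, host, metric, start, end):
        # All (ts, value) samples within [start, end].
        return [(t, v) for h, m, t, v in self.store
                if h == host and m == metric and start <= t <= end]
```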

*Host Metrics Collector Daemon*: This is a replacement for the gmond running on 
hosts.
- Host-level metrics such as CPU, disk, etc. are captured by the Ganglia 
monitor daemon. We should be able to re-purpose it to push metrics to the 
Ambari Metrics Collector Service.
- The long-term goal is to rewrite gmond and create our own collector to 
achieve the following goals:
-- Reduce network traffic by reducing the number of packets sent over the wire
-- Reduce the number of processes running per host for the monitoring workload
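The packet-reduction goal above amounts to batching: sending many samples per datagram instead of one packet per metric. A sketch, where the JSON encoding and the 1400-byte budget are illustrative assumptions (a real daemon would pick an encoding and size based on the wire protocol and MTU):

```python
# Batch samples into the fewest payloads that each fit in one packet,
# rather than sending one packet per metric as gmond effectively does.
# Encoding (JSON) and the size budget are assumptions for illustration.
import json

MAX_PACKET_BYTES = 1400  # stay under a typical Ethernet MTU

def batch_metrics(samples):
    """Group samples into JSON payloads of at most MAX_PACKET_BYTES each."""
    packets, current = [], []
    for s in samples:
        candidate = current + [s]
        if len(json.dumps(candidate).encode()) > MAX_PACKET_BYTES and current:
            # Flush the batch that still fit, start a new one with s.
            packets.append(json.dumps(current))
            current = [s]
        else:
            current = candidate
    if current:
        packets.append(json.dumps(current))
    return packets
```

With 50 small samples this yields one datagram instead of 50, which is exactly the traffic reduction the bullet points describe.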

*HA Requirements*:
Ambari Metrics Service: This is a master daemon and may have built-in HA 
support in the future.

*Scaling out*:
The Ambari Metrics Collector can be envisioned as a slave component; a typical 
cluster should be able to deploy multiple instances of this service, achieving 
fan-out based on the number of hosts in the cluster.
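One simple way to realize that fan-out is to map each host deterministically to one of the deployed collector instances. A hash-mod sketch (illustrative only; a real deployment might prefer consistent hashing so that adding a collector reshuffles fewer hosts):

```python
# Deterministically assign each host to one of N collector instances,
# so sink traffic fans out across collectors as the cluster grows.
# This scheme is an assumption, not part of the proposal's design.
import hashlib

def assign_collector(hostname, collectors):
    digest = hashlib.md5(hostname.encode()).hexdigest()
    return collectors[int(digest, 16) % len(collectors)]
```

Because the mapping depends only on the hostname and the collector list, every sink computes the same assignment with no coordination.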





> Replace Ganglia with high performant and pluggable Metrics System
> -----------------------------------------------------------------
>
>                 Key: AMBARI-5707
>                 URL: https://issues.apache.org/jira/browse/AMBARI-5707
>             Project: Ambari
>          Issue Type: New Feature
>          Components: agent, controller
>    Affects Versions: 1.6.0
>            Reporter: Siddharth Wagle
>            Assignee: Siddharth Wagle
>            Priority: Critical
>         Attachments: MetricsSystemArch.png
>
>
> Ambari Metrics System
> - Ability to collect metrics from Hadoop and other Stack services
> - Ability to retain metrics at a high precision for a configurable time 
> period (say 5 days)
> - Ability to automatically purge metrics after retention period
> - At collection time, provide clear integration point for external system 
> (such as TSDB)
> - At purge time, provide clear integration point for metrics retention by 
> external system
> - Should provide default options for external metrics retention (say “HDFS”)
> - Provide tools / utilities for analyzing metrics in retention system (say 
> “Hive schema, Pig scripts, etc” that can be used with the default retention 
> store “HDFS”)
> System Requirements
> - Must be portable and platform independent
> - Must not conflict with any existing metrics system (such as Ganglia)
> - Must not conflict with existing SNMP infra
> - Must not run as root
> - Must have HA story (no SPOF)
> Usage
> - Ability to obtain metrics from Ambari REST API (point in time and temporal)
> - Ability to view metric graphs in Ambari Web (currently, fixed)
> - Ability to configure custom metric graphs in Ambari Web (currently, we have 
> metric graphs “fixed” into the UI)
> - Need to improve metric graph “navigation” in Ambari Web (currently, metric 
> graphs do not allow navigation at arbitrary timeframes, but only at ganglia 
> aggregation intervals) 
> - Ability to “view cluster” at point in time (i.e. see all metrics at that 
> point)
> - Ability to define metrics (and how + where to obtain) in Stack Definitions



--
This message was sent by Atlassian JIRA
(v6.2#6252)