Re: about metric processing template for large cluster

Eric Yang Sun, 16 Jan 2011 12:24:04 -0800

Hi,

Chukwa uses Pig scripts to analyze the data, so the analytics are
entirely up to the developer or researcher.  TopN and sorting are easy
to write in Pig Latin.  Take a look at
https://issues.apache.org/jira/browse/CHUKWA-575.  That script
aggregates metrics from a large number of nodes into cluster-level
summary numbers.  It should help you calculate histograms and load
distribution.
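
To make the topN and histogram ideas concrete, here is a rough sketch
of the logic in plain Python rather than Pig Latin.  It is not taken
from CHUKWA-575; the (hostname, load) sample layout, the function names
and the bucket width are all assumptions made for illustration.

```python
import heapq
from collections import Counter

# Hypothetical per-node samples: (hostname, load). Not Chukwa's schema.
samples = [
    ("node01", 0.42),
    ("node02", 3.10),
    ("node03", 0.95),
    ("node04", 7.80),
    ("node05", 1.20),
]


def top_n(rows, n):
    """Return the n rows with the largest metric value, highest first."""
    return heapq.nlargest(n, rows, key=lambda row: row[1])


def histogram(rows, bucket_width):
    """Count nodes per bucket, keyed by the lower edge of each bucket."""
    counts = Counter()
    for _, value in rows:
        lower_edge = int(value // bucket_width) * bucket_width
        counts[lower_edge] += 1
    return dict(sorted(counts.items()))


if __name__ == "__main__":
    print(top_n(samples, 2))
    print(histogram(samples, 2))
```

In a Pig script the same steps would most likely map onto ORDER BY
with LIMIT for the topN, and GROUP BY on a computed bucket for the
histogram.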


For RRD-style downsampling, we would need to introduce a Pig UDF that
calculates d(metric)/dt.
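
As an illustration of the d(metric)/dt calculation, here is a minimal
sketch in plain Python.  It is not an existing Chukwa or Pig UDF; the
function name and the (timestamp, value) input layout are assumptions,
and it does not handle counter resets or wrap-around.

```python
def rate_of_change(samples):
    """Compute d(metric)/dt between consecutive samples.

    samples: list of (timestamp_seconds, counter_value), sorted by time.
    Returns a list of (timestamp, rate_per_second), one per interval.
    """
    rates = []
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        dt = t1 - t0
        if dt <= 0:
            # Skip duplicate or out-of-order timestamps.
            continue
        rates.append((t1, (v1 - v0) / dt))
    return rates


if __name__ == "__main__":
    # Hypothetical counter sampled once a minute.
    print(rate_of_change([(0, 100), (60, 700), (120, 1300)]))
```

A real UDF would do the same per host and per metric, after the
samples have been grouped and ordered by timestamp.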

regards,
Eric

On Sun, Jan 16, 2011 at 7:17 AM, ZHOU Qi <[email protected]> wrote:
> Hi Guys,
>
> I have been using Ganglia-like software to monitor and troubleshoot
> a cluster of about 100 machines. But as the scale grows, it has
> become more difficult to identify abnormal metrics, abnormal
> machines, or the bottleneck in the current system.
>
> So far, we have considered adding some features for RRD viewing,
> such as getting the topN, sorting machines by their metrics, or
> grouping the metrics to find their distribution. We have no prior
> experience with Chukwa, and I am wondering whether there are any
> templates for metrics processing in Chukwa (such as sorting,
> histograms, machine/rack group distribution)?
>
> If you have a better idea for viewing these metrics, would you mind
> sharing it?
>
