Hi Akshay,

In both options, data down-sampling is required. RRDtool does the down-sampling as data is written to the RRD files. Chukwa 0.4 uses MySQL for data down-sampling; the graph is then rendered with the flot graphing library (http://code.google.com/p/flot/) to serve the data. There was also a prototype to render graphs on the server side with JFreeChart. However, there was no clear interface to expose graphable data.
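To make "down-sampling" concrete for anyone following along: both RRDtool's consolidation and the Chukwa 0.4 MySQL aggregation boil down to averaging raw samples into fixed-width time buckets. A minimal sketch of that idea in Python (this is illustrative only, not RRDtool's or Chukwa's actual code; the function name is mine):

```python
from statistics import mean

def downsample(samples, bucket_seconds):
    """Average (timestamp, value) samples into fixed-width time buckets,
    the same consolidation RRDtool-style tools apply as data is written."""
    buckets = {}
    for ts, value in samples:
        # Map each sample to the start of its bucket, then average per bucket.
        buckets.setdefault(ts - ts % bucket_seconds, []).append(value)
    return sorted((start, mean(values)) for start, values in buckets.items())

# Ten-second CPU samples consolidated into one-minute averages:
raw = [(0, 10.0), (10, 20.0), (30, 30.0), (60, 40.0), (70, 60.0)]
print(downsample(raw, 60))  # [(0, 20.0), (60, 50.0)]
```

Serving the bucketed series (rather than every raw sample) is what keeps flot or dygraphs responsive over longer time ranges.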
In Chukwa 0.5, we are decoupling the data from the graphing library. There is a REST API for retrieving metrics data (see https://issues.apache.org/jira/browse/CHUKWA-520). However, Chukwa 0.5 is still under development, and the down-sampling has shifted from SQL statements to mapreduce/pig-latin scripts. I have not determined what will be in the final framework. It is most likely to use Oozie as the workflow scheduling engine to run mapreduce/pig-latin jobs that provide the down-sampling and aggregation framework. You are welcome to try out code from trunk (0.5). The current limitation is to avoid using a large time range, and there is no aggregation yet. Hope this helps.

regards,
Eric

On Tue, Dec 28, 2010 at 1:18 PM, Akshay Kumar <[email protected]> wrote:
> Hi,
> I have GWT as the front-end, where I want to embed this information in one
> of the following ways:
> a) Simply embed RRDtool kind of generated images. That means, I will have to
> run rrdtool (I am looking at rrd4j) on the server side and convert the data to
> RRD format on the agent/server side.
> b) Use some graphing library - like http://dygraphs.com/.
> I am not expecting too much volume. To start with, simple CPU, Memory and
> hadoop metrics collected from 20 or so machines, at a rate not more
> than 10 per minute per metric.
> Thanks,
> Akshay
>
> On 27 December 2010 00:05, Ariel Rabkin <[email protected]> wrote:
>>
>> 16 GB isn't a hard limit, just a suggestion. And that's based on the
>> assumption that you have a big cluster and are collecting a lot of
>> data and using the older MySQL-based infrastructure.
>>
>> How much memory you need depends on what volume of data you're
>> collecting and what you're doing with it. How do you intend to store
>> the data and how will you be visualizing it?
>>
>> --Ari
>>
>> On Sun, Dec 26, 2010 at 10:29 AM, Akshay Kumar <[email protected]>
>> wrote:
>> > Thanks,
>> > In my setup, I cannot afford (as of now) to have a machine with 16GB
>> > memory.
>> > So that means, I cannot deploy Chukwa as a monitoring solution? I do not
>> > intend to do any log analysis / collection for now - just simple OS and
>> > hadoop metrics.
>> >
>> > I mean, I do not understand why one would have 16GB as a hard limit for
>> > minimal functioning too.
>> > I imagine it should be for a high-performance system and not a bare-bones
>> > setup. What am I missing here?
>> >
>> > -Akshay
>> >
>> > On 26 December 2010 23:38, Ariel Rabkin <[email protected]> wrote:
>> >>
>> >> Yes. That 16 GB number is for the HICC server, not for the collection
>> >> side. And even then, it's if you have a lot of data (a whole cluster's
>> >> worth) living in a MySQL database with a web application serving the
>> >> data.
>> >>
>> >> The monitoring agent and the collector are both fairly small-footprint.
>> >>
>> >> --Ari
>> >>
>> >> On Sun, Dec 26, 2010 at 10:03 AM, Akshay Kumar <[email protected]>
>> >> wrote:
>> >> > Hi,
>> >> > Thanks for the responses. A bit late to check this one.
>> >> > I have one more query -
>> >> > In the Chukwa administration guide:
>> >> > http://people.apache.org/~eyang/docs/r0.1.2/admin.html
>> >> > It says:
>> >> > Chukwa can also be installed on a single node, in which case the
>> >> > machine must have at least 16 GB of memory.
>> >> >
>> >> > Q) For my use case (monitoring system metrics) - is it safe to
>> >> > assume it is not going to be that big a requirement for memory?
>> >> >
>> >> > Thanks,
>> >> > Akshay
>> >> >
>> >> > On 17 December 2010 10:23, ZHOU Qi <[email protected]> wrote:
>> >> >>
>> >> >> Got it. Thanks.
>> >> >>
>> >> >> 2010/12/17 Eric Yang <[email protected]>:
>> >> >> > Sure, here you go.
>> >> >> >
>> >> >> > Regards,
>> >> >> > Eric
>> >> >> >
>> >> >> > On 12/16/10 6:21 PM, "ZHOU Qi" <[email protected]> wrote:
>> >> >> >
>> >> >> > Hi Eric,
>> >> >> >
>> >> >> > I read the wiki of Chukwa, but there is little information about
>> >> >> > HICC.
>> >> >> > Where can I get a screenshot or demo of it?
>> >> >> >
>> >> >> > Thanks,
>> >> >> >
>> >> >> > 2010/12/17 Eric Yang <[email protected]>:
>> >> >> >> Hi Akshay,
>> >> >> >>
>> >> >> >> A) Yes. You can use “add sigar.SystemMetrics SystemMetrics [interval] 0”
>> >> >> >> to stream CPU state at the specified interval. For example:
>> >> >> >>
>> >> >> >> “add sigar.SystemMetrics SystemMetrics 5 0” (without quotes) will
>> >> >> >> stream CPU state every 5 seconds.
>> >> >> >>
>> >> >> >> B) Chukwa has a built-in graphing tool called HICC. It requires
>> >> >> >> HBase to be deployed in order to use HICC.
>> >> >> >>
>> >> >> >> However, an agent is still required on the client machines.
>> >> >> >>
>> >> >> >> Regards,
>> >> >> >> Eric
>> >> >> >>
>> >> >> >> On 12/16/10 4:34 AM, "Akshay Kumar" <[email protected]> wrote:
>> >> >> >>
>> >> >> >> Hi,
>> >> >> >> I have a Hadoop installation, and I want to collect some basic OS-level
>> >> >> >> metrics - cpu, memory, disk usage - and Hadoop metrics.
>> >> >> >>
>> >> >> >> I have looked into Ganglia, but it requires installing agents on
>> >> >> >> client machines, which is what I want to avoid.
>> >> >> >>
>> >> >> >> My queries:
>> >> >> >> a) Is this a fair use case for using chukwa? e.g. polling client
>> >> >> >> machines for CPU stats a few times per minute?
>> >> >> >> b) Is it possible to integrate data collected from chukwa collectors
>> >> >> >> in a form readable by rrdtool-style graphing tools on the server
>> >> >> >> side?
>> >> >> >>
>> >> >> >> Thanks,
>> >> >> >> Akshay
>> >> >> >
>> >> >
>> >>
>> >> --
>> >> Ari Rabkin [email protected]
>> >> UC Berkeley Computer Science Department
>> >
>>
>> --
>> Ari Rabkin [email protected]
>> UC Berkeley Computer Science Department
>
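[Editor's note] The “add sigar.SystemMetrics …” command in the thread is sent to a running Chukwa agent over its plain-text control port (9093 by default per the admin guide; verify against your configuration). A hedged Python sketch of issuing it programmatically - `build_add_command` and `add_adaptor` are illustrative helper names, not part of any Chukwa API:

```python
import socket

def build_add_command(interval):
    """Build the adaptor command from Eric's mail; the trailing 0 is the
    initial offset. The interval is in seconds."""
    return f"add sigar.SystemMetrics SystemMetrics {interval} 0\n"

def add_adaptor(host="localhost", port=9093, interval=5):
    """Send the command to a running Chukwa agent's control port and
    return its one-line response. Assumes the agent is reachable at
    host:port; 9093 is the default control port in the admin guide."""
    with socket.create_connection((host, port)) as sock:
        sock.sendall(build_add_command(interval).encode("ascii"))
        return sock.makefile().readline().strip()
```

The same command can of course be typed interactively via `telnet localhost 9093`, which is handy for checking what the agent reports before scripting it.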
