Thanks so much, Eric. I will take some time to grasp all this and try out stuff. Will definitely get back as and when I have some feedback to give.

Regards,
Akshay

On 29 December 2010 09:03, Eric Yang <[email protected]> wrote:
> Hi Akshay,
>
> In both options, data down sampling is required. RRDtool does its
> data down sampling when the data is written to the RRD files. Chukwa
> 0.4 uses MySQL for data down sampling. The graph is then rendered
> using the flot (http://code.google.com/p/flot/) graphing library to serve
> the data. There was also a prototype to render graphs on the server
> side with jfreechart. However, there was no clear interface to expose
> graph-able data.
>
> In Chukwa 0.5, we are decoupling the data from the graph library.
> There is a REST API interface to get metrics data. (See
> https://issues.apache.org/jira/browse/CHUKWA-520) However, Chukwa 0.5
> is still under development, and the data down sampling has shifted from
> sql statements into mapreduce/pig-latin scripts. I have not determined
> what will be in the final framework. It is most likely to use Oozie
> as the workflow scheduling engine to run mapreduce/pig-latin jobs to
> provide a down sampling and aggregation framework.
>
> You are welcome to try out code from trunk (0.5). The current
> limitation is to avoid using a large time range, and there is no
> aggregation. Hope this helps.
>
> regards,
> Eric
>
> On Tue, Dec 28, 2010 at 1:18 PM, Akshay Kumar <[email protected]>
> wrote:
> > Hi,
> > I have GWT as the front-end, where I want to embed this information in one
> > of the following ways:
> > a) Simply embed RRDtool kind of generated images. That means, I will have to
> > run rrdtool ( I am looking at rrd4j) on server side and convert the data to
> > RRD format on agent/server side.
> > b) Use some graphing library - like http://dygraphs.com/.
> > I am not expecting too much of volume. To start with simple CPU, Memory and
> > hadoop metrics collected from 20 or so machines collected at a rate not more
> > than 10 per minute per metric.
> > Thanks,
> > Akshay
> >
> > On 27 December 2010 00:05, Ariel Rabkin <[email protected]> wrote:
> >>
> >> 16 GB isn't a hard limit, just a suggestion. And that's based on the
> >> assumption that you have a big cluster and are collecting a lot of
> >> data and using the older MySQL based infrastructure.
> >>
> >> How much memory you need depends on what volume of data you're
> >> collecting and what you're doing with it. How do you intend to store
> >> the data and how will you be visualizing it?
> >>
> >> --Ari
> >>
> >> On Sun, Dec 26, 2010 at 10:29 AM, Akshay Kumar <[email protected]>
> >> wrote:
> >> > Thanks,
> >> > In my setup, I can not afford (as of now) to have a machine with 16GB
> >> > memory.
> >> > So that means, I can not deploy Chukwa as a monitoring solution? I do not
> >> > intend to do any log analysis / collection for now - just simple OS and
> >> > hadoop metrics.
> >> >
> >> > I mean, I do not understand why one would have 16GB as a hard limit for
> >> > minimal functioning too.
> >> > I imagine it should be for a high performance system and not a bare-bones
> >> > structure. What am I missing here?
> >> >
> >> > -Akshay
> >> >
> >> > On 26 December 2010 23:38, Ariel Rabkin <[email protected]> wrote:
> >> >>
> >> >> Yes. That 16 GB number is for the HICC server, not for the collection
> >> >> side. And even then, it's if you have a lot of data (a whole cluster's
> >> >> worth) living in a MySQL database with a web application serving the
> >> >> data.
> >> >>
> >> >> The monitoring agent and the collector are both fairly small-footprint.
> >> >>
> >> >> --Ari
> >> >>
> >> >> On Sun, Dec 26, 2010 at 10:03 AM, Akshay Kumar <[email protected]>
> >> >> wrote:
> >> >> > Hi,
> >> >> > Thanks for the responses. A bit late to check this one.
> >> >> > I have one more query -
> >> >> > In the Chukwa administration guide:
> >> >> > http://people.apache.org/~eyang/docs/r0.1.2/admin.html
> >> >> > It says
> >> >> > Chukwa can also be installed on a single node, in which case the
> >> >> > machine must have at least 16 GB of memory.
> >> >> >
> >> >> > Q) For my use case (monitoring system metrics) - is it safe to
> >> >> > assume it is not going to be that big a requirement for memory?
> >> >> >
> >> >> > Thanks,
> >> >> > Akshay
> >> >> >
> >> >> > On 17 December 2010 10:23, ZHOU Qi <[email protected]> wrote:
> >> >> >>
> >> >> >> Got it. Thanks.
> >> >> >>
> >> >> >> 2010/12/17 Eric Yang <[email protected]>:
> >> >> >> > Sure, here you go.
> >> >> >> >
> >> >> >> > Regards,
> >> >> >> > Eric
> >> >> >> >
> >> >> >> > On 12/16/10 6:21 PM, "ZHOU Qi" <[email protected]> wrote:
> >> >> >> >
> >> >> >> > Hi Eric,
> >> >> >> >
> >> >> >> > I read the Chukwa wiki, but there is little information about
> >> >> >> > HICC.
> >> >> >> > Where can I get a screenshot or demo of it?
> >> >> >> >
> >> >> >> > Thanks,
> >> >> >> > 2010/12/17 Eric Yang <[email protected]>:
> >> >> >> >> Hi Akshay,
> >> >> >> >>
> >> >> >> >> A) Yes. You can use “add sigar.SystemMetrics SystemMetrics [interval] 0”
> >> >> >> >> to stream CPU state at the specified interval. For example:
> >> >> >> >>
> >> >> >> >> “add sigar.SystemMetrics SystemMetrics 5 0” without quotes will stream
> >> >> >> >> CPU state every 5 seconds.
> >> >> >> >>
> >> >> >> >> B) Chukwa has a built-in graphing tool called HICC. It requires
> >> >> >> >> Hbase to be deployed in order to use HICC.
> >> >> >> >>
> >> >> >> >> However, an agent is still required on the client machines.
> >> >> >> >>
> >> >> >> >> Regards,
> >> >> >> >> Eric
> >> >> >> >>
> >> >> >> >> On 12/16/10 4:34 AM, "Akshay Kumar" <[email protected]>
> >> >> >> >> wrote:
> >> >> >> >>
> >> >> >> >> Hi,
> >> >> >> >> I have a Hadoop installation, and I want to collect some basic OS level
> >> >> >> >> metrics like - cpu, memory, disk usage, and Hadoop metrics.
> >> >> >> >>
> >> >> >> >> I have looked into Ganglia, but it requires installing agents on client
> >> >> >> >> machines, which is what I want to avoid.
> >> >> >> >>
> >> >> >> >> My queries:
> >> >> >> >> a) Is this a fair use case for using chukwa? e.g. polling client machines
> >> >> >> >> for CPU stats few times per minute?
> >> >> >> >> b) Is it possible to integrate data collected from chukwa collectors in a
> >> >> >> >> form readable by rrdtool kind of graphing tools on the server side?
> >> >> >> >>
> >> >> >> >> Thanks,
> >> >> >> >> Akshay
> >> >> >> >>
> >> >>
> >> >> --
> >> >> Ari Rabkin [email protected]
> >> >> UC Berkeley Computer Science Department
> >>
> >> --
> >> Ari Rabkin [email protected]
> >> UC Berkeley Computer Science Department
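
For anyone following the thread, the down sampling Eric mentions can be pictured roughly as follows. This is only a minimal sketch in Python, not what Chukwa, MySQL, or RRDtool actually do internally; the function name `downsample`, the 60-second bucket width, and the sample CPU readings are all assumptions made up for illustration. It averages raw (timestamp, value) samples into fixed-width time buckets so a graphing library has fewer points to draw.

```python
from collections import defaultdict
from statistics import mean


def downsample(samples, bucket_seconds):
    """Average (timestamp, value) samples into fixed-width time buckets.

    Returns a list of (bucket_start, mean_value) pairs sorted by bucket_start.
    """
    buckets = defaultdict(list)
    for ts, value in samples:
        # Align each timestamp to the start of the bucket it falls in.
        bucket_start = ts - (ts % bucket_seconds)
        buckets[bucket_start].append(value)
    return [(start, mean(values)) for start, values in sorted(buckets.items())]


# Hypothetical CPU readings: (seconds since start, percent busy).
cpu = [(0, 10.0), (5, 20.0), (10, 30.0), (60, 40.0), (65, 60.0)]
per_minute = downsample(cpu, 60)
print(per_minute)
```

With 20 or so machines reporting at most 10 samples per minute per metric, a per-minute average like this would probably be enough for a first pass; a coarser bucket width could be used when plotting longer time ranges.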
