Hi Jonathan,

Very useful information. I will look into Ganglia.
However, I do not have administrative privileges on the cluster, so I am not sure whether I can install Ganglia there.

Thank you for the information.

Best,
Tim

2015-02-22 0:53 GMT-06:00 Jonathan Aquilina <[email protected]>:

> Where I am working we are running a transient (temporary) cluster using
> Amazon EMR. When I was reading up on how things work, the documentation
> suggested using Ganglia for monitoring memory usage, network usage, etc.
> That way, depending on how things are set up, for example using an Amazon
> S3 bucket and pulling data directly into the cluster, the network link
> will always be saturated to ensure a constant flow of data.
>
> What I am suggesting is potentially looking at Ganglia.
>
> ---
> Regards,
> Jonathan Aquilina
> Founder Eagle Eye T
>
> On 2015-02-22 07:42, Fang Zhou wrote:
>
> Hi Jonathan,
>
> Thank you.
>
> The number of files affects the memory usage of the NameNode.
> I just want to get the real memory usage of the NameNode.
> The memory used in the heap always changes, so I have no idea which
> value is the right one.
>
> Thanks,
> Tim
>
> On Feb 22, 2015, at 12:22 AM, Jonathan Aquilina <[email protected]> wrote:
>
> I am rather new to Hadoop, but wouldn't the difference potentially be in
> how the files are split in terms of size?
>
> ---
> Regards,
> Jonathan Aquilina
> Founder Eagle Eye T
>
> On 2015-02-21 21:54, Fang Zhou wrote:
>
> Hi All,
>
> I want to test the memory usage on the NameNode and the DataNodes.
>
> I have tried jmap, jstat, /proc/<pid>/stat, top, ps aux, and the Hadoop
> web interface to check the memory.
> The values I get from them are different. I also found that the memory
> always changes periodically.
> This is the first thing that confuses me.
>
> I thought that the more files stored in HDFS, the more memory would be
> used on the NameNode and the DataNodes.
> I also thought the memory used on the NameNode should be larger than the
> memory used on each DataNode.
> However, some results show my ideas are wrong.
> For example, I tested the memory usage of the NameNode with 6000 and
> 1000 files. The "6000" memory is less than the "1000" memory in jmap's
> results.
> I also found that the memory usage on a DataNode is larger than the
> memory used on the NameNode.
>
> I really don't know how to get the memory usage on the NameNode and
> DataNodes.
>
> Can anyone give me some advice?
>
> Thanks,
> Tim
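One likely source of the confusing numbers: jmap and jstat report JVM heap, which grows and shrinks with garbage-collection cycles, so two snapshots taken at different points in a GC cycle are not comparable, and a "6000 files" snapshot taken right after a collection can easily look smaller than a "1000 files" snapshot taken right before one. For comparing the overall footprint of the NameNode and DataNode processes, an OS-level figure such as resident set size (VmRSS in /proc/<pid>/status) is steadier. A minimal sketch, assuming a Linux host; the pid used here is illustrative, and on a real cluster you would substitute the NameNode/DataNode pids from `jps`:

```python
import os

def rss_kib(pid):
    """Return the resident set size (VmRSS) of a process in KiB.

    Unlike the JVM heap reported by jmap/jstat, VmRSS is the physical
    memory the OS has actually assigned to the process, so it does not
    swing with every garbage-collection cycle.
    """
    with open(f"/proc/{pid}/status") as f:
        for line in f:
            if line.startswith("VmRSS:"):
                return int(line.split()[1])  # field is reported in KiB
    return None

# Illustrative only: measure this script's own process. For the
# NameNode/DataNode, pass the pids printed by `jps` instead.
print(rss_kib(os.getpid()))
```

Sampling VmRSS for each daemon every few seconds and averaging gives a number that can be compared across runs, which is hard to do with single jmap snapshots.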
