Hi Leonardo,

You might want to try Starfish, which supports memory profiling as well as CPU/disk/network profiling for performance tuning.
Jie

------------------
Starfish is an intelligent performance tuning tool for Hadoop.
Homepage: www.cs.duke.edu/starfish/
Mailing list: http://groups.google.com/group/hadoop-starfish

On Wed, Mar 7, 2012 at 2:36 PM, Leonardo Urbina <lurb...@mit.edu> wrote:
> Hello everyone,
>
> I have a Hadoop job that I run on several GBs of data that I am trying to
> optimize in order to reduce memory consumption and improve speed. I am
> following the steps outlined in Tom White's "Hadoop: The Definitive Guide"
> for profiling with HPROF (p. 161), by setting the following properties on
> the JobConf:
>
>     job.setProfileEnabled(true);
>     job.setProfileParams("-agentlib:hprof=cpu=samples,heap=sites,depth=6," +
>         "force=n,thread=y,verbose=n,file=%s");
>     job.setProfileTaskRange(true, "0-2");
>     job.setProfileTaskRange(false, "0-2");
>
> I am trying to run this locally on a single pseudo-distributed install of
> Hadoop (0.20.2) and it gives the following error:
>
>     Exception in thread "main" java.io.FileNotFoundException:
>     attempt_201203071311_0004_m_000000_0.profile (Permission denied)
>         at java.io.FileOutputStream.open(Native Method)
>         at java.io.FileOutputStream.<init>(FileOutputStream.java:194)
>         at java.io.FileOutputStream.<init>(FileOutputStream.java:84)
>         at org.apache.hadoop.mapred.JobClient.downloadProfile(JobClient.java:1226)
>         at org.apache.hadoop.mapred.JobClient.monitorAndPrintJob(JobClient.java:1302)
>         at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1251)
>         at com.BitSight.hadoopAggregator.AggregatorDriver.run(AggregatorDriver.java:89)
>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>         at com.BitSight.hadoopAggregator.AggregatorDriver.main(AggregatorDriver.java:94)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
>
> However, I can access these profiles directly from the tasktracker's logs
> (through the web UI). For the sake of running this locally, I could just
> ignore this error. However, I want to be able to profile the job once it is
> deployed to our Hadoop cluster, and I need to be able to automatically
> retrieve these logs. Do I need to change the permissions in HDFS to allow
> for this? Any ideas on how to fix this? Thanks in advance,
>
> Best,
> -Leo
>
> --
> Leo Urbina
> Massachusetts Institute of Technology
> Department of Electrical Engineering and Computer Science
> Department of Mathematics
> lurb...@mit.edu
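As a side note, the JobConf setters in the quoted message are thin wrappers over plain mapred.* configuration properties, so the same profiling setup can also be sketched as generic options on the command line (property names as of the 0.20-era configuration; the driver class and jar name below are taken from the stack trace, and the input/output paths are placeholders):

```shell
# Hypothetical invocation: same HPROF profiling setup expressed as
# -D generic options instead of JobConf setter calls.
hadoop jar aggregator.jar com.BitSight.hadoopAggregator.AggregatorDriver \
  -D mapred.task.profile=true \
  -D "mapred.task.profile.params=-agentlib:hprof=cpu=samples,heap=sites,depth=6,force=n,thread=y,verbose=n,file=%s" \
  -D mapred.task.profile.maps=0-2 \
  -D mapred.task.profile.reduces=0-2 \
  /input/path /output/path
```

Also, the stack trace shows JobClient.downloadProfile failing inside a local FileOutputStream, which suggests the client is trying to write the attempt_*.profile files into the directory the job was launched from (not into HDFS). If that directory is not writable by the submitting user, launching the job from a writable directory may be worth trying before touching HDFS permissions.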