Hello everyone,

I have a Hadoop job that runs on several GB of data, and I am trying to optimize it to reduce memory consumption and improve speed. I am following the steps outlined in Tom White's "Hadoop: The Definitive Guide" for profiling with HPROF (p. 161), setting the following properties in the JobConf:

    job.setProfileEnabled(true);
    job.setProfileParams("-agentlib:hprof=cpu=samples,heap=sites,depth=6,"
        + "force=n,thread=y,verbose=n,file=%s");
    job.setProfileTaskRange(true, "0-2");
    job.setProfileTaskRange(false, "0-2");

I am running this locally on a single pseudo-distributed install of Hadoop (0.20.2), and it gives the following error:

    Exception in thread "main" java.io.FileNotFoundException: attempt_201203071311_0004_m_000000_0.profile (Permission denied)
        at java.io.FileOutputStream.open(Native Method)
        at java.io.FileOutputStream.<init>(FileOutputStream.java:194)
        at java.io.FileOutputStream.<init>(FileOutputStream.java:84)
        at org.apache.hadoop.mapred.JobClient.downloadProfile(JobClient.java:1226)
        at org.apache.hadoop.mapred.JobClient.monitorAndPrintJob(JobClient.java:1302)
        at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1251)
        at com.BitSight.hadoopAggregator.AggregatorDriver.run(AggregatorDriver.java:89)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
        at com.BitSight.hadoopAggregator.AggregatorDriver.main(AggregatorDriver.java:94)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:156)

However, I can access the profile output directly from the tasktracker's logs (through the web UI). When running locally I could simply ignore this error, but I want to be able to profile the job once it is deployed to our Hadoop cluster, and I need to be able to retrieve these logs automatically.

Do I need to change the permissions in HDFS to allow for this? Any ideas on how to fix this?

Thanks in advance,

Best,
-Leo

--
Leo Urbina
Massachusetts Institute of Technology
Department of Electrical Engineering and Computer Science
Department of Mathematics
lurb...@mit.edu