Hi Jie,

According to the Starfish README, Hadoop programs must be written using the
new Hadoop API. That is not my case (I am using MultipleInputs, among other
features that the new API does not support). Is there any way around this?
Thanks,
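
(For context, this is roughly how my job wires up MultipleInputs with the old
org.apache.hadoop.mapred API; the mapper and input-format classes below are
placeholders rather than my actual classes:)

    // Old-API setup using org.apache.hadoop.mapred.lib.MultipleInputs.
    // TextRecordMapper and SeqRecordMapper are placeholder mapper classes.
    JobConf conf = new JobConf(getConf(), AggregatorDriver.class);
    MultipleInputs.addInputPath(conf, new Path("/input/text"),
            TextInputFormat.class, TextRecordMapper.class);
    MultipleInputs.addInputPath(conf, new Path("/input/seq"),
            SequenceFileInputFormat.class, SeqRecordMapper.class);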

-Leo

On Wed, Mar 7, 2012 at 3:19 PM, Jie Li <ji...@cs.duke.edu> wrote:

> Hi Leonardo,
>
> You might want to try Starfish, which supports memory profiling as well as
> CPU/disk/network profiling for performance tuning.
>
> Jie
> ------------------
> Starfish is an intelligent performance tuning tool for Hadoop.
> Homepage: www.cs.duke.edu/starfish/
> Mailing list: http://groups.google.com/group/hadoop-starfish
>
>
> On Wed, Mar 7, 2012 at 2:36 PM, Leonardo Urbina <lurb...@mit.edu> wrote:
>
> > Hello everyone,
> >
> > I have a Hadoop job that I run over several GBs of data, and I am trying
> > to optimize it to reduce memory consumption and improve speed. I am
> > following the steps outlined in Tom White's "Hadoop: The Definitive Guide"
> > for profiling with HPROF (p. 161), setting the following properties on the
> > JobConf:
> >
> >         job.setProfileEnabled(true);
> >         job.setProfileParams("-agentlib:hprof=cpu=samples,heap=sites,depth=6,"
> >                 + "force=n,thread=y,verbose=n,file=%s");
> >         job.setProfileTaskRange(true, "0-2");
> >         job.setProfileTaskRange(false, "0-2");
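> >
> > (For completeness, a rough sketch of where these calls sit in the driver's
> > run() method; the input/output paths are placeholders:)
> >
> >     public int run(String[] args) throws Exception {
> >         // Old-API (org.apache.hadoop.mapred) driver; AggregatorDriver is my Tool class.
> >         JobConf job = new JobConf(getConf(), AggregatorDriver.class);
> >         FileInputFormat.setInputPaths(job, new Path(args[0]));
> >         FileOutputFormat.setOutputPath(job, new Path(args[1]));
> >
> >         job.setProfileEnabled(true);
> >         job.setProfileParams("-agentlib:hprof=cpu=samples,heap=sites,depth=6,"
> >                 + "force=n,thread=y,verbose=n,file=%s");
> >         job.setProfileTaskRange(true, "0-2");   // profile map tasks 0-2
> >         job.setProfileTaskRange(false, "0-2");  // profile reduce tasks 0-2
> >
> >         // runJob() later tries to copy the .profile files into the client's
> >         // working directory, which is where the error below occurs.
> >         JobClient.runJob(job);
> >         return 0;
> >     }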
> >
> > I am trying to run this locally on a pseudo-distributed install of Hadoop
> > (0.20.2), and it gives the following error:
> >
> > Exception in thread "main" java.io.FileNotFoundException:
> > attempt_201203071311_0004_m_000000_0.profile (Permission denied)
> >         at java.io.FileOutputStream.open(Native Method)
> >         at java.io.FileOutputStream.<init>(FileOutputStream.java:194)
> >         at java.io.FileOutputStream.<init>(FileOutputStream.java:84)
> >         at org.apache.hadoop.mapred.JobClient.downloadProfile(JobClient.java:1226)
> >         at org.apache.hadoop.mapred.JobClient.monitorAndPrintJob(JobClient.java:1302)
> >         at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1251)
> >         at com.BitSight.hadoopAggregator.AggregatorDriver.run(AggregatorDriver.java:89)
> >         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> >         at com.BitSight.hadoopAggregator.AggregatorDriver.main(AggregatorDriver.java:94)
> >         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> >         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> >         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> >         at java.lang.reflect.Method.invoke(Method.java:597)
> >         at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
> >
> > However, I can access these profiles directly from the tasktracker's logs
> > (through the web UI). For the sake of running this locally, I could just
> > ignore this error, but I want to be able to profile the job once it is
> > deployed to our Hadoop cluster, and I need to be able to retrieve these
> > files automatically. Do I need to change the permissions in HDFS to allow
> > for this? Any ideas on how to fix this? Thanks in advance,
> >
> > Best,
> > -Leo
> >
> > --
> > Leo Urbina
> > Massachusetts Institute of Technology
> > Department of Electrical Engineering and Computer Science
> > Department of Mathematics
> > lurb...@mit.edu
> >
>



-- 
Leo Urbina
Massachusetts Institute of Technology
Department of Electrical Engineering and Computer Science
Department of Mathematics
lurb...@mit.edu
