Re: Debugging M/R job with tez

Manuel Godbert Thu, 06 Oct 2016 01:35:40 -0700

Thanks Hitesh
I opened TEZ-3461 <https://issues.apache.org/jira/browse/TEZ-3461>
Brgds


On Wed, Oct 5, 2016 at 7:19 PM, Hitesh Shah <hit...@apache.org> wrote:

> Thanks for filing the issues, Manuel.
>
> I took a quick look at trying to run the MR job in tez local mode. A
> native tez job running in local mode ( i.e. by running something like
> "hadoop jar ./tez/tez-examples-0.9.0-SNAPSHOT.jar wordcount
> -Dtez.local.mode=true …" ) works but local mode when trying to run an MR
> job via the tez framework does not. I don’t believe that this has really
> worked at all since the initial implementation of local mode was committed.
> There are some quirks of the MR to Tez translation layer which are still
> pending from an MR local mode perspective. If you can file a JIRA for the
> local mode issue, I can provide a small patch that allowed me to make some
> minor headway before I ended up hitting other issues.
>
> thanks
> — Hitesh
>
>
> > On Oct 5, 2016, at 5:44 AM, Manuel Godbert <manuel.godb...@gmail.com>
> wrote:
> >
> > Hello,
> >
> > I just opened TEZ-3459, with attached code addressing 3 of the issues I
> encountered, including the embedded jars one.
> >
> > I have not yet managed to provide an example showing the issue I had with
> multiple outputs. It would definitely help me if I could run my jobs
> locally with Tez to understand the specificity of these jobs. Would it be
> possible to get some support to set up my workstation to achieve this?
> >
> > Brgds
> >
> > Manuel
> >
> > On Wed, Sep 28, 2016 at 8:37 PM, Hitesh Shah <hit...@apache.org> wrote:
> > Thanks for the context, Manuel.
> >
> > Full compat with MR is something that has not really been fully tested
> with Tez. We believe that it works for the most part but there are probably
> cases out there which have either not been addressed or some which we are
> not aware of.
> >
> > It is great that you are trying this out. We can definitely help you
> figure out these issues and get the fixes into Tez to allow more users to
> seamlessly run MR jobs on Tez. It will be great if you can file a jira for
> the MR distributed cache handling of archives in Tez. A simple example to
> reproduce it would help a lot too so as to allow any of the Tez
> contributors to quickly debug and fix. I am assuming you are passing in
> archives/fat-jars to the distributed cache which MR implicitly applies ./*
> + ./lib/* pattern against to add to the runtime classpath? I am guessing
> this is something we may not have handled correctly in the translation
> layer.
> >
> > thanks
> > — Hitesh
> >
> > > On Sep 28, 2016, at 9:38 AM, Manuel Godbert <manuel.godb...@gmail.com>
> wrote:
> > >
> > > Hello,
> > >
> > > In non-local mode my M/R jobs generally behave as expected with Tez.
> However some still resist, and I am trying to have them running locally to
> understand if they can work with some changes (either in my code or in
> Tez code, and in that latter case I planned to contribute some way to the
> Tez effort). Running the WordCount locally is only a first step.
> > >
> > > I won't be able to provide source code easily for the real problematic
> jobs, as we use a quite big home made framework on top of hadoop and that
> is not open source... in a few words most of my issues actually seem to
> come from the task attempts IDs management. We have subclassed the output
> committers to manage multiple outputs, and when we reach the commit task
> step the produced files are not always where expected in the temporary task
> attempt paths. It is hard to say what happens exactly, and this is why I
> wanted to reproduce the issue locally before sharing it.
> > >
> > > Besides this, another minor issue we got is that we used to package
> our applicative jars with nested dependencies in /lib and these are ignored
> by Tez. We could easily work around this by expanding them and adapting our
> classpath.
> > >
> > > Regards
> > >
> > > On Wed, Sep 28, 2016 at 5:46 PM, Hitesh Shah <hit...@apache.org>
> wrote:
> > > Hello Manuel,
> > >
> > > Thanks for reporting the issue. Let me try and reproduce this locally
> to see what is going on.
> > >
> > > A quick question in general though - are you hitting issues when
> running in non-local mode too? Would you mind sharing the details of the
> issues you hit?
> > >
> > > thanks
> > > — Hitesh
> > >
> > >
> > > > On Sep 27, 2016, at 9:53 AM, Manuel Godbert <
> manuel.godb...@gmail.com> wrote:
> > > >
> > > > Hello,
> > > >
> > > > I have map/reduce jobs that work as expected within YARN, and I want
> to see if Tez can help me improve their performance. Alas, I am
> experiencing issues and I want to understand what happens, to see if I can
> adapt my code or if I can suggest Tez enhancements. For this I need to be
> able to debug jobs from within eclipse, with breakpoints in Tez source code
> etc.
> > > >
> > > > I am working on a linux (ubuntu) platform
> > > > I use the latest Tez version I found, i.e. 0.9.0-SNAPSHOT (also
> tried with 0.7.0)
> > > > > I have set up the hortonworks mini dev cluster https://github.com/hortonworks/mini-dev-cluster
> > > > > I am trying to run the basic WordCount2 code found here https://hadoop.apache.org/docs/r2.7.2/hadoop-mapreduce-client/hadoop-mapreduce-client-core/MapReduceTutorial.html#Example:_WordCount_v2.0
> > > > I added the following code to have tez running locally:
> > > >     conf.set("mapreduce.framework.name", "yarn-tez");
> > > >     conf.setBoolean("tez.local.mode", true);
> > > >     conf.set("fs.default.name", "file:///");
> > > >     conf.setBoolean("tez.runtime.optimize.local.fetch", true);
> > > >
> > > > And I am getting the following error:
> > > >
> > > > 2016-09-27 18:32:34 Running Dag: dag_1474992804027_0003_1
> > > > 2016-09-27 18:32:34 Running Dag: dag_1474992804027_0003_1
> > > > Exception in thread "main" java.lang.NullPointerException
> > > > >       at org.apache.tez.client.LocalClient.getApplicationReport(LocalClient.java:153)
> > > > >       at org.apache.tez.dag.api.client.rpc.DAGClientRPCImpl.getAppReport(DAGClientRPCImpl.java:231)
> > > > >       at org.apache.tez.dag.api.client.rpc.DAGClientRPCImpl.createAMProxyIfNeeded(DAGClientRPCImpl.java:251)
> > > > >       at org.apache.tez.dag.api.client.rpc.DAGClientRPCImpl.getDAGStatus(DAGClientRPCImpl.java:96)
> > > > >       at org.apache.tez.dag.api.client.DAGClientImpl.getDAGStatusViaAM(DAGClientImpl.java:360)
> > > > >       at org.apache.tez.dag.api.client.DAGClientImpl.getDAGStatusInternal(DAGClientImpl.java:220)
> > > > >       at org.apache.tez.dag.api.client.DAGClientImpl.getDAGStatus(DAGClientImpl.java:268)
> > > > >       at org.apache.tez.dag.api.client.MRDAGClient.getDAGStatus(MRDAGClient.java:58)
> > > > >       at org.apache.tez.mapreduce.client.YARNRunner.getJobStatus(YARNRunner.java:710)
> > > > >       at org.apache.tez.mapreduce.client.YARNRunner.submitJob(YARNRunner.java:650)
> > > > >       at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:240)
> > > > >       at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1290)
> > > > >       at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1287)
> > > > >       at java.security.AccessController.doPrivileged(Native Method)
> > > > >       at javax.security.auth.Subject.doAs(Subject.java:422)
> > > > >       at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
> > > > >       at org.apache.hadoop.mapreduce.Job.submit(Job.java:1287)
> > > > >       at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1308)
> > > > >       at WordCount2.main(WordCount2.java:136)
> > > >
> > > > > Please help me understand what I am doing wrong!
> > > >
> > > > Regards
> > >
> > >
> >
> >
>
>
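
For reference, the four local-mode properties quoted in the original message can be collected in one place, as in the minimal sketch below. The class name TezLocalModeSettings and its build() method are hypothetical and not part of Tez or Hadoop. The sketch uses fs.defaultFS, the current name of the deprecated fs.default.name key used in the quoted code. It deliberately avoids any Hadoop dependency so that it runs on its own.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: gathers the local-mode properties quoted in this thread. */
final class TezLocalModeSettings {

    private TezLocalModeSettings() {}

    /** Returns the settings in insertion order so they can be applied one by one. */
    static Map<String, String> build() {
        Map<String, String> settings = new LinkedHashMap<>();
        // Route MapReduce job submission through the Tez framework.
        settings.put("mapreduce.framework.name", "yarn-tez");
        // Ask Tez to run in local mode.
        settings.put("tez.local.mode", "true");
        // Use the local file system; fs.defaultFS replaces the deprecated fs.default.name.
        settings.put("fs.defaultFS", "file:///");
        // Same flag as in the quoted code.
        settings.put("tez.runtime.optimize.local.fetch", "true");
        return settings;
    }

    public static void main(String[] args) {
        // Print each property as key=value.
        build().forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

Each entry could then be applied to the job's Hadoop Configuration with conf.set(key, value). As noted above in the thread, these settings alone do not make an MR job run in Tez local mode; they only reproduce the setup that led to the reported NullPointerException.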
