When I see the below, it's because we're picking up the default hadoop
configuration for the filesystem; at the time the below runs, the
customization of the hadoop configuration for fs.default.name is not on the
CLASSPATH. Can you check into this? If hbase-site.xml is available, you
might do something like the below before setting up your job:

  c.set("fs.default.name", c.get(HConstants.HBASE_DIR))

...where 'c' above is an instance of HBaseConfiguration (we're setting
fs.default.name to the same filesystem hbase is pointed at).
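For example, a minimal sketch of that setup, assuming the 0.20-era mapred
API and that hbase-site.xml is on the CLASSPATH (MyJob is just a placeholder
for your own job class):

  import org.apache.hadoop.hbase.HBaseConfiguration;
  import org.apache.hadoop.hbase.HConstants;
  import org.apache.hadoop.mapred.JobConf;

  public class MyJob {
    public static JobConf createJobConf() {
      // Loads hbase-default.xml and hbase-site.xml from the CLASSPATH.
      HBaseConfiguration c = new HBaseConfiguration();
      // HConstants.HBASE_DIR is "hbase.rootdir", e.g. hdfs://namenode:9000/hbase.
      // Pointing fs.default.name at it makes the job client copy job.jar to
      // the same HDFS hbase uses, instead of the local filesystem.
      c.set("fs.default.name", c.get(HConstants.HBASE_DIR));
      return new JobConf(c, MyJob.class);
    }
  }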
St.Ack
On Fri, Sep 11, 2009 at 7:33 AM, Xine Jar <[email protected]> wrote:
> > Hi Jeff,
> > Thanks, I have tried JobControl with two consecutive jobs. Both jobs are
> > executed one after the other and generate the expected output results.
> > But I am getting a weird warning after each job finishes:
> >
> > 09/09/11 17:28:18 WARN mapred.JobClient: Use GenericOptionsParser for
> > parsing the arguments. Applications should implement Tool for the same.
> > Exception in thread "Thread-23" java.lang.IllegalArgumentException: Wrong
> > FS: hdfs://pc152.../job_200909021344/job.jar, expected: file:///
> >     at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:322)
> >     at org.apache.hadoop.fs.RawLocalFileSystem.pathToFile(RawLocalFileSystem.java:52)
> >     at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:416)
> >     at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:244)
> >     at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:192)
> >     at org.apache.hadoop.fs.FileSystem.copyFromLocalFile(FileSystem.java:1187)
> >     at org.apache.hadoop.fs.FileSystem.copyFromLocalFile(FileSystem.java:1163)
> >     at org.apache.hadoop.fs.FileSystem.copyFromLocalFile(FileSystem.java:1135)
> >     at org.apache.hadoop.mapred.JobClient.configureCommandLineOptions(JobClient.java:693)
> >     at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:788)
> >     at org.apache.hadoop.mapred.jobcontrol.Job.submit(Job.java:378)
> >     at org.apache.hadoop.mapred.jobcontrol.JobControl.startReadyJobs(JobControl.java:247)
> >     at org.apache.hadoop.mapred.jobcontrol.JobControl.run(JobControl.java:279)
> >     at java.lang.Thread.run(Thread.java:619)
> >
> >
> > I could not understand the reason for this warning. Could you clarify it
> > for me? I am just concerned about any side effects this warning might
> > have.
> >
> > Thank you,
> > CJ
> >
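The first line of that warning ("Use GenericOptionsParser ... Applications
should implement Tool") is addressed by running the driver through
ToolRunner. A minimal sketch, with MyDriver as a placeholder name:

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.conf.Configured;
  import org.apache.hadoop.mapred.JobClient;
  import org.apache.hadoop.mapred.JobConf;
  import org.apache.hadoop.util.Tool;
  import org.apache.hadoop.util.ToolRunner;

  public class MyDriver extends Configured implements Tool {
    public int run(String[] args) throws Exception {
      // getConf() returns the Configuration that ToolRunner has already run
      // through GenericOptionsParser, which is what the warning asks for.
      JobConf job = new JobConf(getConf(), MyDriver.class);
      // ... set mapper, reducer, input and output paths here ...
      JobClient.runJob(job);
      return 0;
    }

    public static void main(String[] args) throws Exception {
      System.exit(ToolRunner.run(new Configuration(), new MyDriver(), args));
    }
  }

The warning itself is harmless as far as job execution goes; it just means
generic command-line options such as -D, -files and -libjars are not being
parsed for you.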
> > On Fri, Sep 11, 2009 at 11:51 AM, Jeff Hammerbacher <[email protected]>
> > wrote:
> >
> >> Hey,
> >>
> >> Within the Apache Hadoop project, there is also an underdocumented tool
> >> called JobControl that lets you submit a sequence of jobs:
> >> http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/mapred/jobcontrol/JobControl.html
> >>
> >> Regards,
> >> Jeff
> >>
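For reference, a minimal sketch of chaining two jobs with the old-API
JobControl; jobConf1 and jobConf2 are placeholders for fully configured
JobConf instances:

  import org.apache.hadoop.mapred.JobConf;
  import org.apache.hadoop.mapred.jobcontrol.Job;
  import org.apache.hadoop.mapred.jobcontrol.JobControl;

  public class TwoJobChain {
    public static void main(String[] args) throws Exception {
      // Placeholders: set mapper, reducer, input and output paths on these
      // as you would for any standalone job.
      JobConf jobConf1 = new JobConf(TwoJobChain.class);
      JobConf jobConf2 = new JobConf(TwoJobChain.class);

      Job first = new Job(jobConf1);
      Job second = new Job(jobConf2);
      second.addDependingJob(first);   // second starts only after first succeeds

      JobControl control = new JobControl("two-job-chain");
      control.addJob(first);
      control.addJob(second);

      // JobControl implements Runnable; run it in its own thread and poll.
      Thread runner = new Thread(control);
      runner.start();
      while (!control.allFinished()) {
        Thread.sleep(1000);
      }
      control.stop();
    }
  }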
> >> On Thu, Sep 10, 2009 at 10:41 AM, Jonathan Gray <[email protected]>
> >> wrote:
> >>
> >> > Sometimes you have to, simple as that.
> >> >
> >> > There are tools out there like Cascading (http://www.cascading.org)
> >> > that are designed to help write multi-job chains.
> >> >
> >> > JG
> >> >
> >> >
> >> > Xine Jar wrote:
> >> >
> >> >> Hello,
> >> >> I have already written several simple MapReduce applications, always
> >> >> one job per application.
> >> >> Suppose I want to write a more complex application. Can someone tell me
> >> >> what the advantage is of splitting the application into two or more
> >> >> linear jobs?
> >> >>
> >> >> Regards,
> >> >> CJ
> >> >>
> >> >>
> >>
> >
> >
>