There is a bug in Hadoop 0.15.0 which has been fixed in 0.16; see this issue: https://issues.apache.org/jira/browse/HADOOP-1642
Any chance of updating the version of Hadoop that is used by Nutch?

On 1/10/08, Iwan Cornelius <[EMAIL PROTECTED]> wrote:
>
> I have included the property with the value 'false' in hadoop-site.xml,
> so it should be off.
>
> On 1/9/08, Dennis Kubes <[EMAIL PROTECTED]> wrote:
> >
> > Are you running with speculative execution on?
> >
> > Dennis
> >
> > Iwan Cornelius wrote:
> > > Hi Susam,
> > >
> > > I get this error for both cases 1 and 2.
> > >
> > > I think it's due to running Hadoop in local mode (i.e. on a single
> > > machine). It seems it's always assigning a job ID of 1. I've been
> > > using only a single thread, so I'm not sure why this is; then again,
> > > I don't really understand how the whole Nutch/Hadoop system works...
> > >
> > > The weird thing is, sometimes the script (both yours and bin/nutch)
> > > will run all the way through, sometimes for 1 or 2 "depths" of a
> > > crawl, sometimes only for the injecting of URLs. It's seemingly
> > > random.
> > >
> > > I've found nothing online to help out.
> > >
> > > Thanks Susam!
> > >
> > > On 1/9/08, Susam Pal <[EMAIL PROTECTED]> wrote:
> > >> I haven't really worked with the latest trunk, but I am wondering
> > >> if...
> > >>
> > >> 1. you get this error when you kill a crawl while it is running,
> > >> i.e. the unfinished crawl is killed and then a new crawl is started?
> > >>
> > >> 2. you get this error when you crawl using the 'bin/nutch crawl'
> > >> command as well as the crawl script?
> > >>
> > >> Regards,
> > >> Susam Pal
> > >>
> > >>> Hi there,
> > >>>
> > >>> I'm having problems running the latest release of Nutch. I get the
> > >>> following error when I try to crawl:
> > >>>
> > >>> Fetcher: segment: crawl/segments/20080109183955
> > >>> Fetcher: java.io.IOException: Target
> > >>> /tmp/hadoop-me/mapred/local/localRunner/job_local_1.xml already exists
> > >>>     at org.apache.hadoop.fs.FileUtil.checkDest(FileUtil.java:246)
> > >>>     at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:125)
> > >>>     at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:116)
> > >>>     at org.apache.hadoop.fs.LocalFileSystem.copyToLocalFile(LocalFileSystem.java:55)
> > >>>     at org.apache.hadoop.fs.FileSystem.copyToLocalFile(FileSystem.java:834)
> > >>>     at org.apache.hadoop.mapred.LocalJobRunner$Job.<init>(LocalJobRunner.java:86)
> > >>>     at org.apache.hadoop.mapred.LocalJobRunner.submitJob(LocalJobRunner.java:281)
> > >>>     at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:558)
> > >>>     at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:753)
> > >>>     at org.apache.nutch.fetcher.Fetcher.fetch(Fetcher.java:526)
> > >>>     at org.apache.nutch.fetcher.Fetcher.run(Fetcher.java:561)
> > >>>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> > >>>     at org.apache.hadoop.util.ToolBase.doMain(ToolBase.java:54)
> > >>>     at org.apache.nutch.fetcher.Fetcher.main(Fetcher.java:533)
> > >>>
> > >>> If I manually remove the offending directory it works... sometimes.
> > >>>
> > >>> Any help is appreciated.
> > >>>
> > >>> Regards,
> > >>> IWan
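
For reference, the speculative-execution switch Dennis asks about is the
mapred setting that launches duplicate attempts of slow tasks. A minimal
hadoop-site.xml override that turns it off is sketched below, assuming the
single property name used by 0.15-era hadoop-default.xml
(mapred.speculative.execution):

  <?xml version="1.0"?>
  <configuration>
    <!-- Disable speculative execution so duplicate attempts of
         slow map/reduce tasks are not launched. -->
    <property>
      <name>mapred.speculative.execution</name>
      <value>false</value>
    </property>
  </configuration>

Values set here override hadoop-default.xml, so this is where a 'false'
setting belongs, as Iwan describes above.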

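Until the Hadoop jar bundled with Nutch is upgraded to 0.16, a workaround
consistent with the report above (HADOOP-1642 leaves job_local_1.xml
behind, and removing it lets the job run) is to clear the stale
LocalJobRunner files before each crawl. A sketch, assuming the per-user
/tmp/hadoop-<user> layout shown in the trace and illustrative crawl
arguments:

  # Clear stale local-mode job files left behind by HADOOP-1642,
  # then restart the crawl. Adjust the path for your username (the
  # trace above shows /tmp/hadoop-me) and your own crawl options.
  rm -rf /tmp/hadoop-$USER/mapred/local/localRunner
  bin/nutch crawl urls -dir crawl -depth 3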