[jira] Updated: (NUTCH-266) hadoop bug when doing updatedb
[ http://issues.apache.org/jira/browse/NUTCH-266?page=all ]

Renaud Richardet updated NUTCH-266:
-----------------------------------

    Attachment: patch_hadoop-0.5.0.diff

Now that Hadoop 0.5 has been released, here's the patch to use hadoop-0.5.0.jar in Nutch-0.8.x

HTH,
Renaud

> hadoop bug when doing updatedb
> ------------------------------
>
>                 Key: NUTCH-266
>                 URL: http://issues.apache.org/jira/browse/NUTCH-266
>             Project: Nutch
>          Issue Type: Bug
>    Affects Versions: 0.8
>         Environment: windows xp, JDK 1.4.2_04
>            Reporter: Eugen Kochuev
>             Fix For: 0.9.0, 0.8.1
>
>         Attachments: patch.diff, patch_hadoop-0.5.0.diff
>
>
> I constantly get the following error message
> 060508 230637 Running job: job_pbhn3t
> 060508 230637  c:/nutch/crawl-20060508230625/crawldb/current/part-0/data:0+245
> 060508 230637  c:/nutch/crawl-20060508230625/segments/20060508230628/crawl_fetch/part-0/data:0+296
> 060508 230637  c:/nutch/crawl-20060508230625/segments/20060508230628/crawl_parse/part-0:0+5258
> 060508 230637 job_pbhn3t
> java.io.IOException: Target /tmp/hadoop/mapred/local/reduce_qnd5sx/map_qjp7tf.out already exists
>       at org.apache.hadoop.fs.FileUtil.checkDest(FileUtil.java:162)
>       at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:62)
>       at org.apache.hadoop.fs.LocalFileSystem.renameRaw(LocalFileSystem.java:191)
>       at org.apache.hadoop.fs.FileSystem.rename(FileSystem.java:306)
>       at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:101)
> Exception in thread "main" java.io.IOException: Job failed!
>       at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:341)
>       at org.apache.nutch.crawl.CrawlDb.update(CrawlDb.java:54)
>       at org.apache.nutch.crawl.Crawl.main(Crawl.java:114)

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] Updated: (NUTCH-266) hadoop bug when doing updatedb
[ http://issues.apache.org/jira/browse/NUTCH-266?page=all ]

Sami Siren updated NUTCH-266:
-----------------------------

    Fix Version/s: 0.8.1
                   0.9.0

> hadoop bug when doing updatedb
> ------------------------------
>
>                 Key: NUTCH-266
>                 URL: http://issues.apache.org/jira/browse/NUTCH-266
>             Project: Nutch
>          Issue Type: Bug
>    Affects Versions: 0.8
>         Environment: windows xp, JDK 1.4.2_04
>            Reporter: Eugen Kochuev
>             Fix For: 0.9.0, 0.8.1
>
>         Attachments: patch.diff

[jira] Updated: (NUTCH-266) hadoop bug when doing updatedb
[ http://issues.apache.org/jira/browse/NUTCH-266?page=all ]

Renaud Richardet updated NUTCH-266:
-----------------------------------

    Attachment: patch.diff

Thank you Sami,

We had a similar problem with Win XP and were able to fix it by using hadoop-nightly.jar. However, because of some changes in Hadoop (http://issues.apache.org/jira/browse/HADOOP-252), Nutch would not compile anymore. The attached patch will solve this.

Let us know if there is a better way.

> hadoop bug when doing updatedb
> ------------------------------
>
>                 Key: NUTCH-266
>                 URL: http://issues.apache.org/jira/browse/NUTCH-266
>             Project: Nutch
>          Issue Type: Bug
>    Affects Versions: 0.8
>         Environment: windows xp, JDK 1.4.2_04
>            Reporter: Eugen Kochuev
>         Attachments: patch.diff
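
For readers unfamiliar with this class of failure: the stack trace in the issue description shows a local-filesystem rename being rejected because the destination file is already present. The following is only a minimal sketch of that general behaviour using the standard java.nio.file API (Java 7 and later, so not the JDK 1.4.2 from the report). It is not Hadoop's code and not the attached patch; the class name, method names, and file names are hypothetical and chosen for illustration.

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class RenameTargetExists {

    /**
     * Moves src to dst without REPLACE_EXISTING.
     * Returns false when the move is rejected because dst already exists.
     */
    static boolean strictRename(Path src, Path dst) throws IOException {
        try {
            Files.move(src, dst);
            return true;
        } catch (FileAlreadyExistsException e) {
            // Same symptom as "Target ... already exists" in the report.
            return false;
        }
    }

    /** Moves src to dst, overwriting a stale dst if one is present. */
    static void replacingRename(Path src, Path dst) throws IOException {
        Files.move(src, dst, StandardCopyOption.REPLACE_EXISTING);
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical file names, loosely modelled on the map output files in the log.
        Path dir = Files.createTempDirectory("rename-demo");
        Path src = Files.write(dir.resolve("map_new.out"), "new".getBytes());
        Path dst = Files.write(dir.resolve("map_old.out"), "stale".getBytes());

        System.out.println("strict rename succeeded: " + strictRename(src, dst));
        replacingRename(src, dst);
        System.out.println("target now holds: " + new String(Files.readAllBytes(dst)));
    }
}
```

Whether overwriting the target is the right behaviour for a given job runner is a separate design question; the sketch only shows why a leftover file at the destination makes a strict rename fail.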