Hi all,
I ran into a problem today while running an intranet crawl with Nutch.
nutch crawl urltext.txt -dir gradschool -depth 7 -topN 200
The following is the error I got.
crawl started in: gradschool
rootUrlDir = urltext.txt
threads = 10
depth = 7
topN = 200
Injector: starting
Injector: crawlDb: gradschool/crawldb
Injector: urlDir: urltext.txt
Injector: Converting injected urls to crawl db entries.
Exception in thread "main" java.io.IOException: Mkdirs failed to create /tmp/hadoop-mjiang/mapred/system/submit_nbov9
at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:436)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:346)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:253)
at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:84)
at org.apache.hadoop.fs.LocalFileSystem.copyFromLocalFile(LocalFileSystem.java:49)
at org.apache.hadoop.fs.FileSystem.copyFromLocalFile(FileSystem.java:741)
at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:314)
at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:543)
at org.apache.nutch.crawl.Injector.inject(Injector.java:162)
at org.apache.nutch.crawl.Crawl.main(Crawl.java:115)
What's the problem here?
Thanks,
Alvin