Hi Dima,

What did you write on the command line?

    sh ./nutch crawl myurls .....

You need to put your URLs in an input directory (e.g. myurls). In that directory, put text files containing your URLs (e.g. myurls/myurllist.txt).

Kind regards
Matthias

-----Original Message-----
From: Dima Mazmanov [mailto:[EMAIL PROTECTED]
Sent: Wednesday, 22 February 2006 09:31
To: [email protected]
Subject: nutch-0.8 crawl problem

Hi!

I have problems with crawling. Mainly, I cannot even start a crawl. I downloaded the latest nutch source, and after 3 hours of struggling with the config files I gave up. I have some questions:

1) What is hadoop and how can I use it? I searched for information about hadoop and found that it is no longer integrated in nutch; it is a separate project. But in the lib folder I found the corresponding hadoop-0.1-dev.jar file. What does it do?

2) How can I crawl? :) When I type the command I get the following exception:

No input directories specified in: Configuration: defaults: hadoop-default.xml , mapred-default.xml , /tmp/hadoop/mapred/local/localRunner/job_vpit8j.xmlfinal: hadoop-site.xml
    at org.apache.hadoop.mapred.InputFormatBase.listFiles(InputFormatBase.java:84)
    at org.apache.hadoop.mapred.InputFormatBase.getSplits(InputFormatBase.java:94)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:70)
060222 131857 map 0% reduce 0%
Exception in thread "main" java.io.IOException: Job failed!
    at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:310)
    at org.apache.nutch.crawl.Injector.inject(Injector.java:114)
    at org.apache.nutch.crawl.Crawl.main(Crawl.java:104)

I wrote the following in hadoop-site.xml:

<property>
  <name>fs.default.name</name>
  <value>localhost:9000</value>
</property>
<property>
  <name>mapred.job.tracker</name>
  <value>localhost:9001</value>
</property>
<property>
  <name>dfs.replication</name>
  <value>1</value>
</property>

But I don't know what it means (I just copied it from the hadoop website). So, how can I crawl using nutch-0.8?

3) Where is ./nutch ndfs?
When I execute this command I get:

Exception in thread "main" java.lang.NoClassDefFoundError: ndfs

I had no problems with version 0.7. I decided to move to 0.8 because of the parse-swf plugin, since I couldn't compile it. Could you please describe how to use the new nutch? Or what do I need in order to compile the parse-swf plugin?

_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
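
The input-directory setup suggested in the reply could be prepared roughly as follows. This is only a minimal sketch: the names myurls and myurllist.txt are the examples given in the reply, while http://example.com/ is a placeholder seed URL assumed here for illustration and should be replaced with the sites you actually want to crawl. The remaining arguments of the crawl command were left out in the reply ("....."), so they are not filled in here either.

```shell
#!/bin/sh
# Sketch: create the seed (input) directory that the crawl command expects.
set -eu

# Directory and file names are the examples from the reply.
SEED_DIR="myurls"
SEED_FILE="$SEED_DIR/myurllist.txt"

mkdir -p "$SEED_DIR"

# One URL per line. http://example.com/ is a placeholder (assumption),
# not a URL from the original thread.
printf '%s\n' 'http://example.com/' > "$SEED_FILE"

# Sanity check: the seed file exists and is not empty.
if [ -s "$SEED_FILE" ]; then
    echo "seed list ready: $SEED_FILE"
fi

# Afterwards, pass the directory (not the file) as the first argument,
# as in the reply:
#   sh ./nutch crawl myurls .....
```

With the directory in place, the crawl command is given something to read, which is what the "No input directories specified" message in the trace above appears to be complaining about.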
