Hello, Gal. You wrote on 22 February 2006 at 14:11:37:
> :) a bit misleading....
> First: Hadoop is the evolution of the "Nutch Distributed File System".
> It is based on Google's file system. It enables one to keep all data in a
> distributed file system, which is very suitable for Nutch.
> Where you see "bin/nutch NDFS -ls", write instead "bin/hadoop dfs -ls".
> Now to create the seeds:
> Create the urls.txt file in a folder called seeds, i.e. seeds/urls.txt
> bin/hadoop dfs -put seeds seeds
> This will copy the seeds folder into the Hadoop file system,
> and now:
> bin/nutch crawl seeds -dir crawled -depth 3 >& crawl.log
> Happy crawling.
> Gal.
>
> On Wed, 2006-02-22 at 01:05 -0800, Foong Yie wrote:
>> Matt,
>>
>> as the tutorial states:
>>
>> bin/nutch crawl urls -dir crawled -depth 3 >& crawl.log
>>
>> The urls file is a .txt, right? I created it and put it inside c:/nutch-0.7.1
>>
>> Stephanie

Thanks a lot!!! I'll try it.
One more thing: do I have to download and compile the Hadoop sources?

-- 
Best regards,
Nutch    mailto:[EMAIL PROTECTED]
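For reference, a minimal end-to-end sketch of the steps Gal describes above, assuming the bin/hadoop and bin/nutch scripts are run from the Nutch install directory (the seed URL below is only a placeholder):

  # Create the seed list locally: one URL per line in seeds/urls.txt
  # (placeholder URL, replace with your own)
  mkdir -p seeds
  echo "http://lucene.apache.org/nutch/" > seeds/urls.txt

  # Copy the seeds folder into the Hadoop distributed file system
  bin/hadoop dfs -put seeds seeds

  # List DFS contents (this replaces the old "bin/nutch NDFS -ls" form)
  bin/hadoop dfs -ls

  # Crawl the seeds to depth 3, sending output to crawl.log
  bin/nutch crawl seeds -dir crawled -depth 3 >& crawl.log

Note that ">&" is csh-style redirection; in bash the equivalent is "> crawl.log 2>&1".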
