Hello Gal. You wrote on 22 February 2006 at 14:11:37:
> :) a bit misleading....
> First: Hadoop is the evolution of the "Nutch Distributed File System".
> It is based on Google's file system. It enables one to keep all data in a
> distributed file system, which suits Nutch very well.
> Where you see bin/nutch ndfs -ls, write bin/hadoop dfs -ls instead.
> Now to create the seeds:
> create the urls.txt file in a folder called seeds, i.e. seeds/urls.txt
> bin/hadoop dfs -put seeds seeds
> this will copy the seeds folder into the Hadoop file system
> and now
> bin/nutch crawl seeds -dir crawled -depth 3 >& crawl.log
> Happy crawling.
> Gal.
>
> On Wed, 2006-02-22 at 01:05 -0800, Foong Yie wrote:
>> matt
>>
>> as the tutorial stated ..
>>
>> bin/nutch crawl urls -dir crawled -depth 3 >& crawl.log
>>
>> the urls file is a .txt, right? I created it and put it inside c:/nutch-0.7.1
>>
>> Stephanie

Thanks a lot! I'll try it.
One more thing: do I have to download and compile the Hadoop sources?

-- 
Best regards,
Nutch mailto:[EMAIL PROTECTED]
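
P.S. Just to check that I have followed the steps correctly, this is roughly what I plan to run (only a sketch; I'm assuming a 0.8-dev nightly build where bin/hadoop is already included, and the URL and the "seeds" folder name below are just placeholders for my own seed list):

# create the seed list locally; "seeds" is just an example folder name
mkdir seeds
echo "http://lucene.apache.org/nutch/" > seeds/urls.txt

# copy the seeds folder into the Hadoop file system
bin/hadoop dfs -put seeds seeds

# check that it arrived (replaces the old bin/nutch ndfs -ls)
bin/hadoop dfs -ls

# run the crawl against the seeds folder in DFS, logging to crawl.log
bin/nutch crawl seeds -dir crawled -depth 3 >& crawl.log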
