Hi nutch-dev,

I was looking at [0] and realized that, with the massive number of Hadoop
setup tutorials available on the internet, we need not repeat the same
material on the Nutch wiki page and can instead assume that the user has
already set up Hadoop. For convenience, we could direct users to the Hadoop
wiki page, which has the Hadoop setup details.
In addition, I propose the following:

- Section "Downloading Hadoop and Nutch": Remove the Hadoop portions and
keep the Nutch portions.
- Section "Setting Up The Deployment Architecture" must be removed.
- Sections "Deploy Nutch to Single Machine" and "Deploy Nutch to Multiple
Machines" can be merged together.
- Sections "Performing a Nutch Crawl", "Testing the Crawl" and "Performing a
Search" must be merged, and their contents must be updated.
- Sections "Rsyncing Code to Slaves" and "Updates" can be completely
removed.

Any comments?

[0] : http://wiki.apache.org/nutch/NutchHadoopTutorial

Thanks,
Tejas
