Hi Feng,

With this in mind, I have created a wiki page for the crawl script (bin/crawl) [0]. Please feel free to edit any of the wiki pages and update the documentation.

[0] http://wiki.apache.org/nutch/bin/crawl

On Thu, Mar 21, 2013 at 1:18 AM, feng lu <[email protected]> wrote:

> <<
> Second, for a user running Nutch on a single node or in local mode, the
> default size of topN (50,000) makes the crawl run for a long time. Can we
> make the topN parameter configurable through the script?
> >>
>
> Maybe I agree with Tejas that we should let users modify the parameters
> below to suit their needs. But we can add some detailed information to the
> bin/crawl wiki to tell users how to modify these parameters and what they
> mean.
>
>
> On Thu, Mar 21, 2013 at 3:01 AM, kiran chitturi <[email protected]> wrote:
>
>> Hi!
>>
>> I want to update the Nutch tutorials in the wiki with the crawl script
>> (./bin/crawl). Because the crawl command appears in the tutorials, users
>> run it and hit issues, and we end up suggesting that they use the crawl
>> script instead of the command.
>>
>> Can we make it uniform all over the wiki that the crawl command is
>> deprecated and that the crawl script is recommended?
>>
>> Second, for a user running Nutch on a single node or in local mode, the
>> default size of topN (50,000) makes the crawl run for a long time. Can we
>> make the topN parameter configurable through the script?
>>
>> Thank you,
>>
>> --
>> Kiran Chitturi
>>
>> <http://www.linkedin.com/in/kiranchitturi>
>>
>
> --
> Don't Grow Old, Grow Up... :-)
>

--
Kiran Chitturi

<http://www.linkedin.com/in/kiranchitturi>

