Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Nutch Wiki" for change notification.

The following page has been changed by JakeVanderdray:
http://wiki.apache.org/nutch/NutchTutorial

------------------------------------------------------------------------------
  Once things are configured, running the crawl is easy. Just use the crawl command. Its options include:

- * -dir dir names the directory to put the crawl in.
+ * '''-dir''' ''dir'' names the directory to put the crawl in.
- * -threads threads determines the number of threads that will fetch in parallel.
+ * '''-threads''' ''threads'' determines the number of threads that will fetch in parallel.
- * -depth depth indicates the link depth from the root page that should be crawled.
+ * '''-depth''' ''depth'' indicates the link depth from the root page that should be crawled.
- * -topN N determines the maximum number of pages that will be retrieved at each level up to the depth.
+ * '''-topN''' ''N'' determines the maximum number of pages that will be retrieved at each level up to the depth.

  For example, a typical call might be:
