Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Nutch Wiki" for change notification.

The following page has been changed by tyrellperera:
http://wiki.apache.org/nutch/Nutch_-_The_Java_Search_Engine

------------------------------------------------------------------------------
  The Nutch search engine consists, very roughly, of three components:
  1. The Crawler, which discovers and retrieves web pages
+ 2. The "WebDB", a custom database that stores known URLs and fetched page contents
+ 3. The "Indexer", which dissects pages and builds keyword-based indexes from them

  After the initial creation of an index, it is usual to update it periodically in order to keep it up to date. We will look into the details of index maintenance in the parts that follow.
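
To make the division of labour between the three components concrete, here is a minimal sketch in Java. It is a toy model only: the class and method names (`MiniNutchSketch`, `Page`, `WebDb`, `crawl`, `buildIndex`, `sampleWeb`) and the example URLs are hypothetical and are not Nutch's real classes or API. The "web" is an in-memory map, so nothing is fetched over the network.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical toy model of the three components; NOT Nutch's actual classes.
class MiniNutchSketch {

    // Stand-in for a web page: its text plus its outgoing links.
    static final class Page {
        final String text;
        final List<String> links;
        Page(String text, List<String> links) {
            this.text = text;
            this.links = links;
        }
    }

    // "WebDB": stores known URLs and the contents of fetched pages.
    static final class WebDb {
        final Set<String> known = new TreeSet<>();                  // every URL seen so far
        final Map<String, String> contents = new LinkedHashMap<>(); // URL -> fetched text
    }

    // "Crawler": starts from a seed URL, retrieves pages, and discovers new URLs.
    static WebDb crawl(Map<String, Page> web, String seed) {
        WebDb db = new WebDb();
        Deque<String> queue = new ArrayDeque<>();
        db.known.add(seed);
        queue.add(seed);
        while (!queue.isEmpty()) {
            String url = queue.poll();
            Page page = web.get(url);
            if (page == null) {
                continue; // URL is known but could not be fetched
            }
            db.contents.put(url, page.text);
            for (String link : page.links) {
                if (db.known.add(link)) { // true only for URLs not seen before
                    queue.add(link);
                }
            }
        }
        return db;
    }

    // "Indexer": dissects fetched pages into keywords and maps keyword -> URLs.
    static Map<String, Set<String>> buildIndex(WebDb db) {
        Map<String, Set<String>> index = new TreeMap<>();
        for (Map.Entry<String, String> entry : db.contents.entrySet()) {
            String[] words = entry.getValue().toLowerCase(Locale.ROOT).split("\\W+");
            for (String word : words) {
                if (!word.isEmpty()) {
                    index.computeIfAbsent(word, k -> new TreeSet<>()).add(entry.getKey());
                }
            }
        }
        return index;
    }

    // Made-up in-memory "web" used purely for illustration.
    static Map<String, Page> sampleWeb() {
        Map<String, Page> web = new LinkedHashMap<>();
        web.put("http://a.example/",
                new Page("Nutch is a Java search engine", List.of("http://b.example/")));
        web.put("http://b.example/",
                new Page("A crawler feeds the search index",
                        List.of("http://a.example/", "http://c.example/")));
        return web;
    }

    public static void main(String[] args) {
        WebDb db = crawl(sampleWeb(), "http://a.example/");
        Map<String, Set<String>> index = buildIndex(db);
        System.out.println(index.get("search")); // prints [http://a.example/, http://b.example/]
    }
}
```

In this sketch the crawler only writes to the WebDB and the indexer only reads from it, which mirrors the rough separation described above. A periodic update would amount to re-running the crawl and rebuilding (or merging into) the index.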
