[Nutch Wiki] Update of FrontPage by peterpuwang

2007-10-18 Thread Apache Wiki
Dear Wiki user,

You have subscribed to a wiki page or wiki category on Nutch Wiki for change 
notification.

The following page has been changed by peterpuwang:
http://wiki.apache.org/nutch/FrontPage

--
  == Nutch Administration ==
   * DownloadingNutch
   * HardwareRequirements
+  * [http://www.thechristianlife.com/z/NutchGuideForDummies.htm Tutorial] -- 
Latest step-by-step installation guide for dummies: Nutch 0.9.
   * [http://lucene.apache.org/nutch/tutorial.html Tutorial] -- A Step-by-Step 
guide to getting Nutch up and running.
   * NutchTutorial ''on the wiki''
   * [Nutch - The Java Search Engine] (Builds on the basic tutorials. 
Includes index maintenance scripts)


[Nutch Wiki] Update of WritingPluginExample-0.9 by JasperKamperman

2007-10-18 Thread Apache Wiki

The following page has been changed by JasperKamperman:
http://wiki.apache.org/nutch/WritingPluginExample-0%2e9

The comment on the change is:
Added explanation why the field is added as UN_TOKENIZED

--
  
  == The Example ==
  
- Consider this as a plugin example: We want to be able to recommend specific 
web pages for given search terms.  For this example we'll assume we're indexing 
this site.  As you may have noticed, there are a number of pages that talk 
about plugins.  What we want to do is have it so that if someone searches for 
the term plugin we recommend that they start at the PluginCentral page, but 
we also want to return all the normal hits in the expected ranking.  We'll 
separate the search results page into a section of recommendations and then a 
section with the normal search results.
+ Consider this as a plugin example: We want to be able to recommend specific 
web pages for given search terms.  For this example we'll assume we're indexing 
this site.  As you may have noticed, there are a number of pages that talk 
about plugins.  What we want to do is have it so that if someone searches for 
the term plugins we recommend that they start at the PluginCentral page, but 
we also want to return all the normal hits in the expected ranking.  We'll 
separate the search results page into a section of recommendations and then a 
section with the normal search results.
  
  You go through your site and add meta-tags to pages that list what terms they 
should be recommended for.  The tags look something like this:
  
@@ -177, +177 @@

  
  == The Indexer Extension ==
  
- The following is the code for the Indexing Filter extension.  If the document 
being indexed had a recommended meta tag this extension adds a lucene text 
field to the index called recommended with the content of that meta tag.  
Create a file called RecommendedIndexer.java in the source code directory:
+ The following is the code for the Indexing Filter extension.  If the document 
being indexed has a recommended meta tag, this extension adds a Lucene text 
field called recommended to the index, containing the content of that meta tag. 
Create a file called RecommendedIndexer.java in the source code directory:
  
  {{{
  package org.apache.nutch.parse.recommended;
@@ -242, +242 @@

}  
  }
  }}}
+ 
+ Note that the field is indexed as UN_TOKENIZED so that the recommended tag is 
kept as a single term instead of being cut up by a tokenizer. Change it to 
TOKENIZED if you want to be able to search on parts of the tag, for example to 
put multiple recommended terms in one tag.  
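
The difference between the two settings can be illustrated with a small standalone sketch. It does not use the Lucene or Nutch APIs: the class RecommendedFieldSketch and its methods are hypothetical names made up for this illustration, and the "tokenizer" here simply lowercases and splits on whitespace, which is only an assumption about what a real analyzer might do.

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical illustration only: mimics how many index terms a field value
// produces when it is left untokenized versus tokenized. Not Lucene code.
final class RecommendedFieldSketch {

    // UN_TOKENIZED-like behaviour: the whole tag value becomes one term.
    static Set<String> untokenizedTerms(String tagValue) {
        return Collections.singleton(tagValue);
    }

    // TOKENIZED-like behaviour (assumed: lowercase, split on whitespace).
    static Set<String> tokenizedTerms(String tagValue) {
        Set<String> terms = new LinkedHashSet<>();
        for (String t : tagValue.trim().toLowerCase(Locale.ROOT).split("\\s+")) {
            if (!t.isEmpty()) {
                terms.add(t);
            }
        }
        return terms;
    }

    // A single-term query matches only if that exact term is in the index.
    static boolean matches(Set<String> indexedTerms, String queryTerm) {
        return indexedTerms.contains(queryTerm);
    }

    public static void main(String[] args) {
        String tag = "plugins extensions"; // hypothetical meta tag content

        Set<String> whole = untokenizedTerms(tag);
        Set<String> parts = tokenizedTerms(tag);

        System.out.println("untokenized, query 'plugins': " + matches(whole, "plugins"));
        System.out.println("untokenized, query 'plugins extensions': " + matches(whole, tag));
        System.out.println("tokenized, query 'plugins': " + matches(parts, "plugins"));
    }
}
```

In this sketch a tag holding two terms is only found by the full string when untokenized, while the tokenized variant is found by either word. That is the trade-off the note above describes.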
  
  == The QueryFilter ==