Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Nutch Wiki" for change 
notification.

The following page has been changed by MichaelStack:
http://wiki.apache.org/nutch/FAQ

------------------------------------------------------------------------------
  Anchor text makes a large contribution to document score (You can see the 
anchor text for a page by browsing to "explain" then editing the URL to put 
"anchors.jsp" in place of "explain.jsp").
  
  ==== What is the RSS symbol in search results all about? ====
- Clicking on the RSS symbol sends the current query back to Nutch to a servlet 
named OpenSearchServlet that redoes the search returning the results instead 
formatted as RSS (XML).  The RSS format is based on 
[http://a9.com/-/spec/opensearchrss/1.0/ OpenSearch RSS 1.0] from 
[http://www.a9.com a9.com] (Also see [href="http://opensearch.a9.com/ 
OpenSearch]). Nutch extensions add to the OpenSearch RSS the original query, 
navigation information, and any extra fields that available in the search 
result such as Nutch boost, segment name, etc. 
+ Clicking on the RSS symbol sends the current query back to Nutch to a servlet 
named OpenSearchServlet.  OpenSearchServlet reruns the query and returns the 
results formatted instead as RSS (XML).  The RSS format is based on 
[http://a9.com/-/spec/opensearchrss/1.0/ OpenSearch RSS 1.0] from 
[http://www.a9.com a9.com]: "OpenSearch RSS 1.0 is an extension to the RSS 2.0 
standard, conforming to the guidelines for RSS extensibility as outlined by the 
RSS 2.0 specification" (See also [http://opensearch.a9.com/ OpenSearch]). Nutch 
in turn makes extensions to OpenSearch.  The Nutch extensions are identified by 
the 'nutch' namespace prefix and add navigation information, the 
original query, and all fields that are available at search result time, 
including the Nutch page boost, the name of the segment the page resides in, 
etc. 
  
- Results as RSS (XML) rather than HTML are easier for programmatic clients to 
parse: such clients will query against OpenSearchServlet rather than 
search.jsp.  Results as XML can also be transformed using XSL stylesheets, the 
likely direction of UI development in nutch going by mailing list posts.
+ Results as RSS (XML) rather than HTML are easier for programmatic clients to 
parse: such clients will query against OpenSearchServlet rather than 
search.jsp.  Results as XML can also be transformed using XSL stylesheets, the 
likely direction of UI development going forward.
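
As a rough illustration of the programmatic-client case described above, here is a minimal Python sketch that parses an OpenSearch RSS response and collects the fields carrying the 'nutch' namespace prefix. The sample document, the namespace URIs, and the element names (nutch:query, nutch:segment, nutch:boost) are assumptions made for illustration, and parse_results is a hypothetical helper; check a real response from your own OpenSearchServlet for the actual declarations.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URIs; verify against the xmlns declarations
# in a real response from your installation.
NS = {
    "opensearch": "http://a9.com/-/spec/opensearchrss/1.0/",
    "nutch": "http://www.nutch.org/opensearchrss/1.0/",
}

# Hand-written sample response (illustrative only, not real server output).
SAMPLE = """<rss version="2.0"
     xmlns:opensearch="http://a9.com/-/spec/opensearchrss/1.0/"
     xmlns:nutch="http://www.nutch.org/opensearchrss/1.0/">
  <channel>
    <title>Nutch: apache</title>
    <opensearch:totalResults>1</opensearch:totalResults>
    <nutch:query>apache</nutch:query>
    <item>
      <title>Example page</title>
      <link>http://example.com/</link>
      <nutch:segment>20060101000000</nutch:segment>
      <nutch:boost>1.0</nutch:boost>
    </item>
  </channel>
</rss>
"""


def parse_results(xml_text):
    """Return the query, total hit count, and per-item fields from RSS text."""
    channel = ET.fromstring(xml_text).find("channel")
    nutch_prefix = "{" + NS["nutch"] + "}"
    results = []
    for item in channel.findall("item"):
        # Keep every child element in the 'nutch' namespace as an extra field.
        extras = {
            child.tag[len(nutch_prefix):]: child.text
            for child in item
            if child.tag.startswith(nutch_prefix)
        }
        results.append({
            "title": item.findtext("title"),
            "link": item.findtext("link"),
            "nutch": extras,
        })
    return {
        "query": channel.findtext("nutch:query", namespaces=NS),
        "total": int(channel.findtext("opensearch:totalResults",
                                      default="0", namespaces=NS)),
        "results": results,
    }


if __name__ == "__main__":
    print(parse_results(SAMPLE))
```

A live client would fetch the XML from the servlet (for example with urllib.request.urlopen) and pass the response text to parse_results; the servlet's URL path and query parameter depend on the deployment and are not shown here.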
  
  === Crawling ===
  
