Michael Wechner wrote:
Sandy Polanski wrote:

I second that. Is there anyone that can give us some tips on how to use the OpenSearchServlet? I'd really like to see a standalone Java program that would allow me to see the results in RSS format that I can call from the "./bin/nutch" executable.

I guess bin/nutch, or some other program (maybe based on NutchBean), should return an RSS feed which can then be pulled and parsed by the OpenSearchServlet. The question is whether something like this already exists within Nutch and, if not, whether somebody is writing it (for instance myself ;-) but I would rather wait in case somebody can answer the "exists" question ...


Folks,

As the name suggests, the servlet needs a servlet container to run. If you build the standard WAR, it includes the OpenSearchServlet (among other things), available under <contextPath>/opensearch. Deploy this WAR to your favorite servlet container, e.g. Tomcat, and you are ready to go.

This is a REST-style service: you send it standard HTTP GET requests with the parameters in the URL, and you get an XML document (an RSS 2.0 feed) back as the response.

Example request:

   http://localhost:8081/nutch/opensearch?query=cnn
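
For those who asked for a standalone Java program, here is a minimal sketch of a client that issues the request above and prints the raw XML. The class name OpenSearchClient and its helper methods are made up for illustration, the context URL is the one from my local setup, and only the "query" parameter shown above is used. It still needs the WAR deployed in a running container; it does not go through bin/nutch.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of Nutch.
class OpenSearchClient {

    // Build the request URL: <contextUrl>/opensearch?query=<url-encoded query>.
    static String buildUrl(String contextUrl, String query) {
        return contextUrl + "/opensearch?query="
                + URLEncoder.encode(query, StandardCharsets.UTF_8);
    }

    // Send a plain HTTP GET and return the response body as a string.
    static String fetch(String url) throws IOException {
        HttpURLConnection conn =
                (HttpURLConnection) URI.create(url).toURL().openConnection();
        conn.setRequestMethod("GET");
        try (InputStream in = conn.getInputStream()) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumed context URL; adjust host, port and context path to your deployment.
        String contextUrl = "http://localhost:8081/nutch";
        String query = args.length > 0 ? args[0] : "cnn";
        System.out.println(fetch(buildUrl(contextUrl, query)));
    }
}
```

Run it with the query as the first argument, e.g. "java OpenSearchClient cnn".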

Example response:

<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:nutch="http://www.nutch.org/opensearchrss/1.0/"
xmlns:opensearch="http://a9.com/-/spec/opensearchrss/1.0/" version="2.0">
<channel>
<title>Nutch: cnn</title>
<description>Nutch search results for query: cnn</description>
<link>http://localhost:8081/nutch/search.jsp?query=cnn&amp;start=0&amp;hitsPerDup=2&amp;hitsPerPage=10</link>
<opensearch:totalResults>1</opensearch:totalResults>
<opensearch:startIndex>0</opensearch:startIndex>
<opensearch:itemsPerPage>10</opensearch:itemsPerPage>

<nutch:query>cnn</nutch:query>
<item>
<title>CNN.com - Breaking News, U.S., World, Weather, Entertainment &amp; Video News</title>
<description>&lt;span class="ellipsis"&gt; ... &lt;/span&gt;the world Instant Access &lt;span class="highlight"&gt;CNN&lt;/span&gt; International Live newscasts and&lt;span class="ellipsis"&gt; ... &lt;/span&gt;Pipeline Overnight Live feeds from &lt;span class="highlight"&gt;CNN&lt;/span&gt; and its global&lt;span class="ellipsis"&gt; ... &lt;/span&gt;</description>

<link>http://www.cnn.com/</link>
<nutch:site>www.cnn.com</nutch:site>
<nutch:cache>http://localhost:8081/nutch/cached.jsp?idx=0&amp;id=0</nutch:cache>
<nutch:explain>http://localhost:8081/nutch/explain.jsp?idx=0&amp;id=0&amp;query=cnn&amp;lang=null</nutch:explain>
<nutch:segment>20060817135307</nutch:segment>
<nutch:digest>6e5e1ede359a88f11fc564cf22f79305</nutch:digest>
<nutch:boost>2.5735338</nutch:boost>

</item>
</channel>
</rss>
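
A response like the one above can be read with the XML parser that ships with the JDK. The following is only a sketch: the class OpenSearchResponseParser and its methods are hypothetical names, not part of Nutch, and it assumes the opensearch namespace URI is exactly the one shown in the example response. It pulls out the total hit count and the link of each item.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

// Hypothetical helper class, not part of Nutch.
class OpenSearchResponseParser {

    // Namespace URI taken from the example response above.
    static final String OPENSEARCH_NS = "http://a9.com/-/spec/opensearchrss/1.0/";

    // Parse the XML text into a namespace-aware DOM document.
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse response", e);
        }
    }

    // Read the value of <opensearch:totalResults>.
    static int totalResults(Document doc) {
        NodeList nodes = doc.getElementsByTagNameNS(OPENSEARCH_NS, "totalResults");
        return Integer.parseInt(nodes.item(0).getTextContent().trim());
    }

    // Collect the <link> of every <item>, skipping the channel-level <link>.
    static List<String> itemLinks(Document doc) {
        List<String> links = new ArrayList<>();
        NodeList items = doc.getElementsByTagName("item");
        for (int i = 0; i < items.getLength(); i++) {
            Element item = (Element) items.item(i);
            links.add(item.getElementsByTagName("link").item(0).getTextContent().trim());
        }
        return links;
    }

    public static void main(String[] args) throws IOException {
        // Reads the response XML from standard input.
        String xml = new String(System.in.readAllBytes(), StandardCharsets.UTF_8);
        Document doc = parse(xml);
        System.out.println("Total results: " + totalResults(doc));
        for (String link : itemLinks(doc)) {
            System.out.println(link);
        }
    }
}
```

Save the response to a file and feed it in on standard input, e.g. "java OpenSearchResponseParser < response.xml". The nutch:* elements could be read the same way with getElementsByTagNameNS and the nutch namespace URI.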



--
Best regards,
Andrzej Bialecki     <><
___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com
