Re: how to crawl when Solr is search engine?

Ian Holsman Thu, 07 Jun 2007 00:57:13 -0700

Manoharam Reddy wrote:

Thanks for your quick response.


This brings me to another question. As far as I know Nutch can take
care of crawling as well as indexing. Then why go through the hassle
of crawling through Nutch and integrating it into Solr?

I found Solr's caching and maintenance easier to use than Nutch's. But
that's just me.


Another question I have, Solr provides the search results in XML
format, any ready made tools to convert them directly to web pages for
visitors to see?

Yep, it's called XSLT. Most modern browsers can do the transform on the
client side. Otherwise there are some server-side tools (Cocoon, I think,
does this) to do the transform on the server before sending it out.
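
For anyone who would rather not write a stylesheet, the same server-side
idea can be sketched in plain Python. To be clear, this is not XSLT; it is
a hand-rolled transform of the XML response into an HTML list, using only
the standard library. The field names "title" and "url" are assumptions
about the schema, not anything Solr requires, and multi-valued (arr)
fields are not handled.

```python
import html
import xml.etree.ElementTree as ET


def render_results(solr_xml):
    """Turn a Solr XML search response into a simple HTML list.

    Assumes each <doc> carries single-valued fields named "title" and
    "url" (hypothetical schema); anything missing gets a placeholder.
    """
    root = ET.fromstring(solr_xml)
    result = root.find("result")
    items = []
    if result is not None:
        for doc in result.findall("doc"):
            # Each child looks like <str name="title">...</str>
            fields = {child.get("name"): (child.text or "") for child in doc}
            title = html.escape(fields.get("title") or "(untitled)")
            url = html.escape(fields.get("url") or "#", quote=True)
            items.append('<li><a href="%s">%s</a></li>' % (url, title))
    return "<ul>\n%s\n</ul>" % "\n".join(items)


if __name__ == "__main__":
    sample = (
        '<response><result name="response" numFound="1" start="0">'
        '<doc><str name="title">Intranet home</str>'
        '<str name="url">http://example.com/</str></doc>'
        '</result></response>'
    )
    print(render_results(sample))
```

Escaping the field values before putting them into the page matters here,
since crawled content can contain markup of its own.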


--Ian


On 6/7/07, Ian Holsman <[EMAIL PROTECTED]> wrote:

Hi Manoharam.

we use Nutch to do the crawl, and have used Sami's patch of Nutch
(http://blog.foofactory.fi/2007/02/online-indexing-integrating-nutch-with.html)
to have it integrate with Solr. It works quite well for our needs.

If you are concerned with the speed, Solr also has a CSV upload
facility, which you might be able to use to upload the data into Solr
that way, but we haven't found the HTTP POST speed to be an issue for
us.
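
As a rough illustration of the HTTP POST route, here is a minimal Python
sketch that batches several documents into one <add> message, so a single
request carries many documents. The URL (localhost:8983) and the field
names are assumptions for the example; adjust them to your own install
and schema.

```python
import urllib.request
import xml.etree.ElementTree as ET

# Assumed location of the update handler; change to match your setup.
SOLR_UPDATE_URL = "http://localhost:8983/solr/update"


def build_add_message(docs):
    """Build one <add> message containing a <doc> per dictionary."""
    add = ET.Element("add")
    for doc in docs:
        doc_el = ET.SubElement(add, "doc")
        for name, value in doc.items():
            field = ET.SubElement(doc_el, "field", name=name)
            field.text = str(value)
    # encoding="unicode" returns a str; special characters are escaped.
    return ET.tostring(add, encoding="unicode")


def post_to_solr(xml_body, url=SOLR_UPDATE_URL):
    """POST an XML update message (e.g. <add>...</add> or <commit/>)."""
    request = urllib.request.Request(
        url,
        data=xml_body.encode("utf-8"),
        headers={"Content-Type": "text/xml; charset=utf-8"},
    )
    with urllib.request.urlopen(request) as response:
        return response.read()


if __name__ == "__main__":
    # Hypothetical documents; "id" and "title" must exist in your schema.
    batch = [
        {"id": "1", "title": "First page"},
        {"id": "2", "title": "Second page"},
    ]
    print(build_add_message(batch))
    # With a Solr instance running, send the batch and then commit:
    # post_to_solr(build_add_message(batch))
    # post_to_solr("<commit/>")
```

Sending documents in batches and committing once at the end keeps the
number of requests, and so the socket overhead, fairly small.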


Regards
Ian


Manoharam Reddy wrote:
> I have just begun using Solr. I see that we have to insert documents
> by posting XMLs to solr/update
>
> I would like to know how Solr is used as a search engine in
> enterprises. How do you do the crawling of your intranet and passing
> the information as XML to solr/update. Isn't this going to be slow? To
> put all content in the index via a HTTP POST request requiring network
> sockets to be opened?
>
> Isn't there any direct way to to do the same thing without resorting
> to HTTP?
>
