On Feb 23, 2007, at 10:13 AM, Gal Nitzan wrote:
> Hi,
>
> Since I ran into the SOLR project the other day, I was wondering whether
> SOLR could be Nutch's SE...


We've been using it since I realized we could not get the Lucene
queries we wanted out of the Nutch OpenSearch SE. Solr also
integrates a lot better with back-end development, as it can output
JSON, XML and Python. It's even more valuable with Sami's SolrIndexer
patch that he has on his blog -- Nutch indexing now goes straight to
Solr. It's very fast and so far robust -- over the past few days I've
lazily crawled 600K pages on a single CPU (with stored content!)
after integrating Sami's stuff, with no obvious problems yet.
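
As a rough illustration of the output formats mentioned above, here is a
minimal Python sketch that builds a Solr select URL using the `wt`
(response writer) parameter. The host, port and path are assumptions for a
default local install, and `build_select_url` is a hypothetical helper, not
part of Solr or Nutch.

```python
from urllib.parse import urlencode

# Assumed location of a local Solr instance; adjust for your own setup.
SOLR_SELECT = "http://localhost:8983/solr/select"

def build_select_url(query, wt="json", rows=10, base=SOLR_SELECT):
    """Hypothetical helper: build a Solr select URL for a response format."""
    # Only the three formats named in the message are accepted here.
    if wt not in ("json", "xml", "python"):
        raise ValueError("unsupported response format: %s" % wt)
    # A list of pairs keeps the parameter order stable in the URL.
    params = [("q", query), ("wt", wt), ("rows", rows)]
    return base + "?" + urlencode(params)

if __name__ == "__main__":
    # The field name "content" is an assumption about the index schema.
    print(build_select_url("content:nutch"))
```

Switching `wt` to `xml` or `python` changes only the response format; the
query itself stays the same.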

From what I can tell, you lose some of the advanced Nutch scoring
features. We get the document boost (Sami, I owe you a patch) but
that's about it. The Nutch SE is also a great "Google in a Box" setup
for people who want that. For that reason I am not so sure Solr
should replace Nutch's SE. Solr is more useful for people who want
to do something programmatically or need queries more complex than
"AND" with the Nutch SE. There's no search front end in Solr other
than the admin interface.
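
To give an idea of what "more complex than AND" can look like, here is a
small Python sketch that assembles a query string in standard Lucene query
syntax (grouping, OR, NOT and a term boost). The field names `title`,
`content` and `host` are assumptions about the schema, and both helpers are
hypothetical.

```python
def any_of(field, terms):
    """Hypothetical helper: OR several terms together on one field."""
    return "(" + " OR ".join("%s:%s" % (field, t) for t in terms) + ")"

def build_query():
    """Combine a boosted clause, an OR group and an exclusion."""
    # Field names below are assumed; use whatever your schema defines.
    must = "title:nutch^2"                       # boosted title match
    either = any_of("content", ["solr", "lucene"])
    exclude = "host:example.com"                 # clause to filter out
    return "%s AND %s NOT %s" % (must, either, exclude)

if __name__ == "__main__":
    print(build_query())
```

The resulting string can be passed as the `q` parameter of a Solr select
request.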

Would love to see Sami's patch in trunk as an indexing plugin, though.

-Brian


_______________________________________________
Nutch-developers mailing list
Nutch-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nutch-developers
