Re: Nutch/Solr question
Solr is only a search and indexing server; it does not crawl. Nutch does the crawling and page parsing, and can index into Lucene or into a Solr server. Nutch is a fairly big beast, so if you just need to index one site, or a small set of them, you may have an easier time with Droids.

Otis

----- Original Message -----
From: Bartosz Gadzimski bartek...@o2.pl
To: nutch-user@lucene.apache.org
Sent: Wed, November 4, 2009 10:41:14 AM
Subject: Nutch/Solr question
Nutch/Solr question
Hi,

I want to build site search for a few of my own (and friends') websites, but without access to their database data. So I will crawl them with Nutch, and then I have two options:

1. index the data into Solr
2. leave it in the Nutch index

I need help weighing the advantages and disadvantages of Solr versus Nutch for searching, because I don't know Solr and it's hard to get the big picture. Each site is quite small, so Solr can hold it with no problems.

In Solr I probably can't use faceted search, range queries, etc., because I don't have the necessary data in the schema?

In Nutch I can run one search server and use site:domain to limit results (like Google site search), or use multiple indexes (as mentioned on the mailing list), but what about Solr?

Any input is highly appreciated.

Thanks,
Bartosz
Re: Nutch/Solr question
Hi,

I have the same problem: I am using Nutch but thinking about using it with Solr. I have configured Solr and am now trying to configure Nutch to work with it. Like you, I have no previous experience with Solr, so I used a bunch of tutorials.

I run both Windows XP and Ubuntu Linux on my system, and so far I have only configured Nutch/Solr on XP. I also run a server with Ubuntu, so I might want to configure Solr/Nutch for Ubuntu as well. I only crawl about 10 websites (almost like you) and intend to use the results as a search engine for friends and colleagues. Like you, I want to know what works better: Nutch alone or in combination with Solr.

These links really helped me out:

http://wiki.apache.org/nutch/GettingNutchRunningWithWindows
http://wiki.apache.org/nutch/GettingNutchRunningWithUbuntu
http://wiki.apache.org/nutch/RunningNutchAndSolr

We might be able to help each other out if you have more questions or suggestions.
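
On the site:domain question raised in this thread, here is a minimal sketch of how a per-site restriction might look on the Solr side. It only builds a request URL using Solr's standard `q`, `fq`, `rows`, `wt`, `facet`, and `facet.field` parameters. The endpoint URL, the `build_site_query` helper, and the presence of a `site` field in the index are all assumptions for illustration; check the schema.xml you actually deploy before relying on any field name.

```python
from urllib.parse import urlencode

# Hypothetical Solr endpoint; adjust host, port, and path to your setup.
SOLR_SELECT_URL = "http://localhost:8983/solr/select"


def build_site_query(terms, site=None, rows=10, facet_on_site=False):
    """Build a Solr select URL, optionally restricted to one site.

    Assumes the index has a `site` field holding each page's host name
    (an assumption here, not something confirmed in this thread).
    """
    params = [
        ("q", terms),
        ("rows", str(rows)),
        ("wt", "json"),
    ]
    if site is not None:
        # A filter query narrows the result set without affecting scoring,
        # which is roughly what site:domain does in a Nutch search.
        params.append(("fq", f'site:"{site}"'))
    if facet_on_site:
        # Ask Solr for per-site hit counts across the whole index.
        params += [("facet", "true"), ("facet.field", "site")]
    return SOLR_SELECT_URL + "?" + urlencode(params)


if __name__ == "__main__":
    # Search a single site.
    print(build_site_query("open source search", site="example.com"))
    # Search all sites and request hit counts per site.
    print(build_site_query("open source search", facet_on_site=True))
```

With this approach a single Solr index could serve several small sites, with each site's search box passing its own host name as the filter.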