Re: Nutch/Solr question

2009-11-11 Thread Otis Gospodnetic
Solr is just a search and indexing server.  It doesn't do crawling.  Nutch does 
the crawling and page parsing, and can index into Lucene or into a Solr server.

Nutch is a biggish beast, and if you just need to index a site or even a small 
set of them, you may have an easier time with Droids.

 Otis






Nutch/Solr question

2009-11-04 Thread Bartosz Gadzimski

Hi,

I want to set up site search for a few websites (mine and friends'), but 
without access to their database data. So I will crawl them with Nutch, 
and then I have 2 options:

1. index the data into Solr
2. leave it in the Nutch index

I need help understanding the advantages/disadvantages of searching with 
Solr vs. Nutch, because I don't know Solr (it's hard to get the big picture).


Each site is quite small, so Solr can hold it with no problems.
In Solr I probably can't use faceted search, range queries, etc., 
because I don't have the necessary data in the schema?
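
On the faceting point: Solr facets on indexed fields, so a crawl-only index 
can still be faceted on whatever fields it does contain. A minimal sketch of 
building such a request, assuming a Solr server at a hypothetical local URL 
and an indexed field named site (both are assumptions, not details given in 
this thread; check the schema actually in use):

```python
from urllib.parse import urlencode

# Hypothetical Solr endpoint; adjust host, port, and path for your setup.
SOLR_SELECT_URL = "http://localhost:8983/solr/select"


def facet_by_field_url(query, facet_field="site", base_url=SOLR_SELECT_URL):
    """Build a select URL that asks Solr for hit counts per value of one field."""
    params = [
        ("q", query),
        ("rows", "10"),
        ("facet", "true"),
        ("facet.field", facet_field),  # assumed to be an indexed field
        ("facet.mincount", "1"),       # leave out values with zero hits
        ("wt", "json"),
    ]
    return base_url + "?" + urlencode(params)


if __name__ == "__main__":
    print(facet_by_field_url("lucene"))
```

The function only builds the URL; fetching it and reading the facet counts 
from the response is left out to keep the sketch independent of a running 
server.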


In Nutch I can have one search server and use site:domain to limit 
results (like Google site search), or use multiple indexes (as mentioned 
on the mailing list), but what about Solr?
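
For comparison, a per-site restriction in Solr is commonly expressed as a 
filter query (the fq parameter) against a single shared index. A minimal 
sketch, assuming the index has a field named site holding the domain and a 
hypothetical local Solr URL (neither is confirmed in this thread):

```python
from urllib.parse import urlencode

# Hypothetical Solr endpoint; adjust host, port, and path for your setup.
SOLR_SELECT_URL = "http://localhost:8983/solr/select"


def site_search_url(query, site, base_url=SOLR_SELECT_URL):
    """Build a select URL that limits results for `query` to one site."""
    params = [
        ("q", query),
        # fq narrows the result set without affecting relevance scoring.
        # The field name "site" is an assumption about the schema.
        ("fq", 'site:"%s"' % site),
        ("wt", "json"),
    ]
    return base_url + "?" + urlencode(params)


if __name__ == "__main__":
    print(site_search_url("contact form", "example.com"))
```

With this approach every site shares one index and one server, and the 
search box on each website passes its own domain as the site argument.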


Any input is highly appreciated.

Thanks,
Bartosz


Re: Nutch/Solr question

2009-11-04 Thread Webmaster

Hi,

I have the same problem: I am using Nutch but thinking about using it 
with Solr.
I have configured Solr completely and am now trying to configure Nutch to 
work with it.


Like you, I have no previous experience with Solr, so I used a bunch of 
tutorials.
I run both XP and Ubuntu Linux on my system, and so far I have only 
configured Nutch/Solr on XP.
I also run a server with Ubuntu, so I might want to configure 
Solr/Nutch on Ubuntu as well.
I only crawl about 10 websites (almost like you) and intend to use the 
results as a search engine for friends and colleagues.
Like you, I want to know what works better: Nutch alone or in combination 
with Solr.


These links really helped me out:
http://wiki.apache.org/nutch/GettingNutchRunningWithWindows
http://wiki.apache.org/nutch/GettingNutchRunningWithUbuntu
http://wiki.apache.org/nutch/RunningNutchAndSolr

We might be able to help each other out if you have more 
questions/suggestions.




