I posted this on the Lucene list a week ago and haven't heard anything, so please don't give me the cross-post slap ;)
I am successfully using Lucene in our application to index 12 different types of objects stored in a database, along with their relationships to each other, to provide some nice search functionality for our website. We build lots of Lucene queries programmatically to filter on categories, regions, zip codes, scoring, long/lats, and so on.

My problem is that we also have a lot of content that is not in the database (3000+ pages, pretty much all JSPs) that needs to show up in the search results. As I see it, I can either:

a) Migrate this application to Nutch, or
b) Write/implement a web crawler to crawl our site and inject the crawl results into our existing Lucene index.

I am leaning towards option B, since I think it would only take me a couple of days to implement a simple crawler and I wouldn't have to change much else (a rough sketch of what I'm picturing is at the bottom of this message).

Can anyone think of any points/counterpoints for using Nutch vs. writing a crawler to extend the Lucene framework we already use?

Thanks.
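
P.S. In case it helps frame the question, here is roughly the kind of thing I'm picturing for option B. It's just a sketch, not working code: it assumes a Lucene 1.9/2.x-style IndexWriter/Field API, and the index path, field names, analyzer, and hard-coded page URLs are all placeholders for whatever we actually use. A real crawler would follow links from the site and strip HTML properly instead of using a regex.

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;

public class JspPageIndexer {

    // Fetch one page and return its raw HTML.
    static String fetch(String pageUrl) throws Exception {
        BufferedReader in = new BufferedReader(
                new InputStreamReader(new URL(pageUrl).openStream()));
        StringBuffer html = new StringBuffer();
        String line;
        while ((line = in.readLine()) != null) {
            html.append(line).append('\n');
        }
        in.close();
        return html.toString();
    }

    public static void main(String[] args) throws Exception {
        // Open the existing index without recreating it (create = false),
        // so the crawled pages become documents alongside the DB-backed ones.
        IndexWriter writer = new IndexWriter("/path/to/existing/index",
                new StandardAnalyzer(), false);

        // In the real thing this list would come from walking the site's
        // link graph; here it's just a couple of hard-coded URLs.
        String[] pages = { "http://www.example.com/about.jsp",
                           "http://www.example.com/faq.jsp" };

        for (int i = 0; i < pages.length; i++) {
            String html = fetch(pages[i]);
            // Crude tag stripping, just enough to show the idea.
            String text = html.replaceAll("<[^>]*>", " ");

            Document doc = new Document();
            doc.add(new Field("url", pages[i],
                    Field.Store.YES, Field.Index.UN_TOKENIZED));
            doc.add(new Field("contents", text,
                    Field.Store.NO, Field.Index.TOKENIZED));
            writer.addDocument(doc);
        }

        writer.optimize();
        writer.close();
    }
}

The appeal for me is that the existing searcher and query-building code wouldn't change at all; the page documents would just be one more "type" in the same index.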
