Hi,
Instead of searching for arbitrary terms, I'd like to match pages against a list of pre-defined terms.

The actual application:

Inputs:
1. Wikipedia's list of article titles
2. RSS/Atom Feeds

I'll get the permalink URLs from the feeds, then fetch and index them with Nutch.

Output:
1. A list of URLs, each with the Wikipedia article titles that appear in it.

Of course, with Nutch + Lucene as it is, I can iterate through the list
of titles and run a search for each one, but that's not very efficient.
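
To illustrate what I mean by something more efficient, here is a rough sketch
of the inverse approach: load the titles into a lookup table once, then make a
single pass over each page's text and check every run of words against that
table. This is plain Python, not Nutch or Lucene code; the function names, the
sample titles, and the example.com URLs are all made up for illustration, and
the tokenization is deliberately naive.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def build_title_index(titles):
    """Map each title's token tuple to the title; also return the longest title length."""
    index = {}
    max_len = 0
    for title in titles:
        tokens = tuple(tokenize(title))
        if tokens:
            index[tokens] = title
            max_len = max(max_len, len(tokens))
    return index, max_len


def find_titles(text, index, max_len):
    """Return the set of titles that occur in text as a contiguous run of tokens."""
    tokens = tokenize(text)
    found = set()
    for start in range(len(tokens)):
        # Try every window starting here, up to the longest known title.
        for length in range(1, min(max_len, len(tokens) - start) + 1):
            match = index.get(tuple(tokens[start:start + length]))
            if match is not None:
                found.add(match)
    return found


# Hypothetical sample data standing in for the Wikipedia titles and fetched pages.
titles = ["Apache Lucene", "Search engine", "Java"]
pages = {
    "http://example.com/a": "Apache Lucene is a search engine library written in Java.",
    "http://example.com/b": "Nothing relevant here.",
}

index, max_len = build_title_index(titles)
results = {url: find_titles(text, index, max_len) for url, text in pages.items()}
```

The cost per page depends on the page length and the longest title, not on the
number of titles, which is the property I'm after. Titles that differ only in
case or punctuation collapse to one entry in this sketch.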

Is anyone working on similar applications?


_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
