Hi,
Instead of searching for arbitrary terms, I'd like to search for a list of pre-defined terms.

The actual application:

Inputs:
1. Wikipedia's list of article titles
2. RSS/Atom Feeds

I'll get the permalink URLs from the feeds, then fetch and index them with Nutch.
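
As a rough sketch of the feed step only (not Nutch code), the permalinks could be pulled out of a feed with Python's standard library. The function name `permalinks` is made up for illustration, and this assumes plain RSS 2.0 `<item><link>` elements or Atom `<entry><link href="...">` elements; real feeds may need a more forgiving parser.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "{http://www.w3.org/2005/Atom}"


def permalinks(feed_xml):
    """Return the entry URLs found in an RSS 2.0 or Atom feed string.

    Hypothetical helper: it only looks at <item><link> (RSS) and
    <entry><link rel="alternate" href="..."> (Atom).
    """
    root = ET.fromstring(feed_xml)
    urls = []

    # RSS 2.0: each <item> carries its permalink as the text of <link>.
    for item in root.iter("item"):
        link = item.findtext("link")
        if link and link.strip():
            urls.append(link.strip())

    # Atom: each <entry> has <link> elements; a missing rel means "alternate".
    for entry in root.iter(ATOM_NS + "entry"):
        for link in entry.findall(ATOM_NS + "link"):
            href = link.get("href")
            if href and link.get("rel", "alternate") == "alternate":
                urls.append(href)

    return urls


rss = (
    "<rss version='2.0'><channel><link>http://example.com/</link>"
    "<item><title>Post 1</title><link>http://example.com/post-1</link></item>"
    "</channel></rss>"
)
print(permalinks(rss))  # ['http://example.com/post-1']
```

The resulting URLs would then be written to whatever seed list the crawler reads.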

Output:
1. A list of URLs and the Wikipedia article titles each page contains.

Of course, with Nutch + Lucene as it is, I can iterate through the list
of titles and run one search per title, but that's not very efficient.
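
To make the alternative concrete, here is a minimal sketch of the reverse approach: load the titles into a lookup table once, then scan each page's words a single time and check every short word sequence against the table. This is plain Python, not Nutch or Lucene code; the names `build_title_index` and `find_titles` are made up for illustration, and matching is assumed to be case-insensitive on whole words.

```python
import re


def build_title_index(titles):
    """Map each title, normalized to a tuple of lowercase words, to its original form."""
    index = {}
    max_len = 0  # longest title in words; bounds the scan window below
    for title in titles:
        words = tuple(re.findall(r"\w+", title.lower()))
        if words:
            index[words] = title
            max_len = max(max_len, len(words))
    return index, max_len


def find_titles(text, index, max_len):
    """Return the set of titles whose word sequence occurs in the text."""
    words = re.findall(r"\w+", text.lower())
    found = set()
    for start in range(len(words)):
        # Try every window of 1..max_len words starting at this position.
        for length in range(1, min(max_len, len(words) - start) + 1):
            title = index.get(tuple(words[start:start + length]))
            if title is not None:
                found.add(title)
    return found


index, max_len = build_title_index(["Apache Lucene", "Web crawler", "RSS"])
page = "Nutch is a web crawler built on Apache Lucene."
print(sorted(find_titles(page, index, max_len)))  # ['Apache Lucene', 'Web crawler']
```

The cost per page depends on the page length and the longest title, not on the number of titles, which is the point of doing it this way round. Running this over each fetched page would give the URL-to-titles list described under Output.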

Is anyone working on similar applications?
