I need to spider a given URL and obtain a list of tokenized
keywords/phrases. However, I don't want this information indexed
into Nutch, as we have our own DB storage.
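
To make the goal concrete, here is a rough sketch of the kind of output being asked for. It is not Nutch or Lucene code: it is plain Python using only the standard library, and the names `TextExtractor`, `extract_phrases`, `STOP_WORDS`, and `SAMPLE_HTML` are made up for illustration. Fetching the page is left out, so the HTML is assumed to be already in hand as a string.

```python
import re
from collections import Counter
from html.parser import HTMLParser

# Hypothetical, deliberately tiny stop-word list (an assumption for this sketch).
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "is", "for", "on"}


class TextExtractor(HTMLParser):
    """Collects text content, skipping the bodies of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def extract_phrases(html, max_words=2):
    """Return a Counter of lowercased 1..max_words word phrases found in html."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    tokens = re.findall(r"[a-z0-9]+", " ".join(parser.chunks).lower())

    counts = Counter()
    for n in range(1, max_words + 1):
        for i in range(len(tokens) - n + 1):
            gram = tokens[i:i + n]
            # Drop phrases that start or end with a stop word.
            if gram[0] in STOP_WORDS or gram[-1] in STOP_WORDS:
                continue
            counts[" ".join(gram)] += 1
    return counts


# Hypothetical sample page standing in for the fetched URL.
SAMPLE_HTML = (
    "<html><head><title>Open source search</title><style>p{}</style></head>"
    "<body><p>Nutch is an open source search engine.</p></body></html>"
)
phrases = extract_phrases(SAMPLE_HTML)
```

The resulting counts could then be written straight to our own DB with no index involved. Note that this naive version lets phrases run across element boundaries (for example from the title into the body), which a real tokenizer would presumably handle better.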

Does Nutch (or Lucene) provide this functionality through its API? If so,
is there a working example? Nothing jumped out at me in the test files of
Nutch's distribution.

We are currently using a product from http://www.extractor.com, but the
licensing is too restrictive for us to upgrade, so we are searching for
alternatives.

Thanks in advance...
- Doug
-- 
View this message in context: 
http://www.nabble.com/Spider-a-single-url-and-get-tokenzied-keyword-phrases-tp21147864p21147864.html
Sent from the Nutch - User mailing list archive at Nabble.com.
