I need to spider a given URL and obtain a list of tokenized keywords/phrases. However, I don't want this information indexed into Nutch, as we have our own DB storage.
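To make the goal concrete, here is a rough sketch of the kind of pipeline I have in mind. It uses only the plain JDK (Java 11+ for java.net.http), not any Nutch or Lucene API. The class name KeywordSketch, its helper methods, the regex-based tag stripping, and the two-word phrase counting are all placeholders I made up for illustration; a real solution would need a proper HTML parser and a better analyzer.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical illustration only: fetch one page, strip markup, tokenize,
// and count terms in memory. Nothing is written to an index.
public class KeywordSketch {

    // Crude markup removal (assumption: good enough for a sketch).
    // Drops <script>/<style> blocks first, then every remaining tag.
    static String stripHtml(String html) {
        return html
            .replaceAll("(?is)<(script|style)[^>]*>.*?</\\1>", " ")
            .replaceAll("(?s)<[^>]+>", " ");
    }

    // Lowercase the text and split on anything that is not a letter or digit.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Count single tokens plus adjacent two-word phrases, in first-seen order.
    static Map<String, Integer> countTerms(List<String> tokens) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (int i = 0; i < tokens.size(); i++) {
            counts.merge(tokens.get(i), 1, Integer::sum);
            if (i + 1 < tokens.size()) {
                counts.merge(tokens.get(i) + " " + tokens.get(i + 1), 1, Integer::sum);
            }
        }
        return counts;
    }

    // Download the page body as a string using the JDK HTTP client.
    static String fetch(String url) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(URI.create(url)).GET().build();
        return client.send(request, HttpResponse.BodyHandlers.ofString()).body();
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            System.out.println("usage: java KeywordSketch <url>");
            return;
        }
        String html = fetch(args[0]);
        // Print "count<TAB>term" so the result can be loaded into our own DB.
        countTerms(tokenize(stripHtml(html)))
            .forEach((term, n) -> System.out.println(n + "\t" + term));
    }
}
```

Something like this gives raw term counts, but no stemming, stop-word removal, or phrase ranking, which is the part I am hoping an existing analyzer can supply.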
Does Nutch (or Lucene) provide this functionality through its API? If so, is there a working example? Nothing jumped out at me in the test files of Nutch's distribution. We are currently using a product from http://www.extractor.com, but its licensing is too restrictive for us to upgrade, so we are searching for alternatives.
Thanks in advance...
- Doug
--
View this message in context: http://www.nabble.com/Spider-a-single-url-and-get-tokenzied-keyword-phrases-tp21147864p21147864.html
Sent from the Nutch - User mailing list archive at Nabble.com.
