Hi Doug,

Nutch is not really meant for this kind of task.  Choosing Nutch for it would
mean using a very massive hammer on a very small nail.
:)

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



----- Original Message ----
> From: Doug Leeper <douglee...@yahoo.com>
> To: nutch-user@lucene.apache.org
> Sent: Tuesday, December 23, 2008 12:04:51 PM
> Subject: Spider a single url and get tokenzied keyword/phrases
> 
> 
> I need to spider a given URL and obtain a list of tokenized
> keywords/phrases.  However, I don't want this information indexed
> into Nutch, as we have our own DB storage.
> 
> Does Nutch (or Lucene) provide this functionality through its APIs?  If so,
> is there a working example?  Nothing jumped out at me in the test files of
> Nutch's distribution.
> 
> We are currently using a product from http://www.extractor.com, but the
> licensing is too restrictive to upgrade.
> Therefore we are searching for alternatives.
> 
> Thanks in advance...
> - Doug
> -- 
> View this message in context: 
> http://www.nabble.com/Spider-a-single-url-and-get-tokenzied-keyword-phrases-tp21147864p21147864.html
> Sent from the Nutch - User mailing list archive at Nabble.com.
