I am using Nutch 0.9.


qa_nutch wrote:
> 
> Hello. I am new to Nutch and have read the basics. I want to access the
> parsed HTML text of each URL (separately) from the segment, so that I can
> use each of those parsed text files for other NLP tasks such as tagging and
> named entity recognition. A segment dump gave me a lot of information
> together: parsed text, links, HTML, etc. So I wish to obtain the parsed text
> of the HTML corresponding to each URL in the linkdb separately. Is this
> possible?
> 

-- 
View this message in context: 
http://www.nabble.com/html-parse-text-tp14319904p14319916.html
Sent from the Nutch - User mailing list archive at Nabble.com.
