Hello. I am new to Nutch and have read the basics. I want to access the parsed
HTML text of each URL separately from the segment, so that I can print each
one to its own file. I can then use those files for other NLP tasks such as
tagging and named entity recognition. Using the segment dump gave me a lot of
information all together: parsed text, links, HTML, and so on. So basically,
I want to obtain the parsed text of the HTML for each URL in the linkdb.
Is there any way to do this?
-- 
View this message in context: 
http://www.nabble.com/html-parse-text-tp14319904p14319904.html
Sent from the Nutch - User mailing list archive at Nabble.com.
