I am using Nutch 0.9.
qa_nutch wrote:
>
> Hello. I am new to Nutch and have read the basics. I want to access the
> parsed HTML text of each URL (separately) from the segment. I can then use
> each of those parsed text files for other NLP tasks such as tagging and
> named entity recognition. Using the segment dump gave me a lot of
> information together: parsed text, links, HTML, etc. So I wish to obtain
> the parsed text of the HTML corresponding to each URL in the linkdb
> separately. Is this possible?
>
