Dear All, I am Phuong Linh. I am using Tika to extract content from HTML files for search, but HtmlParser does not parse all of the HTML tags. (I fetch the HTML pages with Nutch, then use Tika to extract the important information, and after that use Solr to search.) Can you tell me what I can do to parse all of the HTML tags?
Thanks in advance! Regards, Tang Thi Phuong Linh. -- P.Linh
