Hi folks,
I am using the HtmlDocument class from http://svn.apache.org/repos/asf/lucene/java/trunk/contrib/ant/src/java/org/apache/lucene/ant/ like this:

    HtmlDocument hd = new HtmlDocument(p.getInputStream());
    doc.add(new Field(F_CONTENTS, new StringReader(hd.getBody()), Field.TermVector.YES));

I keep getting these errors:

line 29 column 27 - Error: <st1:place> is not recognized!
line 29 column 47 - Error: <st1:country-region> is not recognized!
line 36 column 21 - Error: <o:p> is not recognized!
line 39 column 67 - Error: <o:p> is not recognized!
line 43 column 45 - Error: <o:p> is not recognized!
line 46 column 52 - Error: <o:p> is not recognized!
line 54 column 27 - Error: <o:p> is not recognized!
This document has errors that must be fixed before using HTML Tidy to generate a tidied up version.

line 3 column 331 - Error: <img> missing '>' for end of tag
This document has errors that must be fixed before using HTML Tidy to generate a tidied up version.

line 1 column 1,214 - Error: <img> missing '>' for end of tag
This document has errors that must be fixed before using HTML Tidy to generate a tidied up version.

line 15 column 1 - Error: o:smarttagtype is not recognized!
line 17 column 1 - Error: o:smarttagtype is not recognized!
line 19 column 1 - Error: o:smarttagtype is not recognized!
line 21 column 1 - Error: o:smarttagtype is not recognized!
line 23 column 1 - Error: o:smarttagtype is not recognized!
line 111 column 48 - Error: <o:p> is not recognized!
line 111 column 196 - Error: <o:p> is not recognized!
This document has errors that must be fixed before using HTML Tidy to generate a tidied up version.

line 1 column 1,444 - Error: <img> missing '>' for end of tag
This document has errors that must be fixed before using HTML Tidy to generate a tidied up version.

line 1 column 1,384 - Error: <img> missing '>' for end of tag
This document has errors that must be fixed before using HTML Tidy to generate a tidied up version.

line 662 column 11 - Error: <st1:city> is not recognized!
line 663 column 12 - Error: <st1:place> is not recognized!
line 682 column 91 - Error: <st1:personname> is not recognized!
line 686 column 87 - Error: <st1:place> is not recognized!
line 687 column 12 - Error: <st1:placename> is not recognized!
line 687 column 62 - Error: <st1:placetype> is not recognized!
This document has errors that must be fixed before using HTML Tidy to generate a tidied up version.

line 283 column 61 - Error: <o:p> is not recognized!
line 288 column 72 - Error: <o:p> is not recognized!
This document has errors that must be fixed before using HTML Tidy to generate a tidied up version.

line 118 column 41 - Error: <o:p> is not recognized!
line 151 column 34 - Error: <o:p> is not recognized!
line 153 column 22 - Error: <o:p> is not recognized!
This document has errors that must be fixed before using HTML Tidy to generate a tidied up version.

line 174 column 43 - Error: <o:p> is not recognized!
line 209 column 36 - Error: <o:p> is not recognized!
line 212 column 17 - Error: <o:p> is not recognized!
This document has errors that must be fixed before using HTML Tidy to generate a tidied up version.

line 163 column 47 - Error: <o:p> is not recognized!
line 198 column 38 - Error: <o:p> is not recognized!
line 200 column 28 - Error: <o:p> is not recognized!
This document has errors that must be fixed before using HTML Tidy to generate a tidied up version.

line 123 column 18 - Error: <font> missing '>' for end of tag
line 195 column 25 - Error: <font> missing '>' for end of tag
This document has errors that must be fixed before using HTML Tidy to generate a tidied up version.

line 123 column 18 - Error: <font> missing '>' for end of tag
line 195 column 25 - Error: <font> missing '>' for end of tag
This document has errors that must be fixed before using HTML Tidy to generate a tidied up version.

_Durga
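
To see at a glance which tags Tidy is rejecting, the log above can be tallied by tag name. The following is only a minimal sketch, assuming the Tidy output has been captured into a single string. The class name TidyErrorSummary and the method countUnrecognized are hypothetical helpers made up for illustration and are not part of Lucene or JTidy.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: counts how often each tag is reported as
// "is not recognized!" in captured HTML Tidy output.
final class TidyErrorSummary {

    // Matches both "Error: <o:p> is not recognized!" and the bracket-less
    // form "Error: o:smarttagtype is not recognized!" seen in the log.
    private static final Pattern UNRECOGNIZED =
            Pattern.compile("Error: <?([A-Za-z0-9:_-]+)>? is not recognized!");

    static Map<String, Integer> countUnrecognized(String log) {
        // TreeMap keeps the tag names in sorted order for readable output.
        Map<String, Integer> counts = new TreeMap<>();
        Matcher m = UNRECOGNIZED.matcher(log);
        while (m.find()) {
            counts.merge(m.group(1), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // A few entries copied from the log above; "missing '>'" errors
        // are deliberately not counted by this sketch.
        String sample = "line 29 column 27 - Error: <st1:place> is not recognized! "
                + "line 36 column 21 - Error: <o:p> is not recognized! "
                + "line 39 column 67 - Error: <o:p> is not recognized! "
                + "line 15 column 1 - Error: o:smarttagtype is not recognized! "
                + "line 3 column 331 - Error: <img> missing '>' for end of tag";
        System.out.println(countUnrecognized(sample));
    }
}
```

Run over the full log, a tally like this makes it easy to see that every tag reported as unrecognized carries an o: or st1: prefix, while the remaining errors are the separate "missing '>'" complaints on <img> and <font>.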