florent andré
Fri, 29 Jan 2010 16:34:50 -0800
HtmlHandler is not "extends enabled" because:
1 - all attributes are private and would have to be protected
2 - resolve() is in the same situation
3 - calling super.startElement() is not so easy because of the body/title/discard level counting.
HtmlParser is more "extends enabled", but the only reason I would extend this class is to change the hardcoded new HtmlHandler in the expression
    parser.setContentHandler(new XHTMLDowngradeHandler(new HtmlHandler(this, handler, metadata)));
to MyHtmlHandler(...). Maybe a configuration option for this class instantiation would be useful.
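To make the configuration idea concrete, here is a minimal sketch of a parser that takes its handler from a factory instead of a hardcoded "new". Every class name in it (DefaultMappingHandler, MyMappingHandler, ConfigurableParserSketch) is hypothetical and only stands in for the Tika classes discussed above; it uses nothing but the SAX types shipped with the JDK and is not Tika's actual API.

```java
import org.xml.sax.ContentHandler;
import org.xml.sax.helpers.DefaultHandler;

import java.util.function.Function;

/** Hypothetical stand-in for the default handler (plays the role of HtmlHandler). */
class DefaultMappingHandler extends DefaultHandler {
    // protected rather than private, so subclasses can reach it
    protected final ContentHandler downstream;

    DefaultMappingHandler(ContentHandler downstream) {
        this.downstream = downstream;
    }
}

/** Hypothetical custom subclass (plays the role of MyHtmlHandler). */
class MyMappingHandler extends DefaultMappingHandler {
    MyMappingHandler(ContentHandler downstream) {
        super(downstream);
    }
}

/** Hypothetical parser whose handler comes from a factory instead of a hardcoded "new". */
class ConfigurableParserSketch {
    private final Function<ContentHandler, ContentHandler> handlerFactory;

    /** Default behaviour: same handler class as before. */
    ConfigurableParserSketch() {
        this(DefaultMappingHandler::new);
    }

    /** Callers can plug in their own handler class without subclassing the parser. */
    ConfigurableParserSketch(Function<ContentHandler, ContentHandler> handlerFactory) {
        this.handlerFactory = handlerFactory;
    }

    /** Builds the handler that would wrap the caller's own ContentHandler. */
    ContentHandler buildHandler(ContentHandler userHandler) {
        return handlerFactory.apply(userHandler);
    }
}
```

With this shape, new ConfigurableParserSketch(MyMappingHandler::new) would swap in the custom handler, while the no-argument constructor keeps the default one.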
Can you tell me whether I am on the wrong track, and whether a way to override/extend the parser's features is on your roadmap?
My two cents... Have a good day,
++

Florent André wrote:
Hi all,

I work on HTML parsing via the generic AutoDetectParser() class. I have to keep some "specific" attributes (id and class) on <table> elements in order to detect which tables have "meaning" for my app.

So, as far as I understand for now, I have to:

- extend HtmlHandler with MyHtmlHandler
- in MyHtmlHandler, override public void startElement(...) with something like this:

    if (bodyLevel == 0 && discardLevel == 0) {
        if ("TABLE".equals(name)) {
            AttributesImpl attributes = new AttributesImpl();
            String id = atts.getValue("id");
            // "class" is a reserved word in Java, so the variable needs another name
            String cssClass = atts.getValue("class");
            if (id != null) {
                attributes.addAttribute("", "id", "id", "CDATA", id);
            }
            if (cssClass != null) {
                attributes.addAttribute("", "class", "class", "CDATA", cssClass);
            }
            xhtml.startElement("http://www.w3.org/1999/xhtml", "table", "table", attributes);
        } else {
            // any element other than table
            super.startElement(...)
        }
    } else {
        // any other bodyLevel and discardLevel
        super.startElement(...)
    }

- and finally pass MyHtmlHandler to the parse() method via parseContext.

* Is this the right way to do such a thing?
* How can I use the parseContext to pass MyHtmlHandler? I can't find any example of it...

Any comment will be much appreciated.
Have a good day
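
The attribute filtering described in the quoted message can be tried out on its own, without touching HtmlHandler. The sketch below is only an illustration under stated assumptions: TableAttributeFilter and TableAttributeFilterDemo are hypothetical names, the filter is a plain SAX decorator built on JDK classes rather than a Tika handler, and it ignores the bodyLevel/discardLevel bookkeeping entirely.

```java
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

import java.util.ArrayList;
import java.util.List;

/** Hypothetical decorator: forwards elements, keeping only id/class on table. */
class TableAttributeFilter extends DefaultHandler {
    private static final String XHTML = "http://www.w3.org/1999/xhtml";
    private final ContentHandler downstream;

    TableAttributeFilter(ContentHandler downstream) {
        this.downstream = downstream;
    }

    @Override
    public void startElement(String uri, String local, String name, Attributes atts)
            throws SAXException {
        if ("table".equalsIgnoreCase(name)) {
            AttributesImpl kept = new AttributesImpl();
            copyIfPresent(atts, "id", kept);
            copyIfPresent(atts, "class", kept);
            downstream.startElement(XHTML, "table", "table", kept);
        } else {
            downstream.startElement(uri, local, name, atts);
        }
    }

    @Override
    public void endElement(String uri, String local, String name) throws SAXException {
        if ("table".equalsIgnoreCase(name)) {
            downstream.endElement(XHTML, "table", "table");
        } else {
            downstream.endElement(uri, local, name);
        }
    }

    /** Copies one attribute, looked up by qualified name, if the source has it. */
    private static void copyIfPresent(Attributes from, String attr, AttributesImpl to) {
        String value = from.getValue(attr);
        if (value != null) {
            to.addAttribute("", attr, attr, "CDATA", value);
        }
    }
}

/** Hypothetical demo: feeds one table element through the filter and records the result. */
class TableAttributeFilterDemo {
    static List<String> run() {
        List<String> seen = new ArrayList<>();
        ContentHandler recorder = new DefaultHandler() {
            @Override
            public void startElement(String uri, String local, String name, Attributes a) {
                StringBuilder sb = new StringBuilder(name);
                for (int i = 0; i < a.getLength(); i++) {
                    sb.append(' ').append(a.getQName(i)).append('=').append(a.getValue(i));
                }
                seen.add(sb.toString());
            }
        };
        AttributesImpl in = new AttributesImpl();
        in.addAttribute("", "id", "id", "CDATA", "prices");
        in.addAttribute("", "border", "border", "CDATA", "1");
        in.addAttribute("", "class", "class", "CDATA", "data");
        try {
            TableAttributeFilter filter = new TableAttributeFilter(recorder);
            filter.startElement("", "TABLE", "TABLE", in);
            filter.endElement("", "TABLE", "TABLE");
        } catch (SAXException e) {
            throw new IllegalStateException(e);
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints [table id=prices class=data]
    }
}
```

In the demo the border attribute is dropped while id and class survive, which is the behaviour the override above is aiming for.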