Hi guys Thanks for your valuable comments. I realize that the motivation for this change was not communicated correctly. To give you more insight, we have overridden the GadgetHtmlParser with a similar router which is capable of routing to caja / neko depending on the request context. This is very helpful because it allows us to say things like "use caja parser for accel container, neko otherwise". Also, the logic for deciding which parser to use is consolidated into 1 place, so we dont muck up multiple places. Also, it extends GadgetHtmlParser so that the existing shindig injections of GadgetHtmlParser dont need to change at all.
Part of the motivation for this change was to move out the router to shindig, and override the decision specific stuff in our installation. This helps because other systems (like chirag's use case for caja parser) reuse the functionality. Also, while debugging, we found web pages which dont specify the encoding in the response headers, but do specify it as part of the <meta Http-Equiv="Content-Type"> tag. For accel use case, it is very important to parse the content encoding correctly, otherwise the default encoding is assumed, and the page looks like garbage. Sadly there is no easy of knowing that we messed up there. We have seen caja throw up on lots of web pages, and neko mess up the dom, but still parse it to some extent. Hence the motivation for parseDomForCharsetDetection<http://codereview.appspot.com/1883041/> change which we would override in the router to cycle through all available parsers, so that we get some dom which can be used to extract the page encoding. Given the above reasons, i urge all to give it some more thought. Meanwhile, we'l try to implement parseDomForCharsetDetection in our installation first and come back to this change later when we have more experiences to share. Thanks lots for the feedback. Gagan On Tue, Oct 5, 2010 at 11:32 PM, <[email protected]> wrote: > > > http://codereview.appspot.com/2119043/diff/63001/java/gadgets/src/main/java/org/apache/shindig/gadgets/parse/GadgetHtmlParserRouter.java > > File > > java/gadgets/src/main/java/org/apache/shindig/gadgets/parse/GadgetHtmlParserRouter.java > (right): > > > http://codereview.appspot.com/2119043/diff/63001/java/gadgets/src/main/java/org/apache/shindig/gadgets/parse/GadgetHtmlParserRouter.java#newcode65 > > java/gadgets/src/main/java/org/apache/shindig/gadgets/parse/GadgetHtmlParserRouter.java:65: > public void setHtmlParserFromName(String htmlParserName) { > My 2 cents is I am not too sure about additional setHtmlParserFromName > public method in this class which is not expose in the > GadgetHTMLRenderer. > > If iterating through existing parsers is really desired we could > probably make the router as factory which does not extend the > GadgetHTMLParser instead. Because essentially thats what the router is > doing, right? > And we could also use the Map binding as Paul suggested in the > implementation. > > > http://codereview.appspot.com/2119043/ >
