Hi guys

Thanks for your valuable comments. I realize that the motivation for this
change was not communicated correctly.
To give you more insight, we have overridden the GadgetHtmlParser with a
similar router which is capable of routing to caja / neko depending on the
request context. This is very helpful because it allows us to say things
like "use caja parser for accel container, neko otherwise".
Also, the logic for deciding which parser to use is consolidated into 1
place, so we dont muck up multiple places.
Also, it extends GadgetHtmlParser so that the existing shindig injections of
GadgetHtmlParser dont need to change at all.

Part of the motivation for this change was to move out the router to
shindig, and override the decision specific stuff in our installation. This
helps because other systems (like chirag's use case for caja parser) reuse
the functionality.

Also, while debugging, we found web pages which dont specify the encoding in
the response headers, but do specify it as part of the <meta
Http-Equiv="Content-Type"> tag. For accel use case, it is very important to
parse the content encoding correctly, otherwise the default encoding is
assumed, and the page looks like garbage. Sadly there is no easy of knowing
that we messed up there.

We have seen caja throw up on lots of web pages, and neko mess up the dom,
but still parse it to some extent.
Hence the motivation for
parseDomForCharsetDetection<http://codereview.appspot.com/1883041/>
change
which we would override in the router to cycle through all available
parsers, so that we get some dom which can be used to extract the page
encoding.

Given the above reasons, i urge all to give it some more thought. Meanwhile,
we'l try to implement parseDomForCharsetDetection in our installation first
and come back to this change later when we have more experiences to share.

Thanks lots for the feedback.
Gagan

On Tue, Oct 5, 2010 at 11:32 PM, <[email protected]> wrote:

>
>
> http://codereview.appspot.com/2119043/diff/63001/java/gadgets/src/main/java/org/apache/shindig/gadgets/parse/GadgetHtmlParserRouter.java
>
> File
>
> java/gadgets/src/main/java/org/apache/shindig/gadgets/parse/GadgetHtmlParserRouter.java
> (right):
>
>
> http://codereview.appspot.com/2119043/diff/63001/java/gadgets/src/main/java/org/apache/shindig/gadgets/parse/GadgetHtmlParserRouter.java#newcode65
>
> java/gadgets/src/main/java/org/apache/shindig/gadgets/parse/GadgetHtmlParserRouter.java:65:
> public void setHtmlParserFromName(String htmlParserName) {
> My 2 cents is I am not too sure about additional setHtmlParserFromName
> public method in this class which is not expose in the
> GadgetHTMLRenderer.
>
> If iterating through existing parsers is really desired we could
> probably make the router as factory which does not extend the
> GadgetHTMLParser instead. Because essentially thats what the router is
> doing, right?
> And we could also use the Map binding as Paul suggested in the
> implementation.
>
>
> http://codereview.appspot.com/2119043/
>

Reply via email to