Hey Paul:

I've done a little digging into this, and have found out a few more details.

* Using the test file you've provided, the serialized version of the parsed
DOM turns out to be (schematically):
<head>
<link rel="...">
<script type="os/data"></script>
<os:ViewerRequest.../>
</head>
<body>
<script type="os/template">
..os/template stuff
</script>

* So the problem is that the contents of the os/data tag (<os:ViewerRequest>)
end up outside of its <script> block. That in turn happens due to code down in
org.cyberneko.html.HTMLTagBalancer (HTMLTagBalancer.java), line 665 and onward:
        // close previous elements
        // all elements close a <script>
        // in head, no element has children
        if ((fElementStack.top > 1
                && (fElementStack.peek().element.code == HTMLElements.SCRIPT))
                || fElementStack.top > 2
                && fElementStack.data[fElementStack.top-2].element.code == HTMLElements.HEAD) {
            final Info info = fElementStack.pop();
            if (fDocumentHandler != null) {
                callEndElement(info.qname, synthesizedAugs());
            }
        }

This conditional causes callEndElement to close the <script> tag prematurely
while the os/data tag's contents are being processed (that is, when
<os:ViewerRequest> is parsed). As noted in the comment, the semantic reason
this occurs is that the <script> element gets automagically hoisted into
<head>, and from there the code assumes that no child of <head> has children
of its own.
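
To make the effect concrete, here is a minimal, self-contained sketch of just
that conditional. It is not NekoHTML code: the class TagBalancerSketch, its
methods, and the use of plain strings for element names are all hypothetical
stand-ins, and it models only the quoted check, not the surrounding logic that
decides when the check is reached.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model of the quoted HTMLTagBalancer conditional.
// The element stack is a plain list of names; the last entry is the top.
public class TagBalancerSketch {

    // Mirrors the quoted check: pop the top element if it is a SCRIPT,
    // or if its parent (the entry just below the top) is HEAD.
    public static boolean closesTop(List<String> stack) {
        int top = stack.size();
        boolean topIsScript = top > 1 && stack.get(top - 1).equals("SCRIPT");
        boolean parentIsHead = top > 2 && stack.get(top - 2).equals("HEAD");
        return topIsScript || parentIsHead;
    }

    // Simulates a start tag arriving: apply the check, then push the new
    // element. Returns the end/start events that would be emitted.
    public static List<String> startElement(List<String> stack, String name) {
        List<String> events = new ArrayList<>();
        if (closesTop(stack)) {
            String closed = stack.remove(stack.size() - 1);
            events.add("end:" + closed);
        }
        stack.add(name);
        events.add("start:" + name);
        return events;
    }

    public static void main(String[] args) {
        // State after <script type="os/data"> has been hoisted into <head>.
        List<String> stack =
                new ArrayList<>(Arrays.asList("HTML", "HEAD", "SCRIPT"));
        System.out.println(startElement(stack, "os:ViewerRequest"));
        System.out.println(stack);
    }
}
```

In this sketch the SCRIPT entry is ended before os:ViewerRequest is started, so
os:ViewerRequest becomes a sibling of the script under HEAD, which matches the
serialized output shown above.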

That's about it so far. I'm still trying to figure out why <link> followed by
<script type="os/data"> causes the <script> element to be placed in <head>,
while removing <link> doesn't. Stranger yet, putting the
<script type="os/template"> element after <link> doesn't cause it to be placed
in <head> either, which is what would trigger this behavior.

--j

On Wed, Nov 11, 2009 at 10:55 AM, Paul Lindner <plind...@linkedin.com> wrote:

> The problems started with this commit.  Louis, could you have a look at it?
>
> commit a98b18f181df9e78c2f90f6b02483e64c3c76a2a
> Author: lryan <lr...@13f79535-47bb-0310-9956-ffa450edef68>
> Date:   Tue Sep 29 22:17:23 2009 +0000
>
>    Upgrade to Neko 1.9.13. Remove unused old Neko based parser
>
>    git-svn-id: https://svn.apache.org/repos/asf/incubator/shindig/tr...@82010913f79535-47bb-0310-9956-ffa450edef68
>
>
> On Tue, Nov 10, 2009 at 11:03 AM, Paul Lindner <plind...@linkedin.com> wrote:
>
> > If there are tags preceding the text/os-data script tags, parsing fails.
> > It appears that the DOM parsing mangles the singular tags in the block and
> > somehow sees them as an open/close tag combo.
> >
> > A patch that reproduces the problem follows.  I'll do a git bisect later
> > today to see where this was introduced.
> >
> >
> > diff --git a/java/gadgets/src/test/resources/org/apache/shindig/gadgets/parse/nekohtml/test-socialmarkup.html b/java/gadgets/src/test/resources/org/apache/shindig/gadgets/p
> > index c7fa769..f38663b 100644
> > --- a/java/gadgets/src/test/resources/org/apache/shindig/gadgets/parse/nekohtml/test-socialmarkup.html
> > +++ b/java/gadgets/src/test/resources/org/apache/shindig/gadgets/parse/nekohtml/test-socialmarkup.html
> > @@ -1,3 +1,5 @@
> > +<link href="moo"></link>
> > +
> >  <script type="text/os-data" xmlns:os="http://ns.opensocial.org/2008/markup">
> >    <os:ViewerRequest key="viewer"/>
> >  </script>
> > @@ -14,4 +16,4 @@
> >
> >  <span>Some content</span>
> >
> > -<div><!-- foo -->bar<!-- baz --></div>
> > \ No newline at end of file
> > +<div><!-- foo -->bar<!-- baz --></div>
> >
>
