Hi there,

I have noticed a problem in the output of WebPagePortlet. It does not convert
some tags, such as SPAN and TBODY, correctly. For example, if the page contains
something like

<SPAN class="someclass">some text</SPAN>

it is converted to

<SPAN class="someclass">some text<SPAN endtag="true">

Because the closing tag is emitted as a second opening tag, the SPAN is never
closed, which causes style problems.

After some digging, I found that the HTML parser
(javax.swing.text.html.parser.DocumentParser) from Java Swing's
HTMLEditorKit package uses the HTML 3.2 DTD. SPAN is not defined in that DTD,
even though the tag exists as the constant HTML.Tag.SPAN.
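
To check how the parser reports a given tag on a particular JDK, a small
logging callback can help. This is only a minimal sketch: TagEventLogger and
its log method are hypothetical names, not part of Jetspeed, and the events you
see may differ between JDK versions.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

// Hypothetical diagnostic helper (not part of Jetspeed): records every
// callback the Swing HTML parser fires while parsing a snippet.
final class TagEventLogger extends HTMLEditorKit.ParserCallback {
    private final List<String> events = new ArrayList<>();

    @Override
    public void handleStartTag(HTML.Tag tag, MutableAttributeSet attrs, int pos) {
        events.add("start  " + tag + " " + attrs);
    }

    @Override
    public void handleEndTag(HTML.Tag tag, int pos) {
        events.add("end    " + tag);
    }

    @Override
    public void handleSimpleTag(HTML.Tag tag, MutableAttributeSet attrs, int pos) {
        // Tags the DTD treats as empty or does not know arrive here.
        events.add("simple " + tag + " " + attrs);
    }

    @Override
    public void handleText(char[] data, int pos) {
        events.add("text   " + new String(data));
    }

    // Parses the snippet with the default Swing DTD and returns the event log.
    static List<String> log(String html) throws IOException {
        TagEventLogger logger = new TagEventLogger();
        new ParserDelegator().parse(new StringReader(html), logger, true);
        return logger.events;
    }

    public static void main(String[] args) throws IOException {
        for (String event : log("<SPAN class=\"someclass\">some text</SPAN>")) {
            System.out.println(event);
        }
    }
}
```

If SPAN shows up on "simple" lines, once with its normal attributes and once
with an endtag attribute, the parser is treating it as described above.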

My workaround was to change the handleSimpleTag method in
org.apache.jetspeed.util.HTMLRewriter as follows:

        public void handleSimpleTag(HTML.Tag tag, MutableAttributeSet attrs,
                                    int param) {
            if (removeMeta && (tag == HTML.Tag.META)) {
                return;
            }

            // The HTML 3.2 DTD does not know SPAN, so the parser reports both
            // <SPAN> and </SPAN> as simple tags; the latter carries endtag="true".
            if (tag == HTML.Tag.SPAN) {
                if (attrs.getAttribute(HTML.Attribute.ENDTAG) == null) {
                    handleStartTag(tag, attrs, param);
                } else {
                    handleEndTag(tag, param);
                }
            } else {
                appendTagToResult(tag, attrs);
            }
        }

This solves the problem only for SPAN; TBODY and any other tag missing from the
DTD are still affected. Is it possible to configure the DocumentParser that
Jetspeed uses so that it parses HTML 4.0 or later?
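
One possible way to extend the workaround beyond SPAN is to key on the endtag
attribute instead of the tag name, so that any simple tag carrying it is
written as a closing tag. The following is a minimal, standalone sketch of that
idea: EndTagAwareCallback, appendOpenTag, and getResult are hypothetical names,
and this is not Jetspeed's HTMLRewriter.

```java
import java.util.Enumeration;
import javax.swing.text.AttributeSet;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.SimpleAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;

// Hypothetical stand-in for a rewriter callback; not Jetspeed's HTMLRewriter.
// A real rewriter would also escape text and attribute values and skip
// parser-internal attributes.
final class EndTagAwareCallback extends HTMLEditorKit.ParserCallback {
    private final StringBuilder result = new StringBuilder();

    @Override
    public void handleStartTag(HTML.Tag tag, MutableAttributeSet attrs, int pos) {
        appendOpenTag(tag, attrs);
    }

    @Override
    public void handleEndTag(HTML.Tag tag, int pos) {
        result.append("</").append(tag).append('>');
    }

    @Override
    public void handleText(char[] data, int pos) {
        result.append(data);
    }

    @Override
    public void handleSimpleTag(HTML.Tag tag, MutableAttributeSet attrs, int pos) {
        // Any simple tag flagged with endtag is written as a closing tag,
        // whatever its name; everything else is written as an opening tag.
        if (attrs.getAttribute(HTML.Attribute.ENDTAG) != null) {
            handleEndTag(tag, pos);
        } else {
            appendOpenTag(tag, attrs);
        }
    }

    // Writes <tag name="value" ...> using the attribute set as given.
    private void appendOpenTag(HTML.Tag tag, AttributeSet attrs) {
        result.append('<').append(tag);
        Enumeration<?> names = attrs.getAttributeNames();
        while (names.hasMoreElements()) {
            Object name = names.nextElement();
            result.append(' ').append(name).append("=\"")
                  .append(attrs.getAttribute(name)).append('"');
        }
        result.append('>');
    }

    String getResult() {
        return result.toString();
    }

    public static void main(String[] args) {
        // Replays the callbacks described in this message by hand.
        EndTagAwareCallback cb = new EndTagAwareCallback();
        SimpleAttributeSet open = new SimpleAttributeSet();
        open.addAttribute(HTML.Attribute.CLASS, "someclass");
        cb.handleSimpleTag(HTML.Tag.SPAN, open, 0);
        cb.handleText("some text".toCharArray(), 0);
        SimpleAttributeSet close = new SimpleAttributeSet();
        close.addAttribute(HTML.Attribute.ENDTAG, "true");
        cb.handleSimpleTag(HTML.Tag.SPAN, close, 0);
        System.out.println(cb.getResult());
    }
}
```

The same check could be placed in handleSimpleTag of the real rewriter in place
of the SPAN comparison, assuming no legitimate tag ever carries an endtag
attribute of its own.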

Ozgur Balsoy
Pervasive Technology Labs
Indiana University

