On Aug 18, 2008, at 4:37 AM, Wang Ning wrote:
> Hi Vincent,
>
> On Mon, Aug 18, 2008 at 2:59 AM, Vincent Massol <[EMAIL PROTECTED]>
> wrote:
>> Hi Wang,
>>
>> I have some questions/remarks on the commit below:
>>
>> 1) It's hard to see what you're modifying from the default tag infos
>> (in default.xml). I think a better solution might be to extend
>> DefaultTagProvider and use the constructor accepting an info
>> provider,
>> see http://htmlcleaner.sourceforge.net/doc/org/htmlcleaner
>> HtmlCleaner.html#HtmlCleaner(org.htmlcleaner.ITagInfoProvider)
>> . In this manner we just extend the default and not duplicate it.
>> WDYT? Is this possible? Are you adding tags or redefining default
>> one?
>>
> I have seen this DefaultTagProvider, but I don't think it can provide
> what I want.
> I want to replace some tags with others.
public class MyProvider extends DefaultTagProvided
{
public TagInfo getTagInfo(String tagName)
{
if (tagName.equals("s") {
return new TagInfo(....);
} else {
return super.getTagInfo(tagName)
}
}
}
or something like that.
I guess it looks better if we have only a few changes to make. If we
have a lot then the file solution might be better. In any case make
sure you separate and document your changes from the default in
default.xml.
> Such like replace <p> in <li>
> to <span>, replace deprecated tags with others. Like replace <s> to
> <del>. But the default.xml can define this.
>
>> 2) The following code is not required:
>>
>>> + private void removeHead(TagNode node)
>>> + {
>>> + logger.debug("remove the head tag of the html");
>>> + TagNode head = node.findElementByName("head", true);
>>> + head.removeFromTree();
>>> + }
>>
>> instead you can just use props.setOmitHtmlEnvelope(true);
> I have tried props.setOmitHtmlEnvelope(true), but there are some
> errors. If input is
> <html>
> <head/>
> <body>
> <p>p1</p>
> <p>p2</p>
> </body>
> </html>
> the output TagNode is not <p>p1</p><p>p2</p> but only <p>p2</p>. So I
> have to use removeHead() to remove the head tag.
htmlcleaner seems quite buggy... Even with setOmitHtmlEnvelope to
false it still generate something strange with an extra line between
the 2 paragraphs of your example. Do you think you could report an
issue in the issue tracker of the htmlcleaner project so that they can
fix it in the next release?
>> 3) Could you explain why we need convertAllAttribute2Lowercase?
>>
> In xhtml, all the attributes value should be lowercase.
Do you have a link to the XHTML spec for this? I can only find
http://www.w3.org/TR/xhtml1/#h-4.2
but this is different.
I have only found this
http://lists.xml.org/archives/xml-dev/200109/msg00718.html
I'd like to know what we gain by transforming in lowercase. Since it's
extra effort we need to rationalize it.
> And some
> attributes value in the conversion result of openoffice is uppercase,
> like algin="CENTER". I think lowercase will be better. But if
> xhtmlparser could handler uppercase attributes value, I will remove
> this method.
>> 4) Could you refactor your code to use the new HTMLCleaner code I
>> have
>> put in xwiki-xml module?
> I will do it today.
cool
Thanks
-Vincent
>
>>
>> Feel free to provide patches for CleaningFilters.
> [snip]
>
> --
> Thanks
> Wang Ning
> _______________________________________________
> devs mailing list
> [email protected]
> http://lists.xwiki.org/mailman/listinfo/devs
_______________________________________________
devs mailing list
[email protected]
http://lists.xwiki.org/mailman/listinfo/devs