I was making the output available as both a Node and a String.
It defaults to a Node, so to get a String you would write
<wellFormedHtml dataObjectType="String.class"/>

@XmlRootElement(name = "wellFormedHtml")
@XmlAccessorType(XmlAccessType.FIELD)
public class WellFormedHtmlDataFormat extends DataFormatType {

    // Type of object the data format should produce: String or org.w3c.dom.Node
    @XmlAttribute(required = false)
    private Class<?> dataObjectType;

    public WellFormedHtmlDataFormat(Class<?> dataObjectType) {
        super("org.apache.camel.dataformat.tagsoup.WellFormedHtmlDataFormat");
        assert dataObjectType.isAssignableFrom(String.class)
            || dataObjectType.isAssignableFrom(Node.class) :
            "WellFormedHtmlDataFormat only supports returning a String or an "
            + "org.w3c.dom.Node object";
        this.dataObjectType = dataObjectType;
    }
}
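
A side note on the assert in the constructor: Class.isAssignableFrom reads
"can a value of the argument type be stored in a variable of the receiver
type", so a check written as dataObjectType.isAssignableFrom(String.class)
also lets Object.class through and turns away Node subtypes such as Document.
The sketch below only illustrates that difference using JDK classes. The class
name DataObjectTypeCheck and the methods looseCheck/strictCheck are
hypothetical names made up for this example, and strictCheck is just one
possible alternative, not what the proof of concept actually does.

```java
import org.w3c.dom.Document;
import org.w3c.dom.Node;

public class DataObjectTypeCheck {

    // Same shape as the constructor's assert: true when a String or a Node
    // could be assigned to a variable of the given type.
    public static boolean looseCheck(Class<?> type) {
        return type.isAssignableFrom(String.class)
                || type.isAssignableFrom(Node.class);
    }

    // Hypothetical stricter variant: exactly String, or Node and its subtypes.
    public static boolean strictCheck(Class<?> type) {
        return type == String.class || Node.class.isAssignableFrom(type);
    }

    public static void main(String[] args) {
        System.out.println(looseCheck(Object.class));    // true
        System.out.println(strictCheck(Object.class));   // false
        System.out.println(looseCheck(Document.class));  // false
        System.out.println(strictCheck(Document.class)); // true
    }
}
```

Both checks accept String.class and Node.class themselves; they only differ
for supertypes and subtypes.
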
On Wed, Dec 10, 2008 at 21:12, James Strachan <[EMAIL PROTECTED]> wrote:
> 2008/12/10 Ramon Buckland <[EMAIL PROTECTED]>:
> > Hi Peoples,
> >
> > I have just about finished the proof of concept of using TagSoup as a
> > DataFormat and as a component.
> >
> > For those not familiar with TagSoup, it is a Java library (Apache 2.0
> > License) which converts poorly formatted HTML
> >
> > <html> <p> something
> >
> > into well-formed (XML) HTML (not XHTML).
> >
> > ie:
> >
> > <html>
> > <body>
> > <p>something</p>
> > </body>
> > </html>
> >
> > This is very helpful for the following reason.
> >
> > <camelContext xmlns="http://activemq.apache.org/camel/schema/spring">
> > <route>
> > <from uri="direct:start"/>
> > <to uri="http://myserver.com/somequery?foo=1"/>
> > <unmarshal><wellFormedHtml/></unmarshal>
> > <to uri="xslt:file:///foo/bar.xsl"/>
> > <to .../>
> > </route>
> > </camelContext>
> >
> >
> > Questions:
> > Is this component helpful? (Should I finish it? I have not seen anything
> > like it in the toolkit yet.)
>
> Definitely! Being able to format HTML nicely as XML so you can do
> XPath and whatnot is *very* useful!
>
>
> > If continuing is a good idea, what should the "dataFormat" be called?
> > i.e. the <wellFormedHtml/>
>
> Oooh that's a tricky one - naming is so hard! Maybe <tagSoup/>? We
> might one day have a few different mechanisms (e.g. jtidy).
>
> Though maybe tagSoup is a bit vague :). How about tidyHtml or tidyMarkup?
>
>
> > Am I unmarshalling or marshalling? (We of course won't support going
> > the other way, as good to bad HTML is just hard(er).)
> > I figured it is <unmarshalling>, as the <csv/> dataformat is similar:
> > CSV --> List<..> is unmarshalling.
>
> Yeah. What's the output btw - is it a DOM? Or can it be converted to a
> Source so the endpoint could take DOM/SAX/StAX etc?
>
>
> --
> James
> -------
> http://macstrac.blogspot.com/
>
> Open Source Integration
> http://fusesource.com/
>