RE: Basic XSL HTML-> XML query

2012-01-10 Thread McGibbney, Lewis John
From: Timothy Jones [timothy.jo...@syniverse.com] Sent: 08 January 2012 15:24 To: mil...@gmx.de; xalan-j-users@xml.apache.org Subject: Re: Basic XSL HTML-> XML query I recall using a small Java library called Neko to parse HTML into an XML DOM. It did

Re: Basic XSL HTML-> XML query

2012-01-08 Thread Timothy Jones
kesh...@us.ibm.com wrote on 07.01.2012 at 21:18 (-0500): > The problem is, HTML is not an XML-based language, so unless you've > deliberately written your input document as XHTML, odds are that no > XML parser will accept it. Sure, but as you're

Re: Basic XSL HTML-> XML query

2012-01-08 Thread Michael Ludwig
kesh...@us.ibm.com wrote on 07.01.2012 at 21:18 (-0500): > The problem is, HTML is not an XML-based language, so unless you've > deliberately written your input document as XHTML, odds are that no > XML parser will accept it. Sure, but as you're saying: > There are HTML parsers available which

Re: Basic XSL HTML-> XML query

2012-01-07 Thread keshlam
The problem is, HTML is not an XML-based language, so unless you've deliberately written your input document as XHTML, odds are that no XML parser will accept it. There are HTML parsers available which produce SAX or DOM (XML) output. You could get one of those, use it to read the input documen
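To illustrate the point about XML parsers rejecting ordinary HTML, here is a minimal sketch that uses only the JAXP parser bundled with the JDK. The class name `WellFormedCheck`, the method `parsesAsXml`, and the sample markup are made up for this illustration and do not come from the thread.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical helper class, for illustration only.
public class WellFormedCheck {

    /** Returns true if the JDK's XML parser accepts the markup as well-formed XML. */
    public static boolean parsesAsXml(String markup) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // Replace the default handler, which prints fatal errors to stderr.
            builder.setErrorHandler(new ErrorHandler() {
                public void warning(SAXParseException e) { }
                public void error(SAXParseException e) { }
                public void fatalError(SAXParseException e) throws SAXException {
                    throw e;
                }
            });
            builder.parse(new InputSource(new StringReader(markup)));
            return true;
        } catch (SAXException e) {
            // Not well-formed: unclosed or mismatched tags, for example.
            return false;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Typical HTML: <br> and <p> are left unclosed.
        String html = "<html><body><p>Clause 1<br><p>Clause 2</body></html>";
        // The same content written as well-formed XHTML-style markup.
        String xhtml = "<html><body><p>Clause 1<br/></p><p>Clause 2</p></body></html>";
        System.out.println("HTML accepted:  " + parsesAsXml(html));
        System.out.println("XHTML accepted: " + parsesAsXml(xhtml));
    }
}
```

A dedicated HTML parser that emits SAX events or a DOM would be expected to accept the first string; which parser to pick is a separate choice not covered by this sketch.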

Re: Basic XSL HTML-> XML query

2012-01-07 Thread Michael Ludwig
McGibbney, Lewis John wrote on 07.01.2012 at 17:39 (+): > My situation is that I have lots of legal documents which exist in > HTML, these in turn include lots and lots of presentation mark-up > which I would like to strip before getting down to the Xalan-j XSL > stuff. […] The question I ha

Basic XSL HTML-> XML query

2012-01-07 Thread McGibbney, Lewis John
Hi Everyone, I'm currently using Xalan-J within Yax, the Java-based XProc implementation, and have a real basic query. My situation is that I have lots of legal documents which exist in HTML, these in turn include lots and lots of presentation mark-up which I would like to strip before getting do
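Once the HTML has been turned into well-formed XML, stripping presentation mark-up is the kind of job an identity transform with a few overriding templates can do. The sketch below is one possible approach, not the one settled on in the thread: the class name `StripPresentation`, the list of elements and attributes treated as presentational, and the sample input are all assumptions. It uses the JDK's built-in XSLT 1.0 processor through the standard JAXP `TransformerFactory` API, which is also the API Xalan-J implements. It also assumes the input elements are in no namespace; real XHTML in the `http://www.w3.org/1999/xhtml` namespace would need prefixed match patterns.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper class, for illustration only.
public class StripPresentation {

    // Identity transform plus two overrides:
    //  - presentational elements are unwrapped (children kept, tag dropped)
    //  - presentational attributes are dropped
    // The element and attribute lists are assumptions; adjust to the documents at hand.
    static final String XSLT =
        "<xsl:stylesheet version='1.0'"
      + "    xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "  <xsl:output method='xml' omit-xml-declaration='yes'/>"
      + "  <xsl:template match='@*|node()'>"
      + "    <xsl:copy><xsl:apply-templates select='@*|node()'/></xsl:copy>"
      + "  </xsl:template>"
      + "  <xsl:template match='font|b|i|center'>"
      + "    <xsl:apply-templates/>"
      + "  </xsl:template>"
      + "  <xsl:template match='@style|@align|@bgcolor'/>"
      + "</xsl:stylesheet>";

    /** Applies the stylesheet above to well-formed input and returns the result as a string. */
    public static String strip(String wellFormedMarkup) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSLT)));
            StringWriter out = new StringWriter();
            transformer.transform(
                new StreamSource(new StringReader(wellFormedMarkup)),
                new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String input =
            "<p align=\"center\"><font color=\"red\"><b>Section 1</b></font> applies.</p>";
        System.out.println(strip(input));
    }
}
```

Unwrapping rather than deleting the presentational elements keeps the text of the legal documents intact while discarding only the styling around it.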