RE: HTML5 serializer

2012-01-09 Thread Thorsten Scherler
On Mon, 2012-01-09 at 08:32 +0100, Robby Pelssers wrote:
 Hi Thorsten,
 
 Adding meta in general is not a concern AFAIK, but setting the correct 
 encoding is. 
 
 Examples are 
 <?xml version="1.0" encoding="UTF-8"?>  for xml files

That is correct for the doc declaration. 

 And 
 <meta http-equiv="Content-Type" content="text/html; charset=utf-8"/> for html 
 files

Nope, that tag may be needed for valid HTML5, but it is not the
concern of the serializer; it belongs to the prior transformation process.

 
 So I was only referring to setting the correct encoding which can be 
 configured as a Serializer property.

Yes but that only goes in the PI and is used for the serialization.

salu2

 
 Robby
 
 
 -Original Message-
 From: Thorsten Scherler [mailto:scher...@gmail.com] 
 Sent: Sunday, January 08, 2012 10:28 PM
 To: dev@cocoon.apache.org
 Subject: RE: HTML5 serializer
 
 On Fri, 2012-01-06 at 19:56 +0100, Robby Pelssers wrote:
  
 
  So we’re almost there.   Do you have any suggestion how to accomplish
  using the correct <meta charset="utf-8"/>?  Or do you think that’s
  not worth the effort?
 
 Hmm, actually that is not the concern of the serializer at all. The
 serializer merely adds DOCTYPE PI and not much more. So meta is
 nothing the serializer should add.
 
 salu2
 

-- 
Thorsten Scherler thorsten.at.apache.org
codeBusters S.L. - web based systems
consulting, training and solutions
http://www.codebusters.es/



RE: HTML5 serializer

2012-01-09 Thread Robby Pelssers
Hi Thorsten,

I assume that by "prior transformation process" you are referring to the 
transformer handler, which inserts the meta tag for the HTML use case. I also 
just stumbled across Michael Kay's response about serialization for HTML5, 
where he mentions:

"The XSLT and XQuery WGs have taken the view that we will address serialization 
to HTML5 when HTML5 is finished; meanwhile WHAT WG seem to be claiming that 
'finished' is an obsolete concept and that HTML5 will remain under continuous 
change forever. Perhaps I'm misquoting them, but that's my understanding."

http://www.biglist.com/lists/lists.mulberrytech.com/xsl-list/archives/201105/msg00067.html

But to wrap up what I'm trying to achieve here:

I want to be able to do following use cases:

* XML data -> transform using XSLT to HTML5 -> serialize to HTML5
* stringtemplate generator -> serialize to HTML5

Preferably I want to be able to do so out-of-the-box with Cocoon 3. As you seem 
more acquainted with the topic, what would need to be done to enable this?

Kind regards,
Robby




Re: HTML5 serializer

2012-01-09 Thread Andy Stevens
On 9 January 2012 11:36, Thorsten Scherler scher...@gmail.com wrote:
 On Mon, 2012-01-09 at 08:32 +0100, Robby Pelssers wrote:
 Hi Thorsten,

 Adding meta in general is not a concern AFAIK, but setting the correct 
 encoding is.

 Examples are
 <?xml version="1.0" encoding="UTF-8"?>  for xml files

 That is correct for the doc declaration.

 And
 <meta http-equiv="Content-Type" content="text/html; charset=utf-8"/> for 
 html files

 nupp, that tag may be needed to be valid html5 but that is not the
 concern of the serializer but the prior transformation process.


 So I was only referring to setting the correct encoding which can be 
 configured as a Serializer property.

 Yes but that only goes in the PI and is used for the serialization.

Not really convinced, chiefly for reasons of separation of concerns.
Given that throughout the pipeline the XML is being held in Java's
Unicode strings, IMO the only component that should need to worry
about the charset being used to serialise the output should be the
serialiser that's doing it; otherwise you can end up with a document
using one charset that claims inside to be a different one.
If you're happy to leave it to the serialiser to insert the PI in the
output (including the charset) rather than having it already in the
pipeline's XML stream (e.g. inserted by xsl:processing-instruction in
an XSLT template), and happy to let the HTML serialiser insert the
doctype rather than having it already in the pipeline's stream, then
why shouldn't the HTML/XHTML serialiser also insert the meta tag
specifying the charset?
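
The mismatch described above can be reproduced with nothing but the JDK. This is only a minimal sketch; the class and method names are made up for illustration and are not part of Cocoon. It writes text with one charset and reads it back with the charset the document claims to use:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class CharsetMismatchDemo {

    /** Encodes text with the charset actually written, then decodes with the declared one. */
    public static String roundTrip(String text, Charset written, Charset declared) {
        byte[] bytes = text.getBytes(written);
        return new String(bytes, declared);
    }

    public static void main(String[] args) {
        String text = "caf\u00e9"; // contains one non-ASCII character

        // Written and declared charsets agree: the text survives.
        System.out.println(roundTrip(text, StandardCharsets.UTF_8, StandardCharsets.UTF_8));

        // Written as ISO-8859-1 but declared as UTF-8: the last character is damaged.
        System.out.println(roundTrip(text, StandardCharsets.ISO_8859_1, StandardCharsets.UTF_8));
    }
}
```

Keeping the encoding and the declaration in one component avoids the second case by construction.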

In an ideal world, we wouldn't even have to specify a particular
encoding on the serialiser either - there'd be a default configured
somewhere, but it would select an appropriate one dynamically at the
time of output based on the Accept-Charset request header sent by the
browser... and why should the earlier part of the pipeline also need
to worry about that?
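
As a rough illustration of the "ideal world" idea, the following sketch picks an output charset from an Accept-Charset header value. It is an assumption-laden toy: the class name, the method and the tie-breaking rule (first entry wins on equal q) are invented here, and nothing like it is claimed to exist in Cocoon.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class AcceptCharsetDemo {

    /** Returns the supported charset with the highest q value, or the fallback. */
    public static Charset choose(String acceptCharset, Charset fallback) {
        if (acceptCharset == null || acceptCharset.trim().isEmpty()) {
            return fallback;
        }
        Charset best = null;
        double bestQ = 0.0;
        for (String part : acceptCharset.split(",")) {
            String[] pieces = part.trim().split(";");
            String name = pieces[0].trim();
            double q = 1.0; // default weight when no q parameter is given
            for (int i = 1; i < pieces.length; i++) {
                String param = pieces[i].trim();
                if (param.startsWith("q=")) {
                    try {
                        q = Double.parseDouble(param.substring(2).trim());
                    } catch (NumberFormatException e) {
                        q = 0.0; // treat an unreadable weight as "not acceptable"
                    }
                }
            }
            if (name.isEmpty() || q <= bestQ) {
                continue;
            }
            Charset candidate;
            if (name.equals("*")) {
                candidate = fallback; // wildcard: any charset will do, use the default
            } else {
                try {
                    if (!Charset.isSupported(name)) {
                        continue;
                    }
                    candidate = Charset.forName(name);
                } catch (IllegalArgumentException e) {
                    continue; // syntactically illegal charset name
                }
            }
            best = candidate;
            bestQ = q;
        }
        return best != null ? best : fallback;
    }

    public static void main(String[] args) {
        System.out.println(choose("iso-8859-1;q=0.5, utf-8;q=0.9", StandardCharsets.ISO_8859_1));
    }
}
```

Only the serialiser would need to call something like this; the rest of the pipeline never sees the result.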


Andy.




RE: HTML5 serializer

2012-01-08 Thread Thorsten Scherler
On Fri, 2012-01-06 at 19:56 +0100, Robby Pelssers wrote:
 

 So we’re almost there.   Do you have any suggestion how to accomplish
 using the correct <meta charset="utf-8"/>?  Or do you think that’s
 not worth the effort?

Hmm, actually that is not the concern of the serializer at all. The
serializer merely adds DOCTYPE PI and not much more. So meta is
nothing the serializer should add.
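
One way to read "not the concern of the serializer" is that a step earlier in the pipeline adds the meta element to the SAX stream. The sketch below shows that idea with plain SAX and JAXP only. MetaCharsetFilter is a hypothetical name, not a Cocoon component, and the JAXP "html" output method may still add its own META element on top, as the output elsewhere in this thread shows.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXSource;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.XMLFilterImpl;

class MetaCharsetFilterDemo {

    /** Hypothetical pipeline step: emits a meta charset element as the first child of head. */
    static class MetaCharsetFilter extends XMLFilterImpl {
        private final String charset;

        MetaCharsetFilter(XMLReader parent, String charset) {
            super(parent);
            this.charset = charset;
        }

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts)
                throws SAXException {
            super.startElement(uri, localName, qName, atts);
            if ("head".equals(qName)) {
                AttributesImpl metaAtts = new AttributesImpl();
                metaAtts.addAttribute("", "charset", "charset", "CDATA", charset);
                super.startElement("", "meta", "meta", metaAtts);
                super.endElement("", "meta", "meta");
            }
        }
    }

    /** Parses the XML, runs it through the filter and serializes it with method "html". */
    public static String render(String xml) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            XMLReader reader = factory.newSAXParser().getXMLReader();
            XMLReader filtered = new MetaCharsetFilter(reader, "utf-8");

            Transformer identity = TransformerFactory.newInstance().newTransformer();
            identity.setOutputProperty(OutputKeys.METHOD, "html");
            identity.setOutputProperty(OutputKeys.ENCODING, "UTF-8");

            StringWriter out = new StringWriter();
            identity.transform(
                    new SAXSource(filtered, new InputSource(new StringReader(xml))),
                    new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Rendering failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(render(
                "<html><head><title>serializer test</title></head><body><p>test</p></body></html>"));
    }
}
```

The drawback is the one raised in the replies: the filter hard-codes a charset that only the serializer really knows.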

salu2

-- 
Thorsten Scherler thorsten.at.apache.org
codeBusters S.L. - web based systems
consulting, training and solutions
http://www.codebusters.es/



RE: HTML5 serializer

2012-01-08 Thread Robby Pelssers
Hi Thorsten,

Adding meta in general is not a concern AFAIK, but setting the correct encoding 
is. 

Examples are 
<?xml version="1.0" encoding="UTF-8"?>  for xml files
And 
<meta http-equiv="Content-Type" content="text/html; charset=utf-8"/> for html 
files

So I was only referring to setting the correct encoding which can be configured 
as a Serializer property.

Robby





Re: HTML5 serializer

2012-01-06 Thread Jasha Joachimsthal
Hey Robby,

Which Cocoon version are you using for your project? In C2.1 and C2.2
there's not only an XMLSerializer but also an HTMLSerializer and
XHTMLSerializer for their specific needs. So why not create your own
HTML5Serializer?

In HTML5 the specification teams tried to specify what browsers were
already doing instead of making a new theoretical specification. HTML5
should be backwards compatible with previous (X)HTML versions. This is the
reason why some old elements are not deprecated but considered obsolete
(remember marquee, it was so cool on Geocities).
The doctype doesn't really matter, browsers generally ignore the PUBLIC
part in the doctype (apart from some hacks in IE going into quirks mode).
A good presentation about HTML5 is http://vimeo.com/15755349.

Jasha Joachimsthal

Europe - Amsterdam - Oosteinde 11, 1017 WT Amsterdam - +31(0)20 522 4466
US - Boston - 1 Broadway, Cambridge, MA 02142 - +1 877 414 4776 (toll free)

www.onehippo.com


On 6 January 2012 15:48, Robby Pelssers robby.pelss...@nxp.com wrote:

 Hi all,

 I’ve been looking at how to add a HTML5 serializer to the project.

 So far my investigations have led to add following code to
 org.apache.cocoon.sax.component.XMLSerializer

 public static XMLSerializer createHTML5Serializer() {
     XMLSerializer serializer = new XMLSerializer();

     serializer.setContentType(TEXT_HTML_UTF_8);
     serializer.setDoctypePublic("XSLT-compat");
     serializer.setEncoding(UTF_8);
     serializer.setMethod(HTML);

     return serializer;
 }

 Using the HTML5 serializer in a test to print the output:

 @Test
 public void testHTML5Serializer() throws Exception {
     ByteArrayOutputStream baos = new ByteArrayOutputStream();

     newNonCachingPipeline()
         .setStarter(
             new XMLGenerator("<html><head><title>serializer test</title></head><body><p>test</p></body></html>")
         )
         .setFinisher(XMLSerializer.createHTML5Serializer())
         .withEmptyConfiguration()
         .setup(baos)
         .execute();

     String data = new String(baos.toByteArray());
     System.out.println(data);
 }

 Would print

 <!DOCTYPE html PUBLIC "XSLT-compat">
 <html>
 <head>
 <META http-equiv="Content-Type" content="text/html; charset=UTF-8">
 <title>serializer test</title>
 </head>
 <body>
 <p>test</p>
 </body>
 </html>

 I read a number of articles describing the issues with serializing html5
 and so far this was the best I could come up with which is not 100%
 conforming due to

 * Non matching doctype although it will not break in the browser
   -> should be <!DOCTYPE html>

 * The charset should be <meta charset="UTF-8"/> according to
   html5 spec

 http://www.contentwithstyle.co.uk/content/xslt-and-html-5-problems/
 http://www.w3schools.com/html5/tag_meta.asp

 Does anyone have more knowledge on this subject?

 Robby



RE: HTML5 serializer

2012-01-06 Thread Robby Pelssers
I am using Cocoon 2.2 but am planning to switch to C3 in the upcoming months. 
In my mail I was actually referring to C3. You are right about what you 
write, but I'd prefer to have a serializer which follows the spec so I can just 
copy the output and validate it without errors or too many warnings at 
http://validator.w3.org/

Robby






Re: HTML5 serializer

2012-01-06 Thread Jasha Joachimsthal
Ok, then create an HTML5Serializer that extends the current Serializer.
Another solution would be to add a boolean that makes the output differ for
HTML5, but I'd prefer extension over a number of if statements.

Jasha




RE: HTML5 serializer

2012-01-06 Thread Robby Pelssers
As far as I know, the serializer does not actually output anything directly. It 
hands the task over to the transformer handler, and this is where the culprit 
resides.

There is no need to extend the current serializer if I follow the current way 
of working.

XmlSerializer, Html4Serializer and XhtmlSerializer are all just XMLSerializers 
with a different set of properties, and the current XMLSerializer class 
provides static methods to create them. It makes much more sense to add a 
static factory method for an HTML5 serializer there as well.
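
The "same serializer, different properties" design can be sketched without Cocoon as a set of static factory methods that return JAXP output properties. Everything named here (OutputFormats, xml(), html4(), html5()) is hypothetical, and the HTML 4.01 identifiers are simply the standard strict DTD ones, not necessarily what Cocoon configures.

```java
import java.util.Properties;
import javax.xml.transform.OutputKeys;

class OutputFormats {

    /** Shared starting point: output method and encoding. */
    private static Properties base(String method, String encoding) {
        Properties format = new Properties();
        format.setProperty(OutputKeys.METHOD, method);
        format.setProperty(OutputKeys.ENCODING, encoding);
        return format;
    }

    public static Properties xml() {
        return base("xml", "UTF-8");
    }

    public static Properties html4() {
        Properties format = base("html", "UTF-8");
        format.setProperty(OutputKeys.DOCTYPE_PUBLIC, "-//W3C//DTD HTML 4.01//EN");
        format.setProperty(OutputKeys.DOCTYPE_SYSTEM, "http://www.w3.org/TR/html4/strict.dtd");
        return format;
    }

    /** HTML5 variant: no public identifier, only the legacy-compat system identifier. */
    public static Properties html5() {
        Properties format = base("html", "UTF-8");
        format.setProperty(OutputKeys.DOCTYPE_SYSTEM, "about:legacy-compat");
        return format;
    }

    public static void main(String[] args) {
        System.out.println(html5());
    }
}
```

A new output flavour is then one more factory method rather than a subclass or an extra boolean.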

The problem is that currently, to actually have the transformer handler output 
a doctype declaration, you need to pass (I think) a doctype-public property, 
and from the looks of it this cannot be empty:

public XMLSerializer setDoctypePublic(String doctypePublic) {
    if (doctypePublic == null || EMPTY.equals(doctypePublic)) {
        throw new SetupException("A doctype-public has to be passed as argument.");
    }

    this.format.put(OutputKeys.DOCTYPE_PUBLIC, doctypePublic);
    return this;
}

public XMLSerializer setDoctypeSystem(String doctypeSystem) {
    if (doctypeSystem == null || EMPTY.equals(doctypeSystem)) {
        throw new SetupException("A doctype-system has to be passed as argument.");
    }

    this.format.put(OutputKeys.DOCTYPE_SYSTEM, doctypeSystem);
    return this;
}

Robby


Re: HTML5 serializer

2012-01-06 Thread Sylvain Wallez

On 06/01/12 15:48, Robby Pelssers wrote:

 Hi all,

 I've been looking at how to add a HTML5 serializer to the project.

 So far my investigations have led to add following code to 
 org.apache.cocoon.sax.component.XMLSerializer

 public static XMLSerializer createHTML5Serializer() {
     XMLSerializer serializer = new XMLSerializer();

     serializer.setContentType(TEXT_HTML_UTF_8);
     serializer.setDoctypePublic("XSLT-compat");

Looks like "XSLT-compat" has been changed to "about:legacy-compat" in 
the latest HTML 5 specification.


See http://dev.w3.org/html5/spec/syntax.html#doctype-legacy-string

Sylvain

--
Sylvain Wallez - http://bluxte.net



RE: HTML5 serializer

2012-01-06 Thread Robby Pelssers
Hi Sylvain,

Thx for the pointer.

Using the same test but with some changes to the html5serializer

public static XMLSerializer createHTML5Serializer() {
    XMLSerializer serializer = new XMLSerializer();

    serializer.setContentType(TEXT_HTML_UTF_8);
    serializer.setDoctypeSystem("about:legacy-compat");
    serializer.setEncoding(UTF_8);
    serializer.setMethod(HTML);

    return serializer;
}

now results in
<!DOCTYPE html SYSTEM "about:legacy-compat">
<html>
<head>
<META http-equiv="Content-Type" content="text/html; charset=UTF-8">
<title>serializer test</title>
</head>
<body>
<p>test</p>
</body>
</html>
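
For anyone who wants to try this outside Cocoon, here is a rough standalone equivalent that uses only JAXP. It assumes, as described earlier in the thread, that the serializer simply passes these properties on to a JAXP transformer handler; LegacyCompatDoctypeDemo and serialize() are names invented for this sketch. It also decodes the bytes with an explicit charset instead of the platform default.

```java
import java.io.ByteArrayOutputStream;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class LegacyCompatDoctypeDemo {

    /** Serializes the XML with method "html" and only a doctype-system identifier. */
    public static String serialize(String xml) {
        try {
            // Identity transformer: copies the input and applies the output properties.
            Transformer identity = TransformerFactory.newInstance().newTransformer();
            identity.setOutputProperty(OutputKeys.METHOD, "html");
            identity.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
            identity.setOutputProperty(OutputKeys.DOCTYPE_SYSTEM, "about:legacy-compat");

            ByteArrayOutputStream baos = new ByteArrayOutputStream();
            identity.transform(new StreamSource(new StringReader(xml)), new StreamResult(baos));

            // Decode with the same charset that was used for serialization.
            return new String(baos.toByteArray(), StandardCharsets.UTF_8);
        } catch (TransformerException e) {
            throw new IllegalStateException("Serialization failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(serialize(
                "<html><head><title>serializer test</title></head><body><p>test</p></body></html>"));
    }
}
```

No doctype-public is set here, so the restriction on empty public identifiers never comes into play.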

So we're almost there.   Do you have any suggestion how to accomplish using the 
correct <meta charset="utf-8"/>?  Or do you think that's not worth the 
effort?

Robby
