Re: [O] [PATCH] export to various flavors of (X)HTML
Carsten Dominik carsten.domi...@gmail.com writes:

> Hi Eric, Rick, Francois and others,
>
> Nicolas commented to me about this patch that he was wondering if it would not be better to have a separate backend for html5, i.e. ox-html5.el that could be derived from ox-html.el and make it easier in the future to build it out to take full advantage of html5 features.
>
> I think he has a point, and I would like to hear your comments.

My initial reaction is: yes, eventually, but perhaps not now. A few reasons:

1. This patch is already done, and it works, modulo bugfixes (not a great reason, I know).

2. The patch ended up with two predicate functions (org-html-xhtml-p and org-html-html5-p) because we really are dealing with four distinct states: X or not, and 5 or not. Splitting off ox-html5 only isolates one of those predicates: the X-or-not question would still have to be asked and answered in both ox-html and ox-html5. You could just as well split it the other way (ox-xhtml and ox-html), and have the X variants actually build a DOM tree and write xml (I'm not actually advocating that, but I just read this[1]).

3. The change to org-html-special-block takes care of the large majority of new html5 features. The change to inline-images is fairly small. Otherwise, there are many new inline elements that could be used, but in many cases browser support for these is limited or nonexistent, and even basic syntax is up in the air. They can wait (or be handled with custom link types). More importantly, the html5 version of, for example, the formatting of timestamps would look very much like the (x)html(4) version, except that the final tag would be a bit different (time instead of span, with different attributes). Most of the surrounding logic would be the same. So ox-html5 would only override a few of ox-html's formatting functions, and even those few would largely be copy-n-paste from ox-html. I'm not sure that's worth it.

(Unless derived backends could call back to their parent backends' implementations, a la OO inheritance? But that way lies madness.)

To be clear, I think *something* more drastic should be done. But my feeling is: go with this patch for now. Then stop there. The next time someone feels the need to expand org's html5 capabilities, think about new backends. I'm happy to continue with the discussion, and the coding.

I think part of the problem is HTML itself: as Rick's polyglot concerns show, the formats can be multiple things at once. Another part of the problem is that org has a certain take on HTML that I guess comes out of the early days of Unix documentation, when it really was the HyperText Markup Language: linked sets of static pages, with up/prev/next links, and headers and footers on each page. `org-html-divs' is a good example of this, and a perfect example of where html5 would handle things differently. I would argue that that should no longer be the default point of view on HTML. If we're going to rethink things, let's rethink this too.

Eric

[1] http://glyph.twistedmatrix.com/2008/06/data-in-garbage-out.html
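The "four distinct states" point above can be made concrete with a small sketch. It assumes simplified stand-ins for the two predicates: the `my-` names and the string matching below are hypothetical and are not copied from ox-html.el. It only shows why two independent yes/no questions give four flavors.

```elisp
;; Hypothetical, simplified stand-ins for the patch's two predicates.
;; They take a doctype name directly instead of the export INFO plist.

(defun my-html-xhtml-p (doctype)
  "Return t if DOCTYPE (a string such as \"xhtml-strict\") is an XHTML flavor."
  (and (string-match-p "xhtml" doctype) t))

(defun my-html-html5-p (doctype)
  "Return t if DOCTYPE is one of the HTML5 flavors."
  (and (member doctype '("html5" "xhtml5")) t))

(defun my-html-flavor (doctype)
  "Classify DOCTYPE as a cons cell (XHTML-P . HTML5-P)."
  (cons (my-html-xhtml-p doctype) (my-html-html5-p doctype)))

;; The four states discussed in the message:
;;   (my-html-flavor "html4-strict")  ; neither X nor 5
;;   (my-html-flavor "xhtml-strict")  ; X, not 5
;;   (my-html-flavor "html5")         ; 5, not X
;;   (my-html-flavor "xhtml5")        ; both
```

Splitting off a separate html5 backend would remove only the second question from ox-html; the first would still be asked in both places.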
Re: [O] [PATCH] export to various flavors of (X)HTML
Hi Eric,

thanks for the reply. OK, I am going with the patch for now, let's push more thinking about HTML5 further down the line.

Thanks for working this out!

- Carsten
Re: [O] [PATCH] export to various flavors of (X)HTML
Carsten Dominik carsten.domi...@gmail.com writes:

> Hi Eric, thanks for the reply. OK, I am going with the patch for now, let's push more thinking about HTML5 further down the line.
>
> Thanks for working this out!

My pleasure, I hope I haven't stifled debate...
Re: [O] [PATCH] export to various flavors of (X)HTML
On Mon, May 06, 2013 at 02:05:18AM -0700, Eric Abrahamsen wrote:
> Carsten Dominik carsten.domi...@gmail.com writes:
>
> > Hi Eric, thanks for the reply. OK, I am going with the patch for now, let's push more thinking about HTML5 further down the line.
> >
> > Thanks for working this out!
>
> My pleasure, I hope I haven't stifled debate...

No, you haven't. Rather, you have described my position much better than I could. +1

My additional 2 cents: for me, xhtml and polyglot html5 (xhtml masquerading as html5) are my primary output. html4 (with its sgml-derived unclosed tags) is the anomaly.

Going forward I could see something like:

          ox-html-base (incomplete backend)
            |                  |
        ox-html4          ox-xhtml(5)
                               |
                        ox-html5-fancy (with the html5 specific tags)

rick
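As a rough illustration of the derived-backend idea discussed in this thread, here is a minimal sketch of a backend that overrides only timestamp formatting. It assumes a recent Org in which `org-export-define-derived-backend' is a function taking quoted symbols; the backend name `my-html5', the function `my-html5-timestamp', and the chosen markup are hypothetical and are not part of the patch.

```elisp
(require 'ox-html)

;; Hypothetical transcoder: wrap the raw timestamp text in a <time>
;; element.  The class name is an assumption for illustration only.
(defun my-html5-timestamp (timestamp _contents _info)
  "Transcode TIMESTAMP object into a <time> element."
  (format "<time class=\"timestamp\">%s</time>"
          (org-html-encode-plain-text
           (org-element-property :raw-value timestamp))))

;; Derive a backend from `html'; everything except timestamps is
;; inherited from the parent backend.
(org-export-define-derived-backend 'my-html5 'html
  :translate-alist '((timestamp . my-html5-timestamp)))
```

Only the overridden transcoder needs to be written; whether the surrounding logic would have to be copied from ox-html, as feared above, depends on how much of the parent's function body is needed.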
Re: [O] [PATCH] export to various flavors of (X)HTML
Hi Eric, Rick, Francois and others,

Nicolas commented to me about this patch that he was wondering if it would not be better to have a separate backend for html5, i.e. ox-html5.el that could be derived from ox-html.el and make it easier in the future to build it out to take full advantage of html5 features.

I think he has a point, and I would like to hear your comments.

Thanks

- Carsten
Re: [O] [PATCH] export to various flavors of (X)HTML
Thanks, I will look at this patch next week.

- Carsten
Re: [O] [PATCH] export to various flavors of (X)HTML
Rick Frankel r...@rickster.com writes:

> On Tue, Apr 30, 2013 at 08:26:52PM -0700, Eric Abrahamsen wrote:
> > Rick Frankel r...@rickster.com writes:
> >
> > > Whoops. Wrong key. Patch actually attached to this email...
> > >
> > > rick
> >
> > Great, I'll consolidate all these -- would it be better to mush them into one big patch, or to keep them separate (I suppose for ease of rollback, if something goes wrong)?
>
> Probably squashing them into one patch would be the best. But Carsten or Bastien might disagree :).
>
> rick

Okay, there it is: one big patch (including your xml declaration fix).

I didn't add any more refined handling of the html5-fancy option. As a second-order option it didn't seem worthy of an #+OPTIONS entry, and I didn't bother checking for an empty string, either. It can always be #+BINDed if necessary, and if it ever annoys anyone I can fix it further.

E

From 9224f289801c7f1193718fe7f2ca351e26d7534b Mon Sep 17 00:00:00 2001
From: Eric Abrahamsen e...@ericabrahamsen.net
Date: Thu, 2 May 2013 13:40:58 -0700
Subject: [PATCH] ox-html.el Export to different (X)HTML flavors, including HTML5

* lisp/ox-html.el (org-html-doctype-alist): New variable holding an alist of (X)HTML doctypes
(org-html-xhtml-p): New function
(org-html-html5-p): New function
(org-html-close-tag): New function
(org-html-html5-fancy): New export option, determining whether or not to use HTML5-specific elements.
(org-html-html5-elements): New variable, new HTML5 elements.
(org-html-special-block): Export special blocks to new HTML5 elements.
(org-html-format-inline-image): Use figure and figcaption for standalone images.

Significant changes to `org-html-format-inline-image', `org-html--build-meta-info', `org-html--build-head', `org-html--build-pre/postable', `org-html-template', `org-html-horizontal-rule', `org-html-format-list-item', `org-html-line-break', `org-html-table', and `org-html-verse-block'.

doc/org.texi: Document the above
---
 doc/org.texi    |  97 +-
 lisp/ox-html.el | 251
 2 files changed, 277 insertions(+), 71 deletions(-)

diff --git a/doc/org.texi b/doc/org.texi
index 7437451..c294ea6 100644
--- a/doc/org.texi
+++ b/doc/org.texi
@@ -596,6 +596,7 @@ Exporting
 HTML export
 * HTML Export commands::        How to invoke HTML export
+* HTML doctypes::               Org can export to various (X)HTML flavors
 * HTML preamble and postamble:: How to insert a preamble and a postamble
 * Quoting HTML tags::           Using direct HTML in Org mode
 * Links in HTML export::        How links will be interpreted and formatted
@@ -10969,6 +10970,7 @@ language, but with additional support for tables.
 @menu
 * HTML Export commands::        How to invoke HTML export
+* HTML doctypes::               Org can export to various (X)HTML flavors
 * HTML preamble and postamble:: How to insert a preamble and a postamble
 * Quoting HTML tags::           Using direct HTML in Org mode
 * Links in HTML export::        How links will be interpreted and formatted
@@ -10980,7 +10982,7 @@ language, but with additional support for tables.
 * JavaScript support::          Info and Folding in a web browser
 @end menu
-@node HTML Export commands, HTML preamble and postamble, HTML export, HTML export
+@node HTML Export commands, HTML doctypes, HTML export, HTML export
 @subsection HTML export commands
 @table @kbd
@@ -11008,7 +11010,98 @@ Export to a temporary buffer. Do not create a file.
 @c @noindent
 @c creates two levels of headings and does the rest as items.
-@node HTML preamble and postamble, Quoting HTML tags, HTML Export commands, HTML export
+@node HTML doctypes, HTML preamble and postamble, HTML Export commands, HTML export
+@subsection HTML doctypes
+@vindex org-html-doctype
+@vindex org-html-doctype-alist
+
+Org can export to various (X)HTML flavors.
+
+Setting the variable @code{org-html-doctype} allows you to export to different
+(X)HTML variants. The exported HTML will be adjusted according to the sytax
+requirements of that variant. You can either set this variable to a doctype
+string directly, in which case the exporter will try to adjust the syntax
+automatically, or you can use a ready-made doctype. The ready-made options
+are:
+
+@itemize
+@item
+``html4-strict''
+@item
+``html4-transitional''
+@item
+``html4-frameset''
+@item
+``xhtml-strict''
+@item
+``xhtml-transitional''
+@item
+``xhtml-frameset''
+@item
+``xhtml-11''
+@item
+``html5''
+@item
+``xhtml5''
+@end itemize
+
+See the variable @code{org-html-doctype-alist} for details. The default is
+``xhtml-strict''.
+
+@subsubheading Fancy HTML5 export
+@vindex org-html-html5-fancy
+@vindex org-html-html5-elements
+
+HTML5 introduces several new element types. By default, Org will not make
+use of these element types, but you can set @code{org-html-html5-fancy} to
+@code{t} (or use the corresponding @code{HTML_HTML5_FANCY} export option),
Re: [O] [PATCH] export to various flavors of (X)HTML
On Tue, Apr 30, 2013 at 08:26:52PM -0700, Eric Abrahamsen wrote:
> Rick Frankel r...@rickster.com writes:
>
> > Whoops. Wrong key. Patch actually attached to this email...
> >
> > rick
>
> Great, I'll consolidate all these -- would it be better to mush them into one big patch, or to keep them separate (I suppose for ease of rollback, if something goes wrong)?

Probably squashing them into one patch would be the best. But Carsten or Bastien might disagree :).

rick
Re: [O] [PATCH] export to various flavors of (X)HTML
On 29.04.2013 02:02, Eric Abrahamsen wrote:
> Rick Frankel r...@rickster.com writes:
>
> > On Fri, Apr 26, 2013 at 10:14:17AM -0700, Eric Abrahamsen wrote:
> > > Rick Frankel r...@rickster.com writes:
> > >
> > > > See the discussions of polyglot markup @ http://en.wikipedia.org/wiki/Polyglot_markup and http://www.w3.org/TR/2011/WD-html-polyglot-20110405/#dfn-polyglot-markup for the rationale.
> > >
> > > Ah, those were interesting links, I hadn't considered those issues. Luckily, your second option was a three-line change to the existing patch: using xhtml5 now produces the same output as html5, except that self-closing tags are self-closed, and there's a xmlns declaration in the html element. Best of all worlds, I hope.

Overall, works well. A couple of things related to `org-html-xml-declaration':

- It should not be added by default for xhtml5 (from http://www.w3.org/TR/2011/WD-html-polyglot-20110405/#PI-and-xml):

  #+BEGIN_QUOTE
  2. Processing Instructions and the XML Declaration

  Processing Instructions and the XML Declaration are both forbidden in polyglot markup.
  #+END_QUOTE

- If `org-html-xml-declaration' is set to nil or the empty string, a blank first line is placed in the document prior to the DOCTYPE declaration. If the above fix is added (so it is not generated for xhtml5) that should solve the immediate problem, but I think that the formatting code in org-html-html-template should check that the inner format (line #1697).

A patch is attached to fix both issues.

> 1. There's a new export option, org-html-html5-fancy/HTML_HTML5_FANCY, which defaults to 'nil, making most of the following opt-in only.

Very nice.

> 5. It's generally accepted that one should use some variety of the html5shiv[1] to make IE 9 render new HTML5 elements correctly. I've dropped a note to this effect in the docstring of `org-html-html5-fancy', but I suppose it's possible we could take a more interventionist stance, perhaps including hosting a version of the shiv on orgmode.org, and linking to it automatically. I guess I'm in favor of leaving it to the user, though.

I agree that we should comment and leave it to the user. I believe the owner of html5shiv is against CDN hosting of the javascript and feels that it should always be downloaded and included locally.

> Tangential coding question: I've noticed that setting HTML_HTML5_FANCY to nil at the top of the export file results in `(plist-get info :html-html5-fancy)' returning the string "nil", i.e. true. Not right, obviously, and it makes it impossible to set it to 'nil per-file if the global value is 't. Am I handling this wrong?

I believe Nicolas answered this in a previous email, but the solution is to use '() and not nil.

rick
Re: [O] [PATCH] export to various flavors of (X)HTML
Whoops. Wrong key. Patch actually attached to this email...

rick

From d95a365f547fdc681c530c9088f775b30a37d9aa Mon Sep 17 00:00:00 2001
From: Rick Frankel r...@rickster.com
Date: Tue, 30 Apr 2013 10:35:14 -0400
Subject: [PATCH] Modify processing of xhtml declaration.

* lisp/ox-html.el (org-html-template): If `org-html-xml-declaration' is nil or an empty string, don't output a blank line at the head of the document. Also, don't output the declaration if the document type is xhtml5.
---
 lisp/ox-html.el | 18 ++
 1 file changed, 10 insertions(+), 8 deletions(-)

diff --git a/lisp/ox-html.el b/lisp/ox-html.el
index e7cae1a..05b99bf 100644
--- a/lisp/ox-html.el
+++ b/lisp/ox-html.el
@@ -1692,19 +1692,21 @@ holding export options.
 CONTENTS is the transcoded contents string.  INFO is a plist
 holding export options."
   (concat
-   (when (org-html-xhtml-p info)
-     (format "%s\n"
-             (format (or (and (stringp org-html-xml-declaration)
+   (when (and (not (org-html-html5-p info)) (org-html-xhtml-p info))
+     (let ((decl (or (and (stringp org-html-xml-declaration)
                           org-html-xml-declaration)
                      (cdr (assoc (plist-get info :html-extension)
                                  org-html-xml-declaration))
                      (cdr (assoc "html" org-html-xml-declaration))
 
-                     "")
-                     (or (and org-html-coding-system
-                              (fboundp 'coding-system-get)
-                              (coding-system-get org-html-coding-system 'mime-charset))
-                         "iso-8859-1"))))
+                     "")))
+       (when (not (or (eq nil decl) (string= "" decl)))
+         (format "%s\n"
+                 (format decl
+                         (or (and org-html-coding-system
+                                  (fboundp 'coding-system-get)
+                                  (coding-system-get org-html-coding-system 'mime-charset))
+                             "iso-8859-1"))))))
    (let* ((dt (plist-get info :html-doctype))
           (dt-cons (assoc dt org-html-doctype-alist)))
      (if dt-cons
-- 
1.8.0
Re: [O] [PATCH] export to various flavors of (X)HTML
Rick Frankel r...@rickster.com writes:

> Whoops. Wrong key. Patch actually attached to this email...
>
> rick

Great, I'll consolidate all these -- would it be better to mush them into one big patch, or to keep them separate (I suppose for ease of rollback, if something goes wrong)?

E
Re: [O] [PATCH] export to various flavors of (X)HTML
Hello,

Eric Abrahamsen e...@ericabrahamsen.net writes:

> Tangential coding question: I've noticed that setting HTML_HTML5_FANCY to nil at the top of the export file results in `(plist-get info :html-html5-fancy)' returning the string "nil", i.e. true. Not right, obviously, and it makes it impossible to set it to 'nil per-file if the global value is 't. Am I handling this wrong?

The value of a regular keyword is always a string. You can check for the empty string instead. You can also add an item in OPTIONS, since those are read and, thus, can have a symbol as value.

Regards,

-- 
Nicolas Goaziou
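The string-versus-symbol problem described above can be sketched as follows. This is only an illustration: the helper name `my-keyword-true-p' is hypothetical, and treating the literal string "nil" as false is an extra assumption that goes beyond the empty-string check suggested in the reply.

```elisp
(require 'subr-x) ; for `string-trim' on older Emacs versions

;; Hypothetical helper: decide whether a keyword value read from an
;; Org file should count as "on".  Keyword values arrive as strings,
;; so the string "nil" is non-nil as far as Lisp is concerned.
(defun my-keyword-true-p (value)
  "Return t unless VALUE is nil, an empty string, or the string \"nil\"."
  (cond ((null value) nil)
        ((stringp value)
         (not (member (string-trim value) '("" "nil"))))
        (t t)))

;; (my-keyword-true-p "nil") and (my-keyword-true-p "") are both false,
;; while (my-keyword-true-p "t") and (my-keyword-true-p t) are true.
```

A value set through OPTIONS would already be a symbol, so it would pass through the first or last branch unchanged.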
Re: [O] [PATCH] export to various flavors of (X)HTML
Rick Frankel r...@rickster.com writes:

> On Fri, Apr 26, 2013 at 10:14:17AM -0700, Eric Abrahamsen wrote:
> > Rick Frankel r...@rickster.com writes:
> >
> > > Therefore, `org-html-close-tag' should check that the doctype is not a flavor of html4 rather than a flavor of xhtml. An alternative would be to add ("xhtml5" . "<!DOCTYPE html>") to the doctype alist, and the appropriate testing for being html5 and xhtml.
> > >
> > > See the discussions of polyglot markup @ http://en.wikipedia.org/wiki/Polyglot_markup and http://www.w3.org/TR/2011/WD-html-polyglot-20110405/#dfn-polyglot-markup for the rationale.
> >
> > Ah, those were interesting links, I hadn't considered those issues. Luckily, your second option was a three-line change to the existing patch: using xhtml5 now produces the same output as html5, except that self-closing tags are self-closed, and there's a xmlns declaration in the html element. Best of all worlds, I hope.
>
> Brilliant! I will apply the patch and try it later this weekend...

So here's the fun part -- using the new bits of HTML5. The attached patch builds on the last one (and corrects a couple of documentation formatting errors), and touches on the following:

1. There's a new export option, org-html-html5-fancy/HTML_HTML5_FANCY, which defaults to 'nil, making most of the following opt-in only.

2. The meat of the change is in `org-html-special-block'. If it comes across a special block #+BEGIN_FOO where foo is a member of `org-html-html5-elements', it will format it as <foo> rather than <div class="foo">. So #+BEGIN_ASIDE will create an <aside> element. Attributes are now parsed (this change applies to all HTML flavors), so this:

   #+ATTR_HTML: :controls controls :width 350
   #+BEGIN_VIDEO
   #+HTML: <source src="movie.mp4" type="video/mp4">
   #+END_VIDEO

   becomes:

   <video controls="controls" width="350">
   <source src="movie.mp4" type="video/mp4">
   </video>

3. Standalone images are formatted as <figure> with <figcaption>.

4. Things like timestamps could be expressed as <time> elements, but I haven't done that here. The relevant attributes still seem to be up in the air, and it would be complicated.

5. It's generally accepted that one should use some variety of the html5shiv[1] to make IE 9 render new HTML5 elements correctly. I've dropped a note to this effect in the docstring of `org-html-html5-fancy', but I suppose it's possible we could take a more interventionist stance, perhaps including hosting a version of the shiv on orgmode.org, and linking to it automatically. I guess I'm in favor of leaving it to the user, though.

Tangential coding question: I've noticed that setting HTML_HTML5_FANCY to nil at the top of the export file results in `(plist-get info :html-html5-fancy)' returning the string "nil", i.e. true. Not right, obviously, and it makes it impossible to set it to 'nil per-file if the global value is 't. Am I handling this wrong?

E

From 636720ca8444a4767a44170b6ed29cf471f1aee7 Mon Sep 17 00:00:00 2001
From: Eric Abrahamsen e...@ericabrahamsen.net
Date: Sun, 28 Apr 2013 23:00:26 -0700
Subject: [PATCH 10/10] ox-html.el: Give access to new elements in HTML5

* lisp/ox-html.el (org-html-html5-fancy): New variable, determining whether or not to use new elements.
(org-html-html5-elements): New variable, new HTML5 elements.
(org-html-special-block): Export special blocks to new HTML5 elements.
(org-html-format-inline-image): Use figure and figcaption for standalone images.

* doc/org.texi: Document the above.
---
 doc/org.texi    | 64 -
 lisp/ox-html.el | 61 --
 2 files changed, 114 insertions(+), 11 deletions(-)

diff --git a/doc/org.texi b/doc/org.texi
index 40f5216..ad438f4 100644
--- a/doc/org.texi
+++ b/doc/org.texi
@@ -11007,11 +11007,11 @@ Export to a temporary buffer. Do not create a file.
 Org can export to various (X)HTML flavors.
-Setting the variable @var{org-html-doctype} allows you to export to different
-(X)HTML variants. The exported HTML will be adjusted according to the sytax
-requirements of that variant. You can either set this variable to a doctype
+Setting the variable @code{org-html-doctype} allows you to export to different
+(X)HTML variants. The exported HTML will be adjusted according to the sytax
+requirements of that variant. You can either set this variable to a doctype
 string directly, in which case the exporter will try to adjust the syntax
-automatically, or you can use a ready-made doctype. The ready-made options
+automatically, or you can use a ready-made doctype. The ready-made options
 are:
 @itemize
@@ -11035,7 +11035,61 @@ are:
 ``xhtml5''
 @end itemize
-See the variable @var{org-html-doctype-alist} for details. The default is ``xhtml-strict''.
+See the variable @code{org-html-doctype-alist} for details. The default is
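The special-block behaviour described in point 2 of the message above can be sketched in a few lines. This is not the code from the patch: the `my-` function names, the element list, and the plist format for attributes are all assumptions made for illustration.

```elisp
;; Hypothetical subset of element names; the patch keeps its own list
;; in `org-html-html5-elements'.
(defvar my-html5-elements
  '("article" "aside" "audio" "figure" "footer" "header" "nav"
    "section" "video")
  "Block names that map directly onto HTML5 elements.")

(defun my-attr-string (plist)
  "Turn PLIST such as (:width \"350\") into an HTML attribute string."
  (let (out)
    (while plist
      (push (format " %s=\"%s\""
                    (substring (symbol-name (car plist)) 1) ; drop the colon
                    (cadr plist))
            out)
      (setq plist (cddr plist)))
    (apply #'concat (nreverse out))))

(defun my-format-special-block (type contents attrs fancy)
  "Format a special block of TYPE around CONTENTS.
ATTRS is a plist of attributes.  When FANCY is non-nil and TYPE is
listed in `my-html5-elements', use the element itself; otherwise
fall back to a div with TYPE as its class."
  (let ((name (downcase type)))
    (if (and fancy (member name my-html5-elements))
        (format "<%s%s>\n%s\n</%s>"
                name (my-attr-string attrs) contents name)
      (format "<div class=\"%s\"%s>\n%s\n</div>"
              name (my-attr-string attrs) contents))))
```

Called with "VIDEO", the source line from the example, the attributes (:controls "controls" :width "350"), and a non-nil FANCY flag, this sketch produces the same video markup as shown in the message; with FANCY nil it falls back to a div.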
Re: [O] [PATCH] export to various flavors of (X)HTML
On 25.04.2013 17:20, Eric Abrahamsen wrote:
> Who knew this would turn out to be such a fraught issue! All I wanted was that little green checkmark from the W3C...
>
> Here's what I think should be an acceptable final patch. I dropped the CDATA mess, and came up with a slightly different implementation for handling self-closing tags. It's maybe a little /bulkier/ than the previous implementation, but not so hacky, and may continue to be useful in the future. There's also a documentation patch.

Overall, looks good, but again, I would _strongly_ argue that html5 should generate valid xhtml. If it doesn't, it will really break my post-processing workflow...

Therefore, `org-html-close-tag' should check that the doctype is not a flavor of html4 rather than a flavor of xhtml. An alternative would be to add ("xhtml5" . "<!DOCTYPE html>") to the doctype alist, and the appropriate testing for being html5 and xhtml.

See the discussions of polyglot markup @ http://en.wikipedia.org/wiki/Polyglot_markup and http://www.w3.org/TR/2011/WD-html-polyglot-20110405/#dfn-polyglot-markup for the rationale.

rick
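The first suggestion above (self-close every void element unless the doctype is an html4 flavor) could look roughly like this. It is a sketch only: the function name `my-html-close-tag', its argument list, and the prefix test on the doctype name are assumptions and do not reflect the signature of `org-html-close-tag' in the patch.

```elisp
;; Hypothetical illustration of "check for not-html4 instead of xhtml".
(defun my-html-close-tag (tag attrs doctype)
  "Return void element TAG with attribute string ATTRS.
The tag is self-closed unless DOCTYPE names an html4 flavor."
  (let ((html4 (string-prefix-p "html4" doctype)))
    (format "<%s%s%s>"
            tag
            (if (string= attrs "") "" (concat " " attrs))
            (if html4 "" " /"))))

;; (my-html-close-tag "br" "" "html4-strict") leaves the tag unclosed,
;; while "xhtml-strict", "html5" and "xhtml5" all get the " /" suffix.
```

With this variant plain html5 output would also be self-closed, which is what makes it usable as polyglot markup; the alternative mentioned in the same message keeps html5 and xhtml5 as separate doctypes instead.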
Re: [O] [PATCH] export to various flavors of (X)HTML
Rick Frankel r...@rickster.com writes:

> On 25.04.2013 17:20, Eric Abrahamsen wrote:
>
>> Who knew this would turn out to be such a fraught issue! All I wanted was that little green checkmark from the W3C...
>>
>> Here's what I think should be an acceptable final patch. I dropped the CDATA mess, and came up with a slightly different implementation for handling self-closing tags. It's maybe a little /bulkier/ than the previous implementation, but not so hacky, and may continue to be useful in the future. There's also a documentation patch.
>
> Overall, looks good, but again, I would _strongly_ argue that html5 should generate valid xhtml. If it doesn't, it will really break my post-processing workflow...
>
> Therefore, `org-html-close-tag' should check that the doctype is not a flavor of html4 rather than a flavor of xhtml. An alternative would be to add (xhtml5 . "<!DOCTYPE html>") to the doctype alist, and the appropriate testing for being html5 and xhtml.
>
> See the discussions of polyglot markup @ http://en.wikipedia.org/wiki/Polyglot_markup and http://www.w3.org/TR/2011/WD-html-polyglot-20110405/#dfn-polyglot-markup for the rationale.

Ah, those were interesting links, I hadn't considered those issues. Luckily, your second option was a three-line change to the existing patch: using xhtml5 now produces the same output as html5, except that self-closing tags are self-closed, and there's an xmlns declaration in the html element. Best of all worlds, I hope.
E

From 12472f7fe52848a011cc218e36b01416cfa6c146 Mon Sep 17 00:00:00 2001
From: Eric Abrahamsen e...@ericabrahamsen.net
Date: Fri, 26 Apr 2013 10:04:47 -0700
Subject: [PATCH 11/11] ox-html.el: Export to various flavors of (X)HTML

lisp/ox-html.el (org-html-doctype-alist): New variable holding an alist of (X)HTML doctypes
(org-html-xhtml-p): New function
(org-html-html5-p): New function
(org-html-close-tag): New function

Significant changes to `org-html-format-inline-image', `org-html--build-meta-info', `org-html--build-head', `org-html--build-pre/postable', `org-html-template', `org-html-horizontal-rule', `org-html-format-list-item', `org-html-line-break', `org-html-table', and `org-html-verse-block'.

doc/org.texi: Document the above
---
 doc/org.texi    | 43 -
 lisp/ox-html.el | 188 +---
 2 files changed, 166 insertions(+), 65 deletions(-)

diff --git a/doc/org.texi b/doc/org.texi
index 3f2d1b8..0815c49 100644
--- a/doc/org.texi
+++ b/doc/org.texi
@@ -596,6 +596,7 @@ Exporting
 HTML export
 
 * HTML Export commands::        How to invoke HTML export
+* HTML doctypes::               Org can export to various (X)HTML flavors
 * HTML preamble and postamble::  How to insert a preamble and a postamble
 * Quoting HTML tags::           Using direct HTML in Org mode
 * Links in HTML export::        How links will be interpreted and formatted
@@ -10959,6 +10960,7 @@ language, but with additional support for tables.
 @menu
 * HTML Export commands::        How to invoke HTML export
+* HTML doctypes::               Org can export to various (X)HTML flavors
 * HTML preamble and postamble::  How to insert a preamble and a postamble
 * Quoting HTML tags::           Using direct HTML in Org mode
 * Links in HTML export::        How links will be interpreted and formatted
@@ -10970,7 +10972,7 @@ language, but with additional support for tables.
 * JavaScript support::          Info and Folding in a web browser
 @end menu
 
-@node HTML Export commands, HTML preamble and postamble, HTML export, HTML export
+@node HTML Export commands, HTML doctypes, HTML export, HTML export
 @subsection HTML export commands
 
 @table @kbd
@@ -10998,7 +11000,44 @@ Export to a temporary buffer. Do not create a file.
 @c @noindent
 @c creates two levels of headings and does the rest as items.
 
-@node HTML preamble and postamble, Quoting HTML tags, HTML Export commands, HTML export
+@node HTML doctypes, HTML preamble and postamble, HTML Export commands, HTML export
+@subsection HTML doctypes
+@vindex org-html-doctype
+@vindex org-html-doctype-alist
+
+Org can export to various (X)HTML flavors.
+
+Setting the variable @var{org-html-doctype} allows you to export to different
+(X)HTML variants. The exported HTML will be adjusted according to the sytax
+requirements of that variant. You can either set this variable to a doctype
+string directly, in which case the exporter will try to adjust the syntax
+automatically, or you can use a ready-made doctype. The ready-made options
+are:
+
+@itemize
+@item
+``html4-strict''
+@item
+``html4-transitional''
+@item
+``html4-frameset''
+@item
+``xhtml-strict''
+@item
+``xhtml-transitional''
+@item
+``xhtml-frameset''
+@item
+``xhtml-11''
+@item
+``html5''
+@item
+``xhtml5''
+@end itemize
+
+See the variable @var{org-html-doctype-alist} for details. The default is ``xhtml-strict''.
+
+@node HTML preamble and postamble, Quoting HTML tags, HTML doctypes, HTML export
 @subsection HTML
Re: [O] [PATCH] export to various flavors of (X)HTML
On Fri, Apr 26, 2013 at 10:14:17AM -0700, Eric Abrahamsen wrote:

> Rick Frankel r...@rickster.com writes:
>
>> Therefore, `org-html-close-tag' should check that the doctype is not a flavor of html4 rather than a flavor of xhtml. An alternative would be to add (xhtml5 . "<!DOCTYPE html>") to the doctype alist, and the appropriate testing for being html5 and xhtml.
>>
>> See the discussions of polyglot markup @ http://en.wikipedia.org/wiki/Polyglot_markup and http://www.w3.org/TR/2011/WD-html-polyglot-20110405/#dfn-polyglot-markup for the rationale.
>
> Ah, those were interesting links, I hadn't considered those issues. Luckily, your second option was a three-line change to the existing patch: using xhtml5 now produces the same output as html5, except that self-closing tags are self-closed, and there's an xmlns declaration in the html element. Best of all worlds, I hope.

Brilliant! I will apply the patch and try it later this weekend...

rick
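The behaviour agreed on in this exchange (xhtml5 produces the same output as html5, except that empty elements are self-closed) can be illustrated with a minimal sketch. The `my/' functions below are hypothetical stand-ins that take a doctype name as a plain string; they are not the patch's `org-html-xhtml-p', `org-html-html5-p' or `org-html-close-tag', whose real signatures are not shown in this part of the thread.

```emacs-lisp
;; Hypothetical stand-ins for the two predicates discussed in the thread.
;; DOCTYPE is assumed to be one of the ready-made names, e.g. "xhtml5".
(defun my/xhtml-p (doctype)
  "Return non-nil if DOCTYPE names an XHTML flavor."
  (string-prefix-p "xhtml" doctype))

(defun my/html5-p (doctype)
  "Return non-nil if DOCTYPE names an (X)HTML5 flavor."
  (and (member doctype '("html5" "xhtml5")) t))

(defun my/close-tag (tag attrs doctype)
  "Return an empty element TAG with ATTRS, self-closed only for XHTML.
ATTRS is an attribute string or nil."
  (concat "<" tag
          (if (and attrs (not (string= attrs "")))
              (concat " " attrs)
            "")
          ;; Space before the slash, as the XHTML compatibility
          ;; guidelines recommend for older HTML user agents.
          (if (my/xhtml-p doctype) " />" ">")))

;; Same element, four doctypes: only the X variants self-close.
(mapcar (lambda (doctype) (my/close-tag "br" nil doctype))
        '("html4-strict" "xhtml-strict" "html5" "xhtml5"))
```

Keeping "X or not" and "5 or not" as two independent tests is what lets xhtml5 share everything with html5 apart from the closing syntax.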
Re: [O] [PATCH] export to various flavors of (X)HTML
François Pinard pin...@iro.umontreal.ca writes:

> Christian Wittern cwitt...@gmail.com writes:
>
>> On 2013-04-23 21:09, François Pinard wrote:
>>
>>> If I remember well [...]
>>
>> Well, in this case you are misremembering, empty elements, a.k.a. self-closing tags, are one of the innovations of XML.
>>
>> Just my nit to pick,
>
> A friendly nit-picking is always a good way to get one another to improve. Thanks!

Who knew this would turn out to be such a fraught issue! All I wanted was that little green checkmark from the W3C...

Here's what I think should be an acceptable final patch. I dropped the CDATA mess, and came up with a slightly different implementation for handling self-closing tags. It's maybe a little /bulkier/ than the previous implementation, but not so hacky, and may continue to be useful in the future. There's also a documentation patch.

Hope this works,

E

From d3af8f41480eea27e0165e4dcd594ce3475e56cd Mon Sep 17 00:00:00 2001
From: Eric Abrahamsen e...@ericabrahamsen.net
Date: Thu, 25 Apr 2013 14:00:24 -0700
Subject: [PATCH 11/11] ox-html.el: Export to various flavors of (X)HTML

lisp/ox-html.el (org-html-doctype-alist): New variable holding an alist of (X)HTML doctypes
(org-html-xhtml-p): New function
(org-html-html5-p): New function
(org-html-close-tag): New function

Significant changes to `org-html-format-inline-image', `org-html--build-meta-info', `org-html--build-head', `org-html--build-pre/postable', `org-html-template', `org-html-horizontal-rule', `org-html-format-list-item', `org-html-line-break', `org-html-table', and `org-html-verse-block'.

doc/org.texi: Document the above
---
 doc/org.texi    | 41 -
 lisp/ox-html.el | 187 +---
 2 files changed, 163 insertions(+), 65 deletions(-)

diff --git a/doc/org.texi b/doc/org.texi
index 3f2d1b8..c7fae6d 100644
--- a/doc/org.texi
+++ b/doc/org.texi
@@ -596,6 +596,7 @@ Exporting
 HTML export
 
 * HTML Export commands::        How to invoke HTML export
+* HTML doctypes::               Org can export to various (X)HTML flavors
 * HTML preamble and postamble::  How to insert a preamble and a postamble
 * Quoting HTML tags::           Using direct HTML in Org mode
 * Links in HTML export::        How links will be interpreted and formatted
@@ -10959,6 +10960,7 @@ language, but with additional support for tables.
 @menu
 * HTML Export commands::        How to invoke HTML export
+* HTML doctypes::               Org can export to various (X)HTML flavors
 * HTML preamble and postamble::  How to insert a preamble and a postamble
 * Quoting HTML tags::           Using direct HTML in Org mode
 * Links in HTML export::        How links will be interpreted and formatted
@@ -10970,7 +10972,7 @@ language, but with additional support for tables.
 * JavaScript support::          Info and Folding in a web browser
 @end menu
 
-@node HTML Export commands, HTML preamble and postamble, HTML export, HTML export
+@node HTML Export commands, HTML doctypes, HTML export, HTML export
 @subsection HTML export commands
 
 @table @kbd
@@ -10998,7 +11000,42 @@ Export to a temporary buffer. Do not create a file.
 @c @noindent
 @c creates two levels of headings and does the rest as items.
 
-@node HTML preamble and postamble, Quoting HTML tags, HTML Export commands, HTML export
+@node HTML doctypes, HTML preamble and postamble, HTML Export commands, HTML export
+@subsection HTML doctypes
+@vindex org-html-doctype
+@vindex org-html-doctype-alist
+
+Org can export to various (X)HTML flavors.
+
+Setting the variable @var{org-html-doctype} allows you to export to different
+(X)HTML variants. The exported HTML will be adjusted according to the sytax
+requirements of that variant. You can either set this variable to a doctype
+string directly, in which case the exporter will try to adjust the syntax
+automatically, or you can use a ready-made doctype. The ready-made options
+are:
+
+@itemize
+@item
+``html4-strict''
+@item
+``html4-transitional''
+@item
+``html4-frameset''
+@item
+``xhtml-strict''
+@item
+``xhtml-transitional''
+@item
+``xhtml-frameset''
+@item
+``xhtml-11''
+@item
+``html5''
+@end itemize
+
+See the variable @var{org-html-doctype-alist} for details. The default is ``xhtml-strict''.
+
+@node HTML preamble and postamble, Quoting HTML tags, HTML doctypes, HTML export
 @subsection HTML preamble and postamble
 @vindex org-html-preamble
 @vindex org-html-postamble
diff --git a/lisp/ox-html.el b/lisp/ox-html.el
index ef7d15a..eddc122 100644
--- a/lisp/ox-html.el
+++ b/lisp/ox-html.el
@@ -143,6 +143,26 @@
 (defvar org-html--pre/postamble-class "status"
   "CSS class used for pre/postamble")
 
+(defconst org-html-doctype-alist
+  '((html4-strict . "<!DOCTYPE html PUBLIC \"-//W3C//DTD HTML 4.01//EN\"
+\"http://www.w3.org/TR/html4/strict.dtd\">")
+    (html4-transitional . "<!DOCTYPE html PUBLIC \"-//W3C//DTD HTML 4.01 Transitional//EN\"
+\"http://www.w3.org/TR/html4/loose.dtd\">")
+    (html4-frameset . "<!DOCTYPE html PUBLIC
Re: [O] [PATCH] export to various flavors of (X)HTML
Christian Wittern cwitt...@gmail.com writes:

> On 2013-04-23 21:09, François Pinard wrote:
>
>> If I remember well [...]
>
> Well, in this case you are misremembering, empty elements, a.k.a. self-closing tags, are one of the innovations of XML.
>
> Just my nit to pick,

A friendly nit-picking is always a good way to get one another to improve. Thanks!

François
Re: [O] [PATCH] export to various flavors of (X)HTML
On 23.4.2013, at 06:57, Samuel Wales samolog...@gmail.com wrote:

> As a non-expert HTML user, I'd want whatever works on the most browsers, even old ones, as my audience is likely to include many who have old browsers in addition to many who have new ones, mobile ones, and accessibility-oriented browsers and extensions.
>
> Dunno if that helps at all.

This is a valid point. While I do not object to a way to select an html flavor, the default should render correctly on as many browsers as possible.

- Carsten

> Samuel
>
> --
> The Kafka Pandemic: http://thekafkapandemic.blogspot.com
>
> The disease DOES progress. MANY people have died from it. ANYBODY can get it.
>
> There is NO hope without action. This means YOU.
Re: [O] [PATCH] export to various flavors of (X)HTML
Carsten Dominik carsten.domi...@gmail.com writes:

> On 23.4.2013, at 06:57, Samuel Wales samolog...@gmail.com wrote:
>
>> As a non-expert HTML user, I'd want whatever works on the most browsers, even old ones, as my audience is likely to include many who have old browsers in addition to many who have new ones, mobile ones, and accessibility-oriented browsers and extensions.
>>
>> Dunno if that helps at all.
>
> This is a valid point. While I do not object to a way to select an html flavor, the default should render correctly on as many browsers as possible.
>
> - Carsten

Yup, absolutely. Inasmuch as anything is certain with HTML, it seems fairly clear that self-closing tags will /render/ properly in HTML4, even if they don't validate, so maybe it doesn't matter. I'll have a think and see if I can't come up with a less hideous way of handling the tags. If I can't, maybe we can just leave it.

E
Re: [O] [PATCH] export to various flavors of (X)HTML
Eric Abrahamsen e...@ericabrahamsen.net writes:

> I read that as just a better statement of what I was trying to say earlier: self-closing tags will render in HTML4, but they're not _strictly correct_ HTML4.

I do not understand this assertion. I thought that HTML, up to but excluding HTML5, *is* also valid SGML. If I remember well, self-closing tags date back to SGML, not requiring (but also not forbidding) an introducing space to the closing slash. SGML does allow for closing tags to be optionally omitted (and for opening tags as well) but such optional omissions have to be described in the DTD.

These features were all meant to favor human legibility of SGML documents, and HTML did use them a lot. These, combined with the generality of the described grammar, made generic SGML validating parsers quite difficult to write, and also, quite expensive at the time (no real problem as most SGML projects usually involved big money).

XML was a full swing in the other direction: trading human legibility in favor of much easier parsing, and was quite successful as it addressed the same problems as SGML, but in a much cheaper and more democratic way. But XML (and XHTML) are not SGML anymore. And HTML5 is neither :-).

François
Re: [O] [PATCH] export to various flavors of (X)HTML
On 2013-04-23 21:09, François Pinard wrote:

> If I remember well, self-closing tags date back to SGML, not requiring (but also not forbidding) an introducing space to the closing slash. SGML does allow for closing tags to be optionally omitted (and for opening tags as well) but such optional omissions have to be described in the DTD.

Well, in this case you are misremembering, empty elements, a.k.a. self-closing tags, are one of the innovations of XML; they did not exist in SGML (where you could simply omit the closing tag completely for empty elements).

Just my nit to pick,

Christian

--
Christian Wittern, Kyoto
Re: [O] [PATCH] export to various flavors of (X)HTML
On Sat, Apr 20, 2013 at 10:59:32AM +0800, Eric Abrahamsen wrote:

> The " />" style doesn't validate for html4, that's what I was going on. It certainly doesn't make my browser explode, but I wanted that little green checkmark! If we can live with that, that's fine, or I can try to come up with a less hacky way of handling closing tags -- a macro maybe.

It should validate. According to the w3c compatibility guidelines (http://www.w3.org/TR/xhtml1/guidelines.html):

  C.2. Empty Elements

  Include a space before the trailing / and > of empty elements, e.g. <br />, <hr /> and <img src="karen.jpg" alt="Karen" />. Also, use the minimized tag syntax for empty elements, e.g. <br />, as the alternative syntax <br></br> allowed by XML gives uncertain results in many existing user agents.

  C.3. Element Minimization and Empty Element Content

  Given an empty instance of an element whose content model is not EMPTY (for example, an empty title or paragraph) do not use the minimized form (e.g. use <p> </p> and not <p />).

> The xmlns declaration, on the other hand, seems quite meaningless for anything that isn't xhtml (even if it doesn't actually break), and it's only a couple of lines of code to deal with, I'd rather keep that in there...

fair enough.

rick
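Guidelines C.2 and C.3 quoted above amount to a small rule for writing an element that has no content. The following is a minimal sketch of that rule only; the function name and the list of EMPTY elements are assumptions made for illustration (the list is deliberately incomplete), and none of it is taken from ox-html.el.

```emacs-lisp
;; Illustrative, incomplete list of elements whose content model is EMPTY.
(defconst my/empty-content-elements '("br" "hr" "img" "meta" "link")
  "Hypothetical list of elements that may use the minimized form.")

(defun my/render-without-content (tag)
  "Return markup for TAG with no content, following C.2 and C.3."
  (if (member tag my/empty-content-elements)
      ;; C.2: minimized form, with a space before the trailing "/>".
      (format "<%s />" tag)
    ;; C.3: the content model is not EMPTY, so keep both tags.
    (format "<%s> </%s>" tag tag)))

(mapcar #'my/render-without-content '("br" "p"))
```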
Re: [O] [PATCH] export to various flavors of (X)HTML
Rick Frankel r...@rickster.com writes:

> On Sat, Apr 20, 2013 at 10:59:32AM +0800, Eric Abrahamsen wrote:
>
>> The " />" style doesn't validate for html4, that's what I was going on. It certainly doesn't make my browser explode, but I wanted that little green checkmark! If we can live with that, that's fine, or I can try to come up with a less hacky way of handling closing tags -- a macro maybe.
>
> It should validate. According to the w3c compatibility guidelines (http://www.w3.org/TR/xhtml1/guidelines.html):
>
>   C.2. Empty Elements
>
>   Include a space before the trailing / and > of empty elements, e.g. <br />, <hr /> and <img src="karen.jpg" alt="Karen" />. Also, use the minimized tag syntax for empty elements, e.g. <br />, as the alternative syntax <br></br> allowed by XML gives uncertain results in many existing user agents.
>
>   C.3. Element Minimization and Empty Element Content
>
>   Given an empty instance of an element whose content model is not EMPTY (for example, an empty title or paragraph) do not use the minimized form (e.g. use <p> </p> and not <p />).

Right, but as the note at the top of that page says:

  This appendix summarizes design guidelines for authors who wish their XHTML documents to render on existing HTML user agents.

I read that as just a better statement of what I was trying to say earlier: self-closing tags will render in HTML4, but they're not _strictly correct_ HTML4. Try the validation link at the bottom of this page:

http://ericabrahamsen.net/html4test.html

It's not a disaster. I'm happy to do whatever needs to be done with the patch, whether that means dropping the closing-tags fix, re-implementing it, or whatever. It would be good to hear other HTML-users' opinions on this, if anyone has one!

E

>> The xmlns declaration, on the other hand, seems quite meaningless for anything that isn't xhtml (even if it doesn't actually break), and it's only a couple of lines of code to deal with, I'd rather keep that in there...
>
> fair enough.
>
> rick
Re: [O] [PATCH] export to various flavors of (X)HTML
On 19.04.2013 05:57, Eric Abrahamsen wrote:

> I'm starting a new thread for this since the previous discussion was buried in with something tangential.
>
> I'm not proud of some of the implementation (self-closing vs non-self-closing tags are ugly, and I wish org-html-html5-p and org-html-xhtml-p were variables, not functions), but there it is, it seems to work.
>
> If this is deemed okay I'll send a version of the patch with a proper commit message, and also updated documentation.

I disagree with the minimized closing patch change. All versions of html accept the " />" idiom (with the extra space so that html4 only browsers don't break) for minimized tags (also "></{elem}>", e.g. <hr></hr> is, I believe, always valid). html5 certainly accepts valid xhtml as input. It would entirely break e.g. nxml-mode or xsl post-processing to make this change.

Other things that don't need to be removed for html5:

- CDATA escapes
- xmlns: .. xml:lang declarations (as long as you keep the html valid xml)

As a positive side effect, backing out these changes would simplify the patch a lot :)

The doctype (and fix to the text/javascript closing tag) changes look great.

rick
Re: [O] [PATCH] export to various flavors of (X)HTML
Rick Frankel r...@rickster.com writes:

> On 19.04.2013 05:57, Eric Abrahamsen wrote:
>
>> I'm starting a new thread for this since the previous discussion was buried in with something tangential.
>>
>> I'm not proud of some of the implementation (self-closing vs non-self-closing tags are ugly, and I wish org-html-html5-p and org-html-xhtml-p were variables, not functions), but there it is, it seems to work.
>>
>> If this is deemed okay I'll send a version of the patch with a proper commit message, and also updated documentation.

Thanks for looking at this!

> I disagree with the minimized closing patch change. All versions of html accept the " />" idiom (with the extra space so that html4 only browsers don't break) for minimized tags (also "></{elem}>", e.g. <hr></hr> is, I believe, always valid). html5 certainly accepts valid xhtml as input. It would entirely break e.g. nxml-mode or xsl post-processing to make this change.

The " />" style doesn't validate for html4, that's what I was going on. It certainly doesn't make my browser explode, but I wanted that little green checkmark! If we can live with that, that's fine, or I can try to come up with a less hacky way of handling closing tags -- a macro maybe.

> Other things that don't need to be removed for html5:
>
> - CDATA escapes
> - xmlns: .. xml:lang declarations (as long as you keep the html valid xml)

I'd be happy to leave the CDATA escapes in there, since it really doesn't seem to make any difference, and the implementation is ugly. I think I'm erring on the side of pedantic correctness.

The xmlns declaration, on the other hand, seems quite meaningless for anything that isn't xhtml (even if it doesn't actually break), and it's only a couple of lines of code to deal with, I'd rather keep that in there...

Thanks again,
Eric

> As a positive side effect, backing out these changes would simplify the patch a lot :)
>
> The doctype (and fix to the text/javascript closing tag) changes look great.
>
> rick