Hendrik and all, hello.

On 26 Aug 2018, at 17:47, Hendrik Boom wrote:

SGML had a hierarchy of tags -- which ones would automatically close off
others, so that it wasn't necessary to slavishly balance all the
tag-bracketing. But the exact hierarchy would depend on the publisher's
style definition for the document type.

For example, suppose 'em' elements are declared to be contained within 'p' elements, and 'p' elements not to contain other 'p' elements, and consider:

    <p>Text <em>with emphasis.
    <p>Another paragraph.

The </em> closing tag, and the two </p> closing tags, would be inserted by the parser. In other circumstances, using the 'null end-tag' short form, you could have <p>A single <em/emphasised/ word</p>.
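As a rough illustration of that end-tag insertion, here is a minimal Python sketch. It is not how any real SGML parser works: the ALLOWED_CHILDREN table, the '#root' name and the insert_omitted_end_tags function are all invented for this example, and a real parser would work from the DTD's content models and minimisation settings rather than a hand-written table. It only handles bare tags without attributes.

```python
import re

# Hypothetical content model for this sketch: which elements each element
# may contain. '#root' is an invented name for the document level.
ALLOWED_CHILDREN = {
    "#root": {"p"},
    "p": {"em"},
    "em": set(),
}

# Matches a bare start tag, a bare end tag, or a run of character data.
TOKEN = re.compile(r"<(/?)([A-Za-z][A-Za-z0-9]*)>|([^<]+)")


def insert_omitted_end_tags(text, allowed=ALLOWED_CHILDREN):
    """Return text with the end tags implied by the content model written out."""
    stack = ["#root"]  # currently open elements, innermost last
    out = []
    for match in TOKEN.finditer(text):
        closing, name, chars = match.groups()
        if chars is not None:
            out.append(chars)
        elif closing:
            # Explicit end tag: close open elements up to and including `name`.
            if name not in stack:
                raise ValueError(f"unexpected end tag </{name}>")
            while True:
                top = stack.pop()
                out.append(f"</{top}>")
                if top == name:
                    break
        else:
            # Start tag: close open elements until one may contain `name`.
            while name not in allowed[stack[-1]]:
                if len(stack) == 1:
                    raise ValueError(f"<{name}> not allowed here")
                out.append(f"</{stack.pop()}>")
            stack.append(name)
            out.append(f"<{name}>")
    # End of input closes everything that is still open.
    while len(stack) > 1:
        out.append(f"</{stack.pop()}>")
    return "".join(out)


print(insert_omitted_end_tags("<p>Text <em>with emphasis.<p>Another paragraph."))
# prints: <p>Text <em>with emphasis.</em></p><p>Another paragraph.</p>
```

The second <p> cannot appear inside 'em' or inside 'p', so the sketch closes both before opening it, and the end of input closes the last paragraph: one </em> and two </p>, as in the example above.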

To be precise, and as a point of historical interest (and this is still somewhat at a tangent from Richard's original query), SGML had exactly[1] the same hierarchical model as XML, but it also had various 'tag minimisation' features, one of which, if enabled, required the SGML parser to insert closing tags into the parse stream in the way you describe, Hendrik. Other options allowed one to omit attribute names if the attribute values were unique, or use a generic end-tag </>, and so on.

When XML first appeared, it was defined as (or rather the independent definition was intended to be equivalent to) a profile of SGML, in the sense that there was an SGML declaration (i.e., a set of parser settings) which, amongst other things, turned off all optional features. The differences between the two technologies, expressed in rather recondite SGML terms, are set out at <https://www.w3.org/TR/NOTE-sgml-xml-971215/> (by the way, the author of this note is indeed the James Clark of groff).

HTML was, I think, initially defined in conceptually the same way, as an SGML declaration and DTD. Then XHTML was defined in terms of a DTD for XML, and HTML redefined in terms of XHTML plus error recovery for ill-formed documents (i.e., with things like missing end-tags), but that struggled to be adopted. Finally (?) HTML5 was redefined from scratch by a loose (and reportedly rather bad-tempered) consortium of browser makers as a ragbag of element-start and element-end tags and the presumed effects when a parser stumbled across them (that may be a less sympathetic description of HTML5 than its designers would provide). I may have fumbled bits of that history, but it's something like that.

Best wishes,

Norman
*now getting rather lost down memory lane*


[1] I say 'exactly': I can't think of any differences, but I wouldn't want to insist there were none.



--
Norman Gray  :  https://nxg.me.uk
