Re: [whatwg] Various HTML element feedback

Jukka K. Korpela Tue, 05 Jun 2012 21:39:51 -0700

2012-06-06 2:53, Ian Hickson wrote:

I have rather been optimistic about future developments for markup
elements that have been defined exactly enough to warrant meaningful
semantics-based processing. For example, most of the uses mentioned in
current text imply that <var> element contents should be kept intact in
automatic language translation.


That continues to be the case, so I don't know why you conclude that using
it is now pointless.

It is worse than pointless, if the definition of <var> covers "a term
used as a placeholder in prose". Such expressions should definitely not
be kept intact in automatic language translation.

The definition of <var> is so broad that it is questionable whether
*anything* useful can be assumed in automated processing. If it were
defined more technically, without that placeholder idea, we could fairly
certainly say that the content should be treated as a technical notation
that should be left untranslated (as such notations are normally
international), ignored in spelling checks, treated as equivalent to
unknown nouns in syntax analysis of human language text, etc.
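
To make the kind of processing meant here concrete, the following is a
minimal sketch, not taken from any existing translation tool. It assumes
a hypothetical pre-translation step that splits markup into runs and
marks everything inside <var> as "do not translate". The names
VarSegmenter and segment are invented for illustration.

```python
from html.parser import HTMLParser


class VarSegmenter(HTMLParser):
    """Split markup into (text, translatable) runs.

    Assumption for illustration: text inside <var> is a technical
    notation and must be passed through untranslated.
    """

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.depth = 0   # current nesting depth inside <var>
        self.runs = []   # collected (text, translatable) tuples

    def handle_starttag(self, tag, attrs):
        if tag == "var":
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == "var" and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        # Text is translatable only when we are outside every <var>.
        self.runs.append((data, self.depth == 0))


def segment(markup):
    """Return the list of (text, translatable) runs for a markup string."""
    parser = VarSegmenter()
    parser.feed(markup)
    parser.close()  # flush any text buffered at the end of input
    return parser.runs


if __name__ == "__main__":
    print(segment("Let <var>n</var> be the count."))
    print(segment("Dear <var>your name</var>, welcome."))
```

The second call shows the problem: with the placeholder usage allowed,
"your name" gets protected from translation although it is ordinary
prose, so the rule is only safe under the narrower, technical
definition.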

So why not simply define <i> as recommended and describe <var>, <cite>,
<em>, and <dfn> as deprecated but supported alternatives?


What benefit does empty deprecation have?

Declaring some features as "obsolete" is effectively deprecation; I just
used the term "deprecate" as per HTML 4.01 because I find it more
descriptive. Anyway, defining those elements as deprecated/obsolete
would be no less and no more "empty" than the current statements about
obsolete status. Validators/checkers would issue messages (hopefully
just warnings) about them, and tutorials would probably describe them as
secondary if at all.

Reducing alternatives, from five to one in this case, makes the
recommendations simpler and helps authors because they need not spend
time in making choices between the elements. Such choices can be tough,
if you try to play by the declared "semantics", especially if it is
vague (to a normal reader of a spec).

My point is: either make elements like <var>, <cite>, <em>, <dfn>
defined so that the differences can be utilized in automatic processing,
or just bundle them together, to <i>.

It's not like we can ever remove
these elements altogether.


Oh, in 20 or 30 years, I think browsers could drop support for some of
them.

What harm do they cause?

Unnecessary complication to the language, artificial "semantics" that do
not actually define meanings, and confusion among those authors who try
to take semantics and specifications seriously. Oh, and pointless
variation in markup and added complexity of styling.

If we have to keep them, we are better served by embracing them and giving
them renewed purpose and vigour, rather than being ashamed of them.

I think this summarizes well the idea behind some of the most contrived
"semantic" definitions. It was a brave attempt, but it failed. No normal
author will ever get your idea of the new meaning for and , for
example.

And since, for example, the markup needs to be supported for a
long time, how come *it* has not got a new, semantic definition?

If <var>, <cite>, <em>, <dfn> were obsoleted/deprecated in favor of
<i>, they would still need to be defined in the spec, of course. But the
definition could simply state that they are outdated elements that
should not be used by authors and should be treated by browsers as
equivalent to <i>.

This would make authoring simpler without any real cost. There’s
little reason to tell authors to use “semantic markup” if we don’t
think it has real effect on anything.


It does have an effect. It has many effects. It makes maintenance easier,
it makes it easier to transition from project to project, it makes it
easier to work on other people's markup, it makes it significantly easier
to dramatically change a site's appearance, it makes it easier to create
apply custom tools to extract information from the documents, it makes it
easier for search engines to guess at author intent, it makes it easier
for the documents to be repurposed for other media, it makes it easier for
documents to be "remixed", it makes it easier for JavaScript libraries to
be used and mixed...

I've often seen such arguments, even in situations where it is
strikingly obvious that they don't apply. The argumentation sounds like
a matter of faith or principle rather than practical considerations.

Many of the arguments relate to authoring style, coding principles, and
organization of work, rather than something that belongs to a general
specification. For example, the ease of working on other people's markup
in a collaborative environment depends on a large number of factors,
including the overall structures, appearance of markup (lower vs. upper
case, use of quotes, omission of omissible tags, indentations, empty
lines), principles of choosing id and class names, use of comments, etc.
General specifications cannot and need not handle such issues. And, say,
the use of vs. , given their current definitions, is quite
comparable to regulating the use of class attributes.

The other major part of the argumentation refers to assumed automatic
processing. This is mostly just assumptions, or wishes, often presented
as if they were facts. But they *could* be turned to reality, in part.
This is just the reason why I have asked for semantic clarifications. No
one can reasonably base automatic processing on definitions like those
for <var>, , etc. now.

Let legacy be legacy, instead of trying to convert it to "semantics".
The semantics of physical markup is the visual appearance. It is best to
describe it simply and openly (and accurately - for example, what <i>
really means in legacy markup, and will mean in browsers in the
foreseeable future, is italic *or* oblique *or* algorithmically slanted
font).

What is _compelling_ about markup for misspellings?


It's a feature that is necessary in text editors, for which we previously
did not have a good solution.

I would not call it a solution to say that the <b> markup, which
actually means bold face to any existing relevant software, should be
used for specialized meanings. How could anyone, or any software,
reading markup guess whether <b> means "misspelling", or "Chinese name",
or some entirely different "unarticulated, though explicitly rendered,
non-textual annotation"? Such things can be resolved via classes, to
some extent, but then the artificial "semantic" definition for <b> is
pointless.


Yucca
