Re: [whatwg] contenteditable, em and strong

2007-01-12 Thread Henri Sivonen

On Jan 12, 2007, at 05:25, Matthew Paul Thomas wrote:

Is the effort to get people to use CSS instead of spacer GIFs a bad  
idea?


Is the effort to get people to use <h1>..<h6> instead of <p><b> or
<p><font> a bad idea?


No. In those cases the alternatives are substantially different  
technically. Not only that, CSS is more powerful and makes things  
substantially easier and more maintainable even for authors who don't  
care about the philosophy behind the advocacy.


With i vs. em, the argument is over which identifier (opaque  
string that can be compared for equality) is used as an element name.  
There's no substantial technical difference.


Is the effort to get people to use CSS instead of table for  
layout a bad idea?


It often is, sadly. When people really, really want a grid layout  
model and try to fake it with positioning or floats, the result tends  
to be more brittle and (particularly with positioning) less fluidly  
scalable than a table layout (positioning being worse than floats  
but see http://dbaron.org/log/2005-12#e20051228a ).


There were, I'm sure, many more occurrences of those problems than  
there were improper uses of em and strong. And the efforts to  
replace them are much older than the effort to get people who don't  
think about semantics to use b and i, which has hardly even  
started yet.


Considering the IIIR draft I referenced and the Siegel article that  
Anne mentioned, the em vs. i discussion seems to actually be  
older. But regardless of the exact age of the debate, my secondary  
point was that the expected payoff is so light that I don't think  
spending another 14 years on this is worthwhile. My opinion would be  
different if the expected payoff was insanely great.


--
Henri Sivonen
[EMAIL PROTECTED]
http://hsivonen.iki.fi/




Re: [whatwg] contenteditable, em and strong

2007-01-12 Thread Alexey Feldgendler

On Fri, 12 Jan 2007 09:41:42 +0100, Henri Sivonen [EMAIL PROTECTED] wrote:

Is the effort to get people to use CSS instead of table for layout a 
bad idea?


It often is, sadly. When people really, really want a grid layout model 
and try to fake it with positioning or floats, the result tends to be 
more brittle and (particularly with positioning) less fluidly scalable 
than a table layout (positioning being worse than floats but see 
http://dbaron.org/log/2005-12#e20051228a ).


Just a side note: for grid layouts, display: table-* should be used.


--
Alexey Feldgendler [EMAIL PROTECTED]
[ICQ: 115226275] http://feldgendler.livejournal.com


Re: [whatwg] contenteditable, em and strong

2007-01-12 Thread Anne van Kesteren
On Fri, 12 Jan 2007 13:16:04 +0100, Spartanicus  
[EMAIL PROTECTED] wrote:

CSS table layouts share all of the many drawbacks of HTML table layouts,
except for the false semantics (one of the least significant issues
IMO).


I agree, CSS needs something like the XUL flexible box model.



Afaics this is off topic for this list, so I'm not going to add further
to this thread spin off.


It prolly is, oh well.


--
Anne van Kesteren
http://annevankesteren.nl/
http://www.opera.com/


Re: [whatwg] contenteditable, em and strong

2007-01-11 Thread Henri Sivonen

On Jan 11, 2007, at 10:42, fantasai wrote:


Are you arguing that i should mean emphasis instead of italics?
If so, I disagree...


Almost, except s/emphasis/different from normal paragraph content/ to  
dodge the discussion on what constitutes emphasis.


I am arguing that

The introduction of em and strong (circa 1993) has failed to  
achieve a semantic improvement over i and b, because prominent  
tools such as Dreamweaver, Tidy, IE and Opera as well as simplified  
well-intentioned advocacy treat em and strong merely as more  
fashionable alternatives to i and b. (I mean failure in terms of  
what meaning a markup consumer can extract from the real Web without  
a private agreement with the producer of a given Web page. I don't  
mean the ability of authors to write style sheets for their own markup.)


Therefore, in retrospect, it might have been more useful to  
generalize i and b back in 1993 instead of trying to launch  
alternatives. i could have been generalized as follows: i  
denotes content that is different from normal paragraph content. For  
scripts that customarily use italics for this purpose, the default  
presentation on the visual media is italics when the ability to  
render text in italics is available. User agents may use different  
default presentations for making the content different from normal  
paragraph content for scripts that don't customarily use italics, on  
non-visual media or when italics are not available for display. For  
example, for Chinese and Japanese accent-like glyphs above or below  
the content could be used, for aural media a different tone of voice  
could be used and for tty display inverted colors could be used.


But that wasn't done back in 1993 and now we are stuck with two
pairs of elements. I suggest defining the pairs as synonymous (giving  
in to practice made prevalent by tools biased towards bicameral  
scripts) and then generalizing them as outlined above. Nowadays with  
CSS, refining the default presentation is relatively easy when the  
default isn't exactly right. For private styling conventions, hand- 
coding authors would have double the style hooks without having to  
use class. (Specifically, I am not suggesting deprecating or  
obsoleting any of i, b, em and strong.)


Insisting on the difference of i and em is not without harm,  
because arguing about which one to use is not without opportunity  
cost. Also, I think the expected payoff (that mpt gave) from careful  
differentiation between the elements is not worth the trouble even if  
it was achievable through an education campaign.



P.S. To see how far we have come since 1993, check out this example  
in the IIIR draft:


This text contains an <em>emphasized</em> word.
<strong>Don't assume</strong> that it will be italic!
It was made using the <CODE>EM</CODE> element. A citation is
typically italic and has no formal necessary structure:
<cite>Moby Dick</cite> is a book title.

http://www.w3.org/MarkUp/draft-ietf-iiir-html-01.txt

--
Henri Sivonen
[EMAIL PROTECTED]
http://hsivonen.iki.fi/




Re: [whatwg] contenteditable, em and strong

2007-01-11 Thread Matthew Paul Thomas

On Jan 12, 2007, at 5:23 AM, Henri Sivonen wrote:

...
The introduction of em and strong (circa 1993) has failed to 
achieve a semantic improvement over i and b, because prominent 
tools such as Dreamweaver, Tidy, IE and Opera as well as simplified 
well-intentioned advocacy treat em and strong merely as more 
fashionable alternatives to i and b. (I mean failure in terms of 
what meaning a markup consumer can extract from the real Web without a 
private agreement with the producer of a given Web page. I don't mean 
the ability of authors to write style sheets for their own markup.)

...


Is the effort to get people to use CSS instead of spacer GIFs a bad 
idea?


Is the effort to get people to use <h1>..<h6> instead of <p><b> or
<p><font> a bad idea?


Is the effort to get people to use CSS instead of table for layout a 
bad idea?


There were, I'm sure, many more occurrences of those problems than 
there were improper uses of em and strong. And the efforts to 
replace them are much older than the effort to get people who don't 
think about semantics to use b and i, which has hardly even started 
yet.


Ten years ago, the typical Web developer probably didn't know what em 
and strong were. Now, the typical Web developer probably thinks that 
b and i are dirty and that XHTML is the future. This does not mean 
all is lost, it just means the standards advocates oversteered. Time 
for another adjustment.



...
Insisting on the difference of i and em is not without harm, 
because arguing about which one to use is not without opportunity 
cost.

...


"Not without" makes that statement look more profound than it is.

--
Matthew Paul Thomas
http://mpt.net.nz/



Re: [whatwg] contenteditable, em and strong

2007-01-10 Thread Henri Sivonen

On Jan 9, 2007, at 23:29, Benjamin Hawkes-Lewis wrote:


Henri Sivonen wrote:


My conclusion is that semantic markup has failed in this case.


Semantic markup has barely been tested in this case. For the most
part, users have been force-fed broken markup by deceptive user
interfaces.


Sure. But is it realistic to expect this to change? What expected  
payoff is there for tool vendors for providing a non-deceptive UI for  
em and strong?



An actual test would have been to provide people with a widespread
interface that accurately reported that they were emphasizing rather
than italicizing.


Part of the overall test is that such UIs haven't been launched  
with success in the last 14 years.



strong and b are both primarily used to achieve
bold rendering on the visual media. Regardless of which tags authors
type or which tags their editor shortcuts produce, authors tend to
think in terms of encoding italicizing and bolding instead of
knowingly articulating their profound motivation for using italics or
bold.


Yes, it's a bad habit picked up from WYSIWYG word processing. If people
were still habituated to typewriters you'd be insisting on the intrinsic
utility of <u>. ;)


More to the point, there is utility in being able to typeset a word  
or two differently in a paragraph. In theory, that's em. But in  
practice the choice between em and strong is motivated by the  
default visual rendering.


Therefore, the situation is that there are two semantic elements  
for making a piece of text different: em and strong. The choice  
depends on the preferred default visual rendering: italic vs. bold.  
In practice this isn't any different from saying that the semantics  
of i and b are to set text differently with default visual  
renderings being italics and bold.



Even those who have heard about the theoretical reasons for
using em and strong


[snip]


em, strong, i and b have all been in HTML for over a decade.
I think that’s long enough to see what happens in the wild. I think it
is time to give up and admit that there are two pairs of
visually-oriented synonyms instead of putting more time, effort, money,
blog posts, spec examples and discussion threads into educating people
about subtle differences in the hope that important benefits will be
realized once people use these elements the “right” way.


If we accepted that only a few people have heard about the theoretical
advantages of em and strong, wouldn't that suggest that the web
standards community has not done enough communicating, not that
communication has been understood but ineffective because its
prescriptions are somehow impractical?


Perhaps, but what's the payoff of vehemently communicating more about  
this? Is it worth it? Would there be a different way to get the same  
payoff?



There are consequences to using i and b instead of em and
strong. Being ambiguous, i and b are insufficient hooks for speech
CSS styling by the author, at least not without additional classes.)


em and i are exactly as stylable. strong and b are also  
equally stylable.



Because they are so ambiguous, talking UAs will have to announce those
elements as italic and bold rather than applying any specific aural
styling such as a different rate or pitch of speech. Because
announcements slow down reading speed much more than voice alterations,
it is likely that talking agent users will turn them off. Which means
their web experience will be ultimately degraded.


When an author presses command-i, he may not even know what markup is  
generated. The choice between i and b vs. em and strong is  
pretty much up to chance. This means that in *practice* em and  
strong are, on the real Web out there, about exactly as ambiguous  
as i and b. Since voice browsers aren't truly able to extract  
significance out of the choice of em vs. i (as the choice of  
element is largely up to chance), I conclude that reading them  
differently from each other isn't a particularly useful idea.


If i and b have been implemented in an annoying way for the aural  
media, isn't the conclusion that it would even be rational for  
WYSIWYG tool vendors to use em and strong for italics and bold to  
avoid annoyances on the aural media? (Moreover, as currently defined,  
em and strong have more versatile content models in XHTML5, which  
means tool vendors would have an additional incentive to emit those  
elements.)



I think using span with a style attribute is a bad idea in this case.
Italicizing a word or two in a paragraph is not incidental style that
could easily be considered optional.


Surely it /is/ an incidental style, since authors, publication houses,
and style guides vary in their preferences about when to italicize.
Surely it is the distinctions between foreign and native languages,
between emphasis and non-emphasis, between titles and non-titles, and so
forth, that are non-incidental, and that italicization imperfectly
reflects. The typography is not the message; it is only its shadow.

Re: [whatwg] contenteditable, em and strong

2007-01-10 Thread fantasai

Henri Sivonen wrote:

On Jan 9, 2007, at 23:29, Benjamin Hawkes-Lewis wrote:


Henri Sivonen wrote:


I think using span with a style attribute is a bad idea in this case.
Italicizing a word or two in a paragraph is not incidental style that
could easily be considered optional.


Surely it /is/ an incidental style, since authors, publication houses,
and style guides vary in their preferences about when to italicize.
Surely it is the distinctions between foreign and native languages,
between emphasis and non-emphasis, between titles and non-titles, and so
forth, that are non-incidental, and that italicization imperfectly
reflects. The typography is not the message; it is only its shadow.


Granted, but italics and bold are more sticky properties of the text 
than e.g. font family, font size or column width, so it is a mistake to 
treat all style properties as being equally incidental and expendable.



It is a more essential part of
the text that should be preserved when the content is formatted for a
different display environment possibly with a different font.


How would a different font conflict with its italicization?


It wouldn't. My point was that italic and bold are stickier or closer to 
being part of content than the font.


That depends, actually, on the language. Browsing the Chinese journal
section of a university East Asian Library, I noticed that the Chinese
journals didn't use normal/italics -- instead they switched the style of
font between their equivalents of serif and cursive. Granted these switches
were on a per-paragraph level in the text I saw, but East Asian typesetting
tends not to use italics in general. They have other means of indicating
emphasis: various underlining styles, bold, (in Japanese) a switch to katakana,
and emphasis marks which are placed above/below/beside individual characters
in an emphasized phrase. East Asian texts also don't use italics for works
titles: they have a set of special punctuation for that. You can argue that
italics and bold should be strictly equivalent to em and strong because all
that matters is that their presentation is the same, but that argument doesn't
hold up for non-Latin texts. Restyling em sitewide to use 'text-emphasis'
instead of 'font-style: italic' would be a nifty thing on a Japanese website.
Restyling i the same way would just be silly.

~fantasai


Re: [whatwg] contenteditable, em and strong

2007-01-10 Thread Matthew Paul Thomas

On Jan 10, 2007, at 9:31 PM, Henri Sivonen wrote:


On Jan 9, 2007, at 23:29, Benjamin Hawkes-Lewis wrote:


Henri Sivonen wrote:

...

strong and b are both primarily used to achieve
bold rendering on the visual media. Regardless of which tags authors
type or which tags their editor shortcuts produce, authors tend to
think in terms of encoding italicizing and bolding instead of
knowingly articulating their profound motivation for using italics or
bold.


Yes, it's a bad habit picked up from WYSIWYG word processing. If
people were still habituated to typewriters you'd be insisting on the
intrinsic utility of <u>. ;)


Robin Williams' /The Mac is not a typewriter/ -- which, if I recall, 
advises against underlining -- was first published in 1990 and is still 
in print. Probably the underlining of links quelled underlining for 
emphasis on the Web.


More to the point, there is utility in being able to typeset a word or 
two differently in a paragraph. In theory, that's em. But in 
practice the choice between em and strong is motivated by the 
default visual rendering.


I don't think there's anything wrong with that, in itself. It's shorter
than <emphasis class="italic"> and <emphasis class="bold">. :-)



...

em, strong, i and b have all been in HTML for over a decade.
I think that’s long enough to see what happens in the wild. I think
it is time to give up and admit that there are two pairs of
visually-oriented synonyms instead of putting more time, effort, money,
blog posts, spec examples and discussion threads into educating people
about subtle differences in the hope that important benefits will be
realized once people use these elements the “right” way.


If we accepted that only a few people have heard about the theoretical
advantages of em and strong, wouldn't that suggest that the web
standards community has not done enough communicating, not that
communication has been understood but ineffective because its
prescriptions are somehow impractical?


Perhaps, but what's the payoff of vehemently communicating more about 
this? Is it worth it? Would there be a different way to get the same 
payoff?


I think the problem is not with how few people have heard about the
theoretical advantages of em and strong, but with how many have got the
mistaken impression that they are replacements for and improvements on
i and b.


This is where we really need results from Google Markup Search (paging 
Mr Hickson): What proportion of pages use em and/or strong, what 
proportion of these appear to be generated using a Wysiwyg tool, what 
proportion also use i and/or b, and can a sample of their URLs be 
provided for the purpose of surveying how often em and strong are 
used inappropriately?
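
As a rough illustration of the per-page tally such a survey would need, here is a minimal sketch. It is not how any real markup search works; `countInlineElements` and `usesBothPairs` are hypothetical names, and counting opening tags with a regular expression is an assumption that only holds for reasonably well-formed markup.

```javascript
// Counts opening tags for the four elements under discussion.
// Hypothetical helper; a real survey would use an HTML parser.
function countInlineElements(html) {
  const counts = { em: 0, strong: 0, i: 0, b: 0 };
  // \b keeps <br>, <img>, <body> and similar names from matching.
  const openTag = /<(em|strong|i|b)\b[^>]*>/gi;
  for (const match of html.matchAll(openTag)) {
    counts[match[1].toLowerCase()] += 1;
  }
  return counts;
}

// True when a page mixes em/strong with i/b, one of the proportions asked for.
function usesBothPairs(counts) {
  return (counts.em + counts.strong) > 0 && (counts.i + counts.b) > 0;
}
```

Whether a given em or strong is used inappropriately still needs a human reading the sampled pages; the tally only narrows down which pages to look at.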


The message "please use <b> and <i> unless you really know what you're
doing, and generate <b> and <i> unless your users really know what
they're doing" is *not* well-known. It has not yet consumed much time,
effort, money, blog posts, spec examples or discussion threads. In the
absence of other evidence, I think it is worth trying.



There are consequences to using i and b instead of em and
strong. Being ambiguous, i and b are insufficient hooks for 
speech CSS styling by the author, at least not without additional 
classes.)


em and i are exactly as stylable. strong and b are also 
equally stylable.


Benjamin's statement would have been more accurate if he'd said "for
speech CSS styling by the screenreader", because a screenreader would
be more likely to specify different default intonations for em and
i than an author would. But even if there are any screenreaders yet
that make such a distinction (are there any? I forget), that's a very
small benefit for a very small audience. Fantasai's example of emphasis
in Chinese text is much more interesting.


--
Matthew Paul Thomas
http://mpt.net.nz/


Re: [whatwg] contenteditable, em and strong

2007-01-10 Thread mail
I've been reading this discussion and I do not get the point. It looks
like we are discussing the traditional bold button, but to my mind
we should discuss the logic behind that button.
First of all I want to state that to my mind Alexey Feldgendler was
absolutely right when he said: "WYSIWYG is always presentational because
its goal is to produce a document which is as close as possible to the
“original” that exists in the author's imagination."

So, the regular Joe who uses a WYSIWYG tool does not care about semantics.
He actually thinks in terms of presentation and just wants to make the
selection bold. This is how Word does it, how OpenOffice does it and how
any typewriter does it.

The question is, what logic lies beneath the typical bold button? I've
developed a lot of WYSIWYG tools and analyzed a lot of others. Almost
every tool I have seen just uses the bold execCommand:

execCommand('bold', false, null);

So the question is, what should this command generate? As the name of the
command states, a request is made to make the selection bold. So use a b
element! I suggest extending the Command Identifier List to also allow
'important' and 'emphasis'. The latter should use an em element and the
former the strong element.

execCommand('important', false, null);
execCommand('emphasis', false, null);

Now that we have defined the right purpose of the specific command
identifiers, it is up to the author of the WYSIWYG tool to decide which
command to use. An author who knows about semantics and has read this
discussion would use the bold command for the bold button and the italic
command for the italic button.

To emphasize text or to mark text as important, extra buttons or menu items
should be used. The way the traditional bold and italic buttons are being
used should not be altered. That would be inconsistent.
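
The mapping proposed above can be sketched as a small lookup table. This is only an illustration under stated assumptions: 'important' and 'emphasis' are the identifiers suggested in this message, not identifiers any browser is known to accept, and `COMMAND_ELEMENTS`, `elementForCommand` and `wrapSelection` are hypothetical names that do not touch the real execCommand API.

```javascript
// Hypothetical table: traditional identifiers map to presentational
// elements, the proposed identifiers map to the semantic ones.
const COMMAND_ELEMENTS = Object.freeze({
  bold: 'b',
  italic: 'i',
  important: 'strong', // proposed in this message, not an existing identifier
  emphasis: 'em',      // proposed in this message, not an existing identifier
});

// Returns the element name for a command identifier, or null if unknown.
function elementForCommand(commandId) {
  const key = String(commandId).toLowerCase();
  return Object.prototype.hasOwnProperty.call(COMMAND_ELEMENTS, key)
    ? COMMAND_ELEMENTS[key]
    : null;
}

// Wraps already-escaped text in the element chosen for the command;
// unknown commands leave the text unchanged.
function wrapSelection(commandId, text) {
  const name = elementForCommand(commandId);
  return name === null ? text : `<${name}>${text}</${name}>`;
}
```

With a table like this, the bold button keeps calling the bold command and a separate "Important" menu item calls the proposed one, so neither button changes meaning.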

--cheers



Re: [whatwg] contenteditable, em and strong

2007-01-10 Thread Simon Pieters

Hi,

From: Henri Sivonen [EMAIL PROTECTED]
Two of the four implementations that the WHATWG cares about interoperate.
Is it worthwhile to disrupt that situation -- especially considering
that changes to Trident are the hardest for the WHATWG to induce?


Does the interoperability matter much in this case?

My conclusion is that semantic markup has failed in this case. em and
i are both used primarily to achieve italic rendering on the visual
media. strong and b are both primarily used to achieve bold rendering
on the visual media. Regardless of which tags authors type or which tags
their editor shortcuts produce, authors tend to think in terms of encoding
italicizing and bolding instead of knowingly articulating their profound
motivation for using italics or bold. Even those who have heard about the
theoretical reasons for using em and strong tend to decide which one
to use based on which one has the preferred default visual presentation
for the case at hand.


em, strong, i and b have all been in HTML for over a decade. I
think that’s long enough to see what happens in the wild. I think it
is time to give up and admit that there are two pairs of visually-oriented
synonyms instead of putting more time, effort, money, blog posts, spec
examples and discussion threads into educating people about subtle
differences in the hope that important benefits will be realized once
people use these elements the “right” way.


Compare with: http://ln.hixie.ch/?start=1137799947&count=1


Well... in that case strong needs to be defined as being equivalent to b 
and em equivalent to i, and the ability to mark things as being 
important or as stress emphasis is lost. Personally I don't want that, I'd 
rather have IE emit the wrong thing for a while longer and the others do it 
right.


That people misuse em and strong doesn't mean that we have to give up 
and define them differently; if it were then we would probably also have to 
define table and even HTML as a whole to be a visual layout tool.


However as it is now the spec sort of contradicts itself -- it says strong 
must only be used to denote importance yet the contenteditable bold 
feature will emit strong.



[...]


Regards,
Simon Pieters




Re: [whatwg] contenteditable, em and strong

2007-01-10 Thread Henri Sivonen

On Jan 10, 2007, at 11:40, fantasai wrote:


That depends, actually, on the language. Browsing the Chinese journal
section of a university East Asian Library, I noticed that the Chinese
journals didn't use normal/italics -- instead they switched the style of
font between their equivalents of serif and cursive.


Isn't that a use case for reintroducing <font> with serif mapping to
mincho and sans-serif mapping to gothic? ;-)


Granted these switches were on a per-paragraph level in the text I saw,
but East Asian typesetting tends not to use italics in general.


I am aware of this. But the practically locked-in Web-compatible UA  
style sheet italicizes em, so East Asian Web authors need to deal  
with that default.


They have other means of indicating emphasis: various underlining styles,


Is there data on <u> usage on East Asian pages? Should HTML5
legitimize <u>? (For Latin pages, a restyled <u> would be more
compatible than <m>.)



bold,


Seems like a case for keeping b around.


(in Japanese) a switch to katakana,


Wouldn't a normal Japanese writer enter the text as katakana into the  
document content instead of requesting the UA to transform hiragana  
or even kanji to katakana?


East Asian texts also don't use italics for works titles: they have  
a set of special punctuation for that.


I hazard a guess that it is more straight-forward, practical and  
compatible to enter that punctuation in the document content than to  
restyle cite to insert the punctuation as generated content.



You can argue that italics and bold should be strictly equivalent to em
and strong because all that matters is that their presentation is the
same, but that argument doesn't hold up for non-Latin texts.


The way I see it is that speccing i and em as synonyms and b  
and strong as synonyms is harmless if pages written in scripts for  
which italics and bold are inapplicable don't use i and b.



Restyling em sitewide to use 'text-emphasis' instead of
'font-style: italic' would be a nifty thing on a Japanese website.


I agree.


Restyling i the same way would just be silly.


From a CSS perspective, there's no difference. If em and i were
defined to be semantically equivalent, there'd be no difference from
the semantic point of view either. That would leave the personal code
aesthetics that particular hand-coders associate with the identifiers
em and i. If an author who controls both markup and style chooses
one over the other, that's cool.


But that's still about site-wide styling. Is it too late for any of  
this to have an impact on the UA style sheet?


Would it be compatible with the Web to add the following to the UA  
style sheets of visual browsers?


em:lang(ja) {
  font-style: normal;
  text-emphasis: accent before;
}

em:lang(ja-Latn) {
  font-style: italic;
  text-emphasis: none;
}

If that would be compatible with the Web, would the following be?

em:lang(ja), i:lang(ja) {
  font-style: normal;
  text-emphasis: accent before;
}

em:lang(ja-Latn), i:lang(ja-Latn) {
  font-style: italic;
  text-emphasis: none;
}

--
Henri Sivonen
[EMAIL PROTECTED]
http://hsivonen.iki.fi/




Re: [whatwg] contenteditable, em and strong

2007-01-10 Thread Benjamin Hawkes-Lewis
Henri Sivonen wrote:

 Part of the overall test is that such UIs haven't been launched  
 with success in the last 14 years.

Well, the WYSIWYG paradigm has been dominant in user-space. But I have
pointed to alternatives like Lyx and Mellel. Those seem to be successful
at bringing semantic thought-processes into the word processing sphere,
where there has traditionally been less payoff. (Although now that ODF
and PDF are becoming accessible formats, some sort of semantic authoring
will have to become part of the WYSIWYG workflow.)

Whereas desktop applications are gradually innovating, much web
development has been obsessed with trying to mimic desktop applications
as closely as possible, rather than focusing on the potential of the web
as a medium. In other words, interface conservatism has been an
end-goal.

 In theory, that's em. But in  
 practice the choice between em and strong is motivated by the  
 default visual rendering.

In so far as this is true, it is dependent on people having learned the
proper visual rendering for foreign phrases, film titles, warnings,
and so forth, partly through reading print and largely through word
processing. While people may not consciously think about why they are
using bold or italic, some part of their brain must know, since
otherwise they would get it wrong exactly half the time. Extending that
learned behaviour to the web has certain advantages, but it also has
severe disadvantages. Because people are familiar with concepts like
"emphasis", "foreign word", and "book title", one could build an
interface around those instead, just as easily as around their familiarity
with the [I] button.

 Perhaps, but what's the payoff of vehemently communicating more about  
 this? Is it worth it? Would there be a different way to get the same  
 payoff?

Well, yes, we'd get a higher payoff from creating a reference
implementation and filing bugs on existing editors.

 em and i are exactly as stylable. strong and b are also  
 equally stylable.

em and strong are specific to emphasis. i and b are not. If you
want to apply one style for emphasis, another for foreign language
phrases, and another for citations (say), you'll therefore need to
employ additional classes. In which case, you might as well have just
used semantic markup in the first place.

 When an author presses command-i, he may not even know what markup is  
 generated. The choice between i and b vs. em and strong is  
 pretty much up to chance. This means that in *practice* em and  
 strong are, on the real Web out there, about exactly as ambiguous  
 as i and b. 

em and strong have been heavily misused thanks to exceptionally
inappropriate tools. But they've been /less/ heavily misused than other
HTML elements, such as table and blockquote. I think a good, if
depressing, question to think about is this: will HTML5 documents
generally continue the web's existing traditions of broken tag content
using tables for layout? If so, we might as well throw /all/ semantics
into microformats, which is practically what XHTML2 is doing, since all
elements defined in the spec will continue to be used for their
presentational effects not their semantic import. The different
semantics of em and i would be the least of our problems.

 Since voice browsers aren't truly able to extract  
 significance out of the choice of em vs. i (as the choice of  
 element is largely up to chance), I conclude that reading them  
 differently from each other isn't a particularly useful idea.

Documents authored by badly designed tools are likely to be inaccessible
in other ways too. You'd really have to take a sample of markup that is
generally accessible. I suspect you'd find that the em and i
distinction is rather more common there.

 If i and b have been implemented in an annoying way for the aural  
 media, isn't the conclusion that it would even be rational for  
 WYSIWYG tool vendors to use em and strong for italics and bold to  
 avoid annoyances on the aural media? 

Not really. If tool vendors used em and strong for italics and bold,
then AT and talking browser vendors would have to implement the same
annoying techniques for exposing em and strong, since voice emphasis
would be inappropriate.

 Granted, but italics and bold are more sticky properties of the text  
 than e.g. font family, font size or column width, so it is a mistake  
 to treat all style properties as being equally incidental and  
 expendable.

Agreed on column width and on the general idea (not all styles are
necessarily equal). But what about the use of different typefaces for
code and samp and so forth? What about the symbolic use of different
type sizes in mathematical text? What about the use of type size for
emphasis, or for Ruby annotations? 

--
Benjamin Hawkes-Lewis



Re: [whatwg] contenteditable, em and strong

2007-01-10 Thread Henri Sivonen

On Jan 10, 2007, at 14:40, Simon Pieters wrote:


From: Henri Sivonen [EMAIL PROTECTED]
Two of the four implementations that the WHATWG cares about
interoperate. Is it worthwhile to disrupt that situation --
especially considering that changes to Trident are the hardest for
the WHATWG to induce?


Does the interoperability matter much in this case?


If I was writing a cross-browser CMS with a contenteditable-based  
editor, I'd be seriously unhappy about what WebKit does. The  
differences between what IE, Opera and Firefox produce can be dealt  
with relatively easily, but it would still be uncool to have to deal  
with them. So, yes, interop would be desirable.
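
As a rough illustration of the "dealt with relatively easily" part,
here is a minimal server-side sketch that folds the variants mentioned
in this thread (em/strong, and a span carrying an inline bold or italic
style) into b and i. The canonical form, the Normalizer class and the
normalize function are assumptions made for this example, not anything
a browser or spec defines. Comments, doctypes and XML-style
self-closing tags are deliberately not handled.

```python
from html import escape
from html.parser import HTMLParser

# Assumed canonical form for this sketch: everything becomes b or i.
RENAME = {"em": "i", "strong": "b"}


def span_replacement(attrs):
    """Return 'b' or 'i' for a span with a bold/italic inline style, else None."""
    style = (dict(attrs).get("style") or "").replace(" ", "").lower()
    if "font-weight:bold" in style:
        return "b"
    if "font-style:italic" in style:
        return "i"
    return None


class Normalizer(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.span_stack = []  # replacement name (or None) for each open span

    def handle_starttag(self, tag, attrs):
        repl = span_replacement(attrs) if tag == "span" else None
        if tag == "span":
            self.span_stack.append(repl)
        name = repl or RENAME.get(tag, tag)
        if name in ("b", "i"):
            # Attributes on b and i are dropped in this sketch.
            self.out.append("<" + name + ">")
            return
        rendered = ""
        for key, value in attrs:
            rendered += " " + key + '="' + escape(value or "", quote=True) + '"'
        self.out.append("<" + name + rendered + ">")

    def handle_endtag(self, tag):
        if tag == "span" and self.span_stack:
            repl = self.span_stack.pop()
            if repl:
                self.out.append("</" + repl + ">")
                return
        self.out.append("</" + RENAME.get(tag, tag) + ">")

    def handle_data(self, data):
        # Character references were decoded by the parser; re-escape them.
        self.out.append(escape(data, quote=False))


def normalize(markup):
    """Return markup with em/strong/styled spans rewritten to i/b."""
    parser = Normalizer()
    parser.feed(markup)
    parser.close()
    return "".join(parser.out)


print(normalize("A <strong>bold</strong> and <em>italic</em> word"))
print(normalize('A <SPAN class="Apple-style-span" '
                'style="font-weight: bold;">bold</SPAN> word'))
```

Whether b/i or em/strong is the better canonical form is exactly what
the thread disputes; swapping the direction only means inverting RENAME
and the span mapping.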


Well... in that case strong needs to be defined as being  
equivalent to b and em equivalent to i, and the ability to  
mark things as being important or as stress emphasis is lost.


My point is that if the consumer of the markup cannot make practical  
use of the distinction, making the distinction on the producer side  
becomes pointless to the extent the production of markup is about  
communication with a consuming party. The ability of the producer to
use whatever private distinction he or she likes for styling wouldn't
be affected.


--
Henri Sivonen
[EMAIL PROTECTED]
http://hsivonen.iki.fi/




Re: [whatwg] contenteditable, em and strong

2007-01-10 Thread Henri Sivonen

On Jan 10, 2007, at 13:26, Matthew Paul Thomas wrote:

The message "please use b and i unless you really know what
you're doing, and generate b and i unless your users really
know what they're doing" is *not* well-known.


What's the expected payoff if the message is made well-known?

It has not yet consumed much time, effort, money, blog posts, spec  
examples or discussion threads. In the absence of other evidence, I  
think it is worth trying.


In that case, I suggest making the content models for b and i  
equally versatile as the content models for strong and em.  
Otherwise, authors and tool vendors will go with the elements with  
the more versatile content models just in case the versatility is  
ever needed.


--
Henri Sivonen
[EMAIL PROTECTED]
http://hsivonen.iki.fi/




Re: [whatwg] contenteditable, em and strong

2007-01-10 Thread Nicholas Shanks
Having come into this conversation halfway, I'd like to give my
opinions. In the following, 'default style' means in the UA's style
declarations for all documents of the language.


There should be three emphasis elements:

em  Increases emphatic semantics by one level. *No* default  
rendering style for visual media, default rendering for other media  
not specified.


i Equivalent semantics to em. Default rendering style for visual  
media is a language-dependent alternative glyph set of the same font  
family and weight (e.g. italic/курсив, oblique, カタカナ).  
Default rendering style for other media not specified (at least the  
same as em).


b Equivalent semantics to em. Default rendering style for visual  
media is the same font family and glyph collection, but higher  
weight. Default rendering style for other media not specified (at  
least the same as em, perhaps louder for aural).


The strong element is deprecated, replaced by nested levels of em
or its visual-specific variants.


Thus where visual presentation is important, i and b can be used  
semantically (they are equivalent) and em ignored. Where visual  
presentation is not important, em can be used without concern for  
what i should sound like.
The basic point is that em has no default rendering style,
discouraging its misuse for "i want italic text and people tell me
i is bad these days, so i'll use em".


- Nicholas.



Re: [whatwg] contenteditable, em and strong

2007-01-10 Thread Matthew Paul Thomas

On Jan 11, 2007, at 2:17 AM, Henri Sivonen wrote:


On Jan 10, 2007, at 13:26, Matthew Paul Thomas wrote:


The message "please use b and i unless you really know what
you're doing, and generate b and i unless your users really know
what they're doing" is *not* well-known.


What's the expected payoff if the message is made well-known?


As far as I know:
*   Better intonation for screenreaders.
*   Better heuristics for Google Glossary. (Continuing my example from
last month, whereas <p><b>foo:</b> bar</p> is likely a
definition, <p><strong>foo:</strong> bar</p> probably isn't. I'm
not *sure* that this is how Google Glossary works, but for example,
all its misdefinitions of the words "update" and "warning" are from
b, not strong.)
*   Easier styling for Chinese text.

I didn't know about the last one until yesterday, so I would not be 
surprised if there were others.


It has not yet consumed much time, effort, money, blog posts, spec 
examples or discussion threads. In the absence of other evidence, I 
think it is worth trying.


In that case, I suggest making the content models for b and i 
equally versatile as the content models for strong and em. 
Otherwise, authors and tool vendors will go with the elements with the 
more versatile content models just in case the versatility is ever 
needed.

...


Agreed. I also suggest that the first sentence of the usage notes for
b and i be toned down a bit, like this: "The b element should be
used when an author cannot find a more appropriate element, and should
be generated by authoring tools where users are unlikely to choose a
more appropriate element."


--
Matthew Paul Thomas
http://mpt.net.nz/



Re: [whatwg] contenteditable, em and strong

2007-01-10 Thread Simon Pieters

Hi,

From: Simon Pieters [EMAIL PROTECTED]
Well... in that case strong needs to be defined as being equivalent to 
b and em equivalent to i, and the ability to mark things as being 
important or as stress emphasis is lost.


Actually, when I think about it, the ability to express such semantics
*could* be moved to the class attribute, e.g. class="important" and
class="emphasis", with perhaps both being applicable to all of strong, b,
em and i, and perhaps some others too. Perhaps that will be better
understood by authors.


Or perhaps we don't need a way to express these semantics.

I don't know. I don't like giving up on things, though. :-( If it leads to 
this then adding em and strong to HTML was a mistake in the first place.


Regards,
Simon Pieters





Re: [whatwg] contenteditable, em and strong

2007-01-09 Thread Henri Sivonen

On Jan 8, 2007, at 20:21, Simon Pieters wrote:

I think it is no surprise that most UAs will implement this as
emitting em for CTRL+I and strong for CTRL+B, or similar
interfaces that imply that the user actually requested italics or
bold with (to the UA) unknown intended semantics. (IE and Opera
emit em and strong, Safari emits <SPAN class="Apple-style-span"
style="font-weight: bold;">.) I think i and b should be emitted
instead, and the above text should reflect that.


Two of the four implementations that the WHATWG cares about  
interoperate. Is it worthwhile to disrupt that situation—especially  
considering that changes to Trident are the hardest for the WHATWG to  
induce?


My conclusion is that semantic markup has failed in this case. em  
and i are both used primarily to achieve italic rendering on the  
visual media. strong and b are both primarily used to achieve  
bold rendering on the visual media. Regardless of which tags authors  
type or which tags their editor shortcuts produce, authors tend to  
think in terms of encoding italicizing and bolding instead of  
knowingly articulating their profound motivation for using italics or  
bold. Even those who have heard about the theoretical reasons for  
using em and strong tend to decide which one to use based on  
which one has the preferred default visual presentation for the case  
at hand.


em, strong, i and b have all been in HTML for over a decade.  
I think that’s long enough to see what happens in the wild. I think  
it is time to give up and admit that there are two pairs of
visually-oriented synonyms instead of putting more time, effort, money, blog  
posts, spec examples and discussion threads into educating people  
about subtle differences in the hope that important benefits will be  
realized once people use these elements the “right” way.


Compare with: http://ln.hixie.ch/?start=1137799947&count=1


Incidentally, this morning I came across this:
http://www.elementary-group-standards.com/html/and-all-that-html5-malarkey.html


In particular, this caught my eye: “b has been deprecated.  
Replacing b with strong suffices.”


(I checked the schema, and, sure enough, b was not there. It turns  
out that it wasn’t in the spec when I last reviewed that part of the  
schema. It is in the spec now.)


My point here is that the reaction was to simply replace b with  
strong.



As for Safari, I have noticed even stranger behavior. See
http://hsivonen.iki.fi/kesakoodi/clipboard/


I think using span with a style attribute is a bad idea in this case.  
Italicizing a word or two in a paragraph is not incidental style that  
could easily be considered optional. It is a more essential part of  
the text that should be preserved when the content is formatted for a  
different display environment possibly with a different font. Hence,  
it makes sense to use an italicizing element.



P.S. I don’t believe semantic markup to be axiomatically better than  
presentational markup. Semantic markup shouldn’t be an end in itself.  
When semantic markup doesn't serve a useful end, such as media  
independence or particularly useful data mining, in *practice*, there  
is no point.


--
Henri Sivonen
[EMAIL PROTECTED]
http://hsivonen.iki.fi/




Re: [whatwg] contenteditable, em and strong

2007-01-09 Thread Benjamin Hawkes-Lewis
Henri Sivonen wrote:

 My conclusion is that semantic markup has failed in this case.

Semantic markup has barely been tested in this case. For the most
part, users have been force-fed broken markup by deceptive user
interfaces. And, for the most part, developers haven't cared much about
semantic markup full stop.

An actual test would have been to provide people with a widespread
interface that accurately reported that they were emphasizing rather
than italicizing.

 strong and b are both primarily used to achieve  
 bold rendering on the visual media. Regardless of which tags authors  
 type or which tags their editor shortcuts produce, authors tend to  
 think in terms of encoding italicizing and bolding instead of  
 knowingly articulating their profound motivation for using italics or  
 bold. 

Yes, it's a bad habit picked up from WYSIWYG word processing. If people
were still habituated to typewriters you'd be insisting on the intrinsic
utility of u. ;)

 Even those who have heard about the theoretical reasons for  
 using em and strong 

[snip]

 em, strong, i and b have all been in HTML for over a decade.  
 I think that’s long enough to see what happens in the wild. I think it
 is time to give up and admit that there are two pairs of
 visually-oriented synonyms instead of putting more time, effort, money, blog  
 posts, spec examples and discussion threads into educating people
 about subtle differences in the hope that important benefits will be  
 realized once people use these elements the “right” way.

If we accepted that only a few people have heard about the theoretical
advantages of em and strong, wouldn't that suggest that the web
standards community has not done enough communicating, not that
communication has been understood but ineffective because its
prescriptions are somehow impractical?

But in any case blog posts, spec examples, and discussion threads are
(largely) a waste of time compared to fixing tools like WYMEditor to use
a more accurate user interface, and even that's a waste of time compared
to fixing the big WYSIWYG editors.

There are consequences to using i and b instead of em and
strong. (Being ambiguous, i and b are insufficient hooks for speech
CSS styling by the author, at least not without additional classes.)
Because they are so ambiguous, talking UAs will have to announce those
elements as italic and bold rather than applying any specific aural
styling such as a different rate or pitch of speech.  Because
announcements slow down reading speed much more than voice alterations,
it is likely that talking agent users will turn them off. Which means
their web experience will be ultimately degraded.

 Incidentally, this morning I came across this:
 http://www.elementary-group-standards.com/html/and-all-that-html5-malarkey.html

 In particular, this caught my eye: “b has been deprecated.  
 Replacing b with strong suffices.”

 (I checked the schema, and, sure enough, b was not there. It turns  
 out that it wasn’t in the spec when I last reviewed that part of the  
 schema. It is in the spec now.)

 My point here is that the reaction was to simply replace b with  
 strong.

This data point doesn't demonstrate what you think it does. Andy Clarke
(the author) was talking about a particular example (literally one
use-case), not a wholesale replacement of all instances of b. To quote
the markup he's talking about:

 <p>This morning I returned from a (literally) flying visit to New York
 where I had the very real pleasure of visiting my friends at <a
 href="http://www.aol.com">AOL</a> and speaking at their Design and
 Programming Offsite event. The visit was a memorable one, not only
 because I am a huge fan of the work that AOL are doing and I was able
 to spend time with some of their hugely creative designers.
 (<b>Update:</b> There are <a
 href="http://www.flickr.com/photos/misterbiscuit/301182515/in/set-72157594383080722/">a
  few photos on Flickr</a>.)</p>

strong is a reasonable choice to replace bold /in that instance/.
(Whether it's ideal is debatable, but it's clearly better than b.)

 I think using span with a style attribute is a bad idea in this case.  
 Italicizing a word or two in a paragraph is not incidental style that  
 could easily be considered optional.

Surely it /is/ an incidental style, since authors, publication houses,
and style guides vary in their preferences about when to italicize.
Surely it is the distinctions between foreign and native languages,
between emphasis and non-emphasis, between titles and non-titles, and so
forth, that are non-incidental, and that italicization imperfectly
reflects. The typography is not the message; it is only its shadow.

 It is a more essential part of  
 the text that should be preserved when the content is formatted for a  
 different display environment possibly with a different font.

How would a different font conflict with its italicization? Did you mean
in a UA like Lynx that doesn't support CSS?

--
Benjamin 

Re: [whatwg] contenteditable, em and strong

2007-01-09 Thread Leons Petrazickis

On 1/9/07, Henri Sivonen [EMAIL PROTECTED] wrote:

My conclusion is that semantic markup has failed in this case. em
and i are both used primarily to achieve italic rendering on the
visual media. strong and b are both primarily used to achieve
bold rendering on the visual media. Regardless of which tags authors
type or which tags their editor shortcuts produce, authors tend to
think in terms of encoding italicizing and bolding instead of
knowingly articulating their profound motivation for using italics or
bold. Even those who have heard about the theoretical reasons for
using em and strong tend to decide which one to use based on
which one has the preferred default visual presentation for the case
at hand.


A more general question is whether bold or italic are presentational.
Are they any more presentational than capitalization? Methinks the
assumption that capitalization is semantic while bold and italic are
presentational is a historical accident, not reality.

Imagine a world where ASCII only had lowercase characters. A different
font would have to be substituted for uppercase, just as a different
font now has to be substituted for italic or bold. A web browser in
such a world would have this presentational tag:
capitalize - Capitalizes the first letter of every word
And these semantic tags:
sentence - By default, capitalizes the first letter of the first word.
proper - By default, capitalizes the first letter of every word.

CSS would show up, and the semantic markup philosophy would catch on.
Adherents would proclaim capitalize Considered Harmful and urge
people to switch to sentence for sentences, proper for proper
nouns, and CSS spans for other uses. After all, different languages,
different dialects, different cultures all have different
capitalization practices. Different publishing houses capitalize
titles differently.

Instead of doing that, people just swapped proper in place of
capitalize. The adherents raged. What fools these people be. The
first word of a sentence is not a proper noun. We need to proselytize
more! But to no avail.

***

Capitalize, b bold, and i italicize are all intrinsic properties
of prose, just as br line breaks are intrinsic properties of poetry.
They can be abused:
<div><p><b><font size=+4>Dragons Be Here</b></div></p></font>
But using them mid-paragraph is not abuse. Their use should be neither
deprecated nor discouraged.

--
Leons Petrazickis


Re: [whatwg] contenteditable, em and strong

2007-01-09 Thread Anne van Kesteren
On Tue, 09 Jan 2007 22:43:09 +0100, Anne van Kesteren [EMAIL PROTECTED]  
wrote:

Compare with: http://ln.hixie.ch/?start=1137799947&count=1


You know, you're probably right. I'm just not there yet.


Compare with:  
http://web.archive.org/web/1997072703/www.webreview.com/97/04/11/feature/part2.html



--
Anne van Kesteren
http://annevankesteren.nl/
http://www.opera.com/


Re: [whatwg] contenteditable, em and strong

2007-01-09 Thread Benjamin Hawkes-Lewis
Leons Petrazickis inscribed:

 A more general question is whether bold or italic are presentational.
 Are they any more presentational than capitalization? Methinks the
 assumption that capitalization is semantic while bold and italic are
 presentational is a historical accident, not reality.

I agree entirely, although drawing a different conclusion, namely that
capitalization is indeed presentational. Note that capitalization also
has markup that helps clarify it: abbr, acronym (in HTML4 and
XHTML1), strong, hN for headings, cite, and microformats (for
proper names). The only thing which doesn't currently have any means of
clarification is the start and end of sentences.

 Imagine a world where ASCII only had lowercase characters. 

I enjoyed this imaginative exercise. :)

 Instead of doing that, people just swapped proper in place of
 capitalize. The adherents raged. What fools these people be. The
 first word of a sentence is not a proper noun. We need to proselytize
 more! 

I don't, however, find your fable persuasive, because it presents the
acceptance of markup as a dialectic between elite proselytization and
authorial pragmatism, whereas I would allot greater explanatory power to
the conservatism of tools and a certain disinterest on the part of tool
developers in the meaning of text content.

 Capitalize, b bold, and i italicize are all intrinsic properties
 of prose

Which is of course why modern editions of Latin texts are printed in all
capitals with no punctuation, why modern editions of eighteenth century
English texts use italic for quotations, and why audiobooks announce
italic whenever they come across a word in another language. Oh
wait...

I think there's some creative, but not productive, reinterpretation of
the word intrinsic going on here.

 But using them mid-paragraph is not abuse. Their use should be neither
 deprecated nor discouraged.

So why should font, center, and small be discouraged then?

--
Benjamin Hawkes-Lewis



Re: [whatwg] contenteditable, em and strong

2007-01-09 Thread Alexey Feldgendler
On Wed, 10 Jan 2007 01:20:50 +0100, Benjamin Hawkes-Lewis  
[EMAIL PROTECTED] wrote:



Instead of doing that, people just swapped proper in place of
capitalize. The adherents raged. What fools these people be. The
first word of a sentence is not a proper noun. We need to proselytize
more!



I don't, however, find your fable persuasive, because it presents the
acceptance of markup as a dialectic between elite proselytization and
authorial pragmatism, whereas I would allot greater explanatory power to
the conservatism of tools and a certain disinterest on the part of tool
developers in the meaning of text content.


What happened to b and i -- because of the tools -- isn't random.  
Every presentational markup that today's web contains has this very  
reason: WYSIWYG. This approach is by design targeted at producing a
document for presentation on one single, chosen medium (which is usually
either screen or paper). WYSIWYG is always presentational because its goal
is to produce a document which is as close as possible to the “original”  
that exists in the author's imagination. If the author has imagined  
boldface text, it means that he has already performed the irreversible  
mapping from semantics to presentation in his head, and there is no way to  
precisely map it back to semantics. And it never was a goal for WYSIWYG;  
the task of every WYSIWYG tool was to give the user the right buttons to  
press for bold, italic, and underlined. There are indeed different reasons  
why the author may want an italic font, but making a separate button for  
each of those reasons won't do any good because the interface between the  
author and the tool takes place after the conversion from semantics to  
presentation, and a choice of “semantic” buttons wouldn't make any sense  
at that point. What would happen is that authors would pick a random  
button out of those which produce italic rendering, and consider the  
tool's interface overcomplicated.


b and i are not alone here. Continuing the capitalization example, I
can say that text editors have always provided capitalization when the
user holds the Shift key (pretty much like Ctrl-B for bold). Having
several kinds of Shift keys for different purposes of capitalization  
(Start-of-sentence key, Proper-noun key, Acronym key) would not, in my  
opinion, help preserve more semantic information: the authors would pick  
the key to use randomly because it doesn't make any difference on the
medium this particular WYSIWYG tool targets.


The only radical way to make semantic markup work is to abandon WYSIWYG  
and start thinking in a media-independent way (or, to reuse the word,  
multimedia). I'm not sure if it's feasible on the scale of the entire web  
authoring community, and what model should replace WYSIWYG in that case.



--
Alexey Feldgendler [EMAIL PROTECTED]
[ICQ: 115226275] http://feldgendler.livejournal.com


[whatwg] contenteditable, em and strong

2007-01-08 Thread Simon Pieters

Hi,

The contenteditable spec says:

  Insert, and wrap text in, semantic elements

 UAs should offer a way for the user to mark text as
 having stress emphasis and as being important, and
 may offer the user the ability to mark text and blocks
 with other semantics.

I think it is no surprise that most UAs will implement this as emitting em
for CTRL+I and strong for CTRL+B, or similar interfaces that imply that the
user actually requested italics or bold with (to the UA) unknown intended
semantics. (IE and Opera emit em and strong, Safari emits <SPAN
class="Apple-style-span" style="font-weight: bold;">.) I think i and b
should be emitted instead, and the above text should reflect that.


Regards,
Simon Pieters
