Brian Smith wrote:
James M Snell wrote:
Brian Smith wrote:
I agree. If there are going to be formatting codes in the document
anyway, then why do we need a mechanism that duplicates their
functionality?
Because their use in markup is problematic and is actively discouraged
for a number of very good reasons.
Even the latest W3C guidelines are now encouraging the use of some formatting
characters (LRM and RLM) in addition to markup, where the previous guidelines
recommended markup only:
http://www.w3.org/TR/i18n-html-tech-bidi/#ri20030218.135304584
The guidelines state, very clearly, that the formatting characters
should not be used when markup can be used instead. The LRM and RLM
characters do not have markup alternatives and do not carry the same
risks as the other formatting characters. This recommendation is very
consistent, and is upheld by the Atom bidi draft, which provides a
markup alternative to using the formatting characters.
Furthermore, the guidelines also state a best practice to "[u]se bidi
markup only when necessary." and give the specific recommendation that
"[o]nce you have established the appropriate directionality for the
html element you will only need to apply bidi markup to a block element
if you want that element's directionality to be different." Moving
the dir attribute as far up in the document tree as possible "simplif[ies] the
document and reduce[s] bandwidth requirements."
So, Atom implementations have to be prepared to accept at least LRM and RLM in
documents, anyway.
Never said they didn't. All the atom bidi draft does is provide a
markup alternative in order to make life easier... which reminds me,
you still haven't explained why you think the dir attribute is more
complicated to implement than the bidi characters.
My hypothesis is that an implementation ignorant of BIDI issues is
more likely to preserve the formatting characters than Atom/XHTML/HTML
BIDI markup, especially when the effects of those formatting
characters never span multiple nodes in the document.
Have you tested this hypothesis using real editors? For example, in our
internal blogging environment, tags are entered in a single text box,
each tag separated by a comma. The system splits the tags into an
array and saves each tag separately.
Each tag becomes a separate atom:category element. Is the user
responsible for adding the appropriate formatting codes around each
individual tag? When the user wishes to edit the entry later, perhaps
to add a new entry, are they supposed to just know that there are
non-visual bidi formatting codes interspersed into the comma separated
list of tags?
When software breaks apart BIDI text and recombines it, it has to preserve
the BIDI formatting. In this case, the system that splits apart the tags
into an array and/or the system that recombines the tags into a comma-separated
list should transparently handle the formatting codes.
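As a rough illustration of what "transparently handle the formatting codes" could mean for the tag-splitting case, here is a minimal Python sketch. It assumes the tags arrive as one comma-separated string and that only the Unicode embedding and override codes (LRE, RLE, LRO, RLO, closed by PDF) are in play. The function name `split_tags` is hypothetical and not part of any system discussed in this thread. When a comma falls inside an open embedding, the sketch closes the embedding at the end of one tag and reopens it at the start of the next, so each tag stays balanced on its own.

```python
# Unicode bidi embedding/override codes and their terminator.
LRE, RLE, PDF, LRO, RLO = "\u202a", "\u202b", "\u202c", "\u202d", "\u202e"
OPENERS = {LRE, RLE, LRO, RLO}


def split_tags(text):
    """Split a comma-separated tag list, keeping each tag's bidi codes balanced.

    Illustrative sketch only: whitespace trimming and empty tags are
    deliberately not handled here.
    """
    tags = []
    current = []  # characters of the tag being built
    stack = []    # embedding codes that are currently open
    for ch in text:
        if ch == ",":
            # Close every open embedding so this tag stands on its own.
            current.append(PDF * len(stack))
            tags.append("".join(current))
            # Reopen the same embeddings at the start of the next tag.
            current = list(stack)
        else:
            if ch in OPENERS:
                stack.append(ch)
            elif ch == PDF and stack:
                stack.pop()
            current.append(ch)
    # Flush the last tag, closing anything still open.
    current.append(PDF * len(stack))
    tags.append("".join(current))
    return tags
```

Recombining the tags for editing would then be a plain `",".join(tags)`. Whether a user can be expected to cope with the invisible characters in the resulting text box is exactly the question being argued here, and the sketch does not settle it.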
Let me ask again: is the user responsible for adding the appropriate
formatting codes around each individual tag in the list? When the user
wishes to edit the entry later, are they supposed to just know that
there are non-visual bidi formatting codes interspersed into the comma
separated list of tags?
You're assuming that all users have the same requirements.
In our environment, a single feed may include entries from many
different users. We have group blogs where users from many different
locales have edit rights on any entry in the blog. Further, our users
use many different editors to write and manage their blog entries.
Asking those users to be mindful of how they're using bidi formatting
characters is a lot more difficult than what we currently do, which is
provide a simple check box to indicate whether or not the entry is
"right-to-left", which in turn, is translated into the appropriate
dir="rtl" in the markup.
Clients that do not understand the dir attribute simply ignore it, and
since our software is written so that only explicit changes in value
are recognized (e.g. a missing dir attribute does not mean the dir
attribute value has changed) we're able to work seamlessly with
editors that do not support the attribute.
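To make the "only explicit changes in value are recognized" rule concrete, here is a minimal Python sketch. The function names and the dict-based attribute representation are hypothetical and chosen only for illustration; they are not the actual server code. The sketch also assumes that an unchecked box simply emits no dir attribute, which the description above does not spell out.

```python
def dir_from_checkbox(is_rtl):
    """Translate the "right-to-left" check box into attributes for the entry.

    Assumption: an unchecked box emits no dir attribute at all.
    """
    return {"dir": "rtl"} if is_rtl else {}


def merge_dir(stored_dir, submitted_attrs):
    """Return the dir value to store after a client submits an edit.

    A client that does not understand dir sends the entry back without
    it. That absence is treated as "no change", so the stored value is
    kept. Only an explicitly submitted value replaces it.
    """
    if "dir" in submitted_attrs:
        return submitted_attrs["dir"]
    return stored_dir
```

With this rule, an entry stored as `dir="rtl"` keeps that value after a round trip through an editor that drops the attribute, while an editor that sends `dir="ltr"` changes it.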
In my suggested mechanism, this concern is only relevant for language-sensitive
attributes, and atom:name, since that is the only place where XHTML's BIDI
markup cannot be used. Even then, it only applies to the few cases where there
is a user that is editing part of an entry written in a RTL language, where
the Unicode BIDI algorithm fails to work for that part of the entry, they are
using software that doesn't support BIDI text entry (meaning they probably can't
read or write any RTL languages), and their change somehow still requires the
preservation of the RTL base directionality but doesn't require any other formatting
codes. I agree that "extremely unlikely" isn't the same as "impossible," but in
this case it seems pretty close.
Your solution is based on a lot of assumptions and possibilities. It
*may* work in *many* cases. It *likely* won't be a problem. It's
*possible* that clients will get it right. It also goes against
documented best practices and recommendations, defended only by an
untested hypothesis that it's "simple" and "easy". That's not acceptable
when we can do better.
The Atom bidi attribute is not the whole solution; it's just one part of
the larger picture, intended to fill in the gaps and make certain things
simpler. Further, it is consistent with the approach taken by XHTML and
the recommendations of both the W3C and the Unicode organization.
[snip]
RFC 4287 and RFC 5023 [are] pretty unclear about what is required to
be preserved.
RFC 5023, Section 9.3: To avoid unintentional loss of data when
editing Member Entries or Media Link Entries, an Atom Protocol client
SHOULD preserve all metadata that has not been intentionally modified,
including unknown foreign markup as defined in Section 6 of [RFC4287].
Seems pretty darn clear to me.
I will illustrate what I am saying with an example. An atom:link element
is not unknown foreign markup. So, I can remove atom:link elements whenever
I want. In particular, I can remove an atom:link element and then replace
it with another atom:link element that links to the same destination.
That doesn't violate the specification but the old element might have a
"dir" attribute and the new one might not.
Ok... and? The meaning of the text in the spec is clear: clients should
only change things they intentionally want changed. If the atom:link
element is replaced without the dir attribute, then obviously the only
reasonable interpretation the server can make is that the client wanted
that dir attribute to be changed.
[snip]
- James