Re: BIDI (was Proposal: Atomext WG)

James M Snell Mon, 07 Jan 2008 11:45:47 -0800



Brian Smith wrote:

James M Snell wrote:
Brian Smith wrote:
I agree. If there are going to be formatting codes in the document anyway, then why do we need a mechanism that duplicates their functionality?
Because their use in markup is problematic and is actively discouraged for a number of very good reasons.
Even the latest W3C guidelines are now encouraging the use of some formatting 
characters (LRM and RLM) in addition to markup, where the previous guidelines 
recommended markup only:

http://www.w3.org/TR/i18n-html-tech-bidi/#ri20030218.135304584

The guidelines state, very clearly, that the formatting characters should not be used when markup can be used instead. The LRM and RLM characters do not have markup alternatives and do not carry the same risks as the other formatting characters. This recommendation is very consistent, and is upheld by the Atom bidi draft, which provides a markup alternative to using the formatting characters.

Furthermore, the guidelines also state a best practice to "[u]se bidi markup only when necessary" and give the specific recommendation that "[o]nce you have established the appropriate directionality for the html element you will only need to apply bidi markup to a block element if you want that element's directionality to be different." Moving the dir attribute as far up in the document tree as possible "simplif[ies] the document and reduce[s] bandwidth requirements."
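
As a rough illustration of that recommendation, the sketch below removes dir attributes that only repeat the direction already inherited from an ancestor. The function name drop_redundant_dir and the sample document are hypothetical, and it assumes the attribute is a plain unnamespaced dir; it is not taken from the guidelines or from the Atom bidi draft.

```python
import xml.etree.ElementTree as ET


def drop_redundant_dir(element, inherited=None):
    """Remove dir attributes that merely repeat the inherited direction.

    Hypothetical helper: 'inherited' is the direction established by the
    nearest ancestor that carries a dir attribute (None at the root).
    """
    own = element.get("dir")
    if own is not None and own == inherited:
        # Same direction as the ancestor, so the attribute adds nothing.
        del element.attrib["dir"]
    effective = own if own is not None else inherited
    for child in element:
        drop_redundant_dir(child, effective)
    return element


doc = ET.fromstring(
    '<html dir="rtl"><body><p dir="rtl">a</p><p dir="ltr">b</p></body></html>'
)
drop_redundant_dir(doc)
paragraphs = doc.findall("./body/p")
```

After the call, only the root and the one paragraph whose direction differs still carry the attribute.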

So, Atom implementations have to be prepared to accept at least LRM and RLM in 
documents, anyway.

Never said they didn't. All the Atom bidi draft does is provide a markup alternative in order to make life easier... which reminds me, you still haven't explained why you think the dir attribute is more complicated to implement than the bidi characters.

My hypothesis is that an implementation ignorant of BIDI issues is more likely to preserve the formatting characters than Atom/XHTML/HTML BIDI markup, especially when the effects of those formatting characters never span multiple nodes in the document.
Have you tested this hypothesis using real editors? For example, in our internal blogging environment, tags are entered in a single text box, each tag separated by a comma. The system splits the tags into an array and saves each tag separately. Each tag becomes a separate atom:category element. Is the user responsible for adding the appropriate formatting codes around each individual tag? When the user wishes to edit the entry later, perhaps to add a new entry, are they supposed to just know that there are non-visual bidi formatting codes interspersed into the comma-separated list of tags?
When software breaks apart BIDI text and recombines it, it has to preserve the BIDI formatting. In this case, the system that splits apart the tags into an array and/or the system that recombines the tags into a comma-separated list should transparently handle the formatting codes.

Let me ask again: is the user responsible for adding the appropriate formatting codes around each individual tag in the list? When the user wishes to edit the entry later, are they supposed to just know that there are non-visual bidi formatting codes interspersed into the comma-separated list of tags?
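
To make the question concrete, here is a minimal sketch of what "transparently handling" the codes in a tag list might involve. Everything in it is an assumption for illustration: the helper names are hypothetical, tags are stored without embedding codes, and each right-to-left tag is re-wrapped in RLE ... PDF when the list is rebuilt. It does not describe any existing system.

```python
import unicodedata

# Unicode embedding controls: LRE, RLE and PDF (pop directional formatting).
LRE, RLE, PDF = "\u202a", "\u202b", "\u202c"
EMBEDDING_CONTROLS = {LRE, RLE, PDF}


def strip_embeddings(text):
    """Drop embedding controls; LRM/RLM and all visible text are kept."""
    return "".join(ch for ch in text if ch not in EMBEDDING_CONTROLS)


def base_direction(text):
    """Guess direction from the first strong character (default 'ltr')."""
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class in ("R", "AL"):
            return "rtl"
        if bidi_class == "L":
            return "ltr"
    return "ltr"


def split_tags(field):
    """Split the comma-separated text box into clean, separately stored tags."""
    tags = []
    for raw in field.split(","):
        tag = strip_embeddings(raw).strip()
        if tag:
            tags.append(tag)
    return tags


def join_tags(tags):
    """Rebuild the text box value, wrapping each RTL tag in RLE ... PDF."""
    wrapped = []
    for tag in tags:
        if base_direction(tag) == "rtl":
            wrapped.append(RLE + tag + PDF)
        else:
            wrapped.append(tag)
    return ", ".join(wrapped)


hebrew_tag = "\u05e9\u05dc\u05d5\u05dd"
stored = split_tags(RLE + hebrew_tag + PDF + ", atom")
rebuilt = join_tags(stored)
```

Even in this small form, the editing user still ends up with invisible characters in the text box, which is the concern raised above.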

You're assuming that all users have the same requirements. In our environment, a single feed may include entries from many different users. We have group blogs where users from many different locales have edit rights on any entry in the blog. Further, our users use many different editors to write and manage their blog entries. Asking those users to be mindful of how they're using bidi formatting characters is a lot more difficult than what we currently do, which is provide a simple check box to indicate whether or not the entry is "right-to-left", which, in turn, is translated into the appropriate dir="rtl" in the markup.
Clients that do not understand the dir attribute simply ignore it, and since our software is written so that only explicit changes in value are recognized (e.g. a missing dir attribute does not mean the dir attribute value has changed) we're able to work seamlessly with editors that do not support the attribute.
In my suggested mechanism, this concern is only relevant for language-sensitive attributes, and atom:name, since that is the only place where XHTML's BIDI markup cannot be used. Even then, it only applies to the few cases where there is a user that is editing part of an entry written in a RTL language, where the Unicode BIDI algorithm fails to work for that part of the entry, they are using software that doesn't support BIDI text entry (meaning they probably can't read or write any RTL languages), and their change somehow still requires the preservation of the RTL base directionality but doesn't require any other formatting codes. I agree that "extremely unlikely" isn't the same as "impossible," but in this case it seems pretty close.
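
The check box and the "only explicit changes are recognized" rule described above could be sketched roughly as follows. This is an illustration under stated assumptions, not the actual server code: build_title and merge_dir are hypothetical names, and dir is assumed to be a plain unnamespaced attribute on the Atom element.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"


def build_title(text, right_to_left):
    """Create an atom:title, adding dir="rtl" when the check box is ticked."""
    title = ET.Element("{%s}title" % ATOM_NS)
    title.text = text
    if right_to_left:
        title.set("dir", "rtl")
    return title


def merge_dir(stored, submitted):
    """Return the direction to keep after a client edit.

    A missing dir on the submitted element is treated as "no change",
    so clients that drop unknown attributes do not reset the direction.
    """
    explicit = submitted.get("dir")
    if explicit is not None:
        return explicit
    return stored.get("dir")


stored_title = build_title("original", right_to_left=True)
edited_title = build_title("edited by a dir-unaware client", right_to_left=False)
kept = merge_dir(stored_title, edited_title)

edited_title.set("dir", "ltr")
changed = merge_dir(stored_title, edited_title)
```

Here kept stays "rtl" because the client sent no dir at all, while changed becomes "ltr" because the client stated a new value explicitly.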

Your solution is based on a lot of assumptions and possibilities. It *may* work in *many* cases. It *likely* won't be a problem. It's *possible* that clients will get it right. It also goes against documented best practices and recommendations, defended only by an untested hypothesis that it's "simple" and "easy". That's not acceptable when we can do better.

The Atom bidi attribute is not the whole solution; it's just one part of the larger picture, intended to fill in the gaps and make certain things simpler. Further, it is consistent with the approach taken by XHTML and the recommendations of both the W3C and the Unicode organization.

[snip]
RFC 4287 and RFC 5023 [are] pretty unclear about what is required to be preserved.
RFC 5023, Section 9.3: To avoid unintentional loss of data when editing Member Entries or Media Link Entries, an Atom Protocol client SHOULD preserve all metadata that has not been intentionally modified, including unknown foreign markup as defined in Section 6 of [RFC4287].
Seems pretty darn clear to me.
I will illustrate what I am saying with an example. An atom:link element is not unknown foreign markup. So, I can remove atom:link elements whenever I want. In particular, I can remove an atom:link element and then replace it with another atom:link element that links to the same destination. That doesn't violate the specification but the old element might have a "dir" attribute and the new one might not.

Ok... and? The meaning of the text in the spec is clear: clients should only change things they intentionally want changed. If the atom:link element is replaced without the dir attribute, then obviously the only reasonable interpretation the server can make is that the client wanted that dir attribute to be changed.


> [snip]

- James
