James M Snell wrote:
> Brian Smith wrote:
> > [snip]
> > I think it makes more sense to get people to get some working 
> > implementations that have informally agreed on some extensions,
> 
> I implemented extended bidi support in Apache Abdera months 
> ago; and our internal blogging environment supports bidi entries.

That is definitely a positive first step. Which other implementations does 
Abdera interoperate with on BIDI text?

> > How much of a problem is BIDI in Atom today? (This isn't a 
> > rhetorical question). 
> 
> It's a problem. Quite a few readers I have tested do not even 
> properly support bidi markup used in entry titles that use 
> (x)html.

That is exactly my point. People are not even implementing the current 
standards for BIDI text. Adding another standard to implement is not going to 
make the situation better.

> Since the values for those attributes 
> are typically provided by humans, relying on the software to 
> properly insert the unicode formatting codes (which aren't 
> recommended for markup in the first place) is problematic at best.

If the software cannot correctly insert the formatting codes into the document, 
how would it be able to insert the correct directionality markup?

> > Atom documents are almost never hand-entered, and there is 
> > already a specification in place for marking up BIDI and even
> > ruby text in general XML. The odds that clients and servers
> > are going to correctly implement this extension--except
> > those targeted directly towards BIDI users--seem pretty
> > low to me. Personally, it seems much easier to 
> > implement an existing BIDI markup mechanism (Unicode, 
> > XML, and/or XHTML) than a new standard.
> 
> What are you basing that on?

The Unicode/W3C guidelines (http://www.w3.org/TR/unicode-xml/#Bidi and 
http://www.w3.org/International/questions/qa-bidi-controls) say this:

* Use *XHTML* BIDI markup whenever possible.
* Otherwise *CSS* whenever possible.
* Otherwise, consider building BIDI markup into your markup schema.
* We have to support BIDI formatting codes anyway, since the above mechanisms 
don't solve all BIDI problems.

> There's really not much guess work 
> involved in the implementation and apps that choose not to implement 
> support will be no worse or better off than they are currently.

That is not true. Consider a feed aggregator. If it doesn't support Atom BIDI, 
then it will not correctly rewrite entries to handle an inherited "dir" 
attribute from the atom:feed element. Since the directionality is also 
inherited by text constructs, whenever the implementation passes the text 
construct to a rendering engine, it needs to rewrite that content to handle the 
inherited directionality.
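
As a rough illustration of the extra work, here is a minimal Python sketch of 
what an aggregator would have to do for plain-text entry titles. It assumes 
the draft's "dir" attribute is un-namespaced, takes the values "ltr" or "rtl", 
and is inherited from atom:feed through atom:entry down to the text construct. 
The function names and the sample feed are made up for the example; they are 
not taken from the draft or from any library.

```python
import html
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"

# Hypothetical sample feed: the feed is RTL, the second entry overrides it.
FEED = """<feed xmlns="http://www.w3.org/2005/Atom" dir="rtl">
  <entry>
    <title>\u05e9\u05dc\u05d5\u05dd</title>
  </entry>
  <entry dir="ltr">
    <title>Hello</title>
  </entry>
</feed>"""


def entry_title_dirs(feed_xml):
    """Return (title text, effective dir) for every entry title.

    The effective dir is the nearest "dir" attribute found on the title,
    its entry, or the feed, in that order (assumed inheritance rule).
    """
    root = ET.fromstring(feed_xml)
    feed_dir = root.get("dir")
    result = []
    for entry in root.findall(ATOM + "entry"):
        entry_dir = entry.get("dir", feed_dir)
        title = entry.find(ATOM + "title")
        if title is None:
            continue
        result.append((title.text, title.get("dir", entry_dir)))
    return result


def wrap_for_renderer(text, direction):
    """Rewrite plain text so an HTML renderer sees the inherited dir."""
    escaped = html.escape(text)
    if direction is None:
        return escaped
    return '<span dir="%s">%s</span>' % (direction, escaped)


for text, direction in entry_title_dirs(FEED):
    print(wrap_for_renderer(text, direction))
```

An aggregator that skips this step hands the renderer the bare title, with 
the feed-level directionality silently dropped.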

> Don't forget atom:link/@title and atom:category/@label. 
> atom:category/@term can also cause problems when 
> implementations use that value for display purposes.

I know that. But, the Atom BIDI draft does not eliminate all uses of BIDI 
formatting characters in these attributes, either. And, it doesn't specify BIDI 
support for language-sensitive content in Atom service documents, category 
documents, RSS feeds, or RSD documents.

> Arbitrary extension elements can also have problems.

The BIDI draft says it only applies to constructs that RFC4287 labeled 
"language sensitive." Accordingly, the BIDI draft does not apply to extension 
elements.

> The bidi draft doesn't attempt to solve all the i18n issues 
> with Atom. Ruby text is a problem for pretty much everything, 
> especially given the fact that most browsers don't have a
> clue how to properly render ruby text yet.  The bidi draft 
> rightfully focuses on one small part of the problem.

I agree that a narrow scope is good. But, a solution for Ruby text will also be 
applicable to BIDI, especially if that solution involves the reuse of XHTML 
markup and/or CSS. 

> 
> > It also doesn't solve the problem with atom:link/@title or 
> > other attributes that are language-sensitive.
> 
> Yes, it does.

The Atom BIDI draft does not provide a way of specifying different base 
directionalities for attributes on the same element, it doesn't eliminate all 
need for BIDI formatting characters in language-sensitive attribute values, and 
it doesn't provide a mechanism for discovering which (nested) extension 
elements and attributes are affected by the proposed Atom BIDI markup. 

If I implement RFC4287, the Unicode BIDI algorithm, XHTML BIDI, HTML BIDI, and 
the "Unicode in XML" guidelines, I will have pretty good BIDI support. It will 
require me to adhere to four different BIDI standards in addition to RFC4287. 
That is a lot of work already. Now, your Atom BIDI and URI template BIDI 
proposals add two more specifications that I would have to support--for a total 
of SIX standards to adhere to and resolve conflicts between, JUST to support 
BIDI text. And, even if I create well-formed documents adhering to all six 
standards, whenever I open them up in any of my text editors, or any feed 
reader, they will look wrong since nobody else is implementing all of those 
standards. I think that is totally unreasonable. Abdera might have amazing 
support for BIDI, for which you should be commended, but unless all Atom 
software is going to be implemented on top of Abdera, Abdera will not be able 
to reliably interoperate with anything. If we want to provide interoperable 
support for BIDI, we need to make it as simple to implement as possible.

My counter-proposal is simple:

* Use XHTML/HTML BIDI/Ruby markup whenever possible.
* Otherwise, use Unicode BIDI/Ruby formatting codes, such that matching pairs 
of formatting codes are fully contained within a single text or attribute node.
* Editors of new documents must be meticulous about inserting the proper markup 
and formatting codes.
* Processors of existing documents must be meticulous about preserving 
BIDI/Ruby markup and/or formatting codes whenever any part of the contained 
text is preserved.
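
The "matching pairs within a single node" rule is cheap to check. A minimal 
Python sketch, assuming only the embedding and override codes (LRE, RLE, LRO, 
RLO) paired with PDF are in play; the function names are invented for the 
example:

```python
# Unicode BIDI embedding/override openers and their shared terminator.
OPENERS = {"\u202a", "\u202b", "\u202d", "\u202e"}  # LRE, RLE, LRO, RLO
PDF = "\u202c"  # POP DIRECTIONAL FORMATTING


def is_balanced(node_text):
    """True if every opener in this one text/attribute node has its PDF."""
    depth = 0
    for ch in node_text:
        if ch in OPENERS:
            depth += 1
        elif ch == PDF:
            if depth == 0:
                return False  # PDF with nothing to close
            depth -= 1
    return depth == 0


def close_open_embeddings(fragment):
    """Append the PDFs a fragment is missing, e.g. after truncation."""
    depth = 0
    for ch in fragment:
        if ch in OPENERS:
            depth += 1
        elif ch == PDF and depth > 0:
            depth -= 1
    return fragment + PDF * depth


title = "\u202bABC\u202c (draft)"
print(is_balanced(title))                     # balanced as written
print(is_balanced(title[:3]))                 # truncation cut off the PDF
print(is_balanced(close_open_embeddings(title[:3])))
```

A processor that truncates or excerpts text would run something like 
close_open_embeddings on the result so the directionality does not leak into 
whatever follows.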

I recognize that this goes against the Unicode in XML guidelines. However, Atom 
already goes against the guidelines by having language-sensitive text in 
attribute values and other contexts where XHTML markup cannot be used.

- Brian

