Currently, neither draft-ietf-idn-nameprep-07.txt nor draft-hoffman-stringprep-00.txt deals with bidirectionality issues, i.e. the mixing of right-to-left (Arabic/Hebrew) and left-to-right writing directions. This should be changed as soon as possible.
If a label can contain both right-to-left and left-to-right characters, how it will be displayed, and how displayed labels will be entered and looked up in the DNS, is highly context-dependent. This is obviously very undesirable.

The following is a proposal written up by Mark Davis, based on input from others:

>>>>
A. Characters are classified into RTL, LTR, DIGIT, OTHER. These categories are drawn from the BIDI algorithm. The precise lists of characters in each category would be added to NamePrep as an appendix. The composition is as follows (see <http://www.unicode.org/reports/tr9/#Bidirectional_Character_Types>).

    LTR := L ; # including LRM
    RTL := R | AL ;
    DIG := EN | AN ;
    OTH := all other characters: NSM, ON, etc.

Note: The characters in categories LRM, RLM, LRO, RLO, LRE, RLE, PDF, B, S, and some other BIDI categories are prohibited anyway.

B. In any field that contains any RTL characters:

    B0. no LTR characters can occur.
    C1. a sequence of characters of type DIG can only occur at the end.
    C2. a sequence of characters of type OTHER can occur only between characters of type RTL.
>>>>

I propose that this be added as an additional step after the current 'prohibition' step.

Regards, Martin.
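
For illustration only, rules B0, C1 and C2 of the proposal quoted above could be checked roughly as in the following Python sketch. It is not part of the proposal. The names classify and check_bidi_label are hypothetical, and the classification relies on Python's unicodedata tables rather than on the fixed character lists that would go into a NamePrep appendix. It also assumes the strictest reading of C2, namely that every run of OTH characters must have an RTL character immediately before and immediately after it, and it assumes that the prohibited characters have already been removed by the 'prohibition' step.

```python
import unicodedata
from itertools import groupby

# BIDI character types per category, following the composition in the proposal.
LTR_TYPES = {"L"}
RTL_TYPES = {"R", "AL"}
DIG_TYPES = {"EN", "AN"}


def classify(ch):
    """Map one character to LTR, RTL, DIG or OTH (hypothetical helper)."""
    bidi = unicodedata.bidirectional(ch)
    if bidi in LTR_TYPES:
        return "LTR"
    if bidi in RTL_TYPES:
        return "RTL"
    if bidi in DIG_TYPES:
        return "DIG"
    return "OTH"  # NSM, ON, ES, unassigned code points, etc.


def check_bidi_label(label):
    """Return True if the label passes rules B0, C1 and C2 as read here."""
    classes = [classify(ch) for ch in label]

    # Rule B only applies to fields that contain at least one RTL character.
    if "RTL" not in classes:
        return True

    # B0: no LTR characters can occur.
    if "LTR" in classes:
        return False

    # Collapse the label into runs of identical categories.
    runs = [kind for kind, _ in groupby(classes)]
    last = len(runs) - 1

    for i, kind in enumerate(runs):
        # C1: a run of DIG may only be the final run.
        if kind == "DIG" and i != last:
            return False
        # C2 (strict reading, an assumption): a run of OTH needs an RTL run
        # directly on both sides.
        if kind == "OTH":
            if i == 0 or i == last:
                return False
            if runs[i - 1] != "RTL" or runs[i + 1] != "RTL":
                return False

    return True
```

With this reading, a label of two Hebrew letters followed by "12" passes, while "12" followed by a Hebrew letter fails C1 and a Hebrew letter followed by "a" fails B0. Note that a hyphen between an RTL character and trailing digits is rejected here; whether that is intended depends on how C2 is finally worded.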
