On 10/7/16 1:28 PM, Sam Whited wrote:
On Fri, Oct 7, 2016 at 2:00 PM, William Fisher
<[email protected]> wrote:
I've been working on a Python implementation of the PRECIS specification
(https://github.com/byllyfish/precis-i18n).
Great timing; I was just this moment looking at your library and
considering including it in some interoperability tests for a set of
test vectors I'm working on.
Can someone please clarify whether:
A. The "Bidi rule" is ONLY applied to strings that contain right-to-left
characters.
B. The "Bidi rule" is applied to ALL strings.
The recent draft-ietf-precis-7613bis-03 clarifies this:
Directionality Rule: Apply the "Bidi Rule" defined in [RFC5893]
to strings that contain right-to-left characters (i.e., each of
the six conditions of the Bidi Rule must be satisfied); for
strings that do not contain right-to-left characters, there is no
special processing for directionality.
I was apparently confused about this before too; I'm applying the Bidi
rule all the time in the Go implementation. I'll be sure to add a test
for this in the test vectors when I publish them.
I'm actually not convinced this is the correct behavior though; it
seems confusing to me that usernames with RTL characters couldn't end
with punctuation, but usernames without them could.
There are plenty of RTL punctuation characters (e.g., U+05BE), and those
are allowed in RTL strings (even as the last character). RFC 5893 says
that an RTL string must not have an LTR character at the end, with the
result that an RTL string cannot end in "."; this helps to prevent
confusion in the typical presentation of domain names (which is the
target use case for RFC 5893).
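
To make the difference between options A and B concrete, here is a minimal
Python sketch of the six Bidi Rule conditions from RFC 5893, using only the
standard unicodedata module. It is not taken from precis-i18n or from the Go
implementation. The function names and the `always` flag are hypothetical,
and it assumes that "right-to-left characters" means Bidi classes R, AL, or
AN (the RTL-label definition in RFC 5893).

```python
import unicodedata

# Assumption: a "right-to-left character" has Bidi class R, AL, or AN.
RTL_CLASSES = {"R", "AL", "AN"}
# Classes permitted in both RTL and LTR labels (conditions 2 and 5).
COMMON = {"EN", "ES", "CS", "ET", "ON", "BN", "NSM"}


def has_rtl(s):
    """Return True if the string contains any right-to-left character."""
    return any(unicodedata.bidirectional(c) in RTL_CLASSES for c in s)


def bidi_rule(s):
    """Check the six conditions of the RFC 5893 Bidi Rule."""
    classes = [unicodedata.bidirectional(c) for c in s]
    if not classes:
        return False
    first = classes[0]
    # Condition 1: the first character must be L, R, or AL.
    if first not in {"L", "R", "AL"}:
        return False
    # Find the last character that is not a trailing NSM.
    end = len(classes)
    while classes[end - 1] == "NSM":
        end -= 1
    last = classes[end - 1]
    if first in {"R", "AL"}:
        # Condition 2: allowed classes in an RTL label.
        if not set(classes) <= {"R", "AL", "AN"} | COMMON:
            return False
        # Condition 3: an RTL label must end in R, AL, EN, or AN.
        if last not in {"R", "AL", "EN", "AN"}:
            return False
        # Condition 4: EN and AN must not both be present.
        return not ("EN" in classes and "AN" in classes)
    # Condition 5: allowed classes in an LTR label.
    if not set(classes) <= {"L"} | COMMON:
        return False
    # Condition 6: an LTR label must end in L or EN.
    return last in {"L", "EN"}


def directionality_ok(s, always=False):
    """Option A (default): apply the Bidi Rule only when RTL is present.
    Option B (always=True): apply the Bidi Rule to every string."""
    if always or has_rtl(s):
        return bidi_rule(s)
    return True


print(directionality_ok("abc."))                # option A, LTR-only string
print(directionality_ok("abc.", always=True))   # option B, same string
print(directionality_ok("\u05d0\u05d1."))       # Hebrew letters, then "."
print(directionality_ok("\u05d0\u05d1\u05be"))  # Hebrew letters, then U+05BE
```

Under these assumptions, "abc." passes with option A but fails with option
B, because "." has Bidi class CS and condition 6 requires a final L or EN.
The Hebrew string ending in "." fails condition 3 either way, while the one
ending in U+05BE (class R) passes.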
This violates the
principle of least surprise.
It's perhaps not advisable for you and me to speculate about what might
surprise a user whose native language is represented in a right-to-left
script.
Peter