Richard,

On 2/1/2019 1:30 PM, Richard Wordingham via Unicode wrote:

> Language tagging is already available in Unicode, via the tag characters
> in the deprecated plane.

Recte:

1. Plane 14 is not a "deprecated plane".

2. The tag characters in the Tags block (U+E0000..U+E007F) are not deprecated. (They are used, for example, by UTS #51 to specify emoji tag sequences.)

3. However, the use of U+E0001 LANGUAGE TAG and the mechanism of using tag characters for spelling out language tags are explicitly deprecated by the standard. See: "Deprecated Use for Language Tagging" in Section 23.9 Tag Characters.

https://www.unicode.org/versions/Unicode11.0.0/ch23.pdf#G30427

and PropList.txt:

E0001         ; Deprecated # Cf       LANGUAGE TAG
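
To make the distinction in points 2 and 3 concrete, here is a minimal Python sketch. It is only an illustration: the helper names (parse_proplist_line, to_tag_chars, emoji_tag_sequence) are hypothetical, and the example assumes the England flag sequence from UTS #51 (U+1F3F4 followed by the tag characters for "gbeng" and U+E007F CANCEL TAG). It reads the PropList.txt line quoted above and builds an emoji tag sequence that never touches U+E0001.

```python
# The single PropList.txt line quoted in this message.
PROPLIST_LINE = "E0001         ; Deprecated # Cf       LANGUAGE TAG"

TAG_BASE = 0xE0000          # tag characters mirror ASCII at this offset
CANCEL_TAG = "\U000E007F"   # terminates an emoji tag sequence
BLACK_FLAG = "\U0001F3F4"   # base emoji for subdivision flags
LANGUAGE_TAG = "\U000E0001" # the one deprecated code point in the block


def parse_proplist_line(line):
    """Return (first, last, property) for one PropList.txt data line."""
    data = line.split("#", 1)[0]            # drop the trailing comment
    field, prop = (part.strip() for part in data.split(";"))
    first, _, last = field.partition("..")  # single code point or range
    return int(first, 16), int(last or first, 16), prop


def to_tag_chars(ascii_text):
    """Map printable ASCII (U+0020..U+007E) to the matching tag characters."""
    for ch in ascii_text:
        if not 0x20 <= ord(ch) <= 0x7E:
            raise ValueError(f"not representable as a tag character: {ch!r}")
    return "".join(chr(TAG_BASE + ord(ch)) for ch in ascii_text)


def emoji_tag_sequence(base, tag_text):
    """Build base + tag characters + CANCEL TAG (assumed UTS #51 shape)."""
    return base + to_tag_chars(tag_text) + CANCEL_TAG


first, last, prop = parse_proplist_line(PROPLIST_LINE)
england = emoji_tag_sequence(BLACK_FLAG, "gbeng")

print(f"U+{first:04X}..U+{last:04X} is {prop}")
print("sequence:", " ".join(f"U+{ord(c):04X}" for c in england))
print("uses LANGUAGE TAG:", LANGUAGE_TAG in england)
```

The tag characters themselves carry the emoji sequence; only U+E0001 and the language-tagging mechanism built on it are marked Deprecated.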

As I stated earlier: language tags should use BCP 47, and belong in the markup level, not in the plain text stream.

--Ken
