Hi Theo,

Theo Buehler wrote on Thu, Dec 21, 2017 at 11:57:12PM +0100:
> Ingo Schwarze wrote:

>> So i really don't feel like adding a BUGS section, but instead i
>> think documenting that -i is intended as an ASCII-only feature is
>> the way to go.

> Yes, sounds reasonable.

Thanks for checking!
Changes to DESCRIPTION and ENVIRONMENT committed.

>> While here, profit from the opportunity to mention that uniq(1) is
>> intended to work on the level of codepoints, not on the level of
>> fully combined characters.

> ok, I think that's an improvement. Thanks.

I didn't commit the CAVEATS section because deraadt@ convincingly
pointed out that it was quite hard to understand.  Fully understanding
a run-of-the-mill section 1 manual page must not require knowledge
about Unicode terms like "normalization forms" and "canonical equivalence".

Besides, the way our base system utilities define "string equality"
for strings that may be either ASCII or UTF-8 is not specific to
uniq(1).  So i'll probably document that in another, central place,
quite possibly in locale(1) because that's where LC_CTYPE is defined,
so people are likely to look there for the gory details of UTF-8
handling, and also because using full Unicode terminology in that
place is less disruptive than in an innocent manual like uniq(1).
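
For illustration only, here is a minimal Python sketch of the distinction
discussed above: comparing strings codepoint by codepoint versus comparing
them after Unicode normalization, plus an ASCII-only case fold.  This is
not how uniq(1) is implemented; the function names are hypothetical and
exist only for this example.

```python
import unicodedata

# The same visible character "e with acute accent", encoded two ways:
precomposed = "\u00e9"   # one codepoint, U+00E9
decomposed = "e\u0301"   # "e" followed by combining acute accent U+0301


def same_codepoints(a, b):
    """Equality on the level of codepoints: sequences must match exactly."""
    return a == b


def canonically_equivalent(a, b):
    """Equality after bringing both strings into normalization form NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


def ascii_fold(s):
    """Hypothetical ASCII-only case fold: lower-case A-Z, leave the rest."""
    return "".join(chr(ord(c) + 32) if "A" <= c <= "Z" else c for c in s)
```

With these definitions, same_codepoints(precomposed, decomposed) is false
while canonically_equivalent(precomposed, decomposed) is true, and
ascii_fold() leaves non-ASCII letters such as U+00C9 untouched.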

Yours,
  Ingo