forcemerge 12192 9365
thanks

Michael Stummvoll wrote:
> Hi GNU folks,
>
> as already known, tr cannot handle multibyte encodings like UTF-8:
>
>> mst@eddie:~$ echo "foo" | tr o ö
>> fÃÃ
>
> I know that multibyte encoding support is not needed for
> POSIX compliance, BUT:
>
> the manpage of tr says the following:
>
>> Translate, squeeze, and/or delete characters from standard input,
>> writing to standard output.
>
> and that's the inconsistency, IMHO.
>
> The typical interpretation of "character" in such a context is one
> character on display, regardless of which encoding is used or how many
> bytes are used to display it. So, if tr really translates "characters",
> it should preserve the encoding. If it doesn't, it does not
> translate "characters" but "bytes". So I see two ways:
>
> - add multibyte encoding support to tr
> or
> - change the manpage and help text to say "bytes" instead of "characters"
>
> Since it doesn't seem that anybody wants to add the support to tr, an
> update of the manpage would be the easier way to ensure consistency.

Thanks for the report. I'm merging this issue with the others that relate to tr and multi-byte support.
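
For anyone following the merged reports, here is a rough Python sketch of the difference the report describes. It is an illustration only, not how tr is implemented: the helper names translate_bytes and translate_chars are made up for this example, and the handling of sets of unequal length (extra set2 entries ignored, a short set2 padded with its last entry) is an assumption stated in the comments.

```python
# Illustration only: contrasts byte-wise and character-wise translation.
# translate_bytes and translate_chars are hypothetical helper names.

def translate_bytes(data: bytes, set1: bytes, set2: bytes) -> bytes:
    """Map each byte in set1 to the byte at the same position in set2."""
    # Assumption: a short set2 is padded with its last byte,
    # and any excess bytes in set2 are ignored.
    if len(set2) < len(set1):
        set2 = set2 + set2[-1:] * (len(set1) - len(set2))
    table = bytes.maketrans(set1, set2[:len(set1)])
    return data.translate(table)


def translate_chars(text: str, set1: str, set2: str) -> str:
    """Map each character in set1 to the character at the same position in set2."""
    # Same assumption about unequal lengths as above, applied to characters.
    if len(set2) < len(set1):
        set2 = set2 + set2[-1:] * (len(set1) - len(set2))
    table = str.maketrans(set1, set2[:len(set1)])
    return text.translate(table)


if __name__ == "__main__":
    # In UTF-8, "ö" is two bytes (0xC3 0xB6), so a byte-wise mapping of
    # "o" can only use the first of them.
    raw = translate_bytes("foo\n".encode("utf-8"), b"o", "ö".encode("utf-8"))
    print(raw)                              # the bytes actually produced
    print(raw.decode("latin-1"), end="")    # how a Latin-1 view renders them
    print(translate_chars("foo\n", "o", "ö"), end="")
```

The byte-wise result is not valid UTF-8, which matches the garbled output quoted in the report; the character-wise variant keeps the encoding intact.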
