On Mon, May 29, 2017 at 07:28:37PM +0200, Ingo Schwarze wrote:
> Hi Walter,
>
> Walter Alejandro Iglesias wrote on Mon, May 29, 2017 at 06:44:40PM +0200:
>
> > Are those wide char versions of C functions consistent enough to write
> > a separate implementation to be loaded when LC_TYPE is set to utf-8?
>
> Sure, you can rewrite the complete shell to use wchar_t * rather
> than char *, and if you do that, you can use the new code to handle
> ASCII as well, no need to have two copies.  But that would be a
> huge effort, even more error-prone than the small, careful adjustments
> we are doing now, and would have a number of additional downsides;
> among others, losing the ability to handle arbitrary bytes, while
> in UTF-8 mode.
>
> For an editor, going wchar_t might be better because having substantial
> amounts of UTF-8 in user input is a common case in some files that
> people edit.
>
> For a shell, editing strings that contain non-ASCII is not the main
> purpose.  Sure, it is nice if the command line is able to handle
> strings containing an occasional UTF-8 character.  But the main
> purpose of the shell remains to safely input and execute Unix-style
> command lines, where non-ASCII characters are a non-essential addition
> at best.

I totally agree with you, and that's exactly why I value your preserving
the ASCII version, not only of ksh but also of the editor.  I mostly use
vi, and keep nvi from packages at hand just for when I want to send mail
to family or edit my web site.

Thanks for your kind explanation.

>
> Yours,
>   Ingo
>
>
> For more details, see
> https://www.openbsd.org/papers/eurobsdcon2016-utf8.pdf
