On 04.04.2014 16:46, Gleb Smirnoff wrote:
> On Thu, Apr 03, 2014 at 01:34:33AM +0400, Andrey Chernov wrote:
> A> On 02.04.2014 21:15, Gleb Smirnoff wrote:
> A> > S> +	:lang=en_US.UTF-8:\
> A> > S> +	:charset=UTF-8:
> A> >
> A> > And I'd like to do same change for the 'russian' login class
> A> > in /etc/login.conf.
> A>
> A> Please everybody remember that we don't have UTF-8 collation
> A> implemented, just fallback to bytecode comparison.
>
> Any objections on checking in FreeBSD-compatible UTF-8 collation
> implementation from Alex Tutubalin?
>
> http://blog.lexa.ru/2008/03/03/freebsd_utf8_russian_collate_vtoraja_popitka.html
>
I have objections even to his "version 2". I already replied to Alex
about this in 2008. In short:

1) There is an error: almost all single chars above ASCII should be
"chains", i.e. two bytes minimum, since UTF-8 shares almost nothing
with ISO8859-1 beyond ASCII.

2) The table itself is very incomplete, e.g. it covers neither the
whole of KOI8-R, nor ISO8859-5, nor CP866. It is made from CP1251, with
all its restrictions. So switching from e.g. KOI8-R to UTF-8 will cause
a sorting regression. Russian UTF-8 collation should be able to sort
all the major Russian charsets mentioned, i.e. we need a combined table.

3) The "charmap map.ISO8859-1" declaration is missing (needed mainly
for using the mnemonic names of pure ASCII chars).

Even if the errors mentioned above are fixed and the code is committed
afterwards, we should understand that this approach (implementing
multibyte collation via single-byte collation), while possible, is a
big hack and slows sorting down by up to 10 times.

A proper "Unicode collation algorithm" is already implemented by ICU
and other projects. See http://unicode.org/reports/tr10/
It would be better if someone adopted it instead of hacks.

-- 
http://ache.vniz.net/
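
As a rough illustration of the byte-comparison fallback mentioned in
the quoted text, and of why chars above ASCII need at least two bytes
in UTF-8, here is a small Python sketch. It is not FreeBSD code; the
word list and the helper names bytewise_sorted and utf8_len are
assumptions made up for the demo.

```python
# Illustration only: emulate a bytewise (strcmp-like) comparison of
# UTF-8 strings. The sample words are an assumption chosen for the demo.
words = ["яблоко", "ёлка", "арбуз"]


def bytewise_sorted(items):
    """Sort strings by their raw UTF-8 bytes, as a fallback without
    real collation tables would."""
    return sorted(items, key=lambda s: s.encode("utf-8"))


def utf8_len(ch):
    """Return the number of bytes one character occupies in UTF-8."""
    return len(ch.encode("utf-8"))


if __name__ == "__main__":
    # Byte order follows code point order, not dictionary order.
    print(bytewise_sorted(words))
    # ASCII stays one byte; Cyrillic letters take two bytes each.
    print([utf8_len(c) for c in "zяё"])
```

With byte comparison, "ёлка" ends up after "яблоко", because "ё"
(U+0451) has a higher code point than "я" (U+044F), while dictionary
order puts "ё" right after "е". A real collation table has to correct
exactly this kind of case.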