On Sat, 2014-04-05 at 05:35 +0400, Andrey Chernov wrote:
> On 04.04.2014 16:46, Gleb Smirnoff wrote:
> > On Thu, Apr 03, 2014 at 01:34:33AM +0400, Andrey Chernov wrote:
> > A> On 02.04.2014 21:15, Gleb Smirnoff wrote:
> > A> > S> +   :lang=en_US.UTF-8:\
> > A> > S> +   :charset=UTF-8:
> > A> > 
> > A> > And I'd like to make the same change for the 'russian' login class
> > A> > in /etc/login.conf.
> > A> 
> > A> Everybody, please remember that we don't have UTF-8 collation
> > A> implemented; it just falls back to byte-value comparison.
> > 
> > Any objections to checking in the FreeBSD-compatible[1] UTF-8 collation
> > implementation from Alex Tutubalin?
> > 
> > http://blog.lexa.ru/2008/03/03/freebsd_utf8_russian_collate_vtoraja_popitka.html
> > 
> 
> I have objections even to his "version 2". I already replied to Alex about
> this in 2008. In short:
> 1) There is an error: almost all single chars above ASCII should be
> "chains", i.e. two bytes minimum, since there are almost no intersections
> with ISO8859-1 as a UTF-8 subset.
> 2) The table itself is very incomplete, e.g. it covers neither the whole of
> KOI8-R, nor ISO8859-5, nor CP866. It is made from CP1251 with all its
> restrictions. So switching from, e.g., KOI8-R to UTF-8 will cause a sorting
> regression. Russian UTF-8 collation should be able to sort all the major
> Russian charsets mentioned, i.e. we need a combined table.
> 3) The "charmap map.ISO8859-1" declaration is missing (needed mainly for
> using mnemonic names for pure ASCII chars).
> 
> Even if the errors mentioned above are fixed and the code is committed
> afterwards, we should understand that this approach (implementing
> multibyte collation via the single-byte one), while possible, is a
> big hack and slows sorting down by up to 10 times.
> 
> A proper "Unicode collation algorithm" is already implemented by ICU and
> other projects. See
> http://unicode.org/reports/tr10/
> It would be better if someone adopted it instead of hacks.
> 
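
For readers following the thread, the byte-comparison fallback and the
"two bytes minimum" point can be illustrated with a small sketch. This is
only an illustration in Python, not FreeBSD code: bytewise_sorted is a
hypothetical helper name, and the three sample words are assumptions chosen
to make the effect visible.

```python
# Illustration only: byte-value comparison of UTF-8 strings versus the
# alphabetical order a Russian reader expects. bytewise_sorted is a
# hypothetical helper, not part of any FreeBSD or libc API.

def bytewise_sorted(words):
    """Sort strings by their raw UTF-8 bytes, as a byte-comparison fallback would."""
    return sorted(words, key=lambda w: w.encode("utf-8"))

words = ["ёж", "еда", "яма"]

# Each of these Cyrillic letters occupies two bytes in UTF-8.
for ch in "еёя":
    encoded = ch.encode("utf-8")
    print(ch, encoded.hex(), len(encoded))

# By byte value, 'ё' (U+0451, bytes d1 91) sorts after 'я' (U+044F, bytes d1 8f),
# although alphabetically it belongs right after 'е'.
print(bytewise_sorted(words))
```

With byte comparison the word starting with 'ё' ends up last, whereas a
dictionary would place it between the other two.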


If you have a different patch, I'd appreciate seeing it.  

Sean
