On March 6, 2015 8:39:41 PM GMT+01:00, Rich Felker <dal...@libc.org> wrote:
>I just tried building uclibc (via buildroot) on my Alpine Linux
>system, which is based on musl libc. I encountered some portability
>issues in the locale generation code which I'll report separately
>later, but the big issue I found here was a silent failure to generate
>wctables.h. Looking into gen_wctype.c, I found it's failing at this
>test:
>
>	if ((l != (short)l) || (u != (short)u)) {
>		verbose_msg("range assumption error! %x %ld %ld\n", c, l, u);
>		return EXIT_FAILURE;
>	}
>
>Apparently uclibc's locale system encodes an assumption that
>towlower/towupper map a character to an offset that fits in a signed
>16-bit offset. This assumption is false on the current versions of
>Unicode (and even fairly old ones); at least the following characters
>fail it:
>
>char    up   low  du dl
>0265 ɥ  0265 a78d 0  42280
>0266 ɦ  0266 a7aa 0  42308
>1d79 ᵹ  1d79 a77d 0  35332
>
>Presumably the only reason uclibc's gen_wctype works right now on
>glibc-based hosts is that glibc's Unicode alignment is severely
>outdated. This is about to change; see
>https://sourceware.org/ml/libc-alpha/2014-11/msg00664.html and the
>thread that extends into the following months. So without a fix,
>uclibc will probably break soon "in the wild".
>
>A suitable replacement assumption might be that towupper/towlower stay
>in the same "plane", so that instead of a signed 16-bit offset, uclibc
>could use an unsigned 16-bit replacement of the low 16 bits. I have no
>idea how practical this might be to implement but if it works it would
>at least avoid increasing the size of the tables.
Mhm. Thanks a lot for the heads-up!

Cheers,