Tushar Teredesai wrote:
Hi:

I stumbled across this message
<http://lists.gnu.org/archive/html/bug-grep/2004-12/msg00064.html>
from the Grep maintainer. It seems that we should not be using the
above configure option.

+1 here. The regex implementation included with grep is simply incorrect for ru_RU.KOI8-R (i.e., this affects the 6.1 book). Testcase:

echo -e '\300\n\301\n\302\n\303\n\304\n\305\n\306\n\307' | \
        LC_ALL=ru_RU.KOI8-R grep `echo -e '[\302-\304]'` | od -b

Correct output:

0000000 302 012 304 012 307 012
0000006

Incorrect output, currently in LFS:

0000000 302 012 303 012 304 012
0000006

Explanation:

Here is the beginning of the Russian alphabet (lowercase letters) with the corresponding bytes in KOI8-R:

cyrillic small letter a (а, \301)
cyrillic small letter be (б, \302)
cyrillic small letter ve (в, \327)
cyrillic small letter ge (г, \307)
cyrillic small letter de (д, \304)
cyrillic small letter e (е, \305)
cyrillic small letter zhe (ж, \326)

As you can see, the numeric order of the bytes is not the same as the alphabetical order in this locale. The testcase feeds grep the eight lowercase letters with the lowest byte values:

cyrillic small letter yu (ю, \300)
cyrillic small letter a (а, \301)
cyrillic small letter be (б, \302)
cyrillic small letter ce (ц, \303)
cyrillic small letter de (д, \304)
cyrillic small letter e (е, \305)
cyrillic small letter ef (ф, \306)
cyrillic small letter ge (г, \307)

and asks grep to pass only the letters between cyrillic small letter be and cyrillic small letter de. The range must be evaluated alphabetically, so the letters that pass should be cyrillic small letter be (б, \302), cyrillic small letter de (д, \304) and cyrillic small letter ge (г, \307). (Cyrillic small letter ve (в, \327) also lies in that range, but it is not part of the input.) Grep compiled without internal regex uses the LC_COLLATE locale category properly and outputs exactly that.

Grep compiled with internal regex compares the numerical values of the bytes instead and produces the incorrect result (\302, \303, \304).
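
For anyone who wants to check the two behaviours without having the ru_RU.KOI8-R locale installed, here is a rough Python sketch of the same comparison. It is not how grep or glibc work internally. It assumes that Unicode code point order is a good enough stand-in for the ru_RU LC_COLLATE rules for these eight letters, and the names koi8r_char, candidates, bytewise and alphabetical are made up for the illustration.

```python
# Model of the testcase: which of the bytes \300..\307 fall inside [\302-\304]?
# Assumption: for these lowercase letters, Unicode code point order matches
# Russian alphabetical order, so it stands in for the ru_RU LC_COLLATE rules.

def koi8r_char(byte):
    """Decode a single KOI8-R byte into its Unicode character."""
    return bytes([byte]).decode("koi8_r")

candidates = list(range(0o300, 0o310))  # the eight input bytes
lo, hi = 0o302, 0o304                   # range endpoints: be and de

# Comparison by byte value, which is what the incorrect output corresponds to.
bytewise = [b for b in candidates if lo <= b <= hi]

# Comparison by alphabetical position, which is what the locale calls for.
alphabetical = [b for b in candidates
                if koi8r_char(lo) <= koi8r_char(b) <= koi8r_char(hi)]

print("bytewise:    ", " ".join("%03o" % b for b in bytewise))
print("alphabetical:", " ".join("%03o" % b for b in alphabetical))
```

The byte-value list comes out as 302 303 304 and the alphabetical list as 302 304 307, which are the incorrect and correct od dumps shown earlier with the newlines (012) left out.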

The openi18n patch still has to be applied to grep in LFS 7.0 to pass the LSB testsuite.

--
Alexander E. Patrakov
--
http://linuxfromscratch.org/mailman/listinfo/lfs-dev
FAQ: http://www.linuxfromscratch.org/faq/
Unsubscribe: See the above information page