https://qa.mandrakesoft.com/show_bug.cgi?id=578
Product: bash
Component: bash
Summary: char-range in case is always case-less (locale-related)
Version: 2.05b-9mdk
Platform: PC
OS/Version: All
Status: UNCONFIRMED
Severity: major
Priority: P2
AssignedTo: [EMAIL PROTECTED]
ReportedBy: [EMAIL PROTECTED]
CC: [EMAIL PROTECTED]
A few of my shell scripts stopped working correctly, and they corrupted
one of my databases, which is fed through filters based on bash/sed. Both
show the same problem: a char-range of [a-z] matches any char in A..Z too!!
A discussion of this problem on alt.os.linux.mandrake revealed:
(a) it works in mdk 8.2
(b) it works in mdk cooker
(c) it does not work in 9.0, its RCs, or its updates.
The bug is obviously locale-related, since modifying $LANG or $LC_COLLATE
changes the results, although they should be identical in every locale.
Note that neither of the two French settings is affected, but almost all
German settings are bad.
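To illustrate the locale dependence, here is a minimal sketch that pins
the locale for a single sed call. It assumes the C locale is available and
orders characters by byte value, so that a-z covers only lowercase ASCII.
The name strip_lower is a hypothetical helper, not something taken from the
affected scripts.

```shell
# strip_lower: hypothetical helper that deletes lowercase ASCII letters.
# LC_ALL=C applies to this one sed process only; the calling shell's
# locale settings are left untouched.
strip_lower() {
    LC_ALL=C sed -e 's,[a-z],,g'
}

echo :CORRECT: | strip_lower    # uppercase letters survive
echo :Mixed: | strip_lower      # only the lowercase letters are removed
```

Setting the locale per command like this is only a workaround for
individual scripts; it does not address the underlying problem.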
Here are the test routines to reproduce the problem in bash `case` and
in GNU sed. A correct result prints CORRECT; the tests should never
print MATCHED.
# sed: strip [a-z]; if the uppercase word vanishes too, "::" remains
for LANG in `locale -a`
do  printf "%22s\t" "$LANG"
    echo :CORRECT: | sed -e "s,[a-z],,g" -e "s,::,:MATCHED:,"
done
# bash case: an uppercase first char must not match [a-z]
for LANG in `locale -a`
do  printf "%22s\t" "$LANG"
    bash -c "case BigFirstChar in
        [a-z]*) echo MATCHED ;;
        *) echo CORRECT ;;
    esac"
done
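# same test, varying only the collation locale
export LC_COLLATE   # make sure the child bash inherits it
for LC_COLLATE in `locale -a`
do  printf "%22s\t" "$LC_COLLATE"
    bash -c "case BigFirstChar in
        [a-z]*) echo MATCHED ;;
        *) echo CORRECT ;;
    esac"
done | grep de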
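The loops above reuse LANG and LC_COLLATE as loop variables, which leaves
the invoking shell in whatever locale was listed last. A sketch that scopes
the setting to one child process could look like this. The name
check_locale is a hypothetical helper, and the sketch assumes that an empty
LC_ALL is treated as unset, so that LC_COLLATE takes effect.

```shell
# check_locale: hypothetical helper; prints MATCHED or CORRECT for the
# collation locale given as $1, without changing the caller's locale.
check_locale() {
    # An empty LC_ALL no longer overrides LC_COLLATE. Warnings about
    # unknown locale names are discarded.
    LC_ALL= LC_COLLATE="$1" bash -c 'case BigFirstChar in
        [a-z]*) echo MATCHED ;;
        *)      echo CORRECT ;;
    esac' 2>/dev/null
}

# Two locales that should exist everywhere; substitute `locale -a`
# for the list to sweep all installed locales.
for loc in C POSIX
do  printf '%22s\t%s\n' "$loc" "$(check_locale "$loc")"
done
```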
The last example has a `grep de` - and this is the (bad) result:
de                 MATCHED
de_AT              MATCHED
de_AT.ISO-8859-1   MATCHED
de_AT.ISO-8859-15  MATCHED
de_AT.UTF-8        MATCHED
de_BE              MATCHED
de_BE.ISO-8859-1   MATCHED
de_BE.ISO-8859-15  MATCHED
de_BE.UTF-8        MATCHED
de_CH              MATCHED
de_CH.ISO-8859-1   MATCHED
de_CH.UTF-8        MATCHED
de_DE              MATCHED
de_DE.ISO-8859-15  MATCHED
de_DE.UTF-8        MATCHED
de_LU              MATCHED
de_LU.ISO-8859-1   MATCHED
de_LU.ISO-8859-15  MATCHED
de_LU.UTF-8        MATCHED
deutsch            CORRECT
The default installation will therefore break existing
shell scripts and sed scripts, and any database system
using them will silently do the wrong thing when it
takes different actions depending on case. That is at least
what happened to me.
The problem is not easy to recognize because:
(a) most alpha-ranges are written [a-zA-Z], or are used to transform case
via y/a-z/A-Z/, where also matching the other case does no harm.
(b) the same lines as above, filtered with `grep fr`, will say
français CORRECT
french CORRECT
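For scripts that only need to tell the two cases apart, one possible way
around the range problem is the POSIX character class [[:lower:]], which
is defined by character type rather than by collation order. This is a
sketch, not a fix for the underlying bug, and starts_lower is a
hypothetical name.

```shell
# starts_lower: hypothetical helper; classifies the first character of $1
# using a character class instead of the range a-z.
starts_lower() {
    case $1 in
        [[:lower:]]*) echo lower ;;
        *)            echo other ;;
    esac
}

starts_lower BigFirstChar      # prints "other"
starts_lower smallFirstChar    # prints "lower"

# The same class works in a sed bracket expression:
echo :CORRECT: | sed -e 's,[[:lower:]],,g'    # prints ":CORRECT:"
```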
Please be so kind as to provide an update package for the 9.0 release.
(For the bug report I chose bash as the most important affected package,
although the problem lies somewhere deeper in the system.)