https://qa.mandrakesoft.com/show_bug.cgi?id=578

           Product: bash
         Component: bash
           Summary: char-range in case is always case-less (locale-related)
           Version: 2.05b-9mdk
          Platform: PC
        OS/Version: All
            Status: UNCONFIRMED
          Severity: major
          Priority: P2
        AssignedTo: [EMAIL PROTECTED]
        ReportedBy: [EMAIL PROTECTED]
                CC: [EMAIL PROTECTED]


Several of my shell scripts stopped working correctly, and they corrupted
a database of mine that is fed through filters based on bash/sed. Both
show the same problem: a char-range of [a-z] matches any char in A..Z too!

Discussing this problem on alt.os.linux.mandrake revealed:
(a) it works in mdk 8.2
(b) it works in mdk cooker
(c) it does not work in 9.0, its RCs, or its updates.

The bug is evidently locale-related, since modifying $LANG or $LC_COLLATE
changes the result where every setting should behave identically. Note that
the two French settings are not affected, but almost all German settings are.
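
As a rough sketch of why either variable can change the outcome: under the
POSIX locale rules, LC_ALL overrides LC_COLLATE, which in turn overrides
LANG. The helper name effective_collate is hypothetical and only models
that precedence; it does not query the C library.

```shell
#!/bin/sh
# effective_collate is a hypothetical helper, not part of bash or glibc.
# Arguments: the values of LC_ALL, LC_COLLATE and LANG (any may be empty).
effective_collate() {
    if [ -n "$1" ]; then
        echo "$1"    # LC_ALL overrides every locale category
    elif [ -n "$2" ]; then
        echo "$2"    # LC_COLLATE overrides LANG for collation only
    elif [ -n "$3" ]; then
        echo "$3"    # LANG is the fallback for all categories
    else
        echo "C"     # assumption: nothing set falls back to the C locale
    fi
}

# Show which setting decides the collation order in the current shell.
effective_collate "$LC_ALL" "$LC_COLLATE" "$LANG"
```

If LC_ALL is set in the environment, the LC_COLLATE loop further down
would have no effect, which is worth checking before comparing results.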

Here are the test routines to reproduce the problem in `bash case` and
in `gnu sed`. A correct system never prints MATCHED; only CORRECT
indicates the expected behavior.

for LANG in `locale -a`
    do printf "%22s\t" $LANG
    echo :CORRECT: | sed -e "s,[a-z],,g" -e "s,::,:MATCHED:,"
done 

for LANG in `locale -a`
    do printf "%22s\t" $LANG
    bash -c "case BigFirstChar in
      [a-z]*) echo MATCHED ;;
      *) echo CORRECT ;;
    esac"
done 

for LC_COLLATE in `locale -a`
    do export LC_COLLATE  # the child bash only sees exported variables
    printf "%22s\t" $LC_COLLATE
    bash -c "case BigFirstChar in
      [a-z]*) echo MATCHED ;;
      *) echo CORRECT ;;
    esac"
done | grep de

The last example is filtered through `grep de`, and this is the (bad) result:

                    de  MATCHED
                 de_AT  MATCHED
      de_AT.ISO-8859-1  MATCHED
     de_AT.ISO-8859-15  MATCHED
           de_AT.UTF-8  MATCHED
                 de_BE  MATCHED
      de_BE.ISO-8859-1  MATCHED
     de_BE.ISO-8859-15  MATCHED
           de_BE.UTF-8  MATCHED
                 de_CH  MATCHED
      de_CH.ISO-8859-1  MATCHED
           de_CH.UTF-8  MATCHED
                 de_DE  MATCHED
     de_DE.ISO-8859-15  MATCHED
           de_DE.UTF-8  MATCHED
                 de_LU  MATCHED
      de_LU.ISO-8859-1  MATCHED
     de_LU.ISO-8859-15  MATCHED
           de_LU.UTF-8  MATCHED
               deutsch  CORRECT
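
For comparison, here is a minimal sketch of the same case test pinned to a
single locale through LC_ALL. The helper name case_test is hypothetical.
In the C locale the range [a-z] covers only the lowercase ASCII letters,
so that run gives a baseline to compare the other locales against.

```shell
#!/bin/sh
# case_test is a hypothetical helper name used only for this illustration.
# $1 is the locale to force (via LC_ALL) for the child bash.
case_test() {
    LC_ALL="$1" bash -c 'case BigFirstChar in
      [a-z]*) echo MATCHED ;;
      *) echo CORRECT ;;
    esac'
}

# Baseline: the C locale orders characters by byte value.
case_test C
```

Calling case_test with one of the affected names from the list above
should show whether that locale alone triggers the problem.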

The default installation will therefore break existing
shell scripts and sed scripts, and any database system
using them will silently do the wrong thing wherever it
takes different actions depending on case. At least that
is what happened to me.
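
One possible way for a script to avoid depending on the collation order
is to use the POSIX character class [[:lower:]] instead of the range
[a-z]. This is only a sketch: the helper names lower_first and
strip_lower are hypothetical, and it has not been verified on the
affected 9.0 system. Character classes are defined by LC_CTYPE rather
than by collation, so they should not pick up uppercase letters.

```shell
#!/bin/sh
# Both helper names are hypothetical, chosen for this illustration.

# Report whether the first character of $1 is a lowercase letter.
lower_first() {
    case $1 in
      [[:lower:]]*) echo MATCHED ;;
      *) echo CORRECT ;;
    esac
}

# Same idea as the sed test above, with a class instead of a range.
strip_lower() {
    printf '%s\n' "$1" | sed -e 's,[[:lower:]],,g' -e 's,::,:MATCHED:,'
}

lower_first BigFirstChar
lower_first smallFirstChar
strip_lower :CORRECT:
```

Scripts that only handle ASCII data could alternatively set LC_ALL=C for
the commands in question, at the cost of losing locale-aware behavior.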

The problem is not easy to spot because:
(a) most alpha-ranges are written as [a-zA-Z], or are used to transform case
    via y/a-z/A-Z/, where also matching the other case does no harm.
(b) the same loops as above, filtered with `grep fr` instead, print
              français  CORRECT
                french  CORRECT

Please be so kind as to provide an update package for the 9.0 release.

(For the bug report, I chose bash as the most important affected package,
 although the problem lies somewhere deeper in the system.)


