Package: grep
Version: 2.14-4
Severity: important

Dear Maintainer,

The regex '^\s*' matches nothing at all when LC_CTYPE is set to any
valid UTF-8 locale. It does match under most other settings, but not
all of them: the problem also affects 140 other locales and an empty
LC_CTYPE.

user@sid:~$ locale
LANG=en_US.UTF-8
LANGUAGE=
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=
user@sid:~$ echo 'whatever you want' | grep '^\s*'
user@sid:~$ echo 'whatever you want' | LC_CTYPE= grep '^\s*'
user@sid:~$ echo 'whatever you want' | LC_CTYPE=C grep '^\s*'
whatever you want
user@sid:~$ echo 'whatever you want' | LC_CTYPE=POSIX grep '^\s*'
whatever you want
user@sid:~$ echo 'whatever you want' | LC_CTYPE=FooBar grep '^\s*'
whatever you want
user@sid:~$ echo 'whatever you want' | LC_CTYPE=not-valid-utf8 grep '^\s*'
whatever you want
user@sid:~$ echo 'whatever you want' | LC_CTYPE=xx_XX.UTF-8 grep '^\s*'
whatever you want
user@sid:~$ echo 'whatever you want' | LC_CTYPE=de_DE.UTF-8 grep '^\s*'
user@sid:~$ echo 'whatever you want' | LC_CTYPE=es_ES.UTF-8 grep '^\s*'
user@sid:~$ echo 'whatever you want' | LC_CTYPE=fr_FR.UTF-8 grep '^\s*'
user@sid:~$ echo 'whatever you want' | LC_CTYPE=it_IT.UTF-8 grep '^\s*'
user@sid:~$ # and so on

More:
user@sid:~$ locale -a | wc -l
462
user@sid:~$ for x in $(locale -a | grep '\.utf8$'); do echo 'foobar' | LC_CTYPE=$x grep '^\s*'; done | wc -l
0
user@sid:~$ for x in $(locale -a | grep -v '\.utf8$'); do echo 'foobar' | LC_CTYPE=$x grep '^\s*'; done | wc -l
174
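
The counts above do not show which locales are affected. A minimal
sketch of a loop that lists them by name, assuming the same `locale -a`
output and an unset LC_ALL; `list_affected_locales` is a name made up
for illustration, not part of grep or libc:

```shell
# Hypothetical helper: print every installed locale under which
# grep '^\s*' fails to match a plain ASCII line.
list_affected_locales() {
    for x in $(locale -a); do
        # grep -q exits non-zero when no line matched
        if ! echo 'foobar' | LC_CTYPE="$x" grep -q '^\s*'; then
            echo "$x"
        fi
    done
}

list_affected_locales
```

On a grep without this bug the loop should print nothing, since '^\s*'
can match the empty string at the start of any line.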

Also note that the command behaves as expected if '^\s*' is replaced by
'^[[:space:]]*' or '^[[:blank:]]*' or '^[       ]*', or if \s is not at
the beginning of the regex.
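
As a possible workaround until this is fixed, the POSIX character class
can stand in for \s. A small sketch, where `match_leading_space` is a
hypothetical wrapper name chosen for this example:

```shell
# Hypothetical wrapper: match optional leading whitespace with the
# POSIX class [[:space:]] instead of the GNU extension \s.
match_leading_space() {
    grep '^[[:space:]]*'
}

echo 'whatever you want' | match_leading_space
```

Per the observation above, this form prints the input line regardless
of the LC_CTYPE setting.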

Cheers,
quidame


-- System Information:
Debian Release: jessie/sid
  APT prefers unstable
  APT policy: (500, 'unstable')
Architecture: i386 (i686)

Kernel: Linux 3.11-2-486
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash

Versions of packages grep depends on:
ii  dpkg      1.17.1
ii  libc6     2.17-96
ii  libpcre3  1:8.31-2

grep recommends no packages.

grep suggests no packages.

-- no debconf information

