Bug#720482: grep: improper handling of unicode word boundaries
Control: tags -1 + fixed-in-experimental

On 22/08/13 at 16:41, a...@barak.in wrote:
> Package: grep
> Version: 2.12-2
> Severity: normal
>
> Dear Maintainer,
> regexp operators \< \> \b \B \w \W give wrong results with Unicode
> words.
>
> Example:
> $ echo "я" | grep -q "\<я"; echo $?
> 1
>
> I wrote a small test script (see attachment).
> Its result:
>     \<x  x\>  \bx  x\b  x\B  \Bx  \w   \W
> b:   0    0    0    0    1    1    0    1
> я:   1    1    1    1    0    0    1    0
> Σ:   1    1    1    1    0    0    1    0
> ä:   1    1    1    1    0    0    1    0
>

Hi,

grep 2.21 fixes these boundaries. For the moment, it is available in
experimental.

    \<x  x\>  \bx  x\b  x\B  \Bx  \w   \W
b:   0    0    0    0    1    1    0    1
я:   0    0    0    0    1    1    0    1
Σ:   0    0    0    0    1    1    0    1
ä:   0    0    0    0    1    1    0    1
a:   0    0    0    0    1    1    0    1

Cheers,

Santiago
Bug#720482: grep: improper handling of unicode word boundaries
Package: grep
Version: 2.12-2
Severity: normal

Dear Maintainer,
regexp operators \< \> \b \B \w \W give wrong results with Unicode
words.

Example:
$ echo "я" | grep -q "\<я"; echo $?
1

I wrote a small test script (see attachment).
Its result:
    \<x  x\>  \bx  x\b  x\B  \Bx  \w   \W
b:   0    0    0    0    1    1    0    1
я:   1    1    1    1    0    0    1    0
Σ:   1    1    1    1    0    0    1    0
ä:   1    1    1    1    0    0    1    0

-- System Information:
Debian Release: 7.1
  APT prefers stable
  APT policy: (500, 'stable')
Architecture: i386 (i686)

Kernel: Linux 3.2.0-4-686-pae (SMP w/2 CPU cores)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash

Versions of packages grep depends on:
ii  dpkg          1.16.10
ii  install-info  4.13a.dfsg.1-10
ii  libc6         2.13-38

grep recommends no packages.

Versions of packages grep suggests:
ii  libpcre3      1:8.30-5

-- no debconf information

[Attachment: s, Description: application/shellscript]
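
The attached script is not reproduced in this archive. For readers who want to repeat the check, here is a minimal sketch of a comparable test. It is not the reporter's script: the function name probe_row and the exact pattern list (\<x, x\>, \bx, x\b, x\B, \Bx, \w, \W, with x replaced by the character under test) are assumptions based on the operators named in the report. It also assumes GNU grep and a UTF-8 locale. Each number is grep's exit status, so 0 means the pattern matched and 1 means it did not.

```shell
#!/bin/sh
# Hypothetical re-creation of the kind of test described in the report.
# This is NOT the attached script; names and patterns are assumptions.

# probe_row CHAR
# Feed CHAR to grep as a one-character line and print grep's exit status
# (0 = match, 1 = no match) for each of the eight patterns, in order:
#   \<x  x\>  \bx  x\b  x\B  \Bx  \w  \W
probe_row() {
    c=$1
    row=
    for pat in "\\<$c" "$c\\>" "\\b$c" "$c\\b" "$c\\B" "\\B$c" '\w' '\W'; do
        printf '%s\n' "$c" | grep -q -e "$pat"
        row="$row $?"          # $? is the exit status of grep above
    done
    printf '%s\n' "${row# }"   # drop the leading space
}

# Print one row per test character, as in the tables in this thread.
for ch in b я Σ ä; do
    printf '%s: %s\n' "$ch" "$(probe_row "$ch")"
done
```

If the fix described in this thread is in effect, every row should read the same as the ASCII row for b. On an affected version such as 2.12-2, the non-ASCII rows would be expected to differ, as in the report.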