URL:
<http://savannah.gnu.org/bugs/?33271>
Summary: non-US characters and -o
Project: grep
Submitted by: golimar
Submitted on: Tue 10 May 2011 11:02:17 GMT
Category: None
Severity: 3 - Normal
Item Group: None
Status: None
Privacy: Public
Assigned to: None
Open/Closed: Open
Discussion Lock: Any
_______________________________________________________
Details:
I use egrep with a pattern that contains '.', because I want to
match any character in that position. The line is found, but when I
add -o to extract only the matching part, nothing is printed:
$ cat file.csv | egrep '1021_POZUELO_.C0_0179'
90240.00;1021_POZUELO_ÑC0_0179;ADM;1662
$ cat file.csv | egrep -o '1021_POZUELO_.C0_0179'
$
This other example works fine:
$ cat file.csv | egrep '1021.POZUEL'
90240.00;1021_POZUELO_ÑC0_0179;ADM;1662
90242.00;1021_POZUELO_CM0_0181;ADM;1662
$ cat file.csv | egrep -o '1021.POZUEL'
1021_POZUEL
1021_POZUEL
It seems that the non-US character (Ñ) is causing a problem for the -o option.
This happens with grep 2.6.3 on RHEL 6 x86_64.
I tried grep 2.7 and it is worse: '.' never matches the non-US character,
with or without -o:
$ cat file.csv | grep-2.7/src/egrep '1021_POZUELO_.'
90242.00;1021_POZUELO_CM0_0181;ADM;1662
$ cat file.csv | grep-2.7/src/egrep '1021_POZUELO_'
90240.00;1021_POZUELO_ÑC0_0179;ADM;1662
90242.00;1021_POZUELO_CM0_0181;ADM;1662
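One thing that may be worth checking (an assumption, not something this
report confirms) is whether the Ñ in file.csv is stored as a single
ISO-8859-1 byte or as a two-byte UTF-8 sequence, since a mismatch with the
locale's encoding could explain why '.' behaves differently. The sketch below
builds a hypothetical one-line sample with the Ñ written as the single byte
0xD1 and counts matches in the C locale, where '.' consumes exactly one byte:

```shell
# Hypothetical sample line; \321 (0xD1) is Ñ as a single ISO-8859-1 byte.
# Assumption: the report does not state the real encoding of file.csv.
sample=$(mktemp)
printf '90240.00;1021_POZUELO_\321C0_0179;ADM;1662\n' > "$sample"

# In the C locale every byte counts as one character, so each '.' in the
# pattern consumes exactly one byte of the input.
one_byte=$(LC_ALL=C grep -c '1021_POZUELO_.C0_0179' "$sample" || true)
two_bytes=$(LC_ALL=C grep -c '1021_POZUELO_..C0_0179' "$sample" || true)

echo "lines matching with one byte in that position:  $one_byte"
echo "lines matching with two bytes in that position: $two_bytes"

rm -f "$sample"
```

Running the same two grep commands against the real file.csv would show
which encoding it uses: a match for the one-dot pattern suggests a
single-byte encoding, a match for the two-dot pattern suggests UTF-8. The
result can then be compared with the output of `locale charmap`.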
_______________________________________________________
Reply to this item at:
<http://savannah.gnu.org/bugs/?33271>
_______________________________________________
Message sent via/by Savannah
http://savannah.gnu.org/