[bug #33271] non-US characters and -o

Goli Mar Tue, 10 May 2011 03:54:47 -0700

URL:
  <http://savannah.gnu.org/bugs/?33271>


                 Summary: non-US characters and -o
                 Project: grep
            Submitted by: golimar
            Submitted on: Tue 10 May 2011 11:02:17 GMT
                Category: None
                Severity: 3 - Normal
              Item Group: None
                  Status: None
                 Privacy: Public
             Assigned to: None
             Open/Closed: Open
         Discussion Lock: Any

    _______________________________________________________

Details:


I am using egrep with a pattern that contains '.', because I want to
match any character at that position. The pattern does match the line, but
when I add -o to print only the matching part, nothing is printed:

$ cat file.csv | egrep '1021_POZUELO_.C0_0179'
90240.00;1021_POZUELO_ÑC0_0179;ADM;1662
$ cat file.csv | egrep -o '1021_POZUELO_.C0_0179'
$


This other example works fine:

$ cat file.csv | egrep '1021.POZUEL'
90240.00;1021_POZUELO_ÑC0_0179;ADM;1662
90242.00;1021_POZUELO_CM0_0181;ADM;1662
$ cat file.csv | egrep -o '1021.POZUEL'
1021_POZUEL
1021_POZUEL

It seems that the non-US character (Ñ) is causing a problem for the -o option.
This happens with grep 2.6.3 on RHEL 6 x86_64.
I also tried grep 2.7 and it is worse: '.' never matches the non-US character,
with or without -o:


$ cat file.csv | grep-2.7/src/egrep '1021_POZUELO_.'
90242.00;1021_POZUELO_CM0_0181;ADM;1662
$ cat file.csv | grep-2.7/src/egrep '1021_POZUELO_'
90240.00;1021_POZUELO_ÑC0_0179;ADM;1662
90242.00;1021_POZUELO_CM0_0181;ADM;1662
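
A self-contained way to reproduce this without file.csv might look like the sketch below. It rests on assumptions that are not stated in this report: the encoding of file.csv and the locale in use are unknown, so the script builds the two sample lines in both ISO-8859-1 and UTF-8 (the temporary file names are made up) and first checks how '.' behaves in the C locale, where every byte counts as one character. The last two commands run under whatever locale is currently set, and their output is deliberately not predicted here.

```shell
#!/bin/sh
# Hypothetical reproduction sketch; file names and encodings are assumptions.
tmpdir=$(mktemp -d)

# \321 is the single byte for Ñ in ISO-8859-1; \303\221 is Ñ in UTF-8.
printf '90240.00;1021_POZUELO_\321C0_0179;ADM;1662\n90242.00;1021_POZUELO_CM0_0181;ADM;1662\n' > "$tmpdir/latin1.csv"
printf '90240.00;1021_POZUELO_\303\221C0_0179;ADM;1662\n90242.00;1021_POZUELO_CM0_0181;ADM;1662\n' > "$tmpdir/utf8.csv"

pattern='1021_POZUELO_.C0_0179'

# C locale: '.' matches exactly one byte, so it covers the Latin-1 Ñ ...
latin1_c=$(LC_ALL=C grep -a -c -E "$pattern" "$tmpdir/latin1.csv" || true)
# ... but not the two-byte UTF-8 Ñ.
utf8_c=$(LC_ALL=C grep -a -c -E "$pattern" "$tmpdir/utf8.csv" || true)

echo "C locale, Latin-1 file: $latin1_c matching line(s)"
echo "C locale, UTF-8 file:   $utf8_c matching line(s)"

# Same pattern with -o under the locale currently in effect; compare the
# results for the two files with the behaviour described in this report.
echo "Current locale: LANG=${LANG:-unset} LC_ALL=${LC_ALL:-unset}"
grep -a -o -E "$pattern" "$tmpdir/latin1.csv" || true
grep -a -o -E "$pattern" "$tmpdir/utf8.csv" || true

rm -rf "$tmpdir"
```

If the two files behave differently under the current locale, that would point to a mismatch between the file's encoding and the locale rather than to -o alone, but this is only a guess until the output of locale and the encoding of file.csv are known.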






    _______________________________________________________

Reply to this item at:

  <http://savannah.gnu.org/bugs/?33271>

_______________________________________________
  Message sent via/by Savannah
  http://savannah.gnu.org/
