Your message dated Mon, 5 Feb 2018 08:17:41 +0100
with message-id <20180205071741.GA5146@novelo>
and subject line Re: Bug#889595: grep -w mismatching
has caused the Debian Bug report #889595,
regarding grep -w mismatching
to be marked as done.

This means that you claim that the problem has been dealt with.
If this is not the case it is now your responsibility to reopen the
Bug report if necessary, and/or fix the problem forthwith.

(NB: If you are a system administrator and have no idea what this
message is talking about, this may indicate a serious mail system
misconfiguration somewhere. Please contact ow...@bugs.debian.org
immediately.)


-- 
889595: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=889595
Debian Bug Tracking System
Contact ow...@bugs.debian.org with problems
--- Begin Message ---
Package: grep
Version: 3.1-2

Hello,

I am trying to use grep -w to match a word, but it seems that grep ignores
the -w switch and returns all occurrences.

Test case:
$ cat file1
one two three
one-two-three
one-two
two-three

$ grep -w 'two' file1 #or grep -Ew 'two'
Result: all four lines of file1 are returned.

According to man grep:
-w, --word-regexp
              Select only those lines containing matches that form whole
              words. The test is that the matching substring must either be
              at the beginning of the line, or preceded by a non-word
              constituent character. Similarly, it must be either at the end
              of the line or followed by a non-word constituent character.
              Word-constituent characters are letters, digits, and the
              underscore. This option has no effect if -x is also specified.


Since the dash '-' is not a word-constituent character, I was
expecting only the first line to be matched: one two three

Just for the record, even the patterns \<two\> or \btwo\b return all
four lines.

Easy workaround for the job would be to use
grep ' two ' or grep -E '\stwo\s'

But this bug report is focused on the failure of the -w flag.

Cheers,

George Vasiliou

--- End Message ---
--- Begin Message ---
On 04/02/18 at 23:14, George Vasiliou (GMAIL) wrote:
…
> I am trying to use grep -w to match a word, but it seems that grep ignores the
> -w switch and returns all occurrences.
> 
> Test case: 
> $ cat file1
> one two three
> one-two-three
> one-two
> two-three
> 
> $ grep -w 'two' file1 #or grep -Ew 'two'
> Result: all four lines of file1 are returned.
> 
> According to man grep:
> -w, --word-regexp
>               Select only those lines containing matches that form whole
> words. The test is that the matching substring must either be at the
> beginning of the line, or preceded by a non-word constituent character.
> Similarly, it must be either at the end of the line or followed by a non-word
> constituent character. Word-constituent characters are letters, digits, and
> the underscore. This option has no effect if -x is also specified.
> 
> 
> Since the dash '-' is not a word-constituent character, I was expecting
> only the first line to be matched: one two three
…

I think you are misinterpreting this option. As far as I understand
it, grep -w is doing what is documented. Please compare:

printf "one_two_three \none-two-three \none two tree" | grep -w "two"
one-two-three 
one two tree

printf "one_two_three \none-two-three \none two tree" | grep "two"
one_two_three 
one-two-three 
one two tree

Since grep considers the dash a non-word-constituent character, "two" is
"bounded" as a whole word on every line of your example.

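The same point can be checked against the file1 contents from the report.
This is only a minimal sketch, assuming GNU grep; the temporary directory,
the variable names, and the whitespace-delimited pattern are illustrative
assumptions, offered as one possible way to get the single-line result the
reporter expected.

```shell
# Recreate the reporter's test file (contents copied from the report).
tmpdir=$(mktemp -d)
printf '%s\n' 'one two three' 'one-two-three' 'one-two' 'two-three' > "$tmpdir/file1"

# -w counts '-' as a non-word character, so "two" forms a whole word on
# every line; -c prints the number of matching lines.
word_matches=$(grep -cw 'two' "$tmpdir/file1")

# Illustrative alternative (an assumption, not from the thread): require
# whitespace or a line boundary on both sides of "two".
space_matches=$(grep -E '(^|[[:space:]])two([[:space:]]|$)' "$tmpdir/file1")

echo "lines matched by -w: $word_matches"
echo "whitespace-delimited match: $space_matches"

rm -rf "$tmpdir"
```

Unlike grep ' two ', the bracket-expression pattern also accepts "two" at
the very start or end of a line.
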
Cheers,

 -- Santiago

--- End Message ---
