* I completely rewrote this answer, so I'm not giving you a diff. It's easier just to read it straight. As of today and until I commit a change, the current answer is http://faq.perl.org/perlfaq6.html#Why_don_t_word_bound
* I discovered a slight error in perlre: it defines \b in terms of \w and \W, but that isn't strictly true, since the start and end of a string can stand in for non-word characters. perlre needs a slight patch before I can reference it.
* In the old answer, the examples weren't very interesting or concrete, so I added more examples and now show several strings that match or don't match each pattern.

=head2 Why don't word-boundary searches with C<\b> work for me?

(contributed by brian d foy)

Ensure that you know what \b really does: it's the boundary between a
word character, \w, and something that isn't a word character. That
thing that isn't a word character might be \W, but it can also be the
start or end of the string.

It's not (not!) the boundary between whitespace and non-whitespace,
and it's not the stuff between words we use to create sentences.

In regex speak, a word boundary (\b) is a "zero width assertion",
meaning that it doesn't represent a character in the string, but a
condition at a certain position.

For the regular expression, /\bPerl\b/, there has to be a word
boundary before the "P" and after the "l". As long as something other
than a word character precedes the "P" and follows the "l", the
pattern will match. These strings match /\bPerl\b/.

	"Perl"    # no word char before P or after l
	"Perl "   # same as previous (space is not a word char)
	"'Perl'"  # the ' char is not a word char
	"Perl's"  # no word char before P, non-word char after "l"

These strings do not match /\bPerl\b/.

	"Perl_"   # _ is a word char!
	"Perler"  # no word char before P, but one after l

You don't have to use \b to match words though. You can look for
non-word characters surrounded by word characters. These strings
match the pattern /\b'\b/.

	"don't"   # the ' char is surrounded by "n" and "t"
	"qep'a'"  # the ' char is surrounded by "p" and "a"

These strings do not match /\b'\b/.

	"foo'"    # there is no word char after non-word '

You can also use the complement of \b, \B, to specify that there
should not be a word boundary. In the pattern /\Bam\B/, there must be
a word character before the "a" and after the "m". These strings
match /\Bam\B/:

	"llama"   # "am" surrounded by word chars
	"Samuel"  # same

These strings do not match /\Bam\B/.

	"Sam"      # no word boundary before "a", but one after "m"
	"I am Sam" # "am" surrounded by non-word chars

-- brian d foy, [EMAIL PROTECTED]

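The match and non-match lists in the answer can be checked mechanically. The following is only a minimal sketch and not part of the FAQ text: the names @cases and unexpected_results are made up for illustration, and the strings are copied from the examples in the answer.

```perl
use strict;
use warnings;

# Patterns from the answer, each with the strings it should ("yes")
# and should not ("no") match. The layout of this table is an
# assumption made for this sketch.
my @cases = (
    { re => qr/\bPerl\b/, yes => [ "Perl", "Perl ", "'Perl'", "Perl's" ], no => [ "Perl_", "Perler" ] },
    { re => qr/\b'\b/,    yes => [ "don't", "qep'a'" ],                   no => [ "foo'" ] },
    { re => qr/\Bam\B/,   yes => [ "llama", "Samuel" ],                   no => [ "Sam", "I am Sam" ] },
);

# Return one description for every string that does not behave
# the way the answer says it should.
sub unexpected_results {
    my @bad;
    for my $case (@cases) {
        for my $str ( @{ $case->{yes} } ) {
            push @bad, "$case->{re} should match '$str'"
                unless $str =~ $case->{re};
        }
        for my $str ( @{ $case->{no} } ) {
            push @bad, "$case->{re} should not match '$str'"
                if $str =~ $case->{re};
        }
    }
    return @bad;
}

my @bad = unexpected_results();
print @bad
    ? join( "\n", @bad ) . "\n"
    : "all examples behave as described\n";
```

Adding a string of your own to one of the lists is a quick way to see which side of the \b or \B rule it falls on.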