Thanks to both Ben and Ronald for correcting me. I
had misunderstood the documentation.

I expect that others may also assume that
$string =~ /\b$sub_str\b/ is a reasonable way to match
a substring at word boundaries or at the beginning/end
of the string. Hopefully they will either notice the
subtle effect of the "imaginary characters" at the
beginning and end of the string, or find this post.
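
For anyone who finds this later, here is a minimal sketch of the
difference. It assumes that what you actually want is "the substring is
not directly next to a word character", which is my reading of the
problem and not something perlre prescribes. The helper names
matches_with_b and matches_delimited are made up for illustration. The
\Q...\E quoting keeps metacharacters in $sub_str literal, and the
lookarounds (?<!\w) and (?!\w) replace \b.

```perl
use strict;
use warnings;

# The original idiom: requires a \w/\W transition on both sides of the
# substring, so it fails when the substring starts or ends with a \W
# character.
sub matches_with_b {
    my ($string, $sub_str) = @_;
    return $string =~ /\b\Q$sub_str\E\b/ ? 1 : 0;
}

# Hypothetical alternative: only requires that no word character sits
# directly before or after the substring.
sub matches_delimited {
    my ($string, $sub_str) = @_;
    return $string =~ /(?<!\w)\Q$sub_str\E(?!\w)/ ? 1 : 0;
}

for my $case (['say hello', 'hello'], ['a (b) c', '(b)'], ['...', '...']) {
    my ($string, $sub_str) = @$case;
    printf "%-12s %-8s \\b: %d  lookaround: %d\n",
        "'$string'", "'$sub_str'",
        matches_with_b($string, $sub_str),
        matches_delimited($string, $sub_str);
}
```

With \b, only the 'hello' case matches. The lookaround version matches
all three, and it still rejects 'hello' inside 'xhello'.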

Thanks again,

-Carl

--- Ben Tilly <[EMAIL PROTECTED]> wrote:

> On 11/15/06, Carl Eklof <[EMAIL PROTECTED]> wrote:
> > Hi Guys n Gals,
> >
> > I have found some seemingly strange behavior that may
> > be of interest to this list.
> >
> > My assumption was that the \b pattern in a regex would
> > always match the beginning and end of a string (as
> > documented in the perlre page). However, on my build of
> > 5.8.7 this is not the case if the character being
> > matched at the beginning or the end is a
> > "meta-character", i.e. quotemeta would escape it. Also
> > note that escaping the character doesn't seem to make a
> > difference.
> 
> Actually that is NOT as documented in the perlre page.  And thoughts
> to the contrary are a misreading of the documentation.
>
> What the perlre page says is that there is an imaginary \W at the
> beginning and end of the string.  The result is that if the first
> character in the string matches \w, then \b will match at the start,
> and if the last character matches \w, then \b will match at the end.
>
> However if the first and/or last characters do *not* match \w, then
> that is not a word boundary and \b will not match there.
> 
> [examples snipped]
> 
> > Maybe this is not a bug, and this is just another
> > nuance of regexes that I have not learned, but it
> > looks very fishy.
> 
> It is definitely not a bug.  If the string is "...", then there are no
> words, hence no word boundaries, therefore \b should not match at all.
> (And it does not.)
>
> Conversely if the string is "hello" then there is a word, and it has
> boundaries, and those boundaries should be matched by \b.  (And they
> are, thanks to the "imaginary characters" discussed in the
> documentation.)
> 
> > Any thoughts/wisdom?
> 
> See above.
> 
> Cheers,
> Ben
> 

 
_______________________________________________
Boston-pm mailing list
[email protected]
http://mail.pm.org/mailman/listinfo/boston-pm
