RE: Rule to catch PO#

Bowie Bailey Tue, 02 Dec 2008 09:32:17 -0800

Ray Jette wrote:
> Bowie Bailey wrote:
> > Ray Jette wrote:
> > 
> > > Good morning,
> > > I am trying to write a negative scoring rule that fires on the
> > > following:
> > > PO
> > > PO#
> > > PO #
> > > 
> > > Following is the rule I am using:
> > > 
> > > header PO_AND_ORDERS        Subject =~ /\bPO*?#?/i
> > > score PO_AND_ORDERS        -0.50
> > > describe PO_AND_ORDERS    A negative scoring rule that searches
> > > the subject for PO #'s. 
> > > 
> > > Thanks for any help you can provide.
> > > 
> > 
> > Try this one:
> > 
> >     Subject =~ /\bPO\b ?#?/i
> > 
> > The "\b" after the "PO" will prevent it from matching things like
> > "positive", "pollen", or anything else that happens to start with
> > "po". Keep in mind that the "i" at the end makes it
> > case-insensitive, so this will match "PO", "po", "pO", etc.
> > 
> > 
> Sometimes the subject will be: PO#34598459 so do I really want to use
> \b? I need to match all of the following:
> PO
> PO#
> PO  [0-9] - I'm not sure of the maximum number of digits
> PO#  [0-9] - I'm not sure of the number of digits
> PO[0-9] - not sure how many numbers
> PO#[0-9] - not sure how many numbers


\b matches a zero-length word boundary.  This means that one side is a
"word character" and the other side is not.  Word characters are defined
as alphanumeric plus "_".  So the only option in your list that would
cause a problem is "PO12345".
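
To see the boundary behaviour concretely, here is a small sketch that
runs my earlier pattern against your list. It uses Python's re module
rather than SpamAssassin itself (an assumption made purely for
convenience; \b and case-insensitive matching behave the same way on
these ASCII subjects). The sample subjects and digits are made up.

```python
import re

# The earlier suggestion from this thread, /\bPO\b ?#?/i, in Python syntax.
earlier = re.compile(r"\bPO\b ?#?", re.IGNORECASE)

# Hypothetical sample subjects based on the list above, plus one false friend.
subjects = [
    "PO",
    "PO#",
    "PO 12345",
    "PO# 12345",
    "PO12345",
    "PO#12345",
    "positive",
]

# search() looks for the pattern anywhere in the subject, like =~ does.
results = {s: bool(earlier.search(s)) for s in subjects}

for subject, matched in results.items():
    print(f"{subject!r:14} {matched}")
```

Only "PO12345" and "positive" fail here: in both, the character right
after "PO" is a word character, so there is no boundary for \b to match.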

Try this one:

    Subject =~ /\bPO(?:\b ?#?|\d)/i

Actually, since both the space and the hash are optional, is there any
point in matching them?

This might be better:

    Subject =~ /\bPO(?:\b|\d)/i

Or you could require the number (which removes the need for the
word boundary check after "PO", but also means a bare "PO" or "PO#"
with no digits will no longer match):

    Subject =~ /\bPO ?#? ?\d/i
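
If you want to compare the three candidates side by side before putting
one in a rule, something like this sketch would do it. Again this is
Python's re module standing in for SpamAssassin (an assumption for
convenience), and the names "with_hash", "boundary_or_digit" and
"needs_digit" are just labels I made up for the three patterns above.

```python
import re

# The three patterns from this message, labelled with hypothetical names.
patterns = {
    "with_hash": re.compile(r"\bPO(?:\b ?#?|\d)", re.IGNORECASE),
    "boundary_or_digit": re.compile(r"\bPO(?:\b|\d)", re.IGNORECASE),
    "needs_digit": re.compile(r"\bPO ?#? ?\d", re.IGNORECASE),
}

# Hypothetical sample subjects; the digits are made up.
subjects = [
    "PO",
    "PO#",
    "PO 12345",
    "PO# 12345",
    "PO12345",
    "PO#12345",
    "positive",
]

# table[name][subject] is True when that pattern matches somewhere in the subject.
table = {
    name: {s: bool(rx.search(s)) for s in subjects}
    for name, rx in patterns.items()
}

for name, row in table.items():
    hits = [s for s, matched in row.items() if matched]
    print(f"{name}: {hits}")
```

The first two patterns accept exactly the same subjects from this list,
which is why the space and hash add nothing. The third one skips the
bare "PO" and "PO#" because it insists on a digit.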

-- 
Bowie
