On Tue, May 26, 2009 at 09:45:25AM -0700, mmayer344 wrote:
> 
> Hi folks,
> 
> I'm running through some logs of search queries, trying to pull out
> the ones containing variations on United Nations. I am trying to write
> a regular expression that can match a two letter format (such as "UN",
> "U.N." or "U. N.") or a two word format ("United Nations", "united
> +nation").
> 
> So far my flailing attempts have been able to match all of the above,
> but they are also yielding any other word beginning with "un".
> 
> I want to match these:
> 
> un
> u.n.
> u. n.
> united nation
> united +nation
> 
> ...but not these:
> 
> united states
> UNDP
> unsuccessful

Try this:

\bun\b|\bu\.\s*n\.|\bunited\s+\+?nation

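\b matches at a word boundary, i.e. the position between a word character
[a-zA-Z0-9_] and a non-word character [^a-zA-Z0-9_], or at the very start
or end of the text when a word character sits there. That is what keeps
"un" from matching inside "UNDP" or "unsuccessful". Make the search
case-insensitive if you also want "UN" and "U.N." to match.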
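
If you want to sanity-check the pattern outside BBEdit, here is a minimal
Python sketch that runs it against the sample queries from your message.
The names UN_PATTERN and mentions_un are made up for illustration, and
the case-insensitive flag is my assumption about how you want to search.

```python
import re

# The pattern from this reply, compiled case-insensitively (an assumption:
# the sample queries are lowercase, but the question also mentions "UN").
UN_PATTERN = re.compile(r"\bun\b|\bu\.\s*n\.|\bunited\s+\+?nation", re.IGNORECASE)

# Sample queries taken from the original question.
should_match = ["un", "u.n.", "u. n.", "united nation", "united +nation"]
should_not_match = ["united states", "UNDP", "unsuccessful"]


def mentions_un(query):
    """Return True if the query contains one of the United Nations variants."""
    return UN_PATTERN.search(query) is not None


for query in should_match + should_not_match:
    print(f"{query!r}: {mentions_un(query)}")
```

Every query in the first list should report True and every query in the
second list False.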

Ronald
