I'm trying to create a rule to catch some of the prescription drug
references that come into our system. We're not in pharmaceuticals, so
I'm not too concerned about false positives :)
Some examples of what I'm looking for (using an innocent drug so I don't
trip someone else's filters):
ADVwIL
ADxDVIL
ADxV1L
Advjjl
Or summed up in English: a random character inserted; the same thing
but with a letter repeated; an inserted character plus "1" (or "l") in
place of "I"; and the recent (and odd) replacement of "I" with "jj".
I've come up with a rule that'll match every one of those instances, but
it also has the unfortunate consequence of matching plain old "ADVIL":
/A[a-z]?A?D[a-z]?D?V[a-z]?V?[Il1j][a-z]?[Il1j]?L[a-z]?L?/
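To make the problem concrete, here is a minimal sketch that runs the same pattern against the samples above. It is written in Python rather than as a filter rule, and the names `RULE`, `hits`, `obfuscated` and `plain` are made up for illustration. It also assumes the rule is applied case-insensitively, since as written the pattern would not match the mixed-case sample "Advjjl".

```python
import re

# The rule from above, ported to Python. IGNORECASE is an assumption:
# without it the mixed-case sample "Advjjl" would not match.
RULE = re.compile(
    r"A[a-z]?A?D[a-z]?D?V[a-z]?V?[Il1j][a-z]?[Il1j]?L[a-z]?L?",
    re.IGNORECASE,
)

obfuscated = ["ADVwIL", "ADxDVIL", "ADxV1L", "Advjjl"]
plain = "ADVIL"


def hits(text):
    """Return True if the rule matches anywhere in text."""
    return RULE.search(text) is not None


# Every optional piece of the pattern can match nothing at all,
# so the bare letters A-D-V-I-L satisfy it too.
for sample in obfuscated + [plain]:
    print(f"{sample:8} {'match' if hits(sample) else 'no match'}")
```

All five strings report a match, which is exactly the trouble: because every inserted-character slot is optional, nothing in the pattern requires that at least one obfuscation actually be present.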
Now, I'm by no means a regular expression guru. I'm hoping someone on
this list can help me refine this a bit, either by sharing a method of
making it match the obfuscated name without matching the unobfuscated
name, or even a different approach to the same end. Any advice?