Here's a reg exp problem that's got me up at night.  I'm looking for 
currency and numbers in various formats.  The currency symbol and the 
actual character code of the number I'm looking for may vary, depending on 
the file I'm looking at, as we're working in Unicode and looking at 
languages from all over the world.  I'm also using utf8.

Basically valid formats would be:
$123
$ 123
123$
123 $

and up to here, I'm okay:

my $sText = 'Foo $ 123 bar 123$ hello $123 world 123 $'; # single quotes, so $123 is not interpolated
# these may be different depending on the language, but let's work with English
my $reNumber   = "\x{0030}-\x{0039}";
my $reCurrency = "\\x{0024}\\x{00a3}"; # just to keep it simple; a | inside [...] would be a literal |
# * rather than ? so that every digit of the number is grabbed, not just one
my @asCurrencies = $sText =~ 
/[$reNumber]*\x{0020}?[$reCurrency]\x{0020}?[$reNumber]*/g;
foreach my $currency (@asCurrencies)
         {
         print "$currency\n";
         }

ok.

I want to add in the text for currency symbols, like "dollar" and "pound", 
so that I match on either a currency symbol or a currency word and grab the 
numbers to the left or to the right.

Here are the strings for those already formatted for utf8:

my $string = 
"(\x{0064}\x{006f}\x{006c}\x{006c}\x{0061}\x{0072}\x{0073})|(\x{0070}\x{006f}\x{0075}\x{006e}\x{0064}\x{0073})";
 
# (dollars)|(pounds)

I've tried all sorts of variations on parentheses etc. in the reg exp, to 
no avail.  I've checked the docs on lookahead and lookbehind assertions, and 
messed around with them some, but either that's not what I need, or I haven't 
completely grasped the concept yet.

Any ideas?
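
One possible direction, offered as a minimal sketch rather than a definitive answer: a character class like [...] can only hold single characters, so whole words such as "dollars" cannot go inside it. A non-capturing group (?:...) can hold both the symbol class and the words as alternatives, and no lookahead or lookbehind is needed. The names $reSymbol, $reWord and $reAmount, and the sample text, are assumptions made up for this illustration.

```perl
use strict;
use warnings;

binmode(STDOUT, ':encoding(UTF-8)');    # so the pound sign prints cleanly

# Hypothetical sample text; single quotes keep the dollar signs literal.
my $sText = 'Foo $ 123 bar 123 pounds hello dollars 45 world 67 ' . "\x{00a3}";

my $reNumber = '0-9';                   # digit range, used inside [...]
my $reSymbol = '\x{0024}\x{00a3}';      # $ and pound sign, used inside [...]
my $reWord   = 'dollars?|pounds?';      # currency words, singular or plural

# A currency is either one symbol or one whole word.
my $reCurrency = qr/(?:[$reSymbol]|\b(?:$reWord)\b)/i;
my $reAmount   = qr/[$reNumber]+/;

# Currency then number, or number then currency, with at most one space between.
my @asCurrencies = $sText =~
    /($reCurrency\x{0020}?$reAmount|$reAmount\x{0020}?$reCurrency)/g;

foreach my $currency (@asCurrencies)
         {
         print "$currency\n";
         }
```

On this sample the sketch should pick up "$ 123", "123 pounds", "dollars 45" and "67" followed by the pound sign. Requiring at least one digit (+ instead of ? or *) keeps a bare symbol or word from matching on its own.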


Aaron Craig
Programming
iSoftitler.com
