Well, it seems that the first one is what you want, but you just need to 
use $1 and ignore $2.

You do need parentheses in '(mr|mrs|miss|dr|prof|sir)' but if you do not 
want for them to be captured in $2, you can use:
'(?:mr|mrs|miss|dr|prof|sir)'.  For example:

print "match3='$1' '$2'\n" if
($T=~/^((?:mr|mrs|miss|dr|prof|sir) .{5,}?)\n/smi);

would give output:

match3='Miss Jayne Doe' ''

On Wed, 2 Dec 2020, Gary Stainburn wrote:

> I have an array of regex expressions that I apply to text returned from
> tesseract.
> 
> Each match that I get then gets stored for future processing. However, I'm
> struggling with one regex.
> 
> The problem is that:
> 
> 1) with brackets round the titles it returns two matches.
> 2) without brackets, it returns nothing.
> 
> Can anyone point me at the correct syntax please.
> 
> Gary
> 
> [root@dev dev]# ./t
> match1='Miss Jayne Doe' 'Miss'
> [root@dev dev]# cat t
> #!/usr/bin/perl
> 
> use strict;
> use warnings;
> 
> my $T=<<EOF;
> Customer name and address
> Miss Jayne Doe
> 19 Their Street
> Somewehere
> In Yorkshire
> IN1 3YY
> EOF
> 
> print "match1='$1' '$2'\n" if ($T=~/^((mr|mrs|miss|dr|prof|sir)
> .{5,}?)\n/smi);
> print "match2='$1' '$2'\n" if ($T=~/^(mr|mrs|miss|dr|prof|sir .{5,}?)\n/smi);
> [root@dev dev]#
> 
> -- 
> To unsubscribe, e-mail: beginners-unsubscr...@perl.org
> For additional commands, e-mail: beginners-h...@perl.org
> http://learn.perl.org/
> 
> 

-- 
To unsubscribe, e-mail: beginners-unsubscr...@perl.org
For additional commands, e-mail: beginners-h...@perl.org
http://learn.perl.org/


Reply via email to