On Friday 21 May 2010, Akhthar Parvez K wrote:
> Look at this code:
> 
> my @data = ( 'Twinkle twinkle little star
> How I wonder what you are
> Up above the world so high
> Like a diamond in the sky.
> 123
> Twinkle twinkle little star
> How I wonder what you are');
> my $rx1 = qr{ world.*diamond }imx;
> my $rx2 = qr{ what.*world }mix;
> my $rx3 = qr{ little\D*wonder }imx;
> my $rx4 = qr{ high.*like }imx;
> my @regex = ($rx1, $rx2, $rx3, $rx4);
> my $regx = join ("|", @regex);
> my @matches = map { tr/\n//d; /($regx)/g } @data;
> print 'result: ', Dumper \@matches;
> 
> and the output is:
> result: $VAR1 = [
>           'little starHow I wonder',
>           'what you areUp above the world',
>           'highLike',
>           'little starHow I wonder'
>         ];
> 
> The string that matches the regex 'world.*diamond' wasn't picked by the above 
> expression. It looks like it was not picked because some part of the string 
> was already picked by another regex. How can I get the expression pick that 
> as well so the output would be like below:
> 
> result: $VAR1 = [
>           'little starHow I wonder',
>           'what you areUp above the world',
>           'world so highLike a diamond',
>           'highLike',
>           'little starHow I wonder'
>         ];

I'll assume that's simply how regex matching works unless someone tells me 
otherwise and shows me a way to make the expression pick up *all* of the 
matches. I'm fine with either, but I'd be pleased if someone who knows regex 
better could confirm this.
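
For reference, here is a minimal sketch of one way to collect overlapping
matches. It assumes it is acceptable to scan the string once per pattern
instead of joining the patterns into a single alternation. A single /g scan
resumes after the end of the previous match, so text consumed by
'what.*world' is no longer available to 'world.*diamond'. The names $text
and @found are introduced here for illustration only.

```perl
use strict;
use warnings;
use Data::Dumper;

# Same text as in the quoted example, with the newlines already removed.
my $text = join '',
    'Twinkle twinkle little star',
    'How I wonder what you are',
    'Up above the world so high',
    'Like a diamond in the sky.',
    '123',
    'Twinkle twinkle little star',
    'How I wonder what you are';

my @regex = (
    qr{ world.*diamond }imx,
    qr{ what.*world }imx,
    qr{ little\D*wonder }imx,
    qr{ high.*like }imx,
);

# Scan the whole string once per pattern, so a match for one pattern
# cannot use up text that another pattern also needs.
my @found;
for my $rx (@regex) {
    while ($text =~ /$rx/g) {
        # $-[0] and $+[0] are the start and end offsets of the last match.
        push @found, [ $-[0], substr($text, $-[0], $+[0] - $-[0]) ];
    }
}

# Restore left-to-right order by sorting on the start offset.
my @matches = map { $_->[1] } sort { $a->[0] <=> $b->[0] } @found;

print 'result: ', Dumper \@matches;
```

With the sample text this should list 'world so highLike a diamond' between
'what you areUp above the world' and 'highLike'. Two matches that start at
the same offset would come out in no guaranteed order, which this sketch
does not try to handle.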

I am stuck with regex again, this time I really need to *fix* it:

Code:

my @data = ( 'Twinkle twinkle little star
How I wonder what you are
Up above the world so high
Like (a) diamond in the sky.
123
Twinkle twinkle little star
How I wonder what you are');
my $rx1 = qr{ little(\D*wonder) }imx;
my $rx2 = qr{ high(.*like) }imx;
my @regex = ($rx1, $rx2);
my $regx = join ("|", @regex);
print "regx: $regx\n";
my @matches = map { tr/\n//d; /($regx)/g } @data;
print 'array: ', Dumper \@matches;

Output:
regx: (?ix-sm: little(\D*wonder) )|(?ix-sm: high(.*like) )
array: $VAR1 = [
          'little starHow I wonder',
          ' starHow I wonder',
          undef,
          'highLike',
          undef,
          'Like',
          'little starHow I wonder',
          ' starHow I wonder',
          undef
        ];

I'm expecting a result like this:
regx: (?ix-sm: little(\D*wonder) )|(?ix-sm: high(.*like) )
array: $VAR1 = [
          'little starHow I wonder',
          ' starHow I wonder',
          'highLike',
          'Like',
          'little starHow I wonder',
          ' starHow I wonder',
         ];

I would like to know why these undefs are appearing in between and how I can 
get rid of them. I'm sure it's due to the way I'm concatenating the regexes 
(with join), as it works fine with just one regex, but how can this be fixed 
when I'm using multiple regexes?
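
A likely explanation, sketched under the assumption that the combined
pattern is used exactly as above: /($regx)/ contains three capture groups
(the outer one, plus one inside each alternative), and in list context a /g
match returns every group for every match. The group belonging to the
alternative that did not take part comes back as undef. One simple way to
drop those entries is to filter with defined; $text is a name introduced
here for illustration.

```perl
use strict;
use warnings;
use Data::Dumper;

# Same text as in the example above, with the newlines already removed.
my $text = join '',
    'Twinkle twinkle little star',
    'How I wonder what you are',
    'Up above the world so high',
    'Like (a) diamond in the sky.',
    '123',
    'Twinkle twinkle little star',
    'How I wonder what you are';

my $rx1  = qr{ little(\D*wonder) }imx;
my $rx2  = qr{ high(.*like) }imx;
my $regx = join '|', $rx1, $rx2;

# Each match yields ($1, $2, $3): $1 is the outer group, $2 belongs to
# $rx1 and $3 to $rx2. The group from the unused alternative is undef,
# so keep only the defined values.
my @matches = grep { defined $_ } ($text =~ /($regx)/g);

print 'array: ', Dumper \@matches;
```

Note that this also discards a group that is legitimately undef inside the
alternative that did match (for example an optional group), so it fits best
when every inner group is mandatory, as it is here. On Perl 5.10 or later,
the branch-reset construct (?|...) is another option worth looking at.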

-- 
Regards,
Akhthar Parvez K
http://tips.sysadminguide.com/
UNIX is basically a simple operating system, but you have to be a genius to 
understand the simplicity - Dennis Ritchie
