On 11/04/2011 06:43, Shlomi Fish wrote:
> On Sunday 10 Apr 2011 14:05:49 cityuk wrote:
>>
>> This is more of a generic question on regular expressions as my
>> program is working fine but I was just curious.
>>
>> Say you have the following URLs:
>>
>> http://www.test.com/image.gif
>> http://www.test.com/?src=image.gif?width=12
>>
> 
> Don't use regular expressions to parse URLs - instead use URI.pm:
> 
> http://cpan.uwinnipeg.ca/dist/URI

I agree. The program below shows a subroutine which will extract the
file type from either form of URL. It first checks to see if there is a
'src' option in the query, using this for the file name if so; otherwise
it uses the last segment of the URL path. The file type is extracted by
capturing all trailing non-dot characters from the file name.

(I assume your second address should read
<http://www.test.com/?src=image.gif&width=12> with an ampersand instead
of a second question mark?)

HTH,

Rob


use strict;
use warnings;

use URI;

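# Return the file type (the trailing non-dot characters of the file
# name) for the file named by a URL.
sub filetype_from_url {
  my $url = URI->new($_[0]);
  my %form = $url->query_form;
  # Prefer the 'src' query option; otherwise use the last path segment
  my $file = $form{src} || ($url->path_segments)[-1];
  # Capture into a scalar so the result is the same in any calling context
  my ($type) = $file =~ /([^.]+)\z/;
  return $type;
}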

print filetype_from_url('http://www.test.com/image.gif'), "\n";
print filetype_from_url('http://www.test.com/?src=image.gif&width=12'), "\n";
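
To show why I assumed an ampersand, here is a rough sketch of how URI
splits the query of each form of the second address. It assumes URI.pm
is installed, and query_params is just a helper name made up for this
illustration; it is not part of URI.

```perl
use strict;
use warnings;

use URI;

# Hypothetical helper for this sketch: return the query parameters
# of a URL as a hash reference.
sub query_params {
  my $url = URI->new($_[0]);
  my %form = $url->query_form;
  return \%form;
}

# Address as originally posted: everything after the first '?' is the
# query, so the second '?' ends up inside the value of 'src'
my $posted = query_params('http://www.test.com/?src=image.gif?width=12');
print "src=$posted->{src}\n";

# Address with an ampersand: 'src' and 'width' are separate options
my $fixed = query_params('http://www.test.com/?src=image.gif&width=12');
print "src=$fixed->{src} width=$fixed->{width}\n";
```

With the address as posted, the 'src' value keeps the '?width=12' tail,
so the trailing non-dot characters would include it as well.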




