The requirement for this module came about intially because I was thinking about how to handle virtual URLs in websites; for example:
/photos/album12/photo17.jpg This will fetch the 17th photo from the 12th album, by whatever method internally is used. Internally, we need to know these values. Trying to make as generic a system as possible, I came up with the idea that somewhere in site config, would live a regexp-like pattern to explain how to parse that. This pattern needs to be reversible - the logic that generates pages has to be able to construct URLs that give paths to the files in question. The format I came upon would look like this: '/photos/album${ALBUM:\d+}/photo${PHOTO:\d+}.jpg' This pattern consists of literals, with variable interpolations embedded in it. Looks obvious from a string-generation point of view. Also regexp patterns are present, to explain what is valid in each position. This means, given this pattern, we can convert in either direction: /photos/album12/photo17.jpg gives: { ALBUM => 12, PHOTO => 17 } { ALUBM = 9, PHOTO => 15 } gives: /photos/album9/photo15.jpg The use-case here is that patterns come from some source such as a config file, being a fairly small fixed set which is known at the time the server is started. Incoming strings or sets of replacement values come from the running of the server, which is much more often. The implementation I have chosen, compiles the pattern into two CODE references, to allow efficient runtime usage, comparible to hand-coded regexps or variable interpolation. Also, no special considerations on the security of the patterns are made - it would be quite possible to embed arbitrary perl code within these patterns - the current implementation does not protect against this because the source of these patterns is assumed to be trusted. This would be noted in the documentation of this class. I've written an implementation of code, and a test script, by way of example for how it might be used. Find these attached. I'd appreciate some comments on this; specifically, if this functionallity would be useful enough to put on CPAN, or if it seems a quite specialised solution to a specific problem and not worth doing. -- Paul "LeoNerd" Evans [EMAIL PROTECTED] ICQ# 4135350 | Registered Linux# 179460 http://www.leonerd.org.uk/
package Parse::Reversable; use strict; use Carp; sub new { my $class = shift; my ( $pattern, %opts ) = @_; my $self = bless { pattern => $pattern, }, $class; my %vars; my $parsepattern = ""; my $capturenumber = 1; my $parsebind = ""; my @buildparts; # The buildsub closure will contain elements of this array in its # environment my @literals; my @components = split( m/(\$\{\w+:.*?\})/, $pattern, -1 ); foreach my $c ( @components ) { next if length( $c ) == 0; if( $c =~ m/^\$\{(\w+):(.*)\}$/ ) { my ( $var, $pattern ) = ( $1, $2 ); croak "Multiple occurances of $var" if exists $vars{$var}; $vars{$var} = 1; $parsepattern .= "($pattern)"; $parsebind .= " \$var->{$var} = \$$capturenumber;\n"; $capturenumber++; push @buildparts, "\$var->{$var}"; } else { $parsepattern .= quotemeta $c; push @literals, $c; push @buildparts, "\$literals[$#literals]"; } } if( $opts{allow_trail} ) { $parsepattern .= "(.*?)"; $parsebind .= " \$var->{_trail} = \$$capturenumber;\n"; $capturenumber++; } my $parsecode = " \$_[0] =~ m{^$parsepattern\$} or return undef; my \$var = {}; $parsebind \$var; "; $self->{parsesub} = eval "sub { $parsecode }"; my $buildcode = " my ( \$var ) = [EMAIL PROTECTED]; " . join( " . ", @buildparts ) . "; "; $self->{buildsub} = eval "sub { $buildcode }"; return $self; } sub parse { my $self = shift; my ( $str ) = @_; return $self->{parsesub}->( $str ); } sub build { my $self = shift; my ( $var ) = @_; return $self->{buildsub}->( $var ); } # Keep perl happy; keep Britain tidy 1;
02reversable-simple.t
Description: Troff document
signature.asc
Description: Digital signature