The requirement for this module came about intially because I was
thinking about how to handle virtual URLs in websites; for example:

  /photos/album12/photo17.jpg

This will fetch the 17th photo from the 12th album, by whatever method
internally is used. Internally, we need to know these values. Trying to
make as generic a system as possible, I came up with the idea that
somewhere in site config, would live a regexp-like pattern to explain
how to parse that. This pattern needs to be reversible - the logic that
generates pages has to be able to construct URLs that give paths to the
files in question.

The format I came upon would look like this:

  '/photos/album${ALBUM:\d+}/photo${PHOTO:\d+}.jpg'

This pattern consists of literals, with variable interpolations embedded
in it. Looks obvious from a string-generation point of view. Also regexp
patterns are present, to explain what is valid in each position.

This means, given this pattern, we can convert in either direction:

  /photos/album12/photo17.jpg 
    gives: { ALBUM => 12, PHOTO => 17 }

  { ALUBM = 9, PHOTO => 15 }
    gives: /photos/album9/photo15.jpg

The use-case here is that patterns come from some source such as a
config file, being a fairly small fixed set which is known at the time
the server is started. Incoming strings or sets of replacement values
come from the running of the server, which is much more often. The
implementation I have chosen, compiles the pattern into two CODE
references, to allow efficient runtime usage, comparible to hand-coded
regexps or variable interpolation. Also, no special considerations on
the security of the patterns are made - it would be quite possible to
embed arbitrary perl code within these patterns - the current
implementation does not protect against this because the source of these
patterns is assumed to be trusted. This would be noted in the
documentation of this class.

I've written an implementation of code, and a test script, by way of
example for how it might be used. Find these attached.

I'd appreciate some comments on this; specifically, if this
functionallity would be useful enough to put on CPAN, or if it seems a
quite specialised solution to a specific problem and not worth doing.

-- 
Paul "LeoNerd" Evans

[EMAIL PROTECTED]
ICQ# 4135350       |  Registered Linux# 179460
http://www.leonerd.org.uk/
package Parse::Reversable;

use strict;

use Carp;

sub new
{
   my $class = shift;
   my ( $pattern, %opts ) = @_;

   my $self = bless {
      pattern => $pattern,
   }, $class;

   my %vars;

   my $parsepattern = "";
   my $capturenumber = 1;
   my $parsebind = "";

   my @buildparts;

   # The buildsub closure will contain elements of this array in its
   # environment
   my @literals;

   my @components = split( m/(\$\{\w+:.*?\})/, $pattern, -1 );
   foreach my $c ( @components ) {
      next if length( $c ) == 0;
      if( $c =~ m/^\$\{(\w+):(.*)\}$/ ) {
         my ( $var, $pattern ) = ( $1, $2 );
         croak "Multiple occurances of $var" if exists $vars{$var};
         $vars{$var} = 1;

         $parsepattern .= "($pattern)";
         $parsebind .= "   \$var->{$var} = \$$capturenumber;\n";
         $capturenumber++;

         push @buildparts, "\$var->{$var}";
      }
      else {
         $parsepattern .= quotemeta $c;

         push @literals, $c;
         push @buildparts, "\$literals[$#literals]";
      }
   }

   if( $opts{allow_trail} ) {
      $parsepattern .= "(.*?)";
      $parsebind .= "   \$var->{_trail} = \$$capturenumber;\n";
      $capturenumber++;
   }

   my $parsecode = "
   \$_[0] =~ m{^$parsepattern\$} or return undef;
   my \$var = {};
$parsebind
   \$var;
";

   $self->{parsesub} = eval "sub { $parsecode }";

   my $buildcode = "
   my ( \$var ) = [EMAIL PROTECTED];
   " . join( " . ", @buildparts ) . ";
";

   $self->{buildsub} = eval "sub { $buildcode }";

   return $self;
}

sub parse
{
   my $self = shift;
   my ( $str ) = @_;
   return $self->{parsesub}->( $str );
}

sub build
{
   my $self = shift;
   my ( $var ) = @_;
   return $self->{buildsub}->( $var );
}

# Keep perl happy; keep Britain tidy
1;

Attachment: 02reversable-simple.t
Description: Troff document

Attachment: signature.asc
Description: Digital signature

Reply via email to