RFC 264 (v3) Provide a standard module to simplify the creation of source filters

Perl6 RFC Librarian Fri, 29 Sep 2000 14:23:45 -0700
This and other RFCs are available on the web at
  http://dev.perl.org/rfc/

=head1 TITLE

Provide a standard module to simplify the creation of source filters

=head1 VERSION

  Maintainer: Damian Conway <[EMAIL PROTECTED]>
  Date: 20 Sep 2000
  Last Modified: 29 Sep 2000
  Mailing List: [EMAIL PROTECTED]
  Number: 264
  Version: 3
  Status: Frozen
  Frozen since: v2


=head1 ABSTRACT

This RFC proposes that the interface to Perl's source filtering facilities
be made much easier to use.


=head1 DESCRIPTION

Source filtering is an immensely powerful feature of recent versions of Perl.
It allows one to extend the language itself (e.g. the Switch module), to 
simplify the language (e.g. Language::Pythonesque), or to completely recast the
language (e.g. Lingua::Romana::Perligata). Effectively, it allows one to use
the full power of Perl as its own, recursively applied, macro language.

The Filter::Util::Call module (by Paul Marquess) provides a usable Perl
interface to source filtering, but it is not nearly as simple as it could be. 

To use the module it is necessary to do the following:

=over 4

=item 1.

Download, build, and install the Filter::Util::Call module.

=item 2.

Set up a module that does a C<use Filter::Util::Call>.

=item 3.

Within that module, create an C<import> subroutine.

=item 4.

Within the C<import> subroutine do a call to C<filter_add>, passing
it either a subroutine reference.

=item 5.

Within the subroutine reference, call C<filter_read> or C<filter_read_exact>
to "prime" $_ with source code data from the source file that will
C<use> your module. Check the status value returned to see if any
source code was actually read in.

=item 6.

Process the contents of $_ to change the source code in the desired manner.

=item 7.

Return the status value.

=item 8.

If the act of unimporting your module (via a C<no>) should cause source
code filtering to cease, create an C<unimport> subroutine, and have it call
C<filter_del>. Make sure that the call to C<filter_read> or
C<filter_read_exact> in step 5 will not accidentally read past the
C<no>. Effectively this limits source code filters to line-by-line
operation, unless the C<import> subroutine does some fancy
pre-pre-parsing of the source code it's filtering.

=back

This last requirement is often the stumbling block. Line-by-line
source filters are not difficult to set up using Filter::Util::Call,
but line-by-line filtering is the exception, rather than the norm.
Since a newline is just whitespace throughout much of a Perl program,
most useful source filters have to make allowance for components that
may span two or more newlines. And that complicates the filtering code
enormously.

For example, here is a minimal source code filter in a module named
BANG.pm. It simply converts every occurrence of the sequence C<BANG\s+BANG>
(which may include newlines) to the sequence C<die 'BANG' if $BANG> in
any piece of code following a C<use BANG;> statement (until the next
C<no BANG;> statement, if any):

        package BANG;
 
        use Filter::Util::Call ;

        sub import {
            filter_add( sub {
                my $caller = caller;
                my ($status, $no_seen, $data);
                while ($status = filter_read()) {
                        if (/^\s*no\s+$caller\s*;\s*$/) {
                                $no_seen=1;
                                last;
                        }
                        $data .= $_;
                        $_ = "";
                }
                $_ = $data;
                s/BANG\s+BANG/die 'BANG' if \$BANG/g
                        unless $status < 0;
                $_ .= "no $class;\n" if $no_seen;
                return 1;
            })
        }

        sub unimport {
            filter_del();
        }

        1 ;

Given this level of complexity, it's perhaps not surprising that source
code filtering is not commonly used.

This RFC proposes that a new standard module -- Filter::Simple -- be
provided, to vastly simplify the task of source code filtering, 
at least in common cases.


=head2 The Filter::Simple module

Instead of the above process, it is proposed that the Filter::Simple module 
would simplify the creation of source code filters to the following
steps:

=over 4

=item 1.

Set up a module that does a C<use Filter::Simple sub { ... }>.

=item 2.

Within the anonymous subroutine passed to C<use Filter>, process the
contents of $_ to change the source code in the desired manner.

=back

In other words, the previous example, would become:

        package BANG;
 
        use Filter::Simple sub {
            s/BANG\s+BANG/die 'BANG' if \$BANG/g;
        };

        1 ;


=head2 Module semantics

This drastic simplication is achieved by having the standard
Filter::Simple module export into the package that C<use>s it (e.g.
package "BANG" in the above example) two automagically constructed
subroutines -- C<import> and C<unimport> -- which take care of all the
nasty details.

In addition, the generated C<import> subroutine passes its own argument
list to the filtering subroutine, so the BANG.pm filter could easily
be made parametric:

        package BANG;

        use Filter::Simple sub {
            my ($die_msg, $var_name) = @_;
            s/BANG\s+BANG/die '$die_msg' if \${$var_name}/g;
        };

        # and in some user code:

        use BANG "BOOM", "BAM;  # "BANG BANG" becomes: die 'BOOM' if $BAM


The specified filtering subroutine is called every time a C<use BANG>
is encountered, and passed all the source code following that call,
up to either a C<no BANG;> call or the end of the source file (whichever
occurs first).
        

=head1 MIGRATION ISSUES

None.


=head1 IMPLEMENTATION

A prototype implementation is available from:

        http://www.csse.monash.edu.au/~damian/CPAN/Filter-Simple.tar.gz

and should soon appear on the CPAN.

This prototype requires the Filter::Util::Call module, but it is
proposed that a standard Filter::Simple module would be self-sufficient.


=head1 NOTE

It is certainly not suggested that Filter::Simple should I<replace> 
Filter::Util::Call. That module provides much more flexible control 
over source code filtering, and will still be needed in many cases.
It is merely proposed that simplified code filtering covering
the common cases ought to be incorporated in the core.


=head1 REFERENCES

http://search.cpan.org/search?dist=Filter
RFC 264 (v3) Provide a standard module to simplify the creation of source filters

Reply via email to