This and other RFCs are available on the web at
  http://dev.perl.org/rfc/

=head1 TITLE

Provide a standard module to simplify the creation of source filters

=head1 VERSION

  Maintainer: Damian Conway <[EMAIL PROTECTED]>
  Date: 20 Sep 2000
  Last Modified: 29 Sep 2000
  Mailing List: [EMAIL PROTECTED]
  Number: 264
  Version: 3
  Status: Frozen
  Frozen since: v2

=head1 ABSTRACT

This RFC proposes that the interface to Perl's source filtering facilities
be made much easier to use.

=head1 DESCRIPTION

Source filtering is an immensely powerful feature of recent versions of Perl.
It allows one to extend the language itself (e.g. the Switch module), to
simplify the language (e.g. Language::Pythonesque), or to completely recast
the language (e.g. Lingua::Romana::Perligata). Effectively, it allows one to
use the full power of Perl as its own, recursively applied, macro language.

The Filter::Util::Call module (by Paul Marquess) provides a usable Perl
interface to source filtering, but it is not nearly as simple as it could be.

To use the module it is necessary to do the following:

=over 4

=item 1. Download, build, and install the Filter::Util::Call module.

=item 2. Set up a module that does a C<use Filter::Util::Call>.

=item 3. Within that module, create an C<import> subroutine.

=item 4. Within the C<import> subroutine, call C<filter_add>, passing it a
subroutine reference.

=item 5. Within the referenced subroutine, call C<filter_read> or
C<filter_read_exact> to "prime" $_ with source code data from the source
file that will C<use> your module. Check the status value returned to see
if any source code was actually read in.

=item 6. Process the contents of $_ to change the source code in the
desired manner.

=item 7. Return the status value.

=item 8. If the act of unimporting your module (via a C<no>) should cause
source code filtering to cease, create an C<unimport> subroutine, and have
it call C<filter_del>.
Make sure that the call to C<filter_read> or C<filter_read_exact> in step 5
will not accidentally read past the C<no>. Effectively this limits source
code filters to line-by-line operation, unless the C<import> subroutine
does some fancy pre-pre-parsing of the source code it's filtering.

=back

This last requirement is often the stumbling block. Line-by-line source
filters are not difficult to set up using Filter::Util::Call, but
line-by-line filtering is the exception, rather than the norm. Since a
newline is just whitespace throughout much of a Perl program, most useful
source filters have to make allowance for components that may span two or
more newlines. And that complicates the filtering code enormously.

For example, here is a minimal source code filter in a module named BANG.pm.
It simply converts every occurrence of the sequence C<BANG\s+BANG> (which
may include newlines) to the sequence C<die 'BANG' if $BANG> in any piece of
code following a C<use BANG;> statement (until the next C<no BANG;>
statement, if any):

    package BANG;

    use Filter::Util::Call;

    sub import {
        my ($class) = @_;    # the filtering package, i.e. "BANG"
        filter_add( sub {
            my ($status, $no_seen);
            my $data = "";
            # Accumulate source lines until EOF or a "no BANG;" line
            while (($status = filter_read()) > 0) {
                if (/^\s*no\s+$class\s*;\s*$/) {
                    $no_seen = 1;
                    last;
                }
                $data .= $_;
                $_ = "";
            }
            return $status if $status < 0;    # propagate read errors
            $_ = $data;
            s/BANG\s+BANG/die 'BANG' if \$BANG/g;
            $_ .= "no $class;\n" if $no_seen;
            return length($_) ? 1 : 0;        # 0 signals end of input
        })
    }

    sub unimport {
        filter_del();
    }

    1;

Given this level of complexity, it's perhaps not surprising that source
code filtering is not commonly used.

This RFC proposes that a new standard module -- Filter::Simple -- be
provided, to vastly simplify the task of source code filtering, at least in
common cases.

=head2 The Filter::Simple module

Instead of the above process, it is proposed that the Filter::Simple module
would simplify the creation of source code filters to the following steps:

=over 4

=item 1. Set up a module that does a C<use Filter::Simple sub { ... }>.

=item 2.
Within the anonymous subroutine passed to C<use Filter::Simple>, process
the contents of $_ to change the source code in the desired manner.

=back

In other words, the previous example would become:

    package BANG;

    use Filter::Simple sub {
        s/BANG\s+BANG/die 'BANG' if \$BANG/g;
    };

    1 ;

=head2 Module semantics

This drastic simplification is achieved by having the standard
Filter::Simple module export into the package that C<use>s it (e.g. package
"BANG" in the above example) two automagically constructed subroutines --
C<import> and C<unimport> -- which take care of all the nasty details.

In addition, the generated C<import> subroutine passes its own argument
list to the filtering subroutine, so the BANG.pm filter could easily be
made parametric:

    package BANG;

    use Filter::Simple sub {
        my ($die_msg, $var_name) = @_;
        s/BANG\s+BANG/die '$die_msg' if \${$var_name}/g;
    };

    # and in some user code:

    use BANG "BOOM", "BAM";    # "BANG BANG" becomes: die 'BOOM' if $BAM

The specified filtering subroutine is called every time a C<use BANG> is
encountered, and passed all the source code following that call, up to
either a C<no BANG;> call or the end of the source file (whichever occurs
first).

=head1 MIGRATION ISSUES

None.

=head1 IMPLEMENTATION

A prototype implementation is available from:

    http://www.csse.monash.edu.au/~damian/CPAN/Filter-Simple.tar.gz

and should soon appear on the CPAN.

This prototype requires the Filter::Util::Call module, but it is proposed
that a standard Filter::Simple module would be self-sufficient.

=head1 NOTE

It is certainly not suggested that Filter::Simple should I<replace>
Filter::Util::Call. That module provides much more flexible control over
source code filtering, and will still be needed in many cases.

It is merely proposed that simplified code filtering covering the common
cases ought to be incorporated in the core.

=head1 REFERENCES

http://search.cpan.org/search?dist=Filter
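
=head1 EXAMPLE

The parametric substitution described under "Module semantics" can be tried
outside the source-filtering machinery by applying it to an ordinary string.
The sketch below is only an illustration, not part of the proposed module:
C<bang_filter> is a hypothetical stand-in for the filtering subroutine that
takes the source text as an explicit first argument instead of in $_, and
the sample source text is invented for the purpose.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the filtering subroutine: it receives the source
# text explicitly (rather than in $_), followed by the arguments that the
# generated import would pass along from the "use BANG ..." line.
sub bang_filter {
    my ($source, $die_msg, $var_name) = @_;
    $source =~ s/BANG\s+BANG/die '$die_msg' if \${$var_name}/g;
    return $source;
}

# Invented source text, as it might follow: use BANG "BOOM", "BAM";
# The second "BANG BANG" spans a newline.
my $source   = "BANG BANG;\nprint 1;\nBANG\n    BANG;\n";
my $filtered = bang_filter($source, "BOOM", "BAM");

print $filtered;
```

Both occurrences are rewritten, including the one split across two lines,
because the substitution sees the whole text at once. The replacement comes
out as C<die 'BOOM' if ${BAM}>, which Perl treats the same as
C<die 'BOOM' if $BAM>.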