* Kragen Javier Sitaker <[EMAIL PROTECTED]> [2006-11-11 09:37]:
> #!/usr/bin/perl -w
> 
> # find /var/cache/wwwoffle/http -type f | perl this-script > tmp.html
> 
> # Generate a really big page of small inline JPEG images from a
> # listing of files in the WWWOFFLE cache.  The WWWOFFLE cache
> # is in /var/cache/wwwoffle/http on my system, and it's not the
> # most convenient possible structure for doing this kind of
> # thing (it has files containing HTTP responses, not the actual
> # resources), but it doesn't take all that much work to grab
> # stuff out of it.  You still have to be running WWWOFFLE for
> # the resulting page not to take an absurdly long time to load,
> # and it still might hork your browser.
> 
> while (<>) {
>   chomp;
>   my $fname = $_;
>   my ($path, $type, $id) = /^(.*)\/([DU])([^\/]*)$/;  # data or URL
>   print STDERR "$_ path $path type $type id $id\n" if not defined $type;
>   if ($type eq 'D') {
>     open FILE, "<$fname" or die "Can't open $fname: $!";
>     while (<FILE>) {  # grab content-type out of HTTP header, hope it's there
>       if (/^Content-Type: (.*)\r\n/) {  # specify \r so . doesn't match it
>       my $ctype = $1;
>       #print STDERR "content-type: <$ctype>\n";
>       if ($ctype eq 'image/jpeg') {
>         open URL, "<$path/U$id" or die "Can't open $id: $!"; # crappy msg
>         my $url = <URL>;
>         print qq(<img src="$url" width="128" height="128" />\n);
>       }
>         last;  # don't bother reading the rest of the file
>       }
>     }
>   }
> }
> __END__

A cleaner version that crawls the filesystem without external aid:

    #!/usr/bin/perl
    use strict;
    use warnings;
    use File::Find;

    # perl this-script /var/cache/wwwoffle/http > tmp.html

    sub file_matches {
        my ( $rx, filename ) = @_;
        open my $fh, '<', $filename or die "Can't open $filename: $!";
        my $line;
        $line =~ $rx and return 1 while defined $line = <$fh>;
        return;
    }

    sub find_response_where (&;@) {
        my $is_interesting = shift;
        my @path;
        find sub {
            my ( $type, $id ) = m[ \A ([DU]) (.*) ]x;
            push @path, $File::Find::name
                if defined $type
                and $type eq 'D'   # D = data, U = URL
                and -f
                and $is_interesting->( $_ )
            }
        }, @_;
        return @path;
    }

    my $criterion = qr[ \A Content-Type: [ ]* image/jpeg \r\n ]ix;

    @ARGV = find_response_where { file_matches $criterion, shift } @ARGV;

    s[ \A (.*) /D ]{ $1 . '/U' }ex for @ARGV;

    print qq(<img src="$_" width="128" height="128" />\n) while <>;


This code uses a minor trick: the <> operator will open and read
all files listed in @ARGV sequentially, so the code stuffs the
names of the data files of interest into that array, then turns
them into URL file names, then uses the diamond operator to read
them.

But the punchline is the implementation of this same thing in
terms of the File::Find::Rule module from the CPAN:

    #!/usr/bin/perl
    use strict;
    use warnings;
    use File::Find::Rule;

    # perl this-script /var/cache/wwwoffle/http > tmp.html

    @ARGV = File::Find::Rule
        ->name( 'D*' )
        ->file
        ->grep( qr/ \A Content-Type: [ ]* image/jpeg \r\n /ix )
        ->in( @ARGV );

    s[ \A (.*) /D ]{ $1 . '/U' }ex for @ARGV;

    print qq(<img src="$_" width="128" height="128" />\n) while <>;


You can see that F::F::R is the bee’s knees.

-- 
*AUTOLOAD=*_;sub _{s/(.*)::(.*)/print$2,(",$\/"," ")[defined wantarray]/e;$1}
&Just->another->Perl->hack;
#Aristotle

Reply via email to