Terje Bless <[EMAIL PROTECTED]> writes:

> In a CGI app I'm using HTML::Parser to rewrite web pages before further
> processing. It appears that a document containing an ASCII NUL byte (in
> this case, as the last byte in the file) has the NUL byte stripped after
> processing with HTML::Parser.

It appears that you forgot to call the 'eof' method to flush remaining
buffered text.  I don't think there should be any problems with NUL.

>   The offending code is below, and the version
> of HTML::Parser is:
> 
> % perl -MHTML::Parser -e 'print $HTML::Parser::VERSION'
> 3.26
> 
> 
> code:
> 
> 
> sub supress_doctype {
>   no strict 'vars';
>   my $file = shift; # $file = \@file
>   local $HTML = '';
> 
>   HTML::Parser->new(
>     default_h => [sub {$HTML .= shift}, 'text'],
>     declaration_h => [sub {$HTML .= '<!-- ' . $_[0] . ' -->'}, 'text']
>   )->parse(join "\n", @{$file});
                                 ^
                                 insert '->eof' here.

>   return [split /\n/, $HTML];
> }
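
Spelled out in full, the fix might look like the sketch below. It assumes
the rest of the routine stays as posted; the sub name `suppress_doctype`
(spelling normalized), the lexical `$html` and the explicit
`api_version => 3` are illustrative choices, not part of the original code.
The parser can hold back trailing text until it knows no more input is
coming, which is why the final `eof` call matters.

```perl
use strict;
use warnings;
use HTML::Parser ();

# Hypothetical rewrite of the quoted routine: same handlers, plus eof.
sub suppress_doctype {
    my ($lines) = @_;    # array ref of lines, as in the original
    my $html = '';       # lexical accumulator instead of a package global

    my $p = HTML::Parser->new(
        api_version   => 3,
        # Everything not handled elsewhere is copied through unchanged.
        default_h     => [ sub { $html .= shift }, 'text' ],
        # Declarations such as <!DOCTYPE ...> are wrapped in a comment.
        declaration_h => [ sub { $html .= '<!-- ' . $_[0] . ' -->' }, 'text' ],
    );

    $p->parse(join "\n", @$lines);
    $p->eof;             # flush any text still buffered by the parser

    return [ split /\n/, $html ];
}
```

Since `parse` returns the parser object, chaining `->parse(...)->eof` on
the existing one-liner should be equivalent.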

Regards,
Gisle
