I tried to create a short patch. Perhaps it became too short and this is
better:

londo@peter:~/work/mojo> git diff
diff --git a/lib/Mojo/Exception.pm b/lib/Mojo/Exception.pm
index b218ee0..64a6899 100644
--- a/lib/Mojo/Exception.pm
+++ b/lib/Mojo/Exception.pm
@@ -20,7 +20,15 @@ sub inspect {
   # Search for context in files
   for my $file (@files) {
     next unless -r $file->[0] && open my $handle, '<:utf8', $file->[0];
-    $self->_context($file->[1], [[<$handle>]]);
+    # If there are UTF-8 problems in the source file, don't store any context
+    my @lines = eval {
+      use warnings 'FATAL' => 'utf8';
+      <$handle>;
+    };
+    if ($@) {
+      next;
+    }
+    $self->_context($file->[1], [\@lines]);
     return $self;
   }

Now we're not ignoring any potentially unrelated errors from
$self->_context(). I'll shut up for now. :-)
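
For anyone who wants to try the idea outside of Mojo::Exception, here is a
minimal standalone sketch of the same approach. The helper name
read_utf8_lines is made up for illustration and is not part of Mojolicious.
It assumes that perl emits its "does not map to Unicode" warning (category
utf8) while reading through the :utf8 layer, which is what the patch above
relies on:

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Mojo::Exception): returns an array ref
# with the lines of $path, or undef if the file cannot be opened or its
# contents are not valid UTF-8.
sub read_utf8_lines {
  my $path = shift;
  open my $handle, '<:utf8', $path or return;

  my @lines;
  my $ok = eval {
    # Promote utf8 warnings to exceptions, but only inside this block
    use warnings FATAL => 'utf8';
    @lines = <$handle>;
    1;
  };

  # Only the read itself is guarded, so unrelated errors elsewhere
  # are not swallowed
  return $ok ? \@lines : undef;
}
```

A caller would then skip the file when the helper returns undef, in the same
way the patch does with "next".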

Peter

On Mon, Nov 7, 2016 at 12:35 PM, Peter Valdemar Mørch <pmo...@gmail.com>
wrote:

> Hi,
>
> Today we encountered an exception in a source file that wasn't in UTF-8.
> This caused first a warning and then an error in the morbo output. Apache2
> showed a "502 Proxy Error: Error reading from remote server" in our
> apache2+morbo setup.
>
> The file on-disk is "binary", because it uses a perl source filter
> <http://perldoc.perl.org/perlfilter.html>. Once given to the parser, it
> *is* UTF-8. The same situation could have occurred in a 3rd-party
> library if it was encoded in something other than UTF-8 or ASCII. Something similar
> was previously discussed in an old thread, Molo::Exception generate
> warnings while read non-utf8 file
> <https://groups.google.com/forum/#%21searchin/mojolicious/Mojo$3A$3AException$20UTF-8%7Csort:relevance/mojolicious/6c-ZavT2KzQ/vTsJO2E6RQkJ>,
> but no solution was presented there.
>
> This occurs when an exception is encountered in a source file, and
> "Mojo::Exception->throw($@)" is called because of Mojolicious.pm's
>
> local $SIG{__DIE__}
>   = sub { ref $_[0] ? CORE::die $_[0] : Mojo::Exception->throw(shift) };
>
> First, Mojo::Exception's "sub inspect" gave warnings because it guesses
> the source file from parsing $@ and opens the source file with:
>
> next unless -r $file->[0] && open my $handle, '<:utf8', $file->[0];
> $self->_context($file->[1], [[<$handle>]]);
>
> Next we get a "Malformed UTF-8 character" fatal error from Mojolicious
> (when generating output?) because of a s/// in Mojo::Util's xml_escape:
>
> sub xml_escape {
>   return $_[0] if ref $_[0] && ref $_[0] eq 'Mojo::ByteStream';
>   my $str = shift // '';
>   $str =~ s/([&<>"'])/$XML{$1}/ge;
>   return $str;
> }
>
> It seems that reading the non-UTF-8 data only causes a warning, but
> the later s/// causes a fatal exception. The behavior I see is replicated
> by this snippet:
>
> #!/usr/bin/perl -w
> use strict;
> # binary.bin is any file whose bytes are not valid UTF-8
> open my $handle, '<:utf8', "binary.bin" or die "Cannot open binary.bin: $!";
> my $str = join('', <$handle>);
> $str =~ s/([&<>"'])/foo/;
>
> I'd like to suggest that reading a source file with UTF-8 problems be
> treated as if the file was unreadable. To that end, I suggest this patch
> (against current master HEAD 49dd3e7):
>
> londo@peter:~/work/mojo> git diff
> diff --git a/lib/Mojo/Exception.pm b/lib/Mojo/Exception.pm
> index b218ee0..08759f5 100644
> --- a/lib/Mojo/Exception.pm
> +++ b/lib/Mojo/Exception.pm
> @@ -20,7 +20,14 @@ sub inspect {
>    # Search for context in files
>    for my $file (@files) {
>      next unless -r $file->[0] && open my $handle, '<:utf8', $file->[0];
> -    $self->_context($file->[1], [[<$handle>]]);
> +    # If there are UTF-8 problems in the source file, don't store any context
> +    eval {
> +      use warnings 'FATAL' => 'utf8';
> +      $self->_context($file->[1], [[<$handle>]]);
> +    };
> +    if ($@) {
> +      next;
> +    }
>      return $self;
>    }
>
> For ASCII or UTF-8 encoded source files there is no change. Files with
> UTF-8 problems are simply not stored via _context(), but they also no
> longer cause the entire application to crash. The best of both worlds.
>
> Would that be acceptable? If so, should I just create a github PR? If not,
> can somebody suggest an alternative given that some source files are in
> fact not UTF-8?
>
> Sincerely,
>
> Peter
>
