Hi,

Today we encountered an exception in a source file that wasn't UTF-8 encoded. This first caused a warning and then an error in the morbo output, and Apache2 showed a "502 Proxy Error: Error reading from remote server" in our apache2+morbo setup.
The file on disk is "binary", because it uses a Perl source filter <http://perldoc.perl.org/perlfilter.html>. Once given to the parser, it *is* UTF-8. The same situation could occur with a third-party library whose source is encoded in something other than UTF-8 or ASCII.

Something similar was previously discussed in an old thread, "Molo::Exception generate warnings while read non-utf8 file" <https://groups.google.com/forum/#%21searchin/mojolicious/Mojo$3A$3AException$20UTF-8%7Csort:relevance/mojolicious/6c-ZavT2KzQ/vTsJO2E6RQkJ>, but no solution was presented there.

This occurs when an exception is encountered in a source file and "Mojo::Exception->throw($@)" is called because of this handler in Mojolicious.pm:

    local $SIG{__DIE__}
      = sub { ref $_[0] ? CORE::die $_[0] : Mojo::Exception->throw(shift) };

First, Mojo::Exception's "sub inspect" gave warnings, because it guesses the source file by parsing $@ and then opens that file with:

    next unless -r $file->[0] && open my $handle, '<:utf8', $file->[0];
    $self->_context($file->[1], [[<$handle>]]);

Next, we get a "Malformed UTF-8 character" fatal error from Mojolicious (when generating output?) because of the s/// in Mojo::Util's xml_escape:

    sub xml_escape {
      return $_[0] if ref $_[0] && ref $_[0] eq 'Mojo::ByteStream';
      my $str = shift // '';
      $str =~ s/([&<>"'])/$XML{$1}/ge;
      return $str;
    }

It seems that reading the non-UTF-8 data only causes a warning, but the later s/// causes a fatal exception. The behavior I see is reproduced by this snippet:

    #!/usr/bin/perl -w
    use strict;

    # binary.bin is any file whose bytes are not valid UTF-8
    open my $handle, '<:utf8', "binary.bin" or die "Cannot open binary.bin: $!";
    my $str = join('', <$handle>);    # warns here
    $str =~ s/([&<>"'])/foo/;         # dies here

I'd like to suggest that a source file with UTF-8 problems be treated as if the file were unreadable.
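The idea can be tried outside Mojolicious with a minimal, self-contained sketch. It assumes that reading malformed data through the '<:utf8' layer emits a warning in the 'utf8' category (as observed above), so that promoting this category to FATAL inside an eval turns the warning into a catchable exception. The names read_utf8_lines and write_bytes are hypothetical helpers made up for this illustration; they are not part of Mojolicious.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper (not a Mojolicious function): returns an array
# reference with the lines of the file, or undef if the file cannot be
# opened or its content is not valid UTF-8.
sub read_utf8_lines {
  my $path = shift;
  return undef unless -r $path;
  open my $handle, '<:utf8', $path or return undef;
  my $lines = eval {
    # Assumption: malformed input triggers a 'utf8' category warning on
    # read; making it FATAL turns that warning into a die inside the eval
    use warnings FATAL => 'utf8';
    [<$handle>];
  };
  return $lines;    # undef if the eval died
}

# Hypothetical helper: write raw bytes to a temporary file, return its path
sub write_bytes {
  my $bytes = shift;
  my ($fh, $path) = tempfile(UNLINK => 1);
  binmode $fh;
  print {$fh} $bytes;
  close $fh or die "close failed: $!";
  return $path;
}

my $good = write_bytes("caf\xC3\xA9\n");    # "e acute" encoded as UTF-8
my $bad  = write_bytes("caf\xE9\n");        # Latin-1 byte, invalid as UTF-8

for my $case (['valid', $good], ['invalid', $bad]) {
  my $lines = read_utf8_lines($case->[1]);
  printf "%s: %s\n", $case->[0], defined $lines ? 'context kept' : 'skipped';
}
```

The point of the sketch is only that the caller can tell a clean read from a problematic one and skip the file in the second case, instead of carrying a malformed string around until a later s/// dies on it.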
To that end, I suggest this patch (against current master HEAD 49dd3e7):

    londo@peter:~/work/mojo> git diff
    diff --git a/lib/Mojo/Exception.pm b/lib/Mojo/Exception.pm
    index b218ee0..08759f5 100644
    --- a/lib/Mojo/Exception.pm
    +++ b/lib/Mojo/Exception.pm
    @@ -20,7 +20,14 @@ sub inspect {
       # Search for context in files
       for my $file (@files) {
         next unless -r $file->[0] && open my $handle, '<:utf8', $file->[0];
    -    $self->_context($file->[1], [[<$handle>]]);
    +    # If there are UTF-8 problems in the source file, don't store any context
    +    eval {
    +      use warnings 'FATAL' => 'utf8';
    +      $self->_context($file->[1], [[<$handle>]]);
    +    };
    +    if ($@) {
    +      next;
    +    }
         return $self;
       }

For ASCII or UTF-8 encoded source files there is no change. Files with UTF-8 problems are simply not stored via _context(), but they also no longer cause the entire application to crash. The best of both worlds.

Would that be acceptable? If so, should I just create a GitHub PR? If not, can somebody suggest an alternative, given that some source files are in fact not UTF-8?

Sincerely,
Peter