utf8 handling in perl is still fraught with peril, but I won't go into all the reasons for that here.

What you are probably seeing is perl auto-upgrading your strings to utf8
when it concatenates a non-utf8 string with a utf8 string. This
concatenation happens in HTML::Mason::Request::print, which appends each
string to the output buffer. So if a module such as XML::RSS is returning
utf8-encoded strings, it will infect all your other strings with the
utf8-ness. Because Mason concatenates all the strings, the first time a
utf8-encoded string is output, the whole buffer string is upgraded.

Your first H1 was not upgraded because the buffer was flushed before a
utf8 string was added to it. A good test to confirm whether this is the
case would be to remove the flush_buffer call and see if the first string
also becomes utf8.

Unfortunately I do not know of any elegant solution for this. You could
patch your version of Mason like this to make sure the string never gets
upgraded, but that is terribly ugly and I would not do that myself unless
there were bugs in perl itself[1] which prevented me from using a better
method.

--- HTML-Mason-1.33/lib/HTML/Mason/Request.pm
+++ Request.pm
@@ -1168,7 +1168,10 @@
     );
     # use 'if defined' for maximum efficiency; grep creates a list.
+    use Encode;
+    Encode::_utf8_off($$bufref);
     for ( @_ ) {
+        Encode::_utf8_off($_);
         $$bufref .= $_ if defined;
     }

~ John Williams

[1] such as <http://rt.perl.org/rt3//Public/Bug/Display.html?id=36248>

On Wed, 6 Dec 2006, Vegard Vesterheim wrote:

> I have experienced a puzzling behaviour with the use of
> flush_buffer. I use iso-latin-1 encoding on my pages, but I have a
> specific Mason page which immediately after a call to flush_buffers
> starts producing utf-8 encoded content.
>
> This is a snippet from the page which exhibits this behaviour
> ----- snip - snip -------------------------------------------------
> <h1>øæåØÆÅ</h1>
> % $m->flush_buffer;
> <h1>øæåØÆÅ</h1>
> ----- snip - snip -------------------------------------------------
> The first H1 content is correctly encoded as iso-latin-1, but the
> second is utf-8.
>
> This page is rather complex, and includes among other things some RSS
> processing (XML::RSS). I will try to produce a smaller test case which
> reproduces the problem, but until then I was wondering if anyone on
> this list could explain this behaviour.
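
For anyone who wants to see the auto-upgrade from the reply in isolation, here is a minimal sketch outside of Mason. It is not Mason's code: the variable names are made up, and the decoded string only stands in for whatever a module such as XML::RSS might hand back (an assumption for illustration). It appends a latin-1 byte string and then a utf8-flagged string to one buffer and checks the flag after each step.

```perl
use strict;
use warnings;
use Encode ();

# A byte string holding the iso-latin-1 bytes for "øæå" (utf8 flag off).
my $latin1 = "\xF8\xE6\xE5";

# A utf8-flagged character string; hypothetical stand-in for module output.
my $flagged = Encode::decode('UTF-8', "caf\xC3\xA9");

my $buffer = '';
$buffer .= $latin1;
my $before = utf8::is_utf8($buffer) ? 1 : 0;    # flag is still off here

$buffer .= $flagged;                            # perl upgrades the whole buffer
my $after = utf8::is_utf8($buffer) ? 1 : 0;

# Look at the bytes perl now stores internally: flip the flag off on a copy.
my $internal = $buffer;
Encode::_utf8_off($internal);

printf "utf8 flag before: %d, after: %d\n", $before, $after;
printf "internal bytes: %s\n",
    join ' ', map { sprintf('%02X', ord($_)) } split //, $internal;
```

Once the flagged string is appended, the latin-1 bytes that were already in the buffer are stored as two-byte utf8 sequences. That matches the symptom in the original report if those internal bytes reach the client unchanged.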