The original problem, which I posted to the users list, was that gadgets with
non-UTF-8 encodings (I tested with iso-8859-1) were losing all non-ASCII
characters in both the title (metadata call) and the content (gadget rendering
call).
The details of the problem and the solution are as follows:
In BasicRemoteContentFetcher this line:
$content = mb_convert_encoding($content, 'UTF-8', $charset);
converts the fetched XML string to UTF-8, whatever encoding it was in
($charset is the source encoding).
But the XML declaration line is not touched, so after this we may have a
gadget like this:
<?xml version="1.0" encoding="iso-8859-1"?>
<Module>
<ModulePrefs title="IñtërnâtiônàlizætiønX" />
<Content type="html"> <![CDATA[
]]> </Content>
</Module>
which is UTF-8 encoded but with an iso-8859-1 encoding attribute.
Later in the call (metadata request or gadget rendering), in
GadgetSpecParser->parse(), we load the XML content into an XML DOM object. This
is where the error occurs, naturally, since the UTF-8 content is flagged as
being in iso-8859-1.
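
The mismatch can be reproduced outside the gadget server. The following is only a minimal sketch, assuming PHP with the mbstring and dom extensions. The sample gadget, the shortened title and the variable names are made up for illustration and are not taken from the Shindig code. It converts an iso-8859-1 document to UTF-8 without touching the declaration, loads it with DOMDocument, and compares the parsed title with the original. The exact symptom inside the gadget server may differ, but the title should not survive the round trip.

```php
<?php
// Requires the mbstring and dom extensions.

// "Iñtër" written as UTF-8 byte escapes, so the example does not depend on
// the encoding of this source file.
$title = "I\xC3\xB1t\xC3\xABr";

// Build the gadget as a remote server might send it: iso-8859-1 bytes with a
// matching XML declaration. (Illustrative sample, not a real gadget.)
$utf8Xml = '<?xml version="1.0" encoding="iso-8859-1"?>'
    . '<Module><ModulePrefs title="' . $title . '" /></Module>';
$fetched = mb_convert_encoding($utf8Xml, 'ISO-8859-1', 'UTF-8');

// The conversion step on its own: the bytes become UTF-8, but the
// declaration still says iso-8859-1.
$content = mb_convert_encoding($fetched, 'UTF-8', 'iso-8859-1');

// Parse it the way a DOM-based spec parser would.
$doc = new DOMDocument();
$doc->loadXML($content);
$parsedTitle = $doc->getElementsByTagName('ModulePrefs')
    ->item(0)
    ->getAttribute('title');

// Reports whether the title survived the round trip.
var_dump($parsedTitle === $title);
```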
My fix is as follows:
In BasicRemoteContentFetcher->parseResult replace:
$content = mb_convert_encoding($content, 'UTF-8', $charset);
with
$content = mb_convert_encoding($content, 'UTF-8', $charset);
$pattern = 'encoding=\s*([' . '\'"])' . $charset . '\s*\1';
$content = mb_ereg_replace($pattern, 'encoding="UTF-8"', $content, "i");
Now the XML is UTF-8 encoded and has the correct UTF-8 encoding attribute.
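
For anyone who wants to check the idea in isolation, here is a hedged sketch of a variation on the fix, not the patch itself. It assumes PHP 7 or later with the mbstring and dom extensions. The helper name toUtf8Xml and the sample gadget are hypothetical. Unlike the patch, it uses preg_replace with preg_quote and only matches a declaration at the very start of the document, so a literal encoding="..." string inside the gadget body is left alone.

```php
<?php
// Requires the mbstring and dom extensions.

/**
 * Hypothetical helper (not part of Shindig): convert fetched XML to UTF-8
 * and rewrite the encoding named in the XML declaration to match.
 */
function toUtf8Xml(string $content, string $charset): string
{
    $content = mb_convert_encoding($content, 'UTF-8', $charset);

    // Only match an XML declaration at the very start of the document, and
    // escape the charset name so it is treated as literal text.
    $pattern = '/^(<\?xml\b[^>]*?\bencoding\s*=\s*)(["\'])'
        . preg_quote($charset, '/') . '\2/i';

    // preg_replace returns null on error; fall back to the converted text.
    return preg_replace($pattern, '$1"UTF-8"', $content, 1) ?? $content;
}

// "Iñtër" as UTF-8 byte escapes, independent of this file's encoding.
$title = "I\xC3\xB1t\xC3\xABr";

// Illustrative gadget, delivered as iso-8859-1 bytes.
$fetched = mb_convert_encoding(
    '<?xml version="1.0" encoding="iso-8859-1"?>'
        . '<Module><ModulePrefs title="' . $title . '" /></Module>',
    'ISO-8859-1',
    'UTF-8'
);

$fixed = toUtf8Xml($fetched, 'iso-8859-1');

$doc = new DOMDocument();
$doc->loadXML($fixed);
$parsedTitle = $doc->getElementsByTagName('ModulePrefs')
    ->item(0)
    ->getAttribute('title');

// Reports whether the declaration was rewritten and the title survived.
var_dump(strpos($fixed, 'encoding="UTF-8"') !== false, $parsedTitle === $title);
```

If the declaration has no encoding attribute at all (for example when $charset came from an HTTP header), this sketch leaves the declaration unchanged, which is harmless because XML without a declared encoding is read as UTF-8 by default.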
Justin