Hi,

I would like some help with an encoding problem, please. I want to encode
some text (a news entry submitted via a form, to be exact) as UTF-8 and then
save it in an XML file for persistent storage. The problem is that some of the
users are Japanese and want to enter Japanese multi-byte characters. I believe
the following should work:

// Open file
if (!$handle = fopen($filename, "wb")) {
    echo("Error! Cannot open file $filename");
    exit;
}

// Generate XML string
$newsXML = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n";
$newsXML .= "<newsItem>\n";
$newsXML .= "<headline>".mb_convert_encoding($headline, "UTF-8");
$newsXML .= "</headline>\n";
$newsXML .= "<maintext>".mb_convert_encoding($maintext, "UTF-8");
$newsXML .= "</maintext>\n";
$newsXML .= "</newsItem>\n";

// Encode
$newsXML = mb_convert_encoding($newsXML, "UTF-8","auto");
$encodedNewsXML = utf8_encode($newsXML);
echo("<p>".mb_detect_encoding($encodedNewsXML)."</p>");
echo("<p>".$encodedNewsXML."</p>");

// Write news item content to the file
if (fwrite($handle, $encodedNewsXML) === false) {
    echo("Error! Could not write to file $filename");
    exit;
}
echo("Success, wrote news item to the file $filename");
fclose($handle);

... but it doesn't! :-( Whenever I run this, it displays "ASCII" followed by
the Japanese text (both kanji and kana). Note that the characters _are_
displayed correctly even though the encoding is detected as ASCII, which
doesn't make sense to me. The script then happily proceeds to save the file as
ASCII, and consequently, when the main script reads the saved file, it replaces
every character with "????". Other UTF-8 files, saved in an external editor
such as Bluefish or GEdit, _can_ be read correctly, so the problem must be in
the encoding step.
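
To narrow this down, here is a small diagnostic sketch rather than a fix. It
dumps the raw bytes of a string and shows what a second encoding pass does to
text that is already UTF-8. The helper names hexDump() and looksLikeEntities()
are made up for illustration, and the sample strings are assumptions. The idea
that the form might be delivering numeric character references (such as
&#12354;) is only a guess that would explain both the "ASCII" result and the
correct display in the browser.

```php
<?php
// Diagnostic sketch only. hexDump() and looksLikeEntities() are hypothetical
// helpers, not part of PHP or of the script above.

// Return the raw bytes of a string as space-separated hex pairs.
function hexDump($s)
{
    return trim(chunk_split(bin2hex($s), 2, ' '));
}

// True if the string contains numeric character references such as &#12354;
function looksLikeEntities($s)
{
    return preg_match('/&#[0-9]+;/', $s) === 1;
}

// Sample data (assumed): HIRAGANA LETTER A, U+3042.
$utf8   = "\xE3\x81\x82";   // real UTF-8 bytes
$entity = "&#12354;";       // the same character as a numeric reference

// Case 1: genuine UTF-8 shows up as multi-byte sequences.
echo hexDump($utf8), "\n";                              // e3 81 82
var_dump(mb_check_encoding($utf8, "UTF-8"));            // bool(true)

// Case 2: a numeric reference is plain ASCII, so it is valid in every
// ASCII-compatible encoding and a browser still renders it as Japanese.
var_dump(mb_check_encoding($entity, "ASCII"));          // bool(true)
var_dump(looksLikeEntities($entity));                   // bool(true)

// Case 3: treating UTF-8 bytes as ISO-8859-1 and converting again
// (the assumption utf8_encode() makes) turns 3 bytes into 6.
$twice = mb_convert_encoding($utf8, "UTF-8", "ISO-8859-1");
echo hexDump($twice), "\n";                             // c3 a3 c2 81 c2 82
```

Running hexDump() on $headline straight from the form should show which of the
three cases applies. Note also that utf8_encode() assumes ISO-8859-1 input, and
that mb_convert_encoding() without a third argument takes the source encoding
from mb_internal_encoding(), so the code above may be converting the same text
more than once.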

And before you ask, yes, the mb_* functions are supported on the PHP server.
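
For completeness, a minimal sketch of how that can be checked, along with the
two mbstring settings that influence conversion. The function name
mbstringReport() is hypothetical; the calls inside it are standard PHP. The
values it prints depend on the server configuration, so none are shown here.

```php
<?php
// Sketch: report whether mbstring is loaded and which defaults it uses.
// mbstringReport() is a hypothetical helper name.
function mbstringReport()
{
    if (!extension_loaded('mbstring')) {
        return array('loaded' => false);
    }
    return array(
        'loaded'            => true,
        // Source encoding assumed when mb_convert_encoding() gets no
        // third argument.
        'internal_encoding' => mb_internal_encoding(),
        // Candidate list consulted by mb_detect_encoding() when no
        // encoding list is passed.
        'detect_order'      => mb_detect_order(),
    );
}

print_r(mbstringReport());
```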

What am I doing wrong? Please let me know; I've been struggling a long time with
this and will be very grateful for any assistance.

Best regards,
Jan
