php-i18n Digest 24 Jan 2003 15:20:28 -0000 Issue 145
Topics (messages 418 through 425):
mb_detect_encoding mb_convert_encoding and rss
418 by: Tony Laszlo
419 by: Moriyoshi Koizumi
420 by: Tony Laszlo
Re: Problem with gettext
421 by: Jan Schneider
Re: Content-Length: and multibyte
422 by: Yasuo Ohgaki
problem with mbstring (search)functions
423 by: Simon Dedeyne
424 by: Moriyoshi Koizumi
425 by: Simon Dedeyne
----------------------------------------------------------------------
--- Begin Message ---
In Xoops, modules/headlines/blocks/headlines.php ,
right around:
$synd = new RSStoHTML($headlinesurl, $cache_dir,
$cache_file, $cache_time, $max_items);
$block['content'] .=
"<b>".$synd->getTitle()."</b><br />";
$block['content'] .= $synd->getHtml();
(I think)
mb_convert_encoding and mb_detect_encoding are
being inserted more or less as follows:
$str = mb_convert_encoding($str,"encoding1","encoding2");
and
mb_detect_order("encoding1,encoding2");
$ary[] = "encoding1";
$ary[] = "encoding2";
mb_detect_order($ary);
I need to use lines like the above
to convert RSS feeds of _all_ possible
encodings to one encoding: UTF-8.
I am doing this in PostNuke, however,
and don't know where to insert this
code.
I suspect it would be /includes/pnAPI.php
and/or
/includes/blocks/rss.php
I have some notes and code related to
this up at:
http://www.issho.org/modules.php?op=modload&name=DB_phpBB2&file=viewtopic&p=56#56
Any ideas would be very much appreciated.
Thanks.
Tony Laszlo
http://www.issho.org/laszlo.html
--- End Message ---
--- Begin Message ---
> I need to use lines like the above
> to convert RSS feeds of _all_ possible
> encodings to one encoding: UTF-8.
>
How about passing multiple encoding names to mb_convert_encoding()?
mb_convert_encoding($str, "UTF-8", array("ISO-8859-15", "EUC-JP", "GB2312"));
Moriyoshi
--- End Message ---
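A minimal sketch of the multiple-encoding suggestion above, for readers who want to try it. The helper name `to_utf8` and its default candidate list are assumptions made for illustration; they are not part of Xoops, PostNuke, or mbstring. Only `mb_detect_encoding()` and `mb_convert_encoding()` are real mbstring calls. Single-byte encodings such as ISO-8859-15 accept almost any byte sequence, so the sketch lists them last.

```php
<?php
// Hypothetical helper: convert $raw to UTF-8, guessing the source encoding
// from a candidate list. Requires the mbstring extension.
function to_utf8(string $raw, array $candidates = ['UTF-8', 'EUC-JP', 'SJIS', 'ISO-8859-15']): string
{
    // Strict mode rejects candidates for which $raw contains invalid bytes.
    $from = mb_detect_encoding($raw, $candidates, true);
    if ($from === false) {
        // Nothing matched: return the input unchanged rather than guess.
        return $raw;
    }
    return mb_convert_encoding($raw, 'UTF-8', $from);
}

// Usage: build an EUC-JP sample from a UTF-8 literal, then convert it back.
$eucJp = mb_convert_encoding('日本語', 'EUC-JP', 'UTF-8');
$utf8  = to_utf8($eucJp, ['UTF-8', 'EUC-JP']);
echo bin2hex($utf8), "\n";
```

Detection is a guess, so a short or ambiguous string can be assigned the wrong encoding; a narrower candidate list gives more predictable results.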
--- Begin Message ---
Thank you for your kind response.
On Wed, 22 Jan 2003, Moriyoshi Koizumi wrote:
> How about passing multiple encoding names to mb_convert_encoding()?
>
> mb_convert_encoding($str, "UTF-8", array("ISO-8859-15", "EUC-JP", "GB2312"));
* While I would gladly write out each and every encoding in
existence (as you can see from the top page here:
http://www.issho.org/ , pretty much every language, and
encoding, out there needs to be supported),
a way to wildcard it would be preferred. :)
Is there no such way?
* The other part of the problem is perhaps the larger;
the convoluted code in Postnuke. I know it is not popular
in Japan, but some people have had experience with it,
surely.
Where should I operate when attempting to convert the
encoding of the incoming RSS feeds?
(I really _would_ use Xoops and the very fine fix
that those developers have devised, _if only_ multiple
language interfaces were a possibility... ) :)
Viva la PHP.
--
Tony Laszlo
http://www.issho.org/LaszloBlog/
--- End Message ---
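One possible answer to the "wildcard" question above, offered as a sketch rather than as anything the list proposed: an RSS feed is XML, and its XML declaration often names the encoding, so the feed can say which encoding to convert from. The helper names `declared_xml_encoding` and `rss_to_utf8` and the fallback list are hypothetical. The sketch assumes the declaration is readable as ASCII, so it does not cover UTF-16 feeds.

```php
<?php
// Hypothetical helper: read the encoding named in the XML declaration, if any.
function declared_xml_encoding(string $xml): ?string
{
    $pattern = '/^\s*<\?xml[^>]*\bencoding\s*=\s*["\']([A-Za-z0-9._-]+)["\']/';
    return preg_match($pattern, $xml, $m) === 1 ? $m[1] : null;
}

// Hypothetical helper: convert a feed to UTF-8 using the declared encoding,
// falling back to detection over $fallback when nothing is declared.
function rss_to_utf8(string $xml, array $fallback = ['UTF-8', 'EUC-JP', 'SJIS']): string
{
    $from = declared_xml_encoding($xml);
    if ($from === null) {
        $from = mb_detect_encoding($xml, $fallback, true);
    }
    if ($from === false) {
        return $xml; // unknown encoding: leave the feed untouched
    }
    // An encoding name that mbstring does not know makes this call fail
    // (a warning on older PHP versions, a ValueError on PHP 8).
    return mb_convert_encoding($xml, 'UTF-8', $from);
}

// Usage with a small EUC-JP feed built from a UTF-8 literal.
$feed = '<?xml version="1.0" encoding="EUC-JP"?><rss><title>'
      . mb_convert_encoding('日本語', 'EUC-JP', 'UTF-8')
      . '</title></rss>';
echo rss_to_utf8($feed), "\n";
```

The converted text still carries the original declaration, so a caller that passes the result to an XML parser would need to rewrite or drop that declaration.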
--- Begin Message ---
Eneko Lacunza wrote:
> Hi,
>
> On Sat, 11 Jan 2003 16:30:18 +0100, Jan Schneider wrote:
>> Marcos Lois Bermúdez wrote:
>>> I have RH 7.3 with PHP and gettext support, and I get this strange
>>> behaviour: if the .mo files are under the document root of the web
>>> server, it seems to work well, but if I put the .mo files in a
>>> directory outside the document root, it works only some of the time.
>>> I get the messages translated, but when I reload the page, sometimes
>>> the page comes back untranslated.
>>> Is this normal?
>>> Is it a bug?
>>
>> Yes. Yes.
>> Using gettext in PHP I have seen strange behaviour more often than I
>> can count. This particular behaviour can sometimes be fixed by
>> restarting the webserver. No idea why.
>
> This was very helpful, as I was suffering from the same problem.
> Restarting the Apache server made it work well.
> Interestingly enough, some PHP pages worked well, but some others
> failed; using one locale they failed 3 times out of 4; with another
> locale, they failed 1 time out of 4. The failing ones were in one
> directory (parent) and the working one in another (subdirectory)
> (couldn't test with more, sorry).
> My tests were done on Red Hat 7.3 with updated stock
> apache/php/gettext (4.1.2).
> Does anyone know if there's another way to "fix" this? I don't want to
> have the production server broken and I don't have administrative
> access to it (just ftp).
Not really, as gettext is somehow broken and shows this strange behaviour.
This behaviour appeared more often when we used the two-letter language
codes. As soon as we switched to the ll_CC form (de_DE, en_US), things
started to work much more smoothly.
Jan.
--- End Message ---
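The ll_CC advice above can be sketched as follows. The `full_locale` helper, the mapping table, the `messages` domain, and the `locale` directory are all hypothetical; only the de_DE and en_US values come from the message. This is an illustration of the idea, not a fix confirmed to cure the intermittent behaviour described in the thread.

```php
<?php
// Hypothetical mapping from two-letter codes to full ll_CC locales.
// The de_DE and en_US entries come from the message above; extend as needed.
function full_locale(string $code, array $map = ['de' => 'de_DE', 'en' => 'en_US']): string
{
    // Already in ll_CC form (e.g. "de_DE"): keep it as is.
    if (preg_match('/^[a-z]{2}_[A-Z]{2}$/', $code) === 1) {
        return $code;
    }
    $key = strtolower($code);
    return $map[$key] ?? $code; // unknown codes pass through unchanged
}

$locale = full_locale('de');

// Only touch gettext when the extension and LC_MESSAGES are available.
if (function_exists('bindtextdomain') && defined('LC_MESSAGES')) {
    putenv("LC_ALL=$locale");
    setlocale(LC_MESSAGES, $locale);
    bindtextdomain('messages', __DIR__ . '/locale'); // hypothetical domain and path
    textdomain('messages');
}
```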
--- Begin Message ---
Jan Schneider wrote:
> OK, this is not really PHP related but I thought I might ask anyway:
> If sending the Content-Length: HTTP header to the browser for a page
> that's encoded in a multibyte charset, do I use the binary length or the
> character length?
I suppose you know the various issues related to the output buffer...
Use the binary length (byte length).
--
Yasuo Ohgaki
--- End Message ---
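To illustrate the byte-length answer above, here is a small sketch comparing the two lengths for a UTF-8 string. The sample text and variable names are made up for the example, and the source file is assumed to be saved as UTF-8.

```php
<?php
$body = '日本語のページ'; // sample page body, 7 characters

$charLength = mb_strlen($body, 'UTF-8'); // number of characters
$byteLength = strlen($body);             // number of bytes, what Content-Length expects

// mb_strlen() with the '8bit' encoding also counts bytes, and is not
// affected by the old mbstring.func_overload setting.
$byteLengthSafe = mb_strlen($body, '8bit');

$header = 'Content-Length: ' . $byteLength;
echo $header, "\n";
```

In a real page the value would be passed to `header()` before any output is sent, and it must describe the bytes actually sent after any output-buffer conversion.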
--- Begin Message ---
When trying to do a search in a mbstring, I sometimes(!) get the following error:
"
Warning: mb_ereg_search()[function.mb-ereg-search]:mbregex compile err: premature end
of regular expression in c:\myfile.php on line 12.
"
It only happens for some strings. Why is this? I assume it must be something related
to the word itself interacting with the mb functions.
I'm using PHP 4.3.0. I added an example below where it fails... I noticed that if I
add a space to the $word string it works, but I'm sure there's nothing wrong with
$word.
Thanks a lot!!!!!
Simon
Maybe this little file helps to see what i'm talking about: Copy & paste away
<html>
<head>
<meta http-equiv="Content-Type" content="Text/Html; Charset=UTF-8">
<title>coding-problem</title>
</head>
<body>
<?php
$examplesnt="ã‚ãŸã—ã‚";
$word="愛犬";
//**************************FUNCTIONS*******************************************
function find_word($word, $sentence)
{
    // Register the subject string, then search it for the pattern.
    mb_ereg_search_init($sentence);
    $result = mb_ereg_search($word);
    if ($result) {
        echo "$sentence contains $word<br>";
    }
    return $result;
}
//************************MAIN**************************************************
/* Set internal character encoding to UTF-8 */
mb_internal_encoding("UTF-8");
/* Display current internal character encoding */
echo mb_internal_encoding()."<br>";
echo $examplesnt." is the example sentence<br>";
echo $word." is the word we're looking for in the sentence<br>"; // I know the word is not in it, but that doesn't matter
find_word($word,$examplesnt);
?>
</body>
</html>
--- End Message ---
--- Begin Message ---
Hi,
> Warning: mb_ereg_search()[function.mb-ereg-search]:mbregex compile
> err: premature end of regular expression in c:\myfile.php on line 12.
Did you try mb_ereg() yet? If you see the same error message with it,
it's possibly a bug.
http://www.php.net/mb_ereg
And you can use preg_* functions as well if your scripts use UTF-8 only.
Regards,
Moriyoshi
--- End Message ---
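For a UTF-8-only script, the preg_* route mentioned above might look like the sketch below. The function names and the sample sentence are hypothetical. Since the original example searches for a literal word rather than a pattern, a plain `mb_strpos()` variant is shown as well.

```php
<?php
// Literal substring test; no regular expression involved.
function contains_word(string $word, string $sentence): bool
{
    return mb_strpos($sentence, $word, 0, 'UTF-8') !== false;
}

// The same test with preg_match(); the "u" modifier treats pattern and
// subject as UTF-8, and preg_quote() escapes any regex metacharacters.
function contains_word_preg(string $word, string $sentence): bool
{
    return preg_match('/' . preg_quote($word, '/') . '/u', $sentence) === 1;
}

// Usage with a made-up sentence that does contain the word.
var_dump(contains_word('愛犬', 'わたしの愛犬'));
var_dump(contains_word_preg('愛犬', 'わたしの愛犬'));
```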
--- Begin Message ---
Hi,
Yes, I tried mb_ereg, and I've got the same problem, so it must be a
bug.
However, I followed your suggestion and used the regular ereg function,
and that works. I suppose it would be really nice to see some kind of
dedicated Japanese language and PHP tutorial around...
Well, wishful thinking!
Thanks for the reply.
Simon
--- End Message ---