php-i18n Digest 24 Jan 2003 15:20:28 -0000 Issue 145

php-i18n-digest-help Fri, 24 Jan 2003 07:20:07 -0800

Topics (messages 418 through 425):


mb_detect_encoding mb_convert_encoding and rss
        418 by: Tony Laszlo
        419 by: Moriyoshi Koizumi
        420 by: Tony Laszlo

Re: Problem with gettext
        421 by: Jan Schneider

Re: Content-Length: and multibyte
        422 by: Yasuo Ohgaki

problem with mbstring (search)functions
        423 by: Simon Dedeyne
        424 by: Moriyoshi Koizumi
        425 by: Simon Dedeyne

Administrivia:

To subscribe to the digest, e-mail:
        [EMAIL PROTECTED]

To unsubscribe from the digest, e-mail:
        [EMAIL PROTECTED]

To post to the list, e-mail:
        [EMAIL PROTECTED]


----------------------------------------------------------------------

--- Begin Message ---

In Xoops, modules/headlines/blocks/headlines.php,
right around:

    $synd = new RSStoHTML($headlinesurl, $cache_dir, $cache_file, $cache_time, $max_items);
    $block['content'] .= "<b>".$synd->getTitle()."</b><br />";
    $block['content'] .= $synd->getHtml();


(I think) 

mb_convert_encoding and mb_detect_encoding are 
being inserted more or less as follows: 


$str = mb_convert_encoding($str,"encoding1","encoding2");

and 

mb_detect_order("encoding1,encoding2");

$ary[] = "encoding1";
$ary[] = "encoding2";
mb_detect_order($ary);


I need to use lines like the above
to convert RSS feeds of _all_ possible
encodings to one encoding: UTF-8.

I am doing this in PostNuke, however,
and don't know where to insert this
code.
I suspect it would be /includes/pnAPI.php
and/or
/includes/blocks/rss.php

I have some notes and code related to 
this up at: 
http://www.issho.org/modules.php?op=modload&name=DB_phpBB2&file=viewtopic&p=56#56

Any ideas would be very much appreciated. 


Thanks. 

Tony Laszlo
http://www.issho.org/laszlo.html

--- End Message ---

--- Begin Message ---

> I need to use lines like the above
> to convert RSS feeds of _all_ possible
> encodings to one encoding: UTF-8.
> 

How about passing multiple encoding names to mb_convert_encoding()?

mb_convert_encoding($str, "UTF-8", array("ISO-8859-15", "EUC-JP", "GB2312"));


Moriyoshi

--- End Message ---
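
A rough sketch of how the suggestion above might be applied to a whole feed, assuming the raw RSS document is already in a string. The function name rss_to_utf8() and the candidate list are hypothetical and belong to neither Xoops nor PostNuke. Because RSS is XML, the sketch first trusts the encoding named in the XML declaration and only falls back to guessing with mb_detect_encoding() when none is declared. It assumes the declared name is one that mbstring recognizes; an unknown name makes mb_convert_encoding() fail.

```php
<?php
// Hypothetical helper: convert a raw RSS document to UTF-8.
// The default candidate list is an assumption; put likely encodings first.
function rss_to_utf8(string $xml, array $candidates = ['UTF-8', 'EUC-JP', 'SJIS', 'ISO-8859-15']): string
{
    // 1. Prefer the encoding the feed declares in its XML declaration.
    if (preg_match('/^\s*<\?xml[^>]*encoding\s*=\s*["\']([\w.-]+)["\']/i', $xml, $m)) {
        $from = $m[1];
    } else {
        // 2. Otherwise guess from the candidate list (strict mode).
        $from = mb_detect_encoding($xml, $candidates, true);
        if ($from === false) {
            // XML defaults to UTF-8 when nothing is declared.
            $from = 'UTF-8';
        }
    }

    return mb_convert_encoding($xml, 'UTF-8', $from);
}
```

The converted string still carries the original encoding name in its XML declaration, so a parser that honours the declaration may need that attribute rewritten as well.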

--- Begin Message ---

Thank you for your kind response. 

On Wed, 22 Jan 2003, Moriyoshi Koizumi wrote:

> How about passing multiple encoding names to mb_convert_encoding()?
> 
> mb_convert_encoding($str, "UTF-8", array("ISO-8859-15", "EUC-JP", "GB2312"));

* While I would gladly write out each and every encoding in
existence (as you can see from the top page here:
http://www.issho.org/ , pretty much every language, and
every encoding, out there needs to be supported), a way to
wildcard it would be preferred. :)
Is there no such way?

* The other part of the problem is perhaps the larger one:
the convoluted code in PostNuke. I know it is not popular
in Japan, but some people have had experience with it,
surely.

Where should I hook in when attempting to convert the
encoding of the incoming RSS feeds?

(I really _would_ use Xoops and the very fine fix 
that those developers have devised, _if only_ multiple 
language interfaces were a possibility... ) :)

Viva la PHP. 

-- 
Tony Laszlo
http://www.issho.org/LaszloBlog/

--- End Message ---

--- Begin Message ---

Eneko Lacunza wrote:

> Hi,
>
> On Sat, 11 Jan 2003 16:30:18 +0100, Jan Schneider wrote:
>
> > Marcos Lois Bermúdez wrote:
> >
> > > I have RH7.3 with PHP and gettext support, and I get this strange
> > > behaviour: if the .mo files are under the document root of the web
> > > server, it seems to work well, but if I put the .mo files in a
> > > directory outside the document root, it works only some of the
> > > time. I get the messages translated, but when I reload the page it
> > > sometimes comes back untranslated.
> > >
> > > Is it normal?
> > > Is it a bug?
> >
> > Yes. Yes.
> > Using gettext in PHP I have seen strange behaviour more often than I
> > can count. This particular behaviour can sometimes be fixed by
> > restarting the web server. No idea why.
>
> This was very helpful, as I was suffering from the same problem.
> Restarting the Apache server made it work well.
>
> Interestingly enough, some PHP pages worked every time, but others
> failed; using one locale they failed 3 times out of 4; with another
> locale, they failed 1 time out of 4. The failing ones were in one
> directory (parent) and the working one in another (subdirectory)
> (couldn't test with more, sorry).
>
> My tests were done on Red Hat 7.3 with the updated stock
> apache/php/gettext (4.1.2).
>
> Does anyone know if there's another way to "fix" this? I don't want to
> have the production server broken and I don't have administrative
> access to it (just ftp).

Not really, as gettext is somehow broken and shows this strange behaviour.

This behaviour appeared more often when we used the two-letter language
codes. As soon as we switched to the ll_CC form (de_DE, en_US), things
started to work much more smoothly.

Jan.

--- End Message ---
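
A minimal sketch of the ll_CC advice above, assuming the application stores two-letter language codes and wants to hand gettext a full locale name. The helper names full_locale() and use_locale() and the default mapping table are hypothetical. Whether a given locale is actually installed on the server is a separate matter that this sketch does not check.

```php
<?php
// Hypothetical mapping from two-letter codes to full ll_CC locale names.
function full_locale(string $lang, array $defaults = ['de' => 'de_DE', 'en' => 'en_US', 'es' => 'es_ES', 'ja' => 'ja_JP']): string
{
    // Already in ll_CC form: pass it through unchanged.
    if (preg_match('/^[a-z]{2}_[A-Z]{2}$/', $lang)) {
        return $lang;
    }
    // Otherwise look up the language part; fall back to en_US (an assumption).
    $key = strtolower(substr($lang, 0, 2));
    return $defaults[$key] ?? 'en_US';
}

// Hypothetical wrapper: select the locale and bind a gettext domain.
// Returns false when the gettext extension is not loaded.
function use_locale(string $lang, string $domain, string $dir): bool
{
    if (!function_exists('bindtextdomain')) {
        return false;
    }
    $locale = full_locale($lang);
    putenv("LC_ALL=$locale");
    setlocale(LC_ALL, $locale);
    bindtextdomain($domain, $dir);
    textdomain($domain);
    return true;
}
```

A call such as use_locale('de', 'messages', '/path/to/locale') would then select de_DE rather than plain de; the domain name and directory here are placeholders.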

--- Begin Message ---
Jan Schneider wrote:
> OK, this is not really PHP related but I thought I might ask anyway:
> If sending the Content-Length: HTTP header to the browser for a page
> that's encoded in a multibyte charset, do I use the binary length or
> the character length?

I suppose you know the various issues related to output buffering...

Use the binary length (byte length).


--
Yasuo Ohgaki
--- End Message ---
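
To illustrate the difference between the two lengths, here is a small sketch assuming the page body is a UTF-8 string held in a variable before it is sent. The sample body is made up. strlen() counts bytes as long as the mbstring function overloading of older PHP versions is not enabled; mb_strlen($body, '8bit') is an alternative that counts bytes regardless.

```php
<?php
// Sample body (an assumption): two kanji inside a paragraph, UTF-8 encoded.
$body = "<p>愛犬</p>";

$bytes = strlen($body);             // byte length: 13
$chars = mb_strlen($body, 'UTF-8'); // character length: 9

// Content-Length must carry the byte length, not the character length.
if (!headers_sent()) {
    header('Content-Type: text/html; charset=UTF-8');
    header('Content-Length: ' . $bytes);
}
echo $body;
```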

--- Begin Message ---

When trying to search in a multibyte string, I sometimes(!) get the following error:
 
"
Warning: mb_ereg_search()[function.mb-ereg-search]:mbregex compile err: premature end 
of regular expression in c:\myfile.php on line 12.
"
 
This happens only for some strings. Why is this? I assume it must be something related
to the word itself interacting with the mb functions.
I'm using PHP 4.3.0. I added an example below where it fails. I noticed that if I
add a space to the $word string it works, but I'm sure there's nothing wrong with
$word.
 
 
Thanks a lot!!!!!
Simon
 
 
 
Maybe this little file helps to show what I'm talking about. Copy & paste away:
 
<html>
<head>
<meta http-equiv="Content-Type" content="Text/Html; Charset=UTF-8">
<title>coding-problem</title>
</head>
<body>
<?php
$examplesnt="ã‚ãŸã—ã‚";
$word="愛犬";
//**************************FUNCTIONS*******************************************
function find_word($word,$sentence)
{
    mb_ereg_search_init($sentence);
    $result = mb_ereg_search($word);
    if ($result == 1) {
        echo "$sentence contains $word<br>";
    }
    return $result;
}
 
//************************MAIN**************************************************
/* Set internal character encoding to UTF-8 */
mb_internal_encoding("UTF-8");
/* Display current internal character encoding */
echo mb_internal_encoding()."<br>"; 
echo $examplesnt." is the example sentence<br>";
echo $word." is the word we're looking for in the sentence<br>"; // I know the word is not in it, but that doesn't matter
find_word($word,$examplesnt);
?>
</body>
</html>

--- End Message ---

--- Begin Message ---

Hi,

> Warning: mb_ereg_search()[function.mb-ereg-search]:mbregex compile
> err: premature end of regular expression in c:\myfile.php on line 12.

Did you try mb_ereg() yet? If you see the same error message with it,
it's possibly a bug.

http://www.php.net/mb_ereg

And you can use preg_* functions as well if your scripts use UTF-8 only.

Regards,
Moriyoshi

--- End Message ---
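
A sketch of the preg_* route mentioned above, assuming both the word and the sentence are UTF-8. This find_word() is a hypothetical rewrite of the function in the original post, and the sample strings are made up. preg_quote() makes the word match literally, and the "u" modifier tells PCRE to treat pattern and subject as UTF-8.

```php
<?php
// Hypothetical UTF-8-only replacement for the mb_ereg_search() version.
function find_word(string $word, string $sentence): bool
{
    // Escape regex metacharacters (and the "/" delimiter) in the word.
    $pattern = '/' . preg_quote($word, '/') . '/u';
    // preg_match() returns 1 on a match, 0 on no match, false on error.
    return preg_match($pattern, $sentence) === 1;
}

var_dump(find_word('愛犬', '私の愛犬です')); // bool(true)
var_dump(find_word('愛犬', 'わたしは'));     // bool(false)
```

For a plain substring test with no regular expression involved, mb_strpos($sentence, $word, 0, 'UTF-8') !== false is another option.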

--- Begin Message ---

Hi, 

Yes, I tried mb_ereg, and I've got the same problem, so it must be a
bug.
However, I followed your suggestion and used the regular ereg function, and
that works. I suppose it would be really nice to see some kind of
dedicated Japanese language and PHP tutorial around...

Well, wishful thinking!

Thanks for the reply.
Simon


--- End Message ---
