Re: [PHP] PHP 6: Mysql with iso-8859-1 chars outputting utf-8: Could not convert binary string to Unicode string

2007-05-02 Thread Richard Lynch
This is a question from a guy who does NOT really do Unicode very well...

If everything else in your entire system is iso-8859-1 (aka Latin1)
why are you making your output be utf-8?

Seems to me that that is where the conversion is probably taking place...

On Sun, April 29, 2007 11:07 am, Rangel Reale wrote:
> Hello!
>
> I have a MySQL database where all tables are in the latin1 character
> set, with accented (Portuguese) characters.
>
> In my php.ini I have
>
> ; Unicode settings ;
>
> unicode.semantics = on
> unicode.runtime_encoding = iso-8859-1
> unicode.script_encoding = iso-8859-1
> unicode.output_encoding = utf-8
> unicode.from_error_mode = U_INVALID_SUBSTITUTE
> unicode.from_error_subst_char = 3f
> unicode.fallback_encoding = iso-8859-1
>
> because all my files and data in mysql server are in iso-8859-1.
>
> When connecting to mysql I issue:
>
>   mysql_query('set names latin1', $this->mysql_link);
>
> but when I query any record that has accented characters I get
> this warning (using mysql_fetch_assoc):
>
> --
> Could not convert binary string to Unicode string (converter UTF-8
> failed on bytes (0xE7) at offset 9)
> --
>
> for all accented characters in all fields.
>
> If I change the set names query to:
>
>   mysql_query('set names utf8', $this->mysql_link);
>
> it works, but I would like to keep compatibility with PHP 5, and
> my application requires set names to be latin1. Also, my databases
> are not created with the utf8 option.
>
> As I understood PHP 6's unicode support, all string characters
> (including mysql result values) are converted from
> unicode.runtime_encoding to unicode (utf-16), but it looks like it is
> trying to convert from ASCII, which does not have all the accented
> characters. Am I assuming right? How can I make mysql_fetch_assoc
> assume field values are in iso-8859-1 instead of ASCII?
>
> Thanks,
> Rangel Reale



-- 
Some people have a gift link here.
Know what I want?
I want you to buy a CD from some indie artist.
http://cdbaby.com/browse/from/lynch
Yeah, I get a buck. So?

-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP] PHP 6: Mysql with iso-8859-1 chars outputting utf-8: Could not convert binary string to Unicode string

2007-05-02 Thread Rangel Reale

You are right, I really don't know Unicode very much! :P

What I was trying to understand was this: because unicode.runtime_encoding =
iso-8859-1, I thought that all internal operations were done in this
encoding, and that data would be converted to utf-8 only on output
(unicode.output_encoding = utf-8). So, as I saw it, I run a mysql query with
latin1, the data arrives in my variables as iso-8859-1, I use them, and only
when I echo them do they become utf-8, through an iso-8859-1-to-utf-8-like
function.

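The explicit "iso-8859-1-to-utf-8-like" step described above can be sketched in present-day PHP. This is not a PHP 6 API and it does not change what mysql_fetch_assoc does internally; it only illustrates the conversion. It assumes the mbstring extension is loaded, and the helper name row_latin1_to_utf8 and the sample row are made up for illustration.

```php
<?php
/**
 * Hypothetical helper (not part of PHP 6 or the mysql extension): convert
 * every string value of a fetched row from ISO-8859-1 to UTF-8.
 */
function row_latin1_to_utf8(array $row): array
{
    return array_map(
        function ($value) {
            // Leave NULLs and non-string values untouched.
            return is_string($value)
                ? mb_convert_encoding($value, 'UTF-8', 'ISO-8859-1')
                : $value;
        },
        $row
    );
}

// Made-up row standing in for a fetch result under "set names latin1".
// "Jo\xE3o" is "João" as raw ISO-8859-1 bytes.
$row = ['id' => 7, 'nome' => "Jo\xE3o", 'obs' => null];
$converted = row_latin1_to_utf8($row);

echo bin2hex($converted['nome']), "\n"; // 4a6fc3a36f
```

With a single input array, array_map keeps the string keys, so the converted row can be used exactly like the original one.
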
The strange thing to me is that the mysql_fetch_assoc function gives this
error even before I access the field values, which I would not expect given
my understanding above.

Did I understand it wrong?

Thanks,
Rangel




Re: [PHP] PHP 6: Mysql with iso-8859-1 chars outputting utf-8: Could not convert binary string to Unicode string

2007-05-02 Thread Richard Lynch
On Wed, May 2, 2007 5:32 pm, Rangel Reale wrote:
> You are right, I really don't know Unicode very much! :P
>
> What I was trying to understand was, because unicode.runtime_encoding =
> iso-8859-1, I thought that all internal operations were done in this
> encoding, and only when outputting (unicode.output_encoding = utf-8) data
> would be converted to utf-8. So to me, I did a mysql query with latin1, data
> comes to my variables as iso-8859-1, I use them, and only when I echo'ed
> them, they would become utf-8, from an iso-8859-1-to-utf-8-like function.
>
> The strange thing to me is that the mysql_fetch_assoc function gives this
> error even before I accessed the field values, as I understood from the
> above explanation.
>
> Did I understand it wrong?

No, you're probably right...

I think that PHP may be biased toward UTF-16 (32?) internally, and
be converting to/from UTF-16 and keeping everything in UTF-16
internally...

But, really, since PHP 6 is not actually released yet, you probably
need to be asking about this on Internals, I think, to get a real
answer...

I suspect you'll still want to use iso-8859-1 on the output after you
solve this bug, though, if you don't want it converted to utf-8 in the
end.

You may also want to just try it with unicode semantics OFF, since
that's probably the least debugged and biggest change in PHP 6, and if
you do *everything* in Latin1, that's pretty much the state PHP was in
since nineteen-ninety-mumble, and it will just work anyway...
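
The two suggestions above could be written into php.ini roughly as follows. This is only a sketch: the directive names are the ones quoted earlier in this thread, the values are the ones suggested here, and none of it has been checked against a PHP 6 snapshot.

```ini
; Option 1 (assumption: the directive accepts "off"): disable Unicode
; semantics so strings stay plain Latin1 bytes, as in PHP 5.
unicode.semantics = off

; Option 2: keep Unicode semantics on, but emit Latin1 instead of utf-8.
; unicode.semantics = on
; unicode.output_encoding = iso-8859-1
```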




[PHP] PHP 6: Mysql with iso-8859-1 chars outputting utf-8: Could not convert binary string to Unicode string

2007-04-29 Thread Rangel Reale
Hello!

I have a MySQL database where all tables are in the latin1 character set, with 
accented (Portuguese) characters.

In my php.ini I have


; Unicode settings ;


unicode.semantics = on
unicode.runtime_encoding = iso-8859-1
unicode.script_encoding = iso-8859-1
unicode.output_encoding = utf-8
unicode.from_error_mode = U_INVALID_SUBSTITUTE
unicode.from_error_subst_char = 3f
unicode.fallback_encoding = iso-8859-1

because all my files and data in mysql server are in iso-8859-1.


When connecting to mysql I issue:

  mysql_query('set names latin1', $this->mysql_link);

but when I query any record that has accented characters I get this
warning (using mysql_fetch_assoc):

--
Could not convert binary string to Unicode string (converter UTF-8 failed on 
bytes (0xE7) at offset 9)
--

for all accented characters in all fields.
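
For context, byte 0xE7 is "ç" in ISO-8859-1, but it is not valid UTF-8 unless it is followed by two continuation bytes, which fits the converter complaint above. A minimal sketch in present-day PHP (not PHP 6), assuming the mbstring extension is loaded; the sample string is made up and is not taken from the database in question:

```php
<?php
// Hypothetical sample value: "coração" as raw ISO-8859-1 bytes
// (ç = 0xE7, ã = 0xE3).
$latin1 = "cora\xE7\xE3o";

// Read as UTF-8 the bytes are malformed: 0xE7 opens a three-byte
// sequence, but the bytes that follow are not continuation bytes.
$validAsUtf8 = mb_check_encoding($latin1, 'UTF-8');

// Converted explicitly from ISO-8859-1, every byte maps to a code point.
$utf8 = mb_convert_encoding($latin1, 'UTF-8', 'ISO-8859-1');

var_dump($validAsUtf8);     // bool(false)
echo bin2hex($utf8), "\n";  // 636f7261c3a7c3a36f
```

The same bytes fail or succeed depending only on which source encoding the converter is told to assume, which is why switching the connection to utf8 makes the warning disappear.
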


If I changed the set names query to:

  mysql_query('set names utf8', $this->mysql_link);

it works, but I would like to keep compatibility with PHP 5, and for my 
application it requires set names to be latin1. Also, my databases are not 
created with the utf8 option.

As I understood PHP 6's unicode support, all string characters (including mysql
result values) are converted from unicode.runtime_encoding to unicode (utf-16),
but it looks like it is trying to convert from ASCII, which does not have all
the accented characters. Am I assuming right? How can I make mysql_fetch_assoc
assume field values are in iso-8859-1 instead of ASCII?

Thanks,
Rangel Reale