Re: [PHP] utf-8-safe replacement for strtr()?
On 3/26/09 11:36 AM, Nisse Engström news.nospam.0ixbt...@luden.se wrote:

> On Wed, 25 Mar 2009 11:32:42 +0100, Nisse Engström wrote:
>
>> On Tue, 24 Mar 2009 08:15:35 -0400, Tom Worster wrote:
>>
>>> strtr() with three parameters is certainly unsafe. but my tests are showing that it may be ok with two parameters if the strings in the second parameter are well formed utf-8. does anyone know more? can confirm or contradict?
>>
>> The two-argument version of strtr() should work fine since there are no collisions in utf-8 such that part of one character matches part of a different character.
>
> Oops. I meant to write that one complete character does not match any part of any other character. If a string of one or more utf-8 characters match a utf-8 text, it matches exactly those characters in the text. If that makes sense...

yes. my conclusion is that 2-param strtr is safe if the subject text and parameter strings are valid utf-8.

--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php
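The conclusion above depends on a precondition: the subject and every string in the pair array must be valid UTF-8. A minimal sketch of enforcing that before calling strtr() follows. The helper name `utf8_strtr_pairs` is hypothetical (it does not appear in the thread), the type declarations assume PHP 7 or later, and mbstring is assumed to be available, as the original question allows.

```php
<?php
// Hypothetical helper, not from the thread: call two-argument strtr() only
// after checking that the subject and every pair string is valid UTF-8.
function utf8_strtr_pairs(string $subject, array $pairs): string
{
    if (!mb_check_encoding($subject, 'UTF-8')) {
        throw new InvalidArgumentException('subject is not valid UTF-8');
    }
    foreach ($pairs as $from => $to) {
        // PHP casts numeric-looking array keys to int, so cast back to string.
        if (!mb_check_encoding((string) $from, 'UTF-8')
            || !mb_check_encoding((string) $to, 'UTF-8')) {
            throw new InvalidArgumentException('replacement pair is not valid UTF-8');
        }
    }
    return strtr($subject, $pairs);
}

// "\xC3\xA9" is U+00E9 (e acute), "\xC3\xBC" is U+00FC (u diaeresis).
$out = utf8_strtr_pairs(
    "caf\xC3\xA9 m\xC3\xBCsli",
    ["\xC3\xA9" => 'e', "\xC3\xBC" => 'ue']
);
echo $out, "\n"; // cafe muesli
```

Throwing on invalid input is a design choice made for this sketch; code that has to tolerate dirty data might clean the input first instead.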
Re: [PHP] utf-8-safe replacement for strtr()?
On Wed, 25 Mar 2009 11:32:42 +0100, Nisse Engström wrote:

> On Tue, 24 Mar 2009 08:15:35 -0400, Tom Worster wrote:
>
>> strtr() with three parameters is certainly unsafe. but my tests are showing that it may be ok with two parameters if the strings in the second parameter are well formed utf-8. does anyone know more? can confirm or contradict?
>
> The two-argument version of strtr() should work fine since there are no collisions in utf-8 such that part of one character matches part of a different character.

Oops. I meant to write that one complete character does not match any part of any other character. If a string of one or more utf-8 characters match a utf-8 text, it matches exactly those characters in the text. If that makes sense...

/Nisse
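The property described above follows from how UTF-8 is laid out: lead bytes and continuation bytes (0x80 to 0xBF) come from disjoint ranges, so a well-formed needle can only match at a character boundary in a well-formed subject. The following is a small illustration rather than a proof. The byte strings were chosen for this example and the helper `is_continuation_byte` is hypothetical.

```php
<?php
// Hypothetical helper: true for UTF-8 continuation bytes (bit pattern 10xxxxxx).
function is_continuation_byte(string $byte): bool
{
    return (ord($byte) & 0xC0) === 0x80;
}

// Needle: U+00E9 (e acute), encoded as C3 A9.
$needle = "\xC3\xA9";
// Subject: U+00C3 (C3 83) followed by U+00A9 (C2 A9). Both bytes of the
// needle occur in the subject, but never next to each other.
$subject = "\xC3\x83\xC2\xA9";

var_dump(is_continuation_byte($needle[0]));               // bool(false), a lead byte
var_dump(is_continuation_byte($needle[1]));               // bool(true)
var_dump(strpos($subject, $needle));                      // bool(false)
var_dump(strtr($subject, [$needle => 'e']) === $subject); // bool(true)
```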
Re: [PHP] utf-8-safe replacement for strtr()?
On Tue, 24 Mar 2009 08:15:35 -0400, Tom Worster wrote:

> On 3/23/09 2:02 PM, Tom Worster f...@thefsb.org wrote:
>
>> i have a general replacement or workaround for every php function in my code that i know to be utf-8-unsafe. except one: strtr().
>
> strtr() with three parameters is certainly unsafe. but my tests are showing that it may be ok with two parameters if the strings in the second parameter are well formed utf-8. does anyone know more? can confirm or contradict?

The two-argument version of strtr() should work fine since there are no collisions in utf-8 such that part of one character matches part of a different character.

The question is whether the function is binary safe. The manual page doesn't say as far as I can tell. Google came up with the following:

strtr() made binary safe in PHP3: http://marc.info/?l=php-general&m=92740681805351&w=4

Two-argument version added in PHP4: http://php.net/ChangeLog-4.php

/Nisse
Re: [PHP] utf-8-safe replacement for strtr()?
thanks for the info. i'll leave 2-param uses of strtr in my code alone. i have a replacement for the 3-param version.

btw: i have quite a long checklist of stuff to do when upgrading code for utf-8, including notes on about 100 functions. do you think it would be worth putting it on a wiki somewhere?

On 3/25/09 6:32 AM, Nisse Engström news.nospam.0ixbt...@luden.se wrote:

> On Tue, 24 Mar 2009 08:15:35 -0400, Tom Worster wrote:
>
>> On 3/23/09 2:02 PM, Tom Worster f...@thefsb.org wrote:
>>
>>> i have a general replacement or workaround for every php function in my code that i know to be utf-8-unsafe. except one: strtr().
>>
>> strtr() with three parameters is certainly unsafe. but my tests are showing that it may be ok with two parameters if the strings in the second parameter are well formed utf-8. does anyone know more? can confirm or contradict?
>
> The two-argument version of strtr() should work fine since there are no collisions in utf-8 such that part of one character matches part of a different character.
>
> The question is whether the function is binary safe. The manual page doesn't say as far as I can tell. Google came up with the following:
>
> strtr() made binary safe in PHP3: http://marc.info/?l=php-general&m=92740681805351&w=4
>
> Two-argument version added in PHP4: http://php.net/ChangeLog-4.php
>
> /Nisse
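The reply above mentions a replacement for the three-parameter form but does not show it. One possible approach, sketched here as an assumption and not as Tom Worster's actual code, is to split both character lists into code points and hand the resulting pairs to the two-argument form. The function name `utf8_strtr_chars` is hypothetical, and the type declarations assume PHP 7 or later.

```php
<?php
// Hypothetical replacement for three-argument strtr(); an illustrative
// sketch, not the implementation referred to in the thread.
function utf8_strtr_chars(string $subject, string $from, string $to): string
{
    // The /u modifier makes PCRE treat the input as UTF-8, so an empty
    // pattern splits between code points. Invalid UTF-8 yields false.
    $fromChars = preg_split('//u', $from, -1, PREG_SPLIT_NO_EMPTY);
    $toChars   = preg_split('//u', $to, -1, PREG_SPLIT_NO_EMPTY);
    if ($fromChars === false || $toChars === false) {
        throw new InvalidArgumentException('from/to must be valid UTF-8');
    }

    // Like native strtr(), ignore the extra characters of the longer list,
    // but count characters here, not bytes.
    $n = min(count($fromChars), count($toChars));
    if ($n === 0) {
        return $subject;
    }

    $pairs = array_combine(
        array_slice($fromChars, 0, $n),
        array_slice($toChars, 0, $n)
    );

    // Two-argument strtr() replaces whole byte sequences, never single bytes.
    return strtr($subject, $pairs);
}

// U+00EF (i diaeresis) => "i", U+00E9 (e acute) => "e".
echo utf8_strtr_chars("na\xC3\xAFve caf\xC3\xA9", "\xC3\xAF\xC3\xA9", 'ie'), "\n"; // naive cafe
```

The sketch validates only `$from` and `$to`; whether the subject is also well formed is left to the caller.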
Re: [PHP] utf-8-safe replacement for strtr()?
On 3/23/09 2:02 PM, Tom Worster f...@thefsb.org wrote:

> i have a general replacement or workaround for every php function in my code that i know to be utf-8-unsafe. except one: strtr().

strtr() with three parameters is certainly unsafe. but my tests are showing that it may be ok with two parameters if the strings in the second parameter are well formed utf-8. does anyone know more? can confirm or contradict?

> the only ideas i have to implement strtr in php with known utf-8-safe php functions would be rather inefficient. my replacement for the 3-param strtr is an order of magnitude less unlovely to behold than my replacement for the 2-param version.
>
> any ideas, suggestions, pointers? or maybe you're better at googling an answer than i am. assume mbstring is available.
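To make the "certainly unsafe" claim concrete: three-argument strtr() maps single bytes, not characters, and it ignores the extra bytes of the longer argument. The inputs below were picked for this illustration and do not come from the thread. They show a character that is not even in the "from" list being corrupted.

```php
<?php
// from: U+00E4 U+00F6 (a and o with diaeresis) = C3 A4 C3 B6, four bytes.
// to:   "ao", two bytes.
// strtr() ignores the last two bytes of from, leaving the byte map
// C3 => 'a' and A4 => 'o'.
$subject = "\xC3\xBC"; // U+00FC (u with diaeresis), which is not in the from list

$result = strtr($subject, "\xC3\xA4\xC3\xB6", 'ao');

echo bin2hex($result), "\n";                   // 61bc
var_dump(mb_check_encoding($result, 'UTF-8')); // bool(false)
```

The lead byte C3 is rewritten to "a" and the continuation byte BC is left stranded, so the result is no longer valid UTF-8.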