Re: [nyphp-talk] SimpleXML - UTF8

2009-10-19 Thread John Campbell
On Mon, Oct 19, 2009 at 7:32 AM, Dan Cech wrote:
> Try:
>
> $text = @iconv('UTF-8','UTF-8//TRANSLIT',$text);

Thanks Dan, I knew there had to be something simple. It looks like mb_convert_encoding($txt,'UTF-8','UTF-8') will work similarly, but just deletes the offending bytes.

Regards, John Cam
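For anyone reading the archive later, here is a minimal sketch of the mb_convert_encoding() approach mentioned in this message, under a few assumptions: the helper name strip_invalid_utf8() and the sample payload are made up for illustration, and the mbstring and SimpleXML extensions are assumed to be loaded. Note that mbstring substitutes "?" for ill-formed bytes by default, so the sketch sets the substitute character to 'none' to get the "just deletes the offending bytes" behaviour.

```php
<?php
// Sketch only: strip_invalid_utf8() is a hypothetical helper name, not from the thread.
// Requires the mbstring and SimpleXML extensions.

function strip_invalid_utf8(string $text): string
{
    $previous = mb_substitute_character();  // remember the current substitute setting
    mb_substitute_character('none');        // drop ill-formed bytes instead of writing "?"
    $clean = mb_convert_encoding($text, 'UTF-8', 'UTF-8');
    mb_substitute_character($previous);     // restore the caller's setting
    return $clean;
}

// Made-up response body: a valid "café" followed by a stray 0xFF byte.
$raw = "<r><name>caf\xC3\xA9 \xFF</name></r>";

$clean = strip_invalid_utf8($raw);
$xml   = simplexml_load_string($clean);

var_dump(mb_check_encoding($raw, 'UTF-8'));    // validity of the raw payload
var_dump(mb_check_encoding($clean, 'UTF-8'));  // validity after scrubbing
var_dump($xml !== false);                      // whether SimpleXML accepted it
```

Saving and restoring the substitute character keeps the change local to the helper, since mb_substitute_character() otherwise affects every later mbstring conversion in the request.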

Re: [nyphp-talk] SimpleXML - UTF8

2009-10-19 Thread Dan Cech
John Campbell wrote:
> I am using a remote XML service, that about 1 in 100 times returns XML
> with invalid UTF-8 bytes. I don't have any control over the remote
> service, but simpleXML pukes when I pass malformed UTF-8 to it. Does
> anyone know of a simple way to cleanup bad UTF-8 bytes, e.g.

Re: [nyphp-talk] SimpleXML - UTF8

2009-10-17 Thread Fernando Gabrieli
Hi, I believe php.net/mb_convert_encoding will use "?" or just avoid printing a character when it can't convert. There's also php.net/iconv

On Sat, Oct 17, 2009 at 5:59 PM, John Campbell wrote:
> I hope there is an easy answer to this:
>
> I am using a remote XML service, that about 1 in 100 tim

Re: [nyphp-talk] SimpleXML - UTF8

2009-10-17 Thread Adrian Noland
I have this handy function I pulled from somewhere else. Does it help? Apologies if the actual characters don't come across in the email.

/**
 * This function was created to scrub additional html entities that are not in the PHP get_html_translation_table
 * Currently bug #34577 in th

Re: [nyphp-talk] SimpleXML - UTF8

2009-10-17 Thread Donald J. Organ IV
A simple str_replace can replace the invalid characters, but you have to know what they are first.

- Original Message -
From: "John Campbell"
To: "NYPHP Talk"
Sent: Saturday, October 17, 2009 4:59:53 PM
Subject: [nyphp-talk] SimpleXML - UTF8

I hope there is an eas
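On the "you have to know what they are first" point, one rough way to find the offending bytes is a byte-wise regular expression that consumes well-formed UTF-8 sequences and captures anything else. This is only a sketch: find_invalid_utf8_bytes() is a hypothetical name, the sample string is invented, and the pattern deliberately runs without the /u modifier so PCRE sees raw bytes.

```php
<?php
// Sketch only: find_invalid_utf8_bytes() is a hypothetical helper, not from the thread.
// Returns an array mapping byte offset => hex value for every byte that is not
// part of a well-formed UTF-8 sequence.

function find_invalid_utf8_bytes(string $text): array
{
    // Each alternative matches one well-formed UTF-8 sequence; the final (.)
    // catches any single byte that did not fit, and is the only capture group.
    $pattern = '/
        [\x00-\x7F]
      | [\xC2-\xDF][\x80-\xBF]
      | \xE0[\xA0-\xBF][\x80-\xBF]
      | [\xE1-\xEC\xEE\xEF][\x80-\xBF]{2}
      | \xED[\x80-\x9F][\x80-\xBF]
      | \xF0[\x90-\xBF][\x80-\xBF]{2}
      | [\xF1-\xF3][\x80-\xBF]{3}
      | \xF4[\x80-\x8F][\x80-\xBF]{2}
      | (.)
    /xs';

    $bad = [];
    preg_match_all($pattern, $text, $matches, PREG_SET_ORDER | PREG_OFFSET_CAPTURE);
    foreach ($matches as $m) {
        // Group 1 is present with a real offset only when the fallback branch matched.
        if (isset($m[1]) && $m[1][1] !== -1) {
            $bad[$m[1][1]] = bin2hex($m[1][0]);
        }
    }
    return $bad;
}

// Made-up sample: a stray 0xFF at offset 3 and a truncated 2-byte sequence at the end.
$sample = "ok \xFF caf\xC3\xA9 \xC3";
print_r(find_invalid_utf8_bytes($sample));
```

With the offsets in hand, the bytes can be logged for the remote service's maintainers or removed selectively, rather than guessing at a str_replace list.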

[nyphp-talk] SimpleXML - UTF8

2009-10-17 Thread John Campbell
I hope there is an easy answer to this: I am using a remote XML service that, about 1 in 100 times, returns XML with invalid UTF-8 bytes. I don't have any control over the remote service, but SimpleXML pukes when I pass malformed UTF-8 to it. Does anyone know of a simple way to clean up bad UTF-8