On 9 Nov 2009, at 17:00, RPB wrote:

> 
> Hello,
> 
> I am retrieving XML data from the Amazon UK API which returns XML
> including a £ (GBP) sign. I found that XMLParser.parse(xmlText) will
> throw an exception (com.google.gwt.xml.client.impl.DOMParseException:
> Failed to parse ) unless I remove the £ signs from the XML.

The £ sign is not part of the 7-bit US-ASCII character set, so the bytes used 
to represent it depend on the character encoding in use. That makes getting 
the encoding right critical if you don't want corrupted data.

If your data was encoded in ISO 8859-1 (Latin 1) but you were treating it as 
though it were encoded in UTF-8, or some similar mismatched pair, you'd see 
problems of this kind. In fact, be thankful that an exception was thrown: in 
some cases, you'd just get silent data corruption!
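
To make the mismatch concrete, here is a minimal sketch in plain Java (not 
GWT client code). The class and method names are made up for illustration. It 
encodes a price containing a £ sign with one charset, decodes the same bytes 
with another, and prints the result with non-ASCII characters escaped so that 
the console's own encoding doesn't confuse matters.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingMismatchDemo {
    // Encode text with one charset, then decode the same bytes with another.
    static String writeThenRead(String text, Charset written, Charset read) {
        byte[] bytes = text.getBytes(written);
        return new String(bytes, read);
    }

    // Render non-ASCII characters as \\uXXXX escapes for unambiguous output.
    static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            if (c < 128) {
                sb.append(c);
            } else {
                sb.append(String.format("\\u%04X", (int) c));
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String price = "\u00A39.99"; // pound sign followed by 9.99

        // Latin 1 bytes read as UTF-8: the lone byte 0xA3 is malformed UTF-8
        // and is silently replaced with U+FFFD.
        System.out.println(escape(writeThenRead(
                price, StandardCharsets.ISO_8859_1, StandardCharsets.UTF_8)));

        // UTF-8 bytes read as Latin 1: the two bytes 0xC2 0xA3 become two
        // separate characters, so a stray U+00C2 appears before the pound sign.
        System.out.println(escape(writeThenRead(
                price, StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1)));
    }
}
```

Neither direction throws here; both are examples of the silent corruption 
mentioned above.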

> 
> I am hoping someone can explain why this happens? It doesn't seem to
> make sense to me to have to pre-process the XML by removing the £
> signs or adding CDATA sections - please let me know if there is a
> better way.

Take steps to preserve character encoding information at each stage of the 
chain, or else find a single encoding that works through all of them. UTF-8 is 
becoming a de facto standard, but not all systems support it yet...
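
One way to preserve the encoding information, sketched under assumptions: if 
some server-side Java code fetches the response bytes and knows the charset 
the sender declared (for example from the HTTP Content-Type header or the XML 
declaration), it can decode strictly with that charset and fail loudly on a 
mismatch rather than pass corrupted text down the chain. The class and method 
names below are hypothetical, and this is plain Java rather than GWT client 
code.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;

public class StrictDecode {
    // Decode bytes using the charset the sender declared. Malformed or
    // unmappable input raises an error instead of being silently replaced.
    static String decode(byte[] body, String declaredCharset) {
        try {
            return Charset.forName(declaredCharset)
                    .newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(body))
                    .toString();
        } catch (CharacterCodingException e) {
            throw new IllegalStateException(
                    "Body is not valid " + declaredCharset, e);
        }
    }

    public static void main(String[] args) {
        byte[] latin1Pound = { (byte) 0xA3 }; // pound sign in ISO 8859-1

        // Correct declared charset: decodes to the single character U+00A3.
        String ok = decode(latin1Pound, "ISO-8859-1");
        System.out.println(ok.length() == 1 && ok.charAt(0) == '\u00A3');

        // Wrong declared charset: the mismatch is reported, not hidden.
        try {
            decode(latin1Pound, "UTF-8");
        } catch (IllegalStateException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```

Once the text is a correctly decoded Java String, it can be re-encoded as 
UTF-8 for every later stage, which is the "single encoding" option.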

> 
> Thanks!
> 

-- 
Bill Michell
billmich...@gmail.com




