I've noticed some of the results contain funny characters which seem
to be related to encoding.

For instance, take this search from the Google search page:

http://www.google.com/search?sourceid=chrome&ie=UTF-8&q=ajaxian+javascript+multi

That search returns a few results containing those funny
right-bracket characters (chevrons, I think). They appear fine on the
Google site, e.g. the first result:

Ajaxian » Multi-threaded JavaScript?

But if I run the same search through the Search API and inspect the
title field, I can see it's been URL-encoded as follows:

Ajaxian%20%C2%BB%20Multi-threaded%20JavaScript%3F

If I decode that using JavaScript's decodeURIComponent, it appears
like this:

Ajaxian Â» Multi-threaded JavaScript?

That is, with a funny "A" character (Â) appearing before the chevron.
If I decode it server-side using the UTF-8 character set, it appears
the same, i.e.:

Ajaxian Â» Multi-threaded JavaScript?
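
As an illustration only, here is a minimal Python sketch of the two
decodings (not the code actually used above; the variable names are
hypothetical). The same bytes give a clean chevron when read as UTF-8
and the stray "A" when read as ISO-8859-1, which may hint that the
extra character creeps in wherever the decoded bytes get interpreted
or displayed, though that is only a guess:

```python
from urllib.parse import unquote_to_bytes

# Title exactly as it appears in the API's title field (percent-encoded).
encoded_title = "Ajaxian%20%C2%BB%20Multi-threaded%20JavaScript%3F"

# Percent-decoding yields raw bytes; %C2%BB becomes the bytes 0xC2 0xBB.
raw = unquote_to_bytes(encoded_title)

# Read as UTF-8, the two bytes 0xC2 0xBB form one character: the chevron.
as_utf8 = raw.decode("utf-8")

# Read as ISO-8859-1, every byte is its own character, so 0xC2 shows up
# as an accented "A" in front of the chevron.
as_latin1 = raw.decode("iso-8859-1")

print(as_utf8)
print(as_latin1)
```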

Clearly, in this case, the %C2 character is that funny "A"; it's
listed on this page:

http://www.w3schools.com/TAGS/ref_urlencode.asp

I haven't tried any other character sets, mainly because I searched
the API reference for a field indicating which character set an
individual result is encoded in and couldn't see one. Maybe I've
missed something?

I can happily use iconv to decode if I know which character set to
use. Am I wrong to assume everything returned by the Search API is in
UTF-8? Should I instead assume everything returned by the Google
Search API is in a different character set (e.g. ISO-8859-1)?
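
If there really is no per-result character set, one possible approach
is to try strict UTF-8 first and only fall back when the bytes are not
valid UTF-8. A minimal Python sketch, assuming ISO-8859-1 as the
fallback; decode_title is a hypothetical helper, not part of the
Search API:

```python
from urllib.parse import unquote_to_bytes


def decode_title(encoded: str, fallback: str = "iso-8859-1") -> str:
    """Percent-decode a title, preferring UTF-8 over a fallback charset."""
    raw = unquote_to_bytes(encoded)
    try:
        # Strict decoding raises if the bytes are not well-formed UTF-8.
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Assumption: anything that is not valid UTF-8 is in the fallback.
        return raw.decode(fallback)


print(decode_title("Ajaxian%20%C2%BB%20Multi-threaded%20JavaScript%3F"))
```

The fallback is only a heuristic: text in another character set can
occasionally happen to be valid UTF-8 as well.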

May I also apologise in advance for any naivete on my part?

I'd really appreciate some advice on how I should deal with this
issue!

Regards
Tristen
