Hi people,
I can imagine most of you haven't noticed, but Google have actually changed
their default encoding to UTF-8.
This means that any characters outside the 7-bit ASCII range (pretty much
anything but A-Z, a-z, 0-9, and basic punctuation), most notably 'ä' (since
it's part of my name), are transmitted to Google in the wrong encoding.
For example, if I search for "gräsman", the Google querystring becomes:
http://www.google.com/search?hl=&cat=&meta=&q=gr%E4sman
This used to work, but doesn't anymore. If I enter "gräsman" in the Google
search field, this is what's generated:
http://www.google.com/search?hl=en&lr=&q=gr%C3%A4sman
(it also adds an ie=UTF-8 value, but since UTF-8 is now the default, it's
not strictly necessary).
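
To make the difference between the two querystrings concrete, here is a small sketch that derives both forms of 'ä' by hand. The helper name hex is made up for illustration. It assumes the old querystring was plain Latin-1 (ISO-8859-1), which is what the %E4 suggests.

```javascript
// Code point of "ä" (U+00E4), written as an escape so the file's own
// encoding does not matter.
var cp = "\u00E4".charCodeAt(0);

// Hypothetical helper: format one byte as a %XX escape.
function hex(b) {
  return "%" + (b < 16 ? "0" : "") + b.toString(16).toUpperCase();
}

// Latin-1: code points up to 0xFF map to a single byte of the same value.
var latin1 = hex(cp);                                        // "%E4"

// UTF-8: code points 0x80..0x7FF take two bytes, 110xxxxx 10xxxxxx.
var utf8 = hex(0xC0 | (cp >> 6)) + hex(0x80 | (cp & 0x3F));  // "%C3%A4"
```

So the same character is one byte in the old querystring and two bytes in the new one, and Google now reads a lone %E4 as an invalid UTF-8 sequence.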
So, I guess there are two ways of solving this:
1) Add an ie=ISO-8859-1 hint in the querystring for searches
2) Make sure the querystring is properly UTF-8 encoded before sending it on
to Google
(1) is much easier, but I'm not sure how correct it is, since I don't know
what IE/JS actually does with the string, especially with regard to Hebrew,
Japanese, etc.
(2) sounds like the way to go, and then we should add the ie=UTF-8 hint to
the querystring as well, in case they change their minds again, but I
honestly don't know how to get a URL-encoded UTF-8 string. We could always
build support into DQSDTools if necessary.
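
One possible way to get a URL-encoded UTF-8 string without touching DQSDTools is sketched below. It assumes the script engine provides encodeURIComponent (part of ECMAScript 3), which always percent-encodes as UTF-8. The function name buildGoogleSearchUrl is made up for illustration and is not an existing DQSD function.

```javascript
// Hypothetical helper, not existing DQSD code: build a Google search URL
// whose query is percent-encoded as UTF-8, with an explicit ie=UTF-8 hint.
function buildGoogleSearchUrl(query) {
  // encodeURIComponent converts the string to UTF-8 and escapes each byte
  // that is not an unreserved URL character.
  return "http://www.google.com/search?ie=UTF-8&q=" + encodeURIComponent(query);
}

// "gr\u00E4sman" is "gräsman"; the escape avoids depending on file encoding.
var url = buildGoogleSearchUrl("gr\u00E4sman");
// url is "http://www.google.com/search?ie=UTF-8&q=gr%C3%A4sman"
```

Since the encoding is always UTF-8, this should also cover Hebrew and Japanese input, although I haven't checked it against the IE versions we support.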
First off - can anybody see if I'm missing something obvious here? If not -
is there a simpler, correct solution?
Thanks,
Kim
_______________________________________________
DQSD-Devel mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/dqsd-devel